Agisoft Metashape

Agisoft Metashape => General => Topic started by: Wishgranter on August 24, 2012, 07:53:01 PM

Title: Benchmarking a GPUs
Post by: Wishgranter on August 24, 2012, 07:53:01 PM
Hello Agisoft team, could you explain to us how to test GPU performance, and which settings to use so that we get the most precise test data? A lot of us want to, or can, invest in more than one GPU, but we don't know how to test them. This will help everyone in the community here. We need to test dual GPUs from AMD and NVIDIA.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on August 24, 2012, 07:58:09 PM
32 GB RAM modules have dropped from the 2000-3000 EUR range to about 900 EUR now, and I have read that they should go lower over the next 12 months, after the production lines reach full capacity (300-500 range).
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on August 26, 2012, 02:56:44 PM
Everyone who has Nvidia 500 or 600 series GPUs, or AMD 6900, 7800 or 7900 GPUs, please send me an email with your results. Download http://downloads.agisoft.ru/photoscan/sample01.zip and test on ULTRA settings on the first 10 images for now, with the proper CPU cores assigned to the GPUs. Send info on your RAM and CPU too.

Send it to muzeumhb@gmail.com and I will create a table with the results.

Hoping that the Agisoft team will soon give us proper info on how to benchmark.
Title: Re: Benchmarking a GPUs
Post by: Alexey Pasumansky on August 27, 2012, 06:44:24 PM
Hello Wishgranter,

Provided the community runs the same datasets with the same settings (Quality, object/geometry type, face count = 0 for arbitrary reconstructions), we recommend turning off all CPU cores for GPU benchmarking. If the CPU is also being tested, turn off all OpenCL devices for that run.

We also recommend running the tests on version 0.9.0 and saving the log files.

Title: Re: Benchmarking a GPUs
Post by: Wishgranter on September 27, 2012, 08:20:02 PM
Hello all,

Here is a link to sample images taken from the Agisoft test scene01. It contains a PSZ file with every 2nd image disabled, which means that 16 of the 32 images are used. There you will also find 2 Excel files where you can enter the data from the "benchmark".

https://dl.dropbox.com/u/15047343/Benchmark-sample01.zip

Run the benchmark on ARBITRARY - SMOOTH - ULTRA - Face Count 0 - Filter threshold 0.5 - Hole threshold 0.1

After that, go to the menu VIEW - CONSOLE, look at the end of the console output for something like the log below, copy the performance data into the provided Excel file and send it to muzeumhb@gmail.com


timings: rectify: 0.001 disparity: 0.015 borders: 0.04 filter: 0 fill: 0
finished depth reconstruction in 4.376 seconds
Device 1 performance: 120.354 million samples/sec (GeForce GTX 560 Ti)
Total performance: 120.354 million samples/sec
Generating mesh...
58336 points extracted
Grid size: 102 x 102 x 160
Tree depth: 8
Tree set in 0.193 s
Tree size 7.2564 MB (118889 leaves, 135873 nodes)
Tree refined in 0.027s
Tree size 7.2564 MB (118889 leaves, 135873 nodes)
Normal Size: 1.21558 MB
Laplacian weights set in 0.581s
Tree refined in 0.033s
Tree size 7.2564 MB (118889 leaves, 135873 nodes)
Depth 1/8, 56.25% entries (36 / 8^2)
Depth 2/8, 34.2773% entries (1404 / 64^2)
Depth 3/8, 10.2926% entries (12753 / 352^2)
Depth 4/8, 6.67589% entries (22149 / 576^2)
Depth 5/8, 1.96685% entries (105135 / 2312^2)
Depth 6/8, 0.593564% entries (385218 / 8056^2)
Depth 7/8, 0.173352% entries (1441044 / 28832^2)
Depth 8/8, 0.0496552% entries (4545003 / 95672^2)
Linear system solved in 0.725s (setup: 0.142s, solve: 0.324s, update: 0.193s)
Got Iso-value in 0.071s
Iso-value -3558.31
Normal Size: 0.307752 MB
20782 vertices extracted in 0.165 sec
41560 faces extracted in 0.053 sec
filtering mesh (41560 -> 41560)
Finished processing in 9.055 sec (exit code 1)
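
For anyone who would rather not copy the numbers by hand, here is a minimal Python sketch that pulls the per-device and total figures out of a pasted console log. It assumes the log lines look exactly like the "Device N performance" and "Total performance" lines above; the function name `parse_performance` is made up for this example and is not part of PhotoScan.

```python
import re

# Matches lines such as:
#   Device 1 performance: 120.354 million samples/sec (GeForce GTX 560 Ti)
DEVICE_RE = re.compile(
    r"Device (\d+) performance: ([\d.]+) million samples/sec \((.+)\)"
)
TOTAL_RE = re.compile(r"Total performance: ([\d.]+) million samples/sec")


def parse_performance(log_text):
    """Return (devices, total) parsed from a console log excerpt.

    devices is a list of (device_number, million_samples_per_sec, name);
    total is a float, or None if no "Total performance" line is present.
    """
    devices = [
        (int(num), float(rate), name)
        for num, rate, name in DEVICE_RE.findall(log_text)
    ]
    match = TOTAL_RE.search(log_text)
    total = float(match.group(1)) if match else None
    return devices, total


# Lines copied from the log excerpt above.
sample = """\
finished depth reconstruction in 4.376 seconds
Device 1 performance: 120.354 million samples/sec (GeForce GTX 560 Ti)
Total performance: 120.354 million samples/sec
"""

devices, total = parse_performance(sample)
for num, rate, name in devices:
    print(f"device {num}: {rate} million samples/sec ({name})")
print("total:", total)
```

The same function can be pointed at a whole saved log file by reading it into a string first.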
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on September 28, 2012, 11:44:07 AM
OK, the first benchmarks are here. Feel free to send yours; I will create an Excel table with the results and share it with you.


The most important cards for us are:
NVIDIA 580+590 and 680+690
ATI series 6xxx and 7xxx

so that we can compare which is faster.
Title: Re: Benchmarking a GPUs
Post by: FoodMan on October 01, 2012, 02:14:11 PM
oki.. here is mine..

estimating 784x1250x160 disparity using 262x250x160 tiles, offset -111
timings: rectify: 0.021 disparity: 0.83 borders: 0.042 filter: 0.023 fill: 0
finished depth reconstruction in 65.316 seconds
Device 1 performance: 206.868 million samples/sec (CPU)
Device 2 performance: 184.67 million samples/sec (GeForce GTX 690)
Device 3 performance: 184.067 million samples/sec (GeForce GTX 690)
Device 4 performance: 186.652 million samples/sec (GeForce GTX 690)
Device 5 performance: 199.1 million samples/sec (GeForce GTX 690)
Total performance: 961.358 million samples/sec
Generating mesh...
3432830 points extracted
Grid size: 819 x 818 x 1281
Tree depth: 11
Tree set in 9.69 s
Tree size 387.825 MB (6354132 leaves, 7261865 nodes)
Tree refined in 1.508s
Tree size 387.825 MB (6354132 leaves, 7261865 nodes)
Normal Size: 63.532 MB
Laplacian weights set in 10.384s
Tree refined in 1.735s
Tree size 387.825 MB (6354132 leaves, 7261865 nodes)
Depth 1/11, 56.25% entries (36 / 8^2)
Depth 2/11, 34.2773% entries (1404 / 64^2)
Depth 3/11, 9.89337% entries (13398 / 368^2)
Depth 4/11, 7.38831% entries (19368 / 512^2)
Depth 5/11, 2.12754% entries (92046 / 2080^2)
Depth 6/11, 0.61675% entries (359241 / 7632^2)
Depth 7/11, 0.172989% entries (1340004 / 27832^2)
Depth 8/11, 0.0412316% entries (5134477 / 111592^2)
Depth 9/11, 0.0103413% entries (20715457 / 447568^2)
Depth 10/11
   Nodes 1/8: 386944, 0.0118445% entries (17734303 / 386944^2)
   Nodes 2/8: 239952, 0.0186982% entries (10765854 / 239952^2)
   Nodes 3/8: 405160, 0.0114647% entries (18819764 / 405160^2)
   Nodes 4/8: 258848, 0.0174242% entries (11674600 / 258848^2)
   Nodes 5/8: 188720, 0.0246344% entries (8773589 / 188720^2)
   Nodes 6/8: 117712, 0.0387354% entries (5367225 / 117712^2)
   Nodes 7/8: 82648, 0.0558205% entries (3812926 / 82648^2)
   Nodes 8/8: 45024, 0.100118% entries (2029552 / 45024^2)
Depth 11/11
   Nodes 8/64: 1128312, 0.00407384% entries (51863561 / 1128312^2)
   Nodes 15/64: 772424, 0.00585323% entries (34922660 / 772424^2)
   Nodes 22/64: 1163464, 0.00402772% entries (54521169 / 1163464^2)
   Nodes 29/64: 697088, 0.00667035% entries (32413366 / 697088^2)
   Nodes 36/64: 592240, 0.00789448% entries (27689755 / 592240^2)
   Nodes 40/64: 36560, 0.122452% entries (1636732 / 36560^2)
   Nodes 43/64: 385152, 0.0118953% entries (17645763 / 385152^2)
   Nodes 47/64: 10216, 0.432591% entries (451481 / 10216^2)
   Nodes 50/64: 258408, 0.0180707% entries (12066638 / 258408^2)
   Nodes 54/64: 11440, 0.387589% entries (507252 / 11440^2)
   Nodes 57/64: 155496, 0.0294553% entries (7121987 / 155496^2)
Linear system solved in 28.878s (setup: 11.947s, solve: 11.955s, update: 2.258s)
Got Iso-value in 1.367s
Iso-value -29178.6
Normal Size: 18.6293 MB
1220176 vertices extracted in 11.499 sec
2440356 faces extracted in 0.911 sec
filtering mesh (2440356 -> 2440356)
Finished processing in 145.487 sec (exit code 1)
Title: Re: Benchmarking a GPUs
Post by: FoodMan on October 01, 2012, 02:20:37 PM
btw, here is my complete log.. no excel

http://www.divshare.com/download/19694461-563
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on October 01, 2012, 03:24:04 PM
FoodMan, do you physically have 4x 690 GPUs in that PC?
Title: Re: Benchmarking a GPUs
Post by: Matt on October 01, 2012, 03:38:43 PM
690 is a dual GPU card. So he has 2 of them.
Title: Re: Benchmarking a GPUs
Post by: FoodMan on October 01, 2012, 03:39:31 PM
yes two of them..

f/
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on October 01, 2012, 04:11:44 PM
OK, then that is fine, because another user who sent me results had not DISABLED SLI in the drivers, so just one GPU was working. It seems to be working on your PC (or are you using VGA terminators there?).

Therefore, everyone: DISABLE SLI in the drivers so that you can use both GPUs.
Title: Re: Benchmarking a GPUs
Post by: Matt on October 01, 2012, 04:25:45 PM
From memory I just disabled the multi GPU function in the nvidia control panel  ;)
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on October 02, 2012, 07:49:06 PM
Hello, is there anyone with an ATI 7970 card here? Or something similar from ATI?
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on October 10, 2012, 11:15:29 PM
Hello, it seems that I will have an overclocked dual Xeon with 4x 7990 cards (= 8x AMD 7970 GPUs) for tests.
Title: Re: Benchmarking a GPUs
Post by: Matt on October 11, 2012, 05:40:05 AM
Wow you should be able to crack over 2 billion samples per second with that rig but it will need a personal nuclear reactor to power it ;D
Title: Re: Benchmarking a GPUs
Post by: ReginaK on October 11, 2012, 11:49:12 PM
My rig is 2x GTX 580 overclocked + i7 @ 3.8 GHz + 24 GB RAM

Device 1 performance: 137.908 million samples/sec (CPU)
Device 2 performance: 250.863 million samples/sec (GeForce GTX 580)
Device 3 performance: 247.25 million samples/sec (GeForce GTX 580)
Total performance: 636.021 million samples/sec

Image:
(http://imageshack.us/a/img402/9242/agisoft.jpg)
Title: Re: Benchmarking a GPUs
Post by: juneau3000 on November 11, 2012, 08:19:07 AM
4 x 7970 3GB @ 1100mhz/1700mhz
Intel 3770k @ 4.7ghz 4c/8t
2x4gb ddr3-2600 9-12-12-19


Device 1 performance: 505.96 million samples/sec (Tahiti)
Device 2 performance: 546.441 million samples/sec (Tahiti)
Device 3 performance: 506.016 million samples/sec (Tahiti)
Device 4 performance: 504.028 million samples/sec (Tahiti)
Total performance: 2062.45 million samples/sec


Generating mesh...
13304348 points extracted
Grid size: 1638 x 1636 x 2563
Tree depth: 12
Tree set in 23.743 s
Tree size 1517.71 MB (24866241 leaves, 28418561 nodes)
Tree refined in 4.43s
Tree size 1517.71 MB (24866241 leaves, 28418561 nodes)
Normal Size: 249.706 MB
Laplacian weights set in 68.007s
Tree refined in 5.633s
Tree size 1517.71 MB (24866241 leaves, 28418561 nodes)
Depth 1/12, 56.25% entries (36 / 8^2)
Depth 2/12, 34.2773% entries (1404 / 64^2)
Depth 3/12, 9.89337% entries (13398 / 368^2)
Depth 4/12, 7.38831% entries (19368 / 512^2)
Depth 5/12, 2.16436% entries (89367 / 2032^2)
Depth 6/12, 0.627468% entries (351822 / 7488^2)
Depth 7/12, 0.178709% entries (1290465 / 26872^2)
Depth 8/12, 0.0432821% entries (4846308 / 105816^2)
Depth 9/12, 0.0108605% entries (19537053 / 424136^2)
Depth 10/12
   Nodes 1/8: 397328, 0.0115373% entries (18213859 / 397328^2)
   Nodes 2/8: 252936, 0.0176532% entries (11293901 / 252936^2)
   Nodes 3/8: 426136, 0.0108643% entries (19728673 / 426136^2)
   Nodes 4/8: 298456, 0.015252% entries (13585900 / 298456^2)
   Nodes 5/8: 194520, 0.0239189% entries (9050433 / 194520^2)
   Nodes 6/8: 120032, 0.0379344% entries (5465471 / 120032^2)
   Nodes 7/8: 84072, 0.0547675% entries (3871021 / 84072^2)
   Nodes 8/8: 45520, 0.0992257% entries (2056027 / 45520^2)
Depth 11/12
   Nodes 8/64: 1453512, 0.00318671% entries (67325534 / 1453512^2)
   Nodes 15/64: 904432, 0.00502197% entries (41079543 / 904432^2)
   Nodes 22/64: 1551176, 0.00299513% entries (72067181 / 1551176^2)
   Nodes 29/64: 962776, 0.0047575% entries (44099037 / 962776^2)
   Nodes 36/64: 695872, 0.00678441% entries (32852692 / 695872^2)
   Nodes 40/64: 47968, 0.0948568% entries (2182588 / 47968^2)
   Nodes 43/64: 443472, 0.0104579% entries (20567360 / 443472^2)
   Nodes 47/64: 12872, 0.351308% entries (582077 / 12872^2)
   Nodes 50/64: 301384, 0.0156763% entries (14239180 / 301384^2)
   Nodes 54/64: 12416, 0.362556% entries (558905 / 12416^2)
   Nodes 57/64: 164488, 0.0279929% entries (7573835 / 164488^2)
Depth 12/12
   Nodes 57/368: 77320, 0.0471448% entries (2818496 / 77320^2)
   Nodes 58/368: 742456, 0.00503086% entries (27732141 / 742456^2)
   Nodes 59/368: 975016, 0.00386316% entries (36725337 / 975016^2)
   Nodes 60/368: 829792, 0.00454031% entries (31262489 / 829792^2)
   Nodes 62/368: 61864, 0.0591943% entries (2265457 / 61864^2)
   Nodes 63/368: 173312, 0.0215044% entries (6459287 / 173312^2)
   Nodes 64/368: 1571056, 0.00240964% entries (59475151 / 1571056^2)
   Nodes 113/368: 753648, 0.00494485% entries (28086029 / 753648^2)
   Nodes 114/368: 7496, 0.458528% entries (257647 / 7496^2)
   Nodes 115/368: 1062808, 0.0035025% entries (39562833 / 1062808^2)
   Nodes 116/368: 69448, 0.0512892% entries (2473689 / 69448^2)
   Nodes 117/368: 63176, 0.0582321% entries (2324164 / 63176^2)
   Nodes 119/368: 1098592, 0.00340633% entries (41111112 / 1098592^2)
   Nodes 169/368: 1189888, 0.00320646% entries (45398201 / 1189888^2)
   Nodes 170/368: 590944, 0.00645144% entries (22529395 / 590944^2)
   Nodes 171/368: 59664, 0.0613152% entries (2182694 / 59664^2)
   Nodes 172/368: 1832832, 0.0021012% entries (70584962 / 1832832^2)
   Nodes 173/368: 152136, 0.0250708% entries (5802739 / 152136^2)
   Nodes 174/368: 762328, 0.00498398% entries (28964071 / 762328^2)
   Nodes 176/368: 5408, 0.605454% entries (177074 / 5408^2)
   Nodes 225/368: 1425056, 0.00267097% entries (54241692 / 1425056^2)
   Nodes 226/368: 99776, 0.0367272% entries (3656286 / 99776^2)
   Nodes 227/368: 513632, 0.00744717% entries (19646963 / 513632^2)
   Nodes 228/368: 2008, 1.65208% entries (66613 / 2008^2)
   Nodes 229/368: 717952, 0.00532773% entries (27462042 / 717952^2)
   Nodes 276/368: 1030184, 0.00374059% entries (39698068 / 1030184^2)
   Nodes 280/368: 1248952, 0.00305794% entries (47700262 / 1248952^2)
   Nodes 284/368: 133368, 0.0280868% entries (4995800 / 133368^2)
   Nodes 299/368: 907488, 0.00416611% entries (34309341 / 907488^2)
   Nodes 303/368: 573824, 0.00654664% entries (21556382 / 573824^2)
   Nodes 307/368: 34536, 0.107592% entries (1283291 / 34536^2)
   Nodes 322/368: 182816, 0.0207891% entries (6948081 / 182816^2)
   Nodes 326/368: 838416, 0.00461138% entries (32415280 / 838416^2)
   Nodes 338/368: 43624, 0.0877051% entries (1669075 / 43624^2)
   Nodes 345/368: 363432, 0.0104874% entries (13851995 / 363432^2)
   Nodes 349/368: 217280, 0.0171502% entries (8096721 / 217280^2)
Linear system solved in 71.773s (setup: 14.786s, solve: 25.177s, update: 26.002s)
Got Iso-value in 9.711s
Iso-value -60290.4
Normal Size: 71.6336 MB
4707876 vertices extracted in 39.375 sec
9415720 faces extracted in 6.429 sec
filtering mesh (9415720 -> 9413930)
Finished processing in 498.187 sec (exit code 1)

Sorry if I did it wrong. I'm not sure why the tree depth and grid size in my run are higher than in the other runs... I'm new to this :P
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on November 12, 2012, 12:44:07 AM
As far as I know, the depth and the tree depend on the GPU's internal organization (the OpenCL device hierarchy), which is different from NVIDIA's. And THANKS for providing the results; this confirms another user who reported similarly impressive performance from AMD. Which exact GPU card is used, how many of them, and so on?

Thanks.
Title: Re: Benchmarking a GPUs
Post by: juneau3000 on November 12, 2012, 03:09:53 AM
These are 4 HD 7970 3 GB GPUs overclocked to 1150 MHz core / 1700 MHz memory, led by an Intel 3770K with 8 threads at 5 GHz.

Performance seems low? What should I be looking for in terms of samples per second, or should I go by processing time?

Thanks in advance. I wonder if 2x 680s might be better than 2x 7970s. I will try both.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on November 12, 2012, 09:54:02 AM
No, this sort of performance is possible only with AMD/ATI cards. The NVIDIA cards are much slower than this.

It is also nice to see the CPU/GPU overclock and its impact on performance.
Title: Re: Benchmarking a GPUs
Post by: Alexey Pasumansky on November 12, 2012, 01:57:20 PM
Hello,

Quote
not sure why my tree depth and grid size in my run is higher then the others that have run...im new to this
Tree depth depends on the selected quality and the bounding box size. However, the mesh generation step doesn't involve GPU processing. OpenCL devices are only used for depth map generation.
Title: Re: Benchmarking a GPUs
Post by: juneau3000 on November 12, 2012, 07:42:31 PM
thanks for the information guys, will update with 2x 7970's vs 2x GTX-680 this afternoon.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on November 12, 2012, 07:51:10 PM
Juneau, mixing different GPUs can cause problems. For best performance you need to use the 7970 cards, not the NVIDIA 680s (the 680 is only about as fast as the 580!).

Title: Re: Benchmarking a GPUs
Post by: juneau3000 on November 12, 2012, 09:42:12 PM
I'm sorry, I meant at separate times.  :D


Here are results with the following hardware.
Intel 3970X 6cores 12threads at 4.8ghz (running 8 threads saving 2 each for the cards because of ht)
2x HD 7970 3GB cards at 1100/1700
4x4GB @ CAS10 2133mhz

finished depth reconstruction in 368.598 seconds
Device 1 performance: 306.784 million samples/sec (CPU)
Device 2 performance: 517.826 million samples/sec (Tahiti)
Device 3 performance: 506.601 million samples/sec (Tahiti)
Total performance: 1331.21 million samples/sec
Generating mesh...
13299863 points extracted
Grid size: 1638 x 1637 x 2563
Tree depth: 12
Tree set in 23.821 s
Tree size 1516.9 MB (24852948 leaves, 28403369 nodes)
Tree refined in 4.368s
Tree size 1516.9 MB (24852948 leaves, 28403369 nodes)
Normal Size: 249.543 MB
Laplacian weights set in 47.768s
Tree refined in 5.491s
Tree size 1516.9 MB (24852948 leaves, 28403369 nodes)
Depth 1/12, 56.25% entries (36 / 8^2)
Depth 2/12, 34.2773% entries (1404 / 64^2)
Depth 3/12, 9.89337% entries (13398 / 368^2)
Depth 4/12, 7.38831% entries (19368 / 512^2)
Depth 5/12, 2.16436% entries (89367 / 2032^2)
Depth 6/12, 0.62886% entries (350346 / 7464^2)
Depth 7/12, 0.178867% entries (1288533 / 26840^2)
Depth 8/12, 0.043282% entries (4847026 / 105824^2)
Depth 9/12, 0.0108639% entries (19529262 / 423984^2)
Depth 10/12
   Nodes 1/8: 397016, 0.0115446% entries (18196790 / 397016^2)
   Nodes 2/8: 252632, 0.0176684% entries (11276485 / 252632^2)
   Nodes 3/8: 426064, 0.0108665% entries (19725931 / 426064^2)
   Nodes 4/8: 298560, 0.0152487% entries (13592359 / 298560^2)
   Nodes 5/8: 194280, 0.0239416% entries (9036675 / 194280^2)
   Nodes 6/8: 120120, 0.0379237% entries (5471940 / 120120^2)
   Nodes 7/8: 84024, 0.0547993% entries (3868847 / 84024^2)
   Nodes 8/8: 45280, 0.0996754% entries (2043623 / 45280^2)
Depth 11/12
   Nodes 8/64: 1453992, 0.00318538% entries (67341824 / 1453992^2)
   Nodes 15/64: 903336, 0.00502759% entries (41025937 / 903336^2)
   Nodes 22/64: 1550512, 0.00299644% entries (72037114 / 1550512^2)
   Nodes 29/64: 962392, 0.00475896% entries (44077453 / 962392^2)
   Nodes 36/64: 695272, 0.00678932% entries (32819795 / 695272^2)
   Nodes 40/64: 47864, 0.0950325% entries (2177159 / 47864^2)
   Nodes 43/64: 443136, 0.010464% entries (20548111 / 443136^2)
   Nodes 47/64: 12832, 0.35214% entries (579834 / 12832^2)
   Nodes 50/64: 301168, 0.015685% entries (14226671 / 301168^2)
   Nodes 54/64: 12416, 0.362618% entries (559001 / 12416^2)
   Nodes 57/64: 164136, 0.0280488% entries (7556515 / 164136^2)
Depth 12/12
   Nodes 57/368: 77448, 0.0470791% entries (2823893 / 77448^2)
   Nodes 58/368: 742144, 0.00503303% entries (27720833 / 742144^2)
   Nodes 59/368: 975296, 0.00386201% entries (36735488 / 975296^2)
   Nodes 60/368: 830392, 0.00453717% entries (31286101 / 830392^2)
   Nodes 62/368: 61608, 0.0593687% entries (2253368 / 61608^2)
   Nodes 63/368: 173064, 0.0215245% entries (6446823 / 173064^2)
   Nodes 64/368: 1569024, 0.00241267% entries (59396079 / 1569024^2)
   Nodes 113/368: 753528, 0.00494683% entries (28088316 / 753528^2)
   Nodes 114/368: 7472, 0.458983% entries (256254 / 7472^2)
   Nodes 115/368: 1063304, 0.00350076% entries (39580087 / 1063304^2)
   Nodes 116/368: 69040, 0.0515391% entries (2456621 / 69040^2)
   Nodes 117/368: 63080, 0.0582343% entries (2317195 / 63080^2)
   Nodes 119/368: 1097288, 0.00340952% entries (41051995 / 1097288^2)
   Nodes 169/368: 1189944, 0.00320623% entries (45399109 / 1189944^2)
   Nodes 170/368: 590800, 0.00645318% entries (22524490 / 590800^2)
   Nodes 171/368: 59768, 0.0611872% entries (2185736 / 59768^2)
   Nodes 172/368: 1833488, 0.00210064% entries (70616686 / 1833488^2)
   Nodes 173/368: 151944, 0.0250969% entries (5794127 / 151944^2)
   Nodes 174/368: 762312, 0.00498329% entries (28958848 / 762312^2)
   Nodes 176/368: 5408, 0.605598% entries (177116 / 5408^2)
   Nodes 225/368: 1424848, 0.00267148% entries (54236125 / 1424848^2)
   Nodes 226/368: 99152, 0.0369453% entries (3632141 / 99152^2)
   Nodes 227/368: 513552, 0.00744825% entries (19643679 / 513552^2)
   Nodes 228/368: 2080, 1.60984% entries (69648 / 2080^2)
   Nodes 229/368: 715952, 0.00533993% entries (27371803 / 715952^2)
   Nodes 276/368: 1028656, 0.00374592% entries (39636862 / 1028656^2)
   Nodes 280/368: 1246264, 0.00306318% entries (47576502 / 1246264^2)
   Nodes 284/368: 133136, 0.0281344% entries (4986871 / 133136^2)
   Nodes 299/368: 907384, 0.00416652% entries (34304863 / 907384^2)
   Nodes 303/368: 573976, 0.00654438% entries (21560372 / 573976^2)
   Nodes 307/368: 34368, 0.10801% entries (1275773 / 34368^2)
   Nodes 322/368: 182536, 0.0208148% entries (6935355 / 182536^2)
   Nodes 326/368: 838672, 0.00461006% entries (32425830 / 838672^2)
   Nodes 338/368: 43552, 0.0877833% entries (1665053 / 43552^2)
   Nodes 345/368: 363200, 0.0104926% entries (13841250 / 363200^2)
   Nodes 349/368: 217288, 0.0171512% entries (8097771 / 217288^2)
Linear system solved in 60.7s (setup: 10.167s, solve: 27.677s, update: 16.787s)
Got Iso-value in 6.864s
Iso-value -60218.6
Normal Size: 71.6137 MB
4706303 vertices extracted in 38.969 sec
9412562 faces extracted in 4.431 sec
filtering mesh (9412562 -> 9410710)
Finished processing in 597.231 sec (exit code 1)


I will swap in the GTX 680s next.
Should I be judging by the total time it takes to complete, or by the samples per second rating?
Thanks for all the help.
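
One way to look at the "total time or samples per second" question is to check how much of the run the depth reconstruction actually takes. The Python sketch below uses the two figures from the log above and assumes that the "finished depth reconstruction" time covers the whole GPU-accelerated stage; the helper name `depth_share` is made up for this example.

```python
def depth_share(depth_seconds, total_seconds):
    """Fraction of the total run spent in depth reconstruction."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return depth_seconds / total_seconds


# Figures copied from the log above (2x HD 7970 + CPU run).
depth_s = 368.598   # "finished depth reconstruction in 368.598 seconds"
total_s = 597.231   # "Finished processing in 597.231 sec"

share = depth_share(depth_s, total_s)
print(f"depth reconstruction: {share:.1%} of the total time")
print(f"everything else (meshing etc.): {total_s - depth_s:.3f} s")
```

Under that assumption, samples per second describes only the depth reconstruction part, while the total time also includes the meshing that follows it.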
Title: Re: Benchmarking a GPUs
Post by: Matt on November 13, 2012, 03:59:03 AM
I haven't had any problems mixing NVIDIA/ATI video cards at all. If anything, Juneau's issues may stem from not enough RAM. Drop in an extra 8 GB and see how that goes. That overclocked CPU is flying; pity I can't overclock my Xeon E5s   :( . My guess is around 600 million samples per second with the 2x GTX 680s.
Title: Re: Benchmarking a GPUs
Post by: juneau3000 on November 16, 2012, 09:16:17 PM
With 2x GTX 680 GPUs my total time is around 850-889 s  ???
They show 255 million samples/sec each.

This seems incredibly slow compared with the 7970s.

So can I say that, 7970s vs GTX 680s, the 7970s win by a large margin?

I do know that in the Cinebench 11.5 OpenGL test the 7970 scores 103 FPS vs 62 FPS for the 680. It seems like a real workhorse of a card that ATI has put out.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on November 16, 2012, 09:36:20 PM
Juneau, a BIG thanks for the testing you have done for the community, for all of us here.  8)

Now a lot of us can decide what to buy for our workstations. And it seems that AMD/ATI will release the 8000 series in Jan/Feb 2013, so even more performance awaits us.

Matt, thanks too for the info about mixing cards. A few months ago I asked the Khronos group http://www.khronos.org/opencl/ and they told me that using different architectures is problematic, but now it is clear that for our use it is not.

If it were possible to port more functions of Photoscan to the OpenCL engine, we would see a significant boost in efficiency, so I hope that will be possible in the future. If we get a speedup of the whole process, photogrammetry could become very competitive with laser scanning.
Title: Re: Benchmarking a GPUs
Post by: Matt on November 17, 2012, 01:29:20 PM
The 7970 beats the 500 and 600 series in all programming based clbenchmarks apart from Image Filter Global Atomic Add and Bitonic Merge sort. I still like my GTX 590 though: 400+ million points from one card and no fiddly driver issues. To get the NVIDIA cards and ATI cards working together I plugged them both into the same monitor.
Title: Re: Benchmarking a GPUs
Post by: ajg-cal on November 22, 2012, 01:15:51 PM
Hi all - thanks for sharing these results :). I only have a GTX 580 and a 680 here, so I'm not sure how much original data I would be able to supply.

I am interested by the increase in samples per second shown in the 7970.

I've had an awful experience with AMD drivers in laptops (7970M), which of course is not wholly applicable here I suppose.

Are many of you going to be buying 7970s based on Juneau's data do you think?
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on November 22, 2012, 01:53:16 PM
I will buy 4 7970s in the next few months, and a few more in the coming year. Drivers are sometimes problematic even with NVIDIA, not to mention the decreasing OpenCL performance on consumer cards. There is a driver issue that crashes PhotoScan which has still not been fixed after more than 2 years, and a few others. What precisely is the problem you have with the ATI card?
Title: Re: Benchmarking a GPUs
Post by: ajg-cal on November 22, 2012, 02:14:51 PM
Wow! That will be quite a machine  :). We're currently looking into an Intel Xeon based workstation.

Don't worry too much as my references to AMD drivers were a bit off topic / were not really related to Agisoft processing.

If you're interested, the main problem for a lot of users with the 7970M (mobility, i.e. for laptops) has been underutilisation issues coupled with instability in the later driver hotfixes. I think the added complexity of Enduro technology (AMD's counterpart to NVIDIA's Optimus) doesn't help either...

I am actually going to start testing agisoft on my laptop to investigate how viable preliminary processing in-the-field is.

my laptop specs

i7-3610QM
8 gb ram (may upgrade to 16 GB)
7970M
128 ssd and 1TB hdd
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on November 22, 2012, 02:27:25 PM
Yes, I know about those problems from forums on other sites, and yes, it is driver related and can be solved (if it has not been already; I saw something about it last week). But remember that the 7970M will deliver only approximately half the performance of the 7970 card, because it is the mobile version.

Yes, it will be powerful and will be available at an interesting price for everyone.

If you want, I can help with the hardware setup of the dual Xeon (I'm putting my own together right now).
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on November 22, 2012, 03:15:24 PM
http://www.hardocp.com/article/2012/07/18/pci_express_20_vs_30_gpu_gaming_performance_review/13
Title: Re: Benchmarking a GPUs
Post by: Aristarchos on November 22, 2012, 09:34:19 PM
Hi,

I have registered just to say that I get 400 million samples/sec with a Radeon 7870 in the Photoscan trial version on the Wishgranter scene.

The model is a Sapphire Radeon HD 7870 GHz Edition, the one with the two fans, and it's very quiet. The temperature tops out at 59 Celsius in Photoscan at 85%-97% load.
Price is around 220 euros in Spain.

Title: Re: Benchmarking a GPUs
Post by: Wishgranter on November 23, 2012, 12:56:48 AM
Aristarchos, thanks for the info. So even the 7870s are very fast  8)
Title: Re: Benchmarking a GPUs
Post by: Matt on November 23, 2012, 02:17:29 AM
My OpenCL programming buddies all use Radeon mobile cards in their laptops, as they are not throttled back like the mobile NVIDIA cards. If I had the money I would go for something like this for field processing: http://shop.bestdeal4u.com.au/service/t/0/i/1988/n/Clevo+X7350.html It has dual 7970 video cards and capacity for lots of RAM. I wouldn't call it portable though  ;D
Title: Re: Benchmarking a GPUs
Post by: Aristarchos on November 23, 2012, 04:51:10 AM
The desktop Radeon 7870 and the 7970M (mobile) are in fact the same chip (Pitcairn). They have the same number of shader processors, 1280, and the same memory speed, 1200 MHz. The difference is in the core speed, which is 1000 MHz in the desktop version vs. 850 MHz in the mobile one.

The desktop version is only 9% faster than the mobile one. You can check this at http://www.notebookcheck.net/AMD-Radeon-HD-7970M.72675.0.html (look for the benchmark 3DMark 11 - Performance GPU 1280x720).

The safe bet, currently, for a 7970M laptop is the Alienware m17x-r4, where you can deactivate the integrated GPU of the Intel processor in the BIOS, so you can avoid the whole Enduro mess.

Enduro is the AMD version of Nvidia Optimus.

You can check in the Alienware m17x-r4 specific forum on notebookcheck.net that there are no performance problems like those in the Clevo/Sager notebooks.
Title: Re: Benchmarking a GPUs
Post by: ajg-cal on November 23, 2012, 03:54:52 PM
Quote
You can check in the notebookcheck.net  Alienware m17x-r4 specific forum that there is not problems of performance like in the Clevo/Sager notebooks.

Don't I know it  :'(

I have a Clevo laptop from a reseller which doesn't have an independent connection from the dGPU to the output. http://forum.notebookreview.com/sager-clevo/696634-amd-12-11p-7970m-performance-driver-install-guide.html
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on December 18, 2012, 06:58:47 PM
Hello all,
I finally got the ATI 7970 installed in my PC. I am doing a few tests, even mixing ATI and NVIDIA cards, but it seems that mixing them in the same system is not the best solution.

I have tested a few simple setups on a clean install of Win8, such as installing the 590 first and then the 7970, and installing the 7970 first and then the 590. It seems that NVIDIA is doing something with the OpenCL drivers: if I install the ATI OpenCL driver first, I get a speed bump of about 15-20% for now.

Kris3D, the ATI cards are much faster than NVIDIA, but only in the DEPTH MAP subroutine, so for the whole project you will probably get a 10-20% speedup, depending on the speed of the meshing process.

Over the holidays I will test this in much more depth and let you know.
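
The point that a faster GPU only helps the depth map stage can be put into numbers with Amdahl's law. The Python sketch below is only an illustration: the fractions and the 2x GPU speedup are made-up example values, not measurements from this thread, and `overall_speedup` is a name chosen for this example.

```python
def overall_speedup(gpu_fraction, gpu_speedup):
    """Amdahl's law: whole-job speedup when only one stage gets faster.

    gpu_fraction: share of the original run time spent in the GPU stage (0..1)
    gpu_speedup:  how many times faster that stage becomes (> 0)
    """
    if not 0.0 <= gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be between 0 and 1")
    if gpu_speedup <= 0:
        raise ValueError("gpu_speedup must be positive")
    return 1.0 / ((1.0 - gpu_fraction) + gpu_fraction / gpu_speedup)


# Hypothetical example values: depth maps take 20%, 30% or 60% of the
# run, and the new card is assumed to be twice as fast in that stage.
for fraction in (0.2, 0.3, 0.6):
    gain = overall_speedup(fraction, 2.0)
    print(f"GPU stage = {fraction:.0%} of run -> whole project {gain:.2f}x")
```

The smaller the share of the depth map stage in a project, the less the whole project gains from a faster card, which is why the overall figure depends on the meshing time.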
Title: Re: Benchmarking a GPUs
Post by: Matt on December 19, 2012, 03:20:31 PM
I have both NVIDIA and ATI cards and both drivers installed on a dual-CPU system, tested under Windows 7 and 8. If you want to use the NVIDIA driver, plug that card into the display as the primary source; if you want the ATI driver (which seems to work better for the 7970), plug that card in as the primary source.
It is also worth noting that the ATI cards seem to perform better the more complex the geometry is. In terms of calculations per second they perform better on ultra high than on high in height field mode, and better again in arbitrary ultra high.
The benchmark test set Milos created could be improved by also including an inside-looking-out arbitrary model of a space and a height field dataset.
Any chance we can look at the benchmark results you have collated yet, Milos? :D
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on December 19, 2012, 07:41:06 PM
Yup, I will disclose it. But as you pointed out, and I had forgotten, the GPUs need to be connected to a monitor, or you need to use terminators (not a T-800 aka Schwarzenegger to persuade the GPU) so that the card thinks it is enabled. I will post the results over the weekend; currently I must deliver a few results to clients. For now I can disclose the following, but it needs confirmation, so I will run it a few more times to be sure. I used the benchmark scene.

Results for depth map generation @ ULTRA

Win7 64bit - 1x 560Ti 192 GB RAM - 1761 sec
Win Server 2012 Datacenter edition - 1x 560Ti 192 GB RAM - 1390 sec

It would also help if someone could share test data for benchmarking a few other reconstruction methods, e.g. some aerial photos for GIS applications.
Title: Re: Benchmarking a GPUs
Post by: kris3d on December 20, 2012, 04:42:22 AM
I did another test.
900 photos.
With the ATI 7970 the computer hangs right at the start of the calculation.
The whole thing works very erratically. Larger projects hang.
A system with this card is unpredictable and unstable.
10% increase in performance over the GTX 580 card, unless it crashes.
Unfortunately, I sold the good old 580 card.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on December 20, 2012, 02:58:42 PM
Kris, look at the chunk details and check the time for the DEPTH MAP stage; that is what the GPUs calculate.
If the system is unstable, the best way is to do a clean install of Windows and use just one GPU.

After I installed the ATI card, I could not read a lot of things with special software; after a clean install it works OK.

900 photos on what settings?
Title: Re: Benchmarking a GPUs
Post by: kris3d on December 21, 2012, 02:10:02 PM
I have a stable system.
It is this card that is unstable.
I had to lower the clock speed below the default parameters.
Now it works well, exactly as fast as the old GTX 580.
Replacing the card made no sense.
But if it works well for someone else, congratulations.
Reinstalling the system is the last thing I will do.
ATI is the way it is. No miracles.
If someone is going to buy an ATI 7970,
I suggest borrowing one from the store first
to see how it really works,
and only then possibly buying it.
The change may not make sense.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on December 21, 2012, 03:22:37 PM
Kris, my 7970 is working flawlessly under W2012 Datacenter; under Win7 it is problematic. I will explain later. A few results:

Results for the DEPTH MAP stage

Win7
560Ti 1761 sec
7970 690 sec

W2012Datacenter
560Ti 1390 sec
7970 650 sec

So please check where the problem is. It is mostly in the drivers but, depending on the hardware config, it can be a hardware-related problem.

I will dig very deep into this over the next 14 days.
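
As a rough check of the numbers above, this short Python sketch computes how much faster each card finished the depth map stage under Windows Server 2012 than under Windows 7. The timings are the ones posted; the dictionary layout and the `speedup` helper are only illustrative names.

```python
# Depth map stage timings in seconds, copied from the post above.
timings = {
    "560Ti": {"Win7": 1761, "W2012": 1390},
    "7970": {"Win7": 690, "W2012": 650},
}


def speedup(slow_sec, fast_sec):
    """Return how many times faster the second run is than the first."""
    return slow_sec / fast_sec


for card, t in timings.items():
    s = speedup(t["Win7"], t["W2012"])
    print(f"{card}: W2012 is {s:.2f}x Win7 ({(s - 1) * 100:.0f}% faster)")
```

On these figures the OS change matters far more for the 560Ti (roughly 27%) than for the 7970 (roughly 6%), which fits the suspicion that the difference is driver related.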
Title: Re: Benchmarking a GPUs
Post by: kris3d on December 21, 2012, 07:37:51 PM
Very interesting.
A different system, and a very large difference in the calculation.
Is this a question of drivers? Are they different for Windows Server 2012?
Can the operating system itself have an impact on the calculation?
Do the other calculation stages also work better in Windows Server 2012?
Title: Re: Benchmarking a GPUs
Post by: Matt on December 23, 2012, 03:37:44 AM
I have got similar performance results running both Windows 7 and 8 64-bit.
I am running an Asus Matrix Platinum 7970 at 1250 MHz on my system, but it took some effort to get it running at maximum potential.

You really need to totally uninstall all the Windows ATI drivers before you start, especially the generic Windows ones.

The link below describes the process pretty well, even though it's for a 7970M.

http://forum.notebookreview.com/alienware-m17x/698471-ati-amd-7970m-drivers-windows-8-server-2012-a.html

Note they recommend using driver version 12.10

Hope it helps
Title: Re: Benchmarking a GPUs
Post by: Porly on January 04, 2013, 04:14:27 PM
Hello,

I also bought an HD 7970 (1100/1500), running on Windows 7 64-bit with a 3770K @ 4.2 GHz and 32 GB of 1600 MHz RAM. My problem is that PhotoScan recognizes the card as having 2 GB instead of 3 GB. So the benchmark result showed "only" 325 million samples per second for the graphics device. Has anyone had the same problem?

Thanks!

Title: Re: Benchmarking a GPUs
Post by: Wishgranter on January 04, 2013, 05:09:28 PM
Yes, that is curious to me too: why just 2 GB instead of 3 GB? I will dig into that.

What driver version did you use for testing?

Did you set it to ULTRA?
Title: Re: Benchmarking a GPUs
Post by: Porly on January 04, 2013, 07:38:52 PM
I downloaded the latest package from amd.com (with Catalyst), revision number 12.10; the installed driver version is 9.2.0.0.

The result of about 325 million samples/s was on High.

I guess that something is wrong, because the CPU is also very weak (100 million samples/second); my 2700K had about 30% more power.

(2 CPU cores are disabled for the one GPU, so 6 of 8 are active.)

I also ran it on Ultra and the benchmark showed 400 million for the card. So, is the full performance only reachable on Ultra?

Thanks a lot!
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on January 05, 2013, 01:24:14 AM
Yes, only on Ultra. If you watch CPU and GPU utilization you will see that on lower levels (HIGH, MEDIUM, LOW) they are not used at 100%.

And only 2 GB is recognized in other benchmarks too.
Title: Re: Benchmarking a GPUs
Post by: Porly on January 05, 2013, 02:27:08 AM
So, I solved the problem of only 2 GB being recognized: I had to switch off the onboard GPU in the BIOS. Now I have 3 GB, but without any improvement in performance. It is still better than the GTX 580, though.

Last question ;) : How can it be that navigation in PhotoScan became worse with a better CPU (2700K --> 3770K) and GPU (GTX 580 --> HD 7970), with the same 32 GB of memory? Rotating the model view lags even with a small face count (2,000,000). I already checked the memory with some diagnostic tools, and it seems to be OK.

thanks a lot for your help Wishgranter!



Title: Re: Benchmarking a GPUs
Post by: Wishgranter on January 05, 2013, 04:32:42 AM
That is something that worries me too: the lag with the 7970. I am using the 560Ti for the primary monitor and the 7970 for computation. The lag is because AMD has a problem implementing the VBO feature in its consumer cards, so the problem is driver related and means AMD cannot or will not implement it right (at least in the drivers that I have tested). I am searching for a solution to this. So for the best viewing performance use the NVIDIA cards (more GPU RAM means bigger models), and for computation the ATI ones. It could be interesting to test a professional ATI FirePro card in PhotoScan.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on January 05, 2013, 05:07:10 AM
Hmm, I have probably found the problem with VBO. I will test it today after morning coffee and let you know.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on January 06, 2013, 02:48:31 PM
Hello all, the problem seems to be related to the OpenGL DLL file: there is a conflict over which file to use for which hardware. Because the NVIDIA OpenGL and ATI OpenGL files are not the same, the system has a problem picking the proper one. I am about to test it, so in a few days I will post what I have found out.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on January 06, 2013, 04:46:02 PM
VBO is "vertex buffer object".  It's an OpenGL feature used in rendering for programs using the OpenGL rendering architecture such as Second Life.  Enabling VBO should give better performance for your graphics, but the key word is "should".  Some video cards and the associated drivers show problems with VBO, and by turning the feature off in your preferences you get a noticeable increase in performance and quality.  Both nVidia and ATI (now known as AMD) cards have occasional issues with VBO enabled in preferences; most of the time the problem is fixed with a driver version that doesn't have the problem.  ATI/AMD cards have historically had more problems than nVidia, which is probably why you were told to turn off VBO in preferences.  If you've done that and it fixed your problem, then leave it off.  Over time a new driver will be released for your card and if you update to it, you might try VBO again to see if that driver fixed the previous issue (that would be up to you).
The reason you don't see any issues with VBO in your other games is that those other games are not rendered with the OpenGL rendering architecture; they are most likely DirectX (especially if you are using a Windows machine).  You won't have VBO problems with a DirectX rendering engine because DirectX doesn't use that OpenGL feature.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on January 06, 2013, 08:08:12 PM
VBO performance:

For every 1 GB of VRAM (on board the GPU) you can use approx. 45-50 million triangles; that many fit in memory and still give a fast response in the viewport.
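
The rule of thumb above can be turned into a quick estimate. This is only a sketch: the 45-50 million triangles per GB figure is the empirical number from this post, not something PhotoScan reports, and `triangle_budget` is a made-up helper name.

```python
def triangle_budget(vram_gb, per_gb=(45e6, 50e6)):
    """Estimate the (low, high) triangle count that fits in VRAM.

    per_gb is the forum rule of thumb of 45-50 million triangles per GB;
    treat it as an assumption, not a measured constant.
    """
    low, high = per_gb
    return vram_gb * low, vram_gb * high


for name, vram in [("1 GB card", 1.0), ("2 GB card", 2.0), ("3 GB card", 3.0)]:
    low, high = triangle_budget(vram)
    print(f"{name}: about {low / 1e6:.0f}-{high / 1e6:.0f} million triangles")
```

Real limits will be lower whenever other data (textures, the desktop itself) shares the same VRAM.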
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on January 07, 2013, 03:09:25 PM
http://www.techpowerup.com/178330/NVIDIA-Posts-GeForce-310.90-WHQL-Drivers.html
Title: Re: Benchmarking a GPUs
Post by: Matt on February 05, 2013, 02:53:09 PM
Hi Wishgranter,

Just wondering where those GPU benchmarks are?
I guess you are pretty busy, but it's been around 6 months now.
I am still tweaking the 7970 Matrix/GTX 590 combo.

Latest result below

Using device: GeForce GTX 590, 16 compute units, 1535 MB global memory
  max work group size 1024
  max work item sizes [1024, 1024, 64]
  max mem alloc size 383 MB
  max workgroup size c1: 1024 c3: 1024 zero: 1024 hamming: 1024 filter: 1024 box: 1024
  max workgroup size zero: 1024 costs: 1024 b1: 1024 bn: 1024 wta: 1024 transpose: 1024
Using device: GeForce GTX 590, 16 compute units, 1535 MB global memory
  max work group size 1024
  max work item sizes [1024, 1024, 64]
  max mem alloc size 383 MB
  max workgroup size c1: 1024 c3: 1024 zero: 1024 hamming: 1024 filter: 1024 box: 1024
  max workgroup size zero: 1024 costs: 1024 b1: 1024 bn: 1024 wta: 1024 transpose: 1024
Using device: Tahiti, 32 compute units, 2048 MB global memory
  max work group size 256
  max work item sizes [256, 256, 256]
  max mem alloc size 512 MB
  max workgroup size c1: 256 c3: 256 zero: 256 hamming: 256 filter: 256 box: 256
  max workgroup size zero: 256 costs: 256 b1: 256 bn: 256 wta: 256 transpose: 256
initializing...
selected 16 cameras from 16 in 0.004 sec
Loading photos...
Reconstructing depth...

--->deleted to save space

finished depth reconstruction in 458.607 seconds
Device 1 performance: 161.974 million samples/sec (CPU)
Device 2 performance: 176.341 million samples/sec (GeForce GTX 590)
Device 3 performance: 170.51 million samples/sec (GeForce GTX 590)
Device 4 performance: 615.622 million samples/sec (Tahiti)
Total performance: 1124.45 million samples/sec
Generating mesh...
generating 1638x1636 grid (0.0061235 resolution)
setting sample weights
adding points
updating levels
merging levels
triangulating... 528602 points 1057063 faces done in 2.178 sec
filtering mesh (1057063 -> 1057063)
Finished processing in 483.345 sec (exit code 1)
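
For anyone collecting results from saved logs, a small Python sketch like the one below can pull the per-device figures out of the "Device N performance" lines and show each device's share. The regular expression is written against the log excerpt above only; other PhotoScan versions may format these lines differently, and `parse_devices` is an illustrative name.

```python
import re

# Performance lines copied from the log above.
LOG = """\
Device 1 performance: 161.974 million samples/sec (CPU)
Device 2 performance: 176.341 million samples/sec (GeForce GTX 590)
Device 3 performance: 170.51 million samples/sec (GeForce GTX 590)
Device 4 performance: 615.622 million samples/sec (Tahiti)
Total performance: 1124.45 million samples/sec
"""

DEVICE_RE = re.compile(
    r"Device (\d+) performance: ([\d.]+) million samples/sec \((.+)\)"
)


def parse_devices(log_text):
    """Return a list of (device_index, name, million_samples_per_sec)."""
    return [
        (int(idx), name, float(rate))
        for idx, rate, name in DEVICE_RE.findall(log_text)
    ]


devices = parse_devices(LOG)
total = sum(rate for _, _, rate in devices)
for idx, name, rate in devices:
    print(f"Device {idx} ({name}): {rate:.1f} M samples/sec, {rate / total:.0%} of total")
print(f"Sum of devices: {total:.2f} M samples/sec")
```

The sum of the four device lines matches the "Total performance" line in the log, and the single Tahiti device accounts for more than half of it.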
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on February 05, 2013, 02:58:50 PM
Hello Matt, some people have sent results, but no one has sent me any more since.

It is clear that for now the 7970s are the fastest cards here, but my recommendation is to have at least one NVIDIA 5xx or 6xx series card in the PC. Because of the VBO problems in the ATI drivers, you cannot work with bigger objects on ATI (above 20 million triangles the 7970 is at its limit, while the 560Ti can go up to circa 50 million triangles in real time; 50 million triangles take approx. 950 MB of GPU RAM).

Try testing the new beta version; it is 50% faster than version 0.9.0.
Title: Re: Benchmarking a GPUs
Post by: Matt on February 05, 2013, 03:28:39 PM
Thanks, I thought as much.

I am in full production mode right now, so I will try the beta after this lot of work is out the door.
I have some photos to contribute for an orthophoto benchmark that I can get to you in a week or so.
Title: Re: Benchmarking a GPUs
Post by: Alterco on February 13, 2013, 04:17:44 PM
Hi guys,

I have a question for you, GPU experts.

I am looking to buy a computer.
According to the Agisoft wiki, it is recommended to use NVidia GeForce GTX 580 or GeForce GTX 680 GPU.
So, I have chosen the GTX 680 GPU as reference.

After some research, it appears that an SLI setup using two GTX 660s is more powerful than a single GTX 680, for less money.

So I wonder why not use two GTX 660s instead of one GTX 680 with PhotoScan.

Different thoughts :
- 2 GTX 660 are less expensive than one GTX 680
- the SLI appears more powerful
- 2 GTX 660 can provide 2 GPU instead of one with the GTX 680

Considering that future SfM algorithms appear to rely more and more on GPU computation, it seems worthwhile to multiply the number of GPU cores.

In the end, I cannot find any argument against using two GTX 660s in SLI.
Maybe just one: does or will PhotoScan support this configuration?

So, what do you think?

Thank you in advance for your expertise.

Adrien
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on February 13, 2013, 04:28:15 PM
SLI does not work with OpenCL (it should, but it is still not implemented, and probably will not be for a while). When thinking about using multiple GPUs, keep in mind that every GPU needs one CPU core, so it is better to have one card with a better spec. Even better, at least for PhotoScan, is to use an ATI solution: it is faster, with almost double the performance of NVIDIA cards. In short, PhotoScan's speciality is using the raw performance of the ATI cards; they are twice as fast as NVIDIA in single-precision arithmetic.
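
The "one CPU core per GPU" rule can be sketched as simple arithmetic. The helper below is hypothetical, and it assumes a hyper-threaded CPU where one physical core shows up as two logical cores; that assumption reproduces the 6-of-8 setting reported earlier in the thread for a 3770K with one GPU.

```python
def suggested_active_cores(logical_cores, gpus, threads_per_core=2):
    """Logical CPU cores left active after reserving one physical core
    (threads_per_core logical cores) for each GPU.

    This is a rule of thumb read from the thread, not an Agisoft formula.
    """
    reserved = gpus * threads_per_core
    return max(logical_cores - reserved, 0)


# A 4-core / 8-thread CPU feeding a single GPU.
print(suggested_active_cores(8, 1))
```

With hyper-threading disabled, pass `threads_per_core=1` so that only one logical core is reserved per GPU.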

Title: Re: Benchmarking a GPUs
Post by: Alterco on February 13, 2013, 06:59:15 PM
Hi,

Thank you for your quick answer.
It is too bad that this promising configuration could not work properly with Photoscan...
It could have been interesting for other work too...

But I will keep the GTX 680 instead of the faster ATI solution, because it seems that there are some issues with big scenes. Am I wrong?

Regards,

Adrien
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on February 13, 2013, 09:36:33 PM
Yes, stick with the 680 and it's OK.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on February 13, 2013, 10:08:12 PM
Hmm, from today's test with the new 13.1 drivers, it seems that the ATI drivers are fixed and VBO is working again. So large models can be opened and edited as fast as on NVIDIA cards.
Title: Re: Benchmarking a GPUs
Post by: Matt on February 19, 2013, 01:57:13 PM
We have a new toy to play with: http://www.asus.com/ROG/ARES26GD5/
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on February 19, 2013, 02:25:07 PM
Hmm, how high will it overclock? I think over 1200-1250 is possible.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on February 21, 2013, 08:04:49 PM
Hmm, early tests of the GeForce Titan in OpenCL mode: http://www.tomshardware.com/reviews/geforce-gtx-titan-performance-review,3442-10.html

It seems the speed will be disappointing in PhotoScan too.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on February 21, 2013, 08:16:10 PM
"We did bring these issues up with Nvidia, and were told that they all stem from its driver. Fortunately, that means we should see fixes soon." I suspect their fix will be "Use CUDA".

Nvidia has really dropped the ball on OpenCL. They don't support OpenCL 1.2, and they make it difficult to find their OpenCL examples; even their OpenCL link is not easy to find. However, their OpenCL 1.1 driver is quite good for Fermi and for the 680 and 690, despite what people say. But if the Titan has troubles, it looks like they will be giving up on the driver now as well, or purposely crippling it (I can't imagine they did not think to test some OpenCL benchmarks, which every review site uses). Nvidia does not care about OpenCL Nvidia users like myself anymore. I wish there were more influential people like Linus Torvalds who told Nvidia where to go.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on February 22, 2013, 06:18:05 PM
Hmm, an interesting set of benchmarks:

http://www.computerbase.de/artikel/grafikkarten/2013/test-nvidia-geforce-gtx-titan/13/

Yes, those are common tasks. PhotoScan uses very specific code that shines on ATI cards, and OpenCL is better supported on AMD cards. I think NVIDIA does this because they want to push their CUDA solution. CUDA is more "powerful", easier to code for, and used for a lot of scientific code. But OpenCL is independent of the hardware we use.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on April 17, 2013, 12:08:26 PM
http://www.tomshardware.com/reviews/geforce-gtx-titan-opencl-cuda-workstation,3474.html
Title: Re: Benchmarking a GPUs
Post by: Andrew on April 21, 2013, 12:40:57 AM
Wishgranter, you mentioned that latest ATI drivers seem to have fixed the VBO issues. Do you know if it is the case for all Windows versions (Win7, Win8 as well as Windows 2012 Server)?

It was also mentioned that Photoscan performs better (OpenCL-wise) on Windows 2012 Server than on Win7. Is it still the case using these latest drivers? What about regular Windows 8, is it as fast as 2012 Server or as slow as Win7?

Cheers,
Andrew
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on April 21, 2013, 10:21:29 AM
Yes, the VBO drivers are fixed.
Yup, the W2012 Server performance gain was best seen with NVIDIA cards: a 30% difference in performance on my 560Ti. From what I have read, it is possible that other cards are faster under W2012 Server too. I do not have much time to test all configs under all Windows versions, but take this as a good hint. Why this is possible with W2012 Server is a question for a longer investigation.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on October 24, 2013, 12:53:03 PM
Hi all, as the first benchmarks come out, it seems that COMPUTE on the new 290X is not as good as on the 7970 cards: http://www.anandtech.com/show/7457/the-radeon-r9-290x-review/17

But it seems that Nvidia will lower prices on the 780s and Titan cards in the next few weeks.
Title: Re: Benchmarking a GPUs
Post by: Wishgranter on November 22, 2013, 03:34:35 PM
test results from Marcell
Here are the results for the 'official' benchmark file, point cloud on Ultra:

Quote
finished depth reconstruction in 133.39 seconds
Device 1 performance: 765.774 million samples/sec (Hawaii)
Device 2 performance: 762.169 million samples/sec (Hawaii)
Total performance: 1527.94 million samples/sec

So with the official file, the R290 is about 25% faster than a 7970.
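
The "about 25%" figure can be checked with one line of arithmetic. The post does not say which 7970 result was used as the baseline, so this sketch assumes the 615.622 million samples/sec Tahiti figure from the log posted on February 05, 2013; `percent_faster` is an illustrative helper.

```python
def percent_faster(new_rate, old_rate):
    """Percentage by which new_rate exceeds old_rate."""
    return (new_rate / old_rate - 1.0) * 100.0


r290 = 765.774    # Hawaii, device 1 in the quoted log
hd7970 = 615.622  # Tahiti, from the earlier log (assumed baseline)

print(f"R290 vs 7970: {percent_faster(r290, hd7970):.1f}% faster")
```

Against that baseline the gain comes out a little under 25%, which is consistent with the rounded figure in the post.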
Title: Re: Benchmarking a GPUs
Post by: Peter on November 30, 2013, 07:21:21 PM
Finished building my workstation; I haven't overclocked it yet.
Specs are: i7-4960X, Asus Rampage motherboard, 64 GB of Corsair 1866 MHz RAM, dual 7970.

Here are the results on Ultra with PhotoScan 1.0:

[GPU] estimating 1764x2280x544 disparity using 882x760x8u tiles, offset -342
timings: rectify: 0.172 disparity: 3.453 borders: 0.063 filter: 0.187 fill: 0
finished depth reconstruction in 119.424 seconds
Device 1 performance: 200.474 million samples/sec (CPU)
Device 2 performance: 754.388 million samples/sec (Tahiti)
Device 3 performance: 754.261 million samples/sec (Tahiti)
Total performance: 1709.12 million samples/sec
Generating dense point cloud...
selected 16 cameras in 0.25 sec
working volume: 1640x1638x2566
tiles: 1x1x1
selected 16 cameras
preloading data... done in 0.64 sec
filtering depth maps... done in 5.266 sec
accumulating data... done in 6.266 sec
accumulator: 287.192 MB
octree constructed in 0.375 sec
nodes: 2161 (0.103728 MB)
points: 4069497 (40.695 MB)
nodes: 2161 (0.103728 MB)
points: 4069497 (40.695 MB)
4069497 points extracted
Finished processing in 134.722 sec (exit code 1)
Title: Re: Benchmarking a GPUs
Post by: florent.dallot on January 16, 2014, 11:47:26 AM
Hi, just some tests with NVIDIA GRID:

Device 1 performance: 153.9 million samples/sec (CPU)
Device 2 performance: 410.873 million samples/sec (GRID K2)
Device 3 performance: 413.247 million samples/sec (GRID K2)
Device 4 performance: 412.337 million samples/sec (GRID K2)
Device 5 performance: 414.145 million samples/sec (GRID K2)
Total performance: 1804.5 million samples/sec
Title: Re: Benchmarking a GPUs
Post by: tommyboy on January 22, 2014, 12:11:53 PM
Wishgranter, any luck with a nice list?  I keep flipping through the posts here yet don't feel like I quite have it all straight.

Please oh please has anyone benched an R9 290X yet?  It seems from latest Anandtech review, the 290X and 280X dance around each other, and the 290 is maybe 5% slower than the 290X:

http://www.anandtech.com/show/7481/the-amd-radeon-r9-290-review/14

Given that the 290 ran about 25% faster than the 7970 earlier in this thread, the 290X should then only manage 30-35% faster than 7970/280X, sound right?
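
The estimate above can be worked through explicitly. This sketch takes the two ratios quoted in the post at face value (290 about 25% faster than a 7970, 290 about 5% slower than a 290X); both are rough forum and review figures, not measurements made here.

```python
r290_vs_7970 = 1.25  # 290 throughput relative to a 7970 (from this thread)
r290_vs_290x = 0.95  # 290 throughput relative to a 290X (about 5% slower)

# If the 290 reaches 95% of a 290X, the 290X is the 290 figure divided by 0.95.
r290x_vs_7970 = r290_vs_7970 / r290_vs_290x

print(f"Estimated 290X vs 7970: {(r290x_vs_7970 - 1) * 100:.0f}% faster")
```

That lands at roughly 32%, inside the 30-35% range guessed in the post.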

Just about to plunk down on a new system here, and trying to decide between two 280Xs and a single 290X.  The single 290X would leave the option of a second 290X later; however, it sounds like people are having power and cooling problems getting a 2 x 290X system running nicely. Is anyone running a 2 x 280X system that runs alright with just air cooling?
Title: Re: Benchmarking a GPUs
Post by: ARF on February 15, 2014, 01:40:34 AM
Hi,

I've run the sample project on my machine:

ultra and mild depth settings.

Device 1 performance: 1072.65 million samples/sec (GeForce GTX 780 Ti)
Device 2 performance: 1054.08 million samples/sec (GeForce GTX 780 Ti)
Total performance: 2126.73 million samples/sec
Title: Re: Benchmarking a GPUs
Post by: tommyboy on March 18, 2014, 03:00:32 AM
Core i7-4930K (12 HT cores), 64GB RAM, dual R9 280X

We got the best GPU performance when limiting to 8 CPU cores as suggested by PS, making the 7970 about 8% faster.  Interestingly, the best overall time was achieved by setting to 10 CPU cores:

8 Cores
Device 1 performance: 160.09 million samples/sec (CPU)
Device 2 performance: 693.38 million samples/sec (Tahiti)
Device 3 performance: 696.031 million samples/sec (Tahiti)
Finished processing in 192.699 sec

9 Cores
Device 1 performance: 161.812 million samples/sec (CPU)
Device 2 performance: 660.052 million samples/sec (Tahiti)
Device 3 performance: 645.288 million samples/sec (Tahiti)
Finished processing in 176.007 sec

10 Cores
Device 1 performance: 162.184 million samples/sec (CPU)
Device 2 performance: 644.929 million samples/sec (Tahiti)
Device 3 performance: 648.454 million samples/sec (Tahiti)
Finished processing in 169.472 sec

Wishgranter, have you been updating your spreadsheet?  Would you be interested in publishing your spreadsheet so far as a shared Google Doc?
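
The three runs above can be put side by side to show the trade-off: GPU throughput is highest with 8 CPU cores, while total processing time is lowest with 10. The numbers are copied from the post; the data layout and `summarize` helper are illustrative only.

```python
# Figures copied from the three runs above (million samples/sec and seconds).
runs = {
    8: {"cpu": 160.09, "gpus": [693.38, 696.031], "seconds": 192.699},
    9: {"cpu": 161.812, "gpus": [660.052, 645.288], "seconds": 176.007},
    10: {"cpu": 162.184, "gpus": [644.929, 648.454], "seconds": 169.472},
}


def summarize(run):
    """Return (combined GPU rate, combined rate of all devices)."""
    gpu_total = sum(run["gpus"])
    return gpu_total, gpu_total + run["cpu"]


for cores, run in runs.items():
    gpu_total, total = summarize(run)
    print(f"{cores} cores: GPU {gpu_total:.1f}, all devices {total:.1f} M samples/sec, "
          f"finished in {run['seconds']:.1f} s")

best_gpu = max(runs, key=lambda c: summarize(runs[c])[0])
best_time = min(runs, key=lambda c: runs[c]["seconds"])
print(f"Best GPU throughput: {best_gpu} cores; best total time: {best_time} cores")
```

One possible reading, going by the other logs in this thread, is that the samples/sec figures cover only depth reconstruction while "Finished processing" covers the whole job, so the extra CPU cores may pay off in the later, CPU-bound stages.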
Title: Re: Benchmarking a GPUs
Post by: Oli63 on March 18, 2014, 02:00:21 PM
Do the latest results from ARF mean that the GTX 780 Ti is significantly faster than the 7970 Tahitis in tommyboy's results? Until now the common opinion was that it is preferable to buy AMD instead of Nvidia in terms of performance.
Title: Re: Benchmarking a GPUs
Post by: Exhale on March 23, 2014, 12:27:16 AM
By the way, did anyone try two SSDs in RAID 0 on this kind of powerful system?
Did you notice any significant performance gain?
Title: Re: Benchmarking a GPUs
Post by: ksau on March 27, 2014, 06:58:55 PM
Hi!

I am quite new to PhotoScan and love its possibilities. But, as always, one can never have enough speed ;-)

I am trying to get the best out of Amazon EC2 as described here:
http://acuasi.alaska.edu/2014/02/11/configure-windows-2012-for-nvidia-grid-on-amazon-ec2/

Photoscan works but I think not in OpenCL mode and not very fast... Could you explain how to run the benchmark you are comparing here? I do not get that  :o I downloaded the sample and I open sample01_smooth.psz. But what now? "Workflow" -> "Build dense cloud"?

Thank you and best regards
Keith
Title: Re: Benchmarking a GPUs
Post by: tommyboy on March 27, 2014, 10:08:12 PM
Quote
By the way, did anyone try two SSDs in RAID 0 on this kind of powerful system?
Did you notice any significant performance gain?

I have tried SSD vs spinning platter drive, where the performance difference should be much more noticeable, and didn't see any real difference.  I think if you have enough RAM, the only time it's really hitting disk is when loading the photos, and odds are they will be cached by the OS into RAM already, regardless of drive.
Title: Re: Benchmarking a GPUs
Post by: power64 on October 30, 2014, 04:53:01 AM
GPU and CPU Summary Sheet:

https://docs.google.com/spreadsheets/d/1iLX4iAVwcOJ0zyt5XNJW17ARz6lDrh0gJS5DAkXi_Ns/edit?usp=sharing

It is freely editable by anyone with the link, so update it as you see fit.  Please don't change numbers unless you are positive the ones in the sheet are off.

FYI, the values reported are from various posts in the Agisoft forum.  I have attempted to infer expected results for untested GPU cards from the SiSoft single-precision OpenCL GPU results. They look quite good for estimating the AMD cards, but not so good for expected Nvidia performance.

For any newbies checking out which card to buy: lots of users seem to like the Nvidia GTX 580 or faster cards. If you have a need for speed and don't mind possibly unstable performance, the AMD cards scream, such as the 7970 or better.

Please keep in mind that a fast GPU card setup will only speed up certain steps in the dense point cloud and heights, so a strong processor is also needed for the other Agisoft Pro functions.

Best Regards,
-Jerome (power64)

Title: Re: Benchmarking a GPUs
Post by: Wishgranter on October 30, 2014, 02:24:50 PM
Hi power64, thanks a lot for this summary.
Title: Re: Benchmarking a GPUs
Post by: JohnyJoe on February 18, 2015, 12:47:36 AM
Is it still true that ATI cards offer significantly better performance than NVIDIA cards in the same price range?

For example, I believe the GTX 960 and the R9 280 (and R9 280X) are in almost the same price range and their performance in games is similar. But due to something (ATI's better OpenCL drivers?), do these ATI cards offer a huge performance increase over NVIDIA cards in the same price range?

Is that still true, please?
Title: Re: Benchmarking a GPUs
Post by: petrovka on March 25, 2015, 02:30:04 PM
Hi, maybe this benchmark will help us

OpenCL benchmark:
http://compubench.com/result.jsp
Title: Re: Benchmarking a GPUs
Post by: JohnyJoe on April 03, 2015, 11:15:23 PM
Hm, maybe I saw this as well, but PhotoScan uses OpenCL 1.0 (or 1.1), I think, while the test probably uses later versions of OpenCL?

Can anyone confirm that this test is usable for comparing GPUs for Agisoft?
Title: Re: Benchmarking a GPUs
Post by: Lambo on June 09, 2015, 07:22:36 AM
Hello everyone, I just finished building a new machine for PhotoScan processing. It uses an Intel i7-5820K processor, 32 GB of DDR4 RAM and an AMD HD 7990 video card. For those who don't know this card, it is composed of two HD 7970s "sandwiched" together into one single card with 6 GB of memory.
The results I got were pretty good. I ran the building test in Ultra high.

Device 1 performance: 192.642 million samples/sec (CPU)
Device 2 performance: 869.168 million samples/sec (Tahiti)
Device 3 performance: 847.02 million samples/sec (Tahiti)
Total performance: 1908.83 million samples/sec

I will tweak the video card and overclock it a bit to see what it can do.
Will report back soon.

Leo
Title: Re: Benchmarking a GPUs
Post by: Lambo on June 22, 2015, 12:45:47 PM
Well unfortunately as some people might know already, even though this is a beast of a video card, the HD 7990 is not good at overclocking :(  Even with the highest settings you can set in the AMD Overdrive or Afterburner, the difference in performance is minimal.
Anyway, I am happy that the card performs so well in stock form though :)

Leo
Title: Re: Benchmarking a GPUs
Post by: barrubba on February 05, 2016, 10:18:55 PM
How do I test my GPU and compare it with this?
https://www.pugetsystems.com/labs/articles/Agisoft-PhotoScan-GPU-Acceleration-710/
Which test scene did they use?
thanks
Title: Re: Benchmarking a GPUs
Post by: 3DFranz on March 22, 2016, 04:00:40 PM
Anybody tested the Radeon R9 390X2 yet?   (=Dual-GPU, 2x 8 GB GDDR5 Ram, 2x Grenada Pro Chip at 1000 MHz)

are 2 R9 390X faster for Agisoft?

or should I use 2 GTX 980 Ti ?

Thx for your advice.

Title: Re: Benchmarking a GPUs
Post by: bulls4ever on September 02, 2016, 09:13:27 PM
I tested the Titan X Pascal: 1900 Msamples/sec on monument medium (stock settings). It gets hot when overclocked, and that is not really worth the extra 100.

My GTX 1080 does around 1080 at stock and 1330 at max overclock.

Title: Re: Benchmarking a GPUs
Post by: glennn on September 12, 2016, 02:00:12 PM
Currently testing Titan X Pascal in SLI.
I bought the wrong HB bridge, but tests with 2 cards installed are still terrible.
Has anyone else tried Titan X in SLI?
What was your setup?

I'm currently using an X79 rig with a 3960X CPU overclocked to 4.5 GHz.

Testing with the monument sample file on High quality, I am getting 532 seconds, which seems no faster than my 970 SLI rig.
Any suggestions?
Cheers
Title: Re: Benchmarking a GPUs
Post by: GrinGEO on September 12, 2016, 02:08:16 PM
Quote
Anybody tested the Radeon R9 390X2 yet?   (=Dual-GPU, 2x 8 GB GDDR5 Ram, 2x Grenada Pro Chip at 1000 MHz)

are 2 R9 390X faster for Agisoft?

or should I use 2 GTX 980 Ti ?

Thx for your advice.

I'm not sure what the benefit of this is. You can instead install 2x R9 390X with 8 GB each and you'll have the same performance, but it will cost about 200 EUR less.

Title: Re: Benchmarking a GPUs
Post by: Alexey Pasumansky on September 13, 2016, 11:50:21 AM
Hello glennn,

Using SLI won't improve the processing performance in PhotoScan, so it's better to remove the SLI connector.
Title: Re: Benchmarking a GPUs
Post by: GrinGEO on September 13, 2016, 12:12:26 PM
Probably the same goes for the CrossFire link, right? So GPUs should always be run in single mode, right?