Agisoft Metashape

Agisoft Metashape => General => Topic started by: Triplegangers on June 14, 2013, 12:56:12 AM

Title: Strange workstations testing results.
Post by: Triplegangers on June 14, 2013, 12:56:12 AM
 Hello Agisoft team. We at Infinite Realities started a neat experiment and are baffled by the results. It would be great if you could shed some light on it.
 We have two powerful workstations running Agisoft PhotoScan Professional edition 0.9.1.

(http://img10.imageshack.us/img10/8126/wstest.jpg)

 As you can see, the Photo Align stage was a total disaster for the Xeon workstation, which doesn't make any sense: it has two E5-2670s running at their max Turbo frequency of 3.3 GHz, against a single i7-3930K.
 In the Geometry Build stage we didn't see much difference in speed, although the Xeon workstation showed a dramatic speed boost during Depth Reconstruction and was already 10% into building geometry while the i7 workstation was still on the Depth stage. That may simply be because the GPUs kicked in, but they were busy only during Depth Reconstruction, which is sad because it's just 3-4% of the overall time.

The questions are:
How is it possible that 6 cores with 12 threads at 3.8 GHz overtake 16 cores with 32 threads at 3.3 GHz?
Why is the GPUs' potential not harnessed during all processing stages?
Title: Re: Strange workstations testing results.
Post by: Matt on June 14, 2013, 03:24:23 AM
I think you will find that the faster clock speed of the i7 (3.8 GHz) gives that machine the edge. Regardless of the number of cores, each i7 core is simply faster than a Xeon core. For this reason many people overclock i7 processors to speed up alignment etc. The two machines should perform pretty much the same during depth processing, as the GPUs are the same.
Title: Re: Strange workstations testing results.
Post by: meshmaster on June 14, 2013, 03:42:45 AM
I've got a few multi-processor Xeon boxes as well as a couple of single-processor i7 Extreme boxes.  Honestly, I've always found the i7 machines leave my Xeons in the dust.

:-/

Title: Re: Strange workstations testing results.
Post by: Wishgranter on June 14, 2013, 10:11:39 AM
Try reading a little about the efficiency of multiprocessor systems: every added core lowers the efficiency, so 4 cores run at around 96% efficiency, but with 16+ cores it comes down to 70% in some examples...

Title: Re: Strange workstations testing results.
Post by: RalfH on June 14, 2013, 10:50:04 AM
Thanks, Wishgranter. This is something I was wondering about a little while ago. I had a large project that was running for several days (CPU only, 8 core Xeon 3.2 GHz), and during the day I gave Photoscan only 2 out of 8 processor cores (so I could still do other work on the machine), but overnight Photoscan was allowed to use 7 out of 8 cores. Out of curiosity I took notes on how many ultra quality depth maps were created per hour. What I found was that using 7 instead of 2 cores only resulted in 2 times as many depth maps per hour (instead of 3.5 times as many). Multi-core efficiency appears to be a much bigger issue than I had expected. Is this something that could be improved by improving the software, or is it a hardware issue?
Title: Re: Strange workstations testing results.
Post by: Wishgranter on June 14, 2013, 12:59:53 PM
It's mostly a HW issue. Which OS do you use for it? Win8 is a little better in thread efficiency (5-15%)...

OK, I will do a few more tests on the dataset that is used for the GPU benchmarks, so we have comparable results and can see how it performs...
Title: Re: Strange workstations testing results.
Post by: RalfH on June 14, 2013, 01:01:20 PM
I am using Windows Vista 64 bit. What about using Linux?
Title: Re: Strange workstations testing results.
Post by: Alexey Pasumansky on June 14, 2013, 01:18:05 PM
Hello Alexander,

Please note that the real frequency during heavy processing steps is equal to the nominal one (3.20 GHz for the i7 and 2.60 GHz for the Xeons), while Turbo can be applied only for short periods of time when the core is quite cool.


As for the GPUs, they are only utilized during depth map reconstruction.
Title: Re: Strange workstations testing results.
Post by: Matt on June 14, 2013, 04:58:11 PM
I find the Xeons great; you just have to get the data processing centre specific chips: fewer cores with more GHz. Multi-CPU boards are currently the only way to get enough RAM to process really large projects in a single chunk.
Title: Re: Strange workstations testing results.
Post by: Triplegangers on June 15, 2013, 04:06:57 PM
 Thank you all for replying, there were some good ideas!

 I spent the last two days testing these two systems to understand how they deal with load. It turned out that classical testing, starting two machines simultaneously on the same task and seeing which one finishes first, doesn't really do them justice these days, especially once dual-CPU systems enter the equation.

 Digging for hidden potential in both systems, I decided to run 2 and 3 parallel Photoscan windows, working at the same time on a set of 6 photos. That started to show some very interesting results. The bottom table shows the timing for each test.

(http://imageshack.us/a/img822/7449/kboc.jpg)

 This graph shows how each system deals with the load during parallel Photoscan tasking. Obviously, exponential peaking is a bad thing.

(http://imageshack.us/a/img839/3457/tai3.jpg)

 While working on a single task, during conventional testing, the systems didn't show much difference in speed, only in cost ;D. However, when you multitask, you start to reach the Xeon station's true potential, as it deals with the load more efficiently, judging from that graph.

 Here are some results from the Xeon working on a set of 90 photos, each of them 18 MP!

(http://imageshack.us/a/img692/3313/1bqr.jpg)

 With that data, it's pretty clear that sequential chunk processing is not the most efficient way to work for those of us who need to process multiple sets of photos. I would therefore like to request a parallel chunk processing feature in Photoscan. Everyone would benefit from it.

 Here's an example.

 Sequential processing of 5 sets, 90 photos each:
 107 + 107 + 107 + 107 + 107 = 535 min

 Parallel processing of 5 sets, 90 photos each:
 128 + 128 + 107 = 363 min

And that is 172 minutes of saved time, which we can spend processing two more sets and walking the dog  :)
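The arithmetic in this example can be checked with a short sketch. It assumes, as the example does, that a batch of two sets run side by side takes 128 minutes and a single set on its own takes 107 minutes. The function name and its parameters are made up for illustration and are not part of Photoscan.

```python
def total_minutes(n_sets, parallel, batch_minutes, single_minutes):
    """Total wall time for n_sets when `parallel` sets run side by side.

    batch_minutes  -- measured time for one full batch of `parallel` sets
    single_minutes -- measured time for one set running alone
    Assumption: a partial last batch is only modelled when it holds exactly
    one set, because those are the only two timings given in the post.
    """
    full_batches, leftover = divmod(n_sets, parallel)
    if leftover > 1:
        raise ValueError("no timing available for a partial batch of this size")
    return full_batches * batch_minutes + leftover * single_minutes


sequential = total_minutes(5, 1, 107, 107)   # five sets, one after another
parallel = total_minutes(5, 2, 128, 107)     # two batches of two, then one alone
print(sequential, parallel, sequential - parallel)   # 535 363 172
```

The difference is 172 minutes, a bit under three hours, or roughly a third of the sequential time.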

Also, depending on the system's potential, it would be cool to be able to set the number of parallel processes Photoscan runs. If the sets are light, like 20-35 photos, it could be set to 4, maybe even 5. If they are heavy, 100 photos and more, it could be set to 2.

I would like to hear what you all think; maybe you have something to add or see where I'm wrong.

Some screen shots of nice smooth synchronized parallel processing here:

(http://img837.imageshack.us/img837/5347/q1sb.jpg)
(http://img708.imageshack.us/img708/3545/xckw.jpg)
Title: Re: Strange workstations testing results.
Post by: James on June 15, 2013, 07:09:21 PM
Don't suppose it makes a massive difference but did you try parallel processing identical chunks with the same images, or completely different chunks with different images? Just wondered if there may be any caching of anything anywhere in that case that might make the results look better than they really are. I doubt it really but it might have an effect.

Brilliant work though :)
Title: Re: Strange workstations testing results.
Post by: Triplegangers on June 15, 2013, 08:28:25 PM
Hey James, from what I know caching is not something that happens for no reason. You have to really sweat a little on the code side to make caching possible.

Also, I did run the test on totally different sets of photos as well as on identical ones, and did not notice anything of that sort.
Title: Re: Strange workstations testing results.
Post by: Infinite on June 16, 2013, 01:41:22 AM
Thanks for sharing these results Alexander, it's good to see people talking about this topic and sharing stats.
Title: Re: Strange workstations testing results.
Post by: Wishgranter on June 16, 2013, 01:49:20 PM
Hi all, today I will try to put a few things together so that you understand the problems of multicore systems and how to improve a few things...

Does anybody here have a 4-socket system, AMD or Intel, so we can test how it performs as we add even more cores to the system??
Title: Re: Strange workstations testing results.
Post by: Wishgranter on June 16, 2013, 04:34:03 PM
OK, this is not easy stuff to understand; it needs a somewhat better understanding of how CPUs work, how the OS works, and how well the software is written... I will try to give as much explanation in text here as possible, some shorter, some longer... If you're interested, I can post many more links with explanations.

In short, programming for multicore is not an easy task, and adding more cores to the system gives you LOWER efficiency. So when you, say, double the number of cores, going from an i7 to a dual Xeon, you do NOT get a linear speedup of the process. Why?

Think of yourself as the director of a small company with 4 employees. Dividing work among them is easy: 4 people, the work is clearly divided, so everyone gets 25% of, say, baking a cake. One gets the ingredients, another mixes them together, the third does the baking, and the fourth puts it all together. Easy to command: just tell them one by one what to do...

When you go to 16 employees, it takes much deeper thought about who will do what. You need to carefully divide the same work among more people, and think carefully about what each one gets so that they all do the same amount of work... Just imagine giving 16 people their instructions...
So when we get an even larger pile of things to do, planning and delivering it properly becomes more complicated. You start to use some of the people just to synchronize the others. Say that for every 8 people there is one who just commands: that means that of 16 people "working", 2 do no real work, they just organize all the others, so 16 - 2 = 14 people work on what they should... With 32 people (cores), 4 of them just organize. With 64, it's not just 1 in every 8: you need 2 extra to command the 8 supervisors, so you end up with 64 people of whom just 54 work (8 supervisors plus 2 managing them). So overall efficiency starts to decrease...
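To put numbers on the analogy, here is a minimal sketch that turns the head counts from the post into the share of people doing actual work. The coordinator counts are taken as given in the post rather than derived from a rule, and the names are illustrative only.

```python
# (total people, people who only coordinate), taken from the analogy above.
# For 64 people: 8 supervisors plus 2 who manage the supervisors.
TEAMS = [(16, 2), (32, 4), (64, 8 + 2)]


def working_share(total, coordinators):
    """Fraction of the team that does actual work rather than coordination."""
    return (total - coordinators) / total


for total, coordinators in TEAMS:
    share = working_share(total, coordinators)
    print(f"{total:>2} people: {total - coordinators} working ({share:.1%})")
```

The share stays at 87.5% for 16 and 32 people and drops to about 84% at 64, once a second layer of management appears.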


So, back to our problem, with hard numbers :-)

Alexander and I have tested "identical systems" (CPU + Windows 2012 Server). My system is set with TurboCore DISABLED (2.7 GHz); Alex's had TurboCore ENABLED (2.7 GHz, up to 3.4 GHz max). Both are 32-thread systems.
We tested on this dataset for easier comparison: http://downloads.agisoft.ru/photoscan/sample01.zip
Settings: align stage MEDIUM, no mask, no pairs, 40k points

(https://dl.dropboxusercontent.com/u/15047343/benchmark.jpg)

As you can see in test 1, enabling TurboCore on what are otherwise the same CPUs makes only a very small difference. Why? Because TC is meant for processes with one to four threads, so TC is active for just a few moments...

Result 2
After going to TASK MANAGER > DETAILS, selecting photoscan.exe, right-clicking on it and setting the PRIORITY to REALTIME (more on this later), photoscan.exe runs at a higher priority level, so the OS treats this process as the most important one and other running software gets a smaller share of CPU time. The system is then a little unresponsive because of the realtime level... But as you can see, we get only approx. a 2% speedup!! The results are in the 2nd line...

Result 3
For the 3rd test we disabled Hyperthreading in the BIOS, so we have just the real cores (2x8 cores = 16 threads, not 32), and ran the test...
And voila, as you can see in the 3rd result, we get approx. a 68% speedup just by disabling HT!!
So just by DISABLING Hyperthreading we get a very decent speedup. But WHY do we get BETTER results when we lower the number of threads???


The short explanation is that the VIRTUAL (HYPERTHREAD) cores are not real cores, so a virtual core MUST share the L1 cache and a few other resources... Even Intel explains that we can expect only a 10-15% speedup from HT. But why do we see such a big difference in the results in Pscan? From my perspective, as far as my knowledge goes: the Agisoft team has written VERY efficient code (this can be seen with the REALTIME setting ON: even when all CPU resources are given to pscan, it can squeeze out just 2% more). But they wrote the pscan subroutines so that they treat every core as a real core, which does not fit hyperthreading (an HT core gets its resources only when the real core is waiting on data). Here is a better explanation:

All threads are not created equal. Two hardware threads might be on separate chips, on the same chip, or even on the same core. The most important configuration for game programmers to be aware of is two hardware threads on one core—Simultaneous Multi-Threading (SMT) or Hyper-Threading Technology (HT Technology).
SMT or HT Technology threads share the resources of the CPU core. Because they share the execution units, the maximum speedup from running two threads instead of one is typically 10 to 20 percent, instead of the 100 percent that is possible from two independent hardware threads.
More significantly, SMT or HT Technology threads share the L1 instruction and data caches. If their memory access patterns are incompatible, they can end up fighting over the cache and causing many cache misses. In the worst case, the total performance for the CPU core can actually decrease when a second thread is run.


so Agisoft

So for those interested in deeper knowledge, read these links:
 
1. http://scalibq.wordpress.com/2012/06/01/multi-core-and-multi-threading/
2. http://en.wikipedia.org/wiki/Simultaneous_multithreading
3. http://en.wikipedia.org/wiki/Amdahl%27s_law !!!
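Amdahl's law, the third link above, fits in a few lines. The sketch below uses a parallel fraction of 0.90, which is an assumed value chosen purely for illustration and not a measurement of PhotoScan. It shows how quickly per-core efficiency falls as cores are added, even when most of the work parallelises.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup from Amdahl's law for a given parallel fraction of the work."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# 0.90 is an assumed value for illustration, not a measurement of PhotoScan.
P = 0.90
for cores in (4, 6, 8, 16, 32):
    s = amdahl_speedup(P, cores)
    print(f"{cores:>2} cores: speedup {s:.2f}x, efficiency {s / cores:.0%}")
```

Under that assumption 6 cores give a 4x speedup while 16 cores give only 6.4x, so each extra core contributes less than the one before it. The law covers only the serial share of the work; it does not model cache contention between Hyper-Threading siblings.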

So for now, try disabling Hyperthreading on your i5, i7, or any other CPU with HT, and post your results...

Title: Re: Strange workstations testing results.
Post by: Triplegangers on June 16, 2013, 10:32:05 PM
Hey Milos! Thank you for sharing that info!

I second that! Switching off Hyper-Threading (HT) did show a nice improvement in processing speed. However, switching off Turbo Boost only crippled general productivity on my system, as my Xeons boost not just for a second or two but work at over 3 GHz during the whole process. Maybe it's due to the cooling system. So I can confidently say boosting helps a great deal.

(http://img842.imageshack.us/img842/2893/0tgj.jpg)

So, funny thing about HT: apps either support it or they don't. I never knew it could work against you. It's as if all those employees (in Wishgranter's post) started to argue with each other and screw around, pissed off :)

Also, my tests showed that HT OFF only helps during the Photo Align process. Geometry Build took the same time with it ON or OFF.

IMHO, based on the testing data, Photoscan doesn't support HT and has faulty algorithm(s) responsible for the Photo Align process. It sounds reasonable to make it not use the HT feature unless we benefit from it.

Would be nice to hear what Agi Team has to say.
Title: Re: Strange workstations testing results.
Post by: Infinite on June 16, 2013, 11:28:05 PM
Hey Milos! Thank you for sharing that info!

Would be nice to hear what Agi Team has to say.

Agreed! This could have an impact on how future users build rack servers for 'possible' network processing!!  8)
Title: Re: Strange workstations testing results.
Post by: RalfH on June 17, 2013, 12:14:19 PM
Yet another test. I used sample01_align.psz from the test data set (http://downloads.agisoft.ru/photoscan/sample01.zip) and tested building geometry with 1, 2, 3... 8 cores of an 8-core Xeon workstation (2 quad-core CPUs), processing with CPU only, no hyperthreading.

The results are very interesting, as they show that multicore efficiency is very low. Using all 8 cores you get only about 2.5 times as many samples per second as you would get using a single core; about 68% of the total computing power is lost to "management" and synchronisation. Computation speed per core drops from 18.2540 Mio. samples per second (one core) to 5.7956 Mio. samples per second (8 cores).
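The speedup and the lost share follow directly from the two per-core figures. A small sketch of that calculation, with a function name that is mine and not anything from Photoscan:

```python
def scaling(per_core_single, per_core_multi, cores):
    """Return (speedup, efficiency) from per-core throughput measurements."""
    total_multi = per_core_multi * cores          # aggregate throughput on all cores
    speedup = total_multi / per_core_single       # relative to one core alone
    efficiency = speedup / cores                  # 1.0 would be perfect scaling
    return speedup, efficiency


# Million samples per second per core, as reported above.
speedup, efficiency = scaling(18.2540, 5.7956, 8)
print(f"speedup {speedup:.2f}x, efficiency {efficiency:.1%}, lost {1 - efficiency:.1%}")
```

With these inputs the speedup is about 2.5x and the loss comes out at roughly 68 percent, close to the figure quoted in the post.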

Running two parallel instances of Photoscan seems to be a possible solution to reduce total processing time (if you have several projects to be processed), because the total number of samples per second is higher for running two projects at the same time than running them one after the other. Assigning a separate group of cores to each Photoscan instance is slightly more efficient than letting them share all 8 cores on their own terms. Assigning cores by physical CPU further slightly increases efficiency.
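Assigning a separate group of cores to each instance comes down to choosing core indices and, on most systems, expressing them as an affinity bitmask. The sketch below only computes the groups and masks. It assumes that cores 0-3 sit on the first physical CPU and cores 4-7 on the second, which is an assumption about numbering that should be checked on the actual machine.

```python
def split_cores(core_count, groups):
    """Split core indices 0..core_count-1 into `groups` contiguous, equal blocks."""
    if core_count % groups:
        raise ValueError("core_count must divide evenly into groups")
    size = core_count // groups
    return [list(range(g * size, (g + 1) * size)) for g in range(groups)]


def affinity_mask(cores):
    """Bitmask with one bit set per core index, the usual encoding for CPU affinity."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


for instance, cores in enumerate(split_cores(8, 2), start=1):
    print(f"instance {instance}: cores {cores}, mask {affinity_mask(cores):#x}")
```

For two instances on 8 cores this gives masks 0xf and 0xf0. A hexadecimal mask of this kind is what the Windows `start /affinity` command expects, and Task Manager offers the same choice through its "Set affinity" dialog.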

Of course this sheds an interesting light on batch processing, where several tasks are processed one after the other. Instead of processing several chunks in one project using batch processing, it would be much more efficient to split the project and process the different chunks in separate parallel instances of Photoscan.
Title: Re: Strange workstations testing results.
Post by: Wishgranter on June 17, 2013, 05:34:42 PM
Ralf, for the depth stage there are GPUs, up to 10-15 times faster per GPU. Depth map generation is designed to be processed on the GPU, not the CPU; it could theoretically be made a LITTLE faster on the CPU, but it would take a lot of effort... CPUs are not ideal for data like this. They are great at "serial" work, while GPUs are best for parallel work, because there we have, say, 1500+ cores at our disposal, versus 8 cores on the CPU (it's not that easy to compare, but take it as a guideline for understanding)...

For now Pscan is slow in the ALIGN and MESH stages. I think the AGISOFT team could later move them to the GPU, but it's not such an easy task. If they could be ported to GPUs, we would not even need a network-ready solution (which is even more problematic to develop). With Align on GPUs we could see an improvement of approx. 10-20 times, from what I know about the ALIGN stage, and even more with MESH generation... Why? Because GPUs are ideal for tasks like this...

With ONE 7970 I get approx. 690 million samples per second... so it's about 38x faster...
Title: Re: Strange workstations testing results.
Post by: Diego on June 17, 2013, 07:06:42 PM

IMHO, based on the testing data, Photoscan doesn't support HT and has faulty algorithm(s) responsible for the Photo Align process. It sounds reasonable to make it not use the HT feature unless we benefit from it.


I think that's a wrong assessment. My tests show the exact opposite.

HT On is much faster. Maybe it has to do with the board, memory or operating system; I don't know.

My results with the proposed example.

Machine 01

Dual Intel Xeon X5677 @3.47 GHz. (4 Cores CPU; 4 Cores Virtual HT)

HT On 16 Cores Finished processing in 118.002 sec

HT Off 8 Cores Finished processing in 145.382 sec

Machine 02

Intel Core i7 3740 QM @2.70 GHz. (4 Cores CPU; 4 Cores Virtual HT)

HT On 8 Cores Finished processing in 140.927 sec

HT Off 4 Cores Finished processing in 180.951 sec

Clearly HT works very well, and the Agisoft algorithm does take advantage of it, at least in my case, on two machines with very different hardware.
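For comparison with the other posts, these timings can be turned into a speedup factor. A minimal sketch using only the four numbers above; the dictionary layout and function name are illustrative:

```python
# Alignment times in seconds, copied from the results above.
RESULTS = {
    "Dual Xeon X5677": {"ht_on": 118.002, "ht_off": 145.382},
    "Core i7 3740 QM": {"ht_on": 140.927, "ht_off": 180.951},
}


def ht_gain(ht_on, ht_off):
    """How many times faster the run with HT on was compared with HT off."""
    return ht_off / ht_on


for machine, t in RESULTS.items():
    gain = ht_gain(t["ht_on"], t["ht_off"])
    print(f"{machine}: HT on is {gain:.2f}x faster ({gain - 1:.0%} gain)")
```

That works out to roughly a 23% gain on the dual Xeon and 28% on the i7, the opposite direction from the dual E5 results reported earlier in the thread.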
Title: Re: Strange workstations testing results.
Post by: Wishgranter on June 17, 2013, 07:10:05 PM
You forgot to mention which OS you are using!!!! That's the MOST important thing... Is it fully up to date with updates, or???
Title: Re: Strange workstations testing results.
Post by: Diego on June 17, 2013, 07:15:12 PM
Hi Wishgranter,

Windows 8 Enterprise x64, Updates On, all time
Title: Re: Strange workstations testing results.
Post by: Wishgranter on June 17, 2013, 07:18:03 PM
Hmm, this is really very interesting!!!! So we will need to do more tests...

And don't be surprised, but Xeons and i7s are the same hardware inside; only the cache size and some microcode differ. Architecturally it's the same thing...

Title: Re: Strange workstations testing results.
Post by: Diego on June 17, 2013, 07:20:47 PM
I agree
Title: Re: Strange workstations testing results.
Post by: Wishgranter on June 17, 2013, 07:31:40 PM
Hmm, we (Alexander and I) use W2012 Datacenter edition with a few tweaks, so I will dig into this. After my digging into which OS to use I stuck with this one, as I get better results on Nvidia GPUs with it. The Datacenter edition should be the fastest of all WIN versions (long story)...
But as I see it now, WinShitt has done great work to cripple it so that people need to upgrade their hw and sw... It's hidden, so not everyone finds the easy way out...

   
Title: Re: Strange workstations testing results.
Post by: Alexey Pasumansky on June 17, 2013, 10:39:54 PM
Hello all,

Thank you for the detailed descriptions of your experiments. They are quite informative and interesting.

We think there could be a saturation threshold in the number of cores, beyond which controlling the processing and parallelizing tasks requires so many resources that they can no longer be used for the calculations themselves. Even a drop in performance could be observed.

And as Wishgranter has already mentioned, multithreaded performance can differ between operating systems.

Title: Re: Strange workstations testing results.
Post by: James on June 17, 2013, 11:53:55 PM
Hey James, from what I know caching is not something that happens for no reason. You have to really sweat a little on the code side to make caching possible.

I would say I wish I'd paid attention in operating systems and computer systems architecture lectures, but I did pay attention and it didn't help...
Title: Re: Strange workstations testing results.
Post by: RalfH on June 18, 2013, 12:03:52 PM
Wishgranter,

yes, I know that GPU would be much faster for depth reconstruction, but the machine I'm working on doesn't have a powerful GPU, so CPU is all I can work with for the time being. I think that my test results can be useful for everybody who has to work with a multi-core CPU only.

Anyway, I have done the same test series for the Photoscan align stage (same data set as before), and the results are partially similar but not quite as bad as for depth reconstruction: Total core time (seconds x number of cores) for the task increases as more cores are used. If multi-core efficiency was 100%, total core time would remain constant. Using all 8 instead of only 1 core almost doubles total core time - 47% of the total processing power are used for "management" and synchronisation.
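The two statements in this result, core time almost doubling and 47% of the processing power going to overhead, describe the same thing. A short sketch of the relationship, with helper names that are mine:

```python
def core_time_growth(lost_fraction):
    """Factor by which total core time grows when `lost_fraction` of capacity is overhead."""
    return 1.0 / (1.0 - lost_fraction)


def effective_speedup(cores, lost_fraction):
    """Wall-clock speedup over one core once the overhead is subtracted."""
    return cores * (1.0 - lost_fraction)


print(f"core time grows {core_time_growth(0.47):.2f}x")
print(f"8 cores behave like {effective_speedup(8, 0.47):.2f} ideal cores")
```

A 47% loss means core time grows by about 1.89x, and the 8 cores deliver the throughput of a little over 4 ideal ones.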

Interestingly, processing two parallel align tasks is only slightly more efficient than running them one after the other, and core or CPU sharing brings counterintuitive results (with the total core time increasing instead of decreasing if a separate group of cores is assigned to each Photoscan instance).

P.S.: Forgot to mention that this and the previous test were run under Windows Vista 64 bit, using Photoscan standard edition, version 0.9.1 build 1621.
Title: Re: Strange workstations testing results.
Post by: gEEvEE on June 18, 2013, 04:24:18 PM
Hi all,

for what it is worth, I have reported this problem in a private email already three months ago to Dmitry and Alexey. I did not receive any answer so far, but will just share that email and included results.

Quote
I am now back from a National Geographic campaign in Greece, where we documented the Bronze Age site of Akrotiri. I returned with thousands of photographs and during the processing of them, I noticed some very strange CPU behavior in PhotoScan (the latest version, build 1640). What I will describe was only noticed now, so I do not know if it was also an issue in previous versions.

I ran several tests on the two computers that I have (both running Windows 7 Ultimate 64-bit). In my home PC, I have an Intel Core i7 Extreme 980X 3.33 GHz with 6 cores (so a total of 12 threads) and 24 GB of RAM. On our workstation at the institute, we have two Intel Xeon E5-2690 2.9 GHz processors, good for 2 * 8 cores or a total of 32 threads, and 128 GB of RAM. This Xeon processor is so far the third fastest ever (http://www.cpubenchmark.net/high_end_cpus.html) and thus a real processing beast.

However, photoset alignment on my home PC is at least twice as fast as on our workstation. This is very annoying, since this super PC has cost us € 10 000 and was bought just for PhotoScan.
I aligned about 2200 photographs on my home PC and on the workstation, but “looking for image pairs” was 2.5 times slower on the workstation (although all processors worked 100 %).

Afterwards, I did a test with 50 JPEGs: these were the results: see comparison.png

As you can see, finding interest points is slightly faster on the workstation, but the pair selection is at least twice as slow. Can it be that this is not at all optimized for two physical processors and that the computation is done twice?

To make sure that the CPUs and the motherboard are not wrongly configured, I ran some benchmarks (also of the GPU, but both have an Nvidia GeForce GTX 580 and this is not of interest for this SfM step). The results below show several highly intensive computations performed with the Performance test of Passmark (http://www.Passmark.com). An overview of the CPU test can be found here: https://www.dropbox.com/s/l6suugfgonxx66z/CPUs.jpg

As you can see, the workstation outperforms the home PC in ALL possible CPU computations.

What could be the reason that PhotoScan performs so much slower on the super PC, although both PCs have all processors 100% in use?
I really hope that you can help us out with this.

Best regards,

Geert

Title: Re: Strange workstations testing results.
Post by: Wishgranter on June 18, 2013, 05:09:10 PM
GeeVee, which OS did you use for the tests?? It seems that something is broken in Windows: not a hw issue, but much more a sw issue... I have prepared my HDD for installing Win XP, 7, 8, the Win Server editions, Ubuntu and OSX. I will try to test on every one of these OSes, and after that we should have some data that shows what we cannot see now... From my perspective it's not pscan's fault but much more a matter of OS thread handling... I hope to have the tests done by the end of the week...
Title: Re: Strange workstations testing results.
Post by: gEEvEE on June 18, 2013, 10:54:48 PM
Windows 7 ultimate 64-bit

If it would be a Windows issue, I think that the benchmark tests wouldn't show the true potential of the Workstation's CPUs. In most software applications, the Xeons blow the i7 away...

Geert
Title: Re: Strange workstations testing results.
Post by: Triplegangers on June 18, 2013, 11:18:18 PM
Ahhh, do I love this thread? So much weird and fascinating stuff going on. Everyone is getting strange results. Now maybe, just maybe, our own observation of the tests affects the observed reality  :o Quantum physics theory applied to computer software testing  ;D  OK, back to our strange stuff.

I spent some time installing and setting up Windows 7 just to run these tests and see if the OS makes much difference. And here are some peculiar results:

(http://img23.imageshack.us/img23/3451/2hy4.jpg)

Switching to Windows 7 did show some positive change in speed. But the difference between 402 and 394 seconds is very small and I will neglect it, saying that there was no difference between the two OSes at this stage. However, when I started playing with HT, weirdness came up. Not only did it work faster with HT off, it also performed 10% better under Windows 7. Although I think that's only due to a software issue.

My conclusion is that the OS doesn't matter that much, and I will stop my tests at this stage before I lose my mind. Clearly, and I hope you all will agree, we have a problem that affects Photoscan performance, and it's quite complex, starting with OS and BIOS settings and ending with complex workstation hardware.

I hope that in the future we will see a Photoscan that can optimize itself for best performance, based on the system hardware and OS. Feature request maybe!?  ???

I also wish the AgiTeam would implement parallel chunk processing some time soon.
Title: Re: Strange workstations testing results.
Post by: Wishgranter on June 18, 2013, 11:46:35 PM
No, the Agisoft team has done excellent work with parallelism in pscan. For everyone, it's better to switch OFF HyperThreading, as it's not something we can use for our data sources...

For those who understand a little bit more, here is a link... I think it makes clear why it's better to disable it...

http://bitsum.com/pl_when_hyperthreading_hurts.php

and this one

http://www.agner.org/optimize/blog/read.php?i=6
Title: Re: Strange workstations testing results.
Post by: Triplegangers on June 19, 2013, 12:20:31 AM
Hey Milos,

Thank you for that link! I was wondering if there was a way to switch off HT for individual programs from within Windows. I hate turning it off in the BIOS; it hurts other software and processes that do benefit from HT.

Instead of completely disabling HyperThreading, you can use programs like Process Lasso (free) to set default CPU affinities for critical processes, so that their threads never get allocated to logical cores. We call this feature HyperThreaded Core Avoidance. It is better than completely disabling Hyper-Threading because it leaves the rest of the system free to take advantage of this otherwise useful feature.

http://bitsum.com/processlasso It's free and totally works, I just tested it. It also has many more interesting tools.
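The idea behind that kind of core avoidance can be sketched in a few lines: keep one logical CPU per physical core and restrict the process to that set. This is not how Process Lasso itself is implemented. The sketch assumes that sibling hardware threads are numbered consecutively, which is often the case on Windows but is not guaranteed. The pinning helper uses `os.sched_setaffinity`, which Python exposes on Linux only.

```python
import os


def physical_only(logical_count, threads_per_core=2):
    """Pick the first logical CPU of every physical core.

    Assumes sibling hardware threads are numbered consecutively
    (0 and 1 share a core, 2 and 3 share the next, ...). That layout is
    an assumption, so verify it on your machine before relying on it.
    """
    return list(range(0, logical_count, threads_per_core))


def pin_current_process(cpus):
    """Restrict this process to `cpus` where the OS exposes the call (Linux)."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, set(cpus))
        return True
    return False  # e.g. Windows or macOS: use a tool such as Process Lasso instead


print(physical_only(12))   # a 6-core CPU with HT enabled shows 12 logical CPUs
```

On a 6-core CPU with HT enabled this selects logical CPUs 0, 2, 4, 6, 8 and 10, leaving the sibling threads free for the rest of the system.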

No, the Agisoft team has done excellent work with parallelism in pscan
I was speaking about parallel chunk processing, the feature requested here:
http://www.agisoft.ru/forum/index.php?topic=1337.0
Title: Re: Strange workstations testing results.
Post by: jedfrechette on June 19, 2013, 02:19:00 AM
My Xeon E5-2687W is significantly faster with Hyper Threading Enabled, regardless of OS.

The following tests were done using the "sample01" data on the same machine under Debian Jessie and Windows 7 Pro. In both cases PhotoScan Pro 0.9.0 build 1586 was used. The times below are the best of three runs for alignment only.

Hyper-Threading On
====================
Debian: 65.7 s
Win 7: 80.1 s

Hyper-Threading Off
====================
Debian: 84.3 s
Win 7: 92.9 s

It would be interesting to have an automated benchmark built in to PhotoScan. Ideally with the option to upload the results to a public database so that it would be easier to compare performance on a variety of hardware platforms. I had a go at hacking something together in Python a few months ago but ran in to various road blocks and ultimately put it aside.
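The "best of three runs" method used for the numbers above is easy to capture in a small harness. The sketch below times an arbitrary callable and uses a stand-in workload. Hooking it up to an actual PhotoScan alignment run is left out on purpose, since the scripting calls for that are not shown in this thread.

```python
import time


def best_of(task, runs=3):
    """Run `task` several times and return (best_seconds, all_timings)."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)
    return min(timings), timings


def placeholder_workload():
    """Stand-in for the real job; replace with the processing step to be measured."""
    return sum(i * i for i in range(200_000))


best, timings = best_of(placeholder_workload)
print(f"best of {len(timings)}: {best:.3f} s")
```

Taking the minimum rather than the mean filters out runs disturbed by other activity on the machine, which matters when the differences being compared are only 10-20%.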
Title: Re: Strange workstations testing results.
Post by: Wishgranter on June 19, 2013, 11:30:09 AM
Jed, thanx for the results....

I don't want to confuse people, but the results are OK. The best results could/should be under WinXP and Linux, then W7 and Win 8. It's about how the OS shares resources etc. Back in the 90s we tested Win 3.11, W95 and Win NT with pshop, and the result was that Win 3.11 was the fastest OS back then.

I will do the tests as mentioned, so we get a clear understanding of how it works and where it runs best...
   

Title: Re: Strange workstations testing results.
Post by: airmap3d on June 23, 2013, 04:32:38 PM
Hello all,

I have just run a test on my machine and have come up with the following...

I used a large project of mine and tested align photos with hyper threading both on and off.  I am running windows 7 pro with dual xeon E5620's, Photoscan Pro 0.9.0 (1586) and it was 20% faster with hyper threading enabled.

I will continue to use PS with HT enabled, but what I gather from all of this is that every machine and config is going to be different, so I would suggest testing your system and running with what works best for you. Seems a simple solution to me.

Cheers!!
Title: Re: Strange workstations testing results.
Post by: Wishgranter on June 23, 2013, 05:04:58 PM
Wooow, really weird results... My question is: did you test it on the same dataset? Just to be sure :-)

Title: Re: Strange workstations testing results.
Post by: airmap3d on June 24, 2013, 03:45:05 PM
Yep, same data. It was about 500 images so was a reasonable sized job.

This thread seems to be getting some very mixed results! All very confusing!!
Title: Re: Strange workstations testing results.
Post by: Wishgranter on June 24, 2013, 06:25:26 PM
Yes, a little confusing, but every Windows version (and Linux too) accesses the CPU resources in a slightly different way. Explaining it here is really difficult for people who have no extra knowledge of this stuff.

 
Title: Re: Strange workstations testing results.
Post by: RalfH on June 24, 2013, 06:33:12 PM
Also, it shows that there could be a good potential for software optimisation - Photoscan could read hardware and OS data and use them to optimise hardware usage.
Title: Re: Strange workstations testing results.
Post by: jedfrechette on June 25, 2013, 07:11:32 AM
Quote from RalfH: "Also, it shows that there could be a good potential for software optimisation"

Not to mention meat-space optimization. Give users a simple way to reliably compile useful benchmarks and I have little doubt that optimum configurations would be identified quite rapidly.
Title: Re: Strange workstations testing results.
Post by: Wishgranter on July 06, 2013, 12:06:05 PM
Hi all, here is another interesting article on many-core systems, this time a 4P system with Windows Server 2012. Read it carefully.

http://www.anandtech.com/show/7121/trials-of-an-intel-quad-processor-system-4x-e54650l-from-supermicro
Title: Re: Strange workstations testing results.
Post by: Wishgranter on July 15, 2013, 11:19:45 PM
Hi all, I'm digging into the benchmarks. Something is not OK under Windows, but I will post the results after the Agisoft team says what they think about it.
Title: Re: Strange workstations testing results.
Post by: Wishgranter on July 16, 2013, 12:00:25 AM
Who could help us with a few benchmarks? I need a few more runs to be sure about the results.
Ideally a 2P or even 4P system would be tested on the same, somewhat bigger dataset of approx. 50 images. We need to run tests under WinXP, Win7, Win8, Windows Server 2012, Ubuntu and OS X. It doesn't have to be run on every OS, just enough to get an overview. Single-CPU systems too, ideally something with a high overclock.

The difference at the Align stage is over 390% on the same hardware setup, so everyone could benefit from this.
Title: Re: Strange workstations testing results.
Post by: tincansassoc on July 16, 2013, 02:28:22 AM
Can you provide a set of images for everyone to run? Your benchmark comparison should be more accurate if everyone runs the same data sets. Maybe two different example sets and specify the settings we should try tweaking. If bandwidth is an issue maybe you could upload the datasets as a torrent file and we can do P2P transfers. Just a thought.
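One way to keep shared results comparable would be for everyone to report them in the same format. Below is a minimal sketch using only the Python standard library; the column names, the dataset label and the timing value are hypothetical placeholders, not measurements from this thread.

```python
import csv
import io
import os
import platform


def machine_info():
    """Collect basic OS and CPU details using the standard library only."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "logical_cpus": os.cpu_count(),
    }


def result_row(dataset, stage, seconds):
    """Combine machine details with one timed stage into a single record."""
    row = machine_info()
    row.update(dataset=dataset, stage=stage, seconds=round(seconds, 3))
    return row


# Placeholder entry: the dataset name and the time are NOT real measurements.
rows = [result_row("small_dataset", "align photos", 12.5)]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Recording the OS and the logical CPU count next to every timing matters here, because the thread suggests both can change the result on otherwise identical hardware.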
Title: Re: Strange workstations testing results.
Post by: Wishgranter on July 16, 2013, 02:35:57 AM
Yes, it will be something like that: two datasets for sure, a smaller one with 32 images and a bigger one with 50-70 images, with a sensible image size so that it does not take hours to finish :-)

The difference will grow as the number of images grows. I will prepare it in the morning.

Title: Re: Strange workstations testing results.
Post by: glennn on March 14, 2014, 06:16:29 AM
I am playing with a new Xeon too and noticed the speeds were dismal and found this thread.
I noticed my home system was considerably faster and that I needed to correct some settings to bring this Xeon workstation up to a reasonable speed.
The sample01.zip scene seems to run at a decent speed. It's only when processing my own image set that there is a huge difference in processing speed.
Image resolution of the sample images is 2184 x 1456 @ 32 images
My personal images are 5184 x 3456 @ 45 images

Xeon system - Dell Precision T3610 -
Windows 7 64-bit Enterprise
Intel Xeon CPU E5-1650 v2 @ 3.5GHz
32GB Ram 1866MHz
Quadro K4000 - 3GB

I changed settings to use 11/12 cores. Is this still correct practice?

Testing the sample01.zip scene on Xeon
Note : This seems to run at an ok speed.

align photos : High points 40,000 > 130.748 sec
build dense cloud : high > 248.198 sec
build mesh (dense cloud) 1,000,000 > 63.719 sec
build texture n/a
------------------------------------------------------------------------------------------
Testing my own scene on Xeon

align photos : High points 40,000 > 296.753 sec
build dense cloud : high > 4790 sec  (took about 2 hours ?)
build mesh (dense cloud) > 725 sec
build texture > 89 sec

I will post the differences between the Xeon and my i7 from home.
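For reference, the per-stage slowdown between the two scenes can be worked out from the times listed above. This is a minimal sketch in plain Python; the dictionary names are just illustrative, and the values are copied from this post.

```python
# Stage times in seconds, copied from the two Xeon runs above.
sample_scene = {
    "align photos": 130.748,
    "build dense cloud": 248.198,
    "build mesh": 63.719,
}
own_scene = {
    "align photos": 296.753,
    "build dense cloud": 4790.0,
    "build mesh": 725.0,
}


def as_hms(seconds):
    """Format a duration in seconds as h:mm:ss."""
    minutes, secs = divmod(int(round(seconds)), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


ratios = {stage: own_scene[stage] / sample_scene[stage] for stage in own_scene}

for stage, ratio in ratios.items():
    print(f"{stage}: {as_hms(own_scene[stage])} vs "
          f"{as_hms(sample_scene[stage])} ({ratio:.1f}x slower)")
```

Going by the posted figure, 4790 sec works out to roughly 1 hour 20 minutes rather than 2 hours, and the dense cloud stage is where the two scenes diverge most.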
Title: Re: Strange workstations testing results.
Post by: Alexey Pasumansky on March 14, 2014, 02:14:39 PM
Hello Glen,

The difference in the dense cloud generation stage is understandable, since your dataset is about 1.4 times bigger by number of images (45 vs. 32) and the resolution is about 5.6 times higher in pixel count. So in terms of resolution, using High for your dataset will be almost equal to using Ultra for the sample01 data.

Also please note that if you wish to test only CPU performance for both configurations, you need to uncheck the OpenCL devices (I believe that at home you should have something faster than the Quadro K4000).
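The size comparison in this reply can be checked with a few lines of arithmetic. The sketch below uses the image counts and resolutions from glennn's post. The factor of 4 between quality levels is an assumption not stated in this thread (each step below Ultra is taken to downscale the images by 2x per side), so treat the last two figures as illustrative.

```python
# Image counts and resolutions quoted in glennn's post.
sample = {"images": 32, "width": 2184, "height": 1456}
own = {"images": 45, "width": 5184, "height": 3456}


def megapixels(dataset):
    """Pixel count of one image, in millions of pixels."""
    return dataset["width"] * dataset["height"] / 1e6


image_ratio = own["images"] / sample["images"]
pixel_ratio = megapixels(own) / megapixels(sample)

# ASSUMPTION (not from this thread): High works on images downscaled by a
# factor of 4 in pixel count relative to Ultra, i.e. 2x per side.
HIGH_DOWNSCALE = 4
own_at_high = megapixels(own) / HIGH_DOWNSCALE
sample_at_ultra = megapixels(sample)

print(f"image count ratio: {image_ratio:.2f}")
print(f"pixel count ratio: {pixel_ratio:.2f}")
print(f"own scene at High: {own_at_high:.2f} MP per image")
print(f"sample01 at Ultra: {sample_at_ultra:.2f} MP per image")
```

Under that assumption, the personal dataset at High is processed at somewhat more pixels per image than sample01 would be at Ultra, which is consistent with the two being described as almost equal.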
Title: Re: Strange workstations testing results.
Post by: VoRo on September 09, 2015, 06:29:02 PM
I can confirm unexpectedly poor performance on a multi-processor system
(4 x Xeon E5-4640 + Nvidia Tesla K20c + Nvidia Quadro K5200, 32 cores/64 threads, 30/60 used).
For comparison I used the recommended building dataset with 50 images in medium resolution.
The runtime is really disappointing. However, other application specific benchmarks on this server show quite promising results (see table attached)
Title: Re: Strange workstations testing results.
Post by: Wishgranter on September 10, 2015, 03:02:06 AM
VoRo, you can contact me at muzeumhb@gmail.com. You probably have some non-optimal settings on it.