Author Topic: Depth maps generation performance tip  (Read 2296 times)

Bzuco

  • Sr. Member
  • Posts: 262
Depth maps generation performance tip
« on: February 07, 2025, 03:04:00 PM »
Tip 1:
Recently I switched from an RTX 2060 Super to an RTX 4070 Ti Super (core @ 3060 MHz) and noticed immediately that my CPU (Intel 11700F, 8c/16t @ 4.4 GHz) is not fast enough to keep up with the new GPU during the depth maps generation phase.

So I played a little bit with one tweak, "BuildDepthMaps/max_gpu_multiplier". If I am correct, this parameter sets how many concurrent kernels run on the GPU. The default value is 2; you can see this as [GPU 1], [GPU 2] in the logs.
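A minimal sketch of applying this tweak from a script rather than by hand. It assumes the tweak can be stored through a settings object with a `setValue(key, value)` method, as `Metashape.app.settings` offers in the Python API; whether a value set this way behaves exactly like one entered as a tweak is an assumption, not something confirmed in this thread. `apply_depth_map_tweak` and `FakeSettings` are hypothetical names used only so the example runs without Metashape installed.

```python
def apply_depth_map_tweak(settings, multiplier):
    """Store the depth maps GPU multiplier on a settings-like object.

    `settings` is assumed to expose setValue(key, value); inside Metashape
    you would pass Metashape.app.settings here (assumption, untested).
    """
    if multiplier < 1:
        raise ValueError("multiplier must be at least 1")
    settings.setValue("BuildDepthMaps/max_gpu_multiplier", multiplier)


class FakeSettings:
    """Hypothetical stand-in so the sketch runs outside Metashape."""

    def __init__(self):
        self.values = {}

    def setValue(self, key, value):
        self.values[key] = value


fake = FakeSettings()
apply_depth_map_tweak(fake, 6)
print(fake.values)
```

Keeping the key in one helper makes it easy to try 2/4/6/8 in turn and compare timings for the same batch.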

I quickly measured the time of one or two batches on medium, high and ultra high quality with different parameter values (2/4/6/8), i.e. [GPU 1]..[GPU 8].
For my hardware and project (handheld ground photogrammetry, 435 photos at 18 Mpix), on medium quality it makes sense to increase the value to 4, on high to 6 and on ultra high to 8.
My depth maps processing times decreased to ~92.8% of the default for medium, ~86.3% for high and ~84.8% for ultra high.
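For anyone who prefers to think in "x times faster" rather than "time decreased to N%", here is a small sketch of the conversion, using only the three percentages reported above. The function name is made up for illustration.

```python
def speedup_from_time_fraction(fraction):
    """Convert 'time decreased to <fraction> of the original' into a speedup factor."""
    if fraction <= 0:
        raise ValueError("fraction must be positive")
    return 1.0 / fraction


# Fractions of the default processing time reported in the post.
reported = {"medium": 0.928, "high": 0.863, "ultra high": 0.848}

for quality, fraction in reported.items():
    speedup = speedup_from_time_fraction(fraction)
    saved_pct = (1.0 - fraction) * 100.0
    print(f"{quality}: {speedup:.2f}x faster, {saved_pct:.1f}% less time")
```

Note that "15% less time" and "1.18x faster" describe the same result, which is worth keeping in mind when comparing numbers posted by different people.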

In the attachment (one batch at ultra high depth maps quality) you can see the CPU and GPU utilization; the GPU graph is more compact, with higher bars. Higher tweak values also mean more VRAM allocation.

I think values higher than 6-8 make a difference only in specific cases, such as very unbalanced CPU/GPU systems, low/very low depth maps quality, or projects with very high Mpix photos.

Let me/us know what speed increases you see in your projects with different tweak values; you can also post what CPU and GPU you have.

UPDATE:
Tip 2:
I tried another tweak: "main/gpu_enable_opencl" set to true and "main/gpu_enable_cuda" set to false.
This gives me another performance boost. I tested only ultra high depth maps quality with max_gpu_multiplier set to 2 and 8. Here are the results:
CUDA, 2x - 77.6s ...default Metashape, without tweaks
CUDA, 8x - 69.1s
OpenCL, 8x - 54.7s ...time decreased to ~70.5% of the default :)
...second screenshot in attachment.
...what is interesting in my case is the GPU utilization, which fluctuated only between ~45-75%. GPU memory allocation was also much lower than with CUDA.
I will also test the selecting pairs and matching points stages to see whether they get a boost.
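A quick sketch for checking the relative times of the three runs, using only the seconds posted in Tip 2. The dictionary labels are just the names of the three configurations; nothing here talks to Metashape.

```python
# Ultra high depth maps timings in seconds, as posted in Tip 2.
timings_s = {"CUDA, 2x": 77.6, "CUDA, 8x": 69.1, "OpenCL, 8x": 54.7}

# The untweaked run (CUDA, multiplier 2) is the baseline.
baseline = timings_s["CUDA, 2x"]

# Each run expressed as a percentage of the baseline time.
relative = {name: t / baseline * 100.0 for name, t in timings_s.items()}

for name, pct in relative.items():
    print(f"{name}: {timings_s[name]:.1f}s = {pct:.1f}% of default")
```

The same few lines work for any other stage: swap in your own timings and keep the untweaked run as the baseline.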
« Last Edit: February 08, 2025, 02:39:55 PM by Bzuco »

CheeseAndJamSandwich

  • Full Member
  • Posts: 211
Re: Depth maps generation performance tip
« Reply #1 on: February 08, 2025, 10:13:23 AM »
You've given us even MORE performance gains????!!!

My setup is also very unbalanced: an old ThinkPad P51 with a Xeon E3-1505M v6, with Metashape using only the 5700 XT plugged in as an eGPU (so limited to PCIe 3.0 x4).

Quick test: scan of a coral restoration frame... 365 photos at 12 MP.
max_gpu_multiplier = 2  = 10m23s
max_gpu_multiplier = 4  = 8m16s
max_gpu_multiplier = 6  = 8m7s

Another sizeable performance gain!  20%!
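A rough sketch of how these three timings could be compared in a script: it parses the "XmYs" strings and picks the smallest multiplier that is close to the best time. The 3% tolerance and both function names are arbitrary choices for illustration, not anything Metashape does.

```python
import re


def parse_duration(text):
    """Turn a string like '10m23s' or '9m 21s' into seconds."""
    match = re.fullmatch(r"\s*(\d+)m\s*(\d+)s\s*", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


def smallest_good_multiplier(results, tolerance=0.03):
    """Return the smallest multiplier whose time is within `tolerance` of the best."""
    best = min(results.values())
    for multiplier in sorted(results):
        if results[multiplier] <= best * (1.0 + tolerance):
            return multiplier


# max_gpu_multiplier -> elapsed seconds, from the quick test above.
results = {
    2: parse_duration("10m23s"),
    4: parse_duration("8m16s"),
    6: parse_duration("8m7s"),
}

for multiplier, seconds in sorted(results.items()):
    saved = (1.0 - seconds / results[2]) * 100.0
    print(f"multiplier {multiplier}: {seconds}s, {saved:.1f}% less time than default")

print("smallest multiplier near the best time:", smallest_good_multiplier(results))
```

Preferring the smallest value near the best time is a judgment call: going from 4 to 6 only saves 9 seconds here, so the lower setting gives almost all of the gain.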

This, on top of your trick to run multiple network nodes locally, on the one machine, which knocked days off my dive site scan processing!  There's a LOT of performance gains to be had in Metashape!!!

Alexey, it seems that running multiple threads of stuff gives us some decent gains... And that the number of threads seems to change from task to task... And from hardware to hardware...  Can Metashape get some clever work scheduling (AI lol) code to dish out the optimum number of threads at every stage?

As always, we really don't like looking at the $$,$$$ of hardware we've purchased and seeing it only partially utilised, or sometimes mostly idle...  We just wanna watch it run at 100%, for 100% of the time! Melting for the whole duration, lol, such that it finishes as fast as possible.  Time = Money, etc. etc.
My 'little' scan of our dive site, 'Manta Point'.  Mantas & divers photoshopped in for scale!
https://postimg.cc/K1sXypzs
Sketchfab Models:
https://sketchfab.com/cheeseandjamsandwich/models

Bzuco

  • Sr. Member
  • Posts: 262
Re: Depth maps generation performance tip
« Reply #2 on: February 08, 2025, 12:18:48 PM »
Updated the original post, added Tip 2.

Corensia

  • Jr. Member
  • Posts: 66
Re: Depth maps generation performance tip
« Reply #3 on: February 17, 2025, 07:23:45 AM »
Thanks for the interesting post Bzuco!

Did a quick test using our i9 14900K / RX 7900XTX setup for a non-NVIDIA result.

Similar time gains as yours at High, processing 320 images at 20MP.

Default (max_gpu_multiplier = 2) : 9m 21s
max_gpu_multiplier = 6           : 6m 59s

Roughly 25% faster using the tweak. Definitely will set it as a baseline tweak from now on as long as further testing doesn't result in any unexpected performance errors.