Topic: Stability of depth maps calculation

ManyPixels

  • Newbie
  • *
  • Posts: 40
    • View Profile
Stability of depth maps calculation
« on: December 05, 2022, 12:47:24 PM »
I ran into serious problems during the second step of depth map generation (depth map filtering). The bigger the project, the more likely the crash. It seems related to the graphics cards (2x RX 6900 XT). Before you ask: yes, my drivers are up to date (and I feel like I get more crashes with the latest update, 22.11.1).

I eventually got the calculation to complete using only the CPU, and I noticed that on this problematic step the CPU time was identical to or even lower than the GPU time! About 10 s to filter a block of 39 depth maps on the CPU (AMD Threadripper Pro 3955WX) versus about 12 s on the 2 GPUs.

It would be good to be able to adjust hardware usage more finely, step by step. A finer division of the steps would also allow better monitoring. Here I restarted the calculation about 10 times on GPU, and every time it crashed at a random point in the 2nd step, i.e., after about 3 hours of calculation. So I lost about 30 h, while the whole thing took only 10 h on CPU and worked the first time.

Alexey Pasumansky

  • Agisoft Technical Support
  • Hero Member
  • *****
  • Posts: 14962
    • View Profile
Re: Stability of depth maps calculation
« Reply #1 on: December 05, 2022, 03:37:26 PM »
Hello ManyPixels,

Have you submitted the crash reports already? If not, please use the Crash Reporter shown after an unexpected Metashape termination and add the link to this thread in the comments field, so that we can identify it easily.
In most cases, crashes observed during the depth maps calculation stage are caused either by the GPU (faulty hardware or driver) or by RAM failures.

As for the comparison of the processing time using CPU only and using the dual GPU setup, it would be useful to see the corresponding logs for the same data processed with the same parameters.
Best regards,
Alexey Pasumansky,
Agisoft LLC

ManyPixels

Re: Stability of depth maps calculation
« Reply #2 on: December 05, 2022, 03:48:45 PM »
Hi,

I'll try to export all this data next time I get this problem. By the way, it would be nice to just set a directory for saving logs, rather than manually setting a file before processing. Usually I only start saving the log after 2 or 3 crashes, so a lot of debug data is lost.

I was very surprised to see that the CPU performed as well as a dual GPU setup on this step! It could be because I was processing at the lowest quality, but it's still good to know, as this step crashed only on GPUs. We always compare software on performance, but stability is far more important. Nevertheless, I'm a Python developer too and I know how hard it is to work with GPUs...


Alexey Pasumansky

Re: Stability of depth maps calculation
« Reply #3 on: December 05, 2022, 04:53:46 PM »
Currently you define the path to a single file, and new log output is appended to it automatically. So you do not need to define it manually for each Metashape instance or for each project.

Also, please check whether you are reaching I/O limits during processing, as that may be a reason for the similar processing times of CPU-only and GPU-enabled processing. We recommend storing the project files on a local SSD or on network storage with a high-speed connection.
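
One rough way to check for an I/O limit is to time a sequential write and read in the folder that holds the project. The sketch below uses only the Python standard library; the function name `measure_throughput`, the 256 MB default size and the temporary test file are illustrative choices, not part of Metashape. The read figure can be inflated by the operating system's file cache, so treat it as an upper bound.

```python
import os
import tempfile
import time


def measure_throughput(directory, size_mb=256, chunk_mb=4):
    """Return (write_MBps, read_MBps) for a temporary file in `directory`."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    n_chunks = max(1, size_mb // chunk_mb)
    total_mb = n_chunks * chunk_mb

    fd, path = tempfile.mkstemp(dir=directory, suffix=".iotest")
    try:
        # Time the write, including the flush to the storage device.
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())
        write_s = time.perf_counter() - start

        # Time a sequential read of the same file.
        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(len(chunk)):
                pass
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)

    return total_mb / write_s, total_mb / read_s


# Example (the path is a placeholder for the project folder):
# write_mbps, read_mbps = measure_throughput("D:/projects")
# print(f"write: {write_mbps:.0f} MB/s, read: {read_mbps:.0f} MB/s")
```

If the numbers are far below what the drive is rated for, storage is a plausible reason for CPU-only and GPU runs taking similar time on this step.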
« Last Edit: December 05, 2022, 06:23:32 PM by Alexey Pasumansky »

ManyPixels

Re: Stability of depth maps calculation
« Reply #4 on: December 07, 2022, 11:02:37 AM »
Hi,

For the log files, there may be a small issue: if you activate the log and Metashape then crashes, the log is no longer activated on the next start (a clean close is needed).

It's probably a matter of I/O limits, but I can't do better overall: I'm working with a RAID 0 array of two M.2 SSDs (around 6 GB/s). It's possible, then, that the CPU has faster access to the data than the GPUs.

As for the repeated crashes, there are always hardware issues, but since I run huge jobs in other programs without any problem, it's simply a problem of error handling on your side. As an example, when you write code to control something wireless, you get a lot of hardware problems (mainly loss of connection). But you have to deal with them and build something stable, which handles errors without systematically crashing and without putting the blame on the components.

Alexey Pasumansky

Re: Stability of depth maps calculation
« Reply #5 on: December 07, 2022, 01:43:41 PM »
Hello ManyPixels,

If you restart Metashape after setting a valid path in the "write log to file" field, the path will be stored properly and will not be discarded upon unexpected application termination.
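
For a separate log file per session inside a fixed directory, the same setting could in principle be driven from a script. The following is only a sketch: it assumes the settings object (`Metashape.app.settings` in the Python API) exposes `log_enable`, `log_path` and `save()`, which should be checked against the API reference for the installed version. The helper name `enable_session_log` and the example directory are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def enable_session_log(settings, log_dir):
    """Point a Metashape-like settings object at a new timestamped log file.

    `settings` is expected to expose `log_enable`, `log_path` and `save()`;
    that is an assumption about `Metashape.app.settings`.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    log_file = log_dir / f"metashape_{stamp}.log"

    settings.log_enable = True
    settings.log_path = str(log_file)
    settings.save()  # store the settings right away (assumed to persist them)
    return log_file


# Inside Metashape's Python console (placeholder directory):
# import Metashape
# enable_session_log(Metashape.app.settings, "D:/metashape_logs")
```

The helper takes the settings object as an argument so it can be tried with a stand-in object outside Metashape.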

Anyway, additional information such as submitted crash reports and complete processing logs would be helpful for analyzing the problem. If you are working on Windows, an .evtx file exported from Windows Event Viewer with the metashape.exe related lines (marked with an error sign) would also be useful, as it should contain additional information.
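
As an alternative to exporting by hand from Event Viewer, the Windows command-line tool `wevtutil` can write events to an .evtx file. The sketch below builds and runs such a command from Python. It exports every error-level event from the Application log, not only the metashape.exe ones, so the relevant entries still have to be picked out afterwards. The function names and the output file name are illustrative.

```python
import subprocess


def build_evtx_export_command(out_file, log_name="Application"):
    """Build a wevtutil command that exports error-level events to .evtx."""
    query = "*[System[(Level=2)]]"  # Level 2 corresponds to "Error"
    return ["wevtutil", "epl", log_name, str(out_file), f"/q:{query}"]


def export_error_events(out_file, log_name="Application"):
    """Run the export; this only works on Windows."""
    subprocess.run(build_evtx_export_command(out_file, log_name), check=True)


# Example (Windows only, placeholder file name):
# export_error_events("metashape_errors.evtx")
```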

ManyPixels

Re: Stability of depth maps calculation
« Reply #6 on: December 20, 2022, 02:29:38 PM »
Hello,

After more tests, I can confirm that the filtering time is equal or even lower (by up to 20%) on CPU (1.8.4). Knowing that this is the operation that causes the GPU crashes, is it possible to have a Python script to better control the use of the hardware?
That is, a command to start the creation of the depth maps, then one to deactivate the GPUs, and a last one to filter the depth maps? I think the software's functions should be subdivided at this level, but these functions are not public because they do nothing useful on their own.

This would be extremely useful, because the first step takes 5 to 10 times longer on CPU! I really don't want to decompile the software to modify it myself...

Thanks in advance!
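
A minimal sketch of the hardware switching asked about above, assuming the Python API exposes `Metashape.app.gpu_mask` (an integer bitmask of enabled GPUs) and `Metashape.app.cpu_enable`. It does not split depth map generation from filtering, since this thread does not mention a public call for that; it only shows how a script could run one whole processing call CPU-only and then restore the previous settings. The helper name `cpu_only` is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def cpu_only(app):
    """Temporarily disable all GPUs on a Metashape-like application object."""
    saved_mask, saved_cpu = app.gpu_mask, app.cpu_enable
    app.gpu_mask = 0        # no GPU selected
    app.cpu_enable = True   # let the CPU do the work
    try:
        yield app
    finally:
        # Restore the previous hardware settings, even after an error.
        app.gpu_mask, app.cpu_enable = saved_mask, saved_cpu


# Inside Metashape (assumed API names; `chunk` is the chunk to process):
# import Metashape
# with cpu_only(Metashape.app):
#     chunk.buildDepthMaps()
```

Passing the application object in as an argument keeps the helper usable with a stand-in object for testing outside Metashape.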