
Author Topic: cudaMemGetInfo time out error  (Read 22413 times)

Alexey Pasumansky

  • Agisoft Technical Support
  • Hero Member
  • Posts: 14813
Re: cudaMemGetInfo time out error
« Reply #15 on: November 20, 2020, 04:24:57 PM »
Hello c-r-o-n-o-s,

Could you please attach the processing logs for the GPU runs (using CUDA and OpenCL with multiplier = 2)? I suggest rebooting before every run, just in case the driver cannot recover otherwise.

To me it looks like a driver issue, so I suggest doing a clean driver install. Also, please specify which Quadro model you are using: do you mean RTX 3000 or P3000?
Best regards,
Alexey Pasumansky,
Agisoft LLC

RHenriques

  • Full Member
  • Posts: 225
Re: cudaMemGetInfo time out error
« Reply #16 on: November 30, 2020, 10:12:08 PM »
These errors are becoming more frequent, even in smaller projects. I've noticed that if "Generic Preselection" is enabled in Align Photos, this stage is more likely to succeed. If it is switched off, failure is certain. The problem seems to be linked to excessive peak load on external GPUs. Is there a way to lower or fine-tune the load on each GPU?
Best Regards



c-r-o-n-o-s

  • Jr. Member
  • Posts: 91
Re: cudaMemGetInfo time out error
« Reply #17 on: December 02, 2020, 09:20:14 PM »
Setting main/depth_max_gpu_multiplier to 1 works fine, but the speed impact is roughly 50%!

RHenriques

  • Full Member
  • Posts: 225
Re: cudaMemGetInfo time out error
« Reply #18 on: December 02, 2020, 09:55:15 PM »
Quote from c-r-o-n-o-s:
"Setting main/depth_max_gpu_multiplier to 1 works fine, but the speed impact is roughly 50%!"

Same here. Probably not as drastic, but noticeable. However, it is the only way to minimize the CUDA errors.

Alexey Pasumansky

  • Agisoft Technical Support
  • Hero Member
  • Posts: 14813
Re: cudaMemGetInfo time out error
« Reply #19 on: December 03, 2020, 01:44:25 PM »
Quote from RHenriques:
"These errors are becoming more frequent, even in smaller projects. I've noticed that if 'Generic Preselection' is enabled in Align Photos, this stage is more likely to succeed. If it is switched off, failure is certain. The problem seems to be linked to excessive peak load on external GPUs. Is there a way to lower or fine-tune the load on each GPU?"

I don't think professional graphics cards like the Quadro RTX are so delicate that they cannot handle high load. That seems to be one of their main purposes: to be installed in server racks and workstations that perform regular, almost constant calculations. So to me this particular case (reported by c-r-o-n-o-s) seems to be related to hardware or drivers. If the NVIDIA drivers are up to date and a clean install doesn't change anything, no RAM issues are detected, and there are no problems with power supply management, I would suggest contacting NVIDIA support about the observed problem: CUDA errors when two contexts are used on the same GPU.
In my experience, NVIDIA should provide good support to owners of professional graphics cards. Maybe they could suggest changing some settings, or run additional diagnostics on the GPU to check whether the issues are related to a factory flaw.
Best regards,
Alexey Pasumansky,
Agisoft LLC

c-r-o-n-o-s

  • Jr. Member
  • Posts: 91
Re: cudaMemGetInfo time out error
« Reply #20 on: December 04, 2020, 01:48:16 PM »
I have now changed a lot of settings in the graphics card driver.
Energy mode: Adaptive, max - optimal performance, and so on.
What can I say, now it runs!

My current settings are the same as the ones I started with (when I got the errors), and it still runs, even under OpenCL.

There really is a "node" in the driver settings.

Alberto C

  • Newbie
  • Posts: 1
Re: cudaMemGetInfo time out error
« Reply #21 on: February 01, 2021, 06:30:46 PM »
Quote:
"When I first run Build Dense Point Cloud on the Medium or Low setting, I get this error after several seconds:

cudaMemGetInfo(&free_mem_size, &total_mem_size): the launch timed out and was terminated (6) at line 211

When I try running the same thing again, it errors out almost immediately and gives the same error message, except that it terminated at line 33.

The Build Dense Point Cloud process will finish if I run it on the 'Lowest' quality setting.

Is this something I can fix? Do I need to change the amount of time the computer waits before timing out? Thanks!

I'm running Windows 10 Home on a Dell G7 7790 laptop. The GPU is an NVIDIA GeForce RTX 2080 with Max-Q Design."

Hello, I had the same problem and solved it by rolling back to an earlier version of the integrated graphics driver! It worked perfectly. My PC has an NVIDIA RTX 2060.

flogs

  • Newbie
  • Posts: 37
Re: cudaMemGetInfo time out error
« Reply #22 on: January 18, 2022, 02:38:23 PM »
Hi,

RTX 2070 Super, Windows 11, Metashape 1.8.0. Occasional errors during processing. Drivers reinstalled; I tried both the Game and Studio drivers. Any idea whether it is a hardware or software problem, and what to do?

Kernel failed: an illegal memory access was encountered (700) at line 269
cudaMemGetInfo(&free_mem_size, &total_mem_size): an illegal memory access was encountered (700) at line 40

Thanks,
Filip
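
When comparing runs, it can help to pull these error lines out of a longer processing log and count them by code. The following is a minimal sketch in Python, not part of Metashape. It assumes every error line has the shape seen in the messages quoted in this thread ("<call or label>: <message> (<code>) at line <N>"); the helper names parse_gpu_errors and count_by_code are hypothetical.

```python
import re
from collections import Counter

# Assumed shape of an error line, based only on the messages quoted in this
# thread: "<call or label>: <message> (<code>) at line <N>".
ERROR_RE = re.compile(
    r"^(?P<source>.+?): (?P<message>.+) \((?P<code>\d+)\) at line (?P<line>\d+)\s*$"
)


def parse_gpu_errors(log_text):
    """Return one dict per log line that matches the assumed error shape."""
    errors = []
    for raw in log_text.splitlines():
        match = ERROR_RE.match(raw.strip())
        if match:
            errors.append({
                "source": match.group("source"),
                "message": match.group("message"),
                "code": int(match.group("code")),
                "line": int(match.group("line")),
            })
    return errors


def count_by_code(errors):
    """Count how often each numeric error code occurs."""
    return Counter(error["code"] for error in errors)
```

Passing the text of a saved log to parse_gpu_errors and then to count_by_code shows at a glance whether the failures are all the same kind (for example only code 700) or a mix, which is useful detail to include when attaching logs to a support request.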

Alexey Pasumansky

  • Agisoft Technical Support
  • Hero Member
  • Posts: 14813
Re: cudaMemGetInfo time out error
« Reply #23 on: January 18, 2022, 03:30:33 PM »
Hello Filip,

It looks like a driver failure.

Do you observe GPU-based processing problems at different stages, such as image matching, depth maps generation, or depth-maps-based mesh generation? Also, please specify the driver versions that you have tried.

Best regards,
Alexey Pasumansky,
Agisoft LLC

flogs

  • Newbie
  • Posts: 37
Re: cudaMemGetInfo time out error
« Reply #24 on: January 18, 2022, 05:12:49 PM »
Quote from Alexey Pasumansky:
"Hello Filip,

It looks like a driver failure.

Do you observe GPU-based processing problems at different stages, such as image matching, depth maps generation, or depth-maps-based mesh generation? Also, please specify the driver versions that you have tried."

There were no errors before; the problems started at about the same time as I migrated to Windows 11, but I cannot be sure.

I use up-to-date drivers. When I do a clean driver reinstall, I do not observe errors immediately. Also, when I close and reopen Metashape, I can sometimes successfully complete the calculation that previously stopped with an error.
NVIDIA Studio, 511.09, 01/04/2022
NVIDIA Game, 511.23, 01/14/2022

Most often I see the error (Kernel failed) during the image alignment stage, but sometimes it appears later, in the following steps. At other times, even a couple of hours of computation (build dense cloud, build mesh) runs without problems.

I wonder if I should run some software tests of my GPU/RAM to see whether everything is OK (OCCT, memtest).
GPU cards are pretty expensive and hard to get these days, so I hope my card will last a little bit longer. :)
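
For reporting driver versions and watching GPU memory between runs, the values can also be read from the nvidia-smi tool that ships with the NVIDIA driver. The following is a minimal sketch in Python, assuming nvidia-smi is on the PATH; the helper names parse_gpu_csv and query_gpus are hypothetical, and rows with non-numeric memory fields are simply skipped.

```python
import csv
import io
import shutil
import subprocess

# Fields requested from nvidia-smi via its --query-gpu option.
FIELDS = ["name", "driver_version", "memory.used", "memory.total"]


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(record) != len(FIELDS):
            continue  # skip blank or unexpected lines
        name, driver, used, total = (value.strip() for value in record)
        if not (used.isdigit() and total.isdigit()):
            continue  # skip rows where memory is reported as unavailable
        rows.append({
            "name": name,
            "driver_version": driver,
            "memory_used_mib": int(used),
            "memory_total_mib": int(total),
        })
    return rows


def query_gpus():
    """Run nvidia-smi and return the parsed rows, or [] if the tool is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(result.stdout)
```

Calling query_gpus() before and after a failed run gives the exact driver version string to quote in a report and shows whether GPU memory was already heavily used by other applications.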

flogs

  • Newbie
  • Posts: 37
Re: cudaMemGetInfo time out error
« Reply #25 on: January 25, 2022, 05:02:32 PM »
I can confirm that when I do a clean reinstall of the GPU driver, I do not see errors for some time afterwards, but they start again later on. So far it has always been like this. Are you going to investigate this issue further? Filip

Alexey Pasumansky

  • Agisoft Technical Support
  • Hero Member
  • Posts: 14813
Re: cudaMemGetInfo time out error
« Reply #26 on: January 25, 2022, 05:33:12 PM »
Hello Filip,

We'll try to reproduce the problem on a similar system configuration (Windows 11 + RTX 2070).

Meanwhile, you can set the following tweak via the Advanced preferences tab: main/gpu_enable_cuda. Set its value to False, restart Metashape and check whether the problem persists. The tweak switches from the CUDA implementation to OpenCL and may help if only the CUDA part of the driver is somehow affected.
Best regards,
Alexey Pasumansky,
Agisoft LLC

flogs

  • Newbie
  • Posts: 37
Re: cudaMemGetInfo time out error
« Reply #27 on: January 28, 2022, 07:06:54 PM »
Quote from Alexey Pasumansky:
"Hello Filip,

We'll try to reproduce the problem on a similar system configuration (Windows 11 + RTX 2070).

Meanwhile, you can set the following tweak via the Advanced preferences tab: main/gpu_enable_cuda. Set its value to False, restart Metashape and check whether the problem persists. The tweak switches from the CUDA implementation to OpenCL and may help if only the CUDA part of the driver is somehow affected."

OK, thanks. Just to be accurate, I have an RTX 2070 Super (from MSI), not the older RTX 2070.

Alexey Pasumansky

  • Agisoft Technical Support
  • Hero Member
  • Posts: 14813
Re: cudaMemGetInfo time out error
« Reply #28 on: January 28, 2022, 09:04:13 PM »
Hello Filip,

We haven't yet had a chance to test the RTX 20 series on Windows 11, but if you were able to run the processing using OpenCL, please let me know whether it went fine or produced a similar error (it would likely start with a CL_ prefix).
Best regards,
Alexey Pasumansky,
Agisoft LLC

flogs

  • Newbie
  • Posts: 37
Re: cudaMemGetInfo time out error
« Reply #29 on: February 16, 2022, 02:02:04 PM »
Quote from Alexey Pasumansky:
"Hello Filip,

We haven't yet had a chance to test the RTX 20 series on Windows 11, but if you were able to run the processing using OpenCL, please let me know whether it went fine or produced a similar error (it would likely start with a CL_ prefix)."

Hi, thanks for the info. I have not switched to OpenCL so far. The problem is still there, but when I reinstall the driver I am somehow able to work for some time. Regarding switching to OpenCL: how much longer does processing take?