Topic: cudaMemGetInfo time out error

Alexey Pasumansky

  • Agisoft Technical Support
  • Hero Member
  • *****
  • Posts: 14012
Re: cudaMemGetInfo time out error
« Reply #30 on: February 18, 2022, 11:12:41 AM »
Hello Filip,

There shouldn't be any noticeable increase in processing time when switching to the OpenCL implementation.
Best regards,
Alexey Pasumansky,
Agisoft LLC

ttsesm

  • Newbie
  • *
  • Posts: 9
Re: cudaMemGetInfo time out error
« Reply #31 on: February 25, 2022, 05:38:54 PM »
Hi Alexey, I am getting the same error with an RTX 2080 SUPER on my Linux machine. I would like to test the OpenCL option. Is there a way to set this in my script through Python?

Moreover, applying the suggested tweak in the GUI seems to work for dense point cloud creation, but during mesh creation I get the following error:

Code:
ciErrNum: CL_UNKNOWN_ERROR_CODE_-9999 (-9999) at line 209

I guess this is related to OpenCL.
« Last Edit: February 25, 2022, 05:44:40 PM by ttsesm »

Alexey Pasumansky

  • Agisoft Technical Support
  • Hero Member
  • *****
  • Posts: 14012
Re: cudaMemGetInfo time out error
« Reply #32 on: February 25, 2022, 07:03:13 PM »
Hello ttsesm,

To switch to the OpenCL implementation via Python, add the following line at the beginning of your script:
Code:
Metashape.app.settings.setValue("main/gpu_enable_cuda", "0")

As for the error you are observing, please provide the log for the failed operation. Please also specify the Linux distribution used and the NVIDIA driver version.
Best regards,
Alexey Pasumansky,
Agisoft LLC
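
As a minimal sketch of where that line could sit in a script: the settings key and the `setValue` call come from the reply above, while the `use_opencl` helper name and the import guard are assumptions added for illustration, so that the snippet can also be imported on a machine without Metashape.

```python
try:
    import Metashape  # only available inside a Metashape Python environment
except ImportError:
    Metashape = None  # assumption: allow importing this file elsewhere

# Settings key quoted in the reply above.
CUDA_KEY = "main/gpu_enable_cuda"


def use_opencl(settings):
    """Disable the CUDA backend on the given settings object.

    `settings` is expected to behave like Metashape.app.settings,
    i.e. to provide setValue(key, value). Hypothetical helper name.
    """
    settings.setValue(CUDA_KEY, "0")


if Metashape is not None:
    # Run this before any GPU-based step such as chunk.matchPhotos().
    use_opencl(Metashape.app.settings)
```

Keeping the call at the very top of the script, before the project is opened or any processing starts, matches the advice to add it "to the beginning" of the script.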

ttsesm

  • Newbie
  • *
  • Posts: 9
Re: cudaMemGetInfo time out error
« Reply #33 on: February 25, 2022, 07:55:07 PM »
Thanks Alexey, setting the parameter worked.

However, after switching to OpenCL, I get the following error:

Code:
...
...
Using device: NVIDIA GeForce RTX 2080 SUPER, 48 compute units, 7980 MB global memory, OpenCL 3.0
  driver version: 510.54, platform version: OpenCL 3.0 CUDA 11.6.110
  max work group size 1024
  max work item sizes [1024, 1024, 64]
  max mem alloc size 1995 MB
  warp size 32
Building OpenCL kernels for NVIDIA GeForce RTX 2080 SUPER...
Kernels compilation done in 2.83562 seconds
Building OpenCL kernels for NVIDIA GeForce RTX 2080 SUPER...
Kernels compilation done in 0.646988 seconds
Traceback (most recent call last):
  File "/home/ttsesm/Development/metashape_project/bundler_extractor.py", line 68, in <module>
    main()
  File "/home/ttsesm/Development/metashape_project/bundler_extractor.py", line 47, in main
    chunk.matchPhotos()
Exception: Kernel locatePoints: clWaitForEvents(1, &ev): CL_UNKNOWN_ERROR_CODE_-9999 (-9999) at line 638

Process finished with exit code 1

My Linux distribution is Arch Linux, fully updated to the latest packages. The NVIDIA driver is also the latest version, 510.54, as shown in the output above.

Is there any other log that I can provide? I am running the script through PyCharm with the Metashape interpreter, as described here: https://agisoft.freshdesk.com/support/solutions/articles/31000154762-how-to-make-python-interpreter-to-use-metashape-module

------------------------------------------------------------------------------------

Also, without the OpenCL workaround, the error I usually get is the following:

Code:
...
...
Found 1 GPUs in 0.000133 sec (CUDA: 7.3e-05 sec, OpenCL: 5.3e-05 sec)
Using device: NVIDIA GeForce RTX 2080 SUPER, 48 compute units, free memory: 6807/7980 MB, compute capability 7.5
  driver/runtime CUDA: 11060/10010
  max work group size 1024
  max work item sizes [1024, 1024, 64]
[GPU] photo 19: 8310 points
[GPU] photo 48: 8221 points
[GPU] photo 77: 8417 points
[GPU] photo 106: 7715 points
[GPU] photo 135: 6117 points
[GPU] photo 164: 8341 points
[GPU] photo 193: 9600 points
[GPU] photo 222: 8013 points
[GPU] photo 251: 9177 points
[GPU] photo 280: 9049 points
[GPU] photo 309: 7521 points
[GPU] photo 338: 9145 points
[GPU] photo 367: 8751 points
[GPU] photo 396: 8642 points
[GPU] photo 425: 9032 points
[GPU] photo 454: 8926 points
[GPU] photo 483: 9299 points
[GPU] photo 512: 8379 points
Warning: cudaStreamDestroy failed: an illegal memory access was encountered (700)
Traceback (most recent call last):
  File "/home/ttsesm/Development/metashape_project/bundler_extractor.py", line 68, in <module>
    main()
  File "/home/ttsesm/Development/metashape_project/bundler_extractor.py", line 47, in main
    chunk.matchPhotos()
Exception: Kernel failed: an illegal memory access was encountered (700) at line 143

Process finished with exit code 1

In case it helps somehow.
« Last Edit: February 25, 2022, 08:11:53 PM by ttsesm »
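
When comparing logs like the two above, it can help to pull out just the device, the driver version, and the failing error codes. The following is a rough sketch of that idea; the regular expressions are assumptions based only on the line formats shown in this thread, and `summarize_log` is a hypothetical helper, not part of Metashape.

```python
import re

# Patterns modelled on the log lines quoted in this thread (assumed formats).
DEVICE_RE = re.compile(r"Using device: ([^,]+),")
DRIVER_RE = re.compile(r"driver version: ([\d.]+)")
ERROR_RE = re.compile(r"\((-?\d+)\) at line (\d+)")


def summarize_log(text):
    """Collect the device name, driver version and (code, line) error pairs."""
    summary = {"device": None, "driver": None, "errors": []}
    for line in text.splitlines():
        device = DEVICE_RE.search(line)
        if device and summary["device"] is None:
            summary["device"] = device.group(1).strip()
        driver = DRIVER_RE.search(line)
        if driver and summary["driver"] is None:
            summary["driver"] = driver.group(1)
        error = ERROR_RE.search(line)
        if error:
            summary["errors"].append((int(error.group(1)), int(error.group(2))))
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py metashape.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        print(summarize_log(handle.read()))
```

Note that the CUDA log prints `driver/runtime CUDA: 11060/10010` rather than a `driver version:` line, so with these patterns the driver field stays empty for CUDA runs.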

ttsesm

  • Newbie
  • *
  • Posts: 9
Re: cudaMemGetInfo time out error
« Reply #34 on: February 28, 2022, 03:20:34 PM »
Hi Alexey,

Is there any update on how I could resolve the issue or apply a workaround?

Thanks.

Alexey Pasumansky

  • Agisoft Technical Support
  • Hero Member
  • *****
  • Posts: 14012
Re: cudaMemGetInfo time out error
« Reply #35 on: February 28, 2022, 05:38:14 PM »
Hello ttsesm,

To investigate the problem further, could you please check whether the same operation works or returns the error in version 1.6.5:
https://s3-eu-west-1.amazonaws.com/download.agisoft.com/metashape-pro_1_6_5_amd64.tar.gz

If the issue is still there, please save the related log.

If possible, please also check whether the issue persists in 1.8.1 and 1.6.5 with an older driver version (the latest available in the 4xx.xx series).
Best regards,
Alexey Pasumansky,
Agisoft LLC

ttsesm

  • Newbie
  • *
  • Posts: 9
Re: cudaMemGetInfo time out error
« Reply #36 on: March 02, 2022, 01:21:40 PM »
Hi Alexey,

A quick update. Downgrading to an older version didn't help; I got the same errors. After reading that this might be a hardware-related issue, I plugged in an older NVIDIA card that I had available, a GTX 1080 Ti with 11 GB of memory, and suddenly everything works smoothly without any errors. My current card is an NVIDIA RTX 2080 SUPER with 8 GB of memory, so the problem is apparently related to the hardware somehow.

I am not sure whether it is due to the memory difference, because my 2080 SUPER is faulty (although it works fine for everything else), or because the newer RTX cards handle memory differently. Unfortunately, I do not have another RTX 20xx card to test with, but it would be interesting to see whether the other people with similar issues are also using RTX cards.

In any case, thanks for the support and your time.
« Last Edit: March 02, 2022, 01:26:12 PM by ttsesm »

flogs

  • Newbie
  • *
  • Posts: 33
Re: cudaMemGetInfo time out error
« Reply #37 on: October 12, 2022, 01:07:06 PM »
Hi,

Switching to OpenCL did not help; another error occurred.

Filip

Alexey Pasumansky

  • Agisoft Technical Support
  • Hero Member
  • *****
  • Posts: 14012
Re: cudaMemGetInfo time out error
« Reply #38 on: October 12, 2022, 02:13:07 PM »
Hello Filip,

Please provide the logs from both the CUDA and OpenCL runs, and also specify the OS version and the installed NVIDIA driver version.
Best regards,
Alexey Pasumansky,
Agisoft LLC
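
For anyone gathering the OS and driver details requested above, a small sketch along these lines may save some typing. It assumes the `nvidia-smi` tool that ships with the NVIDIA driver is on the PATH, and returns `None` for the driver when it is not. The function names are hypothetical, and the Metashape logs still have to be attached separately.

```python
import platform
import shutil
import subprocess


def nvidia_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        )
    except (subprocess.SubprocessError, OSError):
        return None
    lines = result.stdout.strip().splitlines()
    # One line is printed per GPU; the driver version is the same for all.
    return lines[0].strip() if lines else None


def environment_report():
    """Collect the OS description and NVIDIA driver version in one dict."""
    return {
        "os": platform.platform(),
        "nvidia_driver": nvidia_driver_version(),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```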