
Author Topic: AMD 7900XT low gpu usage on Align Photos and Build point cloud  (Read 12642 times)

tutoss

  • Jr. Member
  • **
  • Posts: 59
AMD 7900XT low gpu usage on Align Photos and Build point cloud
« on: October 16, 2024, 01:34:40 AM »
Hi,
In the "generating depth maps" stage, the AMD 7900XT's usage is pretty low. When I run the same task with all GPUs (7900XT and 2x 2080 Super), the NVIDIA GPUs show peaks of 100%, while the AMD GPU stays at ~30% usage.
You can see the GPU usage in the attachments.

Metashape 2.1.3 build 18946, latest AMD and NVIDIA drivers.

Edit:
I ran a proper benchmark with ~830 images, running 8 Metashape workers on the same PC. I did this because I had benchmarked this project for testing purposes before (please see the attachment):

The only boost in performance was in the first stage of align photos (green cell).

The Task Manager screenshots were taken while running the project locally (1 worker), not in "local network mode". Because of this, the GPU usage shows this "wave" behavior.

GPUs before:
2x Nvidia 2080 Super
1x Nvidia 1070

GPUs now:
1x AMD 7900XT
2x Nvidia 2080 Super

Regards,

Tutos
« Last Edit: October 16, 2024, 05:26:41 AM by tutoss »

tutoss

  • Jr. Member
  • **
  • Posts: 59
« Reply #1 on: October 16, 2024, 06:45:49 PM »
Are there any users here with a 7900XT, a 7900XTX, or any AMD GPU? How high is the GPU utilization in Metashape?

I don't understand. I ran an AIDA64 benchmark (GPGPU benchmark) and the card utilization reaches 100%:




I bought this gpu because I saw this benchmark:
https://techgage.com/article/amd-radeon-rx-7900-xt-radeon-rx-7900-xtx-creator-review/

In Metashape (1.8.4), the 7900xt gpu is one of the fastest:



Does anyone know what could be happening?

Best regards,

Tutos


Mak11

  • Sr. Member
  • ****
  • Posts: 387
« Reply #2 on: October 16, 2024, 07:41:34 PM »
I'm on a Radeon 6700XT with no issues.
What's important is the Compute 0 / 1 / 3 utilization, not really the overall GPU utilization %. The 7900XT is probably seeing less usage because the project is probably not that taxing for it compared to the less powerful 2080s.




olihar

  • Sr. Member
  • ****
  • Posts: 291
« Reply #3 on: October 16, 2024, 08:18:17 PM »
Task manager is very poor at showing GPU utilization.

tutoss

  • Jr. Member
  • **
  • Posts: 59
« Reply #4 on: October 16, 2024, 09:05:23 PM »
Hi!
Yes, I noticed that about Windows Task Manager, so I monitored the GPU usage with GPU-Z, and it shows the same behavior.

About the project not being taxing enough: I process it with 8 workers on the same machine to push the hardware to the max. The NVIDIA GPUs run at 100% and are very loud (max fan speed), but the AMD GPU only gets loud in the second half of "match photos", where Windows Task Manager shows 100% GPU usage and the performance/time improves considerably. See the table below:



(I know the GPUs are used only in the "match photos" and "build depth maps" phases)


Now, look at the time for "build depth maps": just a small improvement over the NVIDIA 1070. I have also tried using only the 7900XT, and in the "build depth maps" phase it has the same low GPU utilization (checked in GPU-Z) and the fans do not make any noise  :-\, while the NVIDIA cards at this stage are at 100% with the fans at maximum speed.

EDIT:
Now I have tried a bigger project, 7300 photos. Here is the utilization:


« Last Edit: October 16, 2024, 10:00:42 PM by tutoss »

Bzuco

  • Full Member
  • ***
  • Posts: 246
« Reply #5 on: October 16, 2024, 10:47:03 PM »
@tutos You need a better CPU to feed all 3 GPUs. Photo resolution also matters. The AMD 7900XT is probably in fanless mode below 60°C.
One 7900XT (~51 TFLOP/s) should be more performant in build depth maps than 2x 2080s (2x ~11 TFLOP/s). Try running only the 7900XT  :) and check the times. If it beats the two 2080s, you can sell those cards and save tons of watts. Or buy a better CPU.
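
As a rough check of the numbers above, here is a minimal sketch of the arithmetic. The TFLOP/s values are the approximate peak figures quoted in this post; peak throughput is only an upper bound and says nothing about whether the CPU can keep the card fed.

```python
# Approximate peak figures quoted in the post (not measured values).
tflops_7900xt = 51.0
tflops_2080s = 11.0

# Combined peak of the two 2080 Super cards.
combined_2080s = 2 * tflops_2080s

# How many times larger the single 7900XT's peak is than both 2080s together.
ratio = tflops_7900xt / combined_2080s

print(f"2x 2080S combined: {combined_2080s:.0f} TFLOP/s")
print(f"7900XT vs 2x 2080S: {ratio:.2f}x")
```

On paper the single card has a bit more than twice the peak of both 2080s, so similar real-world timings would point to a limit somewhere other than the GPU.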

tutoss

  • Jr. Member
  • **
  • Posts: 59
« Reply #6 on: October 17, 2024, 12:04:25 AM »
Hi Bzuco!

Yeah, I was thinking the same thing after running just the 7900XT and seeing 100% utilization (with 8 workers).

I ran my benchmark again and the 7900XT performs a bit faster than the 3 older cards:



In any case, the "build depth maps" phase keeps roughly the same performance. Could it be that the CPU is the bottleneck even at this step with just the 7900XT?

Bzuco

  • Full Member
  • ***
  • Posts: 246
« Reply #7 on: October 17, 2024, 11:10:23 AM »
If you have large photos and use the highest depth map settings, you do not need a lot of workers for building depth maps; 1-3 workers are enough to keep the GPU saturated, without usage drops while data is loaded/unloaded.
The best way to find out whether the CPU is the bottleneck is to check the log, where you can see how many seconds it took to compute one depth map, and to compare that with the CPU/GPU usage during this phase. If you see GPU usage drop between two depth map calculations, add another worker... but if the CPU usage was already at 100%, it is better to buy a faster CPU.
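
The rule of thumb above can be written down as a small decision function. This is only a sketch: the function name and both thresholds are hypothetical, and the utilization samples would have to come from whatever monitoring tool you use (GPU-Z, Task Manager, etc.).

```python
def diagnose_depth_map_phase(gpu_util, cpu_util,
                             gpu_drop_threshold=50.0,
                             cpu_busy_threshold=95.0):
    """Apply the rule of thumb to utilization samples (percent, 0-100).

    Both thresholds are illustrative assumptions, not Metashape settings.
    """
    # Does the GPU ever fall idle between depth maps?
    gpu_drops = any(sample < gpu_drop_threshold for sample in gpu_util)
    # Is the CPU already (almost) fully loaded on average?
    cpu_saturated = sum(cpu_util) / len(cpu_util) >= cpu_busy_threshold

    if not gpu_drops:
        return "gpu saturated"    # nothing to gain from more workers
    if cpu_saturated:
        return "cpu bottleneck"   # more workers will not help
    return "add worker"           # spare CPU capacity can fill the gaps


print(diagnose_depth_map_phase([100, 30, 95], [50, 60]))
print(diagnose_depth_map_phase([100, 30, 95], [99, 100]))
```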

The point detection phase needs a lot of workers, but make sure you do not exceed the available VRAM, because then workers cannot receive tasks.

Matching points is an easy task for the GPU, so no problem here.

For other non-GPU tasks it is better to keep only one active worker.

I also see in GPU-Z that your NVIDIA GPUs are quite hot, which is not good for reaching the maximum core frequency: 2xxx-series GPUs can reach it only below 50°C, and the maximum frequency decreases as the temperature rises. MSI Afterburner can help here: set the power limit to 70-80% and at the same time increase the core clock by +80-100MHz. This tweak gives you lower power consumption, higher frequencies, lower temperatures and less fan noise. I am able to reach 2080-2150MHz on my RTX 2060 Super during computation. The same can be done for the 7900XT.

What model of CPU do you have? You can also undervolt the CPU to get more performance out of it.

tutoss

  • Jr. Member
  • **
  • Posts: 59
« Reply #8 on: October 17, 2024, 02:50:12 PM »
Hi Bzuco,
I use 8 workers because that is the optimal number on my system. Weeks ago I ran multiple tests to find it (830 images):



Please look at the times with 1 worker (normal local processing).

My CPU is an AMD Threadripper 2950X with 128 GB RAM. Yes, I know, it's a bit old now, but for a big project (38,000 20 MP images) the PC went from 80+ hours (normal local processing) to 17 hours (8 local workers) on "build point cloud", using the 3 old GPUs. In later stages, with 8 workers on this project, the PC was ~300% faster than normal local processing.
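
For reference, a minimal sketch of what those "build point cloud" times mean as a speedup. The 80 hours is only a lower bound (the run took more than 80 hours), so the real factor is at least this large.

```python
# Times reported above for "build point cloud" on the 38,000-image project.
hours_single = 80.0      # lower bound: normal local processing took 80+ hours
hours_8_workers = 17.0   # 8 local workers

# Speedup factor and the equivalent "percent faster" figure.
speedup = hours_single / hours_8_workers
percent_faster = (speedup - 1) * 100

print(f"speedup: at least {speedup:.2f}x")
print(f"that is at least {percent_faster:.0f}% faster")
```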

Here is part of the log (3 GPUs, normal local processing):

Code:
2024-10-17 08:16:34 [GPU 6] Camera 198 samples after final filtering: 84% (3.98155 avg inliers) = 100% - 0% (not matched) - 7% (bad matched) - 1% (no neighbors) - 0% (no cost neighbors) - 5% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 2% (speckles filtering)
2024-10-17 08:16:34 [GPU 6] Camera 198: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.346 s = 28% propagation + 40% refinement + 17% filtering + 0% smoothing
2024-10-17 08:16:34 [GPU 4] group 1/1: estimating depth map for 44/49 camera 266 (16 neighbs)...
2024-10-17 08:16:34 [GPU 3] Camera 199 samples after final filtering: 96% (3.69169 avg inliers) = 100% - 0% (not matched) - 2% (bad matched) - 0% (no neighbors) - 0% (no cost neighbors) - 1% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 0% (speckles filtering)
2024-10-17 08:16:34 [GPU 3] Camera 199: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.323 s = 30% propagation + 32% refinement + 20% filtering + 0% smoothing
2024-10-17 08:16:34 [GPU 5] group 1/1: estimating depth map for 45/49 camera 267 (16 neighbs)...
2024-10-17 08:16:34 [GPU 6] group 1/1: estimating depth map for 46/49 camera 268 (16 neighbs)...
2024-10-17 08:16:35 [GPU 3] group 1/1: estimating depth map for 47/49 camera 269 (16 neighbs)...
2024-10-17 08:16:35 [GPU 1] Camera 201 samples after final filtering: 87% (4.08086 avg inliers) = 100% - 0% (not matched) - 5% (bad matched) - 1% (no neighbors) - 1% (no cost neighbors) - 4% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 2% (speckles filtering)
2024-10-17 08:16:35 [GPU 1] Camera 201: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.421 s = 19% propagation + 47% refinement + 19% filtering + 0% smoothing
2024-10-17 08:16:35 [GPU 4] Camera 266 samples after final filtering: 82% (2.88791 avg inliers) = 100% - 0% (not matched) - 7% (bad matched) - 1% (no neighbors) - 1% (no cost neighbors) - 6% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 3% (speckles filtering)
2024-10-17 08:16:35 [GPU 4] Camera 266: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.417 s = 23% propagation + 39% refinement + 21% filtering + 0% smoothing
2024-10-17 08:16:35 [GPU 2] Camera 265 samples after final filtering: 85% (3.18771 avg inliers) = 100% - 0% (not matched) - 5% (bad matched) - 0% (no neighbors) - 1% (no cost neighbors) - 6% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 2% (speckles filtering)
2024-10-17 08:16:35 [GPU 2] Camera 265: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.522 s = 16% propagation + 31% refinement + 30% filtering + 0% smoothing
2024-10-17 08:16:35 [GPU 1] group 1/1: estimating depth map for 48/49 camera 282 (16 neighbs)...
2024-10-17 08:16:35 [GPU 4] group 1/1: estimating depth map for 49/49 camera 284 (9 neighbs)...
2024-10-17 08:16:36 [GPU 6] Camera 268 samples after final filtering: 92% (4.08849 avg inliers) = 100% - 0% (not matched) - 3% (bad matched) - 0% (no neighbors) - 0% (no cost neighbors) - 3% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 2% (speckles filtering)
2024-10-17 08:16:36 [GPU 6] Camera 268: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.613 s = 36% propagation + 39% refinement + 14% filtering + 0% smoothing
2024-10-17 08:16:36 [GPU 5] Camera 267 samples after final filtering: 84% (3.18624 avg inliers) = 100% - 0% (not matched) - 7% (bad matched) - 1% (no neighbors) - 1% (no cost neighbors) - 4% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 3% (speckles filtering)
2024-10-17 08:16:36 [GPU 5] Camera 267: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.608 s = 32% propagation + 38% refinement + 16% filtering + 0% smoothing
2024-10-17 08:16:36 [GPU 3] Camera 269 samples after final filtering: 93% (4.34878 avg inliers) = 100% - 0% (not matched) - 3% (bad matched) - 0% (no neighbors) - 0% (no cost neighbors) - 2% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 1% (speckles filtering)
2024-10-17 08:16:36 [GPU 3] Camera 269: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.417 s = 19% propagation + 34% refinement + 31% filtering + 0% smoothing
2024-10-17 08:16:36 [GPU 4] Camera 284 samples after final filtering: 90% (2.56725 avg inliers) = 100% - 0% (not matched) - 3% (bad matched) - 0% (no neighbors) - 0% (no cost neighbors) - 5% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 1% (speckles filtering)
2024-10-17 08:16:36 [GPU 4] Camera 284: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.273 s = 22% propagation + 26% refinement + 22% filtering + 0% smoothing
2024-10-17 08:16:36 [GPU 1] Camera 282 samples after final filtering: 81% (3.47738 avg inliers) = 100% - 0% (not matched) - 8% (bad matched) - 1% (no neighbors) - 1% (no cost neighbors) - 5% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 3% (speckles filtering)
2024-10-17 08:16:36 [GPU 1] Camera 282: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.386 s = 23% propagation + 37% refinement + 18% filtering + 0% smoothing
2024-10-17 08:16:36 Depth reconstruction devices performance:
2024-10-17 08:16:36  - 17% done by [GPU 1] AMD Radeon RX 7900 XT (gfx1100) = [propagation 3.06 s (33%) + refinement 2.501 s (27%) + filtering 1.294 s (14%) + data preps 0.68 s (7%) + gpu data transfer 0.795 s (9%) + others 0.58 s (6%)]
2024-10-17 08:16:36  - 14% done by [GPU 2] AMD Radeon RX 7900 XT (gfx1100) = [propagation 2.653 s (31%) + refinement 2.615 s (31%) + filtering 1.18 s (14%) + data preps 0.641 s (8%) + gpu data transfer 0.706 s (8%) + others 0.456 s (5%)]
2024-10-17 08:16:36  - 16% done by [GPU 3] NVIDIA GeForce RTX 2080 SUPER = [propagation 2.482 s (31%) + refinement 2.146 s (27%) + filtering 1.169 s (15%) + data preps 0.571 s (7%) + gpu data transfer 0.435 s (5%) + others 0.593 s (7%)]
2024-10-17 08:16:36  - 19% done by [GPU 4] NVIDIA GeForce RTX 2080 SUPER = [propagation 2.29 s (28%) + refinement 2.328 s (29%) + filtering 1.213 s (15%) + data preps 0.614 s (8%) + gpu data transfer 0.435 s (5%) + others 0.7 s (9%)]
2024-10-17 08:16:36  - 16% done by [GPU 5] NVIDIA GeForce RTX 2080 SUPER = [propagation 2.775 s (34%) + refinement 2.328 s (28%) + filtering 1.201 s (15%) + data preps 0.437 s (5%) + gpu data transfer 0.32 s (4%) + others 0.642 s (8%)]
2024-10-17 08:16:36  - 18% done by [GPU 6] NVIDIA GeForce RTX 2080 SUPER = [propagation 2.934 s (35%) + refinement 2.482 s (30%) + filtering 1.015 s (12%) + data preps 0.495 s (6%) + gpu data transfer 0.293 s (4%) + others 0.613 s (7%)]
2024-10-17 08:16:36 Peak VRAM usage: Camera 111 (16 neihbs): 235 MB = 100 MB gpu_neighbImages (43%) + 64 MB gpu_tmp_hypo_ni_cost (27%) + 12 MB gpu_tmp_normal (5%) + 9 MB gpu_neighbMasks (4%) + 7 MB gpu_mipmapNeighbImage (3%) + 4 MB gpu_refImage (2%) + 4 MB gpu_depth_map (2%) + 4 MB gpu_cost_map (2%) + 4 MB gpu_coarse_depth_map_radius (2%) + 4 MB gpu_coarse_depth_map (2%)
2024-10-17 08:16:36 Summary time: images preps 5.654 s (32%), depth estimation 11.665 s (66%)
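
To compare per-depth-map times from a log like the one above, the "done in ... s" lines can be averaged per GPU index. This is a minimal sketch that assumes the line format shown in this excerpt; the function name and the regular expression are illustrative and not part of Metashape.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   ... [GPU 6] Camera 198: level #4/4 (...) done in 0.346 s = ...
LEVEL_RE = re.compile(
    r"\[GPU (\d+)\] Camera (\d+): level #\d+/\d+ .*? done in ([\d.]+) s"
)


def mean_level_time_per_gpu(log_text):
    """Return {gpu_index: mean seconds per 'level ... done in' line}."""
    times = defaultdict(list)
    for line in log_text.splitlines():
        match = LEVEL_RE.search(line)
        if match:
            times[int(match.group(1))].append(float(match.group(3)))
    return {gpu: sum(v) / len(v) for gpu, v in sorted(times.items())}


# Three lines copied from the excerpt above.
SAMPLE = """\
2024-10-17 08:16:34 [GPU 6] Camera 198: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.346 s = 28% propagation + 40% refinement + 17% filtering + 0% smoothing
2024-10-17 08:16:35 [GPU 1] Camera 201: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.421 s = 19% propagation + 47% refinement + 19% filtering + 0% smoothing
2024-10-17 08:16:36 [GPU 6] Camera 268: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.613 s = 36% propagation + 39% refinement + 14% filtering + 0% smoothing
"""

for gpu, seconds in mean_level_time_per_gpu(SAMPLE).items():
    print(f"GPU {gpu}: {seconds:.4f} s on average")
```

Run on the full log file, this gives one average per GPU index, which makes it easier to see whether the 7900XT entries are actually faster per depth map than the 2080 Super entries.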


I took a screenshot showing the GPU and CPU load during normal local processing, using my benchmark project. Screenshot attached.

Regards,

Tutos
« Last Edit: October 17, 2024, 03:06:41 PM by tutoss »

Bzuco

  • Full Member
  • ***
  • Posts: 246
« Reply #9 on: October 17, 2024, 11:01:37 PM »
I think the network processing log is not as detailed, but you can orient yourself by the camera number.
In my log I see 6 phases for each camera's depth map, so it is easier to know how long it took.

Maybe 8 workers work for your system overall, but the main point of using workers locally is to speed up single-threaded tasks, like the initial point detection and probably depth map calculation if there are usage drops. For tasks that are multithreaded (estimating camera locations, build point cloud), one worker is enough because it is able to fully saturate the CPU. But OK, if 8 works for you, then keep it.

So now it's time to decide whether to buy a new CPU or just stick with the 7900XT.

tutoss

  • Jr. Member
  • **
  • Posts: 59
« Reply #10 on: October 18, 2024, 12:59:09 AM »
The log isn't from network processing, but yes, it is different from your log :s (latest Metashape version)
(screenshot showing more log attached)

The fact that with the 7900XT + 2x 2080 Super cards I get approximately the same performance as with the 7900XT alone, or with the 3 old cards, shows that I have a significant CPU bottleneck...
Damn, I just wanted to do an upgrade and I was left with more or less the same performance   ;D.
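
A toy model shows why adding GPUs changes nothing once the CPU is the limit: the pipeline can only go as fast as the slower of "what the CPU can prepare" and "what the GPUs can consume". All rates below are hypothetical numbers chosen for illustration, not measurements from this system.

```python
def pipeline_rate(cpu_feed_rate, gpu_rates):
    """Depth maps per second for a CPU-fed GPU pipeline (toy model)."""
    return min(cpu_feed_rate, sum(gpu_rates))


# Hypothetical rates in depth maps per second, for illustration only.
CPU_FEED = 10.0
RATE_7900XT = 12.0
RATE_2080S = 5.0
RATE_1070 = 3.0

only_7900xt = pipeline_rate(CPU_FEED, [RATE_7900XT])
mixed = pipeline_rate(CPU_FEED, [RATE_7900XT, RATE_2080S, RATE_2080S])
old_cards = pipeline_rate(CPU_FEED, [RATE_2080S, RATE_2080S, RATE_1070])

print(only_7900xt, mixed, old_cards)
```

In this model all three configurations end up at the CPU's feed rate, which matches the pattern seen in the benchmarks.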

But yes, now I am looking at the possibility of upgrading CPU+motherboard+ram. After all, it is a 2018 CPU. But a very good one indeed.

Thanks Bzuco for helping me realize that it was a CPU bottleneck that caused the similar performance after adding the 7900XT.

Best regards,

Tutos
« Last Edit: October 18, 2024, 01:01:23 AM by tutoss »

andyroo

  • Sr. Member
  • ****
  • Posts: 460
« Reply #11 on: October 24, 2024, 12:08:40 PM »
One other thought, since it looks like you might have a CPU/GPU bottleneck: what motherboard do you have, and which PCIe slots are the GPUs in? I'm wondering whether you have the 7900XT in a full x16 electrical slot with three GPUs populated.

tutoss

  • Jr. Member
  • **
  • Posts: 59
« Reply #12 on: October 24, 2024, 03:34:22 PM »
Hi,
The mainboard is an X399 AORUS Gaming 7 (rev. 1.0):



Two cards run at x16 and the third at x8.

But yes, I have a CPU bottleneck :/

I will sell two cards and keep the 7900XT and a single 2080S for now, until a CPU+RAM+mainboard upgrade. With the PCI Express slots offered on current mainboards (classic mainboards, not workstation type), I can fit only two GPUs + one single-slot add-in card. The 7900XT uses 3 slots, the 2080S two slots, and I need a PCIe slot for my Sound Blaster AE-5 card  :P. Three GPUs don't fit anyway.

I benchmarked with only the 7900XT and a single 2080S, and it is the faster configuration.

olihar

  • Sr. Member
  • ****
  • Posts: 291
« Reply #13 on: October 24, 2024, 04:38:20 PM »
You are CPU and/or PCIe lane limited in your testing.

tutoss

  • Jr. Member
  • **
  • Posts: 59
« Reply #14 on: October 25, 2024, 06:42:48 PM »
CPU limited (it is a six-year-old CPU); Threadripper has plenty of PCIe lanes (64).
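
A quick sketch of the lane arithmetic, using only figures from this thread: the slot widths from the earlier post (two cards at x16, one at x8) and the 64-lane count quoted above. It ignores lanes used by NVMe drives, the chipset and other devices.

```python
# Link widths of the three GPU slots, as reported earlier in the thread.
gpu_link_widths = [16, 16, 8]

# Lane count quoted above for the Threadripper platform.
cpu_lanes = 64

lanes_used = sum(gpu_link_widths)
lanes_left = cpu_lanes - lanes_used

print(f"GPUs use {lanes_used} of {cpu_lanes} lanes, {lanes_left} left over")
```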