Messages - tutoss

Pages: [1] 2 3 4
1
The last slot you are running is at 4 Gbps speed. So it is pretty bottlenecked.
Metashape doesn't need to transfer 4 GB every second between RAM and VRAM, so it is still OK.

4 Gbit is not 4 GByte; it is 0.5 GByte, so it is extremely slow and unusable for any work, as the motherboard limits the slot to PCIe 2.0.

I installed the cards in the PCIe 3.0 slots, running at x16 and x8. I don't know where you got the idea that I installed a GPU in the bottom slot, which runs at PCIe 2.0 x4 (physically impossible with any of my cards). That is where I put the sound card...
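To put numbers on the Gbit vs GByte confusion, here is a small sketch that derives theoretical PCIe link bandwidth from the published per-lane transfer rate and line encoding. The function names are my own, and the results are raw per-direction figures that ignore protocol overhead, so real transfers will be somewhat lower. Note that 4 Gbit/s (0.5 GByte/s) is what a single PCIe 2.0 lane delivers; an x4 link has four such lanes.

```python
# Theoretical PCIe bandwidth per direction, ignoring protocol overhead.
# Each entry: (transfer rate in GT/s, encoding efficiency = payload bits / line bits)
PCIE_GENERATIONS = {
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}


def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert Gbit/s to GByte/s (8 bits per byte)."""
    return gbit_per_s / 8


def pcie_bandwidth_gbytes(generation: str, lanes: int) -> float:
    """Usable GByte/s for a link of the given generation and lane count."""
    rate_gt, efficiency = PCIE_GENERATIONS[generation]
    per_lane = gbit_to_gbyte(rate_gt * efficiency)
    return per_lane * lanes


for gen, lanes in [("2.0", 1), ("2.0", 4), ("3.0", 8), ("3.0", 16)]:
    print(f"PCIe {gen} x{lanes}: {pcie_bandwidth_gbytes(gen, lanes):.2f} GByte/s")
```

So the PCIe 3.0 x16 and x8 slots come out at roughly 15.75 and 7.88 GByte/s, and even the PCIe 2.0 x4 slot at about 2 GByte/s.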

2
CPU limited (it is a six-year-old CPU). Threadripper has plenty of PCIe lanes (64).

The last slot you are running is at 4 Gbps speed. So it is pretty bottlenecked.

That slot held my sound card before.
I said in a previous post that my cards were in two slots at x16 (16 GB/s) and one at x8 (8 GB/s).

3
CPU limited (it is a six-year-old CPU). Threadripper has plenty of PCIe lanes (64).

4
Hi,
The mainboard is an X399 AORUS Gaming 7 (rev. 1.0):



Two cards at x16 and the third at x8 speed.

But yes, I have a CPU bottleneck :/

I will sell two cards and keep the 7900XT and a single 2080S for now, until a CPU + RAM + mainboard upgrade. With the PCI Express slot layouts offered on current mainboards (consumer boards, not workstation class), I can fit only two GPUs plus one single-slot add-in card. The 7900XT uses three slots, the 2080S uses two, and I need a PCIe slot for my Sound Blaster AE-5 card :P. Three GPUs don't fit anyway.

I benchmarked with only the 7900XT and a single 2080S, and it is the fastest configuration.

5
The log isn't from network processing, but yes, it is different from your log :s (latest Metashape version).
(Screenshot showing more of the log attached.)

The fact that with the 7900XT + 2x 2080 Super I get approximately the same performance as with the 7900XT alone, or with the 3 old cards, shows that I have a significant CPU bottleneck...
Damn, I just wanted to do an upgrade and I was left with more or less the same performance ;D.

But yes, now I am looking at the possibility of upgrading CPU + motherboard + RAM. After all, it is a 2018 CPU, though a very good one.

Thanks Bzuco for helping me realize that it was a CPU bottleneck that caused the similar performance after adding the 7900XT.

Best regards,

Tutos

6
Hi Bzuco,
I use 8 workers because that is the optimal number on my system. Weeks ago I ran multiple tests to find it (830 images):



Please look at the times with 1 worker (normal local processing).

My CPU is an AMD Threadripper 2950X with 128 GB RAM. Yes, I know, it is a bit old now, but for a big project (38,000 20 MP images) the PC went from 80+ hours (normal local processing) to 17 hours (8 local workers) on "build point cloud", using the 3 old GPUs. In later stages, with 8 workers on this project, the PC was ~300% faster than normal local processing.
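For reference, the arithmetic behind those figures can be sketched as below. The helper names are mine, and since the baseline was "80+ hours", the computed speedup is only a lower bound. A "300% faster" result corresponds to a 4x speedup.

```python
def speedup(baseline_hours: float, new_hours: float) -> float:
    """How many times faster the new run is compared to the baseline."""
    return baseline_hours / new_hours


def percent_faster(baseline_hours: float, new_hours: float) -> float:
    """Speedup expressed as 'percent faster' (a 2x speedup is 100% faster)."""
    return (speedup(baseline_hours, new_hours) - 1) * 100


# "build point cloud": 80+ hours locally vs 17 hours with 8 local workers.
# 80 is the lower bound of the baseline, so this is a minimum speedup.
print(f"speedup: at least {speedup(80, 17):.2f}x")
print(f"percent faster: at least {percent_faster(80, 17):.0f}%")
```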

Here is part of the log (3 GPUs, normal local processing):

Code:
2024-10-17 08:16:34 [GPU 6] Camera 198 samples after final filtering: 84% (3.98155 avg inliers) = 100% - 0% (not matched) - 7% (bad matched) - 1% (no neighbors) - 0% (no cost neighbors) - 5% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 2% (speckles filtering)
2024-10-17 08:16:34 [GPU 6] Camera 198: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.346 s = 28% propagation + 40% refinement + 17% filtering + 0% smoothing
2024-10-17 08:16:34 [GPU 4] group 1/1: estimating depth map for 44/49 camera 266 (16 neighbs)...
2024-10-17 08:16:34 [GPU 3] Camera 199 samples after final filtering: 96% (3.69169 avg inliers) = 100% - 0% (not matched) - 2% (bad matched) - 0% (no neighbors) - 0% (no cost neighbors) - 1% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 0% (speckles filtering)
2024-10-17 08:16:34 [GPU 3] Camera 199: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.323 s = 30% propagation + 32% refinement + 20% filtering + 0% smoothing
2024-10-17 08:16:34 [GPU 5] group 1/1: estimating depth map for 45/49 camera 267 (16 neighbs)...
2024-10-17 08:16:34 [GPU 6] group 1/1: estimating depth map for 46/49 camera 268 (16 neighbs)...
2024-10-17 08:16:35 [GPU 3] group 1/1: estimating depth map for 47/49 camera 269 (16 neighbs)...
2024-10-17 08:16:35 [GPU 1] Camera 201 samples after final filtering: 87% (4.08086 avg inliers) = 100% - 0% (not matched) - 5% (bad matched) - 1% (no neighbors) - 1% (no cost neighbors) - 4% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 2% (speckles filtering)
2024-10-17 08:16:35 [GPU 1] Camera 201: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.421 s = 19% propagation + 47% refinement + 19% filtering + 0% smoothing
2024-10-17 08:16:35 [GPU 4] Camera 266 samples after final filtering: 82% (2.88791 avg inliers) = 100% - 0% (not matched) - 7% (bad matched) - 1% (no neighbors) - 1% (no cost neighbors) - 6% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 3% (speckles filtering)
2024-10-17 08:16:35 [GPU 4] Camera 266: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.417 s = 23% propagation + 39% refinement + 21% filtering + 0% smoothing
2024-10-17 08:16:35 [GPU 2] Camera 265 samples after final filtering: 85% (3.18771 avg inliers) = 100% - 0% (not matched) - 5% (bad matched) - 0% (no neighbors) - 1% (no cost neighbors) - 6% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 2% (speckles filtering)
2024-10-17 08:16:35 [GPU 2] Camera 265: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.522 s = 16% propagation + 31% refinement + 30% filtering + 0% smoothing
2024-10-17 08:16:35 [GPU 1] group 1/1: estimating depth map for 48/49 camera 282 (16 neighbs)...
2024-10-17 08:16:35 [GPU 4] group 1/1: estimating depth map for 49/49 camera 284 (9 neighbs)...
2024-10-17 08:16:36 [GPU 6] Camera 268 samples after final filtering: 92% (4.08849 avg inliers) = 100% - 0% (not matched) - 3% (bad matched) - 0% (no neighbors) - 0% (no cost neighbors) - 3% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 2% (speckles filtering)
2024-10-17 08:16:36 [GPU 6] Camera 268: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.613 s = 36% propagation + 39% refinement + 14% filtering + 0% smoothing
2024-10-17 08:16:36 [GPU 5] Camera 267 samples after final filtering: 84% (3.18624 avg inliers) = 100% - 0% (not matched) - 7% (bad matched) - 1% (no neighbors) - 1% (no cost neighbors) - 4% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 3% (speckles filtering)
2024-10-17 08:16:36 [GPU 5] Camera 267: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.608 s = 32% propagation + 38% refinement + 16% filtering + 0% smoothing
2024-10-17 08:16:36 [GPU 3] Camera 269 samples after final filtering: 93% (4.34878 avg inliers) = 100% - 0% (not matched) - 3% (bad matched) - 0% (no neighbors) - 0% (no cost neighbors) - 2% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 1% (speckles filtering)
2024-10-17 08:16:36 [GPU 3] Camera 269: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.417 s = 19% propagation + 34% refinement + 31% filtering + 0% smoothing
2024-10-17 08:16:36 [GPU 4] Camera 284 samples after final filtering: 90% (2.56725 avg inliers) = 100% - 0% (not matched) - 3% (bad matched) - 0% (no neighbors) - 0% (no cost neighbors) - 5% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 1% (speckles filtering)
2024-10-17 08:16:36 [GPU 4] Camera 284: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.273 s = 22% propagation + 26% refinement + 22% filtering + 0% smoothing
2024-10-17 08:16:36 [GPU 1] Camera 282 samples after final filtering: 81% (3.47738 avg inliers) = 100% - 0% (not matched) - 8% (bad matched) - 1% (no neighbors) - 1% (no cost neighbors) - 5% (inconsistent normal) - 0% (estimated bad angle) - 0% (found bad angle) - 3% (speckles filtering)
2024-10-17 08:16:36 [GPU 1] Camera 282: level #4/4 (x4 downscale: 1368x912, image blowup: 2736x1824) done in 0.386 s = 23% propagation + 37% refinement + 18% filtering + 0% smoothing
2024-10-17 08:16:36 Depth reconstruction devices performance:
2024-10-17 08:16:36  - 17% done by [GPU 1] AMD Radeon RX 7900 XT (gfx1100) = [propagation 3.06 s (33%) + refinement 2.501 s (27%) + filtering 1.294 s (14%) + data preps 0.68 s (7%) + gpu data transfer 0.795 s (9%) + others 0.58 s (6%)]
2024-10-17 08:16:36  - 14% done by [GPU 2] AMD Radeon RX 7900 XT (gfx1100) = [propagation 2.653 s (31%) + refinement 2.615 s (31%) + filtering 1.18 s (14%) + data preps 0.641 s (8%) + gpu data transfer 0.706 s (8%) + others 0.456 s (5%)]
2024-10-17 08:16:36  - 16% done by [GPU 3] NVIDIA GeForce RTX 2080 SUPER = [propagation 2.482 s (31%) + refinement 2.146 s (27%) + filtering 1.169 s (15%) + data preps 0.571 s (7%) + gpu data transfer 0.435 s (5%) + others 0.593 s (7%)]
2024-10-17 08:16:36  - 19% done by [GPU 4] NVIDIA GeForce RTX 2080 SUPER = [propagation 2.29 s (28%) + refinement 2.328 s (29%) + filtering 1.213 s (15%) + data preps 0.614 s (8%) + gpu data transfer 0.435 s (5%) + others 0.7 s (9%)]
2024-10-17 08:16:36  - 16% done by [GPU 5] NVIDIA GeForce RTX 2080 SUPER = [propagation 2.775 s (34%) + refinement 2.328 s (28%) + filtering 1.201 s (15%) + data preps 0.437 s (5%) + gpu data transfer 0.32 s (4%) + others 0.642 s (8%)]
2024-10-17 08:16:36  - 18% done by [GPU 6] NVIDIA GeForce RTX 2080 SUPER = [propagation 2.934 s (35%) + refinement 2.482 s (30%) + filtering 1.015 s (12%) + data preps 0.495 s (6%) + gpu data transfer 0.293 s (4%) + others 0.613 s (7%)]
2024-10-17 08:16:36 Peak VRAM usage: Camera 111 (16 neihbs): 235 MB = 100 MB gpu_neighbImages (43%) + 64 MB gpu_tmp_hypo_ni_cost (27%) + 12 MB gpu_tmp_normal (5%) + 9 MB gpu_neighbMasks (4%) + 7 MB gpu_mipmapNeighbImage (3%) + 4 MB gpu_refImage (2%) + 4 MB gpu_depth_map (2%) + 4 MB gpu_cost_map (2%) + 4 MB gpu_coarse_depth_map_radius (2%) + 4 MB gpu_coarse_depth_map (2%)
2024-10-17 08:16:36 Summary time: images preps 5.654 s (32%), depth estimation 11.665 s (66%)
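If anyone wants to compare these per-device lines across runs, a rough sketch of a parser follows. The regular expressions are written only against the log format shown above and may need adjusting for other Metashape versions. The function and variable names are my own, not part of Metashape.

```python
import re

# Matches e.g. " - 17% done by [GPU 1] <device name> = [<stage list>]"
DEVICE_RE = re.compile(r"- (\d+)% done by \[GPU (\d+)\] (.+?) = \[(.+)\]$")
# Matches one stage entry, e.g. "gpu data transfer 0.795 s (9%)"
STAGE_RE = re.compile(r"(.+) ([\d.]+) s \((\d+)%\)")


def parse_device_line(line):
    """Return share, GPU index, name and per-stage seconds, or None if no match."""
    match = DEVICE_RE.search(line.strip())
    if match is None:
        return None
    share, index, name, stages = match.groups()
    seconds = {}
    for part in stages.split(" + "):
        stage = STAGE_RE.fullmatch(part)
        if stage:
            seconds[stage.group(1)] = float(stage.group(2))
    return {
        "gpu": int(index),
        "name": name,
        "share_percent": int(share),
        "seconds": seconds,
    }


# Two lines copied from the log above.
SAMPLE = [
    "2024-10-17 08:16:36  - 17% done by [GPU 1] AMD Radeon RX 7900 XT (gfx1100) = [propagation 3.06 s (33%) + refinement 2.501 s (27%) + filtering 1.294 s (14%) + data preps 0.68 s (7%) + gpu data transfer 0.795 s (9%) + others 0.58 s (6%)]",
    "2024-10-17 08:16:36  - 16% done by [GPU 3] NVIDIA GeForce RTX 2080 SUPER = [propagation 2.482 s (31%) + refinement 2.146 s (27%) + filtering 1.169 s (15%) + data preps 0.571 s (7%) + gpu data transfer 0.435 s (5%) + others 0.593 s (7%)]",
]

for device in filter(None, map(parse_device_line, SAMPLE)):
    total = sum(device["seconds"].values())
    transfer = device["seconds"]["gpu data transfer"]
    print(f"GPU {device['gpu']} ({device['name']}): "
          f"{device['share_percent']}% of work, {total:.2f} s total, "
          f"{transfer:.3f} s in GPU data transfer")
```

Feeding it the whole log file line by line and skipping the `None` results should give one record per "done by" line.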


I took a screenshot showing the GPU and CPU load during normal local processing, using my benchmark project. Screenshot attached.

Regards,

Tutos

7
Hi Bzuco!

Yeah, I was thinking the same thing after running just the 7900XT and seeing 100% utilization (with 8 workers).

I ran my benchmark again and the 7900XT performs a bit faster than the 3 older cards:



In any case, the "build depth maps" phase maintains similar performance. Could it be that the CPU is the bottleneck even at this step with just the 7900XT?

8
Hi!
Yes, I noticed that about Windows Task Manager, so I monitored the GPU usage with GPU-Z, and it shows the same behavior.

As for the project not being very taxing: I processed it with 8 workers on the same machine to push the hardware to the max. The NVIDIA GPUs run at 100% and are very loud (max fan speed), but the AMD GPU only gets loud in the second half of "match photos", at 100% GPU usage in Windows Task Manager, and there the performance/time improves considerably. See the table below:



(I know the GPUs are used only in the "match photos" and "build depth maps" phases.)


Now, look at the time in "build depth maps": just a small improvement over the NVIDIA 1070. I have also tried using only the 7900XT, and in the "build depth maps" phase it has the same low GPU utilization (checked in GPU-Z) and the fans do not make any noise :-\, while the NVIDIA cards at this stage are at 100% with the fans at maximum speed.

EDIT:
Now I tried with a bigger project, 7300 photos. Here is the utilization:



9
Are there any users here with a 7900XT, a 7900XTX, or any other AMD GPU? How high is the GPU utilization in Metashape?

I don't understand. I ran an AIDA64 benchmark (GPGPU benchmark) and the card utilization reaches 100%:




I bought this gpu because I saw this benchmark:
https://techgage.com/article/amd-radeon-rx-7900-xt-radeon-rx-7900-xtx-creator-review/

In Metashape (1.8.4), the 7900XT is one of the fastest GPUs:



Does anyone know what could be happening?

Best regards,

Tutos


10
General / AMD 7900XT low gpu usage on Align Photos and Build point cloud
« on: October 16, 2024, 01:34:40 AM »
Hi,
In the "generating depth maps" stage, the AMD 7900XT usage is pretty low. When I run the same task with all GPUs (7900XT and 2x 2080 Super), the NVIDIA GPUs show peaks of 100%, while the AMD GPU stays at ~30% usage.
You can see the GPU usage in the attachments.

Metashape 2.1.3 build 18946, latest amd and nvidia drivers.

Edit:
I did a proper benchmark with ~830 images, running 8 Metashape workers on the same PC. I did this because I had already benchmarked this project for testing purposes (see attachment please):

The only boost in performance was in the first stage of align photos (green cell).

The Task Manager screenshots were taken while running the project locally (1 worker), not in "local network mode". Because of this, the GPU usage shows this "wave" behavior.

GPUs before:
2x Nvidia 2080 super
1x Nvidia 1070

GPUs now:
1x AMD 7900XT
2x Nvidia 2080 super

Regards,

Tutos

11
General / Re: Error increasing after inputting marker accuracy
« on: January 12, 2024, 12:59:13 AM »
Hi MeganG,
With the button marked in the attached image, you can choose the coordinate system you want to convert to.

Best regards,

tutos

12
General / Re: Negligible GPU usage on Alignmnet.
« on: August 15, 2020, 10:07:54 PM »
Well, 50 MP images are pretty big hehe, plus they are 16-bit.
As for the two cameras, that is irrelevant in this step (detecting points).
If you wish, you can send me a small dataset (~200 images maybe) to benchmark the times with and without GPU (remember we have "similar" CPU and GPU).

Regards,

Tutos

13
General / Re: Negligible GPU usage on Alignmnet.
« on: August 14, 2020, 08:35:57 PM »
Thank you guys, this is helpful.

This particular block is comprised of ~4000 medium format images as an added bit of info.

In the preferences>advanced>tweaks section there is an option, "main/refine_max_gpu_multiplier" which has a value of 2. I cannot find any documentation on this anywhere. Any ideas as to what it does?

What is the resolution of your photos?
My times for "detecting points" are with 20 MP photos (5472x3648) from a Phantom 4 RTK.
I tested with a similar project (3500 images) with one 2080 Super, and the time to finish "detecting points" is ~27 minutes.
Without GPU, the time goes up to ~1 hour 50 minutes.

We have similar CPUs and GPUs, and you say your time is 6 hours for 4000 images!!! Either the image resolution is very, very high or something else is happening.
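To compare runs with different image counts, it helps to normalize to seconds per image. A quick sketch using only the numbers mentioned in this thread (the function name is mine). Keep in mind the image resolutions differ between the two projects, so this is not an apples-to-apples comparison, only a sanity check.

```python
def seconds_per_image(images: int, minutes: float) -> float:
    """Average 'detecting points' time per image, in seconds."""
    return minutes * 60 / images


runs = {
    "3500 images, one 2080 Super (~27 min)": seconds_per_image(3500, 27),
    "3500 images, no GPU (~1 h 50 min)": seconds_per_image(3500, 110),
    "4000 images, reported 6 hours": seconds_per_image(4000, 6 * 60),
}

for label, value in runs.items():
    print(f"{label}: {value:.2f} s/image")
```

Even my CPU-only run comes out well under the per-image time of the 6-hour run, which is why I suspect either much larger images or some other problem.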

About "refine_max_gpu_multiplier", I don't know. The only things I change in Metashape are enabling all GPUs and turning off CPU use during GPU-accelerated processing.
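The same two settings can probably also be applied from the Metashape Python console. This is only a sketch under assumptions: `Metashape.app.enumGPUDevices()`, `Metashape.app.gpu_mask` and `Metashape.app.cpu_enable` are the names as I recall them from the Metashape Python API, so please check them against the API reference for your version. The helper functions are my own.

```python
def all_gpus_mask(device_count: int) -> int:
    """Bitmask with one bit set per detected GPU, e.g. 3 devices -> 0b111 = 7."""
    return (1 << device_count) - 1


def apply_gpu_settings() -> None:
    """Enable every detected GPU and disable CPU use during GPU processing.

    Assumption: attribute names follow the Metashape Python API as I
    remember it; verify against the reference for your installed version.
    """
    import Metashape  # only importable inside Metashape or with its module installed

    devices = Metashape.app.enumGPUDevices()
    Metashape.app.gpu_mask = all_gpus_mask(len(devices))
    Metashape.app.cpu_enable = False


# Pure helper, runnable anywhere:
print(bin(all_gpus_mask(3)))
```

Call `apply_gpu_settings()` from within Metashape. It is not invoked here because the module is not available outside the application.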

Regards,

Tutos.


14
General / Re: Negligible GPU usage on Alignmnet.
« on: August 13, 2020, 11:17:14 PM »
Also "low" GPU usage (see attached image) on two 2080 Supers and one 1070.
Anyway, 7 minutes for "detecting points" on 843 photos is very good!

I tested with all GPUs disabled, and the time to run "detecting points" is about 32 minutes (AMD TR 2950X).
With one 2080 Super, the time is 7.5 minutes (the true power of 3 GPUs is much more noticeable in other stages).

Metashape 1.6.3

You can try benchmarking with all GPUs disabled and share the results.

Regards,

Tutos

15
General / Re: Markers with fewer than three coordinates (x,y,z)
« on: April 13, 2018, 09:34:47 PM »
Wow, that is useful. It should be in the GUI, not "hidden".
