Author Topic: Network processing - BuildOrthomosaic.buildmosaic ~30x slower than workstation?  (Read 2162 times)

andyroo

  • Sr. Member
  • ****
  • Posts: 438
I'm trying to figure out what's going on in this step of the network process. It looks 30-40x slower than the equivalent step on my workstation. Each network node has 2 x 18-core 2.3 GHz Skylake CPUs and 384GB RAM, with (GPU node) or without (CPU node) 4x NVIDIA V100s. The workstation is a Threadripper 3960X with 256GB RAM and 2x 2080 Super GPUs. A screenshot of node activity from the monitor is attached.

Could this be related to the CUDA version? The orthomosaic was built with averaging mode.

Network node times:

Code:
2020-12-01 11:04:10 Updating orthomosaic...
2020-12-01 11:05:54 62 images blended in 104.078 sec
2020-12-01 11:10:24 153 images blended in 269.605 sec
2020-12-01 11:17:42 239 images blended in 436.92 sec
2020-12-01 11:26:53 296 images blended in 550.747 sec
2020-12-01 11:37:10 309 images blended in 617.283 sec

Workstation times:

Code:
2020-09-11 23:59:29 Updating orthomosaic...
2020-09-11 23:59:30 28 images blended in 1.47 sec
2020-09-11 23:59:34 74 images blended in 3.563 sec
2020-09-11 23:59:40 133 images blended in 5.989 sec
2020-09-11 23:59:47 170 images blended in 6.683 sec
2020-09-11 23:59:55 174 images blended in 7.564 sec
2020-09-12 00:00:02 165 images blended in 6.912 sec
2020-09-12 00:00:10 151 images blended in 7.498 sec
2020-09-12 00:00:22 167 images blended in 12.318 sec
2020-09-12 00:00:31 139 images blended in 8.179 sec
2020-09-12 00:00:33 43 images blended in 1.625 sec
2020-09-12 00:00:33 orthomosaic updated in 64.348 sec
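
To put a number on the slowdown, the two log excerpts can be reduced to an average blending time per image. This is only a rough sketch: the helper name `seconds_per_image` and the regular expression are illustrative and not part of Metashape, and the network excerpt covers only the first few blending batches, so the ratio is an estimate rather than a full-run comparison.

```python
import re

# Matches lines such as "62 images blended in 104.078 sec".
# The final "orthomosaic updated in ..." summary line is intentionally not matched.
BLEND_RE = re.compile(r"(\d+) images blended in ([\d.]+) sec")


def seconds_per_image(log_text):
    """Return total blending seconds divided by total images blended."""
    images = 0
    seconds = 0.0
    for match in BLEND_RE.finditer(log_text):
        images += int(match.group(1))
        seconds += float(match.group(2))
    if images == 0:
        raise ValueError("no 'images blended' lines found in log text")
    return seconds / images


network_log = """
2020-12-01 11:05:54 62 images blended in 104.078 sec
2020-12-01 11:10:24 153 images blended in 269.605 sec
2020-12-01 11:17:42 239 images blended in 436.92 sec
2020-12-01 11:26:53 296 images blended in 550.747 sec
2020-12-01 11:37:10 309 images blended in 617.283 sec
"""

workstation_log = """
2020-09-11 23:59:30 28 images blended in 1.47 sec
2020-09-11 23:59:34 74 images blended in 3.563 sec
2020-09-11 23:59:40 133 images blended in 5.989 sec
2020-09-11 23:59:47 170 images blended in 6.683 sec
2020-09-11 23:59:55 174 images blended in 7.564 sec
2020-09-12 00:00:02 165 images blended in 6.912 sec
2020-09-12 00:00:10 151 images blended in 7.498 sec
2020-09-12 00:00:22 167 images blended in 12.318 sec
2020-09-12 00:00:31 139 images blended in 8.179 sec
2020-09-12 00:00:33 43 images blended in 1.625 sec
2020-09-12 00:00:33 orthomosaic updated in 64.348 sec
"""

network_spi = seconds_per_image(network_log)
workstation_spi = seconds_per_image(workstation_log)
ratio = network_spi / workstation_spi

print(f"network:     {network_spi:.3f} sec/image")
print(f"workstation: {workstation_spi:.3f} sec/image")
print(f"slowdown:    {ratio:.1f}x")
```

For the excerpts above this works out to a slowdown between 37x and 38x, in line with the 30-40x estimate.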

Andy
« Last Edit: December 01, 2020, 09:04:28 PM by andyroo »

Alexey Pasumansky

  • Agisoft Technical Support
  • Hero Member
  • *****
  • Posts: 14816
Hello Andy,

The orthomosaic generation process does not use the GPU at any step, so this operation is CPU-only.

Does the node have a high-speed connection to the data storage folders (the source images and the project.files directory)?
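
One rough way to check this from a node is to time a sequential read of a large file stored in the same location as the source images. The sketch below is an assumption-laden illustration, not an Agisoft tool: `read_throughput_mb_s` is a hypothetical helper, and the demo reads a small temporary file only so the script runs anywhere. For a meaningful figure, point it at a real image or project file on the shared storage, and keep in mind that a file read recently may be served from cache and look faster than the storage really is.

```python
import os
import tempfile
import time


def read_throughput_mb_s(path, chunk_size=8 * 1024 * 1024):
    """Sequentially read `path` and return the observed throughput in MiB/s."""
    total_bytes = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering; OS and file-system caches still apply.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    return total_bytes / (1024 * 1024) / max(elapsed, 1e-9)


# Demo on a 16 MiB temporary file. Replace demo_path with a file on the
# storage the node actually reads from (hypothetical usage).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(16 * 1024 * 1024))
    demo_path = tmp.name

try:
    demo_mb_s = read_throughput_mb_s(demo_path)
    print(f"read throughput: {demo_mb_s:.1f} MiB/s")
finally:
    os.remove(demo_path)
```

Running the same script on the workstation and on a node, against the same file, gives a like-for-like comparison of the storage path each machine sees.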
Best regards,
Alexey Pasumansky,
Agisoft LLC

andyroo

The HPC has high-speed file system access from every node. Waaay faster than my workstation (like between 15GB/s and 150GB/s depending on cache, with very large cache). Would it help with troubleshooting if I respawn the processing node with --verbose?

Also, how far back does it fall back if I pause and then kill the worker node? I had one die (expire) and a newer one kicked in about a day ago, but it looked like I lost some progress. It looks like I'll find out in about four hours when this node expires :-/ - I'll turn verbose logging on for the next node I spawn.
« Last Edit: December 03, 2020, 03:12:35 AM by andyroo »

Alexey Pasumansky

Hello andyroo,

If you terminate the node, only the progress of the current sub-task will be lost, provided that the operation in question supports fine-level task subdivision.
Best regards,
Alexey Pasumansky,
Agisoft LLC

andyroo

Quote from: Alexey Pasumansky
Hello andyroo,

If you terminate the node, only the progress of the current sub-task will be lost, providing that we are speaking about operations which support fine level task subdivision.

Hi Alexey,

The sub-task was alignment finalizing (adding cameras/adjusting/optimizing) - I should have put that comment under this post - sorry.

That process slows down after a while on both the network and my workstation (it gets up to ~80% fine, then slows to a crawl), and I'm wondering if it has to do with RAM / cache usage - I described that better in the linked post.

jetdog6

  • Newbie
  • *
  • Posts: 17
Any solution to this? We just set up our network server. However, it's barely using any resources and taking absolutely forever to do anything. CPU usage never goes over 15%, when it should be maxed out as it is on every other computer.