Author Topic: Network process surprise in 1.7.2 - align resumes after cancel & divide project!

andyroo

I recently attempted a network processing alignment with around 140k images. When the Alignment.Cleanup failed due to out-of-RAM errors on the cleanup node, I killed the job in Monitor, then divided the project into two separate projects by copying the original chunk, deleting half the images from the original chunk, and the other half from the copied chunk. Each chunk was saved out to a new PSX file, and the original was closed without saving. "save keypoints" is not enabled (since I can't selectively delete them when/if I delete some images and divide the project later for dense matching). I also killed the server and monitor and restarted all processes.
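The two-way split described above (copy the chunk, then delete complementary halves of the images) can be sketched as a small planning helper. This is only an illustration: the post does not say how the halves were chosen, so sorting by image label is an assumption, and `plan_split` is a hypothetical name, not part of any Metashape API.

```python
def plan_split(labels):
    """Plan a two-way split of a chunk's images.

    Returns (delete_from_original, delete_from_copy).
    Assumption: images are halved by sorted label; the original
    post does not state how the halves were picked.
    """
    ordered = sorted(labels)
    mid = len(ordered) // 2
    first_half, second_half = ordered[:mid], ordered[mid:]
    # The original chunk keeps first_half, so second_half is deleted from it.
    # The copied chunk keeps second_half, so first_half is deleted from it.
    return second_half, first_half
```

The two returned lists are disjoint and together cover every image, so each image ends up in exactly one of the two saved projects.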

I then restarted network alignment on the PSX saved out from the original chunk. There was an initial error about "can't resume matching without keypoints" or something like that, before the nodes started on the AlignCameras.align task without performing any matching (?!).

BUT - there is a point_cloud folder (~40GB) in the original project's .files hierarchy, and each of the two sub-projects I divided and saved out also has a ~20GB point_cloud folder. This is despite the fact that I canceled alignment because cleanup couldn't continue, and that I didn't have "save keypoints" enabled. So it appears that network processing saved the matched-but-not-aligned state, effectively saving the keypoints anyway since the align stage didn't complete?! (If so, yay!)
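For anyone who wants to check their own projects for the same thing, here is a rough sketch that walks a project's .files directory and reports the size of every folder named point_cloud. It uses only the Python standard library; the function names are hypothetical, and it assumes nothing about the internal layout beyond the folder name mentioned above.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def find_point_cloud_dirs(files_root, folder_name="point_cloud"):
    """Map each folder called folder_name under files_root to its size in bytes."""
    results = {}
    for root, dirs, _files in os.walk(files_root):
        if folder_name in dirs:
            found = os.path.join(root, folder_name)
            results[found] = dir_size_bytes(found)
            # Do not descend into the folder we just measured.
            dirs.remove(folder_name)
    return results
```

Calling `find_point_cloud_dirs` on the .files directory of the original project and of each sub-project would show whether the ~40GB and ~20GB folders described above are present, and dividing the byte counts by 1024**3 gives GiB.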

The nodes appear to be crunching through what they perceive as valid matches (pic of monitor attached), and I'm confused. Did the server save the partially complete state/matched keypoints, or are these data/status saved in the original project and transferred when I exported the edited chunk as a new PSX? Did the network processing task, because it didn't complete, somehow save the "matched-but-not-aligned" state of the original project? Wouldn't that be saved only as a property of the original chunk, and NOT in the copied chunk? So many questions.

The interesting thing to me was that the project skipped matching entirely and essentially restarted the align task from some post-matching point, even though I don't have "save keypoints" enabled, and after it looked like it initially tried to restart align.cleanup. I'll have to take a look at the logs when this is done (and see if it runs out of RAM again during cleanup), but I'm guessing that since each sub-project only has half the points, the task will complete.

This would be a nice "feature" in non-network mode, and it makes me want to benchmark running a large project in network vs non-network mode on a single workstation (I do this with small projects sometimes to test Python scripts). I could definitely see value in running projects in network mode if it lets me skip re-matching after an interruption, even if I choose not to save keypoints.