Show Posts

Messages - andyroo

1
I have the following script, which works fine in Metashape 1.6.5 but fails in 1.7.5 with the error "vertical datum out of range".

My dense cloud is in a GCS, and my DEM (from which the export CRS is taken) is in a compound CS (NAD83(2011) + NAVD88). The geoid file is in the appropriate subdirectory.

If I open the 1.7.5 project in 1.6.6, I can run the script below and export everything fine.


Code: [Select]
import os

import Metashape

laz_subdir = 'laz'

app = Metashape.app
doc = app.document
doc_path = os.path.split(doc.path)[0]
outPath = os.path.normpath(doc_path + os.sep + laz_subdir)

for chunk in doc.chunks:
    if chunk.dense_cloud:
        print(chunk.label)

        v_projection = chunk.elevation.crs  # presumes the DEM was built with the desired (compound) CS

        # build a filesystem-safe CRS label for the output filename
        crs_label = v_projection.name
        crs_label = ''.join([x if x.isalnum() else '_' for x in crs_label if x not in '()/+'])
        crs_label = crs_label.replace('__', '_')
        crs_label = crs_label.replace('_zone', '')

        outFilename = chunk.label + '_dense_' + crs_label + '.laz'
        exportFile = os.path.normpath(outPath + os.sep + outFilename)

        if not os.path.exists(outPath):
            print('testing create path: ' + outPath)
            os.makedirs(outPath)

        # export only if the file doesn't already exist
        if not os.path.isfile(exportFile):
            print('testing file writestring: ' + exportFile)
            chunk.exportPoints(exportFile, source_data=Metashape.DenseCloudData,
                               crs=v_projection, format=Metashape.PointsFormatLAZ)
    else:
        print(chunk.label, ' has no dense cloud')
print('script complete')
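
One workaround I plan to try (a sketch only - it assumes CoordinateSystem.addGeoid() is available in the 1.7.x API, and the geoid path below is hypothetical) is registering the geoid grid explicitly before running the export loop, in case 1.7.5 isn't resolving the vertical datum of the compound CS on its own:

Code: [Select]
import Metashape

# hypothetical path - point this at the NAVD88 geoid grid installed for Metashape
# assumption: CoordinateSystem.addGeoid() exists in this API build - check the API reference
geoid_path = '/path/to/geoids/us_noaa_geoid12b.tif'
Metashape.CoordinateSystem.addGeoid(geoid_path)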

2
I remember that back in 1.4 or 1.5 point detection seemed continuous and very fast on 2-3 GPUs. Now I get a "found 2 GPUs" message every 20 images with a 1.5 s pause, then a fraction-of-a-second pause after each pair of photos for 10 pairs, and then the sequence repeats.

This seems like significantly slower performance than before, but I'm not sure it actually is, because I don't remember which version to compare it with. Also, if there are other performance benefits, I understand - I'm just curious what changed.

Finally, if there's a tweak to use the old method, or to just assume that I have the same video cards and they're behaving properly so that the 1.5 s pause every 20 photos is avoided, that would be cool, because on projects with 100,000 photos that equates to just over 2 h of pause time (100,000 / 20 × 1.5 s ≈ 7,500 s), and it's pretty clear from my CUDA graphs that the GPUs aren't being used continuously.

I'm willing to experiment with different tweaks if they're available, and happy to report my results.

3
Bug Reports / corrupt .laz dense_cloud export? (1.6.5 linux)
« on: October 16, 2021, 12:57:59 AM »
I exported 64 dense clouds from ~15 Metashape projects and 23 of them are corrupt. I haven't been able to identify a consistent pattern to the corrupt files, but I tried re-copying from the source and verified that the files are corrupted on the source drive. I also re-exported, from the GUI, a dense cloud that was corrupted in a scripted export, and I get exactly the same error ('chunk with index 0 of 1 is corrupt' after 332 of 107417870 points).

[EDIT] I noticed that of the un-corrupt files, only one is larger than 50GB (the largest is 52,403,552), and all of the corrupted ones are larger than 50GB (the smallest is 51,941,696).
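
(For anyone checking their own exports, a quick stdlib sketch - the directory path is hypothetical - to flag files above that ~50GB threshold for a closer look:)

Code: [Select]
import os

laz_dir = '/path/to/laz'        # hypothetical - wherever the exports were written
threshold = 50 * 1024 ** 3      # ~50GB in bytes

for name in sorted(os.listdir(laz_dir)):
    if name.lower().endswith('.laz'):
        size = os.path.getsize(os.path.join(laz_dir, name))
        flag = 'CHECK' if size > threshold else 'ok'
        print(flag, size, name)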

- most corrupted files were exported with a script on HPC nodes, but one was manually exported (I exported 15 of the clouds manually through the GUI on my login node, one at a time).

for the script, I used this code (snippet) for the exportPoints function:
Code: [Select]
chunk.exportPoints(exportFile, source_data=Metashape.DenseCloudData, crs = v_projection, format=Metashape.PointsFormatLAZ)
Network processing was used, and scripts were distributed to 3 nodes, all writing to the same directory on a Lustre filesystem.

node 1: psx file 1 - 4/4 dense clouds ok
        psx file 2 - 2/4 dense clouds ok (#1 and #4 bad)
        psx file 3 - 2/4 dense clouds ok (#1 and #4 bad - different version of previous chunk)
        psx file 4 - 4/4 dense clouds ok
        psx file 5 - 2/4 dense clouds ok (#1 and #4 bad)
        psx file 6 - 0/4 dense clouds ok (ALL BAD)

node 2: psx file 1 - 4/4 dense clouds ok
        psx file 2 - 0/4 dense clouds ok (ALL BAD)
        psx file 3 - 0/4 dense clouds ok (ALL BAD)

node 3: psx file 1 - 0/4 dense clouds ok (ALL BAD)
        psx file 2 - 0/4 dense clouds ok (ALL BAD)

The "bad" files appear to be about the same size as the files that process ok, and the problem seems to be in the beginning of the file (there may be more bad ones further on in the file, but I've processed ~10 of the "good" files so far with no errors (gridding normals and confidence and Stddev elevation).
Here are the relevant errors I get with lasvalidate and lasinfo:

Code: [Select]
>lasvalidate

WARNING: end-of-file after 222 of 369637189 points
needed 0.00 sec for '20181007_NR_to_OA_dense.laz' fail
WARNING: end-of-file after 997 of 2943737 points
needed 0.00 sec for '20190830-0902_VA_to_OR_dense.laz' fail
WARNING: end-of-file after 1409 of 1011263656 points
needed 0.00 sec for '20190830_OR_to_HA_dense.laz' fail
WARNING: end-of-file after 1823 of 155724795 points
needed 0.00 sec for '20190830_VA_to_OR_dense.laz' fail
WARNING: end-of-file after 1920 of 2700500566 points
needed 0.00 sec for '20190902_VA_to_OR_dense.laz' fail
WARNING: end-of-file after 332 of 107417870 points
needed 0.00 sec for '20191011_OC_to_LO_dense.laz' fail
WARNING: end-of-file after 1906 of 1629065455 points
needed 0.00 sec for '20191011_OR_to_HA_dense.laz' fail
WARNING: end-of-file after 4167 of 2477398798 points
needed 0.01 sec for '20191011_VA_OR_dense.laz' fail
WARNING: end-of-file after 27 of 1681857002 points
needed 0.00 sec for '20191126_OC_to_LO_dense.laz' fail
WARNING: end-of-file after 85 of 2932739702 points
needed 0.00 sec for '20191126_OR_to_HA_dense.laz' fail
WARNING: end-of-file after 3906 of 785969002 points
needed 0.01 sec for '20191126_VA_OR_dense.laz' fail
WARNING: end-of-file after 3875 of 1345029075 points
needed 0.00 sec for '20200208-9_OC_to_LO_dense.laz' fail
WARNING: end-of-file after 460 of 2881636414 points
needed 0.00 sec for '20200208-9_OR_to_HA_dense.laz' fail
WARNING: end-of-file after 3017 of 1373215110 points
needed 0.00 sec for '20200208-9_VA_OR_dense.laz' fail
WARNING: end-of-file after 413 of 1500086455 points
needed 0.00 sec for '20200508-9_OC_to_LO_dense.laz' fail
WARNING: end-of-file after 898 of 3101815941 points
needed 0.00 sec for '20200508-9_OR_to_HA_dense.laz' fail
WARNING: end-of-file after 489 of 2661668716 points
needed 0.00 sec for '20200508-9_VA_OR_dense.laz' fail
WARNING: end-of-file after 4294 of 908102077 points
needed 0.01 sec for '20200802_OC_to_LO_dense.laz' fail
WARNING: end-of-file after 1631 of 1270674803 points
needed 0.00 sec for '20200802_OR_dense.laz' fail
WARNING: end-of-file after 1609 of 2230961910 points
needed 0.00 sec for '20200802_OR_to_HA_dense.laz' fail
WARNING: end-of-file after 4220 of 586845194 points
needed 0.01 sec for '20210430_OC_to_LO_dense.laz' fail
WARNING: end-of-file after 119 of 1732898564 points
needed 0.00 sec for '20210430_OR_dense.laz' fail
WARNING: end-of-file after 2076 of 2464394245 points
needed 0.00 sec for '20210430_OR_to_HA_dense.laz' fail
done. total time 0.08 sec. total fail (pass=0,warning=0,fail=23)

>lasinfo:
ERROR: 'chunk with index 0 of 1 is corrupt' after 222 of 369637189 points for '20181007_NR_to_OA_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 997 of 2943737 points for '20190830-0902_VA_to_OR_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 1409 of 1011263656 points for '20190830_OR_to_HA_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 1823 of 155724795 points for '20190830_VA_to_OR_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 1920 of 2700500566 points for '20190902_VA_to_OR_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 332 of 107417870 points for '20191011_OC_to_LO_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 1906 of 1629065455 points for '20191011_OR_to_HA_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 4167 of 2477398798 points for '20191011_VA_OR_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 27 of 1681857002 points for '20191126_OC_to_LO_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 85 of 2932739702 points for '20191126_OR_to_HA_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 3906 of 785969002 points for '20191126_VA_OR_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 3875 of 1345029075 points for '20200208-9_OC_to_LO_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 460 of 2881636414 points for '20200208-9_OR_to_HA_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 3017 of 1373215110 points for '20200208-9_VA_OR_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 413 of 1500086455 points for '20200508-9_OC_to_LO_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 898 of 3101815941 points for '20200508-9_OR_to_HA_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 489 of 2661668716 points for '20200508-9_VA_OR_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 4294 of 908102077 points for '20200802_OC_to_LO_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 1631 of 1270674803 points for '20200802_OR_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 1609 of 2230961910 points for '20200802_OR_to_HA_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 4220 of 586845194 points for '20210430_OC_to_LO_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 119 of 1732898564 points for '20210430_OR_dense.laz'
ERROR: 'chunk with index 0 of 1 is corrupt' after 2076 of 2464394245 points for '20210430_OR_to_HA_dense.laz'


4
...the entire map is covered in blotches of red (low confidence points). Is there a way to remove those while keeping the low confidence points within the model itself (which are sometimes useful)?
I typically filter the dense cloud to clean it up a bit by classifying low-confidence points (confidence 0-2, sometimes more) as something (I usually use "low noise"). That lets you use other tools to remove them or do further filtering. You could always just delete them, but why remove when you can classify? :-)
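
A minimal sketch of scripting that (against the 1.7-era Python API; setConfidenceFilter is documented there, but I'm assuming assignClass accepts a LAS class code and applies to the currently filtered points - check the API reference for your build):

Code: [Select]
import Metashape

chunk = Metashape.app.document.chunk
dc = chunk.dense_cloud

# show only points with confidence 0-2, then tag them as LAS class 7 ("low noise")
dc.setConfidenceFilter(0, 2)
dc.assignClass(7)      # assumption - verify this call/signature in your API version

# restore the full cloud view
dc.resetFilters()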

5
I'll try to make a smaller project in the area where we have ground control - the existing project is several hundred thousand images.

Andy

6
General / Re: Placing markers increase camera erros after calibration
« on: July 22, 2021, 08:18:59 PM »
For accurate error estimates and proper weighting between GCPs and camera positions, it is important that the accuracy values you assign to each are realistic. If the defaults for camera and GCP accuracy are used (10 m and 0.005 m, I think), then your GCPs will be weighted much more strongly than your camera positions. For a good consumer-grade system (not sure about DJI), the camera accuracy might be 3 m, or maybe 3/6 (xy/z) - I have no idea about the vertical. Similarly for a good consumer handheld GPS. Your error estimates will only be meaningful, and your positions properly optimized, if you use realistic values for the camera and GCP accuracy.

Also, I'm assuming both cameras and GCPs are checked in the Reference pane.

Finally, for projects where you can't constrain either the camera or the GCP accuracy very well (say you only have access to a consumer GPS), if you can accurately measure the distance between two points you can create a scale bar - and you can assign an accuracy value to that too. If you have several of these at different angles, that can improve accuracy as well.
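
If you're scripting it, here is a minimal sketch of setting those accuracies (and a scale bar) through the Python API - the values are placeholders, not recommendations:

Code: [Select]
import Metashape

chunk = Metashape.app.document.chunk

# chunk-wide accuracy values (meters): x, y, z
chunk.camera_location_accuracy = Metashape.Vector([3, 3, 6])              # e.g. consumer-grade GNSS
chunk.marker_location_accuracy = Metashape.Vector([0.005, 0.005, 0.005])  # e.g. survey-grade GCPs

# optional scale bar between two existing markers, with its own accuracy
m1, m2 = chunk.markers[0], chunk.markers[1]
scalebar = chunk.addScalebar(m1, m2)
scalebar.reference.distance = 1.250   # measured distance (m) - placeholder
scalebar.reference.accuracy = 0.002   # measurement accuracy (m) - placeholder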

7
Hi Alexey,

Sorry, my original description wasn't totally accurate - I modified the script to remove some network-checking functions (I wrote a version that I wanted to work with both network and non-network configs) and got my outputs confused when reporting the behavior. If I run the above script from the console and DO NOT send the script for network processing, then it builds the ortho as a network task (as intended).

If I run the script (or put it in a batch step as Run Script) and send it for network processing from the GUI client, then the node runs the script and I get these error messages from the node:

2021-07-21 13:40:48 RunScript: path = C:/Users/FloSup/Documents/Python_Scripts/HPC_metashape_scripts/alexeytest.py
2021-07-21 13:40:48 GCS
2021-07-21 13:40:48
2021-07-21 13:40:48 original DEM BBox coordinates:
2021-07-21 13:40:48 min:  Vector([-119.89130795946117, 34.401359588674026])
2021-07-21 13:40:48 max:  Vector([-119.84912047308137, 34.426608133905866])
2021-07-21 13:40:48 extent was SHRUNK to:
2021-07-21 13:40:48 min:  Vector([-110.0, 40.0])
2021-07-21 13:40:48 max:  Vector([-120.0, 30.0])
2021-07-21 13:40:48 building ortho in network mode
2021-07-21 13:40:48 PCS
2021-07-21 13:40:48
2021-07-21 13:40:48 original DEM BBox coordinates:
2021-07-21 13:40:48 min:  Vector([785712.068273728, 3811055.956547237])
2021-07-21 13:40:48 max:  Vector([789630.9063509195, 3813949.131303574])
2021-07-21 13:40:48 extent was SHRUNK to:
2021-07-21 13:40:48 min:  Vector([785720.0, 3811060.0])
2021-07-21 13:40:48 max:  Vector([789630.0, 3813940.0])
2021-07-21 13:40:48 building ortho in network mode
2021-07-21 13:40:48 sending  2 for processing
2021-07-21 13:40:48 disconnected from server
2021-07-21 13:40:48 connected to 127.0.0.1:5840
2021-07-21 13:40:48 Traceback (most recent call last):
2021-07-21 13:40:48   File "C:/Users/FloSup/Documents/Python_Scripts/HPC_metashape_scripts/alexeytest.py", line 83, in <module>
2021-07-21 13:40:48     batch_id = client.createBatch(doc.path, network_tasks)
2021-07-21 13:40:48 RuntimeError: Request failed with error: Duplicate batch: D:/metatest/psx/simple_test_project.psx
2021-07-21 13:40:48 Error: Request failed with error: Duplicate batch: D:/metatest/psx/simple_test_project.psx
2021-07-21 13:40:48 processing failed in 0.206 sec

When I send the script to network processing, the node also reports that app.settings.network_enable = False (which is why it was resorting to non-network processing before I took that if-statement out).

I tried changing the document definition to your method, like this, but still got the same results:

Code: [Select]
app = Metashape.app
docpath = app.document.path
doc = Metashape.Document()
doc.open(docpath, read_only=False, ignore_lock=True)

It seems like the problem occurs when the script is run from the node rather than from the client - and I need to do it this way with my current workflow because it is a combination of batch steps and scripts. If I don't run it as a network process, then all steps of the batch are done on the single GUI machine. Otherwise I guess my only option is to write a script for the whole workflow... but the batch steps let me select only certain chunks, and then run scripts only on the chunks with specific products created by the batch operations (i.e. dense cloud, elevation model).

8
I wrote the code below to build a 25 cm ortho (rather than full resolution) with integer cell bounds. It works great if I copy/paste it into the console, but if I run it from the Run Script command and send it to network processing, or if I launch it as a batch Run Script step (and send the batch to network processing), it fails to generate any network tasks (it only shows as "running script", and the node reports "no network tasks to do!" from my else statement).

- note that I've only tested this on my local workstation configured as host/monitor/node/client, and I haven't seen this behavior with other scripts I've tested.

Code: [Select]
import math

import Metashape

#-------------------------------------------------------#
# define user-set variables
raster_rounding_multiple = 40   # Default = 40 - multiple of the raster resolution to which the min/max
                                # extents are rounded (DEM is 10, so this keeps unit rounding consistent)
raster_resolution = 0.25        # Default = 0.25 - cell size of the exported ortho
raster_crop = True              # Default = True - True means the bounding box is rounded IN (minimum extent
                                # rounded up, maximum extent rounded down from the raster edges); False is reversed
                                # TODO - check whether this is an interpolated raster (shrink) or uninterpolated (grow?)
                                # ALSO - maybe project the xy coordinates of the 3D dense cloud region and use those
                                # instead? this would result in no/minimal collar though...
ortho_subdir = 'ortho'              # subdir that will be created under the document (PSX) path (unused in this snippet)
ortho_suffix = '_NAD83_2011_UTM18'  # suffix to append to ortho (future - modify this to grab from WKT)
raster_rounding_interval = raster_rounding_multiple * raster_resolution
app = Metashape.app
doc = app.document
network_tasks = list()
for chunk in doc.chunks:
    if chunk.elevation:
        print(chunk.label)
        out_projection = chunk.elevation.projection
        compression = Metashape.ImageCompression()
        compression.tiff_compression = Metashape.ImageCompression.TiffCompressionLZW
        compression.tiff_big = True
        compression.tiff_overviews = True
        compression.tiff_tiled = True
           
        def round_down(x):
            return int(raster_rounding_interval * math.floor(float(x)/raster_rounding_interval))

        def round_up(x):
            return int(raster_rounding_interval * math.ceil(float(x)/raster_rounding_interval))


        #chunk.elevation.crs(wkt) #returns CRS for DEM
        testbox = Metashape.BBox() #create a bounding box for the raster
        print('')
        print('original DEM BBox coordinates:')
        print('min: ', Metashape.Vector((min(chunk.elevation.left, chunk.elevation.right), min(chunk.elevation.bottom, chunk.elevation.top))))
        print('max: ', Metashape.Vector((max(chunk.elevation.left, chunk.elevation.right), max(chunk.elevation.bottom, chunk.elevation.top))))

        if raster_crop:
            testbox.min = Metashape.Vector((round_up(min(chunk.elevation.left, chunk.elevation.right)), round_up(min(chunk.elevation.bottom, chunk.elevation.top))))
            testbox.max = Metashape.Vector((round_down(max(chunk.elevation.left, chunk.elevation.right)), round_down(max(chunk.elevation.bottom, chunk.elevation.top))))
            print('extent was SHRUNK to: ')
        else:
            testbox.min = Metashape.Vector((round_down(min(chunk.elevation.left, chunk.elevation.right)), round_down(min(chunk.elevation.bottom, chunk.elevation.top))))
            testbox.max = Metashape.Vector((round_up(max(chunk.elevation.left, chunk.elevation.right)), round_up(max(chunk.elevation.bottom, chunk.elevation.top))))
            print('extent was GROWN to: ')

        print('min: ', testbox.min)
        print('max: ', testbox.max)
       
        print('building ortho in network mode')
        task = Metashape.Tasks.BuildOrthomosaic()
        task.blending_mode = Metashape.BlendingMode.AverageBlending
        task.cull_faces = False
        task.fill_holes = True
        task.projection = out_projection
        task.region = testbox
        task.resolution = raster_resolution
        task.resolution_x = raster_resolution
        task.resolution_y = raster_resolution
        task.refine_seamlines = False
        task.subdivide_task = True
        task.surface_data = Metashape.DataSource.ElevationData

        n_task = Metashape.NetworkTask()
        n_task.name = task.name
        n_task.params = task.encode()
        n_task.frames.append((chunk.key, 0))
        network_tasks.append(n_task)


if network_tasks:
    print('sending ', len(network_tasks), 'for processing')
    client = Metashape.NetworkClient()
    client.connect(app.settings.network_host) #server ip
    batch_id = client.createBatch(doc.path, network_tasks)
    client.resumeBatch(batch_id)
else:
    print('no network tasks to do!')
print('script complete')

9
Do you have differences between just the GCP checkpoints - results just from the alignment and optimizations? Since the surface from depth maps is a different process, as I understand it, between 1.6.5 and 1.7.x, I am wondering if the difference is because of the DEMs and how they are created.

Hi Tom,

I have only placed and enabled a single GCP (as a control point) because I am trying to (a) minimize all of my manual and subjective pointing and clicking, and (b) evaluate how well we can reconstruct surfaces with minimal ground control. All of the other GCPs/checkpoints are in the project, but they are disabled and not placed on individual images. I should be able to make a GCP master file with each point placed on every photo, import that, and enable it (with the other GCPs unchecked) to compare accuracy - I hope to do that soon, but summer field season is going to make it a fall/winter project, I think.

To clarify re: DEM production - I am still generating DEMs from the dense cloud, using exactly the same (scripted/batched) process for both versions of Metashape - but from what Alexey has said (and I think maybe what you were referring to), I understand that they made significant changes to both the alignment and the depth-maps algorithms. I did notice that the dense clouds are much more completely reconstructed in areas that previously had larger gaps (particularly in areas of low texture) - maybe those areas are being reconstructed more thoroughly, but more noisily? Not sure.

So, in other words, yes. I wouldn't be surprised if some aspect of the error is attributable to changes in depth-map/dense-cloud reconstruction and is propagating to the DEM from the dense cloud. One thing that makes me think there is also error attributable to the alignment, though, is that the sparse clouds have many more points initially (something like 20% more in our case, I think) - and if that represents their implementation of the SIFT algorithm including points that would have been rejected in previous versions (something like reducing the difference-of-Gaussians threshold, or changing the way it's calculated?), maybe that could be a factor? Total guessing here.

I've tried to take a closer look at residual error in the camera model, but I'm having issues with calibration coefficients not reliably propagating through to the final aligned project, so recovering those values is hit-or-miss at the moment.

Are there deviations in the pure camera calibration / control points?
(Without a point cloud or an elevation map being calculated?).

Hi c-r-o-n-o-s,

If I understand correctly, you're wondering about differences in the GCP and camera error (lens-model error too?) between 1.7.3 and 1.6.6? I haven't been able to fully evaluate the lens-model errors because of the problems propagating camera calibration coefficients through optimization, but I should be able to compare the camera and single-GCP errors and report back on that - although it probably would be better to do what I've been thinking about, and what Tom asked about: manually place the other GCPs on all the images to use as checkpoints, and compare the alignment-only error between the two versions.

The only things I can comment on at the moment re the "pure" alignment parameters are that after optimization:

1) RMS reprojection error was slightly lower in 1.6.6 than in 1.7.3 (0.289357 pix vs 0.297011 pix in test 2b, and 0.30069 pix vs 0.306297 pix in test 2a); max RE was higher in 1.6 in test 2b but higher in 1.7 in test 2a. (2b and 2a are different collections of flights - internal referencing system for my brain - processed identically.)

2) 1.7 found about 20% more points than 1.6, removed about 3% fewer of the original points during optimization, and aligned about 0.05% more images, in both tests 2a and 2b.

3) In test 2a, total camera error (m) is reported lower for 1.7 (0.14496) than for 1.6 (0.14795), and the average camera error (m) computed from the individual camera error values is also lower in 1.7 (0.12472) than in 1.6 (0.12660). BUT, interestingly, average camera error (pix) is higher in 1.7 (0.31083) than in 1.6 (0.30448). Haven't looked at 2b yet.


10
General / Re: Your opinion on USGS Agisoft Processing Workflow
« on: July 14, 2021, 09:41:47 PM »
Enjoying the discussion on this topic a lot.

We developed the most recent USGS workflow using fixed-lens DSLR cameras with survey-grade RTK positions and precise (sub-millisecond) event marks, so I wouldn't be at all surprised if the "best" workflow for built-in drone cameras (often with rolling shutter), using only GCPs and/or consumer-grade drone GPS, is much different.

I am especially interested in the discussion limiting keypoints and tiepoints - since as Paulo noted, that increases processing time substantially. I hope to get around to doing some experiments on that soon, but again I expect the results will be somewhat specific to different camera types.

A couple of notes re: optimization from my current "best" workflow, which is very similar to the currently published USGS workflow:

1) I've found that there doesn't appear to be any significant difference in how many (or which) points are selected by gradual selection on Reconstruction Uncertainty (RU) and on Projection Accuracy (PA) whether you optimize between them or not, and because I'm trying to minimize the number of times I optimize (both for speed and for error propagation), I perform both RU and PA gradual selection before my first optimization (see the sketch after this list).

2) At the moment I'm only performing 1 RU, 1 PA, and 2 Reprojection Error (RE) optimizations.

3) For all but the last RE optimization, I optimize only f, cx, cy, k1, k2, k3, p1, p2; for the last RE optimization, I add Fit Additional Corrections (FAC), but I DO NOT add b1, b2, k4, p3, p4. I haven't re-evaluated recently, but the last time I took a deep dive into the lens-model parameters, I found that enabling b1, b2, k4, p3, p4 resulted in residual errors for those parameters that were a significant fraction of the parameter values (much more so than for the other parameters) - without significantly improving camera position errors or GCP errors. That strongly implied to me that I was overfitting. Again, this is specific to fixed-lens DSLR cameras with precise camera and ground-control positions.

4) I've noticed that if I do multiple optimizations with FAC enabled, error (camera and GCP) appears to increase, so I only enable Fit Additional Corrections on my final optimization.

5) I've had mixed results - sometimes worse, and at best insignificant improvement - using the above methods and tightening the tie point accuracy value (to the previously optimized RE level) on the last iteration, so at the moment I'm leaving tie point accuracy at the default. I expect that if I did multiple RE optimizations I might be able to improve tie point accuracy, but in general I'm able to meet accuracy targets using the above methods with our camera systems.
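
For anyone scripting this, here's a minimal sketch of the RU/PA gradual selection followed by a single optimization, written against the 1.6/1.7 Python API. The thresholds are placeholders, and fit_corrections (my assumption for the Fit Additional Corrections flag) should be checked against your version's API reference:

Code: [Select]
import Metashape

chunk = Metashape.app.document.chunk

# gradual selection: remove tie points above placeholder thresholds for RU, then PA
for criterion, threshold in [(Metashape.PointCloud.Filter.ReconstructionUncertainty, 15),
                             (Metashape.PointCloud.Filter.ProjectionAccuracy, 3)]:
    f = Metashape.PointCloud.Filter()
    f.init(chunk, criterion=criterion)
    f.removePoints(threshold)

# one optimization, with only the parameters listed above enabled
chunk.optimizeCameras(fit_f=True, fit_cx=True, fit_cy=True,
                      fit_k1=True, fit_k2=True, fit_k3=True,
                      fit_p1=True, fit_p2=True,
                      fit_b1=False, fit_b2=False,
                      fit_k4=False, fit_p3=False, fit_p4=False,
                      fit_corrections=False)  # set True only on the final RE optimization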

11
General / Re: Hardware requirements
« on: July 09, 2021, 10:09:10 PM »
Not sure what your general workflow and project size are, but my guess is that you'd see faster results with the AMD depending on workflow (maybe even a 20%+ improvement), although the RAM would limit the number of images you can process at once. Making a mesh is more demanding in my experience than making a dense cloud/DEM, so it kind of depends on your workflow and estimated project size. It would be cool to get numbers from running the Puget Systems extended benchmark on each - maybe your vendor could do that and give you the results?

Whichever one you get, compare hyperthreading/SMT on vs. off - you're right at the point where I'm not sure whether you'd see an improvement, but I've seen significant differences on some systems.

12
I was a little surprised to find that our accuracy decreased, and processing time and memory requirements increased, when processing imagery with the most recent versions of Metashape.

I compared alignment and optimization results for 1.6.5/1.6.6 and 1.7.2 using eight overlapping flights (F1 through F8 in the table below, ~140,000 images total) processed with precise camera positions and a single GCP. These eight flights were aligned and optimized in two batches of four temporally adjacent flights using a 4D technique (sensu Warrick et al. 2017), with each batch roughly 70,000 images. Results were compared by differencing 1 m integer-cell-bounded DSMs produced by Metashape for each flight against RTK elevations at 34 GCPs distributed in a region comprising about 20% of the total flight area.

The same source project and processing steps were used for both 1.6.5 and 1.7.2 (alignment, optimization, dense cloud processing, and DEM production and export were entirely batch processing or Python API). Settings such as alignment quality and the keypoint/tiepoint limits were identical (70,000/0). After alignment and optimization, a sub-region with ground control points was reconstructed and a DEM was output, then the DEM elevations at 34 GCPs were extracted (one of which had been used in the optimization).

The table below shows mean and median signed and unsigned error when differencing the 34 GCPs with DEMs created from each flight, as well as the average of each error over all flights (last data column). In all cases, the 1.7.2 errors were elevated relative to 1.6.5.

Code: [Select]
Error (m)                    F1     F2     F3     F4     F5     F6     F7     F8    Avg
mean signed diff,   1.6.5 -0.01  -0.03  -0.06  -0.07  -0.11  -0.09  -0.10  -0.11  -0.07
mean signed diff,   1.7.2 -0.03  -0.07  -0.08  -0.09  -0.14  -0.13  -0.14  -0.13  -0.10
median signed diff, 1.6.5  0.00  -0.02  -0.04  -0.05  -0.09  -0.09  -0.09  -0.11  -0.06
median signed diff, 1.7.2 -0.04  -0.06  -0.08  -0.07  -0.13  -0.12  -0.11  -0.13  -0.09
mean abs diff,      1.6.5  0.06   0.07   0.09   0.09   0.12   0.11   0.15   0.12   0.10
mean abs diff,      1.7.2  0.12   0.16   0.17   0.18   0.21   0.24   0.26   0.21   0.19
median abs diff,    1.6.5  0.05   0.06   0.05   0.06   0.09   0.10   0.11   0.11   0.08
median abs diff,    1.7.2  0.06   0.08   0.09   0.07   0.13   0.12   0.11   0.13   0.10

(both versions used the RU/PA/RE x2 gradual selection workflow, with Fit Additional Corrections on the last optimization)

Interestingly, 1.7.2 found more points and kept more points after optimization. A few other observations (drawn from one of the two batches):

Code: [Select]
                       1.6.5   1.7.2   1.7.2/1.6.5
Matching time (h)         14      28      200%
Alignment time (h)        44      60      136%
Optimize time (h)       15.1     6.2       41%
Matching memory (GB)      48     217      452%
Alignment memory (GB)     97     132      136%

Total Align/Opt (h)     73.1    94.2      129%

Reference:

Jonathan A. Warrick, Andrew C. Ritchie, Gabrielle Adelman, Kenneth Adelman, and Patrick W. Limber, "New Techniques to Measure Cliff Change from Historical Oblique Aerial Photographs and Structure-from-Motion Photogrammetry," Journal of Coastal Research, 33(1), 39-55 (1 January 2017), https://doi.org/10.2112/JCOASTRES-D-16-00095.1


13
General / Re: Checkered pattern in hillshade of DSM
« on: June 30, 2021, 11:57:51 PM »
I have seen a similar checkered pattern as an artifact of reprojection when the source DEM is in a GCS and you reproject to a PCS to create the hillshade. If you did something like that, you might want to check whether the checkered pattern is a post-processing/hillshading artifact rather than something in the DEM itself (if you're going from a GCS DEM to a PCS hillshade, try outputting the DEM from Metashape in the PCS directly, then generating the hillshade).
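
Here's a minimal sketch of exporting the DEM directly in a projected CRS from the Python API - the EPSG code is a placeholder, and the exportRaster arguments should be checked against your version's API reference:

Code: [Select]
import Metashape

chunk = Metashape.app.document.chunk

# export the elevation model in a projected CRS (placeholder EPSG code), then hillshade that file
proj = Metashape.OrthoProjection()
proj.crs = Metashape.CoordinateSystem("EPSG::32610")  # e.g. UTM zone 10N - placeholder

chunk.exportRaster("dem_utm.tif", source_data=Metashape.ElevationData, projection=proj)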

14
General / Re: using more than one pcu
« on: June 30, 2021, 08:51:30 AM »
If you have a machine with a single OS running multiple CPUs (e.g. Xeon, EPYC), then Metashape will recognize that it is a single physical machine and use all of the cores when processing CPU-intensive tasks. Many of the processes in Metashape are very effectively parallelized.

Multi-CPU machines also generally have more PCIe lanes available for GPUs, so you may be able to configure 2-3 GPUs if you have the right case/motherboard. Note that the choice and number of GPUs will significantly affect your processing time.

There are a variety of tweaks in the BIOS that you will want to optimize for CPU-GPU bandwidth and CPU-memory bandwidth (enable NUMA, generally disable hyperthreading/SMT, etc).

15
TL;DR: GPU masking with Metashape 1.7.2 on a CentOS Linux node is mirrored/reversed, or applied from high to low.

Not sure if this is expected behavior, because the examples I found in the API/forum were ambiguous.

We had uncorrectable GPU memory errors on one of the cards in an HPC GPU node (CentOS), which I worked around by masking the offending GPU:
Code: [Select]
Jun 21 19:56:35 dl-0001 kernel: NVRM: GPU at PCI:0000:89:00: GPU-<censored>
Jun 21 19:56:35 dl-0001 kernel: NVRM: GPU Board Serial Number: <censored>
Jun 21 19:56:35 dl-0001 kernel: NVRM: Xid (PCI:0000:89:00): 63, Dynamic Page Retirement: New page retired, reboot to activate (0x00000000000a44da).
Jun 21 19:56:37 dl-0001 kernel: NVRM: Xid (PCI:0000:89:00): 63, Dynamic Page Retirement: New page retired, reboot to activate (0x00000000000a249c).
Jun 21 19:56:40 dl-0001 kernel: NVRM: Xid (PCI:0000:89:00): 48, An uncorrectable double bit error (DBE) has been detected on GPU in the framebuffer at partition 6, subpartition 0.
Jun 21 19:56:40 dl-0001 kernel: NVRM: Xid (PCI:0000:89:00): 63, Dynamic Page Retirement: New page retired, reboot to activate (0x00000000000a2ca1).

nvidia-smi reported GPU 2 was bad (of GPUs 0-3):
Code: [Select]
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:61:00.0 Off |                    0 |
| N/A   30C    P0    40W / 300W |      0MiB / 16130MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000000:62:00.0 Off |                    0 |
| N/A   29C    P0    38W / 300W |      0MiB / 16130MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  On   | 00000000:89:00.0 Off |                    1 |
| N/A   30C    P0    38W / 300W |      0MiB / 16130MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  On   | 00000000:8A:00.0 Off |                    0 |
| N/A   31C    P0    41W / 300W |      0MiB / 16130MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

But my attempts to mask led to some confusion about the expected mask behavior. For clarity, below I'm representing the masks as binary, but they were converted to decimal for my metashape --gpu_mask argument (i.e. binary 1011 = decimal 11).

With a depth-mapping job active on the node, we confirmed that masking 0011 activated GPUs 0 and 1, masking 1100 crashed the Metashape process, and masking 1011 (decimal 11) enabled GPUs 0, 1, and 3. Metashape was called as below to mask GPU 2:
Code: [Select]
srun metashape.sh --node --dispatch $ip4 --capability any --cpu_enable 1 --gpu_mask 11 --inprocess -platform offscreen
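
For reference, a minimal sketch of how the decimal mask can be computed from the GPU indices you want enabled; the same value can also be checked or set from the Python API via Metashape.app.gpu_mask (an integer bitmask):

Code: [Select]
import Metashape

# enable GPUs 0, 1 and 3; leave GPU 2 masked out
enabled = [0, 1, 3]
mask = sum(1 << i for i in enabled)   # binary 1011 -> decimal 11
print('decimal value for --gpu_mask:', mask)

# equivalently, set it from the Python API (integer bitmask)
Metashape.app.gpu_mask = mask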
