Here are the test results, and something is odd:
While the machine is generally very fast, the mesh generation step is around 100 times slower than on my HP Z840 running Windows. Why is that? Alexey, can you comment?
This is what the machine does with ten 20 MP images:
LoadProject
loaded project in 0.009438 sec
clGetPlatformIDs failed: CL_UNKNOWN_ERROR_CODE_-1001
HERE BE GPU #:
[{'compute_units': 56, 'clock': 1480, 'version': '', 'mem_size': 17100439552, 'pci_bus_id': 6, 'name': 'Tesla P100-SXM2-16GB', 'pci_device_id': 0, 'vendor': ''}, {'compute_units': 56, 'clock': 1480, 'version': '', 'mem_size': 17100439552, 'pci_bus_id': 7, 'name': 'Tesla P100-SXM2-16GB', 'pci_device_id': 0, 'vendor': ''}, {'compute_units': 56, 'clock': 1480, 'version': '', 'mem_size': 17100439552, 'pci_bus_id': 10, 'name': 'Tesla P100-SXM2-16GB', 'pci_device_id': 0, 'vendor': ''}, {'compute_units': 56, 'clock': 1480, 'version': '', 'mem_size': 17100439552, 'pci_bus_id': 11, 'name': 'Tesla P100-SXM2-16GB', 'pci_device_id': 0, 'vendor': ''}, {'compute_units': 56, 'clock': 1480, 'version': '', 'mem_size': 17100439552, 'pci_bus_id': 133, 'name': 'Tesla P100-SXM2-16GB', 'pci_device_id': 0, 'vendor': ''}, {'compute_units': 56, 'clock': 1480, 'version': '', 'mem_size': 17100439552, 'pci_bus_id': 134, 'name': 'Tesla P100-SXM2-16GB', 'pci_device_id': 0, 'vendor': ''}, {'compute_units': 56, 'clock': 1480, 'version': '', 'mem_size': 17100439552, 'pci_bus_id': 137, 'name': 'Tesla P100-SXM2-16GB', 'pci_device_id': 0, 'vendor': ''}, {'compute_units': 56, 'clock': 1480, 'version': '', 'mem_size': 17100439552, 'pci_bus_id': 138, 'name': 'Tesla P100-SXM2-16GB', 'pci_device_id': 0, 'vendor': ''}]
MatchPhotos: accuracy = High, preselection = generic, reference, keypoint limit = 40000, tiepoint limit = 4000, constrain features by mask = 0
clGetPlatformIDs failed: CL_UNKNOWN_ERROR_CODE_-1001
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
photo 4078: 39794 points
photo 4088: 39843 points
photo 4072: 39886 points
photo 4070: 39983 points
photo 4074: 39791 points
photo 4082: 39868 points
photo 4084: 39831 points
photo 4086: 39998 points
photo 4076: 39935 points
photo 4080: 39860 points
points detected in 6.30511 sec
clGetPlatformIDs failed: CL_UNKNOWN_ERROR_CODE_-1001
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
4112 matches found in 0.230473 sec
matches combined in 0.000547 sec
matches filtered in 0.048923 sec
12 of 14 pairs selected in 1e-05 sec
clGetPlatformIDs failed: CL_UNKNOWN_ERROR_CODE_-1001
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
Using device: Tesla P100-SXM2-16GB, 56 compute units, 16308 MB global memory, CUDA 6.0
max work group size 1024
max work item sizes [1024, 1024, 64]
59225 matches found in 0.299517 sec
matches combined in 0.006723 sec
matches filtered in 0.356821 sec
finished matching in 7.35111 sec
setting point indices... 21295 done in 0.003269 sec
generated 21295 tie points, 2.44593 average projections
removed 106 multiple indices
removed 18 tracks
selected 13228 tracks out of 21277 in 0.001883 sec
AlignCameras: adaptive fitting = 1
processing block: 10 photos
pair 4076 and 4078: 2772 robust from 2820
pair 4078 and 4080: 2728 robust from 2783
adding photos 4076 and 4078 (2772 robust)
adding 2815 points, 0 far (6.08 threshold), 0 inaccurate, 5 invisible, 0 weak
adjusting: xxxxxxx 0.457606 -> 0.14951
adding 2 points, 1 far (6.08 threshold), 0 inaccurate, 1 invisible, 0 weak
optimized in 0.321948 seconds
adding camera 4080 (3 of 10), 1648 of 1654 used
adding camera 4074 (4 of 10), 1497 of 1502 used
adding 2306 points, 7 far (6.08 threshold), 0 inaccurate, 4 invisible, 0 weak
adjusting: xxxxxxx 0.211001 -> 0.124918
adding 5 points, 1 far (6.08 threshold), 0 inaccurate, 4 invisible, 0 weak
optimized in 0.33958 seconds
adding camera 4082 (5 of 10), 1257 of 1261 used
adding camera 4072 (6 of 10), 707 of 709 used
adding 2587 points, 5 far (6.08 threshold), 0 inaccurate, 15 invisible, 0 weak
adjusting: xxxxxxxxx 0.163178 -> 0.133594
adding 17 points, 1 far (6.08 threshold), 0 inaccurate, 16 invisible, 0 weak
optimized in 0.13536 seconds
adding camera 4084 (7 of 10), 777 of 778 used
adding camera 4070 (8 of 10), 399 of 399 used
adding 3023 points, 3 far (6.08 threshold), 0 inaccurate, 16 invisible, 0 weak
adjusting: xxxxxxxxx 0.157561 -> 0.138203
adding 18 points, 2 far (6.08 threshold), 0 inaccurate, 17 invisible, 0 weak
optimized in 0.06208 seconds
adding camera 4086 (9 of 10), 543 of 544 used
adding 1479 points, 2 far (6.08 threshold), 0 inaccurate, 20 invisible, 0 weak
adjusting: xxxxxxxx 0.148055 -> 0.13876
adding 23 points, 2 far (6.08 threshold), 1 inaccurate, 21 invisible, 0 weak
optimized in 0.215891 seconds
adding camera 4088 (10 of 10), 284 of 284 used
adding 1047 points, 2 far (6.08 threshold), 1 inaccurate, 20 invisible, 0 weak
adjusting: xxxxxxxxxx 0.177195 -> 0.148982
adding 23 points, 3 far (6.08 threshold), 14 inaccurate, 20 invisible, 0 weak
optimized in 0.067274 seconds
3 sigma filtering...
adjusting: xxxxx 0.148239 -> 0.148087
point variance: 0.134174 threshold: 0.402522
adding 6 points, 187 far (0.402522 threshold), 0 inaccurate, 6 invisible, 0 weak
adjusting: xxxx 0.115785 -> 0.115254
point variance: 0.104041 threshold: 0.312123
adding 43 points, 298 far (0.312123 threshold), 0 inaccurate, 6 invisible, 0 weak
adjusting: xxxxx 0.106958 -> 0.106917
point variance: 0.0960216 threshold: 0.288065
adding 182 points, 286 far (0.288065 threshold), 0 inaccurate, 6 invisible, 0 weak
adjusting: xxxxx 0.104337 -> 0.104323
point variance: 0.0935057 threshold: 0.280517
adding 253 points, 298 far (0.280517 threshold), 0 inaccurate, 6 invisible, 0 weak
adjusting: xxxxx 0.103163 -> 0.103144
point variance: 0.0923657 threshold: 0.277097
adding 284 points, 302 far (0.277097 threshold), 0 inaccurate, 6 invisible, 0 weak
optimized in 0.782872 seconds
f 1548.12, cx -8.74659, cy -18.8127, k1 -0.0209463, k2 0.000455108, k3 -0.000270836
finished sfm in 2.44364 seconds
coordinates applied in 0 sec
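
To compare this run against the Windows machine, it may help to pull the per-step timings out of the console output instead of reading them by eye. Below is a minimal sketch that assumes the log has been saved as plain text and that timing lines follow the "<label> in <number> sec" or "seconds" pattern seen above. The function names (`extract_timings`, `slowest`) and the `sample` string are made up for illustration and are not part of PhotoScan. Lines such as "finished matching" and "finished sfm" appear to be totals that already include the sub-steps before them, so the sketch ranks the entries rather than summing them.

```python
import re

# Matches lines ending in "in <number> sec" or "in <number> seconds",
# including scientific notation such as "1e-05".
TIMING_RE = re.compile(
    r"^(?P<label>.*?)\s+in\s+"
    r"(?P<secs>\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s+sec(?:onds)?\s*$"
)


def extract_timings(log_text):
    """Return a list of (label, seconds) for every timing line in the log."""
    timings = []
    for line in log_text.splitlines():
        match = TIMING_RE.match(line.strip())
        if match:
            timings.append((match.group("label"), float(match.group("secs"))))
    return timings


def slowest(timings, n=5):
    """Return the n entries with the largest duration, longest first."""
    return sorted(timings, key=lambda item: item[1], reverse=True)[:n]


# A few lines copied from the log above, used here as sample input.
sample = """points detected in 6.30511 sec
4112 matches found in 0.230473 sec
12 of 14 pairs selected in 1e-05 sec
finished matching in 7.35111 sec
optimized in 0.321948 seconds
finished sfm in 2.44364 seconds
coordinates applied in 0 sec"""

for label, secs in slowest(extract_timings(sample), n=3):
    print(f"{secs:10.4f} s  {label}")
```

Running the same script over the full logs from both machines, including the mesh generation part, should show which step accounts for the difference.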