
Author Topic: Projecting from raw images to orthomosaic using chunk xml transforms  (Read 1511 times)

elliebowler

Hi!

I'm working on a detection method for a UAV dataset, which was collected and processed in Agisoft by a partner in the project. For our method we would like to run detection on the raw images captured by the drone (in x_pixel, y_pixel form), and then project the detections onto the orthomosaic so we can combine overlapping detections (e.g. convert to long/lat). However, to build this into the machine learning pipeline, we would like to calculate the projections in Python using affine transformations, outside the Agisoft software.

I was wondering if it is possible to calculate the projections using the XML file stored with the project? These have been saved from when our partner processed the orthomosaics and annotated the data. I understand from a few posts on the forum that this is possible using the PhotoScan Python library, but am I right that this can only be imported/run inside Agisoft, not in another Python environment? I can see there are parameters for the entire chunk (e.g. f, cx, cy, k1, k2, etc.), as well as translation and rotation/location covariance matrices for each raw image (example below). Would it be possible to explain a little how these parameters can be used to create the projections we need? In particular, is the intrinsic matrix stored in the XML, and under which tag?

Thanks in advance!

Code: [Select]
.............................
      <calibration class="adjusted" type="frame">
        <resolution height="3000" width="4000" />
        <f>3118.23872343487</f>
        <cx>21.6929492116479</cx>
        <cy>-15.0408107258771</cy>
        <k1>0.0303345331370718</k1>
        <k2>-0.0667270494782827</k2>
        <k3>0.0653506193462477</k3>
        <p1>-3.94887881926124e-005</p1>
        <p2>-0.000593847873789833</p2>
      </calibration>
      <covariance>
        <params>f cx cy k1 k2 k3 p1 p2</params>
        <coeffs>2.0468751869991163e+001 -5.3764593396939653e-001 -2.0849002785751725e+000 6.7189315228864807e-004 -1.6497207752060089e-003 2.4646804676160135e-003 -4.2935290026870297e-006 -3.9453328768781123e-005 -5.3764593396939653e-001 1.0322535496260234e-001 5.2120752064783622e-002 -9.0786128682199680e-006 3.9834494161823351e-005 -6.0501242778801419e-005 2.2683126222501854e-006 -6.7464192211726927e-007 -2.0849002785751725e+000 5.2120752064783622e-002 2.9360214203540624e-001 -6.8661956725641512e-005 1.7218488711043673e-004 -2.5479146665632337e-004 -2.6130556639053032e-007 1.3178895802044476e-005 6.7189315228864807e-004 -9.0786128682199680e-006 -6.8661956725641512e-005 1.6020123978103154e-007 -3.1356187592838213e-007 3.4307591807736369e-007 -7.7777465362305866e-010 -4.7568949608808401e-009 -1.6497207752060089e-003 3.9834494161823351e-005 1.7218488711043673e-004 -3.1356187592838213e-007 1.0816372717144955e-006 -1.2016747628465014e-006 3.3070610393767247e-010 2.9129344608067300e-009 2.4646804676160135e-003 -6.0501242778801419e-005 -2.5479146665632337e-004 3.4307591807736369e-007 -1.2016747628465014e-006 1.3942773284397268e-006 -3.3260914196314242e-010 -4.4361470605027930e-009 -4.2935290026870297e-006 2.2683126222501854e-006 -2.6130556639053032e-007 -7.7777465362305866e-010 3.3070610393767247e-010 -3.3260914196314242e-010 3.6717553086160803e-010 -4.6052236881848388e-011 -3.9453328768781123e-005 -6.7464192211726927e-007 1.3178895802044476e-005 -4.7568949608808401e-009 2.9129344608067300e-009 -4.4361470605027930e-009 -4.6052236881848388e-011 1.6945015932751167e-009</coeffs>
      </covariance>
    </sensor>
  </sensors>
  <cameras next_group_id="0" next_id="229">
    <camera id="0" label="DJI_0002" sensor_id="0">
      <transform>9.9367761520725317e-001 -1.1154450989110640e-001 1.2752229184717922e-002 1.6355136081158577e+001 -1.1174408706270952e-001 -9.9360520953338949e-001 1.6184764274795770e-002 -7.2377742564520382e+000 1.0865359752364273e-002 -1.7507424175531350e-002 -9.9978769449128035e-001 4.9736616107715687e-001 0 0 0 1</transform>
      <rotation_covariance>9.8217092719010480e-006 -4.2857770172187034e-007 3.2338520246152594e-008 -4.2857770172186997e-007 2.1911384793986368e-005 -1.1127430980809863e-006 3.2338520246152594e-008 -1.1127430980809863e-006 5.8461829356115879e-006</rotation_covariance>
      <location_covariance>2.0055242000696963e-003 1.8493823777608193e-006 -9.6731442372176409e-006 1.8493823777608193e-006 2.0218170118861392e-003 9.1746781498499200e-005 -9.6731442372176409e-006 9.1746781498499200e-005 3.4676077055425325e-003</location_covariance>
      <orientation>1</orientation>
      <reference enabled="true" x="105.633025333333" y="-10.4760358611111" z="313.197" />
.......................
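As a sketch of how the calibration block above maps to a standard 3x3 intrinsic matrix, assuming Metashape's convention that cx and cy are offsets from the image centre rather than absolute principal-point coordinates (worth verifying against your own project), using only the Python standard library:

```python
import xml.etree.ElementTree as ET

# Example <calibration> fragment, values taken from the XML quoted above.
xml_text = """
<calibration class="adjusted" type="frame">
  <resolution height="3000" width="4000" />
  <f>3118.23872343487</f>
  <cx>21.6929492116479</cx>
  <cy>-15.0408107258771</cy>
</calibration>
"""

def intrinsic_matrix(calib_xml: str):
    """Build K from a Metashape <calibration> block.

    Assumes cx/cy are stored as offsets from the image centre, so the
    absolute principal point is (width/2 + cx, height/2 + cy).
    """
    root = ET.fromstring(calib_xml)
    res = root.find("resolution")
    w, h = float(res.get("width")), float(res.get("height"))
    f = float(root.findtext("f"))
    cx = float(root.findtext("cx", default="0"))
    cy = float(root.findtext("cy", default="0"))
    return [[f, 0.0, w / 2 + cx],
            [0.0, f, h / 2 + cy],
            [0.0, 0.0, 1.0]]

K = intrinsic_matrix(xml_text)
```

Note that K alone ignores the distortion coefficients (k1-k3, p1, p2), which matter away from the image centre.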

elliebowler

Re: Projecting from raw images to orthomosaic using chunk xml transforms
« Reply #1 on: January 29, 2021, 02:41:12 AM »
Hi, I'm wondering if anyone can help with this? I have a more specific example.

I would like to use the XML to calculate the projection from a pixel in a 2D image (u, v) to a location in the point cloud (X, Y, Z). In the files I have, a specific example is getting from pixel (2521.209961, 2522.959961) in image DJI_0217 to point (105.63217, -10.47468, 243.83644). The XML entries corresponding to this camera are at the bottom of this post.

I have found a post which I think outlines what I need to do: https://programmer.group/xml-interpretation-of-agisoft-photoscan-mateshape-camera-parameters.html . The equation they give is the standard pinhole projection with distortion (the equation image is not reproduced here).
I'm unsure which parameters below should be used for dx and dy, and also for R, the rotation matrix. Should this be the transform for the specific camera, or the chunk transform matrix? I've tried a number of different combinations using the example points, but can't get them to match. I've also tried the reverse, going from 3D to 2D, following this post: https://www.agisoft.com/forum/index.php?topic=6437.0, but again I am unclear on where I'm going wrong with picking values from the XML. Any help would be very much appreciated! Thanks for your time, and sorry for the long post :)

-------------------------------------------------------------------------------

For the calibration I have...
Code: [Select]
      <calibration class="adjusted" type="frame">
        <resolution height="3000" width="4000" />
        <f>3118.23872343487</f>
        <cx>21.6929492116479</cx>
        <cy>-15.0408107258771</cy>
        <k1>0.0303345331370718</k1>
        <k2>-0.0667270494782827</k2>
        <k3>0.0653506193462477</k3>
        <p1>-3.94887881926124e-005</p1>
        <p2>-0.000593847873789833</p2>
      </calibration>

For the specific camera mentioned in the example I have...
Code: [Select]
    </camera>
    <camera id="215" label="DJI_0217" sensor_id="0">
      <transform>-9.9845367866015977e-001 -4.7360369045444578e-002 2.9107507860923802e-002 2.0635994236918069e+000 -5.0010632995164908e-002 9.9389702762500165e-001 -9.8324132671528716e-002 -6.4615532779234619e-001 -2.4273198335146698e-002 -9.9627776859993980e-002 -9.9472866547642924e-001 -2.8160118158196525e-003 0 0 0 1</transform>
      <rotation_covariance>8.4460505284871782e-006 -1.4902639446243033e-006 -1.1296027835376122e-008 -1.4902639446243037e-006 2.1520441968010174e-005 -6.9017167960388795e-007 -1.1296027835376149e-008 -6.9017167960388763e-007 5.7679957063865655e-006</rotation_covariance>
      <location_covariance>1.2021170339207746e-003 -1.8157680112466801e-006 -1.3976334597317290e-005 -1.8157680112466801e-006 1.2033529966674376e-003 4.4792361160838019e-005 -1.3976334597317290e-005 4.4792361160838019e-005 2.0488157555429709e-003</location_covariance>
      <orientation>1</orientation>
      <reference enabled="true" x="105.632175972222" y="-10.4748751388889" z="313.197" />
    </camera>

The chunk transform is
Code: [Select]
<transform>
    <rotation locked="true">-8.1053840961487511e-001 -5.3146724142603119e-001 -2.4610984911212641e-001 -3.3158710760897486e-001 7.0029820158928557e-002 9.4082188237520392e-001 -4.8278098202653230e-001 8.4417912529244210e-001 -2.3298954443061240e-001</rotation>
    <translation locked="true">-1.6902721015857060e+006 6.0408286681674914e+006 -1.1519789192598241e+006</translation>
    <scale locked="true">1.0044411226433386e+001</scale>
  </transform>
which I understand can be made into a 4x4 transform matrix by multiplying the rotation by the scale and combining with the translation, i.e. to get
 Matrix([[-8.141381100991113, -5.33827552626121, -2.4720285313576693, -1690272.101585706],
[-3.330597266208162, 0.7034083117894531, 9.45000187740369, 6040828.668167491],
[-4.8492507157758356, 8.479282283208121, -2.3402427957204432, -1151978.919259824],
[0.0, 0.0, 0.0, 1.0]])
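That composition can be checked numerically; a minimal numpy sketch using the chunk values quoted above (the scale multiplies the 3x3 rotation block, and the translation fills the last column):

```python
import numpy as np

# Values copied from the chunk <transform> XML quoted above.
rotation = np.array([
    -8.1053840961487511e-01, -5.3146724142603119e-01, -2.4610984911212641e-01,
    -3.3158710760897486e-01,  7.0029820158928557e-02,  9.4082188237520392e-01,
    -4.8278098202653230e-01,  8.4417912529244210e-01, -2.3298954443061240e-01,
]).reshape(3, 3)
translation = np.array([-1.6902721015857060e+06,
                         6.0408286681674914e+06,
                        -1.1519789192598241e+06])
scale = 1.0044411226433386e+01

# Compose the 4x4 chunk transform (internal -> geocentric coordinates).
T = np.eye(4)
T[:3, :3] = rotation * scale  # scale the rotation block
T[:3, 3] = translation        # translation goes in the last column
```

Running this reproduces the 4x4 matrix shown above.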

Paulo

Re: Projecting from raw images to orthomosaic using chunk xml transforms
« Reply #2 on: January 29, 2021, 10:06:39 AM »
Hi EllieBowler,

Given a point P in model coordinates, K the calibration matrix for the pinhole camera, and CT the camera transform matrix, the pixel coordinates of the projected point P in the camera are given by the following screen capture.

Attached is an Excel file where you can test the projection by entering your own camera transform, calibration and point coordinates...
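Since the screen capture itself is not reproduced in this copy, here is a minimal numpy sketch of the forward projection it describes, assuming Metashape's frame-camera model (invert the camera transform to get camera-frame coordinates, normalize by depth, apply the k1-k3/p1/p2 Brown distortion terms, then scale by f about the image centre offset by cx, cy); the exact distortion convention is an assumption worth checking against the spreadsheet:

```python
import numpy as np

def project(P, cam_T, f, cx, cy, w, h,
            k1=0.0, k2=0.0, k3=0.0, p1=0.0, p2=0.0):
    """Project a model-space point P to pixel coordinates (u, v).

    cam_T is the 4x4 camera transform from the chunk XML, which maps
    camera coordinates to model coordinates; its inverse maps the model
    point into the camera frame.
    """
    Pc = np.linalg.inv(cam_T) @ np.append(P, 1.0)
    x, y = Pc[0] / Pc[2], Pc[1] / Pc[2]  # normalized image coordinates
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    xd = x * radial + p1 * (r2 + 2 * x * x) + 2 * p2 * x * y
    yd = y * radial + p2 * (r2 + 2 * y * y) + 2 * p1 * x * y
    u = w / 2 + cx + xd * f
    v = h / 2 + cy + yd * f
    return u, v
```

As a sanity check, with zero distortion a point on the optical axis should land on the principal point (w/2 + cx, h/2 + cy).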

Of course the reverse problem, going from pixel coordinates to model or world 3D coordinates, is not so trivial, as the camera transform will give you only a direction and not a point on the 3D surface. To do this you can use the pickPoint method in Metashape.

Given a point p with pixel position (x, y) in a given camera, the corresponding point P on the model surface is defined by:
Code: [Select]
surface = chunk.dense_cloud
P = surface.pickPoint(camera.center, camera.unproject(Metashape.Vector([x, y, 1])))

Hope this can be useful,

« Last Edit: January 29, 2021, 03:56:06 PM by Paulo »
Best Regards,

Paul Pelletier,
Surveyor

elliebowler

Re: Projecting from raw images to orthomosaic using chunk xml transforms
« Reply #3 on: February 10, 2021, 09:48:57 PM »
Hi Paul,

Thank you so much for your reply! Sorry i have only just got back to you, i have been looking into this problem of getting the 3D surface which you pointed out to me.

I was wondering where the camera.center vector from your code is stored in the XML (or if it is!)? Or is there another way I can export these values in a different format (e.g. omega, kappa, phi) in Agisoft? Also, just to clarify, does this camera.center vector locate the middle of the 2D camera image in the 3D model reference system?

Also, I realised my data is exported in WGS 84, so will I need to perform another transformation to convert it to the correct coordinate system before applying the transformations stored in the XML? I think this may be the reason I'm not able to get the values to match up at the moment. I wonder if you have any advice on how to project to and from geographic coordinate systems (outside the Metashape Python library).

Thank you again for your help already,
Ellie


Paulo

Re: Projecting from raw images to orthomosaic using chunk xml transforms
« Reply #4 on: February 11, 2021, 05:01:07 AM »
Hi Ellie,

In the XML, the camera.center coordinates are stored in the last column of the 4x4 camera transform, as in:

Quote
camera = Metashape.app.document.chunk.cameras[0]

camera.center
Out[2]: Vector([0.6667420411798293, -0.6872752106074211, 0.008129134090471436])

camera.transform
Out[3]: Matrix([[0.5291440348212693, 0.8485319837894297, -0.00025079761404796187, 0.6667420411798293],
       [0.8482357034523637, -0.5289514179099659, 0.026581739591503416, -0.6872752106074211],
       [0.022422796474593297, -0.014278304430592536, -0.9996466116687668, 0.008129134090471436],
       [0.0, 0.0, 0.0, 1.0]])

or, from the XML camera transform entry, extract the 4th, 8th and 12th numbers in the list:

<transform>5.2914403482126948e-01 8.4853198378942918e-01 -2.5079761404796642e-04 6.6674204117982927e-01 8.4823570345236399e-01 -5.2895141790996592e-01 2.6581739591503409e-02 -6.8727521060742114e-01 2.2422796474593304e-02 -1.4278304430592553e-02 -9.9964661166876667e-01 8.1291340904714356e-03 0 0 0 1</transform>
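A minimal sketch of that extraction outside Metashape: split the whitespace-separated string into 16 floats, reshape to a row-major 4x4, and read the last column of the top three rows.

```python
import numpy as np

# The <transform> string for the example camera quoted above.
transform_text = ("5.2914403482126948e-01 8.4853198378942918e-01 -2.5079761404796642e-04 "
                  "6.6674204117982927e-01 8.4823570345236399e-01 -5.2895141790996592e-01 "
                  "2.6581739591503409e-02 -6.8727521060742114e-01 2.2422796474593304e-02 "
                  "-1.4278304430592553e-02 -9.9964661166876667e-01 8.1291340904714356e-03 "
                  "0 0 0 1")

# Row-major 4x4 camera transform; the camera centre is the translation column.
T = np.array([float(v) for v in transform_text.split()]).reshape(4, 4)
center = T[:3, 3]
```

This reproduces the camera.center vector shown in the console output above.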

The chunk transform 4x4 matrix T transforms from internal coordinates to geocentric coordinates, and T.inv() transforms from geocentric coordinates to internal chunk coordinates. So if you are starting from geographic coordinates, you first have to do a geographic-to-geocentric transformation, using the parameters of the reference ellipsoid (a, 1/f) as in the following attachment...
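The geographic-to-geocentric step can be written out directly from the reference ellipsoid parameters; a minimal sketch of the standard closed-form conversion for WGS84 (a = 6378137 m, 1/f = 298.257223563):

```python
import math

def geodetic_to_geocentric(lon_deg, lat_deg, h,
                           a=6378137.0, inv_f=298.257223563):
    """Convert geographic (lon, lat in degrees, ellipsoidal height in m)
    to geocentric (ECEF) X, Y, Z on the given reference ellipsoid."""
    f = 1.0 / inv_f
    e2 = f * (2.0 - f)  # first eccentricity squared
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    # Prime vertical radius of curvature at this latitude.
    N = a / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    X = (N + h) * math.cos(lat) * math.cos(lon)
    Y = (N + h) * math.cos(lat) * math.sin(lon)
    Z = (N * (1.0 - e2) + h) * math.sin(lat)
    return X, Y, Z
```

The same transformation is available outside Metashape via the EPSG codes 4326 and 4978 (e.g. with pyproj or GDAL), as discussed later in the thread.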


« Last Edit: February 11, 2021, 09:38:09 PM by Paulo »
Best Regards,

Paul Pelletier,
Surveyor

elliebowler

Re: Projecting from raw images to orthomosaic using chunk xml transforms
« Reply #5 on: February 12, 2021, 04:05:46 PM »
Hi Paul,

Thanks again! My camera.center coordinates match the last column of the camera.transform matrix, exactly as you say. This would have taken me a long time to figure out!

Also thanks for explaining the geocentric -> geographic projection. I was wondering whether it is also possible to get this information using just the XML, I presume from this <reference> section? I've previously used the gdal/ogr library to convert results into long/lat, but I've never worked with geocentric coordinates, so I'm not sure what to put as the CRS for the input source, which I need to project into WGS84 (or vice versa).

Quote
  <reference>GEOGCS["WGS 84",DATUM["World Geodetic System 1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],TOWGS84[0,0,0,0,0,0,0],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.01745329251994328,AUTHORITY["EPSG","9102"]],AUTHORITY["EPSG","4326"]]</reference>
  <region>
    <center>8.0295666577678944e+000 7.5971928366392634e+000 -6.9026469309386114e+000</center>
    <size>4.6588402261177400e+001 3.3074866737197382e+001 1.1623963295679715e+001</size>
    <R>-1.1289887338639719e-001 -9.9224287739757688e-001 5.2037646390469303e-002 9.9353856559191167e-001 -1.1212388138353775e-001 1.7588459428196531e-002 -1.1617360700660691e-002 5.3687125805611760e-002 9.9849022501629103e-001</R>

Many thanks again,
Ellie


Paulo

Re: Projecting from raw images to orthomosaic using chunk xml transforms
« Reply #6 on: February 13, 2021, 12:47:43 AM »
Hey Ellie,

the EPSG WKT for the geocentric CRS referred to the WGS84 datum is (https://spatialreference.org/ref/epsg/4978/html/):

GEOCCS["WGS 84",
    DATUM["World Geodetic System 1984",
        SPHEROID["WGS 84",6378137.0,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0.0,
        AUTHORITY["EPSG","8901"]],
    UNIT["m",1.0],
    AXIS["Geocentric X",OTHER],
    AXIS["Geocentric Y",EAST],
    AXIS["Geocentric Z",NORTH],
    AUTHORITY["EPSG","4978"]]

so you just need to use the EPSG codes 4326 (GEOGCS WGS84) and 4978 (GEOCCS WGS84) to do the transformation, as in the following console entry from Metashape, where I transform the geographic coordinates 90 deg W longitude, 45 deg N latitude, 100 m height to geocentric on WGS84:

Code: [Select]
PG = Metashape.Vector((-90,45,100)) # Geographic coordinates
PC = chunk.crs.transform(PG,Metashape.CoordinateSystem("EPSG::4326"),Metashape.CoordinateSystem("EPSG::4978")) # Geocentric coordinates
print("Geographic Coordinates WGS84: {:.4f} {:.4f} {:.3f}".format(PG.x,PG.y,PG.z))
print("Geocentric Coordinates WGS84: {:.3f} {:.3f} {:.3f}".format(PC.x,PC.y,PC.z))

Geographic Coordinates WGS84: -90.0000 45.0000 100.000
Geocentric Coordinates WGS84: 0.000 -4517661.590 4487419.120

Hope this clears up your doubt,

Edited to change 4328 EPSG code (deprecated) to 4978.
« Last Edit: February 16, 2021, 10:06:28 PM by Paulo »
Best Regards,

Paul Pelletier,
Surveyor