Topic: How to work with large datasets-Chunks

rainman

  • Newbie
  • Posts: 21
How to work with large datasets-Chunks
« on: October 30, 2017, 12:31:58 AM »
Dear Alex,

I am working with a large dataset of photos (1600 images) captured by a UAV that flies along a straight line and then back again, parallel to the first line, with about 80% overlap (front to back and side to side). I have tried processing it as one large project file, but there are simply too many images and I run out of computer memory!

So I am creating 4 chunks and putting about 400 images in each chunk. Each image has lat, long, roll, pitch and yaw, so each chunk has its own camera position file. Is it correct to add that file to each chunk?

Am I correct in thinking that once I have created each chunk and added the imagery and camera file to it, I align each chunk, then build the dense point cloud for each chunk, and once this is done I can align all chunks together and merge them into one? I am trying to understand the whole process, but there seems to be very little information available. I hope you can advise me on the whole process.

Regards

Ian

SAV

  • Hero Member
  • Posts: 710
Re: How to work with large datasets-Chunks
« Reply #1 on: October 30, 2017, 04:41:09 AM »
Hi rainman,

Processing of pictures (in a single chunk) is limited by your hardware (RAM), as you have experienced yourself. Check out PhotoScan's memory requirements:
http://www.agisoft.com/pdf/tips_and_tricks/PhotoScan_Memory_Requirements.pdf

Basically the amount of available RAM will limit the number of pictures you can process in a single chunk.

However, I highly recommend running the image alignment step on ALL images before splitting up your project into chunks. Image alignment mainly uses your CPU (= processor) and, since version 1.3, also your GPU (= graphics card; not sure to what extent though). Image alignment requires less RAM than point cloud densification. Also make sure to have REFERENCE enabled for image alignment, because your UAV imagery probably has location information in the EXIF (= geotagged images).

After you have aligned all images (and cleaned & optimized your sparse point cloud), split your project into chunks in order to avoid the RAM/memory issues associated with the computation of your dense point cloud.
To split your project into chunks, use the Python script from the Agisoft Wiki (http://wiki.agisoft.com/wiki/Python).
It's called PS130_split_in_chunks_dialog.py

All the best.

Regards,
SAV



rainman

  • Newbie
  • Posts: 21
Re: How to work with large datasets-Chunks
« Reply #2 on: October 30, 2017, 08:05:04 PM »
Hi SAV,

Thank you for your reply. I have some additional questions which I am hoping you may be able to help me with.

My flight path is basically two parallel lines, side by side. I believe the overlap between images was calculated to be 75% (front to back and side to side). So you suggest I initially align all images together as one chunk and clean them up first, then use the Python script to split the project into chunks. The question I have is: does there need to be overlap between chunks, and if so, how much? I have attached a drawing of what I am trying to illustrate. Can I chunk images 0-200, 201-400, 401-600, or do I need to have a degree of overlap, for example chunk 1 0-200, chunk 2 175-400, chunk 3 375-500? I am trying to understand the best way to chunk these images up. Also, because I have 2 parallel flight lines, do I need to have overlap (side to side) between these two flight lines?
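
To make the overlapping option concrete, here is a minimal sketch in plain Python (it makes no PhotoScan API calls). The function name `overlapping_ranges` and its parameters are hypothetical, and the overlap of 25 images is only the value from the example above, not a recommended figure.

```python
def overlapping_ranges(n_images, chunk_size, overlap):
    """Return inclusive (first, last) image index ranges.

    Consecutive ranges share `overlap` images. All names here are
    illustrative assumptions, not part of any PhotoScan script.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    if n_images <= 0:
        return []
    step = chunk_size - overlap  # how far each new chunk advances
    ranges = []
    start = 0
    while True:
        end = min(start + chunk_size, n_images) - 1
        ranges.append((start, end))
        if end == n_images - 1:  # last image reached
            break
        start += step
    return ranges


print(overlapping_ranges(600, 200, 25))
# [(0, 199), (175, 374), (350, 549), (525, 599)]
```

With an overlap of 0 the same function reproduces the non-overlapping split (0-199, 200-399, 400-599), so both options from the question can be compared side by side.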

Regards
Rainman.

SAV

  • Hero Member
  • Posts: 710
Re: How to work with large datasets-Chunks
« Reply #3 on: October 31, 2017, 04:50:05 AM »
Hi rainman,

I've just tested my previously suggested workflow and saw that the 'split into chunks' Python script does not allow a grid with only 1 row or 1 column (you can do a 2 x 2 grid, but not 1 x 2). This means it is not the best solution for your 2-line survey, where you want to keep the pictures of both parallel flight lines in each chunk.

I think the following workflow is better (I just tried it myself):

1. Align ALL images to generate your sparse point cloud (use pair preselection: REFERENCE and accuracy: HIGH)

2. Clean your sparse point cloud using EDIT > GRADUAL SELECTION. Then run OPTIMIZE.

3. Duplicate the active chunk to have 3 additional copies of it (right click on chunk in workspace pane and choose DUPLICATE). You should now have 4 chunks.

4. Double click on the first chunk and adjust your region to cover 1/4 of the area (10th icon from left in the main icon bar). Make sure that you include both parallel flight lines in each chunk. Roughly remember the boundary of the new region.

5. Open the second chunk and again adjust the region, slightly overlapping the end of the region of the previous chunk.

6. Repeat this step for the rest of the chunks.

7. Start a batch process to generate the dense point clouds for all 4 chunks. Note that because we have resized the region for each chunk, the point cloud will only be generated within this area (hence it is computationally less intensive).

8. Merge chunks using WORKFLOW > MERGE CHUNKS. Choose MERGE DENSE CLOUDS. You should now have a dense point cloud for the whole region.

DONE.

Note that you will now have each aerial photograph 4 times in your merged chunk. Only keep the first one and delete the 3 other ones for each image. Then compute your DEM and orthophotograph.
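
The "keep the first one and delete the others" clean-up can be sketched as follows. This is plain Python with no PhotoScan API calls; it assumes the duplicated photographs share the same label, and both the function name `duplicate_positions` and the file names are hypothetical.

```python
def duplicate_positions(labels):
    """Return the positions of repeated labels, keeping each first occurrence.

    `labels` stands in for the photo labels of the merged chunk
    (an assumption for illustration; no PhotoScan objects are used).
    """
    seen = set()
    to_delete = []
    for position, label in enumerate(labels):
        if label in seen:
            to_delete.append(position)  # a later copy: mark for deletion
        else:
            seen.add(label)  # first copy: keep it
    return to_delete


# Two photos, each present 4 times after merging 4 duplicated chunks.
merged = ["IMG_0001.JPG", "IMG_0002.JPG"] * 4
print(duplicate_positions(merged))
# [2, 3, 4, 5, 6, 7]
```

The returned positions are the entries that would be removed, leaving exactly one copy of each photograph.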

Regards,
SAV
« Last Edit: October 31, 2017, 07:01:43 AM by SAV »

juanfraf

  • Newbie
  • Posts: 6
Re: How to work with large datasets-Chunks
« Reply #4 on: July 24, 2018, 08:44:29 PM »
Hello!

I am in a similar situation. I have about 9,000 images. I tried working with different chunks, but the problem comes when I merge them: if I cut a profile across a join between two chunks, I get two parallel point clouds offset by 20-50 cm, as in the attached image.

So I have two questions:

1. What do you think is the best option for processing such a huge number of images and getting the best result?

2. I am trying the Python script. I downloaded it from the link and loaded it in PhotoScan, but I don't know how to use it. What must I write in the arguments field?

haglore

  • Newbie
  • Posts: 3
Re: How to work with large datasets-Chunks
« Reply #5 on: September 17, 2018, 06:20:58 PM »
Hi SAV,
So in the past when I've worked with chunks (about a year ago), I've had trouble with the overlapping part between 2 chunks. The dense clouds of the two chunks would overlap, but the points did not perfectly match positions, so the resulting cloud/image in the overlapping locations was very messy/noisy. Perhaps I needed to split into chunks AFTER aligning the photos, so that applying the same camera model would place the dense cloud points in EXACTLY the same position in the overlap zone, but honestly I can't remember at what point before generating the dense cloud I actually split the model into chunks (this was some time ago).

In my projects I'm mapping fractures in bedrock cliffs; I want very clear models/orthoimages (usually using the texture model -> scaled orthophotos), so noise in the dense cloud at the overlapping-chunks zone propagates to noise in these regions in the derived products. Have you noticed an issue like this with your workflow?

Thanks for your suggestions on this topic. Very interested in your workflow because I am usually pushing the computing power of my machine too!
Cheers,
Al

JRM

  • Jr. Member
  • Posts: 81
Re: How to work with large datasets-Chunks
« Reply #6 on: September 18, 2018, 10:58:55 AM »
Please upvote the feature request "Add overlap/buffer handling to core processing". The ability to use and discard overlap buffers would greatly ease this kind of workflow:
1. align the whole dataset, or align different chunks and then align them to each other
2. divide with an overlap
3. build dense cloud/mesh
4. discard the overlap
5. merge or export the result
Here is an example used on LIDAR data; the end result is free of edge artifacts, empty pixels, etc.
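
The divide / process / discard / merge idea in the list above can be sketched in plain Python. This is only an illustration under stated assumptions: tiles are cut along a single axis (x), `process_tile` is a placeholder for the real dense cloud or mesh step, and none of these function names exist in PhotoScan.

```python
def make_tiles(x_min, x_max, n_tiles, buffer):
    """Split [x_min, x_max] into core tiles, each with a buffered extent."""
    width = (x_max - x_min) / n_tiles
    edges = [x_min + i * width for i in range(n_tiles)] + [x_max]
    tiles = []
    for i in range(n_tiles):
        core = (edges[i], edges[i + 1])
        buffered = (max(x_min, core[0] - buffer), min(x_max, core[1] + buffer))
        tiles.append({"core": core, "buffered": buffered})
    return tiles


def process_tile(points, tile):
    """Placeholder for per-tile processing: uses every point in the buffer."""
    lo, hi = tile["buffered"]
    return [p for p in points if lo <= p[0] <= hi]


def discard_overlap(points, tile, is_last):
    """Keep only points inside the core extent (half-open, except last tile)."""
    lo, hi = tile["core"]
    return [p for p in points if lo <= p[0] < hi or (is_last and p[0] == hi)]


def merge_tiles(points, tiles):
    """Process each buffered tile, drop its buffer, and concatenate the cores."""
    merged = []
    for i, tile in enumerate(tiles):
        processed = process_tile(points, tile)
        merged.extend(discard_overlap(processed, tile, i == len(tiles) - 1))
    return merged


# Toy data: 101 points along a 100 m strip, 4 tiles, 5 m buffer.
points = [(x, 0.0, 0.0) for x in range(101)]
tiles = make_tiles(0.0, 100.0, 4, 5.0)
merged = merge_tiles(points, tiles)
print(len(points), len(merged))
```

Because the core extents are half-open and share their edges, each point ends up in the merged result exactly once, while every tile was still processed with context from its neighbours.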