I have a similar situation and am interested in the responses. I have four-plus image sets of the same object (a rock face) taken with different camera settings (but no changes in camera focus or aperture within any one image set). The reasons for multiple image sets of the same object were primarily the camera's limited memory card (8 GB), which filled up before we finished one of the sets, so we repeated the capture. In addition, we captured the scale in two image sets on an adjacent part of the wall, at two different distances and camera settings.
Because it was a vertical wall, we had >60 percent vertical overlap between portrait-orientation images within each image set, and we also rotated the camera 180 degrees and to horizontal orientation in overlapping rows of images for calibration purposes. The total data set was 800+ images: two chunks of about 400 images each, plus two scale chunks of roughly 20-40 images each.
My workflow, which didn't produce as good a result as I had hoped, was as follows:
1. I loaded each of the four image sets into its own chunk within the project. I masked and aligned the photos in each chunk independently, producing a single camera calibration per chunk (steps 1-3 are sketched in the first code block after this list).
2. I aligned the chunks (point-based, high accuracy, constrain features by mask, generic preselection, 80,000 point limit).
3. I merged the chunks. I didn't add scale bars to the ground control pane yet; I just marked the points on my scale in the photos. I'd suggest not merging markers if you have markers in more than one chunk, because Marker 1 in one chunk gets conflated with Marker 1 in another chunk, generating many mislocated gray marker flags.
4. I optimized the alignment on the merged chunk (see the second sketch after this list). I prefer to apply gradual selection very lightly, removing only about 40K points per optimization step from a sparse point cloud that started with >9M points. Once I got the reprojection error (RE) down to about 0.35 pixel and the reconstruction uncertainty (RU) below about 30, I added the scale bars and continued optimizing until RE < 0.3 (about 2.6) and RU < 10.
5. I generated the dense point cloud at low quality to check the optimization.
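For what it's worth, the Python-console equivalent of steps 1-3 is roughly the sketch below. It's only illustrative: the paths and chunk labels are placeholders, the keyword names differ between Metashape releases, and depending on the version alignChunks/mergeChunks may expect chunk keys rather than Chunk objects, so check the API reference for your release.

```python
import glob
import Metashape

doc = Metashape.app.document

# Step 1: one chunk per image set, each masked and aligned independently,
# so every chunk ends up with its own camera calibration.
# (Older releases use accuracy=Metashape.HighAccuracy instead of downscale=1.)
image_sets = ["set_A", "set_B", "scale_near", "scale_far"]      # placeholder labels
for label in image_sets:
    chunk = doc.addChunk()
    chunk.label = label
    photos = sorted(glob.glob("/path/to/" + label + "/*.JPG"))  # placeholder path
    chunk.addPhotos(photos)
    # masks were drawn/imported in the GUI before this point
    chunk.matchPhotos(downscale=1,              # High accuracy
                      generic_preselection=True,
                      filter_mask=True)         # constrain features by mask
    chunk.alignCameras()

# Step 2: chunk alignment (point based, high accuracy, constrain features by
# mask, generic preselection, 80,000 point limit -- set via the corresponding
# options for your version; defaults shown here).
doc.alignChunks(chunks=doc.chunks, reference=doc.chunks[0])

# Step 3: merge, leaving "merge markers" off so Marker 1 in one chunk
# is not conflated with Marker 1 in another.
doc.mergeChunks(chunks=doc.chunks, merge_markers=False)
```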
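The gradual-selection loop in step 4 and the low-quality check in step 5 look roughly like this. Again just a sketch: the thresholds, pass count, and marker labels are placeholders, and the class names changed between releases (Metashape.PointCloud.Filter / buildDenseCloud in 1.x versus Metashape.TiePoints.Filter / buildPointCloud in 2.x).

```python
import Metashape

doc = Metashape.app.document
chunk = doc.chunk            # the merged chunk

def cull(criterion, threshold):
    """Remove tie points above `threshold` for one gradual-selection criterion."""
    f = Metashape.TiePoints.Filter()     # Metashape.PointCloud.Filter() in 1.x
    f.init(chunk, criterion=criterion)
    f.selectPoints(threshold)
    chunk.tie_points.removeSelectedPoints()   # chunk.point_cloud in 1.x

# Step 4: a few light gradual-selection passes, re-optimizing after each.
# Thresholds are placeholders -- I chose them so only ~40K points were
# removed per pass out of >9M.
for _ in range(3):
    cull(Metashape.TiePoints.Filter.ReconstructionUncertainty, 30)
    cull(Metashape.TiePoints.Filter.ReprojectionError, 0.5)
    chunk.optimizeCameras(fit_f=True, fit_cx=True, fit_cy=True,
                          fit_k1=True, fit_k2=True, fit_k3=True,
                          fit_p1=True, fit_p2=True)

# Add the scale bars once RE/RU are low enough, then keep optimizing.
# Marker labels and the distance are placeholders for my actual scale points.
markers = {m.label: m for m in chunk.markers}
bar = chunk.addScalebar(markers["scale_1"], markers["scale_2"])
bar.reference.distance = 0.5         # known length in metres
chunk.optimizeCameras()

# Step 5: low-quality dense cloud just to check the optimization.
chunk.buildDepthMaps(downscale=8, filter_mode=Metashape.MildFiltering)  # downscale=8 == Low
chunk.buildPointCloud()              # buildDenseCloud() in 1.x
```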
I was disappointed to find that the dense point cloud comes out in layers with many discontinuities, rather than converging on a single continuous surface. It appears that the slight differences in camera calibration between chunks lead to depth maps that disagree, even though each chunk's own calibration produces a single, continuous surface.
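To see how far apart the calibrations actually are, I've been dumping each sensor's adjusted parameters from the merged chunk in the Python console, something like the snippet below (each original chunk contributes its own sensor(s) to the merged chunk; the exact set of distortion coefficients depends on the camera model you fit):

```python
import Metashape

chunk = Metashape.app.document.chunk   # the merged chunk

# Print the adjusted interior parameters of every calibration group side by side.
for sensor in chunk.sensors:
    c = sensor.calibration
    print(f"{sensor.label}: f={c.f:.2f}  cx={c.cx:.2f}  cy={c.cy:.2f}  "
          f"k1={c.k1:.5f}  k2={c.k2:.5f}  k3={c.k3:.5f}  p1={c.p1:.5f}  p2={c.p2:.5f}")
```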
I had learned (I thought) that using multiple cameras on the same object can help reduce reconstruction uncertainty, but apparently not in this case. How can I get the different chunks to agree better and converge on a single surface? I spent many hours optimizing these data sets, but the results are not what I had hoped.