Hello feiko.lai,
For better accuracy we usually suggest aligning all the images at once, optimizing the alignment, and then, if resources are insufficient, using duplicated chunks to build the dense cloud in smaller regions.
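To illustrate only the region arithmetic behind working in smaller regions, here is a minimal pure-Python sketch that divides one bounding box into a grid of smaller boxes. The function name `split_region` and the tuple layout are hypothetical, the rotation of the region is ignored, and this is not the actual split script.

```python
def split_region(center, size, nx, ny):
    """Split an axis-aligned box into nx * ny sub-boxes along X and Y.

    center, size: (x, y, z) tuples in the region's own coordinate frame.
    Returns a list of (sub_center, sub_size) tuples; Z is left unchanged.
    """
    cx, cy, cz = center
    sx, sy, sz = size
    cell_x, cell_y = sx / nx, sy / ny
    # Lower corner of the full box in X and Y.
    min_x, min_y = cx - sx / 2, cy - sy / 2
    cells = []
    for i in range(nx):
        for j in range(ny):
            sub_center = (min_x + (i + 0.5) * cell_x,
                          min_y + (j + 0.5) * cell_y,
                          cz)
            cells.append((sub_center, (cell_x, cell_y, sz)))
    return cells
```

Each returned pair could then be assigned to the region of one duplicated chunk, so that the dense cloud is built for one cell at a time.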
If you run the processing in network mode (even on a single node), memory consumption for some steps will be lower. You can also reduce memory consumption by using lower key point and tie point limits, for example 20,000 / 4,000.
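If you script the alignment, the lower limits could be passed roughly as in the sketch below. It assumes that `chunk` is a chunk object from the application's Python API and that `matchPhotos` accepts `keypoint_limit` and `tiepoint_limit` keyword arguments in your version; the helper name `align_with_limits` is hypothetical, so please check the API reference for your release before using it.

```python
def align_with_limits(chunk, keypoint_limit=20000, tiepoint_limit=4000):
    """Match, align and optimize all photos in one chunk with reduced limits.

    `chunk` is expected to behave like a chunk from the Python API; the
    method and keyword names are assumptions about the installed version.
    """
    # Lower limits reduce memory use during matching.
    chunk.matchPhotos(keypoint_limit=keypoint_limit,
                      tiepoint_limit=tiepoint_limit)
    chunk.alignCameras()
    # Optimize the alignment before any per-region dense cloud processing.
    chunk.optimizeCameras()
```

The helper takes the chunk as an argument instead of importing the API module, so it can be called with the active chunk from the console or from a larger script.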
As for the issues that you have with the merging script, I can suggest disabling the Merge Back function. That way the chunks will at least be processed, and you can merge the data back manually after removing the unnecessary data and selecting only what is needed for merging.