So I take it you didn't try at a low setting, or you did and it made no difference?
If you think of it in terms of a laser scanner it doesn't make sense, because with a laser scanner, the more densely you sample an object, the better the representation will be.
With multiview reconstruction it makes sense, to me anyway, because densely sampling an image isn't enough on its own - you also have to establish a correspondence for each sampled point across multiple images.
It's a bit like holding two 24x36" prints 1cm from your face - it's hard to tell whether you're looking at exactly the same thing in each image, and depending on the scale of the texture you may be able to slide the images around in front of your face without the view changing very much. Hold them a bit further away and it's like subsampling the image - the smallest things you can now see are more likely to be distinct from their surroundings.
If your images are perfectly sharp and the scale of the texture is almost fractal-like, e.g. as in aerial photogrammetry, then using ultra high quality and every last pixel makes sense.
If your images are not sharp, or the texture becomes less textured at high magnification, then there will come a point where higher 'quality' dense cloud building becomes pointless.
Imagine if you had a perfect 5 billion gigapixel camera and could see individual atoms - chances are they'd all look pretty homogeneous and pixel matching wouldn't work. Similarly, some kind of fake moulded plastic texture (not saying your shoes are fake or plastic!) might look interesting from a certain distance, but up close it's just smooth plastic, which doesn't work well in PhotoScan.
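Here's a toy 1-D sketch of that idea in NumPy - to be clear, this is my own illustration and has nothing to do with PhotoScan's actual algorithm. It matches a small patch from a "left" view against every offset in a "right" view using normalized cross-correlation (NCC), once for a strongly textured surface and once for featureless "smooth plastic", where each view only differs by independent sensor noise:

```python
import numpy as np

rng = np.random.default_rng(0)

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 1e-12 else 0.0

def match_scores(row, patch):
    """NCC score of `patch` against every offset along `row`."""
    n = len(patch)
    return np.array([ncc(row[i:i + n], patch)
                     for i in range(len(row) - n + 1)])

def two_views(surface, noise=2.0):
    """Two 'photos' of the same 1-D surface with independent sensor noise."""
    return (surface + rng.normal(0, noise, surface.shape),
            surface + rng.normal(0, noise, surface.shape))

textured = rng.uniform(0, 255, 200)   # distinctive texture, varies pixel to pixel
smooth = np.full(200, 128.0)          # uniform intensity, like smooth plastic

true_pos, size = 60, 9

left_t, right_t = two_views(textured)
scores_t = match_scores(right_t, left_t[true_pos:true_pos + size])

left_s, right_s = two_views(smooth)
scores_s = match_scores(right_s, left_s[true_pos:true_pos + size])

print(f"textured: best match at offset {scores_t.argmax()}, "
      f"peak NCC {scores_t.max():.3f}")   # recovers offset 60, NCC near 1.0
print(f"smooth:   best match at offset {scores_s.argmax()}, "
      f"peak NCC {scores_s.max():.3f}")   # matching noise - offset is arbitrary
```

On the textured surface the true offset stands out with a score near 1.0; on the smooth surface every patch looks like every other, so the "best" match is just whichever bit of noise happens to correlate - which is my (speculative) picture of why more pixels stop helping once the texture runs out.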
I should point out I have no inside information whatsoever about the algorithms going on in PhotoScan, and most of this is based on what I've read on this forum and imagined when I should probably have been working.