Hello Duytap,
Yes, you need to mask the background in all the photos. This can be automated if you take one image of the background without the object from the same camera position.
A darkened area means that it is masked, so in your second screenshot it is actually the object that is masked, not the background.
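To illustrate the idea behind masking from a background shot, here is a minimal sketch of background differencing. It is not the software's own algorithm, only an illustration. The function name `mask_from_background`, the `tolerance` value and the tiny grayscale "images" (plain nested lists) are assumptions made up for this example.

```python
def mask_from_background(photo, background, tolerance=10):
    """Return a per-pixel mask: True where the photo differs from the
    empty-background shot (likely object), False where it matches
    (likely background, i.e. the area to be masked out)."""
    same_shape = len(photo) == len(background) and all(
        len(p_row) == len(b_row) for p_row, b_row in zip(photo, background)
    )
    if not same_shape:
        raise ValueError("photo and background must have the same dimensions")
    return [
        [abs(p - b) > tolerance for p, b in zip(p_row, b_row)]
        for p_row, b_row in zip(photo, background)
    ]


# Hypothetical 3x4 grayscale images taken from the same camera position.
background = [
    [50, 52, 51, 50],
    [49, 50, 52, 51],
    [50, 51, 50, 49],
]
photo = [
    [51, 53, 50, 50],
    [48, 200, 210, 52],
    [50, 190, 205, 50],
]

mask = mask_from_background(photo, background)
for row in mask:
    # '#' marks pixels kept as object, '.' marks pixels treated as background.
    print("".join("#" if keep else "." for keep in row))
```

This only works if the camera does not move between the background shot and the object shots, and the tolerance has to be large enough to absorb sensor noise and small lighting changes.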
The object itself looks quite difficult to reconstruct: it has few distinctive surface features (no texture pattern), and there are glare spots on the surface.
Additionally, I suggest taking the images from a shorter distance (if your camera allows it) or using a longer focal length. That way the object fills more of the image frame, instead of the frame being taken up by background that will not be used after masking.
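As a rough illustration of why both options help, the sketch below uses the simple pinhole approximation (projected size = focal length × object size / distance), which is reasonable when the distance is much larger than the focal length. The function name `frame_fraction` and all the numbers are assumptions chosen for the example, not values from your project.

```python
def frame_fraction(object_size_mm, distance_mm, focal_length_mm, sensor_size_mm):
    """Approximate fraction of the sensor width covered by the object,
    using the pinhole model (valid when distance >> focal length)."""
    if min(object_size_mm, distance_mm, focal_length_mm, sensor_size_mm) <= 0:
        raise ValueError("all arguments must be positive")
    projected_mm = focal_length_mm * object_size_mm / distance_mm
    return projected_mm / sensor_size_mm


# Hypothetical setup: 200 mm wide object, 36 mm wide sensor.
baseline = frame_fraction(200, 2000, 35, 36)     # 35 mm lens at 2 m
closer = frame_fraction(200, 1000, 35, 36)       # same lens at 1 m
longer_lens = frame_fraction(200, 2000, 70, 36)  # 70 mm lens at 2 m

print(f"baseline:    {baseline:.1%} of frame width")
print(f"closer:      {closer:.1%} of frame width")
print(f"longer lens: {longer_lens:.1%} of frame width")
```

In this approximation, halving the distance and doubling the focal length give the same gain in coverage. In practice a closer distance also changes perspective and depth of field, so check that the whole object stays in focus.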
As for the alignment settings, I suggest High accuracy, Generic or disabled preselection, and 40000/10000 for the key point and tie point limits, respectively.