
Topic: Remove Duplicate Image - On image name and metadata - not only path

rbush

Problem:
I am trying to remove duplicate images that are located in different folders. Each captured image has a unique name, e.g. DJI_20240910143418_0013_V. The duplicates are processed to the same location, but they produce erroneous final results if included.

Attempts:
1) Previous scripts remove duplicate images after chunks are merged, based on folder location, i.e. the path (camera.photo.path). In my case, someone generated a large project alignment with many images, some of which are duplicates located in different folders (paths). I tried all the available scripts I could find.
2) I have also tried the last entry (Topic: Delete duplicate photos? By: mwillis - June 17, 2022), which should disable duplicate images based on name length. Unfortunately this does not work in 2.1.3 and all images remain.
I have tried running the script with the Run Script command, and also enabling "Enable rich python console" and pasting and running the script in the terminal. No success.

Request:
Please could you assist with a script to remove all duplicate images based on image name, and not relative to path. If it could go further and compare both the image name and the capture date/time metadata, that would be brilliant for other users who find this thread and have duplicate names (e.g. DJI0001) but different capture times, but it is not critical here.

Version:
Metashape 2.1.3

Thank you.

   
« Last Edit: November 18, 2024, 11:42:40 PM by rbush »

Bzuco

Re: Remove Duplicate Image - On duplicate image name, not folder path
« Reply #1 on: November 18, 2024, 11:03:12 PM »
I am using czkawka to find and delete similar or duplicate images. It works based on image content and it is amazingly fast.
https://github.com/qarmin/czkawka
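
For readers who want to see what content-based matching means in the simplest case, here is a minimal sketch that groups byte-identical files by a SHA-256 digest. It is not how czkawka works internally (the thread does not describe that), and it will not catch "similar" images, only exact copies. The function names `file_digest` and `group_identical_files` are made up for this example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def group_identical_files(root):
    """Walk `root` and return lists of paths whose contents are byte-identical."""
    groups = defaultdict(list)
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            groups[file_digest(path)].append(path)
    # Keep only digests shared by more than one file.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Calling `group_identical_files` on the top-level image folder returns one list per set of identical files, regardless of file name or subfolder. Copies that were re-saved or had their metadata rewritten will have different bytes and will not be grouped.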

rbush

Re: Remove Duplicate Image - On duplicate image name, not folder path
« Reply #2 on: November 18, 2024, 11:38:31 PM »
Thanks, will take a look. Also found a solution using CoPilot.

For those looking to remove duplicate images located in different folders within your Agisoft project, this code does the following: it looks at the image name and metadata (date and time of capture) and then disables the duplicate images. You can then sort these and remove them, or keep them deactivated in your processing.

Code:
import Metashape
from datetime import datetime

# Get the current chunk
chunk = Metashape.app.document.chunk

# Create a set to store unique image names and their metadata
unique_images = set()

# Iterate through all cameras in the chunk
for camera in chunk.cameras:
    # Extract the image name without the path (handles both / and \ separators)
    image_name = camera.photo.path.replace("\\", "/").split("/")[-1]
   
    # Extract the metadata date and time
    metadata = camera.photo.meta
    date_time_str = metadata["Exif/DateTimeOriginal"] if "Exif/DateTimeOriginal" in metadata else ""
    date_time = datetime.strptime(date_time_str, "%Y:%m:%d %H:%M:%S") if date_time_str else None
   
    # Create a unique identifier for the image based on name and metadata
    image_id = (image_name, date_time)
   
    # Check if the image identifier is already in the set
    if image_id in unique_images:
        # Deactivate the camera if the image identifier is a duplicate
        camera.enabled = False
    else:
        # Add the image identifier to the set if it's unique
        unique_images.add(image_id)

print("Duplicate images based on name and metadata date and time have been deactivated.")
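
A possible variation, in case it helps anyone: the sketch below separates the matching logic from the Metashape calls so it can be checked on plain data first, and adds a dry-run mode that only lists what would be disabled. The names `find_duplicate_indices`, `disable_duplicates` and `dry_run` are invented for this example. It uses only the attributes already used in the script above (`chunk.cameras`, `camera.photo.path`, `camera.photo.meta`, `camera.enabled`) and compares the raw "Exif/DateTimeOriginal" string rather than parsing it, on the assumption that an exact string match is enough.

```python
def basename(path):
    """Return the file name from a path that uses / or \\ separators."""
    return path.replace("\\", "/").rsplit("/", 1)[-1]


def find_duplicate_indices(records):
    """records: list of (path, capture_time_string) pairs.

    Returns the indices of every record whose (file name, capture time)
    pair was already seen earlier in the list.
    """
    seen = set()
    duplicates = []
    for index, (path, capture_time) in enumerate(records):
        key = (basename(path), capture_time or "")
        if key in seen:
            duplicates.append(index)
        else:
            seen.add(key)
    return duplicates


def disable_duplicates(chunk, dry_run=True):
    """List duplicate cameras in `chunk`; disable them unless dry_run is True."""
    cameras = list(chunk.cameras)
    records = []
    for camera in cameras:
        meta = camera.photo.meta
        key = "Exif/DateTimeOriginal"
        stamp = meta[key] if key in meta else ""
        records.append((camera.photo.path, stamp))
    duplicates = find_duplicate_indices(records)
    for index in duplicates:
        print("duplicate:", records[index][0])
        if not dry_run:
            cameras[index].enabled = False
    return len(duplicates)
```

In the Metashape console this would be called as `disable_duplicates(Metashape.app.document.chunk)` to preview the list, then again with `dry_run=False` to actually disable the cameras. Images with the same name but different capture times (e.g. two different DJI0001 files) are kept, because the capture time is part of the key.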