Securing your computer

Metadata cleansing

  • Why is metadata a risk?
  • On Linux (Debian)
  • Other platforms (Windows and macOS)
Every time you create a digital file, be it a photo, office document, audio or video file, it contains metadata in the background. This information isn't directly visible when you open the file, but it's there, and can contain extremely sensitive elements.

Why is metadata a risk?

Metadata is data attached to a file, the role of which is to provide contextual information about the content. In an image, this can include the date and time the image was taken, precise GPS coordinates, the model of camera or smartphone used, and sometimes even technical settings. In a text document, it can include the author's name, the company's name, the user's session ID, creation and modification timestamps, or even internal comments left during editing.
This metadata may seem harmless, but it can be used by malicious actors to identify the author of a file, physically locate a person, reconstruct events or habits, or even exploit software flaws based on the version of software used.
Let's take a concrete example: you post a supposedly anonymous photo on a forum. If you haven't removed the EXIF metadata, a single click can reveal the precise GPS coordinates of your home, the model of your phone and the exact date the photo was taken. Similarly, a PDF document sent anonymously may contain your full name in its properties.
That's why some media publishing and communication platforms automatically remove metadata from your photos. These include X (Twitter), Instagram, Signal and Session. On the other hand, other platforms don't remove metadata at all: this is the case with most online forums, many e-mail clients, or even when you publish directly on a website.
It's an essential reflex to adopt: as soon as a file leaves your private sphere, you need to think about cleaning up its metadata to avoid disclosing personal or sensitive information without your knowledge. Let's take a look at how to do this, depending on your operating system.

On Linux (Debian)

Using ExifTool

The most complete and reliable tool for managing and deleting metadata is ExifTool, developed by Phil Harvey. It can read a large number of file formats (JPG, PNG, PDF, MP3, DOCX...) and can both display metadata and remove it from the formats it is able to write.
  • Step 1: Install ExifTool
To install it on a Debian-based distribution (such as Ubuntu), open a terminal and run the following two commands:
sudo apt update
sudo apt install libimage-exiftool-perl
This package installs exiftool, which you can then use directly from the command line.
  • Step 2: Viewing file metadata
To view all the metadata contained in a file, use the following command:
exiftool name.webp
Replace name.webp with the real name of your file. Also make sure you're positioned in the directory containing this image. For example, if I have a photo of the Satoshi Nakamoto statue in the ~/Downloads directory, I can display its metadata by running the following commands:
cd ~/Downloads
exiftool Satoshi-Nakamoto-Lugano.webp
You'll then see a long list of attributes, potentially including:
  • Date and time of creation
  • GPS location
  • Camera manufacturer and model
  • The software used for editing
  • Information about the author...
This gives you a complete overview of what you're about to publish or transmit.
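The full listing can be long. As a minimal sketch, the helper below narrows the output to the tag families that most often identify a person or a place. The function name show_sensitive_tags and the choice of tags are assumptions made for illustration; adapt the list to your own files.

```shell
# Hypothetical helper: print only the tags most likely to identify you.
show_sensitive_tags() {
    # -a  : also show duplicate tags
    # -G1 : prefix each tag with the group it comes from
    # -s  : print short tag names instead of descriptions
    # -gps:all and -time:all select every GPS tag and every date/time tag
    exiftool -a -G1 -s \
        -gps:all -time:all \
        -Make -Model -Software -Artist -Author -Creator \
        "$@"
}
```

You would call it as show_sensitive_tags Satoshi-Nakamoto-Lugano.webp. Tags that are absent from the file are simply not printed.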
  • Step 3: Delete the metadata
To delete all unnecessary metadata from a file, use the command:
exiftool -all= name.webp
This command rewrites the file without its metadata and keeps a backup of the original, with _original appended to its name (for example, name.webp_original).
If you don't want to keep that backup, add the -overwrite_original option:
exiftool -all= -overwrite_original name.webp
If we look again at our file's metadata, we can see that all unnecessary or sensitive metadata has been removed.
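Rather than reading the whole listing again, you can test for one specific leftover. The sketch below checks whether any tag whose name starts with GPS is still present. The function name has_gps is hypothetical, and the check only covers location tags, not every kind of sensitive metadata.

```shell
# Hypothetical helper: succeeds if the file still contains GPS tags.
has_gps() {
    # '-gps*' matches any tag whose name starts with GPS, in any group.
    # -s -s -s prints bare values only, so empty output means nothing was found.
    [ -n "$(exiftool -a -s -s -s '-gps*' "$1")" ]
}

# Example use (uncomment and replace name.webp with your file):
# if has_gps name.webp; then
#     echo "GPS data still present"
# else
#     echo "no GPS tags found"
# fi
```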
  • Step 4: Clean up an entire directory
If you have several files to process in the same directory, you can use a generic command such as:
exiftool -all= *.webp
This deletes the metadata of all WebP files in the current directory. You can adapt the extension to suit your needs (*.jpg, *.pdf...).
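A shell pattern such as *.webp only covers the current directory. As a sketch, ExifTool's own -r (recurse into subdirectories) and -ext (restrict to an extension) options can process a whole directory tree. The function name clean_tree and the three extensions chosen here are assumptions for illustration.

```shell
# Hypothetical helper: strip metadata from images in a directory tree.
clean_tree() {
    dir=${1:-.}   # default to the current directory when no argument is given
    # -r   : process subdirectories recursively
    # -ext : only process files with the listed extensions
    # A *_original backup is kept for every rewritten file.
    exiftool -all= -r -ext webp -ext jpg -ext png "$dir"
}
```

For example, clean_tree ~/Downloads would process every WebP, JPG and PNG file under that directory. Check the result before deleting the _original backups.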

Using MAT2

As an alternative to ExifTool, you can use MAT2 (Metadata Anonymization Toolkit v2).
  • Installing MAT2:
sudo apt install mat2
Once installed, you can use it from the command line like this:
mat2 file.pdf
By default, MAT2 does not modify the original file: it creates a cleaned-up version in the same directory, with .cleaned inserted before the extension (for example, file.cleaned.pdf).
To clean every file in a directory, such as the ~/Downloads directory:
mat2 ~/Downloads/*
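MAT2 can also display metadata without changing anything, through its --show option. The sketch below chains the three steps: inspect, clean, then inspect the cleaned copy. The function name clean_with_mat2 is hypothetical, and the way the cleaned file name is derived assumes a simple file name with a single extension.

```shell
# Hypothetical helper: show metadata, clean, then show what is left.
clean_with_mat2() {
    file=$1
    mat2 --show "$file"          # list metadata; the file is not modified
    mat2 "$file"                 # write the cleaned copy next to the original
    # Assumption: report.pdf becomes report.cleaned.pdf
    cleaned="${file%.*}.cleaned.${file##*.}"
    mat2 --show "$cleaned"       # confirm what remains in the cleaned copy
}
```

Running mat2 --list prints the file formats that your installed version supports.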

Other platforms (Windows and macOS)

On Windows and macOS, there are several methods for removing metadata from your documents. In my opinion, the easiest is to use the open-source software ExifCleaner. This lightweight tool features a graphical interface and can handle most file formats by simply dragging and dropping. When you drop one or more files onto the interface, the software automatically removes unnecessary metadata and replaces the original files in the same directory. ExifCleaner is available for Windows, macOS and Linux.
It's extremely easy to use: just run the software, then drag and drop one or more files into the window.
Wait a few moments while the tool cleans up the metadata. Once the process is complete, you'll see a summary showing the initial and final metadata counts. All superfluous information will have been removed.
Cleaning up the metadata of the files you share is therefore a good practice to adopt when it comes to IT security. Thanks to the simple tools presented in this chapter, it's a habit you can easily implement on a daily basis.
We've come to the end of this section on securing your computer. In the next section, we'll be taking an in-depth look at one of your machine's most critical programs: the web browser. It alone centralizes a large proportion of your digital activities, making it a prime target in terms of both security and privacy.