September 2, 2021

Libelle Glossary Part 1: What do the terms anonymization and pseudonymization mean?

Author: Michael Schwenk

When the European General Data Protection Regulation (EU GDPR) became applicable on May 25, 2018, companies were obliged not only to handle personal data responsibly, but also to give it special protection.

The regulation distinguishes between personal data, which allow conclusions to be drawn about real, existing persons, and other data. Personal data must be processed and stored in a way that ensures their confidentiality. Although the GDPR does not prescribe in detail how the data must be protected, it does name procedures that can be used. These include anonymization and pseudonymization.

But what exactly do these terms mean, and what are the differences between the individual procedures? This blog post deals with these mysteries and tries to shed light on them.

Anonymization - what does that mean exactly?

Anonymization removes the characteristics that could allow a specific person to be identified or conclusions to be drawn about that person. Direct identifiers, such as the name, are deleted so that the data can no longer be assigned to this person. Characteristic values that occur for only one person in the dataset must also be deleted, or at least generalized. Finally, it must be ensured that no combination of different characteristics allows conclusions to be drawn about a specific person.
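The steps described above can be illustrated with a small Python sketch. It is only a minimal example under stated assumptions: the records, the field names, and the helper functions anonymize and smallest_group are invented for illustration, and the chosen generalizations (ten-year age ranges, shortened postal codes) are one possible option, not a rule taken from the GDPR.

```python
from collections import Counter

# Hypothetical sample records; all field names and values are invented.
records = [
    {"name": "Anna Meier", "age": 34, "zip": "76131", "diagnosis": "flu"},
    {"name": "Ben Schulz", "age": 37, "zip": "76133", "diagnosis": "asthma"},
    {"name": "Clara Wolf", "age": 52, "zip": "76227", "diagnosis": "flu"},
    {"name": "David Koch", "age": 58, "zip": "76229", "diagnosis": "diabetes"},
]


def anonymize(record):
    """Drop the direct identifier (name) and generalize age and postal code."""
    decade = record["age"] // 10 * 10
    return {
        "age_range": f"{decade}-{decade + 9}",
        "zip_prefix": record["zip"][:3] + "**",
        "diagnosis": record["diagnosis"],
    }


def smallest_group(rows, keys):
    """Return the size of the smallest group of rows sharing the same
    combination of values for the given keys. A result of 1 means that
    at least one combination points to exactly one row."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return min(counts.values())


anonymized = [anonymize(r) for r in records]

# Before generalization every age/zip combination is unique.
print(smallest_group(records, ["age", "zip"]))
# After generalization each combination is shared by several rows.
print(smallest_group(anonymized, ["age_range", "zip_prefix"]))
```

A group size of 1 signals that a combination of characteristics still singles out one person, so further generalization or deletion would be needed. Passing such a check on a toy dataset does not by itself prove that real data is anonymous.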

How pseudonymization differs from anonymization

The GDPR defines pseudonymization in Article 4(5) as:

"processing of personal data in such a way that the personal data can no longer be attributed to a specific data subject without the addition of further information, provided that such additional information is kept separate and is subject to technical and organizational measures which ensure that the personal data are not attributed to an identified or identifiable natural person." (Source)

In other words, pseudonymization produces data that can no longer be assigned to a specific person without additional information. That information is not available to those working with the data, because it is stored separately and protected from access by technical and organizational measures. Only authorized persons are able to restore the reference to the person. Unlike anonymized data, pseudonymized data can therefore still be traced back to a person and remain personal data under the GDPR.
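As a rough illustration of this principle, the following Python sketch replaces names with random pseudonyms and keeps the mapping in a separate object. Everything here is an assumption made for the example: the class PseudonymVault, its methods, the "P-" prefix, and the sample orders are hypothetical, and the authorized flag merely stands in for real access control.

```python
import secrets


class PseudonymVault:
    """Hypothetical store for the additional information (pseudonym -> identity).

    In practice this mapping would live in a separate, access-controlled system.
    """

    def __init__(self):
        self._by_pseudonym = {}
        self._by_identity = {}

    def pseudonymize(self, identity):
        # Reuse an existing pseudonym so the same person stays consistent
        # across all records.
        if identity not in self._by_identity:
            pseudonym = "P-" + secrets.token_hex(8)
            self._by_identity[identity] = pseudonym
            self._by_pseudonym[pseudonym] = identity
        return self._by_identity[identity]

    def reidentify(self, pseudonym, authorized=False):
        # Placeholder for a real authorization check: only authorized
        # callers may restore the reference to the person.
        if not authorized:
            raise PermissionError("re-identification requires authorization")
        return self._by_pseudonym[pseudonym]


vault = PseudonymVault()

# Hypothetical working data containing a direct identifier.
orders = [
    {"customer": "Anna Meier", "amount": 120},
    {"customer": "Ben Schulz", "amount": 75},
    {"customer": "Anna Meier", "amount": 40},
]

# The pseudonymized copy can be handed on without the vault.
pseudonymized = [
    {"customer": vault.pseudonymize(o["customer"]), "amount": o["amount"]}
    for o in orders
]
```

Because the pseudonyms are random, they reveal nothing about the person on their own; whoever holds only the pseudonymized list cannot restore the names, while whoever also has authorized access to the vault can.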

Libelle IT Group has developed LibelleDataMasking, a software solution for the required anonymization and pseudonymization. The solution was designed to produce anonymized, logically consistent data on development, test, and QA systems across all platforms.

The anonymization methods used deliver realistic, logically correct values that can be used to describe relevant business cases and to test them meaningfully end to end. In addition, developers and users alike receive a "clean" database with which they do not have to worry about data protection.
