September 15, 2022

Why do system copies and data anonymization belong together?

Author: Axel Kirn

System copies and data anonymization belong together like cars and their seat belts. The system copy is the car that takes us from the production system to the test system. Data anonymization is the seat belt: essential for safety and in most cases required by law.

Why are system copies mandatory?

Although most people know how relevant system copies are, awareness of the necessity and benefits of anonymization is still lacking in many cases. A system copy is essential for a regulated development process. Chaos would quickly prevail if developers had to try their hand at the production system. Constant failures and uncertainty would be the result, and the pace of development would slow down. No one wants to be the reason for hours of system downtime and the lost earnings that come with it.

That is why most companies rely on development systems and many even on QA systems. Here developers can push and pull to their heart's content. If you commit too boldly, you might catch the resentment of your fellow developers, but the rest of the company or even the customers won't notice this internal drama. Normal business operations remain undisturbed.

To ensure that developers can work on a system that is as similar as possible to the production system, one usually resorts to system copies. Of course, it would also be possible to create a new system "on the greenfield", but the benefits would be limited, because every system grows and develops over time. What is developed and tested on a new system does not necessarily work on a historically grown system. The solution is therefore to repeat system copies on a regular basis. This ensures that the development/QA environment and the production environment never differ too significantly. The catch: a system copy can be a lengthy process.

And here we're not just talking about pure running time, i.e. the duration of the individual steps that need to be triggered. We're also talking about staff time that has to be spent. In addition, there is the question of when a system copy can be carried out at all so as not to disrupt business operations. Most of the time, developers have to bite the bullet and work weekend shifts. If only there were a product that could perform the system copy automatically.

Data anonymization as special protection for test systems

But as soon as the copy has been completed successfully, we face the next problem: the copy was successful! All the data that was previously curated on a hermetically sealed system by hand-picked individuals is now within the test system. How is the test system protected? Who has access? Who needs which permissions? What was that again about the GDPR? If you want to make sure that no mischief can be done to the test system, you have a lot of questions to answer. Luckily, the answer will always be the same: anonymization!

Who within my company can I give access to what? Everyone who wants it, because the data is anonymized. What happens if unauthorized people gain access to the test system? The data is anonymized. What does the GDPR say?

"Where the processing is based on a purpose other than that for which the personal data were collected [...], the controller shall - in order to determine whether the processing for another purpose is compatible with that for which the personal data were originally collected - take into account, inter alia [...]
(e) the existence of appropriate safeguards, which may include encryption or pseudonymization."

GDPR Article 6(4)(e)

This is good to know, because otherwise many test systems would probably fail because of the following condition of the GDPR:

"Processing is lawful only if at least one of the following conditions is met:
(a) the data subject has given consent to the processing of personal data relating to him or her for one or more specific purposes;"

GDPR Article 6(1)(a)

Whatever the reason the data was collected in the first place, it probably wasn't collected to run tests with. You could ask your customers and suppliers for permission, but you would probably get irritated looks instead of data.

It is much easier to copy the data first and then anonymize it. But how to anonymize sensibly also needs some thought. The simplest solution would be to insert XXX everywhere. However, this would mean losing one of the great advantages of the system copy, namely its closeness to the production system. Validity checks become meaningless and clarity is lost.

Much better results are achieved by using "real data". Real addresses, credit card numbers with valid check digits and "real" names.
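What "credit card numbers with valid check digits" can mean in practice is shown in the following minimal Python sketch. It is not how any particular product works; the function names, the tiny name pool and the keyed-hash approach are all assumptions made for illustration. The idea: each original value is replaced by a fake value derived from a secret key, so the same input always yields the same replacement, and the replacement card number still passes the Luhn check that validity tests rely on.

```python
import hashlib
import hmac

# Hypothetical replacement pool; a real setup would use much larger lists.
FAKE_NAMES = ["Alex Morgan", "Kim Baker", "Sam Fischer", "Robin Keller"]


def luhn_check_digit(payload: str) -> str:
    """Return the Luhn check digit for a digit string without its check digit."""
    total = 0
    # Walk from the right; the rightmost payload digit is doubled first.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_is_valid(number: str) -> bool:
    """Check whether the last digit of `number` is its correct Luhn check digit."""
    if not number.isdigit() or len(number) < 2:
        return False
    return luhn_check_digit(number[:-1]) == number[-1]


def _digest(value: str, secret: str) -> bytes:
    # Keyed hash: without the secret, the mapping cannot simply be recomputed.
    return hmac.new(secret.encode("utf-8"), value.encode("utf-8"), hashlib.sha256).digest()


def mask_card_number(number: str, secret: str) -> str:
    """Replace a card number with a fake one of equal length and a valid check digit."""
    if not number.isdigit() or len(number) < 2:
        raise ValueError("expected a string of at least two digits")
    n = len(number) - 1
    # Derive n pseudo-random digits from the keyed hash of the original value.
    payload = str(int.from_bytes(_digest(number, secret), "big")).zfill(n)[-n:]
    return payload + luhn_check_digit(payload)


def mask_name(name: str, secret: str) -> str:
    """Map a real name to a realistic-looking fake name, consistently per input."""
    return FAKE_NAMES[_digest(name, secret)[0] % len(FAKE_NAMES)]
```

Calling mask_card_number("4111111111111111", secret) returns a different 16-digit number that still passes luhn_is_valid, so input validation on the test system behaves as it would in production. Because the mapping is consistent, a customer who appears in several tables receives the same replacement everywhere.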

Now we have a system that is a test system, yet as close to the production system as possible. That is why copy and anonymization go hand in hand.

The perfect combination of system copy and anonymization

At Libelle, we have two software solutions for exactly this case: Libelle SystemCopy (LSC) and Libelle DataMasking (LDM).

With Libelle SystemCopy, you have a trustworthy helper at your side. What was previously a truly Sisyphean task becomes a simple automated routine once LSC has been set up.

For anonymization after the copy, we offer Libelle DataMasking. Fed with real addresses and real-looking names, LDM allows you to develop in an environment that feels like the production system.

LSC and LDM can also be used stand-alone, but the two programs only reach their full potential as a "dream team". Here, LDM can be easily embedded into the LSC process. Every time a successful system copy is performed with LSC, LDM is automatically triggered. This keeps your test landscape as close to reality as possible without compromising on security.
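The ordering described here, masking triggered only after a successful copy, can be sketched in a few lines of Python. This is a generic illustration and not the LSC or LDM interface; refresh_test_system, copy_step and mask_step are hypothetical names, and the two steps are passed in as plain callables that return True on success.

```python
from typing import Callable


def refresh_test_system(copy_step: Callable[[], bool],
                        mask_step: Callable[[], bool]) -> str:
    """Run the copy, then the masking, and report the resulting state."""
    if not copy_step():
        # Nothing new was copied, so there is nothing to mask.
        return "copy_failed"
    if not mask_step():
        # Unmasked production data is on the test system: keep it closed.
        return "locked"
    # Copy and masking both succeeded; the system can be opened to developers.
    return "released"
```

The design choice worth noting is the "locked" state: if masking fails after a successful copy, the test system holds real production data and should stay closed until masking has run through.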

Would you like to learn more about IT topics or news from the Libelle IT Group? You can find more articles on our blog. You can also follow us on LinkedIn.

