Please note: The algorithm descriptions in English have been automatically translated. Errors may have been introduced in this process. For the original descriptions, go to the Dutch version of the Algorithm Register.

Anonymisation software

The algorithm shows which (personal) data in documents an organisation should redact, so that personal or confidential data can no longer be seen in those documents.

Last change on 18th of November 2025, at 11:25 (CET) | Publication Standard 1.0
Publication category
Other algorithms
Impact assessment
DPIA
Status
In use

General information

Theme

Organisation and business operations

Begin date

2024-06

Contact information

infoboxstaftaken@autoriteitpersoonsgegevens.nl

Responsible use

Goal and impact

The anonymisation software will be used to anonymise documents published by the AP faster and better. In this way, we prevent data breaches and contribute to better protection of data subjects' rights under the GDPR (AVG).

Considerations

Anonymisation is an effective way to protect personal data and reduce the risks of processing it. The AP often needs to anonymise many, often lengthy, documents. Fully anonymising these documents by hand takes a lot of time and work. Moreover, manual work carries the risk that documents are anonymised inconsistently or incompletely. Using this algorithm largely removes that risk and allows the AP to anonymise (personal) data in documents more consistently and efficiently.

Human intervention

A staff member of the AP always assesses the suggestions of the algorithm before the AP finally anonymises (strips away) (personal) data in the documents.

Risk management

A technical risk is that the algorithm makes incorrect suggestions for anonymising certain (personal) data, causing the AP to anonymise either too little or too much. The AP would then either disseminate privacy-sensitive information or provide too little information. To prevent this, an AP employee manually checks the algorithm's suggestions.
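The human-in-the-loop safeguard described above can be sketched as follows. This is a minimal illustration, not the actual xxllnc Anonymise implementation: the `Suggestion` type and the `[REDACTED]` placeholder are assumptions made for the example. The key property shown is that only suggestions explicitly accepted by a reviewer are ever applied to the document.

```python
from dataclasses import dataclass

@dataclass
class Suggestion:
    text: str    # the span flagged as (personal) data
    start: int   # character offset in the document
    end: int     # end offset (exclusive)
    score: float # the algorithm's probability score (0..1)

def apply_reviewed_redactions(document: str, suggestions, accepted) -> str:
    """Redact only the suggestions a staff member has accepted.

    `accepted` is the set of suggestion indices approved during review;
    rejected suggestions are left untouched, so nothing is anonymised
    without human sign-off.
    """
    result = document
    # Apply from the end of the document backwards so that earlier
    # character offsets remain valid after each replacement.
    for i, s in sorted(enumerate(suggestions), key=lambda p: -p[1].start):
        if i in accepted:
            result = result[:s.start] + "[REDACTED]" + result[s.end:]
    return result
```

For example, a reviewer might accept the name suggestion but reject a date that must remain visible in the published document.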

Legal basis

General Data Protection Regulation (GDPR): Anonymisation is an effective way to protect personal data and reduce the risks of processing it. When data is properly anonymised, it no longer falls under the scope of the GDPR, because the data is then no longer traceable to a natural person. Anonymisation is therefore often applied as a security measure to comply with the GDPR principles. Although the GDPR does not contain a specific article on anonymisation, recital 26 does refer to "the anonymisation of personal data" as a way to reduce risks to data subjects (the people whose personal data are processed) to an acceptable level. Anonymisation is thus seen as an important technique to effectively protect personal data in line with the GDPR.


Open Government Act (Woo): Under Article 5.1(2)(e) of the Woo, government organisations do not have to disclose information if the importance of doing so does not outweigh the importance of "respecting personal privacy". Thus, if government information contains personal data, such data need not be disclosed. Making personal data anonymous is then a logical step to still be able to disclose (part of) the relevant government information.

Links to legal bases

  • AVG: https://eur-lex.europa.eu/legal-content/NL/TXT/HTML/?uri=CELEX:32016R0679&qid=1685451198313
  • Woo: https://wetten.overheid.nl/BWBR0045754/2023-04-01

Impact assessment

Data Protection Impact Assessment (DPIA): The AP has carried out a DPIA on the anonymisation tool.

Operations

Data

The type of data used by the algorithm depends on the documents (to be anonymised). It usually involves personal data and business-sensitive information, sometimes special personal data. Examples include: names, addresses, dates of birth, signatures and e-mail addresses.

Technical design

Algorithms - Algorithms are sets of instructions, programmed in a computer language, that make automated decisions autonomously or with human involvement.


This is what the xxllnc Anonymise algorithms do: (1) they process user-supplied documents containing (personal) data, (2) within the xxllnc Anonymise editor environment, (3) they decide whether to suggest to users that this data should be anonymised, (4) after which, once the suggestions are finalised into a final document, the decisions may affect the people to whom the data belongs.

Artificial intelligence (AI) - The literature distinguishes two categories of AI: human or rational thinking (machines capable of making decisions, solving problems and learning) and human or rational acting (machines that can perform activities requiring intelligence).

xxllnc Anonymise applies a specific type of AI, Natural Language Processing: the processing of written language. It uses the Named Entity Recognition technique to generate suggestions for the entity "name", taking into account the entire context of the written language within a document. The intelligence of the solution is (further) trained on datasets in a supervised machine learning environment (pre-labelled input and output) of xxllnc Anonymise. The datasets for further training never include the documents provided by users of xxllnc Anonymise, unless all those involved give prior, explicit and separate consent.
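To illustrate the shape of Named Entity Recognition output, the toy sketch below emits "name" suggestions with a probability score. This is not the trained xxllnc Anonymise model: the capitalisation pattern, the gazetteer of first names, and the scores are all assumptions made purely for illustration. A real NER model derives its scores from the surrounding context rather than from a fixed word list.

```python
import re

# Toy stand-in for a trained model: a tiny gazetteer of first names.
# (Assumed for illustration only; the real model is context-based.)
KNOWN_FIRST_NAMES = {"jan", "piet", "maria"}

def toy_ner(text: str):
    """Return (entity_type, matched_text, start, end, score) tuples.

    Candidate names are runs of two or more capitalised words; the
    score is higher when the first word is in the gazetteer.
    """
    suggestions = []
    for m in re.finditer(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b", text):
        first = m.group().split()[0].lower()
        score = 0.95 if first in KNOWN_FIRST_NAMES else 0.60
        suggestions.append(("name", m.group(), m.start(), m.end(), score))
    return suggestions
```

Each tuple corresponds to one suggestion that a reviewer can accept or reject before the document is finalised.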

Details of processing

The text layer of documents is offered via an API call to the AI / Machine Learning, Text Analytics module of xxllnc Anonymise, part of which is hosted at Microsoft Azure. In both cases, the data resides within the EEA.

Text sent by the xxllnc Anonymise API in synchronous or asynchronous calls to the cloud is not stored by the hosting provider. xxllnc Anonymise has deliberately disabled this functionality by default, so that even temporary storage of text input is prevented. To this end, xxllnc Anonymise (as part of Privacy by Default) sets the available LoggingOptOut query parameter accordingly. As a result, the Text Analytics API only indicates which data in the texts has been analysed as the entity "name" and sends this, together with a probability score (a percentage), via an API signal to the xxllnc Anonymise client server, after which the input data in the cloud is automatically deleted and destroyed.
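The logging opt-out described above can be sketched as part of the request URL. The endpoint path below follows Azure's public Text Analytics v3.1 REST conventions; the exact route and version used by xxllnc Anonymise are assumptions of this sketch. The point illustrated is that the opt-out is a query parameter set on every call, not an after-the-fact deletion request.

```python
from urllib.parse import urlencode

def build_analyze_url(endpoint: str) -> str:
    """Build an NER call with service-side logging disabled.

    Setting loggingOptOut=true asks the hosting provider not to retain
    the submitted text input, even temporarily (Privacy by Default).
    The path is modelled on the Azure Text Analytics v3.1 REST API and
    is an assumption, not the confirmed xxllnc Anonymise route.
    """
    params = {"loggingOptOut": "true"}
    return (f"{endpoint}/text/analytics/v3.1"
            f"/entities/recognition/general?{urlencode(params)}")
```

A caller would POST the document's text layer to this URL and receive only the recognised "name" entities with their probability scores in response.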

Securing the process

xxllnc and Microsoft Azure use the ISO 27001 standards system and have implemented all the required measures and documented them in an ISMS. The Text Analytics service falls within the scope of Microsoft's certification. Processing takes place within the EEA (Western Europe); further processing (for other purposes) is contractually excluded.

External provider

xxllnc

Similar algorithm descriptions

  • Among other things, the algorithm recognises and anonymises (personal) data and confidential financial data in documents before they are published or shared.

    Last change on 24th of June 2024, at 11:30 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    DEDA, DPIA
    Status
    In development
  • Among other things, the algorithm recognises and anonymises (personal) data and confidential financial data in documents before they are published.

    Last change on 14th of June 2024, at 7:27 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    Field not filled in.
    Status
    In use
  • Among other things, the algorithm recognises and anonymises (personal) data and confidential financial data in documents before they are published.

    Last change on 11th of June 2024, at 10:56 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    Field not filled in.
    Status
    In use
  • Among other things, the algorithm recognises and anonymises (personal) data, confidential financial data and other privacy-sensitive information in documents before they are published or shared.

    Last change on 7th of November 2024, at 10:08 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    DPIA
    Status
    In use
  • Among other things, the algorithm recognises and anonymises (personal) data and confidential financial data in documents before they are published.

    Last change on 8th of July 2025, at 13:26 (CET) | Publication Standard 1.0
    Publication category
    High-Risk AI-system
    Impact assessment
    Field not filled in.
    Status
    In development