Please note: The algorithm descriptions in English have been automatically translated. Errors may have been introduced in this process. For the original descriptions, go to the Dutch version of the Algorithm Register.

Anonymising documents

The algorithm identifies personal data and pre-entered words in documents. An employee must go through the document, check whether each alert is justified, and approve or reject it. The employee can also add further markings manually. After approval by the employee, all approved alerts and markings are blacked out.

Last change on 14th of October 2024, at 10:02 (CET) | Publication Standard 1.0
Publication category
Other algorithms
Impact assessment
Field not filled in.
Status
In use

General information

Theme

Field not filled in.

Begin date

2023-01

Contact information

info@peelenmaas.nl

Responsible use

Goal and impact

The aim is to anonymise privacy-sensitive information in documents published by the municipality.

In this way, we protect the privacy of citizens and organisations and prevent (possible) data leaks.

Considerations

The municipality wants to make information public. In doing so, privacy or business-sensitive information must be protected.

The advantage of anonymisation software is faster anonymisation. A possible disadvantage is that employees rely too heavily on the outcome of the algorithm and check it less closely.

Human intervention

The outcome of the algorithm is checked by an employee. The software requires the employee to check all pages. The employee determines whether the document is correctly anonymised.
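The requirement that every page is checked before a document can be completed could be enforced along the following lines. This is only an illustrative sketch: the class `ReviewSession` and its methods are hypothetical names and do not describe the supplier's actual software.

```python
class ReviewSession:
    """Tracks which pages an employee has checked (hypothetical helper)."""

    def __init__(self, page_count):
        self.page_count = page_count
        self.checked = set()

    def mark_checked(self, page):
        # Pages are numbered from 1, as a reviewer would see them.
        if not 1 <= page <= self.page_count:
            raise ValueError(f"page {page} is outside 1..{self.page_count}")
        self.checked.add(page)

    def unchecked_pages(self):
        return [p for p in range(1, self.page_count + 1) if p not in self.checked]

    def finalise(self):
        # Refuse to complete the document while any page is still unchecked.
        missing = self.unchecked_pages()
        if missing:
            raise RuntimeError(f"pages not yet checked: {missing}")
        return True


session = ReviewSession(page_count=3)
session.mark_checked(1)
print(session.unchecked_pages())
```

In this sketch, `finalise()` only succeeds once `mark_checked` has been called for every page, which mirrors the rule that the employee must look at all pages.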

Risk management

  • The municipal employee always performs the final check of whether a document is correctly anonymised. There is a risk that employees do not check properly; we mitigate this by drawing attention to the importance of carefully checking the personal data found by the algorithm.
  • DataMask is a SaaS (Software as a Service) solution. A copy of the document, without metadata, is uploaded to the supplier's environment for processing. Immediately after processing, the data and the data processing are deleted. If the copy is not processed immediately, it is kept on the supplier's (Dutch) server for up to 30 days.
  • The supplier is ISO 27001 certified.

Legal basis

Anonymisation is important because it helps protect the privacy of individuals and ensures that sensitive information is not inadvertently disclosed. The legal basis for anonymising data in the Netherlands is mainly laid down in the General Data Protection Regulation (GDPR; in Dutch, AVG).

Links to legal bases

AVG: https://wetten.overheid.nl/BWBR0040940/2021-07-01/0

Link to Processing Index

Privacy statement (Privacyverklaring) - Gemeente Peel en Maas

Operations

Data

All information found in the uploaded documents (except metadata) is processed by the algorithm. This may include ordinary personal data, special categories of personal data and data relating to criminal offences. It may also include business-sensitive information.

Technical design

Documents are uploaded to the application. At that point, a copy is created as a PDF with a text layer, and the metadata of the original document is removed from the copy. This copy is stored on the supplier's (Dutch) server and remains there for a maximum of 30 days. The text layer of the PDF is passed to the machine learning algorithm through an API.

This is a Natural Language Processing algorithm (named entity recognition) from Microsoft Azure. The API returns the locations in the analysed text where personal data is likely to occur, along with a probability score (a percentage) for each location. The vendor uses the probability score along with proprietary AI models to make personal data recognition as accurate as possible.
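As a rough illustration of how locations and probability scores could be turned into alerts for the employee, consider the sketch below. The `Detection` fields, the category labels, the `min_score` threshold and the function `select_alerts` are all assumptions made for this example; the actual response format of the Azure service and the vendor's own models are not described in this entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    start: int      # character offset in the text layer (assumed field)
    length: int     # number of characters covered (assumed field)
    category: str   # illustrative label such as "Person" or "Email"
    score: float    # probability score, here expressed between 0.0 and 1.0


def select_alerts(detections, min_score=0.5):
    """Keep detections at or above min_score, ordered by position in the text."""
    kept = [d for d in detections if d.score >= min_score]
    return sorted(kept, key=lambda d: d.start)


text = "Contact Jan Jansen via jan@example.org."
detections = [
    Detection(start=23, length=15, category="Email", score=0.98),
    Detection(start=8, length=10, category="Person", score=0.91),
    Detection(start=0, length=7, category="Person", score=0.12),
]

for alert in select_alerts(detections):
    fragment = text[alert.start:alert.start + alert.length]
    print(fragment, alert.category, alert.score)
```

The threshold of 0.5 is arbitrary. A lower threshold would show the employee more alerts to reject, while a higher one would increase the chance that personal data is missed and has to be marked manually.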

Finally, an employee checks the document. When the employee completes the document, the data to be anonymised is permanently removed from the text layer and a black bar is placed over it.
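The final step, in which approved passages are removed from the text rather than merely covered, can be illustrated with a simplified sketch on a plain string. The function `redact` and the `(start, end)` span format are hypothetical; the real application works on the text layer of a PDF and also draws a black bar on the page, which this example does not attempt.

```python
def redact(text, approved_spans, mask="█"):
    """Replace each approved (start, end) span with mask characters.

    The original characters are dropped from the result, so they cannot be
    recovered from the returned string. Overlapping spans are merged.
    """
    pieces = []
    cursor = 0
    for start, end in sorted(approved_spans):
        start = max(start, cursor)   # clip spans that overlap an earlier one
        if end <= start:
            continue                 # span already fully covered
        pieces.append(text[cursor:start])
        pieces.append(mask * (end - start))
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "Contact Jan Jansen via jan@example.org."
print(redact(text, [(8, 18), (23, 38)]))
```

Replacing the characters, rather than overlaying a shape on top of them, is what keeps the hidden text from being recovered by copying and pasting from the published file.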

External provider

Xxllnc Anonymise (previously known as DataMask)

Similar algorithm descriptions

  • The algorithm underlines personal data in documents. An employee has to look at all pages and check whether the document is properly anonymised. Then the software removes all highlighted information and blacks it out. After that, the documents can be published, for example under the Open Government Act (WOO).

    Last change on 31st of October 2024, at 15:08 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    DPIA, ...
    Status
    In development
  • The algorithm underlines personal data in documents. An employee has to look at all pages and check whether the document is properly anonymised. Then the software removes all highlighted information and blacks it out. After that, the documents can be published, for example under the Open Government Act (WOO).

    Last change on 31st of October 2024, at 9:40 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    DPIA, ...
    Status
    In use
  • The algorithm underlines personal data in documents. An employee has to look at all pages and check whether the document is properly anonymised. Then the software removes all highlighted information and blacks it out. After that, the documents can be published, for example under the Open Government Act (WOO).

    Last change on 12th of February 2025, at 9:26 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    DPIA, ...
    Status
    In use
  • The algorithm underlines personal data in documents. An employee has to look at all pages and check whether the document is properly anonymised. Then the software removes all highlighted information and blacks it out. After that, the documents can be published, for example under the Open Government Act (WOO).

    Last change on 12th of November 2024, at 7:25 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    DPIA, ...
    Status
    In use
  • The algorithm underlines personal data in documents. An employee has to look at all pages and check whether the document is properly anonymised. Then the software removes all highlighted information and blacks it out. After that, the documents can be published, for example under the Open Government Act (WOO).

    Last change on 27th of January 2025, at 10:18 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    DPIA, ...
    Status
    In use