Please note: The algorithm descriptions in English have been automatically translated. Errors may have been introduced in this process. For the original descriptions, go to the Dutch version of the Algorithm Register.

Text analysis

Based on language technology, personal and company names are read and filtered out of text files such as emails and individual documents.

Last change on 1st of September 2025, at 14:09 (CET) | Publication Standard 1.0
Publication category
Other algorithms
Impact assessment
DPIA
Status
In use

General information

Theme

Organisation and business operations

Begin date

2025-01

Contact information

https://mijn.hollandskroon.nl/contactformulier/

Responsible use

Goal and impact

Supports the review process for information that is to be disclosed and to which legal protection applies. This concerns protection under the AVG (GDPR) for personal data and under the Woo (Open Government Act), especially for confidential company information, where grounds for exception are named.

Considerations

Manual review is labour-intensive and error-prone. The entity extraction algorithm produces a suggestion list intended to capture all conceivable mentions of individuals in the text.

Human intervention

Within the software, a list of suggested terms is built and offered to the user, who selects which terms to include in the automatic redaction process. The choice to accept a suggested term as a personal name and not to disclose it is up to the user.
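
The selection step described above can be illustrated with a minimal sketch. This is not the supplier's implementation: the function name `redact_approved`, the `[REDACTED]` placeholder and the sample data are all hypothetical. It only shows the principle that a suggested term is hidden only after the user has approved it, while every other suggestion stays disclosed.

```python
import re

def redact_approved(text, suggestions, approved, placeholder="[REDACTED]"):
    """Replace only those suggested terms that the reviewer approved."""
    # Longest terms first, so a full name is handled before any shorter
    # term that happens to be contained in it.
    to_redact = sorted((t for t in suggestions if t in approved),
                       key=len, reverse=True)
    for term in to_redact:
        # Lookarounds keep the match from starting or ending inside a word.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        text = re.sub(pattern, placeholder, text)
    return text

# Hypothetical suggestion list and reviewer decision.
suggestions = ["Jan de Vries", "Acme B.V.", "Maandag"]
approved = {"Jan de Vries"}

print(redact_approved("Jan de Vries mailed Acme B.V. on Maandag.",
                      suggestions, approved))
# prints: [REDACTED] mailed Acme B.V. on Maandag.
```

In this sketch the suggestion "Maandag" (a weekday, wrongly proposed as a name) is left untouched because the user did not approve it.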

Risk management

There is no risk of automated decision-making and the algorithm has no impact on fundamental rights, because the algorithm does not make decisions with legal consequences. It only makes a proposal for anonymising personal data. The employee of the administrative body always performs the final check on whether a document has been correctly anonymised.

Legal basis

Legislation around public access to government data (Woo)

Links to legal bases

Open Government Act: https://wetten.overheid.nl/BWBR0045754/2023-04-01#Hoofdstuk5

Impact assessment

Data Protection Impact Assessment (DPIA)

Operations

Data

This refers to documents and message data within central government, including email, files, WhatsApp messages and other media in which administrative decision-making can be found.

Links to data sources

General Office applications: This concerns standard Office formats, including email and social media formats.

Technical design

Names in texts are recognised using Named Entity Recognition (NER). A process within Insights then extracts the names for further processing in the management interface and the automatic redaction rules.
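
How extracted names might be collected into a suggestion list can be sketched as follows. The register entry does not describe the actual model, so this example substitutes a simple rule-based pattern (runs of two or more capitalised words) for a trained NER model; `NAME_PATTERN`, `suggest_names` and the sample texts are hypothetical and say nothing about how Insights works internally.

```python
import re
from collections import Counter

# Assumption: a rule-based stand-in for a trained NER model.
# It matches runs of two or more capitalised words, e.g. "Anna Jansen".
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")

def suggest_names(documents):
    """Return candidate names with their frequency, most frequent first."""
    counts = Counter()
    for doc in documents:
        counts.update(NAME_PATTERN.findall(doc))
    return counts.most_common()

# Hypothetical sample texts.
docs = [
    "The report was sent to Anna Jansen.",
    "Anna Jansen met Piet Bakker yesterday.",
]

for name, count in suggest_names(docs):
    print(name, count)
# prints:
# Anna Jansen 2
# Piet Bakker 1
```

A pattern like this produces false positives (for example, a capitalised word at the start of a sentence followed by a name), which is one reason the resulting list is only a suggestion for the user to review.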

External provider

ZyLAB eDiscovery & Compliance Services B.V.

Similar algorithm descriptions

  • Based on language technology, personal and company names are read and filtered out of text files such as emails and individual documents.

    Last change on 14th of October 2024, at 10:47 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    Field not filled in.
    Status
    In use
  • Based on language technology, personal and company names are read and filtered out of text files such as emails and individual documents.

    Last change on 22nd of April 2025, at 11:33 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    Field not filled in.
    Status
    In use
  • Based on language technology, personal and company names are read and filtered out of text files such as emails and individual documents.

    Last change on 5th of February 2025, at 9:14 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    DPIA
    Status
    In use
  • Based on language technology, personal and company names are read and filtered out of text files such as emails and individual documents.

    Last change on 20th of March 2025, at 14:50 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    Field not filled in.
    Status
    In use
  • An application is used to help our organisation digitise and manage documents. This involves converting scanned documents from an image to text. Metadata is automatically read from the text to give the document information for the route to follow in the internal process.

    Last change on 16th of May 2025, at 8:36 (CET) | Publication Standard 1.0
    Publication category
    Other algorithms
    Impact assessment
    Field not filled in.
    Status
    In use