Please note: The algorithm descriptions in English have been automatically translated. Errors may have been introduced in this process. For the original descriptions, go to the Dutch version of the Algorithm Register.

Octobox Anonymisation

Octobox Anonymisation is used to make information, such as personal data, unreadable (redaction). This is mainly done when handling requests under the Open Government Act (Woo). The algorithm is based on rules derived from the General Data Protection Regulation (AVG). In addition, the algorithm has self-learning properties based on human input.

Last change on 11th of September 2024, at 15:14 (CET) | Publication Standard 1.0
Publication category
Other algorithms
Impact assessment
DPIA, ...
Status
In use

General information

Theme

Organisation and business operations

Begin date

12-2023

Contact information

https://www.rijksoverheid.nl/ministeries/ministerie-van-algemene-zaken/contact

Responsible use

Goal and impact

Octobox Anonymisation helps the ministry process information requests under the Woo faster and thus meet statutory processing deadlines. Octobox suggests personal data and/or recurring passages to be redacted. The impact of using this algorithm is low: redaction suggestions are adopted only after human verification. No decisions are made automatically.

Considerations

Octobox Anonymisation speeds up and simplifies an existing process (active and passive disclosure) that was previously entirely manual. It also increases the quality of the process because the manual method was more prone to error. The risk of a data leak is reduced and citizens' and companies' data are better protected.

Human intervention

All passages that Octobox Anonymisation proposes to redact are approved, modified or rejected by an employee. No decisions are made automatically.
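The review step described above can be illustrated with a minimal sketch. It is not Octobox's implementation: the `Decision` and `Suggestion` types and the `redact` function are hypothetical names chosen for this example. The sketch assumes a suggestion is a character range, that a reviewer who modifies a suggestion adjusts the range before approving it, and that nothing is redacted while any suggestion still lacks a human decision.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PENDING = "pending"    # no human decision yet
    APPROVED = "approved"  # reviewer accepted the (possibly modified) range
    REJECTED = "rejected"  # reviewer discarded the suggestion


@dataclass
class Suggestion:
    """Hypothetical redaction suggestion: a character range plus a decision."""
    start: int
    end: int
    decision: Decision = Decision.PENDING


def redact(text: str, suggestions: list[Suggestion], mask: str = "█") -> str:
    """Mask only the ranges that a human reviewer has approved."""
    # Refuse to produce output while any suggestion is still unreviewed.
    if any(s.decision is Decision.PENDING for s in suggestions):
        raise ValueError("every suggestion needs a human decision first")
    chars = list(text)
    for s in suggestions:
        if s.decision is Decision.APPROVED:
            # Replace each character in the range, keeping the text length.
            chars[s.start:s.end] = mask * (s.end - s.start)
    return "".join(chars)


text = "Contact: Jan Jansen, room 12"
suggestions = [Suggestion(9, 19), Suggestion(26, 28)]
suggestions[0].decision = Decision.APPROVED  # the name is redacted
suggestions[1].decision = Decision.REJECTED  # the room number stays visible
print(redact(text, suggestions))
```

Raising an error on pending suggestions is one way to make "no automatic decisions" a property of the code rather than of the procedure alone.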

Risk management

Deployment of the algorithm poses no additional risks. Its use speeds up and simplifies an existing process and increases the quality of the redacted documents. The outcome of the process is and remains the responsibility of ministry staff.

Legal basis

General Data Protection Regulation (AVG)

General Administrative Law Act (AWB)

Disclosure Act (Bekendmakingswet)

Open Government Act (Woo)

Electronic Publications Act (WEP)

Links to legal bases

  • Algemene verordening gegevensbescherming (AVG): https://wetten.overheid.nl/BWBR0040940
  • Algemene wet bestuursrecht (AWB): https://wetten.overheid.nl/BWBR0005537
  • Bekendmakingswet: https://wetten.overheid.nl/BWBR0004287
  • Wet open overheid (WOO): https://wetten.overheid.nl/BWBR0045754
  • Wet elektronische publicaties (WEP): https://wetten.overheid.nl/BWBR0043961

Elaboration on impact assessments

An impact analysis was carried out, based on the Ministry of the Interior and Kingdom Relations' Implementation Framework 'Responsible Use of Algorithms'.

From this impact analysis, the ministry concludes that this is not a high-risk algorithm and that its use has no significant effect on data subjects (including people who submit Woo requests). The reason for publishing the algorithm is that the handling of Woo requests in general, and the redaction of passages in particular, are regularly the subject of public debate.

Impact assessment

  • Impact- en maatregelenanalyse Algoritmen
  • Pre-scan DPIA

Operations

Data

At its core, the algorithm is trained on public documents and/or articles, specifically to recognise entities and names in different types of documents and formats.

Technical design

Octobox Anonymisation is built on spaCy, an open source library for Natural Language Processing (NLP), a field that combines linguistics and artificial intelligence. Among other things, texts can be analysed by recognising, for example, what the subject of a sentence is or which word is a verb. Within the model, Named Entity Recognition (NER) is used to recognise names, for example. The open source YOLO (You Only Look Once) object detection model enables signature recognition. Furthermore, techniques such as Optical Character Recognition (OCR) are applied to detect sensitive data that follows recognisable formats (such as phone numbers, BSN numbers, IBAN numbers, postal codes or e-mail addresses). The output of the algorithm consists of suggestions for passages to be redacted in documents. The algorithm learns from human corrections and additions.
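To illustrate the detection of data in recognisable formats, here is a minimal sketch that uses only Python's standard library. It is not Octobox's implementation, and the patterns the product actually uses are not published. The regular expressions, the `Suggestion` type and the `suggest_redactions` function are assumptions made for this example, and the patterns are deliberately simplified (Dutch formats only). The BSN check uses the publicly known "eleven test" (elfproef) to filter out nine-digit numbers that cannot be a BSN.

```python
import re
from dataclasses import dataclass

# Simplified, illustrative patterns for Dutch formats (assumptions, not Octobox's).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "IBAN": re.compile(r"\bNL\d{2}[A-Z]{4}\d{10}\b"),
    "POSTAL_CODE": re.compile(r"\b[1-9]\d{3} ?[A-Z]{2}\b"),
    # Ten digits starting with 0, with an optional separator after the prefix.
    "PHONE": re.compile(r"\b0(?:\d[- ]?\d{8}|\d{2}[- ]?\d{7}|\d{3}[- ]?\d{6})\b"),
    "BSN": re.compile(r"\b\d{9}\b"),
}


@dataclass
class Suggestion:
    """Hypothetical redaction suggestion for one matched span."""
    label: str
    start: int
    end: int
    text: str


def is_valid_bsn(digits: str) -> bool:
    """Eleven test: the weighted digit sum must be divisible by 11."""
    if len(digits) != 9 or not digits.isdigit():
        return False
    weights = (9, 8, 7, 6, 5, 4, 3, 2, -1)
    return sum(int(d) * w for d, w in zip(digits, weights)) % 11 == 0


def suggest_redactions(text: str) -> list[Suggestion]:
    """Return suggestions sorted by position; a human still has to review them."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            if label == "BSN" and not is_valid_bsn(match.group()):
                continue  # nine digits, but fails the eleven test
            found.append(Suggestion(label, match.start(), match.end(), match.group()))
    return sorted(found, key=lambda s: s.start)


sample = (
    "Contact j.jansen@example.nl or call 06-12345678. "
    "BSN 111222333, IBAN NL91ABNA0417164300, postal code 1234 AB. "
    "File 123456789."
)
for s in suggest_redactions(sample):
    print(s.label, s.text)
```

In this sample, the file number 123456789 is not suggested because it fails the eleven test. Names and signatures do not follow a fixed format, which is why the description above relies on NER and object detection for those rather than on patterns.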

External provider

Octobox Netherlands B.V.
