Please note: The algorithm descriptions in English have been automatically translated. Errors may have been introduced in this process. For the original descriptions, go to the Dutch version of the Algorithm Register.

Octobox Anonymisation

The algorithm in the software is configured primarily to recognise and anonymise privacy-sensitive information in documents. The basis for this is the AVG (the Dutch name for the GDPR). The tool is also used to highlight and mask information in a document that cannot be shared for other reasons, on a different legal basis such as the Woo.

Last change on 6th of August 2024, at 11:49 (CET) | Publication Standard 1.0
Publication category
Other algorithms
Impact assessment
Field not filled in.
Status
In use

General information

Theme

Organisation and business operations

Begin date

01/2023

Contact information

postbus.cio@minbzk.nl

Link to publication website

https://www.rijksoverheid.nl/ministeries/ministerie-van-binnenlandse-zaken-en-koninkrijksrelaties/contact/woo-verzoek-indienen

Responsible use

Goal and impact

The anonymisation tool is used to balance transparency on the one hand against the necessary protection of the individuals, companies and institutions to which documents relate on the other.


Transparency, because the tool enables the organisation to share information in accordance with regulations such as the Woo, either actively or passively (on request). For citizens whose data appear in documents to be published, use of the tool means that their privacy is not violated and that the organisation complies with the AVG. The same applies to the personal data of the organisation's employees.


The applicant of a Woo request receives the requested information, either in an anonymised version or partially masked on a different basis. For the departments within the organisation that handle Woo requests and/or publish information, the tool helps them comply with laws and regulations. Using the software reduces turnaround time and so helps the organisation provide requested information within the statutory deadlines.


The risk impact of the algorithm is low. This applies to individuals (citizens, employees of the organisations that use the software) as well as to companies and institutions. The algorithm searches specifically for (personal) data and masks or flags them regardless of the rest of the document's content. The software proposes text fragments for anonymisation to a subject-matter employee; no automated decisions are made. In addition, the tool offers the option of manually masking information that cannot be made public for other reasons. For example, a text fragment containing strategic information can be marked in order to protect the organisation itself or a partner organisation (government, company or institution). The basis for anonymising or masking is indicated in the box.

Considerations

Some passages in documents that are made public cannot be shared with the public. The Woo provides grounds on which such passages may be withheld. For non-Woo publications, the AVG is the basis. Without the software, anonymising text fragments in documents would take significantly more time. Using the anonymisation tool speeds up and simplifies the process for both active and passive disclosure. Automated anonymisation is also less error-prone than manual anonymisation. This reduces the risk of a data breach and better protects individuals' data.

Human intervention

Human intervention and control are always the norm when the tool is used. The software works on the basis of a setup document. Through this setup document and various other mechanisms, the organisation can tailor (parameterise) the algorithm to its own situation. The software proposes text fragments for anonymisation to a subject-matter employee. No automated decisions are involved.

The algorithm searches specifically for (personal) data and marks or flags it regardless of the rest of the document's content. The subject-matter employee handles the suggestions, approving those that are correct and correcting those that are not. This work can also be reviewed within the software by a second person. For the citizen, this means that the organisation is demonstrably and proportionately working to eliminate (the risk of) privacy breaches and thus complies with the AVG.

Risk management

To mitigate the risk of documents being insufficiently anonymised, human verification always takes place. This is a full check, in which the reviewer can use the software to verify, modify or supplement the proposals. If no human control took place when anonymising documents, various risks could arise, especially from disclosing or publishing privacy-sensitive data. The tool, used in conjunction with human review, helps prevent the following:


Violation of privacy laws:

The inadvertent disclosure of personal data may violate privacy laws, such as the EU's AVG. This can lead to significant fines and legal penalties.


Identity theft:

Disclosing personally identifiable information (PII) such as names, addresses and social security numbers can lead to identity theft and financial fraud.


Damage to reputation:

Both the reputation of the individuals whose information has been leaked and that of the organisation responsible for the leak can be seriously damaged.


Loss of trust:

The confidence of the public and affected stakeholders in the organisation may decrease, leading to a decline in engagement and support.


Personal damage:

Individuals may suffer emotional and psychological damage if their personal data, such as medical or financial information, is disclosed.


Exploitation and abuse:

Disclosed data can be used for malicious purposes, such as stalking, harassment or discrimination.


Human monitoring helps to mitigate these risks by providing an additional layer of assessment and confirmation that anonymisation processes have been adequately carried out before information is made public.

Legal basis

General Data Protection Regulation (AVG)

Environment Act

General Administrative Law Act (AWB)

Disclosure Act

Open Government Act (WOO)

Electronic Publications Act (WEP)

Links to legal bases

  • Algemene verordening gegevensbescherming (AVG): https://wetten.overheid.nl/BWBR0040940
  • Omgevingswet: https://wetten.overheid.nl/BWBR0037885
  • Algemene Wet Bestuursrecht (AWB): https://wetten.overheid.nl/BWBR0005537
  • Bekendmakingswet: https://wetten.overheid.nl/BWBR0004287
  • Wet Open Overheid (WOO): https://wetten.overheid.nl/BWBR0045754
  • Wet Elektronische Publicaties (WEP): https://wetten.overheid.nl/BWBR0043961

Operations

Data

At the start of use, the organisation and the supplier compiled a setup document containing the organisation's preferences regarding anonymisation. These settings are merged with Octobox's Basic Model, which by default looks for data traceable to individuals, such as Citizen Service Numbers, bank account numbers, phone numbers, e-mail addresses, dates, residential addresses and postal codes.

The organisation may prefer not to redact certain names, for example those of an administrator or director. Staff names may be missing from the Basic Model, but can be added in advance. Another preference the organisation can specify is, for example, the format in which an e-mail address is made unrecognisable.
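The merge of the setup document with the Basic Model can be pictured roughly as follows. This is only an illustrative sketch: the actual format of Octobox's setup document and Basic Model is not described here, so every key name (`categories`, `never_redact`, `always_redact`, `email_mask`), the function `merge_setup` and the sample names are assumptions.

```python
# Hypothetical stand-in for the Basic Model: the default categories of
# traceable data listed in the text. The structure itself is an assumption.
BASIC_MODEL = {
    "categories": [
        "citizen_service_number", "bank_account_number", "phone_number",
        "email_address", "date", "residential_address", "postal_code",
    ],
    "never_redact": [],        # names the organisation wants to keep visible
    "always_redact": [],       # names added in advance, e.g. staff names
    "email_mask": "[e-mail]",  # how a masked e-mail address is rendered
}


def merge_setup(basic_model: dict, setup_document: dict) -> dict:
    """Return a new configuration: defaults plus the organisation's preferences."""
    # Copy the lists so that the defaults are never modified in place.
    merged = {
        key: list(value) if isinstance(value, list) else value
        for key, value in basic_model.items()
    }
    for key, value in setup_document.items():
        if isinstance(value, list):
            # List preferences extend the defaults without duplicates.
            target = merged.setdefault(key, [])
            for item in value:
                if item not in target:
                    target.append(item)
        else:
            # Scalar preferences (such as the mask format) override the default.
            merged[key] = value
    return merged


# Illustrative preferences; the names are invented.
setup_document = {
    "never_redact": ["J. Jansen"],
    "always_redact": ["P. de Vries"],
    "email_mask": "***@***",
}
config = merge_setup(BASIC_MODEL, setup_document)
```

Keeping the defaults and the organisation's preferences in separate structures means the supplier's defaults can change without overwriting what the organisation chose.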

Technical design

Using smart rules, the software searches the text of every document in the submitted file. Certain texts, words or character combinations are recognised as traceable data, such as Citizen Service Numbers, bank account numbers, telephone numbers, e-mail addresses, dates, residential addresses and postal codes.
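How rule-based recognition of this kind can work is sketched below for three of the listed data types. The supplier's actual rules are not published in this description, so the patterns, the function names and the sample text are assumptions. The check on Citizen Service Numbers uses the standard "11-test" (elfproef) for Dutch BSNs.

```python
import re

# Simplified, assumed patterns; real rules would cover more formats.
PATTERNS = {
    "citizen_service_number": re.compile(r"\b\d{9}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "postal_code": re.compile(r"\b[1-9][0-9]{3}\s?[A-Z]{2}\b"),
}


def passes_eleven_test(number: str) -> bool:
    """11-test for a nine-digit BSN: weights 9..2 and -1, sum divisible by 11."""
    weights = [9, 8, 7, 6, 5, 4, 3, 2, -1]
    total = sum(int(digit) * weight for digit, weight in zip(number, weights))
    return total % 11 == 0


def find_candidates(text: str) -> list:
    """Return (category, start, end, matched_text) tuples in reading order."""
    found = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            if category == "citizen_service_number" and not passes_eleven_test(match.group()):
                continue  # nine digits, but not a valid BSN
            found.append((category, match.start(), match.end(), match.group()))
    return sorted(found, key=lambda item: item[1])


sample = "BSN 111222333, jan@example.nl, 2511 DP Den Haag. Ref 123456789."
candidates = find_candidates(sample)
```

In this sample the reference number `123456789` is nine digits long but fails the 11-test, so it is not proposed as a Citizen Service Number.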

The software can be configured with the degree of certainty required for a rule to count as met. On screen, the employee sees which text fragments should be redacted with certainty and which match the smart rule to a lesser extent. A user-customisable list of words that should never or always be redacted further refines these redaction proposals. Through the screen, the employee can approve or reject the proposals, or modify them before approval. Employees can also mark text themselves for redaction and add a basis.

A second employee can check the work of the first. Once all pages of all documents in the file have been reviewed, the final version is created in a format suitable for publication.
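The interplay of certainty levels, word lists and employee decisions can be illustrated with a small sketch. The score scale, the thresholds 0.9 and 0.5, the class `Proposal` and the functions `triage`, `approve` and `reject` are all hypothetical; the description does not say how the software represents these internally.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    text: str
    category: str
    score: float              # assumed certainty of the rule match, 0.0 to 1.0
    status: str = "proposed"  # proposed, approved or rejected
    basis: str = ""           # legal basis recorded by the employee


def triage(proposals, never_redact=(), always_redact=(),
           certain_from=0.9, possible_from=0.5):
    """Label each proposal 'certain' or 'possible'; drop the rest."""
    labelled = []
    for proposal in proposals:
        if proposal.text in never_redact:
            continue  # the organisation chose to keep this text visible
        if proposal.text in always_redact or proposal.score >= certain_from:
            labelled.append(("certain", proposal))
        elif proposal.score >= possible_from:
            labelled.append(("possible", proposal))
    return labelled


def approve(proposal: Proposal, basis: str) -> None:
    """The employee accepts a proposal and records the basis for it."""
    proposal.status = "approved"
    proposal.basis = basis


def reject(proposal: Proposal) -> None:
    """The employee decides the fragment should stay visible."""
    proposal.status = "rejected"


proposals = [
    Proposal("111222333", "citizen_service_number", 0.99),
    Proposal("J. Jansen", "name", 0.70),
    Proposal("12-03-2024", "date", 0.60),
    Proposal("Den Haag", "residential_address", 0.30),
    Proposal("P. de Vries", "name", 0.60),
]
labelled = triage(proposals, never_redact={"J. Jansen"}, always_redact={"P. de Vries"})

# Nothing is redacted until an employee has decided on each proposal.
approve(labelled[0][1], basis="AVG")
reject(labelled[1][1])
```

The sketch only prepares proposals; the decision itself is always a separate, explicit step taken by a person.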
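The condition for creating the final version can be expressed as a simple check. This is a sketch under assumptions: the class `Page`, the function `ready_for_publication` and the flag `require_second_check` are invented for illustration, and the description does not state whether a second check is mandatory.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    number: int
    reviewed_by: list = field(default_factory=list)  # names of reviewers


def ready_for_publication(pages, require_second_check=False) -> bool:
    """True once every page has been reviewed by enough distinct people."""
    needed = 2 if require_second_check else 1
    return all(len(set(page.reviewed_by)) >= needed for page in pages)


pages = [Page(1, ["anna"]), Page(2)]
before = ready_for_publication(pages)  # page 2 has not been reviewed yet

pages[1].reviewed_by.append("anna")
after_first = ready_for_publication(pages)
after_first_strict = ready_for_publication(pages, require_second_check=True)

for page in pages:
    page.reviewed_by.append("bram")  # a second employee checks every page
after_second = ready_for_publication(pages, require_second_check=True)
```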

External provider

Octobox Netherlands B.V.
