Please note: The algorithm descriptions in English have been automatically translated. Errors may have been introduced in this process. For the original descriptions, go to the Dutch version of the Algorithm Register.
Anonymise
- Publication category
- Other algorithms
- Impact assessment
- Field not filled in.
- Status
- In use
General information
Theme
Begin date
Contact information
Link to publication website
Responsible use
Goal and impact
The purpose of Octobox Anonymise is to help the province of North Brabant redact privacy-sensitive information in documents securely and efficiently. Octobox suggests which information should be redacted. Employees can accept or reject each suggestion. No decisions are made automatically, so the impact is low.
Considerations
Using the anonymisation tool speeds up and simplifies the process for both passive and active disclosure. Automated anonymisation is also less error-prone than fully manual redaction. This reduces the risk of a data leak and better protects the data of citizens and companies. Octobox Anonymise automates this process by recognising information that must be protected. The deployment of Octobox is justified because trained employees must always approve, modify or reject Octobox's suggestions.
Human intervention
Octobox's software works on the basis of a configuration document set up by the province. With this configuration document, the province determines which categories of information the software redacts in draft. Examples include people's names, BSN numbers and signatures. The employee performing the redaction stays in control by approving, modifying or rejecting each proposal.
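The review flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Octobox's implementation: the names `ENABLED_CATEGORIES`, `Suggestion`, `Decision`, `filter_suggestions`, `apply_decision` and `redact` are hypothetical, and the set of categories stands in for the province's configuration (setup) document.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


class Decision(Enum):
    APPROVE = "approve"
    MODIFY = "modify"
    REJECT = "reject"


@dataclass
class Suggestion:
    category: str  # e.g. "person_name", "bsn", "signature"
    start: int     # offset where the suggested span begins
    end: int       # offset just past the suggested span


# Hypothetical stand-in for the province's configuration document:
# only these categories are proposed for redaction.
ENABLED_CATEGORIES = {"person_name", "bsn", "signature"}


def filter_suggestions(suggestions: List[Suggestion]) -> List[Suggestion]:
    """Keep only suggestions whose category is enabled in the configuration."""
    return [s for s in suggestions if s.category in ENABLED_CATEGORIES]


def apply_decision(suggestion: Suggestion, decision: Decision,
                   new_span: Optional[Span] = None) -> Optional[Span]:
    """Return the span to redact, or None if the employee rejects it."""
    if decision is Decision.REJECT:
        return None
    if decision is Decision.MODIFY:
        if new_span is None:
            raise ValueError("a modified suggestion needs a new span")
        return new_span
    return (suggestion.start, suggestion.end)


def redact(text: str, spans: List[Span], mask: str = "█") -> str:
    """Replace every character of each span with one mask character."""
    for start, end in spans:
        text = text[:start] + mask * (end - start) + text[end:]
    return text
```

Nothing is redacted in this sketch until an employee's decision has produced a span, which mirrors the rule that the software only makes proposals.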
Risk management
The biggest risk is that information is redacted incorrectly or is accidentally disclosed anyway, which can lead to a violation of privacy laws (such as the AVG), reputational damage for the province, or harm to affected individuals (such as identity theft or misuse of data). These risks are mitigated by mandatory human review: Octobox only makes suggestions and employees make the final decision. In addition, employees are trained in the use of Octobox and additional guidelines are available.
Legal basis
General Data Protection Regulation (AVG), General Administrative Law Act (AWB), Disclosure Act, Open Government Act (WOO), Electronic Publications Act (WEP).
Operations
Data
The algorithm processes complete documents that are being reviewed for disclosure, such as Woo requests, policy documents, reports or e-mails. These documents may contain any kind of information, including personal data such as names, addresses, phone numbers, e-mail addresses, dates of birth, BSN numbers, financial data or signatures. Octobox scans the entire document to detect information that may need to be protected. The algorithm is therefore not limited to specific data categories, but works on the full content of the document.
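As an illustration of how one of these categories could be flagged in running text, the sketch below looks for nine-digit numbers and keeps only those that pass the eleven-test (elfproef) used for BSN numbers. Whether Octobox detects BSN numbers this way is not stated in this description. The names `BSN_PATTERN`, `passes_eleven_test` and `find_bsn_candidates` are hypothetical.

```python
import re
from typing import List, Tuple

# Nine consecutive digits that are not part of a longer digit run.
BSN_PATTERN = re.compile(r"\b\d{9}\b")

# Weights of the eleven-test for BSN numbers; the last digit counts as -1.
WEIGHTS = (9, 8, 7, 6, 5, 4, 3, 2, -1)


def passes_eleven_test(digits: str) -> bool:
    """True if the weighted digit sum is divisible by 11."""
    total = sum(int(d) * w for d, w in zip(digits, WEIGHTS))
    return total % 11 == 0


def find_bsn_candidates(text: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, value) for each nine-digit number passing the test."""
    return [
        (m.start(), m.end(), m.group())
        for m in BSN_PATTERN.finditer(text)
        if passes_eleven_test(m.group())
    ]
```

A hit from a check like this would still only be a suggestion: the number may be an order or case number that happens to pass the test, which is one reason the final decision stays with an employee.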
Technical design
Octobox Anonymise works on the basis of (1) algorithms that search the data in context, (2) value lists that allow terms to be recognised automatically, and (3) Natural Language Processing (NLP). NLP can classify text by recognising, for example, what the subject of a sentence is or which word is a verb or a name. The software uses underlying open source engines such as spaCy and YOLO, which assign labels to recognised entities. Octobox then validates these labels again and, if they fall within the confidence zone, presents them to the user for review before they are finalised.
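A minimal sketch of how value-list matches and labelled entities from an NLP engine might be merged and filtered by confidence before review is shown below. It does not call spaCy or YOLO. The model output is passed in as plain `Candidate` objects, and `VALUE_LIST`, `MIN_CONFIDENCE` and all function names are assumptions made for illustration. The actual boundaries of the "confidence zone" are not given in this description.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Candidate:
    text: str
    label: str
    start: int
    end: int
    score: float  # confidence between 0.0 and 1.0
    source: str   # "value_list" or "ner"


# Hypothetical value list: known terms mapped to the label they should get.
VALUE_LIST = {"Jansen": "person_name", "De Vries": "person_name"}

# Hypothetical lower bound of the confidence zone.
MIN_CONFIDENCE = 0.6


def value_list_candidates(text: str) -> List[Candidate]:
    """Find exact, whole-word occurrences of value-list terms."""
    found = []
    for term, label in VALUE_LIST.items():
        for m in re.finditer(r"\b" + re.escape(term) + r"\b", text):
            found.append(
                Candidate(m.group(), label, m.start(), m.end(), 1.0, "value_list")
            )
    return found


def candidates_for_review(text: str,
                          model_candidates: Iterable[Candidate]) -> List[Candidate]:
    """Merge both sources and keep candidates at or above the threshold."""
    merged = value_list_candidates(text) + list(model_candidates)
    kept = [c for c in merged if c.score >= MIN_CONFIDENCE]
    return sorted(kept, key=lambda c: c.start)
```

Everything this function returns is still only a proposal. In the flow described above, a user reviews each candidate before anything is finalised.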
External provider
Similar algorithm descriptions
- The algorithm in the software is mainly configured to recognise and anonymise privacy-sensitive information in documents. The basis for this is the AVG. The tool is also used to highlight and mask information in a document that cannot be shared for other reasons (on a different legal basis, e.g. the Woo). Last change on 12th of February 2025, at 13:34 (CET) | Publication Standard 1.0
- Publication category
- Other algorithms
- Impact assessment
- Field not filled in.
- Status
- In use
- The algorithm in the software is mainly configured to recognise and anonymise privacy-sensitive information in documents. The basis for this is the AVG. The tool is also used to highlight and mask information in a document that cannot be shared for other reasons (on a different legal basis, e.g. the Woo). Last change on 3rd of November 2025, at 9:42 (CET) | Publication Standard 1.0
- Publication category
- Other algorithms
- Impact assessment
- Field not filled in.
- Status
- In use
- The algorithm in the software is mainly configured to recognise and anonymise privacy-sensitive information in documents. The basis for this is the AVG. The tool is also used to highlight and mask information in a document that cannot be shared for other reasons (on a different legal basis, e.g. the Woo). Last change on 20th of March 2025, at 13:40 (CET) | Publication Standard 1.0
- Publication category
- Impactful algorithms
- Impact assessment
- Field not filled in.
- Status
- In use
- The algorithm in the software is mainly configured to recognise and anonymise privacy-sensitive information in documents. The basis for this is the AVG. The tool is also used to highlight and mask information in a document that cannot be shared for other reasons (on a different legal basis, e.g. the Woo). Last change on 6th of August 2024, at 11:49 (CET) | Publication Standard 1.0
- Publication category
- Other algorithms
- Impact assessment
- Field not filled in.
- Status
- In use
- The algorithm in the software is mainly configured to recognise and anonymise privacy-sensitive information in documents. The basis for this is the AVG. The tool is also used to highlight and mask information in a document that cannot be shared for other reasons (on a different legal basis, e.g. the Woo). Last change on 13th of November 2024, at 13:53 (CET) | Publication Standard 1.0
- Publication category
- Other algorithms
- Impact assessment
- Field not filled in.
- Status
- In use