Please note: The algorithm descriptions in English have been automatically translated. Errors may have been introduced in this process. For the original descriptions, go to the Dutch version of the Algorithm Register.
Anonymisation software
- Publication category
- Impactful algorithms
- Impact assessment
- Field not filled in.
- Status
- In use
General information
Theme
Begin date
Contact information
Responsible use
Goal and impact
The anonymisation software lets the Province of Zeeland anonymise the documents it publishes faster and more accurately. This helps prevent data leaks and better protects data subjects' rights under the GDPR (AVG).
Considerations
The Province of Zeeland increasingly has to make information public. Before publication, privacy-sensitive or business-sensitive information has to be masked. The advantage of the anonymisation software is that anonymisation is faster and more accurate than with a manual approach.
Human intervention
The outcome of the algorithm is checked by an employee. The software requires the employee to check every page, and the employee determines whether the document has been anonymised correctly.
Risk management
There is no risk of automated decision-making, and the algorithm has no impact on fundamental rights, because it does not make decisions with legal consequences. It only suggests which personal data to anonymise. If the algorithm does not work well enough, we can make adjustments with blacklists and whitelists. An employee of the Province of Zeeland always performs the final check of whether a document has been anonymised correctly.
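The blacklist and whitelist adjustment mentioned above could work roughly as in the sketch below. This is an illustration under assumptions, not the vendor's implementation: the `Suggestion` type, the `apply_lists` function and the matching rules (case-insensitive, exact term match) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    """A hypothetical anonymisation suggestion: a character span in the text."""
    start: int  # offset of the first character
    end: int    # offset one past the last character
    text: str   # the text covered by the span


def apply_lists(text, suggestions, blacklist=(), whitelist=()):
    """Drop whitelisted suggestions and add one for every blacklisted term.

    Assumption: matching is case-insensitive and lowercasing does not change
    the length of the text (true for ASCII and most Dutch text).
    """
    allowed = {term.lower() for term in whitelist}
    kept = [s for s in suggestions if s.text.lower() not in allowed]
    covered = {(s.start, s.end) for s in kept}

    lowered = text.lower()
    for term in blacklist:
        needle = term.lower()
        if not needle:
            continue
        pos = lowered.find(needle)
        while pos != -1:
            span = (pos, pos + len(needle))
            if span not in covered:
                kept.append(Suggestion(span[0], span[1], text[span[0]:span[1]]))
                covered.add(span)
            pos = lowered.find(needle, pos + len(needle))

    return sorted(kept, key=lambda s: s.start)
```

In this sketch the whitelist suppresses suggestions the algorithm makes too eagerly (for example the organisation's own name), while the blacklist forces a suggestion for terms the algorithm misses. In both cases the result is still only a suggestion that the employee reviews.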
Legal basis
1. WOO
2. WCO
3. UAVG
4. WEP
5. WDO
Links to legal bases
- Woo: https://wetten.overheid.nl/BWBR0045754/
- WDO: https://eur-lex.europa.eu/legal-content/NL/TXT/HTML/?uri=CELEX:31995L0046
- UAVG: https://wetten.overheid.nl/BWBR0040940
- Wep: https://wetten.overheid.nl/BWBR0043961
- Wdo: https://wetten.overheid.nl/BWBR0048156
Operations
Data
All information found in the uploaded documents (except metadata) is processed by the algorithm. This may include ordinary personal data, special personal data and criminal data. It may also include business-sensitive information.
Technical design
Documents are uploaded to the application by an employee. At that point, a temporary copy of the original is made as a PDF with a text layer, and the metadata of the original document is removed from the copy. This copy is stored on a Dutch server for a maximum of 30 days. The text layer of the PDF is sent through an API to a machine learning algorithm: a Natural Language Processing algorithm (named entity recognition) from Microsoft Azure. The API returns the locations in the analysed text where personal data is likely to occur, along with a probability score (a percentage) for each location. Azure then immediately deletes the text layer. The probability score is combined with proprietary AI models developed by the vendor to make the recognition of personal data as accurate as possible. These models are trained on training datasets. Finally, an employee checks the document. When they finalise the document, the data to be anonymised is permanently removed from the text layer and blacked out.
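The step from API results to a redacted text layer can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual product: the `Entity` type, the 0.5 threshold, the score scale (0.0 to 1.0 instead of a percentage) and the `X` mask character are hypothetical, and the sketch does not call Azure or model the vendor's proprietary models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A hypothetical recognition result: where personal data likely occurs."""
    offset: int   # position of the first character in the text layer
    length: int   # number of characters
    score: float  # probability score, assumed here to be between 0.0 and 1.0


def suggest_spans(entities, threshold=0.5):
    """Return merged (start, end) spans for entities at or above the threshold."""
    spans = sorted(
        (e.offset, e.offset + e.length) for e in entities if e.score >= threshold
    )
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            # Overlapping or adjacent span: extend the previous one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def redact(text, spans, mask="X"):
    """Replace every character inside the approved spans with the mask.

    Expects sorted, non-overlapping spans, as produced by suggest_spans.
    """
    parts = []
    cursor = 0
    for start, end in spans:
        parts.append(text[cursor:start])
        parts.append(mask * (end - start))
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts)
```

In the process described above, `redact` would run only after the employee has reviewed and approved the suggested spans. Because the original characters are overwritten rather than hidden, they can no longer be recovered from the resulting text.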
External provider
Similar algorithm descriptions
- The algorithm underlines personal data in documents. An employee has to look at all pages and check whether the document is properly anonymised. Then the software removes all highlighted information and blacks it out. After that, the documents can be published, for example under the Open Government Act. Last change on 19th of September 2024, at 9:19 (CET) | Publication Standard 1.0
- Publication category
- Other algorithms
- Impact assessment
- DPIA
- Status
- In use
- The algorithm underlines personal data in documents. An employee has to look at all pages and check whether the document is properly anonymised. Then the software removes all highlighted information and blacks it out. After that, the documents can be published, for example under the Open Government Act. Last change on 29th of October 2024, at 7:21 (CET) | Publication Standard 1.0
- Publication category
- Other algorithms
- Impact assessment
- DPIA, ...
- Status
- In use
- The algorithm underlines personal data in documents. An employee has to look at all pages and check whether the document is properly anonymised. Then the software removes all highlighted information and blacks it out. After that, the documents can be published, for example under the Open Government Act. Last change on 20th of August 2024, at 8:38 (CET) | Publication Standard 1.0
- Publication category
- Impactful algorithms
- Impact assessment
- Field not filled in.
- Status
- In use
- The algorithm underlines personal data in documents. An employee has to look at all pages and check whether the document is properly anonymised. Then the software removes all highlighted information and blacks it out. After that, the documents can be published, for example under the Open Government Act. Last change on 19th of September 2024, at 8:21 (CET) | Publication Standard 1.0
- Publication category
- Other algorithms
- Impact assessment
- DPIA, ...
- Status
- In use
- The algorithm underlines personal data in documents. An employee has to look at all pages and check whether the document is properly anonymised. Then the software removes all highlighted information and blacks it out. After that, the documents can be published, for example under the Open Government Act. Last change on 8th of July 2024, at 15:45 (CET) | Publication Standard 1.0
- Publication category
- Other algorithms
- Impact assessment
- Field not filled in.
- Status
- In use