Please note: The algorithm descriptions in English have been automatically translated. Errors may have been introduced in this process. For the original descriptions, go to the Dutch version of the Algorithm Register.
Transcription platform Transkribus
- Publication category
- Other algorithms
- Impact assessment
- Field not filled in.
- Status
- In use
General information
Theme
Begin date
Contact information
Responsible use
Goal and impact
Making historical handwritten documents digitally accessible and searchable for researchers and other interested parties. No impact.
Considerations
Making historical research easier. This will allow more people to access historical source material.
Human intervention
The AI models were trained within the Transkribus tool by regional archive staff. The machine-read texts produced by handwritten text recognition (HTR) are checked at random afterwards and corrected where necessary. Third parties who discover errors can report them to the regional archive. The archive assesses each report, which may lead to a correction.
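As an illustration of the random check described above, the following is a minimal sketch of how a sample of transcribed pages could be drawn for manual review. It is not the regional archive's actual procedure: the function name `draw_review_sample`, the page identifiers, and the 5% sampling rate are all hypothetical.

```python
import random


def draw_review_sample(page_ids, fraction=0.05, seed=None):
    """Return a random subset of page IDs for manual checking.

    `fraction` (hypothetical default: 5%) is the share of pages to review.
    At least one page is drawn whenever any pages are available.
    """
    pages = list(page_ids)
    if not pages:
        return []
    sample_size = max(1, round(len(pages) * fraction))
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return sorted(rng.sample(pages, sample_size))


# Hypothetical page identifiers for one batch of scans.
batch = [f"scan-{n:04d}" for n in range(1, 201)]
to_review = draw_review_sample(batch, fraction=0.05, seed=42)
print(len(to_review), "pages selected for manual review")
```

Fixing the seed makes the selection repeatable, so a later reviewer could see which pages were drawn for a given batch.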
Risk management
The risks are low. The regional archive only processes public documents with HTR. Transkribus emerged from an EU Horizon 2020 programme and then developed into a European cooperative with a large number of international heritage institutions as members. All data and metadata are hosted on European servers and comply with the GDPR (AVG in Dutch).
Operations
Data
Transcriptions and Ground Truth
The dataset contains machine-read transcriptions and Ground Truth (training material) of historical manuscripts from the public section of the Civil Registry. New scans processed with HTR are added periodically.
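A machine-read transcription can be compared against its Ground Truth with the character error rate (CER), a measure commonly used for handwritten text recognition. The sketch below shows the general technique only; the register entry does not state which quality measure the regional archive uses, and the function names and the sample line are invented for illustration.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # character missing from the HTR output
                current[j - 1] + 1,      # extra character in the HTR output
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(ground_truth, htr_output):
    """Edit distance divided by the length of the Ground Truth text."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return edit_distance(ground_truth, htr_output) / len(ground_truth)


# Invented example line; one character was misread ("b" instead of "h").
truth = "Jan Jansen, geboren 1854"
machine = "Jan Jansen, geboren 1854".replace("Jansen", "Jansbn")
print(f"CER: {character_error_rate(truth, machine):.3f}")
```

A lower CER means the machine-read text is closer to the Ground Truth; a value of 0 means the two texts are identical.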
Technical design
Using machine learning and Handwritten Text Recognition (HTR) techniques, AI models are trained to recognise handwriting in manuscripts from the 19th century and later.
Architecture of the model
The HTR is implemented with several specific and generic AI models within Transkribus, using convolutional neural networks and transformer neural networks.
External provider
Similar algorithm descriptions
- This algorithm has low impact. Making historical handwritten documents searchable by words. Last change on 9th of December 2024, at 14:25 (CET) | Publication Standard 1.0
- Publication category
- Other algorithms
- Impact assessment
- Field not filled in.
- Status
- In use
- The algorithm recognises (personal) data and otherwise confidential information in a document and makes a proposal to anonymise it. A staff member evaluates the proposal and makes the final adjustment, making the document suitable for publication. Last change on 15th of January 2025, at 7:03 (CET) | Publication Standard 1.0
- Publication category
- Other algorithms
- Impact assessment
- Field not filled in.
- Status
- In use
- Recognising and anonymising privacy-sensitive information in documents. Last change on 4th of June 2024, at 14:53 (CET) | Publication Standard 1.0
- Publication category
- Other algorithms
- Impact assessment
- Field not filled in.
- Status
- In use
- This algorithm has a low impact. On 4 March 2020, the municipality of Amsterdam demonstrated to a large group of interested people at its data lab how it can currently record when placements are made via moving cameras. Last change on 26th of November 2024, at 15:30 (CET) | Publication Standard 1.0
- Publication category
- Impactful algorithms
- Impact assessment
- Field not filled in.
- Status
- Out of use
- The algorithm recognises (personal) data and otherwise confidential information in a document and makes a proposal to anonymise it. A staff member evaluates the proposal and makes the final adjustment, making the document suitable for publication. Last change on 7th of October 2024, at 15:33 (CET) | Publication Standard 1.0
- Publication category
- Other algorithms
- Impact assessment
- Field not filled in.
- Status
- In use