Please note: The algorithm descriptions in English have been automatically translated. Errors may have been introduced in this process. For the original descriptions, go to the Dutch version of the Algorithm Register.

Transcription platform Transkribus

This algorithm has low impact. It makes historical handwritten documents searchable by word.

Last change on 24th of June 2024, at 7:00 (CET) | Publication Standard 1.0
Publication category
Other algorithms
Impact assessment
Field not filled in.
Status
In use

General information

Theme

Organisation and business operations

Begin date

2024-05

Contact information

info@goeree-overflakkee.nl

Responsible use

Goal and impact

Making historical handwritten documents digitally accessible and searchable for researchers and other interested parties. No impact.

Considerations

Making historical research easier. This will allow more people to access historical source material.

Human intervention

The AI models were trained within the Transkribus tool by regional archive staff. The machine-read texts produced by handwritten text recognition (HTR) are spot-checked at random afterwards and corrected where necessary. Third parties who discover errors can report them to the regional archive. Once a report has been assessed, it may lead to a correction.
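The random check described above could be sketched as follows. This is only an illustration under stated assumptions: the register entry does not say how pages are selected or how many are checked, so the function name `pick_pages_for_review`, the page identifiers, and the sample size are all hypothetical.

```python
import random


def pick_pages_for_review(page_ids, sample_size, seed=None):
    """Return a random subset of page identifiers for manual checking.

    Hypothetical helper: the sample size and the use of a seed are
    assumptions, not the archive's documented procedure.
    """
    rng = random.Random(seed)
    pool = list(page_ids)
    # Never ask for more pages than exist in the batch.
    k = min(sample_size, len(pool))
    return rng.sample(pool, k)


# Example batch of 200 machine-read pages (identifiers are made up).
pages = [f"page_{n:04d}" for n in range(1, 201)]
to_check = pick_pages_for_review(pages, sample_size=10, seed=42)
```

Passing a fixed seed makes the selection repeatable, which can help when a reviewer wants to show afterwards which pages were drawn for a given batch.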

Risk management

The risks are low. The regional archive only processes public documents with HTR. Transkribus emerged from an EU Horizon 2020 programme and then developed into a European cooperative with a large number of international heritage institutions as members. All data and metadata are hosted on European servers and are GDPR (AVG) compliant.

Operations

Data

Transcriptions and Ground Truth

The dataset contains machine-read transcriptions and Ground Truth (training material) of historical manuscripts from the public section of the Civil Registry. New scans with HTR are added periodically.
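One common way to compare a machine-read transcription with its Ground Truth is the character error rate (CER): the edit distance between the two texts divided by the length of the Ground Truth. The sketch below is a minimal illustration of that metric. The register entry does not state which quality measure the archive uses, so the function names and the sample text are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def character_error_rate(ground_truth, machine_text):
    """Edit distance divided by the length of the ground truth."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return edit_distance(ground_truth, machine_text) / len(ground_truth)


# Made-up example: one wrong character in a twelve-character line.
cer = character_error_rate("Jan de Vries", "Jan de Vrles")
```

A lower value means the machine-read text is closer to the Ground Truth; a value of 0 means the two texts are identical.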

Technical design

Using machine learning and Handwritten Text Recognition (HTR) techniques, AI models are trained to recognise handwriting in 19th-century and more modern manuscripts.


Architecture of the model

The HTR is implemented with several specific and generic AI models within Transkribus, using convolutional neural networks and transformer neural networks.

External provider

Transkribus
