Please note: The algorithm descriptions in English have been automatically translated. Errors may have been introduced in this process. For the original descriptions, go to the Dutch version of the Algorithm Register.
Intelligent search in the BSN management facility
- Publication category: Other algorithms
- Impact assessment: DPIA
- Status: In use
General information
Theme
Begin date
Contact information
Link to publication website
Responsible use
Goal and impact
The purpose of this algorithm is to maximise the quality of the search process in the BSN Management Facility.
The BSN Management Facility offers several services that require searching for the right person:
- UC12 Match identifying data (Known in the BRP as Presence query);
- UC23 Request BSN on the basis of identifying data
- UC35 Request BSN for the purpose of cleaning and initial filling
- UC37 Processing bulk demand
For all these services, it is important to find the right person if they are registered in the BSN Management Facility. The algorithm implements a form of 'intelligent' search: in addition to a 1-to-1 search with the entered data, a number of operations are performed on the entered data before searching. This increases the probability of finding the person being searched for if they are present in the database (reduces false negatives). At the same time, the algorithm keeps the probability of returning the wrong person low (reduces false positives).
Considerations
If no intelligent search algorithm were deployed, persons would be searched for on the basis of a 1-to-1 comparison of the search data entered with the personal data in the database. Experience shows, however, that the users of the Management Facility often do not have exactly the same data as are present in the database. For example, many organisations do not have the full given names of the person, but only the initials, or only the first given name in full. It is also common that diacritical marks such as ã, â, ä or ligatures such as Ӕ are not registered or not registered correctly. Furthermore, many different spellings of (especially) foreign names are in circulation. In a 1-to-1 comparison, these search queries would wrongly lead to the answer Not Found; we call this a false-negative result. In this case, introducing some form of intelligent search increases the hit probability, and thus reduces the false-negative rate.
At the same time, such a search algorithm may increase the probability of false positives, by finding individuals who do not meet the intended search query.
When weighing up the advantages against the disadvantages, the choice was made to add measures that prevent, as far as possible, a false-positive result from being used unjustifiably. This strikes a balance: false negatives are reduced without the percentage of false positives becoming too large.
Human intervention
The search result is presented to the user, the person who submitted the query. The answer also states whether the result is a 100% match (all compulsory data match completely) or a lower percentage (see Technical operation). Based on that result, the user can judge whether the correct answer is in the search result, or whether an adjusted query is necessary. Documentation is available to support the user in this process: Handreiking BSN voor gebruikers, Handreiking BSN Burgerzaken en RNI and Functionele specificaties BV BSN.
Risk management
Regular consultation takes place with the users of the BSN Management Facility, covering among other things their satisfaction with the services offered. The perceived quality of the search algorithm is part of this. RvIG has an application management team that, under the leadership of the product owner, deals with questions, wishes and any malfunctions.
Legal basis
At the level of the law, this is the General Provisions Citizen Service Number Act. Below that, the Citizen Service Number Decree. Below that, the Regulation on Citizen Service Number, where, in Article 3, the Logical Design BSN is designated as the system description. This Logical Design describes the operation of the search algorithm.
Links to legal bases
- Wet algemene bepalingen burgerservicenummer: https://wetten.overheid.nl/BWBR0022428
- Besluit burgerservicenummer: https://wetten.overheid.nl/BWBR0022829
- Regeling burgerservicenummer: https://wetten.overheid.nl/BWBR0022835
Elaboration on impact assessments
The BSN Management Facility was designed and built in the years 2005-2007. At that time, (D)PIAs were not yet common. RvIG is currently working on updating the existing (D)PIAs and completing missing (D)PIAs. That working list will also include the BSN Management Facility.
Impact assessment
Operations
Data
The maximum set of search data that can be entered in the different search algorithms consists of:
- Name: given names, prefixes, surname (geslachtsnaam)
- Birth: date of birth, place of birth, country of birth
- Gender designation
- Nationality
- Address: municipality, street name, house number, house letter, house number suffix, postal code
- Country from where registered
- Date of departure from the Netherlands
Links to data sources
Technical design
Different search methods can be distinguished:
- Exact match: The specified field exactly matches the found field. This search method will be used for all specified fields.
- Diacritic transformation: The specified field, after diacritic transformation, exactly matches the found field (which has been stripped of its diacritics using the same method). The table below shows the fields for which diacritic transformation is applied. The note "Diacritics" describes how to handle diacritic transformation.
- Other search methods: Different search methods for each field have been distinguished. These are listed in the second table. When selecting the search methods, the following assumptions were made:
- A well-identifiable person (preferably with identity document), identified at the counter, is assumed;
- Typing errors are not resolved by the proposed search methods.
The diacritic transformation is performed as follows:
- All diacritics are removed and some special characters (e.g. Ǽ) are transformed to a corresponding character or characters in the range 'a' to 'z'. This is done according to a translation table (see conversion tables).
- All (normal) uppercase characters A to Z are converted to the corresponding lowercase characters.
- All remaining characters, other than digits '0' to '9' and letters 'a' to 'z', are converted to spaces after which all spaces are removed.
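The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the real conversion tables are not reproduced in this description, so the small SPECIAL_CHARACTERS table and the function name diacritic_transform are hypothetical, and Unicode decomposition is used here as a stand-in for the official translation table.

```python
import string
import unicodedata

# Hypothetical excerpt of a translation table for characters that Unicode
# decomposition does not reduce to plain letters. The official conversion
# tables are NOT reproduced here; these entries are assumptions.
SPECIAL_CHARACTERS = {"æ": "ae", "œ": "oe", "ø": "o", "ß": "ss", "ł": "l"}

# Characters that survive the final step: 'a'-'z' and '0'-'9'.
ALLOWED = set(string.ascii_lowercase + string.digits)


def diacritic_transform(value: str) -> str:
    """Normalise a field value for comparison (illustrative sketch only)."""
    # Step 1a: decompose characters and drop the combining marks (diacritics).
    decomposed = unicodedata.normalize("NFKD", value)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Step 2: convert uppercase characters to lowercase.
    lowered = stripped.lower()
    # Step 1b: map remaining special characters through the translation table.
    mapped = "".join(SPECIAL_CHARACTERS.get(ch, ch) for ch in lowered)
    # Step 3: drop everything that is not a letter 'a'-'z' or a digit '0'-'9'
    # (equivalent to converting such characters to spaces and removing them).
    return "".join(ch for ch in mapped if ch in ALLOWED)


print(diacritic_transform("Müller"))
print(diacritic_transform("D'Août"))
```

Because both the query value and the registered value pass through the same function, two spellings that differ only in diacritics, case, spacing or punctuation end up as the same comparison string.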
Searching by first names:
- It is possible to search only by those first name(s) specified in the query. This means that if only one given name is specified, it is compared with the first given name in the given names field and if two given names are specified, they are compared with the first two given names in the given names field, etc. The order specified and the order recorded in the registration must be exactly the same.
- A search is performed by first initial only if only 1 position in the given names field is specified. The specified character is compared with the first position of the field in the registration (after diacritic transformation).
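The two given-name rules above can be sketched as follows. The function names and the simplified normalisation are assumptions made for illustration; in particular, applying the same normalisation to full given names (and not only to the initial) is an assumption, and the official diacritic transformation uses a conversion table that is not reproduced here.

```python
import unicodedata


def normalise(value: str) -> str:
    """Simplified stand-in for the diacritic transformation (assumption)."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def given_names_match(query: str, registered: str) -> bool:
    """Compare the specified given names with the registered given names."""
    query = query.strip()
    registered_names = registered.split()
    if not query or not registered_names:
        return False
    # Initial-only search: exactly one position is specified, so compare it
    # with the first position of the registered field.
    if len(query) == 1:
        return normalise(registered_names[0])[:1] == normalise(query)
    # Otherwise compare the N specified names with the first N registered
    # names; the order must be exactly the same.
    query_names = query.split()
    if len(query_names) > len(registered_names):
        return False
    first_n = registered_names[: len(query_names)]
    return [normalise(n) for n in query_names] == [normalise(n) for n in first_n]


print(given_names_match("Jan", "Jan Pieter Maria"))
print(given_names_match("Pieter", "Jan Pieter Maria"))
```

Note that a query for a second given name alone does not match, because comparison always starts at the first registered given name.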
Surname search:
Based on a conversion table, the specified surname is compared with the surname (geslachtsnaam) field in the registrations. Principles here are:
- Aimed at resolving differences in foreign surnames, as Dutch surnames are expected to be well documented (passport, driving licence);
- The conversion is mainly focused on transliteration and transcription. This partially resolves differences in the conversion of non-Roman scripts. (In Russian passports, for example, Cyrillic is often transcribed according to a French transliteration. This differs from the Dutch or English transliteration);
- Particular attention has been paid to the conversion of Cyrillic (and derivative scripts), Chinese, Korean and Arabic;
- The proposed transliteration is applied to the surname, as created after diacritic transformation (in a broad sense).
The specified prefixes and surname are stripped of diacritics and merged, and are then compared with the prefixes and surname fields from the registration, which have undergone the same processing.
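The merge-and-compare step described above can be sketched as follows. This is a minimal illustration: the function and parameter names are hypothetical, the normalisation is a simplified stand-in for the official diacritic transformation, and the transliteration conversion table is left out because it is not reproduced in this description.

```python
import string
import unicodedata

_ALLOWED = set(string.ascii_lowercase + string.digits)


def simplify(value: str) -> str:
    """Simplified stand-in for the diacritic transformation (assumption)."""
    decomposed = unicodedata.normalize("NFKD", value).lower()
    # Combining marks, spaces and punctuation are not in _ALLOWED, so they drop out.
    return "".join(ch for ch in decomposed if ch in _ALLOWED)


def merged_name_key(prefixes: str, name: str) -> str:
    """Merge prefixes and name into one comparison string."""
    return simplify(prefixes) + simplify(name)


def names_match(query_prefixes: str, query_name: str,
                reg_prefixes: str, reg_name: str) -> bool:
    """Compare query and registration after identical processing."""
    return merged_name_key(query_prefixes, query_name) == merged_name_key(
        reg_prefixes, reg_name
    )


# A prefix recorded inside the name field still matches a split registration.
print(names_match("", "Van der Berg", "van der", "Berg"))
```

Merging before comparing means it does not matter whether a user entered the prefix in its own field or as part of the name.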
Weighting factor
The weighting factor of a field determines the order in which the "dressing regime" is applied, that is, the determination of which additional search criteria are included to arrive at a search result. In addition, the weighting factor is used to calculate the score for the purpose of the Presence query. The weighting factors specified here represent the current setting of the intelligent search. They can be changed as the intelligent search is adjusted.
Field / Weighting factor
Given names 0.70
Surname prefixes 0.65
Surname 0.85
Date of birth 0.90
Place of birth 0.67
Country of birth 0.60
Indication of sex 0.90
Nationality 0.30
Municipality of registration 0.50
Street name 0.45
House number 0.75
House letter 0.40
House number suffix 0.25
Indication of house number 0.20
Postcode 0.80
Location description 0.10
Country from which registered 0.32
Date of departure from the Netherlands 0.35
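As an illustration of how such weighting factors could be used, the sketch below orders the specified fields by weight and computes a score. Both parts rest on assumptions: this description does not state whether fields are applied from highest to lowest weight, nor how the score is calculated. The formula used here (weight of matching fields divided by weight of all specified fields) is hypothetical, as are the English field keys and function names. Only a subset of the table is included.

```python
# Subset of the weighting factors from the table above; the dictionary keys
# are hypothetical English names chosen for this sketch.
WEIGHTS = {
    "given_names": 0.70,
    "family_name": 0.85,  # geslachtsnaam
    "date_of_birth": 0.90,
    "place_of_birth": 0.67,
    "nationality": 0.30,
    "postcode": 0.80,
}


def fields_by_weight(specified: list[str]) -> list[str]:
    """Order fields from highest to lowest weight (direction is an assumption)."""
    return sorted(specified, key=lambda field: WEIGHTS[field], reverse=True)


def weighted_score(query: dict[str, str], record: dict[str, str]) -> float:
    """Hypothetical score: matched weight divided by specified weight."""
    specified = [field for field in query if field in WEIGHTS]
    total = sum(WEIGHTS[field] for field in specified)
    if total == 0:
        return 0.0
    matched = sum(
        WEIGHTS[field] for field in specified if query[field] == record.get(field)
    )
    return matched / total


query = {"family_name": "jansen", "date_of_birth": "19800101", "nationality": "NL"}
record = {"family_name": "jansen", "date_of_birth": "19800101", "nationality": "BE"}
print(fields_by_weight(list(query)))
print(round(weighted_score(query, record), 2))
```

Under this assumed formula, a mismatch on a low-weight field such as nationality lowers the score far less than a mismatch on date of birth would.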
External provider
Link to code base