Please note: The algorithm descriptions in English have been automatically translated. Errors may have been introduced in this process. For the original descriptions, go to the Dutch version of the Algorithm Register.

TenderTrends

TenderTrends is a model that searches for relevant tenders based on keywords and filters entered by the user, and displays data from those tenders. For instance, if the words 'grass', 'mowing' and 'trees' are entered, tenders containing one or more of these words will appear. The user can convert a selection of tenders into charts.

Last change on 16th of January 2024, at 10:19 (CET) | Publication Standard 1.0
Publication category
Other algorithms
Impact assessment
Field not filled in.
Status
In development

General information

Theme

Organisation and business operations

Begin date

2024-01

Contact information

https://www.rvo.nl/onderwerpen/contact/formulier

Link to publication website

tenderned.nl

Responsible use

Goal and impact

The purpose of this algorithm is to support contracting authorities in their tender preparation process. It does this by providing insights into similar tenders.


This goal is achieved in a number of steps:

  1. The contracting authority prepares the tender by formulating the needs, requirements, characteristics and other issues.
  2. Next, the contracting authority can consult TenderTrends by entering keywords and specifying additional criteria (e.g. type of contracting authority: 'municipality').
  3. The user receives a list of potentially relevant tenders based on their search query.
  4. From this list, the user selects the tenders most relevant to their situation. This returns a summary of historical key figures (average number of questions from market players, average turnaround time, most frequently selected procedures, etc.) and the public contact data belonging to the selected tenders.
  5. Users can apply this data to their own tenders to make more appropriate choices in advance. The user can also contact colleagues to gain experiences.
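
The summary in step 4 could be computed along the following lines. This is only a minimal sketch: the record layout and the field names (`procedure`, `questions`, `lead_time_days`) are assumptions made for illustration and are not taken from TenderNed.

```python
from collections import Counter
from statistics import mean

# Hypothetical records for tenders the user selected; field names are assumed.
selected_tenders = [
    {"procedure": "open", "questions": 12, "lead_time_days": 60},
    {"procedure": "open", "questions": 30, "lead_time_days": 90},
    {"procedure": "restricted", "questions": 6, "lead_time_days": 120},
]


def summarise(tenders):
    """Return average key figures and the most frequently used procedures."""
    procedures = Counter(t["procedure"] for t in tenders)
    return {
        "avg_questions": mean(t["questions"] for t in tenders),
        "avg_lead_time_days": mean(t["lead_time_days"] for t in tenders),
        # The two most frequent procedures, most frequent first.
        "most_common_procedures": [name for name, _ in procedures.most_common(2)],
    }


summary = summarise(selected_tenders)
```

Because the summary only covers tenders the user has selected, the figures always reflect the user's own judgement of relevance.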


The impact of TenderTrends lies mainly in the more efficient use of public funds and in offering contracting authorities more insight. Furthermore, TenderTrends may support more diverse procurement by showing the alternative procedures used in similar tenders.


No negative impact of TenderTrends has been identified because:

  1. its use is optional;
  2. no confidential information (personal or company data) is shown;
  3. the contracting authority always makes the decisions itself.


Furthermore, tenders can be drafted more clearly for entrepreneurs from the start, which makes tendering easier. A possible example: while drafting a new tender with TenderTrends, the contracting authority notices that many questions were asked in comparable historical tenders. The contracting authority investigates this further and resolves some of the common questions in advance through clear communication and requirements. Market players then no longer need to ask these questions and can proceed with the tender.


There is no direct impact for citizens and businesses.

Considerations

TenderNed has taken into account the questions: 'what is allowed?', 'what is possible?' and 'what is desirable?'. Legal and ethical questions were also taken into account.


From this, points such as privacy, propriety and responsibility emerged. For each of the possible risks, steps were taken to minimise the negative impact. An example is sensitive procurement information: here, it was decided to use only publicly accessible tenders.


Another measure is human in the loop (HITL): a human operates the model and ultimately chooses which data is relevant for the summary. The model is unsupervised, meaning it is trained without being told what is right or wrong. It looks purely at how many words match and which of them are most informative. For example, the word 'grass' is more informative than 'tender', because 'tender' appears in almost all texts. For TenderTrends, this means that the model's output may be statistically correct without always being relevant to the user. We periodically perform internal validation by running a number of searches and manually determining whether the results are really relevant. In addition, the end user always makes the final decision, which ensures the relevance of the summarised data. For example, the model may rate a search about plant maintenance as very similar to embankment maintenance. The end user can notice this and ignore this 'prediction'.


Each of the risks found has been resolved or avoided. The risks were also discussed with various domain experts (users, lawyers, etc.).


The trade-off that remained after resolving the risks was whether to deploy TenderTrends or continue with the current state of affairs, without TenderTrends. The choice was made to continue because TenderTrends:

  1. is optional: the contracting authority is not obliged to use TenderTrends.
  2. does not carry a negative impact: it is a non-directive, neutral application that only provides insights.
  3. potentially saves public money: information on comparable tenders can be collected faster.


Furthermore, even after conducting the Impact Assessment for Human Rights when deploying Algorithms (IAMA), it was found that there is no negative impact.

Human intervention

Within this model, there is human intervention. Briefly, the model works as follows:

  1. The user enters search terms himself.
  2. The model calculates relevance of all historical tenders in the dataset.
  3. The model presents up to a maximum number of relevant tenders (the user determines the maximum), each with a degree of agreement from 0 (no agreement) to 100 (full agreement).
  4. The user sees the title, description, and procurement category of each tender.
  5. Based on this information, the user can select the tenders to be analysed.
  6. The application (not the algorithm) produces a number of graphs of the selected tenders based on the key figures present (lead time, price, procedure, number of questions, etc.).

This shows that the TenderTrends application consists of three parts: the search, the pre-selection by the algorithm, and finally the human intervention.


This split ensures that the user has a lot of control over the analysis and can work with it themselves. At the front end (the search), the user defines requirements (mandatory keywords, region of tender, etc.). The model searches more than 50,000 historical tenders and shows the most important results. The user evaluates the results and includes the relevant tenders in the analysis.
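
The search and pre-selection steps can be illustrated with a small sketch. The scoring rule used here (the weighted share of query words found in a tender, scaled to 0-100) is an assumption chosen to match the description of counting matching words and weighting informative ones; the source does not state the actual formula. The function names, the `authority_type` filter and the example weights are likewise hypothetical.

```python
def score(query_words, tender_words, weights):
    """Weighted share of query words found in the tender, scaled to 0-100."""
    total = sum(weights.get(w, 0.0) for w in query_words)
    if total == 0:
        return 0
    matched = sum(weights.get(w, 0.0) for w in query_words if w in tender_words)
    return round(100 * matched / total)


def search(query, tenders, weights, max_results, authority_type=None):
    """Apply the user's filter, score every tender, return the top results."""
    query_words = set(query.lower().split())
    results = []
    for tender in tenders:
        # Filter set by the user at the front end (assumed field name).
        if authority_type and tender["authority_type"] != authority_type:
            continue
        s = score(query_words, set(tender["text"].lower().split()), weights)
        if s > 0:
            results.append((s, tender["title"]))
    results.sort(key=lambda r: r[0], reverse=True)
    return results[:max_results]  # the user decides the maximum


# Illustrative word weights: a rare word counts for more than a common one.
weights = {"grass": 3.0, "mowing": 2.0, "trees": 2.0, "tender": 0.1}
tenders = [
    {"title": "Mowing verges", "text": "mowing grass verges tender",
     "authority_type": "municipality"},
    {"title": "Tree care", "text": "pruning trees tender",
     "authority_type": "municipality"},
    {"title": "Office supplies", "text": "paper pens tender",
     "authority_type": "province"},
]
hits = search("grass mowing trees", tenders, weights,
              max_results=2, authority_type="municipality")
```

The returned list is only a proposal: in the application, the user still decides which of these tenders enter the analysis.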


The path after this analysis lies further with the contracting authority. However, TenderTrends does offer

Risk management

The following mitigating measures have been taken within the model:

  • The user selects the tenders that are actually relevant prior to the data analysis. This ensures human intervention and a check on which data the analysis is based on.
  • TenderTrends only displays and uses publicly available data. This prevents sensitive information from being included in the model and thus becoming publicly available.
  • The model uses a 'local dataset' that is free of sensitive information, so the model does not communicate directly with an internal database. As a result, there are fewer inter-system dependencies (the database would otherwise have to be reachable) and there is no risk of an attack on the database (information security).
  • Other mitigating measures concern the way data is presented, so that it is more accessible to the user. The best example is the output of the model, a static score that is difficult to interpret for people unfamiliar with language data. The score was therefore renamed 'degree of similarity' and is displayed as a filled bar: the fuller the bar, the greater the similarity to the query.
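
As an illustration of the last measure, a score between 0 and 100 can be turned into a filled bar as follows. This is a text-only sketch with a hypothetical function name; the actual application presumably renders the bar graphically.

```python
def similarity_bar(score, width=20):
    """Render a 0-100 score as a text bar; a fuller bar means more similar."""
    score = max(0, min(100, score))  # clamp out-of-range values
    filled = round(width * score / 100)
    return "#" * filled + "-" * (width - filled)


bar = similarity_bar(75)
```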

Operations

Data

Data used for the algorithm are: the names of all Dutch public tenders, their descriptions and contracting authorities. The following tender data were used:

  • Tender feature
  • Name contracting authority
  • Type of contracting authority
  • Location of the contracting authority
  • Name of the tender
  • Description of the tender
  • Final value of the tender
  • Currency of the final value of the tender
  • Type of procurement of the tender
  • Tender procedure
  • Award criteria of the tender
  • Status of the tender
  • CPV code of the tender
  • CPV description of the tender
  • Tender start date
  • Tender end date
  • Number of awards in the tender
  • Number of operators registered in the tender
  • Type of procurement
  • ID of the first publication of the tender procedure
  • Number of tender documentation
  • Number of questions asked about the tender procedure
  • Number of questions answered in the tender procedure
  • Number of lots at the tender
  • Companies awarded at the tender procedure
  • Contact person of the procuring entity
  • Phone number of the contact person of the contracting authority
  • Indication whether the tender is a market consultation

Links to data sources

Datasets aanbestedingen: https://www.tenderned.nl/cms/nl/aanbesteden-in-cijfers/datasets-aanbestedingen

Technical design

TenderTrends uses a language model that, based on specified keywords and filters, searches for the announcements most relevant to the user (buyers from central and decentralised governments, public-law institutions and special-sector companies). The user can then further select the notices relevant to them. This selection is converted into insights for the user through graphs and tables (e.g. average number of lots).


The language model is trained on historical, public tender data. The title and description of each tender were concatenated. Typical Dutch stop words (e.g. de, het, een, zoals, omdat) were then removed from the text. From the remaining texts, a dictionary was created in which all words were counted. Finally, the model produces a number for each word indicating whether it carries a lot of information (a high number) or little information (a low number): the more often a word is used, the lower the number. These numbers are recalculated periodically, and this recalculation is the self-learning part.
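
The training steps described above can be sketched as follows. The source does not name the formula behind the word numbers; this sketch assumes a smoothed inverse document frequency, one common way to give frequently used words a lower weight. The stop-word set contains only the examples named in the text, and the function names and sample tenders are made up for illustration.

```python
import math
from collections import Counter

# Only the stop words given as examples in the text; a real list is longer.
STOP_WORDS = {"de", "het", "een", "zoals", "omdat"}


def tokenize(title, description):
    """Concatenate title and description, lowercase, drop stop words."""
    text = f"{title} {description}".lower()
    return [w for w in text.split() if w not in STOP_WORDS]


def word_weights(tenders):
    """Assumed formula: log(N / document count) + 1, so common words score low."""
    doc_counts = Counter()
    for title, description in tenders:
        # Count each word once per tender.
        doc_counts.update(set(tokenize(title, description)))
    n = len(tenders)
    return {w: math.log(n / c) + 1.0 for w, c in doc_counts.items()}


# Made-up example tenders as (title, description) pairs.
tenders = [
    ("Aanbesteding maaien", "het maaien van gras"),
    ("Aanbesteding bomen", "de aanbesteding van bomen"),
    ("Aanbesteding kantoor", "een aanbesteding zoals papier"),
]
weights = word_weights(tenders)
```

In this example 'aanbesteding' occurs in every tender and receives the lowest weight, while 'gras' occurs in only one and receives a higher weight. Rerunning `word_weights` on an updated dataset corresponds to the periodic recalculation.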