Please note: The algorithm descriptions in English have been automatically translated. Errors may have been introduced in this process. For the original descriptions, go to the Dutch version of the Algorithm Register.

PDF to 3D objects (Vector data)

Retrieve building data such as area/size, type, layout and geometry (shape and location) from various types of drawings for use in (basic) registrations. PDF2GIS has the status 'under development': it is in the test phase of phase 1 (subdivision drawings) of three phases.

Last change on 23rd of August 2024, at 15:38 (CET) | Publication Standard 1.0
Publication category
Other algorithms
Impact assessment
Field not filled in.
Status
In development

General information

Theme

Organisation and business operations

Begin date

Field not filled in.

Contact information

datashop@denhaag.nl

Link to publication website

The code is not published publicly. The PDF2GIS application is being developed by Coders Co. on behalf of the municipality of The Hague. The business case and the intellectual property rest with the supplier. PDF2GIS was initiated from the Startup in Residence project.

Responsible use

Goal and impact

The PDF2GIS application is designed to retrieve building data from subdivision drawings (phase 1), construction drawings (phase 2) and estate agent floor plans (phase 3). It retrieves objects, their relationships and their attributes coherently; the objects include the property, residential object, building zone, level, space and apartment right, among others. Coherent acquisition benefits quality (including completeness, consistency and accuracy). Higher quality reduces queries and coordination problems between, for example, (basic) registrations. The application will eventually replace measuring on a paper drawing or in a digital drawing with a tool such as Bluebeam (where existing lines are redrawn). A data manager or geo-information manager operates the application and can intervene in each of the 13 steps to make corrections. The data are used in (basic) registrations such as WOZ and BAG, and in the future perhaps the SOR (Coherent Object Registration). The programme is considered to function well when the quality of its data matches that of manually collected data.

Considerations

There is no alternative to PDF2GIS's algorithms. In fact, PDF2GIS itself is seen as an alternative to the traditional way of collecting building data. In the traditional way of working, individual municipal employees each make their own interpretation of lines and text in order to deduce building data. This leads to proportionally fewer consistent interpretations and proportionally more errors, e.g. in calibration. The strength of PDF2GIS is its ability to collect, in a coherent way, data that are currently still kept separately in different basic registries. In current practice, the data in four different registries (WOZ, BAG, BGT, 3D city model) are maintained at different times and from different sources. This is comparatively expensive (differences lead to feedback reports etc.) and the quality is comparatively low (due to differences in time, accuracy, precision etc.). There are no other programmes with algorithms that can collect building data coherently while using existing, known building data, such as the data in the system of basic registries or the WOZ registration.

Human intervention

Yes, there is certainly human intervention, in the sense that a user operates the PDF2GIS algorithm. Distilling building data from scanned (sometimes old, less legible) drawings is (still) too complex to be left 'blind' to an algorithm alone. In phase 1, a drawing is processed in 13 steps, and in each step the user is expected to make corrections where necessary. For the time being, the user also has to enter the result into any receiving (basic) registrations themselves.

Risk management

There are no risks associated with working with PDF2GIS's algorithms. The biggest pitfall is working with a drawing that does not represent the actual situation. Employees working with PDF2GIS guard against this by studying visual material such as aerial photos. Moreover, the output of PDF2GIS eventually forms the input of a chain in which building data are validated and distributed. There is no question of bias in the sense that the algorithm might produce a systematic difference in representativeness ('discriminate'). At most, the algorithms may work better on drawings of common buildings (flats) and less well on unique buildings, so that the user has to make manual corrections more often.

Legal basis

Under the BAG Act (on the management of address and building data), the size of the residential object is measured, among other things. Under the WOZ Act (on the valuation of immovable property), the size of the building areas on the building floors within the WOZ object is measured. WOZ uses BAG data, and in The Hague WOZ is also the delegated source holder for area data in BAG: WOZ measures on behalf of BAG so that the same area data are used for every building across the municipality.

Operations

Data

The PDF2GIS algorithm uses data from basic registries. The data sources include the WOZ (property valuation), BAG (addresses and buildings), BGT (large-scale topography) and BRK (land register). Ideally, the result of PDF2GIS is entered into the (basic) registrations WOZ and BAG, and in the future into the SOR (Coherent Object Registration). The algorithm processes a drawing (a subdivision drawing in phase 1, a construction drawing in phase 2 and an estate agent's floor plan in phase 3) together with known data from basic registrations. As far as possible, the processing results in data about buildings. These are the objects, relations and attributes described in the information models of BAG (IMBAG), WOZ (IMWOZ) and the SOR.

Technical design

PDF2GIS uses, among others, algorithms for text detection (EAST algorithm), text recognition (Tesseract OCR), line recognition (Hough Line Transform) and polygon shaping, as well as algorithms for georeferencing (rubber-sheeting etc.).
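
To illustrate the line recognition step, the following is a minimal sketch of a Hough Line Transform in plain Python. It is not the PDF2GIS implementation, whose code is not public. The function name hough_lines, its parameters and the sample pixels are assumptions made for this example only. Each dark pixel votes for every line that could pass through it, written in normal form rho = x*cos(theta) + y*sin(theta); bins that collect many votes correspond to straight lines such as walls in a drawing.

```python
import math
from collections import Counter


def hough_lines(points, theta_steps=180, rho_resolution=1.0, min_votes=5):
    """Return (rho, theta_in_degrees, votes) tuples, strongest line first.

    points: iterable of (x, y) pixel coordinates (hypothetical input,
            e.g. the dark pixels of a scanned drawing).
    A line is described by rho = x*cos(theta) + y*sin(theta).
    """
    accumulator = Counter()
    for x, y in points:
        # Each point votes once per candidate angle.
        for step in range(theta_steps):
            theta = math.pi * step / theta_steps
            rho = x * math.cos(theta) + y * math.sin(theta)
            accumulator[(round(rho / rho_resolution), step)] += 1

    # Keep only the bins with enough supporting points.
    lines = [
        (rho_bin * rho_resolution, 180.0 * step / theta_steps, votes)
        for (rho_bin, step), votes in accumulator.items()
        if votes >= min_votes
    ]
    return sorted(lines, key=lambda line: -line[2])


# Hypothetical drawing: a horizontal wall at y = 10 and a vertical wall at x = 5.
wall_pixels = {(x, 10) for x in range(100)} | {(5, y) for y in range(60)}
for rho, theta_deg, votes in hough_lines(wall_pixels, min_votes=60):
    print(f"rho={rho:.0f}  theta={theta_deg:.0f} deg  votes={votes}")
```

In this sample the horizontal wall comes out as the strongest candidate (rho = 10, theta = 90 degrees) and the vertical wall as rho = 5, theta = 0 degrees. A production system would more likely rely on an existing library routine and add steps such as edge detection and suppression of near-duplicate lines, which this sketch leaves out.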
