Intro to CBS data

LISS
dataset
guide

Here is some useful information that can help navigate CBS datasets.

Authors

Lisa Sivak

Gert Stulp

Published

July 8, 2024

Here we describe the datasets based on Dutch registers that are available to selected teams in the second part of PreFer.

The Netherlands uses administrative data from different registers (for example, municipality databases about residents or tax administration databases about taxpayers) instead of censuses and surveys as a source of information about the population. Statistics Netherlands (CBS) collects administrative data from different registers with information about persons, households, jobs, businesses, dwellings, vehicles, and more. All personal identifiers in these datasets are replaced with anonymous linkable keys (e.g. anonymous personal ID, household ID, address ID). CBS maintains and updates these datasets, imputes missing data, combines them to infer additional characteristics (e.g. cohabitation of unmarried couples, based on address data and tax data), and even creates whole new datasets (e.g. family network or colleagues network). These data are used for official statistics and are also provided to scientists from selected organizations for scientific research under strict conditions.

Useful resources to understand and navigate CBS data

  1. Prins K. Population register data, basis for the Netherlands Population Statistics: the paper that describes the Personal Records Database, one of the main sources of CBS data.

  2. van de Laan D.J.: the paper describing how the original network datasets were created.

  3. Updates in the methodology of constructing network datasets: insights not only about network datasets and changes in methodology of establishing ties, but also about other CBS datasets (e.g. details about how some characteristics are inferred which are not mentioned elsewhere). MUCH easier to read than other CBS documentation.

  4. ODISSEI portal: here you can search (by the dataset name or keywords) for different CBS datasets and read the documentation (use Chrome and enable auto-translate from Dutch; this way it’s easier to read the description of the dataset and variables)

  5. Microdata catalog: list of all available datasets with documentation (pdf files, in Dutch)

CBS datasets used in PreFer

General comments

CBS uses different dataset structures to store information. Knowing them helps to avoid mistakes when linking datasets and to avoid using information from 2021 and later.

  • BUS in the name of a dataset means that there are several rows per unit of observation. A new row is added whenever a change occurs. For example, if someone gets divorced, a new row is added for this person in the GBABURGERLIJKESTAATBUS dataset, which covers the civil status of Dutch residents.

  • TAB in the name of a dataset means that there is only one row per unit of observation. For example, each yearly dataset about the highest level of education (HOOGSTEOPLTAB) includes one row per person.

  • For most datasets, separate files for each year are available. Some of these yearly files include only information from that year (e.g. SPOLISBUS: information about people’s jobs that year; HOOGSTEOPLTAB: highest level of education that year; INPATAB: personal income that year). Other yearly datasets are cumulative (they include all data up to a certain year), e.g. GBAPERSOONTAB yearly files include personal information about all people registered in the Netherlands up to December 31, YYYY. It is important to use only files up to (and including) 2020 in PreFer.
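To illustrate how a BUS-style file (several rows per person, one per period) can be turned into one value per person, here is a minimal sketch in plain Python. The field names (`person_id`, `start`, `end`, `status`), the example rows, and the far-future end date for ongoing periods are assumptions made for illustration; they are not the actual CBS variable names or coding.

```python
from datetime import date

# Hypothetical BUS-style rows: one row per person per period.
# A new row starts whenever the status changes.
civil_status_rows = [
    {"person_id": "A", "start": date(2005, 6, 1), "end": date(2015, 3, 9), "status": "married"},
    {"person_id": "A", "start": date(2015, 3, 10), "end": date(9999, 12, 31), "status": "divorced"},
    {"person_id": "B", "start": date(1990, 1, 1), "end": date(9999, 12, 31), "status": "single"},
]


def status_on(rows, reference_date):
    """Return {person_id: status} using the row whose period covers reference_date."""
    result = {}
    for row in rows:
        if row["start"] <= reference_date <= row["end"]:
            result[row["person_id"]] = row["status"]
    return result


# One value per person, as in a TAB-style file, for a chosen reference date.
snapshot_2014 = status_on(civil_status_rows, date(2014, 12, 31))
snapshot_2020 = status_on(civil_status_rows, date(2020, 12, 31))
```

Picking the row that covers a reference date gives a one-row-per-person snapshot that can then be linked to TAB-style files for the same year.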

In some cases there is only one file, which is updated each year; when a new version is released, the older version becomes unavailable. Examples include GBABURGERLIJKESTAATBUS (civil status: married, in a registered partnership, single, etc.) and SECMBUS (socio-economic category). If you use these datasets, make sure you don’t use information from January 1, 2021 onwards (i.e. civil statuses and socio-economic categories with a start date on January 1, 2021 or later).
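The cutoff described above could be applied along the following lines. This is only a sketch: the field names and example rows are hypothetical, and capping end dates at December 31, 2020 is an extra precaution assumed here (an end date in 2021 also reveals that something changed in 2021), not a rule stated in the CBS documentation.

```python
from datetime import date

CUTOFF = date(2020, 12, 31)  # last day of information that may be used in PreFer


def apply_cutoff(rows, cutoff=CUTOFF):
    """Drop periods that start after the cutoff and cap later end dates at the cutoff."""
    kept = []
    for row in rows:
        if row["start"] > cutoff:
            continue  # this period begins in 2021 or later: do not use it
        capped = dict(row)  # copy, so the input rows are left unchanged
        if capped["end"] > cutoff:
            capped["end"] = cutoff
        kept.append(capped)
    return kept


# Hypothetical rows from a single, regularly updated BUS-style file.
category_rows = [
    {"person_id": "A", "start": date(2012, 5, 1), "end": date(2021, 3, 14), "category": "employee"},
    {"person_id": "A", "start": date(2021, 3, 15), "end": date(2022, 1, 31), "category": "self-employed"},
    {"person_id": "B", "start": date(2019, 9, 1), "end": date(2020, 6, 30), "category": "student"},
]

usable_rows = apply_cutoff(category_rows)
```

Running the filter once, right after loading such a file, makes it harder to use post-2020 information by accident in later steps.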

CBS datasets

  1. GBAPERSOONTAB (CBS codebook, ODISSEI portal): demographic characteristics (e.g. gender, year of birth, country of origin) of people included in the Municipal Personal Records Database (BRP). Most of them are Dutch residents; but from 2014 the dataset also includes people who are not residents but have relationships with Dutch authorities (e.g. work in the Netherlands but don’t live there long-term).

There is one file per year (starting from 2009). Each file includes people who are registered in the BRP from January 1, 1994 to December 31, YYYY. People who were registered before January 1, 1994, but not after, are only partly covered.

  2. GBAADRESOBJECTBUS (CBS codebook, ODISSEI portal): addresses of people registered in the Netherlands: address ID, date of registration at this address, and end date of address (when a person moved out or died). Also includes object ID (i.e. house ID) which can be used to link neighborhood characteristics to individuals.

There is one file per year (starting from 2009). Each file contains information about all past and current addresses of all people registered in the Netherlands from January 1, 1994 to December 31, YYYY. People who were registered before January 1, 1994, but not after, are only partly covered.

People who plan to stay in the Netherlands longer than 4 months are obliged to register at an address. This file includes all long-term residents of the Netherlands. Additionally, some people who stay in the Netherlands short-term can also register; they are also included in this data.

When a person moves, they have to report the move (with the end date of the previous address and the start date of the new address). Most people do that. A representative survey showed that most people (~95%) actually live at the address where they are registered.
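Because each address row has a start and an end date, address data can be used to find out who was registered at the same address on a given day. The sketch below shows one way to do this; the field names and example rows are hypothetical and do not reflect the actual layout of GBAADRESOBJECTBUS.

```python
from collections import defaultdict
from datetime import date

# Hypothetical address rows: one row per person per registered address.
address_rows = [
    {"person_id": "A", "address_id": "X1", "start": date(2010, 1, 1), "end": date(2018, 6, 30)},
    {"person_id": "A", "address_id": "X2", "start": date(2018, 7, 1), "end": date(2020, 12, 31)},
    {"person_id": "B", "address_id": "X2", "start": date(2016, 2, 1), "end": date(2020, 12, 31)},
    {"person_id": "C", "address_id": "X1", "start": date(2019, 1, 1), "end": date(2020, 12, 31)},
]


def residents_by_address(rows, reference_date):
    """Group person IDs by the address they were registered at on reference_date."""
    groups = defaultdict(set)
    for row in rows:
        if row["start"] <= reference_date <= row["end"]:
            groups[row["address_id"]].add(row["person_id"])
    return dict(groups)


residents_2017 = residents_by_address(address_rows, date(2017, 1, 1))
residents_2020 = residents_by_address(address_rows, date(2020, 1, 1))
```

Since a small share of people do not live where they are registered, groups built this way are an approximation of who actually lives together.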

  3. HOOGSTEOPLTAB (CBS codebook, ODISSEI portal): highest level of education of people registered in the Netherlands. There is one file per year, starting from 1999, with data about the highest level of education as of October YYYY.

The file is based on data from various registers and the Labour Force Survey (EBB), which is used to impute missing data. Although the coverage rate is high, the dataset does not represent the entire target population: information is more often missing for older people (>40) and for immigrants than for the younger native Dutch population. Private education is also undercovered.
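Because coverage is incomplete, it can help to keep people without an education record when linking, and to check how missingness differs between groups. The following sketch uses made-up person records, a hypothetical age split at 40 in 2020, and illustrative field names; none of these come from the CBS files.

```python
# Hypothetical person records and a partial education lookup (person "B" has no record).
persons = [
    {"person_id": "A", "birth_year": 1990},
    {"person_id": "B", "birth_year": 1965},
    {"person_id": "C", "birth_year": 1992},
    {"person_id": "D", "birth_year": 1970},
]
education = {"A": "higher", "C": "secondary", "D": "primary"}

# Left join: every person is kept; missing education becomes None.
linked = [
    {
        "person_id": p["person_id"],
        "group": "older" if 2020 - p["birth_year"] > 40 else "younger",
        "education": education.get(p["person_id"]),
    }
    for p in persons
]


def missing_share_by_group(rows):
    """Return {group: share of rows with no education record}."""
    totals, missing = {}, {}
    for row in rows:
        totals[row["group"]] = totals.get(row["group"], 0) + 1
        if row["education"] is None:
            missing[row["group"]] = missing.get(row["group"], 0) + 1
    return {group: missing.get(group, 0) / totals[group] for group in totals}


shares = missing_share_by_group(linked)
```

Keeping the unmatched rows makes the missingness visible instead of silently shrinking the sample.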

  4. SECMBUS (CBS codebook, ODISSEI portal): socio-economic category of persons (e.g. employee, self-employed, social benefit recipient) with the start and end date of each category. There is only one file (with multiple rows per person), which is updated every year.

  5. SPOLISBUS (CBS codebook, ODISSEI portal): characteristics of jobs. The file has one row per person per job per period (usually a month). Self-employment is not covered; only people who were employed at companies are included.

  6. GBABURGERLIJKESTAATBUS (CBS codebook, ODISSEI portal): civil status of people registered in the BRP.
    Sometimes divorces take months; some people are formally still married (have ‘married’ status in this dataset) but are not a married couple any more. Living together (e.g. belonging to the same household or having the same address) might be an additional parameter to use to estimate who is ‘really’ married. A registered partnership is easier to dissolve.

  7. GBAVERBINTENISPARTNERBUS (CBS codebook, ODISSEI portal)

  8. KOPPELPERSOONHUISHOUDEN

  9. INPATAB (CBS codebook, ODISSEI portal): from 2011

  10. INHATAB (CBS codebook, ODISSEI portal): from 2011

  11. VEHTAB (CBS codebook, ODISSEI portal): from 2006

  12. NABIJHEIDKINDOPVTAB (CBS codebook, ODISSEI portal): from 2012

  13. VBOWONINGTYPETAB (CBS codebook, ODISSEI portal): from 2020

  14. GBAHUISHOUDENSBUS (CBS codebook, ODISSEI portal): which household a person belongs to and characteristics of this household
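The note on civil status above suggests combining formal marital status with living arrangements. A minimal sketch of that idea follows, assuming the couples and household memberships have already been extracted for one reference date; the pair list, the household lookup, and all identifiers are hypothetical.

```python
# Hypothetical inputs for one reference date:
# formally married couples (pairs of person IDs) and each person's household ID.
married_pairs = [("A", "B"), ("C", "D"), ("E", "F")]
household_of = {"A": "H1", "B": "H1", "C": "H2", "D": "H3", "E": "H4"}  # "F" is unknown


def cohabiting_pairs(pairs, households):
    """Keep only couples whose two members share a known household."""
    result = []
    for first, second in pairs:
        first_household = households.get(first)
        if first_household is not None and first_household == households.get(second):
            result.append((first, second))
    return result


married_and_cohabiting = cohabiting_pairs(married_pairs, household_of)
```

Couples with a missing household for either partner are treated as not cohabiting here; depending on the analysis, it may be better to handle them as a separate category.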

Network files: from 2009

  1. Familienetwerktab (CBS codebook, ODISSEI portal)

  2. Burennetwerktab (CBS codebook, ODISSEI portal)

  3. Colleganetwerktab (CBS codebook, ODISSEI portal)

  4. Huisgenotennetwerktab (CBS codebook, ODISSEI portal)

  5. Klasgenotennetwerktab (CBS codebook, ODISSEI portal)

Network files: only people who are residents in a given year are included in that year's file.