The task, goal, and research questions
The goal of the data challenge is to assess the current predictability of individual-level fertility and improve our understanding of fertility behaviour.
The task of PreFer is to predict, for people aged 18-45 in 2020, who will have a(nother) child within the following three years (2021-2023), based on data up to and including 2020.
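To make the task definition concrete, here is a minimal sketch of how the eligible group and the binary outcome could be derived. The `Person` record, its field names, and the approximation of age as a difference in calendar years are assumptions made for illustration only. They do not describe the actual PreFer data or its outcome construction.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical record layout, not the real PreFer data format.
    birth_year: int
    child_birth_years: list[int]


def is_eligible(p: Person, ref_year: int = 2020,
                min_age: int = 18, max_age: int = 45) -> bool:
    """Aged 18-45 in the reference year (age approximated by year difference)."""
    age = ref_year - p.birth_year
    return min_age <= age <= max_age


def outcome(p: Person, start: int = 2021, end: int = 2023) -> int:
    """1 if at least one child was born in the outcome window, else 0."""
    return int(any(start <= y <= end for y in p.child_birth_years))


# Toy examples (made up for illustration).
people = [
    Person(1990, [2018, 2022]),  # eligible, child born in 2022
    Person(1985, [2015]),        # eligible, no child in 2021-2023
    Person(1970, [2021]),        # too old in 2020, excluded
    Person(2001, []),            # eligible, no children
]

# Labels are computed only for the eligible group.
labels = [outcome(p) for p in people if is_eligible(p)]
```

In this sketch a prediction model would be trained on information available up to and including 2020 and evaluated against `labels`.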
The results of PreFer will be used to answer the following research questions:
- How well can we predict who will have a(nother) child in the short-term future in the Netherlands?
- What are the most important predictors of this fertility outcome?
- Are there novel predictors for this fertility outcome, unaccounted for in the existing theoretical literature? (this can include non-linear effects and interactions between predictors)
- How do theory-driven methods compare to data-driven methods in terms of predictive accuracy?
- What poses larger constraints on predictive ability: the number of cases or the number of (‘subjective’) variables? Survey data typically consist of hundreds or thousands of variables (including subjective measures such as intentions or values) on a relatively small sample, at least in comparison to data science projects. Population registries typically contain fewer variables, limited to a set of ‘objective’ measures (e.g., income, education, cohabitation), but describe a large number of people.
- To what extent can predictions based on survey data be improved by augmenting them with register data (e.g., imputing missing values, correcting measurement errors, adding new variables)?
- To what extent can predictions based on register data be improved by augmenting them with survey data (e.g., ‘subjective’ variables)?
Further details:
- Gert Stulp talks in more detail about the potential of the data challenge for research into fertility behaviour here
- We describe the advantages of using a prediction-focused approach in the preprint about PreFer.