Calibration inflates UK population from 69M to 74M (should be ~70M) #217

@MaxGhenis

Summary

The dataset calibration process is inflating the UK population well above ONS targets. The base FRS 2025 dataset has ~69M people, but after calibration this rises to ~74M, about 6% above the ONS mid-2024 estimate of 69.3M.

Evidence

| Dataset | 2025 population |
| --- | --- |
| Base dataset (frs_2025_with_ss.h5) | 68.97M |
| Calibrated dataset (frs_2025_calibrated_v3.h5) | 73.57M |
| ONS mid-2024 actual estimate | 69.3M |
| ONS 2022-based projection for 2025 | ~70M |
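
For reference, the size of the gap can be worked out directly from the figures in the table. This is only arithmetic on the numbers quoted above; the helper name `pct_gap` is made up for illustration.

```python
# Figures in millions, copied from the evidence table above.
base = 68.97
calibrated = 73.57
ons_mid_2024 = 69.3


def pct_gap(value, reference):
    """Percentage by which `value` exceeds `reference`."""
    return 100 * (value - reference) / reference


print(f"calibrated vs base dataset: {pct_gap(calibrated, base):.1f}%")
print(f"calibrated vs ONS mid-2024: {pct_gap(calibrated, ons_mid_2024):.1f}%")
```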

Root Cause Investigation

  1. The uk_population target IS included in the calibration loss function (loss.py line 318)
  2. The population index in policyengine-uk (ons.population) uses reasonable growth rates
  3. BUT the calibration is not constraining population properly - other targets are pulling weights in a direction that inflates total population
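
To illustrate point 3, here is a toy sketch of how a population target can sit in the loss and still be overridden. It is not the code in loss.py: the two-record setup, the target values, and the `calibrate` helper are all hypothetical, and it uses a plain weighted least-squares fit on relative errors rather than whatever optimiser the real calibration uses.

```python
import numpy as np

# Hypothetical toy: two records, one per region. Each row of A maps
# the record weights to one target.
A = np.array([
    [1.0, 1.0],  # national population
    [1.0, 0.0],  # region A population
    [0.0, 1.0],  # region B population
])
# The regional targets sum to 74, not to the national 70.
targets = np.array([70.0, 40.0, 34.0])


def calibrate(A, targets, importance):
    """Minimise sum_k importance_k * ((A @ w)_k / target_k - 1)**2."""
    scale = np.sqrt(importance) / targets
    w, *_ = np.linalg.lstsq(A * scale[:, None], scale * targets, rcond=None)
    return w


# Equal importance on all three targets.
total_equal = calibrate(A, targets, np.ones(3)).sum()
# Population target weighted 100x more heavily than the regional ones.
total_upweighted = calibrate(A, targets, np.array([100.0, 1.0, 1.0])).sum()

print(f"equal importance:      {total_equal:.2f}")
print(f"population upweighted: {total_upweighted:.2f}")
```

With equal importance the fitted total lands between the national target and the sum of the regional targets, so the population is inflated even though the population target is in the loss. Upweighting the population term pulls the total back towards 70 in this toy, at the cost of a worse fit to the regional targets.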

Potential Solutions

  1. Increase the weight on population targets in the calibration loss function
  2. Add a hard constraint that total population must match the target
  3. Review conflicting targets that may be inflating population (e.g., regional age bands sum to more than national total)
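
As a rough sketch of options 2 and 3, the snippet below shows one simple way to force the total to match (a uniform rescale of the weights after calibration) and a check for regional targets that overshoot the national one. Both helpers and all the numbers are hypothetical and do not come from the repository. A uniform rescale also shifts every other weighted aggregate by the same factor, so it is a blunt instrument compared with a constraint inside the optimiser.

```python
import numpy as np


def rescale_to_population(weights, people_per_record, target_population):
    """Scale all weights by one factor so the weighted population hits the target."""
    current = float(np.dot(weights, people_per_record))
    return weights * (target_population / current)


def regional_excess(regional_targets, national_target):
    """Amount by which the regional targets overshoot the national target."""
    return sum(regional_targets.values()) - national_target


# Hypothetical weights (millions of people per record) summing to 73.57.
weights = np.array([30.0, 25.0, 18.57])
people_per_record = np.ones(3)

fixed = rescale_to_population(weights, people_per_record, 70.0)
print(f"population after rescale: {fixed.sum():.2f}")

excess = regional_excess({"region_a": 40.0, "region_b": 34.0}, 70.0)
print(f"regional targets exceed national target by: {excess:.1f}")
```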

Impact

This is causing CI test failures in PR #216:

  • Test expected: 69.5M (ONS 2022-based)
  • Test got: 73.7M
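
A population check of this kind might look roughly like the sketch below. The function name and the 3% tolerance are assumptions for illustration, not the actual test in PR #216.

```python
def assert_population_close(actual_millions, expected_millions, rel_tol=0.03):
    """Fail if the population deviates from the target by more than rel_tol."""
    rel_error = abs(actual_millions - expected_millions) / expected_millions
    assert rel_error <= rel_tol, (
        f"population {actual_millions}M is {rel_error:.1%} away from "
        f"{expected_millions}M (tolerance {rel_tol:.0%})"
    )


# A value near the projection passes under the assumed tolerance.
assert_population_close(70.0, 69.5)

# The calibrated figure reported above does not.
try:
    assert_population_close(73.7, 69.5)
    failed = False
except AssertionError as err:
    failed = True
    print(err)
```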

Data Sources
