Demographic projection

Matt Bhagat-Conway

Demographic projection

  • Census data is always about the (recent) past, not the present and certainly not the future
  • In planning, though, we are by definition looking to the future
  • This requires us to project what the future population will look like

Uses for demographic projections

Graph showing human population exploding in recent centuries but peaking around 2085 at 10 billion and then falling rapidly

NYTimes

Uses for demographic projections

  • Demographic projections form the basis of long-range planning
  • Used to develop regional housing plans
  • Critical in transportation modeling
  • May be used to investigate scenarios (e.g. better maternal health care, increased immigration)

Describing populations: the population pyramid

Population pyramids for the US from 2000, 2010, and 2020, showing an aging population and fewer young children

US Census Bureau

Population pyramids in developing countries

Population pyramid for Angola, showing many more young people than older people

Population pyramid for Angola, Wikipedia

The demographic transition

  • Historically, human populations had high birth and death rates
  • In the last century or so, death rates have fallen rapidly around the world
  • Birth rates have fallen as well, but the decline has lagged behind the fall in death rates

How are populations forecasted?

  • What are the processes that contribute to changes in population size?
    • Fertility (births)
    • Mortality (deaths)
    • Inmigration
    • Outmigration

The cohort-component model

  • By far the most common demographic forecasting method
  • We divide the population into cohorts—groups of people sharing some characteristic (e.g. men aged 30-35)
  • We then simulate how each cohort changes over time—how many people die, how many people give birth, how many people leave the region, and how many people migrate into the region

Many more details available in Smith, Tayman, and Swanson (2013), which this lecture is based on

The cohort-component model

  • You start with the population from a base or launch year, divided into cohorts
    • Generally divided by age and biological sex/child-bearing potential, possibly also by race and ethnicity
  • You then adjust that to make a prediction for the next period (often 1, 5, or 10 years later)
    • Use survival rates by age to estimate how many in each cohort survive to the next period
    • The survivors “graduate” to the next cohort, e.g. women 20–25 in a 2020 launch year will be 25–30 in 2025
    • Use fertility rates to predict how many births there will be; these become the new youngest cohort
    • Use migration estimates to adjust how the population changes due to migration
  • Repeat the process to forecast the next period, and so on
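
As a rough sketch of the bookkeeping in one projection step (the notation here is an assumption for illustration, not taken from Smith, Tayman, and Swanson): let \(P_a(t)\) be the population of cohort \(a\) at time \(t\), \(S_a\) its survival rate over a period of \(n\) years, \(M_a\) its net migrants over the period, \(W_a(t)\) the number of women in cohort \(a\), and \(F_a\) their births per woman over the period.

\[
P_{a+1}(t+n) = P_a(t)\,S_a + M_a,
\qquad
P_0(t+n) = \sum_a F_a\,W_a(t)
\]

Applying the same two equations to the result gives the population at \(t+2n\), and so on.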

Mortality

  • Mortality data is generally presented in life tables
  • Life tables show the probability of death in the next period for people of a particular age

Life tables

Age | Probability of dying in interval | Number surviving to start of interval | Person-years lived in interval
0–5 | 0.0072 | 100,000 | 496,629
5–10 | 0.0004 | 99,259 | 496,149
10–15 | 0.0004 | 99,207 | 495,916
15–20 | 0.0011 | 99,151 | 495,437
20–25 | 0.0019 | 99,007 | 494,455
25–30 | 0.0034 | 98,749 | 492,732
30–35 | 0.0040 | 98,325 | 490,398
35–40 | 0.0053 | 97,817 | 487,501
40–45 | 0.0067 | 97,160 | 483,817
45–50 | 0.0103 | 96,315 | 478,552
50–55 | 0.0143 | 95,039 | 471,052
55–60 | 0.0231 | 93,252 | 459,695
60–65 | 0.0331 | 90,461 | 443,102
65–70 | 0.0452 | 86,613 | 421,051
70–75 | 0.0710 | 81,518 | 389,908
75–100 | 0.9610 | 73,955 | 911,918

Selected rows from the 2019 life table for females in North Carolina, US CDC

Mortality

Line plot showing elevated probability of death at year zero, low probabilities of death until around 50-60, and then increasing probabilities. Women have lower probabilities of death than men across the board.

Probability of death in next year, North Carolina, 2019 (CDC)

Calculating survival rates from life tables

  • The life table has the probability that someone on their \(x^{\mathrm{th}}\) birthday will make it to their \((x+n)^{\mathrm{th}}\) birthday
  • We don’t all have the same birthday
  • What we really want is a survival rate for a cohort—what is the probability that a randomly-selected person from anywhere in the cohort will survive another year/period?

Calculating survival rates from life tables

  • If no one in the cohort died, the person-years lived in one cohort and the next cohort would be the same
  • The ratio of person-years lived in the next cohort to this cohort is the survival rate
  • For instance, using the life table above, women aged 50-55 will live 471,052 person-years, and women 55-60 live 459,695
  • The survival rate is 459,695 / 471,052 ≈ 0.976

Applying survival rates in the cohort-component model

  • To project the size of the cohort in the next period, you multiply by the survival rate
  • In 2020, there were 70,022 women aged 50–55 in the Raleigh-Durham-Cary CSA
  • Assuming no migration, how many women 55–60 would we expect in 2025?
    • \(70,022 \times 0.976 \approx 68,341\)
  • We do this for each cohort in the model
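
The survival-rate calculation and its application can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, and the person-years values are copied from three rows of the life table above.

```python
# Person-years lived in each five-year interval, copied from the life table
# above (females, North Carolina, 2019). Only three rows are included.
person_years = {
    "50-55": 471_052,
    "55-60": 459_695,
    "60-65": 443_102,
}


def survival_rate(this_cohort: str, next_cohort: str) -> float:
    """Ratio of person-years lived in the next interval to this interval."""
    return person_years[next_cohort] / person_years[this_cohort]


def survive(population: float, this_cohort: str, next_cohort: str) -> float:
    """Expected number still alive one period later, ignoring migration."""
    return population * survival_rate(this_cohort, next_cohort)


rate_50_55 = survival_rate("50-55", "55-60")
women_55_60_in_2025 = survive(70_022, "50-55", "55-60")

print(f"survival rate 50-55 -> 55-60: {rate_50_55:.4f}")
print(f"expected women 55-60 in 2025: {women_55_60_in_2025:,.0f}")
```

Because the code keeps the unrounded ratio, its result can differ slightly from a hand calculation that rounds the rate to three decimals first.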

How do you forecast survival rates?

  • This is where demographic forecasting becomes an art
  • We know what survival rates are from past periods to now, but we don’t know what scientific breakthroughs/disasters/policy changes might occur in the next 50 years
  • Since you repeat the cohort-component model many times, any errors in survival rate will compound

Survival rates and major events

Line graph of total US population from 2018 to 2023. There is a significant flattening of population growth in 2020.

US population, 2018–2023. Source: FRED

Fertility

  • Fertility is one of two main ways populations grow
  • We measure the age-specific birth rate and multiply that by the number of women in each age group to get an estimate of births
  • These births are added directly to the first cohort in our next period

Age-specific birth rates in NC

Age-specific birth rates, North Carolina, 2017 (CDC)

Estimating fertility

  • There were 78,232 women aged 30-34 in the Raleigh-Durham-Cary, NC CSA in 2020
  • How many babies do we expect them to have in the next five years?
  • \(78,232 \times 95.4 \div 1000 \times 5 = 37,317\)
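
The same arithmetic as a small Python function, as a sketch. The function and parameter names are made up here; the rate of 95.4 births per 1,000 women per year is the one used in the calculation above.

```python
def expected_births(women: float, births_per_1000_per_year: float, years: int) -> float:
    """Expected births to a group of women over a number of years.

    Assumes the group size and the birth rate stay constant over the period.
    """
    return women * births_per_1000_per_year / 1000 * years


births = expected_births(women=78_232, births_per_1000_per_year=95.4, years=5)
print(f"expected births, 2020-2025: {births:,.0f}")
```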

Forecasting fertility

  • Again, this is difficult, with many factors
    • Economic conditions
    • Support for parents
    • Access to contraception, abortion, and other reproductive care
    • etc etc

Variations in fertility

  • We used state-level data on fertility
  • Do you think that accurately represents fertility in the Triangle?

Migration

  • Migration from/to outside the region is a critical driver of population change
    • especially in NC
  • Data on migration is somewhat hard to come by, relative to fertility and mortality data

Migration

  • There are two methods for accounting for migration in population projection: net migration and gross migration
  • Gross migration involves calculating both rates of inmigration and outmigration
  • Net migration involves just calculating the difference, which is easier with some datasets

Gross migration rates

  • The outmigration rate is the number of people who left the area divided by the number of people in the area at the start
  • The inmigration rate is the number of people who entered the area, generally divided by the population of the nation less the population of the area
    • i.e. the probability someone somewhere else in the US moves to the Triangle
  • Foreign immigration is often handled separately
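
A minimal sketch of the two gross rates, using the denominators described above. All of the counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def outmigration_rate(out_migrants: float, area_population_at_start: float) -> float:
    """Share of the area's starting population that left."""
    return out_migrants / area_population_at_start


def inmigration_rate(in_migrants: float, national_population: float,
                     area_population: float) -> float:
    """Share of people living elsewhere in the nation who moved to the area."""
    return in_migrants / (national_population - area_population)


# Hypothetical counts, not real data for any region.
area_population = 2_000_000
national_population = 330_000_000
out_migrants = 80_000
in_migrants = 120_000

out_rate = outmigration_rate(out_migrants, area_population)
in_rate = inmigration_rate(in_migrants, national_population, area_population)

print(f"outmigration rate: {out_rate:.4f}")
print(f"inmigration rate:  {in_rate:.6f}")
print(f"net migrants:      {in_migrants - out_migrants:,}")
```

The two rates are not comparable to each other because they have different denominators, which is why each is applied to its own base population.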

Migration

  • There are few data sources on gross migration
  • The ACS does ask about migration, but only on a one-year time horizon
  • Most population forecasting models use a longer time horizon
  • Because people can move more than once, aggregation is difficult

Migration

  • Net migration can be derived from the Census by taking population change and subtracting natural increase (births minus deaths)
  • In 2021, the Triangle had a net migration of +37,746
  • This leads to a net migration rate of 1.79% per year, or 9.29% per five-year period
    • This isn’t just \(1.79 \times 5\); annual growth compounds, so it is \(1.0179^5 - 1\)
    • This is very crude, demographers avert your gaze, but we’ll use it anyways
    • What do we ignore here? Migrants who die before the next period
  • If we were fancier, we might break down by age, etc.
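
The rate calculation can be checked with a short sketch. It assumes the denominator is the 2020 Triangle population of 2,106,463 quoted later in these slides; the variable names are made up here.

```python
net_migrants_2021 = 37_746
triangle_population_2020 = 2_106_463  # assumed denominator for the rate

# One-year net migration rate.
annual_rate = net_migrants_2021 / triangle_population_2020

# Compound the annual rate over five years instead of multiplying by five.
five_year_rate = (1 + annual_rate) ** 5 - 1
naive_five_year_rate = annual_rate * 5

print(f"annual rate:           {annual_rate:.2%}")
print(f"five-year, compounded: {five_year_rate:.2%}")
print(f"five-year, naive x5:   {naive_five_year_rate:.2%}")
```

The compounded rate is a little higher than five times the annual rate, because each year's migrants are added to the base for the following year.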

Migration: denominator

  • Smith, Tayman, and Swanson (2013) recommend basing the migration rate on the regional population for slow-growing regions, and on the adjusted national population (e.g. US population minus Triangle population) for fast-growing regions
  • In fast-growing regions more people are moving in than out, so the population of the origin is more relevant
  • You can apply migration rates to either the launch-year population or the survived population, as long as you calculate the migration rate from the same population you apply it to
  • Even though the Triangle is fast-growing, we’re going to use the regional population for simplicity
  • Otherwise we’d need to estimate the national population as well

Forecasting migration

  • Migration depends on a variety of economic factors domestically and internationally
  • For instance, high housing prices in coastal cities are likely partly driving growth in the Triangle
  • Forecasting this can be difficult
  • We’ll just assume constant migration

Migration

  • 2,106,463 people called the Triangle home in 2020
  • How many people do we expect to migrate by 2025?
    • \(2,106,463 \times 0.0929 = 195,690\)

Putting it all together

  • You run the mortality step, followed by migration, followed by fertility
  • Why?
    • Dead people can’t migrate or have babies
    • Inmigrants might have babies once they get here
    • Outmigrants can’t have babies here after they leave

Calculating survivors

  • Multiply each cohort by its corresponding survival rate from the life tables sheet
  • You can enter one formula and then just drag down

Calculating migration

  • Multiply each cohort’s starting value by 9.29%

Calculating fertility

  • Multiply each child-bearing cohort by the fertility rates in the fertility sheet
    • Remember to divide by 1000 and multiply by five to get a five-year birth rate per capita
    • Use the average of the launch-year population and the survived population (assume that women who died lived through half the interval)
    • Add the migration as well
    • Use the average of this cohort and the next cohort’s birth rates, because women will get older during the projection period
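
Here is one way the fertility step for a single cohort might look in Python. It is a sketch of one reading of the bullets above: "add the migration" is taken to mean adding all of the cohort's net migrants to the averaged population. All inputs are hypothetical.

```python
PERIOD_YEARS = 5


def cohort_births(launch: float, survived: float, migrants: float,
                  rate_this: float, rate_next: float) -> float:
    """Births to one cohort of women over the projection period.

    rate_this and rate_next are annual births per 1,000 women for this
    cohort and the next-older cohort.
    """
    # Women who died are assumed to have lived through half the interval,
    # so average the launch-year and survived populations, then add migrants.
    women_at_risk = (launch + survived) / 2 + migrants
    # Women age into the next cohort during the period, so average the rates.
    average_rate = (rate_this + rate_next) / 2
    return women_at_risk * average_rate / 1000 * PERIOD_YEARS


# Hypothetical inputs; none of these numbers come from the slides.
births = cohort_births(launch=10_000, survived=9_900, migrants=900,
                       rate_this=100.0, rate_next=60.0)
print(f"births over the period: {births:,.0f}")
```

For the oldest child-bearing cohort there is no next-older rate to average with; passing 0 for `rate_next` is one possible choice, but that is an assumption rather than something the slides specify.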

Finishing the projection

  • Add up survivors and migrants for each cohort, then move the total to the next-older cohort
  • For the oldest cohort, add survivors and migrants from both the oldest and second-oldest cohorts in the launch year
  • For the youngest cohorts, sum births across all child-bearing cohorts and multiply by 0.5 (we’re assuming births are completely balanced by sex)
  • Ideally, we would also apply a survival rate to these births
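
The final assembly can be sketched as below for one sex. This is an illustration with toy numbers, not the spreadsheet used in the lecture; the function name and inputs are made up, and like the steps above it does not apply a survival rate to births.

```python
def next_period(launch: list[float], survival_rates: list[float],
                migration_rate: float, total_births: float) -> list[float]:
    """Project one sex forward by one period.

    launch and survival_rates run from the youngest to the oldest cohort;
    the last cohort is open-ended. total_births covers both sexes.
    """
    survived = [p * s for p, s in zip(launch, survival_rates)]
    # Migration is applied to each cohort's launch-year population.
    migrants = [p * migration_rate for p in launch]
    moved = [s + m for s, m in zip(survived, migrants)]

    projected = [0.0] * len(launch)
    # Half of all births enter the youngest cohort for this sex.
    projected[0] = total_births * 0.5
    # Every cohort except the oldest moves up one slot.
    for i in range(len(launch) - 1):
        projected[i + 1] += moved[i]
    # The open-ended oldest cohort also keeps its own survivors and migrants.
    projected[-1] += moved[-1]
    return projected


# Toy inputs, chosen so the result is easy to check by hand.
projected = next_period(
    launch=[100.0, 100.0, 100.0],
    survival_rates=[1.0, 0.9, 0.5],
    migration_rate=0.1,
    total_births=40.0,
)
print(projected)
```

With these toy inputs the youngest cohort is half of the 40 births, the middle cohort is the youngest launch cohort plus its migrants, and the oldest cohort combines the two oldest launch cohorts.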

Finishing the projection

  • I get 2,358,959 in the Triangle in 2025
  • The State Demographer estimates 2,303,567
  • Not bad for a back-of-the-envelope forecast in Excel

Special events and populations

  • In some areas, you may need to account for certain special events or populations
    • e.g. New Orleans post-Katrina, or the pandemic across the US - extrapolating from these events could give incorrect answers
  • In Fayetteville, population dynamics might be different due to the large military presence (folks are assigned there)
  • College towns may have similar dynamics
  • Often these populations are modeled separately

Sensitivities

  • Forecasts can be sensitive to the values used, so think carefully about this
  • This applies not only to your projected rates, but also to which group you apply them to
    • e.g. recall how we didn’t just use the fertility rate for each cohort, but averaged them, because some people will age out of the cohort during the projection period

References

Smith, Stanley K., Jeff Tayman, and David A. Swanson. 2013. A Practitioner’s Guide to State and Local Population Projections. Vol. 37. The Springer Series on Demographic Methods and Population Analysis. Dordrecht: Springer Netherlands. https://doi.org/10.1007/978-94-007-7551-0.

Creative Commons License
This work by Matthew Bhagat-Conway is licensed under a Creative Commons Attribution 4.0 International License.