My First Four-Step Model

A simple and accessible introduction to travel demand modeling

Matt Bhagat-Conway

University of North Carolina at Chapel Hill

What is a travel demand model?

  • Large collection of econometric models that predict how people will use the transportation system, based on forecasted demographics and changes to the transportation network
  • All US regions have one
  • Used to prioritize billions in transportation funding
  • The most common model is the four-step model

Why introduce modeling to planners?

  • Most planners will never use a model directly
  • However, most planners will be consumers of model output
  • Giving planners more experience with modeling will improve communication with modelers
    • Provide a “healthy skepticism” of model results, but also
    • Understand what the model can and can’t do
    • Understand how the model can fit into planning processes
    • Think of novel ways to use models

Typical experience of planners with models

An image of a 'pile of linear algebra' with an input and an output, and text 'This is your machine learning system? Yup! You pour the data into this big pile of linear algebra, then collect the answers on the other side. What if the answers are wrong? Just stir the pile until they start looking right.'

© xkcd

How we usually teach modeling

  1. Take transportation planning
  2. Take statistics
  3. Take econometrics
  4. Take choice modeling
  5. Take GIS
  6. Work with component models (mode choice, destination choice, etc.)
  7. Actually run a model (optional)

An alternate approach

  1. Actually run a model
  2. Take transportation planning
  3. Take statistics
  4. Take econometrics (optional)
  5. Take choice modeling (optional)
  6. Take GIS (optional)
  7. Work with component models (mode choice, destination choice, etc.) (optional)

How do you run a model first?

  • In 720, I do one lecture on modeling
  • Then, every student runs a very simple model and interprets the inputs and outputs

My First Four Step Model

  • I use My First Four Step Model, an R package I developed for implementing very simple models
  • Running the model only requires R and minimal computing power, so students can run it on their laptops
    • Even Chromebooks!
  • The four steps of the model map directly onto four functions in the package

The four steps

  • Trip generation: how many trips originate or terminate in each zone
  • Trip distribution: how many of those trips travel between each pair of zones
  • Mode choice: what modes do they use (bus, drive, walk, etc.)
  • Network assignment: what routes do they take, and what congestion levels result

Running the model: trip generation

  • The entire trip generation process happens with one function, trip_generation
  • I have students interpret
    • coefficients in the regression models
    • maps of model output
# Run trip generation
trip_ends = trip_generation(model, model$scenarios$baseline)
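
To show what "interpreting coefficients" can look like, here is a minimal sketch of a household-level trip generation regression in base R. The data, variable names (hbw_trips, workers, vehicles), and model specification are all hypothetical and made up for illustration; the package's actual regression specification is not shown in these slides.

```r
# Hypothetical toy survey data: one row per household (NOT from the package)
households <- data.frame(
  workers   = c(1, 1, 2, 1, 2, 2, 2, 0),
  vehicles  = c(0, 1, 2, 1, 2, 3, 2, 1),
  hbw_trips = c(1, 1, 2, 1, 2, 3, 2, 0)  # AM Peak home-based work trips
)

# Linear regression: trips produced as a function of household attributes
fit <- lm(hbw_trips ~ workers + vehicles, data = households)

# Each coefficient is the change in predicted trips per one-unit change
# in that attribute, holding the other attribute constant
print(coef(fit))

# Zone-level productions: sum the predictions over the zone's households
zone_households <- data.frame(workers = c(2, 2, 1), vehicles = c(3, 2, 1))
zone_productions <- sum(predict(fit, newdata = zone_households))
print(zone_productions)
```

A positive coefficient on workers would mean that, in this toy data, households with more workers produce more work trips, which is the kind of statement students can check against their intuition.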

Trip generation results

AM Peak home-based work trip productions and attractions. Productions are spread across the region, whereas attractions are more concentrated.

Running the model: trip distribution

  • Trip distribution is likewise a single function
  • I have students interpret
    • coefficients in the model
    • maps of trip destinations from a tract of their choice
flows = trip_distribution(model, model$scenarios$baseline, trip_ends)
AM Peak trip distribution from a census tract in Carrboro, NC (near the UNC campus); most trips go to nearby destinations, but some go to further-flung large employment centers in Durham, Raleigh, and Research Triangle Park
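
The pattern in the map (many short trips, fewer long trips to large job centers) is what a gravity-style model produces. The sketch below is a production-constrained gravity model in base R. It is an assumption for illustration only: the slides do not say which functional form the package uses, and the zones, travel times, and beta value here are invented.

```r
# Hypothetical three-zone example (values invented for illustration)
productions <- c(A = 100, B = 50, C = 20)   # trips starting in each zone
attractions <- c(A = 30,  B = 60, C = 80)   # relative pull of each zone

# Travel time in minutes from row zone to column zone
travel_time <- matrix(
  c( 5, 15, 30,
    15,  5, 20,
    30, 20,  5),
  nrow = 3, byrow = TRUE,
  dimnames = list(names(productions), names(attractions))
)

beta <- 0.1                                  # assumed distance-decay parameter
friction <- exp(-beta * travel_time)         # longer trips are less attractive

# Weight of destination j from origin i: A_j * f(c_ij)
weights <- sweep(friction, 2, attractions, `*`)

# Share of origin i's trips going to each destination (rows sum to 1)
shares <- weights / rowSums(weights)

# Multiply row i by P_i, so every production is assigned to some destination
flows <- shares * productions
print(round(flows, 1))
```

A larger beta makes travelers more sensitive to travel time and concentrates trips on nearby zones; a smaller beta spreads them toward the biggest attractions regardless of distance.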

Understanding mode choice

  • Mode choice uses a multinomial logit model, which I explain only briefly but still have students interpret
  • I have students interpret the mode shares as well
flows_by_mode = mode_choice(model, model$scenarios$baseline, flows)
 Car  Bike  Walk  Transit
0.92  0.01  0.05     0.03
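
For readers who want to see where shares like these come from, the following is a minimal sketch of the multinomial logit formula in base R. The alternative-specific constants, the travel time coefficient, and the travel times are hypothetical values chosen for illustration; they are not the package's estimated coefficients and will not reproduce the shares above.

```r
# Hypothetical inputs for one origin-destination pair (invented values)
travel_time <- c(Car = 15, Bike = 40, Walk = 90, Transit = 45)  # minutes
asc    <- c(Car = 0, Bike = -2.0, Walk = -1.0, Transit = -1.5)  # mode constants
b_time <- -0.05                                                  # per minute

# Systematic utility of each mode: constant plus time penalty
utility <- asc + b_time * travel_time

# Multinomial logit: P(mode) = exp(V_mode) / sum over modes of exp(V)
# Subtracting max(v) leaves the result unchanged but avoids overflow
mnl_shares <- function(v) {
  ev <- exp(v - max(v))
  ev / sum(ev)
}

shares <- mnl_shares(utility)
print(round(shares, 2))
```

Because b_time is negative, slowing any mode down lowers its utility and shifts share to the other modes, which is the main relationship students need in order to read the coefficients.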

Assignment

  • Traffic assignment is also a single function, and we map the results
pm_network_flows = network_assignment(
    model,
    model$scenarios$baseline,
    model$networks$baseline,
    flows_by_mode,
    "PM Peak"
)
Map of forecast PM Peak congestion, with heavy congestion on some major routes and light congestion elsewhere.

Forecast congestion, PM Peak
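
The link between assigned traffic and congestion is commonly described with a volume-delay function. The sketch below uses the standard Bureau of Public Roads (BPR) form with its usual default parameters (alpha = 0.15, beta = 4). Using BPR here is an assumption for illustration; the slides do not say how the package's network_assignment function computes congestion, and the volumes and capacity are invented.

```r
# BPR volume-delay function: congested travel time on a link
# t = t0 * (1 + alpha * (volume / capacity)^beta)
bpr_travel_time <- function(free_flow_time, volume, capacity,
                            alpha = 0.15, beta = 4) {
  free_flow_time * (1 + alpha * (volume / capacity)^beta)
}

# Hypothetical link: 10 minutes at free flow, capacity 2000 vehicles/hour
volumes <- c(0, 1000, 2000, 3000)
times <- bpr_travel_time(10, volumes, capacity = 2000)
print(data.frame(volume = volumes, minutes = times))

# Adding capacity (for example, an extra lane) lowers travel time at the
# same volume, which in turn can attract more trips to the link
widened <- bpr_travel_time(10, 3000, capacity = 3000)
print(widened)
```

At volume equal to capacity the function gives 11.5 minutes for this link (10 × 1.15), and delay grows steeply beyond that point because of the fourth-power term.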

Scenarios

  • Models are most useful to evaluate scenarios
  • I have students evaluate a scenario based on Chatham Park, adding 20,000 households to Pittsboro
model$scenarios$future = model$scenarios$baseline |>
  add_households(
    "37037020801",
    tribble(
      ~hhsize, ~workers, ~vehicles, ~income, ~n,
      4,       2,        3,         150000,  10000,
      4,       2,        2,         75000,   10000
    )
  )

Chatham Park: network assignment output

Forecast congestion levels after adding 20,000 households, PM Peak

Network scenarios

  • I also have students evaluate the impacts of widening US 15-501, and use this to discuss induced demand
model$networks$widen = model$networks$baseline |>
  modify_ways(
    # US 15-501 between Pittsboro and Chapel Hill
    c(
      "16468788", "133051274", "16471803", "285898984",
      . . .
      "712336821", "712336826", "712336827", "998595932"
    ),
    lanes_per_direction=3,
    highway_type="motorway"
  )

Forecast congestion, PM Peak, with widened 15-501.

Conclusion

All models are wrong, but some are useful.

—George Box

 

This one is very wrong, but that makes it more useful.

— George Box, from beyond the grave

Things to ponder

  • Is the balance of simplicity and accuracy appropriate?
  • Is more background needed before assigning this?
  • How do I transition from this into more advanced modeling?
  • Where do I publish this?