
This vignette shows how to read GTFS-realtime trip updates into R using the {gtfsrealtime} package. Trip updates describe the real-time progress of a vehicle along a scheduled trip, including predicted arrival and departure times, delays, skipped stops, canceled trips, and other real-time information. The {gtfsrealtime} package reads the nested GTFS-realtime trip update format and flattens it into a data frame that is easier to inspect and analyze in R.

Load libraries

First, we load {gtfsrealtime} to read GTFS-realtime files and {dplyr} to inspect and summarize the resulting data frame.
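
A minimal sketch of that step, assuming both packages are already installed:

```r
# Read GTFS-realtime files into data frames.
library(gtfsrealtime)

# Provides filter(), select(), glimpse(), and first(), used below.
library(dplyr)
```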

Load a GTFS-realtime trip updates feed

This example uses a New York City trip updates feed included with {gtfsrealtime}. The file is compressed with bzip2 to save space. {gtfsrealtime} can automatically detect and read uncompressed files as well as files compressed with zip, gzip, or bzip2. A zip archive can contain multiple GTFS-realtime files, in which case {gtfsrealtime} reads all of them; the file_index column identifies which file each update came from.

GTFS-realtime time values are stored as Unix timestamps, which are interpreted relative to UTC. To convert them to local time, we pass a local time zone as the second argument to read_gtfsrt_trip_updates(). Time zones are given as TZ database names, generally of the form Continent/City (for example, America/New_York). If you do not want to convert times, you can specify a time zone of Etc/UTC.

updates <- read_gtfsrt_trip_updates(
  system.file("nyc-trip-updates.pb.bz2", package = "gtfsrealtime"),
  "America/New_York"
)
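
To make the time zone conversion concrete, the sketch below uses base R rather than {gtfsrealtime} itself. The timestamp value is an illustrative assumption, not a value read from the feed; the point is only that one Unix timestamp names a single instant that prints differently in UTC and in America/New_York.

```r
# Illustrative Unix timestamp (seconds since 1970-01-01 00:00:00 UTC).
# This value is an assumption for demonstration, not taken from the feed.
ts <- 1769638400

# Interpret the timestamp as an instant in time, anchored to UTC.
instant <- as.POSIXct(ts, origin = "1970-01-01", tz = "UTC")

# Render the same instant in two time zones.
utc_text   <- format(instant, "%Y-%m-%d %H:%M:%S", tz = "UTC")
local_text <- format(instant, "%Y-%m-%d %H:%M:%S", tz = "America/New_York")

utc_text    # "2026-01-28 22:13:20"
local_text  # "2026-01-28 17:13:20"
```

New York is five hours behind UTC in January, which accounts for the difference between the two strings.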

When reading this example feed, {gtfsrealtime} warns that some GTFS-realtime entity IDs are duplicated. In these cases, the package appends suffixes such as _duplicated_1 so that each entity has a unique id. The feed produces many such warnings, so they are suppressed here to keep the vignette readable, but the first two are:

1: ! ID UP_A6-Weekday-SDon-094800_B6_243 is duplicated. Replacing with UP_A6-Weekday-SDon-094800_B6_243_duplicated_1 . This may cause joins between different
  GTFS-realtime files (even within a ZIP archive) to be incorrect.
2: ! ID UP_A6-Weekday-SDon-094800_B6_243 is duplicated. Replacing with UP_A6-Weekday-SDon-094800_B6_243_duplicated_2 . This may cause joins between different
  GTFS-realtime files (even within a ZIP archive) to be incorrect.

These warnings matter in practice: once an ID has been renamed, it may no longer match the corresponding ID in another GTFS-realtime file, including other files within the same ZIP archive, so joins on the ID can be incorrect.
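
One way to see which IDs were renamed is to look for the suffix directly. The sketch below runs on a small hypothetical vector of IDs in base R, so it does not need the feed. It assumes that no ID in the feed legitimately ends in _duplicated_ followed by digits, which may not hold for every feed.

```r
# Hypothetical id values; the suffix pattern follows the warnings shown above.
ids <- c(
  "UP_A6-Weekday-SDon-094800_B6_243",
  "UP_A6-Weekday-SDon-094800_B6_243_duplicated_1",
  "UP_A6-Weekday-SDon-094800_B6_243_duplicated_2",
  "MV_A6-Weekday-SDon-102600_M96_826"
)

# Matches the suffix appended to duplicated ids.
suffix_pattern <- "_duplicated_[0-9]+$"

# Flag the ids that were renamed.
was_renamed <- grepl(suffix_pattern, ids)

# Strip the suffix to recover the id as it appeared in the feed.
original_ids <- sub(suffix_pattern, "", ids)

sum(was_renamed)      # 2
unique(original_ids)  # the two distinct ids from the feed
```

The same pattern can be applied to the id column of the data frame returned by read_gtfsrt_trip_updates(), for example with filter(grepl(suffix_pattern, id)).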

Explore trip updates

GTFS-realtime trip updates are hierarchical; one trip update can contain information about the trip as a whole as well as updates for multiple stops along that trip. read_gtfsrt_trip_updates() flattens that structure into a data frame. As a result, the same trip_id may appear in multiple rows when the feed contains stop-level updates for multiple stops.

glimpse(updates)
#> Rows: 90,881
#> Columns: 26
#> $ id                            <chr> "MV_A6-Weekday-SDon-102600_M96_826", "MV…
#> $ trip_id                       <chr> "MV_A6-Weekday-SDon-102600_M96_826", "MV…
#> $ route_id                      <chr> "M96", "M96", "M96", "M96", "M96", "M96"…
#> $ direction_id                  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1…
#> $ start_time                    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
#> $ start_date                    <chr> "20260128", "20260128", "20260128", "202…
#> $ trip_schedule_relationship    <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
#> $ modifications_id              <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
#> $ vehicle_id                    <chr> "MTA NYCT_9771", "MTA NYCT_9771", "MTA N…
#> $ vehicle_label                 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
#> $ vehicle_license_plate         <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
#> $ vehicle_wheelchair_accessible <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
#> $ stop_sequence                 <dbl> 1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 1, 2, 3,
#> $ stop_id                       <chr> "401933", "401935", "401936", "401937", 
#> $ arrival_delay                 <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
#> $ arrival_time                  <dttm> 2026-01-28 17:13:20, 2026-01-28 17:14:1…
#> $ arrival_scheduled_time        <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
#> $ arrival_uncertainty           <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
#> $ departure_delay               <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
#> $ departure_time                <dttm> 2026-01-28 17:13:20, 2026-01-28 17:14:1…
#> $ departure_scheduled_time      <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
#> $ departure_uncertainty         <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
#> $ departure_occupancy_status    <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
#> $ stop_schedule_relationship    <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
#> $ file_timestamp                <dttm> 2026-01-28 17:13:34, 2026-01-28 17:13:3…
#> $ file_index                    <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
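
The effect of flattening can be sketched on a toy data frame. The trip and stop IDs below are hypothetical. trip_a has three stop-level updates and so occupies three rows, while trip_b stands for a trip update without stop time updates, which the flattened output represents as a single row with NA stop fields. Base R is used here so the sketch runs without the feed.

```r
# Toy stand-in for the flattened output; all values are hypothetical.
toy <- data.frame(
  trip_id       = c("trip_a", "trip_a", "trip_a", "trip_b"),
  stop_sequence = c(1, 2, 3, NA),
  stop_id       = c("s1", "s2", "s3", NA)
)

# Number of rows each trip occupies after flattening.
rows_per_trip <- table(toy$trip_id)

# Number of actual stop-level updates per trip (rows with a stop_id).
stop_updates_per_trip <- tapply(!is.na(toy$stop_id), toy$trip_id, sum)

rows_per_trip          # trip_a: 3, trip_b: 1
stop_updates_per_trip  # trip_a: 3, trip_b: 0
```

With the real data, count(updates, trip_id) from {dplyr} gives the equivalent of rows_per_trip.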

Inspect one trip across its stops

Because a single trip can include predictions for multiple stops, it is useful to inspect all rows associated with one trip_id. In the example below, we select the first trip in the feed and display the route, stop ID, stop sequence, predicted arrival and departure times, and delays for each stop. If a trip update has no stop time updates, it will appear as a single row with all the stop_* fields NA. All of the columns are described in the documentation for read_gtfsrt_trip_updates().

updates |>
  filter(trip_id == first(trip_id)) |>
  select(
    trip_id,
    route_id,
    stop_id,
    stop_sequence,
    arrival_time,
    departure_time,
    arrival_delay,
    departure_delay
  )
#>                              trip_id route_id stop_id stop_sequence
#> 1  MV_A6-Weekday-SDon-102600_M96_826      M96  401933             1
#> 2  MV_A6-Weekday-SDon-102600_M96_826      M96  401935             3
#> 3  MV_A6-Weekday-SDon-102600_M96_826      M96  401936             4
#> 4  MV_A6-Weekday-SDon-102600_M96_826      M96  401937             5
#> 5  MV_A6-Weekday-SDon-102600_M96_826      M96  404087             6
#> 6  MV_A6-Weekday-SDon-102600_M96_826      M96  401939             7
#> 7  MV_A6-Weekday-SDon-102600_M96_826      M96  401941             8
#> 8  MV_A6-Weekday-SDon-102600_M96_826      M96  401942             9
#> 9  MV_A6-Weekday-SDon-102600_M96_826      M96  401943            10
#> 10 MV_A6-Weekday-SDon-102600_M96_826      M96  903003            11
#>           arrival_time      departure_time arrival_delay departure_delay
#> 1  2026-01-28 17:13:20 2026-01-28 17:13:20            NA              NA
#> 2  2026-01-28 17:14:15 2026-01-28 17:14:15            NA              NA
#> 3  2026-01-28 17:15:58 2026-01-28 17:15:58            NA              NA
#> 4  2026-01-28 17:17:42 2026-01-28 17:17:42            NA              NA
#> 5  2026-01-28 17:21:41 2026-01-28 17:21:41            NA              NA
#> 6  2026-01-28 17:24:30 2026-01-28 17:24:30            NA              NA
#> 7  2026-01-28 17:27:43 2026-01-28 17:27:43            NA              NA
#> 8  2026-01-28 17:30:05 2026-01-28 17:30:05            NA              NA
#> 9  2026-01-28 17:32:03 2026-01-28 17:32:03            NA              NA
#> 10 2026-01-28 17:33:38 2026-01-28 17:33:38            NA              NA
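
As one possible next step, the predicted times can be differenced to get the predicted time between consecutive stop updates. The sketch below copies the first four arrival times from the output above into base R, so it runs without the feed; the variable names are illustrative.

```r
# Predicted arrival times for the first four rows, copied from the output above.
arrival_time <- as.POSIXct(
  c(
    "2026-01-28 17:13:20",
    "2026-01-28 17:14:15",
    "2026-01-28 17:15:58",
    "2026-01-28 17:17:42"
  ),
  tz = "America/New_York"
)

# Seconds between each predicted arrival and the next one.
seconds_between <- diff(as.numeric(arrival_time))

seconds_between  # 55 103 104
```

Since arrival_delay and departure_delay are NA for this trip, differences between predicted times are what the feed supports here; comparing against the schedule would need scheduled times from another source.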