Project information

  • Category: Data Science
  • Data: Public transport passenger smart card data
  • Project: EU Project SETA
  • Methods: Data Compression, Least Squares, Estimation, 3D Maps
  • Technologies: Python, pandas, scipy, numpy
  • Journal: Transportation Research Part C

Project details

This method uses smart card data to attribute passenger delays to specific components of a transit network, such as stations and track segments. Each passenger's delay is decomposed into track segment delays, initial waiting time, and transfer delays, which yields a solvable system of equations for the delay of each component. Tested on one year of data from the Washington, D.C. metro, the method compresses millions of passenger trajectories into 3D networks, reducing dimensionality by 94%. The estimation error is below five seconds per passenger for 90% of travelers. The resulting estimates can reveal recurring delay patterns, model delay propagation, and detect disruptions, supporting more strategic transit management.
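
The decomposition described above can be illustrated with a minimal sketch. It is not the project's actual implementation: the three-station network, the component names, and the delay values are all made up for illustration, and segment delays are assumed to be the same in both directions so that the toy system has a unique solution. Each row of the matrix marks the components one trip type passes through, and ordinary least squares (via `numpy.linalg.lstsq`) recovers a delay per component from the observed trip delays.

```python
import numpy as np

# Hypothetical toy network: stations A, B, C on a single line.
# Assumption: one waiting-time component per station and one delay
# component per (undirected) track segment.
components = ["wait_A", "wait_B", "wait_C", "seg_AB", "seg_BC"]
col = {name: j for j, name in enumerate(components)}

# Each trip type lists the components a passenger passes through.
trips = {
    "A->B": ["wait_A", "seg_AB"],
    "B->A": ["wait_B", "seg_AB"],
    "A->C": ["wait_A", "seg_AB", "seg_BC"],
    "C->B": ["wait_C", "seg_BC"],
    "B->C": ["wait_B", "seg_BC"],
    "C->A": ["wait_C", "seg_BC", "seg_AB"],
}


def build_incidence(trips, col):
    """Return a 0/1 matrix with one row per trip type, one column per component."""
    A = np.zeros((len(trips), len(col)))
    for i, used in enumerate(trips.values()):
        for name in used:
            A[i, col[name]] = 1.0
    return A


A = build_incidence(trips, col)

# Made-up component delays in seconds, used only to synthesize observations.
true_delay = np.array([30.0, 10.0, 0.0, 45.0, 20.0])

# Observed delay per trip type = sum of the delays of its components
# (noise-free here to keep the example deterministic).
observed = A @ true_delay

# Least-squares estimate of the per-component delays.
estimate, residuals, rank, _ = np.linalg.lstsq(A, observed, rcond=None)

for name, value in zip(components, estimate):
    print(f"{name}: {value:.1f} s")
```

In this sketch, transfer delays would enter the same way, as additional columns for trips that change trains. With real, noisy data the system is overdetermined, and a bounded solver such as `scipy.optimize.lsq_linear` could be used instead if constraints on the delays are wanted.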