Project information

  • Category: Machine Learning
  • Data: Open source NYC ridership data & Twitter data
  • Project: Work of PhD student - Yanyan
  • Methods: Text Mining, Topic Modelling, BERT, Clustering
  • Technologies: Python, sklearn, nltk, bertopic, sentence_transformers, wordcloud
  • Conference/Journal: MT-ITS Conference 2023
  • URL: Proceedings not yet published

Project details

This study uses social media data and Natural Language Processing (NLP) techniques to understand public attitudes towards public transportation during the COVID-19 pandemic. Drawing on more than 500K tweets from New York City (2019-2022), it applies text mining, topic modelling, and sentiment analysis to characterize public reactions. The analysis finds a generally negative sentiment towards public transit and identifies five key topics related to COVID-19. These findings can help policymakers and transit managers make better-informed decisions. More broadly, the research shows how NLP and social media data can help track changing travel behaviour and support policy-making in the transportation sector.
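
As a rough illustration of the text-mining step mentioned above, the sketch below cleans a handful of tweets and counts the most frequent terms. It is not the project's actual pipeline: the stopword list, the example tweets, and the helper names `clean_tweet` and `top_terms` are all hypothetical, and only the Python standard library is used so that the example stays self-contained.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real pipeline would
# more likely use a full list such as the one shipped with nltk).
STOPWORDS = {"the", "a", "an", "is", "are", "on", "in", "to", "of",
             "and", "so", "my", "this", "i", "it", "for", "at"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def clean_tweet(text):
    """Lowercase a tweet, drop URLs and @mentions, and return content tokens."""
    text = URL_RE.sub(" ", text.lower())
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")  # keep the hashtag word, drop the symbol
    return [tok for tok in TOKEN_RE.findall(text) if tok not in STOPWORDS]


def top_terms(tweets, k=5):
    """Return the k most frequent tokens across all tweets as (term, count) pairs."""
    counts = Counter()
    for tweet in tweets:
        counts.update(clean_tweet(tweet))
    return counts.most_common(k)


# Made-up example tweets, not taken from the study's dataset.
SAMPLE_TWEETS = [
    "The subway is so crowded today, no masks anywhere #MTA https://example.com/x",
    "@friend the subway was delayed again, masks required on the bus",
    "Empty subway car this morning #covid19",
]

print(top_terms(SAMPLE_TWEETS, k=2))  # [('subway', 3), ('masks', 2)]
```

Term counts like these are only a starting point. In a study of this kind, the cleaned text would typically be passed on to embedding-based topic models and sentiment classifiers instead of being analysed by raw frequency alone.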