Identifying health-related topics on Twitter: an exploration of tobacco-related tweets as a test topic

  • Authors:
  • Kyle W. Prier;Matthew S. Smith;Christophe Giraud-Carrier;Carl L. Hanson

  • Affiliations:
  • Department of Health Science, Brigham Young University, Provo, UT;Department of Computer Science, Brigham Young University, Provo, UT;Department of Computer Science, Brigham Young University, Provo, UT;Department of Health Science, Brigham Young University, Provo, UT

  • Venue:
  • SBP'11 Proceedings of the 4th international conference on Social computing, behavioral-cultural modeling and prediction
  • Year:
  • 2011

Abstract

Public health-related topics are difficult to identify in large conversational datasets like Twitter. This study examines how to model and discover public health topics and themes in tweets. Tobacco use is chosen as a test case to demonstrate the effectiveness of topic modeling via latent Dirichlet allocation (LDA) across a large, representational dataset from the United States, as well as across a smaller subset that was seeded by tobacco-related queries. Topic modeling across the large dataset uncovers several public health-related topics, although tobacco is not detected by this method. However, topic modeling across the tobacco subset provides valuable insight into tobacco use in the United States. The methods used in this paper provide a possible toolset for public health researchers and practitioners to better understand public health problems through large datasets of conversational data.
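
The two-stage approach described in the abstract (seed a subset with topic-specific queries, then fit a topic model to it) can be illustrated with a minimal sketch. This is not the authors' implementation: the seed terms, the tokenizer, the hyperparameters, and the function names `tokenize`, `seeded_subset`, `lda_gibbs`, and `top_words` are all assumptions made for illustration, and the topic model here is a small collapsed Gibbs sampler for LDA written in plain Python rather than whatever toolkit the paper used.

```python
import random
from collections import Counter

# Hypothetical seed terms; the paper's actual query list is not given here.
SEED_TERMS = {"smoking", "cigarette", "cigarettes", "tobacco", "hookah"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()

def seeded_subset(tweets, seed_terms=SEED_TERMS):
    """Keep only tweets that contain at least one seed term."""
    return [t for t in tweets if seed_terms & set(tokenize(t))]

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized docs by collapsed Gibbs sampling.

    Returns one Counter of word counts per topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]   # topic counts per document
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics                 # tokens assigned per topic
    assign = []                                  # topic of every token

    # Random initial topic assignment for each token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word

def top_words(topic_word, n=3):
    """List the n most frequent words of each topic."""
    return [[w for w, _ in tw.most_common(n)] for tw in topic_word]
```

A possible usage, on a list of tweet strings named `tweets`: call `seeded_subset(tweets)`, tokenize each retained tweet, pass the result to `lda_gibbs` with a chosen number of topics, and inspect `top_words` of the returned counts. On real data one would also remove stop words and use a far larger corpus; a toy sampler like this one is only meant to make the seeded-subset idea concrete.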