Leveraging Social Media to Understand and Change Policy Around Emerging Forms of Tobacco Use

Principal Investigator(s)

Brian Primack, MD, PhD

Tobacco use remains the leading cause of preventable death and disease in the US and in Pennsylvania. Nearly 250,000 Pennsylvania youth now aged 17 or under will ultimately die prematurely from smoking. Additionally, tobacco use is responsible for over $10 billion in annual direct and indirect health care costs in Pennsylvania alone. While surveillance and intervention efforts can help to prevent tobacco addiction, they are often limited to traditional forms of tobacco consumption. Given the rapid emergence and influence of new forms of tobacco use (including electronic cigarettes, hookah tobacco smoking, and snus), reducing tobacco-related morbidity and mortality requires innovative paradigms for real-time data collection, analysis, and translation to intervention that leverage the power of newly available social media technologies.


To address these concerns, this pilot project aims to leverage social media to examine changing and emerging forms of tobacco use in order to support the policy and intervention activities of the Division of Tobacco Prevention and Control (DTPC) within the Pennsylvania Department of Health. The project will proceed in three stages. First, a multidisciplinary research team will generate a comprehensive list of Twitter-optimized search strings related to various forms of tobacco and related substance use, with a focus on policy. Second, drawing on the first-stage results, researchers will develop a comprehensive set of specialized algorithms, based on state-of-the-art methods from Natural Language Processing (NLP), to perform automated analyses of this content. Third, using the output of the second-stage algorithms, researchers will apply Time Series Analysis (TSA) with seasonally adjusted Autoregressive Integrated Moving Average (ARIMA) models to characterize changes over time in messaging related to each search string and each coded variable of interest. This will allow for automated alerts on real-time changes in the volume and content of tweets related to particular substances. The team will also conduct exploratory analyses examining content more generally.
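As a rough illustration of the kind of automated alerting described in the third stage, the sketch below flags days on which tweet volume for a search string changes unusually. It is not the project's method and does not fit an ARIMA model: it uses only seasonal differencing (comparing each day with the same weekday one week earlier) and a simple threshold on the trailing standard deviation. The function names, the 7-day period, the 14-day window, the threshold of 3, and the daily counts are all hypothetical choices made for the example.

```python
from statistics import mean, stdev


def seasonal_difference(counts, period=7):
    """Subtract the value one season earlier (e.g. the same weekday last week)."""
    return [counts[i] - counts[i - period] for i in range(period, len(counts))]


def flag_alerts(counts, period=7, window=14, threshold=3.0):
    """Return indices into `counts` whose seasonally differenced value lies more
    than `threshold` standard deviations from the mean of the trailing window.

    This is a simplified stand-in for residual-based alerting from a fitted
    seasonal model; all parameter defaults are illustrative assumptions.
    """
    diffs = seasonal_difference(counts, period)
    alerts = []
    for j in range(window, len(diffs)):
        baseline = diffs[j - window:j]
        spread = stdev(baseline)
        if spread == 0:
            # A perfectly flat baseline gives no scale to compare against; skip it.
            continue
        z_score = (diffs[j] - mean(baseline)) / spread
        if abs(z_score) > threshold:
            # Shift back by `period` to index into the original series.
            alerts.append(j + period)
    return alerts


# Hypothetical daily tweet counts: a weekly pattern plus small deterministic
# noise over five weeks, with an artificial surge injected on day 30.
weekly_pattern = [100, 110, 105, 120, 130, 90, 80]
daily_counts = [weekly_pattern[i % 7] + ((i * 3) % 5 - 2) for i in range(35)]
daily_counts[30] += 60

print(flag_alerts(daily_counts))
```

In practice the seasonal model would be fitted with a statistics package and alerts would be raised on its forecast errors; the differencing step here only shows why removing the weekly cycle matters before judging whether a change in volume is unusual.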


This project will create three opportunities for translation that will allow for immediate and sustained impacts on public health policy and practice in Pennsylvania and across the US. First, initial results will lead to additional observational studies of particular interest and value to researchers and practitioners. Second, the study will allow for the creation and testing of specialized Web portals that feed surveillance data directly to community-based prevention and treatment organizations. Third, the study will allow for the development, implementation, and evaluation of interventions that leverage social media itself.