Date of Graduation


Document Type


Degree Type



Statler College of Engineering and Mineral Resources


Lane Department of Computer Science and Electrical Engineering

Committee Chair

Donald Asogu Adjeroh

Committee Co-Chair

Saiph Savage

Committee Member

Yanfang Fanny Ye


Recent advances in pharmacovigilance have shown that social media can serve as a resource for obtaining real-time signals for drug surveillance. Researchers have demonstrated good potential for detecting Adverse Drug Events (ADEs) from social media much earlier than the traditional reporting systems maintained by official regulatory authorities such as the United States Food and Drug Administration (FDA). Existing automated drug surveillance systems have used various types of social media channels and search query logs to monitor ADE signals.

In this thesis, we address two key performance issues in automated drug surveillance systems. The first is improving ADE signal detection by analyzing signals from multiple social media channels; the second is using semantic similarity to evaluate the ADE narratives detected by drug surveillance systems. Most current approaches for detecting ADEs from social media rely on a single channel: forums, microblogs, or query logs. In this study, we propose a new methodology to fuse signals from different social media channels. We use graphical causal models to discover potentially hidden connections between data channels, and then use these associations to generate signals for ADEs. Further, prior work has placed little emphasis on the language of healthcare consumers, which is often casual and informal when expressing health issues on social media. When ADE terms extracted from social media and terms from formal official narratives do not share exact text, the semantic similarity between them is likely to be missed. We therefore demonstrate the use of semantic similarity to improve the accuracy of detected ADEs, and we evaluate similarity measurement algorithms developed over biomedical vocabularies in the ADE surveillance domain. We experimented on a dataset of drugs with FDA black box warnings, in a retrospective analysis spanning the years 2008 to 2015. The results show a better detection rate and improved precision, recall, and timeliness using our proposed methods.