Semester
Summer
Date of Graduation
2020
Document Type
Dissertation
Degree Type
PhD
College
School of Pharmacy
Department
Pharmaceutical Systems and Policy
Committee Chair
Usha Sambamoorthi
Committee Co-Chair
Nilanjana Dwibedi
Committee Member
Traci J. LeMasters
Committee Member
Ranjita Misra
Committee Member
Danielle E. Rose
Abstract
There is robust evidence that heart failure (HF) is associated with substantial mortality, morbidity, poor health-related quality of life, healthcare utilization, and economic burden. Previous research has revealed sex differences in the epidemiology, etiology, and disease burden of HF. However, research on HF among women, especially postmenopausal women, is limited. To fill this knowledge gap, the three related aims of this dissertation were to: (1) identify knowledge gaps in HF research among women, especially postmenopausal women, using unsupervised machine learning methods and big data (i.e., articles indexed in PubMed); (2) identify emerging predictors (i.e., polypharmacy and some prescription medications) of incident HF among postmenopausal women using supervised machine learning methods; and (3) identify leading predictors of HF-related emergency room use among postmenopausal women using supervised machine learning methods with data from a large commercial insurance claims database in the United States. In the first aim, non-negative matrix factorization algorithms were used to cluster HF articles by primary topic. Clusters were independently validated and labeled by three investigators familiar with HF research. The most understudied area among women was atrial fibrillation. Among postmenopausal women, the most understudied topic was stress-induced cardiomyopathy. For the second and third aims, a retrospective cohort design and de-identified health insurance claims data from Optum’s Clinformatics® Data Mart Database (Optum, Eden Prairie, MN) were used. In the second aim, multivariable logistic regression and three machine learning classification algorithms (cross-validated logistic regression (CVLR), random forest (RF), and eXtreme Gradient Boosting (XGBoost)) were used to identify predictors of incident HF among postmenopausal women.
The associations of the leading predictors with incident HF were explored using SHapley Additive exPlanations (SHAP), an interpretable machine learning technique. The eight leading predictors of incident HF consistent across all models were: older age, arrhythmia, polypharmacy, Medicare, chronic obstructive pulmonary disease (COPD), coronary artery disease, hypertension, and chronic kidney disease. Some prescription medications, such as sulfonylureas and antibiotics other than fluoroquinolones, predicted incident HF in some machine learning algorithms. In the third aim, a random forest algorithm was used to identify predictors of HF-related emergency room use among postmenopausal women. Interpretable machine learning techniques were used to explain the association of leading predictors with HF-related emergency room use. The random forest algorithm had high predictive accuracy in the test dataset (area under the curve: 94%, sensitivity: 93%, specificity: 77%, and accuracy: 81%). We found that the number of HF-related emergency room visits at baseline, fragmented care, age, insurance type (Health Maintenance Organization), and coronary artery disease were the top five predictors of HF-related emergency room use among postmenopausal women. Partial dependence plots suggested positive associations of most of the top predictors with HF-related emergency room use; insurance type, however, was negatively associated with HF-related emergency room use. Findings from this dissertation suggest that machine learning algorithms can achieve comparable or better predictive accuracy than traditional statistical models.
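The test-set metrics quoted in the abstract (area under the curve, sensitivity, specificity, accuracy) can be illustrated with a minimal sketch. The function names, the 0.5 decision threshold, and the toy labels and scores below are assumptions made for illustration only; they are not the dissertation's code, data, or results.

```python
from typing import Dict, Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the share of positive/negative pairs in which the positive case
    receives the higher score (ties count as half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(y_true: Sequence[int],
                           y_score: Sequence[float],
                           threshold: float = 0.5) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy, and AUC for a binary outcome.
    The 0.5 threshold is an illustrative assumption."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / len(y_true),
        "auc": auc(y_true, y_score),
    }


# Toy example (hypothetical values): 1 = HF-related emergency room use.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1, 0.05]
metrics = classification_metrics(y_true, y_score)
```

Sensitivity, specificity, and accuracy all depend on the chosen threshold, whereas the area under the curve is computed from the score ranking alone. This is one reason a model can show a high area under the curve while its specificity and accuracy are lower.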
Recommended Citation
Alhussain, Khalid Abdullah, "Applications of Machine Learning Methods in Health Outcomes Research: Heart Failure in Women" (2020). Graduate Theses, Dissertations, and Problem Reports. 7979.
https://researchrepository.wvu.edu/etd/7979
Embargo Reason
Publication Pending