Date of Graduation


Document Type


Degree Type



Statler College of Engineering and Mineral Resources


Industrial and Management Systems Engineering

Committee Chair

Leily Farrokhvar

Committee Co-Chair

Behrooz Kamali

Committee Member

Kenneth Currie

Committee Member

Thorsten Wuest

Committee Member

Michael Russell


The availability of data and advanced data analysis tools in the health care domain provides great opportunities for the discovery of unknown or hidden patterns in clinical data. However, medical data is often incomplete and of poor quality, e.g., it contains missing values or too many unnecessary features. The majority of these issues are induced by uncertainty in clinical trials and examinations, such as human error, patients’ withdrawal during studies, and malfunctioning of data collection equipment, and are generally inevitable. Properly addressing these issues can lead to the development of precise predictive models and consequently more reliable clinical decision making. Accurate predictive models can profoundly help health care administrators optimize hospital resource consumption, e.g., manage bed usage, and improve the quality of their services.

In this dissertation, we aim to provide more accurate estimations of health parameters by improving the quality and quantity of data using machine learning methods. This can eventually result in improved decision support tools that help health care providers with better resource planning and consequently enhanced patient outcomes. In the first three chapters, we focus on improving the quality of data. In chapter one, we investigate the impact of a specific group of factors on a patient outcome. In chapters two and three, we propose a feature selection method to identify the most significant features for the prediction task and apply it to two different health parameters: hospital length of stay and disease prediction. In chapter four, we address issues with the quantity of data that arise when there are missing values in the dataset. To this end, we develop a missing data imputation model to enhance the usability of incomplete records for the prediction tasks. To address a wider range of existing challenges in analyzing clinical data, in chapter five we extend our work to develop methods for accurate analysis of medical image data. This can potentially result in a more accurate and faster care delivery process, as many clinical diagnostic decisions are based on medical images.

This dissertation follows the structure of a compilation thesis (dissertation by publications). This document includes five publications, three of which have already been published or accepted: one published in a journal, one published in an IEEE conference, and one accepted by a journal but not yet published. The fourth and fifth publications are under review at journals.

Embargo Reason

Publication Pending