Statler College of Engineering and Mineral Resources


Lane Department of Computer Science and Electrical Engineering

Committee Chair

Nancy Lan Guo


Increasing the efficiency and effectiveness of chemotherapy will rely on the ability to accurately predict an individual cancer patient's chemosensitivity to specific drugs. One approach to this problem is genomic and proteomic profiling. While previous work has concentrated on genomic profiling, few studies have explored the correlation of proteomic profiles with drug sensitivity. Moreover, a novel algorithm is needed to integrate protein and gene expression data, which is important for systematically understanding fundamental chemosensitivity mechanisms. In this study, we explored whether proteomic signatures of untreated cancer cell lines could accurately predict their chemosensitivity. Furthermore, we developed an algorithm to integrate proteomic and genomic profiles and used the integrated profiles to classify the chemosensitivity of the cell lines, in order to determine whether they could further increase the accuracy of chemosensitivity prediction.

First, to explore whether proteomic signatures could accurately predict chemosensitivity, we developed a machine learning model based exclusively on proteomic profiling to predict drug response. We used data from studies in which the expression levels of 52 proteins were determined in 60 human cancer cell lines (NCI-60). The model combined random forests, Relief, and nearest neighbor algorithms to construct chemosensitivity classifiers for each cell line against each of 118 chemotherapeutic agents. For all 118 evaluated agents, the chemosensitivity prediction accuracy was significantly (P < 0.02) higher than that of random prediction. These results indicate that it is feasible to accurately predict chemosensitivity by proteomic approaches.

Next, we integrated genomic profiling into our proteomic model and developed a novel feature selection scheme to identify biomarkers from the integrated profiles in NCI-60 cell lines. We then used the random forests algorithm to construct chemosensitivity classifiers for the same 118 agents. Seventy-six of the 118 classifiers significantly (P < 0.05) improved on the chemosensitivity prediction accuracy achieved by protein expression-based classifiers alone. These results demonstrate that our integrated genomic and proteomic approach could further increase chemosensitivity prediction accuracy.

Overall, we found that it is feasible to use proteomic signatures alone to accurately predict chemosensitivity. Integrating genomic and proteomic signatures further increases chemosensitivity prediction accuracy.
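
The abstract names Relief and nearest neighbor among the algorithms combined in the proteomic model but does not say how they were combined. As a rough illustration only, the sketch below ranks features with the basic two-class Relief algorithm and classifies with a 1-nearest-neighbor rule under leave-one-out cross-validation. Everything specific in it is an assumption rather than the thesis's method: the binary sensitive/resistant labels, the synthetic 60 x 52 data standing in for protein expression, the hypothetical function names (`make_synthetic`, `relief_weights`, `predict_1nn`, `loocv_accuracy`), and the omission of the random forests step.

```python
import random

def make_synthetic(n=60, d=52, informative=(0, 1, 2), seed=1):
    """Synthetic stand-in for n cell lines x d protein levels (NOT real NCI-60 data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # assumed binary label: 0 = resistant, 1 = sensitive
        row = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for j in informative:
            row[j] += 3.0 * label  # only these features carry class signal
        X.append(row)
        y.append(label)
    return X, y

def relief_weights(X, y, seed=0):
    """Basic two-class Relief: reward features that differ on the nearest miss
    (other class) and agree on the nearest hit (same class)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Min-max scale each feature so differences lie in [0, 1].
    lo = [min(r[j] for r in X) for j in range(d)]
    span = [(max(r[j] for r in X) - lo[j]) or 1.0 for j in range(d)]
    Z = [[(r[j] - lo[j]) / span[j] for j in range(d)] for r in X]

    def dist(a, b):
        return sum(abs(p - q) for p, q in zip(a, b))

    w = [0.0] * d
    for _ in range(n):
        i = rng.randrange(n)
        hit = min((k for k in range(n) if k != i and y[k] == y[i]),
                  key=lambda k: dist(Z[i], Z[k]))
        miss = min((k for k in range(n) if y[k] != y[i]),
                   key=lambda k: dist(Z[i], Z[k]))
        for j in range(d):
            w[j] += (abs(Z[i][j] - Z[miss][j]) - abs(Z[i][j] - Z[hit][j])) / n
    return w

def predict_1nn(X_train, y_train, x, features):
    """Label of the closest training sample, using only the selected features."""
    def dist(a, b):
        return sum((a[j] - b[j]) ** 2 for j in features)
    best = min(range(len(X_train)), key=lambda k: dist(X_train[k], x))
    return y_train[best]

def loocv_accuracy(X, y, n_features=5):
    """Leave-one-out accuracy; features are re-selected inside every fold so the
    held-out sample never influences the ranking."""
    correct = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        w = relief_weights(X_tr, y_tr)
        top = sorted(range(len(w)), key=lambda j: w[j], reverse=True)[:n_features]
        correct += predict_1nn(X_tr, y_tr, X[i], top) == y[i]
    return correct / len(X)

if __name__ == "__main__":
    X, y = make_synthetic()
    print("leave-one-out accuracy:", loocv_accuracy(X, y))
```

Re-running the feature selection inside each fold is a deliberate choice in this sketch: with only 60 samples, ranking features on the full data set before cross-validation would let the held-out cell line leak into the selection and inflate the accuracy estimate.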