Author ORCID Identifier
Semester
Fall
Date of Graduation
2022
Document Type
Dissertation
Degree Type
PhD
College
Statler College of Engineering and Mineral Resources
Department
Lane Department of Computer Science and Electrical Engineering
Committee Chair
Nancy Lan Guo
Committee Co-Chair
Donald Adjeroh
Committee Member
Donald Adjeroh
Committee Member
Katerina Goseva-Popstojanova
Committee Member
Michael Hu
Committee Member
Xin Li
Abstract
Lung cancer has the second highest incidence rate and the highest mortality of all cancers worldwide. An American Cancer Society estimate projects about 236,740 lung cancer cases (117,910 men and 118,830 women) in the US in 2022. To date, no prognostic or predictive biomarkers are available to guide the selection of chemotherapy, immunotherapy, or radiotherapy for individual non-small cell lung cancer (NSCLC) patients. There is an unmet clinical need to identify patients with early-stage NSCLC who are likely to develop recurrence and to predict their therapeutic responses. This dissertation developed a novel computational methodology for modeling molecular gene association networks from DNA copy number variation, gene expression, protein expression, and single-cell gene expression data of NSCLC, and for discovering novel biomarkers and therapeutic targets.
This dissertation makes the following technical and theoretical contributions. First, the Boolean implication network algorithm based on prediction logic was given a practical extension. Boolean implication networks are probabilistic graphical models that express the relationship between two variables, and they have conceptual advantages over existing methodologies. This dissertation extended the Boolean implication network to model multinary rather than only binary data, and to construct multi-omics and single-cell omics gene regulatory networks (GRN). Several harmonization techniques were adopted to obtain compatible data, making it possible to build cross-level multi-omics networks in multiple cohorts from different platforms. Second, an innovative data-driven pipeline was developed for biomarker discovery and therapeutic target identification, which further exploits the information contained in the constructed Boolean implication networks. Novel prognostic genes and proliferation genes were found, and functional pathways, targeted therapies, and repositioning drugs were discovered based on the identified genes. The developed framework can be applied to any disease with sufficient data. Third, a landscape evaluation was conducted of the biological and clinical relevance of multi-omics and single-cell Boolean implication network centralities in NSCLC tumors, rigorously quantified with graph theory centrality metrics. This is the first systematic account of the association between multi-omics network centralities and NSCLC tumorigenesis, proliferation, and patient survival. The results show that gene centrality metrics in GRN can be used to prioritize candidate biomarkers and drug targets.
In the future, the results of this dissertation can be tested through biological verification or confirmation against experimental results, thereby helping to identify genes that play an essential role in the cause and progression of NSCLC and to find potential drugs for its treatment.
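As an illustration of the prediction-logic idea behind Boolean implication networks, the following minimal Python sketch scores a candidate rule between two discretized variables with the "del" statistic (one minus the ratio of observed to expected rule violations under independence). It is a sketch under stated assumptions, not the dissertation's implementation: the function name `del_statistic`, the toy data, and the choice of error cells are hypothetical, and the abstract does not give the rule set or thresholds actually used.

```python
from collections import Counter


def del_statistic(xs, ys, error_cells):
    """Prediction-logic 'del' for a rule defined by its error cells.

    xs, ys      -- equal-length sequences of discrete states (binary or multinary)
    error_cells -- (x_state, y_state) pairs that contradict the rule
    Returns 1 - observed_error / expected_error, or None if the expected
    error under independence is zero (the rule cannot be evaluated).
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed joint counts
    px = Counter(xs)               # marginal counts of x
    py = Counter(ys)               # marginal counts of y

    observed = sum(joint[cell] for cell in error_cells) / n
    expected = sum(px[x] * py[y] for x, y in error_cells) / (n * n)
    if expected == 0:
        return None
    return 1.0 - observed / expected


# Hypothetical toy data: states of gene A and gene B in eight samples.
gene_a = [1, 1, 1, 1, 0, 0, 0, 0]
gene_b = [1, 1, 1, 1, 1, 0, 0, 0]

# Rule "A high => B high": the only violating cell is (A=1, B=0).
a_implies_b = del_statistic(gene_a, gene_b, [(1, 0)])

# Rule "B high => A high": the only violating cell is (A=0, B=1).
b_implies_a = del_statistic(gene_a, gene_b, [(0, 1)])

print(a_implies_b, b_implies_a)
```

A value near 1 means the rule is violated far less often than independence would predict. Because the states are only compared for equality, the same function accepts multinary states (for example 0, 1, 2) by listing the corresponding error cells.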
Recommended Citation
Ye, Qing, "A Novel Computational Network Methodology for Discovery of Biomarkers and Therapeutic Targets" (2022). Graduate Theses, Dissertations, and Problem Reports. 11558.
https://researchrepository.wvu.edu/etd/11558
Embargo Reason
Publication Pending
Included in
Bioinformatics Commons, Other Analytical, Diagnostic and Therapeutic Techniques and Equipment Commons, Other Computer Engineering Commons