Date of Graduation
2015
Document Type
Dissertation
Degree Type
PhD
College
Statler College of Engineering and Mineral Resources
Department
Lane Department of Computer Science and Electrical Engineering
Committee Chair
Nancy L Guo
Committee Co-Chair
Donald Adjeroh
Committee Member
Michael Andrew
Committee Member
Katerina Goseva-Popstojanova
Committee Member
Guodong Guo
Abstract
Identifying potential toxicity signaling pathways could guide future animal studies and support human risk assessment and intervention efforts. This thesis describes a novel computational approach for identifying biological processes and pathways that are significantly associated with a disease pathology from time-series, dose-response gene expression data.
Our system employs a novel constrained non-negative matrix factorization algorithm and Markov chain Monte Carlo simulation to identify underlying patterns in mRNA gene expression data. Quantitative pathology can be used as a pattern constraint. The patterns found can be thought of as functions that influence a gene's expression. Using a database of curated gene sets, we can identify biological processes that are significantly related to a pathology.
We also developed a computational model for integrating miRNA with mRNA time-series microarray data along with disease pathology. The dynamic temporal regulatory effects of miRNA are not well known, and a single miRNA may regulate many mRNAs. The integrated analysis includes identifying both mRNA and miRNA whose expression is significantly similar to the quantitative pathology. Potential regulatory miRNA/mRNA target pairs are then identified through databases of both predicted and validated pairs. Finally, potential target pairs are filtered, keeping only pairs that demonstrate regulatory effects in the expression data.
Multi-walled carbon nanotubes (MWCNT) are known for their transient inflammatory and progressive fibrotic pulmonary effects; however, the mechanisms underlying these pathologies are unknown. In this thesis, we used time-series microarray data of global lung mRNA and miRNA expression isolated from 160 C57BL/6J mice exposed by pharyngeal aspiration to vehicle or 10, 20, 40, or 80 µg MWCNT at 1, 7, 28, or 56 days post-exposure. Quantitative pathology patterns of MWCNT-induced inflammation (bronchoalveolar lavage score) and fibrosis (Sirius Red staining, quantitative morphometric analysis) were obtained from separate studies.
Understanding the regulatory networks between mRNA and miRNA at different stages would be beneficial for understanding the complex path of disease development. The identified genes and pathways may be useful for determining biomarkers of MWCNT-induced lung inflammation and fibrosis for early detection of disease. Our computational approach detects biologically relevant processes with and without pathology information. The identified significant processes and genes are supported by evidence in the literature and by biological validation.
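To illustrate the idea of using quantitative pathology as a pattern constraint, the following is a minimal sketch and not the implementation described in the thesis. It swaps in standard multiplicative-update non-negative matrix factorization (rather than the Markov chain Monte Carlo approach named above) and holds one pattern row fixed at a supplied pathology profile. The function name `constrained_nmf`, its parameters, and the choice to fix row 0 are all assumptions made for illustration.

```python
import numpy as np

def constrained_nmf(V, k, fixed_pattern=None, n_iter=500, seed=0, eps=1e-9):
    """Factor V (genes x conditions) as W @ H with non-negative W and H.

    Hypothetical sketch: if fixed_pattern is given, row 0 of H is held
    at that pattern (e.g. a quantitative pathology profile) throughout.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_cond = V.shape
    W = rng.random((n_genes, k)) + eps   # gene weights on each pattern
    H = rng.random((k, n_cond)) + eps    # patterns across conditions
    if fixed_pattern is not None:
        H[0] = np.asarray(fixed_pattern, dtype=float)

    for _ in range(n_iter):
        # Multiplicative update for the patterns (Frobenius-norm loss).
        H_new = H * (W.T @ V) / (W.T @ W @ H + eps)
        if fixed_pattern is not None:
            H_new[0] = H[0]              # keep the constrained row unchanged
        H = H_new
        # Multiplicative update for the gene weights.
        W = W * (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Under these assumptions, genes with a large weight in column 0 of `W` would be candidates for association with the constrained pathology pattern; how significance is actually assessed is described in the dissertation itself, not in this sketch.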
Recommended Citation
Dymacek, Julian Marshall, "A novel computational system for identification of biological processes from multi-dimensional high-throughput genomic data" (2015). Graduate Theses, Dissertations, and Problem Reports. 5524.
https://researchrepository.wvu.edu/etd/5524