Semester
Fall
Date of Graduation
2009
Document Type
Dissertation
Degree Type
PhD
College
Statler College of Engineering and Mineral Resources
Department
Lane Department of Computer Science and Electrical Engineering
Committee Chair
Bojan Cukic
Committee Co-Chair
Tim Menzies
Abstract
It is difficult to build high-quality software with limited quality assurance budgets. Software fault prediction models can be used to learn fault predictors from software metrics. Fault prediction prior to software release can guide Verification and Validation (V&V) activity and allocate scarce resources to modules which are predicted to be fault-prone.

One of the most important goals of fault prediction is to detect fault-prone modules as early as possible in the software development life cycle. Design and code metrics have been successfully used for predicting fault-prone modules. In this dissertation, we introduce fault prediction from software requirements. Furthermore, we investigate the advantages of the incremental development of software fault prediction models, and we compare the performance of these models as the volume of data and their life cycle origin (design, code, or their combination) evolve during project development. We confirm that increasing the volume of training data improves model performance, and that models built from code metrics typically outperform those built using design metrics only. However, both types of models prove to be useful, as they can be constructed in different phases of the life cycle. We also demonstrate that models that utilize a combination of design and code level metrics outperform models which use either metric set exclusively.

In the evaluation of fault prediction models, misclassification cost has been neglected. Using a graphical measurement, the cost curve, we evaluate software fault prediction models. Cost curves allow software quality engineers to introduce not only project-specific but also module-specific misclassification costs into model evaluation. Classifying a software module as fault-prone implies the application of some verification activities, thus adding to the development cost. Misclassifying a module as fault-free carries the risk of system failure, and is also associated with cost implications. Our results, through the analysis of more than ten projects from public repositories, support a recommendation to adopt cost curves as one of the standard methods for software fault prediction model performance evaluation.
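
The cost-based evaluation described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation; it assumes the standard cost curve formulation of Drummond and Holte, in which a classifier with a given false positive rate and false negative rate is drawn as a straight line of normalized expected cost over the "probability cost" axis. All function names and numeric values below are hypothetical and chosen only for illustration.

```python
def probability_cost(p_faulty, cost_fn, cost_fp):
    """x-axis of a cost curve: the share of total expected cost that is
    attributable to faulty modules, given the proportion of faulty modules
    and the two (assumed, project-specific) misclassification costs."""
    pos = p_faulty * cost_fn          # weight of missing a faulty module
    neg = (1.0 - p_faulty) * cost_fp  # weight of a false alarm
    return pos / (pos + neg)


def normalized_expected_cost(fpr, fnr, pc):
    """y-axis of a cost curve for one classifier at probability cost pc.
    The line runs from (0, FPR) to (1, FNR)."""
    return fnr * pc + fpr * (1.0 - pc)


def cost_line(fpr, fnr, steps=11):
    """Sample the classifier's cost line at evenly spaced pc values."""
    return [
        (i / (steps - 1), normalized_expected_cost(fpr, fnr, i / (steps - 1)))
        for i in range(steps)
    ]


def beats_trivial(fpr, fnr, pc):
    """True if the classifier costs less than both trivial classifiers:
    'always fault-prone' (cost 1 - pc) and 'always fault-free' (cost pc)."""
    return normalized_expected_cost(fpr, fnr, pc) < min(pc, 1.0 - pc)


if __name__ == "__main__":
    # Hypothetical project: 20% faulty modules, a missed fault assumed
    # to be five times as costly as a false alarm.
    pc = probability_cost(p_faulty=0.2, cost_fn=5.0, cost_fp=1.0)
    print(pc, normalized_expected_cost(fpr=0.3, fnr=0.2, pc=pc))
```

Under these assumptions, changing the cost ratio or the proportion of faulty modules only moves the operating point along the x-axis, which is why a single plot can compare models across many project-specific cost settings.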
Recommended Citation
Jiang, Yue, "Incremental development and cost-based evaluation of software fault prediction models" (2009). Graduate Theses, Dissertations, and Problem Reports. 2886.
https://researchrepository.wvu.edu/etd/2886