Semester
Fall
Date of Graduation
2012
Document Type
Dissertation
Degree Type
PhD
College
Statler College of Engineering and Mineral Resources
Department
Lane Department of Computer Science and Electrical Engineering
Committee Chair
Tim Menzies
Abstract
Software effort estimation (SEE) is the activity of estimating the total effort required to complete a software project. Correctly estimating that effort is vital to an organization's competitiveness. Both under- and over-estimation lead to undesirable consequences. Under-estimation may result in budget and schedule overruns, which in turn may cause projects to be cancelled, thereby wasting the entire effort spent up to that point. Over-estimation may cause promising projects not to be funded, hence harming organizational competitiveness.
Due to the significant role of SEE for software organizations, considerable research effort has been invested in SEE. Thanks to the accumulation of decades of prior research, today we are able to identify the core issues and search for the right principles to tackle pressing questions. For example, despite decades of work, we still lack concrete answers to important questions such as: "What is the best SEE method?" The introduced estimation methods make use of local data; however, not all companies have their own data, so: "How can we handle the lack of local data?" Common SEE methods take size attributes for granted, yet size attributes are costly and practitioners place very little trust in them. Hence, we ask: "How can we avoid the use of size attributes?" Collection of data, particularly dependent variable information (i.e., effort values), is costly: "How can we find an essential subset of the SEE data sets?" Finally, studies make use of sampling methods to justify a new method's performance on SEE data sets. Yet the trade-off among different variants is ignored: "How should we choose sampling methods for SEE experiments?"
This thesis is a rigorous investigation towards identifying and tackling the pressing issues in SEE.
Our findings rely on extensive experimentation performed with a large corpus of estimation techniques on a large set of public and proprietary data sets. We summarize our findings and industrial experience in the form of 12 principles: 1) Know your domain; 2) Let the Experts Talk; 3) Suspect your data; 4) Data Collection is Cyclic; 5) Use a Ranking Stability Indicator; 6) Assemble Superior Methods; 7) Weighting Analogies is Over-elaboration; 8) Use Easy-path Design; 9) Use Relevancy Filtering; 10) Use Outlier Pruning; 11) Combine Outlier and Synonym Pruning; 12) Be Aware of Sampling Method Trade-off.
Recommended Citation
Kocaguneli, Ekrem, "A Principled Methodology: A Dozen Principles of Software Effort Estimation" (2012). Graduate Theses, Dissertations, and Problem Reports. 3601.
https://researchrepository.wvu.edu/etd/3601