Date of Graduation
Statler College of Engineering and Mineral Resources
Lane Department of Computer Science and Electrical Engineering
Biometric systems are experiencing widespread usage in identification and access control applications. To estimate the performance of any biometric system, its characteristics need to be analyzed so that concrete conclusions can be drawn about real-time usage. Performance testing of hardware or software components of either custom or state-of-the-art commercial biometric systems is typically carried out on large datasets. Several public and private datasets are used in current biometric research. West Virginia University has completed several large-scale multimodal biometric data collections with the aim of creating research datasets that can be used by disciplines concerned with secure biometric applications. However, the demographic and image quality properties of these datasets can potentially lead to bias when they are used in performance testing of new systems. To overcome this, the characteristics of datasets used for performance testing must be well understood prior to usage.

This thesis will answer three main questions associated with this issue:

• For a single matcher, do the genuine and impostor match score distributions within specific demographic groups vary from those of the entire dataset?
• What are the possible ways to compare the match score distributions of demographic subsets against those of the entire dataset?
• Based on these comparisons, what conclusions can be made about the characteristics of the dataset?

In this work, 13,976 frontal face images from WVU's 2012 biometric collection project, funded by the FBI and involving 1200 individuals, were used as a 'test' dataset. The goal was to evaluate the performance of this dataset by generating genuine and impostor match score distributions using commercial matching software. Further, the dataset was categorized demographically, and match score distributions were generated for these subsets in order to explore whether or not this breakdown impacted match score distributions.
The match score distributions of the overall dataset were compared against those of each demographic cohort.

Using Receiver Operating Characteristic (ROC) curves, the Area Under the Curve (AUC) and Equal Error Rate (EER) were observed to measure the performance of each demographic group with respect to the overall data and also across the cohorts within each demographic group. Also, Kullback-Leibler divergence and Jensen-Shannon divergence values were calculated for each demographic cohort (age, gender and ethnicity) within the overall data. These statistical approaches provide a numerical value representing the amount of variation between two match score distributions. In addition, the False Accept Rate (FAR) and False Reject Rate (FRR) were observed to estimate the error rates. These statistical measures made it possible to determine the impact of different demographic breakdowns on match score distributions and thus helped in understanding the characteristics of the dataset and how they may impact its usage in performance testing of biometric systems.
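The measures named above can be illustrated with a minimal Python sketch. It is not the thesis's implementation and does not reflect the commercial matcher's output format. It assumes that higher scores mean a better match, that a comparison is accepted when its score is at or above the threshold, that the EER can be approximated by the threshold at which FAR and FRR are closest, and that divergences are computed on histograms with shared bins. All function names, the bin handling, and the smoothing constant `eps` are hypothetical choices made for illustration.

```python
import math

def far_frr(genuine, impostor, threshold):
    """FAR = share of impostor scores accepted; FRR = share of genuine scores rejected.
    Assumption: higher score = better match, accept when score >= threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def roc_summary(genuine, impostor):
    """Sweep every observed score as a threshold and return (auc, eer)."""
    thresholds = sorted(set(genuine) | set(impostor))
    thresholds.append(thresholds[-1] + 1.0)  # rejects everything: FAR = 0, TAR = 0
    points = []
    for t in thresholds:
        far, frr = far_frr(genuine, impostor, t)
        points.append((far, 1.0 - frr, frr))  # (FAR, true accept rate, FRR)
    points.sort()  # ascending FAR, then ascending true accept rate
    # Trapezoidal area under the ROC curve (true accept rate versus FAR).
    auc = sum((x1 - x0) * (y0 + y1) / 2
              for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]))
    # Approximate EER: operating point where FAR and FRR are closest.
    far, _, frr = min(points, key=lambda p: abs(p[0] - p[2]))
    return auc, (far + frr) / 2

def histogram(scores, lo, hi, bins):
    """Normalized histogram over [lo, hi]; scores are assumed to lie in that range."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in scores:
        counts[min(int((s - lo) / width), bins - 1)] += 1
    return [c / len(scores) for c in counts]

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) in bits. eps is an assumed smoothing term that keeps the
    value finite when Q has an empty bin where P does not."""
    return sum(pi * math.log2((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric, bounded between 0 and 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    # No smoothing needed: every bin of m is positive wherever p or q is positive.
    return (kl_divergence(p, m, eps=0.0) + kl_divergence(q, m, eps=0.0)) / 2

# Toy scores, invented for illustration only.
genuine = [0.9, 0.8, 0.7, 0.6]
impostor = [0.1, 0.2, 0.3, 0.65]
auc, eer = roc_summary(genuine, impostor)
```

To compare a demographic cohort against the overall dataset in this sketch, one would histogram the cohort's genuine (or impostor) scores and the overall genuine (or impostor) scores with the same `lo`, `hi`, and `bins`, then pass the two histograms to `kl_divergence` or `js_divergence`. Because the Kullback-Leibler divergence is asymmetric, the order of the arguments matters, whereas the Jensen-Shannon divergence gives the same value in either direction.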
Kamireddy, Mounica, "Evaluating the Performance of a Large-Scale Facial Image Dataset Using Agglomerated Match Score Statistics" (2016). Graduate Theses, Dissertations, and Problem Reports. 5928.