Author ORCID Identifier

https://orcid.org/0000-0003-4112-626X

Semester

Fall

Date of Graduation

2025

Document Type

Dissertation

Degree Type

PhD

College

Statler College of Engineering and Mineral Resources

Department

Lane Department of Computer Science and Electrical Engineering

Committee Chair

Jeremy Dawson

Committee Member

Tom Devine

Committee Member

Yuxin Liu

Committee Member

Prashnna Gyawali

Committee Member

Kakan Day

Abstract

Despite significant advances in deep face recognition (FR), current systems face several practical challenges in real-world scenarios. These include the high computational cost of training on large-scale datasets, inefficient use of the metric space, and a mismatch between training and evaluation frameworks. This dissertation addresses these limitations through three completed studies. The first part addresses the computational bottlenecks of large-scale FR training. It proposes a framework that replaces conventional scalar identity labels with structured identity codes, i.e., sequences of tokens optimized to preserve semantic and metric separation. The formulation is designed to reduce the computational cost of classification from linear to logarithmic in the number of classes while alleviating the minority collapse problem observed in imbalanced data distributions.
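As a rough illustration of why token sequences can bring the classification cost from linear to logarithmic in the number of classes, the sketch below maps each scalar label to a fixed-length code over a small vocabulary. It is only a minimal sketch under stated assumptions: it uses plain base-V positional encoding, whereas the dissertation describes codes that are optimized for semantic and metric separation, and the names `code_length`, `encode`, `decode`, and `vocab_size` are hypothetical.

```python
def code_length(num_classes: int, vocab_size: int) -> int:
    """Smallest L such that vocab_size ** L >= num_classes."""
    length, capacity = 1, vocab_size
    while capacity < num_classes:
        capacity *= vocab_size
        length += 1
    return length


def encode(label: int, vocab_size: int, length: int) -> list[int]:
    """Assumption: plain base-`vocab_size` digits, most significant first."""
    tokens = []
    for _ in range(length):
        tokens.append(label % vocab_size)
        label //= vocab_size
    return tokens[::-1]


def decode(tokens: list[int], vocab_size: int) -> int:
    """Inverse of `encode`: rebuild the scalar label from its tokens."""
    label = 0
    for token in tokens:
        label = label * vocab_size + token
    return label


num_classes, vocab_size = 1_000_000, 32
length = code_length(num_classes, vocab_size)

# A flat classifier scores every class; a token-wise classifier scores
# `vocab_size` options at each of `length` positions.
flat_outputs = num_classes
coded_outputs = length * vocab_size
```

With these illustrative numbers, one million identities need codes of length 4, so the output layer scores 128 token logits per sample instead of one million class logits.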

The second part introduces a training strategy that optimizes the allocation of centroid-based class representations. Instead of fixing the association between each class and its centroid, centroids are dynamically reassigned using bipartite matching during training. This joint optimization over network parameters and assignments improves metric space usage and leads to better generalization across both balanced and long-tail datasets.

The final part focuses on reducing the gap between training and inference in FR systems. While training is typically performed using sample-to-centroid comparisons under classification-based objectives, inference relies on sample-to-sample comparisons in the embedding space. To address this discrepancy, a training framework is developed that injects sample-to-sample supervision into the classification loss. By using feature magnitude as a proxy for recognizability, the method suppresses the influence of unrecognizable samples and emphasizes meaningful ones, leading to better alignment between training dynamics and test-time behavior.

Together, these contributions aim to build more generalizable, efficient, and scalable FR systems suitable for deployment in unconstrained settings.
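The centroid reassignment described in the second part can be pictured as an assignment problem between classes and centroids. The following is a minimal sketch, not the dissertation's implementation: it assumes the matching cost is the dot-product similarity between each class's mean feature and each centroid, and it solves the matching by brute force over permutations, which is only feasible for a handful of classes. A practical version would use a polynomial-time solver such as the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment`). The names `reassign_centroids`, `class_means`, and `centroids` are hypothetical.

```python
from itertools import permutations


def dot(u: list[float], v: list[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def reassign_centroids(class_means, centroids):
    """Return `assignment` where assignment[i] is the centroid index for class i.

    Assumption: the best one-to-one matching maximizes the total similarity
    between class mean features and their assigned centroids.
    Brute force over all permutations, so only suitable for tiny examples.
    """
    k = len(class_means)
    sim = [[dot(m, c) for c in centroids] for m in class_means]
    best = max(
        permutations(range(k)),
        key=lambda perm: sum(sim[i][perm[i]] for i in range(k)),
    )
    return list(best)


# Toy 2-D example: the initial class-to-centroid pairing is a poor fit.
class_means = [[0.0, 1.0], [1.0, 0.0], [-1.0, 0.0]]
centroids = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]

assignment = reassign_centroids(class_means, centroids)  # [2, 0, 1]
```

In this toy case each class is re-paired with the centroid that points in the same direction as its mean feature. In a training loop, such a matching step would alternate with gradient updates of the network parameters.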
