Semester
Fall
Date of Graduation
2020
Document Type
Dissertation
Degree Type
PhD
College
Statler College of Engineering and Mineral Resources
Department
Lane Department of Computer Science and Electrical Engineering
Committee Chair
Gianfranco Doretto
Committee Co-Chair
Donald Adjeroh
Committee Member
Xin Li
Committee Member
Yanfang (Fanny) Ye
Committee Member
Yu Gu
Abstract
A large number of deep learning methods applied to computer vision problems require encoder-decoder maps. These methods include, but are not limited to, self-representation learning, generalization, few-shot learning, and novelty detection. Encoder-decoder maps are also useful for photo manipulation, photo editing, superresolution, etc. Encoder-decoder maps are typically learned using autoencoder networks.
Traditionally, autoencoder reciprocity is achieved in image space using a pixel-wise similarity loss, which has the widely known flaw of producing unrealistic reconstructions. This flaw is typical of the Variational Autoencoder (VAE) family and is not limited to pixel-wise similarity losses; it is common to all methods that rely on the explicit maximum likelihood training paradigm, as opposed to an implicit one. Likelihood maximization, coupled with a poor decoder distribution, leads to poor or blurry reconstructions at best. Generative Adversarial Networks (GANs), on the other hand, perform an implicit maximization of the likelihood by solving a minimax game, thus bypassing the issues that arise from explicit maximization. This gives GAN architectures remarkable generative power, enabling the generation of high-resolution images of humans that are indistinguishable from real photos to the naked eye. However, GAN architectures lack inference capabilities, which makes them unsuitable for training encoder-decoder maps and effectively limits their application space.
We introduce an autoencoder architecture that (a) is free from the consequences of maximizing the likelihood directly, (b) produces reconstructions competitive in quality with state-of-the-art GAN architectures, and (c) allows learning disentangled representations, which makes it useful in a variety of problems. We show that the proposed architecture and training paradigm significantly improve the state of the art in novelty and anomaly detection, enable novel kinds of image manipulation, and have significant potential for other applications.
Recommended Citation
Pidhorskyi, Stanislav M.S., "Representation Learning with Adversarial Latent Autoencoders" (2020). Graduate Theses, Dissertations, and Problem Reports. 7810.
https://researchrepository.wvu.edu/etd/7810