Ph.D.
Group: Learning and Optimization
Deep latent variable models: from properties to structures
Started on 01/10/2017
Advisor: SEBAG, Michèle
Funding: Thesis contract from other organizations
Affiliation: Université Paris-Saclay
Laboratory: LRI - AO
Defended on 13/10/2021
Abstract:
Deep Latent Variable Models are generative models combining Bayesian Networks and deep learning, the renowned Variational Autoencoder being the best-known example. This thesis focuses on their structure, understood as the combination of three aspects: the Bayesian Network graph, the choice of probability distribution families for the variables, and the neural architecture. We show how several properties of these models can be understood and controlled through this structure alone, without altering the training objective built from the Evidence Lower Bound (ELBO).
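For reference, the training objective mentioned above takes its standard textbook form (not specific to this thesis): for an observation x, latent variables z, decoder p_theta, and approximate posterior q_phi,

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  - D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)
  \;\le\; \log p_\theta(x).
```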
The first contribution concerns the impact of the observation model -- the probabilistic modeling of the observed variables -- on the training process: how it determines the demarcation between signal and noise, and how it shapes the training dynamics when its scale parameter is learned rather than fixed. In the latter case, training behaves similarly to a simulated annealing process.
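As a concrete illustration, here is a minimal sketch of a Gaussian observation model with an optionally learned scale parameter; the class and parameter names are illustrative assumptions, not the thesis code.

```python
import math

import torch
import torch.nn as nn


class GaussianObservationModel(nn.Module):
    """Gaussian likelihood p(x | z) = N(x; mean(z), sigma^2 I).

    Illustrative sketch: when log_sigma is learned rather than fixed, the
    relative weight of the reconstruction term against the KL term evolves
    over training, which is the annealing-like behaviour discussed above.
    """

    def __init__(self, learn_scale: bool = True, init_log_sigma: float = 0.0):
        super().__init__()
        log_sigma = torch.tensor(float(init_log_sigma))
        if learn_scale:
            self.log_sigma = nn.Parameter(log_sigma)
        else:
            self.register_buffer("log_sigma", log_sigma)

    def log_prob(self, x: torch.Tensor, mean: torch.Tensor) -> torch.Tensor:
        """Per-example log-likelihood, summed over data dimensions."""
        dim = x[0].numel()
        inv_var = torch.exp(-2.0 * self.log_sigma)
        sq_err = ((x - mean) ** 2).flatten(start_dim=1).sum(dim=1)
        log_norm = dim * (2.0 * self.log_sigma + math.log(2.0 * math.pi))
        return -0.5 * (inv_var * sq_err + log_norm)
```

Fixing a small sigma up-weights reconstruction from the start; letting sigma be optimized allows the effective trade-off between reconstruction and KL terms to evolve during training, which is where the simulated-annealing analogy comes from.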
The second contribution, CompVAE, is centered on the hierarchical structure of the latent variables: a generative model conditioned on a multi-set of elements to be combined in the final generation. CompVAE demonstrates how global properties -- set-wise manipulations in this case -- can be achieved through structural design alone. The model is furthermore empirically validated on real-world data, generating electrical consumption curves.
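To make the structural idea concrete, here is a hypothetical sketch of permutation-invariant conditioning on a multi-set, in the DeepSets style: each element is embedded independently and the embeddings are summed, so adding or removing elements is a well-defined set-wise manipulation. This illustrates the general principle only, not the actual CompVAE architecture; all names are assumptions.

```python
import torch
import torch.nn as nn


class MultiSetConditioner(nn.Module):
    """Permutation-invariant conditioning on a multi-set of elements.

    Hypothetical illustration: because the element embeddings are summed,
    the conditioning vector is invariant to element order and supports
    adding or removing elements -- a set-wise manipulation obtained purely
    from the structure of the model.
    """

    def __init__(self, elem_dim: int, cond_dim: int):
        super().__init__()
        self.embed = nn.Sequential(
            nn.Linear(elem_dim, cond_dim),
            nn.ReLU(),
            nn.Linear(cond_dim, cond_dim),
        )

    def forward(self, elements: torch.Tensor) -> torch.Tensor:
        # elements: (batch, n_elements, elem_dim) -> (batch, cond_dim)
        return self.embed(elements).sum(dim=1)
```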
The third contribution, Boltzmann Tuning of Generative Models (BTGM), is a framework for adjusting a trained generative model according to an externally provided criterion, while keeping the required adjustment minimal. This is done while finely controlling which latent variables are adjusted and how they are. We empirically demonstrate how BTGM can be used to specialize a trained model or to explore the extreme regions of a generative distribution.
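The "Boltzmann" framing evokes exponential tilting: among all distributions shifting the expectation of a criterion f, the one closest in KL divergence to the trained model p is q(x) proportional to p(x) exp(beta f(x)), a standard minimum-KL result. The sketch below draws approximate samples from such a tilted distribution by self-normalized importance resampling; it illustrates the target distribution, not the BTGM algorithm itself, and all names are assumptions.

```python
import torch


def sample_tilted(sample_fn, criterion, beta: float,
                  n_draws: int = 4096, n_keep: int = 64) -> torch.Tensor:
    """Approximate samples from q(x) proportional to p(x) * exp(beta * f(x)).

    Hedged illustration only: `sample_fn` draws from the trained model p,
    `criterion` scores samples, and resampling by Boltzmann weights yields
    samples from the exponentially tilted distribution -- the minimum-KL
    adjustment of p under a constraint on E[f(x)]. BTGM itself adjusts the
    model's latent variables directly rather than resampling.
    """
    x = sample_fn(n_draws)               # samples from the trained model p
    log_w = beta * criterion(x)          # unnormalized log-weights, shape (n_draws,)
    probs = torch.softmax(log_w, dim=0)  # self-normalized importance weights
    idx = torch.multinomial(probs, n_keep, replacement=True)
    return x[idx]
```

Larger beta concentrates the resampling on the extreme regions of the distribution with respect to the criterion, matching the exploratory use case mentioned above.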