Probabilistic models play a pivotal role in machine learning. They are at the core of powerful algorithms that can uncover hidden structures, learn useful representations, and use those representations to make accurate predictions or generate realistic observations. As datasets and problems continue to grow in both volume and complexity, probabilistic models increasingly rely on approximate Bayesian inference methods in order to scale.
In particular, there has been a great deal of renewed interest in variational inference, an approach that reformulates the problem of approximating an intractable posterior density as an optimization problem: one searches over a family of simpler distributions for the member of that family closest to the posterior. In recent years, significant advances have been made toward using stochastic optimization methods to scale variational inference to large datasets, deriving generic methods that easily fit broad classes of models, and using neural networks to specify flexible parametric families of approximate densities.
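The reformulation described above is conventionally expressed as a Kullback-Leibler divergence minimization; the following sketch uses standard notation (the symbols $q$, $\mathcal{Q}$, and $\mathcal{L}$ are the conventional ones, not drawn from the text):

```latex
% Variational inference as optimization: find the member of a
% tractable family Q closest (in KL divergence) to the posterior
q^{\star} = \operatorname*{arg\,min}_{q \in \mathcal{Q}}
            \mathrm{KL}\!\left( q(z) \,\middle\|\, p(z \mid x) \right)

% Equivalently, maximize the evidence lower bound (ELBO),
% which requires only the tractable joint density p(x, z):
\mathcal{L}(q) = \mathbb{E}_{q(z)}\!\left[ \log p(x, z) - \log q(z) \right]
               = \log p(x) - \mathrm{KL}\!\left( q(z) \,\middle\|\, p(z \mid x) \right)
```

Since $\log p(x)$ does not depend on $q$, maximizing $\mathcal{L}(q)$ is equivalent to minimizing the KL divergence to the posterior, which is what makes the optimization view tractable in practice.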
Concurrent with these advances, there has been tremendous research interest in the use of implicit probabilistic models in machine learning. Implicit models, defined by a sampling procedure rather than an explicit likelihood, can capture the data generating process with high fidelity, but do not yield tractable probability densities. As we better understand the implications of learning in implicit models, we are beginning to recognize their deep connections to density ratio estimation and approximate divergence minimization.
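The connection to density ratio estimation mentioned here rests on a classical identity: the Bayes-optimal classifier distinguishing samples from two distributions has a logit equal to their log density ratio, so a trained classifier can stand in for an unavailable density. A minimal numerical check of that identity, using two illustrative Gaussians chosen for this sketch (they are not from the text):

```python
import numpy as np

def log_normal_pdf(x, mu, sigma):
    # Log density of N(mu, sigma^2), evaluated pointwise.
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

x = np.linspace(-3.0, 3.0, 101)
log_p = log_normal_pdf(x, 0.0, 1.0)  # stand-in for the "data" distribution p
log_q = log_normal_pdf(x, 1.0, 1.0)  # stand-in for the "model" distribution q

# Bayes-optimal classifier for p-vs-q with equal class priors:
#   D(x) = p(x) / (p(x) + q(x))
d = np.exp(log_p) / (np.exp(log_p) + np.exp(log_q))

# The classifier's logit recovers the log density ratio log p(x) - log q(x).
logit = np.log(d) - np.log1p(-d)
assert np.allclose(logit, log_p - log_q)
```

In practice the optimal classifier is unknown and is approximated by training a discriminator on samples from the two distributions, which is precisely what allows learning to proceed when densities are unavailable.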
In recent years, the scope and applicability of variational inference have expanded dramatically. By leveraging the formal connections between density ratio estimation and learning in implicit models, variational inference can now be applied in settings where no likelihood is available, where the family of posterior approximations is arbitrarily expressive and does not yield a density, and indeed, even where no prior density is available.