Kinsey Pavilion 1240B

David Blei, Professor

Departments of Statistics and Computer Science, Columbia University

Bayesian statistics and expressive probabilistic modeling have become key tools for the modern statistician. They let us express complex assumptions about the hidden structures that underlie our data and have been successfully applied in numerous fields.

The central computational problem in Bayesian statistics is posterior inference, the problem of approximating the conditional distribution of the hidden variables given the observations. Approximate posterior inference algorithms have revolutionized the field, revealing its potential as a usable and general-purpose language for data analysis.

Bayesian statistics, however, has not yet reached this potential. First, statisticians and scientists regularly encounter massive data sets, but existing approximate inference algorithms do not easily scale. Second, most approximate inference algorithms are not generic; each must be adapted to the specific model at hand.

In this talk I will discuss our recent research on addressing these two limitations. I will first describe stochastic variational inference, an approximate inference algorithm for handling massive data sets, and demonstrate its application to probabilistic topic models of millions of articles. Then I will discuss black box variational inference, a generic algorithm for approximating the posterior. Black box variational inference applies easily to many models, with little model-specific derivation and few restrictions on their properties. I will demonstrate its use on deep exponential families and describe how it enables powerful tools for probabilistic programming.