University of Iowa
Wasserstein barycenter and its application in distributed Bayesian inference
Flexible hierarchical Bayesian modeling of massive data is challenging because posterior computations scale poorly with sample size. Scalable Bayesian methods based on the divide-and-conquer technique provide a general approach for tractable Bayesian inference in massive data settings. All these methods consist of three steps. First, the data are divided into smaller, computationally tractable subsets. Second, posterior samples are obtained in parallel across all the subsets. Third, posterior samples from all the subsets are combined to yield an approximation of the full data posterior distribution, which is used for inference and prediction. Sampling in the second step is more efficient than sampling from the full data posterior because each subset is much smaller than the full data. Since the combination step takes negligible time relative to sampling, posterior computations can be scaled to massive data by dividing the full data into a sufficiently large number of subsets. Existing divide-and-conquer methods differ mainly in the combination scheme. Our focus is on the Wasserstein posterior (WASP) method, which combines subset posterior distributions through their barycenter in a Wasserstein space of probability measures. We demonstrate the application of WASP to several parametric models and conclude with an application to kriging using a modified predictive process.
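The three steps can be illustrated with a minimal Python sketch. It is not the implementation from the manuscript; it assumes a toy model (normal mean with known variance and a conjugate normal prior), and all function names and parameter values are hypothetical. Two further assumptions are worth flagging. Each subset likelihood is raised to the power k (the number of subsets), so that a subset posterior has spread comparable to the full data posterior. And the parameter is one-dimensional, in which case the equal-weight 2-Wasserstein barycenter has a quantile function equal to the average of the subset quantile functions, so it can be approximated by averaging sorted draws. Multivariate parameters generally require solving an optimization problem instead.

```python
import numpy as np


def subset_posterior_draws(y_subset, k, sigma2, prior_mean, prior_var, n_draws, rng):
    """Draw from a subset posterior for a normal mean (known variance sigma2).

    Assumption: the subset likelihood is raised to the power k, which
    multiplies the data precision by k in this conjugate model.
    """
    m = len(y_subset)
    precision = 1.0 / prior_var + k * m / sigma2
    mean = (prior_mean / prior_var + k * y_subset.sum() / sigma2) / precision
    return rng.normal(mean, np.sqrt(1.0 / precision), size=n_draws)


def barycenter_1d(draws_by_subset):
    """Approximate the equal-weight W2 barycenter of 1-D empirical measures.

    With the same number of draws per subset, averaging the sorted draws
    averages the empirical quantile functions.
    """
    sorted_draws = np.sort(np.asarray(draws_by_subset), axis=1)
    return sorted_draws.mean(axis=0)


rng = np.random.default_rng(0)
n, k = 10_000, 10                      # full sample size, number of subsets
sigma2, prior_mean, prior_var = 1.0, 0.0, 100.0
y = rng.normal(2.0, np.sqrt(sigma2), size=n)

# Step 1: divide the data into k subsets of equal size.
subsets = np.array_split(rng.permutation(y), k)

# Step 2: sample each subset posterior (a plain loop here; parallel in practice).
draws = [
    subset_posterior_draws(s, k, sigma2, prior_mean, prior_var, 2000, rng)
    for s in subsets
]

# Step 3: combine the subset draws through their 1-D barycenter.
combined = barycenter_1d(draws)

# Exact full data posterior of the toy model, for comparison.
full_precision = 1.0 / prior_var + n / sigma2
full_mean = (prior_mean / prior_var + y.sum() / sigma2) / full_precision
full_sd = np.sqrt(1.0 / full_precision)

print("combined mean/sd:", combined.mean(), combined.std())
print("full     mean/sd:", full_mean, full_sd)
```

In this conjugate setting the combined draws can be checked against the exact full data posterior; for the models in the talk no such closed form is available, which is where the combination step matters.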
This talk is based on joint work with David B. Dunson (Duke University), Rajarshi Guhaniyogi (UC Santa Cruz), Cheng Li (National University of Singapore), and Terrance Savitsky (U.S. Bureau of Labor Statistics). Most of the talk will be based on the manuscript entitled "Scalable Bayes via Barycenter in Wasserstein Space" available at https://arxiv.org/abs/1508.05880.
Refreshments at 3:45pm in Snedecor 2101.