Machine Learning Thesis Defense
- Gates Hillman Centers
- ASA Conference Room 6115
- WILLIE NEISWANGER
- Ph.D. Student
- Machine Learning Department
- Carnegie Mellon University
Post-Inference Methods for Scalable Probabilistic Modeling and Sequential Decision Making
Probabilistic modeling refers to a set of techniques for modeling data that allow one to specify assumptions about the processes that generate data, incorporate prior beliefs about models, and infer properties of these models given observed data. Benefits include uncertainty quantification, multiple plausible solutions, reduced overfitting, better performance with small data sets or large models, and explicit incorporation of a priori knowledge and problem structure. In recent decades, an array of inference algorithms has been developed to estimate these models.
This thesis focuses on post-inference methods: procedures that can be applied after the completion of standard inference algorithms to allow for increased efficiency, accuracy, or parallelism when learning probabilistic models of large data sets. These methods also allow for scalable computation in distributed or online settings, incorporation of complex prior information, and better use of inference results in downstream tasks. A few examples include:
- Embarrassingly parallel inference: Large data sets are often distributed over a collection of machines. We first compute an inference result (e.g. with Markov chain Monte Carlo or variational inference) on each machine, in parallel, without communication between machines. Afterwards, we combine the results to yield an inference result for the full data set.
- Prior swapping: Certain model priors limit the number of applicable inference algorithms, or increase their computational cost. We first choose any "convenient prior" (e.g. a conjugate prior, or a prior that allows for computationally cheap inference), and compute an inference result. Afterwards, we use this result to efficiently perform inference with other, more sophisticated priors or regularizers.
- Sequential decision making and optimization: Model-based sequential decision making and optimization methods use models to define acquisition functions. We compute acquisition functions using the inference result from any probabilistic program or model framework, and perform efficient inference in sequential settings.
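To make the embarrassingly parallel idea concrete, here is a minimal sketch of one simple combination strategy: fit a Gaussian to each machine's subposterior samples and combine them via the product-of-Gaussians identity (precisions add; means are precision-weighted). The function name and interface are illustrative, not the thesis's actual API, and this parametric combination is only one of several possible strategies.

```python
import numpy as np

def combine_subposteriors(subposterior_samples):
    """Combine independently computed subposterior samples.

    Each element of subposterior_samples is an (n_i, d) array of posterior
    samples computed on one machine, with no inter-machine communication.
    Returns the mean and covariance of the Gaussian approximation to the
    full-data posterior (product of the per-machine Gaussian fits).
    """
    precisions, weighted_means = [], []
    for samples in subposterior_samples:
        mu = samples.mean(axis=0)
        cov = np.atleast_2d(np.cov(samples, rowvar=False))
        prec = np.linalg.inv(cov)            # per-machine precision matrix
        precisions.append(prec)
        weighted_means.append(prec @ np.atleast_1d(mu))
    # Product of Gaussians: precisions add; mean is precision-weighted.
    full_prec = sum(precisions)
    full_cov = np.linalg.inv(full_prec)
    full_mean = full_cov @ sum(weighted_means)
    return full_mean, full_cov
```

Note that the only communication happens in the final combination step, after each machine has finished its own MCMC or variational run.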
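The prior-swapping idea can likewise be illustrated with a simple importance-reweighting sketch: samples obtained under a convenient prior are reweighted by the ratio of the target prior to the convenient prior. The function names and the self-normalized estimator below are assumptions made for illustration; the thesis develops more sophisticated procedures.

```python
import numpy as np

def prior_swap_weights(samples, log_target_prior, log_convenient_prior):
    """Importance weights w ∝ pi_target(theta) / pi_convenient(theta),
    evaluated at posterior samples drawn under the convenient prior."""
    log_w = log_target_prior(samples) - log_convenient_prior(samples)
    log_w -= log_w.max()          # stabilize before exponentiating
    w = np.exp(log_w)
    return w / w.sum()            # self-normalized weights

def swapped_mean(samples, weights):
    """Posterior mean under the target prior, estimated by reweighting."""
    return (weights[:, None] * samples).sum(axis=0)
```

The convenient prior is chosen purely for computational reasons (e.g. conjugacy); the reweighting step then recovers estimates under the prior one actually wants.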
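Finally, a sample-based acquisition function of the kind alluded to in the last bullet can be sketched as follows: given posterior draws of the objective at candidate points (from any inference method or probabilistic program), expected improvement reduces to a Monte Carlo average. The names and maximization convention are illustrative assumptions.

```python
import numpy as np

def expected_improvement(value_samples, best_so_far):
    """value_samples: (n_samples, n_candidates) posterior draws of the
    objective at each candidate point. Returns EI per candidate
    (maximization convention)."""
    improvement = np.maximum(value_samples - best_so_far, 0.0)
    return improvement.mean(axis=0)

def next_query(value_samples, best_so_far):
    """Index of the candidate point to evaluate next."""
    return int(np.argmax(expected_improvement(value_samples, best_so_far)))
```

Because the acquisition function consumes only posterior samples, it is agnostic to how those samples were produced, which is what allows these methods to plug into arbitrary probabilistic programming frameworks.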
We also describe the benefits of combining the above methods; present methodology for applying the embarrassingly parallel procedures when the number of machines is dynamic or unknown at inference time; illustrate how these methods can be applied for spatiotemporal analysis and in covariate-dependent models; show how to optimize these methods by incorporating test functions of interest; and demonstrate how these methods can be implemented in probabilistic programming frameworks for automatic deployment.
Eric Xing (Chair)
Ryan Adams (Princeton University)
Yee Whye Teh (University of Oxford and DeepMind)