A probabilistic programming approach to probabilistic data analysis

F Saad, VK Mansinghka - Advances in Neural Information …, 2016 - proceedings.neurips.cc
Advances in Neural Information Processing Systems, 2016 - proceedings.neurips.cc
Abstract
Probabilistic techniques are central to data analysis, but different approaches can be challenging to apply, combine, and compare. This paper introduces composable generative population models (CGPMs), a computational abstraction that extends directed graphical models and can be used to describe and compose a broad class of probabilistic data analysis techniques. Examples include discriminative machine learning, hierarchical Bayesian models, multivariate kernel methods, clustering algorithms, and arbitrary probabilistic programs. We demonstrate the integration of CGPMs into BayesDB, a probabilistic programming platform that can express data analysis tasks using a modeling definition language and structured query language. The practical value is illustrated in two ways. First, the paper describes an analysis on a database of Earth satellites, which identifies records that probably violate Kepler’s Third Law by composing causal probabilistic programs with non-parametric Bayes in 50 lines of probabilistic code. Second, it reports the lines of code and accuracy of CGPMs compared with baseline solutions from standard machine learning libraries.
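To make the Kepler's Third Law check in the abstract concrete, here is a minimal sketch of what a period that "violates" the law means for an Earth satellite. It is not the paper's CGPM or BayesDB code: it is a plain deterministic check, whereas the paper composes a causal probabilistic program with non-parametric Bayes. The field names (`apogee_km`, `perigee_km`, `period_minutes`), the two sample records, and the 5% tolerance are assumptions made for illustration only.

```python
import math

# Standard gravitational parameter of Earth (G * M), in km^3 / s^2.
MU_EARTH_KM3_S2 = 398600.4418
# Equatorial radius of Earth, in km (altitudes are measured above the surface).
EARTH_RADIUS_KM = 6378.137


def kepler_period_minutes(apogee_km, perigee_km):
    """Orbital period implied by Kepler's Third Law, T = 2*pi*sqrt(a^3 / mu)."""
    # Semi-major axis: Earth's radius plus the mean of apogee and perigee altitudes.
    semi_major_axis_km = EARTH_RADIUS_KM + (apogee_km + perigee_km) / 2.0
    period_s = 2.0 * math.pi * math.sqrt(semi_major_axis_km ** 3 / MU_EARTH_KM3_S2)
    return period_s / 60.0


def violates_kepler(record, rel_tol=0.05):
    """Flag a record whose stated period is far from the Keplerian period.

    rel_tol is a hypothetical threshold chosen for this sketch.
    """
    expected = kepler_period_minutes(record["apogee_km"], record["perigee_km"])
    return abs(record["period_minutes"] - expected) > rel_tol * expected


# Hypothetical records, not taken from the paper's satellite database.
records = [
    {"name": "geo-like", "apogee_km": 35786.0, "perigee_km": 35786.0,
     "period_minutes": 1436.0},
    {"name": "suspect", "apogee_km": 35786.0, "perigee_km": 35786.0,
     "period_minutes": 142.0},
]

flagged = [r["name"] for r in records if violates_kepler(r)]
print(flagged)
```

A fixed threshold like this ignores measurement noise and data-entry patterns. A probabilistic model of the deviation from the Keplerian period can instead assign each record a likelihood, which fits the abstract's description of records that "probably" violate the law.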