Nonparametric learning from Bayesian models with randomized objective functions

DOI: 10.48550/ARXIV.1806.11544
MaRDI QID: Q70214

Authors S. P. Lyddon, S. G. Walker, C. C. Holmes

Publication date 29 June 2018

Abstract: Bayesian learning is built on an assumption that the model space contains a true reflection of the data generating mechanism. This assumption is problematic, particularly in complex data environments. Here we present a Bayesian nonparametric approach to learning that makes use of statistical models, but does not assume that the model is true. Our approach has provably better properties than using a parametric model and admits a Monte Carlo sampling scheme that can afford massive scalability on modern computer architectures. The model-based aspect of learning is particularly attractive for regularizing nonparametric inference when the sample size is small, and also for correcting approximate approaches such as variational Bayes (VB). We demonstrate the approach on a number of examples including VB classifiers and Bayesian random forests.

Cited in

(1)

PosteriorBootstrap

Summary This article presents a Bayesian nonparametric approach to updating the posterior distribution of a model parameter given data, without assuming that the model is true. The parameter of interest is the value of $\theta$ that minimizes the KL divergence between the true data-generating mechanism $F_0$ and the parametric family $\mathcal{F}_{\Theta} = \{f_\theta : \theta \in \Theta\}$. Beliefs about $F_0$ are specified through a mixture of Dirichlet processes (MDP) prior, $[F|\theta] \sim \text{DP}(c, f_\theta(\cdot)); \quad \theta \sim \pi(\theta)$, where the concentration parameter $c$ controls how strongly the prior on $F$ is centered on the parametric model. Given data $x_{1:n}$, the posterior for $F$ follows from the DP's conjugacy: $[F|\theta, x_{1:n}] \sim \text{DP}\left(c+n, \frac{c}{c+n} f_\theta(\cdot) + \frac{1}{c+n}\sum_{i=1}^{n}\delta_{x_i}(\cdot)\right)$. Integrating over the possible $F$ then yields the posterior distribution of the parameter, $\tilde{\pi}(\theta|x_{1:n})$.
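
The Monte Carlo scheme mentioned in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of this page: a normal location model $f_\theta = N(\theta, 1)$ with known unit variance, a conjugate $N(0, \text{prior\_var})$ prior on $\theta$, and a finite set of `n_pseudo` pseudo-samples standing in for the DP base measure. The function name and all of its parameters are hypothetical. Each draw samples Dirichlet weights over the observed data and the model-generated pseudo-samples, then maximizes the weighted log-likelihood, which for this model reduces to a weighted mean. Setting `c=0` drops the model-based pseudo-samples and leaves a purely data-driven weighted bootstrap.

```python
import numpy as np


def posterior_bootstrap_normal_mean(x, c=1.0, n_draws=1000, n_pseudo=100,
                                    prior_var=10.0, seed=0):
    """Illustrative sketch only: randomized-objective draws for the mean of a
    N(theta, 1) model. All names and defaults here are assumptions."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size

    # Conjugate parametric posterior for theta under a N(0, prior_var) prior.
    post_var = 1.0 / (n + 1.0 / prior_var)
    post_mean = post_var * x.sum()

    draws = np.empty(n_draws)
    for b in range(n_draws):
        if c > 0:
            # Centre the DP on the model: draw theta, then pseudo-samples from f_theta.
            theta = rng.normal(post_mean, np.sqrt(post_var))
            pseudo = rng.normal(theta, 1.0, size=n_pseudo)
            points = np.concatenate([x, pseudo])
            # Weight 1 per observation, total weight c spread over the pseudo-samples.
            alpha = np.concatenate([np.ones(n), np.full(n_pseudo, c / n_pseudo)])
        else:
            # c = 0: no model-based regularization, data points only.
            points = x
            alpha = np.ones(n)

        w = rng.dirichlet(alpha)
        # Maximizing sum_j w_j * log N(x_j | theta, 1) gives the weighted mean.
        draws[b] = np.dot(w, points)
    return draws


if __name__ == "__main__":
    data = np.random.default_rng(42).normal(2.0, 1.0, size=50)
    samples = posterior_bootstrap_normal_mean(data, c=1.0)
    print(samples.mean(), samples.std())
```

Because every draw is independent of the others, the loop can be distributed across workers without coordination, which is the property behind the scalability claim in the abstract. For models without a closed-form weighted maximizer, the weighted mean would be replaced by a numerical optimizer.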

Summary_simple Imagine you have a math model that tries to guess how some data was created. To make the model better, you update its "best guess" (called a parameter) based on new data. This process uses a special tool called a "mixture of Dirichlet processes" to handle uncertainty about the true source of the data. When new data comes in, this tool helps adjust the model's guess by balancing old assumptions with new information. Essentially, it's a way to refine a math model's accuracy by embracing and updating its uncertainties as more data becomes available.

