Abstract: This paper discusses the challenges that tall data pose for Bayesian classification (specifically binary classification) and the existing methods for handling them. Current methods include parallelizing the likelihood, subsampling, and consensus Monte Carlo. A new method based on the two-stage Metropolis-Hastings algorithm is also proposed. Its purpose is to reduce the cost of computing the exact likelihood in the tall data setting. In the first stage, a new proposal is screened using a model based on an approximate likelihood. The posterior computation based on the full likelihood is carried out only if the proposal passes this first-stage screening. Furthermore, the method can be incorporated into the consensus Monte Carlo framework. The two-stage method is applied to logistic regression, hierarchical logistic regression, and Bayesian multivariate adaptive regression splines.
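The two-stage screening described in the abstract can be illustrated with a minimal Python sketch for logistic regression. This is not the authors' implementation: the abstract does not say how the approximate likelihood is built, so the sketch assumes a fixed random subsample whose log-likelihood is rescaled by n/m, a symmetric Gaussian random-walk proposal, and independent normal priors. The names `log_post`, `two_stage_mh`, `sub_size`, `step`, and `prior_sd` are hypothetical. The second-stage ratio divides out the first-stage ratio, which is the standard delayed-acceptance correction that keeps the full posterior as the target.

```python
import numpy as np

def log_post(theta, X, y, scale=1.0, prior_sd=10.0):
    """Log posterior (up to a constant) for logistic regression with
    independent N(0, prior_sd^2) priors (an assumed prior).
    `scale` reweights the log-likelihood when X, y are a subsample."""
    z = X @ theta
    loglik = np.sum(y * z - np.logaddexp(0.0, z))  # stable log(1 + exp(z))
    logprior = -0.5 * np.sum(theta ** 2) / prior_sd ** 2
    return scale * loglik + logprior

def two_stage_mh(X, y, n_iter=2000, sub_size=200, step=0.05, seed=0):
    """Two-stage Metropolis-Hastings with a random-walk proposal.
    Returns the chain and the number of proposals that reached stage 2."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Assumption: the cheap model is a FIXED subsample, so the approximate
    # posterior is a deterministic function of theta.
    idx = rng.choice(n, size=min(sub_size, n), replace=False)
    Xs, ys, scale = X[idx], y[idx], n / len(idx)

    theta = np.zeros(d)
    cur_approx = log_post(theta, Xs, ys, scale)
    cur_full = log_post(theta, X, y)
    samples = np.empty((n_iter, d))
    n_full = 0
    for t in range(n_iter):
        prop = theta + step * rng.standard_normal(d)
        prop_approx = log_post(prop, Xs, ys, scale)
        # Stage 1: screen the proposal with the approximate posterior only.
        if np.log(rng.uniform()) < prop_approx - cur_approx:
            # Stage 2: evaluate the full posterior and correct for stage 1.
            prop_full = log_post(prop, X, y)
            n_full += 1
            log_a2 = (prop_full - cur_full) - (prop_approx - cur_approx)
            if np.log(rng.uniform()) < log_a2:
                theta, cur_approx, cur_full = prop, prop_approx, prop_full
        samples[t] = theta
    return samples, n_full
```

Calling `two_stage_mh(X, y)` with an `(n, d)` design matrix `X` and a 0/1 response vector `y` returns the chain and the count of full-likelihood evaluations spent on proposals. Every proposal rejected in stage 1 costs only a subsample evaluation, which is where savings on tall data would come from in this sketch.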
Recommendations
- scientific article; zbMATH DE number 6781368
- Extended stochastic gradient Markov chain Monte Carlo for large-scale Bayesian variable selection
- Double-parallel Monte Carlo for Bayesian analysis of big data
- Parallel Markov chain Monte Carlo for Bayesian hierarchical models with big data, in two stages
- Comparing consensus Monte Carlo strategies for distributed Bayesian computation
Cites work
- scientific article; zbMATH DE number 2117879
- scientific article; zbMATH DE number 6781368
- 10.1162/15324430152748236
- A Bayesian approach to characterizing uncertainty in inverse problems using coarse and fine-scale information
- Bayesian Classification of Tumours by Using Gene Expression Data
- Classification with Bayesian MARS
- Generalized Nonlinear Modeling With Multivariate Free-Knot Regression Splines
- Multivariate adaptive regression splines
- Reversible jump Markov chain Monte Carlo computation and Bayesian model determination
- Speeding Up MCMC by Efficient Data Subsampling
- The pseudo-marginal approach for efficient Monte Carlo computations
Cited in (6)
- scientific article; zbMATH DE number 6781368
- Accelerating sequential Monte Carlo with surrogate likelihoods
- Speeding up MCMC by Delayed Acceptance and Data Subsampling
- A two-stage adaptive Metropolis algorithm
- Scalable Bayesian Nonparametric Clustering and Classification
- The Block-Poisson Estimator for Optimally Tuned Exact Subsampling MCMC
This page was built for publication: Two-stage Metropolis-Hastings for tall data