DOLDA: a regularized supervised topic model for high-dimensional multi-class regression
From MaRDI portal
Publication:2184404
Abstract: Generating user-interpretable multi-class predictions in data-rich environments with many classes and explanatory covariates is a daunting task. We introduce Diagonal Orthant Latent Dirichlet Allocation (DOLDA), a supervised topic model for multi-class classification that can handle both many classes and many covariates. To handle many classes we use the recently proposed Diagonal Orthant (DO) probit model (Johndrow et al., 2013) together with an efficient Horseshoe prior for variable selection/shrinkage (Carvalho et al., 2010). We propose a computationally efficient parallel Gibbs sampler for the new model. An important advantage of DOLDA is that learned topics are directly connected to individual classes without the need for a reference class. We evaluate the model's predictive accuracy on two datasets and demonstrate DOLDA's advantage in interpreting the generated predictions.
Cites work
- 10.1162/jmlr.2003.3.4-5.993
- A Bayesian analysis of the multinomial probit model using marginal data augmentation
- Bayesian Analysis of Binary and Polychotomous Response Data
- Bayesian Inference for Logistic Models Using Pólya–Gamma Latent Variables
- Bayesian linear regression with sparse priors
- Distributed algorithms for topic models
- Gibbs Sampling for Bayesian Non-Conjugate and Hierarchical Models by Using Auxiliary Variables
- MedLDA: maximum margin supervised topic models
- Sparse Partially Collapsed MCMC for Parallel Inference in Topic Models
- Statistical topic models for multi-label document classification
- The horseshoe estimator for sparse signals
- Tree ensembles with rule structured horseshoe regularization
Cited in (3)