Nonparametric Bayesian topic modelling with the hierarchical Pitman-Yor processes

Publication:324682

DOI: 10.1016/J.IJAR.2016.07.007
zbMATH Open: None
arXiv: 1609.06783
OpenAlex: W2483632691
Wikidata: Q28111781
Scholia: Q28111781
MaRDI QID: Q324682
FDO: Q324682

Lan Du, Wray Buntine, Kar Wai Lim, Changyou Chen

Publication date: 17 October 2016

Published in: International Journal of Approximate Reasoning

Abstract: The Dirichlet process and its extension, the Pitman-Yor process, are stochastic processes that take probability distributions as a parameter. These processes can be stacked up to form a hierarchical nonparametric Bayesian model. In this article, we present efficient methods for the use of these processes in this hierarchical context, and apply them to latent variable models for text analytics. In particular, we propose a general framework for designing these Bayesian models, which are called topic models in the computer science community. We then propose a specific nonparametric Bayesian topic model for modelling text from social media. We focus on tweets (posts on Twitter) in this article due to their ease of access. We find that our nonparametric model performs better than existing parametric models in both goodness of fit and real-world applications.


Full work available at URL: https://arxiv.org/abs/1609.06783









Cited In (7)





