Word-class embeddings for multiclass text classification
Abstract: Pre-trained word embeddings encode general word semantics and lexical regularities of natural language, and have proven useful across many NLP tasks, including word sense disambiguation, machine translation, and sentiment analysis. In supervised tasks such as multiclass text classification (the focus of this article), it seems appealing to enhance word representations with ad hoc embeddings that encode task-specific information. We propose (supervised) word-class embeddings (WCEs) and show that, when concatenated with (unsupervised) pre-trained word embeddings, they substantially facilitate the training of deep-learning models in multiclass classification by topic. We present empirical evidence that WCEs yield a consistent improvement in multiclass classification accuracy, using four popular neural architectures and six widely used, publicly available datasets for multiclass text classification. Our code implementing WCEs is publicly available at https://github.com/AlexMoreo/word-class-embeddings
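To make the idea of concatenating supervised and unsupervised word representations concrete, here is a minimal sketch in Python with NumPy. The abstract does not specify how WCEs are computed, so the scoring used below (the average of standardized word counts over the documents of each class) is an assumption chosen for illustration, not necessarily the authors' method. The function name `word_class_embeddings`, the toy matrices, and the random stand-in for pre-trained vectors are all hypothetical.

```python
import numpy as np

def word_class_embeddings(X, Y):
    """Illustrative word-class matrix (assumed scoring, not the paper's).

    X: (n_docs, n_words) term-count matrix.
    Y: (n_docs, n_classes) binary label matrix.
    Returns an (n_words, n_classes) matrix with one row per word.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Standardize each word column across documents; constant columns
    # keep a unit divisor so they map to all-zero rows instead of NaN.
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Xz = (X - X.mean(axis=0)) / std
    # Entry (w, c) sums the standardized counts of word w over the
    # documents labeled with class c, divided by the number of documents.
    return Xz.T @ Y / X.shape[0]

# Toy corpus: 4 documents, 3 words, 2 classes.
X = np.array([[2, 0, 1],
              [1, 0, 1],
              [0, 3, 1],
              [0, 1, 1]])
Y = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [0, 1]])

wce = word_class_embeddings(X, Y)

# Random stand-in for 5-dimensional pre-trained embeddings (hypothetical).
rng = np.random.default_rng(0)
pretrained = rng.normal(size=(3, 5))

# Each word's final representation is its pre-trained vector followed by
# its word-class vector, giving 5 + 2 = 7 dimensions per word.
combined = np.hstack([pretrained, wce])
```

In this toy setup the supervised part adds one dimension per class, so a word that occurs only in documents of one class receives a larger value in that class's dimension, while a word that is equally frequent everywhere receives zeros.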