news20

Dataset:6034164



OpenML ID: 1594, MaRDI QID: Q6034164

OpenML dataset with id 1594

No author found in the portal metadata; the dataset description below credits Ken Lang.

Full work available at URL: https://api.openml.org/data/v1/download/1595696/news20.sparse_arff

Upload date: 18 June 2015


Dataset Characteristics

Number of classes: 0
Number of features: 62,062 (numeric: 62,062; symbolic: 0; binary: 0)
Number of instances: 19,928
Number of instances with missing values: 0
Number of missing values: 0
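
As the `.sparse_arff` extension of the download URL suggests, the file appears to be stored in sparse ARFF format, which suits a dataset with 62,062 numeric features. The following is a minimal sketch of reading such rows with only the Python standard library. The function names `parse_sparse_row` and `iter_sparse_rows` and the toy sample are illustrative assumptions, not part of OpenML or the portal. The sketch assumes the usual sparse ARFF convention: each data row is written as `{index value, ...}` with zero-based attribute indices, and omitted attributes are zero.

```python
def parse_sparse_row(line):
    """Parse one sparse ARFF data row, e.g. '{0 1.5, 3 2}'.

    Returns a dict mapping zero-based attribute index to a float.
    Attributes absent from the dict are implicitly zero.
    Limitation: quoted values containing commas are not handled.
    """
    body = line.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("not a sparse ARFF row: %r" % line)
    body = body[1:-1].strip()
    row = {}
    if not body:
        return row  # an all-zero instance
    for pair in body.split(","):
        index_text, value_text = pair.strip().split(None, 1)
        row[int(index_text)] = float(value_text)
    return row


def iter_sparse_rows(lines):
    """Yield parsed rows from the lines of a sparse ARFF file.

    Skips blank lines, '%' comments, and everything before '@data'.
    """
    in_data = False
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("%"):
            continue
        if not in_data:
            in_data = line.lower().startswith("@data")
            continue
        yield parse_sparse_row(line)


# Toy input for illustration only; this is not real news20 data.
sample = """% toy example
@relation toy
@attribute a0 numeric
@attribute a1 numeric
@attribute a2 numeric
@attribute a3 numeric
@data
{0 1.5, 3 2}
{}
{2 0.25}
"""

rows = list(iter_sparse_rows(sample.splitlines()))
```

For the full file, the same generator could be fed an open file handle instead of `sample.splitlines()`, so that the 19,928 instances are streamed rather than held in memory as text.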

Author: Ken Lang
Source: original - Date unknown
Please cite: Ken Lang. NewsWeeder: Learning to filter netnews. In Proceedings of the Twelfth International Conference on Machine Learning, pages 331-339, 1995.

Dataset from the LIBSVM data repository.

Preprocessing: First 80/20 training/testing split. Also see http://qwone.com/~jason/20Newsgroups/