adult

From MaRDI portal
Dataset:6032980



OpenML 179, MaRDI QID Q6032980

OpenML dataset with id 179

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/3608/adult.arff
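A minimal sketch of how the file at the URL above could be fetched and read with only the Python standard library. The helper names `fetch_text` and `parse_arff` are hypothetical, the parser handles only simple dense ARFF (no sparse rows, no quoted attribute names containing spaces), and the inline `SAMPLE` is toy data, not an excerpt of this dataset.

```python
import csv
import urllib.request

ARFF_URL = "https://api.openml.org/data/v1/download/3608/adult.arff"


def fetch_text(url: str = ARFF_URL) -> str:
    """Download the file and return it as text (requires network access)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse_arff(text: str):
    """Minimal dense-ARFF reader returning (attribute_names, rows).

    Skips '%' comments, collects names from @attribute lines, and reads
    comma-separated rows after @data. Values are kept as strings.
    """
    names, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue
        lower = line.lower()
        if in_data:
            # ARFF quotes nominal values with single quotes when needed.
            rows.append(next(csv.reader([line], quotechar="'", skipinitialspace=True)))
        elif lower.startswith("@attribute"):
            names.append(line.split(None, 2)[1].strip("'\""))
        elif lower.startswith("@data"):
            in_data = True
    return names, rows


# Toy input for illustration only (hypothetical attributes and values).
SAMPLE = """\
% toy file, not real data
@relation toy
@attribute age numeric
@attribute class {yes,no}
@data
25,yes
?,no
"""

names, rows = parse_arff(SAMPLE)
print(names)
print(rows)
```

To read the real file, one would call `parse_arff(fetch_text())`. For anything beyond a quick look, a dedicated ARFF reader is likely the safer choice.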

Upload date: 23 April 2014



Dataset Characteristics

Number of classes: 2
Number of features: 15 (numeric: 2, symbolic: 13; binary in total: 2)
Number of instances: 48,842
Number of instances with missing values: 3,620
Number of missing values: 6,465
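The last two figures count different things: instances that contain at least one missing value, and missing cells overall. A small sketch of the distinction, assuming rows are lists of strings and that missing values are marked with `?` (the ARFF convention). The function name `missing_stats` and the toy rows are hypothetical and not taken from the dataset.

```python
MISSING = "?"  # ARFF marker for a missing value


def missing_stats(rows):
    """Return (instances_with_missing, total_missing_values)."""
    per_row = [sum(1 for value in row if value == MISSING) for row in rows]
    instances_with_missing = sum(1 for n in per_row if n > 0)
    return instances_with_missing, sum(per_row)


# Toy rows for illustration only.
toy_rows = [
    ["39", "State-gov", "Bachelors"],
    ["50", "?", "?"],
    ["38", "Private", "?"],
]

rows_with_missing, total_missing = missing_stats(toy_rows)
print(rows_with_missing, total_missing)  # 2 3
```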

Author: Ronny Kohavi and Barry Becker
Source: UCI - 1996-05-01
Please cite: Ron Kohavi, "Scaling Up the Accuracy of Naive-Bayes Classifiers: a Decision-Tree Hybrid", Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, 1996

Note: This dataset is not the original UCI dataset. It has some discretized features. See version 2 for the original.

The prediction task is to determine whether a person makes over 50K a year. Barry Becker extracted the data from the 1994 Census database. A set of reasonably clean records was selected using the following conditions: ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0))
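The extraction conditions can be read as a row-level predicate in which all four comparisons must hold. A minimal sketch, assuming each record is a mapping keyed by the field names that appear in the condition. The function name `is_clean_record` and the sample values are hypothetical and serve only to exercise each comparison.

```python
from typing import Mapping


def is_clean_record(rec: Mapping[str, float]) -> bool:
    """True if the record satisfies
    (AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0)."""
    return (
        rec["AAGE"] > 16
        and rec["AGI"] > 100
        and rec["AFNLWGT"] > 1
        and rec["HRSWK"] > 0
    )


# Made-up records for illustration; not taken from the Census data.
sample = [
    {"AAGE": 39, "AGI": 5000, "AFNLWGT": 2000, "HRSWK": 40},  # passes all four
    {"AAGE": 16, "AGI": 5000, "AFNLWGT": 2000, "HRSWK": 40},  # fails AAGE>16
    {"AAGE": 45, "AGI": 100, "AFNLWGT": 2000, "HRSWK": 40},   # fails AGI>100
    {"AAGE": 45, "AGI": 5000, "AFNLWGT": 2000, "HRSWK": 0},   # fails HRSWK>0
]

kept = [rec for rec in sample if is_clean_record(rec)]
print(len(kept))  # 1
```

All comparisons are strict, so boundary values such as AAGE equal to 16 or AGI equal to 100 are excluded.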

Ronny Kohavi and Barry Becker. Data Mining and Visualization, Silicon Graphics. e-mail: ronnyk '@' live.com for questions.