house_8L

From MaRDI portal

OpenML ID: 218; MaRDI QID: Q6033019

OpenML dataset with id 218

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/3655/house_8L.arff

Upload date: 23 April 2014


Dataset Characteristics

Number of classes: 0
Number of features: 9 (numeric: 9, symbolic: 0, binary: 0)
Number of instances: 22,784
Number of instances with missing values: 0
Number of missing values: 0
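
The characteristics listed above can be recomputed from the ARFF file itself. The following is a minimal sketch, assuming a dense, comma-separated ARFF file without quoted values; the function name `summarize_arff`, the attribute names, and the sample rows are hypothetical and do not come from house_8L.

```python
import io

def summarize_arff(text):
    """Count attributes, instances, and missing values in dense ARFF text."""
    attributes = []   # (name, type) pairs read from @attribute lines
    rows = []         # data rows, each a list of raw string cells
    in_data = False
    for raw in io.StringIO(text):
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        lower = line.lower()
        if not in_data:
            if lower.startswith("@attribute"):
                _, name, kind = line.split(None, 2)
                attributes.append((name, kind.lower()))
            elif lower.startswith("@data"):
                in_data = True
            continue
        rows.append([cell.strip() for cell in line.split(",")])

    # ARFF marks a missing value with a single question mark.
    missing_cells = sum(cell == "?" for row in rows for cell in row)
    rows_with_missing = sum(any(cell == "?" for cell in row) for row in rows)
    numeric = sum(kind in ("numeric", "real", "integer") for _, kind in attributes)
    return {
        "features": len(attributes),
        "numeric": numeric,
        "instances": len(rows),
        "instances_with_missing": rows_with_missing,
        "missing_values": missing_cells,
    }

# Invented toy file, used only to exercise the function.
SAMPLE = """@relation toy_house
@attribute x1 numeric
@attribute x2 numeric
@attribute price numeric
@data
0.25,0.10,85000
0.40,?,120000
"""

summary = summarize_arff(SAMPLE)
print(summary)
```

Run against the downloaded house_8L.arff text, the same function should report the feature and instance counts listed above, provided the file follows the dense layout assumed here.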

Author: none listed
Source: Unknown
Please cite: none listed

This database was designed on the basis of data provided by the US
Census Bureau [1] (under Lookup Access [2]: Summary Tape File 1). The
data were collected as part of the 1990 US census. These are mostly
counts accumulated at different survey levels. For this data set, the
State-Place level was used, and data from all states were obtained.
Most of the counts were converted into appropriate proportions.

Four data sets were derived from this database: House(8H), House(8L),
House(16H), and House(16L). All are concerned with predicting the
median house price in a region from the demographic composition and
the state of the housing market in that region. The number in the
name gives the number of attributes in the data set. The letter that
follows is a very rough indication of the difficulty of the task. For
Low task difficulty, the more correlated attributes were chosen, as
indicated by a univariate smooth fit of each input on the target.
Tasks with High difficulty had their attributes chosen to make the
modelling more difficult, through higher variance or lower
correlation of the inputs with the target.
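
The two preprocessing ideas in the description, turning counts into proportions and choosing attributes by how strongly each one relates to the target, can be illustrated with a small sketch. It is not the procedure used to build the House data sets: the description mentions a univariate smooth fit, and absolute Pearson correlation is swapped in here as a simpler stand-in. All names (`to_proportions`, `rank_by_correlation`, `owner_share`, `vacant_share`) and all numbers are invented for illustration.

```python
from math import sqrt

def to_proportions(counts, totals):
    """Divide each raw count by its region total (0.0 when the total is 0)."""
    return [c / t if t else 0.0 for c, t in zip(counts, totals)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def rank_by_correlation(columns, target):
    """Order attribute names by |correlation| with the target, strongest first."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Invented toy regions: raw counts plus a median price per region.
population = [1000, 2000, 4000, 5000, 8000]
owner_occupied = [100, 400, 1200, 2000, 4000]
vacant = [50, 20, 200, 30, 100]
price = [50000, 70000, 90000, 110000, 130000]

columns = {
    "owner_share": to_proportions(owner_occupied, population),
    "vacant_share": to_proportions(vacant, population),
}

ranking = rank_by_correlation(columns, price)
print(ranking)
```

Under this stand-in criterion, a Low variant would keep attributes from the front of the ranking and a High variant would draw from the back. The real selection may have weighed variance and nonlinear fit as well.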

Original source: DELVE repository of data. 
Source: collection of regression datasets by Luis Torgo (ltorgo@ncc.up.pt) at
http://www.ncc.up.pt/~ltorgo/Regression/DataSets.html
Characteristics: 22784 cases, 9 continuous attributes.