pollution
Dataset:6033276
OpenML dataset with id 542
No author found.
Full work available at URL: https://api.openml.org/data/v1/download/52654/pollution.arff
Upload date: 29 September 2014
Dataset Characteristics
Number of classes: 0
Number of features: 16 (numeric: 16, symbolic: 0, binary: 0)
Number of instances: 60
Number of instances with missing values: 0
Number of missing values: 0
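The counts above can be checked directly against an ARFF file. The following is a minimal sketch, assuming a dense ARFF whose attributes are all numeric and unquoted, as the characteristics listed here suggest. The helper name `parse_numeric_arff` and the inline sample are hypothetical, and the sample rows are placeholder values, not rows from pollution.arff.

```python
def parse_numeric_arff(text):
    """Parse a dense, all-numeric ARFF string into (attribute names, rows).

    Assumes unquoted attribute names and comma-separated numeric values;
    a '?' entry is read as a missing value (None).
    """
    names, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        lower = line.lower()
        if in_data:
            rows.append([None if v.strip() == "?" else float(v)
                         for v in line.split(",")])
        elif lower.startswith("@attribute"):
            names.append(line.split()[1])  # second token is the name
        elif lower.startswith("@data"):
            in_data = True
    return names, rows


# Placeholder content for illustration only, NOT taken from the dataset.
SAMPLE = """\
% placeholder values
@relation demo
@attribute PREC numeric
@attribute JANT numeric
@attribute MORT numeric
@data
1,2,3
4,?,6
"""

names, rows = parse_numeric_arff(SAMPLE)
n_features = len(names)
n_instances = len(rows)
n_missing = sum(v is None for row in rows for v in row)
n_rows_with_missing = sum(any(v is None for v in row) for row in rows)
```

Run on the full file, the same four counts would be expected to match the figures listed in this record, provided the file follows the assumptions stated above.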
Author: (none given)
Source: Unknown; date unknown
Please cite: (none given)
This is the pollution data so loved by writers of papers on ridge regression.
Source: McDonald, G.C. and Schwing, R.C. (1973) 'Instabilities of regression estimates relating air pollution to mortality', Technometrics, vol. 15, 463-
Variables in order:
PREC: Average annual precipitation in inches
JANT: Average January temperature in degrees F
JULT: Average July temperature in degrees F
OVR65: % of 1960 SMSA population aged 65 or older
POPN: Average household size
EDUC: Median school years completed by those over 22
HOUS: % of housing units which are sound and with all facilities
DENS: Population per sq. mile in urbanized areas, 1960
NONW: % non-white population in urbanized areas, 1960
WWDRK: % employed in white collar occupations
POOR: % of families with income < $3000
HC: Relative hydrocarbon pollution potential
NOX: Relative nitric oxides pollution potential
SO@: Relative sulphur dioxide pollution potential
HUMID: Annual average % relative humidity at 1pm
MORT: Total age-adjusted mortality rate per 100,000
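Since the description ties this data to ridge regression and to unstable regression estimates, a small sketch of the ridge estimator may help. It assumes NumPy is available, uses the textbook closed form on standardized predictors with an unpenalized intercept, and runs on synthetic, nearly collinear predictors rather than the actual pollution variables. The function name `ridge_fit` and the penalty value `alpha=1.0` are illustrative choices, not taken from the source.

```python
import numpy as np


def ridge_fit(X, y, alpha):
    """Closed-form ridge coefficients on standardized predictors.

    Solves (Z'Z + alpha * I) beta = Z'(y - mean(y)), where Z is X with each
    column centered and scaled to unit standard deviation. The intercept
    (mean of y) is returned separately and is not penalized.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    p = Z.shape[1]
    beta = np.linalg.solve(Z.T @ Z + alpha * np.eye(p), Z.T @ yc)
    return beta, y.mean()


# Synthetic stand-in data (NOT the pollution dataset): two predictors that
# are almost identical, which makes least-squares coefficients unstable.
rng = np.random.default_rng(0)
n = 60
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(size=n)

beta_ols, _ = ridge_fit(X, y, alpha=0.0)    # alpha = 0 gives least squares
beta_ridge, _ = ridge_fit(X, y, alpha=1.0)  # a small penalty shrinks beta

print("least squares:", beta_ols)
print("ridge        :", beta_ridge)
```

To apply the same function to this dataset, one would pass the 15 predictor columns as `X` and a chosen response such as MORT as `y`. The record itself does not designate a target column, so that choice is an assumption.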
Information about the dataset
CLASSTYPE: numeric
CLASSINDEX: none specific