detroit

OpenML ID: 208
MaRDI QID: Q6033009

OpenML dataset with id 208

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/3645/detroit.arff

Upload date: 23 April 2014


Dataset Characteristics

Number of classes: 0
Number of features: 14 (numeric: 14, symbolic: 0, binary: 0)
Number of instances: 13
Number of instances with missing values: 0
Number of missing values: 0
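
The characteristics above (all attributes numeric, no missing values) mean the ARFF file can be read with very little machinery. The following is a minimal sketch of such a reader, not a general ARFF parser: the function name parse_numeric_arff is made up for this illustration, and it assumes a dense file with unquoted attribute names and comma-separated numeric rows.

```python
def parse_numeric_arff(text):
    """Parse a dense, all-numeric ARFF document into (names, rows).

    Assumptions (not checked against the real detroit.arff): attribute
    names are unquoted, every attribute is numeric, and there are no
    missing values (no '?' entries), as the characteristics state.
    """
    names, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comment lines
        if not in_data:
            lower = line.lower()
            if lower.startswith("@attribute"):
                # Header line of the form "@attribute NAME TYPE"
                names.append(line.split()[1])
            elif lower.startswith("@data"):
                in_data = True  # everything after this is a data row
            continue
        rows.append([float(value) for value in line.split(",")])
    return names, rows


# Hypothetical usage with a locally downloaded copy of the file:
# with open("detroit.arff") as handle:
#     names, rows = parse_numeric_arff(handle.read())
# One would then expect len(names) == 14 and len(rows) == 13.
```

For anything beyond a quick look, an established ARFF reader is the safer choice, since it also handles quoting, nominal attributes and sparse rows.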

Author: (not given)
Source: Unknown
Please cite: (not given)

Data from StatLib (ftp stat.cmu.edu/datasets)

This is the data set called 'DETROIT' in the book 'Subset selection in
regression' by Alan J. Miller, published in the Chapman & Hall series of
monographs on Statistics & Applied Probability, no. 40. The data are
unusual in that a subset of three predictors can be found which fits the
data far better than the subsets found by the Efroymson stepwise
algorithm, by forward selection, or by backward elimination.
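
The contrast drawn above is between greedy procedures and an exhaustive search over all subsets of a given size. The following sketch shows the exhaustive variant only, ranking subsets by residual sum of squares. It is an illustration under stated assumptions, not Miller's algorithm: the helper names (solve, rss, best_subset) are made up here, and the fit uses the normal equations for brevity.

```python
from itertools import combinations


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with
    partial pivoting. Raises ZeroDivisionError if a is singular."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(rows, y, cols):
    """Residual sum of squares of a least-squares fit of y on an
    intercept plus the columns of rows listed in cols."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    return sum((t - f) ** 2 for t, f in zip(y, fitted))


def best_subset(rows, y, size):
    """Try every subset of `size` columns; return (smallest RSS, columns)."""
    n_cols = len(rows[0])
    return min((rss(rows, y, cols), cols) for cols in combinations(range(n_cols), size))
```

With eleven candidate predictors there are only 165 subsets of size three, so the exhaustive search is cheap. The normal equations can be numerically fragile when predictors are strongly correlated, so a QR-based least-squares routine would be preferable for real use.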

The original data were given in appendix A of 'Regression analysis and its
application: A data-oriented approach' by Gunst & Mason, Statistics
textbooks and monographs no. 24, Marcel Dekker. The data set has caused
problems because some copies of the Gunst & Mason book do not contain all
of the data, and because Miller does not say which variables he used as
predictors and which as the dependent variable. (HOM was the dependent
variable, and the predictors were FTP ... WE)

The data were collected by J.C. Fisher and used in his paper: "Homicide in
Detroit: The Role of Firearms", Criminology, vol. 14, 387-400 (1976)


The data are on the homicide rate in Detroit for the years 1961-1973.
FTP    - Full-time police per 100,000 population
UEMP   - % unemployed in the population
MAN    - Number of manufacturing workers in thousands
LIC    - Number of handgun licences per 100,000 population
GR     - Number of handgun registrations per 100,000 population
CLEAR  - % homicides cleared by arrests
WM     - Number of white males in the population
NMAN   - Number of non-manufacturing workers in thousands
GOV    - Number of government workers in thousands
HE     - Average hourly earnings
WE     - Average weekly earnings

HOM    - Number of homicides per 100,000 population
ACC    - Death rate in accidents per 100,000 population
ASR    - Number of assaults per 100,000 population
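
Since Miller's roles for the variables were a source of confusion, it can help to fix them explicitly in code. The sketch below separates the predictors from the dependent variable by name. It assumes that "FTP ... WE" means the eleven variables listed above from FTP through WE, and that the column names in the file match these abbreviations; split_roles is a name made up for this illustration.

```python
# Assumed reading of "FTP ... WE": the eleven variables listed from FTP to WE.
PREDICTORS = ["FTP", "UEMP", "MAN", "LIC", "GR", "CLEAR",
              "WM", "NMAN", "GOV", "HE", "WE"]
RESPONSE = "HOM"  # dependent variable according to the description


def split_roles(names, rows):
    """Return (X, y): predictor columns and the HOM column, looked up by
    name so that the column order in the file does not matter."""
    index = {name: i for i, name in enumerate(names)}
    x_idx = [index[p] for p in PREDICTORS]  # KeyError if a name is absent
    y_idx = index[RESPONSE]
    X = [[row[i] for i in x_idx] for row in rows]
    y = [row[y_idx] for row in rows]
    return X, y
```

ACC and ASR are left out of both X and y here, because the description names neither as a predictor nor as the dependent variable.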

N.B. Each case takes two lines.