ipums_la_99-small (Q6033119)

Latest revision as of 12:27, 16 April 2024

OpenML dataset with id 378

Language	Label	Description	Also known as
English	ipums_la_99-small	OpenML dataset with id 378

Statements

instance of

data set

0 references

dataset version identifier

1

0 references

description

**Author**: IPUMS (ipums@hist.umn.edu)
**Donor**: Stephen Bay (sbay@ics.uci.edu)
**Source**: [UCI](https://archive.ics.uci.edu/ml/datasets/IPUMS+Census+Database) - 1999
**Please cite**:

**IPUMS Database**
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be consistent across years. The original source for this data set is the IPUMS project (Ruggles and Sobek, 1997). The IPUMS project is a large collection of federal census data with standardized coding schemes that make comparisons across time easy.

The data is an unweighted 1 in 100 sample of responses from the Los Angeles-Long Beach area for the years 1970, 1980, and 1990. The household and individual records were flattened into a single table, and we used all variables that were available for all three years. When there was more than one version of a variable, such as for race, we used the most general. For occupation and industry we used the 1950 basis.

Note that PUMS data is based on cluster samples, i.e. samples are made of households or dwellings, each of which may contribute multiple individuals. Individuals from the same household are therefore not independent observations. Ruggles (1995) considers this issue further and discusses its effect (along with the effects of stratification) on standard errors.

The variable schltype appears to have different coding values across the years 1970, 1980, and 1990.

There are two versions of this data set. The large data set contains a 1 in 100 sample of the Los Angeles and Long Beach area. The small data set contains a 1 in 1000 sample of the same area and was formed by sampling from the large data set.

**Past Usage**
S. D. Bay and M. J. Pazzani (1999). "Detecting Group Differences: Mining Contrast Sets". Submitted.

**Copyright Information**
All persons are granted a limited license to use and distribute this documentation and the accompanying data, subject to the following conditions:
* No fee may be charged for use or distribution.
* Publications and research reports based on the database must cite it appropriately. The citation should include the following: Steven Ruggles and Matthew Sobek et al. Integrated Public Use Microdata Series: Version 2.0. Minneapolis: Historical Census Projects, University of Minnesota, 1997.

If possible, citations should also include the URL for the IPUMS site: http://www.ipums.umn.edu/.

In addition, we request that users send us a copy of any publications, research reports, or educational material making use of the data or documentation. Send all electronic material to ipums@hist.umn.edu

0 references

IPUMS

0 references

1999-11-09

0 references

upload date

27 September 2014

0 references

full work available at URL

https://api.openml.org/data/v1/download/52418/ipums_la_99-small.arff

0 references

https://archive.ics.uci.edu/ml/datasets/IPUMS+Census+Database

0 references

default target attribute

movedin

0 references

0 references

0 references

0 references

https://www.tandfonline.com/doi/abs/10.1080/01615440.1995.9955312

0 references

checksum

ecc4faa8fde6a270c3ecbfc1df40cee8

determination method

MD5

0 references

number of binary features

9

0 references

number of classes

7

0 references

number of features

61

0 references

number of instances

8,844

0 references

number of instances with missing values

8,844

0 references

number of missing values

51,515

0 references

number of numeric features

0

0 references

number of symbolic features

61

0 references

file format

ARFF

0 references
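The counts reported above (features, instances, missing values, instances with missing values) could in principle be recomputed from the ARFF file. The following is a rough sketch, assuming a dense (non-sparse) ARFF file in which missing values are written as `?`; the function name `summarize_arff` is hypothetical, and the parsing is deliberately simplified rather than a full ARFF reader.

```python
import csv


def summarize_arff(arff_text):
    """Count attributes, rows, and '?' cells in dense ARFF text (simplified)."""
    lines = iter(arff_text.splitlines())

    # Header: count @attribute declarations until the @data marker.
    n_attributes = 0
    for line in lines:
        lowered = line.strip().lower()
        if lowered.startswith("@attribute"):
            n_attributes += 1
        elif lowered.startswith("@data"):
            break

    # Data section: skip blank lines and '%' comment lines.
    data_lines = [
        line for line in lines
        if line.strip() and not line.lstrip().startswith("%")
    ]
    rows = list(csv.reader(data_lines, skipinitialspace=True))

    missing_values = sum(cell.strip() == "?" for row in rows for cell in row)
    rows_with_missing = sum(
        any(cell.strip() == "?" for cell in row) for row in rows
    )
    return {
        "features": n_attributes,
        "instances": len(rows),
        "missing_values": missing_values,
        "instances_with_missing": rows_with_missing,
    }
```

Note that the attribute count here includes the target attribute, and that quoted values containing commas or sparse-format rows are not handled; a dedicated ARFF library would be a safer choice for real use.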

MaRDI profile type

MaRDI dataset profile

0 references

Identifiers

OpenML dataset ID

378

0 references

Sitelinks

Mathematics (1 entry)

mardi Dataset:6033119

Revision as of 10:11, 15 April 2024, by Importer (Bots, 7,038,868 edits): Created a new Item
Latest revision as of 12:27, 16 April 2024, by Import240416010454 (10,906 edits): Added link to MaRDI item.