Glass-Classification

OpenMLMaRDI QIDQ6036843FDORO-CrateQ6036843

Full work available at URL https://api.openml.org/data/v1/download/22102575/Glass-Classification.arff

Upload date 24 March 2022

Context This is a Glass Identification Data Set from UCI. It contains 10 attributes including id. The response is glass type(discrete 7 values) Content Attribute Information:

Id number: 1 to 214 (removed from CSV file) RI: refractive index Na: Sodium (unit measurement: weight percent in corresponding oxide, as are attributes 4-10) Mg: Magnesium Al: Aluminum Si: Silicon K: Potassium Ca: Calcium Ba: Barium Fe: Iron Type of glass: (class attribute) -- 1 buildingwindowsfloatprocessed -- 2 buildingwindowsnonfloatprocessed -- 3 vehiclewindowsfloatprocessed -- 4 vehiclewindowsnonfloatprocessed (none in this database) -- 5 containers -- 6 tableware -- 7 headlamps

Acknowledgements https://archive.ics.uci.edu/ml/datasets/Glass+Identification Source: Creator: B. German Central Research Establishment Home Office Forensic Science Service Aldermaston, Reading, Berkshire RG7 4PN Donor: Vina Spiehler, Ph.D., DABFT Diagnostic Products Corporation (213) 776-0180 (ext 3014) Inspiration Data exploration of this dataset reveals two important characteristics : 1) The variables are highly corelated with each other including the response variables: So which kind of ML algorithm is most suitable for this dataset Random Forest , KNN or other? Also since dataset is too small is there any chance of applying PCA or it should be completely avoided? 2) Highly Skewed Data: Is scaling sufficient or are there any other techniques which should be applied to normalize data? Like BOX-COX Power transformation?

Cited in

(1)

Three similarity measures between one-dimensional datasets

Dataset characteristics

Number of features 10 (numeric: 10, symbolic: 0 and in total binary: 0 )

Number of instances 214

Number of instances with missing values 0

Number of missing values 0

RO-Crate

What is a RO-Crate?

A RO-Crate is a standardized research object package used to bundle data together with rich machine-readable metadata. Each RO-Crate contains:

the files belonging to the dataset (e.g. CSVs, images, code, documentation)
a ro-crate-metadata.json file describing the content, provenance, and context
persistent identifiers and references to related research objects (e.g. software, publications)

This ensures that the dataset can be easily reused, cited, validated, and interpreted in a reproducible manner. More information can be found here.

Download

You can download a RO-Crate for this dataset here: Download RO-Crate

HINT: The RO-Crate is created dynamically, so it could take up to 30 seconds until the downloads starts.

This page was built for dataset: Glass-Classification