Glass-Classification

From MaRDI portal
Dataset:6036843



OpenML: 43750, MaRDI QID: Q6036843, FDO: Q6036843, RO-Crate: Q6036843

OpenML dataset with id 43750

Author name not available

Full work available at URL: https://api.openml.org/data/v1/download/22102575/Glass-Classification.arff

Upload date: 24 March 2022



Dataset Characteristics

Number of features: 10 (numeric: 10, symbolic: 0, binary: 0)
Number of instances: 214
Number of instances with missing values: 0
Number of missing values: 0
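
The counts above can be checked against the ARFF file once it has been downloaded. The following is a minimal sketch, assuming the file uses the plain dense ARFF layout (`@attribute` lines, then `@data`, then comma-separated rows) and unquoted attribute names without spaces. The `parse_arff` helper and the toy sample are illustrative only; the sample values are invented and are not taken from the dataset.

```python
import csv

def parse_arff(text):
    """Return (attribute_names, rows) from dense, comma-separated ARFF text."""
    names, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and ARFF comments
            continue
        lower = line.lower()
        if not in_data:
            if lower.startswith("@attribute"):
                # "@attribute <name> <type>": keep the name only
                names.append(line.split(None, 2)[1].strip("'\""))
            elif lower.startswith("@data"):
                in_data = True
        else:
            rows.append(next(csv.reader([line])))
    return names, rows

# Toy stand-in for the real file; attribute names and values are assumptions.
TOY_ARFF = """\
% invented values, for illustration only
@relation toy
@attribute RI numeric
@attribute Na numeric
@attribute Type numeric
@data
1.5,13.0,1
1.6,14.0,2
"""

names, rows = parse_arff(TOY_ARFF)
# ARFF marks a missing value with "?"
missing = sum(cell.strip() == "?" for row in rows for cell in row)
print(len(names), "features,", len(rows), "instances,", missing, "missing values")
```

Run on the real file's text, the same three numbers should correspond to the feature, instance, and missing-value counts listed above.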

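Context

This is the Glass Identification Data Set from UCI. The attribute listing below has 11 entries including the id number; with the id removed, 10 attributes remain, matching the feature count above. The response is the glass type (discrete, 7 values).

Content

Attribute Information: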

  • Id number: 1 to 214 (removed from CSV file)
  • RI: refractive index
  • Na: Sodium (unit of measurement: weight percent in corresponding oxide, as are attributes 4-10)
  • Mg: Magnesium
  • Al: Aluminum
  • Si: Silicon
  • K: Potassium
  • Ca: Calcium
  • Ba: Barium
  • Fe: Iron
  • Type of glass (class attribute):
      -- 1 building_windows_float_processed
      -- 2 building_windows_non_float_processed
      -- 3 vehicle_windows_float_processed
      -- 4 vehicle_windows_non_float_processed (none in this database)
      -- 5 containers
      -- 6 tableware
      -- 7 headlamps
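
Because the class attribute is stored as a numeric code, a small lookup table helps when reading results. This is a minimal sketch: the labels follow the class listing above, written out with spaces, and the names `GLASS_TYPES` and `glass_type_name` are hypothetical helpers, not part of the dataset.

```python
# Class codes as listed in the attribute information.
GLASS_TYPES = {
    1: "building windows, float processed",
    2: "building windows, non-float processed",
    3: "vehicle windows, float processed",
    4: "vehicle windows, non-float processed",  # listed, but no instances in this database
    5: "containers",
    6: "tableware",
    7: "headlamps",
}

def glass_type_name(code):
    """Map a class code (int, float, or numeric string) to its label."""
    return GLASS_TYPES[int(float(code))]

print(glass_type_name("7"))
```

Going through `float` first is a precaution, on the assumption that a loader might return the code as `1.0` or `"1.0"` rather than `1`.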

Acknowledgements

https://archive.ics.uci.edu/ml/datasets/Glass+Identification

Source:

  • Creator: B. German, Central Research Establishment, Home Office Forensic Science Service, Aldermaston, Reading, Berkshire RG7 4PN
  • Donor: Vina Spiehler, Ph.D., DABFT, Diagnostic Products Corporation, (213) 776-0180 (ext 3014)

Inspiration

Data exploration of this dataset reveals two important characteristics:

1) The variables are highly correlated with each other, including the response variable. Which kind of ML algorithm is most suitable for this dataset: Random Forest, KNN, or another? Also, since the dataset is so small, is there any chance of applying PCA, or should it be avoided completely?

2) The data are highly skewed. Is scaling sufficient, or should other techniques be applied to normalize the data, such as a Box-Cox power transformation?
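
To make the skewness question concrete, the sketch below computes sample skewness and applies a Box-Cox transform with a fixed parameter. It is an illustration under stated assumptions, not a recommendation for this dataset: the function names and the toy values are invented, the parameter `lam` is chosen by hand rather than fitted, and Box-Cox is only defined for strictly positive inputs, so any column containing zeros would need a shift or a different transform.

```python
import math

def skewness(xs):
    """Population skewness: third central moment over the cubed standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def box_cox(xs, lam):
    """Box-Cox transform: log(x) for lam == 0, else (x**lam - 1) / lam."""
    if any(x <= 0 for x in xs):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(x) for x in xs]
    return [(x ** lam - 1) / lam for x in xs]

# Invented right-skewed toy values, not taken from the dataset.
toy = [1.0, 2.0, 4.0, 8.0, 16.0]
print("skewness before:", skewness(toy))
print("skewness after log (lam = 0):", skewness(box_cox(toy, 0)))
```

The toy values are geometric, so the log transform makes them evenly spaced and the skewness drops to essentially zero. Plain scaling would not have this effect, since subtracting a constant and dividing by a positive constant leaves skewness unchanged.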




Cited In (1)


RO-Crate

What is an RO-Crate?

An RO-Crate is a standardized research object package used to bundle data together with rich machine-readable metadata. Each RO-Crate contains:

  • the files belonging to the dataset (e.g. CSVs, images, code, documentation)
  • a ro-crate-metadata.json file describing the content, provenance, and context
  • persistent identifiers and references to related research objects (e.g. software, publications)

This ensures that the dataset can be easily reused, cited, validated, and interpreted in a reproducible manner.
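
As a sketch of how the metadata file can be read, the example below locates the root dataset entry in a `ro-crate-metadata.json` document. It assumes the RO-Crate 1.1 layout, in which the file is JSON-LD with a flat `@graph` list and a metadata descriptor whose `about` property points to the root dataset. The sample document is invented for illustration; the actual crate generated for this dataset may contain different entities and properties.

```python
import json

def root_dataset(metadata_text):
    """Return the root dataset entity from ro-crate-metadata.json text."""
    graph = json.loads(metadata_text)["@graph"]
    by_id = {entity["@id"]: entity for entity in graph}
    # The metadata descriptor describes the file itself and points at the root.
    descriptor = by_id["ro-crate-metadata.json"]
    return by_id[descriptor["about"]["@id"]]

# Invented minimal crate; not the real metadata for this dataset.
SAMPLE_CRATE = json.dumps({
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {"@id": "ro-crate-metadata.json", "@type": "CreativeWork",
         "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
         "about": {"@id": "./"}},
        {"@id": "./", "@type": "Dataset", "name": "Glass-Classification",
         "hasPart": [{"@id": "Glass-Classification.arff"}]},
        {"@id": "Glass-Classification.arff", "@type": "File"},
    ],
})

root = root_dataset(SAMPLE_CRATE)
print(root["name"], [part["@id"] for part in root.get("hasPart", [])])
```

Looking the root up through the descriptor, rather than assuming its `@id` is always `./`, keeps the helper usable for crates whose root has a different identifier.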

Download

You can download an RO-Crate for this dataset here: Download RO-Crate

HINT: The RO-Crate is created dynamically, so it can take up to 30 seconds until the download starts.

