Diabetes-130-Hospitals_(Fairlearn)




OpenML: 43903
MaRDI QID: Q6036972

OpenML dataset with id 43903

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/22102813/Diabetes-130-Hospitals_(Fairlearn).arff

Upload date: 1 June 2022


Dataset Characteristics

Number of classes: 2
Number of features: 22 (numeric: 5, symbolic: 6, binary in total: 6)
Number of instances: 101,766
Number of instances with missing values: 0
Number of missing values: 0
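
The counts listed above can be checked against a local copy of the data. The following is a minimal sketch, assuming scikit-learn and pandas are installed and the OpenML server is reachable; the function name `summarize_openml_dataset` and the keys of the returned dictionary are illustrative and not part of OpenML or this record. Note that the column count returned here may include the target column, so it need not match the feature count reported above.

```python
OPENML_DATA_ID = 43903  # OpenML id given in this record


def summarize_openml_dataset(data_id: int = OPENML_DATA_ID) -> dict:
    """Download the dataset from OpenML and compute basic counts.

    Hypothetical helper: requires network access, scikit-learn and pandas.
    """
    # Imported lazily so that defining the function needs no third-party packages.
    from sklearn.datasets import fetch_openml

    bunch = fetch_openml(data_id=data_id, as_frame=True)
    frame = bunch.frame  # pandas DataFrame holding all columns

    missing_per_row = frame.isna().any(axis=1)
    return {
        "instances": int(frame.shape[0]),
        "columns": int(frame.shape[1]),
        "instances_with_missing": int(missing_per_row.sum()),
        "missing_values": int(frame.isna().sum().sum()),
    }
```

Calling `summarize_openml_dataset()` triggers the download and returns the counts as a dictionary, which can then be compared with the characteristics listed in this record.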

The "Diabetes 130-Hospitals" dataset represents 10 years of clinical care at 130 U.S. hospitals and delivery networks, collected from 1999 to 2008. Each record is the hospital admission record of a patient diagnosed with diabetes whose stay lasted between one and fourteen days. The features describing each encounter include demographics, diagnoses, diabetic medications, the number of visits in the year preceding the encounter, and payer information, as well as whether the patient was readmitted after release and whether that readmission occurred within 30 days of release.

The original "Diabetes 130-Hospitals" dataset was compiled and published by Beata Strack, Jonathan P. DeShazo, Chris Gennings, Juan L. Olmo, Sebastian Ventura, Krzysztof J. Cios, and John N. Clore in 2014.

This version of the dataset was derived by the Fairlearn team for the SciPy 2021 tutorial "Fairness in AI Systems: From social context to practice using Fairlearn". In this version, the target variable "readmitted" is binarized to indicate whether the patient was readmitted within thirty days. The full dataset pre-processing script can be found on GitHub: https://github.com/fairlearn/talks/blob/main/2021_scipy_tutorial/preprocess.py
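
The binarization of the target described above can be illustrated with a short sketch. This is not the Fairlearn pre-processing script (see the linked preprocess.py for that). It assumes the original "readmitted" column uses the three labels "<30", ">30" and "NO", and the function name `binarize_readmitted` is hypothetical.

```python
from typing import Iterable, List

# Assumption: label marking a readmission within 30 days in the original column.
WITHIN_30_DAYS = "<30"


def binarize_readmitted(labels: Iterable[str]) -> List[int]:
    """Return 1 for a readmission within 30 days, 0 for every other label."""
    return [1 if label.strip() == WITHIN_30_DAYS else 0 for label in labels]


# Small illustrative input, not taken from the dataset.
example = ["NO", ">30", "<30", "NO"]
print(binarize_readmitted(example))  # [0, 0, 1, 0]
```

Under this mapping, readmissions after more than 30 days and stays with no readmission both fall into the negative class.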