duke-breast-cancer
OpenML dataset with id 1434
No author found.
Full work available at URL: https://api.openml.org/data/v1/download/1426694/duke-breast-cancer.sparse_arff
Upload date: 27 April 2015
Dataset Characteristics
Number of classes: 0
Number of features: 7,130 (numeric: 7,130; symbolic: 0; binary: 0)
Number of instances: 86
Number of instances with missing values: 0
Number of missing values: 0
Author: Shirish Krishnaj Shevade and S. Sathiya Keerthi. libSVM, AAD group
Source: original - Date unknown
Please cite: Shirish Krishnaj Shevade and S. Sathiya Keerthi. A simple and efficient algorithm for gene selection using sparse logistic regression. Bioinformatics, 19(17):2246-2253, 2003.
- Dataset from the LIBSVM data repository.
Preprocessing: Instance-wise normalization to mean zero and variance one, followed by feature-wise normalization to mean zero and variance one. The original dataset consists of 49 instances. Five were removed because the classification results from immunohistochemistry and the protein immunoblotting assay conflicted. Of the remaining 44, two were rejected due to failed array hybridization. The remaining 42 were then split into training (38) and validation (4) sets.
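The two-step normalization described above can be sketched in Python with NumPy. This is an illustration only, not the code used to prepare the dataset: the function name `normalize_instances_then_features`, the use of the population standard deviation (`ddof=0`), and the guard against zero variance are all assumptions, and the demo matrix is random data rather than the actual expression values.

```python
import numpy as np


def normalize_instances_then_features(X):
    """Standardize each row, then each column, to mean 0 and variance 1.

    Hypothetical helper; assumes population standard deviation (ddof=0).
    """
    X = np.asarray(X, dtype=float)

    # Step 1: instance-wise (per-row) normalization.
    row_mean = X.mean(axis=1, keepdims=True)
    row_std = X.std(axis=1, keepdims=True)
    row_std[row_std == 0] = 1.0  # assumption: leave constant rows at zero
    X = (X - row_mean) / row_std

    # Step 2: feature-wise (per-column) normalization of the result.
    col_mean = X.mean(axis=0, keepdims=True)
    col_std = X.std(axis=0, keepdims=True)
    col_std[col_std == 0] = 1.0  # assumption: leave constant columns at zero
    return (X - col_mean) / col_std


# Small random matrix standing in for the data (6 instances, 10 features).
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=5.0, scale=2.0, size=(6, 10))
Z = normalize_instances_then_features(X_demo)
```

Because the feature-wise step runs last, only the columns of the result are guaranteed to have mean zero and variance one; the rows are generally only approximately standardized afterwards.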