lung-cancer (Q6032976)

OpenML dataset with id 163

Language	Label	Description	Also known as
English	lung-cancer	OpenML dataset with id 163

Statements

instance of

data set

0 references

OpenML dataset ID

163

0 references

dataset version identifier

1

0 references

description

**Author**:
**Source**: Unknown
**Please cite**:

1. Title: Lung Cancer Data

2. Source Information:
   - Data was published in:
     Hong, Z.Q. and Yang, J.Y. "Optimal Discriminant Plane for a Small Number of Samples and Design Method of Classifier on the Plane", Pattern Recognition, Vol. 24, No. 4, pp. 317-324, 1991.
   - Donor: Stefan Aeberhard, stefan@coral.cs.jcu.edu.au
   - Date: May, 1992

3. Past Usage:
   - Hong, Z.Q. and Yang, J.Y. "Optimal Discriminant Plane for a Small Number of Samples and Design Method of Classifier on the Plane", Pattern Recognition, Vol. 24, No. 4, pp. 317-324, 1991.
   - Aeberhard, S., Coomans, D., De Vel, O. "Comparisons of Classification Methods in High Dimensional Settings", submitted to Technometrics.
   - Aeberhard, S., Coomans, D., De Vel, O. "The Dangers of Bias in High Dimensional Settings", submitted to Pattern Recognition.

4. Relevant Information:
   - This data was used by Hong and Yang to illustrate the power of the optimal discriminant plane even in ill-posed settings. Applying the KNN method in the resulting plane gave 77% accuracy. However, these results are strongly biased (see Aeberhard's second reference above, or email stefan@coral.cs.jcu.edu.au). Results obtained by Aeberhard et al. are: RDA: 62.5%, KNN: 53.1%, Opt. Disc. Plane: 59.4%.

     The data describe 3 types of pathological lung cancers. The authors give no information on the individual variables nor on where the data was originally used.

   - In the original data, 4 values for the fifth attribute were -1. These values have been changed to ? (unknown). (*)
   - In the original data, 1 value for the 39th attribute was 4. This value has been changed to ? (unknown). (*)

5. Number of Instances: 32

6. Number of Attributes: 57 (1 class attribute, 56 predictive)

7. Attribute Information:
   Attribute 1 is the class label.
   - All predictive attributes are nominal, taking on integer values 0-3.

8. Missing Attribute Values: Attributes 5 and 39 (*)

9. Class Distribution:
   - 3 classes:
     1.) 9 observations
     2.) 13 observations
     3.) 10 observations

Information about the dataset
CLASSTYPE: nominal
CLASSINDEX: first

0 references

collection date

May, 1992

0 references

upload date

23 April 2014

0 references

full work available at URL

https://api.openml.org/data/v1/download/3584/lung-cancer.arff

0 references

default target attribute

class
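
The description says that unknown values are encoded as ? and that the class label is the first attribute. The following is a minimal sketch of how rows in that layout could be read and checked, assuming plain comma-separated data lines as in a dense ARFF data section. The function names (`parse_rows`, `class_distribution`, `missing_attributes`) and the sample rows are hypothetical and are not taken from the actual file; quoted values and sparse ARFF rows are not handled.

```python
from collections import Counter

MISSING = "?"  # marker the description uses for unknown values

# Class counts as stated in the description (9 + 13 + 10 = 32 instances).
STATED_COUNTS = {"1": 9, "2": 13, "3": 10}

# Illustrative rows only; these are NOT values from the real dataset.
SAMPLE = """\
% comment line
@data
1,0,3,2,?
2,1,2,2,1
2,0,?,1,?
"""


def parse_rows(text):
    """Parse comma-separated data lines; '?' entries become None."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, ARFF comments (%) and header declarations (@...).
        if not line or line.startswith("%") or line.startswith("@"):
            continue
        values = [v.strip() for v in line.split(",")]
        rows.append([None if v == MISSING else v for v in values])
    return rows


def class_distribution(rows, class_index=0):
    """Count rows per class label; index 0 follows 'CLASSINDEX: first'."""
    return Counter(row[class_index] for row in rows)


def missing_attributes(rows):
    """Return sorted 1-based attribute numbers with at least one missing value."""
    found = set()
    for row in rows:
        for position, value in enumerate(row, start=1):
            if value is None:
                found.add(position)
    return sorted(found)


rows = parse_rows(SAMPLE)
print(class_distribution(rows))
print(missing_attributes(rows))
```

Run against the real file, one would expect `class_distribution` to match the stated counts and `missing_attributes` to report attributes 5 and 39, provided the file's attribute numbering matches the description.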