covertype (Q6037082)

From MaRDI portal
OpenML dataset with id 44036

    Statements

    Dataset used in the tabular data benchmark https://github.com/LeoGrin/tabular-benchmark, transformed in the same way. This dataset belongs to the "classification on categorical and numerical features" benchmark. Original description:

    **Author**: Jock A. Blackard, Dr. Denis J. Dean, Dr. Charles W. Anderson
    **Source**: [UCI](https://archive.ics.uci.edu/ml/datasets/Covertype) - 1998

    This is the original version of the famous covertype dataset in ARFF format.

    **Covertype**
    Predicting forest cover type from cartographic variables only (no remotely sensed data). The actual forest cover type for a given observation (30 x 30 meter cell) was determined from US Forest Service (USFS) Region 2 Resource Information System (RIS) data. Independent variables were derived from data originally obtained from US Geological Survey (USGS) and USFS data. Data is in raw form (not scaled) and contains binary (0 or 1) columns of data for qualitative independent variables (wilderness areas and soil types).

    This study area includes four wilderness areas located in the Roosevelt National Forest of northern Colorado. These areas represent forests with minimal human-caused disturbances, so that existing forest cover types are more a result of ecological processes rather than forest management practices.

    Some background information for these four wilderness areas: Neota (area 2) probably has the highest mean elevational value of the 4 wilderness areas. Rawah (area 1) and Comanche Peak (area 3) would have a lower mean elevational value, while Cache la Poudre (area 4) would have the lowest mean elevational value.

    As for primary major tree species in these areas, Neota would have spruce/fir (type 1), while Rawah and Comanche Peak would probably have lodgepole pine (type 2) as their primary species, followed by spruce/fir and aspen (type 5). Cache la Poudre would tend to have Ponderosa pine (type 3), Douglas-fir (type 6), and cottonwood/willow (type 4).

    The Rawah and Comanche Peak areas would tend to be more typical of the overall dataset than either the Neota or Cache la Poudre, due to their assortment of tree species and range of predictive variable values (elevation, etc.). Cache la Poudre would probably be more unique than the others, due to its relatively low elevation range and species composition.

    Attribute Information:
    Given is the attribute name, attribute type, the measurement unit and a brief description. The forest cover type is the classification problem. The order of this listing corresponds to the order of numerals along the rows of the database.

    Name / Data Type / Measurement / Description
    Elevation / quantitative / meters / Elevation in meters
    Aspect / quantitative / azimuth / Aspect in degrees azimuth
    Slope / quantitative / degrees / Slope in degrees
    Horizontal_Distance_To_Hydrology / quantitative / meters / Horz Dist to nearest surface water features
    Vertical_Distance_To_Hydrology / quantitative / meters / Vert Dist to nearest surface water features
    Horizontal_Distance_To_Roadways / quantitative / meters / Horz Dist to nearest roadway
    Hillshade_9am / quantitative / 0 to 255 index / Hillshade index at 9am, summer solstice
    Hillshade_Noon / quantitative / 0 to 255 index / Hillshade index at noon, summer solstice
    Hillshade_3pm / quantitative / 0 to 255 index / Hillshade index at 3pm, summer solstice
    Horizontal_Distance_To_Fire_Points / quantitative / meters / Horz Dist to nearest wildfire ignition points
    Wilderness_Area (4 binary columns) / qualitative / 0 (absence) or 1 (presence) / Wilderness area designation
    Soil_Type (40 binary columns) / qualitative / 0 (absence) or 1 (presence) / Soil Type designation
    Cover_Type (7 types) / integer / 1 to 7 / Forest Cover Type designation

    Relevant Papers:
    - Blackard, Jock A. and Denis J. Dean. 2000. "Comparative Accuracies of Artificial Neural Networks and Discriminant Analysis in Predicting Forest Cover Types from Cartographic Variables." Computers and Electronics in Agriculture 24(3):131-151.
    - Blackard, Jock A. and Denis J. Dean. 1998. "Comparative Accuracies of Neural Networks and Discriminant Analysis in Predicting Forest Cover Types from Cartographic Variables." Second Southern Forestry GIS Conference. University of Georgia. Athens, GA. Pages 189-199.
    - Blackard, Jock A. 1998. "Comparison of Neural Networks and Discriminant Analysis in Predicting Forest Cover Types." Ph.D. dissertation. Department of Forest Sciences. Colorado State University. Fort Collins, Colorado. 165 pages.
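The column layout given in the attribute listing of the original description can be sketched in Python as follows. This is a minimal illustration only: the indicator column names (`Wilderness_Area1`, `Soil_Type1`, and so on) and the helper `decode_one_hot` are assumptions made for the example, and the benchmark-transformed version of the dataset may use different names or a reduced set of classes. The wilderness area and cover type names are taken from the description; type 7 is left out because the description does not name it.

```python
# Quantitative attributes, in the order of the attribute listing.
QUANTITATIVE = [
    "Elevation",
    "Aspect",
    "Slope",
    "Horizontal_Distance_To_Hydrology",
    "Vertical_Distance_To_Hydrology",
    "Horizontal_Distance_To_Roadways",
    "Hillshade_9am",
    "Hillshade_Noon",
    "Hillshade_3pm",
    "Horizontal_Distance_To_Fire_Points",
]

# Assumed (hypothetical) naming scheme for the binary indicator columns.
WILDERNESS = [f"Wilderness_Area{i}" for i in range(1, 5)]   # 4 binary columns
SOIL = [f"Soil_Type{i}" for i in range(1, 41)]              # 40 binary columns
TARGET = "Cover_Type"

COLUMNS = QUANTITATIVE + WILDERNESS + SOIL + [TARGET]

# Names as given in the original description.
WILDERNESS_NAMES = {1: "Rawah", 2: "Neota", 3: "Comanche Peak", 4: "Cache la Poudre"}
COVER_TYPE_NAMES = {
    1: "spruce/fir",
    2: "lodgepole pine",
    3: "Ponderosa pine",
    4: "cottonwood/willow",
    5: "aspen",
    6: "Douglas-fir",
    # Type 7 is not named in the description, so it is not listed here.
}


def decode_one_hot(row, names):
    """Return the 1-based position of the single indicator set to 1."""
    active = [i for i, name in enumerate(names, start=1) if row[name] == 1]
    if len(active) != 1:
        raise ValueError(f"expected exactly one active indicator, found {len(active)}")
    return active[0]


# A made-up observation, used only to exercise the helper.
example_row = {name: 0 for name in COLUMNS}
example_row.update(
    {"Elevation": 3300, "Wilderness_Area2": 1, "Soil_Type29": 1, "Cover_Type": 1}
)

area = decode_one_hot(example_row, WILDERNESS)
soil = decode_one_hot(example_row, SOIL)
print(len(COLUMNS), WILDERNESS_NAMES[area], soil, COVER_TYPE_NAMES[example_row[TARGET]])
```

Collapsing the 4 + 40 indicator columns into two integer codes in this way is one option for models that handle categorical inputs directly; the raw binary columns can equally be used as they are.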
    18 June 2022
    class
    86b4088fcc002be15cc6413bdabf8603
    39
    2
    55
    423,680
    0
    10
    44