forest_fires




OpenML: 44962, MaRDI QID: Q6037795

OpenML dataset with id 44962

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/22111826/forest_fires.arff

Upload date: 22 December 2022
Copyright license: Creative Commons Attribution 4.0 International


Dataset Characteristics

Number of classes: 0
Number of features: 13 (numeric: 11, symbolic: 0, binary: 0)
Number of instances: 517
Number of instances with missing values: 0
Number of missing values: 0

Data Description

The aim of this dataset is to predict the burned area of forest fires in the northeast region of Portugal using meteorological and other data.

The output 'area' was first transformed with a $\ln(x+1)$ function. Then, several Data Mining methods were applied. After fitting the models, the outputs were post-processed with the inverse of the $\ln(x+1)$ transform. Four different input setups were used.
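The transform and its inverse described above can be sketched as follows. This is a minimal illustration, not the original authors' code: the function names `to_log_area` and `from_log_area` and the sample values are assumptions made for this example. NumPy's `log1p` and `expm1` compute $\ln(x+1)$ and $e^{y}-1$ respectively.

```python
import numpy as np


def to_log_area(area):
    """Apply the ln(x + 1) transform to burned-area values (hypothetical helper)."""
    return np.log1p(np.asarray(area, dtype=float))


def from_log_area(log_area):
    """Invert the ln(x + 1) transform, i.e. compute exp(y) - 1 (hypothetical helper)."""
    return np.expm1(np.asarray(log_area, dtype=float))


# Illustrative values only; 0.0 and 1090.84 are the documented range limits of 'area'.
areas = np.array([0.0, 0.5, 10.0, 1090.84])

log_areas = to_log_area(areas)       # what a model would be trained on
restored = from_log_area(log_areas)  # post-processing of model outputs
```

In this setup a model would be fitted to `log_areas`, and its predictions would be passed through `from_log_area` to return to hectares.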

Attribute Description

1. *X* - x-axis spatial coordinate within the Montesinho park map: 1 to 9
2. *Y* - y-axis spatial coordinate within the Montesinho park map: 2 to 9
3. *month* - month of the year: 'jan' to 'dec'
4. *day* - day of the week: 'mon' to 'sun'
5. *FFMC* - FFMC index from the FWI system: 18.7 to 96.20
6. *DMC* - DMC index from the FWI system: 1.1 to 291.3
7. *DC* - DC index from the FWI system: 7.9 to 860.6
8. *ISI* - ISI index from the FWI system: 0.0 to 56.10
9. *temp* - temperature in Celsius degrees: 2.2 to 33.30
10. *RH* - relative humidity in %: 15.0 to 100
11. *wind* - wind speed in km/h: 0.40 to 9.40
12. *rain* - outside rain in mm/m2: 0.0 to 6.4
13. *area* - the burned area of the forest (in ha): 0.00 to 1090.84 (this target variable is very skewed towards 0.0, thus it may make sense to model with the logarithm transform).
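The attribute list above can be turned into a small validation and encoding step. The sketch below is one possible approach, not part of the dataset or of OpenML: the names `RANGES`, `MONTHS`, `DAYS` and `encode_record` are hypothetical, the integer encoding of *month* and *day* is an assumption, and the example record contains made-up values chosen to lie inside the documented ranges (it is not an actual row of the dataset).

```python
# Documented value ranges for the numeric attributes, copied from the list above.
RANGES = {
    "X": (1, 9), "Y": (2, 9),
    "FFMC": (18.7, 96.2), "DMC": (1.1, 291.3), "DC": (7.9, 860.6),
    "ISI": (0.0, 56.1), "temp": (2.2, 33.3), "RH": (15.0, 100.0),
    "wind": (0.4, 9.4), "rain": (0.0, 6.4), "area": (0.0, 1090.84),
}
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
DAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def encode_record(record):
    """Check numeric attributes against RANGES and map month/day to 1-based integers."""
    encoded = {}
    for name, (low, high) in RANGES.items():
        value = float(record[name])
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
        encoded[name] = value
    # list.index raises ValueError for labels that are not in the documented sets.
    encoded["month"] = MONTHS.index(record["month"]) + 1
    encoded["day"] = DAYS.index(record["day"]) + 1
    return encoded


# Hypothetical record for illustration only; not taken from the dataset.
example = {
    "X": 4, "Y": 5, "month": "aug", "day": "sun",
    "FFMC": 90.0, "DMC": 100.0, "DC": 600.0, "ISI": 8.0,
    "temp": 20.0, "RH": 40, "wind": 3.0, "rain": 0.0, "area": 0.0,
}
encoded = encode_record(example)
```

An ordinal encoding is only one option; one-hot or cyclical encodings of *month* and *day* may suit some models better.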