Finding the right XAI Method --- Dataset

DOI: 10.5281/zenodo.7715398 | Zenodo: 7715398 | MaRDI QID: Q6719668 | FDO: Q6719668

Dataset published in the Zenodo repository.

Anna Hedström, Philine Lou Bommer, Marina M.-C. Höhne, Dilyara Bareeva, Marlene Kretschmer

Publication date: 10 March 2023

Copyright license: Creative Commons Attribution 4.0 International

This dataset provides the complementary preprocessed data for training the neural networks used in Bommer et al. and in the accompanying source code (https://github.com/philine-bommer/Climate_X_Quantus). In the publication, we introduce XAI evaluation in the context of climate research and assess different desired explanation properties, namely robustness, faithfulness, randomization, complexity, and localization. To this end, we build upon previous work (Labe and Barnes 2021) and train a multi-layer perceptron (MLP) and a convolutional neural network (CNN) to predict the decade based on annual-mean temperature maps.

Following Labe and Barnes 2021, we use data simulated by the general climate model CESM1 (Hurrell et al. 2013). We use the global 2-m air temperature (T2m) maps from 1920 to 2080. The data consist of 40 ensemble members, each generated by varying the atmospheric initial conditions under fixed external forcing, i.e., historical forcings are imposed from 1920 to 2005 and Representative Concentration Pathway 8.5 for the following years (Kay et al. 2015). Following Labe and Barnes 2021, we compute annual averages and apply a bilinear interpolation. This results in T=161 temperature maps for each member, with v=144 longitude grid cells and h=95 latitude grid cells, given the 1.9° sampling in latitude and 2.5° sampling in longitude. The temperature maps are finally standardized by removing the multi-year (1920 to 2080) mean and subsequently dividing by the corresponding standard deviation.

Whereas the MLP takes each temperature map flattened into a vector, the CNN maintains the longitude-latitude grid of the temperature maps. Similar to Labe and Barnes 2021, we use the model data discussed above for training, validation, and testing. For both the MLP and the CNN, we hold out 20% of the data as a test set and split the remaining 80% into a training (64%) and a validation (16%) set.
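The preprocessing and split described above can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the function names, the array layout (members, years, latitude, longitude), the choice to standardize per member and grid cell, and the random split over samples are all assumptions, and a small synthetic array stands in for the real temperature maps.

```python
import numpy as np

def standardize(maps):
    """Remove the multi-year mean and divide by the multi-year standard deviation.

    maps has the assumed layout (members, years, lat, lon); statistics are
    taken over the year axis, separately for each member and grid cell
    (an assumption, the source does not spell this out).
    """
    mean = maps.mean(axis=1, keepdims=True)
    std = maps.std(axis=1, keepdims=True)
    return (maps - mean) / std

def split_indices(n, rng, test_frac=0.2, val_frac_of_rest=0.2):
    """Hold out test_frac for testing, then split the rest into train/val."""
    perm = rng.permutation(n)
    n_test = int(round(n * test_frac))
    n_val = int(round((n - n_test) * val_frac_of_rest))
    test = perm[:n_test]
    val = perm[n_test:n_test + n_val]
    train = perm[n_test + n_val:]
    return train, val, test

rng = np.random.default_rng(0)

# Synthetic stand-in: 4 members, 161 years, a 6 x 8 grid
# (the real maps have 95 latitude and 144 longitude cells).
maps = rng.normal(loc=280.0, scale=5.0, size=(4, 161, 6, 8))
z = standardize(maps)

# MLP input: each map flattened into a vector; CNN input: grid kept intact.
mlp_input = z.reshape(-1, z.shape[2] * z.shape[3])
cnn_input = z.reshape(-1, z.shape[2], z.shape[3])

# With 100 samples, the 20% / 80% then 80% / 20% split yields 64 / 16 / 20.
train, val, test = split_indices(100, rng)
```

Holding out 20% first and then taking 20% of the remainder for validation reproduces the 64% / 16% / 20% proportions stated above; whether the actual split is drawn over samples or over ensemble members is not specified here.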
We train both networks to solve a fuzzy classification problem, which combines classification and regression. In the classification setting, the network assigns each map to one of 20 classes, where each class corresponds to one decade between 1900 and 2100 (a class division required for the later regression, as done by Labe and Barnes 2021). The network output is thus a probability vector containing one probability for each class. To assess the network performance, we use the monthly 2-m air temperature of the 20th Century Reanalysis data (V3) (Slivinski et al. 2019) from 1920 to 2015. The dataset includes two compressed .npz files and a Readme.md. A full description of the data contained in this dataset and instructions on data usage are provided in the Readme file.
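To make the decade classes concrete, the sketch below maps a year to one of the 20 classes and turns a probability vector back into a year estimate. The helper names are hypothetical, and the decoding step (a probability-weighted mean of decade midpoints) is only one plausible reading of the regression part; it is an assumption and is not taken from the dataset or the repository.

```python
import numpy as np

FIRST_YEAR, LAST_YEAR, N_CLASSES = 1900, 2100, 20
DECADE = (LAST_YEAR - FIRST_YEAR) // N_CLASSES  # 10 years per class

def year_to_class(year):
    """Index of the decade class containing the given year (0 = 1900s)."""
    return (np.asarray(year) - FIRST_YEAR) // DECADE

def expected_year(probs):
    """Assumed decoding: probability-weighted mean of the decade midpoints."""
    centers = FIRST_YEAR + DECADE * (np.arange(N_CLASSES) + 0.5)
    return float(np.asarray(probs) @ centers)

# A probability vector split evenly between the 1920s and the 1930s.
probs = np.zeros(N_CLASSES)
probs[2] = 0.5
probs[3] = 0.5

first_class = int(year_to_class(1920))  # first year in the simulated data
last_class = int(year_to_class(2080))   # last year in the simulated data
estimate = expected_year(probs)
```

Because the simulated data only span 1920 to 2080, only a subset of the 20 classes is ever populated by the training labels under this mapping.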
