vehicle_sensIT

Dataset:6033109



OpenML ID: 357
MaRDI QID: Q6033109

OpenML dataset with id 357

No author found in the portal record (the dataset description below credits M. Duarte and Y. H. Hu).

Full work available at URL: https://api.openml.org/data/v1/download/52260/vehicle_sensIT.sparse_arff

Upload date: 29 August 2014


Dataset Characteristics

Number of classes: 2
Number of features: 101 (numeric: 100, symbolic: 1; binary in total: 1)
Number of instances: 98,528
Number of instances with missing values: 0
Number of missing values: 0

Author: M. Duarte, Y. H. Hu
Source: original - 2013-11-14
Please cite: M. Duarte and Y. H. Hu. Vehicle classification in distributed sensor networks. Journal of Parallel and Distributed Computing, 64(7):826-838, July 2004.


This is the SensIT Vehicle (combined) dataset, retrieved 2013-11-14 from the LibSVM site. In addition to the preprocessing done there (see the LibSVM site for details), this dataset was created as follows:
- Join the test and train datasets (2 files, already pre-combined).
- Relabel the classes: 1 and 2 become the positive class, 3 becomes the negative class.
- Normalize each file columnwise according to the following rules:
  - If a column contains only one value (constant feature), it is set to zero and thus removed by sparsity.
  - If a column contains two values (binary feature), the value occurring more often is set to zero, the other to one.
  - If a column contains more than two values (multinary/real feature), the column is divided by its standard deviation.
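The relabeling and columnwise normalization rules above can be sketched in Python with NumPy. This is a minimal illustration, not the script actually used to build the dataset. The function names `relabel` and `normalize_columns` are hypothetical, and several details are assumptions the description does not settle: the positive and negative classes are encoded as +1 and -1, the standard deviation is the population one (`ddof=0`), ties between the two values of a binary column are broken by mapping the smaller value to one, and a dense array is used for simplicity although the distributed file is sparse.

```python
import numpy as np


def relabel(y):
    """Map original classes 1 and 2 to +1 and class 3 to -1.

    The +1/-1 encoding is an assumption; the description only says
    "positive class" and "negative class".
    """
    y = np.asarray(y)
    return np.where(np.isin(y, (1, 2)), 1, -1)


def normalize_columns(X):
    """Apply the three columnwise rules to a dense 2-D array."""
    X = np.asarray(X, dtype=float)
    out = np.zeros_like(X)
    for j in range(X.shape[1]):
        col = X[:, j]
        values, counts = np.unique(col, return_counts=True)
        if len(values) == 1:
            # Constant feature: leave the column at zero.
            continue
        if len(values) == 2:
            # Binary feature: the less frequent value becomes one,
            # the more frequent value becomes zero.
            # Assumption: on a tie, the smaller value becomes one.
            rare = values[np.argmin(counts)]
            out[:, j] = (col == rare).astype(float)
        else:
            # Multinary/real feature: divide by the standard deviation.
            # Assumption: population standard deviation (ddof=0).
            out[:, j] = col / col.std()
    return out


# Toy example: one constant, one binary, and one real-valued column.
X_demo = np.array([
    [5.0, 0.0, 1.0],
    [5.0, 0.0, 2.0],
    [5.0, 7.0, 3.0],
    [5.0, 0.0, 4.0],
])
X_norm = normalize_columns(X_demo)
y_new = relabel([1, 2, 3])
```

In the toy example, the first column of `X_norm` is all zeros, the second marks only the row holding the rarer value 7, and the third has unit standard deviation after scaling.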