webdata_wXa (Q6033105)
From MaRDI portal
OpenML dataset with id 350
Language | Label | Description | Also known as |
---|---|---|---|
English | webdata_wXa | OpenML dataset with id 350 | |
Statements
1
**Author**: John Platt

**Source**: [libSVM](http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets) - Date unknown

**Please cite**: John C. Platt. Fast training of support vector machines using sequential minimal optimization. In Bernhard Schölkopf, Christopher J. C. Burges, and Alexander J. Smola, editors, Advances in Kernel Methods - Support Vector Learning, Cambridge, MA, 1998. MIT Press.

This is the famous webdata dataset w[1-8]a in its binary version, retrieved 2014-11-14 from the libSVM site. In addition to the preprocessing done there (see the LibSVM site for details), this dataset was created as follows:

* Load all web data datasets, train and test, e.g. w1a, w1a.t, w2a, w2a.t, w3a, ...
* Join test and train for each subset, e.g. w1a and w1a.t, w2a and w2a.t.
* Normalize each file columnwise according to the following rules:
  * If a column contains only one value (constant feature), it is set to zero and thus removed by sparsity.
  * If a column contains two values (binary feature), the value occurring more often is set to zero, the other to one.
  * If a column contains more than two values (multinary/real feature), the column is divided by its standard deviation.
* Afterwards, all 8 files are merged into one and randomly sorted.
* Finally, duplicate lines are removed.

An R script which does all of these steps can be found here:
https://github.com/openml/data_scripts/blob/master/webdata_wXa/dataDownloader.R
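The column rules and the merge step described above can be illustrated with a short Python sketch. This is not the linked R script, only an approximation under stated assumptions: the function names `normalize_columns` and `merge_shuffle_dedupe` are hypothetical, the input is a dense NumPy array of features only, ties in a binary column are resolved in favour of the smaller value, and the sample standard deviation (`ddof=1`, as R's `sd` uses) is assumed.

```python
import numpy as np


def normalize_columns(X):
    """Apply the three column-wise rules to a dense 2-D feature array."""
    X = np.asarray(X, dtype=float)
    out = np.zeros_like(X)
    for j in range(X.shape[1]):
        col = X[:, j]
        values, counts = np.unique(col, return_counts=True)
        if len(values) == 1:
            # Constant feature: stays all zero (dropped in a sparse format).
            continue
        if len(values) == 2:
            # Binary feature: the more frequent value becomes 0, the other 1.
            # Assumption: on a tie, argmax picks the smaller value as majority.
            majority = values[np.argmax(counts)]
            out[:, j] = (col != majority).astype(float)
        else:
            # Multinary/real feature: divide by the standard deviation.
            # Assumption: sample standard deviation (ddof=1).
            out[:, j] = col / col.std(ddof=1)
    return out


def merge_shuffle_dedupe(parts, seed=0):
    """Stack the normalized files, shuffle the rows, drop duplicate rows."""
    rng = np.random.default_rng(seed)
    merged = np.vstack(parts)
    merged = merged[rng.permutation(len(merged))]
    # np.unique sorts rows, so keep first-occurrence indices in shuffled order.
    _, first = np.unique(merged, axis=0, return_index=True)
    return merged[np.sort(first)]
```

In this sketch each joined train/test file would be passed through `normalize_columns` on its own before the results are handed to `merge_shuffle_dedupe`, mirroring the order of the steps in the description.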
1998
29 August 2014
Y
https://ieeexplore.ieee.org/abstract/document/4731075
1
2
124
36,974
0
0
123
1