SGEMM_GPU_kernel_performance (Q6036946): Difference between revisions

Latest revision as of 12:29, 16 April 2024

OpenML dataset with id 43871

Language	Label	Description	Also known as
English	SGEMM_GPU_kernel_performance	OpenML dataset with id 43871

Statements

instance of

data set

0 references

dataset version identifier

3

0 references

description

**Dataset description**

This data set measures the running time of a matrix-matrix product $A \times B = C$, where all matrices have size 2048 x 2048, using a parameterizable *SGEMM GPU* (Single Precision General Matrix Multiply) kernel with 241600 possible parameter combinations. For each tested combination, 4 runs were performed and their results are reported as the last 4 columns. All times are measured in milliseconds*. There are 14 parameters: the first 10 are ordinal and can each take up to 4 different power-of-two values, and the last 4 are binary. Out of 1327104 total parameter combinations, only 241600 are feasible (due to various kernel constraints). This data set contains the results for all these feasible combinations.

The experiment was run on a desktop workstation running Ubuntu 16.04 Linux with an Intel Core i5 (3.5GHz), 16GB RAM, and an NVidia Geforce GTX 680 4GB GF580 GTX-1.5GB GPU. We use the 'gemm_fast' kernel from the automatic OpenCL kernel tuning library 'CLTune' (https://github.com/CNugteren/CLTune).

\* *Note*: For this kind of data set it is usually better to work with the logarithm of the running times (see e.g. Falch and Elster, 'Machine learning-based auto-tuning for enhanced performance portability of OpenCL applications', 2015).

**Attribute description**

*Independent variables*

* MWG, NWG: per-matrix 2D tiling at workgroup level: {16, 32, 64, 128} (integer)
* KWG: inner dimension of 2D tiling at workgroup level: {16, 32} (integer)
* MDIMC, NDIMC: local workgroup size: {8, 16, 32} (integer)
* MDIMA, NDIMB: local memory shape: {8, 16, 32} (integer)
* KWI: kernel loop unrolling factor: {2, 8} (integer)
* VWM, VWN: per-matrix vector widths for loading and storing: {1, 2, 4, 8} (integer)
* STRM, STRN: enable stride for accessing off-chip memory within a single thread: {0, 1} (categorical)
* SA, SB: per-matrix manual caching of the 2D workgroup tile: {0, 1} (categorical)

*Output*

* Run1, Run2, Run3, Run4: performance times in milliseconds for 4 independent runs using the same parameters. They range between 13.25 and 3397.08. Run1 is used as the default target variable.

**Related Studies**

Rafael Ballester-Ripoll, Enrique G. Paredes, Renato Pajarola.
Sobol Tensor Trains for Global Sensitivity Analysis.
In arXiv Computer Science / Numerical Analysis e-prints, 2017,
https://doi.org/10.1016/j.ress.2018.11.007

**Authors**

Enrique Paredes and Rafael Ballester-Ripoll. The original data was obtained from the UCI Machine Learning repository [Link](https://archive.ics.uci.edu/ml/datasets/sgemm+gpu+kernel+performance).

**Citation**

Please cite one of the following papers:

* Rafael Ballester-Ripoll, Enrique G. Paredes, Renato Pajarola.
Sobol Tensor Trains for Global Sensitivity Analysis.
In arXiv Computer Science / Numerical Analysis e-prints, 2017,
https://arxiv.org/abs/1712.00233

* Cedric Nugteren and Valeriu Codreanu.
CLTune: A Generic Auto-Tuner for OpenCL Kernels.
In: MCSoC: 9th International Symposium on Embedded Multicore/Many-core Systems-on-Chip. IEEE, 2015,
https://doi.org/10.1109/MCSoC.2015.10

0 references
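
The figure of 1327104 total combinations quoted in the description can be reproduced by multiplying the number of allowed values of each of the 14 parameters. The following is a minimal Python sketch with the value sets copied from the attribute list; the names `PARAMETER_VALUES`, `grid_size`, and `FEASIBLE` are illustrative and do not come from the dataset or from CLTune.

```python
from math import prod

# Allowed values per tunable parameter, copied from the attribute description.
PARAMETER_VALUES = {
    "MWG": (16, 32, 64, 128),
    "NWG": (16, 32, 64, 128),
    "KWG": (16, 32),
    "MDIMC": (8, 16, 32),
    "NDIMC": (8, 16, 32),
    "MDIMA": (8, 16, 32),
    "NDIMB": (8, 16, 32),
    "KWI": (2, 8),
    "VWM": (1, 2, 4, 8),
    "VWN": (1, 2, 4, 8),
    "STRM": (0, 1),
    "STRN": (0, 1),
    "SA": (0, 1),
    "SB": (0, 1),
}

# Number of feasible combinations, as stated in the description.
FEASIBLE = 241_600


def grid_size(values=PARAMETER_VALUES):
    """Size of the full Cartesian product of all parameter value sets."""
    return prod(len(v) for v in values.values())


total = grid_size()
print(total)                      # 1327104
print(f"{FEASIBLE / total:.1%}")  # 18.2%
```

Under these value sets, roughly one in five or six points of the full grid satisfies the kernel constraints, which is why the data set has 241600 rows rather than 1327104.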

upload date

20 April 2022

0 references

copyright license

Creative Commons Attribution 4.0 International

0 references

full work available at URL

https://api.openml.org/data/v1/download/22102748/SGEMM_GPU_kernel_performance.arff

0 references
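
To illustrate how a file in this format can be read, here is a minimal Python sketch of a reader for dense, all-numeric ARFF content. It assumes attribute names without spaces, no sparse rows, and no missing values; the function name `read_numeric_arff` and the two-row `SAMPLE` are made up for illustration and are not taken from the dataset. For real use, an established reader such as `scipy.io.arff.loadarff` is likely the better choice.

```python
import io


def read_numeric_arff(lines):
    """Parse a dense, all-numeric ARFF stream into (attribute names, rows)."""
    names, rows, in_data = [], [], False
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        if in_data:
            # Data section: one comma-separated record per line.
            rows.append([float(x) for x in line.split(",")])
            continue
        lower = line.lower()
        if lower.startswith("@attribute"):
            # "@attribute <name> <type>"; strip optional quotes from the name.
            names.append(line.split()[1].strip("'\""))
        elif lower.startswith("@data"):
            in_data = True
    return names, rows


# Synthetic toy content, NOT real rows from the dataset.
SAMPLE = """\
@relation toy
@attribute MWG numeric
@attribute Run1 numeric
@data
16,1.5
32,2.5
"""

names, rows = read_numeric_arff(io.StringIO(SAMPLE))
print(names)  # ['MWG', 'Run1']
print(rows)   # [[16.0, 1.5], [32.0, 2.5]]
```

The same function accepts an open text file handle for a locally downloaded copy of the ARFF file, since it only iterates over lines.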

default target attribute

Run1

0 references
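
The description recommends working with the logarithm of the running times. A minimal Python sketch of that transform follows; the helper names `log_runtime` and `mean_log_runtime` are hypothetical, and averaging over four runs assumes that all four Run columns are available, as in the UCI original.

```python
import math
from statistics import mean


def log_runtime(run_ms):
    """Natural log of a single running time given in milliseconds."""
    return math.log(run_ms)


def mean_log_runtime(runs_ms):
    """Average of the log running times over repeated runs of one configuration."""
    return mean(math.log(t) for t in runs_ms)


# Range of running times quoted in the description (milliseconds).
fastest, slowest = 13.25, 3397.08

# The raw times span a factor of about 256; on the log scale this becomes
# an additive range, which is usually easier for regression models to fit.
ratio = slowest / fastest
log_range = log_runtime(slowest) - log_runtime(fastest)
print(round(ratio, 2))
print(round(log_range, 3))
```

Predictions made on the log scale can be mapped back to milliseconds with `math.exp`.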

d3068edeb7842bb3173991644ca7182c

determination method

MD5

0 references
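
The hash above is listed with determination method MD5. Assuming it is the checksum of the ARFF file at the download URL (the property label is not shown on this page, so this is an assumption), a downloaded copy could be checked with a sketch like the following. The function names and the local filename in the usage note are illustrative.

```python
import hashlib

# Hash value listed on this page; assumed to refer to the ARFF file.
EXPECTED_MD5 = "d3068edeb7842bb3173991644ca7182c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected("SGEMM_GPU_kernel_performance.arff")` would return `True` for an intact local copy if the assumption about the hash holds. MD5 is suitable here only as an integrity check against accidental corruption, not as a security guarantee.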

number of binary features

0

0 references

number of classes

0

0 references

number of features

15

0 references

number of instances

241,600

0 references

number of instances with missing values

0

0 references

number of missing values

0

0 references

number of numeric features

15

0 references

number of symbolic features

0

0 references

file format

ARFF

0 references

MaRDI profile type

MaRDI dataset profile

0 references

Identifiers

OpenML dataset ID

43871

0 references

Sitelinks

Mathematics(1 entry)

mardi Dataset:6036946

Revision as of 14:45, 15 April 2024: Importer (Bots, 7,049,768 edits). Created a new Item.
Latest revision as of 12:29, 16 April 2024: Import240416010454 (10,906 edits). Added link to MaRDI item.
Changed: links / mardi / name, from empty to Dataset:6036946