
From MaRDI portal
Revision as of 13:26, 16 April 2024 by Import240416010454 (talk | contribs) (Created automatically from import240416010454)


OpenML227 (MaRDI QID: Q6033025)

OpenML dataset with id 227

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/3664/cpu_small.arff

Upload date: 23 April 2014

Dataset Characteristics

Number of classes: 0
Number of features: 13 (numeric: 13, symbolic: 0, binary: 0)
Number of instances: 8,192
Number of instances with missing values: 0
Number of missing values: 0
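
The counts above can be recomputed from the ARFF file itself. The following is a minimal sketch, not the portal's or OpenML's own tooling: `parse_numeric_arff`, `summarize`, and `fetch_and_summarize` are hypothetical helper names, and the parser assumes a dense, all-numeric, comma-separated ARFF file in which `?` marks a missing value. For general ARFF files an established reader (for example `scipy.io.arff.loadarff`) is the safer choice.

```python
import urllib.request

# URL copied from the "Full work available at URL" line of this page.
ARFF_URL = "https://api.openml.org/data/v1/download/3664/cpu_small.arff"


def parse_numeric_arff(text):
    """Parse a dense, all-numeric ARFF document into (names, rows).

    Simplified sketch: skips comments and blank lines, assumes
    comma-separated values, and stores '?' (missing) as None.
    """
    names, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue
        lower = line.lower()
        if in_data:
            rows.append(
                [None if v.strip() == "?" else float(v) for v in line.split(",")]
            )
        elif lower.startswith("@attribute"):
            # "@attribute <name> <type>"; drop optional quotes around the name.
            names.append(line.split()[1].strip("'\""))
        elif lower.startswith("@data"):
            in_data = True
    return names, rows


def summarize(names, rows):
    """Return the counts listed under 'Dataset Characteristics'."""
    return {
        "features": len(names),
        "instances": len(rows),
        "instances_with_missing": sum(
            1 for r in rows if any(v is None for v in r)
        ),
        "missing_values": sum(v is None for r in rows for v in r),
    }


def fetch_and_summarize(url=ARFF_URL):
    """Download the file and summarize it (requires network access)."""
    with urllib.request.urlopen(url) as resp:
        text = resp.read().decode("utf-8")
    return summarize(*parse_numeric_arff(text))
```

Calling `fetch_and_summarize()` on a machine with network access should let the reported figures (13 features, 8,192 instances, no missing values) be compared against the downloaded file.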

Author: (none given). Source: Unknown. Please cite: (none given).

The Computer Activity databases are a collection of computer systems
activity measures. The data was collected from a Sun SPARCstation
20/712 with 128 Mbytes of memory running in a multi-user university
department. Users would typically be doing a large variety of tasks,
ranging from accessing the internet and editing files to running very
CPU-bound programs. The data was collected continuously on two
separate occasions. On both occasions, system activity was gathered
every 5 seconds. The final dataset is taken from both occasions, with
equal numbers of observations coming from each collection epoch.

System measures used:
1. lread - Reads (transfers per second) between system memory and user memory.
2. lwrite - Writes (transfers per second) between system memory and user memory.
3. scall - Number of system calls of all types per second.
4. sread - Number of system read calls per second.
5. swrite - Number of system write calls per second.
6. fork - Number of system fork calls per second. 
7. exec - Number of system exec calls per second. 
8. rchar - Number of characters transferred per second by system read calls.
9. wchar - Number of characters transferred per second by system write calls.
10. pgout - Number of page out requests per second.
11. ppgout - Number of pages paged out per second.
12. pgfree - Number of pages per second placed on the free list. 
13. pgscan - Number of pages checked if they can be freed per second.
14. atch - Number of page attaches (satisfying a page fault by reclaiming a page in memory) per second.
15. pgin - Number of page-in requests per second.
16. ppgin - Number of pages paged in per second.
17. pflt - Number of page faults caused by protection errors (copy-on-writes). 
18. vflt - Number of page faults caused by address translation. 
19. runqsz - Process run queue size.
20. freemem - Number of memory pages available to user processes.
21. freeswap - Number of disk blocks available for page swapping. 
22. usr - Portion of time (%) that cpus run in user mode.
23. sys - Portion of time (%) that cpus run in system mode.
24. wio - Portion of time (%) that cpus are idle waiting for block IO.
25. idle - Portion of time (%) that cpus are otherwise idle.

The two different regression tasks obtained from these databases are:

Predict usr, the portion of time that cpus run in user mode, from all attributes 1-21.

Predict usr using a restricted set of attributes (excluding the paging information, attributes 10-18).
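
The two predictor sets can be written out explicitly. This is a small sketch using the attribute names and numbering from the list of system measures above; the names `ATTRIBUTES`, `TARGET`, and `predictors` are illustrative and do not come from the dataset or from OpenML.

```python
# Attribute names in the order given under "System measures used" (1-25).
ATTRIBUTES = [
    "lread", "lwrite", "scall", "sread", "swrite", "fork", "exec",
    "rchar", "wchar", "pgout", "ppgout", "pgfree", "pgscan", "atch",
    "pgin", "ppgin", "pflt", "vflt", "runqsz", "freemem", "freeswap",
    "usr", "sys", "wio", "idle",
]
TARGET = "usr"


def predictors(restricted=False):
    """Return predictor names for one of the two regression tasks.

    restricted=False: all attributes 1-21.
    restricted=True:  attributes 1-21 without the paging measures 10-18.
    """
    candidates = ATTRIBUTES[:21]        # attributes 1-21 (1-based numbering)
    if restricted:
        paging = set(ATTRIBUTES[9:18])  # attributes 10-18
        candidates = [name for name in candidates if name not in paging]
    return candidates


full_task = predictors()
restricted_task = predictors(restricted=True)
print(len(full_task), len(restricted_task))
```

The restricted task keeps 12 predictors, which together with the target usr gives 13 columns, in agreement with the 13 continuous attributes reported for this dataset.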

Original source: DELVE repository of data. 
Source: collection of regression datasets by Luis Torgo (ltorgo@ncc.up.pt) at
Characteristics: 8192 cases, 13 continuous attributes