datanugget (Q158256): Difference between revisions

@@ Property / programmed in @@
-R
@@ Property / programmed in: R / rank @@
-Normal rank
@@ Property / last update @@
-2 May 2023 (timestamp +2023-05-02T00:00:00Z, timezone +00:00, calendar Gregorian, precision 1 day, before 0, after 0)
@@ Property / last update: 2 May 2023 / rank @@
-Normal rank
@@ Property / author @@
-Traymon Beavers
@@ Property / author: Traymon Beavers / rank @@
-Normal rank
@@ Property / author @@
-Javier Cabrera
@@ Property / author: Javier Cabrera / rank @@
-Normal rank
@@ Property / author @@
-Mariusz Lubomirski
@@ Property / author: Mariusz Lubomirski / rank @@
-Normal rank
@@ Property / maintained by @@
-Yajie Duan
@@ Property / maintained by: Yajie Duan / rank @@
-Normal rank
@@ Property / copyright license @@
-GNU General Public License, version 2.0
@@ Property / copyright license: GNU General Public License, version 2.0 / rank @@
-Normal rank
@@ Property / depends on software @@
-R
@@ Property / depends on software: R / rank @@
-Normal rank
@@ Property / depends on software: R / qualifier @@
-software version identifier: ≥ 4.0
@@ Property / depends on software @@
-doSNOW
@@ Property / depends on software: doSNOW / rank @@
-Normal rank
@@ Property / depends on software: doSNOW / qualifier @@
-software version identifier: ≥ 1.0.16
@@ Property / depends on software @@
-foreach
@@ Property / depends on software: foreach / rank @@
-Normal rank
@@ Property / depends on software: foreach / qualifier @@
-software version identifier: ≥ 1.5.1
@@ Property / depends on software @@
-parallel
@@ Property / depends on software: parallel / rank @@
-Normal rank
@@ Property / depends on software: parallel / qualifier @@
-software version identifier: ≥ 4.0.5
@@ Property / software version identifier @@
+1.0.0
@@ Property / software version identifier: 1.0.0 / rank @@
+Normal rank
@@ Property / software version identifier: 1.0.0 / qualifier @@
+publication date: 24 January 2020 (timestamp +2020-01-24T00:00:00Z, timezone +00:00, calendar Gregorian, precision 1 day, before 0, after 0)
@@ Property / software version identifier @@
+1.2.2
@@ Property / software version identifier: 1.2.2 / rank @@
+Normal rank
@@ Property / software version identifier: 1.2.2 / qualifier @@
+publication date: 26 October 2023 (timestamp +2023-10-26T00:00:00Z, timezone +00:00, calendar Gregorian, precision 1 day, before 0, after 0)
@@ Property / software version identifier @@
+1.2.4
@@ Property / software version identifier: 1.2.4 / rank @@
+Normal rank
@@ Property / software version identifier: 1.2.4 / qualifier @@
+publication date: 28 November 2023 (timestamp +2023-11-28T00:00:00Z, timezone +00:00, calendar Gregorian, precision 1 day, before 0, after 0)
@@ Property / programmed in @@
+R
@@ Property / programmed in: R / rank @@
+Normal rank
@@ Property / last update @@
+28 November 2023 (timestamp +2023-11-28T00:00:00Z, timezone +00:00, calendar Gregorian, precision 1 day, before 0, after 0)
@@ Property / last update: 28 November 2023 / rank @@
+Normal rank
@@ Property / maintained by @@
+Yajie Duan
@@ Property / maintained by: Yajie Duan / rank @@
+Normal rank
@@ Property / description @@
+Creating and refining data nuggets. Data nuggets reduce a large dataset to a small collection of nuggets of data, each containing a center (location), weight (importance), and scale (variability) parameter. Data nugget centers are created by choosing observations in the dataset that are as equally spaced apart as possible. Data nugget weights are created by counting the number of observations closest to a given data nugget’s center. We then say the data nugget 'contains' these observations, and the data nugget center is recalculated as the mean of these observations. Data nugget scales are created by calculating the trace of the covariance matrix of the observations contained within a data nugget, divided by the dimension of the dataset. Data nuggets are refined by 'splitting' data nuggets which have scales or shapes (defined as the ratio of the two largest eigenvalues of the covariance matrix of the observations contained within the data nugget) Reference papers: [1] Cherasia, K. E., Cabrera, J., Fernholz, L. T., & Fernholz, R. (2022). Data Nuggets in Supervised Learning. In Robust and Multivariate Statistical Methods: Festschrift in Honor of David E. Tyler (pp. 429-449). Cham: Springer International Publishing. [2] Beavers, T., Cheng, G., Duan, Y., Cabrera, J., Lubomirski, M., Amaratunga, D., & Teigler, J. (2023). Data Nuggets: A Method for Reducing Big Data While Preserving Data Structure (Submitted for Publication).
+Normal rank
@@ Property / author @@
+Traymon Beavers
@@ Property / author: Traymon Beavers / rank @@
+Normal rank
@@ Property / author @@
+Javier Cabrera
@@ Property / author: Javier Cabrera / rank @@
+Normal rank
@@ Property / author @@
+Ge Cheng
@@ Property / author: Ge Cheng / rank @@
+Normal rank
@@ Property / author @@
+Kunting Qi
@@ Property / author: Kunting Qi / rank @@
+Normal rank
@@ Property / author @@
+Mariusz Lubomirski
@@ Property / author: Mariusz Lubomirski / rank @@
+Normal rank
@@ Property / copyright license @@
+GNU General Public License, version 2.0
@@ Property / copyright license: GNU General Public License, version 2.0 / rank @@
+Normal rank
@@ Property / depends on software @@
+doSNOW
@@ Property / depends on software: doSNOW / rank @@
+Normal rank
@@ Property / depends on software: doSNOW / qualifier @@
+software version identifier: ≥ 1.0.16
@@ Property / depends on software @@
+foreach
@@ Property / depends on software: foreach / rank @@
+Normal rank
@@ Property / depends on software: foreach / qualifier @@
+software version identifier: ≥ 1.5.1
@@ Property / depends on software @@
+parallel
@@ Property / depends on software: parallel / rank @@
+Normal rank
@@ Property / depends on software: parallel / qualifier @@
+software version identifier: ≥ 4.0.5
@@ Property / depends on software @@
+Rfast
@@ Property / depends on software: Rfast / rank @@
+Normal rank
@@ Property / depends on software: Rfast / qualifier @@
+software version identifier: ≥ 2.0.7
@@ Property / depends on software @@
+R
@@ Property / depends on software: R / rank @@
+Normal rank
@@ Property / depends on software: R / qualifier @@
+software version identifier: ≥ 4.0
@@ Property / MaRDI profile type @@
+MaRDI software profile
@@ Property / MaRDI profile type: MaRDI software profile / rank @@
+Normal rank
@@ links / mardi / name / links / mardi / name @@
+Software:158256

Latest revision as of 20:00, 12 March 2024

Create, and Refine Data Nuggets

Language	Label	Description	Also known as
English	datanugget	Create, and Refine Data Nuggets

Statements

instance of

R package

0 references

software version identifier

1.2.1

publication date

2 May 2023

0 references

1.0.0

publication date

24 January 2020

0 references

1.2.2

publication date

26 October 2023

0 references

1.2.4

publication date

28 November 2023

0 references

0 references

0 references

28 November 2023

0 references

maintained by

Yajie Duan

0 references

description

Creating and refining data nuggets. Data nuggets reduce a large dataset to a small collection of nuggets of data, each containing a center (location), weight (importance), and scale (variability) parameter. Data nugget centers are created by choosing observations in the dataset that are as equally spaced apart as possible. Data nugget weights are created by counting the number of observations closest to a given data nugget’s center. We then say the data nugget 'contains' these observations, and the data nugget center is recalculated as the mean of these observations. Data nugget scales are created by calculating the trace of the covariance matrix of the observations contained within a data nugget, divided by the dimension of the dataset. Data nuggets are refined by 'splitting' data nuggets which have scales or shapes (defined as the ratio of the two largest eigenvalues of the covariance matrix of the observations contained within the data nugget) Reference papers: [1] Cherasia, K. E., Cabrera, J., Fernholz, L. T., & Fernholz, R. (2022). Data Nuggets in Supervised Learning. In Robust and Multivariate Statistical Methods: Festschrift in Honor of David E. Tyler (pp. 429-449). Cham: Springer International Publishing. [2] Beavers, T., Cheng, G., Duan, Y., Cabrera, J., Lubomirski, M., Amaratunga, D., & Teigler, J. (2023). Data Nuggets: A Method for Reducing Big Data While Preserving Data Structure (Submitted for Publication).

0 references
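
The weight, center, and scale definitions in the description can be illustrated with a small sketch. This is not the datanugget package's R interface: it is a standard-library Python illustration, and the function names `sq_dist` and `summarize_nuggets` are hypothetical. It assumes the initial centers are already chosen (the "equally spaced" selection step is not shown), that the covariance is the sample covariance with an n - 1 denominator, and that a nugget holding a single observation gets scale 0. It relies on the fact that the trace of a covariance matrix equals the sum of the per-coordinate variances.

```python
from statistics import fmean


def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def summarize_nuggets(observations, initial_centers):
    """Assign each observation to its nearest initial center, then compute
    weight, recalculated center, and scale for every non-empty group.

    Hypothetical helper for illustration only.
    """
    dim = len(observations[0])
    groups = [[] for _ in initial_centers]
    for obs in observations:
        nearest = min(
            range(len(initial_centers)),
            key=lambda k: sq_dist(obs, initial_centers[k]),
        )
        groups[nearest].append(obs)

    nuggets = []
    for members in groups:
        if not members:
            continue
        weight = len(members)  # number of observations the nugget contains
        columns = list(zip(*members))  # one tuple per coordinate
        center = [fmean(col) for col in columns]  # mean of contained observations
        if weight > 1:
            # Assumption: sample covariance (n - 1 denominator).
            # trace(covariance) = sum of per-coordinate variances.
            trace = sum(
                sum((x - c) ** 2 for x in col) / (weight - 1)
                for col, c in zip(columns, center)
            )
        else:
            trace = 0.0  # assumption: a single observation has no spread
        nuggets.append({"center": center, "weight": weight, "scale": trace / dim})
    return nuggets
```

For example, calling `summarize_nuggets([(0, 0), (0, 2), (2, 0), (2, 2), (10, 10), (10, 12)], [(0, 0), (10, 10)])` groups the first four points into one nugget and the last two into another, each summarized by its mean, its count, and its trace-over-dimension scale.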
