Latest revision as of 07:40, 2 September 2024

Task Area 3: Statistics and Machine Learning

Mission

In the context of MaRDI, and the NFDI more broadly, our task area seeks to address the needs that the Statistics and ML community has in the management of its research data. These research data range from literature, statistical models and algorithms to benchmark datasets and software. The task area will follow FAIR principles in initiating libraries of curated datasets (OpenML | Zenodo) that will be connected to software and research literature by providing an associated library of statistical analyses (mlr-org | Zenodo). The task area will further support FAIR development of new methods by creating workflows and a demonstration platform showing how to evaluate, compare, or tune methods through empirical analyses and simulation studies. Finally, the task area will cooperate with journal partners to establish standards for quality control and reproducibility of numerical experiments in the scientific publication process.

mlr3

mlr3 is an open-source collection of R packages providing a unified interface for machine learning in the R language.

You can make your algorithm available via the mlr3 interface by opening a pull request in our community repository mlr3extralearners.

Start by checking out our website or by reading our book.

OpenML

OpenML is a FAIR platform for sharing machine learning research data. You can use OpenML with mlr3, using the interface R package mlr3oml.

Zenodo community: Graphical Modelling and Causal Inference

This community offers a space for collecting and sharing datasets and software products that are related to graphical modelling and causal discovery/inference. Graphical models are statistical models that facilitate refined yet tractable data exploration by using graphs to represent complex stochastic independence structures among the variables considered. Models based on directed graphs, in particular, provide the state-of-the-art approach for detailed exploration of cause-effect relationships. The different instances of graphical models also go by names such as Bayesian networks, Markov random fields, probabilistic graphical models, structural causal models, or structural equation models.
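As a rough illustration of how a directed graph encodes an independence structure, the following minimal Python sketch builds a three-variable chain X -> Y -> Z. The variable names and all probability values are hypothetical and chosen only for this example; they are not taken from any dataset in the community. The joint distribution is defined through the factorization implied by the graph, and the code then checks numerically that X and Z are independent given Y while remaining dependent marginally.

```python
from itertools import product

# Hypothetical conditional probability tables for the chain X -> Y -> Z.
# All variables are binary; the numbers are illustrative assumptions.
p_x = {0: 0.6, 1: 0.4}
p_y_given_x = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # p_y_given_x[x][y]
p_z_given_y = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # p_z_given_y[y][z]


def joint(x, y, z):
    """Joint probability from the factorization implied by the directed graph."""
    return p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]


def marginal(**fixed):
    """Sum the joint over all variables that are not fixed by keyword."""
    total = 0.0
    for x, y, z in product((0, 1), repeat=3):
        values = {"x": x, "y": y, "z": z}
        if all(values[name] == value for name, value in fixed.items()):
            total += joint(x, y, z)
    return total


def max_conditional_independence_gap():
    """Largest |P(x, z | y) - P(x | y) P(z | y)| over all value combinations."""
    gap = 0.0
    for x, y, z in product((0, 1), repeat=3):
        p_y = marginal(y=y)
        lhs = marginal(x=x, y=y, z=z) / p_y
        rhs = (marginal(x=x, y=y) / p_y) * (marginal(y=y, z=z) / p_y)
        gap = max(gap, abs(lhs - rhs))
    return gap


def max_marginal_dependence_gap():
    """Largest |P(x, z) - P(x) P(z)| over all value combinations."""
    gap = 0.0
    for x, z in product((0, 1), repeat=2):
        gap = max(gap, abs(marginal(x=x, z=z) - marginal(x=x) * marginal(z=z)))
    return gap


if __name__ == "__main__":
    print("conditional independence gap:", max_conditional_independence_gap())
    print("marginal dependence gap:", max_marginal_dependence_gap())
```

The first gap is zero up to floating-point error because the graph has no edge between X and Z other than through Y, whereas the second gap is clearly positive. Real graphical-modelling software works with far larger graphs and estimates such structure from data; this sketch only shows the basic link between a graph and the independence statements it represents.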

@@ Line 5: / Line 5: @@
 the Statistics and ML community has in the management of its research data. These research data
 range from literature, statistical models and algorithms to benchmark data sets and software. The
-task area will follow FAIR principles in initiating libraries of curated datasets that will be connected to
+task area will follow FAIR principles in initiating libraries of curated datasets ([https://www.openml.org/search?type=study&study_type=task&id=353&sort=tasks_included OpenML] | [https://zenodo.org/communities/mardigmci/records?q=&l=list&p=1&s=10&sort=newest Zenodo])that will be connected to
-software and research literature by providing an associated library of statistical analyses. The task
+software and research literature by providing an associated library of statistical analyses ([https://mlr-org.com/gallery mlr-org] | [https://doi.org/10.5281/zenodo.10625451 Zenodo]). The task
 area will further support FAIR development of new methods by creating workflows and a demonstration
 platform for how to evaluate, compare or also tune methods through empirical analyses and
 simulation studies. Finally, the task area will cooperate with journal partners to establish standards
 for quality control and reproducibility of numerical experiments in the scientific publication process.
 == mlr3 ==
@@ Line 24: / Line 23: @@
 [https://www.openml.org/ OpenML] is a FAIR platform for sharing machine learning research data.
-You can use OpenML with mlr3, using interface R package [https://github.com/mlr-org/mlr3oml/ mlr3oml].
+You can use OpenML with mlr3, using the interface R package [https://github.com/mlr-org/mlr3oml/ mlr3oml].
+== Zenodo community: Graphical Modelling and Causal Inference ==
+This [https://zenodo.org/communities/mardigmci community] offers a space for collecting and sharing datasets and software products that are related to graphical modelling and causal discovery/inference.  Graphical models are statistical models that facilitate refined yet tractable data exploration by using graphs to represent complex stochastic independence structures between considered variables.  Models based on directed graphs, in particular, provide the state-of-the-art approach for detailed exploration of cause-effect relationships.  The different instances of graphical models also go by names such as Bayesian networks, Markov random fields, probabilistic graphical models, structural causal or structural equation models.
 == Team ==
 * [https://www.slds.stat.uni-muenchen.de/people/bischl/ Bernd Bischl (LMU München)]
-* [https://www.math.cit.tum.de/en/statistics/people/mathias-drton/ Mathias Drton (TU München)]
+* [https://www.math.cit.tum.de/en/math/people/professors/drton-mathias/ Mathias Drton (TU München)]
 * [https://www.slds.stat.uni-muenchen.de/people/fischer/ Sebastian Fischer (LMU München)]
-* [https://www.math.cit.tum.de/en/statistics/people/haug/ Stephan Haug (TU München)]
+* [https://www.math.cit.tum.de/math/hgstp/ Stephan Haug (TU München)]
+* [https://www.wias-berlin.de/~nemeth/ László Németh (WIAS Berlin)]
 * [https://www.wias-berlin.de/people/tabelow/ Karsten Tabelow (WIAS Berlin)]
-* [https://www.math.cit.tum.de/en/statistics/people/oleksandr-zadorozhnyi/ Oleksandr Zadoroyhnyi (TU München)]
+* [https://www.math.cit.tum.de/en/math/people/academic-staff/oleksandr-zadorozhnyi/ Oleksandr Zadorozhnyi (TU München)]

Portal/TA3: Difference between revisions