BEELINE (Q4834871): Difference between revisions

Latest revision as of 21:53, 11 June 2025

Dataset published at Zenodo repository.

Language	Label	Description	Also known as
English	BEELINE	Dataset published at Zenodo repository.

Statements

instance of

data set

0 references

publication date

27 February 2023

0 references

copyright license

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International

0 references

community

Graphical Modelling and Causal Inference

0 references

description

This collection consists of over 400 single-cell gene expression datasets across four curated and six synthetic gene regulatory networks. It was created to benchmark algorithms for gene regulatory network inference in Pratapa et al. (2020).

Task: The collection can be used to study causal inference algorithms.

Summary:
Size of collection: 400 datasets on 6 - 19 features of different size
Task: Causal Inference Problem
Data Type: Mixed Data
Dataset Scope: Collection of Datasets
Ground Truth: Known Graph
Temporal Structure: Static Data
License: CC BY-NC 4.0 (see 10.5281/zenodo.3701939)
Missing Values: No Missing Values
Missingness Statement: There are no missing values.

Collection: (for a detailed description see Peng et al. (2024), for simulation details see Pratapa et al. (2020))

Curated: There are experiments on four curated gene regulatory networks:
mCAD (Mammalian Cortical Area Development, 14 edges and 5 nodes)
VSC (Ventral Spinal Cord Development, 15 edges and 8 nodes)
HSC (Hematopoietic Stem Cell Differentiation, 30 edges and 11 nodes)
GSD (Gonadal Sex Determination, 79 edges and 18 nodes)

Synthetic: There are experiments on six synthetic gene regulatory networks:
dyn-BF (Bifurcating, 12 edges and 5 nodes)
dyn-BFC (Bifurcating Converging, 18 edges and 9 nodes)
dyn-CY (Cycle, 6 edges and 5 nodes)
dyn-LI (Linear, 8 edges and 7 nodes)
dyn-LL (Linear Long, 19 edges and 18 nodes)
dyn-TF (Trifurcating, 20 edges and 7 nodes)

Files per Experiment:
GroundTruth.csv: This file represents the actual biological regulatory interactions between genes, typically derived from known databases, literature, or synthetic models. An edge weight of +1 represents activation, -1 represents inhibition.
refNetwork.csv: This file is a processed version of the ground truth network, keeping only the sign (+ or -) of interactions.
ExpressionData.csv: This file contains the RNAseq data, with genes as rows and cell IDs as columns.
PseudoTime.csv: This file contains the pseudotime, a computationally inferred measure that orders single cells along a trajectory to represent their progression through a biological process, such as differentiation or development.
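The file layout described above can be illustrated with a minimal Python sketch. It uses only the standard library and small in-memory stand-ins for the files. The description states that ExpressionData.csv has genes as rows and cell IDs as columns and that edge weights are +1 or -1. It does not give the column headers, so the names `Gene1`, `Gene2` and `Weight`, the helper functions, and the toy values below are all hypothetical and should be checked against the actual files.

```python
import csv
import io


def read_expression(text):
    """Parse a genes-by-cells CSV into {cell_id: {gene: value}}.

    Assumes the first column holds gene names and the header row
    holds cell IDs, as the dataset description states.
    """
    rows = list(csv.reader(io.StringIO(text)))
    cell_ids = rows[0][1:]  # skip the label above the gene column
    cells = {cell: {} for cell in cell_ids}
    for row in rows[1:]:
        gene, values = row[0], row[1:]
        for cell, value in zip(cell_ids, values):
            cells[cell][gene] = float(value)
    return cells


def read_signed_edges(text, source="Gene1", target="Gene2", weight="Weight"):
    """Parse an edge list into {(regulator, target): +1 or -1}.

    The column names are hypothetical defaults, not taken from the dataset.
    """
    reader = csv.DictReader(io.StringIO(text))
    return {(row[source], row[target]): int(row[weight]) for row in reader}


# Toy stand-ins for ExpressionData.csv and GroundTruth.csv (invented values).
expression_csv = ",c0,c1\ng1,0.5,1.0\ng2,2.0,0.0\n"
edges_csv = "Gene1,Gene2,Weight\ng1,g2,1\ng2,g1,-1\n"

cells = read_expression(expression_csv)
edges = read_signed_edges(edges_csv)

activators = [pair for pair, sign in edges.items() if sign == 1]
inhibitors = [pair for pair, sign in edges.items() if sign == -1]
```

Transposing the expression table into one record per cell puts it in the samples-by-features shape that most causal inference tools expect. The signed edge dictionary plays the role of the known graph used as ground truth.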

0 references

author

Mathematical Research Data Initiative

0 references

MaRDI profile type

MaRDI dataset profile

0 references

Identifiers

Zenodo ID

7682713

0 references

DOI

10.5281/ZENODO.7682713

0 references

Sitelinks

Mathematics (1 entry)

mardi Dataset:4834871

@@ label / en / label / en @@
-BEELINE datasets
+BEELINE
@@ description / en / description / en @@
-Dataset published at Zenodo repository
+Dataset published at Zenodo repository.
@@ Property / author @@
-Mathias Drton
@@ Property / author: Mathias Drton / rank @@
-Normal rank
@@ Property / author @@
-Stephan Haug
@@ Property / author: Stephan Haug / rank @@
-Normal rank
@@ Property / author @@
-David Reifferscheidt
@@ Property / author: David Reifferscheidt / rank @@
-Normal rank
@@ Property / author @@
-Oleksandr Zadorozhnyi
@@ Property / author: Oleksandr Zadorozhnyi / rank @@
-Normal rank
@@ Property / DOI @@
-10.5281/zenodo.7682713
@@ Property / DOI: 10.5281/zenodo.7682713 / rank @@
-Normal rank
@@ Property / P1457 (Deleted Property) @@
-Over 400 simulated datasets (across six synthetic networks and four curated Boolean models) originally used for benchmarking algorithms for gene regulatory network inference. Property P1457 not found, cannot determine the data type to use.
-Normal rank
@@ Property / cites work @@
-Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data
-Normal rank
@@ Property / MaRDI profile type @@
-MaRDI dataset profile
@@ Property / MaRDI profile type: MaRDI dataset profile / rank @@
-Normal rank
@@ Property / community @@
+Graphical Modelling and Causal Inference
@@ Property / community: Graphical Modelling and Causal Inference / rank @@
+Normal rank
@@ Property / description @@
+This collection consists of over 400 single-cell gene expression datasets across four curated and six synthetic gene regulatory networks. It was created to benchmark algorithms for gene regulatory network inference in Pratapa et al. (2020). Task: The collection can be used to study causal inference algorithms. Summary: Size of collection: 400 datasets on 6 - 19 features of different size Task: Causal Inference Problem Data Type: Mixed Data Dataset Scope: Collection of Datasets Ground Truth: Known Graph Temporal Structure: Static Data License: CC BY-NC 4.0 (see 10.5281/zenodo.3701939) Missing Values: No Missing Values Missingness Statement: There are no missing values. Collection: (for a detailed description see Peng et al. (2024), for simulation details see Pratapa et al. (2020)) Curated: There are experiments on four curated gene regulatory networks: mCAD (Mammalian Cortical Area Development, 14 edges and 5 nodes), VSC (Ventral Spinal Cord Development, 15 edges and 8 nodes), HSC (Hematopoietic Stem Cell Differentiation, 30 edges and 11 nodes), and GSD (Gonadal Sex Determination, 79 edges and 18 nodes). Synthetic: There are experiments on six synthetic gene regulatory networks: dyn-BF (Bifurcating, 12 edges and 5 nodes), dyn-BFC (Bifurcating Converging, 18 edges and 9 nodes), dyn-CY (Cycle, 6 edges and 5 nodes), dyn-LI (Linear, 8 edges and 7 nodes), dyn-LL (Linear Long, 19 edges and 18 nodes), and dyn-TF (Trifurcating, 20 edges and 7 nodes). Files per Experiment: GroundTruth.csv: This file represents the actual biological regulatory interactions between genes, typically derived from known databases, literature, or synthetic models. An edge weight of +1 represents activation, -1 represents inhibition. refNetwork.csv: This file is a processed version of the ground truth network, keeping only the sign (+ or -) of interactions. ExpressionData.csv: This file contains the RNAseq data, with genes as rows and cell IDs as columns.
PseudoTime.csv: This file contains the pseudotime, a computationally inferred measure that orders single cells along a trajectory to represent their progression through a biological process, such as differentiation or development.
+Normal rank
@@ Property / author @@
+Mathematical Research Data Initiative
@@ Property / author: Mathematical Research Data Initiative / rank @@
+Normal rank
@@ Property / MaRDI profile type @@
+MaRDI dataset profile
@@ Property / MaRDI profile type: MaRDI dataset profile / rank @@
+Normal rank
@@ Property / DOI @@
+10.5281/ZENODO.7682713
@@ Property / DOI: 10.5281/ZENODO.7682713 / rank @@
+Normal rank


Import250611010657 (talk | contribs)
