Identifying 3D Genome Organization in Diploid Organisms via Euclidean Distance Geometry
From MaRDI portal
Publication:5065472
Abstract: The spatial organization of the DNA in the cell nucleus plays an important role in gene regulation, DNA replication, and genomic integrity. Through the development of chromosome conformation capture experiments (such as 3C, 4C, and Hi-C), it is now possible to obtain the contact frequencies of the DNA at the whole-genome level. In this paper, we study the problem of reconstructing the 3D organization of the genome from such whole-genome contact frequencies. A standard approach is to transform the contact frequencies into noisy distance measurements and then apply semidefinite programming (SDP) formulations to obtain the 3D configuration. However, such reconstructions neglect the fact that most eukaryotes, including humans, are diploid and therefore contain two copies of each genomic locus. We prove that the 3D organization of the DNA is not identifiable from distance measurements derived from contact frequencies in diploid organisms. In fact, there are infinitely many solutions even in the noise-free setting. We then discuss various additional biologically relevant and experimentally measurable constraints (including distances between neighboring genomic loci and higher-order interactions) and prove identifiability under these conditions. Furthermore, we provide SDP formulations for computing the 3D embedding of the DNA with these additional constraints and show that we can recover the true 3D embedding with high accuracy from both noiseless and noisy measurements. Finally, we apply our algorithm to real pairwise and higher-order contact frequency data and show that we can recover known genome organization patterns.
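The non-identifiability claim in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's construction or its SDP formulation. It assumes a measurement model in which the unphased value for a locus pair (i, j) is the sum of the four squared distances between the homolog pairs {x_i, y_i} and {x_j, y_j}; this model, and all function names below, are assumptions made for illustration. Under that assumption, the measurements depend only on each locus's midpoint and on the length of its half-offset (x_i - y_i)/2, so rotating each half-offset independently yields a different 3D structure with identical measurements.

```python
import numpy as np


def composite_sq_distances(x, y):
    """Assumed unphased measurement: for loci i, j, sum the four squared
    distances between homolog pairs {x_i, y_i} and {x_j, y_j}."""
    pts = np.stack([x, y])  # shape (2, n, 3)
    # diff[a, b, i, j] = pts[a, i] - pts[b, j]
    diff = pts[:, None, :, None, :] - pts[None, :, None, :, :]
    return (diff ** 2).sum(axis=-1).sum(axis=(0, 1))  # shape (n, n)


def full_sq_distances(x, y):
    """Phased (ground-truth) squared distances between all 2n points."""
    pts = np.concatenate([x, y])  # shape (2n, 3)
    diff = pts[:, None, :] - pts[None, :, :]
    return (diff ** 2).sum(axis=-1)


def random_orthogonal(rng):
    """Random 3x3 orthogonal matrix from a QR factorization."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return q


def rotate_homolog_offsets(x, y, rng):
    """Keep each locus midpoint fixed and apply an independent orthogonal
    map to each half-offset; norms of the half-offsets are preserved."""
    mid = (x + y) / 2
    half = (x - y) / 2
    rot = np.stack([random_orthogonal(rng) for _ in range(len(x))])
    half_new = np.einsum("nij,nj->ni", rot, half)
    return mid + half_new, mid - half_new


rng = np.random.default_rng(0)
n = 6  # number of loci (hypothetical toy size)
x = rng.normal(size=(n, 3))  # homolog copy 1
y = rng.normal(size=(n, 3))  # homolog copy 2
x2, y2 = rotate_homolog_offsets(x, y, rng)

# Same unphased measurements ...
same_measurements = np.allclose(
    composite_sq_distances(x, y), composite_sq_distances(x2, y2)
)
# ... but the phased distance matrices differ, so the two configurations
# are not related by a rigid motion.
same_structure = np.allclose(full_sq_distances(x, y), full_sq_distances(x2, y2))
print(same_measurements, same_structure)
```

Because the orthogonal maps can be chosen freely for every locus, this toy model admits a continuous family of configurations consistent with the same noise-free data, which is the kind of ambiguity that additional constraints (such as distances between neighboring loci) are meant to remove.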
Recommendations
- Modeling three-dimensional chromosome structures using gene expression data
- Reconstructing the spatial order of chromosomes by a convex hull algorithm
- Model-based distance embedding with applications to chromosomal conformation biology
- Distance-based genome rearrangement phylogeny
- Analysis of similarity/dissimilarity of DNA sequences by a new 3D graphical representation
- Placing Probes along the Genome Using Pairwise Distance Data
Cites work
- scientific article; zbMATH DE number 847265
- Distance shrinkage and Euclidean embedding via regularized kernel estimation
- Euclidean distance matrix completion problems
- Framework for kernel regularization with application to protein clustering
- Graph implementations for nonsmooth convex programs
- Solving Euclidean distance matrix completion problems via semidefinite programming
- Sum of squares method for sensor network localization
- The convex geometry of linear inverse problems
Cited in
- Algebraic structures in statistical methodology. Abstracts from the workshop held December 4--10, 2022
- Inferring the three-dimensional structures of the X-chromosome during X-inactivation
- 3D genome reconstruction from partially phased Hi-C data
- Model-based distance embedding with applications to chromosomal conformation biology
- Statistical curve models for inferring 3D chromatin architecture