Coding sequence density estimation via topological pressure (Q2512920)

From MaRDI portal

scientific article
Language: English
Label: Coding sequence density estimation via topological pressure
Description: scientific article

    Statements

    Coding sequence density estimation via topological pressure (English)
    2 February 2015
    In this paper, the authors present a novel approach to genomic analysis based on thermodynamic formalism. In particular, they adapt topological pressure, a well-known concept from ergodic theory, to finite sequences, where it acts as a weighted measure of the complexity of a sequence. The study achieves two goals: (1) topological pressure, applied to the human genome, is used to predict the coding sequence density of other genomes. This gives the approach a key practical advantage, because a single moderately phylogenetically distant informant genome suffices as training data; (2) using the theory of thermodynamic formalism, the data encoded by topological pressure are turned into a probability distribution that measures the coding potential of nucleotide sequences between 750 and 5,000 bp in length. A further advantage of the approach is its speed: topological pressure can predict the coding sequence density of a genome in a matter of seconds, whereas ab initio prediction programs take a few hours and evidence-based methods can take weeks.
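To make "a weighted measure of complexity of a finite sequence" concrete, the following is a minimal sketch in Python. It assumes one plausible form of the quantity: collect the distinct subwords of length n, weight each subword by the product of the weights of its overlapping trinucleotides, sum, and take a normalised base-4 logarithm. The function name `topological_pressure`, the `weight` callback, and the choice of trinucleotides are illustrative assumptions and are not taken from the paper.

```python
import math


def topological_pressure(seq, n, weight):
    """Weighted complexity of `seq` at subword length `n` (illustrative only).

    `weight` is a hypothetical callback mapping a trinucleotide (str of
    length 3) to a positive number. With weight == 1 everywhere, the result
    reduces to log_4(number of distinct length-n subwords) / n, an
    unweighted complexity measure.
    """
    if n < 3:
        raise ValueError("n must be at least 3 so each subword contains a trinucleotide")
    if len(seq) < n:
        raise ValueError("sequence is shorter than the subword length")

    # Distinct subwords of length n that occur in the sequence.
    subwords = {seq[i:i + n] for i in range(len(seq) - n + 1)}

    total = 0.0
    for u in subwords:
        # Product of the weights of all overlapping trinucleotides in u.
        w = 1.0
        for j in range(len(u) - 2):
            w *= weight(u[j:j + 3])
        total += w

    # Base-4 logarithm because the DNA alphabet has four letters.
    return math.log(total, 4) / n


if __name__ == "__main__":
    # Uniform weights: only the number of distinct subwords matters.
    print(topological_pressure("ACGTACGT", 3, lambda t: 1.0))
```

With uniform weights, the sequence `ACGTACGT` has four distinct subwords of length 3, so the value is log_4(4)/3 = 1/3. Non-uniform weights let some trinucleotides count more than others, which is what allows a quantity of this kind to be tuned against known coding regions.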
    DNA sequence analysis
    coding sequence density estimation
    topological pressure