Sparse approaches for the exact distribution of patterns in long state sequences generated by a Markov source
From MaRDI portal
(Redirected from Publication:384993)
Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20) Protein sequences, DNA sequences (92D20) Markov chains (discrete-time Markov processes on discrete state spaces) (60J10) Analysis of algorithms and problem complexity (68Q25)
Abstract: We present two novel approaches for the computation of the exact distribution of a pattern in a long sequence. Both approaches take into account the sparse structure of the problem and are two-part algorithms. The first approach relies on a partial recursion after a fast computation of the second largest eigenvalue of the transition matrix of a Markov chain embedding. The second approach uses fast Taylor expansions of an exact bivariate rational reconstruction of the distribution. We illustrate the interest of both approaches on a simple toy-example and two biological applications: the transcription factors of the Human Chromosome 5 and the PROSITE signatures of functional motifs in proteins. On these example our methods demonstrate their complementarity and their hability to extend the domain of feasibility for exact computations in pattern problems to a new level.
Recommendations
- On the First k Moments of the Random Count of a Pattern in a Multistate Sequence Generated by a Markov Source
- Numerical Solutions for Patterns Statistics on Markov Chains
- Multiple pattern matching: a Markov chain approach
- Distributions of pattern statistics in sparse Markov models
- Explicit distributional results in pattern formation
Cites work
- scientific article; zbMATH DE number 5542185 (Why is no real title available?)
- scientific article; zbMATH DE number 3460178 (Why is no real title available?)
- scientific article; zbMATH DE number 1517989 (Why is no real title available?)
- scientific article; zbMATH DE number 2183071 (Why is no real title available?)
- scientific article; zbMATH DE number 941396 (Why is no real title available?)
- scientific article; zbMATH DE number 947423 (Why is no real title available?)
- scientific article; zbMATH DE number 3303655 (Why is no real title available?)
- A Unified Construction of the Glushkov, Follow, and Antimirov Automata
- A unified approach to word occurrence probabilities
- An Efficient Formula for Linear Recurrences
- Assessing the Statistical Significance of Overrepresented Oligonucleotides
- Combinatorial Pattern Matching
- Compound Poisson approximation for counts of rare patterns in Markov chains and extreme sojourns in birth-death chains.
- Compound Poisson approximations for word patterns under Markovian hypotheses
- Distribution of waiting time until the \(r\)th occurrence of a compound pattern
- Efficient string matching
- Expected frequencies of DNA patterns using whittle's formula
- Explicit distributional results in pattern formation
- High-order lifting and integrality certification
- Motif statistics.
- Numerical Solutions for Patterns Statistics on Markov Chains
- On exact and approximate interpolation of sparse rational functions
- Pattern Markov Chains: Optimal Markov Chain Embedding Through Deterministic Finite Automata
- Poisson approximations for runs and patterns of rare events
- Regular expressions at their best: a case for rational design
- Waiting time and complexity for matching patterns with automata
- Waiting time distributions for pattern occurrence in a constrained sequence
- Waiting times for patterns in a sequence of multistate trials
Cited in
(6)- Moments of the count of a regular expression in a heterogeneous random sequence
- Sparse Markov chains for sequence data
- Transducing Markov sequences
- Faster exact distributions of pattern statistics through sequential elimination of states
- Distributions of pattern statistics in sparse Markov models
- Distribution of statistics of hidden state sequences through the sum-product algorithm
This page was built for publication: Sparse approaches for the exact distribution of patterns in long state sequences generated by a Markov source
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q384993)