Sublinear time motif discovery from multiple sequences
Abstract: A natural probabilistic model for motif discovery has been used to experimentally test the quality of motif discovery programs. The model consists of background sequences, in which each character is a random character from an alphabet. A motif is a string of characters. Each background sequence is implanted with a probabilistically generated approximate copy of the motif. In such a copy, every character is generated probabilistically so that the probability that it differs from the corresponding motif character is bounded above by a fixed mutation probability. We develop three algorithms that, under this probabilistic model, find the implanted motif with high probability via a tradeoff between computational time and the probability of mutation. The methods developed in this paper have been used in a software implementation. We observed encouraging results that show improved motif detection performance compared with other software.
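The planted-motif model described in the abstract can be illustrated with a small generator. This is a minimal sketch and not the authors' software: the function name `plant_motif_instance` and all of its parameters are hypothetical, and it assumes uniformly random background characters, a uniformly random implant position, and independent mutations that occur with probability exactly `mutation_prob` (the abstract only requires an upper bound) and replace a character by a uniformly chosen different one.

```python
import random


def plant_motif_instance(motif, num_sequences, seq_length, alphabet,
                         mutation_prob, rng):
    """Generate background sequences, each with one mutated motif copy.

    Assumptions (not from the source): uniform background, uniform
    implant position, independent per-character mutations.
    Returns (sequences, implant_positions).
    """
    sequences = []
    positions = []
    for _ in range(num_sequences):
        # Background: every character drawn at random from the alphabet.
        background = [rng.choice(alphabet) for _ in range(seq_length)]

        # Approximate copy: each motif character mutates independently.
        copy = []
        for ch in motif:
            if rng.random() < mutation_prob:
                # A mutation always yields a different character.
                copy.append(rng.choice([c for c in alphabet if c != ch]))
            else:
                copy.append(ch)

        # Implant the copy at a random position that fits in the sequence.
        start = rng.randrange(seq_length - len(motif) + 1)
        background[start:start + len(motif)] = copy

        sequences.append("".join(background))
        positions.append(start)
    return sequences, positions


if __name__ == "__main__":
    rng = random.Random(0)
    seqs, pos = plant_motif_instance("ACGTAC", 4, 30, "ACGT", 0.1, rng)
    for s, p in zip(seqs, pos):
        print(p, s)
```

Instances produced this way give a motif discovery program a known ground truth (the motif and the implant positions), which is how such a model can be used to test recovery quality at different mutation probabilities.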
Recommendations
- Discovering Almost Any Hidden Motif from Multiple Sequences in Polynomial Time with Low Sample Complexity and High Success Probability
- Efficient Algorithms for Model-Based Motif Discovery from Multiple Sequences
- Discovering almost any hidden motif from multiple sequences
- Probabilistic analysis of a motif discovery algorithm for multiple sequences
- Algorithms and Computation
Cites work
- scientific article; zbMATH DE number 3567782
- scientific article; zbMATH DE number 1305511
- scientific article; zbMATH DE number 819814
- Algorithms on Strings, Trees and Sequences
- Discovering almost any hidden motif from multiple sequences
- Distinguishing string selection problems.
- Finding similar regions in many strings
- On covering problems of codes
- On the closest string and substring problems
- Probabilistic analysis of a motif discovery algorithm for multiple sequences
Cited in (14)
- Probabilistic analysis of a motif discovery algorithm for multiple sequences
- Discovering almost any hidden motif from multiple sequences
- Analysis method and algorithm design of biological sequence problem based on generalized k-mer vector
- Efficient Algorithms for Model-Based Motif Discovery from Multiple Sequences
- Discovering Almost Any Hidden Motif from Multiple Sequences in Polynomial Time with Low Sample Complexity and High Success Probability
- Toward optimal motif enumeration.
- String Processing and Information Retrieval
- Identification of Distinguishing Motifs
- Space and Time Efficient Algorithms for Planted Motif Search
- Editorial: Special issue on algorithms for sequence analysis and storage
- New Bounds for Motif Finding in Strong Instances
- Algorithms and Computation
- Combinatorial Pattern Matching
- An upper bound on the hardness of exact matrix based motif discovery