Hidden words statistics for large patterns (Q2030762): Difference between revisions

From MaRDI portal
Set OpenAlex properties.
ReferenceBot (talk | contribs)
Changed an Item
 
(One intermediate revision by one other user not shown)
Property / arXiv ID
 
Property / arXiv ID: 2003.09584 / rank
 
Normal rank
Property / cites work
 
Property / cites work: The distribution of subword counts is usually normal / rank
 
Normal rank
Property / cites work
 
Property / cites work: Coded Trace Reconstruction / rank
 
Normal rank
Property / cites work
 
Property / cites work: On information transmission over a finite buffer channel / rank
 
Normal rank
Property / cites work
 
Property / cites work: Probability: A Graduate Course / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Class of Statistics with Asymptotically Normal Distribution / rank
 
Normal rank
Property / cites work
 
Property / cites work: Lower bounds for trace reconstruction / rank
 
Normal rank
Property / cites work
 
Property / cites work: On the asymptotic statistics of the number of occurrences of multiple permutation patterns / rank
 
Normal rank
Property / cites work
 
Property / cites work: Trace Reconstruction Revisited / rank
 
Normal rank
Property / cites work
 
Property / cites work: A survey of results for deletion channels and related synchronization channels / rank
 
Normal rank
Property / cites work
 
Property / cites work: Limit Theorems for the Number of Empty Cells in an Equiprobable Scheme for Group Allocation of Particles / rank
 
Normal rank
Property / cites work
 
Property / cites work: Achievable Rates for Channels With Deletions and Insertions / rank
 
Normal rank

Latest revision as of 21:53, 25 July 2024

scientific article
Language Label Description Also known as
English
Hidden words statistics for large patterns
scientific article

    Statements

    Hidden words statistics for large patterns (English)
    0 references
    0 references
    0 references
    7 June 2021
    0 references
    Summary: We study here the so called \textit{subsequence pattern matching} also known as \textit{hidden pattern matching} in which one searches for a given pattern \(w\) of length \(m\) as a subsequence in a random text of length \(n\). The quantity of interest is the number of occurrences of \(w\) as a subsequence (i.e., occurring in \textit{not} necessarily consecutive text locations). This problem finds many applications from intrusion detection, to trace reconstruction, to deletion channel, and to DNA-based storage systems. In all of these applications, the pattern \(w\) is of variable length. To the best of our knowledge this problem was only tackled for a fixed length \(m=O(1)\). In our main result we prove that for \(m=o(n^{1/3})\) the number of subsequence occurrences is normally distributed. In addition, we show that under some constraints on the structure of \(w\) the asymptotic normality can be extended to \(m=o(\sqrt{n})\). For a special pattern \(w\) consisting of the same symbol, we indicate that for \(m=o(n)\) the distribution of number of subsequences is either asymptotically normal or asymptotically log normal. After studying some special patterns (e.g., alternating) we conjecture that this dichotomy is true for all patterns. We use Hoeffding's projection method for \(U\)-statistics to prove our findings.
    0 references
    subsequence pattern matching
    0 references

    Identifiers