Internal pattern matching queries in a text and applications
From MaRDI portal
Publication:5362985
DOI10.1137/1.9781611973730.36zbMATH Open1371.68340arXiv1311.6235OpenAlexW2949583843MaRDI QIDQ5362985FDOQ5362985
Authors: Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń
Publication date: 5 October 2017
Published in: Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms (Search for Journal in Brave)
Abstract: We consider several types of internal queries, that is, questions about fragments of a given text specified in constant space by their locations in . Our main result is an optimal data structure for Internal Pattern Matching (IPM) queries which, given two fragments and , ask for a representation of all fragments contained in and matching exactly; this problem can be viewed as an internal version of the Exact Pattern Matching problem. Our data structure answers IPM queries in time proportional to the quotient of fragments' lengths, which is required due to the information content of the output. If is a text of length over an integer alphabet of size , then our data structure occupies machine words (that is, bits) and admits an -time construction algorithm. We show the applicability of IPM queries for answering internal queries corresponding to other classic string processing problems. Among others, we derive optimal data structures reporting the periods of a fragment and testing the cyclic equivalence of two fragments. IPM queries have already found numerous further applications, following the path paved by the classic Longest Common Extension (LCE) queries of Landau and Vishkin (JCSS, 1988). In particular, IPM queries have been implemented in grammar-compressed and dynamic settings and, along with LCE queries, constitute elementary operations of the PILLAR model, developed by Charalampopoulos, Kociumaka, and Wellnitz (FOCS 2020). On the way to our main result, we provide a novel construction of string synchronizing sets of Kempa and Kociumaka (STOC 2019). Our method, based on a new restricted version of the recompression technique of Je.z (J. ACM, 2016), yields a hierarchy of string synchronizing sets covering the whole spectrum of fragments' lengths.
Full work available at URL: https://arxiv.org/abs/1311.6235
Recommendations
Cited In (32)
- Quasi-Linear-Time Algorithm for Longest Common Circular Factor
- Title not available (Why is that?)
- Universal reconstruction of a string
- Efficient Computation of 2-Covers of a String.
- On maximal unbordered factors
- Practical Performance of Space Efficient Data Structures for Longest Common Extensions.
- Finding the cyclic covers of a string
- Even faster elastic-degenerate string matching via fast matrix multiplication
- Computing minimal and maximal suffixes of a substring
- Multidimensional period recovery
- Two-dimensional maximal repetitions
- Elastic-Degenerate String Matching via Fast Matrix Multiplication
- Internal dictionary matching
- Internal dictionary matching
- Dynamic and internal longest common substring
- Two-dimensional maximal repetitions
- Internal pattern matching queries in a text and applications
- Near-optimal quantum algorithms for string problems
- Longest unbordered factor in quasilinear time
- Internal shortest absent word queries in constant time and linear space
- Shortest covers of all cyclic shifts of a string
- Internal Quasiperiod Queries
- Circular pattern matching with \(k\) mismatches
- Weighted edit distance computation: strings, trees, and Dyck
- The ``runs theorem
- Repetition Detection in a Dynamic String
- Period recovery of strings over the Hamming and edit distances
- Internal masked prefix sums and its connection to fully internal measurement queries
- Title not available (Why is that?)
- Detecting one-variable patterns
- Longest common substring made fully dynamic
- String Powers in Trees
This page was built for publication: Internal pattern matching queries in a text and applications
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q5362985)