An encoding for order-preserving matching
From MaRDI portal
Publication:5111725
DOI10.4230/LIPICS.ESA.2017.38zbMATH Open1442.68038arXiv1610.02865OpenAlexW2595089934MaRDI QIDQ5111725FDOQ5111725
Rossano Venturini, Travis Gagie, Giovanni Manzini
Publication date: 27 May 2020
Abstract: Encoding data structures store enough information to answer the queries they are meant to support but not enough to recover their underlying datasets. In this paper we give the first encoding data structure for the challenging problem of order-preserving pattern matching. This problem was introduced only a few years ago but has already attracted significant attention because of its applications in data analysis. Two strings are said to be an order-preserving match if the {em relative order} of their characters is the same: e.g., and are an order-preserving match. We show how, given a string over an arbitrary alphabet and a constant , we can build an -bit encoding such that later, given a pattern with , we can return the number of order-preserving occurrences of in in time. Within the same time bound we can also return the starting position of some order-preserving match for in (if such a match exists). We prove that our space bound is within a constant factor of optimal; our query time is optimal if . Our space bound contrasts with the bits needed in the worst case to store itself, an index for order-preserving pattern matching with no restrictions on the pattern length, or an index for standard pattern matching even with restrictions on the pattern length. Moreover, we can build our encoding knowing only how each character compares to neighbouring characters.
Full work available at URL: https://arxiv.org/abs/1610.02865
Recommendations
Cites Work
- Space-efficient substring occurrence estimation
- Fast Prefix Search in Little Space, with Applications
- Efficient Storage and Retrieval by Content and Address of Static Files
- Encodings for Range Selection and Top-k Queries
- Asymptotically Optimal Encodings for Range Selection
- Encodings for Range Majority Queries
- Parameterized pattern matching: Algorithms and applications
- Order-preserving matching
- Efficient Algorithms for the Order Preserving Pattern Matching Problem
- Single and Multiple Consecutive Permutation Motif Search
- Order-preserving indexing
- Order-preserving pattern matching with \(k\) mismatches
- A linear time algorithm for consecutive permutation pattern matching
- A fast algorithm for order-preserving pattern matching
- Theory and practice of monotone minimal perfect hashing
- Encoding 2D range maximum queries
- Optimal Encodings for Range Top-$$k$$, Selection, and Min-Max
- A New Succinct Representation of RMQ-Information and Improvements in the Enhanced Suffix Array
- Combined data structure for previous- and next-smaller-values
- An experimental study of a compressed index
- A filtration method for order-preserving matching
- Space efficient data structures for nearest larger neighbor
- pBWT: Achieving Succinct Data Structures for Parameterized Pattern Matching and Related Problems
- Encoding range minima and range top-2 queries
- Bit-Parallel Approximate Matching of Circular Strings with k Mismatches
- Compact Encodings and Indexes for the Nearest Larger Neighbor Problem
- Encodings of Range Maximum-Sum Segment Queries and Applications
- Encoding Data Structures
Cited In (6)
This page was built for publication: An encoding for order-preserving matching
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q5111725)