Faster longest common extension queries in strings over general alphabets

DOI: 10.4230/LIPICS.CPM.2016.5
MaRDI QID: Q5369537

Authors Paweł Gawrychowski, Tomasz Kociumaka, Wojciech Rytter, Tomasz Waleń

Publication date 17 October 2017

Full work available at URL https://arxiv.org/abs/1602.00447

Keywords: longest common prefix; difference cover; longest common extension; maximal repetitions

Data structures (68P05); Algorithms on strings (68W32)

Abstract: Longest common extension (LCE) queries, often called longest common prefix queries, constitute a fundamental building block in multiple string algorithms, for example computing runs and approximate pattern matching. We show that a sequence of $q$ LCE queries for a string of size $n$ over a general ordered alphabet can be realized in $O(q \log\log n + n \log^{*} n)$ time making only $O(q + n)$ symbol comparisons. Consequently, all runs in a string over a general ordered alphabet can be computed in $O(n \log\log n)$ time making $O(n)$ symbol comparisons. Our results improve upon a solution by Kosolobov (Information Processing Letters, 2016), who gave an algorithm with $O(n \log^{2/3} n)$ running time and conjectured that $O(n)$ time is possible. We make significant progress towards resolving this conjecture. Our techniques extend to the case of general unordered alphabets, where the time increases to $O(q \log n + n \log^{*} n)$. The main tools are difference covers and the disjoint-sets data structure.
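For orientation, the two central notions of the abstract can be illustrated with a minimal Python sketch. It is not the algorithm of the paper: `lce` answers a single query by direct scanning (a baseline that uses only equality tests on symbols, as a general unordered alphabet requires), and `simple_difference_cover` uses a textbook-style construction of size about $2\sqrt{m}$, which is an assumption here and may differ from the covers the authors use. All function names are hypothetical.

```python
from math import isqrt


def lce(s, i, j):
    """Length of the longest common prefix of s[i:] and s[j:].

    Naive baseline: scans symbol by symbol, using only equality tests.
    """
    n = len(s)
    k = 0
    while i + k < n and j + k < n and s[i + k] == s[j + k]:
        k += 1
    return k


def simple_difference_cover(m):
    """Return a sorted difference cover of Z_m of size roughly 2*sqrt(m).

    Assumed construction (not taken from the paper):
    D = {0, ..., r-1} united with {k*r mod m : 1 <= k <= ceil(m/r)},
    where r = ceil(sqrt(m)).
    """
    r = isqrt(m)
    if r * r < m:
        r += 1  # r is now ceil(sqrt(m))
    small = {x % m for x in range(r)}
    multiples = {(k * r) % m for k in range(1, -(-m // r) + 1)}
    return sorted(small | multiples)


def is_difference_cover(cover, m):
    """Check that every residue d mod m equals (a - b) mod m for some a, b in cover."""
    diffs = {(a - b) % m for a in cover for b in cover}
    return len(diffs) == m


if __name__ == "__main__":
    text = "abracadabra"
    print(lce(text, 0, 7))  # compares the suffixes "abracadabra" and "abra"
    cover = simple_difference_cover(16)
    print(cover, is_difference_cover(cover, 16))
```

The cover property holds because any residue $d$ can be written as $kr - (kr - d)$ with $k = \lceil d/r \rceil$, where $kr \bmod m$ is one of the stored multiples and $kr - d$ lies in $\{0, \dots, r-1\}$.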

