Online algorithms for finding distinct substrings with length and multiple prefix and suffix conditions

From MaRDI portal
Publication:6166969

DOI10.1007/978-3-031-20643-6_3zbMATH Open1525.68207arXiv2207.04194OpenAlexW4312918225MaRDI QIDQ6166969FDOQ6166969


Authors: Laurentius Leonard, Shunsuke Inenaga, Hideo Bannai, Takuya Mieno Edit this on Wikidata


Publication date: 4 August 2023

Published in: String Processing and Information Retrieval (Search for Journal in Brave)

Abstract: Let two static sequences of strings P and S, representing prefix and suffix conditions respectively, be given as input for preprocessing. For the query, let two positive integers k1 and k2 be given, as well as a string T given in an online manner, such that Ti represents the length-i prefix of T for 1leqileq|T|. In this paper we are interested in computing the set mathitansi of distinct substrings w of Ti such that k1leq|w|leqk2, and w contains some pinP as a prefix and some sinS as a suffix. More specifically, the counting problem is to output |mathitansi|, whereas the reporting problem is to output all elements of mathitansi, for each iteration i. Let sigma denote the alphabet size, and for a sequence of strings A, VertAVert=sumuinA|u|. Then, we show that after O((VertPVert+VertSVert)logsigma)-time preprocessing, the solutions for the counting and reporting problems for each iteration up to i can be output in O(|Ti|logsigma) and O(|Ti|logsigma+|mathitansi|) total time. The preprocessing time can be reduced to O(VertPVert+VertSVert) for integer alphabets of size polynomial with regard to VertPVert+VertSVert. Our algorithms have possible applications to network traffic classification.


Full work available at URL: https://arxiv.org/abs/2207.04194







Cites Work






This page was built for publication: Online algorithms for finding distinct substrings with length and multiple prefix and suffix conditions

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q6166969)