A compressed dynamic self-index for highly repetitive text collections

From MaRDI portal
Publication:776840

DOI10.1016/J.IC.2020.104518zbMATH Open1446.68045arXiv1711.02855OpenAlexW3000242028WikidataQ126344789 ScholiaQ126344789MaRDI QIDQ776840FDOQ776840


Authors: Takaaki Nishimoto, Yoshimasa Takabatake, Yasuo Tabei Edit this on Wikidata


Publication date: 13 July 2020

Published in: Information and Computation (Search for Journal in Brave)

Abstract: We present a novel compressed dynamic self-index for highly repetitive text collections. Signature encoding is a compressed dynamic self-index for highly repetitive texts and has a large disadvantage that the pattern search for short patterns is slow. We improve this disadvantage for faster pattern search by leveraging an idea behind truncated suffix tree and present the first compressed dynamic self-index named TST-index that supports not only fast pattern search but also dynamic update operation of index for highly repetitive texts. Experiments using a benchmark dataset of highly repetitive texts show that the pattern search of TST-index is significantly improved.


Full work available at URL: https://arxiv.org/abs/1711.02855




Recommendations



Cites Work


Cited In (3)

Uses Software





This page was built for publication: A compressed dynamic self-index for highly repetitive text collections

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q776840)