Direct superbubble detection (Q2003335)

From MaRDI portal
scientific article
Language Label Description Also known as
English
Direct superbubble detection
scientific article

    Statements

    Direct superbubble detection (English)
    0 references
    0 references
    0 references
    0 references
    0 references
    8 July 2019
    0 references
    Summary: Superbubbles are a class of induced subgraphs in digraphs that play an essential role in assembly algorithms for high-throughput sequencing data. They are connected with the remainder of the host digraph by a single entrance and a single exit vertex. Linear-time algorithms for the enumeration superbubbles recently have become available. Current approaches require the decomposition of the input digraph into strongly-connected components, which are then analyzed separately. In principle, a single depth-first search could be used, provided one can guarantee that the root of the depth-first search (DFS)-tree is not itself located in the interior or the exit point of a superbubble. Here, we describe a linear-time algorithm to determine suitable roots for a DFS-forest that is guaranteed to identify the superbubbles in a digraph correctly. In addition to the advantages of a more straightforward implementation, we observe a nearly three-fold gain in performance on real-world datasets. We present a reference implementation of the new algorithm that accepts many commonly-used input formats for digraphs. It is available as open source from github.
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    superbubble
    0 references
    depth-first search
    0 references
    cycles
    0 references
    linear-time algorithm
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references