We present the first publicly available high-performance implementation of state-of-the-art MPC algorithms, including the parameterized approaches. Our experiments on random DAGs show that parameterized algorithms are orders-of-magnitude faster on dense graphs. Additionally, we present new pre-processing heuristics based on transitive edge sparsification. We show that our heuristics improve MPC-solvers by orders-of-magnitude.
We provide a $O(|\delta| \log |Q|)$-time algorithm computing a co-lex total preorder when the input is a Wheeler NFA, and an algorithm with the same time complexity computing the smallest-width co-lex partial order of any DFA.
We show how to chain *maximal exact matches* (MEMs) between a query string $Q$ and a labeled directed acyclic graph (DAG) $G=(V,E)$ to solve the *longest common subsequence* (LCS) problem between $Q$ and $G$.
We show that, for acyclic graphs, considering the width of the graph yields advances in our understanding of its approximability. For the non-negative version, we show that a popular heuristic is a $O( \log |X|)$-approximation on graphs satisfying two properties related to the width (satisfied by e.g., series-parallel graphs), and strengthen its worst-case approximation ratio for sparse graphs. For the negative version, we give a $(\lceil \log \Vert X \Vert \rceil +1)$-approximation using a power-of-two approach, combined with parity fixing arguments and a decomposition of unitary flows ($\Vert X \Vert \leq 1$) into at most width paths.
We give the first local characterization of safe paths for flow decompositions in directed acyclic graphs (DAGs), leading to a practical algorithm for finding the complete set of safe paths. We additionally evaluated our algorithms against the trivial safe algorithms (unitigs, extended unitigs) and the popularly used heuristic (greedy-width) for flow decomposition on RNA transcripts datasets. We find that despite maintaining perfect precision the safe and complete algorithm reports significantly higher coverage as compared to trivial safe algorithms.
We obtain two new MPC parameterized algorithms for DAGs running in time $O(k^2|V|\log(|V|) + |E|)$ and $O(k^3|V| + |E|)$. We also obtain a parallel algorithm running in $O(k^2|V| + |E|)$ parallel steps and using $O(\log(|V|))$ processors (in the PRAM model). We also obtain edge sparsification algorithms preserving the width of the DAG with the same running time as our MPC algorithms.
We describe an parameterized algorithm to compute the width $k$ of a DAG in time $O(k2^k|E| + k^24^k|V|)$.
We describe indexes for searching large data sets for variable-length-gapped (VLG) patterns. Our best approach provides search speeds several times faster than prior art across a broad range of patterns and texts.
New Repetition-Aware Compressed Suffix Tree. Slightly larger than state-of-the-art, but outperforms them in time, often by orders of magnitude.