Linear-time computation of DAWGs, symmetric indexing structures, and MAWs for integer alphabets

Y Fujishige, Y Tsujimaru, S Inenaga, H Bannai… - Theoretical Computer …, 2023 - Elsevier
The directed acyclic word graph (DAWG) of a string y of length n is the smallest (partial) DFA
which recognizes all suffixes of y with only O(n) nodes and edges. In this paper, we show …

[HTML] Alignment-free sequence comparison using absent words

P Charalampopoulos, M Crochemore, G Fici… - Information and …, 2018 - Elsevier
Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is
often realised by sequence alignment techniques, which are computationally expensive …

[HTML] Absent words in a sliding window with applications

M Crochemore, A Héliou, G Kucherov… - Information and …, 2020 - Elsevier
An absent word of a word y is a word that does not occur in y. It is then called minimal if all its
proper factors occur in y. In fact, minimal absent words (MAWs) provide useful information …

[BOOK] 125 Problems in Text Algorithms: With Solutions

M Crochemore, T Lecroq, W Rytter - 2021 - books.google.com
String matching is one of the oldest algorithmic techniques, yet still one of the most
pervasive in computer science. The past 20 years have seen technological leaps in …

Linear-time sequence comparison using minimal absent words & applications

M Crochemore, G Fici, R Mercaş, SP Pissis - LATIN 2016: Theoretical …, 2016 - Springer
Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is
often realized by sequence alignment techniques, which are computationally expensive …

Minimal absent words in a sliding window and applications to on-line pattern matching

M Crochemore, A Héliou, G Kucherov… - … on fundamentals of …, 2017 - Springer
An absent (or forbidden) word of a word y is a word that does not occur in y. It is then called
minimal if all its proper factors occur in y. There exist linear-time and linear-space algorithms …

[HTML] Minimal forbidden factors of circular words

G Fici, A Restivo, L Rizzo - Theoretical Computer Science, 2019 - Elsevier
Minimal forbidden factors are a useful tool for investigating properties of words and
languages. Two factorial languages are distinct if and only if they have different …

Palindromic trees for a sliding window and its applications

T Mieno, K Watanabe, Y Nakashima, S Inenaga… - Information Processing …, 2022 - Elsevier
The palindromic tree (aka eertree) for a string S of length n is a tree-like data structure that
represents the set of all distinct palindromic substrings of S, using O(n) space [Rubinchik …

Efficient implementation and empirical evaluation of compression by substring enumeration

S Kanai, H Yokoo, K Yamazaki… - IEICE Transactions on …, 2016 - search.ieice.org
This paper gives an array-based practical encoder for the lossless data compression
algorithm known as Compression by Substring Enumeration (CSE). The encoder makes use …

Lossless Data Compression via Substring Enumeration for k-th Order Markov Sources with a Finite Alphabet

KI Iwata, M Arimura - IEICE Transactions on Fundamentals of …, 2016 - search.ieice.org
A generalization of compression via substring enumeration (CSE) for k-th order Markov
sources with a finite alphabet is proposed, and an upper bound of the codeword length of …