Absent subsequences in words

M Kosche, T Koß, F Manea, S Siemer - International Conference on …, 2021 - Springer
An absent factor of a string w is a string u which does not occur as a contiguous substring
(aka factor) inside w. We extend this well-studied notion and define absent subsequences: a …

Significant non-existence of sequences in genomes and proteomes

G Koulouras, MC Frith - Nucleic acids research, 2021 - academic.oup.com
Minimal absent words (MAWs) are minimal-length oligomers absent from a genome or
proteome. Although some artificially synthesized MAWs have deleterious effects, there is still …

kmerDB: a database encompassing the set of genomic and proteomic sequence information for each species

I Mouratidis, FA Baltoumas, N Chantzi… - Computational and …, 2024 - Elsevier
The decrease in sequencing expenses has facilitated the creation of reference genomes
and proteomes for an expanding array of organisms. Nevertheless, no established …

K-mer applied in Mycobacterium tuberculosis genome cluster analysis

LM Ferreira, T Sáfadi, JL Ferreira - Brazilian Journal of Biology, 2022 - SciELO Brasil
According to studies carried out, approximately 10 million people developed tuberculosis in
2018. Of this total, 1.5 million people died from the disease. To study the behavior of the …

Linear-time computation of generalized minimal absent words for multiple strings

K Okabe, T Mieno, Y Nakashima, S Inenaga… - … Symposium on String …, 2023 - Springer
A string w is called a minimal absent word (MAW) for a string S if w does not occur as a
substring in S and all proper substrings of w occur in S. MAWs are well-studied …
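
The MAW definition quoted in this snippet can be illustrated with a brute-force sketch. It is not the linear-time method the paper's title refers to; the function name `minimal_absent_words` and the explicit `alphabet` parameter are assumptions made for illustration. Because every substring of a substring of S also occurs in S, it is enough to check that w is absent while its longest proper prefix and longest proper suffix both occur.

```python
def minimal_absent_words(s, alphabet):
    """Brute-force MAWs of s (hypothetical helper, roughly cubic in len(s))."""
    n = len(s)
    # All substrings of s, including the empty string (when i == j).
    factors = {s[i:j] for i in range(n + 1) for j in range(i, n + 1)}
    maws = set()
    for u in factors:              # u is the longest proper prefix of a candidate w
        for c in alphabet:
            w = u + c
            # w is a MAW if it is absent and its longest proper suffix occurs.
            if w not in factors and w[1:] in factors:
                maws.add(w)
    # Sort by length, then alphabetically, for a stable result.
    return sorted(maws, key=lambda w: (len(w), w))


print(minimal_absent_words("abaab", "ab"))
# → ['bb', 'aaa', 'bab', 'aaba']
```

With a larger alphabet, any letter that never appears in S is itself a MAW of length 1, since its only proper substring is the empty string.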

Internal shortest absent word queries in constant time and linear space

G Badkobeh, P Charalampopoulos… - Theoretical Computer …, 2022 - Elsevier
Given a string T of length n over an alphabet Σ ⊂ {1, 2, …, n^{O(1)}} of size σ, we are to
preprocess T so that given a range [i, j], we can return a representation of a shortest string …

A semi-automatic methodology for analysing distributed and private biobanks

JR Almeida, D Pratas, JL Oliveira - Computers in Biology and Medicine, 2021 - Elsevier
Privacy issues limit the analysis and cross-exploration of most distributed and private
biobanks, often raised by the multiple dimensionality and sensitivity of the data associated …

Reverse-safe text indexing

G Bernardini, H Chen, G Fici, G Loukides… - Journal of Experimental …, 2021 - dl.acm.org
We introduce the notion of reverse-safe data structures. These are data structures that
prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A …

Internal shortest absent word queries

G Badkobeh, P Charalampopoulos… - … Annual Symposium on …, 2021 - research.gold.ac.uk
Given a string T of length n over an alphabet Σ ⊂ {1, 2, …, n^{O(1)}} of size σ, we are to
preprocess T so that given a range [i, j], we can return a representation of a shortest string …

Combinatorics of minimal absent words for a sliding window

T Akagi, Y Kuhara, T Mieno, Y Nakashima… - Theoretical Computer …, 2022 - Elsevier
A string w is called a minimal absent word (MAW) for another string T if w does not occur in T
but the proper substrings of w occur in T. For example, let Σ={a, b, c} be the alphabet. Then …