J Guo, H Ren, M Ijaz, X Qi, T Ahmed, Y You, G Li, Z Yu… - Genomics, 2023 - Elsevier
The pathogenic fungus Pestalotiopsis versicolor is a major etiological agent of fungal twig blight disease affecting bayberry trees. However, the lack of complete genome sequence …
Finding repetitive structures in genomes and proteins is important to understand their biological functions. Many data compressors for modern genomic sequences rely heavily on …
Motifs are frequent patterns used to identify biological functionality in genomic sequences, periodicity in time series, or user trends in web logs. In contrast to a lot of existing work that …
Modern applications, including bioinformatics, time series, and web log analysis, require the extraction of frequent patterns, called motifs, from one very long (i.e., several gigabytes) …
W Li, J Freudenberg - Computational biology and chemistry, 2014 - Elsevier
Repetitive and redundant regions of a genome are particularly problematic for mapping sequencing reads. In the present paper, we compile a list of the unmappable regions in the …
T Nishimoto, Y Tabei - arXiv preprint arXiv:2004.01493, 2020 - arxiv.org
Enumerating characteristic substrings (e.g., maximal repeats, minimal unique substrings, and minimal absent words) in a given string has been an important research topic because there …
T Beller, K Berger, E Ohlebusch - … de Indias, Colombia, October 21-25 …, 2012 - Springer
The identification of repetitive sequences (repeats) is an essential component of genome sequence analysis, and the notions of maximal and supermaximal repeats capture all exact …
Generalizations of plain strings have been proposed as a compact way to represent a collection of nearly identical sequences or to express uncertainty at specific text positions by …
Genome assemblies are typically compared with respect to their contiguity, coverage, and accuracy. We propose a genome-wide, alignment-free genomic distance based on …