Efficient DNA sequence compression with neural networks

M Silva, D Pratas, AJ Pinho - GigaScience, 2020 - academic.oup.com
Background The increasing production of genomic data has led to an intensified need for
models that can cope efficiently with the lossless compression of DNA sequences. Important …

A hybrid pipeline for reconstruction and analysis of viral genomes at multi-organ level

D Pratas, M Toppinen, L Pyöriä, K Hedman… - …, 2020 - academic.oup.com
Background Advances in sequencing technologies have enabled the characterization of
multiple microbial and host genomes, opening new frontiers of knowledge while kindling …

Persistent minimal sequences of SARS-CoV-2

D Pratas, JM Silva - Bioinformatics, 2020 - academic.oup.com
Motivation Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused
more than 14 million cases and more than half a million deaths. Given the absence of …

Automatic analysis of artistic paintings using information-based measures

JM Silva, D Pratas, R Antunes, S Matos, AJ Pinho - Pattern Recognition, 2021 - Elsevier
The artistic community is increasingly relying on automatic computational analysis for
authentication and classification of artistic paintings. In this paper, we identify hidden …

GeCo2: An optimized tool for lossless compression and analysis of DNA sequences

D Pratas, M Hosseini, AJ Pinho - Practical Applications of Computational …, 2020 - Springer
The development of efficient DNA data compression tools is fundamental for reducing the
storage, given the increasing availability of DNA sequences. The importance is also …

A generative nonparametric Bayesian model for whole genomes

A Amin, EN Weinstein, D Marks - Advances in Neural …, 2021 - proceedings.neurips.cc
Generative probabilistic modeling of biological sequences has widespread existing and
potential use across biology and biomedicine, particularly given advances in high …

Smash++: an alignment-free and memory-efficient tool to find genomic rearrangements

M Hosseini, D Pratas, B Morgenstern, AJ Pinho - GigaScience, 2020 - academic.oup.com
Background The development of high-throughput sequencing technologies and, as its
result, the production of huge volumes of genomic data, has accelerated biological and …

Allowing mutations in maximal matches boosts genome compression performance

Y Liu, L Wong, J Li - Bioinformatics, 2020 - academic.oup.com
Motivation A maximal match between two genomes is a contiguous non-extendable
sub-sequence common in the two genomes. DNA bases mutate very often from the genome of …

Metagenomic composition analysis of an ancient sequenced polar bear jawbone from Svalbard

D Pratas, M Hosseini, G Grilo, AJ Pinho, RM Silva… - Genes, 2018 - mdpi.com
The sequencing of ancient DNA samples provides a novel way to find, characterize, and
distinguish exogenous genomes of endogenous targets. After sequencing, computational …

JARVIS3: an efficient encoder for genomic data

MJP Sousa, AJ Pinho, D Pratas - Bioinformatics, 2024 - academic.oup.com
Motivation Large-scale genomic projects grapple with the complex challenge of reducing
medium- and long-term storage space and its associated energy consumption, monetary …