Data compression for sequencing data

S Deorowicz, S Grabowski - Algorithms for Molecular Biology, 2013 - Springer
Post-Sanger sequencing methods produce tons of data, and there is a general agreement
that the challenge to store and process them must be addressed with data compression. In …

BWA-MEME: BWA-MEM emulated with a machine learning approach

Y Jung, D Han - Bioinformatics, 2022 - academic.oup.com
Motivation: The growing use of next-generation sequencing and enlarged sequencing
throughput require efficient short-read alignment, where seeding is one of the major …

Compressive biological sequence analysis and archival in the era of high-throughput sequencing technologies

R Giancarlo, SE Rombo, F Utro - Briefings in bioinformatics, 2014 - academic.oup.com
High-throughput sequencing technologies produce large collections of data, mainly DNA
sequences with additional information, requiring the design of efficient and effective …

Dolos: Language‐agnostic plagiarism detection in source code

R Maertens, C Van Petegem, N Strijbol… - Journal of Computer …, 2022 - Wiley Online Library
Background: Learning to code is increasingly embedded in secondary and higher education
curricula, where solving programming exercises plays an important role in the learning …

Big data integration in remote sensing across a distributed metadata-based spatial infrastructure

J Fan, J Yan, Y Ma, L Wang - Remote Sensing, 2017 - mdpi.com
Since Landsat-1 first started to deliver volumes of pixels in 1972, the volumes of archived
data in remote sensing data centers have increased continuously. Due to various satellite …

Compareads: comparing huge metagenomic experiments

N Maillet, C Lemaitre, R Chikhi, D Lavenier… - BMC …, 2012 - Springer
Background: Nowadays, metagenomic sample analyses are mainly achieved by comparing
them with a priori knowledge stored in data banks. While powerful, such approaches do not …

[BOOK][B] Cloud computing in remote sensing

L Wang, J Yan, Y Ma - 2019 - taylorfrancis.com
This book provides the users with quick and easy data acquisition, processing, storage and
product generation services. It describes the entire life cycle of remote sensing data and …

essaMEM: finding maximal exact matches using enhanced sparse suffix arrays

M Vyverman, B De Baets, V Fack, P Dawyndt - Bioinformatics, 2013 - academic.oup.com
We have developed essaMEM, a tool for finding maximal exact matches that can be used in
genome comparison and read mapping. essaMEM enhances an existing sparse suffix array …

A bioinformatician's guide to the forefront of suffix array construction algorithms

AMS Shrestha, MC Frith, P Horton - Briefings in bioinformatics, 2014 - academic.oup.com
The suffix array and its variants are text-indexing data structures that have become
indispensable in the field of bioinformatics. With the uninitiated in mind, we provide an …

Indexes of large genome collections on a PC

A Danek, S Deorowicz, S Grabowski - PloS one, 2014 - journals.plos.org
The availability of thousands of individual genomes of one species should boost rapid
progress in personalized medicine or understanding of the interaction between genotype …