Data compression for sequencing data

S Deorowicz, S Grabowski - Algorithms for Molecular Biology, 2013 - Springer
Post-Sanger sequencing methods produce tons of data, and there is a general agreement
that the challenge to store and process them must be addressed with data compression. In …

Compressive biological sequence analysis and archival in the era of high-throughput sequencing technologies

R Giancarlo, SE Rombo, F Utro - Briefings in bioinformatics, 2014 - academic.oup.com
High-throughput sequencing technologies produce large collections of data, mainly DNA
sequences with additional information, requiring the design of efficient and effective …

A survey on data compression methods for biological sequences

M Hosseini, D Pratas, AJ Pinho - Information, 2016 - mdpi.com
The ever increasing growth of the production of high-throughput sequencing data poses a
serious challenge to the storage, processing and transmission of these data. As frequently …

Sequence statistical code based data compression algorithm for wireless sensor network

S Jancy, C Jayakumar - Wireless Personal Communications, 2019 - Springer
Sensors play an integral part in the technologically advanced real world. Wireless sensors
are which have powered by batteries with limited capacity. Hence energy efficiency is one of …

A new algorithm for “the LCS problem” with application in compressing genome resequencing data

R Beal, T Afrin, A Farheen, D Adjeroh - BMC genomics, 2016 - Springer
Background: The longest common subsequence (LCS) problem is a classical problem in
computer science, and forms the basis of the current best-performing reference-based …

Epigenomic k-mer dictionaries: shedding light on how sequence composition influences in vivo nucleosome positioning

R Giancarlo, SE Rombo, F Utro - Bioinformatics, 2015 - academic.oup.com
Motivation: Information-theoretic and compositional analysis of biological sequences, in
terms of k-mer dictionaries, has a well established role in genomic and proteomic studies …

Adaptive efficient compression of genomes

S Wandelt, U Leser - Algorithms for Molecular Biology, 2012 - Springer
Modern high-throughput sequencing technologies are able to generate DNA
sequences at an ever increasing rate. In parallel to the decreasing experimental time and …

Nongreedy unbalanced Huffman tree compressor for single and multifasta files

S Alyami, CH Huang - Journal of Computational Biology, 2020 - liebertpub.com
Next-generation sequencing technologies are producing genomic data at ever-increasing
rates. It has become a challenge to store, transmit, and process the massive quantity of data …

K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics

J Lin, DA Adjeroh, BH Jiang, Y Jiang - Bioinformatics, 2018 - academic.oup.com
Motivation: Alignment-free sequence comparison methods can compute the pairwise
similarity between a huge number of sequences much faster than sequence-alignment …

The intrinsic combinatorial organization and information theoretic content of a sequence are correlated to the DNA encoded nucleosome organization of eukaryotic …

F Utro, V Di Benedetto, DFV Corona… - Bioinformatics, 2016 - academic.oup.com
Motivation: Thanks to research spanning nearly 30 years, two major models have emerged
that account for nucleosome organization in chromatin: statistical and sequence specific …