The world's technological capacity to store, communicate, and compute information

M Hilbert, P López - Science, 2011 - science.org
We estimated the world's technological capacity to store, communicate, and compute
information, tracking 60 analog and digital technologies during the period from 1986 to …

High-throughput DNA sequence data compression

Z Zhu, Y Zhang, Z Ji, S He, X Yang - Briefings in bioinformatics, 2015 - academic.oup.com
The exponential growth of high-throughput DNA sequence data has posed great challenges
to genomic data storage, retrieval and transmission. Compression is a critical tool to address …

Textual data compression in computational biology: a synopsis

R Giancarlo, D Scaturro, F Utro - Bioinformatics, 2009 - academic.oup.com
Motivation: Textual data compression, and the associated techniques coming from
information theory, are often perceived as being of interest for data communication and …

Efficient storage of high throughput DNA sequencing data using reference-based compression

MHY Fritz, R Leinonen, G Cochrane… - Genome research, 2011 - genome.cshlp.org
Data storage costs have become an appreciable proportion of total cost in the creation and
analysis of DNA sequence data. Of particular concern is that the rate of increase in DNA …

A simple statistical algorithm for biological sequence compression

MD Cao, TI Dix, L Allison… - 2007 Data Compression …, 2007 - ieeexplore.ieee.org
This paper introduces a novel algorithm for biological sequence compression that makes
use of both statistical properties and repetition within sequences. A panel of experts is …

[BOOK][B] Handbook of computational molecular biology

S Aluru - 2005 - taylorfrancis.com
The enormous complexity of biological systems at the molecular level must be answered
with powerful computational methods. Computational biology is a young field, but has seen …

Compression of DNA sequence reads in FASTQ format

S Deorowicz, S Grabowski - Bioinformatics, 2011 - academic.oup.com
Motivation: Modern sequencing instruments are able to generate at least hundreds of
millions short reads of genomic data. Those huge volumes of data require effective means to …

Large-scale compression of genomic sequence databases with the Burrows–Wheeler transform

AJ Cox, MJ Bauer, T Jakobi, G Rosone - Bioinformatics, 2012 - academic.oup.com
Motivation: The Burrows–Wheeler transform (BWT) is the foundation of many
algorithms for compression and indexing of text data, but the cost of computing the BWT of …

A genomic distance based on MUM indicates discontinuity between most bacterial species and genera

M Deloger, M El Karoui, MA Petit - Journal of bacteriology, 2009 - Am Soc Microbiol
The fundamental unit of biological diversity is the species. However, a remarkable extent of
intraspecies diversity in bacteria was discovered by genome sequencing, and it reveals the …

A survey on data compression methods for biological sequences

M Hosseini, D Pratas, AJ Pinho - Information, 2016 - mdpi.com
The ever increasing growth of the production of high-throughput sequencing data poses a
serious challenge to the storage, processing and transmission of these data. As frequently …