A hybrid MPI-OpenMP strategy to speedup the compression of big next-generation sequencing datasets

S Vargas-Pérez, F Saeed - IEEE transactions on parallel and …, 2017 - ieeexplore.ieee.org
DNA sequencing has moved into the realm of Big Data due to the rapid development of
high-throughput, low cost Next-Generation Sequencing (NGS) technologies. Sequential data …

Handling the data management needs of high-throughput sequencing data: SpeedGene, a compression algorithm for the efficient storage of genetic data

D Qiao, WK Yip, C Lange - BMC bioinformatics, 2012 - Springer
Background: As Next-Generation Sequencing data becomes available, existing
hardware environments do not provide sufficient storage space and computational power to …

Efficient sequencing data compression and FPGA acceleration based on a two-step framework

S Chen, Y Chen, Z Wang, W Qin, J Zhang… - Frontiers in …, 2023 - frontiersin.org
With the increasing throughput of modern sequencing instruments, the cost of storing and
transmitting sequencing data has also increased dramatically. Although many tools have …

CURC: a CUDA-based reference-free read compressor

S Xie, X He, S He, Z Zhu - Bioinformatics, 2022 - academic.oup.com
Motivation: The data deluge of high-throughput sequencing (HTS) has posed great
challenges to data storage and transfer. Many specific compression tools have been …

PMFFRC: a large-scale genomic short reads compression optimizer via memory modeling and redundant clustering

H Sun, Y Zheng, H Xie, H Ma, X Liu, G Wang - BMC bioinformatics, 2023 - Springer
Background: Genomic sequencing reads compressors are essential for balancing
high-throughput sequencing short reads generation speed, large-scale genomic data sharing …

Fastdrc: Fast and scalable genome compression based on distributed and parallel processing

Y Ji, H Fang, H Yao, J He, S Chen, K Li… - … and Architectures for …, 2020 - Springer
With the advent of next-generation sequencing technology, sequencing costs have fallen
sharply compared to the previous sequencing technologies. Genomic big data has become …

FastqCLS: a FASTQ compressor for long-read sequencing via read reordering using a novel scoring model

D Lee, G Song - Bioinformatics, 2022 - academic.oup.com
Motivation: Over the past decades, vast amounts of genome sequencing data have been
produced, requiring an enormous level of storage capacity. The time and resources needed …

Data-dependent bucketing improves reference-free compression of sequencing reads

R Patro, C Kingsford - Bioinformatics, 2015 - academic.oup.com
Motivation: The storage and transmission of high-throughput sequencing data consumes
significant resources. As our capacity to produce such data continues to increase, this …

NGC: lossless and lossy compression of aligned high-throughput sequencing data

N Popitsch, A von Haeseler - Nucleic acids research, 2013 - academic.oup.com
A major challenge of current high-throughput sequencing experiments is not only the
generation of the sequencing data itself but also their processing, storage and transmission …

High-throughput compression of FASTQ data with SeqDB

M Howison - IEEE/ACM Transactions on Computational Biology …, 2012 - ieeexplore.ieee.org
Compression has become a critical step in storing next-generation sequencing (NGS) data
sets because of both the increasing size and decreasing costs of such data. Recent …