A survey and classification of storage deduplication systems

J Paulo, J Pereira - ACM Computing Surveys (CSUR), 2014 - dl.acm.org
The automatic elimination of duplicate data in a storage system, commonly known as
deduplication, is increasingly accepted as an effective technique to reduce storage costs …

A comprehensive study of the past, present, and future of data deduplication

W Xia, H Jiang, D Feng, F Douglis… - Proceedings of the …, 2016 - ieeexplore.ieee.org
Data deduplication, an efficient approach to data reduction, has gained increasing attention
and popularity in large-scale storage systems due to the explosive growth of digital data. It …

FastCDC: A fast and efficient Content-Defined chunking approach for data deduplication

W Xia, Y Zhou, H Jiang, D Feng, Y Hua, Y Hu… - 2016 USENIX Annual …, 2016 - usenix.org
Content-Defined Chunking (CDC) has been playing a key role in data deduplication
systems in the past 15 years or so due to its high redundancy detection ability. However …

On wide area network optimization

Y Zhang, N Ansari, M Wu, H Yu - … Communications surveys & …, 2011 - ieeexplore.ieee.org
Applications, deployed over a wide area network (WAN) which may connect across
metropolitan, regional or national boundaries, suffer performance degradation owing to …

The design of fast content-defined chunking for data deduplication based storage systems

W Xia, X Zou, H Jiang, Y Zhou, C Liu… - … on Parallel and …, 2020 - ieeexplore.ieee.org
Content-Defined Chunking (CDC) has been playing a key role in data deduplication
systems recently due to its high redundancy detection ability. However, existing CDC-based …

A fast asymmetric extremum content defined chunking algorithm for data deduplication in backup storage systems

Y Zhang, D Feng, H Jiang, W Xia, M Fu… - IEEE Transactions …, 2016 - ieeexplore.ieee.org
Chunk-level deduplication plays an important role in backup storage systems. Existing
Content-Defined Chunking (CDC) algorithms, while robust in finding suitable chunk …

A forest-structured bloom filter with flash memory

G Lu, B Debnath, DHC Du - 2011 IEEE 27th Symposium on …, 2011 - ieeexplore.ieee.org
A Bloom Filter (BF) is a data structure based on probability to compactly represent/record a
set of elements (keys). It has wide applications on efficiently identifying a key that has been …

Assuring demanded read performance of data deduplication storage with backup datasets

YJ Nam, D Park, DHC Du - 2012 IEEE 20th International …, 2012 - ieeexplore.ieee.org
Data deduplication has been widely adopted in contemporary backup storage systems. It not
only saves storage space considerably, but also shortens the data backup time significantly …

DARE: A deduplication-aware resemblance detection and elimination scheme for data reduction with low overheads

W Xia, H Jiang, D Feng, L Tian - IEEE Transactions on …, 2015 - ieeexplore.ieee.org
Data reduction has become increasingly important in storage systems due to the explosive
growth of digital data in the world that has ushered in the big data era. One of the main …

Fingerprint-based data deduplication using a mathematical bounded linear hash function

ASM Saeed, LE George - Symmetry, 2021 - mdpi.com
Due to the quick increase in digital data, especially in mobile usage and social media, data
deduplication has become a vital and cost-effective approach for removing redundant data …