A survey of secure data deduplication schemes for cloud storage systems

Y Shin, D Koo, J Hur - ACM computing surveys (CSUR), 2017 - dl.acm.org
Data deduplication has attracted many cloud service providers (CSPs) as a way to reduce
storage costs. Even though the general deduplication approach has been increasingly …

A comprehensive study of the past, present, and future of data deduplication

W Xia, H Jiang, D Feng, F Douglis… - Proceedings of the …, 2016 - ieeexplore.ieee.org
Data deduplication, an efficient approach to data reduction, has gained increasing attention
and popularity in large-scale storage systems due to the explosive growth of digital data. It …

FastCDC: A fast and efficient Content-Defined chunking approach for data deduplication

W Xia, Y Zhou, H Jiang, D Feng, Y Hua, Y Hu… - 2016 USENIX Annual …, 2016 - usenix.org
Content-Defined Chunking (CDC) has been playing a key role in data deduplication
systems in the past 15 years or so due to its high redundancy detection ability. However …

The design of fast content-defined chunking for data deduplication based storage systems

W Xia, X Zou, H Jiang, Y Zhou, C Liu… - … on Parallel and …, 2020 - ieeexplore.ieee.org
Content-Defined Chunking (CDC) has been playing a key role in data deduplication
systems recently due to its high redundancy detection ability. However, existing CDC-based …

DupHunter: Flexible High-Performance Deduplication for Docker Registries

N Zhao, H Albahar, S Abraham, K Chen… - 2020 USENIX Annual …, 2020 - usenix.org
Containers are increasingly used in a broad spectrum of applications from cloud services to
storage to supporting the emerging edge computing paradigm. This has led to an explosive …

The design of fast and lightweight resemblance detection for efficient post-deduplication delta compression

W Xia, L Pu, X Zou, P Shilane, S Li, H Zhang… - ACM Transactions on …, 2023 - dl.acm.org
Post-deduplication delta compression is a data reduction technique that calculates and
stores the differences of very similar but non-duplicate chunks in storage systems, which is …

A fast asymmetric extremum content defined chunking algorithm for data deduplication in backup storage systems

Y Zhang, D Feng, H Jiang, W Xia, M Fu… - IEEE Transactions …, 2016 - ieeexplore.ieee.org
Chunk-level deduplication plays an important role in backup storage systems. Existing
Content-Defined Chunking (CDC) algorithms, while robust in finding suitable chunk …

The dilemma between deduplication and locality: Can both be achieved?

X Zou, J Yuan, P Shilane, W Xia, H Zhang… - … USENIX conference on …, 2021 - usenix.org
Data deduplication is widely used to reduce the size of backup workloads, but it has the
known disadvantage of causing poor data locality, also referred to as the fragmentation …

Finesse: Fine-Grained Feature Locality based Fast Resemblance Detection for Post-Deduplication Delta Compression

Y Zhang, W Xia, D Feng, H Jiang, Y Hua… - 17th USENIX Conference …, 2019 - usenix.org
In storage systems, delta compression is often used as a complementary data reduction
technique for data deduplication because it is able to eliminate redundancy among the non …

The dynamic cuckoo filter

H Chen, L Liao, H Jin, J Wu - 2017 IEEE 25th International …, 2017 - ieeexplore.ieee.org
The emergence of large-scale dynamic sets in real applications creates stringent
requirements for approximate set representation structures: 1) the capacity of the set …