Design tradeoffs for data deduplication performance in backup workloads

M Fu, D Feng, Y Hua, X He, Z Chen, W Xia… - … USENIX Conference on …, 2015 - usenix.org
Data deduplication has become a standard component in modern backup systems. In order
to understand the fundamental tradeoffs in each of its design choices (such as prefetching …

Reducing fragmentation for in-line deduplication backup storage via exploiting backup history and cache knowledge

M Fu, D Feng, Y Hua, X He, Z Chen… - … on Parallel and …, 2015 - ieeexplore.ieee.org
In backup systems, the chunks of each backup are physically scattered after deduplication,
which causes a challenging fragmentation problem. We observe that the fragmentation …

Improving restore speed for backup systems that use inline chunk-based deduplication

M Lillibridge, K Eshghi, D Bhagwat - 11th USENIX Conference on File …, 2013 - usenix.org
Slow restoration due to chunk fragmentation is a serious problem facing inline chunk-based
data deduplication systems: restore speeds for the most recent backup can drop orders of …

Efficient hybrid inline and out-of-line deduplication for backup storage

YK Li, M Xu, CH Ng, PPC Lee - ACM Transactions on Storage (TOS), 2014 - dl.acm.org
Backup storage systems often remove redundancy across backups via inline deduplication,
which works by referring duplicate chunks of the latest backup to those of existing backups …

Demystifying data deduplication

N Mandagere, P Zhou, MA Smith… - Proceedings of the ACM …, 2008 - dl.acm.org
Effectiveness and tradeoffs of deduplication technologies are not well understood--vendors
tout Deduplication as a "silver bullet" that can help any enterprise optimize its deployed …

iDedup: latency-aware, inline data deduplication for primary storage

K Srinivasan, T Bisson, GR Goodson, K Voruganti - FAST, 2012 - usenix.org
Deduplication technologies are increasingly being deployed to reduce cost and increase
space-efficiency in corporate data centers. However, prior research has not applied …

A long-term user-centric analysis of deduplication patterns

Z Sun, G Kuenning, S Mandal, P Shilane… - … 32nd Symposium on …, 2016 - ieeexplore.ieee.org
Deduplication has become essential in disk-based backup systems, but there have been
few long-term studies of backup workloads. Most past studies either were of a small static …

Characterizing datasets for data deduplication in backup applications

N Park, DJ Lilja - IEEE International Symposium on Workload …, 2010 - ieeexplore.ieee.org
The compression and throughput performance of a data deduplication system is directly
affected by the input dataset. We propose two sets of evaluation metrics, and the means to …

Building a high-performance deduplication system

F Guo, P Efstathopoulos - 2011 USENIX Annual Technical Conference …, 2011 - usenix.org
Modern deduplication has become quite effective at eliminating duplicates in data, thus
multiplying the effective capacity of disk-based backup systems, and enabling them as …

RevDedup: A reverse deduplication storage system optimized for reads to latest backups

CH Ng, PPC Lee - Proceedings of the 4th Asia-Pacific workshop on …, 2013 - dl.acm.org
Deduplication is known to effectively eliminate duplicates, yet it introduces fragmentation
that degrades read performance. We propose RevDedup, a deduplication system that …