From hyper-dimensional structures to linear structures: Maintaining deduplicated data's locality

X Zou, J Yuan, P Shilane, W Xia, H Zhang… - ACM Transactions on …, 2022 - dl.acm.org
Data deduplication is widely used to reduce the size of backup workloads, but it has the
known disadvantage of causing poor data locality, also referred to as the fragmentation …

Sliding Look-Back Window Assisted Data Chunk Rewriting for Improving Deduplication Restore Performance

Z Cao, S Liu, F Wu, G Wang, B Li, DHC Du - 17th USENIX Conference …, 2019 - usenix.org
Data deduplication is an effective way of improving storage space utilization. The data
generated by deduplication is persistently stored in data chunks or data containers (a …

InDe: An inline data deduplication approach via adaptive detection of valid container utilization

L Lin, Y Deng, Y Zhou, Y Zhu - ACM Transactions on Storage, 2023 - dl.acm.org
Inline deduplication removes redundant data in real-time as data is being sent to the storage
system. However, it causes data fragmentation: logically consecutive chunks are physically …

MGRM: a multi-segment greedy rewriting method to alleviate data fragmentation in deduplication-based cloud backup systems

D Zhang, Y Deng, Y Zhou, J Li, W Zhu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Data deduplication has been broadly used in Cloud due to its storage space saving ability.
Capping methods that rewrite the data chunks of low Container Reference Ratio (CRR) …

A content fingerprint-based cluster-wide inline deduplication for shared-nothing storage systems

A Khan, P Hamandawana, Y Kim - IEEE Access, 2020 - ieeexplore.ieee.org
Deduplication has been principally employed in distributed storage systems to improve
storage space efficiency. Traditional deduplication research ignores the design …

A fragmentation-aware redundancy elimination scheme for inline backup systems

Y Zhang, W Zhu, D Feng, W Huang, N Jiang… - Future Generation …, 2024 - Elsevier
Data deduplication is a widely employed technique in backup systems to enhance storage
efficiency by eliminating duplicate chunks. Delta compression is a technique that …

Accelerating ML/DL applications with hierarchical caching on deduplication storage clusters

P Hamandawana, A Khan, J Kim… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Large scale machine learning (ML) and deep learning (DL) platforms face challenges when
integrated with deduplication enabled storage clusters. In the quest to achieve smart and …

Improving restore performance of packed datasets in deduplication systems via reducing persistent fragmented chunks

Y Zhang, M Fu, X Wu, F Wang, Q Wang… - … on Parallel and …, 2020 - ieeexplore.ieee.org
Data deduplication, though being efficient for redundancy elimination in storage systems,
introduces chunk fragmentation which severely decreases restore performance. Rewriting …

Applying Delta Compression to Packed Datasets for Efficient Data Reduction

Y Zhang, H Jiang, C Wang, W Huang… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
Backup systems often adopt deduplication techniques for data reduction. Real-world backup
products often group files into larger units (called packed files) before deduplicating them …

A survey on deduplication systems

A Godavari, C Sudhakar - International Journal of Grid and …, 2024 - inderscienceonline.com
With the arrival of new technological trends such as Big Data and Internet of Things,
tremendous amount of duplicate data is being generated. Duplicate data causes the …