Ceph: A scalable, high-performance distributed file system

S Weil, SA Brandt, EL Miller, DDE Long… - Proceedings of the 7th …, 2006 - usenix.org
We have developed Ceph, a distributed file system that provides excellent performance,
reliability, and scalability. Ceph maximizes the separation between data and metadata …

Extreme binning: Scalable, parallel deduplication for chunk-based file backup

D Bhagwat, K Eshghi, DDE Long… - … on Modeling, Analysis …, 2009 - ieeexplore.ieee.org
Data deduplication is an essential and critical component of backup systems. Essential,
because it reduces storage space requirements, and critical, because the performance of …

CRUSH: Controlled, scalable, decentralized placement of replicated data

SA Weil, SA Brandt, EL Miller, C Maltzahn - Proceedings of the 2006 …, 2006 - dl.acm.org
Emerging large-scale distributed storage systems are faced with the task of distributing
petabytes of data among tens or hundreds of thousands of storage devices. Such systems …

Secure data deduplication

MW Storer, K Greenan, DDE Long… - Proceedings of the 4th …, 2008 - dl.acm.org
As the world moves to digital storage for archival purposes, there is an increasing demand
for systems that can provide secure data storage in a cost-effective manner. By identifying …

Dynamic metadata management for petabyte-scale file systems

SA Weil, KT Pollack, SA Brandt… - SC'04: Proceedings of …, 2004 - ieeexplore.ieee.org
In petabyte-scale distributed file systems that decouple read and write from metadata
operations, behavior of the metadata server cluster will be critical to overall system …

Disk scrubbing in large archival storage systems

TJE Schwarz, Q Xin, EL Miller… - The IEEE Computer …, 2004 - ieeexplore.ieee.org
Large archival storage systems experience long periods of idleness broken up by rare data
accesses. In such systems, disks may remain powered off for long periods of time. These …

MAD2: A scalable high-throughput exact deduplication approach for network backup services

J Wei, H Jiang, K Zhou, D Feng - 2010 IEEE 26th Symposium …, 2010 - ieeexplore.ieee.org
Deduplication has been widely used in disk-based secondary storage systems to improve
space efficiency. However, there are two challenges facing scalable high-throughput …

Ceph: reliable, scalable, and high-performance distributed storage

SA Weil - 2007 - docs.huihoo.com
System designers have long sought to improve the performance of file systems, which have
proved critical to the overall performance of an exceedingly broad class of applications. The …

Evaluation of distributed recovery in large-scale storage systems

Q Xin, EL Miller, SJTJE Schwarz - Proceedings. 13th IEEE …, 2004 - ieeexplore.ieee.org
Storage clusters consisting of thousands of disk drives are now being used both for their
large capacity and high throughput. However, their reliability is far worse than that of smaller …

FastScale: Accelerate RAID Scaling by Minimizing Data Migration

W Zheng, G Zhang - 9th USENIX Conference on File and Storage …, 2011 - usenix.org
Previous approaches to RAID scaling either require a very large amount of data to be
migrated, or cannot tolerate multiple disk additions without resulting in disk imbalance. In …