Copyright, Inktank Storage Inc. 2013.

OVERVIEW

Ceph, the complete choice for cloud storage, is a massively scalable, open source, distributed storage system. It is built to scale to the exabyte level and beyond while running on readily available commodity hardware. Ceph is in the Linux kernel and has been integrated with the leading open source cloud management platforms.

Imagine an entire cluster filled with commodity hardware, no redundant array of inexpensive disks (RAID) cards, little human intervention and faster recovery times: this is a reality with Ceph replication.

Key to Ceph’s design is the autonomous, self-healing, and intelligent Object Storage Daemon (OSD). Storage clients and OSDs both use the CRUSH (Controlled Replication Under Scalable Hashing) algorithm to efficiently compute information about data containers on demand, instead of having to depend on a central lookup table. CRUSH provides a better data management mechanism compared to older approaches, and enables massive scale by cleanly distributing the work to all the clients and OSDs in the cluster. CRUSH uses intelligent data replication to ensure resiliency, which is better suited to hyper-scale storage.

Let’s take a deeper look at how Ceph replication will save you money, save you time, increase flexibility and lower the risk of losing data compared with RAID.

RAID AND ITS CHALLENGES

RAID challenges include capacity and speed issues, being able to scale to your needs, reliability, availability and expense.

As technology has advanced, disks have grown. Speed and economic gains have come from greater density on disks, keeping the cost of data reasonable because you do not need to buy more spindles or a lot more controllers or heads. Instead, we can precisely position more tracks on a spindle, rotate the data faster and encode each bit in less space on the platter. This is where we run into problems. The non-recoverable error (NRE) rate is not a function of the disk drive but a function of the number of bits, so as drives become larger, NREs become common. Access speed has also not kept up with the increased density of the drives. With technologies like RAID6, it may take many days to finish rebuilding a 4TB drive. During that time, users are exposed to simultaneous disk failures, undetected bit errors, and extended periods of degraded performance. Whether you are using RAID-5 or RAID-6, the odds of an NRE during recovery are significant, and client data access will be starved out during recovery. Also, many RAID controllers fail the recovery after an NRE, losing the whole set.

What happens if you want to expand using RAID? When building a larger system to add greater capacity, you want to use the latest and greatest disks, as they will be larger and cost less per GB. Most RAID replication schemes, however, require that the disks have the same geometry, so failed disks must be replaced with identical units. You may also find that your storage system reaches a limit beyond which it cannot be expanded further, requiring a major overhaul of the entire system. Once the overhaul is complete, you need to redistribute the data in a way that balances both the capacity and the traffic, which is a time-consuming operation that makes it difficult to expand and migrate your data. If you do not balance the traffic, you are not making good use of your spindles. Unlike traditional RAID, Ceph stripes data across an entire cluster, not just RAID sets, while keeping a mix of old and new data to prevent high traffic on replaced disks.

Reliability and availability are also a concern with RAID. RAID recovery does a great job, but when it doesn’t work, it goes really badly and could cost you a lot of money and time to fix the problem. Even the most advanced RAID system cannot protect you against server failures, NIC failures, switch failures, operating system crashes and facility or regional disasters. Furthermore, once a drive fails, do not put off replacing it. Delay just turns into more problems down the road.

The cost of RAID includes both capital and operating expenses. The capital costs come from the mark-up for enterprise hardware and the high-performance RAID controllers that you will need to make sure that your storage system is as efficient as possible. Proprietary appliances may also require you to order replacement drives from the manufacturer, often at a much higher than commodity price. Operating costs exist because RAID doesn’t manage itself. RAID requires management to create storage, tune storage for appliances, and redistribute applications.
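To make the rebuild risk described in this section concrete, here is a minimal sketch of the arithmetic. The NRE rate of one error per 10^14 bits is an assumption chosen for illustration (it is not stated in this paper), as are the hypothetical 8-drive RAID-5 set and the helper name `nre_probability`. The model also treats bit errors as independent, which is a simplification.

```python
import math

def nre_probability(bytes_read: float, errors_per_bit: float) -> float:
    """Chance of hitting at least one non-recoverable error while reading."""
    bits = bytes_read * 8
    # 1 - (1 - rate)^bits, written with log1p/expm1 so tiny rates stay accurate
    return -math.expm1(bits * math.log1p(-errors_per_bit))

TB = 10**12                      # decimal terabyte, as used on drive labels
ASSUMED_NRE_RATE = 1e-14         # assumption: one error per 10^14 bits read
SURVIVING_DRIVES = 7             # hypothetical 8-drive RAID-5 set, one drive failed

# A RAID-5 rebuild must read every surviving drive in full.
p = nre_probability(SURVIVING_DRIVES * 4 * TB, ASSUMED_NRE_RATE)
print(f"Chance of an NRE during the rebuild: {p:.0%}")
```

Under these assumptions the rebuild has to read 28 TB, and the chance of at least one NRE comes out at roughly 89 percent. A drive with a better error rate or a smaller set changes the number, but the trend is the same: the more bits a recovery has to read, the more likely it is to hit an NRE.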

ADVANTAGES OF CEPH

Commodity hardware is what makes today’s cloud infrastructures possible. The model is built around leveraging many inexpensive building blocks and assuming that those blocks will all eventually fail. When it comes to disk drives, a certain percentage of failures is a given even for the highest quality disks.

Ceph protects against disk failures in two ways. The first way is replicating data in multiple locations and fault domains. Raw disks are cheap, and using dense inexpensive disks and drive controllers partially offsets the cost of additional capacity. There is no need to synchronize stripes of data across many disks or calculate parity, so the individual disk operations are fast and simple. This uses less expensive disk controllers and avoids the problems common with RAID and today’s large disks.

Second, the data on any failed disk is replicated across many OSDs, known as de-clustered placement. Those OSDs will learn about their peer’s failure via a new map of the cluster, then use CRUSH to find new locations to place copies of the data. Unlike RAID, potentially hundreds of source OSDs will be involved in copying data to hundreds of destination OSDs. The parallel copies will complete quickly, reducing exposure to multiple failures and degraded performance. This automatically keeps an optimal level of resiliency in the cluster. Key benefits of de-clustered placement are:

• Recovery is parallel and 200x faster
• Service can continue during the recovery process
• Exposure to 2nd failures is reduced by 200x
• Zone aware placement protects against higher level failures
• Recovery is automatic and does not await new drives
• No idle hot-spares are required

RAID systems can be opaque about their internal workings, as vendors consider this their “secret sauce”. Being locked into proprietary solutions, users have to trust that in the event of failure everything works as advertised. With Ceph, all the details are out in the open. The CRUSH algorithm has been extensively documented in numerous scientific papers, and the implementation is available as Open Source software.

Last but not least, cost is also a key advantage that Ceph has over RAID. Some of the key cost benefits include:

• Can leverage commodity hardware for lowest costs
• Not locked in to single vendor, get best deal over time
• RAID not required, leading to lower component costs

The table below puts into perspective how cost effective Ceph Replication is compared to Enterprise RAID, based on cost per GB.

                    ENTERPRISE RAID          CEPH REPLICATION
Raw $/GB            $3                       $0.50
Protected $/GB      $4 (RAID6 6+2)           $1.50 (3 copies)
Usable (90%)        $4.44                    $1.67
Replicated          $8.88 (Main + Bkup)      $1.67 (3 copies)
Relative Expense    533% storage cost        Baseline (100%)

GIVE CEPH A TRY

As free open source software, it’s easy to get started with Ceph. Take advantage of our learning resources to start using Ceph and see for yourself how much a modern approach to reliable, autonomous, distributed storage can do for you:

http://ceph.com/docs/master/start/

Download Ceph and say good-bye to expensive proprietary storage solutions.
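The figures in the Enterprise RAID versus Ceph Replication cost comparison can be retraced with a few lines of arithmetic. The sketch below is one plausible reading of how each row follows from the one before it. The function name `cost_per_gb` and its parameters are hypothetical and introduced only for illustration; the dollar inputs are the ones quoted in the comparison.

```python
def cost_per_gb(raw, protection_factor, utilization, system_copies):
    """Return (protected, usable, replicated) cost per GB in dollars."""
    protected = raw * protection_factor   # parity or replica overhead
    usable = protected / utilization      # only part of the capacity is filled
    replicated = usable * system_copies   # extra full systems, e.g. a backup
    return protected, usable, replicated

# Enterprise RAID: $3 raw, RAID6 6+2 (8 drives bought per 6 of data),
# 90% usable, and a second full system as backup.
raid = cost_per_gb(3.00, 8 / 6, 0.90, 2)

# Ceph: $0.50 raw, 3 copies, 90% usable; the 3 copies already cover replication.
ceph = cost_per_gb(0.50, 3, 0.90, 1)

relative_expense = raid[2] / ceph[2]
for name, row in (("Enterprise RAID", raid), ("Ceph replication", ceph)):
    print(name, ", ".join(f"${value:.2f}" for value in row))
print(f"RAID relative expense: {relative_expense:.0%}")
```

Run as written, the sketch arrives at the 533% relative expense quoted in the comparison. The exact arithmetic gives $8.89 for the replicated RAID figure where the comparison shows $8.88, which is consistent with the published figure doubling the already rounded $4.44.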