Copyright, Inktank Storage Inc. 2013.
OVERVIEW

Ceph, the complete choice for cloud storage, is a massively scalable, open source, distributed storage system. It is built to scale to the exabyte level and beyond while running on readily available commodity hardware. Ceph is in the Linux kernel, and has been integrated with the leading open source cloud management platforms.

Key to Ceph’s design is the autonomous, self-healing, and intelligent Object Storage Daemon (OSD). Storage clients and OSDs both use the CRUSH (Controlled Replication Under Scalable Hashing) algorithm to efficiently compute information about data containers on demand, instead of having to depend on a central lookup table. CRUSH provides a better data management mechanism compared to older approaches, and enables massive scale by cleanly distributing the work to all the clients and OSDs in the cluster. CRUSH uses intelligent data replication to ensure resiliency, which is better suited to hyper-scale storage.

Imagine an entire cluster filled with commodity hardware, no redundant array of inexpensive disks (RAID) cards, little human intervention and faster recovery times: this is a reality with Ceph replication. Let’s take a deeper look at how Ceph replication will save you money, save you time, increase flexibility and lower the risk of losing data compared to RAID.

RAID AND ITS CHALLENGES

RAID challenges include capacity and speed issues, reliability, availability and expense.

As technology has advanced, disks have grown. Speed and economic gains have come from greater density on disks, keeping the cost of data reasonable because you do not need to buy more spindles or a lot more controllers or heads. We instead can precisely position more tracks on a spindle, rotate the data faster and encode a bit in fewer inches of rust. This is where we run into problems. The non-recoverable error (NRE) rate is not a function of the disk drive but a function of the bits, so as drives become larger, NREs become common. Access speed has also not kept up with the increased density of the drives. With technologies like RAID6, it may take a 4TB drive many days to complete rebuilding. During that time, users are exposed to simultaneous disk failures and extended periods of degraded performance.

What happens if you want to expand using RAID? When building a larger system to add greater capacity, you want to use the latest and greatest disks, as the disks will be larger and cost less per GB. Most RAID replication schemes require that the disks have the same geometry and be replaced with identical units. You may also run into your storage system reaching a limit beyond which it cannot be further expanded, requiring a major overhaul of the entire system, which is a time consuming operation making it difficult to expand and migrate your data. Once the overhaul is complete, you need to redistribute the data in a way that balances the capacity and balances the traffic. If you do not balance the traffic, you are not making good use of your spindles. Unlike traditional RAID, Ceph stripes data across an entire cluster, not just RAID sets, while keeping a mix of old and new data to prevent high traffic in replaced disks.

The cost of RAID includes both capital and operating expenses. The capital costs come from the mark-up for enterprise hardware and the high performance RAID controllers that you will need to make sure that your storage system is most efficient. Proprietary appliances may also require you to order replacements from the manufacturer, which often comes at a much higher than commodity price for the drives. Operating costs exist because RAID doesn’t manage itself. RAID requires management to create storage, tune storage for appliances, and redistribute applications.

Reliability and availability are also concerns with RAID. Whether you are using RAID-5 or RAID-6, once a drive fails don’t put off replacing the drive. This just turns into more problems down the road. RAID recovery does a great job, but when it doesn’t work, it goes really bad and could cost you a lot of dollars and time to fix the problem. Also, the odds of an NRE during recovery are significant and client data access will be starved out during recovery. Furthermore, many RAID controllers fail the recovery after an NRE, losing the whole set. Even the most advanced RAID system cannot protect you against: server failures, NIC failures, switch failures, undetected bit errors, operating system crashes and facility or regional disasters.
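The claim that the odds of an NRE during recovery are significant can be made concrete with a rough estimate. The following is a minimal sketch, not a vendor model: it assumes bit errors are independent and uses a hypothetical rate of one non-recoverable error per 10^14 bits read, a figure often quoted on drive datasheets but not given in this paper. The function name and parameters are illustrative only.

```python
import math

TB = 1e12  # bytes in a (decimal) terabyte


def prob_nre_during_read(bytes_read: float, nre_per_bit: float = 1e-14) -> float:
    """Probability of hitting at least one non-recoverable error (NRE)
    while reading `bytes_read` bytes.

    Assumption (not from the source): errors are independent, with a
    per-bit error probability of `nre_per_bit`.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, evaluated with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits * math.log1p(-nre_per_bit))


if __name__ == "__main__":
    # A rebuild has to read the surviving drives in the RAID set; the more
    # data that must be read, the higher the chance of at least one NRE.
    for surviving_drives in (1, 3, 5, 7):
        p = prob_nre_during_read(surviving_drives * 4 * TB)
        print(f"{surviving_drives} x 4TB read during rebuild: P(NRE) = {p:.0%}")
```

Under these assumptions the probability grows with the number of bits read, not with the number of drives as such, which is the sense in which the NRE rate is "a function of the bits".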
RAID systems can be opaque about their internal workings, as vendors consider this their “secret sauce”. Being locked into proprietary solutions, users have to trust that in the event of failure everything works as advertised. The CRUSH algorithm has been extensively documented in numerous scientific papers, and the implementation is available as Open Source software. With Ceph, all the details are out in the open.

ADVANTAGES OF CEPH

Commodity hardware is what makes today’s cloud infrastructures possible. The model is built around leveraging many inexpensive building blocks and assuming that those blocks will all eventually fail. When it comes to disk drives, a certain percentage of failures is a given even for the highest quality disks.

Ceph protects against disk failures in two ways. The first way is replicating data in multiple locations and fault domains. Raw disks are cheap, and using dense inexpensive disks and drive controllers partially offsets the cost of additional capacity. There is no need to synchronize stripes of data across many disks or calculate parity, so the individual disk operations are fast and simple. This uses less expensive disk controllers and avoids the problems common with RAID and today’s large disks.

Second, the data on any failed disk is replicated across many OSDs, known as de-clustered placement. Those OSDs will learn about their peer’s failure via a new map of the cluster, then use CRUSH to find new locations to place copies of the data. Unlike RAID, potentially hundreds of source OSDs will be involved in copying data to hundreds of destination OSDs. The parallel copies will complete quickly, reducing exposure to multiple failures and degraded performance. This automatically keeps an optimal level of resiliency in the cluster.

Key benefits of de-clustered placement are:
• Recovery is parallel and 200x faster
• Service can continue during the recovery process
• Exposure to 2nd failures is reduced by 200x
• Zone aware placement protects against higher level failures
• Recovery is automatic and does not await new drives
• No idle hot-spares are required

Last but not least, cost is also a key advantage that Ceph has over RAID. Some of the key cost benefits include:
• Can leverage commodity hardware for lowest costs
• Not locked in to single vendor, get best deal over time
• RAID not required, leading to lower component costs

Below you will find a table that puts into perspective how cost effective Ceph Replication is compared to Enterprise RAID, based on cost per GB.

                   Raw $/GB   Protected $/GB      Usable (90%)   Replicated            Relative Expense
ENTERPRISE RAID    $3         $4 (RAID6 6+2)      $4.44          $8.88 (Main + Bkup)   533% storage cost
CEPH REPLICATION   $0.50      $1.50 (3 copies)    $1.67          $1.67 (3 copies)      Baseline (100%)

GIVE CEPH A TRY

Download Ceph today, and say good-bye to expensive proprietary storage solutions. As free open source software, it’s easy to get started with Ceph. Take advantage of our learning resources to start using Ceph and see for yourself how much a modern approach to reliable, autonomous, distributed storage can do for you:

http://ceph.com/docs/master/start/
http://ceph.com/resources/publications/
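The per-GB figures in the Enterprise RAID versus Ceph Replication cost comparison can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: the 90% figure is read as a fill ratio, "RAID6 6+2" as 8 disks storing 6 disks' worth of data, and "Main + Bkup" as two full copies of the protected array. The function and parameter names are hypothetical and only the dollar figures come from the comparison.

```python
def cost_per_usable_gb(raw_per_gb: float,
                       overhead_factor: float,
                       fill_ratio: float = 0.90,
                       site_copies: int = 1) -> float:
    """Cost of one usable, protected GB.

    raw_per_gb      -- price of one raw GB of disk
    overhead_factor -- raw GB consumed per GB of data (8/6 for RAID6 6+2,
                       3 for three-way replication)
    fill_ratio      -- fraction of capacity assumed usable (90% here)
    site_copies     -- full copies of the protected array (main + backup = 2)
    """
    protected = raw_per_gb * overhead_factor
    usable = protected / fill_ratio
    return usable * site_copies


# Enterprise RAID: $3 raw, RAID6 6+2, main array plus a backup copy.
raid = cost_per_usable_gb(3.00, overhead_factor=8 / 6, site_copies=2)

# Ceph: $0.50 raw, three copies; the replicas already serve as the protection.
ceph = cost_per_usable_gb(0.50, overhead_factor=3)

relative = raid / ceph
print(f"RAID ${raid:.2f}/GB, Ceph ${ceph:.2f}/GB, relative expense {relative:.0%}")
```

With these inputs the ratio comes out at roughly 533%, in line with the relative expense quoted for Enterprise RAID. The RAID figure computes to about $8.89 rather than $8.88 because the comparison doubles the already rounded $4.44.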