
Section 2: Storage Networking Technologies and Virtualization

Content Addressed Storage

Chapter 9

EMC Proven Professional

The #1 Certification Program in the information storage and management industry

© 2009 EMC Corporation. All rights reserved.


Chapter Objectives
Upon completion of this chapter, you will be able to:
o Describe CAS, fixed content and archives, and traditional storage
solutions for archiving
o Describe the features and benefits of a CAS-based storage
strategy
o List the physical and logical elements of CAS
o Describe the storage and retrieval process for CAS data objects
o Describe the operational environments best suited to CAS
solutions



Lesson: CAS Overview
Upon completion of this lesson, you will be able to:
o Define fixed content
o Describe traditional archival solutions and their shortcomings
o Define Content Addressed Storage (CAS)
o List benefits of CAS



What are Fixed Content and Archives
Digital assets retained for active reference and value:
o Generate new revenues
o Improve service levels
o Leverage historical value

Electronic Documents
o Contracts, claims, etc.
o E-mail and attachments
o Financial spreadsheets
o CAD/CAM designs
o Presentations

Digital Records
o Documents: checks, securities trades; historical preservation
o Photographs: personal / professional
o Surveys: seismic, astronomic, geographic

Rich Media
o Medical: X-rays, MRIs, CTI
o Video: news / media, movies; security surveillance
o Audio: voicemail; radio



Challenges of Storing Fixed Content
o Fixed content is growing at more than 90% annually
o A significant amount of newly created information falls into this category
o New regulations require retention and data protection
o Long-term preservation is often required (years to decades)
o Simultaneous multi-user online access is preferable to offline
storage
o Faster access to fixed content is needed
o Data must be location independent to enable technology
refresh and migration
o Traditional storage methods are inadequate



Traditional storage solutions for Archive
o Archival solutions fall into three categories, based on the means of
access: online, nearline, and offline
o Traditional archival solutions were offline
o The traditional archival process used optical disks and tapes as
archival media
o An archive is often stored on a Write Once Read Many (WORM) device,
such as a CD-ROM



Shortcomings of Traditional Archiving Solutions
o Tape is slow, and standards are always changing
o Optical is expensive, and requires vast amounts of media
o Recovering files from tape and optical is often time consuming
o Data on tape and optical is subject to media degradation
o Both solutions require sophisticated media management

CAS has emerged as an alternative to traditional archiving solutions



What is Content Addressed Storage (CAS)
o Object-oriented, location-independent approach to data
storage
o Repository for the “Objects”
o Access mechanism to interface with repository
o Globally unique identifiers provide access to objects



Benefits of CAS
(Additional task: research the role of CAS in ILM strategy)
o Content authenticity
o Content integrity
o Location independence
o Single-instance storage (SiS)
o Retention enforcement
o Record-level protection and disposition
o Technology independence
o Fast record retrieval
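
Two of the benefits listed above, single-instance storage and location independence, follow directly from addressing objects by their content. The sketch below is a toy in-memory illustration, not any vendor's implementation. The class name `SingleInstanceStore` is hypothetical, and SHA-256 is an assumption standing in for whatever hashing scheme a real CAS product uses.

```python
import hashlib


class SingleInstanceStore:
    """Toy in-memory object store keyed by a digest of the content."""

    def __init__(self):
        self._objects = {}   # content address -> bytes
        self.writes = 0      # number of physical copies actually written

    def put(self, data: bytes) -> str:
        # Assumption: SHA-256 stands in for the real addressing scheme.
        address = hashlib.sha256(data).hexdigest()
        if address not in self._objects:
            # Only previously unseen content consumes storage.
            self._objects[address] = data
            self.writes += 1
        return address

    def get(self, address: str) -> bytes:
        # The caller never supplies a path or device, only the address.
        return self._objects[address]


store = SingleInstanceStore()
first = store.put(b"scanned contract, revision 1")
second = store.put(b"scanned contract, revision 1")   # same content again
third = store.put(b"scanned contract, revision 2")    # different content

print(first == second)   # identical content maps to the identical address
print(store.writes)      # physical copies held after three submissions
```

Because the address depends only on the bytes, the same object can be moved to new hardware without changing how applications refer to it.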



Lesson Summary
Key points covered in this lesson:
o CAS Definition
o Challenges of Storing Fixed Content
o Shortcomings of Traditional Archiving Solutions
o Benefits of CAS



Lesson: CAS Architecture
Upon completion of this lesson, you will be able to:
o Describe CAS architecture
o Describe the physical and logical elements of CAS
o Describe the data storage and retrieval process in a CAS
environment
o Describe CAS examples



Physical Elements of CAS
o Storage devices (CAS based)
o Storage node
o Access node
o Servers (to which storage devices get connected)
o Client

[Figure: Client and Server with API, connected over IP to the CAS System, in which Access Nodes and Storage Nodes are joined by a Private LAN]

CAS Terminology

o Application Programming Interface (API)
o A set of function calls that enables communication
between applications or between an application and
an operating system
o BLOB (Binary Large Object)
o The actual data without the descriptive information
(metadata)
o The Distinct Bit Sequence (DBS) of user data
represents the actual content of a file and is
independent of the filename and physical location



CAS Terminology (Cont)

o C-Clip
o A package containing the user's data and associated
metadata
o C-Clip ID (C-Clip handle or C-Clip reference) is the CA that
the system returns to the client application
o Content Address (CA)
o An identifier that uniquely addresses the content of a file
and not its location. Unlike location-based addresses,
content addresses are inherently stable and, once
calculated, they never change and always refer to the same
content
o C-Clip Descriptor File (CDF)
o The additional XML file that the system creates when
making a C-Clip. This file includes the content addresses for
all referenced BLOBs and associated metadata
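
To make the relationship between BLOBs, the CDF, and the C-Clip ID concrete, here is a minimal sketch. It rests on several assumptions: the XML element and attribute names (`cdf`, `metadata`, `attribute`, `blob`) are invented for illustration and are not the real CDF schema, SHA-256 stands in for the actual addressing scheme, and deriving the C-Clip ID from the CDF bytes is one plausible reading of the definitions above rather than a documented procedure.

```python
import hashlib
import xml.etree.ElementTree as ET


def content_address(data: bytes) -> str:
    # Assumption: SHA-256 stands in for the real addressing scheme.
    return hashlib.sha256(data).hexdigest()


def build_cdf(blobs: dict, metadata: dict) -> bytes:
    """Return an XML descriptor listing each BLOB's content address plus metadata.

    Element and attribute names are hypothetical, chosen for this sketch only.
    """
    root = ET.Element("cdf")
    meta = ET.SubElement(root, "metadata")
    for key, value in sorted(metadata.items()):
        ET.SubElement(meta, "attribute", name=key, value=value)
    for name, data in sorted(blobs.items()):
        ET.SubElement(
            root, "blob",
            name=name,
            address=content_address(data),
            size=str(len(data)),
        )
    return ET.tostring(root, encoding="utf-8")


blobs = {"xray.img": b"image bytes", "report.txt": b"radiology report text"}
cdf = build_cdf(blobs, {"department": "radiology"})

# Assumption: the C-Clip ID returned to the application is the address of the CDF.
clip_id = content_address(cdf)
print(cdf.decode("utf-8"))
print(clip_id)
```

The point of the design is that one identifier leads to the descriptor, and the descriptor in turn lists the addresses of every BLOB it references.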



How CAS Stores a Data Object

1. Client presents data to the API to be archived
2. Unique Content Address is calculated
3. Object (C-Clip, including its CDF) is sent to the CAS System via the CAS API over IP
4. CAS System validates the Content Address and stores the object
5. Acknowledgement is returned to the application
6. Clip ID is retained and stored for future use

[Figure: Client, Application Server with API, and CAS System, linked over IP]
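
The storage flow on this slide can be sketched in a few lines. This is an illustrative model only: `CasSystem`, `CasApi`, and `catalog` are hypothetical names, the network hop over IP is replaced by a direct method call, and SHA-256 is an assumption standing in for the real content-address calculation.

```python
import hashlib


def content_address(data: bytes) -> str:
    # Assumption: SHA-256 stands in for the real addressing scheme.
    return hashlib.sha256(data).hexdigest()


class CasSystem:
    """Toy CAS back end: validates the address, then stores the object."""

    def __init__(self):
        self._objects = {}

    def store(self, claimed_address: str, data: bytes) -> bool:
        # Validate the Content Address by recomputing it from the received bytes.
        if content_address(data) != claimed_address:
            return False
        self._objects[claimed_address] = data
        return True  # acknowledgement returned to the caller

    def contains(self, address: str) -> bool:
        return address in self._objects


class CasApi:
    """Toy client-side API between the application and the CAS system."""

    def __init__(self, cas: CasSystem):
        self._cas = cas

    def archive(self, data: bytes) -> str:
        address = content_address(data)                # address is calculated
        acknowledged = self._cas.store(address, data)  # object is sent to the CAS system
        if not acknowledged:
            raise IOError("CAS system rejected the object")
        return address                                 # Clip ID handed back


cas = CasSystem()
api = CasApi(cas)

# The application retains the returned Clip ID for future use.
catalog = {}
catalog["check-0001"] = api.archive(b"check image bytes")
print(catalog["check-0001"])
```

Validating on the receiving side means an object damaged in transit is refused instead of being stored under an address that no longer describes it.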
How CAS Retrieves a Data Object

1. Object is needed by an application
2. Application finds the Content Address (C-Clip ID) of the object to be retrieved
3. Retrieval request is sent to the CAS System via the CAS API over IP
4. CAS authenticates the request and delivers the object

[Figure: Client, Application Server with API, and CAS System]
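
Retrieval can be sketched the same way. In this toy model the application keeps its own lookup table from a business key to the C-Clip ID, and the CAS system is a dictionary. The names `cas_objects`, `app_catalog`, `archive`, and `retrieve` are hypothetical, SHA-256 is an assumed addressing scheme, and the re-hash on read is an extra check added for this sketch; the slide itself only says that CAS authenticates the request.

```python
import hashlib

cas_objects = {}   # stands in for the CAS system: content address -> object
app_catalog = {}   # kept by the application: business key -> C-Clip ID


def archive(key: str, data: bytes) -> None:
    clip_id = hashlib.sha256(data).hexdigest()  # assumption: SHA-256 as the address
    cas_objects[clip_id] = data
    app_catalog[key] = clip_id


def retrieve(key: str) -> bytes:
    clip_id = app_catalog[key]    # application finds the Content Address
    data = cas_objects[clip_id]   # request is resolved by address, not by location
    # Extra check for this sketch: confirm the object still matches its address.
    if hashlib.sha256(data).hexdigest() != clip_id:
        raise ValueError("stored object no longer matches its content address")
    return data


archive("patient-42/xray", b"image bytes")
image = retrieve("patient-42/xray")
print(len(image))
```

Note that the application never learns where the object physically lives; the C-Clip ID is the only handle it needs to keep.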



CAS Features
o Features available with most CAS systems are:
o Integrity checking
o Data protection
o Local replication
o Remote replication
o Load balancing
o Scalability
o Self-diagnosis and repair
o Report generation and event notification
o Fault tolerance
o Through the use of redundant components and data protection schemes
o Audit trails
o Documentation of management activities, access and disposition of data
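
Integrity checking and self-repair combine naturally with a protection copy. The following is a minimal sketch under stated assumptions: `MirroredStore` and its `scrub` method are hypothetical names, two in-memory dictionaries stand in for a primary copy and a mirrored copy, and SHA-256 is an assumed addressing scheme. Real systems differ in how they schedule checks and where copies are placed.

```python
import hashlib


def address_of(data: bytes) -> str:
    # Assumption: SHA-256 stands in for the real addressing scheme.
    return hashlib.sha256(data).hexdigest()


class MirroredStore:
    """Toy store that keeps two copies of every object."""

    def __init__(self):
        self.primary = {}
        self.mirror = {}

    def put(self, data: bytes) -> str:
        address = address_of(data)
        self.primary[address] = data
        self.mirror[address] = data
        return address

    def scrub(self) -> list:
        """Re-hash every primary copy; repair mismatches from a good mirror copy."""
        repaired = []
        for address, data in list(self.primary.items()):
            if address_of(data) == address:
                continue  # copy is intact
            good = self.mirror.get(address)
            if good is not None and address_of(good) == address:
                self.primary[address] = good
                repaired.append(address)
        return repaired


store = MirroredStore()
addr = store.put(b"archived record")
store.primary[addr] = b"archived recorX"   # simulate silent corruption
repaired = store.scrub()
print(repaired == [addr])
```

The check needs no separate checksum database, because the address under which an object is filed is itself the expected digest.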



Example 1: CAS Healthcare Solution
[Figure: Hospital. Patient studies are stored locally on the Application Server for short-term use (60 days), and data is stored on the CAS System through the API]

o Each X-ray image ranges from about 15 MB to over 1 GB
o Patient records are stored online for a period of 60-90 days
o Beyond 90 days, patient records are archived



Example 2: CAS Financial Solution
[Figure: Bank. Application Server connected through the API to the CAS System]

o Check image size is about 25 KB
o A check imaging service provider may process 50-90 million
check images per month
o Checks are stored online for a period of 60 days
o Beyond 60 days, data is archived
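
The figures above imply a monthly ingest volume that is easy to work out. The short calculation below uses only the numbers on this slide. It assumes decimal units (1 TB = 10^9 KB) and treats 25 KB as an exact average, so the results are rough estimates.

```python
CHECK_IMAGE_KB = 25          # from the slide: about 25 KB per check image
KB_PER_TB = 1000 ** 3        # assumption: decimal units, 1 TB = 10^9 KB


def monthly_volume_tb(images_per_month: int) -> float:
    """Approximate data volume added per month, in terabytes."""
    return images_per_month * CHECK_IMAGE_KB / KB_PER_TB


low = monthly_volume_tb(50_000_000)    # 50 million images per month
high = monthly_volume_tb(90_000_000)   # 90 million images per month
print(f"{low} TB to {high} TB per month")
```

At these rates a provider adds roughly 1.25 TB to 2.25 TB of fixed content every month, all of which moves to the archive after the 60-day online period.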
Lesson Summary
Key points covered in this lesson:
o CAS architecture
o Physical and logical elements of CAS
o CAS storage and retrieval process
o CAS solution examples



Chapter Summary
Key points covered in this chapter:
o Benefits of CAS based storage strategy
o Overview of physical and logical elements of CAS
o Storing and retrieving data from CAS
o CAS application examples



For more information visit http://education.EMC.com
