Professional Documents
Culture Documents
STORAGE MANAGEMENT
SUB CODE: 16BCA525D21
MODULE 1: INTRODUCTION TO INFORMATION STORAGE AND
MANAGEMENT
Module 1 :
Information Storage: Data – Types of Data –Information - Storage , Evolution of Storage
Technology and Architecture, Data Centre Infrastructure - Core elements- Key Requirements
for Data Centre Elements -Managing Storage Infrastructure, Key Challenges in Managing
Information, Information Lifecycle - Information Lifecycle Management - ILM Implementation
-ILM Benefits
10/10/2021 2
V SEM, SM, DEPARTMENT OF BCA
AIM
To familiarize students with fundamentals of storage and overview of data centres.
Introduction to Storage and Data centres Information Storage
• Structured data is organized in rows and columns in a rigidly defined format, so that applications can
retrieve and process it efficiently.
• Data is unstructured if its elements cannot be stored in rows and columns, and is therefore difficult to query
and retrieve by business applications.
• For example, customer contacts may be stored in various forms such as sticky notes, e-mail messages,
business cards, or even digital format files such as .doc, .txt, and .pdf.
• Due to its unstructured nature, it is difficult to retrieve using a customer relationship management
application.
4
Introduction to Storage and Data centres Information Storage
• Unstructured:
Web Pages
o Instant messages
Rich Media
o Images
Audio Video
o Audio
o Video
o Movies Structured (20%)
• This enables the conversion of various types of content and media from conventional forms to digital
formats.
• Lower cost of digital storage: Technological advances and decrease in the cost of storage devices have
provided low-cost solutions and encouraged the development of less expensive data storage devices.
• This cost benefit has increased the rate at which data is being generated and stored.
• Affordable and faster communication technology: The rate of sharing digital data is now much faster
than traditional approaches.
• A handwritten letter may take a week to reach its destination, whereas it only takes a few seconds for an
e-mail message to reach its recipient. 6
Introduction to Storage and Data centres Information Storage
Examples of Data
7
Introduction to Storage and Data centres Information Storage
Data Explosion
• Inexpensive and easier ways to create, collect, and store all types of data, coupled with increasing individual
and business needs, have led to accelerated data growth, popularly termed as the data explosion.
• Data has different purposes and criticality, so both individuals and businesses have contributed in varied
proportions to this data explosion.
• Most of the data created holds significance in the short-term but, becomes less valuable over time. This
governs the type of data storage solutions used.
• Individuals store data on a variety of storage devices, such as hard disks, CDs, DVDs, or Universal Serial
Bus (USB) flash drives.
8
Introduction to Storage and Data centres Information Storage
Business
• Businesses generate vast amounts of data and then extract meaningful information from this data to derive
economic benefits.
• Therefore, businesses need to maintain data and ensure its availability over a longer period.
• Furthermore, the data can vary in criticality and may require special handling.
• For example, legal and regulatory requirements mandate that banks maintain account information for their
customers accurately and securely.
• Some businesses handle data for millions of customers, and ensures the security and integrity of data over
a long period of time.
• This requires high capacity storage devices with enhanced security features that can retain data for a
longer period.
9
Introduction to Storage and Data centres Information Storage
IP Operation
10
Introduction to Storage and Data centres Information Storage
11
Introduction to Storage and Data centres Information Storage
Information Storage
Examples:
12
Introduction to Storage and Data centres Information Storage
13
Introduction to Storage and Data centres Information Storage
• Buying/spending patterns
• Customer satisfaction/service
14
Introduction to Storage and Data centres Information Storage
• Reduced cost
• New services
• Communicate to bank customers with high account balances about a special savings plan.
15
Introduction to Storage and Data centres Information Storage
Storage
• Type of storage used is based on the type of data and the rate at which it is created and used
Examples:
26
Introduction to Storage and Data centres Information Storage
• Historically, organizations had centralised computers (mainframe) and information storage devices
(tape reels and disk packs) in their data centre.
• The evolution of open systems and the affordability and ease of deployment that they offer made it
possible for business units/departments to have their own servers and storage.
• In earlier implementations of open systems, the storage was typically internal to the server.
• Originally, there were very limited policies and processes for managing these servers and the data created.
• To overcome these challenges, storage technology evolved from non-intelligent internal storage to
intelligent networked storage.
17
Introduction to Storage and Data centres Information Storage
18
Introduction to Storage and Data centres Information Storage
Multi Protocol
Router
LAN FC SAN
IP SAN
SAN / NAS
RAID Array
Internal DAS
19
Introduction to Storage and Data centres Information Storage
• Redundant Array of Independent Disks (RAID): This technology was developed to address the cost,
performance, and availability requirements of data.
• It continues to evolve today and is used in all storage architectures such as DAS, SAN, and so on.
• Direct-attached storage (DAS): This type of storage connects directly to a server (host) or a group of
servers in a cluster.
• Storage can be either internal or external to the server. External DAS alleviated the challenges of limited
internal storage capacity.
20
Introduction to Storage and Data centres Information Storage
Storage Area Network (SAN): This is a dedicated, high-performance Fibre Channel (FC) network to
facilitate block-level communication between servers and storage.
• SAN offers scalability, availability, performance, and cost benefits compared to DAS.
Network-attached storage (NAS): This is dedicated storage for file serving applications.
• Unlike a SAN, it connects to an existing communication network (LAN) and provides file access
to heterogeneous clients.
• Because it is purposely built for providing storage to file server applications, it offers higher
scalability, availability, performance, and cost benefits compared to general purpose file servers.
21
Introduction to Storage and Data centres Information Storage
• Internet Protocol SAN (IP-SAN): One of the latest evolutions in storage architecture, IP-SAN is a
convergence of technologies used in SAN and NAS.
• IP-SAN provides block-level communication across a local or wide area network (LAN or WAN),
resulting in greater consolidation and availability of data.
Note: Storage technology and architecture continues to evolve, which enables organizations to consolidate,
protect, optimize, and leverage their data to achieve the highest return on information assets.
22
Introduction to Storage and Data centres Information Storage
23
Introduction to Storage and Data centres Information Storage
24
Introduction to Storage and Data centres Information Storage
25
Introduction to Storage and Data centres Information Storage
26
Introduction to Storage and Data centres Information Storage
IP-SAN
27
Introduction to Storage and Data centres Information Storage
a) Raw fact
b) Processed fact
c) Processed information
Answer: a
28
Introduction to Storage and Data centres Information Storage
a) Images
b) Videos
c) Audio
d) Spreadsheet
Answer: d
29
Introduction to Storage and Data centres Information Storage
Answer: a
30
Introduction to Storage and Data centres Information Storage
4. Which one of the given options is a collection of raw facts from which conclusions may be drawn?
a) Information
b) Data
c) Digital Value
Answer: b
31
Introduction to Storage and Data centres Information Storage
5. Information created by both an individual and business. State whether True or False.
a) True
b) False
Answer: a
32
Introduction to Storage and Data centres Information Storage
6. Which one of the given factors contributed to the growth of digital data?
a) Only i
b) Only ii
c) Only i and ii
d) All i, ii and iii
Answer: d
33
Introduction to Storage and Data centres Information Storage
7. Inexpensive and easier ways to create, collect, and store all types of data, coupled with increasing
individual and business needs, have led to accelerated data growth known as ___________________.
a) Data inclusion
b) Data Explosion
c) Data termination
Answer: b
34
Introduction to Storage and Data centres Information Storage
8. Which one of the given options are organized in rows and columns?
a) Unstructured
b) Hierarchical
c) Structured
d) Flat
Answer: c
35
Introduction to Storage and Data centres Information Storage
a) Unstructured
b) Hierarchical
c) Structured
d) Flat
Answer: a
36
Introduction to Storage and Data centres Information Storage
10. Storage is partitioned and assigned to a server for accessing its data. Which technology supports this
feature?
a) SAN
b) DAS
c) RAID
d) NAS
Answer: a
37
Introduction to Storage and Data centres Information Storage
Document Links
Storage
https://www.siemon.com/us/white_papers/14-07- This link explains about how storage technology has
Technology
29-data-center-storage-evolution.asp evolved over the years
Evolution
38
Introduction to Storage and Data centres Information Storage
Video Links
Overview of data This video explains about data and information and
https://www.youtube.com/watch?v=mUgEgkV16Bw
and information their relation
39
Introduction to Storage and Data centres Information Storage
Information Lifecycle
Change in the value of information with time, from its creation until disposal
40
Introduction to Storage and Data centres Information Storage
41
Introduction to Storage and Data centres Information Storage
ILM refers to the creation and management of wide-ranging set of strategies for streamlining the storage
infrastructure and the data within it. It is a comprehensive approach in managing the flow of data from its
creation to its disposal.
42
Introduction to Storage and Data centres Information Storage
Protect
43
Introduction to Storage and Data centres Information Storage
AUTOMATED
FLEXIBLE
44
Introduction to Storage and Data centres Information Storage
ILM should possess the following characteristics to obtain and maintain an ILM strategy:
45
Introduction to Storage and Data centres Information Storage
Management is simplified
Utilisation is improved
TCO is lowered
46
Introduction to Storage and Data centres Information Storage
▪ Improved utilization
• Tiered storage platforms
▪ Simplified management
• Processes, tools and automation
▪ Simplified backup and recovery
• A wider range of options to balance the need for business continuity.
▪ Maintaining compliance
• Knowledge of what data needs to be protected for what length of time.
▪ Lower Total Cost of Ownership
• By aligning the infrastructure and management costs with information value.
47
Introduction to Storage and Data centres Information Storage
11. The best storage technology for global data sharing is _____________.
a) SAN
b) NAS
c) RAID
d) IP-SAN
Answer: d
48
Introduction to Storage and Data centres Information Storage
a) Remains constant
b) Changes over time
c) May change or may not change over time
Answer: b
49
Introduction to Storage and Data centres Information Storage
i. Data creation
ii. Data sharing
iii. Data archival
a) Only i
b) Only ii
c) All i, ii and iii
d) None of the above
Answer: d
50
Introduction to Storage and Data centres Information Storage
14. DAS is a kind of storage that is directly connected to a computer. State whether True or False.
a) True
b) False
Answer: a
51
Introduction to Storage and Data centres Information Storage
52
Introduction to Storage and Data centres Information Storage
Data centres store and manage a large volume of mission-critical data. Organizations maintain data
centres to provide integrated data processing facility across the system.
53
Introduction to Storage and Data centres Information Storage
54
Introduction to Storage and Data centres Information Storage
55
Introduction to Storage and Data centres Information Storage
56
Introduction to Storage and Data centres Information Storage
57
Introduction to Storage and Data centres Information Storage
• Applications
• Databases – Database Management System (DBMS) and the physical and logical storage of data
• Servers/Operating systems
• Networks (LAN and SAN)
• Storage arrays
58
Introduction to Storage and Data centres Information Storage
Storage
Array
Server
Client Storage Area
Network
Local Area
Network
Application User
Database
Interface
OS and
DBMS
59
Introduction to Storage and Data centres Information Storage
What are the key factors to ensure for the success of a Data centre?
Availability
Manageability
Performance Capacity
Scalability
60
Introduction to Storage and Data centres Information Storage
1. Availability
2. Location and facility
3. Uptime
4. Cooling
5. Security
6. Facility maintenance
7. Scalability
8. Change Management
61
Introduction to Storage and Data centres Information Storage
Availability: All data centre elements should be designed to enable accessibility. The ability of
users to access data when necessary can have a positive impact on a business.
62
Introduction to Storage and Data centres Information Storage
Location and Facility: Location can have a considerable impact and could serve as an opportunity to
enhance the company’s ability to address network expectancy, disaster recovery, and data control. When
determining the location of a data centre, it is also necessary to consider the geographical location with respect
to the service provider.
63
Introduction to Storage and Data centres Information Storage
Uptime: Organizations can lose a lot of money during the downtime due to the inaccessibility of necessary
information and resources. The service provider’s track record of uptime should be tested and it should be
determined whether the provider has been able to maintain 100% availability for electrical and mechanical
systems in the past.
64
Introduction to Storage and Data centres Information Storage
65
Introduction to Storage and Data centres Information Storage
Cooling: The establishment of a data centre involves a combination of utilities, generators, and a range of
power supplies that emit excessive heat. It is essential to have a robust cooling infrastructure to ensure cost
efficiencies and a viable environment to establish the data centre.
66
Introduction to Storage and Data centres Information Storage
67
Introduction to Storage and Data centres Information Storage
Security: The security of the data centre is equally important. It is necessary to establish policies and
procedures, and ensure proper integration of the data centre core elements in order to prevent unauthorized
access to information. To maximize protection, the facility should be fabricated with military-grade security
measures.
68
Introduction to Storage and Data centres Information Storage
Facility Maintenance: During the installation of a data centre, it is important to consider the maintenance and
management aspects. The provider should have sufficient insight of the facility and its efficiencies to ensure
that the entire system is being observed and any issues can be quickly addressed before they cause any outage.
69
Introduction to Storage and Data centres Information Storage
Scalability: Data centre operations should be able to allocate additional on-demand storage or processing
capabilities, without interrupting business operations. For the growth of a business, it is often necessary to
install more servers, additional databases, and new applications. The storage solution should also able to grow
with the business.
70
Introduction to Storage and Data centres Information Storage
Change Management: Proper guidelines for change management ensures that there is no alteration in the
data centre that has not been planned, scheduled, discussed, and approved. It is also necessary to provide back
out steps or a Plan B. Whether bringing new amendments to the system or making old practices obsolete, data
centres must follow the change management process.
71
Introduction to Storage and Data centres Information Storage
15. Which one of the given options is/are a data centre element?
i. Databases
ii. Servers
iii. Applications
a) Only i
b) Only ii
c) Only i and ii
d) All i, ii and iii
Answer: d
72
Introduction to Storage and Data centres Information Storage
a) Accessibility
b) Robustness
c) Scalability
d) Optimality
Answer: a
73
Introduction to Storage and Data centres Information Storage
17. While determining site for a data centre, the most important factor to consider is ___________.
a) Cost
b) Geographical location
c) Workforce
Answer: b
74
Introduction to Storage and Data centres Information Storage
Document Links
75
Introduction to Storage and Data centres Information Storage
Video Links
76
Introduction to Storage and Data centres Information Storage
77
Introduction to Storage and Data centres Information Storage
• Campus network
• Private WAN
• Remote access
• Internet server farm
• Extranet server farm
• Intranet server farm
78
Introduction to Storage and Data centres Information Storage
79
Introduction to Storage and Data centres Information Storage
• Data centres typically house many components that support the infrastructure building blocks, such as the
core switches of the campus network or the edge routers of the private WAN.
• Data centre designs can include any or all of the building blocks as shown in previous slide figure,
including any or all server farm types. Each type of server farm can be a separate physical entity, depending
on the business requirements of the enterprise.
80
Introduction to Storage and Data centres Information Storage
81
Introduction to Storage and Data centres Information Storage
Data centres in Service Provider (SP) environments, known as Internet Data centres (IDCs), unlike in
enterprise environments, are the source of revenue that supports collocated server farms for enterprise
customers.
The SP Data centre is a service-oriented environment built to house, or host, an enterprise customer’s
application environment under tightly controlled SLAs for uptime and availability. Enterprises also build
IDCs when the sole reason for the Data centre is to support Internet-facing applications.
82
Introduction to Storage and Data centres Information Storage
18. Which one of the given options is a building block of a typical enterprise network?
i. Private WAN
ii. Remote access
iii. Extranet server farm
a) Only i
b) Only ii
c) Only i and ii
d) All i, ii and iii
Answer: d
83
Introduction to Storage and Data centres Information Storage
Answer: b
84
Introduction to Storage and Data centres Information Storage
85
Introduction to Storage and Data centres Information Storage
• In this section, we shall discuss the architecture of a generic enterprise Data centre connected to the
Internet and supporting an intranet server farm.
• Other types of server farms follow the same architecture used for intranet server farms yet with
different scalability, security, and management requirements.
86
Introduction to Storage and Data centres Information Storage
Topology of an Enterprise
Data centre Architecture
87
Introduction to Storage and Data centres Information Storage
88
Introduction to Storage and Data centres Information Storage
Internet Edge
The Internet Edge provides the connectivity from the enterprise to the internet and its associated redundancy
and security functions, as following:
• External and internal routing through Exterior Border Gateway Protocol (EBGP) and Interior
Border Gateway Protocol (IBGP).
89
Introduction to Storage and Data centres Information Storage
The campus core switches provide connectivity between the internet Edge, the intranet server farms, the
campus network, and the private WAN.
The core switches physically connect to the devices that provide access to other major network areas, such
as the private WAN edge routers, the server-farm aggregation switches, and campus distribution switches.
90
Introduction to Storage and Data centres Information Storage
Aggregation Layer
Access Layer
Storage Layer
91
Introduction to Storage and Data centres Information Storage
Aggregation Layer
The aggregation layer is the aggregation point for devices that provide services to all server farms. These
devices are multilayer switches, firewall, load balancers, and other devices that typically support services
across all servers.
92
Introduction to Storage and Data centres Information Storage
Access Layer
The access layer provides Layer 2 connectivity and Layer 2 features to the server farm. In a multitier
server farm, each server function could be located on different access switches on different segments.
93
Introduction to Storage and Data centres Information Storage
94
Introduction to Storage and Data centres Information Storage
95
Introduction to Storage and Data centres Information Storage
Storage Layer
The storage layer consists of the storage infrastructure such as Fibre Channel switches and routers that
support Small Computer System Interface (SCSI) over IP (iSCSI) or Fibre Channel over IP (FCIP). Storage
network devices provides the connectivity to servers, storage devices such as disk subsystems, and tape
subsystems.
96
Introduction to Storage and Data centres Information Storage
Transport Layer
The Data centre transport layer includes the transport technologies required for the following purposes:
• Communication between distributed server farms located in distributed Data centres for the
purpose of remote mirroring, replication, or clustering.
97
Introduction to Storage and Data centres Information Storage
98
Introduction to Storage and Data centres Information Storage
20. ________________ provides the connectivity from the enterprise to the Internet.
a) Internet Edge
b) Campus Core Switches
c) Access layer
d) Storage layer
Answer: a
99
Introduction to Storage and Data centres Information Storage
21. The ________________ provides connectivity between the Internet Edge, the intranet server farms,
the campus network, and the private WAN.
a) Internet Edge
b) Campus Core Switches
c) Access layer
d) Storage layer
Answer: b
10
Introduction to Storage and Data centres Information Storage
22. Access layer provides Layer 2 connectivity and Layer 2 features to the server farm. State whether True
or False.
a) True
b) False
Answer: a
10
Introduction to Storage and Data centres Information Storage
Document Links
Data centre http://focus.forsythe.com/articles/396/10- This link explains the steps to a successful Data
Strategy Steps-to-a-Successful-Data-centre-Strategy centre strategy.
10
Introduction to Storage and Data centres Information Storage
Video Links
Data centre https://www.youtube.com/watch?v=McVZ01k This video explains about real life data centre
Deployment n-lQ deployment and best practices.
Data centre https://www.youtube.com/watch?v=8M0XPOi In this Video, you will learn about details of
Topology KC7I Data centre topology.
10
Introduction to Storage and Data centres Information Storage
Answer: c
10
Introduction to Storage and Data centres Information Storage
24. Accessibility of data centre elements will have no impact on business. State whether True or False.
a) True
b) False
Answer: b
10
Introduction to Storage and Data centres Information Storage
Answer: a
10
Introduction to Storage and Data centres Information Storage
Answer: a
10
Introduction to Storage and Data centres Information Storage
a) Low
b) Medium
c) High
Answer: c
10
Introduction to Storage and Data centres Information Storage
a) Robustness
b) Scalability
c) Accessibility
d) Fault tolerance
Answer: b
10
Introduction to Storage and Data centres Information Storage
29. Which one of the given options is true about change management?
a) Only i
b) Only ii
c) Only i and ii
d) All i, ii and iii
Answer: a
11
Introduction to Storage and Data centres Information Storage
30. It is important to have a Plan B in place in a typical data centre environment. State whether True or
False.
a) True
b) False
Answer: a
11
Introduction to Storage and Data centres Information Storage
31. In a multitier server farm, each server function could be located on different access switches on
different segments. State true of false.
a) True
b) False
Answer: a
11
Introduction to Storage and Data centres Information Storage
Answer: d
11
Introduction to Storage and Data centres Information Storage
Summary
• Data is a collection of raw facts from which the required output is extracted.
• Storage technology has evolved through configurations such as RAID, DAS, SAN, NAS, and IP-SAN.
• The change in value of information with the change in time is called information lifecycle.
• ILM is a comprehensive approach in managing the flow of data from its creation to its disposal.
• The key requirement for data centre, elements are availability, location and facility, uptime, cooling,
security, facility maintenance, scalability, and change management.
11
Introduction to Storage and Data centres Information Storage
Assignment
1. Define data and information with real life examples.
2. Explain how structured and unstructured data differs.
3. Write a description on evolution of storage technologies.
4. Explain the different stages of Information Lifecycle.
5. What is ILM? Explain the various benefits of ILM.
6. Why do we need data centres? Explain.
7. Name and explain the core elements in data centre infrastructure.
8. Explain the architecture of Enterprise Data centre.
9. Discuss the necessity of information storage management with illustrative examples.
10. Explain Virtuous cycle of information with a neat diagram.
11
Introduction to Storage and Data centres Information Storage
Assignment
11. Define storage. Discuss the importance of digital data.
12. What is Data? Give examples of digital data.
13. List some of the factors that have contributed to the growth of digital data.
14. With the advancement of computer and communication technologies, the rate of data generation and
sharing has increased exponentially. Justify the statement.
15. Discuss the advantages of digital data over traditional data handling techniques.
16. Give examples of Research and Business data.
17. Demonstrate how data can be classified based on how it is stored and managed.
18. Differentiate between data and information.
19. Bring out the analysis of information by individual and business.
20. Discuss the options available for storing data for individuals and business. 11
Introduction to Storage and Data centres Information Storage
Assignment
21. Mention the challenges in storing the data in mainframes / open systems.
22. Explain the evolution of storage architectures.
23. Bring out the differences between SAN and NAS. Outline the Data centre Infrastructure requirements
to store and manage large amounts of mission-critical data.
24. Discuss the five core elements that are essential for the basic functionality of a data centre.
25. Demonstrate the requirements of the core elements of the data centre with an order processing system.
26. List out the key requirements for data centre elements.
27. Summarize the necessity of the key characteristics of data centre elements with respect to the storage.
28. Managing a modern, complex data centre involves many tasks. List out the tasks with suitable
examples.
Assignment
30. In order to frame an effective information management policy, businesses need to consider the key
challenges of information management. Mention the key challenges faced by the business.
31. Framing a policy to meet the challenges of information management involves understanding the value
of information over its lifecycle. Justify the statement.
35. Recommend the activities involved in the process of developing an Information Lifecycle Management
(ILM) strategy.
11
Introduction to Storage and Data centres Information Storage
Assignment
37. Implementing an ILM strategy provides the benefits that directly address the challenges of information
management. Justify the statement.
11
Introduction to Storage and Data centres Information Storage
Document Links
Video Links
https://www.youtube.com/watch?v=bpUzG
This link compares DAS, NAS and SAN
Types of storage technologies ZLO948&list=PLWDUzz3hCDJJFiOO_zpMTbi
technologies
HUNPDpU3xb
This link explains what is data,
https://www.youtube.com/watch?v=mUgEg
information and knowledge. A short
Overview of data and information kV16Bw
description on how data differs from
information.
12
Introduction to Storage and Data centres Information Storage
E-Book Links
Building a https://www.actualtechmedia.com/wp-
Modern Data content/uploads/2018/05/Building-a-Modern- 14 to 22
centre Data-Center-ebook.pdf
Information http://www.academia.edu/19388733/1_1_Infor
Storage and mation_Storage_and_Management_2nd_Editio 4 to 15
Management n
12
DEPARTMENT OF BCA
STORAGE MANAGEMENT
SUB CODE: 16BCA525D21
Module 2 :
Components of a Storage System Environment – Host –Connectivity – Storage, Disk Drive Components –
Platter – Spindle - Read/Write Head - Actuator Arm Assembly - Controller - Physical Disk Structure -
Zoned Bit Recording - Logical Block Addressing , Disk Drive Performance -1 Disk Service Time ,
Fundamental Laws Governing Disk Performance , Logical Components of the Host - Operating System -
Device Driver -Volume Manager - File System – Application , Application Requirements and Disk
Performance.
10/7/2021 2
V SEM, SM, DEPARTMENT OF BCA
Storage System Environment
AIM
To equip students with components of storage system environment and present an overview of
different storage technologies.
3
Storage System Environment
• Host: The host system is used for the interaction between the OS and application that has
requested the data.
• Connectivity: It provides the connection and helps in carrying out the data and the read or write
commands between the storage mediums and the host.
• Storage: The data gets stored into the storage mediums or devices.
4
Storage System Environment
Host
It is one of the main components in a storage system environment:
1. CPU
2. Storage
LA
N
o Disk device and internal memory.
3. I/O device Group of
Servers
• Host to host communications
o Network Interface Card (NIC).
• Host to storage device communications
o Host Bus Adapter (HBA). 5
Storage System Environment
CPU
The CPU consists of four main components:
▪ Control Unit.
▪ Register.
6
Storage System Environment
Storage
Memory and storage media are used to store data, either persistently or temporarily. Generally, there are two
types of memory on a host:
▪ Storage devices are less expensive than semiconductor memory. Examples of storage devices are as follows:
▪ Hard disk (magnetic).
7
Storage System Environment
Components of Storage
Fast
CPU registers
L2 cache L1 cache
Speed
RAM
Magnetic
disk
Optical
Tape disk
Slow
Low High 8
Cost
Storage System Environment
I/O Devices
Human interface
• Keyboard
• Mouse
• Monitor
Computer-computer interface
• Network Interface Card (NIC)
Computer-peripheral interface
• USB (Universal Serial Bus) port
• Host Bus Adapter (HBA)
9
Storage System Environment
10
Storage System Environment
Connectivity
▪ Provides interconnection between hosts or between a host and any storage devices
▪ The physical components are the hardware elements that connect the host to storage, and the
logical components of connectivity are the protocols used for communication between the host
and storage.
▪ Physical Components of Connectivity are:
Bus, Port and Cable
Disk
Port 11
Storage System Environment
▪ The port is a specialised outlet that enables connectivity between the host and external devices.
▪ Cables connect hosts to internal or external devices using copper or fibre optic media.
Buses, as conduits of data transfer on the computer system, can be classified as follows:
▪ System bus: The bus that carries data from the processor to memory.
▪ Local or I/O bus: A high-speed pathway that connects directly to the processor and carries data between
the peripheral devices, such as storage devices and the processor.
12
Storage System Environment
Serial Bi-directional
Parallel
13
Storage System Environment
• The popular interface protocol used for the local bus to connect to a peripheral device is a peripheral component
interconnect (PCI).
• The interface protocols that connect to disk systems are Integrated Device Electronics/Advanced Technology
Attachment (IDE/ATA) and Small Computer System Interface (SCSI).
• PCI is a specification that standardises how PCI expansion cards, such as network cards or modems, exchange
information with the CPU. PCI provides the interconnection between the CPU and attached devices.
• IDE/ATA is the most popular interface protocol used on modern disks. This protocol offers excellent
performance at relatively low cost.
• SCSI has emerged as a preferred protocol in high-end computers. This interface is far less commonly used than
IDE/ATA on personal computers due to its higher cost.
14
Storage System Environment
Storage
• The storage device is the most important component in the storage system environment.
• A storage device uses magnetic or solid state media. Disks, tapes, and diskettes use magnetic
media.
• CD-ROM is an example of a storage device that uses optical media, and a removable flash
memory card is an example of solid-state media.
15
Storage System Environment
Components of Storage
Magnetic Media
• Magnetic storage is an example of non-volatile memory, which means when the storage device is
not powered, then the data does not get lost and is retained.
• Magnetic storage uses various patterns of magnetisation in a magnetisable material to store the
data.
• Magnetic storages are relatively cheap and are widely used for data storage.
• Examples of devices that use the magnetic media are tapes, disks, and diskettes.
16
Storage System Environment
17
Storage System Environment
Optical Media
• Optical media are the kind of storage media that contain the content in a digital format and the
read/write feature can be performed on them using a laser.
• The examples of optical media include CD-ROMs, DVD-ROMs, and all the variations of the two
formats, as well as optical jukeboxes.
• Optical media can last up to seven times as long as any traditional storage media.
18
Storage System Environment
19
Storage System Environment
• It is made up of silicon microchips. It is non-volatile storage and stores data electronically instead of
storing it magnetically.
• It contains no mechanical parts, which allows the transfer of data, to and from the storage medium, to
take place at a very higher speed.
• An example of a device that uses the solid-state media is a removable flash memory.
20
Storage System Environment
21
Storage System Environment
Tapes
• Tapes are a popular storage media used for backup because of their relatively low cost. In the past, data Centres
hosted a large number of tape drives and processed several thousand reels of tape. However, the tape has the
following limitations:
▪ Data is stored on the tape linearly along the length of the tape. As a result, random data access is slow and time-
consuming.
▪ In a shared computing environment, data stored on tape cannot be accessed by multiple applications
simultaneously, restricting its use to one application at a time.
▪ On a tape drive, the read/write head touches the tape surface, so the tape degrades or wears out after repeated use.
▪ The storage and retrieval requirements of data from tape and the overhead associated with managing tape media
are significant.
22
Storage System Environment
Magnetic Tapes
23
Storage System Environment
Answer: d)
24
Storage System Environment
Answer: b)
25
Storage System Environment
Answer: c)
26
Storage System Environment
Answer: a)
27
Storage System Environment
Answer: b)
28
Storage System Environment
Answer: a)
29
Storage System Environment
Answer: c)
30
Storage System Environment
Answer: a)
31
Storage System Environment
Answer: b)
32
Storage System Environment
Answer: c)
33
Storage System Environment
34
Storage System Environment
Answer: c)
35
Storage System Environment
13. Which one connect hosts to internal or external devices using copper or fibre optic media?
a) Bus
b) Wire
c) Port
d) Cable
Answer: d)
36
Storage System Environment
Answer: b)
37
Storage System Environment
Answer: d)
38
Storage System Environment
Document Links
Solid state http://searchstorage.techtarget.com/definitio This link explains about fast solid state storage
storage n/solid-state-storage technology
Components of https://www.webopedia.com/TERM/C/CPU.ht
This link explains about the components of CPU
CPU ml
Video Links
Types of storage https://www.youtube.com/watch?v=NgVm2k This video explains and compares different types of
media UmrYo storage media
40
Storage System Environment
41
Storage System Environment
A disk drive is one of the most popular, randomly accessible, and rewritable storage medium used in the
modern computers. It is a physical drive in a computer which is capable of holding and retrieving
information. The disk drives are used for storing and accessing the data for a performance-intensive,
online application.
42
Storage System Environment
Controller
HDA Interfac
e
Power
Connector
43
Storage System Environment
• Platter
• Spindle
• Read/write head
• Actuator arm assembly
• Controller
44
Storage System Environment
Spindle
Platter
Read/write head
Actuator arm
Actuator arm
Actuator axis
assembly
Actuator
Disk Drive
45
Storage System Environment
Platters
46
Storage System Environment
Spindle
Spindle
47
Storage System Environment
Read/Write Heads
Most drives have two R/W heads per platter, one for each surface of the platter.
• When reading data, they detect magnetic polarisation on the platter surface.
• When writing data, they change the magnetic polarisation on the platter surface.
48
Storage System Environment
R/W Head
R/W Head
49
Storage System Environment
Track
Platter
50
Storage System Environment
Cylinders
Cylinder
51
Storage System Environment
Controller
Controller
Interface
HDA
Power
Connector
52
Storage System Environment
Zoned-Bit Recording
Since a platter is made up of concentric tracks, the outer tracks can hold more data than the inner ones
because they are physically longer than the inner tracks. However, in older disk drives, the outer tracks
had the same number of sectors as the inner tracks, which means that the data density was very low on the
outer tracks. This was an inefficient use of the available space.
Zoned-bit recording uses the disk more efficiently. It group tracks into zones that are based upon their
distance from the Centre of the disk.
53
Storage System Environment
Sector
Track
54
Storage System Environment
Earlier, drives used physical addresses made up of the Cylinder, Head, and the Sector number (CHS) to
refer to specific locations on the disk. This meant that the host had to be aware of the geometry of each
disk that was used.
Logical Block Addressing (LBA) simplifies addressing by a using a linear address for accessing physical
blocks of data. The disk controller performs the translation process from LBA to CHS address. The host
only needs to know the size of the disk drive (how many blocks).
55
Storage System Environment
Sector
Block 0
Cylinder Block 8
Head (lower surface)
Block 16
Block 32
Block 48
16. In Hard drive, the _______________ causes the platters to rotate within a sealed case.
a) Tracks
b) Sectors
c) Spindle
d) Read/Write Head
Answer: c)
57
Storage System Environment
a) Sectors
b) Cylinders
c) Zones
d) None of the above
Answer: a)
58
Storage System Environment
Answer: c)
59
Storage System Environment
a) True
b) False
Answer: a)
60
Storage System Environment
Answer: a)
61
Storage System Environment
Answer: b)
62
Storage System Environment
22. A typical HDD consists of one or more flat circular disks called as
a) Platter
b) Spindle
c) Read/Write head
d) Actuator arm assembly
e) Controller
Answer: a)
63
Storage System Environment
Answer: b)
64
Storage System Environment
Answer: c)
65
Storage System Environment
25. Which is a printed circuit board, mounted at the bottom of a disk drive ?
a) Platter
b) Spindle
c) Actuator arm assembly
d) Controller
Answer: d)
66
Storage System Environment
Answer: b)
67
Storage System Environment
PERFORMANCE
OF DISK DRIVE
68
Storage System Environment
Seek Time
69
Storage System Environment
Rotational Latency
It is the time taken to position the data under the read/write head by rotating and repositioning the platter.
Platter movement
70
Storage System Environment
Disk Drive
Data transfer rate is also known as transfer rate, and it refers to the average amount of data per unit time
that can be delivered by the drive to the Host Bus Adapter (HBA).
71
Storage System Environment
The Data Transfer Rate describes the MB/second that the drive can deliver data to the HBA. It can be
categorised as:
• Internal transfer rate - the speed of moving data from the disk surface to the R/W heads on a
single track of one surface of the disk.
• External transfer rate - the rate at which data can be moved through the interface to the HBA.
72
Storage System Environment
73
Fundamental Laws
Governing Disk Drive
Performance
7
Storage System Environment
I/O Queue
Arrival
I/O
6 5 4 3 2 1 Processed I/O Request
▪ Littles’ Law Controller
• Describes the relationship between the number of requests in a queue and the response time.
• N=a×R
o “N” is the total number of requests in the queue system
o “a” is the arrival rate of I/O requests
o “R” is the average response time
▪ Utilisation law
• Defines the I/O controller utilisation
• U = a × Rs
o “U” is the I/O controller utilisation
o “Rs“ is the service time
75
Storage System Environment
Answer: b)
76
Storage System Environment
Answer: a)
77
Storage System Environment
Answer: d)
78
Storage System Environment
Answer: b)
79
Storage System Environment
Answer: a)
80
Storage System Environment
Answer: c)
81
Storage System Environment
Answer: d)
82
Storage System Environment
34. The time taken by the platter to rotate and position the data under the R/W head is called
a) Seek Time
b) Average Seek Time
c) Rotational Latency
d) External Latency
Answer: c)
83
Storage System Environment
35. The rate at which data can be moved through the interface to the HBA is known as
a) Internal Transfer rate
b) External Transfer rate
c) Access rate
d) Pulse rate
Answer: b)
84
Storage System Environment
Document Link
85
Storage System Environment
Video Link
86
Storage System Environment
Logical Components
of Host
87
Storage System Environment
Application
88
Storage System Environment
Operating System
89
Storage System Environment
90
Storage System Environment
Logical Storage
Physical
Storage
91
Storage System Environment
LVM Example
Server
s
Logical
Volume
Physical
Volume
Partitionin Concatenatio
g n
92
Storage System Environment
Volume Group
93
Storage System Environment
Logical Volume
94
Storage System Environment
Device Drivers
File System
▪ The file is a collection of related records or data stored as a unit.
▪ The file system is the organisation of files on disk.
Examples, FAT 32, NTFS, UNIX FS and EXT2/3
95
Storage System Environment
a) Physical Volumes
b) Logical Volumes
c) Logical Groups
d) Volume Groups
Answer: c)
96
Storage System Environment
Answer: b)
97
Storage System Environment
Answer: c)
98
Storage System Environment
Answer: a)
99
Storage System Environment
Answer: c)
10
Storage System Environment
RAID technology
10
Storage System Environment
RAID technology
• It is a technology that enables greater levels of performance, reliability and/or large volumes when
dealing with data.
• High performance is achieved by concurrent use of two or more ‘hard disk drives’.
• Mirroring, Stripping (of data) and Error correction techniques combined with multiple disk arrays
gives the desired reliability and performance.
10
Storage System Environment
RAID flavours
1. RAID 0
2. RAID 1
3. RAID 5
4. RAID 10
10
Storage System Environment
RAID 0
10
Storage System Environment
RAID 1
10
Storage System Environment
RAID 5
10
Storage System Environment
RAID 10
10
Storage System Environment
• A software layer sits above the disk device drivers and provides an abstraction layer between the
logical drives(RAIDs) and physical drives.
10
Storage System Environment
Hardware-Based RAID
10
Storage System Environment
a) RAID 0
b) RAID 1
c) RAID 5
d) None of the above
Answer: b)
11
Storage System Environment
42. Hardware based RAID implementation requires no processor use for RAID calculations because
________
a) Does not involve calculations
b) Separate controller is present for calculations
c) Separate programme is present for calculations
d) None of the above
Answer: b)
11
Storage System Environment
11
Storage System Environment
Answer: a)
11
Storage System Environment
45. ______________ measures (in hours) the average life expectancy of an HDD
Answer: d)
11
Storage System Environment
46. Consider a storage array of 100 HDDs, each with an MTBF of 750,000 hours. This means that a HDD
in this array is likely to fail at least once in ____________.
a) 7,500 hours
b) 7,50 hours
c) 75 hours
d) 700 hours
Answer: a)
11
Storage System Environment
47. Which is a technique whereby data is stored on two different HDDs, yielding two copies of data?
a) Parity
b) Stripping
c) Mirroring
d) Seizing
Answer: b)
11
Storage System Environment
48. Which RAID configuration, data is striped across the HDDs in a RAID set ?
a) RAID 3
b) RAID 2
c) RAID 1
d) RAID 0
Answer: d)
11
Storage System Environment
Answer: c)
11
Storage System Environment
50. Consider an application that generates 5,200 IOPS, with 60 percent of them being reads. The disk load
in RAID 5 is
a) 11,440 IOPS
b) 11, 404 IOPS
c) 11, 004 IOPS
d) 11, 400 IOPS
Answer: a)
11
Storage System Environment
Document Link
RAID https://www.prepressure.com/library/techn This link explains the details of RAID technology with
technology ology/raid advantages and disadvantages of different RAID flavours
12
Storage System Environment
Video Links
https://www.youtube.com/watch?v=wTcxROb
RAID technology This video explains about RAID 0, 1, 2, 5, 6, 10
q738
https://www.youtube.com/watch?v=wpZ4589 In this Video, you will learn about RAID controller and its
RAID controller
0zEk use and need
12
Storage System Environment
Storage Networking
Technologies
12
Storage System Environment
Overview
12
Storage System Environment
Comparison
DAS NAS SAN
shared
Storage Type sectors blocks
files
Data TCP/IP, Fibre
IDE/SCSI
Transmission Ethernet Channel
clients or clients or
Access Mode servers
servers servers
Capacity
109 109 - 1012 1012
(bytes)
Complexity Easy Moderate Difficult
Management
High Moderate Low
Cost (per GB)
12
Storage System Environment
DAS Types
Ethernet
Network
Used
Small Server
Used
SCSI
Channel Used
12
Storage System Environment
SCSI
12
Storage System Environment
SAN
• A Storage Area Network (SAN) is a specialised, dedicated high-speed network joining servers and
storage, including disks, disk arrays, tapes, etc.
• Storage (data store) is separated from the processors (and separated processing).
• High capacity, high availability, high scalability, ease of configuration, ease of reconfiguration.
• Fibre Channel is the de facto SAN networking architecture, although other network standards could
be used.
12
Storage System Environment
SAN Benefits
• Storage consolidation.
• Data sharing.
• Non-disruptive scalability for growth.
• Improved backup and recovery.
• Tape pooling.
• LAN-free and server-free data movement.
• High performance.
• High availability server clustering.
• Data integrity.
• Disaster tolerance.
• Ease of data migration.
• Cost-effectiveness (total cost of ownership).
12
Storage System Environment
SAN Architecture
• Fibre Channel (FC) is well established in the open systems environment as the underlining architecture
of the SAN.
• Fibre Channel is structured with independent layers, as are other networking protocols. There are five
layers, where 0 is the lowest layer. The physical layers are 0 to 2. These layers carry the physical
attributes of the network and transport the data created by the higher level protocols, such as SCSI,
TCP/IP, or FICON.
• ANSI T11 is the technical committee who produced FC Protocol (interface standard) for
high-performance and mass storage applications.
• FC Protocol is designed to transport multiple protocols, such as HIPPI, IPI, SCSI, IP, Ethernet, etc.
13
Storage System Environment
FC Protocol Layers
13
Storage System Environment
SAN Topologies
• Point-to-point.
• Loop (arbitrated) – shared media.
• Switched.
13
Storage System Environment
Point-to-Point
• The point-to-point topology is the easiest Fibre Channel configuration to implement, and it is also the
easiest to administer.
13
Storage System Environment
Arbitrated Loop
13
Storage System Environment
Switched FC
• Fibre Channel-switches function in a manner similar to traditional network switches to provide increased
bandwidth, scalable performance, an increased number of devices, and, in some cases, increased
redundancy. Fibre Channel-switches vary in the number of ports and media types they support.
• Multiple switches can be connected to form a switch fabric capable of supporting a large number of host
servers and storage subsystems.
Server
s Fibre Channel
Storage
Switch
Device
13
Storage System Environment
NAS
NAS Implementation
SMB
NetBIOS
TCP
IP
802.3
13
Storage System Environment
NAS Implementation
NFS
TCP
IP
802.3
13
Storage System Environment
NAS vs SAN
Traditionally:
• NAS is used for low-volume access to a large amount of storage by many users.
• SAN is the solution for terabytes (1012) of storage and multiple, simultaneous access to files, such
as streaming audio/video.
The lines are becoming blurred between the two technologies now, and while the SAN-versus-NAS debate
continues, the fact is that both technologies complement each another.
13
Storage System Environment
IP-SAN
▪ Can be used for local data Centre and long haul applications.
• iSCSI – Internet SCSI – allows block storage to be accessed over a TCP/IP network as though it
were locally attached
• FCIP – Fibre Channel over IP – used to tunnel Fibre Channel frames over TCP/IP connections
14
Storage System Environment
Answer: a)
14
Storage System Environment
a) DAS
b) NAS
c) SAN
d) None of the above
Answer: b)
14
Storage System Environment
Answer: c)
14
Storage System Environment
Summary
• A storage system is a group of components providing storage facilities to one or numerous computers.
• The main components of storage system environments are the host, connectivity, and storage.
• Components of a disk drive are platter, spindle, read/write head, actuator arm assembly, and controller.
• A host is a computer or any other device on which applications run.
• The logical components of a host are OS, device driver, volume manager, file system, volume group.
• Zoned bit recording groups tracks into zones based on their distance from the centre of the disk.
14
Storage System Environment
Assignment
14
Storage System Environment
Assignment
8. Explain the three main components of the storage system.
9. Discuss the physical components of the host in the storage system.
10. Compare RAM and ROM storage devices
11. Explain the different types of data communication between the I/O devices and host.
12. Recommend the physical components of connectivity between the host and storage. Explain.
13. Give the classification of buses for data transfer.
14. Discuss the logical components of the connectivity in the storage system.
14
Storage System Environment
Assignment
15. List out the merits and limitations of tape storage device.
16. Discuss the different types of storage devices that use magnetic media.
17. With a neat diagram, explain the Disk Drive Components.
18. With a neat diagram, explain Physical Disk Structure.
19. Explain the concept of Zone Bit Recording to store the information in the disk.
20. Discuss the concept of logical block addressing in the disk storage device.
21. The utilisation law is a fundamental law that defines the I/O controller utilization. Justify the
statement.
14
Storage System Environment
Assignment
22. Discuss the necessity of RAID Technology?
23. Elaborate the different types of RAID implementation. What are the benefits and limitation of the
techniques?
24. With a neat diagram, explain the components of RAID Array.
25. Explain the RAID techniques that define the various RAID levels.
26. Compare the different RAID techniques.
27. Recommend the techniques determine the data availability and performance characteristics of a
redundant array.
14
Storage System Environment
Assignment
28. Consider the three users A, B and C. The user B wants to store the two copies of data on two
different HDDs, user A wants to store the parts of data on different disks and user C wants to
protect and store the segments of the data. Identify the RAID techniques that can be used to store
the user A, B and C data.
29. List the various RAID levels with the RAID techniques adopted.
30. Compare the different RAID levels.
31. With a neat diagram, explain the RAID 0 and 1 levels to improve the disk performance.
32. With a neat diagram, explain the nested RAID levels to improve the disk performance.
14
Storage System Environment
Assignment
33. Consider an application that generates 5,200 IOPS, with 60 percent of them being reads. Compute
the disk load in RAID 5 RAID 1
34. What is DAS? Explain the different types of DAS architectures.
35. List out the benefits and limitations of DAS architecture.
36. Data Centre managers are burdened with the challenging task of providing low-cost, high-
performance information management solutions. Justify the statement.
37. With a neat diagram, explain the SAN implementation.
38. List and explain the components of the SAN.
39. With a neat diagram, explain the fibre Channel SAN evolution.
15
Storage System Environment
Assignment
40. Distinguish between multi-mode and single-mode fibre cables for data transfer in FC SAN.
41. Compare and contrast between FC switch and FC hub.
42. List FC architecture supports three basic interconnectivity options. explain and compare.
43. With neat diagrams, explain the FC topologies most commonly deployed in SAN implementations.
15
Storage System Environment
Document Links
http://study.com/academy/lesson/magnetic-storage-
This link explains Magnetic Storage: Definition,
Magnetic storage definition-devices-examples.html
Devices and Examples
Video Links
Storage Area Network https://www.youtube.com/watch?v=7et3bmXhr8w This video explains the building blocks of SAN
15
Storage System Environment
E-Book Links
http://read.pudn.com/downloads168/ebook/77230
Storage Systems Architecture 1 to 7
7/STF2-0SectionIntroduction.pdf
hthttps://www.pdfdrive.com/information-storage-
Information Storage and Management and-management-volume-1-of-rnhtnet- 1-317
d17363479.html
15
DEPARTMENT OF BCA
STORAGE MANAGEMENT
SUB CODE: 16BCA525D21
Module 3 :
Implementation of RAID - Software RAID - Hardware RAID -RAID Array Component -RAID Levels -
Striping -Mirroring -Parity RAID 0 RAID 1 -Nested RAID -RAID 3 -RAID 4 -RAID 5 --RAID 6 -RAID
Comparison -RAID Impact on Disk-Performance - Application IOPS and RAID Configurations -
Introduction to Direct Attached Storage – Types of DAS – Introduction to SAN – Components of SAN –
FC connectivity – FC topologies – Introduction to NAS – NAS components – NAS Implementation – NAS
File sharing.
10/7/2021 2
V SEM, SM, DEPARTMENT OF BCA
RAID and Storage Networking Technologies
AIM
To familiarise students with RAID and provide the good knowledge of different storage technologies
such DAS, SAN, FC and NAS.
3
RAID and Storage Networking Technologies
Objective
4
RAID and Storage Networking Technologies
Outcome
At the end of this module, you are expected to:
5
RAID and Storage Networking Technologies
Content
• Why RAID?
• Implementation of RAID
• RAID Array Component
• RAID technology
• RAID Levels
• RAID Comparison
• RAID Impact on Disk-Performance
• Application IOPS and RAID Configurations
• Storage Networking Technologies (DAS, SAN, FC, and NAS)
6
RAID and Storage Networking Technologies
Why RAID
• Performance limitation of disk drive.
• An individual drive has a certain life expectancy.
• Measured in MTBF (Mean Time Between Failure).
The more the number of HDDs in a storage array, the larger the probability for disk failure. For example:
• If the MTBF of a drive is 750,000 hours, and there are 100 drives in the array, then the MTBF of
the array becomes 750,000 / 100, or 7,500 hours.
• RAID was introduced to mitigate this problem.
RAID provides:
• Increase capacity
• Higher availability
• Increased performance
7
RAID and Storage Networking Technologies
Implementation of RAID
8
RAID and Storage Networking Technologies
• A software layer sits above the disk device drivers and provides an abstraction layer between the
logical drives (RAIDs) and physical drives.
9
RAID and Storage Networking Technologies
• Supported features: Software RAID does not support all RAID levels.
• Operating system compatibility: Software RAID is tied to the host operating system hence, upgrades to
software RAID or to the operating system should be validated for compatibility.
10
RAID and Storage Networking Technologies
11
RAID and Storage Networking Technologies
• Controller card RAID is host-based hardware RAID implementation in which a specialised RAID
controller is installed in the host and HDDs are connected to it.
• The RAID Controller interacts with the hard disks using a PCI bus.
• The external RAID controller is an array-based hardware RAID. It acts as an interface between the
host and disks.
• Key functions of RAID controllers are: ■ Management and control of disk aggregations ■ Translation
of I/O requests between logical disks and physical disks ■ Data regeneration in the event of disk
failures.
12
RAID and Storage Networking Technologies
13
RAID and Storage Networking Technologies
Logical Array
RAID
Controller
Hard Disks
Host
RAID Array
14
RAID and Storage Networking Technologies
• A RAID array is an enclosure that contains a number of HDDs and the supporting hardware and
software to implement RAID.
• These sub-enclosures, or physical arrays, hold a fixed number of HDDs, and may also include other
supporting hardware, such as power supplies.
• A subset of disks within a RAID array can be grouped to form logical associations called logical
arrays, also known as a RAID set or a RAID group.
• Logical arrays are comprised of Logical Volumes (LV). The operating system recognises the LVs as if
they are physical HDDs managed by the RAID controller.
15
RAID and Storage Networking Technologies
RAID technology
and Levels
16
RAID and Storage Networking Technologies
RAID technology
• It is a technology that enables greater levels of performance, reliability and/or large volumes when
dealing with data.
• High performance is achieved by concurrent use of two or more ‘hard disk drives’.
• Mirroring, Stripping (of data) and Error correction techniques combined with multiple disk arrays
gives the desired reliability and performance.
17
RAID and Storage Networking Technologies
RAID Levels
• 0 Striped array with no fault tolerance
• 1 Disk mirroring
• Nested RAID (i.e., 1 + 0, 0 + 1, etc.)
• 3 Parallel access array with dedicated parity disk
• 4 Striped array with independent disks and a dedicated parity disk
• 5 Striped array with independent disks and distributed parity
• 6 Striped array with independent disks and dual distributed parity
18
RAID and Storage Networking Technologies
RAID flavors
1. RAID 0
2. RAID 1
3. RAID 5
4. RAID 10
19
RAID and Storage Networking Technologies
Stripe
Data Organization: Striping
Strip
Stripe
Stripe 1
Stripe 2
Strips
20
RAID and Storage Networking Technologies
RAID 0
21
RAID and Storage Networking Technologies
RAID 0
• Allows multiple data to be read or written simultaneously, and therefore improves performance.
• Does not provide data protection and availability in the event of disk failures.
22
RAID and Storage Networking Technologies
RAID 0
1
5
9
2
RAID
6
Controller
10
3
7
Host
11
23
RAID and Storage Networking Technologies
RAID 1
• Data is stored on two different HDDs, yielding two copies of the same data.
• Provides availability.
• In the event of HDD failure, access to data is still available from the surviving HDD.
• When the failed disk is replaced with a new one, data is automatically copied from the surviving disk to the
new disk.
• Done automatically by RAID the controller.
• Disadvantage: The amount of storage capacity is twice the amount of data stored.
24
RAID and Storage Networking Technologies
Mirroring
25
RAID and Storage Networking Technologies
RAID 1
RAID
Block 01 Block 0
1
Controller
Host
26
RAID and Storage Networking Technologies
Nested RAID
• Combines the performance benefits of RAID 0 with the redundancy benefit of RAID 1.
• Data is first mirrored, and then both copies are striped across multiple HDDs.
• Rebuild operation only requires data to be copied from the surviving disk into the replacement disk.
27
RAID and Storage Networking Technologies
RAID 1
Block 0
Block 2
RAID RAID 0
Block 0
3
2
1
Controller
Block 1
Block 3
Host
28
RAID and Storage Networking Technologies
Block 0 Block 0
Block 2 Block 2
RAID 0
RAID
Controller
Block 1 Block 1
Block 3 Block 3
Host
29
RAID and Storage Networking Technologies
RAID 0
Block 1
Block 3
RAID 1
RAID
Block 2
0
Controller
Block 1
Block 3
Host
30
RAID and Storage Networking Technologies
Block 0 Block 1
Block 2 Block 3
RAID 1
RAID
Controller
Block 0 Block 1
Block 2 Block 3
Host
31
RAID and Storage Networking Technologies
RAID 10
32
RAID and Storage Networking Technologies
1
6 5
9
RAID ?1
Controller
3
Host 7 7
11
The middle drive fails:
Parity calculation 4 + 6 + 1 + 7 = 18 0123
4 + 6 + ? + 7 = 18 18
4567
? = 18 – 4 – 6 – 7
?=1
Parity Disk
33
RAID and Storage Networking Technologies
• Stripes data for high performance and uses parity for improved fault tolerance.
• If a drive files, data can be reconstructed using data in the parity drive.
• For RAID 3, data read / write is done across the entire stripe.
• Provide good bandwidth for large sequential data access such as video streaming.
34
RAID and Storage Networking Technologies
RAID 3
RAID
Block 0
3
2
1 Block 0
Controller
Block 1
Parity
Generated
Block 2
Host
Block 3
P0123
35
RAID and Storage Networking Technologies
RAID 4
• Similar to RAID 3, RAID 4 stripes data for high performance and uses parity for improved fault tolerance.
• Data is striped across all disks except the parity disk in the array.
• Parity information is stored on a dedicated disk so that the data can be rebuilt if a drive fails.
• Unlike RAID 3, data disks in RAID 4 can be accessed independently, so that specific data elements can be
read or written on single disk without read or write of an entire stripe.
36
RAID and Storage Networking Technologies
RAID 4
37
RAID and Storage Networking Technologies
• RAID 5 is similar to RAID 4, except that the parity is distributed across all disks instead of stored on a
dedicated disk.
• This overcomes the write bottleneck on the parity disk.
• RAID 6 is similar to RAID 5, except that it includes a second parity element to allow survival in the event
of two disk failures.
• The probability for this to happen increases and the number of drives in the array increases.
• Calculates both horizontal parity (as in RAID 5) and diagonal parity.
• Has more write penalty than in RAID 5.
• Rebuild operation may take longer than on RAID 5.
38
RAID and Storage Networking Technologies
RAID 5
Block 0
Block 4
Block 1
Block 5
Parity
RAID Block 2
Block 0
4 Block 4
0
Generated
Controller Block 6
P4
0516
273
Block 3
Host
P4567
P0123
Block 7
39
RAID and Storage Networking Technologies
RAID 5
• RAID 5 is an ideal combination of
good performance, good fault
tolerance and high capacity and
storage efficiency.
40
RAID and Storage Networking Technologies
RAID Comparison
41
RAID and Storage Networking Technologies
RAID Comparison
Min
RAID Storage Efficiency % Cost Read Performance Write Performance
Disks
Good
Good Slower than a single disk, as every
1 2 50 High Better than a single disk write must be committed to two disks
(n-1)*100/n
Good for random reads and very Poor to fair for small random writes
where n= number of
3 3 Moderate good for sequential reads Good for large, sequential writes
disks
(n-2)*100/n
Moderate but Very good for random reads Good for small, random writes
6 4 where n= number of
more than RAID 5 Good for sequential reads (has write penalty)
disks
1+0
and 4 50 High Very good Good
0+1
42
RAID and Storage Networking Technologies
RAID Impact on
Performance of Disk
43
RAID and Storage Networking Technologies
RAID Impacts on Performance RAID Controller
2
Ep new Ep old X E4 old E4 new
O
R
P0 D1 D2 D3 D4
45
RAID and Storage Networking Technologies
• When deciding the number of disks required for an application, it is important to consider the impact of
RAID based on IOPS generated by the application.
• The total disk load should be computed by considering the type of RAID configuration and the ratio of
read compared to write from the host.
• The following example illustrates the method of computing the disk load in different types of RAID.
Total IOPS at peak workload is 1200, Read/Write ratio 2:1
• RAID 1/0
Additional Task
• RAID 5 Discuss impact of sequential and
Random I/O in different RAID
Configuration
46
RAID and Storage Networking Technologies
• Consider an application that generates 5,200 IOPS, with 60% of them being reads.
• RAID 5 disk load = 0.6 × 5,200 + 4 × (0.4 × 5,200) [because the write penalty for RAID 5 is 4]
• RAID 1 disk load = 0.6 × 5,200 + 2 × (0.4 × 5,200) [because every write manifests as two writes to the
disks]
47
RAID and Storage Networking Technologies
• The computed disk load determines the number of disks required for the application. If in this example,
an HDD with a specification of a maximum 180 IOPS for the application needs to be used, the number of
disks required to meet the workload for the RAID configuration would be as follows:
48
RAID and Storage Networking Technologies
RAID Summary
49
RAID and Storage Networking Technologies
a) RAID 0
b) RAID 1
c) RAID 5
Answer: b
50
RAID and Storage Networking Technologies
2. Hardware based RAID implementation requires no processor use for RAID calculations because it does
not involve calculations. State whether True or False.
a) True
b) False
Answer: b
51
RAID and Storage Networking Technologies
a) Only i
b) Only ii
c) Only i and ii
d) All i, ii and iii
Answer: d
52
RAID and Storage Networking Technologies
Answer: a
53
RAID and Storage Networking Technologies
Answer: d
54
RAID and Storage Networking Technologies
6. Consider a storage array of 100 HDDs, each with an MTBF of 750,000 hours. This means that a HDD in
this array is likely to fail at least once in ____________ hours.
a) 7,500
b) 7,50
c) 75
d) 700
Answer: a
55
RAID and Storage Networking Technologies
7. Which one of the given options is a technique whereby, data is stored on two different HDDs, yielding
two copies of data?
a) Parity
b) Stripping
c) Mirroring
d) Seizing
Answer: b
56
RAID and Storage Networking Technologies
8. In which one of the given RAID configuration, data is striped across the HDDs in a RAID set?
a) RAID 3
b) RAID 2
c) RAID 1
d) RAID 0
Answer: d
57
RAID and Storage Networking Technologies
9. In which one of the given RAID configuration, data is mirrored to improve fault tolerance?
a) RAID 3
b) RAID 2
c) RAID 1
d) RAID 0
Answer: c
58
RAID and Storage Networking Technologies
10. Consider an application that generates 5,200 IOPS, with 60 percent of them being reads. The disk load
in RAID 5 is ___________________.
a) 11,440 IOPS
b) 11, 404 IOPS
c) 11, 004 IOPS
d) 11, 400 IOPS
Answer: a
59
RAID and Storage Networking Technologies
Document Links
60
RAID and Storage Networking Technologies
Video Links
61
RAID and Storage Networking Technologies
Storage Networking
Technologies
62
RAID and Storage Networking Technologies
Overview
63
RAID and Storage Networking Technologies
64
RAID and Storage Networking Technologies
65
RAID and Storage Networking Technologies
What is DAS?
DAS Benefits
• Simple to deploy.
• Low complexity.
67
RAID and Storage Networking Technologies
• SCSI
• Parallel (primarily for internal bus)
• Serial (external bus)
• FC
• High speed network technology
DAS Management
• Internal
• Host provides:
• Direct Attached Storage managed individually through the server and the OS.
• External
• Lower TCO (Total Cost of Ownership) for managing data and storage infrastructure.
69
RAID and Storage Networking Technologies
DAS Challenges
• Scalability is limited
• Limited bandwidth
• Distance limitations
70
• Resulting in islands of over and under utilized storage pools.
RAID and Storage Networking Technologies
DAS Types
Ethernet
Network
Used
Small Server
Used
SCSI
Channel Used
71
RAID and Storage Networking Technologies
SCSI
72
RAID and Storage Networking Technologies
73
RAID and Storage Networking Technologies
74
RAID and Storage Networking Technologies
SAN
• A Storage Area Network (SAN) is a specialised, dedicated high speed network joining servers and
storage, including disks, disk arrays, tapes, etc.
• Storage (data store) is separated from the processors (and separated processing).
• High capacity, high availability, high scalability, ease of configuration, ease of reconfiguration.
• Fibre Channel is the de facto SAN networking architecture, although other network standards could be
used.
75
RAID and Storage Networking Technologies
SAN Benefits
• Storage consolidation
• Data sharing
• Non-disruptive scalability for growth
• Improved backup and recovery
• Tape pooling
• LAN-free and server-free data movement
• High performance
• High availability server clustering
• Data integrity
• Disaster tolerance
• Ease of data migration
• Cost-effectiveness (total cost of ownership)
76
RAID and Storage Networking Technologies
SAN Architecture
• Fibre Channel (FC) is well established in the open systems environment as the underlining architecture of
the SAN.
• Fibre Channel is structured with independent layers, as are other networking protocols. There are five
layers, where 0 is the lowest layer. The physical layers are 0 to 2. These layers carry the physical attributes
of the network and transport the data created by the higher level protocols, such as SCSI, TCP/IP, or
FICON.
• ANSI T11 is the technical committee who produced FC Protocol (interface standard) for
high-performance and mass storage applications.
• FC Protocol is designed to transport multiple protocols, such as HIPPI, IPI, SCSI, IP, Ethernet, etc.
77
RAID and Storage Networking Technologies
• Scalability
• Theoretical limit: Appx. 15 million devices
• Secure Access
Additional Task
Research on Bladed Switch
Technology Storage Storage 78
Array Array
RAID and Storage Networking Technologies
IP FC SAN
network
FC Switch
FC Switch
FC Hub FC Switch
FC Switch
FC Switch
Examples of nodes
• Hosts, storage and tape library
Node
Tx
• Ports are available on: Port 0 Rx
Port 0 Link
• HBA in host Port 1
81
RAID and Storage Networking Technologies
Node Connectors:
• SC Duplex Connectors
• LC Duplex Connectors
ST Connector
83
RAID and Storage Networking Technologies
• Hubs
Director
84
RAID and Storage Networking Technologies
• Business Continuity
Servers
85
RAID and Storage Networking Technologies
86
RAID and Storage Networking Technologies
FC Connectivity
87
RAID and Storage Networking Technologies
• Limited connectivity
Storage Array
Servers
88
RAID and Storage Networking Technologies
FC Hub
Storage Array
Servers
89
RAID and Storage Networking Technologies
FC-AL Transmission
Node A Node D
NL_Port #1
NL_Port #1 Transmi Hub_PtB Hub_Pt NL_Port #2
NL_Port #2
Receive
HBA t Bypy HBA
HBA p HBA
B
Byp
y Transmi
Receive p t
Node B Node C
Transmi B NL_Port #3
#3
NL_Port #4 Receive NL_Port
NL_Port #4 t Bypy FA
HBA p B HBA
Array Port
Bypy Transmi
Receive Hub_Pt Hub_Ptp t
90
RAID and Storage Networking Technologies
Servers
91
RAID and Storage Networking Technologies
FC-SW Transmission
Node A Node D
NL_Port #1 Transmi Port Port N_Port #2
NL_Port #2
N_Port #1 Receive
HBA t HBA
Storage Port
HBA
Transmi
Receive
t
Node B Node C
NL_Port #4 Transmi
N_Port #4 Receive N_Port #3
HBA t
HBA Storage Port
Transmi
Receive Port Port t
92
RAID and Storage Networking Technologies
Port Types
Hos
t ?NL- ?
NL-
Port Port
Tape
?
FC Library
Hos Hub
t NL-
Port
Hos
t ?N- F-
? ?
FC
Port FL-
Port Switch
Port
? ?
FC
Switch
F-
? ?F-
?N- ?
E- E-
Port Port Port Port
N-
Port Port
Storage Array Storage Array
93
RAID and Storage Networking Technologies
• ISLs are used to transfer host-to-storage data as well as the fabric management traffic from one switch
to another.
Multimode Fiber
1Gb=500m 2Gb=300m
FC Switch FC Switch
Single-mode Fiber
up to10 km
FC Switch FC Switch
94
RAID and Storage Networking Technologies
96
RAID and Storage Networking Technologies
• Describe layers of FC
• Discuss FC addressing
Additional Task
Research on Flow Control and Class of FC Services
97
RAID and Storage Networking Technologies
FC Architecture Overview
• FC uses channel technology
• Provide high performance with low protocol overheads
• FCP is SCSI-3 over FC network
• Sustained transmission bandwidth over long distances
• Provides speeds up to 8 Gb/s (8 GFC)
• FCP has five layers:
• FC-4 Applicatio
n
• FC-2 FC- SCS HIPP ESCO AT I
• FC-1 4 I I N M P
FC- Framing/Flow
• FC-0 Control
2
*FC-3 is not yet implemented FC- Encode/Deco
1 de
FC- 1 2 4 8
0 G G G G
b b b b
98 /s /s /s /s
RAID and Storage Networking Technologies
FC-4 Mapping interface Mapping upper layer protocol (e.g. SCSI-3 to FC transport
FC-2 Routing, flow control Frame structure, ports, FC addressing, buffer credits
• Address Format:
100
RAID and Storage Networking Technologies
101
RAID and Storage Networking Technologies
102
RAID and Storage Networking Technologies
• Exchange operations
• Maps to sequence
• Sequence
• Frames
104
RAID and Storage Networking Technologies
FC Topologies
105
RAID and Storage Networking Technologies
106
RAID and Storage Networking Technologies
• High Availability t
c
t
c
t
c
h h h
• Medium Scalability
Core Direct Direct
• Medium to maximum Connectivity Serv
e
T o
r
o
r
i Storage
r e Dual-core Arr
r ay
topology
107
RAID and Storage Networking Technologies
Storage Storage
Arr
108 Arr
ay ay
RAID and Storage Networking Technologies
FC
SAN
Storage
Array Array
port
HB
Serve
A
rs 109
RAID and Storage Networking Technologies
FC Protocol Layers
110
RAID and Storage Networking Technologies
SAN Topologies
• Point-to-point
• Switched
111
RAID and Storage Networking Technologies
Point-to-Point
• The point-to-point topology is the easiest Fibre Channel configuration to implement, and it is also the
easiest to administer.
112
RAID and Storage Networking Technologies
Arbitrated Loop
113
RAID and Storage Networking Technologies
Switched FC
• Fibre Channel-switches function in a manner similar to traditional network switches to provide increased
bandwidth, scalable performance, an increased number of devices, and, in some cases, increased
redundancy. Fibre Channel-switches vary in the number of ports and media types they support.
• Multiple switches can be connected to form a switch fabric capable of supporting a large number of host
servers and storage subsystems.
Server
s Fibre Channel
Storage
Switch
Device
114
RAID and Storage Networking Technologies
Network-Attached Storage
115
RAID and Storage Networking Technologies
Network-Attached Storage
After completing this topic, you will be able to:
116
RAID and Storage Networking Technologies
NAS
▪ NAS is a dedicated storage device, and it operates in a client/server mode.
▪ NAS is connected to the file server via LAN.
▪ Protocol: NFS (or CIFS) over an IP Network
• Network File System (NFS) – UNIX/Linux
• Common Internet File System (CIFS) – Windows Remote file system (drives) mounted on the local
system (drives).
o Evolved from Microsoft NetBIOS, NetBIOS over TCP/IP (NBT), and Server Message Block
(SMB).
▪ Advantage: No distance limitation
▪ Disadvantage: Speed and Latency
▪ Weakness: Security
117
RAID and Storage Networking Technologies
• File Sharing
• Traditional client/server model, implemented with file-sharing protocols for remote file sharing
118
RAID and Storage Networking Technologies
What is NAS ?
Clients
Application Print
Server Server
NAS Device
Applicatio File
ns Syst
Print Operating
em
Drive System
File rs Netwo
Syst rk
Operating
em
System
Netwo
rk
Single
NASFunctio
n Devi
ce
General Purpose
Servers or
(Windows
UNIX)
121
RAID and Storage Networking Technologies
Benefits of NAS
• Supports comprehensive access to information.
• Centralises storage
• Simplifies management
• Scalability
122
RAID and Storage Networking Technologies
Components of NAS
UNIX
NFS Network Interface
NAS Head
NFS CIFS
IP
NAS Device OS
Storage Interface
CIFS
Windows
Storage Array
123
RAID and Storage Networking Technologies
• Traditional Microsoft environment file sharing protocol, based upon the Server Message
Block protocol.
124
RAID and Storage Networking Technologies
• Client/server application.
• Mount points grant access to remote hierarchical file structures for local file system structures.
Additional Task
Research on NFS and CIFS
125
RAID and Storage Networking Technologies
• Stateful Protocol
• Can automatically restore connections and reopen files that were open prior to interruption.
• CIFS runs over TCP/IP and uses DNS (Domain Naming Service) for name resolution.
126
RAID and Storage Networking Technologies
NAS I/O
Applicatio Storage
n Interface
3
Operating Network
System Protocol Block I/O
to
storage
I/O NAS Operating device
Redire System
ct
NFS / NFS / Storage
CIFS Client uses CIFS Array
file I/O
TCP/IP TCP/IP
Stack 1 Stack
Network Network
Interface Interface
4
Clien IP NAS
t Netw Devic
ork e
127
RAID and Storage Networking Technologies
NAS Implementations
Integrated
NAS
I
P
NAS
Devic
e
Gateway
NAS
I FC
P SAN
NAS Head
Storage
Array 128
RAID and Storage Networking Technologies
Application Server
IP
Application Server
129
RAID and Storage Networking Technologies
Clien
t
I
FC
P
SAN
Clien
t
Application
Server
Storage
Array
Clien
t NAS
Gateway
Additional Task
Research on suitable environments for implementing Integrated and Gateway NAS 130
RAID and Storage Networking Technologies
Hosting and Accessing Files on the NAS – File sharing protocols (NFS and CIFS)
Steps to host a file system:
• Create an array volume
• Assign volume to NAS device
• Create a file system on the volume
• Mount the file system
• Access the file system
• Use NFS in UNIX environment
• Execute mount/nfsmount command
• Use CIFS in windows environment
• Map the network drive as:
\\Account1\Act_Rep
131
RAID and Storage Networking Technologies
NAS Implementation
SMB
NetBIOS
TCP
IP
802.3
132
RAID and Storage Networking Technologies
NAS Implementation
NFS
TCP
IP
802.3
133
RAID and Storage Networking Technologies
NAS vs SAN
▪ Traditionally:
• NAS is used for low-volume access to a large amount of storage by many users.
• SAN is the solution for terabytes (1012) of storage and multiple, simultaneous access to files, such
as streaming audio/video.
▪ The lines are becoming blurred between the two technologies now, and while the SAN-versus-NAS
debate continues, the fact is that both technologies complement each another.
134
RAID and Storage Networking Technologies
Data TCP/IP,
IDE/SCSI Fibre Channel
Transmission Ethernet
clients or clients or
Access Mode servers
servers servers
11. Which one of the given options supports IDE/SCSI as data transmission mode?
a) DAS
b) NAS
c) SAN
Answer: a
136
RAID and Storage Networking Technologies
a) DAS
b) NAS
c) SAN
Answer: b
137
RAID and Storage Networking Technologies
a) DAS
b) NAS
c) SAN
Answer: c
138
RAID and Storage Networking Technologies
Summary
• The more the number of HDDs in a storage array, the larger the probability for disk failure.
• Fiber Channel-switches function in a manner similar to traditional network switches to provide increased
bandwidth, scalable performance, an increased number of devices, and, in some cases, increased
redundancy.
• ISLs are used to transfer host-to-storage data as well as the fabric management traffic from one switch to
another.
• A Storage Area Network (SAN) is a specialised, dedicated high speed network joining servers and storage,
including disks, disk arrays, tapes, etc.
• Fiber Channel is the de facto SAN networking architecture, although other network standards could be
used.
• FC Protocol is designed to transport multiple protocols, such as HIPPI, IPI, SCSI, IP, Ethernet, etc.
• When deciding the number of disks required for an application, it is important to consider the impact of
RAID based on IOPS generated by the application.
139
RAID and Storage Networking Technologies
Assignment
140
RAID and Storage Networking Technologies
Assignment
8. Recommend the techniques to determine the data availability and performance characteristics of a
redundant array.
8. Consider three users A, B and C. The user B wants to store the two copies of data on two different HDDs,
user A wants to store the parts of data on different disks and user C wants to protect and store the
segments of the data. Identify the RAID techniques that can be used to store the user A, B and C data.
8. List the various RAID levels with the RAID techniques adopted.
8. Explain RAID 0 & 1 levels to improve the disk performance with a diagram.
8. Explain the nested RAID levels to improve the disk performance with a diagram.
141
RAID and Storage Networking Technologies
Assignment
14. Consider an application that generates 5,200 IOPS, with 60% of them being reads. Compute the disk
load in RAID 5 RAID 1.
14. Data Centre managers are burdened with the challenging task of providing low-cost, high-performance
information management solutions. Justify the statement.
Assignment
21. Distinguish between multi-mode and single-mode fiber cables for data transfer in FC SAN.
21. FC architecture supports three basic interconnectivity options. List, explain and compare.
21. Explain the FC topologies most commonly deployed in SAN implementations with a diagram.
Document Links
144
RAID and Storage Networking Technologies
Video Links
Storage Area Network https://www.youtube.com/watch?v=7et3bmXhr8w This video explains the building blocks of SAN
145
RAID and Storage Networking Technologies
E-Book Links
https://www.mindshare.com/files/ebooks/SAS%20Storage
Storage Systems Architecture ALL
%20Architecture.pdf
146
DEPARTMENT OF BCA
STORAGE MANAGEMENT
SUB CODE: 16BCA525D21
MODULE 4: BACKUP AND RECOVERY
Module 4 :
Introduction to Business Continuity - Backup Purpose -Disaster Recovery - Operational Backup –
Archival, Backup Considerations, Backup Granularity, Recovery Considerations, Backup Methods ,
Backup Process, Backup and Restore Operations, Backup Topologies - Server less Backup , Backup
Technologies -Backup to Tape - Physical Tape Library - Backup to Disk - Virtual Tape Library
10/10/2021 2
V SEM, SM, DEPARTMENT OF BCA
Back up and Recovery
AIM:
To familiarise students with in depth knowledge on fundamentals of Business Continuity, Backup of data,
Disaster Recovery and various restore options of data.
3
Back up and Recovery
Objectives
• To understand the business continuity process due to the adverse operations in business and its impact on
back up and recovery of data.
• To create additional copy of data that can be used for restore and recovery purposes.
• Identify the data to be backed up is transferred from the backup client (Source).
4
Back up and Recovery
Outcomes:
At the end of this module, you are expected to:
• Define Business Continuity and Information Availability.
• Detail impact of information unavailability.
• Define BC measurement and terminologies.
• Describe BC planning process.
• Explain BC technology solutions.
• Define Backup and backup consideration.
• Describe purposes of backup.
• Explain backup granularity and restore.
• List backup methods.
• Describe backup/recovery process and operation.
5
Back up and Recovery
Content
• Business continuance infrastructure services
• The need for redundancy
• Information availability
• BC terminology and life cycle
• BC technology solutions
• Backup technologies and recovery considerations
• Restore and restart considerations
6
Back up and Recovery
Business Continuity
7
Back up and Recovery
• Continuous access to information is a must for the smooth functioning of business operations today,
as the cost of business disruption could be catastrophic.
• There are many threats to information availability, such as natural disasters (e.g., flood, fire,
earthquake), unplanned occurrences (e.g., cybercrime, human error, network and computer failure),
and planned occurrences (e.g., upgrades, backup, restore) that result in the inaccessibility of
information.
• It is critical for businesses to define appropriate plans that can help them overcome these crises.
• Business continuity is an important process to define and implement these plans.
• Business Continuity is preparing for, responding to, and recovering from an application outage that
adversely affects business operations.
8
Back up and Recovery
9
Back up and Recovery
Information Availability
10
Back up and Recovery
Information Availability
Information availability refers to the ability of an infrastructure to function according to business
expectations during its specified time of operation.
11
Back up and Recovery
Information Availability
12
Back up and Recovery
Information Availability
• Reliability
The components delivering the information should be able to function without failure, under stated
conditions, for a specified amount of time.
• Accessibility
Information should be accessible at the right place and to the right user.
• Timeliness
Information must be available whenever required.
13
Back up and Recovery
14
Back up and Recovery
• Planned outages
Planned outages include installation/integration/maintenance of new hardware, software upgrades or
patches, taking backups, application and data restores, facility operations (renovation and construction)
and refresh/migration of the testing to the production environment.
• Unplanned outages
Unplanned outages include failure caused by database corruption, component failure, and human
errors.
15
Back up and Recovery
Impact of Downtime
Lost Productivity Know the downtime costs (per Lost Revenue
• Number of employees hour, day, two days...) • Direct loss
impacted (x hours out * • Compensatory payments
hourly rate) • Lost future revenue
• Billing losses
• Investment losses
Other Expenses
Temporary employees, equipment rental, overtime
costs, extra shipping costs, travel expenses... 16
Back up and Recovery
Impact of Downtime
Average cost of downtime per hour = average productivity loss per hour + average revenue loss per
hour
Where,
Productivity loss per hour = (total salaries and benefits of all employees per week) / (average number of
working hours per week).
Average revenue loss per hour = (total revenue of an organisation per week) / (average number of hours per
week that an organisations is open for business).
17
Back up and Recovery
Time
Incident Diagnosis Recovery Incident
18
Back up and Recovery
19
Back up and Recovery
20
Back up and Recovery
Answer: a
21
Back up and Recovery
2. __________________ includes failure caused by database corruption, component failure, and human
errors.
a. Unplanned outages
b. Mean To Time Repair
c. Planned outages
Answer: a
22
Back up and Recovery
a. Unplanned outages
b. Mean to time repair
c. Planned outages
d. Catastrophic
Answer: d
23
Back up and Recovery
4. Which one of the given options is the unplanned occurrence threat to information security?
a. Earthquake
b. Human error
c. Upgrades
Answer: b
24
Back up and Recovery
5. Which one of the given options is the planned occurrence threat to information security?
a. Earthquake
b. Human error
c. Upgrades
Answer: c
25
Back up and Recovery
a. Earthquake
b. human error
c. upgrades
Answer: a
26
Back up and Recovery
Business Continuity
(BC) Terminologies
27
Back up and Recovery
BC Terminologies
The Business Continuity defines common terms related to Business Continuity operations.
28
Back up and Recovery
BC Terminologies
Disaster recovery
• Coordinated process of restoring systems, data, and infrastructure required to support ongoing
business operations in the event of a disaster.
• Restoring previous copy of data and applying logs to that copy to bring it to a known point of
consistency.
• Generally implies use of backup technology.
29
Back up and Recovery
BC Terminologies
Disaster restart
• Process of restarting from disaster using mirrored consistent copies of data and applications.
• Generally implies use of replication technologies.
30
Back up and Recovery
BC Terminologies
31
Back up and Recovery
BC Terminologies
Recovery Time Objective (RTO)
• Time within which systems, applications, or functions must be recovered after an outage.
• Amount of downtime that a business can endure and survive.
32
Back up and Recovery
Business Continuity
(BC) Planning Lifecycle
33
Back up and Recovery
34
Back up and Recovery
BC planning must follow a disciplined approach like any other planning process. Organisations today
dedicate specialised resources to develop and maintain BC plans. From the conceptualisation to the
realisation of the BC plan, a lifecycle of activities can be defined for the BC process.
35
Back up and Recovery
1. Establishing objectives
2. Analysing
4. Implementing
36
Back up and Recovery
Establishing
objectives
Training,
testing,
Analysing
assessing, and
maintaining
Implementing
Designing and
developing
37
Back up and Recovery
Establishing objectives
• Determine BC requirements.
• Estimate the scope and budget to achieve requirements.
• Select a BC team by considering subject matter experts from all areas of the business, whether
internal or external.
• Create BC policies.
38
Back up and Recovery
Analysing
• Collect information on data profiles, business processes, infrastructure support, dependencies, and
frequency of using business infrastructure.
• Identify critical business needs and assign recovery priorities.
• Create a risk analysis for critical areas and mitigation strategies.
• Conduct a Business Impact Analysis (BIA).
• Create a cost and benefit analysis based on the consequences of data unavailability.
• Evaluate options.
39
Back up and Recovery
• Define the team structure and assign individual roles and responsibilities. For example, different teams
are formed for activities such as emergency response, damage assessment, and infrastructure and
application recovery.
• Design data protection strategies and develop infrastructure.
• Develop contingency scenarios.
• Develop emergency response procedures.
• Detail recovery and restart procedures.
40
Back up and Recovery
Implementing
• Implement risk management and mitigation procedures that include backup, replication, and
management of resources.
• Prepare the disaster recovery sites that can be utilised if a disaster affects the primary data centre.
• Implement redundancy for every resource in a data centre to avoid single points of failure.
41
Back up and Recovery
• Train the employees who are responsible for backup and replication of business-critical data on a
regular basis or whenever there is a modification in the BC plan.
• Train employees on emergency response procedures when disasters are declared.
• Train the recovery team on recovery procedures based on contingency scenarios.
• Perform damage assessment processes and review recovery plans.
• Test the BC plan regularly to evaluate its performance and identify its limitations.
• Assess the performance reports and identify limitations.
• Update the BC plans and recovery/restart procedures to reflect regular changes within the data centre.
42
Back up and Recovery
ii. Implementing
a) Only i
b) Only ii
c) Only i and ii
Answer: d
43
Back up and Recovery
BC Technology Solutions
44
Back up and Recovery
Failure Analysis
Failure analysis involves Analysing the data centre to identify systems that are susceptible to a single point
of failure and implementing fault-tolerance mechanisms such as redundancy.
45
Back up and Recovery
Single Points of Failure refers to the failure of a component of a system that can terminate the availability of
the entire system or IT service.
46
Back up and Recovery
Fault Tolerance
To mitigate a single point of failure, systems are designed with redundancy, such that the system will fail,
only if all the components in the redundancy group fail. This ensures that the failure of a single
component does not affect data availability.
47
Back up and Recovery
Heartbeat Connection
Redundant Ports
Client
FC Switches
IP
Remote Site
Redundant Network
Redundant Paths
Redundant FC Switches
48
Back up and Recovery
Multi-pathing Software
49
Back up and Recovery
8. _________________ software helps to recognise and utilises alternate I/O path to data.
a. Multi-pathing
b. Single-pathing
c. Training
Answer: a
50
Back up and Recovery
9. _______________ refers to the failure of a component of a system that can terminate the
availability of the entire system.
c. Maintaining
Answer: a
51
Back up and Recovery
52
Back up and Recovery
• Backup is an additional copy of data that can be used for restore and recovery purposes.
• The Backup copy is used when the primary copy is lost or corrupted.
• Businesses back up their data to enable its recovery in case of potential loss.
• Businesses also back up their data to comply with regulatory requirements.
53
Back up and Recovery
Backup purposes
• Disaster Recovery
Restores production data to an operational state after disaster.
• Operational
Restores data in the event of data loss or logical corruptions that may occur during routine processing.
• Archival
Preserve transaction records, email, and other business work products for regulatory compliance.
54
Back up and Recovery
55
Back up and Recovery
Backup Speed
When an appliance is first added to the system, the first complete backup will take some time.
However, once the first backup is complete, updates should take only a few minutes each day, and
be virtually unnoticed by end users on the network, assuming business-grade network connectivity
is in place. If a slow backup causes a general network slowdown for end users, IT might be forced
to suspend backups, completely derailing the data protection plan.
56
Back up and Recovery
Recovery Time
Backup speed is important, but recovery speed is where the data protection plan proves its worth. In the
event of a system crash or disaster, how fast can critical data be recovered and accessed? Businesses
should establish a Recovery Time Objective (RTO) and a Recovery Point Objective (RPO). RTO is the
time by which a business process must be restored after a disaster or disruption to avoid unacceptable
consequences of a break in business continuity. RPO is essentially the minimum frequency of backups –
which translates to the amount of time and data that would be acceptable to lose in the event of a disaster.
57
Back up and Recovery
Reliability
Once you define the strategy, it is critical to implement that strategy on an infrastructure you can trust.
By ensuring both best practices and high-quality, reputable storage hardware and software systems, the
organisation can be confident that backed up data will be there when it needs to be recovered.
58
Back up and Recovery
Simplicity
If the data protection strategy and systems are not easy to implement, you risk mistakes or missed backups
at critical times. Additionally, ease of use and management capabilities are critical for evaluating long-term
resources necessary to maintain an effective strategy, especially when IT resources are limited or
outsourced. While, some technology platforms may be more affordable, an administrator’s time to monitor
could add significantly to cost. Look for solutions that enable storage administrators to set policies so that
backups run seamlessly without much need for interaction.
59
Back up and Recovery
Cost Efficiency
Key Total Cost of Ownership (TCO) considerations includes energy efficiency, maintenance, and scaling up
the system to manage data growth. A tiered storage strategy is critical here. Tying storage system
performance and cost to the value of the data, and quickly moving fixed content and other non-production
information onto reliable, lower cost systems will help manage cost while delivering on the access level and
recovery time objectives required.
60
Back up and Recovery
10. Which one of the given options are considered for Backup and Recovery?
i. Backup Speed
iii. Reliability
a) Only i
b) Only ii
c) Only i and ii
Answer: d
61
Back up and Recovery
a. Simply copying
b. Fatching
c. Reliability
Answer: a
62
Back up and Recovery
Backup Granularity
63
Back up and Recovery
Backup granularity
Backup granularity describes the level of detail characterising backup data. In data backup and restore
systems, increased granularity improves recovery point objectives because, the backup process
operates using more discrete granules or increments of data, minimising the amount of data lost during
a system restore.
64
Back up and Recovery
Su Su Su Su Su
Su M T W T F S Su M T W T F S Su M T W T F S Su M T W T F S Su
Incremental Backup
Su M T W T F S Su M T W T F S Su M T W T F S Su M T W T F S Su
• Files that have changed since the last backup are backed up.
• Fewest amount of files to be backed up therefore, faster backup and less storage space.
• Longer restore because last full and all subsequent incremental backups must be applied.
66
Back up and Recovery
Production
67
Back up and Recovery
• More files to be backed up, therefore it takes more time to backup and uses more storage space.
• Much faster restore because only the last full and the last cumulative backup must be applied.
68
Back up and Recovery
Production
69
Back up and Recovery
a. Backup granularity
b. Data Recovery
c. Reliability
Answer: d
70
Back up and Recovery
Backup Methods
71
Back up and Recovery
Backup Methods
• Cold or offline
• Hot or online
• Open file
• Retry
• Open File Agents
• Point in Time (PIT) replica
• Backup file metadata for consistency
• Bare metal recovery
72
Back up and Recovery
Backup Process
A backup system uses client/server architecture with a backup server and multiple backup clients. The backup
server manages the backup operations and maintains the backup catalogueue, which contains information
about the backup process and backup metadata.
73
Back up and Recovery
Backup Architecture
1. Backup client
Sends backup data to backup server or storage node.
2. Backup server
Manages backup operations and maintains backup catalogueue.
3. Storage node
Responsible for writing data to backup device.
74
Back up and Recovery
Tape Library
75
Back up and Recovery
Backup Operation
76
Back up and Recovery
Backup Operation
Application Server and Backup Clients
3 5
1 4 6
8 7
Restore Operation
1. Backup server scans backup catalogue to identify data to be restored and the client who will receive data.
2. Backup server instructs storage node to load backup media in backup device.
3. Data is then read and sent to backup client.
4. Storage node sends restore metadata to backup server.
5. Backup server updates catalogue.
78
Back up and Recovery
Restore Operation
Application Server and Backup Clients
1 2 3
5 4
a. Backup granularity
b. Backup server
c. Reliability
Answer: b
80
Back up and Recovery
Backup Topologies
81
Back up and Recovery
Backup Topologies
82
Back up and Recovery
Metadata Data
LAN
Backup Server
Backup Device
Application Server
and Backup Client
and Storage Node
83
Back up and Recovery
In LAN-based backup, all servers are connected to the LAN and all storage devices are directly attached
to the storage node. The data to be backed up is transferred from the backup client (source), to the
backup device over the LAN, which may affect network performance.
84
Back up and Recovery
Metadata
LAN
Data
Backup Device
Storage Node
85
Back up and Recovery
The SAN-based backup is also known as the LAN-free backup. The SAN-based backup topology is the
most appropriate solution, when a backup device needs to be shared among the clients. In this case, the
backup device and clients are attached to the SAN.
86
Back up and Recovery
LAN FC SAN
Metadata
Data
Storage Node
87
Back up and Recovery
Mixed Backup
The mixed topology uses both the LAN-based and SAN-based topologies. This topology might be
implemented for several reasons, including cost, server location, reduction in administrative overhead, and
performance considerations.
88
Back up and Recovery
Mixed Backup
Application Server
and Backup Client
Metadata
LAN FC SAN
Metadata Data
Storage Node
89
Back up and Recovery
Backup Technology
90
Back up and Recovery
Backup Technology
A wide range of technology solutions are currently available for backup. Tapes and disks are the two most
commonly used backup media.
91
Back up and Recovery
Backup to Tape
• Traditional destination for backup
• Low cost option
• Sequential / Linear Access
• Multiple streaming (Backup streams from multiple clients to a single backup device)
Data from
Stream 1 Data from
Stream 2 Data from
Stream 3
Tape
92
Back up and Recovery
93
Back up and Recovery
Tape Limitations
• Reliability
• Restore performance
• Mount, load to ready, rewind, dismount times
• Sequential Access
• Cannot be accessed by multiple hosts simultaneously
• Controlled environment for tape storage
• Wear and tear of tape
• Shipping/handling challenges
• Tape management challenges
94
Back up and Recovery
Backup to Disk
• Ease of implementation
• Fast access
• More Reliable
• Random Access
• Multiple host access
• Enhanced overall backup and recovery performance
95
Back up and Recovery
A Virtual Tape Library (VTL) has the same components as that of a physical tape library except that the
majority of the components are presented as virtual resources. For backup software, there is no
difference between a physical tape library and a virtual tape library.
96
Back up and Recovery
Emulation Engine
Storage (LUNs)
Backup Clients 97
Back up and Recovery
Disk-Aware
Difference Tape Virtual Tape
Backup-to-Disk
No inherent protection
Reliability RAID, spare RAID, spare
methods
Subject to mechanical
Performance Faster single stream Faster single stream
operations, load times
Multiple
Use Backup only Backup only
(backup/production)
98
Back up and Recovery
a. Backup Technology
b. Backup Topology
c. Recovery Method
Answer: b
99
Back up and Recovery
i. Backup to Disk
ii. Backup to virtual tape
iii. Backup to Tape (Physical Tape Library)
a) Only i
b) Only ii
c) Only i and ii
d) All i, ii and iii
Answer: d
10
Back up and Recovery
Assignment
1. A network router has a failure rate of 0.02 percent per 1,000 hours. What is the MTBF of that component?
2. Provide examples of planned and unplanned downtime in the context of data centre operations.
3. How does clustering help to minimise RTO?
4. Discuss the security concerns in backup environment.
5. Differentiate the benefits of using “virtual tape library” over “physical tapes.”
6. What is the necessity of business continuity?
7. Discuss the threats to information availability.
8. Define Information availability.
9. Explain the causes of Information Unavailability.
10. Explain how to measure Information Availability.
10
Back up and Recovery
Assignment
11. Discuss the Consequences of Downtime of data availability.
12. Elaborate the strategies to meet RPO and RTO targets.
13. With a neat diagram, explain the BC Planning Lifecycle.
14. With a neat diagram, explain Single point of failure analysis.
15. Explain enhancements in the infrastructure to mitigate single points of failures.
16. Discuss the BC Technology Solutions where data can be recovered and business operations can be
restarted using an alternate copy of data.
17. Explain the importance of backup of data.
18. Compare Cumulative (or differential) and Synthetic (or constructed) full backup.
19. Explain different types of backup methods.
20. With a neat diagram, explain the backup process. 10
Back up and Recovery
Assignment
19. Differentiate between various backup technology options.
20. Explain the basic topologies that are used in a backup environment.
21. Demonstrate how RPO and RTO are the major considerations when planning a backup strategy.
10
Back up and Recovery
Summary
✓ Business Continuity is preparing for, responding to, and recovering from an application outage that
adversely affects business operations.
✓ Information Availability can be defined in terms of three parameters that is reliability, accessibility and
timeliness.
✓ BC planning must follow a disciplined approach like any other planning process.
✓ Failure analysis involves Analysing the data centre to identify systems that are susceptible to a single
point of failure and implementing fault-tolerance mechanisms such as redundancy.
✓ Backup is an additional copy of data that can be used for restore and recovery purposes.
10
Back up and Recovery
Summary
✓ A backup system uses client/server architecture with a backup server and multiple backup clients.
✓ In LAN-based backup, all servers are connected to the LAN and all storage devices are directly
attached to the storage node. The data to be backed up is transferred from the backup client (source), to
the backup device over the LAN, which may affect network performance.
10
Back up and Recovery
Document Links
Backup technologies
http://www.nexsan.com/key-considerations-backup-
and recovery This link discusses the various backup and
and-recovery/
considerations recovery considerations.
Backup technologies
http://wikibon.org/wiki/v/Backup_and_recovery_techni
and recovery This link discusses the various technique of
ques
considerations backup and recovery
10
Back up and Recovery
Video Links
Introduction to Backup
This video provides the need of
Systems http://youtube.com/watch?v=k6dosJ9phWY
backup systems
E-Book Links
Introduction to
Page No 229 to 241
Business Continuity
http://wachum.org/dewey/000/ism1.pdf
Backup and Recovery Page No 251 to 275
10
DEPARTMENT OF BCA
STORAGE MANAGEMENT
SUB CODE: 16BCA525D21
MODULE 5: REPLICATION-LOCAL AND REMOTE
Module 5 :
Source and Target -Uses of Local Replicas, Data Consistency - Consistency of a Replicated File System -
Consistency of a Replicated Database , Local Replication Technologies - Host-Based Local Replication -
Storage Array-Based Replication , Restore and Restart Considerations - Tracking Changes to Source and
Target , Creating Multiple Replicas, Management Interface – Remote Replication Modes – Remote
Replication Technologies – Network Infrastructure
10/10/2021 2
V SEM, SM, DEPARTMENT OF BCA
Replication – Local and Remote
AIM:
To familiarise students with fundamentals of Replication – Local and Remote.
3
Replication – Local and Remote
Objectives:
The objectives of this module are to:
• Understand exact copy of data through replication.
• Define additional copy of data that can be used for restore and recovery purposes.
• Understand local replication and the possible uses of local replicas.
• Define consistency considerations when replicating file systems and databases.
• Comprehend host and array based replication technologies
• Functionality
• Differences
• Considerations
• Selecting the appropriate technology
4
Replication – Local and Remote
Outcomes:
At the end of this module, you are expected to:
• Explain local replication.
• List the possible uses of local replicas.
• Outline replica considerations such as Recoverability and Consistency.
• Describe how consistency is ensured in file system and database replication.
• Explain Dependent write principle.
5
Replication – Local and Remote
Content
• Local replication technologies
• Host based Local Replication technologies
• LVM based mirroring
• File system snapshot
• Array based Local Replication technologies
• Full volume mirroring
• Pointer-based full volume copy
• Pointer-based virtual replica
• Restore and restart considerations
• Modes of remote replications
• Remote replication technologies 6
Replication – Local and Remote
Local Replicas
7
Replication – Local and Remote
Replication
Replicas mean an exact copy of the original and replication is the process of creating an exact
copy of data.
REPLICATION
8
Replication – Local and Remote
9
Replication – Local and Remote
Replication Considerations
• Types of Replica: choice of replica tie back into RPO (recovery point objective)
• Point-in-Time (PIT)
• Non zero RPO
• Continuous
• Near zero RPO
• What makes a replica good?
• Recoverability/Re-startability
• Replica should be able to restore data on the source device.
• Restart business operation from replica.
• Consistency
• Ensuring consistency is a primary requirement for all the replication technologies.
10
Replication – Local and Remote
Understanding Consistency
• Ensure data buffered in the host is properly captured on the disk when replica is created.
• Data is buffered in the host before written to disk.
• Consistency is required to ensure the usability of replica.
• Consistency can be achieved in various ways:
• For file Systems
• Offline: Un-mount file system
• Online: Flush host buffers
• For Databases
• Offline: Shutdown database
• Online: Database in hot backup mode
• Dependent Write I/O Principle
• By Holding I/Os
11
Replication – Local and Remote
Application
Source Replica
12
Replication – Local and Remote
• e.g. Page (data) write is dependent write I/O based on a successful log write.
1 1 1
2 2 2
3 3 3 3
4 4 4 4
👍 Consistent
👎 Inconsistent
14
Replication – Local and Remote
Source Replica
1 1
5 5 2 2
3 3
4 4
Consistent
15
Replication – Local and Remote
16
Replication – Local and Remote
17
Replication – Local and Remote
a. Local only
b. Remote only
18
Replication – Local and Remote
a. Sink
b. Source
c. Product
d. Production
Answer: Production
19
Replication – Local and Remote
a. target LUN
c. Replica
d. All of Above
20
Replication – Local and Remote
a. Front
b. Back
c. Backup
d. Source
Answer: Backup
21
Replication – Local and Remote
a. In consistent
b. Consistent
c. Full
d. Partial
Answer: Consistent
22
Replication – Local and Remote
a. Awk
b. background
c. Sync
d. Async
Answer: Sync
23
Replication – Local and Remote
25
Replication – Local and Remote
26
Replication – Local and Remote
In LVM-based replication, logical volume manager is responsible for creating and controlling the
host-level logical volume.
27
Replication – Local and Remote
Physical
Volume 1
Host Logical Volume
Logical Volume
Physical
Volume 2
28
Replication – Local and Remote
29
Replication – Local and Remote
File system (FS) snapshot is a pointer-based replica that requires a fraction of the space used by
the original FS. This snapshot can be implemented by either FS itself or by LVM. It uses Copy on
First Write (CoFW) principle.
30
Replication – Local and Remote
Array
Source Replica
31
Replication – Local and Remote
32
Replication – Local and Remote
Attached
Source Target
Array
33
Replication – Local and Remote
34
Replication – Local and Remote
Source Target
Write to
Read/Write
Target Read/Write
Source Target
Read from
Read/Write
Target Read/Write
Source Target 35
Replication – Local and Remote
36
Replication – Local and Remote
Source
Target
Virtual Device
A
B C
C’
C’
Write to Source C
37
Replication – Local and Remote
Source
Target
Virtual Device
A
A’
A
B A
C’ C
38
Replication – Local and Remote
Sourc Save
e Location
39
Replication – Local and Remote
40
Replication – Local and Remote
Source 0 0 0 0 0 0 0 0
At PIT
Target 0 0 0 0 0 0 0 0
Source 1 0 0 1 0 1 0 0
After PIT…
Target 0 0 1 1 0 0 0 1
For resynchronization/restore
Logical OR 1 0 1 1 0 1 0 1
0 = unchanged 1 = changed 41
Replication – Local and Remote
Restore/Restart Operation
• Source has a failure
• Logical Corruption
• Physical failure of source devices
• Failure of Production server
• Solution
• Restores data from target to source
• The restores would typically be done incrementally.
• Applications can be restarted even before synchronisation is complete.
-----OR------
• Start production on target
• Resolve issues with source while continuing operations on target.
• After issue resolution restore latest data on target to source.
42
Replication – Local and Remote
Restore/Restart Considerations
• Before a Restore
• Stop all access to the Source and Target devices.
• Identify target to be used for restore.
• Based on RPO and Data Consistency.
• Perform Restore
• Before starting production on target
• Stop all access to the Source and Target devices.
• Identify Target to be used for restart.
• Based on RPO and Data Consistency.
• Create a “Gold” copy of Target
• As a precaution against further failures
• Start production on Target
43
Replication – Local and Remote
44
Replication – Local and Remote
46
Replication – Local and Remote
06:00 A.M.
Source
12:00 P.M.
Point-In-Time
06:00 P.M.
12:00 A.M.
: 12 : 01 : 02 : 03 : 04 : 05 : 06 : 07 : 08 : 09 : 10 : 11 : 12 : 01 : 02 : 03 : 04 : 05 : 06 : 07 : 08 : 09 : 10 : 11 :
A.M. P.M. 47
Replication – Local and Remote
• CLI
• GUI
48
Replication – Local and Remote
d. Host-based remote
49
Replication – Local and Remote
50
Replication – Local and Remote
b. Only synchronous
c. Only Asynchronous
d. None
51
Replication – Local and Remote
d. Pointer-based replica
53
Replication – Local and Remote
Remote Replication
54
Replication – Local and Remote
Remote Replication
Remote Replication is a process of creating replicas at remote sites.
REPLICATION
55
Replication – Local and Remote
56
Replication – Local and Remote
Synchronous Replication
• A write is committed to both source and remote replica before it is acknowledged to the host.
• Ensures source and replica have identical data at all times.
• Provides near-zero RPO.
• Response time depends on bandwidth and distance.
• Requires bandwidth more than the maximum write workload.
• Typically deployed for distance less than 200 km (125 miles) between two sites.
57
Replication – Local and Remote
Synchronous Replication
Required bandwidth
Source
1
Max
Typical workload
4
Host Writes
3 2 MB/s
Target at Time
Remote Site
Data Write
Data Acknowledgment
58
Replication – Local and Remote
Asynchronous Replication
• A write is committed to the source and immediately acknowledged to the host.
• Data is buffered at the source and transmitted to the remote site later.
• Finite Recovery Point Objective (RPO).
• RPO depends on size of buffer and available network bandwidth.
• Requires bandwidth equal to or greater than average write workload.
• Sufficient buffer capacity should be provisioned.
• Can be deployed over long distances.
59
Replication – Local and Remote
Asynchronous Replication
Typical
Source workload
1
Required
Writes bandwidth
2 MB/s
Host 4 3 Average
Target
Time
Data Write
Data Acknowledgment
60
Replication – Local and Remote
Synchronous Replication
• A write must be committed to the source and remote replica
before it is acknowledged to the host.
• Ensures source and remote replica have identical data at all times Source
• Write ordering is maintained 1
• Replica receives writes in exactly the same order as the
source. 4
• Synchronous replication provides the lowest RPO and RTO Host
3 2
• Goal is zero RPO.
• RTO is as small as the time it takes to start application on the Data Write
target site. Data Acknowledgement
Target
62
Replication – Local and Remote
63
Replication – Local and Remote
Asynchronous Replication
• Write is committed to the source and immediately
acknowledged to the host.
Source
• Data is buffered at the source and transmitted to the remote 1
site later.
• Some vendors maintain write ordering. 2
• Other vendors do not maintain write ordering, but
ensure that the replica will always be a consistent re- Host 4 3
startable image.
Data Write
• Finite RPO Data Acknowledgement
64
Replication – Local and Remote
65
Replication – Local and Remote
• Log Shipping
66
Replication – Local and Remote
LVM-based
• Duplicate Volume Groups at source and target sites
• All writes to the source Volume Group are replicated to
the target Volume Group by the LVM.
• Can be synchronous or asynchronous mode.
• In the event of a network failure
I
• Writes are queued in the log file and sent to target when P
the issue is resolved.
• Size of the log file determines length of outage that can
be withstood.
• Upon failure at source site, production can be transferred to
target site.
67
Replication – Local and Remote
68
Replication – Local and Remote
• Log files are transmitted to the remote host when the DBMS switch log files
• The remote host receives the logs and applies them to remote database
Advantages:
• Minimal overhead
IP
Logs
Original
Logs
Standby
70
Replication – Local and Remote
IP/FC
Network
Source
Distance Replica
Production Server
DR Server
71
Replication – Local and Remote
Network links
Source Target
72
Replication – Local and Remote
Network links
74
Replication – Local and Remote
Source Data
Local Replica
• Typically remote replication operations will be suspended when the remote replicas are used
for BC operations.
• During business operations changes will/could happen to both the source and remote replicas.
• Most remote replication technologies have the ability to track changes made to the source and
remote replicas to allow for incremental re-synchronisation.
• Resuming remote replication operations will require re-synchronisation between the source and
replica.
76
Replication – Local and Remote
77
Replication – Local and Remote
78
Replication – Local and Remote
Synchronous
Disk Buffered
Remote Replica
Local Replica
• Synchronous + Asynchronous
Synchronous Asynchronous
Remote Replica
Local Replica
Asynch
with
Differentia
l
SOURCE
Resynch
80
REMOT
Replication – Local and Remote
SAN/WAN
C C
• Use ESCON (Enterprise System Connectivity) or FC (Fibre Channel) for shorter distance.
• Protocol converters may require to connect ESCON or FC adapters from the arrays to these
networks.
82
Replication – Local and Remote
• Up to 32 protected and 64 unprotected separate wavelengths of data can be multiplexed into a light
stream transmitted on a single optical fiber.
Optical Channels
ESCON
Optical Lambda λ
Fibre Channel Optical Electrical
Gigabit Ethernet
83
Replication – Local and Remote
SD
H 84
Replication – Local and Remote
85
Replication – Local and Remote
86
Replication – Local and Remote
88
Replication – Local and Remote
Network-based Replication
• It provides any-point-in-time recovery capability during its normal operation.
• Continuous Data Protection appliances are present at both source and remote sites.
• Supports both synchronous and asynchronous replication modes.
89
Replication – Local and Remote
Three-site Replication
• Data from source site is replicated to two remote sites.
• Replication is synchronous to one of the remote sites and asynchronous or disk buffered to the other
remote site.
• Mitigates the risk in two site replication.
• Implemented in two ways:
• Cascade/ Multi hop
• Triangle/ Multi target
90
Replication – Local and Remote
a. Only i and ii
b. Only ii and iii
c. Only i and iii
d. All i, ii and iii
a. Synchronous
b. Asynchronous
c. Hybrid
Answer: Asynchronous
92
Replication – Local and Remote
93
Replication – Local and Remote
a. Local
b. Remote
c. Local and Remote
d. Target
Answer: Remote
94
Replication – Local and Remote
a. Local
b. Remote
c. Hybrid
d. Federated
Answer: Federated
95
Replication – Local and Remote
Assignment
1. Describe the uses of a local replica in various business operations.
2. Explain the RPO that can be achieved with synchronous, asynchronous, and disk-buffered remote
replication.
3. What is the difference between a restore operation and a resynchronisation operation with local
replicas? Explain with examples.
4. Describe the uses of a local replica in various business operations.
5. How can consistency be ensured when replicating a database?
6. What are the differences among full volume mirroring and pointer based replicas?
7. What is the key difference between full copy mode and deferred mode?
8. What are the considerations when performing restore operations for each array replication
96
technology?
Replication – Local and Remote
Assignment
9. Differentiate between Synchronous and Asynchronous mode.
12. What are differences in the bandwidth requirements between the array remote replication
technologies discussed in this module?
13. Discuss the effects of a bunker failure in a three-site replication for the following implementation:
2. Multihop—synchronous + asynchronous
3. Multitarget 97
Replication – Local and Remote
Summary
• Remote replication is the process of creating replicas of information assets at remote sites
(locations).
• The two Modes of remote replication are Synchronous and asynchronous mode
• Host-based remote replication uses one or more components of the host to perform and
manage the replication operation. There are two basic approaches to host-based remote
replication: LVM-based replication and database replication via log shipping.
• In storage array-based remote replication, the array operating environment and resources
perform and manage data replication. Synchronous, asynchronous, disk buffered, and three
site replication are the storage based remote replication techniques.
• SAN-based remote replication enables the replication of data between heterogeneous storage
arrays Network options for remote replication
Additional Task
Research on EMC Remote Replication products
98
Replication – Local and Remote
Summary
• Various local Replication technologies were discussed. host-based replication, logical
volume managers (LVMs) or the file systems perform the local replication process.
• LVM-based replication and file system (FS) snapshot are examples of host-based local
replication.
• Full volume mirroring, Pointer-based full volume copy and Pointer-based virtual replica are
the examples of array based local replication technologies.
• In LVM-based replication, logical volume manager is responsible for creating and
controlling the host-level logical volume. File system (FS) snapshot is a pointer-based
replica that requires a fraction of the space used by the original FS.
• In storage array-based local replication, the array operating environment performs the local
replication process.
• In full-volume mirroring, the target is attached to the source and established as a mirror of
the source
99
Replication – Local and Remote
Summary
• Replication is the process of creating an exact copy of data. Replication can be classified into two
major categories: local and remote.
• Consistency considerations need to considered when replicating file systems and databases.
• Host-based and storage-based replications are the two major technologies adopted for local
replication.
• File system replication and LVM-based replication are examples of host-based local replication
technology.
• Storage array–based replication can be implemented with distinct solutions namely, full-volume
mirroring, pointer based full-volume replication, and pointer-based virtual replication.
10
0
Replication – Local and Remote
Summary
• pointer-based full-volume replication Like full-volume mirroring, this technology can
provide full copies of the source data on the targets.
• In pointer-based virtual replication, at the time of session activation, the target contains
pointers to the location of data on the source.
• Remote Replication is a process of creating replicas at remote sites.
• Replicas meaning an exact copy and replication is the process of creating an exact copy of
data.
• The two basic modes of remote replication are synchronous and asynchronous.
• In synchronous remote replication, writes must be committed to the source and the target,
prior to acknowledging “write complete” to the host.
• In asynchronous remote replication, a write is committed to the source and immediately
acknowledged to the host. Data is buffered at the source and transmitted to the remote site
later.
10
1
Replication – Local and Remote
Document Links
10
2
Replication – Local and Remote
Video Links
Types of Data https://www.youtube.com/watch?v=atOrav This link explains the pros and cons of
Replication H_OMc different types of data replication.
10
3
Replication – Local and Remote
E-Book Links
10
4