
DEPARTMENT OF BCA

STORAGE MANAGEMENT
SUB CODE: 16BCA525D21
MODULE 1: INTRODUCTION TO INFORMATION STORAGE AND
MANAGEMENT

Module 1 V Sem, SM, Department of BCA


CONTENTS:

Module 1:
Information Storage: Data, Types of Data, Information, Storage; Evolution of Storage Technology and Architecture; Data Centre Infrastructure: Core Elements, Key Requirements for Data Centre Elements, Managing Storage Infrastructure; Key Challenges in Managing Information; Information Lifecycle: Information Lifecycle Management, ILM Implementation, ILM Benefits

10/10/2021
AIM
To familiarize students with the fundamentals of storage and an overview of data centres.
Introduction to Storage and Data centres Information Storage

Types of Data


• Data can be classified as structured or unstructured based on how it is stored and managed.

• Structured data is organized in rows and columns in a rigidly defined format, so that applications can
retrieve and process it efficiently.

• Structured data is typically stored using a Database Management System (DBMS).

• Data is unstructured if its elements cannot be stored in rows and columns, and is therefore difficult to query
and retrieve by business applications.

• For example, customer contacts may be stored in various forms such as sticky notes, e-mail messages,
business cards, or even digital format files such as .doc, .txt, and .pdf.

• Due to its unstructured nature, it is difficult to retrieve using a customer relationship management
application.
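The contrast above can be sketched in code. The following is a minimal illustration (the customer names, cities, and phone numbers are made up): structured data in a DBMS can be queried precisely with SQL, whereas unstructured notes can only be scanned for keywords.

```python
import sqlite3

# Structured data: rows and columns in a DBMS, retrieved efficiently with SQL.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (name TEXT, city TEXT, phone TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?, ?)",
               [("Asha", "Bengaluru", "98450-11111"),
                ("Ravi", "Mumbai", "98200-22222")])
rows = db.execute(
    "SELECT name, phone FROM customers WHERE city = 'Mumbai'").fetchall()
print(rows)  # [('Ravi', '98200-22222')]

# Unstructured data: free-form text with no fixed schema, so an application
# can only scan it, which is slower and less reliable than a SQL query.
notes = [
    "Met Ravi at the Mumbai expo, call him on 98200-22222",
    "Asha asked for a quote; she is based in Bengaluru",
]
matches = [n for n in notes if "Mumbai" in n]
print(len(matches))  # 1
```

This is why a customer relationship management application struggles with sticky notes and e-mail bodies but handles database rows easily.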

How many categories of data are there?


▪ Data can be categorized into the following categories:

• Structured (about 20% of enterprise data):
o Databases
o Spreadsheets

• Unstructured (about 80% of enterprise data):
o Instant messages
o E-mail attachments
o Images and X-rays
o Documents, manuals, and web pages
o Rich media: audio, video, and movies

▪ Over 80% of enterprise data is unstructured



Factors contributing to the growth of digital data


• Increase in data processing capabilities: Modern-day computers provide a significant increase in
processing and storage capabilities.

• This enables the conversion of various types of content and media from conventional forms to digital
formats.

• Lower cost of digital storage: Technological advances and decrease in the cost of storage devices have
provided low-cost solutions and encouraged the development of less expensive data storage devices.

• This cost benefit has increased the rate at which data is being generated and stored.

• Affordable and faster communication technology: The rate of sharing digital data is now much faster
than traditional approaches.

• A handwritten letter may take a week to reach its destination, whereas an e-mail message reaches its recipient in seconds.

Examples of Data


Data Explosion

• Inexpensive and easier ways to create, collect, and store all types of data, coupled with increasing individual
and business needs, have led to accelerated data growth, popularly termed as the data explosion.

• Data has different purposes and criticality, so both individuals and businesses have contributed in varied
proportions to this data explosion.

• The importance and the criticality of data vary with time.

• Most of the data created holds significance in the short term but becomes less valuable over time. This
governs the type of data storage solutions used.

• Individuals store data on a variety of storage devices, such as hard disks, CDs, DVDs, or Universal Serial
Bus (USB) flash drives.


Business
• Businesses generate vast amounts of data and then extract meaningful information from this data to derive
economic benefits.

• Therefore, businesses need to maintain data and ensure its availability over a longer period.

• Furthermore, the data can vary in criticality and may require special handling.

• For example, legal and regulatory requirements mandate that banks maintain account information for their
customers accurately and securely.

• Some businesses handle data for millions of customers and must ensure the security and integrity of that
data over a long period of time.

• This requires high capacity storage devices with enhanced security features that can retain data for a
longer period.

IP Operation


How does data differ from information?


• Data: Collection of raw facts from which the required output is extracted.
• Information: Intelligence and knowledge derived from data.
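The distinction can be shown with a small sketch (the sales figures below are made up for illustration): the list of transactions is raw data; the totals derived from it are information a business can act on.

```python
# Raw data: individual sales transactions (illustrative values only).
data = [
    {"item": "pen", "qty": 3, "price": 10.0},
    {"item": "book", "qty": 1, "price": 250.0},
    {"item": "pen", "qty": 7, "price": 10.0},
]

# Information: intelligence derived by processing the raw facts.
revenue = sum(t["qty"] * t["price"] for t in data)                # 350.0
pens_sold = sum(t["qty"] for t in data if t["item"] == "pen")     # 10
print(revenue, pens_sold)
```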


Information Storage

Collects and arranges data

Acts as an information repository

Examples:


Why Information Storage ?

▪ “Digital universe – the information explosion”

• The 21st century is the information era.
• Information is being created at an ever-increasing rate.
• Information has become critical for success.
• (Can you imagine how much Facebook data is generated daily?)

▪ We live in an on-command, on-demand world

• Examples: social networking sites, e-mail, video and photo sharing websites, online shopping,
search engines, etc.

▪ Information management is a big challenge

• Organizations seek to store, protect, optimize, and leverage their information optimally.

Value of Information to a Business

• Identifying new business opportunities

• Buying/spending patterns

• Internet stores, retail stores, supermarkets.

• Customer satisfaction/service

• Tracking shipments, and deliveries.


Value of Information to a Business

• Identifying patterns that lead to changes in existing business

• Reduced cost

• Just-in-time inventory, eliminating over-stocking of products, optimizing shipment and delivery.

• New services

• Security alerts for “stolen” credit card purchases.

• Targeted marketing campaigns

• Communicate to bank customers with high account balances about a special savings plan.

• Creating a competitive advantage


Storage

• Data created by individuals/businesses must be stored for further processing

• Type of storage used is based on the type of data and the rate at which it is created and used

Examples:

• Individuals: Digital camera, Cell phone, DVD’s, Hard disk.

• Businesses: Hard disk, external disk arrays, tape library.

• Storage model: An evolution

• Centralized: mainframe computers.

• Decentralized: client-server model.


Storage Technology and Architecture Evolution

• Historically, organizations had centralised computers (mainframe) and information storage devices
(tape reels and disk packs) in their data centre.

• The evolution of open systems and the affordability and ease of deployment that they offer made it
possible for business units/departments to have their own servers and storage.

• In earlier implementations of open systems, the storage was typically internal to the server.

• The proliferation of departmental servers in an enterprise resulted in unprotected, unmanaged, fragmented
islands of information and increased operating costs.

• Originally, there were very limited policies and processes for managing these servers and the data created.

• To overcome these challenges, storage technology evolved from non-intelligent internal storage to
intelligent networked storage.

Evolution of Storage Technology


Storage technology has evolved through the following configurations:

• Direct Attached Storage (DAS)


• Redundant Array of Independent Disks (RAID)
• Network Attached Storage (NAS)/Storage Area Network (SAN)
• Internet Protocol SAN (IP-SAN)


[Figure: Evolution of storage architecture – from internal DAS, through RAID arrays and SAN/NAS, to IP SAN and FC SAN connected over the LAN through a multiprotocol router.]

Storage Technology and Architecture Evolution

• Redundant Array of Independent Disks (RAID): This technology was developed to address the cost,
performance, and availability requirements of data.

• It continues to evolve today and is used in all storage architectures such as DAS, SAN, and so on.

• Direct-attached storage (DAS): This type of storage connects directly to a server (host) or a group of
servers in a cluster.

• Storage can be either internal or external to the server. External DAS alleviated the challenges of limited
internal storage capacity.
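The availability benefit RAID was built for can be sketched with a toy model of RAID 1 (mirroring), where every block is written to two disks so data survives the loss of either one. This is a simplified illustration, not how a real RAID controller is implemented; the class and payload names are made up.

```python
# Toy sketch of RAID 1 (mirroring): each block is written to both disks,
# so a read still succeeds after a single-disk failure.
class Raid1:
    def __init__(self, n_blocks):
        self.disks = [[None] * n_blocks, [None] * n_blocks]
        self.failed = [False, False]

    def write(self, block, value):
        for d in self.disks:
            d[block] = value          # mirror the write to both disks

    def read(self, block):
        for i, d in enumerate(self.disks):
            if not self.failed[i]:
                return d[block]       # any surviving copy is valid
        raise IOError("all mirrors failed")

arr = Raid1(8)
arr.write(0, b"payroll")
arr.failed[0] = True                  # simulate losing disk 0
print(arr.read(0))  # b'payroll' - data survives the failure
```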


Storage Technology and Architecture Evolution

Storage Area Network (SAN): This is a dedicated, high-performance Fibre Channel (FC) network to
facilitate block-level communication between servers and storage.

• Storage is partitioned and assigned to a server for accessing its data.

• SAN offers scalability, availability, performance, and cost benefits compared to DAS.

Network-attached storage (NAS): This is dedicated storage for file serving applications.

• Unlike a SAN, it connects to an existing communication network (LAN) and provides file access
to heterogeneous clients.

• Because it is purpose-built for file serving, it offers higher scalability, availability, performance,
and cost benefits compared to general-purpose file servers.
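The key difference between the two can be sketched in code: a SAN presents raw blocks addressed by number, while a NAS presents named files. This toy model only illustrates the addressing difference (real SANs speak SCSI over Fibre Channel, and real NAS devices speak NFS/CIFS); the class names and sample paths are made up.

```python
# Block-level access (SAN-style): the host sees a raw block device.
class BlockDevice:
    def __init__(self, n, size=512):
        self.size = size
        self.blocks = [bytes(size) for _ in range(n)]

    def write(self, lba, data):
        # pad to the fixed block size, as a real block device would
        self.blocks[lba] = data.ljust(self.size, b"\0")

    def read(self, lba):
        return self.blocks[lba]

# File-level access (NAS-style): clients see named files, not blocks.
class FileServer:
    def __init__(self):
        self.files = {}

    def put(self, path, data):
        self.files[path] = data

    def get(self, path):
        return self.files[path]

san = BlockDevice(16)
san.write(3, b"raw block")                 # addressed by block number
nas = FileServer()
nas.put("/reports/q3.txt", b"file data")   # addressed by file path
print(san.read(3)[:9], nas.get("/reports/q3.txt"))
```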


Storage Technology and Architecture Evolution

• Internet Protocol SAN (IP-SAN): One of the latest evolutions in storage architecture, IP-SAN is a
convergence of technologies used in SAN and NAS.

• IP-SAN provides block-level communication across a local or wide area network (LAN or WAN),
resulting in greater consolidation and availability of data.

Note: Storage technology and architecture continues to evolve, which enables organizations to consolidate,
protect, optimize, and leverage their data to achieve the highest return on information assets.


Direct Attached Storage

Redundant Array of Independent Disks

Storage Area Network

Network Attached Storage

IP-SAN

Self Assessment Questions

1. Which one of the given options is true about data?

a) Raw fact
b) Processed fact
c) Processed information

Answer: a


2. Which of these is not an example of unstructured data?

a) Images
b) Videos
c) Audio
d) Spreadsheet

Answer: d


3. Which one of the given options describes an on-command, on-demand world?

a) Information needed when and where it is required.


b) Data needed when and where it is required.
c) Information not needed when and where it is required.
d) Nothing needed when and where it is required.

Answer: a


4. Which one of the given options is a collection of raw facts from which conclusions may be drawn?

a) Information
b) Data
c) Digital Value

Answer: b


5. Information is created by both individuals and businesses. State whether True or False.

a) True
b) False

Answer: a


6. Which one of the given factors contributed to the growth of digital data?

i. Increase in data processing capabilities


ii. Lower cost of digital storage
iii. Affordable and faster communication technology

a) Only i
b) Only ii
c) Only i and ii
d) All i, ii and iii

Answer: d

7. Inexpensive and easier ways to create, collect, and store all types of data, coupled with increasing
individual and business needs, have led to accelerated data growth known as ___________________.

a) Data inclusion
b) Data Explosion
c) Data termination

Answer: b


8. Which one of the given options is organized in rows and columns?

a) Unstructured
b) Hierarchical
c) Structured
d) Flat

Answer: c


9. In today’s world 99 percent of data is _______________.

a) Unstructured
b) Hierarchical
c) Structured
d) Flat

Answer: a


10. Storage is partitioned and assigned to a server for accessing its data. Which technology supports this
feature?

a) SAN
b) DAS
c) RAID
d) NAS

Answer: a


Document Links

• Storage Management – https://www.techopedia.com/definition/30481/storage-management – Explains the basic concept of storage management.

• Concept of Data and Information – http://www.differencebetween.info/difference-between-data-and-information – Explains in detail how data differs from information, with a tabular comparison of the two.

• Storage Technology Evolution – https://www.siemon.com/us/white_papers/14-07-29-data-center-storage-evolution.asp – Explains how storage technology has evolved over the years.

Video Links

• Overview of data and information – https://www.youtube.com/watch?v=mUgEgkV16Bw – Explains data and information and the relation between them.

• Types of data – https://www.youtube.com/watch?v=WBU7sW1jy2o – A detailed description of the types of data, with real-life examples.

Information Lifecycle

Change in the value of information with time, from its creation until disposal


Stages in Information Lifecycle


Information Lifecycle Management (ILM)

ILM refers to the creation and management of a wide-ranging set of strategies for streamlining the storage
infrastructure and the data within it. It is a comprehensive approach to managing the flow of data from its
creation to its disposal.


Information Lifecycle Management (ILM) Example

[Figure: Order lifecycle over time – a new order is created, processed, and delivered; the fulfilled order is protected, ages into archived data, and is disposed of once the warranty is voided. The value of the information declines through the stages: Create, Access, Migrate, Archive, Dispose.]
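The stage transitions in the example can be sketched as a simple age-based policy. The thresholds below (30 days, 1 year, 7 years) are illustrative assumptions, not figures from the text; a real ILM policy would classify data by business rules, not age alone.

```python
from datetime import date, timedelta

# Toy ILM policy: map a record's age to a lifecycle stage and storage tier.
def ilm_stage(created, today):
    age = (today - created).days
    if age <= 30:
        return "access", "high-performance tier"
    if age <= 365:
        return "migrate", "lower-cost online tier"
    if age <= 7 * 365:
        return "archive", "archive/tape tier"
    return "dispose", "delete per retention policy"

today = date(2021, 10, 10)
print(ilm_stage(today - timedelta(days=5), today))    # a recent order
print(ilm_stage(today - timedelta(days=400), today))  # aged data
```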

Information Lifecycle Management (ILM) Process Overview

[Figure: ILM process overview – an automated and flexible process: (1) classify data and storage resources based on business rules; (2) implement policies with information management tools; (3) organize storage resources to align with data classes, with integrated management of the storage environment.]

Information Lifecycle Management (ILM) Characteristics

An ILM strategy should possess the following characteristics:

Business-centric, policy-based, optimised, centrally managed, and heterogeneous.


Information Lifecycle Management (ILM) Benefits

• Quick information retrieval

• Simplified management

• Managed compliance and governance

• Lower operating costs

• Improved utilisation

• A wide range of options

• Lower total cost of ownership (TCO)


How good can ILM be for your organisation?

▪ Improved utilization
• Tiered storage platforms
▪ Simplified management
• Processes, tools and automation
▪ Simplified backup and recovery
• A wider range of options to balance the need for business continuity.
▪ Maintaining compliance
• Knowledge of what data needs to be protected for what length of time.
▪ Lower Total Cost of Ownership
• By aligning the infrastructure and management costs with information value.

Self Assessment Questions

11. The best storage technology for global data sharing is _____________.

a) SAN
b) NAS
c) RAID
d) IP-SAN

Answer: d


12. The value of information ______________.

a) Remains constant
b) Changes over time
c) May change or may not change over time

Answer: b


13. Which of these is not a stage of Information Lifecycle?

i. Data creation
ii. Data sharing
iii. Data archival

a) Only i
b) Only ii
c) All i, ii and iii
d) None of the above

Answer: d


14. DAS is a kind of storage that is directly connected to a computer. State whether True or False.

a) True
b) False

Answer: a


Introduction to Data centre


What is a data centre?

Data centres store and manage large volumes of mission-critical data. Organizations maintain data
centres to provide an integrated data-processing facility across the enterprise.


Goals of Data centre

• Support for business operations around the clock (resiliency).

• Protection of data against disasters and multiple software/hardware failures.
• Safe custody of huge amounts of data.
• Lowering the total cost of operation and the maintenance needed to sustain the business function.


Facilities offered by Data centre

• Helps distribute data-processing workloads for large organisations.

• Combines various storage architectures to meet the storage requirements of the organisation.
• Provides uninterrupted and seamless operation.


What does a data centre look like?


Inside of Facebook data centre


Core Elements in Data centre Infrastructure

• Applications
• Databases – Database Management System (DBMS) and the physical and logical storage of data
• Servers/Operating systems
• Networks (LAN and SAN)
• Storage arrays


[Figure: A client's application and user interface communicate over the Local Area Network (LAN) with a server running the OS and DBMS; the database resides on a storage array reached through the Storage Area Network (SAN).]

What are the key factors that ensure the success of a data centre?

Availability, Data Integrity, Security, Manageability, Performance, Capacity, and Scalability.


Key Requirements for Data centre Elements

1. Availability
2. Location and facility
3. Uptime
4. Cooling
5. Security
6. Facility maintenance
7. Scalability
8. Change Management


Availability: All data centre elements should be designed to enable accessibility. The ability of
users to access data when necessary can have a positive impact on a business.


Location and Facility: Location can have a considerable impact and could serve as an opportunity to
enhance the company’s ability to address network expectancy, disaster recovery, and data control. When
determining the location of a data centre, it is also necessary to consider the geographical location with respect
to the service provider.


Uptime: Organizations can lose a lot of money during downtime because necessary information and resources
become inaccessible. The service provider's track record of uptime should be examined to determine whether
the provider has been able to maintain 100% availability for electrical and mechanical systems in the past.
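Uptime is usually expressed as an availability percentage, and it helps to see what each percentage allows in downtime per year. The helper below is a standard conversion, not something specific to this text; the sample percentages are common industry targets.

```python
# Convert an availability percentage into the downtime it permits per year.
def downtime_per_year(availability_pct):
    hours_per_year = 365 * 24  # 8760 hours in a non-leap year
    return hours_per_year * (1 - availability_pct / 100)

for pct in (99.0, 99.9, 99.99):
    print(f"{pct}% uptime allows {downtime_per_year(pct):.2f} hours of downtime/year")
# 99.0%  allows 87.60 hours
# 99.9%  allows 8.76 hours
# 99.99% allows 0.88 hours
```

This is why even "three nines" (99.9%) availability, which sounds high, still permits nearly nine hours of outage per year.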


Powerful backup arrangement to provide 100% Uptime


Key Requirements for Data centre Elements

Cooling: The establishment of a data centre involves a combination of utilities, generators, and a range of
power supplies that emit excessive heat. It is essential to have a robust cooling infrastructure to ensure cost
efficiencies and a viable environment to establish the data centre.


Cooling arrangement in data centre environment


Security: The security of the data centre is equally important. It is necessary to establish policies and
procedures, and ensure proper integration of the data centre core elements in order to prevent unauthorized
access to information. To maximize protection, the facility should be fabricated with military-grade security
measures.


Facility Maintenance: During the installation of a data centre, it is important to consider the maintenance and
management aspects. The provider should have sufficient insight of the facility and its efficiencies to ensure
that the entire system is being observed and any issues can be quickly addressed before they cause any outage.


Scalability: Data centre operations should be able to allocate additional on-demand storage or processing
capabilities, without interrupting business operations. For the growth of a business, it is often necessary to
install more servers, additional databases, and new applications. The storage solution should also be able
to grow with the business.


Change Management: Proper guidelines for change management ensure that no alteration is made in the data
centre that has not been planned, scheduled, discussed, and approved. It is also necessary to provide
back-out steps, or a Plan B. Whether bringing new amendments to the system or making old practices obsolete,
data centres must follow the change management process.


Self Assessment Questions

15. Which one of the given options is/are a data centre element?

i. Databases
ii. Servers
iii. Applications

a) Only i
b) Only ii
c) Only i and ii
d) All i, ii and iii

Answer: d

16. Availability of data centre elements refers to _________________ of the elements.

a) Accessibility
b) Robustness
c) Scalability
d) Optimality

Answer: a


17. While determining site for a data centre, the most important factor to consider is ___________.

a) Cost
b) Geographical location
c) Workforce

Answer: b


Document Links

• Data centre Elements – https://www.techrepublic.com/blog/10-things/10-critical-elements-of-an-efficient-data-centre/ – Explains the typical elements an efficient data centre is expected to have; the success of a data centre depends on proper management of these elements.

Video Links

• Data centre Basics – https://www.youtube.com/watch?v=XTwwKq7_HGY – Explains the main components of a data centre, including Google's floating data centre and Microsoft's undersea data centre.

Role of Data centre in the Enterprise


The building blocks of a typical enterprise network include the following:

• Campus network
• Private WAN
• Remote access
• Internet server farm
• Extranet server farm
• Intranet server farm



• Data centres typically house many components that support the infrastructure building blocks, such as the
core switches of the campus network or the edge routers of the private WAN.

• Data centre designs can include any or all of the building blocks as shown in previous slide figure,
including any or all server farm types. Each type of server farm can be a separate physical entity, depending
on the business requirements of the enterprise.


Role of Data centre in Service Provider Environment


Data centres in Service Provider (SP) environments, known as Internet Data Centres (IDCs), differ from
data centres in enterprise environments in that they are a source of revenue, supporting collocated server
farms for enterprise customers.

The SP data centre is a service-oriented environment built to house, or host, an enterprise customer's
application environment under tightly controlled SLAs for uptime and availability. Enterprises also build
IDCs when the sole reason for the data centre is to support Internet-facing applications.


Self Assessment Questions

18. Which one of the given options is a building block of a typical enterprise network?

i. Private WAN
ii. Remote access
iii. Extranet server farm

a) Only i
b) Only ii
c) Only i and ii
d) All i, ii and iii

Answer: d

19. Which one of the given options describes IDCs?

a) Data centres in Enterprise environment


b) Data centres in Service Provider environment
c) Data centres in local environment

Answer: b


Data centre Architecture


• In this section, we shall discuss the architecture of a generic enterprise Data centre connected to the
Internet and supporting an intranet server farm.

• Other types of server farms follow the same architecture used for intranet server farms yet with
different scalability, security, and management requirements.


Topology of an Enterprise Data centre Architecture



Internet Edge

The Internet Edge provides connectivity from the enterprise to the Internet, along with the associated
redundancy and security functions, as follows:

• Redundant connections to different service providers.

• External and internal routing through Exterior Border Gateway Protocol (EBGP) and Interior
Border Gateway Protocol (IBGP).

• Edge security to control access from the Internet.

• Control for access to the Internet from the enterprise clients.


Campus Core Switches

The campus core switches provide connectivity between the internet Edge, the intranet server farms, the
campus network, and the private WAN.

The core switches physically connect to the devices that provide access to other major network areas, such
as the private WAN edge routers, the server-farm aggregation switches, and campus distribution switches.


Network Layers of the Server Farm

Aggregation Layer

Access Layer

Storage Layer

Data centre Transport Layer


Aggregation Layer

The aggregation layer is the aggregation point for devices that provide services to all server farms. These
devices are multilayer switches, firewalls, load balancers, and other devices that typically support services
across all servers.


Access Layer

The access layer provides Layer 2 connectivity and Layer 2 features to the server farm. In a multitier
server farm, each server function could be located on different access switches on different segments.

The three segments are:


• Front End segment
• Application segment
• Back End segment



Aggregation and Access Layers


Storage Layer
The storage layer consists of the storage infrastructure, such as Fibre Channel switches and routers that
support Small Computer System Interface (SCSI) over IP (iSCSI) or Fibre Channel over IP (FCIP). Storage
network devices provide connectivity to servers and to storage devices such as disk subsystems and tape
subsystems.


Transport Layer

The Data centre transport layer includes the transport technologies required for the following purposes:

• Communication between distributed Data centres for rerouting client-to-server traffic.

• Communication between distributed server farms located in distributed Data centres for the
purpose of remote mirroring, replication, or clustering.


Storage and Transport Layers


Self Assessment Questions

20. ________________ provides the connectivity from the enterprise to the Internet.

a) Internet Edge
b) Campus Core Switches
c) Access layer
d) Storage layer

Answer: a


21. The ________________ provides connectivity between the Internet Edge, the intranet server farms,
the campus network, and the private WAN.

a) Internet Edge
b) Campus Core Switches
c) Access layer
d) Storage layer

Answer: b


22. Access layer provides Layer 2 connectivity and Layer 2 features to the server farm. State whether True
or False.

a) True
b) False

Answer: a


Document Links

• Data centre Architecture – https://www.fiberoptics4sale.com/blogs/archive-posts/95041990-what-are-data-centres – Explains the detailed architecture of a data centre, in particular the enterprise data centre.

• Data centre Strategy – http://focus.forsythe.com/articles/396/10-Steps-to-a-Successful-Data-centre-Strategy – Explains the steps to a successful data centre strategy.

Video Links

• Data centre Deployment – https://www.youtube.com/watch?v=McVZ01kn-lQ – Explains real-life data centre deployment and best practices.

• Data centre Topology – https://www.youtube.com/watch?v=8M0XPOiKC7I – Details of data centre topology.

Self Assessment Questions

23. In data centre environment, cooling is required because _____________________.

a) Workers may feel hot


b) Workers can work comfortably
c) Equipment emit excessive heat

Answer: c

10
Introduction to Storage and Data centres Information Storage

24. Accessibility of data centre elements will have no impact on business. State whether True or False.

a) True
b) False

Answer: b

10
Introduction to Storage and Data centres Information Storage

25. Good location of data centre indicates ____________________________.

a) Continuous service availability


b) No impact on service
c) Better service quality
d) None of these

Answer: a

10
Introduction to Storage and Data centres Information Storage

26. Which one of the given options indicates Uptime?

a) Time in which all services are up and running.


b) Time in maintenance operation of the data centre.
c) Time in which some services are temporarily suspended.

Answer: a

10
Introduction to Storage and Data centres Information Storage

27. Powerful backup solution guarantees ________ uptime.

a) Low
b) Medium
c) High

Answer: c

10
Introduction to Storage and Data centres Information Storage

28. The term _____________ indicates additional on-demand capabilities.

a) Robustness
b) Scalability
c) Accessibility
d) Fault tolerance

Answer: b

10
Introduction to Storage and Data centres Information Storage

29. Which one of the given options is true about change management?

i. No unplanned alteration in the data centre.


ii. Enforces changes in data centre from time to time.
iii. Manage various changes that take place in data centre.

a) Only i
b) Only ii
c) Only i and ii
d) All i, ii and iii

Answer: a

11
Introduction to Storage and Data centres Information Storage

30. It is important to have a Plan B in place in a typical data centre environment. State whether True or
False.

a) True
b) False

Answer: a

11
Introduction to Storage and Data centres Information Storage

31. In a multitier server farm, each server function could be located on different access switches on
different segments. State whether True or False.

a) True
b) False

Answer: a

11
Introduction to Storage and Data centres Information Storage

32. Which of these is not a segment of Access Layer?

a) Front end segment


b) Back end segment
c) Application segment
d) Middle level segment

Answer: d

11
Introduction to Storage and Data centres Information Storage

Summary

• Data is a collection of raw facts from which the required output is extracted.

• Information is processed data. The value of information changes over time.

• Storage technology has evolved through configurations such as RAID, DAS, SAN, NAS, and IP-SAN.

• The change in value of information with the change in time is called information lifecycle.

• ILM is a comprehensive approach in managing the flow of data from its creation to its disposal.

• The key requirements for data centre elements are availability, location and facility, uptime, cooling,
security, facility maintenance, scalability, and change management.

11
Introduction to Storage and Data centres Information Storage

Assignment
1. Define data and information with real life examples.
2. Explain how structured and unstructured data differs.
3. Write a description on evolution of storage technologies.
4. Explain the different stages of Information Lifecycle.
5. What is ILM? Explain the various benefits of ILM.
6. Why do we need data centres? Explain.
7. Name and explain the core elements in data centre infrastructure.
8. Explain the architecture of Enterprise Data centre.
9. Discuss the necessity of information storage management with illustrative examples.
10. Explain Virtuous cycle of information with a neat diagram.
11
Introduction to Storage and Data centres Information Storage

Assignment
11. Define storage. Discuss the importance of digital data.
12. What is Data? Give examples of digital data.
13. List some of the factors that have contributed to the growth of digital data.

14. With the advancement of computer and communication technologies, the rate of data generation and
sharing has increased exponentially. Justify the statement.

15. Discuss the advantages of digital data over traditional data handling techniques.
16. Give examples of Research and Business data.
17. Demonstrate how data can be classified based on how it is stored and managed.
18. Differentiate between data and information.
19. Bring out the analysis of information by individual and business.
20. Discuss the options available for storing data for individuals and business. 11
Introduction to Storage and Data centres Information Storage

Assignment
21. Mention the challenges in storing the data in mainframes / open systems.
22. Explain the evolution of storage architectures.

23. Bring out the differences between SAN and NAS. Outline the Data centre Infrastructure requirements
to store and manage large amounts of mission-critical data.

24. Discuss the five core elements that are essential for the basic functionality of a data centre.
25. Demonstrate the requirements of the core elements of the data centre with an order processing system.
26. List out the key requirements for data centre elements.
27. Summarize the necessity of the key characteristics of data centre elements with respect to the storage.

28. Managing a modern, complex data centre involves many tasks. List out the tasks with suitable
examples.

29. Elaborate on the activities involved in managing storage infrastructure. 11


Introduction to Storage and Data centres Information Storage

Assignment

30. In order to frame an effective information management policy, businesses need to consider the key
challenges of information management. Mention the key challenges faced by the business.

31. Framing a policy to meet the challenges of information management involves understanding the value
of information over its lifecycle. Justify the statement.

32. What is information lifecycle? Demonstrate with an illustrative example.


33. Define Information Lifecycle Management (ILM). Discuss the characteristics of an ILM strategy.
34. Formulate how the tiered storage technique can reduce total storage cost.

35. Recommend the activities involved in the process of developing an Information Lifecycle Management
(ILM) strategy.

36. Illustrate a three-step road map to enterprise-wide ILM implementation.

11
Introduction to Storage and Data centres Information Storage

Assignment

37. Implementing an ILM strategy provides the benefits that directly address the challenges of information
management. Justify the statement.

38. Discuss the benefits of ILM strategy.

11
Introduction to Storage and Data centres Information Storage

Document Links

• Cover Story: Storage Management: http://www.networkmagazineindia.com/200309/coverstory01.shtml (a general overview of storage management, along with experts' opinions)
• Storage infrastructure management is still elusive: http://searchstorage.techtarget.com/opinion/Storage-infrastructure-management-is-still-elusive (basic challenges associated with storage and its management)
• Steps to ILM implementation: http://www.networkworld.com/article/2300188/data-centre/six-steps-to-ilm-implementation.html (the six steps to successfully implement ILM)
• Information Lifecycle Management: http://searchstorage.techtarget.com/definition/information-life-cycle-management (the ILM approach in industry)
• Data centre Architecture: https://www.fiberoptics4sale.com/blogs/archive-posts/95041990-what-are-data-centres (the architecture of data centres, with emphasis on the enterprise data centre and its components)
• Data centre Strategy: http://focus.forsythe.com/articles/396/10-Steps-to-a-Successful-Data-centre-Strategy (the steps for adopting a successful data centre strategy)
12
Introduction to Storage and Data centres Information Storage

Video Links

• Types of storage technologies: https://www.youtube.com/watch?v=bpUzGZLO948&list=PLWDUzz3hCDJJFiOO_zpMTbiHUNPDpU3xb (compares DAS, NAS and SAN technologies)
• Overview of data and information: https://www.youtube.com/watch?v=mUgEgkV16Bw (explains what data, information and knowledge are, with a short description of how data differs from information)
• Inside a Google Data centre: https://www.youtube.com/watch?v=XZmGGAbHqa0 (a virtual tour inside a Google data centre)

12
Introduction to Storage and Data centres Information Storage

E-Book Links

• Building a Modern Data centre: https://www.actualtechmedia.com/wp-content/uploads/2018/05/Building-a-Modern-Data-Center-ebook.pdf (pages 14 to 22)
• Information Storage and Management: http://www.academia.edu/19388733/1_1_Information_Storage_and_Management_2nd_Edition (pages 4 to 15)

12
DEPARTMENT OF BCA
STORAGE MANAGEMENT
SUB CODE: 16BCA525D21

MODULE 2: STORAGE SYSTEM ENVIRONMENT

Module 2 V Sem, SM, Department of BCA


CONTENTS :

Module 2 :
Components of a Storage System Environment – Host –Connectivity – Storage, Disk Drive Components –
Platter – Spindle - Read/Write Head - Actuator Arm Assembly - Controller - Physical Disk Structure -
Zoned Bit Recording - Logical Block Addressing, Disk Drive Performance - Disk Service Time,
Fundamental Laws Governing Disk Performance , Logical Components of the Host - Operating System -
Device Driver -Volume Manager - File System – Application , Application Requirements and Disk
Performance.

10/7/2021 2
V SEM, SM, DEPARTMENT OF BCA
Storage System Environment

AIM

To equip students with components of storage system environment and present an overview of
different storage technologies.

3
Storage System Environment

There are three main components in a storage system environment:

• Host: The host system is used for the interaction between the OS and application that has
requested the data.

• Connectivity: It provides the connection and helps in carrying out the data and the read or write
commands between the storage mediums and the host.

• Storage: The data gets stored into the storage mediums or devices.

4
Storage System Environment

Host
It is one of the main components in a storage system environment:

▪ Applications run on hosts

▪ Hosts can range from simple laptops to complex server clusters

▪ Physical components of a host:

1. CPU
2. Storage
o Disk devices and internal memory.
3. I/O devices
• Host to host communications
o Network Interface Card (NIC).
• Host to storage device communications
o Host Bus Adapter (HBA).

[Figure: a server and a group of servers connected over a LAN]
5
Storage System Environment

CPU
The CPU consists of four main components:

▪ Arithmetic Logic Unit (ALU).

▪ Control Unit.

▪ Register.

▪ Level 1 (L1) cache.

6
Storage System Environment

Storage
Memory and storage media are used to store data, either persistently or temporarily. Generally, there are two
types of memory on a host:

▪ Random Access Memory (RAM)

▪ Read-Only Memory (ROM)

▪ Storage devices are less expensive than semiconductor memory. Examples of storage devices are as follows:
▪ Hard disk (magnetic).

▪ CD-ROM or DVD-ROM (optical).

▪ Floppy disk (magnetic).

▪ Tape drive (magnetic).

7
Storage System Environment

Components of Storage

[Figure: the storage hierarchy. CPU registers and L1/L2 caches are the fastest and most expensive per bit; RAM, magnetic disk, optical disk, and tape are progressively slower and cheaper.]
8
Storage System Environment

I/O Devices

Human interface
• Keyboard
• Mouse
• Monitor
Computer-computer interface
• Network Interface Card (NIC)
Computer-peripheral interface
• USB (Universal Serial Bus) port
• Host Bus Adapter (HBA)

9
Storage System Environment

Logical Components of Host

10
Storage System Environment

Connectivity

▪ Provides interconnection between hosts or between a host and any storage devices
▪ The physical components are the hardware elements that connect the host to storage, and the
logical components of connectivity are the protocols used for communication between the host
and storage.
▪ Physical Components of Connectivity are:
Bus, Port and Cable

[Figure: physical connectivity. The CPU connects over the bus to an HBA; a cable runs from the HBA port to the disk.]
11
Storage System Environment

Physical components of connectivity bus, port and cables


▪ The bus is the collection of paths that facilitate data transmission from one part of a computer to another,
such as from the CPU to the memory.

▪ The port is a specialised outlet that enables connectivity between the host and external devices.

▪ Cables connect hosts to internal or external devices using copper or fibre optic media.

Buses, as conduits of data transfer on the computer system, can be classified as follows:

▪ System bus: The bus that carries data from the processor to memory.

▪ Local or I/O bus: A high-speed pathway that connects directly to the processor and carries data between
the peripheral devices, such as storage devices and the processor.

12
Storage System Environment

Bus Technology

[Figure: a serial bus transfers data one bit at a time over a bi-directional link; a parallel bus transfers multiple bits at once.]

13
Storage System Environment

Logical Components of Connectivity

• The popular interface protocol used for the local bus to connect to a peripheral device is a peripheral component
interconnect (PCI).

• The interface protocols that connect to disk systems are Integrated Device Electronics/Advanced Technology
Attachment (IDE/ATA) and Small Computer System Interface (SCSI).

• PCI is a specification that standardises how PCI expansion cards, such as network cards or modems, exchange
information with the CPU. PCI provides the interconnection between the CPU and attached devices.

• IDE/ATA is the most popular interface protocol used on modern disks. This protocol offers excellent
performance at relatively low cost.

• SCSI has emerged as a preferred protocol in high-end computers. This interface is far less commonly used than
IDE/ATA on personal computers due to its higher cost.

14
Storage System Environment

Storage
• The storage device is the most important component in the storage system environment.
• A storage device uses magnetic or solid state media. Disks, tapes, and diskettes use magnetic
media.
• CD-ROM is an example of a storage device that uses optical media, and a removable flash
memory card is an example of solid-state media.

15
Storage System Environment

Components of Storage

Magnetic Media

• Magnetic storage is an example of non-volatile memory, which means when the storage device is
not powered, then the data does not get lost and is retained.

• Magnetic storage uses various patterns of magnetisation in a magnetisable material to store the
data.

• Magnetic storages are relatively cheap and are widely used for data storage.

• Examples of devices that use the magnetic media are tapes, disks, and diskettes.

16
Storage System Environment

Magnetic Disk Drive(Hard disk drive)

17
Storage System Environment

Optical Media

• Optical media are the kind of storage media that contain the content in a digital format and the
read/write feature can be performed on them using a laser.

• The examples of optical media include CD-ROMs, DVD-ROMs, and all the variations of the two
formats, as well as optical jukeboxes.

• Optical media can last up to seven times as long as any traditional storage media.

18
Storage System Environment

Optical Disk (DVD, CD)

19
Storage System Environment

Solid state media

• It is made up of silicon microchips. It is non-volatile storage and stores data electronically instead of
storing it magnetically.

• It contains no mechanical parts, which allows the transfer of data, to and from the storage medium, to
take place at a very higher speed.

• Provides data integrity and endurance as any other electronic device.

• An example of a device that uses the solid-state media is a removable flash memory.

20
Storage System Environment

21
Storage System Environment
Tapes
• Tapes are a popular storage medium used for backup because of their relatively low cost. In the past, data centres
hosted a large number of tape drives and processed several thousand reels of tape. However, tape has the
following limitations:

▪ Data is stored on the tape linearly along the length of the tape. As a result, random data access is slow and time-
consuming.

▪ In a shared computing environment, data stored on tape cannot be accessed by multiple applications
simultaneously, restricting its use to one application at a time.

▪ On a tape drive, the read/write head touches the tape surface, so the tape degrades or wears out after repeated use.

▪ The storage and retrieval requirements of data from tape and the overhead associated with managing tape media
are significant.
22
Storage System Environment

Magnetic Tapes

23
Storage System Environment

Self Assessment Question

1. Which of these is not a valid component of a storage system environment?


a) Host
b) Connectivity
c) Storage
d) Virtual memory

Answer: d)

24
Storage System Environment

Self Assessment Question


2. Which of these is not a valid physical component of connectivity?
a) Bus
b) Wire
c) Port
d) Cable

Answer: b)

25
Storage System Environment

Self Assessment Question


3. The computers on which the users applications run are referred to as
a) Laptop
b) Desktop
c) Host
d) Node

Answer: c)

26
Storage System Environment

Self Assessment Question


4. Physical components of the host are known as
a) Hardware
b) Software
c) Protocol
d) Cable

Answer: a)

27
Storage System Environment

Self Assessment Question


5. Logical components of the host are known as
a) Hardware
b) Software and Protocol
c) Port
d) Cable

Answer: b)

28
Storage System Environment

Self Assessment Question


6. The physical components communicate with one another by using a communication pathway called a
_________
a) Bus
b) Wire
c) Port
d) Cable

Answer: a)

29
Storage System Environment

Self Assessment Question


7. Memory modules are implemented using ________________
a) Registers
b) Flip Flops
c) Semiconductor chips
d) Diodes

Answer: c)

30
Storage System Environment

Self Assessment Question


8. User to host communications handled by
a) Basic I/O devices
b) Network Interface Card (NIC) or modem
c) Host Bus Adaptor (HBA)
d) Routers

Answer: a)

31
Storage System Environment

Self Assessment Question


9. Host to host communications are enabled using a
a) Basic I/O devices
b) Network Interface Card (NIC) or modem
c) Host Bus Adaptor (HBA)
d) Routers

Answer: b)

32
Storage System Environment

Self Assessment Question


10. Host to storage device communications are handled by a
a) Basic I/O devices
b) Network Interface Card (NIC) or modem
c) Host Bus Adaptor (HBA)
d) Routers

Answer: c)

33
Storage System Environment

Self Assessment Question


11. Collection of paths that facilitates data transmission from one part of a computer to another, such as
from the CPU to the memory is known as
a) Bus
b) Wire
c) Port
d) Cable
Answer: a)

34
Storage System Environment

Self Assessment Question


12. A specialised outlet that enables connectivity between the host and external devices.
a) Bus
b) Wire
c) Port
d) Cable

Answer: c)

35
Storage System Environment

Self Assessment Question

13. Which one connect hosts to internal or external devices using copper or fibre optic media?

a) Bus
b) Wire
c) Port
d) Cable

Answer: d)

36
Storage System Environment

Self Assessment Question


14. The bus that carries data from the processor to memory
a) Basic I/O bus
b) System bus
c) Control bus
d) Data bus

Answer: b)

37
Storage System Environment

Self Assessment Question


15. The popular interface protocol used for the local bus to connect to a peripheral device is
a) DCI
b) IDE/ATA
c) SCSI
d) PCI

Answer: d)

38
Storage System Environment

Document Links

• Magnetic storage: http://study.com/academy/lesson/magnetic-storage-definition-devices-examples.html (explains the magnetic storage technologies with examples)
• Optical media: http://searchstorage.techtarget.com/definition/optical-media (details about optical media, with examples and definitions)
• Solid state storage: http://searchstorage.techtarget.com/definition/solid-state-storage (explains fast solid state storage technology)
• Components of CPU: https://www.webopedia.com/TERM/C/CPU.html (explains the components of the CPU)
• Tape drive: https://en.wikipedia.org/wiki/Tape_drive (gives the details of magnetic tape drive features)
39
Storage System Environment

Video Links

• Types of storage media: https://www.youtube.com/watch?v=NgVm2kUmrYo (explains and compares different types of storage media)
• Storage system environment: https://www.youtube.com/watch?v=-8hAOsQnbK8 (explains the storage system environment)

40
Storage System Environment

Hard Disk Drive (HDD)

41
Storage System Environment

What is a disk drive?

A disk drive is one of the most popular randomly accessible, rewritable storage media used in modern
computers. It is a physical device capable of holding and retrieving information. Disk drives are used for
storing and accessing data in performance-intensive, online applications.

42
Storage System Environment

[Figure: disk drive with controller, HDA (Head Disk Assembly), interface, and power connector.]
43
Storage System Environment

Components of a Disk Drive

The key components of a disk drive are:

• Platter
• Spindle
• Read/write head
• Actuator arm assembly
• Controller

44
Storage System Environment

[Figure: disk drive internals: spindle, platters, read/write heads, and the actuator arm assembly (actuator, actuator arms, actuator axis).]

45
Storage System Environment

Platters

46
Storage System Environment

Spindle


47
Storage System Environment

Read/Write Heads

Data is read and written by read/write heads, or R/W heads.

Most drives have two R/W heads per platter, one for each surface of the platter.

• When reading data, they detect magnetic polarisation on the platter surface.
• When writing data, they change the magnetic polarisation on the platter surface.

48
Storage System Environment

R/W Head


49
Storage System Environment

Sectors and Tracks


[Figure: a platter divided into concentric tracks, each track divided into sectors.]
50
Storage System Environment

Cylinders

[Figure: a cylinder formed by the same track position on all platters.]

51
Storage System Environment

Controller

[Figure: bottom view of a disk drive, showing the controller, interface, HDA, and power connector.]

52
Storage System Environment

Zoned-Bit Recording

Since a platter is made up of concentric tracks, the outer tracks can hold more data than the inner ones
because they are physically longer than the inner tracks. However, in older disk drives, the outer tracks
had the same number of sectors as the inner tracks, which means that the data density was very low on the
outer tracks. This was an inefficient use of the available space.

Zoned-bit recording uses the disk more efficiently. It groups tracks into zones based on their distance from
the centre of the disk.

53
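The gain from zoning can be shown with simple arithmetic. The zone, track, and sector counts below are made-up illustration values, not any real drive's geometry:

```python
# Capacity comparison: the same platter, with and without zoned-bit recording.
# All geometry figures here are illustrative assumptions.

SECTOR_BYTES = 512
tracks_per_zone = 100

# Without zones, every track holds only as many sectors as the innermost
# track can physically fit (200 here), wasting space on longer outer tracks.
flat_sectors = 3 * tracks_per_zone * 200

# With zones, outer zones hold more sectors per track (inner, middle, outer).
zoned_sectors = tracks_per_zone * (200 + 300 + 400)

print(flat_sectors * SECTOR_BYTES // 2**20, "MiB without zones")
print(zoned_sectors * SECTOR_BYTES // 2**20, "MiB with zones")
```

With these assumed figures, zoning stores half again as much data on the identical platter.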
Storage System Environment

[Figure: a platter without zones has the same number of sectors on every track; a platter with zones places more sectors on the outer tracks.]

54
Storage System Environment

Logical Block Addressing

Earlier, drives used physical addresses made up of the Cylinder, Head, and the Sector number (CHS) to
refer to specific locations on the disk. This meant that the host had to be aware of the geometry of each
disk that was used.

Logical Block Addressing (LBA) simplifies addressing by using a linear address for accessing physical
blocks of data. The disk controller performs the translation from LBA to CHS address. The host only
needs to know the size of the disk drive (how many blocks it holds).

Logical blocks are mapped to physical sectors on a 1:1 basis


Block numbers start at 0 and increment by one until the last block is reached [E.g., 0, 1, 2, 3 … (N-1)]
Block numbering starts at the beginning of a cylinder and continues until the end of that cylinder
This is the traditional method for accessing peripherals on SCSI, Fibre Channel, and newer ATA disks.
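The translation the disk controller performs can be sketched in a few lines. The geometry constants (heads per cylinder, sectors per track) are illustrative assumptions; a real drive reports its own geometry:

```python
# CHS <-> LBA translation as performed by a disk controller.
# Geometry values below are illustrative only.

HEADS_PER_CYLINDER = 4
SECTORS_PER_TRACK = 8   # sectors are numbered from 1 in CHS addressing

def chs_to_lba(c, h, s):
    """Map a (cylinder, head, sector) physical address to a linear block number."""
    return (c * HEADS_PER_CYLINDER + h) * SECTORS_PER_TRACK + (s - 1)

def lba_to_chs(lba):
    """Map a linear block number back to (cylinder, head, sector)."""
    c, rem = divmod(lba, HEADS_PER_CYLINDER * SECTORS_PER_TRACK)
    h, s = divmod(rem, SECTORS_PER_TRACK)
    return c, h, s + 1

# Block numbering starts at 0 and runs linearly to the end of each cylinder.
assert chs_to_lba(0, 0, 1) == 0
assert lba_to_chs(chs_to_lba(2, 3, 5)) == (2, 3, 5)
```

The round-trip assertion at the end shows why the host needs to know only the block count: the controller can always recover the physical address from the linear one.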

55
Storage System Environment

[Figure: physical address = CHS (cylinder, head, sector); logical block address = block number. Blocks 0, 8, 16, 32, and 48 are numbered linearly across the cylinders.]


56
Storage System Environment

Self Assessment Question

16. In Hard drive, the _______________ causes the platters to rotate within a sealed case.

a) Tracks
b) Sectors
c) Spindle
d) Read/Write Head

Answer: c)

57
Storage System Environment

Self Assessment Question


17. Multiple _____________ form a track

a) Sectors
b) Cylinders
c) Zones
d) None of the above

Answer: a)

58
Storage System Environment

Self Assessment Question


18. Suppose a Disk Drive has 3 platters. The number of recording surfaces is ______________
a) Three
b) Four
c) Six
d) Five

Answer: c)

59
Storage System Environment

Self Assessment Question


19. Zoned-Bit Recording is a strategy to utilise space. State whether True or False.

a) True
b) False

Answer: a)

60
Storage System Environment

Self Assessment Question


20. Disks support rapid access to
a) Random data locations
b) Sequential data locations
c) Both a and b
d) None of the above

Answer: a)

61
Storage System Environment

Self Assessment Question


21. Data can be recorded and erased on a magnetic disk _______________ number of times.
a) Average
b) Infinite
c) Finite
d) moderate

Answer: b)

62
Storage System Environment

Self Assessment Question

22. A typical HDD consists of one or more flat circular disks called as
a) Platter
b) Spindle
c) Read/Write head
d) Actuator arm assembly
e) Controller

Answer: a)

63
Storage System Environment

Self Assessment Question

23. Which component of the disk connects all the platters?


a) Platter
b) Spindle
c) Read/Write head
d) Actuator arm assembly

Answer: b)

64
Storage System Environment

Self Assessment Question

24. The R/W heads of the HDD are mounted on the


a) Platter
b) Spindle
c) Actuator arm assembly
d) Controller

Answer: c)

65
Storage System Environment

Self Assessment Question

25. Which is a printed circuit board, mounted at the bottom of a disk drive ?

a) Platter
b) Spindle
c) Actuator arm assembly
d) Controller

Answer: d)

66
Storage System Environment

Self Assessment Question


26. Which one simplifies addressing by using a linear address to access physical blocks of data
a) CHS
b) LBA
c) ZBA
d) Bus address

Answer: b)

67
Storage System Environment

PERFORMANCE
OF DISK DRIVE

68
Storage System Environment

Seek Time

▪ Seek time is the time for read/write heads


to move between tracks.

▪ Seek time specifications include:


• Full stroke.
• Average seek time.
• Track-to-track.

69
Storage System Environment

Rotational Latency

It is the time taken to position the data under the read/write head by rotating and repositioning the platter.

[Figure: the platter rotates to bring the original data position under the read/write head.]

70
Storage System Environment

Data Transfer Rate


[Figure: the internal transfer rate is measured between the disk surface and the drive's buffer; the external transfer rate is measured between the drive interface and the HBA.]

Data transfer rate is also known as transfer rate, and it refers to the average amount of data per unit time
that can be delivered by the drive to the Host Bus Adapter (HBA).
71
Storage System Environment

Data Transfer Rate

The Data Transfer Rate describes the MB/second that the drive can deliver data to the HBA. It can be
categorised as:

• Internal transfer rate - the speed of moving data from the disk surface to the R/W heads on a
single track of one surface of the disk.

• External transfer rate - the rate at which data can be moved through the interface to the HBA.

72
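Putting the measures above together, disk service time is commonly estimated as seek time plus average rotational latency plus transfer time. The drive figures below are illustrative assumptions, not a specific product's specification:

```python
# Disk service time = average seek time + average rotational latency
#                     + internal transfer time for the request.
# Drive figures are illustrative, not from any real product datasheet.

rpm = 7200                 # spindle speed
avg_seek_ms = 5.0          # average seek time (assumed)
internal_rate_mb_s = 40.0  # internal transfer rate (assumed)
block_kb = 32              # size of one I/O request

# Average rotational latency is the time for half a revolution.
rotational_latency_ms = 0.5 / (rpm / 60) * 1000

# Time to move one block of data off the platter at the internal rate.
transfer_ms = block_kb / 1024 / internal_rate_mb_s * 1000

service_time_ms = avg_seek_ms + rotational_latency_ms + transfer_ms
print(f"Disk service time = {service_time_ms:.2f} ms")
```

Note that for a small request the seek time and rotational latency dominate; the transfer time itself is a fraction of a millisecond.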
Storage System Environment

Other Factors affecting Disk Drive Performance

• RPM of the disk drive
• Actuator characteristics
• Size and number of platter surfaces
• Spindle motor speed
• Data recording and encoding factors
• Disk response time
• File system used
• CPU speed
• Disk service time

73
Fundamental Laws
Governing Disk Drive
Performance

7
Storage System Environment

I/O Queue

[Figure: I/O requests arrive into a queue (6, 5, 4, 3, 2, 1) and are serviced by the I/O controller.]

▪ Little's Law
• Describes the relationship between the number of requests in a queue and the response time.
• N = a × R
o "N" is the total number of requests in the queue system
o "a" is the arrival rate of I/O requests
o "R" is the average response time
▪ Utilisation Law
• Defines the I/O controller utilisation
• U = a × Rs
o "U" is the I/O controller utilisation
o "Rs" is the service time

75
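The two laws can be checked with a short calculation. The arrival rate and service time below are made-up figures, and the response-time formula R = Rs / (1 - U) is an added assumption (a simple single-server queue), not part of the laws themselves:

```python
# Little's Law: N = a * R, and the Utilisation Law: U = a * Rs.
# The numbers below are illustrative, not measurements from a real disk.

a = 100       # arrival rate: 100 I/O requests per second (assumed)
Rs = 0.008    # service time: 8 ms per request (assumed)

# Utilisation Law: fraction of time the I/O controller is busy.
U = a * Rs                 # 0.8, i.e. 80% utilised

# Assuming a simple single-server queue, response time grows as
# R = Rs / (1 - U); Little's Law then gives the number of queued requests.
R = Rs / (1 - U)           # average response time in seconds
N = a * R                  # Little's Law: requests in the system

print(f"Utilisation U = {U:.2f}")
print(f"Response time R = {R * 1000:.1f} ms")
print(f"Requests in system N = {N:.1f}")
```

At 80% utilisation the response time is five times the bare service time, which is why keeping controller utilisation moderate matters for disk performance.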
Storage System Environment

Self Assessment Question

27. For disk performance to be high, seek time should be __________


a) High
b) Low
c) High or Low as per situation
d) None of the above

Answer: b)

76
Storage System Environment

Self Assessment Question


28. If rotational latency is low, disk performance will be __________
a) High
b) Low
c) High or Low as per situation
d) None of the above

Answer: a)

77
Storage System Environment

Self Assessment Question


29.The time taken by a disk to complete an I/O request is
a) Seek Time
b) Average Seek Time
c) Full stroke
d) Disk service time

Answer: d)

78
Storage System Environment

Self Assessment Question


30. ___________ is the average time taken by the read/write head to move from one point to another
a) Seek Time
b) Average Seek Time
c) Full stroke
d) Track-to-Track

Answer: b)

79
Storage System Environment

Self Assessment Question


31. The time taken to position the R/W heads across the platter with a radial movement is
a) Seek Time
b) Average Seek Time
c) Full stroke
d) Track-to-Track

Answer: a)

80
Storage System Environment

Self Assessment Question


32. The time taken by the R/W head to move across the entire width of the disk, from the innermost track
to the outermost track is called as.
a) Seek Time
b) Average Seek Time
c) Full stroke
d) Track-to-Track

Answer: c)

81
Storage System Environment

Self Assessment Question


33. The time taken by the R/W head to move between adjacent tracks is called as
a) Seek Time
b) Average Seek Time
c) Full stroke
d) Track-to-Track

Answer: d)

82
Storage System Environment

Self Assessment Question

34. The time taken by the platter to rotate and position the data under the R/W head is called
a) Seek Time
b) Average Seek Time
c) Rotational Latency
d) External Latency

Answer: c)

83
Storage System Environment

Self Assessment Question

35. The rate at which data can be moved through the interface to the HBA is known as
a) Internal Transfer rate
b) External Transfer rate
c) Access rate
d) Pulse rate

Answer: b)

84
Storage System Environment

Document Link

• Hard Disk Drive: https://www.explainthatstuff.com/harddrive.html (explains the hard disk drive and its components, presenting details about the structure of a hard disk drive and its functioning)

85
Storage System Environment

Video Link

• Hard Disk Drive in motion: https://www.youtube.com/watch?v=p-JJp-oLx58 (discusses the components and working of a hard disk drive, showing how the components work when the disk is in motion)

86
Storage System Environment

Logical Components
of Host

87
Storage System Environment

Application

▪ The interface between a user and the host


▪ Three-tiered architecture
o Application UI, computing logic and underlying databases.
▪ Application data access can be classified as:
o Block-level access: Data stored and retrieved in blocks, specifying the LBA.
o File-level access: Data stored and retrieved by specifying the name and path of files.

88
Storage System Environment

Operating System

• Resides between the applications and the hardware.


• Controls the environment.

89
Storage System Environment

LVM (Logical Volume Manager)

▪ Responsible for creating and controlling host level logical storage


o A physical view of storage is converted to a logical view by mapping
o Logical data blocks are mapped to physical data blocks
▪ Usually offered as part of the operating system or a third-party host software
▪ LVM Components:
o Physical Volumes
o Volume Groups
o Logical Volumes
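The logical-to-physical mapping these components provide can be sketched in a few lines of Python (a simplified illustration with assumed names and extent sizes, not any real LVM implementation):

```python
# Simplified sketch of LVM-style mapping: a logical volume's blocks are
# backed by equal-sized physical extents drawn from the physical volumes
# of a volume group.  (Illustrative only -- names and sizes are assumed.)

def build_volume_group(physical_volumes, extent_size):
    """Divide each physical volume into equal-sized extents and pool them."""
    pool = []
    for pv_name, pv_size in physical_volumes:
        for i in range(pv_size // extent_size):
            pool.append((pv_name, i * extent_size))  # (PV name, starting offset)
    return pool

def create_logical_volume(pool, lv_size, extent_size):
    """Allocate enough physical extents from the pool to back the LV."""
    needed = -(-lv_size // extent_size)  # ceiling division
    if needed > len(pool):
        raise ValueError("volume group has insufficient free extents")
    return [pool.pop(0) for _ in range(needed)]

# Two 4 MiB physical volumes form one volume group with 1 MiB extents.
MiB = 1024 * 1024
vg = build_volume_group([("pv0", 4 * MiB), ("pv1", 4 * MiB)], MiB)
lv = create_logical_volume(vg, 5 * MiB, MiB)  # a 5 MiB logical volume

print(len(lv))        # 5 extents back the logical volume
print(lv[0], lv[4])   # the first extent sits on pv0; the fifth spills onto pv1
```

Real volume managers track free extents per physical volume and support policies such as striping or mirroring across them; the single pool here is deliberately minimal.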

90
Storage System Environment

[Diagram: LVM maps a logical view of storage onto the underlying physical storage]
91
Storage System Environment

LVM Example

[Diagram: servers access logical volumes that LVM builds from physical volumes through partitioning and concatenation]
92
Storage System Environment

Volume Group

• One or more Physical Volumes form a Volume Group.


• LVM manages Volume Groups as a single entity.
• Physical Volumes can be added and removed from a Volume Group as necessary.
• Physical Volumes are typically divided into contiguous equal-sized disk blocks.

93
Storage System Environment

Logical Volume

[Diagram: a logical volume is made up of logical disk blocks that map onto physical disk blocks spread across Physical Volumes 1–3 of a volume group]

94
Storage System Environment

Device Drivers

▪ Enables the Operating System to recognize the device.


▪ Provides API to access and control devices.
▪ They are hardware dependent and Operating System specific.

File System
▪ A file is a collection of related records or data stored as a unit.
▪ A file system is the organisation of files on a disk.
Examples: FAT32, NTFS, UNIX FS, and EXT2/3.

95
Storage System Environment

Self Assessment Question

36. Which of these is not a LVM component?

a) Physical Volumes
b) Logical Volumes
c) Logical Groups
d) Volume Groups

Answer: c)

96
Storage System Environment

Self Assessment Question

37. One or more ________________ form a Volume Group


a) Logical Volumes
b) Physical Volumes
c) LVM
d) None of the Above

Answer: b)

97
Storage System Environment

Self Assessment Question

38. __________ enables operating system to recognize the device.


a) Logical Hardware
b) Physical Volumes
c) Device Drivers
d) HDD

Answer: c)

98
Storage System Environment

Self Assessment Question

39. Which one is a collection of related records or data stored as a unit ?


a) File
b) Volume
c) Record
d) HDD

Answer: a)

99
Storage System Environment

Self Assessment Question

40. LVM manages Volume Groups as a ________________ entity.


a) Double
b) Triple
c) Single
d) Quad

Answer: c)

10
Storage System Environment

RAID technology

10
Storage System Environment

RAID technology

• RAID stands for Redundant Array of Independent Disks.

• It is a technology that enables greater levels of performance, reliability and/or large volumes when
dealing with data.

• High performance is achieved by concurrent use of two or more ‘hard disk drives’.

• Mirroring, striping (of data), and error-correction techniques combined with multiple disk arrays give the desired reliability and performance.

10
Storage System Environment

RAID flavours

Commonly used ones:

1. RAID 0
2. RAID 1
3. RAID 5
4. RAID 10

Other rarely used types: RAID 2, 3, 4, 6, 50, …

10
Storage System Environment

RAID 0

• It splits data among two or more disks.

• Provides good performance.

• Lack of data redundancy means there is no failover


support with this configuration.

• In the diagram to the right, the odd blocks are written


to disk 0 and the even blocks to disk 1 such that A1,
A2, A3, A4, … would be the order of blocks if read
sequentially from the beginning.

• Used in read-only NFS systems and gaming systems.
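The round-robin placement of blocks across disks can be sketched as follows (hypothetical block and disk numbering, for illustration only):

```python
# Sketch of RAID 0 striping: logical blocks are assigned round-robin
# across the disks in the set, so consecutive blocks land on different
# drives and can be serviced concurrently.  (Illustration only.)

def raid0_location(block, num_disks):
    """Map a logical block number to (disk index, strip index on that disk)."""
    return block % num_disks, block // num_disks

# With two disks, alternating blocks go to disk 0 and disk 1, so a
# sequential read A1, A2, A3, A4 alternates between the two drives.
layout = {b: raid0_location(b, 2) for b in range(6)}
print(layout)
```

Because there is no redundancy, losing any one disk in this mapping loses every stripe — which is why RAID 0 is restricted to workloads that can tolerate data loss.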

10
Storage System Environment

RAID 1

• RAID 1 is ‘data mirroring’.

• Two copies of the data are held on two physical


disks, and the data is always identical.

• Twice as many disks are required to store the same


data when compared to RAID 0.

• Array continues to operate so long as at least one


drive is functioning.

10
Storage System Environment

RAID 5

• RAID 5 is an ideal combination of


good performance, good fault
tolerance and high capacity and
storage efficiency.

• An arrangement of parity and


CRC to help rebuild drive data in
case of disk failures.

• “Distributed Parity” is the key


word here.

10
Storage System Environment

RAID 10

• Combines RAID 1 and RAID 0 technology.

• This combination provides both good performance and good failover handling.

• Also called ‘Nested RAID’.

10
Storage System Environment

Software Based RAID

• Software implementations are provided by many operating systems.

• A software layer sits above the disk device drivers and provides an abstraction layer between the
logical drives(RAIDs) and physical drives.

• Servers’ processor is used to run the RAID software.

• Used for simpler configurations like RAID 0 and RAID 1.

10
Storage System Environment

Hardware-Based RAID

• A hardware implementation of RAID


requires at least a special-purpose RAID
controller.

• On a desktop system, this may be built


into the motherboard.

• The processor is not used for RAID


calculations as a separate controller is
present.

A PCI-bus-based, IDE/ATA hard disk RAID controller, supporting levels 0, 1 and 10.

10
Storage System Environment

Self Assessment Question

41. Data Mirroring is a phenomenon witnessed in which RAID flavour?

a) RAID 0
b) RAID 1
c) RAID 5
d) None of the above

Answer: b)

11
Storage System Environment

Self Assessment Question

42. Hardware based RAID implementation requires no processor use for RAID calculations because
________
a) Does not involve calculations
b) Separate controller is present for calculations
c) Separate programme is present for calculations
d) None of the above

Answer: b)

11
Storage System Environment

Self Assessment Question

43. Which of these can be achieved through RAID?


a) High performance
b) Reliability
c) Fault tolerance
d) All of the above
Answer: d)

11
Storage System Environment

Self Assessment Question

44. RAID stands for


a) Redundant Array of Independent Disks
b) Replica Array of Independent Disks
c) Redundant Array of Indecent Disks
d) Relay Array of Independent Disks

Answer: a)

11
Storage System Environment

Self Assessment Question

45. ______________ measures (in hours) the average life expectancy of an HDD

a) Average Time Between Failure (ATBF)


b) Total Time Between Failure (TTBF)
c) Standard Time Between Failure (STBF)
d) Mean Time Between Failure (MTBF)

Answer: d)

11
Storage System Environment

Self Assessment Question

46. Consider a storage array of 100 HDDs, each with an MTBF of 750,000 hours. This means that a HDD
in this array is likely to fail at least once in ____________.

a) 7,500 hours
b) 750 hours
c) 75 hours
d) 700 hours

Answer: a)

11
Storage System Environment

Self Assessment Question

47. Which is a technique whereby data is stored on two different HDDs, yielding two copies of data?
a) Parity
b) Striping
c) Mirroring
d) Seizing

Answer: c)

11
Storage System Environment

Self Assessment Question

48. In which RAID configuration is data striped across the HDDs in a RAID set?
a) RAID 3
b) RAID 2
c) RAID 1
d) RAID 0

Answer: d)

11
Storage System Environment

Self Assessment Question

49. In which RAID configuration is data mirrored to improve fault tolerance?


a) RAID 3
b) RAID 2
c) RAID 1
d) RAID 0

Answer: c)

11
Storage System Environment

Self Assessment Question

50. Consider an application that generates 5,200 IOPS, with 60 percent of them being reads. The disk load
in RAID 5 is
a) 11,440 IOPS
b) 11,404 IOPS
c) 11,004 IOPS
d) 11,400 IOPS

Answer: a)
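The figure follows from the usual RAID write-penalty rule: each host write becomes 4 disk I/Os on RAID 5 (read old data, read old parity, write new data, write new parity), 2 on RAID 1, and 6 on RAID 6. A quick check using the values from the question above:

```python
# Disk load = reads + write_penalty * writes, using the standard RAID
# write penalties (RAID 0: 1, RAID 1/10: 2, RAID 5: 4, RAID 6: 6).

WRITE_PENALTY = {"RAID 0": 1, "RAID 1": 2, "RAID 5": 4, "RAID 6": 6}

def disk_load(reads, writes, raid_level):
    """Back-end disk IOPS generated by a given host read/write mix."""
    return reads + WRITE_PENALTY[raid_level] * writes

reads = 5200 * 60 // 100   # 60% of 5,200 IOPS = 3,120 reads
writes = 5200 - reads      # remaining 40% = 2,080 writes

print(disk_load(reads, writes, "RAID 5"))  # 3120 + 4*2080 = 11440
print(disk_load(reads, writes, "RAID 1"))  # 3120 + 2*2080 = 7280
```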

11
Storage System Environment

Document Link

Topics URL Notes

RAID technology — https://www.prepressure.com/library/technology/raid — This link explains the details of RAID technology with advantages and disadvantages of different RAID flavours.

12
Storage System Environment

Video Links

Topics URL Notes

RAID technology — https://www.youtube.com/watch?v=wTcxRObq738 — This video explains RAID 0, 1, 2, 5, 6, 10.

RAID controller — https://www.youtube.com/watch?v=wpZ45890zEk — In this video, you will learn about the RAID controller and its use and need.

12
Storage System Environment

Storage Networking
Technologies

12
Storage System Environment

Overview

There are 4 basic forms of Network Storage:

• Direct attached storage (DAS)


• Network attached storage (NAS)
• Storage area network (SAN)
• Internet Protocol Storage area network (IP-SAN)

12
Storage System Environment

Comparison
                     DAS                  NAS                   SAN
Storage Type         sectors              shared files          blocks
Data Transmission    IDE/SCSI             TCP/IP, Ethernet      Fibre Channel
Access Mode          clients or servers   clients or servers    servers
Capacity (bytes)     10^9                 10^9 – 10^12          10^12
Complexity           Easy                 Moderate              Difficult
Management Cost
(per GB)             High                 Moderate              Low
12
Storage System Environment

DAS Types
[Diagram: clients reach servers over an Ethernet network; a small server attaches directly to an IDE disk array, while a large server attaches to a SCSI disk array over a SCSI channel]

12
Storage System Environment

SCSI

▪ SCSI stands for Small Computer System Interface.


▪ It is an I/O bus for a peripheral device, such as hard drives, tape drives, CD-ROM, scanners, etc.
• an improvement over IDE
▪ A single SCSI bus connects multiple devices (a maximum of 7 or 15, depending on bus width).
▪ High-speed data transfer:
• 5, 10, 20, 100, 320 MB/sec.
▪ Overlapping I/O capability:
• Multiple read and write commands can be outstanding simultaneously.
• Different SCSI drives to be processing commands concurrently rather than serially. The data can
then be buffered and transferred over the SCSI bus at very high speeds.
12
Storage System Environment

SCSI Distribution Architecture

▪ SCSI is a client/server architecture.

▪ The client is called the initiator and issues requests to the server. The client is the I/O subsystem under typical OS control.

▪ The “server” is called the target, which is the SCSI controller inside the storage device. It receives, processes, and responds to the requests from the initiator.

[Diagram: the initiator (client) sends a request to the target (storage device), which returns a response]

▪ SCSI commands support block I/O, transferring large amounts of data in blocks.

12
Storage System Environment

SAN

• A Storage Area Network (SAN) is a specialised, dedicated high-speed network joining servers and
storage, including disks, disk arrays, tapes, etc.

• Storage (data store) is separated from the processors (and separated processing).

• High capacity, high availability, high scalability, ease of configuration, ease of reconfiguration.

• Fibre Channel is the de facto SAN networking architecture, although other network standards could
be used.

12
Storage System Environment

SAN Benefits

• Storage consolidation.
• Data sharing.
• Non-disruptive scalability for growth.
• Improved backup and recovery.
• Tape pooling.
• LAN-free and server-free data movement.
• High performance.
• High availability server clustering.
• Data integrity.
• Disaster tolerance.
• Ease of data migration.
• Cost-effectiveness (total cost of ownership).
12
Storage System Environment

SAN Architecture

• Fibre Channel (FC) is well established in the open systems environment as the underlining architecture
of the SAN.

• Fibre Channel is structured with independent layers, as are other networking protocols. There are five
layers, where 0 is the lowest layer. The physical layers are 0 to 2. These layers carry the physical
attributes of the network and transport the data created by the higher level protocols, such as SCSI,
TCP/IP, or FICON.

• ANSI T11 is the technical committee who produced FC Protocol (interface standard) for
high-performance and mass storage applications.

• FC Protocol is designed to transport multiple protocols, such as HIPPI, IPI, SCSI, IP, Ethernet, etc.

13
Storage System Environment

FC Protocol Layers

13
Storage System Environment

SAN Topologies

Fibre Channel-based networks support three types of topologies:

• Point-to-point.
• Loop (arbitrated) – shared media.
• Switched.

13
Storage System Environment

Point-to-Point

• The point-to-point topology is the easiest Fibre Channel configuration to implement, and it is also the
easiest to administer.

• The distance between nodes can be up to 10 km.

13
Storage System Environment

Arbitrated Loop

• Shared Media Transport. Similar in concept


to shared Ethernet.

• Not common for FC-based SAN.

• Commonly used for JBOD (Just a Bunch of


Disks).

• An arbitration protocol determines who can


access the media.

13
Storage System Environment

Switched FC

• Fibre Channel-switches function in a manner similar to traditional network switches to provide increased
bandwidth, scalable performance, an increased number of devices, and, in some cases, increased
redundancy. Fibre Channel-switches vary in the number of ports and media types they support.

• Multiple switches can be connected to form a switch fabric capable of supporting a large number of host
servers and storage subsystems.

[Diagram: servers connected to storage devices through a Fibre Channel switch]

13
Storage System Environment

NAS

▪ NAS is a dedicated storage device, and it operates in a client/server mode.


▪ NAS is connected to the file server via LAN.
▪ Protocol: NFS (or CIFS) over an IP Network.
• Network File System (NFS) – UNIX/Linux.
• Common Internet File System (CIFS) – Windows; remote file systems (drives) are mounted on the local system.
o evolved from Microsoft NetBIOS, NetBIOS over TCP/IP (NBT), and Server Message Block
(SMB).
▪ Advantage: No distance limitation.
▪ Disadvantage: Speed and Latency.
▪ Weakness: Security.
13
Storage System Environment

NAS Implementation

CIFS protocol stack: SMB over NetBIOS over TCP over IP over 802.3.

13
Storage System Environment

NAS Implementation

NFS protocol stack: NFS over TCP over IP over 802.3.

13
Storage System Environment

NAS vs SAN

Traditionally:

• NAS is used for low-volume access to a large amount of storage by many users.

• SAN is the solution for terabytes (10^12 bytes) of storage and multiple, simultaneous access to files, such as streaming audio/video.

The lines between the two technologies are becoming blurred, and while the SAN-versus-NAS debate continues, the fact is that both technologies complement each other.

13
Storage System Environment

IP-SAN

▪ IP storage networking – carrying storage traffic over IP.

▪ Uses TCP, reliable transport for delivery.

▪ Can be used for local data Centre and long haul applications.

▪ Two primary IETF protocols/standards:

• iSCSI – Internet SCSI – allows block storage to be accessed over a TCP/IP network as though it
were locally attached

• FCIP – Fibre Channel over IP – used to tunnel Fibre Channel frames over TCP/IP connections

14
Storage System Environment

Self Assessment Question

51. Which of these support IDE/SCSI as data transmission mode?


a) DAS
b) NAS
c) SAN
d) None of the above

Answer: a)

14
Storage System Environment

Self Assessment Question

52. Among these, the one having moderate complexity is _________________

a) DAS
b) NAS
c) SAN
d) None of the above

Answer: b)

14
Storage System Environment

Self Assessment Question

53. Fibre Channel is the networking architecture of _________________


a) DAS
b) NAS
c) SAN
d) None of the above

Answer: c)

14
Storage System Environment

Summary

• A storage system is a group of components providing storage facilities to one or numerous computers.
• The main components of storage system environments are the host, connectivity, and storage.
• Components of a disk drive are platter, spindle, read/write head, actuator arm assembly, and controller.
• A host is a computer or any other device on which applications run.
• The logical components of a host are OS, device driver, volume manager, file system, volume group.
• Zoned bit recording groups tracks into zones based on their distance from the centre of the disk.

14
Storage System Environment

Assignment

1. Explain the various components of storage system.


2. What is the difference between port and cable?
3. Write a description on various hard disk drive components.
4. Draw a neat diagram of a hard disk drive, showing all its components.
5. Explain various factors affecting hard disk drive performance.
6. Explain the difference between RAID 0 and RAID 1 with examples.
7. Explain various SAN topologies.

14
Storage System Environment

Assignment
8. Explain the three main components of the storage system.
9. Discuss the physical components of the host in the storage system.
10. Compare RAM and ROM storage devices
11. Explain the different types of data communication between the I/O devices and host.
12. Recommend the physical components of connectivity between the host and storage. Explain.
13. Give the classification of buses for data transfer.
14. Discuss the logical components of the connectivity in the storage system.

14
Storage System Environment

Assignment
15. List out the merits and limitations of tape storage device.
16. Discuss the different types of storage devices that use magnetic media.
17. With a neat diagram, explain the Disk Drive Components.
18. With a neat diagram, explain Physical Disk Structure.
19. Explain the concept of Zone Bit Recording to store the information in the disk.
20. Discuss the concept of logical block addressing in the disk storage device.
21. The utilisation law is a fundamental law that defines the I/O controller utilization. Justify the
statement.

14
Storage System Environment

Assignment
22. Discuss the necessity of RAID Technology?
23. Elaborate the different types of RAID implementation. What are the benefits and limitation of the
techniques?
24. With a neat diagram, explain the components of RAID Array.
25. Explain the RAID techniques that define the various RAID levels.
26. Compare the different RAID techniques.
27. Recommend the techniques determine the data availability and performance characteristics of a
redundant array.

14
Storage System Environment

Assignment
28. Consider the three users A, B and C. The user B wants to store the two copies of data on two
different HDDs, user A wants to store the parts of data on different disks and user C wants to
protect and store the segments of the data. Identify the RAID techniques that can be used to store
the user A, B and C data.
29. List the various RAID levels with the RAID techniques adopted.
30. Compare the different RAID levels.
31. With a neat diagram, explain the RAID 0 and 1 levels to improve the disk performance.
32. With a neat diagram, explain the nested RAID levels to improve the disk performance.

14
Storage System Environment

Assignment

33. Consider an application that generates 5,200 IOPS, with 60 percent of them being reads. Compute the disk load in RAID 5 and RAID 1.
34. What is DAS? Explain the different types of DAS architectures.
35. List out the benefits and limitations of DAS architecture.
36. Data Centre managers are burdened with the challenging task of providing low-cost, high-performance information management solutions. Justify the statement.
37. With a neat diagram, explain the SAN implementation.
38. List and explain the components of the SAN.
39. With a neat diagram, explain the fibre Channel SAN evolution.

15
Storage System Environment

Assignment

40. Distinguish between multi-mode and single-mode fibre cables for data transfer in FC SAN.
41. Compare and contrast between FC switch and FC hub.
42. FC architecture supports three basic interconnectivity options. List, explain and compare them.
43. With neat diagrams, explain the FC topologies most commonly deployed in SAN implementations.

15
Storage System Environment

Document Links

Topics URL Notes


Storage System — https://searchstorage.techtarget.com/tip/The-components-that-make-up-a-well-balanced-data-storage-system — The link explains in detail all aspects of a balanced data storage system.

Magnetic storage — http://study.com/academy/lesson/magnetic-storage-definition-devices-examples.html — This link explains Magnetic Storage: Definition, Devices and Examples.

Optical media — http://searchstorage.techtarget.com/definition/optical-media — This link explains different types of optical media (CD, DVD).

Solid State Devices — http://searchstorage.techtarget.com/definition/solid-state-storage — This link explains Solid State Storage technology.

Performance assessment of Disk Drive — https://www.pctechguide.com/hard-disks/hard-disk-hard-drive-performance-transfer-rates-latency-and-seek-times — This link explains various parameters of drive performance: transfer rates, latency and seek times.

Hard disk drive — https://www.explainthatstuff.com/harddrive.html — This link explains the hard disk drive and its components. It presents details about the structure of a hard disk drive and its functioning.
15
Storage System Environment

Video Links

Topics URL Notes

Storage Media — https://www.youtube.com/watch?v=NgVm2kUmrYo — This video explains various types of storage media.

HDD working — https://www.youtube.com/watch?v=p-JJp-oLx58 — This video illustrates various components of the hard disk drive and shows their working.

Storage Area Network — https://www.youtube.com/watch?v=7et3bmXhr8w — This video explains the building blocks of SAN.

Introduction to NAS — https://www.youtube.com/watch?v=KXQUpJWTrlA — This video covers various aspects of NAS in detail.

15
Storage System Environment

E-Book Links

Topics URL Page Number

Storage Systems Architecture — http://read.pudn.com/downloads168/ebook/772307/STF2-0SectionIntroduction.pdf — Pages 1 to 7

Information Storage and Management — https://www.pdfdrive.com/information-storage-and-management-volume-1-of-rnhtnet-d17363479.html — Pages 1-317

15
DEPARTMENT OF BCA
STORAGE MANAGEMENT
SUB CODE: 16BCA525D21

MODULE 3: RAID AND STORAGE NETWORKING TECHNOLOGIES

Module 3 V Sem, SM, Department of BCA


CONTENTS :

Module 3 :
Implementation of RAID - Software RAID - Hardware RAID -RAID Array Component -RAID Levels -
Striping -Mirroring -Parity RAID 0 RAID 1 -Nested RAID -RAID 3 -RAID 4 -RAID 5 --RAID 6 -RAID
Comparison -RAID Impact on Disk-Performance - Application IOPS and RAID Configurations -
Introduction to Direct Attached Storage – Types of DAS – Introduction to SAN – Components of SAN –
FC connectivity – FC topologies – Introduction to NAS – NAS components – NAS Implementation – NAS
File sharing.

10/7/2021 2
V SEM, SM, DEPARTMENT OF BCA
RAID and Storage Networking Technologies

AIM

To familiarise students with RAID and provide the good knowledge of different storage technologies
such DAS, SAN, FC and NAS.

3
RAID and Storage Networking Technologies

Objective

The Objectives of this module are to understand:

• The requirement of RAID.


• The details of various RAID Technologies.
• The view of Storage Networking Technologies.

4
RAID and Storage Networking Technologies

Outcome
At the end of this module, you are expected to:

• Describe RAID and the needs it addresses.


• Describe the concepts upon which RAID is built.
• Define and compare RAID levels.
• Recommend the use of the common RAID levels based on performance and availability
considerations.
• Explain factors impacting disk drive performance.
• Be familiar with Storage Networking Technologies (DAS, SAN, NAS).

5
RAID and Storage Networking Technologies

Content
• Why RAID?
• Implementation of RAID
• RAID Array Component
• RAID technology
• RAID Levels
• RAID Comparison
• RAID Impact on Disk-Performance
• Application IOPS and RAID Configurations
• Storage Networking Technologies (DAS, SAN, FC, and NAS)

6
RAID and Storage Networking Technologies

Why RAID
• Performance limitation of disk drive.
• An individual drive has a certain life expectancy.
• Measured in MTBF (Mean Time Between Failure).
The more HDDs in a storage array, the greater the probability of disk failure. For example:
• If the MTBF of a drive is 750,000 hours, and there are 100 drives in the array, then the MTBF of
the array becomes 750,000 / 100, or 7,500 hours.
• RAID was introduced to mitigate this problem.
RAID provides:
• Increased capacity
• Higher availability
• Increased performance
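The MTBF arithmetic above is simple division and can be checked directly (a simplified model that assumes identical, independent drives, as the slide does):

```python
# Simplified array-MTBF model from the slide: with N identical,
# independent drives, some drive in the array fails on average every
# MTBF_drive / N hours.

def array_mtbf(drive_mtbf_hours, num_drives):
    return drive_mtbf_hours / num_drives

# 100 drives of 750,000 hours each: a drive failure roughly every 7,500 hours.
print(array_mtbf(750_000, 100))  # 7500.0
```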
7
RAID and Storage Networking Technologies

Implementation of RAID

8
RAID and Storage Networking Technologies

There are two types of RAID implementation, hardware and software.

Software Based RAID

• Software implementations are provided by many operating systems.

• A software layer sits above the disk device drivers and provides an abstraction layer between the
logical drives (RAIDs) and physical drives.

• Server's processor is used to run the RAID software.

• Used for simpler configurations like RAID 0 and RAID 1.

9
RAID and Storage Networking Technologies

Limitations of Software Based RAID

• Performance: Software RAID affects overall system performance.

• Supported features: Software RAID does not support all RAID levels.

• Operating system compatibility: Software RAID is tied to the host operating system; hence, upgrades to software RAID or to the operating system should be validated for compatibility.

10
RAID and Storage Networking Technologies

Hardware Based RAID


A hardware implementation of RAID requires at
least a special-purpose RAID controller.

• On a desktop system this may be built


into the motherboard.

• Processor is not used for RAID


calculations as a separate controller is
present.

A PCI-bus-based, IDE/ATA hard disk RAID controller, supporting levels 0, 1 and 10.

11
RAID and Storage Networking Technologies

Hardware Based RAID

• In hardware RAID implementations, a specialised hardware controller is implemented either on the


host or on the array.

• Controller card RAID is host-based hardware RAID implementation in which a specialised RAID
controller is installed in the host and HDDs are connected to it.

• The RAID Controller interacts with the hard disks using a PCI bus.

• The external RAID controller is an array-based hardware RAID. It acts as an interface between the
host and disks.

• Key functions of RAID controllers are:
  ■ Management and control of disk aggregations
  ■ Translation of I/O requests between logical disks and physical disks
  ■ Data regeneration in the event of disk failures

12
RAID and Storage Networking Technologies

RAID Array Components

13
RAID and Storage Networking Technologies

RAID Array Components


[Diagram: a RAID array enclosure containing a RAID controller and hard disks; the disks are grouped into physical arrays and logical arrays, and the array connects to the host]

14
RAID and Storage Networking Technologies

RAID Array Components

• A RAID array is an enclosure that contains a number of HDDs and the supporting hardware and
software to implement RAID.

• HDDs inside a RAID array are usually contained in smaller sub-enclosures.

• These sub-enclosures, or physical arrays, hold a fixed number of HDDs, and may also include other
supporting hardware, such as power supplies.

• A subset of disks within a RAID array can be grouped to form logical associations called logical
arrays, also known as a RAID set or a RAID group.

• Logical arrays consist of Logical Volumes (LV). The operating system recognises the LVs as if they are physical HDDs managed by the RAID controller.

15
RAID and Storage Networking Technologies

RAID technology
and Levels

16
RAID and Storage Networking Technologies

RAID technology

• RAID stands for Redundant Array of Independent Disks.

• It is a technology that enables greater levels of performance, reliability and/or large volumes when
dealing with data.

• High performance is achieved by concurrent use of two or more ‘hard disk drives’.

• Mirroring, striping (of data), and error-correction techniques combined with multiple disk arrays give the desired reliability and performance.

17
RAID and Storage Networking Technologies

RAID Levels
• 0 Striped array with no fault tolerance
• 1 Disk mirroring
• Nested RAID (i.e., 1 + 0, 0 + 1, etc.)
• 3 Parallel access array with dedicated parity disk
• 4 Striped array with independent disks and a dedicated parity disk
• 5 Striped array with independent disks and distributed parity
• 6 Striped array with independent disks and dual distributed parity

18
RAID and Storage Networking Technologies

RAID flavors

Commonly used ones:

1. RAID 0
2. RAID 1
3. RAID 5
4. RAID 10

Other rarely used types: RAID 2, 3, 4, 6, 50, …

19
RAID and Storage Networking Technologies
Data Organization: Striping

[Diagram: each disk in the set is divided into strips; strip 1, strip 2, and strip 3 at the same position across the disks together form stripe 1, with the next row of strips forming stripe 2]
20
RAID and Storage Networking Technologies

RAID 0

• It splits data among two or more disks.

• Provides good performance.

• Lack of data redundancy means there is no fail over


support with this configuration.

• In the diagram to the right, the odd blocks are written


to disk 0 and the even blocks to disk 1 such that A1,
A2, A3, A4, … would be the order of blocks if read
sequentially from the beginning.

• Used in read only NFS systems and gaming systems.

21
RAID and Storage Networking Technologies

RAID 0

• Data is distributed across the HDDs in the RAID set.

• Allows multiple data to be read or written simultaneously, and therefore improves performance.

• Does not provide data protection and availability in the event of disk failures.

22
RAID and Storage Networking Technologies

RAID 0

[Diagram: the host sends data through the RAID controller, which stripes it across the disks: disk 1 holds blocks 1, 5, 9; disk 2 holds blocks 2, 6, 10; disk 3 holds blocks 3, 7, 11]

23
RAID and Storage Networking Technologies

RAID 1

• Data is stored on two different HDDs, yielding two copies of the same data.
• Provides availability.

• In the event of HDD failure, access to data is still available from the surviving HDD.

• When the failed disk is replaced with a new one, data is automatically copied from the surviving disk to the
new disk.
• Done automatically by the RAID controller.

• Disadvantage: The amount of storage capacity is twice the amount of data stored.

• Mirroring is NOT the same as doing backup.

24
RAID and Storage Networking Technologies

Mirroring

• RAID 1 is ‘data mirroring’.

• Two copies of the data are held on two physical


disks, and the data is always identical.

• Twice as many disks are required to store the same


data when compared to RAID 0.

• Array continues to operate so long as at least one


drive is functioning.

25
RAID and Storage Networking Technologies

RAID 1

[Diagram: the host writes blocks 0 and 1 through the RAID controller; each of the two disks holds an identical copy of the data]

26
RAID and Storage Networking Technologies

Nested RAID
• Combines the performance benefits of RAID 0 with the redundancy benefit of RAID 1.

RAID 0+1 – Mirrored Stripe

• Data is striped across HDDs, then the entire stripe is mirrored.

• If one drive fails, the entire stripe is faulted.


• Rebuild operation requires data to be copied from each disk in the healthy stripe, causing increased
load on the surviving disks.

RAID 1+0 – Striped Mirror

• Data is first mirrored, and then both copies are striped across multiple HDDs.

• When a drive fails, data is still accessible from its mirror.

• Rebuild operation only requires data to be copied from the surviving disk into the replacement disk.
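The difference between the two orderings can be made concrete by laying out a few blocks (hypothetical block names; a simplified sketch, not a real controller's layout):

```python
# RAID 0+1 stripes first, then mirrors the whole stripe set;
# RAID 1+0 mirrors first, then stripes across the mirrored pairs.
# Both end up holding the same data, but the grouping -- and hence
# the rebuild behaviour on failure -- differs.  (Illustration only.)

def stripe(blocks, width):
    """Round-robin blocks across `width` columns (RAID 0)."""
    cols = [[] for _ in range(width)]
    for i, b in enumerate(blocks):
        cols[i % width].append(b)
    return cols

blocks = ["B0", "B1", "B2", "B3"]

# RAID 0+1: build one stripe set, then mirror it as a single unit.
raid01 = {"stripe set A": stripe(blocks, 2),
          "stripe set B (mirror)": stripe(blocks, 2)}

# RAID 1+0: each column is a two-disk mirror; data is striped across pairs.
raid10 = [{"disk": col, "mirror": list(col)} for col in stripe(blocks, 2)]

print(raid01)
print(raid10)
```

In the 0+1 layout a single disk failure faults an entire stripe set, so the rebuild copies every disk of the healthy set; in the 1+0 layout only the failed disk's mirror partner is copied.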
27
RAID and Storage Networking Technologies

Nested RAID – 0+1 (Striping and Mirroring)

[Diagram: the host sends blocks 0–3 through the RAID controller; they are striped (RAID 0) across two disks — blocks 0 and 2 on one, blocks 1 and 3 on the other — and the stripe set is then mirrored (RAID 1)]

28
RAID and Storage Networking Technologies

Nested RAID – 0+1 (Striping and Mirroring)


[Diagram: stripe set one holds blocks 0 and 2 on one disk and blocks 1 and 3 on another (RAID 0); the entire stripe set is mirrored to a second pair of disks (RAID 1)]

29
RAID and Storage Networking Technologies

Nested RAID – 1+0 (Mirroring and Striping)

[Diagram: the host sends blocks 0–3 through the RAID controller; each block is first mirrored (RAID 1) onto a disk pair, and data is then striped (RAID 0) across the mirrored pairs]

30
RAID and Storage Networking Technologies

Nested RAID – 1+0 (Mirroring and Striping)


[Diagram: mirrored pair one holds blocks 0 and 2, mirrored pair two holds blocks 1 and 3 (RAID 1); data is striped across the two mirrored pairs (RAID 0)]

31
RAID and Storage Networking Technologies

RAID 10

• Combines RAID 1 and RAID 0 technology.

• This combination provides both good performance and good failover handling.

• Also called ‘Nested RAID’.

32
RAID and Storage Networking Technologies

RAID Redundancy: Parity


[Diagram: data disks hold the values 4, 6, 1, and 7; the parity disk stores their sum, 18]

The middle drive fails. The parity calculation regenerates the lost value:
4 + 6 + 1 + 7 = 18
4 + 6 + ? + 7 = 18
? = 18 – 4 – 6 – 7
? = 1
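The additive parity on this slide is an analogy; actual RAID implementations use bitwise XOR, which has the same "solve for the missing term" property. A minimal sketch using the slide's strip values:

```python
# XOR parity: the parity strip is the XOR of all data strips, so any one
# missing strip equals the XOR of the parity with the surviving strips.
from functools import reduce

def parity(strips):
    return reduce(lambda a, b: a ^ b, strips)

data = [4, 6, 1, 7]      # strip values from the slide's example
p = parity(data)         # stored on the parity disk

# The drive holding the value 1 fails; regenerate it from the rest.
surviving = [4, 6, 7]
rebuilt = parity(surviving + [p])
print(rebuilt)  # 1
```

XOR is used because it works bit-by-bit on strips of any size and needs no carries, unlike the arithmetic sum in the diagram.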
33
RAID and Storage Networking Technologies

RAID 3 and RAID 4

• Stripes data for high performance and uses parity for improved fault tolerance.

• One drive is dedicated for parity information.

• If a drive fails, data can be reconstructed using the parity information and the data on the surviving drives.

• For RAID 3, data read / write is done across the entire stripe.

• Provide good bandwidth for large sequential data access such as video streaming.

• For RAID 4, data reads and writes can be done independently on a single disk.

34
RAID and Storage Networking Technologies

RAID 3

[Diagram: a host writes blocks 0–3 through a RAID controller; the blocks are striped across the data disks, parity P0123 is generated, and the parity is stored on a dedicated parity disk.]

35
RAID and Storage Networking Technologies

RAID 4

• Similar to RAID 3, RAID 4 stripes data for high performance and uses parity for improved fault tolerance.

• Data is striped across all disks except the parity disk in the array.

• Parity information is stored on a dedicated disk so that the data can be rebuilt if a drive fails.

• Striping is done at the block level.

• Unlike RAID 3, data disks in RAID 4 can be accessed independently, so that specific data elements can be
read or written on single disk without read or write of an entire stripe.

• RAID 4 provides good read throughput and reasonable write throughput.

36
RAID and Storage Networking Technologies

RAID 4

37
RAID and Storage Networking Technologies

RAID 5 and RAID 6

• RAID 5 is similar to RAID 4, except that the parity is distributed across all disks instead of stored on a
dedicated disk.
• This overcomes the write bottleneck on the parity disk.

• RAID 6 is similar to RAID 5, except that it includes a second parity element to allow survival in the event
of two disk failures.
• The probability of a double failure increases as the number of drives in the array increases.
• Calculates both horizontal parity (as in RAID 5) and diagonal parity.
• Has more write penalty than in RAID 5.
• Rebuild operation may take longer than on RAID 5.

38
RAID and Storage Networking Technologies

RAID 5
[Diagram: a host writes blocks 0–7 through a RAID controller; the blocks are striped across five disks, and the generated parity segments (P0123, P4567) are distributed across different disks rather than placed on a dedicated parity disk.]

39
RAID and Storage Networking Technologies

RAID 5
• RAID 5 is an ideal combination of good performance, good fault tolerance, high capacity and storage efficiency.

• An arrangement of parity and CRC helps rebuild drive data in case of disk failures.

• "Distributed parity" is the key phrase here.

40
RAID and Storage Networking Technologies

RAID Comparison

41
RAID and Storage Networking Technologies

RAID Comparison
RAID 0 – Min disks: 2; Storage efficiency: 100%; Cost: low; Read performance: very good for both random and sequential reads; Write performance: very good.

RAID 1 – Min disks: 2; Storage efficiency: 50%; Cost: high; Read performance: good, better than a single disk; Write performance: good, but slower than a single disk, as every write must be committed to two disks.

RAID 3 – Min disks: 3; Storage efficiency: (n-1)*100/n, where n = number of disks; Cost: moderate; Read performance: good for random reads and very good for sequential reads; Write performance: poor to fair for small random writes, good for large sequential writes.

RAID 5 – Min disks: 3; Storage efficiency: (n-1)*100/n, where n = number of disks; Cost: moderate; Read performance: very good for random reads, good for sequential reads; Write performance: fair for random writes (slower due to parity overhead), fair to good for sequential writes.

RAID 6 – Min disks: 4; Storage efficiency: (n-2)*100/n, where n = number of disks; Cost: moderate, but more than RAID 5; Read performance: very good for random reads, good for sequential reads; Write performance: good for small random writes (has write penalty).

RAID 1+0 and 0+1 – Min disks: 4; Storage efficiency: 50%; Cost: high; Read performance: very good; Write performance: good.

42
RAID and Storage Networking Technologies
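The storage-efficiency formulas in the comparison table can be checked with a small sketch. The usable fractions follow the table: mirrored sets keep one of two copies, RAID 3/5 lose one disk's worth of capacity to parity, and RAID 6 loses two:

```python
def storage_efficiency(level, n):
    """Usable capacity as a percentage of raw capacity for n disks,
    following the formulas in the RAID comparison table."""
    if level == 0:
        return 100.0                 # striping only, no redundancy
    if level in (1, 10):
        return 50.0                  # mirrored: one of two copies is usable
    if level in (3, 5):
        return (n - 1) * 100.0 / n   # one disk's worth of parity
    if level == 6:
        return (n - 2) * 100.0 / n   # two parity elements
    raise ValueError("unknown RAID level")

print(storage_efficiency(5, 5))  # -> 80.0
print(storage_efficiency(6, 6))  # about 66.7
```

For example, a five-disk RAID 5 set yields 80% usable capacity, while a six-disk RAID 6 set yields roughly two thirds.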

RAID Impact on
Performance of Disk

43
RAID and Storage Networking Technologies
RAID Impact on Performance: small writes

[Diagram: a RAID controller updates data disk D4 and parity disk P0; the new parity is computed by XOR-ing the old parity with the old and new data.]

• Small (less than element size) write on RAID 3 and 5:
  • Ep = E1 XOR E2 XOR E3 XOR E4 (XOR operations)
  • If the parity is valid, then: Ep new = Ep old XOR E4 old XOR E4 new (XOR operations)
  • Requires 2 disk reads (old data and old parity) and 2 disk writes (new data and new parity).

• Parity vs mirroring:
  • Reading, calculating and writing the parity segment introduces a penalty to every write operation.
  • The parity RAID penalty manifests as slower cache flushes.
  • Increased write load can cause contention and slower read response times.
44
RAID and Storage Networking Technologies
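The incremental parity update above can be sketched with XOR arithmetic. Because XOR is its own inverse, "removing" the old data and "adding" the new data are both plain XORs, and the result matches a full recomputation of the stripe:

```python
def update_parity(parity_old, data_old, data_new):
    """Incremental RAID 5 parity update for a small write:
    Ep_new = Ep_old XOR E4_old XOR E4_new."""
    return parity_old ^ data_old ^ data_new

# Stripe elements E1..E4 and their full parity:
e1, e2, e3, e4 = 0b1010, 0b0110, 0b1111, 0b0001
parity = e1 ^ e2 ^ e3 ^ e4

# Overwrite E4 with a new value, touching only the old data and old parity:
e4_new = 0b0111
parity_new = update_parity(parity, e4, e4_new)

# The incremental result equals a full recomputation of the stripe:
assert parity_new == e1 ^ e2 ^ e3 ^ e4_new
```

This is why a small write costs 2 reads and 2 writes: the controller never has to touch E1–E3.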

Application IOPS and RAID


Configurations

45
RAID and Storage Networking Technologies

Application IOPS and RAID Configurations - RAID Penalty Exercise

• When deciding the number of disks required for an application, it is important to consider the impact of
RAID based on IOPS generated by the application.

• The total disk load should be computed by considering the type of RAID configuration and the ratio of
read compared to write from the host.

• The following example illustrates the method of computing the disk load in different types of RAID.
Total IOPS at peak workload is 1200, Read/Write ratio 2:1

• Calculate the IOPS requirement at peak activity for:
  • RAID 1/0
  • RAID 5

Additional Task: Discuss the impact of sequential and random I/O in different RAID configurations.
46
RAID and Storage Networking Technologies

Application IOPS and RAID Configurations - RAID Penalty Exercise

• Consider an application that generates 5,200 IOPS, with 60% of them being reads.

The disk load in RAID 5 is calculated as follows:

• RAID 5 disk load = 0.6 × 5,200 + 4 × (0.4 × 5,200) [because the write penalty for RAID 5 is 4]

= 3,120 + 4 × 2,080 = 3,120 + 8,320 = 11,440 IOPS

The disk load in RAID 1 is calculated as follows:

• RAID 1 disk load = 0.6 × 5,200 + 2 × (0.4 × 5,200) [because every write manifests as two writes to the
disks]

= 3,120 + 2 × 2,080 = 3,120 + 4,160 = 7,280 IOPS

47
RAID and Storage Networking Technologies

Application IOPS and RAID Configurations - RAID Penalty Exercise

• The computed disk load determines the number of disks required for the application. If in this example,
an HDD with a specification of a maximum 180 IOPS for the application needs to be used, the number of
disks required to meet the workload for the RAID configuration would be as follows:

• RAID 5: 11,440 / 180 = 64 disks

• RAID 1: 7,280 / 180 = 42 disks (approximated to the nearest even number)

48
RAID and Storage Networking Technologies
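The two worked calculations above can be combined into a short sketch. The write-penalty values (4 for RAID 5, 2 for RAID 1) come from the text; rounding up to a whole disk, and up to an even count for RAID 1 since mirrored disks come in pairs, follows the slide's answers:

```python
import math

WRITE_PENALTY = {"RAID 5": 4, "RAID 1": 2}

def disk_load(total_iops, read_fraction, raid_level):
    """Total back-end IOPS: reads pass through unchanged, writes are
    multiplied by the RAID write penalty."""
    reads = read_fraction * total_iops
    writes = (1 - read_fraction) * total_iops
    return reads + WRITE_PENALTY[raid_level] * writes

def disks_required(total_iops, read_fraction, raid_level, iops_per_disk):
    n = math.ceil(disk_load(total_iops, read_fraction, raid_level) / iops_per_disk)
    if raid_level == "RAID 1" and n % 2:  # mirrored disks come in pairs
        n += 1
    return n

print(disk_load(5200, 0.6, "RAID 5"))            # -> 11440.0
print(disks_required(5200, 0.6, "RAID 5", 180))  # -> 64
print(disks_required(5200, 0.6, "RAID 1", 180))  # -> 42
```

These reproduce the slide's figures: 11,440 back-end IOPS and 64 disks for RAID 5, and 42 disks for RAID 1.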

RAID Summary

Key points covered in RAID are:

• What RAID is and the needs it addresses?

• The concepts upon which RAID is built.

• Some commonly implemented RAID levels.

49
RAID and Storage Networking Technologies

Self Assessment Questions

1. Data mirroring is a phenomenon witnessed in the ___________ flavour.

a) RAID 0
b) RAID 1
c) RAID 5

Answer: b

50
RAID and Storage Networking Technologies

Self Assessment Questions

2. Hardware based RAID implementation requires no processor use for RAID calculations because it does
not involve calculations. State whether True or False.

a) True
b) False

Answer: b

51
RAID and Storage Networking Technologies

Self Assessment Questions


3. Which one of these options can be achieved through RAID?
i. High performance
ii. Reliability
iii. Fault tolerance

a) Only i
b) Only ii
c) Only i and ii
d) All i, ii and iii

Answer: d
52
RAID and Storage Networking Technologies

Self Assessment Questions

4. Which one of the given options stands for RAID?

a) Redundant Array of Independent Disks


b) Replica Array of Independent Disks
c) Redundant Array of Indecent Disks
d) Relay Array of Independent Disks

Answer: a

53
RAID and Storage Networking Technologies

Self Assessment Questions

5. ______________ measures (in hours) the average life expectancy of an HDD.

a) Average Time Between Failure (ATBF)


b) Total Time Between Failure (TTBF)
c) Standard Time Between Failure (STBF)
d) Mean Time Between Failure (MTBF)

Answer: d

54
RAID and Storage Networking Technologies

Self Assessment Questions

6. Consider a storage array of 100 HDDs, each with an MTBF of 750,000 hours. This means that a HDD in
this array is likely to fail at least once in ____________ hours.

a) 7,500
b) 750
c) 75
d) 700

Answer: a

55
RAID and Storage Networking Technologies

Self Assessment Questions

7. Which one of the given options is a technique whereby, data is stored on two different HDDs, yielding
two copies of data?

a) Parity
b) Striping
c) Mirroring
d) Seizing

Answer: c

56
RAID and Storage Networking Technologies

Self Assessment Questions

8. In which one of the given RAID configuration, data is striped across the HDDs in a RAID set?

a) RAID 3
b) RAID 2
c) RAID 1
d) RAID 0

Answer: d

57
RAID and Storage Networking Technologies

Self Assessment Questions

9. In which one of the given RAID configuration, data is mirrored to improve fault tolerance?

a) RAID 3
b) RAID 2
c) RAID 1
d) RAID 0

Answer: c

58
RAID and Storage Networking Technologies

Self Assessment Questions

10. Consider an application that generates 5,200 IOPS, with 60 percent of them being reads. The disk load
in RAID 5 is ___________________.

a) 11,440 IOPS
b) 11,404 IOPS
c) 11,004 IOPS
d) 11,400 IOPS

Answer: a

59
RAID and Storage Networking Technologies

Document Links

Topic: RAID technology
URL: https://www.prepressure.com/library/technology/raid
Notes: This link explains the details of RAID technology with advantages and disadvantages of the different RAID flavours.

60
RAID and Storage Networking Technologies

Video Links

Topic: RAID technology
URL: https://www.youtube.com/watch?v=wTcxRObq738
Notes: This video explains RAID 0, 1, 2, 5, 6 and 10.

Topic: RAID controller
URL: https://www.youtube.com/watch?v=wpZ45890zEk
Notes: In this video, you will learn about the RAID controller and its use and need.

61
RAID and Storage Networking Technologies

Storage Networking
Technologies

62
RAID and Storage Networking Technologies

Overview

There are 4 basic forms of Network Storage:

• Direct Attached Storage (DAS)


• Network Attached Storage (NAS)
• Storage Area Network (SAN)
• Internet Protocol Storage Area Network (IP-SAN)

63
RAID and Storage Networking Technologies

Direct Attached Storage


(DAS)

64
RAID and Storage Networking Technologies

Direct Attached Storage

Upon completion of this topic, you will be able to:

• Discuss the benefits of DAS.

• Describe the elements of DAS.

• Discuss DAS management considerations.

• Discuss DAS challenges.

65
RAID and Storage Networking Technologies

What is DAS?

[Diagram: internal direct-connect and external direct-connect DAS configurations.]

• Uses block level protocol for data access.


66
RAID and Storage Networking Technologies

DAS Benefits

• Ideal for local data provisioning.

• Quick deployment for small environments.

• Simple to deploy.

• Low capital expense.

• Low complexity.

67
RAID and Storage Networking Technologies

DAS Connectivity Options


• ATA (IDE) and SATA
• Primarily for internal bus

• SCSI
• Parallel (primarily for internal bus)
• Serial (external bus)

• FC
• High speed network technology

• Bus and Tag


• Primarily for external mainframe
• Precursor to ESCON and FICON
68
RAID and Storage Networking Technologies

DAS Management
• Internal

• Host provides:

• Disk partitioning (Volume management).

• File system layout.

• Direct Attached Storage managed individually through the server and the OS.

• External

• Array based management

• Lower TCO (Total Cost of Ownership) for managing data and storage infrastructure.

69
RAID and Storage Networking Technologies

DAS Challenges
• Scalability is limited

• Number of connectivity ports to hosts

• Difficulty to add more capacity

• Limited bandwidth

• Distance limitations

• Downtime required for maintenance with internal DAS

• Limited ability to share resources

• Array front-end port.

• Unused resources cannot be easily re-allocated.

70
• Resulting in islands of over and under utilized storage pools.
RAID and Storage Networking Technologies

DAS Types
[Diagram: a small server uses an internal IDE disk array, while a large server uses a SCSI channel to an external SCSI disk array; clients reach both servers over an Ethernet network.]

71
RAID and Storage Networking Technologies

SCSI

▪ SCSI stands for Small Computer System Interface.


▪ It is an I/O bus for peripheral device, such as hard drives, tape drives, CD-ROM, scanners, etc.
• An improvement over IDE.
▪ A single SCSI bus connects multiple elements (max 7 or 15).
▪ High speed data transfer:
• 5, 10, 20, 100, 320 MB/sec, …
▪ Overlapping I/O capability:
• Multiple read and write commands can be outstanding simultaneously.
• Different SCSI drives to be processing commands concurrently rather than serially. The data can then
be buffered and transferred over the SCSI bus at very high speeds.

72
RAID and Storage Networking Technologies

SCSI Distribution Architecture

▪ SCSI is a client/server architecture.

▪ The client is called the initiator and issues requests to the server. The initiator is the I/O subsystem under typical OS control.

▪ The "server" is called the target, which is the SCSI controller inside the storage device. It receives, processes, and responds to requests from the initiator.

▪ SCSI commands support block I/O, transferring large amounts of data in blocks.

[Diagram: the client (initiator) sends a request to the storage device (target) and receives a response.]

73
RAID and Storage Networking Technologies

Storage Area Network


(SAN)

74
RAID and Storage Networking Technologies

SAN
• A Storage Area Network (SAN) is a specialised, dedicated high speed network joining servers and
storage, including disks, disk arrays, tapes, etc.

• Storage (data store) is separated from the processors (and separated processing).

• High capacity, high availability, high scalability, ease of configuration, ease of reconfiguration.

• Fibre Channel is the de facto SAN networking architecture, although other network standards could be
used.

75
RAID and Storage Networking Technologies

SAN Benefits

• Storage consolidation
• Data sharing
• Non-disruptive scalability for growth
• Improved backup and recovery
• Tape pooling
• LAN-free and server-free data movement
• High performance
• High availability server clustering
• Data integrity
• Disaster tolerance
• Ease of data migration
• Cost-effectiveness (total cost of ownership)
76
RAID and Storage Networking Technologies

SAN Architecture

• Fibre Channel (FC) is well established in the open systems environment as the underlining architecture of
the SAN.

• Fibre Channel is structured with independent layers, as are other networking protocols. There are five
layers, where 0 is the lowest layer. The physical layers are 0 to 2. These layers carry the physical attributes
of the network and transport the data created by the higher level protocols, such as SCSI, TCP/IP, or
FICON.

• ANSI T11 is the technical committee who produced FC Protocol (interface standard) for
high-performance and mass storage applications.

• FC Protocol is designed to transport multiple protocols, such as HIPPI, IPI, SCSI, IP, Ethernet, etc.

77
RAID and Storage Networking Technologies

What is a SAN?

• A dedicated high-speed network of servers and shared storage devices.
• Provides block-level data access.
• Resource consolidation
  • Centralised storage and management
• Scalability
  • Theoretical limit: approx. 15 million devices
• Secure access

Additional Task: Research bladed switch technology.

[Diagram: servers connected through an FC SAN to storage arrays.]
78
RAID and Storage Networking Technologies

Understanding Fibre Channel

• Fibre Channel is a high-speed network technology that uses:


• Optical fiber cables (for front end connectivity)
• Serial copper cables (for back end connectivity)
• Latest FC implementations support 8Gb/s

o Servers are attached to two distinct networks:
  • An IP network connecting users and application clients to the servers and applications.
  • An FC SAN connecting the servers to the storage and application data.
79
RAID and Storage Networking Technologies

FC SAN Evolution

[Diagram of Fibre Channel SAN evolution: from SAN islands built on FC Arbitrated Loop with hubs, to interconnected SANs on FC switched fabric, to enterprise SANs where large switched fabrics connect many servers and storage arrays.]
80
RAID and Storage Networking Technologies

Components of SAN: Node ports

Examples of nodes: hosts, storage arrays and tape libraries.

• Ports are available on:
  • The HBA in the host
  • Front-end adapters in the storage array

• Each port has a transmit (Tx) link and a receive (Rx) link.

• HBAs perform low-level interface functions automatically to minimise the impact on host performance.

[Diagram: a node with ports 0 to n; each port has its own Tx and Rx link.]

81
RAID and Storage Networking Technologies

Components of SAN: Cabling

• SAN implementation uses:


• Copper cables for short distance
• Optical fiber cables for long distance
• Two types of optical cables
• Single-mode
  • Carries a single beam of light
  • Distance up to 10 km
• Multi-mode
  • Carries multiple beams of light simultaneously
  • Distance up to 500 meters
82
RAID and Storage Networking Technologies

Components of SAN: Cabling (Connectors)

Node Connectors:
• SC Duplex Connectors
• LC Duplex Connectors

Patch panel Connectors SC Connector


• ST Simplex Connectors
LC Connector

ST Connector

83
RAID and Storage Networking Technologies

Components of SAN: Interconnecting devices

Basis for SAN communication

• Hubs

• Switches FC HUB FC Switch


• Directors

Director
84
RAID and Storage Networking Technologies

Components of SAN: Storage array

• Provides storage consolidation and centralisation.

Features of an array are:
• High availability/redundancy
• Performance
• Business continuity
• Multiple-host connect

[Diagram: servers, each with an HBA, connect through an FC SAN to storage arrays.]
85
RAID and Storage Networking Technologies

Components of SAN: SAN management software

• A suite of tools used in a SAN to manage the interface between hosts and storage arrays.

• Provides integrated management of the SAN environment.

• Accessed through a web-based GUI or a CLI.

86
RAID and Storage Networking Technologies

FC Connectivity

87
RAID and Storage Networking Technologies

SAN Interconnectivity Options: Point to Point

Point to point (Pt-to-Pt)

• Direct connection between devices

• Limited connectivity

Storage Array

Servers

88
RAID and Storage Networking Technologies

SAN Interconnectivity Options: FC-AL

Fibre Channel Arbitrated Loop (FC-AL)

• Devices must arbitrate to gain control

• Devices are connected via hubs

• Supports up to 127 devices

FC Hub
Storage Array

Servers

89
RAID and Storage Networking Technologies

FC-AL Transmission

[Diagram: four nodes (NL_Ports 1–4) and a storage array port attached to an FC hub; each port's transmit link feeds the next port's receive link through the hub's bypass circuitry, forming a shared loop.]
90
RAID and Storage Networking Technologies

SAN Interconnectivity Options: FC-SW

Fabric connect (FC-SW)

• Dedicated bandwidth between


devices

• Support up to 15 million devices

• Higher availability than hubs


Storage Array

Servers

91
RAID and Storage Networking Technologies

FC-SW Transmission

[Diagram: four nodes (N_Ports 1–4) and storage ports attached to switch ports; each transmit/receive link pair gets dedicated bandwidth through the switch fabric.]

92
RAID and Storage Networking Technologies

Port Types

[Diagram: hosts and a tape library attach to an FC hub as NL-Ports; hosts and storage arrays attach to FC switches as N-Ports connecting to F-Ports; the hub attaches to a switch through an FL-Port; the two switches interconnect through E-Ports.]

93
RAID and Storage Networking Technologies

Inter Switch Links (ISL)

• ISL connects two or more FC switches to each other using E-Ports.

• ISLs are used to transfer host-to-storage data as well as the fabric management traffic from one switch
to another.

• ISL is also one of the scaling mechanisms in SAN connectivity.

Multimode Fiber

1Gb=500m 2Gb=300m
FC Switch FC Switch

Single-mode Fiber

up to10 km
FC Switch FC Switch

94
RAID and Storage Networking Technologies

Login types in a Switched Network

Extended Link Services that are defined in the standards:

• FLOGI - Fabric login


• Between N_Port to F_Port
• PLOGI - Port login
• Between N_Port to N_Port
• N_Port establishes a session with another N_Port
• PRLI - Process login
• Between N_Port to N_Port
• To share information about the upper layer protocol type in use
• And recognising device as the SCSI initiator, or target
95
RAID and Storage Networking Technologies

Fibre Channel Architecture

96
RAID and Storage Networking Technologies

Fibre Channel Architecture

Upon completion of this topic, you will be able to:

• Describe layers of FC

• Describe FC protocol stack

• Discuss FC addressing

• Define WWN addressing

• Discuss structure and organization of FC Data

Additional Task
Research on Flow Control and Class of FC Services

97
RAID and Storage Networking Technologies

FC Architecture Overview
• FC uses channel technology.
  • Provides high performance with low protocol overheads.
  • FCP is SCSI-3 over an FC network.
  • Sustained transmission bandwidth over long distances.
  • Provides speeds up to 8 Gb/s (8GFC).
• FCP has five layers: FC-4, FC-2, FC-1 and FC-0 (*FC-3 is not yet implemented).

[Diagram of the FCP stack: applications (SCSI, HIPPI, ESCON, ATM, IP) map onto FC-4; FC-2 provides framing/flow control; FC-1 encode/decode; FC-0 is the physical layer at 1, 2, 4 or 8 Gb/s.]
98
RAID and Storage Networking Technologies

Fibre Channel Protocol Stack

FC layer Function SAN relevant features specified by FC layer

FC-4 Mapping interface Mapping of upper-layer protocols (e.g. SCSI-3) to FC transport

FC-3 Common services Not implemented

FC-2 Routing, flow control Frame structure, ports, FC addressing, buffer credits

FC-1 Encode/decode 8b/10b encoding, bit and frame synchronization

FC-0 Physical layer Media, cables, connector


99
RAID and Storage Networking Technologies

Fibre Channel Addressing

• FC Address is assigned during Fabric Login

• Used to communicate between nodes within SAN

• Similar in functionality to an IP address on NICs

• Address Format:

• 24 bit address, dynamically assigned

• Contents of the three bytes depend on the type of N-Port

• For an N_Port or a public NL_Port:

• Switch maintains mapping of WWN to FC-Address via the Name Server

100
RAID and Storage Networking Technologies
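As a sketch of how the 24-bit FC address breaks down: the conventional switched-fabric layout is one byte each of Domain ID, Area ID and Port ID. The slide does not spell out the byte meanings, so treat the field names here as an assumption based on that convention:

```python
def parse_fc_address(addr):
    """Split a 24-bit FC address into its three conventional byte fields
    (domain/area/port, the usual switched-fabric layout for an N_Port)."""
    if not 0 <= addr <= 0xFFFFFF:
        raise ValueError("FC addresses are 24 bits")
    return {
        "domain_id": (addr >> 16) & 0xFF,  # identifies the switch
        "area_id":   (addr >> 8) & 0xFF,   # identifies a group of ports
        "port_id":   addr & 0xFF,          # identifies the attached N_Port
    }

print(parse_fc_address(0x0A0F01))
# -> {'domain_id': 10, 'area_id': 15, 'port_id': 1}
```

The address value 0x0A0F01 is purely illustrative; real addresses are assigned dynamically by the fabric at login.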

Fibre Channel Addressing

101
RAID and Storage Networking Technologies

World Wide Names

• Unique 64 bit identifier

• Static to the port

• Used to physically identify ports or nodes within SAN

• Similar to NIC’s MAC address


World Wide Name – Array:
Hex: 5 0 0 6 0 1 6 0 0 0 6 0 0 1 B 2
Binary: 0101 0000 0000 0110 0000 0001 0110 0000 0000 0000 0110 0000 0000 0001 1011 0010
Fields: Company ID (24 bits), Port, Model Seed (32 bits)

World Wide Name – HBA:
Hex: 1 0 0 0 0 0 0 0 c 9 2 0 d c 4 0
Fields: Reserved (12 bits), Company ID (24 bits), Company Specific (24 bits)

102
RAID and Storage Networking Technologies
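Following the HBA layout shown above, a sketch of pulling the company ID (OUI) out of a 64-bit WWN. The exact widths are an assumption here: the leading 16 bits are treated as one prefix field (the slide's reserved bits plus the format nibble), followed by the 24-bit company ID and 24 company-specific bits; other WWN formats lay the fields out differently:

```python
def parse_hba_wwn(wwn):
    """Split a 64-bit HBA World Wide Name into the fields sketched on
    the slide: a 16-bit prefix, a 24-bit company ID (OUI), and 24
    company-specific bits. Field widths are illustrative assumptions."""
    return {
        "prefix":       (wwn >> 48) & 0xFFFF,
        "company_id":   (wwn >> 24) & 0xFFFFFF,
        "company_part":  wwn & 0xFFFFFF,
    }

fields = parse_hba_wwn(0x10000000C920DC40)  # the slide's example WWN
print(f"company_id = {fields['company_id']:06X}")  # -> 0000C9
```

Run against the slide's example value, this extracts 00-00-C9 as the company ID, which is the middle 24-bit field of that WWN.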

Structure and Organization of FC Data

• FC data is organized as:

• Exchange operations

• Enables two N_ports to identify and manage a set of information units

• Maps to sequence

• Sequence

• Contiguous set of frames sent from one port to another.

• Frames

• Fundamental unit of data transfer

• Each frame can contain up to 2112 bytes of payload.


103
RAID and Storage Networking Technologies

Structure and Organization of FC Data

SOF Frame Header Data Field CRC EOF

4 Bytes 24 Bytes 0 - 2112 Bytes 4 Bytes 4 Bytes

104
RAID and Storage Networking Technologies
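The frame layout above (SOF 4 bytes, header 24, payload 0–2112, CRC 4, EOF 4) gives a simple size calculation, sketched here:

```python
SOF, HEADER, CRC, EOF = 4, 24, 4, 4   # fixed per-frame fields, in bytes
MAX_PAYLOAD = 2112                    # maximum data field size

def frame_size(payload_bytes):
    """Total size of one FC frame for a given payload length."""
    if not 0 <= payload_bytes <= MAX_PAYLOAD:
        raise ValueError("payload must be 0..2112 bytes")
    return SOF + HEADER + payload_bytes + CRC + EOF

print(frame_size(0))            # -> 36 (smallest frame: overhead only)
print(frame_size(MAX_PAYLOAD))  # -> 2148 (largest frame)
```

So every frame carries 36 bytes of fixed overhead, which is why large transfers are split into sequences of full-sized frames.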

FC Topologies

105
RAID and Storage Networking Technologies

FC Topologies and Management


Upon completion of this topic, you will be able to:

• Define FC fabric topologies.

• Describe different types of zoning.

106
RAID and Storage Networking Technologies

Fabric Topology: Core-Edge Fabric

• Can be two or three tiers:
  • A single core tier
  • One or two edge tiers

• In a two-tier topology, storage is usually connected to the core.

• Benefits:
  • High availability
  • Medium scalability
  • Medium to maximum connectivity

[Diagrams: single-core topology – edge-tier FC switches connect the servers, and a core-tier director connects to the storage array; dual-core topology – two core directors for added redundancy.]
107
RAID and Storage Networking Technologies

Fabric Topology: Mesh

• Can be either partial or full mesh.

• In a full mesh, all switches are connected to each other.

• Hosts and storage can be located anywhere in the fabric.

• Hosts and storage can also be localised to a single switch.

[Diagrams: partial-mesh and full-mesh fabrics of FC switches connecting servers and storage arrays.]
108
RAID and Storage Networking Technologies

Fabric Management: Zoning

[Diagram: servers with HBAs connect through an FC SAN to storage array ports; zoning controls which HBAs may communicate with which array ports.]
109
RAID and Storage Networking Technologies

FC Protocol Layers

110
RAID and Storage Networking Technologies

SAN Topologies

Fibre Channel based networks supports three types of topologies:

• Point-to-point

• Loop (arbitrated) – shared media

• Switched

111
RAID and Storage Networking Technologies

Point-to-Point

• The point-to-point topology is the easiest Fibre Channel configuration to implement, and it is also the
easiest to administer.

• The distance between nodes can be up to 10 km.

112
RAID and Storage Networking Technologies

Arbitrated Loop

• Shared media transport, similar in concept to shared Ethernet.

• Not common for FC-based SANs.

• Commonly used for JBOD (Just a Bunch Of Disks).

• An arbitration protocol determines which node can access the media.

113
RAID and Storage Networking Technologies

Switched FC

• Fibre Channel switches function in a manner similar to traditional network switches to provide increased bandwidth, scalable performance, an increased number of devices, and, in some cases, increased redundancy. Fibre Channel switches vary in the number of ports and media types they support.

• Multiple switches can be connected to form a switch fabric capable of supporting a large number of host
servers and storage subsystems.

[Diagram: servers connect through a Fibre Channel switch to storage devices.]

114
RAID and Storage Networking Technologies

Network-Attached Storage

115
RAID and Storage Networking Technologies

Network-Attached Storage
After completing this topic, you will be able to:

• Describe NAS, its benefits and components.

• Discuss different NAS implementations.

• Describe NAS file-sharing protocols.

• Discuss NAS management options.

116
RAID and Storage Networking Technologies

NAS
▪ NAS is a dedicated storage device, and it operates in a client/server mode.
▪ NAS is connected to the file server via LAN.
▪ Protocol: NFS (or CIFS) over an IP Network
• Network File System (NFS) – UNIX/Linux
• Common Internet File System (CIFS) – Windows: remote file systems (drives) are mounted on the local system.
o Evolved from Microsoft NetBIOS, NetBIOS over TCP/IP (NBT), and Server Message Block
(SMB).
▪ Advantage: No distance limitation
▪ Disadvantage: Speed and Latency
▪ Weakness: Security

117
RAID and Storage Networking Technologies

File Sharing Environment

• File system is a structured way of storing and organising data files.

• File Sharing

• Storing and accessing data files over network.

• File system must be mounted in order to access files.

• Traditional client/server model, implemented with file-sharing protocols for remote file sharing

• Example: FTP, CIFS (also known as SMB), NFS, DFS

118
RAID and Storage Networking Technologies

File Sharing Technology Evolution


[Diagram: evolution of file sharing – from stand-alone PCs using portable media for file sharing, to networked PCs with networked file sharing, to Network Attached Storage (NAS).]

119


RAID and Storage Networking Technologies

What is NAS?

[Diagram: clients, an application server and a print server access a NAS device over the network.]

NAS is shared storage on a network infrastructure.

120
RAID and Storage Networking Technologies

General Purpose Servers vs. NAS Devices

• A general-purpose server (Windows or UNIX) runs applications, print services, drivers, a file system and networking on a general-purpose operating system.

• A NAS device is a single-function device dedicated to file serving:
  • Uses a real-time OS dedicated to the file-serving purpose.
  • Carries only the file system and network components needed for that task.

[Diagram: general-purpose server software stack compared with the slimmer, single-function NAS device stack.]
121
RAID and Storage Networking Technologies

Benefits of NAS
• Supports comprehensive access to information.

• Improves efficiency – uses special purpose OS.

• Improved flexibility – platform independent.

• Centralises storage

• Simplifies management

• Scalability

• High availability – provides redundant components.

• Provides security integration to environment (user authentication and authorisation)

122
RAID and Storage Networking Technologies

Components of NAS

[Diagram: UNIX clients (NFS) and Windows clients (CIFS) reach the NAS device over an IP network; the NAS head contains the network interface, the NFS and CIFS services and the NAS device OS, and connects through its storage interface to a storage array.]

123
RAID and Storage Networking Technologies

NAS File Sharing Protocols

Two common NAS file sharing protocols are:

• NFS – Network File System protocol

• Traditional UNIX environment file sharing protocol.

• CIFS – Common Internet File System protocol

• Traditional Microsoft environment file sharing protocol, based upon the Server Message
Block protocol.

124
RAID and Storage Networking Technologies

Network File System (NFS)

• Client/server application.

• Uses RPC mechanisms over TCP protocol.

• Mount points grant access to remote hierarchical file structures for local file system structures.

• Access to the mount can be controlled by permissions.

Additional Task
Research on NFS and CIFS
125
RAID and Storage Networking Technologies

NAS File Sharing - CIFS

• Common Internet File System

• Developed by Microsoft in 1996.

• An enhanced version of the Server Message Block (SMB) protocol

• Stateful Protocol

• Can automatically restore connections and reopen files that were open prior to interruption.

• Operates at the Application/Presentation layer of the OSI model.

• Most commonly used with Microsoft operating systems, but is platform-independent.

• CIFS runs over TCP/IP and uses DNS (Domain Naming Service) for name resolution.

126
RAID and Storage Networking Technologies

NAS I/O

[Diagram of NAS I/O flow: (1) the client application's file I/O request is redirected through its NFS/CIFS client and TCP/IP stack onto the IP network; (2) the NAS device receives it on its network interface; (3) the NAS operating system converts the file request into block I/O to the storage array; (4) the response returns over the network to the client.]
127
RAID and Storage Networking Technologies

NAS Implementations
Integrated NAS: the NAS device (head plus storage) attaches directly to the IP network.

Gateway NAS: the NAS head attaches to the IP network on the front end and to a storage array through an FC SAN on the back end.
128
RAID and Storage Networking Technologies

Integrated NAS Connectivity

[Diagram: clients and application servers connect over an IP network to an integrated NAS system.]
129
RAID and Storage Networking Technologies

Gateway NAS Connectivity

[Diagram: clients and application servers connect over an IP network to a NAS gateway, which connects through an FC SAN to a storage array.]

Additional Task: Research suitable environments for implementing Integrated and Gateway NAS.
130
RAID and Storage Networking Technologies

Hosting and Accessing Files on the NAS – File sharing protocols (NFS and CIFS)
Steps to host a file system:
• Create an array volume
• Assign volume to NAS device
• Create a file system on the volume
• Mount the file system
• Access the file system
• Use NFS in UNIX environment
• Execute mount/nfsmount command
• Use CIFS in windows environment
• Map the network drive as:
\\Account1\Act_Rep

131
RAID and Storage Networking Technologies

NAS Implementation

SMB
NetBIOS

TCP
IP
802.3

132
RAID and Storage Networking Technologies

NAS Implementation

NFS
TCP
IP
802.3

133
RAID and Storage Networking Technologies

NAS vs SAN

▪ Traditionally:
• NAS is used for low-volume access to a large amount of storage by many users.
• SAN is the solution for terabytes (1012) of storage and multiple, simultaneous access to files, such
as streaming audio/video.

▪ The lines between the two technologies are becoming blurred, and while the SAN-versus-NAS debate continues, the fact is that both technologies complement each other.

134
RAID and Storage Networking Technologies

Comparison of Storage Networking Technologies

DAS – Storage type: sectors; Data transmission: IDE/SCSI; Access mode: clients or servers; Capacity: ~10^9 bytes; Complexity: easy; Management cost (per GB): high.

NAS – Storage type: shared files; Data transmission: TCP/IP, Ethernet; Access mode: clients or servers; Capacity: 10^9 – 10^12 bytes; Complexity: moderate; Management cost (per GB): moderate.

SAN – Storage type: blocks; Data transmission: Fibre Channel; Access mode: servers; Capacity: ~10^12 bytes; Complexity: difficult; Management cost (per GB): low.
135
RAID and Storage Networking Technologies

Self Assessment Questions

11. Which one of the given options supports IDE/SCSI as data transmission mode?

a) DAS
b) NAS
c) SAN

Answer: a

136
RAID and Storage Networking Technologies

Self Assessment Questions

12. Which one of the given options has moderate complexity?

a) DAS
b) NAS
c) SAN

Answer: b

137
RAID and Storage Networking Technologies

Self Assessment Questions

13. Fibre Channel is the networking architecture of _________.

a) DAS
b) NAS
c) SAN

Answer: c

138
RAID and Storage Networking Technologies

Summary

• The greater the number of HDDs in a storage array, the larger the probability of a disk failure.
• Fibre Channel switches function in a manner similar to traditional network switches to provide increased
  bandwidth, scalable performance, an increased number of devices, and, in some cases, increased
  redundancy.
• ISLs are used to transfer host-to-storage data as well as the fabric management traffic from one switch to
  another.
• A Storage Area Network (SAN) is a specialised, dedicated high-speed network joining servers and storage,
  including disks, disk arrays, tapes, etc.
• Fibre Channel is the de facto SAN networking architecture, although other network standards could be
  used.
• FC Protocol is designed to transport multiple protocols, such as HIPPI, IPI, SCSI, IP, Ethernet, etc.
• When deciding the number of disks required for an application, it is important to consider the impact of
RAID based on IOPS generated by the application.
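The summary point about RAID's impact on IOPS can be made concrete with a short calculation. This is a sketch, assuming the commonly used write penalties (2 for RAID 1, 4 for RAID 5); the function name is illustrative, not part of the module.

```python
def disk_load(total_iops, read_fraction, write_penalty):
    """Back-end disk load for a protected RAID set.

    Every application write becomes `write_penalty` disk operations:
    2 for RAID 1 (write to both mirrors), 4 for RAID 5 (read old data,
    read old parity, write new data, write new parity).
    """
    reads = total_iops * read_fraction
    writes = total_iops - reads
    return reads + writes * write_penalty

# Worked example: 5,200 IOPS, 60% reads (3,120 reads, 2,080 writes).
raid5_load = disk_load(5200, 0.60, write_penalty=4)  # 3120 + 4 x 2080 = 11,440
raid1_load = disk_load(5200, 0.60, write_penalty=2)  # 3120 + 2 x 2080 = 7,280
```

The same function answers the assignment question on computing disk load for a given workload mix.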

139
RAID and Storage Networking Technologies

Assignment

1. Differentiate between RAID 0 and RAID 1 with examples.


2. Explain various SAN topologies.
3. Discuss the necessity of RAID Technology.
4. Elaborate different types of RAID implementation. What are the benefits and limitations of these techniques?
5. Explain the components of RAID Array with diagram.
6. Explain RAID techniques that defines various RAID levels.
7. Compare different RAID techniques.
8. Other than RAID, what are the other features in software management functionalities?
9. What is a RAID array? What is the difference between a JBOD and a RAID array?
10. Which RAID level is best for performance, and which is best for redundancy?

140
RAID and Storage Networking Technologies

Assignment

11. Recommend the techniques to determine the data availability and performance characteristics of a
    redundant array.

12. Consider three users A, B and C. User B wants to store two copies of data on two different HDDs, user A
    wants to store parts of the data on different disks, and user C wants to protect and store segments of
    the data. Identify the RAID techniques that can be used to store user A, B and C's data.

13. List the various RAID levels with the RAID techniques adopted.

14. Compare different RAID levels.

15. Explain the RAID 0 and RAID 1 levels used to improve disk performance, with a diagram.

16. Explain the nested RAID levels used to improve disk performance, with a diagram.

141
RAID and Storage Networking Technologies

Assignment

17. Consider an application that generates 5,200 IOPS, with 60% of them being reads. Compute the disk
    load in RAID 5 and RAID 1.

18. What is DAS? Explain different types of DAS architectures.

19. List the benefits and limitations of the DAS architecture.

20. Data centre managers are burdened with the challenging task of providing low-cost, high-performance
    information management solutions. Justify the statement.

21. Explain the SAN implementation with a diagram.

22. List and explain the components of the SAN.

23. Explain the Fibre Channel SAN evolution with a diagram.


142
RAID and Storage Networking Technologies

Assignment

24. Distinguish between multi-mode and single-mode fibre cables for data transfer in FC SAN.

25. Differentiate between an FC switch and an FC hub.

26. FC architecture supports three basic interconnectivity options. List, explain and compare them.

27. Explain the FC topologies most commonly deployed in SAN implementations, with a diagram.

28. Define NAS and explain its benefits.

29. Explain the components of NAS.

30. Explain the different ways of implementing NAS on a server.

31. List and explain the various NAS file-sharing protocols.


143
RAID and Storage Networking Technologies

Document Links

• Storage Area Network: https://searchstorage.techtarget.com/definition/storage-area-network-SAN
  (explains the building blocks of SAN)
• Introduction to NAS: https://www.utilizewindows.com/introduction-to-network-attached-storage-nas/
  (covers various aspects of NAS in detail)
• Introduction to RAID, levels of RAID, and understanding and working with RAID:
  https://searchstorage.techtarget.com/definition/RAID (explains RAID Technology fundamentals)

144
RAID and Storage Networking Technologies

Video Links

• Storage Area Network: https://www.youtube.com/watch?v=7et3bmXhr8w
  (explains the building blocks of SAN)
• Introduction to NAS: https://www.youtube.com/watch?v=KXQUpJWTrlA
  (covers various aspects of NAS in detail)
• RAID Fundamentals, Understanding & Working With RAID: https://www.youtube.com/watch?v=GsbAUj71ikw
  (explains RAID Technology fundamentals)
• RAID levels: https://www.youtube.com/watch?v=KmbEjYVtByQ
  (provides an in-depth knowledge of various RAID levels)
• Introduction to RAID: http://youtube.com/watch?v=X1x9EMd5ywY
  (tells about the importance of RAID in storage)

145
RAID and Storage Networking Technologies

E-Book Links

• Storage Systems Architecture: https://www.mindshare.com/files/ebooks/SAS%20Storage%20Architecture.pdf
  (all pages)
• Information Storage and Management: https://clickforacess.files.wordpress.com/2017/11/san-book.pdf
  (all pages)

146
DEPARTMENT OF BCA
STORAGE MANAGEMENT
SUB CODE: 16BCA525D21
MODULE 4: BACKUP AND RECOVERY

Module 4 V Sem, SM, Department of BCA


CONTENTS :

Module 4 :
Introduction to Business Continuity - Backup Purpose -Disaster Recovery - Operational Backup –
Archival, Backup Considerations, Backup Granularity, Recovery Considerations, Backup Methods ,
Backup Process, Backup and Restore Operations, Backup Topologies - Server less Backup , Backup
Technologies -Backup to Tape - Physical Tape Library - Backup to Disk - Virtual Tape Library

10/10/2021 2
V SEM, SM, DEPARTMENT OF BCA
Back up and Recovery

AIM:

To familiarise students with in depth knowledge on fundamentals of Business Continuity, Backup of data,
Disaster Recovery and various restore options of data.

3
Back up and Recovery

Objectives

The objectives of this module are:

• To understand the business continuity process due to the adverse operations in business and its impact on
back up and recovery of data.

• To understand how replication creates an exact copy of data.

• To create an additional copy of data that can be used for restore and recovery purposes.

• To understand the level of detail characterising backup data.

• To identify how the data to be backed up is transferred from the backup client (source).

4
Back up and Recovery

Outcomes:
At the end of this module, you are expected to:
• Define Business Continuity and Information Availability.
• Detail impact of information unavailability.
• Define BC measurement and terminologies.
• Describe BC planning process.
• Explain BC technology solutions.
• Define Backup and backup consideration.
• Describe purposes of backup.
• Explain backup granularity and restore.
• List backup methods.
• Describe backup/recovery process and operation.
5
Back up and Recovery

Content
• Business continuance infrastructure services
• The need for redundancy
• Information availability
• BC terminology and life cycle
• BC technology solutions
• Backup technologies and recovery considerations
• Restore and restart considerations

6
Back up and Recovery

Business Continuity

7
Back up and Recovery

Business Continuity (BC)

• Continuous access to information is a must for the smooth functioning of business operations today,
as the cost of business disruption could be catastrophic.
• There are many threats to information availability, such as natural disasters (e.g., flood, fire,
earthquake), unplanned occurrences (e.g., cybercrime, human error, network and computer failure),
and planned occurrences (e.g., upgrades, backup, restore) that result in the inaccessibility of
information.
• It is critical for businesses to define appropriate plans that can help them overcome these crises.
• Business continuity is an important process to define and implement these plans.
• Business Continuity is preparing for, responding to, and recovering from an application outage that
adversely affects business operations.
8
Back up and Recovery

Business Continuity (BC)

• Business Continuity solutions address unavailability and degraded application performance.


• Business Continuity is an integrated and enterprise wide process and set of activities to ensure
“information availability”.

9
Back up and Recovery

Information Availability

10
Back up and Recovery

Information Availability
Information availability refers to the ability of an infrastructure to function according to business
expectations during its specified time of operation.

11
Back up and Recovery

Information Availability

Information Availability can be defined in terms of three parameters:


• Reliability
• Accessibility
• Timeliness

12
Back up and Recovery

Information Availability

• Reliability
The components delivering the information should be able to function without failure, under stated
conditions, for a specified amount of time.
• Accessibility
Information should be accessible at the right place and to the right user.
• Timeliness
Information must be available whenever required.

13
Back up and Recovery

Causes of Information Unavailability

Disaster (<1% of occurrences): natural or man-made, such as flood, fire, earthquake, or a contaminated
building.

Unplanned outages (20%): failures such as database corruption, component failure, and human error.

Planned outages (80%): competing workloads such as backup, reporting, data warehouse extracts, and
application and data restore.

14
Back up and Recovery

Causes of Information Unavailability

• Planned outages
Planned outages include installation/integration/maintenance of new hardware, software upgrades or
patches, taking backups, application and data restores, facility operations (renovation and construction)
and refresh/migration of the testing to the production environment.
• Unplanned outages
Unplanned outages include failure caused by database corruption, component failure, and human
errors.

15
Back up and Recovery

Impact of Downtime
Know the downtime costs (per hour, day, two days...):

• Lost Productivity: number of employees impacted x hours out x hourly rate.
• Lost Revenue: direct loss, compensatory payments, lost future revenue, billing losses, investment losses.
• Damaged Reputation: customers, suppliers, financial markets, banks, business partners.
• Financial Performance: revenue recognition, cash flow, lost discounts (A/P), payment guarantees, credit
  rating, stock price.
• Other Expenses: temporary employees, equipment rental, overtime costs, extra shipping costs, travel
  expenses...
16
Back up and Recovery

Impact of Downtime

Average cost of downtime per hour = average productivity loss per hour + average revenue loss per
hour
Where,
Productivity loss per hour = (total salaries and benefits of all employees per week) / (average number of
working hours per week).
Average revenue loss per hour = (total revenue of an organisation per week) / (average number of hours per
week that the organisation is open for business).
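The downtime cost formula above translates directly into code. The payroll and revenue figures below are hypothetical, chosen only to show the arithmetic.

```python
def avg_downtime_cost_per_hour(weekly_salaries, weekly_work_hours,
                               weekly_revenue, weekly_open_hours):
    # Productivity loss per hour, per the formula above.
    productivity_loss = weekly_salaries / weekly_work_hours
    # Revenue loss per hour, per the formula above.
    revenue_loss = weekly_revenue / weekly_open_hours
    return productivity_loss + revenue_loss

# Hypothetical organisation: 80,000 weekly payroll over 40 working hours,
# 210,000 weekly revenue over 70 open hours.
cost = avg_downtime_cost_per_hour(80_000, 40, 210_000, 70)
print(cost)  # 2,000 + 3,000 = 5,000 per hour
```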

17
Back up and Recovery

Measuring Information Availability

[Figure: Incident timeline. From one incident to the next: detection (detection elapsed time), diagnosis
(response time), repair (repair time), and restoration (recovery time). MTTR is the time to repair, or
'downtime'; MTBF is the time between failures, or 'uptime'.]

18
Back up and Recovery

Measuring Information Availability


• Mean Time To Repair (MTTR)
It is the average time required to repair a failed component.
• Mean Time Between Failure (MTBF)
It is the average time available for a system or component to perform its normal operations between
failures.

19
Back up and Recovery

Measuring Information Availability

Information Availability = MTBF / (MTBF + MTTR)


or
Information Availability = uptime / (uptime + downtime)
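The availability formula above is a one-liner in code; this is a minimal sketch, and the MTBF and MTTR figures are hypothetical.

```python
def information_availability(mtbf_hours, mttr_hours):
    """IA = MTBF / (MTBF + MTTR), i.e. uptime / (uptime + downtime)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical component: fails on average after 999 hours of operation
# and takes 1 hour to repair, giving "three nines" availability.
ia = information_availability(999, 1)
print(f"{ia:.3%}")  # 99.900%
```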

20
Back up and Recovery

Self Assessment Question

1. What does MTTR stand for?

a. Mean Time To Repair


b. Mean To Time Repair
c. Manage Time To Repair

Answer: a

21
Back up and Recovery

Self Assessment Question

2. __________________ includes failure caused by database corruption, component failure, and human
errors.

a. Unplanned outages
b. Mean To Time Repair
c. Planned outages

Answer: a

22
Back up and Recovery

Self Assessment Question

3. Business disruption could be _______________.

a. Unplanned outages
b. Mean to time repair
c. Planned outages
d. Catastrophic

Answer: d

23
Back up and Recovery

Self Assessment Question

4. Which one of the given options is the unplanned occurrence threat to information security?

a. Earthquake
b. Human error
c. Upgrades

Answer: b

24
Back up and Recovery

Self Assessment Question

5. Which one of the given options is the planned occurrence threat to information security?

a. Earthquake
b. Human error
c. Upgrades

Answer: c

25
Back up and Recovery

Self Assessment Question

6. _______________ is the natural disaster threat to information security.

a. Earthquake
b. human error
c. upgrades

Answer: a

26
Back up and Recovery

Business Continuity
(BC) Terminologies

27
Back up and Recovery

BC Terminologies

The following are common terms related to Business Continuity operations.

28
Back up and Recovery

BC Terminologies

Disaster recovery
• Coordinated process of restoring systems, data, and infrastructure required to support ongoing
business operations in the event of a disaster.
• Restoring previous copy of data and applying logs to that copy to bring it to a known point of
consistency.
• Generally implies use of backup technology.

29
Back up and Recovery

BC Terminologies

Disaster restart
• Process of restarting from disaster using mirrored consistent copies of data and applications.
• Generally implies use of replication technologies.

30
Back up and Recovery

BC Terminologies

Recovery Point Objective (RPO)


• Point in time to which systems and data must be recovered after an outage.
• Amount of data loss that a business can endure.

31
Back up and Recovery

BC Terminologies
Recovery Time Objective (RTO)
• Time within which systems, applications, or functions must be recovered after an outage.
• Amount of downtime that a business can endure and survive.

32
Back up and Recovery

Business Continuity
(BC) Planning Lifecycle

33
Back up and Recovery

Business Continuity Planning (BCP) Process


• Identifying the critical business functions.
• Collecting data on various business processes within those functions.
• Business Impact Analysis (BIA)
• Risk Analysis
• Assessing, prioritising, mitigating, and managing risk.
• Designing and developing contingency plans and Disaster Recovery plan (DR Plan).
• Testing, training and maintenance.

34
Back up and Recovery

Business Continuity (BC) Planning Lifecycle

BC planning must follow a disciplined approach like any other planning process. Organisations today
dedicate specialised resources to develop and maintain BC plans. From the conceptualisation to the
realisation of the BC plan, a lifecycle of activities can be defined for the BC process.

35
Back up and Recovery

Business Continuity (BC) Planning Lifecycle

The BC planning lifecycle includes five stages:

1. Establishing objectives

2. Analysing

3. Designing and developing

4. Implementing

5. Training, testing, assessing, and maintaining

36
Back up and Recovery

Business Continuity (BC) Planning Lifecycle

Establishing
objectives

Training,
testing,
Analysing
assessing, and
maintaining

Implementing
Designing and
developing

37
Back up and Recovery

Establishing objectives

• Determine BC requirements.
• Estimate the scope and budget to achieve requirements.
• Select a BC team by considering subject matter experts from all areas of the business, whether
internal or external.
• Create BC policies.

38
Back up and Recovery

Analysing

• Collect information on data profiles, business processes, infrastructure support, dependencies, and
frequency of using business infrastructure.
• Identify critical business needs and assign recovery priorities.
• Create a risk analysis for critical areas and mitigation strategies.
• Conduct a Business Impact Analysis (BIA).
• Create a cost and benefit analysis based on the consequences of data unavailability.
• Evaluate options.

39
Back up and Recovery

Designing and developing

• Define the team structure and assign individual roles and responsibilities. For example, different teams
are formed for activities such as emergency response, damage assessment, and infrastructure and
application recovery.
• Design data protection strategies and develop infrastructure.
• Develop contingency scenarios.
• Develop emergency response procedures.
• Detail recovery and restart procedures.

40
Back up and Recovery

Implementing

• Implement risk management and mitigation procedures that include backup, replication, and
management of resources.
• Prepare the disaster recovery sites that can be utilised if a disaster affects the primary data centre.
• Implement redundancy for every resource in a data centre to avoid single points of failure.

41
Back up and Recovery

Training, testing, assessing, and maintaining

• Train the employees who are responsible for backup and replication of business-critical data on a
regular basis or whenever there is a modification in the BC plan.
• Train employees on emergency response procedures when disasters are declared.
• Train the recovery team on recovery procedures based on contingency scenarios.
• Perform damage assessment processes and review recovery plans.
• Test the BC plan regularly to evaluate its performance and identify its limitations.
• Assess the performance reports and identify limitations.
• Update the BC plans and recovery/restart procedures to reflect regular changes within the data centre.

42
Back up and Recovery

Self Assessment Question

7. Which one of the given options is included in the BC planning lifecycle?

i. Designing and developing

ii. Implementing

iii. Training, testing, assessing, and maintaining

a) Only i

b) Only ii

c) Only i and ii

d) All i, ii and iii

Answer: d
43
Back up and Recovery

BC Technology Solutions

44
Back up and Recovery

Failure Analysis

Failure analysis involves analysing the data centre to identify systems that are susceptible to a single point
of failure and implementing fault-tolerance mechanisms such as redundancy.

45
Back up and Recovery

Single Point of Failure

Single Points of Failure refers to the failure of a component of a system that can terminate the availability of
the entire system or IT service.

46
Back up and Recovery

Fault Tolerance

To mitigate a single point of failure, systems are designed with redundancy, such that the system fails
only if all the components in the redundancy group fail. This ensures that the failure of a single
component does not affect data availability.

47
Back up and Recovery

Implementation of Fault Tolerance


[Figure: Fault-tolerant configuration. Clustered servers with a heartbeat connection, redundant ports,
redundant FC switches, redundant paths, and redundant arrays, plus a redundant IP network link to a
storage array at a remote site.]
48
Back up and Recovery

Multi-pathing Software

• Configuration of multiple paths increases data availability.


• Even with multiple paths, if a path fails, I/O will not reroute unless the system recognises that it has an
  alternate path.
• Multi-pathing software helps recognise and utilise an alternate I/O path to data.
• Multi-pathing software also provides load balancing.
• Load balancing improves I/O performance and data path utilisation.
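The failover and load-balancing behaviour described above can be sketched as a toy model; the class and path names are illustrative, not from any real multi-pathing product.

```python
import itertools

class MultipathIO:
    """Toy model of multi-pathing: round-robin load balancing across
    healthy paths, with failover past any path that has gone down."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.failed = set()
        self._rr = itertools.cycle(self.paths)

    def fail_path(self, path):
        self.failed.add(path)

    def next_path(self):
        # Skip failed paths; give up only when no healthy path remains.
        for _ in range(len(self.paths)):
            path = next(self._rr)
            if path not in self.failed:
                return path
        raise RuntimeError("no healthy path to storage")

mp = MultipathIO(["hba0->spA", "hba1->spB"])
print(mp.next_path())      # hba0->spA (round-robin across both paths)
mp.fail_path("hba1->spB")
print(mp.next_path())      # hba0->spA (failover skips the failed path)
```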

49
Back up and Recovery

Self Assessment Question

8. _________________ software helps to recognise and utilises alternate I/O path to data.

a. Multi-pathing

b. Single-pathing

c. Training

Answer: a

50
Back up and Recovery

Self Assessment Question

9. _______________ refers to the failure of a component of a system that can terminate the
availability of the entire system.

a. Single Points of Failure

b. Training, testing, assessing

c. Maintaining

Answer: a

51
Back up and Recovery

Backup and Recovery

52
Back up and Recovery

Backup and Recovery

• Backup is an additional copy of data that can be used for restore and recovery purposes.
• The Backup copy is used when the primary copy is lost or corrupted.
• Businesses back up their data to enable its recovery in case of potential loss.
• Businesses also back up their data to comply with regulatory requirements.

This Backup copy can be created by:


• Simply copying data (there can be one or more copies).
• Mirroring data (the copy is always updated with whatever is written to the primary copy).

53
Back up and Recovery

Backup purposes

• Disaster Recovery
Restores production data to an operational state after disaster.
• Operational
Restores data in the event of data loss or logical corruptions that may occur during routine processing.
• Archival
Preserve transaction records, email, and other business work products for regulatory compliance.

54
Back up and Recovery

Backup and Recovery Considerations

Considerations for Backup and Recovery are:


1. Backup Speed
2. Recovery Time
3. Reliability
4. Simplicity
5. Cost Efficiency

55
Back up and Recovery

Backup Speed

How quickly and efficiently can data be backed up?

When an appliance is first added to the system, the first complete backup will take some time.
However, once the first backup is complete, updates should take only a few minutes each day, and
be virtually unnoticed by end users on the network, assuming business-grade network connectivity
is in place. If a slow backup causes a general network slowdown for end users, IT might be forced
to suspend backups, completely derailing the data protection plan.

56
Back up and Recovery

Recovery Time

Backup speed is important, but recovery speed is where the data protection plan proves its worth. In the
event of a system crash or disaster, how fast can critical data be recovered and accessed? Businesses
should establish a Recovery Time Objective (RTO) and a Recovery Point Objective (RPO). RTO is the
time by which a business process must be restored after a disaster or disruption to avoid unacceptable
consequences of a break in business continuity. RPO is essentially the minimum frequency of backups –
which translates to the amount of time and data that would be acceptable to lose in the event of a disaster.

57
Back up and Recovery

Reliability

How reliable is your backup and recovery ecosystem?

Once you define the strategy, it is critical to implement that strategy on an infrastructure you can trust.
By ensuring both best practices and high-quality, reputable storage hardware and software systems, the
organisation can be confident that backed up data will be there when it needs to be recovered.

58
Back up and Recovery

Simplicity

If the data protection strategy and systems are not easy to implement, you risk mistakes or missed backups
at critical times. Additionally, ease of use and management capabilities are critical for evaluating long-term
resources necessary to maintain an effective strategy, especially when IT resources are limited or
outsourced. While some technology platforms may be more affordable, an administrator’s time to monitor
them could add significantly to cost. Look for solutions that enable storage administrators to set policies so that
backups run seamlessly without much need for interaction.

59
Back up and Recovery

Cost Efficiency

What is the cost – not just to acquire, but also to operate?

Key Total Cost of Ownership (TCO) considerations include energy efficiency, maintenance, and scaling up
the system to manage data growth. A tiered storage strategy is critical here. Tying storage system
performance and cost to the value of the data, and quickly moving fixed content and other non-production
information onto reliable, lower cost systems will help manage cost while delivering on the access level and
recovery time objectives required.

60
Back up and Recovery

Self Assessment Question

10. Which one of the given options are considered for Backup and Recovery?

i. Backup Speed

ii. Recovery Time

iii. Reliability

a) Only i

b) Only ii

c) Only i and ii

d) All i, ii and iii

Answer: d
61
Back up and Recovery

Self Assessment Question

11. This Backup copy can be created by _____________ data.

a. Simply copying

b. Fetching

c. Reliability

Answer: a

62
Back up and Recovery

Backup Granularity

63
Back up and Recovery

Backup granularity

Backup granularity describes the level of detail characterising backup data. In data backup and restore
systems, increased granularity improves recovery point objectives because the backup process
operates using more discrete granules, or increments, of data, minimising the amount of data lost during
a system restore.

64
Back up and Recovery

Backup granularity levels


[Figure: Backup granularity levels. A full backup runs every Sunday. With cumulative (differential)
backups, each weekday backup captures all changes since the last full backup, so the amount of data
backed up grows through the week. With incremental backups, each weekday backup captures only the
changes since the previous backup.]


65
Back up and Recovery

Restoring from Incremental Backup

• Only files that have changed since the last backup are backed up.
• Fewest number of files to back up, therefore faster backup and less storage space.
• Longer restore, because the last full backup and all subsequent incremental backups must be applied.

66
Back up and Recovery

Restoring from Incremental Backup

Monday: full backup (files 1, 2, 3)
Tuesday: incremental backup (file 4)
Wednesday: incremental backup (updated file 3)
Thursday: incremental backup (file 5)
Friday: restore to production recovers files 1, 2, 3, 4, 5 by applying the full backup and every subsequent
incremental.

67
Back up and Recovery

Restoring from Cumulative Backup

• More files to be backed up, therefore it takes more time to backup and uses more storage space.
• Much faster restore because only the last full and the last cumulative backup must be applied.
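The two restore paths can be contrasted with a small sketch; the helper name and weekday labels are illustrative.

```python
def restore_chain(backups, policy):
    """Return the backup media needed for a full restore.

    `backups` is a chronological list of (day, kind) pairs, where kind is
    'full', 'incremental' or 'cumulative'.
    """
    last_full = max(i for i, (_, kind) in enumerate(backups) if kind == "full")
    if policy == "incremental":
        # Last full backup plus every incremental after it.
        return [day for day, _ in backups[last_full:]]
    # Cumulative: last full backup plus only the most recent cumulative.
    chain = [backups[last_full][0]]
    if last_full < len(backups) - 1:
        chain.append(backups[-1][0])
    return chain

week_inc = [("Mon", "full"), ("Tue", "incremental"),
            ("Wed", "incremental"), ("Thu", "incremental")]
week_cum = [("Mon", "full"), ("Tue", "cumulative"),
            ("Wed", "cumulative"), ("Thu", "cumulative")]

print(restore_chain(week_inc, "incremental"))  # ['Mon', 'Tue', 'Wed', 'Thu']
print(restore_chain(week_cum, "cumulative"))   # ['Mon', 'Thu']
```

The incremental chain grows with every backup taken, while the cumulative chain never needs more than two media, which is why the cumulative restore is faster.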

68
Back up and Recovery

Restoring from Cumulative Backup

Monday: full backup (files 1, 2, 3)
Tuesday: cumulative backup (file 4)
Wednesday: cumulative backup (files 4, 5)
Thursday: cumulative backup (files 4, 5, 6)
Friday: restore to production recovers files 1, 2, 3, 4, 5, 6 by applying the full backup and only the last
cumulative.

69
Back up and Recovery

Self Assessment Question

12. _____________ describes the level of detail characterising backup data.

a. Backup granularity

b. Data Recovery

c. Reliability

Answer: a

70
Back up and Recovery

Backup Methods

71
Back up and Recovery

Backup Methods

• Cold or offline
• Hot or online
• Open file
• Retry
• Open File Agents
• Point in Time (PIT) replica
• Backup file metadata for consistency
• Bare metal recovery

72
Back up and Recovery

Backup Process

A backup system uses a client/server architecture with a backup server and multiple backup clients. The backup
server manages the backup operations and maintains the backup catalogue, which contains information
about the backup process and backup metadata.

73
Back up and Recovery

Backup Architecture

1. Backup client
Sends backup data to backup server or storage node.
2. Backup server
Manages backup operations and maintains the backup catalogue.
3. Storage node
Responsible for writing data to backup device.

74
Back up and Recovery

Backup Architecture & Process

[Figure: Backup architecture. The application server/backup client sends backup data to the backup
server/storage node, which writes it to a storage array or tape library.]

75
Back up and Recovery

Backup Operation

1. Start of scheduled backup process.


2. Backup server retrieves backup-related information from the backup catalogue.
3. Backup server instructs storage node to load backup media in the backup device.
4. Backup server instructs backup clients to send their metadata to the backup server and the data to be
   backed up to the storage node.
5. Backup clients send data to storage node.
6. Storage node sends data to backup device.
7. Storage node sends metadata and media information to Backup server.
8. Backup server updates catalogue and records the status.
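The eight steps above can be traced as an ordered message flow. The component roles follow the backup architecture described earlier; the event strings themselves are illustrative.

```python
def run_backup():
    """Trace the backup operation as an ordered list of events."""
    events = []
    events.append("server: scheduled backup starts")                  # step 1
    events.append("server: reads backup catalogue")                   # step 2
    events.append("server -> storage node: load backup media")        # step 3
    events.append("server -> clients: send metadata and data")        # step 4
    events.append("clients -> storage node: data")                    # step 5
    events.append("storage node -> backup device: data")              # step 6
    events.append("storage node -> server: metadata, media info")     # step 7
    events.append("server: updates catalogue, records status")        # step 8
    return events

for step, event in enumerate(run_backup(), start=1):
    print(step, event)
```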

76
Back up and Recovery

Backup Operation
[Figure: Backup operation message flow. Steps 1 to 8 above, exchanged between the backup server, the
application server and backup clients, the storage node, and the backup device.]
77


Back up and Recovery

Restore Operation
1. Backup server scans backup catalogue to identify data to be restored and the client who will receive data.
2. Backup server instructs storage node to load backup media in backup device.
3. Data is then read and sent to backup client.
4. Storage node sends restore metadata to backup server.
5. Backup server updates catalogue.

78
Back up and Recovery

Restore Operation
[Figure: Restore operation message flow. Steps 1 to 5 above, exchanged between the backup server, the
application server and backup clients, the storage node, and the backup device.]
79


Back up and Recovery

Self Assessment Question

13. __________________ retrieves backup related information from backup catalogue.

a. Backup granularity

b. Backup server

c. Reliability

Answer: b

80
Back up and Recovery

Backup Topologies

81
Back up and Recovery

Backup Topologies

There are 4 basic backup topologies:

1. Direct Attached Based Backup


2. LAN Based Backup
3. SAN Based Backup
4. Mixed backup

82
Back up and Recovery

Direct Attached Based Backup


In a direct-attached backup, a backup device is attached directly to the client. Only the metadata is sent to
the backup server through the LAN. This configuration frees the LAN from backup traffic.

[Figure: Direct-attached backup. The application server (backup client and storage node) writes backup
data directly to the attached backup device; only metadata crosses the LAN to the backup server.]

83
Back up and Recovery

LAN Based Backups

In LAN-based backup, all servers are connected to the LAN and all storage devices are directly attached
to the storage node. The data to be backed up is transferred from the backup client (source), to the
backup device over the LAN, which may affect network performance.

84
Back up and Recovery

LAN Based Backups


[Figure: LAN-based backup. The application server/backup client sends metadata to the backup server
and backup data across the LAN to the storage node, which writes it to the backup device.]
85
Back up and Recovery

SAN Based Backups

The SAN-based backup is also known as the LAN-free backup. The SAN-based backup topology is the
most appropriate solution, when a backup device needs to be shared among the clients. In this case, the
backup device and clients are attached to the SAN.

86
Back up and Recovery

SAN Based Backups

[Figure: SAN-based (LAN-free) backup. Metadata travels over the LAN to the backup server, while backup
data moves over the FC SAN from the application server/backup client and storage node to the shared
backup device.]

87
Back up and Recovery

Mixed Backup

The mixed topology uses both the LAN-based and SAN-based topologies. This topology might be
implemented for several reasons, including cost, server location, reduction in administrative overhead, and
performance considerations.

88
Back up and Recovery

Mixed Backup

[Figure: Mixed backup topology. One application server/backup client backs up over the LAN while
another backs up over the FC SAN; metadata from both reaches the backup server, and the storage node
writes the data to the backup device.]
89
Back up and Recovery

Backup Technology

90
Back up and Recovery

Backup Technology

A wide range of technology solutions are currently available for backup. Tapes and disks are the two most
commonly used backup media.

The most common backup technologies are :


1. Backup to Tape (Physical Tape Library)
2. Backup to Disk
3. Backup to virtual tape

91
Back up and Recovery

Backup to Tape
• Traditional destination for backup
• Low cost option
• Sequential / Linear Access
• Multiple streaming (Backup streams from multiple clients to a single backup device)

[Figure: Multiple streaming. Data from streams 1, 2 and 3 is interleaved onto a single tape.]

92
Back up and Recovery

Physical Tape Library

93
Back up and Recovery

Tape Limitations

• Reliability
• Restore performance
• Mount, load to ready, rewind, dismount times
• Sequential Access
• Cannot be accessed by multiple hosts simultaneously
• Controlled environment for tape storage
• Wear and tear of tape
• Shipping/handling challenges
• Tape management challenges

94
Back up and Recovery

Backup to Disk

• Ease of implementation
• Fast access
• More Reliable
• Random Access
• Multiple host access
• Enhanced overall backup and recovery performance

95
Back up and Recovery

Virtual Tape Library

A Virtual Tape Library (VTL) has the same components as that of a physical tape library except that the
majority of the components are presented as virtual resources. For backup software, there is no
difference between a physical tape library and a virtual tape library.

96
Back up and Recovery

Virtual Tape Library

[Figure: Virtual tape library – backup clients connect over the LAN to the backup server/storage node, which connects over the FC SAN to the virtual tape library appliance; inside the appliance, an emulation engine presents disk storage (LUNs) as tape drives.]

97
Back up and Recovery

Tape Versus Disk Versus Virtual Tape

Difference             Tape                                Disk-Aware Backup-to-Disk       Virtual Tape

Offsite capabilities   Yes                                 No                              Yes
Reliability            No inherent protection methods      RAID, spare                     RAID, spare
Performance            Subject to mechanical operations,   Faster single stream            Faster single stream
                       load times
Use                    Backup only                         Multiple (backup/production)    Backup only

98
Back up and Recovery

Self Assessment Question

14. LAN Based Backup is a ____________________.

a. Backup Technology

b. Backup Topology

c. Recovery Method

Answer: b

99
Back up and Recovery

Self Assessment Question


15. Which one of the given options is/are the most common backup technologies?

i. Backup to Disk
ii. Backup to virtual tape
iii. Backup to Tape (Physical Tape Library)

a) Only i
b) Only ii
c) Only i and ii
d) All i, ii and iii
Answer: d
100
Back up and Recovery

Assignment
1. A network router has a failure rate of 0.02 percent per 1,000 hours. What is the MTBF of that component?
2. Provide examples of planned and unplanned downtime in the context of data centre operations.
3. How does clustering help to minimise RTO?
4. Discuss the security concerns in backup environment.
5. Differentiate the benefits of using “virtual tape library” over “physical tapes.”
6. What is the necessity of business continuity?
7. Discuss the threats to information availability.
8. Define Information availability.
9. Explain the causes of Information Unavailability.
10. Explain how to measure Information Availability.

101
Back up and Recovery

Assignment
11. Discuss the Consequences of Downtime of data availability.
12. Elaborate the strategies to meet RPO and RTO targets.
13. With a neat diagram, explain the BC Planning Lifecycle.
14. With a neat diagram, explain Single point of failure analysis.
15. Explain enhancements in the infrastructure to mitigate single points of failures.
16. Discuss the BC Technology Solutions where data can be recovered and business operations can be
restarted using an alternate copy of data.
17. Explain the importance of backup of data.
18. Compare Cumulative (or differential) and Synthetic (or constructed) full backup.
19. Explain different types of backup methods.
20. With a neat diagram, explain the backup process.

102
Back up and Recovery

Assignment
21. Differentiate between various backup technology options.
22. Explain the basic topologies that are used in a backup environment.
23. Demonstrate how RPO and RTO are the major considerations when planning a backup strategy.

103
Back up and Recovery

Summary
✓ Business Continuity is preparing for, responding to, and recovering from an application outage that
adversely affects business operations.

✓ Information Availability can be defined in terms of three parameters that is reliability, accessibility and
timeliness.

✓ BC planning must follow a disciplined approach like any other planning process.

✓ Failure analysis involves Analysing the data centre to identify systems that are susceptible to a single
point of failure and implementing fault-tolerance mechanisms such as redundancy.

✓ Backup is an additional copy of data that can be used for restore and recovery purposes.

✓ Backup granularity describes the level of detail characterising backup data.

104
Back up and Recovery

Summary
✓ A backup system uses client/server architecture with a backup server and multiple backup clients.

✓ In LAN-based backup, all servers are connected to the LAN and all storage devices are directly
attached to the storage node. The data to be backed up is transferred from the backup client (source), to
the backup device over the LAN, which may affect network performance.

105
Back up and Recovery

Document Links

Topics, URL and Notes:

• Business Continuity: https://searchdisasterrecovery.techtarget.com/definition/business-continuity (Explains the introduction of business continuity.)
• BC Life Cycle: https://www.milton-keynes.gov.uk/business/business-continuity-risk-and-resilience/business-continuity-management-lifecycle (Explains the BC life cycle.)
• Backup technologies and recovery considerations: http://www.nexsan.com/key-considerations-backup-and-recovery/ (Discusses the various backup and recovery considerations.)
• Backup technologies and recovery considerations: http://wikibon.org/wiki/v/Backup_and_recovery_techniques (Discusses the various techniques of backup and recovery.)

106
Back up and Recovery

Video Links

Topics, URL and Notes:

• Business Continuity: https://www.youtube.com/watch?v=hciSIFsbdJs (Explains business continuity.)
• Introduction to Backup Systems: http://youtube.com/watch?v=k6dosJ9phWY (Explains the need for backup systems.)
• Basic Backup Methods: http://youtube.com/watch?v=a_--9FQhph0 (Gives insight into backup methods.)
• Backup Basics: http://youtube.com/watch?v=TLXaZSqoBs8 (Describes the basics of backup technology.)
• Backup Types: http://youtube.com/watch?v=kflzD5nxI8g (Covers the different backup types.)

107
Back up and Recovery

E-Book Links

Topics, URL and Page Numbers:

• Introduction to Business Continuity: http://wachum.org/dewey/000/ism1.pdf (Pages 229 to 241)
• Backup and Recovery: http://wachum.org/dewey/000/ism1.pdf (Pages 251 to 275)

108
DEPARTMENT OF BCA
STORAGE MANAGEMENT
SUB CODE: 16BCA525D21
MODULE 5: REPLICATION-LOCAL AND REMOTE

Module 5 V Sem, SM, Department of BCA


CONTENTS :

Module 5 :
Source and Target -Uses of Local Replicas, Data Consistency - Consistency of a Replicated File System -
Consistency of a Replicated Database , Local Replication Technologies - Host-Based Local Replication -
Storage Array-Based Replication , Restore and Restart Considerations - Tracking Changes to Source and
Target , Creating Multiple Replicas, Management Interface – Remote Replication Modes – Remote
Replication Technologies – Network Infrastructure

10/10/2021 2
V SEM, SM, DEPARTMENT OF BCA
Replication – Local and Remote

AIM:
To familiarise students with fundamentals of Replication – Local and Remote.

3
Replication – Local and Remote

Objectives:
The objectives of this module are to:
• Understand exact copy of data through replication.
• Define additional copy of data that can be used for restore and recovery purposes.
• Understand local replication and the possible uses of local replicas.
• Define consistency considerations when replicating file systems and databases.
• Comprehend host and array based replication technologies
• Functionality
• Differences
• Considerations
• Selecting the appropriate technology

4
Replication – Local and Remote

Outcomes:
At the end of this module, you are expected to:
• Explain local replication.
• List the possible uses of local replicas.
• Outline replica considerations such as Recoverability and Consistency.
• Describe how consistency is ensured in file system and database replication.
• Explain Dependent write principle.

5
Replication – Local and Remote

Content
• Local replication technologies
• Host based Local Replication technologies
• LVM based mirroring
• File system snapshot
• Array based Local Replication technologies
• Full volume mirroring
• Pointer-based full volume copy
• Pointer-based virtual replica
• Restore and restart considerations
• Modes of remote replications
• Remote replication technologies 6
Replication – Local and Remote

Local Replicas

7
Replication – Local and Remote

Replication
A replica is an exact copy of the original; replication is the process of creating such an exact
copy of data.

[Figure: Replication – data flows from the Source to the Replica (Target).]

8
Replication – Local and Remote

Uses of Local Replicas


1. Alternate source for backup : An alternative to doing backup on production volumes.
2. Fast recovery : Provides minimal RTO (Recovery Time Objective).
3. Decision support : Use replicas to run decision support operations such as creating a
report and reduce burden on production volumes.
4. Testing platform : To test critical business data or applications.
5. Data Migration : Use replicas to do data migration instead of production volumes.

9
Replication – Local and Remote

Replication Considerations
• Types of replica: the choice of replica ties back to the RPO (recovery point objective)
• Point-in-Time (PIT)
• Non zero RPO
• Continuous
• Near zero RPO
• What makes a replica good?
• Recoverability/Re-startability
• Replica should be able to restore data on the source device.
• Restart business operation from replica.
• Consistency
• Ensuring consistency is a primary requirement for all the replication technologies.

10
Replication – Local and Remote

Understanding Consistency
• Ensure data buffered in the host is properly captured on the disk when replica is created.
• Data is buffered in the host before written to disk.
• Consistency is required to ensure the usability of replica.
• Consistency can be achieved in various ways:
• For file Systems
• Offline: Un-mount file system
• Online: Flush host buffers
• For Databases
• Offline: Shutdown database
• Online: Database in hot backup mode
• Dependent Write I/O Principle
• By Holding I/Os

11
Replication – Local and Remote
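The online file-system case above (flush host buffers, then replicate) can be illustrated with a toy Python model; `BufferedFS` and `consistent_snapshot` are invented names for illustration only, not a real file-system API.

```python
class BufferedFS:
    """Toy model of a file system: writes land in host memory buffers first;
    a sync (like the sync daemon) flushes them to disk."""

    def __init__(self):
        self.disk = {}      # data actually on disk
        self.buffer = {}    # data still buffered in host memory

    def write(self, path, data):
        self.buffer[path] = data

    def sync(self):
        self.disk.update(self.buffer)   # flush buffers to disk
        self.buffer.clear()

def consistent_snapshot(fs):
    """Online consistency: flush the host buffers, then copy the on-disk state.
    Skipping the sync would replicate a disk image missing buffered writes."""
    fs.sync()
    return dict(fs.disk)
```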

File System Consistency: Flushing Host Buffer

[Figure: Flushing the host buffer – application data passes from the file system through memory buffers, which the sync daemon flushes via the logical volume manager and physical disk driver to disk; only then is the replica created from the source.]

12
Replication – Local and Remote

Database Consistency: Dependent write I/O Principle


• Dependent Write: A write I/O that will not be issued by an application until a prior related write
I/O has completed

• A logical dependency, not a time dependency.

• Inherent in all Database Management Systems (DBMS)

• e.g. Page (data) write is dependent write I/O based on a successful log write.

• Necessary for protection against local outages

• Power failures create a dependent write consistent image.

• A restart transforms the dependent-write-consistent image into a transactionally consistent one.

• i.e., Committed transactions will be recovered, in-flight transactions will be discarded.


13
Replication – Local and Remote
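A minimal Python sketch of the dependent write principle (helper names are hypothetical): the data-page writes are issued only after the prior log write completes, so any image that captured the writes in order has a log record for every data page and can be restarted.

```python
def commit(tx_id, pages, write):
    """Dependent write ordering: data-page writes are issued only after the
    prior, related log write has completed (i.e. the call has returned)."""
    write(("log", tx_id))                 # the prior related write...
    for page in pages:
        write(("data", tx_id, page))      # ...then the dependent writes

def is_restartable(journal):
    """A captured image is transactionally recoverable only if every data
    page present has its corresponding log record."""
    logged = {rec[1] for rec in journal if rec[0] == "log"}
    return all(rec[1] in logged for rec in journal if rec[0] == "data")
```

Truncating the journal at any point mid-`commit` still leaves a restartable image, which is exactly why write ordering matters for replicas.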

Database Consistency: Dependent Write I/O


[Figure: Dependent write I/O – left: writes 1–4 are captured on both source and replica in order, so the replica is consistent; right: the replica holds writes 3 and 4 but not 1 and 2, so it is inconsistent.]
14
Replication – Local and Remote

Database Consistency: Holding I/O

[Figure: Holding I/O – I/Os are held while writes 1–4 are captured on both source and replica; write 5 is held back, so the replica is consistent.]
15
Replication – Local and Remote

Self Assessment Question


1. Which one of the given options is the abbreviation of RTO?

a. Recovery Task Outcome

b. Recovery Time Outcome

c. Recovery Time Objective

Answer: Recovery Time Objective

16
Replication – Local and Remote

Self Assessment Question


2. Which one of the given options is the abbreviation of RPO?

a. Recovery Point Objective

b. Recovery Point Outcome

c. Recovery Power Objective

Answer: Recovery Point Objective

17
Replication – Local and Remote

Self Assessment Question


3. Replication can be classified into two major categories:

a. Local only

b. Remote only

c. Local and Remote

Answer: Local and Remote

18
Replication – Local and Remote

Self Assessment Question


4. A host accessing data from one or more LUNs on the storage array is called as _____________ host.

a. Sink

b. Source

c. Product

d. Production

Answer: Production

19
Replication – Local and Remote

Self Assessment Question


5. A LUN (or LUNs) on which the data is replicated is called the ______________.

a. target LUN

b. simply the target

c. Replica

d. All of Above

Answer: All of Above

20
Replication – Local and Remote

Self Assessment Question


6. The period during which a source is available to perform a data backup is called a ___________
window.

a. Front

b. Back

c. Backup

d. Source

Answer: Backup
21
Replication – Local and Remote

Self Assessment Question


7. ___________________ replica ensures that data buffered in the host is properly captured on the disk
when the replica is created.

a. Inconsistent

b. Consistent

c. Full

d. Partial

Answer: Consistent
22
Replication – Local and Remote

Self Assessment Question


8. _________ daemon is the process that flushes the buffers to disk at set intervals.

a. Awk

b. background

c. Sync

d. Async

Answer: Sync

23
Replication – Local and Remote

Self Assessment Question


9. According to _______________ principle, a write I/O is not issued by an application until a prior
related write I/O has completed.

a. Dependent read I/O

b. Independent read I/O

c. Independent write I/O

d. Dependent write I/O

Answer: Dependent write I/O


24
Replication – Local and Remote

Self Assessment Question


10. What happens to the logging activity when a database is in hot backup mode?

a. Logging activity decreases

b. Logging activity Increases

c. Logging activity Synchronise

d. Logging activity remains constant

Answer: Logging activity Increases

25
Replication – Local and Remote

Local Replication Technologies


There are two types of local Replication technology:
1. Host based replication
• Logical Volume Manager (LVM) based replication (LVM mirroring)
• File System Snapshot
2. Storage Array based replication
• Full volume mirroring
• Pointer based full volume replication
• Pointer based virtual replication

26
Replication – Local and Remote

Host-Based Local Replication


In host-based replication, Logical Volume Managers (LVMs) or the file systems perform the local
replication process. LVM-based replication and File System (FS) snapshot are examples of host-
based local replication.

In LVM-based replication, logical volume manager is responsible for creating and controlling the
host-level logical volume.

27
Replication – Local and Remote

Host-Based Local Replication

[Figure: Host-based local replication (LVM mirroring) – the host's logical volume is mirrored across Physical Volume 1 and Physical Volume 2.]

28
Replication – Local and Remote

Storage Array–Based Replication


In storage array-based local replication, the array operating environment performs the local
replication process. The host resources such as CPU and memory are not used in the replication
process. Consequently, the host is not burdened by the replication operations. The replica can be
accessed by an alternate host for any business operations.

29
Replication – Local and Remote

File system (FS) snapshot

File system (FS) snapshot is a pointer-based replica that requires a fraction of the space used by
the original FS. This snapshot can be implemented by either FS itself or by LVM. It uses Copy on
First Write (CoFW) principle.

30
Replication – Local and Remote

Storage Array–Based Replication

[Figure: Storage array-based replication – within the array, data is copied from the source to the replica; the production server accesses the source while a BC server can access the replica.]

31
Replication – Local and Remote

Full Volume Mirroring


• Target is a full physical copy of the source device.
• Target is attached to the source and data from source is copied to the target.
• Target is unavailable while it is attached.
• Target device is as large as the source device.
• Good for full backup, decision support, development, testing and restore to last PIT.

32
Replication – Local and Remote

Full Volume Mirroring

[Figure: Full volume mirroring – while attached, the source remains read/write and the target is Not Ready; both devices reside in the same array.]

33
Replication – Local and Remote

Pointer-based Full Volume Replication


• Provides full copy of source data on the target.
• Target device is made accessible for business operation as soon as the replication session is started.
• Point-in-Time is determined by time of session activation.
• Two modes
• Copy on First Access (deferred)
• Full Copy mode
• Target device is at least as large as the source device

34
Replication – Local and Remote

Copy on First Access (CoFA) Mode


[Figure: Copy on First Access (deferred) mode – on the first write to the source, the first write to the target, or the first read from the target, the affected data is copied from source to target; both devices remain read/write throughout.]

35
Replication – Local and Remote

Pointer Based Virtual Replication


• Targets do not hold actual data, but hold pointers to where the data is located.
• Target requires a small fraction of the size of the source volumes.
• A replication session is setup between source and target devices.
• Target devices are accessible immediately when the session is started.
• At the start of the session the target device holds pointers to data on source device.
• Typically recommended if the changes to the source are less than 30%.

36
Replication – Local and Remote

Pointer-based Virtual Replication Write to Source


When a write is issued to the source for the first time after replication session activation:
• Original data at that address is copied to save location.
• The pointer in the target is updated to point to this data in the save location.
• Finally, the new write is updated on the source.
[Figure: Write to source – the original data C is copied from the source to the save location (C'), the target virtual device's pointer is redirected to the save location, and the new write is then applied to the source.]

37
Replication – Local and Remote

Pointer-based Virtual Replication Write to Target


When a write is issued to the target for the first time after session activation:
• Original data from the source device is copied to the save location and similarly the
pointer is updated to the data in save location.
• Another copy of the original data is created in the save location before the new write
is updated on the save location.

[Figure: Write to target – original data A is copied from the source to the save location; the new write A' is applied to a second copy in the save location, and the target virtual device's pointer is updated accordingly.]
38
Replication – Local and Remote

Virtual Replication: Copy on First Write Example


[Figure: Copy on First Write example – source volume, save location, and target virtual device.]

39
Replication – Local and Remote
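The CoFW mechanism described above can be modelled in a short Python sketch (`VirtualReplica` is an illustrative name, not a product API): the first write to an address preserves the point-in-time data in the save location, and reads of the target follow pointers either to the save location or back to the source.

```python
class VirtualReplica:
    """Pointer-based virtual replica using Copy on First Write (CoFW).

    The target holds only pointers: each address is read either from the
    source (unchanged since the point-in-time) or from the save location."""

    def __init__(self, source):
        self.source = source    # live production volume, modelled as a dict
        self.save = {}          # save location holding preserved PIT data

    def write_source(self, addr, data):
        if addr not in self.save:                # first write after activation:
            self.save[addr] = self.source[addr]  # preserve the PIT copy first
        self.source[addr] = data                 # then apply the new write

    def read_target(self, addr):
        # pointer to the save location if preserved, else still to the source
        return self.save.get(addr, self.source[addr])
```

Note the restore implication stated later in this module: because unchanged addresses are still read from the source, the replica is only usable while the source device is healthy.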

Tracking Changes to Source and Target


• Changes will/can occur to the Source/Target devices after PIT has been created.
• How and at what level of granularity should this be tracked?
• Too expensive to track changes at a bit by bit level.
• Would require an equivalent amount of storage to keep track.
• Based on the vendor, some level of granularity is chosen and a bit map is created (one for
source and one for target)
• For example one could choose 32 KB as the granularity.
• If any change is made to any bit on one 32KB chunk the whole chunk is flagged as
changed in the bit map.
• For 1GB device, map would only take up 32768/8/1024 = 4KB space.

40
Replication – Local and Remote
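The arithmetic above can be checked with a small Python sketch (function names are illustrative): one bit per 32 KB chunk, and a logical OR of the source and target bitmaps gives the set of chunks to copy during resynchronisation or restore.

```python
CHUNK = 32 * 1024  # 32 KB granularity, as in the example above

def bitmap_size_bytes(device_bytes, chunk=CHUNK):
    """One bit per chunk, rounded up to whole bytes."""
    chunks = -(-device_bytes // chunk)   # ceiling division
    return -(-chunks // 8)

def chunks_to_resync(source_bits, target_bits):
    """Logical OR of the two bitmaps: a chunk changed on either side must
    be copied during resynchronisation/restore."""
    return [s | t for s, t in zip(source_bits, target_bits)]
```

For a 1 GB device this reproduces the slide's figure: 32,768 chunks tracked in a 4 KB bitmap.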

Tracking Changes to Source and Target: Bitmap

At PIT:
  Source bitmap: 0 0 0 0 0 0 0 0
  Target bitmap: 0 0 0 0 0 0 0 0

After PIT:
  Source bitmap: 1 0 0 1 0 1 0 0
  Target bitmap: 0 0 1 1 0 0 0 1

For resynchronisation/restore, logical OR: 1 0 1 1 0 1 0 1

(0 = unchanged, 1 = changed)

41
Replication – Local and Remote

Restore/Restart Operation
• Source has a failure
• Logical Corruption
• Physical failure of source devices
• Failure of Production server
• Solution
• Restores data from target to source
• The restores would typically be done incrementally.
• Applications can be restarted even before synchronisation is complete.
-----OR------
• Start production on target
• Resolve issues with source while continuing operations on target.
• After issue resolution restore latest data on target to source.
42
Replication – Local and Remote

Restore/Restart Considerations
• Before a Restore
• Stop all access to the Source and Target devices.
• Identify target to be used for restore.
• Based on RPO and Data Consistency.
• Perform Restore
• Before starting production on target
• Stop all access to the Source and Target devices.
• Identify Target to be used for restart.
• Based on RPO and Data Consistency.
• Create a “Gold” copy of Target
• As a precaution against further failures
• Start production on Target

43
Replication – Local and Remote

Restore/Restart Considerations (Cont…)


• Pointer-based full volume replicas
• Restores can be performed to either the original source device or to any other device of like size.
• Restores to the original source could be incremental in nature.
• Restore to a new device would involve a full synchronisation.
• Pointer-based virtual replicas
• Restores can be performed to the original source or to any other device of like size as long as the
original source device is healthy.
• Target only has pointers
• Pointers to source for data that has not been written to after PIT.
• Pointers to the "save" location for data that was written after PIT.
• Thus, to perform a restore to an alternate volume the source must be healthy to access data
that has not yet been copied over to the target.

44
Replication – Local and Remote

Restore and Restart Considerations


• Source has a failure
• Logical corruption or physical failure of source devices.
• Solution
• Restore data from target to source.
• Restore would typically be done incrementally.
• Applications can be restarted even before synchronisation is complete.
-----OR------
• Start production on target
• Create a “Gold” copy of target device before restarting on target.
• Resolve issues with source while continuing operations on target.
• After resolving the issue, restore latest data on target to source. 45
Replication – Local and Remote

Comparison of Local Replication Technologies

Factor                                   Full-Volume Mirroring             Pointer-based Full-Volume Replication   Pointer-based Virtual Replication

Performance impact on source             No impact                         Full copy mode – no impact;             High impact
due to replica                                                             CoFA mode – some impact
Size of target                           At least the same as the source   At least the same as the source         Small fraction of the source
Availability of source for restoration   Not required                      Full copy mode – not required;          Required
                                                                           CoFA mode – required
Accessibility to target                  Only after synchronisation and    Immediately accessible                  Immediately accessible
                                         detachment from the source

46
Replication – Local and Remote

Creating Multiple Replicas

[Figure: Point-in-time copies of the source are taken onto separate target devices at 06:00 A.M., 12:00 P.M., 06:00 P.M., and 12:00 A.M.]

47
Replication – Local and Remote

Local Replication Management: Array Based


• Replication management software residing on storage array.

• Provides an interface for easy and reliable replication management.

• Two types of interface:

• CLI

• GUI

48
Replication – Local and Remote

Self Assessment Question


11. LVM-based replication and File System (FS) snapshot are examples of ____________.

a. Host based local

b. Storage Array based

c. Full volume mirroring

d. Host-based remote

Answer: Host based local

49
Replication – Local and Remote

Self Assessment Question


12. Pointer based full volume replication is a technology of ___________________.

a. Host based replication

b. Storage Array based replication

c. Full volume mirroring

Answer: Storage Array based replication

50
Replication – Local and Remote

Self Assessment Question


13. LVM-based remote replication supports _________________ modes of data transfer.

a. Both synchronous and asynchronous

b. Only synchronous

c. Only Asynchronous

d. None

Answer : Both synchronous and asynchronous

51
Replication – Local and Remote

Self Assessment Question


14. File system (FS) snapshot is a ________________ that requires a fraction of the space used by the
original FS.

a. Host based replication

b. Storage Array based replication

c. Full volume mirroring

d. Pointer-based replica

Answer : Pointer-based replica


52
Replication – Local and Remote

Self Assessment Question


15. In ________________ the target is attached to the source and established as a mirror of the source.

a. Host based replication

b. Storage Array based replication

c. Full volume mirroring

d. Pointer based Replication

Answer : Full volume mirroring

53
Replication – Local and Remote

Remote Replication

54
Replication – Local and Remote

Remote Replication
Remote Replication is a process of creating replicas at remote sites.

[Figure: Remote replication – the storage array at the source site is replicated to a storage array at the remote site.]

55
Replication – Local and Remote

Modes of Remote Replication


There are two modes of remote replication. They are:
1. Synchronous Mode
2. Asynchronous Mode

56
Replication – Local and Remote

Synchronous Replication
• A write is committed to both source and remote replica before it is acknowledged to the host.
• Ensures source and replica have identical data at all times.
• Provides near-zero RPO.
• Response time depends on bandwidth and distance.
• Requires bandwidth more than the maximum write workload.
• Typically deployed for distance less than 200 km (125 miles) between two sites.

57
Replication – Local and Remote
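The distance limit above is driven by propagation delay: every synchronous write waits a full round trip to the remote array before it can be acknowledged. A rough Python sketch, assuming about 5 microseconds per kilometre in optical fibre (a rule-of-thumb figure, not a measured value):

```python
def sync_write_overhead_ms(distance_km, us_per_km=5.0):
    """Added response time per synchronous write: one round trip to the
    remote array, at an assumed ~5 microseconds per km in fibre.
    Ignores equipment and protocol latency, so this is a lower bound."""
    return 2 * distance_km * us_per_km / 1000.0
```

At 200 km this already adds about 2 ms to every write, which is one reason synchronous replication is rarely deployed beyond that distance.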

Synchronous Replication

[Figure: Synchronous replication – (1) the host writes to the source; (2) the source transmits the write to the target at the remote site; (3) the target acknowledges the source; (4) the source acknowledges the host. The required bandwidth must cover the maximum write workload at all times.]

58
Replication – Local and Remote

Asynchronous Replication
• A write is committed to the source and immediately acknowledged to the host.
• Data is buffered at the source and transmitted to the remote site later.
• Finite Recovery Point Objective (RPO).
• RPO depends on size of buffer and available network bandwidth.
• Requires bandwidth equal to or greater than average write workload.
• Sufficient buffer capacity should be provisioned.
• Can be deployed over long distances.

59
Replication – Local and Remote
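The RPO relationship above can be sketched numerically (a simplified model that ignores write bursts; the function name is illustrative): data still sitting in the source buffer is the data at risk, and the buffer only drains if bandwidth is at least the average write rate.

```python
def rpo_seconds(buffered_mb, avg_write_mb_s, bandwidth_mb_s):
    """Rough asynchronous-replication RPO: the buffered data is what would
    be lost on a source-site failure; the buffer drains only if bandwidth
    covers the average write workload."""
    if bandwidth_mb_s < avg_write_mb_s:
        return float("inf")   # buffer grows without bound, RPO unbounded
    return buffered_mb / bandwidth_mb_s
```

This is why the slide calls for provisioning sufficient buffer capacity and bandwidth at least equal to the average write workload.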

Asynchronous Replication

[Figure: Asynchronous replication – (1) the host writes to the source; (2) the source immediately acknowledges the host; (3) the buffered data is transmitted to the target later; (4) the target acknowledges the source. The required bandwidth need only cover the average write workload.]
60
Replication – Local and Remote

What is Remote Replication?


• Replica is created at remote site
• Addresses risk associated with regionally driven outages.
• Could be a few miles away or half way around the globe.
• Modes of remote replication (based on RPO requirement)
• Synchronous Replication
• Asynchronous Replication

[Figure: Source site replicating to a remote site.]

61


Replication – Local and Remote

Synchronous Replication
• A write must be committed to the source and the remote replica before it is acknowledged to the host.
• Ensures source and remote replica have identical data at all times.
• Write ordering is maintained: the replica receives writes in exactly the same order as the source.
• Synchronous replication provides the lowest RPO and RTO:
   • Goal is zero RPO.
   • RTO is as small as the time it takes to start the application on the target site.

[Figure: write sequence – (1) host to source, (2) source to target, (3) target acknowledges source, (4) source acknowledges host.]

62
Replication – Local and Remote

Synchronous Replication: Bandwidth Requirement


• Response time extension
   • Application response time will be extended, because data must be transmitted to the target site before the write can be acknowledged.
   • The time to transmit depends on distance and bandwidth.
• Bandwidth
   • To minimise the impact on response time, sufficient bandwidth must be provided at all times.
• Rarely deployed beyond 200 km.

[Figure: the required bandwidth must cover the maximum of the typical write workload.]

63
Replication – Local and Remote

Asynchronous Replication
• A write is committed to the source and immediately acknowledged to the host.
• Data is buffered at the source and transmitted to the remote site later.
   • Some vendors maintain write ordering.
   • Other vendors do not maintain write ordering, but ensure that the replica will always be a consistent, re-startable image.
• Finite RPO
   • The replica will be behind the source by a finite, typically configurable, amount.

[Figure: write sequence – (1) host to source, (2) source acknowledges host, (3) source transmits to target, (4) target acknowledges source.]

64
Replication – Local and Remote

Asynchronous Replication: Bandwidth Requirement


• Response time is unaffected.
• Bandwidth
   • Only the average write workload is needed.
• Buffers
   • Sufficient buffers are needed at the source to hold data before transmission to the remote site.
• Can be deployed over long distances.

[Figure: the required bandwidth need only cover the average of the typical write workload.]

65
Replication – Local and Remote

Remote Replication Technologies


• Host based

• Logical Volume Manager (LVM) based

• Support both synchronous and asynchronous mode

• Log Shipping

• Storage Array based

• Support both synchronous and asynchronous mode

• Disk Buffered - Consistent PITs

• Combination of Local and Remote Replication

66
Replication – Local and Remote

LVM-based
• Duplicate Volume Groups at source and target sites
• All writes to the source Volume Group are replicated to
the target Volume Group by the LVM.
• Can be synchronous or asynchronous mode.
• In the event of a network failure
• Writes are queued in the log file and sent to the target (over the IP network) when
the issue is resolved.
• Size of the log file determines length of outage that can
be withstood.
• Upon failure at source site, production can be transferred to
target site.

67
Replication – Local and Remote

LVM Based – Advantages and Limitation


Advantages
• The source and target sites do not need to use the same storage arrays and RAID protection
configuration.
• Response time issue can be eliminated with asynchronous mode (with the drawback of
extended RPO).
• Specialised hardware / software is not required because LVM is available in the OS
running on the host.
Limitations
• Extended network outages require large log files.
• If log files fill up before outage is resolved, full synchronisation is required.
• Replication process adds overhead on host CPUs.
• Remote host must be continuously up and available.

68
Replication – Local and Remote

Host Based Log Shipping


• Offered by most database vendors

• Transactions to the source database are stored in logs

• Log files are transmitted to the remote host when the DBMS switches log files.

• Happens at pre-configured time intervals or when a log file is full.

• The remote host receives the logs and applies them to remote database

Advantages:

• Minimal overhead

• Low bandwidth requirement

• Standby database is consistent up to the last applied log 69


Replication – Local and Remote

Host Based Log Shipping

[Figure: Host-based log shipping – log files from the original database host are shipped over an IP network to the standby host, where they are applied to the standby database.]
70
Replication – Local and Remote

Storage Array Based Remote Replication


• Replication performed by the array operating environment
• Host CPU resources can be devoted to production operations instead of replication operations.
• Arrays communicate with each other via dedicated channels.
• ESCON, Fibre Channel or Gigabit Ethernet.
• Replicas are on different arrays
• Primarily used for DR purposes.
• Can also be used for other business operations.

[Figure: Storage array-based remote replication – the production server writes to the source on the source array; the source array replicates over an IP/FC network, across distance, to the replica on the target array, which a DR server can access.]
71
Replication – Local and Remote

Array Based – Synchronous Replication


1. Write is received by the source array from the host/server.
2. Write is transmitted by the source array to the target array.
3. Target array sends an acknowledgement to the source array.
4. Source array signals write complete to the host/server.

[Figure: source and target arrays connected by network links.]
72
Replication – Local and Remote

Array Based – Asynchronous Replication


1. Write is received by the source array from the host/server.
2. Source array signals write complete to the host/server.
3. Write is transmitted by the source array to the target array.
4. Target array sends an acknowledgement to the source array.

[Figure: source and target arrays connected by network links.]

• No impact on response time
• Extended distances between arrays
• Lower bandwidth as compared to synchronous

73
Replication – Local and Remote

Asynchronous Replication: Ensuring Consistency


• Maintain write ordering
• Some vendors attach a time stamp and sequence number with each write, then send the writes to
remote array.
• These writes are applied to the remote devices in exact order based on the time stamp and
sequence numbers.
• Dependent write consistency
• Some vendors buffer the writes in the cache of the source array for a period of time (between 5
and 30 seconds).
• At the end of this time current buffer is closed in a consistent manner and the buffer is switched,
new writes are received in the new buffer.
• Closed buffer is then transmitted to the remote array.
• Remote replica will contain a consistent, re-startable image on the application.

74
Replication – Local and Remote

Array Based – Disk Buffered Replication


• Local and Remote replication technologies can be combined to create consistent PIT copies of
data on target arrays.
• RPO usually in the order of hours
• Lower Bandwidth requirements
• Extended distance solution

[Diagram: the host writes to the source device on the source storage array; a local replica on the source array is transmitted to a remote replica on the target storage array, where a further local replica can be created.]

Remote Replicas – Tracking Changes


• Remote replicas can be used for BC operations

• Typically remote replication operations will be suspended when the remote replicas are used
for BC operations.

• During business operations, changes can occur on both the source and the remote replicas.

• Most remote replication technologies have the ability to track changes made to the source and
remote replicas to allow for incremental re-synchronisation.

• Resuming remote replication operations will require re-synchronisation between the source and
replica.
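One common way to track such changes is a dirty-region bitmap on each side; resynchronisation then copies only the union of the changed chunks. A minimal sketch follows, in which the chunked volume and the "source wins" policy are illustrative assumptions, not a specific product's behaviour.

```python
class TrackedVolume:
    """Volume split into fixed chunks; writes mark chunks dirty in a bitmap."""
    def __init__(self, n_chunks):
        self.chunks = [""] * n_chunks
        self.dirty = set()               # indexes of changed chunks

    def write(self, i, data):
        self.chunks[i] = data
        self.dirty.add(i)

def resync(source, replica):
    """Copy only the union of changed chunks; the source version wins."""
    changed = source.dirty | replica.dirty
    for i in changed:
        replica.chunks[i] = source.chunks[i]
    source.dirty.clear()
    replica.dirty.clear()
    return len(changed)                  # chunks actually transferred

src, rep = TrackedVolume(8), TrackedVolume(8)
src.write(2, "x")                        # change on the source during BC operations
rep.write(5, "y")                        # change on the replica during BC operations
print(resync(src, rep))                  # 2 -- only two chunks copied, not all 8
```

This is what makes the resynchronisation incremental: the transfer cost scales with the number of changed chunks, not the volume size.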


Array Based – Which Technology?


• Synchronous
• Is a must if zero RPO is required.
• Need sufficient bandwidth at all times.
• Rarely above 125 miles.
• Asynchronous
• Extended distance solutions with minimal RPO (order of minutes).
• No response time elongation.
• Generally requires lower bandwidth than synchronous.
• Must be designed with adequate cache/buffer capacity.
• Disk buffered
• Extended distance solution with RPO in the order of hours.
• Requires lower bandwidth than synchronous or asynchronous.
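The guidance above can be condensed into a toy decision helper; the RPO thresholds used here are rules of thumb inferred from the bullets, not hard vendor limits.

```python
def pick_replication(rpo_seconds):
    """Illustrative decision helper; the thresholds are rules of thumb
    inferred from the guidance above, not hard vendor limits."""
    if rpo_seconds == 0:
        # Zero RPO is only achievable with synchronous replication,
        # which in practice is rarely deployed above ~125 miles.
        return "synchronous"
    if rpo_seconds <= 15 * 60:           # RPO in the order of minutes
        return "asynchronous"
    return "disk buffered"               # RPO in the order of hours

print(pick_replication(0))          # synchronous
print(pick_replication(5 * 60))     # asynchronous
print(pick_replication(4 * 3600))   # disk buffered
```

In practice the choice also weighs distance, available bandwidth, and cache capacity, which a one-argument helper deliberately ignores.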


Three Site Replication


• Mitigates a key disadvantage of two-site replication: after a single-site disaster, there is a window during which there is no DR protection.

• Data replicated to two remote sites

• Implemented in two ways

• Three Site Cascade/Multi-hop

• Three Site Triangle/Multi-target


Three Site Replication – Cascade/Multi-hop


• Synchronous + Disk Buffered
[Diagram: the source site replicates synchronously to the bunker site; the bunker site replicates in disk-buffered mode to the remote site. Local replicas exist at the bunker and remote sites.]

• Synchronous + Asynchronous
[Diagram: the source site replicates synchronously to the bunker site; the bunker site replicates asynchronously to the remote site.]

Three Site Replication – Triangle/Multi-target


[Diagram: the source site replicates synchronously to the bunker site and asynchronously to the remote site; a standby "asynchronous with differential resynchronisation" link connects the bunker and remote sites.]

SAN Based Replication : Terminologies


• Control Array: Array responsible for the replication operations
• Control Device: Device on controlling array to/from which data is being replicated.
• Remote Array: Array to/from which data is being replicated
• Remote Device: Device on remote array to/from which data is being replicated.
• Operation
• Push: Data is pushed from control array to remote array.
• Pull: Data is pulled to the control array from remote array.

[Diagram: the control device on the control array connects over a SAN/WAN to the remote device on the remote array; PUSH moves data from control to remote, PULL moves data from remote to control.]
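The push and pull operations can be sketched with dictionaries standing in for the control and remote devices; the function names are illustrative, not a SAN API.

```python
# Dictionaries stand in for the control and remote devices; the
# function names are illustrative, not a SAN API.
def push(control_device, remote_device):
    """Control array copies its data out to the remote device."""
    remote_device.clear()
    remote_device.update(control_device)

def pull(control_device, remote_device):
    """Control array copies data in from the remote device."""
    control_device.clear()
    control_device.update(remote_device)

control, remote = {"lba0": "current"}, {"lba0": "stale"}
push(control, remote)
print(remote["lba0"])      # current -- remote now mirrors the control device

pull(control, {"lba0": "restored"})
print(control["lba0"])     # restored -- data pulled back onto the control array
```

Note that in both operations it is the control array that initiates and manages the transfer; the remote array is passive.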

Network Options for Remote Replication


• A dedicated or a shared network must be in place for remote replication.

• Use ESCON (Enterprise System Connectivity) or FC (Fibre Channel) for shorter distance.

• For extended distances, an optical or IP network must be used.

• Example of optical network: DWDM and SONET

• Protocol converters may be required to connect ESCON or FC adapters from the arrays to these networks.

• Native GigE adapters allow array to be connected directly to IP Networks.


Dense Wavelength Division Multiplexing (DWDM)


• DWDM is a technology that puts data from different sources together on an optical fiber with each
signal carried on its own separate light wavelength.

• Up to 32 protected and 64 unprotected separate wavelengths of data can be multiplexed into a light
stream transmitted on a single optical fiber.

[Diagram: ESCON, Fibre Channel, and Gigabit Ethernet channels are converted from electrical to optical signals, each assigned its own wavelength (λ), and multiplexed onto a single fibre.]
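The idea of carrying each protocol on its own wavelength can be illustrated with a toy multiplex/demultiplex round trip; this is a pure simulation, with no optics involved.

```python
# Pure simulation: each source protocol is assigned its own wavelength
# (lambda), so the streams share one fibre without mixing.
streams = {
    "ESCON": ["e1", "e2"],
    "Fibre Channel": ["f1"],
    "Gigabit Ethernet": ["g1", "g2", "g3"],
}

# Multiplex: tag every frame with its wavelength before it enters the fibre.
fibre = [(wavelength, frame)
         for wavelength, frames in streams.items()
         for frame in frames]

# Demultiplex at the far end by filtering on wavelength.
received = {}
for wavelength, frame in fibre:
    received.setdefault(wavelength, []).append(frame)

print(received == streams)   # True -- every channel recovered intact
```

Because the wavelength tag uniquely identifies each channel, the receiving end can separate the streams without inspecting their contents, which is why heterogeneous protocols can share one fibre.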

Synchronous Optical Network (SONET)


• SONET is a Time Division Multiplexing (TDM) technology.

• Traffic from multiple sources of different speeds is organised into a frame and sent out onto the SONET ring as an optical signal.

• Synchronous Digital Hierarchy (SDH) is similar to SONET but is the European standard.

• SONET/SDH offers the ability to service multiple locations, and provides reliability/availability, automatic protection switching, and restoration.

[Diagram: a SONET ring carrying OC-3 and OC-48 traffic, alongside an SDH ring carrying STM-1 and STM-16 traffic.]

Remote Replication Technologies


Remote Replication Technology


The main remote replication technologies are:
• Host-based remote replication
• Storage array-based remote replication
• Network-based remote replication
• Three-site remote replication.


Host-based Remote Replication


Replication is performed by host-based software
• LVM-based replication
• All writes to the source volume group are replicated to the target volume group by the
LVM.
• Can be synchronous or asynchronous.
• Log shipping
• Commonly used in a database environment.
• All relevant components of source and target databases are synchronised prior to the start
of replication.
• Transactions to source database are captured in logs and periodically transferred to
remote host.
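Log shipping can be sketched as capturing each transaction in a log at the source and periodically replaying the log on the standby; the Database class below is an illustrative stand-in, not a real DBMS interface.

```python
# Illustrative stand-ins for the source and standby databases.
class Database:
    def __init__(self):
        self.rows = {}

    def apply(self, txn):
        op, key, value = txn
        if op == "put":
            self.rows[key] = value
        elif op == "delete":
            self.rows.pop(key, None)

source, standby = Database(), Database()
log = []                         # transaction log captured at the source

def execute(txn):
    source.apply(txn)
    log.append(txn)              # every transaction is recorded in the log

def ship_log():
    """Periodic transfer: replay the captured log on the remote host."""
    while log:
        standby.apply(log.pop(0))

execute(("put", "acct", 100))
execute(("put", "acct", 80))
ship_log()
print(standby.rows == source.rows)   # True -- the standby has caught up
```

Between shipments the standby lags by whatever transactions are still in the log, which is why log shipping gives an RPO equal to the shipping interval.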

Storage Array-based Remote Replication


Replication is performed by the array operating environment. It uses three replication methods:
• Disk buffered
• Synchronous
• Writes are committed to both source and replica before it is acknowledged to host.
• Asynchronous
• Writes are committed to source and immediately acknowledged to host.
• Data is buffered at source and transmitted to remote site later.


Network-based Replication
• It provides any-point-in-time recovery capability during its normal operation.
• Continuous Data Protection appliances are present at both source and remote sites.
• Supports both synchronous and asynchronous replication modes.


Three-site Replication
• Data from source site is replicated to two remote sites.
• Replication is synchronous to one of the remote sites and asynchronous or disk buffered to the other
remote site.
• Mitigates the risk in two site replication.
• Implemented in two ways:
• Cascade/ Multi hop
• Triangle/ Multi target


Self Assessment Question


16. Which one of the given options are the main remote replication technologies?
i. Host-based remote replication
ii. Storage array-based remote replication
iii. Network-based remote replication

a. Only i and ii
b. Only ii and iii
c. Only i and iii
d. All i, ii and iii

Answer: All i, ii and iii



Self Assessment Question


17. A write is committed to the source and immediately acknowledged to the host in _____________
mode.

a. Synchronous
b. Asynchronous
c. Hybrid

Answer: Asynchronous


Self Assessment Question


18. The replication management software residing on the storage array provides an interface for
___________.

a. Smooth and error-free replication


b. Smooth and error replication
c. Hybrid Mode replication
d. Hard and error-free replication

Answer: Smooth and error-free replication


Self Assessment Question


19. _____________ replication is the process of creating replicas of information assets at remote sites
(locations).

a. Local
b. Remote
c. Local and Remote
d. Target

Answer: Remote


Self Assessment Question


20. LVM-based remote replication does not scale well, particularly in the case of applications using
__________________ databases.

a. Local
b. Remote
c. Hybrid
d. Federated

Answer: Federated


Assignment
1. Describe the uses of a local replica in various business operations.
2. Explain the RPO that can be achieved with synchronous, asynchronous, and disk-buffered remote
replication.
3. What is the difference between a restore operation and a resynchronisation operation with local
replicas? Explain with examples.
4. Describe the uses of a local replica in various business operations.
5. How can consistency be ensured when replicating a database?
6. What are the differences between full-volume mirroring and pointer-based replicas?
7. What is the key difference between full copy mode and deferred mode?
8. What are the considerations when performing restore operations for each array replication technology?

Assignment
9. Differentiate between Synchronous and Asynchronous mode.

10. Discuss one host based remote replication technology.

11. Discuss one array based remote replication technology.

12. What are differences in the bandwidth requirements between the array remote replication
technologies discussed in this module?

13. Discuss the effects of a bunker failure in a three-site replication for the following implementation:

1. Multihop—synchronous + disk buffered

2. Multihop—synchronous + asynchronous

3. Multitarget

Summary
• Remote replication is the process of creating replicas of information assets at remote sites
(locations).
• The two modes of remote replication are synchronous and asynchronous.
• Host-based remote replication uses one or more components of the host to perform and
manage the replication operation. There are two basic approaches to host-based remote
replication: LVM-based replication and database replication via log shipping.
• In storage array-based remote replication, the array operating environment and resources
perform and manage data replication. Synchronous, asynchronous, disk buffered, and three
site replication are the storage based remote replication techniques.
• SAN-based remote replication enables the replication of data between heterogeneous storage arrays.
• Network options for remote replication include ESCON, FC, DWDM, and SONET.
Additional Task
Research on EMC Remote Replication products

Summary
• Various local replication technologies were discussed. In host-based replication, logical
volume managers (LVMs) or file systems perform the local replication process.
• LVM-based replication and file system (FS) snapshot are examples of host-based local
replication.
• Full volume mirroring, Pointer-based full volume copy and Pointer-based virtual replica are
the examples of array based local replication technologies.
• In LVM-based replication, logical volume manager is responsible for creating and
controlling the host-level logical volume. File system (FS) snapshot is a pointer-based
replica that requires a fraction of the space used by the original FS.
• In storage array-based local replication, the array operating environment performs the local
replication process.
• In full-volume mirroring, the target is attached to the source and established as a mirror of
the source


Summary
• Replication is the process of creating an exact copy of data. Replication can be classified into two
major categories: local and remote.

• Consistency considerations need to be considered when replicating file systems and databases.

• Host-based and storage-based replications are the two major technologies adopted for local
replication.

• File system replication and LVM-based replication are examples of host-based local replication
technology.

• Storage array–based replication can be implemented with distinct solutions namely, full-volume
mirroring, pointer based full-volume replication, and pointer-based virtual replication.

Summary
• Pointer-based full-volume replication: like full-volume mirroring, this technology can
provide full copies of the source data on the targets.
• In pointer-based virtual replication, at the time of session activation, the target contains
pointers to the location of data on the source.
• Remote Replication is a process of creating replicas at remote sites.
• Replicas meaning an exact copy and replication is the process of creating an exact copy of
data.
• The two basic modes of remote replication are synchronous and asynchronous.
• In synchronous remote replication, writes must be committed to the source and the target,
prior to acknowledging “write complete” to the host.
• In asynchronous remote replication, a write is committed to the source and immediately
acknowledged to the host. Data is buffered at the source and transmitted to the remote site
later.


Document Links

Local replication technologies:
- https://searchdisasterrecovery.techtarget.com/tutorial/Data-replication-technologies-and-disaster-recovery-planning-tutorial
- https://nscpolteksby.ac.id/ebook/files/Ebook/Computer%20Engineering/EMC%20Information%20Storage%20and%20Management%20(2009)/19.%20Chapter%2013%20-%20Local%20Replication.pdf
Notes: These links explain local replication technologies.

Remote replication:
- https://www.computerweekly.com/feature/Remote-replication-Comparing-data-replication-methods
Notes: Learn how remote replication fits into a disaster recovery plan, as well as the various data replication methods available, such as host-based, array-based and network-based replication.
based replication


Video Links

Data Replication: https://www.youtube.com/watch?v=fUrKt-AQYtE
Notes: This link explains data replication.

Types of Data Replication: https://www.youtube.com/watch?v=atOravH_OMc
Notes: This link explains the pros and cons of different types of data replication.

Host based replication: https://www.youtube.com/watch?v=LIlT3YKNgvI
Notes: This link explains host based replication.

Remote replication: https://www.youtube.com/watch?v=UchwIIAZMyc
Notes: This link explains remote replication technology.

Host-based Virtual Server Replication: https://www.youtube.com/watch?v=4WhBh7s9-_s
Notes: This link explains how to replicate your production virtual servers between your own legacy virtual environment and Expedient's virtual infrastructure.


E-Book Links

Local Replication: https://clickforacess.files.wordpress.com/2017/11/san-book.pdf (Page No 283 to 301)

Remote Replication: https://clickforacess.files.wordpress.com/2017/11/san-book.pdf (Page No 309 to 324)
