
DATA PROTECTION AND MANAGEMENT

PARTICIPANT GUIDE
Data Protection and Management-SSP

© Copyright 2021 Dell Inc. Page i


Table of Contents

Data Protection and Management................................................................................ 2


Course Objectives................................................................................................................ 3
Data Protection and Management ........................................................................................ 4

Introduction to Data Protection ................................................................................ 5


Introduction to Data Protection ............................................................................................. 6
Introduction to Data Protection ............................................................................................. 7
Data Protection Primer ......................................................................................................... 8
Knowledge Check: Data Protection Primer ........................................................................ 14
Data Center ....................................................................................................................... 15
Knowledge Check: Data Center ......................................................................................... 23
Data Protection and Availability Solutions .......................................................................... 24
Knowledge Check: Data Protection and Availability Solutions ............................................ 34
Concepts in Practice .......................................................................................................... 35
Exercise - Introduction to Data Protection .......................................................................... 36

Data Protection Architecture .................................................................................. 38


Data Protection Architecture .............................................................................................. 39
Data Protection Architecture .............................................................................................. 40
Data Protection Architecture: Overview.............................................................................. 41
Data Source – Application and Hypervisor ......................................................................... 44
Knowledge Check: Application and Hypervisor .................................................................. 55
Data Source – Primary Storage ......................................................................................... 56
Knowledge Check: Primary Storage................................................................................... 66
Protection Application and Storage .................................................................................... 67
Knowledge Check: Protection Storage Overview ............................................................... 75
Data Security and Management ......................................................................................... 76
Knowledge Check: Data Security and Management .......................................................... 88
Concepts in Practice .......................................................................................................... 89
Exercise - Data Protection Architecture.............................................................................. 93



Fault Tolerance Techniques .................................................................................... 96
Fault Tolerance Techniques ............................................................................................... 97
Fault Tolerance Techniques ............................................................................................... 98
Fault Tolerance Overview .................................................................................................. 99
Knowledge Check: Fault Tolerance Overview .................................................................. 107
Compute and Network ..................................................................................................... 108
Knowledge Check: Compute and Network ....................................................................... 115
Storage ............................................................................................................................ 116
Knowledge Check: Storage .............................................................................................. 122
Application and Availability Zone...................................................................................... 123
Knowledge Check: Application and Availability Zone ....................................................... 128
Concepts in Practice ........................................................................................................ 129
Exercise ........................................................................................................................... 131

Data Backup ........................................................................................................... 133


Data Backup .................................................................................................................... 134
Data Backup .................................................................................................................... 135
Introduction to Backup ..................................................................................................... 136
Knowledge Check: Introduction to Backup ....................................................................... 152
Backup Topologies .......................................................................................................... 153
Knowledge Check: Backup Topologies ............................................................................ 158
Backup Methods .............................................................................................................. 159
Knowledge Check: Backup Methods ................................................................................ 166
Concepts in Practice ........................................................................................................ 167
Exercise- Data Backup..................................................................................................... 169

Data Deduplication................................................................................................. 171


Data Deduplication........................................................................................................... 172
Data Deduplication........................................................................................................... 173
Data Deduplication Overview ........................................................................................... 174
Knowledge Check: Deduplication Granularity and Methods ............................................. 181
Deduplication Granularity and Methods ........................................................................... 182
Knowledge Check: Deduplication Granularity and Methods ............................................. 189



Concepts in Practice ........................................................................................................ 190
Exercise - Data Deduplication .......................................................................................... 191

Replication.............................................................................................................. 193
Replication ....................................................................................................................... 194
Replication ....................................................................................................................... 195
Data Replication Overview ............................................................................................... 196
Knowledge Check: Data Replication Overview ................................................................ 200
Local Replication.............................................................................................................. 201
Knowledge Check: Local Replication ............................................................................... 210
Remote Replication.......................................................................................................... 211
Knowledge Check: Remote Replication ........................................................................... 216
Concepts in Practice ........................................................................................................ 217
Exercise- Replication ....................................................................................................... 219

Data Archiving ........................................................................................................ 221


Data Archiving ................................................................................................................. 222
Data Archiving ................................................................................................................. 223
Data Archiving Overview .................................................................................................. 224
Knowledge Check: Data Archiving Overview ................................................................... 229
Archiving Operation and Storage ..................................................................................... 230
Knowledge Check: Archiving Operation and Storage ....................................................... 238
Concepts in Practice ........................................................................................................ 239
Exercise: Data Archiving .................................................................................................. 240

Data Migration ........................................................................................................ 243


Data Migration ................................................................................................................. 244
Data Migration ................................................................................................................. 245
Data Migration ................................................................................................................. 246
Knowledge Check: Data Migration ................................................................................... 258
Concepts in Practice ........................................................................................................ 259
Exercise - Data Migration ................................................................................................. 261

Data Protection in Software-Defined Data Center ............................................... 263



Data Protection in Software-Defined Data Center ............................................................ 264
Data Protection in Software-Defined Data Center ............................................................ 265
Software-Defined Data Center Overview ......................................................................... 266
Knowledge Check: Software-Defined Data Center Overview ........................................... 270
Software-Defined Compute, Storage, and Networking ..................................................... 271
Knowledge Check: Software-Defined Compute, Storage, and Networking ....................... 281
Data Protection Process in SDDC .................................................................................... 282
Knowledge Check: Data Protection Process in SDDC ..................................................... 287
Concepts in Practice ........................................................................................................ 288
Exercise: Data Protection in SDDC .................................................................................. 291

Cloud-Based Data Protection ............................................................................... 293


Cloud-based Data Protection ........................................................................................... 294
Cloud-Based Data Protection ........................................................................................... 295
Cloud Computing Overview ............................................................................................. 296
Knowledge Check: Cloud Computing Overview ............................................................... 310
Cloud-Based Data Protection ........................................................................................... 311
Knowledge Check: Cloud-Based Data Protection ............................................................ 323
Cloud-Based Data Archiving ............................................................................................ 324
Knowledge Check: Cloud-Based Data Archiving .............................................................. 330
Concepts in Practice ........................................................................................................ 331
Exercise: Cloud-based Data Protection............................................................................ 333

Protecting Big Data and Mobile Device Data ....................................................... 335


Protecting Big Data and Mobile Device Data.................................................................... 336
Protecting Big Data and Mobile Device Data.................................................................... 337
Big Data Overview ........................................................................................................... 338
Knowledge Check: Big Data Overview ............................................................................. 346
Protecting Big Data .......................................................................................................... 347
Knowledge Check: Protecting Big Data............................................................................ 353
Protecting Mobile Devices................................................................................................ 354
Knowledge Check: Protecting Mobile Devices ................................................................. 360
Concepts in Practice ........................................................................................................ 361



Exercise: Data Protection in Big Data and Mobile Device Environment............................ 362

Securing the Data Protection Environment ......................................................... 364


Securing the Data Protection Environment....................................................................... 365
Securing the Data Protection Environment....................................................................... 366
Overview of Data Security................................................................................................ 367
Knowledge Check: Overview of Data Security ................................................................. 376
Security Threats in Data Protection Environment ............................................................. 377
Knowledge Check: Security Threats in Data Protection Environment............................... 382
Security Controls in a Data Protection Environment – 1 ................................................... 383
Knowledge Check: Security Controls in a Data Protection Environment – 1..................... 396
Security Controls in a Data Protection Environment – 2 ................................................... 397
Knowledge Check: Security Controls in a Data Protection Environment – 2..................... 403
Cyber Recovery ............................................................................................................... 404
Knowledge Check: Cyber Recovery ................................................................................. 409
Concepts in Practice ........................................................................................................ 410
Exercise: Securing the Data Protection Environment ....................................................... 413

Managing the Data Protection Environment ........................................................ 415


Managing the Data Protection Environment ..................................................................... 416
Managing the Data Protection Environment ..................................................................... 417
Introduction to Data Protection Management ................................................................... 418
Knowledge Check: Introduction to Data Protection Management ..................................... 428
Operations Management – 1 ............................................................................................ 429
Knowledge Check: Operations Management – 1 ............................................................. 437
Operations Management - 2 ............................................................................................ 438
Knowledge Check: Operations Management - 2 .............................................................. 446
Concepts in Practice ........................................................................................................ 447
Exercise - Managing the Data Protection Environment .................................................... 450

Summary................................................................................................................. 452
Summary ......................................................................................................................... 453
You Have Completed This eLearning............................................................................... 454



Data Protection and Management – Associate ................................................................ 455

Appendix ............................................................................................... 457

Glossary ................................................................................................ 601



Data Protection and Management


Course Objectives

The main objectives of the course are to:


→ Explain data protection architecture and its building blocks.
→ Evaluate fault-tolerance techniques in a data center.
→ Describe data backup methods and data deduplication.
→ Describe data replication, data archiving and data migration methods.
→ Describe the data protection process in a software-defined data center.
→ Articulate cloud-based data protection techniques.
→ Describe various solutions for protecting Big Data, cloud, and mobile device data.
→ Describe security controls and management processes in a data protection environment.


Introduction to Data Protection

The main objectives of the topic are to:


→ Explain the need for data protection and data availability.
→ Define data center and its core elements.
→ Explain data protection and availability solutions.
→ List the key data protection management activities.
→ Use the availability formula to calculate system availability.


Data Protection Primer


Objectives

The objectives of the topic are to:


• List the reasons for data protection and its management.
• Explain the correlation between data protection and data availability.
• Use the availability formula for the measurement of data availability.
• List the causes and impacts of data unavailability.

Data Protection Overview

Data protection is one of the least glamorous yet most important aspects of any organization. An organization's sensitive data must be safeguarded so that miscreants cannot use it to demand a ransom, encrypt it to make it unavailable to the organization, publicly release an organization's client data, or commit other crimes. Protecting an organization's data is therefore of the utmost importance.

Organizations use various techniques to protect their important data, some of which are as follows:

• Archive older but important files
• Create backups often
• Use security mechanisms
• Test data recovery
• Keep a copy of data at a remote site


Need for Data Protection and Management

Key reasons for data protection

An organization’s data is its most valuable asset.

Sensitive data, if lost, may lead to significant financial, legal, and business loss
apart from serious damage to the organization’s reputation.


Correlating Data Protection and Availability

The correlation between data protection and availability is shown below:


Data Protection:

• Process of safeguarding data from corruption and loss
• Involves technologies/solutions that can prevent data loss and recover data
• Helps in improving data availability

Data Availability:

• Ability of an IT infrastructure component/service to function as required during its operating time
• Involves technologies, strategy, procedure, and IT resource readiness appropriate for the application/service
• Drives the choice of data protection technologies/solutions


Measurement of Data Availability

Data availability is measured as the percentage of uptime in a given year.
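As a rough illustration of what an uptime percentage means in practice, the downtime budget it permits over a year can be computed as follows (the availability figures below are hypothetical examples):

```python
# Downtime per year implied by an availability percentage.

HOURS_PER_YEAR = 365 * 24  # 8760 hours in a non-leap year

def downtime_hours_per_year(availability_pct: float) -> float:
    """Maximum downtime (in hours per year) at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

for pct in (99.0, 99.9, 99.999):
    print(f"{pct}% uptime allows {downtime_hours_per_year(pct):.2f} hours of downtime per year")
```

For example, 99.9% availability allows roughly 8.76 hours of downtime in a year, while 99.999% ("five nines") allows only about 5 minutes.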


Measurement of Data Availability - MTBF and MTTR

Data availability is also measured as a function of the reliability of components or services: as reliability increases, so does availability. It is calculated as the mean time between failures (MTBF) divided by the sum of the MTBF and the mean time to repair (MTTR):

Availability = MTBF / (MTBF + MTTR)

Data Protection and Management-SSP

© Copyright 2021 Dell Inc. Page 10


Introduction to Data Protection

MTBF is the average time available for a component or a service to perform its
normal operations between failures. It is calculated as the total uptime divided by
the number of failures.

MTTR is the average time required to repair a failed component or service.
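The MTBF and availability calculations described above can be sketched in a few lines of Python; the figures used here are illustrative:

```python
def mtbf(total_uptime_hours: float, num_failures: int) -> float:
    """Mean time between failures: total uptime divided by the number of failures."""
    return total_uptime_hours / num_failures

def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability as a fraction: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Illustrative figures: MTBF of 8000 hours, MTTR of 12 hours
a = availability(8000, 12)
print(f"Availability = {a * 100:.2f}%")  # ~99.85%
```

With an MTBF of 8000 hours and an MTTR of 12 hours, availability is 8000 / 8012, or about 99.85%.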


Causes of Data Unavailability

The following list shows the various causes of data unavailability:

• Hardware Failure
• Software Failure
• Data Loss
• Disaster
• Loss of Power
• IT Infrastructure Refresh
• Ransomware


Impacts of Data Unavailability

The various impacts of data unavailability are described below.


1: The loss of productivity can be measured in terms of the salaries, wages, and benefits of employees made idle by an outage. It can be calculated as: number of employees impacted x hours of outage x hourly rate.

2:

• Revenue recognition
• Cash flow
• Lost discounts
• Payment guarantees
• Credit rating
• Stock price

3: Loss of revenue includes:

• Direct losses
• Compensatory payments
• Future revenue losses
• Investment losses

4: The damage to reputation may result in a loss of confidence or credibility with customers, suppliers, financial markets, banks, and business partners.


5: The other possible consequences of outage include the cost of additional rented
equipment, overtime, and extra shipping.
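The productivity-loss formula in item 1 can be expressed directly; the employee count, outage duration, and hourly rate below are hypothetical:

```python
def productivity_loss(employees_impacted: int, outage_hours: float, hourly_rate: float) -> float:
    """Loss of productivity = employees impacted x hours of outage x hourly rate."""
    return employees_impacted * outage_hours * hourly_rate

# Hypothetical outage: 50 employees idle for 4 hours at an hourly rate of $40
loss = productivity_loss(50, 4, 40.0)
print(f"Estimated productivity loss: ${loss:,.2f}")  # $8,000.00
```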

Data Protection in a Data Center

A data center provides centralized data-processing capability. It is used to provide worldwide access to business applications and IT services over a network, commonly the Internet.

A data center usually stores large amounts of data and provides services to a vast
number of users. Therefore, data protection in a data center is vital for carrying out
business operations.

[Figure: Data copy paths between Data Center A (Europe), Data Center B (North America), and the cloud, including server-to-server, storage-to-storage, inter-data center, and data center-to-cloud copies, with management and cloud connectivity.]



Knowledge Check: Data Protection Primer

Check Your Knowledge

1. What is the availability of a computer with MTBF = 8000 hours and MTTR = 12
hours?
a. 99.5%
b. 98.9%
c. 90%
d. 99.8%


Data Center


Objectives

The objectives of the topic are to:

• Define a data center and its components.
• Explain compute, storage, and connectivity elements of a data center.
• List the characteristics of a data center.

Introduction to Data Center

A data center is a dedicated facility where an organization houses, operates, and maintains its IT infrastructure along with other supporting infrastructure. It centralizes an organization's IT equipment and data-processing operations. A data center may be constructed in-house and located in an organization's own facility, or it may be outsourced, with equipment located at a third-party site.

A data center typically consists of the facility, IT equipment, and support infrastructure.

Facility

It is the building and floor space where the data center is constructed. It typically
has a raised floor with ducts underneath holding power and network cables.

IT equipment

It includes components such as compute systems, storage, and connectivity elements, along with cabinets for housing the IT equipment.

Support infrastructure

It includes power supply, fire and humidity detection systems; heating, ventilation
and air conditioning (HVAC) systems; and security systems such as biometrics,
badge readers, and video surveillance systems.

Data Protection and Management-SSP

© Copyright 2021 Dell Inc. Page 16


Introduction to Data Protection

Data Center IT Equipment – Compute System

A compute system is a computing device (a combination of hardware, firmware, and system software) that runs business applications. Examples of compute systems include application servers, desktops, laptops, and mobile devices. A compute system's hardware consists of a central processing unit (CPU), memory, internal storage, and input/output (I/O) devices. The logical components of a compute system include the operating system (OS), file system, device drivers, and logical volume manager (LVM).

Types of Compute System



1: It is built in an upright standalone enclosure called a “tower”, which looks similar to a desktop cabinet. Tower compute systems typically have individual monitors, keyboards, and mice. They occupy significant floor space and require complex cabling when deployed in a data center.

2: It is a compute system designed to be fixed inside a frame called a “rack”. It is also known as a rack server. A rack is a standardized enclosure containing multiple mounting slots called “bays”, each of which holds a server in place with the help of screws.

3: It is an electronic circuit board containing only core processing components, such as CPU(s), memory, integrated network controllers, storage drive, and essential I/O cards and ports. It is also known as a blade server. It is housed in a slot inside a blade enclosure (or chassis), which holds multiple blades and provides integrated power supply, cooling, networking, and management functions.

Data Center IT Equipment – Storage


Storage devices are assembled within a
storage system/array .

Storage devices

Storage
System
Storage systems are designed for high
capacity, scalability, performance, reliability,
and security .

Storage devices (or simply “storage”) are devices consisting of non-volatile


recording media on which digital data can be persistently stored.


Storage may be internal (for example, internal hard disk drives, SSDs), removable
(for example, memory cards), or external (for example, magnetic tape drive) to a
compute system.


Data Center IT Equipment – Connectivity Elements

Connectivity elements create communication paths between compute systems and storage for data exchange and resource sharing.

Examples of connectivity elements are as follows:

• Open Systems Interconnection (OSI) layer-2 network switches
• OSI layer-3 switches or routers
• Cables
• Network adapters such as an NIC



Data Center in a Box – Converged Infrastructure

[Figure: Converged infrastructure stack layers: management, virtualization, compute, storage, and network.]

The IT components that make up a data center can be packaged into a single, standalone computing box called a converged infrastructure. The package is a self-contained unit that can be deployed independently or aggregated with other packages to meet additional capacity and performance requirements.

Components of a converged infrastructure may include compute systems, data storage devices, networking equipment, and software for IT infrastructure management, data protection, and automation.


Data Center in a Box – Hyper-converged Infrastructure

[Figure: Hyperconverged infrastructure benefits: set up new systems 4.5X faster; scale in 5 minutes; lower TCO by 30%; deploy a fully virtualized environment in just 20 minutes.]

Hyperconverged Infrastructure (HCI) combines the data center components of
compute, storage, virtualization, and storage networking into a distributed
infrastructure platform, managed by software. The intelligent HCI software can
create flexible building blocks, replacing legacy infrastructure such as
separate servers, storage networks, and storage arrays. Such an infrastructure
allows organizations to plan and size their workloads accurately and enables
flexible and easy scaling.

Unlike Converged Infrastructure (CI), which relies on hardware and uses physical
building blocks, HCI is software-defined. Moreover, HCI is more flexible and
scalable than CI.

Characteristics of a Data Center

A data center should have the following key characteristics:

Click each characteristic in the given image for more details.


1: Continuous availability: A data center should ensure 24x7x365 availability
of data to provide anytime, anywhere data access.

2: Software-defined: A software-defined data center supports software-centric
control of data center resources. Controller software that is decoupled from
the hardware sends instructions to the hardware components to perform the
required operations.

3: IT-as-a-service: A data center should adopt the paradigm of delivering IT
resources as a service. This enables the IT department of an organization to
become a utility to the business and deliver IT resources as services for
convenient consumption by business units. IT services are maintained in a
service catalog, which enables users to provision resources in a self-service
manner.

4: Multi-layered security: Multiple layers of security help in mitigating the
risk of security threats in case one layer is compromised. An attacker must
breach each layer to be successful. This, in turn, provides additional time to
detect and respond to an attack.

5: Virtualization: It is the process of abstracting physical resources, such
as compute, storage, and network, and creating virtual resources from them. A
virtualized data center provides the flexibility to create and reclaim virtual
resources dynamically.

6: On-demand scalability: The data center IT infrastructure should be designed
for scalable computing. This enables the IT resources to scale up, down, in,
and out quickly as the demand for resources grows and shrinks.

Knowledge Check: Data Center

Knowledge Check Question

2. 'Hyperconverged infrastructure (HCI) combines the datacenter components of
compute, storage, virtualization, and storage networking into a distributed
infrastructure platform, managed by hardware.' State whether this statement is
true or false.
a. True
b. False

Data Protection and Availability Solutions

Data Protection and Availability Solutions

Objectives

The objectives of the topic are to:

• Explain data protection and availability solutions and their benefits.
• Demonstrate the evolution of data protection solutions.
• Define the data protection terminologies.
• List data protection management activities.

Introduction to Data Protection and Availability Solutions

Data protection and availability solutions assure that the data is safe and
accessible to the intended users at a required level of performance. Different
solutions may be used in the same data center environment.

Data protection and availability solutions are as follows:

Data Protection Terminologies - Disaster Recovery

A disaster may impact the ability of a data center to remain up and provide
services to users. This, in turn, may cause data unavailability. Disaster
recovery (DR) mitigates the risk of data unavailability due to a disaster. It
involves a set of policies and procedures for restoring IT infrastructure,
including data, that is required to support the ongoing IT services after a
disaster occurs.

For more information, click here.

Data Protection Terminologies - RPO and RTO

When designing a data availability strategy for an application or a service,
organizations must consider two important parameters that are closely
associated with recovery.

Recovery Point Objective (RPO)

This is the point-in-time to which data must be recovered after an outage. It defines
the amount of data loss that a business can endure. Based on the RPO,
organizations plan for the frequency with which a backup or replica must be made.
For example, if the RPO of a particular business application is 24 hours, then
backups are created every midnight. The corresponding recovery strategy is to
restore data from the set of last backups. An organization can plan for an
appropriate data protection solution on the basis of the RPO it sets.

Recovery Time Objective (RTO)

This is the time within which systems and applications must be recovered after an
outage. It defines the amount of downtime that a business can endure and survive.
Based on the RTO, an organization can decide which data protection technology is
best suited. The more critical the application, the lower the RTO should be.

Both RPO and RTO are counted in minutes, hours, or days and are directly related
to the criticality of the IT service and data. Usually, the lower the RTO and RPO,
the higher is the cost of a data protection solution or technology.
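To make the trade-off concrete, the worst-case data loss implied by a backup schedule can be expressed in a few lines of code. This is an illustrative sketch only; the helper names and values are not from the course material:

```python
def max_data_loss_hours(backup_interval_hours: float) -> float:
    """Worst case: an outage occurs just before the next scheduled backup,
    so all data written since the last backup is lost."""
    return backup_interval_hours

def meets_rpo(backup_interval_hours: float, rpo_hours: float) -> bool:
    """A backup schedule satisfies an RPO if the worst-case data loss
    does not exceed the RPO."""
    return max_data_loss_hours(backup_interval_hours) <= rpo_hours

# Example from the text: an RPO of 24 hours, with backups every midnight.
print(meets_rpo(24, 24))   # nightly backups meet a 24-hour RPO
print(meets_rpo(24, 4))    # nightly backups cannot meet a 4-hour RPO
```

The same reasoning is what lets an organization pick a backup frequency from its RPO: the interval between protection copies must not exceed the tolerable data loss.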

(Image: timeline around a disaster event. Recovery Point Objective (RPO): the
point-in-time to which data must be recovered; the amount of data loss that a
business can endure. Recovery Time Objective (RTO): the time within which
systems and applications must be recovered; the amount of downtime that a
business can endure.)

Fault-tolerant IT Infrastructure

(Image: fault-tolerant infrastructure with redundant links, ports, and
switches connecting a compute cluster to a storage system, eliminating single
points of failure.)

A fault-tolerant IT infrastructure is designed based on the concept of fault
tolerance. Such an infrastructure:

• Continues providing services in case some of its components fail.
• Improves the availability of data and services.

For more information, click here.
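Redundancy can be quantified with a standard reliability-engineering formula (not specific to this course): N independent components in parallel are unavailable only when all of them fail at once. A short sketch, with assumed availability values:

```python
def parallel_availability(component_availability: float, n: int) -> float:
    """Availability of n redundant components in parallel: the system is
    down only when every component is down simultaneously (assumes
    independent failures)."""
    return 1 - (1 - component_availability) ** n

# A single 99% link vs. a redundant pair of 99% links.
print(round(parallel_availability(0.99, 1), 2))   # 0.99
print(round(parallel_availability(0.99, 2), 4))   # 0.9999
```

This is why duplicating links, ports, and switches raises overall availability so sharply: each added redundant path multiplies down the probability that the whole path is unavailable.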

Data Backup

Data backup is the process of making a copy of primary data for the purpose of
restoring the original data in the event of data loss or corruption.

The backup data should not be kept in the same storage device where the original
data is stored. Otherwise, both the original data and the backup data will be lost if
physical damage occurs to the storage device. Often, data backups are performed
both within and between sites or data centers. The local backup within a site
enables easy access to the backup data and quick recovery. The backup data at
the remote site (cloud) provides protection against a disaster, which could damage
or destroy the local backup data.
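The local-plus-second-site pattern described above can be sketched in a few lines. The file names and directories below are hypothetical stand-ins for real backup targets, not a real backup product:

```python
import shutil
import tempfile
from pathlib import Path

def backup_file(primary: Path, local_backup_dir: Path, remote_backup_dir: Path) -> None:
    """Copy primary data to a local backup (quick recovery) and to a
    second, 'remote' location (protection if the local copy is lost).
    Both directories stand in for real backup targets."""
    for target in (local_backup_dir, remote_backup_dir):
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy2(primary, target / primary.name)

# Usage sketch with temporary directories:
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    data = tmp / "orders.db"
    data.write_text("primary data")
    backup_file(data, tmp / "local_backup", tmp / "remote_backup")
    print((tmp / "remote_backup" / "orders.db").read_text())  # primary data
```

The essential point the code mirrors is that the copies live on different targets than the original, so losing the primary device does not lose the backup.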

Data Replication

Data replication is the process of creating an exact copy (replica) of the data so
that the data copy may be used to restore the original data in the event of a data
loss or corruption, or to restart business operations in case the primary storage is
not operational.

A replica can also be used to perform other business operations such as backup,
reporting, and testing. Data replication is similar to data backup, but it provides
higher availability because the replica can be made operational immediately after
the primary storage failure. Replication can be performed both within and across
data centers or clouds.
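The key difference from backup, that the replica is kept in sync and can serve requests immediately after a primary failure, can be illustrated with an in-memory sketch. The class and keys below are invented for illustration only:

```python
class ReplicatedStore:
    """Write-through replication sketch: every write lands on the primary
    and the replica, so the replica can serve reads immediately if the
    primary fails."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.primary_ok = True

    def write(self, key, value):
        self.primary[key] = value
        self.replica[key] = value   # replica kept in sync on every write

    def read(self, key):
        store = self.primary if self.primary_ok else self.replica
        return store[key]

store = ReplicatedStore()
store.write("invoice-42", "paid")
store.primary_ok = False             # simulate primary storage failure
print(store.read("invoice-42"))      # 'paid', served from the replica
```

Because the replica is already current, no restore step is needed before operations can continue, which is why replication offers higher availability than backup.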

Data Archiving

Data archiving is the process of identifying and moving inactive data from primary
storage systems to lower cost storage systems, called data archives, for long term
retention. A data archive stores older but important data that is less likely to be
accessed frequently.

Data archiving provides the following advantages:

• Assures data availability on a long-term basis.
• Meets data retention requirements.
• Reduces primary storage consumption and related costs.
• Reduces the amount of data that must be backed up.
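The identify-and-move step can be sketched as follows. The age threshold and directory layout are illustrative assumptions, not the behavior of any archiving product:

```python
import shutil
import time
from pathlib import Path

def archive_inactive(primary_dir: Path, archive_dir: Path, max_idle_days: float) -> list[str]:
    """Move files not modified within max_idle_days from primary storage
    to the (lower cost) archive tier. Returns the names of archived files."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    cutoff = time.time() - max_idle_days * 86400
    moved = []
    for f in primary_dir.iterdir():
        if f.is_file() and f.stat().st_mtime < cutoff:
            shutil.move(str(f), str(archive_dir / f.name))
            moved.append(f.name)
    return moved
```

A real archiving solution applies richer policies (access frequency, retention class, legal hold), but the core operation is the same: inactive data leaves primary storage, shrinking both its footprint and the backup workload.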

Data Migration

Data migration is the process of moving data between storage systems or compute
systems. A change in data format due to a system upgrade is also considered a
data migration. Data migration has several use cases.

For example, before a scheduled system maintenance, data is transferred to
another system to ensure continuous data availability. In another case, when a
technology or system upgrade occurs, the existing data must be moved to a new
system before retiring the old system to avoid downtime. Another example is
moving data from one cloud service provider to another.

Data Security

(Image: security countermeasures defending data against security threats.)

Data security refers to the countermeasures that are used to protect data against
unauthorized access, deletion, modification, or disruption. It provides protection
against security threats that can potentially destroy or corrupt data and cause data
and service unavailability.

Security countermeasures include the implementation of tools, processes, and
policies that can prevent security attacks on infrastructure components and
services.

There are solutions like Dell EMC™ Cyber Recovery, which offer protection to
organizations against ransomware and other devastating attacks. With such a
solution in place, the organization is equipped with immutable clean backups, kept
safely in their vault, even in the case of production or backup data infiltration. This
way the organization can protect itself from huge data and revenue losses and
minimize downtime because of data unavailability.

Data Protection as a Service

• I want to back up my files, so that I can retrieve them from anywhere,
anytime.
• My organization needs a remote data protection service to eliminate the risk
of downtime due to a disaster.
• My organization wants to outsource non-critical applications to free up
resources for high value projects.
• My organization needs a secured online archive for long-term data retention.

Many organizations use a range of data protection as service offerings from cloud
providers. These offerings serve as a means of outsourcing non-strategic activities
as well as improving data protection and availability levels for certain workloads. As
the data protection infrastructure is maintained by the cloud provider, the expenses,
tasks, and time associated with data protection management is reduced. The
reduction of management tasks can drive new business initiatives, discovery of
new markets, and innovation.

Disaster Recovery as a Service is offered by cloud service providers to
safeguard a client's data and IT infrastructure in the cloud environment in
case of a disaster. This service provides DR orchestration to restore the
functionality of the client's IT infrastructure after the disaster, using a
SaaS solution.

Data Protection Management Activities

Click the highlighted boxes in the given image for more information about the data
protection management activities.


1: Monitoring: It helps in gathering information on various resources and
checking the status of data protection operations in a data center. Monitoring
involves tracking configuration errors that may fail a recovery, violations of
data protection policies, availability of components, backup operations that
exceed the backup window, missed SLAs, and resource utilization.

2: Reporting: It involves collating and presenting the monitored parameters
such as resource performance, capacity, and configuration. Reporting enables
data center managers to analyze and improve the data protection operations,
avoid failures, reduce missed SLAs, and plan for resource procurement. It also
helps in establishing business justifications and chargeback of costs
associated with data protection operations.

3: Capacity planning: It involves estimating the amount of IT resources
required to support backup, replication, and archiving operations and meet the
changing capacity requirements. It also involves analyzing the capacity
consumption trends and forecasting future capacity requirements.

4: Troubleshooting: It resolves backup, replication, and archiving-related
issues in the data center so that data protection services can maintain their
operational state. It involves identifying and correcting the reason for an
issue.

5: Resource optimization: It involves improving the overall utilization and
performance of IT resources. It leverages the data collected during monitoring
to get visibility of the under-utilized and over-utilized resources,
underperforming components, and deviations from committed service levels. This
helps in improving performance, reducing spending, preventing downtime, and
meeting service level targets.

Knowledge Check: Data Protection and Availability Solutions

Knowledge Check Question

3. Match the given concepts with their correct definitions.

A. Data replication: The process of creating an exact copy of the data so that
the data copy may be used to restore the original data in the event of a data
loss or corruption.

B. Data migration: The process of moving data between storage systems or
compute systems.

C. Data security: The countermeasures that are used to protect data against
unauthorized access, deletion, modification, or disruption.

D. Data archiving: The process of identifying and moving inactive data from
primary storage systems to lower cost storage systems for long term retention.

Concepts in Practice

Concepts in Practice

Click the right and left arrows to view all the concepts in practice.

Dell EMC VxBlock

The VxBlock System 1000 is a converged infrastructure solution. It is an
integrated solution that combines compute, virtualization, network, storage,
management, and data protection components into a single package. It supports
traditional and modern workloads, including data analytics, big data,
mission-critical enterprise applications, and more. It is flexible and
adaptable and can be used in mid-sized, enterprise, or service provider data
centers.

VxBlock combines industry-leading technologies that include Dell EMC storage and
data protection options, Cisco UCS blade and rack servers, Cisco LAN and SAN
networking, and VMware virtualization and cloud management into one fully
integrated system. It leverages its deep VMware integration to simplify automation
of everything from daily infrastructure provisioning tasks to delivery of IaaS and
SaaS.

Dell EMC VxRail

VxRail is a fully integrated, preconfigured, and tested Hyperconverged
Infrastructure (HCI) system optimized for VMware vSAN and is the standard for
transforming VMware environments. VxRail provides a simple, cost effective
hyperconverged solution that solves a wide range of your operational and
environmental challenges and supports almost any use case, including tier-one
applications, cloud native, and mixed workloads. VxRail HCI System Software, a
suite of integrated software elements that sits between infrastructure
components such as vSAN and VMware Cloud Foundation, delivers a seamless and
automated operational experience.

Exercise - Introduction to Data Protection

Exercise - Introduction to Data Protection


Click the 'Scenario' and 'Deliverables' sub-headings for information on the exercise.

1. Present Scenario:

The exercise scenario is as follows:

• A storage system is used to provide a data archiving service.

• The scheduled operating time of the service = 24×365 hours.

• MTBF of the storage system = 10000 hours.

• MTTR of the storage system = 12 hours

• Last year the storage system failed twice.

• Storage system failures resulted in a service downtime of three days.

2. Expected Deliverables:

The following are your deliverables for this exercise:

• What is the expected availability of the storage system?

• What are the expected annual uptime and downtime of the storage system?

• What is the achieved availability of the data archiving service in the last
year?

Solution

Availability is calculated as: MTBF/(MTBF+MTTR)×100

Here, expected availability of the storage system

= 10000 / (10000 + 12) × 100

= 0.9988 × 100

= 99.88 %

Scheduled operating time of the service = 24 × 365 hours = 8760 hours

Expected annual uptime of the storage system = 8760 hours per year × (0.9988) ≈
8749.5 hours

Expected annual downtime of the storage system = 8760 hours per year × (1 −
0.9988) ≈ 10.5 hours

Achieved availability of the service in the last year
= (Operating Time - Downtime) / (Operating Time) × 100
= (8760 - (24 × 3)) / 8760 × 100

= 0.9918 × 100

= 99.18 %
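The same arithmetic can be checked with a short script, using the values taken directly from the exercise:

```python
MTBF = 10_000                 # mean time between failures, hours
MTTR = 12                     # mean time to repair, hours
OPERATING_HOURS = 24 * 365    # 8760 hours per year
DOWNTIME_LAST_YEAR = 24 * 3   # three days of downtime, in hours

expected_availability = MTBF / (MTBF + MTTR) * 100
expected_uptime = OPERATING_HOURS * expected_availability / 100
expected_downtime = OPERATING_HOURS - expected_uptime
achieved_availability = (OPERATING_HOURS - DOWNTIME_LAST_YEAR) / OPERATING_HOURS * 100

print(round(expected_availability, 2))   # 99.88
print(round(expected_uptime, 1))         # 8749.5
print(round(expected_downtime, 1))       # 10.5
print(round(achieved_availability, 2))   # 99.18
```

The achieved availability (99.18%) falling short of the expected availability (99.88%) matches the scenario: the actual downtime of 72 hours far exceeded the roughly 10.5 hours the MTBF/MTTR figures would predict.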

Data Protection Architecture


The main objectives of the topic are to:


→ Describe the building blocks of a data protection architecture.
→ Describe data source components – application, hypervisor, and primary
storage.
→ Describe data protection applications and storage.
→ Explain data security and management concepts.
→ Apply the knowledge of protection architecture to address the organization’s
challenges and requirements.

Data Protection Architecture: Overview

Data Protection Architecture: Overview

Why Data Protection Architecture?

Data protection without an intentional architecture results in an accidental
architecture.

An intentional architecture is required because ad hoc approaches lead to:

• Ad hoc and arbitrarily implemented solutions.
• Unclear ownership of processes and resources.
• Multiple unconnected tools and no central visibility.
• Complexity in scaling resources.
• Difficulty in meeting SLAs.
• Challenges in ensuring and reporting on compliance and governance
requirements.
• Expenditure that increases manifold with data growth.

For more information click here.

Data Protection Architecture

Data Protection Architecture has three core components:

• Data source
• Protection application and storage
• Data security and management

Click on each highlighted box label for detailed information about the components.


1: It is the source of the data that must be protected. The data source can be a
business application, a hypervisor, or primary storage.

2: Data security involves implementing various kinds of safeguards or controls
in order to lessen the risk of a vulnerability in the data source and
protection components. Governance, Risk, and Compliance (GRC) helps in
planning and implementing security controls. GRC ensures that all operations
are performed in accordance with an organization's internal policies and
external regulations.

3: Data protection management provides visibility and control of the
components and protection operations. Visibility is provided through the
discovery of data source and protection elements. Control is enabled by means
of operations management and orchestration. It also ensures that the data
protection services meet the SLAs.

4: Both the protection application and the protection storage interact with the data
sources. During interaction, they can identify the data that needs protection.

• Protection Application: A packaged application running on a compute system
or a data protection feature embedded in any IT equipment.
• Protection Storage: Built for data protection. The protection storage may be
deployed by an organization in its data center within its own premises or may
exist in the cloud.

5: The data security and management component interacts with other components
of the data protection architecture to exchange data, command, and status.

Note: The data protection architecture is based on the concept of a
fault-tolerant data center infrastructure that assures continuous availability
of data and services.

Data Source – Application and Hypervisor

Data Source – Application and Hypervisor

Objectives

The objectives of the topic are to:


→ Understand business applications.
→ Define Application Programming Interface.
→ Understand how hypervisor works.
→ Describe virtual machines.
→ Understand the concept of a virtual appliance.
→ Define the concept of containers and understand the key difference between
containers and VMs.

Data Source – Business Application

A business application is software or a tool that is used by business users to
perform various business operations.1 Business applications also:

1Helps in increasing the productivity of a business. A business application is
specific to the operation(s) it is designed for. It can be a proprietary,
commercial off-the-shelf (COTS), or customized third-party product.

(Image: business users (clients) interact with a business application through
its GUI or CLI, while protection, management, and other business applications
interact with it through its API.)

• Helps in increasing productivity.
• Provides user interfaces – CLI, GUI.
• Provides API for application-to-application interaction.

For more details, click here.

Application Programming Interface

(Image: a backup application makes a call, for instance
API_Routine_SendCopy(), to start a backup. The database application receives
the list of data sets and sends copies of the data sets in sequential order to
the backup storage.)

• A set of programmatic instructions and specifications that provides an
interface for software components to communicate with each other.
• Specifies a set of routines or functions that can be called from software,
allowing it to interact with the software providing the API.
• Enables communication with an application without understanding its
underlying architecture.
− APIs may be precompiled code that is leveraged in programming languages,
and can also be web-based.
• The image shows an API routine (for instance, API_Routine_SendCopy()) that
is called by a backup application. The backup application uses the API routine
of the database application to pass a list of data sets to be backed up. The
database application then sends copies of the data sets in sequential order to
the backup storage.

Many modern applications leverage REST APIs2 to allow orchestration and
interaction between applications outside of the GUI.
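The interaction in the figure can be simulated in a few lines. The class and routine below are invented stand-ins mirroring the figure, not a real database API:

```python
class DatabaseApplication:
    """Simulated database application exposing an API routine that a
    backup application can call (modeled on API_Routine_SendCopy())."""

    def __init__(self, data_sets: dict[str, bytes]):
        self.data_sets = data_sets

    def api_routine_send_copy(self, data_set_list: list[str], backup_storage: list):
        # Send a copy of each requested data set, in sequential order,
        # to the backup storage.
        for name in data_set_list:
            backup_storage.append((name, self.data_sets[name]))

# The backup application only needs the routine's signature; it does not
# need to understand the database's internal architecture.
db = DatabaseApplication({"sales": b"...", "orders": b"..."})
backup_storage = []
db.api_routine_send_copy(["sales", "orders"], backup_storage)
print([name for name, _ in backup_storage])  # ['sales', 'orders']
```

A REST API expresses the same contract over HTTP: the caller knows only the endpoints and request shapes, never the provider's internals.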

Data Source – Hypervisor

A hypervisor3 is software that allows multiple operating systems4 (OSs) to
share and run concurrently on a single compute system.

2A REST API is an application program interface (API) that uses HTTP requests to
Get, Put, Post, and Delete data. An API for a website is code that allows two
software programs to communicate with each other.

3The hypervisor provides a compute virtualization layer that abstracts the physical
hardware of a compute system from the OS and enables the creation of multiple
VMs.

4Each OS runs on a logical compute system which is defined as a virtual
machine (VM).

• Each Virtual Machine (VM) is isolated from the other VMs on the same physical
compute system. Therefore, the application running on one VM does not
interfere with those running on other VMs.
• The isolation also provides fault tolerance so that if one VM crashes, the other
VMs remain unaffected.
• A VM appears as a physical compute system with its own CPU, memory,
network controller, and disks.
• The compute system on which a hypervisor is running is called a host machine
and each VM is called a guest machine.

• A compute system can be configured with a hypervisor5 or without a
hypervisor6.
• The OS that is installed on a guest machine is called a guest OS7.

For more information about hypervisor, click here.

Virtual Machine

A Virtual Machine (VM) is a logical compute system with virtual hardware on which
a supported guest OS and application run. From the perspective of the guest OS, a
VM appears as a physical compute system.

5Multiple VMs and applications run at a time: improved resource utilization,
consolidation of application servers, and increased management efficiency.

6Typically one application runs at a time: underutilized compute resources,
proliferation of application servers, and management inefficiency.

7 An application can run on the guest OS.

• Each VM has its own configuration for hardware, software, network, and
security.
− Hardware and software are configured to meet the application’s
requirements.
• The image shows the typical virtual hardware components of a VM.

− This includes virtual CPU(s), virtual motherboard, virtual RAM, virtual disk,
virtual network adapter, optical drives, serial and parallel ports, and
peripheral devices.
• A VM is a discrete set of files, such as a configuration file8, virtual disk
file9, memory state file10, and log file11.

− For managing VM files, a hypervisor may use a native clustered file
system12.
To learn more about virtual machine, click here.

Virtual Appliance

A virtual appliance is a preconfigured VM preinstalled with a guest OS and an
application dedicated to a specific function. Virtual appliances are used for
different functions such as load balancing, firewall, routing of packets, and
data backup.

8Stores the VM’s configuration data, including VM name, location, BIOS
information, guest OS type, number of CPUs, memory size, number of adapters
and associated MAC addresses, and SCSI controller type.

9Stores the content of a VM’s disk drive. A VM can have multiple virtual disk files,
each of which appears as a separate disk drive to the VM.

10Stores the memory contents of a VM and is used to resume a VM that is in a
suspended state.

11Used to keep a record of the VM’s activity and is often used for troubleshooting
purposes.

12A clustered file system can be mounted on multiple compute systems
simultaneously. This enables multiple compute systems running hypervisor to
access the same file system simultaneously.

• Virtual appliances13 are not so different from the physical appliances that
are used in kitchens, offices, and data centers to perform specific tasks.
• Created using the Open Virtualization Format (OVF)14, which simplifies the
deployment of an application.

Traditional Application Deployment       Virtual Appliance Deployment

Deploying an application on a VM is      Deployment is faster
time-consuming and error-prone

It involves setting up a new VM,         The VM is preconfigured and has
installing the guest OS, and then        preinstalled software
the application

More expensive                           Less expensive

• A virtual appliance image should be uploaded to the cloud's image repository
before deploying it in a cloud environment.

− The appliance should be planned in such a way that it can easily run on
the hypervisor that is used in the organization's cloud environment.
− Performance is limited to the resources of the hypervisor, and the
appliance may compete for resources with other VMs running on the same
hypervisor.
− When deploying a virtual appliance, VM attributes need to be described by
providing the virtual appliance's metadata15.

13The virtual appliance is a software packaged into a virtual format that is quickly
and easily deployed on a hypervisor.

14 An open hypervisor-independent packaging and distribution format.

15Metadata contains attributes of virtual machine such as RAM size and number of
processors.

Containers

Imagine needing multiple versions of applications for testing or production.
The IT team would need multiple virtual machines, each running an iteration of
the applications with the necessary binaries and libraries. This would be
challenging, as moving around large amounts of data limits VM mobility.

• Containerization is an operating system-level virtualization method that
simplifies application deployment and requires fewer resources than virtual
machines. Containers are application-centric methods that:

− Deliver microservices by providing portable, isolated virtual environments
for applications to run without interference from other running applications.
− Bundle applications with the software libraries that they depend on,
allowing developers to create "build once, run anywhere" code, making
applications very portable.
− Have become the norm for modern applications and cloud-native applications.

To learn more about containers, click here.

Containers vs. VMs

Containers                      VMs

Shared operating system         Separate operating system

Small image footprint (MB)      Large image footprint (GB)

Quick start times               Full boots

Stateless                       Stateful

Easily transportable            Not easily portable (exports/conversions, etc.)

Knowledge Check: Application and Hypervisor

Knowledge Check Question

1. Which of the following statements are correct?

a. A hypervisor abstracts the business application from the guest OS.
b. An API enables interaction between a user and a business application.
c. A hypervisor allocates the processing and memory resources to each VM.
d. A virtual appliance has direct access to the hardware of a physical compute
system.
e. Each VM is isolated from the other VMs on the same physical compute
system.


Data Source – Primary Storage

Data Source – Primary Storage

Objectives

The objectives of the topic are to:


→ Describe primary storage devices and the system architecture.
→ Understand scale-up and scale-out architecture.
→ Understand common types of primary storage systems.

Primary Storage Device

[Figure: Primary storage devices, such as an internal/external drive or a storage system, handle read and write operations.]

• A primary storage device is the persistent storage for data used by business applications to perform transactions.
• Data from a primary storage device can be copied or moved directly to protection storage so that business applications and hypervisors can be restored after a data loss.

For detailed information about primary storage devices, click here.


Architecture of Primary Storage Systems

[Figure: A primary storage system, consisting of a controller and storage (HDD/SSD).]

A primary storage system has two key components – controller and storage.

• Controllers are connected to the compute systems either directly or via a network. They read data from, or write data to, the storage while processing the I/O requests from the compute systems.
• A primary storage system can have all hard disk drives (HDDs), all solid state
drives (SSDs), or a combination of both. It may contain several storage drives to
provide petabytes of storage capacity.

Scale-up and Scale-out Architecture

A storage system may be built on either a scale-up or a scale-out architecture.

Click on "Scale-up" or "Scale-out" in this image for more information about the architecture.


1:

It provides the capability to scale the capacity and performance of a single storage
system based on requirements. Scaling up a storage system16 involves upgrading
or adding controllers and storage.

2:

• It provides the capability to maximize storage capacity and performance by simply adding storage nodes, each consisting of multiple controllers and storage devices, to a cluster of nodes.
− This provides the flexibility to combine many storage nodes of moderate performance and capacity, up to the limit supported by the cluster, into a total storage system that has large aggregate performance and capacity.
• Scale-out architecture pools the storage resources in the cluster and distributes the workload across all the storage nodes.

16 Storage systems typically have a fixed capacity ceiling, which limits their
scalability. Performance may also start to degrade when reaching the capacity
limit.


− This results in linear performance improvement as more storage nodes are added to the cluster.
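The aggregate effect of scale-out described above can be illustrated with a small sketch. The node figures and class names below are purely illustrative assumptions, not vendor specifications:

```python
# Hypothetical sketch: pooled capacity and throughput of a scale-out cluster.
# Node figures are illustrative assumptions, not real product specifications.

class StorageNode:
    def __init__(self, capacity_tb, throughput_mbps):
        self.capacity_tb = capacity_tb
        self.throughput_mbps = throughput_mbps

class ScaleOutCluster:
    def __init__(self, max_nodes):
        self.max_nodes = max_nodes  # limit supported by the cluster
        self.nodes = []

    def add_node(self, node):
        if len(self.nodes) >= self.max_nodes:
            raise RuntimeError("cluster node limit reached")
        self.nodes.append(node)

    # Pooled resources: aggregate capacity and performance grow with each node.
    @property
    def capacity_tb(self):
        return sum(n.capacity_tb for n in self.nodes)

    @property
    def throughput_mbps(self):
        return sum(n.throughput_mbps for n in self.nodes)

cluster = ScaleOutCluster(max_nodes=16)
for _ in range(4):  # four moderate nodes produce a large aggregate system
    cluster.add_node(StorageNode(capacity_tb=100, throughput_mbps=500))

assert cluster.capacity_tb == 400 and cluster.throughput_mbps == 2000
```

Each `add_node` call models adding a node to the cluster: capacity and performance both grow, up to the cluster's node limit.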

Common Types of Primary Storage System

Based on the supported level of data access, primary storage systems can be
classified as:

SAN-Attached Storage

SAN-attached storage is a block-based storage system.

• A SAN connects block-based storage systems with each other and to the compute systems.
• SAN-attached storage improves the utilization of storage resources compared
to a direct-attached storage (DAS) environment.
− This reduces the total amount of storage that an organization needs to
purchase and manage.
− Storage management becomes centralized and less complex, which further
reduces the cost of managing data.


• With a long-distance SAN, data transfer between SAN-attached storage systems can be extended across geographic locations.
• The long-distance SAN17 connectivity enables:

− The compute systems across locations to access shared data.
− Replication18 of data between SAN-attached storage systems that reside in separate locations.

17 The long-distance SAN connectivity facilitates remote backup of application data.


Backup data can be transferred through a SAN to SAN-attached backup storage
that may reside at a remote location. This avoids the need to ship backup media
such as tapes from the primary site to the remote site and removes the associated
pitfalls such as packing and shipping expenses and the risk of losing tapes in
transit.

18 Replication over long distances helps protect data against local and regional disasters.


Network-Attached Storage (NAS)

NAS Clients

NAS Device

NAS19 is a dedicated, high-performance file sharing20 and storage device.

• Administrators create file systems on NAS systems, create shares, and export
shares to NAS clients.
• Enables clients to share files over an IP-based network.

19 Also referred to as unstructured storage, which is used to address unstructured data.

20 File sharing, as the name implies, enables users to share files with other users.


− It enables both UNIX and Microsoft Windows users to share the same data.
• Uses file-sharing protocols such as CIFS and NFS to provide access to the file
data.
• A NAS device uses its own OS and integrated hardware and software components to meet specific file-service needs.

− Helps in performing file-level I/O better than a general-purpose server.


− Can serve more clients and provides the benefit of server consolidation by
eliminating the need for multiple file servers.

Object Based Storage Device (OSD)


OSD stores data in the form of objects in a flat address space21. All objects exist at the same level, and an object cannot be placed inside another object.

• An object stored in an OSD is identified by a unique identifier called the object ID22.
• OSD provides a metadata service that is responsible for generating the object ID from the content of a file.
− The metadata service maintains the mapping of the object IDs and the file system namespace.
• When an application server sends a read request to the OSD, the metadata service retrieves the object ID for the requested file.

− The object ID is used to retrieve and send the file to the application server.

21 Unlike file systems that have restrictions on the number of files, directories, and levels of hierarchy, the flat address space has no hierarchy of directories and files. As a result, billions of objects can be stored in a single namespace.

22 The object ID allows easy access to objects without the need to specify their storage locations.
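As a minimal sketch of the OSD read path described above (the store and function names are hypothetical, and a simple content hash stands in for the real object ID scheme):

```python
import hashlib

# Hypothetical sketch of an OSD metadata service: the object ID is derived
# from the content of a file, and a flat mapping links file names to IDs.

object_store = {}   # object ID -> content (flat address space, no hierarchy)
namespace_map = {}  # file system name -> object ID (metadata service mapping)

def store_file(name, content: bytes):
    object_id = hashlib.sha256(content).hexdigest()  # ID from file content
    object_store[object_id] = content
    namespace_map[name] = object_id
    return object_id

def read_file(name):
    # The metadata service retrieves the object ID for the requested file;
    # the ID is then used to retrieve the object and return it.
    object_id = namespace_map[name]
    return object_store[object_id]

oid = store_file("/reports/q1.txt", b"quarterly report")
assert read_file("/reports/q1.txt") == b"quarterly report"
```

Note how the caller never specifies a storage location: the object ID alone is enough to locate the data in the flat namespace.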


Unified Storage

Unified storage23 is a single storage system that consolidates block-level, file-level, and object-level access and is managed centrally. It combines SAN-attached storage functionality, NAS functionality, and OSD functionality in a single system.

• In some implementations, there are dedicated or separate controllers for handling block I/O, file I/O, and object I/O.
• The sharing of the storage resources increases storage utilization.

23 Unified storage includes a unified controller. The unified controller is capable of processing block-level, file-level, and object-level I/O requests concurrently.


− Reduces the capital expenditure (CAPEX) on new storage resources and the associated operational expenditure (OPEX).
• Eliminates the guesswork associated with planning for block, file, and object storage capacity separately.


Knowledge Check: Primary Storage

Knowledge Check Question

2. Which of the following types of primary storage system provides file-level access? Choose all that apply.
a. SAN-attached storage
b. Network-attached storage
c. Object-based storage device
d. Unified storage


Protection Application and Storage

Protection Application and Storage

Objectives

The objectives of the topic are to:


→ Explain the functions of protection applications.
→ Understand the concept of protection storage.
→ Describe disk-based protection storage.
→ Describe tape-based protection storage.
→ Explain virtual tape library.

Data Protection and Availability Products Overview


1: A backup application is software that creates a copy of the data so that the backup copy can be used to restore the original data in the event of data loss or corruption.

The key functions of a backup and recovery application are:

• Provides a user interface to centrally manage the backup environment.
• Allows the backup administrator to perform backup and recovery configurations.
• Creates a copy of backup data (cloning).
• Transfers data from one storage device to another (staging).
• Integrates with deduplication software to eliminate redundancy in backup data.
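The copy-and-restore idea above can be sketched in a few lines. The file names are hypothetical, and real backup applications add cataloging, scheduling, cloning, staging, and deduplication on top of this:

```python
import hashlib
import os
import shutil
import tempfile

# Sketch: create a backup copy of a file, then restore it after "data loss".

def checksum(path):
    # Verify copies by comparing content hashes.
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

work = tempfile.mkdtemp()
source = os.path.join(work, "orders.db")      # hypothetical production data
backup = os.path.join(work, "orders.db.bak")  # hypothetical backup copy

with open(source, "w") as f:
    f.write("order data")

shutil.copy2(source, backup)                 # backup: create a copy of the data
assert checksum(source) == checksum(backup)  # confirm the copy is intact

os.remove(source)                            # simulate data loss
shutil.copy2(backup, source)                 # recovery: restore from the backup
assert checksum(source) == checksum(backup)
```

The checksum comparison is the essential point: a backup is only useful if the restored data matches the original.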


2: A replication application is software24 that creates a copy (replica) of the data so that the data copy may be used to restore the original data in the event of a data loss or corruption, or to restart business operations in case the primary storage is not operational.

The key functions of replication software are:

• Creates both local and remote copies of data25.
• Performs data migration that moves data between storage systems or compute systems.
• Performs compression and encryption when transferring data to the remote location, reducing network bandwidth and improving data security.

3: The key functions of a data archiving application are:

• Identifies and moves inactive data out of primary storage systems into lower-cost storage systems, called data archives, for long-term retention.
• Creates a stub file26 on the primary storage after moving the original data to archive storage.
• Performs retrieval of archived data when required by the client.
• Creates an index27 of archived data to facilitate user searches and data retrieval.

24 Replication software can run on a compute system, a storage system, or on an appliance.

25 Typically, a technology or system upgrade requires the existing data to be moved to a new system before withdrawing the old system, to avoid downtime.

26 The stub file contains the address of the archived data.

27 By utilizing the index, users may also search and retrieve their data with the web search tool.
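The stub-file behavior described for archiving (item 3 above) can be sketched as follows. The paths and stub format are hypothetical; real archiving applications use richer stub metadata and transparent recall:

```python
import os
import shutil
import tempfile

# Sketch: move an inactive file to archive storage and leave a stub on
# primary storage that records the address of the archived data.

primary = tempfile.mkdtemp(prefix="primary-")
archive = tempfile.mkdtemp(prefix="archive-")

path = os.path.join(primary, "report-2018.pdf")  # hypothetical inactive file
with open(path, "w") as f:
    f.write("old report")

def archive_file(path, archive_dir):
    target = os.path.join(archive_dir, os.path.basename(path))
    shutil.move(path, target)          # move inactive data to archive storage
    with open(path, "w") as stub:      # stub replaces it on primary storage
        stub.write(target)             # stub holds the archived data's address
    return target

def retrieve(path):
    with open(path) as stub:
        target = stub.read()           # follow the stub's address to the archive
    with open(target) as f:
        return f.read()                # retrieval when required by the client

archive_file(path, archive)
assert retrieve(path) == "old report"
```

The stub occupies almost no space on primary storage, yet retrieval through it remains transparent to the client.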


4: The key functions of a data management application are:

• Provides end-to-end visibility of the data protection environment.
• Stores data across cloud environments and on-premises.
• Analyzes data while it is in use to obtain live and real-time results.
• Ensures data is used, managed, and retained in compliance with regulations.
• Optimizes tools and storage to minimize the cost of using and storing data.
• Helps in consolidating reports, correlating issues to find root causes, and tracking migration of data and services.

A protection application may be a packaged application running on a compute system or a data protection feature embedded in any IT equipment.

• Organizations implement data protection and availability applications such as management, backup, replication, and archiving. These applications help to:

− Protect the data from accidental deletion, application crashes, data corruption, and disaster.
− Ensure data availability on a long-term basis and help organizations meet compliance requirements.

Click on each application box in the image for more information.


Protection Storage Overview

[Figure: Data is copied from the primary storage device used by the application server to protection storage, such as disk or tape.]

• Protection storage is used to store the data to be protected.
• Organizations typically use tape-based and disk-based protection storage.
• Protection storage28 can be deployed within a data center or may exist in the cloud.

28 Typically, organizations have protection storage at a remote data center for DR purposes.


Disk-based Protection Storage

Disk density has increased dramatically over the past few years, lowering the cost
per gigabyte to the point where disk is a viable protection storage option for
organizations.

Types of disk-based data protection storage are:

• SAN-attached Storage
• Network-attached Storage (NAS)
• Object-based Storage
• Cloud-based Storage

Key Benefits of disk-based protection storage are:

• Provides enhanced performance, scalability, and reliability.
• Offers faster recovery when compared to tapes.
− In addition, these protection storage systems come with RAID or erasure coding features to protect data from loss.


• Supports replicating data to a remote site to help an organization comply with off-site requirements.
− This avoids the need to ship tapes from the primary site to the remote site and thus reduces the risk of losing tapes in transit.
• Includes features such as data deduplication, compression, and encryption to support various business objectives.
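Deduplication, one of the features listed above, can be illustrated with a tiny sketch. Fixed-size chunking and the chunk size below are purely illustrative assumptions; production systems typically use variable-size chunking:

```python
import hashlib

# Sketch: fixed-size chunk deduplication. Identical chunks are stored once;
# duplicates are replaced by references (hashes) to the existing chunk.

CHUNK = 4         # tiny chunk size, for illustration only
chunk_store = {}  # chunk hash -> chunk bytes (each unique chunk stored once)

def dedupe(data: bytes):
    refs = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        h = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(h, chunk)  # store only if not already present
        refs.append(h)
    return refs

def rehydrate(refs):
    # Reassemble the original data from the stored chunks.
    return b"".join(chunk_store[h] for h in refs)

refs = dedupe(b"AAAABBBBAAAABBBB")  # repeated pattern -> only 2 unique chunks
assert len(chunk_store) == 2
assert rehydrate(refs) == b"AAAABBBBAAAABBBB"
```

Sixteen bytes of input are stored as only two unique chunks plus references, which is exactly the redundancy elimination the feature list refers to.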

Tape-based Protection Storage


1: A tape library contains one or more tape drives that record and retrieve data on magnetic tape.

2: A tape cartridge is composed of magnetic tape in a plastic enclosure. Tape cartridges are placed in slots when not in use by a tape drive.

3: Robotic arms are used to move tapes around the library, such as moving a tape from a slot into a tape drive.

4: Used to add or remove tapes from the library without opening the access doors, because opening the access doors causes a library to go offline.

A tape library is a tape-based protection storage system that has tape drives and
tape cartridges, along with a robotic arm or picker mechanism as shown in the
image.

Click on each mechanism in the Tape Library image for more information.


Tape devices have several advantages and disadvantages:

• Tapes are a low-cost, portable solution and can be used for long-term, off-site storage29.
• Must be stored in locations with a controlled environment to ensure preservation
of media and prevention of data corruption.
• Highly susceptible to wear and tear and may have a short shelf life.
• Traditional backup process using tapes is not optimized to recognize duplicate
content.
• Storing and retrieving the data takes more time with tape.
• Data integrity and recoverability are also major issues with tape-based media.

29 Physical transportation of tapes to offsite locations also adds management overhead and increases the possibility of loss of tapes during offsite shipment.


Virtual Tape Library

[Figure: A virtual tape library, with an emulation engine presenting disk storage as tape.]

Key features of Virtual Tape Library are:

• Disks are emulated and presented as tapes to backup software.
• Does not require any additional modules or changes in the legacy backup software.
− The emulation software has a database with a list of virtual tapes, and each virtual tape is created on a disk.
• Provides better performance and reliability than physical tape, along with random disk access characteristics.
• Does not require the usual maintenance tasks associated with a physical tape drive, such as periodic cleaning and drive calibration.
• Offers a number of features that are not available with physical tape libraries, such as replication.


Knowledge Check: Protection Storage Overview

Knowledge Check Question

3. Which type of protection storage provides portability of media?


a. Disk storage
b. Tape storage
c. Virtual tape storage


Data Security and Management

Data Security and Management

Objectives

The objectives of the topic are to:


→ Explain the goals of data security and management.
→ Understand governance, risk, and compliance.
→ Define security threats and controls.
→ Describe key data protection management functions.

Introduction to Data Security and Management

Data management helps align data protection operations and services to the business goals and service level requirements of an organization. It provides the management functions that are necessary for the visibility and control of data sources, protection components, and data protection operations.

Data security and management functions:

• Help in protecting data, data sources, and protection components from unauthorized access, modification, and disruption.
• Involve implementing various kinds of countermeasures or controls in order to lessen the risk of exploitation of a vulnerability in the data sources and protection components.
• Control the secure implementation of data based on the organization’s governance, risk mitigation, and compliance requirements.


The goal of data security is to provide confidentiality30, integrity31, and availability32, commonly referred to as the security triad or CIA.

Governance, Risk, and Compliance (GRC)

Governance, Risk, and Compliance (GRC) helps an organization ensure that its acts are ethically correct and in accordance with its risk appetite (the risk level an organization chooses to accept), internal policies, and external regulations.

All operations of an organization, including data protection operations, are managed and supported through GRC.

30 Confidentiality provides the required secrecy of data to ensure that only authorized users have access to data.

31 Integrity ensures that unauthorized changes to data are not allowed. It also ensures that unauthorized alteration or deletion of data is detected and protected against.

32 Availability ensures that authorized users have reliable and timely access to data and services.


Governance

• Governance determines the purpose, strategy, and operational rules by which organizations are directed and managed.
− For example, governance policies define the access rights of users based on their roles and privileges.

Risk

• Risk1 management involves identification, assessment, and prioritization of risks, and establishing controls to minimize the impact of those risks.
− For example, a key risk management activity is to identify resources that should not be accessed by certain users in order to preserve confidentiality, integrity, and availability.

Compliance

• Compliance is the act of adhering to, and demonstrating adherence to, external laws and regulations as well as corporate policies and procedures.
− An example of compliance is to enforce a security rule relating to identity management.

1 Risk is the effect of uncertainty on business objectives.


Security Threats and Controls

Data is one of the most important assets for any organization. Other assets include hardware, software, and other infrastructure components required to access and protect data.

The implementation and effectiveness of any security control is primarily governed by the GRC processes and policies.

[Figure: A threat agent gives rise to threats that exploit vulnerabilities, leading to risk to asset values; owners impose controls to reduce the risk.]

• Organizations (asset owners) want to safeguard assets from threat agents (attackers).
• Threats are the potential attacks that can be carried out on assets to impact the confidentiality, integrity, and availability of data and services.
− Examples of attacks include attempts to gain unauthorized access into the system, unauthorized data modification, denial of service (DoS), and ransomware.
• Attackers exploit vulnerabilities or weaknesses of an asset to carry out attacks.
• Risk arises from the likelihood that a threat agent (an attacker) will exploit a vulnerability.


− Therefore, organizations deploy various security controls to minimize risk by reducing the vulnerabilities.
• Security controls have two key objectives:

− To ensure that the assets are easily accessible to authorized users.
− To make it difficult for potential attackers to access and compromise the assets.

Ransomware Protection (Air Gapped Solution)

Data Protection and Vaulting Process.

Data is the currency of the internet economy and a critical asset that must be protected, kept confidential, and made available at a moment’s notice. Global business relies on the constant flow of data across interconnected networks, and digital transformation means an increase in sensitive data. This presents ample opportunity for cyber threats that leverage data for ransom, corporate espionage, or even cyber warfare. Ransomware is a malware method that:


• Encrypts the targeted system or files; to decrypt them, the attacker demands a ransom, usually in the form of cryptocurrency.
• Spreads through phishing emails that contain malicious attachments or through drive-by downloading33.

Protecting your large and dynamically growing data from cyber attacks requires proven and modern solutions. Here are some components of a proven and modern solution:

Click the tabs below for more information about proven and modern solutions.

Data Isolation and Governance

An isolated data center environment that is disconnected from corporate and backup networks and restricted from users other than those with proper clearance.

Automated Data Copy and Air Gap

Creates unchangeable data copies in a secure digital vault, with processes that create an operational air gap between the production/backup environment and the vault.

Intelligent Analytics and Tools

Machine learning and full-content indexing with powerful analytics within the safety
of the vault. Automated integrity checks to determine whether data has been
impacted by malware and tools to support remediation if needed.

Recovery and Remediation

Workflows and tools to perform recovery after an incident using dynamic restore
processes and your existing data recovery procedures.

33 Occurs when an end user visits an infected website by mistake, and malware is then downloaded and installed without the user’s knowledge.


Solution Planning and Design

Expert guidance to select critical data sets, applications and other vital assets to
determine RTOs and RPOs and streamline recovery.

Data Management Functions

Data protection management includes the functions that are necessary for the visibility and control of data sources, protection components, and data protection operations. These functions are:

Discovery

[Figure: Discovery and monitoring collect information from protection applications, protection storage, SAN and NAS, the virtual environment, and business applications: configuration, performance, protection status, operations, availability, VM movement, threshold exceptions, and missed SLAs.]

• Discovery involves collecting and storing information about infrastructure components, data protection operations, and services.
• Gathers information about VM movement, threshold exceptions, repeated
failures, growth of a backup that will exceed the backup window, missed service
level agreements (SLAs), and compliance breaches.
• Provides the visibility needed to monitor, troubleshoot, optimize, plan, and
report about IT infrastructure components.
• Discovery is typically scheduled by setting an interval for its periodic
occurrence.

− Can be initiated by an administrator or triggered automatically when a change occurs in an IT infrastructure.
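Interval-based and change-triggered discovery, as described above, can be illustrated with a minimal sketch (the component names and structures are hypothetical):

```python
# Sketch: discovery collects and stores information about infrastructure
# components. It runs periodically and can also be triggered by a change.

inventory = {}  # stored discovery information

def discover(components):
    # Collect and store information about each infrastructure component.
    for name, attrs in components.items():
        inventory[name] = dict(attrs)

infrastructure = {"backup-server-01": {"status": "online"}}
discover(infrastructure)                         # scheduled (periodic) run

infrastructure["nas-01"] = {"status": "online"}  # a change occurs...
discover(infrastructure)                         # ...which triggers discovery

assert set(inventory) == {"backup-server-01", "nas-01"}
```

The stored inventory is what gives administrators the visibility needed to monitor, troubleshoot, and plan.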


Operations Management

[Figure: Operations management activities: monitoring, capacity planning, troubleshooting, resource optimization, and reporting.]

• Operations management involves on-going management activities to maintain the IT infrastructure, protection operations, and the deployed services.
• Operations management activities ensure that the data protection services and service levels are delivered as committed.

− Activities include monitoring, capacity planning, troubleshooting, resource optimization, and reporting.


Orchestration

• Orchestration refers to the automated arrangement, coordination, and management of various component functions to provide and manage IT operations and services.
• Orchestration is performed by orchestration software called an orchestrator.
• The orchestrator provides a library of predefined workflows and also enables defining new workflows34.
− A workflow logically integrates and sequences various component functions to automate data protection operations and services.
• The orchestrator interacts with various components to trigger execution of the component functions.

34 A workflow refers to a series of inter-related component functions to perform an IT operation and to provide a service.
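A workflow of this kind, an ordered series of component functions that an orchestrator executes, can be sketched as follows (the step and workflow names are hypothetical):

```python
# Sketch: an orchestrator executes a predefined workflow, where a workflow
# is an ordered series of inter-related component functions.

log = []  # records which component functions ran, and in what order

def provision_storage():
    log.append("provision_storage")

def configure_backup():
    log.append("configure_backup")

def verify_protection():
    log.append("verify_protection")

# Library of predefined workflows; new workflows can be defined the same way.
workflows = {
    "backup_as_a_service": [provision_storage, configure_backup,
                            verify_protection],
}

def run(workflow_name):
    # The orchestrator triggers each component function in sequence.
    for step in workflows[workflow_name]:
        step()

run("backup_as_a_service")
assert log == ["provision_storage", "configure_backup", "verify_protection"]
```

Because the workflow is data (a list of functions), new services can be composed from existing component functions without changing the orchestrator itself.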


Provisioning Data Protection Services

Data protection services are provisioned to meet the availability and data protection
requirements of business applications and IT services.

• Data protection services leverage the protection technologies and solutions provided by protection applications and other infrastructure components.
− Examples of data protection services are backup as a service, replication as a service, disaster recovery as a service, and data migration as a service.
• Provisioning of protection services is commonly orchestrated using an orchestrator35.

35 The orchestrator interacts with and coordinates the execution of the protection functions of various infrastructure components.


• An administrator may provision data protection services using management tools.
• Services are usually visible and accessible to the users through a service catalog that is hosted on a web portal.

− Users can request a service in a self-service way by simply clicking an appropriate service on the service catalog.
− The orchestrator automatically triggers a workflow after a protection request is placed from the service catalog, and does not require manual interaction between administrators and users.

Governance and Compliance

• Data governance is an umbrella term that encompasses a number of policies and processes which help to ensure the effective management of data assets within the organization.

• Data governance establishes the processes and responsibilities and defines who can take what action, upon what data, in what situations, using what methods.


• Governance determines the purpose, strategy, and operational rules by which organizations are directed and managed.
− For example, governance policies define the access rights of users based on their roles and privileges.
• Data governance is required because it:
− Implements and enforces policies that help protect data from misuse.
− Ensures compliance with data privacy laws and other rules and regulations.
• Compliance is the act of adhering to, and demonstrating adherence to, external laws and regulations as well as corporate policies and procedures.

− An example of compliance is to enforce a security rule relating to identity management.


Knowledge Check: Data Security and Management

Knowledge Check Question

4. Match the following elements with their descriptions:

A. Orchestration      D. Collecting information about components, operations, and services.

B. Governance         A. Automated arrangement and coordination of component functions.

C. Integrity          B. Determination of strategy and rules for managing organizations.

D. Discovery          C. Disallowing of unauthorized changes to data.


Concepts in Practice

Concepts in Practice

Click the right and left arrows to view all the concepts in practice.

Dell PowerEdge Server

The Dell PowerEdge server family includes various types of servers, including tower servers36, rack servers37, and modular servers38.

Dell EMC PowerStore

• A modern storage appliance designed for the data era.
• PowerStore’s single architecture for block, file, and VMware vVols leverages the latest technologies to support an enterprise-class variety of traditional and modern workloads.
− From relational databases, to ERP and EMR apps, to cloud-native applications and file-based workloads such as content repositories and home directories.

36 Tower servers generally contain more disk drive bays and expansion card slots than other server form factors. The advantages of a tower server lie in its compact shape. Tower servers can be used in work areas which are not designed to contain servers. Its simplicity and robustness make the tower server an ideal choice for a small company to begin using a server.

37A rack server is also called a rack-mounted server. Rack-mount servers are
designed to save space when there are several servers in a confined area. Rack
servers are generally more expensive. They are better suited to medium-sized
businesses or micro-businesses.

38 Modular servers are the latest development in the history of the different server types. A modular server is hosted in a dedicated chassis that includes network and storage components.


• Designed to leverage next-gen innovations such as end-to-end NVMe and dual-port Intel® Optane™ solid state drives (SSDs) as Storage Class Memory (SCM).
• As capacity and performance may be scaled independently, each active PowerStore appliance can grow to petabytes of storage, and multiple appliances can be clustered for greater performance.

Dell EMC PowerMax

• Dell EMC PowerMax next-generation enterprise storage delivers seamless cloud mobility to leverage the agility and economics of the cloud.
• Dell EMC PowerMax is a platform that offers massive scalability in every possible dimension (performance, capacity, connectivity, and LUNs/devices) and superior data services.
− All with a future-proof architecture featuring end-to-end non-volatile memory express (NVMe), SCM, built-in machine learning, seamless cloud mobility, and deep VMware integration.
• Dell EMC PowerMax delivers extreme efficiency with global inline deduplication and compression, delivering data reduction, space-efficient snaps, and thin provisioning.

Dell EMC Unity

• Dell EMC Unity XT All-Flash and Hybrid Flash arrays set new standards for
storage with compelling simplicity, all-inclusive software, blazing speed,
optimized efficiency, and multi-cloud enablement.
− All in a modern NVMe-ready design – to meet the needs of resource-
constrained IT professionals in large or small companies.
• Designed for performance, optimized for efficiency and built for hybrid cloud
environments.
• These systems are the perfect fit for supporting demanding virtualized
applications, deploying unified storage and addressing Remote-Office-Branch-
Office requirements.


Dell EMC PowerScale

• Dell EMC PowerScale is the next evolution of the OneFS – the operating
system powering the industry’s leading scale-out NAS platform.
− The software-defined architecture of OneFS gives simplicity at scale,
intelligent insights, and the ability to have any data anywhere it needs to be.
− Whether it is hosting file shares or home directories or delivering high
performance data access for applications like analytics, video rendering and
life sciences, PowerScale can seamlessly scale performance, capacity and
efficiency to handle any unstructured data workload.
• PowerScale brings a new version of OneFS to our Isilon nodes as well as two
new all-flash PowerScale nodes, delivering application requirements like the
S3 protocol and performance needs like NVMe, from the edge to the cloud.
• The new PowerScale all flash platforms co-exist seamlessly in the same cluster
with existing Isilon nodes to drive your traditional and modern applications.

Dell EMC ECS

• Dell EMC ECS is a leading object storage platform that boasts unmatched
scalability, performance, resilience, and economics.
• Dell EMC ECS has been purpose-built to store unstructured data at public cloud
scale with the reliability and control of a private cloud.
• Capable of scaling to exabytes and beyond, ECS empowers organizations to
manage a globally distributed storage infrastructure under a single global
namespace with anywhere access to content.
• Deployable as a turnkey appliance or in a software-defined model, ECS delivers
rich S3-compatibility on a globally distributed architecture, empowering
organizations to support enterprise workloads such as:

− Cloud-native, archive, IoT, AI, and big data analytics applications at scale.

Dell EMC XtremIO

• Modern data centers require storage that can consistently meet heavy
performance demands, provide unparalleled support to manage the entire
virtual workload lifecycle, and deliver on demanding SLAs.


− The purpose-built all-flash array is designed for this virtual era.


• With multi-dimensional scalability, in-memory metadata, unmatched storage
efficiency, rich application-integrated copy services, metadata-aware
replication, and unprecedented management simplicity, XtremIO delivers on
the promise of a simple, agile, scalable, fully virtualized data center, all while
minimizing infrastructure footprint and TCO.
• XtremIO is the storage platform designed to support all modern virtual data
center objectives.

− Data reduction using inline deduplication, compression, XtremIO Virtual


Copies, and thin provisioning.
− Great for mixed workloads, virtualized applications and VDI.

VMware vSphere

• VMware vSphere is the industry leading compute virtualization platform.


vSphere provides an enterprise platform for both traditional and modern
applications.
− Organizations can deliver a developer-ready infrastructure, scale without
compromise and simplify operations.
• vSphere can scale infrastructure to meet the demands of high-performance
applications and memory intensive databases.
• It supports industry leading scale through Monster VMs designed for SAP
HANA and Epic Cache Operational Database.

− Improve performance and scale for Monster VMs to support your large scale
up environments.
− Scale up to 24TB memory and support up to 768 vCPUs through Monster
VMs.


Exercise - Data Protection Architecture



1. Present Scenario:

• IT infrastructure of an organization includes 20 physical computing systems.

• Compute systems consist of both Microsoft Windows and UNIX platforms.

• Compute systems host financial, email, and backup applications, and the
organization’s website.

• Each compute system runs a single application to avoid resource conflict.

• Utilization of these compute systems is mostly around 20%.

• Compute systems are connected to six file servers with direct-attached


storage.

− Three file servers for Windows users and the remaining three file servers for
UNIX users.
• Email application uses a SAN-attached (block-based) storage system as
primary storage.

• Email application uses an aging OSD to archive old emails.

• A SAN-attached tape library is used to store all backup data.

• Backup application is purpose-built for backup-to-tape operations.

2. Organization’s Challenges:

• Tape library is aging and is a performance bottleneck during backup


operations.

• Multiple management tools are used to manage different types of storage


systems.

− Creates complexity and delays storage provisioning decisions and


troubleshooting.


• Management tools cannot provide real-time, end-to-end visibility and


reporting on the IT infrastructure, and backup and archiving operations.

• SAN-attached storage system has only 10% of its storage capacity


available.

• UNIX users and Microsoft Windows users are unable to share files.

• Some of the file servers are overly utilized and therefore new file servers
must be deployed.

3. Organization’s Requirements:

• Need to deploy new applications on social networking, eCommerce, and big


data analytics.

• Need to purchase 30 new compute systems to deploy the new applications.

4. Expected Deliverables:

Propose a solution that will:

• Optimize utilization of compute resources.

• Eliminate the performance bottleneck caused by the tape library without


changing the existing backup application.

• Reduce management complexity.

• Provide real-time, end-to-end visibility of the infrastructure and operations.

• Allow UNIX and Windows users to share files.

• Reduce proliferation of file servers and improve file serving performance.

Solution

The proposed solution is as follows:

• Install a hypervisor on each physical compute system to run multiple


VMs/applications and improve its utilization.
− Organization can use fewer physical compute systems to run both the
existing and the new applications.


− Organization can reduce the acquisition and operational cost of new


compute systems.
• To overcome the performance bottleneck of the tape library, the following can
be considered:
− Implement a disk-based backup solution; this will improve backup
performance.
− The aging tape library can be replaced with a virtual tape library.
• Deploy a unified storage system that will consolidate block-level, file-level, and
object-level access.
− Migrate data from the tape library and the OSD to the unified storage system
before decommissioning.
− A single management tool can be used for unified management of storage
systems.
• The management tool should support the discovery of the IT infrastructure
periodically and when a change occurs in the infrastructure.
− A unified management tool with an ability to discover the entire infrastructure
will provide end-to-end visibility.
• Use the NAS-functionality of the unified storage for file sharing among the
compute systems.

− Organization can consolidate multiple file servers to a NAS system and


thereby avoid proliferation of file servers.
− NAS is optimized for file serving and thus provides better performance than
a file server.
− NAS allows UNIX and Microsoft Windows users to share files.


Fault Tolerance Techniques

The main objectives of the topic are to:


→ Describe fault tolerance and its key requirements.
→ Describe compute, network, and storage-based fault tolerance techniques.
→ Describe application-based fault tolerance techniques.
→ Apply various fault tolerance techniques to ensure data availability.


Fault Tolerance Overview


Objectives

The objectives of the topic are to:

• Define and explain the need for fault tolerance.


• Review fault tolerance implementations.
• Understand key requirements for fault tolerance.

Impact of Fault

IT services may experience interruptions due to the presence of faults in the


underlying software and hardware systems. A fault in a system component causes
a deviation from its expected behavior.


• A fault results in degraded output or complete failure of the system.
• Faults can be caused by software bugs, signal distortion, storage media errors,
server crashes, network errors, application timeouts, operator errors, and
physical damage to the hardware.

Need for Fault Tolerance

• Service interruptions can be reduced by improving the reliability and availability


of IT systems.
• Reliability and availability can be improved through Fault Tolerance.
• Fault tolerance must be achieved in:

− Compute
− Network
− Storage
− Application

Reliability

• Improved by using systems that can consistently perform their operations as


expected without performance degradation or failure.

Availability

• Improved by ensuring that the IT systems and services can perform their
required functions during their operating time.
• Dependent on the reliability of systems on which the services are created.

To learn about the Need for Fault Tolerance, click here.


What is Fault Tolerance?

The ability of a system to continue functioning in the event of a fault within or failure
of some of its components.

The common reasons for a fault or a failure are:

• Hardware failure
• Software bugs
• Administrator/user errors

Fault tolerance protects a system or a service against three types of unavailability:

Transient Unavailability

Occurs once for a short time and then disappears.

• Example: An online transaction times out but works fine when a user retries the
operation.

Intermittent Unavailability

Recurring unavailability that is characterized by an outage, then availability again,


then another outage, and so on.

Permanent Unavailability

Exists until the faulty component is repaired or replaced. Examples of permanent


unavailability are network link outage, application bugs, and manufacturing defects.


To learn more about Fault Tolerance, click here.

Key Requirements for Fault Tolerance

• Elimination of single point of failure (SPOF)


• Fault isolation
• Fault recovery


Elimination of SPOF

[Figure: an infrastructure with no single point of failure: clustered compute
systems with redundant NICs and HBAs, redundant SAN switches and links,
redundant LAN/WAN networks connecting clients, redundant storage systems,
and redundant links to a remote site.]

Fault-tolerant infrastructures are typically configured without single points of failure


to ensure that individual component failures do not result in service outages. The
general method to avoid single points of failure is to provide redundant components
for each necessary resource, so that a service can continue with the available
resource even if a component fails.

• The example shown in the image represents an infrastructure designed to


mitigate the single points of failure at component level. The single points of
failure at the compute level can be avoided by implementing redundant
compute systems in a clustered configuration.
• Single points of failure at the network level can be avoided via path redundancy
and various fault tolerance protocols. Multiple independent paths can be
configured between nodes so that if a component along the active path fails,
traffic is rerouted along another path.
• The single point of failure at the storage level can be eliminated by configuring
redundant ports and controllers on each storage system and also by deploying
redundant storage systems. These storage systems may be located in separate
regions or sites to reduce the risk of data loss in the event of a disaster.


Fault Isolation

[Figure: fault isolation: when a path between the compute system's HBAs and a
storage port fails, the dead path is isolated and pending I/Os are redirected to
the live path to the storage system.]

• Limits the scope of a fault to a local area so that other areas of the system are
not impacted by the fault.
• Does not prevent the failure of a component but ensures that the failure does
not impact the overall system.
• Requires a fault detection mechanism that identifies the location of a fault, and
a contained system design (like a sandbox) that prevents a faulty system
component from impacting other components.

To learn more about Fault Isolation, click here.

Fault Recovery

Restores a system to the desired operating level after a fault has occurred in the
system.

There are three types of fault recovery:


Complete functional recovery

• System should be designed so that when a component fails, a redundant


component can take over its functions automatically (in some cases, manually).
• If maintaining redundant resources is cost prohibitive, then the second category
of fault recovery may be considered.

Functional recovery using an alternative logic or process

• Uses an alternate module, process, or path to recover a system in the event of


a fault.
• The alternate module, process, or path may not achieve the full capabilities of
the original function because of cost-effective system design, limited
resources, or design constraints.

Degraded functional recovery

• System should be designed to operate at some compromised level of


performance after a fault occurs.
• Some of the functions of the system may be inaccessible until repairs are made.
But, the system remains available to continue business operations.

Fault Recovery (Cont'd.)

There are two approaches to fault recovery:

Forward recovery

• Involves correcting the fault in a system to continue system operations from
the faulty state. It is useful only when the cause and the impact of a fault are
understood.

− Example, consider a group of two mirrored disk drives that store same data.
Each write I/O is written to both the disk drives. If one of the drives in the
mirrored pair fails and is replaced by a new drive, the surviving drive in the
mirrored pair will be used for data recovery and continuous operation.
Therefore, I/O operations can be continued from the fault condition.


Backward recovery

• Involves rolling back or restoring a system to a previous recovery point.
Instead of finding the cause of a fault, it aborts the changes that produced the
fault and reverses the previous operation or state of a process.

− For example, the memory state, settings state, and power state (on, off, or
suspended) of a virtual machine (VM) are saved at a specific recovery point
so that the VM can be restored to its previous state if anything goes wrong.


Knowledge Check: Fault Tolerance Overview

Knowledge Check Question

1. Which of the following are types of fault recovery? Choose all that apply.
a. Complete functional recovery
b. Functional recovery using an alternative logic
c. Degraded functional recovery
d. Backward recovery


Compute and Network


Objectives

The objectives of the topic are to:

• Review compute-based fault tolerance techniques.


• Understand network-based fault tolerance techniques.

Introduction to Compute and Network

• Common compute and network-based fault tolerance techniques are:

− Compute clustering
− Virtual machine (VM) live shadow copy
− Link aggregation
− NIC teaming
− Switch aggregation
− Multipathing
− Configuring hot-swappable components


Compute Clustering

• Two or more compute systems/hypervisors are clustered to provide high


availability and load balancing.
• Service running on a failed compute system moves to another compute system.
• Heartbeat mechanism determines the health of compute systems in a cluster.

To learn more about Compute Clustering, click here.
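For illustration, the heartbeat mechanism can be sketched in a few lines of Python. This is an illustrative sketch, not product code; the class and node names are invented for the example. A node that misses heartbeats for longer than the timeout is declared failed, which is what triggers service failover in a cluster.

```python
import time

class HeartbeatMonitor:
    """Tracks the last heartbeat time of each node; a node that has not
    sent a heartbeat within `timeout` seconds is considered failed."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}

    def heartbeat(self, node, now=None):
        # Record when this node last proved it was alive.
        self.last_seen[node] = time.monotonic() if now is None else now

    def failed_nodes(self, now=None):
        # Any node silent for longer than the timeout is assumed down.
        now = time.monotonic() if now is None else now
        return [n for n, t in self.last_seen.items() if now - t > self.timeout]

mon = HeartbeatMonitor(timeout=5)
mon.heartbeat("node-a", now=0.0)
mon.heartbeat("node-b", now=0.0)
mon.heartbeat("node-a", now=4.0)   # node-b stops sending heartbeats
assert mon.failed_nodes(now=7.0) == ["node-b"]
```

In a real cluster, the monitor would then restart node-b's services on a surviving node.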

Virtual Machine (VM) Live Shadow Copy


1: Enables failover to secondary VM immediately, if primary VM fails


2: VM live shadow copy works on a hypervisor cluster

3: Creates a live copy of a primary VM on another compute system

Secondary VM executes events that occur on primary VM

4: New secondary VM ensures redundancy after failover

• Provides continuous availability of services running on VMs even if the host


physical compute system or hypervisor fails.
• Works on a hypervisor cluster provided the hypervisors support it.
• When enabled for a VM, creates a live copy (i.e., a secondary VM) of a primary
VM on another compute system.

If the primary VM fails due to hardware failure, the technique enables failover to the
secondary VM immediately. After the failover occurs, a new secondary VM is
created and redundancy is reestablished.

To learn more about VM Live Shadow Copy, click here.

Link Aggregation

1: Combines two or more parallel interswitch links (ISLs) into a single logical ISL,
called a link aggregation group
Optimizes network performance by distributing network traffic across the shared
bandwidth of all the ISLs in a link aggregation group
Enables network traffic failover in the event of a link failure. If a link in a link


aggregation group is lost, all network traffic on that link is redistributed across the
remaining links.

• Combines multiple ISLs into a single logical ISL (link aggregation group)
• Distributes network traffic over ISLs, ensuring even ISL utilization.
• Enables network traffic failover in the event of a link failure.
• Provides higher throughput than a single ISL could provide.

To learn more about Link Aggregation, click here.
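The distribute-and-redistribute behavior can be sketched as follows. This is an illustrative Python sketch (the link names and hashing choice are assumptions, not how any particular switch implements it): each traffic flow is hashed onto one link of the aggregation group, and if that link fails the flow lands on one of the remaining links.

```python
import zlib

def pick_link(flow_id, links, up_links):
    """Hash a traffic flow onto one link of the aggregation group; if that
    link is down, the flow is redistributed across the remaining links."""
    candidates = [link for link in links if link in up_links]
    if not candidates:
        raise IOError("all links in the aggregation group are down")
    # A stable hash keeps a flow on the same link while membership is unchanged.
    return candidates[zlib.crc32(flow_id.encode()) % len(candidates)]

links = ["isl0", "isl1", "isl2"]
chosen = pick_link("flow-42", links, set(links))
assert chosen in links

# The link carrying the flow fails; the flow moves to one of the survivors.
survivor = pick_link("flow-42", links, set(links) - {chosen})
assert survivor in links and survivor != chosen
```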

NIC Teaming

1: Provides automatic failover in the event of a NIC/link failure.

• Groups NICs so that they appear as a logical NIC.


• Provides network traffic failover in the event of a NIC/link failure.
• Distributes network traffic across NICs.


Switch Aggregation

• Combines two physical switches to make them appear as a single logical


switch.
• Distributes network traffic across all links from aggregated switches.
• Continues network traffic flow through another switch if one switch fails.
• Provides higher throughput than a single switch could provide.

− Improves node performance.


Multipathing

1: Enables a compute system to use multiple paths for transferring data to a


storage device on a storage system. Multipathing enables automated path failover.
This eliminates the possibility of disrupting an application or a service due to failure
of a component on the path such as network adapter, cable, port, and storage
controller (SC).

• Enables a compute system to use multiple paths for transferring data to a


storage device.
• Enables failover by redirecting I/O from failed path to the available path.
• Performs load balancing by distributing I/Os across paths.

To learn more about Multipathing, click here.
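The failover and load-balancing behavior can be sketched as a minimal Python example (the path names are illustrative; real multipathing software such as OS-native MPIO operates at the driver level): I/Os round-robin across healthy paths, and a failed path is simply skipped.

```python
class MultipathIO:
    """Round-robins I/O across paths to a storage device; a failed path is
    skipped, so pending I/O transparently uses the next available path."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.healthy = set(paths)
        self._i = 0

    def mark_failed(self, path):
        # Called when a component on the path (adapter, cable, port) fails.
        self.healthy.discard(path)

    def send(self, io):
        # Try each configured path once, starting from the round-robin cursor.
        for _ in range(len(self.paths)):
            path = self.paths[self._i % len(self.paths)]
            self._i += 1
            if path in self.healthy:
                return (path, io)
        raise IOError("all paths to the storage device are down")

mp = MultipathIO(["hba0->sc0", "hba1->sc1"])
assert mp.send("write-1")[0] == "hba0->sc0"
mp.mark_failed("hba0->sc0")                    # e.g. a cable is pulled
assert mp.send("write-2")[0] == "hba1->sc1"    # I/O fails over transparently
```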


Configuring Hot-swappable Components

[Figure: a hot-swappable I/O controller blade]

• Components can be replaced while a system is powered-on and remains in


operation.
• Hot-swapping does not require shutting down and then restarting a system.
• A system should have redundant components for hot-swapping.
• System operation will continue while the faulty component is removed and
replaced.

For an example of Hot-swapping, click here.


Knowledge Check: Compute and Network

Knowledge Check Question

2. Which of the following statements are correct? Select all correct options.
a. Clustering enables service failover from a failed server to an active server.
b. VM live shadow copy balances client’s traffic across primary and
secondary VMs.
c. Switch aggregation creates a group of active and passive switches.
d. Link aggregation combines multiple logical ISLs to create a single physical
ISL.
e. Hot-swappable components can be replaced while a system remains
available.


Storage


Objectives

The objectives of the topic are to:

• Understand Redundant Array of Independent Disks (RAID).


• Understand Redundant Array of Independent Nodes (RAIN).
• Explain Erasure Coding.
• Explore Hot Sparing.
• Describe Cache protection: Mirroring and Vaulting.

Why Storage Fault-tolerant Techniques?


• Data centers usually comprise storage systems with a large number of storage
media. The greater the number of drives in use, the greater the probability of a
drive failure.
• Some storage systems comprise multiple nodes, where each node is a
compute system that has processing power and storage.
• Failure of cache memory can result in data unavailability, so protecting the
data in the cache is also important.

Redundant Array of Independent Disks (RAID)

1: Key Functions:
Managing drive aggregations
Translation of I/O requests between logical and physical drives
Data regeneration in the event of drive failures

2: A logical unit that consists of multiple drives where the data is written in blocks
across the drives in the pool

• RAID is a technique that combines multiple disk drives into a logical unit and
provides protection, performance, or both.

− Provides data protection against drive failures.


− Improves storage system performance by serving I/Os from multiple drives
simultaneously.
To learn more about RAID, click here.
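The data-regeneration function can be illustrated with the simplest parity scheme, byte-wise XOR as used in RAID 5. This is an illustrative Python sketch, not how a RAID controller is implemented: the parity block lets any single lost block in the stripe be rebuilt from the survivors.

```python
def xor_parity(blocks):
    """Compute the parity block as the byte-wise XOR of all blocks."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            parity[i] ^= b
    return bytes(parity)

def rebuild(surviving_blocks, parity):
    """Regenerate a failed block: XOR of the survivors and the parity."""
    return xor_parity(surviving_blocks + [parity])

# Three data "drives" in one stripe, with parity on a fourth drive.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
p = xor_parity([d0, d1, d2])

# The drive holding d1 fails; its contents are rebuilt from the rest.
recovered = rebuild([d0, d2], p)
assert recovered == d1
```

The same XOR property is what a hot spare relies on when a parity RAID set rebuilds onto it.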


Redundant Array of Independent Nodes (RAIN)

[Figure: a three-node cluster connected through redundant internal switches and
redundant external switches]

• Nodes are clustered in a network with redundant storage.


• Provides increased fault tolerance by allowing automated data recovery even if
multiple nodes fail.
• New nodes can be added to the cluster dynamically to meet performance and
capacity requirements.


Erasure Coding Technique

• Provides space-optimal data redundancy to protect against data loss caused
by multiple drive/node failures.

− A set of n disks is divided into m disks to hold data and k disks to hold
coding information.
− Coding information is calculated from data.
To understand the illustration, click here.
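The arithmetic behind an m-data/k-coding layout can be shown with a small Python sketch (the function name is illustrative; real systems compute the coding segments with algorithms such as Reed-Solomon): a set of n = m + k segments survives the loss of any k segments, at a storage overhead of n/m times the raw data.

```python
def erasure_profile(m, k):
    """For m data segments and k coding segments, return the total segment
    count, the number of failures tolerated, and the storage overhead."""
    n = m + k                 # total segments written to different drives
    tolerated = k             # any k segment losses can be reconstructed
    overhead = n / m          # capacity consumed per unit of user data
    return n, tolerated, overhead

# The 7+4 layout from the knowledge check below:
n, tolerated, overhead = erasure_profile(7, 4)
assert (n, tolerated) == (11, 4)   # 11 segments, any 4 drive failures survived
```

Compare this with mirroring, which tolerates one failure per copy at 2x overhead; the 7+4 layout tolerates four failures at roughly 1.57x.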

Hot Spare (Dynamic Drive Sparing)

• Refers to a spare disk drive that replaces a failed drive by taking the identity of
the failed drive.


• With the hot spare, one of the following methods of data recovery is performed
depending on the RAID implementation:
− If parity RAID is used, the data is rebuilt onto the hot spare from the parity
and the data on the surviving disk drives in the RAID set.
− If mirroring is used, the data from the surviving mirror is used to copy the
data onto the hot spare.
• When a new disk drive is added to the system, data from the hot spare is
copied to it. The hot spare returns to its idle state, ready to replace the next
failed drive.
• A hot spare should be large enough to accommodate data from a failed drive.
Some systems implement multiple hot spares to improve data availability.

Cache Protection - Mirroring

1: Each write is mirrored and stored in two independent cache memory cards

2: Even if one cache fails, the data is still available in the mirrored cache

Cache is volatile memory, so a power failure or any other cache failure will
cause loss of data that is not yet committed to the storage drive. This risk of
losing uncommitted data held in cache can be mitigated using cache mirroring
and cache vaulting.

Mirroring:


• Each write to cache is held in two different memory locations on two


independent memory cards. If a cache failure occurs, the write data will still be
safe in the mirrored location and can be committed to the storage drive.
• The risk of data loss due to power failure can be addressed in various ways;
powering the memory with a battery until the AC power is restored or using
battery power to write the cache content to the storage drives. If an extended
power failure occurs, using batteries is not a viable option.

Cache Protection - Vaulting

1: Cache content is copied to vault drive during power failure

After power is restored, the data from the drive is written back to cache

Vaulting:

• During a power failure, the cache content is copied (dumped) to a dedicated
set of drives, called vault drives.
• After power is restored, the data from the vault drives is written back to the
cache and then committed to the intended storage drives.


Knowledge Check: Storage

Knowledge Check Question

3. A storage system is configured with an erasure coding technique. If the data is


divided into seven data segments and four coding segments, and each
segment written in different drives, how many drive failures can be withstood
without losing the data in this configuration?
a. 3
b. 4
c. 7
d. 11


Application and Availability Zone


Objectives

The objectives of the topic are to:

• Explain graceful degradation of application functionality.


• Explore fault detect and retry logic in application code.
• Understand persistent state mode.
• Understand database (DB) rollback.
• Review checkpointing.
• Review configuring multiple availability zones.

Introduction to Fault-tolerant Application


• An application must be designed to deal with IT resource failures to guarantee
the required availability.
• Fault-tolerant applications have logic to detect and handle fault conditions to
avoid application downtime.

Graceful Degradation

1: When a module fails while a client is accessing the application, the
application remains available to the client, although with degraded functionality
and performance.

• The application maintains limited functionality even when some of its modules
or supporting services are not available.
• Unavailability of certain application components or modules should not take
the entire application down.

To learn more about graceful degradation, click here.
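As a minimal Python sketch (the product-page and recommender names are invented for the example): if an optional module is down, the application serves a degraded response instead of failing the whole request.

```python
def product_page(product_id, catalog, recommender):
    """Build a product page; if the recommendation module is down, the page
    is still served, just without the recommendations section."""
    page = {"product": catalog[product_id]}
    try:
        page["recommendations"] = recommender(product_id)
    except Exception:
        # Degrade gracefully instead of failing the whole page.
        page["recommendations"] = []
        page["degraded"] = True
    return page

catalog = {"p1": "Widget"}

def broken_recommender(product_id):
    raise ConnectionError("recommendation service unavailable")

page = product_page("p1", catalog, broken_recommender)
assert page["product"] == "Widget" and page["degraded"] is True
```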

Fault Detection and Retry Logic


1: When a request fails, the retry logic sends the request again; the retried
request may then complete successfully.

• Refers to a mechanism implemented in the application code to detect faults
and improve availability.
• The application detects a fault and retries a service that is temporarily down,
which may result in the operation completing successfully.

To understand the Fault Detection and Retry logic, click here.
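A minimal Python sketch of such retry logic (the helper and service names are illustrative): the operation is retried a bounded number of times, with an optional delay between attempts, and the last failure is re-raised if the fault is not transient.

```python
import time

def with_retry(operation, attempts=3, delay=0.0):
    """Retry a transiently failing operation; re-raise after the last attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                raise               # fault is not transient; give up
            time.sleep(delay)       # back off briefly before retrying

calls = {"n": 0}

def flaky_service():
    # Fails twice (a transient outage), then recovers.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient outage")
    return "ok"

assert with_retry(flaky_service) == "ok"
```

Real implementations usually add exponential backoff and retry only on errors known to be transient, so that permanent faults fail fast.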

Persistent State Model

1: State information can be accessed by the new server from the repository

• State information is moved out of the server's memory and stored in a data
repository.
• If an instance fails, the state information will still be available in the repository.

To learn about Persistent State Model in detail, click here.
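The model can be sketched in Python as follows (the repository and server classes are illustrative stand-ins for, say, a database or key-value store): because session state lives only in the repository, a replacement server instance can pick up exactly where a failed instance left off.

```python
class StateRepository:
    """Stand-in for an external data repository that holds session state
    outside any one server's memory."""

    def __init__(self):
        self._store = {}

    def save(self, session_id, state):
        self._store[session_id] = dict(state)

    def load(self, session_id, default=None):
        return dict(self._store.get(session_id, default or {}))

class AppServer:
    """A stateless application instance: all session state is read from and
    written back to the repository on every operation."""

    def __init__(self, repo):
        self.repo = repo

    def add_to_cart(self, session_id, item):
        state = self.repo.load(session_id, {"cart": []})
        state["cart"].append(item)
        self.repo.save(session_id, state)

repo = StateRepository()
server1 = AppServer(repo)
server1.add_to_cart("session-42", "disk-array")

# server1 fails; a new instance finds the same state in the repository.
server2 = AppServer(repo)
assert server2.repo.load("session-42")["cart"] == ["disk-array"]
```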


Database Rollback

1: Restores a DB to a previous state by cancelling transactions D and E

• Rollback restores a DB to a previous state by cancelling transaction(s).


• DB can be restored to a consistent previous state even after erroneous
operations are performed.

− Important for database integrity.


To learn about Database Rollback, click here.
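Rollback can be demonstrated with Python's standard sqlite3 module (the table and account names are invented for the example): when an error is detected mid-transaction, the whole transaction is cancelled and the database returns to its previous consistent state.

```python
import sqlite3

# In-memory database; each transaction either commits fully or rolls back.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
con.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
con.commit()

try:
    con.execute("UPDATE accounts SET balance = balance - 80 "
                "WHERE name = 'alice'")
    cur = con.execute("UPDATE accounts SET balance = balance + 80 "
                      "WHERE name = 'bobby'")   # erroneous account name
    if cur.rowcount == 0:
        raise ValueError("credit target not found")
    con.commit()
except ValueError:
    con.rollback()   # cancel the whole transaction; alice's debit is undone

balances = dict(con.execute("SELECT name, balance FROM accounts"))
assert balances == {"alice": 100, "bob": 50}
```

The debit to alice is reversed even though only the second statement failed, which is exactly the integrity property rollback exists to preserve.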

Checkpointing

• Saves a copy of the state (checkpoint) of a process/ application periodically.


• Enables rolling back to a previous state and continuing tasks.
• Provides protection against transient unavailability. Upon rollback to a previous
checkpoint, the applications and processes continue operation in the same
manner as they did before a failure. However, if the fault is caused by a
software bug or an administrative error, then the application will continue to fail
and rollback endlessly.
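A minimal Python sketch of checkpointing (the class is illustrative; real systems persist checkpoints to durable storage rather than memory): the task saves a known-good copy of its state, and on a transient fault rolls back to that copy and continues.

```python
import copy

class CheckpointedTask:
    """Saves a copy of the task state periodically; on a transient fault the
    task rolls back to the last checkpoint and continues from there."""

    def __init__(self, state):
        self.state = state
        self._checkpoint = copy.deepcopy(state)

    def checkpoint(self):
        # Save a known-good snapshot of the current state.
        self._checkpoint = copy.deepcopy(self.state)

    def rollback(self):
        # Discard work since the last checkpoint and continue from it.
        self.state = copy.deepcopy(self._checkpoint)

task = CheckpointedTask({"processed": 0})
task.state["processed"] = 100
task.checkpoint()                  # known-good point
task.state["processed"] = 140      # work done since the checkpoint...
task.rollback()                    # ...is lost on a fault, but the task resumes
assert task.state["processed"] == 100
```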


Configuring Multiple Availability Zones

1: In the event of a zone outage, services can fail over to another zone.


• Availability zone is a location with its own set of resources and isolated from
other zones.
• Availability zones, although isolated from each other, are connected through
low-latency network links.
• In the event of a zone outage, services can fail over to another zone.


Knowledge Check: Application and Availability Zone

Knowledge Check Question

4. Which of the following refers to the ability of an application to maintain limited


functionality even when some of the components, modules, or supporting
services are not available?
a. Retry logic
b. Persistent state model
c. Graceful degradation
d. Availability zone


Concepts in Practice

Click the right and left arrows for more information.

Dell EMC PowerPath

Dell EMC PowerPath is a host-based software that provides automated data path
management and load-balancing capabilities for heterogeneous server, network,
and storage deployed in physical and virtual environments. The PowerPath family
includes PowerPath Multipathing for physical environments, as well as Linux, AIX,
and Solaris virtual environments and PowerPath/VE Multipathing for VMware
vSphere and Microsoft Hyper-V virtual environments. It automates multipathing
policies and load balancing to provide predictable and consistent application
availability and performance across physical and virtual environments. PowerPath
improves service-level agreements by eliminating application impact from I/O
failures.

VMware vSphere HA

VMware vSphere HA leverages multiple ESXi hosts configured as a cluster to


provide rapid recovery from outages and provides high availability for applications
running in virtual machines. It protects against a server failure by restarting the
virtual machines on other hosts within the cluster. It protects against application
failure by continuously monitoring a virtual machine and resetting it if a failure is
detected. It protects against datastore accessibility failures by restarting affected
virtual machines on other hosts which still have access to their datastores.

VMware vSphere FT

vSphere HA provides a base level of protection for your virtual machines by


restarting virtual machines in the event of a host failure. vSphere Fault Tolerance
provides a higher level of availability, allowing users to protect any virtual machine
from a host failure with no loss of data, transactions, or connections. Fault
Tolerance provides continuous availability by ensuring that the states of the
Primary and Secondary VMs are identical at any point in the instruction execution
of the virtual machine. If either the host running the Primary VM or the host running
the Secondary VM fails, an immediate and transparent failover occurs. The
functioning ESXi host seamlessly becomes the Primary VM host without losing
network connections or in-progress transactions. With transparent failover, there is
no data loss and network connections are maintained. After a transparent failover
occurs, a new Secondary VM is respawned and redundancy is re-established. The
entire process is transparent and fully automated and occurs even if vCenter
Server is unavailable.


Exercise

Exercise: Fault Tolerance Techniques


1. Present Scenario:

An organization has two availability zones.

Each zone has:

• A cluster of 10 physical compute systems running 50 VMs.

• Two block-based storage systems.

• Four Ethernet switches.

− Three active and one on standby.


• Applications running on VMs are used to provide eCommerce services.

2. Organization Challenges:

• Some over-utilized ISLs cause degradation of service performance.

• Service performance is impacted during peak workload hours due to limited


bandwidth of switch 2.

• Failure of a VM, physical compute system, or HBA causes a brief service


interruption and data loss for in-progress transactions.

• Recently a payment gateway fault caused a service outage.

− Customers were unable to view product catalog, shopping cart, and order
status.
• Recently a power supply failure caused an entire zone outage and loss of in-
progress transactional data.

3. Organization requirements:

• High availability and performance must be ensured to meet service level


commitments.


• Even a brief service interruption or loss of transactional data is


unacceptable.

4. Expected Deliverables:

Propose the fault tolerance techniques to address the organization’s


challenges and requirements.

Solution

The proposed solution is as follows:

• Aggregate ISLs between two Ethernet switches to distribute traffic.


• Aggregate switch 2 and switch 3 to allow both switches to be active.
• Use VM live shadow copy to provide continuous availability of services.
• Implement multipathing to prevent service disruption due to an HBA failure.
• Ensure that application/service design supports graceful degradation.
• Configure redundant power supplies in each zone to avoid data loss.

Data Backup

The main objectives of the topic are to:


→ Describe backup architecture.
→ Understand various backup and recovery operations.
→ Describe backup granularity.
→ Describe backup topologies.
→ Explain various backup methods.
→ Apply various backup strategies to address the organization’s challenges
and requirements.


Introduction to Backup

Objectives

The objectives of the topic are to:


→ Describe backup architecture.
→ Understand various backup and recovery operations.
→ Describe backup granularity.
→ Describe backup multiplexing.
→ Understand backup cloning and staging.

Why Do We Need Data Backup?

Organizations implement backup to protect data against accidental file deletion,
application crashes, data corruption, and disasters. Data should be protected
both locally and at a remote location to ensure service availability.


An organization needs data backup to:

• Recover the lost or corrupted data for smooth functioning of business


operations.
• Meet the demanding SLAs.
• Comply with regulatory requirements.
• Avoid financial and business loss.

For more details about need for data backup, click here

Backup Architecture

In a backup environment, the common backup components are backup client,


backup server, storage node, and backup target.

Component Role

Backup Client • Gathers data to be backed up.


• Sends data to the backup storage node.
• Sends metadata to the backup server.
• Retrieves data during a recovery.


Backup Server • Manages the backup operations and maintains the backup
catalog.
• Contains information about the backup configuration39 and
backup metadata40.

Storage Node • Responsible for organizing the client’s data and writing the
data to a backup device.
• Controls one or more backup devices41.
• Sends the tracking information about the data written to
the backup device to the backup server.
• Reads data from the backup device during recoveries.

• A wide range of backup targets are currently available such as tape, disk, and
virtual tape library.

− Organizations can also back up their data to the cloud storage.


− Many service providers offer backup as a service that enables an
organization to reduce its backup management overhead.

39 The backup configuration contains information about when to run backups, which
client data to be backed up, and more.

40 The backup metadata contains information about the backed up data.

41Backup devices may be attached directly or through a network to the storage


node.


Backup Operations

Following are the steps for performing backup operations:

• Backup server initiates scheduled backup process.


• Backup server retrieves backup-related information from the backup catalog.
• Backup server instructs storage node to load backup media in the backup
device.
• Backup server instructs backup clients to send data to be backed up to the
storage node.
• Backup clients send data to the storage node and update the backup catalog on
the backup server.
• Storage node sends data to the backup device.
• Storage node sends metadata and media information to the backup server.
• Backup server updates the backup catalog.

To learn more about backup operations, click here.
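The steps above can be sketched as a small simulation; the component names and the catalog layout are illustrative, not any product's actual data structures:

```python
# A minimal model of the backup workflow described above.
backup_catalog = []      # maintained by the backup server
backup_device = []       # media written by the storage node

def run_scheduled_backup(client_name, client_data):
    # Steps 1-4: the server consults the catalog, readies the storage node,
    # and instructs the client to send its data (details elided here).
    # Steps 5-6: the client streams data; the storage node writes the media.
    backup_device.append({"client": client_name, "data": client_data})
    # Steps 7-8: the storage node reports metadata and media information,
    # and the backup server updates the backup catalog.
    backup_catalog.append({"client": client_name,
                           "media_index": len(backup_device) - 1})

run_scheduled_backup("host01", b"payroll-files")
```

The catalog entry records where the client's data landed, which is exactly the tracking information later needed for a restore.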

Backup Operations (Cont’d.)

Some backup operations along with their description are:

Backup Operation    Description


Backup Initiation • Client-initiated backup
Method            Manual process performed on a backup client from
                  either a GUI or the command line.
                  • Server-initiated backup
                  Initiated from the server; usually configured to start
                  automatically, but may also be started manually.

Backup Mode • Cold backup (Offline)


Requires the application to be shutdown during the
backup process.
• Hot backup (Online)

Application is up-and-running with users accessing their


data during the backup process.

Backup-Type • File-level
One or more files are backed up on a client system.
• Block-level
Backup data at block-level instead of file-level.
• Image-level

Backup is saved as a single file, called an image.

For more details about backup operations, click here.


Recovery Operations

After the data is backed up, it can be restored42 when required. A recovery
operation restores data to its original state at a specific PIT. Typically, backup
applications support restoring one or more individual files, directories, or VMs.

Following are the steps for performing recovery operations:

• Backup client requests backup server for data restore.


• Backup server scans backup catalog to identify data to be restored and the
client that will receive data.
• Backup server instructs storage node to load backup media in backup device.
• Data is then read and sent to the backup client.
• Storage node sends restore metadata to backup server.
• Backup server updates the backup catalog.

To understand the recovery operations in detail, click here.

42 A restore process can be manually initiated from the client. It can also be initiated
from the server interface.


Types of Recovery

The various types of recovery are operational recovery, disaster recovery, full
VM recovery, and cloud disaster recovery.

Types of Recovery    Description

Operational Restores small numbers of files after they have been accidentally
Recovery deleted or corrupted.
or restore

Disaster Restores IT infrastructure to an operational state following a


Recovery disaster.

Full VM Restores the entire backed up VMs to the same host or to a


Recovery different virtual host (ESXi host).

Cloud Data and applications (VMs) can be replicated to the cloud
Disaster environment from an on-premises data center. During a disaster,
Recovery the data and applications (VMs) can be recovered from the cloud, or
the services can be restored from the cloud environment.

For detailed information about different recovery types, click here.

Achieving Consistency in Backup

Consistency43 is critical to ensure that a backup can restore a file, directory, file
system, or database to a specific point-in-time.

43Consistency is a primary requirement to ensure the usability of backup copy after


restore.


              Offline             Online

File System   Un-mount the        • Flushing compute system buffers
              file system         • Using an open file agent

Database or   Shut down the       Database backup agents
Application   database

To understand more on how consistency in backup can be achieved, click here.

Backup Granularities

Backup granularity depends on business needs and the required RTO/RPO. Based
on the granularity, backups can be categorized as full, incremental, cumulative (or
differential), incremental forever, and synthetic full backup.

Most organizations use a combination of these backup types to meet their backup
and recovery requirements. Let us understand each of them in detail:

Full Backup

• Copies all data on the production volume to a backup storage device.


• Provides a faster data recovery.
• Requires more storage space.
• Takes more time to back up.


[Image: Full backup — all data on the production volume is copied to the
backup device.]

Full Backup-Restore

In the motion graphic shown below, a full backup is created every Sunday. When
data loss occurs in production on Monday, the most recent full backup, created
the previous Sunday, is used to restore the production data.

• RPO determines which backup copy is used to restore the production.
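Selecting a copy for restore can be sketched as follows; the catalog entries and dates are hypothetical:

```python
from datetime import datetime

# Hypothetical backup catalog with one full backup per Sunday.
catalog = [
    {"taken": datetime(2021, 3, 7),  "type": "full"},
    {"taken": datetime(2021, 3, 14), "type": "full"},
]

def copy_for_restore(backups, failure_time):
    """Return the most recent backup copy taken at or before the failure."""
    candidates = [b for b in backups if b["taken"] <= failure_time]
    return max(candidates, key=lambda b: b["taken"])

# Data loss on Monday 15 March: the previous Sunday's full backup is used.
chosen = copy_for_restore(catalog, datetime(2021, 3, 15))
```

Any data written between the chosen copy and the failure is lost, which is why the backup frequency must match the required RPO.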


Incremental Backup

Incremental backup copies the data that has changed since the last backup.

• The main advantage of incremental backups is that fewer files are backed up
daily, allowing for shorter backup windows.


• The primary disadvantage to incremental backups is that they can be time-


consuming44 to restore.
• Click here45 to view the example of incremental backup.

Cumulative Backup

Cumulative (differential) backup copies the data that has changed since the last full
backup.

44Suppose there is a data loss on Wednesday morning, before that day's
incremental backup, and a recovery from the backup copies is required. The
administrator has to first restore Sunday's full backup, then restore Monday's
copy, followed by Tuesday's copy.

45For example, as shown in the motion graphic, a full backup is created on


Sunday, and incremental backups are created for the rest of the week. Monday's
backup would contain only the data that has changed since Sunday. Tuesday's
backup would contain only the data that has changed since Monday.


• The advantage of differential backups over incremental backup is shorter


restore46 times.
• The tradeoff is that as time progresses, a differential backup can grow to
contain much more data47 than an incremental backup.
• Click here48 to view the example of cumulative backup.
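The difference in restore chains between incremental and cumulative backups can be sketched as follows (the catalog entries are illustrative labels, not real media):

```python
def restore_chain(catalog, granularity):
    """Return the ordered list of copies needed to restore to the latest
    backup. 'catalog' is ordered oldest-first; entry 0 is the full backup."""
    if granularity == "incremental":
        return list(catalog)              # full plus every incremental copy
    if granularity == "cumulative":
        return [catalog[0], catalog[-1]]  # full plus only the latest copy
    raise ValueError("unknown granularity")

week = ["Sun-full", "Mon-copy", "Tue-copy"]
incremental = restore_chain(week, "incremental")
cumulative = restore_chain(week, "cumulative")
```

The incremental chain grows with every backup, while the cumulative chain never needs more than two copies, which is why cumulative backups restore faster.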

46 Restoring a differential backup never requires more than two copies.

47Suppose there is a data loss on Wednesday morning, before that day's
cumulative backup, and a recovery from the backup copies is required. The
administrator has to first restore Sunday's full backup and then restore
Tuesday's backup copy.

48For example, the administrator created a full backup on Sunday and differential
backups for the rest of the week. Monday's backup would contain all of the data
that has changed since Sunday; it would therefore be identical to an incremental
backup at this point. On Tuesday, however, the differential backup would back up
any data that had changed since Sunday (the full backup).


Incremental Forever Backup

Rather than scheduling periodic full backups, this backup solution requires only one
initial full backup.

• Initial full backup followed by ongoing sequence of incremental backups.


• Incremental backups are automatically combined with the initial full backup in
such a way that you never need to perform a full backup again.
− Enables the use of a single set of backups for restore.
• Reduces49 the amount of data that goes across the network and reduces the
length of the backup window.

49Also reduces the data growth because all incremental backups contain only the
blocks that have changed since the previous backup.


Synthetic Full Backup

Another way to implement full backup is by performing synthetic backup. This


method is used when the production volume resources cannot be exclusively
reserved for a backup process for extended periods to perform a full backup.

• Created from an existing full backup and is merged with the data from any
existing incremental backups.
• This backup is not created directly from production data.

For more information about synthetic full backup, click here.
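The merge that produces a synthetic full can be sketched as follows; representing each backup as a mapping of block number to block contents is a simplifying assumption for illustration:

```python
def synthesize_full(full_backup, incrementals):
    """Merge an existing full backup with later incremental backups into a
    synthetic full. Each backup maps block number -> block contents; blocks
    from later copies overwrite older versions. Production is never touched."""
    synthetic = dict(full_backup)
    for incremental in incrementals:
        synthetic.update(incremental)
    return synthetic

full = {0: "a0", 1: "b0", 2: "c0"}
monday = {1: "b1"}        # only block 1 changed on Monday
tuesday = {2: "c2"}       # only block 2 changed on Tuesday

new_full = synthesize_full(full, [monday, tuesday])
```

The result is equivalent to a fresh full backup, yet it was built entirely from existing backup copies on the backup storage.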


Backup Multiplexing

In an environment without multiplexing, only one stream of data is written to a tape


backup device at any given time. This situation is not ideal because as more clients
perform simultaneous backups, the tape drive’s throughput is not optimized.

One of the ways that backup software achieves backup efficiency with tapes is by
interleaving or multiplexing multiple backups onto a backup device. Multiplexing:

• Allows multiple client machines to send backup data to a single tape drive
simultaneously.
• May decrease backup time for large numbers of clients over slow networks, but
does so at the cost of recovery time.

For more information about multiplexing, click here.
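The interleaving can be sketched in a few lines; the client names and blocks are illustrative:

```python
from itertools import zip_longest

def multiplex(client_streams):
    """Interleave blocks from several client backup streams onto one tape
    stream. Each block is tagged with its client so it can be separated
    again (more slowly) at recovery time."""
    tape = []
    for round_of_blocks in zip_longest(*client_streams.values()):
        for client, block in zip(client_streams, round_of_blocks):
            if block is not None:
                tape.append((client, block))
    return tape

streams = {"clientA": ["a1", "a2"], "clientB": ["b1", "b2"]}
tape = multiplex(streams)
```

The tape stream alternates between clients, keeping the drive busy; recovering one client's data then requires skipping over the other clients' interleaved blocks, which is the recovery-time cost noted above.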


Backup Cloning and Staging

Some of the backup software provides the ability to further manage and protect the
backup data through the use of cloning50 and staging51.

Task Description

Cloning • It is the process of creating copies of backup data to enhance


data protection.
• These copies are then sent to the off-site vault while one copy
of the backup is kept at the production site.

Staging • It is a process of transferring data from one storage device to


another, and then removing the data from its original location.
• It reduces the time it takes to complete a backup by directing
the initial backup to a high performance device (disk-based
backup device).
• The data can then be staged to a storage medium (tape-based
backup device), freeing up the disk space.

50 Cloning improves data protection through redundancy, since each backup has a
clone at a geographically-dispersed location. Some backup software supports the
capability of performing copy operation at the same time as backups.

51Staging also allows data to be moved off the device outside the backup period,
ensuring that sufficient disk space is available for the next backup session.


Knowledge Check: Introduction to Backup

Knowledge Check Question

1. From the list of steps provided - drag and drop each into the correct sequence
to perform a backup operation.

2 Backup server retrieves backup-related information from the backup


catalog.
6 Backup server updates the backup catalog.
1 Backup server initiates scheduled backup process.
3 Backup server instructs storage node to load backup media in backup
device and instructs backup clients to send data to be backed up to the
storage node.
4 Backup clients send data to storage node and update the backup catalog
on the backup server.
5 Storage node sends data to the backup device and sends metadata and
media information to the backup server.

Knowledge Check Question

2. Does an incremental forever backup require periodic full backup?


a. Yes
b. No


Backup Topologies

Objectives

The objectives of the topic are to:


→ Explain direct-attached backup.
→ Understand LAN-based backup.
→ Understand SAN-based and NAS-based backup.
→ Describe cloud-based backup.

Direct-Attached Backup

[Image: The application server (backup client/storage node) sends backup data
directly to the backup device; only metadata goes over the LAN to the backup
server.]

In a direct-attached backup, a backup device is attached directly to the client. Only


the metadata is sent to the backup server through the LAN.

In the image shown, the client acts as a storage node that writes data on the
backup device.

• The key advantage of direct-attached backups is speed.


− The backup device can operate at the speed of the channels.
− Frees the LAN from backup traffic.
• Disadvantages of direct-attached backup are:

− Backup device is not shared, which may lead to silos of backup device in the
environment.
− In a large data center environment, backup devices may be underutilized.


To understand more about direct-attached backup, click here.

LAN-based Backup

[Image: The backup client sends data over the LAN to the storage node, which
writes it to the backup device; metadata goes to the backup server.]

In a LAN-based backup, the data to be backed up is transferred from the backup


client (source), to the backup device (destination) over the LAN, which may affect
network performance.

• Advantage:
− Centralized backups reduce management complexity.
• Disadvantage:

− Impacts the network performance.


− Impacts the application’s performance.
To know more about LAN-based backup, click here.


SAN-based Backup

[Image: The backup client sends data over the FC SAN through the storage node
to the backup device; metadata goes over the LAN to the backup server.]

The SAN-based backup52 (as shown in the image) is also known as the LAN-free
backup. The high-speed and extended-distance capabilities of Fibre Channel are
used for the backup data movement path.

• Advantage:
− Production LAN environment is not impacted.
− Backup device can be shared among the clients.
− Offers improved backup and restore performance due to FC SAN.
• Disadvantage:
− Impacts the application’s performance.
• In the shown image, clients read the data from the application servers in the
SAN and write to the SAN-attached backup device.

− The backup data traffic is restricted to the SAN and the backup metadata is
transported over the LAN.
− However, the volume of metadata is insignificant when compared to
production data.

52 The SAN-based backup topology is the most appropriate solution when a backup
device needs to be shared among the clients. In this case the backup device and
clients are attached to the SAN.


NAS-based Backup

[Image: On a backup request from the backup server, the NAS head retrieves
data from the storage system; the application server/backup client forwards it
to the backup device, and metadata goes to the backup server.]

Network-attached storage (NAS) enables its clients to share files over an IP


network.

• It communicates by using the Network File System (NFS) for Unix
environments and the Common Internet File System (CIFS) for Microsoft
Windows environments.
• The image shown, illustrates a server-based backup topology in a NAS
environment.

− In this approach, the NAS head retrieves data from storage over the network
and transfers it to the backup client running on the application server.
− The backup client sends this data to a storage node, which in turn writes the
data to the backup device.


Cloud-based Backup

[Image: Backup clients send backup data to, and retrieve backup data from,
the cloud.]

Cloud-based backup, also known as backup as a service (BUaaS), provides clients
with an online solution for the storage, backup, and recovery of files.

• Monitors the health of the data protection environment and helps comply with
government and industry regulations.
• Manages data backup with robust on-site, off-site, and hybrid cloud-based
security.

• Advantages:

− Scales capacity up and down quickly.
− Easily handles security and control issues.
− Quickly restores the backed-up data.
− Provides clients quick access to the most needed data and applications in
case of disaster.


Knowledge Check: Backup Topologies

Knowledge Check Question

Carefully inspect the given image.

3. Which backup topology is represented in the above image?


a. Direct-attached
b. LAN-based
c. SAN-based
d. NAS-based


Backup Methods

Objectives

The objectives of the topic are to:


→ Describe agent-based backup.
→ Describe image-based backup.
→ Explain recovery-in-place (instant recovery).
→ Describe NDMP-based backup.
→ Understand the concept of a direct backup from primary storage.

Agent-Based Backup Approach

[Image: Backup agents on the application servers stream backup data through
the backup server/storage node to the backup device.]

In this approach, an agent or client is installed on a virtual machine (VM) or a


physical compute system. The agent streams the backup data to the backup device
as shown in the image.

• Advantage:
− Backup configurations and recovery options follow traditional methods that
administrators are already familiar with, so there are no added configuration
requirements.
− Supports a single file backup and restore.
• Disadvantage:


− Impacts performance of applications running on compute systems.


− Does not provide the ability to backup and restore the VM as a whole.
For detailed information about agent-based backup approach, click here.

Image-Based Backup

[Image: The backup server requests the VM management server to create a
snapshot of the VM; the snapshot is mounted on the proxy server, which sends
the backup data to the backup device.]

Image-level backup53 (as shown in the image) makes a copy of the virtual machine
disk and configuration associated with a particular VM. The backup is saved as a
single entity called VM image.

In an image-level backup, the backup software:

• Sends a request to the VM management server to create a snapshot of the
VMs to be backed up.
• Mounts the snapshot on the proxy server.
• Performs the backup from the snapshot using the proxy server.

For more information about image-based backup, click here.

53This type of backup is suitable for restoring an entire VM in the event of a


hardware failure or human error such as the accidental deletion of the VM.


Image-Based Backup – Changed Block Tracking

[Image: The VM kernel maintains an additional block map file that tracks
changed blocks of the virtual machine disk; only the changed blocks are copied
from the virtual disk to the backup device.]

To further enhance the image-based backup some of the vendors support changed
block tracking54 mechanism.

Changed block tracking technique dramatically:

• Reduces the amount of data to be copied before additional data reduction


technologies (deduplication) are applied.
• Reduces the backup windows and the amount of required storage for protecting
VMs.

For more details about the changed block tracking mechanism, click here.
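The idea behind changed block tracking can be sketched as a comparison of block maps; representing the maps as dictionaries is a simplification for illustration, not how any hypervisor actually stores them:

```python
def changed_blocks(current_map, snapshot_map):
    """Compare the current block map with the map recorded at the last VM
    snapshot, returning only blocks that changed; the backup application
    copies these instead of every block of the virtual disk."""
    return {num: data for num, data in current_map.items()
            if snapshot_map.get(num) != data}

last_snapshot = {0: "boot", 1: "app-v1", 2: "logs-mon"}
current_disk = {0: "boot", 1: "app-v2", 2: "logs-tue"}

delta = changed_blocks(current_disk, last_snapshot)   # blocks 1 and 2 only
```

Only the changed blocks are sent to the backup device, which shrinks both the backup window and the required storage.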

54This feature identifies and tags any blocks that have changed since the last VM
snapshot. This enables the backup application to backup only the blocks that have
changed, rather than backing up every block.


Recovery-in-Place (Instant Recovery)

[Image: When a VM fails, it is started directly from the backup device by the
backup server/storage node, while its disk files are recovered in the
background to the production FS volume.]

Certain VMs running mission-critical applications need to be brought online
immediately, with no time to spare waiting for a full VM image recovery. This is
achieved with the help of the recovery-in-place technique.

Recovery-in-place (instant VM recovery):

• Enables running a VM directly from a purpose-built backup appliance, using a
backed-up copy of the VM image.
• Provides an almost instant recovery of a failed VM.

− Reduces the RTO.


As shown in the image, if a VM has failed, the instant recovery technique
enables restarting the VM directly from the backup device. At the same time,
the VM is restored to the production storage.

For more information about Recovery-in-Place, click here.


NDMP-Based Backup

[Image: The backup server (NDMP client) instructs the NAS device to start the
backup; the NDMP server running on the NAS head reads data from storage and
sends the backup data directly to the backup device.]

An open standard TCP/IP-based protocol specifically designed for a backup in a


NAS environment.

• Data can be backed up using NDMP regardless of the OS.


• Backup data is sent directly from NAS to the backup device.
• No longer necessary to transport data through application servers.
• Backs up and restores data while preserving security attributes of file system
(NFS and CIFS).


The key components of an NDMP infrastructure are NDMP client55 and NDMP
server.

• The NDMP server has two components- data server56 and media server57.

The backup operation occurs as follows:

• Backup server uses NDMP client and instructs the NAS head to start the
backup.
• The NAS head uses its data server to read the data from the storage.
• The NAS head then uses its media server to send the data read by the data
server to the backup device.

Direct Primary Storage Backup

Direct primary storage backup approach backs up data directly from a primary
storage system to a backup target without any additional backup software.

This data protection solution integrates primary storage and protection storage
(backup device).

Direct primary storage provides the following advantages:

• Backs up data directly from a primary storage system to a backup target.


• Integrates primary storage and backup devices.

55NDMP client is the NDMP enabled backup software installed as an add-on


software on backup server.

56The data server is a component on a NAS system that has access to the file
systems containing the data to be backed up.

57The media server is a component on a NAS system that has access to the
backup device.


• Eliminates the impact on applications.


• Reduces cost and complexity by eliminating excess infrastructure, including a
traditional backup application.


Knowledge Check: Backup Methods

Knowledge Check Question

Carefully inspect the given image.

4. In the above image, which component runs NDMP client?


a. Backup Server
b. Application Server
c. NAS Device
d. Backup Device


Concepts in Practice

Click the right and left arrows to view all the concepts in practice.

Dell EMC NetWorker

Dell EMC NetWorker is a backup and recovery solution for mission-critical business
applications in physical and virtual environments for on-premises and cloud.

• Unified backup and recovery software for the enterprise: deduplication, backup
to disk and tape, snapshots, replication and NAS.
• Provides a robust cloud capability enabling long term retention to the cloud,
backup to the cloud and backup in the cloud.
• NetWorker Module for Databases and Applications (NMDA) provides a data
protection solution for DB2, Informix, Lotus Domino/Notes, MySQL, Oracle,
SAP IQ, and Sybase ASE data.
− NMDA also provides data protection for MongoDB, MySQL, and
PostgreSQL data through the Orchestrated Application Protection feature.
• NetWorker Snapshot Management (NSM) is integrated with Dell EMC storage
and enables end-to-end snapshot management and backup from within the
NetWorker UI.

Dell EMC PowerProtect DP Series Appliance

PowerProtect DP series appliances deliver powerful backup and recovery of all an
organization's data, wherever it lives, using a single appliance.

• The PowerProtect DP appliance is the next generation of the Integrated Data
Protection Appliance (IDPA): all-in-one data protection software and storage in
a single appliance that delivers backup, replication, recovery, search,
analytics, and more.
• Easy to deploy and manage, it can help consolidate data protection software
and hardware for any size of organization. Features include:
− Systems can scale to petabytes of usable capacity.
− Flexible consumption models.


− Cloud long-term retention and cloud DR-ready.


− VMware integration.
• Supports native Cloud DR with end-to-end orchestration, allowing enterprises
to copy backed-up VMs from on-premises IDPA environments to the public cloud
with AWS, Azure, or VMware Cloud on AWS.

Dell EMC PowerProtect DD Series Appliance

DD series enables organizations to protect, manage, and recover data at scale
across their diverse environments.

• The next generation of Dell EMC Data Domain appliances, now setting the bar
for data management from edge to core to cloud.
• Integrates easily with existing infrastructures, enabling ease of use with
leading backup and archiving applications, and offers superior performance in
conjunction with PowerProtect Data Manager and the Data Protection Suite.
• Natively tiers deduplicated data to any supported cloud environment for long-
term retention with Dell EMC Cloud Tier.
• Provides fast disaster recovery with orchestrated DR and an efficient
architecture to extend on-premises data protection.

PowerProtect DD Virtual Edition (DDVE) leverages the power of DDOS to deliver
software-defined protection storage on-premises and in-cloud.


Exercise- Data Backup

Exercise- Data Backup


1. Present Scenario:

• A financial organization runs business-critical applications in a virtualized
data center.

• The organization:

− Currently uses tape as their primary backup storage media for backing up
application data.
− Uses an agent-based backup solution for backing up data.
− Has a file-sharing environment in which multiple NAS systems serve all the
clients including application servers.
2. Organization’s Challenges:

• Backup operations consume resources on the compute systems that are
running multiple VMs.

− This significantly impacts the applications deployed on the VMs.


• During NAS backup, the application servers are impacted.

− Data is backed up from these servers to the backup device.


• Backing up and recovering data also takes more time.

3. Organization’s Requirements:

• Need faster backup and restore to meet the SLAs.

• Need to offload the backup workload from the compute system to avoid
performance impact on applications.

• Need a solution to avoid performing regular full backup.

• Require a solution to overcome the backup challenges in NAS environment.

4. Expected Deliverables:


• Propose a solution to address the organization's challenges and requirements.

Solution

The proposed solution is as follows:

• Implement a disk-based backup solution to improve backup and recovery
performance for meeting SLAs.
• Implement recovery-in-place to speed up recovery operations.
• Implement image-based backup, which helps offload the backup operation from
the VMs to a proxy server.
− No backup agent is required inside the VM for backup.
• Implement incremental forever backup to avoid performing regular full backups.
• Deploy an NDMP-based backup solution for the NAS environment.

− In NDMP-based backup, data is sent directly from the NAS head to the
backup device without impacting the application servers.



Data Deduplication

The main objectives of the topic are to:


→ List the components of data deduplication solution.
→ Explain deduplication ratio and factors affecting it.
→ Define deduplication granularity.
→ Explain source-based and target-based deduplication.
→ Apply various deduplication techniques to address the organization’s
challenges and requirements.


Data Deduplication Overview

Data Deduplication Overview

Objectives

The objectives of the topic are to:

• List the key data deduplication components.


• Explain the data deduplication and backup processes.
• Define the hardware and software-based deduplication.
• Explain deduplication ratio and the factors affecting it.

Why Do We Need Data Deduplication?

(Image: A backup device storing multiple duplicate copies of the same document.)

In a data center environment, a high percentage of the data retained on backup
media is redundant. The typical backup process for most organizations

consists of a series of daily incremental backups and weekly full backups. Daily
backups are usually retained for a few weeks and weekly full backups are retained
for several months. Due to this process, multiple copies of identical or slowly-
changing data are retained on backup media, leading to a high level of data
redundancy.

Challenges of duplicate data in a data center:

• Difficult to protect the data within the budget.
• Impacts the backup window.
• Increases network bandwidth requirements.

Data deduplication is the process of detecting and identifying the unique data
segments (chunks) within a given set of data to eliminate redundancy. Only one
copy of the data is stored; subsequent copies are replaced with a pointer to the
original data. Deduplication addresses all of these challenges.


Deduplication Ratio

The effectiveness of data deduplication is expressed as a deduplication or
reduction ratio, denoting the ratio of the amount of data before deduplication to
the amount of data after deduplication.


Factors affecting the deduplication ratio:

• Retention period: The longer the data retention period, the greater the
chance of identical data existing in the backup.
• Frequency of full backup: The more frequently full backups are conducted,
the greater the advantage of deduplication.
• Change rate: The fewer the changes to the content between backups, the
greater the efficiency of deduplication.
• Data type: The more unique the data, the less intrinsic duplication exists.
• Deduplication method: Variable-length, sub-file deduplication discovers the
highest amount of deduplication across an organization.
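As a quick illustration of the ratio arithmetic, the sketch below uses invented figures (30 daily 1 TB full backups with roughly a 5% daily change rate); the numbers are not from the course:

```python
def dedup_ratio(bytes_before: int, bytes_after: int) -> float:
    """Ratio of logical data written by backups to physical data stored."""
    if bytes_after <= 0:
        raise ValueError("bytes_after must be positive")
    return bytes_before / bytes_after

# 30 daily 1 TB full backups, but only ~5% of the data changes each day:
logical = 30 * 1_000            # GB written by the backup application
physical = 1_000 + 29 * 50      # first full, plus ~5% unique data per day
print(f"{dedup_ratio(logical, physical):.1f}:1")  # 12.2:1
```

Longer retention and more frequent fulls raise the "before" side, which is why those factors increase the ratio.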


Key Benefits of Data Deduplication

The key benefits of data deduplication are as follows:

1: By eliminating redundant data from the backup, the infrastructure requirement
is minimized. Data deduplication directly results in reduced storage capacities to
hold backup images. Smaller capacity requirements mean lower acquisition costs as
well as reduced power and cooling costs.


2: As data deduplication reduces the amount of content in the daily backup, users
can extend their retention policies. This can have a significant benefit to users who
currently require longer retention.

3: Data deduplication eliminates redundant content of backup data, which results
in backing up less data and a reduced backup window.

4: By utilizing data deduplication at the client, redundant data is removed before
the data is transferred over the network. This considerably reduces the network
bandwidth required for sending backup data to a remote site for DR purposes.

Example: Data Deduplication using PowerProtect DD

Data deduplication is performed inline by software running on the general-purpose
CPUs of the PowerProtect DD controller, which writes the deduplicated data to the
backup appliance. Application Agents are installed on application or database host
servers to manage protection using PowerProtect Data Manager. These Agents are
commonly known as Data Domain Boost Enterprise Agents (DDBEA) for databases
and applications.

DD Boost is a software that improves the interactions of backup servers and clients
with a Data Domain backup appliance. The DD Boost makes the data deduplication
process distributed, so that there is faster data throughput and reduced server CPU
utilization. The File System agent allows an application administrator to protect and
recover data on the file system host. PowerProtect Data Manager integrates with
the File System agent to check and monitor backup compliance against protection
policies. PowerProtect Data Manager also allows central scheduling for backups.


(Image: An application server running deduplication client software (agent)
sending backup data to a PowerProtect DD appliance as the backup target.)

Example: Data Deduplication and Backup Process - with DD Boost

If the system doesn't have DD Boost, the Data Domain performs inline data
deduplication. As and when the files and data are sent over the network, the DD
deduplicates the data using RAM and CPU, writing only the unique data chunks to
the backup target.

With DD Boost, a considerable portion of the deduplication can occur before the
data is sent across to the Data Domain. The backup source takes the data,
segments it out, compares it with segments already on the Data Domain, and only
sends over new, unique segments.

Step 1

Client agent checks the file system and determines if a file has been backed up
before.



Step 2

Modified files are broken into chunks and hashed.

Step 3

Hashes are compared with chunks already existing on the Data Domain.

Step 4

Only new and unique data chunks are backed up on the Data Domain.
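Steps 1-4 can be sketched as a toy source-side deduplication loop. This is a hedged illustration, not DD Boost's actual algorithm: real systems use variable-length segmentation and a persistent index, and the fixed `CHUNK_SIZE` here is an assumption for the demo:

```python
import hashlib

CHUNK_SIZE = 8 * 1024  # illustrative fixed size; DD actually segments variably

def backup(data: bytes, target_index: dict) -> int:
    """Send only chunks whose hashes are not already on the target.
    Returns the number of bytes actually transferred."""
    sent = 0
    for off in range(0, len(data), CHUNK_SIZE):
        chunk = data[off:off + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()   # step 2: chunk and hash
        if digest not in target_index:               # step 3: compare with target
            target_index[digest] = chunk             # step 4: send the unique chunk
            sent += len(chunk)
    return sent

data = b"".join(bytes([i]) * CHUNK_SIZE for i in range(4))  # four distinct chunks
index: dict = {}
print(backup(data, index))  # 32768  (all four chunks are new)
print(backup(data, index))  # 0      (all four already on the target)
```

The second call transfers nothing, which is the bandwidth saving the text attributes to DD Boost.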


Knowledge Check: Data Deduplication Overview

Knowledge Check Question

1. Which factor affects the deduplication ratio?


a. Type of backup media
b. Data type
c. Size of backup media


Deduplication Granularity and Methods

Deduplication Granularity and Methods

Objectives

The objectives of the topic are to:

• Define deduplication granularity.


• Explain source and target-based deduplication.
• Illustrate the deduplication use case: Disaster Recovery (DR).

Deduplication Granularity

Data deduplication has dramatically improved the value proposition of data
protection as well as remote and branch office backup consolidation and disaster
recovery strategies. Some deduplication approaches operate at the file level, while
others go deeper to examine data at a sub-file or block level.

Deduplication can even happen at the object level. Determining uniqueness at the
file, block, or object level offers benefits, though results vary.


File-level Deduplication

The key characteristics of file-level deduplication are as follows:

(Image: Two clients store the same file; after file-level deduplication only one
copy is backed up. Even a single-letter change makes the file a new file, so the
changed copy from client 2 is also backed up.)

• Detects and removes redundant copies of identical files in a backup


environment.
• Only one copy of the file is stored; the subsequent copies are replaced with a
pointer to the original file.
• Very effective for documents, spreadsheets, etc., where multiple users save
copies of the same file.
• Small change in a file results in another copy of the file.
• Does not address the problem of duplicate content inside the files.
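A minimal sketch of file-level single instancing, assuming whole-file SHA-256 digests as identifiers (an implementation choice for the demo, not taken from the course):

```python
import hashlib

store: dict = {}      # file hash -> the single stored copy
catalog: dict = {}    # file name -> hash (a pointer, not another copy)

def backup_file(name: str, content: bytes) -> bool:
    """Return True if a new copy had to be stored, False if deduplicated."""
    digest = hashlib.sha256(content).hexdigest()
    catalog[name] = digest
    if digest in store:
        return False              # identical file already backed up
    store[digest] = content
    return True

backup_file("client1/report.doc", b"quarterly results")   # stored
backup_file("client2/report.doc", b"quarterly results")   # deduplicated
backup_file("client2/report.doc", b"Quarterly results")   # one letter changed: stored again
print(len(store))  # 2
```

The third call shows the limitation from the bullets above: one changed letter produces a new hash, so the whole file is stored again.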


Block-level Deduplication - Fixed-length

(Image: With fixed-length deduplication, only a changed segment is treated as
unique data and backed up; unchanged segments are duplicates and are not backed
up. If an insertion shifts every segment boundary, all segments appear unique and
all of them are backed up.)


The key characteristics of block-level deduplication-fixed-length are as follows:

• Breaks files down to smaller segments and fixes the chunking at a specific size,
for example 8 KB or maybe 64 KB.
• Detects redundant data within and across files.

Fixed-length blocks may miss many opportunities to discover redundant data
because the block boundaries of similar data may differ.
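The boundary problem can be demonstrated in a few lines; the 8-byte chunk size is deliberately tiny for illustration:

```python
import hashlib

def fixed_chunks(data: bytes, size: int = 8):
    """Split data into fixed-length chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def hashes(chunks):
    return {hashlib.sha256(c).hexdigest() for c in chunks}

original = b"ABCDEFGHIJKLMNOPQRSTUVWX"        # three 8-byte chunks
shifted = b"!" + original                      # one byte inserted at the front

common = hashes(fixed_chunks(original)) & hashes(fixed_chunks(shifted))
print(len(common))  # 0 -- every boundary moved, so no chunk matches
```

One inserted byte shifts every subsequent boundary, so a fixed-length scheme sees no duplicates at all between the two versions.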


Block-level Deduplication - Variable-length

(Image: With variable-length deduplication, a change in one block only adjusts
that block's boundary, leaving the remaining blocks unchanged, so only the
changed block is backed up.)

The key characteristics of variable-length block-level deduplication are as follows:

• The lengths of the segments vary, providing greater storage efficiency for
redundant data regardless of where new data has been inserted.
• If there is a change in the block, then the boundary for that block is only
adjusted, leaving the remaining blocks unchanged.
• It yields a greater granularity in identifying duplicate data.
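As an illustrative sketch (not any vendor's algorithm), a tiny content-defined chunker: boundaries are chosen from the data itself (here, whenever the sum of the last few bytes matches a mask), so an insertion only disturbs the chunk it lands in. The window, mask, and sample data are all invented for the demo; real products use Rabin-style rolling fingerprints:

```python
import hashlib

WINDOW, MASK = 4, 0x0F   # tiny illustrative values

def cdc_chunks(data: bytes):
    """Cut a chunk whenever the sum of the last WINDOW bytes, masked, is zero."""
    chunks, start = [], 0
    for i in range(len(data)):
        if i - start + 1 >= WINDOW and sum(data[i - WINDOW + 1:i + 1]) & MASK == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

def hashes(chunks):
    return {hashlib.sha256(c).hexdigest() for c in chunks}

original = b"hello world\x00\x00\x00\x00the rest of the file"
shifted = b"!" + original            # one byte inserted at the front

common = hashes(cdc_chunks(original)) & hashes(cdc_chunks(shifted))
print(len(common))  # 1 -- chunk boundaries resynchronize after the insertion
```

Because a cut decision depends only on the bytes in the rolling window, the boundaries downstream of the insertion land in the same places, and the unchanged chunk still deduplicates.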



Object-level Deduplication

Object-level deduplication is also referred to as single-instance storage. In an
object-based storage system, a file is stored as an object. Rather than accessing
an object by its file name at a physical location, an object-based storage device
uses an object ID (signature) that is derived from each object's unique binary
representation to store and retrieve the object.

This object ID is unique, ensuring that only one protected copy of the content is
stored (single-instance storage), no matter how many times clients store the same
information. This significantly reduces the total amount of data stored and is a
key factor in lowering the cost of storing and managing content.

At write time, the object-based storage system is polled to see if it already has
an object with the same signature. If the object is already on the system, it is
not stored; rather, only a pointer to that object is created.
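The poll-before-store behavior can be sketched as a tiny content-addressed store. Using a SHA-256 digest as the object ID is a common design but an assumption here, not a detail from the course:

```python
import hashlib

class SingleInstanceStore:
    """Toy single-instance (content-addressed) object store."""
    def __init__(self):
        self.objects = {}                                # object ID -> content

    def put(self, content: bytes) -> str:
        object_id = hashlib.sha256(content).hexdigest()  # ID from binary content
        if object_id not in self.objects:                # poll before storing
            self.objects[object_id] = content
        return object_id                                 # callers keep the ID as a pointer

store = SingleInstanceStore()
id1 = store.put(b"contract.pdf contents")
id2 = store.put(b"contract.pdf contents")    # same content, from another client
print(id1 == id2, len(store.objects))        # True 1
```

Identical content from any client yields the same ID, so only one protected copy is ever held.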


Deduplication Methods

Data deduplication can be classified based on where it occurs. When deduplication
occurs close to where the data is created (the backup client), it is referred to as
source-based deduplication. When it occurs near where the data is stored (the
backup device), it is referred to as target-based deduplication. In target-based
deduplication, the deduplication can happen inline or post-process.

Source-based Deduplication


In this deduplication method, the data is deduplicated at the source (backup
client). The backup client sends only new, unique segments across the network.
This is suitable for environments where storage and network bandwidth are
constrained. However, it may require a change of backup software if this option is
not supported by the existing backup software. Source-based deduplication
consumes CPU cycles on the client and may impact application performance, so it
is recommended for remote office/branch office environments for performing
centralized backup.


Target-based Deduplication

The characteristics of target-based deduplication are as follows:

• Data is deduplicated at the target.


• Supports current backup environment and no operational changes are required.
• Client is not affected since deduplication process takes place at target.
• Requires sufficient network bandwidth to send data across LAN or WAN during
the backup.
• Data is deduplicated at the backup device, either inline or post-process.


Deduplication Use Case: Disaster Recovery

Typically, organizations maintain a copy of data at the remote site (DR site or
cloud) for DR purpose. If the primary site goes down due to disaster or any other
reasons, the data at the remote site will enable restoring of services and data to the
primary site. Data deduplication can enhance DR because of the following reasons:

• Deduplication significantly reduces the network bandwidth to transfer the data


from the primary site to the remote site (DR site or Cloud) for DR purpose.
• Deduplication also reduces the storage requirement at the remote site.

(Image: Deduplicated data is replicated from the primary site to the DR site.)


Knowledge Check: Deduplication Granularity and Methods

Knowledge Check Question

2. Match the following:

1. Source-based deduplication → Data is deduplicated at the backup client.
2. Target-based deduplication → Requires sufficient network bandwidth to send
duplicate data across the LAN or WAN during the backup.
3. Fixed-length deduplication → Detects redundant data within and across files.
4. File-level deduplication → Does not address the problem of duplicate content
across the files.


Concepts in Practice

Concepts in Practice

Dell EMC PowerProtect DD Series Appliance with DD Boost

DD series enables organizations to protect, manage, and recover data at scale
across their diverse environments. DD series integrates easily with existing
infrastructures, enabling ease of use with leading backup and archiving
applications, and offers superior performance in conjunction with PowerProtect
Data Manager and the Data Protection Suite.

DD Boost software delivers an advanced level of integration with backup
applications and database utilities, enhancing performance and ease of use.
Rather than sending all data to the system for deduplication processing, DD Boost
enables the backup server or application client to send only unique data segments
across the network to the system.


Exercise - Data Deduplication

Exercise - Data Deduplication



1. Present Scenario:

An organization runs business applications in a data center. The organization:

• Has multiple remote/branch offices (ROBO) across different locations.

• Stores application data on SAN-based storage systems in the data center.

• Currently uses disk as their backup storage media for backing up application
data.

• Uses tapes for protecting data at the remote site for DR purpose.

2. Organization’s Challenges:

• Backup and production environments have a huge amount of redundant data,
which increases the infrastructure cost and impacts the backup window.

• Backing up data from branch offices to a centralized data center is restricted
due to the time and cost involved in sending huge volumes of data over the
WAN.

• Sending tapes to offsite locations would increase the risk of losing sensitive
data.

3. Organization’s Requirements:

• Need to eliminate redundant copies of data in both production and backup
environments.

• During backup, the business-critical applications should not be impacted.

• Need an effective solution to address the backup challenges of remote and
branch offices.

• Need an effective solution to address the challenges of remote site backup
using tapes for DR purposes.


4. Expected Deliverables:

Propose a solution to address the organization’s challenges and requirements.

Solution

The proposed solution is as follows:

Implement a deduplication solution to eliminate redundant data.

• Implement a target-based deduplication solution for business-critical
applications. This will not impact the performance of business-critical
applications.
• Implement source-based deduplication at branch offices. This will eliminate
the challenges associated with centrally backing up branch office data and
considerably reduce the required network bandwidth.
• The organization can transfer deduplicated data over the WAN to the remote
site. This eliminates the need for shipping tapes to the remote site and
reduces network bandwidth requirements.
• The organization can also utilize the deduplication capability of their SAN
storage. This will reduce redundant data in production and reduce primary
storage cost.



Replication

The main objectives of the topic are to:


→ Describe the primary uses of replica and its characteristics.
→ Describe local replication solutions.
→ Describe remote replication solutions.
→ Apply various replication techniques to address the organization’s
challenges and requirements.


Data Replication Overview

Data Replication Overview

Objectives

The objectives of the topic are to:


→ Understand the primary uses of replica.
→ Define different characteristics of replica.
→ Describe various methods to ensure replica consistency.
→ Describe different types of replication.

Introduction to Data Replication

(Image: Servers in Data Center A write to primary storage; the data is replicated
over a connectivity link to a replica in Data Center B, and can also be replicated
to the cloud.)

A data replication solution is one of the key data protection solutions. It:

• Enables organizations to achieve business continuity, high availability, and
data protection.
• Creates an exact copy (replica) of data. These replicas are used to restore and
restart operations if data loss occurs.


− For example, if a production VM goes down, then the replica VM can be


used to restart the production operations with minimum disruption.
• Replicas are categorized by two characteristics: recoverability58 and restartability59.


Primary Uses of Replicas

Replicas are created for various purposes, including the following:

• Can act as a source for backup.
• Can be used to restart business operations or to recover the data.
• Used for running decision support activities.
• Used for testing applications.
• Used for data migration.


58 Enables restoration of data to the source if there is a data loss at the source.

59 Enables restarting business operations on the replica if the source is
unavailable for some reason.


Methods to Ensure Replica Consistency

Consistency ensures the usability of replica devices.

• File system:
− Offline: Unmount the file system.
− Online: Flush the compute system buffers.
• Database:
− Offline: Shut down the database.
− Online: Use the dependent write I/O principle, or hold I/O to the source
before creating the replica.


Types of Replication

Replication can be classified into two major categories: local and remote
replication.


Local replication:

• Refers to replicating data within the same location.
− Within a data center in compute-based replication.
− Within a storage system in storage system-based replication.
• Typically used for operational restore of data in the event of data loss.
• Can be implemented at compute, storage, and network.

Remote replication:

• Refers to replicating data to remote locations (the locations can be
geographically dispersed).
• Data can be synchronously or asynchronously replicated.
• Helps to mitigate the risks associated with regional outages.
• Enables organizations to replicate the data to the cloud for DR purposes.
• Can be implemented at compute, storage, and network.



Knowledge Check: Data Replication Overview

Knowledge Check Question

1. Does local replication provide a solution for disaster recovery?


a. Yes
b. No


Local Replication

Local Replication

Objectives

The objectives of the topic are to:


→ Understand the concept of file system snapshot.
→ Describe VM snapshot.
→ Describe VM clone.
→ Understand various key components of continuous data protection (CDP).
→ Describe local CDP replication.
→ Describe hypervisor-based CDP replication.

Local Replication Overview

(Image: A storage system in which LUN A is locally replicated to LUN B.)

• In local replication, the replication is performed within the storage system. In
other words, the source and the target logical unit numbers (LUNs) reside on
the same storage system.

− Local replication enables operational recovery in the event of data loss and
also supports other business operations such as backup.


File System Snapshot

A file system (FS) snapshot creates a copy of a file system at a specific point in
time, even while the original file system continues to be updated and used
normally.

• When a snapshot is created, a bitmap and a blockmap are created in the
metadata of the snapshot FS. The bitmap is used to keep track of blocks that
are changed on the production FS after the snapshot creation.
• After the creation of the FS snapshot, reads from the snapshot are served by
consulting the bitmap:

− If the bit is 0, the read is directed to the production FS.
− If the bit is 1, the block address is obtained from the blockmap and the data
is read from that address on the snapshot FS.

(Image: A production file system with three snapshots, FS Snapshot 1, 2, and 3,
providing Monday, Tuesday, and Wednesday point-in-time views.)
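The bitmap/blockmap read path described above can be sketched as follows; the dict-based structures are illustrative stand-ins for the on-disk metadata:

```python
def read_snapshot_block(block: int, bitmap: dict, blockmap: dict,
                        production: dict, snapshot_store: dict) -> bytes:
    """Serve a read from a snapshot using its bitmap and blockmap.

    bitmap bit 0: block unchanged since the snapshot -> read the production FS.
    bitmap bit 1: block was changed -> read the preserved copy at the address
    recorded in the blockmap.
    """
    if bitmap.get(block, 0) == 0:
        return production[block]
    return snapshot_store[blockmap[block]]

production = {0: b"jan", 1: b"NEW"}          # block 1 overwritten after snapshot
snapshot_store = {7: b"old"}                 # preserved original of block 1
bitmap, blockmap = {1: 1}, {1: 7}
print(read_snapshot_block(0, bitmap, blockmap, production, snapshot_store))  # b'jan'
print(read_snapshot_block(1, bitmap, blockmap, production, snapshot_store))  # b'old'
```

Unchanged blocks are never copied, which is why snapshots consume so little space.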



VM Snapshot

• A VM snapshot preserves the state60 and data61 of a VM at a specific PIT.

− This includes disks, memory, and other devices, such as virtual network
interface cards.
• Snapshots are useful for quick restore of a VM, given the small amount of
storage space they consume.

− For example, an administrator can create a snapshot of a VM, then make
changes such as applying patches and software upgrades to the VM.
− If anything goes wrong, the administrator can simply restore the VM to its
previous state using the VM snapshot.

(Image: A VM snapshot chain: the base image (parent virtual disk) plus child
virtual disks 1, 2, and 3, each holding the blocks changed since the previous
snapshot; the virtual machine writes to the newest child disk.)

60The state includes the VM’s power state (for example, powered-on, powered-off,
or suspended).

61 The data includes all of the files that make up the VM.
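Reading through such a snapshot chain can be sketched as a lookup that walks from the newest child disk back to the base image; the block layout below is an invented toy model:

```python
def read_block(block: int, chain: list) -> bytes:
    """Resolve a read through a snapshot chain: newest child disk first,
    falling back through older snapshots and finally the base (parent) disk."""
    for disk in reversed(chain):        # chain[0] is the base image
        if block in disk:
            return disk[block]
    raise KeyError(block)

base  = {0: b"os", 1: b"app", 2: b"data"}
snap1 = {2: b"data-v2"}                 # blocks changed after snapshot 1 was taken
snap2 = {1: b"app-v2"}                  # blocks changed after snapshot 2 was taken
chain = [base, snap1, snap2]
print(read_block(0, chain), read_block(1, chain), read_block(2, chain))
# b'os' b'app-v2' b'data-v2'
```

Reverting to a snapshot amounts to discarding the child disks created after it, which is why restore from a snapshot is fast.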


VM Clone

(Image: A full clone copies both the VM configuration and the VM disk; a linked
clone copies the VM configuration but uses a delta disk that references the parent
VM's disk.)

• A clone is a copy of an existing virtual machine (the parent VM).

− The clone VM's MAC address is different from the parent VM's.
• Typically, clones are deployed when many identical VMs are required.
− Cloning reduces the time required to deploy a new VM.
• There are two types of clones:

− Full clone: An independent copy of a VM that shares nothing with the
parent VM.
− Linked clone: Created from a snapshot of the parent VM.



Full Volume Replication- Clone

Full volume local replication provides the ability to create fully populated point-
in-time copies of LUNs within a storage system.

• When the replication session is started, an initial synchronization62 is
performed between the source LUN and the replica (clone).
− During the synchronization process, the replica is not available for any
compute system access. Once the synchronization is completed, the replica
is exactly the same as the source LUN.
• The replica can be detached (fractured) from the source LUN and made
available to another compute system for business operations.
− After detachment, the changes made to both the source and the replica can
be tracked at some predefined granularity.

62 Synchronization is the process of copying data from the source LUN to the clone.


• The tracking table enables incremental resynchronization (source to target) or
incremental restore (target to source).
• Re-synchronization happens from the source to the target. In full volume
replication, the clone must be of the same size as the source LUN.
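The tracking-table-driven incremental resynchronization can be sketched as follows; representing the tracking table as a set of changed block numbers is an assumption for the demo:

```python
def resynchronize(source: dict, clone: dict, tracking: set) -> int:
    """Incremental resync: copy only the blocks recorded in the tracking table
    from the source LUN to the fractured clone. Returns blocks transferred."""
    copied = 0
    for block in sorted(tracking):
        clone[block] = source[block]
        copied += 1
    tracking.clear()                  # the clone is consistent with the source again
    return copied

source = {0: b"s0", 1: b"s1", 2: b"s2"}
clone = dict(source)                  # state right after the initial full sync
source[1] = b"s1-new"                 # change made after the clone was fractured
tracking = {1}                        # the tracking table records the changed block
print(resynchronize(source, clone, tracking), clone[1])  # 1 b's1-new'
```

Only one of the three blocks is copied; an incremental restore would run the same loop in the opposite direction (target to source).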

Pointer-based Virtual Replication – Snapshot

[Image: Pointer-based virtual replication. A replication session associates the source with a snapshot; a save location in the storage system holds data for the snapshot.]

In pointer-based virtual replication:

• When the replication session (the time duration in which the replication
happens) is activated:

− Target contains pointers to the location of the data on the source.


− Target does not contain data at any time. Therefore, the target is known as a
virtual replica.
In a pointer-based replication:

• Target is immediately accessible after the replication session activation.


− A predefined area in the storage system (save location) is used to store the
original data or the new data, based on the snapshot implementation.
• Can be implemented by using a technique named redirect on write (RoW).
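As a rough illustration of how redirect on write can work (a simplified, hypothetical model; real implementations differ in pointer layout and save-location management):

```python
class RedirectOnWriteVolume:
    """Toy redirect-on-write (RoW) snapshot.

    The snapshot is only a table of pointers taken at activation time; it
    holds no data itself (a virtual replica). A new write is redirected to
    the save location and the production pointer is updated, so the
    original block is never overwritten.
    """

    def __init__(self, blocks):
        self.store = {i: data for i, data in enumerate(blocks)}  # physical blocks
        self.prod_map = {i: i for i in range(len(blocks))}       # production pointers
        self.snap_map = None                                     # snapshot pointers
        self.next_loc = len(blocks)                              # save-location cursor

    def activate_snapshot(self):
        self.snap_map = dict(self.prod_map)   # copy pointers only, not data

    def write(self, block, data):
        # Redirect the write to a fresh location in the save area.
        self.store[self.next_loc] = data
        self.prod_map[block] = self.next_loc
        self.next_loc += 1

    def read_production(self, block):
        return self.store[self.prod_map[block]]

    def read_snapshot(self, block):
        return self.store[self.snap_map[block]]


vol = RedirectOnWriteVolume(["a", "b", "c"])
vol.activate_snapshot()
vol.write(0, "A")   # redirected; the original "a" stays in place for the snapshot
```

Because only pointers are copied at activation, the snapshot is available immediately, matching the behavior described above.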


Continuous Data Protection (CDP)

• CDP provides the capability to restore data and VMs to any previous point-in-
time (PIT).

− Data changes are continuously captured and stored in a separate location
from the production volume so that the data can be restored to any previous
PIT.
The key benefits of CDP are as follows:

Click on each label to get more information about benefits.

1: CDP provides continuous replication and tracks all the changes to the
production volumes, enabling recovery to any point-in-time.

2: CDP solutions have the capability to replicate data across heterogeneous


storage systems.

3: CDP supports both local and remote replication of data and VMs to meet
operational recovery and disaster recovery requirements, respectively.

4: CDP supports various WAN optimization techniques (deduplication,


compression, and fast write) to reduce bandwidth requirements and also optimally
utilizes the available bandwidth.

5: CDP supports multi-site replication, where the data can be replicated to more
than two sites using synchronous and asynchronous replication.


Key CDP Components

Local CDP Replication Operations

In this method, the replica is first synchronized with the source, and then the
replication process starts. After the replication starts:

[Image: Local CDP replication operations. The write splitter on the compute system creates a copy of each write I/O and sends it to both the CDP appliance and the production volume in the storage system. The CDP appliance writes the data to the journal volume along with its timestamp, and data from the journal is written to the replica.]

• All the writes to the source are split into two copies.
− One of the copies is sent to the CDP appliance and the other to the
production volume.
• CDP appliance writes the data to the journal volume.


• Data from the journal volume is sent to the replica at predefined intervals.
• While recovering data to the source, the appliance restores data from the
replica and applies journal entries up to the point-in-time chosen for recovery.
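The journal-and-timestamp behavior described above might be sketched as follows (a toy model; the names and integer timestamps are assumptions for illustration only):

```python
class CDPJournal:
    """Toy CDP model: every split write lands in a journal with its
    timestamp, and recovery replays journal entries up to the chosen PIT."""

    def __init__(self, baseline):
        self.replica = dict(baseline)   # replica synchronized with the source
        self.journal = []               # (timestamp, block, data), in time order

    def split_write(self, timestamp, block, data):
        # The write splitter sends one copy here in addition to the
        # production volume (the production path is not modeled).
        self.journal.append((timestamp, block, data))

    def recover(self, point_in_time):
        """Return the volume contents as of the requested PIT."""
        image = dict(self.replica)
        for ts, block, data in self.journal:
            if ts > point_in_time:
                break                   # journal entries are kept in time order
            image[block] = data
        return image


journal = CDPJournal({0: "a", 1: "b"})
journal.split_write(10, 0, "A")
journal.split_write(20, 1, "B")
```

Any timestamp between two journaled writes yields a distinct recovery image, which is what makes "any previous PIT" recovery possible.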

Hypervisor-based CDP Implementation – Local Replication

[Image: Hypervisor-based CDP – local replication. A virtual appliance runs on the hypervisor with a write splitter embedded in it; writes to the source volume's VM disk files are recorded in a journal and applied to a local replica.]

Some vendors offer continuous data protection for VMs through hypervisor-based
CDP implementation. This deployment option:

• Protects a single VM or multiple VMs locally or remotely.

• Enables restoring a VM to any PIT.
• A virtual appliance runs on the hypervisor.
• The write splitter is embedded in the hypervisor.

The image shows a hypervisor-based CDP implementation.


Knowledge Check: Local Replication

Knowledge Check Question

2. Which of the following statements is incorrect about VM clones?


a. Clone is a copy of an existing parent VM
b. Clones are deployed when many identical VMs are required
c. Clone VM’s MAC address is the same as that of the parent VM
d. Clones do not affect the parent VM during any type of changes

Knowledge Check Question

3. Which CDP component holds snapshots of the data to be replicated?


a. CDP appliance
b. Replica volume
c. Journal volume
d. Write splitter


Remote Replication

Remote Replication

Objectives

The objectives of the topic are to:


→ Explain remote replication.
→ Describe synchronous and asynchronous replication.
→ Understand the concepts of multi-site replication.
→ Understand the concept of hypervisor-based remote replication.
→ Describe remote CDP replication operations.

Remote Replication Overview

[Image: VM replication from a source site to a remote site.]

• In remote replication, the storage system operating environment performs the


replication process.

− One of the storage systems is in the source site and the other system is in
the remote site for DR purposes. Data can be transmitted from the source
storage system to the target system over a shared or a dedicated network.


− Replication between storage systems may be performed in synchronous or


asynchronous modes.
− Hypervisor-based remote replication replicates VMs between a primary site
and a remote site.
o Initial synchronization is required between the source and the target; it
copies all the data from the source to the target.
o After the initial synchronization, only the changes are replicated; this
reduces network utilization.

Remote Replication

Remote replication can be performed in synchronous and asynchronous mode.

Synchronous

• Writes must be committed to the source and the target prior to acknowledging
“write complete” to the production compute system.
− Provides near zero RPO.
• The image shown illustrates an example of synchronous remote replication. If
the source site is unavailable due to a disaster, then the service can be
restarted immediately in the remote site to meet the required SLA.


To learn more about synchronous remote replication, click here.

Asynchronous

• A write from a production compute system is committed to the source and


immediately acknowledged to the compute system.

− Data is buffered at the source and sent to the remote site periodically.
− Replica will be behind the source by a finite amount (finite RPO).
To learn more about asynchronous remote replication, click here.
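The difference between the two modes can be sketched as follows (a deliberately simplified model; the function names are illustrative, and real replication operates on storage systems, not Python dictionaries):

```python
def sync_write(source, target, block, data):
    """Synchronous mode: the write is committed to both the source and the
    target before "write complete" is acknowledged -> near-zero RPO, but the
    acknowledgment waits on the remote round trip."""
    source[block] = data
    target[block] = data          # remote commit happens before the ack
    return "write complete"


def async_write(source, buffer, block, data):
    """Asynchronous mode: the write is committed to the source and
    acknowledged immediately; data is buffered and sent to the remote
    site periodically -> finite RPO."""
    source[block] = data
    buffer.append((block, data))  # shipped to the remote site later
    return "write complete"


def drain(buffer, target):
    """Periodic transfer of buffered writes to the remote replica."""
    while buffer:
        block, data = buffer.pop(0)
        target[block] = data


source, target, buffer = {}, {}, []
async_write(source, buffer, 0, "x")   # replica lags the source until the next drain
```

The buffered writes are exactly the "finite amount" by which the asynchronous replica trails the source.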


Multi-site Replication

[Image: Multi-site replication. The production compute system writes to the source storage system at the source site. The source is replicated synchronously to a replica at remote site 1 and asynchronously to a replica at remote site 2; asynchronous replication with differential resynchronization links the two remote sites.]

In a two-site synchronous replication, the source and the target sites are usually
within a short distance.

• In synchronous replication, if a regional disaster occurs, both the source and the
target sites might become unavailable.
• In asynchronous replication, if the source site fails, production can be shifted to
the target site, but there will be no further remote protection of data until the
failure is resolved.

Multi-site replication mitigates the risks identified in two-site replication. In a multi-


site replication:

• Data from the source site is replicated to multiple remote sites for DR
purposes.
− Disaster recovery protection is always available if any one-site failure
occurs.
• Mitigates the risk in two-site replication of having no DR protection after a
source or remote site failure.


In this approach, data at the source is replicated to two different storage systems at
two different sites, as shown in the image. The source to remote site 1 (target 1)
replication is synchronous with a near-zero RPO. The source to remote site 2
(target 2) replication is asynchronous with an RPO in the order of minutes.

To learn more about multi-site replication, click here.

Remote CDP Replication Operations

[Image: Remote CDP replication operations. At the source site, the write splitter on the compute system creates a copy of each write and sends it to the local CDP appliance and the production volume (1); data is sequenced, compressed, and replicated to the remote CDP appliance (2). At the remote site, data is received, uncompressed, and sequenced (3), written to the journal (4), and copied to the remote replica (5).]

In this method, the replica is synchronized with the source, and then the replication
process starts. After the replication starts:

• All the writes from the host to the source are split into two copies.
− Write splitter creates a copy of a write data and sends it to the CDP
appliance and production volume.
• Data is sequenced, compressed, and replicated to the remote appliance.
• Data is received, uncompressed, and sequenced.
• Data is written to the journal.
• Data is copied to the remote replica.

For more information about remote replication CDP operation, click here.
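The steps above can be sketched as a toy pipeline (zlib and JSON stand in for the appliance's compression and sequencing; this is an illustration, not an actual appliance protocol):

```python
import json
import zlib


def remote_cdp_replicate(writes, journal, remote_replica):
    """Toy model of steps 2-5: sequence and compress the split writes,
    ship them, then uncompress, journal, and apply them at the remote site."""
    # Step 2: the local appliance sequences and compresses the writes.
    payload = zlib.compress(json.dumps(sorted(writes)).encode())

    # Step 3: the remote appliance receives, uncompresses, and re-sequences.
    received = json.loads(zlib.decompress(payload).decode())

    # Step 4: data is written to the journal.
    journal.extend(received)

    # Step 5: journal entries are copied to the remote replica.
    for seq, block, data in received:
        remote_replica[block] = data


journal, replica = [], {}
writes = [(2, 1, "B"), (1, 0, "A")]   # (sequence number, block, data)
remote_cdp_replicate(writes, journal, replica)
```

Compressing before shipment is one of the WAN optimization techniques mentioned earlier for reducing bandwidth requirements.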


Knowledge Check: Remote Replication

Knowledge Check Question

4. Match the following elements with their functions:

A. Multi-site Replication: Disaster recovery protection is always available if
any one-site failure occurs.

B. Asynchronous Replication: Data is buffered at the source and sent to the
remote site periodically.

C. Synchronous Replication: Writes must be committed to the source and the
target prior to acknowledging “write complete” to compute.

Knowledge Check Question

5. In hypervisor-based remote replication, initial synchronization is required


between the source system and the target system.
a. Yes
b. No


Concepts in Practice

Concepts in Practice

SnapVX

TimeFinder SnapVX is a local replication solution with cloud scalable snaps and
clones to protect your data. SnapVX solution:

• Provides space-efficient local snapshots that can be used for localized
protection and recovery and for other use cases including development/test,
analytics, backups, and patching.
• Secure snapshots prevent accidental or malicious deletion, securing them for a
specified retention period.

− The snapshots are made as efficient as possible by sharing point-in-time


tracks which are called snapshot deltas.

SRDF

SRDF is Dell EMC’s Remote Replication technology that enables the remote
mirroring of a data center with minimal impact to the performance of the production
application. SRDF replication products:

• Provides disaster recovery and data mobility solutions for the PowerMax and
VMAX Family storage arrays.
• The copy process between the sites is accomplished independently, without
involving the host.
− There are no limits to the distance between the source and the target
copies.
• Enables storage systems to be in the same room, different buildings, or
hundreds to thousands of kilometers apart.
• Offers the ability to maintain multiple, host-independent, remotely mirrored
copies of data.

For detailed information about key PowerMax and VMAX Family Remote
Replication options, click here.


RecoverPoint

Dell EMC RecoverPoint provides continuous data protection for comprehensive


operational and disaster recovery.

• RecoverPoint for Virtual Machines is a hypervisor-based, software-only data


protection solution for Virtual Machines.
• RecoverPoint delivers benefits including the ability to:
− Enable Continuous Data Protection for any point in time (PIT) recovery to
optimize RPO and RTO.
− Ensure recovery consistency for interdependent applications.
− Provide synchronous (sync) or asynchronous (async) replication policies.
− Reduce WAN bandwidth consumption and utilize available bandwidth
optimally.
• Consists of a physical RecoverPoint Appliance or a Virtual RecoverPoint
Appliance, and the write-splitter embedded in the supported Dell EMC storage
arrays.
• With Dell EMC XtremIO, the data replication is a splitter-less implementation
achieved by leveraging the highly efficient array-based snapshot technology
native to the XtremIO platform.


Exercise- Replication

Exercise- Replication
1. Present Scenario:

• A multinational bank runs a business-critical application that stores data in a


LUN with RAID 1 configuration.

• Application is write-intensive with about 75% write operations.

• Every month-end, the bank runs billing and reporting applications to generate
bills and statements of customers’ accounts.

• The bank has two data centers which are 100 miles apart.

2. Organization’s Challenges:

• The backup window is too long and is negatively impacting the application
performance.

• These billing and reporting applications have a huge impact on the source
volume.

• In the past year, the top management has become extremely concerned
about DR because they do not have any DR plans in place.

3. Organization’s Requirements:

• During billing and reporting, the source volume should not be impacted.

• During backup, the business-critical applications should not be impacted.

• Bank cannot afford any data loss; therefore, needs a disaster recovery
solution with near zero RPO.

4. Expected Deliverables:

• Propose a storage system-based local replication solution to address the


organization’s concern.

• Propose a solution to address the organization’s DR requirements.


Solution

The proposed solution is as follows:

• Deploying a full volume (clone) local replication solution.


− All the data will be available on the replica after synchronization.
− Replica can be used as a source to take backup; this will not impact the
source volume.
− Create one more replica that can be used for billing and reporting.
• To meet the DR requirement, the organization can implement synchronous
remote replication.

− Provides near zero RPO.



Data Archiving


The main objectives of the topic are to:


→ Describe the benefits, architecture, and common regulations for data
archiving.
→ Describe data archiving and retrieval operations.
→ Describe the correlation between storage tiering and archive.
→ Describe email archiving and Content Addressed Storage (CAS).
→ Apply data archiving concepts to address an organization’s challenges and
requirements.


Data Archiving Overview

Data Archiving Overview

Objectives

The objectives of the topic are to:

• Understand the need for data archiving.


• Review the benefits of data archiving.
• Learn about fixed content assets.
• Perform a comparison between backup and archiving.
• Explore archiving architecture and archive storage implementations.
• Explore examples of data archiving regulations.

Why Do We Need Data Archiving?

[Image: Primary storage systems containing active data and fixed data. Fixed data is growing at over 90% annually; data archiving addresses the challenges of keeping fixed data in primary storage.]

What are the challenges of keeping fixed data in primary storage?

• Increasing consumption of expensive primary storage.


• High performance storage for less frequently accessed data.


• Risk of compliance breach.
• Increased data backup window and cost.

To learn more about the need for data archiving, click here.

Data Archiving and Its Benefits

Data archiving moves fixed data63 that is no longer actively accessed to a separate
low-cost archive storage system for long term retention and future reference:

• Saves primary storage capacity.


• Reduces backup window and backup storage cost.
• Moves less frequently accessed data to lower cost archive storage.
• Preserves data for future reference and adherence to regulatory compliance.

To understand more about Data Archiving Benefits, click here.

63 Data that is no longer actively accessed by users. It still, however, needs to be
stored for business and regulatory requirements.


Backup vs. Archiving

Data Backup:

• Secondary copy of data.
• Used for data recovery operations.
• Primary objective: operational recovery and disaster recovery.
• Typically short-term (weeks or months) retention.

Data Archiving:

• Primary copy of data.
• Available for data retrieval.
• Primary objective: compliance adherence and lower cost.
• Long-term (months, years, or decades) retention.

Archiving Architecture

[Image: Archiving architecture. Clients connect to an application server and a file server, each running an archiving agent and backed by primary storage; an archive server (policy engine) directs data movement to archive storage.]

The archiving architecture consists of three key components:


• Archiving agent: Software installed on the application and file servers. The
agent is responsible for scanning the files and archiving them, based on the
policy defined on the archive server (policy engine).
• Archive server: Software installed on a server that enables administrators
to configure policies for archiving data. Organizations set their own policies for
qualifying data to be moved into archive storage. Policies can be defined based
on file size, file type, or creation/modification/access time. Once the files are
identified for archiving, the archive server creates an index for the files. By
utilizing the index, users may search and retrieve their data.
• Archive storage: Stores the fixed data.

Examples of Data Archiving Regulations

SEC Rule 17a-4

• Part of the US Securities Exchange Act of 1934.


• Describes the requirements for data retention, indexing, and accessibility for
companies which deal in the trade or brokering of financial securities such as
stocks, bonds, and futures.
• Mandates that companies retain the records of various types of transactions
for a certain period of time.

Sarbanes-Oxley Act

• Passed in 2002, it protects the shareholders and the general public from
accounting errors and fraudulent practices in the enterprise.
• Created to protect investors by improving the accuracy and reliability of
corporate disclosures.
• Applies to all public companies and accounting firms.
• Not a set of business practices and does not specify how a business should
store records.
• Defines which records are to be retained and for how long.


Health Insurance Portability and Accountability Act

• Was passed in 1996 and is a set of federal regulations establishing national
standards for players in the health care industry, such as health insurance
companies and health care providers.
• Provides guidelines for protection and retention of patient records, including
email.


Knowledge Check: Data Archiving Overview

Knowledge Check Question

1. Which of the following statements are correct? Choose all that apply.
a. Archiving fixed data before taking backup reduces the backup window
b. Primary objectives of archiving are compliance adherence and lower cost
c. Nearline archive makes the data immediately accessible
d. Data archiving must occur outside the application operating time
e. Archiving agent indexes and moves fixed data to high-performance storage


Archiving Operation and Storage

Archiving Operation and Storage

Objectives

The objectives of the topic are to:

• Describe data archiving and retrieval operations.

• Explain the correlation between storage tiering and archive.
• Describe storage tiering policies.
• Explain file movement from NAS to archive.
• Describe email archiving.
• Describe content addressed storage (CAS) features and operations.

Data Archiving Operation

[Image: Data archiving operation. The archive server communicates with clients and primary storage; archived files are moved to archive storage, and stub files remain on the primary storage.]


• Archiving agent scans primary storage to find files that meet the archiving
policy. The archive server indexes the files.
• Once the files have been indexed, they are moved to archive storage and small
stub files are left on the primary storage.

To understand more about Data Archiving Operations, click here.

Data Retrieval Operation

• When a client attempts to access the files through an application or file server,
the stub file is used to retrieve the file from archive storage.
• By utilizing the index for archived files, users may also search and retrieve
files. The retrieval of files from the archive storage is transparent to the clients.
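The stub-and-retrieve behavior might be modeled as follows (a hypothetical sketch; the stub marker and archive-id scheme are invented for illustration, not an actual product's format):

```python
class ArchiveSystem:
    """Toy archive: moving a file leaves a small stub on primary storage,
    and a client read follows the stub to the archive transparently."""

    def __init__(self):
        self.primary = {}   # path -> file content, or a stub marker
        self.archive = {}   # archive id -> file content
        self.index = {}     # searchable index of archived files

    def archive_file(self, path):
        content = self.primary[path]
        archive_id = f"arc-{len(self.archive)}"
        self.archive[archive_id] = content
        self.index[path] = archive_id
        self.primary[path] = ("STUB", archive_id)   # small pointer left behind

    def read(self, path):
        data = self.primary[path]
        if isinstance(data, tuple) and data[0] == "STUB":
            return self.archive[data[1]]            # transparent retrieval
        return data


fs = ArchiveSystem()
fs.primary["report.doc"] = "quarterly numbers"
fs.archive_file("report.doc")
```

From the client's point of view, `read("report.doc")` behaves the same before and after archiving; only the storage location of the content changes.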


Correlating Storage Tiering and Archive

[Image: Storage tiering hierarchy. Tier 1 and tier 2 are performance tiers on primary storage; tier 3 is the archive tier on archive storage.]

Storage tiering is a technique of establishing a hierarchy of storage types (tiers)


and identifying the candidate data to relocate to the appropriate storage type to
meet service level requirements at a low cost.

To learn more about Correlating Storage Tiering and Archive, click here.


Storage Tiering Policy

[Image: Data movement from tier 2 to tier 3 driven by a tiering policy.]

• Movement of data between storage tiers happens based on predefined tiering


policies.
• A tiering policy is a set of rules to move data from a source tier to a destination
tier.

Example: If a policy states “move the files from tier 2 to tier 3 storage that are not
accessed for the last six months,” then all the files in tier 2 storage that match this
condition are moved to tier 3 storage. Multiple rules may also be combined to
create a policy as shown in the image.
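A tiering policy of this kind can be sketched as a set of predicate rules that must all match (an illustrative model, not a real policy engine's syntax):

```python
import time


def qualifies(file_info, rules):
    """A policy is a set of rules; a file moves from the source tier to the
    destination tier only when every rule in the policy matches."""
    return all(rule(file_info) for rule in rules)


# Example policy combining two rules: "move files from tier 2 to tier 3
# that are not accessed for the last six months AND are larger than 1 MB".
SIX_MONTHS = 182 * 24 * 3600
policy = [
    lambda f: time.time() - f["last_access"] > SIX_MONTHS,
    lambda f: f["size"] > 1_000_000,
]

old_big = {"last_access": time.time() - 200 * 24 * 3600, "size": 5_000_000}
recent = {"last_access": time.time() - 10 * 24 * 3600, "size": 5_000_000}
```

Here `old_big` qualifies for movement to tier 3 while `recent` does not, because only one of the two rules matches it.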


Tiering Example: NAS to Archive File Movement

[Image: NAS to archive file movement. (1) The policy engine scans the NAS device. (2) The policy engine creates a stub file on the NAS device. (3) The file is stored in the archive storage system, which is accessed by the application servers.]

The image illustrates an example of file-level storage tiering, where files are moved
from a NAS device (primary storage system) to an archive storage system. The
environment includes a policy engine, where tiering policies are configured.

For more on the Tiering example, click here.

Archiving Use Case: Email Archiving

Government Compliance

• Meets all requirements to produce emails from every individual involved in stock
sales or transfers.


Legal Dispute

• Meets requirement to produce all emails within a specified time period


containing specific keywords, to/from certain people.

Mailbox Space Saving

• Eliminates the time wasted by constantly deleting emails in mailboxes with a


fixed quota.

To learn more about Email Archiving, click here.

Purpose-built Archive Storage – CAS

[Image: Clients connect to an application server, which accesses the CAS device through the CAS API.]

• Content addressed storage (CAS) is an object-based storage device that is


purposely built for storing and managing fixed data.
• Each object stored in CAS is assigned a globally unique content address (digital
fingerprint of the content).
• Application server accesses the CAS device via the CAS API.
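Content addressing can be illustrated with a small sketch (SHA-256 is used here as an example fingerprint; actual CAS products define their own addressing schemes):

```python
import hashlib


class CAS:
    """Toy content addressed storage: each object's address is a digital
    fingerprint of its content. Identical content yields the same address,
    so only a single instance is stored, and alteration is detectable
    because the content would no longer match its address."""

    def __init__(self):
        self.objects = {}

    def put(self, content: bytes) -> str:
        address = hashlib.sha256(content).hexdigest()
        self.objects[address] = content   # a duplicate put lands on the same key
        return address

    def get(self, address: str) -> bytes:
        content = self.objects[address]
        # Content integrity check: recompute the fingerprint on read.
        assert hashlib.sha256(content).hexdigest() == address
        return content


cas = CAS()
addr1 = cas.put(b"patient record 42")
addr2 = cas.put(b"patient record 42")   # same content, same address
```

Storing the same record twice produces one stored object, which is the single instance storage property listed below.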


Key Features of CAS

• Content integrity: Provides assurance that the stored data has not been
altered.

• Content authenticity: Assures the genuineness of the stored content.

• Single instance storage: Uses a unique content address to guarantee the
storage of only a single instance of an object.

• Retention enforcement: Configurable retention settings ensure content is not
erased prior to the expiration of its defined retention period.
To learn more about CAS, click here.

Key Features of CAS (Cont'd)

• Location independence: Physical location of the stored data is irrelevant to
the application that requests the data.

• Data protection: Provides both local and remote protection to the objects
stored on CAS.

• Performance: Uses a unique content address to guarantee the storage of only
a single instance of an object.

• Self-healing: Automatically detects and repairs corrupted objects.

• Audit trails: Keeps track of management activities and any access or
disposition of data.


To get a little more detail about the key features of CAS, click here.


Knowledge Check: Archiving Operation and Storage

Knowledge Check Question

[Image: Data archiving environment with clients, archive servers, primary storage, and archive storage.]

2. In the image shown, in which component of the data archiving environment
does the stub file reside?
a. Primary Storage
b. Clients
c. Archive Servers
d. Archive Storage


Concepts in Practice

Concepts in Practice

Dell EMC Cloud Tier

Dell EMC Cloud Tier provides a solution for long-term retention. With Dell EMC
Cloud Tier (Cloud Tier), DDOS (DD Operating System) can natively tier data to a
public, private, or hybrid cloud for long-term retention, using advanced
deduplication technology that significantly reduces storage footprints. Only unique
data is sent to the cloud, and data lands on the cloud object storage already
deduplicated.

Cloud Tier supports a broad ecosystem of backup and enterprise applications and
a variety of public and private clouds. Cloud Tier enables:

• Cost-effective, long term retention in the cloud.


• Simple, native cloud tiering with no external appliance or cloud gateway
required.
• More efficient transfer of data to and from the cloud, using less bandwidth,
thanks to source-side deduplication.
• Effective and efficient management of capacity across on-premises and cloud
storage, optimizing and reducing the overall cost of storage.


Exercise: Data Archiving

Exercise: Data Archiving


1. Present Scenario:

• The IT infrastructure of a health care organization includes a cluster of six


physical computing systems that are running hypervisors.

• Clustered compute systems host a total of 24 VMs.

• VMs host health care, email, and backup applications; and file servers.

• Physical compute systems are connected to two disk-based, high-


performance storage systems.

• Physical compute systems are also connected to a tape library that is used
as backup storage system.

• One of the storage systems has mostly SSDs while the other has only HDDs.

• Disk-based storage systems have about:

− 20% frequently accessed data


− 40% moderately accessed data
− 40% fixed data
• Each patient record is preserved for seven years even after a patient’s
death.

• Old records are needed when patients revisit the health care organization.

• The organization performs daily backup of all patient records.

• Each backup copy is retained in the tape library for one month and then the
tapes are moved and maintained in a vault.

2. Organization Challenges:

• Storage systems have only 10% storage capacity available for storing new
data.


• Budget constraints prevent buying another high-performance, high-cost


storage system.

• Last year, some of the old records were altered resulting in a delay in
treatment.

− Old records were retrieved by bringing the old tapes from the vault and
making them online.
• A long backup window impacts application performance during peak hours.

• Cost of purchasing and maintaining a large number of tapes often exceeds


budgeted cost.

• Maintaining a large number of tapes poses risks of labeling errors and lost
tapes.

3. Organization Requirements:

• Need to purchase a storage system immediately to meet capacity


requirements.

• Need to ensure that the old records are authentic and are not altered.

• Need faster retrieval of old records in case a patient revisits the


organization.

• Need to reduce the backup window and the associated costs and risks.

• Need to optimize application performance.

4. Expected Deliverables:

Propose a solution that will address the organization’s challenges and


requirements.

Solution

The proposed solution is as follows:

• Deploy a CAS and move fixed data to the CAS.


− CAS will provide content authenticity and integrity.


− CAS will enable faster retrieval of patient records compared to tapes.


− Moving fixed data to the CAS will reduce the backup window, backup
storage and tape maintenance costs, and associated risks.
− Reduced backup window will mitigate the impact of backups on application
performance.
• If budget permits, replace the tape library with a disk-based storage system or
virtual tape library.
− A disk-based backup storage system will further reduce the backup window
and eliminate costs and risks associated with tape maintenance.
• Implement storage tiering to optimize application performance and eliminate the
need to buy a high-performance, high-cost storage system.
• Create a hierarchy of storage tiers:
− Tier 0: Storage system with mostly SSDs
− Tier 1: Storage system with only HDDs
− Tier 2: CAS
• Deploy a policy engine and configure policies to automatically move:

− Frequently accessed data to tier 0


− Moderately accessed data to tier 1
− Fixed data to tier 2



Data Migration


The main objectives of the topic are to:


→ Explain SAN-based and NAS-based data migration.
→ Define host-based and application-based migration.
→ Apply data migration techniques to address the organization’s challenges
and requirements.


Why Data Migration?


Data migration is a specialized replication technique that enables to move data
from one system to another within a data center, between data centers, between
cloud, and between data center and cloud. To meet the business challenges
presented by today’s on-demand 24x7 world, data must be highly available – in the
right place, at the right time, and at the right cost to the enterprise. Data migration
provides solution to these challenges.

Organizations deploy data migration solutions for the following reasons. Click
each sub-heading for more information.

1. Data center maintenance without downtime

Typically, scheduled maintenance is performed in a data center. During maintenance, the systems (compute, storage, and network) are usually down, which may impact the availability of applications running on those systems. Data migration solutions enable moving the applications and data to other systems or data centers without downtime.

2. Disaster avoidance

Data centers in the path of natural calamities (such as hurricanes) can


proactively migrate the applications to another data center without impacting
the business.

3. Technology refresh

As technology keeps changing, a requirement to purchase a new hardware


(for example, storage system) arises in order to meet the business
requirements. In such cases, IT organizations have to migrate their data and
applications to the new system from the old one.

4. Data center migration or consolidation

Sometimes, an IT organization may require data center migration or consolidation. Data migration solutions enable moving applications from one data center to another as part of a migration or consolidation effort without downtime.


5. Workload balancing across multiple sites

IT organizations with multiple data centers may face challenges. For example, the infrastructure components (compute systems, storage, and network) of one data center may be highly utilized or overloaded while those of another data center are underutilized. To overcome this challenge, the organization can migrate some of the VMs and data to the underutilized data center, balancing the load across data centers to meet performance and availability requirements.

Data Migration Techniques

The various data migration techniques are as follows:

SAN-based Migration:

• Storage system to storage system direct data migration
• Storage system to storage system data migration through an intermediary virtualization appliance

NAS-based Migration:

• NAS to NAS direct data migration
• NAS to NAS data migration through an intermediary compute system
• NAS to NAS data migration using a virtualization appliance

Host-based Migration:

• Host-based migration tool
• Hypervisor-based migration
    − VM live migration
    − VM storage migration

Application Migration:

• Migration of an application from one environment to another

For more information, click here.


SAN-based Data Migration - Storage to Storage Migration

[Figure: SAN-based storage-to-storage migration. Data moves (push) from the old storage system (control storage system) to the new storage system (remote storage system), or is pulled from the remote system to the control system. The compute system can access the control device in both hot push and pull operations; access to the remote device is not allowed in either operation.]

SAN-based migration moves block-level data between heterogeneous storage


systems over SAN. This technology is application and operating system
independent because the migration operations are performed by one of the storage
systems. The storage system performing the migration operations is called the
control storage system.

Data can be moved between devices in the control storage system and devices in a remote storage system. Data migration solutions perform push and pull operations for data movement. These terms are defined from the perspective of the control storage system.

Push: Data is pushed from the control system to the remote system.

Pull: Data is pulled from the remote system to the control system.
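A minimal sketch of these push and pull semantics follows, with device contents modeled as byte strings and the host restricted to the control device. All names here are illustrative, not from any storage system’s interface.

```python
# Sketch of push and pull semantics from the control storage system's
# perspective. Device contents are modeled as byte strings; the host
# may only access the control device, never the remote device.

def push(control_dev: dict, remote_dev: dict) -> None:
    """Copy data from the control device to the remote device."""
    remote_dev["data"] = control_dev["data"]

def pull(control_dev: dict, remote_dev: dict) -> None:
    """Copy data from the remote device into the control device."""
    control_dev["data"] = remote_dev["data"]

def host_read(device: dict) -> bytes:
    """Hosts see only control devices during migration."""
    if device["role"] != "control":
        raise PermissionError("host access to remote device is not allowed")
    return device["data"]

old_array = {"role": "control", "data": b"app-volume-v1"}
new_array = {"role": "remote", "data": b""}

push(old_array, new_array)            # move data to the new system
assert new_array["data"] == b"app-volume-v1"
assert host_read(old_array) == b"app-volume-v1"  # host keeps working
```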


SAN-based Data Migration - Through Intermediary Virtualization Appliance

[Figure: SAN-based migration through an intermediary virtualization appliance. LUNs from the storage systems are assigned to the appliance, which creates a storage pool using the assigned LUNs; a virtual volume is created from the pool and assigned to the compute system. The appliance handles non-disruptive data migration from storage system A to storage system B.]

SAN-based data migration can also be implemented using a virtualization


appliance at the SAN. Typically for data migration, the virtualization appliance
(controller) provides a translation layer in the SAN, between the compute systems
and the storage systems.

The LUNs created at the storage systems are assigned to the appliance. The
appliance abstracts the identity of these LUNs and creates a storage pool by
aggregating LUNs from the storage systems. A virtual volume is created from the
storage pool and assigned to the compute system. When an I/O is sent to a virtual
volume, it is redirected through the virtualization layer at the SAN to the mapped
LUNs.

In this type of migration:

• The LUNs remain online and accessible while data is migrating.


• This migration supports movement of data between multi-vendor
heterogeneous storage systems.
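A toy model of the appliance’s role may help: backend LUNs are mapped behind a virtual volume, and migration re-points the mapping once the data is copied, so the volume stays online throughout. The class and identifiers below are invented for illustration.

```python
# Toy model of a SAN virtualization appliance: backend LUNs are
# aggregated into a pool, a virtual volume maps to one LUN, and
# migration re-points the mapping while the volume stays online.

class Appliance:
    def __init__(self):
        self.luns = {}       # lun_id -> data
        self.vvol_map = {}   # virtual volume name -> lun_id

    def add_lun(self, lun_id, data=b""):
        self.luns[lun_id] = data

    def create_vvol(self, name, lun_id):
        self.vvol_map[name] = lun_id

    def read(self, name):
        # I/O to the virtual volume is redirected to the mapped LUN
        return self.luns[self.vvol_map[name]]

    def migrate(self, name, target_lun):
        src = self.vvol_map[name]
        self.luns[target_lun] = self.luns[src]   # copy the data
        self.vvol_map[name] = target_lun          # re-point the mapping

app = Appliance()
app.add_lun("arrayA/lun0", b"payroll-data")
app.add_lun("arrayB/lun0")
app.create_vvol("vol1", "arrayA/lun0")

app.migrate("vol1", "arrayB/lun0")    # non-disruptive to the host
print(app.read("vol1"))               # b'payroll-data'
```

Because the host addresses only the virtual volume, the backend systems can be from different vendors, matching the multi-vendor support described above.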

For more information, click here.


NAS-based Data Migration - NAS to NAS Direct Data Migration

[Figure: NAS to NAS direct data migration. Migration software runs on the NAS systems, and file-level data moves directly from the old NAS system to the new NAS system over the LAN that also serves the clients.]

In a NAS to NAS direct data migration, file-level data is migrated from one NAS system to another directly over the LAN without the involvement of any external server.

The two primary options for performing NAS-based migration are the NDMP protocol and a software tool. In this example, the new NAS system initiates the migration operation and pulls the data directly from the old NAS system over the LAN.
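The pull-style direct migration can be sketched as follows, with each NAS share modeled as a dictionary mapping file paths to contents. This is a stand-in for a real NDMP or tool-based transfer, not an actual protocol implementation.

```python
# Sketch of a NAS-to-NAS direct pull: the new system walks the old
# system's export and copies each file. Shares are modeled as dicts
# mapping path -> content.

def pull_migration(old_share: dict, new_share: dict) -> int:
    """Copy every file from old_share into new_share; return file count."""
    copied = 0
    for path, content in old_share.items():
        new_share[path] = content
        copied += 1
    return copied

old_nas = {"/home/a/report.doc": b"q1", "/home/b/pic.png": b"\x89PNG"}
new_nas = {}
n = pull_migration(old_nas, new_nas)
print(n, new_nas == old_nas)  # 2 True
```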

For more information, click here.


NAS to NAS Data Migration Using an Intermediary Compute System

[Figure: NAS to NAS data migration through an intermediary compute system. File-level data flows from the old NAS system, through the compute system, to the new NAS system over the LAN.]

In a NAS to NAS data migration through an intermediary compute system, all the data is transferred through the compute system from the old NAS system to the new NAS system. In this method of migration:

• An intermediary compute system executes the migration between the NAS systems.
• The compute system executing the migration makes a connection to both the old NAS system and the target system.


NAS to NAS Data Migration Using a Virtualization Appliance

[Figure: NAS to NAS data migration using a virtualization appliance. The appliance sits between the clients and the NAS systems on the LAN and performs non-disruptive file-level migration from the old NAS system to the new NAS system.]

In this type of NAS migration, the virtualization appliance facilitates the movement of files from the old NAS system to the new NAS system. While the files are being moved, clients can access their files non-disruptively. Clients can also read their files from the old location and write them back to the new location without realizing that the physical location has changed.

The virtualization appliance creates a virtualization layer that eliminates the dependencies between the data accessed at the file level and the location where the files are physically stored. A global namespace is used to map the logical path of a file to the physical path names.
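A minimal global-namespace sketch, assuming an invented mapping structure: clients address files by logical path, and the appliance re-points the physical location when a file moves, so the clients are unaffected.

```python
# Minimal global-namespace sketch: clients address files by logical
# path; the appliance maps each logical path to a physical location
# and updates the mapping when a file moves, so clients see no change.

class GlobalNamespace:
    def __init__(self):
        self.mapping = {}   # logical path -> (nas_system, physical path)
        self.storage = {}   # (nas_system, physical path) -> content

    def put(self, logical, system, physical, content):
        self.mapping[logical] = (system, physical)
        self.storage[(system, physical)] = content

    def read(self, logical):
        return self.storage[self.mapping[logical]]

    def move(self, logical, new_system, new_physical):
        old_loc = self.mapping[logical]
        self.storage[(new_system, new_physical)] = self.storage.pop(old_loc)
        self.mapping[logical] = (new_system, new_physical)

ns = GlobalNamespace()
ns.put("/projects/spec.txt", "old-nas", "/vol1/spec.txt", b"v1")
ns.move("/projects/spec.txt", "new-nas", "/vol7/spec.txt")
print(ns.read("/projects/spec.txt"))  # b'v1' -- same logical path
```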

Host-based Migration

In a host-based migration, a migration tool is installed on a compute system to


perform data migration. This tool performs migration in one of the following ways:

• It uses the host operating system to migrate data from one storage system to another. This approach uses host resources to move data non-disruptively from a source to a target.
• It works in conjunction with storage system-based replication and migration solutions to migrate data from one storage system to another.


Hypervisor-based Migration - VM Migrations

[Figure: VM migration. VMs are migrated over the network from compute system 1 to compute system 2 while both share the same storage system.]

In this type of migration, virtual machines (VMs) are moved from one physical compute system to another without any downtime. This enables:

• Scheduled maintenance without any downtime
• VM load balancing
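Live migration is commonly implemented with an iterative pre-copy of VM memory. The simplified loop below simulates that idea; the page contents and dirty-page log are invented for the example, and real hypervisors add many refinements.

```python
# Simplified pre-copy loop, the technique commonly used for live VM
# migration: copy all memory pages, then re-copy pages dirtied during
# the previous pass until the dirty set is small enough for a brief
# switch-over. Page contents and dirtying are simulated.

def live_migrate(source_mem: dict, dirty_log: list, threshold: int = 2):
    """Return (dest_mem, rounds). dirty_log[i] is the set of pages
    dirtied while round i was copying."""
    dest = dict(source_mem)           # round 0: full copy
    rounds = 1
    pending = set(dirty_log[0]) if dirty_log else set()
    i = 1
    while len(pending) > threshold and i < len(dirty_log):
        for page in pending:          # re-copy dirtied pages
            dest[page] = source_mem[page]
        pending = set(dirty_log[i])
        rounds += 1
        i += 1
    for page in pending:              # brief pause, copy final residue
        dest[page] = source_mem[page]
    return dest, rounds

mem = {p: f"data{p}" for p in range(8)}
dest, rounds = live_migrate(mem, [{1, 2, 3}, {2}])
print(dest == mem, rounds)  # True 2
```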


Hypervisor-based Migration - VM Storage Migration

[Figure: VM storage migration. A VM's files are moved over the network from one storage system to another while the VM continues to run on the same compute system.]

In a VM storage migration, VM files are moved from one storage system to another
system without any downtime or service disruption.

Key benefits of this type of migration are as follows:

• Simplifies array migration and storage upgrades


• Dynamically optimizes storage I/O performance
• Efficiently manages storage capacity


Application Migration

[Figure: Application migration. The disk content (application, OS, and data) of a physical compute system is migrated over the network to an empty VM disk.]

Application migration typically involves moving an application from one data center environment to another. For example, an organization can move an application from a physical to a virtual environment. In a virtualized environment, the application can also be moved from one hypervisor to another for various business reasons, such as balancing workloads to improve performance and availability.

In an application migration from a physical to a virtual environment, the physical server running the application is converted into a virtual machine. This option usually requires converter software that clones the data on the hard disk of the physical compute system and migrates the disk content (application, OS, and data) to an empty VM. The VM is then configured based on the physical compute system's configuration and booted to run the application. Nowadays, applications are often deployed using containers, which are easy to migrate from one platform to another.
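The physical-to-virtual (P2V) conversion steps described above can be sketched as a simple pipeline. The data structures are hypothetical stand-ins, not a real converter's API.

```python
# Illustrative physical-to-virtual (P2V) flow: clone the physical
# disk, write the image into an empty VM disk, adjust the VM's
# configuration to mirror the source, then boot. All structures here
# are invented for illustration.

def p2v_convert(physical: dict) -> dict:
    vm = {"disk": None, "config": {}, "state": "empty"}
    vm["disk"] = bytes(physical["disk"])          # clone app + OS + data
    vm["config"] = {
        "cpus": physical["cpus"],                 # match source sizing
        "memory_mb": physical["memory_mb"],
    }
    vm["state"] = "running"                       # boot the VM
    return vm

server = {"disk": b"bootloader+os+app+data", "cpus": 8, "memory_mb": 32768}
vm = p2v_convert(server)
print(vm["state"], vm["disk"] == server["disk"])  # running True
```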

For more information, click here.


Use case - PowerProtect Data Manager with Data Protection for Kubernetes on VMware - Introduction

Kubernetes, an open-source system, is used to automate the deployment and management of containerized applications. A Kubernetes cluster consists of nodes: some are used for scheduling and orchestration, and at least one worker node runs the workloads. Kubernetes allows the logical abstraction of cluster resources, referred to as namespaces. Data centers can operate one or multiple Kubernetes clusters in their infrastructure.

Dell Technologies PowerProtect Data Manager is an enterprise-level, software-defined data protection solution that includes data protection for Kubernetes on VMware. PowerProtect with Data Protection for Kubernetes on VMware uses an operations-enhanced version of Velero, an open-source tool that offers data backup and restoration capabilities and configuration of workloads running on Kubernetes clusters. This feature has made Kubernetes data protection an important offering of PowerProtect Data Manager.

Use case - PowerProtect Data Manager with Data Protection for Kubernetes on VMware - Features

PowerProtect Data Manager offers the following:


• Automated protection for containerized applications, enabling hands-off backups and operator-specified restoration of persistent volume data.
• Access to Kubernetes structures and cluster resources, with the ability to scale up or down as backup workload demand changes.
• Auto-discovery of all namespaces in a cluster, including the persistent volumes and persistent volume claims in each namespace that need protection.
• Restores that are not limited to the Kubernetes namespace that was backed up: data can be restored to a different namespace or to a new namespace created within that cluster. In future releases, Data Manager will offer restores to different Kubernetes clusters or to namespaces within another cluster.
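The auto-discovery step can be illustrated with a small sketch. The inventory format below is invented for the example; a real tool would query the Kubernetes API rather than a dictionary.

```python
# Conceptual sketch of the auto-discovery step: given a cluster
# inventory, find every namespace and the persistent volume claims
# (PVCs) inside it that need protection. The inventory format is
# invented for illustration.

def discover(cluster: dict) -> dict:
    """Map each namespace to the PVCs it contains."""
    plan = {}
    for ns, resources in cluster["namespaces"].items():
        pvcs = [r["name"] for r in resources
                if r["kind"] == "PersistentVolumeClaim"]
        plan[ns] = pvcs
    return plan

cluster = {"namespaces": {
    "shop":    [{"kind": "Deployment", "name": "web"},
                {"kind": "PersistentVolumeClaim", "name": "orders-pvc"}],
    "logging": [{"kind": "PersistentVolumeClaim", "name": "logs-pvc"}],
}}
print(discover(cluster))
# {'shop': ['orders-pvc'], 'logging': ['logs-pvc']}
```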


Knowledge Check: Data Migration

Knowledge Check Question

Carefully inspect the given image.

[Figure: a compute system connected over a network to two storage systems.]

1. Which migration technique is shown in the figure?


a. VM Storage Migration
b. Application Migration
c. NAS-based Data Migration - NAS to NAS Direct Data Migration
d. NAS to NAS data migration using intermediary compute system


Concepts in Practice


Click the right and left arrows for more information.

Dell EMC VPLEX

Dell EMC VPLEX provides continuous data availability, transparent data mobility, and non-disruptive data migration for mission-critical applications. VPLEX delivers high performance for the latest flash storage technology, combined with reduced latency, to ensure business-critical applications are never down, and it delivers greater than five-nines availability. VPLEX enables data and workload mobility across arrays and data centers without host disruption. Ansible modules for VPLEX enable operational teams to rapidly provision storage infrastructure with accuracy to respond to the fast-paced needs of application developers. VPLEX requires no compute resources from the application hosts or the underlying array to maximize data availability.

Dell EMC Intelligent Data Mobility

Dell EMC Intelligent Data Mobility services enable organizations to reduce the time,
cost and complexity of data migration. Dell EMC Intelligent Data Mobility enables
fast and simple data migration to storage solutions like Dell EMC Unity, a simple,
modern, flexible and affordable flash storage solution for midrange storage. It
provides customers with the flexibility, simplicity and efficiency to seamlessly move
data and workloads by using technology, automation and Dell EMC expertise.
Intelligent Data Mobility follows a standardized methodology to minimize the time
and expense of onboarding new storage.

VMware vSphere vMotion

VMware vSphere vMotion is a zero-downtime live migration of workloads from one


server to another. This capability is possible across vSwitches, Clusters, and even
Clouds. During the workload migration, the application is still running, and users
continue to have access to the systems they need. VMware vSphere live migration
allows you to move an entire running virtual machine from one physical server to
another, with no downtime. The virtual machine retains its network identity and
connections, ensuring a seamless migration process. Transfer the virtual machine’s

Data Protection and Management-SSP

© Copyright 2021 Dell Inc. Page 259


Data Migration

active memory and precise execution state over a high-speed network, allowing the
virtual machine to switch from running on the source vSphere host to the
destination vSphere host.

VMware vSphere Storage vMotion

With VMware vSphere Storage vMotion, organizations can migrate a virtual


machine and its disk files from one datastore to another while the virtual machine is
running. With Storage vMotion, one can move virtual machines off arrays for maintenance or upgrades. It provides the flexibility to optimize disks for performance, or to transform disk types, which can be used to reclaim space.
Storage vMotion has several uses in administering virtual infrastructure, including
the following examples of use:

• Storage maintenance and reconfiguration. You can use Storage vMotion to


move virtual machines off a storage device to allow maintenance or
reconfiguration of the storage device without virtual machine downtime.
• Redistributing storage load. You can use Storage vMotion to redistribute virtual
machines or virtual disks to different storage volumes to balance capacity or
improve performance.


Exercise - Data Migration



Please click each sub-title for more information on the exercise.

1. Present Scenario:

An organization runs business-critical applications in a traditional data center.


The organization:

• Currently runs applications on physical compute systems - Each compute


system runs a single application.

• Uses a block-based storage system to provision storage capacity for the


business applications.

• Has another block-based storage system from a different vendor that


supports internal applications.

• Has a file-sharing environment in which multiple NAS systems serve all the
clients including application servers.

• Plans to deploy more applications to expand their business.

2. Organization’s Challenges:

• Compute systems are running at only 15% to 20% utilization.

• Organization has limited budget to buy compute systems to run new


business applications.

• Business-critical applications are impacted during the maintenance of the


storage system - the storage system is down during maintenance and it
does not have any migration capability.

• It is also identified that some of the NAS systems are overutilized and some are underutilized - clients are impacted when accessing the overutilized NAS systems.

3. Organization’s Requirements:


• They want to virtualize their compute infrastructure and run multiple


applications on each physical compute system. Running multiple
applications on each physical compute system reduces the need to invest
in purchasing new compute systems.

• Business-critical applications should not get impacted during the


maintenance of block-based storage system.

• Need an effective solution to address the challenges in the NAS


environment.

4. Expected Deliverables:

Propose a solution to address the organization’s challenges and requirements.

Solution

The proposed solution is as follows:

The organization can perform application migration by converting their physical compute systems to virtual machines:

• Perform online migration that avoids impacting application availability.
• Improve the overall utilization of the compute systems.

To avoid downtime during storage system maintenance, the organization can implement a SAN-based data migration solution:

• Migrates data to another storage system by using a virtualization appliance.
• Supports data migration between multi-vendor storage systems.

To overcome the challenges in the NAS environment, the organization can implement NAS-based data migration:

• Allows files to be moved from overutilized NAS systems to underutilized NAS systems without impacting the clients.



Data Protection in Software-Defined Data Center


The main objectives of the topic are to:


→ Describe software-defined data center, its architecture and benefits.
→ Describe software-defined compute, storage, and networking.
→ Describe data protection process in a software-defined data center.
→ Apply the concept of data protection in a software-defined data center to
meet the organization’s requirements.


Software-Defined Data Center Overview


Objectives

The objectives of the topic are to:


→ Define and describe the attributes of software-defined data center.
→ Describe the architecture of software-defined data center.
→ Explain the functions of software controller.
→ Describe the key benefits of software-defined data center.

Software-Defined Data Center

• Software-defined data center (SDDC) is an approach to IT infrastructure that:

− Abstracts, pools, and automates all resources in a data center environment to achieve IT as a service (ITaaS).
− Is controlled and managed by intelligent, policy-driven software.

• All IT infrastructure resources are virtualized, abstracted, and delivered as a service, and the control of the data center is entirely automated by software.

The key attributes of SDDC are:


Key Attributes and Descriptions:

• Abstraction and pooling: Abstracts and pools IT resources across data centers.
• Automated policy-driven provisioning including data protection: IT services are created dynamically from available resources based on defined policy.
• Unified management: Provides a single control point for the entire infrastructure across all physical and virtual resources.
• Self-service: Allows users to select IT services from a self-service catalog.
• Metering: Usage of resources per user is measured and reported by a metering system.
• Open and extensible: Enables integrating multi-vendor IT resources and external management interfaces and applications into the environment through the use of APIs.

To learn more about SDDC key attributes, click here.

Architecture of Software-Defined Data Center

Software-defined data center (SDDC) architecture includes four distinct planes: the data plane, control plane, management plane, and service plane, as shown in the image below.

Click the name of each plane on the image for more information.

To learn about key components of software controller, click on SDDC Controller.


1: Allows a user to request or order a service from the catalog in a self-service


way.

2: Used to perform administrative operations such as configuring a system and


changing policies.

3: Provides the programming logic and policies that the data plane follows to
perform its operations.

The key functions of the control plane include asset discovery, resource abstraction
and pooling, provisioning resources for services.

4: Performs the data processing and transmission operations.

For detailed information about SDDC architecture, click here.

Key Benefits of SDDC

By extending virtualization throughout the data center, SDDC provides several


benefits to organizations. Some of the key benefits are described below:


Benefits and Descriptions:

• Agility: On-demand self-service; faster resource provisioning.
• Cost efficiency: Use of the existing infrastructure and commodity hardware lowers CAPEX.
• Improved control: Policy-based governance; automated data protection/disaster recovery; automated, policy-driven operations help in reducing errors.
• Centralized management: Unified management platform for centralized monitoring and administration.
• Flexibility: Use of commodity and advanced hardware technologies; cloud support.

For detailed information about key benefits of SDDC, click here.


Knowledge Check: Software-Defined Data Center Overview

Knowledge Check Question

1. Which of the following statements are correct about the software-defined controller? Choose all that apply.
a. Performs resource abstraction and pooling.
b. Provides interfaces that enable only cloud applications external to the
controller to request resources.
c. Allows rapid provisioning of resources based on pre-defined policies.
d. Performs asset discovery.


Software-Defined Compute, Storage, and Networking


Objectives

The objectives of the topic are to:


→ Define software-defined compute (SDC) and software-defined storage
(SDS).
→ Explain the functions of SDS controller.
→ Describe virtual storage system and virtual storage pool.
→ Describe software-defined networking (SDN) and functions of SDN
controller.
→ Describe virtual machine network and compute-based SAN.

Software-Defined Compute (SDC)


• SDC is an approach to provision compute resources using compute


virtualization technology enabled by the hypervisor.


• Hypervisor decouples the application and the OS from the hardware and
encapsulates them in an isolated virtual container called a virtual machine (VM).
• Hypervisor controls the allocation of hardware resources to the VMs based on
policies, which means the hardware configuration of a VM is maintained using a
software.

Software-Defined Storage (SDS)

SDS is an approach to:

[Figure: SDS architecture. The SDS software/controller sits between virtual storage resources and multiple types of storage systems (the physical storage infrastructure), including commodity hardware.]

• Provision storage resources, with a software layer (the SDS controller) controlling storage-related operations independent of the underlying physical storage infrastructure.
• Abstract the physical details of storage and deliver virtual storage resources.
• Control the allocation of storage capacity based on policies configured on the SDS controller.

The key functions of the SDS controller are:


• Discovery: The SDS controller discovers physical storage systems to gather data and bring them under its control and management.
• Resource abstraction and pooling: The SDS controller abstracts physical storage systems into virtual storage systems and virtual storage pools as per policies, and also enables an administrator to define storage services.
• Service provisioning: The SDS controller automates the storage provisioning tasks and delivers virtual storage resources based on the service request issued through a service catalog.

To learn more about functions of the SDS controller, click here.

Virtual Storage System and Pool

Physical storage is abstracted into two kinds of logical entities: virtual storage systems and virtual storage pools. Let us understand each of them.


Virtual Storage System

Virtual Storage System A Virtual Storage System B

• A virtual storage system is a logical grouping of physical storage systems. It


abstracts the physical storage systems and network connectivity.
• An administrator may create multiple virtual storage systems to partition a data
center into multiple groups of connected compute, network, and storage
resources.
− All physical components within a virtual storage system should be able to
communicate with each other.
• Multiple virtual storage systems may be configured for the purpose of fault
tolerance, network traffic isolation, and user group/tenant isolation.


Virtual Storage Pool

[Figure: Virtual storage pools spanning Virtual Storage Systems A and B. Virtual Storage Pool A (Block: Tier 1, Gold), Pool B (Block: Tier 2, Silver), and Pool C (Block: Tier 3, Bronze) map to SSD and HDD pools; Virtual Storage Pool D (File) maps to file pools.]

• A virtual storage pool is a logical entity that maps to the storage pools in the
virtual storage systems.
• Administrator may configure multiple virtual storage pools of different capacity,
performance, and protection characteristics based on the policy.
• A virtual storage pool may include storage pools from multiple virtual storage
systems.

For detailed information about virtual storage pool, click here.
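For illustration, here is a sketch of how backend pools might be grouped into virtual storage pools by a tier policy. The system and pool names are invented for the example.

```python
# Sketch of virtual storage pools: each virtual pool aggregates
# backend pools (possibly from several virtual storage systems) that
# match a policy such as a performance tier. Names are illustrative.

def build_virtual_pools(backend_pools, policies):
    """Group backend pools into virtual pools by matching tier policy."""
    vpools = {name: [] for name in policies}
    for pool in backend_pools:
        for name, tier in policies.items():
            if pool["tier"] == tier:
                vpools[name].append(pool["id"])
    return vpools

backends = [
    {"id": "sysA/ssd-pool", "tier": "gold"},
    {"id": "sysB/ssd-pool", "tier": "gold"},   # spans two storage systems
    {"id": "sysA/hdd-pool", "tier": "silver"},
]
policies = {"vpool-gold": "gold", "vpool-silver": "silver"}
print(build_virtual_pools(backends, policies))
# {'vpool-gold': ['sysA/ssd-pool', 'sysB/ssd-pool'], 'vpool-silver': ['sysA/hdd-pool']}
```

Note how the gold virtual pool draws capacity from two different storage systems, matching the point above that a virtual storage pool may include storage pools from multiple virtual storage systems.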


Software-Defined Networking (SDN)

[Figure: SDN architecture. The SDN controller (control plane) holds the programming logic for switching/routing network traffic and directs the network components (data plane), shown as interconnected switches.]

A network component such as a switch or a router consists of a data plane and a control plane, both implemented in the firmware of the component. The function of the data plane is to transfer the network traffic from one physical port to another by following rules that are programmed into the component. The function of the control plane is to provide the programming logic that the data plane follows for switching or routing of the network traffic.

Software-defined networking (SDN) is a networking approach in which an SDN software or controller:

• Controls the switching and routing of the network traffic independent of the underlying network.
• Abstracts the physical details of the network components and separates the control plane functions from the data plane functions.


• Provides instructions for the data plane to handle network traffic based on policies.
• Provides a CLI and GUI for administrators to manage the network infrastructure and configure policies, and APIs for external management tools and applications to interact with the SDN controller.
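The control-plane/data-plane split can be illustrated with a small sketch: the controller computes a path over the discovered topology and installs one forwarding rule per switch, which is all the data plane then follows. The switch names and flow format are invented for the example.

```python
# Minimal SDN-style split: the controller computes a shortest path
# over the discovered topology (control plane) and installs per-switch
# forwarding rules; switches only follow installed rules (data plane).

from collections import deque

def shortest_path(links, src, dst):
    """BFS over an undirected topology given as (a, b) switch pairs."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in adj.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def install_rules(path, flow):
    """Controller pushes one forwarding rule per switch on the path."""
    return {sw: {"flow": flow, "out": path[i + 1]}
            for i, sw in enumerate(path[:-1])}

topology = [("s1", "s2"), ("s2", "s4"), ("s1", "s3"), ("s3", "s4")]
path = shortest_path(topology, "s1", "s4")
rules = install_rules(path, "10.0.0.5->10.0.1.9")
print(path, rules["s1"]["out"])  # ['s1', 's2', 's4'] s2
```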

The common functions of the SDN controller are:

• Discovery: The SDN controller interacts with network components to discover information on their configuration, topology, capacity, utilization, and performance.
• Network component management: The SDN controller configures network components to maintain interconnections among the components and isolate network traffic through virtual networks.
• Network flow management: The SDN controller controls the network traffic flow between the components and chooses the optimal path for network traffic.

Virtual Network

Virtual networks are software-defined logical networks that are created on a physical network.


• Virtual networks can be created by segmenting a single physical network into multiple logical networks.
• Multiple physical networks can also be consolidated into a single virtual network. A virtual network appears as a physical network to the compute and storage systems (called nodes) connected to it, because the existing network services are reproduced in a virtual network.
• Virtual networks are automatically or manually created, provisioned, and managed through the SDN controller.
• Virtual networks are isolated and independent of each other. Nodes with a common set of requirements can be functionally grouped in a virtual network.
• Organizations may create multiple virtual networks on a common network infrastructure for the use of different user groups or tenants.

− Enables isolation of network traffic between various user groups or tenants.


− Can also span physical boundaries, allowing network extension and optimized resource utilization across clusters and data centers.
• Common examples of virtual networks are virtual LAN (VLAN), virtual extensible LAN (VXLAN), and virtual SAN (VSAN).

Virtual Machine Network

Virtual Switch

• A logical network that provides Ethernet connectivity.
− Enables communication between the VMs running on a hypervisor within a compute system.
• A VM network includes logical switches called virtual switches.
• Virtual switches function similarly to physical Ethernet switches.
• To understand the working of a virtual switch with an example, click here.
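Since a virtual switch behaves like a learning physical Ethernet switch, its core forwarding logic can be sketched in a few lines. A minimal sketch, assuming hypothetical port names and a plain MAC-learning table:

```python
class VirtualSwitch:
    """Toy model of a software (virtual) Ethernet switch."""
    def __init__(self, ports):
        self.ports = ports           # e.g. VM ports plus an uplink
        self.mac_table = {}          # learned MAC address -> port

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source MAC, then forward: a known destination goes
        out a single port; an unknown one is flooded to all other ports."""
        self.mac_table[src_mac] = in_port
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        return [p for p in self.ports if p != in_port]

vswitch = VirtualSwitch(["vm1", "vm2", "uplink"])
print(vswitch.receive("vm1", "aa:aa", "bb:bb"))  # unknown dst: flooded
print(vswitch.receive("vm2", "bb:bb", "aa:aa"))  # learned dst: one port
```

The first frame is flooded because the destination is unknown; the reply is delivered out a single port because the first frame taught the switch where "aa:aa" lives.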

Virtual Router

(Image: clients connect through a physical switch to the physical NIC of a physical compute system, where a virtual router interconnects two virtual switches.)

• A software-based router that can be installed on a VM or implemented using a virtual appliance.


• Works similarly to a physical router.
− A virtual router does not exist as a separate box with physical connections.
• Enables a VM to have the abilities of a router by performing the network and packet routing functionality of a router via a software application.

Compute-based SAN

(Image: compute systems with DAS pool their local storage into a compute-based SAN storage pool; each compute system runs a client program (C) and a server instance (S, the server program).)

A compute-based storage area network (SAN) is a:

• Software-defined SAN created from direct-attached storage.
− Located locally on the compute systems in a cluster.
− Creates a large block-based storage pool.
• A compute system that requires access to the block storage volumes runs a client program.
• The compute systems that contribute their local storage to the shared storage pool within the virtual SAN run an instance of a server program.
− The server instance owns the local storage and performs I/O operations as requested by a client from a compute system within the cluster.
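The client/server split described above can be sketched as follows. The class names and block layout are hypothetical illustrations, not any vendor's implementation:

```python
class ServerInstance:
    """Runs on a compute system that contributes its local DAS blocks."""
    def __init__(self, name, n_blocks):
        self.name = name
        self.blocks = {i: None for i in range(n_blocks)}  # local storage

class ComputeSAN:
    """The shared block pool spanning all server instances in the cluster.
    A client program addresses global block numbers; the pool maps each
    one to the server instance that owns the underlying local block."""
    def __init__(self, servers):
        self.map = {}                 # global block -> (server, local block)
        g = 0
        for s in servers:
            for b in s.blocks:
                self.map[g] = (s, b)
                g += 1

    def write(self, block, data):     # client-program entry points
        s, b = self.map[block]
        s.blocks[b] = data            # the owning server performs the I/O

    def read(self, block):
        s, b = self.map[block]
        return s.blocks[b]

san = ComputeSAN([ServerInstance("node1", 2), ServerInstance("node2", 2)])
san.write(3, b"email-data")           # lands on node2's local storage
print(san.read(3))
```

The point of the sketch is that the client sees one large pool, while each I/O is actually serviced by whichever server instance owns the local block.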


Knowledge Check: Software-Defined Compute, Storage, and Networking

Knowledge Check Question

2. Which of the following statements are correct? Choose all that apply.

a. Compute-based SAN is created from the direct-attached storage on the compute systems in a cluster.
b. Virtual storage system is an abstraction of physical storage systems and the network connectivity between them.
c. Software-defined networking integrates control plane with data plane.
d. A virtual switch is a logical aggregation of physical Ethernet switches.


Data Protection Process in SDDC


Objectives

The objectives of the topic are to:


→ Understand the key phases of the data protection process.
→ Understand the key steps for defining data protection services.
→ Describe orchestration of data protection operations.
→ Explain the integration of components using an orchestrator.

Introduction to Data Protection in SDDC

(Image: a self-service portal and management tools sit above a software controller that manages the IT infrastructure. Protection technologies are usually offered to users as protection services through the self-service portal; the controller leverages protection technologies that are either natively built into infrastructure components or provided by separate protection applications.)

• SDDC ensures data availability and protection against data corruption, hardware failures, and data center disasters.
• Protection technologies such as continuous data protection, image-based
backup, and snapshot are usually offered to the users as protection services.
− Each service is standardized to meet a specific level of performance,
protection, and availability requirements.


− Users may request a protection service from the self-service portal, and the software controller fulfills the request automatically.
• The software controller leverages the protection technologies that are either natively built into the underlying IT infrastructure components or provided by separate protection applications.
− Controls and manages the protection applications, storage, and operations
according to predefined policies.
• The data protection process in an SDDC consists of three key phases. These
are:

− Discovering data protection architecture73
− Defining data protection services74
− Orchestrating data protection operations75

Defining Data Protection Services


Data protection services are defined with the help of the following steps:

1. Selecting resources for data protection

• An administrator identifies and configures interrelated hardware, software, and virtual components that will constitute a data protection service and work together upon deployment of a service.

2. Defining data protection policies

73 The software controller performs a discovery operation to collect and store information about the components of the data protection architecture and bring them under its control and management.

74 Data protection services are defined by administrators using the service catalog.

75 Orchestration of protection operations enables automated coordination among various infrastructure components to deliver data protection services.


• Based on business requirements, an administrator defines policies for each service. Policies include the:

− Schedule and performance level of protection operations.
− Data retention period.
− Data availability level.
− Recovery point objective (RPO) and recovery time objective (RTO).
− Type of protection storage.
• Once defined in the service catalog, the data protection services are
automatically created and the policy settings become the attributes of the
services.
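As a rough illustration of how policy settings become attributes of a service in the catalog, consider this sketch. The service names, field names, and values are made-up examples, not product defaults:

```python
# Hypothetical service catalog: each entry's policy settings become
# the attributes of the protection service once it is defined.
protection_services = {
    "Gold": {
        "technology": "continuous data protection",
        "schedule": "continuous",
        "retention_days": 90,
        "rpo_minutes": 0,                 # near-zero RPO
        "rto_minutes": 15,
        "protection_storage": "replicated block storage",
    },
    "Silver": {
        "technology": "image-based backup",
        "schedule": "every 4 hours",
        "retention_days": 30,
        "rpo_minutes": 240,
        "rto_minutes": 120,
        "protection_storage": "deduplicated disk",
    },
}

def fulfill_request(service_name):
    """The software controller reads the attributes and applies them."""
    return protection_services[service_name]

print(fulfill_request("Gold")["rpo_minutes"])
```

A request from the self-service portal names a service; the controller then applies that service's attributes rather than asking the user for individual settings.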

Orchestrating Data Protection Operations


• Orchestration refers to the automated arrangement, coordination, and management of various system- or component-related tasks in an IT infrastructure to manage IT resources and provide services.
− Tasks are programmatically integrated and sequenced into orchestration workflows.
• The SDDC controller has built-in workflows.


− Orchestration software (an orchestrator) is used to orchestrate service delivery and management operations.
• The orchestrator interacts with the SDDC controller through APIs to enable orchestration based on its workflows.

− Provides an interface for administrators to define new workflows.
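A minimal sketch of such a workflow, with hypothetical task names, shows how tasks are sequenced and how an administrator-defined workflow is executed by the orchestrator:

```python
# Illustrative sketch only: task and workflow names are invented,
# not any orchestration product's API.
def quiesce_app(log):     log.append("application quiesced")
def take_snapshot(log):   log.append("snapshot created")
def resume_app(log):      log.append("application resumed")
def replicate_copy(log):  log.append("copy replicated to DR site")

class Orchestrator:
    def __init__(self):
        self.workflows = {}

    def define_workflow(self, name, tasks):
        """The interface through which administrators define new workflows."""
        self.workflows[name] = tasks

    def run(self, name):
        """Execute the component-related tasks in their defined sequence."""
        log = []
        for task in self.workflows[name]:
            task(log)
        return log

orch = Orchestrator()
orch.define_workflow("protect-vm", [quiesce_app, take_snapshot,
                                    resume_app, replicate_copy])
print(orch.run("protect-vm"))
```

The value of the workflow is the fixed sequencing: the snapshot is always taken while the application is quiesced, and replication always follows a successful snapshot.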

Integration of Components using Orchestrator

Component integration is the connection of multiple component-related tasks, which are essential for carrying out resource management and service delivery, into a workflow. The orchestrator provides component integration capability.

• Users request services from a service catalog on the self-service portal. The portal interacts with the orchestrator and transfers service requests.
• The orchestrator interacts with appropriate components to orchestrate execution of component-related tasks based on predefined workflows.
• Components that may be considered for integration are described in the numbered items below.


1: It authenticates and authorizes users, which helps in verifying user credentials when they log on to the portal.

2: These tools automate various management operations in the data protection environment, such as logging service-related issues, notifying events, monitoring capacity, and approving changes in the infrastructure.


3: It is responsible for controlling and managing infrastructure resources centrally and provisioning services. A single controller may have the capability to control the entire infrastructure.

• Separate controllers may also be deployed to control compute, storage, or networking operations.

4: It is a federated database that provides a single view of the managed resources and services in a data protection environment.

• The CMS is updated automatically as changes are made in the infrastructure. Both the portal and the management tools use data from the CMS when appropriate.

5: It collects and records the usage of services per user group or consumer as the number of units of a service consumed.

Examples of a service unit are: per GB of storage, per transaction, and per hour of application usage.

• It also generates a billing report76 based on the price per unit and the number of units of a service consumed.

76 The billing report is visible to the user through the cloud portal.
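The metering and billing arithmetic above (units consumed times price per unit) can be illustrated with a small sketch; the prices and unit names are invented examples:

```python
# Hypothetical price list: service unit -> price per unit.
price_per_unit = {"storage_gb": 0.10, "transaction": 0.002}

usage = []   # metered records: (consumer, service unit, units consumed)

def record(consumer, unit, n):
    """Meter usage of a service per consumer, in units of that service."""
    usage.append((consumer, unit, n))

def billing_report(consumer):
    """Bill = sum over services of units consumed x price per unit."""
    total = 0.0
    for c, unit, n in usage:
        if c == consumer:
            total += n * price_per_unit[unit]
    return round(total, 2)

record("tenant-a", "storage_gb", 500)      # 500 GB stored
record("tenant-a", "transaction", 10000)   # 10,000 transactions
print(billing_report("tenant-a"))          # 500*0.10 + 10000*0.002 = 70.0
```

In a real environment the records would come from the metering component automatically, and the resulting report would be exposed to the user through the portal.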


Knowledge Check: Data Protection Process in SDDC

Knowledge Check Question

3. Match the following activities with their descriptions:

A. Selecting resources for data protection
B. Discovering data protection architecture
C. Orchestrating data protection operations
D. Defining data protection policies

A – Identify and configure interrelated components that will constitute a data protection service.
B – Collect and store information about protection components.
C – Protection component-related tasks are programmatically integrated and sequenced into workflows.
D – Define ‘schedule’ and ‘performance level’ of protection operations.


Concepts in Practice


Dell EMC ECS

ECS, the leading object storage platform from Dell EMC, provides unmatched
scalability, performance, resilience, and economics.

• Deployable as a turnkey appliance or in a software-defined model.
• Delivers rich S3-compatibility on a globally distributed architecture, empowering organizations to support enterprise workloads such as cloud-native, archive, IoT, AI, and big data analytics applications at scale.
• Stores unstructured data at public cloud scale with the reliability and control of a
private cloud.
• Capable of scaling to exabytes and beyond.
• Empowers organizations to manage a globally distributed storage infrastructure
under a single global namespace with anywhere access to content.

Dell EMC Unity Cloud Edition

Dell EMC Unity Cloud Edition lets you deploy Dell EMC Unity unified storage as a
virtual storage appliance directly in an AWS cloud.

• Dell EMC Unity Cloud Edition is software-defined storage that runs on industry-standard hardware and VMware ESXi.
• Enterprise capabilities such as snapshots, quotas, and tiering are delivered with the common Unity experience.
• With Unity Cloud Edition, file services are consumed within each customer SDDC, so there is no need for an external file appliance or file service.
• Dell EMC Unity Cloud Edition enables Cloud Sync disaster recovery between on-premises-deployed Dell EMC Unity systems and VMware Cloud-based appliances.


− This block and file solution is ideal for a variety of use cases in the cloud
including home directory for running a VDI environment in VMware Cloud,
Test/Dev operations, or replication services to a third site.

VMware vSAN

vSAN is enterprise-class, software-defined storage virtualization software that, when combined with vSphere, allows you to manage compute and storage with a single platform. With vSAN, you can:

• Reduce the cost and complexity of traditional storage and take the easiest path to future-ready hyper-converged infrastructure and hybrid cloud.
• Improve business agility, all while speeding operations and lowering costs when integrated with a hyper-converged infrastructure (HCI) solution.
• Modernize the infrastructure by leveraging existing tools, skill sets, and software solutions.
• Simplify the extension from on premises to the public cloud.

vSAN is integrated with vSphere, optimizing the data I/O path to provide the
highest levels of performance with minimal impact on CPU and memory.

vSAN minimizes storage latency with built-in caching on server-side flash devices, delivering up to 50 percent more IOPS than previously possible.

VMware NSX Data Center

VMware NSX Data Center is the network virtualization and security platform that
enables the virtual cloud network.

• A software-defined approach to networking that extends across data centers, clouds, and application frameworks.
• With NSX Data Center, networking and security are brought closer to the
application wherever it’s running, from virtual machines (VMs) to containers to
bare metal.
− Like the operational model of VMs, networks can be provisioned and
managed independent of underlying hardware.


• NSX Data Center reproduces the entire network model in software, enabling
any network topology—from simple to complex multitier networks—to be
created and provisioned in seconds.
• Users can create multiple virtual networks with diverse requirements, leveraging
a combination of the services offered via NSX.

− From a broad ecosystem of third-party integrations, ranging from next-generation firewalls to performance management solutions, to build inherently more agile and secure environments.
− These services can then be extended to a variety of endpoints within and
across clouds.


Exercise: Data Protection in SDDC



1. Present Scenario:

• An organization uses its data center to provide email service to its customers globally.

• A cluster of 20 VMs is used to provide the email service.

• The data center storage infrastructure is controlled and managed by an SDS controller.

• The SDS controller provides a single virtual storage pool for all the VMs to store email data.

2. Organization’s Requirement:

• The organization wants to use another data center in a separate geographic region to provide the email service.

• Both data centers must be active.

• Both data centers must have the capability to fail over services automatically in the event of a disaster.

• The organization wants to implement three categories of data protection policy – ‘Gold’, ‘Silver’, and ‘Bronze’.

• Features of the ‘Gold’ policy include CDP and DR protection.

• Features of the ‘Silver’ policy include asynchronous remote replication and DR protection.

• Features of the ‘Bronze’ policy include periodic local replication.

3. Expected Deliverables:

• Propose a solution that will meet the organization’s requirements.


Solution

The proposed solution is as follows:

• Deploy and connect SDS controllers at both sites.
• Span the VM cluster across both data centers.
• Configure the SDS controllers to support an active/active configuration with automated service failover.
• Create three virtual pools and associate a data protection service (Gold, Silver, or Bronze) with each of them.
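The pool-to-policy association in this solution can be sketched as a simple mapping. The pool names and attribute layout are hypothetical illustrations of the exercise's Gold/Silver/Bronze categories:

```python
# Hypothetical policy catalog matching the exercise requirements.
policies = {
    "Gold":   {"features": ["CDP", "DR protection"]},
    "Silver": {"features": ["asynchronous remote replication",
                            "DR protection"]},
    "Bronze": {"features": ["periodic local replication"]},
}

# Each virtual pool carries exactly one protection policy; placing a
# VM's data in a pool implicitly selects that level of protection.
virtual_pools = {"pool-1": "Gold", "pool-2": "Silver", "pool-3": "Bronze"}

def features_for(pool):
    """Return the protection features applied to data in a given pool."""
    return policies[virtual_pools[pool]]["features"]

print(features_for("pool-2"))
```

This mirrors the last bullet of the solution: the policy becomes an attribute of the pool, so the controller can apply it automatically to whatever is stored there.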



Cloud-Based Data Protection


Cloud-Based Data Protection


The main objectives of the topic are to:


→ Describe cloud computing and its essential characteristics.
→ Describe cloud service models and the cloud deployment models.
→ Describe cloud-based backup, replication, archiving, and migration.
→ Apply the concepts of cloud in a data protection environment.


Cloud Computing Overview


Objectives

The objectives of the topic are to:

• Understand traditional IT vs. cloud computing.
• Review essential cloud characteristics.
• Explore cloud service models.
• Explore cloud deployment models.

What is Cloud Computing

(Image: desktops, thin clients, and mobile devices access a cloud infrastructure of compute systems, network, storage, applications, and platform software.)

• According to NIST, “Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction.”
• Consumers pay only for the services that they use, either based on a subscription or based on resource consumption.

To learn more about Cloud Computing, click here.

Traditional IT vs. Cloud Computing

Traditional IT | Cloud Computing

IT resources are owned and managed | IT resources are rented as services

Needs considerable time to acquire and provision resources | On-demand resource provisioning and scalability

Lacks ability to support needed business agility | Self-service provisioning of resources

IT resources are planned for peak usage | Resource consumption is metered

Underutilized resources | Provides business agility and high utilization

High up-front CAPEX | Offers reduced CAPEX

To understand more about the differences between Traditional IT and Cloud Computing, click here.


Essential Cloud Characteristics

(Image: the five essential cloud characteristics – rapid elasticity, measured service, on-demand self-service, broad network access, and resource pooling.)

The five essential characteristics or tenets of a cloud (as defined by NIST) are:

Rapid elasticity

• Capabilities can be elastically provisioned and released, and with automation these can even be rapidly and automatically scaled outward and inward, commensurate with demand.
• To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.

On-Demand self-service

• End users can provision computing capabilities themselves, allowing them to allocate resources such as server time and network storage as needed, automatically, without requiring human interaction with each service provider.

Resource pooling

• The provider’s computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand.


• There is a sense of location independence in that the customer generally has no control or knowledge over the exact location of the provided resources but may be able to specify location at a higher level of abstraction (e.g., country, state, or datacenter).
• Examples of resources include storage, processing, memory, and network
bandwidth.

Measured service

• Cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts).
• Resource usage can be monitored, controlled, and reported, providing
transparency for both the provider and consumer of the utilized service.

Broad network access

• Capabilities are available over the network and accessed through standard
mechanisms that promote use by heterogeneous thin or thick client platforms
(e.g., mobile phones, tablets, laptops, and workstations).
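Rapid elasticity combined with measured service can be illustrated by a toy autoscaling rule. The per-instance capacity figure and the decision logic below are made-up examples, not any provider's algorithm:

```python
CAPACITY_PER_INSTANCE = 100   # requests/s one instance can serve (assumed)

def scale(instances, demand):
    """Return the instance count after one autoscaling decision:
    scale out when demand exceeds capacity, scale in when it drops,
    but always keep at least one instance running."""
    needed = -(-demand // CAPACITY_PER_INSTANCE)   # ceiling division
    return max(needed, 1)

print(scale(2, 450))   # demand spike: scale out to 5 instances
print(scale(5, 80))    # demand drop: scale in to 1 instance
```

Metering makes this loop possible: because resource usage is measured continuously, capacity can track demand instead of being planned for the peak, which is exactly the contrast with traditional IT drawn earlier.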

Cloud Service Offering Examples

(Image: cloud service offering examples – secured online backup, on-demand computing resources, trials of platforms and infrastructure before investing, temporary rental of resources such as a word processing application, and access to the latest technology for organizations that cannot afford up-front investment or seasonal peaks.)

To learn more about Cloud Service Offering, click here.


Cloud Service Models

Infrastructure as a Service (IaaS)

• Provides the capability to the consumer to hire infrastructure components such as servers, storage, and network.
• Enables consumers to deploy and run software, including OSs and applications.
• Consumers pay for infrastructure component usage, for example, storage capacity and CPU usage.

To learn more about Cloud Service Models and Infrastructure as a Service, click
here.


Platform as a Service (PaaS)

• Capability provided to the consumer to deploy consumer-created or acquired applications on the provider’s infrastructure.
• The consumer has control over:
− Deployed applications.
− Possible application hosting environment configurations.
• The consumer is billed for platform software components:
− OS, database, middleware.


To learn more about the Platform as a Service model, click here.


Software as a Service (SaaS)

• Capability provided to the consumer to use the provider’s applications running in a cloud infrastructure.
• The complete stack, including the application, is provided as a service.
• The application is accessible from various client devices, for example, via a thin client interface such as a web browser.
• Billing is based on application usage.

To learn more about Software as a Service, click here.


Cloud Deployment Models

• Cloud deployment models provide the basis for how cloud infrastructure is built, managed, and accessed.
• Each cloud deployment model may be used for any of the cloud service models: IaaS, PaaS, and SaaS. The different deployment models present a number of tradeoffs in terms of control, scale, cost, and availability of resources.


Public Cloud

(Image: enterprises P and Q and individual R consume the cloud provider’s resources.)

IT resources are made available to the general public or organizations and are
owned by the cloud service provider.

To learn more about Public Cloud, click here.

Private Cloud

(Image: a private cloud may be externally hosted, with cloud provider resources dedicated for Enterprise P, or an on-premise private cloud built from Enterprise P’s own resources.)


Cloud infrastructure is operated solely for one organization and is not shared with
other organizations. This cloud model offers the greatest level of security and
control.

To learn more about Private Cloud, click here.

Multi Cloud

The multi-cloud approach is taken to meet business demands if no single cloud model can suit the various requirements and workloads.

Some application workloads run better on one cloud platform, while other workloads achieve higher performance and lower cost on another one.

The wide variety of business requirements results in a need for various cloud offerings. For example, one might use Amazon EC2 for computing and Microsoft Azure for data lake storage while leveraging Google Cloud SQL.

Cost optimization, availability, and performance requirements are other factors contributing to the selection of multiple cloud offerings.

Some organizations also pursue multi-cloud strategies for data sovereignty or regulatory reasons. Certain laws, regulations, and organization policies require enterprise data to physically reside in certain locations.


Community Cloud -On-Premise

(Image: enterprises P and Q contribute their own resources to an on-premise community cloud that is also used by Enterprise R.)

One or more participant organizations provide cloud services that are consumed by
the community.

To learn more about On-Premise Community Cloud, click here.


Community Cloud - Externally Hosted

(Image: community users from enterprises P, Q, and R consume resources dedicated for the community within the cloud provider’s resources.)

IT resources are hosted on the premises of the external cloud service provider and
not within the premises of any of the participant organizations.

To learn more about Externally Hosted Community Cloud, click here.


Hybrid Cloud

(Image: Enterprise P’s private cloud, built from its own resources, combines with a public cloud whose provider resources are also used by Enterprise Q and individual R.)

IT resources are consumed from two or more distinct cloud infrastructures (private,
community, or public).

To learn more about Hybrid Cloud, click here.

Cloud Benefits

• Provides the capability to provision IT resources quickly and at any time, thereby considerably reducing the time required to deploy new applications and services. This enables businesses to reduce time-to-market and to respond more quickly to market changes.


(Image: cloud benefits – business agility, flexibility of access, reduced IT cost, simplified infrastructure management, high availability, increased collaboration, flexible scaling, business continuity, and pay-for-use.)

To understand more about the benefits of Cloud, click here.


Knowledge Check: Cloud Computing Overview

Knowledge Check Question

1. Match the following elements with their descriptions:

A. Hybrid Cloud
B. Community Cloud
C. Public Cloud
D. Private Cloud

A – IT resources are consumed from two or more distinct cloud infrastructures.
B – Cloud infrastructure that is set up for the sole use by a group of organizations with common goals or requirements.
C – IT resources are made available to the general public or organizations and are owned by the cloud service provider.
D – Cloud infrastructure is operated solely for one organization and is not shared with other organizations.


Cloud-Based Data Protection


Objectives

The objectives of the topic are to:

• Identify drivers for cloud-based data protection.
• Identify types of backup service.
• Review restoring data from the cloud.
• Review cloud-based replication.
• Understand Disaster Recovery as a Service.

Drivers for Cloud-based Data Protection

(Image: drivers for cloud-based data protection – simplified management, flexible scalability, on-demand self-service provisioning, reduced CAPEX, and recovery of data to any location or device.)

Organizations need to regularly protect their data to avoid losses, stay compliant, and preserve data integrity. Data explosion poses challenges such as strains on the backup window, IT budget, and IT management. The growth and complexity of the data environment, together with the proliferation of virtual machines and mobile devices, constantly outpaces existing data protection plans. Deployment of a new data protection solution takes weeks of planning, justification, procurement, and setup. Enterprises must also comply with regulatory and litigation requirements. These challenges can be addressed with the emergence of cloud-based data protection.


• Simplified Management: Configuration, applying the latest patches and updates, and carrying out upgrades and replacements.
• On-demand self-service provisioning: IT resources can be provisioned on demand through a service catalog.
• Reduced CAPEX: Enables the organization to hire IT resources based on pay-per-use or subscription pricing.
• Flexible Scalability: Provides the capability to scale in or scale out the resources as per the requirement.
• Recover data to any location/device: Enables the organization to recover data from any place to any device.

To learn more about Cloud-based Data Protection, click here.


Backup as a Service

(Image: backup clients in an on-premise data center back up data to cloud resources.)


• Enables consumers to procure backup services on demand.
• Reduces the backup management overhead.
• Backing up to the cloud ensures regular and automated backup of data.
• Gives consumers the flexibility to select a backup technology based on their current requirements.

To learn more about Backup as a Service, click here.

Types of Backup Services

The three common backup service deployment options that cloud service providers
offer to their consumers are:

• Local backup service (managed backup service).
• Remote backup service.
• Replicated backup service.


Local Backup Service (Managed Backup Service)


• Suitable when a cloud service provider already hosts consumer applications and data.
• The service is offered by the provider to protect the consumer’s data.
• Managed by the service provider.


Remote Backup Service

(Image: an agent running on the backup client at the consumer organization’s location sends backup data to cloud resources.)

• The service provider receives data from consumers.
• Managed by the service provider.

To learn more about Remote Backup Service, click here.


Replicated Backup Service

(Image: an agent on the backup client at the consumer organization’s location replicates local backup data to cloud resources.)

• Service provider manages only the data replication and the IT infrastructure at the disaster recovery site.
• Local backups are managed by the consumer.

Cloud to Cloud Backup

[Figure: Cloud-to-cloud backup – the consumer organization accesses a cloud-hosted (SaaS) application at service provider 1; a second service provider backs up that data from service provider 1's location to its own data center.]

Allows consumers to back up cloud-hosted application (SaaS) data to another cloud.


To read about an example of Cloud to Cloud Backup, click here.

Restoring Data from the Cloud

Web Based Restore

[Figure: Web-based restore – after a disaster at the organization's production data center, a user restores data from the cloud over the network.]

• Requested data is gathered and sent to the server running the cloud backup agent.
• Received data is in encrypted form. The agent software on the server decrypts the files and restores them on the server.
• Considered when sufficient bandwidth is available to download large amounts of data, or when the amount of data to restore is small.
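The decrypt-on-restore step can be illustrated with a toy symmetric cipher. The SHA-256-based keystream below is a hypothetical construction used purely for illustration; production backup agents use vetted ciphers such as AES:

```python
import hashlib

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a keystream from counted SHA-256 blocks (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypting and decrypting are the same XOR operation."""
    ks = _keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

def restore_file(key: bytes, nonce: bytes, encrypted_blob: bytes) -> bytes:
    """What the agent does on restore: decrypt the downloaded blob."""
    return xor_cipher(key, nonce, encrypted_blob)
```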


Media Based Restore

[Figure: Media-based restore – the provider copies the restore data to a set of backup media and ships it to the organization's data center.]

• If a large amount of data needs to be restored and sufficient bandwidth is not available, the consumer may request that the service provider perform the restore using backup media such as DVDs or disk drives.
• The service provider gathers the data to restore, copies it to a set of backup media, and ships the media to the consumer for a fee.

Use case: ROBO Backup in the Cloud

Challenges associated with ROBO backup:


[Figure: ROBO backup in the cloud – remote offices 1, 2, and 3 each back up to a central cloud service provider.]

• Lack of qualified IT staff with backup skills.
• Limited IT infrastructure to manage the backup copies.
• Huge volumes of redundant content.
• Siloed data repositories lead to security threats.
• High cost of managing backup across remote offices.

ROBO Backup in cloud:

• Cloud backup services deploy disk-based backup solutions along with source-
based deduplication to eliminate the challenges associated with centrally
backing up remote-office data.
• Backing up to the cloud reduces the cost of managing the organization’s ROBO backup environment.
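Source-based deduplication, which makes centralized ROBO backup practical, can be sketched as follows. The fixed chunk size and the dict-based chunk index are assumptions for illustration; real products typically use variable-length chunking and a server-side fingerprint index:

```python
import hashlib

CHUNK_SIZE = 4096  # fixed-size chunks, for simplicity

def chunk_hashes(data: bytes):
    """Split data into fixed-size chunks and fingerprint each one."""
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        yield hashlib.sha256(chunk).hexdigest(), chunk

def backup_with_dedup(data: bytes, cloud_chunks: dict) -> int:
    """Send only chunks the cloud has not seen; return bytes transferred."""
    sent = 0
    for digest, chunk in chunk_hashes(data):
        if digest not in cloud_chunks:   # source-side lookup before sending
            cloud_chunks[digest] = chunk
            sent += len(chunk)
    return sent
```

Because redundant chunks are detected at the source, repeated backups from many remote offices transfer only new data over the WAN.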


Replication to the Cloud

[Figure: Replication to the cloud – data from clients and backup storage in the data center is replicated to cloud resources.]

• Replicating application data and VMs to the cloud enables an organization to restart its applications from the cloud.
• Replication to the cloud can be performed using compute-based, network-based, or storage-based replication techniques.

To learn more about Replication to Cloud, click here.

Disaster Recovery as a Service

[Figure: DRaaS during normal operation – compute systems and storage in the consumer production data center replicate to the cloud service provider; standby VM instances are not yet invoked.]

• Service provider offers resources to enable consumers to run their IT services in the event of a disaster.

To know more about Disaster Recovery as a Service, click here.


Disaster Recovery as a Service (Continued)

[Figure: DRaaS upon disaster – VM instances at the cloud service provider are invoked to run the consumer's IT services.]

Service provider offers resources to enable consumers to run their IT services in the event of a disaster.

To learn more about Disaster Recovery as a Service, click here.


Knowledge Check: Cloud-Based Data Protection

Knowledge Check Question

Carefully study the given image.

[Figure: Backup is performed at the consumer organization's location, and the backup data is sent to the cloud for DR purposes.]

2. Which backup service deployment is shown in the above image?


a. Managed Backup Service
b. Replicated Backup Service
c. Remote Backup Service
d. Cloud-to-cloud Backup Service


Cloud-Based Data Archiving


Objectives

The objectives of the topic are to:

• Understand drivers for cloud-based data archiving.
• Review cloud-based archiving options.
• Review cloud-based storage tiering.
• Understand data migration to the cloud.
• Explore cloud-to-cloud data migration.
• Study the cloud gateway appliance.

Drivers for Cloud-based Data Archiving

Cloud-based data archiving is a process in which inactive data is moved to and stored in a cloud for long-term retention, adhering to regulatory and compliance requirements.

Cloud-based Data Archiving:

• Provides capital cost saving and agility.


• Reduces the complexity of managing archiving infrastructure.
• Services are accessed in a self-service manner.
• Solutions are highly scalable and available on demand as a service.
• Accessible from any location worldwide, using any device.


Cloud-based Archiving Options

Cloud-only Archiving

[Figure: Cloud-only archiving – an archive server (policy engine) determines, based on policies, which data from email servers and file servers on the primary storage system needs to be archived; critical inactive data is moved to archive storage in the organization's private cloud, and non-critical inactive data is moved to the public cloud.]

• Archives critical data to the on-premise archiving infrastructure and non-critical data to the cloud archiving infrastructure.
• Allows organizations to distribute the archiving workload and to make use of the public cloud for rapid resource provisioning.
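The policy-engine routing described above might look like the following sketch. The 90-day inactivity threshold, the record fields, and the target names are assumptions for illustration:

```python
import time

SECONDS_PER_DAY = 86400

def archive_target(record, now=None, inactive_days=90):
    """Route a file: active data stays on primary storage; inactive critical
    data goes to the private cloud, inactive non-critical data to the public
    cloud."""
    now = time.time() if now is None else now
    age_days = (now - record["last_access"]) / SECONDS_PER_DAY
    if age_days < inactive_days:
        return "primary"
    return "private-cloud" if record["critical"] else "public-cloud"
```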


Hybrid Archiving

[Figure: Hybrid archiving – an archive server (policy engine) in the organization's data center determines, based on policies, which data from email servers and file servers needs to be archived; inactive data (both critical and non-critical) on the primary storage system is moved to cloud-based archive storage.]

• Organization’s inactive data (both critical and non-critical) that meets the organization’s archiving policies is archived to the cloud.

− IaaS – the archiving server resides in the organization's data center, and the archiving storage resides in the cloud.
− SaaS – both the archiving server and the archiving storage reside on the cloud infrastructure.

Cloud-based Storage Tiering

[Figure: Cloud-based storage tiering – frequently accessed data resides on tier 1 (SSD storage); less frequently accessed data is moved to tier 2 (HDD storage); rarely used and non-critical data is moved to tier 3 (high-capacity, lower-cost cloud storage).]


Establishes a hierarchy of different storage types (tiers), including cloud storage as one of the tiers.
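A tiering policy engine of this kind can be sketched as a simple mapping from access recency to a tier. The thresholds, the tier names, and the rule that rarely used but critical data stays on tier 2 are assumptions for illustration:

```python
def select_tier(days_since_access, critical):
    """Map access recency (and criticality) to a storage tier."""
    if days_since_access <= 7:
        return "tier1-ssd"       # frequently accessed data
    if days_since_access <= 90:
        return "tier2-hdd"       # less frequently accessed data
    if not critical:
        return "tier3-cloud"     # rarely used, non-critical data
    return "tier2-hdd"           # rarely used but critical stays on-premises
```

A background task would periodically evaluate this policy for each object and move data between tiers accordingly.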

To learn more about Cloud Based Storage-tiering, click here.

Data Migration to the Cloud

[Figure: Data migration to the cloud – application servers, backup/archive servers, clients, and primary and archive storage in the organization's data center; data is moved from the data center to the cloud.]

• Process of moving data from an organization’s data center to the cloud.


• A replication technique is used to create remote point-in-time copies in the cloud. The migration is application- and OS-independent.

Cloud-to-cloud Data Migration

[Figure: Cloud-to-cloud data migration – data and applications are migrated from Cloud 1 to Cloud 2; migration between clouds requires integration tools that provide interoperability between them.]


• An organization may decide to migrate from one cloud provider to another when it identifies that the current provider is unable to meet the SLAs, is not adhering to security best practices, is not delivering acceptable performance, or cannot fulfill the organization's future requirements.
• Because different cloud vendors may use different protocols and architectures, data migration between clouds requires integration tools that migrate the data from one cloud to another.
• A cloud integration tool should provide features such as simplicity, flexibility, interoperability, data portability, data integrity, security, reliability, and ease of management.

Cloud Gateway Appliance

[Figure: Cloud gateway appliance – application servers in the data center access the appliance through file- and block-based storage interfaces; the appliance communicates with cloud storage over REST.]

• Performs protocol conversion to send data directly to the cloud storage.


• Resides in the data center and presents file and block-based storage interfaces
to applications.
• Supports deduplication and compression.
• Encrypts the data before it transmits to the cloud storage.
• Supports automated storage tiering capability.
• Provides a local cache to reduce latency.
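The gateway behaviors listed above (protocol conversion, compression, local caching) can be sketched as follows. The `rest_put`/`rest_get` callables stand in for the REST calls to cloud storage and are assumptions for illustration:

```python
import zlib

class CloudGateway:
    """Toy gateway: accepts file writes, compresses the data, caches it
    locally, and forwards it to cloud storage via a REST-like callable."""
    def __init__(self, rest_put):
        self.rest_put = rest_put   # would wrap an HTTP PUT in a real appliance
        self.cache = {}            # local cache to reduce read latency

    def write(self, path, data: bytes):
        compressed = zlib.compress(data)   # fewer bytes sent to the cloud
        self.cache[path] = data
        self.rest_put(path, compressed)

    def read(self, path, rest_get):
        if path in self.cache:             # cache hit: no cloud round trip
            return self.cache[path]
        data = zlib.decompress(rest_get(path))
        self.cache[path] = data
        return data
```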


To learn more about Cloud Gateway Appliance, click here.


Knowledge Check: Cloud-Based Data Archiving

Knowledge Check Question

3. Which archiving method is most suitable and cost effective for a large
organization having both sensitive data and non-sensitive data?
a. Cloud-only Archiving
b. Hybrid Archiving
c. In-house Archiving


Concepts in Practice



Dell Technologies Cloud Storage for Multi-cloud

Enables users to connect their file and block storage – Dell EMC Unity,
PowerStore, PowerMax and PowerScale - consumed as a service, directly to public
cloud(s) including VMware Cloud on Amazon Web Services (AWS), AWS,
Microsoft Azure and Google Cloud Platform. This is done through a high-speed,
low latency connection from Dell EMC storage at a managed service provider to
the cloud or clouds of choice. Organizations gain an on demand, cloud
consumption model for both compute workloads and storage combined with the
high performance, up to six-nines (99.9999 percent) availability, and scalability of Dell EMC storage. This
solution is ideal for securely moving or deploying demanding applications to the
public cloud for disaster recovery, analytics, test/dev and more.

VMware Site Recovery Manager

VMware Site Recovery Manager automation software integrates with an underlying


replication technology to provide policy-based management, minimize downtime in
case of disasters via automated orchestration of recovery plans, and conduct
nondisruptive testing of your DR plans. It is designed for virtual machines (VMs)
and scalable to manage all applications in a VMware vSphere environment. To
deliver flexibility and choice, it integrates natively with VMware vSphere Replication
and VMware vSphere Virtual Volumes™ integrated storage arrays and supports a
broad range of array-based replication solutions available from all major VMware
storage partners. Site Recovery Manager natively leverages the benefits of
vSphere and can also take advantage of VMware Cloud Foundation, integrating
with other VMware solutions such as VMware NSX (network virtualization) and
VMware vSAN (hyperconverged infrastructure).

VMware Cloud on Dell EMC

VMware Cloud on Dell EMC is a fully managed hybrid cloud service that combines
the simplicity and agility of the public cloud with the security and control of on-
premises infrastructure. Delivered as a service to data center and edge locations,


VMware Cloud on Dell EMC and its hybrid cloud services provide simple, secure,
and scalable infrastructure. Enable intrinsic security, including encryption for data
at rest and in transit. For additional security, there are micro-segmentation
capabilities available through VMware NSX. VMware Cloud on Dell EMC simplifies
the management of your data center services and edge infrastructures with an
offering that is fully managed, subscription based, and delivered as-a-service.

Virtustream Enterprise Cloud

Virtustream Enterprise Cloud is built to run complex, I/O-intensive, mission-critical


enterprise applications. It offers guaranteed availability and performance backed by
industry-leading SLAs, rigorous end-to-end security, and government and industry-
specific compliance solutions. In addition, Virtustream Enterprise Cloud includes a
full suite of professional and managed services from the infrastructure up to the
application layer. Virtustream Enterprise Cloud delivers superior value to some of
the world’s largest enterprises, while reducing the complexities of their IT
operations and the inherent business risk of operating mission critical applications.
Virtustream Enterprise Cloud is managed by the xStream Cloud Management
Platform. xStream provides a unified, control plane that integrates infrastructure
orchestration, enterprise application automation and a suite of business intelligence
and service management tools to run mission critical applications in private, public,
and hybrid clouds.

Dell Technologies Cloud Platform

Dell Technologies Cloud Platform (DTCP) delivers application-ready cloud


infrastructure with preconfigured instance-based offerings supporting a wide range
of enterprise workloads running on virtual machines or containers. In a few clicks,
you can now subscribe to instances designed for your workloads through the Dell
Technologies Cloud Console—and get it deployed in your datacenter, co-location
facility, and edge locations. With a simple way to size and order on-premises cloud
resources, this enables you to focus on your applications instead of infrastructure
procurement and upgrades. Develop, test, and run both cloud native and traditional
applications on a single platform to deliver a simple path to on-premises cloud
through automated operations. Instances deliver standardized combinations of
compute (in some cases with GPU/accelerators), memory, storage, and networking
resources, on which a virtual machine or container can run—powered by Dell EMC
VxRail.


Exercise: Cloud-based Data Protection



1. Present Scenario:

A product-based company

• Uses its own IT resources.

• Has multiple remote offices/branch offices (ROBO) across the globe.

• Plans to build the cloud infrastructure using existing IT infrastructure


components.

• Has heterogeneous infrastructure components.

• Does not have remote site for DR purpose.

• Performs archiving within its data center.

2. Organization Challenges:

• Exposes the business to the risk of losing data at ROBO sites.

− Lack of qualified IT staff with backup skills.


− Less IT infrastructure to manage the backup copies.
• Does not have adequate resources to manage peak workload that occurs
from time to time.

• Does not want to build and manage its own DR site due to budget
constraint.

• Increases the complexity and cost while managing the huge volume of
inactive data within its data center.

3. Organization Requirements:

• To protect data at remote sites using OPEX cost model.

• To manage the peak workload that occurs from time to time.


• To protect the data at DR site without involving CAPEX.

• To reduce the complexity and cost in managing archived data.

4. Expected Deliverables:

Propose a solution that will address the organization’s challenges and


requirements.

Solution

The proposed solution is as follows:

• Implement ROBO cloud backup solution to back up data to a centralized


location (cloud).
− Provides an effective solution to address the data backup and recovery
challenges of remote and branch offices.
• Deploy Hybrid Cloud model to accommodate the peak workload that may occur
from time to time.
• Adopt Disaster Recovery-as-a-Service (DRaaS) in cloud which reduces the
need for data center space and IT infrastructure and eliminates the need for
upfront capital expenditure.
• Implement Cloud-based archiving to reduce the complexity of managing
archiving infrastructure which enables capital cost savings.



Protecting Big Data and Mobile Device Data


The main objectives of the topic are to:


→ Describe the characteristics of Big Data.
→ Explain Big Data analytics.
→ Describe the key data protection solutions for data lake.
→ Describe Big Data as a Service.
→ Describe mobile device backup.
→ Describe cloud-based mobile device data protection.
→ Apply data protection to Big Data and mobile device environments.


Big Data Overview


Objectives

The objectives of the topic are to:

• Explore the characteristics of Big Data.


• Review Big Data analytics.
• Understand Hadoop File System (HDFS).
• Learn about data lake.
• Examine some Big Data analytics use cases.

What is Big Data?

• Big Data represents the information assets whose high volume, high velocity,
and high variety require the use of new technical architectures and analytical
methods to gain insights and derive business value.
• The definition of Big Data has three principal aspects:

Characteristic of Data

• Apart from its considerable size (volume), the data is generated rapidly (velocity) and is highly complex because it comes from diverse sources (variety). Nearly 80–90 percent of the data being generated is unstructured.

Data Processing Needs

Big Data exceeds the storage and processing capability of conventional IT infrastructure and software systems, and therefore requires:

• A highly scalable architecture for efficient storage, and new and innovative technologies and methods for programming and processing to realize business benefits.


• Use of platforms such as distributed processing, massively-parallel processing, machine learning, and so on.
• New analytical and IT skills are required, along with business and domain knowledge, in a complex data-centric business environment.

Business Value

• Big Data has tremendous business importance to organizations and even to the advancement of society.
• Proper analysis of Big Data helps make better business decisions and adds value to the business.
• Big Data analytics has many applications spanning numerous industry sectors and scientific fields.

Characteristics of Big Data

In 2001, Gartner analyst Douglas Laney specified volume, velocity, and variety as
the three dimensions of the challenges associated with data management. These
dimensions— popularly known as “the 3Vs"—are now widely accepted in the
industry as the three primary characteristics of Big Data. In addition to the 3Vs,
there are three other characteristics identified by the industry namely variability,
veracity, and value.

Volume

• “Big” in Big Data refers to the massive volumes of data.


• Growth in data of all types such as transaction-based data stored over the
years, sensor data, and unstructured data streaming in from social media.
• Growth in data is reaching Petabyte—and even Exabyte—scales.
• Requires substantial cost-effective storage, but also gives rise to challenges in
data analysis.


Velocity

• Refers to the rate at which data is produced and changes, and how fast the
data must be processed to meet business requirements.
• Real-time or near real-time analysis of the data is a challenge for many
organizations.

− For example: real-time face recognition for screening passengers at airports.

Variety

• Variety (also termed as “complexity”) refers to the diversity in the formats and
types of data.
• Data is generated by numerous sources in various structured and unstructured
forms. New insights are found when these various data types are correlated and
analyzed.
• Pertains to challenge of managing, merging, and analyzing different varieties of
data in a cost-effective manner.
• The combination of data from a variety of data sources and in a variety of
formats is a key requirement in Big Data analytics.

− For example: Combining a large number of changing records of a particular


patient with various published medical research to find the best treatment.

Variability

• Variability (unlike variety) refers to the constantly changing meaning of data,


particularly when data collection and analysis involve Natural Language
Processing.

− For example, natural language search and analyzing social media posts
require interpretation of complex and highly-variable grammar. The
inconsistency in the meaning of data gives rise to challenges related to
gathering the data and in interpreting its context.


Veracity

• Refers to the reliability and accuracy of data. Accuracy of analysis depends on


the veracity of the source data.
• Establishing trust in Big Data presents a major challenge because as the variety
and number of sources grow, the likelihood of noise and errors in the data
increases.
• Significant effort goes into cleaning data to remove noise and errors, and to
produce accurate data sets before analysis can begin.

− For example, a retail organization may have gathered customer behavior


data from across systems to analyze product purchase patterns and predict
the purchase intent.

Value

• Refers to both the cost-effectiveness and the business value derived from the
use of Big Data analytics technology.
• Many organizations have maintained large data repositories such as data
warehouses, managed non-structured data, and carried out real-time data
analytics for many years.

Why Big Data Analytics?

Business Driver → Examples

• Desire to optimize business operations → Sales, pricing, profitability, efficiency.
• Desire to identify business risk → Loss of customers, fraud, default.
• Predict promising new business opportunities → Upsell, cross-sell, best new customer prospects.
• Comply with laws or regulatory requirements → Anti-money laundering, fair lending, Basel II–III, Sarbanes-Oxley (SOX).

To learn more about Big Data Analytics, click here.


Big Data Analytics

Process of examining data to determine a useful piece of information or insight. The primary goal of Big Data analytics is to help organizations improve business decisions.

Components of a Big Data analytics solution – the SMAQ stack (Storage, MapReduce, Query).

Storage

• Distributed architecture (HDFS).


• Non-relational, unstructured content.

MapReduce

• Distributes (parallel) computation over many servers.


• Batch processing model.

Query

• Efficient way to process, store and retrieve data.


• Platform for user-friendly analytics systems.
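The MapReduce layer of the stack can be illustrated with the classic word-count example, with the shuffle step written out explicitly. In a real Hadoop job the framework performs the shuffle and distributes the map and reduce tasks across many servers; this single-process sketch only shows the programming model:

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Map: emit (word, 1) pairs from one document."""
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

def word_count(documents):
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    return reduce_phase(shuffle(mapped))
```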

To learn more about SMAQ Stack, click here.

Hadoop Distributed File System (HDFS)

HDFS is a distributed file system that provides access to data across nodes – collectively called a “Hadoop cluster”. The HDFS architecture has two key components:


[Figure: Hadoop cluster – clients communicate with the Name Node; Data Nodes are distributed across Rack 1 and Rack 2.]

• Name Node:
− Acts as the primary server and holds in-memory maps of every file, the file locations, and all the blocks within each file along with the Data Nodes on which they reside.
− Responsible for managing the file system (FS) namespace and controlling file access by clients.
• Data Node:
− Acts as a secondary server that serves read/write (R/W) requests and performs block creation, deletion, and replication.
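The Name Node / Data Node split can be sketched as a toy metadata service. The round-robin block placement is a simplification for illustration; the replication factor of 3 matches HDFS's default:

```python
import itertools

class NameNode:
    """Toy metadata server: maps each file to its blocks and each block to
    the Data Nodes holding a replica."""
    def __init__(self, data_nodes, replication=3):
        self.data_nodes = data_nodes
        self.replication = replication
        self.files = {}          # filename -> [block ids]
        self.block_map = {}      # block id -> [data nodes]
        self._ids = itertools.count()

    def add_file(self, name, num_blocks):
        blocks = []
        for _ in range(num_blocks):
            block_id = next(self._ids)
            # Round-robin placement: `replication` copies on distinct nodes.
            nodes = [self.data_nodes[(block_id + r) % len(self.data_nodes)]
                     for r in range(self.replication)]
            self.block_map[block_id] = nodes
            blocks.append(block_id)
        self.files[name] = blocks

    def locate(self, name):
        """Clients ask the Name Node where blocks live, then read directly
        from the Data Nodes."""
        return {b: self.block_map[b] for b in self.files[name]}
```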

Data Lake – Repository for Big Data

Evolution of an Enterprise Data Warehouse (EDW) into an active repository for structured, semi-structured, and unstructured data.

Data is classified, organized, or analyzed only when it is accessed.


[Figure: Data lake pipeline – data flows from sources through ingest, store, analyze, surface, and act stages.]


Big Data Analytics Use Cases

Use Case → Description

Healthcare:
• Analyze consolidated diagnostic information.
• Monitor patients in real-time.
• Improve patient care and services.

Finance:
• Analyze purchase history and create customer profiles.
• Improve sales promotions.
• Enable fraud detection.

Retail:
• Analyze historical transactions, pricing, and customer behavior.
• Optimize pricing, anticipate demand, and improve marketing and inventory management.

Government:
• Manage and use Big Data in social services, education, defense, crime prevention, finance, and so on.
• Improve existing processes and enable new ventures.


Knowledge Check: Big Data Overview

Knowledge Check Question

1. Which component is responsible for managing the FS namespace and controlling access to files by clients in an HDFS environment?
a. Data Node
b. Secondary Node
c. Name Node
d. Database Node


Protecting Big Data


Objectives

The objectives of the topic are to:

• Understand Big Data protection challenges.


• Examine key data protection solutions for data lake.
• Explore Big Data as a Service.
• Review data protection optimization method.

Big Data Protection Challenges

Protecting a Big Data environment requires new strategies for using existing tools, and the adoption of new technologies that help protect the data more efficiently.

• Need to protect massive volumes of data which exceeds the capabilities of


traditional data protection solutions.
• Hard to determine what data needs to be protected.
• More data to protect may lengthen backup windows and affect service level agreements.
• Requires seamless integration of data repository (data lake) with data
protection software.
• Difficult to protect the data within budget.

To learn more about Big Data Protection Challenges, click here.


Data Lake – Repository for Big Data

[Figure: Data lake pipeline – data flows from sources through ingest, store, analyze, surface, and act stages.]

• Evolution of an Enterprise Data Warehouse (EDW) into an active repository for structured, semi-structured, and unstructured data.
• Data is classified, organized, or analyzed only when it is accessed.

To learn more about Data Lake as a Repository for Big Data, click here.

Key Data Protection Solutions for Data Lake

[Figure: Key data protection solutions for the data lake – backup and deduplication, replication, mirroring and erasure coding, and cloud-based protection.]


Backup and Deduplication

[Figure: Backup of an HDFS data lake – backup data is copied, using the DistCp tool, to a backup device.]
Backup of a Snapshot created in a Scale-out NAS using NDMP.

• HDFS data lake is created in a scale-out NAS.


• Snapshot is created for the data to be protected and is backed up using NDMP.
• Snapshot data can be backed up to a backup device – scale-out NAS, and
scale-out object storage.

Replication

[Figure: Replication – a snapshot is created for the data to be replicated in a scale-out NAS; the HDFS data lake is replicated from one scale-out NAS to another.]


• Snapshot is created for the data to be protected.


• Snapshot data can be replicated to another scale-out NAS within a data center
or across data centers.
• Data can be synchronously or asynchronously replicated across data centers.

Mirroring and Erasure Coding

Data Mirroring:

• Data is mirrored to multiple nodes.
• If the cluster is set up for 3X mirroring, the original file is stored along with two copies of the file in various locations within the cluster.
• Requires more storage space.

Parity Protection (Erasure Coding):

• Method to protect striped data from disk drive failure or node failure.
• Data is fragmented, encoded with parity data, and stored across a set of different locations (drives and nodes).
• Supports higher levels of protection than RAID and provides good space efficiency compared to mirroring.
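The space-efficiency difference is easy to see with single-parity erasure coding, sketched below with XOR. This is a RAID-5-style scheme shown only for illustration; real erasure codes such as Reed-Solomon tolerate multiple simultaneous failures:

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

def encode(data_blocks):
    """Store the data fragments plus one XOR parity fragment (N+1 total)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def recover(fragments, lost_index):
    """Rebuild a single lost fragment from the survivors."""
    survivors = [f for i, f in enumerate(fragments) if i != lost_index]
    return xor_blocks(survivors)
```

Three data fragments are protected with just one extra parity fragment (1.33× the raw capacity), whereas 3X mirroring would need 3× the capacity.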

To learn more about Data Mirroring and Parity Protection, click here.

Big Data as a Service

Service provider offers resources that enable consumers to run Big Data analytics workloads in the cloud.

Big Data - Infrastructure as a Service

• Typically, the service provider offers infrastructure (Compute as a Service, Storage as a Service) to store and process the huge volume of data.


Big Data - Platform as a Service

• Allows the consumers to analyze and build analytics applications on top of huge
volume of data. The service provider offers platform (database, Hadoop) and
cloud infrastructure to run or build analytics applications.

Big Data - Software as a Service (Analytics)

• Consumers interact with an analytics application on a higher abstraction level;


that is, they would typically execute scripts and queries or generate reports.
• Service provider offers the complete stack including infrastructure to host data
lake for big data, platform software, and big data analytics application.


Data Protection Optimization Method

Incremental backup

• Copies only the data that has changed since the last backup, so fewer files are backed up daily.
• Allows for shorter backup windows.

Deduplication

• Stores only unique data on data protection storage.
• Reduces the backup window.
• Eliminates redundant data, which can significantly shrink storage requirements and reduce bandwidth requirements.

Compression

• Reduces the storage capacity requirement for backup and replication.
• The compression rate depends on the type of data being compressed.
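As a rough illustration of how deduplication stores only unique data, the following Python sketch (a hypothetical model with an arbitrary fixed chunk size, not any product's algorithm) hashes fixed-size chunks and keeps each unique chunk exactly once:

```python
import hashlib

class DedupStore:
    """Minimal sketch: fixed-size chunking with content-hash deduplication."""

    def __init__(self, chunk_size: int = 8):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes, each stored only once

    def write(self, data: bytes) -> list:
        """Store data; return its recipe (ordered list of chunk digests)."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # duplicates are skipped
            recipe.append(digest)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Reassemble the original data from its recipe."""
        return b"".join(self.chunks[d] for d in recipe)

store = DedupStore()
r1 = store.write(b"AAAAAAAA" * 4)                  # four identical chunks
r2 = store.write(b"AAAAAAAA" * 2 + b"BBBBBBBB")    # mostly redundant data
assert store.read(r1) == b"AAAAAAAA" * 4
# 56 bytes were written, but only two unique chunks are kept:
print(len(store.chunks))  # 2
```

Only the unique chunks consume protection storage; repeated writes of the same content add nothing, which is why deduplication shrinks both capacity and bandwidth requirements.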


Knowledge Check: Protecting Big Data

Knowledge Check Question

2. Which native utility is built into HDFS to back up and restore data from the data
lake to a backup device?
a. HDFS Mirroring
b. Hadoop Distributed Copy
c. Erasure Coding
d. Hadoop Data Copy


Protecting Mobile Devices

Protecting Mobile Devices

Objectives

The objectives of the topic are to:

• Identify challenges in protecting mobile device data.


• Understand mobile device backup.
• Explore the File sync-and-share application.
• Review mobile cloud computing.
• Understand cloud-based mobile device data protection.

Mobile Device Overview

A mobile device is a portable compute system, typically handheld, with a display and a keyboard and/or touch input.


• Enables users to access applications and information from their personal devices from any location.
• Increases collaboration and enhances workforce productivity.

To learn more about Mobile Devices, click here.

Key Challenges in Protecting Mobile Device Data

• Data is protected (backed up) only when the mobile device is online.
• Data protection from mobile device to data center is impacted by intermittent network connectivity.
• Devices are not always connected to the corporate network, so they connect over the Internet, which may give rise to security threats.
• Data protection software must support the mobile device OS.
• Network bandwidth limitations.
• Security features on mobile devices restrict access to the data stored on the device.

To learn more about Challenges in Protecting Mobile Device Data, click here.

Mobile Device Backup

[Figure: Mobile backup clients send backup data to the enterprise data center.]


• Requires installing a backup client application (agent) on the mobile devices.

− Backs up the data to the enterprise data center.
• Data can be backed up manually or automatically from mobile devices.
• Deduplication, compression, encryption, and incremental backup can be implemented when performing mobile device backup.

− Provides network and backup storage optimization, and security.


To read about Mobile Device Backup, click here.

File Sync-and-Share Application

[Figure: Files are synchronized between mobile devices and a remote file server in the enterprise data center through the file sync-and-share application server; files are then backed up from file storage rather than from the devices.]

• Automatically establishes two-way synchronization between the device and a designated network location (enterprise data center).
• Files are backed up from the remote storage instead of the mobile devices.
• Improves productivity by allowing users to access data from any device,
anywhere, at any time.

To learn more about File Sync-and-Share Applications, click here.
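The two-way synchronization described above can be sketched as a last-writer-wins merge by modification time. This is a deliberate simplification for illustration; real sync-and-share products also handle conflicts, deletions, and partial transfers.

```python
# Hypothetical sketch of two-way file synchronization: the newer copy of
# each file (by modification time) wins on both sides.

def sync(device: dict, server: dict) -> None:
    """Each dict maps filename -> (mtime, content). Mutates both in place."""
    for name in set(device) | set(server):
        d, s = device.get(name), server.get(name)
        if d is None:            # file exists only on server -> push to device
            device[name] = s
        elif s is None:          # file exists only on device -> push to server
            server[name] = d
        elif d[0] > s[0]:        # device copy is newer -> update server
            server[name] = d
        elif s[0] > d[0]:        # server copy is newer -> update device
            device[name] = s

device = {"a.txt": (10, "draft"), "b.txt": (5, "old")}
server = {"b.txt": (8, "new"), "c.txt": (3, "shared")}
sync(device, server)
assert device == server == {
    "a.txt": (10, "draft"),
    "b.txt": (8, "new"),
    "c.txt": (3, "shared"),
}
```

After synchronization both sides hold an identical, up-to-date file set, which is why the backup can then be taken from the server-side storage instead of the device.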


Mobile Cloud Computing

[Figure: Mobile devices access cloud resources.]

• Compute processing and storage are moved away from the mobile device and take place in a computing platform located in the cloud.
• Applications running in the cloud are accessed over a wireless connection using a thin client application or web browser on the mobile devices.
• Cloud services are accessed over mobile devices.

− SaaS examples: cloud storage, travel and expense management, and CRM.


Cloud-based Mobile Device Data Protection

[Figure: Mobile devices back up data to, and restore data from, cloud resources.]

• A backup client application (agent) installed on the device enables backups to the cloud.
− Typically backs up only the changed blocks to the cloud storage.
• Some mobile applications have a built-in backup feature that backs up the data to the cloud.
• Most cloud backup solutions available today offer a self-service portal that allows users to recover data without manual intervention.

To learn more about Mobile Devices, click here.
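A minimal sketch of the changed-block behavior described above: the client keeps the block hashes from the last backup and uploads only the blocks whose hashes differ. The block size and hash choice below are arbitrary illustrations, not a real product's parameters.

```python
import hashlib

def changed_blocks(current: bytes, last_hashes: list, block: int = 4):
    """Return (blocks to upload, new hash list) for a changed-block backup."""
    to_upload, new_hashes = {}, []
    for i in range(0, len(current), block):
        chunk = current[i:i + block]
        h = hashlib.md5(chunk).hexdigest()
        new_hashes.append(h)
        idx = i // block
        # Upload only blocks that are new or whose content changed:
        if idx >= len(last_hashes) or last_hashes[idx] != h:
            to_upload[idx] = chunk
    return to_upload, new_hashes

v1 = b"AAAABBBBCCCC"
_, hashes_v1 = changed_blocks(v1, [])      # first (full) backup
v2 = b"AAAAXXXXCCCC"                       # only the middle block changed
upload, _ = changed_blocks(v2, hashes_v1)
assert list(upload) == [1] and upload[1] == b"XXXX"
```

Only one of the three blocks crosses the network on the second backup, which matters on the constrained, intermittent links typical of mobile devices.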

Benefits of Cloud-based Backup for Mobile Devices

Comprehensive business continuity and accessibility

• Establishing a common repository of data that is used by the entire organization leads to comprehensive business continuity and protection against data loss across the organization.


• Ideal for larger businesses or multiple branch offices working together over
geographically dispersed environments.
• Backup solutions that span multiple types of devices (from servers to PCs to mobile devices) serve the organization well by providing business continuity and reducing backup and data-sharing complexities.

Lower capital cost

• Depending on the level of adoption, a cloud-based backup service allows the customer to reduce, if not eliminate, the capital spent on storage capacity (disk, tape, and/or other removable media) associated with backup processes.
• With a subscription-based pricing model, the cost associated with backup becomes a more predictable operational expenditure.

Reduced management complexity

• The limited IT staff is not overwhelmed, because cloud-based backup can reduce or eliminate cumbersome management procedures that must be manually monitored and maintained.

Increased backup consistency

• Cloud-based backup easily institutes policies that govern backup processes and
access control.
• Establishing particular levels of service can be well defined through service-
level agreements (SLAs).


Knowledge Check: Protecting Mobile Devices

Knowledge Check Question

3. What is a key benefit of the file sync-and-share application?


a. Offers a self-service portal that allows users to recover data without manual
intervention.
b. Improves the productivity by allowing users to access data from any
device, anywhere, at any time.
c. Moves compute processing and storage from the mobile device and takes
place in a computing platform located in the cloud.
d. Recovers the corrupted data from a copy that is stored in the cloud.


Concepts in Practice

Concepts in Practice

Dell EMC Ready Solutions for Data Analytics

ECS, the leading object storage platform from Dell EMC, provides unmatched
scalability, performance, resilience, and economics.

• Deployable as a turnkey appliance or in a software-defined model.


• Delivers rich S3-compatibility on a globally distributed architecture, empowering
organizations to support enterprise workloads such as cloud-native, archive,
IoT, AI, and big data analytics applications at scale.
• Stores unstructured data at public cloud scale with the reliability and control of a
private cloud.
• Capable of scaling to exabytes and beyond.
• Empowers organizations to manage a globally distributed storage infrastructure
under a single global namespace with anywhere access to content.


Exercise: Data Protection in Big Data and Mobile Device Environment
1. Present Scenario:

An organization runs business applications and internal applications across data centers.

• Plans to implement big data analytics for their business along with
necessary data protection solutions.

• Provides mobile banking applications to its customers and employees.

− Business critical data resides in mobile devices.


− Supports BYOD (Bring Your Own Device).
2. Organization Challenges:

• Currently, it does not have infrastructure to support big data analytics and its
protection.

− They do not have the budget to implement the infrastructure.


• Lack of IT professionals to manage the big data analytics infrastructure.

• Facing challenges in sharing documents among employees, which impacts the collaborative work culture.

• Mobile device theft causes critical data loss.

3. Organization Requirements:

• Need a solution to implement big data analytics, but with an OPEX cost model.

• Need a solution to effectively share documents among employees to improve collaborative work.

• Need a solution to effectively protect the data on mobile devices.


4. Expected Deliverables:

Propose a solution that will meet the organization’s requirements.

Solution

The proposed solution is as follows:

• Adopt cloud-based big data analytics solutions:


− No CAPEX
− Data can be protected in the cloud itself
• Deploy a file sync-and-share application that improves collaborative work.
− Improves productivity by allowing employees to access and share documents (files) from any device, anywhere, at any time.
• Back up the data from mobile devices to the organization's data center or to the cloud to protect the mobile device data.

Securing the Data Protection Environment

Securing the Data Protection Environment

The main objectives of the topic are to:


→ Describe the drivers for data security.
→ Describe the key security terminologies.
→ Describe Governance, Risk Management, and Compliance.
→ Describe the security threats in a data protection environment.
→ Explain the key security controls in a data protection environment.
→ Apply the concept of security in a data protection environment to address the organization's challenges and requirements.


Overview of Data Security

Overview of Data Security

Objectives

The objectives of the topic are to:


→ Understand the key drivers for data security.
→ Define various key security terminologies.
→ Understand the concepts of governance, risk and compliance.

Introduction to Data Security

[Figure: Data security and management: business applications using primary storage (the data source) interact with backup, replication, and archiving applications using protection storage, all under data protection management services: discovery, operations management, and orchestration.]

Data is an organization's most valuable asset. An organization's data:

• Includes intellectual property, personal identities, and financial transactions.
• Requires protection against events such as component failures, disasters, and security attacks.
• Requires protection from unauthorized access, unauthorized modification, and unauthorized deletion.


Data security includes a set of practices that protect data and information systems
from unauthorized disclosure, access, use, destruction, deletion, modification, and
disruption.

• Involves implementing various kinds of safeguards or controls to lessen the risk of exploitation of a vulnerability in the information system.
• Organizations deploy various tools within their infrastructure to protect assets such as compute, storage, and network.

Drivers for Data Security

• The two key drivers for an organization's data security are Confidentiality, Integrity, and Availability (CIA) and Governance, Risk, and Compliance (GRC) requirements.

− CIA enables organizations to provide the right privileges and access to the right users at the right time.
− GRC enables organizations to develop policies and procedures and enforce them to minimize potential risks.


Governance, Risk and Compliance

Governance

[Figure: Governance: the Board of Directors, together with IT, HR, and Finance, drives data protection policies (backup to cloud/on-premise, data retention, archiving) and data security policies (RBAC, data shredding, data encryption).]

Governance determines the purpose, strategy, and operational rules by which companies are directed and managed.

• Based on the company’s business strategy and driven by the Board of Directors.
− Business strategy includes legal, HR, finance, and the office of the CEO.
• The main objective of IT governance is to determine the outcomes required to achieve IT's strategic goals.
− Leaders monitor, evaluate, and direct IT management to ensure IT effectiveness, accountability, and compliance.
• Roles and responsibilities must be clearly defined, such as:
− Who is responsible for directing, controlling, and executing decisions?
− What information is required to make the decisions?
− How will exceptions be handled?
• Defines policies that determine whether the data should be protected on-premise or in the cloud.


Risk

• Risk management is a systematic process of assessing the organization's assets, placing a realistic valuation on each asset, and creating a risk profile that is rationalized for each information asset across the business.
− Involves identification, assessment, and prioritization of risks.
• There are four key steps of risk management that an organization must perform before offering resources or services to users.

Risk Identification
• Identifies sources of threats that give rise to risk.
• Should be performed before building an IT infrastructure.

Risk Assessment
• Determines the likelihood of a risk.
• Helps to identify appropriate controls.

Risk Mitigation
• Involves planning and deploying security mechanisms.
• Helps mitigate risks and minimize their impact.

Monitoring
• Involves continuous observation of existing risks.
• Ensures proper control of security mechanisms.

To learn more about risk management steps, click here.
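The assessment and prioritization steps above are often reduced in practice to a simple likelihood-times-impact score per risk. The risks and scores below are made up for illustration only:

```python
# Hypothetical risk register: each identified risk is scored as
# likelihood x impact (both on a 1-5 scale here), then ranked so that
# mitigation effort goes to the highest-scoring risks first.

risks = [
    {"name": "media theft",  "likelihood": 2, "impact": 5},
    {"name": "ransomware",   "likelihood": 4, "impact": 5},
    {"name": "disk failure", "likelihood": 3, "impact": 2},
]

def prioritize(register: list) -> list:
    """Rank risks by descending likelihood x impact score."""
    return sorted(register,
                  key=lambda r: r["likelihood"] * r["impact"],
                  reverse=True)

ranked = prioritize(risks)
for r in ranked:
    print(r["name"], r["likelihood"] * r["impact"])
# ransomware scores 20, media theft 10, disk failure 6
```

The output of this step feeds risk mitigation (deploy controls for the top risks first) and monitoring (re-score as conditions change).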


Compliance

Types of Compliance

Internal Policy Compliance: Controls the nature of IT operations within an organization.

External Policy Compliance: Includes legal requirements, legislation, and industry regulations.

• An act of adhering to, and demonstrating adherence to, external laws and
regulations as well as to corporate policies and procedures.
• There are primarily two types of policies controlling IT operations in an
enterprise that require compliance: internal policy compliance and external
policy compliance.
• Compliance management activities include:

− Periodic reviews of compliance enforcement.


− Identifying deviations and initiating corrective actions.
To learn more about compliance, click here.


Authentication, Authorization, and Auditing

1: Authentication

• A process to ensure that users or assets are who they claim to be by verifying
their identity credentials.
• A user may be authenticated by a single-factor77 or multi-factor78 method.

2: Authorization

• A process of determining whether, and in what manner, a user, device, application, or process is allowed to access a particular service or resource.

− For example, a user with administrator’s privileges is authorized79 to access more services or resources than a user with non-administrator (for example, read-only) privileges.
3: Auditing

77 Involves the use of only one factor such as a password.

78 Uses more than one factor to authenticate a user.

79 Authorization should be performed only if authentication is successful.


• Refers to the logging of all transactions for the purpose of assessing the
effectiveness of security mechanisms.
• Helps to validate the behavior of the infrastructure components, and to perform
forensics, debugging, and monitoring activities.

• The Authentication, Authorization, and Auditing processes support the objectives of IT security implementation for effective CIA and GRC.
• Click on each process on the image for more information.
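The three processes can be sketched together in a few lines of Python. The user store, roles, and log format below are hypothetical; the point is the ordering: authorization runs only after authentication succeeds, and every access attempt is audited.

```python
# Hypothetical AAA sketch. Real systems use hashed credentials and
# directory services; plaintext passwords here are for illustration only.

USERS = {"alice": {"password": "s3cret", "role": "admin"},
         "bob":   {"password": "hunter2", "role": "read-only"}}
PERMISSIONS = {"admin": {"read", "write"}, "read-only": {"read"}}
AUDIT_LOG = []

def access(user: str, password: str, action: str) -> bool:
    record = USERS.get(user)
    # Authentication: verify the claimed identity.
    authenticated = record is not None and record["password"] == password
    # Authorization: performed only if authentication succeeds.
    authorized = authenticated and action in PERMISSIONS[record["role"]]
    # Auditing: every transaction is logged, granted or not.
    AUDIT_LOG.append(f"user={user} action={action} granted={authorized}")
    return authorized

assert access("alice", "s3cret", "write") is True
assert access("bob", "hunter2", "write") is False  # authenticated, not authorized
assert access("bob", "wrong", "read") is False     # authentication fails
assert len(AUDIT_LOG) == 3                         # every attempt is audited
```

The audit log then supports the forensics, debugging, and monitoring activities described above.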

Vulnerabilities


1: Attack Surface

• Refers to the various entry points that an attacker can use to launch an attack,
which include people, process, and technology.
• For example, each component of a storage infrastructure is a source of
potential vulnerability.
− An attacker can use all the external interfaces80 supported by that
component, such as the hardware and the management interfaces, to
execute various attacks.
• Unused network services, if enabled, can become a part of the attack surface.

2: Attack Vector

• A step or a series of steps necessary to complete an attack.

80 Forms the attack surface for the attacker.


− For example, an attacker might exploit a bug in the management interface to execute a snoop attack.
3: Work Factor

• Refers to the amount of time and effort required to exploit an attack vector.

• A vulnerability is a weakness of any information system that an attacker exploits to carry out an attack.
− Components that provide a path enabling access to information are
vulnerable to potential attacks.
• Vulnerabilities give rise to threats, which are the potential attacks that can be carried out on an IT infrastructure.
• Three factors need to be considered when assessing the extent to which an environment is vulnerable to security threats:
− Attack Surface
− Attack Vector
− Work Factor
Click on each factor on the image for more details.
• Organizations can deploy specific security controls to reduce vulnerabilities by:

− Minimizing the attack surface
− Maximizing the work factor


Defense-in-depth

• Perimeter Security (Physical Security)
• Remote Access Controls (VPN, Authentication, etc.)
• Network Security (Firewall, DMZ, etc.)
• Compute Security (Hardening, Malware Protection software, etc.)
• Storage Security (Encryption, Zoning, etc.)

A multilayered security mechanism in which multiple layers of defense strategies are deployed throughout the infrastructure to help mitigate the risk of security threats if one layer of defense is compromised.

• Defense-in-depth increases the barrier to exploitation.

− An attacker must breach each layer of defense to be successful.


− Provides additional time to detect and respond to an attack.
o Reduces the scope of a security breach.


Knowledge Check: Overview of Data Security

Knowledge Check Question

1. Match the following elements with their descriptions:

A. Compliance: Demonstrating adherence to external laws and regulations as well as to policies and procedures.

B. Auditing: Logging of all transactions for assessing the effectiveness of security mechanisms.

C. Governance: Determines the purpose, strategy, and operational rules by which companies are directed and managed.

D. Authentication: Process to ensure that users or assets verify their identity credentials.


Security Threats in Data Protection Environment

Security Threats in Data Protection Environment

Objectives

The objective of the topic is to:


→ Explain various security threats in data protection environment.

Introduction to Security Threats

[Figure: Security threats across the data protection environment: unauthorized modification or deletion of data through an application, file system, or database at the data source; restoring data to an unauthorized destination via the protection application; and modification of system configuration through unauthorized access to the management application.]

The threats in the data protection environment may exist at the data source, the protection application and protection storage, and the data management domain.

• Security threats at the data source involve unauthorized access to primary storage, business applications, and the hypervisor, which impacts CIA.
− Threats include gaining access to primary storage through an application, file system, or database interface, and modifying or deleting the files residing on primary storage.


• Security threats to the protection application include unauthorized access to protection applications such as backup, replication, and archiving.
− Also include security threats to protection storage, which could be on-premise or in the cloud.
− A major threat is an attacker gaining access to the backup application and recovering the data to an unauthorized destination.
• Security threats in the management domain involve unauthorized access to the management application, which can enable an attacker to carry out an attack by modifying system configurations.

Threats to Data Source

[Figure: Threats to the data source: an attacker elevates user privileges or spoofs an identity to access an application and modify or delete data; gains unauthorized access to an application by bypassing access control; installs a rogue hypervisor to take control of a compute system; or accesses business applications using stolen mobile devices.]

• Data source can be a business application, a hypervisor, or a primary storage.


• An attacker may gain unauthorized access to the organization’s application,
data, or primary storage by various ways such as:

− Bypassing the access control, operating system, or application.


− Exploiting a vulnerability in the hypervisor.
o Failure of hypervisor may expose user’s data to other users.
o Hyperjacking is an example of this type of attack in which the attacker
installs a rogue hypervisor that takes control of the compute system.
− Elevating the privileges, spoofing identity, and device theft.


For detailed information about threats to data source, click here.

Threats to Protection Applications

The protection applications are responsible for creating backups and replicas to
ensure business continuity.

• Security threats can negatively impact the confidentiality, integrity, and


availability of data.
− Therefore, it is important to identify the threats that are posed to the
protection application.
• The image shows the backup and replication environment. In this environment,
an attacker may:
− Spoof the administrator’s identity and take control of the backup and
replication application to carry out the attack.
− Exploit the vulnerabilities of the backup and replication application to carry
out the attack.
• Some of the control mechanisms that can reduce the security threats are:

− Identity and access management


− Installing security updates (patches) of the backup and replication
applications.


Threats to Protection Storage

[Figure: Threats to protection storage: an attacker gains access to user data in cloud protection storage; steals physical media by gaining access to on-premise protection storage; or steals backup media while it is being shipped to the DR site.]

The protection storage is exposed to various kinds of threats in both the backup
and the replication environment.

• In a replication environment, the protection storage may be block-based, file-based, or object-based storage. In this environment, an attacker may:
− Gain unauthorized access to the protection storage system and steal the physical media to carry out an attack.
− Steal backup media either from the backup storage or while it is transported to the DR site, as shown in the image.
• Many organizations back up their data to the cloud. In such an environment, an attacker may:

− Compromise cloud storage and gain unauthorized access to an


organization’s data.
− Spoof the DR site identity to copy the backup data to an unauthorized
protection storage.
To know how control mechanisms can help in reducing the risks caused due to
threats, click here.


Threats to Management Applications

[Figure: Threats to management applications: an attacker gains unauthorized access to the management application, which controls storage systems, compute systems, VSANs, and VLANs over the management VLAN, to perform unauthorized resource provisioning.]

The management application provides visibility and control of the components and
protection operations.

• Protecting the management domain is important because the impact of a security breach on the data protection infrastructure is significant.
• In such environment, an attacker may:

− Gain access to management application by either spoofing user identity,


elevating privileges, or by bypassing the security to carry out an attack.
− Carry out attack such as unauthorized resource provisioning, modification or
deletion of resource configuration, and so on.
For more information about threats to management applications, click here.


Knowledge Check: Security Threats in Data Protection Environment

Knowledge Check Question

2. Which threat applies to management domain?


a. Media theft
b. Insecure APIs
c. Spoofing DR site identity
d. Bypassing security of production application


Security Controls in a Data Protection Environment – 1

Security Controls in a Data Protection Environment – 1

Objectives

The objectives of the topic are to:


→ Understand the concept of physical security.
→ Explain identity and access management and role-based access control.
→ Describe various security controls.

Introduction to Security Controls

Security controls should involve all three aspects of infrastructure: people, process, and technology, and their relationships.

• To authenticate and authorize a user or a system, the first steps are to:
− Establish and assure their identity.
− Implement selective controls for access to data and resources.
• Security measures are governed by processes and policies.
• The processes should be based on a thorough understanding of risks in the environment. The processes should:
− Recognize the relative sensitivity of different types of data and resources.
− Help determine the needs of various stakeholders to access the data and resources.


For detailed information about security controls, click here.

Physical Security

[Figure: Physical security controls: disable all unused devices and ports; provide a 24x7x365 onsite security guard; implement biometric or security badge-based authentication; install surveillance cameras; install sensors and alarms.]

Physical security is the foundation of the overall IT security strategy.

• Strict enforcement of policies, processes, and procedures by an organization is a critical element of successful physical security.
• Social engineering81 is a kind of attack that can lead to physical security breaches.
• To secure the data protection environment, the following physical security
measures may be deployed:

− Disable all unused IT infrastructure devices and ports.


− Provide 24x7x365 onsite security guard.

81An attack that relies heavily on human interaction and often involves tricking
people into breaching security measures.


− Implement biometric or security badge-based authentication to grant access


to the facilities.
− Install surveillance cameras [CCTV] to monitor activity throughout the facility.
− Install sensors and alarms to detect motion and fire.

Identity and Access Management (IAM)

[Figure: IAM controls: a user submits an access request with credentials (password, badge ID, biometric) to the authentication and authorization system; once the user is verified, access is granted to IT resources such as the network, backup application, cloud, and storage.]

• Identity and access management (IAM) is the process of:


− Managing user’s identifiers and their authentication and authorization to
access IT infrastructure resources.
− Controlling access to resources by placing restrictions based on user
identities.
o For example, only an authorized user, such as a backup administrator, is
allowed to login to the backup management software and perform
backup operations, configure resources, and provision backup
resources.
− Identifying the user and the privileges assigned to the user.


• Click on the example to know how a user is validated for identity and privileges.
• Multi-factor authentication82 uses more than one factor to authenticate a user.

Role-Based Access Control

The image maps each identity to roles and permissions:

• Security Administrator: Create, delete, and modify security settings.
• Backup Administrator: Create, edit, and delete backup policies; schedule, configure, and start/stop backup and recover operations; provision resources for backup.
• Activity Monitor: Monitor backup and recover operations; monitor security and application settings.

• Role-based access control (RBAC) is an approach that restricts access to authorized users based on their respective roles83.
− A role is assigned only the minimum privileges required to perform the tasks associated with that role.

82A commonly implemented two-factor authentication process requires the user to


supply both something he or she knows (such as a password) and also something
he or she has (such as a device).

83 A role may represent a job function, for example a backup administrator.


• Always consider administrative controls, such as separation of duties84, when


defining the data center security procedures.
− For example, the person who authorizes the creation of administrative
accounts in a data protection environment should not be the person who
uses those accounts.
• The image shows the implementation of RBAC in a data protection
environment.
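The role-to-permission mapping described above can be modeled directly as a lookup table; the sketch below is a minimal Python example, and the permission names are hypothetical placeholders.

```python
# Hypothetical role-to-permission mapping based on the roles described above.
ROLE_PERMISSIONS = {
    "security_administrator": {"create_security_setting",
                               "delete_security_setting",
                               "modify_security_setting"},
    "backup_administrator": {"create_backup_policy", "edit_backup_policy",
                             "delete_backup_policy", "schedule_backup",
                             "provision_backup_resources"},
    "activity_monitor": {"monitor_backup_operations",
                         "monitor_security_settings"},
}

def is_authorized(role: str, permission: str) -> bool:
    """A request is allowed only if the permission belongs to the user's role."""
    return permission in ROLE_PERMISSIONS.get(role, set())
```

Because each role carries only its minimum privileges, an activity monitor asking to delete a backup policy is denied even though a backup administrator would be allowed.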

Security Controls

Click the right and left arrows to view all security controls.

Firewall

84Clear separation of duties ensures that no individual can both specify an action
and carry it out.


• A firewall is a security control designed to examine data packets traversing a


network and compare them to a set of filtering rules85.
− Rules can be set for both the incoming and the outgoing traffic.
− Effectiveness of a firewall depends on how robustly and extensively the
security rules are defined.
− Packets that are not authorized by a filtering rule are dropped and are not
allowed to continue to the requested destination.
• A firewall can be deployed at the network, compute system, and hypervisor levels.
• A firewall can be either physical86 or virtual87.

For more information about the firewall demilitarized zone (DMZ), click here.
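A first-match packet filter with a default-deny policy can be sketched as follows; the rule fields and addresses are illustrative and do not follow any vendor's syntax.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    action: str          # "allow" or "deny"
    src: str             # source address; "*" matches any
    dst: str             # destination address; "*" matches any
    port: Optional[int]  # destination port; None matches any
    protocol: str        # "tcp", "udp", or "*"

def filter_packet(rules, src, dst, port, protocol):
    """Return the action of the first matching rule; unmatched packets drop."""
    for r in rules:
        if (r.src in ("*", src) and r.dst in ("*", dst)
                and r.port in (None, port) and r.protocol in ("*", protocol)):
            return r.action
    return "deny"  # packets not authorized by any rule are dropped
```

Note that rule order matters: an early deny rule for a specific source wins over a later, broader allow rule, which is why the effectiveness of a firewall depends on how robustly the rules are defined.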

85A rule may use various filtering parameters such as source address, destination
address, port numbers, and protocols.

86 A physical firewall is a device that has custom hardware and software on which
filtering rules can be configured. Physical firewalls are deployed at the network
level.

87A virtual firewall is a software that runs on a hypervisor to provide traffic filtering
service. Virtual firewalls give visibility and control over virtual machine traffic and
enforce policies at the virtual machine level.


IDPS

[Image: An attacker's anomalous activity passes the firewall but is detected by the IDS/IPS at the switch, which blocks it before it reaches the servers, primary storage, protection storage, and management server in the data protection environment.]

Intrusion detection is the process of detecting events that can compromise the
confidentiality, integrity, or availability of IT resources.

• Intrusion Detection System (IDS)88 and Intrusion Prevention System (IPS)89 are two controls that usually work together and are generally referred to as an intrusion detection and prevention system (IDPS).
• The key techniques used by an IDPS to identify intrusion in the environment
are:

− Signature-based detection90

88A security tool that automates the detection process. An IDS generates alerts, in
case anomalous activity is detected.

89A tool that has the capability to stop the events after they have been detected by
the IDS.

90IDPS relies on a database that contains known attack patterns or signatures, and
scans events against it.


− Anomaly-based detection91
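Both techniques can be sketched in a few lines of Python; the signature patterns and the three-sigma threshold below are illustrative choices, not values from the course.

```python
import statistics

# Hypothetical signature database of known attack patterns.
SIGNATURES = ("DROP TABLE", "../../etc/passwd", "<script>")

def signature_match(event: str) -> bool:
    """Signature-based detection: scan the event against known patterns."""
    return any(sig in event for sig in SIGNATURES)

def is_anomalous(value, baseline, sigmas=3.0):
    """Anomaly-based detection: flag values statistically far from normal."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    return abs(value - mean) > sigmas * stdev
```

Signature matching catches known attacks cheaply, while the statistical check can flag novel activity (for example, a sudden spike in login attempts) that has no known signature.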

Virtual Private Network


[Image: At the primary site, an application server, clients, LAN, SAN, and storage connect through a VPN-enabled router; a VPN tunnel across the network connects to the VPN-enabled router at the disaster recovery site (standby server, clients, LAN, SAN, and storage). A site-to-site VPN connection is established between the two sites to perform remote replication, and a remote user connects to the corporate network using a VPN connection.]

• A virtual private network (VPN) can be used to provide a user a secure connection to IT resources. In the data protection environment, a VPN is used to provide:
− Secure site-to-site connection between a primary site and a DR site when
performing remote replication.
− Secure site-to-site connection between an organization’s data center and
cloud when performing cloud-based backup and replication.
• There are two methods in which a VPN connection can be established:

91 IDPS scans and analyzes events to determine whether they are statistically
different from events normally occurring in the system.


− Remote access VPN connection92


− Site-to-site VPN connection93
For detailed information about virtual private network, click here.

VLAN

[Image: Three VLANs, VLAN 10 (Engineering), VLAN 20 (Finance), and VLAN 30 (HR), configured across Ethernet Switches A, B, and C connected by an IP router. VLAN 10 allows traffic between Compute System A, Compute System B, and Storage System A, and restricts traffic to and from VLAN 20 and VLAN 30. The VLAN configuration provides traffic isolation and therefore enhanced security; replication traffic between Storage System A and Storage System B has to pass through the IP router.]

• A VLAN ensures security by providing isolation over the shared infrastructure. A VLAN:

92 A remote client (typically client software installed on the user’s compute system)
initiates a remote VPN connection request. A VPN server authenticates and
provides the user access to the network.

93The remote site initiates a site-to-site VPN connection. The VPN server
authenticates and provides access to internal network.


− Ensures that the data of one department is separated from that of another department.
− Enables communication among a group of nodes based on the functional requirements of the group, independent of the node's location in the network.
• Consider the example shown in the image, where three VLANs are created: VLAN 10, VLAN 20, and VLAN 30. To understand this example in detail, click here.
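The isolation rule in the example reduces to a membership check; in the sketch below, the VLAN 10 members follow the figure, while the VLAN 20 and VLAN 30 memberships are hypothetical placeholders.

```python
# VLAN membership modeled on the example; VLAN 20/30 members are hypothetical.
VLAN_MEMBERS = {
    10: {"Compute System A", "Compute System B", "Storage System A"},  # Engineering
    20: {"Compute System C", "Storage System B"},                      # Finance
    30: {"Compute System E", "Storage System C"},                      # HR
}

def same_vlan(node_a, node_b):
    """Layer-2 traffic is allowed only between members of the same VLAN."""
    return any(node_a in members and node_b in members
               for members in VLAN_MEMBERS.values())
```

Any traffic between nodes in different VLANs must cross a Layer-3 boundary (the IP router), where it can be inspected or blocked.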

VSAN

[Image: Two VSANs, VSAN 10 (Engineering) and VSAN 20 (Finance), configured on FC Switches A and B. VSAN 10 allows traffic between Compute System A and Storage System A, and restricts traffic from VSAN 20 (Compute System B and Storage System B). The VSAN configuration provides traffic isolation and therefore enhanced security.]

• A VSAN ensures security by providing isolation over the shared infrastructure. A VSAN:
− Ensures that the data of one department is separated from that of another department.


− Enables communication among a group of nodes based on the functional requirements of the group, independent of the node's location in the network.
• Consider the example shown in the image, where two VSANs are created: VSAN 10 and VSAN 20. To understand this example in detail, click here.

Zoning

[Image: An FC switch (domain ID = 15) connects three compute systems and a storage system. Zone 1 is a WWN zone (10:00:00:00:C9:20:DC:82; 50:06:04:82:E8:91:2B:9E), Zone 2 is a port zone (15,5; 15,12), and Zone 3 is a mixed zone (10:00:00:00:C9:20:DC:56; 15,12).]

• Zoning is a Fibre Channel (FC) switch-based security control that:

− Enables node ports connected to an FC SAN to be logically segmented into groups that communicate with each other within the group.
− Provides access control, along with other access control mechanisms such as LUN masking.
− Restricts communication to members of the same zone.
• Zoning can be categorized into three types:

− WWN zoning


− Port zoning
− Mixed zoning
To learn more about types of zoning, click here.
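The zone set in the figure can be modeled as sets whose members are either WWN strings or (domain, port) pairs; the sketch below is illustrative, not a switch configuration syntax.

```python
# Zone set modeled on the figure: members are WWNs or (domain, port) pairs.
ZONES = {
    "zone1_wwn":   {"10:00:00:00:C9:20:DC:82", "50:06:04:82:E8:91:2B:9E"},
    "zone2_port":  {(15, 5), (15, 12)},
    "zone3_mixed": {"10:00:00:00:C9:20:DC:56", (15, 12)},
}

def can_communicate(member_a, member_b):
    """Two node ports may communicate only if they share at least one zone."""
    return any(member_a in zone and member_b in zone
               for zone in ZONES.values())
```

A port such as 15,12 can appear in more than one zone (here in the port zone and the mixed zone), while ports with no common zone cannot establish communication at all.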

LUN Masking

[Image: A storage system with two LUNs. Compute System A (HR) has access to LUN A and is restricted from accessing LUN B; Compute System B (Finance) has access to LUN B and is restricted from accessing LUN A.]

• LUN masking is a storage system-based security control that is used to:
− Protect against unauthorized access to the LUNs of a storage system.
− Grant LUN access only to authorized hosts.
• Consider a Storage System with two LUNs that store data of the HR and
finance departments as shown in the image.

− Without LUN masking, both the departments can easily see and modify each
other’s data, posing a high risk to data integrity and security.
− With LUN masking, LUNs are accessible only to the designated hosts.


Discovery Domain

[Image: An iSNS server, which can be part of a network or a management station, defines two discovery domains. Compute System A has access only to Storage System A and is restricted from accessing Storage System B; Compute System B belongs to the other discovery domain with Storage System B.]

• Internet Storage Name Service (iSNS) discovery domains work in the same way as FC zones and are primarily used in IP-based networks.
• A discovery domain provides a functional grouping of devices in an IP SAN.
• For devices to communicate with one another, they must be configured in the
same discovery domain.
− State change notifications inform the iSNS server when devices are added
or removed from a discovery domain.
• The image shows the discovery domains in an iSNS environment.

− Compute System A can access only Storage System A and Compute


System B can only access Storage System B.


Knowledge Check: Security Controls in a Data Protection Environment – 1

Knowledge Check Question

3. Which statement is true when implementing RBAC?


a. An individual can both specify an action and carry it out.
b. Maximum privileges are assigned to a role to perform multiple tasks.
c. No individual can both specify an action and carry it out.
d. Activity monitor’s role can create and delete the security settings.


Security Controls in a Data Protection Environment – 2

Objectives

The objectives of the topic are to:


→ Explain securing hypervisor, management server, VM, OS, and application.
→ Understand malware protection software and mobile device management.
→ Describe data encryption and data shredding.

Securing Hypervisor, Management Server, VM, OS, and Application

[Image: Security measures for the management server, hypervisor, operating system, virtual machine, and application in a virtualized environment with VLAN, VSAN, and a management VLAN; the measures are summarized in the table below.]

• The hypervisor and the related management servers are critical components of
an IT infrastructure because they control the operation and management of the
virtualized compute environment.

− Compromising a hypervisor or a management server places all VMs at a


high risk of attack.


Component Roles

Management Server • Restrict core functionality to selected


administrators.
• Encrypt network traffic when managing remotely.
• Deploy firewall between management system
and rest of the network.

Hypervisor • Install security-critical hypervisor updates


• Harden hypervisor using specifications by CIS
and DISA.

Operating System • Delete unused files and applications.


• Install current OS updates.
• Perform vulnerability scans and penetration tests.

Virtual Machine • Change default configuration of a VM.


• Tune configuration of VM features to operate in
secure manner.
• VM templates must be hardened to a known
security baseline.

Application • Design with proper architecture, threat modeling,


and secure coding.
• Include process spawning control, executable file
protection, and system tampering protection.

For detailed information about IT infrastructure components and their roles against
security attacks, click here.


Malware Protection Software

[Image: Malware protection software installed on a compute system protects against attacks such as viruses, worms, Trojans, key loggers, and spyware. Techniques to detect malware: signature-based detection scans files to identify a signature; heuristics detect malware by examining the behavior of programs.]

• Malware protection software is installed on a compute system or on a mobile device to provide protection for the operating system and applications. The malware protection software:

− Detects, prevents, and removes malware and malicious programs such as viruses, worms, Trojan horses, key loggers, and spyware.
− Uses various techniques to detect malware.
o The most common technique used is signature-based detection94.
− Identifies malware by examining the behavior of programs.
− Protects the operating system against attacks.
To learn more about malware protection software, click here.
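A minimal sketch of signature-based file scanning, assuming a signature database of SHA-256 digests of known-malicious files is supplied (real products match byte patterns and heuristics, not just whole-file hashes):

```python
import hashlib
from pathlib import Path

def scan_file(path, signature_db):
    """Signature-based detection: hash the file and look it up in a
    database of digests of known-malicious files."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return digest in signature_db
```

This is why keeping the signature database up to date matters: a brand-new malware variant hashes to a value not yet in the database and slips past a purely signature-based scanner, which is where behavior-based (heuristic) detection helps.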

94 In this technique, the malware protection software scans the files to identify a malware signature.


Mobile Device Management

• Several organizations allow their employees to access the organization's internal applications and resources via mobile devices.
− This practice introduces a threat that may expose resources to an attacker.
• Mobile device management (MDM) is a control that restricts access to
organization’s resources only to authorized mobile devices.
• MDM solution consists of two components: the server component95 and the
client component96.
• MDM solution enables organizations to enforce organization’s security policies
on the user’s mobile devices.

95 Responsible for performing device enrollment, administration, and management


of mobile devices.

96Installed on the mobile device that needs access to the organization’s resources.
The client receives commands from the server component which it executes on the
mobile device.


For detailed information about mobile device management, click here.

Data Encryption

[Image: At the production site, TLS and SSL protocols encrypt the data traversing the network between compute systems, an encryption appliance encrypts the data before sending it on the replication network, and the storage system encrypts the data before storing it on the storage media. At the DR site, a decryption appliance decrypts the data before it is stored on the storage system.]

• Data encryption is a cryptographic technique in which data is encoded and made indecipherable to eavesdroppers or hackers.

− Provides protection from threats such as tampering with data, which violates data integrity; media theft, which compromises data availability and confidentiality; and sniffing attacks, which compromise confidentiality.
• Data encryption is one of the most important controls for securing data in-flight97 and at-rest98 in the data protection environment.

For more information about data encryption, click here.
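To illustrate symmetric encryption conceptually, the sketch below builds a toy stream cipher from a hash-derived keystream. This is for illustration only and is an assumption of this guide, not a recommended construction; real deployments use vetted protocols and ciphers such as TLS and AES, never hand-rolled cryptography.

```python
import hashlib

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a keystream by hashing key + nonce + block counter (toy only)."""
    out = bytearray()
    block = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + nonce
                                  + block.to_bytes(8, "big")).digest())
        block += 1
    return bytes(out[:length])

def stream_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in
                 zip(data, _keystream(key, nonce, len(data))))
```

Without the key, intercepted ciphertext on the replication network is indecipherable; applying the same function with the same key and nonce recovers the plaintext at the DR site.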

97 Refers to data that is being transferred over a network.

98 Refers to data that is stored on a storage medium.


Data Shredding

• A process of deleting data or residual representations (sometimes called remanence) of data, which makes it unrecoverable.
• Organizations must deploy data shredding controls at all locations to ensure that all copies are shredded.
• Organizations can deploy data shredding controls in their data protection environment to protect against loss of confidentiality of their data.
• Techniques to destroy or shred data include:

Destruction Techniques Description

Physically destroying Damaging the storage media physically.

Degaussing Process of decreasing or eliminating the magnetic


field of media.

Overwriting Data on disk or flash drives can be shredded by
overwriting the drives several times with invalid data.
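The overwriting technique can be sketched as follows. This is a simplified illustration: on flash media, wear-leveling may leave remnants that software overwrites cannot reach, which is why degaussing or physical destruction gives stronger guarantees.

```python
import os

def shred_file(path, passes=3):
    """Overwrite a file's contents several times with random data,
    then delete it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))
            f.flush()
            os.fsync(f.fileno())  # push each pass out to the storage device
    os.remove(path)
```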


Knowledge Check: Security Controls in a Data Protection Environment – 2

Knowledge Check Question

4. Match the following:

Terms:
A. Data Encryption
B. Data Shredding
C. VM Hardening
D. Signature-based Detection

Descriptions (correct match shown in parentheses):
(B) A process of deleting data and making it unrecoverable.
(A) A technique in which data is encoded.
(D) A technique which scans the files to identify a malware.
(C) A process in which the default configuration is changed to achieve security.


Cyber Recovery

Objectives

The objectives of the topic are to:


→ Define different types of cyber attacks.
→ Understand the impact of cyber attacks.
→ Explain best practices against cyber attacks.
→ Describe cyber recovery architecture.

Cyber Attacks

A cyber attack is an attempt by hackers to damage, destroy, or control a network or


system. It includes any type of offensive action and can also target information
systems, infrastructures, networks, or personal computers. The purpose of the
attacks includes stealing, altering, hijacking, or destroying data or information
systems.

Select the link on each title to learn more about the most common cyber attacks.

• Denial of service
• Digital currency mining
• Spam
• Adware


• Malicious web scripts
• Business email compromise
• Banking trojan
• Ransomware

Impact of Cyber Attacks

• Global cybercrime damage is predicted to grow by 15 percent per year over the next five years, reaching $10.5 trillion USD annually by 2025, up from $3 trillion USD in 2015.

− Cost estimations are based on historical cybercrime figures, including year-over-year growth and hacking activities by nation-state-sponsored crime groups.
Reference:

Cyberwarfare-2021-Report.pdf (netdna-ssl.com)


Best Practice Against Cyber Attacks

Cyber attacks have become a common occurrence. Reports of companies that have experienced IT infrastructure security breaches are on the rise. There is a growing concern that cyber attacks can lead to the destruction of mission-critical data or to data being held hostage for ransom.

• Backup of data is the most important and effective way of combating


ransomware.
• The data protection best practice approach is to:

− Keep the backup copies offline, where cyberattacks cannot access the
secure copies.
− Keep security software up to date on latest definitions of virus and malware.
− Keep operating systems and software updated with security patches.
− Educate employees to be aware of links or attachments in suspicious email
messages.

Cyber Recovery Architecture

• True data protection emphasizes keeping an isolated copy of your critical data
such as essential applications and intellectual property off the network.
• Cyber recovery architecture:


− Maintains critical business data and technology configurations in a secure, air-gapped 'vault' environment that can be used for recovery or analysis.
− Isolates data from an unsecure system or network and ensures an uncompromised copy always exists.
− Creates point-in-time (PIT) retention-locked copies that can be validated and then used for recovery of the production system.
• Policies99 and retention locks are part of the architecture.
• The image shows the basic Sync-Copy-Lock operation of the data protection and vaulting process.

Click on each yellow circle on the image for more information.


1: Security mechanism that involves isolating a network and preventing it from


establishing an external connection.

2: Creates point-in-time copies that can serve as restore points in case production
backup data is subject to destructive cyberattack.

Synchronizes the latest data, creates a copy, and then secures it.

3: Immutable file locking and secure data retention to meet both corporate
governance and compliance standards.

99 What, where, when and how data is secured in the vault.


4: Determines if a replication copy contains malware or other anomalies that must


be removed.

5:

• Provides comprehensive alerting and reporting that enable administrators to


monitor ongoing activities.
• Detects affected copies; an alert is sent, and actions must be taken to resolve any problems that occur.

6: The data in a point-in-time copy can be re-orchestrated and then used to replace
the lost data in production.
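The Sync-Copy-Lock flow above can be sketched as a vault that keeps retention-locked point-in-time copies; the class and method names below are illustrative and do not correspond to any product's API.

```python
import time

class CyberVault:
    """Toy model of a cyber recovery vault holding retention-locked
    point-in-time (PIT) copies."""

    def __init__(self):
        self.copies = []  # list of (timestamp, data, lock_expiry) tuples

    def sync_copy_lock(self, production_data: bytes, retention_seconds: float):
        """Synchronize the latest data, create a PIT copy, and lock it."""
        now = time.time()
        self.copies.append((now, bytes(production_data),
                            now + retention_seconds))

    def delete_copy(self, index: int):
        """A locked copy is immutable until its retention period expires."""
        _, _, expiry = self.copies[index]
        if time.time() < expiry:
            raise PermissionError("retention lock active; copy is immutable")
        del self.copies[index]

    def latest_copy(self) -> bytes:
        """Recover: return the newest point-in-time copy."""
        return max(self.copies)[1]
```

Because even a privileged caller cannot delete a copy before its retention lock expires, an attacker who compromises production (or an insider) cannot destroy the vaulted restore points.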


Knowledge Check: Cyber Recovery

Knowledge Check Question

5. Match the type of attack description with the name of the attack.

Terms:
A. Trojan
B. Denial of Service
C. Malicious Web Scripts
D. Spam

Descriptions (correct match shown in parentheses):
(B) This attack overwhelms the resources of the system with excessive requests that consume all the resources.
(A) This attack tricks the user into downloading a "harmless" file that becomes malware.
(C) This attack, when run, can detect and exploit the vulnerabilities of the systems of visitors to the website.
(D) This attack sends unsolicited bulk messages through email, instant messaging, or other digital communication assets.

Knowledge Check Question

6. What is the purpose of having point-in-time replication copies in the cyber


recovery vault?
a. Enable restore points if production backup data is jeopardized.
b. Secure data from corruption of malicious data changes.
c. Determine if copy contains malware or anomalies.
d. Provide comprehensive alerting and reporting to monitor activities.


Concepts in Practice

Click the right and left arrows to view all the concepts in practice.

Dell EMC PowerProtect Cyber Recovery Solution

• PowerProtect Cyber Recovery protects and isolates critical data from


ransomware and other sophisticated threats.
− Machine learning identifies suspicious activity and allows you to recover
known good data and resume normal business operations with confidence.
• The PowerProtect Cyber Recovery vault offers multiple layers of protection to provide resilience against cyber attacks, even from an insider threat.
− Moves critical data away from the attack surface, physically isolating it within a protected part of the data center, and requires separate security credentials and multi-factor authentication for access.
− Includes an automated operational air gap to provide network isolation and eliminate management interfaces that could be compromised.
• Automates the synchronization of data between production systems and the
vault creating immutable copies with locked retention policies.

− If a cyber attack occurs, you can quickly identify a clean copy of data, recover your critical systems, and get your business back up and running.

Dell EMC Encryption Enterprise

• Dell Encryption Enterprise offers options with its flexible encryption technology, such as a data-centric, policy-based approach as well as a Full Disk Encryption approach, to protect data.
• The solution is designed for:
− Ease of Deployment
− End-user transparency
− Hassle-free compliance


− Ease of management with single console


• A flexible suite of enhanced security solutions that includes File Based encryption, Full Disk Encryption, enhanced centralized management of native encryption (Microsoft BitLocker and Mac FileVault), and protection of data on external media, self-encrypting drives, and mobile devices.

RSA SecurID Suite

• To address today’s toughest security challenges of delivering access to a


dynamic and modern workforce across complex environments, the RSA
SecurID Suite:

− Enables your organization to accelerate business while mitigating identity


risk and ensuring compliance.
− Transforms secure access to be convenient, intelligent, and pervasive across all access use cases.
o Ensure your journey to the cloud is secure and convenient, without compromising either.
o Drive business agility through secure access.
o Accelerate secure user access to applications by providing a seamless and convenient user experience with modern authentication options when additional authentication is required.
o Reduce identity risks by eliminating inappropriate access.
o Empower business users to make smart, informed, and timely access decisions.
o Enable visibility and control across all access use cases, ground to cloud, to provide a holistic identity solution.

VMware NSX Service-defined Firewall

• Rely on a distributed, stateful Layer 7 internal firewall, built on NSX, to secure


data center traffic across virtual, physical, containerized, and cloud workloads.


• Gain superior protection against lateral movement of malware with advanced


threat prevention that includes IDS/IPS, network sandbox, and network
detection and response.
• VMware’s unique, intrinsic approach to security simplifies deployments and
streamlines firewalling of every workload at a fraction of the cost.
• Enable security to move at the speed of development to provide a true public
cloud experience on-premises, decoupled from physical infrastructure
constraints.
• Deliver “security as code” with an API driven, object-based policy model which
ensures new workloads automatically inherit relevant security policies and
automates policy mobility with workloads.

VMware Carbon Black Cloud

• Most of today's cyberattacks feature advanced tactics, such as lateral movement and island hopping, that target legitimate tools to inflict damage.
− These sophisticated hacking methods pose a tremendous risk to targets with
decentralized systems protecting high-value assets, including money,
intellectual property and state secrets.
• VMware Carbon Black Cloud™ thwarts attacks by making it easier to:
− Analyze billions of system events to understand what is normal in your environment.
− Prevent attackers from abusing legitimate tools.
− Automate your investigation workflow to respond efficiently.
• All of this is unified into one console and one agent, so that infrastructure and InfoSec teams have a single, shared source of truth to improve security together.
• Consolidates multiple endpoint security capabilities using one agent and
console, helping you operate faster and more effectively.
• As part of VMware’s intrinsic security approach, VMware Carbon Black Cloud
spans the system hardening and threat prevention workflow to accelerate
responses and defend against a variety of threats.


Exercise: Securing the Data Protection Environment


1. Present Scenario:

A large multinational bank:

• Provides mobile banking to its customers that enables them to access the
application and data from any location.

• Enables their employees to access internal banking applications using


mobile devices.

• Has multiple remote/branch offices (ROBO) across various locations.

• Offers single factor authentication solution for security.

• Sends physical tape media to offsite.

• Currently performs remote replication between the primary site and the
secondary site for DR.

2. Organization’s Challenges:

• Mobile device theft may expose resources to an attacker.

• Difficulty in tracking anomalous activity in the data center.

• Sending tapes to offsite locations would increase the risk of losing sensitive
data in transit.

• Data is exposed to attackers when data is replicated between the primary


site and the secondary site for DR.

• An attack was attempted by exploiting loophole in the hypervisor


management system.

3. Organization’s Requirements:

• Need to protect the confidentiality of data if employee’s mobile device theft


occurs.


• Requires security controls to identify anomalous activity.

• Need to protect data on tapes when sending tapes to offsite location.

• Need to protect data when performing replication between sites.

• Need to have security controls to protect hypervisor management system.

4. Expected Deliverable:

• Propose a solution that will address the organization’s challenges and


requirements.

Solution

The proposed solution is as follows:

• Implement Mobile Device Management (MDM).


• Implement intrusion detection and prevention system (IDPS).
• Implement data encryption at rest and in flight.
− Encrypt data at rest for tapes.
− Encrypt data in flight for remote replication.
• Implement hypervisor management security controls.

− Perform hypervisor hardening based on CIS and DISA best practices.


− Perform security-critical hypervisor management updates.
− Implement separate firewall with strong filtering rules.



Managing the Data Protection Environment

Managing the Data Protection Environment

Data Protection and Management-SSP

© Copyright 2021 Dell Inc. Page 415


Managing the Data Protection Environment

Managing the Data Protection Environment

Data Protection and Management-SSP

© Copyright 2021 Dell Inc. Page 416


Managing the Data Protection Environment

Managing the Data Protection Environment

Managing the Data Protection Environment

The main objectives of the topic are to:

• Explain data protection management, its characteristics, and functions.
• List the key management processes that support data protection operations.
• Apply the concept of data protection management to meet the organization’s challenges and requirements.


Introduction to Data Protection Management

Objectives

The objectives of the topic are to:

• Explain the need for data protection management.


• List the traditional data protection management challenges.
• Discuss the important data protection management functions.

Need for Data Protection Management

Data protection management includes all the protection-related functions that are necessary for managing the data protection environment and services, and for maintaining data throughout its lifecycle.


Data protection management aligns protection operations and services to the strategic business goal and service level requirements.

It ensures that the data protection environment is operated optimally by using as few resources as needed.


It ensures better utilization of existing data protection components.

Traditional Data Protection Management Challenges


Component or asset-specific management

Traditionally, data protection management is component-specific. The management tools only enable monitoring and management of specific component(s). This may cause management complexity and system interoperability issues in a large environment.


Overly complex

Management operations are very complex, especially in a large environment that includes many multi-vendor components residing in worldwide locations.

Manual operations

Traditional management operations, such as provisioning backup storage and creating a replica of a volume, are mostly manual. Provisioning tasks often take days to weeks to complete, due to rigid resource acquisition processes and long approval cycles.

May not support service-oriented infrastructure

The traditional management processes and tools may not support a service-oriented infrastructure, especially if the requirement is to provide cloud services.


Interoperability issues

Interoperability issues exist among multi-vendor IT components.

Unsuitable for on-demand service provisioning

Traditional management processes and tools usually lack the ability to execute management operations in an agile manner, scale resources rapidly, respond to adverse events quickly, orchestrate the functions of distributed infrastructure components, and meet sustained service levels. This component-specific, largely manual, time-consuming, and overly complex management is simply not appropriate for modern-day data protection management.


Key Characteristics of Modern-day Data Protection Management

Modern-day management differs in many ways from traditional management and has the following characteristics:


Service-focused approach

Modern storage infrastructure management has a service-based focus. It is linked to the service requirements and service level agreement (SLA).

An SLA is a formalized contract document that describes service level targets, service support guarantees, service location, and the responsibilities of the service provider and the user. These parameters of a service determine how the components of the data protection environment will be managed.

Examples of management functions linked to service requirements and service-level agreements (SLAs):

• Determining the optimal amount of storage space needed in a backup storage system to meet the capacity requirement of a service.


• Creating a disaster recovery plan to meet the recovery time objective (RTO) of
services.
• Ensuring that the management processes, management tools, and staffing are
appropriate to provide a data archiving service.

Software-defined data center-aware

• Software-defined data center management is more valued over hardware-specific management.
• Many common, repeatable, hardware-specific management tasks are
automated. Management is focused on strategic, value-driven activities.
• Management functions move to an external software controller.
• Management operations become independent of underlying hardware.

End-to-end visibility

• End-to-end visibility of the data protection environment enables comprehensive and centralized management.
• It provides information on configuration, connectivity, capacity, performance,
and interrelationships between components centrally.
• It helps in consolidating reports, correlating issues to find root-cause, and
tracking migration of data and services.
• End-to-end visibility is provided by specialized monitoring tools.


Orchestrated operations

• The SDDC controller/orchestrator programmatically integrates and sequences component functions into workflows.
• Orchestrator triggers an appropriate workflow upon receiving a service
provisioning or management request.
• Management operations are orchestrated as much as possible to provide
business agility.
• Orchestration reduces service provisioning time, risk of manual errors, and
administration cost.
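The idea of sequencing component functions into a workflow can be sketched as follows. This is a minimal illustration, not a real orchestrator API: the step names (provisioning storage, configuring replication, verifying the service level) and the shared context dictionary are invented for the example.

```python
from typing import Callable, Dict, List

# Each step is a function that reads and updates a shared workflow context.
def provision_backup_storage(ctx: Dict) -> None:
    ctx["storage"] = "pool-01"                     # pretend a pool was allocated

def configure_replication(ctx: Dict) -> None:
    ctx["replica"] = f"replica-of-{ctx['storage']}"

def verify_service_level(ctx: Dict) -> None:
    ctx["verified"] = ctx.get("replica") is not None

def run_workflow(steps: List[Callable[[Dict], None]]) -> Dict:
    """Execute the steps in order, as an orchestrator would on a request."""
    ctx: Dict = {}
    for step in steps:
        step(ctx)
    return ctx

result = run_workflow([provision_backup_storage,
                       configure_replication,
                       verify_service_level])
```

A production orchestrator would add error handling, rollback, and auditing around each step, but the core pattern of a triggered, ordered workflow is the same.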


Key Data Protection Management Functions

Data protection management performs two key functions, which are as follows:

• Discovery
• Operations management

Discovery

Discovery creates an inventory of infrastructure components and provides information about the components, including their:

• Configuration and connectivity
• Functions
• Performance and capacity
• Availability and utilization
• Physical-to-virtual dependencies

Discovery provides the visibility needed to monitor and manage the data center infrastructure.

Discovery is performed using a specialized tool that interacts with infrastructure


components commonly through the native APIs of these components. Through the
interaction, it collects information from the infrastructure components. A discovery
tool may be integrated with the software-defined data center (SDDC) controller,
bundled with a management software, or an independent software that passes
discovered information to a management software.

Discovery may be scheduled by setting an interval for its periodic occurrence. It may also be initiated by an administrator or triggered by an orchestrator when a change occurs in the data protection infrastructure.
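A single discovery pass can be pictured as polling each known component through its API and recording an inventory entry. The sketch below is illustrative only: the component names and the attributes returned by `query_component` are invented stand-ins for a real component's native API.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class ComponentInfo:
    name: str
    configuration: dict
    capacity_gb: int
    available: bool

def query_component(name: str) -> ComponentInfo:
    # Stand-in for a call to the component's native API.
    return ComponentInfo(name=name,
                         configuration={"ports": 4},
                         capacity_gb=1024,
                         available=True)

def run_discovery(component_names: List[str]) -> Dict[str, ComponentInfo]:
    """One discovery pass: returns an inventory keyed by component name."""
    return {n: query_component(n) for n in component_names}

inventory = run_discovery(["backup-storage-01", "storage-node-01"])
```

The resulting inventory is what a management tool (or SDDC controller) would consume to monitor configuration, capacity, and availability.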


Operations Management

Key processes that support operations management activities:

• Monitoring
• Configuration management
• Availability management
• Incident management
• Performance management
• Change management
• Capacity management
• Problem management
• Security management

Operations management involves on-going management activities to maintain the data protection infrastructure and the deployed services.

• It ensures that the services and service levels are delivered as committed. Operations management involves several management processes.
• Ideally, operations management should be automated to ensure operational agility. Management tools are usually capable of automating many management operations.
• Further, the automated operations of management tools can also be logically integrated and sequenced through orchestration.


Knowledge Check: Introduction to Data Protection Management

Knowledge Check Question

1. "Discovery creates an inventory of infrastructure components and provides information about...". Select the right answer from the given options.
a. Configuration and connectivity
b. Capacity
c. Physical-to-virtual dependencies
d. All of the given options


Operations Management – 1

Objectives

The objectives of the topic are to:

• Define monitoring.
• Explain alerting.
• Understand the concept of reporting.

Introduction to Monitoring

Monitoring provides visibility into the data protection environment and forms the
basis for performing management operations. It offers the following benefits:

Monitoring:

• Tracks the performance and availability status of components and services.
• Measures the utilization and consumption of protection storage by the services.
• Tracks events impacting data recovery and availability of components and services.
• Generates reports for protection status, potential risks, and trends.
• Tracks environmental parameters (HVAC) and deviations from their normal status.
• Triggers alerts when the backup window is exceeded, policies are violated, and SLAs are missed.

Monitoring Parameters

The data protection environment is primarily monitored for the following:


• Availability
• Configuration
• Capacity
• Performance
• Security

Monitoring Configuration

Monitoring configuration involves:

• Tracking configuration changes.
• Tracking the deployment of protection components and services.
• Detecting configuration errors, non-compliance with protection policies, and unauthorized configuration changes.

This table shows a list of backup clients (VMs), their type, CPU and memory
configurations, and compliance to a predefined backup policy. The VM
configurations are captured and reported by a monitoring tool.

Backup Client (VM) | Type | CPU (GHz) | Memory (GB) | Compliance Breach
VM49 | Windows Server 2003 (64-bit) | 4.8 | 2.0 | Not backed up since last week
VM50 | Windows Server 2003 (32-bit) | 3.2 | 2.0 | --
VM51 | Windows Server 2003 (32-bit) | 4.8 | 2.0 | --
VM52 | Windows Server 2003 (32-bit) | 3.2 | 2.0 | Not backed up since last week
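A compliance check like the one behind this table can be sketched as a simple policy test: flag any backup client whose last successful backup is older than the policy window. The client names and timestamps below are illustrative.

```python
from datetime import datetime, timedelta
from typing import Dict, List

POLICY_WINDOW = timedelta(days=7)   # "backed up within the last week"

# Hypothetical last-successful-backup times captured by the monitoring tool.
last_backup: Dict[str, datetime] = {
    "VM49": datetime(2021, 1, 1),   # stale
    "VM50": datetime(2021, 1, 14),  # recent
}

def compliance_breaches(now: datetime) -> List[str]:
    """Return the backup clients that violate the backup policy."""
    return [vm for vm, ts in last_backup.items() if now - ts > POLICY_WINDOW]

breaches = compliance_breaches(now=datetime(2021, 1, 15))
```

A real monitoring tool would pull these timestamps from the backup catalog rather than a hard-coded dictionary, and would raise an alert for each breach.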

Monitoring Availability

[Figure: Application servers/backup clients connected through switches to a storage node and backup storage system; no redundancy due to switch SW1 failure.]

Monitoring availability of hardware components (for example, a port, an HBA, or a storage controller) or software components (for example, a database instance, an SDDC controller, or an orchestration software) involves checking their availability status by reviewing the alerts generated from the system. It identifies the failure of any component or protection operation that may lead to data and service unavailability or degraded performance.



Monitoring Capacity

[Figure: A storage pool raises a notification at 66% full and again at 80% full, after which the pool is expanded.]

Inadequate capacity leads to degraded performance or even service unavailability. Monitoring capacity involves examining the amount of infrastructure resources used and usable, such as the free space available on a file system or a storage pool, the number of ports available on a switch, or the utilization of protection storage space. Monitoring capacity helps an administrator to ensure uninterrupted data protection and availability by averting outages before they occur.
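Tiered capacity thresholds like the 66%/80% notifications in the example can be sketched as a small utility. The threshold values and pool figures below are illustrative.

```python
from typing import Optional

# Checked highest-severity first, mirroring the figure's 66%/80% tiers.
THRESHOLDS = [(80, "critical: expand the pool"),
              (66, "warning: pool filling up")]

def capacity_alert(used_gb: float, total_gb: float) -> Optional[str]:
    """Return the highest-severity message the utilization crosses, if any."""
    pct = 100 * used_gb / total_gb
    for limit, message in THRESHOLDS:
        if pct >= limit:
            return message
    return None

alert = capacity_alert(used_gb=820, total_gb=1000)   # 82% full
```

In practice the monitoring tool evaluates these thresholds on every capacity sample and forwards the message to the alerting subsystem described later in this topic.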



Monitoring Performance

[Figure: New backup clients are added to the application servers/backup clients, which send data through a storage node to the backup storage system.]

Performance monitoring tracks how efficiently different protection components and services are performing and helps to identify bottlenecks.

Performance monitoring:

• Measures and analyzes behavior in terms of the number of completed and failed protection operations per hour, the amount of data backed up daily, and the throughput of protection storage.
• Identifies whether the behavior of components and services meets the acceptable and agreed performance level.

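Two of the metrics just mentioned, success rate and daily backup volume, can be computed from a log of protection operations. The log records below are invented for illustration.

```python
from typing import List, Dict

# Hypothetical operations log collected by the monitoring tool.
ops_log: List[Dict] = [
    {"job": "backup-vm49", "status": "completed", "gb": 120},
    {"job": "backup-vm50", "status": "failed",    "gb": 0},
    {"job": "backup-vm51", "status": "completed", "gb": 80},
]

def success_rate(log: List[Dict]) -> float:
    """Percentage of protection operations that completed successfully."""
    done = sum(1 for op in log if op["status"] == "completed")
    return 100 * done / len(log)

def data_backed_up_gb(log: List[Dict]) -> int:
    """Total GB protected by successful operations in this period."""
    return sum(op["gb"] for op in log if op["status"] == "completed")

rate = success_rate(ops_log)
volume = data_backed_up_gb(ops_log)
```

Trending these numbers over time is what lets an administrator spot a bottleneck before it breaches the agreed performance level.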


Monitoring Security

[Figure: A Workgroup 1 (WG1) user attempts to replicate Workgroup 2 (WG2) devices on the replication storage system; the command is denied and a notification is raised: "Attempted replication of WG2 devices by WG1 user – Access denied".]

Monitoring a data protection environment for security includes tracking unauthorized access, whether accidental or malicious, and unauthorized configuration changes. For example, monitoring tracks and reports the initial zoning configuration performed in an FC SAN and all the subsequent changes. Another example of monitoring security is to track login failures and unauthorized access to protection storage for performing administrative changes.


Alerting

An alert is a system-to-user notification that provides information about events or impending threats or issues. Alerting keeps administrators informed about the status of various components and operations, which can impact the availability of services and require immediate administrative attention, such as:

• Failure of power for storage drives, memory, switches, or availability zones.
• A storage pool reaching a capacity threshold.


• A replication operation breaching a protection policy.


• A soft media error on storage drives.

Type of Alert | Description | Example
Information | Provides useful information; does not require administrator intervention | Creation of a zone or VSAN; creation of a storage pool
Warning | Requires administrative attention | File system is becoming full; soft media errors
Fatal | Requires immediate attention | Orchestration failure; data migration failure
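Mapping events to the severities in the table above can be sketched as a simple classification rule. The event names and the default severity for unknown events are illustrative choices, not part of any real product.

```python
from typing import Dict

# Event-to-severity mapping following the information/warning/fatal table.
SEVERITY_RULES: Dict[str, str] = {
    "zone_created":           "information",
    "pool_created":           "information",
    "filesystem_nearly_full": "warning",
    "soft_media_error":       "warning",
    "orchestration_failure":  "fatal",
    "data_migration_failure": "fatal",
}

def classify(event: str) -> str:
    """Map an event to a severity; unknown events default to warning."""
    return SEVERITY_RULES.get(event, "warning")

severity = classify("orchestration_failure")
```

Fatal alerts would then be routed for immediate attention, while information alerts are simply logged.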

Reporting

Reporting on the data protection environment involves keeping track of and gathering information from the various components and protection operations that are monitored. The gathered information is compiled to generate reports for trend analysis, capacity planning, configuration changes, deduplication ratio, chargeback, performance, and security breaches.

The report types are described below.


1: Capacity planning reports contain current and historic information about the utilization of protection storage, file systems, ports, etc.

2: Configuration and asset management reports include details about the allocation of protection storage, local or remote replicas, network topology, and unprotected systems. This report also lists all the equipment, with details such as purchase date, license, lease status, and maintenance records.

3: Chargeback reports contain information about the number of backup and restore operations, the amount of data backed up and restored, the amount of data retained over a period of time, and the number of tapes as archive storage media used by various user groups or tenants, along with the associated cost.

4: Performance reports provide current and historical information about the performance of various protection components and operations, including success rate, failed backup and recovery operations, and compliance with agreed service levels.

5: Security breach reports provide details on the security violations, the duration of a breach, and its impact.
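A chargeback calculation of the kind described in report type 3 can be sketched as per-tenant cost derived from usage. The rates and usage figures below are invented for illustration.

```python
from typing import Dict

# Illustrative billing rates, in currency units per GB.
RATE_PER_GB_BACKED_UP = 0.05
RATE_PER_GB_RETAINED = 0.02

# Hypothetical per-tenant usage for the reporting period.
usage = {
    "tenant-a": {"backed_up_gb": 500, "retained_gb": 2000},
    "tenant-b": {"backed_up_gb": 100, "retained_gb": 300},
}

def chargeback(usage: Dict[str, Dict]) -> Dict[str, float]:
    """Compute the cost charged to each tenant for the period."""
    return {
        tenant: round(u["backed_up_gb"] * RATE_PER_GB_BACKED_UP
                      + u["retained_gb"] * RATE_PER_GB_RETAINED, 2)
        for tenant, u in usage.items()
    }

report = chargeback(usage)
```

A real chargeback report would also itemize restore operations and tape media, as the text notes.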


Knowledge Check: Operations Management – 1

Knowledge Check Question

2. Identify the monitoring parameters. Select all that apply.


a. Configuration
b. Availability
c. Performance
d. Profit


Operations Management – 2

Objectives

The objectives of the topic are to:

• Define configuration management and change management.


• Define capacity management and performance management.
• Explain availability management.
• Define incident management and problem management.
• Explain security management.

Configuration Management

Configuration management is responsible for maintaining information about configuration items (CIs). CIs include components such as services, processes, hardware, software, documents, people, SLAs, and the SDDC controller.

The information about CIs includes their attributes, used and available capacity, history of issues, and inter-relationships.



Change Management

Change management standardizes change-related procedures in a data protection environment for prompt handling of all changes, with minimal impact on data protection operations and service quality.

Examples of changes include:

• Introduction of a new data replication service.


• Replacing an archive storage system.
• Expansion of a storage pool.
• Upgrade of a backup application.
• Change in process or procedural documentation.


Capacity Management

Capacity Management ensures that the data protection environment is able to meet
the required capacity demands for protection operations and services in a cost
effective and timely manner.


Examples of capacity management activities include:

• Adding new nodes to a scale-out NAS cluster or an OSD.


• Expanding a storage pool and setting a utilization threshold.
• Forecasting the usage of storage media.
• Removing unused resources from a service and reassigning those to another.

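Forecasting the usage of storage media, one of the capacity management activities above, can be sketched as a least-squares trend over utilization samples. The weekly samples and pool capacity below are invented for illustration; real tools use richer models.

```python
from typing import List

def forecast_weeks_until_full(samples_gb: List[float], capacity_gb: float) -> float:
    """Fit used = slope*week + intercept and solve for the week hitting capacity."""
    n = len(samples_gb)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples_gb) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples_gb))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return (capacity_gb - intercept) / slope   # week index at which the pool is full

# Four weekly samples of pool utilization, growing 50 GB per week.
weeks = forecast_weeks_until_full([100, 150, 200, 250], capacity_gb=500)
```

The forecast gives the administrator lead time to expand the pool or reassign unused resources before capacity runs out.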

Performance Management

Performance management ensures the optimal operational efficiency of all infrastructure components so that data protection operations and services can meet or exceed the required performance level. Management tools also proactively alert administrators about potential performance issues and may prescribe a course of action to improve the situation.

Examples of performance management activities include:

• Adjusting conflicting backup schedules.


• Fine-tuning file system configuration.
• Adding new VMs or allocating more resources to the existing VMs.
• Adding new ISLs and aggregating links to eliminate bottleneck.
• Adding new nodes to a protection storage.
• Changing storage tiering and cache configuration.


Availability Management

Availability management ensures that the availability requirements of data protection operations and services are consistently met.

Examples of availability management activities include:

• Deploying redundant, fault-tolerant, and hot-swappable components.
• Implementing compute clusters, VM live shadow copy, and multi-pathing solutions.


Incident Management

Incident management is responsible for detecting and recording all incidents in a data protection environment. It investigates the incidents and provides appropriate solutions to resolve them.

The following table illustrates examples of incidents detected by an incident management tool:

Severity | Event Summary | Type | Device | Priority | Status | Last Updated | Owner | Escalation
Fatal | Pool A usage is 95% | Incident | NAS 1 | None | New | 2016/03/07 12:38:34 | - | No
Fatal | Database 1 is down | Incident | DB server 1 | High | WIP | 2016/03/07 10:11:03 | L. John | Support Group 2
Warning | Port 3 utilization is 85% | Incident | Switch A | Medium | WIP | 2016/03/07 09:48:14 | P. Kim | Support Group 1

Problem Management

Problem management prevents incidents that share common symptoms or root causes from reoccurring, and minimizes the adverse impact of incidents that cannot be prevented.

Problem management:

• Reviews incident history to detect problems in a data protection environment.


• Identifies the underlying root cause that creates a problem.
• Uses integrated incident and problem management tools to mark specific
incidents as problem and perform root cause analysis.
• Provides most appropriate solution or preventive remediation for problems.
• Analyzes and solves errors proactively before they become an
incident/problem.



Data Security Management

Security management prevents the occurrence of security-related incidents and activities that adversely affect the confidentiality, integrity, and availability of an organization's data. It ensures that the organization's regulatory and compliance requirements for data protection are met at reasonable cost. It develops data security policies and also deploys the required security architecture, processes, mechanisms, and tools.

Examples of security management activities are:

• Managing user accounts and access policies that authorize users to use a
backup/replication service.
• Implementing controls at multiple levels (defense in depth) to access data and
services.
• Scanning applications and databases to identify vulnerabilities.
• Configuring zoning, LUN masking, and data encryption services.

Data Protection Regulations


With the flow of personal data across industries and on a global scale, data security governance and data protection compliance requirements are becoming more stringent every day. Organizations that deal with personally identifiable information (PII) must comply with stringent data protection regulations, including:

• Payment Card Industry Data Security Standard (PCI DSS) in the USA.
• Health Insurance Portability and Accountability Act (HIPAA) in the USA.
• General Data Protection Regulation (GDPR) in Europe.
• California Consumer Privacy Act (CCPA) in California.
• Protection of Personal Information Act (POPI) in South Africa.

Data Security Governance

Data Security Governance (DSG), according to Gartner, is “a subset of information governance that deals specifically with protecting corporate data (in both structured database and unstructured file-based forms) through defined data policies and processes.”

There is no single product or all-in-one solution for DSG. Organizations must analyze their data requirements and select all the valuable data that needs to be protected. Data governance must be treated with importance to avoid data security management disasters.

There are three primary software methods for DSG: classification, discovery, and
de-identification or masking. These methods have been successfully employed
by IRI customers for PII and other sensitive data.

Each primary software method is described below.


1: Data classification refers to categorizing or grouping data in order to protect it. The categorization can be based on a data item's name, its attributes, computational validation (so that, for example, it can be distinguished from other 9-digit strings), and sensitivity labels such as sensitive, secret, and so on.

2: Sensitive data can be found by using certain search functions, which may or may not be associated with data classes. This is known as discovery. Examples of discovery techniques include:

• Perl Compatible Regular Expressions (PCRE)
• Fuzzy (soundalike) matching algorithms
• Named entity recognition (NER)
• Facial recognition

3: A great way to reduce or even eliminate data breach risks is masking of data, at rest or in motion. This process masks or shields sensitive or confidential data, such as names, addresses, credit card information, and Social Security numbers, from the risk of unintended exposure to prevent data breaches.
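Two of the methods above, regex-based discovery and masking, can be sketched together. This is a toy example: the pattern matches only SSN-formatted strings and the sample record is invented; it is not a production PII scanner.

```python
import re
from typing import List

# Illustrative pattern for SSN-formatted values (a common PCRE-style rule).
SSN_LIKE = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")

def discover(text: str) -> List[str]:
    """Discovery: find candidate SSN-formatted strings in free text."""
    return ["-".join(m) for m in SSN_LIKE.findall(text)]

def mask(text: str) -> str:
    """De-identification: shield all but the last four digits of each hit."""
    return SSN_LIKE.sub(r"***-**-\3", text)

record = "Customer John, SSN 123-45-6789, card on file."
found = discover(record)
masked = mask(record)
```

Classification would add a sensitivity label to each discovered value so that policy (retain, encrypt, or mask) can be applied consistently.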


Knowledge Check: Operations Management - 2

Knowledge Check Question

3. Match the following management processes with their descriptions:

• Change management: Makes a decision to approve or reject the request for creating a new data protection service.
• Availability management: Ensures that the fault tolerance requirements of data protection services are consistently met.
• Problem management: Prevents incidents that share common symptoms or root causes from reoccurring.
• Capacity management: Determines the optimal amount of resources required to meet the needs of protection operations.


Concepts in Practice

Dell EMC PowerProtect Data Manager

Dell EMC PowerProtect Data Manager provides software defined data protection,
automated discovery, deduplication, operational agility, self-service and IT
governance for physical, virtual and cloud environments.

• Orchestrate protection directly through an intuitive interface or empower data owners to perform self-service backup and restore operations from their native applications.
• Ensure compliance and meet even the strictest of service level objectives.
• Leverage your existing Dell EMC PowerProtect appliances.

With operational simplicity, agility and flexibility at its core, PowerProtect Data
Manager enables the protection, management and recovery of data in on-
premises, virtualized and cloud deployments, including protection of in-cloud
workloads.

Dell EMC PowerProtect Data Manager builds on top of project Velero to provide a
data protection solution that enables application-consistent backups and restores
and that is always available for Kubernetes in on-premises and in-cloud workloads,
VMware hybrid cloud environments and Tanzu modern applications.

Dell EMC Data Protection Advisor

Dell EMC Data Protection Advisor automates and centralizes the collection and analysis of all data, providing a single, comprehensive view of an organization's data protection environment and activities. With automated monitoring, analysis, and reporting across backup and recovery infrastructure, replication technologies, storage platforms, enterprise applications, and virtual environments, organizations can more effectively manage service levels while reducing costs and complexity.

Data Protection Advisor’s analysis engine looks across the entire infrastructure to
provide end-to-end visibility into protection levels, performance, utilization and
more. This enables unified, cross-domain event correlation analysis – insight into


the entire data protection path to ensure each component is working correctly, and provides higher-level decision support based on defined policies. Built for cloud infrastructure, Data Protection Advisor offers scalable, centralized, multi-tenant data protection management. With a single pane of glass view into the entire infrastructure, every stakeholder has access to the information they need.

VMware vRealize Cloud Management

VMware vRealize Cloud Management enables consistent deployment and operations of your apps, infrastructure, and platform services, from the data center to the cloud to the edge. vRealize Cloud Management helps to accelerate innovation, gain efficiency, and improve control while mitigating risk so you can make cloud your business. vRealize Cloud Management is available both on premises and as SaaS, and comes in several packages to meet the unique requirements of your hybrid cloud.

• Automate infrastructure provisioning by enforcing a repeatable, consistent, and reliable process with Infrastructure as Code and configuration management.
• Continuously optimize app performance with AI-driven automated workload
optimization.
• Reduce downtime, improve efficiency, gain end-to-end visibility, and manage
risk.

VMware vRealize Operations

VMware vRealize Operations delivers self-driving IT operations management for private, hybrid, and multi-cloud environments in a unified, AI-powered platform. It provides full-stack visibility from physical, virtual, and cloud infrastructure, including VMs and containers, to the applications they support. vRealize Operations provides continuous performance optimization, efficient capacity and cost planning and management, app-aware intelligent remediation, and integrated compliance. vRealize Operations is available on premises and as SaaS.

• vRealize Operations offers predictive analytics for continuous operations management.
• It ensures real-time, predictive capacity and cost analytics to proactively
forecast demand and deliver actionable recommendations.


• vRealize Operations ensures cost transparency across private, hybrid, and public clouds to optimize planning.
• It offers unified monitoring and visibility across AWS, Google Cloud Platform
and Microsoft Azure.


Exercise - Managing the Data Protection Environment


Click each sub-heading for more information about the exercise.

1. Present Scenario:

An organization maintaining multiple data centers provides data protection
services to its customers. The details are as follows:

• Protection services cover both local-site protection and remote-site
protection for disaster recovery.

• The enterprise allows all its customers’ data to be stored, protected, and
accessed from worldwide locations.

• It has virtualized compute, network, and storage components and has
deployed various backup, replication, and archiving solutions.

• It provides automated reports that are generated by monitoring and
reporting tools.

• The management operations in the data center are mostly manual.

2. Organization’s Challenges:

• Difficulty in locating and resolving errors in infrastructure components and
data protection operations.

• Difficulty in allocating resources to meet dynamic resource consumption and
seasonal spikes in resource demand.

• Occasionally, the performance of replication operations degrades.

• Difficulty in creating an inventory of the various infrastructure components,
including their configuration, connectivity, functions, and performance.

3. Organization’s Requirements:

• Need to ensure adequate availability of IT resources to provide data
protection services.

• Need to gather and maintain information about all the infrastructure
components in a centralized database.

• Administrators should get proactive alerts about potential performance
issues in data protection operations.

• Need to reduce manual errors and administration costs related to common,
repetitive management tasks.

• Planning to deploy a new multi-site data protection service; a management
process needs to be implemented for architecting the new multi-site data
protection solution.

4. Expected Deliverables:

Propose a solution that will address the organization’s challenges and
requirements.

Solution

The proposed solution is as follows:

• Implement a capacity management process that will help in planning for current
and future resource requirements. This may include dynamic resource
consumption and seasonal spikes in resource demand.
• Deploy a discovery tool that gathers and stores data in a configuration
management system.
• Deploy a performance management tool that can proactively alert
administrators about potential performance issues.
• Orchestrate management operations that are common and repetitive to
reduce manual errors and administration cost.
• Implement an availability management process that will help in architecting the
new multi-site data protection solution.



Summary

Upon successful completion of this course, participants should be able to:


→ Explain data protection architecture and its building blocks.
→ Evaluate fault-tolerance techniques in a data center.
→ Describe data backup methods and data deduplication.
→ Describe data replication, data archiving and data migration methods.
→ Describe the data protection process in a software-defined data center.
→ Articulate cloud-based data protection techniques.
→ Describe various solutions for protecting Big data and mobile device data.
→ Describe security controls and management processes in a data protection
environment.




Data Protection and Management – Associate

The Data Protection and Management certification provides a comprehensive
understanding of the various data protection infrastructure components in modern
data center environments. This certification qualifies towards all Backup Recovery
Specialist level certifications in the Dell EMC Proven Professional Technology
Architect, Implementation Engineer, Systems Administrator, and Infrastructure
Security tracks. The course is available in Classroom, Virtual Classroom, and On
Demand Course modalities.

Technology Architect
• Data Protection
  - Data Protection Training Bundle

Implementation Engineer
• PowerProtect Data Manager
  - PowerProtect Data Manager Training Bundle
• Avamar
  - Avamar Implementation and Administration
• Data Domain
  - Data Domain System Administration
• NetWorker
  - NetWorker Implementation and Administration

Systems Administrator
• PowerProtect Data Manager
  - PowerProtect Data Manager Training Bundle
• Avamar
  - Avamar Administration
• Data Domain
  - Data Domain System Administration
• NetWorker
  - NetWorker Implementation and Administration

Infrastructure Security
• Implementing the NIST Cybersecurity Framework

Data Protection and Management (C, VC, ODC)

(C) - Classroom

(VC) - Virtual Classroom

(ODC) - On Demand Course

For more information, visit: http://dell.com/certification



Appendix

Data protection is one of the least glamorous yet most important aspects of any
organization. In many respects, it is like being the goalkeeper in a soccer game:
when you do your job effectively, it is easy to get overlooked, but if you fail, it
generally results in a loss. Data can exist in a variety of forms, such as photographs
and drawings, alphanumeric text and images, and the tabular results of a scientific
survey. In computing, digital data is a collection of facts that is transmitted and
stored in electronic form and processed through software. Digital data is generated
by various devices such as desktops, laptops, tablets, mobile phones, and
electronic sensors. It is stored as strings of binary values (0s and 1s). In this
course, the word “data” refers to digital data. Most organizations use one or more
data protection methods to protect their digital data from disruption and disaster.

For example, backing up data creates a duplicate copy of data. The duplicate copy
or data backup is used to restore data in case the original data is corrupted or
destroyed. If a disaster occurs, an organization’s onsite data backup could be lost
along with the original data. Hence, it is a good practice to keep a copy of data in a
remote site. In addition, data archives are used to preserve older but important
files. Organizations also test data recovery operations periodically to examine the
readiness of their data protection mechanisms.

Further, security mechanisms such as anti-malware software and firewalls help in
protecting data from security attacks. A key question that should be answered at
this point is: what are the reasons for spending money, time, and effort on data
protection? Let us list the reasons that make data protection and its management
important for an organization.

Note: The terms “data” and “information” are closely related, and it is common for
the two to be used interchangeably. However, when data is processed and
presented in a specific context, it can be interpreted in a useful manner. This
processed and organized data is called information.

As business markets become increasingly connected, ensuring that data is
protected and always available becomes critical. Constant data access through
activities such as web searches, social networking, emailing, uploading and
downloading content, and sharing media files is commonplace. Moreover,
internet-enabled smartphones, tablets, and wearable gadgets such as fitness
activity trackers, along with the Internet of Things (IoT), add to anytime, anywhere
data access via any device. The IoT is a technology trend wherein “smart” devices
with embedded electronics, software, and sensors exchange data with other
devices over the Internet.

Application areas of IoT include remote control of household appliances and
remote monitoring of atmospheric conditions. For business applications, it is
essential to have uninterrupted, fast, reliable, and secure access to data to enable
these services. This access, in turn, relies on how well the data is protected and
managed.

An organization’s data is its most valuable asset. An organization can leverage its
data to efficiently bill customers, advertise relevant products to the existing and
potential customers, launch new products and services, and perform trend analysis
to devise targeted marketing plans. This sensitive data, if lost, may lead to
significant financial, legal, and business losses, apart from serious damage to the
organization’s reputation.

An organization seeks to reduce the risk of sensitive data loss to operate its
business successfully. It should focus its protection efforts where the need exists—
its high-risk data. Many government laws mandate that an organization must be
responsible for protecting its employees’ and customers’ personal data. The data
should be safe from unauthorized modification, loss, and unlawful processing.
Examples of such laws are U.S. Health Insurance Portability and Accountability Act
(HIPAA), U.S. Gramm-Leach-Bliley Act (GLBA), and U.K. Data Protection Act. An
organization must be adept at protecting and managing personal data in
compliance with legal requirements.

GDPR - The EU General Data Protection Regulation (GDPR), which came into
force in 2018, requires any organization dealing with data on EU citizens to comply
with its terms, irrespective of where the organization is located. If an organization
fails to comply, the EU can levy heavy fines. This regulation increased the
accountability and responsibility of companies for safeguarding their clients’ data.

Data protection is the process of safeguarding data from corruption and loss. It
focuses on technologies or solutions that can prevent data loss and recover data in
the event of a failure or corruption. Data protection lays the foundation for improving
data availability.

Data protection technologies and solutions are used to meet data availability
requirements of business applications and IT services. Examples of IT services are
email service, data upload service, and video conferencing service. Data availability
refers to the ability of an IT infrastructure component or service to function
according to business requirements and end users’ expectations during its
operating time, ensuring that data is accessible at a required level of performance.
The operating time is the specified or agreed time of operation when a component
or service is supposed to be available.

For example, a service that is offered from 9 AM to 5 PM, Monday to Friday, 52
weeks per year, would have an operating time of 8 * 5 * 52 = 2080 hours per year.
Any disruption to the service outside of this time slot is not considered to affect the
availability of the service. Data availability is not all about technologies; it also
involves strategy, procedure, and IT resource readiness appropriate for each
application or service. Based on a data availability strategy, necessary data
protection technologies and solutions are picked up.

For example, an application owner cares about the availability of their application,
and the application strategically requires 24x7 access to data. The backup
administrator is responsible for protecting the application data aptly using an
appropriate backup technology. In the event of a data corruption or loss, the
application owner relies on the backup administrator to restore data from a backup.

Note: ITIL defines a service as “a means of delivering value to customers by
facilitating outcomes customers want to achieve without the ownership of specific
costs and risks”. According to Gartner, “IT services refers to the application of
business and technical expertise to enable organizations in the creation,
management and optimization of or access to information and business
processes.”

The goal of data availability is to ensure that users can access an application or a
service during its operating time. But failure of an infrastructure component or a
service might disrupt data availability and result in downtime. A failure is the
termination of a component’s or service’s ability to perform its required function.
The component’s or service’s ability can be restored by performing various external
corrective actions such as a manual reboot, a repair, or replacement of the failed
component(s). Therefore, both operating time and downtime of a component or a
service are factored in the measurement of data availability. Data availability is
usually calculated as a percentage of uptime, where uptime is equal to the
operating time minus the downtime. It is often measured by “Nines”. For example, a
service that is said to be “five 9s available” is available for 99.999 percent of the
agreed operating time in a year.
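The relationship between operating time, downtime, and the availability percentage described above can be sketched in a few lines. This is an illustrative snippet, not part of the course material; the function name and downtime figures are made up.

```python
def availability_pct(operating_hours: float, downtime_hours: float) -> float:
    """Availability as a percentage of the agreed operating time:
    uptime = operating time - downtime."""
    uptime = operating_hours - downtime_hours
    return uptime / operating_hours * 100

# A 9 AM-5 PM, Monday-Friday service: 8 * 5 * 52 = 2080 hours per year
operating = 8 * 5 * 52

print(availability_pct(operating, 0))     # no downtime: 100.0
print(availability_pct(operating, 20.8))  # 20.8 h of downtime: about 99.0 ("two 9s")
```

Note that only downtime inside the agreed operating window counts; an outage outside the 9-to-5 slot would not reduce this figure.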


Data availability is also measured as a factor of the reliability of components or
services: as reliability increases, so does availability. It is calculated as the mean
time between failure (MTBF) divided by the MTBF plus the mean time to repair
(MTTR). Both MTBF and MTTR are reliability metrics. MTBF is the average time
available for a component or a service to perform its normal operations between
failures.

It is calculated as the total uptime divided by the number of failures. MTTR is the
average time required to repair a failed component or service. It is calculated as the
total downtime divided by the number of failures. These metrics are usually
expressed in hours.

For example, if the annual uptime of a component is 9609 hours, the annual
downtime of the component is 11 hours, and the component has failed three times
in a year, then MTBF = 9609 hours / 3 = 3203 hours and MTTR = 11 hours / 3 ≈
3.67 hours.

Note: Mean Time to Restore Service (MTRS) is considered a better metric than
MTTR for measuring data availability. MTRS is the average time taken to restore a
failed component or a service.

The problem with MTTR is that while a component (or part of a service) may have
been repaired, the service itself is still not available to an end user. MTRS takes
care of the end user’s interest by encompassing the entire elapsed time after a
failure till the end user can get access to a service.
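The MTBF/MTTR arithmetic above can be reproduced directly; the figures below are taken from the worked example in the text, and the function names are illustrative only.

```python
def mtbf(uptime_hours: float, failures: int) -> float:
    """Mean time between failures: total uptime / number of failures."""
    return uptime_hours / failures

def mttr(downtime_hours: float, failures: int) -> float:
    """Mean time to repair: total downtime / number of failures."""
    return downtime_hours / failures

def availability(mtbf_h: float, mttr_h: float) -> float:
    """Availability = MTBF / (MTBF + MTTR), as a fraction."""
    return mtbf_h / (mtbf_h + mttr_h)

# Worked example from the text: 9609 h uptime, 11 h downtime, 3 failures
m_b = mtbf(9609, 3)  # 3203.0 hours
m_r = mttr(11, 3)    # about 3.67 hours
print(round(availability(m_b, m_r) * 100, 3))  # about 99.886 percent
```

The same formula shows why availability rises when either reliability improves (larger MTBF) or repairs get faster (smaller MTTR).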

Mechanical damage to hardware is a common reason for device failures. In
addition, manufacturing defects, coffee spills, and other water damage may cause
device outages. As a result of hardware failures, users may not be able to access
data.

Loss of power, or even sudden changes in voltage, affects IT infrastructure
components, which may lead to data unavailability. Poor application design or
resource configuration errors can also lead to data unavailability. For example, if
the database crashes for some reason, the data will be inaccessible to the users,
which may lead to an IT service outage.

The IT department of an organization performs routine activities such as
application upgrades, database reorganization, hardware upgrades, data migration,
server maintenance, and relocating services to another site. Any of these activities
can have its own significant and negative impact on data availability.


Natural disasters such as flood, earthquake, tornadoes, and volcanic eruptions can
affect businesses and availability of data in every part of the globe. In addition,
man-made disasters such as civil unrest, terrorist attacks, and accidents can
impact data availability.

Ransomware - Ransomware is malware, built using cryptovirology techniques, that
threatens to leak the victim’s valuable data or to block access to it. The threat is
accompanied by a demand for a ransom. Simple forms of ransomware may lock
the victim’s system, which may be reversed by someone with IT knowledge.
However, more advanced malware uses cryptoviral extortion techniques to encrypt
the victim’s data and demands a large ransom in exchange for decrypting it. This is
yet another cause of data unavailability and may result in a huge financial loss for
the victim.

In addition, loss of data due to data corruption, intentional or accidental deletion of
files or programs, and misplacement or theft of DVDs and tapes may lead to data
unavailability.

Note: In general, the outages can be broadly categorized into planned and
unplanned outages. Planned outages may include installation and maintenance of
new hardware, software upgrades or patches, performing application and data
restores, facility operations (renovation and construction), and migration.
Unplanned outages include failure caused by human errors, database corruption,
failure of components, and natural or man-made disasters.

A data center provides centralized data-processing capability. It is used to provide
worldwide access to business applications and IT services over a network,
commonly the Internet.

A data center usually stores large amounts of data and provides services to a vast
number of users. Therefore, data protection in a data center is vital for carrying out
business operations. There are several methods available to protect data in a data
center.

For example, a primary (production) database server may periodically transfer a
copy of transaction data to a standby database server. This method ensures that
the standby database is consistent up to a point in time with the primary database.
If the primary database server fails, the standby database server may take over
production operations.


In another method, data is copied directly from primary storage to standby
protection storage without involving application servers. The protection storage
may be used for data recovery or for restarting business operations in the event of
a primary storage failure.

Large organizations often maintain multiple data centers to distribute
data-processing workloads and provide remote protection of data. Data is copied
between data centers to provide remote protection and high availability. If one data
center experiences an outage, the other data centers continue providing services
to the users.

In an enterprise data center, data is typically stored on storage systems (or storage
“arrays”). A storage system is a hardware component that contains a group of
storage devices assembled within a cabinet. It is controlled and managed by one or
more storage controllers. These enterprise-class storage systems are designed for
providing high capacity, scalability, performance, reliability, and security to meet
business requirements. The compute systems that run business applications are
provided storage capacity from storage systems.

Connectivity elements create communication paths between compute systems and
storage for data exchange and resource sharing. Examples of connectivity
elements are Open Systems Interconnection (OSI) layer-2 network switches, OSI
layer-3 switches or routers, cables, and network adapters such as a NIC. Switches
and routers are the commonly used interconnecting devices. An OSI layer-2 switch
enables multiple compute and storage systems in a network to communicate with
each other. A router (or an OSI layer-3 switch) allows multiple networks to
communicate with each other.

The commonly used cables are copper and optical fiber. A network adapter on a
compute or storage system provides a physical interface for communicating with
other systems.

The connectivity elements help in connecting IT equipment together in a data
center. The two primary types of connectivity are the interconnection between
compute systems and the interconnection between compute systems and storage
systems.

Note: The OSI model defines a layered framework to categorize various functions
performed by the communication systems. The model has seven layers, and each
layer includes specific communication functions. If functions of a communication
protocol, a network switch, or a type of network traffic match with specific layer
characteristics, then they are often aliased by the layer number, such as OSI
layer-3 protocol, OSI layer-2 switch, and OSI layer-2 traffic.

Characteristics of Converged infrastructure:

• Pre-configured and optimized, which reduces the time to acquire and deploy the
infrastructure
• Less power and space requirements
• All hardware and software components can be managed from a single
management console

A potential area of concern regarding converged infrastructure solutions is the lack
of flexibility to use IT components from different vendors. Some vendors may
provide the flexibility to choose multi-vendor IT components for a converged
infrastructure.

Notes

The fundamental principle of DR is to maintain a secondary data center or site,
called a DR site. The primary data center and the DR data center should be located
in different geographical regions to avoid the impact of a regional disaster. The DR
site must house a complete copy of the production data. Commonly, all production
data is replicated from the primary site to the DR site either continuously or
periodically. A backup copy can also be maintained at the DR site. Usually, the IT
infrastructure at the primary site is unlikely to be restored within a short time after a
catastrophic event.

Organizations often keep their DR site ready to restart business operations if there
is an outage at the primary data center. This may require the maintenance of a
complete set of IT resources at the DR site that matches the IT resources at the
primary site. Organizations can either build their own DR site, or they can use the
cloud to build one.

Fault-tolerant IT infrastructure is designed based on the concept of fault tolerance.
Fault tolerance is the property that enables a system to continue operating properly
in the event of the failure of (or one or more faults within) some of its components.

Fault-tolerant IT infrastructure eliminates single points of failure. In the event of a
component failure, a redundant component can immediately take its place with no
loss of service. The fault-tolerant infrastructure improves availability because a
single failure cannot make the entire infrastructure or a service unavailable.

Fault tolerance can be provided at the software level, or at the hardware level, or by
combining both of them. The fault-tolerant design can also be extended to include
multiple data centers or sites wherein redundant data centers are used to provide
site-level fault tolerance.


Why Data Protection Architecture?


• Organizations need a data protection architecture100 to combat accidental
architecture.
• Enables cost-optimized and consolidated data protection, simplifies data
protection management, and helps organizations to meet service level
requirements.
− An intentional data protection architecture is explicitly identified and then
implemented.
• Evolving data protection technology and expanding requirements have
transformed the IT industry.
− Organizations can choose from many new data protection options integrated
into their applications, operating systems (OSs), and storage systems.
− Unfortunately, with such a transformation, many organizations have fallen
into the chaos of an accidental architecture101.
• Multiple entities within an organization perform their own data protection
operations without a clear picture of the ownership of protection processes and
resources.

− Results in an ad hoc approach to data protection with no central visibility into
the data protection environment.

100A data protection architecture is a blueprint that specifies the protection
components and their interrelationships and guides an organization in providing
centralized data protection services.

101An accidental architecture consists of a fragmented set of data protection
processes, multiple unconnected data protection tools, and infrastructure silos. An
accidental architecture causes complexity in scaling the protection resources.


Business Applications
• Business applications run on compute systems102.
• Various types of business applications are enterprise resource planning (ERP)
applications, customer relationship management (CRM) applications, email
applications, ecommerce applications, database applications, and analytic
applications.
• A business application commonly provides a user interface such as a command
line interface (CLI) and graphical user interface (GUI).
− The user interface enables users to send requests and view responses.
− Also provides an application programming interface (API)103 that enables
other applications to interact with it.
• The protection applications and storage leverage these interfaces to track the
application data as it changes and also track the protection status of the data.

102Execute the requests from users or clients and pass back the generated
responses.

103Provides a flexible, easy-to-use means for integrating protection tools with the
business applications.


Hypervisors
• From a hypervisor’s perspective, each VM is a discrete set of files that store the
VM configuration, VM memory content, and guest OS and application data.
− Availability of these files is the key to run the VMs and continue business
operations. Therefore, protection of VMs should be included in the data
protection plan.
• Protection at the hypervisor level requires the hypervisor to function as the
source of all VM files managed by it.


Virtual Machine
• A VM does not have direct access to the hardware of the physical compute
system (host machine) on which it is created.

− The hypervisor translates the VM’s resource requests and maps the virtual
hardware of the VM to the hardware of the physical compute system.
− For example, a VM’s I/O requests to a virtual disk drive are translated by the
hypervisor and mapped to a file on the physical compute system’s disk drive.
A VM can be configured with one or more virtual CPUs. When a VM starts, its
virtual CPUs are scheduled by the hypervisor to run on the physical CPUs. Virtual
RAM is the amount of physical memory allocated to a VM and it can be configured
based on the requirements.

The virtual disk stores the VM’s OS, program files, and application data. A virtual
network adapter provides connectivity between VMs running on the same or
different compute systems, and between a VM and the physical compute systems.

Virtual optical drives and floppy drives can be configured to connect to either the
physical devices or to the image files, such as ISO and floppy images (.flp), on the
storage. SCSI/IDE virtual controllers provide a way for the VMs to connect to the
storage devices.

The virtual USB controller is used to connect to a physical USB controller and to
access the connected USB devices. Serial and parallel ports provide an interface
for connecting peripherals to the VM.


Containers
• Multiple containers can run on the same machine and share the operating
system kernel with other containers.
− For example, one container running Red Hat Linux might serve a database
over a virtual network to another container running Ubuntu Linux that hosts
a web server talking to that database; that web server might also talk to a
caching server running in a SUSE Linux based container.
• Containers are lightweight in nature, but running them in a production
environment can quickly become a massive effort, especially when used with
microservices, where a containerized application might translate into multiple
containers. This can introduce significant complexity if managed manually.

− Container orchestration104 is what makes that operational complexity
manageable for DevOps, since it provides a way of automating much of the
work.
− Widely deployed container orchestration platforms are based on
open-source projects such as Kubernetes and Docker Swarm.

104An automatic process of managing or scheduling the work of individual
containers for applications based on microservices within multiple clusters.
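As a rough illustration of what orchestration automates, the toy reconciliation loop below starts containers until the running count matches a desired replica count. This is a sketch only; real orchestrators such as Kubernetes do this (and far more) continuously, and all class, service, and container names here are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Service:
    name: str
    desired_replicas: int
    running: list = field(default_factory=list)  # names of healthy containers

def reconcile(service: Service) -> list:
    """One pass of a toy reconciliation loop: start containers until the
    observed state matches the desired state (illustration only)."""
    actions = []
    while len(service.running) < service.desired_replicas:
        container = f"{service.name}-{len(service.running)}"
        service.running.append(container)
        actions.append(f"start {container}")
    return actions

web = Service("web", desired_replicas=3, running=["web-0"])
print(reconcile(web))  # starts web-1 and web-2
```

Running the loop again when observed and desired state already match produces no actions, which is the essence of declarative, desired-state management.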


Primary Storage Device


A primary storage device is the persistent storage for data used by business
applications to perform transactions. A primary storage can be a standalone hard
disk drive (HDD) or solid-state drive (SSD) that is directly attached to a compute
system.

An entire storage system or some of its storage drives that store business
application data can also be the primary storage device. In addition to transactional
data, a primary storage device may also store OS and application software.

A primary storage device can be leveraged as a data source during protection
operations. Data from a primary storage device can be copied or moved directly to
protection storage without using the CPU cycles of the compute systems that run
business applications and hypervisors. Therefore, application performance is not
impacted during data protection. This may also improve the performance of data
protection operations.


Cloud-Based Storage

In the cloud-based storage model, data is stored in the cloud and managed by a
cloud data storage service provider.

Cloud-based protection storage provides the following features:

• Improves productivity and efficiency and helps manage costs through timely,
on-demand delivery.
• Eliminates unnecessary investment in infrastructure, as an organization can
subscribe and pay as per its storage requirements.
• Gives access to the data anytime and from anywhere.
• Provides data security during data transfer and storage by encrypting files
using encryption techniques; this is maintained by the service provider.


Need for Fault Tolerance


Fault tolerance is needed to improve the reliability and availability of a service. It
ensures that a system remains up and a service remains available in the event of a
failure or fault within a system component. Fault tolerance is achieved by deploying
fault-tolerant compute, network, storage, and application systems in a data center.


What is fault tolerance


Fault tolerance may be provided by software, hardware, or a combination of both.
However, end-to-end data center fault tolerance is difficult and costly to achieve.
The closer an organization gets to 100 percent fault tolerance, the more costly the
infrastructure becomes.


Fault Isolation
The example image shows two I/O paths between a compute system and a
storage system.

The compute system uses both the paths to send I/O requests to the storage
system. If an error or fault occurs on a path causing a path failure, the fault isolation
mechanism present in the environment automatically detects the failed path. It
isolates the failed path from the set of available paths and marks it as a dead path
to avoid sending the pending I/Os through it.

All pending I/Os are redirected to the live path. This helps avoid time-outs and
retry delays.
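The path-failure handling described above can be sketched as follows. The `PathManager` class and the path names are illustrative, not part of any product:

```python
class PathManager:
    """Tracks I/O paths and isolates any path reported as failed."""

    def __init__(self, paths):
        self.paths = {p: "live" for p in paths}  # all paths start out live

    def report_failure(self, path):
        # Fault isolation: mark the failed path dead so no new I/O uses it.
        self.paths[path] = "dead"

    def route_io(self, io_request):
        # Redirect I/O to the first live path, avoiding time-outs and retry
        # delays on the dead path.
        for path, state in self.paths.items():
            if state == "live":
                return path, io_request
        raise RuntimeError("no live path available")


mgr = PathManager(["path_A", "path_B"])
mgr.report_failure("path_A")              # a fault is detected on path_A
path, _ = mgr.route_io("write LBA 42")    # pending I/O goes to path_B
```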


Compute Clustering
Compute clustering provides continuous availability of services even when a virtual
machine (VM), physical compute system, OS, or hypervisor fails. In the compute
clustering technique, at least two compute systems or hypervisors work together
and are viewed as a single compute system to provide high availability and load
balancing. If one of the compute systems in a cluster fails, the service running on
the failed compute system moves to another compute system in the cluster to
minimize or avoid an outage. Clustering uses a heartbeat mechanism to determine
the health of each compute system in the cluster. The exchange of heartbeat
signals, which usually happens over a private network, allows participating cluster
members to monitor each other’s status. Clustering can be implemented among
multiple physical compute systems, multiple VMs, a VM and a physical compute
system, or multiple hypervisors.
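A highly simplified sketch of the heartbeat mechanism described above, using an injected clock instead of real timers (the class, node names, and timeout value are illustrative):

```python
class Cluster:
    """Toy heartbeat monitor: a node that misses heartbeats for longer than
    `timeout` is declared failed, and its services move to a healthy node."""

    def __init__(self, nodes, timeout=3.0):
        self.timeout = timeout
        self.last_seen = {n: 0.0 for n in nodes}
        self.services = {}  # service name -> node currently running it

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def failover_check(self, now):
        alive = [n for n, t in self.last_seen.items() if now - t <= self.timeout]
        for svc, node in self.services.items():
            if node not in alive and alive:
                self.services[svc] = alive[0]  # move service off the failed node
        return alive


cluster = Cluster(["node1", "node2"])
cluster.services["web"] = "node1"
cluster.heartbeat("node1", now=1.0)
cluster.heartbeat("node2", now=1.0)
# node1 stops sending heartbeats; by t=10 it is well past the timeout
cluster.heartbeat("node2", now=10.0)
cluster.failover_check(now=10.0)
print(cluster.services["web"])   # the web service now runs on node2
```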


Virtual Machine (VM) Live Shadow Copy


The VM live shadow copy technique ensures that the secondary VM is always
synchronized with the primary VM. The hypervisor running the primary VM captures
the sequence of events that occur on the primary VM. Then it transfers this
sequence of events to the hypervisor running on another compute system. The
hypervisor running the secondary VM receives these event sequences and sends
them to the secondary VM for execution. The primary and the secondary VMs
share the same storage, but all output operations are performed only by the
primary VM. A locking mechanism ensures that the secondary VM does not
perform write operations on the shared storage. The hypervisor posts all events to
the secondary VM at the same execution point as they occurred on the primary VM.
This way, these VMs “play” exactly the same set of events and their states are
synchronized with each other.


Link Aggregation
Link aggregation combines two or more parallel interswitch links (ISLs) into a single
logical ISL, called a link aggregation group. It optimizes network performance by
distributing network traffic across the shared bandwidth of all the ISLs in a link
aggregation group. This allows the network traffic for a pair of node (compute
system and storage system) ports to flow through all the available ISLs in the group
rather than restricting the traffic to a specific, potentially congested ISL. The
number of ISLs in a link aggregation group can be scaled depending on the
application’s performance requirements.

Link aggregation also enables network traffic failover in the event of a link failure. If
a link in a link aggregation group is lost, all network traffic on that link is
redistributed across the remaining links.

By combining ISLs, link aggregation also provides higher throughput than a single
ISL could provide. For example, the aggregation of three ISLs into a link
aggregation group provides up to 48 Gb/s throughput assuming the bandwidth of
an ISL is 16 Gb/s.
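As a sketch, frame distribution over a link aggregation group is commonly done by hashing flow identifiers, so that one flow stays on one ISL while different flows spread across the group. The CRC-based hash and the names below are illustrative:

```python
import zlib

ISL_BANDWIDTH_GBPS = 16

def choose_isl(src, dst, isls):
    # Hash the (source, destination) pair so each flow consistently maps to
    # one ISL while different flows spread across the whole group.
    return isls[zlib.crc32(f"{src}:{dst}".encode()) % len(isls)]

group = ["isl1", "isl2", "isl3"]
print(ISL_BANDWIDTH_GBPS * len(group))   # aggregate throughput: 48 Gb/s

# A link fails: remove it, and traffic redistributes over the remaining ISLs.
group.remove(choose_isl("hostA", "arrayB", group))
print(ISL_BANDWIDTH_GBPS * len(group))   # remaining throughput: 32 Gb/s
```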


Multipathing
Multipathing enables a compute system to use multiple paths for transferring data
to a storage device on a storage system.

• It enables automated path failover. This eliminates the possibility of
disrupting an application or a service due to the failure of a component on the
path, such as a network adapter, cable, port, or storage controller (SC). In the
event of a path failure, all outstanding and subsequent I/O requests are
automatically directed to alternative paths.
• It can be a built-in OS or hypervisor function, or a third-party software module
installed on the OS or hypervisor. To use multipathing, multiple paths must
exist between the compute and the storage systems. If a path fails, the
multipathing software or process detects the failed path and redirects the
pending I/Os of the failed path to another available path.
• It can also perform load balancing by distributing I/Os across all available
paths. The figure on the slide shows a configuration where four paths between
a compute system (with dual-port HBAs) and a storage device enable
multipathing.
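The load-balancing behavior can be sketched as a round-robin rotation over the available paths; on a path failure, the rotation is rebuilt without the dead path. The path names are illustrative:

```python
import itertools

# Four paths between a dual-port HBA host and a dual-controller array.
paths = ["hba1->sc1", "hba1->sc2", "hba2->sc1", "hba2->sc2"]
rotation = itertools.cycle(paths)

# Load balancing: successive I/Os are spread across all four paths.
assigned = [next(rotation) for _ in range(4)]
print(assigned)

# Path failure: rebuild the rotation from the surviving paths only.
paths.remove("hba1->sc2")
rotation = itertools.cycle(paths)
assigned_after_failure = [next(rotation) for _ in range(3)]
print(assigned_after_failure)   # no I/O is sent down the failed path
```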


Configuring Hot-swappable Components


For example, a high-end switch or director contains redundant components with
automated failover capability. Its key components such as controller blades, port
blades, power supplies, and fan modules are all hot-swappable. If a switch
controller blade fails, it is hot-swapped for a new one.


RAID
Disk and solid state drives are susceptible to failures. A drive failure may result in
data loss. Today, a single storage system may support thousands of drives. The
greater the number of drives in a storage system, the greater the probability of a
drive failure in the system.

Redundant Array of Independent Disks (RAID) is a technique in which multiple
drives are combined into a logical unit called a storage pool, and data is written in
blocks across the disks in the pool. Logical units are created from the pool by
partitioning the available capacity into smaller units, which are then assigned to
compute systems based on their storage requirements. Logical units are spread
across all the physical drives that belong to the pool. Each logical unit created
from the pool is assigned a unique ID, called a logical unit number (LUN).

RAID protects against data loss when a drive fails through the use of redundant
drives and parity. Typically, in a RAID storage system, data is distributed across
physical drives, and this set of physical drives is viewed as a single logical drive or
volume by the operating system. RAID also helps improve storage system
performance, as read and write operations are served simultaneously from multiple
drives.
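The parity protection mentioned above can be illustrated with a toy XOR example; real RAID implementations additionally stripe data and rotate parity across the drives:

```python
# Parity is the XOR of the data blocks, so any single lost block can be
# reconstructed by XOR-ing the surviving blocks with the parity block.
def xor_blocks(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xa0\x0b"
parity = xor_blocks(xor_blocks(d0, d1), d2)

# The drive holding d1 fails; rebuild it from the surviving blocks and parity.
rebuilt = xor_blocks(xor_blocks(d0, d2), parity)
assert rebuilt == d1
```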


Erasure Coding Technique


The image illustrates an example of dividing data into nine data segments (m = 9)
and three coding fragments (k = 3). The maximum number of drive failures
supported in this example is three. Erasure coding offers higher fault tolerance
(it tolerates k faults) than replication, at a lower storage cost.
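The fault tolerance of an m + k scheme follows from the fact that any m of the m + k fragments are sufficient to reconstruct the data; a sketch for the example above:

```python
m, k = 9, 3                # 9 data segments + 3 coding fragments
total_fragments = m + k

def recoverable(surviving):
    # Erasure coding can rebuild the data from any m surviving fragments.
    return surviving >= m

print(recoverable(total_fragments - k))        # 3 failures: still recoverable
print(recoverable(total_fragments - (k + 1)))  # 4 failures: data is lost
```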


Graceful Degradation
Graceful degradation of application functionality refers to the ability of an
application to maintain limited functionality even when some of the components,
modules, or supporting services are not available.

A well-designed (modern) application or service typically uses a collection of
loosely coupled modules that communicate with each other. A business application
in particular requires separation of concerns at the module level so that an outage
of a dependent service or module does not bring down the entire application. The
purpose of graceful degradation of application functionality is to prevent the
complete failure of a business application or service.

For example, consider an e-commerce application that consists of modules such as
product catalog, shopping cart, order status, order submission, and order
processing. Assume that the payment gateway is unavailable due to some
problem. It is then impossible for the order processing module of the application to
continue. If the application or service is not designed to handle this scenario, the
entire application might go offline.

However, in this same scenario, the product catalog module can still be available
to consumers to view the product catalog. The application can also allow
consumers to add items to the shopping cart and place their orders. This provides
the ability to process the orders when the payment gateway is available again or
after failing over to a secondary payment gateway.
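A minimal sketch of this degradation pattern; the `PaymentUnavailable` exception and the module functions are hypothetical stand-ins for the example above:

```python
class PaymentUnavailable(Exception):
    """Raised when the payment gateway cannot be reached."""

def submit_payment(order):
    # Stand-in for the unavailable payment gateway in the example above.
    raise PaymentUnavailable("payment gateway is down")

def place_order(order, pending_orders):
    try:
        submit_payment(order)
        return "order processed"
    except PaymentUnavailable:
        # Degrade gracefully: queue the order instead of taking the whole
        # application offline; it is processed once the gateway returns.
        pending_orders.append(order)
        return "order queued for later processing"

pending = []
status = place_order({"item": "book"}, pending)
print(status)   # catalog and cart keep working; the order waits in `pending`
```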


Fault Detection and Retry Logic


A key mechanism in a highly available application design is to implement retry logic
within the code to handle a service that is temporarily down.

When applications use other services, errors can occur because of temporary
conditions such as intermittent service, infrastructure-level faults, or network issues.
Very often this kind of problem can be solved by retrying the operation a few
milliseconds later, and the operation may succeed. The simplest form of transient
fault handling is to implement this retry logic in the application itself.

To implement this retry logic in an application, it is important to detect and identify
the particular exception that is likely to be caused by a transient fault condition.

A retry strategy must also be defined, stating how many retries can be attempted
before deciding that the fault is not transient and what the intervals between retries
should be. The logic typically attempts to execute the action a certain number of
times, then registers an error and utilizes a secondary service if the fault continues.
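The retry strategy above can be sketched as follows; the function names and the choice of `ConnectionError` as the transient fault are illustrative:

```python
import time

def with_retry(operation, fallback, attempts=3, interval=0.01):
    """Retry a possibly transient failure `attempts` times, waiting `interval`
    seconds between tries, then fall back to a secondary service."""
    for _ in range(attempts):
        try:
            return operation()
        except ConnectionError:       # treat only this exception as transient
            time.sleep(interval)      # the defined interval between retries
    return fallback()                 # fault was not transient after all

calls = {"n": 0}

def flaky_service():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("intermittent network issue")
    return "ok"

print(with_retry(flaky_service, fallback=lambda: "secondary service"))  # "ok"
```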


Persistent State Model


In a stateful application model, the session state information (for example user ID,
selected products in a shopping cart, and so on) is usually stored in compute
system memory. However, the information stored in the memory can be lost if there
is an outage with the compute system where the application runs.

In a persistent state model, the state information is stored out of the memory and is
usually stored in a repository (database). If a compute system (server) running the
application instance fails, the state information will still be available in the
repository.

A new application instance is created on another server which can access the state
information from the database and resume the processing.
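A minimal sketch of the persistent state model using SQLite as the repository; in practice the repository would be a shared database server reachable from every application instance, not an in-memory database:

```python
import sqlite3

# The repository that outlives any single application instance.
repo = sqlite3.connect(":memory:")
repo.execute("CREATE TABLE session_state (user_id TEXT PRIMARY KEY, cart TEXT)")

# Application instance 1 persists session state instead of holding it in memory.
repo.execute("INSERT INTO session_state VALUES (?, ?)", ("u42", "book,pen"))
repo.commit()

# Instance 1 fails; a new instance on another server reads the state back from
# the repository and resumes processing where the first one left off.
row = repo.execute(
    "SELECT cart FROM session_state WHERE user_id = ?", ("u42",)
).fetchone()
print(row[0])   # 'book,pen'
```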


Database Rollback
A rollback is the operation of restoring a database to a previous state by canceling
a specific transaction or transaction set. Rollbacks are important for database
integrity because they mean that the database can be restored to a consistent
previous state even after erroneous operations are performed.

Thus, a rollback occurs when a user begins to change data and realizes that the
wrong record is being updated and then cancels the operation to undo any pending
changes. Rollbacks may also be issued automatically after a server or database
crashes, e.g. after a sudden power loss. When the database restarts, all logged
transactions are reviewed; then all pending transactions are rolled back, allowing
users to reenter and save appropriate changes.

In the example shown in the image, transactions A, B, and C are performed and
committed to the database. Then, transactions D and E are performed and an issue
is identified. In such a case, transactions D and E should be rolled back. After the
rollback, transactions D and E are canceled and the database is restored to the
previous state, with only committed data.
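The committed/rolled-back example above can be reproduced with SQLite:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE txns (name TEXT)")

# Transactions A, B, and C are performed and committed to the database.
db.executemany("INSERT INTO txns VALUES (?)", [("A",), ("B",), ("C",)])
db.commit()

# Transactions D and E are performed, an issue is identified, and the pending
# changes are rolled back instead of being committed.
db.executemany("INSERT INTO txns VALUES (?)", [("D",), ("E",)])
db.rollback()

rows = [r[0] for r in db.execute("SELECT name FROM txns")]
print(rows)   # ['A', 'B', 'C'] - only the committed data remains
```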


Need for Data Backup


A backup is an additional copy of production data, created and retained for the sole
purpose of recovering lost or corrupted data. Typically, organizations implement
backup in order to protect data against accidental file deletion, application
crashes, data corruption, and disasters. Data should be protected at the local
location as well as at a remote location to ensure the availability of service.

Organizations are under pressure to deliver services to customers in accordance
with service level agreements (SLAs). The cost of unavailability of information is
greater than ever, and outages in key industries cost millions of dollars per hour.
So, it is important for any organization to have backup and recovery solutions in
place to meet the required SLAs.

Recent world events including acts of terrorism, natural disasters, and large-scale
company fraud have resulted in a new raft of legislation designed to protect
company data from loss or corruption. Many government and regulatory laws
mandate that an organization must be responsible for protecting its employees’ and
customers’ personal data.

Backup enables organizations to comply with regulatory requirements. Data loss
can have a financial impact on organizations of all sizes. The financial impact on a
company is a combination of loss of business, low productivity, legal action, and the
cost of re-creating data. Backup solutions help organizations avoid financial and
business loss in the event of a disaster.


Backup Operations
The backup server initiates the backup process for different clients based on the
backup schedule configured for them. For example, the backup for a group of
clients may be scheduled to start at 3:00 a.m. every day. The backup server
coordinates the backup process with all the components in a backup environment.

The backup server maintains the information about backup clients to be backed up
and storage nodes to be used in a backup operation. The backup server retrieves
the backup-related information from the backup catalog. Based on this information,
the backup server instructs the storage node to load the appropriate backup media
into the backup devices.

Simultaneously, it instructs the backup clients to gather the data to be backed up
and send it over the network to the assigned storage node. After the backup data
is sent to the storage node, the client sends the backup metadata (the number of
files, name of the files, storage node details, and so on) to the backup server. The
storage node receives the client data, organizes it, and sends it to the backup
device.

The storage node then sends additional backup metadata (location of the data on
the backup device, time of backup, and so on) to the backup server. The backup
server updates the backup catalog with this information. The backup data from the
client can be sent to the backup device over a LAN or a SAN network.


Backup Operations Description


Backup initiation method: The backup operation is typically initiated by a server,
but it can also be initiated by a client. A client-initiated backup is a manual process
performed on a backup client. This type of backup is useful when a user wants to
perform backup any time outside of the regular backup schedule. The user
specifies which files, directories, and file systems need to be backed up. When the
client performs a backup, it sends the backup data to the assigned storage node,
and sends the tracking information to the backup server. A server-initiated backup
is a backup initiated from the backup server. Although the backup process can be
run manually, it is normally scheduled to start automatically. The backup server sends
a backup request to a configured group of clients, causing the clients to gather the
data to be backed up.

Backup mode: Hot backup and cold backup are the two modes deployed for
backup. They are based on the state of the application when the backup is
performed. A cold backup requires the application to be shut down during the
backup process. Hence, this method is also referred to as offline backup. The
disadvantage of a cold backup is that the application is inaccessible to users during
the backup process. In a hot backup, the application is up-and-running, with users
accessing their data during the backup process. This method of backup is also
referred to as online backup. The hot backup of online production data is
challenging because data is actively being used and changed. If a file is open, it is
normally not backed up during the backup process. In such situations, an open file
agent is required to back up the open file. These agents interact directly with the
operating system or application and enable the creation of consistent copies of
open files.

Backup type: Typically, backup can be performed at the file level, block level, or
image level. In a file-level backup, one or more files on a client system are backed
up. In a block-level backup, data is backed up at the block level instead of the file
level, which typically requires client-side processing to identify the changed blocks.
An image-based backup is an image of a physical compute system or VM,
consisting of the block-by-block contents of a hard drive. The backup is saved as a
single file that is called an image. In the event of a disaster, a business’s entire
data set is preserved, allowing movement to new hardware and a swift restore of
all information.


Recovery Operations
Upon receiving a restore request, an administrator opens the restore application to
view the list of clients that have been backed up. While selecting the client for
which a restore request has been made, the administrator also needs to identify the
client that will receive the restored data. Data can be restored on the same client
for whom the restore request has been made or on any other client.

The administrator then selects the data to be restored and the specific point-in-time
to which the data has to be restored based on the RPO. Because all this
information comes from the backup catalog, the restore application needs to
communicate with the backup server.

The backup server instructs the appropriate storage node to mount the specific
backup media onto the backup device. Data is then read and sent to the client that
has been identified to receive the restored data.

Some restorations are successfully accomplished by recovering only the requested
production data. For example, the recovery process of a spreadsheet is completed
when the specific file is restored. In database restorations, additional data, such as
log files, must be restored along with the production data.

This ensures consistency for the restored data. In these cases, the RTO is
extended due to the additional steps in the restore operation. It is also important to
have security mechanisms on the backup and recovery applications to prevent
recovery of data by unauthorized users.


Types of Recovery
Operational recovery or restore typically involves the recovery of individual files
or directories after they have been accidentally deleted or corrupted.

Disaster recovery involves bringing a data center or a large part of a data center
to an operational state in case of a disaster affecting the production site location.
Data for recovery is located at offsite locations. Portable media, such as tapes,
sent to an offsite location could be used for recovery. In another example, data
backed up locally can be replicated to an offsite location by the backup application.
Recovery can be from the most recent point-in-time replicated backup data.

Full VM recovery permanently restores VMs either to the same host or to a
different virtual host; this can be done through the Live Recovery to ESXi Server
option. The VMs are restored into the datastore that is present in the storage
repositories.

Cloud disaster recovery (Cloud DR) allows enterprises to copy backed-up VMs
from their on-premises environments to the public cloud to orchestrate DR testing,
failover and failback of cloud workloads in a disaster recovery scenario. These
workloads can be run directly in the public cloud, so full deployment of your on-
premises data protection solutions in the cloud is not required in order to protect
and recover your VMs. Organizations can manage, recover, failback and test DR
plans through the Cloud DR Server (CDRS) UI. Cloud DR takes advantage of the
agility and cost-effectiveness of cloud object storage (Dell EMC ECS, AWS S3 or
Azure Blob), requires minimal footprint in the public cloud, as well as minimal
compute cycles, delivering a highly efficient disaster recovery solution.


Achieving Consistency in Backup


Typically while backing up file system data, the data to be backed up is accessed at
the file level. The backup application must have the necessary file permissions to
access the data. The backup is taken at a specific point-in-time. To ensure
consistency of the backup, no changes to the data should be allowed while the
backup is being created.

In the case of file systems, consistency can be achieved by taking the file system
offline, that is, by unmounting the file system, or by keeping the file system online
and flushing host buffers before creating the backup to ensure that all writes are
committed. No further writes are allowed to the data while the backup is being
created.

Backing up data while files are open becomes more challenging because data is
actively being used and changed. An open file is locked by the operating system
and is not copied during the backup process until the user closes it. The backup
application can back up open files by retrying the operation on files that were
opened earlier.

During the backup process, it may be possible that files opened earlier will be
closed and a retry will be successful. However, this method is not considered
robust because in some environments certain files are always open. In such
situations, the backup application or the operating system can provide open file
agents. These agents interact directly with the operating system and enable the
creation of copies of open files.

A database is composed of different files which may occupy several file systems.
Data in one file may be dependent upon data in another. A single transaction may
cause updates to several files and these updates may need to occur in a defined
order. A consistent backup of a database means that all files need to be backed up
at the same “point” or state. Consistent backups of databases can be done using a
cold (or offline) method, which means that the database is shut down while the
backup is running.

The downside is that the database will not be accessible by users. Hot backup is
used in situations where it is not possible to shut down the database. Backup is
facilitated by database backup agents that can perform a backup while the
database is active. The disadvantage associated with a hot database backup is
that the agents can negatively affect the performance of the database application
server.


Working of Synthetic Full Backup


A synthetic backup takes data from an existing full backup and merges it with the
data from any existing incremental backup. This effectively results in a new full
backup of the data. This backup is called synthetic because the backup is not
created directly from the production data.

All subsequent incrementals use the created synthetic full backup as a new starting
point. A previously used full backup file remains on the backup device until it is
automatically deleted according to the backup retention policy.

A synthetic full backup enables a full backup copy to be created offline without
disrupting the I/O operation on the production volume. This also frees up network
resources from the backup process, making them available for other production
uses. Synthetic backups enable organizations to take advantage of a reduced
backup window.
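As a sketch, merging a full backup with subsequent incrementals, where each backup maps file names to versions. Real implementations also track deletions and operate on blocks or chunks rather than a simple mapping:

```python
# Last full backup plus two incrementals, newest last.
full = {"a.txt": "v1", "b.txt": "v1"}
incr_mon = {"b.txt": "v2"}   # b.txt changed on Monday
incr_tue = {"c.txt": "v1"}   # c.txt was created on Tuesday

# The synthetic full is built entirely from existing backups: later copies of
# a file override earlier ones, yielding a new full backup without touching
# the production volume.
synthetic_full = {**full, **incr_mon, **incr_tue}
print(synthetic_full)   # {'a.txt': 'v1', 'b.txt': 'v2', 'c.txt': 'v1'}
```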


Backup Multiplexing
Multiplexing allows backups from multiple client machines to send data to a single
tape drive simultaneously. Multiplexing is useful when the tape drive throughput is
faster than the rate at which data can be extracted from the source (client).

Multiplexing may decrease backup time for large numbers of clients over slow
networks, but it does so at the cost of recovery time. Restores from multiplexed
tapes must pass over all non-applicable data.

This action increases restore times. When recovery is required, demultiplexing
causes delays in the restore. Multiplexing is primarily used with physical tape
drives to keep them streaming and avoid the “shoe shining” effect.

Note: Multistreaming

• Multistreaming is a process that divides the backup jobs into multiple sub-jobs
(streams) that run simultaneously and sends data to the destination backup
device.
• Multistreaming allows the use of all available backup devices on the system
by splitting the backup jobs into multiple jobs across all available tape devices.
− It will increase the overall backup throughput compared to the sequential
method.
• Multistreaming is useful when performing large backup jobs, since it is more
efficient to divide multiple jobs between multiple backup devices.


Direct-Attached Backup
Direct-attached backups are generally better suited for smaller environments. The
key advantage of direct-attached backups is speed. The tape devices can operate
at the speed of the channels.

In a direct-attached backup, the backup device is not shared, which may lead to
silos of backup devices in the environment. It might be difficult to determine
whether everything is being backed up properly.

As the environment grows, however, there will be a need for central management
of all backup devices and to share the resources to optimize costs. An appropriate
solution is to share the backup devices among multiple servers.


LAN-Based Backup
In a LAN-based backup, the data to be backed up is transferred from the backup
client (source), to the backup device (destination) over the LAN, which may affect
network performance.

Streaming across the LAN also affects network performance of all systems
connected to the same segment as the backup server.

Network resources are severely constrained when multiple clients access and
share the same backup device. This impact can be minimized by adopting a
number of measures such as configuring separate networks for backup and
installing dedicated storage nodes for some application servers.


Agent-Based Backup Approach


This is a popular way to protect virtual machines due to the same workflow
implemented for a physical machine. This means backup configurations and
recovery options follow traditional methods that administrators are already familiar
with.

This approach allows file-level backup and restoration. However, this backup
approach doesn’t capture virtual machine configuration files.

This approach doesn’t provide the ability to back up and restore the VM as a
whole. The agent running on the compute system consumes CPU cycles and
memory resources.

If multiple VMs on a compute system are backed up simultaneously, then the
combined I/O and bandwidth demands placed on the compute system by the
various backup operations can deplete the compute system resources.

This may impact the performance of the services or applications running on the
VMs. To overcome these challenges, the backup process can be offloaded from
the VMs to a proxy server. This can be achieved by using the image-based backup
approach.


Image-Based Backup
Image-level backup makes a copy of the virtual machine disk and configuration
associated with a particular VM. The backup is saved as a single entity called VM
image. This type of backup is suitable for restoring an entire VM in the event of a
hardware failure or human error such as the accidental deletion of the VM. It is also
possible to restore individual files and folders/directories within a virtual machine.

In an image-level backup, the backup software can backup VMs without installing
backup agents inside the VMs or at the hypervisor-level. The backup processing is
performed by a proxy server that acts as the backup client, thereby offloading the
backup processing from the VMs.

The proxy server communicates with the management server responsible for
managing the virtualized compute environment. It sends commands to create a
snapshot of the VM to be backed up and to mount the snapshot to the proxy
server. A snapshot captures the configuration and virtual disk data of the target VM
and provides a point-in-time view of the VM.

The proxy server then performs backup by using the snapshot. Performing an
image-level backup of a virtual machine disk provides the ability to execute a bare
metal restore of a VM.

Given the scalability and sheer growth in the size of virtualized and cloud
environments, the workload burden placed on one proxy server can quickly build
up. In this scenario, the recommendation is to provision multiple proxies to handle
the combined workload and increase the amount of parallelism.


Changed Block Tracking Mechanism


To further enhance the image-based backup some of the vendors support changed
block tracking mechanism. This feature identifies and tags any blocks that have
changed since the last VM snapshot.

This enables the backup application to backup only the blocks that have changed,
rather than backing up every block. If changed block tracking is enabled for a VM
disk, the virtual machine kernel will create an additional file where it stores a map of
all the VM disk's blocks.

Once a block is changed, it is recorded in this map file. This way the kernel can
easily communicate to a backup application which blocks of a file have changed
since a certain point-in-time.

The backup application can then perform a backup by copying only these changed
blocks. Changed block tracking technique dramatically reduces the amount of data
to be copied before additional data reduction technologies (deduplication) are
applied. It also reduces the backup windows and the amount of storage required for
protecting VMs.

Note: Changed block tracking to restore

This technique reduces the recovery time (RTO) compared to full image restores by
only restoring the delta of the changed VM blocks. During a restore process, it is
determined which blocks have changed since the last backup. For example, if a
large database is corrupted, a changed block recovery would restore just the parts
of the database that have changed since the last backup was made.
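The tracking map can be sketched by fingerprinting each block and comparing against the map captured at the last snapshot. The block contents and the choice of SHA-256 are illustrative; real implementations record changes in a map file as writes occur rather than re-scanning:

```python
import hashlib

def block_map(disk_blocks):
    # Map each block index to a fingerprint of its contents.
    return {i: hashlib.sha256(b).hexdigest() for i, b in enumerate(disk_blocks)}

disk = [b"boot", b"data-1", b"data-2"]
snapshot_map = block_map(disk)       # map recorded at the last snapshot

disk[1] = b"data-1-modified"         # only one block changes afterwards

changed = [i for i, h in block_map(disk).items() if h != snapshot_map[i]]
print(changed)   # [1] - only the changed block is copied (or restored)
```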


Recovery-in-Place
Recovery-in-place (instant VM recovery) refers to running a VM directly from the purpose-built backup appliance, using a backed-up copy of the VM image instead of first restoring that image file to the production system. In the meantime, the VM data is restored to the primary storage from the backup copy. Once the recovery is complete, the workload is redirected to the original VM.

One of the primary benefits of the recovery-in-place mechanism is that it eliminates the need to transfer the image from the backup area to the primary storage (production) area before it is restarted, so the applications running on those VMs can be accessed more quickly. It reduces the RTO.

Data Deduplication

In a data center environment, a certain percentage of the data retained on backup media is redundant. The typical backup process for most organizations consists of a series of daily incremental backups and weekly full backups. Daily backups are usually retained for a few weeks, and weekly full backups are retained for several months. Because of this process, multiple copies of identical or slowly changing data are retained on backup media, leading to a high level of data redundancy.

A large number of operating system, application, and data files are common across multiple systems in a data center environment. Identical files, such as Word documents, PowerPoint presentations, and Excel spreadsheets, are stored by many users across an environment. Backups of these systems contain a large number of identical files. Additionally, many users keep multiple versions of files that they are currently working on. Many of these files differ only slightly from other versions but are seen by backup applications as new data that must be protected.

This redundant data creates several challenges for organizations. Backing up redundant data increases the amount of storage needed to protect the data and consequently increases the storage infrastructure cost, yet organizations must protect their data within a limited budget. Organizations are also running out of backup window time and facing difficulties meeting recovery objectives. Backing up large amounts of duplicate data to a remote site or cloud for DR purposes is also cumbersome and requires significant network bandwidth.

Data deduplication provides a solution for organizations to overcome these challenges in backup and production environments. Deduplication is the process of detecting and identifying the unique data segments (chunks) within a given set of data to eliminate redundancy. Only one copy of the data is stored; subsequent copies are replaced with a pointer to the original data.

The effectiveness of data deduplication is expressed as a deduplication or reduction ratio, denoting the ratio of the amount of data before deduplication to the amount of data after deduplication. This ratio is typically written as "ratio:1" or "ratio X" (for example, 10:1 or 10X). For example, if 200 GB of data consumes 20 GB of storage capacity after data deduplication, the space reduction ratio is 10:1. Every data deduplication vendor claims that their product offers a certain ratio of data reduction; however, the actual deduplication ratio varies based on many factors.
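The ratio definition above is simple arithmetic, shown here as a short Python helper (the function name is an illustrative choice, not from any product):

```python
# The deduplication ratio from the text: the amount of data before
# deduplication divided by the amount of storage it consumes afterward.

def dedup_ratio(before_bytes, after_bytes):
    """Return the space reduction ratio, e.g. 10.0 for a 10:1 reduction."""
    if after_bytes <= 0:
        raise ValueError("stored size must be positive")
    return before_bytes / after_bytes

# The example from the text: 200 GB reduced to 20 GB is a 10:1 ratio.
ratio = dedup_ratio(200, 20)
print(f"{ratio:.0f}:1")  # 10:1
```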

These factors are as follows:

• Retention period: The period of time for which backup copies are retained. The longer the retention period, the greater the chance that identical data exists in the backup set, which increases the deduplication ratio and storage space savings.

• Frequency of full backup: As more full backups are performed, more of the same data is repeatedly backed up, resulting in a higher deduplication ratio.

• Change rate: The rate at which the data received from the backup application changes from backup to backup. Client data with few changes between backups produces higher deduplication ratios.

• Data type: Backups of user data such as text documents, PowerPoint presentations, spreadsheets, and e-mails are known to contain redundant data and are good deduplication candidates. Other data, such as audio, video, and scanned images, is highly unique and typically does not yield good deduplication ratios.

• Deduplication method: The deduplication method also determines the effective deduplication ratio. Variable-length, sub-file deduplication (discussed later in this module) delivers the highest degree of deduplication.

File-level deduplication (also called single-instance storage) detects and removes redundant copies of identical files in a backup environment. File-level deduplication compares a file to be backed up with those already stored by checking its attributes against an index. If the file is unique, it is stored and the index is updated; if not, only a pointer to the existing file is stored. The result is that only one instance of the file is saved, and subsequent copies are replaced with a pointer to the original file. Indexes for file-level deduplication are significantly smaller, which reduces the computational time needed to determine duplicates. Backup performance is, therefore, less affected by the deduplication process. File-level deduplication is simple but does not address the problem of duplicate content inside files. A change in any part of a file results in it being classified as a new file and saved as a separate copy, as shown in the figure. Typically, file-level deduplication is implemented in a NAS environment.
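Single-instance storage can be sketched as a hash-keyed index, as below. This is an illustrative toy, not a product design; the class and method names are assumptions:

```python
# Sketch of file-level (single-instance) deduplication: a file's content
# hash is checked against an index; only unique content is stored, and
# duplicate files are replaced by a pointer to the stored copy.

import hashlib

class SingleInstanceStore:
    def __init__(self):
        self.store = {}     # content hash -> stored file content
        self.pointers = {}  # file name -> hash of the single stored instance

    def backup(self, name, content):
        digest = hashlib.sha256(content).hexdigest()
        if digest not in self.store:   # unique file: store one instance
            self.store[digest] = content
        self.pointers[name] = digest   # duplicates get only a pointer

sis = SingleInstanceStore()
sis.backup("a.docx", b"quarterly report")
sis.backup("copy-of-a.docx", b"quarterly report")  # identical content
sis.backup("b.docx", b"new draft")
print(len(sis.store))  # 2 stored instances for 3 files
```

Note how a one-byte change to `a.docx` would produce a new hash and a whole new stored instance — the file-level limitation the text describes.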

Block-level deduplication (sub-file deduplication) operates by inspecting data segments within files and removing duplication. Smaller segments make it easier for the deduplication system to find duplicates efficiently. Sub-file deduplication detects duplicate data not only within a single file but also across files. There are two forms of sub-file deduplication: fixed-length and variable-length. Fixed-length block deduplication divides files into fixed-length blocks and uses a hash algorithm to find duplicate data.

Fixed-length block deduplication fixes the chunking at a specific size, for example 8 KB or 64 KB. The smaller the chunk, the more likely it is to be identified as redundant, which results in greater reductions. However, fixed-length block deduplication faces a challenge when data is inserted into or deleted from a file. Inserting or deleting data causes a shift in all the data after the point of insertion or deletion, which makes all the blocks after that point different. The data is the same, but the blocks are cut at different points. So a small insertion of data near the beginning of a file can cause the entire file to be backed up and stored again.
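The boundary-shift problem is easy to demonstrate. The snippet below (an illustrative experiment with an arbitrary 8-byte chunk size) hashes fixed-size chunks of a buffer before and after a one-byte insertion:

```python
# Demonstrates the boundary-shift problem with fixed-length chunking:
# inserting a single byte near the start changes the hash of every
# fixed-size chunk after the insertion point.

import hashlib

def fixed_chunks(data, size=8):
    """Hash of each fixed-size chunk, in order."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]

original = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ" * 4
shifted = b"!" + original  # one byte inserted at the front

# Count chunks whose hash survived the shift.
same = sum(a == b for a, b in zip(fixed_chunks(original), fixed_chunks(shifted)))
print(same)  # 0 -- no chunk survives the one-byte shift
```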

Variable-length block-level deduplication is an advanced deduplication technique that provides greater storage efficiency for redundant data, regardless of where new data has been inserted. As the name suggests, the length of the segments varies, achieving higher deduplication ratios. In this method, if there is a change in a block, only the boundary for that block is adjusted, leaving the remaining blocks unchanged. Variable-length block deduplication yields greater granularity in identifying duplicate data, improving upon the limitations of file-level and fixed-length block-level deduplication.

Organizations with fast data growth, highly virtualized environments, and remote offices benefit greatly from variable-length deduplication over a fixed-block approach. Variable-length deduplication reduces backup storage and, when performed at the client, also reduces network traffic, making it ideal for remote backup.
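Variable-length chunking works by deriving chunk boundaries from the content itself, so boundaries realign after an insertion. The sketch below is a deliberately tiny content-defined chunker; the window size and divisor are arbitrary illustrative parameters, not values any product uses:

```python
# Minimal content-defined (variable-length) chunking sketch: a chunk
# boundary is declared wherever a value computed over a small sliding
# window hits a target, so boundaries follow content, not fixed offsets.

WINDOW, DIVISOR = 4, 7  # illustrative parameters only

def cdc_chunks(data):
    chunks, start = [], 0
    for i in range(WINDOW, len(data) + 1):
        window = data[i - WINDOW:i]
        if sum(window) % DIVISOR == 0:  # content-derived boundary
            chunks.append(data[start:i])
            start = i
    if start < len(data):
        chunks.append(data[start:])
    return chunks

base = bytes(range(50, 120))
edited = base[:10] + b"\x00" + base[10:]   # insert one byte mid-stream

a, b = set(cdc_chunks(base)), set(cdc_chunks(edited))
# Boundaries after the insertion realign, so most chunks are still shared.
print(len(a & b))  # 9 chunks survive the edit (vs. 0 with fixed-length)
```

Only the chunks around the edit change; everything downstream deduplicates against the previous backup, which is exactly the advantage the text describes.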

Source-based data deduplication eliminates redundant data at the source (backup clients) before it is transmitted to the backup device. A deduplication agent is installed on the backup client to perform deduplication, and the deduplication server maintains a hash index of the deduplicated data. The deduplication agent running on the clients checks each file for duplicate content: it creates a hash value for each chunk of the file and checks with the deduplication server whether the hash is already present there because its corresponding chunk was stored previously.

If there is no match on the server, the client sends the hash and the corresponding chunk to the deduplication server to be stored as backup data. If the chunk has already been backed up, the client does not send it to the deduplication server, which ensures that redundant backup data is eliminated at the client. The deduplication server can be deployed in different ways: the deduplication server software can be installed on a general-purpose physical server (as shown in the figure) or on VMs, and some vendors offer the deduplication server along with a backup device as an appliance.

The deduplication server may also support encryption for secure backup data transmission and replication for disaster recovery purposes. Source-based deduplication reduces the amount of data transmitted over the network from the source to the backup device, thus requiring less network bandwidth. There is also a substantial reduction in the capacity required to store the backup data. Backing up only unique data from clients reduces the backup window. However, a deduplication agent running on the client may impact backup performance, especially when a large amount of data needs to be backed up. When an image-level backup is implemented, the backup workload is moved to a proxy server.

The deduplication agent is installed on the proxy server to perform deduplication without impacting the VMs that are running applications. Organizations can implement source-based deduplication when performing remote office/branch office (ROBO) backup to their centralized data center. Cloud service providers can also implement source-based deduplication when performing backup (backup as a service) from the consumer's location to their own.
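The hash-first exchange described above can be modeled in a few lines. This is a sketch with an in-process stand-in for the server; the class names and the single-hash round trip are simplifying assumptions, not a real agent/server protocol:

```python
# Sketch of source-based deduplication: the client computes a chunk hash
# and asks the server whether it already holds that chunk; only chunks
# the server has never seen cross the "network".

import hashlib

class DedupServer:
    def __init__(self):
        self.chunks = {}  # hash -> chunk body

    def has(self, digest):
        return digest in self.chunks

    def put(self, digest, body):
        self.chunks[digest] = body

def client_backup(server, chunks):
    """Back up chunks, returning how many bodies were actually sent."""
    sent = 0
    for body in chunks:
        digest = hashlib.sha256(body).hexdigest()
        if not server.has(digest):  # only unique chunks are transmitted
            server.put(digest, body)
            sent += 1
    return sent

server = DedupServer()
first = client_backup(server, [b"os-image", b"app", b"data-v1"])
second = client_backup(server, [b"os-image", b"app", b"data-v2"])
print(first, second)  # 3 1 -- the second backup sends only its unique chunk
```

The second backup sends one chunk instead of three, which is the bandwidth saving the text attributes to source-based deduplication.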


Inline deduplication performs deduplication on the backup data before it is stored on the backup device. With inline data deduplication, the incoming backup stream is divided into small chunks and then compared to data that has already been deduplicated. The inline deduplication method requires less storage space than the post-process approach because duplicate data is removed as it enters the system. However, inline deduplication may slow down the overall backup process.

In post-process deduplication, the backup data is first stored to disk in its native backup format and deduplicated after the backup is completed. In this approach, the deduplication process is separated from the backup process, and the deduplication happens outside the backup window. However, the full backup data set is transmitted across the network to the storage target before the redundancies are eliminated, so this approach requires adequate storage capacity and network bandwidth to accommodate the full backup data set. Organizations can consider implementing target-based deduplication when their backup application does not have built-in deduplication capabilities.


Data Replication
Data is one of the most valuable assets of any organization. It is stored, mined, transformed, and utilized continuously, and is a critical component in the operation and function of organizations. Outages, whatever the cause, are extremely costly, and customers are concerned about data availability at all times.

Safeguarding data and keeping it highly available are among the top priorities of any organization. To avoid disruptions in business operations, it is necessary to implement data protection technologies in a data center.

Based on business requirements, data can be replicated to one or more locations. For example, data can be replicated within a data center, between data centers, from a data center to a cloud, or between clouds.

In a replication environment, a compute system accessing the production data from one or more LUNs on storage system(s) is called a production compute system. These LUNs are known as source LUNs, production LUNs, or simply the source. A LUN to which the production data is replicated is called the target LUN, or simply the target or replica.


Primary Uses of Replicas


• Alternative source for backup: Under normal backup operations, data is read from the production LUNs and written to the backup device. This places an additional burden on the production infrastructure because the production LUNs are simultaneously involved in production operations and servicing data for backup operations. To avoid this situation, a replica can be created from the production LUN and used as the source for backup operations, which alleviates the backup I/O workload on the production LUNs.
• Fast recovery and restart: For critical applications, replicas can be taken at
short, regular intervals. This allows easy and fast recovery from data loss. If a
complete failure of the source (production) LUN occurs, the replication solution
enables one to restart the production operation on the replica to reduce the
RTO.
• Decision-support activities, such as reporting: Running reports using the
data on the replicas greatly reduces the I/O burden placed on the production
device.
• Testing platform: Replicas are also used for testing new applications or
upgrades. For example, an organization may use the replica to test the
production application upgrade; if the test is successful, the upgrade may be
implemented on the production environment.
• Data migration: Another use for a replica is data migration. Data migrations are
performed for various reasons such as migrating from a smaller capacity LUN to
one of a larger capacity for newer versions of the application.


Replica Consistency
Consistency is a primary requirement to ensure the usability of a replica device. In the case of file systems (FS), consistency can be achieved either by taking the FS offline (unmounting it) or by keeping the FS online and flushing the compute system buffers before creating the replica.

File systems buffer data in compute system memory to improve application response time. These memory buffers must be flushed to the disks before the replica is created to ensure data consistency on it. If the memory buffers are not flushed to disk, the data on the replica will not contain the information that was buffered in the compute system.

Similarly, in the case of databases, consistency can be achieved either by taking the database offline to create a consistent replica or by keeping it online. If the database is online, it is available for I/O operations, and transactions to the database update the data continuously.

When a database is replicated while it is online, changes made to the database during the replication must be applied to the replica to make it consistent. A consistent replica of an online database is created by using the dependent write I/O principle or by momentarily holding I/O to the source before creating the replica.


Types of Replication
• Local replication helps to
− Replicate data within the same storage system or within the same data center.
− Restore the data in the event of data loss, or enable restarting the application immediately to ensure business continuity. Local replication can be implemented at the compute, storage, and network layers.
• Remote replication helps to
− Replicate data to remote locations (which can be geographically dispersed).
− Mitigate the risks associated with regional outages resulting from natural or human-made disasters.
o During disasters, the services can be moved (failover) to a remote location to ensure continuous business operation.
− Replicate the data to the cloud for DR purposes. Remote replication can also be implemented at the compute, storage, and network layers.
o Data can be replicated synchronously or asynchronously.


File System Snapshot


• A snapshot is a virtual copy of a set of files, VM, or LUN as they appeared at a
specific point-in-time (PIT). A point-in-time copy of data contains a consistent
image of the data as it appeared at a given point in time.
• Snapshots can establish recovery points in a fraction of the time that full copies require and can significantly reduce RPO by supporting more frequent recovery points. If a file is lost or corrupted, it can typically be restored from the latest snapshot in just a few seconds.
• FS snapshot is a pointer-based replica that requires a fraction of the space used
by the production FS.
• When a snapshot is created, a bitmap and blockmap are created in the
metadata of the snapshot FS. The bitmap is used to keep track of blocks that
are changed on the production FS after the snapshot creation. The blockmap is
used to indicate the exact address from which the data is to be read when the
data is accessed from the snapshot FS.
• After the creation of the FS snapshot, all reads from the snapshot are actually
served by reading the production FS. If a write I/O is issued to the production
FS for the first time after the creation of a snapshot, the I/O is held and the
original data of production FS corresponding to that location is moved to the
snapshot FS. Then, the write is allowed to the production FS.
• The bitmap and the blockmap are updated accordingly. To read from the
snapshot FS, the bitmap is consulted. If the bit is 0, then the read will be
directed to the production FS. If the bit is 1, then the block address will be
obtained from the blockmap and the data will be read from that address on the
snapshot FS. Read requests from the production FS work as normal.
• Typically, read-only snapshots are created to preserve the state of the production FS at some PIT, but sometimes writable FS snapshots are also created for business operations such as testing and decision support.
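The bitmap/blockmap bookkeeping described in the bullets above can be sketched as follows. This is an in-memory illustration that stores the preserved data itself rather than an address (a simplification of the blockmap described in the text); all names are assumptions:

```python
# Sketch of copy-on-first-write snapshot bookkeeping: the first write to a
# production block since snapshot creation preserves the original data and
# sets the bitmap bit; snapshot reads consult the bitmap to decide whether
# to read from production (bit 0) or from the preserved copy (bit 1).

class FsSnapshot:
    def __init__(self, production):
        self.production = production  # block number -> current data
        self.bitmap = {}              # block -> 1 if preserved since snapshot
        self.blockmap = {}            # block -> preserved original data

    def write(self, block, data):
        if not self.bitmap.get(block):            # first write since snapshot
            self.blockmap[block] = self.production[block]
            self.bitmap[block] = 1
        self.production[block] = data             # then allow the write

    def snapshot_read(self, block):
        if self.bitmap.get(block):                # bit 1: read preserved copy
            return self.blockmap[block]
        return self.production[block]             # bit 0: read production

fs = FsSnapshot({0: b"old0", 1: b"old1"})
fs.write(0, b"new0")
print(fs.snapshot_read(0), fs.production[0])  # b'old0' b'new0'
print(fs.snapshot_read(1))                    # b'old1'
```

The snapshot stays a point-in-time view while production moves on, at the cost of one extra copy per first write.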


VM Clone
• When the cloning operation completes, the clone becomes a separate VM. The
changes made to a clone do not affect the parent VM. Changes made to the
parent VM do not appear in a clone.
• Installing a guest OS and applications on a VM is a time-consuming task. With clones, administrators can make many copies of a virtual machine from a single installation and configuration process.
− For example, in an organization, the administrator can clone a VM for each new employee, with a suite of preconfigured software applications.
• A snapshot is used to save the current state of a virtual machine so that it can be reverted to that state in case of an error; a clone is used when a copy of a VM is required for separate use.
− A full clone is an independent copy of a VM that shares nothing with the parent VM. Because a full clone needs its own independent copy of the virtual disks, the cloning process may take a relatively long time.
− A linked clone is made from a snapshot of the parent VM. The snapshot is
given a separate network identity and assigned to the hypervisor to run as
an independent VM.
• All files available on the parent at the moment of the snapshot creation continue
to remain available to the linked clone VM in read-only mode.

− The ongoing changes (writes) to the virtual disk of the parent do not affect
the linked clone and the changes to the virtual disk of the linked clone do not
affect the parent. All the writes by the linked clone are captured in a delta
disk.


Snapshot – RoW

Some pointer-based virtual replication implementations use redirect-on-write (RoW) technology.

• Redirects new writes destined for the source LUN to a reserved LUN in the
storage pool.
• Replica (snapshot) still points to the source LUN.

− All reads from the replica are served from the source LUN.


Remote Replication – Synchronous


• Storage-based synchronous remote replication provides near zero RPO where
the target is identical to the source at all times.
• Writes must be committed to the source and the remote target prior to
acknowledging “write complete” to the production compute system.
− Writes on the source cannot occur until each preceding write has been
completed and acknowledged. This ensures that data is identical on the
source and the target at all times.
− Writes are transmitted to the remote site exactly in the order in which they
are received at the source. Therefore, write ordering is maintained and it
ensures transactional consistency when the applications are restarted at the
remote location.
• Most of the storage systems support consistency groups, which allow all LUNs
belonging to a given application, usually a database, to be treated as a single
entity and managed as a whole. This helps to ensure that the remote images
are consistent.

− The remote images are always restartable copies.


Note:

Application response time is increased with synchronous remote replication because writes must be committed on both the source and the target before sending the "write complete" acknowledgment to the compute system. The degree of impact on response time depends primarily on the distance and the network bandwidth between sites. If the bandwidth provided for synchronous remote replication is less than the maximum write workload, there will be times during the day when the response time is excessively elongated, causing applications to time out. The distances over which synchronous replication can be deployed depend on the application's capability to tolerate the extension in response time. Typically, synchronous remote replication is deployed for distances of less than 200 km (125 miles) between the two sites.


Remote Replication – Asynchronous


• In asynchronous remote replication, a write from a production compute system
is committed to the source and immediately acknowledged to the compute
system.
• Asynchronous replication also mitigates the impact on the application's response time because the writes are acknowledged immediately to the compute system. This enables replication of data over distances of up to several thousand kilometers between the source site and the secondary site (remote location).
• Compute system writes are collected into buffer (delta set) at the source. This
delta set is transferred to the remote site in regular intervals.
• Adequate buffer capacity should be provisioned to perform asynchronous replication; this makes asynchronous replication resilient to a temporary increase in the write workload or a loss of the network link.
• RPO depends on the size of the buffer, the available network bandwidth, and the write workload to the source. This replication can take advantage of locality of reference (repeated writes to the same location).
− If the same location is written to multiple times in the buffer prior to transmission to the remote site, only the final version of the data is transmitted, which conserves link bandwidth.
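The delta-set folding described in the last bullet can be sketched like this. It is an illustrative model only; the class name and dictionary-based "delta set" are assumptions made for the example:

```python
# Sketch of asynchronous replication buffering: writes collect in a delta
# set keyed by location, so repeated writes to the same location are folded
# and only the final version is transmitted at each interval.

class AsyncReplicator:
    def __init__(self):
        self.delta_set = {}  # location -> latest data since last transfer

    def write(self, location, data):
        # Immediately "acknowledged"; later writes overwrite earlier ones.
        self.delta_set[location] = data

    def transfer(self, remote):
        """Ship the folded delta set to the remote site, then start fresh."""
        sent = len(self.delta_set)
        remote.update(self.delta_set)
        self.delta_set = {}
        return sent

repl = AsyncReplicator()
remote = {}
for i in range(5):
    repl.write("block-7", f"version-{i}".encode())  # hot location rewritten
repl.write("block-9", b"once")
print(repl.transfer(remote))  # 2 updates sent instead of 6 writes
```

Six writes collapse into two transmitted updates — the link-bandwidth saving the text attributes to locality of reference.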


Multi-Site Replication
• Multi-site replication mitigates the risks identified in two-site replication. In a
multi-site replication, data from the source site is replicated to two or more
remote sites. The example shown in the figure is a three-site remote replication
solution.
• In this approach, data at the source is replicated to two different storage
systems at two different sites. The source to remote site 1 (target 1) replication
is synchronous with a near-zero RPO. The source to remote site 2 (target 2)
replication is asynchronous with an RPO in the order of minutes.
− At any given instant, the data at the remote site 1 and the source is identical.
The data at the remote site 2 is behind the data at the source and the
remote site 1.
− The replication network links between the remote sites will be in place but
not in use.
− The difference in the data between the remote sites is tracked so that if a
source site disaster occurs, operations can be resumed at the remote site 1
or the remote site 2 with incremental resynchronization between these two
sites.
• The key benefit of this replication is the ability to failover to either of the two
remote sites in the case of source site failure, with disaster recovery
(asynchronous) protection between the remote sites.

− Disaster recovery protection is always available if any one-site failure occurs. During normal operations, all three sites will be available and the production workload will be at the source site.


Remote Replication CDP Operation


• For an asynchronous operation, writes at the source CDP appliance are
accumulated, and redundant blocks are eliminated. Then, the writes are
sequenced and stored with their corresponding timestamp.
− The data is then compressed, and a checksum is generated. It is then
scheduled for delivery across the IP or FC network to the remote CDP
appliance.
− After the data is received, the remote appliance verifies the checksum to
ensure the integrity of the data. The data is then written to the remote journal
volume.
• In the synchronous replication mode, the host application waits for an
acknowledgment from the CDP appliance at the remote site before initiating the
next write.

− The synchronous replication mode impacts the application's performance under heavy write loads.


Key PowerMax and VMAX Family Remote Replication Options

SRDF/S
• SRDF/S (Synchronous mode) maintains a real-time (synchronous) mirrored copy of production data (R1 devices) at a physically separated Symmetrix storage system (R2 devices).

SRDF/A
• SRDF/A (Asynchronous mode) mirrors data from the R1 devices while always maintaining a dependent-write consistent copy of the data on the R2 devices.
• The copy of the data at the secondary site is typically only seconds behind the primary site.

SRDF/Star
• SRDF/Star is a disaster recovery solution that consists of three sites: primary (production), secondary, and tertiary.
• The secondary site synchronously mirrors the data from the primary site, and the tertiary site asynchronously mirrors the production data.

SRDF/Metro
• SRDF/Metro provides the host read/write capability to both R1 and R2 volumes while both volumes are read/write on the SRDF link.
• Replication between the two sites is performed synchronously across the link, with a Metro distance of 200 kilometers.


Why do we need data archiving?


Data in primary storage is actively accessed and changed. As data ages, it is less likely to change and eventually becomes "fixed," though it continues to be accessed by applications and users. This data is called fixed data. Fixed data is growing at over 90 percent annually, and keeping it in primary storage systems poses several challenges.

First, preserving data on the primary storage system causes increasing consumption of expensive primary storage.

Second, high-performance primary storage is used to store less frequently accessed data, making it difficult to justify the cost of that storage.

Third, data that must be preserved over a long period for compliance reasons may be modified or deleted by users, which poses a risk of a compliance breach. Finally, the backup of high-growth fixed data results in an increased backup window and related backup storage cost. Data archiving addresses these challenges.


Data Archiving and Its Benefits


Data archiving is the process of moving fixed data that is no longer actively
accessed to a separate lower cost archive storage system for long term retention
and future reference. With archiving, the capacity on expensive primary storage
can be reclaimed by moving infrequently-accessed data to lower-cost archive
storage.

Archiving fixed data before taking a backup helps to reduce the backup window and backup storage acquisition costs. Data archiving helps preserve data that may be needed for future reference and data that must be retained for regulatory compliance. For example, new product innovation can be fostered if engineers can access archived project materials such as designs, test results, and requirements documents.

Similarly, both active and archived data can help data scientists drive new
innovations or help to improve current business processes. In addition, government
regulations and legal/contractual obligations mandate organizations to retain their
data for an extended period of time.


Data Archiving Operation


The data archiving operation involves the archiving agent, the archive server/policy
engine, and the archive storage. The archiving agent scans the primary storage to
find files that meet the archiving policy defined on the archive server (policy
engine).

After the files are identified for archiving, the archive server creates the index for
the files. Once the files have been indexed, they are moved to the archive storage
and small stub files are left on the primary storage. Each archived file on primary
storage is replaced with a stub file. The stub file contains the address of the
archived file. As the size of the stub file is small, it significantly saves space on
primary storage.

From a client's perspective, the data movement from primary storage to archive storage is completely transparent.
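The scan-index-move-stub flow described above can be sketched as a small in-memory model. The policy threshold, dictionary structures, and `archive://` location format are all illustrative assumptions, not part of any real archiving product:

```python
# Sketch of the archiving operation: files that meet the policy (here, an
# assumed age threshold) are moved to archive storage and replaced on
# primary storage by a small stub recording the archived location.

ARCHIVE_AGE_DAYS = 365  # assumed policy threshold

def archive_scan(primary, archive):
    """primary maps name -> (age_in_days, content); stubs replace archived files."""
    for name, entry in list(primary.items()):
        age, content = entry
        if age >= ARCHIVE_AGE_DAYS:            # file meets the archive policy
            location = f"archive://{name}"     # illustrative address format
            archive[location] = content        # move data to archive storage
            primary[name] = ("stub", location) # small stub left behind

primary = {"old.doc": (730, b"fixed data"), "new.doc": (5, b"active data")}
archive = {}
archive_scan(primary, archive)
print(primary["old.doc"])  # ('stub', 'archive://old.doc')
```

A real agent would also index the file before moving it, and the file system would resolve the stub transparently on client access.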


Correlating Storage Tiering and Archive


Storage tiering is a technique of establishing a hierarchy of storage types (tiers)
and identifying the candidate data to relocate to the appropriate storage type to
meet service level requirements at a minimal cost. Each storage tier has different
levels of protection, performance, and cost.

As the tier number decreases, the storage performance improves but the cost of storage increases, which limits the usable storage capacity. The higher the tier number, the higher the storage capacity can be, due to its cost advantage.

Archive storage is typically configured as the final, or highest-numbered, tier in storage tiering. Keeping frequently used data in lower-numbered tiers, called performance tiers, improves application performance.

Moving less-frequently accessed data or fixed data to the highest numbered tier,
called the archive tier, can free up storage space in performance tiers and reduce
the cost of storage.


Tiering Example: NAS to Archive File Movement


The image illustrates an example of file-level storage tiering, where files are moved
from a NAS device (primary storage system) to an archive storage system. The
environment includes a policy engine, where tiering policies are configured. The
policy engine facilitates automatically moving files from primary to archive storage.

Before moving a file to archive storage, the policy engine scans the NAS device to
identify files that meet the predefined tiering policies. After identifying the candidate
files, the policy engine creates stub files on the NAS device and then moves the
candidate files to the destination archive storage.

The small, space-saving stub files point to the actual files in the archive storage.
When an application server (NAS client) tries to access a file from its original
location on the NAS device, the actual file is provided from the archive storage.


Archiving Use Case: Email Archiving


• Email archiving is the process of moving emails from the mail server to
archive storage. After an email is archived, it is retained for years, based on the
retention policy.
• Legal Dispute/Government Compliance:
− An organization may be involved in a legal dispute and need to produce
all emails within a specified time period containing specific keywords that
were sent to or from certain people, or all emails from all individuals
involved in stock sales or transfers. Failure to comply with these
requirements could cause an organization to incur penalties. Email
archiving also helps to meet government compliance requirements such
as Sarbanes-Oxley and SEC regulations.
• Mailbox Space Saving:
− An organization may configure a quota on each mailbox to limit its size. A
fixed quota forces users to delete emails as they approach the quota size.
However, end users often need to access emails that are weeks, months,
or even years old. With email archiving, organizations can free up space in
user mailboxes and still provide access to older emails, which are moved
to archive storage.
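The quota-driven use case above can be sketched as follows; the one-year threshold and the sample messages are assumptions for illustration, not a real mail server's policy format.

```python
# Illustrative sketch: free mailbox space by moving messages older than a
# retention-policy threshold into an archive (in-memory model only).

ARCHIVE_OLDER_THAN_DAYS = 365  # assumed policy threshold

mailbox = [
    {"subject": "Audit 2015", "age_days": 2000},
    {"subject": "Lunch today", "age_days": 1},
]
email_archive = []

def archive_old_email():
    global mailbox
    keep = []
    for msg in mailbox:
        if msg["age_days"] > ARCHIVE_OLDER_THAN_DAYS:
            email_archive.append(msg)   # retained in archive per policy
        else:
            keep.append(msg)
    mailbox = keep

archive_old_email()
# Old mail is still accessible -- from the archive rather than the mailbox.
print(len(mailbox), len(email_archive))  # -> 1 1
```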


Key Features of CAS


The key features of CAS are as follows:

Content integrity: It provides assurance that the stored data has not been altered.
If the fixed data is altered, CAS generates a new content address for the altered
data, rather than overwriting the original fixed data.

Content authenticity: It assures the genuineness of stored data. This is achieved
by generating a unique content address for each object and validating the content
addresses of stored objects at regular intervals. Content authenticity is assured
because the address assigned to each object is as unique as a fingerprint. Every
time an object is read, CAS uses a hashing algorithm to recalculate the object’s
content address as a validation step and compares the result to the original
content address. If the object validation fails, CAS rebuilds the object.
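The address-recalculation check described above can be sketched with a standard hash function. SHA-256 is an assumption here; CAS products may use other hash algorithms, and the store below is a plain dictionary standing in for the object storage layer.

```python
import hashlib

# Sketch of CAS content addressing and the read-time validation step
# (illustrative in-memory model; hash choice is an assumption).

def content_address(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

store = {}

def write(data: bytes) -> str:
    addr = content_address(data)
    store[addr] = data
    return addr

def read(addr: str) -> bytes:
    data = store[addr]
    # Validation: recompute the address and compare with the original.
    if content_address(data) != addr:
        raise IOError("object validation failed; rebuild required")
    return data

addr = write(b"fixed content")
print(read(addr))  # validated on every read
```

Because the address is derived from the content itself, any silent alteration of the stored bytes makes the recomputed address differ, which is what triggers the rebuild path.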

Single-instance storage: CAS uses the unique content address to guarantee that
only a single instance of an object is stored. When a new object is written, the
CAS system is polled to see whether an object with the same content address
already exists. If the object is already in the system, it is not stored again;
instead, only a pointer to that object is created.

Retention enforcement: Protecting and retaining objects is a core requirement of
archive storage. After an object is stored in a CAS system and a retention policy
is defined, CAS does not make the object available for deletion until the policy
expires.
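Single-instance storage and retention enforcement can be combined in one small sketch. This is an in-memory illustration under assumed names; real CAS systems enforce these rules inside the storage layer, not in application code.

```python
import hashlib
import time

# Sketch combining single-instance storage and retention enforcement
# (hypothetical model; retention times and names are assumptions).

store = {}         # content address -> data
retain_until = {}  # content address -> epoch seconds

def cas_write(data: bytes, retention_secs: float) -> str:
    addr = hashlib.sha256(data).hexdigest()
    if addr not in store:              # identical content is stored only once
        store[addr] = data
    retain_until[addr] = max(retain_until.get(addr, 0),
                             time.time() + retention_secs)
    return addr

def cas_delete(addr: str):
    if time.time() < retain_until.get(addr, 0):
        raise PermissionError("retention policy has not expired")
    store.pop(addr, None)

a1 = cas_write(b"contract.pdf bytes", retention_secs=3600)
a2 = cas_write(b"contract.pdf bytes", retention_secs=3600)
# Same content -> same address -> one stored instance.
try:
    cas_delete(a1)
except PermissionError as e:
    print(e)   # deletion refused until retention expires
```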


Key Features of CAS (Continued)


• Location independence: CAS uses a unique content address, rather than
directory path names or URLs, to retrieve data. This makes the physical location
of the stored data irrelevant to the application that requests the data.
• Data protection: CAS provides both local and remote protection to the objects
stored on it. In the local protection option, data objects are either mirrored or
parity protected. For remote protection, objects are replicated to a secondary
CAS at a remote location. In this case, the objects remain accessible from the
secondary CAS if the primary CAS fails.
• Performance: CAS stores all objects on disk, which provides faster access to
the objects compared to tapes and optical discs.
• Self-healing: CAS automatically detects and repairs corrupted objects and alerts
the administrator about the potential problem. CAS can be configured to alert
remote support teams who can diagnose and repair issues remotely.
• Audit trails: CAS keeps track of management activities and any access or
disposition of data. Audit trails are mandated by compliance requirements.

The various data migration techniques are as follows:

SAN-based data migration involves migrating data at the block level between
storage systems within a data center or across data centers. In a SAN-based
technique, migration software installed on the storage system performs direct
data migration between storage systems. Data migration between storage
systems can also be performed using a virtualization appliance.

NAS-based data migration involves migrating data at the file level between NAS
systems. File migration between NAS systems can also be performed using
intermediary compute systems or a virtualization appliance.

In a host-based migration, a specialized tool is installed on the compute system
to perform the migration. In a virtualized environment, it is important to migrate
running VMs between hypervisors for various reasons such as avoiding downtime
and balancing the workload across hypervisors. The two key hypervisor-based
migration techniques are VM live migration and VM storage migration.

Application migration typically involves moving an application from one
environment to another. Organizations have numerous migration options, and
choosing the appropriate solution depends on several factors. Ease of
configuration and management, hardware capabilities, the ability to throttle the
rate of data movement, and the potential application impact are all critical when
making a choice.

The best solution in one migration may not necessarily be the best solution for
another. No one-size-fits-all migration tool or solution exists. Each migration
solution has its own set of advantages and challenges, so it is important to
choose an appropriate solution for each migration operation.

SAN-based data migration can also be implemented using a virtualization
appliance in the SAN. For data migration, the virtualization appliance (controller)
typically provides a translation layer in the SAN, between the compute systems
and the storage systems. The LUNs created on the storage systems are assigned
to the appliance. The appliance abstracts the identity of these LUNs and creates a
storage pool by aggregating LUNs from the storage systems. A virtual volume is
created from the storage pool and assigned to the compute system. When an I/O is
sent to a virtual volume, it is redirected through the virtualization layer at the SAN to
the mapped LUNs.

For example, an administrator wants to perform a data migration from storage
system A to storage system B, as shown in the figure. The virtualization layer
handles the migration of data, which enables the LUNs to remain online and
accessible while data is migrating. In this case, no physical changes are required
because the compute system still points to the same virtual volume on the
virtualization layer. However, the mapping information on the virtualization layer
must be changed. These changes can be executed dynamically and are
transparent to the end user. The key advantage of using a virtualization appliance
is support for data migration between multi-vendor, heterogeneous storage
systems.
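The remapping described in this example can be sketched as follows. The array and LUN identifiers are hypothetical, and dictionaries stand in for the appliance's mapping table and the backend arrays.

```python
# Sketch of the translation layer: the compute system addresses a virtual
# volume; the appliance maps it to a backend LUN. Migration only changes
# the mapping, so the host-visible volume never changes (illustrative model).

backend = {("ArrayA", "LUN1"): "application data"}   # array/LUN -> data
vmap = {"vvol-01": ("ArrayA", "LUN1")}               # virtual volume -> LUN

def io_read(vvol: str) -> str:
    return backend[vmap[vvol]]        # I/O redirected through the appliance

def migrate(vvol: str, target: tuple):
    backend[target] = backend[vmap[vvol]]  # data copied by the appliance
    vmap[vvol] = target                    # remap; host still sees vvol-01

print(io_read("vvol-01"))
migrate("vvol-01", ("ArrayB", "LUN7"))
print(io_read("vvol-01"))  # unchanged from the host's point of view
```

The design point the example illustrates: because only `vmap` changes, the LUNs stay online and no reconfiguration is needed on the compute system.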

Organizations require a robust file-sharing environment that is dynamically
expandable, easily maintained, and flexible. When businesses outgrow their
current file servers or become concerned about regulatory compliance, it is time
to upgrade the infrastructure. NAS-based data migration allows organizations to
move data from their old file servers to NAS systems. Nowadays, organizations
want to move file-level data to new NAS systems, especially scale-out NAS, to
meet their business demands. A key requirement for NAS-based data migration
is that the file-level data remains accessible to clients at all times.


In a NAS to NAS direct data migration, file-level data is migrated from one NAS
system to another directly over the LAN, without the involvement of any external
server. The two primary options for performing NAS-based migration are the
NDMP protocol and software tools. In this example, the new NAS system initiates
the migration operation and pulls the data directly from the old NAS system over
the LAN. The key advantage of NAS to NAS direct data migration is that no
external component (host or appliance) is needed to perform or initiate the
migration process.
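A minimal sketch of the pull model, with dictionaries standing in for the two NAS file systems. Real migrations use NDMP or a vendor migration tool rather than this simplified copy loop; the paths and contents are assumptions.

```python
# Sketch of NAS-to-NAS direct migration: the new system initiates and pulls
# files from the old one over the LAN (illustrative in-memory model only).

old_nas = {"/share/a.txt": "alpha", "/share/b.txt": "beta"}
new_nas = {}

def pull_migrate(source: dict, target: dict):
    for path, data in source.items():   # the new NAS pulls each file directly
        target[path] = data

pull_migrate(old_nas, new_nas)
print(sorted(new_nas))  # all files now present on the new NAS
```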

Application migration typically involves moving an application from one data
center environment to another, for example from a physical to a virtual
environment. In a virtualized environment, an application can also be moved from
one hypervisor to another for various business reasons, such as balancing
workload to improve performance and availability. In an application migration from
a physical to a virtual environment, the physical server running the application is
converted into a virtual machine. This option usually requires converter software
that clones the data on the hard disk of the physical compute system and
migrates the disk content (application, OS, and data) to an empty VM.

After this, the VM is configured based on the physical compute system’s
configuration, and the VM is booted to run the application. The VM live migration
technique can be used to move a running VM from one hypervisor to another
without downtime. This method involves copying the contents of VM memory from
the source hypervisor to the target and then transferring control of the VM’s disk
files to the target hypervisor. Next, the VM is suspended on the source hypervisor
and resumed on the target hypervisor.

Application Migration Strategies

Forklift Migration Strategy: In this strategy, rather than moving applications in
parts over time, all applications are picked up at once and moved to the new
environment. Tightly coupled applications (multiple applications that depend on
each other and cannot be separated) or self-contained applications might be
better served by the forklift approach.

Hybrid Migration Strategy: In this strategy, some parts of the application are
moved to the new environment while leaving the other parts of the application in
place. Rather than moving the entire application at once, parts of it can be moved
and optimized, one at a time. This strategy is good for large systems that involve
several applications and those that are not tightly coupled.


Key Attributes of SDDC


• SDDC is viewed as an important step toward a fully virtualized data center
(VDC), and is regarded as the necessary foundational infrastructure for
third-platform transformation.
• The key attributes of SDDC are:

− Abstraction and pooling: SDDC abstracts and pools IT resources across
heterogeneous infrastructure. IT resources are pooled to serve multiple
users or consumers using a multi-tenant model. Multi-tenancy enables
multiple consumers to share the pooled resources, which improves
utilization of the resource pool. Resources from the pool are dynamically
assigned and reassigned according to consumer demand.
− Automated, policy-driven provisioning including data protection: In the
SDDC model, IT services, including data protection, are dynamically
created and provisioned from available resources based on defined
policies. If a policy changes, the environment dynamically and
automatically responds with the new requested service level.
− Unified management: Traditional multi-vendor, siloed environments require
independent management, which is complex and time consuming. SDDC
provides a unified storage management interface that provides an abstract
view of the IT infrastructure. Unified management provides a single control
point for the entire infrastructure across all physical and virtual resources.
− Self-service: SDDC enables automated provisioning and self-service
access to IT resources. This enables organizations to allow users to select
services from a self-service catalog and self-provision them.
− Metering: The usage of resources per user is measured and reported by a
metering system. Metering helps in controlling and optimizing resource
usage as well as generating bills for the utilized resources.
− Open and extensible: An SDDC environment is open and easy to extend,
which enables adding new capabilities. An extensible architecture enables
integrating multi-vendor resources, and external management interfaces and
applications into the SDDC environment through the use of APIs.
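The metering attribute above can be illustrated with a small usage-recording sketch. The metric names and unit prices are made-up assumptions, not any product's rate card.

```python
# Sketch of metering: record per-consumer usage, then generate a bill
# (illustrative model; rates below are assumptions).

RATES = {"vm_hours": 0.05, "gb_stored": 0.02}  # assumed unit prices

usage = {}  # consumer -> {metric: amount}

def record(consumer: str, metric: str, amount: float):
    usage.setdefault(consumer, {}).setdefault(metric, 0)
    usage[consumer][metric] += amount

def bill(consumer: str) -> float:
    return round(sum(RATES[m] * v for m, v in usage[consumer].items()), 2)

record("tenant-a", "vm_hours", 100)
record("tenant-a", "gb_stored", 500)
print(bill("tenant-a"))  # -> 15.0  (100*0.05 + 500*0.02)
```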


Software Controller
• The control plane in a software-defined data center is implemented by a
software controller. The controller is software that:
• Discovers the available underlying resources and provides an aggregated view
of resources. It abstracts the underlying hardware resources (compute, storage,
and network) and pools them.
− This enables the rapid provisioning of resources from the pool, based on
pre-defined policies that align to the service level agreements for different
users.
− Enables storage management and provisioning.
• Enables organizations to dynamically, uniformly, and easily modify and manage
their infrastructure.
• Enables an administrator to manage the resources, node connectivity, and
traffic flow. It also controls the behavior of underlying components, allows
applying policies uniformly across the infrastructure components, and enforces
security, all from a software interface.
• Provides interfaces that enable software external to the controller to request
resources and access these resources as services.
• CLI and GUI are the native management interfaces of the controller. APIs are
used by external software to interact with the controller.


Architecture of SDDC
• The SDDC architecture decouples the control plane from the data plane.
− It separates the control functions from the underlying infrastructure
components and provides them to an external software controller.
− The centralized control plane provides policies for processing and
transmission of data, which can be uniformly applied across the multi-vendor
infrastructure components.
− The policies can also be upgraded centrally to add new features and to
address application requirements.
• The controller usually provides CLI and GUI for administrators to manage the IT
infrastructure and configure the policies. It also automates and orchestrates
many hardware-based or component-specific management operations.
− This reduces the need for manual operations that are repetitive, error-prone,
and time-consuming.
• The software controller provides APIs for external management tools and
orchestrators to manage data center infrastructure and orchestrate controller
operations.
• The SDDC architecture enables users to view and access IT resources as a
service from a self-service portal.
− The portal provides a service catalog that lists a standardized set of services
available to the users.
• The service catalog allows a user to request or order a service from the catalog
in a self-service way.

− The request is forwarded to the software controller by an orchestrator or a
management tool. Upon receiving the request, the controller provisions the
appropriate resources to deliver the service.


Key Benefits of SDDC


• Agility: SDDC enables faster provisioning of resources based on workload
policies. Consumers provision infrastructure resources via a self-service portal.
These capabilities significantly improve business agility.
• Cost efficiency: SDDC enables organizations to use commodity hardware and
also existing infrastructure, which significantly lowers CAPEX.
• Improved control: SDDC provides improved control over application
availability and security through policy-based governance. SDDC provides
automated data protection and disaster recovery features. Automated, policy-
driven operations help in reducing manual errors.
• Centralized management: An SDDC is automated, and managed by
intelligent, policy-based data center management software, vastly simplifying
governance and operations.
− A single, unified management platform allows centrally monitoring and
administering of all applications across physical geographies,
heterogeneous infrastructure, and hybrid clouds.
− Organizations can deploy and manage workloads in physical, virtual, and
cloud environments with a unified management experience.
• Flexibility: SDDC enables organizations to use heterogeneous commodity
hardware and the most advanced hardware technologies as they see fit.

− Lower-value workloads can run on commodity hardware, while software-based
services and mission-critical applications can run on advanced, more
intelligent infrastructure.
− SDDC also supports the adoption of cloud model through the use of
standard protocols and APIs.


Functions of SDS Controller


• Discovery: The SDS controller discovers various types of physical storage
systems available in a data center to gather data about the components and
bring them under its control and management.
− Includes information on the storage pools and the storage ports for each
storage system.
• Resource abstraction and pooling: The SDS controller abstracts the physical
storage systems into virtual storage systems and virtual storage pools according
to the policies configured by the administrators. The SDS controller also:
− Enables an administrator to define storage services for the end users.
− Provides three types of interfaces to configure and monitor the SDS
environment as well as provision virtual storage resources.
− Command line interface (CLI), graphical user interface (GUI), and
application programming interface (API)105.
• Service provisioning: The defined storage services are typically visible and
accessible to the end users through a service catalog. The service catalog:

− Allows the end user to specify a compute system for which virtual storage
must be provisioned, and the virtual storage system and virtual storage
pool from which the storage is derived.
− Automates the storage provisioning tasks and delivers virtual storage
resources based on the requested services.

105 API enables external management tools and applications to interact with the
SDS controller for extracting data, monitoring the SDS environment, and creating
logical storage resources.


Virtual Storage Pool


• Virtual storage resources are provisioned to the end users from the virtual
storage pools. While provisioning a storage resource, users choose the virtual
storage pool from which the storage will be drawn, and the SDS controller
automatically selects an appropriate underlying storage pool to meet the
provisioning request.
− Users do not need to know the details of the underlying physical storage
infrastructure. This is in contrast to the traditional storage provisioning where
users provision storage resources from the physical storage systems.
− Examples of virtual storage pool are block storage virtual pool and file
storage virtual pool.
• Multiple virtual storage pools help to create tiered storage services such as a
gold pool (high), silver pool (moderate), and bronze pool (low).
• The SDS controller is usually capable of matching the existing storage pools to
the virtual storage pool characteristics specified by an administrator.
− The administrator can enable automatic assignment of the matching storage
pools to the virtual storage pool, or carry out the process manually.
• A storage pool may belong to multiple virtual storage pools. A virtual storage
pool may reside in a single data center or it may span multiple data centers.
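The pool-matching behavior described above can be sketched as follows. The pool attributes and the "gold" characteristics are illustrative assumptions, not a specific SDS product's schema.

```python
# Sketch of an SDS controller matching physical storage pools to a virtual
# storage pool's characteristics (attributes below are assumptions).

physical_pools = [
    {"name": "poolA", "drive": "ssd",    "raid": "raid5", "free_gb": 800},
    {"name": "poolB", "drive": "nl-sas", "raid": "raid6", "free_gb": 5000},
]

def match_pools(virtual_pool_spec: dict) -> list:
    """Return names of physical pools matching every requested characteristic."""
    return [p["name"] for p in physical_pools
            if all(p.get(k) == v for k, v in virtual_pool_spec.items())]

gold = {"drive": "ssd", "raid": "raid5"}  # assumed "gold" pool characteristics
print(match_pools(gold))  # -> ['poolA']
```

An administrator could let such matches be assigned automatically, or review the returned candidates and assign them manually, mirroring the two options in the text.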


Virtual Switch Example

[Figure: Two physical compute systems, each hosting a virtual switch connected
through a physical NIC to a shared physical switch. VMs (a web/app server and a
DB server) communicate through the virtual switches, and clients connect over
the physical network.]

Consider the example of a web application that runs on a VM and needs to
communicate with a database (DB) server, as shown in the image.

• The database server is hosted on another VM on the same compute system.


• The two VMs can be connected via a virtual switch to enable them to
communicate with each other.
− Because the traffic between the VMs does not travel over a network external
to the compute system, the data transfer speed between the VMs is
increased.
• The VMs residing on different compute systems may need to communicate
either with each other, or with other physical compute systems such as a client
machine.

− The virtual switch must be connected to the network of physical compute
systems.
− The VM traffic travels over both the virtual switch and the network of physical
compute systems as shown in the image.


What is Cloud Computing


According to U.S. National Institute of Standards and Technology, Special
Publication 800-145, “Cloud computing is a model for enabling ubiquitous,
convenient, on-demand network access to a shared pool of configurable computing
resources (e.g., networks, servers, storage, applications, and services) that can be
rapidly provisioned and released with minimal management effort or service
provider interaction.”

The cloud model is similar to utility services such as electricity, water, and
telephone. When consumers use these utilities, they are typically unaware of how
the utilities are generated or distributed. The consumers periodically pay for the
utilities based on usage. Consumers simply hire IT resources as services from the
cloud without the risks and costs associated with owning the resources.

Cloud services are accessed from different types of client devices over wired and
wireless network connections. Consumers pay only for the services that they use,
either based on a subscription or based on resource consumption. The figure on
the slide illustrates a generic cloud computing environment, wherein various types
of cloud services are accessed by consumers from different client devices over
different network types.


Traditional IT vs. Cloud Computing


Traditionally, IT resources such as hardware and software are acquired by an
organization to support its business applications. The acquisition and
provisioning of new resources commonly follow a rigid procedure that includes
approvals from the concerned authorities.

As a result, they may take up a considerable amount of time. This can delay
operations and increase the time-to-market. Additionally, to the extent allowed by
the budget, the IT resources required for an application are sized based on peak
usage. This results in incurring high up-front capital expenditure (CAPEX) even
though the resources remain underutilized for a majority of the time.

As workloads continue to grow and new technologies emerge, businesses may
not be able to increase investments proportionally. Further, a significant portion of
the IT budget goes to supporting and maintaining the existing IT infrastructure,
leaving little for innovative business solutions.

In cloud computing, users rent IT resources such as storage, processing, network
bandwidth, applications, or a combination of them as cloud services. Cloud
computing enables on-demand resource provisioning and scalability. IT resources
are provisioned by users through a self-service portal backed by an automated
fulfillment process. These capabilities provide quick time-to-market and a potential
competitive advantage.

Resource consumption is measured by a metering service, which may be used to
bill users per consumption. Users can de-provision rented resources when they
are no longer needed. This reduces the investment in IT infrastructure and
improves resource utilization. It also reduces expenses associated with IT
infrastructure management, floor space, power, and cooling. Further, the
reduction in IT maintenance tasks can free resources for new business initiatives,
discovery of new markets, and innovation.

A computing infrastructure can be classified as a cloud only if it has certain
essential characteristics, which are discussed subsequently.

From a business perspective, each advancing wave of technology and business
sophistication changes the way IT works. Businesses must adopt new IT products
and solutions rapidly to stay competitive in the market. This may force
organizations to periodically upgrade their IT infrastructure and acquire new
software and hardware resources. As an organization’s capital expenditure
(CAPEX) rises, the risk associated with the investment also increases. For small
and medium-sized businesses, this may be a big challenge that eventually
restricts their ability to grow. For an individual, it may not be sensible or affordable
to purchase a new application each time it is needed only for a brief period. This
image shows various requirements and constraints from both a business and an
individual perspective, and the ways a cloud can address them.


Cloud Service Models - IaaS


• Defined as “the capability provided to the consumer is to provision processing,
storage, networks, and other fundamental computing resources where the
consumer is able to deploy and run arbitrary software, which can include
operating systems and applications.”
• Consumer does not manage or control the underlying cloud infrastructure but
has control over operating systems, storage, and deployed applications; and
possibly limited control of select networking components.
• IaaS can even be implemented internally by an organization with internal IT
managing the resources and services. IaaS pricing can be subscription-based
or based on resource usage. Keeping in line with the cloud characteristics, the
provider pools the underlying IT resources which are shared by multiple
consumers through a multi-tenant model.


Cloud Service Models - PaaS


• Defined as “the capability provided to the consumer is to deploy onto the cloud
infrastructure consumer-created or acquired applications created using
programming languages, libraries, services, and tools supported by the
provider.”
• Consumer does not manage or control the underlying cloud infrastructure
including network, servers, operating systems, or storage, but has control over
the deployed applications and possibly configuration settings for the
application-hosting environment.
• Includes compute, storage, and network resources along with platform
software. Platform software includes the OS, databases, programming
frameworks, middleware, and tools to develop, test, deploy, and manage
applications.
• PaaS usage fees are typically calculated based on factors such as the number
of consumers, the types of consumers (developer, tester, and so on), the time
for which the platform is in use, and the compute, storage, or network resources
consumed by the platform.
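A fee calculation along the lines described can be sketched as follows. Every rate and weight here is a made-up assumption for illustration, not any provider's actual pricing.

```python
# Illustrative PaaS fee calculation using the factors listed above
# (all rates are assumptions; resource usage is collapsed to two inputs).

def paas_fee(consumers: int, hours_in_use: float,
             compute_units: float, storage_gb: float) -> float:
    return round(consumers * 2.0          # per-consumer charge
                 + hours_in_use * 0.10    # time the platform is in use
                 + compute_units * 0.05   # compute consumed
                 + storage_gb * 0.02,     # storage consumed
                 2)

print(paas_fee(consumers=5, hours_in_use=200, compute_units=1000, storage_gb=250))
# -> 85.0  (10 + 20 + 50 + 5)
```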


Cloud Service Models - SaaS


• Defined as “the capability provided to the consumer is to use the provider’s
applications running on a cloud infrastructure. The applications are accessible
from various client devices through either a thin client interface, such as a web
browser (e.g., web-based email), or a program interface.”
• Consumer does not manage or control the underlying cloud infrastructure
including network, servers, operating systems, storage, or even individual
application capabilities, with the possible exception of limited user-specific
application configuration settings.
• SaaS applications execute in the cloud and usually do not need installation on
end-point devices. This enables a consumer to access the application on
demand from any location and use it through a web browser on a variety of end-
point devices.
• Customer Relationship Management (CRM), email, Enterprise Resource
Planning (ERP), and office suites are examples of applications delivered
through SaaS.


Public Cloud
A cloud infrastructure deployed by a provider to offer cloud services to the general
public and/or organizations over the Internet.

There may be multiple tenants (consumers) who share common cloud resources. A
provider typically has default service levels for all consumers of the public cloud.
The provider may migrate a consumer’s workload at any time, to any location.

Some providers may optionally provide features that enable a consumer to
configure their account with specific location restrictions.

Services may be free, subscription-based, or provided on a pay-per-use model. A
public cloud provides the benefits of low up-front expenditure on IT resources and
enormous scalability. However, consumer concerns include network availability,
risks associated with multi-tenancy, limited or no visibility and control over the
cloud resources and data, and restrictive default service levels.


Private Cloud
Many organizations may not wish to adopt public clouds as they are accessed over
the open Internet and used by the general public. With a public cloud, an
organization may have concerns related to privacy, external threats, and lack of
control over the IT resources and data.

When compared to a public cloud, a private cloud offers organizations a greater
degree of privacy and control over the cloud infrastructure, applications, and data.
The private cloud model is typically adopted by larger organizations that have the
resources to deploy and operate private clouds.

There are two variants of a private cloud: on-premise and externally-hosted.


Community Cloud - On Premise


A community cloud is a cloud infrastructure that is set up for the sole use of a
group of organizations with common goals or requirements. The organizations
participating in the community typically share the cost of the community cloud
service. If the organizations operate under common guidelines and have similar
requirements, they can all share the same cloud infrastructure and lower their
individual investments.

Since the costs are shared by fewer consumers than in a public cloud, this option
may be more expensive than a public cloud. However, a community cloud may offer
a higher level of control and protection against external threats than a public cloud.
There are two variants of a community cloud: on-premise and externally-hosted. In
an on-premise community cloud, one or more participant organizations provide
cloud services that are consumed by the community.

Each participant organization may provide cloud services, consume services, or
both. At least one community member must provide cloud services for the
community cloud to be functional. The cloud infrastructure is deployed on the
premises of the participant organization(s) providing the cloud services. The
organizations consuming the cloud services connect to the clouds of the provider
organizations over a secure network. The organizations providing cloud services
require IT personnel to manage the community cloud infrastructure.

Participant organizations that provide cloud services may implement a security
perimeter around their cloud resources to separate them from their other non-cloud
IT resources. Additionally, the organizations that consume community cloud
services may implement a security perimeter around the IT resources that access
the community cloud.

Many network configurations are possible in a community cloud. The figure on the
slide illustrates an on-premise community cloud, the services of which are
consumed by enterprises P, Q, and R. The community cloud comprises two cloud
infrastructures that are deployed on the premises of Enterprise P and Enterprise Q,
and combined to form a community cloud.


Externally Hosted Community Cloud


Participant organizations of the community outsource the implementation of the
community cloud to an external cloud service provider.

The cloud infrastructure is hosted on the premises of the external cloud service
provider and not within the premises of any of the participant organizations.

The provider:

• Manages the cloud infrastructure and facilitates an exclusive community cloud
  environment for the participant organizations. The IT infrastructure of each
  participant organization connects to the externally-hosted community cloud
  over a secure network.
• Enforces security mechanisms in the community cloud as per the requirements
of the participant organizations. The cloud infrastructure may be shared by
multiple tenants.
• Provides a security perimeter around the community cloud resources, which are
  separated from the resources of other cloud tenants by access policies
  implemented in the provider’s software.

Using an external provider’s cloud infrastructure for the community cloud may offer
access to a larger pool of resources as compared to an on-premise community
cloud.


Hybrid Cloud
A hybrid cloud is composed of two or more individual clouds, each of which can be
a private, community, or public cloud. There can be several possible compositions
of a hybrid cloud, as each constituent cloud may be one of the five variants
discussed previously.

Each hybrid cloud has different properties in terms of parameters such as
performance, cost, and security.

The composition may change over time as component clouds join and leave. In a
hybrid cloud environment, the component clouds are combined through the use of
open or proprietary technology such as interoperable standards, architectures,
protocols, data formats, and application programming interfaces (APIs).

The use of such technology enables data and application portability.

The image illustrates a hybrid cloud that is composed of an on-premise private
cloud deployed by enterprise Q and a public cloud serving enterprise and individual
consumers in addition to enterprise Q.


Cloud Computing Benefits

Cloud computing provides the capability to provision IT resources quickly and at
any time, considerably reducing the time required to deploy new applications and
services. This enables businesses to reduce time-to-market and respond more
quickly to market changes.

It enables consumers to rent required IT resources on a pay-per-use or
subscription basis. This reduces a consumer’s IT capital expenditure, as
investment is required only for the resources needed to access the cloud services.

Cloud computing can ensure availability at varying levels, depending on the
provider’s policy towards service availability. Redundant infrastructure components
enable fault tolerance for cloud deployments.

Data in a cloud can be broken into small pieces and distributed across a large
cluster of nodes in such a manner that an entire data set can be reconstructed
even if there is failure of individual nodes.

Cloud-based applications may be capable of maintaining limited functionality even
when some of their components, modules, or supporting services are not available.
A service provider may also create multiple service availability zones both within
and across geographically dispersed data centers.

A service availability zone is a location with its own set of resources. Each zone is
isolated from the other zone so that a failure in one zone does not impact the other.
If a service is distributed among several zones, consumers of that service can fail
over to other zones in the event of a zone failure.

Consumers can unilaterally and automatically scale IT resources to meet workload
demand.

Applications and data reside centrally and can be accessed from anywhere over a
network from any device such as desktop, mobile, and thin client.

Infrastructure management tasks are reduced to managing only those resources
that are required to access the cloud services. The cloud infrastructure is managed
by the cloud provider, and tasks such as software updates and renewals are
handled by the provider.


Cloud computing enables collaboration between disparate groups of people by
allowing them to share resources and information and to access them
simultaneously from many locations.

A cloud can also be leveraged to ensure business continuity, since IT services can
be rendered unavailable by causes such as natural disasters, human error,
technical failures, and planned maintenance.


Drivers for Cloud-based Data Protection


Organizations need to protect their data regularly to avoid losses, stay compliant,
and preserve data integrity. Data explosion poses challenges such as strains on
the backup window, the IT budget, and IT management. Enterprises must also
comply with regulatory and litigation requirements. These challenges can be
addressed with the emergence of cloud-based data protection.

• Simplified Management: As an organization’s data protection environment
  grows, it must manage a wide range of software and hardware resources. The
  tasks involved include configuration, applying the latest patches and updates,
  and carrying out upgrades and replacements. Furthermore, workload and
  manpower requirements increase with the size of the IT infrastructure. When an
  organization uses cloud-based data protection services, its infrastructure
  management tasks are reduced to managing only those resources that are
  required to access the cloud services.
• On-demand self-service provisioning: In a traditional data protection
  environment, provisioning IT resources takes more time because of rigid
  procedures and approvals. In a cloud-based data protection environment, IT
  resources can be provisioned on demand through a service catalog. This
  considerably reduces the time to provision resources for data protection.
• Reduced CAPEX: Organizations may need additional IT resources at times
  when workloads are greater, but may not want to incur the capital expense of
  purchasing additional IT resources to support the data protection environment.
  Cloud-based data protection enables an organization to rent IT resources on a
  pay-per-use or subscription basis, reducing the organization’s IT capital
  expenditure.
• Flexible Scalability: Consumers may need to increase IT resources to meet
  workload demand for a short period of time without investing in new IT
  resources. Cloud-based data protection provides the capability to scale
  resources in or out as required.
• Recover data to any location/device: Organizations need to plan for the risk
  of disaster. To mitigate these risks, data needs to be replicated to remote
  locations. Cloud-based data protection enables an organization to recover data
  from any place to any device.


Backup as a Service
• Enables organizations to procure backup services on demand in the cloud.
  Organizations can build their own cloud infrastructure and provide backup
  services on demand to their employees/users. Some organizations prefer a
  hybrid cloud option for their backup strategy, keeping a local backup copy in
  their private cloud and using a public cloud for their remote copy for DR
  purposes. To provide backup as a service, organizations and service providers
  must have the necessary backup technologies in place to meet the required
  service levels.
• Enables individual consumers or organizations to reduce their backup
management overhead. It also enables the individual consumer/user to perform
backup and recovery anytime, from anywhere, using a network connection.
Consumers do not need to invest in capital equipment in order to implement and
manage their backup infrastructure. These infrastructure resources are rented
without obtaining ownership of the resources.
• Backups can be scheduled and infrastructure resources can be allocated with a
  metering service, which helps monitor and report resource consumption. Many
  organizations’ remote and branch offices have limited or no backup in place,
  and mobile workers represent a particular risk because of the increased
  possibility of lost or stolen devices.
• Ensures regular and automated backup of data. Cloud computing gives
consumers the flexibility to select a backup technology, based on their
requirement, and quickly move to a different technology when their backup
requirement changes.


Remote Backup Service


Consumers do not perform any backup at their local site. Instead, their data is
transferred over a network to a backup infrastructure, managed by the cloud
service provider.

To perform backup to the cloud, cloud backup agent software is typically installed
on the servers that need to be backed up. After installation, this software
establishes a connection between the server and the cloud where the data will be
stored.

The backup data transferred between the server and the cloud is typically
encrypted to make the data unreadable to an unauthorized person or system.

Deduplication can also be implemented to reduce the amount of data sent over the
network (bandwidth reduction) and to reduce the cost of backup storage.
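The deduplication step can be sketched as a minimal fixed-block model in Python. The function names and the 4 KB block size here are illustrative assumptions; production backup systems often use variable-size, content-defined chunking instead of fixed blocks.

```python
import hashlib

def dedupe_backup(data: bytes, chunk_store: dict, chunk_size: int = 4096):
    """Split data into fixed-size chunks; store only chunks not already
    present in chunk_store. Returns the list of chunk fingerprints (the
    'recipe') needed to reconstruct the data, plus bytes actually sent."""
    recipe, bytes_sent = [], 0
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        fp = hashlib.sha256(chunk).hexdigest()   # chunk fingerprint
        if fp not in chunk_store:                # new chunk: must be sent
            chunk_store[fp] = chunk
            bytes_sent += len(chunk)
        recipe.append(fp)
    return recipe, bytes_sent

def restore(recipe, chunk_store) -> bytes:
    """Reassemble the original data from its recipe of fingerprints."""
    return b"".join(chunk_store[fp] for fp in recipe)
```

A second backup of mostly unchanged data transfers only the chunks whose fingerprints are not already in the store, which is the bandwidth reduction described above.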


Cloud to Cloud Backup


It is important for organizations to protect their data regardless of where it resides.
When an organization uses SaaS-based applications, its data is stored at the cloud
service provider’s location. Typically, the service provider protects the data, but
some service providers may not provide the required level of data protection. This
poses challenges for the organization in recovering data in the event of data loss.
For example, the organization might want to recover a purged email from several
months or years ago for use as legal evidence, and the service provider might be
unable to help the organization recover the data.


Replication to Cloud
Cloud-based replication helps organizations mitigate the risk associated with
outages at the consumer’s production data center. Organizations of all sizes are
looking to make the cloud part of their business continuity strategy. Replicating
application data and VMs to the cloud enables an organization to restart
applications from the cloud and to restore data from any location.

Data and VMs replicated to the cloud are hardware-independent, which further
reduces recovery time.

Replication to the cloud can be performed using compute-based, network-based,
and storage-based replication techniques. Typically, when replication occurs, the
data is encrypted and compressed at the production environment to improve the
security of the data and reduce network bandwidth requirements.


Disaster Recovery as a Service


Facing an increased reliance on IT and the ever-present threat of natural or
man-made disasters, organizations need to rely on business continuity processes
to mitigate the impact of service disruptions.

Disaster Recovery as a Service (DRaaS) has emerged as a solution to strengthen
the portfolio of a cloud service provider while offering a viable DR solution to
consumer organizations. The cloud service provider assumes the responsibility for
providing resources that enable organizations to continue running their IT services
in the event of a disaster.

Having a DR site in the cloud reduces the need for data center space and IT
infrastructure, which leads to significant cost reduction, and eliminates the need for
upfront capital expenditure. Resources at the service provider can be dedicated to
the consumer or they can be shared. The service provider should design,
implement, and document a DRaaS solution specific to the customer’s
infrastructure.

They must conduct an initial recovery test with the consumer to validate complete
understanding of the requirements and documentation of the correct, expected
recovery procedures.

Replication of data occurs from the consumer’s production environment to the
service provider’s location over the network, as shown in the image. Typically,
when replication occurs, the data is encrypted and compressed at the production
environment to improve the security of the data and reduce network bandwidth
requirements.

During normal operating conditions, a DRaaS implementation may need only a
small share of resources to synchronize the application data and VM configurations
from the consumer’s site to the cloud. The full set of resources required to run the
application in the cloud is consumed only if a disaster occurs.


Disaster Recovery as a Service


In the event of a business disruption or disaster, the business operations will
failover to the provider’s infrastructure as shown in the image.

For applications or groups of applications that require restart in a specific order, a
sequence is worked out during the initial cloud setup for the consumer and
recorded in the disaster recovery plan.

Typically, VMs are allocated from a pool of compute resources located at the
provider’s location. Returning business operations back to the consumer’s
production environment is referred to as failback. This requires replicating the
updated data from the cloud repository back to the in-house production system
before resuming normal business operations at the consumer’s location.

After business operations restart on the consumer’s infrastructure, replication to
the cloud is re-established. To offer DRaaS, the service provider must have all the
necessary resources and technologies to meet the required service level.


Cloud-based Storage Tiering

Cloud-based storage tiering establishes a hierarchy of different storage types
(tiers), including cloud storage as one of the tiers.

It enables storing the right data on the right tier, based on service level
requirements, at minimal cost. Each tier has different levels of protection,
performance, and cost.

For example, high-performance solid-state drives (SSDs) can be configured as tier
1 storage to keep frequently accessed data, lower-cost HDDs as tier 2 storage to
keep less frequently accessed data, and the cloud as tier 3 storage to keep rarely
used data.

Tiering improves application performance. Data movement is based on defined
tiering policies, and the process of moving data from one tier to another is typically
automated. Cloud-based storage tiering provides flexible storage positioning and
the ability to increase or decrease capacity on demand.
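An automated tiering policy of the kind described above can be sketched as a simple age-based rule. The tier names and age thresholds below are illustrative assumptions, not defaults of any specific product:

```python
from datetime import datetime, timedelta

# Illustrative policy: thresholds and tier names are assumptions.
TIER_POLICY = [
    (timedelta(days=30),  "tier1-ssd"),    # accessed within 30 days
    (timedelta(days=180), "tier2-hdd"),    # accessed within 180 days
]
DEFAULT_TIER = "tier3-cloud"               # rarely used data

def select_tier(last_access: datetime, now: datetime) -> str:
    """Return the target tier for a file based on its last access time."""
    age = now - last_access
    for max_age, tier in TIER_POLICY:
        if age <= max_age:
            return tier
    return DEFAULT_TIER
```

A tiering engine would periodically evaluate this rule for each file and schedule moves for files whose current tier no longer matches the policy result.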


Cloud Gateway Appliance


A physical or virtual appliance that resides in the data center and presents file and
block-based storage interfaces to the applications.

• Service providers offer cloud-based object storage with interfaces such as
  Representational State Transfer (REST) or Simple Object Access Protocol
  (SOAP), but most business applications expect storage resources with a
  block-based interface, or file-based interfaces such as NFS or CIFS.

The gateway provides a translation layer between these standard interfaces and
the service provider’s REST API. It performs protocol conversion so that data can
be sent directly to cloud storage. To secure the data sent to the cloud, most
gateways automatically encrypt the data before it is sent. To speed up data
transmission (and to minimize cloud storage costs), most gateways support data
deduplication and compression.

Provides a local cache to reduce the latency associated with having the storage
capacity far away from the data center.


Why Big Data Analytics


The table shown on the page outlines four categories of common business
problems that organizations contend with where they have an opportunity to
leverage advanced analytics to create competitive advantage.

• In addition to doing standard reporting on these areas, organizations can apply
  advanced analytical techniques to optimize processes and derive more value
  from these common tasks.
• Many compliance and regulatory laws have been in existence for decades, but
  additional requirements are added every year, which represent additional
  complexities and data requirements for organizations. Laws related to
  anti-money laundering and fraud prevention require advanced analytical
  techniques to comply with and manage properly.


Big Data Analytics


The primary goal of Big Data analytics is to help organizations improve business
decisions by enabling data scientists and other users to analyze huge volumes of
transaction data, as well as other data sources that may be left untapped by
conventional business intelligence programs.

The technology layers in a Big Data analytics solution consist of storage,
MapReduce technologies, and query technologies. These components, collectively
called the “SMAQ stack,” are described below:

• The storage layer is characterized by a distributed architecture with primarily
  non-structured content in non-relational form. A storage system in the SMAQ
  stack is based on either a proprietary or open-source distributed file system; a
  common choice is the Hadoop Distributed File System (HDFS).
• The intermediate layer consists of MapReduce technologies that enable the
distribution of computation across multiple servers for parallel processing. It also
supports a batch-oriented processing model of data retrieval and computation
as opposed to the record-set orientation of most SQL-based databases.
• The top of the stack is the Query layer that typically implements a NoSQL
database for storing, retrieving, and processing data. It also provides a user-
friendly platform for analytics and reporting.
• SMAQ solutions may be implemented as a combination of multi-component
systems or offered as a product with a self-contained system comprising
storage, MapReduce, and query – all in one.
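To make the MapReduce layer concrete, here is a minimal single-process sketch of the map and reduce phases, using word counting as the canonical example. Real MapReduce frameworks such as Hadoop distribute these same phases across many servers; the function names here are illustrative:

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit (key, value) pairs — here, (word, 1) for each word."""
    for record in records:
        for word in record.split():
            yield (word, 1)

def reduce_phase(pairs):
    """Shuffle + reduce: group values by key, then aggregate each group."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(values) for key, values in groups.items()}

records = ["backup restore backup", "restore backup"]
counts = reduce_phase(map_phase(records))   # {'backup': 3, 'restore': 2}
```

The batch orientation mentioned above is visible here: the reduce phase runs over the complete set of mapped pairs rather than answering record-at-a-time queries as an SQL database would.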


Big Data Protection Challenges


• More data in the data center, from various data sources, requires tougher
  choices about what to protect and when.
• Over-running the backup window affects the performance and availability of
  systems, reducing users’ productivity. Recovery processes are time-consuming
  and unreliable, often failing to meet the organization’s recovery time objective
  (RTO) and recovery point objective (RPO).
• Data protection software should integrate seamlessly with the data repository
  (data lake). Protecting a big data environment requires new strategies for using
  existing tools, and adoption of new technologies that help protect the data more
  efficiently.


Data Lake - Repository for Big Data


• A data lake is the evolution of an Enterprise Data Warehouse (EDW) into an
active repository for structured, semi-structured, and unstructured data.
• The data lake is formed by the combination of Hadoop and NoSQL.
• Does not require an upfront schema which means it is much more flexible and
makes it easy to add new data sources and store them in their native format.
• Allows customers to easily add and leverage many other data sources in order
to make more holistic business decisions on their data.
• Is less structured compared to a data warehouse.
• Data is classified, organized, or analyzed only when it is accessed.
• Presents an unrefined view of data.
• By eliminating a number of parallel linear data flows, enterprises can
  consolidate vast amounts of their data into a single store (as shown in the
  image).


Data Mirroring and Parity Protection


• Typically, the data lake is created using scale-out NAS or object-based
  storage.
• With mirror data protection, when a file is written to the cluster, multiple copies
  of the file are stored on the cluster in different locations, which enhances fault
  tolerance.
  − For example, if the cluster is set up for 3X mirroring, the original file is
    stored along with two copies of the file in various locations within the cluster.
    Data mirroring requires a significant amount of additional capacity.
• Parity-based protection (erasure coding) is a method to protect striped data
  from disk drive failure or node failure without the cost of mirroring. This
  technique breaks the data written to the storage system into fragments,
  encodes them with parity data, and stores them across a set of different
  locations such as drives and nodes. This protection technique is represented as
  an N+M data protection model, where N represents the number of nodes and M
  represents the number of simultaneous failures of nodes or drives (or a
  combination of nodes and drives) that the cluster can withstand without
  incurring data loss.
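The simplest instance of this parity scheme is N+1 protection, where one parity fragment is computed as the XOR of the N data fragments, tolerating the loss of any single fragment. This is a teaching sketch only; real erasure-coded systems typically use Reed-Solomon codes to tolerate M > 1 failures:

```python
def encode_n_plus_1(fragments):
    """N+1 protection: append one parity fragment computed as the XOR of
    the N data fragments (all fragments must be the same length)."""
    parity = bytes(len(fragments[0]))
    for frag in fragments:
        parity = bytes(a ^ b for a, b in zip(parity, frag))
    return fragments + [parity]

def recover_lost(stored, lost_index):
    """Rebuild the fragment at lost_index by XOR-ing all survivors —
    works whether the lost fragment was data or parity."""
    survivors = [f for i, f in enumerate(stored) if i != lost_index]
    rebuilt = bytes(len(survivors[0]))
    for frag in survivors:
        rebuilt = bytes(a ^ b for a, b in zip(rebuilt, frag))
    return rebuilt
```

Because XOR is its own inverse, XOR-ing the surviving fragments cancels out every fragment except the missing one, which is why a single lost drive or node can be rebuilt without a full mirror copy.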


Mobile Device Overview


• Organizations are increasingly providing their workforce with ubiquitous access
to information and business applications over mobile devices.
• Organizations are also increasingly exploring the option of Bring Your Own
Device (BYOD), whereby employees are allowed to use non-company devices,
such as laptops and tablets as business machines.
• BYOD enables employees to have access to applications and information from
their personal devices while on the move. This enables the employees to stay
informed and carry out business operations, irrespective of their location.


Key Challenges in Protecting Mobile Device Data


• Potential loss of corporate data if the device is lost or stolen.
• The device must be online for backup to occur.
• Backing up or replicating data from mobile devices to a corporate data center
  or to the cloud can be challenging due to intermittent (and sometimes poor)
  connectivity.
• Devices are not always connected to a corporate network, so the data is
  copied over the Internet, which may give rise to security threats.
• Protecting data from mobile devices to the corporate data center or to the cloud
  requires significant bandwidth to transfer the data.
• Smartphone and tablet operating systems have built-in security features that
  limit access to the data stored on the device.
• Some mobile devices, particularly tablets and smartphones, may not allow
  traditional backup applications to access data.


Mobile Device Backup


• A Mobile Device Management (MDM) solution is used by an IT department to
  monitor, manage, protect (back up), and secure (remote password locks, full
  data wipes) employees’ mobile devices deployed across multiple mobile
  service providers and multiple mobile operating systems used in the
  organization.
• The Gartner research firm defines mobile device management as "a range of
products and services that enables organizations to deploy and support
corporate applications to mobile devices, such as smartphones and tablets,
possibly for personal use — enforcing policies and maintaining the desired level
of IT control across multiple platforms”.
• MDM software also reduces the overhead on IT administration associated with
deploying and updating applications on mobile devices.


File Sync-and-Share Application


• Because storage capacity is limited on mobile devices, many users store data
  remotely rather than on the device itself. Storing data remotely is also the best
  way to share a user’s data across all devices, such as desktops, laptops,
  tablets, and smartphones.
• As shown in the image, the key components of file sync-and-share environment
include file sync-and-share client (agent) that runs on mobile devices, enterprise
file sync-and-share application that runs on a server, and storage that stores
data (file/object).
• Any data a user creates or modifies on the mobile device is automatically
synchronized with the server. This software typically synchronizes a dedicated
folder(s) on mobile devices with folders created in the server. This creates a
secondary copy of a file in another location.
• The files are backed up from the remote storage instead of the mobile devices.
File sync-and-share also improves employee productivity by allowing the users
to access data from any device, anywhere, at any time.


Cloud-based Mobile Device Data Protection


• Cloud-based backup copies the data over the Internet to a shared storage
  infrastructure in a cloud maintained by a service provider. It is one of the key
  mechanisms for protecting mobile device data.
• These solutions typically use a backup client application (agent) installed on the
  device to access and back up data to the cloud. The agent scans the mobile
  device for newly created or modified blocks and backs up only these changed
  blocks to cloud storage, which considerably saves network bandwidth.
• Some mobile applications have a built-in feature that automatically backs up
  the application data to the cloud. File sync-and-share applications also
  synchronize data between the mobile device and cloud storage.
• In a mobile cloud computing environment, if an application runs in a cloud, the
application data is usually stored in the cloud. This data is backed up by the
service provider based on the SLAs. If the data on the mobile device is lost, the
data can be recovered from the cloud. Most of the cloud backup solutions
available today offer a self-service portal that allows users to recover data
without manual intervention.
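The changed-block scanning described above can be sketched by comparing per-block fingerprints against those recorded at the last backup. This is a simplified fixed-block model; the function names and block size are illustrative assumptions:

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative fixed block size

def fingerprint_blocks(data: bytes):
    """Return a SHA-256 fingerprint for each fixed-size block of data."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]

def changed_blocks(data: bytes, last_fingerprints):
    """Return the (index, block) pairs whose fingerprint differs from the
    previous backup — only these blocks need to be uploaded — along with
    the current fingerprints to record for the next backup run."""
    current = fingerprint_blocks(data)
    changed = []
    for i, fp in enumerate(current):
        if i >= len(last_fingerprints) or last_fingerprints[i] != fp:
            changed.append((i, data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]))
    return changed, current
```

On the first run all blocks are "changed" (a full backup); on subsequent runs only modified blocks cross the network, which is what makes backup over intermittent mobile connectivity practical.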


Steps of Risk Management


• Step 1: Risk identification points to the various sources of threat that give
  rise to risk.
  − After risks in a data protection environment are identified, these risks and
    their sources need to be classified into meaningful severity levels.
    o For example, suppose an organization performs remote replication over
      an unsecured network. The risk identification step points to the sources
      of threat that give rise to risk in this replication environment.
• Step 2: Risk assessment determines the extent of potential threat and the risk
associated with data protection resources.
− The output of this process helps organizations to identify appropriate
controls for reducing or eliminating risk during the risk mitigation process.
− All the assets at risk (data, data source, protection application and storage,
and management applications) must be carefully evaluated to assess their
criticality to the business.
− After the risks are assessed, the critical assets should be associated with
potential risks.
o For example, a company’s Intellectual Property records can be identified
as critical assets, and the disclosure of these records can be a risk of
high severity level.
• Step 3: Risk mitigation involves planning and deploying various security
controls (such as those discussed in Security Controls in Data Protection
Environment lessons) that can either mitigate the risks or minimize the impact of
the risks.
• Step 4: Monitoring involves continuous observation of existing risks and
  security controls to ensure the risks are mitigated.
  − Monitoring can be performed using inputs from security controls deployed in
    a data protection environment, such as identity and access management,
    firewalls, IDPS, and malware protection software.
    o Controls typically have alerts configured to indicate any observed
      malicious activity or security breach.
    o Monitoring also observes new risks that may arise; if a new risk is
      identified, the entire process is repeated.


Compliance
• Internal policy compliance controls the nature of IT operations within an
organization. This requires clear assessment of the potential difficulties in
maintaining the compliance and processes to ensure that this is effectively
achieved.
• External policy compliance includes legal requirements, legislation, and
industry regulations. These external compliance policies control the nature of IT
operations related to the flow of data out of an organization.
− They may differ, based upon the type of information (for example, source
code versus employee records), and business (for example, medical
services versus financial services).
• Compliance management ensures that an organization adheres to relevant
policies and legal requirements. Policies and regulations can be based on
configuration best practices and security rules.
− These include administrator roles and responsibilities, physical infrastructure
maintenance timelines, information backup schedules, and change control
processes.
• Ensuring CIA (confidentiality, integrity, and availability) and GRC (governance,
  risk, and compliance) are the primary objectives of any IT security
  implementation.
  − These are supported through the use of authentication, authorization, and
    auditing processes.

Appendix

Threats to Data Source


• Data centers deploy hypervisors to provide a multi-tenant106 environment that
enables the sharing of resources.
• Failure of these isolation mechanisms may expose one user’s data to other
users, raising security risks.
− Compromising a hypervisor is a serious event because it exposes the entire
environment to potential attacks.
• Hyperjacking is an example of this type of attack in which the attacker installs a
rogue hypervisor that takes control of the compute system.
− The attacker now can use this hypervisor to run unauthorized virtual
machines in the environment.
− Detecting this attack is difficult and involves examining components such as
program memory and processor core registers for anomalies.
• Many organizations allow their employees to access some of the applications
through mobile devices.
− This enables employees to access applications and data from any
location. However, mobile device theft may increase the risk of exposing
data to an attacker.
• Some of the control mechanisms that may reduce the risk of these threats
include:
− Strong authentication and authorization
− Installing security updates for operating systems and hypervisors
− Mobile device management and encryption

106
Multi-tenancy is achieved by using mechanisms that provide separation of
computing resources such as memory and storage for each user.

Appendix

Control Mechanisms for Protection Storage


• Some of the control mechanisms that can reduce the risks due to these threats
include:
− Always encrypt the data on the protection storage.
− Shred data that is no longer required.
− Use strong physical security controls such as CCTV cameras, 24x7
on-premises security guards, alarms, and badge IDs.

Appendix

Threats to Management Applications


• The management component of the data protection architecture interacts with
other components to exchange data, command, and status. These interactions
occur with the help of Application programming interfaces (APIs). APIs are used
extensively in today’s data centers to:
− Perform activities such as resource provisioning, configuration, monitoring,
management, and orchestration.
− Secure the data protection environment and the APIs.
• An attacker may exploit vulnerability in an API to breach an organization’s
infrastructure perimeter and carry out an attack.
• To provide protection against both accidental and malicious attempts, an API
must be designed and developed by following security best practices such as:
− Requiring authentication and authorization
− Avoiding buffer overflows
− Restricting access to the APIs to authorized users

Appendix

Introduction to Security Controls


• Security controls can be technical or non-technical.
− Technical controls are usually implemented at compute107, network108, and
storage109 level.
− Non-technical controls are implemented through administrative and physical
controls.
• A data protection environment also requires identity and access management,
role-based access control, and physical security arrangements.
• Security controls are categorized as preventive, detective, and corrective.
− Preventive: Avoid problems before they occur.
− Detective: Detect a problem that has occurred.
− Corrective: Correct a problem that has occurred.

107At the compute system level, security mechanisms are deployed to secure
hypervisors and hypervisor management systems, virtual machines, guest
operating systems, and applications.

108Security controls at the network level commonly include firewalls,
demilitarized zones, intrusion detection systems, virtual private networks,
zoning, iSNS discovery domains, VLANs, and VSANs.

109At the storage level, security mechanisms include LUN masking, data
shredding, and data encryption.

Appendix

Identity and Access Management Example


• A user tries to gain access to the IT resources. While doing so:

o The IAM controls prompt for the user’s credentials. Depending on the type of
IAM control deployed in the environment, the user provides the necessary
credentials.
o Credentials are then verified against a system that has the ability to
authenticate and authorize the user.
o Upon successfully verifying the credentials, the authorized user is granted
access to the IT resources.
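The flow above can be sketched in a few lines of Python. This is an illustrative sketch only; the user store, role names, and resource names below are invented and do not come from any specific IAM product:

```python
# Hypothetical identity store and resource policy for the sketch.
USERS = {"alice": {"password": "s3cret", "roles": {"backup_admin"}}}
RESOURCE_ROLES = {"backup_server": {"backup_admin"}}

def authenticate(username, password):
    """Verify the supplied credentials against the identity store."""
    user = USERS.get(username)
    return user is not None and user["password"] == password

def authorize(username, resource):
    """Grant access only if the user holds a role allowed on the resource."""
    allowed = RESOURCE_ROLES.get(resource, set())
    return bool(USERS.get(username, {}).get("roles", set()) & allowed)

def access(username, password, resource):
    """Authenticate first, then authorize, then grant access."""
    if not authenticate(username, password):
        return "access denied: bad credentials"
    if not authorize(username, resource):
        return "access denied: not authorized"
    return "access granted"
```

Note that authentication (who you are) and authorization (what you may do) are separate checks, matching the two verification steps in the flow above.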

Appendix

Firewall-Demilitarized Zone
• A demilitarized zone is a control to secure internal assets while allowing
Internet-based access to selected resources.
• In a demilitarized zone environment, servers that need Internet access are
placed between two sets of firewalls.
• Servers in the demilitarized zone may or may not be allowed to
communicate with internal resources.
• Application-specific ports, such as those designated for HTTP or FTP traffic,
are allowed through the firewall to the demilitarized zone servers.
• No Internet-based traffic is allowed to go through the second set of firewalls
and gain access to the internal network.
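As a rough illustration, the two-firewall policy above can be modeled as two rule functions. The zone names and allowed ports are assumptions of the sketch, not a recommended rule set:

```python
# Ports permitted from the Internet into the DMZ (HTTP and FTP control).
DMZ_ALLOWED_PORTS = {80, 21}

def outer_firewall(src_zone, dst_zone, dst_port):
    """First firewall: Internet traffic may reach DMZ servers only on
    approved application-specific ports."""
    if src_zone == "internet" and dst_zone == "dmz":
        return dst_port in DMZ_ALLOWED_PORTS
    return False

def inner_firewall(src_zone, dst_zone, dst_port):
    """Second firewall: no Internet-originated traffic ever passes to the
    internal network; internal-to-internal traffic is allowed here."""
    if src_zone == "internet":
        return False
    return src_zone == "internal"
```

Whether DMZ servers may initiate connections to internal resources is a policy choice; this sketch denies it by default.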

Appendix

Virtual Private Network


• A virtual private network:
− Extends an organization’s private network across a public network such as
the Internet.
− Establishes a point-to-point connection between two networks over which
encrypted data is transferred.
− Enables organizations to apply the same security and management policies
to data transferred over the VPN connection as are applied to data
transferred over the organization’s internal network.
o The user is authenticated before the security and management policies
are applied.
• The remote access VPN connection method can be used by administrators to
establish a secure connection to the data center and carry out multiple
management operations.
• A typical usage scenario for the site-to-site VPN connection method is
deploying remote replication or connecting to the cloud.

Appendix

VLAN Example
• Consider the example with three VLANs: VLAN 10, VLAN 20, and VLAN 30.
− VLAN 10 includes Compute System A, Compute System B, and Storage
System A.
− VLAN 20 includes Compute System C, Compute System D, and Storage
System B.
− VLAN 30 includes Compute System E, Compute System F, and Storage
System C.
− VLAN 10 allows only Compute System A, Compute System B, and Storage
System A to communicate with each other.
o Any traffic from other VLANs to VLAN 10 has to pass through the IP
router.
o This isolation provides enhanced security even though the traffic of
multiple VLANs traverses the same physical switch.
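The isolation rule in this example can be sketched as a simple membership check (system names are shortened for brevity):

```python
# VLAN membership from the example above.
VLAN_MEMBERS = {
    10: {"Compute A", "Compute B", "Storage A"},
    20: {"Compute C", "Compute D", "Storage B"},
    30: {"Compute E", "Compute F", "Storage C"},
}

def vlan_of(node):
    """Return the VLAN ID a node belongs to, or None if unknown."""
    for vlan_id, members in VLAN_MEMBERS.items():
        if node in members:
            return vlan_id
    return None

def can_communicate_directly(a, b):
    """Nodes communicate directly only within one VLAN; traffic between
    VLANs must pass through the IP router."""
    return vlan_of(a) is not None and vlan_of(a) == vlan_of(b)
```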

Appendix

VSAN Example
• Consider the example with two VSANs: VSAN 10 and VSAN 20.
− VSAN 10 includes Compute System A and Storage System A.
− VSAN 20 includes Compute System B and Storage System B.
− VSAN 10 allows only Compute System A and Storage System A to
communicate with each other.
o Any traffic from VSAN 20 to VSAN 10 will be blocked.
o This isolation provides enhanced security even though the traffic of
multiple VSANs traverses the same physical switch.

Appendix

Types of Zoning
• WWN zoning: It uses World Wide Names to define zones. The zone members
are the unique WWN addresses of the FC HBA and its targets (storage
systems).
− A major advantage of WWN zoning is its flexibility. If an administrator moves
a node to another SAN switch port, the node will maintain connectivity to its
zone partners without modifying the zone configuration. This is possible
because WWN is static to the node port.
− WWN zoning runs the risk of WWN spoofing, which could enable a host to
gain access to resources belonging to another host. Switches protect
against this by verifying that the WWN and FCID of the host match.
• Port zoning: It uses the switch port ID to define zones. In port zoning, access
to the node is determined by the physical switch port to which a node is
connected.
− The zone members are the port identifiers (switch domain ID and port
number) to which FC HBA and its targets (storage systems) are connected.
− If a node is moved to another switch port in the SAN, port zoning must be
modified to allow the node in its new port to participate in its original zone.
− If an FC HBA or a storage system port fails, an administrator just needs to
replace the failed device without changing the zoning configuration.
• Mixed zoning: It combines the qualities of both WWN zoning and port zoning.
Using mixed zoning enables a specific node port to be tied to the WWN of
another node.
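The difference between the two zone definitions can be sketched as follows; the WWNs and port identifiers are invented for illustration:

```python
# A WWN zone holds node addresses, which follow the node wherever it is cabled.
wwn_zone = {"10:00:00:00:c9:aa:bb:01",   # host HBA WWN
            "50:06:01:60:3b:cc:dd:02"}   # storage system port WWN

# A port zone holds (switch domain ID, port number) pairs, tied to the
# physical switch ports rather than to the nodes themselves.
port_zone = {("domain1", 5),    # switch port of the FC HBA
             ("domain1", 12)}   # switch port of the storage system

def wwn_zone_allows(initiator_wwn, target_wwn):
    """WWN zoning survives recabling: membership follows the node WWN."""
    return {initiator_wwn, target_wwn} <= wwn_zone

def port_zone_allows(initiator_port, target_port):
    """Port zoning is tied to physical ports; moving a node to another
    switch port requires modifying the zone."""
    return {initiator_port, target_port} <= port_zone
```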

Appendix

Securing IT Infrastructure Components


• Hypervisors may be compromised by hyperjacking or other forms of attack.
− A management server may be compromised by exploiting vulnerabilities in
the management software or by an insecure configuration.
o For example, an administrator may have configured a non-secured or
non-encrypted remote access mechanism.
− A malicious attacker may take control of the management server by
exploiting a security loophole in the system.
o This enables the attacker to perform unauthorized activities such as
controlling all the existing VMs, creating new VMs, deleting VMs, and
modifying VM resources.
• Hypervisor updates should be installed when they are released by the
hypervisor vendor.
− Hypervisor hardening should be performed using specifications provided by
organizations such as the Center for Internet Security (CIS) and Defense
Information Systems Agency (DISA).
− Access to the management server should be restricted to authorized
administrators. Access to core levels of functionality should be restricted to
selected administrators.
− Network traffic should be encrypted when management is performed
remotely. A separate firewall with strong filtering rules installed between the
management system and the rest of the network can enhance security.
• Virtual machines store troubleshooting information in a log file that is stored on
the storage presented to a hypervisor.
− An attacker may cause a virtual machine to abuse the logging function,
causing the size of the log file to grow rapidly.
o The log file can consume all the capacity of the storage presented to the
hypervisor, effectively causing a denial of service. This can be prevented
by configuring the hypervisor to rotate or delete log files when they reach
a certain size.
− An administrator can configure settings to:


o Limit the maximum size of the log file. When this size is reached, the
hypervisor makes an archive copy of the log file and starts storing
information in a new log file.
o Maintain a specific number of old log files. When the configured limit is
reached, the hypervisor automatically deletes the oldest file.
Learn more about VM hardening, OS hardening, and application hardening.
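The rotation settings described above can be sketched as follows. This is a generic illustration, not any particular hypervisor's implementation; the size and file-count limits are arbitrary:

```python
import os

MAX_LOG_SIZE = 1024 * 1024   # archive the log once it reaches this size
MAX_OLD_LOGS = 5             # keep at most this many archived log files

def rotate_if_needed(log_path):
    """Archive the log when it exceeds the size limit, keeping a bounded
    number of old copies so logging cannot fill the storage."""
    if os.path.getsize(log_path) < MAX_LOG_SIZE:
        return
    # Delete the oldest archive if it would exceed the retention limit.
    oldest = f"{log_path}.{MAX_OLD_LOGS}"
    if os.path.exists(oldest):
        os.remove(oldest)
    # Shift remaining archives: log.4 -> log.5, ..., log.1 -> log.2.
    for i in range(MAX_OLD_LOGS - 1, 0, -1):
        src = f"{log_path}.{i}"
        if os.path.exists(src):
            os.replace(src, f"{log_path}.{i + 1}")
    # Archive the current log and start a fresh, empty one.
    os.replace(log_path, f"{log_path}.1")
    open(log_path, "w").close()
```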

Appendix

Malware Protection Software


• Malware protection software uses various techniques to detect malware.
− One of the most common techniques used is signature-based detection.
o In this technique, the malware protection software scans the files to
identify a malware signature.
o A signature is a specific bit pattern in a file.
o Signatures are cataloged by malware protection software vendors and
are made available to users as updates.
o The software must be configured to regularly update these signatures to
provide protection against new malware programs.
− Another technique, called heuristics, can be used to detect malware by
examining suspicious characteristics of files.
o For example, malware protection software may scan a file to determine
the presence of rare instructions or code.
• Malware protection software can also identify malware by examining the
behavior of programs.
− For example, it may observe program execution to identify inappropriate
behavior such as keystroke capture.
• It can also be used to protect the operating system against attacks.
− A common type of attack on operating systems involves modifying
sensitive areas, such as registry keys or configuration files, with the
intention of causing an application to function incorrectly or fail.
o This can be prevented by disallowing unauthorized modification of
sensitive areas through operating system configuration settings or via
malware protection software.
o When a modification is attempted, the operating system or the malware
protection software challenges the administrator for authorization.
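Signature-based detection can be illustrated with a toy scanner that searches file contents for known bit patterns. The signature catalog below is invented (apart from a fragment of the well-known EICAR test string) and vastly simplified compared to real malware protection software:

```python
# Hypothetical signature catalog: name -> specific bit pattern.
SIGNATURES = {
    "eicar-test": b"EICAR-STANDARD-ANTIVIRUS-TEST-FILE",
    "demo-worm": b"\xde\xad\xbe\xef",  # invented pattern for illustration
}

def scan_bytes(data):
    """Return the names of any cataloged signatures found in the data."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]

def scan_file(path):
    """Scan a file's contents for known malware signatures."""
    with open(path, "rb") as f:
        return scan_bytes(f.read())
```

This is why regular signature updates matter: the scanner can only recognize patterns already present in its catalog.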

Appendix

Mobile Device Management


• To enroll the device, an MDM client is installed on the mobile device.
− The client component is used to connect to the server component to receive
administration and management commands.
− To connect to the server component, the user is required to provide MDM
authentication server details and user credentials.
− Typically, the authentication server is placed in a DMZ. These credentials
are authenticated by the MDM authentication server.
o Devices that are successfully authenticated are redirected to the MDM
server.
o Now the authenticated mobile devices are enrolled and can be managed.
Further, these mobile devices can be granted access to the applications
and other resources.
• An MDM solution enables organizations to enforce their security policies
on users’ mobile devices.
− The solution also gives organizations administrative and management
control over users’ mobile devices.
− With this control, the organization has the ability to remotely wipe the
data on enrolled devices or brick a device when a threat is detected.

Appendix

Data Encryption
• Data should be encrypted as close to its origin as possible. Data encryption:
− Can be used for encrypting data at the point of entry into the storage
network.
− Can be implemented on the fabric to encrypt data between the compute
system and the storage media. These controls can protect both the
data-at-rest on the destination device and the data-in-transit.
− Can be deployed at the storage level, which can encrypt data-at-rest.
• Another way to encrypt network traffic is by using cryptographic protocols such
as Transport Layer Security (TLS), the successor to Secure Sockets Layer
(SSL).
− These are application layer protocols that provide an encrypted connection
for client-server communication.
− These protocols are designed to prevent eavesdropping on and tampering
with data on the connection over which it is being transmitted.
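As an illustration, Python's standard library can open a TLS-protected client connection as sketched below; the host name is a placeholder:

```python
import socket
import ssl

def fetch_tls_version(host, port=443):
    """Open a TLS connection to host:port and report the negotiated
    protocol version. Certificate verification is enabled by default."""
    context = ssl.create_default_context()  # verifies the server certificate
    with socket.create_connection((host, port)) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            # All application data sent on tls_sock is encrypted in transit,
            # preventing eavesdropping and tampering.
            return tls_sock.version()

# Example (placeholder host; requires network access):
# fetch_tls_version("example.com")
```

The default context both encrypts the connection and authenticates the server, which is what prevents man-in-the-middle tampering.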

Appendix

Types of Attacks
Denial of Service

Denial of service attacks attempt to bring systems to a halt.
These attacks overwhelm the resources of the system with
excessive requests that consume all the resources. A distributed
denial of service attack is launched from many host machines at once.

The purpose of a denial of service attack is to bring down a system in order to
initiate another attack, or simply to harm the system on behalf of a business
competitor.

Digital Currency Mining

Digital currency relies on blockchain, which requires distributed
computing power to mine and process operations. The systems
involved in mining receive a commission for facilitating the
transaction. While digital mining is a legitimate operation,
hackers can use the compute resources of many victims to mine
cryptocurrencies without their authorization.

Spam

Unsolicited bulk messages sent through email, instant messaging,
or other digital communication assets are known as spam. While
spam might be a common practice for marketing, it can be used to
trick victims into providing sensitive information that can be used
later to perpetrate a crime.

Adware

Adware is a form of greyware: potentially unwanted programs that are
not viruses or malicious software but contain problematic code or
hidden intentions. Adware collects information about a user for
the purpose of advertising.

These programs on a computer are usually referred to as
adware, while programs on a mobile device are referred to as
madware. Adware has the potential of slowing down a system and can work with
spyware.

Malicious Web Scripts

Malicious web scripts can be embedded in existing legitimate websites or in
websites that are redirected from legitimate websites. Malicious
web scripts are scripts that, when run, can detect and exploit
vulnerabilities in the systems of visitors to the website.

Whether they are a redirect or embedded in the legitimate
website, customers feel safe because they are visiting a known
source.

Business Email Compromise

Business email compromise is a phishing attempt that relies on
deception. There are several forms of this scam, but the common
trait is that scammers target employees. If their interests are
financial, attackers trick employees into transferring funds to bank
accounts. Employees believe that these bank accounts belong to
their trusted partners.

Attackers can also be interested in proprietary information or trade secrets. After
gaining their victim's trust, they can obtain private company information that
should not be public.

These attacks can be perpetrated through email spoofing, social engineering,
identity theft, and malware, among others.

Banking Trojan

A banking trojan tricks users into downloading a “harmless” file
that becomes malware that identifies a user’s banking
information. This attack is very profitable because it gains access
to bank accounts and can transfer funds from them. This malware can
target businesses or individuals and is also perpetrated through
social engineering, phishing and spam emails, exploit kits, and so
on.

Ransomware

Ransomware is also a form of malware, different from adware; it
is malicious software that encrypts the entire hard drive of the
computer, locking a user out of the system. Alternatively, it can
be crypto ransomware, which encrypts specific files, most
commonly documents and images, on the systems.

When a system is infected with ransomware, it asks the user to pay a fee
to unlock and reclaim the data, or else the data is lost or made public.

Ransomware is normally distributed through phishing emails or exploit kits. It is
more common than other categories of cybercrime because it requires
significantly less effort for a greater gain.

Appendix

In this example, a backup environment includes three physical compute systems
(H1, H2, and H3) that host backup clients (VMs). Two SAN switches (SW1 and
SW2) connect the compute systems to a storage node and the storage node to the
backup storage system. Multipathing software is installed on the hypervisor running
on all three compute systems. If one of the switches, SW1, fails, the multipathing
software initiates a path failover, and all the backup clients continue to send
backup data through the other switch, SW2. However, due to the absence of a
redundant switch, a second switch failure could result in failure of the backup
operation. Monitoring for availability enables detecting the switch failure and helps
the administrator take corrective action before another failure occurs.
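The failover decision in this example can be sketched as a simple path-selection function (the path names follow the example; the state model is invented for illustration):

```python
def select_path(paths):
    """Return any healthy path for backup traffic, or None if all paths
    are down (at which point monitoring should already have alerted)."""
    for name, state in paths.items():
        if state == "healthy":
            return name
    return None

# Paths through the two SAN switches of the example; SW1 has failed.
paths = {"via-SW1": "failed", "via-SW2": "healthy"}
```

After SW1 fails, `select_path(paths)` yields the path through SW2; once that state is reached, a single further failure leaves no path at all, which is why monitoring the first failure matters.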

This example illustrates the importance of monitoring the capacity of a storage pool
in a NAS system. Monitoring tools can be configured to issue a notification when
thresholds are reached on the storage pool capacity. For example, notifications are
issued when the pool capacity reaches 66 percent and 80 percent so that the
administrator can take the right action. Proactively monitoring the storage pool can
prevent service outages caused due to lack of space in the storage pool.
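The threshold notifications in this example can be sketched as:

```python
# Notification thresholds (percent) from the example above.
THRESHOLDS = (66, 80)

def capacity_alerts(used_gb, total_gb):
    """Return the thresholds that current pool usage has crossed, so a
    notification can be issued for each."""
    pct = 100 * used_gb / total_gb
    return [t for t in THRESHOLDS if pct >= t]
```

For example, a pool at 70% usage triggers only the 66% notification, while one at 85% triggers both, giving the administrator escalating warnings before space runs out.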

This example shows a backup environment that includes three physical compute
systems—H1, H2, and H3—that host backup clients (VMs). Two SAN switches
(SW1 and SW2) connect the compute systems to a storage node and the storage
node to the backup storage system. A new compute system running backup clients
with a high workload must be deployed. The backup data from the new compute
system must be ingested through the same backup storage system port as H1, H2,
and H3. Monitoring backup storage system port utilization ensures that the new
compute system does not adversely affect the performance of the backup clients
running on other compute systems.

Here, utilization of the shared backup storage system port is shown by the solid
and dotted lines in the graph. If the port utilization prior to deploying the new
compute system is close to 100 percent, then deploying the new compute system is
not recommended because it might impact the performance of the backup clients
running on other compute systems. However, if the utilization of the port prior to
deploying the new compute system is closer to the dotted line, then there is room to
add a new compute system.

IT organizations typically comply with various data security policies that may be
specific to government regulations, organizational rules, or deployed services.
Monitoring detects all protection operations and data migration that deviate from
predefined security policies. Monitoring also detects unavailability of data and
services to authorized users due to a security breach. Further, the physical security
of a data center can also be continuously monitored using badge readers, biometric
scans, or video cameras. This example illustrates the importance of monitoring
security in a storage system.

In this example, the storage system is shared between two workgroups, WG1 and
WG2. The data of WG1 should not be accessible by WG2 and vice versa. A user
from WG1 might try to make a local replica of the data that belongs to WG2. If this
action is not monitored or recorded, it is difficult to track such a violation of security
protocols. Conversely, if this action is monitored, a notification can be sent to
prompt a corrective action or at least enable discovery as part of regular auditing
operations.

Examples of CI attributes are the CI’s name, manufacturer name, serial number,
license status, version, description of modification, location, and inventory status
(for example, on order, available, allocated, or retired). The inter-relationships
among CIs in a data protection environment commonly include service-to-user,
virtual storage pool-to-service, virtual storage system-to-virtual storage pool,
physical storage system-to-virtual storage system, and data center-to-geographic
location.

All information about CIs is usually collected and stored by the discovery tools in a
single database or in multiple autonomous databases mapped into a federated
database called a configuration management system (CMS). Discovery tools also
update the CMS when new CIs are deployed or when attributes of CIs change.
CMS provides a consolidated view of CI attributes and relationships, which is used
by other management processes for their operations. For example, CMS helps the
security management process to examine the deployment of a security patch on
VMs, the problem management to resolve a remote replication issue, or the
capacity management to identify the CIs affected on expansion of a virtual storage
pool.
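A minimal sketch of CIs and their relationships in a CMS might look like the following; the class fields follow the attributes listed above, while the names and the relationship encoding are assumptions of the sketch:

```python
from dataclasses import dataclass, field

@dataclass
class ConfigurationItem:
    name: str
    manufacturer: str = ""
    serial_number: str = ""
    inventory_status: str = "available"  # on order, available, allocated, retired
    related_to: list = field(default_factory=list)  # names of CIs this CI depends on

# The CMS: a consolidated view of CIs, populated by discovery tools.
cms = {}

def register_ci(ci):
    """Discovery tools add or update CIs in the CMS like this."""
    cms[ci.name] = ci

def impacted_by(name):
    """Walk the relationship graph to find all CIs affected by a change
    to the named CI (e.g., expansion of a virtual storage pool)."""
    impacted, stack = set(), [name]
    while stack:
        current = stack.pop()
        for ci in cms.values():
            if current in ci.related_to and ci.name not in impacted:
                impacted.add(ci.name)
                stack.append(ci.name)
    return impacted
```

Other management processes would query such a view, for example to identify the CIs affected by expanding a virtual storage pool.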

Change management typically uses an orchestrated approval process that helps
make decisions on changes in an agile manner. Through an orchestration
workflow, change management receives and processes the requests for
changes. Changes that are low risk, routine, and compliant with predefined change
policies go through the change management process only once, to determine that
they can be exempted from change management review thereafter. After that,
these requests are typically treated as service requests and approved
automatically. All other changes are presented for review to the change
management team. The change management team assesses the potential risks of
the changes, prioritizes them, and makes a decision on the requested changes.
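The triage described above can be sketched as a single decision function; the change types and risk levels are invented for illustration:

```python
# Change types already reviewed once and exempted from further review.
PRE_APPROVED_TYPES = {"routine-patch", "log-rotation"}

def process_change_request(change_type, risk, policy_compliant):
    """Low-risk, routine, policy-compliant changes are treated as service
    requests and approved automatically; everything else is routed to the
    change management team for risk assessment and prioritization."""
    if change_type in PRE_APPROVED_TYPES and risk == "low" and policy_compliant:
        return "auto-approved"
    return "manual review"
```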

Capacity management ensures adequate availability of IT resources to provide
data protection services and meet the SLA requirements. It determines the optimal
amount of resources required to meet the needs of protection operations and
services regardless of dynamic resource consumption and seasonal spikes in
resource demand. It also maximizes the utilization of available capacity and
minimizes spare and stranded capacity without compromising the service levels.
The capacity management team uses several methods to maximize the utilization
of capacity, such as data deduplication, compression, and storage tiering.

Capacity management tools are usually capable of gathering historical information
on the usage of backup/archiving servers and protection storage over a period of
time. In addition, they establish trends on capacity consumption and perform
predictive analysis of future demand. This analysis serves as input to the capacity
planning activities and enables the procurement and provisioning of additional
capacity in the most cost-effective and least disruptive manner.
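Trend-based prediction of future demand can be sketched with a least-squares line fitted over historical usage samples; the figures in the usage note are invented:

```python
def predict_usage(history, periods_ahead):
    """Fit a least-squares line y = a + b*t to evenly spaced usage samples
    and extrapolate the expected usage `periods_ahead` periods past the
    last sample."""
    n = len(history)
    ts = list(range(n))
    mean_t = sum(ts) / n
    mean_y = sum(history) / n
    b = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, history)) \
        / sum((t - mean_t) ** 2 for t in ts)
    a = mean_y - b * mean_t
    return a + b * (n - 1 + periods_ahead)
```

For instance, monthly protection-storage usage of 10, 12, 14, and 16 TB grows by 2 TB per month on trend, so two months ahead the projected demand is 20 TB, which feeds directly into capacity planning.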

Availability management is responsible for establishing a proper guideline based
on the defined availability levels of data protection operations and services. The
guideline includes the procedures and technical features required to meet or
exceed both the current and the future data availability needs at a justifiable cost.
Availability management also identifies all availability-related issues in a data
protection environment and areas where availability must be improved. The
availability management team proactively monitors whether the availability of
protection components and services is maintained within acceptable and agreed
levels.

The monitoring tools also help the administrators to identify the gap between the
required availability and the achieved availability. With this information, the
administrators can quickly identify errors or faults in the components that may
cause data unavailability in future. Based on the data availability requirements and
areas found for improvement, the availability management team may propose and
architect new data protection and availability solutions or changes in the existing
solutions.

For example, the availability management team may propose an NDMP backup
solution to support a data protection service or any critical business function that
requires high availability. The team may propose both component-level and
site-level redundancy. This is generally accomplished by deploying two or more
network adapters per backup component, multipathing software, and compute
clustering. The backup components must be connected to each other using
redundant switches and/or networks. The switches must have built-in redundancy
and hot-swappable components. The VMs hosting backup applications must be
protected from hardware failure or unavailability through VM live shadow copy
mechanisms. The backup storage system should also have built-in redundancy for
various components and should support local and remote backup.

The example shown illustrates the resolution of a problem that impacts the
performance of a synchronous replication over a SAN recurrently. The problem is
detected by an integrated incident and problem management tool deployed in the
data protection environment. The problem is recognized by correlating multiple
incidents that pertain to the same performance-related issue. The integrated
incident and problem management tool performs root cause analysis, which reveals
that insufficient bandwidth of network links that carry replication traffic is the root
cause of the problem. The tool also logs the problem for administrative action.

Administrators of the problem management team can view the problem details
including the root cause recorded by the integrated incident and problem
management tool. They determine the remedial steps to correct the root cause. In
this case, the administrators decide to add a new network link to increase the
bandwidth for replication traffic. For that, they generate a request for change. Upon
obtaining approval from the change management, they ensure that the new link is
created by the implementation engineers. Thereafter, the problem management
team closes the problem.

Appendix

Snapshot – RoW: Details


• RoW redirects new writes destined for the source LUN to a reserved LUN in the
storage pool.
• In RoW, a new write from the production compute system is simply written to a
new location (redirected) inside the pool.
− The original data remains where it is, and is therefore read from the original
location on the source LUN; it is untouched by the RoW process.
• Some vendors’ local replication software provides the capability to create
target-less snapshots. These snapshots only relate to a source device and
cannot be otherwise accessed directly.
− Snapshots can be restored back to the source devices or linked to another
set of target devices which can be made accessible to the compute system.
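The RoW behavior can be illustrated with a toy block map: writes always allocate a new location in the pool, so a snapshot that froze the earlier map keeps reading the original data untouched. The data structures are invented for the sketch:

```python
pool = {}            # physical location -> data
source_map = {}      # source LUN block -> current physical location
snapshot_map = {}    # snapshot block -> physical location (frozen view)
_next_loc = 0

def _allocate():
    """Hand out a fresh physical location in the pool."""
    global _next_loc
    _next_loc += 1
    return _next_loc

def write(block, data):
    """Redirect-on-write: never overwrite in place; point the source map
    at a newly allocated location instead."""
    loc = _allocate()
    pool[loc] = data
    source_map[block] = loc

def take_snapshot():
    """Freeze the current block map; the snapshot shares the original
    locations rather than copying any data."""
    snapshot_map.clear()
    snapshot_map.update(source_map)

def read_source(block):
    return pool[source_map[block]]

def read_snapshot(block):
    return pool[snapshot_map[block]]
```

After a snapshot, overwriting a block changes only the source map; the snapshot still resolves to the original location, which is why RoW avoids the copy overhead of copy-on-write approaches.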

Appendix

Virtual Machine Hardening


• Virtual machine hardening is a key security control to protect virtual machines
from various attacks.
• A virtual machine is created with several default virtual components and
configurations.
− Some of the configurations and components may not be used by the
operating system and application running on it.
o These default configurations may be exploited by an attacker to carry out
an attack.
• A virtual machine hardening process should be used in which the default
configuration is changed to achieve greater security. In this process:
− The virtual machine’s devices that are not required are removed or disabled.
− The configuration of VM features is tuned to operate in a secure manner, for
example by changing default passwords, setting permissions on VM files,
and disallowing changes to the MAC address assigned to a virtual NIC to
mitigate spoofing attacks.
• Hardening is highly recommended while creating virtual machine templates.
This way, the virtual machines created from the template start from a known
security baseline.

Appendix

Operating System Hardening


• Operating system hardening typically includes:
− Deleting unused files and applications
− Installing current operating system updates (patches)
− Configuring system and network components following a hardening checklist
• Hardening checklists are typically provided by operating system vendors or by
organizations such as the Center for Internet Security (CIS) and the Defense
Information Systems Agency (DISA), which also publish security best
practices.
• Vulnerability scanning and penetration testing can be performed to identify
existing vulnerabilities and determine the feasibility of an attack.
− These controls assess the potential impact of an attack on the business.
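Working through a hardening checklist can be automated as a set of named checks run against a snapshot of system state. The sketch below is a minimal, assumption-laden illustration: the state keys and check names are invented for the example and are not drawn from any CIS or DISA benchmark.

```python
# Minimal sketch of driving a hardening checklist programmatically.
# Each check is a named predicate over a dict describing system state;
# keys and check names here are hypothetical examples.

def run_checklist(state, checks):
    """Return the names of all checks that fail for this state."""
    return [name for name, check in checks if not check(state)]

CHECKS = [
    ("unused-services-removed", lambda s: not s["unused_services"]),
    ("patches-current",         lambda s: s["pending_patches"] == 0),
    ("root-ssh-login-disabled", lambda s: not s["ssh_root_login"]),
]

state = {"unused_services": ["telnet"], "pending_patches": 0,
         "ssh_root_login": False}
print(run_checklist(state, CHECKS))  # ['unused-services-removed']
```

In practice this role is filled by configuration-management and compliance-scanning tools; the point here is only the checklist-as-checks structure.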


Application Hardening
• Application hardening is a process followed during application development,
with the goal of preventing the exploitation of vulnerabilities that are typically
introduced during the development cycle.
• Application architects and developers must focus on various factors such as
proper application architecture, threat modeling, and secure coding while
designing and developing an application.
− Installing current application updates or patches provided by the application
developers can reduce some of the vulnerabilities identified after the
application is released.
• The application hardening process also includes process spawning control,
executable file protection, and system tampering protection.
• A common type of attack on applications is tampering with executable files.
− In this type of attack, virus code is incorporated into the application’s
executable files. When the infected application runs, the virus code also
executes. This type of attack can be prevented by disallowing modification
of the application executable.
− Countermeasures for this type of attack are typically implemented in
operating system configuration settings or via malware protection software.
o When a modification attempt is made, the OS or the malware protection
software blocks the change to the executable files.
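One building block behind such countermeasures is an integrity check: compare the executable's cryptographic digest against a known-good value before allowing it to run. The sketch below only illustrates that idea; real enforcement lives in the OS or malware protection software, and the sample bytes are invented for the example.

```python
# Sketch of an executable integrity check against a trusted digest.
# Real anti-tampering controls are enforced by the OS or security
# software; this only demonstrates the hash-comparison idea.

import hashlib

def fingerprint(exe_bytes: bytes) -> str:
    """SHA-256 digest of the executable's contents."""
    return hashlib.sha256(exe_bytes).hexdigest()

def verify_before_run(exe_bytes: bytes, trusted_digest: str) -> None:
    """Refuse to run an executable whose digest no longer matches."""
    if fingerprint(exe_bytes) != trusted_digest:
        raise PermissionError("executable has been modified; refusing to run")


original = b"application code v1"
trusted = fingerprint(original)       # recorded at install time
verify_before_run(original, trusted)  # passes silently
```

If virus code is later incorporated into the file, its digest changes and `verify_before_run` raises instead of executing the tampered binary.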



Glossary
Incident Management
An incident is an unplanned event such as a switch failure, security attack, or
replication software error that may cause an interruption to the protection
operations and services, or degrade their quality.


