
Copyright © 2008 EMC Corporation. Do not Copy - All Rights Reserved.

Module 1: Avamar Fundamentals


Upon completion of this module, you will be able to:
• Describe the Avamar advantage over traditional backup systems
• Define Avamar terminology
• Describe Avamar system components
• Describe the Avamar server and client processes
• Describe the Avamar de-duplication data flow
• Describe how Avamar is used to back up VMware, NDMP, clusters and database application
environments

© 2008 EMC Corporation. All rights reserved. Avamar Fundamentals 1 - 1 of 52

The objectives for this module are shown here. Please take a moment to read them.

Avamar Fundamentals, Page 1 - 1



Lesson 1: Avamar Features and Functions


What is Avamar?
Efficient, disk-based backup with data de-duplication

Data Protection Challenges
• Explosive data growth
• Faster backup & recovery requirements
• Service level demands
• Data spread across remote sites
• High overhead costs

Solution
• Global de-duplication at source and target
• Disk as primary backup storage medium
• Centralized backup management
• Scale with capacity & growth requirements
• Scheduled replication


EMC Avamar is a comprehensive, client-server network backup and restore solution. With its unique
global data de-duplication technology, Avamar addresses the data protection challenges in today’s IT
environments.
The ever-increasing amount of data to back up presents a challenge to organizations facing the
demands of shorter backup windows, quicker restore responses, consistent backups of remote sites, and
regulatory requirements, all of which must be met with fewer staff and tighter budgets.
Avamar meets these challenges by re-designing backup and restore as true disk-based processes.
Avamar’s patented global de-duplication technology reduces the amount of backup data by identifying
unique data at the source. Avamar stores only one copy of this common data across the backup
network. This results in a dramatic reduction in the amount of data that is moved across the network
and stored in backup storage. The same data is backed up as in traditional backup systems, but
consumes significantly less network and backup resources as only unique data is stored. And, by using
standard IP network technologies, dedicated backup networks are not required.
Avamar employs a scalable, disk-based server architecture built of modules that provide a balance of
connectivity, security, processing and disk storage resources. Scheduled backup and replication
functionality enables efficient backup of remote sites and provides disaster recovery of primary backup
sites. Avamar provides a user-friendly interface for central management of the entire backup system.


Traditional Backups
Store many copies of same files: from same machine, multiple machines, versions across time
• Multiple copies of Doc A exist on different clients in the network
• Multiple versions of documents exist on the same client
• Same documents backed up with every full backup
• Full backups often taken once/week & retained for months & years

[Diagram: Clients 1 through 4 each hold copies of document A and send them to the Backup Server;
multiple copies of the same document exist on backup storage]


A high percentage of data that is retained on backup media by most backup solutions is highly
redundant. The typical backup process for most organizations consists of a series of daily incremental
backups and weekly full backups. Daily backups are usually retained for a few weeks and weekly full
backups are retained for several months to several years. Because of this process, multiple copies of
identical or slowly changing data are retained on backup media, leading to a high level of data
redundancy.
A large number of operating system files, application files and data files are common across multiple
systems in an enterprise. Identical files, such as Word documents, PowerPoint presentations and Excel
spreadsheets, are stored by many users across an environment. Backups of these systems will contain a
large number of identical files.
Additionally, many users keep multiple versions of files that they are currently working on. Many of
these files differ only slightly from other versions, but are seen by backup applications as new data that
must be protected.
Backing up redundant data increases the amount of backup storage needed and can negatively impact
network bandwidth. Organizations are running out of backup window time and facing difficulties
meeting recovery objectives due to the need to manage backup versions and a myriad of backup tapes.


The Avamar Advantage


• Identify and store only unique sub-file data objects
• Store objects taking max advantage of disks
• Create & store “trees” that link all data objects
• Recreate files for restore

[Diagram: Multiple copies of Doc A exist on different clients in the network (Clients 1 through 4).
Only unique objects are stored in backup storage on the Backup Server; the data for this file is
stored only once]

Avamar differs from traditional backup and restore solutions by identifying and storing only unique,
sub-file data objects. Redundant data is identified at the source, drastically reducing the amount of
backup data that travels across the network to be stored and managed by the backup host. When storing
data objects, Avamar takes maximum advantage of inherent hard-disk characteristics. Avamar also
creates and stores “trees” that link all data objects from a single backup. These “trees” are used to re-
create files for restore.
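
The idea of storing each unique object once and keeping a "tree" per backup can be sketched roughly as follows. This is only an illustration under stated assumptions, not Avamar's implementation: the `ChunkStore` class, the `backup_file` and `restore_file` helpers, the use of SHA-1 as the object key, and the fixed 4-byte chunk size are all hypothetical, and the "tree" is reduced to a flat, ordered list of object keys.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept once, keyed by its hash."""

    def __init__(self):
        self.chunks = {}  # hex digest -> chunk bytes

    def put(self, data):
        key = hashlib.sha1(data).hexdigest()
        # If the chunk is already present, nothing new is stored (de-duplication).
        self.chunks.setdefault(key, data)
        return key

    def get(self, key):
        return self.chunks[key]


def backup_file(store, data, chunk_size=4):
    """Store the file's chunks and return its 'tree': the ordered list of chunk keys."""
    return [store.put(data[i:i + chunk_size]) for i in range(0, len(data), chunk_size)]


def restore_file(store, tree):
    """Re-create a file by following its list of chunk keys."""
    return b"".join(store.get(key) for key in tree)


store = ChunkStore()
doc_a = b"AAAABBBBCCCC"
tree_client1 = backup_file(store, doc_a)            # Doc A on client 1
tree_client2 = backup_file(store, doc_a)            # the same Doc A on client 2
tree_client3 = backup_file(store, b"AAAABBBBDDDD")  # a slightly changed version

stored_chunks = len(store.chunks)
restored = restore_file(store, tree_client3)
```

Three backups reference twelve chunks in total, yet the store holds only the distinct ones, and each file can still be rebuilt from its own list of keys.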


What is Data De-duplication?


• Identifies duplicate or redundant data
• File level
  – Most common implementation
• Fixed block level
  – Sometimes referred to as fixed-length level
  – Commonly employed by snapshot or replication technologies
  – Even small changes to data can change all fixed-length segments in a dataset even though
    little has changed
• Variable block level
  – Intelligent method of determining segment size by looking at the data itself to determine
    logical boundaries
  – Greater granularity; more efficient
  – Employed by EMC Avamar

Data de-duplication, or single instance storage, reduces storage needs by identifying duplicate or
redundant data. Only unique data is then stored on the storage media. The level at which data
de-duplication is employed determines the granularity of de-duplication. Three levels of data
de-duplication are:
• File level de-duplication helps organizations reduce storage needs for file servers by identifying
duplicate files within hard disk volumes and providing an efficient mechanism for consolidating
them. The most common implementation of single instance storage is at the file level. With this
method, a single change in a file results in the entire file being identified as unique. As shown in
the example, if there were 5 versions of a file in a backup environment, the 5 files in their entirety
are stored.
• Fixed block de-duplication, also called fixed length de-duplication, is commonly employed in
snapshot and replication technologies. This method breaks a file into fixed length sub-objects.
However, even with small changes to the data, all fixed length segments in a dataset can change
despite the fact that very little of the dataset has actually changed.
• Variable block level de-duplication uses an intelligent method of determining segment size that
looks at the data itself to determine repeatable boundary points. Variable block level de-duplication
yields a greater granularity in identifying duplicate data, eliminating the inefficiencies of file level
and fixed block level de-duplication. With variable block level de-duplication, a change in a file
results in only the variable-sized block containing the change being identified as unique.
Consequently, more data is identified as common data, and in the case of backup, there is less data
to store as only the unique data is backed up. This is the method used by Avamar.
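
The difference between fixed and variable block boundaries can be seen in a small experiment. The sketch below is not Avamar's algorithm, which these notes do not describe. It assumes a deliberately simple boundary rule (a block ends wherever the sum of the last `window` bytes, modulo `divisor`, equals `target`), and the function names and parameters are hypothetical. The input is a run of distinct byte values so the result is easy to check by hand.

```python
def fixed_chunks(data, size=8):
    """Cut the data every `size` bytes, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data, window=4, divisor=16, target=10):
    """Cut the data wherever the content of the last `window` bytes says so.

    The boundary test depends only on nearby bytes, so the same content
    produces the same boundaries even if it has shifted position.
    """
    chunks, start = [], 0
    for end in range(window, len(data) + 1):
        if sum(data[end - window:end]) % divisor == target:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


original = bytes(range(200))
modified = b"\xff" + original  # one byte inserted at the front

# Count how many chunks of the modified data already exist in the original.
fixed_shared = len(set(fixed_chunks(original)) & set(fixed_chunks(modified)))
variable_shared = len(set(variable_chunks(original)) & set(variable_chunks(modified)))
variable_total = len(variable_chunks(modified))
```

With this toy input, the one-byte insertion shifts every fixed-size block, so none of them match the original. The content-defined boundaries realign right after the change, so every block except the first is recognized as already stored.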


Key Avamar Features


• Global de-duplication
• Systematic fault tolerance: RAID, RAIN, Checkpoints & Replication
• Disk Storage
• Standard IP network architectures
  – Dedicated backup network not required
• Scalable server architecture
• Flexible deployment options
• Centralized management

[Diagram: Clients, Avamar Backup/Restore Server, Management Console]


The Avamar solution includes the following key features:
• Global data de-duplication ensures that data objects are only backed up once across the backup
environment.
• Systematic fault tolerance, using RAID, RAIN, checkpoints and replication, provides data integrity
and disaster recovery protection.
• Highly reliable, inexpensive disk storage serves as primary backup storage.
• Standard IP network technologies optimize use of the network for backup; dedicated backup
networks are not required. Daily full backups are possible using existing networks and
infrastructure.
• Scalable server architecture provides security and expandability. Additional storage nodes can be
added to an Avamar multi-node server to accommodate increased backup storage requirements.
• Centralized management. Avamar Enterprise Manager and Avamar Administrator interfaces
enable remote management of Avamar servers from a centralized location via internet access.
• Flexible deployment options include Avamar Virtual Edition and Avamar Data Store. Avamar
supports a wide variety of client operating systems and applications including Windows, Linux,
Unix, NDMP, Microsoft SQL, Microsoft Exchange and Oracle. With its global de-duplication
technology, Avamar is an efficient backup choice for VMware and remote office backup
environments.
These features will be discussed in more detail throughout this course.


Avamar Terminology
• System: one or more Avamar servers and the clients/servers that back up data to those servers
• Server: a group of one or more Nodes on a local, high-speed network; also known as a grid
• Node: computer running Avamar server software
• Stripe: units of disk space for storing objects
• Object: variable-sized units of de-duplicated data

[Diagram: nested hierarchy of System, Server, Node, Stripe, Chunk/Object]


There are several Avamar terms that we will be using throughout the course:
System: one or more Avamar servers and the network servers or desktop clients that back up data to
those servers
Server: a group of one or more Nodes on a local, high-speed network.
Node: a self-contained, rack-mountable network-addressable computer consisting of both processing
power and hard drive storage. Nodes run Avamar server software on the Linux operating system. Hard
disk storage may be internal to the node or implemented using an external SAN array.
Stripe: a unit of disk drive space managed by Avamar.
Object: a single instance of de-duplicated data. Objects are stored and managed within stripes on the
Avamar server.


Data Protection Functions

[Diagram: Clients, Avamar Server, Disaster Recovery Site]

• Backup: point-in-time copy of client data
• Initialization: process of running a first backup from a client
• Restore: retrieves one or more filesystems, directories, or files from an existing backup
• Encryption: Axion or AES 128-bit
• Retention: determines the length of time that a backup is available for recovery
• Replication: stores a logical copy of Avamar server data on another Avamar server on a
scheduled basis

An Avamar backup is defined as a point-in-time copy of client data that can be restored as individual
files, selected directories or entire filesystems. Initialization is the process of running a first backup
from a client.
Restore is an operation that retrieves one or more filesystems, directories or files from an existing
backup and writes them to a designated location.
Encryption provides enhanced security during client/server data transfers and on the Avamar server.
For data in transit, Avamar supports either of two encryption methods: Axion or AES 128-bit. Axion
encryption is a proprietary algorithm suitable for Avamar client to server communication over trusted
networks. 128-bit Advanced Encryption Standard (AES) encryption should be used for any network
communication where security is a concern. As part of server installation, an Avamar server can be
configured to encrypt all backup data stored on the server.
Retention determines the length of time that a backup is available for restore. Avamar allows you to
specify how long a backup is retained; unused chunks from backups that have expired are deleted from
the system.
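
One way to picture how retention interacts with de-duplicated storage is the sketch below. It is an assumption-laden illustration rather than Avamar's actual housekeeping logic: the `BackupCatalog` class, its method names, and the chunk labels are hypothetical. The point is that a chunk can be deleted only when no unexpired backup still refers to it.

```python
from datetime import date, timedelta


class BackupCatalog:
    """Hypothetical catalog that tracks which chunks each backup references."""

    def __init__(self):
        self.backups = {}    # label -> (expiry date, set of chunk ids)
        self.chunks = set()  # chunk ids currently held in storage

    def add_backup(self, label, taken_on, retention_days, chunk_ids):
        expires = taken_on + timedelta(days=retention_days)
        self.backups[label] = (expires, set(chunk_ids))
        self.chunks |= set(chunk_ids)

    def expire(self, today):
        """Drop expired backups, then delete chunks no remaining backup uses."""
        self.backups = {
            label: (expires, ids)
            for label, (expires, ids) in self.backups.items()
            if expires > today
        }
        still_used = set()
        for _, ids in self.backups.values():
            still_used |= ids
        deleted = self.chunks - still_used
        self.chunks -= deleted
        return deleted


catalog = BackupCatalog()
catalog.add_backup("daily", date(2008, 1, 1), 7, ["a", "b", "c"])
catalog.add_backup("monthly", date(2008, 1, 2), 30, ["a", "b", "d"])
deleted = catalog.expire(today=date(2008, 1, 10))
```

Here the 7-day backup has expired, but chunks "a" and "b" survive because the 30-day backup still uses them; only "c" is removed.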
Replication is the process of storing a logical copy of Avamar server data on another Avamar server to
support future disaster recovery of the source server.


Avamar Administration and Management


• Avamar Administrator
• PostgreSQL Database
• Avamar Administrator Command Line Interface (CLI)
• Avamar Enterprise Manager


Avamar administration tools provide central administrative access to the Avamar system.
The Avamar Administrator is a graphical user interface (GUI) used to configure, monitor and
manage an Avamar system from one or more Windows or Linux clients.
Avamar uses a PostgreSQL database to store various kinds of data, such as backup and restore
activities, events, defined groups and clients. This information is available for reporting using third-
party reporting tools such as Crystal Reports, MS Query, and Microsoft Excel.
The Avamar Administrator Command Line Interface (CLI) is a Java application providing
command line access to the features and functions that are available via the GUI.
The Avamar Enterprise Manager provides centralized access to the Avamar Administrator for each
Avamar system in an enterprise as well as dashboard, reporting and search capabilities. With
Enterprise Manager, backup administrators can monitor and manage all Avamar servers in a
distributed environment.


EMC Avamar Resources


Resources related to the use of Avamar include:
• Avamar System Administration Manual
• Avamar Operational Best Practices Manual
• Product Security Manual
• Avamar Release Notes
• Avamar Technical Addendum
• Avamar Administrator CLI Programmer’s Guide and Reference Manual
• Avamar Overview (eLearning training)
• Avamar Administration (classroom training)
• Avamar Installation and Configuration (classroom training)

Resources related to Avamar include the guides, release notes, and training courses listed on the slide.
For a complete set of product information and documentation for Avamar, as well as client and
Administrator Console software, please go to the Avamar secure web server at
http://youravamarname.domain.com. You can register for EMC Education Services training courses
via the EMC Powerlink web site, http://powerlink.emc.com.


Lesson 2: Avamar System Components


Avamar System Components

Avamar Backup Clients: client software on systems to be protected
• Contain data to be backed up
• Avamar agents and one or more plug-ins

Avamar Server: policy engine and storage
• Avamar Administrator Server
• Avamar Data Server

Avamar Administrator: Administrator Console


The three major components of an Avamar system are the Avamar server, Avamar backup clients and
the Avamar Administrator.
The Avamar Server stores client backups and provides essential processes and services required for
client access and remote system administration. Avamar Administrator Server (mcs) and Avamar Data
Server (gsan) run on the Avamar server.
Avamar Client software runs on each computer or network server that is being backed up. Avamar
provides client software for various computing platforms. Each client consists of a client agent and one
or more plug-ins.
Avamar Administrator is a management console software application that is used to remotely
administer an Avamar system from a supported Windows or Linux client computer.


Avamar Server Node Types


• Utility node is dedicated to providing internal Avamar server processes
• Data Storage nodes provide backup storage
• NDMP Accelerator node provides backup and recovery functionality for NAS devices via NDMP

[Diagram: a rack containing one utility node, four data nodes and one spare data node]


The primary building block of an Avamar system is a node.


Utility nodes are dedicated to providing internal Avamar server processes and services including the
administrator server, cron jobs, Domain Name Server (DNS), external authentication, Network Time
Protocol (NTP) and web access.
Data storage nodes include the Avamar Data Server software and are dedicated to providing backup
storage.
NDMP Accelerator is a specialized node that, when used as part of an Avamar system, provides a
complete backup and recovery solution for NAS devices via the Network Data Management Protocol
(NDMP). Avamar supports Network Appliance filers and EMC Celerra with the NDMP Accelerator.


Systematic Fault Tolerance

• RAID protection for disk failures
• RAIN protection for node failures
• Replication for server loss
• Checkpoints for operational failures
  – Checkpoints are validated

To ensure system integrity, Avamar provides systematic fault tolerance at the following levels:
• RAID (redundant array of independent disks) protection for disk failures
  – Balance between performance and efficiency
  – RAID-1 or RAID-5
  – Hot-swap capability with minimum system impact for highest failure-rate components (more
    than 90% of expected failures)
• RAIN (redundant array of independent nodes) provides failover and fault tolerance across nodes
  – Uninterrupted functionality during node failure, replacement and reconstruction
  – In the unlikely event of a node failure, the system will load balance around the failed node.
    Backup data will be stored on the remaining nodes; data for recoveries is reconstructed using
    parity.
  – Once the failed node is decommissioned, its data will be rebuilt automatically from parity
• Replication for server loss
  – Efficient, scheduled replication (local or remote) ensures availability/redundancy of data if
    primary server is lost
• Checkpoints for operational failures
  – Provide redundancy across time
  – Read-only snapshot of the Avamar server taken to facilitate server rollbacks
  – Created using hard-links to all the stripes
  – Regular checkpoint validation, including auto-repair capability, to ensure data integrity
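
Rebuilding a failed node's data from parity can be illustrated with a simple XOR scheme. The notes say only that parity is used; the single-parity XOR layout, the `xor_blocks` helper and the sample blocks below are assumptions chosen for illustration, not a description of how Avamar RAIN is actually implemented.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One block of data per node (equal lengths keep the example simple).
node_data = [b"node0-data", b"node1-data", b"node2-data"]
parity = xor_blocks(node_data)

# Simulate the loss of node 1, then rebuild its block from the survivors plus parity.
survivors = [node_data[0], node_data[2]]
rebuilt = xor_blocks(survivors + [parity])
```

Because XOR-ing a value with itself cancels out, combining the surviving blocks with the parity block leaves exactly the missing block.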


Avamar Server Standard Configurations


• Non-RAIN
  – 1x1: Stand-alone Node
  – 1x2: Utility Node, Data Node 0.0, Data Node 0.1
• RAIN
  – 1xn+S: Utility Node, Data Nodes 0.0 through 0.3 (as drawn), Spare Data Node

* In the formula, 1 is the number of modules or servers, n the number of data nodes, S the spare data node


Avamar supports two basic types of standard Avamar server configurations.


Non-RAIN configurations consist of either a single stand-alone node or one utility node and two data
storage nodes. In single node configurations, both utility and data functions are provided in the single
node.
RAIN configurations include one utility node, four or more data storage nodes, plus a spare data node.
Currently, the largest standard configuration consists of 16 data nodes, 1 utility node, and 1 spare data
node.
In a multi-node system, the nodes operate together as one server. The hostname and IP address of the
utility node are the identity of the Avamar server for access and client/server communication. Avamar
load balances data across all available nodes in a server. With node architecture, Avamar can be easily
scaled by adding more nodes.
Beginning with Avamar release 4.0, two sizes of data nodes are supported: nodes with 1 TB of
licensable capacity and nodes with 2 TB of licensable capacity. Licensable capacity includes de-duped
data plus RAIN parity protection.


Avamar Server 4.0 Specifications


• With release 4.0, server runs on RHEL4 64-bit
• Support for multiple processors
• Two types of data storage nodes:
  – 1 TB licensable capacity
  – 2 TB licensable capacity
• Minimum RAM Requirements:
  – 4 GB RAM for utility, accelerator and 1 TB data nodes
  – 16 GB RAM for 2 TB nodes
• Three server editions:
  – Software only
  – Hardware appliance: Avamar Data Store
  – Virtual appliance: Avamar Virtual Edition

Beginning with Avamar release 4.0, the Avamar server runs on 64-bit Red Hat Enterprise Linux 4.0
only. The Avamar server is capable of operating on server hardware with multiple processors.
With release 4.0, two sizes of data nodes are supported:
• 1 TB of licensable capacity
• 2 TB of licensable capacity
All data nodes within an Avamar server must be of the same size.
The Avamar solution is available in three different editions, providing the flexibility to meet different
customer requirements. With the Software edition of Avamar server, the server software is installed on
customer-supplied hardware selected from a list of Avamar-certified hardware devices. This provides
customers with the flexibility to use hardware from a single vendor throughout their site for support
and maintenance purposes. The Data Store edition includes both hardware and Avamar server software
from EMC. The hardware is qualified and pre-tested by EMC, requiring fewer resources to complete the
deployment at the customer site. The Virtual Edition is a software-only solution that is installed on
VMware ESX servers. This solution is ideal for remote branch offices that may need to back up locally.


Avamar Server Editions: Software Only (1 of 2)


• Software installed on customer-supplied hardware
• Certified hardware platforms include Dell, HP, and IBM equipment
  – Please contact your sales representative for more information
• RAM Requirements:
  – 4 GB RAM for utility and 1 TB data nodes
  – 16 GB RAM for 2 TB nodes
• Storage requirements and configuration vary by type of node (2 TB data node, 1 TB data node,
utility node, single-node server)


The Avamar Software edition is a software-only solution. Avamar server software is installed on
customer-supplied qualified hardware platforms. With this option, the entire server implementation is
performed at the customer site by EMC-trained personnel, including RAID configuration and testing.
Currently, Dell, HP, and IBM hardware platforms are among the supported platforms.


Avamar Server Editions: Software Only (2 of 2)

Node Type            RAM                         Hard Drives   RAID Level   LUNs
Single-node server   16 GB                       6 x 1 TB      RAID-1       3
Single-node server   4 GB                        6 x 300 GB    RAID-5       4
Utility node         4 GB                        2 x 146 GB    RAID-1       1
Data storage node    16 GB                       6 x 1 TB      RAID-1       3
Data storage node    4 GB                        6 x 300 GB    RAID-5       4
NDMP Accelerator     4 GB (Rec. 8 GB – 16 GB)    2 x 146 GB    RAID-1       1


The table on the slide lists the node types, minimum RAM requirements and a description of internal
data storage configuration and allocation.

Note: Please refer to Avamar installation documentation for detailed hardware platform and
compatibility specifications.


Avamar Server Editions: Avamar Data Store


• Complete EMC solution:
  – EMC-certified hardware, pre-configured and tested offsite
  – Avamar software installed onsite
• Available pre-racked configurations:
  – 6-node with 1 TB data nodes (4 TB licensable capacity)
  – 6-node with 2 TB data nodes (8 TB licensable capacity)
  – 12-node with 2 TB data nodes (20 TB licensable capacity)
  – 18-node with 2 TB data nodes (32 TB licensable capacity)
• Additional supported configurations:
  – Single node expansion (either 1 TB or 2 TB data nodes)
  – Single server systems (either 1 TB or 2 TB data nodes)
  – 1 x 2 systems (2 TB or 4 TB licensable capacity)
  – Accelerator node

Avamar Data Store simplifies the purchase and deployment of Avamar by delivering a pre-packaged
solution consisting of Avamar server software installed onsite on pre-configured and pre-tested
Avamar-certified hardware. Deployment time at customer sites is reduced since hardware stress tests
and benchmark tests are performed before the hardware is shipped. Avamar Data Store is available in
several configurations as listed in the slide, including multi-node and single-node servers and single
expansion nodes. Avamar Data Store is deployed by EMC-trained personnel.
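
The licensable capacities listed for the pre-racked configurations are consistent with a simple calculation, sketched below. The sketch assumes each pre-racked rack contains one utility node and one spare data node, which matches the RAIN layout described earlier but is not stated for these specific configurations; the function name and parameters are hypothetical.

```python
def licensable_capacity_tb(total_nodes, data_node_tb, utility_nodes=1, spare_nodes=1):
    """Capacity contributed by active data nodes only (assumed layout)."""
    data_nodes = total_nodes - utility_nodes - spare_nodes
    return data_nodes * data_node_tb


# (total nodes in rack, licensable TB per data node) for the four listed configurations
configs = [(6, 1), (6, 2), (12, 2), (18, 2)]
capacities = [licensable_capacity_tb(nodes, tb) for nodes, tb in configs]
```

Under that assumption the results are 4, 8, 20 and 32 TB, the same figures shown on the slide.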


Avamar Server Editions: Avamar Virtual Edition


• Avamar server runs on a VMware ESX server virtual machine (supplied by customer)
• Single-node Avamar server
• Two licensed capacity sizes: 0.5 TB and 1.0 TB
• Must meet expected I/O performance benchmarks

[Diagram: Avamar Virtual Edition and another application, each on its own operating system, run
on ESX Server, which runs on the hardware (CPU, memory, NIC, disk)]


The EMC Avamar Virtual Edition (AVE) allows the Avamar solution to be standardized on VMware
infrastructure. It is ideal for small remote offices or small data centers because it lowers the total
cost of ownership by sharing the server and storage infrastructure and reducing the cost of hardware
support and maintenance.
AVE is a single-node Avamar server running on a VMware ESX Server 3.0.1 or 3.0.2 virtual machine.
There are two licensed capacity sizes: 0.5 TB and 1.0 TB. Each of these capacity versions has a set of
requirements for memory, I/O, and storage. The choice of AVE version to be deployed depends on the
type of data in the environment to be backed up and the expected daily change rate.
The VMware ESX Server is supplied by the customer. Installation of AVE on a virtual machine is
performed by EMC-trained personnel. The AVE benchmark test must be run to ensure that server
hardware and the virtual environment meet expected I/O performance benchmarks. Also, the
benchmark test helps to determine the impact of AVE on other virtual machines running on the same
physical server.
Note: For more information about Avamar Virtual Edition, please refer to the EMC Avamar Virtual
Edition Installation Guide and Reference Manual, available on EMC Powerlink. Training for AVE
includes the eLearning course, EMC Avamar Virtual Edition Overview.


Avamar in Distributed Enterprises

• Efficient Communication
• Centralized Management
• Consistent Policies
• Wide Scalability
• Disaster Recovery
• Optimized Performance
• Eliminates need to manage complex tape system for backups

[Diagram: a Data Center and a Disaster Recovery Site connected over a WAN to three Remote Sites]



Because Avamar architecture is extremely flexible and scalable, Avamar is an ideal solution for
distributed enterprises. Corporate backup policies can be implemented, enforced, and managed
throughout the organization from a central location. Avamar supports both local area network and wide
area network connections. There is minimal impact to network traffic and performance as, after
initialization, only changes travel over the networks.
You can back up both local and remote clients to a centralized Avamar server. As a centralized backup
system, Avamar protects critical branch data without the addition of hardware or specially trained
personnel at branch office sites. For larger branch office sites, a local Avamar system may be
employed to back up local data at the site and then automatically replicate the backup data to the
central data center or disaster recovery site. Additionally, the central Avamar server can be replicated
to an offsite location for disaster recovery. All backup and replication activity is managed from the
central data center using the Avamar Enterprise Manager and Administrator interfaces. Employing
Avamar disk-based backup eliminates the need to manage a complex tape system for backups, restores,
and offsite security.


Lesson 3: Avamar Processes and Backup Process Flow


Avamar Server Processes

[Diagram: An Avamar client (running avagent and avtar), an Administrator Console (GUI or CLI) and
a web browser connect to the Avamar server. On the server, the utility node runs the web server and
mcs/ems/cron, and storage nodes 0.0 through 0.N each run gsan. Ports labeled on the diagram: 8443,
7778, 28001, 28002 and 27000/29000.]

Ports shown for illustration purposes. Please see Product Security Manual for complete information.

The Avamar server will support up to 18 simultaneous client connections per node. This limitation
may affect initial backups, but has little impact on subsequent backups. The Avamar server includes
the following processes:
Utility Nodes:
y Point of entry for Avamar server support – the hostname and IP of the Utility node is the identity of
the Avamar server for access and client/server communication
y Administrator Server – the Management Console Server (mcs) provides centralized administration
(scheduling, monitoring, and management) for the Avamar server and runs the server-side
processes used by the Avamar Administrator. This comprises all the Java processes as well as the
postgres (SQL) processes.
y Web Server – for web access to documents and downloads, web authentication for restore
y Enterprise Manager Server (ems) – for web-based management
y Server cron jobs – all daily housekeeping processes, checkpoints, garbage collection, HFS check
and replication
Data Storage Nodes:
y Avamar data server (gsan) - in a multi-node system, each data storage node runs the gsan
process
y Each data storage node runs thousands of threads to handle individual requests from multiple
clients.
Notes: Ports shown on the slide are for illustration purposes. Please see the Avamar Product Security
Manual for complete data port usage and firewall requirements.
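
The connection limit above lends itself to a quick sizing calculation. The sketch below is only an illustration: it assumes the limit of 18 applies to each storage node and scales linearly with the number of nodes, and the function names are hypothetical.

```python
import math

CONNECTIONS_PER_NODE = 18  # limit stated in the notes above


def max_simultaneous_clients(storage_nodes: int) -> int:
    """Upper bound on concurrent client connections (assumes linear scaling)."""
    return CONNECTIONS_PER_NODE * storage_nodes


def backup_waves(clients: int, storage_nodes: int) -> int:
    """Number of successive batches needed if every client starts at once."""
    return math.ceil(clients / max_simultaneous_clients(storage_nodes))
```

Under these assumptions, 200 clients starting their initial backup against a four-node system would run in three batches, which is why the limit matters mostly for initial backups.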


Avamar Server Processes


y Notifications and Reporting

[Diagram: notification and reporting paths from the Administrator Server (mcs) on the Utility Node]
GUI alerts -> Administrator Console (GUI or CLI), port 7778
Email notifications -> SMTP server
Syslog/SNMP -> Avamar monitoring (EMC Smarts, EMC Backup Advisor)
SQL commands -> Database (MCDB), port 5555, for Avamar reporting (EMC Backup Advisor)

System activities and operational status are reported as events to the administrator server. Various
kinds of notifications are created when specific events occur. Events can be viewed using the Avamar
Administrator Event Management functionality. Events can be configured as pop-up alerts to the GUI
and notifications via email messages to designated recipients. Examples of events include client
activation, and successful and failed backups.
Third-party tools and applications can be used to monitor and report on the syslog files and SNMP
traps, including EMC Smarts and EMC Backup Advisor.
System events and actions initiated by users, such as user logins, are maintained in an audit log,
permitting enforcement and accountability of security policies. Avamar activities and events can also
be accessed through read-only views of the Management Console Database to run pre-configured and
ad-hoc reports.


Avamar Client Processes

[Diagram: Avamar client processes, with illustrative ports]
Avamar Client: avagent (port 28002) and avtar
Utility Node: Web Server (port 8443 from the Web Browser) and mcs (port 7778 from the Administrator Console, port 28001)
avtar -> gsan on Storage Nodes 0.0, 0.1 ... 0.N (ports 27000/29000)

Avamar client software runs on each computer that is being backed up. Avamar provides client
software for various computing platforms including AIX, HP-UX, Linux, Mac OS, NetWare, Solaris,
VMware, and Windows. Avamar client software consists of two processes, avagent and avtar,
and one or more client plug-ins.
avagent: avagent is the process that listens for incoming workorders from MCS on the Avamar
server. In response to a workorder, such as a backup or a restore, avagent will spawn avtar.
avagent listens on port 28002.
avtar: avtar is the primary process for backups and restores. avtar communicates with the gsan
processes on the storage nodes.
Client plug-ins: There are two types of client plug-ins: filesystem and database. Filesystem plug-ins
browse, backup and restore files or directories on a specific client filesystem. Database plug-ins
support backup and restore of databases.
Current plug-in types include:
y AIX, HP-UX, Linux, Macintosh, NetWare, Solaris, Windows, FreeBSD filesystems
y NDMP, Celerra via NDMP, NetApp Filer via NDMP
y DB2 (AIX, Windows), Oracle RMAN (AIX, HP-UX, Linux, Solaris, Windows), Exchange
Database, Exchange Message, Windows SQL


Avamar Backup Process Flow

[Diagram: backup process flow, with illustrative ports]
Avamar Client components: avscc, avagent (port 28002), avtar
Utility Node: Web Server and mcs (port 28001)
avtar -> gsan on Storage Nodes 0.0, 0.1 ... 0.N (ports 27000/29000)

During a scheduled backup or restore, the administrator server (mcs) running on the Avamar server
generates a work order. mcs then either pages the client avagent, or avagent checks in to pick up
the work order. On the client, avagent starts avtar to begin the backup or restore process.
avtar performs the majority of backup tasks. avtar communicates with a gsan process running on
one of the Avamar data nodes and sends backup data to that gsan. The gsan process then distributes
the data across the available data nodes.


Avamar De-duplication
y De-duplication ensures each unique sub-file object is
stored only once on the Avamar server
y Duplicate objects identified at the client (source)
y Only objects that aren’t stored on the Avamar server are
sent across the network
y Benefits include:
– Less backup data
– Less network traffic


De-duplication is a key feature of the Avamar system. De-duplication ensures that each unique object
is stored only once in the Avamar storage system. Redundant backup data is eliminated at the client
(source), drastically reducing the amount of data that travels across the network to be stored and
managed by the backup host. As long as a data object is stored on the server, it is never re-sent to the
server. This dramatically reduces network traffic and enhances backup storage efficiency, guaranteeing
the most effective de-duplication of the data.
The next several slides depict a high-level logical process and data flow of the Avamar de-duplication
process.


Avamar De-duplication Data Flow


[Diagram: a backup request arrives at the backup client. For each local file:
1. File changed? (checked against the file cache); if yes:
2. Sticky-byte factoring
3. Compression
4. Hashing
Continued on next slide.]

The Avamar agent running on the backup client (avtar) traverses each directory in the backup.
For each file in the backup:
1. avtar checks the client’s file cache to see if the file has been backed up before. Files that
have not changed since a previous backup are skipped.
2. If there is no match in the file cache, sticky-byte factoring divides the file data into
variable-sized chunks.
3. Each data chunk may be compressed.
4. Each compressed data chunk is hashed. The hash created from a data chunk is referred to as
an atomic hash. Atomic hashes are combined to create composites.


Avamar De-duplication Data Flow, continued


[Diagram, continued: after steps 1 to 4 on the backup client:
5. Stored previously? (checked against the local hash cache); if no:
6. Is present on the Avamar server? If no:
7. Hash and data corresponding to the hash sent to the Avamar server]

5. Each atomic and composite hash is compared to the entries in the client’s hash cache to
determine if it has been stored before.
6. If there is no match in the hash cache, the hash cache is updated and the hash is sent to the
Avamar server with an is-present command.
7. If there is no match on the Avamar server, then the hash and the data corresponding to the hash are
sent to the Avamar server.
The cycle for backing up a file continues with other atomics and composites. The local hash cache file
is updated with the hashes for the atomics and composites.
This process is repeated for other files and directories until a single root hash for the backup is
created. Every backup has a root hash stored on the server that, through the series of hashes, links to
all the files and data that make up the backup.
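
The seven steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Avamar's implementation: fixed-size chunks stand in for sticky-byte factoring, zlib stands in for the unnamed compression algorithm, the file cache is keyed on path, size and modification time, composites are formed by hashing the concatenated atomic hashes, and `FakeServer` and `backup_file` are hypothetical names.

```python
import hashlib
import zlib


class FakeServer:
    """Hypothetical stand-in for the server's hash-addressed store."""

    def __init__(self):
        self.store = {}          # hash -> stored bytes

    def is_present(self, digest):
        return digest in self.store

    def put(self, digest, data):
        self.store[digest] = data


def backup_file(path, mtime, data, file_cache, hash_cache, server, chunk_size=4):
    """Back up one file and return how many data chunks were actually sent."""
    signature = (path, len(data), mtime)        # assumed file-cache key
    if signature in file_cache:                 # step 1: unchanged file, skip
        return 0
    sent = 0
    atomics = []
    for i in range(0, len(data), chunk_size):   # step 2: fixed-size stand-in
        chunk = zlib.compress(data[i:i + chunk_size])   # step 3
        digest = hashlib.sha1(chunk).digest()           # step 4: atomic hash
        atomics.append(digest)
        if digest in hash_cache:                # step 5: seen by this client
            continue
        hash_cache.add(digest)                  # step 6: update cache, ask server
        if not server.is_present(digest):
            server.put(digest, chunk)           # step 7: send hash and data
            sent += 1
    # Composite for the file: a hash over the list of atomic hashes (assumption).
    composite = hashlib.sha1(b"".join(atomics)).digest()
    if not server.is_present(composite):
        server.put(composite, b"".join(atomics))
    file_cache.add(signature)
    return sent
```

In this sketch a second client with empty caches that backs up data already held by the server sends no chunks at all, which is the effect global de-duplication is meant to achieve.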


More About Sticky-byte Factoring


1st backup: data is separated into variable-sized chunks
    Sticky-byte factoring algorithm -> 18K 10K 25K 22K 8K

Sticky-byte factoring always produces the same results on data that has not changed: chunks are identical to previous chunks
    Sticky-byte factoring algorithm -> 18K 10K 25K 22K 8K

Subsequent backup when there is a change of data: very quickly, chunks are re-synced with previous backup
    Sticky-byte factoring algorithm -> 20K 8K 25K 22K 8K

With sticky-byte factoring, during backup processing, avtar separates raw data into chunks or
objects that vary in size between 1 byte and 64 KB. Data chunks average 24 KB in size.
Sticky-byte factoring will always produce the same chunk results as long as the data has not changed.
Where there is a change of data since the previous backup, it locates where the data has been changed
and quickly re-synchronizes the data chunking process to match data chunks created during the
previous backup.
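
The re-synchronization property can be illustrated with a generic content-defined chunking sketch. The notes do not describe the sticky-byte algorithm itself, so everything below is an assumption made for illustration: the boundary rule (cut when the sum of the last `window` bytes is divisible by `divisor`) and the parameter names are hypothetical. Only the 64 KB maximum comes from the text.

```python
def chunk(data: bytes, window: int = 48, divisor: int = 4096,
          max_size: int = 64 * 1024):
    """Split data at boundaries chosen from the content itself.

    A cut is made after byte i when the sum of the last `window` bytes is
    divisible by `divisor`, or when the current chunk reaches `max_size`.
    """
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]      # slide the window forward
        size = i + 1 - start
        at_boundary = i >= window - 1 and rolling % divisor == 0
        if at_boundary or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])          # trailing partial chunk
    return chunks
```

Because each boundary depends only on the bytes just before it, inserting data near the start of a file changes the first chunks but leaves later boundaries where they were. With `window=2, divisor=10`, the bytes `1 2 3 7 4 4 5 5 1 1` split into `[1 2 3 7] [4 4 5 5] [1 1]`; prefixing a `9` yields `[9 1] [2 3 7] [4 4 5 5] [1 1]`, so the last two chunks are unchanged and would not need to be sent again.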


More About Compression


y Compresses chunks in the average range of 30 – 50%
y Average chunk size after compression: 12 – 16 KB
y Avamar will not compress data that results in an unfavorable compression ratio (< 25%)

    Sticky-byte factoring algorithm -> 20K 8K 25K 22K 8K
    Compression                     -> 12K 4K 15K 13K 4K

During compression, the chunks created during sticky-byte factoring are reduced in size by roughly
30 to 50%. This is accomplished with a compression algorithm that is optimized
for speed. Average chunk size after compression is between 12 KB and 16 KB.
When a chunk would yield an unfavorable compression ratio, Avamar does not
compress it. This is for efficiency: compressing such data wastes CPU cycles and
may actually cause the chunk to grow in size.
Note: Because they are written in a compressed form, Office 2007 files are also handled differently.
Beginning with Avamar 4.0, Office 2007 files are uncompressed for de-duplication processing and
backup storage, then re-compressed upon restore. This behavior is not backward compatible with pre-4.0
Avamar clients. For these clients, this data will not restore correctly.
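
The "compress only when it pays off" rule can be sketched as follows. This is an illustration, not Avamar's code: zlib stands in for the unnamed speed-optimized algorithm, the 25% threshold from the slide is interpreted here as a minimum space saving, and `maybe_compress` is a hypothetical name.

```python
import zlib


def maybe_compress(chunk: bytes, min_saving: float = 0.25):
    """Return (payload, was_compressed) for one chunk.

    The chunk is stored compressed only if compression saves at least
    `min_saving` of its original size; otherwise it is kept as-is.
    """
    packed = zlib.compress(chunk, 1)          # level 1 favors speed
    saving = 1 - len(packed) / len(chunk) if chunk else 0.0
    if saving < min_saving:
        return chunk, False                   # unfavorable ratio: keep raw
    return packed, True
```

Highly repetitive data passes the threshold easily, while already-compressed or random data does not and is stored unchanged, which avoids the case where compression makes the chunk larger.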


SHA-1 Hashing
y Uses SHA-1 secure hash algorithm
y Creates 20-byte data string from the compressed data chunks

    Sticky-byte factoring algorithm -> 20K 8K 25K 22K 8K
    Compression                     -> 12K 4K 15K 13K 4K
    Hashing                         -> one 20-byte hash per compressed chunk (atomic hashes)

Hashing is the process of creating a short fixed-length data string from a large block of data.
During hashing, data chunks that have been through the compression process are input to the SHA-1
hashing algorithm. SHA-1 processes the data and creates a 20-byte (160-bit) data string called a
hash, which is treated as unique to that chunk.
The hash identifies the data chunk and is used in the de-duplication process to determine if the chunk
has been stored before.
The hash created from each data object is called an atomic hash. Atomic hashes are combined to
create composites. During a backup, hashing continues until a single root hash for the backup is
created.
Note: The original data chunk remains intact. It is used to create the hash, but it is not converted into
the hash. After the hashing process, both the data chunk and the hash exist.
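
The hash hierarchy can be sketched with Python's standard `hashlib`. The notes say only that atomic hashes are "combined" into composites; hashing the concatenation of the child hashes, as done below, is an assumption made for illustration, and the function names are hypothetical.

```python
import hashlib


def atomic_hash(chunk: bytes) -> bytes:
    """20-byte SHA-1 hash of one (compressed) data chunk."""
    return hashlib.sha1(chunk).digest()


def composite_hash(child_hashes) -> bytes:
    """Hash over a list of child hashes (atomics or other composites)."""
    return hashlib.sha1(b"".join(child_hashes)).digest()


def root_hash(files) -> bytes:
    """Root for a backup given as a list of files, each a list of chunks."""
    per_file = [composite_hash([atomic_hash(c) for c in chunks])
                for chunks in files]
    return composite_hash(per_file)
```

Because every level is derived from the level below it, changing a single chunk changes the root hash, while the chunk data itself is left untouched by the hashing step.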


Data Object Storage


y Hashes are used to store & find data objects
y Both hash and chunk are stored
y Data distributed across all storage nodes
y No separate file level catalog


Data object storage is used to manage objects on the Avamar server. Both the chunk that has gone
through the compression process and its corresponding hash are stored. Part of the hash value
is used as an address to identify the location where the corresponding data chunk is stored on backup
disk storage. Because hash values are effectively random and unique, data is automatically and evenly
distributed across all available storage nodes and disks within an Avamar system. This type of address
is called an “object address.” It eliminates the need for a separate file-level catalog.
Once an object has been stored, it cannot be deleted until the specified retention period has expired and
it is not used by any current backup. Storing data on disk, rather than on tape, streamlines the process
of searching for stored objects.
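
Hash-based addressing can be illustrated with a small sketch. The notes do not say which part of the hash is used or how it maps to a location, so the choice below (the first four bytes, reduced modulo the node and disk counts) is purely an assumption, and `locate` is a hypothetical name.

```python
import hashlib


def locate(digest: bytes, nodes: int, disks_per_node: int):
    """Map a hash to a (node, disk) pair using only part of the hash."""
    prefix = int.from_bytes(digest[:4], "big")   # assumed address portion
    node = prefix % nodes
    disk = (prefix // nodes) % disks_per_node
    return node, disk
```

The same hash always maps to the same location, so an object can be found from its hash alone, and because hash values are spread uniformly, objects end up distributed across all nodes without a separate catalog.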


Backup Data Storage Organization


Hierarchical architecture provides file-system organization

Data storage path: Data -> Atomics (objects) -> Atomic Hashes -> Composites -> Composite Hashes -> Composite-Composite -> Root Hash
Data storage stripes: Atomics Stripe, Composites Stripe, Accounts Stripe (with fault tolerance)

Data is stored using a complex hierarchical hashing file system with an indexing structure consisting of
the data elements grouped together by multiple levels of hashes, composites and root hashes. The root
hash of each backup links to the data objects and hashes comprising the backup at the point-in-time
when the backup occurred.
Data objects are stored on Avamar disk storage in special files called data stripes. Each data stripe
fills in like grain being poured into a silo. A single data stripe can hold approximately 8,000 to 10,000
data objects. For new Avamar 4.0 and above systems, stripes hold approximately 30,000 objects.
Composite hashes are stored in separate stripes. Root hashes, as well as information about the origin of
the files (client, domain, etc.), are stored in the accounts stripes. On a RAIN system, an additional
stripe file contains RAIN parity data. This data is used to reconstruct data for a failed node. These
additional stripe files account for the RAIN overhead.
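
How parity data allows a failed node's data to be rebuilt can be shown with a generic XOR parity sketch. The notes do not state how RAIN parity is computed, so XOR is an assumption used here only to illustrate the principle; `xor_parity` and `rebuild` are hypothetical names, and all stripes are assumed to be the same length.

```python
from functools import reduce


def xor_parity(stripes):
    """Byte-wise XOR across equally sized stripes."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*stripes))


def rebuild(surviving_stripes, parity):
    """Recover the single missing stripe from the survivors plus parity."""
    return xor_parity(list(surviving_stripes) + [parity])
```

With one parity stripe per group, any single missing stripe can be recomputed from the others, at the cost of the extra storage that the notes refer to as RAIN overhead.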


Data Image Reconstruction for Restore


y Avamar presents a full backup as of a single point-in-time for restore
y Hierarchical storage structure enables fast restores from disk storage
y Browse backup directory structure to select items to restore


For restore, Avamar presents a full backup as of a single point-in-time.


Each backup has its own root hash linking to the data objects and hashes which comprise the specific
backup at the point in time when the backup occurred. By storing data intelligently, backup data can be
leveraged with Avamar search capabilities to assist in legal discovery and other regulatory and
compliance activities.


Lesson 4

Backing Up
VMware, NDMP, Clusters
And Database Application
Environments


Avamar Backups with VMware


y De-duplication reduces the amount of backup data among VMware virtual machines
y Central management of backup environment, including VMware and non-VMware clients
y Support for 3 different backup options:
– Guest
– VCB
– Console

[Diagram: three virtual machines (APP/OS) running on VMware over shared hardware (CPU, Memory, NIC, Disk), backed up to the Avamar Server.]

Avamar is ideally suited for protecting clients in VMware environments by reducing the amount of
backup data within and across the virtual machines. Avamar provides the flexibility of implementing a
VMware backup solution in any of three ways. Avamar agents can be installed in the virtual machines
(guest backup), on the VMware Consolidated Backup (VCB) proxy server, or in the ESX server
service console. Unlike traditional backup solutions, Avamar can de-duplicate the data stored in virtual
disks (.vmdk files). Data centers can leverage these different VMware backup options to create a
backup environment that meets their individual backup requirements.


Avamar – VMware Guest Backup

y Individual backup for each virtual machine
y Virtual machines treated like any other physical machine
y Avamar agent software installed on each guest
y Backup sent via LAN to Avamar server

[Diagram: an Avamar agent inside each guest OS, running on VMware over the physical server hardware (CPU, Memory, NIC, Disk), sending backups to the Avamar Server.]

With the guest backup option, Avamar client software is installed on the individual virtual machines.
Backup configuration for this method is identical to that of a physical machine. Avamar provides
support for file system and application (such as Exchange, SQL, Oracle) backups. The main
advantages of VMware guest backup are that it provides the highest level of data de-duplication and
lets backup administrators leverage identical backup methods for physical and virtual machines. There
is no requirement for advanced scripting or VMware software knowledge, and day-to-day backup
procedures remain unchanged.


Avamar – VMware Consolidated Backup


y Centralized proxy-based backup for VMware Infrastructure 3
y Leverages Avamar VCB Interoperability Module distributed by VMware
y Reduces load on ESX Server during backups
y Reduces LAN backup traffic by moving data directly from SAN

[Diagram: virtual machines (APP/OS) on an ESX Server with their disks on SAN storage; snapshots of the virtual machines are mounted on the backup proxy server (centralized data mover), which sends backups to the Avamar Server.]

Avamar takes advantage of the VCB to safely, efficiently, and securely protect virtual servers running
within the VMware infrastructure.
VMware Consolidated Backup (VCB) offloads the backup workload to the Consolidated Backup proxy
server. Avamar leverages VMware Consolidated Backup through integration with a tool called the
Avamar VCB Interoperability Module. The Avamar agent and the interoperability module run on the
Consolidated Backup proxy server to provide the backup services. One backup server can provide
backup services to many ESX server hosts as long as all machines share the same SAN (storage area
network).
With this approach, users have several advantages. By consolidating backups onto fewer proxy servers,
backup can be simplified and backup licensing costs for VMware environments can be reduced. VCB
reduces the load on the ESX Server during backups, eliminating the need to manage backup agents in
each virtual machine. And, because of its method of transferring data using the SAN, LAN traffic
associated with backup is reduced. With this approach, there is no direct application integration as is
possible with guest backups using Avamar database plug-ins.
Note: The backup proxy server is supported on Windows Server 2003 only.


Avamar – VMware Service Console Backup


y Avamar client software installed on the VMware console OS (Linux)
y Enables virtual machine image (.vmdk) level backup and recovery
y Backup sent via LAN to backup server or storage node
– No proxy or SAN required
– No file level backup

[Diagram: the Avamar agent runs in the service console alongside the virtual machines (APP/OS) on the VMware virtualization layer, over x86 hardware (CPU, Memory, NIC, Disk) on the physical server; backups go to the Avamar Server.]

Avamar can also leverage service console-based backups for the ESX Server. This simply means that
the Avamar client agent is installed on the Linux OS running the service console. This backup method
protects virtual machines at the image level by backing up their VMDK files. Using the service console backup
method, users can choose to back up the virtual machine either online or offline. In either case, the
backup starts on the production ESX Server host and resources from the ESX Server host are used for
moving the data to the Avamar server.
The advantages of this method of backup include:
y Less management overhead because backup agents are not needed inside the virtual machines.
y No backup proxy server or SAN required because backup is done on the ESX Server host itself.
y Restore goes directly to the ESX Server host.


Avamar in NDMP Environments


y Use NDMP Accelerator for backup of EMC Celerra IP
storage systems and Network Appliance filers
– Dedicated, single node Avamar client that interfaces NDMP storage
devices to Avamar server via NDMPCOPY and avtar
– Currently only NDMP version 2 protocol is supported
– Pass through only; data never hits disk on Accelerator
– Single accelerator can support more than one NAS storage device

[Diagram: EMC Celerra or Network Appliance NDMP storage -> (LAN, NDMPCOPY) -> NDMP Accelerator (avtar) -> (LAN/WAN) -> Avamar Server]

Use the NDMP Accelerator to back up EMC Celerra IP storage systems and Network Appliance filers.
The NDMP Accelerator is a dedicated single-node Avamar client that, when used as part of an Avamar
system, provides a complete backup and recovery solution. The NDMP Accelerator hosts a special
version of the Avamar client and acts as a “pass through” conduit from the NAS device to the Avamar
server. Data streams through the NDMP Accelerator; no user data is stored on the NDMP Accelerator.
A single accelerator can support multiple NAS storage devices.


Backup Process with NDMP Accelerator


y By default, only one backup at a time
– Can be configured for multiple simultaneous backups
y Each backup can contain a maximum of 10 million files
y 1st backup level-0; 2nd – nth, level-1
y Level-1 contents merged with previous backup to create a new full backup

y Celerra level-1 backups should only be performed at the volume level


[Diagram: data flows from the NDMP Storage Device through the NDMP Accelerator to the Avamar Server.]

By default, backups are limited to running one at a time. However, the accelerator can be configured to
support multiple, simultaneous backups. 8 GB of RAM or greater is required. With this option, up to
four backups can run simultaneously provided that all are initiated as scheduled group backups.
Depending upon available CPU processing, expected data throughput is at least 40 GB per hour for the
data stream.
Each backup can contain a maximum of 10 million files (with 16 GB of RAM in the NDMP
accelerator). Backups and restores are performed at the volume, q-tree or directory levels; browsing is
supported only at the volume level.
By default, the first backup (initialization) is always a level-0 backup. The 2nd through nth backups are
performed as level-1. The level-1 contents are merged on-the-fly with previous backup data resulting
in a new full backup. It is recommended to always perform level-1 Celerra backups at the volume
level. For Celerra, Avamar will default to performing full backups of the lower level directories.
The diagram on the slide depicts a high level process flow of an NDMP backup operation. The ndmjob
daemon accepts NDMP Data Server connections (port 10000) and the avagent daemon accepts Avamar
Administrator connections (port 28002). The Avamar NDMP plug-in (avndmp) uses ndmpcopy to
copy data to/from the NDMP storage device and the Avamar server. A backup is performed by
copying data from the storage device to the Avamar server. A restore is performed by copying data
from the Avamar server to the NDMP storage device.
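
The level-0 and level-1 merge described above can be pictured with a simple sketch. It is a conceptual illustration only: the notes do not describe how the merge is implemented, so representing a backup as a mapping from path to content, and a level-1 backup as a set of changed and deleted paths, is an assumption, and `merge_level1` is a hypothetical name.

```python
def merge_level1(previous_full: dict, changed: dict, deleted: set) -> dict:
    """Combine a level-1 (incremental) backup with the previous full view."""
    new_full = dict(previous_full)      # start from the last full backup
    new_full.update(changed)            # new and modified files replace old ones
    for path in deleted:
        new_full.pop(path, None)        # drop files removed since last backup
    return new_full
```

The result is again a complete view of the volume, so every backup can be restored on its own without replaying a chain of incrementals.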


Avamar Support for Microsoft Windows Clusters


y Support for Active/Passive and Active/Active clustering
y Avamar Windows File System client to back up physical node data
y Avamar Windows Cluster client to back up shared cluster data
y Supports 64-bit Windows clusters

[Diagram: Virtual Server Cluster-1, made up of Node A and Node B with shared external storage, backed up to the Avamar Server.]

Microsoft Windows Clustering can help ensure that data or applications are continuously available to
clients on a network. Avamar supports both active/passive and active/active clustering.
The following is an overview of active/passive functionality. The cluster group, or virtual server,
consists of resources including servers defined as cluster nodes, applications and designated areas of
shared external storage. At any given time, only one cluster node is the owner of the cluster group and
is the only node that can access the shared external storage for a single shared application; other nodes
will be offline or standing by. If a planned shutdown or failure of the active node occurs, control of the
virtual server is transferred to another node in the cluster.
In a clustered environment, there is a requirement to protect the data residing on each server’s internal
hard drive and also the cluster data residing on shared external storage. The Avamar Windows File
System client is installed on all nodes of the cluster and is used to back up each internal hard disk
drive. Then, the Avamar Windows Cluster client software is installed on all cluster nodes, with the first
install performed on the server that currently has access to the shared external storage for the cluster
group. This creates certain configuration files and directories necessary for backing up the shared
external storage. The cluster group is managed as a separate client in Avamar Administrator and
is used to back up the shared cluster data. Backups of physical cluster nodes will not include volumes
that will be backed up as part of the virtual cluster.
Avamar supports the File System, Exchange Database and Microsoft SQL Server plug-ins on a
Windows cluster.
For more information about Avamar in a clustered environment, please reference the Avamar Backup
Clients Installation Guide and Reference Manual.

Backing Up Databases vs. File System Data


y More unique data in databases than in file systems
y Database daily change rate typically 3% vs. 0.3% for file
systems
y Implications:
– More storage is required for storing database backups
– More network bandwidth is required for communicating database
backups
– Database backups take more time: 100 GB/hour vs. 250-500 GB/hour
– Higher client CPU utilization if backup is performed on database server

y Avamar best suited for environments where database data comprises 20% or less of overall protected data

Databases are different than file systems. These differences include:


y More unique data in databases than in file systems
y Daily change rate typically 3% versus 0.3% for file systems
y Initial change rate typically 65% versus 35% for file systems
The implications of these differences are that backing up database data is different from backing up
file system data:
y More storage is required for storing database backups
y More network bandwidth is required for communicating database backups
y Database backups take more time; 100 GB/hour is typical for database backups
y Higher client CPU utilization if backup processing is performed on the database server
Because of these factors, Avamar is best suited for environments where database data makes up no
more than 20% of the overall protected data.
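
A short calculation shows why the share of database data matters. The change rates are the typical figures quoted above; the site size and the 20% database share are hypothetical values chosen for illustration, as is the function name.

```python
def daily_new_data_gb(protected_gb: float, daily_change_rate: float) -> float:
    """Data that changes per day and so must be processed and stored."""
    return protected_gb * daily_change_rate


DB_DAILY_CHANGE = 0.03    # 3% per day, typical figure from the notes
FS_DAILY_CHANGE = 0.003   # 0.3% per day, typical figure from the notes

# Hypothetical site: 10 TB protected, of which 20% is database data.
total_gb = 10_000
db_gb = total_gb * 0.20
fs_gb = total_gb - db_gb

db_daily = daily_new_data_gb(db_gb, DB_DAILY_CHANGE)   # about 60 GB per day
fs_daily = daily_new_data_gb(fs_gb, FS_DAILY_CHANGE)   # about 24 GB per day
```

In this example the databases hold only a fifth of the protected data but account for roughly 70% of the daily change, which is the reasoning behind the 20% guideline.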


Avamar Database Clients

y Specialized Avamar clients are provided for online backups and restores of Microsoft SQL Server, Microsoft Exchange, DB2 and Oracle databases
y Other configurations include making database data available to a separate host and backup as file system data using avtar

[Diagram: a Database Server running the database application, the database plug-in and the Avamar Client (avtar), sending backups to the Avamar Server.]

Avamar provides specialized clients for backing up and restoring Microsoft SQL Server, Microsoft
Exchange, DB2 and Oracle databases. Several configurations are available for database backups; they
differ in where the client agents are placed and which types of agents are required.
When the Avamar client and the Avamar database client are both installed on the database server,
installation of the database client adds one or more plug-ins to the database server. During a
backup, the database plug-ins communicate with the database API or backup facility and pass data to
avtar, which backs it up to the Avamar server.
Where data from a database is made available to a separate host where the Avamar client is installed
and running, backups of the data can be taken as file system data using avtar. Offloading backup
processing to another machine may reduce the impact of taking backups on CPU utilization of the
database server.


Avamar - Database Backup Options (1 of 3)


y Installing Avamar database agent on the database server

[Diagram: on the Database Server, the Avamar Exchange Agent reads the Exchange database through the Exchange API and sends backups to the Avamar Server.]

There are several options for installing client agents when backing up databases, including the
placement of agents and the types of agents required.
The first option is installing the Avamar database agent on the database server.
Pros with this approach include:
y Unassisted, single-click backup and restore
y No additional primary storage required
y With Oracle RMAN incremental backups, Avamar client only scans changed data--but all
incrementals need to be applied during restore process
Cons:
y High CPU utilization on client
y Limited to supported databases


Avamar - Database Backup Options (2 of 3)


y Backup “hot database” Dump with Accelerator

[Figure: Database Server (Exchange Database) exports a Database Dump; the Accelerator runs avtar against the dump and backs it up to the Avamar Server]


In this option, the database is first dumped to a local “disk cache” that is accessed by an accelerator
machine on which the Avamar agent is installed.
Pros include:
y Fast restore: a backup copy is immediately available on the local “disk cache”
y Modest CPU utilization on the database server, the same as for tape backup
y Standard file system backup
y Works with non-supported databases (no Avamar database agent) that provide an export or
backup-to-file capability
y With RMAN incremental backups, the Avamar client agent scans only changed data (Oracle only)
Cons:
y Additional primary storage required
y Additional server required
y Two-step restore process


Avamar - Database Backup Options (3 of 3)


y Back up a “hot” snapshot copy with an accelerator

[Figure: Database Server (Exchange Database, Primary) with a Snapshot copy; the Accelerator runs avtar against the snapshot and backs it up to the Avamar Server]


In this option, the Avamar agent is installed on an accelerator machine with access to the snapshot.
The pros with this approach include:
y Fastest restore - latest backup copy immediately available
y No additional CPU utilization on database server
y Standard file system backup
y Works with non-supported databases (no Avamar database agent) that provide an export or
backup-to-file capability
y Single-step restore process
Cons:
y Additional primary storage required to store snapshot
y Additional server required
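
The trade-offs among the three database backup options can be summarized in a small lookup, sketched here in Python. This is only an illustration of the pros and cons listed on the slides; the class, field, and function names are hypothetical and are not part of any Avamar tool. Recording option 1 as a one-step restore is a simplification based on the "single-click" wording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupOption:
    """One database backup option, described by the slide's pros and cons."""
    name: str
    needs_database_agent: bool   # requires an Avamar database agent for this database
    needs_extra_server: bool     # requires a separate accelerator machine
    needs_extra_storage: bool    # requires additional primary storage
    restore_steps: int           # 1 = single step, 2 = two step (simplified)
    db_server_cpu: str           # CPU impact on the database server


# Values transcribed from the three option slides (hypothetical encoding).
OPTIONS = [
    BackupOption("database agent on database server", True, False, False, 1, "high"),
    BackupOption("hot database dump with accelerator", False, True, True, 2, "modest"),
    BackupOption("hot snapshot copy with accelerator", False, True, True, 1, "none added"),
]


def feasible_options(agent_available: bool, extra_server: bool, extra_storage: bool) -> list:
    """Return the names of options whose requirements the site can meet."""
    return [
        o.name
        for o in OPTIONS
        if (agent_available or not o.needs_database_agent)
        and (extra_server or not o.needs_extra_server)
        and (extra_storage or not o.needs_extra_storage)
    ]


if __name__ == "__main__":
    # A non-supported database with a spare server and spare storage.
    for name in feasible_options(agent_available=False, extra_server=True, extra_storage=True):
        print(name)
```

For a non-supported database, only the two accelerator options remain; for a supported database with no spare server or storage, only the agent-on-server option remains.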


Best Practices: Architecture


y RAID
– Set the RAID rebuild to a relatively low priority
– For data store, set up the log scanning capability
– For 3rd party hardware, monitor and address hardware issues

y RAIN
– Always enable for any configurations other than single-node servers
and 1x2 (2 active data nodes)
– Always include an active (available) spare node

y Replication
– Take advantage of failover capability

y Checkpoints
– Leave checkpoint retention policy at default value.

RAID rebuilds can significantly reduce I/O performance and so can adversely affect the
performance of the Avamar server. Therefore, set the RAID rebuild to a relatively low
priority.
Always enable RAIN for any configurations other than single-node servers and 1x2 (2 active data
nodes). When deploying a RAIN configuration, always include an active (available) spare node in the
module.
Always deploy non-RAIN single-node and 1x2 systems with replication.
Leave the checkpoint retention policy at the default values.
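
The architecture guidelines above can be expressed as a simple checklist, sketched here in Python. The `ServerConfig` fields and the `check_best_practices` function are hypothetical names chosen for illustration; they do not correspond to any Avamar configuration file or command. A 1x2 system is modeled as a multi-node server with exactly two active data nodes.

```python
from dataclasses import dataclass


@dataclass
class ServerConfig:
    """Hypothetical description of an Avamar server deployment."""
    single_node: bool                 # True for a single-node server
    active_data_nodes: int            # 2 means a 1x2 system
    rain_enabled: bool
    spare_nodes: int                  # active (available) spare nodes
    replication_enabled: bool
    raid_rebuild_priority: str = "low"
    default_checkpoint_retention: bool = True


def check_best_practices(cfg: ServerConfig) -> list:
    """Return a list of findings; an empty list means no guideline is violated."""
    findings = []
    # Single-node servers and 1x2 systems are the only RAIN exceptions.
    rain_exempt = cfg.single_node or cfg.active_data_nodes == 2

    if not rain_exempt and not cfg.rain_enabled:
        findings.append("RAIN should be enabled for this configuration")
    if cfg.rain_enabled and cfg.spare_nodes < 1:
        findings.append("RAIN configuration should include an active spare node")
    if rain_exempt and not cfg.rain_enabled and not cfg.replication_enabled:
        findings.append("non-RAIN system should be deployed with replication")
    if cfg.raid_rebuild_priority != "low":
        findings.append("RAID rebuild priority should be relatively low")
    if not cfg.default_checkpoint_retention:
        findings.append("checkpoint retention policy should be left at default")
    return findings


if __name__ == "__main__":
    cfg = ServerConfig(single_node=True, active_data_nodes=1, rain_enabled=False,
                       spare_nodes=0, replication_enabled=False)
    for finding in check_best_practices(cfg):
        print(finding)
```

The example reports one finding, because a single-node server without RAIN should be deployed with replication.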


Module Summary
Key points covered in this module:
y Avamar stores only one copy of all common data across the backup network.
y Avamar presents a full backup as of a single point in time for restore.
y Major steps in the de-duplication data flow include sticky-byte factoring,
compression, SHA-1 hashing, and data object storage.
y Avamar system architecture includes the Avamar clients and the Avamar
server. A multi-node Avamar server is made up of utility and data nodes.
y Avamar provides systematic fault tolerance with RAIN, RAID, replication and
checkpoints.
y Avamar Administrator Server (mcs) and Avamar Data Server (gsan) run on
the Avamar server. Major processes running on the Avamar client are
avagent and avtar.
y Avamar backups can be performed in VMware, Microsoft Cluster, NDMP
and database environments.

These are the key points covered in Module 1. Please take a moment to review them.
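
As a refresher on the de-duplication data flow named in the summary, the following Python sketch walks through the four steps in the order listed: variable-length chunking, compression, SHA-1 hashing, and storage of unique objects. It is a toy illustration under stated assumptions, not Avamar's implementation. The rolling-sum boundary rule is a simplified stand-in for sticky-byte factoring, hashing the compressed chunk is an assumption taken from the order of the steps, and all function names are hypothetical.

```python
from __future__ import annotations

import hashlib
import zlib


def chunk(data: bytes, mask: int = 0x3F, min_size: int = 16) -> list[bytes]:
    """Split data into variable-length chunks at content-defined boundaries.

    Simplified stand-in for sticky-byte factoring: a boundary is declared
    when the low bits of a rolling value over recent bytes match the mask.
    """
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling = ((rolling << 1) + byte) & 0xFFFFFFFF
        if i + 1 - start >= min_size and (rolling & mask) == mask:
            chunks.append(data[start:i + 1])
            start, rolling = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def backup(data: bytes, store: dict[str, bytes]) -> list[str]:
    """Chunk, compress, hash, and store only objects not already present."""
    recipe = []
    for piece in chunk(data):
        compressed = zlib.compress(piece)
        digest = hashlib.sha1(compressed).hexdigest()
        if digest not in store:          # de-duplication: keep one copy only
            store[digest] = compressed
        recipe.append(digest)
    return recipe


def restore(recipe: list[str], store: dict[str, bytes]) -> bytes:
    """Rebuild the original data from the ordered list of hashes."""
    return b"".join(zlib.decompress(store[d]) for d in recipe)


if __name__ == "__main__":
    store: dict[str, bytes] = {}
    data = bytes(range(256)) * 20
    first = backup(data, store)
    objects_after_first = len(store)
    backup(data, store)                  # second backup of identical data
    print(len(store) == objects_after_first)
    print(restore(first, store) == data)
```

Backing up the same data a second time adds no new objects to the store, which is the behavior the first summary bullet describes: only one copy of common data is kept.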


Check Your Knowledge


1. How do Avamar backups differ from traditional
backups?
2. At what level does Avamar de-duplicate data?
3. What is an ideal mix of databases and file system data
for an Avamar environment?
4. Which encryption options does Avamar offer?
5. Which fault tolerance features are supported with
Avamar?
6. What is an advantage of the Avamar Data Store?
7. Which data node sizes are available in Avamar 4.0?

Verify your understanding of the topics presented in Module 1 by answering the Check Your
Knowledge questions. Please take a moment to review them.
