Symantec


VERITAS Cluster Server for UNIX, Fundamentals (Lessons)


COURSE DEVELOPERS Bilge Gerrits Siobhan Seeger

LEAD SUBJECT MATTER EXPERTS

Pete Toemmes Brad Willer

TECHNICAL CONTRIBUTORS AND REVIEWERS

Geoff Bergren Margy Cassidy Tomer Gurantz Gene Henriksen Kleber Saldanha

Copyright © 2006 Symantec Corporation. All rights reserved. Symantec, the Symantec Logo, and VERITAS are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.

THIS PUBLICATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NONINFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID. SYMANTEC CORPORATION SHALL NOT BE LIABLE FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES IN CONNECTION WITH THE FURNISHING, PERFORMANCE, OR USE OF THIS PUBLICATION. THE INFORMATION CONTAINED HEREIN IS SUBJECT TO CHANGE WITHOUT NOTICE.

No part of the contents of this book may be reproduced or transmitted in any form or by any means without the written permission of the publisher.

VERITAS Cluster Server for UNIX, Fundamentals

Symantec Corporation 20330 Stevens Creek Blvd. Cupertino, CA 95014

http://www.symantec.com

Printed in Canada

Table of Contents

Course Introduction

VERITAS Cluster Server Curriculum Intro-2

Cluster Design Intro-4

Lab Design for the Course Intro-5

Lesson 1: High Availability Concepts

High Availability Concepts 1-3

Clustering Concepts 1-7

Clustering Prerequisites 1-14

Lesson 2: VCS Building Blocks

VCS Terminology 2-3

Cluster Communication 2-12

VCS Architecture 2-17

Lesson 3: Preparing a Site for VCS

Hardware Requirements and Recommendations 3-3

Software Requirements and Recommendations 3-5

Preparing Installation Information 3-8

Lesson 4: Installing VCS

Using the VERITAS Product Installer 4-3

VCS Configuration Files 4-7

Viewing the Default VCS Configuration 4-10

Other Installation Considerations 4-12

Lesson 5: VCS Operations

Managing Applications in a Cluster Environment 5-3

Common VCS Operations 5-5

Using the VCS Simulator 5-16

Lesson 6: VCS Configuration Methods

Starting and Stopping VCS 6-3

Overview of Configuration Methods 6-7

Online Configuration 6-9

Offline Configuration 6-16

Controlling Access to VCS 6-19

Lesson 7: Preparing Services for VCS

Preparing Applications for VCS 7-3

Performing One-Time Configuration Tasks 7-5

Testing the Application Service 7-10

Stopping and Migrating an Application Service 7-18

Lesson 8: Online Configuration

Online Service Group Configuration 8-3

Adding Resources 8-6

Solving Common Configuration Errors 8-15

Testing the Service Group 8-19


Copyright © 2006 Symantec Corporation. All rights reserved.

Lesson 9: Offline Configuration

Offline Configuration Procedures 9-3

Solving Offline Configuration Problems 9-13

Testing the Service Group 9-17

Lesson 10: Sharing Network Interfaces

Parallel Service Groups 10-3

Sharing Network Interfaces 10-7

Using Parallel Network Service Groups 10-11

Localizing Resource Attributes 10-14

Lesson 11: Configuring Notification

Notification Overview 11-3

Configuring Notification 11-6

Using Triggers for Notification 11-11

Lesson 12: Configuring VCS Response to Resource Faults

VCS Response to Resource Faults 12-3

Determining Failover Duration 12-9

Controlling Fault Behavior 12-13

Recovering from Resource Faults 12-17

Fault Notification and Event Handling 12-19

Lesson 13: Cluster Communications

VCS Communications Review 13-3

Cluster Membership 13-6

Cluster Interconnect Configuration 13-8

Joining the Cluster Membership 13-14

Changing the Interconnect Configuration 13-19

Lesson 14: System and Communication Faults

Ensuring Data Integrity 14-3

Cluster Interconnect Failures 14-6

Lesson 15: I/O Fencing

Data Protection Requirements 15-3

I/O Fencing Concepts and Components 15-8

I/O Fencing Operations 15-11

I/O Fencing Implementation 15-19

Configuring I/O Fencing 15-25

Stopping and Recovering Fenced Systems 15-28

Lesson 16: Troubleshooting

Monitoring VCS 16-3

Troubleshooting Guide 16-7

Archiving VCS-Related Files 16-9


Course Introduction


VERITAS Cluster Server Curriculum

Learning Path

VERITAS Cluster Server Curriculum

The VERITAS Cluster Server curriculum is a series of courses designed to provide a full range of expertise with VERITAS Cluster Server (VCS) high availability solutions, from design through disaster recovery.

VERITAS Cluster Server, Fundamentals

This course covers installation and configuration of common VCS configurations, focusing on two-node clusters running application and database services.

VERITAS Cluster Server, Implementing Local Clusters

This course focuses on multinode VCS clusters and advanced topics related to more complex cluster configurations.

High Availability Design and Customization Using VERITAS Cluster Server

This course enables participants to translate high availability requirements into a VCS design that can be deployed using VERITAS Cluster Server.

Disaster Recovery Using VVR and Global Cluster Option

This course covers cluster configurations across remote sites, including VERITAS Volume Replicator and the Global Cluster Option for wide-area clusters.


• Lesson 1: High Availability Concepts

• Lesson 2: VCS Building Blocks

• Lesson 3: Preparing a Site for VCS

• Lesson 4: Installing VCS

• Lesson 5: VCS Operations

• Lesson 6: VCS Configuration Methods

• Lesson 7: Preparing Services for VCS

• Lesson 8: Online Configuration

• Lesson 9: Offline Configuration

• Lesson 10: Sharing Network Interfaces

• Lesson 11: Configuring Notification

• Lesson 12: Configuring VCS Response to Faults

• Lesson 13: Cluster Communications

• Lesson 14: System and Communication Faults

• Lesson 15: I/O Fencing

• Lesson 16: Troubleshooting


Course Overview

This training provides comprehensive instruction on the installation and initial configuration of VERITAS Cluster Server (VCS). The course covers principles and methods that enable you to prepare, create, and test VCS service groups and resources using tools that best suit your needs and your high availability environment. You learn to configure and test failover and notification behavior, cluster additional applications, and further customize your cluster according to specified design criteria.



Sample Cluster Design Input

[Diagram: Web Service resource stack: Web Server, IP Address 192.168.3.132, Mount /web, Volume WebVol, NIC eri0, Disk Group WebDG]

Web Service behavior: Start up on system S1. Restart the Web server process 3 times before faulting it. Fail over to S2 if any resource faults. Notify patg@company.com if any resource faults.

Cluster Design

Sample Cluster Design Input

A VCS design can be presented in many different formats with varying levels of detail.

In some cases, you may have only the information about the application services that need to be clustered and the desired operational behavior in the cluster. For example, you may be told that the application service uses multiple network ports and requires local failover capability among those ports before it fails over to another system.

In other cases, you may have the information you need as a set of service dependency diagrams with notes on various aspects of the desired cluster operations.

If you receive design information that does not detail the resources, develop a detailed design worksheet before starting the deployment.

Using a design worksheet to document all aspects of your high availability environment helps ensure that you are well-prepared to start implementing your cluster design.

In this course, you are provided with a design worksheet showing sample values as a tool for implementing the cluster design in the lab exercises.

You can use a similar format to collect all the information you need before starting deployment at your site.
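As an illustration of where such a worksheet leads, the sample Web service design could translate into a VCS service group definition in the main.cf configuration file (covered later in the course) along the following lines. This is a sketch, not the course's lab solution: the resource names (WebDGRes, WebVolRes, and so on) are invented for illustration, and required attributes vary by platform and VCS version.

```
group WebSG (
        SystemList = { S1 = 0, S2 = 1 }
        AutoStartList = { S1 }
        )

        DiskGroup WebDGRes (
                DiskGroup = WebDG
                )

        Volume WebVolRes (
                Volume = WebVol
                DiskGroup = WebDG
                )

        Mount WebMountRes (
                MountPoint = "/web"
                BlockDevice = "/dev/vx/dsk/WebDG/WebVol"
                FSType = vxfs
                FsckOpt = "-y"
                )

        NIC WebNICRes (
                Device = eri0
                )

        IP WebIPRes (
                Device = eri0
                Address = "192.168.3.132"
                )

        WebIPRes requires WebNICRes
        WebMountRes requires WebVolRes
        WebVolRes requires WebDGRes
```

The behavioral requirements in the design map to other settings not shown here: the restart count to the resource type's RestartLimit attribute, and the e-mail notification to the notifier component covered in the Configuring Notification lesson.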



Lab Design for the Course

their_nameSG1

their_nameSG2

your_nameSG1 your_nameSG2

Lab Design for the Course

The diagram shows a conceptual view of the cluster design used as an example throughout this course and implemented in hands-on lab exercises.

Each aspect of the cluster configuration is described in greater detail, where applicable, in course lessons.

The cluster consists of:

• Two nodes
• Five high availability services: four failover service groups and one parallel network service group
• Fibre connections to SAN shared storage from each node through a switch
• Two private Ethernet interfaces for the cluster interconnect network
• Ethernet connections to the public network

Additional complexity is added to the design to illustrate certain aspects of cluster configuration in later lessons. The design diagram shows a conceptual view of the cluster design described in the worksheet.

Course Introduction

lntro-S

Copyright © 2006 Symantec Corporation. All rights reserved

Lab Naming Conventions


Service Group Definition    Sample Value
Service Group Name

Resource Definition         Sample Value
Resource Name               nameIP
Resource Type               IP
Required Attributes
    ResAttribute1           value
    ResAttribute2           value

• Substitute your name, or a nickname, wherever tables or instructions indicate name in labs.

• Following this convention:

- Simplifies lab instructions

- Helps prevent naming conflicts with your lab partner

Lab Naming Conventions

To simplify the labs, use your name or a nickname as a prefix for cluster objects created in the lab exercises. This includes Volume Manager objects, such as disk groups and volumes, as well as VCS service groups and resources.

Following this convention helps distinguish your objects when multiple students are working on systems in the same cluster and helps ensure that each student uses unique names. The lab exercises represent your name with the word name in italics. You substitute the name you select whenever you see the name placeholder in a lab step.


Software Location        Your Value
VCS installation dir
Lab files directory

• Use the classroom values provided by your instructor at the beginning of each lab exercise.
• Lab tables are provided in the lab appendixes to record these values.
• Your instructor may also hand out printed tables.
• If sample values are provided as guidelines, substitute the classroom-specific values provided by your instructor.

Classroom Values for Labs

Your instructor will provide the classroom-specific information you need to perform the lab exercises. You can record these values in your lab books using the tables provided, or your instructor may provide separate handouts showing the classroom values for your location.

In some lab exercises, sample values may be shown in tables as a guide to the types of values you must specify. Substitute the values provided by your instructor to ensure that your configuration is appropriate for your classroom.

If you are not sure of the configuration for your classroom, ask your instructor.


Typographic Conventions Used in This Course

The following tables describe the typographic conventions used in this course.

Typographic Conventions in Text and Commands

Convention: Courier New, bold
Element: Command input, both syntax and examples
Examples:
    To display the robot and drive configuration:
    tpconfig -d
    To display disk information:
    vxdisk -o alldgs list

Convention: Courier New, plain
Element: Command output; command names, directory names, file names, path names, user names, passwords, and URLs when used within regular text paragraphs
Examples:
    In the output:
    protocol_minimum: 40
    protocol_maximum: 60
    protocol_current: 0
    Locate the altnames directory.
    Go to http://www.symantec.com.
    Enter the value 300.
    Log on as user1.

Convention: Courier New, italic, bold or plain
Element: Variables in command syntax and examples; variables in command input are italic, plain, and variables in command output are italic, bold
Examples:
    To install the media server:
    /cdrom_directory/install
    To access a manual page:
    man command_name
    To display detailed information for a disk:
    vxdisk -g disk_group list disk_name

Typographic Conventions in Graphical User Interface Descriptions

Convention: Arrow
Element: Menu navigation paths
Examples: Select File->Save.

Convention: Initial capitalization
Element: Buttons, menus, windows, options, and other interface elements
Examples: Select the Next button. Open the Task Status window. Clear the checkmark from the Print File check box.

Convention: Quotation marks
Element: Interface elements with long names
Examples: Mark the "Include subvolumes in object view window" check box.


Lesson 1

High Availability Concepts


Lesson Topics and Objectives

Topic                                       Objective
High Availability Concepts                  Describe the merits of high availability in the data center environment.
Clustering Concepts                         Describe how clustering is used to implement high availability.
High Availability Application Services      Describe how applications are managed in a high availability environment.
Clustering Prerequisites                    Describe key requirements for a clustering environment.

Challenges in the Data Center

• Who is making changes? Am I in compliance? How do I track usage and align with the business?
• How can I automate mundane tasks? How do I maintain standards? How can I pool servers and decouple apps?
• How do I reduce planned and unplanned downtime? How do I meet my disaster recovery requirements? How do I track and deliver against SLAs?


High Availability Concepts

Challenges in the Data Center

Managing a data center presents many challenges, which can be roughly split into three categories:

• Visibility: Viewing and tracking the components in the data center
• Control: Managing these components
• Availability: Keeping critical business applications available

Availability is arguably the most important aspect of data center management. When critical business applications are offline, the loss of revenue and reputation can be devastating.


Causes of Downtime

[Chart: Prescheduled Downtime 30%; Software 40%; People 1; Environment 5%; Client <1%; LAN/WAN Equipment <1%]

Causes of Downtime

Downtime is defined as the period of time in which a user is unable to perform tasks in an efficient and timely manner due to poor system performance or system failure.

The data in the graph shows reasons for downtime from a study published by the Institute of Electrical and Electronics Engineers (IEEE). It shows that hardware failures are the cause of only about 10 percent of total system downtime.

As much as 30 percent of all downtime is prescheduled, and most of this time is required due to the lack of system tools to enable online administration of systems. Another 40 percent of downtime is due to software errors. Some of these errors are as simple as a database running out of space on disk and stopping its operations as a result.

Downtime can be more generally classified as either planned or unplanned.

Examples of unplanned downtime include events such as server damage or application failure.

Examples of planned downtime include times when the system is shut down to add additional hardware, upgrade the operating system, rearrange or repartition disk space, or clean up log files and memory.

With an effective HA strategy, you can significantly reduce the amount of planned downtime. With planned hardware or software maintenance, a high availability product can enable manual failover while upgrade or hardware work is performed.


Costs of Downtime

Actual unplanned downtime per month:
• Hours: 9
• Cost per hour: $106K to $183K
• Total cost: $954,000 to $1,647,000

Goal for monthly unplanned downtime:
• Hours: 3
• Cost savings: $636,000 to $1,098,000

Gartner User Survey: High Availability and Mission Critical

Costs of Downtime

A Gartner study shows that large companies experienced a loss of between $954,000 and $1,647,000 (USD) per month for nine hours of unplanned downtime.

In addition to the monetary loss, downtime also results in loss of business opportunities and reputation.

Planned downtime is almost as costly as unplanned. Planned downtime can be significantly reduced by migrating a service to another server while maintenance is performed.

Given the magnitude of the cost of downtime, the case for implementing a high availability solution is clear.
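The ranges quoted above follow directly from the hours and hourly rates in the survey; the short calculation below (plain awk, nothing VCS-specific) reproduces them.

```shell
# Reproduce the survey arithmetic: 9 hours/month of unplanned downtime
# at $106K-$183K per hour, and the savings from cutting downtime to
# 3 hours (6 hours saved).
awk 'BEGIN {
    low = 106000; high = 183000
    printf "monthly cost:    $%d to $%d\n", 9 * low, 9 * high
    printf "monthly savings: $%d to $%d\n", 6 * low, 6 * high
}'
```

This prints the $954,000 to $1,647,000 cost range and the $636,000 to $1,098,000 savings range shown on the slide.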

Lesson 1 High Availability Concepts

1-5

Copyright © 2006 Symanlec Corporation. All rights reserved.

Levels of Availability

[Diagram, from foundation to top: backup (NetBackup); data availability (VxVM/VxFS); local clustering (VCS); remote replication (VVR); remote clustering (GCO)]

Levels of Availability

Data centers may implement different levels of availability depending on their requirements:

• Backup: At minimum, all data needs to be protected using an effective backup solution, such as VERITAS NetBackup.
• Data availability: Local mirroring provides real-time data availability within the local data center. Point-in-time copy solutions protect against corruption. Online configuration keeps data available to applications while storage is expanded to accommodate growth.
• Local clustering: The next level of protection is a clustering solution, such as VERITAS Cluster Server (VCS), for application and server availability.
• Remote replication: After implementing local availability, you can further ensure data availability in the event of a site failure by replicating data to a remote site. Replication can be application-, host-, or array-based.
• Remote clustering: Implementing remote clustering ensures that the applications and data can be started at a remote site. The VCS Global Cluster Option supports remote clustering with automatic site failover capability.


Types of Clusters

• Cluster is a broadly-used term: - High availability (HA) clusters

- Parallel processing clusters

- Load balancing clusters

- High performance computing clusters

- Fault tolerant clusters

• VCS is primarily an HA cluster with support for:

- Parallel processing applications, such as Oracle RAC

- Application workload balancing

Clustering Concepts

The term cluster refers to multiple independent systems connected into a management framework.

Types of Clusters


A variety of clustering solutions are available for various computing purposes.

• HA clusters: Provide resource monitoring and automatic startup and failover
• Parallel processing clusters: Break large computational programs into smaller tasks executed in parallel on multiple systems
• Load balancing clusters: Monitor system load and distribute applications automatically among systems according to specified criteria
• High performance computing clusters: Use a collection of computing resources to enhance application performance
• Fault-tolerant clusters: Provide uninterrupted application availability

Fault tolerance guarantees 99.9999 percent availability, or approximately 30 seconds of downtime per year. Six 9s (99.9999 percent) availability is appealing, but the costs of this solution are well beyond the affordability of most companies. In contrast, high availability solutions can achieve five 9s (99.999 percent availability, less than five minutes of downtime per year) at a fraction of the cost.
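The downtime figures behind "five 9s" and "six 9s" follow directly from the unavailability fraction multiplied by the minutes in a year, as this small awk calculation shows:

```shell
# Downtime per year implied by an availability percentage:
# (1 - availability) x minutes per year (using a 365.25-day year).
for a in 99.999 99.9999; do
    awk -v a="$a" 'BEGIN {
        printf "%s%% -> %.1f minutes of downtime per year\n",
               a, 365.25 * 24 * 60 * (1 - a / 100)
    }'
done
```

Five 9s comes to roughly 5.3 minutes of downtime per year, and six 9s to roughly half a minute, matching the figures above.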

The focus of this course is VERITAS Cluster Server, which is primarily used for high availability, although it also provides some support for parallel processing and load balancing.


Local Cluster Configurations

[Diagram: server utilization in active/passive, active/active, N-to-1, N+1, and N-to-N configurations]

Local Cluster Configurations

Depending on the clustering solution you deploy, you may be able to implement a variety of configurations, enabling you to deploy your clustering solution to best suit your HA requirements and utilize existing hardware.

Active/Passive

In this configuration, an application runs on a primary or master server, and a dedicated redundant server is present to take over if a failover occurs.

Active/Active

In this configuration, each server is configured to run specific applications or services, and essentially provides redundancy for its peer.

N-to-1

In this configuration, the applications fail over to the spare when a system crashes. When the server is repaired, applications must be moved back to their original systems.

N+1

Similar to N-to-1, the applications restart on the spare after a failure. Unlike the N-to-1 configuration, after the failed server is repaired, it can become the redundant server.

N-to-N

This configuration is an active/active configuration that supports multiple application services running on multiple servers. Each application service is capable of being failed over to different servers in the cluster.


Global Cluster Configurations

[Diagram: two sites connected by replication between heterogeneous storage arrays, such as IBM and NetApp]

Campus and Global Cluster Configurations

Cluster configurations that enable data to be duplicated among multiple physical sites protect against site-wide failures.

Campus Clusters

The campus or stretch cluster environment is a single cluster stretched over multiple locations, connected by an Ethernet subnet for the cluster interconnect and a Fibre Channel SAN, with storage mirrored at each location.

Advantages of this configuration are:

• It provides local high availability within each site, as well as protection against site failure.
• It is a cost-effective solution; replication is not required.
• Recovery time is short.
• The data center can be expanded.
• You can leverage existing infrastructure.

Global Clusters

Global clusters, or wide-area clusters, contain multiple clusters in different geographical locations. Global clusters protect against site failures by providing data replication and application failover to remote data centers.

Global clusters are not limited by distance because cluster communication uses TCP/IP. Replication can be provided by hardware vendors or by a software solution, such as VERITAS Volume Replicator, for heterogeneous array support.


HA Application Services

Ii Collection of all hardware and software components required to provide a service

.. All components moved together

• Components started, stopped in order

• Examples: Web servers, databases, and applications

HA Application Services

• Application requires:
  - Database
  - IP address
• Database requires file systems
• File systems require volumes
• Volumes require disk groups

An application service is a collection of hardware and software components required to provide a service, such as a Web site that end users access by connecting to a particular network IP address or host name. Each application service typically requires components of the following three types:

• Application binaries (executables)
• Network
• Storage

If an application service needs to be switched to another system, all of the components of the application service must migrate together to re-create the service on another system.

These are the same components that the administrator must manually move from a failed server to a working server to keep the service available to clients in a nonclustered environment.

Application service examples include:

• A Web service, consisting of a Web server program, IP addresses, associated network interfaces used to allow access into the Web site, a file system containing Web data files, and a volume and disk group containing the file system
• A database service, consisting of one or more IP addresses, database management software, a file system containing data files, a volume and disk group on which the file system resides, and a NIC for network access


Local Application Service Failover

Local Application Service Failover

Cluster management software performs a series of tasks to enable clients to access a service on another server in the event of a failure. The software must:

• Ensure that data stored on the disk is available to the new server, if shared storage is configured (Storage).
• Move the IP address of the old server to the new server (Network).
• Start up the application on the new server (Application).

The process of stopping the application service on one system and starting it on another system in response to a fault is referred to as a failover.
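These steps can be pictured as the commands an administrator would otherwise run by hand on the takeover server. The sketch below is a dry run: the run helper only prints each command, and the disk group, volume, mount point, interface, and start script names are hypothetical (the ifconfig addif form shown is Solaris-style).

```shell
# Manual equivalent of a failover, shown as a dry run.
# Object names (webdg, webvol, /web, eri0, the start script) are
# hypothetical; "run" prints each command instead of executing it.
run() { echo "+ $*"; }

# Storage: make the shared data available to this server
run vxdg import webdg
run mount /dev/vx/dsk/webdg/webvol /web

# Network: move the service IP address to this server
run ifconfig eri0 addif 192.168.3.132 up

# Application: start the service
run /opt/web/bin/webserver start
```

VCS automates exactly this ordering: storage first, then the network identity, then the application, and the reverse when taking the service offline.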


Local and Global Failover

Local and Global Failover

In a global cluster environment, the application services are generally highly available within a local cluster, so faults are first handled by the HA software, which performs a local failover.

When HA methods such as replication and clustering are implemented across geographical locations, recovery procedures are started immediately at a remote location when a disaster takes down a site.


Application Requirements for Clustering

Start/restart       Restarted to a known state after a failure
Stop                Stopped using a defined procedure
Clean               Cleaned up after operational failures
Monitor             Monitored periodically
Node-independent    Not tied to a particular host due to licensing constraints or host name dependencies

Application Requirements for Clustering

The most important requirements for an application to run in a cluster are crash tolerance and host independence. This means that the application should be able to recover after a crash to a known state, in a predictable and reasonable time, on two or more hosts.

Most commercial applications today satisfy this requirement. More specifically, an application is considered well-behaved and can be controlled by clustering software if it meets the requirements shown in the slide.
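As an example of the monitor requirement, VCS agents periodically run a monitor entry point that reports resource state through its exit status; for the Application agent's MonitorProgram, the documented convention is exit 110 for online and 100 for offline. The sketch below applies that convention to a hypothetical daemon named webserverd.

```shell
# Monitor sketch following the VCS MonitorProgram convention:
# exit 110 = resource online, 100 = resource offline.
# "webserverd" is a hypothetical daemon name.
monitor_app() {
    if pgrep -x webserverd >/dev/null 2>&1; then
        return 110      # process found: report online
    else
        return 100      # process not found: report offline
    fi
}

monitor_app
echo "monitor exit status: $?"
```

On a system where no webserverd process exists, the monitor reports 100 (offline). A well-behaved application is one for which such a check, plus defined start, stop, and clean procedures, can be written reliably.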



Hardware and Infrastructure Redundancy

[Diagram: two independent links from each server to storage]

Clustering Prerequisites

Hardware and Infrastructure Redundancy

All failovers cause some type of client disruption. Depending on your configuration, some applications take longer to fail over than others. For this reason, good design dictates that the HA software first try to fail over within the system, using agents that monitor local resources.

Design as much resiliency as possible into the individual servers and components so that you do not have to rely on any hardware or software to cover a poorly configured system or application. Likewise, try to use all resources to make individual servers as reliable as possible.

Single Point of Failure Analysis

Determine whether any single points of failure exist in the hardware, software, and infrastructure components within the cluster environment.

Any single point of failure becomes the weakest link of the cluster. The application is equally inaccessible if a client network connection fails, or if a server fails.

Also consider the location of redundant components. Having redundant hardware equipment in the same location is not as effective as placing the redundant component in a separate location.

In some cases, the cost of redundant components outweighs the risk that the component will become the cause of an outage. For example, buying an additional expensive storage array may not be practical. Decisions about balancing cost versus availability need to be made according to your availability requirements.


External Dependencies

• Avoid dependence on services outside the cluster, where possible.
• Ensure redundancy of external services, if required.

[Diagram: System A and System B]

External Dependencies

Whenever possible, it is good practice to eliminate or reduce reliance by high availability applications on external services. If it is not possible to avoid outside dependencies, ensure that those services are also highly available.

For example, network name and information services, such as DNS (Domain Name System) and NIS (Network Information Service), are designed with redundant capabilities.



Lesson Summary

• Key Points
  - Clustering is used to make business-critical applications highly available.
  - Local and global clusters can be used together to provide disaster recovery for data center sites.
• Reference Materials
  - High Availability Design and Customization Using VERITAS Cluster Server course
  - VERITAS High Availability Fundamentals Web-based training

High Availability References

Use these references as resources for building a complete understanding of high availability environments within your organization.

The Resilient Enterprise: Recovering Information Services from Disasters

This book explains the nature of disasters and their impacts on enterprises, organizing and training recovery teams, acquiring and provisioning recovery sites, and responding to disasters.

Blueprints for High Availability: Designing Resilient Distributed Systems

This book provides a step-by-step guide for building systems and networks with high availability, resiliency, and predictability.

High Availability Design, Techniques, and Processes

This guide describes how to create systems that are easier to maintain, and defines ongoing availability strategies that account for business change.

Designing Storage Area Networks

The text offers practical guidelines for using diverse SAN technologies to solve existing networking problems in large-scale corporate networks. With this book, you learn how the technologies work and how to organize their components into an effective, scalable design.

Storage Area Network Essentials: A Complete Guide to Understanding and Implementing SANs (VERITAS Series)

This book identifies the properties, architectural concepts, technologies, benefits, and pitfalls of storage area networks (SANs).


Lesson 2

VCS Building Blocks

• Lesson 5: VCS Operations

• Lesson 6: VCS Configuration Methods

• Lesson 7: Preparing Services for VCS

• Lesson 8: Online Configuration

• Lesson 9: Offline Configuration

• Lesson 10: Sharing Network Interfaces

• Lesson 11: Configuring Notification

• Lesson 12: Configuring VCS Response to Faults

• Lesson 13: Cluster Communications

• Lesson 14: System and Communication Faults

• Lesson 15: I/O Fencing

• Lesson 16: Troubleshooting

Lesson Topics and Objectives

VCS Terminology: Define VCS terminology.

Cluster Communication: Describe VCS cluster communication mechanisms.

VCS Architecture: Describe the VCS architecture.


VCS Cluster


VCS clusters consist of:

• Up to 32 systems (nodes)

• An interconnect for cluster communication

• A public network for client connections

• Shared storage accessible by each system


VCS Terminology

VCS Cluster

A VCS cluster is a collection of independent systems working together under the VCS management framework for increased service availability.

VCS clusters have the following components:

Up to 32 systems, sometimes referred to as nodes or servers. Each system runs its own operating system.

A cluster interconnect, which allows for cluster communications

A public network, connecting each system in the cluster to a LAN for client access

Shared storage (optional), accessible by each system in the cluster that needs to run the application



Service Groups

A service group is a container that enables VCS to manage an application service as a unit.

A service group is defined by:

• Resources: Components required to provide the service

• Dependencies: Relationships between components

• Attributes: Behaviors for startup and failure conditions

Service Groups

A service group is a virtual container that enables VCS to manage an application service as a unit. The service group contains all the hardware and software components required to run the service. The service group enables VCS to coordinate failover of the application service resources in the event of failure or at the administrator's request.

A service group is defined by these attributes:

The cluster-wide unique name of the group

The list of the resources in the service group, usually determined by which resources are needed to run a specific application service

The dependency relationships between the resources

The list of cluster systems on which the group is allowed to run

The list of cluster systems on which you want the group to start automatically

2-4

VERITAS Cluster Server for UNIX, Fundamentals

Copyright © 2006 Syrnantec Corporation. All rights reserved

Service Group Types • Failover

- Online on only one cluster system at a time

- Most common type

• Parallel

- Online on multiple cluster systems simultaneously

- Example: Oracle Real Application Cluster (RAC)

• Hybrid

Special-purpose service group used in replicated data clusters (RDCs) using VERITAS Volume Replicator


Service Group Types

Service groups can be one of three types:

Failover

This service group runs on one system at a time in the cluster. Most application services, such as database and NFS servers, use this type of group.

Parallel

This service group runs simultaneously on more than one system in the cluster. This type of service group requires an application that can be started on more than one system at a time without threat of data corruption.

Hybrid (4.x and later)

A hybrid service group is a combination of a failover service group and a parallel service group, used in VCS 4.x (and later) replicated data clusters (RDCs), which use replication between systems at different sites instead of shared storage. This service group behaves as a failover group within a defined set of systems, and as a parallel group within a different set of systems. RDC configurations are described in the High Availability Using VERITAS Cluster Server for UNIX, Implementing Remote Clusters course.
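The placement rules for the failover and parallel group types can be sketched as follows. This is a hypothetical illustration, not VCS code; the function name, arguments, and system names are invented for the example.

```python
# Hypothetical sketch (not VCS code): how the failover and parallel
# group types constrain where a service group may be online at once.

def can_bring_online(group_type, online_systems, target_system):
    """Return True if the group may be brought online on target_system.

    group_type: "failover" or "parallel"
    online_systems: set of systems where the group is already online
    """
    if target_system in online_systems:
        return False                     # already online there
    if group_type == "failover":
        return len(online_systems) == 0  # online on one system at a time
    return True                          # parallel: any number of systems

# A failover group online on sys1 cannot also start on sys2;
# a parallel group can.
print(can_bring_online("failover", {"sys1"}, "sys2"))  # False
print(can_bring_online("parallel", {"sys1"}, "sys2"))  # True
```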


Resources

VCS resources:

• Correspond to the hardware or software components of

an application service

• Have unique names throughout the cluster

• Are always contained within service groups

• Are categorized as:

Persistent: Always on

- Nonpersistent: Turned on and off

Recommendation: Choose names that reflect the service group name to easily identify all resources in that group; for example, WebIP in the WebSG group.

Resources

Resources are VCS objects that correspond to hardware or software components, such as the application, the networking components, and the storage components.

VCS controls resources through these actions:

Bringing a resource online (starting)
Taking a resource offline (stopping)
Monitoring a resource (probing)

Resource Categories

Persistent, also known as None

VCS can only monitor persistent resources; these resources cannot be brought online or taken offline. The most common example of a persistent resource is a network interface card (NIC), because it must be present but cannot be stopped. FileNone and ElifNone are other examples.

On-only

VCS brings the resource online if required but does not stop the resource if the associated service group is taken offline. ProcessOnOnly is a resource type used to start, but not stop, a process such as a daemon, for example.

Nonpersistent, also known as on-off

Most resources fall into this category, meaning that VCS brings them online and takes them offline as required. Examples are Mount, IP, and Process. FileOnOff is an example of a test version of this resource category.



Resource Dependencies

Resource dependencies:

• Determine online and offline order

• Have parent/child relationships; parent depends on child

• Cannot be cyclical

[Diagram: the Mount resource (parent) depends on the Volume resource (child), which depends on the DiskGroup resource. Resources are brought online from the bottom up (child first) and taken offline from the top down (parent first).]

Resource Dependencies

Resources depend on other resources because of application or operating system requirements. Dependencies are defined to configure VCS for these requirements.

Dependency Rules

These rules apply to resource dependencies:

A parent resource depends on a child resource. In the diagram, the Mount resource (parent) depends on the Volume resource (child). This dependency illustrates the operating system requirement that a file system cannot be mounted without the Volume resource being available.

Dependencies are homogenous. Resources can only depend on other resources.

• No cyclical dependencies are allowed. There must be a clearly defined starting point.
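Because dependencies are acyclic, a valid online order can always be derived by visiting children before parents. The sketch below is a hypothetical illustration of that ordering logic, not VCS code; the function and resource names are invented for the example.

```python
# Hypothetical sketch (not VCS code): deriving online order from
# parent/child resource dependencies. A parent depends on its child,
# so children come online first; offline order is the reverse.
# Assumes no cyclic dependencies, which VCS does not allow anyway.

def online_order(depends_on):
    """Topologically sort resources so each child precedes its parents.

    depends_on: dict mapping a parent resource to the list of child
    resources it requires.
    """
    order, seen = [], set()

    def visit(res):
        if res in seen:
            return
        seen.add(res)
        for child in depends_on.get(res, []):
            visit(child)      # children first
        order.append(res)

    for res in depends_on:
        visit(res)
    return order

# The chain from the diagram: Mount depends on Volume, Volume on DiskGroup.
deps = {"Mount": ["Volume"], "Volume": ["DiskGroup"]}
print(online_order(deps))           # ['DiskGroup', 'Volume', 'Mount']
print(online_order(deps)[::-1])     # offline order: parents first
```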



Resource Attributes

Resource attributes:

• Define individual resource properties

• Are used by VCS to manage the resource

• Can be required or optional

• Have values that match actual components

Resource Attributes

Resource attributes define the specific characteristics of individual resources. As shown in the slide, the resource attribute values for the sample resource of type Mount correspond to the UNIX command line to mount a specific file system. VCS uses the attribute values to run the appropriate command or system call to perform an operation on the resource.

Each resource has a set of required attributes that must be defined in order to enable VCS to manage the resource.

For example, the Mount resource on Solaris has four required attributes that must be defined for each resource of type Mount:

The directory of the mount point (MountPoint)
The device for the mount point (BlockDevice)
The type of file system (FSType)
The options for the fsck command (FsckOpt)

The first three attributes are the values used to build the UNIX mount command shown in the slide. The FsckOpt attribute is used if the mount command fails. In this case, VCS runs fsck with the specified options (-y, which means answer yes to all fsck questions) and attempts to mount the file system again.

Some resources also have additional optional attributes you can define to control how VCS manages a resource. In the Mount resource example, MountOpt is an optional attribute you can use to define options to the UNIX mount command. For example, if this is a read-only file system, you can specify a read-only option as the MountOpt value.
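The way attribute values plug into the mount syntax can be sketched as follows. This is a hypothetical illustration, not the real Mount agent: the attribute names match the Mount resource type described above, but the function and the exact option handling are invented for the example.

```python
# Hypothetical sketch (not the real Mount agent): assembling a UNIX
# mount command line from Mount resource attribute values, the way the
# text describes VCS combining attribute values with the type's syntax.

def build_mount_command(attrs):
    """Fill the template: mount [-F FSType] [options] block_device mount_point."""
    cmd = ["mount", "-F", attrs["FSType"]]
    if attrs.get("MountOpt"):                 # optional attribute
        cmd += ["-o", attrs["MountOpt"]]
    cmd += [attrs["BlockDevice"], attrs["MountPoint"]]
    return " ".join(cmd)

web_mount = {
    "MountPoint": "/Web",
    "BlockDevice": "/dev/vx/dsk/WebDG/WebVol",
    "FSType": "vxfs",
    "FsckOpt": "-y",                          # used only if the mount fails
}
print(build_mount_command(web_mount))
# mount -F vxfs /dev/vx/dsk/WebDG/WebVol /Web
```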


mount [-F FSType] [options] block_device mount_point

Resource Types

Resource types:

• Are classifications of resources

• Specify the attributes needed to define a resource

• Are templates for defining resource instances

[Screenshot: a resource type listing showing attributes of the Mount type (MountPoint, BlockDevice, FSType, MountOpt, FsckOpt), with MonitorInterval = 60 and Operations = OnOff, among other bundled types such as NFSLock, Phantom, Proxy, Share, and Zone.]

Resource Types and Type Attributes

Resources are classified by resource type. For example, disk groups, network interface cards (NICs), IP addresses, mount points, and databases are distinct types of resources. VCS provides a set of predefined resource types (some bundled, some add-ons), in addition to the ability to create new resource types.

Individual resources are instances of a resource type. For example, you may have several IP addresses under VCS control. Each of these IP addresses individually is a single resource of resource type IP.

A resource type can be thought of as a template that defines the characteristics or attributes needed to define an individual resource (instance) of that type.

You can view the relationship between resources and resource types by comparing the mount command for a resource on the previous slide with the mount syntax on this slide. The resource type defines the syntax for the mount command. The resource attributes fill in the values to form an actual command line.


Agents: How VCS Controls Resources Each resource type has a corresponding agent that manages all resources of that type.

• Agents have one or more entry points.

• Entry points perform set actions on resources.

• Each system runs one agent for each active resource type.


Agents: How VCS Controls Resources

Agents are processes that control resources. Each resource type has a corresponding agent that manages all resources of that resource type. Each cluster system runs only one agent process for each active resource type, no matter how many individual resources of that type are in use.

Agents control resources using a defined set of actions, also called entry points. The four entry points common to most agents are:

Online: Resource startup

Offline: Resource shutdown

Monitor: Probing the resource to retrieve status

Clean: Killing the resource or cleaning up as necessary when a resource fails to be taken offline gracefully

The difference between offline and clean is that offline is an orderly termination and clean is a forced termination. In UNIX, this can be thought of as the difference between exiting an application and sending the kill -9 signal to the process.

Each resource type needs a different way to be controlled. To accomplish this, each agent has a set of predefined entry points that specify how to perform each of the four actions. For example, the startup entry point of the Mount agent mounts a block device on a directory, whereas the startup entry point of the IP agent uses the ifconfig command to set the IP address on a unique IP alias on the network interface.

VCS provides both predefined agents and the ability to create custom agents.
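The agent model described above (one agent per resource type, four entry points per agent) can be sketched as follows. This is a hypothetical illustration, not a real VCS agent; the class names and the simplified in-memory state are invented for the example.

```python
# Hypothetical sketch (not a real VCS agent): one agent per resource
# type, each defining the four common entry points. Offline is an
# orderly stop; clean is a forced stop after a failed offline.

class Agent:
    """Base agent: entry points acting on one resource of its type."""
    def online(self, res):  raise NotImplementedError   # start resource
    def offline(self, res): raise NotImplementedError   # orderly stop
    def monitor(self, res): raise NotImplementedError   # probe status
    def clean(self, res):   raise NotImplementedError   # forced stop

class MountAgent(Agent):
    def __init__(self):
        self.mounted = set()             # stand-in for real mount state
    def online(self, res):
        self.mounted.add(res)            # would run: mount ...
    def offline(self, res):
        self.mounted.discard(res)        # would run: umount ...
    def monitor(self, res):
        return res in self.mounted       # True = online, False = offline
    def clean(self, res):
        self.mounted.discard(res)        # would force the unmount

agent = MountAgent()                     # one agent process per type
agent.online("WebMount")
print(agent.monitor("WebMount"))         # True (online)
agent.offline("WebMount")
print(agent.monitor("WebMount"))         # False (offline)
```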


VERITAS Cluster Server Bundled Agents Reference Guide

• Defines all VCS resource types for all bundled agents

• Includes all supported UNIX platforms

• Downloadable from http://support.veritas.com

VERITAS Cluster Server Bundled Agents Reference Guide

The VERITAS Cluster Server Bundled Agents Reference Guide describes the agents that are provided with VCS and defines the required and optional attributes for each associated resource type.

VERITAS also provides additional application and database agents in an Agent Pack that is updated quarterly. Some examples of these agents are:

Oracle
NetBackup
Informix
iPlanet

Select the Agents and Options link on the VERITAS Cluster Server page at www.veritas.com for a complete list of agents available for VCS.

To obtain PDF versions of product documentation for VCS and agents, see the Support Web site at http://support.veritas.com.


Cluster Communication

The cluster interconnect provides a communication channel between nodes.

The interconnect:

• Determines which nodes are affiliated by cluster ID

• Uses a heartbeat mechanism

• Maintains cluster membership:

A single view of the state of each cluster node

• Is also referred to as the private network

Cluster Communication

VCS requires a cluster communication channel between systems in a cluster to serve as the cluster interconnect. This communication channel is also sometimes referred to as the private network because it is often implemented using a dedicated Ethernet network.

VERITAS recommends that you use a minimum of two dedicated communication channels with separate infrastructures (for example, multiple NICs and separate network hubs) to implement a highly available cluster interconnect.

The cluster interconnect has two primary purposes:

Determine cluster membership: Membership in a cluster is determined by systems sending and receiving heartbeats (signals) on the cluster interconnect. This enables VCS to determine which systems are active members of the cluster and which systems are joining or leaving the cluster.

In order to take corrective action on node failure, surviving members must agree when a node has departed. This membership needs to be accurate and coordinated among active members; nodes can be rebooted, powered off, faulted, and added to the cluster at any time.

Maintain a distributed configuration: Cluster configuration and status information for every resource and service group in the cluster is distributed dynamically to all systems in the cluster.

Cluster communication is handled by the Group Membership Services/Atomic Broadcast (GAB) mechanism and the Low Latency Transport (LLT) protocol, as described in the next sections.


LLT:

• Sends heartbeat messages

• Transports cluster communication traffic

• Balances traffic load across multiple network links

• Is a proprietary protocol

• Runs on an Ethernet network

Low-Latency Transport (LLT)

LLT is a high-performance, low-latency protocol for cluster communication.

Low-Latency Transport

Clustering technologies from Symantec use a high-performance, low-latency protocol for communications. LLT is designed for the high-bandwidth and low-latency needs of not only VERITAS Cluster Server, but also VERITAS Cluster File System, in addition to Oracle Cache Fusion traffic in Oracle RAC configurations.

LLT runs directly on top of the Data Link Provider Interface (DLPI) layer over Ethernet and has several major functions:

Sending and receiving heartbeats over network links

Monitoring and transporting network traffic over multiple network links to every active system

Balancing the cluster communication load over multiple links

Maintaining the state of communication

Providing a transport mechanism for cluster communications


Group Membership Services/Atomic Broadcast (GAB)

GAB is a proprietary broadcast protocol that uses LLT as its transport mechanism.

GAB:

• Manages cluster membership-GAB membership

• Is a proprietary broadcast protocol

• Sends and receives configuration information

• Uses the LLT transport mechanism

Group Membership Services/Atomic Broadcast (GAB)

GAB provides the following:

Group Membership Services: GAB maintains the overall cluster membership by way of its group membership services function. Cluster membership is determined by tracking the heartbeat messages sent and received by LLT on all systems in the cluster over the cluster interconnect.

GAB messages determine whether a system is an active member of the cluster, joining the cluster, or leaving the cluster. If a system stops sending heartbeats, GAB determines that the system has departed the cluster.

Atomic Broadcast: Cluster configuration and status information are distributed dynamically to all systems in the cluster using GAB's atomic broadcast feature. Atomic broadcast ensures that all active systems receive all messages for every resource and service group in the cluster.
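The membership decision described above, declaring a system departed when its heartbeats stop arriving, can be sketched as follows. This is a hypothetical illustration, not GAB or LLT code; the function and the timeout value are invented for the example.

```python
# Hypothetical sketch (not GAB/LLT code): deciding cluster membership
# from the time each system's last heartbeat was received. A system
# whose heartbeats stop is declared departed after a timeout.

PEER_TIMEOUT = 21.0   # seconds; an assumed value for illustration only

def membership(last_heartbeat, now, timeout=PEER_TIMEOUT):
    """Return the set of systems considered active cluster members.

    last_heartbeat: dict mapping system name -> timestamp (seconds)
    of the last heartbeat received from that system.
    """
    return {sys for sys, t in last_heartbeat.items() if now - t <= timeout}

# sys3 stopped heartbeating at t=70; by t=105 it has exceeded the
# timeout and is no longer considered a member.
beats = {"sys1": 100.0, "sys2": 99.5, "sys3": 70.0}
print(sorted(membership(beats, now=105.0)))   # ['sys1', 'sys2']
```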


I/O Fencing

I/O fencing is a mechanism to prevent uncoordinated access to shared storage.

I/O fencing:

• Monitors GAB for cluster membership changes

• Prevents simultaneous access to shared storage (fences off nodes)

• Is implemented as a kernel driver

• Coordinates with Volume Manager

• Requires hardware with SCSI-3 PR support

The Fencing Driver

The fencing driver prevents multiple systems from accessing the same Volume Manager-controlled shared storage devices in the event that the cluster interconnect is severed. In the example of a two-node cluster displayed in the diagram, if the cluster interconnect fails, each system stops receiving heartbeats from the other system.

GAB on each system determines that the other system has failed and passes the cluster membership change to the fencing module.

The fencing modules on both systems contend for control of the disks according to an internal algorithm. The losing system is forced to panic and reboot. The winning system is now the only member of the cluster, and it fences off the shared data disks so that only systems that are still part of the cluster membership (only one system in this example) can access the shared storage.

The winning system takes corrective action as specified within the cluster configuration, such as bringing service groups online that were previously running on the losing system.
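The contention described above can be sketched as a race for a majority of coordinator disks. This is a hypothetical illustration, not the fencing driver's actual algorithm; the function, disk names, and arrival-order model are invented for the example.

```python
# Hypothetical sketch (not the fencing driver): after an interconnect
# split, each side races to claim coordinator disks. With an odd number
# of disks, exactly one side wins a majority; the loser panics.

def fencing_race(coordinator_disks, grab_order):
    """Return the winning node of the coordinator-disk race.

    coordinator_disks: odd-length list of disk names.
    grab_order: list of (node, disk) claim attempts in arrival order;
    the first node to reach a disk owns it.
    """
    owner = {}
    for node, disk in grab_order:
        owner.setdefault(disk, node)     # first claim on a disk wins it
    counts = {}
    for disk in coordinator_disks:
        counts[owner[disk]] = counts.get(owner[disk], 0) + 1
    # a majority of an odd disk count guarantees a single winner
    return max(counts, key=counts.get)

disks = ["c1", "c2", "c3"]
race = [("sysA", "c1"), ("sysB", "c1"), ("sysB", "c2"), ("sysA", "c3")]
print(fencing_race(disks, race))         # sysA: owns 2 of 3 disks
```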


High Availability Daemon (HAD)

HAD is the VCS engine, which manages all resources and tracks all configuration and state changes.

HAD:

• Runs on each cluster node

• Maintains resource configuration and state information

• Manages agents and service groups

• Is monitored by the hashadow daemon


The High Availability Daemon

The VCS engine, also referred to as the high availability daemon (had), is the primary VCS process running on each cluster system.

HAD tracks all changes in cluster configuration and resource status by communicating with GAB. HAD manages all application services (by way of agents) whether the cluster has one or many systems.

Building on the knowledge that the agents manage individual resources, you can think of HAD as the manager of the agents. HAD uses the agents to monitor the status of all resources on all nodes.

This modularity between had and the agents allows for efficiency of roles:

HAD does not need to know how to start up Oracle or any other applications that can come under VCS control.

Similarly, the agents do not need to make cluster-wide decisions.

This modularity allows a new application to come under VCS control simply by adding a new agent; no changes to the VCS engine are required.

On each active cluster system, HAD updates all the other cluster systems with changes to the configuration or status.

In order to ensure that the had daemon is highly available, a companion daemon, hashadow, monitors had, and if had fails, hashadow attempts to restart had. Likewise, had restarts hashadow if hashadow stops.


Maintaining the Cluster Configuration

• HAD maintains the cluster configuration in memory on each node.

• Configuration changes are broadcast by HAD to all systems.

The configuration is preserved on disk (main.cf).

VCS Architecture

Maintaining the Cluster Configuration

HAD maintains configuration and state information for all cluster resources in memory on each cluster system. Cluster state refers to tracking the status of all resources and service groups in the cluster. When any change to the cluster configuration occurs, such as the addition of a resource to a service group, HAD on the initiating system sends a message to HAD on each member of the cluster by way of GAB atomic broadcast, to ensure that each system has an identical view of the cluster.

Atomic means that all systems receive updates, or all systems are rolled back to the previous state, much like a database atomic commit.

The cluster configuration in memory is created from the main.cf file on disk in the case where HAD is not currently running on any cluster system, so there is no configuration in memory. When you start VCS on the first cluster system, HAD builds the configuration in memory on that system from the main.cf file. Changes to a running configuration (in memory) are saved to disk in main.cf when certain operations occur. These procedures are described in more detail later in the course.


VCS Configuration Files

/etc/VRTSvcs/conf/config

Cluster configuration stored in a text file on disk:

include "types.cf"

cluster vcs_web (
    UserNames = { admin = ElmElgLimHmmKumGlj }
    Administrators = { admin }
    CounterInterval = 5
    )

system S1 (
    )

system S2 (
    )

group WebSG (
    SystemList = { S1 = 0, S2 = 1 }
    )

    Mount WebMount (
        MountPoint = "/Web"
        BlockDevice = "/dev/vx/dsk/WebDG/WebVol"
        FSType = vxfs
        FsckOpt = "-y"
        )

VCS Configuration Files

Configuring VCS means conveying to VCS the definitions of the cluster, service groups, resources, and resource dependencies. VCS uses two configuration files in a default configuration:

The main.cf file defines the entire cluster, including the cluster name, systems in the cluster, and definitions of service groups and resources, in addition to service group and resource dependencies.

The types.cf file defines the resource types.

Additional files similar to types.cf may be present if agents have been added. For example, if the Oracle enterprise agent is added, a resource types file, such as OracleTypes.cf, is also present.

The cluster configuration is saved on disk in the /etc/VRTSvcs/conf/config directory, so the memory configuration can be re-created after systems are restarted.



Lesson Summary

• Key Points

- HAD is the primary VCS process, which manages resources by way of agents.

- Resources are organized into service groups.

- Each system in a cluster has an identical view of

the state of resources and service groups.

• Reference Materials

- High Availability Design and Customization Using VERITAS Cluster Server course

- VERITAS Cluster Server Bundled Agents Reference Guide

- VERITAS Cluster Server User's Guide

Next Steps

Your understanding of basic VCS architecture enables you to prepare your site for installing VCS.


Lesson 3

Preparing a Site for VCS

• Lesson 1: High Availability Concepts

• Lesson 2: VCS Building Blocks

• Lesson 3: Preparing a Site for VCS

• Lesson 5: VCS Operations

• Lesson 6: VCS Configuration Methods

• Lesson 7: Preparing Services for VCS

• Lesson 8: Online Configuration

• Lesson 9: Offline Configuration

• Lesson 10: Sharing Network Interfaces

• Lesson 11: Configuring Notification

• Lesson 12: Configuring VCS Response to Faults

• Lesson 13: Cluster Communications

• Lesson 14: System and Communication Faults

• Lesson 15: I/O Fencing

• Lesson 16: Troubleshooting

Lesson Topics and Objectives

Hardware Requirements and Recommendations: Describe general VCS hardware requirements.

Software Requirements and Recommendations: Describe general VCS software requirements.

Preparing Installation Information: Collect cluster design information to prepare for installation.



Hardware Requirements

• Hardware Compatibility List (HCL), available from the Symantec Support Web site

• Minimum configurations: - Memory

- Disk space

• Cluster interconnect:

- Redundant interconnect links

- Separate infrastructure (hubs, switches)

- No single point of failure

• Systems installed and verified

Hardware Requirements and Recommendations

Hardware Requirements

See the hardware compatibility list (HCL) at the VERITAS Web site for the most recent list of supported hardware for VERITAS products by Symantec.

Cluster Interconnect

VERITAS Cluster Server requires a minimum of two heartbeat channels for the cluster interconnect.

Loss of the cluster interconnect results in downtime, and in nonfencing environments, can result in a split brain condition (described in detail later in the course).

Configure a minimum of two physically independent Ethernet connections on each node for the cluster interconnect:

Two-node clusters can use crossover cables.

Clusters with three or more nodes require hubs or switches.

You can use layer 2 switches; however, this is not a requirement.

For clusters using VERITAS Cluster File System or Oracle Real Application Cluster (RAC), Symantec recommends the use of multiple gigabit interconnects and gigabit switches.



Hardware Recommendations

• No single points of failure

• Redundancy for:

- Public network interfaces and infrastructures

- HBAs for shared storage (Fibre or SCSI)

• Identically configured

systems:

- System hardware

- Network interface cards

- Storage HBAs

Networking

For a highly available configuration, each system in the cluster should have a minimum of two physically independent Ethernet connections for the public network. Using the same interfaces on each system simplifies configuring and managing the cluster.

Shared Storage

VCS is designed primarily as a shared data high availability product; however, you can configure a cluster that has no shared storage.

For shared storage clusters, consider these recommendations:

One HBA minimum for shared and nonshared (boot) disks:

To eliminate single points of failure, it is recommended to have two HBAs to connect to disks and to use dynamic multipathing software, such as VERITAS Volume Manager DMP.

Use multiple single-port HBAs or SCSI controllers rather than multiport interfaces to avoid single points of failure.

Shared storage on a SAN must reside in the same zone as all cluster nodes. Data should be mirrored or protected by a hardware-based RAID mechanism. Use redundant storage and paths.

Include all cluster-controlled data in your backup planning, implementation, and testing.

For information about configuring SCSI shared storage, see the SCSI Controller Configuration for Shared Storage section in the "Job Aids" appendix.



Software Requirements

• Determine supported software: - Operating system

- Patch level

- Volume management

- File system

- Applications

• Obtain VCS license key

• Release notes and installation guide, available from the Support Web site

• License keys: vLicense Web site, sales representative, or Technical Support (for upgrades)

Software Requirements and Recommendations

Software Requirements

Ensure that the software meets requirements for installing VCS.

Verify that the required operating system patches are installed on the systems before installing VCS.

For the latest software requirements, refer to the VERITAS Cluster Server Release Notes and the VERITAS Support Web site.

Verify that storage management software versions are supported.

Using storage management software, such as VERITAS Volume Manager and VERITAS File System, enhances high availability by enabling you to mirror data for redundancy and change the configuration of physical disks without interrupting services.

Obtain VCS license keys.

You must obtain license keys for each cluster system to complete the license process. For new installations, use the vLicense Web site, http://vlicense.veritas.com, or contact your VERITAS/Symantec sales representative for license keys. For upgrades, contact Technical Support.

Also, verify that you have the required licenses to run applications on all systems where the corresponding service can run.


Software Recommendations

• Identical system software configuration: - Operating system version and patch level

- Kernel and networking

- Configuration files

- User accounts

• Identical application configuration: - Version and patch level

- User accounts

- Licenses

Software Recommendations

Follow these recommendations to simplify installation, configuration, and management of the cluster:

Operating system: Although it is not a strict requirement to run the same operating system version on all cluster systems, doing so greatly reduces the complexity of installation and ongoing cluster maintenance.

Configuration: Setting up identical configurations on each system helps ensure that your application services can fail over and run properly on all cluster systems.

Application: Verify that you have the same revision level of each application you are placing under VCS control. Ensure that any application-specific user accounts are created identically on each system.

Ensure that you have appropriate licenses to enable the applications to run on any designated cluster system.
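One simple way to confirm that two systems carry the same software levels is to compare their sorted package lists. This is an illustrative sketch only: the host names are hypothetical, password-less ssh is assumed to be set up already, and rpm is the Linux package tool (on Solaris, substitute showrev -p or pkginfo):

```shell
# Collect the package inventory from each node (host names are examples).
ssh sys1 'rpm -qa | sort' > /tmp/sys1.pkgs
ssh sys2 'rpm -qa | sort' > /tmp/sys2.pkgs

# Empty diff output means the two package sets are identical.
diff /tmp/sys1.pkgs /tmp/sys2.pkgs
```

Any lines reported by diff identify packages or revisions present on one node but not the other, which you should reconcile before placing applications under cluster control.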

3-6

VERITAS Cluster Server for UNIX, Fundamentals


System and Network Preparation

Before beginning VCS installation:

• Add /sbin, /usr/sbin, /opt/VRTSvcs/bin to PATH.

• Verify that systems are accessible using fully qualified host names.

• Create an alias for the abort->go sequence (Solaris).

• Configure ssh or rsh:

- Only required for the duration of the VCS installation procedure

- No prompting permitted:

- ssh: Set up public/private keys

- rsh: Set up /.rhosts

- Move /etc/issue or similar files

- Can install systems individually if remote access is not allowed

System and Network Preparation

Perform these tasks before starting VCS installation.

Add directories to the PATH variable, if required. For the PATH settings, see the installation guide for your platform.

Verify that administrative IP addresses are configured on your public network interfaces and that all systems are accessible on the public network using fully qualified host names.

For details on configuring administrative IP addresses, see the "Job Aids" appendix.
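A quick name-resolution check along these lines can confirm that each node's fully qualified host name is known before the installer runs (the node names are hypothetical examples; ping or nslookup against each name works equally well):

```shell
# Verify that each cluster system's fully qualified name resolves.
# Substitute your own node names for these example values.
for node in sys1.example.com sys2.example.com; do
    if getent hosts "$node" >/dev/null 2>&1; then
        echo "$node resolves"
    else
        echo "$node does NOT resolve"
    fi
done
```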

Solaris

Consider disabling the go sequence after Stop-A on Solaris systems. When a Solaris system in a VCS cluster is halted with the abort sequence (Stop-A), it stops producing VCS heartbeats, which causes the other systems to consider it a failed node.

Ensure that the only action possible after an abort is a reset. To ensure that you never issue a go function after an abort, create an alias for the go function that displays a message. See the VERITAS Cluster Server Installation Guide for the detailed procedure.

Enable ssh or rsh to install all cluster systems from one system. If you cannot enable secure communications, you can install VCS on each system separately.
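The usual way to make ssh prompt-free for the installer is a passphrase-less key pair distributed to every node. This is a minimal sketch assuming OpenSSH; the target host name is an example:

```shell
# Generate a key pair with no passphrase (the installer cannot answer prompts).
ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"

# Copy the public key to each other cluster system (host name is an example).
ssh-copy-id root@sys2.example.com

# Verify: this must return without asking for a password.
ssh root@sys2.example.com true
```

Because the key has no passphrase, consider removing it (or the authorized_keys entries) once installation is complete, per the note on the slide that remote access is only required for the duration of the installation.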


3-7


Required Installation Input

Collect required installation information:

• System (node) names

• License keys

• Cluster name

• Cluster ID (0 - 64K)

• Network interfaces for cluster interconnect links

Preparing Installation Information


Required Installation Input

Verify that you have the information necessary to install VCS. Be prepared to supply:

Names of the systems that will be members of the cluster

A name for the cluster, beginning with a letter of the alphabet (a-z, A-Z)

A unique ID number for the cluster in the range 0 to 64K

Avoid using 0 because this is the default setting and can lead to conflicting cluster numbers if other clusters are added later using the default setting. All clusters sharing a private network infrastructure (including connection to the same public network if used for low-priority links) must have a unique ID.

Device names of the network interfaces used for the cluster interconnect
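The installer records these values in the LLT (cluster interconnect) configuration. An illustrative /etc/llttab fragment shows where the cluster ID and the interconnect interface names end up; the node name, the cluster ID value, and the Solaris-style device names here are all examples, so check your platform's installation guide for the exact link syntax:

```
set-node sys1
set-cluster 7
link link1 /dev/qfe:0 - ether - -
link link2 /dev/qfe:1 - ether - -
```

The set-cluster value is the unique, nonzero ID discussed above; every node in the same cluster must carry the same ID, and no other cluster on the shared private network may reuse it.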


3-8



Cluster Configuration Options

Prepare for configuring options:

• VCS user names and passwords

• Managed host (Cluster Management Console)

• Local CMC (Web GUI):

- Network interface for CMC Web GUI

- Virtual IP address for CMC Web GUI

• SMTP server name and e-mail addresses

• SNMP console name and message levels

• Root broker node for security

Default account:

• User name: admin

• Password: password

You can opt to configure additional cluster services during installation.

VCS user accounts: Add accounts or change the default admin account.

Managed host: Add cluster nodes to a Cluster Management Console management server as described in the "Managed Hosts" section.

Local Cluster Management Console (Web GUI): Specify a network interface and virtual IP address on the public network to configure a highly available Web management interface for local cluster administration.

Notification: Specify SMTP and SNMP information during installation to configure the cluster notification service.

Broker nodes (4.1 and later): VCS can be configured to use VERITAS Security Services (VxSS) to provide secure communication between cluster nodes and clients, as described in the "VERITAS Security Services" section.


3-9


Managed Hosts

A managed host:

• Can be any 4.x or 5.0 cluster system, any platform

• Is under control of 5.0 CMC

• Runs a console connector that communicates with CMC


Managed Hosts

During VCS installation, you are prompted to select whether the systems in this cluster are managed hosts in a Cluster Management Console environment.

Cluster Management Console (CMC) is a Web-based interface for managing multiple clusters at different physical locations, with cluster systems running on any operating system platform supported by VCS 4.x or 5.0.

You can also use the CMC in local mode to manage only the local cluster. This is similar to the Web GUI functionality in pre-5.0 versions of VCS. Alternatively, you can place cluster systems under CMC control by configuring a cluster connector, which enables the systems to be CMC-managed hosts.

You can select the type of CMC functionality (or none at all) during VCS installation, or configure this after installation. During installation:

If you select to use CMC for local cluster management, you must provide:

- A public NIC for each node

- A virtual IP address and netmask

If you configure the cluster nodes as managed hosts, you must also configure the cluster connector by providing:

- The IP address or fully qualified host name for the CMC server

- The CMC service account password

- The root hash of the management server

This course covers local cluster management only. Refer to the product documentation for information about managed hosts and CMC.

3-10




Symantec Product Authentication Service

• Provides secure communication:

- Among cluster systems

- Between VCS interfaces and cluster systems

• Uses digital certificates for authentication

• Uses Secure Socket Layer (SSL) for encryption

• Provides user authentication (single sign-on)

• Requires one root broker node to be running

• Requires all cluster systems to be authentication brokers

• Formerly named VERITAS Security Services (VxSS)

Symantec recommends using a system outside the cluster to serve as the root broker node.

Symantec Product Authentication Service

VCS versions 4.1 and later can be configured to use Symantec Product Authentication Service (formerly named VERITAS Security Services or VxSS) to provide secure communication between cluster nodes and clients, including the Java and the Web consoles. VCS uses digital certificates for authentication and uses SSL to encrypt communication over the public network.

In the secure mode, VCS uses platform-based authentication; VCS does not store user passwords. All VCS users are system users. After a user is authenticated, the account information does not need to be provided again to connect to the cluster (single sign-on).

Note: Security Services are in the process of being implemented in all VERITAS products.

VxSS requires one system to act as a root broker node. This system serves as the main registration and certification authority and should be a system that is not a member of the cluster.

All cluster systems must be configured as authentication broker nodes, which can authenticate clients.

Security can be configured after VCS is installed and running. For additional information on configuring and running VCS in secure mode, see "Enabling and Disabling VERITAS Security Services" in the VERITAS Cluster Server User's Guide.


3-11



Using a Design Worksheet

Validate installation input as you prepare the site.

Cluster Definition    Value
Cluster Name          vcs_web

Required Attributes
UserNames             admin=password
ClusterAddress        192.168.3.91
Administrators        admin

System Definition     Value
System                S1
System                S2

3-12

Using a Design Worksheet

You may want to use a design worksheet to collect the information required to install VCS as you prepare the site for VCS deployment. You can then use this worksheet later when you are installing VCS.




Lesson Summary

• Key Points

- Verify hardware and software compatibility and record information in a worksheet.

- Prepare cluster configuration values before you begin installation.

• Reference Materials

- VERITAS Cluster Server Release Notes

- VERITAS Cluster Server Installation Guide

- http://entsupport.symantec.com

- http://vlicense.veritas.com


Lab 3: Validating Site Preparation

• Visually inspect the classroom lab site.

• Complete and validate the design worksheet.

System Definition    Sample Value    Your Value
System               train1
System               train2

See the next slide for lab assignments.

Labs and solutions for this lesson are located on the following pages.

"Lab 3: Validating Site Preparation," page A-3.

"Lab 3 Solutions: Validating Site Preparation," page B-3.



3-13

3-14


