Executive summary
Document outline
Introduction
Definitions
Availability
Dependability
Fault tolerance
Availability in parallel and serial systems
Availability in an SAP landscape
Cost of downtime
Impact of people, process, and technology
Causes of downtime
Services are key to high availability
Windows cluster solutions: an historical overview
Windows Server 2008
New functionality
Server roles and other key innovations
HP products and services for Windows Server 2008
Windows Server 2008 high availability
Microsoft failover cluster technology overview
Detailed description
Improved failover cluster management interface
SCSI-3 commands
A new way to create clusters
Migrating legacy clusters
Improvements in scoping and managing shares
Better storage and backup support
Superior scalability
New quorum model
Improved security model
New networking capabilities and more flexible dependencies
Summary
SAP Windows high availability
Overview
SPOFs in an ABAP SAP system
ABAP SAP application stack
SPOFs in a Java SAP system
SPOFs in an SAP system with ABAP and Java
Replicating the Enqueue service
Risk associated with the Enqueue service
Replication solution
Geographically-dispersed Microsoft cluster configurations
Networking in geographically-dispersed clusters
SAP support for failover cluster configurations
Restrictions
Microsoft cluster support
Cluster resource group
Sample configurations
Rules-of-thumb: Single-SID
Rules-of-thumb: Multi-SID
Key restriction
Cluster configuration options
Database high-availability features
Database availability features in Windows Server 2008
Database failover cluster support
Database backup
Database replication at the hardware level
Summary
Data protection and recovery storage solutions
Disaster-tolerant storage solutions
Overview
HP- and SAP-certified Windows Server 2008 server systems
HP server offerings
What is certified?
Clustering on Windows Server 2008 [MST12]
HP-supported storage systems for Microsoft clusters
HP-specific high-availability solutions for SAP
HP business continuity solutions for SAP
Data protection and recovery [HPQ08]
Disaster tolerance [HPQ10]
HP EVA Dynamic Capacity Manager
HP Competent Cluster Service
HP PolyServe [HPQ11]
Virtualization solutions for SAP
High availability for virtualized SAP systems
Data replication with virtualized systems
HP management software
HP Systems Insight Manager
HP Operations Manager for Windows
HP Network Node Manager
HP Services for SAP
Implementation services
Startup services
Archiving
Operations services
Proactive 24 Service for SAP and Critical Service for SAP
HP Global SAP Competency Center and HP-SAP collaborative support processes
HP high-availability reference configurations for SAP on Windows Server 2008
Server classes
Sizing
Small Customer reference configuration
Medium Customer reference configuration
Large Customer reference configuration
Summary
References
For more information
Executive summary
The latest Microsoft server operating system, Windows Server 2008, provides new features and significant enhancements over earlier releases. Stronger security and increased reliability, especially through re-engineered Failover Clustering, help you deploy business-critical SAP applications with levels of availability and configuration flexibility that are unprecedented in the Windows environment. SAP support for multiple, independently-running SAP instances within a single cluster is just one example of the benefits you can achieve with the latest SAP and Windows releases. The latest releases from Microsoft and SAP can help you follow the HP recommendations to deploy highly-available SAP production systems and to use clustering when there are 300 or more users.

This document provides a generic overview of high availability and introduces new Windows Server 2008 and SAP features. HP-specific solutions for SAP installations on Windows Server 2008 are outlined; in addition, reference configurations for a range of customer profiles are provided.

SAP plans to support Windows Server 2008 as soon as the Java SDK 1.4.2 is released by Sun for Windows Server 2008. The first SAP version supported on Windows Server 2008 will be NetWeaver 7.0 SR3. SAP support for further applications is planned for Q1 2009 [SAP05].

Target audience: This paper is technical in nature; its intended audience is SAP consultants and service personnel.
Document outline
Introduction
  Outlines high availability and the costs and causes of downtime; introduces the three pillars (people, process, technology) that can affect availability

Windows cluster solutions: an historical overview
  Presents a brief overview of Windows cluster solutions

Windows Server 2008
  Introduces the new Windows Server 2008 operating system, its new functionality and key innovations; outlines HP products and services for this operating system

Windows Server 2008 high-availability features
  Presents an overview of Failover Clustering; continues with a more detailed description of its features and capabilities, including the new quorum model

SAP Windows high availability
  Introduces standard SAP-supported Windows cluster configurations; identifies single points of failure in the SAP software stack; explains why the Enqueue service should be replicated; examines geographically-dispersed clusters

SAP support for failover cluster configurations
  Details the failover cluster configurations supported by SAP; outlines certain restrictions; graphically summarizes a range of configurations

Database high-availability features
  Describes how to protect the database at the heart of an SAP installation; presents high-availability tools (failover cluster support, database backup, database replication, database mirroring)

HP- and SAP-certified Windows Server 2008 server systems
  Outlines appropriate HP server offerings and the concept of certification

HP-supported storage systems for Windows clusters
  Outlines HP-supported storage for clustered SAP solutions

HP-specific solutions for SAP
  Introduces business continuity solutions from HP, such as HP Data Protector with Zero Downtime Backup, dynamic capacity management, and HP PolyServe, an alternative server consolidation solution; outlines HP virtualization solutions for SAP and HP management software for end-to-end management

HP Support Services for SAP
  Presents a rich set of offerings that can increase the availability of SAP systems, including special offerings that provide both proactive and reactive services

HP high-availability reference configurations for SAP on Windows Server 2008
  Provides sample configurations for small, medium, and large customers
Introduction
More than ever, organizations are depending on their business-critical enterprise resource planning (ERP) systems. However, the failure of an ERP system means a business disruption; depending on the nature of the failure and the time needed for recovery, the impact on business continuity may be disastrous, with consequences that can include:
- Customer dissatisfaction
- Loss of productivity
- Loss of revenue
- Bad publicity

Thus, organizations are searching for solutions that can deliver high availability. But how do you define availability?
Definitions
Definitions for availability, dependability, and fault tolerance are provided.

Availability
In an IT context, availability is typically expressed as a yearly rate (percentage), calculated by dividing the time a particular service was operable by the total time it should have been operable. This definition may assume that planned downtime for maintenance is acceptable; however, systems that are very business-critical must always be up and running, with no downtime, planned or unplanned, permitted. Thus, the calculation for the availability of a very business-critical service or system is as follows:

Percentage availability = (Total elapsed time - Total downtime) / Total elapsed time

If the system is less critical, planned downtime is acceptable and is typically omitted from availability calculations. Thus, the availability of a less business-critical system is calculated as follows:

Percentage availability = (Total elapsed time - Total unplanned downtime) / Total elapsed time

Table 1 lists availability rates as percentages, based on operation 24 hours per day, 365 days per year.
Availability    Downtime per year
95%             438 hours (18.25 days)
99%             87.6 hours (3.65 days)
99.5%           43.8 hours (1.825 days)
99.9%           8.76 hours (0.365 days)
99.99%          0.876 hours (52.56 minutes)
99.999%         5.256 minutes (315.36 seconds)
99.9999%        31.536 seconds
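The relationship between an availability rate and yearly downtime can be checked with a few lines of Python. This is a minimal sketch that assumes the same 24-hour, 365-day basis as Table 1; the function names are illustrative and are not taken from any HP or SAP tool.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, the 24x365 basis used for Table 1


def yearly_downtime_hours(availability_percent: float) -> float:
    """Return the downtime per year, in hours, implied by an availability rate."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return HOURS_PER_YEAR * (100.0 - availability_percent) / 100.0


def availability_percent(total_hours: float, downtime_hours: float) -> float:
    """Apply the formula (total elapsed time - downtime) / total elapsed time."""
    if total_hours <= 0:
        raise ValueError("total elapsed time must be positive")
    return 100.0 * (total_hours - downtime_hours) / total_hours


if __name__ == "__main__":
    for rate in (95.0, 99.0, 99.5, 99.9, 99.99, 99.999, 99.9999):
        hours = yearly_downtime_hours(rate)
        print(f"{rate}% availability -> {hours:.4f} hours of downtime per year")
```

For a less business-critical system, pass only the unplanned downtime to `availability_percent`, in line with the second formula above.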
Dependability
It is clear that the statement, "A server or a storage subsystem has an availability rate of over 99.99%," does not provide an overall picture of the availability, or dependability, of a complex SAP system where multiple components must work together. This new term, dependability, is defined by the International Federation for Information Processing (IFIP) 10.4 Working Group on Dependable Computing and Fault Tolerance as:

"the trustworthiness of a computing system which allows reliance to be justifiably placed on the service it delivers." [DEP01]

The following broader definition is provided by International Electrotechnical Commission (IEC) International Electrotechnical Vocabulary 191-02-03:

"dependability [is] the collective term used to describe the availability performance and its influencing factors: reliability performance, maintainability performance and maintenance support performance" [IEC01]

Fault tolerance
Fault-tolerant or failsafe components provide the basis for a highly-available system. With fault tolerance, a system can continue operating in the event of a component failure, though possibly with reduced throughput levels and increased user response times. The Institute of Electrical and Electronics Engineers (IEEE) defines fault tolerance as follows:

"[Fault tolerance is] the ability of a system or component to continue normal operation despite the presence of hardware or software faults." [IEEE01]
In practical terms, fault tolerance gives you some assurance that your system can sustain the availability levels you need to achieve a well-defined goal at a specified time using finite resources.
Methods for implementing fault tolerance include the following:
- Adding hardware redundancy through hot plug power supplies or RAID-protected storage subsystems
- Adding software/process redundancy by replicating transactions, providing multiple, identical instances, or utilizing a cluster service

The next section provides information on calculating system availability.
R1 = 95% availability, R2 = 95% availability
R12 = 1 - (1 - R1)(1 - R2) = 1 - (1 - 0.95)(1 - 0.95) = 0.9975, or 99.75%
It is clear from the examples shown in Figures 1 and 2 that availability is higher in a parallel system; availability is lower in a serial system.
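The two cases can be compared with a short Python sketch. It assumes that component failures are independent, that a serial system works only if every component works, and that a parallel system works if at least one component works; the function names are illustrative only.

```python
import math


def serial_availability(*rates: float) -> float:
    """A serial chain is available only when every component is available."""
    return math.prod(rates)


def parallel_availability(*rates: float) -> float:
    """A parallel group fails only when all of its components fail together."""
    return 1.0 - math.prod(1.0 - rate for rate in rates)


if __name__ == "__main__":
    r1 = r2 = 0.95
    print(f"Serial:   {serial_availability(r1, r2):.2%}")
    print(f"Parallel: {parallel_availability(r1, r2):.2%}")
```

With two 95% components, the serial result falls below either component's rate, while the parallel result rises above it, which is the pattern described above.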
The SAP system consists of a server system, the installed operating system, a database, and the SAP application. The system shown in Figure 3 has been clustered to provide redundancy.
Assume that availability rates for these systems are as follows:
- Storage: 99.99%
- Storage area network (SAN): 99.99%
- SAP cluster: 99.95%
- Network: 90.0%

Thus, overall availability for this SAP landscape is calculated to be 0.9999 x 0.9999 x 0.9995 x 0.90, or 0.8994 (89.94%). In this example, the weakest link is the network, which pulls overall availability below 90%. You should re-configure the network in order to increase system availability.
Note This calculation is only possible if the availability rates of the different system components are available. For very complex systems, you can use the CUT-SET or TIE-SET method [CON01].
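The landscape calculation can be reproduced with the sketch below, which multiplies the component rates from the example and reports the weakest component. The `with_redundant` helper is a hypothetical what-if, not a recommendation from this paper: it assumes the weakest component is duplicated as two independent parallel paths.

```python
import math

# Availability rates taken from the SAP landscape example.
LANDSCAPE = {
    "Storage": 0.9999,
    "SAN": 0.9999,
    "SAP cluster": 0.9995,
    "Network": 0.90,
}


def overall_availability(components: dict) -> float:
    """Multiply the component rates, treating the landscape as a serial chain."""
    return math.prod(components.values())


def weakest_link(components: dict) -> str:
    """Return the name of the component with the lowest availability."""
    return min(components, key=components.get)


def with_redundant(components: dict, name: str) -> dict:
    """Hypothetical what-if: run one component as an independent parallel pair."""
    improved = dict(components)
    rate = components[name]
    improved[name] = 1.0 - (1.0 - rate) ** 2
    return improved


if __name__ == "__main__":
    print(f"Overall availability: {overall_availability(LANDSCAPE):.4f}")
    weakest = weakest_link(LANDSCAPE)
    print(f"Weakest link: {weakest}")
    what_if = with_redundant(LANDSCAPE, weakest)
    print(f"With a redundant {weakest}: {overall_availability(what_if):.4f}")
```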
Cost of downtime
In 2005, The Standish Group published the costs of downtime for sample applications [Sta01]. Figure 4 summarizes these costs, showing, for example, that The Standish Group calculated the cost of unplanned downtime for an ERP system to be around $14,800 per minute ($888,000 per hour).
Note While Figure 4 reflects quantifiable costs, loss of customers or customer confidence can be disastrous.
*All figures are in U.S. dollars. Source: The Standish Group 2005, [Sta01].
Take, for example, a very business-critical ERP system that is needed 24 hours a day. If this system has an availability rate of 99.5%, then calculated downtime costs are around $38,894,400 ($14,800 x 2,628 minutes) per year. If, however, time could be allowed for planned maintenance, the availability rate for this ERP system might increase to 99.99%. Now, calculated downtime costs would fall to around $777,888 ($14,800 x 52.56 minutes) per year, or 2% of the costs accrued without planned maintenance. Thus, these two examples show that the cost of downtime can vary dramatically based on how downtime is defined. Table 2 provides a detailed overview of the Standish Group study [Sta01].
The cost-of-downtime metrics presented by the Standish Group may not reflect the cost structure within your organization. To calculate your possible annual downtime costs, ensure you are using company-specific costs.
Application                            Cost per minute
Trading (securities)                   $73,000
Home location register (HLR)           $29,300
ERP                                    $14,800
Order processing                       $13,300
E-commerce                             $12,600
Supply chain                           $11,500
Electronic funds transfer (EFT)        $6,200
Point of sale (POS)                    $4,700
Automated teller machine (ATM)         $3,600
E-mail                                 $1,900
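The annual downtime costs quoted for the ERP example can be recalculated, or adapted to company-specific figures, with a sketch along these lines. It assumes round-the-clock operation for 365 days and a flat cost per minute; the function name and parameters are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes of round-the-clock operation


def annual_downtime_cost(cost_per_minute: float, availability_percent: float) -> float:
    """Return the yearly cost of downtime implied by an availability rate."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    downtime_minutes = MINUTES_PER_YEAR * (100.0 - availability_percent) / 100.0
    return cost_per_minute * downtime_minutes


if __name__ == "__main__":
    erp_cost_per_minute = 14_800  # ERP figure from the Standish Group table
    for rate in (99.5, 99.99):
        cost = annual_downtime_cost(erp_cost_per_minute, rate)
        print(f"{rate}% availability -> ${cost:,.0f} per year")
```

Replace `erp_cost_per_minute` with a cost derived from your own organization to obtain figures that reflect your cost structure.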
Given the associated risks, lowering the probability of unplanned downtime throughout the entire SAP ERP system solution stack should be a key goal for today's IT departments. However, you should understand that high availability depends on more than just technology; it also depends on the people and processes involved in the management of a particular environment. These dependencies for high availability are known as its three pillars.
On the people side, well-educated, experienced administrators and operators are key to success. Even though a system has been designed for high availability, unless the administrators and operators have the right experience and education, the system can never achieve expected availability levels.

Hand-in-hand with well-educated and experienced people goes the process side of a system. Unless processes (such as system and change management, backup and recovery, and permanent monitoring and control) are well-planned and well-documented, and people have been trained on these processes, a system can never become highly available.

The technology stack is the foundation of a highly available system. Redundancy features built into the hardware along with suitable system design are necessary to achieve high availability.

When you are planning, designing, and implementing a highly-available system, you should take a holistic approach. Start with the datacenter infrastructure (such as cooling, power, and access control), include failure-tolerant hardware components (such as RAID sets for storage, error checking and correction (ECC) memory in server systems, and redundant high-speed networks), and end at the user's desk with suitable user and system security policies. Only if all three pillars are in place and working perfectly together is it possible to build and operate a highly-available SAP system landscape.
Causes of downtime
In 2005, The Standish Group analyzed 50,000 downtime incidents and observed in their first quarter 2006 research report [Sta02] that 34% of all these incidents were due to operator error, as shown in Figure 6. The "other" category in this figure refers to downtime caused by the environment, hackers, viruses, planned downtime, and unspecified reasons.
Since the availability of the system and application is of paramount importance, it follows that the time taken to recover from a downtime incident is critical. In the same report, The Standish Group also analyzed the time needed to recover from a failure, [Sta02], and found that, despite the fact that operator error is the cause of 34% of downtime incidents, only 17% of system downtime is caused by operator-related incidents, as shown in Figure 7.
While human error will always happen, knowing that 34% of all incidents are caused by people makes it important to lower both the probability of these errors occurring and the time needed to recover from them. Only well-trained people and suitably designed processes can limit the negative impact of human error.
The remainder of this document focuses on Windows Server 2008 cluster technology and how an SAP ERP system and its associated database can be deployed on a Windows Server 2008 cluster with HP servers and storage. Information on management, operation, and training is beyond the scope of this document.
New functionality
Windows Server 2008 introduces many new features and technologies, designed to improve security, increase productivity, and reduce administrative overhead. This section highlights features and changes that will potentially have the greatest impact. One of the key innovations in Windows Server 2008 is its ability to support either a full or Server Core installation. A core installation creates a reduced, more secure operating system footprint that is primarily intended for infrastructure services like DHCP, web servers, or Hyper-V virtualization. Table 3 provides an overview of current Windows Server 2008 operating system editions.
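[Table 3: Windows Server 2008 operating system editions. Columns: Edition*, CPU architecture, Maximum memory, SAP support, Server Core, Failover clustering, Hyper-V**. Rows: Standard; Enterprise; Datacenter; Windows Server 2008 for Itanium-Based Systems. Maximum memory: 4 GB / 32 GB (Standard), 64 GB / 2 TB (Enterprise), 64 GB / 2 TB (Datacenter), 2 TB (Itanium-Based Systems). Other surviving cell values, in their original order: 16, 32/64, 16, No limit, i64, 64, No limit.]

*Source: Microsoft
**Hyper-V is only supported on x64 platforms with built-in CPU virtualization technology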
As shown in Table 3, highly-available SAP systems can only be deployed with Enterprise and Datacenter editions of Windows Server 2008 and Windows Server 2008 for Itanium-Based Systems. Furthermore, Server Core installations are not supported in the SAP environment due to lack of application server support.

Server roles and other key innovations
In addition to the features shown in Table 3, Windows Server 2008 introduces server roles. Only basic functionality is enabled during the initial installation; an administrator then customizes the server, adding the roles (functionality) needed to support the planned workload. This results in a more secure platform, with a smaller attack surface for possible threats.

Server Core supports the following roles:
- DHCP server
- File services
- Print services
- DNS server
- Active Directory Domain Services (AD DS)
- Active Directory Lightweight Directory Services (AD LDS)
- Streaming Media Services
- Windows Server 2008 Virtualization

To minimize the environment, a Server Core installation deploys only a subset of the binaries (that is, the binaries required by the particular server role). Once you have installed and configured the server, you can manage it either locally at the command prompt or remotely through Remote Desktop.
By default, the user interface for a Server Core installation is the command prompt.
Windows Server 2008 also provides key enhancements in the following areas [MST06]:

Server consolidation and resource optimization: Hyper-V
Hyper-V, Microsoft's virtualization solution, allows a physical server to host the workload of multiple, independent server systems. Server virtualization helps you optimize the utilization of your hardware resources and provides the agility you need to adapt to changing IT needs. In addition, server virtualization can simplify the management and deployment of server systems.

Flexible application access for remote users: TS RemoteApp
Installed as part of the Windows Server 2008 Terminal Server role, Terminal Services RemoteApp (TS RemoteApp) allows users to access individual applications rather than a computer desktop during a Terminal Services session. TS RemoteApp applications run on a host server and only send application windows to the user, consuming fewer client-side resources and reducing administration and deployment costs.

Modular, minimal installation: Server Core
Intended for network servers fulfilling specific infrastructure roles, the new Server Core installation offers a highly reliable, efficient platform. Because this option loads the fewest operating system components (only those required to support core infrastructure roles), patch requirements are reduced, and reliability and security are enhanced.

Delivering rich web content and applications: IIS 7.0
Microsoft Internet Information Services (IIS) 7.0 delivers a broad range of functionality, including streaming media and Web applications in Active Server Pages (ASP) and Personal Home Page (PHP). The new modular design of IIS 7.0 minimizes the attack surface of the Web server by allowing you to install only the components you need.

Improved network performance and control: new TCP/IP stack
The redesigned, next-generation Transmission Control Protocol/Internet Protocol (TCP/IP) stack included in Windows Server 2008 can significantly improve performance in a remote location scenario, offering faster throughput and more efficient routing of network traffic. Network Access Protection (NAP) in Windows Server 2008 helps prevent non-compliant computers from accessing your network. In addition, NAP can verify the health of connecting computers and enforce compliance with your security standards.

Supporting business continuity for demanding workloads: high-availability features
Besides supporting failover clusters and Network Load Balancing, Microsoft improved dynamic hardware partitioning, the storage options, and the machine-check architecture to remove single-point-of-failure problems. Simplified deployment and management help organizations of all sizes take advantage of these features to improve availability and reliability.

Enabling secure collaboration: Active Directory Rights Management Services
The new federated Rights Management Services helps control how your documents are used internally and externally (for example, by defining rights to view, print, forward, or delete a document).

Connecting heterogeneous environments
Windows Server 2008 includes Subsystem for UNIX-based Applications (SUA), a multi-user UNIX environment that supports more than 300 UNIX commands, utilities, and shell scripts. SUA runs on Windows-based servers without emulation, supporting native UNIX performance and enabling UNIX applications to leverage Windows application programming interfaces (APIs) and components.
Enabling top-shelf service and support for remote sites
Windows Server 2008 enables remote management, allowing administrators to correct many problems from remote locations. The new Read-Only Domain Controller provides a safer way to provide Active Directory administration in the remote infrastructure.

Easing administration, management, and automation: Server Manager and PowerShell
The Server Manager Console provides a single, unified console for managing a server's configuration and system information, displaying server status, identifying problems with server role configuration, and managing all roles installed on the server. Server Manager also interfaces directly with PowerShell, the command-line shell and scripting language for automation. All Server Manager functions that can be used in the interface are available to PowerShell scripts.
With offerings across its portfolio of Business Technology Optimization (BTO) software, HP Software provides support for Windows Server 2008 environments. The recently announced HP Business Service Automation suite provides automated, consistent, enforceable processes for Windows Server 2008 provisioning, patch management, and migration. In addition, the HP Business Service Management solution supports the management and monitoring of Windows Server 2008 environments, mitigating business risk and reducing the potential costs of service downtime. To help you get up-and-running quickly, HP Services offers planning, design, implementation, and support services for new Windows Server 2008 technologies, saving valuable time and reducing the risk of global installations. HP Services can integrate this operating system into IT environments as part of a joint solution set, known as HP & Microsoft Solutions for the People-Ready Business. Additionally, the HP Services education team provides customized training to help you prepare for Windows Server 2008. Services include classroom training, live online training, self-paced elearning, and informal learning that can improve your overall return on investment for Windows Server 2008 by enabling you to stay up-to-date while minimizing your time away from core business activities. More information on HP support for Windows Server 2008 is available on the HP website [HPQ01].
To summarize, Windows Server 2008 Failover Clustering provides high availability for mission-critical applications such as ERP systems, databases, messaging systems, file and print services, and virtualized workloads. If a cluster node were to fail, up to 15 other nodes could host the service or application being delivered by the failed node; users would be able to continue working, typically unaware of any disruption. The following section provides more information about this new Windows Server 2008 feature.
Detailed description
As explained in the Microsoft white paper, Windows Server 2008 Failover Clustering Architecture Overview - New Features and Capabilities [MST07], clustering in Windows Server 2008 has been radically redesigned to simplify and streamline cluster creation and administration. Innovations include a cluster setup wizard that helps the administrator install a cluster and a new cluster management utility that eases the operation and control of such a cluster. The goal of Windows Server 2008 Failover Clustering is to make it possible for the non-specialist to create a failover cluster that works; now, even an IT generalist with no special training in failover cluster services can confidently create and configure the cluster to host redundant services.
Important: Despite the usability enhancements delivered with Failover Clustering, HP recommends that only knowledgeable IT specialists should work with an SAP cluster.
Failover Clustering is so named to avoid confusion with a different type of Microsoft cluster, Windows Compute Cluster Server, and with the cluster technology provided on earlier Windows platforms: Microsoft Cluster Server (MSCS) with Windows NT 4.0, and the cluster service with Windows 2000 and Windows Server 2003. While ease of use was a key objective for Failover Clustering, this solution also includes the following new features and technical improvements:
- Improved failover cluster management interface
- SCSI-3 support
- Validate tool
- A new way to create clusters
- Migration of legacy clusters
- Support for Windows Server 2008 Server Core
- Improvements in share scoping and management
- Better storage and backup support
- Enhanced maintenance mode
- Superior scalability
- New quorum model
- Improved security model
- New networking capabilities and more flexible dependencies
Each of these features is outlined below.

Improved failover cluster management interface
Many management tools have been streamlined in Windows Server 2008; for example, the earlier cluster administration interface has been replaced with a Microsoft Management Console (MMC) 3.0 snap-in, CluAdmin.msc. The new Failover Cluster Management console has been designed to be task-oriented rather than cluster resource-oriented, as it was in previous versions of Microsoft clustering. With earlier cluster management interfaces, the procedure for creating a highly available file share, application, or service was complex; now, the cluster management utility and its wizards do most of the work for you. Using the Failover Cluster Management console, experienced cluster server administrators can still access cluster commands that were available via the earlier command-line interface. Furthermore, the management of Windows Server 2008 failover clusters is fully scriptable through Windows Management Instrumentation (WMI).
SCSI-3 support
Failover Clustering requires cluster storage to use SCSI-3 Persistent Reservation commands rather than the older SCSI-2 reserve/release commands. Since the newer commands avoid SCSI bus resets, they are much less disruptive than the SCSI-2 commands. SCSI-3 command compatibility is enforced by the cluster validation tool; storage that is not compatible cannot be used for Windows Server 2008 Failover Clustering.
Validate tool
Earlier Microsoft clusters often failed due to configuration complexity. To help solve this problem, Failover Clustering includes the built-in Validate tool, an expansion and integration of the ClusPrep tool that was designed for Windows Server 2003 server clustering. Validate runs a focused set of tests that are intended to verify functionality and best practices on servers that are to be configured as part of a particular cluster. Validate performs a software inventory, tests the network and storage, and checks the system configuration. For example, storage configured to enable access via multiple paths is validated if multi-path software is installed, ensuring that the storage complies with the Microsoft Multipath IO (MPIO) standard and that it is configured correctly. Before running the Validate tool, ensure that at least two servers are selected for the cluster. If not, storage tests that require two servers will not be run, which will be reflected in the resulting report. Validate can also be run after the clusters are created and configured.
Passing Validate is the standard for support for clusters in Windows Server 2008: if a cluster does not pass Validate, it is not supported by Microsoft. However, running Validate does not release you from the responsibility of using only hardware and software certified under the Windows Server Logo Program for Windows Server 2008.
A new way to create clusters
The process for installing cluster functionality in servers has changed dramatically with Windows Server 2008, which is far more compartmentalized than Windows Server 2003. Windows Server 2008 uses a component-based model, whereby components are not added until you need them. Thus, cluster services are no longer installed by default; in Windows Server 2008, you must use the Add Features Wizard to install Failover Clustering. When you run the Create Cluster Wizard, you can now enter all cluster members at the same time.

Migrating legacy clusters
To enhance security, Failover Clustering does not offer backwards compatibility with earlier clusters or, by extension, provide support for rolling upgrade migrations. Thus, Windows Server 2003 cluster nodes and Failover Clustering nodes cannot be configured in the same cluster. In addition, Failover Clustering nodes must be joined to an Active Directory-based domain (not a Windows NT 4.0-based domain). Moving from a Windows Server 2003 cluster to a Windows Server 2008 failover cluster requires migration. The Failover Clustering migration feature can be used to import configuration information for Windows Server 2003 clustered applications into Windows Server 2008 Failover Clustering.
Note: With some restrictions, Failover Clustering can even migrate specific cluster resource types from Windows Server 2003 server clusters.
Support for Windows Server 2008 Server Core
A Server Core installation provides a minimal environment to support a specific server role, thus reducing maintenance and management requirements and the attack surface associated with the particular role. Server Core supports the failover cluster feature. You can manage failover clusters on Server Core using the cluster.exe command-line tool or, remotely, from the Failover Clustering MMC. Enabling Failover Clustering in a Server Core environment can lead to a reduction in maintenance requirements; since fewer updates are required, uptime is significantly increased.
Important: Server Core does not support the application server role. SAP applications and databases typically do not work with Server Core because they depend on that role and on other features that are not included in a Server Core installation; Server Core is therefore not supported by SAP or database application vendors as a direct installation target. You can, however, cluster Server Core host server systems and use the resulting cluster to deploy SAP systems that have been virtualized through Hyper-V.
Improvements in scoping and managing shares
In the past, users could see all cluster and local shares, regardless of the virtual server name in the file shares cluster group, which could lead to some confusion. For example, if shares were mounted via the physical server name, following a failure the shares would be online on the second node and not on the node from which the user created the mount. This confusion cannot occur with Failover Clustering; users see the shares that belong to the virtual server name to which they are connected but not shares owned by other groups. This helps prevent the incorrect mapping of network drives. Due to the changes in how shares are presented, it is now possible to use the same share name multiple times in a cluster. However, a share name must be unique within its cluster group, which has a dedicated virtual server name and IP address assigned.
The new method for presenting shares allows a Windows Server 2008 failover cluster to serve multiple, independently-running SAP instances within the same cluster.
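The scoping rule described above can be pictured as a lookup keyed by virtual server name and share name. The following is a minimal sketch only, not the Windows implementation; the class name ShareRegistry, the host names, and the paths are hypothetical and chosen for illustration.

```python
class ShareRegistry:
    """Toy model of share scoping: a share name must be unique only
    within the cluster group identified by its virtual server name."""

    def __init__(self):
        # (virtual server name, share name) -> local path on the shared disk
        self._shares = {}

    def add(self, virtual_server, share_name, path):
        key = (virtual_server.lower(), share_name.lower())
        if key in self._shares:
            raise ValueError(f"{share_name} already exists on {virtual_server}")
        self._shares[key] = path

    def visible_to(self, virtual_server):
        """Return only the shares scoped to the given virtual server name."""
        vs = virtual_server.lower()
        return sorted(name for (owner, name) in self._shares if owner == vs)


# Hypothetical example: two SAP instances, each with its own sapmnt share.
registry = ShareRegistry()
registry.add("sapprd", "sapmnt", r"S:\usr\sap")
registry.add("sapqas", "sapmnt", r"T:\usr\sap")
print(registry.visible_to("sapprd"))  # ['sapmnt']
```

Because the key includes the virtual server name, the second sapmnt share does not collide with the first, which is the property that lets several independent SAP instances coexist in one cluster.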
Better storage and backup support
Failover Clustering features storage enhancements designed to improve stability and accommodate future growth. Enhancements to storage and backup support include:
- Support for both Master Boot Record (MBR) and GUID Partition Table (GPT) disks; with GPT, shared disks can now be greater than 2 terabytes
- Disk identification is based on either the disk signature in the MBR or SCSI inquiry vital product data (VPD) page 0x83, which is an attribute of the logical unit number (LUN)
- Storage hardware must support SCSI Primary Commands-3 (SPC-3) commands for persistent reservation and release
- All host bus adapters (HBAs) use the Storport Miniport Driver model listed in the Windows Server Catalog
- All multi-path solutions are MPIO
- Failover Clustering has its own Volume Shadow Copy Service (VSS) writer, allowing VSS backup applications to more easily support clusters

Enhanced maintenance mode
Failover Clustering provides new Maintenance Mode functionality, which temporarily shuts off health monitoring on a disk so that it cannot report a failure while IT staff are working on it. This mode helps you perform maintenance or administrative tasks (such as volume snapshots or ChkDsk) on clustered disk resources. To enable a disk to be placed in maintenance mode, all nodes must be able to communicate with each other. Before entering maintenance mode, the disk is fenced from all nodes that do not own the resource. In this configuration, other nodes can still see the disk but are not able to access it. The owning node then removes its persistent reservation from the disk.

Superior scalability
Windows Server 2008 failover clusters can support more nodes than Windows Server 2003 clusters. Specifically, there can be up to 16 x64-based nodes in a single failover cluster, as opposed to the maximum of eight nodes in a Windows Server 2003 cluster. In addition to providing superior scalability, Windows Server 2008 failover clusters now support GPT disks (see footnote 4). A GPT disk offers the following benefits:
- Supports up to 128 primary partitions (while MBR disks can support up to four primary partitions, with any further volumes created as logical drives inside an extended partition)
- Supports volume sizes well over 2 TB (which is the limit for MBR disks)
- Provides greater reliability than MBR disks, due to replication and cyclic redundancy check (CRC) protection of the partition table
The combination of an increase in the possible number of nodes and support for GPT disks greatly enhances the scalability of a failover cluster deployment, particularly where larger volumes are required.
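The 2 TB limit quoted for MBR disks follows from simple arithmetic. The sketch below reproduces it under two background assumptions that are not stated in this paper: MBR partition entries store sector addresses in 32-bit fields, and the disk presents 512-byte logical sectors. The function names are illustrative only.

```python
# Background assumptions (not taken from this paper): 32-bit sector
# addressing in the MBR and 512-byte logical sectors.
MBR_ADDRESS_BITS = 32
SECTOR_BYTES = 512
TIB = 2 ** 40  # one binary terabyte


def mbr_max_bytes(sector_bytes=SECTOR_BYTES):
    """Largest capacity that a 32-bit sector count can describe."""
    return (2 ** MBR_ADDRESS_BITS) * sector_bytes


def needs_gpt(disk_bytes):
    """True when a disk is too large to be fully addressed through MBR."""
    return disk_bytes > mbr_max_bytes()


print(mbr_max_bytes() // TIB)  # 2
print(needs_gpt(3 * TIB))      # True
```

A shared LUN larger than this threshold would therefore have to be initialized as a GPT disk before it could be used in full as cluster storage.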
New quorum model
The quorum model has changed in Windows Server 2008, moving away from the concept of a shared disk containing the cluster configuration and some replicated files. This shared disk was a single point of failure (SPOF) in the cluster; if the quorum disk were to fail, cluster service would terminate. With Failover Clustering, the quorum model now truly encompasses a quorum (or consensus), which is achieved through a majority of votes from cluster resources. With Failover Clustering, the quorum disk is now known as a witness disk. Failover Clustering provides the following mechanisms to establish a quorum:

No Majority: Disk Only
No Majority mode is similar to the Windows Server 2003 shared disk quorum. Only the quorum disk (now known as the witness disk) gets a vote; the cluster remains up-and-running as long as one node can access the disk. Features of this mode include:
- Only the witness disk gets a vote; nodes cannot vote
- The witness disk is the master and a SPOF
- The cluster remains up-and-running as long as one node can access the witness disk
Footnote 4: A GPT disk uses the GUID partition table (GPT) disk partitioning system.
Node Majority
Similar to the Windows Server 2003 Majority Node Set model, Node Majority mode requires three or more nodes; there is no dependence on the availability of the witness disk. Node Majority should not be implemented on a cluster with an even number of nodes, since an even split of the nodes would leave neither side with a majority. Features of this mode include:
- Only nodes get votes
- No vote for shared storage
- Three or more nodes required (odd number of nodes)
- Majority of votes needed to operate cluster

Node and Disk Majority
The new Node and Disk Majority mode allows nodes and the witness disk to vote, with the cluster coming online if a majority of votes is reached. This mode is commonly used with two-node clusters. Features of this mode include:
- Based on a majority of nodes
- Witness disk can provide the deciding vote
- In a two-node cluster, there are three votes; the cluster can survive the loss of any one vote

Node and File Share Majority
The new Node and File Share Majority mode allows nodes and the witness (a file share rather than a disk) to vote, with the cluster coming online if a majority of votes is reached. This mode is an excellent solution for geographically dispersed, multi-site clusters. Features of this mode include:
- Quorum based on a majority of nodes and the witness
- Using a file-share witness supports the creation of two-node clusters with no shared disks
- Excellent solution for geographically-dispersed clusters
- A single file server could serve as witness for multiple clusters
- The file server could reside at a different site from any node

Improved security model
Several changes have been made to Failover Clustering that make it a more secure, reliable product. These changes include:
- Removing the requirement for a domain user account for the Cluster Service Account (CSA)
- Improving logging and event tracing
- Transitioning from insecure datagram-based remote procedure call (RPC) communications to TCP-based RPC communications
- Enabling Kerberos authentication by default on all cluster network name resources
- Auditing access to the cluster service (Clussvc.exe) through either the Failover Cluster Management snap-in (cluadmin.msc) or the cluster command-line interface (cluster.exe)
- Securing inter-cluster communications

New networking capabilities and more flexible dependencies
Failover Clustering includes a new networking model; major improvements to cluster networking include:
- Improved support for geographically-distributed networks
- The ability to place cluster nodes on different networks
- Use of a DHCP server to assign IP addresses to cluster interfaces
- An improved cluster heartbeat mechanism
- Support for IP version 6 (IPv6)

Summary
A key objective for the improvements delivered by Windows Server 2008 Failover Clustering is the creation of a new clustering paradigm that radically simplifies and streamlines cluster creation, configuration, and management. Cluster performance and flexibility have also been improved. For example, x64-based clusters can support up to 16 nodes; IP addresses for cluster nodes can be assigned by the DHCP server; and geographically-dispersed clusters can span subnets. Built around a more resilient and customizable quorum model, failover clusters are designed to work well with SANs, natively supporting the most commonly used SAN bus types.
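The vote counting behind the four quorum modes described in this section can be sketched in a few lines. This is an illustrative model only, assuming one vote per node and one vote per witness; the names QuorumState and has_quorum and the mode strings are hypothetical and do not correspond to any Microsoft API.

```python
from dataclasses import dataclass


@dataclass
class QuorumState:
    mode: str            # one of the four mode strings handled below
    total_nodes: int     # nodes configured in the cluster
    nodes_up: int        # nodes currently reachable
    witness_up: bool = False  # witness disk or file share reachable


def has_quorum(state):
    """Return True if the cluster may keep running in the given state."""
    if state.mode == "no_majority_disk_only":
        # Only the witness disk votes; one node must still reach it.
        return state.witness_up and state.nodes_up >= 1
    if state.mode == "node_majority":
        total_votes = state.total_nodes
        votes_present = state.nodes_up
    elif state.mode in ("node_and_disk_majority",
                        "node_and_file_share_majority"):
        # The witness contributes one extra vote.
        total_votes = state.total_nodes + 1
        votes_present = state.nodes_up + (1 if state.witness_up else 0)
    else:
        raise ValueError(f"unknown quorum mode: {state.mode}")
    # A strict majority of all configured votes is required.
    return votes_present > total_votes // 2


# Two-node cluster with a witness disk: three votes, any one may be lost.
print(has_quorum(QuorumState("node_and_disk_majority", 2, 1, True)))   # True
print(has_quorum(QuorumState("node_and_disk_majority", 2, 1, False)))  # False
```

The second call shows why a two-node cluster depends on its witness: once one node and the witness are both lost, the surviving node holds only one of three votes.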
Overview
The standard, SAP-supported Windows cluster configuration consists of a two-node cluster deploying the database and the SAP CI or, since NetWeaver 7.0(2004s), the SCS. Clustering two or more server systems follows the concept of coupling systems and services in parallel.
Note: SAP provides the necessary Microsoft cluster resource dynamic link libraries (DLLs) for Java- and ABAP-based SAP application servers.
If Failover Clustering detects an application or network failure, a failover is initiated and the application restarted on the other cluster node. This automated failover and application restart helps reduce unplanned system downtime and increase the overall availability of the system.
Note: Refer to Figure 2 for an example of a parallel-coupled system.
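The gain from coupling cluster nodes in parallel can be estimated with the standard reliability formula for redundant components. The sketch below assumes that node failures are independent and that failover itself takes no time, both of which are simplifications; the function name and the 99% figure are illustrative values, not measurements from this paper.

```python
def parallel_availability(availabilities):
    """Availability of a parallel-coupled system: the service is down
    only if every redundant component is down at the same time."""
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)
    return 1.0 - all_down


# Illustrative only: two cluster nodes, each assumed to be 99% available.
two_nodes = parallel_availability([0.99, 0.99])
print(f"{two_nodes:.4%}")  # 99.9900%
```

In practice the failover and application restart time reduces this figure, which is why the text speaks of reducing, rather than eliminating, unplanned downtime.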
To further increase system availability, either the SAP Enqueue service or the entire SAP system database can be replicated to the other node or to alternate local or remote storage. The remainder of this section is based on information provided in SAP's MSCS configuration guidelines, which, unless otherwise noted, also apply to Windows Server 2008 Failover Clustering [SAP02].
Figure 9. SPOFs of an SAP ABAP system based on SAP NetWeaver 2004 or older
Figure 10 shows a traditional SAP NetWeaver 2004 or older Windows cluster (up to kernel version 6.40). The SAP cluster group contains the following components, which are all SPOFs in an SAP NetWeaver 2004 system:
- Shared SAP binary cluster disk
- SAPMNT and SAPLOC shares
- SAP CI virtual host name and IP address
- SAP CI service
In addition to SAP, the database is also deployed on the cluster.
Figure 10. Traditional SAP Windows cluster (up to kernel version 6.40)
Note: Since SAP is only planning to release NW2004s or later products on Windows Server 2008, the solution shown in Figure 10 is not supported with Windows Server 2008.
As mentioned earlier, SAP changed the SAP kernel architecture with NetWeaver 7.0(2004s). Instead of having a single, large, monolithic SAP application server, where processes like Message, Enqueue, dialog, and update were bundled in one big service, NetWeaver 7.0(2004s) moved the SPOFs (Enqueue, Message, and Gateway services) into the System Central Services (SCS) instance.
Two SCS instances were provided, one for Java (known as the SCS) and one for ABAP (known as the ASCS). The benefits of clustering only the SCS instance rather than the full CI include faster failover times, reduced resource needs for nodes, and, since fewer services and dependencies need to be failed over, a more robust failover solution. Figure 11 shows a NetWeaver 7.0(2004s)-based SAP system with its SPOFs separated from its redundant components, such as work processes. The ASCS instance that must be protected via a Microsoft cluster is highlighted; in addition, other SPOFs (such as host name and IP address) are shown in red and must also be protected by the cluster.
Figure 11. SPOFs for an ABAP system based on SAP NetWeaver 7.0(2004s) or newer
Figure 12 shows the NetWeaver 7.0(2004s) Windows cluster (kernel version 7.0 or later), with an SAP cluster group that contains the following components, all of which are SPOFs:
- Shared SAP binary cluster disk
- SAPMNT share
- ASCS virtual host name and IP address
- ASCS instance
The database is also deployed on the cluster. In addition to the small, clustered ASCS instance, local application server instances (primary and secondary) are installed on the cluster nodes. With this cluster configuration, cluster nodes can be better utilized than in a CI-based cluster.
Figure 12. SAP Windows cluster (starting with kernel version 7.0)
Typically, the database instance uses significantly more resources than the ASCS instance. Thus, to reduce the total cost of ownership (TCO) of the cluster, SAP allows you to install additional local instances (shown as primary and secondary application servers in Figure 12) on the cluster nodes.
Note: When you perform an upgrade from a NetWeaver 2004 or older release to a NetWeaver 7.0(2004s) or newer release, the Microsoft cluster configuration is not automatically upgraded. Follow SAP OSS note 101190 to perform a manual upgrade.
Figure 13. SPOFs of a Java application stack based on NetWeaver 2004 or newer
Figure 14 shows a Java-based NetWeaver application server. The SAP cluster group contains the following components, which are all SPOFs in a Java-based NetWeaver application server:
- Shared SAP binary cluster disk
- SAPMNT share
- SAP Java SCS virtual host name and IP address
- SAP Java SCS instance
Locally-installed application server instances as well as CI application servers are installed on the cluster nodes. The database is also clustered.
Typically, the database instance uses significantly more resources than the SCS instance. Thus, to reduce the TCO of the cluster, SAP allows you to install additional local application servers, shown in Figure 14 as SAP Primary and Secondary AS (such as a Java CI with SDM on one node, a dialog instance on the other), on the cluster nodes.
- SAP binary disk and the dependent SAPMNT share with SAP profiles and log files
- SAP hostname and IP address
Depending on the configuration and customer needs, Java SDM and the SAP Gateway service could also become SPOFs. If it does become a SPOF, the SAP Gateway service should also be deployed on the cluster. However, since Java SDM is only needed when newly-developed software is being deployed, it is typically treated as non-critical. Figure 15 shows a NetWeaver 7.0(2004s) system, with components that must be protected by the Microsoft cluster shown in red.
Figure 16 shows a dual-stack NetWeaver 2004 system with locally-installed SAP application servers and the SAP database deployed in the cluster.
The configuration shown in Figure 16, which is the standard SAP cluster implemented by today's customers, does not provide protection for the Enqueue lock table. The following section describes how to protect this SPOF so that you can resume work after a cluster node failover.
is a SPOF, can be restarted quickly and easily, the contents of the lock table could be lost and all transactions that had locks reset.

Replication solution
Standalone Enqueue service is an enhancement to SAP's data-locking architecture. Enqueue service was normally integrated with the CI or system SCS instance as a work process; since Web Application Server release 6.20, however, Enqueue service has been separated and can run on multiple host systems. In Windows, a maximum of two nodes can run Enqueue service. Standalone Enqueue service can work with high-availability hardware and software to enhance failover protection. In this scenario, standalone Enqueue service replicates locking data from the primary Enqueue table to a separate backup host. In the event of a failure, a new Enqueue service is started on the backup host; normal processing continues without loss of data. Standalone Enqueue replication service does not, by itself, make up a high-availability solution; a Microsoft cluster and certified high-availability solutions from HP are also required. The benefits of standalone Enqueue service include:
- Preventing data loss or forced rollback in the case of sudden system or Enqueue service failure
- Safeguarding the normal processing of locked data after a system or Enqueue service failure
- Supporting the faster restart of a stopped system's CI services if the Enqueue service restart is decoupled from central services
- Adding flexibility to the system landscape layout and network management of an SAP system by allowing the Enqueue service to run on a separate host from the CI
Figure 17 shows Enqueue lock table replication to another cluster node.
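The replication idea described above, in which every lock change is copied to a backup host so that a replacement Enqueue service can start from the replicated table, can be sketched as follows. This is a toy model under simplifying assumptions (synchronous replication, a single lock owner per object); the class and function names are hypothetical and do not reflect SAP's actual implementation.

```python
class ReplicationServer:
    """Holds a copy of the lock table on the backup host."""

    def __init__(self):
        self.copy = {}


class EnqueueServer:
    """Grants exclusive locks and mirrors each change to the replica."""

    def __init__(self, replica=None):
        self.locks = {}         # locked object -> owner
        self.replica = replica  # optional ReplicationServer on another node

    def acquire(self, obj, owner):
        if obj in self.locks:
            return False        # already locked by someone
        self.locks[obj] = owner
        if self.replica is not None:
            self.replica.copy[obj] = owner
        return True

    def release(self, obj, owner):
        if self.locks.get(obj) != owner:
            return False        # only the owner may release
        del self.locks[obj]
        if self.replica is not None:
            self.replica.copy.pop(obj, None)
        return True


def fail_over(replica):
    """Start a new Enqueue service on the backup host from the replica."""
    new_server = EnqueueServer(replica=replica)
    new_server.locks = dict(replica.copy)
    return new_server


replica = ReplicationServer()
primary = EnqueueServer(replica)
primary.acquire("order 4711", "user A")
# The primary node fails; the lock table survives on the backup host.
standby = fail_over(replica)
print(standby.acquire("order 4711", "user B"))  # False
```

The final call is refused because user A's lock was carried over, so the open transaction does not have to be reset after the failover.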
Separate Enqueue services are required if you run ABAP and Java applications in parallel on a single web application server.
Standalone Enqueue replication service is generally available for SAP Web Application Server 6.40 and later; however, according to OSS note 524816, this service can also be used with older SAP releases like 4.6D.
Note: See SAP Notes 524816 and 804078 for more information about availability and usage limitations.
Failover Clustering provides a new networking model that significantly improves support for geographically-dispersed networks; for example, you can now place cluster nodes on different networks. For more information, refer to the New quorum model section. The storage architecture of a geographically-dispersed cluster requires an arbitration mechanism to ensure that there is only a single persistent disk through which cluster information is communicated. HP StorageWorks Enterprise Virtual Array (EVA) and XP storage systems provide such mechanisms and can be used to build geographically-dispersed configurations. SAP supports the deployment of a single SAP instance on two Microsoft nodes (possible owners) in a Node Majority cluster that includes a file share witness; other configurations are handled as multiple system identifier (multi-SID) configurations. The following section provides more information on SAP support restrictions.
Note: For information on support for particular distances between datacenter sites, contact your local HP SAP Competency Center. For information on geographically-dispersed clusters, refer to the Microsoft white paper, Windows Server 2008 Multi-Site Clustering [MST11].
Before starting to plan a Windows Server 2008 SAP server environment, refer to SAP's latest release and support schedule for Windows Server 2008 (SAP PAM). As of August 2008, you cannot deploy Windows Server 2008 in productive SAP environments due to a lack of support for Java 1.4.2. SAP plans to support Windows Server 2008 in Q1 2009. As of August 2008, the only SAP application supported on Windows Server 2008 is NetWeaver 7.0 SR3 [SAP05].
In the past, SAP only supported two-node homogeneous cluster configurations deploying a single SAP instance. Now, in cooperation with HP and other hardware vendors, SAP supports a much broader range of configurations [SAP02]. Support for the original, very basic configuration was mainly restricted by the following issues:
- Limitations within Microsoft Cluster Service
- 32-bit memory limitations within the hardware and operating system
- Monolithic design of the SAP CI
SAP is now able to support alternate cluster configurations thanks to 64-bit-ready operating systems and hardware, the richer feature set offered by Failover Clustering, and the redesign of ABAP and Java SCS. As a result, SAP and, therefore, HP can support the following configuration options for a high-availability Microsoft failover cluster:
- A single SAP system in a single Microsoft failover cluster
- A single SAP system in two Microsoft failover clusters, with SAP components deployed on one cluster and the database instance on the other
Note: In addition, database vendors offer high-availability solutions or techniques that do not use Microsoft clusters, including standby databases, shadow databases, mirror databases, and Oracle Real Application Clusters (RAC). Such solutions are beyond the scope of this document.
- Multiple SAP systems (multi-SID) in one or more Microsoft failover cluster(s), each cluster with two or more nodes
Note: Currently, the multi-SID option is only supported if your SAP system runs on Windows with an Oracle, Microsoft SQL Server, or IBM DB2 Universal Database database.
- SAP system(s) in one or more Microsoft failover cluster(s), with the database instance installed outside the cluster(s)
Restrictions
In general, the ASCS or SCS instance must be installed and configured to run on two nodes in one cluster. With the appropriate database support, you can install the database on more than two nodes in one cluster. If you use SAPinst to deploy one of the following, SAP supports its installation, configuration, and operation:
- A single failover cluster with two nodes
- Two failover clusters, each with two nodes
However, the deployment of a more complex system that includes one or more failover cluster(s), each with two or more nodes, requires in-depth knowledge of the Windows operating system, Failover Clustering, and the sizing and clustering of an SAP system. As a result, to qualify for SAP support, the sizing, installation, and configuration of such a system must be performed by an appropriate SAP Global Technology Partner.
Note: Multi-SID sizing, installation, and configuration must be performed by an SAP Global Technology Partner, such as HP, with the ability to manage the problems that arise from such a complex deployment.
SAP provides the cluster DLLs needed to install SAP within a homogeneous cluster system. Since R/3 itself is not affected by these extensions, any existing SAP R/3 (version 3.1I or later) or ERP 5.0 system on Windows NT, Windows 2000, Windows Server 2003, or Windows Server 2008 can easily be upgraded to a clustered installation. The following resource DLLs are provided:
- saprc.dll: provides the start and stop functions and checks status
- saprcex.dll: allows the cluster administrator to manage the SAP cluster resource
Cluster extensions and SAP software are available in 32- and 64-bit versions, which must not be mixed.

Cluster resource group
A special cluster resource group is created after SAP has been configured to run within a Microsoft cluster. This cluster group contains the following SAP resources that are needed by the SAP CI:
- Physical disk where SAP binaries are stored
- IP address and network name
- SAP shares sapmnt and saploc
- The SAP service itself
In addition to the clustered SAP ABAP CI, a highly-available SAP system requires multiple SAP application server systems to provide support for clients. These application servers are combined into an SAP logon group.

Cluster dependencies
Cluster resources have dependencies, ensuring, for example, that SAP shares can only be created when the physical disk is online and running. Figure 18 provides an overview of the dependency tree for the cluster resource group associated with SAP kernel 4.6x and 6.x0. At that time, SAP cluster implementations used hard-coded names for SAP shares and cluster resources; moreover, the entire SAP CI and all its processes (including dialog, batch, update, and spool) were clustered. As a result, it was impossible to support more than a single SAP instance within the cluster.
Figure 18. Resource dependencies for an SAP cluster group (SAP kernel 4.6x and 6.x0)
Solid lines shown in Figure 18 denote direct dependencies; dotted lines denote indirect dependencies. Thus, for example, the SAP service could not come online if the shares were offline; the file shares could not come online if the physical disk were offline. To address this single-instance limitation, HP formerly offered the HP Competent Cluster Service (HP CCS), which is no longer available. With kernel 7.x, SAP introduced a new clustering concept that makes it possible to install multiple SAP instances within a single cluster. Now, only the SAP SCS is clustered, and unique resource names are used. Figure 19 illustrates the new cluster resource dependency tree.
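The dependency rule described for Figure 18 (the SAP service needs the shares, and the shares need the physical disk) can be illustrated with a small sketch. This is not how the cluster service is implemented; the resource names are placeholders, and the IP address and network name edges are assumptions added for illustration, since the text only states the disk, share, and service relationships.

```python
from graphlib import TopologicalSorter  # Python 3.9 or later

# Each resource maps to the set of resources it depends on.
# disk -> shares -> SAP service comes from the text; the IP address and
# network name edges are ASSUMPTIONS made for this illustration.
DEPENDENCIES = {
    "physical_disk": set(),
    "ip_address": set(),
    "network_name": {"ip_address"},
    "sap_shares": {"physical_disk"},
    "sap_service": {"sap_shares", "network_name"},
}


def online_order(dependencies):
    """Return an order in which every resource follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def can_come_online(resource, online, dependencies=DEPENDENCIES):
    """A resource may start only when all of its dependencies are online."""
    return dependencies[resource] <= set(online)


order = online_order(DEPENDENCIES)
print(order)
```

Any order returned by `online_order` brings the physical disk online before the shares and the shares before the SAP service, which mirrors the behavior described for the dependency tree.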
Figure 19. Resource dependencies for an SAP cluster group (SAP kernel 7.x)
As shown in Figure 19, there is only a single SAP share in the cluster; local shares are no longer part of the cluster. Through the use of unique resource names and a junction-based share, it is now possible to install multiple SAP instances within a single cluster. In addition, clustering only the SCS rather than the entire CI frees up the resources needed to implement a multi-SID cluster.
Please refer to the SAP MSCS Configuration and Support Information for SAP NetWeaver 04 and SAP NetWeaver 7.0(2004s) Systems guide [SAP02] for more information on SAP Multi-SID.
Sample configurations
This section presents rules of thumb for single- and multi-SID configurations and summarizes failover cluster configurations that are supported by SAP and by HP and other hardware vendors.
Rules of thumb: Single-SID
The following rules of thumb apply to supported single-SID configurations:
• All SAP releases can be installed in single-SID configurations
• To support the latest approach in SAP system clustering, use the latest NetWeaver 7.0(2004s) release to cluster only the SCS or ASCS; earlier releases cluster the full SAP CI and, optionally, the SCS
• With the exception of NetWeaver 7.0(2004s), additional SAP application server systems cannot be installed within the cluster
• You can install the database within or outside the cluster
• Always install Enqueue Replication Service (ERS) as a default option
Rules of thumb: Multi-SID
• Only use NetWeaver 2004 or newer releases for a multi-SID configuration
• With NetWeaver 2004, a single instance per cluster is supported
• With NetWeaver 7.0(2004s), multiple instances per cluster are supported
• NetWeaver 2004 and NetWeaver 7.0(2004s) can be mixed within a single cluster
Key restriction
Cluster configuration options are only supported with a two-node cluster. While a cluster of three or more nodes is possible, it would not be supported by the SAP installation utility. As a result, you can only use two nodes to protect an SAP instance.
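The multi-SID rules of thumb and the two-node restriction lend themselves to a simple planning check. The sketch below is one possible reading of those rules, not an SAP or HP tool: the function name, the release strings, and the SIDs in the usage note are hypothetical, and the rule "at most one NetWeaver 2004 instance per cluster" is an interpretation of the text.

```python
# Releases the text allows in a multi-SID configuration (labels are assumptions).
SUPPORTED_MULTI_SID_RELEASES = {"NetWeaver 2004", "NetWeaver 7.0"}


def check_multi_sid_cluster(node_count, instances):
    """Return a list of rule violations for a planned multi-SID cluster.

    instances is a list of (sid, release) tuples.
    """
    problems = []
    if node_count != 2:
        problems.append("only two-node clusters are covered by the SAP installation utility")
    for sid, release in instances:
        if release not in SUPPORTED_MULTI_SID_RELEASES:
            problems.append(f"{sid}: {release} is not supported for multi-SID")
    # Interpretation: NetWeaver 2004 allows a single instance per cluster.
    nw2004 = [sid for sid, release in instances if release == "NetWeaver 2004"]
    if len(nw2004) > 1:
        problems.append("more than one NetWeaver 2004 instance in the cluster")
    return problems
```

For example, `check_multi_sid_cluster(2, [("PRD", "NetWeaver 7.0"), ("QAS", "NetWeaver 7.0")])` returns an empty list, while a three-node plan or a pre-2004 release produces a violation message.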
Cluster configuration options
To help you plan a configuration that can satisfy particular customer requirements, the cluster configurations presented in this section highlight a range of options for single- and dual-stack SAP systems. The SAP databases shown in these configurations can be deployed within or outside the cluster (footnote 7); there is no special SAP requirement. However, if the database is deployed within the cluster, it must support Failover Clustering.
Note: In 2008 SAP renamed NetWeaver 2004s to NetWeaver 7.0.
Single-stack SAP cluster, prior to NetWeaver 7.0(2004s)
Figure 20 shows the SAP cluster configuration that has typically been used since SAP release 3.1i, with a full SAP CI and database running within the same cluster. The SAP and database applications are only active on one node; the other is a standby node with idle resources.
Since the SAP CI includes full SAP application server capability, no additional SAP application server system can be supported within the cluster. Any additional application servers must be installed outside the cluster.
Footnote 7: SAP documentation refers to a configuration with the database deployed on a separate cluster as an SAP system that is distributed over multiple clusters.
Dual-stack SAP cluster, prior to NetWeaver 7.0(2004s)
Since SAP Web Application Server 6.30/6.40, you have been able to install Java as an optional solution stack. Figure 21 shows a NetWeaver 2004 system (kernel 6.40) with the optional Java SCS installed; both stacks are protected via the cluster service and Enqueue replication.
Figure 21. SAP cluster prior to NetWeaver 7.0(2004s) with optional Java SCS installed
Dual-stack SAP cluster, NetWeaver 7.0(2004s) or later, deploying application servers within the cluster
Systems based on NetWeaver 7.0(2004s) (SAP kernel 7.0) no longer have a CI. Rather than having all SAP processes, even those that are not unique, deployed on a large application server, NetWeaver 7.0(2004s) only provides the System Central Services (SCS), which are the Message and Enqueue services. As a result, you can now install SAP application servers on both cluster nodes, better utilizing cluster server hardware resources; moreover, the small size of the clustered SCS helps improve failover times. In addition to the application servers installed within the cluster, you can also install external application servers. A dual-stack cluster is shown in Figure 22. NetWeaver 7.0(2004s) or later also supports single-stack systems; the installation of a second stack is optional and depends on the particular customer's needs.
= active after failover; blue denotes ABAP, red denotes Java
PAS = Primary application server; AAS = Additional application server
Dual-stack SAP cluster, NetWeaver 7.0(2004s) or later, deploying application servers outside the cluster
Figure 23 shows a NetWeaver 7.0(2004s) system in which neither the required primary application server nor the optional additional SAP application servers run within the cluster. This configuration makes it easy to consolidate several NetWeaver 7.0(2004s) systems within a single cluster. Single-stack systems are also possible; installing the second stack is optional and depends on the customer's needs.
Figure 23. Dual-stack SAP cluster based on NetWeaver 7.0(2004s), with application servers outside the cluster
The configuration in Figure 23 is not supported on NetWeaver 2004 or earlier. Only NetWeaver 7.0(2004s) or later provides a separate SCS; in these releases, the SCS is clustered instead of the application server.
When planning an SAP solution, it is important to understand that SAP only supports certain database releases and versions for use with kernels such as SAP 4.6x or 7.0. For a complete list of supported databases, OS versions, and SAP releases, refer to OSS note Availability of R/3 on Microsoft Cluster Server, number 106275.
Note: SAP releases do not support 32-bit versions of Windows Server 2008.
Note: Failover cluster support for the database requires special software tools and extensions that are provided and maintained by the particular database vendor and not HP or SAP.
Failover clusters provide protection for the following database components:
• Network access (hostnames and IP addresses for all cluster nodes)
• Database software instance
• Database data and log files
Failover cluster support can reduce the downtime associated with software or hardware failures by automatically restarting the database on the other cluster node. The failover and restart process is handled by the cluster service and requires no intervention by IT staff. After failover, however, a database crash recovery must be performed. The database performs this crash recovery automatically; in rare cases, manual intervention may be required. To ensure consistency, all non-committed, open transactions must be rolled back, creating the risk of data loss. This loss can be minimized by replicating the SAP Enqueue lock table.
A failover cluster has the following limitations:
• No lock-step fault tolerance as with HP NonStop systems (though, to some extent, replicating the SAP Enqueue service helps overcome this limitation)
• Unable to move running applications
• Unable to recover a state that is shared between client and server (such as a database transaction)
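The crash recovery step described above (committed work is kept, open transactions are rolled back) can be pictured with a deliberately simplified model. This is a sketch under stated assumptions, not any database vendor's recovery algorithm; the log format and transaction names are hypothetical.

```python
def crash_recover(log):
    """Replay a transaction log after a failover.

    log is a list of tuples: (txn, "write", key, value) or (txn, "commit").
    Returns the recovered data and the set of rolled-back transactions.
    """
    committed = {txn for txn, action, *_ in log if action == "commit"}
    data = {}
    rolled_back = set()
    for txn, action, *rest in log:
        if action != "write":
            continue
        if txn in committed:
            key, value = rest
            data[key] = value      # redo committed work
        else:
            rolled_back.add(txn)   # open transaction: its writes are discarded
    return data, rolled_back


# T1 committed before the failure; T2 was still open when the node failed.
example_log = [
    ("T1", "write", "order-1", "booked"),
    ("T1", "commit"),
    ("T2", "write", "order-2", "booked"),
]
recovered, lost = crash_recover(example_log)
print(recovered, lost)
```

In this model, the work of T2 is what the text refers to as the risk of data loss: it was in flight at the time of the failure and has to be resubmitted by the application or the user.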
Note: A database failover cluster provides automatic failure recovery and helps to reduce unplanned downtime. It provides no protection against data loss or corruption.
Database backup
Protecting systems and their applications via Failover Clustering helps the applications come back online in minutes instead of hours or even days following a failure event. However, hardware or software failures are not the only risks to be addressed in the SAP landscape; for example, your data may become completely corrupted. Indeed, viruses or users that change, delete, or otherwise abuse data continue to be the biggest threats to your data. As a result, performing daily backups to safeguard your data is still the only reliable method for data protection.
Database backup solutions range from the traditional, using a directly-attached data backup device like a simple or virtual tape solution, to two-stage backup and recovery solutions that include data replication. For businesses that cannot afford downtime and need a backup solution that goes beyond the traditional, HP offers HP StorageWorks Business Continuity Solutions for SAP. For more information on these business continuity solutions, refer to the HP-specific high-availability solutions for SAP section.
Database replication
In addition to conventional online and offline backups, most databases support the replication of content to a standby database server. Do not confuse this standby server, which contains a second copy of the database, with a standby database node within a cluster configuration; only a single copy of the database exists on the shared cluster storage, making this storage a SPOF.
A database replication solution provides functionality for copying and distributing data and database objects between databases and then synchronizing the databases to maintain consistency. If desired, the database data may be distributed between different geographical locations using local or wide area networks.
Databases like SQL Server support multiple replication solutions: data mirroring focuses on availability and fast recovery, whereas log shipping focuses more on near-line data backup, integrity, and point-in-time data recovery. These solutions are very similar; for example, both replicate transactions. However, the key differences are the amount of data replicated (log file or entire transaction) and the replication time interval.
Database log shipping
A log shipping solution initially creates a standby database based on either a database clone, which is an exact copy of the database at the storage level, or a backup copy of the primary database. The solution automatically maintains this standby database as a transactionally-consistent copy of the primary database by transmitting primary database transaction/redo log data to the standby system and then applying these logs, a process known as database log shipping. Logs are typically backed up and shipped to the standby system every five minutes. For business-critical systems, this interval can be reduced to one minute.
A short log backup and transfer interval ensures that data can nearly be recovered to the point-in-time at which the disaster occurred.
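The effect of the shipping interval on potential data loss can be made concrete with a little arithmetic. The sketch below assumes that, in the worst case, everything written since the last shipped log is lost, plus whatever was written while that log was in transit; the transaction rate of 50 per second is an invented figure for illustration only.

```python
def worst_case_loss_seconds(ship_interval_s, transfer_s=0.0):
    """Upper bound on the window of transactions not yet on the standby."""
    return ship_interval_s + transfer_s


def transactions_at_risk(ship_interval_s, tx_per_second, transfer_s=0.0):
    """Number of transactions that may be lost if the primary site fails."""
    return worst_case_loss_seconds(ship_interval_s, transfer_s) * tx_per_second


# Five-minute versus one-minute shipping interval at an assumed 50 tx/s.
five_minute = transactions_at_risk(5 * 60, tx_per_second=50)
one_minute = transactions_at_risk(1 * 60, tx_per_second=50)
print(five_minute, one_minute)
```

Under these assumptions, shortening the interval from five minutes to one minute reduces the exposure by a factor of five, at the cost of more frequent log backups and transfers.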
A network or server failure can impact a log being shipped to the standby system. In the event of a disaster, a complete log file may be lost and cannot be recovered, resulting in data loss. To minimize this risk, you should replicate data via SANs rather than slow, sometimes unreliable network links.
Different methods for creating and maintaining the standby database are used by different vendors. For example, there are two methods for applying redo data to the standby database and keeping it consistent with transactions stored on the primary database. These methods correspond to the following types of standby databases supported by different vendors:
• Redo apply for physical standby databases: A physical standby database provides a physically-identical copy of the primary database, with on-disk database structures that are identical to the primary database on a block-for-block basis. The database schema, including indexes, is the same. Redo apply technology applies redo data on the physical standby database.
• SQL apply for logical standby databases: While a logical standby database contains the same logical information as the source database, the physical organization and structure of the data may be different. SQL apply technology keeps the logical standby database synchronized with the primary database by transforming data in the redo logs shipped from the primary database into SQL statements and then executing the SQL statements on the standby database. Since the logical standby database can be accessed for query and reporting purposes at the same time as the SQL statements are being applied to it, this database can be used concurrently for data protection and reporting.
Redo data can be applied synchronously or asynchronously. When applied synchronously, redo data is applied as soon as the entire log file has arrived at the standby server (auto apply), ensuring that the standby database is nearly identical to the production database. However, with synchronous application, any data inconsistency is repeated in the standby database. When logs are applied asynchronously, they are applied after a user-configurable delay, the length of which depends on the particular customer's backup strategy. Due to this delay, the standby database may be in an older state than the production database; on the other hand, the delay can help failure recovery by preventing the application of inconsistent data. Thus, asynchronous database log application is often featured in sophisticated backup and recovery solutions.
Database mirroring
Database mirroring can be used with a redundant database server to enhance availability in much the same way as server clustering. Mirroring also provides a redundant data store, which can protect your data in much the same way as a log shipping environment. Thus, database mirroring combines the benefits of server and storage redundancy.
Note: A database mirror can be created using either hardware or software. Microsoft SQL Server 2005 or later, for example, can mirror changing database transactions to a secondary database server.
Unlike log shipping, a database mirroring solution transfers entire transactions to the secondary database system rather than logs; as a result, both systems contain the same content. As shown in Figure 25, data submitted by the SAP application server is first stored in the local database transaction log, before being replicated to the secondary server. Once the secondary server has stored the data in its transaction log, it sends an acknowledgement to the application server. There are two ways to replicate data: synchronously and asynchronously. In an SAP environment, synchronous replication is preferred; if a failure were to occur during asynchronous replication, data loss may result. Asynchronous mode is typically used in installations where the latency involved in returning a transaction-complete acknowledgement is excessive.
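The difference between the two replication modes comes down to when the acknowledgement is returned. The toy model below is not SQL Server's mirroring protocol; the class and method names are hypothetical. It only shows why a primary failure in asynchronous mode can lose transactions that were already acknowledged.

```python
class MirroredDatabase:
    """Toy model of a primary and secondary transaction log."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary_log = []
        self.secondary_log = []
        self._pending = []  # acknowledged on the primary, not yet replicated

    def commit(self, txn):
        self.primary_log.append(txn)
        if self.synchronous:
            # Synchronous: the secondary stores the transaction before the ack.
            self.secondary_log.append(txn)
        else:
            # Asynchronous: acknowledge now, replicate later.
            self._pending.append(txn)
        return "ack"

    def replicate_pending(self):
        """Background replication step used in asynchronous mode."""
        self.secondary_log.extend(self._pending)
        self._pending.clear()

    def fail_primary(self):
        """Return acknowledged transactions that are missing on the secondary."""
        return [t for t in self.primary_log if t not in self.secondary_log]


sync_db = MirroredDatabase(synchronous=True)
for t in ("t1", "t2", "t3"):
    sync_db.commit(t)

async_db = MirroredDatabase(synchronous=False)
async_db.commit("t1")
async_db.commit("t2")
async_db.replicate_pending()
async_db.commit("t3")  # the primary fails before this one is replicated

print(sync_db.fail_primary(), async_db.fail_primary())
```

The synchronous instance loses nothing, at the cost of waiting for the secondary on every commit; the asynchronous instance returns faster but loses the transaction that was still pending.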
A benefit of database mirroring or log shipping rather than hardware-based replication is that you can deploy different types of storage on the primary and standby databases.
You can build a very reliable, disaster-tolerant database solution based on the mirroring capabilities of SQL Server 2005 or later. Nevertheless, it is recommended that, in addition to mirroring, you should consider deploying a log-shipping database or, at the very least, implementing a normal tape backup strategy. Without a log-shipping database or tape backup, inconsistencies that occur during a failover cannot be resolved.
Database replication at the hardware level
Rather than using software to replicate a database on a standby database host, you can also use an HP storage subsystem to build a highly reliable, scalable data protection solution. In an HP hardware-based solution, the host on the primary site (the initiator) sends all I/O requests to local storage; HP StorageWorks Continuous Access-enabled disk arrays then synchronously replicate all write I/Os to a remote storage system (the target). After the target has received the replicated data, it sends an acknowledgement to the initiator. The hardware-based replication process is transparent to upper-level applications like the operating system or the database and can span local as well as continental distances. Figure 26 presents a hardware-based replication solution.
By only replicating SAP database transaction and redo log information via HP StorageWorks Continuous Access and a standby database mechanism, you can implement disaster tolerance up to the point-in-time at which the latest transactional database update occurred. An installation that only replicates logs requires less bandwidth and allows database changes to be applied after a timed delay at the target site, thus protecting the standby database from data inconsistencies and corruption.
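The protective effect of the timed delay can be sketched as follows. This is an illustrative model rather than the Oracle or SQL Server standby mechanism: the function name, the log names, and the `is_consistent` check are hypothetical, and times are plain seconds.

```python
def apply_due_logs(received, now, delay_s, is_consistent=lambda log: True):
    """Decide which shipped logs the standby may apply.

    received is a list of (arrival_time_s, log_name) in arrival order.
    A log is applied only after it has waited delay_s seconds. Applying
    stops at the first log that fails the consistency check, so the
    problem does not reach the standby database.
    Returns (applied, held).
    """
    applied, held = [], []
    blocked = False
    for arrival, name in received:
        if blocked or now - arrival < delay_s:
            held.append(name)
        elif not is_consistent(name):
            blocked = True
            held.append(name)
        else:
            applied.append(name)
    return applied, held


logs = [(0, "log1"), (300, "log2"), (600, "log3")]

# One-hour delay, evaluated 3900 seconds after the first log arrived.
normal = apply_due_logs(logs, now=3900, delay_s=3600)

# Same situation, but log2 is found to contain a corruption during the delay.
protected = apply_due_logs(logs, now=3900, delay_s=3600,
                           is_consistent=lambda name: name != "log2")
print(normal, protected)
```

The delay gives administrators a window in which a faulty log can be detected and withheld; the price is that the standby lags behind the production database by at least the configured delay.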
Hardware-based replication can help you to provide the highest levels of data protection while maximizing replication performance.
Summary
Since the level of data protection required varies based on organization size and industry, HP offers products, technologies, and services designed to provide the protection and recovery capabilities you need. This section summarizes a range of options.
Data protection and recovery storage solutions
You can regain access to the data, hardware, and software needed to resume critical business operations after a planned or unplanned disruption by implementing backup and restore capabilities with virtual tape and local replication. Options include:
• Traditional backup: You can back up and recover data at the speed your business demands
• Two-stage backup: Two-stage backups (disk-to-disk and disk-to-tape) are faster since the first stage is performed to disk
• Local replication: With local replication, you can seamlessly protect and recover data
Disaster-tolerant storage solutions
You can maintain access to the data, hardware, software, and services needed for normal business operations and mitigate the impact of a disaster and other forms of downtime using array-based data replication and remote mirroring software, plus operating system-specific clustering solutions. Options include:
• Remote replication: You can protect data, reduce downtime, and transfer data seamlessly between multiple sites
• Storage and server clustering: You can depend on zero downtime for better business results
Overview
Figure 27 provides an overview of different data protection solutions, comparing their objectives and costs.
Selecting the most suitable data protection and recovery solution for your business strongly depends on your vulnerability to system- and data-loss, and on the cost of the desired solution. For more information on selecting the right solution, visit the HP web page, Business Continuity & Availability Storage [HPQ03].
HP server offerings
HP offers the following server model lines:
• HP ProLiant ML: The ML line is optimized for the use of internal storage
• HP ProLiant DL: The DL line is density-optimized, supporting fewer internal hard drives than the ML line
• HP ProLiant BL: The BL line has been designed for HP BladeSystem environments
• HP Integrity: HP Integrity servers are highly scalable, with an extensive management portfolio designed for enterprise customers. There are blade models, entry-class models (1 to 4 processors), mid-range models (8 to 32 processors), and high-end models (up to 128 processors)
Since clusters require an external shared storage system, servers used as cluster nodes must typically provide Fibre Channel or, more recently, iSCSI support to provide a connection to the storage subsystem. Because ML servers are built around larger internal storage capacity and have a larger physical size, HP does not recommend using servers from the ML line as cluster nodes.
Some HP ProLiant server models are equipped with Intel Xeon processors, others with AMD Opteron processors; HP Integrity servers are equipped with Itanium processors. All of today's HP servers support 64-bit computing, which is a requirement for Windows Server 2008 Failover Clustering. HP ProLiant servers offer either Intel 64 Architecture or the AMD64 platform; the HP Integrity server line supports the Itanium 64-bit architecture.
Note: Mixing different CPU architectures in a single cluster is neither supported nor recommended by HP.
SAP only supports its standard cluster configuration and, with hardware partners, multi-SID cluster configurations as outlined in the SAP support for failover cluster configurations section. Deploying a standard, homogeneous SAP cluster may be impractical and very expensive (footnote 9), particularly if you are using larger HP Integrity Superdome servers with more than 16 processors. However, the combined HP and SAP support for multi-SID configurations allows you to better utilize such servers.
The ProLiant Server Support Matrix for Windows provides a good overview of the HP Windows supported server systems [HPQ17].
Footnote 9: Sizing a node that is able to run all the applications may mean over-sizing.
What is certified?
Information on server systems certified for SAP applications on Microsoft clusters can be found on the appropriate SAP, Microsoft, and HP web pages. Figure 28 shows the SAP hardware certification page hosted by AddOn; to view a list of certified HP hardware, click on Vendor of Certified Hardware and select Hewlett Packard.
Clustering on Windows Server 2008 [MST12]
With Windows Server 2008, Microsoft released the new Failover Cluster Configuration Program (FCCP). Unlike the old Microsoft Cluster Hardware Compatibility List (HCL), under which configurations were listed in the Windows Server Catalog, Windows Server 2008 program partners list complete cluster configurations that they have tested and validated for Windows Server 2008 Failover Clustering on their own websites. In addition, all hardware components that comprise a cluster configuration need to earn "Certified for" or "Supported" on Windows Server 2008 designations and will be listed in the Windows Server Catalog.
When you build a cluster, all used hardware and software components must meet the qualifications to receive a Certified for Windows Server 2008 logo and the fully configured solution must pass the Validate test in the Failover Clusters Management snap-in.
For more details about the Microsoft support policy for Windows Server 2008 Failover Clusters see the Microsoft Knowledge Base article 943984 [MST14]. Figure 29 shows the Microsoft Cluster Validation utility. This utility decides if your configuration will be supported or not.
Figure 30 shows Microsoft's Windows Server Catalog. Click on Hardware, then on Cluster Solutions. Enter HP in the sort field to get a list of certified HP server and storage solutions.
Up-to-date lists of HP servers and storage supported for Windows clusters are provided on the following web pages:
• HP ProLiant: ProLiant Clusters [HPQ05]
• HP Integrity: Windows on HP Integrity servers [HPQ06]
Before starting to plan a Windows Server 2008 SAP server environment, check SAP's latest release and support schedule for Windows Server 2008 (SAP PAM). As of August 2008, you cannot use Windows Server 2008 in productive SAP environments due to the lack of Java 1.4.2 support. SAP plans to support Windows Server 2008 in Q1 2009 with the release of NetWeaver 7.0 SR4.
XP disk arrays
• High-end storage for 24x7 business continuity demands
• Massive consolidation for greater efficiency
• Virtualization platform for internal and external data
EVA disk arrays
• Powerfully simple, enterprise-class storage
• Affordable, virtualized storage with a low entry price and low TCO
• Reliable and available
MSA disk arrays
• Flexibility to start small, then migrate drives and enclosures into larger configurations
• Increase server capacity
• Modular design enables expansion
NAS systems
• Easy-to-use industry-standard file and print solutions
• High-performance clustered NAS with no SPOFs
• Increase file serving performance, lower costs, and centralize management
When selecting a storage solution, ensure that the functionality you need, such as replication, is supported, and perform an appropriate sizing. Note that support for entry-level product lines such as NAS or MSA storage is limited. For example, as of August 2008, the only MSA products supported for Windows clusters are MSA2000 models.
EVA and XP disk arrays are supported on Windows clusters and provide the most complete feature sets for high availability solutions based on Failover Clustering.
HP Data Protector ZDB (as shown in Figure 31) momentarily places the SAP database in backup mode while the storage array creates an identical copy of the production data, either a split-mirror or a disk-to-disk snapshot. This process is very fast, allowing SAP user access to quickly return to normal. HP Data Protector then backs up the copied disk image to tape, disk, or a virtual tape library.
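The ordering of the steps is what keeps the backup window short: the database stays in backup mode only while the array creates the copy, and the slow transfer to backup media happens afterwards. The sketch below illustrates that ordering only; it does not use the HP Data Protector API, and every function name, the SID "PRD", and the callables are hypothetical stand-ins.

```python
from contextlib import contextmanager

events = []  # records the order of operations for illustration


@contextmanager
def backup_mode(database):
    """Keep the database in backup mode only for the duration of the block."""
    events.append(f"{database}: begin backup mode")
    try:
        yield
    finally:
        # Backup mode is always ended, even if the snapshot step fails.
        events.append(f"{database}: end backup mode")


def zero_downtime_backup(database, create_snapshot, copy_to_media):
    with backup_mode(database):
        snapshot = create_snapshot(database)  # fast, array-side copy
    copy_to_media(snapshot)                   # slow step, database already normal
    return snapshot


def fake_snapshot(database):
    events.append(f"{database}: snapshot created")
    return f"{database}-snapshot"


def fake_copy(snapshot):
    events.append(f"{snapshot}: copied to tape")


zero_downtime_backup("PRD", fake_snapshot, fake_copy)
print(events)
```

Using a context manager here is a design choice for the sketch: it guarantees that the database leaves backup mode even when snapshot creation raises an error.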
HP Data Protector Instant Recovery (as shown in Figure 32) can reduce application downtime after a failure event. This feature rapidly restores backup data from disk, tape, or virtual tape, allowing a production SAP database to be recovered in minutes rather than hours, even in a large SAP environment. Applying transaction logs then brings the SAP database back to an application-consistent state.
In addition to the ZDB feature, HP Data Protector provides backup agents for all important databases used in SAP environments so that online or offline backups to tape, virtual tape, or the network can be performed. Figure 33 shows the HP Data Protector SAP R/3 backup concept for Oracle [HPQ09].
Figure 33. HP Data Protector SAP R/3 backup concept for Oracle [HPQ09]
Legend:
SM: The HP Data Protector Session Manager, which is the HP Data Protector Backup Session Manager during backup or the HP Data Protector Restore Session Manager during restore
Database Library: The interface between SAP R/3 Server processes and HP Data Protector
IDB: An internal database (IDB) that stores information about HP Data Protector sessions (such as session messages) and information about objects, data, used devices, and media
MA: The Data Protector General Media Agent
Disaster tolerance [HPQ10]
Characterized by short recovery times and the avoidance of data loss, disaster tolerance is typically implemented by businesses with multiple, geographically-separate sites, each featuring redundant, active servers and client interconnects. In an SAP environment, data replication can be used to help achieve disaster tolerance. Data produced by an SAP application at one site (the primary) is copied by a replication system that maintains a consistent replica of this data at a secondary site. Should the primary site suffer a disaster, SAP instances that were running at this now-disabled site can be failed over to a secondary site, along with the resources needed to support them.
Note: The secondary site need not be dedicated to the SAP application.
The process of failing over an SAP application to a node at a secondary site involves starting instances on the secondary node to restore application availability and making replicated data accessible to the application. The functionality required by a disaster-tolerant solution such as that shown in Figure 34 includes:
• Mirror all data; create parallel structures in two different storage arrays that may be located in separate geographical locations
• Send each write I/O to both storage locations; only process read I/Os locally
• Configure HP StorageWorks Continuous Access software to copy data online, in real time, to the remote location through a local or extended SAN
Figure 34. HP StorageWorks Continuous Access data replication combined with a Node Majority failover cluster
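The write-to-both, read-locally behavior listed among the requirements can be modeled in a few lines. This is a conceptual sketch, not the Continuous Access implementation: the class, its attributes, and the block names are hypothetical, and the two dictionaries simply stand in for the two storage arrays.

```python
class MirroredVolume:
    """Toy model of a volume mirrored across a local and a remote array."""

    def __init__(self):
        self.local = {}   # array at the active site
        self.remote = {}  # array at the secondary site

    def write(self, block, data):
        # Each write I/O goes to both storage locations.
        self.local[block] = data
        self.remote[block] = data

    def read(self, block):
        # Read I/Os are served by the local array only.
        return self.local[block]

    def fail_over(self):
        """Site failover: the surviving remote copy becomes the active copy."""
        self.local, self.remote = self.remote, {}


volume = MirroredVolume()
volume.write("block-7", "payroll run")
before = volume.read("block-7")
volume.fail_over()  # the primary array is lost
after = volume.read("block-7")
print(before, after)
```

Because every write reached both arrays before the failure, the data read after the failover is identical to the data read before it.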
Replicating the entire SAP database creates a robust, high-performing, manageable solution, with failover and recovery time measured in minutes at the remote site. Customers with existing HP StorageWorks EVA or XP disk arrays can upgrade to the required storage capability using HP StorageWorks Continuous Access software. In a Windows Failover Clustering environment, the solution can be further enhanced by adding automatic failover capabilities to Continuous Access through the HP StorageWorks Cluster Extension (CLX) product.
If desired, you need only use Continuous Access to replicate SAP database log information through SQL Server log shipping or the Oracle standby database mechanism. This scenario provides disaster-tolerant functionality up to the latest transactional update and requires less replication bandwidth than full database replication. Database changes can be propagated with a time delay at the secondary site to protect the standby database from human error. However, compared to full replication, this scenario requires additional management effort to maintain the standby database; advanced expertise must be provided in the event of a site failover.
HP EVA Dynamic Capacity Manager
HP StorageWorks EVA Dynamic Capacity Management (DC-Management) software automates storage provisioning and helps improve capacity utilization for the HP StorageWorks EVA family. This comprehensive solution can reduce downtime due to low volume capacity. Designed for the enterprise market, EVA DC-Management automatically right-sizes a supported file system and EVA virtual disk (vdisk) storage volume to meet the needs of a particular application. This capability can dramatically improve capacity utilization by allowing you to simply specify the capacity utilization range of each vdisk. In some cases, EVA DC-Management can more than double storage utilization when compared to traditional storage provisioning methods, where utilization may be as low as 20% to 40% [HPQ16].
Windows Server 2008 provides full support for EVA DC-Management and dynamically adapts storage capacity to meet business needs. Database downtime due to low volume capacity can be minimized using EVA DC-Management and Windows Server 2008.
Features and benefits of EVA DC-Management include:
• Automated provisioning for increased storage utilization: File systems and storage volumes are automatically expanded online as application needs increase, or shrunk to reclaim unused capacity that can be returned to the disk group for use by other applications.
• Simple, quick setup and configuration: It takes just seconds to configure or change policies across multiple volumes. Once policies are set, capacity provisioning and reclamation are automatic, allowing you to focus on other business-critical tasks.
• Reduced capital and operational expenses: Achieving higher capacity utilization rates reduces the need to purchase additional storage capacity and software licenses. In addition, higher utilization lowers power and cooling requirements by reducing the number of disk drives needed.
• Accelerated storage consolidation: Improved capacity utilization allows more applications to be deployed on the same storage array.
• Management flexibility for greater control: The flexibility of this solution allows you to easily switch between automatic and manual modes to quickly adapt to changing business needs.
Figures 35 and 36 show the automatic extend and shrink capabilities of EVA DC-Management.
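A policy based on a capacity utilization range can be pictured as a simple decision rule. The sketch below is not the EVA DC-Management algorithm; the thresholds of 50% and 80%, the 10 GB step, and the function name are assumptions chosen purely to illustrate automatic extend and shrink.

```python
def resize_decision(used_gb, size_gb, low=0.5, high=0.8, step_gb=10):
    """Return the new volume size under a utilization-range policy.

    low and high bound the desired utilization range (assumed values).
    The volume grows when utilization exceeds high and shrinks when it
    falls below low, provided the smaller volume still holds the data.
    """
    utilization = used_gb / size_gb
    if utilization > high:
        return size_gb + step_gb   # extend online
    if utilization < low and size_gb - step_gb > used_gb:
        return size_gb - step_gb   # reclaim unused capacity
    return size_gb                 # within the range: no change


print(resize_decision(85, 100), resize_decision(20, 100), resize_decision(60, 100))
```

Calling such a rule periodically for each volume moves its size in small steps until utilization settles back inside the configured range.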
EVA DC-Management is an enterprise-class provisioning solution that eliminates the need for ongoing storage administration to improve capacity utilization, building on the ease of configuration of HP StorageWorks Enterprise Virtual Arrays.
HP Competent Cluster Service
Until mid-2008, HP offered a Windows SAP cluster service, known as HP Competent Cluster Service (HP CCS), designed for business-critical applications. This service provided the same rich set of cluster features and supported configurations as a typical UNIX cluster, thus eliminating some disadvantages of deploying a Windows-based SAP cluster rather than a vendor-specific UNIX cluster. Since SAP, in partnership with hardware vendors like HP, now supports multi-SID clusters, HP CCS is no longer offered in its earlier form. Today, HP CCS focuses on the planning, configuration, and implementation of SAP multi-SID Windows cluster solutions.
Note: The HP CCS cluster manager product that was originally part of HP CCS is no longer available.
Ask your local HP SAP services team about integration, configuration, and design services for the SAP multi-SID configuration.
HP PolyServe [HPQ11]
HP PolyServe for Microsoft SQL Server (also known as HP PolyServe Database Utility for SQL Server) is a server consolidation solution that offers an alternative to the traditional Microsoft shared-nothing, standby, and failover cluster concepts. HP PolyServe software can consolidate servers and storage into manageable, available, scalable utilities for database and file serving. Implementing the HP PolyServe approach to virtualization can extend the benefits of consolidation to business-critical applications like SAP. Unlike conventional approaches, HP PolyServe's shared-data technology helps deliver the raw performance and availability levels that are essential in today's business-critical environments. With HP PolyServe, it is possible to decrease not only server costs but also storage, software, and IT operational costs.
Note: HP PolyServe is a Microsoft Gold Certified Partner and has passed Microsoft's rigorous review for inclusion in the SQL Server Always On program.
HP PolyServe for Microsoft SQL Server is an integrated product that allows a collection of servers and iSCSI SAN storage to be managed as a single entity for hosting multiple SQL Server databases. It is made up of the following components, all working together:
- Matrix Server: Supports shared-data clustering and allows a set of servers to be managed as a single unit
- Matrix Volume Manager: Allows storage from multiple arrays to be used and managed as a single unit
- Database Utility for SQL Server: Adapts core shared-data clustering capabilities delivered by Matrix Server and Matrix Volume Manager for use with SQL Server
- Support for SQL Server 2005: Matrix Server now supports both SQL Server 2000 and SQL Server 2005 running in the same cluster
The unique Windows cluster file system implemented by HP PolyServe allows multiple SAP SQL Server database instances to be clustered and consolidated on fewer servers. Figure 37 shows how multiple SAP SQL Server database instances running on a range of standalone and clustered systems can be consolidated on an HP PolyServe matrix.
HP PolyServe provides its own cluster manager and can only be used for clustering and consolidating databases. The SAP application cannot be deployed on PolyServe and must, therefore, be deployed on a separate SAP Windows cluster. For more information, refer to Database high-availability features.
Figure 37. Typical HP PolyServe for Microsoft SQL Server solution implementation
The SAP clusters shown in Figure 37 use standard Failover Clustering. Any SAP application server deployment that is supported with SQL Server can be combined with an HP PolyServe extended database; the databases in the database layer then run on PolyServe. For more information, read the HP guide Installing and operating HP PolyServe for Microsoft SQL Server for SAP databases [HPQ11].
SAP offers limited support for virtualization solutions in the SAP Windows area; currently, only VMware ESX Server is certified. Support for the Hyper-V feature of Windows Server 2008 is expected by December 2008.
A single SAP system can be virtualized, or even a complete SAP landscape that includes production, quality assurance (QA), and development systems, as shown in Figure 38.
A disadvantage of the configuration shown in Figure 38 is that all three virtualized SAP systems would be impacted by a hardware failure in the host server, which has become a SPOF for the landscape. Thus, systems consolidated using virtualization technology must be protected from possible server failures.
High availability for virtualized SAP systems
To enhance availability, virtualization solutions such as VMware ESX Server and Hyper-V allow multiple host server systems to be configured as a farm on which VMs can run. Both solutions support the movement of VMs between host servers: VMware uses the VMotion feature in conjunction with a cluster file system to seamlessly move running VMs; Hyper-V uses Quick Migration, which is based on Microsoft Failover Clustering. Our testing showed that Hyper-V takes longer to freeze the VM and transfer necessary information (such as memory and CPU register contents) to the new host. Due to its underlying cluster file system, ESX Server is able to move running VMs from one host to another faster than Hyper-V and, in conjunction with VMware Site Recovery Manager, allows VMs to be replicated to an alternate site. In the context of virtualization, clustering the host systems is often called host clustering.
VMotion or Quick Migration can only move a VM when the host system is online; if the host is offline due to failure, active data (such as the contents of the VM's memory) is lost.
Figure 39 shows how a host server farm or cluster can provide high availability services for VMs.
Figure 39. SAP system consolidation on a virtualization host server farm or cluster
In Figure 39, three systems (PRD, QAS, and DEV) have been virtualized and consolidated on a single host. By adding a second host and clustering it with the first, you can move PRD, QAS, and DEV between the two nodes. This approach works well as long as all VMs and hosts are up and running, which preserves the necessary information during moves. When a VM crashes, however, all its information is lost. When one of the hosts crashes, its VMs are moved to the surviving host and restarted there, the same process as for a non-virtualized cluster.
In a virtualized environment, it is difficult to preserve the contents of the Enqueue table following a hard failover. After such an event, the contents of this table are typically lost; Enqueue information can only be preserved while all hosts and VMs are running. Clustering the guests, which in this context is called guest clustering, solves this problem.
Microsoft supports guest clustering with Hyper-V when the following conditions are met:
- Windows Server 2003 guests: the Hyper-V host server systems that host the Windows Server 2003 VM guests must be listed on the Windows Server 2003 Cluster HCL.
- Windows Server 2008 guests: the VM guests must pass the cluster validation test [MST14, MST15].
Clustering the guests with Hyper-V requires the use of iSCSI.
VMware provides support for Microsoft guest clustering with Windows Server 2003 [VMw01]. For Windows Server 2008 support, check the latest VMware support statements [1].
[1] As of October 2008
To summarize, VMotion or Quick Migration in conjunction with basic Failover Clustering capabilities can be used to move running VMs from one host to another. This allows you to distribute the server workload over multiple host systems and to provide basic high availability functionality.
Since neither VMotion nor Quick Migration can protect the contents of the Enqueue table, a virtualized cluster is less fault tolerant than a conventional cluster. However, these features may be used to automate VM start-up following the failure of a host server. As of October 2008, SAP and Microsoft do not support guest clustering on VMware. Hyper-V guest clustering is supported as outlined above. HP does not recommend building VM guest clusters with three or more nodes.
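The distinction between moving a running VM and restarting it after a crash can be condensed into a small decision function. This is a minimal Python sketch of the reasoning in this section, not a description of any product behavior; the event names and the `guest_clustered` flag are hypothetical, and the sketch assumes that a guest cluster replicates the Enqueue table between its guest nodes.

```python
def enqueue_table_preserved(event, guest_clustered=False):
    """Tell whether the Enqueue table contents survive an event.

    event: one of "live_migration", "vm_crash", "host_crash" (hypothetical names).
    guest_clustered: True if the SAP guests form a guest cluster (assumed to
    replicate the Enqueue table between guest nodes).
    """
    if event == "live_migration":
        # VMotion or Quick Migration moves a running VM, so memory contents travel with it.
        return True
    if event in ("vm_crash", "host_crash"):
        # A crash loses the VM's memory; only a guest cluster keeps a second copy.
        return guest_clustered
    raise ValueError(f"unknown event: {event!r}")
```

For example, `enqueue_table_preserved("host_crash")` is false, which is why host clustering alone is described as less fault tolerant than a conventional cluster.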
Data replication with virtualized systems
In conjunction with the implementation of a virtualization solution, replicating data between datacenters is the basis of a disaster-tolerant virtualized server landscape. As described earlier, HP StorageWorks Continuous Access software can replicate data at the storage hardware level between datacenters. VMware Site Recovery Manager (SRM) can utilize this functionality to automate the site failover process.
Note: HP supports SRM and offers integration services.
Leveraging the VMware virtualization solution along with Continuous Access replication software, SRM provides centralized management for recovery plans, recovery process automation, and dramatic improvements in recovery plan testing. As a result, disaster recovery becomes rapid, reliable, manageable, and affordable. For more information on SRM, visit http://www.vmware.com/go/srm. While data replication can also be combined with Hyper-V Quick Migration, failover may result in the loss of necessary information that is stored exclusively in the host server's memory.
Before starting a virtualization project in the SAP area, read SAP OSS note 674851, Virtualization on Windows, to ensure support for the planned configuration.
HP management software
This section provides information on optimizing availability through the use of HP management software: HP Systems Insight Manager and HP Operations Manager for Windows.
HP Systems Insight Manager
HP Systems Insight Manager (HP SIM) is the foundation for HP's unified strategy for server and storage management. HP SIM is a hardware-level product that supports multiple operating systems on HP ProLiant, HP Integrity, and HP 9000 servers; HP StorageWorks MSA, EVA, and XP arrays; and some third-party arrays. It provides basic management functionality for system discovery and identification, single-event view, inventory data collection, and reporting. The core HP SIM software uses Web Based Enterprise Management (WBEM) to deliver the essential capabilities required to manage all HP server platforms. HP SIM can be extended to provide systems management with plug-ins for HP clients, storage, power, and printer products. Plug-in applications for workload management, capacity management, VM management, and partition management allow you to choose the value-added software that delivers complete lifecycle management for your hardware assets and helps you maximize uptime.
Unified infrastructure management from HP, as shown in Figure 40, delivers the following benefits:
- Enhances your ability to troubleshoot complex problems that span server and storage infrastructure
- Provides a single source for server and storage asset information
- Provides a comprehensive selection of HP ProLiant Essentials, HP Integrity Essentials, and HP Storage Essentials value-added plug-ins for extended management of HP ProLiant, HP Integrity, HP 9000, and HP StorageWorks platforms
- Enables effective cross-training across domains of expertise
- Allows your IT organization to focus less on daily maintenance and more on meeting future business needs
HP Operations Manager for Windows
When it comes to effectively managing your e-business infrastructure, nothing is more important than a strong operational platform that not only handles basic system availability and performance, but also lets you extend your control to match your business expansion. HP Operations Manager for Windows provides that platform. Out of the box, it gives you comprehensive event management and proactive performance monitoring, along with automated alerting, reporting, and graphing for Windows systems, middleware, and applications, and it delivers all these capabilities from a unique service management perspective.
HP Operations Manager for Windows delivers the following benefits:
- Service-driven operations management lets you provide value to customers by helping them understand the business impact of IT infrastructure availability and performance issues
- Cross-platform e-business infrastructure management lets you manage the broadest range of operating systems, applications, and services from Windows
- Integrated performance and availability management lets you auto-discover the managed environment, auto-deploy management rules and policies, collect and automatically respond to events, view and handle messages, generate reports and graphs, and view business-critical services in a color-coded topology map for efficient root cause drill-down and troubleshooting
- Heterogeneous e-infrastructure management enhancements enable more effective cross-firewall management and accessibility from a Web browser
- Extensive out-of-the-box value includes core Smart Plug-ins and enhancements that are easy to use and quick to implement, and fit transparently with your existing environment
- Enhanced flexibility and scalability, including manager-of-managers support for concepts such as follow-the-sun, backup server, and competency center policies
To summarize, HP Operations Manager for Windows enables a service- and business-driven approach, allowing you to achieve rapid control and availability of IT operations across the heterogeneous enterprise. Used to correlate the impact of IT infrastructure on business-critical services (such as e-mail, ERP, and e-commerce), Operations Manager for Windows builds on an extensive policy base to monitor operating system and application attributes and provide automated responses to common events. HP Operations Manager for Windows delivers distributed, large-scale management from a unique service management perspective to monitor, control, and report the health of the IT environment across boundaries, improving availability and reliability.
HP Network Node Manager
HP Network Node Manager (NNM) provides robust, standards-based management for heterogeneous networks of all sizes that require advanced management of routers and switches, sophisticated root-cause analysis, and distributed management for large or complex networks. NNM discovers and graphically displays complex network configurations and monitors network infrastructure availability, helping organizations meet usage demands and optimize TCO. Out-of-the-box automation and systems intelligence help IT staff identify the components that make up the enterprise network services and understand their relationships with network devices in complex switched environments for increased staff efficiency.
Besides its standard system management tools, HP Business Technology Optimization (BTO) Software and Solutions provides a complete set of tools for optimizing your IT environment.
Implementation services
HP can help you build an SAP system that is protected from downtime and geared for optimal performance.
Startup services
Factory integration and on-site installation services from HP can greatly accelerate your time-to-production by reducing the occurrence of configuration and installation issues. SAP-certified technology consultants not only help ensure a trouble-free implementation but also transfer knowledge to your IT staff, ensuring a smooth transition.
HP Education Services provides the training needed to help you realize the full potential of your HP solutions, increase your network optimization and responsiveness, and achieve a better return on your IT investments.
R/3 upgrade
Implementing an SAP R/3 upgrade to the newest release can be a time-consuming, costly task that should cause only minimal impact to the SAP production environment. HP understands this transition process well and has defined a proven, phased approach for migrating to a new release, enabling a smooth, easy evolution without disruption to your production environment.
Operating system, database, and server upgrades and migrations
HP has outstanding expertise in the operating systems, databases, and hardware that make up today's SAP environments. HP can help your IT team plan and implement upgrades and migrations with minimal disruption to your SAP production environment, whether you are upgrading to a new server, upgrading to the latest operating system or database version, or performing a heterogeneous migration such as UNIX to Windows or Oracle to SQL Server.
Archiving
You can maximize SAP system performance by achieving the right balance of online and offline data. HP has partnered with a number of industry-leading archiving-technology vendors to provide SAP data- and document-archiving services.
Backup and recovery
HP offers an approach based on industry best practices, including HP's IT Service Management methodology and ITIL (Information Technology Infrastructure Library) principles, combining processes, people, and technology to plan for any potential disruption.
SAP Solution Manager
SAP is tightly integrating many of its support processes, tools, and services within Solution Manager. HP can help you deploy and configure SAP Solution Manager to best meet your requirements.
Operations services
Once your SAP system has been built to your requirements, HP can help keep it running at the highest performance and availability levels through standalone technical services and upfront or contractual support services.
Regular trend analysis and capacity planning
The trend analysis and capacity planning service is designed to provide timely identification, notification, and escalation of issues that may potentially impact the performance of your SAP system. HP periodically analyzes how well your SAP application is utilizing resources, developing metrics for CPU, I/O, memory, database, SAP application buffers, user response times, and more. Results are analyzed to identify areas where changes should be made. If necessary, HP may recommend additional services to obtain a deeper understanding of a particular performance issue before providing an action plan. The results can also be used as a baseline against which to compare capacity plans that will impact your workload (such as more users, additional SAP modules, or higher transaction volume).
HP Performance Analysis for SAP Systems service
HP Performance Analysis for SAP Systems (PASS) provides a deeper analysis of SAP system performance and identifies the source of existing bottlenecks. Depending on the problem area, this service can also focus on SAP application-specific topics, like expensive SQL statements and long-running programs or transactions.
To extract performance information, HP uses an HP MeasureWare Agent (MWA) with a special HP-developed MWA configuration file. All results are summarized in a written report that contains graphical representations, explanatory text, and recommendations.
Storage Performance Analysis for SAP Systems service
Providing a deeper level of analysis specific to HP storage operating in SAP environments, HP analyzes your current storage implementation and identifies persistent performance bottlenecks. The service is delivered using advanced performance tools designed for the HP storage subsystems and a sophisticated extension that seamlessly integrates these tools into the SAP management landscape. The HP Performance Advisor extension for SAP acts as an interface between SAP Computing Center Management System (CCMS), SAP Solution Manager, and HP Performance Advisor, providing configuration data for the storage subsystem, along with critical performance data for all hosts and devices connected to the SAN.
Seven (P24/SAP) or 12 (CS/SAP) consultancy days per year to be used flexibly for additional HP technical services, for example:
- Review of SAP Early Watch recommendations
- HP Performance Analysis for SAP
- Storage Performance Analysis for SAP
- HP Cluster Consistency Service, with HP Change Alert Service, to monitor clusters for changes and inconsistencies, making it possible to identify and remedy problems that could otherwise inhibit a successful failover
- Assessment services
- Services related to SAP Solution Manager, ranging from awareness training to installation and service-level reporting
HP's objectives are two-fold:
- Create an environment where problems do not occur
- If, however, a problem does occur, resolve it quickly to minimize risks to business-critical operations
For database, SAP Basis, and application problems, HP commits to working collaboratively to resolve the problem. With HP Mission Critical Services, enhanced for SAP, you can call either HP or SAP to start the process. HP and SAP have aligned their support processes to support the exchange of information and seamless case handling, helping to ensure rapid problem resolution in SAP environments.
HP Global SAP Competency Center and HP-SAP collaborative support processes
HP's Global SAP Competency Center support teams are the primary contacts for HP's remote SAP support services. HP and SAP are committed to working together on SAP-related problems. Should a call require SAP expertise, HP maintains SAP-trained support engineers located at SAP offices to ensure end-to-end collaboration. These engineers work with SAP Active Global Support to diagnose and solve SAP-related problems. HP has access to SAP's support organization 24 hours a day, 365 days a year, with named contacts and defined escalation processes.
SAP knowledge database access
HP support teams can access the SAP knowledge database, with access rights similar to those of SAP support engineers. This direct access can enhance HP's ability to effectively troubleshoot problems by referencing information that has not necessarily been released for public use. For more information on business-critical service offerings for SAP, visit HP's Mission Critical & Proactive Services web page [HPQ02].
Medium customer
Large customer
The small, medium, and large customer classifications refer not only to the number of SAP users supported but also to the typical levels of high availability and disaster tolerance required by these types of customer. The reference configurations only include business-critical components within the SAP landscape; note that you also need SAP development and QA systems, which are not shown.
Important: Replication of the Enqueue service lock table is implied in these reference configurations and is strongly recommended to help enhance the availability of the network and storage infrastructure.
Server classes
Rather than specifying particular server models, the reference configurations utilize server classes selected from the HP ProLiant or Integrity master configuration guide [HPQ15], which is maintained by HP's SAP Competency Center in Walldorf. Developed to expedite SAP system sizing for the Windows platform, this guide characterizes a range of server classes and suggests the number of users that can be supported by each. The contents of the master configuration guide reflect HP's experience with approximately 40,000 [11] SAP system installations on HP server-based Windows platforms worldwide. Table 5 outlines the characteristics of sample server classes.
Table 5. Server classes A-D, featuring HP ProLiant DL and BL servers (September 2008)

Class   ERP production systems   Server model (clock speed)    RAM
A       up to 100 users          DL380 G5/BL460c (2.33 GHz)    8 GB
B       up to 200 users          DL380 G5/BL460c (2.66 GHz)    16 GB
C       up to 500 users          DL380 G5/BL460c (3.00 GHz)    32 GB
D       up to 900 users          DL380 G5/BL460c (3.00 GHz)    64 GB

Additional entries in the Purpose row and server configurations:
Test and development systems for ECC, BI, XI, APO, CRM, SRM*, Enterprise Portal
Test and development systems for ERP, BI, XI, APO, CRM, SRM, Enterprise Portal
Test and development systems for ERP, BI, XI, APO, CRM, SRM, Enterprise Portal
BI production systems
Production systems for APO, CRM, SRM, Enterprise Portal; Application Server
Production systems for XI, APO, CRM, SRM, Enterprise Portal Manager, Application Server
Production systems for BI, APO, CRM, SRM, Enterprise Portal; Application Server
Two dual-processor servers; quad-core Xeon, 2.00 GHz
Two dual-processor servers; quad-core Xeon, 3.00 GHz
Two dual-processor servers; quad-core Xeon, 3.00 GHz

[11] As of September 2008
*ECC = ERP Central Component, BI = Business Intelligence, XI = Exchange Infrastructure, APO = Advanced Planner and Optimizer, CRM = Customer Relationship Management, SRM = Supplier Relationship Management
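The ERP user limits in Table 5 lend themselves to a simple lookup. The Python sketch below picks the smallest server class whose limit covers a given number of ERP production users; the function and constant names are hypothetical, the limits are the ones listed in Table 5, and a real sizing should still be confirmed with the master configuration guide [HPQ15] or your local SAP Competency Center.

```python
# Upper limits for ERP production users per server class, as listed in Table 5.
ERP_USER_LIMITS = [("A", 100), ("B", 200), ("C", 500), ("D", 900)]


def smallest_server_class(erp_users):
    """Return the smallest class (A-D) that covers erp_users, or None if none does."""
    if erp_users < 0:
        raise ValueError("erp_users must not be negative")
    for server_class, limit in ERP_USER_LIMITS:
        if erp_users <= limit:
            return server_class
    # More users than class D covers: a customized sizing is needed.
    return None
```

A result of `None` corresponds to the case where the table no longer applies and a customized SAP system sizing is required.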
Sizing
Tables 6, 7, and 8 provide more information on sizing an SAP solution, based on the following sizing categories:
Comfort
- Generous CPU resources based on dual-processor, quad-core technology
- Easily scalable by adding memory
- Clustering and SAN storage recommended for high availability
- User-based sizing in most cases
- HP services recommended
Advanced
- Clustering and SAN storage mandatory for high availability
- Quantity/special sizing recommended
- Disaster tolerance to be discussed
- HP services mandatory
Expert
- Detailed customer consulting mandatory
- Clustering and SAN storage mandatory for high availability
- Quantity/special sizing recommended
- Disaster tolerance strongly recommended
- HP services mandatory
Note: To obtain updates or additional information, contact your local SAP Competency Center.
Table 6. Server classes recommended for the Comfort sizing area [HPQ15]
System Production system (two similarly-configured cluster nodes installed as SAP Central Systems that include SAP SCS, SAP Application Server, and database instances) Test and quality assurance system (Central System) Key server characteristics Server class
200 GB
400 GB
1.0 TB
2 CPU 8 GB RAM A or J
2 CPU 8 GB RAM A or J
2 CPU 16 GB RAM B or K
Approximate net database storage Development system (Central System) Key server characteristics Server class
100 GB
100 GB
200 GB
Table 7. Server classes recommended for the Advanced sizing area [HPQ15]
System Production system (two similarly-configured cluster nodes installed as SAP Central Systems that include SAP SCS, SAP Application Server, and database instances) Test and quality assurance system (Central System) Key server characteristics Server class
1.8 TB
2.4 TB
2 CPU 16 GB RAM B or K
2 CPU 32 GB RAM C or L
Approximate net database storage Development system (Central System) Key server characteristics Server class
200 GB
200 GB
Table 8. Server classes recommended for the Expert sizing area [HPQ15]
System Production system (two similarly-configured cluster nodes installed as SAP Central Systems that include SAP SCS, SAP Application Server, and database instances) Test and quality assurance system (Central System) Key server characteristics Server class
Approximate net database storage Development system (Central System) Key server characteristics Server class
200 GB
The storage array must be protected via data backup. For a smaller installation, you could use a direct-attached backup device on the cluster or traditional backup over the network. For larger installations or to optimize backup performance and lower the potential impact on the client network, you could deploy a dedicated backup network. All SAP users connect over the network to two application server instances within a cluster. If one cluster node were to fail, all users connected to this application server would have to log back on to the surviving application server, since only the SCS with its Message and Enqueue services is protected via Failover Clustering.
Table 9 and Figure 41 outline the Small Customer reference configuration.
Table 9. Small Customer configuration overview
Cluster components
Storage
Redundantly-configured network switches and infrastructure
Redundantly-configured power provisioning
Remember that the cluster only protects the SCS and its Message and Enqueue services.
Table 10 and Figure 42 outline the Medium Customer reference configuration. Figure 43 outlines an alternate configuration.
Table 10. Medium Customer configuration overview
Cluster components
Storage
Redundantly-configured SAN switches and infrastructure
Redundantly-configured network switches and infrastructure
Redundantly-configured power provisioning
Figure 42 shows the SAP SCS and database instance running in the cluster, with all necessary application server systems deployed outside the cluster. This is an effective solution that separates the SAP cluster from the public network. The log shipping standby database provides short-term backup and fast recovery capabilities; the backup server with its connected tape library provides long-term backup and archiving.
Figure 43 outlines an alternate Medium Customer configuration, where the application server systems are deployed within the cluster; cluster nodes are the more powerful server types B and K.
Databases like SQL Server support database mirroring, a feature that can enhance the availability of your SAP system. If data loss is not an option, then database mirroring or the use of standby databases may not be sufficient; in this case, data must be replicated via storage hardware (for example, using HP StorageWorks Continuous Access software for EVA and XP disk arrays). For customers using very large SAP systems and databases within a single cluster, or those wishing to consolidate several SAP systems on a single cluster, an SAP application server farm is required. The servers used in the Large Customer reference configuration are type C or L systems. If you need to support additional users, contact your local SAP Competency Center for a customized SAP system sizing. Table 11 outlines the Large Customer reference configuration. Figure 44 shows a configuration with hardware data replication, Figure 45 a configuration with database mirroring.
Table 11. Large Customer configuration overview
Cluster components
Storage
Redundantly-configured network switches and infrastructure
Redundantly-configured power provisioning
Figure 44 shows a multi-SID SAP Node Majority failover cluster with two NetWeaver 7.0(2004s) instances. All data is synchronously replicated from Site A to remote Site B. A third node, deployed at the remote site, can access the replicated storage; with CLX, the third node gains this access automatically. In the event of a local system failure, the SAP and database instances are configured for automatic local failover, not a failover to Site B. The instances at Site B are only started when there is a complete failure of the local site. A standby database and a backup server with a tape library provide data backups to protect and conserve replicated data. SAP application server systems are required at both sites to ensure access to the surviving cluster node.
Since Windows does not currently support multiple replication partners, the Enqueue table is only replicated between nodes 1 and 2. Thus, after a site failover, Enqueue table information is lost.
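The failover order described for Figure 44 (local failover first, Site B only after a complete site failure, Enqueue table lost on a site failover) can be sketched as follows. This is an illustrative Python model only: the node names, the `FailoverPlan` type, and the `plan_failover` function are hypothetical, and the sketch assumes the instances currently run at Site A and ignores quorum handling, which the cluster itself has to resolve.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical node names: nodes 1 and 2 at Site A, node 3 at remote Site B.
SITE_A_NODES = ("node1", "node2")
SITE_B_NODE = "node3"


@dataclass(frozen=True)
class FailoverPlan:
    target: Optional[str]      # node that should run the instances, None if no node is left
    enqueue_table_kept: bool   # the Enqueue table is replicated between nodes 1 and 2 only


def plan_failover(active, healthy):
    """Choose where the SAP and database instances should run.

    active: Site A node currently running the instances.
    healthy: set of node names that are currently up.
    """
    if active not in SITE_A_NODES:
        raise ValueError("this sketch only models instances running at Site A")
    if active in healthy:
        return FailoverPlan(active, True)           # nothing to do
    for node in SITE_A_NODES:
        if node in healthy:
            return FailoverPlan(node, True)         # automatic local failover
    if SITE_B_NODE in healthy:
        return FailoverPlan(SITE_B_NODE, False)     # complete site failure: Enqueue table lost
    return FailoverPlan(None, False)
```

For example, `plan_failover("node1", {"node3"})` selects the Site B node and reports that the Enqueue table contents are not preserved.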
The configuration shown in Figure 45 is similar to that shown in Figure 44 but uses software mirroring rather than replication for the database. Note that mirroring is not supported by all databases; this configuration uses SQL Server, which has supported database mirroring since SQL Server 2005.
Note: Since mirroring is being used, HP StorageWorks replication software is not required with this configuration.
Now, the cluster is only protecting the SAP SCS. While the database is no longer part of the cluster, HP recommends adding the SQL Server IP address and network name to the cluster. Virtualizing the configuration of the database network helps provide the network name and IP where they are needed.
Summary
The information provided in this white paper should help you gain the knowledge you need to implement high availability and disaster tolerance for NetWeaver 7.0(2004s) or later systems on Windows Server 2008. Currently, due to lack of support for Java 1.4.2, Windows Server 2008 cannot be used in the SAP arena. SAP plans to support Windows Server 2008 in Q1 2009; the only current SAP release with Windows Server 2008 support is SAP NetWeaver 7.0(2004s) SR4. Please check the SAP product availability matrix (SAP PAM) for more details on Windows Server 2008 support.
References
[CON01] O'Connor, Patrick D. T.: Practical Reliability Engineering, Wiley; 4th edition, July 9, 2002
[DEC01] Digital Equipment Corporation: Digital Clusters for Windows NT Admin Guide, June 1996, Order Number: AAQVUTATE
[DEP01] dependability.org IFIP WG10.4 Dependable Computing and Fault Tolerance, web site
[IEEE01] IEEE Standards Association: IEEE Standard Glossary of Software Engineering Terminology Description, IEEE Std 610.12-1990
[HPQ01] HP: HP Readies Customers for Upcoming Microsoft Windows Server 2008 Launch, PALO ALTO, Calif., Dec. 5, 2007
[HPQ02] HP: Mission Critical & Proactive Services, web page
[HPQ03] HP: Business Continuity & Availability Storage, web page
[HPQ04] HP: ProLiant architecture, web page
[HPQ05] HP: HP ProLiant Clusters, web page
[HPQ06] HP: Windows on HP Integrity servers, web page
[HPQ07] HP: HP StorageWorks Business Continuity Solutions for SAP, Document number 4AA15683ENW, November 2007
[HPQ08] HP: HP Data Protector software Zero Downtime Backup and Instant Recovery, Document number 4AA0-5769ENW, December 2007
[HPQ09] HP: HP OpenView Storage Data Protector Integration Guide for Oracle SAP, Document number B6960-96008, release A06.00, July 2006
[HPQ10] HP: HP StorageWorks Disaster Tolerant Solution for mySAP Business Suite on EVA, Business blueprint, Document number 5981-7701EN, February 2004
[HPQ11] HP: Installing and operating HP PolyServe for Microsoft SQL Server for SAP databases, Document number 4AA1-6165ENW, December 2007
[HPQ12] HP: HP Systems Insight Manager - QuickSpecs, Document number DA - 11824, July 2008
[HPQ13] HP: HP A7143A RAID160 SA Controller Support Guide, Document number J636990026, 2005
[HPQ14] HP: HP Disk Storage Systems, web page
[HPQ15] HP: SAP Master Configurations with ProLiant and Integrity Servers, SAP Competency Center, Walldorf, Version 1.7, July 2008
[HPQ16] HP: HP StorageWorks EVA Dynamic Capacity Management Software - QuickSpecs, DA12815 2 - Version 2 - February 26, 2008
[HPQ17] HP: Windows support for HP ProLiant Servers, web page
[IEC01] International Electrotechnical Commission (IEC): IEV number 191-02-03, Web site
[MST01] Microsoft: MS Windows NT Server, Enterprise Edition Cluster Server Admin Guide, 1997, Document Number: X0327902
[MST02] Microsoft: Description of Network Load Balancing features, February 2007, Article ID: 232190
[MST03] Microsoft: Windows 2000 Clustering Technologies: Cluster Service Architecture, TechNet article
[MST04] Microsoft: Windows Compute Cluster Server 2003 Release Notes, June 2006, TechNet article
[MST05] Microsoft: Maximum number of supported nodes in a cluster, June 2007, KB Article ID: 288778
[MST06] Microsoft: Top 11 Reasons to Upgrade to Windows Server 2008, TechNet article
[MST07] Microsoft: Windows Server 2008 Failover Clustering Architecture Overview, November 2007, white paper
[MST08] Microsoft: Windows Server - 2008 Compare Technical Features and Specifications, web page
[MST09] Microsoft: Server Clusters: Majority Node Set Quorum, web page
[MST10] Microsoft: Cluster file share witness, web page
[MST11] Microsoft: Windows Server 2008 Multi-Site Clustering, November 2007, white paper
[MST12] Microsoft: Clustering on Windows Server 2008, web page
[MST13] Microsoft: HP Microsoft certified server and storage systems, web page
[MST14] Microsoft: The Microsoft Support Policy for Windows Server 2008 Failover Clusters, Knowledge Base article ID943984, November 9, 2007, revision: 1.5
[MST15] Symon Perriman, Program Manager, Cluster & HA Microsoft: Failover Clustering with Hyper-V Deployment Options, Microsoft blog, June 21, 2008
[SAP01] SAP AG: The SAP Lock Concept, SAP Library
[SAP02] SAP AG: MSCS Configuration and Support Information for SAP NetWeaver 04 and SAP NetWeaver 7.0, Document Version 1.0, May 09, 2007
[SAP03] SAP AG: SAP Product Availability Matrix, SAP web page (SAP account needed)
[SAP04] SAP AG: HP SAP certified server systems, SAP web page
[SAP05] SAP AG: SAP System Installation on Windows Server 2008, SAP OSS note 1054740, August 2008 (SAP
account needed) [Sta01] Standish Group: 2005 fourth quarter research report, 2005 [Sta02] Standish Group: 2006 first quarter research report, 2006 [VMw01] VMware: Setup for Microsoft Cluster Service, Revision: 20080725, Item: EN-000081-00
© 2008 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

Microsoft, Windows and Windows NT are U.S. registered trademarks of Microsoft Corporation. AMD Opteron is a trademark of Advanced Micro Devices, Inc. Intel, Itanium and Xeon are trademarks of Intel Corporation in the U.S. and other countries. Java is a U.S. trademark of Sun Microsystems, Inc. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. UNIX is a registered trademark of The Open Group. SAP, SAP NetWeaver, R/3 and MaxDB are trademarks or registered trademarks of SAP AG in Germany and in several other countries.

4AA2-2644ENW, Revision 3, October 2008