
Front cover

In-memory Computing
with SAP HANA on IBM
eX5 and X6 Systems
IBM System x solution for
SAP HANA
SAP HANA overview and
use cases
Operational aspects for
SAP HANA appliances

Martin Bachmaier
Ilya Krutov

ibm.com/redbooks

International Technical Support Organization


In-memory Computing with SAP HANA on IBM eX5
and X6 Systems
September 2014

SG24-8086-02

Note: Before using this information and the product it supports, read the information in
"Notices" on page vii.

Third Edition (September 2014)


This edition applies to IBM System x solution for SAP HANA that is based on IBM eX5 and X6
servers and the SAP HANA offering.
© Copyright International Business Machines Corporation 2013, 2014. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP
Schedule Contract with IBM Corp.

Contents
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
Now you can become a published author, too! . . . . . . . . . . . . . . . . . . . . . . . . . xi
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Stay connected to IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Chapter 1. Basic concepts of in-memory computing . . . . . . . . . . . . . . . . . 1
1.1 Keeping data in-memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Using main memory as the data store . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.2 Data persistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Minimizing data movement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.1 Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.2 Columnar storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.3 Pushing application logic to the database . . . . . . . . . . . . . . . . . . . . . 13
1.3 Dividing and conquering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.3.1 Parallelization on multi-core systems . . . . . . . . . . . . . . . . . . . . . . . . 14
1.3.2 Data partitioning and scale-out . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Chapter 2. SAP HANA overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.1 SAP HANA overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.1.1 SAP HANA architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.1.2 SAP HANA appliance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 SAP HANA delivery model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2.1 SAP HANA as an appliance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2.2 SAP HANA tailored data center integration . . . . . . . . . . . . . . . . . . . 22
2.3 Sizing SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.3.1 Memory per core ratio for SAP HANA appliances . . . . . . . . . . . . . . 23
2.3.2 Sizing approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4 SAP HANA software licensing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Chapter 3. Software components and data replication methods . . . . . . . 29
3.1 SAP HANA software components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.1.1 SAP HANA database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.1.2 SAP HANA client. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.1.3 SAP HANA studio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.1.4 SAP HANA studio repository. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40


3.1.5 SAP HANA landscape management structure . . . . . . . . . . . . . . . . . 41


3.1.6 SAP host agent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.1.7 SAP HANA Lifecycle Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.1.8 SAP HANA Lifecycle Management tools . . . . . . . . . . . . . . . . . . . . . 42
3.1.9 Solution Manager Diagnostics agent . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2 Data replication methods for SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.2.1 Trigger-based replication with SAP Landscape Transformation . . . . 45
3.2.2 ETL-based replication with SAP BusinessObjects Data Services . . 46
3.2.3 Extractor-based replication with Direct Extractor Connection . . . . . . 47
3.2.4 Log-based replication with Sybase Replication Server . . . . . . . . . . . 48
3.2.5 Comparing the replication methods . . . . . . . . . . . . . . . . . . . . . . . . . 49
Chapter 4. SAP HANA integration scenarios . . . . . . . . . . . . . . . . . . . . . . . 51
4.1 Basic use case scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.2 SAP HANA as a technology platform . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.2.1 SAP HANA data acquisition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.2.2 SAP HANA as a source for other applications . . . . . . . . . . . . . . . . . 56
4.3 SAP HANA for operational reporting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.4 SAP HANA as an accelerator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.5 SAP products running on SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.5.1 SAP NetWeaver Business Warehouse powered by SAP HANA . . . 65
4.5.2 Migrating SAP NetWeaver Business Warehouse to SAP HANA . . . 70
4.5.3 SAP Business Suite powered by SAP HANA . . . . . . . . . . . . . . . . . . 76
4.6 Programming techniques using SAP HANA . . . . . . . . . . . . . . . . . . . . . . . 77
Chapter 5. IBM System x solutions for SAP HANA . . . . . . . . . . . . . . . . . . 79
5.1 IBM eX5 systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.1.1 x3850 X5 and x3950 X5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.1.2 x3690 X5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.1.3 Intel Xeon processor E7 family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.1.4 Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.1.5 Flash technology storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.1.6 x3950 X5 Workload Optimized Solution for SAP HANA . . . . . . . . . . 91
5.1.7 x3690 X5 Workload Optimized Solution for SAP HANA . . . . . . . . . . 99
5.2 IBM X6 systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.2.1 Intel Xeon processor E7 v2 family. . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.2.2 Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.2.3 Flash technology storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.2.4 x3850 X6 Workload Optimized Solution for SAP HANA . . . . . . . . . 111
5.2.5 x3950 X6 Workload Optimized Solution for SAP HANA . . . . . . . . . 117
5.3 IBM General Parallel File System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5.3.1 Common GPFS features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5.3.2 GPFS extensions for shared-nothing architectures . . . . . . . . . . . . 126


5.3.3 Scaling-out SAP HANA using GPFS. . . . . . . . . . . . . . . . . . . . . . . . 128


5.4 IBM System Networking options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.4.1 IBM System Networking RackSwitch G8264 . . . . . . . . . . . . . . . . . 137
5.4.2 IBM System Networking RackSwitch G8124 . . . . . . . . . . . . . . . . . 140
5.4.3 IBM System Networking RackSwitch G8052 . . . . . . . . . . . . . . . . . 143
Chapter 6. SAP HANA IT landscapes with IBM System x solutions. . . . 145
6.1 IBM eX5 based environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.1.1 Single-node eX5 solution for Business Warehouse . . . . . . . . . . . . 148
6.1.2 Single-node eX5 solution for SAP Business Suite on HANA . . . . . 151
6.1.3 Scale-out eX5 solution for Business Warehouse . . . . . . . . . . . . . . 154
6.2 IBM X6 based environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
6.2.1 Single-node X6 solution for Business Warehouse . . . . . . . . . . . . . 161
6.2.2 Single-node X6 solution for SAP Business Suite on HANA . . . . . . 165
6.2.3 Scale-out X6 solution for Business Warehouse . . . . . . . . . . . . . . . 170
6.3 Migrating from eX5 to X6 servers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
6.3.1 Disruptive migration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
6.3.2 Hybrid SAP HANA cluster with eX5 and X6 nodes . . . . . . . . . . . . . 174
6.4 SAP HANA on VMware vSphere. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
6.4.1 vSphere on eX5 workload-optimized models for SAP HANA . . . . . 179
6.4.2 vSphere on X6 workload-optimized models for SAP HANA . . . . . . 180
6.4.3 VMware vSphere licensing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
6.5 Sharing an SAP HANA system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
6.6 SAP HANA on IBM SmartCloud . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
Chapter 7. Business continuity and resiliency for SAP HANA . . . . . . . . 187
7.1 Overview of business continuity options . . . . . . . . . . . . . . . . . . . . . . . . . 188
7.1.1 GPFS based storage replication . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
7.1.2 SAP HANA System Replication . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
7.1.3 Special considerations for DR and long-distance HA setups . . . . . 195
7.2 HA and DR for single-node SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . 196
7.2.1 High availability (using GPFS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
7.2.2 Stretched high availability (using GPFS). . . . . . . . . . . . . . . . . . . . . 200
7.2.3 Disaster recovery (using GPFS) . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
7.2.4 Disaster recovery (using SAP HANA System Replication) . . . . . . . 212
7.2.5 HA plus DR (using GPFS). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
7.2.6 HA (using GPFS) plus DR (using SSR) . . . . . . . . . . . . . . . . . . . . . 218
7.3 HA and DR for scale-out SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
7.3.1 High availability using GPFS storage replication . . . . . . . . . . . . . . 222
7.3.2 Disaster recovery using GPFS storage replication . . . . . . . . . . . . . 223
7.3.3 HA using GPFS replication plus DR using SAP HANA System
Replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
7.4 Backup and restore . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236


7.4.1 Basic backup and recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
7.4.2 File-based backup tool integration . . . . . . . . . . . . . . . . . . . . . . 239
7.4.3 GPFS snapshots as a backup source . . . . . . . . . . . . . . . . . . . . 239
7.4.4 Backup tool integration with Backint for SAP HANA . . . . . . . . . 240
7.4.5 Tivoli Storage Manager for ERP 6.4 . . . . . . . . . . . . . . . . . . . . . 242
7.4.6 Symantec NetBackup 7.5 for SAP HANA . . . . . . . . . . . . . . . . . 242
7.4.7 Backup and restore as a DR strategy . . . . . . . . . . . . . . . . . . . . 243

Chapter 8. SAP HANA operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245


8.1 Installation services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
8.2 IBM SAP HANA Operations Guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
8.3 Interoperability with other platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
8.4 Monitoring SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
8.5 Installing additional agents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
8.6 Software and firmware levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
8.7 Support process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
8.7.1 IBM and SAP integrated support. . . . . . . . . . . . . . . . . . . . . . . . . . . 252
8.7.2 IBM SAP International Competence Center InfoService. . . . . . . . . 253
Appendix A. Additional topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
A.1 GPFS license information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
A.2 File-based backup with IBM Tivoli Storage Manager for ERP . . . . . . . . 258
A.2.1 Setting up Data Protection for SAP HANA . . . . . . . . . . . . . . . . . . . 258
A.2.2 Backing up the SAP HANA database . . . . . . . . . . . . . . . . . . . . . . . 258
A.2.3 Restoring the SAP HANA database . . . . . . . . . . . . . . . . . . . . . . . . 261
Abbreviations and acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
Related publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Online resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Help from IBM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268


Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your
local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not infringe
any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and
verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of
express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the
information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the materials
for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any
obligation to you.
Any performance data contained herein was determined in a controlled environment. Therefore, the results
obtained in other operating environments may vary significantly. Some measurements may have been made on
development-level systems and there is no guarantee that these measurements will be the same on generally
available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual
results may vary. Users of this document should verify the applicable data for their specific environment.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as
completely as possible, the examples include the names of individuals, companies, brands, and products. All of
these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is
entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any
form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs
conforming to the application programming interface for the operating platform for which the sample programs are
written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or
imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample
programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing
application programs conforming to IBM's application programming interfaces.


Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corporation in the United States, other countries, or both. These and other IBM trademarked
terms are marked on their first occurrence in this information with the appropriate symbol (® or ™),
indicating US registered or common law trademarks owned by IBM at the time this information was
published. Such trademarks may also be registered or common law trademarks in other countries. A current
list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml
The following terms are trademarks of the International Business Machines Corporation in the United States,
other countries, or both:
AIX
BladeCenter
DB2
Global Business Services
Global Technology Services
GPFS
IBM SmartCloud

IBM
Passport Advantage
POWER
PureFlex
RackSwitch
Redbooks
Redbooks (logo)

System x
System z
Tivoli
X-Architecture
z/OS

The following terms are trademarks of other companies:


Intel Xeon, Intel, Itanium, Intel logo, Intel Inside logo, and Intel Centrino logo are trademarks or registered
trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States,
other countries, or both.
Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its
affiliates.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Other company, product, or service names may be trademarks or service marks of others.


Preface
This third edition of this IBM Redbooks publication describes in-memory
computing appliances from IBM and SAP that are based on IBM eX5 and X6
flagship systems and SAP HANA. It covers the basic principles of in-memory
computing, describes the IBM eX5 and X6 hardware offerings, and explains the
corresponding SAP HANA IT landscapes using these offerings.
This book also describes the architecture and components of the IBM System x
solution for SAP HANA, with IBM General Parallel File System (GPFS) as a
cornerstone. The SAP HANA operational disciplines are explained in detail:
Scalability options, high availability and disaster recovery, backup and restore,
and virtualization possibilities for SAP HANA appliances.
The following topics are covered:
Basic principles of in-memory computing
SAP HANA overview
Software components and replication methods
SAP HANA use cases and integration scenarios
The System x solution for SAP HANA
SAP IT landscapes using the System x solution for SAP HANA
Options for business continuity (high availability, disaster recovery, and
backup and restore)
SAP HANA operations
This book is intended for SAP administrators and technical solution architects. It
is also for IBM Business Partners and IBM employees who want to know more
about the SAP HANA offering and other available IBM solutions for SAP clients.


Authors
This book was produced by a team of specialists from around the world working
at the IBM International Technical Support Organization (ITSO), Raleigh Center.
Martin Bachmaier is an IT Versatilist in the IBM
hardware development lab at Boeblingen, Germany.
He is part of the team developing the System x solution
for SAP HANA. Martin has a deep background in
designing, implementing, and managing scale-out data
centers, HPC clusters, and cloud environments, and
has worked with GPFS for eight years. He gives
university lectures, and likes to push IT limits. Martin is
an IBM Certified Systems Expert. He holds the CCNA,
CCNA Security, and VMware Certified Professional
credentials and has authored over a dozen books and
papers.
Ilya Krutov is a Project Leader at the ITSO Center in
Raleigh and has been with IBM since 1998. Before
joining the ITSO, Ilya served in IBM as a Run Rate
Team Leader, Portfolio Manager, Brand Manager,
Technical Sales Specialist, and Certified Instructor. Ilya
has expertise in IBM System x, BladeCenter and
PureFlex System products, server operating
systems, and networking solutions. He has authored
over 170 books, papers, and product guides. He has a
Bachelor's degree in Computer Engineering from the
Moscow Engineering and Physics Institute.
Thanks to the authors of the previous edition of this book:
Gereon Vey
Tomas Krojzl
Special thanks to Irene Hopf, a Senior Architect in the IBM SAP International
Competence Center (ISICC) in Walldorf, Germany, who made a significant
contribution to the development of this book by extensively consulting and
guiding the team on SAP HANA topics.
Thanks to the following people for their contributions to this project:
Tamikia Barrow, Cheryl Gera, Chris Rayns, Wade Wallace, David Watts,
Debbie Willmschen
International Technical Support Organization, Raleigh Center


Dr. Oliver Rettig, Tag Robertson, Volker Fischer


IBM

Now you can become a published author, too!


Here's an opportunity to spotlight your skills, grow your career, and become a
published author, all at the same time! Join an ITSO residency project and help
write a book in your area of expertise, while honing your experience using
leading-edge technologies. Your efforts will help to increase product acceptance
and customer satisfaction, as you expand your network of technical contacts and
relationships. Residencies run from two to six weeks in length, and you can
participate either in person or as a remote resident working from your home
base.
Find out more about the residency program, browse the residency index, and
apply online at:
ibm.com/redbooks/residencies.html

Comments welcome
Your comments are important to us!
We want our books to be as helpful as possible. Send us your comments about
this book or other IBM Redbooks publications in one of the following ways:
Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
Send your comments in an email to:
redbooks@us.ibm.com
Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400


Stay connected to IBM Redbooks


Find us on Facebook:
http://www.facebook.com/IBMRedbooks
Follow us on Twitter:
http://twitter.com/ibmredbooks
Look for us on LinkedIn:
http://www.linkedin.com/groups?home=&gid=2130806
Explore new Redbooks publications, residencies, and workshops with the
IBM Redbooks weekly newsletter:
https://www.redbooks.ibm.com/Redbooks.nsf/subscribe?OpenForm
Stay current on recent Redbooks publications with RSS Feeds:
http://www.redbooks.ibm.com/rss.html


Chapter 1. Basic concepts of in-memory computing
In-memory computing is a technology that allows the processing of massive
quantities of data in main memory to provide immediate results from analysis and
transactions. The data that is processed is ideally real-time data (that is, data that
is available for processing or analysis immediately after it is created).
To achieve the preferred performance, in-memory computing follows these basic
concepts:
Keep data in main memory to speed up data access.
Minimize data movement by using columnar storage and compression, and
by performing calculations at the database level.
Divide and conquer. Use the multi-core architecture of modern processors
and multi-processor servers, or even scale out into a distributed landscape, to
grow beyond what can be supplied by a single server.
This chapter describes those basic concepts with the help of a few examples. It
does not describe the full set of technologies that are employed with in-memory
databases, such as SAP HANA, but it does provide an overview of how
in-memory computing is different from traditional concepts.


This chapter covers the following topics:


Keeping data in-memory
Minimizing data movement
Dividing and conquering

1.1 Keeping data in-memory


Today, a single enterprise class server can hold several terabytes of main
memory. At the same time, prices for server main memory dramatically dropped
over the last few decades. This increase in capacity and reduction in cost makes
it a viable approach to keep huge amounts of business data in memory. This
section describes the benefits and challenges.

1.1.1 Using main memory as the data store


The most obvious reason to use main memory (RAM) as the data store for a
database is because accessing data in main memory is much faster than
accessing data on disk. Figure 1-1 on page 3 compares the access times for
data in several locations.


[Figure: chart comparing data access times, on a logarithmic scale, for CPU
register, CPU cache, and RAM (volatile storage) versus SSD/flash and hard disk
(non-volatile storage)]

Figure 1-1 Data access times of various storage types relative to RAM (logarithmic scale)

The main memory is the fastest storage type that can hold a significant amount
of data. Although CPU registers and CPU cache are faster to access, their usage
is limited to the actual processing of data. Data in main memory can be accessed
more than a hundred thousand times faster than data on a spinning hard disk
drive (HDD), and even flash technology storage is about a thousand times slower
than main memory. Main memory is connected directly to the processors through
a high-speed bus, and hard disks are connected through a chain of buses (QPI,
PCIe, and SAN) and controllers (I/O hub, RAID controller or SAN adapter, and
storage controller).
Compared with keeping data on disk, keeping the data in main memory can
improve database performance just by the advantage in access time.
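The magnitude of this gap can be made tangible with a toy measurement. The following Python sketch (illustrative only; the absolute numbers depend on the hardware and on operating system caching) copies the same payload within RAM and reads it back from a file on disk:

```python
import os
import tempfile
import time

# Illustrative micro-benchmark: copy a 10 MB payload within main memory,
# then read the same payload back from a file on disk. Only the relative
# magnitudes matter; exact timings vary by hardware and OS caching.
payload = os.urandom(10 * 1024 * 1024)

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    path = f.name

start = time.perf_counter()
in_memory_copy = bytearray(payload)      # forces a real copy within RAM
ram_seconds = time.perf_counter() - start

start = time.perf_counter()
with open(path, "rb") as f:
    from_disk = f.read()                 # read back through the I/O path
disk_seconds = time.perf_counter() - start

os.unlink(path)
print(f"RAM copy: {ram_seconds:.6f} s, disk read: {disk_seconds:.6f} s")
```

On most systems, the RAM copy finishes far faster, although a warm file system cache narrows the measured gap because the "disk" read is then also served from memory.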


1.1.2 Data persistence


Keeping data in main memory brings up the question of what happens if there is
a loss of power.
In database technology, atomicity, consistency, isolation, and durability (ACID) is
a set of requirements that ensures that database transactions are processed
reliably:
A transaction must be atomic. If part of a transaction fails, the entire
transaction must fail and leave the database state unchanged.
The consistency of a database must be preserved by the transactions that it
performs.
Isolation ensures that no transaction interferes with another transaction.
Durability means that after a transaction is committed, it remains committed.
Although the first three requirements are not affected by the in-memory concept,
durability is a requirement that cannot be met by storing data in main memory
alone. Main memory is volatile storage. It loses its content when it is out of
electrical power. To make data persistent, it must be on non-volatile storage,
such as HDDs, solid-state drives (SSDs), or flash devices.
The storage that is used by a database to store data (in this case, main memory)
is divided into pages. When a transaction changes data, the corresponding
pages are marked and written to non-volatile storage in regular intervals. In
addition, a database log captures all changes that are made by transactions.
Each committed transaction generates a log entry that is written to non-volatile
storage, which ensures that all transactions are permanent.
Figure 1-2 shows this setup by using the example of SAP HANA. SAP HANA
stores changed pages in savepoints, which are asynchronously written to
persistent storage in regular intervals (by default, every five minutes). The log is
written synchronously. A transaction does not return before the corresponding
log entry is written to persistent storage to meet the durability requirement.

[Figure: timeline showing data savepoints written to persistent storage at regular
intervals, while the log of committed transactions is written to persistent storage
continuously]

Figure 1-2 Savepoints and logs in SAP HANA



After a power failure, the database can be restarted much like a disk-based
database. The database pages are restored from the savepoints and then the
database logs are applied (rolled forward) to restore the changes that were not
captured in the savepoints. This action ensures that the database can be
restored in memory to the same state as before the power failure.
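The savepoint-plus-log scheme can be sketched in a few lines of Python. This is a minimal illustration of the principle only, not SAP HANA's actual implementation; the class, methods, and file names are invented for the example:

```python
import json
import os
import tempfile

# Sketch of the durability scheme described above: every committed change
# is appended synchronously to a log, and a savepoint periodically writes
# the full state. After a crash, the state is rebuilt from the last
# savepoint plus a roll-forward of the log.
class TinyStore:
    def __init__(self, savepoint_path, log_path):
        self.savepoint_path = savepoint_path
        self.log_path = log_path
        self.data = {}

    def commit(self, key, value):
        # The log entry is flushed to non-volatile storage before the
        # transaction returns, which satisfies the durability requirement.
        with open(self.log_path, "a") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.data[key] = value

    def savepoint(self):
        # Persist the full state; the log can then be truncated because
        # everything it recorded is now captured in the savepoint.
        with open(self.savepoint_path, "w") as f:
            json.dump(self.data, f)
        open(self.log_path, "w").close()

    def recover(self):
        # Restore the last savepoint, then roll the log forward.
        if os.path.exists(self.savepoint_path):
            with open(self.savepoint_path) as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    entry = json.loads(line)
                    self.data[entry["key"]] = entry["value"]

workdir = tempfile.mkdtemp()
store = TinyStore(os.path.join(workdir, "savepoint.json"),
                  os.path.join(workdir, "log.jsonl"))
store.commit("k1", "v1")
store.savepoint()
store.commit("k2", "v2")       # captured only in the log, not the savepoint

# Simulate a restart: a fresh instance rebuilds the pre-crash state.
restarted = TinyStore(store.savepoint_path, store.log_path)
restarted.recover()
assert restarted.data == {"k1": "v1", "k2": "v2"}
```

The final assertion shows the point of the scheme: the change committed after the savepoint survives the "restart" because the recovery replays the log on top of the savepoint.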

1.2 Minimizing data movement


The second key to improving data processing performance is to minimize the
movement of data within the database and between the database and the
application. This section describes measures to achieve this state.

1.2.1 Compression
Even though today's memory capacities allow keeping enormous amounts of data in memory, compressing the data in memory is still preferable. The goal is to compress data in a way that does not consume the performance gained, while still minimizing data movement from RAM to the processor.
By using dictionaries to represent text as integer numbers, the database can compress data significantly and thus reduce data movement without imposing additional CPU load for decompression; in fact, it can even add to the performance, as shown in Figure 1-5 on page 10.

Chapter 1. Basic concepts of in-memory computing

Figure 1-3 illustrates this situation with a simplified example.

Original table:

  Row ID   Date/Time   Material       Customer Name   Quantity
  845      14:05       Radio          Dubois
  851      14:11       Laptop         Di Dio
  872      14:32       Stove          Miller
  878      14:38       MP3 Player     Newman
  888      14:48       Radio          Dubois
  895      14:55       Refrigerator   Miller
  901      15:01       Stove          Chevrier

Dictionaries:

  Customers: 1 = Chevrier, 2 = Di Dio, 3 = Dubois, 4 = Miller, 5 = Newman
  Material:  1 = MP3 Player, 2 = Radio, 3 = Refrigerator, 4 = Stove, 5 = Laptop

Table with dictionary-encoded Material and Customer Name columns:

  Row ID   Date/Time   Material   Customer Name
  845      14:05       2          3
  851      14:11       5          2
  872      14:32       4          4
  878      14:38       1          5
  888      14:48       2          3
  895      14:55       3          4
  901      15:01       4          1

Figure 1-3 Illustration of dictionary compression

On the left side of Figure 1-3, the original table is shown, and it contains text
attributes (that is, material and customer name) in their original representation.
The text attribute values are stored in a dictionary (upper right), and an integer
value is assigned to each distinct attribute value. In the table, the text is replaced
by the corresponding integer value, as defined in the dictionary. The date and time attribute is also converted to an integer representation. Using dictionaries for text attributes reduces the size of the table because each distinct attribute value is stored only once, in the dictionary; every additional occurrence in the table is represented by the corresponding integer value. The compression factor that is achieved by this method depends heavily on the data being compressed: attributes with few distinct values compress well, but attributes with many distinct values do not benefit as much.


Although other, more effective compression methods can be employed with in-memory computing, a useful method must strike the correct balance among compression effectiveness (more data fits in memory and less data must be moved, that is, higher performance), the resources that are needed for decompression, and data accessibility (how much unrelated data must be decompressed to get to the data that you need). Dictionary compression combines good compression effectiveness with low decompression cost and high data access flexibility.
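A minimal Python sketch of dictionary encoding, applied to the customer column from Figure 1-3. The function name and the sorted-dictionary layout are illustrative assumptions, not SAP HANA internals:

```python
def dictionary_encode(values):
    """Replace each distinct string with a small integer (dictionary compression sketch).

    The dictionary assigns integer IDs (starting at 1) to the sorted distinct
    values, matching the sorted dictionaries shown in Figure 1-3.
    """
    dictionary = {v: i for i, v in enumerate(sorted(set(values)), start=1)}
    encoded = [dictionary[v] for v in values]
    return dictionary, encoded
```

Applied to the customer column ["Dubois", "Di Dio", "Miller", "Newman", "Dubois", "Miller", "Chevrier"], the dictionary maps Chevrier to 1 through Newman to 5, and the column is stored as the integers [3, 2, 4, 5, 3, 4, 1].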

1.2.2 Columnar storage


Relational databases organize data in tables that contain the data records. The
difference between row-based and columnar (or column-based) storage is how
the table is stored:
- Row-based storage stores a table as a sequence of rows.
- Column-based storage stores a table as a sequence of columns.


Figure 1-4 shows the row-based and column-based models.

Figure 1-4 Row-based and column-based storage models (the same table, with row IDs 845 - 901: the row-based store keeps all fields of each record (Row ID, Date/Time, Material, Customer Name, Quantity) together in sequence, and the column-based store keeps the values of each column together in sequence)


Both storage models have benefits and drawbacks, which are listed in Table 1-1.

Table 1-1 Benefits and drawbacks of row-based and column-based storage

Row-based storage:
- Benefits: Record data is stored together. Easy to insert/update.
- Drawbacks: All data must be read during selection, even if only a few columns are involved in the selection process.

Column-based storage:
- Benefits: Only affected columns must be read during the selection process of a query. Efficient projections (a projection is a view on the table with a subset of columns). Any column can serve as an index.
- Drawbacks: After selection, selected rows must be reconstructed from columns. No easy insert/update.

The drawbacks of column-based storage are not as grave as they seem. In most cases, not all attributes (that is, column values) of a row are needed for processing, especially in analytic queries. Also, inserts or updates to the data are less frequent in an analytical environment. (An exception is bulk loads, for example, when replicating data into the in-memory database, which can be handled differently.) SAP HANA implements both row-based storage and column-based storage; however, its performance originates in the usage of column-based storage in memory. The following sections describe how column-based storage benefits query performance and how SAP HANA handles the drawbacks of column-based storage.
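The difference between the two layouts can be sketched in Python with ordinary lists. This is an illustrative model of memory layout, not actual database code; the table is a subset of Figure 1-3:

```python
rows = [
    (845, "14:05", "Radio", "Dubois"),
    (851, "14:11", "Laptop", "Di Dio"),
    (872, "14:32", "Stove", "Miller"),
]

# Row-based store: whole records are contiguous, so reading one complete
# record touches one short contiguous region.
row_store = [field for record in rows for field in record]

# Column-based store: the values of each attribute are contiguous, so
# scanning one attribute touches only that attribute's values.
column_store = {
    name: [record[i] for record in rows]
    for i, name in enumerate(["row_id", "time", "material", "customer"])
}
```

Reading the first full record from the row store is a slice of four adjacent fields, whereas scanning all materials in the column store reads one list and nothing else.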


Efficient query execution


To show the benefits of dictionary compression that is combined with columnar
storage, Figure 1-5 shows an example of how a query is run. Figure 1-5 refers to
the table that is shown in Figure 1-3 on page 6.

The query: get all records with Customer Name Miller and Material Refrigerator.

1. Dictionary lookup of the strings: each string is compared only once (Miller = 4 in the Customers dictionary, Refrigerator = 3 in the Material dictionary).
2. Only the columns that are part of the query condition are read, and they are scanned by using integer comparison operations:
     Customer bitmap: 0 0 1 0 0 1 0
     Material bitmap: 0 0 0 0 0 1 0
3. The bitmaps are combined with a bitwise AND:
     Result set:      0 0 0 0 0 1 0

The resulting records can be assembled from the column stores quickly because the positions are known (here, the sixth position in every column).

Figure 1-5 Example of a query that is run on a table in columnar storage

The query asks to get all records with Miller as the customer name and
Refrigerator as the material.
First, the strings in the query condition are looked up in the dictionary. Miller is
represented by the number 4 in the customer name column. Refrigerator is
represented by the number 3 in the material column. This lookup must be done
only once. Subsequent comparisons with the values in the table are based on
integer comparisons, which are less resource-intensive than string comparisons.


In a second step, the columns are read that are part of the query condition (that
is, the Customer and Material columns). The other columns of the table are not
needed for the selection process. The columns are then scanned for values
matching the query condition. That is, in the Customer column, all occurrences of
4 are marked as selected, and in the Material column, all occurrences of 3 are
marked.
These selection marks can be represented as bitmaps, which are data structures that allow efficient Boolean operations. A bitwise AND combines the bitmaps of the individual columns into a single bitmap that represents the selection of records matching the entire query condition. In this example, record number 6 is the only matching record. Depending on the columns that are selected for
the result, the additional columns must be read to compile the entire record to
return it. But because the position within the column is known (record number 6),
only the parts of the columns that contain the data for this record must be read.
This example shows how dictionary compression not only limits the amount of data that must be read for the selection process, but also simplifies the selection itself, while the columnar storage model further reduces the amount of data that is needed for the selection. Although the example is simplified, it illustrates the benefits of dictionary compression and columnar storage.
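The scan-and-combine steps can be sketched in Python by using the dictionary-encoded columns from Figure 1-3. Plain lists of 0s and 1s stand in for real bitmap structures:

```python
# Dictionary-encoded columns from Figure 1-3 (positions 1 - 7).
customer = [3, 2, 4, 5, 3, 4, 1]   # 4 = Miller
material = [2, 5, 4, 1, 2, 3, 4]   # 3 = Refrigerator

# Step 1 was the dictionary lookup (Miller -> 4, Refrigerator -> 3).
# Step 2: scan only the condition columns with integer comparisons,
# producing one bitmap per column.
cust_bits = [1 if v == 4 else 0 for v in customer]   # 0 0 1 0 0 1 0
mat_bits  = [1 if v == 3 else 0 for v in material]   # 0 0 0 0 0 1 0

# Step 3: combine the bitmaps with a bitwise AND.
result = [c & m for c, m in zip(cust_bits, mat_bits)]

# The positions of the set bits identify the matching records.
matches = [i + 1 for i, bit in enumerate(result) if bit]
```

Running this yields a result bitmap with a single set bit at position 6, exactly as in Figure 1-5.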

Delta-merge and bulk inserts


To overcome the drawback of inserts and updates impacting the performance of the column-based storage, SAP plans to implement a lifecycle management for database records. (For details, see Efficient Transaction Processing in SAP HANA Database - The End of a Column Store Myth, found at http://dl.acm.org/citation.cfm?id=2213946.)


Figure 1-6 shows the lifecycle management for database records in the
column-store.

Update / Insert / Delete -> L1 Delta -> (merge) -> L2 Delta -> (merge) -> Main store
Bulk Insert -> L2 Delta
Read <- unified view across all three stores (Unified Table)

Figure 1-6 Lifetime management of a data record in the SAP HANA column-store

There are three different types of storage for a table:
- L1 Delta Storage is optimized for fast write operations. An update is performed by inserting a new entry into the delta storage. The data is stored in records, as with a traditional row-based approach. This ensures high performance for write, update, and delete operations on records that are stored in the L1 Delta Storage.
- L2 Delta Storage is an intermediate step. Although it is organized in columns, its dictionary is not as optimized as in the main storage because new dictionary entries are appended to the end of the dictionary. This makes inserts easier, but has drawbacks for search operations on the dictionary because it is not sorted.
- Main Storage contains the compressed data for fast reads, with a search-optimized (sorted) dictionary.
All write operations on a table work on the L1 Delta storage. Bulk inserts bypass
L1 Delta storage and write directly into L2 Delta storage. Read operations on a
table always read from all storages for that table, merging the result set to provide
a unified view of all data records in the table.


During the lifecycle of a record, it is moved from L1 Delta storage to L2 Delta storage and finally to the Main Storage. The process of moving changes to a table from one storage to the next one is called Delta Merge, and it is an
table from one storage to the next one is called Delta Merge, and it is an
asynchronous process. During the merge operations, the columnar table is still
available for read and write operations.
Moving records from L1 Delta storage to L2 Delta storage involves reorganizing
the record in a columnar fashion and compressing it, as shown in Figure 1-3 on
page 6. If a value is not yet in the dictionary, a new entry is appended to the
dictionary. Appending to the dictionary is faster than inserting, but results in an
unsorted dictionary, which impacts the data retrieval performance.
Eventually, the data in the L2 Delta storage must be moved to the Main Storage.
To accomplish that task, the L2 Delta storage must be locked, and a new L2
Delta storage must be opened to accept further additions. Then, a new Main
Storage is created from the old Main Storage and the locked L2 Delta storage.
This is a resource-intensive task and must be scheduled carefully.
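A toy Python model of the three stores and the two merge steps. The class and method names are invented for illustration; a real delta merge operates on columnar structures and dictionaries, which are omitted here:

```python
class ColumnTable:
    """Sketch of the record lifecycle: L1 delta (write-optimized),
    L2 delta (intermediate), main store (read-optimized)."""

    def __init__(self):
        self.l1 = []    # row-oriented, takes single writes
        self.l2 = []    # columnar with append-only dictionary (simplified here)
        self.main = []  # columnar with sorted, search-optimized dictionary

    def insert(self, record):
        self.l1.append(record)      # single writes land in L1 delta

    def bulk_insert(self, records):
        self.l2.extend(records)     # bulk loads bypass L1 and go to L2 delta

    def merge_l1_to_l2(self):
        self.l2.extend(self.l1)
        self.l1 = []

    def merge_l2_to_main(self):
        # A new main store is built from the old main store plus the
        # (conceptually locked) L2 delta; sorting stands in for rebuilding
        # the search-optimized dictionary.
        self.main = sorted(self.main + self.l2)
        self.l2 = []

    def read(self):
        # Reads always see a unified view across all three stores.
        return self.main + self.l2 + self.l1
```

Even before any merge runs, `read()` returns a consistent view of all records, which mirrors how the table stays available during merge operations.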

1.2.3 Pushing application logic to the database


Although the concepts that are described above speed up processing within the database, there is still one factor that can slow down the processing of data. An application running the application logic on the data must get the data from the database, process it, and possibly send it back to the database to store the results. Sending data back and forth between the database and the application usually involves communication over a network, which introduces communication overhead and latency and is limited by the speed and throughput of the network between the database and the application.
To eliminate this factor and increase overall performance, it is beneficial to process the data where it is (that is, in the database). If the database can perform calculations and apply application logic, less data must be sent back to the application, and the exchange of intermediate results between the database and the application might even be eliminated. This minimizes the amount of data that is transferred, and the communication between database and application adds less time to the overall processing time.
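The effect can be illustrated with SQLite, used here only because it ships with Python; the same principle applies to any database, including SAP HANA. Computing an aggregate in the application forces every row across the database boundary, whereas pushing the aggregation into the database returns a single row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, quantity INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("Miller", 2), ("Dubois", 1), ("Miller", 3)])

# Application-side logic: every row crosses the database/application boundary.
rows = conn.execute("SELECT customer, quantity FROM sales").fetchall()
app_total = sum(q for c, q in rows if c == "Miller")

# Logic pushed into the database: only the one-row result crosses the boundary.
db_total = conn.execute(
    "SELECT SUM(quantity) FROM sales WHERE customer = 'Miller'").fetchone()[0]
```

Both approaches compute the same total, but the second transfers one value instead of the whole table, which is the idea behind the SAP HANA calculation engine and SQL Script.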


1.3 Dividing and conquering


The phrase divide and conquer (derived from the Latin saying divide et impera)
typically is used when a large problem is divided into a number of smaller,
easier-to-solve problems. Regarding performance, processing huge amounts of data is a problem that can be solved by splitting the data into smaller chunks, which can be processed in parallel.

1.3.1 Parallelization on multi-core systems


When chip manufacturers reached the physical limits of semiconductor-based
microelectronics with their single-core processor designs, they started to
increase processor performance by increasing the number of cores, or
processing units, within a single processor. This performance gain can be
leveraged only through parallel processing because the performance of a single
core remains unchanged.
The rows of a table in a relational database are independent of each other, which
allows parallel processing. For example, when scanning a database table for
attribute values matching a query condition, the table, or the set of attributes
(columns) that are relevant to the query condition can be divided into subsets
and spread across the cores that are available to parallelize the processing of the
query. Compared with processing the query on a single core, this action basically
reduces the time that is needed for processing by a factor equivalent to the
number of cores working on the query (for example, on a 10-core processor, the
time that is needed is one-tenth of the time that a single core needs).
The same principle applies for multi-processor systems. A system with eight
10-core processors can be regarded as an 80-core system that can divide the
processing into 80 subsets that are processed in parallel.
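The chunk-and-scan pattern can be sketched in Python. A thread pool is used for brevity; because of CPython's global interpreter lock, this illustrates the partitioning pattern rather than true multi-core speedup, which a real database engine achieves natively:

```python
from concurrent.futures import ThreadPoolExecutor

def scan_chunk(chunk, value):
    """Scan one subset of a column and count matching values."""
    return sum(1 for v in chunk if v == value)

def parallel_scan(column, value, workers=4):
    """Divide a column into one chunk per worker and scan the chunks concurrently."""
    size = max(1, -(-len(column) // workers))  # ceiling division
    chunks = [column[i:i + size] for i in range(0, len(column), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda chunk: scan_chunk(chunk, value), chunks))
```

Each worker scans an independent subset of the column, and the partial counts are combined at the end; rows are independent, so no coordination is needed during the scan itself.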

1.3.2 Data partitioning and scale-out


Even though servers available today can hold terabytes of data in memory and provide up to eight processors per server with up to 10 cores per processor, the amount of data that is stored in an in-memory database or the computing power that is needed to process such quantities of data might exceed the capacity of a single server. To accommodate memory and computing power requirements beyond the limits of a single server, data can be divided into subsets and placed across a cluster of servers, forming a distributed database (the scale-out approach).


The individual database tables can be placed on different servers within the
cluster, or tables bigger than what a single server can hold can be split into
several partitions, either horizontally (a group of rows per partition) or vertically (a
group of columns per partition), with each partition on a separate server within
the cluster.
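Both partitioning styles can be sketched as follows. This is an illustrative model; real distributed databases use more elaborate placement and repartitioning schemes:

```python
def hash_partition(rows, key_index, num_nodes):
    """Horizontal partitioning: assign each row (a group of rows per partition)
    to a cluster node by hashing its partitioning key."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[hash(row[key_index]) % num_nodes].append(row)
    return nodes

def vertical_partition(column_store, groups):
    """Vertical partitioning: split a columnar table into groups of columns,
    one group per node."""
    return [{name: column_store[name] for name in group} for group in groups]
```

With hash partitioning, any node can locate the partition for a given key by recomputing the hash, so no central lookup is needed for key-based access.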


Chapter 2. SAP HANA overview

This chapter describes the SAP HANA offering, including its architecture, components, use cases, delivery model, and sizing and licensing aspects.
This chapter covers the following topics:
- SAP HANA overview
- SAP HANA delivery model
- Sizing SAP HANA
- SAP HANA software licensing

Copyright IBM Corp. 2013, 2014. All rights reserved.


2.1 SAP HANA overview


This section gives an overview of SAP HANA. When talking about SAP HANA,
these terms are used:
SAP HANA database
The SAP HANA database (also referred to as the SAP in-memory database)
is a hybrid in-memory database that combines row-based, column-based,
and object-based database technology, optimized to use the parallel
processing capabilities of current hardware. It is the heart of SAP offerings,
such as SAP HANA.
SAP HANA appliance (SAP HANA)
SAP HANA is a flexible, data source-neutral appliance that you can use to analyze large volumes of data in real time without needing to materialize aggregations. It is a combination of hardware and software, and it is delivered as an optimized appliance in cooperation with SAP's hardware partners for SAP HANA.
For the sake of simplicity, this book uses the terms SAP HANA, SAP in-memory
database, SAP HANA database, and SAP HANA appliance synonymously. It
covers only the SAP in-memory database as part of the SAP HANA appliance.
Where required, we ensure that the context makes it clear which part is being
described.

2.1.1 SAP HANA architecture


Figure 2-1 on page 19 shows the high-level architecture of the SAP HANA
appliance. Section 3.1, SAP HANA software components on page 30 explains
the most important software components of the SAP HANA database.


Figure 2-1 (summarized): the SAP HANA appliance contains the SAP HANA database with session management; request processing and execution control (SQL, MDX, SQL Script, and the calculation engine); the relational engines (row store and column store); the transaction, authorization, and metadata managers; and the persistency layer (page management and logger), which writes to the data and log volumes on persistent storage. Around the database, the appliance includes the SAP HANA studio repository, the Software Update Manager, the SAP host agent, the LM structure, a JVM, and SAP CAR. The SAP HANA studio and the SAP HANA client connect to the database from outside the appliance.

Figure 2-1 SAP HANA architecture

SAP HANA database


The heart of the SAP HANA database is the relational database engines. There
are two engines within the SAP HANA database:
- The column-based store: Stores relational data in columns. It is optimized for holding tables with huge amounts of data, which can be aggregated in real time and used in analytical operations.
- The row-based store: Stores relational data in rows, as traditional database systems do. The row store is optimized for row operations, such as frequent inserts and updates. It has a lower compression rate, and its query performance is much lower compared with the column-based store.
The engine that is used to store data can be selected on a per-table basis at the
time of the creation of a table. It is possible to convert an existing table from one
type to another type. Tables in the row-store are loaded at start time, but tables in
the column-store can be either loaded at start or on demand during normal
operation of the SAP HANA database.

Chapter 2. SAP HANA overview


Both engines share a common persistency layer, which provides data persistency that is consistent across both engines. There is page management
and logging, much like in traditional databases. Changes to in-memory database
pages are persisted through savepoints that are written to the data volumes on
persistent storage, which is usually hard disk drives (HDDs). Every transaction
that is committed in the SAP HANA database is persisted by the logger of the
persistency layer in a log entry that is written to the log volumes on persistent
storage. The log volumes use flash technology storage for high I/O performance
and low latency.
The relational engines can be accessed through various interfaces. The SAP
HANA database supports SQL (JDBC/ODBC), MDX (ODBO), and BICS (SQL
DBC). The calculation engine allows calculations to be performed in the
database without moving the data into the application layer. It also includes a
business functions library that can be called by applications to do business
calculations close to the data. The SAP HANA-specific SQL Script language is
an extension to SQL that can be used to push down data-intensive application
logic into the SAP HANA database.

2.1.2 SAP HANA appliance


The SAP HANA appliance consists of the SAP HANA database plus the components that are needed to work with, administer, and operate the database.
It contains the repository files for the SAP HANA studio, which is an
Eclipse-based administration and data-modeling tool for SAP HANA, in addition
to the SAP HANA client, which is a set of libraries that is required for applications
to connect to the SAP HANA database. Both the SAP HANA studio and the client
libraries are usually installed on a client PC or server.
The Software Update Manager (SUM) for SAP HANA is the framework allowing
the automatic download and installation of SAP HANA updates from the SAP
Marketplace and other sources by using a host agent. It also allows distribution
of the studio repository to the users.
The Lifecycle Management (LM) Structure for SAP HANA is a description of the
current installation and is, for example, used by SUM to perform automatic
updates.
More information about existing software components is in 3.1, SAP HANA
software components on page 30.


2.2 SAP HANA delivery model


SAP decided to deploy SAP HANA as an integrated solution combining software
and hardware, frequently referred to as the SAP HANA appliance. As with SAP
NetWeaver Business Warehouse Accelerator (SAP NetWeaver BW Accelerator),
SAP partners with several hardware vendors to provide the infrastructure that is
needed to run the SAP HANA software. IBM was among the first hardware
vendors to partner with SAP to provide an integrated solution.
Over the last few years, SAP has gained more experience with running SAP HANA in production environments, so a second delivery model, called tailored data center integration (TDI), is also supported. TDI aims to integrate a client's existing hardware from different vendors into an SAP HANA deployment. Both approaches are described briefly in this chapter. The rest of this book covers only the appliance delivery model.

2.2.1 SAP HANA as an appliance


To ensure the highest customer satisfaction, SAP HANA is available as an appliance from different hardware partners. The partners select the components of the appliance and tune it specifically for use with SAP HANA. Infrastructure for SAP HANA must run through an SAP certification process to ensure that certain performance requirements are met. Only certified configurations are supported by SAP and the respective hardware partner. These configurations must adhere to certain requirements and restrictions to provide a common platform across all hardware providers:
- Only certain Intel Xeon processors can be used.
- All configurations must provide a certain main memory per core ratio, which is defined by SAP to balance CPU processing power and the amount of data being processed.
- All configurations must meet minimum redundancy and performance requirements for various load profiles. SAP tests for these requirements as part of the certification process.
- The capacity of the storage devices that are used in the configurations must meet certain criteria that are defined by SAP.
- The networking capabilities of the configurations must include 10 Gb Ethernet for the SAP HANA software.


By imposing these requirements, SAP can rely on the availability of certain


features and ensure a well-performing hardware platform for their SAP HANA
software. These requirements give the hardware partners enough room to
develop an infrastructure architecture for SAP HANA, which adds differentiating
features to the solution. The benefits of the IBM solution are described in
Chapter 5, IBM System x solutions for SAP HANA on page 79.

2.2.2 SAP HANA tailored data center integration


To allow for an existing infrastructure to be integrated and reused when deploying
SAP HANA, clients can follow the TDI delivery model. Existing storage and
networks that fulfill certain criteria can be used to run SAP HANA. Among others,
these criteria include storage and network performance.
Implementing SAP HANA by following the TDI model requires close collaboration
between the client, SAP, and the vendor of the infrastructure element that is
integrated. More information about this SAP delivery model can be found at the
following website:
http://www.saphana.com/docs/DOC-3633
IBM released a white paper that gives architectural guidance about how to
implement an SAP HANA environment with IBM products while following the TDI
model. It is available for download at the following website:
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP102347
EMC and IBM released a joint paper about using the TDI method to configure a
scale-out IBM environment with EMC Symmetrix VMAX Storage and IBM GPFS.
It is available for download at the following website:
https://community.emc.com/docs/DOC-35182
Only certain components are eligible for integration. SAP maintains a list of
certified enterprise storage for SAP HANA at the following website:
http://scn.sap.com/docs/DOC-48516

2.3 Sizing SAP HANA


In the first part, this section introduces the concept of T-shirt sizes and the new
shortname concept for SAP HANA. Then, this section gives a brief introduction
about how to size an SAP HANA system.


2.3.1 Memory per core ratio for SAP HANA appliances


For in-memory computing appliances such as SAP HANA, the amount of main
memory is important.
In-memory computing brings data that is kept on disk into main memory. This
action allows for much faster processing of the data because the CPU cores do
not have to wait until the data is loaded from disk to memory, which means each
CPU is better used.
There is a certain ideal memory per core ratio. If an in-memory appliance has too
much memory per CPU core, then the cores cannot fully use the advantage of
fast data access. If an in-memory appliance has too little memory per CPU core,
then the cores are underused and remain idle. In this case, your system does not
run as fast as it theoretically can.
This memory per core ratio is defined by SAP. IBM is building solutions for SAP
HANA following this memory per core ratio.

T-shirt sizes for SAP HANA


When SAP introduced SAP HANA into the market, they defined so-called T-shirt
sizes to both simplify the sizing and to limit the number of hardware
configurations to support, thus reducing complexity. Each T-shirt size reflects a
multiple of the memory to core ratio that was initially allowed by SAP.
The SAP hardware partners provide configurations for SAP HANA according to
one or more of these T-shirt sizes. Table 2-1 lists the T-shirt sizes for SAP HANA.
Table 2-1 SAP HANA T-shirt sizes

  SAP T-shirt size              XS        S and S+   M and M+   L
  Compressed data in memory     64 GB     128 GB     256 GB     512 GB
  Server main memory            128 GB    256 GB     512 GB     1024 GB
  Number of CPUs


The T-shirt sizes, S+ and M+, denote upgradeable versions of the S and M sizes:
S+ delivers capacity that is equivalent to S, but the hardware is upgradeable
to an M size.
M+ delivers capacity that is equivalent to M, but the hardware is upgradeable
to an L size.
These T-shirt sizes are used when relevant growth of the data size is expected.
In addition to these standard T-shirt sizes, which apply to all use cases of SAP
HANA, there are configurations that are specific to and limited for use with SAP
Business Suite applications that are powered by SAP HANA. Table 2-2 shows
these additional T-shirt sizes.
Table 2-2 Additional T-shirt sizes for SAP Business Suite that are powered by SAP HANA

  SAP T-shirt size              XL        XXL
  Compressed data in memory     512 GB    1 TB       2 TB
  Server main memory            1 TB      2 TB       4 TB
  Number of CPUs

The workload for the SAP Business Suite applications has different
characteristics. It is less CPU-bound and more memory-intensive than a
standard SAP HANA workload. Therefore, the memory per core ratio is different
than for the standard T-shirt sizes. All workloads can be used on the T-shirt sizes
in Table 2-1 on page 23, including SAP Business Suite applications with SAP
HANA as the primary database. The T-shirt sizes in Table 2-2 are specific to and
limited for use with SAP Business Suite applications that are powered by SAP
HANA only.
Section 4.5.3, SAP Business Suite powered by SAP HANA on page 76 has
more information about which of the SAP Business Suite applications are
supported by SAP HANA as the primary database.
For more information about T-shirt size mappings to IBM eX5 based building
blocks, see 6.1, IBM eX5 based environments on page 146.

New concept of shortnames for SAP HANA


With the introduction of the latest Intel processor generation in February 2014, the maximum number of cores per CPU socket increased from 10 to 15, and each core provides more compute power than a previous-generation core. This has led to various memory per core ratios, which result in many different solution offerings for the same T-shirt size.


To simplify things and to avoid confusion, IBM decided to introduce the following
generic naming schema that replaces T-shirt sizes. Instead of S, M, L, and so on,
names for different types of building blocks with the following naming convention
are used:
XXX-YS-NNNN-Z (dashes can also be left out)
Here, XXX is the server model type, YS is the number of installed processors (or sockets), NNNN is the amount of memory in gigabytes, and Z indicates a stand-alone (S) or cluster (C) node.
For example, model AC3-2S-256-S represents a stand-alone 4S server
(AC3-models have four sockets) with two processors that are installed and 256
GB of memory. This model replaces former T-shirt size S.
A model AC4-8S-1024-C represents an 8-socket cluster node (AC4-models have
eight sockets) with eight processors that are installed and 1 TB of main memory.
This new naming schema is used in this book for all the X6 based SAP HANA landscapes that are described in 6.2, IBM X6 based environments on page 157.
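The naming convention can be expressed as a small parser. This sketch is illustrative only; the regular expression and the field names are our own, not an IBM specification:

```python
import re

# Pattern for building-block names such as AC3-2S-256-S; the dashes are
# optional, as the convention allows them to be left out.
PATTERN = re.compile(
    r"(?P<model>[A-Z]+\d+)-?(?P<sockets>\d+)S-?(?P<memory_gb>\d+)-?(?P<kind>[SC])$")

def parse_building_block(name):
    """Split a building-block name into its model, socket, memory, and
    deployment parts."""
    m = PATTERN.match(name)
    if not m:
        raise ValueError("not a valid building-block name: " + name)
    return {
        "model": m.group("model"),
        "installed_sockets": int(m.group("sockets")),
        "memory_gb": int(m.group("memory_gb")),
        "deployment": "stand-alone" if m.group("kind") == "S" else "cluster",
    }
```

For example, parsing AC3-2S-256-S yields model AC3 with two installed sockets, 256 GB of memory, and a stand-alone deployment, matching the example in the text.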

2.3.2 Sizing approach


The sizing of SAP HANA depends on the scenario in which SAP HANA is used.
The sizing methodology for SAP HANA is described in detail in the following SAP
Notes1:
- Note 1514966 - SAP HANA 1.0: Sizing SAP In-Memory Database
- Note 1637145 - SAP NetWeaver BW on HANA: Sizing SAP In-Memory Database
- Note 1793345 - Sizing for SAP Suite on HANA
- Note 1872170 - Suite on HANA memory sizing
These SAP notes and their attached documents and scripts provide a good
starting point for sizing an SAP HANA database environment.
Detailed sizing depends on many different parameters, such as compression
factor, the state of the NetWeaver BW system (size of the row store), encoding
schema, or cache size, among other parameters.

SAP Notes can be accessed at http://service.sap.com/notes. An SAP S-user ID is required.


Special considerations for scale-out BW systems


For specific information about scale-out SAP NetWeaver BW systems, review the
following SAP notes and attached presentations:
- Note 1637145 - SAP NetWeaver BW on SAP HANA: Sizing SAP In-Memory Database
- Note 1702409 - SAP HANA DB: Optimal number of scale out nodes for SAP NetWeaver BW on SAP HANA
- Note 1736976 - Sizing Report for BW on HANA
Note: SAP must be involved for a detailed sizing because the result of the
sizing affects the hardware infrastructure and SAP HANA licensing.
All SAP sizing documents that are mentioned in this section are listed to give a
first estimate only.
In addition to the sizing methodologies that are described in SAP Notes, SAP
provides sizing support for SAP HANA in the SAP Quick Sizer. The SAP Quick
Sizer is an online sizing tool that supports most of the SAP solutions that are
available. For SAP HANA, it supports sizing for the following systems:
- A stand-alone SAP HANA system, which implements the sizing algorithms that are described in SAP Note 1514966.
- SAP HANA as the database for an SAP NetWeaver BW system, which implements the sizing algorithms that are described in SAP Note 1637145.
- Special sizing support for the SAP HANA rapid-deployment solutions.
The SAP Quick Sizer is accessible online at the following website (an SAP
S-user ID is required):
http://service.sap.com/quicksizer

2.4 SAP HANA software licensing


As described in 2.2, SAP HANA delivery model on page 21, the prevalent
deployment method of SAP HANA is an appliance-like delivery model. However,
although the hardware partners deliver the infrastructure, including the operating
system and middleware, the license for the SAP HANA software must be
obtained directly from SAP.


In-memory Computing with SAP HANA on IBM eX5 and X6 Systems

Although the SAP HANA software comes packaged in two different editions
(platform and enterprise edition), the SAP HANA software licensing depends on
the use case. Chapter 4, SAP HANA integration scenarios on page 51
describes different use cases for SAP HANA.
Tip: Licensing SAP HANA is handled by SAP and there are different options
depending on the use case. Because metrics can always change, discuss
licensing options for your particular use case with SAP.


Chapter 3. Software components and data replication methods
This chapter explains the purpose of individual software components of the SAP
HANA solution and introduces available replication technologies.
This chapter covers the following topics:
SAP HANA software components
Data replication methods for SAP HANA

Copyright IBM Corp. 2013, 2014. All rights reserved.


3.1 SAP HANA software components


The SAP HANA solution is composed of the main software components that are
described in the following sections:

3.1.1, SAP HANA database on page 30


3.1.2, SAP HANA client on page 31
3.1.3, SAP HANA studio on page 32
3.1.4, SAP HANA studio repository on page 40
3.1.5, SAP HANA landscape management structure on page 41
3.1.6, SAP host agent on page 41
3.1.7, SAP HANA Lifecycle Manager on page 41
3.1.8, SAP HANA Lifecycle Management tools on page 42

Figure 3-1 illustrates the possible locations of these components.

Figure 3-1 Distribution of software components that are related to SAP HANA

3.1.1 SAP HANA database


The SAP HANA database is the heart of the SAP HANA offering and the most
important software component running on the SAP HANA appliance.


SAP HANA is an in-memory database that combines row-based and
column-based database technology. All standard features that are available in
other relational databases are supported (for example, tables, views, indexes,
triggers, and the SQL interface).
On top of these standard functions, the SAP HANA database also offers
modeling capabilities that allow you to define in-memory transformations of
relational tables into analytic views. These views are not materialized; therefore,
all queries provide real-time results that are based on the current content of the
underlying tables.
Another feature that extends the capabilities of the SAP HANA database is the
SQLScript programming language, which allows you to capture transformations
that might not be easy to define with simple modeling.
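The following Python sketch (not SAP HANA code; the table and its columns are invented) loosely illustrates why a non-materialized view always returns real-time results: the "view" is simply a query that aggregates the base table at read time.

```python
# Illustrative sketch of a non-materialized analytic view: the view is a
# query over the base table, evaluated at read time, so it always reflects
# the current data. Table and column names are invented for this example.
from collections import defaultdict

sales = [  # base (fact) table
    {"region": "EMEA", "revenue": 120.0},
    {"region": "APJ",  "revenue":  80.0},
]

def revenue_by_region():
    """The 'analytic view': aggregated on demand, never materialized."""
    totals = defaultdict(float)
    for row in sales:
        totals[row["region"]] += row["revenue"]
    return dict(totals)

print(revenue_by_region())            # {'EMEA': 120.0, 'APJ': 80.0}
sales.append({"region": "EMEA", "revenue": 30.0})
print(revenue_by_region())            # {'EMEA': 150.0, 'APJ': 80.0}
```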
The SAP HANA database can also be integrated with external applications, such
as an SAP application environment (for example, ERP). Using these possibilities,
customers can extend their models by implementing existing statistical and
analytical functions that were, for example, developed in ABAP.
The internal structures of the SAP HANA database are described in detail in
Chapter 2, SAP HANA overview on page 17.

3.1.2 SAP HANA client


The SAP HANA client is a set of libraries that are used by external applications to
connect to the SAP HANA database.
The following interfaces are available after installing the SAP HANA client
libraries:
SQLDBC
An SAP native database SDK that can be used to develop new custom
applications working with the SAP HANA database.
OLE DB for OLAP (ODBO) (available only on Windows)
ODBO is a Microsoft driven industry standard for multi-dimensional data
processing. The query language that is used with ODBO is the
Multidimensional Expressions (MDX) language.
Open Database Connectivity (ODBC)
The ODBC interface is a standard for accessing database systems, which
was originally developed by Microsoft.
Java Database Connectivity (JDBC)
JDBC is a Java based interface for accessing database systems.


The SAP HANA client libraries are delivered in 32-bit and 64-bit editions. Always
use the edition that matches the architecture of the application that uses the
client: 32-bit applications cannot use 64-bit client libraries, and vice versa.
To access the SAP HANA database from Microsoft Excel, you can also use a
special 32-bit edition of the SAP HANA client that is called SAP HANA client
package for Microsoft Excel.
The SAP HANA client is compatible with earlier versions of the SAP HANA
database, meaning that the revision of the client must be the same as or higher
than the revision of the database.
The SAP HANA client libraries must be installed on every machine where
connectivity to the SAP HANA database is required. This includes not only all
servers, but also user workstations that host applications connecting directly to
the SAP HANA database (for example, SAP BusinessObjects Client Tools or
Microsoft Excel).
Whenever the SAP HANA database is updated to a more recent revision, all
clients that are associated with this database also must be upgraded.
For more information about how to install the SAP HANA client, see the official
SAP guide SAP HANA Database - Client Installation Guide, which can be
downloaded at the following website:
http://help.sap.com/hana_platform

3.1.3 SAP HANA studio


The SAP HANA studio is a graphical user interface (GUI) that is required to work
with local or remote SAP HANA database installations. It is a multipurpose tool
that covers all of the main aspects of working with the SAP HANA database. The
user interface is slightly different for each function.
The SAP HANA studio does not depend on the SAP HANA client.


The following main function areas are provided by the SAP HANA studio (each
function area is also illustrated by a corresponding figure of the user interface):
Database administration
The key functions are stopping and starting the SAP HANA databases, status
overview, monitoring, performance analysis, parameter configuration, tracing,
and log analysis.
Figure 3-2 shows the SAP HANA studio user interface for database
administration.

Figure 3-2 SAP HANA studio - administration console (overview)


Security management
This function area provides tools that are required to create users, to define
and assign roles, and to grant database privileges.
Figure 3-3 shows an example of the user definition window.

Figure 3-3 SAP HANA studio - user definition window


Data management
These functions create, change, or delete database objects (such as tables,
indexes, and views), and include commands to manipulate data (for example,
insert, update, delete, or perform a bulk load).
Figure 3-4 shows an example of the table definition window.

Figure 3-4 SAP HANA studio - table definition window


Modeling
This is the user interface that is used to work with models (metadata
descriptions about how source data is transformed in resulting views),
including the possibility to define new custom models, and to adjust or delete
existing models.
Figure 3-5 shows a simple analytic model.

Figure 3-5 SAP HANA studio - modeling interface (analytic view)


Content management
These are the functions offering the possibility to organize models in
packages, to define delivery units for transport into a subsequent SAP HANA
system, or to export and import individual models or whole packages.
Content management functions are accessible from the main window in the
modeler perspective, as shown in Figure 3-6.

Figure 3-6 SAP HANA studio - content functions on the main window of modeler perspective


Replication management
Data replication into the SAP HANA database is controlled from the data
provisioning window in the SAP HANA studio, where new tables can be
scheduled for replication, suspended, or replication for a particular table can
be interrupted.
Figure 3-7 shows an example of a data provisioning window.

Figure 3-7 SAP HANA studio - data provisioning window


Lifecycle Management
The SAP HANA solution offers the possibility to automatically download and
install updates to SAP HANA software components. This function is
controlled from the Lifecycle Manager window in the SAP HANA studio.
Figure 3-8 shows an example of such a window.

Figure 3-8 SAP HANA studio - Lifecycle Manager window

SAP HANA database queries are typically consumed indirectly through front-end
components, such as SAP BusinessObjects BI 4.0 clients. Therefore, the SAP
HANA studio is required only for administration or development and is not
needed for regular users.
The SAP HANA studio runs on the Eclipse platform; therefore, every user must
have a recent version of the Java Runtime Environment (JRE) installed that uses
the same architecture (the 64-bit SAP HANA studio requires a 64-bit JRE).
The currently supported platforms are Windows 32-bit, Windows 64-bit, and
Linux 64-bit.


The SAP HANA studio is also compatible with earlier versions of the SAP HANA
database, meaning that the revision level of the studio must be the same as or
higher than the revision level of the database. However, based on practical
experience, the preferred approach is to keep the SAP HANA studio on the same
revision level as the SAP HANA database whenever possible. Installing and
running multiple revisions of the SAP HANA studio in parallel on one workstation
is possible. When one SAP HANA studio instance is used with multiple SAP
HANA databases, the revision level of the studio must be the same as or higher
than the highest revision level of the databases being connected to.
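The revision rule above can be summarized in a few lines of Python (the revision numbers are invented for illustration):

```python
# Sketch of the compatibility rule described above: one SAP HANA studio
# instance can manage several databases only if its revision is at least
# the highest database revision. Revision numbers here are invented.

def studio_is_compatible(studio_revision, database_revisions):
    return studio_revision >= max(database_revisions)

print(studio_is_compatible(74, [70, 72, 74]))  # True
print(studio_is_compatible(72, [70, 74]))      # False: studio too old
```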
SAP HANA studio must be updated to a more recent version on all workstations
whenever the SAP HANA database is updated. This can be automated by using
Update Server, which can be configured by using SAP HANA Lifecycle Manager
(HLM). There are more details about HLM in 3.1.7, SAP HANA Lifecycle
Manager on page 41. Using HLM is the best way to keep installations of SAP
HANA studio synchronized with the SAP HANA database revision.
For more information about how to install the SAP HANA studio, see the official
SAP guide, SAP HANA Database - Studio Installation Guide, which is available
for download at the following website:
http://help.sap.com/hana_platform

3.1.4 SAP HANA studio repository


Because SAP HANA studio is an Eclipse-based product, it can benefit from all
the standard features that are offered by this platform. One of these features is
the ability to automatically update the product from a central repository on the
SAP HANA server.
The SAP HANA studio repository is initially installed during the deployment of the
SAP HANA appliance and must be updated manually every time the SAP HANA
database is updated. This repository can then be used by all SAP HANA studio
installations to automatically download and install new versions of the code.
For more information about how to install the SAP HANA studio repository, see
the official SAP guide, SAP HANA Database - Studio Installation Guide, which is
available for download at the following website:
http://help.sap.com/hana_platform


3.1.5 SAP HANA landscape management structure


The SAP HANA landscape management (LM) structure (lm_structure) is an
XML file that describes the software components that are installed on a server.
The information in this file contains the following items:
System ID (SID) of the SAP HANA system and host name
A stack description, including the edition (depending on the license schema)
Information about the SAP HANA database, including the installation
directory
Information about the SAP HANA studio repository, including its location
Information about the SAP HANA client, including its location
Information about the host controller
The LM structure description also contains revisions of individual components,
so it must be upgraded when the SAP HANA database is upgraded.
Information that is contained in this file is used by the System Landscape
Directory (SLD) data suppliers and by the SAP HANA Lifecycle Manager (HLM).
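As an illustration of how such an XML descriptor can be consumed, the following Python sketch parses a made-up lm_structure-like file. The element and attribute names are invented to mirror the items listed above; the real lm_structure schema may differ.

```python
# Illustration only: the element names in this XML are invented to mirror
# the kinds of entries described above; the real lm_structure schema may
# differ. The sketch shows how such a descriptor could be read.
import xml.etree.ElementTree as ET

LM_XML = """
<lmStructure>
  <system sid="HDB" host="hananode01"/>
  <component name="hdb-server" version="1.00.74"/>
  <component name="hdb-client" version="1.00.74"/>
  <component name="hdb-studio-repository" version="1.00.74"/>
</lmStructure>
"""

root = ET.fromstring(LM_XML)
system = root.find("system")
components = {c.get("name"): c.get("version") for c in root.iter("component")}
print(system.get("sid"), components["hdb-server"])  # HDB 1.00.74
```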

3.1.6 SAP host agent


The SAP host agent is a standard part of every SAP installation. In an SAP
HANA environment, it is important in the following situations:
Automatic update using SAP HANA Lifecycle Manager
Remote start and stop of SAP HANA database instances

3.1.7 SAP HANA Lifecycle Manager


In older SAP HANA releases (SPS05 and before), there were two tools to
address update, configuration, and customization tasks:
Software Update Manager (SUM) for SAP HANA focused on automating
updates of various SAP HANA components and the SAP HANA database
itself.
The SAP HANA On-site Configuration Tool (OCT) automated the
customization of SAP HANA appliances, performing postinstallation steps
such as deploying additional components, adding or removing a host, or
deploying a secondary system.
In the latest SAP HANA release (SPS07), both tools are marked as deprecated
and replaced by SAP HLM.


The SAP HLM is a tool that automates many installation, configuration, and
deployment tasks that are related to SAP HANA environments. It offers the
following functions:

Rename an SAP HANA system (SID, instance number, or host name)


Configure SAP Landscape Transformation (SLT) replication
Configure the SAP HANA connection to System Landscape Directory (SLD)
Add/Remove Solution Manager Diagnostics (SMD) agent
Add/Remove additional SAP HANA system (MCOS installation)
Add/Remove hosts to/from scale-out (or distributed) SAP HANA installation
Configure SAP HANA system internal network
Update SAP HANA System - Apply Support Package Stack
Update SAP HANA System - Apply Single Support Packages
Add/Update Application Function Libraries (AFL) on an SAP HANA system
Add/Update SAP liveCache Applications (LCA) on an SAP HANA system
Add SAP HANA Smart Data Access (SDA) on an SAP HANA system

SAP HANA Lifecycle Manager can be run in three ways:


Locally or remotely as part of SAP HANA studio
Locally from the Linux command line on an SAP HANA server node
Locally or remotely through a web interface
For more information about SAP HANA Lifecycle Manager, see the official SAP
guide, SAP HANA Update and Configuration Guide, which is available for
download at the following website:
http://help.sap.com/hana_platform

3.1.8 SAP HANA Lifecycle Management tools


The SAP HANA Unified Installer was a tool that was targeted to be used by SAP
HANA hardware partners. It automatically installed all required software
components on the SAP HANA appliance according to SAP requirements and
specifications.
With the release of SAP HANA SPS07, the SAP HANA Unified Installer is marked
as deprecated. However, it is still available as part of the standard delivery
package.
The SAP HANA Lifecycle Management tools (available in Linux as hdblcm and
hdblcmgui) are delivered as the replacement for the SAP HANA Unified Installer.
hdblcm is a text-based Linux command-line utility that can be run in interactive
or batch mode; its counterpart hdblcmgui delivers the same functions in a
graphical user interface.


Like Unified Installer, these two utilities are mainly intended for hardware vendors
to automate initial deployment of SAP HANA on their appliance models. System
administrators are expected to use SAP HANA Lifecycle Manager (HLM)
described in 3.1.7, SAP HANA Lifecycle Manager on page 41.
Main features of the SAP HANA Lifecycle Management tools are:
Install SAP HANA system (single-node or scale-out)
Add host to scale-out SAP HANA system
Enable or disable auto-start of SAP HANA database instance
Regeneration of SSL certificates that are used by SAP HANA Lifecycle
Manager
Installation and update of additional SAP HANA components:
Application Function Library
SAP HANA Client
SAP HANA Studio
SAP HANA Database server
SAP HANA Lifecycle Manager (HLM)
SAP liveCache applications
Update of server-side studio repository
Install and update SAP host agent
For more information about SAP HANA Lifecycle Management tools, see the
official SAP guide, SAP HANA Server Installation Guide, which is available for
download at the following location:
http://help.sap.com/hana_platform

3.1.9 Solution Manager Diagnostics agent


SAP HANA can be connected to SAP Solution Manager 7.1, SP03 or higher. It
is preferable to use SP05 or higher.
The Solution Manager Diagnostics (SMD) provides a set of tools to monitor and
analyze SAP systems, including SAP HANA. It provides a centralized way to
trace problems in all systems that are connected to an SAP Solution Manager
system.

Note: SP02 is also possible with a monitor content update and additional SAP Notes.


The SMD agent is an optional component that can be installed on the SAP
HANA appliance. It enables diagnostic tests of the SAP HANA appliance through
SAP Solution Manager. The SMD agent provides access to the database logs
and the file system, and collects information about the system's CPU and
memory consumption through the SAP host agent.
More information about how to deploy SMD agent is provided in the official SAP
guide, SAP HANA Update and Configuration Guide, which is available for
download at the following location:
http://help.sap.com/hana_platform

3.2 Data replication methods for SAP HANA


Data can be written to the SAP HANA database either directly by a source
application, or it can be replicated by using replication technologies.
The following replication methods are available for use with the SAP HANA
database:
Trigger-based replication
This method is based on database triggers that are created in the source
system to record all changes to monitored tables. These changes are then
replicated to the SAP HANA database by using the SAP Landscape
Transformation system.
ETL-based replication
This method employs an Extract, Transform, and Load (ETL) process to
extract data from the data source, transform it to meet the business or
technical needs, and load it into the SAP HANA database. The SAP
BusinessObject Data Services application is used as part of this replication
scenario.
Extractor-based replication
This approach uses the embedded SAP NetWeaver Business Warehouse
(SAP NetWeaver BW) that is available on every SAP NetWeaver based
system. SAP NetWeaver BW starts an extraction process by using available
extractors and then redirects the write operation to the SAP HANA database
instead of the local Persistent Staging Area (PSA).
Log-based replication
This method is based on reading the transaction logs from the source
database and reapplying them to the SAP HANA database.
Figure 3-9 on page 45 illustrates these replication methods.


Figure 3-9 Available replication methods for SAP HANA

The following sections describe these replication methods for SAP HANA in
more detail.

3.2.1 Trigger-based replication with SAP Landscape Transformation


SAP Landscape Transformation replication is based on tracking database
changes by using database triggers. All modifications are stored in logging tables
in the source database, which ensures that every change is captured even when
the SAP Landscape Transformation system is not available.
The SAP Landscape Transformation system reads changes from source systems
and updates the SAP HANA database. The replication process can be
configured as real-time (continuous replication) or scheduled replication in
predefined intervals.
The SAP Landscape Transformation operates on the application level; therefore,
the trigger-based replication method benefits from the database abstraction that
is provided by the SAP software stack, which makes it database-independent. It
also has extended source system release coverage, where supported releases
start from SAP R/3 4.6C up to the newest SAP Business Suite releases.


The SAP Landscape Transformation system also supports direct replication
from database systems that are supported by the SAP NetWeaver platform. In
this case, the database must be connected to the SAP Landscape
Transformation system directly (as an additional database), and the SAP
Landscape Transformation system plays the role of the source system.
The replication process can be customized by creating ABAP routines and
configuring their execution during the replication process. This feature allows the
SAP Landscape Transformation system to replicate additional calculated
columns, and to scramble existing data or filter replicated data based on defined
criteria.
The SAP Landscape Transformation replication uses proven System Landscape
Optimization (SLO) technologies (such as Near Zero Downtime, Test Data
Migration Server (TDMS), and SAP Landscape Transformation) and can handle
both Unicode and non-Unicode source databases. The SAP Landscape
Transformation replication provides a flexible and reliable replication process,
fully integrates with SAP HANA Studio, and is simple and fast to set up.
The SAP Landscape Transformation Replication Server does not have to be a
separate SAP system. It can run on any SAP system with the SAP NetWeaver
7.02 ABAP stack (Kernel 7.20EXT). However, it is preferable to install the SAP
Landscape Transformation Replication Server on a separate system to avoid a
high replication load causing a performance impact on the base system.
The SAP Landscape Transformation Replication Server is the ideal solution for
all SAP HANA customers who need real-time (or scheduled) data replication
from SAP NetWeaver based systems or databases that are supported by SAP
NetWeaver.
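The following Python sketch is a conceptual model (not SAP code) of the trigger-based mechanism: a trigger records every change in a logging table, and the replicator later drains that table, so no change is lost even when the replication system is temporarily unavailable. All names are invented; the real SAP Landscape Transformation system works on the ABAP application level.

```python
# Conceptual sketch of trigger-based replication: a database trigger writes
# every change into a logging table; the replicator drains that table and
# applies the changes to the target, even if it was down for a while.
# All names are invented for illustration.

logging_table = []          # filled by the "trigger" in the source system
target = {}                 # stands in for the SAP HANA table

def trigger_on_change(key, value):
    logging_table.append((key, value))   # captured even if replicator is down

def replicate():
    while logging_table:
        key, value = logging_table.pop(0)
        target[key] = value              # apply the change to SAP HANA

trigger_on_change("MATNR-1", {"price": 10})
trigger_on_change("MATNR-1", {"price": 12})   # second change, same record
replicate()
print(target)  # {'MATNR-1': {'price': 12}}
```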

3.2.2 ETL-based replication with SAP BusinessObjects Data Services


An ETL-based replication for SAP HANA can be set up by using SAP
BusinessObjects Data Services, which is a full-featured ETL tool that gives
customers maximum flexibility regarding the source database system:
Customers can specify and load the relevant business data in defined periods
from an SAP ERP system into the SAP HANA database.
SAP ERP application logic can be reused by reading extractors or using SAP
function modules.
It offers options for the integration of third-party data providers and supports
replication from virtually any data source.
Data transfers are done in batch mode, which limits the real-time capabilities of
this replication method.


SAP BusinessObjects Data Services provides several kinds of data quality and
data transformation functions. Due to the rich feature set that is available,
implementation time for the ETL-based replication is longer than for the other
replication methods. SAP BusinessObjects Data Services offers integration with
SAP HANA. SAP HANA is available as a predefined data target for the load
process.
The ETL-based replication server is the ideal solution for all SAP HANA
customers who need data replication from non-SAP data sources.
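Conceptually, the batch-mode flow reduces to the classic extract, transform, and load steps, sketched below in Python. The source rows, the cleansing rule, and the target are invented for illustration; SAP BusinessObjects Data Services provides far richer transformation and data quality functions.

```python
# Minimal extract-transform-load sketch to illustrate the batch-mode flow
# described above. Source, cleansing rule, and target are invented.

def extract(source):
    return list(source)                      # pull a batch from the source

def transform(rows):
    # example cleansing step: normalize country codes, drop invalid rows
    return [dict(r, country=r["country"].upper())
            for r in rows if r.get("country")]

def load(rows, hana_table):
    hana_table.extend(rows)                  # bulk insert into the target

source_rows = [{"id": 1, "country": "de"}, {"id": 2, "country": None}]
hana_table = []
load(transform(extract(source_rows)), hana_table)
print(hana_table)  # [{'id': 1, 'country': 'DE'}]
```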

3.2.3 Extractor-based replication with Direct Extractor Connection


Extractor-based replication for SAP HANA is based on existing application logic
that is available in every SAP NetWeaver system. The SAP NetWeaver BW
package that is a standard part of the SAP NetWeaver platform can be used to
run an extraction process and store the extracted data in the SAP HANA
database.
This function requires some corrections and configuration changes to both the
SAP HANA database (import of delivery unit and parameterization) and on the
SAP NetWeaver BW system as part of the SAP NetWeaver platform
(implementing corrections by using an SAP note or installing a support package
and parameterization). Corrections in the SAP NetWeaver BW system ensure
that extracted data is not stored in local Persistent Staging Area (PSA), but
diverted to the external SAP HANA database.
Usage of native extractors instead of the replication of underlying tables can
bring certain benefits. Extractors offer the same transformations that are used by
SAP NetWeaver BW systems. This can decrease the complexity of modeling
tasks in the SAP HANA database.
This type of replication is not real time, and only the features and
transformation capabilities that are provided by a given extractor can be used.
Replication using Direct Extractor Connection (DXC) can be realized in the
following basic scenarios:
Using the embedded SAP NetWeaver BW function in the source system
SAP NetWeaver BW functions in the source system usually are not used.
After the implementation of the required corrections, the source system calls
its own extractors and pushes data into the external SAP HANA database.
The source system must be based on SAP NetWeaver 7.0 or higher. Because
the function of a given extractor is diverted into SAP HANA database, this
extractor must not be in use by the embedded SAP NetWeaver BW
component for any other purpose.


Using an existing SAP NetWeaver BW to drive replication
An existing SAP NetWeaver BW system can be used to extract data from the
source system and to write the result to the SAP HANA system.
The release of the SAP NetWeaver BW system that is used must be at least
SAP NetWeaver 7.0, and the given extractor must not be in use for this
particular source system.
Using a dedicated SAP NetWeaver BW to drive replication
The last option is to install a dedicated SAP NetWeaver system to extract data
from the source system and store the result in the SAP HANA database. This
option has minimal impact on existing functions because no existing system is
changed in any way. However, a new system is required for this purpose.
The current implementation of this replication technology allows for only one
database schema in the SAP HANA database. Using one system for controlling
replication of multiple source systems can lead to collisions because all source
systems use the same database schema in the SAP HANA database.
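The core idea of DXC, diverting the extractor's write path from the local PSA to an external SAP HANA schema, can be sketched as follows (the flag and data are invented for illustration):

```python
# Sketch of the DXC idea: the extractor's write path is diverted from the
# local Persistent Staging Area (PSA) to an external SAP HANA schema.
# The flag and data are invented for illustration.

local_psa = []
hana_schema = []

def extractor_write(rows, divert_to_hana):
    # With the DXC corrections applied, extracted data is not stored in
    # the local PSA but pushed to the external SAP HANA database.
    (hana_schema if divert_to_hana else local_psa).extend(rows)

extractor_write([{"doc": 100}], divert_to_hana=True)
print(local_psa, hana_schema)  # [] [{'doc': 100}]
```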

3.2.4 Log-based replication with Sybase Replication Server


The log-based replication for SAP HANA is realized by the Sybase Replication
Server. It captures table changes from low-level database log files and
transforms them into SQL statements that are in turn run on the SAP HANA
database. This action is similar to what is known as log shipping between two
database instances.
Replication with the Sybase Replication Server is fast and consumes little
processing power because of its closeness to the database system. However,
this mode of operation makes this replication method highly
database-dependent, and the source database system coverage is limited. It
also limits the conversion capabilities; therefore, replication with the Sybase
Replication Server supports only Unicode source databases. The Sybase
Replication Server cannot convert between code pages, and because SAP
HANA works with Unicode encoding internally, the source database must use
Unicode encoding as well. Also, certain table types that are used in SAP systems
are unsupported.

Note: Only certain versions of IBM DB2 on AIX, Linux, and HP-UX are supported by this replication
method.

To set up replication with the Sybase Replication Server, the definition and
content of tables that are chosen to be replicated must be copied initially from the
source database to the SAP HANA database. This initial load is done with the
R3Load program, which is also used for database imports and exports. Changes
in tables during initial copy operation are captured by the Sybase Replication
Server; therefore, no system downtime is required.
This replication method is recommended only for SAP customers who were
invited to use it during the ramp-up of SAP HANA 1.0. It was part of the SAP
HANA Enterprise Extended Edition, which was discontinued with the introduction
of SAP HANA 1.0 SPS05 in November 2012 (for more information, see section
2.1.3 at http://help.sap.com/hana/hana_sps5_whatsnew_en.pdf).
SAP recommends that you use trigger-based data replication with the SAP
Landscape Transformation Replication Server.
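Conceptually, log-based replication turns low-level change records from the source database log into SQL statements that are run on SAP HANA. The record format and table names in this Python sketch are invented; the Sybase Replication Server works on real transaction logs.

```python
# Conceptual sketch of log-based replication: low-level change records from
# the source database log are turned into SQL statements for SAP HANA.
# The record format and table names are invented for illustration.

def log_record_to_sql(record):
    if record["op"] == "INSERT":
        cols = ", ".join(record["values"])
        vals = ", ".join(repr(v) for v in record["values"].values())
        return f"INSERT INTO {record['table']} ({cols}) VALUES ({vals})"
    if record["op"] == "DELETE":
        return f"DELETE FROM {record['table']} WHERE id = {record['id']}"
    raise ValueError("unsupported operation")

stmt = log_record_to_sql(
    {"op": "INSERT", "table": "KNA1", "values": {"id": 7, "name": "ACME"}})
print(stmt)  # INSERT INTO KNA1 (id, name) VALUES (7, 'ACME')
```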

3.2.5 Comparing the replication methods


Each of the described data replication methods for SAP HANA has its benefits
and weaknesses:
The trigger-based replication method with the SAP Landscape
Transformation system provides real-time replication while supporting a wide
range of source database systems. It can handle both Unicode and
non-Unicode databases and uses proven data migration technology. It uses
the SAP application layer, which limits it to SAP source systems. Compared to
the log-based replication method, it offers broader support of source
systems while providing almost the same real-time capabilities, and for that
reason it is preferred for replication from SAP source systems.
The ETL-based replication method is the most flexible of all, paying the price
for flexibility with only near real-time capabilities. With its variety of possible
data sources, advanced transformation, and data quality functions, it is the
ideal choice for replication from non-SAP data sources.
The extractor-based replication method offers reuse of existing transformation
capabilities that are available in every SAP NetWeaver based system. This
method can decrease the required implementation effort. However, this type
of replication is not real time and is limited to capabilities that are provided by
the available extractors in the source system.



The log-based replication method with the Sybase Replication Server
provides the fastest replication from the source database to SAP HANA.
However, it is limited to Unicode-encoded source databases, and it does not
support all table types that are used in SAP applications. It provides no
transformation function, and the source database system support is limited.
Figure 3-10 shows a comparison of these replication methods.

Figure 3-10 Comparison of the replication methods for SAP HANA

The replication method that you choose depends on the requirements. When
real-time replication is needed to provide benefit to the business, and the
replication source is an SAP system, then the trigger-based replication is the
best choice. Extractor-based replication might keep project costs down by
reusing existing transformations. ETL-based replication provides the most
flexibility regarding data source, data transformation, and data cleansing options,
but does not provide real-time replication.

50

In-memory Computing with SAP HANA on IBM eX5 and X6 Systems

Chapter 4. SAP HANA integration scenarios
This chapter outlines the different ways that SAP HANA can be implemented in
existing client landscapes and highlights various aspects of such an integration.
Whenever possible, this chapter mentions real-world examples and related
offerings.
This chapter is divided into several sections that are based on the role of SAP
HANA and the way it interacts with other software components:

Basic use case scenarios
SAP HANA as a technology platform
SAP HANA for operational reporting
SAP HANA as an accelerator
SAP products running on SAP HANA
Programming techniques using SAP HANA


4.1 Basic use case scenarios


The following classification of use cases was presented in the EIM205
Applications powered by SAP HANA session during the SAP TechEd 2011
event. SAP defined the following five use case scenarios:

Technology platform
Operational reporting
Accelerator
In-memory products
Next generation applications

Figure 4-1 illustrates these use case scenarios.

Figure 4-1 Basic use case scenarios that are defined by SAP in session EIM205
(technology platform, operational reporting, accelerator, in-memory products, and
next generation applications, all built on the SAP HANA column store, row store,
and data modeling capabilities)

These five basic use case scenarios describe the elementary ways that SAP
HANA can be integrated. This chapter covers each of these use case scenarios
in dedicated sections.
SAP maintains a SAP HANA Use Case Repository with specific examples for
how SAP HANA can be integrated. This repository is online at the following
website:
http://www.experiencesaphana.com/community/resources/use-cases


The use cases in this repository are divided into categories that are based on
their relevance to a specific industry sector. It is a good idea to review this
repository to find inspiration about how SAP HANA can be used in various
scenarios.

4.2 SAP HANA as a technology platform


SAP HANA can be used even in non-SAP environments. The client can use
structured and unstructured data that is derived from non-SAP application
systems to take advantage of the power of SAP HANA. SAP HANA can be used to
accelerate existing functions or to provide new functions that were, until now,
not feasible.
Figure 4-2 presents SAP HANA as a technology platform.

Figure 4-2 SAP HANA as a technology platform (a non-SAP application and SAP or
non-SAP data sources feed the SAP HANA column and row stores, with data modeling
inside SAP HANA and SAP reporting and analytics on top)

SAP HANA is not technologically dependent on other SAP products and can be
used independently as the only SAP component in the client's information
technology (IT) landscape. However, SAP HANA can be easily integrated with
other SAP products, such as the SAP BusinessObjects BI platform for reporting or
SAP BusinessObjects Data Services for ETL replication, which gives clients the
possibility to use only the components that are needed.
There are many ways that SAP HANA can be integrated into a client landscape,
and it is not possible to describe all combinations. Software components around
the SAP HANA offering can be seen as building blocks, and every solution must
be assembled from the blocks that are needed in a particular situation.
This approach is versatile and the number of possible combinations is growing
because SAP keeps adding new components to its SAP HANA-related
portfolio.


IBM offers consulting services that help clients to choose the correct solution for
their business needs.

4.2.1 SAP HANA data acquisition


There are multiple ways that data can flow into SAP HANA. This section
describes the various options that are available. Figure 4-3 gives an overview.

Figure 4-3 Examples of SAP HANA deployment options with regard to data
acquisition (the current situation, with a non-SAP application writing to a
custom database; replacing the existing database with SAP HANA; replicating data
from the custom database into SAP HANA; and a dual database approach, where the
application writes to both the custom database and SAP HANA)

The initial situation is displayed schematically in the upper left of Figure 4-3. In
this example, a client-specific non-SAP application writes data to a custom
database that is slow and is not meeting client needs.
The other three examples in Figure 4-3 show that SAP HANA can be deployed in
such a scenario. These examples show that there is no single solution that is
best for every client, but that each situation must be considered independently.


Each of these three solutions has both advantages and disadvantages, which we
highlight to show aspects of a given solution that might need more detailed
consideration:
Replacing the existing database with SAP HANA
The advantage of this solution is that the overall architecture is not going to be
significantly changed. The solution remains simple without the need to
include additional components. Customers might also save on license costs
for the original database.
A disadvantage to this solution is that the custom application must be
adjusted to work with the SAP HANA database. If ODBC or JDBC is used for
database access, this is not a significant problem. Also, the whole setup must
be tested properly. Because the original database is being replaced, a certain
amount of downtime is inevitable.
Clients considering this approach must be familiar with the features and
characteristics of SAP HANA, especially when certain requirements must be
met by the database that is used (for example, special purpose databases).
Populating SAP HANA with data replicated from the existing database
The second option is to integrate SAP HANA as a side-car database to the
primary database and to replicate required data using one of the available
replication techniques.
An advantage of this approach is that the original solution is not touched and
no downtime is required. Also, only the required subset of data must be
replicated from the source database, which might allow customers to
minimize acquisition costs because SAP HANA acquisition costs are linked
directly to the volume of stored data.
The need for implementing replication technology can be seen as the only
disadvantage of this solution. Because data is delivered only into SAP HANA
through replication, this component is a vital part of the whole solution.
Customers considering this approach must be familiar with various replication
technologies, including their advantages and disadvantages, as described in
3.2, Data replication methods for SAP HANA on page 44.
Clients must also be aware that replication might cause additional load on the
existing database because modified records must be extracted and then
transported to the SAP HANA database. This aspect is highly dependent on
the specific situation and can be addressed by choosing the proper replication
technology.
Adding SAP HANA as a second database in parallel to the existing one
This third option keeps the existing database in place while adding SAP
HANA as a secondary database. The custom application then stores data in
both the original database and in the SAP HANA database.


This option balances advantages and disadvantages of the previous two
options. A main prerequisite is the ability of the source application to work
with multiple databases and the ability to control where data is stored. This
can be easily achieved if the source application was developed by the client
and can be changed, or if the source application is going to be developed as
part of this solution. If this prerequisite cannot be met, this option is not viable.
An advantage of this approach is that no replication is required because data
is stored directly in SAP HANA as required. Customers also can decide to
store some of the records in both databases.
If data that is stored in the original database is not going to be changed and
SAP HANA data is stored in both databases simultaneously, customers might
achieve only minimal disruption to the existing solution.
A main disadvantage is the prerequisite that the application must be able to
work with multiple databases and that it must be able to store data according
to the customer's expectations.
Customers considering this option must be aware of the abilities that are
provided by the application delivering data into the existing database. Also,
disaster recovery plans must be carefully adjusted, especially when
consistency between both databases is seen as a critical requirement.
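As an illustration of the dual database approach, the sketch below shows a thin persistence layer that writes each record to one or both stores. Python with two SQLite in-memory databases stands in for the custom database and SAP HANA (a real landscape would use the respective database clients); the DualWriter class and the sales table are invented for this example and are not part of any SAP interface.

```python
import sqlite3

# Stand-ins for the original custom database and SAP HANA.
# SQLite is used only so that the sketch is self-contained and runnable.
original_db = sqlite3.connect(":memory:")
hana_db = sqlite3.connect(":memory:")

for db in (original_db, hana_db):
    db.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

class DualWriter:
    """Persistence layer that stores each record in both databases.

    The application controls where data is stored: records that are
    needed for analytics go to both stores, others only to the original.
    """

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def insert_sale(self, record, analytics=True):
        stmt = "INSERT INTO sales (id, region, amount) VALUES (?, ?, ?)"
        targets = [self.primary, self.secondary] if analytics else [self.primary]
        for db in targets:
            db.execute(stmt, record)
            db.commit()

writer = DualWriter(original_db, hana_db)
writer.insert_sale((1, "EMEA", 1200.0))                   # kept in both stores
writer.insert_sale((2, "APAC", 800.0), analytics=False)   # original database only

print(original_db.execute("SELECT COUNT(*) FROM sales").fetchone()[0])  # 2
print(hana_db.execute("SELECT COUNT(*) FROM sales").fetchone()[0])      # 1
```

Note that this sketch sidesteps the consistency question mentioned above; a production implementation must decide what happens when one of the two writes fails.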
These examples must not be seen as an exhaustive list of integration options for
an SAP HANA implementation, but rather as a demonstration of how to develop
a solution that matches client needs.
It is possible to populate the SAP HANA database with data coming from multiple
different sources, such as SAP or non-SAP applications, and custom databases.
These sources can feed data into SAP HANA independently, each using a
different approach or in a synchronized manner using the SAP BusinessObjects
Data Services, which can replicate data from several different sources
simultaneously.
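To make the replication idea concrete, the following sketch imitates trigger-based replication: a database trigger captures every change in a logging table, and a replicator periodically applies the logged changes to the target. SQLite stands in for both the source database and SAP HANA, and all table names are illustrative; SAP Landscape Transformation replication works on the same principle but is a complete product, not a few lines of code.

```python
import sqlite3

source = sqlite3.connect(":memory:")   # stand-in for the source system's database
target = sqlite3.connect(":memory:")   # stand-in for SAP HANA

source.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
-- Logging table and trigger: the essence of trigger-based replication.
-- Every change is captured at the moment it happens.
CREATE TABLE repl_log (seq INTEGER PRIMARY KEY AUTOINCREMENT,
                       id INTEGER, status TEXT);
CREATE TRIGGER log_insert AFTER INSERT ON orders
BEGIN
    INSERT INTO repl_log (id, status) VALUES (NEW.id, NEW.status);
END;
""")
target.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")

def replicate(last_seq=0):
    """Apply all changes logged after last_seq to the target database."""
    rows = source.execute(
        "SELECT seq, id, status FROM repl_log WHERE seq > ? ORDER BY seq",
        (last_seq,)).fetchall()
    for seq, oid, status in rows:
        target.execute("INSERT OR REPLACE INTO orders VALUES (?, ?)", (oid, status))
        last_seq = seq
    target.commit()
    return last_seq

source.execute("INSERT INTO orders VALUES (1, 'NEW')")
source.execute("INSERT INTO orders VALUES (2, 'SHIPPED')")
seq = replicate()
print(target.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # 2
```

The trigger illustrates why this approach puts additional load on the source database: every change also writes a log record there.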

4.2.2 SAP HANA as a source for other applications


The second part of integrating SAP HANA is to connect existing or new
applications to run queries against the SAP HANA database. Figure 4-4 on
page 57 illustrates an example of such an integration.


Figure 4-4 An example of SAP HANA as a source for other applications (the non-SAP
application keeps its custom database but runs problematic queries against SAP
HANA, which also feeds SAP analytic tools and SAP BusinessObjects reporting)

The initial situation is visualized schematically in the left part of Figure 4-4. A
customer-specific application runs queries against a custom database, which is a
function that we must preserve.
A potential solution is in the right part of Figure 4-4. A customer-specific
application runs problematic queries against the SAP HANA database. If the
existing database is still part of the solution, specific queries that do not need
acceleration can still be run against the original database.
Specialized analytic tools, such as SAP BusinessObjects Predictive Analysis,
can be used to run statistical analysis on data that is stored in the SAP HANA
database. This tool can run analysis directly inside the SAP HANA database,
which helps avoid expensive transfers of massive volumes of data between the
application and the database. The result of this analysis can be stored in SAP
HANA, and the custom application can use these results for further processing,
for example, to facilitate decision making.
SAP HANA can be easily integrated with products from the SAP
BusinessObjects family. Therefore, these products can be part of the solution,
responsible for reporting, monitoring critical key performance indicators (KPIs)
using dashboards, or for data analysis.
These tools can also be used without SAP HANA; however, SAP HANA is
enabling these tools to process much larger volumes of data and still provide
results in reasonable time.


4.3 SAP HANA for operational reporting


Operational reporting is playing an increasingly important role. In today's economic
environment, companies must understand how various events in the globally
integrated world affect their business to be able to make proper adjustments to
counter the effects of these events. Therefore, the pressure to minimize the delay
in reporting is growing. An ideal situation is to have a real-time snapshot of
the current situation within seconds of a request.
Concurrently, the amount of data that is being captured grows every year.
Additional information is collected and stored at more detailed levels. All of this
makes operational reporting more challenging because huge amounts of data
must be processed quickly to produce the preferred result.
SAP HANA is a perfect fit for this task. Required information can be replicated
from existing transactional systems into the SAP HANA database and then
processed faster than directly on the source systems.
The following use case is often referred to as a data mart or side-car approach
because SAP HANA sits by the operational system and receives the operational
data (usually only an excerpt) from this system by means of replication.
In a typical SAP-based application landscape today, you find many systems,
such as SAP ERP, SAP CRM, SAP SCM, and other, possibly non-SAP,
applications. All of these systems contain loads of operational data, which can be
used to improve business decision making by using business intelligence
technology. Data that is used for business intelligence purposes can be gathered
either on a business unit level using data marts or on an enterprise level with an
enterprise data warehouse, such as the SAP NetWeaver Business Warehouse
(SAP NetWeaver BW). ETL processes feed the data from the operational
systems into the data marts and the enterprise data warehouse.
Figure 4-5 on page 59 illustrates such a typical landscape.


Figure 4-5 Typical view of an SAP-based application landscape today (an enterprise
data warehouse (BW) with SAP NetWeaver BW Accelerator serves corporate BI, while
local data marts, fed by ETL processes from the databases of SAP ERP 1 through n
(or CRM, SRM, SCM) and non-SAP business applications, serve local BI)

With the huge amount of data that is collected in an enterprise data warehouse,
response times of queries for reports or navigation through data can become an
issue, generating new requirements for the performance of such an environment.
To address these requirements, SAP introduced the SAP NetWeaver Business
Warehouse Accelerator (SAP NetWeaver BW Accelerator), which is built for this
use case by speeding up queries and reports in the SAP NetWeaver BW by
using in-memory technology. Although being a perfect fit for an enterprise data
warehouse holding huge amounts of data, the combination of SAP NetWeaver
BW and SAP NetWeaver BW Accelerator is not always a viable solution for
relatively small data marts.


With the introduction of SAP HANA 1.0, SAP provided an in-memory technology
aiming to support Business Intelligence (BI) at a business unit level. SAP HANA
combined with business intelligence tools, such as the SAP BusinessObjects
tools and data replication mechanisms feeding data from the operational system
into SAP HANA in real time, brought in-memory computing to the business unit
level. Figure 4-6 shows such a landscape with the local data marts replaced by
SAP HANA.

Figure 4-6 SAP vision after the introduction of SAP HANA 1.0 (the enterprise data
warehouse (BW) with its database and accelerator continues to serve corporate BI,
while the local data marts are replaced by SAP HANA 1.0 instances that are
synchronized with the databases of SAP ERP 1 through n (or CRM, SRM, SCM) and
non-SAP business applications)

BI functionality is provided by an SAP BusinessObjects BI tool, such as the SAP
BusinessObjects Explorer, communicating with the SAP HANA database through
the BI Consumer Services (BICS) interface.
This use case scenario is oriented mainly on existing products from the SAP
Business Suite, where SAP HANA acts as a foundation for reporting on large
volumes of data.
Figure 4-7 on page 61 illustrates the role of SAP HANA in an operational
reporting use case scenario.


Figure 4-7 SAP HANA for operational reporting (data is replicated from the RDBMS
of the SAP Business Suite into the SAP HANA column store and row store, where it
is modeled and consumed by SAP reporting and analytics)

Usually, the first step is the replication of data into the SAP HANA database,
which usually originates from the SAP Business Suite. However, some solution
packages are also built for non-SAP data sources.
Sometimes, source systems must be adjusted by implementing modifications or
by performing specific configuration changes.
Data typically is replicated by using the SAP Landscape Transformation
replication; however, other options, such as replication by using SAP
BusinessObjects Data Services or SAP HANA Direct Extractor Connection
(DXC), are also possible. The replication technology usually is chosen as part of
the package design and cannot be changed easily during implementation.
A list of tables to replicate (for SAP Landscape Transformation replication) or
transformation models (for replication using Data Services) are part of the
package.
SAP HANA is loaded with models (views) that are either static (designed by SAP
and packaged) or automatically generated based on customized criteria. These
models describe the transformation of source data into the resulting column
views. These views are then consumed by SAP BusinessObjects BI 4.0 reports
or dashboards that are either delivered as final products or pre-made templates
that can be finished as part of implementation process.
Some solution packages are based on additional components (for example, SAP
BusinessObjects Event Insight). If required, additional content that is specific to
these components can also be part of the solution package.
Individual use cases, required software components, prerequisites, and
configuration changes (including overall implementation processes) are properly
documented and attached as part of the delivery.


Solution packages can contain the following items:
SAP BusinessObjects Data Services Content (data transformation models)
SAP HANA Content (exported models with attribute views and analytic views)
SAP BusinessObjects BI Content (prepared reports and dashboards)
Transports and ABAP reports (adjusted code to be implemented in source
system)
Content for other software components, such as SAP BusinessObjects Event
Insight and Sybase Unwired Platform
Documentation
Packaged solutions such as these are being delivered by SAP under the name
SAP Rapid Deployment Solutions (RDSs) for SAP HANA or by other system
integrators, such as IBM.
Available offerings contain everything that customers need to implement the
requested function. Associated services, including implementation, can also be
part of the delivery.
Although SAP HANA as a technology platform can be seen as an open field
where every client can build their own solution by using available building blocks,
the SAP HANA for operational reporting scenarios are prepared packaged
scenarios that can easily and quickly be deployed on existing landscapes.
A list of SAP RDS offerings is available at the following website:
http://www.sap.com/resources/solutions-rapid-deployment/solutions-by-bu
siness.epx
Alternatively, you can use the following quick link and then click Technology →
SAP HANA:
http://service.sap.com/solutionpackages

4.4 SAP HANA as an accelerator


SAP HANA in a side-car approach as an accelerator is similar to a side-car
approach for reporting purposes. The difference is that the consumer of the data
that is replicated to SAP HANA is not a BI tool but the source system itself. The
source system can use the in-memory capabilities of the SAP HANA database to
run analytical queries on the replicated data. This helps applications performing
queries on huge amounts of data to run simulations, pattern recognition,
planning runs, and so on.


SAP HANA can also be used to accelerate existing processes in SAP Business
Suite systems, even for those systems that are not yet released to run
directly on the SAP HANA database.
Some SAP systems are processing large amounts of records that must be
filtered or aggregated based on specific criteria. Results are then used as inputs
for all dependent activities in a given system.
With large data volumes, the running time can become unacceptable. Such
workloads can easily run several hours, which can cause unnecessary delays.
Currently, these tasks typically are processed overnight as batch jobs.
SAP HANA as an accelerator can help decrease this running time.
Figure 4-8 illustrates this use case scenario.

Figure 4-8 SAP HANA as an accelerator (the SAP Business Suite system replicates
data from its RDBMS into the SAP HANA column store and row store and reads
analytical results back through the SAP user interface; SAP reporting and
analytics can also run on the replicated data)

The accelerated SAP system must meet specific prerequisites. Before this
solution can be implemented, installation of specific support packages or
implementation of SAP Notes might be required, which introduces the necessary
code changes in the source system.
The SAP HANA client must be installed on a given server, and the SAP kernel
must be adjusted to support direct connectivity to the SAP HANA database.


As a next step, replication of data from the source system is configured. Each
specific use case has a defined replication method and a list of tables that must
be replicated. The most common method is the SAP Landscape Transformation
replication. However, some solutions offer alternatives. For example, for the SAP
CO-PA Accelerator, replication can also be performed by an SAP CO-PA
Accelerator-specific ABAP report in the source system.
The source system is configured to have direct connectivity into SAP HANA as
the secondary database. The required scenario is configured according to the
specifications and then activated. During activation, the source system
automatically deploys the required column views into SAP HANA and activates
new ABAP code that was installed in the source system as the solution
prerequisite. This new code runs its queries against the SAP HANA
database, which leads to shorter execution times.
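The accelerator pattern boils down to routing data-intensive queries to the secondary in-memory database while everything else stays on the primary one. The following sketch illustrates this routing with two SQLite connections as stand-ins; the run_query helper and the copa table are invented for this example and are not part of any SAP interface.

```python
import sqlite3

primary = sqlite3.connect(":memory:")      # stand-in for the traditional RDBMS
accelerator = sqlite3.connect(":memory:")  # stand-in for SAP HANA as the secondary DB

# In the accelerator scenario the data in the secondary database is a
# replicated copy; here we simply load the same rows into both stores.
for db in (primary, accelerator):
    db.execute("CREATE TABLE copa (region TEXT, revenue REAL)")
    db.executemany("INSERT INTO copa VALUES (?, ?)",
                   [("EMEA", 100.0), ("EMEA", 250.0), ("APAC", 75.0)])
    db.commit()

def run_query(sql, params=(), accelerated=False):
    """Route analytical queries to the accelerator and everything else to
    the primary database - mirroring how adjusted application code reads
    replicated data through a secondary database connection."""
    conn = accelerator if accelerated else primary
    return conn.execute(sql, params).fetchall()

# A heavy aggregation goes to the in-memory copy...
totals = run_query(
    "SELECT region, SUM(revenue) FROM copa GROUP BY region ORDER BY region",
    accelerated=True)
# ...while a simple lookup stays on the primary database.
one_row = run_query("SELECT revenue FROM copa WHERE region = ? LIMIT 1", ("APAC",))

print(totals)  # [('APAC', 75.0), ('EMEA', 350.0)]
```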
Because SAP HANA is populated with valuable data, it is easy to extend the
accelerator use case by adding operational reporting functions. Additional
(usually optional) content is delivered for SAP HANA and for SAP
BusinessObjects BI 4.0 client tools, such as reports or dashboards.
SAP HANA as the accelerator and SAP HANA for operational reporting use case
scenarios can be combined into a single package. Here is a list of SAP RDSs
implementing SAP HANA as an accelerator:
SAP Bank Analyzer Rapid-Deployment Solution for Financial Reporting with
SAP HANA (see SAP Note 1626729):
http://service.sap.com/rds-hana-finrep
SAP rapid-deployment solution for customer segmentation with SAP HANA
(see SAP Note 1637115):
http://service.sap.com/rds-cust-seg
SAP ERP rapid-deployment solution for profitability analysis with SAP HANA
(see SAP Note 1632506):
http://service.sap.com/rds-hana-copa
SAP ERP rapid-deployment solution for accelerated finance and controlling
with SAP HANA (see SAP Note 1656499):
http://service.sap.com/rds-hana-fin
SAP Global Trade Services rapid-deployment solution for sanctioned-party
list screening with SAP HANA (see SAP Note 1689708):
http://service.sap.com/rds-gts


4.5 SAP products running on SAP HANA


Another way that SAP HANA can be deployed is to use SAP HANA as the
primary database for selected products.
SAP NetWeaver BW running on SAP HANA has been generally available since
April 2012. The SAP ERP Central Component (SAP ECC) running on HANA was
announced in early 2013, and other products from the SAP Business Suite family
are expected to follow.
One significant advantage of running existing products to use SAP HANA as the
primary database is the minimal disruption to the existing system. Almost all
functions, customizations, and with SAP NetWeaver BW, client-specific
modeling, are preserved because application logic that is written in ABAP is not
changed. From a technical perspective, the SAP HANA conversion is similar to
any other database migration.
Figure 4-9 illustrates SAP NetWeaver BW running on SAP HANA.

Figure 4-9 SAP products running on SAP HANA - SAP NetWeaver BW and SAP ECC (both
use the SAP HANA column store and row store as their primary database; SAP BW can
still use traditional extraction from the RDBMS of an SAP Business Suite system)

4.5.1 SAP NetWeaver Business Warehouse powered by SAP HANA


SAP HANA can be used as the database for an SAP NetWeaver BW installation.
In this scenario, SAP HANA replaces the traditional database server of an SAP
NetWeaver BW installation. The application servers stay the same.


The in-memory performance of SAP HANA improves query performance and
eliminates the need for manual optimizations by materialized aggregates in SAP
NetWeaver BW. Figure 4-10 shows SAP HANA as the database for the SAP
NetWeaver BW.

Figure 4-10 SAP HANA as the database for SAP NetWeaver BW (the enterprise data
warehouse (BW) runs on SAP HANA, with virtual data marts serving local BI, while
SAP ERP 1 through n (or CRM, SRM, SCM) and non-SAP business applications keep
their own databases, some of which can themselves be SAP HANA)

In contrast to an SAP NetWeaver BW system that is accelerated by the
in-memory capabilities of SAP NetWeaver BW Accelerator, an SAP NetWeaver
BW system with SAP HANA as the database keeps all data in-memory. With
SAP NetWeaver BW Accelerator, the client chooses the data to be accelerated,
which is then copied to the SAP NetWeaver BW Accelerator. Here, the traditional
database server (for example, IBM DB2 or Oracle) still acts as the primary data
store.
SAP NetWeaver BW on SAP HANA is probably the most popular SAP HANA use
case, which achieves performance improvements with relatively little effort.
The underlying database is replaced by the SAP HANA database, which
improves both data loading times and query run times. Because the application
logic that is written in ABAP is not impacted by this change, all investments in
developing BW models are preserved. The transition to SAP HANA is a
transparent process that requires minimal effort to adjust existing modeling.


In-memory optimized InfoCubes


InfoCubes in SAP NetWeaver BW running on a traditional database use the
so-called Enhanced Star Schema. This schema was designed to optimize
different performance aspects of working with multidimensional models on
existing database systems.
Figure 4-11 illustrates the Enhanced Star Schema in SAP NetWeaver BW with
an example.
Figure 4-11 Enhanced Star Schema in SAP NetWeaver BW (the fact table
/BI0/F0COPC_C08 references dimension tables, such as the Data Package dimension
/BI0/D0COPC_C08P and the Enterprise Structure dimension /BI0/D0COPC_C081; the
dimension tables hold Surrogate IDs (SIDs), which SID tables, such as
/BI0/SCOMP_CODE and /BI0/SPLANT, map to master data tables, such as
/BI0/PCOMP_CODE and /BI0/PPLANT)

The core part of every InfoCube is the fact table. This table contains dimension
identifiers (IDs) and corresponding key figures (measures). This table is
surrounded by dimension tables that are linked to fact tables using the dimension
IDs.


Dimension tables are small tables that group logically connected combinations of
characteristics, usually representing master data. Logically connected means
that the characteristics are highly related to each other, for example, company
code and plant. Combining unrelated characteristics leads to many possible
combinations, which can have a negative impact on the performance.
Because master data records are in separate tables outside of the InfoCube, an
additional table is required to connect these master data records to dimensions.
These additional tables contain a mapping of auto-generated Surrogate IDs
(SIDs) to the real master data.
This complex structure is required on classical databases; however, with SAP
HANA, this requirement is obsolete. So, SAP introduced the SAP HANA
Optimized Star Schema, which is illustrated in Figure 4-12.

Figure 4-12 SAP HANA Optimized Star Schema in an SAP NetWeaver BW system (the SID
columns are stored directly in the fact table /BI0/F0COPC_C08; only the Data
Package dimension table /BI0/D0COPC_C08P remains, and the SID tables and master
data tables, such as /BI0/SCOMP_CODE, /BI0/PCOMP_CODE, /BI0/SPLANT, and
/BI0/PPLANT, are unchanged)


The content of all dimensions (except for the Data Package dimension) is
incorporated into the fact table. This modification brings several advantages:
Simplified modeling
Poorly designed dimensions (wrong combinations of characteristics) cannot
affect performance anymore. Moving characteristics from one dimension to
another is not a physical operation anymore; instead, it is just a metadata
update.
Faster loading
Because dimension tables do not exist, all overhead workload that is related
to identification of existing combinations or creating combinations in the
dimension tables is not required anymore. Instead, the required SID values
are inserted directly into the fact table.
The SAP HANA Optimized Star Schema is used automatically for all newly
created InfoCubes on the SAP NetWeaver BW system running on the SAP
HANA database.
Existing InfoCubes are not automatically converted to this new schema during
the SAP HANA conversion of the SAP NetWeaver BW system. The conversion of
standard InfoCubes to in-memory optimized InfoCubes must be done manually
as a follow-up task after the database migration.
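The structural difference between the two schemas can be shown with a toy example: under the Enhanced Star Schema, a query must join the fact table to a dimension table and then to a SID table, whereas the SAP HANA Optimized Star Schema stores the SID directly in the fact table and removes one join. The tables below are drastically simplified stand-ins built in SQLite, not the real BW table definitions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
-- Classic Enhanced Star Schema (simplified): the fact table references a
-- dimension table, which maps dimension keys to SIDs, which in turn map
-- to master data.
CREATE TABLE fact_classic (dim_key INTEGER, amount REAL);
CREATE TABLE dim (dim_key INTEGER, sid_plant INTEGER);
CREATE TABLE sid_plant (sid INTEGER, plant TEXT);

INSERT INTO fact_classic VALUES (10, 500.0), (10, 300.0), (20, 100.0);
INSERT INTO dim VALUES (10, 1), (20, 2);
INSERT INTO sid_plant VALUES (1, 'Plant A'), (2, 'Plant B');

-- SAP HANA Optimized Star Schema (simplified): the SID column is stored
-- directly in the fact table, so the dimension table disappears.
CREATE TABLE fact_optimized (sid_plant INTEGER, amount REAL);
INSERT INTO fact_optimized VALUES (1, 500.0), (1, 300.0), (2, 100.0);
""")

classic = db.execute("""
    SELECT p.plant, SUM(f.amount)
    FROM fact_classic f
    JOIN dim d ON d.dim_key = f.dim_key
    JOIN sid_plant p ON p.sid = d.sid_plant
    GROUP BY p.plant ORDER BY p.plant""").fetchall()

optimized = db.execute("""
    SELECT p.plant, SUM(f.amount)
    FROM fact_optimized f
    JOIN sid_plant p ON p.sid = f.sid_plant
    GROUP BY p.plant ORDER BY p.plant""").fetchall()

print(classic)    # [('Plant A', 800.0), ('Plant B', 100.0)]
print(optimized)  # same result with one join fewer
```

The same result with one join fewer also illustrates why loading becomes faster: no dimension key combinations need to be looked up or created during data load.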

SAP HANA acceleration areas


The SAP HANA database can bring performance benefits; however, it is
important to set the expectations correctly. SAP HANA can improve loading and
query times, but certain limits cannot be overcome.
Migration of SAP NetWeaver BW to run on SAP HANA does not improve
extraction processes because extraction happens in the source system.
Therefore, it is important to understand how much of the overall load time is
taken by extraction from the source system. This information is needed to
correctly estimate the potential performance improvement for the load process.
Other parts of the load process are improved. The new Optimized Star Schema
removes unnecessary activities from the loading process.
Some of the calculations and application logic can be pushed to the SAP HANA
database, which ensures that data-intensive activities are being done on the SAP
HANA database level instead of on the application level. This activity increases
the performance because the amount and volume of data that is exchanged
between the database and the application are reduced.
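The effect of pushing logic down to the database can be illustrated with any SQL database. The following Python sketch uses SQLite purely as a stand-in because it ships with Python; SAP HANA itself is not involved, and the table and data are invented:

```python
import sqlite3

# Illustration: pushing an aggregation down to the database means only one
# result row crosses the database/application boundary instead of every
# detail row.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount REAL)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("EMEA", 10.0), ("EMEA", 20.0), ("AMER", 5.0)])

# Application-level logic: all detail rows are transferred, then summed
# in the application (two rows cross the boundary here).
rows = con.execute("SELECT amount FROM sales WHERE region = 'EMEA'").fetchall()
app_total = sum(a for (a,) in rows)

# Pushed-down logic: the database computes the sum; one row is transferred.
(db_total,) = con.execute(
    "SELECT SUM(amount) FROM sales WHERE region = 'EMEA'").fetchone()

print(app_total, db_total)   # 30.0 30.0
```

Both paths produce the same result; the difference is the amount of data exchanged between the database and the application.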

Chapter 4. SAP HANA integration scenarios

69

SAP HANA can calculate all aggregations in real time. Therefore, aggregates are
no longer required, and roll-up activity that is related to aggregate updates is
obsolete. This also reduces the overall run time of update operations.
If SAP NetWeaver BW Accelerator is used, the update of its indexes is also no
longer needed. Because SAP HANA is based on technology similar to SAP
NetWeaver BW Accelerator, all queries are accelerated. Query performance with
SAP HANA can be compared to situations where all cubes are indexed by the
SAP NetWeaver BW Accelerator. In reality, query performance can be even
faster than with SAP NetWeaver BW Accelerator because additional features are
available for SAP NetWeaver BW running on SAP HANA, for example, the
possibility to remove an InfoCube and instead run reports against in-memory
optimized DataStore Objects (DSOs).

4.5.2 Migrating SAP NetWeaver Business Warehouse to SAP HANA


There are multiple ways that an existing SAP NetWeaver Business Warehouse
(BW) system can be moved to an SAP HANA database. It is important to
distinguish between building a proof of concept (POC) demonstration system and
performing a productive migration.
The available options are divided into two main groups:
SAP NetWeaver Business Warehouse database migration
Transporting the content to the SAP NetWeaver Business Warehouse system
These two groups represent the main approaches to moving from a traditional
database to SAP HANA. Within each group, there are still many ways in which a
project plan can be orchestrated.
The following sections explain these two approaches in more detail.

SAP NetWeaver Business Warehouse database migration


The following software levels are prerequisites for SAP NetWeaver BW running
on SAP HANA (for the latest information, see SAP Note 1600929):
SAP NetWeaver BW 7.30 SP5, SAP NetWeaver BW 7.31 SP4, or SAP
NetWeaver BW 7.40 (according to SAP Note 1600929, SP07 or higher must be
imported for your SAP NetWeaver BW installation (ABAP) before migration
and after installation)
SAP HANA 1.0 SPS03 for SAP NetWeaver BW 7.30 and 7.31, and SAP
HANA 1.0 SPS06 for SAP NetWeaver BW 7.40 (the latest available SAP
HANA revision is highly recommended)
In-memory Computing with SAP HANA on IBM eX5 and X6 Systems

Not all SAP NetWeaver BW add-ons are supported to run on the SAP
HANA-based system. For the latest information, see the following SAP Notes:
Note 1600929 - SAP NetWeaver BW powered by SAP HANA DB: Information
Note 1532805 - Add-On Compatibility of SAP NetWeaver 7.3
Note 1652738 - Add-on compatibility for SAP NetWeaver EHP 1 for NW 7.30
Unless your system already meets the minimal release requirements, the first
step before converting SAP NetWeaver BW is to upgrade the system to the latest
available release and to the latest available support package level.
A database upgrade might be required as part of the release upgrade or as a
prerequisite before database migration to SAP HANA. For a list of supported
databases, see SAP Note 1600929.
Table 4-1 lists the databases that were approved as source databases for the
migration to SAP HANA at the time of writing.
Table 4-1 Supported source databases for a migration to the SAP HANA database

Database                                BW 7.30    BW 7.31    BW 7.40
Oracle                                  11.2       11.2       11.2
MaxDB                                   7.8        7.9        7.9
MS SQL Server                           2008       2008       2008, 2012
IBM DB2 for Linux, UNIX, and Windows    9.7        9.7        10.1
IBM DB2 for i                           6.1, 7.1   6.1, 7.1   7.1
IBM DB2 for z/OS                        9, 10      9, 10      10
Sybase ASE                              N/A        15.7       15.7, 16

SAP HANA is not a supported database for any SAP NetWeaver Java stack.
Therefore, dual-stack installations (ABAP plus Java) must be separated into two
individual stacks by using the Dual-Stack Split Tool from SAP.
Because some existing installations are still non-Unicode installations, another
important prerequisite step might be a conversion of the database to Unicode
encoding. This Unicode conversion can be done as a separate step or as part of
the conversion to the SAP HANA database.


All InfoCubes with data persistency in the SAP NetWeaver BW Accelerator are
set as inactive during conversion, and their content in SAP NetWeaver BW
Accelerator is deleted. These InfoCubes must be reloaded again from the
original primary persistence; therefore, required steps must be incorporated into
the project plan.
A migration to the SAP HANA database follows the same process as any other
database migration. All activity in the SAP NetWeaver BW system is suspended
after all preparation activities are finished. A special report is run to generate
database-specific statements for the target database that is used during import.
Next, the content of the SAP system is exported to a platform-independent
format and stored in files on disk.
These files are then transported to the primary application server of the SAP
NetWeaver BW system. The application part of SAP NetWeaver BW is not
allowed to run on the SAP HANA appliance. Therefore, a minimal installation
must have two servers:
SAP HANA appliance hosting the SAP HANA database
The SAP HANA appliance is delivered by IBM with the SAP HANA database
preinstalled. However, the database is empty.
Primary application server hosting ABAP instance of SAP NetWeaver BW
There are minimal restrictions regarding the operating system of the primary
application server. For available combinations, see the Product Availability
Matrix (PAM) (search for SAP NetWeaver 7.3 and download the overview
presentation), found at the following website:
http://service.sap.com/pam
At the time of writing, the operating systems that are shown in Table 4-2 on
page 73 are available to host the ABAP part of the SAP NetWeaver BW
system.


Table 4-2 Supported operating systems for the primary application server

Operating system      Platform                        BW 7.30   BW 7.31   BW 7.40
Windows Server 2012   x86_64 (64-bit)                 No        No        Yes
Windows Server 2008   x86_64 (64-bit) (incl. R2)      Yes       Yes       Yes
AIX 6.1, 7.1          Power (64-bit)                  Yes       Yes       Yes
HP-UX 11.31           IA64 (64-bit)                   Yes       Yes       Yes
Solaris 10            SPARC (64-bit)                  Yes       Yes       Yes
Solaris 10            x86_64 (64-bit)                 Yes       Yes       Yes
Linux SLES 11         x86_64 (64-bit)                 Yes       Yes       Yes
Linux RHEL 5          x86_64 (64-bit)                 Yes       No        No
Linux RHEL 6          x86_64 (64-bit)                 Yes       Yes       Yes
IBM i 7.1             Power (64-bit)                  No        Yes       Yes

The next step is the database import. It contains the installation of the SAP
NetWeaver BW on the primary application server and the import of data into the
SAP HANA database. The import occurs remotely from the primary application
server as part of the installation process.
Parallel export/import by using socket connection and File Transfer Protocol
(FTP) and Network File System (NFS) exchange modes are not supported.
Currently, only the asynchronous file-based export/import method is available.
After mandatory post-activities, conversion of InfoCubes and DataStore objects
to their in-memory optimized form must be initiated to use all the benefits that the
SAP HANA database can offer. This task can be done either manually for each
object or as a mass operation by using a special report.
Clients must plan enough time to perform this conversion. This step can be
time-consuming because the content of all InfoCubes must be copied into
temporary tables that have the new structure.
After all post-activities are finished, the system is ready to be tested.


Transporting the content to the SAP NetWeaver Business Warehouse system
Unlike with a database migration, this approach is based on performing
transports of activated objects (Business Content) from the existing SAP
NetWeaver BW system into a newly installed SAP NetWeaver BW system with
SAP HANA as a primary database.
The advantage of this approach is that content can be transported across
releases, as explained in the following SAP Notes:

Note 1090842 - Composite note: Transports across several releases


Note 454321 - Transports between Basis Release 6.* and 7.0
Note 1273566 - Transports between Basis Release 700/701 and 702/73*
Note 323323 - Transport of all activated objects of a system

The possibility to transport content across different releases can reduce the
amount of effort that is required to build a proof of concept (POC) system
because most of the prerequisite activities, such as the release upgrade,
database upgrade, and dual-stack split, are not needed.
After transporting the available objects (metadata definitions), their content must
also be transported from the source to the target system. The SAP NetWeaver
BW consultant must assess which available options are most suitable for this
purpose.
This approach is not recommended for the production migration itself, where a
conventional database migration is used instead. Therefore, the additional
effort that is invested in building a POC system in the same way that the
production system is treated is a valuable test. This test can help customers
estimation for the project, estimate required runtimes, and define detailed
planning of all actions that are required. All involved project team members
become familiar with the system and can solve and document all specific
problems.

Parallel approach to SAP HANA conversion


The preferred approach to convert an SAP NetWeaver BW system to use the
SAP HANA database is a parallel approach, meaning that the new SAP
NetWeaver BW system is created as a clone of the original system. The standard
homogeneous system copy method can be used for this purpose.
This clone is then reconfigured in a way that both the original and the cloned BW
system is functional and both systems can extract data from the same sources.
Detailed instructions about how to perform this cloning operation are explained in
scenario B2 of SAP Note 886102.


Here is some important information that is relevant to the cloned system. To
understand the full procedure that must be applied on the target BW system, see
SAP Note 886102. The SAP Note states the following:
Caution: This step deletes all transfer rules and PSA tables of these source
systems, and the data is lost. A message is generated stating that the source
system cannot be accessed (since you deleted the host of the RFC connection).
Choose Ignore.
It is important to understand the consequences of this action and to plan the
required steps to reconfigure the target BW system so that it can again read data
from the source systems.
Persistent Staging Area (PSA) tables can be regenerated by the replication of
DataSources from the source systems, and transfer rules can be transported
from the original BW system. However, the content of these PSA tables is lost
and must be reloaded from source systems.
This step might cause problems where DataStore objects are used and PSA
tables contain the complete history of data.
An advantage of creating a cloned SAP NetWeaver BW system is that the
original system is not impacted and can still be used for productive tasks. The
cloned system can be tested and results compared with the original system
immediately after the clone is created and after every important project
milestone, such as a release upgrade or the conversion to SAP HANA itself.
Both systems are synchronized because both systems periodically extract data
from the same source systems. Therefore, after a project is finished, and the new
SAP NetWeaver BW system running on SAP HANA meets the client's
expectations, the new system can replace the original system.
A disadvantage of this approach is the additional load that is imposed on the
source systems, which is caused by both SAP NetWeaver BW systems
performing extraction from the same source system, and certain limitations that
are mentioned in the following SAP Notes:
Note 775568 - Two and more BW systems against one OLTP system
Note 844222 - Two OR more BW systems against one OLTP system


4.5.3 SAP Business Suite powered by SAP HANA


SAP announced restricted availability of SAP Business Suite, which is powered
by SAP HANA, in January 2013 (see
http://www.news-sap.com/sap-business-suite-on-sap-hana-launch). After a
successful ramp-up program, SAP made this offering generally available during
the SAPPHIRE NOW conference, held in Orlando, FL, in May 2013.
SAP HANA can be used as the database for an SAP Business Suite installation.
In this scenario, SAP HANA replaces the traditional database server of an SAP
Business Suite installation. The application servers stay the same, and can run
on any platform that supports the SAP HANA database client. As of May 2014,
the following applications of SAP Business Suite are supported by SAP HANA as
their primary database:

Enterprise Resource Planning (ERP)


Customer Relationship Management (CRM)
Supply Chain Management (SCM)
Supplier Relationship Management (SRM)

The Product Lifecycle Management (PLM) application is not yet available with
SAP HANA, but there is a plan to make it available to run with SAP HANA.
SAP Business Suite, which is powered by SAP HANA, does not induce any
functional changes. Configuration, customization, the ABAP Workbench,
connectivity, security, transports, and monitoring stay unchanged. For
modifications, the same upgrade requirements as with any other upgrade apply.
Customized code can stay unchanged, or can be adjusted to use additional
performance.
SAP Business Suite applications can benefit in various ways from using the
in-memory technology of SAP HANA:
Running dialog processes instead of batch
Integration of unstructured data and machine to machine data (M2M) with
ERP processes
Integration of predictive analysis with ERP processes
Running operational reports in real time, directly on the source data
Removing the need for operational data stores
Eliminating the need for data replication or transfers to improve operational
report performance


SAP is enabling the existing functions in SAP applications to use the in-memory
technology with the following versions:

SAP enhancement package 6 for SAP ERP 6.0, version for SAP HANA
SAP enhancement package 2 for SAP CRM 7.0, version for SAP HANA
SAP enhancement package 2 for SAP SCM 7.0, version for SAP HANA
SAP enhancement package 2 for SAP SRM 7.0, version for SAP HANA

Restrictions
There are certain restrictions in effect regarding running SAP Business Suite
with SAP HANA (see SAP Note 1774566).
Currently, multi-node support for SAP Business Suite with SAP HANA is limited
(for up-to-date information about multi-node support, see SAP Note 1825774).
SAP HANA multi-node configurations can serve different purposes:
Achieving high availability by the use of standby nodes
Scaling the main memory to accommodate larger databases (scale-out)
Scale-out scenarios with multiple worker nodes (as described in 5.3.3,
Scaling-out SAP HANA using GPFS on page 128) are not yet supported for
SAP Business Suite with SAP HANA.
High availability (HA) scenarios for SAP Business Suite with SAP HANA are
supported, but restricted to the simplest case of two servers, one being the
worker node and one acting as a standby node. In this case, the database is not
partitioned; the entire database is on a single node. This is why this
configuration is sometimes also referred to as a single-node HA configuration.
Because of these restrictions with regard to scalability, SAP decided to allow
configurations with a higher memory per core ratio, specifically for this use case.
Sections 6.1.2, Single-node eX5 solution for SAP Business Suite on HANA on
page 151 and 6.2.2, Single-node X6 solution for SAP Business Suite on HANA
on page 165 describe available configurations that are dedicated to SAP
Business Suite, which is powered by SAP HANA.

4.6 Programming techniques using SAP HANA


The last use case scenario is based on recent developments from SAP where
applications can be built directly against the SAP HANA database using all of its
features, such as the embedded application server (XS Engine) or stored
procedures, which allow logic to be processed directly inside the SAP HANA
database.


A new software component can be integrated with SAP HANA either directly or it
can be built on top of the SAP NetWeaver stack, which can work with the SAP
HANA database using client libraries.
Because of its breadth and depth, this use case scenario is not described in
detail as part of this publication.


Chapter 5. IBM System x solutions for SAP HANA
This chapter describes the IBM System x solutions for SAP HANA. It describes
the configurations of the workload-optimized solutions for both generations of
Intel processors. It then explores IBM GPFS and what features it provides to the
solution. The final section describes the networking switches that are part of the
solution.
This chapter covers the following topics:

IBM eX5 systems


IBM X6 systems
IBM General Parallel File System
IBM System Networking options

Copyright IBM Corp. 2013, 2014. All rights reserved.


5.1 IBM eX5 systems


IBM decided to base their first-generation offering for SAP HANA on their
high-performance, scalable IBM eX5 family of servers. These servers represent
the IBM high-end Intel-based enterprise servers. IBM eX5 systems, all based on
the eX5 Architecture, are the IBM BladeCenter HX5, the IBM System x3850 X5,
the IBM System x3950 X5, and the IBM System x3690 X5. They have a common
set of technical specifications and features:
The HX5 is a single wide (30 mm) blade server that follows the same design
as all previous IBM blade servers. The HX5 brings unprecedented levels of
capacity to high-density environments.
The x3850 X5 is a 4U highly rack-optimized server. The x3850 X5 also forms
the basis of the x3950 X5, the new flagship server of the IBM x86 server
family. These systems are designed for maximum usage, reliability, and
performance for compute-intensive and memory-intensive workloads, such as
SAP HANA.
The x3690 X5 is a 2U rack-optimized server. This machine brings the eX5
features and performance to the mid tier. It is an ideal match for the smaller,
two-CPU configurations for SAP HANA.
When compared with other machines in the System x portfolio, these systems
represent the upper end of the spectrum and are suited for the most demanding
x86 tasks.
For SAP HANA, the x3690 X5 and the x3950 X5 are used, which is why only
these systems are featured in this chapter.
Note: For the latest information about the eX5 portfolio, see IBM eX5 Portfolio
Overview: IBM System x3850 X5, x3950 X5, x3690 X5, and BladeCenter
HX5, REDP-4650, which provides further eX5 family members and
capabilities.

5.1.1 x3850 X5 and x3950 X5


The x3850 X5 (Figure 5-1 on page 81) offers improved performance and
enhanced features, including MAX5 memory expansion and workload-optimized
x3950 X5 models to maximize memory, minimize costs, and simplify deployment.


Figure 5-1 x3850 X5 and x3950 X5

The x3850 X5 and the workload-optimized x3950 X5 are the logical successors
to the x3850 M2 and x3950 M2, featuring the IBM eX4 chipset. Compared with
previous generation servers, the x3850 X5 offers the following features:
High memory capacity
Up to 64 dual inline memory modules (DIMMs) standard and 96 DIMMs with
the MAX5 memory expansion per four-socket server
Intel Xeon processor E7 family
Exceptional scalable performance with advanced reliability for your most
data-demanding applications
Extended SAS capacity with eight HDDs and 900 GB 2.5-inch SAS drives or
1.6 TB of hot-swappable Redundant Array of Independent Disks 5 (RAID 5)
with eXFlash technology
Standard dual-port Emulex 10 GB Virtual Fabric adapter
10-core, 8-core, and 6-core processor options with up to 2.4 GHz (10-core),
2.13 GHz (8-core), and 1.86 GHz (6-core) speeds with up to 30 MB L3 cache
Scalable to a two-node system with eight processor sockets and 128 dual
inline memory module (DIMM) sockets
Seven PCIe x8 high-performance I/O expansion slots to support hot-swap
capabilities
Optional embedded hypervisor


The x3850 X5 and x3950 X5 both scale to four processors and 2 TB of RAM.
With the MAX5 attached, the system can scale to four processors and 3 TB of
RAM. Two x3850 X5 servers can be connected together for a single system
image with eight processors and 4 TB of RAM.

5.1.2 x3690 X5
The x3690 X5 (Figure 5-2) is a 2U rack-optimized server that brings new features
and performance to the mid tier.

Figure 5-2 IBM System x3690 X5

This machine is a two-socket, scalable system that offers up to four times the
memory capacity of current two-socket servers. It supports the following
specifications:
Up to two sockets for Intel Xeon E7 processors. Depending on the processor
model, processors have six, eight, or ten cores.
Scalable 32 - 64 DIMM sockets with the addition of an MAX5 memory
expansion unit.
Advanced networking capabilities with a Broadcom 5709 dual Gb Ethernet
controller standard in all models and an Emulex 10 Gb dual-port Ethernet
adapter standard on some models, optional on all others.
Up to 16 hot-swap 2.5-inch SAS HDDs, up to 16 TB of maximum internal
storage with RAID 0, 1, or 10 to maximize throughput and ease installation.
RAID 5 is optional. The system comes standard with one HDD backplane that
can hold four drives. Second and third backplanes are optional for an
additional 12 drives.
New eXFlash high-input/output operations per second (IOPS) solid-state
device (SSD) storage technology.


Five PCIe 2.0 slots.


Integrated management module (IMM) for enhanced systems management
capabilities.
The x3690 X5 is an excellent choice for memory-demanding and
performance-demanding business applications, such as SAP HANA. It provides
maximum performance and memory in a dense 2U package.

5.1.3 Intel Xeon processor E7 family


The IBM eX5 portfolio of servers uses processors from the Intel Xeon processor E7
family to maximize performance. They were introduced by Intel in 2011 as part of
their high-end high-performance processor line. They can be used to scale up to four
or more processors. When used in the x3850 X5 or x3950 X5, these servers can
scale up to eight processors.
The Intel Xeon E7 processors have many features that are relevant for the SAP
HANA workload. For more in-depth information about the benefits of the Intel
Xeon processor E7 family for SAP HANA, see the Intel white paper Analyzing
Business as it Happens, found at:
http://www.intel.com/content/dam/doc/white-paper/high-performance-compu
ting-xeon-e7-analyze-business-as-it-happens-with-sap-hana-software-brie
f.pdf

Reliability, availability, and serviceability


Most system errors are handled in hardware by the use of technologies, such as
error checking and correcting (ECC) memory. The E7 processors have additional
reliability, availability, and serviceability (RAS) features because of their
architecture:
Cyclic redundancy checking (CRC) on the QPI links
The data on the QPI link is checked for errors.
QPI packet retry
If a data packet on the QPI link has errors or cannot be read, the receiving
processor can request that the sending processor try sending the packet
again.
QPI clock failover
If there is a clock failure on a coherent QPI link, the processor on the other
end of the link can take over providing the clock. This is not required on the
QPI links from processors to I/O hubs because these links are asynchronous.


Scalable memory interconnect (SMI) packet retry


If a memory packet has errors or cannot be read, the processor can request
that the packet be resent from the memory buffer.
SMI retry
If there is an error on an SMI link, or a memory transfer fails, the command
can be tried again.
SMI lane failover
When an SMI link exceeds the preset error threshold, it is disabled, and
memory transfers are routed through the other SMI link to the memory buffer.
All these features help prevent data from being corrupted or lost in memory. This
is especially important with an application, such as SAP HANA, because any
failure in the area of memory or inter-processor communication leads to an
outage of the application or even of the complete system. With huge amounts of
data loaded into main memory, even a restart of only the application means
considerable time is required to return to operation.

Machine Check Architecture


The Intel Xeon processor E7 family also features the Machine Check
Architecture (MCA), which is a RAS feature that enables the handling of system
errors that otherwise require the operating system to be halted. For example, if a
dead or corrupted memory location is discovered, but it cannot be recovered at
the memory subsystem level, and if it is not in use by the system or an
application, an error can be logged but the operation of the server can continue.
If it is in use by a process, the application to which the process belongs can be
aborted or informed about the situation.
Implementation of the MCA requires hardware support, firmware support (such
as that found in the Unified Extensible Firmware Interface (UEFI)), and operating
system support. Microsoft, SUSE, Red Hat, and other operating system vendors
included support for the Intel MCA on the Intel Xeon processors in their latest
operating system versions.
SAP HANA is the first application that uses the MCA to handle system errors and
prevent the application from being terminated in the case of a system error. Figure 5-3 on
page 85 shows how SAP HANA uses the MCA.


Figure 5-3 Intel Machine Check Architecture with SAP HANA

If a memory error is encountered that cannot be corrected by the hardware, the
processor sends an MCA recovery signal to the operating system. An operating
system supporting MCA, such as SUSE Linux Enterprise Server, which is used
in the SAP HANA appliance, now determines whether the affected memory page
is in use by an application. If unused, it unmaps the memory page and marks it
as bad. If the page is used by an application, traditionally the OS must hold that
application, or in the worst case stop all processing and halt the system. With
SAP HANA being MCA-aware, the operating system can signal the error
situation to SAP HANA, giving it the chance to try to repair the effects of the
memory error.
Using the knowledge of its internal data structures, SAP HANA can decide what
course of action to take. If the corrupted memory space is occupied by one of the
SAP in-memory tables, SAP HANA reloads the associated tables. In addition, it
analyzes the failure and checks whether it affects other stored or committed data,
in which case it uses savepoints and database logs to reconstruct the committed
data in a new, unaffected memory location.
With the support of MCA, SAP HANA can take appropriate action at the level of
its own data structures to ensure a smooth return to normal operation and avoid
a time-consuming restart or loss of information.
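The decision flow described above can be summarized in a short sketch. The function and its return strings are invented for illustration; the real handling is implemented in hardware, firmware, the operating system, and SAP HANA itself:

```python
# Hypothetical sketch of the MCA error-handling flow described above.
def handle_memory_error(correctable, page_in_use, app_is_mca_aware):
    if correctable:
        # Handled entirely in hardware (for example, by ECC).
        return "corrected by hardware"
    if not page_in_use:
        # The OS unmaps the memory page and marks it as bad.
        return "page unmapped and marked bad"
    if app_is_mca_aware:
        # SAP HANA reloads the affected table, or reconstructs committed
        # data from savepoints and logs in a new memory location.
        return "application reconstructs data"
    # Without MCA awareness, the OS must stop the affected application.
    return "application terminated"

print(handle_memory_error(False, True, True))   # application reconstructs data
```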


5.1.4 Memory
For an in-memory appliance, such as SAP HANA, a systems main memory, its
capacity, and its performance play an important role. The Intel Xeon processor
E7 family, which is shown in Figure 5-4, has a memory architecture that is suited
to the requirements of such an appliance.
The E7 processors have two SMIs. Therefore, memory must be installed in
matched pairs. For better performance, or for systems that are connected
together, memory must be installed in sets of four. The memory that is used in
the eX5 systems is DDR3 SDRAM registered DIMMs. All of the memory runs at
1066 MHz or less, depending on the processor.

Figure 5-4 Memory architecture with Intel Xeon processor E7 family

Memory DIMM placement


The eX5 servers support various ways to install memory DIMMs. It is important
to understand that because of the layout of the SMI links, memory buffers, and
memory channels, you must install the DIMMs in the correct locations to
maximize performance.


Figure 5-5 shows eight possible memory configurations for the two memory
cards and 16 DIMMs connected to each processor socket in an x3850 X5.
Similar configurations apply to the x3690 X5 and HX5. Each configuration has a
relative performance score.

Configuration  DIMM placement per processor                               Relative performance
1              2 memory controllers, 2 DIMMs per channel, 8 DIMMs per MC  1.00
2              2 memory controllers, 1 DIMM per channel, 4 DIMMs per MC   0.94
3              2 memory controllers, 2 DIMMs per channel, 4 DIMMs per MC  0.61
4              2 memory controllers, 1 DIMM per channel, 2 DIMMs per MC   0.58
5              1 memory controller, 2 DIMMs per channel, 8 DIMMs per MC   0.51
6              1 memory controller, 1 DIMM per channel, 4 DIMMs per MC    0.47
7              1 memory controller, 2 DIMMs per channel, 4 DIMMs per MC   0.31
8              1 memory controller, 1 DIMM per channel, 2 DIMMs per MC    0.29

Figure 5-5 Relative memory performance based on DIMM placement (one processor and two memory
cards shown)

The following key information from this chart is important:


The best performance is achieved by populating all memory DIMMs in the
server (configuration 1 in Figure 5-5).
Populating only one memory card per socket can result in approximately a
50% performance degradation. (Compare configuration 1 with 5.)


Memory performance is better if you install DIMMs on all memory channels
than if you leave any memory channels empty. (Compare configuration 2
with 3.)
Two DIMMs per channel result in better performance than one DIMM per
channel. (Compare configuration 1 with 2, and compare configuration 5 with
6.)

Nonuniform memory architecture


Nonuniform memory architecture (NUMA) is an important consideration when
configuring memory because a processor can access its own local memory
faster than non-local memory. The configurations that are used for SAP HANA do
not use all available DIMM sockets. For configurations like these, another
principle to consider when configuring memory is that of balance. A balanced
configuration has all of the memory cards configured with the same amount of
memory. This principle helps keep remote memory access to a minimum.
A server with a NUMA design, such as the servers in the eX5 family, has local and
remote memory. For a given thread running in a processor core, local memory
refers to the DIMMs that are directly connected to that particular processor.
Remote memory refers to the DIMMs that are not connected to the processor
where the thread is running currently. Remote memory is attached to another
processor in the system and must be accessed through a QPI link. However,
using remote memory adds latency. The more such latencies add up in a server,
the more performance can degrade. Starting with a memory configuration where
each CPU has the same local RAM capacity is a logical step toward keeping
remote memory accesses to a minimum.
In a NUMA system, each processor has fast, direct access to its own memory
modules, reducing the latency that arises because of bus-bandwidth contention.
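The cost of remote access can be illustrated with a small model. The nanosecond values below are illustrative assumptions, not measured eX5 figures; the point is simply that average latency grows with the fraction of remote accesses, which is exactly what a balanced DIMM configuration minimizes.

```python
# Illustrative NUMA cost model; the latency values are assumptions
# chosen for demonstration, not measurements of eX5 hardware.
LOCAL_NS = 100   # access to a DIMM attached to the local processor
REMOTE_NS = 160  # access that must traverse a QPI link

def avg_latency_ns(remote_fraction):
    """Average memory latency for a given share of remote accesses."""
    return (1 - remote_fraction) * LOCAL_NS + remote_fraction * REMOTE_NS

print(avg_latency_ns(0.1))  # mostly local accesses
print(avg_latency_ns(0.5))  # half of all accesses traverse QPI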

Hemisphere mode
Hemisphere mode is an important performance optimization of the Intel Xeon
processor E7, 6500, and 7500 product families. Hemisphere mode is
automatically enabled by the system if the memory configuration allows it. This
mode interleaves memory requests between the two memory controllers within
each processor, enabling reduced latency and increased throughput. It also
allows the processor to optimize its internal buffers to maximize memory
throughput.
Hemisphere mode is enabled only when the memory configuration behind each
memory controller on a processor is identical. In addition, because eight DIMMs
per processor are required for using all memory channels, eight DIMMs per
processor must be installed at a time for optimized memory performance.
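The eligibility rule can be restated as a simple check. This is a hypothetical helper, not a tool shipped with the systems; it only expresses the condition that the DIMM population behind both memory controllers of a processor must be identical.

```python
def hemisphere_eligible(mc1_dimms, mc2_dimms):
    """True if the DIMM population behind both memory controllers is
    identical (same channel slots and same DIMM sizes in GB)."""
    return mc1_dimms == mc2_dimms

# Eight 16 GB DIMMs per processor, four behind each memory controller:
mc1 = {"ch0": 16, "ch1": 16, "ch2": 16, "ch3": 16}
mc2 = dict(mc1)
print(hemisphere_eligible(mc1, mc2))                # True

# A mismatched DIMM behind one controller disables hemisphere mode:
print(hemisphere_eligible(mc1, dict(mc1, ch3=32)))  # False
```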

88

In-memory Computing with SAP HANA on IBM eX5 and X6 Systems

5.1.5 Flash technology storage


As described in 1.1.2, Data persistence on page 4, storage technology
providing high IOPS capabilities with low latency is a key component of the
infrastructure for SAP HANA. IBM uses two different flash memory elements in
its workload-optimized solutions for SAP HANA. The x3850 X5/x3950 X5-based configurations that are used for the IBM Systems Solution for SAP HANA feature the IBM High IOPS SSD PCIe adapter as the flash memory device. The x3690 X5 solution uses IBM eXFlash storage.

High IOPS adapter


The IBM High IOPS SSD PCIe adapters provide a new generation of
ultra-high-performance storage that is based on SSD technology for System x.
Designed for high-performance servers and computing appliances, these
adapters deliver throughput of up to 900,000 IOPS, while providing the added
benefits of lower power, cooling, and management impact and a smaller storage
footprint. Based on a standard PCIe architecture that is coupled with
silicon-based NAND clustering storage technology, the High IOPS adapters are
optimized for System x rack-mount systems. They are available in storage
capacities up to 2.4 TB.
These adapters use NAND flash memory as the basic building block of SSD
storage and contain no moving parts. Thus, they are less sensitive to issues that
are associated with vibration, noise, and mechanical failure. They function as a
PCIe storage and controller device, and after the appropriate drivers are loaded,
the host operating system sees them as block devices. Note, however, that these adapters cannot be used as bootable devices.
The IBM High IOPS PCIe adapters combine high IOPS performance with low
latency. As an example, with 512-byte block random reads, the IBM 1.2 TB High IOPS MLC Mono adapter can deliver 143,000 IOPS, compared with 420 IOPS for a 15 K RPM 146 GB disk drive. The read access latency is about 68 microseconds, roughly one-hundredth of the latency of a 15 K RPM 146 GB disk drive (about 5 ms, or 5000 microseconds). The write access latency is even lower, at about 15 microseconds.
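The relationship between IOPS, block size, and throughput behind such figures can be checked with a short calculation. The 4 KB block size below is an assumption chosen for illustration; the IOPS values are the ones quoted above.

```python
def throughput_mbps(iops, block_bytes):
    """Sustained throughput (decimal MBps) implied by IOPS and block size."""
    return iops * block_bytes / 1e6

flash_iops = 143_000  # High IOPS adapter figure quoted above
hdd_iops = 420        # 15 K RPM disk drive figure quoted above

print(round(flash_iops / hdd_iops))       # flash serves roughly 340x the IOPS
print(throughput_mbps(flash_iops, 4096))  # implied throughput at 4 KB blocks
```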
Reliability features include the usage of Enterprise-grade MLC (eMLC),
advanced wear-leveling, ECC protection, and Adaptive Flashback redundancy
for RAID-like chip protection with self-healing capabilities, providing unparalleled
reliability and efficiency. Advanced bad-block management algorithms enable
taking blocks out of service when their failure rate becomes unacceptable. These
reliability features provide a predictable lifetime and up to 25 years of data
retention.


The x3950 X5-based models of the IBM Systems Solution for SAP HANA come
with IBM High IOPS adapters with 1.2 TB storage capacity (7143-HAx, -HBx, and
-HCx).
Figure 5-6 shows the IBM 1.2 TB High IOPS MLC Mono adapter.

Figure 5-6 IBM 1.2 TB High IOPS MLC Mono adapter

eXFlash
IBM eXFlash is the name of the eight 1.8-inch SSDs, the backplanes, SSD
hot-swap carriers, and indicator lights that are available for the x3690 X5. Each
eXFlash can be put in place of four SAS or SATA disks. The eXFlash units
connect to the same types of ServeRAID disk controllers as the SAS/SATA disks.
Figure 5-7 shows an eXFlash unit, with the status light assembly on the left side.

[Figure: eXFlash unit with the status lights on the left and the solid-state drives (SSDs).]

Figure 5-7 IBM eXFlash unit


In addition to using less power than rotating magnetic media, the SSDs are more
reliable and can service many more IOPS. These attributes make them suited to
I/O-intensive applications, such as transaction processing, logging, backup and
recovery, and Business Intelligence (BI). Built on enterprise-grade MLC NAND
flash memory, the SSD drives that are used in eXFlash for SAP HANA deliver up
to 60,000 read IOPS per single drive. Combined into an eXFlash unit, these
drives potentially can deliver up to 480,000 read IOPS and up to 4 GBps of
sustained read throughput per eXFlash unit.
In addition to its superior performance, eXFlash offers superior uptime with three
times the reliability of mechanical disk drives. SSDs have no moving parts to fail.
Each drive has its own backup power circuitry, error correction, data protection,
and thermal monitoring circuitry. They use Enterprise Wear-Leveling to extend
their use even longer.
A single eXFlash unit accommodates up to eight hot-swap SSDs and can be
connected to up to two performance-optimized controllers. The x3690 X5-based
models for SAP HANA enable RAID protection for the SSD drives by using two
ServeRAID M5015 controllers with the ServeRAID M5000 Performance
Accelerator Key for the eXFlash units.

5.1.6 x3950 X5 Workload Optimized Solution for SAP HANA


All x3950 X5-based workload-optimized solutions for SAP HANA follow one
architectural approach, which ensures the highest level of investment protection
by providing a seamless upgrade path for clients when scaling up their
environment.
Certain configuration options are defined by SAP, including the processor being
used, the minimum storage capacity that must be provided, and the network
connectivity of the system.
A standard x3950 X5 workload-optimized solution has the following minimum
components:
- Two, four, or eight Intel Xeon E7-8870 processors
- Up to 4 TB of DDR3 main memory
- Two local 900 GB SAS drives in a RAID 1 for the operating system (SLES 11) and local housekeeping
- Six local 900 GB SAS drives in a RAID 5 to store SAP HANA savepoints
- One 1.2 TB High IOPS adapter to store SAP HANA log files
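The usable capacity of these drive arrays follows from the RAID levels. The sketch below uses raw drive sizes and ignores file system and formatting overhead:

```python
def raid_usable_gb(level, drives, drive_gb):
    """Approximate usable capacity in GB, ignoring formatting overhead."""
    if level == 1:               # mirroring: half the raw capacity
        return drives * drive_gb // 2
    if level == 5:               # one drive's worth of capacity for parity
        return (drives - 1) * drive_gb
    raise ValueError("RAID level not covered by this sketch")

print(raid_usable_gb(1, 2, 900))  # OS mirror: 900 GB usable
print(raid_usable_gb(5, 6, 900))  # savepoint array: 4500 GB usable
```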


CPU and memory


SAP restricts the supported CPU models to the Xeon E7-x870 processors (with x being 2, 4, or 8 to denote the maximum number of supported processors in a single server). Because the x3950 X5 models can scale up to eight sockets, IBM offers only the Intel Xeon processor E7-8870 at 2.4 GHz.
To allow memory to scale from 256 GB up to 4 TB, memory modules with two different capacities are used:
- 16 GB DIMM DDR3 ECC (4Rx4, 1.35 V) CL7 1066 MHz
- 32 GB DIMM DDR3 ECC (4Rx4, 1.35 V) CL7 1066 MHz
For an overview of which DIMMs are used in which T-shirt size, see 6.1, IBM eX5 based environments on page 146, which details the different memory configurations. DIMM placement is crucial for best performance. Changing the module placement or memory configuration of a workload-optimized solution for SAP HANA is not supported.

Storage
Each SAP HANA model must provide a certain amount of traditional storage
capacity so that savepoints of the in-memory database can be written out to
persistent storage at regular intervals. In addition, a log entry must be stored for
every change in the database.
All systems come with storage for both the data volume and the log volume, as
shown in Figure 5-8 on page 93. Savepoints are stored on a RAID protected
array of 10 K SAS hard disk drives (HDDs), and optimized for data throughput.
Savepoints consist of a consistent backup of all database tables that are kept in
main memory. SAP HANA database logs are stored on flash-based High IOPS
PCIe adapters.


[Figure: over time, data savepoints are written to persistent storage on server-local SAS drives (optimized for throughput), while logs of committed transactions are written to the High IOPS SSD (optimized for high IOPS and low latency); both reside in a GPFS file system for maximum performance and scalability.]

Figure 5-8 SAP HANA data persistency on internal storage of the x3950 X5 workload-optimized systems

This flash technology storage device is optimized for high IOPS performance and
low latency to provide the SAP HANA database with a log storage that allows the
highest possible performance. Because a transaction in the SAP HANA
database can return only after the corresponding log entry is written to the log storage, overall database performance is limited by how quickly those log entries can be persisted.
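This commit behavior, acknowledging a transaction only after its log entry is durable, is the classic write-ahead-logging pattern. The sketch below is illustrative only and is in no way the SAP HANA implementation; it just shows why log-device latency sits directly in the commit path.

```python
import os
import tempfile

class TinyLog:
    """Minimal write-ahead-log sketch (illustrative, not SAP HANA code)."""
    def __init__(self, path):
        self.f = open(path, "ab")

    def commit(self, entry: bytes):
        self.f.write(entry + b"\n")
        self.f.flush()
        os.fsync(self.f.fileno())  # block until the log device persists it
        return True                # only now may the transaction return

log_path = os.path.join(tempfile.mkdtemp(), "log01")
log = TinyLog(log_path)
print(log.commit(b"UPDATE t1 SET x = 1"))  # True once the entry is durable
```

Every commit pays one synchronous write to the log device, so a medium with tens-of-microseconds latency permits far more commits per second than one with multi-millisecond latency.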


Figure 5-9 shows the two storage devices, HDD and flash, together with the SAP
HANA software stack for a single node that is based on the x3950 X5 model.
Both the savepoints (data01) and the logs (log01) are stored once, denoted as
the first replica in the figure.1

[Figure: a single x3950 X5 node running DB partition 1 of the SAP HANA DB (index server, statistic server, SAP HANA studio) on the shared GPFS file system, with the first replica of data01 on HDD and of log01 on flash.]

Figure 5-9 Storage architecture of a x3950 workload-optimized solution for SAP HANA (SAP HANA DB mentioned twice to be consistent with scale-out figures later on in this chapter)

The HDD storage in Figure 5-9 consists of the local drives that are internal to the
x3950 server, and is enhanced with additional drives in a locally attached SAS
enclosure for bigger T-shirt sizes that require more storage capacity. The amount
of flash memory can be enhanced with additional HighIOPS adapters that are
installed in the server.
Despite GPFS being a cluster file system, single node IBM SAP HANA solutions
also can use it. From a GPFS point of view, single node configuration is a specific
form of a cluster, consisting of just one node. This single-node solution does not
use the cluster features of GPFS, but takes advantage of GPFS by combining the different types of storage on eX5 systems into one file system and placing files onto the proper storage media according to the I/O type (that is, either data or log).
1 Previous editions of this book used the term primary data for the first replica. To be consistent with official documentation and to emphasize that there is no difference between multiple replicas of a single file, this edition uses the term first replica (and second and third replica later on in the book).

This placement is handled transparently, without SAP HANA needing to know about the different media types underneath.
Using GPFS for all SAP HANA T-shirt sizes enables seamless scalability without
the need to reformat the storage when growing into bigger T-shirt sizes. In
addition, GPFS provides great performance and reliable storage for huge amounts of data.
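The type-based placement described above can be pictured as a simple mapping from file role to storage pool. This is a hypothetical Python sketch with made-up pool names; real GPFS expresses this with declarative placement policy rules, not application code.

```python
# Hypothetical pool names for illustration only.
POOLS = {"log": "flash_pool", "data": "hdd_pool"}

def place(filename):
    """Route log files to flash and data (savepoint) files to HDD."""
    role = "log" if filename.startswith("log") else "data"
    return POOLS[role]

print(place("log01"))   # log volume lands on the flash pool
print(place("data01"))  # savepoint volume lands on the HDD pool
```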
Additional hot spare drives are supported. They can be added without impacting
the overall performance of the solution.

Network
The standard x3950 X5 building blocks are used to build single node SAP HANA
database environments and scale-out, or clustered, environments.
In scale-out installations, the participating nodes are interconnected by 10 Gb
Ethernet in a redundant fashion. There are two redundant 10 Gb Ethernet
networks for the communication within the solution:
- A fully redundant 10 Gb Ethernet network for cluster-internal communication of the SAP HANA software
- A fully redundant 10 Gb Ethernet network for cluster-internal communication of GPFS, including replication
These networks are internal to the scale-out solution and have no connection to
the client network. IBM uses Emulex 10 Gigabit Ethernet adapters for both internal networks. Those adapters and the networking switches for these networks are part of the appliance and cannot be substituted with anything other than the validated switch models.
Uplinks to the client network are handled through either a 1 Gigabit or 10 Gigabit Ethernet connection, depending on the existing client infrastructure.


Figure 5-10 illustrates the networking architecture for a scale-out solution and
shows the SAP HANA scale-out solution that is connected to an SAP NetWeaver
Business Warehouse (SAP NetWeaver BW) system as an example. Port
numbers reference the physical ports in Figure 5-11 on page 97.

[Figure: an SAP NetWeaver BW system connects through client-network switches to the SAP HANA scale-out solution, in which nodes 1 through n are interconnected by two 10 GbE switches; the black network carries the remaining ports (a, b, e, f, g, h, i).]

Figure 5-10 Networking architecture of the x3950 X5 workload-optimized solution for SAP HANA

All network connections within the scale-out solution are fully redundant. Both
the internal GPFS network and the internal SAP HANA network are connected to
two 10 Gb Ethernet switches, which are interconnected for full redundancy. The
switch model that is used here is the IBM System Networking RackSwitch
G8264. It delivers exceptional performance, being both lossless and low latency.
With 1.2 Tbps throughput, the G8264 provides massive scalability and low
latency that is ideal for latency-sensitive applications, such as SAP HANA.
The scale-out solution for SAP HANA makes intensive use of the advanced
capabilities of this switch, such as virtual link aggregation groups (vLAGs). For
smaller scale-out deployments, the smaller IBM Systems Networking
RackSwitch G8124 can be used instead of the G8264. For details about the
switch options, see 5.4, IBM System Networking options on page 136.


To illustrate the network connectivity, Figure 5-11 shows the back of an x3950 X5
building block with the network interfaces available. The letters denoting the
interfaces correspond to the letters that are used in Figure 5-10 on page 96.

[Figure: rear view showing the GPFS and SAP HANA 10 GbE interfaces and the IMM port.]

Figure 5-11 The back of an x3950 X5 building block with the network interfaces available

Each eX5 building block comes with four 10 Gb Ethernet and six 1 Gb Ethernet network ports. The available 1 Gb Ethernet interfaces (a, b, e, f, g, h) on the system can be used to connect the systems to other networks or systems, for example, for client access, application management, systems management, and data management. The interface that is denoted with the letter i is used to connect the integrated management module (IMM) of the server to the management network.

Integrated virtualization
The VMware ESXi embedded hypervisor software is a virtualization platform that
allows multiple operating systems to run on a host system concurrently. Its
compact design allows it to be embedded in physical servers.


IBM offers versions of VMware vSphere Hypervisor (ESXi) that are customized
for select IBM hardware to give you online platform management, including
updating and configuring firmware, platform diagnostic tests, and enhanced
hardware alerts. All models support several USB keys as options, which are
listed in Table 5-1.
Table 5-1 VMware ESXi memory keys

Part number   Feature code   Description
41Y8298       A2G0           IBM Blank USB Memory Key for VMware ESXi Downloads
41Y8300       A2VC           IBM USB Memory Key for VMware ESXi 5.0
41Y8307       A383           IBM USB Memory Key for VMware ESXi 5.0 Update 1
41Y8311       A2R3           IBM USB Memory Key for VMware ESXi 5.1
41Y8385       A584           IBM USB Memory Key for VMware ESXi 5.5

The x3850 X5 and x3950 X5 have two internal USB connectors that are available
for the embedded hypervisor USB key. The location of these USB connectors is
shown in Figure 5-12.

[Figure: the internal USB sockets with the embedded hypervisor key installed.]

Figure 5-12 Location of internal USB ports for embedded hypervisor on the x3850 X5 and x3950 X5


For more information about the USB keys, and to download the IBM customized
version of VMware ESXi, go to the following website:
http://www.ibm.com/systems/x/os/vmware/esxi
In addition to the USB key, you also must license vSphere with VMware.
Depending on the physical server, you need different licenses. For more
information, see 6.4, SAP HANA on VMware vSphere on page 177. That
section also contains more information about virtualizing an SAP HANA
instance.

5.1.7 x3690 X5 Workload Optimized Solution for SAP HANA


All IBM System x3690 X5-based workload-optimized solutions for SAP HANA
follow one architectural approach, which ensures a seamless upgrade path for
clients when they must grow their installation.
Certain configuration options are defined by SAP, which includes the processor
being used, the minimum storage capacity that must be provided, and the
network connectivity of the system.
A standard x3690 X5 workload-optimized solution for SAP HANA has the
following minimum components:
- Two Intel Xeon E7-2870 processors
- Up to 256 GB of DDR3 main memory
- Ten local 200 GB SAS SSDs in eXFlash units; RAID arrays on those devices hold the operating system (SLES 11), SAP HANA savepoints, and SAP HANA logs

CPU and memory


SAP restricts the supported CPU models to the Xeon E7-x870 processors (with x being 2, 4, or 8 to denote the maximum number of supported processors in a single server). Because the x3690 X5 models can scale up to two sockets, IBM offers the Intel Xeon processor E7-2870 at 2.4 GHz. The 16 GB DIMM DDR3 ECC (4Rx4, 1.35 V) CL7 1066 MHz memory module is used.
For an overview of which DIMMs are used in which T-shirt size, see 6.1, IBM eX5 based environments on page 146, which details the different memory configurations. DIMM placement is crucial for best performance. Changing the module placement or memory configuration of a workload-optimized solution for SAP HANA is not supported.


Storage
All systems come with storage for both the data volume and the log volume. The
building blocks that are based on the x3690 X5 come with combined data and log
storage on an array of RAID-protected, hot-swap eXFlash SSD drives.
These flash technology storage devices are optimized for high IOPS
performance and low latency to provide the SAP HANA database with a log
storage that allows the highest possible performance. Because a transaction in
the SAP HANA database can return only after the corresponding log entry is
written to the log storage, high IOPS performance and low latency are key to
database performance.
Figure 5-13 shows the storage architecture of workload-optimized solution that is
based on the x3690 X5 model.

[Figure: a single x3690 X5 node running DB partition 1 of the SAP HANA DB (index server, statistic server, SAP HANA studio) on the shared GPFS file system, with the first replica of the combined data + log volume on SSD.]

Figure 5-13 Storage architecture of the x3690 X5 workload-optimized solution for SAP HANA

IBM uses General Parallel File System (GPFS) to provide a robust and
high-performance file system that allows you to grow an SAP HANA environment
without needing to reformat the storage arrays. GPFS also takes all available
storage arrays and combines them into one file system for use with SAP HANA.
Additional hot spare SSDs are supported. They can be added without impacting
the performance of the overall solution.


Network
The standard x3690 X5 building blocks are used to build single node SAP HANA
database environments and scale-out, or clustered, environments. In scale-out
installations, the participating nodes are interconnected by 10 Gb Ethernet in a
redundant fashion.
The networking architecture for scale-out SAP HANA environments that are
based on the x3690 X5 follows the same approach as the x3950 X5 solution.
There are two redundant 10 Gb Ethernet networks for the communication within
the solution:
- A fully redundant 10 Gb Ethernet network for cluster-internal communication of the SAP HANA software
- A fully redundant 10 Gb Ethernet network for cluster-internal communication of GPFS, including replication
These networks are internal to the scale-out solution and have no connection to
the client network. IBM uses Emulex 10 Gigabit Ethernet adapters for both
internal networks. Those adapters and the networking switches for these
networks are part of the appliance and cannot be substituted with anything other than the validated switch models.
Uplinks to the client network are handled through either a 1 Gigabit or 10 Gigabit
Ethernet connection, depending on the existing client infrastructure.

5.2 IBM X6 systems


In February 2014, IBM announced the sixth generation of the IBM Enterprise
X-Architecture servers. The IBM X6 rack family consists of the new flagship
servers of the IBM x86 server family:
- IBM System x3850 X6 (a 4U rack-optimized server scalable to four sockets)
- IBM System x3950 X6 (an 8U rack-optimized server scalable to eight sockets)


Figure 5-14 shows the x3850 X6.

Figure 5-14 x3850 X6

The x3950 X6 looks like two x3850 X6 servers with one placed on top of the other. However, unlike the eX5 servers, the x3950 X6 employs a single chassis with a single-midplane design, without any external connectors and cables.
Figure 5-15 shows the x3950 X6.

Figure 5-15 x3950 X6


The X6 systems offer a new bookshelf design concept that is based on a fixed
chassis that is mounted in a standard rack cabinet. There is no need to pull the
chassis in or out of the rack to access components because all components can
be accessed either from the front or from the rear like pulling books from a
bookshelf.
Figure 5-16 shows the x3850 X6 server with one of the four Compute Books
partially removed.

Figure 5-16 IBM x3850 X6 server with a Compute Book partially removed

The modular component that can be installed in a chassis is called a Book. There
are several types of books available:
Compute Books
A Compute Book contains one processor, 24 DIMM slots, and 2 hot-swap fan
modules. It is accessible from the front of the server.
The x3850 X6 supports up to four Compute Books. The x3950 X6 supports
up to eight Compute Books.


Storage Books
The Storage Book contains standard 2.5-inch drives or IBM eXFlash 1.8-inch
hot-swap SSD drives. It also provides front USB and video ports, and it has
two PCIe slots that are reserved for internal storage adapters. The Storage
Book is accessible from the front of the server.
The x3850 X6 has one Storage Book. The x3950 X6 has two Storage Books.
I/O Books
An I/O Book is a container that provides PCIe expansion capabilities. I/O Books are accessible from the rear of the server.
There are three types of I/O Books:
Primary I/O Book. This book provides core I/O connectivity, including the unique ML2 slot for an onboard network adapter, three standard PCIe 3.0 slots, the Integrated Management Module II, hot-swap fan modules, and USB, video, serial, and systems management ports.
Full-length I/O Book. This hot-swap Book provides three optional
full-length PCIe slots.
Half-length I/O Book. This hot-swap Book provides three optional
half-length PCIe slots.
The x3850 X6 has one Primary I/O Book and supports one or two of the full or
half-length I/O Books (one of each or two of either). The x3950 X6 has two
Primary I/O Books and supports up to four of the full or half-length I/O Books
(any combination).
You can find more information about X6 servers and the sixth generation of IBM
Enterprise X-Architecture technology in IBM X6 Servers: Technical Overview,
REDP-5059.
The next sections introduce the technology components that are used to build
the IBM X6-based workload-optimized solution for SAP HANA and explain the
architecture of x3850 X6 and x3950 X6-based solutions. They share a common
concept so that you can start with an x3850 X6-based installation and later on
upgrade to an x3950 X6 installation without leaving parts on the floor.

5.2.1 Intel Xeon processor E7 v2 family


The IBM X6 portfolio of servers uses CPUs from the Intel Xeon processor E7 v2
family to maximize performance. These processors are the latest in a long line of
high-performance processors.


The Intel Xeon processor E7 v2 family CPUs are the latest Intel scalable
processors and can be used to scale up to four processors in the x3850 X6 or up
to eight processors in the x3950 X6.
The current models of the X6 systems use processors from the Intel Xeon
processor E7 v2 product family. The Intel Xeon processors that are used in the
X6 systems are follow-ons to the Intel Xeon processor E7 product family. New
processors feature the new Intel microarchitecture (formerly codenamed Ivy Bridge-EX) and a new 22 nm manufacturing process that provides higher
core count, larger cache sizes, higher core frequencies, and higher memory
speeds. In addition, these new processors support more memory with up to
24 DIMMs per processor and faster low-latency I/O with integrated PCIe 3.0
controllers.
The Intel Xeon processor E7 v2 product family offers the following key features:
- Up to 15 cores and 30 threads (using the Hyper-Threading feature) per processor
- Up to 37.5 MB of L3 cache
- Up to 3.4 GHz core frequencies
- Up to 8 GTps bandwidth of QPI links
- Integrated memory controller with four SMI2 channels that support up to 24 DDR3 DIMMs
- Up to 1600 MHz DDR3 memory speeds and up to 2667 MHz SMI link speeds
- Integrated PCIe 3.0 controller with 32 lanes per processor
- Intel Virtualization Technology (VT-x and VT-d)
- Intel Turbo Boost Technology 2.0
- Intel Advanced Vector Extensions (AVX)
- Intel AES-NI instructions for accelerating encryption
- Advanced QPI and memory reliability, availability, and serviceability (RAS) features:
  - Machine Check Architecture recovery (non-execution and execution paths)
  - Enhanced Machine Check Architecture Gen1
  - Machine Check Architecture I/O
- Security technologies: OS Guard, Secure Key, and Intel TXT


The Intel Xeon E7 v2 processors have many new features that improve
performance of SAP HANA workloads. A white paper called Infuse your
business with real-time, data-driven intelligence shows the benefits of the new
generation Xeon processors for SAP HANA. You can find this white paper at the
Intel website:
http://www.intel.com/content/www/us/en/big-data/big-data-xeon-e7-sap-ha
na-real-time-business-platform-brief.html

Instruction set extension


With the release of the latest Xeon processor E7 v2 family, Intel added Advanced
Vector Extensions (AVX) to the CPU instruction set. AVX adds several new instructions and increases the register size to 256 bits, up from 128 bits.
When used in single instruction, multiple data (SIMD) algorithms, AVX allows for
a much higher throughput on each single CPU core because twice the number of
data values can now be processed per single clock cycle. SIMD processing is
perfectly suited to speeding up SAP HANA because typical data warehouse algorithms, such as aggregation or scans of column-stored tables, are inherently parallelizable.
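The width argument can be made concrete: one SIMD instruction processes as many packed values as fit into a register. The calculation below simply divides register width by value width; actual speedups depend on the algorithm and memory bandwidth.

```python
def values_per_instruction(register_bits, value_bits):
    """How many packed values one SIMD instruction can process."""
    return register_bits // value_bits

# For example, 32-bit values, as in a scan over dictionary-encoded columns:
print(values_per_instruction(128, 32))  # pre-AVX 128-bit register: 4 values
print(values_per_instruction(256, 32))  # AVX 256-bit register: 8 values
```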

Intel Advanced Encryption Standard - New Instructions


Advanced Encryption Standard (AES) is an encryption standard that is widely
used to protect network traffic and sensitive data. Advanced Encryption Standard
- New Instructions (AES-NI), available with the E7 processors, implements
certain complex and performance-intensive steps of the AES algorithm by using
processor hardware. AES-NI can accelerate the performance and improve the
security of an implementation of AES over an implementation that is performed
by software.
For more information about Intel AES-NI, go to the following website:
http://software.intel.com/en-us/articles/intel-advanced-encryption-stan
dard-instructions-aes-ni

Intel Data Direct I/O: PCI Express 3.0


On the Xeon processor E7 v2 family processors, Intel integrates the I/O
subsystem into the chip to allow for lower latency and faster data transfers
(compared to dedicated I/O hubs in the previous processor generation).
Intel also adds support for the latest PCI Express 3.0 standard that almost
doubles the theoretical maximum bandwidth while keeping compatibility with
previous generations of the PCIe protocol. PCIe 1.x and 2.x cards now properly
work in PCIe 3.0-capable slots.


PCIe 3.0 uses the 128b/130b encoding scheme, which is more efficient than the
8b/10b encoding that is used in the PCIe 2.0 protocol. This approach reduces the encoding overhead to less than 2%, compared with the 20% of PCIe 2.0, and allows almost double the bandwidth at the 8 GTps speed.
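These overhead figures can be verified with a short calculation, using the per-lane raw rates of 5 GTps for PCIe 2.0 and 8 GTps for PCIe 3.0:

```python
def lane_bandwidth_mbps(gtps, payload_bits, coded_bits):
    """Usable per-lane bandwidth in MBps after encoding overhead."""
    return gtps * 1000 * payload_bits / coded_bits / 8

pcie2 = lane_bandwidth_mbps(5, 8, 10)     # 8b/10b: 20% encoding overhead
pcie3 = lane_bandwidth_mbps(8, 128, 130)  # 128b/130b: ~1.5% overhead

print(round(pcie2))  # 500 MBps per lane
print(round(pcie3))  # ~985 MBps per lane: almost double
```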
Up to 32 PCIe 3.0 lanes are available per processor. These 32 lanes can be split
into any combination of x4, x8, and x16.
For more information about Data Direct I/O, go to the following website:
http://www.intel.com/content/www/us/en/io/direct-data-i-o.html

QuickPath Interconnect
The Intel Xeon E7 processors that are implemented in IBM X6 servers include
two integrated memory controllers in each processor. Processor-to-processor
communication is carried over shared-clock or coherent quick path interconnect
(QPI) links. Each processor has three QPI links to connect to other processors.
Figure 5-17 shows the QPI configurations. On the left side is how the four
sockets of the x3850 X6 are connected together. On the right side is how all eight
sockets of the x3950 X6 are connected together.

[Figure: QPI topologies of the four sockets of the x3850 X6 (left) and the eight sockets of the x3950 X6 (right).]

Figure 5-17 QPI links between processors

Each processor has some amount of memory, which is connected directly to the
processor. To access memory that is connected to another processor, a processor must use a QPI link through that other processor. This design creates a
non-uniform memory access (NUMA) system. Similarly, I/O can be local to a
processor or remote through another processor.
For QPI usage, Intel modified the MESI cache coherence protocol to include a
forwarding state. Therefore, when a processor asks to copy a shared cache line,
only one other processor responds.


For more information about QPI, see the following website:


http://www.intel.com/technology/quickpath

Intel RunSure technology


In addition to the RAS features that exist in the Xeon processor E7 family, Intel adds new features to further improve the RAS of the E7 v2 family. They are grouped into CPU-related, memory-related, and I/O-related features:
Cyclic redundancy checking (CRC) on the QPI links
The data on the QPI link is checked for errors.
QPI packet retry
If a data packet on the QPI link has errors or cannot be read, the receiving
processor can request that the sending processor try resending the packet.
QPI clock failover
If there is a clock failure on a coherent QPI link, the processor on the other
end of the link can become the clock. This action is not required on the QPI
links from processors to I/O hubs, as these links are asynchronous.
QPI self-healing
If there are persistent errors that are detected on a QPI link, the link width can
be reduced dynamically to allow the system to run in a degraded mode until a
repair can be performed.
The QPI link can reduce its width to a half width or a quarter width, and slow
down its speed.
Scalable memory interconnect (SMI) packet retry
If a memory packet has errors or cannot be read, the processor can request
that the packet be resent from the memory buffer.

5.2.2 Memory
The x3850 X6 and x3950 X6 support DDR3 memory with ECC protection. The
x3850 X6 supports up to 96 DIMMs when all processors are installed (24 DIMMs
per processor), and the x3950 X6 supports up to 192 DIMMs. The processor, the
memory buffers, and the corresponding memory DIMM slots are on the Compute
Book.
Each processor has two integrated memory controllers, and each memory
controller has two Scalable Memory Interconnect generation 2 (SMI2) links that
are connected to two scalable memory buffers. Each memory buffer has two
DDR3 channels, and each channel supports three DIMMs, for a total of 24
DIMMs per processor.


Figure 5-18 shows the overall architecture.

Figure 5-18 Intel Xeon processor E7 v2 memory architecture

The SMI2 links run at a frequency of up to 1333 MHz and support two transfers
per clock cycle, leading to an effective transfer rate of 2666 megatransfers per
second (MTps). With a data width of 64 bits, this results in a channel bandwidth
of 21.3 GBps per SMI2 link.
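The DIMM counts and bandwidth figures quoted above can be sanity-checked with simple arithmetic; the following sketch only restates the numbers already given in this section:

```python
# Per-processor DIMM fan-out, as described above
memory_controllers = 2      # integrated memory controllers per processor
smi2_links_per_mc = 2       # SMI2 links per memory controller
channels_per_buffer = 2     # DDR3 channels per memory buffer (one buffer per link)
dimms_per_channel = 3

dimms_per_processor = (memory_controllers * smi2_links_per_mc
                       * channels_per_buffer * dimms_per_channel)
print(dimms_per_processor)       # 24 -> 96 DIMMs (x3850 X6), 192 (x3950 X6)

# SMI2 channel bandwidth
clock_mhz = 1333                 # SMI2 link clock
transfers_per_clock = 2          # two transfers per clock cycle
data_width_bytes = 8             # 64-bit data width
mtps = clock_mhz * transfers_per_clock
gbps = mtps * data_width_bytes / 1000
print(mtps, round(gbps, 1))      # 2666 MTps, 21.3 GBps per SMI2 link
```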
Note: The Intel Xeon processor E7 v2 family allows for an alternative
operational mode that is called RAS mode or Lockstep mode. Although RAS
mode is supported on the IBM X6 architecture, you cannot enable it for any of
the workload-optimized solutions for SAP HANA.

Chipkill
Chipkill memory technology, an advanced form of ECC from IBM, is available for
the X6 servers. Chipkill (also known as Single Device Data Correction (SDDC))
protects the memory in the system from any single memory chip failure. It also
protects against multi-bit errors from any portion of a single memory chip.
Chipkill on its own can provide 99.94% memory availability to the applications
without sacrificing performance and with standard ECC DIMMs.


IBM Advanced Page Retire


Advanced Page Retire is an IBM unique algorithm for handling memory errors. It
is sophisticated error-handling firmware built into the system that uses and
coordinates memory recovery features, balancing the goals of maximum uptime
and minimum repair actions.
The algorithm applies short-term and long-term leaky-bucket thresholds per
memory rank and automatically sorts memory pages by their correctable error
counts. First, it uses hardware recovery features, followed by software recovery
features, to optimize recovery results for both newer and older operating systems
and hypervisors.
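A minimal sketch of the leaky-bucket idea follows. The threshold and drain-rate values are invented for illustration; IBM's actual per-rank thresholds are not published here:

```python
class LeakyBucket:
    """Per-rank correctable-error counter that drains over time, so a
    transient burst is distinguished from a genuinely degrading DIMM."""

    def __init__(self, threshold, leak_per_second):
        self.level = 0.0
        self.threshold = threshold
        self.leak = leak_per_second
        self.last = 0.0

    def record_error(self, now):
        """Count one correctable error; return True if the threshold trips."""
        # First drain the bucket for the time elapsed since the last error.
        self.level = max(0.0, self.level - (now - self.last) * self.leak)
        self.last = now
        self.level += 1.0
        return self.level > self.threshold

bucket = LeakyBucket(threshold=5, leak_per_second=0.1)

# Rare, scattered errors drain away and never trip the threshold ...
print(any(bucket.record_error(t) for t in (0, 100, 200, 300)))    # False

# ... but a dense burst on the same rank exceeds it, at which point the
# firmware would retire the page with the highest error count.
print(any(bucket.record_error(300 + i) for i in range(1, 8)))     # True
```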
When recovery features are exhausted, the firmware issues a Predictive Failure
Alert. Memory that fails is held offline during reboots until it is repaired. Failed
DIMMs are indicated by light path diagnostics LEDs physically at the socket
location.
IBM performs thorough testing to verify these features and the coordination
between the firmware and the operating system or hypervisor.

5.2.3 Flash technology storage


The IBM X6 systems support different types of flash memory. This section
introduces you to the flash technology that is used by the X6 workload-optimized
solutions for SAP HANA. You can read about the other technologies in IBM
System x3850 X6, TIPS1084 and IBM System x3950 X6, TIPS1132.
All X6 workload-optimized solutions for SAP HANA include storage controllers to
provide certain features like RAID support. There is at least one storage
controller for the internal disk drives. Depending on the memory size of the
workload-optimized solution, there also are one or more controllers for external
storage expansion enclosures (for more information, see Chapter 6, SAP HANA
IT landscapes with IBM System x solutions on page 145 and Chapter 7,
Business continuity and resiliency for SAP HANA on page 187).
Both the internal and the external storage controllers support SSD caching for
traditional HDDs. You can use this feature to accelerate the performance of HDD
arrays with only an incremental investment in SSD technology. SSDs are
configured as a dedicated pool of controller cache, and the controller firmware
automatically places the most frequently accessed (hot) data on these SSDs.
(The SSD caching feature should not be confused with IBM FlashCache Storage
Accelerator.)

SSD caching is transparent to the operating system; the SSDs are not visible as
block devices in Linux and can be seen only with the controller configuration
utility.
SSD caching of the local HDD arrays means that you can accelerate all types of
I/O operations that are sent to the local RAID devices. Because all storage
controllers in the X6 workload-optimized solutions for SAP HANA are configured
with SSD caching, all data that SAP HANA writes to disk is accelerated.
SSD caching is the most flexible way to speed up any kind of disk-based
operation in an SAP HANA solution.

5.2.4 x3850 X6 Workload Optimized Solution for SAP HANA


The workload-optimized solutions that are based on the x3850 X6 server cover
two-socket and four-socket configurations. Customers can start with a
two-socket machine and scale up over time as the database size grows by
adding only new components; there are no leftover parts.
A standard IBM x3850 X6 workload-optimized solution has the following
minimum components:
Two or four CPU Books with Intel Xeon E7-8880 v2 processors
Up to 2 TB of DDR3 main memory
Two local 1.2 TB SAS drives in a RAID 1 for the operating system (SLES 11
for SAP Applications SP3 or RHEL 6.5 for SAP HANA) and local
housekeeping
Four local 1.2 TB SAS drives in a RAID 5 to store SAP HANA data and logs
One internal storage controller, plus locally attached 400 GB SSDs for the
SSD caching feature
Two dual-port Mellanox ConnectX-3 VPI adapters running in 10 Gigabit mode
for SAP HANA internal communication
One Intel I350 quad-port 1 Gigabit network adapter for uplink into a customer
network (different NIC vendor and speed supported)
When your database size grows and you must scale up the x3850 X6 solution,
you add one or more of the following components:
Additional CPU Books with Intel Xeon E7-8880 v2 processors
Additional DDR3 memory modules


Additional 1 Gbps or 10 Gbps network adapters


A storage expansion enclosure to increase local storage capacity with
additional 1.2 TB SAS drives, plus one external storage controller, plus locally
attached 400 GB SSDs for the SSD caching feature
Note: When scaling up from an x3850 X6 environment to an x3950 X6
installation, you can reuse all your parts. Only the mechanical enclosure must
be changed from a 4U to an 8U frame. All active components (that is, all CPU
Books, Storage Books, I/O Books, power supplies, and fans) can be reused in
the new system.
However, as your system environment changes, you might be forced to
re-create the RAID 1 array for the operating system. To avoid losing the
operating system installation, you can use UEFI and import the RAID array
configuration (that is stored on the HDDs) to the RAID controller. This procedure
has worked in the lab, but make a backup of the installation first.
If you would otherwise start with a four-socket x3850 X6 but expect to grow to an
x3950 X6, buy a four-socket x3950 X6 solution instead, which allows for
seamless growth to eight sockets.
The following sections explain the components in more detail.

CPU and memory


SAP restricts supported CPU models to a certain subset of the Xeon E7 v2
family, which ensures that application tuning is most effective and delivers the
highest performance of the database application. The following CPU models are
supported by SAP and IBM on the x3850 X6:

Intel Xeon Processor E7-8880 v2: 2.5 GHz, for up to eight sockets (default)
Intel Xeon Processor E7-8890 v2: 2.8 GHz, for up to eight sockets3
Intel Xeon Processor E7-4880 v2: 2.5 GHz, for up to four sockets4
Intel Xeon Processor E7-4890 v2: 2.8 GHz, for up to four sockets4

To allow memory to scale from 128 GB up to 2 TB, memory modules with
different capacities are used. Here is a list of memory modules that are used with
the x3850 X6:
8 GB DIMM DDR3 ECC RDIMM (1Rx4, 1.35 V) CL11 1600 MHz
16 GB DIMM DDR3 ECC RDIMM (2Rx4, 1.35 V) CL11 1600 MHz
32 GB DIMM DDR3 ECC LRDIMM (4Rx4, 1.35 V) CL11 1600 MHz
3 Available upon request for compute-bound SAP HANA installations (such as SAP Business Suite
powered by SAP HANA).
4 Available upon special request only because CPU Books with E7-48x0 v2 processors cannot be
reused when scaling up to an x3950 X6 installation.

For an overview of which DIMMs are used with which memory configuration, see
6.2, IBM X6 based environments on page 157. DIMM placement is crucial for
best performance. It is not supported to change module placement or memory
configuration without consulting IBM first.

Storage
SAP requires a certain amount of persistent storage capacity in each SAP HANA
server to save main memory data onto a non-volatile storage media regularly.
This storage space is divided into a log volume and a data volume. For binary
and configuration files, SAP requires an additional volume. The exact storage
requirements depend on the node's memory size. Bigger nodes must provide
more storage capacity.
IBM has the following storage devices in the x3850 X6 solution for SAP HANA:
3.6 TB (four 1.2 TB 2.5-inch SAS drives in a RAID 5) using four internal
storage bays in the Storage Book.
Two additional 400 GB 2.5-inch SAS SSDs are used for the SSD caching
feature of the ServeRAID adapter. They occupy two bays in the Storage Book.
9.6 TB (nine 1.2 TB 2.5-inch SAS drives in a RAID 5 or ten 1.2 TB 2.5-inch
SAS drives in a RAID 6) using storage expansion enclosure EXP2524,
connected to an IBM ServeRAID M5120 adapter.
Two additional 400 GB 2.5-inch SAS SSDs are used for the SSD caching
feature of the ServeRAID adapter. They occupy two bays in the EXP2524.
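The usable capacities quoted above follow from standard RAID arithmetic on the 1.2 TB drives: RAID 5 sacrifices one drive's worth of capacity for parity and RAID 6 two. A quick sketch:

```python
def usable_tb(drives, raid, drive_tb=1.2):
    """Usable capacity of a RAID array built from equal-size drives.

    RAID 5 loses one drive's worth of capacity to parity, RAID 6 two,
    and RAID 1 mirrors half the drives.
    """
    lost = {1: drives / 2, 5: 1, 6: 2}[raid]
    return round((drives - lost) * drive_tb, 1)

print(usable_tb(4, raid=5))    # 3.6  (four internal drives, Storage Book)
print(usable_tb(9, raid=5))    # 9.6  (EXP2524, RAID 5)
print(usable_tb(10, raid=6))   # 9.6  (EXP2524, RAID 6)
```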
IBM uses General Parallel File System (GPFS) to provide a robust and
high-performance file system that you can use to grow your environment
non-disruptively as your needs increase. If you add drives to your node, GPFS
transparently includes this additional storage and starts using it. No data is lost
during this upgrade.


Figure 5-19 shows the storage architecture of an x3850 X6 solution. The optional
second RAID array is shown in dashed lines. GPFS takes care of the different
sizes of the block devices. It balances I/O operations to maximize the usage of
both devices.

Figure 5-19 Storage architecture of the x3850 X6 workload-optimized solution

Additional hot spare drives are supported. They can be added without impacting
the overall performance of the solution.

Network
IBM solutions for SAP HANA have several different network interfaces that can
be grouped into the following modes:
Internal communication (HANA communication and GPFS communication).
Redundancy is required.
External communication (SAP data management, SAP client access, data
replication, appliance management, and others, depending on the customer
landscape). Redundancy is optional.
Internal communication remains internal within scale-out SAP HANA solutions
and has no connection to the customer network. The networking switches for
these networks are part of the appliance and cannot be substituted with models
other than the validated switches.


IBM uses two dual-port Mellanox ConnectX-3 VPI adapters running in 10 Gbit
Ethernet mode for both internal communication networks. Each of the two
networks requires its own connection on two physical interfaces to allow for
redundancy in case of a hardware failure in the network. The IBM System
Networking RackSwitch G8264 is used as the switch for internal communication.
One switch handles both types of traffic: HANA and GPFS communication. To
allow for a switch failure, a second G8264 is required in scale-out solutions. Smaller
scale-out installations can also use the G8124 switch. For more information
about the switches, see 5.4.1, IBM System Networking RackSwitch G8264 on
page 137 and 5.4.2, IBM System Networking RackSwitch G8124 on page 140.
For the uplink into the client network, an Intel quad-port 1 Gigabit Ethernet
adapter is included as the default adapter. A second adapter can be added if
more ports are required (either for access to more networks or for redundancy
reasons) or a different vendor can be chosen if, for example, 10 Gigabit is also
required towards the customer networks.
Figure 5-20 shows the back side of an x3850 X6 workload-optimized solution
with one quad-port 1 GbE card installed (the right-most PCIe slot). The network
names are examples only.

Figure 5-20 Networking interfaces of the x3850 X6 solution


Figure 5-21 shows how the different network interfaces are connected.


Figure 5-21 Network architecture for the x3850 X6 scale-out solution

Integrated virtualization
The VMware ESXi embedded hypervisor software is a virtualization platform that
allows multiple operating systems to run on a host system at the same time. Its
compact design allows it to be embedded in physical servers.
IBM offers versions of VMware vSphere Hypervisor (ESXi) customized for select
IBM hardware to give you online platform management, including updating and
configuring firmware, platform diagnostic tests, and enhanced hardware alerts.
All models support the USB keys as options, which are listed in Table 5-2.
Table 5-2 VMware ESXi memory keys for the x3850 X6
Part number   Feature code   Description
41Y8298       A2G0           IBM Blank USB Memory Key for VMware ESXi Downloads
41Y8382       A4WZ           IBM USB Memory Key for VMware ESXi 5.1 U1
41Y8385       A584           IBM USB Memory Key for VMware ESXi 5.5


The x3850 X6 has one internal USB connector on the primary I/O book for the
embedded hypervisor USB key. The location of this USB connector is shown in
Figure 5-22.


Figure 5-22 Location of internal USB port for embedded hypervisor on the x3850 X6

For more information about the USB keys, and to download the IBM customized
version of VMware ESXi, go to the following website:
http://www.ibm.com/systems/x/os/vmware/esxi
In addition to the USB key, you also need to license vSphere with VMware.
Depending on the configuration of the physical server, you must buy different
licenses. For more information, see 6.4, SAP HANA on VMware vSphere on
page 177. That section also contains more information about virtualizing an SAP
HANA instance on x3850 X6 servers.

5.2.5 x3950 X6 Workload Optimized Solution for SAP HANA


The workload-optimized solutions that are based on the x3950 X6 server cover
four-socket and eight-socket configurations. Customers can start with a
four-socket configuration and scale up over time as the database size grows by
adding only new components; there are no leftover parts.


A standard x3950 X6 workload-optimized solution has the following minimum
components:
Four or eight CPU Books with Intel Xeon E7-8880 v2 processors
Up to 6 TB of DDR3 main memory
Two local 1.2 TB SAS drives in a RAID 1 for the operating system (SLES 11
for SAP Applications SP3 or RHEL 6.5 for SAP HANA) and local
housekeeping
Four local 1.2 TB SAS drives in a RAID 5 to store SAP HANA data and logs
One internal storage controller, plus locally attached 400 GB SSDs for the
SSD caching feature
Two Half-length I/O Books
Two dual-port Mellanox ConnectX-3 VPI adapters running in 10 Gigabit mode
for SAP HANA internal communication
Two Intel I350 quad-port 1 Gigabit network adapters for uplink into customer
network (different NIC vendor and speed are supported)
When your database size grows and you must scale up the x3950 X6 solution,
you add one or more of the following components:
Additional CPU Books with Intel Xeon E7-8880 v2 processors
Additional DDR3 memory modules
Additional 1 Gbps or 10 Gbps network adapters
An additional six local 1.2 TB SAS drives, an internal storage controller, plus
locally attached 400 GB SSDs for the SSD caching feature
A storage expansion enclosure that increases local storage capacity with
additional 1.2 TB SAS drives, plus one external storage controller and locally
attached 400 GB SSDs for the SSD caching feature
Note: x3950 X6 servers are supported with only four CPU Books installed. In
this configuration, two CPU Books must be installed in the lower half of the
chassis and the other two CPU Books must be installed in the upper half of the
chassis. Different configurations lead to situations where not all PCIe adapters
are usable.
The following sections explain the components in more detail.


CPU and memory


SAP restricts supported CPU models to a certain subset of the Xeon E7 v2
family, which ensures that application tuning is most effective and delivers the
highest performance of the database application. The following CPU models are
supported by SAP and IBM on the x3950 X6:
Intel Xeon Processor E7-8880 v2: 2.5 GHz, for up to eight sockets (default)
Intel Xeon Processor E7-8890 v2: 2.8 GHz, for up to eight sockets5
To allow memory to scale from 256 GB up to 6 TB, memory modules with
different capacities are used. Here is a list of memory modules that are used with
the x3950 X6:

8 GB DIMM DDR3 ECC RDIMM (1Rx4, 1.35 V) CL11 1600 MHz
16 GB DIMM DDR3 ECC RDIMM (2Rx4, 1.35 V) CL11 1600 MHz
32 GB DIMM DDR3 ECC LRDIMM (4Rx4, 1.35 V) CL11 1600 MHz
64 GB DIMM DDR3 ECC LRDIMM (8Rx4, 1.35 V) 1333 MHz

For an overview of which DIMMs are used with which memory configuration, see
6.2, IBM X6 based environments on page 157. DIMM placement is crucial for
best performance. Changing the module placement or memory configuration
without consulting IBM first is not supported.

Storage
SAP requires a certain amount of persistent storage capacity in each SAP HANA
server to save main memory data onto a non-volatile storage media regularly.
This storage space is divided into a log volume and a data volume. For binary
and configuration files, SAP requires an additional volume. The exact storage
requirements depend on the node's memory size. Bigger nodes must provide
more storage capacity.
IBM has the following storage devices in the x3950 X6 solution for SAP HANA:
3.6 TB (four 1.2 TB 2.5-inch SAS drives in a RAID 5) using four internal
storage bays in the lower Storage Book.
Two additional 400 GB 2.5-inch SAS SSDs are used for the SSD caching
feature of the ServeRAID adapter. They occupy two bays in the lower Storage
Book.
6 TB (six 1.2 TB 2.5-inch SAS drives in a RAID 56) using six internal storage
bays in the upper Storage Book. This requires an additional IBM ServeRAID
M5210 storage controller to be installed in the upper Storage Book.

5 Available upon request for compute-bound SAP HANA installations (like SAP Business Suite
powered by SAP HANA).
6 Installations requiring RAID 6 have only 4.8 TB usable storage space instead of 6 TB.


Two 400 GB 2.5-inch SAS SSDs are used for the SSD caching feature of the
upper ServeRAID adapter. They occupy two bays in the upper Storage Book.
9.6 TB (nine 1.2 TB 2.5-inch SAS drives in a RAID 5 or ten 1.2 TB 2.5-inch
SAS drives in a RAID 6) using storage expansion enclosure EXP2524,
connected to an IBM ServeRAID M5120 storage controller.
Two additional 400 GB 2.5-inch SAS SSDs are used for the SSD caching
feature of the ServeRAID adapter. They occupy two bays in the EXP2524.
9.6 TB (nine 1.2 TB 2.5-inch SAS drives in a RAID 5 or ten 1.2 TB 2.5-inch
SAS drives in a RAID 6) using empty bays in the EXP2524.
Two additional 400 GB 2.5-inch SAS SSDs are used for the SSD caching
feature of the ServeRAID adapter. They occupy two bays in the EXP2524.
IBM uses General Parallel File System (GPFS) to provide a robust and
high-performance file system that allows you to grow your environment
non-disruptively as your needs increase. If you add drives to your node, GPFS
transparently includes this additional storage and starts using it. No data is lost
during this upgrade.
Figure 5-23 on page 121 shows the storage architecture on an x3950 X6
solution. The optional second, third, and fourth RAID arrays are shown in dashed
lines. GPFS takes care of the different sizes of the block devices. It balances I/O
operations to maximize the usage of all available devices.


Figure 5-23 Storage architecture of the x3950 X6 workload-optimized solution

Additional hot spare drives are supported. They can be added without impacting
the overall performance of the solution.

Network
IBM solutions for SAP HANA have several different network interfaces that can
be grouped into the following modes:
Internal communication (HANA communication and GPFS communication).
Redundancy is required.
External communication (SAP data management, SAP client access, data
replication, appliance management, and others, depending on the customer
landscape). Redundancy is optional.
Internal communication remains internal within scale-out SAP HANA solutions.
They have no connection to the customer network. The networking switches for
these networks are part of the appliance and cannot be substituted with other
than the validated switch models.


IBM uses two dual-port Mellanox ConnectX-3 VPI adapters running in 10 Gbit
Ethernet mode for both internal communication networks. Each of the two
networks requires its own connection on two physical interfaces to allow for
redundancy in case of a hardware failure in the network. The IBM System
Networking RackSwitch G8264 is used as the switch for internal communication.
One switch handles both types of traffic: HANA and GPFS communication. To
allow for a switch failure, a second G8264 is required in scale-out solutions. Smaller
scale-out installations can also use the G8124 switch. For more information
about the switches, see 5.4.1, IBM System Networking RackSwitch G8264 on
page 137 and 5.4.2, IBM System Networking RackSwitch G8124 on page 140.
For the uplink into the customer network, two Intel quad-port 1 Gigabit Ethernet
adapters are included as the default adapters. Third and fourth adapters can be
added if more ports are required (to access more networks) or a different vendor
can be chosen if, for example, 10 Gigabit is required also towards the customer
networks.
Figure 5-24 shows the back side of an x3950 X6 workload-optimized solution
with two quad-port 1G cards installed (the right-most PCIe slots). The network
names are examples only.

Figure 5-24 Networking interfaces of the x3950 X6 solution


Figure 5-25 shows how the different network interfaces are connected.


Figure 5-25 Network architecture for the x3950 X6 scale-out solution

Integrated virtualization
The VMware ESXi embedded hypervisor software is a virtualization platform that
allows multiple operating systems to run on a host system at the same time. Its
compact design allows it to be embedded in physical servers.
IBM offers versions of VMware vSphere Hypervisor (ESXi) that are customized
for selected IBM hardware to give you online platform management, including
updating and configuring firmware, platform diagnostic tests, and enhanced
hardware alerts. All models support the USB keys options that are listed in
Table 5-3.
Table 5-3 VMware ESXi memory keys for the x3950 X6
Part number   Feature code   Description
41Y8298       A2G0           IBM Blank USB Memory Key for VMware ESXi Downloads
41Y8382       A4WZ           IBM USB Memory Key for VMware ESXi 5.1 U1
41Y8385       A584           IBM USB Memory Key for VMware ESXi 5.5


The x3950 X6 has two internal USB connectors, one on each of the primary I/O
Books, for the embedded hypervisor USB key. The location of these USB
connectors is shown in Figure 5-26.


Figure 5-26 Location of the internal USB port for the embedded hypervisor on the x3950
X6

Although the x3950 X6 has two primary I/O books, you need to equip only one of
them with the embedded hypervisor. Installing two hypervisors is supported only
when the x3950 X6 is configured to be partitioned, where the two halves of the
server operate as two independent four-socket servers.
For more information about the USB keys, and to download the IBM customized
version of VMware ESXi, see the following website:
http://www.ibm.com/systems/x/os/vmware/esxi
In addition to the USB key, you also must license vSphere with VMware.
Depending on the configuration of the physical server, you must buy different
licenses. For more information, see 6.4, SAP HANA on VMware vSphere on
page 177. This section also contains more information about virtualizing an SAP
HANA instance on x3950 X6 servers.


5.3 IBM General Parallel File System


The IBM General Parallel File System (GPFS) is a key component of the IBM
Systems Solution for SAP HANA. It is a high-performance, shared-disk file
management solution that can provide faster, more reliable access to a common
set of file data. It enables a view of distributed data with a single global
namespace.

5.3.1 Common GPFS features


GPFS uses its cluster architecture to provide quicker access to your file data.
File data is automatically spread across multiple storage devices, providing
optimal usage of your available storage to deliver high performance.
GPFS is designed for high-performance parallel workloads. Data and metadata
flow from all the nodes to all the disks in parallel under the control of a distributed
lock manager. It has a flexible cluster architecture that enables the design of a
data storage solution that meets current needs and can quickly be adapted to
new requirements or technologies. GPFS configurations include direct-attached
storage, network block input and output (I/O), or a combination of the two, and
multi-site operations with synchronous data mirroring.
GPFS can intelligently prefetch data into its buffer pool, issuing I/O requests in
parallel to as many disks as necessary to achieve the peak bandwidth of the
underlying storage-hardware infrastructure. GPFS recognizes multiple I/O
patterns, including sequential, reverse sequential, and various forms of striped
access patterns. In addition, for high-bandwidth environments, GPFS can read or
write large blocks of data in a single operation, minimizing the impact of I/O
operations.
Expanding beyond a storage area network (SAN) or locally attached storage, a
single GPFS file system can be accessed by nodes using a TCP/IP or InfiniBand
connection. Using this block-based network data access, GPFS can outperform
network-based sharing technologies, such as Network File System (NFS) and
even local file systems such as the EXT3 journaling file system for Linux or
Journaled File System. Network block I/O (also called network shared disk
(NSD)) is a software layer that transparently forwards block I/O requests from a
GPFS client application node to an NSD server node to perform the disk I/O
operation and then passes the data back to the client. Using a network block I/O
configuration can be more cost-effective than a full-access SAN.


Storage pools enable you to transparently manage multiple tiers of storage
based on performance or reliability. You can use storage pools to transparently
provide the appropriate type of storage to multiple applications or different
portions of a single application within the same directory. For example, GPFS
can be configured to use low-latency disks for index operations and high-capacity
disks for data operations of a relational database. You can make these
configurations even if all database files are created in the same directory.
For optimal reliability, GPFS can be configured to help eliminate single points of
failure. The file system can be configured to remain available automatically in the
event of a disk or server failure. A GPFS file system is designed to transparently
fail over token (lock) operations and other GPFS cluster services, which can be
distributed throughout the entire cluster to eliminate the need for dedicated
metadata servers. GPFS can be configured to recover automatically from node,
storage, and other infrastructure failures.
GPFS provides this function by supporting these functions:
Data replication to increase availability in the event of a storage media failure
Multiple paths to the data in the event of a communications or server failure
File system activity logging, enabling consistent fast recovery after system
failures
In addition, GPFS supports snapshots to provide a space-efficient image of a file
system at a specified time, which allows online backup and can help protect
against user error.

5.3.2 GPFS extensions for shared-nothing architectures


IBM added several features to GPFS that support the design of shared-nothing
architectures. This need is driven by today's trend toward scale-out applications
processing big data.
A single shared storage is not necessarily the best approach when dozens,
hundreds, or even thousands of servers must access the same set of data.
Shared storage can impose a single point of failure (unless designed in a fully
redundant way using storage mirroring). It can limit the peak bandwidth for the
cluster file system and is expensive to provide storage access to hundreds or
thousands of nodes.


GPFS File Placement Optimizer (GPFS FPO) is the name for a set of features to
support big data applications on shared-nothing architectures. In such
scenarios, hundreds or even thousands of commodity servers compute certain
problems. They do not have shared storage to hold the data. The internal disks of
the nodes are used to store all data, which requires a new way of thinking to run
a cluster file system on top of a shared-nothing architecture.
Here are some of the features that are introduced with GPFS-FPO:
Write affinity: Provides control over the placement of new data. It can either
be written to the local node or wide striped across multiple nodes.
Locality awareness: The ability to determine which node holds certain data
chunks. This allows the scheduling of jobs on the node holding the data, thus
avoiding costly transfers of data across the network.
Metablocks: Enable two block sizes within the same file system. MapReduce
workloads tend to have small files (below 1 MB, for example, for index files)
and large files (such as 128 MB, holding the actual data) in the same file
system. The concept of metablocks allows for an optimal usage of the
available physical blocks.
Pipelined replication: Makes the most effective use of the node interconnect
bandwidth. Data that is written on node A sends data to node B, which in turn
sends data to node C. In contrast to pipelined replication, the other replication
schema is star replication, where node A sends data to both node B and node
C. For bandwidth-intense operations or for servers with limited network
bandwidth, the outgoing link of node A can limit replication performance in
such a scenario. Choosing the correct replication schema is important when
running in a shared-nothing architecture because this almost always involves
replicating data over the network.
Fast recovery: An intelligent way to minimize recovery efforts after the cluster
is healthy again. After an error, GPFS tracks what updates are missing
through the failed disks. In addition, the load to recover the data is distributed
across multiple nodes. GPFS also allows two different recovery policies. After
a disk has failed, data can either be rebuilt when the disk is replaced or it can
immediately be rebuilt using other nodes or disks to hold the data.
GPFS offers reliability and is installed on thousands of nodes across industries,
from weather research to multimedia, retail, financial industry analytics, and web
service providers. GPFS also is the basis of many IBM cloud storage offerings.

The IBM Systems Solution for SAP HANA benefits in several ways from the
features of GPFS:
- GPFS provides a stable, industry-proven, cluster-capable file system for
  SAP HANA.
- GPFS transparently works with multiple replicas (that is, copies) of a
  single file to protect from disk failures.
- GPFS adds extra performance to the storage devices by striping data across
  devices.
- With the new FPO extensions, GPFS enables the IBM Systems Solution for
  SAP HANA to grow beyond the capabilities of a single system, into a
  scale-out solution, without introducing the need for external storage.
- GPFS adds high-availability and disaster recovery features to the solution.
All these features make GPFS the ideal file system for the IBM Systems Solution
for SAP HANA.

5.3.3 Scaling-out SAP HANA using GPFS


Scaling up a single-node SAP HANA appliance allows you to expand the
capabilities of an SAP HANA installation only up to the point where the
physical limit of the server is reached. To allow for further growth, the IBM
Systems Solution for SAP HANA supports a scale-out approach (that is,
combining a number of systems into a clustered solution, which represents a
single SAP HANA instance). An SAP HANA system can span multiple servers,
partitioning the data to hold and process larger amounts of data than a single
server can accommodate.
All scale-out solutions are based on the same building blocks that are
described in 5.2, "IBM X6 systems" on page 101 and 5.1, "IBM eX5 systems" on
page 80.
All IBM scale-out solutions for SAP HANA have the following properties:
- The scale-out solution is a cluster of servers, which are interconnected
  with two separate 10 Gb Ethernet networks, one for the SAP HANA application
  and one for the shared GPFS file system communication. Both networks are
  redundant.
- The SAP HANA database is split into partitions, one on each cluster node,
  forming a single instance of the SAP HANA database.

- Each node of the cluster holds its own savepoints and database logs on the
  local storage devices of the server.
- The GPFS file system is a shared file system. Because GPFS spans all nodes
  of the cluster, it makes the data of each node available to all other nodes
  in the cluster despite using local storage devices only (for more
  information about this technology, see 5.3.2, "GPFS extensions for
  shared-nothing architectures" on page 126).
To an outside application connecting to the SAP HANA database, this looks like a
single instance of SAP HANA. The SAP HANA software distributes the requests
internally across the cluster to the individual worker nodes, which process the
data and exchange intermediate results, which are then combined and sent back
to the requester. Each node maintains its own set of data, persisting it with
savepoints and logging data changes to the database log that are stored on local
storage.
GPFS combines the storage devices of the individual nodes into one large file
system, making sure that the SAP HANA software has access to all data
regardless of its location in the cluster. GPFS also makes sure that savepoints
and database logs of an individual database partition are stored on the
appropriate storage device of the node on which the partition is located. Although
GPFS provides the SAP HANA software with the functionality of a shared
storage system, it ensures maximum performance and minimum latency by using
locally attached disks and flash devices.
In addition, because server-local storage devices are used, the total capacity and
performance of the storage within the cluster automatically increases with the
addition of nodes, maintaining the same per-node performance characteristics
regardless of the size of the cluster. This kind of scalability is not achievable with
external storage system.
Note: With eX5 and X6 nodes, SAP validates the IBM scale-out solution for up
to 56 nodes in a cluster. However, the building block approach of IBM makes
the solution scalable without any known limitations.
IBM has shown scalability for up to 224 nodes in a single SAP HANA scale-out
cluster. With the current X6 servers, this allows for SAP HANA database
instances of up to 448 TB.
Clients requiring scale-out configurations beyond the generally available
56 nodes can work with IBM and SAP to jointly validate such large clusters at
the client site.

Scaling out an IBM SAP HANA solution creates a cluster of nodes. SAP HANA
assigns each node in a scale-out configuration a certain role: a node is
either a worker node or a standby node. Worker nodes actively process the
workload; standby nodes are part of the cluster but do not process any
workload while the cluster remains in a healthy state. A standby node takes
over the role of a worker node when that node fails. Standby nodes are
required for scale-out clusters with high availability.

Scale-out solution without high-availability capabilities


This section covers scale-out environments with only SAP HANA worker nodes.
Such environments have no support for high availability because no standby
nodes are part of the cluster.
Figure 5-27 shows the networking architecture of a four-node scale-out solution.
The node designation has no impact on the network connectivity of a node. All
nodes are considered equal.

Figure 5-27 Network architecture of a four-node scale-out solution (eX5 and X6)

There are two networks spanning the redundant Ethernet switches:
- The GPFS network, for communication and data transfer between the nodes
- The SAP HANA network, for database communication

Every node has redundant connectivity to each of the two networks, which leads
to four 10 Gbps Ethernet ports being required per node in scale-out
environments. If the SAP HANA database instance running on those nodes grows,
clients can add nodes to extend the overall main memory of the cluster. This
is possible without affecting any of the existing nodes, so the cluster does
not have to be taken down for this operation.

Going up the stack, Figure 5-28 gives insight into a three-node configuration
and shows how GPFS stripes data across the nodes. Local storage can be a
single storage device (as in smaller X6 nodes), multiple devices (as in bigger
X6 nodes or x3690 X5 nodes), or even devices with different underlying
technologies (such as HDD and flash memory in x3950 X5 nodes).

Figure 5-28 A three-node clustered solution without failover capabilities

To an outside application connecting to the SAP HANA database, this solution
looks like a single instance of SAP HANA. The SAP HANA software distributes
the requests internally across the cluster to the individual worker nodes,
which process the data and exchange intermediate results, which are then
combined and sent back to the requester. Each node maintains its own set of
data, persisting it with savepoints and logging data changes to the database
log.
GPFS combines one or multiple storage devices of the individual nodes into one
large file system, making sure that the SAP HANA software has access to all
data regardless of its location in the cluster, while making sure that
savepoints and database logs of an individual database partition are stored on
the appropriate storage device of the node on which the partition is located.
This feature is called locality.
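The locality property can be pictured as one shared namespace whose files are pinned to particular nodes (a toy Python model with invented names, not the GPFS implementation):

```python
# Shared namespace: every node can resolve every file, but each partition's
# savepoints and logs physically live on that partition's own node.
FILE_LOCATION = {
    "data01+log01": "node01",
    "data02+log02": "node02",
    "data03+log03": "node03",
}

def storage_node(path: str) -> str:
    """Return the node whose local devices hold the given file."""
    return FILE_LOCATION[path]

# Partition 2's savepoints and logs are reachable from any node, but they are
# stored on node02's local disks:
print(storage_node("data02+log02"))  # node02
```

The point of the sketch is the separation of concerns: the namespace is global (any node can open any file), while the physical placement follows the partition, which is what keeps savepoint and log writes local and fast.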

The absence of failover capabilities represents a major disadvantage of this
solution. The cluster behaves like a single-node configuration: if one node
becomes unavailable for any reason, the database partition on that node
becomes unavailable, and with it the entire SAP HANA database. Loss of the
storage of a node means data loss (as with a single-server solution), and the
data must be recovered from a backup. To cover the risk of a node failure, a
standby node must be added to the cluster. This solution is described in the
next section.

Scale-out solution with high-availability capabilities


The scale-out solution for SAP HANA with high-availability capabilities
enhances the scale-out solution in two major ways:
- It makes the SAP HANA application highly available by introducing SAP HANA
  standby nodes, which can take over from a failed node within the cluster.
- It makes the data that is provided through GPFS highly available to the SAP
  HANA application, including its data on the local storage devices. This
  allows you to tolerate the loss of a node.
SAP HANA allows the addition of nodes in the role of a standby node. These
nodes run the SAP HANA application, but do not hold any data in memory or
take an active part in the processing. In case one of the active worker nodes fails,
a standby node takes over the role of the failed node, including the data (that is,
the database partition) of the failed node. This mechanism allows the clustered
SAP HANA database to continue operating.
To take over the database partition from the failed node, the standby node must
load the savepoints and database logs of the failed node to recover the database
partition and resume operation in place of the failed node. This is possible
because GPFS provides a shared file system across the entire cluster, giving
each individual node access to all the data that is stored on the storage devices
that are managed by GPFS.

Figure 5-29 shows a four-node cluster with the fourth node being a standby
node.

Figure 5-29 A four-node clustered solution with failover capabilities

If a node has an unrecoverable hardware error, the storage devices holding the
node's data might become unavailable or even be destroyed. In contrast to
"Scale-out solution without high-availability capabilities" on page 130, when
high-availability features are implemented, the GPFS file system replicates
the data of each node to the other nodes, creating a second replica, to
prevent data loss in case one of the nodes goes down. Replication is done in a
striping fashion: every node holds a piece of the data of all other nodes. In
Figure 5-29, the contents of the data storage (that is, the savepoints, here
data01) and the log storage (that is, the database logs, here log01) of node01
are replicated to node02, node03, and node04.
Replication happens for all nodes generating data, so that all information is
available twice within the GPFS file system, which makes it tolerant to the loss of
a single node. Replication occurs synchronously. The write operation finishes
only when the data is both written locally and on a remote node. This ensures
consistency of the data at any point in time. Although GPFS replication is done
over the network and in a synchronous fashion, this solution still exceeds
the performance requirements that SAP sets for validation.
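The synchronous write rule described above can be shown as a toy model (Python; the class and function names are invented for illustration and this is not how GPFS is implemented internally):

```python
class Node:
    """Stands in for one cluster node's local storage devices."""
    def __init__(self, name: str):
        self.name = name
        self.storage = {}

    def write_block(self, key: str, data: bytes) -> bool:
        self.storage[key] = data  # pretend this is a durable local write
        return True

def synchronous_write(local: Node, remote: Node, key: str, data: bytes) -> bool:
    """The write is acknowledged only after BOTH replicas are durable."""
    ok_local = local.write_block(key, data)     # first replica, local disk
    ok_remote = remote.write_block(key, data)   # second replica, over the GPFS network
    return ok_local and ok_remote               # only now does the caller see success

node01, node02 = Node("node01"), Node("node02")
assert synchronous_write(node01, node02, "log01", b"commit record")
assert node01.storage["log01"] == node02.storage["log01"]  # always consistent
```

Because the acknowledgement waits for both copies, a node can fail at any moment without leaving the surviving replica behind the acknowledged state, which is exactly the consistency guarantee the text describes.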

The File Placement Optimizer (FPO), part of GPFS, ensures that the first replica
always is stored local to the node generating the data. In case SAP HANA data
must be read from disk (for example, for backups or restore activity), FPO always
prefers the replica that is available locally. This ensures the best read
performance of the cluster.
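The read-preference rule can be sketched like this (illustrative Python; the function and node names are invented, not FPO's actual policy engine):

```python
def choose_replica(reading_node: str, replica_nodes: list[str]) -> str:
    """Prefer the replica stored on the reading node itself; otherwise go remote."""
    if reading_node in replica_nodes:
        return reading_node       # local read, no network transfer
    return replica_nodes[0]       # fall back to a remote replica holder

# node01 holds the first replica of its own data, so it reads locally:
print(choose_replica("node01", ["node01", "node03"]))  # node01
# After a takeover, node04 holds no local copy yet and must read remotely:
print(choose_replica("node04", ["node01", "node03"]))  # node01
```

This simple preference is what keeps backup and restore reads off the cluster network whenever a local copy exists.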
Using replication, GPFS provides the SAP HANA software with the functionality
and fault tolerance of a shared storage subsystem while maintaining its
performance characteristics. Again, because server-local storage devices are
used, the total capacity and performance of the storage within the cluster
automatically increases with the addition of nodes, maintaining the same
per-node performance characteristics regardless of the size of the cluster. This
kind of scalability is not achievable with external storage systems.

Example of a node takeover


To further illustrate the capabilities of this solution, this section provides a node
takeover example. In this example, we have a four-node setup, initially configured
as shown in Figure 5-29 on page 133, with three active nodes and one standby
node.
First, node03 experiences a problem and fails unrecoverably. Data that is stored
on this node is not available anymore. The SAP HANA master node (node01)
recognizes this fact and directs the standby node, node04, to take over from the
failed node. The standby node is running the SAP HANA application and is part
of the cluster, but in an inactive role.
To re-create database partition 3 in memory to take over the role of node03
within the cluster, node04 reads the savepoints and database logs of node03
from the GPFS file system, reconstructs the savepoint data in memory, and
reapplies the logs so that the partition data in memory is exactly like it was before
node03 failed. Node04 is in operation, and the database cluster has recovered.
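The recovery idea — reload the last savepoint, then reapply the database log entries written after it — can be sketched as follows (Python; the data structures and operation names are invented for illustration, not SAP HANA's actual persistence format):

```python
def recover_partition(savepoint: dict, log: list) -> dict:
    """Rebuild an in-memory partition from a savepoint image plus its log."""
    partition = dict(savepoint)        # reload the savepoint image
    for op, key, value in log:         # reapply logged changes in order
        if op == "put":
            partition[key] = value
        elif op == "delete":
            partition.pop(key, None)
    return partition

savepoint = {"k1": 1, "k2": 2}
log = [("put", "k3", 3), ("delete", "k2", None)]
print(recover_partition(savepoint, log))  # {'k1': 1, 'k3': 3}
```

Replaying the log in order is what lets node04 end up with partition data in memory exactly as it was before node03 failed, even though node04 never held that partition before.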

Figure 5-30 illustrates this scenario.

Figure 5-30 Standby node 4 takes over from failed node 3

The data that node04 must load into memory is the data of node03, which failed,
including its local storage devices. For that reason, GPFS had to deliver the data
to node04 from the second replica, which is spread across the cluster. GPFS
handles this transparently so that the application does not recognize from which
node the data was read. If data is available locally, GPFS prefers to read from
node04 and avoid going over the network.
Now, when node04 starts writing savepoints and database logs again during the
normal course of operations, these are not written over the network, but to the
local drives, again with a second replica striped across the other cluster nodes.

After the cause of the failure of node03 is fixed, the node can be
reintegrated into the cluster as the new standby system. This situation is
shown in Figure 5-31.

Figure 5-31 Node 3 is reintegrated into the cluster as a standby node

This example illustrates how IBM combines two independently operating
high-availability measures (that is, the concept of standby nodes on the SAP
HANA application level and the reliability features of GPFS on the
infrastructure level), resulting in a scalable solution that provides fully
automated high availability with no administrative intervention required.

5.4 IBM System Networking options


Larger SAP HANA implementations scale beyond the limits of a single server. In
those environments, the database is split into several partitions, with each
partition on a separate server within the cluster. Nodes in a cluster communicate
with each other through a high-speed interconnect. Network switches are crucial
in such scale-out solutions.

All IBM System x solutions for SAP HANA use network switches that meet these
requirements. There are three top-of-rack Ethernet switches that are part of
the scale-out solution of IBM:
- IBM System Networking RackSwitch G8264: A 10 Gb Ethernet switch with 64
  SFP+ ports. This switch is used in scale-out solutions to provide internal
  cluster communication for the GPFS and SAP HANA networks.
- IBM System Networking RackSwitch G8124: A 10 Gb Ethernet switch with 24
  SFP+ ports. It is used for smaller installations that do not need as many
  ports as the G8264 switch provides.
- IBM System Networking RackSwitch G8052: A 1 Gb Ethernet switch with 48
  10/100/1000BASE-T RJ45 ports. It is used in scale-out environments for the
  management and SAP client networks.

5.4.1 IBM System Networking RackSwitch G8264


The G8264 switch is a 10 Gb/40 Gb Top-of-Rack switch that is designed for
applications that require the highest performance at low latency. It combines
1.28 Tbps throughput with up to sixty-four 10 Gb SFP+ ports in an ultra-dense
1U form factor.
Figure 5-32 shows the front view of the G8264 switch.

Figure 5-32 G8264 switch front view

The G8264 switch offers the following benefits with respect to SAP HANA
environments:
- High performance: The 10 Gb/40 Gb low-latency switch provides the best
  combination of low latency, non-blocking line-rate switching, and ease of
  management, with a throughput of 1.28 Tbps.
- Lower power and better cooling: The G8264 switch uses as little as 275 W of
  power, which is a fraction of the power consumption of most competitive
  offerings. Unlike side-cooled switches, which can cause heat recirculation
  and reliability concerns, the G8264 switch's front-to-rear or rear-to-front
  cooling design reduces data center air conditioning costs by having airflow
  match the servers in the rack. In addition, variable speed fans assist in
  automatically reducing power consumption.
- Layer 3 functionality: IBM System Networking RackSwitch switches include
  Layer 3 functionality, which provides security and performance benefits, as
  inter-VLAN traffic stays within the switch. These switches also provide the
  full range of Layer 3 protocols, from static routes to technologies such as
  Open Shortest Path First (OSPF) and Border Gateway Protocol (BGP) for
  enterprise customers.
- Seamless interoperability: IBM System Networking RackSwitch switches
  interoperate seamlessly with other vendors' upstream switches.
- Fault tolerance: IBM System Networking RackSwitch switches learn alternative
  routes automatically and perform faster convergence in the unlikely case of
  a link, switch, or power failure. The switches use proven technologies, such
  as L2 trunk failover, advanced VLAN-based failover, VRRP, and Hot Links.
- Multicast: This switch supports IGMP Snooping v1, v2, and v3 with 2 K IGMP
  groups, and Protocol Independent Multicast, such as PIM Sparse Mode or PIM
  Dense Mode.
- Converged fabric: IBM System Networking RackSwitch switches are designed to
  support CEE and connectivity to FCoE gateways. CEE helps enable clients to
  combine storage, messaging traffic, VoIP, video, and other data on a common
  data center Ethernet infrastructure. FCoE helps enable highly efficient
  block storage over Ethernet for consolidating server network connectivity.
  As a result, clients can deploy a single-server interface for multiple data
  types, which can simplify both deployment and management of server network
  connectivity, while maintaining the high availability and robustness
  required for storage transactions.

IBM System Networking RackSwitch G8264 has the following performance
characteristics:
- 100% line rate performance
- 1280 Gbps non-blocking switching throughput (full duplex)
- Sub-1.1 µs latency
- 960 Mpps

Here are the interface options:
- Forty-eight SFP+ ports (10 GbE)
  The first eight SFP+ ports (10 GbE) can form two QSFP+ ports (40 GbE), which
  are used in high availability (HA) cluster solutions for interswitch links
  (ISLs). The other 56 SFP+ ports can be used for node connections. Each
  cluster node uses two switch ports (one port for the internal GPFS network,
  one port for the internal HANA network), so the G8264 switch supports up to
  28 nodes in the cluster.
- Four QSFP+ ports (40 GbE)
  The G8264 switch has four QSFP+ ports, which can operate as 40 GbE ports or
  can each form four SFP+ ports (10 GbE). In an HA scale-out solution, the
  first two QSFP+ ports are used to form ISLs between a pair of G8264
  switches, and the other two QSFP+ ports can be used as eight SFP+ ports,
  increasing the total number of 10 GbE ports.
- One 10/100/1000 Ethernet RJ45 port for out-of-band management
  One dedicated RJ45 port is used as the management port of the switch.
The G8264 switch supports the following features:
- Security:
  - RADIUS
  - TACACS+
  - SCP
  - Wire Speed Filtering: Allow and Deny
  - SSH v1 and v2
  - HTTPS Secure BBI
  - Secure interface login and password
  - MAC address move notification
  - Shift-B Boot menu (Password Recovery/Factory Default)
- VLANs:
  - Port-based VLANs
  - 4096 VLAN IDs supported
  - 1024 Active VLANs (802.1Q)
  - 802.1x with Guest VLAN
  - Private VLAN Edge

- FCoE/Lossless Ethernet:
  - 802.1 Data Center Bridging
  - Priority-based Flow Control (PFC)
  - Enhanced Transmission Selection (ETS)
  - Data Center Bridging Exchange protocol (DCBX)
  - FIP Snooping
  - Fibre Channel over Ethernet (FCoE)
  - Converged Enhanced Ethernet (CEE)
- Trunking:
  - LACP
  - Static Trunks (EtherChannel)
  - Configurable Trunk Hash algorithm
- Spanning Tree:
  - Multiple Spanning Tree (802.1s)
  - Rapid Spanning Tree (802.1w)
  - PVRST+
  - Fast Uplink Convergence
  - BPDU guard
- High availability:
  - Layer 2 failover
  - Hot Links
  - VRRP

5.4.2 IBM System Networking RackSwitch G8124


The IBM System Networking RackSwitch G8124 is designed with top
performance in mind. This low-latency switch provides line-rate, high-bandwidth
switching, filtering, and traffic queuing without delaying data.
Figure 5-33 shows the IBM System Networking RackSwitch G8124 Top-of-Rack
(TOR) switch.

Figure 5-33 IBM System Networking RackSwitch G8124 TOR switch

The G8124 switch offers the following benefits with respect to SAP HANA
environments:
- High performance: The 10 Gb low-latency (<700 ns) switch provides the best
  combination of low latency, non-blocking line-rate switching, and ease of
  management.
- Lower power and better cooling: The G8124 switch uses as little power as two
  60 W light bulbs, which is a fraction of the power consumption of most
  competitive offerings. Unlike side-cooled switches, which can cause heat
  recirculation and reliability concerns, the G8124 switch's rear-to-front
  cooling design reduces data center air conditioning costs by having airflow
  match the servers in the rack. In addition, variable speed fans assist in
  automatically reducing power consumption.
- Layer 3 functionality: This IBM System Networking RackSwitch includes Layer
  3 functionality, which provides security and performance benefits, as
  inter-VLAN traffic stays within the switch. This switch also provides the
  full range of Layer 3 protocols, from static routes to technologies such as
  Open Shortest Path First (OSPF) and Border Gateway Protocol (BGP) for
  enterprise customers.
- Active MultiPath (AMP): Effectively doubles bandwidth by allowing all uplink
  ports to be active/active, eliminating cross-stack traffic, and providing up
  to 900 Gbps aggregate bandwidth between servers. Built-in fault tolerance
  with constant health checking ensures maximum availability.
- Seamless interoperability: IBM System Networking RackSwitch switches
  interoperate seamlessly with other vendors' upstream switches.
- Fault tolerance: IBM System Networking RackSwitch switches learn alternative
  routes automatically and perform faster convergence in the unlikely case of
  a link, switch, or power failure. The switch uses proven technologies, such
  as L2 trunk failover, advanced VLAN-based failover, VRRP, and Hot Links.
- Converged fabric: The IBM System Networking RackSwitch is designed to
  support CEE and connectivity to FCoE gateways. CEE helps enable clients to
  combine storage, messaging traffic, VoIP, video, and other data on a common
  data center Ethernet infrastructure. FCoE helps enable highly efficient
  block storage over Ethernet for consolidating server network connectivity.
  As a result, clients can deploy a single-server interface for multiple data
  types, which can simplify both deployment and management of server network
  connectivity, while maintaining the high availability and robustness that
  are required for storage transactions.

The G8124 switch provides the best combination of low latency, non-blocking
line-rate switching, and ease of management. The G8124 switch has the
following performance characteristics:
- 100% line rate performance
- Latency under 700 ns (ultra-low latency, less than 1 microsecond)
- 480 Gbps non-blocking switching throughput (full duplex)

Interface options:
- Twenty-four 10 GbE SFP+ fiber connectors
  In HA scale-out configurations, the first four ports of the switch should be
  used for ISLs between switches; the other 20 ports can be used for node
  connections. Each cluster node uses two switch ports (one port for the
  internal GPFS network, one port for the internal HANA network), so the G8124
  switch supports up to 10 nodes in the cluster.
- Two 10/100/1000 Ethernet RJ45 ports for management
  Two dedicated RJ45 ports are used as the management ports of the switch.
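The node counts quoted for both switch models follow from simple port arithmetic, sketched here (Python; the helper function is ours, the port figures come from the text):

```python
def max_nodes(total_10gbe_ports: int, isl_ports: int, ports_per_node: int = 2) -> int:
    """Nodes one switch can serve: each node needs one GPFS port and one HANA
    port per switch, and some ports are reserved for inter-switch links."""
    return (total_10gbe_ports - isl_ports) // ports_per_node

print(max_nodes(64, 8))   # G8264: (64 - 8) // 2 = 28 nodes
print(max_nodes(24, 4))   # G8124: (24 - 4) // 2 = 10 nodes
```

Because every node also needs the same two ports on the second (redundant) switch, the full four-port-per-node requirement from the scale-out networking description is satisfied by a redundant pair of either switch model.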
The IBM System Networking RackSwitch G8124 supports the following features:
- Security:
  - RADIUS
  - TACACS+
  - SCP
  - Wire Speed Filtering: Allow and Deny
  - SSH v1 and v2
  - HTTPS Secure BBI
  - Secure interface login and password
  - MAC address move notification
  - Shift-B Boot menu (Password Recovery/Factory Default)
- VLANs:
  - Port-based VLANs
  - 4096 VLAN IDs supported
  - 1024 Active VLANs (802.1Q)
  - Private VLAN Edge
- FCoE/Lossless Ethernet:
  - 802.1 Data Center Bridging
  - Priority-based Flow Control (PFC)
  - Enhanced Transmission Selection (ETS)
  - Data Center Bridging Exchange protocol (DCBX)
  - FIP Snooping
  - Fibre Channel over Ethernet (FCoE)
  - Converged Enhanced Ethernet (CEE)
- Trunking:
  - LACP
  - Static Trunks (EtherChannel)
  - Configurable Trunk Hash algorithm
- Spanning Tree:
  - Multiple Spanning Tree (802.1s)
  - Rapid Spanning Tree (802.1w)
  - PVRST+
  - Fast Uplink Convergence
  - BPDU guard
- High availability:
  - Layer 2 failover
  - Hot Links
  - VRRP

5.4.3 IBM System Networking RackSwitch G8052


This switch combines great performance, server-like airflow for cooling, and
low-power consumption in a 1U package.
Figure 5-34 shows the IBM System Networking RackSwitch G8052 Top-of-Rack
(TOR) switch.

Figure 5-34 IBM System Networking RackSwitch G8052 TOR switch

The G8052 switch is a Top-of-Rack data center switch that delivers unmatched
line-rate Layer 2/3 performance at an attractive price. It has forty-eight
10/100/1000 BASE-T RJ45 ports and four 10 Gigabit Ethernet SFP+ ports, and
includes hot-swap redundant power supplies and fans standard, minimizing your
configuration requirements. Unlike most rack equipment that cools from side to
side, the G8052 switch has rear-to-front or front-to-rear airflow that matches
server airflow.
For 10 Gb uplinks, there is a choice of either SFP+ transceivers (SR or LR) for
longer distances or more cost-effective and lower-power-consuming options,
such as SFP+ direct-attached cables (DAC or Twinax cables), which can be
1 - 7 meters in length and are ideal for connecting to another TOR switch, or even
connecting to an adjacent rack.

The G8052 switch provides the following features with respect to SAP HANA
environments:
- High performance: The G8052 switch provides up to 176 Gbps throughput and
  supports four SFP+ 10 Gb uplink ports for a low oversubscription ratio, in
  addition to a low latency of 1.7 µs.
- Lower power and better cooling: The G8052 switch typically consumes just
  120 W of power, a fraction of the power consumption of most competitive
  offerings. Unlike side-cooled switches, which can cause heat recirculation
  and reliability concerns, the G8052 switch's rear-to-front or front-to-rear
  cooling design reduces data center air conditioning costs by matching
  airflow to the server configuration in the rack. Variable speed fans assist
  in automatically reducing power consumption.
- Layer 3 functionality: The G8052 switch includes Layer 3 functionality,
  which provides security and performance benefits, as inter-VLAN traffic can
  be processed at the access layer. This switch also provides the full range
  of Layer 3 static and dynamic routing protocols, including Open Shortest
  Path First (OSPF) and Border Gateway Protocol (BGP) for enterprise customers
  at no additional cost.
- Fault tolerance: These switches learn alternative routes automatically and
  perform faster convergence in the unlikely case of a link, switch, or power
  failure. The switch uses proven technologies, such as L2 trunk failover,
  advanced VLAN-based failover, VRRP, Hot Links, Uplink Failure Detection
  (UFD), IGMP v3 Snooping, and OSPF.
- Seamless interoperability: IBM RackSwitch switches interoperate seamlessly
  with other vendors' upstream switches.
Here are the performance features and specifications of the G8052 switch:
- Single switch ASIC design
- Full-line rate performance
- 176 Gbps (full duplex) switching architecture
- Low latency: 1.7 µs
- Interface options:
  - Forty-eight 10/100/1000BaseT ports (RJ-45)
  - Four 10 GbE SFP+ ports
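The throughput and oversubscription figures quoted above can be cross-checked with simple arithmetic (a sketch; the port counts are taken from the interface options listed in this section):

```python
# Sanity-check of the G8052 figures quoted in this section.
downlink_gbps = 48 * 1   # 48x 10/100/1000BaseT ports at 1 Gbps
uplink_gbps = 4 * 10     # 4x 10 GbE SFP+ uplink ports

# Full-duplex switching capacity counts both directions of every port:
capacity_gbps = (downlink_gbps + uplink_gbps) * 2
print(capacity_gbps)     # 176, matching the quoted 176 Gbps

# Oversubscription: worst-case downlink traffic versus available uplinks
oversubscription = downlink_gbps / uplink_gbps
print(oversubscription)  # 1.2, that is, a low 1.2:1 ratio
```

This is why four 10 Gb uplinks are enough for a 48-port 1 Gb access switch: even with every downlink port saturated, the uplinks are oversubscribed by only a factor of 1.2.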
G8052 switches are used in all System x offerings for SAP HANA as switches for
the management network (connecting the IMM interfaces of the nodes) and as the
default choice for connecting SAP HANA appliances to the client network. Clients
that have a 10 Gbps Ethernet backbone for their SAP environments can also choose
the G8264 switch for uplink connectivity.

144

In-memory Computing with SAP HANA on IBM eX5 and X6 Systems

Chapter 6. SAP HANA IT landscapes with IBM System x solutions
This chapter presents IT landscapes in which SAP HANA can be deployed and
shows the corresponding IBM System x workload-optimized solution that is
based on the building blocks that are introduced in Chapter 5, IBM System x
solutions for SAP HANA on page 79.
This chapter describes implementations that are based on IBM eX5 models, new
IBM X6 models, and mixed scale-out environments that are based on existing
eX5 servers and new X6 servers.
This chapter also covers solutions that are based on VMware vSphere and SAP
HANA systems running on IBM SmartCloud, which is the IBM cloud service for
SAP HANA appliances.
The last section of this chapter describes possible ways of sharing SAP HANA
systems. This chapter covers the following topics:
- IBM eX5 based environments
- IBM X6 based environments
- Migrating from eX5 to X6 servers
- SAP HANA on VMware vSphere
- Sharing an SAP HANA system
- SAP HANA on IBM SmartCloud
© Copyright IBM Corp. 2013, 2014. All rights reserved.

145

6.1 IBM eX5 based environments


Following the appliance-like delivery model for SAP HANA, IBM created several
custom server models for SAP HANA. These workload-optimized models are
designed to match and exceed the performance requirements and the functional
requirements that are specified by SAP. With a small set of System x
workload-optimized models for SAP HANA, all sizes of SAP HANA solutions can
be built, from the smallest to large installations.

IBM eX5 workload-optimized models for SAP HANA


In October 2012, IBM announced a set of eX5 workload-optimized models for
SAP HANA that were built on the Intel Xeon processor E7-2800 and E7-8800
product families. Because there is no direct relationship between the
workload-optimized models and the SAP HANA T-shirt sizes, we refer to these
models as building blocks.
The building blocks are configured to match the SAP HANA sizing requirements.
The main memory sizes match the number of processors to give the correct
balance between processing power and data volume. Also, the storage devices
in the systems provide the storage capacity that is required to match the amount
of main memory.
IBM eX5 workload-optimized models for SAP HANA are available with SUSE
Linux Enterprise Server (SLES) for SAP Applications only. Red Hat Enterprise
Linux is not available with these eX5 models.
Figure 6-1 on page 147 shows the portfolio of all eX5-based building blocks for
SAP HANA. You can see that they are based on two different servers, the IBM
System x3690 X5 and the IBM System x3950 X5.


[Figure: the eX5 building blocks plotted by number of processors (IBM System
x3690 X5 and IBM System x3950 X5) against memory size, from XS (128 GB)
through S, S+, M, XM, and L up to XL (2 TB) and XXL (4 TB). XS, S, S+, M, and L
support Business Warehouse (BW) and Suite on HANA (SoH); XM, XL, and XXL
support SoH only. Scale-out is for BW only; the axes are not meant to scale
mathematically.]

Figure 6-1 Portfolio of eX5 based building blocks for SAP HANA

In some cases, several building blocks are available for one T-shirt size; in
other cases, two building blocks must be combined to build a specific T-shirt
size. Table 6-1 shows all eX5-based building blocks and their features.
Table 6-1 Overview of all eX5 workload-optimized models for SAP HANA

Building block | Server                | CPUs                   | Main memory | Business Warehouse | Suite on HANA | Upgrade options
XS             | x3690 X5              | 2x Intel Xeon E7-2870  | 128 GB      | Yes                | Yes           | XS → S
S              | x3690 X5              | 2x Intel Xeon E7-2870  | 256 GB      | Yes                | Yes           | None
S+             | x3950 X5              | 2x Intel Xeon E7-8870  | 256 GB      | Yes                | Yes           | S+ → M
M              | x3950 X5              | 4x Intel Xeon E7-8870  | 512 GB      | Yes                | Yes           | M → XM, M → L
XM             | x3950 X5              | 4x Intel Xeon E7-8870  | 1 TB        | No                 | Yes           | XM → XL
L              | x3950 X5 + x3950 X5   | 8x Intel Xeon E7-8870  | 1 TB        | Yes                | Yes           | L → XL
XL             | x3950 X5 + x3950 X5   | 8x Intel Xeon E7-8870  | 2 TB        | No                 | Yes           | XL → XXL
XXL            | x3950 X5 + x3950 X5   | 8x Intel Xeon E7-8870  | 4 TB        | No                 | Yes           | None

All models come with preinstalled software that is composed of SUSE Linux
Enterprise Server for SAP Applications (SLES for SAP) 11, IBM GPFS, and the
SAP HANA software stack. Licenses and maintenance fees (for three years) for
SLES for SAP and GPFS are included. The section "GPFS license information"
on page 256 gives an overview of which type of GPFS license comes with a
specific model.
XM, XL, and XXL building blocks are specific to and limited for use with SAP
Business Suite powered by SAP HANA. They have a different memory-to-core
ratio than the regular models, which is suitable only for this specific workload, as
described in 4.5.3, "SAP Business Suite powered by SAP HANA" on page 76.

6.1.1 Single-node eX5 solution for Business Warehouse


A single-node solution is the simplest possible implementation of the SAP HANA
system. Depending on SAP HANA sizing requirements, choose the appropriate
building block from the available set. Figure 6-2 on page 149 shows an overview
of all eX5-based single-node solutions for Business Warehouse (BW), including
possible upgrade paths.


[Figure: the single-node eX5 solutions for Business Warehouse plotted by
number of processors (IBM System x3690 X5 and IBM System x3950 X5) against
memory size: XS (128 GB), S (256 GB), S+ (256 GB), M (512 GB), and L (1 TB).
Multiple boxes for S, M, and L denote scale-out support.]

Figure 6-2 Portfolio of single-node eX5 based solutions for Business Warehouse

These solutions are based on two different servers: the x3690 X5 and the
x3950 X5. Table 6-2 lists the details of each of the building blocks.
Table 6-2 IBM eX5 workload-optimized models for Business Warehouse on SAP HANA

Building block | Server (MTM)                                 | CPUs                   | Main memory              | Log storage               | Data storage            | Upgrade options
XS             | x3690 X5 (7147-HAx^a)                        | 2x Intel Xeon E7-2870  | 128 GB DDR3 (8x 16 GB)   | 10x 200 GB 1.8" MLC SSD (combined log and data)     | XS → S
S              | x3690 X5 (7147-HBx)                          | 2x Intel Xeon E7-2870  | 256 GB DDR3 (16x 16 GB)  | 10x 200 GB 1.8" MLC SSD (combined log and data)     | None
S+             | x3950 X5 (7143-HAx)                          | 2x Intel Xeon E7-8870  | 256 GB DDR3 (16x 16 GB)  | 1.2 TB High IOPS adapter  | 6x 900 GB 10K SAS HDD   | S+ → M
M              | x3950 X5 (7143-HBx)                          | 4x Intel Xeon E7-8870  | 512 GB DDR3 (32x 16 GB)  | 1.2 TB High IOPS adapter  | 6x 900 GB 10K SAS HDD   | M → L
L              | x3950 X5 (7143-HBx) + x3950 X5 (7143-HCx)^b  | 8x Intel Xeon E7-8870  | 1 TB DDR3 (64x 16 GB)    | 1.2 TB High IOPS adapter  | 14x 900 GB 10K SAS HDD  | None

a. x = Country-specific letter (for example, the EMEA MTM is 7147-HAG, and the US MTM is 7147-HAU). Contact your IBM representative for regional part numbers.
b. The two servers are connected with QPI wrap cards that leverage eX5 scalability.

You cannot upgrade from the x3690 X5 to the x3950 X5. Upgrades within one
server type are supported. The following upgrade options exist:
- An XS building block can be upgraded to an S-size SAP HANA system by
  adding 128 GB of main memory to the system.
- An S+ building block can be upgraded to an M-size SAP HANA system by
  adding two more processors plus another 256 GB of main memory.
- M building blocks can be extended with the L option to resemble an L-size
  SAP HANA system.
- With the option to upgrade S+ to M, and M to L, IBM can provide an
  unmatched upgrade path from a T-shirt size S+ up to a T-shirt size L, without
  the need to retire a single piece of hardware.
You can also grow into a clustered scale-out environment without changing any
of the hardware of the single-node solution. The upgrade to scale-out is
supported for S, M, and L (as denoted by the multiple boxes in Figure 6-2 on
page 149). Scale-out of XS and S+ is not supported because adding more
memory first, that is, scaling up, gives better performance than adding more
nodes.
Upgrading the server requires downtime of the system. However, because of the
capability of GPFS to add storage capacity to an existing GPFS file system by
just adding devices, data that is on the system remains intact. Do a backup of the
data before changing the system's configuration.

Support for business continuity


All single-node solutions that are listed in Table 6-2 on page 149 support the
following features for business continuity:
- HA within a single data center
- HA across data centers (metro distance)
- HA within a single data center plus DR to a secondary site
- HA across data centers (metro distance) plus DR to a third site
- DR to a secondary site

All HA and DR solutions can be implemented using either GPFS based storage
replication or SAP HANA System Replication. You also can use the standby
nodes in HA and DR solutions to run a second SAP HANA instance for
non-productive purposes. For more information about these architectures, see
7.2, "HA and DR for single-node SAP HANA" on page 196.
Adding further business continuity features after the initial implementation is
easily possible. Only additional hardware must be bought, and no parts must be
retired.

6.1.2 Single-node eX5 solution for SAP Business Suite on HANA


SAP Business Suite is a set of applications that are bundled and designed to run
an entire business. When running Business Suite on SAP HANA, the memory
per core ratio is different compared to a BW scenario because the type of
workload on the database server is different. This allows IBM to offer additional
building blocks with more main memory than for BW.
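As a rough illustration of this difference, the following sketch compares the memory-per-core ratio of the BW-capable M building block with the Suite-on-HANA-only XM building block. The 10-core count of the Intel Xeon E7-8870 is taken from the Intel specification and is not stated in this section:

```python
# Memory-per-core comparison between two four-socket eX5 building blocks
# (illustrative sketch; the E7-8870 is a 10-core processor per Intel's spec).
cores_per_cpu = 10

m_gb_per_core = 512 / (4 * cores_per_cpu)    # M building block: 4 CPUs, 512 GB
xm_gb_per_core = 1024 / (4 * cores_per_cpu)  # XM building block: 4 CPUs, 1 TB

print(m_gb_per_core)   # 12.8 GB per core
print(xm_gb_per_core)  # 25.6 GB per core, twice the ratio of the M block
```

The doubled ratio of the XM block is acceptable only because the Business Suite workload is less compute-intensive per gigabyte of data than a BW workload.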
Figure 6-3 shows all the building blocks that you can choose from when you plan
for SAP Business Suite powered by SAP HANA.

[Figure: the eX5 building blocks for SAP Business Suite on SAP HANA plotted by
number of processors (IBM System x3690 X5 and IBM System x3950 X5) against
memory size, from XS (128 GB) and S (256 GB) through S+, M, XM, and L up to
XL (2 TB) and XXL (4 TB).]

Figure 6-3 Portfolio of eX5 based building blocks for SAP Business Suite on SAP HANA


The XS and S building blocks are based on the x3690 X5 server; all other T-shirt
sizes are based on the x3950 X5 building blocks. Table 6-3 lists the configuration
details of each building block for a Business Suite environment.
Table 6-3 IBM eX5 workload-optimized models for SAP Business Suite on SAP HANA

Building block | Server (MTM)                                 | CPUs                   | Main memory               | Log storage                  | Data storage                                        | Upgrade options
XS             | x3690 X5 (7147-HAx^a)                        | 2x Intel Xeon E7-2870  | 128 GB DDR3 (8x 16 GB)    | 10x 200 GB 1.8" MLC SSD (combined log and data)     | XS → S
S              | x3690 X5 (7147-HBx)                          | 2x Intel Xeon E7-2870  | 256 GB DDR3 (16x 16 GB)   | 10x 200 GB 1.8" MLC SSD (combined log and data)     | None
S+             | x3950 X5 (7143-HAx)                          | 2x Intel Xeon E7-8870  | 256 GB DDR3 (16x 16 GB)   | 1.2 TB High IOPS adapter     | 6x 900 GB 10K SAS HDD                               | S+ → M
M              | x3950 X5 (7143-HBx)                          | 4x Intel Xeon E7-8870  | 512 GB DDR3 (32x 16 GB)   | 1.2 TB High IOPS adapter     | 6x 900 GB 10K SAS HDD                               | M → XM, M → L
XM             | x3950 X5 (7143-HDx)                          | 4x Intel Xeon E7-8870  | 1 TB DDR3 (32x 32 GB)     | 1.2 TB High IOPS adapter     | 6x 900 GB 10K SAS HDD                               | XM → XL
L              | x3950 X5 (7143-HBx) + x3950 X5 (7143-HCx)^b  | 8x Intel Xeon E7-8870  | 1 TB DDR3 (64x 16 GB)     | 1.2 TB High IOPS adapter     | 14x 900 GB 10K SAS HDD                              | L → XL
XL             | x3950 X5 (7143-HDx) + x3950 X5 (7143-HEx)^b  | 8x Intel Xeon E7-8870  | 2 TB DDR3 (64x 32 GB)     | 2x 1.2 TB High IOPS adapter  | 14x 900 GB 10K SAS HDD                              | XL → XXL
XXL            | x3950 X5 (7143-HDx) + x3950 X5 (7143-HEx)^b  | 8x Intel Xeon E7-8870  | 4 TB DDR3 (128x 32 GB)    | 4x 1.2 TB High IOPS adapter  | 14x 900 GB 10K SAS HDD + 8x 900 GB 10K SAS HDD through EXP2524  | None

a. x = Country-specific letter (for example, the EMEA MTM is 7147-HAG, and the US MTM is 7147-HAU). Contact your IBM representative for regional part numbers.
b. The two servers are connected with QPI wrap cards that leverage eX5 scalability.

152

In-memory Computing with SAP HANA on IBM eX5 and X6 Systems

You cannot upgrade from the x3690 X5 to the x3950 X5 building blocks.
Upgrades within one server type are supported. The following upgrade options
exist:
- An XS building block can be upgraded to an S-size system by adding
  128 GB of main memory to the system.
- An S+ building block can be upgraded to an M-size system by adding two
  more processors plus another 256 GB of main memory.
- M building blocks can be extended with the L option to resemble an L-size
  system.
- With the option to upgrade S+ to M, and M to L, IBM can provide an
  unmatched upgrade path from a T-shirt size S+ up to a T-shirt size L, without
  the need to retire a single piece of hardware.
- Customers can start with an S+ configuration, and then upgrade to M and L,
  and finally to XL, only by adding new components. Further growth to XXL is
  possible, but requires an exchange of the memory DIMMs.
- Clients starting at a 1 TB configuration can upgrade from XM to XL and then
  to XXL without the need to retire a single piece of hardware, which quadruples
  the memory capacity.
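The upgrade options described above form a small directed graph. The following illustrative sketch (not an IBM tool) encodes them and checks which T-shirt sizes are reachable from a starting configuration:

```python
# Upgrade paths of the eX5 Business Suite building blocks as described above.
# An edge means the upgrade is supported; growing to XXL additionally
# requires exchanging memory DIMMs, as noted in the text.
upgrades = {
    "XS": ["S"],
    "S": [],
    "S+": ["M"],
    "M": ["XM", "L"],
    "XM": ["XL"],
    "L": ["XL"],
    "XL": ["XXL"],
    "XXL": [],
}

def reachable(start: str) -> set:
    """Return all T-shirt sizes reachable from 'start' via supported upgrades."""
    seen, todo = set(), [start]
    while todo:
        for nxt in upgrades[todo.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

# An S+ configuration can grow all the way to XXL:
print(sorted(reachable("S+")))  # ['L', 'M', 'XL', 'XM', 'XXL']
```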
Upgrading the server requires downtime of the system. However, because of the
capability of GPFS to add storage capacity to an existing GPFS file system by
just adding devices, data that is on the system remains intact. Do a backup of the
data before changing the system's configuration.

Support for business continuity


Implementing some level of business continuity is highly recommended because
SAP Business Suite runs the entire business and its processes. Downtime of
these applications most often has a significant impact on the business. Clients
in the manufacturing industry often face a halt of their production lines if the
supply chain system is not available.
For this reason, all single-node solutions that are listed in Table 6-3 on page 152
support the following business continuity features:
- HA within a single data center
- HA across data centers (metro distance)
- HA within a single data center plus DR to a secondary site
- HA across data centers (metro distance) plus DR to a third site
- DR to a secondary site


All HA and DR solutions can be implemented using either GPFS based storage
replication or SAP HANA System Replication. You also can use the standby
nodes in HA and DR solutions to run a second SAP HANA instance for
non-productive purposes. For more information about these architectures, see
7.2, HA and DR for single-node SAP HANA on page 196.
Adding additional business continuity features after the initial implementation is
easily possible. Only additional hardware must be bought, and no parts must be
retired.
Scale-out SAP HANA environments are not supported when running SAP
Business Suite on top.

6.1.3 Scale-out eX5 solution for Business Warehouse


Customers whose BW databases do not fit into one single node must scale out
their SAP HANA installation. This task involves clustering multiple nodes
together with a high-speed interconnect (10 Gbit Ethernet) and spreading out the
database tables across the participating nodes. Figure 6-4 gives an overview of
all eX5 based building blocks that are supported for such scale-out scenarios.

[Figure: the eX5 building blocks supported for scale-out Business Warehouse
plotted by number of processors against memory size: S (256 GB), M (512 GB),
and L (1 TB).]

Figure 6-4 Portfolio of eX5 based solutions for scale-out Business Warehouse

The S building block is based on the x3690 X5, and M and L are based on the
x3950 X5 building blocks. Table 6-4 on page 155 lists the configuration details of
each building block.


Table 6-4 IBM eX5 workload-optimized models for Business Warehouse on SAP HANA

Building block | Server (MTM)                                 | CPUs                   | Main memory              | Log storage               | Data storage            | Maximum cluster nodes
S              | x3690 X5 (7147-HBx^a)                        | 2x Intel Xeon E7-2870  | 256 GB DDR3 (16x 16 GB)  | 10x 200 GB 1.8" MLC SSD (combined log and data)     | 16
M              | x3950 X5 (7143-HBx)                          | 4x Intel Xeon E7-8870  | 512 GB DDR3 (32x 16 GB)  | 1.2 TB High IOPS adapter  | 6x 900 GB 10K SAS HDD   | 56
L              | x3950 X5 (7143-HBx) + x3950 X5 (7143-HCx)^b  | 8x Intel Xeon E7-8870  | 1 TB DDR3 (64x 16 GB)    | 1.2 TB High IOPS adapter  | 14x 900 GB 10K SAS HDD  | 56

a. x = Country-specific letter (for example, the EMEA MTM is 7147-HAG, and the US MTM is 7147-HAU). Contact your IBM representative for regional part numbers.
b. The two servers are connected with QPI wrap cards that leverage eX5 scalability.

An SAP HANA cluster environment can consist of only building blocks with the
same memory size. Mixing M and L nodes in a cluster is not supported. When
upgrading from a cluster with M nodes to L nodes, you must add the additional
hardware to every node.
The minimum number of nodes in any SAP HANA scale-out environment is three
nodes. A cluster of two nodes is not supported. The maximum number of nodes
that is generally available for implementation is 56. However, IBM has validated
feasibility for up to 224 nodes in its labs.¹
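The scale-out constraints stated above can be expressed as a small validation helper (an illustrative sketch, not an SAP or IBM tool):

```python
# Sketch of the eX5 scale-out constraints described above: all nodes must
# have the same memory size (no mixing of M and L nodes), at least three
# nodes are required, and at most 56 nodes are generally available.
def valid_cluster(node_memory_gb: list, ga_limit: int = 56) -> bool:
    if len(set(node_memory_gb)) > 1:  # mixed memory sizes are not supported
        return False
    return 3 <= len(node_memory_gb) <= ga_limit

print(valid_cluster([512] * 4))         # True: four M nodes
print(valid_cluster([512, 512, 1024]))  # False: mixed M and L nodes
print(valid_cluster([512] * 2))         # False: two-node clusters are not supported
```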

Support for business continuity


All eX5 scale-out solutions that are listed in Table 6-4 support the following
features for business continuity:
- HA within the cluster to protect against node failures
- DR that replicates data to a secondary data center
- A combination of the HA and DR features above
HA can be implemented using GPFS based storage replication. DR solutions
can be implemented using either GPFS based storage replication or SAP HANA
System Replication. You also can use the idling DR site nodes to run a second
SAP HANA instance for non-productive purposes. For more information about
these solutions, see 7.3, "HA and DR for scale-out SAP HANA" on page 221.
¹ If you are interested in solutions beyond 56 nodes, contact your IBM representative or email the
IBM SAP International Competence Center at isicc@de.ibm.com.


Single-node to scale-out solution upgrade requirements


With eX5 systems, the scale-out solution for the IBM Systems Solution for SAP
HANA builds on the same building blocks that are used in a single-server
installation. Additional hardware and software components are needed to
complement the basic building blocks when upgrading a single-node solution
into a scale-out solution, as described in this section.
Depending on the building blocks that are used, additional GPFS licenses might
be needed for the scale-out solution. The GPFS on x86 Single Server for
Integrated Offerings V3 provides file system capabilities for single-node
integrated offerings. This GPFS license does not cover usage in multi-node
environments, such as the scale-out solution that is described here. To use
building blocks that come with the GPFS on x86 Single Server for Integrated
Offerings licenses for a scale-out solution, GPFS on x86 Server licenses must
be obtained for these building blocks. The section "GPFS license information"
on page 256 gives an overview of which type of license comes with a specific
model, and the number of processor value units (PVUs) that are needed.
Alternatively, GPFS File Placement Optimizer licenses can be used with GPFS
on x86 Server licenses. In a scale-out configuration, a minimum of three nodes
must use GPFS on x86 Server licenses, and the remaining nodes can use GPFS
File Placement Optimizer licenses.
Other setups, such as the disaster recovery solution that is described in 7.3, HA
and DR for scale-out SAP HANA on page 221, might require more nodes using
GPFS on x86 Server licenses, depending on the role of the nodes in the setup.
Section GPFS license information on page 256 has an overview of the GPFS
license types, which type of license comes with a specific model, and the number
of PVUs that are needed.
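The licensing rule described above can be sketched as a small planning helper (illustrative only; as noted, HA and DR setups might require more Server licenses depending on node roles):

```python
# GPFS license planning for a scale-out cluster, following the rule stated
# above: at least three nodes need GPFS on x86 Server licenses, and the
# remaining nodes can use File Placement Optimizer (FPO) licenses.
def gpfs_licenses(nodes: int, server_min: int = 3):
    """Return (server_licenses, fpo_licenses) for a cluster of 'nodes' nodes."""
    if nodes < server_min:
        raise ValueError("a scale-out cluster requires at least three nodes")
    server = server_min
    fpo = nodes - server
    return server, fpo

print(gpfs_licenses(4))   # (3, 1)
print(gpfs_licenses(16))  # (3, 13)
```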
As described in 5.1.6, x3950 X5 Workload Optimized Solution for SAP HANA
on page 91 and 5.1.7, x3690 X5 Workload Optimized Solution for SAP HANA
on page 99, additional 10 Gb Ethernet NICs must be added to the building blocks
in some configurations to provide redundant network connectivity for the internal
networks, and possibly also for the connection to the client network, in case a
10 Gb Ethernet connection to the other systems (for example, replication server
or SAP application servers) is required. Information about supported network
interface cards for this purpose is provided in the Quick Start Guide, which can
be found at the following website:
http://www-947.ibm.com/support/entry/myportal/docdisplay?lndocid=MIGR-5087035


For a scale-out solution built upon the SSD-only building blocks that are based
on x3690 X5, additional 200 GB 1.8 MLC SSD drives are required to
accommodate the additional storage capacity that is required for GPFS
replication. The total number of SSD drives that are required is documented in
the SAP Product Availability Matrix (PAM) for SAP HANA, which is available
online (search for HANA) at the following website:
http://service.sap.com/pam

6.2 IBM X6 based environments


With the introduction of X6, IBM builds on the building block concept, which gives
customers the most flexibility in selecting components to fit their needs while
protecting investments through an unmatched upgrade path when workload
increases.
IBM defines workload-optimized models based on X6 servers that are designed
to match and exceed the performance and the functional requirements that are
specified by SAP.

IBM X6 workload-optimized models for SAP HANA


In spring 2014, IBM announced the most recent addition to their portfolio of
workload-optimized models for SAP HANA. They are built on two new servers
with processors from the latest Intel Xeon processor E7 v2 family. The two
servers are the x3850 X6, with up to four processor sockets, and the x3950 X6,
with up to eight processor sockets. Both servers use the same components that
can be reused when growing from an x3850 X6 to an x3950 X6 solution. Only the
mechanical chassis must be replaced to accommodate more components, such
as processors and I/O. Section 5.2, IBM X6 systems on page 101 explains the
X6 server architecture in detail and has pictures of several X6 components.


Figure 6-5 shows an overview of all the available workload-optimized solutions
that are based on the x3850 X6 server, which cover the entry-level and
mid-range requirements.

[Figure: the x3850 X6 building blocks plotted by number of processors (two or
four) against memory size (128 GB to 2 TB). Models up to 1 TB support Business
Warehouse (BW) and Suite on HANA (SoH); the 1.5 TB and 2 TB models support
SoH only. Scale-out is for BW only. An upgrade to eight processors is supported
with the x3950 X6 chassis (all parts reusable); some upgrades require replacing
memory modules with higher capacity modules.]

Figure 6-5 Portfolio of x3850 X6 based building blocks for SAP HANA

Figure 6-6 on page 159 shows all the building blocks for SAP HANA solutions
that are based on the x3950 X6 server, which covers the mid-range to high-end
requirements.


[Figure: the x3950 X6 building blocks plotted by number of processors (four or
eight) against memory size (256 GB to 6 TB). Four-socket models up to 1 TB and
eight-socket models up to 2 TB support Business Warehouse (BW) and Suite on
HANA (SoH); the larger configurations support SoH only. Scale-out is for BW
only; some upgrades require replacing memory modules with higher capacity
modules.]

Figure 6-6 Portfolio of x3950 X6 based building blocks for SAP HANA

Table 6-5 shows the technical details of all x3850 X6-based workload-optimized
models for SAP HANA. They cover the range of two and four socket solutions.
Table 6-5 Overview of x3850 X6 workload-optimized models for SAP HANA

Model        | CPUs^a                     | Main memory   | Business Warehouse | Scale-out BW             | Suite on HANA | Total rack space
AC3-2S-128   | 2x Intel Xeon E7-8880 v2   | 128 GB DDR3   | Yes                | No                       | Yes           | 4U
AC3-2S-256   | 2x Intel Xeon E7-8880 v2   | 256 GB DDR3   | Yes                | Yes, up to 4 nodes       | Yes           | 4U
AC3-2S-384   | 2x Intel Xeon E7-8880 v2   | 384 GB DDR3   | Yes                | No                       | Yes           | 4U
AC3-2S-512   | 2x Intel Xeon E7-8880 v2   | 512 GB DDR3   | Yes                | No                       | Yes           | 4U
AC3-4S-256   | 4x Intel Xeon E7-8880 v2   | 256 GB DDR3   | Yes                | No                       | Yes           | 4U
AC3-4S-512   | 4x Intel Xeon E7-8880 v2   | 512 GB DDR3   | Yes                | Yes, up to 56 nodes^b    | Yes           | 4U^c
AC3-4S-768   | 4x Intel Xeon E7-8880 v2   | 768 GB DDR3   | Yes                | Yes, up to 56 nodes^b    | Yes           | 6U
AC3-4S-1024  | 4x Intel Xeon E7-8880 v2   | 1 TB DDR3     | Yes                | Yes, up to 56 nodes^b    | Yes           | 6U
AC3-4S-1536  | 4x Intel Xeon E7-8880 v2   | 1.5 TB DDR3   | No                 | N/A                      | Yes           | 6U
AC3-4S-2048  | 4x Intel Xeon E7-8880 v2   | 2 TB DDR3     | No                 | N/A                      | Yes           | 6U

a. Alternative CPU types E7-4880 v2, E7-4890 v2 (both support no upgrade to eight sockets), and E7-8890 v2 are available upon request. For more information, see "CPU and memory" on page 112.
b. Support for up to 224 nodes is verified by IBM development and presented to SAP.
c. 6U when used in a scale-out environment.

Table 6-6 shows the technical details of all x3950 X6 based workload-optimized
models for SAP HANA. They cover the range of four and eight socket solutions.
Table 6-6 Overview of IBM System x3950 X6 workload-optimized models for SAP HANA

Model        | CPUs^a                     | Main memory   | Business Warehouse | Scale-out BW             | Suite on HANA | Total rack space
AC4-4S-256   | 4x Intel Xeon E7-8880 v2   | 256 GB DDR3   | Yes                | Yes, up to 56 nodes^b    | Yes           | 8U
AC4-4S-512   | 4x Intel Xeon E7-8880 v2   | 512 GB DDR3   | Yes                | Yes, up to 56 nodes^b    | Yes           | 8U
AC4-4S-768   | 4x Intel Xeon E7-8880 v2   | 768 GB DDR3   | Yes                | Yes, up to 56 nodes^b    | Yes           | 8U
AC4-4S-1024  | 4x Intel Xeon E7-8880 v2   | 1 TB DDR3     | Yes                | Yes, up to 56 nodes^b    | Yes           | 8U
AC4-4S-1536  | 4x Intel Xeon E7-8880 v2   | 1.5 TB DDR3   | No                 | N/A                      | Yes           | 8U
AC4-4S-2048  | 4x Intel Xeon E7-8880 v2   | 2 TB DDR3     | No                 | N/A                      | Yes           | 8U
AC4-8S-512   | 8x Intel Xeon E7-8880 v2   | 512 GB DDR3   | Yes                | No                       | Yes           | 8U
AC4-8S-1024  | 8x Intel Xeon E7-8880 v2   | 1 TB DDR3     | Yes                | Yes, up to 56 nodes^b    | Yes           | 8U
AC4-8S-1536  | 8x Intel Xeon E7-8880 v2   | 1.5 TB DDR3   | Yes                | Yes, up to 56 nodes^b    | Yes           | 8U
AC4-8S-2048  | 8x Intel Xeon E7-8880 v2   | 2 TB DDR3     | Yes                | Yes, up to 56 nodes^b    | Yes           | 8U^c
AC4-8S-3072  | 8x Intel Xeon E7-8880 v2   | 3 TB DDR3     | No                 | N/A                      | Yes           | 10U
AC4-8S-4096  | 8x Intel Xeon E7-8880 v2   | 4 TB DDR3     | No                 | N/A                      | Yes           | 10U
AC4-8S-6144  | 8x Intel Xeon E7-8880 v2   | 6 TB DDR3     | No                 | N/A                      | Yes           | 10U

a. Alternative CPU type E7-8890 v2 is supported for enhanced performance. Available upon request.
b. Support for up to 224 nodes is verified by IBM development and presented to SAP.
c. 10U when used in a scale-out environment.

Note: For non-productive database instances, SAP maintains relaxed
hardware requirements, which allows you to have more memory in a server,
for example, for test or training instances of SAP HANA.
The following sections list only solutions that are certified by SAP for
productive environments.
All X6 servers physically support the installation of additional memory beyond
what is listed in this section. At the time of writing, IBM can provide X6
systems with 3 TB, 4 TB, 6 TB, 8 TB, or 12 TB of main memory for
non-productive SAP HANA landscapes.

6.2.1 Single-node X6 solution for Business Warehouse


A single-node solution is the simplest possible implementation of an SAP HANA
environment. Depending on the sizing requirements, you choose the appropriate
building block from the available set of X6 workload-optimized solutions.


Figure 6-7 gives an overview of all X6 based models that are available for a
single-node SAP BW environment.
[Figure: the single-node X6 solutions for Business Warehouse plotted by number
of processors against memory size (128 GB to 2 TB), with x3850 X6 models
below a dashed line and x3950 X6 models above it. An upgrade is supported by
replacing the mechanical chassis; some upgrades require replacing memory
modules with higher capacity modules.]

Figure 6-7 Portfolio of single-node X6 based solutions for Business Warehouse (x3850 X6 and x3950 X6)

The lower half of Figure 6-7 (below the dashed line) shows the available models
that are based on the x3850 X6 server, and the upper half shows the models that
are based on the x3950 X6 server. There are four-socket models for both of
them. You can physically upgrade an x3850 X6 to an x3950 X6 server by
replacing the mechanical enclosure that houses the CPU Books, Storage Books,
and I/O Books. The Books can be used in both server types.
Table 6-7 shows the technical details of the x3850 X6 workload-optimized
models for single-node Business Warehouse.


Table 6-7 IBM System x3850 X6 workload-optimized models for single-node Business Warehouse

Model        | CPUs^a                     | Main memory                              | Storage (data and log)                                   | Upgrade options
AC3-2S-128   | 2x Intel Xeon E7-8880 v2   | 128 GB DDR3 (16x 8 GB)                   | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)        | 128 → 256
AC3-2S-256   | 2x Intel Xeon E7-8880 v2   | 256 GB DDR3 (32x 8 GB or 16x 16 GB)      | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)        | 256 → 384, 256 → 512, 2S → 4S
AC3-2S-384   | 2x Intel Xeon E7-8880 v2   | 384 GB DDR3 (48x 8 GB)                   | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)        | 256 → 512, 2S → 4S-512
AC3-2S-512   | 2x Intel Xeon E7-8880 v2   | 512 GB DDR3 (32x 16 GB or 16x 32 GB)     | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)        | 2S → 4S
AC3-4S-256   | 4x Intel Xeon E7-8880 v2   | 256 GB DDR3 (32x 8 GB)                   | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)        | 256 → 512
AC3-4S-512   | 4x Intel Xeon E7-8880 v2   | 512 GB DDR3 (64x 8 GB or 32x 16 GB)      | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)        | 512 → 768, 512 → 1 TB
AC3-4S-768   | 4x Intel Xeon E7-8880 v2   | 768 GB DDR3 (96x 8 GB)                   | 13.2 TB (4x 400 GB SAS SSD, 13x 1.2 TB 10K SAS HDD)^b    | None
AC3-4S-1024  | 4x Intel Xeon E7-8880 v2   | 1 TB DDR3 (64x 16 GB or 32x 32 GB)       | 13.2 TB (4x 400 GB SAS SSD, 13x 1.2 TB 10K SAS HDD)^b    | 1 TB → 1.5 TB, 1 TB → 2 TB

a. Alternative CPU types E7-4880 v2, E7-4890 v2 (both support no upgrade to eight sockets), and E7-8890 v2 are available upon request. For more information, see "CPU and memory" on page 112.
b. Additional drives are installed in the EXP2524 expansion unit.


Table 6-8 shows the technical details of the x3950 X6 workload-optimized
models for single-node Business Warehouse.
Table 6-8 Overview of x3950 X6 workload-optimized models for SAP HANA

Model        | CPUs^a                     | Main memory                              | Storage (data and log)                                   | Upgrade options
AC4-4S-256   | 4x Intel Xeon E7-8880 v2   | 256 GB DDR3 (32x 8 GB)                   | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)        | 256 → 512
AC4-4S-512   | 4x Intel Xeon E7-8880 v2   | 512 GB DDR3 (64x 8 GB or 32x 16 GB)      | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)        | 512 → 768, 512 → 1 TB, 4S → 8S
AC4-4S-768   | 4x Intel Xeon E7-8880 v2   | 768 GB DDR3 (96x 8 GB)                   | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)       | 4S → 8S-1 TB
AC4-4S-1024  | 4x Intel Xeon E7-8880 v2   | 1 TB DDR3 (64x 16 GB or 32x 32 GB)       | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)       | 1 TB → 1.5 TB, 1 TB → 2 TB, 4S → 8S
AC4-4S-1536  | 4x Intel Xeon E7-8880 v2   | 1.5 TB DDR3 (96x 16 GB)                  | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)       | 4S → 8S
AC4-4S-2048  | 4x Intel Xeon E7-8880 v2   | 2 TB DDR3 (64x 32 GB or 32x 64 GB)       | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)       | None
AC4-8S-512   | 8x Intel Xeon E7-8880 v2   | 512 GB DDR3 (64x 8 GB)                   | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)        | 512 → 1 TB
AC4-8S-1024  | 8x Intel Xeon E7-8880 v2   | 1 TB DDR3 (128x 8 GB or 64x 16 GB)       | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)       | 1 TB → 1.5 TB, 1 TB → 2 TB
AC4-8S-1536  | 8x Intel Xeon E7-8880 v2   | 1.5 TB DDR3 (192x 8 GB)                  | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)       | None
AC4-8S-2048  | 8x Intel Xeon E7-8880 v2   | 2 TB DDR3 (128x 16 GB or 64x 32 GB)      | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)       | 2 TB → 3 TB, 2 TB → 4 TB

a. Alternative CPU type E7-8890 v2 is supported for enhanced performance. Available upon request.

164

In-memory Computing with SAP HANA on IBM eX5 and X6 Systems

You can also grow into a clustered scale-out environment without changing any
of the hardware of the single-node solution. The upgrade to scale-out is
supported for memory configurations with 256 GB, 512 GB, 768 GB, 1 TB,
1.5 TB, or 2 TB (as denoted by the multiple boxes in Figure 6-7 on page 162).
Scale-out of other memory sizes is not supported because adding more memory
first (to scale up) to reach one of the supported memory sizes gives better
performance than adding additional nodes.
Upgrading the server requires system downtime. However, because GPFS can add storage capacity to an existing GPFS file system by simply adding devices, the data on the system remains intact. Nevertheless, back up the data before changing the system's configuration.

Support for business continuity


All the single-node solutions that are listed in Table 6-7 on page 163 and
Table 6-8 on page 164 support the following features for business continuity:
- HA within a single data center
- HA across data centers (metro distance)
- HA within a single data center plus DR to a secondary site
- HA across data centers (metro distance) plus DR to a third site
- DR to a secondary site

All HA and DR solutions can be implemented using either GPFS-based storage
replication or SAP HANA System Replication. You also can use the standby
nodes in HA and DR solutions to run a second SAP HANA instance for
non-productive purposes. For more information about these architectures, see
7.2, "HA and DR for single-node SAP HANA" on page 196.
Business continuity features can easily be added after the initial
implementation: only additional hardware must be bought, and no parts must be
retired.

6.2.2 Single-node X6 solution for SAP Business Suite on HANA


SAP Business Suite is a set of applications that are bundled and designed to run
an entire business. When running Business Suite on SAP HANA, the memory
per core ratio differs from a BW scenario because the type of workload on the
database server is different. This difference allows IBM to offer additional
building blocks with more main memory than for SAP BW.


Figure 6-8 shows all the building blocks that you can choose from when you plan
for SAP Business Suite powered by SAP HANA.

[Figure: two-axis portfolio chart. The vertical axis shows processors (x3850 X6 models up to four sockets, x3950 X6 models up to eight sockets; the upgrade is supported by replacing the mechanical chassis), and the horizontal axis shows memory sizes from 128 GB to 6 TB. Some upgrades require replacing memory modules with higher capacity modules.]
Figure 6-8 Portfolio of single-node X6 based models for SAP Business Suite powered by SAP HANA

The lower half of Figure 6-8 (below the dashed line at four processors) shows the
available models that are based on the x3850 X6 server, and the upper half
shows the models that are based on the x3950 X6 server. Four-socket models
exist for both. You can physically upgrade an x3850 X6 to an x3950 X6 server by
replacing the mechanical enclosure that houses the CPU Books, Storage Books,
and I/O Books. The Books can be used in both server types.


Table 6-9 lists the technical details of all x3850 X6-based models for SAP HANA
running SAP Business Suite.
Table 6-9 Overview of x3850 X6 workload-optimized models for Business Suite on SAP HANA
Model       | CPUs (a)                 | Main memory                          | Storage (data and log)                               | Upgrade options
AC3-2S-128  | 2x Intel Xeon E7-8880 v2 | 128 GB DDR3 (16x 8 GB)               | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)    | 128 GB → 256 GB
AC3-2S-256  | 2x Intel Xeon E7-8880 v2 | 256 GB DDR3 (32x 8 GB or 16x 16 GB)  | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)    | 256 GB → 384 GB; 256 GB → 512 GB; 2S → 4S
AC3-2S-384  | 2x Intel Xeon E7-8880 v2 | 384 GB DDR3 (48x 8 GB)               | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)    | 256 GB → 512 GB; 2S → 4S-512
AC3-2S-512  | 2x Intel Xeon E7-8880 v2 | 512 GB DDR3 (32x 16 GB or 16x 32 GB) | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)    | 2S → 4S
AC3-4S-256  | 4x Intel Xeon E7-8880 v2 | 256 GB DDR3 (32x 8 GB)               | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)    | 256 GB → 512 GB
AC3-4S-512  | 4x Intel Xeon E7-8880 v2 | 512 GB DDR3 (64x 8 GB or 32x 16 GB)  | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)    | 512 GB → 768 GB; 512 GB → 1 TB
AC3-4S-768  | 4x Intel Xeon E7-8880 v2 | 768 GB DDR3 (96x 8 GB)               | 13.2 TB (4x 400 GB SAS SSD, 13x 1.2 TB 10K SAS HDD) (b) | None
AC3-4S-1024 | 4x Intel Xeon E7-8880 v2 | 1 TB DDR3 (64x 16 GB or 32x 32 GB)   | 13.2 TB (4x 400 GB SAS SSD, 13x 1.2 TB 10K SAS HDD) (b) | 1 TB → 1.5 TB; 1 TB → 2 TB
AC3-4S-1536 | 4x Intel Xeon E7-8880 v2 | 1.5 TB DDR3 (96x 16 GB)              | 13.2 TB (4x 400 GB SAS SSD, 13x 1.2 TB 10K SAS HDD) (b) | None
AC3-4S-2048 | 4x Intel Xeon E7-8880 v2 | 2 TB DDR3 (64x 32 GB or 32x 64 GB)   | 13.2 TB (4x 400 GB SAS SSD, 13x 1.2 TB 10K SAS HDD) (b) | None
a. Alternative CPU types E7-4880 v2, E7-4890 v2 (both support no upgrade to eight sockets), and E7-8890 v2 are available upon request. For more information, see "CPU and memory" on page 112.
b. Additional drives are installed in the EXP2524 expansion unit.


Table 6-10 shows the technical details of all x3950 X6-based models for SAP
Business Suite powered by SAP HANA.
Table 6-10 Overview of x3950 X6 workload-optimized models for Business Suite on SAP HANA
Model       | CPUs (a)                 | Main memory                          | Storage (data and log)                               | Upgrade options
AC4-4S-256  | 4x Intel Xeon E7-8880 v2 | 256 GB DDR3 (32x 8 GB)               | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)    | 256 GB → 512 GB
AC4-4S-512  | 4x Intel Xeon E7-8880 v2 | 512 GB DDR3 (64x 8 GB or 32x 16 GB)  | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)    | 512 GB → 768 GB; 512 GB → 1 TB; 4S → 8S
AC4-4S-768  | 4x Intel Xeon E7-8880 v2 | 768 GB DDR3 (96x 8 GB)               | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)   | 4S → 8S-1 TB
AC4-4S-1024 | 4x Intel Xeon E7-8880 v2 | 1 TB DDR3 (64x 16 GB or 32x 32 GB)   | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)   | 1 TB → 1.5 TB; 1 TB → 2 TB; 4S → 8S
AC4-4S-1536 | 4x Intel Xeon E7-8880 v2 | 1.5 TB DDR3 (96x 16 GB)              | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)   | 4S → 8S
AC4-4S-2048 | 4x Intel Xeon E7-8880 v2 | 2 TB DDR3 (64x 32 GB or 32x 64 GB)   | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)   | None
AC4-8S-512  | 8x Intel Xeon E7-8880 v2 | 512 GB DDR3 (64x 8 GB)               | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB 10K SAS HDD)    | 512 GB → 1 TB
AC4-8S-1024 | 8x Intel Xeon E7-8880 v2 | 1 TB DDR3 (128x 8 GB or 64x 16 GB)   | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)   | 1 TB → 1.5 TB; 1 TB → 2 TB
AC4-8S-1536 | 8x Intel Xeon E7-8880 v2 | 1.5 TB DDR3 (192x 8 GB)              | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)   | None
AC4-8S-2048 | 8x Intel Xeon E7-8880 v2 | 2 TB DDR3 (128x 16 GB or 64x 32 GB)  | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB 10K SAS HDD)   | 2 TB → 3 TB; 2 TB → 4 TB
AC4-8S-3072 | 8x Intel Xeon E7-8880 v2 | 3 TB DDR3 (192x 16 GB)               | 19.2 TB (6x 400 GB SAS SSD, 19x 1.2 TB 10K SAS HDD) (b) | None
AC4-8S-4096 | 8x Intel Xeon E7-8880 v2 | 4 TB DDR3 (128x 32 GB or 64x 64 GB)  | 19.2 TB (6x 400 GB SAS SSD, 19x 1.2 TB 10K SAS HDD) (b) | 4 TB → 6 TB
AC4-8S-6144 | 8x Intel Xeon E7-8880 v2 | 6 TB DDR3 (192x 32 GB)               | 28.8 TB (8x 400 GB SAS SSD, 28x 1.2 TB 10K SAS HDD) (b) | None
a. Alternative CPU type E7-8890 v2 is supported for enhanced performance. Available upon request.
b. Additional drives are installed in the EXP2524 expansion unit.

Upgrading the server requires system downtime. However, because GPFS can add storage capacity to an existing GPFS file system by simply adding devices, the data on the system remains intact. Nevertheless, back up the data before changing the system's configuration.

Support for business continuity


Implementing business continuity features is highly recommended because
SAP Business Suite runs the entire business and its processes. Downtime of
these applications most often has a significant impact on the business. Clients in
the manufacturing industry, for example, often face a halt of their production
lines if the supply chain system is not available.
All single-node solutions that are listed in Table 6-9 on page 167 and Table 6-10
on page 168 support the following business continuity features:
- HA within a single data center
- HA across data centers (metro distance)
- HA within a single data center plus DR to a secondary site
- HA across data centers (metro distance) plus DR to a third site
- DR to a secondary site

All HA and DR solutions can be implemented using either GPFS-based storage
replication or SAP HANA System Replication. You also can use the standby
nodes in HA and DR solutions to run a second SAP HANA instance for
non-productive purposes. For more information about these architectures, see
7.2, "HA and DR for single-node SAP HANA" on page 196.
Business continuity features can easily be added after the initial
implementation: only additional hardware must be bought, and no parts must be
retired.
At the time of writing, scale-out SAP HANA environments are not supported
when running SAP Business Suite.


6.2.3 Scale-out X6 solution for Business Warehouse


Customers whose BW databases do not fit into one single node must scale out
their SAP HANA installation, which means clustering multiple single nodes
together with a high-speed interconnect (10 Gbit Ethernet) and spreading the
database tables across the participating nodes.
Figure 6-9 gives an overview of all X6 based building blocks that are supported in
such scale-out scenarios.

[Figure: two-axis portfolio chart. The vertical axis shows processors (x3850 X6 models with two or four sockets, x3950 X6 models with up to eight sockets; the upgrade is supported by replacing the mechanical chassis), and the horizontal axis shows memory sizes from 256 GB to 2 TB. Some upgrades require replacing memory modules with higher capacity modules.]
Figure 6-9 Portfolio of all X6 based solutions for scale-out Business Warehouse

The lower half of the figure shows models that are based on the x3850 X6 server,
and the upper half shows models that are based on the x3950 X6 server. Models
with four sockets exist using both servers. You can physically upgrade an x3850
X6 server to an x3950 X6 server by replacing the x3850 4U mechanical chassis
with the 8U version for the x3950 X6. All CPU Books, I/O Books, and Storage
Books can be reused in the 8U chassis.
Table 6-11 on page 171 lists the technical details of all workload-optimized x3850
X6 models in a scale-out configuration.


Table 6-11 Overview of x3850 X6 models for a scale-out SAP Business Warehouse
Model         | CPUs (a)                 | Main memory                          | Storage (data and log)                            | Upgrade options                | Maximum cluster size
AC3-2S-256-C  | 2x Intel Xeon E7-8880 v2 | 256 GB DDR3 (32x 8 GB or 16x 16 GB)  | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB SAS HDD)     | 2S → 4S-512                    | 4 nodes
AC3-4S-512-C  | 4x Intel Xeon E7-8880 v2 | 512 GB DDR3 (64x 8 GB or 32x 16 GB)  | 13.2 TB (4x 400 GB SAS SSD, 13x 1.2 TB SAS HDD) (b) | 512 GB → 768 GB; 512 GB → 1 TB | 56 nodes
AC3-4S-768-C  | 4x Intel Xeon E7-8880 v2 | 768 GB DDR3 (96x 8 GB or 48x 16 GB)  | 13.2 TB (4x 400 GB SAS SSD, 13x 1.2 TB SAS HDD) (b) | 768 GB → 1 TB                  | 56 nodes
AC3-4S-1024-C | 4x Intel Xeon E7-8880 v2 | 1 TB DDR3 (64x 16 GB or 32x 32 GB)   | 13.2 TB (4x 400 GB SAS SSD, 13x 1.2 TB SAS HDD) (b) | None                           | 56 nodes
a. Alternative CPU types E7-4880 v2, E7-4890 v2 (both support no upgrade to eight sockets), and E7-8890 v2 are available upon request. For more information, see "CPU and memory" on page 112.
b. Additional drives are installed in the EXP2524 expansion unit.

Table 6-12 lists the technical details of all workload-optimized x3950 X6 models
in a scale-out configuration.
Table 6-12 Overview of x3950 X6 models for scale-out SAP Business Warehouse
Model         | CPUs (a)                 | Main memory                          | Storage (data and log)                              | Upgrade options                | Maximum cluster size
AC4-4S-256-C  | 4x Intel Xeon E7-8880 v2 | 256 GB DDR3 (32x 8 GB)               | 3.6 TB (2x 400 GB SAS SSD, 4x 1.2 TB SAS HDD)       | 256 GB → 512 GB                | 56 nodes
AC4-4S-512-C  | 4x Intel Xeon E7-8880 v2 | 512 GB DDR3 (64x 8 GB or 32x 16 GB)  | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB SAS HDD)      | 512 GB → 768 GB; 512 GB → 1 TB | 56 nodes
AC4-4S-768-C  | 4x Intel Xeon E7-8880 v2 | 768 GB DDR3 (96x 8 GB)               | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB SAS HDD)      | 768 GB → 1 TB; 4S → 8S         | 56 nodes
AC4-4S-1024-C | 4x Intel Xeon E7-8880 v2 | 1 TB DDR3 (64x 16 GB or 32x 32 GB)   | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB SAS HDD)      | 4S → 8S                        | 56 nodes
AC4-8S-1024-C | 8x Intel Xeon E7-8880 v2 | 1 TB DDR3 (128x 8 GB or 64x 16 GB)   | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB SAS HDD)      | 1 TB → 1.5 TB; 1 TB → 2 TB     | 56 nodes
AC4-8S-1536-C | 8x Intel Xeon E7-8880 v2 | 1.5 TB DDR3 (192x 8 GB)              | 9.6 TB (4x 400 GB SAS SSD, 10x 1.2 TB SAS HDD)      | 1.5 TB → 2 TB                  | 56 nodes
AC4-8S-2048-C | 8x Intel Xeon E7-8880 v2 | 2 TB DDR3 (128x 16 GB or 64x 32 GB)  | 19.2 TB (6x 400 GB SAS SSD, 19x 1.2 TB SAS HDD) (b) | None                           | 56 nodes
a. Alternative CPU type E7-8890 v2 is supported for enhanced performance. Available upon request.
b. Additional drives are installed in the EXP2524 expansion unit.

An SAP HANA cluster environment can consist only of building blocks with the
same memory size. Mixing different memory sizes in a cluster is not supported.
For example, when upgrading from a cluster with 512 GB nodes to 1 TB nodes,
you must add the additional memory to every node.
Note: The minimum number of nodes in any SAP HANA scale-out
environment is three worker nodes. A cluster of two worker nodes is not
supported. Adding one standby node for business continuity results in a
four-node cluster being the minimum that is supported.
The maximum number of nodes that is available for implementation is 56
(except for model AC3-2S-256-C, which is limited to four nodes). However,
IBM validates feasibility for up to 224 nodes in its labs to accommodate
growing customer demand.
If you are interested in solutions beyond 56 nodes, contact your IBM
representative or email the IBM SAP International Competence Center at
isicc@de.ibm.com.
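As a quick sanity check, the node-count rules above can be expressed in a few lines of Python; the function below is purely illustrative and not part of any SAP or IBM tooling.

```python
def validate_cluster(workers: int, standbys: int = 0, max_nodes: int = 56) -> None:
    """Check an SAP HANA scale-out plan against the rules stated above."""
    if workers < 3:
        # A cluster of two worker nodes is not supported; three is the minimum.
        raise ValueError("at least 3 worker nodes are required")
    total = workers + standbys
    if total > max_nodes:
        # 56 nodes is the general maximum (4 for model AC3-2S-256-C).
        raise ValueError(f"{total} nodes exceeds the supported maximum of {max_nodes}")

# Three workers plus one standby: the smallest supported cluster with HA.
validate_cluster(workers=3, standbys=1)
```

Passing `max_nodes=4` models the tighter limit of the AC3-2S-256-C building block.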

Support for business continuity


All X6 scale-out solutions that are listed in Table 6-11 on page 171 and
Table 6-12 on page 171 support the following features for business continuity:
- HA within the cluster to protect against node failures
- DR that replicates data to a secondary data center
- A combination of the HA and DR features above


HA can be implemented using GPFS-based storage replication. DR solutions
can be implemented using either GPFS-based storage replication or SAP HANA
System Replication. You also can use the idling DR site nodes to run a second
SAP HANA instance for non-productive purposes. For more information about
these solutions, see 7.3, "HA and DR for scale-out SAP HANA" on page 221.

Single-node to scale-out solution upgrade requirements

With X6 systems, the scale-out solution for the IBM Systems Solution for SAP
HANA builds on the same building blocks that are used in a single-server
installation. Additional hardware and software components are needed to
complement the basic building blocks when upgrading a single-node solution
into a scale-out solution, as described in this section.
Additional hardware must be bought only when scaling out using models
AC3-4S-512 or AC4-8S-2048. For these two models, additional storage capacity
is required for the second GPFS replica that is generated in scale-out
environments. This second replica does not exist for single nodes. The hardware
to acquire includes an EXP2524 storage expansion enclosure, two 400 GB SAS
SSDs, nine 1.2 TB 10K SAS disk drives, an IBM ServeRAID M5120 SAS
controller, and one SAS cable to connect the storage unit with the M5120
adapter.
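GPFS expands a file system by adding new Network Shared Disks (NSDs) that are described in a disk-descriptor (stanza) file. The following Python sketch only generates example stanza text for the nine added drives; the device paths, NSD names, server name, and file system name are illustrative assumptions, not values from an actual installation.

```python
def nsd_stanza(device: str, nsd: str, server: str, failure_group: int) -> str:
    """Render one NSD stanza for a GPFS disk-descriptor file.

    The stanza layout follows the general GPFS stanza format; all
    values passed in here are illustrative examples only.
    """
    return (
        "%nsd:\n"
        f"  device={device}\n"
        f"  nsd={nsd}\n"
        f"  servers={server}\n"
        "  usage=dataAndMetadata\n"
        f"  failureGroup={failure_group}\n"
    )

# Describe the nine 1.2 TB drives that the EXP2524 expansion unit adds.
stanzas = "".join(
    nsd_stanza(f"/dev/sd{chr(ord('c') + i)}", f"node1data{i + 1}", "hananode01", 1)
    for i in range(9)
)
# The resulting file would then be handed to GPFS, for example:
#   mmadddisk sapmntdata -F /tmp/new_disks.stanza
```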
Depending on the building blocks that are used, additional GPFS licenses might
be needed for the scale-out solution. The GPFS on x86 Single Server for
Integrated Offerings V3 license provides file system capabilities for single-node
integrated offerings only; it does not cover usage in multi-node environments,
such as the scale-out solution described here. To use building blocks that come
with GPFS on x86 Single Server for Integrated Offerings licenses in a scale-out
solution, GPFS on x86 Server licenses must be obtained for these building
blocks. Alternatively, GPFS File Placement Optimizer licenses can be used
together with GPFS on x86 Server licenses: in a scale-out configuration, a
minimum of three nodes must use GPFS on x86 Server licenses, and the
remaining nodes can use GPFS File Placement Optimizer licenses. Other
setups, such as the disaster recovery solution that is described in 7.3, "HA and
DR for scale-out SAP HANA" on page 221, might require more nodes using
GPFS on x86 Server licenses, depending on the role of the nodes in the setup.
Section "GPFS license information" on page 256 has an overview of the GPFS
license types, which type of license comes with a specific model, and the
number of processor value units (PVUs) that is needed.


6.3 Migrating from eX5 to X6 servers


With the introduction of the latest Intel processor generation, new server models
are introduced. These models all have processors with 15 cores each, whereas
in the older generation each processor had only 10 cores. In addition, each
current generation core is more powerful than a previous generation core. These
two factors lead to a shift in the memory per core ratio that must be taken into
account when comparing eX5-based models with X6-based models.
The following two sections describe migration scenarios to X6 servers for
customers that already have eX5-based SAP HANA models in their data center.
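To make the ratio shift concrete, here is the arithmetic for an 8-socket 1 TB eX5 node versus the 8-socket 2 TB X6 model AC4-8S-2048, using the core counts stated above (10 versus 15 cores per processor); the helper function is illustrative only.

```python
def gb_per_core(memory_gb: int, sockets: int, cores_per_socket: int) -> float:
    """Memory-per-core ratio of a node, in GB per physical core."""
    return memory_gb / (sockets * cores_per_socket)

# 8-socket eX5 node with 1 TB: 80 cores in total.
ex5_ratio = gb_per_core(1024, sockets=8, cores_per_socket=10)  # 12.8 GB/core

# 8-socket X6 node (AC4-8S-2048) with 2 TB: 120 cores in total.
x6_ratio = gb_per_core(2048, sockets=8, cores_per_socket=15)   # ~17.1 GB/core
```

Even though the X6 node doubles the memory, it does not double the core count, so the supported GB-per-core ratio moves upward.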

6.3.1 Disruptive migration


SAP HANA environments with only one single-node server, or cluster
environments that can accommodate downtime of their SAP HANA database
instance, have two different ways to migrate from eX5 to X6 technology.
The first approach is to take a backup of the eX5 environment and then restore it
on an X6-based workload-optimized model. This approach works for single
nodes and scale-out landscapes, but for scale-out landscapes, the number of
nodes is not allowed to change between backup and restore. So, if you plan to
migrate from a cluster with, for example, five eX5 nodes to three X6 nodes, this
approach does not work.2
The second migration strategy involves reloading the database with data from
the outside replication source. This approach does not require any preparations
on the eX5 nodes before they are decommissioned. You replace the eX5
environment with X6 nodes and start from scratch. This approach supports a
modified number of X6 nodes in a scale-out environment.

6.3.2 Hybrid SAP HANA cluster with eX5 and X6 nodes


Customers who have a scale-out SAP HANA environment running on eX5-based
models can grow their database size with X6 nodes. Adding X6-based
workload-optimized servers to a running eX5-based cluster is supported.
Step-by-step instructions for the necessary commands can be found in the
Operations Guide outlined in 8.2, "IBM SAP HANA Operations Guide" on
page 246.
2. Although the latest releases of SAP HANA support a change in the number of nodes during a
backup and restore procedure, this approach does not give the best performance. The change in
landscape is mimicked by running multiple index server processes on one target machine. Avoid
this scenario for optimal performance.

Note: At the time of writing, SAP required the exact same number of cores
and amount of memory on every eX5 and X6 model. Following the new
memory per core ratio on the X6 nodes is not supported in hybrid clusters.
Therefore, hybrid scale-out environments are supported by the following
models only:
- Two sockets: 256 GB memory (T-shirt size S with AC3-2S-256-C)
- Four sockets: 512 GB memory (M with AC3-4S-512-C or AC4-4S-512-C)
- Eight sockets: 1 TB memory (L with AC4-8S-1024-C)
A hybrid SAP HANA scale-out environment must have more than 50% of the
nodes still being eX5 models. When the majority of the nodes would be X6-based,
all eX5 servers must be excluded from the cluster. A hybrid cluster with an equal
number of eX5 and X6 nodes is not supported. Node designation (SAP HANA
worker or standby) does not matter here. As an example, if your cluster has five
worker nodes and one standby node running on eX5 technology, you can add up
to five X6 building blocks to it.
This rule leads to different upgrade possibilities. Slow-growing environments
can add X6 nodes to their cluster without hitting the 50% rule anytime soon.
Faster-growing environments must take a migration scenario into account.
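The 50% rule can be expressed compactly; this helper is an illustrative sketch, not an SAP tool.

```python
def max_x6_additions(ex5_nodes: int) -> int:
    """Largest number of X6 nodes that keeps eX5 nodes in the majority.

    More than 50% of the cluster must still be eX5 models, and an equal
    split is not supported, so the X6 count must stay strictly below the
    eX5 count. Worker or standby designation is irrelevant.
    """
    return ex5_nodes - 1

# Five workers plus one standby on eX5: up to five X6 nodes may join.
assert max_x6_additions(6) == 5
```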


Figure 6-10 is an example of how to plan for a transition in which no
migration-specific spare capacity is acquired and all server elements can be
reused. The cluster in this figure consists of five eX5 nodes with eight sockets
and 1 TB each (T-shirt size L), which gives you a total memory size of 5 TB for
the cluster (see the upper part of Figure 6-10).

[Figure: three stages of the transition. Top: five 1 TB eX5 nodes (5 TB total memory). Middle: as the HANA database grows, 1 TB X6 nodes are added to the eX5 cluster (9 TB total memory). Bottom: the cluster is split into an eX5-only part and an X6-only part, and additional memory is added to the X6 nodes (2 TB each) to make up for the excluded eX5 nodes (8 or 10 TB total memory).]
Figure 6-10 Example transition of an eX5 based cluster into an X6 based cluster

The database then grows and the customer decides to add X6 models to prepare
for a migration at some point. Up to four X6 based workload-optimized solutions
can be added to the existing five-node cluster. The new X6 nodes must have eight
sockets and 1 TB of memory each (see the middle part of Figure 6-10), which
allows the cluster to grow up to 9 TB total.
Adding further X6 models would break the 50% rule. That is the correct point to
exclude all eX5 models from the cluster, leaving the database instance on X6
nodes only. To accommodate the missing 5 TB of the eX5 nodes, you add
additional memory to each of the X6 servers and use the new memory per core
ratio on X6: you add 1 TB per X6 node. Depending on the fill level of the
database before the split, you might need one additional X6 node (see the lower
part of Figure 6-10). The total memory size of the cluster is then 8 TB (four X6
nodes only) or 10 TB (with one new X6 node).
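The memory totals of the three stages work out as simple sums (sizes in TB, node counts as in the example above):

```python
# Stage 1: the initial cluster of five 1 TB eX5 nodes.
stage1 = 5 * 1            # 5 TB total

# Stage 2: four 1 TB X6 nodes are added to the eX5 cluster.
stage2 = 5 * 1 + 4 * 1    # 9 TB total

# Stage 3: eX5 nodes are excluded and each X6 node grows to 2 TB.
stage3_min = 4 * 2        # 8 TB with the four existing X6 nodes
stage3_max = 5 * 2        # 10 TB if one additional X6 node is needed
```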


The five decommissioned eX5 based workload-optimized servers can be
coupled into an eX5-only cluster and used for either another production SAP
HANA instance or any other non-production instance, such as development or
test.
Although this approach, including the example, allows for a migration from eX5 to
X6 that does not involve taking down your SAP HANA database instance, it is
preferable to reconfigure SAP HANA to use the additional features of the X6
models. This reconfiguration can, for example, be combined with a regular SAP
HANA database maintenance window.

6.4 SAP HANA on VMware vSphere


One way of consolidating multiple SAP HANA instances on one system3 is
virtualization.
On 6 May 2014, VMware and SAP announced4 controlled availability for
deploying SAP HANA in virtual environments for production use using VMware
vSphere 5.5 and SAP HANA SPS07 (or later). It is supported under the following
conditions:
- Only one single SAP HANA virtual machine per physical server is allowed.
- Only scale-up single-server appliance configurations are supported.
  Scale-out configurations are not supported.
- SAP HANA is virtualized with VMware vSphere 5.5 on single-node hardware
  configurations that are validated for SAP HANA or on SAP HANA tailored
  data center integration application-verified hardware.
- The installation of the SAP HANA appliance software into the VMware guest
  must be done by certified SAP HANA appliance vendors or their partners.
  Cloning of such virtual machines can then be done by the clients as needed.
- Up to 1 TB and 32 physical cores (64 HT cores) per VMware vSphere
  instance are supported. As a consequence, only four-socket machines are
  supported. Eight-socket configurations (such as the x3950 X6 allows for) are
  not supported.
- No overprovisioning of CPU or memory is allowed.

3. One SAP HANA system, as referred to in this section, can consist of one single server or multiple
servers in a clustered configuration.
4. See http://global12.sap.com/news-reader/index.epx?articleID=22775 and
http://www.saphana.com/community/blogs/blog/2014/05/06/at-the-seasside-with-sap-hana-on-vmware


Additionally, SAP has supported VMware vSphere 5.1 since the release of SAP
HANA 1.0 SPS 05 (that is, revision 45 and higher) for non-production use only.
Relaxed configuration rules apply to non-production environments: consolidation
of multiple SAP HANA VMs onto one physical server is allowed, and eight-socket
servers are supported as host systems.
VMware Tools is approved by SAP for installation inside the VM operating
system.
Table 6-13 summarizes the VMware scenarios with SAP HANA.
Table 6-13 Overview of VMware scenarios with SAP HANA
Feature                                   | Non-production SAP HANA instance | Production SAP HANA instance
Required SAP HANA SPS                     | SPS05 or later                   | SPS07 or later
Required VMware version                   | vSphere 5.1                      | vSphere 5.5
Single node deployment                    | Yes                              | Yes
Scale-out deployment                      | No                               | No
Maximum number of VMs per physical server | Multiple                         | 1
CPU overprovisioning                      | No                               | No
Memory overprovisioning                   | No                               | No

More information about SAP HANA virtualization with VMware vSphere is
contained in SAP Note 1788665, which also has an attached FAQ document.
This SAP Note is available for download at the following website:
http://service.sap.com/sap/support/notes/1788665
Further information is available at the following website:
http://www.saphana.com/docs/DOC-3334
VMware released a preferred practices guide for running SAP HANA on vSphere
5.5. It is available for download from the VMware website at the following URL:
http://www.vmware.com/files/pdf/SAP_HANA_on_vmware_vSphere_best_practices_guide.pdf


6.4.1 vSphere on eX5 workload-optimized models for SAP HANA


The System x solution for SAP HANA supports virtualization with VMware
vSphere on eX5 systems by using an embedded hypervisor, as described in
"Integrated virtualization" on page 97. The SAP HANA Virtualization FAQ states
that SAP allows non-production environments with multiple virtual machines
to be installed by using a concept of slots.
Each slot is a virtual machine that is created with 10 virtual CPUs (vCPUs) and
64 GB of memory. The standard rules for the operating system and sizes for SAP
HANA data and log file systems must be followed. Because of the resources
that are taken by the VMware ESXi server itself, one slot is reserved by SAP
definition, because no CPU/memory overcommitment is allowed in an SAP HANA
virtual machine.
So, a maximum of 15 slots is available on the x3950 X5
workload-optimized system for SAP HANA appliance, and a maximum of three
slots on the x3690 X5 workload-optimized system for SAP HANA appliance.
Figure 6-11 shows the maximum number of slots that is available per eX5
system.
[Figure: available SAP slots per hosting T-shirt size (XS, S/S+, and M). One slot is always reserved for the hypervisor (marked x); up to 15 slots are available on T-shirt size M, with fewer slots on the smaller sizes.]
Figure 6-11 SAP HANA possible slots per T-shirt size (x = reserved) on eX5 servers
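The slot arithmetic can be sketched as follows; the function is illustrative and simply encodes the 10 vCPU / 64 GB slot definition together with the ESXi 5 vCPU cap mentioned below.

```python
VCPUS_PER_SLOT = 10
MEMORY_GB_PER_SLOT = 64
ESXI5_MAX_VCPUS = 64  # per-guest limit of VMware ESXi 5

def vm_size(slots):
    """Return (vCPUs, memory in GB) for a VM that occupies `slots` slots."""
    vcpus = slots * VCPUS_PER_SLOT
    if vcpus > ESXI5_MAX_VCPUS:
        # Slot sizes 7 and 8 would need 70 or 80 vCPUs, above the cap.
        raise ValueError(f"VM{slots} needs {vcpus} vCPUs, above the ESXi 5 limit")
    return vcpus, slots * MEMORY_GB_PER_SLOT

assert vm_size(6) == (60, 384)  # VM6, the largest supported slot size
```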


Table 6-14 shows the configuration parameters for virtual machines of slot
sizes 1 - 8. Sizes 7 and 8 are not supported by VMware because its latest ESXi
server software restricts a guest to a maximum of 64 vCPUs. You can install one
or more non-production virtual machines of slot sizes 1 - 6 on any x3950 X5
workload-optimized solution for SAP HANA appliance.
Table 6-14 SAP HANA virtual machine sizes when virtualizing eX5 models
SAP T-shirt | SAP HANA support pack | IBM name | vCPUs (HT on) | Virtual memory | Required no. of slots | Total HDD | Total SSD
XXS  | SPS 05   | VM1     | 10 | 64 GB  | 1 | 352 GB  | 64 GB
XS   | SPS 05   | VM2     | 20 | 128 GB | 2 | 608 GB  | 128 GB
None | Manually | VM3     | 30 | 192 GB | 3 | 864 GB  | 192 GB
S    | SPS 05   | VM4     | 40 | 256 GB | 4 | 1120 GB | 256 GB
None | Manually | VM5     | 50 | 320 GB | 5 | 1376 GB | 320 GB
None | Manually | VM6     | 60 | 384 GB | 6 | 1632 GB | 384 GB
None | N/A      | VM7 (a) | 70 | 448 GB | 7 | 1888 GB | 448 GB
None | N/A      | VM8 (a) | 80 | 512 GB | 8 | 2144 GB | 512 GB
a. This slot size is not possible due to limitations of the VMware ESXi 5 hypervisor.

For production environments on vSphere on eX5, only one single virtual machine
is allowed per physical server.

6.4.2 vSphere on X6 workload-optimized models for SAP HANA


With the release of the recent Intel Xeon E7 v2 processor family, SAP revised
its configuration method for virtualizing SAP HANA on VMware vSphere.
Sizing a virtual environment is now identical to bare-metal sizing: to achieve the
preferred performance, the main memory and number of cores must follow the
same ratio that bare-metal environments follow. Disk storage requirements were
relaxed. Details are provided in a preferred practices white paper from VMware,
which is available at the following website:
http://www.vmware.com/files/pdf/SAP_HANA_on_vmware_vSphere_best_practices_guide.pdf
Running VMware on X6 workload-optimized models is limited to two- and
four-socket machines. Eight-socket machines cannot run a vSphere hypervisor
because the maximum supported configuration of a VM running on vSphere 5.5
is 64 vCPUs and 1 TB of memory; mapped to bare metal, these maximums can
be accommodated on a four-socket X6 server. The minimum recommended
configuration for a VM running SAP HANA is five cores and
64 GB of memory.
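A minimal sketch of these VM bounds, assuming two vCPUs per physical core with hyper-threading enabled; the function name and exact checks are illustrative, not an official sizing tool.

```python
def valid_hana_vm(physical_cores: int, memory_gb: int) -> bool:
    """Check a VM plan against the vSphere 5.5 bounds stated above.

    Upper bound: 64 vCPUs and 1 TB of memory per VM.
    Lower bound: five cores and 64 GB of memory.
    """
    vcpus = physical_cores * 2  # hyper-threading: two vCPUs per core
    return physical_cores >= 5 and vcpus <= 64 and 64 <= memory_gb <= 1024

assert valid_hana_vm(32, 1024)      # the maximum supported VM
assert not valid_hana_vm(40, 1024)  # 80 vCPUs exceed the vSphere 5.5 limit
```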

6.4.3 VMware vSphere licensing


VMware vSphere 5 comes in different editions, which differ in their feature
sets. For the deployment of SAP HANA on VMware vSphere, the number of
supported vCPUs per virtual machine (VM) is the most important difference:
- The Standard edition supports up to eight vCPUs per VM, which is below the
  number of vCPUs of the smallest SAP HANA VM size. Therefore, this edition
  cannot be used.
- The Enterprise edition supports up to 32 vCPUs per VM, and can be used for
  SAP HANA VMs with up to 30 vCPUs and 192 GB of virtual memory. This
  edition might be a cost-saving choice when deploying only smaller SAP
  HANA VMs, for example, for training or test purposes.
- The Enterprise Plus edition supports up to 64 vCPUs per VM, and thus can
  be used for SAP HANA VMs with up to 60 vCPUs and 384 GB of virtual
  memory, which is the maximum supported VM size.
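The edition limits above can be summarized in a short sketch. The function name and structure are illustrative only and are not part of any VMware tooling:

```python
def required_vsphere_edition(vcpus: int) -> str:
    """Return the lowest vSphere 5 edition whose per-VM vCPU limit covers
    the requested VM size (limits as listed in the text)."""
    if vcpus > 64:
        raise ValueError("exceeds the vSphere 5.5 per-VM maximum of 64 vCPUs")
    if vcpus <= 8:
        # Standard tops out at 8 vCPUs, below the smallest SAP HANA VM size,
        # so it never applies to SAP HANA deployments.
        return "Standard"
    if vcpus <= 32:
        return "Enterprise"
    return "Enterprise Plus"

print(required_vsphere_edition(30))  # Enterprise
print(required_vsphere_edition(60))  # Enterprise Plus
```

For example, a 30 vCPU SAP HANA VM fits within the Enterprise edition, whereas a 60 vCPU VM requires Enterprise Plus.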
VMware vSphere is licensed on a per-processor basis. Each processor on a
server must have a valid license installed to run vSphere. The number of
processors for each of the building blocks of the IBM Systems Solution for SAP
HANA is listed in the overview tables in 6.1, IBM eX5 based environments on
page 146 and 6.2, IBM X6 based environments on page 157.

For a complete overview of the available editions and their feature sets, see
http://www.vmware.com/files/pdf/vsphere_pricing.pdf.

Chapter 6. SAP HANA IT landscapes with IBM System x solutions


Table 6-15 lists the order numbers for both the Enterprise and Enterprise Plus
editions of VMware vSphere 5.
Table 6-15   VMware vSphere 5 ordering P/Ns

Description (per 1 processor)   License / Subscription            VMware P/N                          IBM P/N
VMware vSphere 5 Enterprise     License and 1-year subscription   VS5-ENT-C / VS5-ENT-PSUB-C          4817SE3
VMware vSphere 5 Enterprise Plus License and 1-year subscription  VS5-ENT-PL-C / VS5-ENT-PL-PSUB-C    4817SE5
VMware vSphere 5 Enterprise     License and 3-year subscription   VS5-ENT-C / VS5-ENT-3PSUB-C         4817TE3
VMware vSphere 5 Enterprise Plus License and 3-year subscription  VS5-ENT-PL-C / VS5-ENT-PL-3PSUB-C   4817TE5
VMware vSphere 5 Enterprise     License and 5-year subscription   VS5-ENT-C / VS5-ENT-5PSUB-C         4817UE3
VMware vSphere 5 Enterprise Plus License and 5-year subscription  VS5-ENT-PL-C / VS5-ENT-PL-5PSUB-C   4817UE5

As an example, if you want to deploy SAP HANA on an eX5 M building block (for
example, 7143-HBx), you need four licenses. An X6 model AC48S1024 needs
eight licenses.

Sizing
The sizing is the same for virtualized and non-virtualized SAP HANA
deployments. Although there is a small performance impact because of the
virtualization, the database size and the required memory size are not affected.

Support
As with any other deployment type of SAP HANA, clients are asked to open an
SAP support ticket by using the integrated support model that is outlined in 8.7.1,
IBM and SAP integrated support on page 252. Any non-SAP related issue is
routed to VMware first, and it eventually is forwarded to the hardware partner. In
certain, but rare situations, SAP or its partners might need to reproduce the
workload on bare metal.


6.5 Sharing an SAP HANA system


Deployment of SAP HANA databases on dedicated hardware can lead to many
SAP HANA appliances in the data center, such as production, disaster recovery,
quality assurance (QA), test, and sandbox systems, and possibly for multiple
application scenarios, regions, or lines of business. Therefore, the consolidation
of SAP HANA instances, at least for non-production systems, seems desirable.
Section 6.4, SAP HANA on VMware vSphere on page 177 describes the new
support of SAP HANA running on VMware vSphere environments.
Another way of consolidating is to install more than one instance of the SAP
HANA database on one SAP HANA appliance. There are major drawbacks to
consolidating multiple SAP HANA instances on one appliance, so it generally is
not supported for production systems.
For non-production systems, the support status depends on the scenario:
Multiple Components on One System (MCOS)
Having multiple SAP HANA instances on one system, also referred to
as Multiple Components on One System (MCOS), is not
recommended because this poses conflicts between different SAP
HANA databases on a single server, for example, common data and
log volumes, possible performance degradations, and interference of
the systems with each other. SAP and IBM support this scenario
under certain conditions (see SAP Note 1681092), but if issues arise,
as part of the troubleshooting process SAP or IBM might ask you to
stop all but one of the instances to see whether the issue persists.
Multiple Components on One Cluster (MCOC)
Running multiple SAP HANA instances on one scale-out cluster (for
the sake of similarity to the other abbreviations, we call this Multiple
Components on One Cluster (MCOC)) is supported if each node of
the cluster runs only one SAP HANA instance. A development and a
QA instance can run on one cluster, but with dedicated nodes for
each of the two SAP HANA instances, for example, each of the
nodes runs either the development instance, or the QA instance, but
not both. Only the GPFS file system is shared across the cluster.


Multiple Components in One Database (MCOD)


Having one SAP HANA instance containing multiple components,
schemas, or application scenarios, also referred to as Multiple
Components in One Database (MCOD), is supported. Having all
data within a single database, which is also maintained as a single
database, can lead to limitations in operations, database
maintenance, backup and recovery, and so on. For example, bringing
down the SAP HANA database affects all of the scenarios; it is
impossible to bring it down for only one scenario. SAP Note 1661202
documents the implications.
Consider the following factors when consolidating SAP HANA instances on one
SAP HANA appliance:
- An instance filling up the log volume causes all other instances on the system
  to stop working correctly. This situation can be addressed by monitoring the
  system closely.
- Installation of an additional instance might fail when other instances are
  already installed and active on the system. The installation procedure checks
  the available space on the storage and refuses to install when there is less
  free space than expected. This can also happen when trying to reinstall an
  already installed instance.
- Installing a new SAP HANA revision for one instance might affect other
  instances that are already installed on the system. For example, new library
  versions coming with the new installation might break the already installed
  instances.
- The performance of the SAP HANA system becomes unpredictable because
  the individual instances on the system share resources, such as memory
  and CPU.
- When asking for support for such a system, you might be asked to remove
  the additional instances and to re-create the issue on a single-instance
  system.

6.6 SAP HANA on IBM SmartCloud


IBM announced availability of integrated managed services for SAP HANA
Appliances in the Global SmartCloud for SAP Applications Offering. This
combination of SAP HANA and IBM SmartCloud for SAP Applications global
managed services helps clients reduce SAP infrastructure costs and complexity
of their SAP Business Suite and HANA landscape. These appliances also
accelerate business analytics by using the IBM global managed cloud
infrastructure with standardized processes and skilled SAP-certified staff.


This offering provides SAP platform as a service (PaaS) on IBM SmartCloud


Enterprise+ for all SAP Business Suite and SAP Business Objects products. IBM
is a certified Global SAP Partner for cloud services.
Initially, as part of release R1.3 (the release at the time of writing), IBM supports
two side-by-side SAP HANA use cases (side-by-side reporting and analytics,
and side-by-side accelerators), where the data is kept in a traditional database
of the SAP application, for example, SAP Business Suite, but selected data is
replicated into SAP HANA to accelerate real-time reporting and analytics. Only
single-node SAP HANA installations are supported, without the high availability
(HA) or disaster recovery (DR) features.

Side-by-side reporting and analytics


This use case has the following attributes:
- SAP HANA is used as a data mart to accelerate operational reporting.
- Data is replicated from the SAP application into the SAP HANA appliance,
  for example, by using the SAP LT Replication Server.
- Data is modeled in SAP HANA through SAP HANA studio. Preconfigured
  models and templates can be loaded to jump-start the project and deliver
  quick results.
- SAP Business Objects BI 4 can be used as the reporting solution on top of
  SAP HANA. Preconfigured dashboards and reports can be deployed.
- SAP Rapid Deployment Solutions (SAP RDSs), which provide predefined
  but customizable SAP HANA data models, are available (for example, ERP
  CO-PA and CRM).

Side-by-side accelerators
This use case has the following attributes:
- SAP HANA serves as a secondary database for SAP Business Suite
  applications.
- Data is replicated from the SAP application into the in-memory SAP HANA
  appliance, for example, by using the SAP LT Replication Server.
- The SAP Business Suite application is accelerated by retrieving the results
  of complex calculations on mass data directly from the in-memory database.
- The user interface for users remains unchanged to ensure nondisruptive
  acceleration.
- Additional analytics that are based on the replicated data in SAP HANA can
  provide new insights for users.


The difference from the reporting scenario is that the consumer of the data that
is replicated to SAP HANA is not a BI tool, but the source system itself.
For this scenario also, several SAP RDSs are provided by SAP.
In release R1.4, IBM will support all existing SAP HANA use cases, including
SAP BW running on SAP HANA and SAP Business Suite running on SAP
HANA. All certified IBM configurations will be supported, including both
single-node and scale-out configurations with high availability (HA) features.
Options to share one appliance (MCOD, MCOS, or MCOC), which are described
in 6.5, Sharing an SAP HANA system on page 183, will be supported as part of
the offering.
Virtualization and disaster recovery (DR) features will be supported as part of
subsequent releases and are not available in release R1.4.


Chapter 7. Business continuity and resiliency for SAP HANA

This chapter presents individual SAP HANA high availability (HA) and disaster
recovery (DR) deployment options. It explains basic terminology and
deployment options for single-node systems and scale-out systems. It also
describes the backup options of SAP HANA.

This chapter covers the following topics:
- Overview of business continuity options
- HA and DR for single-node SAP HANA
- HA and DR for scale-out SAP HANA
- Backup and restore

Copyright IBM Corp. 2013, 2014. All rights reserved.

7.1 Overview of business continuity options


Given the relevance of SAP software environments in today's world, it is critical
that these systems do not experience unexpected downtime. Hardware
manufacturers such as Intel and IBM invest heavily in the reliability and
availability features of their products, but IT systems are still exposed to many
different sources of errors.

For that reason, it is crucial to consider business continuity and reliability aspects
of IT environments when planning SAP HANA systems. Business continuity in
this context means designing IT landscapes with failure in mind. Failure spans
from single component errors, such as a hard disk drive (HDD) or a network
cable, up to the outage of a whole data center because of an earthquake or a
fire. Different levels of contingency planning are needed to cope with these
sources of error.

Developing a business continuity plan depends highly on the type of business a
company is doing, and it differs, among other factors, by country, regulatory
requirements, and company size.
This section introduces the three main elements of business continuity planning:
- Implementing HA
- Planning for DR
- Taking backups regularly

These three elements have different objectives for how long it takes to get a
system online again, for the state in which the system is after it is online, and for
the end-to-end consistency level of business data when an IT environment
comes online again. These three values are defined as follows:
- Recovery Time Objective (RTO) defines the maximum tolerated time to get a
  system online again.
- Recovery Point Objective (RPO) defines the maximum tolerated time span to
  which data must be able to be restored, that is, the amount of data loss
  (measured in time) that is tolerated. An RPO of zero means that the system
  must be designed to not lose data in any of the considered events. The most
  common approach to achieve an RPO of zero is to implement HA within the
  primary data center plus an optional synchronous data replication to an
  offsite location (usually a second data center).
- Recovery Consistency Objective (RCO) defines the level of consistency of
  business processes and data that is spread out over multitier environments.¹

¹ This publication focuses on SAP HANA, that is, the database layer only. For
that reason, this chapter does not describe RCO any further.
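To make the two recovery objectives concrete, the following sketch computes worst-case RPO and RTO for two simple strategies. All numbers and function names are illustrative only and do not come from any SAP or IBM sizing guide:

```python
def backup_strategy_objectives(backup_interval_h: float,
                               restore_duration_h: float) -> dict:
    """Worst case for a pure backup strategy: a failure right before the
    next backup loses one full interval of data (RPO), and recovery
    takes the restore time (RTO)."""
    return {"rpo_hours": backup_interval_h, "rto_hours": restore_duration_h}

def sync_replication_objectives(takeover_minutes: float) -> dict:
    """Synchronous replication persists every committed change on both
    sites, so no committed data is lost (RPO = 0); the RTO is the time
    that the takeover procedure needs."""
    return {"rpo_hours": 0.0, "rto_hours": takeover_minutes / 60.0}

print(backup_strategy_objectives(24, 4))   # daily backup, 4 h restore
print(sync_replication_objectives(30))     # 30 min takeover
```

With daily backups, up to 24 hours of data can be lost; synchronous replication brings the RPO to zero at the cost of replication overhead during normal operation.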

It is important to understand the difference between DR and HA. HA covers a
hardware failure, for example, when one node becomes unavailable because of
a faulty processor, memory DIMM, storage, or network failure.
All the single-node and scale-out solutions that are described in Chapter 6, SAP
HANA IT landscapes with IBM System x solutions on page 145 can be
enhanced with HA and DR capabilities. All scenarios require at least one
additional copy of the data to be available in the system in order for the SAP
HANA application to survive the outage of a server, including the data on it.
HA is implemented by introducing standby nodes. During normal operation,
these nodes do not actively participate in processing data, but they do receive
data that is replicated from the worker nodes. If a worker node then fails, the
standby node takes over and continues data processing. Details of such a node
takeover are explained in Example of a node takeover on page 134.
DR covers the event when multiple nodes in a scale-out configuration fail, or a
whole data center goes down because of a fire, flood, or other disaster, and a
secondary site must take over the SAP HANA system. The ability to recover from
a disaster, or to tolerate a disaster without major impact, is sometimes also
referred to as disaster tolerance (DT).
Note: When speaking about DR, the terms primary site, primary data center,
active site, and production site mean the same thing. Similarly, secondary site,
backup site, and DR site are used interchangeably. The primary site
hosts your production SAP HANA instance during normal operation.
When running an SAP HANA side-car scenario (for example, SAP CO-PA
Accelerator, sales planning, or smart metering), the data still is available in the
source SAP Business Suite system. Planning or analytical tasks run slower
without the SAP HANA system being available, but no data is lost. More
important is the situation where SAP HANA is the primary database, such as
when using SAP Business Suite with SAP HANA or Business Warehouse with
SAP HANA as the database. In those cases, the production data is available
solely within the SAP HANA database, and according to the business service
level agreements, protection against a failure is necessary.


HA and DR solutions for SAP HANA can be implemented at two different levels:
- On the infrastructure level:
  - By replicating data that is written to disk by the SAP HANA persistency
    layer, either synchronously or asynchronously, standby nodes can recover
    lost data from failed nodes. Data replication can happen within a data
    center (for HA), across data centers (for DR), or both (for any combination
    of HA and DR). This feature is known as GPFS based storage replication.
  - By using backups that are replicated or otherwise shipped from the primary
    site to the secondary site.
- On the application level:
  By replicating all actions that are performed on an active SAP HANA instance
  to a passive instance that is running in receive-only mode. Essentially, the
  passive instance runs the same instructions as the active instance, except for
  accepting user requests or queries. This feature is known as SAP HANA
  System Replication (SSR).
Table 7-1 lists the available features to implement business continuity and shows
for which scenarios they are applicable.

Table 7-1   Business continuity options for SAP HANA

Level                 Technology                RTO            RPO            Suitable for HA  Suitable for DR
Infrastructure level  GPFS based synchronous    HA: Zero       Zero           Yes              Yes
                      replication               DR: Minutes
Infrastructure level  GPFS based asynchronous   N/A            N/A            N/A              N/A
                      replication (a)
Infrastructure level  Backup - Restore          Usually hours  Hours to days  No               Yes
Application level     SSR (synchronous)         Minutes        Zero           Limited (b)      Yes
Application level     SSR (asynchronous)        Minutes        Seconds        No               Yes

a. This feature is not supported at the time of writing.
b. See the Note box below this table.


Note: SSR does not support automatic failover to a standby system. As the
name implies, it only replicates data to a standby system. Manual intervention
is required for the failover to happen.
SAP uses the term Host Auto-Failover to describe the capability of an
automatic failover to a standby system. At the time of writing, Host
Auto-Failover is not available with the SAP HANA product.
The next two sections explain GPFS based storage replication and SSR in more
detail.
The rest of this chapter then describes all supported scenarios for HA and DR of
single-node and scale-out SAP HANA environments that are based on IBM
System x solutions. Conventional backup methods are described in 7.4, Backup
and restore on page 236.

7.1.1 GPFS based storage replication


IBM uses General Parallel File System (GPFS) in all of its solutions for SAP
HANA. The file system has built-in replication features that allow you to have
multiple copies of one file stored on disk, spread out over different servers in a
multi-server environment. Those copies are referred to as replicas and are all
identical to each other. For more architectural details about replicas, see 5.3.2,
GPFS extensions for shared-nothing architectures on page 126.
Having multiple replicas available enables HA and DR solutions to be built by
using these replicas. It also allows for environments to grow into more reliable
environments if required. Clients can, for example, start with a single data center
solution with HA and add DR capabilities later on when needed.
IBM uses GPFS in its SAP HANA solutions in a way that allows any node in a
cluster to fail without losing data. The concept is explained in 5.3.3, Scaling-out
SAP HANA using GPFS on page 128. That section details what happens when
a node fails and how a standby node takes over the workload without taking
down the service.
Expanding into a DR setup involves an additional replica of the data to be stored
at a remote location.


The major difference between a single-site HA solution (as described in 5.3.3,
Scaling-out SAP HANA using GPFS on page 128) and a multi-site DR solution
is the placement of the replicas within GPFS. In a single-site HA configuration,
there are two replicas² of each data block in one cluster. In contrast, a multi-site
DR solution holds an additional third replica at the remote or secondary site. This
ensures that when the primary site fails, a complete copy of the data is available
at the second site and operations can be resumed there.

A two-site solution implements the concept of synchronous data replication on
the file system level between both sites by using the replication capabilities that
are built into GPFS.

A brief paper that introduces the wording and concepts that are used by GPFS to
implement HA can be found at the following website:
http://www-03.ibm.com/systems/resources/configure-gpfs-for-reliability.pdf
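The placement rule described above can be sketched as follows. This is a deliberately simplified model of spreading replicas across sites, not actual GPFS code, and all node and site names are made up:

```python
def place_replicas(block_id: int, primary_nodes: list, secondary_nodes: list,
                   dr_enabled: bool) -> list:
    """Pick the nodes that hold the replicas of one data block: two copies
    on distinct primary-site nodes, plus a third copy on the secondary
    site when DR is enabled, so the remote site always has full data."""
    first = primary_nodes[block_id % len(primary_nodes)]
    second = primary_nodes[(block_id + 1) % len(primary_nodes)]
    placement = [first, second]
    if dr_enabled:
        placement.append(secondary_nodes[block_id % len(secondary_nodes)])
    return placement

nodes_a = ["siteA-node01", "siteA-node02", "siteA-node03"]
nodes_b = ["siteB-node01", "siteB-node02", "siteB-node03"]
print(place_replicas(7, nodes_a, nodes_b, dr_enabled=True))
# ['siteA-node02', 'siteA-node03', 'siteB-node02']
```

With `dr_enabled=False`, only the two primary-site replicas remain, which corresponds to the single-site HA configuration described above.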

7.1.2 SAP HANA System Replication


SSR is a feature that was introduced with SAP HANA SPS05 and has been
improved in subsequent revisions. In an environment using SSR, the primary
and the secondary system must be configured identically in terms of SAP HANA
worker nodes. The number of standby nodes is allowed to differ (since SAP
HANA SPS06). Every SAP HANA process running on the primary system's
worker nodes must have a corresponding process on a secondary worker node
to which it replicates its activity.

The only difference between the primary and the secondary system is that you
cannot connect to the secondary SAP HANA installation and run queries on that
database. The two systems can also be called the active and the passive system.

Upon start of the secondary SAP HANA system, each process establishes a
connection to its primary counterpart and requests the data that is in main
memory. This transfer is called a snapshot. After the snapshot is transferred, the
primary system continuously sends log information to the secondary system,
which runs in recovery mode. At the time of writing, SSR does not support
replaying the logs immediately as they are received, so the secondary site
system only acknowledges and persists the logs. To avoid having to replay hours
or days of transaction logs upon a failure, SSR asynchronously transmits a new
incremental data snapshot from time to time.

² In GPFS terminology, each data copy is referred to as a replica. This term also
applies to the primary data, which is called the first replica, indicating that all
data copies are equal.

Among other criteria, the following must be met to enable the SSR feature:
- The SAP HANA software revision of the target environment must be the
  same as or higher than that of the source environment.
- Both systems must use the same SID and instance number.
- On both systems, instance number + 1 must be free because it is used for
  replication purposes.
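Because SAP HANA derives its network ports from the instance number, the "instance number + 1" rule can be checked with a small sketch. The helper below is illustrative only and is not an SAP tool; the example instance numbers are made up:

```python
def replication_instance_free(used_instances: set, instance_nr: int) -> bool:
    """SSR reserves the ports of instance_nr + 1 on both systems, so that
    instance number must not be used by any other installation."""
    return (instance_nr + 1) not in used_instances

used = {0, 1, 6}   # instance numbers already installed (made-up example)
print(replication_instance_free(used, 5))  # False: instance 06 is taken
print(replication_instance_free(used, 2))  # True: instance 03 is free
```

The same check must pass on both the primary and the secondary system before SSR is enabled.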

Replication modes
SSR can be set to one of three modes:
- Synchronous mode: The primary site system waits until the change is
  committed and persisted on the secondary site system.
- Synchronous in-memory mode: The primary site system acknowledges the
  change after it is committed in main memory on the secondary site system,
  but not yet persisted on disk.
- Asynchronous mode (available since SAP HANA SPS06): The primary site
  system commits the transaction when the replicated log is sent to the DR
  site. The system does not wait for an acknowledgment from the remote site.
  Asynchronous replication allows for greater distances because a high latency
  between the primary and secondary system does not slow the production
  workload, as it does with synchronous replication.

In both synchronous modes, the impact is defined by the transmission time from
the primary process to its corresponding secondary system process. When
running SSR in synchronous mode, you must also add the time that it takes to
persist the change on disk to the transmission delay.
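The latency impact of the three modes can be modeled roughly as follows. All numbers are placeholders; real values depend on the inter-site link and the storage on the secondary system:

```python
def commit_overhead_ms(mode: str, rtt_ms: float,
                       remote_persist_ms: float) -> float:
    """Extra time a transaction commit spends waiting because of SSR."""
    if mode == "sync":            # wait for remote receipt AND remote persist
        return rtt_ms + remote_persist_ms
    if mode == "sync_inmemory":   # wait only for the remote in-memory commit
        return rtt_ms
    if mode == "async":           # send and continue; no waiting at all
        return 0.0
    raise ValueError("unknown replication mode: " + mode)

for mode in ("sync", "sync_inmemory", "async"):
    print(mode, commit_overhead_ms(mode, rtt_ms=1.0, remote_persist_ms=0.3))
```

The model makes the trade-off visible: asynchronous mode removes the commit penalty entirely, which is why it tolerates long distances, at the price of a non-zero RPO.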
If the connection between the two data centers is lost, live replication stops.
After a (configurable) timer expires on the primary site system, it resumes work
without replication. When the connection is restored, the secondary site system
requests a delta snapshot of the changes that were made since the connection
was lost. Live replication then continues after this delta is received on the
secondary site system.


System failover
If there is a failover to the secondary system (system takeover), manual
intervention is required to change the secondary site system from recovery mode
to active mode. SAP HANA automatically loads all row-based tables into memory
and rebuilds the row store indexes. In the next step, all logs since the last
received snapshot are replayed. After this step finishes, the system can accept
incoming database connections. A restart of the SAP HANA database instance is
not required.
An optional feature (called table preload) enables the primary system to share
information about which columns are loaded into main memory. The secondary
system can use this information to preload those tables in main memory.
Preloading reduces the duration of a system takeover operation.

Hosting a non-productive instance at the secondary system


If the secondary system is intended to host a non-productive instance, then
preloading must be disabled. In such a scenario, SAP HANA operates with a
minimal main memory footprint on the secondary system to allow the remaining
memory to be used for a non-productive SAP HANA installation.
If a system takeover is triggered, both instances (the one receiving the replication
data and the non-productive instance) must be stopped. The secondary system
must be reconfigured to use all available main memory, and then a takeover
operation is run. Because you must restart the SAP HANA processes, the time
for a system takeover and a subsequent system performance ramp-up is longer
compared to when no non-productive instance is hosted and tables preload is
enabled.
A non-productive instance is not allowed to share storage with production data.
For that reason, IBM uses the IBM System Storage EXP2524 to extend the
locally available storage capacity to hold the data and log files of the
non-productive system.

Multitier System Replication


Multitier System Replication was introduced with SAP HANA SPS07, and it
allows you to cascade replication over several databases.

At the time of writing, only one scenario is supported: the primary system
synchronously replicates (using either synchronous or synchronous in-memory
mode) to the secondary system, which asynchronously replicates the
information to a tertiary system that can be physically far away from the primary
and secondary systems. This setup is shown in Figure 7-1 on page 195.

Figure 7-1 shows such a setup: three SAP HANA scale-out systems, each
consisting of worker and standby nodes (index server and statistic server) on a
shared GPFS file system with data and log volumes on HDD and flash storage.
The primary system replicates synchronously to the secondary system, which in
turn replicates asynchronously to the tertiary system.

Figure 7-1 SAP HANA Multitier System Replication

The following documents provide additional information about SSR:
- Introduction to High Availability for SAP HANA, found at:
  http://www.saphana.com/docs/DOC-2775
- How to Perform System Replication for SAP HANA, found at:
  https://scn.sap.com/docs/DOC-47702
- SAP HANA Administration Guide and SAP HANA Security Guide, found at:
  http://help.sap.com/hana_platform

7.1.3 Special considerations for DR and long-distance HA setups


The distance between the data centers hosting the SAP HANA servers must be
within a certain range to keep network latency to a minimum. This range allows
synchronous replication to occur with limited impact on the overall application
performance (also referred to as Metro Mirror distance). It does not matter for
what exact purpose the data is replicated to the second data center (either for
DR or for long-distance HA): in both cases, data must be transferred between
the two locations, and the distance affects performance.

Application latency is the key indicator for how well a long-distance HA or DR
solution performs. The geographical distance between the data centers can be
short, but the fiber cable between them may follow another route. The Internet
service provider (ISP) usually routes traffic through one of its hubs, which leads
to a longer physical distance for the signal to travel, and therefore a higher
latency. Another factor that must be taken into account is the network equipment
between the two demarcation points on each site. More routers and protocol
conversions along the line introduce higher latency.


Attention: When talking about latency, make sure to specify the layer at which
you are measuring it. Network engineers talk about network latency, but SAP
prefers to use application latency.

Network latency refers to the low-level latency that network packets
experience when traveling over the network from site A to site B. Network
latency does not necessarily include the time that it takes for a network packet
to be processed on a server.

Application latency refers to the delay that an SAP HANA database
transaction experiences when it occurs in a DR environment. This value is
sometimes also known as end-to-end latency. It is the sum of all delays that
occur while the database request is in flight and includes, besides network
latency, packet extraction in the Linux TCP/IP stack, GPFS code execution,
and processing of the SAP HANA I/O code stack.
Data can be replicated from one data center to the other one either
synchronously or asynchronously. Synchronous data replication means that any
write request that is issued by the application is committed to the application only
after the request is successfully written on both sides. To maintain the application
performance within reasonable limits, the network latency (and therefore the
distance) between the sites must be limited to Metro Mirror distances. The
maximum achievable distance depends on the performance requirements of the
SAP HANA system. In general, an online analytical processing (OLAP) workload
can work with higher latencies than an online transaction processing (OLTP)
workload. The network latency mainly is dictated by the connection between the
two SAP HANA clusters. This inter-site link typically is provided by a third-party
ISP.
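As a rough rule of thumb, light travels through optical fiber at about 200 km per millisecond (roughly 5 µs per km one way), so the fiber route length sets a lower bound on the achievable latency. The sketch below estimates that bound; it is illustrative only, since real links add router, protocol, and server processing delays on top:

```python
FIBER_KM_PER_MS = 200.0  # approximate speed of light in optical fiber

def min_round_trip_ms(route_km: float) -> float:
    """Lower bound on round-trip network latency imposed by fiber length."""
    one_way_ms = route_km / FIBER_KM_PER_MS
    return 2 * one_way_ms

# A 100 km fiber route costs at least 1 ms per synchronous acknowledgment,
# before any equipment or software processing time is added.
print(min_round_trip_ms(100))  # 1.0
```

This is why the fiber route length, not the straight-line distance between the data centers, is what matters when judging whether synchronous replication is feasible.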

7.2 HA and DR for single-node SAP HANA


Customers running their SAP HANA instance on a single node can still
implement redundancy to protect against a failure of this node. Two options are
available to achieve this level of protection: IBM uses GPFS to replicate data to a
second or third server (storage replication), and SAP uses SSR to replicate data
to another SAP HANA instance running on another server.

Table 7-2 on page 197 gives an overview of the different installations that can be
implemented with IBM workload-optimized solutions for SAP HANA. It lists GPFS
and SSR-based solutions and their combinations.


Table 7-2   Overview of HA and DR options for single-node SAP HANA solutions

Characteristic      HA         Stretched   DR          DR          HA and DR   HA and DR
                               HA          (using      (using      (using      (using GPFS
                                           GPFS)       SSR)        GPFS only)  and SSR)
Required data       1          2 or 3      2 or 3      2           2 or 3      2 or 3
centers                        (metro      (metro      (metro      (metro      (metro
                               distance)   distance)   distance    distance)   distance
                                                       or higher)              or higher)
RTO                 Seconds    Seconds     Minutes     Minutes     Seconds     Seconds
                                                                   for HA,     for HA,
                                                                   minutes     minutes
                                                                   for DR      for DR
RPO                 Zero       Zero        Zero        Zero or     Zero        Zero or
                                                       higher                  higher
Replication         GPFS       GPFS        GPFS        SAP HANA    GPFS        GPFS (sync)
method              (sync)     (sync)      (sync)      (sync or    (sync)      plus SSR
                                                       async)                  (sync or async)
Automatic           Yes        Yes         No          No          Yes, for    Yes, for
failover                                                           HA node     HA node
Can host            No         No          Yes         Yes         Yes         Yes
non-production
Number of SAP HANA
server nodes
Number of GPFS
quorum servers
Tolerated node
failures

The following sections describe each of the solutions in more detail and show
architectural diagrams for each of them.

Chapter 7. Business continuity and resiliency for SAP HANA

197

7.2.1 High availability (using GPFS)


This setup implements HA for a single node within a single data center. It
protects your SAP HANA system from being offline when the server experiences
any failure. This can be a hardware or a software failure.
A single-node HA installation consists of three physical nodes with the
following designations:
- Active (or worker) node
- Standby node
- Quorum node
From a GPFS point of view, single-node HA setups can be treated like scale-out
solutions with only one node that is running SAP HANA workload and producing
data. The second node only receives data from the active node and is ready to
take over if the active node experiences any failure.
The worker and the standby nodes must be identical to ensure a successful
takeover of SAP HANA operations.
The third node, called the quorum node, must be added so that GPFS can
decide which node survives if there is an outage anywhere in the network
between the active and standby node (so called split-brain situations).
During normal operation, all database queries are run only on the active node.
The standby node remains inactive and does not respond to any queries.
GPFS replication ensures that all SAP HANA persistency files on the active node
are copied over to the standby node. Under normal operation, at any given point
there is a synchronous data copy on each of the two servers, which means an
RPO of zero is ensured. If a failure occurs, the standby node takes over and
continues operation using its local data copy. Using synchronous replication
mode ensures that every write request that is issued by the application on the
worker node returns only after it successfully is written on both nodes. This
method ensures that you always have a consistent data set on both nodes.


The quorum node is not running any SAP HANA processes. It runs only a small
GPFS process, so it does not need connectivity to the SAP HANA network. Only
GPFS access is required. Quorum functionality can be implemented on different
servers (you do not need to use an IBM SAP HANA building block for this task).
In most cases, a smaller system like the IBM System x3550 M4 is suitable for this
task. Without this quorum node, if only the communication between the SAP
HANA nodes is interrupted but otherwise the nodes are running fine, GPFS
cannot tell which side should continue to operate, and it unmounts the file system
on both nodes to protect from inconsistent data (this situation is called a split
brain). When GPFS is down, SAP HANA cannot continue to operate because
data and logs are not accessible. Administrative intervention is required in that
case.
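The quorum rule that GPFS applies in such split-brain situations amounts to a simple majority vote. The following sketch models that decision; it illustrates the concept and is not the actual GPFS implementation:

```python
# Conceptual sketch of GPFS node quorum: after a network split, the file
# system stays mounted only in the partition that still contains a strict
# majority of the quorum nodes. Node names are illustrative.

def partition_survives(partition, quorum_nodes):
    """True if this partition holds a strict majority of quorum nodes."""
    reachable = set(partition) & set(quorum_nodes)
    return len(reachable) > len(quorum_nodes) / 2

quorum_nodes = ["node01", "node02", "quorum"]   # all three vote

# Split brain: the link between node01 and node02 breaks, but the
# quorum node can still reach node02.
assert partition_survives(["node02", "quorum"], quorum_nodes)  # survives
assert not partition_survives(["node01"], quorum_nodes)        # unmounts
```

With only the worker and the standby as voters, a broken link would leave each side with exactly half of the votes and neither could continue; the third quorum node breaks the tie.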
Figure 7-2 shows the detailed conceptual view of a single-node HA solution.
While the GPFS quorum node is part of the GPFS cluster, it does not contribute
disk capacity to it. For that reason, in this example, we extend the shared file
system only half into the quorum node.

Figure 7-2 Detailed view of single-node HA solution


For a fully redundant solution, the network also must be built using two redundant
Ethernet switches. Figure 7-3 shows the network setup of such a single-node HA
installation.

Figure 7-3 Network setup for single-node high availability

No manual intervention is required for the takeover to complete because it is
handled internally by SAP HANA processes. The takeover usually happens
within seconds or minutes, depending on the environment and workload on the
system. Clients experience a delay only in their queries.
It is not possible to host any non-production SAP HANA instances on the standby
node because it is working in hot-standby mode and must be ready to take over
operation at any time.

7.2.2 Stretched high availability (using GPFS)


This solution extends the physical distances between two servers of a
single-node HA setup to metro distance. This configuration is called stretched
HA. It usually spans two data centers at separate sites. If circumstances allow, it
is also possible to host each server at one end of a company's campus.
Allowing for longer distances between server nodes usually means that the
network in between is also different from what is available within a single data
center. Latency and throughput requirements for running SAP HANA over longer
distances are described in 7.1.3, "Special considerations for DR and long-distance
HA setups" on page 195.


This solution requires two identical SAP HANA building blocks plus a GPFS
quorum node. One SAP HANA building block is installed at each site. The node
at the primary site is installed as a worker node and the node at the secondary
site is installed as a standby node.
GPFS ensures that all data that is written on the active node is replicated over to
the server in the second data center, which is running SAP HANA in standby
mode.
GPFS needs a quorum node to act as tie-breaker if the network between the
servers breaks. This quorum node should be in a separate third location to
ensure maximum reliability of the solution.
Figure 7-4 outlines the conceptual view of a stretched HA implementation for a
single-node SAP HANA solution. The GPFS quorum node is placed at a
dedicated third site C.

Figure 7-4 Detailed view of single-node stretched HA solution (three-site approach)


The GPFS quorum node does not contribute disk space to the file system; it is
connected to the GPFS network only to decide on the surviving site in split-brain
situations. This is why the yellow box only partially spans to the quorum node.
The network view that is outlined in Figure 7-5 shows that the quorum node
needs to be connected only to the GPFS network.

Figure 7-5 Network setup of single node with stretched HA (three-site approach with dual inter-site links)

Depending on the customer network preconditions, different scenarios are
possible for implementing the inter-site network. Some customers might not have
two dedicated links available between the G8264 switches at the primary and
secondary sites. Figure 7-6 on page 203 shows an alternative approach with
only one link. Different flavors of virtual private networks (VPN) are used today. In
those situations, the G8264 switches can be connected to the VPN gateways or
distribution layer switches.


Figure 7-6 Network setup of single node with stretched HA (three-site approach with one inter-site link)

If a separate third location is not available, the quorum node must be placed in
the primary data center with the active worker node. This gives the active site a
higher weight so that GPFS continues to operate even if the server hosting the
SAP HANA standby process at the second site is no longer reachable. The
standby server can become unreachable, for example, because of a server
hardware failure or because of a broken link between the two data centers.


Figure 7-7 shows a stretched HA solution with the quorum node in the primary
site A together with the active worker node.

Figure 7-7 Detailed view of single-node stretched HA solution (two-site approach)

The corresponding network architecture is shown in Figure 7-8.

Figure 7-8 Network setup of single node with stretched HA (two-site approach with dual inter-site links)


This scenario assumes that the customer has two links between the primary and
secondary sites that are available for SAP HANA usage. If not, then a
consolidated single network link can also be used, as shown in Figure 7-9.

Figure 7-9 Network setup of single node with stretched HA (two-site approach with one inter-site link)

The disadvantage of having the GPFS quorum node in the primary data center
and not in a separate third location comes into play only if the primary data
center experiences an outage.
If the GPFS quorum node is at a third independent site, then together with the
standby node at site B it constitutes the majority of the cluster nodes, so the
shared file system stays up at the standby node and SAP HANA can
automatically fail over.
If the GPFS quorum node is in the same failure domain as the SAP HANA worker
node, then upon a failure in this domain, the file system on the standby node
unmounts and manual intervention is required to bring it back up. In this situation,
the GPFS cluster loses the majority of the nodes. Automatic failover is still
possible in any failure situation that does not affect the network subsystem. If the
quorum node can still communicate with the standby node, automatic failover
works. This situation protects against any failure of the worker node, such as
hardware failures, firmware faults, or operating system and user errors.
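The case analysis above can be condensed into a small decision function. The node names are illustrative and the model is conceptual, not GPFS code:

```python
# Conceptual sketch of the case analysis above: failover to the standby
# node is automatic only if the standby's partition still holds a
# majority of the three GPFS cluster nodes (worker, standby, quorum).

def automatic_failover(failed_nodes):
    """True if the standby keeps file system access after the failure."""
    cluster = {"worker", "standby", "quorum"}
    surviving = cluster - set(failed_nodes)
    if "standby" not in surviving:
        return False                      # nothing left to fail over to
    return len(surviving) > len(cluster) / 2

# Quorum node at a third site: a full site A outage takes out only the
# worker, so standby plus quorum still form a majority.
assert automatic_failover({"worker"})

# Quorum node in the same failure domain as the worker: a site A outage
# takes out worker and quorum, and manual intervention is needed.
assert not automatic_failover({"worker", "quorum"})
```

Placing the quorum node in site A therefore trades automatic failover after a full site outage for file system availability when only the inter-site link fails.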


Only if there is a total site outage (when the SAP HANA worker node and the
GPFS quorum node are unavailable) is manual intervention required on site B to
bring the database instance back up.
Because the node at site B is not idling (it runs SAP HANA standby processes
that monitor the worker node), it is not possible to host non-production database
instances on this node.

7.2.3 Disaster recovery (using GPFS)


An environment running single-node DR spans two data centers in two distinct
locations. There is a primary and a secondary node and SAP HANA is active on
only the primary node. Under normal conditions, SAP HANA is not running on
the secondary node, which means no SAP HANA processes are running on it.
The secondary node is used only to store data that is replicated through GPFS
from the primary node (sometimes called GPFS based storage replication or just
storage replication). This replication happens in a synchronous manner, which
means that disk operations must be completed on both sites before an I/O
operation is marked as complete. Synchronous replication ensures an RPO of
zero.
If there is a disaster, SAP HANA must be started manually on the secondary
node because no standby processes that monitor the active worker node are
running in the operating system. The two nodes running SAP HANA must be
identical; different memory configurations are not allowed.
An additional server is required that acts as a GPFS quorum node. This server
ideally is placed at a third location to ensure maximum reliability of the file
system. However, if a third data center is not available, this additional server must
be placed at the primary data center, where it ensures that, when the primary
node cannot communicate with the secondary node (because of a hardware
failure or a broken network link), the primary site server stays up and GPFS
keeps running without experiencing any downtime.


Figure 7-10 shows a single-node solution for DR using GPFS based storage
replication.

Figure 7-10 Detailed view of single-node DR solution using GPFS (three-site approach)


The network can be implemented in several different ways. The basic version is
shown in Figure 7-11, with fully redundant network connections towards both
satellite sites B and C.

Figure 7-11 Network setup of a single node with DR installation (three-site approach with dual links)


If no dedicated links to site B or C are available, a different architecture can be
implemented. An example is shown in Figure 7-12. Combinations of the different
approaches are possible. The combination that is decided upon has an impact
on the level of redundancy that the solution provides.

Figure 7-12 Network setup of a single node with DR (three-site approach with alternative inter-site connectivity)


In the absence of a third site that hosts the GPFS quorum server, this machine
must be installed at the primary site A, as shown in Figure 7-13, to ensure
continuous availability of the file system if the link between sites A and B is
broken.

Figure 7-13 Detailed view of single-node DR solution using GPFS (two-site approach)

Similar to the approach with three data centers, there is more than one way in
which networking can be implemented with two data centers, depending on
customer requirements. Figure 7-14 on page 211 shows the fully redundant
networking architecture with two links between sites A and B.


Figure 7-14 Network setup of single node with DR installation (two-site approach with dual links)

Different approaches, similar to what is shown in Figure 7-12 on page 209, can
be implemented for a two-site solution. Details must be agreed upon by the
client.
During normal operations, SAP HANA is running only on site A's node. No SAP
HANA processes are running on the DR node at site B. The only GPFS
processes that are running are the ones that receive replicated data from the
active worker node at site A.
If the worker node at site A goes down, the file system stays up on the DR node if
the GPFS quorum node is still healthy and available. If the quorum node also
goes down, then GPFS unmounts the file system on the DR node because the
majority of the GPFS cluster nodes are down. If this happens, human
intervention is required. An administrator can manually override the GPFS cluster
logic and mount the file system on just the DR node by using the second replica.
After the file system is available on the DR node, SAP HANA processes can be
started that load database data and replay log files so that the database instance
can resume operation with no data loss (RPO of zero).
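The recovery step, loading the last savepoint and replaying committed log records, can be modeled conceptually as follows. This sketch illustrates why the result contains every committed transaction and nothing else; it is not SAP HANA code:

```python
# Conceptual sketch of crash recovery on the DR node: the last data
# savepoint is loaded and committed log records are replayed, so the
# database resumes at the last committed transaction (RPO of zero).

def recover(savepoint, log):
    state = dict(savepoint)                  # start from last savepoint
    for record in log:
        if record["committed"]:              # replay committed changes only
            state[record["key"]] = record["value"]
    return state

savepoint = {"balance": 100}
log = [
    {"key": "balance", "value": 150, "committed": True},
    {"key": "balance", "value": 999, "committed": False},  # in flight at failure
]
assert recover(savepoint, log) == {"balance": 150}
```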
Because during normal operation there are no SAP HANA processes running on
the secondary node (not even in standby mode), there is an option to host a
non-productive SAP HANA instance (like development or test) there. In such a
scenario, if there is a disaster, this non-productive SAP HANA instance must be
shut down before a takeover can happen from the primary system.


This additional SAP HANA instance needs its own storage space for persistency
and logs. IBM uses the EXP2524 unit to provide this additional space. The
EXP2524 adds up to twenty-four 2.5-inch HDDs that are connected directly to
the server through an SAS interface. A second file system is created over those
drives for the additional SAP HANA instance. This second file system is visible
only to this node. Data is not replicated to a second node.
If a disaster occurs and the productive SAP HANA instance is switched to run on
the DR server at site B, the non-productive instance must be shut down first.
After the primary server at site A is repaired, customers can choose to either
switch back their productive instance to site A or they can let it continue to run on
the DR node at site B.
If customers choose to keep their production instance on site B, this means that
the non-production instance must now be started on site A, that is, the former
primary server. For that reason, the site A server must have an EXP2524
attached as well (not just site B's machine). Non-production data on the
EXP2524 must be copied manually to the other side before you can start the
non-production SAP HANA instance.

7.2.4 Disaster recovery (using SAP HANA System Replication)


Section 7.2.3, Disaster recovery (using GPFS) on page 206 describes DR for a
single SAP HANA node using GPFS based storage replication. As an alternative,
you can use SSR. Database activity is replicated on the application layer to a
second SAP HANA instance running on a DR node.
Coupling of the two single nodes is done at the application level, not at the
GPFS level, which means that, from a file system perspective, there are two
distinct single-node GPFS clusters. No shared file system exists across the two
nodes. No GPFS quorum node is required.
Figure 7-15 on page 213 shows a DR setup using SSR.


Figure 7-15 Detailed view of single-node DR solution using SAP HANA System Replication

A failover to site B always requires manual intervention and does not happen
automatically.
SSR can be configured to work either in synchronous or asynchronous mode.
Different physical distances can be realized in each mode. For more information,
see Replication modes on page 193.
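The practical difference between the two modes is what survives on the DR node at the moment of a disaster. The following model is illustrative only; `in_flight` stands for transactions that were acknowledged at the primary but not yet shipped:

```python
# Conceptual sketch of the RPO difference between synchronous and
# asynchronous replication: in async mode, transactions acknowledged
# at the primary may not yet have reached the DR site when disaster
# strikes.

def replicate(transactions, mode, in_flight):
    """Return the transactions present on the DR node at disaster time.

    mode      -- "sync" or "async"
    in_flight -- acknowledged transactions not yet shipped (async only)
    """
    if mode == "sync":
        return list(transactions)            # every ack implies a DR copy
    return list(transactions[: len(transactions) - in_flight])

txns = ["t1", "t2", "t3", "t4"]
assert replicate(txns, "sync", in_flight=2) == ["t1", "t2", "t3", "t4"]
assert replicate(txns, "async", in_flight=2) == ["t1", "t2"]  # t3, t4 lost
```

In synchronous mode, nothing acknowledged can be missing on the DR side (RPO of zero); in asynchronous mode, the replication lag translates directly into potential data loss (RPO greater than zero).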


Different architectures are possible for the network connecting the two SAP
HANA building blocks. Figure 7-16 shows a setup with redundant switches for the
SAP HANA replication network.

Figure 7-16 Network setup of single node with SAP HANA System Replication (dual inter-site links)

Because SAP HANA processes are not in an active standby mode on the DR
node, it is possible to host a non-production instance by using dedicated
additional storage. IBM uses the EXP2524 unit to provide this additional space.
The EXP2524 adds up to twenty-four 2.5-inch HDDs that are directly connected
to the server through an SAS interface. A second file system is created over
those drives for the additional SAP HANA instance. This second file system is
visible only to this node. Data is not replicated to a second node.
Designs without the network switches are also supported. Figure 7-17 on
page 215 shows such an approach. It is possible to have these inter-site links
going across VPNs (or other forms of overlay networks) if the latency and
throughput requirements are met.


Figure 7-17 Network setup of single node with SAP HANA System Replication (no switches)

Intermediate architectures with redundant switches in the primary data center on
site A and only one link to site B are possible (similar to Figure 7-12 on
page 209). Setups with only one link for SSR can be implemented when no
redundancy is required. Details should be discussed with the customer.

7.2.5 HA plus DR (using GPFS)


Customers requiring the highest level of reliability for their SAP HANA installation
can combine HA and DR into one design. This design merges the HA and DR
principles that are described in 7.2.1, High availability (using GPFS) on
page 198 and 7.2.3, Disaster recovery (using GPFS) on page 206. HA can also
be implemented as a stretched HA, as described in 7.2.2, Stretched high
availability (using GPFS) on page 200.
A single-node solution with HA and DR using GPFS for both features requires
three SAP HANA nodes with the same memory configuration. Two nodes are
installed at the primary site and one node at the DR site. There is one GPFS
cluster spanning all three nodes. The SAP HANA instance spans only a single
node. GPFS ensures that there are always three replicas of the data available,
one on each node. This setup allows the DR node to work even when both
primary site nodes are down because it has its own data replica. GPFS
replication happens in a synchronous manner, which means disk operations are
flagged as complete only if all three nodes have written their replica successfully.


Figure 7-18 shows the architectural view of such a single node with a HA and DR
solution. No additional GPFS quorum node is required because the configuration
includes three nodes running GPFS processes. No split-brain situation can occur
in a cluster with three members.

Figure 7-18 Detailed view of single-node HA and DR solution (using GPFS)

During normal operation, when the setup is healthy, SAP HANA is running only
on the nodes at the primary data center. From these two nodes, only the worker
node responds to user queries. The HA node does not respond to requests. The
DR node at site B has no active SAP HANA processes. Only GPFS processes
are running that receive data (third replica) and write it to local disk.
If the primary site worker node experiences an outage, the standby node detects
it and SAP HANA fails over and resumes operation on this node. This node is
promoted from a standby to a worker designation and responds to user requests.
Now, GPFS has only two replicas left in the cluster. When the first node
is repaired, it rejoins the cluster and its local replica is restored.


If there is a disaster where both nodes at the primary site are lost, manual
intervention is required to get SAP HANA running at the DR site B. Because
GPFS has lost the majority of its cluster nodes, it must be told that it is safe to
operate using only the surviving node (this temporary procedure is called
relaxing the node quorum). The file system can then be mounted again and SAP
HANA can be started. It reads data and log files from local disk, loads it into
memory, and can then resume database operation. Because GPFS replication
happens synchronously, no data is lost, and the DR node continues exactly with
the last committed transaction before the disaster happened.
Figure 7-19 shows the networking architecture with redundant Ethernet switches.
Figure 7-19 Network setup of a single node with HA plus DR using GPFS

Because during normal operation the node at the secondary site is not hosting
any running instance, there is an option to host a non-productive SAP HANA
instance (such as development or test) on this DR site node. If there is a disaster,
this non-productive SAP HANA instance must be stopped before the production
database instance can be started on this node.
This additional SAP HANA instance needs its own storage space for persistency
and logs. IBM uses the EXP2524 unit to provide this additional space. The
EXP2524 adds up to twenty-four 2.5-inch HDDs that are directly connected to
the server through an SAS interface. A second file system is created over those
drives for the additional SAP HANA instance. This second file system is visible
only to this node and is not replicated to another node.


7.2.6 HA (using GPFS) plus DR (using SSR)


Single-node HA plus DR solutions can be built either with GPFS based storage
replication alone or with a combination of storage replication and SSR. In this scenario, GPFS
storage replication is used to provide seamless HA capabilities and SSR is used
to implement DR capabilities on a different site. Figure 7-20 shows the
architecture of such a scenario. A dedicated GPFS quorum node is required
because node01 and node02 can experience a split-brain situation in which an
external decision maker is required to decide on the surviving node.
HA with GPFS either can be implemented as base HA (as described in 7.2.1,
High availability (using GPFS) on page 198) or as stretched HA spanning
bigger distances (as described in 7.2.2, Stretched high availability (using
GPFS) on page 200). For stretched HA setups, place the GPFS quorum node at
a third location.

Figure 7-20 Detailed view of single-node solution with HA using GPFS and DR using SSR

At any point, there are two data replicas available at site A. Synchronous GPFS
replication ensures that I/O operations are marked as successfully complete only
after the operation is written to disk on both node01 and node02 at site A.


Depending on the mode in which SSR runs, the third data copy on site B is
either a synchronous or an asynchronous copy. If it is configured to be a synchronous
copy, then restrictions on the distance of the DR site apply. If SSR runs in
asynchronous mode, then longer distances can be covered (as described in
7.1.3, Special considerations for DR and long-distance HA setups on
page 195). Data replication to the DR site happens on SAP HANA level. The DR
node feeds GPFS with the incoming replication data. GPFS stores this data on
its local disk (indicated by first replica on the right in Figure 7-20 on page 218).
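The effective recovery point of this combined setup depends on which failure occurs and on the SSR mode, which the following sketch summarizes. It is a conceptual model under the assumptions described in the text, not product code:

```python
# Conceptual sketch of the combined setup: GPFS keeps two synchronous
# replicas at site A (HA), while SSR ships a third copy to site B (DR)
# in synchronous or asynchronous mode. The recovery point depends on
# which failure occurs and on the SSR mode.

def recovery_point(failure, ssr_mode):
    """Return "zero" if no committed data is lost, else "nonzero"."""
    if failure == "node":          # HA failover inside site A; GPFS is sync
        return "zero"
    if failure == "site":          # all of site A lost; the DR copy is used
        return "zero" if ssr_mode == "sync" else "nonzero"
    raise ValueError(failure)

assert recovery_point("node", "async") == "zero"    # HA unaffected by SSR mode
assert recovery_point("site", "sync") == "zero"
assert recovery_point("site", "async") == "nonzero" # trailing commits may be lost
```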
The networking architecture for such a single-node setup is shown in
Figure 7-21. Redundant switches are used at both sites. At the DR site B, there is
no need to connect the GPFS interfaces to the Ethernet switch because from a
GPFS perspective the DR node forms its own single-node GPFS cluster.
This server is not a member of site A's GPFS cluster.

Figure 7-21 Network setup of single-node HA (using GPFS) plus DR (using SSR)


The DR node does not need GPFS connectivity to site A, which allows you to
leave out the Ethernet switches at site B and connect the node directly to the
primary site switches. This setup is shown in Figure 7-22.
Depending on the type of inter-site link that connects the primary and secondary
data centers, different scenarios are possible. For example, if only one inter-site
link is available, only one network interface of the DR node can be cabled. The
different options and their implications to redundancy and availability must be
discussed with the customer.
During normal operation, the node at the DR site B is running in recovery mode,
which allows you to host a non-productive SAP HANA instance (such as
development or test) on this idling node. If there is a disaster, this non-productive
SAP HANA instance must be stopped before the production database instance
can be started on this node.

Figure 7-22 Network setup of single-node HA (using GPFS) plus DR (using SSR) (fewer switches at site B)

This additional SAP HANA instance needs its own storage space for persistency
and logs. IBM uses the EXP2524 unit to provide this additional space. The
EXP2524 adds up to twenty-four 2.5-inch HDDs that are directly connected to
the server through an SAS interface. A second file system is created over those
drives for the additional SAP HANA instance. This second file system is visible
only to this node and is not replicated to another node.


7.3 HA and DR for scale-out SAP HANA


Scale-out SAP HANA installations can implement two levels of redundancy to
keep their database instance from going offline. The first step is to add a server
node to the scale-out cluster that acts as a hot-standby node. The second step is
to set up an additional scale-out cluster in a distinct data center that takes over
operation if there is a disaster at the primary site. This DR capability can be
implemented by using two different technologies: either GPFS based storage
replication or SSR. Both technologies replicate all the required data to the DR
site nodes.
Table 7-3 lists the available options for protecting scale-out SAP HANA
environments when using IBM workload-optimized solutions for SAP HANA.
Table 7-3 Overview of HA and DR options for scale-out SAP HANA installations

Characteristic          | HA (using GPFS)    | HA and DR (using GPFS)         | HA and DR (using GPFS and SSR)
Required data centers   | 1                  | 2 or 3 (a)                     | 2
Geographical distance   | Not applicable     | Metro distance                 | Metro distance or higher
RTO                     | Seconds            | Seconds for HA, minutes for DR | Seconds for HA, minutes for DR
RPO                     | Zero               | Zero                           | Zero or higher
Replication method      | GPFS (synchronous) | GPFS (synchronous)             | GPFS (synchronous) plus SSR (synchronous or asynchronous)
Automatic HA failover   | Yes                | Yes                            | Yes
Automatic DR failover   | Not applicable     | Yes                            | No
Can host non-production | No                 | Yes, at DR site                | Yes, at DR site

a. A third data center is required only for automatic DR failover. If no third site is
available, manual failover can be implemented instead.


Although it is technically feasible to implement DR without HA in the primary
site cluster, it is not a preferred practice, and can cause issues. The first fault
that occurs in the primary site triggers a DR event that leads to a failover of the
production instance to the DR site. To avoid these failovers, always implement
HA capabilities alongside DR. With HA, the first fault is handled within the
primary site and no failover to the DR data center is triggered.

7.3.1 High availability using GPFS storage replication


SAP HANA uses the concept of standby nodes, which take over operation if a
node that is designated as an active worker node fails. A standby node can take
over the role of any cluster node. For this reason, the standby node must have the
same memory configuration as the other cluster nodes.
For a takeover to happen, the standby node requires access to the file system of
the failed node. This requirement is fulfilled by GPFS, which is a shared file
system. The FPO extension of GPFS allows you to keep the shared file system
feature in a local storage only environment. FPO details are described in 5.3.2,
GPFS extensions for shared-nothing architectures on page 126.
To achieve HA in a scale-out SAP HANA installation, IBM uses the data
replication feature that is built into the file system. GPFS replication ensures that
two valid physical copies of the data always exist in the file system. The concept
of replicas is transparent to the application, which means that SAP HANA is not
affected if a server holding one replica goes offline. Access requests to a file that
has lost one replica are automatically served from the second replica.
Any file system operation is run on multiple nodes (the exact number depends on
different parameters, such as the size of the data to be written), and GPFS
signals an I/O operation as complete only after both data replicas are persisted
successfully. This measure ensures an RPO of zero.
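The effect of this write acknowledgment rule on the RPO can be sketched with a small, purely illustrative model (the class and node names are invented for this example; actual GPFS replica placement is far more involved):

```python
# Toy model of synchronous two-replica writes: an I/O is acknowledged only
# after every replica is persisted, so an acknowledged write survives the
# loss of a single node (RPO = 0).

class ReplicatedFS:
    def __init__(self, nodes, replicas=2):
        self.nodes = {n: {} for n in nodes}   # node -> {block_id: data}
        self.replicas = replicas

    def write(self, block_id, data, placement):
        """Persist data on `replicas` nodes; signal completion only then."""
        for node in placement[:self.replicas]:
            self.nodes[node][block_id] = data  # both copies must land on disk
        return True                            # acknowledged: all replicas exist

    def read(self, block_id):
        """Any surviving replica can serve the block."""
        for store in self.nodes.values():
            if block_id in store:
                return store[block_id]
        raise IOError("all replicas lost")

    def fail(self, node):
        self.nodes[node] = {}                  # node goes offline, data gone

fs = ReplicatedFS(["node01", "node02", "node03"])
fs.write("b1", b"savepoint", ["node01", "node02"])
fs.fail("node01")                              # one replica lost
print(fs.read("b1"))                           # still served from node02
```

Because the write is acknowledged only after both copies exist, losing one node after the acknowledgment never loses acknowledged data, which is exactly the RPO-of-zero property described above.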
The concept of HA in a scale-out SAP HANA installation is described in more
detail in Scale-out solution with high-availability capabilities on page 132. This
section also shows in detail what happens when one cluster member goes offline
and its data is no longer accessible.
More than one standby node in a cluster is a supported scenario. Even with
multiple standby nodes, there still is a small time window in which only one data
replica exists. The duration of this time depends on the speed of the network, the
load on the system, and the amount of data in the database. Multiple standby
nodes are not a replacement for a backup strategy.


7.3.2 Disaster recovery using GPFS storage replication


To implement DR capability, IBM uses the replication feature that is built into the
file system. The GPFS replication feature allows an additional data copy (replica)
to be stored in a secondary data center location. Using multiple replicas for DR
works the same way as HA in a single-site scale-out environment: GPFS is
configured to write an additional third replica to a remote site.
The following section gives an overview of a multi-site DR solution using GPFS
replication. It explains the advantages of having a quorum node in a separate
location, and it describes how the idling servers on the secondary site can be
used for hosting additional installations of SAP HANA.
For a DR setup, it is necessary to have identical scale-out configurations on both
the primary and the secondary sites. In addition, there can be a third site, which
has the sole responsibility to act as a quorum site. We describe the differences
between a two-site and a three-site setup later in this section.

Basic architecture
During normal operation, there is one active SAP HANA instance running. The
SAP HANA instance on the secondary site is not active. The architecture on
each site is identical to a standard scale-out cluster with HA, as described in
7.3.1, High availability using GPFS storage replication on page 222. The
architecture must include standby servers for HA. A server failure is handled
completely within one site and does not enforce a site failover. Figure 7-23 shows
this setup.

Figure 7-23 Basic setup of the disaster recovery solution using GPFS synchronous replication


The connection between the two main sites, A and B, depends on the client's
network infrastructure. Use a dual-link dark fibre connection to allow for
redundancy in the network switch at each site. For full redundancy, an additional
link pair is required to fully mesh the four switches. Figure 7-24 shows a
connection with one link pair in between. It also shows that only the GPFS
network must span both sites because the nodes make up one single stretched GPFS
cluster. The SAP HANA internal network is kept within each site because SAP
HANA does not need to communicate with the other site.
Figure 7-24 Networking details for SAP HANA GPFS based DR solutions

Within each site, the 10 Gb Ethernet network connections for both the internal
SAP HANA and the internal GPFS network are implemented in a redundant
layout.
Depending on where exactly the demarcation point is between the SAP HANA
installation, the client network, and the inter-site link, different architectures can
be used. In the preferred case, there is a dedicated 10 Gb Ethernet connection
going out of each of the two SAP HANA switches at each site towards the
demarcation point. There is no requirement with which technology the data
centers are connected if the technology can route IP traffic across the link. In
general, low-latency interconnect technologies are the preferred choice. For
more information about latency, see 7.1.3, Special considerations for DR and
long-distance HA setups on page 195.
Depending on the client infrastructure, the inter-site link might be the weakest
link, and full 10 Gb Ethernet bandwidth might not be available across it. SAP
validated the IBM solution by using a 10 Gb interconnect, but depending on the
workload, a HANA cluster might not generate that much traffic, and a bandwidth
much smaller than 10 Gb can be sufficient. Each individual client must decide on
this trade-off. For example, the initial database load might take hours over a
1 Gbit connection, but only minutes over a 10 Gbit connection to the remote site.
During normal operation, latency is more critical than bandwidth for the overall
application performance.
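The 1 Gbit versus 10 Gbit example can be put into rough numbers with simple arithmetic (the 2 TB database size is an arbitrary assumption, and protocol overhead is ignored):

```python
# Idealized transfer-time estimate for the initial database copy across the
# inter-site link. Sizes and link speeds are illustrative only.

def transfer_hours(data_tb, link_gbit):
    bits = data_tb * 1e12 * 8           # terabytes -> bits
    seconds = bits / (link_gbit * 1e9)  # ideal line rate, no overhead
    return seconds / 3600

# A 2 TB database copy over 1 Gbit vs 10 Gbit:
print(round(transfer_hours(2, 1), 1))   # 4.4 hours on 1 Gbit
print(round(transfer_hours(2, 10), 2))  # 0.44 hours (about 27 minutes) on 10 Gbit
```

The tenfold bandwidth difference translates directly into the hours-versus-minutes gap that the text describes for the initial load; steady-state replication traffic is much smaller, which is why latency then dominates.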


As with a standard scale-out implementation, the DR configuration relies on
GPFS functionality to enable the synchronous data replication between sites. A
single-site solution holds two replicas of each data block. This function is
enhanced with a third replica in the dual-site DR implementation. A stretched
GPFS cluster is implemented between the two sites. Figure 7-23 on page 223
shows that there is a combined cluster on GPFS level spanning both sites, where
each site's SAP HANA cluster is independent of the other one. GPFS file
placement policies ensure that there are two replicas on the primary site and a
third replica on the secondary site. If there is a site failure, the file system can
stay active with a complete data replica in the secondary site. The SAP HANA
database can then be made operational with the one remaining replica of
persistency and log files.
GPFS is a cluster file system. As such, it is vulnerable to a split-brain situation. A
split brain happens when the connection between the two data centers is lost but
both clusters can still communicate internally. Each surviving cluster then thinks
that the nodes at the other site are down and it is safe to continue writing to the
file system. In the worst case, this can lead to inconsistent data at the two sites.
To avoid this situation, GPFS requires a quorum of nodes to be able to
communicate. This is called a GPFS cluster quorum. Not every server that is
designated as a GPFS quorum node is elected to act as one. GPFS chooses an
odd number of servers to act as quorum nodes. The exact number depends on
the number of total servers within the GPFS cluster. For an SAP HANA DR
installation, the primary active site always has one more node assigned as a
quorum node than the backup site. This ensures that if the inter-site link goes
down, GPFS stays up on the primary site nodes.
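The majority rule behind the GPFS cluster quorum can be illustrated with a minimal sketch (the node counts follow the two-site example above with one extra quorum node on the primary site; this is not how GPFS itself is configured):

```python
# A cluster partition stays alive only if it can reach a majority of the
# designated quorum nodes.

def has_quorum(reachable, total_quorum_nodes):
    """True if `reachable` quorum nodes form a strict majority."""
    return reachable > total_quorum_nodes // 2

# Example: 3 quorum nodes at the primary site, 2 at the backup site.
total = 5
print(has_quorum(3, total))   # True  - primary keeps quorum if the link drops
print(has_quorum(2, total))   # False - backup alone stops, avoiding split brain
```

Giving the primary site the extra quorum node is what makes the outcome of a link failure deterministic: only the primary partition retains a majority, so writes can never continue independently on both sides.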
In addition to the GPFS cluster quorum, each file system in the GPFS cluster
stores vital information in a small structure that is called a file system descriptor
(shown as FSdesc in the output of GPFS commands). This file system descriptor
contains information such as file system version, mount point, and a list of all the
disks that make up this file system. When the file system is created, GPFS writes
a copy of the file system descriptor on to every disk. From then on, GPFS
updates them only on a subset of disks upon changes to the file system.


Depending on the GPFS configuration, either three, five, or six copies are kept
up-to-date.3 A majority of valid file system descriptors is required for the file
system to be accessible.4 If disks fail over time, GPFS updates another copy on
different disks to ensure that all copies are alive. If multiple disks fail
concurrently and each was holding a valid copy of the file system descriptor, the
cluster loses the file system descriptor quorum and the file system automatically
is unmounted.

Site failover
During normal operation, there is a running SAP HANA instance active on the
primary site. The secondary site has an installed SAP HANA instance that is
inactive. A failover to the remote SAP HANA installation is a manual procedure;
however, it is possible to automate the steps in a script. Depending on the reason
for the site failover, you can decide whether the secondary site becomes the new
production site or a failback must happen after the error in the primary site is
fixed.
To ensure the highest level of safety, during normal operation the GPFS file
system is not mounted on the secondary site. This ensures that there is neither
read nor write access to the file system. If required, however, the file system can
be mounted read-only, for example, to allow for backup operations on the DR
site.
A failover is required after any event that brings the primary site down. Single
errors are handled within the site by using the fully redundant local hardware
(such as a spare HA node and a second network interface).
Events that are handled within the primary site include the following ones:

Single-server outage
Single-switch outage
Accidentally pulled network cable
Local disk failure

Events that cause a failover to the DR site include the following ones:
Power outage in the primary data center causing all nodes to be down
Two servers going down at the same time (not necessarily because of the
same problem)

3. For environments with just one disk, GPFS uses only one file system descriptor. This scenario
does not apply to SAP HANA setups.
4. Even if all disks fail, if there is at least one valid file system descriptor that is accessible, you still
have a chance to recover data manually from the file system.

The time for the failover procedure depends on how long it takes to start SAP
HANA on the backup site. The data is already in the data center and ready to be
used. Any switch from one site to the other includes a downtime of SAP HANA
operations because the two independent instances on either site must not run
concurrently; they share the persistency and log files on the file system.
GPFS provides the means to restore HA capabilities on the backup site. During
normal operation, only one replica is pushed to the backup site. However, clients
that consider their data centers to be equal might choose to not gracefully fail
back to the primary data center after it is repaired, but instead continue to run
production from the backup site. For this scenario, a GPFS restripe is triggered
that creates a second replica on the backup site out of the one available replica.
This procedure is called restoring HA capabilities. The exact commands are
documented in the IBM SAP HANA Operations Guide.5 The duration of this
restriping depends on the amount of data in the file system.
When the SAP HANA instance is started, the data is loaded into main memory.
The SAP HANA database is restored to the latest savepoint and the available
logs are recovered. This procedure can be automated, but it is dependent on the
client environment. The commands in the Operations Guide provide a template
for such automation.
Clients choosing to continue running production out of the former backup data
center can easily add the former primary site back after it is repaired. The nodes
are integrated back into the GPFS cluster, and a resynchronization of the most
recent version of the data occurs between the new primary data center and the
new secondary data center. One replica is held in the new secondary site. The
overall picture looks exactly as it did before the failover, only with the data
centers having switched their designations.
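The idea of restoring HA capabilities through a restripe can be modeled in a few lines (node names and the block layout are invented; the actual procedure uses the GPFS commands documented in the IBM SAP HANA Operations Guide):

```python
# Toy model of "restoring HA capabilities": after a site failure, only one
# replica of each block exists at the surviving site; a restripe walks the
# file system and creates a second copy on another surviving node.

def restripe(blocks, surviving_nodes, target_replicas=2):
    """blocks: {block_id: set of node names holding a replica}."""
    for block_id, holders in blocks.items():
        holders &= set(surviving_nodes)        # replicas on the lost site are gone
        spare = [n for n in surviving_nodes if n not in holders]
        while len(holders) < target_replicas and spare:
            holders.add(spare.pop(0))          # copy the block to another node
        blocks[block_id] = holders
    return blocks

# Before the restripe: one replica at the lost primary site (node01..node02)
# and one at the surviving backup site (node05..node06).
blocks = {"b1": {"node01", "node05"}, "b2": {"node02", "node06"}}
restripe(blocks, ["node05", "node06", "node07", "node08"])
print(all(len(holders) == 2 for holders in blocks.values()))   # True
```

Every block must be read and rewritten once, which is why the text describes restriping as an I/O-heavy operation whose duration grows with the amount of data in the file system.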

Site failback
A site failback is defined as a graceful switch of the production SAP HANA
instance from the secondary data center back to the primary data center.
To understand the procedure for a failback, it is important to know the initial state
of the DR environment. There are two possibilities:
- You have two replicas that are local in the data center and one replica in the
  remote site's data center. Here are some examples:
  - A disaster happened and you restored HA on the backup site; now you are
    ready to fail back production to the primary site again.
  - During normal operations, you want to switch production to the backup site
    for maintenance reasons.
- You have one replica that is local in the data center and two replicas in the
  remote site's data center. For example, a disaster happened and you are
  running production from the backup data center without having HA restored.
Environments with only one working replica must restore HA first before being
able to fail back gracefully, which ensures the highest level of safety for your data
during the failback procedure.

5. SAP Notes can be accessed at http://service.sap.com/notes. An SAP S-user ID is required.
When SAP HANA is running from a file system with two local replicas, the
failback procedure is identical to a controlled failover procedure. The data center
is assumed to be down, the active site now is the remote site, HA is restored
(using GPFS restriping), and the second site is attached again with one single
replica of the data. SAP HANA can be started after it is shut down on the other
site, but it experiences a performance impact while HA is being restored,
because GPFS restriping is an I/O-heavy operation.

DR environments with dedicated quorum node


The most reliable way to implement DR is with the use of a dedicated quorum
node in a third site. The sole purpose of the quorum node is to decide which site
is allowed to run the production instance after the link that is connecting the
primary and secondary data centers is lost. This situation is known as a split
brain. The quorum node, placed at a third site, has a separate connection to the
primary and to the secondary site. Figure 7-25 shows this configuration.

Figure 7-25 Network setup of GPFS based DR for a scale-out system with a dedicated quorum node


The quorum node is configured to act as a GPFS quorum server without storing
any SAP HANA data. The only requirement is to have a small disk or partition
available that can hold a file system descriptor. The quorum node needs network
access to only the GPFS network. SAP HANA network access is not required.
In a standard setup, two nodes from the primary data center act as quorum
nodes, two nodes from the secondary data center act as quorum nodes, plus the
additional quorum node at the third site. The number of quorum nodes is shown
as the weight of a data center in Figure 7-26.

Figure 7-26 Outline of a three-site environment with a dedicated GPFS quorum node (primary site running production, weight = 2; secondary site waiting to take over, weight = 2; tertiary site with the GPFS quorum node only, weight = 1)

In the unlikely event that two inter-site links are interrupted at the same time, a
majority of quorum nodes can still communicate with each other over the one
remaining connection and decide from which data center to run production. In
terms of weight, this means that in any of these situations a combined weight of
at least three remains connected.
If the links between the primary and secondary and between the secondary and
tertiary data centers go down, SAP HANA keeps running out of the primary site
without any downtime. If the links between the primary and tertiary and between
the secondary and tertiary data centers go down, the primary and secondary
data center can still communicate and SAP HANA keeps running out of the
primary data center.


If the links between the primary and secondary and the primary and tertiary data
centers go down, it means that the production SAP HANA instance in the primary
data center is isolated and loses GPFS quorum. As a safety measure, GPFS
prevents any writing to the file system on the primary site and SAP HANA stops.
GPFS stays up and running on the secondary site because the quorum node still
has a connection to it. Depending on client requirements, HA might have to be
restored first before SAP HANA is started, and production then continues to run
out of the secondary site data center.
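The three link-failure scenarios above can be checked with a small simulation of the weight rule (site names and the majority threshold of 3 out of a total weight of 5 follow the figure; the code is illustrative only):

```python
# Weights: primary site A = 2, secondary site B = 2, tertiary quorum site C = 1.
WEIGHTS = {"A": 2, "B": 2, "C": 1}

def surviving_groups(up_links):
    """up_links: set of frozenset site pairs that can still communicate.
    Returns the connected site groups that keep a majority of the total
    quorum weight (at least 3 of 5) and so may continue to run GPFS."""
    groups, seen = [], set()
    for site in WEIGHTS:
        if site in seen:
            continue
        group, todo = set(), [site]
        while todo:                     # walk the still-connected sites
            s = todo.pop()
            if s in group:
                continue
            group.add(s)
            todo += [t for t in WEIGHTS if frozenset((s, t)) in up_links]
        seen |= group
        if sum(WEIGHTS[s] for s in group) >= 3:
            groups.append(group)
    return groups

# Links A-B and A-C cut: primary site A (weight 2) loses quorum and SAP HANA
# stops there; B and C together (weight 3) keep the file system up.
print(surviving_groups({frozenset("BC")}))
# Links A-B and B-C cut: A and C (weight 3) keep production running out of A.
print(surviving_groups({frozenset("AC")}))
```

In every double-link failure exactly one group reaches weight 3, so there is never a case where both main sites believe they may run production.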
It is a valid use case to set up an existing server to act as a GPFS quorum node
for the SAP HANA DR installation.6 GPFS must have root access on this
machine to run. Three GPFS server licenses are required on the primary and on
the secondary sites for the first three servers.7 Additional servers need GPFS
FPO licenses.
The main advantage of having a dedicated quorum node is that the file system
always is available during failover and failback without any manual intervention if
there is at least one site and the quorum nodes can communicate with each
other.

DR environments without a dedicated quorum node


Environments that do not have a third site to host a dedicated quorum node can
still implement a GPFS based DR solution for SAP HANA. The difference
between environments with and without a dedicated quorum node is the
procedure that is required upon a failover. Figure 7-27 on page 231 shows such
a scenario with no third site.

6. The applicability of this statement must be verified for each installation by IBM.
7. The currently validated minimum number of servers on each site for DR is three. This is required
by SAP to set up a scale-out environment with HA. Hence, the requirement is for at least three
GPFS server licenses per site.

Figure 7-27 Network setup of GPFS based DR for a scale-out system without a dedicated quorum node

A dedicated quorum node at a third site allows the file system to remain
accessible even if the primary or secondary site goes down because GPFS has
the majority of its nodes still available. Without a dedicated quorum node, the
quorum nodes functionality must be put on the primary site to ensure that GPFS
in the primary data center continues to operate even if the inter-site link is
interrupted.
Figure 7-28 shows the GPFS view of an environment without a dedicated
quorum node. The weight symbolizes quorum designation. There are three
servers with a quorum designation in the primary data center and two servers in
the backup data center.

Figure 7-28 Outline of a two-site environment without a dedicated GPFS quorum node (primary site running production, weight = 3; secondary site waiting to take over, weight = 2)


If a disaster happens at the primary site, the GPFS cluster loses its quorum
because the two quorum nodes at the backup site do not meet the minimum of
three. An additional procedure is required to relax GPFS quorum on the surviving
secondary site before GPFS comes up again. The exact procedure is
documented in the Operations Guide for IBM Systems Solution for SAP HANA.
After GPFS is running again, the procedure is identical to a failover with a
dedicated quorum node. It is optional to restore HA capabilities within the
secondary site. SAP HANA can be started already while this restore procedure is
still running, but a performance impact must be expected because this is an
I/O-intensive operation.
You need three GPFS server licenses on the primary and on the backup sites
even though during normal operation only two of them are required on the
backup site. If a disaster happens and you must fail over the SAP HANA
production instance to the backup data center, this data center becomes the
main SAP HANA cluster and requires three GPFS server licenses. Additional
servers on either site get GPFS FPO licenses.8

Backup site hosting non-production SAP HANA


In the environments that are described so far, all servers on the secondary site
receive data only over the network from the primary site and store it on local
disks. Other than that, they are idling. There are no SAP HANA processes
running on them.
To use the idling compute power, SAP supports the hosting of non-production
SAP HANA instances on the backup site, for example, a quality assurance (QA)
or training environment. When a disaster happens, this non-production instance
must be shut down before the failover procedure of the production instance can
be initiated.
The non-production SAP HANA instances need additional space for persistency
and log data. IBM uses the EXP2524 to extend the locally available disk storage
space. The EXP2524 directly connects through an SAS interface to one single
server and provides up to 24 additional 2.5-inch disks. You need one EXP2524
for each secondary site server that is supposed to participate in hosting a
non-production instance. Figure 7-29 on page 233 shows the overall architecture
in an example with four SAP HANA appliances per site.

8. The currently validated minimum number of servers on each site for DR is three. This is required
by SAP to set up a scale-out environment with HA. Hence, the requirement is for at least three
GPFS server licenses per site.

Figure 7-29 Overview of running non-production SAP HANA instances on the idling backup site

If a failover happens from the primary site to the secondary site and you plan to
keep running production from the secondary data center, you must be able to
host the non-production instances in the primary data center. To accommodate
the additional storage space on the primary site servers, you must connect
EXP2524 expansion units to them as well.
If you plan to gracefully fail back your production instance from the backup site
to the primary site after it is repaired, you do not need EXP2524 expansion units
on the primary site servers. Keep in mind, however, that unforeseen outages
might take a long time to repair, and you cannot run your non-production
instances during such an outage.
There is exactly one new file system that spans all expansion units of the backup
site servers. This new file system runs with a GPFS replication factor of two,
meaning that there are always two copies of each data block. The first replica is
stored locally on the node writing the data. The second replica is stored in a
striped round-robin fashion over the other nodes. This is identical to a scale-out
HA environment: one server can fail and the data is still available on the other
nodes' expansion units. Figure 7-29 shows this from node5's perspective. If node5
writes data into the non-production file system, it stores the first replica on local
disks in its expansion unit. The second replica is striped across the expansion
unit drives of node6, node7, and node8 (symbolized as a long blue box in the
figure).


Although IBM does not support a multi-SID configuration, it is a valid scenario to
run different SAP HANA instances on different servers. If you have a cluster of six
nodes on each site, you can, for example, run QA on two nodes and development
on four nodes, or QA on three nodes and development on three nodes. All
non-production instances must use the same file system, which means that in
this example QA and development must be configured to use different
directories.

Summary
The DR solution for the IBM Systems Solution for SAP HANA uses the advanced
replication features of GPFS, creating a cross-site cluster that ensures availability
and consistency of data across two sites. It does not impose the need for
additional storage systems, but instead builds upon the scale-out solution for
SAP HANA. This simple architecture reduces the complexity in maintaining such
a solution while keeping the possibility of adding more nodes over time if the
database grows.

7.3.3 HA using GPFS replication plus DR using SAP HANA System
Replication
As an alternative to GPFS-only environments, customers can choose SSR to
implement DR capabilities. In such a scenario, HA within the primary data center
still uses GPFS storage replication, but data replication to the DR site is handled
by SAP HANA at the application level with the SSR feature.
From a GPFS perspective, such a setup requires two distinct clusters, one at
each site. SSR replicates data from the primary cluster to the DR site cluster.
Figure 7-30 on page 235 shows a four-node scenario with three worker nodes
and one standby node (for HA) at each site. SSR replicates data from each node
to its corresponding node at the DR site.


Figure 7-30 DR of a scale-out configuration using SAP HANA System Replication

This replication can be configured to work in either synchronous or
asynchronous mode. Synchronous mode ensures an RPO of zero, but limits the
maximum distance between the two sites. Asynchronous mode allows for bigger
distances, but with the risk of losing data that is still unacknowledged when the
disaster happens (RPO > 0).
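The RPO difference between the two modes can be sketched with a toy model (transaction identifiers and the log layout are invented for illustration):

```python
# In synchronous mode the primary commits only what the secondary has
# acknowledged, so nothing committed is ever lost (RPO = 0). In asynchronous
# mode the primary commits ahead of the secondary, so a disaster can erase
# the in-flight tail (RPO > 0).

def committed_after_disaster(log, mode, acknowledged_upto):
    """log: ordered transaction ids; acknowledged_upto: how many entries
    the secondary had received when the primary site was lost.
    Returns (transactions confirmed to clients, transactions that survive)."""
    if mode == "sync":
        # The primary never confirmed anything the secondary lacks.
        return log[:acknowledged_upto], log[:acknowledged_upto]
    # async: the primary confirmed everything, the secondary has a prefix.
    return log, log[:acknowledged_upto]

log = [1, 2, 3, 4, 5]
confirmed, survived = committed_after_disaster(log, "async", 3)
print(set(confirmed) - set(survived))   # {4, 5}: confirmed but lost (RPO > 0)
confirmed, survived = committed_after_disaster(log, "sync", 3)
print(set(confirmed) - set(survived))   # set(): nothing confirmed is lost
```

This is the trade-off stated above: synchronous mode buys a zero RPO at the cost of adding the inter-site round trip to every commit, which is what limits the distance between the sites.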
For more information about SSR and the two replication modes, see 7.1.2, SAP
HANA System Replication on page 192.
Single-node failures are handled within the primary data center. The standby
node takes over operation of the failed node. This happens automatically without
administrative intervention.
Multi-node failures, such as a full data center outage or other disasters, require a
failover of SAP HANA to the DR site. This failover is a manual activity. SAP
HANA administrators must promote DR site nodes from running in recovery
mode to active workers (or standby, respectively).


With the release of SAP HANA 1.0 SPS 08 in June 2014, it is possible to use
different networks for SSR traffic. You can use either the front-end client network,
the back-end SAP HANA network, or a new dedicated replication network
spanning both sites. Having the choice between multiple networks allows you to
better adapt to different customer situations.
During normal operation, the nodes at the DR site (site B) are running in recovery
mode. This mode allows you to host a non-productive SAP HANA instance (such
as development or test) on these idling nodes. If there is a disaster, this
non-productive SAP HANA instance must be stopped before the production
instance can be started at site B.
This additional SAP HANA instance needs its own storage space for persistency
and logs. The EXP2524 unit is used to provide this additional space. The
EXP2524 adds up to twenty-four 2.5-inch HDDs that are connected directly to
the server through a SAS interface. A second file system is created over those
drives for the additional SAP HANA instance. Depending on the size of this
additional database instance, one or more DR site nodes must be used. Every
node in scope requires an EXP2524 storage expansion. This second file system
is created with a replication factor of two to implement HA in the non-production
instance.

7.4 Backup and restore


Because SAP HANA plays a critical role in the overall SAP IT landscape, it is
essential to back up the data in the SAP HANA database and to be able to restore it.
This section gives a short overview of the basics of backup and recovery for
SAP HANA, the available products that are certified for use with SAP HANA, and
the integration of SAP HANA and IBM Tivoli Storage Manager for ERP.

7.4.1 Basic backup and recovery


Simply copying the savepoints and the database logs of a running system cannot
be done in a consistent way, and thus does not yield a consistent backup that can be
recovered from. Therefore, a simple file-based backup of the persistency layer of
SAP HANA is not sufficient.

236

In-memory Computing with SAP HANA on IBM eX5 and X6 Systems

Backing up
A backup of the SAP HANA database must be triggered through the SAP HANA
Studio or through the SAP HANA SQL interface. SAP HANA then creates a
consistent backup, consisting of one file per SAP HANA service on each cluster
node. SAP HANA always performs a full backup. Incremental backups are not
supported by SAP HANA.
SAP HANA internally maintains transaction numbers, which are unique within a
database instance, even in a scale-out configuration. To create a consistent
backup across a scale-out configuration, SAP HANA chooses a specific
transaction number, and all nodes of the database instance write their own
backup files, including all transactions up to this transaction number.
The backup files are saved to a defined staging area that might be on the internal
disks, an external disk on an NFS share,9 or a directly attached SAN subsystem.
In addition to the data backup files, the SAP HANA configuration files and backup
catalog files must be saved to be recovered. For point-in-time recovery, the log
area also must be backed up.
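As an illustration, a full data backup can be triggered from the command line through hdbsql. The instance number, user, and backup file prefix below are placeholders and must be adapted:

```shell
# Trigger a consistent, cluster-wide full data backup through the SQL
# interface. The prefix 'MONDAY_FULL' names the resulting backup files,
# written to the configured backup staging area.
hdbsql -i 00 -u SYSTEM -p "$HANA_PASSWORD" \
    "BACKUP DATA USING FILE ('MONDAY_FULL')"
```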
With the System x solution for SAP HANA, one of the 1 Gbit network interfaces of
the server can be used for NFS connectivity, or an additional 10 Gbit network
interface must be installed (if a PCIe slot is available). You can add a Fibre
Channel host bus adapter (HBA) for SAN connectivity. The Quick Start Guide for
the IBM Systems Solution for SAP HANA lists supported hardware additions to
provide additional connectivity. This guide can be found at the following website:
http://www-947.ibm.com/support/entry/myportal/docdisplay?lndocid=MIGR-5087035

Restoring a backup
It might be necessary to recover the SAP HANA database from a backup in the
following situations:
- The data area is damaged.
  If the data area is unusable, the SAP HANA database can be recovered up to
  the last committed transaction if all the data changes after the last complete
  data backup are still available in the log backups and log area. After the data
  and log backups are restored, the SAP HANA database uses the data and
  log backups and the log entries in the log area to restore the data and replay
  the logs to recover. It is also possible to recover the database using an older
  data backup and log backups if all relevant log backups that are made after
  the data backup are available.10
9. SAP Note 1820529 lists network file systems that are unsuitable for backup and recovery.
10. For help with determining the files that are needed for a recovery, see SAP Note 1705945.


- The log area is damaged.
  If the log area is unusable, the only way to recover is to replay the log
  backups. Therefore, any transactions that are committed after the most recent
  log backup are lost, and all transactions that were open during the log backup
  are rolled back. After restoring the data and log backups, the log entries from
  the log backups are automatically replayed in order to recover. It is also
  possible to recover the database to a specific point if it is within the existing
  log backups.
- The database must be reset to an earlier point because of a logical error.
  To reset the database to a specific point, a data backup from before that point
  and the subsequent log backups must be restored. During recovery, the log
  area might be used as well, depending on the point to which the database is
  reset. All changes that are made after the recovery time are (intentionally) lost.
- You want to create a copy of the database.
  You might want to create a copy of the database for various purposes, such
  as creating a test system.
A database recovery is initiated from the SAP HANA Studio or, starting with SAP
HANA 1.0 SPS07, from the command line.
Certain restrictions apply when restoring a backup. Up to and including SAP
HANA 1.0 SPS06, the target SAP HANA system was required to be identical to
the source, with regard to the number of nodes and node memory size. Starting
with SPS07, it is possible to recover a backup that is taken from an m-node
scale-out system and restore it on an n-node scale-out environment. Memory
configuration also can be different. You must configure m-index server instances
on the n-node target environment to restore the backup. This means nodes can
have more than one index server. Such a configuration does not provide the best
performance, but it might be sufficient for test or training environments.
When restoring a backup image from a single-node configuration into a scale-out
configuration, SAP HANA does not repartition the data automatically. The correct
way to bring a backup of a single-node SAP HANA installation to a scale-out
solution is as follows:
1. Back up the data from the stand-alone node.
2. Install SAP HANA on the master node.
3. Restore the backup into the master node.
4. Install SAP HANA on the subordinate and standby nodes as appropriate, and
add these nodes to the SAP HANA cluster.
5. Repartition the data across all worker nodes.


More detailed information about the backup and recovery processes for the SAP
HANA database is provided in the SAP HANA Backup and Recovery Guide,
found at the following website:
http://help.sap.com/hana_platform

7.4.2 File-based backup tool integration


By using the mechanisms that are described in 7.4.1, "Basic backup and
recovery" on page 236, virtually any backup tool can be integrated with SAP
HANA. Backups can be triggered programmatically by using the SQL interface,
and the resulting backup files that are written locally then can be moved into the
backup storage by the backup tool. Backup scheduling can be done by using
scripts that are triggered by the standard Linux job scheduling capabilities or
other external schedulers. Because the Backint backup interface was introduced
to SAP HANA with SPS05, a file-based backup tool integration is the only option
for pre-SPS05 SAP HANA deployments.
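A minimal wrapper for such a scripted, cron-scheduled file-based backup might look as follows. The instance number, credentials, and paths are examples and must be adapted to the environment:

```shell
#!/bin/sh
# Hypothetical file-based backup wrapper: trigger the backup through the
# SQL interface, then hand the resulting files to the backup storage.
set -e

PREFIX="SCHEDULED_$(date +%Y%m%d_%H%M)"

# 1. Let SAP HANA write a consistent full backup to its staging area.
hdbsql -i 00 -u SYSTEM -p "$HANA_PASSWORD" \
    "BACKUP DATA USING FILE ('$PREFIX')"

# 2. Move the backup files to the backup storage (example NFS mount).
mv /hana/backup/data/"$PREFIX"* /mnt/backupserver/hana/

# A crontab entry can then run this script daily at 01:00, for example:
# 0 1 * * * /usr/local/bin/hana_file_backup.sh
```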
Section A.2, "File-based backup with IBM Tivoli Storage Manager for ERP" on
page 258 describes a file-based integration of IBM Tivoli Storage Manager for
ERP V6.4 with SAP HANA.

7.4.3 GPFS snapshots as a backup source


All System x solutions for SAP HANA use GPFS as the file system on which SAP
HANA runs. GPFS supports a snapshot feature that, similar to enterprise
storage snapshot features, allows you to take a consistent and stable view of the file
system that can then be used to create a backup. While the snapshot is active,
GPFS stores any changes to files in a temporary delta area. After the snapshot is
released, the delta is merged with the original data and any further changes are
applied to this data.
Taking only a GPFS snapshot does not ensure that you have a consistent backup
that you can use to perform a restore. SAP HANA must be instructed to flush out
any pending changes to disk to ensure a consistent state of the files in the file
system. With the release of SAP HANA 1.0 SPS07, a snapshot feature is
introduced that prepares the database to write a consistent state to the data area
of the file system (the log area is not affected by this feature). While this snapshot
is active, a GPFS snapshot must be triggered. SAP HANA can then be instructed
to release its snapshot. Using Linux copy commands or other more sophisticated
backup tools, the data can then be stored in a backup place (NFS share, SAN
storage, or other places).
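The sequence can be sketched as follows. The SQL statements reflect the snapshot support introduced with SPS07 and should be verified against the revision in use; the GPFS device name, backup ID, and paths are examples:

```shell
# 1. Ask SAP HANA to prepare a consistent snapshot of its data area.
hdbsql -i 00 -u SYSTEM -p "$HANA_PASSWORD" "BACKUP DATA CREATE SNAPSHOT"

# 2. While the database snapshot is active, snapshot the GPFS file system
#    ("sapmntdata" is an example device name).
mmcrsnapshot sapmntdata hanabak1

# 3. Release the database snapshot again, referencing its backup ID from
#    the backup catalog.
hdbsql -i 00 -u SYSTEM -p "$HANA_PASSWORD" \
    "BACKUP DATA CLOSE SNAPSHOT BACKUP_ID <id> SUCCESSFUL 'gpfs-snapshot'"

# 4. Copy the frozen data from the GPFS snapshot to the backup location,
#    then delete the snapshot so the delta area is merged back.
cp -a /sapmnt/.snapshots/hanabak1/data /mnt/backupserver/hana/
mmdelsnapshot sapmntdata hanabak1
```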


Using snapshots has much less impact on the performance of the running
database than triggering a file-based backup. Triggering a GPFS snapshot works
in single-node and scale-out environments. The time that it takes to activate a
snapshot depends on the amount of data in the file system and the current load
on it.

7.4.4 Backup tool integration with Backint for SAP HANA


Starting with SAP HANA 1.0 SPS05, SAP provides an application programming
interface (API) that can be used by manufacturers of third-party backup tools to
back up the data and redo logs of an SAP HANA system.11 Using this Backint for
SAP HANA API, a full integration with SAP HANA studio can be achieved,
allowing configuration and running of backups using Backint for SAP HANA.
With Backint, instead of writing the backup files to local disks, dedicated SAN
disks, or network shares, SAP HANA creates data stream pipes. Pipes are a way
to transfer data between two processes: one is writing data into the pipe, and the
other one is reading data out of the pipe. This makes a backup using Backint a
one-step backup. No intermediate backup data is written, unlike with a file-based
backup tool integration that writes to local disk first. This relieves the local I/O
subsystem from the backup workload.

Backing up through Backint


The third-party backup agent runs on the SAP HANA server and communicates
with the third-party backup server. SAP HANA communicates with the third-party
backup agent through the Backint interface. After the user initiates a backup
through the SAP HANA Studio or by running hdbsql, SAP HANA writes a set of
text files describing the parameterization for this backup, including version and
name information, stream pipe location, and the backup policy to use. Then, SAP
HANA creates the stream pipes. Each SAP HANA service (for example, index
server, name server, statistics server, and XS engine) has its own stream pipe to
which to write its own backup data. The third-party backup agents read the data
streams from these pipes, and pass them to the backup server. Currently, SAP
HANA does not offer backup compression; however, third-party backup agents
and servers can compress the backup data and further transform it, for example,
by applying encryption. Finally, SAP HANA transmits backup catalog information
before the third-party backup agent writes a file reporting the result and
administrative information, such as backup identifiers. This information is made
available in SAP HANA Studio.
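The pipe mechanism itself can be illustrated with a few lines of plain shell. The writer and reader below stand in for an SAP HANA service and a backup agent; the data flows through the pipe without ever being stored as a file:

```shell
# Create a named pipe, as Backint does for each SAP HANA service.
mkfifo /tmp/backint_demo_pipe

# Writer (the database role): stream backup data into the pipe.
printf 'backup-bytes' > /tmp/backint_demo_pipe &

# Reader (the backup agent role): consume the stream. No intermediate
# file is written, so the local I/O subsystem carries no backup load.
result=$(cat /tmp/backint_demo_pipe)
wait
rm /tmp/backint_demo_pipe
echo "$result"    # prints "backup-bytes"
```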

11. For more information, see SAP Note 1730932, "Using backup tools with Backint".

Restoring through Backint


As described in "Restoring a backup" on page 237, a database restore might be
necessary when the data area or log area is damaged, to recover from a logical
error, or to copy the database. This can be achieved by using data and log
backups that were performed previously.
A restore operation can be initiated through the SAP HANA Studio only. For the
first step, SAP HANA shuts down the database. SAP HANA then writes a set of
text files describing the parameterization for this restore, including a list of
backup identifiers and stream pipe locations. After receiving the backup catalog
information from the third-party backup tool, SAP HANA performs a series of
checks to ensure that the database can be recovered with the backup data
available. Then, SAP HANA establishes the communication with the third-party
backup agents by using stream pipes, and requests the backup data from the
backup server. The backup agents then stream the backup data that is received
from the backup server through the stream pipes to the SAP HANA services. As
a final step, the third-party backup agent writes a file reporting the result of the
operation for error-handling purposes. This information is made available in SAP
HANA Studio.

Backint certification
Backup tools using the Backint for SAP HANA interface are subject to
certification by SAP. The certification process is documented at
http://scn.sap.com/docs/DOC-34483. To determine which backup tools are
certified for Backint for SAP HANA, search the Partner Information Center and
select the SAP-defined integration scenario HANA-BRINT 1.1 - HANA Backint
Interface. The search function of the Partner Information Center is available at
http://www.sap.com/partners/directories/SearchSolution.epx.
As of April 2014, the following tools are certified by SAP:
- Tivoli Storage Manager for ERP 6.4
- Symantec NetBackup 7.5
- Commvault Simpana 10.0
- HP Data Protector 7.0 and 8.1
- EMC Networker 8.2
- EMC Interface for Data Domain Boost for Databases and Applications 1.0

The following sections give a short overview about the first two backup tools in
this list.


7.4.5 Tivoli Storage Manager for ERP 6.4


Starting with Version 6.4.1, Tivoli Storage Manager for ERP integrates with the
Backint for SAP HANA API for simplified protection of SAP HANA in-memory
databases.
Tivoli Storage Manager for ERP V6.4.1 simplifies and improves the performance
of backup and restore operations for SAP HANA in-memory relational databases
by eliminating an interim copy to disk. The former two-step process for these
operations is replaced by a one-step process by using the new Backint for SAP
HANA API that is available in the SAP HANA Support Package Stack 05 (SPS
05). By using the Backint interface, Tivoli Storage Manager for ERP can now
support any SAP HANA appliance, including those running on competitive
Intel-based hardware such as HP, Cisco, or Fujitsu, provided that the SAP HANA
level in use is supported by Tivoli Storage Manager for ERP.
The new extensions to Tivoli Storage Manager for ERP allow you to use the most
recent functions in SAP HANA's in-memory database environments, for example,
enhanced performance and scalability in multi-node environments and better
consumability through tighter integration with SAP HANA Studio. These new
extensions also allow you to apply familiar features of Tivoli Storage Manager for
ERP to SAP HANA backup and recovery:
- Management of all files per backup as a logical entity
- Running multiple parallel multiplexed sessions
- Using multiple network paths
- Backing up to multiple Tivoli Storage Manager servers, and so on
- Creation of multiple copies of the database redo logs as needed
- Compression
- Deduplication

In addition to the Backint integration, Tivoli Storage Manager for ERP 6.4
continues to feature a file-based integration with SAP HANA for pre-SPS05 SAP
HANA deployments on IBM hardware. Section A.2, "File-based backup with IBM
Tivoli Storage Manager for ERP" on page 258 describes such a file-based
integration of Tivoli Storage Manager for ERP V6.4 with SAP HANA in detail.

7.4.6 Symantec NetBackup 7.5 for SAP HANA


Symantec NetBackup 7.5 was the first third-party backup tool to be certified for
the Backint for SAP HANA interface, in December of 2012.12
12. For more information, see http://www.saphana.com/community/blogs/blog/2012/12/19/backint-for-sap-hana-certification-available-now

With a NetBackup agent Version 7.5.0.6 or later installed on the SAP HANA
appliance,13 SAP HANA can be integrated into an existing NetBackup
deployment. This allows SAP HANA to send streamed backup data to
NetBackup, which uses an SAP policy to manage the backed up data sets and
provide destination targets, retention, duplication, and replication for DR
purposes. This seamless integration facilitates the following benefits:

- Native integration into SAP HANA Studio for ease of use
- Integrated media management and capacity management
- Deduplication for backup sets
- Compression
- Encryption on various levels, including network transmission and on tape
- DR for backups with NetBackup Auto Image Replication (AIR)

Symantec NetBackup supports SAP HANA appliances from all vendors of
validated SAP HANA configurations, including IBM.

7.4.7 Backup and restore as a DR strategy


Using backup and restore as a DR solution is a basic way of providing DR.
Depending on the RPO, it might be a viable way to achieve DR. The basic
concept is to back up the data on the primary site regularly (at least daily) to a
defined staging area, which might be an external disk on an NFS share or a
directly attached SAN subsystem (this subsystem does not need to be dedicated
to SAP HANA). After the backup is done, it must be transferred to the secondary
site, for example, by a simple file transfer (can be automated) or by using the
replication function of the storage system that is used to hold the backup files.
Following a company's DR strategy, production SAP HANA must be able to run
on the backup site. Therefore, an SAP HANA system must exist on the
secondary site that is similar to the one on the primary site, at minimum
regarding the number of nodes and node memory size. During normal
operations, this system can run other non-productive SAP HANA instances, for
example, quality assurance (QA), development (DEV), test, or other second-tier
systems. If the primary site goes down, the system must be cleared of these
second-tier HANA systems and the backup can be restored. Upon configuring
the application systems to use the secondary site instead of the primary one,
operation can be resumed. In case of a disaster, the SAP HANA database
recovers from the latest backup.
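As a sketch, the transfer step can be as simple as an rsync call after the backup finishes; the host name and paths below are examples:

```shell
# Ship new or changed backup files to the secondary site over SSH.
# --partial lets an interrupted transfer of large backup files resume.
rsync -av --partial /hana/backup/ backupadm@secondary-site:/hana/backup/
```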

13. SAP HANA 1.0 SPS05, Revision 46 or higher.

Figure 7-31 shows the concept of using backup and restore as a basic DR
solution.

Figure 7-31 Scale-out installation using backup and restore as a DR solution


Chapter 8. SAP HANA operations


This chapter describes the operational aspects of running an SAP HANA system.
This chapter covers the following topics:
- Installation services
- IBM SAP HANA Operations Guide
- Interoperability with other platforms
- Monitoring SAP HANA
- Installing additional agents
- Software and firmware levels
- Support process

Copyright IBM Corp. 2013, 2014. All rights reserved.


8.1 Installation services


The IBM Systems Solution for SAP HANA comes with the complete software
stack, including the operating system, General Parallel File System (GPFS), and
the SAP HANA software. Because of the nature of the software stack, and
dependencies on how the IBM Systems Solution for SAP HANA is used at the
client location, the software stack cannot be preinstalled completely at
manufacturing. Therefore, installation services are required. Installation services
for the IBM Systems Solution for SAP HANA typically include the following ones:
- Performing an inventory and validating the delivered system configuration
- Verifying and updating the hardware to the latest level of basic input/output system (BIOS), firmware, device drivers, and OS patches as required
- Verifying and configuring the Redundant Array of Independent Disks (RAID) configuration
- Finishing the software preinstallation according to the client environment
- Configuring and verifying network settings and operation
- Performing system validation
- Providing onsite skills transfer (when required) on the solution and preferred practices, and delivering postinstallation documentation
To ensure the correct operation of the appliance, installation services for the
IBM Systems Solution for SAP HANA must be performed by trained personnel,
who are available from IBM STG Lab Services, IBM Global Technology
Services, or IBM Business Partners, depending on your geography.

8.2 IBM SAP HANA Operations Guide


The IBM SAP HANA Operations Guide is an extensive guide describing the
operations of an IBM Systems Solution for SAP HANA appliance. It covers the
following topics:
- Cluster operations
  - Actions to take after a server node failure, such as recovering the GPFS file system, removing the SAP HANA node from the cluster, and installing a replacement node.
  - Recovering from a temporary node failure by bringing GPFS on that node back to a fully operational state, and restarting SAP HANA again on the node.
  - Adding a cluster node by integrating it into the private networks of the appliance, and into the GPFS and SAP HANA clusters.
  - Reinstalling the SAP HANA software on a node.
- Disaster recovery cluster operations
  - Common operations deviating from the procedures on a normal installation, such as system shutdown and start.
  - Planned failover and failback procedures.
  - Site failover procedures after a site failure for various scenarios.
  - How to deal with node failures, disk failures, and network failures.
  - How to operate non-production instances running on the secondary site in a disaster recovery scenario.
- Drive operations
  - Checking the drive configuration and the health of the drives.
  - Replacing failed hard disk drives (HDDs), solid-state drives (SSDs), and IBM High input/output operations per second (IOPS) devices, and reintegrating them into GPFS.
  - Driver and firmware upgrades for the IBM High IOPS devices.
- System health checks
  - How to obtain GPFS cluster status and configuration, file system status and configuration, disk status and usage, quotas, SAP HANA application status, and network information from the switches.
- Software updates
  - Checklists for what to do to update the Linux kernel, the drivers for the IBM High IOPS devices, and GPFS, including instructions on how to do a rolling upgrade where applicable.
- References to related documentation, pointing to important documentation from IBM, SAP, and SUSE.
The IBM SAP HANA Operations Guide is being optimized and extended
continuously based on new developments and client feedback. The latest version
of this document can be downloaded as SAP Note 1650046.1

1. SAP Notes can be accessed at http://service.sap.com/notes. An SAP S-user ID is required.


8.3 Interoperability with other platforms


To access the SAP HANA database from a system (SAP or non-SAP), the SAP
HANA database client must be available for the platform on which the system is
running. Platform availability of the SAP HANA database client is documented in
the Product Availability Matrix (PAM) for SAP HANA, which is available online at
the following website (search for HANA):
http://service.sap.com/pam
At the time of writing, the SAP HANA database client is available on all major
platforms, including but not limited to the following ones:
- Microsoft Windows Server 2008 and Windows Server 2008 R2
- Microsoft Windows Vista and Windows 7 (both 32-bit and 64-bit)
- SUSE Linux Enterprise Server 11 on 32-bit and 64-bit x86 platforms, and IBM System z
- Red Hat Enterprise Linux 5 and 6, on 64-bit x86 platforms
- IBM AIX V5.2, V5.3, V6.1, and V7.1 on the IBM POWER platform
- IBM i V7R1 on the IBM POWER platform
- HP-UX 11.31 on Itanium
- Oracle Solaris 10 and 11, on x86 and SPARC
For up-to-date and detailed availability information, see the PAM.
If there is no SAP HANA database client available for a certain platform,
SAP HANA can still be used in a scenario with replication by using a dedicated
SAP Landscape Transformation server (for SAP Business Suite sources) or an
SAP BusinessObjects Data Services server running on a platform for which the
SAP HANA database client is available. This way, data can be replicated into
SAP HANA, which then can be used for reporting or analytic purposes, using a
front end supporting SAP HANA as a data source.

8.4 Monitoring SAP HANA


In a productive environment, administration and monitoring of an SAP HANA
appliance play an important role.


The SAP tool for administration and monitoring of the SAP HANA appliance
is the SAP HANA Studio. It allows you to monitor the overall system state:
- General system information (such as software versions).
- A warning section shows the latest warnings that are generated by the statistics server. Detailed information about these warnings is available as a tooltip.
- Bar views provide an overview of important system resources. The amount of available memory, CPUs, and storage space is displayed, in addition to the used amount of these resources.

In a distributed landscape, the amount of available resources is aggregated over
all servers.
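Much of the same information can also be queried from the SAP HANA system views through hdbsql, which is convenient for feeding existing monitoring scripts. The instance number and user below are examples:

```shell
# Service status of all services on all hosts of the instance.
hdbsql -i 00 -u SYSTEM -p "$HANA_PASSWORD" \
    "SELECT HOST, SERVICE_NAME, ACTIVE_STATUS FROM SYS.M_SERVICES"

# Physical memory utilization per host.
hdbsql -i 00 -u SYSTEM -p "$HANA_PASSWORD" \
    "SELECT HOST, USED_PHYSICAL_MEMORY, FREE_PHYSICAL_MEMORY
       FROM SYS.M_HOST_RESOURCE_UTILIZATION"
```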
Note: For more information about the administration and monitoring of SAP
HANA, see the SAP HANA Administration Guide, found at the following
website:
http://help.sap.com/hana_platform

8.5 Installing additional agents


Many organizations have processes and supporting software in place to monitor,
back up, or otherwise interact with their servers. Because SAP HANA is delivered
in an appliance-like model, there are restrictions with regard to additional
software, for example, monitoring agents, being installed on the appliance.
SAP permits the installation and operation of external software if the
prerequisites stated in SAP Note 1730928 are met.
Only the software that is installed by the hardware partner is recommended on
the SAP HANA appliance. For the IBM Systems Solution for SAP HANA, IBM
defines three categories of agents:
- Supported: IBM provides a solution covering the respective areas; no validation by SAP is required.
- Tolerated: Solutions provided by a third party that are allowed to be used on the IBM Workload Optimized Solution for SAP HANA. It is the client's responsibility to obtain support for such solutions. Such solutions are not validated by IBM and SAP. If issues with such solutions occur and cannot be resolved, the usage of such solutions might be prohibited in the future.
- Prohibited: Solutions that must not be used on the IBM Systems Solution for SAP HANA. Using these solutions might compromise the performance, stability, or data integrity of SAP HANA.

Do not install additional software on the SAP HANA appliance that is classified
as prohibited for use on the SAP HANA appliance. As an example, initial tests
show that some agents can decrease performance or even possibly corrupt the
SAP HANA database (for example, virus scanners).
In general, all additionally installed software must be configured to not interfere
with the functions or performance of the SAP HANA appliance. If any issues with
the SAP HANA appliance occur, you might be asked by SAP to remove all
additional software and to reproduce the issue.
The list of agents that are supported, tolerated, or prohibited for use on the SAP
HANA appliance are published in the Quick Start Guide for the IBM Systems
Solution for SAP HANA appliance, which is available at this website:
http://www-947.ibm.com/support/entry/myportal/docdisplay?lndocid=MIGR-5087035

8.6 Software and firmware levels


The IBM Systems Solution for SAP HANA appliance contains several different
components that might need to be upgraded (or downgraded) depending on
different support organizations' recommendations. These components can be
split up into four general categories:
- Firmware
- Operating system
- Hardware drivers
- Software

The IBM System x SAP HANA support team reserves the right to perform basic
system tests on these levels when they are deemed to have a direct impact on
the SAP HANA appliance. In general, IBM does not give specific
recommendations about which levels are allowed for the SAP HANA appliance.
The IBM System x SAP HANA development team provides, at regular intervals,
new images for the SAP HANA appliance. Because these images have
dependencies regarding the hardware, operating system, and drivers, use the
latest image for maintenance and installation of SAP HANA systems. These
images can be obtained through IBM Support. Part number information is
contained in the Quick Start Guide.


If firmware level recommendations for the IBM components of the SAP HANA
appliance that fix known code bugs are given through the individual System x
support teams, it is the client's responsibility to upgrade or downgrade to the
recommended levels as instructed by IBM Support.

If operating system recommendations for the SUSE Linux components of the
SAP HANA appliance that fix known code bugs are given through the SAP,
SUSE, or IBM support teams, it is the client's responsibility to upgrade or
downgrade to the recommended levels, as instructed by SAP through an explicit
SAP Note or allowed through an OSS Customer Message. SAP describes their
operational concept, including updating of the operating system components, in
SAP Note 1599888 - SAP HANA: Operational Concept. If the Linux kernel is
updated, take extra care to recompile the IBM High IOPS drivers and IBM GPFS
software as well, as described in the IBM SAP HANA Operations Guide.

If an IBM High IOPS driver or IBM GPFS recommendation to update the software
that fixes known code bugs is given through the individual IBM Support teams
(System x, Linux, or GPFS), it is not recommended to update these drivers without
first asking the IBM System x SAP HANA support team through an SAP OSS
Customer Message.

If other hardware or software recommendations for IBM components of the
SAP HANA appliance that fix known code bugs are given through the individual
IBM Support teams, it is the client's responsibility to upgrade or downgrade to the
recommended levels as instructed by IBM Support.

8.7 Support process


The deployment of SAP HANA as an integrated solution, combining software and
hardware from both IBM and SAP, also is reflected in the support process for the
IBM Systems Solution for SAP HANA.
All SAP HANA models that are offered by IBM include either SUSE Linux
Enterprise Server (SLES) for SAP Applications with SUSE 3-year priority support
or Red Hat Enterprise Linux for SAP HANA with 3-year support, and IBM GPFS
with 3-year support. The hardware comes with a 3-year limited warranty,2
including customer-replaceable unit (CRU) and onsite support.3

2. For more information about the IBM Statement of Limited Warranty, see http://www.ibm.com/servers/support/machine_warranties
3. IBM sends a technician after attempting to diagnose and resolve the problem remotely.

8.7.1 IBM and SAP integrated support


SAP integrates the support process with SUSE and IBM as part of the HANA
appliance solution-level support. If you encounter software problems on your
SAP HANA system, access the SAP Online Service System (SAP OSS) website:
https://service.sap.com
When you reach the website, create a service request ticket by using a
subcomponent of BC-HAN or BC-DB-HDB as the problem component. IBM
support works closely with SAP and SUSE and is dedicated to supporting SAP
HANA software and hardware issues.
Send all questions and requests for support to SAP by using their OSS
messaging system. A dedicated IBM representative is available at SAP to work
on this solution. Even if it is clearly a hardware problem, an SAP OSS message
should be opened to provide the best direct support for the IBM Systems
Solution for SAP HANA.
When opening an SAP support message, use the text template that is provided
in the Quick Start Guide when it is obvious that you have a hardware problem.
This procedure expedites all hardware-related problems within the SAP support
organization. Otherwise, the SAP support teams gladly help you with the
questions regarding the SAP HANA appliance in general.
IBM provides a script to get an overview of the current system status and the
configuration of the running system. The saphana-check-ibm.sh script is
preinstalled in the /opt/ibm/saphana/bin directory. The most recent version can
be found in SAP Note 1661146.
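As a sketch, the output of that health check can be captured to a log file for attachment to the OSS message. Only the script path /opt/ibm/saphana/bin/saphana-check-ibm.sh comes from the appliance; the log file location and naming below are arbitrary examples.

```shell
# Sketch: run the preinstalled IBM check script and save its output so it can
# be attached to an SAP OSS customer message. On systems without the script
# (anything that is not an appliance), the function only prints a hint.
collect_hana_check() {
    check=/opt/ibm/saphana/bin/saphana-check-ibm.sh
    log="/tmp/saphana-check-$(hostname)-$(date +%Y%m%d-%H%M%S).log"
    if [ -x "$check" ]; then
        "$check" >"$log" 2>&1
        echo "Check output saved to $log"
    else
        echo "saphana-check-ibm.sh not installed; see SAP Note 1661146"
    fi
}

collect_hana_check
```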
Before you contact support, ensure that you take these steps to try to solve the
problem yourself:
- Check all cables to ensure that they are connected.
- Check the power switches to ensure that the system and any optional devices
  are turned on.
- Use the troubleshooting information in your system documentation, and use
  the diagnostic tools that come with your system. Information about diagnostic
  tools is available in the Problem Determination and Service Guide on the
  IBM Documentation CD that comes with your system.


- Go to the following IBM support website to check for technical information,
  hints, tips, and new device drivers, or to submit a request for information:
  http://www.ibm.com/supportportal
- For SAP HANA software-related issues, you can search the SAP OSS
  website for problem resolutions. The OSS website has a knowledge database
  of known issues and can be accessed at the following address:
  https://service.sap.com/notes
- The main SAP HANA information source is available at the following site:
  https://help.sap.com/hana_platform
- If you have a specific operating system question or issue regarding SUSE
  Linux Enterprise Server for SAP Applications, contact SUSE through the
  SUSE website:
  http://www.suse.com/products/prioritysupportsap
  Media is available for download at the following website:
  http://download.novell.com/index.jsp?search=Search&families=2658&keywords=SAP
  Note: Registration is required before you can download software packages
  from the SUSE website.

8.7.2 IBM SAP International Competence Center InfoService


The IBM SAP International Competence Center (ISICC) InfoService is the key
support function of the IBM and SAP alliance. It serves as a single point of entry
for all SAP-related questions for clients using IBM systems and solutions with
SAP applications. As a managed question and answer service, it has access to a
worldwide network of experts on technology topics about IBM products in SAP
environments. You can contact the ISICC InfoService by writing to the following
email address:
infoservice@de.ibm.com
Note: The ISICC InfoService does not provide product support. If you need
product support for the IBM Systems Solution for SAP HANA, see 8.7.1, IBM
and SAP integrated support on page 252. If you need support for other IBM
products, consult the product documentation on how to get support.


Appendix A. Additional topics

This appendix covers the following topics:
- GPFS license information
- File-based backup with IBM Tivoli Storage Manager for ERP

Copyright IBM Corp. 2013, 2014. All rights reserved.

255

A.1 GPFS license information


The models of the IBM Systems Solution for SAP HANA come with GPFS
licenses, including three years of Software Subscription and Support. Software
Subscription and Support contracts, including Subscription and Support
renewals, are managed through IBM Passport Advantage or Passport
Advantage Express.
There are four different types of GPFS licenses:
- The GPFS on x86 Single Server for Integrated Offerings license provides file
  system capabilities for single-node integrated offerings. This GPFS license
  does not cover usage in multi-node environments, such as the scale-out
  solution described here. To use the building blocks that come with the GPFS
  on x86 Single Server for Integrated Offerings licenses for a scale-out solution,
  GPFS on x86 Server licenses or GPFS File Placement Optimizer licenses
  must be obtained for these building blocks.
- The GPFS Server license permits the licensed node to perform GPFS
  management functions, such as cluster configuration manager, quorum node,
  manager node, and network shared disk (NSD) server. In addition, the GPFS
  Server license permits the licensed node to share GPFS data directly through
  any application, service, protocol, or method, such as Network File System
  (NFS), Common Internet File System (CIFS), File Transfer Protocol (FTP), or
  Hypertext Transfer Protocol (HTTP).
- The GPFS File Placement Optimizer license permits the licensed node to
  perform NSD server functions for sharing GPFS data with other nodes that
  have a GPFS File Placement Optimizer or GPFS Server license. This license
  cannot be used to share data with nodes that have a GPFS Client license or
  with non-GPFS nodes.
- The GPFS Client license permits the exchange of data between nodes that
  locally mount the same file system (for example, through shared storage).
  No other export of the data is permitted. The GPFS Client license cannot be
  used for nodes to share GPFS data directly through any application, service,
  protocol, or method, such as NFS, CIFS, FTP, or HTTP. For these functions, a
  GPFS Server license is required. Because of the architecture of the IBM
  Systems Solution for SAP HANA (not having a shared storage system), this
  type of license cannot be used for the IBM solution.
Table A-1 on page 257 lists the types of GPFS licenses and the processor value
units (PVUs) that are included for each of the models.

256

In-memory Computing with SAP HANA on IBM eX5 and X6 Systems

Table A-1 GPFS licenses that are included in the custom models for SAP HANA

MTM        Type of GPFS license included                        PVUs included
7147-H1x   GPFS on x86 Server                                   1400
7147-H2x   GPFS on x86 Server                                   1400
7147-H3x   GPFS on x86 Server                                   1400
7147-H7x   GPFS on x86 Server                                   1400
7147-H8x   GPFS on x86 Server                                   1400
7147-H9x   GPFS on x86 Server                                   1400
7147-HAx   GPFS on x86 Single Server for Integrated Offerings   1400
7147-HBx   GPFS on x86 Single Server for Integrated Offerings   1400
7143-H1x   GPFS on x86 Server                                   1400
7143-H2x   GPFS on x86 Server                                   4000
7143-H3x   GPFS on x86 Server                                   5600
7143-H4x   GPFS on x86 Server                                   1400
7143-H5x   GPFS on x86 Server                                   4000
7143-HAx   GPFS on x86 Single Server for Integrated Offerings   4000
7143-HBx   GPFS on x86 Single Server for Integrated Offerings   4000
7143-HCx   GPFS on x86 Single Server for Integrated Offerings   5600
Licenses for IBM GPFS on x86 Single Server for Integrated Offerings V3
cannot be ordered independently of the hardware with which they are included.
This type of license provides file system capabilities for single-node integrated
offerings. The model 7143-HAx includes 4000 PVUs of GPFS on x86 Single
Server for Integrated Offerings V3 licenses, so that an upgrade to the 7143-HBx
model does not require additional licenses. The PVU rating of the 7143-HAx
model to consider when purchasing other GPFS license types is 1400 PVUs.
Clients with highly available, multi-node clustered scale-out configurations must
purchase the GPFS on x86 Server and GPFS File Placement Optimizer product.


A.2 File-based backup with IBM Tivoli Storage Manager for ERP
IBM Tivoli Storage Manager for ERP is a simple, scalable data protection solution
for SAP HANA and SAP ERP. Tivoli Storage Manager for ERP V6.4 includes a
one-step command that automates file-based SAP HANA backup and Tivoli
Storage Manager data protection.
Tivoli Storage Manager clients running SAP HANA appliances can back up their
instances using their existing Tivoli Storage Manager backup environment, even
if the level of the SAP HANA code does not allow use of the Backint interface for
SAP HANA, and only a file-based backup tool integration can be used. Tivoli
Storage Manager for ERP Data Protection for SAP HANA V6.4 provides such
file-based backup and restore functions for SAP HANA.

A.2.1 Setting up Data Protection for SAP HANA


Data Protection for SAP HANA comes with a setup.sh command, which is a
configuration tool that prepares the Tivoli Storage Manager for ERP configuration
file, creates the SAP HANA backup user, and sets all necessary environment
variables for the SAP HANA administration user. The setup.sh command guides
you through the configuration process. Data Protection for SAP HANA stores a
backup user and its password in the SAP HANA keystore called hdbuserstore to
enable unattended operation of a backup.
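The keystore entry that setup.sh creates can also be inspected or re-created manually with SAP HANA's hdbuserstore tool. A minimal sketch follows; the key name, host:port, user, and password are illustrative examples, not values mandated by Data Protection for SAP HANA, and setup.sh normally performs this step for you.

```shell
# Sketch: create and verify a backup user entry in the SAP HANA secure user
# store (hdbuserstore). Key name, host:port, user, and password are example
# values only. Run as the <sid>adm user on the appliance.
create_backup_store_entry() {
    if command -v hdbuserstore >/dev/null 2>&1; then
        # Syntax: hdbuserstore SET <key> <host>:<port> <user> <password>
        hdbuserstore SET BACKUP localhost:30015 backup_operator Example#Pw1
        hdbuserstore LIST BACKUP   # shows the entry; the password is not displayed
    else
        echo "hdbuserstore not found; run on a SAP HANA system as <sid>adm"
    fi
}

create_backup_store_entry
```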

A.2.2 Backing up the SAP HANA database


SAP HANA writes its backup (logs and data) to files at pre-configured directories.
The Data Protection for SAP HANA command, backup.sh, reads the
configuration files to retrieve these directories (if the default configuration is not
used). Upon backup, the files that are created in these directories are moved to
the running Tivoli Storage Manager instance and are deleted afterward from
these directories (except for the HANA configuration files).


Figure A-1 shows this backup process.

Figure A-1 Backup process with Data Protection for SAP HANA using local storage for backup files (the
original diagram shows nodes node01 to node03 with DB partitions 1 to 3 writing their backup files, through
the shared GPFS file system, to local storage, from where backup.sh moves them to the Tivoli Storage
Manager server)

The backup process has the following steps:


1. The backup.sh command triggers a log or data backup of the SAP HANA
database.
2. The SAP HANA database performs a synchronized backup on all nodes.
3. The SAP HANA database writes a backup file on each node.
4. The backup.sh command collects the file names of the backup files.
5. The backup files are moved to Tivoli Storage Manager (and deleted on the
nodes).


Instead of having the backup files of the individual nodes written to the local
storage of the nodes, an external storage system can be used to provide space
to store the backup files. All nodes must be able to access this storage, for
example, using NFS. Figure A-2 shows this scenario.

Figure A-2 Backup process with Data Protection for SAP HANA using external storage for backup files (the
original diagram matches Figure A-1, except that the nodes write their backup files to an external SAP HANA
backup file storage, accessible by all nodes, instead of to local storage)

Running log and data backups requires the Data Protection for SAP HANA
backup.sh command to be run as the SAP HANA administration user
(<sid>adm).
The backup.sh command provides two basic functions:
1. Complete data backup (including HANA instance and landscape
configuration files).
2. Complete log backup, removing successfully saved redo log files from
disk.

260

In-memory Computing with SAP HANA on IBM eX5 and X6 Systems

The functions can be selected by using command-line arguments to schedule
the backup script with a given parameter:

backup.sh --data    Performs a complete data and configuration file backup.
backup.sh --logs    Performs a complete log backup followed by a LOG RECLAIM.

By using these commands, a backup of the SAP HANA database into Tivoli
Storage Manager can be fully automated.

A.2.3 Restoring the SAP HANA database


The SAP HANA database requires the backup files to be restored to start a
recovery process by using the SAP HANA Studio. For SAP HANA database
revisions 30 and higher, Data Protection for SAP HANA provides a restore.sh
command that moves all required files back to the file system location
automatically, so that the user does not need to search for these files manually.
For earlier revisions of the SAP HANA database, this task must be done
manually by using the Tivoli Storage Manager BACKUP-Filemanager. The SAP
HANA database expects the backup files to be restored to the same directory to
which they were written during the backup. The recovery itself can then be
triggered by using SAP HANA Studio.


To restore data backups, including SAP HANA configuration files and log file
backups, the Tivoli Storage Manager BACKUP-Filemanager is used. Figure A-3
shows a sample panel of the BACKUP-Filemanager.
BACKUP-Filemanager V6.4.0.0, Copyright IBM 2001-2012
.------------------+---------------------------------------------------------------.
| Backup IDs
| Files stored under TSM___A0H7K1C4QI
|
|------------------+---------------------------------------------------------------|
| TSM___A0H7KM0XF4 | */hana/log_backup/log_backup_2_0_1083027170688_1083043933760 |
| TSM___A0H7KLYP3Z | */hana/log_backup/log_backup_2_0_1083043933760_1083060697664 |
| TSM___A0H7KHNLU6 | */hana/log_backup/log_backup_2_0_1083060697664_1083077461376 |
| TSM___A0H7KE6V19 | */hana/log_backup/log_backup_2_0_1083077461376_1083094223936 |
| TSM___A0H7K9KR7F | */hana/log_backup/log_backup_2_0_1083094223936_1083110986880 |
| TSM___A0H7K7L73W | */hana/log_backup/log_backup_2_0_1083110986880_1083127750848 |
| TSM___A0H7K720A4 | */hana/log_backup/log_backup_2_0_1083127750848_1083144513792 |
| TSM___A0H7K4BDXV | */hana/log_backup/log_backup_2_0_1083144513792_1083161277760 |
| TSM___A0H7K472YC | */hana/log_backup/log_backup_2_0_1083161277760_1083178040064 |
| TSM___A0H7K466HK | */hana/log_backup/log_backup_2_0_1083178040064_1083194806336 |
| TSM___A0H7K1C4QI | */hana/log_backup/log_backup_2_0_1083194806336_1083211570688 |
| TSM___A0H7JX1S77 | */hana/log_backup/log_backup_2_0_1083211570688_1083228345728 |
| TSM___A0H7JSRG2B | */hana/log_backup/log_backup_2_0_1083228345728_1083245109824 |
| TSM___A0H7JOH1ZP | */hana/log_backup/log_backup_2_0_1083245109824_1083261872960 |
| TSM___A0H7JK6ONC | */hana/log_backup/log_backup_2_0_1083261872960_1083278636608 |
| TSM___A0H7JJWUI8 | */hana/log_backup/log_backup_2_0_1083278636608_1083295400384 |
| TSM___A0H7JJU5YN | */hana/log_backup/log_backup_2_0_1083295400384_1083312166016 |
| TSM___A0H7JFWAV4 | */hana/log_backup/log_backup_2_0_1083312166016_1083328934016 |
| TSM___A0H7JBG625 | */hana/log_backup/log_backup_2_0_1083328934016_1083345705856 |
| TSM___A0H7JBAASN | */hana/log_backup/log_backup_2_0_1083345705856_1083362476352 |
| TSM___A0H7J7BLDK | */hana/log_backup/log_backup_2_0_1083362476352_1083379244416 |
| TSM___A0H7J5U8S7 | */hana/log_backup/log_backup_2_0_1083379244416_1083396008064 |
| TSM___A0H7J5T92O | */hana/log_backup/log_backup_2_0_1083396008064_1083412772928 |
| TSM___A0H7J4TWPG | */hana/log_backup/log_backup_2_0_1083412772928_1083429538688 |
|
| */hana/log_backup/log_backup_2_0_1083429538688_1083446303424 |
|
| */hana/log_backup/log_backup_2_0_1083446303424_1083463079488 |
|
| */hana/log_backup/log_backup_2_0_1083463079488_1083479846528 V
|------------------+---------------------------------------------------------------|
| 24 BIDs
| 190 File(s) - 190 marked
|
`------------------+---------------------------------------------------------------'
TAB change windows  F2 Restore  F3 Mark all  F4 Unmark all  F5 reFresh
F6 fileInfo  F7 redireCt  F8 Delete  F10 eXit  ENTER mark file
Figure A-3 The BACKUP-Filemanager interface

Data and log backups can be selected and then restored to the desired location.
If no directory is specified for the restore, the BACKUP-Filemanager restores the
backups to the original location from which the backup was taken.


After the backup files are restored, the recovery process must be started by
using SAP HANA Studio. More information about this process and the various
recovery options is available in the SAP HANA Backup and Recovery Guide,
found at the following website:
http://help.sap.com/hana_platform
After the recovery process completes successfully, the backup files must be
removed manually from disk if they are no longer needed.
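A minimal sketch of that manual cleanup lists candidate files before anything is deleted. The directory name is an example derived from the log backup paths shown in Figure A-3; nothing in this sketch removes files.

```shell
# Sketch: list restored backup files so they can be reviewed and removed
# manually after a successful recovery. The default directory is an example;
# this function only lists files and never deletes anything.
list_restored_backup_files() {
    dir=${1:-/hana/log_backup}
    found=$(find "$dir" -type f -name 'log_backup_*' 2>/dev/null)
    if [ -n "$found" ]; then
        printf '%s\n' "$found"
    else
        echo "No restored backup files found in $dir"
    fi
    echo "Review the list before removing anything with rm."
}

list_restored_backup_files
```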


Abbreviations and acronyms

ABAP     Advanced Business Application Programming
ACID     atomicity, consistency, isolation, durability
APO      Advanced Planner and Optimizer
BI       Business Intelligence
BICS     BI Consumer Services
BM       bridge module
BW       Business Warehouse
CD       compact disc
CPU      central processing unit
CRC      cyclic redundancy checking
CRM      customer relationship management
CRU      customer-replaceable unit
DB       database
DEV      development
DIMM     dual inline memory module
DR       disaster recovery
DSO      DataStore Object
DXC      Direct Extractor Connection
ECC      ERP Central Component
ECC      error checking and correcting
ERP      enterprise resource planning
ETL      extract, transform, and load
FTSS     Field Technical Sales Support
GB       gigabyte
GBS      IBM Global Business Services
GPFS     General Parallel File System
GTS      Global Technology Services
HA       high availability
HDD      hard disk drive
HPI      Hasso Plattner Institute
I/O      input/output
IBM      International Business Machines
ID       identifier
IDs      identifiers
IMM      integrated management module
IOPS     I/O operations per second
ISICC    IBM SAP International Competence Center
ITSO     International Technical Support Organization
JDBC     Java Database Connectivity
JRE      Java Runtime Environment
KPIs     key performance indicators
LM       landscape management
LUW      logical unit of work
MB       megabyte
MCA      Machine Check Architecture
MCOD     Multiple Components in One Database
MCOS     Multiple Components on One System
MDX      Multidimensional Expressions
NOS      Notes object services
NSD      Network Shared Disk
NUMA     non-uniform memory access
ODBC     Open Database Connectivity
ODBO     OLE DB for OLAP
OLAP     online analytical processing
OLTP     online transaction processing
OS       operating system
OSS      Online Service System
PAM      Product Availability Matrix
PC       personal computer
PCI      Peripheral Component Interconnect
POC      proof of concept
PSA      Persistent Staging Area
PVU      processor value unit
QA       quality assurance
QPI      QuickPath Interconnect
RAID     Redundant Array of Independent Disks
RAM      random access memory
RAS      reliability, availability, and serviceability
RDS      Rapid Deployment Solution
RHEL     Red Hat Enterprise Linux
RPM      revolutions per minute
RPO      Recovery Point Objective
RTO      Recovery Time Objective
SAN      storage area network
SAPS     SAP Application Benchmark Performance Standard
SAS      serial-attached SCSI
SATA     Serial ATA
SCM      supply chain management
SCM      software configuration management
SD       Sales and Distribution
SDRAM    synchronous dynamic random access memory
SLD      System Landscape Directory
SLES     SUSE Linux Enterprise Server
SLO      System Landscape Optimization
SMI      scalable memory interconnect
SQL      Structured Query Language
SSD      solid-state drive
SSR      SAP HANA System Replication
STG      Systems and Technology Group
SUM      Software Update Manager
TB       terabyte
TCO      total cost of ownership
TCP/IP   Transmission Control Protocol/Internet Protocol
TDMS     Test Data Migration Server
TREX     Text Retrieval and Information Extraction
UEFI     Unified Extensible Firmware Interface

Related publications
The publications that are listed in this section are considered suitable for a more
detailed description of the topics that are covered in this book.

IBM Redbooks
The following IBM Redbooks publications provide additional information about
the topics in this document. Some publications that are referenced in this list
might be available in softcopy only:
- The Benefits of Running SAP Solutions on IBM eX5 Systems, REDP-4234
- IBM eX5 Portfolio Overview: IBM System x3850 X5, x3950 X5, x3690 X5, and
  BladeCenter HX5, REDP-4650
- Implementing the IBM General Parallel File System (GPFS) in a Cross
  Platform Environment, SG24-7844
You can search for, view, download, or order these documents and other
Redbooks, Redpapers, Web Docs, drafts, and additional materials, at the
following website:
ibm.com/redbooks

Online resources
These websites are also relevant as further information sources:
- IBM and SAP: Business Warehouse Accelerator
  http://www.ibm-sap.com/bwa
- IBM Systems and Services for SAP HANA
  http://www.ibm-sap.com/hana
- IBM Systems Solution for SAP HANA
  http://www.ibm.com/systems/x/solutions/sap/hana
- SAP In-Memory Computing - SAP Help Portal
  http://help.sap.com/hana


Help from IBM


IBM Support and downloads
ibm.com/support
IBM Global Services
ibm.com/services


Back cover

In-memory Computing with SAP HANA on IBM eX5 and X6 Systems

IBM System x solution for SAP HANA
SAP HANA overview and use cases
Operational aspects for SAP HANA appliances

This third edition of this IBM Redbooks publication describes in-memory
computing appliances from IBM and SAP that are based on IBM eX5 and X6
flagship systems and SAP HANA. It covers the basic principles of in-memory
computing, describes the IBM eX5 and X6 hardware offerings, and explains the
corresponding SAP HANA IT landscapes using these offerings.

This book also describes the architecture and components of the IBM System x
solution for SAP HANA, with IBM General Parallel File System (GPFS) as a
cornerstone. The SAP HANA operational disciplines are explained in detail:
scalability options, high availability and disaster recovery, backup and restore,
and virtualization possibilities for SAP HANA appliances.

This book is intended for SAP administrators and technical solution architects.
It is also for IBM Business Partners and IBM employees who want to know more
about the SAP HANA offering and other available IBM solutions for SAP clients.

IBM Redbooks are developed by the IBM International Technical Support
Organization. Experts from IBM, Customers and Partners from around the world
create timely technical information based on realistic scenarios. Specific
recommendations are provided to help you implement IT solutions more
effectively in your environment.

For more information:
ibm.com/redbooks

SG24-8086-02
ISBN 0738439908