Building and Scaling SAP Business Information Warehouse on DB2 UDB ESE
Configuration examples
to start with and grow
Chuck Ballard
Andreas Christian
Theodor Kuebler
Rodrigo Schiavon
ibm.com/redbooks
International Technical Support Organization
June 2004
SG24-7094-00
Note: Before using this information and the product it supports, read the information in
“Notices” on page ix.
This edition applies to Version 3.5 of SAP Business Information Warehouse, and Version 8.1 of
DB2 Universal Database Enterprise Server Edition.
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
The team that wrote this redbook. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Become a published author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Management summary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Structure of the redbook. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Chapter 8. SAP BW performance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
8.1 The approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
8.2 Health check . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
8.2.1 SAP Notes related to performance and system configuration . . . . 250
8.2.2 How to keep DB2 database statistics current with SAP BW . . . . . . 254
8.2.3 How to control DB2 log space consumption with SAP BW . . . . . . . 254
8.2.4 Recommended configuration parameters . . . . . . . . . . . . . . . . . . . . 256
8.3 Performance tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
8.3.1 General SAP BW tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
8.3.2 Tuning DB2 UDB ESE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
8.4 Performance monitoring of SAP BW on DB2 UDB . . . . . . . . . . . . . . . . . 268
8.4.1 Performance bottlenecks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
8.4.2 Proactive monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
8.4.3 Problem analysis and resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area.
Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product, program, or service that
does not infringe any IBM intellectual property right may be used instead. However, it is the user's
responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document.
The furnishing of this document does not give you any license to these patents. You can send license
inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such provisions
are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES
THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer
of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may
make improvements and/or changes in the product(s) and/or the program(s) described in this publication at
any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any
manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the
materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without
incurring any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm
the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on
the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrates programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the
sample programs are written. These examples have not been thoroughly tested under all conditions. IBM,
therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy,
modify, and distribute these sample programs in any form without payment to IBM for the purposes of
developing, using, marketing, or distributing application programs conforming to IBM's application
programming interfaces.
Intel and Intel Inside (logos) are trademarks of Intel Corporation in the United States, other countries, or
both.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States,
other countries, or both.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun
Microsystems, Inc. in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Other company, product, and service names may be trademarks or service marks of others.
The primary objective of this IBM Redbook is to demonstrate and document the
process of installing, administering, tuning, and scaling an SAP BW system on
DB2 UDB ESE.
Another important goal of this book is to familiarize you with the architecture and
functionality of SAP BW and DB2 UDB ESE. In particular, we explain specific
functions of DB2 UDB ESE that are utilized by SAP BW to deliver high-end
performance. We also provide details about the SAP BW graphical front-end,
and other important functions that are developed as part of the SAP-IBM
cooperation.
Measuring the scalability factors achieved with SAP BW on DB2 was out of
scope for this project. Instead, we refer to two studies which show that SAP
BW with DB2 scales linearly, or close to linearly, for SAP BW queries and
data loads.
A high level overview of the structure and contents of the redbook is presented in
the “Introduction” on page 1.
Left to right: Chuck Ballard, Rodrigo Schiavon, Andreas Christian, Theo Kuebler.
Andreas Christian works at the SAP DB2 UDB Development / Porting Center in
Walldorf, Germany. He provides technical support, consultancy, and knowledge
transfer in the area of SAP BW/DB2 UDB for customers, consultants, IBM
partners, and IBM marketing and pre-sales. He started working for IBM in 1999
as a member of the IBM SAP BW/DB2 UDB porting team. In addition to his
Theodor H. Kuebler is a consultant and trainer for SAP and Data Management
products. He is the founder of Logos Informations Syteme (LIS) AG in
Switzerland (www.lis-ag.ch), an IBM enabled Education Center for IBM Software
(ECIS) and an IBM Business Partner, providing classroom and onsite courses in
Data Management, Tivoli®, and WebSphere®. He has 12 years of experience in the
implementation and tuning of SAP products on database management systems,
such as DB2 UDB, Adabas-D, and Oracle, and on different platforms. He
teaches SAP Basis courses for z/OS and DB2 worldwide.
Additional contributors
Assistance and cooperation were also received from other contributors and
organizations, and we recognize them in this section:
Martin Mezger, IBM DB2 Project at SAP, SAP Global Support, Technology
Infrastructures and Enablement, Boeblingen, Germany
Dirk Nakott, DM Development, SAP DB2 UDB Development / Porting Center,
Boeblingen, Germany
Steve Rees, Manager, DB2 Performance and Advanced Technology, Markham,
ON, Canada
Jens Seifert, DM Development, SAP DB2 UDB Development / Porting Center,
Boeblingen, Germany
Volker Susok, SAP Technology Alliance Manager, Boeblingen, Germany
Robin Van Boeschoten, DB2 Performance and Advanced Technology,
Markham, ON, Canada
Siegfried Wurst, DM Development, SAP DB2 UDB Development / Porting
Center, Boeblingen, Germany
Your efforts will help increase product acceptance and customer satisfaction. As
a bonus, you'll develop a network of contacts in IBM development labs, and
increase your productivity and marketability.
Find out more about the residency program, browse the residency index, and
apply online at:
ibm.com/redbooks/residencies.html
Introduction
This redbook demonstrates the capabilities of DB2 UDB ESE to meet SAP BW
requirements. Because typical SAP BW systems have fast growing data volumes
and an increasing number of users, scalability will prove to be a critical decision
factor in the selection of a database management system on which to implement
SAP BW.
IBM DB2 Universal Database Enterprise Server Edition is a parallel and scalable
database management system. It is capable of processing very large data sets,
while delivering high performance. This is enabled by several features that are
specifically designed for data warehouse applications, and the classes of
parallelism that are supported. SAP BW provides a convenient graphical
front-end to administer and monitor DB2. Because of the tight integration of the
two products, and due to DB2 features that simplify the administration and
increase database availability, SAP BW with DB2 is robust and easy to use.
Therefore DB2 meets one of the most important SAP BW requirements:
scalability at the database server level. This requirement is important because,
typically, an SAP BW database constantly grows over time, has more and more
source systems attached, and has an ever increasing number of users. Keeping
response times stable in this environment requires a scalable database system.
SAP BW with DB2 scales linearly, or close to linearly, for SAP BW queries and
data loads. These excellent scalability factors are not self-evident. They result
from the advantages of the DB2 shared-nothing architecture and its approach to
parallel processing.
With this redbook, we show the potential and value of building your SAP BW on
DB2 UDB ESE. The performance and scalability of the combination of DB2 and
SAP BW provides a powerful business intelligence solution that enables you to
start small and grow to a very large environment, while retaining investments in
your existing hardware.
In addition to being the first Linux-based SAP BW 3.5 Standard Application
Benchmark result, it is also the leading 8-way result (query phase), the leading
Intel®-based result (query phase), and the leading SAP BW 3.x result (query and
load phase 2).
DB2 and the xSeries 455 achieved measured throughput of 98,424 navigation
steps per hour in the Query Phase, 1.279 seconds average dialog response time,
and an average CPU utilization of 96 percent for the database server. Total load
throughput measured 33,515,065 rows per hour for the first phase and
10,954,707 rows per hour for the second phase, yielding a total average load
throughput of 8,256,119 rows per hour.
The SAP BW Standard Application Benchmark consists of two phases that run
consecutively (as of SAP BW 3.0). Step 1 is the Load Phase, which consists of
the reading of master data from an external flat file structure and the loading of
transactional data from external flat files into the operational data store and the
InfoCube. After this, the aggregates are built. Finally, statistics and bitmapped
indices on the Fact Table and the rollup of aggregates are refreshed. The key
figure is throughput of data in rows/hour. Step 2 is the Query Phase, which
simulates the actions in an SD (Sales and Distribution) InfoCube. The key figure
is the number of dialog steps/hour.
Tip for decision makers: We recommend that you read Chapter 4 first, to
become familiar with the business advantages of using DB2 UDB ESE with
SAP BW.
Chapter 1 introduces SAP BW and positions it within the SAP product set.
We first introduce the concepts of data warehousing and then describe SAP
BW as a solution implementation of those concepts. A high-level description
of the SAP BW architecture is provided to enable a better understanding of
the components and how they interact. A sample scenario is introduced to
show a typical SAP BW environment and to explain important system
requirements.
Chapter 2 continues with the overview of SAP BW, but presents more
detailed and technical information regarding the architecture, components,
and operation of BW. This promotes a deeper understanding of SAP BW and
the benefits to be gained.
Chapter 3 provides an introduction to the DB2 Universal Database Enterprise
Server Edition (DB2 UDB ESE), which is the database management system
used for this redbook. Among other topics, we explain the architecture, the
different types of parallelism that are available, and the different options for
scaling the DB2 database system depending on the hardware used.
We also discuss important new features in DB2 UDB ESE Version 8.
Chapter 4 combines concepts and functionality from the previous chapters to
describe how DB2 and SAP BW are integrated and deployed for a robust and
scalable data warehousing solution. We discuss some of the initiatives with
IBM and SAP for enhanced integration of DB2 and SAP BW. The result is a
low TCO (total cost of ownership), high performance, ease of use, and the
best scalability to handle the rapid growth in data volumes and number of end
users.
With this overview of the redbook, it is now time to make your reading selections
and get started.
Chapter 1. SAP Business Information Warehouse (BW)
It is important to understand that the term enterprise decision maker does not
refer only to enterprise executives and strategic decision makers. When using
this term, we are referring to any enterprise employee responsible for decision
making. Data warehousing technology implies the delivery of an accurate and
consistent source of information, available to all enterprise employees for their
decision-making purposes.
In addition, the data was typically collected as needed so each source had a
different date and time of currency. That is, each data extract was accurate - but
as of a different point in time!
Therefore, a good deal of work has to take place to manipulate and merge the
data, to enable the resulting information to be accurate. Even then, it is only
accurate as of that specific point-in-time. So, it cannot be readily combined with
information obtained by other similar extract processes. With the lack of good
tools to help with this effort, each extract required a custom-built extract program.
[Figure: the mySAP solution portfolio, including mySAP SCM, mySAP HLM,
mySAP PLM, mySAP HR, mySAP Financials, and the Marketplace.]
SAP positions its products as business solutions rather than IT based technology
solutions. When clients purchase a mySAP solution they receive the necessary
IT software components to implement the solution. In addition, SAP BW is now
delivered with predefined business content to allow clients to implement the
business solution faster and with lower effort.
[Figure 1-4 Typical SAP BW scenario: a financial solution in which System FI1
(mySAP Financials) feeds System BW1 (SAP BW) through ETL; over time, further
sources such as CRM, HR, FI, and a non-SAP legacy system are connected.]
Because more and more systems are connected over time, the volume of data
that has to be loaded in the given nightly maintenance window is constantly
increasing until the final stage is implemented. At that point, a total of 4 GB of
data is loaded on average per day. The volume of data continues to grow until a
total of 7 TB is reached. After 4 years, the customer begins to archive part of their
warehouse data, which keeps the total data volume at a constant size.
Figure 1-6 shows the anticipated growth of users that are working concurrently
on the SAP BW system. All numbers were taken from an actual SAP BW
hardware sizing.
Scalability does not stand in isolation, however. It is not, for example, simply a
means of enabling growth in the ability to store larger volumes of data. With
scalability comes the need to support the loading and updating of that increasing
volume of data. In addition, there will typically be an associated growth in the
number of users as well. With additional users will come an increase in query
load, and the requirement to maintain acceptable performance levels. This is
depicted in Figure 1-7.
[Figure 1-7 Need for scalability: as system load increases over time, increasing
scalability is needed to maintain system performance.]
In the following chapters of this redbook, we will further describe how DB2 UDB
ESE provides scalability, and, in particular, show its unique ability to attach
additional resources to an existing DB2 database system.
This has been a high-level overview and discussion of SAP BW. Chapter 2, “SAP
BW technical overview” on page 21, will take the discussion to another level of
detail. This will promote understanding of the framework and architecture from a
more technical perspective. This additional information will demonstrate why
DB2 is the best DBMS for enabling scalability.
Chapter 2. SAP BW technical overview

This chapter continues the discussion started in Chapter 1, “SAP Business
Information Warehouse (BW)” on page 7. It describes the structure and
operation of SAP BW in more technical detail, providing a better understanding
of the architecture, framework, and components of SAP BW, and how they work
together to enable an enterprise data warehousing solution.
Let’s start with some basic structural elements and see how they all work
together to enable the SAP BW functional components that comprise the SAP
BW architecture.
InfoObjects also carry metadata that describes the data contained in the
InfoObject, such as its origin, history, and technical properties. An InfoObject has
three classes of metadata:
Technical metadata: This describes technical properties, such as data type
and field length.
User metadata: This carries information about authorizations.
Business definitions: These form the basis for a common understanding of
business terms, such as key performance indicators.
More information on this topic can be found in the SAP Technical Brief entitled
Data Warehousing with mySAP Business Intelligence, which can be found on the
SAP Web site.
[Figure: SAP BW architecture. Data from any source flows through the
Persistent Staging Area (PSA) and the Operational Data Store (ODS) into the
data warehouse, with its multi-dimensional models and master data, and is
exposed through information access services.]
ETL process: When data is loaded into SAP BW, it is integrated, standardized,
synchronized, and enriched. This is performed through processes known as
extraction, transformation, and loading (ETL). You have to ensure that the ETL
process captures and loads the full range of required data, while avoiding an
overload of irrelevant data. Data is often stored in different formats, with different
data types and different lengths. In addition, a data type may have the same
name but a different meaning in the different systems where it is used. All of
these data differences require resolution to enable the data from the various
sources to be combined and integrated in the data warehouse, and to provide
meaningful and accurate information. Data has to be collected and cleansed to
eliminate duplication and incorrect values. And it then has to be enriched so that
it is transformed into practical, business-descriptive information.
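The cleansing steps just described can be sketched in a few lines of Python. This is purely an illustration of the ETL pattern, not SAP BW code; the record layout, field names, and cleansing rules are invented for the example.

```python
# Illustrative ETL sketch: extract, transform (standardize, deduplicate), load.
# The record layout and cleansing rules are assumptions, not SAP BW internals.

def extract(sources):
    """Collect raw records from all source systems."""
    for source in sources:
        for record in source:
            yield record

def transform(records):
    """Standardize formats and drop duplicates and incorrect values."""
    seen = set()
    for rec in records:
        # Resolve format differences: trim and upper-case the customer key.
        key = rec["customer"].strip().upper()
        if not key or rec["revenue"] < 0:
            continue                      # eliminate incorrect values
        if key in seen:
            continue                      # eliminate duplication
        seen.add(key)
        yield {"customer": key, "revenue": round(rec["revenue"], 2)}

def load(records, warehouse):
    """Append cleansed records to the warehouse table."""
    warehouse.extend(records)

warehouse = []
crm = [{"customer": " acme ", "revenue": 100.0}]
legacy = [{"customer": "ACME", "revenue": 100.0},   # duplicate of the CRM row
          {"customer": "Globex", "revenue": -1.0}]  # incorrect value
load(transform(extract([crm, legacy])), warehouse)
print(warehouse)   # one cleansed, deduplicated record
```

The essential point is that standardization and duplicate elimination happen before the load, so the warehouse receives only consolidated, business-ready records.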
MultiProvider
A MultiProvider combines data from several InfoProviders and makes it
available for joint reporting; it does not itself store any data.

InfoSet
An InfoSet is a semantic layer that uses ODS objects and master data to create
reports from these objects, in particular joins between them. InfoSets are
created and changed in the InfoSet Builder. You can define reports based on
InfoSets using the BEx Query Designer.
2.4 Hierarchies
SAP BW allows the definition of hierarchies on InfoObjects. Hierarchies can be
unbalanced, unleveled, and also time dependent. Figure 2-4 shows an example
of a sales hierarchy that is organized by region.
Hierarchies can be used for reporting. For example, with a structure such as
shown in Figure 2-4, you could retrieve the total sales. Or you could retrieve only
the sales result of a particular branch of the sales organization, such as North
America. With SAP BW, unleveled or dynamic hierarchies are stored as
parent-child lists.
The parent-child list that corresponds to Figure 2-4 is shown in Table 2-1.
Table 2-1  Parent-child list for the hierarchy in Figure 2-4

  Parent           Child
  WW Sales         North America
  WW Sales         Europe
  North America    US East
  North America    US West
  Europe           Europe East
  Europe           Europe Central
If there are restrictions on hierarchies within a query, the results, that is, the
corresponding leaves of the hierarchy nodes that you select, are stored in
hierarchy result tables. The names of those tables consist of the prefix for SAP
BW tables (/BI0 or /BIC) followed by the digits 02. “/BI0/0200000001” is an
example of the name of a hierarchy result table.
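A parent-child list of this kind is easy to work with programmatically. The following Python sketch shows how the leaves below a selected hierarchy node are found, which is essentially what a hierarchy result table stores. The node names follow the sales-hierarchy example; the layout of a real /BI0/ or /BIC/ hierarchy table differs.

```python
# Sketch of a parent-child list like the one SAP BW uses for unleveled
# hierarchies. Node names follow the sales-hierarchy example.
parent_child = [
    ("WW Sales", "North America"),
    ("WW Sales", "Europe"),
    ("North America", "US East"),
    ("North America", "US West"),
    ("Europe", "Europe East"),
    ("Europe", "Europe Central"),
]

def leaves(node):
    """Return the leaf nodes below 'node' -- the rows a restriction on a
    hierarchy node would place into a hierarchy result table."""
    children = [c for p, c in parent_child if p == node]
    if not children:
        return [node]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result

print(leaves("North America"))  # ['US East', 'US West']
print(leaves("WW Sales"))       # all four leaf regions
```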
[Figure: the SAP BW extended star schema. The central fact table is linked to
the dimension tables; each characteristic in a dimension table references a
shared SID table, with associated master data, hierarchy, and text tables.]
Dimension tables are independent of each other. The fact table links the
dimension tables and the key figures. To link these tables, dimension identifiers
are used. A dimension identifier uniquely identifies a particular combination of
characteristic values in a dimension table, for example, a certain product and the
corresponding product group. Characteristics that are correlated, such as
product and product group, are usually put in the same dimension.
SAP BW uses an extended star schema, which builds on the basic star schema
by storing master data about attributes, hierarchies, and text, in separate tables
that are shared between InfoCubes. This reduces data redundancies because
master data is stored only once, and then used by various InfoCubes.
In Figure 2-6, actual characteristic values, such as the name of a region, are
shown in the dimension tables. In reality, characteristic values are replaced by
surrogate identifiers (SIDs). These are numeric key values (4-byte integers) that
are more compact than the characteristic values; the actual characteristic values
are stored in the master table. Therefore, you have foreign key relationships
between each characteristic in a dimension table and the corresponding
attribute, hierarchy, and text tables. SIDs are used to keep the dimension tables,
which can also grow very large, as small as possible.
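The effect of surrogate identifiers can be illustrated with a small Python sketch. The SID assignment scheme shown here (sequential integers on first use) is an assumption for the example, not the actual SAP BW number-range logic.

```python
# Illustrative SID assignment: characteristic values are replaced by compact
# integer surrogate identifiers; the actual values live in a master table.
class SidTable:
    def __init__(self):
        self.value_to_sid = {}   # master-data lookup: value -> SID
        self.sid_to_value = {}   # reverse lookup used when reporting

    def sid_for(self, value):
        """Return the SID for a value, generating one on first use."""
        if value not in self.value_to_sid:
            sid = len(self.value_to_sid) + 1
            self.value_to_sid[value] = sid
            self.sid_to_value[sid] = value
        return self.value_to_sid[value]

products = SidTable()
# The dimension row stores only compact integers, not the long strings.
dim_row = (products.sid_for("Concorde"), products.sid_for("Airplanes"))
print(dim_row)                    # (1, 2)
print(products.sid_to_value[1])   # 'Concorde'
```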
[Figure 2-6: an InfoCube with its request and customer dimension tables; the
dimension rows carry data request numbers and customer characteristics such
as the region.]
SAP BW provides the option to define a very large dimension as a line item
dimension. In this case, the corresponding dimension is not stored as a separate
table, but rather it is stored directly in the fact table. This eliminates the
necessary join operation between dimension and fact table during SAP BW
query processing, which can provide improved query performance.
The PSA records used in the example:

  Product    Prod Group   Customer     City     Country   Quantity   Revenue
  Concorde   Airplanes    Air France   Paris    France    1          25000000 $
  A320       Airplanes    Lufthansa    Munich   Germany   2          30000000 $

[Figure 2-7 Steps during dataload into InfoCube: a PSA record is propagated
through the master data (SID) tables and the product dimension table into the
InfoCube fact table.]
3. The master data tables are checked to see if the attribute values from the
PSA record already exist. If this is the case, the corresponding surrogate
identifiers (SIDs) are retrieved from the master data tables. If the attribute
values do not yet exist, they are inserted in the master data tables and
corresponding SIDs are generated. In the given example the attribute values
Concorde and Airplanes are inserted in the master data tables.
4. The dimension tables are checked to see if the combination of SID values
that correspond to the attribute values from the PSA record exist in the
dimension tables. For example, the product dimension table is checked to
see if the SID combination 2/2 exists, which corresponds to the attribute
combination Concorde/Airplanes. If the SID combination exists, the
corresponding dimension identifier is retrieved. If the SID combination does
not exist, it is inserted in the dimension table and a corresponding dimension
identifier is generated.
5. After the dimension identifiers that correspond to the given attribute values
are retrieved/generated, a new record is inserted into the fact table. It
contains the key figures from the PSA record and the corresponding
dimension identifiers.
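Steps 3 through 5 can be summarized in the following Python sketch. The table layouts and key generation are simplified illustrations, not the real SAP BW schema.

```python
# Sketch of steps 3-5: look up or generate SIDs, look up or generate the
# dimension identifier for the SID combination, then insert the fact row.
master = {}        # attribute value -> SID (master data table)
dimension = {}     # tuple of SIDs  -> dimension identifier (DIMID)
fact_table = []    # rows of (DIMID, quantity, revenue)

def get_sid(value):
    if value not in master:                 # step 3: insert missing master data
        master[value] = len(master) + 1
    return master[value]

def get_dimid(sids):
    if sids not in dimension:               # step 4: insert missing combination
        dimension[sids] = len(dimension) + 1
    return dimension[sids]

def load_psa_record(product, group, quantity, revenue):
    sids = (get_sid(product), get_sid(group))
    dimid = get_dimid(sids)
    fact_table.append((dimid, quantity, revenue))   # step 5: fact table insert

load_psa_record("Concorde", "Airplanes", 1, 25_000_000)
load_psa_record("A320", "Airplanes", 2, 30_000_000)
print(dimension)   # two SID combinations, two dimension identifiers
print(fact_table)
```

Note how the second record reuses the SID already generated for Airplanes, so only the new product value triggers a master data insert.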
Aggregates reduce the volume of data to be read from the database, speed up
query execution time, and reduce the overall load on the database. To use
aggregates you must build and maintain them, and they require additional space
in the database. Aggregates are very similar to the automatic summary tables
defined in DB2.
An example aggregate summarizing sales by country:

  Country   Sales
  USA       40
  Germany   35
  Canada    20
These example aggregates could be used by a query that reports the sales by
country or total sales. The aggregates could also be used for reports according
to a navigation attribute for the characteristic country or according to a hierarchy
containing the countries. If during query navigation there is a drilldown, or a
selection is made for a particular customer, these aggregates would not be used.
This is because the aggregates do not contain any detailed information about a
particular customer, only summarized customer data.
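The trade-off is easy to demonstrate in Python. The country totals match the example table; the per-customer split of the fact rows is invented for the illustration.

```python
# Aggregates precompute summarized data so a query need not scan the fact
# rows (compare: automatic summary tables in DB2). The per-customer fact
# rows are an invented illustration; only the country totals come from the text.
fact = [("USA", "Cust A", 25), ("USA", "Cust B", 15),
        ("Germany", "Cust C", 35), ("Canada", "Cust D", 20)]

# Build and maintain the aggregate once, at extra storage cost.
aggregate = {}
for country, _customer, sales in fact:
    aggregate[country] = aggregate.get(country, 0) + sales

# A query by country, or for total sales, reads the small aggregate ...
print(aggregate["USA"])                  # 40
print(sum(aggregate.values()))           # total sales: 95

# ... but a drilldown to a particular customer cannot use it: the customer
# detail is summarized away, so the query falls back to the fact table.
print([s for _c, cust, s in fact if cust == "Cust B"])   # [15]
```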
[Figure 2-8: a country hierarchy with the node “World” on level 1, the nodes
“America” and “Europe” on level 2, and the individual countries below them.]
The aggregate is shown in Table 2-4. That aggregate can be used by queries
that report on sales for a hierarchy node on level 2 or higher. In this example, in
Figure 2-8, those are the nodes labeled “Europe”, “America”, and “World”. It can
also be used by queries that have this country hierarchy as a presentation
hierarchy in the drilldown, but the drilldown goes no lower than the second level.
Table 2-4  Aggregate on the level 2 hierarchy nodes

  Node      Sales
  America   60
  Europe    35
When you fill an aggregate, you load it with data. This action can only be
triggered from the aggregate maintenance. Also note that an aggregate can be
filled from the data of a larger aggregate that is already filled. This means that
very highly summarized aggregates, as a rule, can quickly obtain data from other
aggregates. In contrast, it can take a long time to build aggregates from the
InfoCube. Because of this, all aggregates are filled in background processes. If
there are several aggregates scheduled to be filled in one process, a hierarchy
sequence for the aggregates is determined first, and it is then processed
sequentially. This guarantees that very highly summarized aggregates are built
from the more detailed aggregates.
Roll-up
If aggregates are defined, new data packets that are loaded into the InfoCube
cannot be used for reporting until they are written to the aggregates by a
so-called “roll-up”. In other words, data that has been recently loaded into an
InfoCube is not visible for reporting, from the InfoCube or aggregates, until an
aggregate roll-up takes place. During this process you can continue to report
using the data that existed prior to the recent data load. The new data is only
displayed by queries that are executed after a successful roll-up.
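The visibility rule can be sketched as follows. The request bookkeeping shown here illustrates the behavior only; it is not the actual SAP BW read-pointer implementation.

```python
# Sketch of roll-up visibility: queries read only requests that are already
# rolled up into the aggregates, so newly loaded data stays invisible.
requests = []          # data packets loaded into the InfoCube, in order
rolled_up = 0          # number of requests already written to the aggregates

def load(request):
    requests.append(request)

def roll_up():
    global rolled_up
    rolled_up = len(requests)    # aggregates now cover every loaded request

def reportable():
    return requests[:rolled_up]  # only rolled-up data is visible to queries

load("request 1")
roll_up()
load("request 2")                 # loaded, but not yet rolled up
print(reportable())               # ['request 1'] -- request 2 is invisible
roll_up()
print(reportable())               # ['request 1', 'request 2']
```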
The change-run, also called hierarchy-attribute realignment run, adjusts the data
in the aggregates and turns the modified version of the navigation attributes and
hierarchies into an active version. In almost every phase of the change-run, you
can continue reporting on the old master data and hierarchies.
You can either start the change-run manually in the Administrator Workbench or
with the program RSDDS_AGGREGATES_MAINTAIN. You can give the program
a list of characteristics and hierarchies that are to be taken into account for the
change-run. By default, all those characteristics are taken into account whose
master data you loaded or changed manually, and all the hierarchies that were
marked for the realignment run.
[Figure: compression of requests. The condensor merges requests 1 to n into
the E fact table, which contains the compressed data.]
The processing steps that occur during compression of requests, when running
on DB2 UDB ESE, are explained in detail in “Collocated joins during
compression of requests” on page 97.
An ODS object contains a key (for example, order number) as well as data fields
(key figures). As opposed to InfoCubes, fields in an ODS object can be
overwritten when new data is loaded.
Figure 2-10 shows an example of two layers of ODS objects that are used to
update order and delivery information. These ODS objects allow you to
determine, for example, which orders are open, partly delivered, or completely
delivered.
[Figure 2-10: two layers of ODS objects. Order and delivery data flows from the
PSA into separate level 1 ODS objects (Order, Delivery), is combined into the
level 2 ODS object Order-Delivery, and is then loaded into an InfoCube through
update rules.]
The ODS objects can store the data at the same level of granularity as offered
from the PSA (that is, the source system) because aggregation can be
performed later during reporting.
The example shows how the ODS can be used to integrate data that describes
the same processes but that potentially comes from different source systems
(DataSources). The data can be loaded into SAP BW and stored in different PSA
tables, each having their own Transfer Structure. Integration is achieved by
applying Transfer Structure specific rules (that is, transfer rules) while
transferring the data into the same consolidated structure (communication
structure of an InfoSource) of an ODS object.
The first level ODS objects are part of the foundation of the data warehouse
because they represent the transactional data archive. This data can be accessed
to create new reporting scenarios based on integrated and consolidated data
from the past.
On the database level, every ODS object consists of three transparent tables:
Active table: The data in this table is used for reporting.
Activation queue: This table contains data that is new or modified since the
last activation.
Change log table: The change log is used for the delta update from the ODS
object into other ODS objects or InfoCubes.
Figure 2-11 shows the steps that occur during activation of ODS data. In this
example, we assume that document 123, with an amount of 100, is already
loaded into the active table of the ODS object.
[Figure 2-11: Activation of ODS data. A new record (Doc-No. 123, item 1, amount 60) is processed in three steps: 1. the new record is inserted into the active table and the old record is deleted; 2. the old record is inserted into the change log with a negated value, followed by the new record; 3. the record is deleted from the activation queue.]
Once all the records are activated, you can propagate the changes from the
change log to the datasets of the related ODS objects and/or InfoCubes in a
separate step. In the example, the amount is therefore reduced by 40 in the
related data targets.
The supplied content includes more than 600 InfoCubes, more than 11,000
InfoObjects, and more than 120 MultiProviders.
This content can be used as is, or can be modified to meet any particular
requirements. It represents a wealth of supplied content that helps speed the
implementation and shorten the time to realize benefits.
As can be seen, SAP BW is a robust solution for data warehousing and business
intelligence. This has been a brief overview to supply sufficient information to
enable a better understanding of the remaining contents of this redbook. There is
more detailed information available concerning this offering that can be accessed
from your IBM representative, or directly from SAP.
In addition, DB2 integrates with IBM AIX HACMP™ to provide failover support to
improve availability.
XML support is provided via the no-charge DB2 XML Extender, which includes an
XML-aware text search, an XML data type, and complete integrated XML
management capabilities. XML-aware searches are easy to perform using the
DB2 Net Search Extender.
Based on the strong cooperation between SAP and IBM, DB2 has become very
convenient to manage through SAP applications. All DB2 administration
functions are accessible via SAP transaction ST04. More details about DB2 in an
SAP Business Warehouse environment are described in Chapter 4, “Building
SAP BW on DB2” on page 81.
On the server side, activity is controlled by either DB2 processes (on UNIX) or
DB2 threads (on Windows). We refer to DB2 processes and threads as engine
dispatchable units (EDUs). EDUs are shown as circles or groups of circles in
Figure 3-1.
DB2 agents are the most common type of EDUs and perform most of the SQL
processing. A single SQL statement can be processed by one or multiple DB2
agents. Multiple subagents can be assigned if the server has multiple processors
or is part of a partitioned database. For example, in a symmetric multiprocessing
(SMP) environment, multiple SMP subagents can exploit the available
processors. All agents and subagents are managed using a pooling algorithm
that minimizes the creation and destruction of EDUs.
Buffer pools are DB2’s database cache. They are areas of database server
memory where database pages of user table data, index data, and catalog data
are temporarily moved and can be modified. Buffer pools are a key determinant
of database performance because data in memory can be accessed much faster
than data from disk. Therefore, if more of the data needed by applications is
present in a buffer pool, the applications will realize significantly improved
performance.
All databases maintain log files that record database changes and support
rollback operations. If a database needs to be restored and the transactions
recovered to a point beyond the time of the last full backup, the logs are
required to roll the data forward to that point in time. There are two logging
strategy choices:
Circular logging: In this strategy, the log records fill the log files and then
overwrite the initial log records in the initial log file. The overwritten log
records are not recoverable. This type of logging is used if the database
configuration parameters LOGRETAIN and USEREXIT are set to NO. With
this setup, the rollforward command cannot be used.
Retain log records: In this strategy, a log file is archived after it has been
filled with log records. New log files are made available for log records.
Retaining log files enables roll-forward recovery. Roll-forward recovery
reapplies changes to the database based on completed units of work
(transactions) that are recorded in the log. You can specify that roll-forward
recovery is to the end of the logs, or to a particular point in time before the
end of the logs. This is the default type of logging for SAP BW.
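As a hedged illustration, switching a database from circular logging to log retention with a user exit could look like this. The database name &lt;SID&gt; and the backup path are placeholders; enabling log retention puts the database into backup-pending state, so a full backup follows.

```shell
# Switch to log retention and archive filled log files via the user exit
db2 update db cfg for <SID> using LOGRETAIN RECOVERY USEREXIT ON

# The database is now in backup-pending state; take a full offline backup
db2 backup db <SID> to /db2/<SID>/backup
```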
All changes to regular data and index pages are written to the log buffer first.
The data in the log buffer is written to disk by the logger process when the
client application issues a COMMIT statement, when the log buffer is full, or at
the latest after one second.
In addition, understanding the structure and functionality at this detailed level will
enable you to better compare it with other database systems. With this
knowledge, you will be able to see how DB2 can deliver superior scalability and
still maintain the required levels of performance.
Central to all these technologies, and a key differentiator for DB2, is the
cost-based optimizer. This component enables DB2 to provide the best
performance at the lowest resource cost, making the best use of the resources
you have available.
Let’s start with the base building blocks of a DBMS: the database objects.
Figure 3-2 shows the relationship between the various DB2 database objects
that provide the infrastructure.
[Figure 3-2: DB2 database objects. Instances contain databases; databases contain tablespaces, which hold tables, indexes, and long data. In an SMS tablespace, each container is a directory in the file space of the operating system; in a DMS tablespace, each container is a fixed, pre-allocated file or a physical device such as a disk.]
Instances
An instance (sometimes called a database manager) is DB2 code that manages
data. It controls what can be done to the data, and manages system resources
assigned to it. Each instance is a complete environment. It contains all the
database partitions defined for a given parallel database system. An instance
has its own databases (which other instances cannot access), and all its
database partitions share the same system directories. It also has separate
security from other instances on the same server (system).
Important: Please note that SAP Instances are not equivalent to DB2
Instances. An SAP System can run multiple SAP Instances (application
servers). Each SAP System connects to a DB2 database within a DB2
Instance.
Tablespaces
A DB2 database is organized into parts called tablespaces. A tablespace is a
place to store tables. When creating a table, you can decide to have certain
objects such as indexes and large object (LOB) data kept separately from the
rest of the table data. A tablespace can also be spread over one or more physical
storage devices.
Tables
A table consists of data logically arranged in columns and rows. All database and
table data is assigned to tablespaces. Table data is accessed using the
Structured Query Language (SQL). When creating a table, you can decide to
store all related objects (for example, indexes and large object data) in the same
tablespace, or keep them in separate tablespaces.
All of these advantages can combine to give DB2 high scalability, faster
performance, and higher data availability.
Tip: Partitioning a DMS tablespace results in larger size limits for the
tablespace, because the size limit applies per partition: 256 GB for 16 KB page
size, and 512 GB for 32 KB page size. For example, with 16 KB page size, a
4-partition system can hold 256 GB x 4 = 1 TB per tablespace.
You can define database partition groups, which can comprise all or a subset of
the database partitions. When creating a tablespace, you assign it to a database
partition group and define which tablespace containers are located in a particular
database partition.
When a new row is inserted into a partitioned table, the row’s hash key value is
used as input to a hash function that maps it to a hash value between 0 and
4095. A partition map is used to store the mapping between hash values and
database partitions. For each hash value it contains the corresponding database
partition number. The table row is stored on the partition number that is retrieved
from the partition map. The same hash function is used to retrieve rows from
partitioned tables.
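The mapping described above can be sketched as follows. This is an illustrative Python model, not DB2 internals: zlib.crc32 merely stands in for DB2's internal hash function, and the round-robin initialization of the partition map is an assumption.

```python
# Sketch of hash partitioning: a row's hash key maps to one of 4096
# hash values, and a partition map translates each hash value into a
# database partition number. Illustrative only, not the DB2 algorithm.
import zlib

NUM_HASH_VALUES = 4096

def build_partition_map(partitions):
    """Assign the 4096 hash values round-robin to the given partitions
    (one plausible way such a map could be initialized)."""
    return [partitions[i % len(partitions)] for i in range(NUM_HASH_VALUES)]

def target_partition(partition_map, hash_key):
    """Map a row's hash key to a database partition number.
    zlib.crc32 stands in for the internal hash function."""
    hash_value = zlib.crc32(str(hash_key).encode()) % NUM_HASH_VALUES
    return partition_map[hash_value]
```

Because the same function is applied on insert and on retrieval, a given key always resolves to the same partition.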
You can create more than one buffer pool and assign each to particular tablespaces.
If a tablespace is located on multiple database partitions, the corresponding
buffer pool is created on each partition. For example, if you define two partitions
on a single server and a tablespace on both partitions, two separate memory
areas will be allocated for the corresponding buffer pool, one for each partition.
The isolation level determines how data is locked or isolated from other
processes, while the data is being accessed. To explain the isolation levels
provided by DB2, we first introduce the different effects that could occur without
locking:
Lost updates: Two applications, A and B, might both read the same row from
the database and both calculate new values for one of its columns based on
the data these applications read. If A updates the row with its new value and
B then also updates the row, the update performed by A is lost.
Access to uncommitted data: Application A might update a value in the
database, and application B might read that value before it was committed.
Then, if the value of A is not later committed, but backed out, the calculations
performed by B are based on uncommitted (and presumably invalid) data.
Non-repeatable reads: Some applications involve this sequence of events:
a. Application A reads a row from the database, then goes on to process
other SQL requests.
b. In the meantime, application B modifies or deletes the row and commits
the change.
c. Application A attempts to read the original row again and receives the
modified row, or discovers that the row has been deleted.
Phantom reads: An application executes a query, other applications insert or
change rows so that they now satisfy the query, and the application then
repeats the query. When the query is repeated in the third step, some
additional (“phantom”) rows are returned as part of the result set that were
not returned when the query was initially executed in the first step.
Table 3-1 summarizes the different isolation levels in terms of the undesirable
effects previously discussed.
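The lost-update anomaly from the list above can be sketched in a few lines. This is a plain Python illustration with a dict standing in for a database row; no locking is involved, which is exactly the point.

```python
# Two applications read the same row before either writes; the second
# write silently discards the first. A dict stands in for a database row.
balance = {"row1": 100}

read_a = balance["row1"]   # application A reads 100
read_b = balance["row1"]   # application B reads 100

balance["row1"] = read_a + 50   # A writes 150
balance["row1"] = read_b + 20   # B writes 120: A's update is lost
```

With an appropriate isolation level, the row would be locked between B's read and write, forcing B to see A's committed update first.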
[Figure: an SMP server hosting a single database partition, with multiple processors sharing memory and disks.]
Resources, such as disk space and memory, are shared. With multiple
processors available, different database operations such as loading data,
backup/restore, and index creation, can take advantage of multiple processors.
Most importantly, DB2 can divide the work of a single query among available
processors to improve processing speed. This is called intra-partition
parallelism. Typically a single query comprises several operations, as shown in
Figure 3-5. If intra-partition parallelism is activated, these operations are
processed concurrently by multiple database agents. For example, one agent
performs a table scan and passes intermediate results to another agent, which
concurrently starts to process a join operation. This kind of pipeline processing
results in faster execution times for the query.
[Figure 3-5: the SCAN, JOIN, and SORT operations of queries executing concurrently over time as a processing pipeline.]
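The pipeline described above can be modeled with generators. This is an illustrative Python sketch, not DB2 code; the point is that the join operator consumes rows while the scan operator is still producing them.

```python
# Pipelined query processing: a scan operator streams rows to a join
# operator, which emits output before the scan has finished. Generators
# model the concurrently working database agents.

def scan(table):
    """Table-scan operator: yields rows one at a time."""
    for row in table:
        yield row

def join(rows, lookup):
    """Join operator: consumes scanned rows as they arrive and emits
    matching output immediately, without an intermediate table."""
    for key, value in rows:
        if key in lookup:
            yield (key, value, lookup[key])

# Hypothetical tables for illustration
orders = [(1, "open"), (2, "delivered"), (3, "open")]
customers = {1: "ACME", 3: "Globex"}

result = list(join(scan(orders), customers))
```

Each row flows through the pipeline as soon as it is scanned, which is what shortens the overall execution time compared to materializing each intermediate result.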
To scale an SMP system, you can add more processors. However, with shared
memory and shared disks, the DB2 agents are effectively sharing all of the
database data. You can increase the I/O capacity of the database partition
associated with your processors by increasing the number of disks and I/O
adapters, as shown in Figure 3-6. You can establish I/O servers to specifically
deal with I/O requests. Having one or more I/O servers for each disk allows for
more than one I/O operation to take place at the same time.
Important: Multiple logical partitions share processors, but disks and memory
(for example, buffer pools) are not shared.
Figure 3-7 illustrates how queries are processed if multiple partitions are
configured.
[Figure 3-7: a table distributed across database partitions 0 and 1; each partition processes the query against its local data.]
The model in Figure 3-9 is used to process the query in the following steps:
Parse query: The SQL compiler analyzes the SQL query to validate the
syntax. If any syntax errors are detected, the SQL compiler stops processing
and returns the appropriate SQL error to the application that submitted the
query. When parsing is complete, an internal representation of the query is
created and stored in the query graph model.
Check semantics: The compiler ensures that there are no inconsistencies
among parts of the statement. As a simple example of semantic checking, the
compiler verifies that the data type of the column specified for the YEAR
scalar function is a datetime data type.
The compiler also adds the behavioral semantics to the query graph model,
including the effects of referential constraints, table check constraints,
triggers, and views. The query graph model contains all of the semantics of
the query, including query blocks, subqueries, correlations, derived tables,
expressions, data types, data type conversions, code page conversions, and
partitioning keys.
Figure 3-10 illustrates the Directed Join. On each partition, each row of table A
is sent to one particular partition, where it is joined with the data of table B on
that partition.
[Figure 3-10: directed join. Rows of table A on each partition are sent to the partition (0 to N) that holds the matching rows of table B, where the join is performed.]
Dynamic SQL statements are cached until they are either invalidated, freed for
space management reasons, or the database is shut down. Because of the
caching, dynamic SQL statements do not need to be compiled often by DB2, but
they must be compiled at least once when you execute the application. If
required, dynamic SQL statements are recompiled implicitly by the DB2 SQL
compiler whenever a cached statement becomes invalid.
Static SQL statements are persistent, because information about access plans
for static SQL is stored in the system catalog tables. When a static SQL
statement is executed, the database manager will use the information stored in
the system catalog tables to determine how to access the data and provide
results for the query.
DB2 provides a 64-bit engine with full support for 64-bit and 32-bit clients,
management tools, and associated subsystems like the Spatial Extender.
For UNIX platforms (excluding Linux), the installation is the same for both 32-bit
and 64-bit environments. When DB2 Version 8 is installed on a UNIX server, both
the 32-bit and 64-bit files are laid down. One can choose either the 32-bit or
64-bit option at instance creation time, at which point instance links will be set up
to point to the appropriate libraries and executables.
For Windows and Linux, there is a separate install image for 32-bit and 64-bit
environments, since coexistence is not supported for these platforms.
Important: In DB2 Version 8, this log has been split into two —
db2diag.log and db2admin.log
Health Monitor and Health Center: DB2 Version 8 has two features to help
you monitor the health of DB2 systems — the Health Monitor and the Health
Center. These tools add a management by exception capability to DB2 UDB
by alerting the DBA to potential system health issues. This enables the DBA
to address health issues before they become real problems that affect your
system’s performance.
Here are some guidelines to help you understand the use of the tools:
Choose the Snapshot Monitor or Event Monitor to gather data about DB2’s
operation, performance, and the applications using it. This data is maintained
as DB2 runs, and can provide important performance and troubleshooting
information.
Choose the Explain Facility to analyze the access plan of an SQL statement,
or a group of SQL statements.
Choose the db2batch tool to measure and analyze the performance of a set
of SQL statements. Performance times are returned along with Snapshot
Monitor data for analysis. Explain information can be gathered for use by the
Explain Facility.
Choose the CLI/ODBC/JDBC Trace Facility to track activity between a CLI
client and DB2. This facility can help pinpoint long running statements, and
analyze the time spent in the client application, DB2, or the network.
The Design Advisor Wizard and the Configuration Advisor Wizard should be
run after significant changes to the workload have occurred, or are
anticipated. Given the potential for significant resource consumption by these
wizards, these executions should be relegated to offpeak hours.
Use the Health Monitor and Health Center to take a proactive approach to
manage the DB2 environment by exploiting its management by exception
capabilities.
MDC enables a table to be physically clustered on more than one key (or
dimension) simultaneously. As shown in Figure 3-14, prior to DB2 Version 8, DB2
only supported single-dimensional clustering of data, through clustering indexes.
In the example on the left side, we assume we have a clustering index on column
region. Using this index, DB2 maintains the physical order of data on pages in
the key order of the index, as records are inserted and updated in the table.
Clustering indexes greatly improve the performance of range queries that have
predicates containing one or more keys of the clustering index. With good
clustering, only a portion of the table needs to be accessed and, when the pages
are sequential, more efficient prefetching can be performed.
[Figure 3-14: single-dimensional clustering by one key compared to multi-dimensional clustering, shown for a year dimension (values 97 through 00).]
With MDC, these benefits are extended to more than one dimension, or
clustering key. MDC tables are managed by block according to the clustering
dimensions that are defined. Each insert transparently places a row in an existing
block that satisfies all dimensions, or creates a new block. Dimension indexes
are BLOCK-based, which results in much smaller indexes. In addition, MDC
tables also support record-based indexes.
Type-2 indexes
DB2 Version 8 adds support for type-2 indexes. The primary advantages of
type-2 indexes are:
They improve concurrency because the use of next-key locking is reduced to
a minimum. Most next-key locking is eliminated because a key is marked
deleted instead of being physically removed from the index page.
They support the multi-dimensional clustering feature; that is, they are
required for MDC tables.
A table must have only type-2 indexes before online table reorg and online
table load can be used against the table.
An index can be created on columns that have a length greater than 255
bytes.
All new indexes are created as type-2 indexes, except when you add an index on
a table that already has type-1 indexes. In this case the new index will also be a
type-1 index, because you cannot mix type-1 and type-2 indexes on a table. All
indexes created before DB2 Version 8 were type-1 indexes. To convert type-1
indexes to type-2 indexes, use the REORG INDEXES command.
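A hedged example of such a conversion follows; the schema and table name are placeholders, and you should verify the exact REORG INDEXES options against the DB2 V8 documentation for your FixPak level.

```shell
# Rebuild all indexes on the table and convert type-1 indexes to type-2
db2 reorg indexes all for table SAPR3.MYTABLE convert
```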
Online index reorganization
During online index reorganization, the entire index object (that is, all indexes on
the table) is rebuilt. A shadow copy of the index object is made, leaving the
original indexes and the table available for read and write access. Any concurrent
transactions that update the table are logged. Once the logged table changes
have been applied to the shadow copy and the new index is ready, the new
index is made available. While the new index is being made available, all access
to the table is prohibited.
Online Load
This feature enables read access to tables during DB2 LOAD operations. Other
tables in the tablespace that is being loaded remain available for full read and
write access.
The NUM_POOLAGENTS parameter determines the maximum size of the idle agent
pool. If more agents are created than is indicated by the value of this parameter,
they are terminated when they finish executing their current requests, rather
than being returned to the pool.
[Figure: on each database partition, N client connections are served by N coordinator agents and f(N) subagents, which access the partition's buffer pools and tablespaces.]
To apply the changes from the transaction table to the master table, prior to DB2
Version 8, two separate operations were required:
An UPDATE operation for those rows already existing in the master table
An INSERT operation for those rows that do not exist in the master table
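The two-step pattern could be expressed as in the following sketch; the table and column names trans, master, id, and amount are illustrative, not taken from the text.

```sql
-- Step 1: update the rows that already exist in the master table
UPDATE master m
   SET amount = (SELECT t.amount FROM trans t WHERE t.id = m.id)
 WHERE EXISTS (SELECT 1 FROM trans t WHERE t.id = m.id);

-- Step 2: insert the rows that do not yet exist in the master table
INSERT INTO master (id, amount)
   SELECT t.id, t.amount
     FROM trans t
    WHERE NOT EXISTS (SELECT 1 FROM master m WHERE m.id = t.id);
```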
Prior to DB2 Version 8, multiple servers were required to run more than one level
of DB2 at the same version.
Another joint IBM/SAP team is working on the integration of DB2 with SAP
products in general. They have developed, for example, a powerful graphical
interface to maintain and monitor DB2 from within SAP applications.
Yet another joint IBM/SAP team develops code that is part of the SAP installation
tool, SAPinst, to facilitate the installation process of SAP BW on DB2 UDB ESE.
4.1.1 Scalability
Scalability continues to be one of the most important requirements placed on
SAP Business Information Warehouse systems. For SAP BW, the limiting factor
is the computing power of the database server. With DB2 UDB ESE, IBM’s
parallel and scalable database, this limitation does not exist.
The new SAP product strategy implies that SAP BW (as one component of SAP
NetWeaver) will be part of almost every SAP installation. As described in 1.6,
“SAP BW customer scenario” on page 15, during the life cycle of an SAP
implementation, the following factors lead to a continuous growth of the SAP BW
system:
More data has to be loaded in a given timeframe: Several SAP modules
are connected to SAP BW over a certain time frame. For example, a
company starts to implement Human Resources (HR) and in a later step
Finance (FI). This results in more data that has to be uploaded in a given
maintenance window (for example, during the night or the weekend).
The total volume of data in SAP BW constantly grows: At least for the first
4 to 5 years, data volume grows because many customers want to have
online access to at least 4 years of data.
The number of concurrent SAP BW queries increases: This happens
because with a growing number of connected SAP BW systems, the number
of users that need to run queries will also increase over time.
Important: With DB2 UDB ESE, SAP Business Warehouse is also scalable at
the database server level.
Figure 4-1 DB2 UDB ESE provides the unique ability to scale SAP BW at the database level
A unique feature of DB2 is the ability to fully exploit all available database servers
to process a single SAP BW query. This is depicted in Figure 4-2.
Figure 4-2 DB2 exploits all available database servers to process a single query
[Figure: sizing accuracy increases with the sophistication of the approach, from basic tools and T-shirt sizing, through extrapolation, to customer tailored evaluation and test/benchmark systems.]
In its sizing guide, “Sizing - ASAP for BW Accelerator”, SAP clearly points out
that:
Their detailed sizing approach is based on assumptions, and that in some
cases it may result in an inaccurate sizing.
A reliable sizing can only be performed with a customer tailored benchmark.
However, because of DB2’s flexible scaling approach, you may even be able to
begin implementation without an expensive customer tailored benchmark.
Please refer to 6.1, “Sizing SAP BW on DB2” on page 122 for more details on
SAP BW sizing.
Those functions for which a special implementation exists, for DB2 UDB ESE,
are shown in the boxes with dashed lines. As you can see, there is also
DB2-specific functionality for installation and upgrade.
Figure 4-6 shows the default tablespace, and buffer pool, layouts used for SAP
BW. By default, the system is installed with only one database partition. The
following tablespaces exist:
Multiple tablespaces for R/3 tables, all with 4K page size. Those are not
specific to SAP BW.
One tablespace for dimension tables and one for indexes on dimension
tables, both with 4K page size.
One tablespace for InfoCube and aggregate fact tables and one for
corresponding indexes, both with 16K page size.
One tablespace for PSA and ODS tables and one for corresponding indexes,
both with 16K page size.
One tablespace for temporary tables with 4K page size.
One tablespace for temporary tables with 16K page size.
[Figure 4-6: the default layout includes tablespaces for R/3 tables, the dimension data and index tablespaces (DIMD, DIMI), and the temporary tablespaces PSAPTEMP and PSAPTEMP16.]
By default, only one database partition is created. If you initially install the
system with more than one database partition, the following tablespaces and
buffer pools are created on all partitions:
Tablespaces for InfoCube and aggregate fact tables
Tablespaces for PSA- and ODS tables
Temporary tablespace PSAPTEMP16
Buffer Pool BP_STD_16K
Important: If you plan to scale your database system by adding servers later
on, you should configure multiple partitions during SAP BW installation. It is
much easier to install SAP BW with multiple partitions, than to add database
partitions later on. Multiple partitions are required to use multiple servers for
your database system.
[Figure: an alternative layout with separate tablespaces for R/3 tables, dimension tables, the fact tables of InfoCube1 and InfoCube2, aggregate tables, the ODS activation queue, the ODS active and change log tables, the PSA tables, and the temporary tablespaces PSAPTEMP and PSAPTEMP16.]
These are the primary differences when compared to the default layout:
The data for fact, PSA, and ODS active tables is not located in partition 0.
This reduces the total backup and restore times, because partition 0 must
always be backed up first and restored first.
For large InfoCubes, fact data is in separate tablespaces for easier
manageability.
Aggregate fact data is in one or more separate tablespaces. This enables
you, for example, to assign a dedicated buffer pool to aggregates. As
aggregates are small when compared to InfoCubes, and are more frequently
accessed, this separation can considerably increase overall buffer pool hit
ratio. If aggregate fact tables are small, they should not be distributed. This
will avoid unnecessary inter-partition parallelism.
Chapter 10, “Scalability factors of SAP BW on DB2 UDB ESE” on page 317,
describes scalability factors that are achieved for the most performance critical
operations when going from one to multiple database partitions.
The following sections discuss topics that are related to SAP BW systems built
on DB2 databases with multiple database partitions.
[Four SAP BW application servers connect to two database servers hosting database partitions 0 to 3. R/3, master data, and dimension tables reside on partition 0; fact, PSA, and ODS tables are distributed across the partitions, which communicate via TCP/IP.]
Figure 4-8 Multiple SAP BW Instances connected to multiple database servers
If all SAP BW Instances are connected to the first database server, this server
can become a bottleneck. The reason is that the database server, the database
partition to which the instance connects, has to compile (prepare) all SQL
statements that are issued by this instance. Preparing an SQL statement can be
computationally expensive, especially when executing SAP BW queries.
Notice that the partitioning key for fact tables differs from the partitioning key for
PSA tables. Therefore, when loading from PSA into an InfoCube, records might
be relocated to different partitions. This results in communication between
database partitions.
[Figure: array inserts from the SAP BW application server are collected in a separate buffer per database partition (partitions 1 and 2) before being sent.]
This buffering avoids single record transfers between partitions, which would
result in much higher overall communication time.
To activate insert buffering, you have to add the following two lines to the
db2cli.ini file, which is located in /db2/db2<sid>/sqllib/cfg:
[common]
InsertBuffering=2
On each partition, it joins only the local data of the F-fact and E-fact tables on
that partition. No communication between partitions is required. Therefore, the
compression scales linearly with the number of database servers. Thus, if you
double the number of database servers, the request is compressed twice as fast.
[Figure: collocated join. Matching rows of table A and table B reside on the same partition (0 to N) and are joined locally, without inter-partition communication.]
Collocated joins are also used in other situations. For example, dimension tables
are locally joined with master data or hierarchy tables. In general, collocated
joins can be used for tables that belong to the same nodegroup, or for tables that
are not distributed and reside on the same partition.
It is planned that, in a future FixPak, DB2 Version 8 will provide the option to
collect statistics on all database partitions.
In this example, indirect archiving to Tivoli Storage Manager (TSM) is used. The
user exit program (db2uext2) copies log files from log_dir to the log_archive
directory. From there, the files are then archived to TSM by the brarchive
program. The following shared directories are used for log file archiving and
retrieval when using indirect archiving: /db2/<DBSID>/log_archive,
/db2/<DBSID>/log_retrieve.
In addition to indirect archiving, you can also use direct archiving. Figure 4-13
provides a complete overview of the different archiving methods. More details
can be found in the database administration guide, SAP on IBM DB2 Universal
Database for UNIX and Windows.
Attention: When running SAP BW 2.x on DB2 UDB ESE, DB2 Common
Table Expressions are used, instead of views, to improve query performance.
If you only have short running queries (< 3 seconds) and often execute the same
BW queries with identical restrictions, it may be useful to use host variables for
SAP BW queries. In this case, DB2 can reuse previously generated access
plans. This avoids compilation of queries and can improve performance.
Attention: When running on DB2 UDB ESE, Common Table Expressions are
used to combine hierarchy result tables instead of views. This improves query
performance.
In the example, fact table sales is joined with its dimension tables store, period,
and product. In addition, a WHERE condition (predicate) on table product is
defined.
Important: If the foreign key columns in the fact table are single-column
indexes, which is the case in SAP BW, and there is a relatively high selectivity
across all dimension tables, DB2 can use a star join with index ANDing. This
can significantly increase performance.
Hash joins
Hash joins can significantly improve performance of certain queries, especially in
data warehouse environments such as SAP BW, where the queries are typically
quite large and complex. In DB2 UDB Version 8, hash joins are always
considered when optimization level 5 or higher is used. By default, SAP BW uses
optimization level 5. With DB2 Version 7, hash joins are only considered when
the registry variable DB2_HASH_JOIN is set to YES.
When joining two tables, regardless of which join method is used, one table will
be selected to be the OUTER table, and another table will be the INNER table.
The optimizer decides which will be the OUTER table and which will be the
INNER table based on the calculated cost and the join method selected. The
OUTER table will only be scanned once. The INNER table may be scanned
multiple times, depending on the type of join and indexes that are present on the
table.
Hash joins require one or more equality join predicates between the joined
tables. The fact that they can handle more than one equality predicate between
the joined tables is a distinct advantage over other join methods.
For the full performance benefits of a hash join, you should consider the following
suggestions:
You may need to change (increase) the value of the sortheap database
configuration parameter and the sheapthres database manager configuration
parameter.
Hash join performance is best if you can avoid hash loops and overflow to
disk. To tune hash join performance, estimate the maximum amount of
memory available for sheapthres, then tune the sortheap parameter. Increase
its setting until you avoid as many hash loops and disk overflows as possible,
but do not reach the limit specified by the sheapthres parameter.
Increasing the sortheap value should also improve performance of queries
that have multiple sorts.
If intra-partition parallelism is enabled, the hash join may be executed in
parallel.
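As a hedged illustration, both parameters are changed with the usual update cfg commands; the database name &lt;SID&gt; and the values shown are placeholders, not tuning recommendations.

```shell
# Sort heap available to each sort or hash join, in 4 KB pages
# (database configuration parameter)
db2 update db cfg for <SID> using SORTHEAP 2048

# Instance-wide threshold for total private sort memory, in 4 KB pages
# (database manager configuration parameter)
db2 update dbm cfg using SHEAPTHRES 20000
```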
Depending on the type of tables, the following types of statistics are collected in
SAP BW:
SAP Basis tables: Basic Statistics
SAP BW ODS, PSA tables: Basic Statistics
SAP BW Fact, Dimension, Master Data tables: Distribution Statistics
You can use a planning calendar, by using transaction ST04, to schedule the
automatic collection of database statistics. For more details refer to 7.3.3, “The
optimizer and runstats” on page 207.
Figure 4-16 DBA Cockpit 6.40 can process runstats concurrently on multiple tables
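For illustration, distribution statistics can also be collected manually from the DB2 command line processor. The database name, schema, and fact table below are assumptions for this sketch; in a production system, statistics collection is normally left to the scheduled jobs.

```shell
db2 connect to BW1
# The schema and table name are illustrative; BW fact tables follow the
# /BI0/F* (SAP-delivered) or /BIC/F* (customer-defined) naming pattern.
db2 "runstats on table SAPR3.\"/BI0/F0SD_C01\" with distribution and detailed indexes all"
db2 connect reset
```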
(Figure: servers A, B, C, and D attached through Fibre Channels to the FASt T700 storage, with 85 GB RAID-5 volume groups such as vg2.)
Note: Throughout this redbook the alias name P650 A is used to refer to the first
p650 and the alias name P650 B to refer to the second p650, as shown in
Table 5-1, “Basic hardware description” on page 111.
The storage system is an IBM FASt T700, attached through Fibre Channel. The disks
were configured as RAID 5, as is typical in most customer environments.
Note: Please consult your disk supplier and verify SAP recommendations in
the installation guide SAP Web Application Server Installation. That is
necessary because the disk design and architecture can significantly
influence the BW system performance.
Although other operating systems could be used, AIX provides excellent support
for DB2 parallelism and partitioning, and these were the capabilities of most
interest in this project.
64-bit kernel: check with the command bootinfo -K (should return 64)
Filesets (check with the command lslpp -l):
bos.rte
bos.adt
bos.data
bos.sysmgt
bos.diag.rte
bos.msg.en_US
bos.net.nfs
bos.net.tcp
perfagent
bos.loc.iso.en_US
bos.loc.iso.de_DE
bos.iconv.de_DE
bos.iconv.com
devices.*
printers.rte
X11.base
X11.apps
X11.motif
X11.fnt.iso1
X11.loc.en_US
X11.msg.en_US
X11.Dt
xlC.rte
Note: If possible, distribute the swap area across more than one disk. But, do
not put the sapdata and swap area on the same disk.
DB2 Universal Database Enterprise Server Edition, Version 8.1 (64-bit) for AIX,
delivered via SAP CD number 51019377
Note: If you elect to use the latest available DB2 FixPak for your
implementation, first verify that it is in compliance with SAP.
BW: BW 3.5
You can download the latest available patch from the SAP Service Marketplace
at the following link:
http://service.sap.com/patches
The technical steps to set up the installation are described in detail in Chapter 6,
“Implementing SAP BW on DB2” on page 121. The technical steps to scale out
the system by adding additional hardware and database partitions are described
in Chapter 9, “Scaling out the database” on page 297. In Chapter 10, “Scalability
factors of SAP BW on DB2 UDB ESE” on page 317, we summarize the results of
two studies about SAP BW scalability, when running on DB2. We also provide
the results of the measurements we performed as part of this redbook project.
Table 5-5 provides an overview of the different installation and scaling scenarios,
and the hardware that is used in each scenario. The computing power, in terms
of the number of processors and number of disks, is doubled when going from
scenario 2 to scenario 3.
Scenario 1: 1 p650 server, 8 processors, 1 database partition, BW installation
If you want to scale your database system by adding additional SMP servers,
you must partition your database. The different options to partition the DB2
database are described in Section 4.3.1, “Architecture and features” on page 89.
The purpose of this scenario is to show the required steps and tools to
re-partition the database. This is relevant for most SAP BW and DB2 customers
that are currently running on a single partition. In addition, we compare the
execution times between Scenario 1 (unpartitioned) and Scenario 2 (partitioned)
for certain performance critical BW operations to demonstrate the advantage of
partitioning.
If you cannot add processors to your database server, adding additional servers
is the next step in order to scale the database system. This might become
necessary, for example, if SAP BW system response times increase due to
growing data volumes, growing number of users, or increasing complexity of
queries. The ability to add database servers to your database system is a unique
feature of DB2 UDB ESE. This feature has many advantages, for example in terms
of total cost of ownership (TCO), as described in Section 4.2, “Business advantage:
the value proposition” on page 87.
The purpose of this scenario is to show how to attach additional SMP servers to
your database system.
The data model is shown in Figure 5-2. It consists of a sales and distribution
InfoCube 0SD_C01. In addition, an ODS object, ODSBENCH, is used. During
the project we performed the following steps:
Load 30 InfoPackages concurrently from flat file into PSA.
Collect runstats on PSA.
Drop InfoCube indexes.
Load 30 InfoPackages concurrently from PSA into InfoCube.
Recreate InfoCube indexes.
Roll up the InfoPackages from InfoCube 0SD_C01 into the aggregates.
Compress an InfoPackage.
Load from PSA into ODS.
Activate the InfoPackages in the ODS.
Ten aggregates are defined for the InfoCube with approximately the same
volume of data as the InfoCube. The aggregate definitions are shown in
Table 5-6.
1 Material number: *
Country of the sold-to party: *
Valuetype for reporting: *
2 Division: F, B2
Material number: *
Country of the sold-to party: *
Valuetype for reporting: *
3 Division: F, B3
Material number: *
Country of the sold-to party: *
Valuetype for reporting: *
4 Division: F, B4
Material number: *
Country of the sold-to party: *
Valuetype for reporting: *
5 Division: F, B5
Material number: *
Country of the sold-to party: *
Valuetype for reporting: *
6 Distr. Channel: *
Division: *
Material number: H/BW_SDK_BMK///0MATERIAL 2
Country of the sold-to party: *
Valuetype for reporting: *
7 Distr. Channel: *
Division: *
Material number: H/BW_SDK_BMK///0MATERIAL 3
Country of the sold-to party: *
Valuetype for reporting: *
8 Distr. Channel: *
Division: *
Material number: H/BW_SDK_BMK///0MATERIAL 4
Country of the sold-to party: *
Valuetype for reporting: *
9 Division: *
Country of the sold-to party: *
Valuetype for reporting: *
10 Division: *
Valuetype for reporting: *
The FASt T700 storage system used for the tests provided a total of 290 GB of
free space. Within this space, we generated and loaded between 30 million and
60 million records of transactional data, as described in Chapter 10, “Scalability factors of SAP BW
on DB2 UDB ESE” on page 317.
Loading 60 million records was the largest scenario possible with the available
disk space. This maps to the 4 GB SAP BW benchmark scenario. 60 million
records in the PSA consume about 30 GB disk space (24 GB for table data and 6
GB for index data). Additional disk space is required to load the transactional
data from PSA into the InfoCube, the ODS, and the aggregates.
To see a realistic percentage of disk activity during our tests, we limited the
amount of real memory to be used by the database (bufferpools and sortheap).
SAP recommends configuring about 1/3 of real memory for bufferpools
(database cache) and 1/4 of real memory for sortheap (see 8.2, “Health check”
on page 250, and SAP Note 584952 for details).
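The rule of thumb above can be expressed as simple arithmetic. A minimal sketch; the 32 GB of real memory is only an example value:

```shell
# Example only: split real memory per the SAP rule of thumb above.
REAL_MEM_MB=32768                     # 32 GB real memory (example)
BUFFERPOOL_MB=$((REAL_MEM_MB / 3))    # about 1/3 for bufferpools
SORTHEAP_MB=$((REAL_MEM_MB / 4))      # about 1/4 for sortheap
echo "bufferpools: ${BUFFERPOOL_MB} MB, sortheap: ${SORTHEAP_MB} MB"
```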
SAP considers two major areas to estimate hardware requirements for SAP BW:
Data staging (for data load into PSA, and from PSA into data targets)
SAP BW queries
To help estimate hardware requirements, SAP provides the sizing guide, Sizing -
ASAP for BW Accelerator, and a tool called QuickSizer. There are some
important factors that make it difficult to estimate hardware requirements. As
examples, there are uncertainties in the area of SAP BW queries such as the
following:
You cannot exactly predict the usage patterns of queries — that is, how often
certain queries are executed by how many users in a given timeframe. The
QuickSizer makes some assumptions about user categories and number of
users in each category, based on customer surveys.
You often do not have detailed information about the final complexity of the
queries. Therefore, response times are hard to predict. In addition, queries
might need to be modified in the production environment. Changing a small
aspect of a query can completely change, and potentially increase, the
execution time.
The execution times of BW queries heavily depend on the existence and
usage of aggregates. If you cannot actually test your BW queries on a system
with the final setup of aggregates, you can only make rough assumptions
about required hardware resources.
Complex SAP BW queries are typically database server bound, which makes it
difficult to estimate hardware requirements for the database server.
The data staging area (dataload) is more application server bound, and sizing
this area brings the following uncertainties:
The QuickSizer assumes that there are no customer specific transformation
and update rules, which is often not the case. If customers define their own
transformation and update rules, this can create an additional load, which is
primarily on the application server.
The following steps explain how to derive hardware requirements and how to
determine the number of database partitions on DB2 UDB ESE:
The data volume, complexity of queries, number of concurrent users, and
maintenance window determine the required SAPS (the SAP unit of measure for
computing power).
The required SAPS determine the number of processors needed on the
selected hardware platform.
The number of processors on each server determines the number of logical
partitions on that server.
Between one and four processors per partition are recommended. In most cases DB2 can
utilize between 2-4 processors per partition when executing a single running
SAP BW query. This requires that INTRA_PARTITION parallelism is switched
on. If INTRA_PARTITION parallelism is switched off, only one processor can
be utilized per partition for single running queries. If you have a high
concurrent workload (many users that run queries concurrently), you might
consider switching INTRA_PARTITION parallelism off to reduce the number
of database agents.
If you plan to connect a second database server to the existing storage
system later, you may want to configure more partitions now to avoid
re-partitioning your data. For details about scaling your database
system, please refer to 3.2.3, “Parallel processing and partitioning” on
page 55 and Chapter 9, “Scaling out the database” on page 297.
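The steps above can be sketched as follows. The processor counts are examples only, and INTRA_PARALLEL is the DB2 V8 database manager configuration parameter for intra-partition parallelism:

```shell
# Example only: derive the number of logical partitions per server
# from the rule of thumb of 1-4 processors per partition.
PROCS_PER_SERVER=8
PROCS_PER_PARTITION=4
PARTITIONS=$((PROCS_PER_SERVER / PROCS_PER_PARTITION))
echo "${PARTITIONS} logical partitions per server"
# To switch intra-partition parallelism on (requires an instance restart):
# db2 update dbm cfg using INTRA_PARALLEL YES
```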
SAP provides the methodology and tools to guide you during the implementation
of SAP NetWeaver. The methodology is based on the ValueSAP Roadmaps,
formerly known as AcceleratedSAP (ASAP).
SAP recommends using the ValueSAP phases, tasks, and activities for the
implementation project. Those phases are listed below. We do not describe or
discuss all of the single steps for the phases, but rather provide a high-level
discussion of those roadmap phases and their advantages.
The illustration in Figure 6-3 shows a list of the needed tasks and activities for
project preparation.
The Application Consultant is the primary person involved in these tasks and
activities, as shown in Figure 6-4. However, technical consultants, such as the
DB2 and SAP Administrator, with their knowledge and experience, should also
be involved. This is especially important for the evaluation of performance and
the load aspects.
The steps to complete this phase are depicted in Figure 6-5. For the system
tests, Test Plans should be defined.
ValueSAP also provides templates and master documents, which are very
helpful. Even the recommended roles, such as DB Administrator or Application
Consultant, are assigned to the appropriate tasks. The illustration in Figure 6-7,
shows examples from the available documents. It lists the related templates for
security and testing the BW environment, and describes the involved roles, such
as database administrator or business consultant, as well.
The project team will need to define the DB2 UDB structure and decide on such
questions as these:
Is database partitioning necessary?
How many database partitions will be needed?
How should the database partitions be distributed?
Should partitions be located on one or more systems?
The project team has to define the distribution of the instances. The following
options are available:
Minimal component distribution: The minimal system installs all of the
mandatory components on a single system. We used the minimal component
distribution for this redbook project.
(Figure: minimal component distribution, with the Central Instance (ABAP processor, message server, enqueue server, and work processes) and the Database Instance with its database objects all on System 1.)
(Figure: distributed alternative, with the ABAP processor and work processes on System 1 and the database on System 2.)
The distribution of the components for the LAN SAP Web AS J2EE system and
the installation of SAP Web AS J2EE systems is described in the documentation,
Installation Guide - SAP Web Application Server Java on UNIX: IBM DB2
Universal Database for UNIX and Windows.
During the planning activities you should review, evaluate, and document the
following items:
System design (according to sizing)
SAP System Landscape
Storage (disk) design and architecture
Service level agreements (SLAs)
Based on the SAP Notes, these checklists, and the document SAP® Software on
UNIX: OS Dependencies, the system administrator checks the prerequisites.
Figure 6-10 OS-Dependent Installation Steps for AIX and HP Tru64 UNIX
Figure 6-11 OS-Dependent Installation Steps for LINUX and SUN Solaris
4. At this step you can optionally create the operating system users
<sapsid>adm or db2<dbsid> manually. Because the SAP NetWeaver
installation procedure, SAPinst, creates the user automatically, we did not
perform this activity.
5. We modified the UNIX kernel parameters and swap space according to the
recommendations given in the documentation SAP Software on UNIX: OS
Dependencies and the SAP Notes. For BW systems, a swap space of at
least 20 GB is recommended. Example 6-2 shows our definitions.
Additionally, some user-related kernel parameters must be changed. For
example, the number of processes per user was set to 500.
Note: The SAPinst GUI caused segmentation faults on the pSeries server with
AIX V5 64-bit. SAP Note 668603 supplies the solution, which is to set the
following parameters in the installation shell:
ulimit -d unlimited
ulimit -s unlimited
You could set these parameters with smitty in the associated user (root)
definitions as well.
9. We prepared the required CDs for the installation. All of the needed CDs were
copied to disk, which reduces the installation time and provides for
unattended loading of the database export CDs. Example 6-4 lists the SAP
provided CDs for the installation.
Now the planning and preparation activities are finished, and the installation
steps can be performed.
SAP NetWeaver 6.40 includes, in addition to the ABAP Engine, the Java
J2EE Engine. Optionally, you may install the SAP Web Application Server Java
on UNIX as well. The following basic steps outline the installation of the SAP
WAS Java environment (however, we do not describe the details of the
installation of the WAS Java environment in this document):
1. IBM DB2 Universal Database for UNIX and Windows Version 8 is installed
during the ABAP installation.
2. Perform the installation of the Web Application Server J2EE Central Services.
The documentation uses the term Central Services to distinguish between the
ABAP and the Java environment.
If the Central Services run on a distributed system, which means that the
Database Server and the Central Services are separated, the installation of
the DB2 Administration Client software must be performed as a
prerequisite.
The free space for the DB2 UDB installation is available now, and the installation
can be started.
1. Log on as user root and set the DISPLAY variable:
– on csh: setenv DISPLAY <hostname>:0.0
– on ksh: export DISPLAY=<hostname>:0.0
2. Go to the subdirectory <tmp_dir>/DBSW, entering cd <tmp_dir>/DBSW
For Solaris only:
If directory /var/tmp is empty, db2setup will delete this directory. To prevent
this, create an empty file in directory /var/tmp: touch /var/tmp/DUMMY
3. To start the installation, enter the command: ./db2setup
4. After clicking the Install Products pane, you will see a screen that
allows you to select the installation options. This is depicted in
Figure 6-14.
5. Select the DB2 UDB Enterprise Server Edition, and click Next.
6. Click Next on the Introduction screen.
7. Accept the license agreement, and click Next.
8. After selecting Typical Installation, click Next.
9. Choose Install DB2 Enterprise Server Edition on this computer, and click
Next.
10.Enter the password and user name for the DB2 Administration Server, and
click Next.
11.Select Do not create a DB2 Instance, and then click Next.
12.Do not change the defaults on the Contacts screen. Just click Next.
13.Do not set up SMTP notification.
14.On the next screen, click Finish to begin the installation.
15.After the installation has completed, click Finish again.
FixPaks are cumulative. This means that the latest FixPak for any given version
of DB2 contains the updates from all previous FixPaks for the same version of
DB2. We recommend that you keep your DB2 environment running at the latest
FixPak level to maintain a problem-free operation.
You can download the latest DB2 FixPak from the IBM DB2 UDB and DB2
Connect Online Support Web site at:
http://www.ibm.com/software/data/db2/udb/winos2unix/support
Each FixPak contains a set of Release Notes and a README file. The README
file provides instructions for installing the FixPak. In the following section we list
the typical installation steps.
Installation steps
You will need about 1.5 GB of free space for the temporary directory <tmp_dir>.
1. Log on as user root.
2. Create a new file system with 1.5 GB or use an existing one. The FixPak
software must be temporarily unpacked to disk.
3. Access the FixPak by mounting the CD or the file system.
4. If you use the SAP CD on an AIX system, enter:
<CD_mount>/UNIX/AIX/SAPCAR -xvf <CD_mount>/UNIX/AIX/DB2FP<nr>.SAR
The subdirectory FIXPAK is created.
The source directory can be defined on other operating systems by using the
command that corresponds to your operating system in the following list:
SUN
<CD_mount>/UNIX/SUN/SAPCAR -xvf <CD_mount>/UNIX/SUN/DB2FP<nr>.SAR
Linux
<CD_mount>/UNIX/LIN_32/SAPCAR -xvf <CD_mount>/UNIX/LIN_32/DB2FP<nr>.SAR
HP
<CD_mount>/UNIX/HP11_64/SAPCAR -xvf <CD_mount>/UNIX/HP11_64/DB2FP<nr>.SAR
The screens and prompts guide you through the installation steps.
DB2 UDB ESE should now be installed, and the recommended SAP FixPaks all
applied. The next step is the installation of the SAP WAS Central Instance.
Prerequisites
The changes recommended in the SAP Notes and in SAP® Software on UNIX:
OS Dependencies have been entered and reviewed.
Prior to starting the SAPinst routine, the library path must be set. If you restart
SAPinst at a later time, make sure that the variable is still set.
As user root, set the library path environment variable according to the following
tables.
Table 6-1 shows the values for the library path environment variable for different
operating systems.
Table 6-2 shows the name of the library path environment variable for different
operating systems.
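As an example of the pattern in those tables, the following sketch sets the library path on AIX. The installation path is an assumption (the default 64-bit DB2 V8 location); the variable name differs per operating system:

```shell
# Assumption: DB2 V8 64-bit installed under /usr/opt/db2_08_01 (the AIX default).
# The variable name differs per OS: LIBPATH on AIX, LD_LIBRARY_PATH on
# Solaris and Linux, SHLIB_PATH on HP-UX.
export LIBPATH=/usr/opt/db2_08_01/lib64:$LIBPATH
```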
Needed filesystems
We then defined the logical volumes, as depicted in Table 6-3:
Next we defined the SAP related filesystems that were prepared for the
installation. They are depicted in Example 6-6:
DB2 code filesystem (from the df output):
/dev/hd10opt 262144 253372 4% 389 1% /opt
Because of the benchmark, the filesystem sizes are larger than recommended.
Additionally, SAPinst asks for the hostname, the database ID and ports,
and the user IDs.
We show the group and user definition screens in the following figures.
First look at Figure 6-19, and focus on the Additional information heading.
This additional information tells us that the group ID or user ID must be the
same on the additional instances. In our case, as shown here, the group ID
is 200. That means that the same number must be defined on all additional
instances, because all UNIX systems perform the security activities based
on the user ID (UID) or group ID (GID) number.
However, we recommend that you allow the installation procedure to create the
users and the groups. In this case, you have to fill out the screens provided, as
illustrated in Figure 6-20. The screen asks you to enter the User ID, the Password,
and the Login Shell. SAP recommends that you work in the C shell environment.
If you have any problems during the installation of the Central Instance, see the
installation log. The installation log is visible using the View Log button which
displays the content of the sapinst.log file.
If an error has occurred, you could restart the installation by clicking the Retry
button. Only in very few cases will it be necessary to clean the directories and
perform the full installation again. This cleanup of the directories is described in
the Installation Guides as well.
When the Central Instance installation has finished successfully, you can run the
installation of the Database Instance.
From the installation directory we again started the SAPinst routine, using the
SAPinstGUI on the PC. By entering the appropriate IP-address and Port 21212,
the welcome screen shown in Figure 6-23 is displayed. We have now begun the
installation of the Database Instance.
In this section we show the installation flow graphically. However, we only use
selected screens rather than including all of the screens, as it is a lengthy
process.
To begin the process, you must again enter the SAP SID (System ID), which in
our project was BW1.
For this new installation we have selected the Standard Installation. The
SAPinst GUI guides you through the process. You will be asked for such
information as:
The hostname, db2bi
The SAP administrator user, bw1adm
The SAP DB administrator user, db2bw1
The group definition, for example, the Database Maintenance Group
The available resources, such as swap space and available memory
The location of the path to the DB2 UDB software, as depicted in Figure 6-25
At this time you must define whether or not you want to use multiple database
partitions. The installation procedure offers you the screen depicted in
Figure 6-27.
We selected the Single Database partition type for our installation. If you select
a single database partition, you can still enable the intra-partition parallelism
function, which allows you to have parallel access to the database.
During the next steps we define the physical database structure, which includes
the tablespace, indexspace, and container definitions. The installation screens
allow you to adapt the default definitions when your requirements differ from the
standard installation. Figure 6-28 shows the changeable fields for tablespaces.
Each tablespace might contain one or more containers. The installation routine
provides processes to add or change container data as well. These include the
assignment of containers to directories other than those originally defined.
Once the physical database structure is defined, you can either accept or
redefine the database bufferpools. Figure 6-29 shows how the bufferpools can be
specified.
When all required definitions have been made, the database load options should
be specified. For the database load, you can select parallel jobs. In our installation
we selected six parallel jobs, which reduced the database load time significantly.
After choosing the National Language Option, the installation procedure lists all
the defined parameters for a review. If necessary, the parameters can either be
modified again or you can start the installation by clicking the Start button.
The running activities and steps are marked, as depicted in Figure 6-31.
Additionally, the progress of the currently executing step is displayed.
After the database load, SAPinst starts runstats to create the statistics used
by the optimizer.
The runtime for the installation depends on the available resources and the
sizing. In our project, the installation of the Database Instance took about two
hours.
The Database Instance is now installed, and the post-installation activities can
be started.
For example, after installation you must check that you can start and stop the
SAP system. In the SAP Kernel directory /sapmnt/BW1/exe, there are scripts
named startsap and stopsap that are available to start and stop the system.
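As a sketch of this check, using the administrator user bw1adm from this project (adjust the user for your SID):

```shell
# Run the start/stop scripts as the SAP administration user.
su - bw1adm -c "startsap"   # starts the database and the SAP instance
su - bw1adm -c "stopsap"    # stops the SAP instance and the database
```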
If you use a language other than German or English, you will have to install the
appropriate language for the client copy.
We changed the DB2 parameters in the Command Line Processor for DB2,
where the DB2 UDB commands can be entered directly. We do not explain all of
the individual parameters. You will find comments and hints concerning those
parameters in SAP Note 584952.
In a BW system, SAP delivers the clients 000 and 066. They are defined as:
Client 000 is the SAP reference Client.
Client 066 is the Early Watch Client, which is used by SAP for
customer-requested support activities.
By using the screen depicted in Figure 6-35, values can be entered. Then the
save symbol, represented by the small yellow diskette, must be clicked.
Using the transaction /nscc4, you select the copy options as desired for the
client copy activity. Here are some data areas that are selectable:
Customizing
User definitions
Application data
At this point, the ASAP business consultants typically start entering additional
SAP BW (Business Information Warehouse) related activities and definitions.
These steps are described in the following section.
Note: Please check the SAP Note 634214 to verify the minimum requirements
necessary to install BI_CONT 351 and get the password to install this Add-on.
10.Now you can click the Start Options button to install the Add-on. This is
depicted in Figure 6-43. It may be best to start in background mode, so that
the maximum dialog runtime limit is not exceeded. In our
project environment, this package took about 30 minutes to run.
Tip: Before starting the package installation, make sure that you have enough
free space in the BW tablespaces. Some guidelines are shown here:
11.You can check the Add-on installation in the main screen of SAINT, as
shown in Figure 6-44. Another way to verify the Add-on installation is to go
to transaction SE16 and list the contents of table AVERS.
12.Apply all support packages for this Add-on using transaction SPAM, and
check all relevant notes to define the queue correctly.
In this redbook project, the Dialog Instance is installed on a separate server from
the Central Instance. The servers and equipment used were described before in
Table 5-1 on page 111. You can see a diagram of the project configuration in
Figure 6-45.
(Figure 6-45: project configuration, with the P-Series 630 and P-Series 650 servers connected through a 100 MB Ethernet network and through Fibre Channel switches 1 and 2 to the FASt T700 storage.)
Notes:
For the standard installation of the DB2 Administration Client v.8,
the files reside in the /usr/opt directory. If you want to use the standard
installation, you should have at least 300 MB in /usr/opt for this client.
When you create the /db2 file system, it should contain at least 20 MB for
the home for user db2<sid>.
At this point, you can begin the Dialog Instance installation steps.
Notes:
You need to create the file system /usr/sap/<SID> with at least 500 MB.
From the Central Instance server, export the /sapmnt/<SID> file system to
the application server.
From the application server, mount the NFS (Network File System) with the
mount point: /sapmnt/<SID>
(Figure: instance directory structure (log, data, work, exe, profile, global) with and without sapcpe (ctrun = SAPCPE) active.)
Tip: If you have the same operating system in the Central Instance and the
Application Server, you can activate sapcpe. Please refer to page 105 in the
SAP NetWeaver’04 Installation Guide.
e. Now start the SAPINST in your PC, using the appropriate IP and port
number, as shown in the example in Figure 6-48.
j. Now you can continue with the installation with the following steps:
• Select Instance memory management.
• Specify no LDAP support.
• Specify the location of the Instance Files: /sapmnt.
• Specify the database specific parameters.
• Specify the operational system group and user.
• Do not extract the SAP System Kernel.
• Check SAP System ports.
• Specify the path to the Database Software.
• Enter the location of IGS software.
• Start the installation.
k. Log on to the system as user <sid>adm and execute the following
command:
R3trans -d
Tip: If you have a problem starting the Dialog Instance, it may be because the
database connection has not been established. Verify that there is a database
connection.
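A minimal sketch of this verification, run as the <sid>adm user; confirm the details against the trans.log file that R3trans writes:

```shell
R3trans -d                  # attempts a database connection and writes trans.log
if [ $? -eq 0 ]; then
  echo "database connection OK"
else
  echo "connection failed -- check trans.log for details"
fi
```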
HACMP has two products, namely, HACMP Classic and HACMP/ES (Enhanced
Scalability). HACMP/ES has some in-built features, which allow application
monitoring. The easiest form of this is process monitoring where you can add
specific process names, and HACMP will monitor these and take action if these
processes vanish. The level of action that can be taken can be fine-tuned from
something as simple as alerting Tivoli to instigating failover.
In addition, there are other, even more critical, factors that can cause
unplanned system outages. These include:
Operator errors
Environmental problems
Application and operating system errors
Reliable and recoverable hardware alone cannot protect against failures
caused by these types of elements.
Nodes form the core of an HACMP cluster, and are processors that run both AIX
and the HACMP software. The HACMP software supports pSeries uniprocessor
and symmetric multiprocessor (SMP) systems, and the Scalable POWERParallel
processor (SP) systems as cluster nodes. To the HACMP software, an SMP
system looks just like a uniprocessor. SMP systems provide a cost-effective way
to increase cluster throughput. Each node in the cluster can be a large SMP
server, extending an HACMP cluster far beyond the limits of a single system and
allowing thousands of clients to connect to a single database.
The HACMP software allows you to combine physical components into a wide
range of cluster configurations to meet your requirements. Figure 6-54 shows an
example of an HACMP cluster. You can see that the nodes and disks are all
connected with redundant busses to enable uninterrupted communications. One
node can pick up the workload of a failed node, and the disks are mirrored in the
event of a disk failure.
In addition, you can keep an HACMP cluster online while making configuration
changes by using the Cluster Single Point of Control (C-SPOC) facility. C-SPOC
makes cluster management easier, as it allows you to make changes to shared
volume groups, users, and groups across the cluster from a single node. The
changes are propagated transparently to other cluster nodes.
The resource group will be defined as Cascading, utilizing the without fallback
option. This allows the resources to be returned to the primary server in a
controlled manner after a failover has occurred. Configuring the resource group
without this option allows automatic fallback when service to the primary server
is restored, resulting in a second application outage.
This is an unusual configuration and requires that the network connection has
sufficiently low latency times to provide the network speed that is required for
FCM communication between the database partitions.
Figure 6-55 Example for SAP BW/DB2 database in HACMP failover configuration
There are a number of approaches that can be used to enable high availability,
depending on client requirements. HACMP offers a robust capability and can
meet the requirements of many businesses.
First, the standard SAP administration tasks are described, and then some of
the more typical administration activities involved with SAP BW.
The SAP BW system provides many predefined roles, which can be displayed
with transaction RSU01 by using the /nrsu01 command.
Example 7-1 shows the description of the BW Administrator and BW Query User
roles.
The DB administrator role provides the basic database functions through the
menu structure in the launchpad, as depicted in Figure 7-1. Depending on the company
size and the complexity of the system landscape, some of the administration
roles may be combined.
These are the administration and maintenance tasks discussed in this chapter:
DB2 UDB administration
Periodic administration activities for the BW system
DB2 UDB related maintenance tasks, such as:
– Backup and recovery
– Storage management
– Using runstats and reorg
BW administration tasks
The configuration facilities available are for the following tasks and activities:
Database manager
Database
Parameter changes
Buffer pools
Data classes predefined by SAP
CLP
You can use these functions for changing names, objects, or parameters, and
also for monitoring. Figure 7-3 shows an example of a Buffer Pool Snapshot for
the Default Buffer Pool, by using the Command Line Processor to display the
bufferpools.
By using the Database Administration Facility, you can analyze and create data
classes. The data classes are defined by SAP and include, for example, user
data classes such as:
APPL0, APPL1, APPL2
USER1
You can then click the execute symbol (clock) and expand the nodes:
-> Running Your SAP System
--> BW1: Checklist for Operating the Production System
----> BW1: Daily Tasks
This will result in a list of tasks and activities like those depicted in Figure 7-7.
These are daily tasks that need to be performed for this day, as determined by
the SAP system.
As the illustration in Figure 7-7 shows, there are icons that provide links to the
documentation (to the left of the clock icon) and activity status symbols. The red
activity symbols indicate that the task has not been started. When the task has
been performed, the indicator will turn green.
As you can see, the SSAA daily task list also lists the DB2 UDB and filesystem
maintenance tasks that are to be performed. We recommend that you use the
SAP System Administration Assistant as a standard procedure for monitoring
and performing the recommended activities. On request, the SSAA will display
the user and the time that the activity was started.
As shown in Figure 7-8, the periodic activities can be divided into two primary
groups:
1. Monitoring and maintenance of the SAP functions, which helps to provide
proactive systems management
2. DB2 UDB related maintenance and support functions
The SAP BW database objects are loaded into DMS Storage. For a DMS
tablespace, each container is either a fixed-size pre-allocated file or a physical
device such as a disk. The DB2 UDB Database Manager (DBM) controls the
storage. If database objects grow, the DBM allocates additional space at the
extent level.
Figure 7-9 depicts the result of transaction /nst06 and the selection of the
filesystem details. You can monitor whether free space is available in the
filesystems. If the threshold level is reached, the system creates an alert,
which is visible in the alert monitor. You can start the alert monitor with
transaction RZ20.
During the installation procedure, SAP defines and creates the tablespaces.
Example 7-2 shows a partial example of the installation procedure, including
some of the primary attributes of the tablespace, such as these:
A pagesize of 4 KB or 16 KB: DB2 UDB stores data in pages, which can also
have a size of 8 KB or 32 KB.
An extentsize of 8, 16, or 32: The extent size defines the number of pages that
are written to a container before switching to the next container. Typically the
next extent is allocated in another container. SAP typically defines an extent
size between 4 and 64 pages.
Managed by system: This is only for the temporary tablespaces. When it is
an SMS tablespace, each container is a directory in a filesystem.
Managed by database: This is used for all other tablespaces.
The right pane in Figure 7-10 shows that the temporary tablespaces (the first two
entries in the Tablespace column) are system managed (SMS) tablespaces.
Since they are system managed, the values for percentage-used are set to
100%. They can be extended as long as there is free space available in the
filesystem. All other tablespaces in the list shown are database managed (DMS),
and should be analyzed periodically.
Figure 7-11 depicts the facility used to add more space with containers. That can
be accomplished by entering the desired values and executing the changes. The
system automatically generates the alter tablespace SQL statement and
executes it. An example of this can be seen in the bottom portion of Figure 7-11.
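The generated statement has roughly the following shape. The tablespace name, container path, and page count below are hypothetical stand-ins for the values entered in the dialog, and the statement is only printed here with echo rather than executed against a database:

```shell
#!/bin/sh
# Hypothetical values; in the real facility these come from the dialog input fields.
TABLESPACE='BW1#ODSD'
CONTAINER='/db2/BW1/sapdata5/ODSD.container001'
PAGES=25600                      # size of the new container in pages

# Build the ALTER TABLESPACE statement the facility would generate and run:
STMT="ALTER TABLESPACE \"$TABLESPACE\" ADD (FILE '$CONTAINER' $PAGES)"
echo "$STMT"
```

In the system itself the statement is submitted directly; printing it first is just a convenient way to review what will be run.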
If a particular BW table is seen to be growing very fast, it can be separated from
the existing tablespace. In addition, you can assign it to another buffer pool to
minimize the response time and improve performance.
The optimizer uses the statistics of the database objects, such as tables, views,
and indexes, to find the best access path. The statistics should be kept current to
enable the optimizer to pick the best path. The runstats utility collects the
statistics for the tables and indexes, and updates them in the DB2 catalog.
After the database server installation, the runstats utility collects statistics on
all tables in the database. Database activity should be monitored; when, for
example, large volumes of data have been inserted into, or deleted from, the
database, the runstats utility should be used to refresh the statistics. The BW
System offers transactions and ABAP programs that support these types of
maintenance requirements.
As previously mentioned, the database objects that need statistics updated are
flagged. To see the flagged database objects, run either the transaction DB2 by
entering /ndb2, or transaction ST04 by entering /nst04, in the command field.
In the NetWeaver Web Application Server (WAS) 6.40, both transactions open an
identical screen.
Figure 7-13 shows the left pane of the DB2 and ST04 transaction screen. Select
the RUNSTATS Settings to get a list of flagged database objects.
Figure 7-14 shows a list of the BW tables that are flagged for RUNSTATS. That
screen also provides additional information, such as:
The last runstats date and time
The volatility of the table
The table cardinality (the number of different values in the clustered
columns).
To see the recommendations on scheduling runstats, you must first select the tab
RUNSTATS Control. An example of the information on that tab is shown in
Figure 7-16. It not only shows the type of Scheduling recommended, but also
shows such information as the Statistics Attributes, Table Analysis Method, and
Index Analysis Method.
Using the SAP provided Planning Calendar (transaction DB13), you can
schedule the needed activities. In the example shown in Figure 7-17, you can
see that the following activities have been scheduled:
The reorg check (Reorgck_All)
The check for runstats (Stats_Check)
The execution of the runstats utility on flagged objects (Run_DBSTATC).
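If you prefer scripting over the Planning Calendar, the same idea can be sketched at the command line. The table names below are invented for illustration (in a real system the flagged objects come from DBSTATC), and the loop only prints the CLP commands as a dry run instead of submitting them to DB2:

```shell
#!/bin/sh
# Hypothetical list of flagged tables; in a real system this comes from DBSTATC.
FLAGGED_TABLES='SAPR3./BIC/FSALES SAPR3./BIC/DSALES1'

for TAB in $FLAGGED_TABLES; do
    # Dry run: print the CLP command instead of executing it.
    echo "db2 runstats on table $TAB with distribution and detailed indexes all"
done
```

In practice the SAP-provided jobs (Stats_Check, Run_DBSTATC) should remain the standard mechanism; a script like this is only useful for ad hoc refreshes.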
We suggest that you develop and document an operations concept for your
production environment. There you would describe the scheduled activities,
including all of the necessary activities for DB2 UDB Health.
If you elect to schedule the REORG and RUNSTATS of the Flagged Tables
activity manually, the screen depicted in Figure 7-18 will be displayed. You can
then select from the two options, to run manually:
Select to run all manually by clicking the Select All button.
Select one or more tables by clicking the icon in the leftmost column of the
desired row(s).
Based on daily checklist or on demand activities, you can display the Database
Administrator (DBA) Action Log. Figure 7-19 shows an overview of the executed
jobs and indicates whether or not they have run successfully. You may also see
some of the runstats-related activities.
Once testing and production use begin, the following are examples of operations
that may affect the physical distribution of data:
Deleting data may leave empty pages.
Inserts of new data may be written to different physical storage elements.
Updates to variable-length columns may cause a row to move to a different
page, leaving unused space in the table.
DB2 UDB provides commands and functions to gather information and analyze
the physical organization of the tables and indexes, such as the REORGCHK
command, which does the following calculations:
It determines the used and free space.
It analyzes the physical distribution of tables and indexes.
It uses formulas to determine if reorganization (reorg) is necessary.
The DB2 UDB REORG function reclaims the unused space and reorganizes the data
to be stored in contiguous pages. If the INDEX option is used, it places the data
rows in the sequence of that index.
The SAP BW System provides the following jobs for REORGCHK and REORG:
RUNSTATS and REORGCHK (DBSTATC)
REORGCHK for all tables
RUNSTATS and REORGCHK for all tables
RUNSTATS and REORGCHK for a single table.
Automatic REORG
REORG and RUNSTATS for a single table
REORG and RUNSTATS of flagged tables
REORG of tables in tablespace(s)
The Automatic REORG job can be specified as shown in Figure 7-17 on page 213,
using the Action Pad. The job can then either be added to the Planning
Calendar, or manually started immediately.
You should first categorize your data according to the business dependencies.
The following is a list of example categories:
Enterprise critical data, such as the BW database
Infrastructure data, such as:
– SAP Kernel
– DB2 UDB code
– Operating system files
Bootable backup to recreate the basic operating system if necessary.
In addition, we recommend that you consider how the runtime of backup and
recovery can be influenced. From the policy point of view, the focus must be on
the maximum available time for recovery. The following considerations apply for
both backup and restore operations:
Use multiple I/O buffers and devices.
Allocate at least twice as many buffers as devices being used.
Do not overload the I/O device controller bandwidth.
Use many buffers of smaller size, rather than a few large buffers.
Tune the number and size of the buffers according to the system resources.
Use the PARALLELISM option.
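As a rough illustration of these sizing rules, the following sketch derives a buffer count from the number of devices and assembles the corresponding backup command. The database name, target paths, and buffer size are assumptions, and the command is only printed as a dry run, not executed:

```shell
#!/bin/sh
# Hypothetical setup: four target devices/filesystems for the backup.
DEVICES=4
BUFFERS=$((DEVICES * 2))       # at least twice as many buffers as devices
BUFFER_PAGES=512               # buffer size in 4 KB pages (many smaller buffers)

TARGETS='/backup/dev1, /backup/dev2, /backup/dev3, /backup/dev4'
# Dry run: assemble and print the backup command.
CMD="db2 backup database BW1 to $TARGETS with $BUFFERS buffers buffer $BUFFER_PAGES parallelism $DEVICES"
echo "$CMD"
```

Tune BUFFERS and BUFFER_PAGES to your actual memory and controller bandwidth; the values here only demonstrate the twice-as-many-buffers-as-devices rule.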
The data is manipulated in the DB2 buffer pool. The db2agent process writes
information about the updates, inserts, and deletes in the buffer pools as log
records into the log buffer. The db2loggr process writes the contents of the log
buffer to the active (online) log file. This is performed:
After a commit or rollback.
Every second.
If the log buffer is full.
If the page cleaner moves updated pages.
When an active log file is filled with records, the DB2 logging user exit db2uext2
copies the log file into log_archive directory. The copied log file is called the
offline log. When all referenced pages have been written to disk, the online active
log file is deleted in the log_dir directory, but a copy of it remains as the offline
log in the log_archive directory. The archive-log consists only of committed
records and can be written to a secondary storage as well.
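In simplified terms, the user exit behaves like the following sketch, which copies a filled log file from log_dir to log_archive and then removes the online copy. The directory layout and log file name are invented for illustration, and a plain cp stands in for db2uext2:

```shell
#!/bin/sh
# Simulate the directories involved (hypothetical layout under a temp dir):
WORK=$(mktemp -d)
LOG_DIR="$WORK/log_dir"
LOG_ARCHIVE="$WORK/log_archive"
mkdir -p "$LOG_DIR" "$LOG_ARCHIVE"

# A filled active log file:
echo 'committed log records' > "$LOG_DIR/S0000001.LOG"

# db2uext2 copies the filled log file to log_archive (the offline log) ...
cp "$LOG_DIR/S0000001.LOG" "$LOG_ARCHIVE/S0000001.LOG"

# ... and once all referenced pages are on disk, the online copy is deleted:
rm "$LOG_DIR/S0000001.LOG"

ls "$LOG_ARCHIVE"
```

The offline copy in log_archive is what must then be safeguarded on secondary storage, as the Important box below stresses.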
Important: Please ensure the safekeeping of the archive log files. In case of
recovery, the database manager may require the backed-up and archived
log files. If an archive file is not available, recovery is not possible.
Using the CLI, only one backup command is used to back up the entire
database. The created backup file includes database system files, data files, log
files, and control information, as examples. Database recovery is also performed
by one restore command, or with the rollforward command. These integrated
features make it easy to manage database backup and recovery.
The first backup should be performed after the installation of SAP BW, described
in the SAP NetWeaver Installation Guide. After installation you should implement
your developed and documented backup/recovery policy and strategy.
Offline backup
The offline backup will be performed based on the Database Instance. Therefore
it involves stopping the database applications for the particular Database
Instance. When the backup is complete, the database and/or applications can be
restarted. During an offline backup, no dialog or background activities can run.
Therefore, you should schedule sufficient time to perform the SAP BW offline
backup so that it will not interfere with other required activities.
The database shutdown can be executed from the CLI by executing the
command db2 deactivate database <SAPSID>. During shutdown, the DB2 I/O
servers write the buffer pools content to disk, and then stop the database
processes.
Entering the command db2stop in the UNIX command line will stop the DB2
Instance. If applications are still running, the Database Instance cannot be
stopped. To check the status of the applications, enter the command db2 list
applications (as shown in Example 7-3).
An offline backup can be started by entering the db2 backup database command.
This command should be run after all applications previously connected to the
database are disconnected. See Example 7-3.
Example 7-3 Offline backup in CLI
The screen in Figure 7-23 shows how to define a full database offline backup to
/sapmn/BW1/backup, which is a 100 GB filesystem.
While the offline backup is running, the job log reports the single steps, as
depicted in Figure 7-24. The job log shows that SAP automatically stops the
database to end all applications and transactions. Then the database is
restarted, and the offline backup begins to run. As the log shows, the full offline
backup ran for 28 minutes to back up 73 GB using the selected definitions of:
Eight parallel processes
Eight buffers
Online backup
The second type of DB2 UDB backup is the online backup. It is performed by the
DB2 backup utility. It can run in parallel with ongoing activities of database
applications, such as R/3 work processes.
Because transactions are executed during the runtime of an online backup, the
created backup image cannot provide a consistent state of the database. In the
case of a restore of an online backup, all changes that occurred during the online
backup must be reapplied or rolled back based on the log files written during the
online backup. This is called a roll-forward operation.
An online backup test was executed according to the entries in Figure 7-22 on
page 221. We defined six parallel processes and scheduled the Online Backup
with the Planning Calendar, as shown in Figure 7-25. The job ran on Monday the
8th of March, at 17:00.
As depicted in Figure 7-25, the backup run was not successful. By clicking the
entry in the Planning Calendar, you can access the corresponding detailed job
log. It shows the ABAP program that was called and the specified parameter. In
addition, it shows the problem. The parameter, LOGRETAIN, was not set. In a
production environment, setting the parameter LOGRETAIN to on is mandatory.
After the Online Backup you should create a backup of the recovery log files as
well. Without the log files that were active during backup, you cannot recover the
database to a consistent state. You cannot back up these log files immediately
after the database backup, because they are still in an active state.
DB2 UDB provides the archive log command to close and archive active logs.
The database administrator can acquire a complete set of log files up to the point
in time when the command is executed. This command can be used only in the
archive logging mode. Example 7-4 demonstrates that you can execute this
command without a database connection.
Example 7-4 Archive active log
C:\PROGRA~1\SQLLIB\BIN>db2 archive log for database sample
DB20000I The ARCHIVE LOG command completed successfully.
Tablespace backup
DB2 UDB also provides a tablespace level backup. It is supported when the
database is in archive logging mode. Tablespace backup involves backing up
the history file, system tablespace, and user tablespaces. You can take only one
tablespace, or several tablespaces together, in one backup file. If the size of one
backup file exceeds the operating system limit, the database will create another
backup file that is sequenced automatically.
From the recovery point of view, you should consider backing up all the related
tablespaces together. The relation to other tablespaces is based on primary and
secondary key relations, on referential integrity (RI), or on triggers.
Because of the complexity of the SAP BW database, SAP does not recommend
a tablespace backup.
The success of backups is displayed in the SAP Planning Calendar and the
assigned job logs. In addition, you can verify the DB2 controlled backup
information using the db2 list history backup all for <SID> command
(BW1 was the SID in our installation).
DB2 UDB provides the db2ckbkp command, usable at the operating system level.
The db2ckbkp utility enables you to check whether a backup is recoverable, that
is, whether you can actually restore it. A typical backup should be restored at
least once to test whether or not the backup and restore process itself is
operating correctly. There is no way to restore an unsuccessful, incomplete
backup.
The db2ckbkp checks the integrity of the backup image and determines whether
or not it can be restored. Some, or all, parts of the backup can be checked. The
backup image must reside physically on the disk.
For an SAP BW System, you will need a large pool of tapes for the entire backup
cycle. SAP recommends having 30% more tapes than required. You can reuse
the backup tapes at the end of a backup cycle (after 28 days).
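The 30% safety margin translates into simple arithmetic; the cycle size below is a hypothetical example:

```shell
#!/bin/sh
# Hypothetical: 40 tapes are needed for one 28-day backup cycle.
TAPES_REQUIRED=40

# SAP's recommendation: keep 30% more tapes than strictly required.
# Integer arithmetic, rounded up: pool = ceil(required * 1.3)
TAPES_POOL=$(( (TAPES_REQUIRED * 13 + 9) / 10 ))
echo "Tape pool size: $TAPES_POOL"
```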
When an online log file (active log) is filled with log records, the database starts
the DB2 logging user exit. As a result, the log file is written to the log_archive
directory. In addition, the logging tables in the administration database
ADM<SAPSID> will be updated.
Important: Do not delete the offline log files from the log_archive directory
without performing a backup of them. For safety reasons, SAP recommends
that you back up two copies of each offline log file onto separate tapes.
Remember: An online backup is not usable without the log information that
was generated during the database backup.
If, in case of recovery, you need all of the required offline log files, SAP
provides the BRARCHIVE program for DB2 UDB. You can use it to archive
offline log files to persistent storage media such as tape, or a backup server such
as IBM Tivoli Storage Manager or Legato.
BRARCHIVE retrieves the range of log files which are to be archived from the
administration database ADM<SAPSID>.
db2bi:bw1adm 6>
Tablespace ID = 34
Name = BW1#ODSI
Type = Database managed space
Contents = Any data
State = 0x10000
Detailed explanation:
Write Suspended
db2bi:bw1adm 6>
You can use this write suspended status to create a Volume Based Backup using
storage facilities such as:
Snapshot or flashcopy with the IBM Enterprise Storage Server® (ESS)
Concurrent copy
EMC timefinder
Attention: Use the volume based backup and recovery functions only based
on a well developed, documented, and tested policy.
These kinds of volume based copies are established in a short time (minutes).
The write suspend command can be used only when the database is in archive
logging mode. However, the snapshot features of storage vendors can still be
used with a database in circular logging mode to create a database copy, with
only a short shutdown time of the database.
Control Center
The DB2 UDB package includes the Control Center. This GUI facility also
provides support for DB2 administration activities. However, it is recommended
that you use the SAP provided DB2 administration facilities.
After a database failure, the database manager automatically rolls back the
effects of incomplete transactions and reapplies committed transactions whose
changes were still in memory, but not yet on disk, when the failure occurred.
This puts the database back into a consistent state.
During recovery, the database manager uses an active log file and the log
control file (sqlogctl.lfh). Keeping these two files safe is very important for
the database startup after a database failure. If the log control file becomes
corrupted, there are two ways to recover. You could use the db2dart utility to
export data from the containers directly, or you could contact your IBM service
representative for assistance.
If the database is in circular logging mode, the restart operation will fail if a
tablespace that is needed for recovery is damaged. The damaged tablespaces
are recorded in the db2diag.log file. You can restart the database again with
the following command:
db2 restart database database_alias drop pending tablespaces
(tablespace_name)
But you should drop this tablespace after the database is restarted. This
tablespace cannot be used again without a database restore.
Version recovery
Databases that are not enabled for archive logging are said to be
non-recoverable databases. Only version recovery can be performed. Version
recovery restores a previous version of the database using an image created by
an offline backup process. No log files are applied, and any transactions
committed after the backup are lost. All users must disconnect from the database
for version recovery.
When a database is corrupted, you can check the latest backup information
using the list history backup command and restore the database backup
using restore command.
Figure 7-28 shows how the units of work performed after the last backup are
reapplied. The restored database uses n archived log files and 1 active log file.
A successful database restore puts the database in roll-forward pending state
if the backup was taken online. The rollforward command is used to release
the pending state.
Figure 7-29 shows a corrupted tablespace being recovered from a tablespace
backup. After a tablespace roll-forward recovery, the recovered tablespace
remains in backup pending state unless the restored tablespace was rolled
forward to the end of logs. The tablespace becomes usable when you take a
tablespace backup.
The corrupted tablespaces are rolled forward to the point of failure. This puts
the tablespace in a not-pending state.
You can recover the history file from a database or a tablespace backup file, for
example, by using the restore command with the history file option.
In this situation, the restore command is used with the incremental option. The
last delta backup is restored two times during the incremental restore operation.
During the first restore, only the initial data is read from the images and the
complete image is read and processed only during the second restore.
You can also use the automatic recovery function. The last image that you want
to restore should be specified in the automatic recovery command. Each of the
steps described above is then performed automatically using the database
history file. Keeping the history file safe is therefore very important for the
automatic incremental recovery. If the history file cannot be used, you must
perform a manual incremental recovery.
For example, the command: db2 restore database BW1 incremental automatic
from /sapmnt/BW1/backup taken at 20040308154417 performs an automatic
incremental recovery.
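If the history file is unusable and you fall back to a manual incremental restore, the sequence of restore commands looks roughly like the following. The timestamps and the image chain (one full backup, one delta) are invented for illustration, and the commands are only printed as a dry run:

```shell
#!/bin/sh
# Hypothetical backup timestamps: full backup, one delta, and the last
# (target) delta image.
FULL=20040301120000
DELTA=20040305120000
LAST=20040308154417

# In a manual incremental restore the last image is processed twice:
# once first (to read its header) and once again at the end of the chain.
SEQUENCE="$LAST $FULL $DELTA $LAST"
for TS in $SEQUENCE; do
    # Dry run: print the CLP command rather than executing it.
    echo "db2 restore database BW1 incremental taken at $TS"
done
```

This mirrors the behavior described above for the automatic case, where the last delta image is also read twice.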
Attention: You should use the incremental backup and recovery only if based
on a well developed, documented, and tested policy.
Statistics, in addition to the standard database and operating system objects, are
available for other BW objects such as:
FACT tables
InfoCubes
Aggregates
Because of the selected roles, after log-on the SAP GUI provides the
transactions and launchpad depicted in Figure 7-32.
Of course you should go through all processing, data target, update, and
schedule specifications thoroughly. To specify the data targets, the SAP BW
system provides visualized graphics to guide you to the correct target.
The graphic shown in Figure 7-35 helps to define which objects are loadable
from flat file. In addition, you can select a sequential or parallel load.
RSSM BW Authorizations
Data archiving moves mass data from the database to an archive file or an
external archiving system. SAP BW provides archiving transactions and facilities
such as:
Transactions
Programs
Prepared archiving objects
Interfaces
These facilities are used to house BW data which is no longer required online,
but must be accessible later for legal or internal auditing.
Figure 7-38 Define the Archiving Object: Define the File Structure
Figure 7-39 Define the Archiving Object: Specify the File Name
After the archiving object is defined, or selected if it exists, the archiving process
can be started. Archiving runs as a job in the background, therefore you need at
least two background work processes and the standard definitions for a job, such
as:
The start time, immediately or at a specified time
The start conditions
The associated printer
Depending on the definitions, the archiving job creates one or more archiving
files, writes the archive objects into the created flat files, and deletes the archived
objects in the database.
Important: Start an archiving project to define, document, and test all the
single steps, as well as the complete environment.
Using the Archive Link Interface, you can archive the files directly in an archiving
system such as IBM Common Store.
668602 SAP software on UNIX: OS dependencies 6.40
  SAP Software on UNIX: OS Dependencies. SAP Web AS 6.40.
146289 Parameter recommendations for 64-Bit SAP Kernel
  The 64-Bit-Kernel for your SAP system has been implemented, and the SAP
  profile parameters must be adjusted.
425207 SAP memory management, current parameter ranges
  It is not clear which maximum values can be assigned in which kernel release
  to specific parameters for the SAP memory management.
27617 Syslog Q0G, scratch area too small
  Syslog message Q0G / request (type NOWP) cannot be processed.
Table 8-2 Notes related to configuration of DB2 UDB ESE with SAP BW
374502 DB6: DB2 UDB ESE - BW performance - overview of notes
  Summarizes all notes about current performance problems related to DB2 UDB
  ESE on BW and APO.
370026 DB2 V7.2 parameter settings for R/3 4.x/6.10/6.20
  Describes best settings for DB2 V7 database and database manager parameters.
584952 DB2 UDB ESE V8.1 standard parameter settings
  Describes best settings for DB2 V8 database and database manager parameters.
302429 DB6: Performance on BW 2.0B, 2.1C, 3.0A
  Describes how to tune the DB2 database for BW 2.0B, 2.1C, and 3.0A.
546262 DB6: Administration & performance on SAP BW, SAP SCM, SAP SEM
  Describes how to administer and do performance tuning for SAP BW, SAP SCM,
  and SAP SEM (release independent).
583604 DB6: Regular updates of database statistics of BW tables
  Describes how to update database statistics via the DBA Planning Calendar
  (DB13).
147634 DB6: Hints and tricks for creating DB2 tablespaces
  Describes how to create new DB2 tablespaces with special focus on
  performance aspects.
362325 DB6: Table conversion with DB6CONV
  Describes how to use Report DB6CONV to move tables from one tablespace to
  another. DB6CONV also allows you to repartition tables as described in note
  648432.
648432 DB6: New function modules for hash partitioning
  Describes how to check and, if necessary, correct the partitioning keys for
  fact, ODS, and PSA tables. This is important if you want to add additional
  database partitions to your system and redistribute your data. These checks
  are also important after a heterogeneous system migration from another
  database platform to DB2 UDB.
544274 FAQs: DB2/UDB 64-bit support
  Information about usage of / migration to DB2/UDB 64-bit.
356828 DB6: Migrating to 64 bit for AIX and Solaris
  Explains how to migrate a 32-bit SAP/DB2 UDB ESE system to 64 bit.
156829 DB2-Explain with db2exfmt
  SQL statements should be examined for cost and efficiency.
409127 DB6: DB2 UDB Dual Logging
  DB2 UDB Dual Logging is supported with DB2 V7.1 FP3b and later.
454173 DB6: R3load migration accelerated through CLI LOAD
  How to accelerate a system copy or system migration of a large database using
  R3load. The target database platform is DB2 UDB for UNIX and Windows (here
  DB6).
486951 DB6: DB2 CLI trace
  For database problems that were, or are going to be, forwarded by SAP to IBM
  support, it may be useful to create a CLI trace of the problem situation.
72557 Logging in DB2 UDB systems
  Presentation of the logging behavior of the database.
92495 Examination of lock situations (SQL0911)
  The SQL error message SQL0911 occurs in the database, with additional
  information deadlock (reason code 2) or lock time-out (reason code 68).
  Explains how the cause of these lock situations can be examined.
522550 DB6: Moving small tables to other tablespaces
  Describes how to move small tables to another tablespace. This may be
  necessary, for example, to eliminate LONG VARCHAR fields in database tables
  by moving them to tablespaces with larger page sizes.
567745 Composite note BW 3.x performance: DB-specific settings
  Describes how to improve the performance of your BW 3.x System with regard to
  database-specific features and settings.
567746 Composite note BW 3.x performance: Query & Web Applications
  This composite note deals with performance relevant topics in the area of
  queries and Web Applications.
567747 Composite note BW 3.x performance: Extraction & loading
  This composite note deals with performance relevant topics in the area of
  extraction and loading.
192658 Setting basis parameters for BW Systems
  Provides recommendations for setting basis parameters for SAP BW. Only an
  optimal setting of these BW basis parameters guarantees an error-free running
  SAP BW System and good overall system performance. The recommendations for
  BW Systems deviate in certain respects from the recommendations for R/3
  Systems.
130691 Collective note for BW: tips & tricks
  This note provides tips and tricks for all functional areas of SAP BW.
130253 Notes on upload of transaction data into the BW
  Describes steps to make mass data upload into the BW System as efficient as
  possible, by increasing number range buffers, for example.
Table 8-4 SAP BW parameters that influence log space consumption of BW operations
Aggregate Build - BLOCKSIZE
  Determines the number of records that are inserted in one UOW. It is defined
  in table RSADMINC. Use transaction SPRO to configure this parameter. If no
  value is defined, it defaults to 100,000,000! We suggest a value between
  1 million and 10 million.
Changerun - BLOCKSIZE
  Parameter BLOCKSIZE also determines the UOW size for the changerun. See
  details above.
Changerun - DELTALIMIT
  Defines a threshold that determines how the changerun is carried out. If the
  percentage of records that have to be changed during the changerun is larger
  than DELTALIMIT, the aggregate will be completely rebuilt. Otherwise the
  records to be changed will be updated.
Dataload from file into PSA or InfoCube - IDOCPACKSIZE
  Determines the number of records that are inserted in one UOW. It is defined
  in table RSADMINC. Use transaction SPRO to configure this parameter.
Dataload from SAP R/3 Source System into PSA and/or InfoCubes - MAXSIZE
  Determines the amount of data (kilobytes) that is loaded into each data
  target in one UOW. The required log space for loading an InfoPackage is
  <MAXSIZE * MAXPROCS * number of data targets>. Data targets can be PSA only,
  or PSA and InfoCubes concurrently. MAXSIZE and MAXPROCS are defined in table
  ROIDOCPRMS in the source system.
Deletion of InfoPackages - DB6_DELETE_PIECEWISE
  See Table 8-5 on page 256.
Example 8-1 lists the parameters used to calculate the logspace consumption
during the build of an aggregate table.
Example 8-2 lists the parameters used to calculate the logspace consumption
during an SAP R/3 data load.
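As a sketch of the arithmetic behind these examples, the following computes the log space needed for an R/3 data load from MAXSIZE, MAXPROCS, and the number of data targets, per the formula in Table 8-4. All parameter values are hypothetical:

```shell
#!/bin/sh
# Hypothetical values from ROIDOCPRMS and the InfoPackage definition:
MAXSIZE=20000      # KB loaded into each data target per UOW
MAXPROCS=3         # parallel extraction processes
DATA_TARGETS=2     # for example, PSA plus one InfoCube

# Required log space for the load: MAXSIZE * MAXPROCS * number of data targets
LOGSPACE_KB=$((MAXSIZE * MAXPROCS * DATA_TARGETS))
echo "Approximate log space: $LOGSPACE_KB KB ($((LOGSPACE_KB / 1024)) MB)"
```

Such a back-of-the-envelope check helps verify that the configured log file size and number of log files can absorb a planned load.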
In the next section we begin to address the topics of performance analysis and
tuning.
We mainly focus on basic SAP BW and database tuning concepts: for example,
what should be considered when setting up the system, and how to monitor the
system and analyze performance problems.
SAP memory
There are six memory areas which can be configured for an SAP R/3 Instance.
The following list is a brief explanation of each of them:
R/3 buffer: This stores global data that can be accessed by all users, for
example, program code, tables, and field definitions.
R/3 roll memory: This consists of two parts, the local roll area and the roll buffer. The local roll area is the local memory of a work process, which stores the initial part of the user context at the beginning of a transaction step. The R/3 parameter ztta/roll_first specifies the amount of context loaded into the local roll area. When the transaction step is complete, the context is moved to the roll file, which stores the roll area on the hard disk. To reduce access time, the data written to the roll file is buffered in the roll buffer, an instance-level shared memory area that saves the user context during roll-out.
R/3 extended memory: This is the shared memory which stores the main
part of user context. It is the core of R/3 memory management system.
R/3 heap memory: This provides an area for user context if it becomes too
large for the local roll area. It is dynamically allocated as needed and is
released as soon as a transaction is completed.
R/3 paging memory: This stores temporary data exchanged between user
contexts.
Local memory of R/3 work processes: This stores executable programs,
local data, R/3 cursor cache and so on.
When configuring R/3 memory, the goal is to ensure system stability as well as
performance. There are a few aspects that need to be considered when
optimizing R/3 memory, for example, available physical memory, swap space
and operating system limits. To be able to achieve optimal performance, physical
memory should be large enough for R/3, database and operating system. Swap
space must be sufficiently large to enable the creation of required memory areas.
A good rule of thumb is to set up swap space that is three times the amount of
physical memory available. A 32-bit architecture limits the address space that
can be used by R/3 work processes, while a 64-bit system is limited only by
installed physical memory.
If too many work processes are configured for one application server, contention for available system resources, such as CPU, the R/3 dispatcher, the R/3 program buffer, and the database, is very likely. However, if the number of work processes is too small, user requests will have to wait in the R/3 dispatcher queue. Thus the purpose of tuning work processes is to ensure that all R/3 system components interact smoothly, neither overused nor underused, so that optimal performance is achieved.
The proper configuration of the number of R/3 work processes relies on detailed
analysis of database request time, wait time of the dispatcher, roll time and R/3
buffer contention, etc. Generally the number of R/3 update processes should be
as small as possible while still being able to keep pace with the dialog processes.
Number of instances
The number of R/3 Instances per server depends on hardware configuration as
well. Generally speaking, for more than four processors and 4 GB of main
memory, it is reasonable to install more than one R/3 Instance on the application
server.
NGRP_ODS_<SAPSID>
NGRP_DIM_<SAPSID>
NGRP_FACT_<SAPSID>
For more details, please refer to the SAP Administration Guide: SAP BW 3.5
Administration Tasks in Multi-Partition Installations: IBM DB2 Universal Database
for UNIX and Windows.
Typically, raw device containers yield higher I/O throughput than file containers,
since the additional layer of the file system buffer cache is avoided by raw
devices. So from a performance point of view, raw devices should generally be
preferred to files. On the other hand, there are situations where the file system
buffer cache can serve I/O requests that would otherwise have to go to disk. In
particular, tablespaces for long data (LONG VARCHAR or LOB data types)
should use file containers, since long data is not cached in the buffer pool.
An additional point in favor of DMS file tablespaces is that raw devices are sometimes considered harder to handle from an administration point of view. In our test system, all tablespace containers were file-based.
Memory management
For details of DB2 structure and memory management, please see references 4 and 5 on page 371, under Other Publications. The following list contains a few DB2 database parameters relevant to memory management:
BUFFPAGE / buffer pool size: Each tablespace is assigned to a buffer pool
that stores data pages in memory. The database configuration parameter,
BUFFPAGE, defines the size of the default buffer pool – IBMDEFAULTBP.
The basic rule is that indexes and data should stay in the buffer pool as much
as possible to avoid disk activity. But an excessively large buffer pool is not a
good use of memory since there are generally diminishing benefits once the
buffer pool becomes a significant fraction of the active data volume.
For more details, please check SAP Administration Guide: SAP BW 3.5
Administration Tasks in Multi-Partition Installations: IBM DB2 Universal
Database for UNIX and Windows.
– Creating new buffer pools: You can create buffer pools using the DB2 interface or the SAP BW interface (transaction DB6DBM). Here we describe how to do this at the DB2 level.
i. Log on to the system as db2<sid>.
ii. Open a DB2 command prompt window and issue the following command:
create bufferpool <buffer_pool_name> database partition group
<partition_group_name> size n pagesize 16K
Example:
create bufferpool BP_AGGR_BW1 database partition group
NGRP_AGGR_BW1_4P size 7500 pagesize 16K
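As a sanity check on buffer pool definitions like the example above, the memory a buffer pool allocates is simply the page count times the page size. A minimal sketch:

```python
def bufferpool_mb(num_pages: int, pagesize_kb: int) -> float:
    """Memory allocated by a buffer pool: number of pages times page size."""
    return num_pages * pagesize_kb / 1024

# The example above: size 7500 pages at a 16 KB page size
print(bufferpool_mb(7500, 16))  # 117.1875 (MB)
```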
SORTHEAP: This defines the quantity of memory pages to be used for sorts.
Each agent has a separate sort heap that is allocated as needed, as sorts are
performed. When sorts spill over from the sort heap to disk (generally an
indicator of a SORTHEAP that is too small), performance can degrade
significantly.
When increasing SORTHEAP size, check to see if the SHEAPTHRES
parameter also needs to be adjusted. This is the threshold for the total
amount of sort memory in the system. The value of SHEAPTHRES should be
larger than SORTHEAP times the average number of concurrent sorts.
The snapshot monitor can be used to monitor the total sort heap allocated, total sorts, active sorts, and the number of overflow sorts.
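The SHEAPTHRES rule of thumb above can be sketched as a simple check. The parameter values below are illustrative, not recommendations, and the sketch assumes both parameters are expressed in the same unit (pages):

```python
def sheapthres_sufficient(sheapthres: int, sortheap: int, avg_concurrent_sorts: float) -> bool:
    """Check the rule of thumb: SHEAPTHRES should be larger than
    SORTHEAP times the average number of concurrent sorts."""
    return sheapthres > sortheap * avg_concurrent_sorts

# Illustrative values: SORTHEAP = 2048 pages, about 20 concurrent sorts
print(sheapthres_sufficient(20_000, 2048, 20))  # False: threshold too small
print(sheapthres_sufficient(50_000, 2048, 20))  # True
```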
Transaction logging
All transactional changes to data and index pages are written to the log, via the
log buffer. The log buffer contents are written to disk when a transaction commits,
the log buffer is full, or write-ahead logging is triggered.
LOGBUFSZ: This specifies the size of the log buffer, and should be
increased if there is considerable read activity on the log disk due to
rollbacks. Ideally, the log pages read counter in the snapshot monitor should
be as close to zero as possible.
I/O parameters
The following list describes a number of I/O parameters.
NUM_IOCLEANERS: This specifies the number of asynchronous page
cleaners for a database. These page cleaners write changed pages from the
buffer pool to disk before the space in the buffer pool is required by a
database agent, ensuring an adequate supply of clean pages.
This parameter should typically be set to a value between one and the
number of physical disks used for the database. Environments with high
update transaction rates or very large bufferpools may require more page
cleaners to be configured. However, too many I/O cleaners can impact the
performance negatively.
The value asynchronous pool data/index page writes should be as close as
possible to buffer pool data/index writes. If there are a significant number of
synchronous writes or dirty page steal cleaner triggers, then the value for
NUM_IOCLEANERS should be increased.
NUM_IOSERVERS: This specifies the number of I/O servers (or prefetchers) for a database. These prefetch database pages into the buffer pool before they are required by database agents, improving performance.
Both page cleaners and prefetchers are taken from a shared pool and are not
exclusive to a particular buffer pool. The considerations for tuning page
cleaners and prefetchers setting are typically the same regardless of whether
or not a system has one buffer pool or multiple bufferpools.
Locking
Maximizing system concurrency is very important. SAP BW uses the lowest
isolation level available with DB2, which is uncommitted read, to maximize
concurrency. The following DB2 parameters can influence system concurrency:
LOCKLIST: There is one lock list per active database, containing the locks
held by all connected applications. LOCKLIST specifies the lock list memory
allocated. Once the list is full, performance can degrade due to lock
escalations and decreased concurrency. The database monitor reports lock
escalations, and also the high-water mark of the lock list (to avoid
over-allocating memory). If the high-water mark exceeds fifty percent of the
defined LOCKLIST size, consider increasing it. Note that on a 64-bit
database, each lock is 56 bytes, versus 36 bytes on a 32-bit database.
MAXLOCKS: The MAXLOCKS parameter defines a percentage of the lock
list that each application can use before lock escalations occur. When setting
MAXLOCKS, you should consider the size of the lock list and number of
concurrent applications.
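Using the lock sizes given above (56 bytes per lock on a 64-bit database, 36 bytes on 32-bit), the capacity of the lock list and the per-application escalation threshold can be estimated. A sketch, assuming LOCKLIST is expressed in 4 KB pages and using illustrative values:

```python
def max_locks(locklist_pages: int, is_64bit: bool) -> int:
    """Approximate number of locks that fit into the lock list,
    assuming LOCKLIST is given in 4 KB pages."""
    lock_bytes = 56 if is_64bit else 36
    return (locklist_pages * 4096) // lock_bytes

def escalation_threshold(locklist_pages: int, maxlocks_pct: int, is_64bit: bool) -> int:
    """Number of locks one application may hold before lock escalation occurs."""
    return max_locks(locklist_pages, is_64bit) * maxlocks_pct // 100

# Illustrative: LOCKLIST = 1000 pages, MAXLOCKS = 10%, 64-bit database
print(max_locks(1000, True))                 # 73142 locks total
print(escalation_threshold(1000, 10, True))  # 7314 locks per application
```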
Optimizer
DB2 uses a very sophisticated cost-based optimizer. It estimates the execution
cost of each alternative data access plan by using statistics for tables and
indexes, and information regarding tablespace I/O characteristics. It then
chooses the plan with the smallest estimated execution cost. DB2 provides a
range of optimization classes from class 0, which uses a minimal amount of
optimization, to class 9, which uses all optimization techniques.
Configuration: DB03
Backup and Recovery: DB12
Diagnostics: DB6COCKPIT
Figure 8-3 depicts a simplified view of how SAP systems access data. Involving
too many elements in the data access process can result in runtime and
performance problems. As examples:
1. The user or job is assigned to a dialog or background work process.
First the SAP buffers, located in the application server virtual memory, are checked to see if the needed data is already loaded. If the requested data is available in the SAP buffers, this results in faster response times.
2. If the requested data is not residing in the SAP buffers, the work processes access the data from the database server using the database interface. If the data is already in the DB2 buffer pool, no I/O operation is necessary.
3. An I/O access to the disk results in a longer response time.
Figure 8-3 Simplified view of SAP data access: SAP work processes check the SAP buffer first, then the DBMS with its DB2 buffer pools, and finally the database files on disk
The SAP BW system provides monitoring and workload statistics functions for all single steps and areas of the different access types. The results of the monitors are kept for trend analysis and time-related comparisons as well.
Figure 8-4 shows that the CPU utilization related to the threshold, as a 15 minute
average value, is too high. Additionally, the CCMS System Monitor shows paging
and memory problems as well. Selecting the three elements — the CPU
utilization, in particular — the monitor lists seven alerts for further investigation.
The tree elements related to the database provide an impression of the available statistics. In Figure 8-5, some of the higher level elements are extracted, such as:
DB2 UDB
Performance
the partitions 000 and 001
As the monitor indicates, partition 000 has lock and cleaner problems, and partition 001 has DB2 buffer pool and cleaner problems. By clicking the associated nodes, SAP BW starts an assigned method for analysis. This method can be, for example, an analysis or even a predefined action.
Figure 8-7 shows the selected monitor tree for the BW environment. The BW
Monitor provides all of the information about the gathered alert statistics. For
example, it informs you about the state of:
DB2 UDB (green)
ALE communication and IDocs (green)
the BW Process Chains (red)
The illustration shows that failures are propagated to the higher tree levels. For example, the BW monitor is marked red because some of the PSA loads did not run successfully.
By clicking the first entry that is marked red on the lowest level, the monitor
automatically performs the specified analysis method. In this example, it is an
action to display the Process Chain Maintenance Planning View for loading the
PSA. The result is visualized in Figure 8-8.
By selecting the red marked object 1990-3 in Figure 8-8, the monitor displays the
log of the PSA Load job for final analysis. Then after the analysis and correction
of the problem, you can resolve the alert.
First we give an overview of the available monitors, then we focus on the DB2
UDB transactions.
Definition:
SAP number of the semaphore which blocks the work process:
7: PAGING semaphore
8: NO_BUFFER semaphore
9: STAT semaphore
16: DB_TBUFF semaphore
17: DB_SYNC semaphore
18: DB_TTAB semaphore
19: DB_SNTAB semaphore
20: DB_IREC semaphore
21: DB_FTAB semaphore
22: LOGFILE semaphore
23: REQ_QUEUE semaphore
24: DB_TBUFF_P semaphore
30: DB_CUA_BUFFER semaphore
Figure 8-9 shows, in the left pane, the Analysis Views with which you can analyze, for example, specified time periods, the top DB accesses, or even a single transaction performed using the Transaction Profile.
– ST02 helps to analyze the SAP buffering hit-rate and memory allocation.
In the Detail Analysis Menu you get additional information about the
virtual storage usage and allocation and the SAP Profile parameters.
After analysis, perform the necessary tuning activities and document the results.
Figure 8-10 shows, in the left pane, the available data categorized by database elements: partitions, database, schemas, tablespaces, tables, applications, SQL cache, lockwaits, and deadlocks. Additionally, you have access to the collected history. The right pane lists the four partitions and the performance related columns. By selecting one of the partitions shown in Figure 8-10, SAP will provide more detailed data about that particular database partition.
Total Size Buffer Pools: Size in KB of all buffer pools used for a partition.
Index Physical Reads: Number of disk read accesses to indexes. This value includes the number of synchronously read index pages.
Avg. Phys. Read Time (ms): Average read time from disk to buffer pool in milliseconds.
Avg. Phys. Write Time (ms): Average write time from buffer pool to disk in milliseconds.
Package Cache Size: Heap memory size used for static and dynamic SQL statements.
Package Cache Quality (%): The hit ratio for the package or catalog cache. Should be greater than 95%.
Figure 8-11 illustrates that you may select the available DB partition related
performance data since the last reset of the snapshot, or since the start of the
DBMS. As an option, in the right pane you can select the DB partition. This
snapshot gives performance related information such as:
Cache usage and hit-rate
Lock waits and lock escalations
Number of logical and physical reads and writes
DB logging values
Figure 8-12 lists a table snapshot. It shows, for example, that some of the tables
have a lot of overflow access. This column provides the number of accesses
(reads and writes) to overflow rows of the table. The overflow rows indicate that
data fragmentation has occurred.
If this number is high, the transaction accessing this table will experience long
response times. This will be especially true if the table is not a read-only object. It
may be possible to improve performance by reorganizing the table. This can be
done by using the reorg utility, which cleans up the fragmentation.
By selecting the second statement in the right pane, where the statements are
sorted according to the Total Execution Time, the SQL statement and the used
access plan will be explained as shown in Figure 8-14. The Estimated Costs
value is marked yellow, and the #key column is marked green.
When you click the Collect button shown at the top of Figure 8-14, the BW
system collects all of the necessary data to analyze the SQL.
As depicted in Figure 8-15, you are asked to select the type of information
needed. Included are such things as:
DB configuration values and level
Tablespace configuration
Table structure
Statistics
Explain
All the values can also be displayed on the SAP GUI (SAP Graphical User
Interface) by selecting Display Download on Screen — see Figure 8-16.
Figure 8-17 The collected and downloaded information for the SQL Explain
The files include all of the detailed information including the database
configuration for the partitions. Using the download facility, SAP BW provides a
complete set of information related to a particular performance problem.
Additionally you might use the result to request remote support from SAP or IBM.
DB2LEVEL File
Node type = Enterprise Server Edition with local and remote clients
Tablespace ID = 0
Name = SYSCATSPACE
Type = Database managed space
Contents = Any data
.
.
.
Missing Indexes
Checking the consistency between the ABAP Dictionary and the DB2 UDB
Catalog is provided by the SAP BW diagnostic functions. This is depicted in
Figure 8-18. The left pane shows the open diagnostic tree, and the right pane
lists the results of the consistency check.
This example shows that some rows have a green check mark indicating that
they are OK. But, there are eight secondary indexes listed as missing in the
database. Selecting the Info symbol in Figure 8-18 will produce an explanation of
the problem as provided by the system.
The system explanation for the missing indexes is shown in Figure 8-19.
Running the Utility for Database Indexes, depicted in Figure 8-20, will check the consistency. The results, shown in Figure 8-21, indicate that there really is no missing index problem.
For performance analysis activities, you will need to see the response times as
well. Similar to the Database Accesses Monitor, you can select the Application
Monitor Response Times. Then by looking at the details you will find the
programs with higher than expected response times.
Figure 8-24 and Figure 8-25 show the results of the application monitors for Database Memory and SAP Buffering. Using these monitors, you are able to find those instances where there is a lack of memory and those where there are buffering problems related to individual ABAP application modules. Those instances are listed in the bottom segment of each of the displays.
Next, we briefly explain some scaling steps we performed during this project:
Re-partition the database from 1 to 4 partitions: Multiple partitions provide
several advantages when running on an SMP (symmetric multi-processor)
server. Different operations, such as SAP BW queries, table reorganization,
update statistics, and backup/restore, benefit from inter-partition parallelism
and realize improved performance.
Attach additional servers to the database system and move existing database partitions to those additional servers: The ability to add servers to a database system, to increase SAP BW performance at the database level, is a unique feature of DB2. Refer to 4.1.1, “Scalability” on page 82 for more details.
Installing the SAP BW system with only one database partition has the following
considerations:
You are not taking advantage of inter-partition parallelism. If you are running
on an SMP (symmetric multi-processor) server, some single running
operations may not fully utilize all processors on the server.
To attach additional servers later on, you have to add database partitions and
then redistribute PSA, ODS, and Fact tables over all the partitions.
Configuring multiple database partitions from the beginning avoids the task of
redistributing the data later on. And you can immediately take advantage of
inter-partition parallelism, which results in improved performance for many
database operations.
To redistribute data, SAP provides the tool db6conv. Instead of using the DB2 redistribute command, this tool uses a statement of the form insert into ... select ... from ... not logged initially to move PSA, ODS, and Fact tables from non-partitioned to partitioned tablespaces, and is therefore much faster.
After using the db6conv tool to redistribute tables, take a backup of the database,
because of the not logged initially option used.
If you initially start with multiple partitions and want to attach additional servers to
your database system, you can move partitions to the additional servers instead
of adding new partitions. This avoids the need of redistributing PSA, ODS, and
Fact data.
Figure 9-1 shows the database system after partitioning the database. As you
can see, all data is on the same disk subsystem. ODS, PSA, InfoCube, and
aggregate fact tablespaces are distributed over all four partitions.
Figure 9-1 Four database partitions on one server (logical partitioning)
Figure 9-2 Four database partitions on two servers
The new partitions will NOT be listed by the command db2 list dbpartitionnums at this time, because there is no partition group that references them.
You can create a partition group using SAP BW with transaction DB6DBP.
In the following steps we use the DB2 command line processor:
You can create a tablespace using SAP BW with transaction DB02. In the
following steps we use the DB2 command line processor:
1. Log on to the database system as user db2<sid>.
2. Open the DB2 command prompt and enter the following commands:
create tablespace <SID>#<tablespace_name>
in database partition group <partition_group>
pagesize <page_size> managed by database
using (file '<container_1>' <container_size>)
on dbpartitionnum (<partition_number_1>)
using (file '<container_2>' <container_size>)
on dbpartitionnum (<partition_number_2>)
...
using (file '<container_n>' <container_size>)
on dbpartitionnum (<partition_number_n>)
extentsize <extentsize> prefetchsize <prefetchsize>
bufferpool <bufferpool>
Example:
create tablespace BW1#AGGRD_4P in database partition group NGRP_AGGR_BW1_4P
pagesize 16384 managed by database
using (file '/db2/BW1/sapdata1/NODE0000/BW1#AGGRD_4P.container000' 128000) on dbpartitionnum (0)
using (file '/db2/BW1/sapdata1/NODE0001/BW1#AGGRD_4P.container001' 128000) on dbpartitionnum (1)
using (file '/db2/BW1/sapdata1/NODE0002/BW1#AGGRD_4P.container002' 64000) on dbpartitionnum (2)
using (file '/db2/BW1/sapdata1/NODE0003/BW1#AGGRD_4P.container003' 64000) on dbpartitionnum (3)
extentsize 4 prefetchsize 4 bufferpool BP_STD_16K
It is necessary to create a new data class for each new partitioned tablespace
and to assign the corresponding SAP BW objects (PSA, ODS, InfoCubes, and
aggregates) to the new data class. Use the following procedure:
1. Log on to the SAP BW system.
2. Go to transaction ST04 -> Configuration -> Data Classes.
3. Click Add, and enter the information as shown in Figure 9-4.
4. To assign the new data class to an aggregate for example, you need to go to
the transaction RSA1 -> Infoprovider, right-click on InfoCube and select the
option Maintain Aggregates.
5. Go to the menu Extras -> Change Data Class for Aggregates.
For the other objects (such as ODS objects and InfoCubes), you can find the option to change the data class by editing (changing) the object and going to the menu Extras -> Maintain DB-storage parameters.
6. Repeat the steps 3 - 5 for all data classes that you want to assign. In our case
we create new data classes for fact, ODS, and aggregate tablespaces,
because these are the tablespaces that we want to copy to newly created
distributed tablespaces.
Re-distributing tables
We re-distribute the tables by copying them into the new tablespaces created on
multiple partitions. We use the SAP report DB6CONV to copy the tables. It
executes an SQL statement of the form INSERT INTO target SELECT FROM
source NOT LOGGED INITIALLY to copy the data of a particular table. Because of
the NOT LOGGED INITIALLY option, this method of re-distributing data works much
faster than the DB2 redistribute command. Because the SQL statements are not
logged, you must back up the database after the tables are copied.
Note: Please refer to SAP Note 362325 for downloading the latest version of
report DB6CONV.
The report DB6CONV normally converts/copies only single tables. The SAP
module RSDU_REPARTITION_PREPARE_DB6 allows you to schedule the
conversion of all tables that belong to a tablespace.
We start, for example, by copying the aggregate tablespace. That tablespace contains 10 aggregates, as shown in Figure 9-5 on page 307.
2. Go to transaction SE37 and (via Function Module -> Test -> Single Test)
execute the function module: RSDU_REPARTITION_PREPARE_DB6.
3. Enter the name of the source tablespace in the parameter I_TABLESPACE
and the name of target tablespace in the parameter I_TABLESPACE_NEW
and execute the function. See Figure 9-6, the initial screen.
Figure 9-6 Function module to schedule more than one table to convert.
5. Select all entries, and click the button CONVERT NEW to start the
conversion.
If you select a table entry and click show/edit, you get more details about the
table to be converted, for example, the number of rows in the table.
6. After the conversion of a table is finished, its status changes to FINISHED.
7. Repeat steps 1 to 6 for all tablespaces that you want to distribute.
Figure: on the p-Series 650 A, the containers of the tablespaces (<SID>#FACTD/I, <SID>#AGGRD/I, <SID>#ODSD/I, DIM, and so on) for database partitions 0 to 3 are relocated from VG 1 (RAID 5) to VG 2 (RAID 5) with the db2relocatedb tool
We then export the corresponding VG on the first server and import the VG into
the second server as shown in Figure 9-9. The necessary steps are described in
9.2.3, “Moving database partitions to another server” on page 312.
Before starting, make sure that there is a recent database backup, or take a new
backup of the database. The relocation requires the following steps for each
database partition for which you want to relocate the containers:
/db2/<SID>/log_archive: DB2 offline log files. If you have a mirror for the DB2 log files, don't forget to share the mirror too.
Example:
/db2/BW1/log_archive
/db2/BW1/log_archive2 (mirror)
/db2/<SID>/log_dir: DB2 online database log files. If you mirror the log files, don't forget to create the mirror file system.
Example:
/db2/BW1/log_dir
/db2/BW1/log_dir2 (mirror)
You can use the same sizes that you have on the first server. At least 540 MB is required.
11.Export / Import those VGs that include the containers of the partitions you
want to move.
In our case VG2 is still on the first server and includes the file systems below:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lv_sapd10 jfs2 60 60 1 open/syncd /db2/BW1/sapdata10
lv_sapd11 jfs2 60 60 1 open/syncd /db2/BW1/sapdata11
lv_sapd12 jfs2 216 216 1 open/syncd /db2/BW1/sapdata12
These file systems store the containers that belong to partitions 2 and 3. File system sapdata10 stores the following containers, for example:
./NODE0002:
BW1#AGGRD4P.container002
BW1#AGGRI4P.container002
./NODE0003:
BW1#AGGRD4P.container003
BW1#AGGRI4P.container003
We export VG2 from the first server to the second server with the following
steps:
a. As user root, perform the following steps on the first server:
i. Unmount the file systems that are in VG2.
ii. Deactivate the VG2 using smit vg -> Deactivate a Volume Group
iii. Export VG2 using smit vg -> Export a Volume Group.
b. As user root, perform the following steps on the second server:
i. Import VG2 using smit vg -> Import a Volume Group.
You can leave the Major number blank.
ii. Activate VG2 using smit vg -> Activate a Volume Group.
iii. Mount the file systems that are in VG2.
12.Copy temporary tablespace containers to the additional server for those
database partitions that you want to move.
0 serverA 0 serverA_1GB
1 serverA 1 serverA_1GB
2 serverB 0 serverB_1GB
3 serverB 1 serverB_1GB
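The four lines above appear to follow the db2nodes.cfg format: partition number, host name, logical port, and netname (here the 1 GB interconnect). A small parsing sketch, assuming that format:

```python
from typing import List, NamedTuple

class DbPartition(NamedTuple):
    partition: int
    hostname: str
    logical_port: int
    netname: str

def parse_db2nodes(lines: List[str]) -> List[DbPartition]:
    """Parse db2nodes.cfg-style lines: partition host logical-port netname."""
    entries = []
    for line in lines:
        num, host, port, net = line.split()
        entries.append(DbPartition(int(num), host, int(port), net))
    return entries

cfg = ["0 serverA 0 serverA_1GB",
       "1 serverA 1 serverA_1GB",
       "2 serverB 0 serverB_1GB",
       "3 serverB 1 serverB_1GB"]
partitions = parse_db2nodes(cfg)
print(sorted({p.hostname for p in partitions}))  # ['serverA', 'serverB']
```

After the move, partitions 2 and 3 run on the second server while 0 and 1 stay on the first.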
15.Check that you can reach each server from each other server via the remote
shell command (rsh), and ensure that you are not prompted for a password.
For example:
As user db2<sid>, log on to serverA.
Execute the following command: rsh serverB -l db2<sid> hostname
If you are prompted for a password, you might have to modify the file
/etc/hosts.equiv on serverB to include the client workstation serverA.
9.3 Summary
In this chapter we have demonstrated how to scale out a database. We
described how to add partitions to a non-partitioned database, and showed how
to add a server to a single-server configuration and spread the database
partitions across both servers. These capabilities can provide you with the
scalability you need as your environment grows. This has demonstrated the
powerful capabilities of DB2 to scale in an SAP BW environment, and to handle
the continuing growth in database volume and number of users in business
intelligence implementations.
DB2 is the leading database when it comes to scalability. Key to this unique
capability is support for parallelism and partitioning as described in 3.2.3,
“Parallel processing and partitioning” on page 55. When you are running with
multiple partitions on one or multiple database servers, DB2 supports parallel
processing across all database partitions. This is called inter-partition
parallelism. It provides maximum flexibility and results in maximum throughput
with your available processing power. This is extremely important as it enables
excellent response times for your users as you grow.
This means that when going from testcase 1 to 2 the hardware (number of SP
nodes and SSA storage towers) was doubled and again doubled when going
from testcase 2 to 3.
For each testcase, both single running SAP BW queries and multiple concurrent
SAP BW queries were tested. For the single query tests, the user that executed
the queries was always connected to the first SAP BW Instance on the first SP
node.
For the concurrent query tests, a query driver was used that simulated several concurrent users. The users were connected evenly to each of the four configured SAP BW Instances.
Figure: test configurations with the SAP BW work processes on each SP node, and the fact table on 2 x RAID5 drives (one node), 4 x RAID5 drives (two nodes), and 8 x RAID5 drives (four nodes)
The first number in each cell is the runtime with empty buffer pools, and the second number (in italics) is the runtime of the second execution of the query, that is, with a filled buffer pool. Queries with different complexity and different inclusion factors were tested:
Queries with a high inclusion factor: That means weak restrictions, as with example queries 1 and 2. These queries resulted in a large amount of data being selected from the InfoCube fact tables. Because the selected data did not fit completely into the buffer pool, these queries always involved I/O operations, even during the second execution.
Queries with a low inclusion factor: That means strong restrictions, as with example queries 3 to 5. Because these queries selected less data, when they were executed the second time the data was completely in the buffer pool and no I/O operations (disk accesses) occurred.
Some numbers in the table need further explanation. Query 1 for example, is a
complex and long running query. It scales more than linear (performance
increase of factor 10 during second execution) when going from 1 to 2 SP nodes,
because of the larger buffer pool, which results in a higher buffer pool hit ratio.
The data selected by queries 3 to 5 (cells with a gray background) fit completely into the buffer pool. Therefore, the larger buffer pool did not have any impact on the execution times of these queries.
Depending on the query structure, execution plans might be different when going
from 1 to 2 and 4 SP nodes. Therefore, some queries, such as query 3 in
Table 10-1, do not scale predictably.
Short running queries remain short running when going to multiple partitions, for
example, query 5.
Long-running queries: The queries used in this test selected more data, and the data did not fit completely into the buffer pool. Therefore, I/O operations occurred during the test. During this test, 16 queries were running concurrently at all times, which resulted in 100% CPU utilization.
(Figure: measured versus linear scalability over 1 to 4 SP nodes.)
Scalability factors for short-running queries:
• From 1 to 2 database servers: 1.84
• From 1 to 4 database servers: 3.58
Scalability factors for long-running queries:
• From 1 to 2 database servers: 2.04
• From 1 to 4 database servers: 4.11
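A scalability factor is simply the single-node runtime divided by the n-node runtime; dividing the factor by the node count gives the parallel efficiency. The factors below are the measured values from the tests; the efficiency derivation is our own illustration:

```python
def scalability_factor(runtime_1_node: float, runtime_n_nodes: float) -> float:
    """Speedup relative to the single-node runtime."""
    return runtime_1_node / runtime_n_nodes

def efficiency(factor: float, nodes: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 = perfectly linear)."""
    return factor / nodes

# Measured scalability factors from the query tests.
measured = {("short", 2): 1.84, ("short", 4): 3.58,
            ("long", 2): 2.04, ("long", 4): 4.11}

for (kind, nodes), factor in measured.items():
    print(f"{kind}-running, {nodes} nodes: "
          f"{efficiency(factor, nodes):.3f} of linear")
```

Note that the long-running queries show an efficiency above 1.0 (super-linear scaling), consistent with the buffer pool effect discussed earlier.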
During the load, dimension identifiers must be read from the dimension tables,
new dimension identifiers must be created for new dimension entries, the transfer
rules are applied, and the data is finally inserted into the fact table.
Because the distributed PSA and fact tables use different partitioning keys,
some rows that are selected from PSA on one partition are potentially inserted
into the fact table on a different partition (see the cross arrows from PSA to fact
tables).
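DB2 assigns each row to a partition by hashing its partitioning-key value through a partition map, so tables with different partitioning keys scatter the same logical record differently. The hash function below is an illustrative stand-in, not DB2's internal algorithm, and the column names are hypothetical:

```python
import zlib

NUM_PARTITIONS = 4

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partitioning-key value to a partition number.
    A stand-in for DB2's hash-based partition map."""
    return zlib.crc32(key.encode()) % num_partitions

# Hypothetical record: assume the PSA table is partitioned on the request
# identifier, while the fact table is partitioned on dimension identifiers.
record = {"request_id": "REQU_4711", "dimid_key": "17|42|3"}

psa_partition = partition_for(record["request_id"])
fact_partition = partition_for(record["dimid_key"])

# When the two partition numbers differ, the row read from PSA on one node
# must be shipped to another node for the fact-table insert (the cross
# arrows in the figure).
print(psa_partition, fact_partition, psa_partition != fact_partition)
```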
Figure 10-4 Test environment for dataload from PSA into InfoCube
During the tests, 33% of the dataload time was spent on database processing. The CPU utilization was 100% in the first scenario and about 95% in the second and third scenarios. A data package size (IDoc) of 100,000 records was used for all tests. The throughput, that is, the number of records loaded in a given timeframe, was compared for each scenario.
Figure 10-5 shows the result of the tests, that is, the increase of throughput when
going from one to two SP nodes and from two to four SP nodes.
Single-user test
Here are the single-user test results:
In general, SAP BW query runtimes decrease as the number of SP nodes (machines) increases.
Long-running queries scale at least linearly.
Short-running queries remain short running in a multi-node environment.
To help customers deal with the potentially huge initial outlay of hardware, the paper presents a cluster configuration. This helps the cost issue in two ways. First, it uses several smaller machines linked together, which are normally less expensive. Second, customers can start with a small system that covers only one region, or business segment, and then add more machines as they expand their SAP BW coverage.
In the following sections we show the hardware and software setup that was
used for this paper, explain the scalability tests, and show the test results.
The software setup used a two-tier approach, where the database and application servers both resided on the same machine. This is simpler and more efficient for a small setup, since at some points the database server needs more CPU and at other points the application servers do. Having both reside on one system allows pooling of CPU resources, and has the side benefit of reducing network traffic between the database server and the application servers.
The total application server capacity was split across the two machines by having
one instance of the application server on each machine. One of them was the
Central Instance, but other than that, each was configured with similar settings.
(Figure: two SunBlade 2000 machines with SunStorEdge T3 storage, connected through a Gigabit interconnect and a LAN Ethernet network.)
The database server was distributed across the two machines using the DB2
shared-nothing partitioned database feature. Four partitions were created per
machine to show the ability of this architecture to scale.
The total amount of data loaded into the fact table was 116,800,000 rows when testing on one machine, and double that amount when testing on two machines.
Figure 10-7 Results of scalability tests (operations such as repair index, rollup aggregates, and activate ODS, plotted against the number of machines)
The amount of processed data was identical in both cases. Table 10-2 shows the operations tested. You can see that some operations, such as collecting runstats and InfoPackage compression, perform much better on a partitioned database.
Of course, if during a certain operation the system already has 100% CPU
utilization on an unpartitioned database, it is likely that it will not run faster on a
partitioned database. This was the case, for example, when loading 30
InfoPackages concurrently.
When transaction DB6PERF is started, you can select the SQL Cache row in the navigation field (left pane) with a mouse click. The transaction opens the screen depicted in Figure A-1, which shows statistics such as:
Number of executions
Total execution time, in ms and as a percentage
When you select the second statement in the right pane, where the statements are sorted by Total Execution Time, the SQL statement and its access plan are explained, as shown in Figure A-2. The Estimated Costs value is marked yellow, and the #key columns value is marked green.
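Sorting by Total Execution Time rather than by number of executions surfaces the statements that cost the most overall, even when each individual execution is cheap. The cache entries below are invented for illustration; this is only a sketch of the same ranking logic the SQL Cache view applies:

```python
# Hypothetical SQL cache entries: (statement text, executions, total ms).
sql_cache = [
    ('SELECT ... FROM "/BIC/FZSD_C01" ...', 120, 95_000),
    ('SELECT "DIMID" FROM "/BIC/DZSD_C011" ...', 50_000, 40_000),
    ('INSERT INTO "/BIC/B0000123000" ...', 800, 12_000),
]

# Rank the statements by total execution time, most expensive first.
by_total_time = sorted(sql_cache, key=lambda entry: entry[2], reverse=True)

for stmt, execs, total_ms in by_total_time:
    avg_ms = total_ms / execs  # average time per execution
    print(f"{total_ms:>7} ms total, {avg_ms:>8.2f} ms/exec  {stmt[:40]}")
```

The average time per execution is also worth computing: a statement executed 50,000 times at under a millisecond each may matter more for tuning than a single long-running one.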
By clicking the Collect button shown at the top of Figure A-3, the BW system
collects all of the necessary data to analyze the SQL.
As depicted in Figure A-4, you are asked to select the type of information
needed, such as:
DB configuration values and level
Tablespace configuration
Table structure
Statistics
Explanation
All the values can also be displayed on the SAP GUI (SAP Graphical User
Interface) by selecting the Display Download on Screen — see Figure A-4.
Figure A-5 The Collected and Downloaded Information for the SQL Explain.
The files include all of the detailed information including the database
configuration for the partitions.
DB2LEVEL File
Node type = Enterprise Server Edition with local and remote clients
Backup pending = NO
Database is consistent = NO
Rollforward pending = NO
Restore pending = NO
Attention: Other BW1 and partition configuration files would typically follow, containing the same type of information as the example above, but they have been omitted to save space.
Tablespace ID = 0
Name = SYSCATSPACE
Type = Database managed space
Contents = Any data
State = 0x0000
Detailed explanation:
Normal
Total pages = 166400
Useable pages = 166398
Used pages = 53448
Free pages = 112950
High water mark (pages) = 53448
Page size (bytes) = 4096
Extent size (pages) = 2
Prefetch size (pages) = 2
Number of containers = 1
CONNECT TO BW1;
---------------------------------
-- DDL statements for User Defined Functions
---------------------------------
------------------------------------------------
-- DDL Statements for table "SAPBW1 "."/BIC/DZSD_C011"
------------------------------------------------
COMMIT WORK;
CONNECT RESET;
TERMINATE;
STATISTICS File
CONNECT TO BW1;
---------------------------------------------
---------------------------------------------
UPDATE SYSSTAT.INDEXES
SET NLEAF=-1,
NLEVELS=-1,
FIRSTKEYCARD=-1,
FIRST2KEYCARD=-1,
FIRST3KEYCARD=-1,
FIRST4KEYCARD=-1,
FULLKEYCARD=-1,
CLUSTERFACTOR=-1,
CLUSTERRATIO=-1,
SEQUENTIAL_PAGES=-1,
DENSITY=-1,
UPDATE SYSSTAT.COLUMNS
SET COLCARD=-1,
NUMNULLS=-1
WHERE TABNAME = '/BIC/DZSD_C011' AND TABSCHEMA = 'SAPBW1 ';
UPDATE SYSSTAT.TABLES
SET CARD=0,
NPAGES=0,
FPAGES=1,
OVERFLOW=0,
ACTIVE_BLOCKS=0
WHERE TABNAME = '/BIC/DZSD_C011' AND TABSCHEMA = 'SAPBW1 ';
UPDATE SYSSTAT.COLUMNS
SET COLCARD=0,
NUMNULLS=0,
SUB_COUNT=-1,
SUB_DELIM_LENGTH=-1,
AVGCOLLEN=4
WHERE COLNAME = 'DIMID' AND TABNAME = '/BIC/DZSD_C011' AND TABSCHEMA =
'SAPBW1 ';
UPDATE SYSSTAT.COLUMNS
SET COLCARD=0,
NUMNULLS=0,
SUB_COUNT=-1,
SUB_DELIM_LENGTH=-1,
AVGCOLLEN=4
WHERE COLNAME = 'SID_0SOLD_TO' AND TABNAME = '/BIC/DZSD_C011' AND TABSCHEMA
= 'SAPBW1 ';
COMMIT WORK;
UPDATE SYSSTAT.COLDIST
SET COLVALUE = NULL, VALCOUNT= -1
WHERE VALCOUNT <> -1 AND COLNAME = 'DIMID' AND TABNAME = '/BIC/DZSD_C011'
AND TABSCHEMA = 'SAPBW1 ';
UPDATE SYSSTAT.COLDIST
SET COLVALUE = NULL, VALCOUNT= -1
WHERE VALCOUNT <> -1 AND COLNAME = 'SID_0SOLD_TO' AND TABNAME =
'/BIC/DZSD_C011'
AND TABSCHEMA = 'SAPBW1 ';
COMMIT WORK;
UPDATE SYSSTAT.INDEXES
SET NLEAF=0,
NLEVELS=1,
FIRSTKEYCARD=0,
FIRST2KEYCARD=-1,
FIRST3KEYCARD=-1,
FIRST4KEYCARD=-1,
FULLKEYCARD=0,
CLUSTERFACTOR=-1.000000,
CLUSTERRATIO=100,
SEQUENTIAL_PAGES=0,
DENSITY=0,
AVERAGE_SEQUENCE_GAP=0.000000,
AVERAGE_SEQUENCE_FETCH_GAP=-1.000000,
AVERAGE_SEQUENCE_PAGES=0.000000,
AVERAGE_SEQUENCE_FETCH_PAGES=-1.000000,
AVERAGE_RANDOM_PAGES=0.000000,
AVERAGE_RANDOM_FETCH_PAGES=-1.000000,
NUMRIDS=0,
NUMRIDS_DELETED=0,
NUM_EMPTY_LEAFS=0
WHERE COLNAMES = '+DIMID'
AND TABNAME = '/BIC/DZSD_C011' AND TABSCHEMA = 'SAPBW1 ';
UPDATE SYSSTAT.INDEXES
SET NLEAF=0,
NLEVELS=1,
FIRSTKEYCARD=0,
FIRST2KEYCARD=-1,
FIRST3KEYCARD=-1,
FIRST4KEYCARD=-1,
FULLKEYCARD=0,
CLUSTERFACTOR=-1.000000,
CLUSTERRATIO=100,
SEQUENTIAL_PAGES=0,
DENSITY=0,
AVERAGE_SEQUENCE_GAP=0.000000,
AVERAGE_SEQUENCE_FETCH_GAP=-1.000000,
AVERAGE_SEQUENCE_PAGES=0.000000,
COMMIT WORK;
-- Mimic functions
UPDATE SYSSTAT.FUNCTIONS
SET ios_per_invoc= -1.0,
insts_per_invoc= -1.0,
ios_per_argbyte= -1.0,
insts_per_argbyte= -1.0,
percent_argbytes= -1,
initial_ios= -1.0,
initial_insts= -1.0,
cardinality= -1.0;
COMMIT WORK;
CONNECT RESET;
TERMINATE;
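In these UPDATE SYSSTAT statements, a value of -1 marks a statistic as unavailable; writing catalog statistics by hand like this is how tools mimic a production system's optimizer environment on a test box. The sketch below generates such a statement; the column names and values come from the file above, but the generator itself is our own illustration, not an IBM tool:

```python
def mimic_column_stats(schema: str, table: str, column: str,
                       colcard: int, numnulls: int) -> str:
    """Build an UPDATE SYSSTAT.COLUMNS statement like those in the
    STATISTICS file (-1 marks a statistic as unavailable)."""
    return (
        "UPDATE SYSSTAT.COLUMNS\n"
        f"SET COLCARD={colcard},\n"
        f"NUMNULLS={numnulls},\n"
        "SUB_COUNT=-1,\n"
        "SUB_DELIM_LENGTH=-1,\n"
        "AVGCOLLEN=4\n"
        f"WHERE COLNAME = '{column}' AND TABNAME = '{table}' "
        f"AND TABSCHEMA = '{schema}';"
    )

# Note the trailing blank in the schema name: SAP schemas are stored
# blank-padded in the DB2 catalog, as the file above shows.
print(mimic_column_stats("SAPBW1 ", "/BIC/DZSD_C011", "DIMID",
                         colcard=0, numnulls=0))
```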
EXPLAIN File
DB2_VERSION: 08.01.3
SOURCE_NAME: SYSSH200
SOURCE_SCHEMA: NULLID
SOURCE_VERSION:
EXPLAIN_TIME: 2004-03-17-10.29.01.664445
EXPLAIN_REQUESTER: SAPBW1
Package Context:
---------------
SQL Type: Dynamic
Optimization Level: 5
Blocking: Block All Cursors
Isolation Level: Cursor Stability
Original Statement:
------------------
SELECT "DIMID"
FROM "/BIC/DZSD_C011"
WHERE "SID_0SOLD_TO" = ? FETCH FIRST 1 ROWS ONLY
OPTIMIZE
FOR 1 ROWS
Optimized Statement:
-------------------
SELECT Q1.DIMID AS "DIMID"
FROM SAPBW1."/BIC/DZSD_C011" AS Q1
WHERE (Q1.SID_0SOLD_TO = :?)
Access Plan:
-----------
Total Cost: 0.0714158
Query Degree: 0
Rows
Arguments:
---------
BLDLEVEL: (Build level)
DB2 v8.1.1.24 : s030728
ENVVAR : (Environment Variable)
DB2_INLIST_TO_NLJN = YES
ENVVAR : (Environment Variable)
Input Streams:
-------------
5) From Operator #2
Column Names:
------------
+Q2.DIMID
2) TQ : (Table Queue)
Cumulative Total Cost: 0.0714158
Cumulative CPU Cost: 136416
Cumulative I/O Cost: 0
Cumulative Re-Total Cost: 0.00283065
Cumulative Re-CPU Cost: 5407
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 0.0714158
Cumulative Comm Cost: 1
Cumulative First Comm Cost: 0
Estimated Bufferpool Buffers: 1
Arguments:
---------
LISTENER: (Listener Table Queue type)
FALSE
TQMERGE : (Merging Table Queue flag)
FALSE
TQREAD : (Table Queue Read type)
READ AHEAD
TQSEND : (Table Queue Write type)
DIRECTED
UNIQUE : (Uniqueness required flag)
FALSE
Input Streams:
Column Names:
------------
+Q1.$RID$+Q1.DIMID+Q1.SID_0SOLD_TO
Output Streams:
--------------
5) To Operator #1
Column Names:
------------
+Q2.DIMID
3) FETCH : (Fetch)
Cumulative Total Cost: 0.021395
Cumulative CPU Cost: 40868
Cumulative I/O Cost: 0
Cumulative Re-Total Cost: 0.00283065
Cumulative Re-CPU Cost: 5407
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 0.021395
Cumulative Comm Cost: 0
Cumulative First Comm Cost: 0
Estimated Bufferpool Buffers: 1
Predicates:
----------
2) Sargable Predicate
Relational Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.04
Predicate Text:
--------------
(Q1.SID_0SOLD_TO = :?)
Input Streams:
-------------
2) From Operator #4
Column Names:
------------
+Q1.DIMID(A)+Q1.$RID$
Column Names:
------------
+Q1.SID_0SOLD_TO
Output Streams:
--------------
4) To Operator #2
Column Names:
------------
+Q1.$RID$+Q1.DIMID+Q1.SID_0SOLD_TO
Arguments:
---------
MAXPAGES: (Maximum pages for prefetch)
ALL
PREFETCH: (Type of Prefetch)
NONE
ROWLOCK : (Row Lock intent)
Input Streams:
-------------
1) From Object SAPBW1./BIC/DZSD_C011~0
Column Names:
------------
+Q1.DIMID(A)+Q1.$RID$
Output Streams:
--------------
2) To Operator #3
Column Names:
------------
+Q1.DIMID(A)+Q1.$RID$
Schema: SAPBW1
Name: /BIC/DZSD_C011~0
Schema: SAPBW1
Name: /BIC/DZSD_C011
Type: Table
Time of creation: 2004-02-23-11.30.54.422579
Last statistics update: 2004-03-16-18.47.14.386674
Number of columns: 2
Number of rows: 0
Width of rows: 14
Number of bufferpool pages: 1
Distinct row values: No
Tablespace name: BW1#DIMD
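The Filter Factor of 0.04 shown for the equality predicate in the explain output above follows the optimizer's usual rule of thumb for `col = ?`: roughly 1/COLCARD, the reciprocal of the column's distinct-value count, assuming a uniform value distribution. A 0.04 factor would thus imply about 25 distinct SID_0SOLD_TO values. The sketch below illustrates that rule of thumb only; it is not DB2's actual costing code:

```python
def equality_filter_factor(colcard: int) -> float:
    """Selectivity estimate for an equality predicate on a column with
    COLCARD distinct values, each assumed equally likely to match."""
    if colcard <= 0:
        # COLCARD of -1 or 0 means no usable statistics were collected.
        raise ValueError("column cardinality statistics unavailable")
    return 1.0 / colcard

# A COLCARD of 25 reproduces the 0.04 filter factor in the explain output.
print(equality_filter_factor(25))  # 0.04
```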
Access Control List (ACL). The list of principals that have explicit permission (to publish, to subscribe to, and to request persistent delivery of a publication message) against a topic in the topic tree. The ACL defines the implementation of topic-based security.

Aggregate. Pre-calculated and pre-stored summaries, kept in the data warehouse to improve query performance. A multidimensional summary table derived from InfoCube data for performance; can be stored in RDBMS or MS Analysis Services.

Aggregation. An attribute level transformation that reduces the level of detail of available data. For example, having a Total Quantity by Category of Items rather than the individual quantity of each item in the category.

Alert Monitor for BEx. A monitoring tool for displaying exceptions whose threshold values have been exceeded or have not been reached.

AMI. See Application Messaging Interface.

Application Link Enabling. Supports the creation and operation of distributed applications. Application integration is achieved via synchronous and asynchronous communication, not via a central database. Provides business-controlled message exchange with consistent data on loosely linked SAP applications.

Application Messaging Interface. The programming interface provided by MQSeries® that defines a high level interface to message queuing services.

Asynchronous Messaging. A method of communication between programs in which a program places a message on a message queue, then proceeds with its own processing without waiting for a reply to its message.

Attribute. A field in a dimension table.

Basis. A set of middleware programs and tools from SAP that provides the underlying base that enables applications to be seamlessly interoperable and portable across operating systems and database products.

BEx. Business Explorer: the SAP query and reporting tool for end users, tightly coupled with BW.

BLOB. Binary Large Object: a block of bytes of data (for example, the body of a message) that has no discernible meaning, but is treated as one solid entity that cannot be interpreted.

Broker domain. A collection of brokers that share a common configuration, together with the single Configuration Manager that controls them.

Characteristic. A business intelligence dimension.

Cluster. A group of records with similar characteristics. In WebSphere MQ, a group of two or more queue managers on one or more computers, providing automatic interconnection, and allowing queues to be shared amongst them for load balancing and redundancy.

Compensation. The ability of DB2 to process SQL that is not supported by a data source on the data from that data source.
Configuration repository. Persistent storage for broker configuration and topology definition.

Configuration. The collection of brokers, their execution groups, the message flows and sets that are assigned to them, and the topics and associated access control specifications.

Connector. See Message processing node connector.

Control Center. The graphical interface that provides facilities for defining, configuring, deploying, and monitoring resources of the WebSphere MQ Integrator network.

Data Append. A data loading technique where new data is added to the database, leaving the existing data unaltered.

Data Cleansing. A process of data manipulation and transformation to eliminate variations and inconsistencies in data content. This is typically to improve the quality, consistency, and usability of the data.

Data Federation. The process of enabling data from multiple heterogeneous data sources to appear as if it is contained in a single relational database. Can also be referred to as "distributed access".

Data Refresh. A data loading technique where all the data in a database is completely replaced with a new set of data.

Data Warehouse. A specialized data environment developed, structured, and used specifically for decision support and informational applications. It is subject oriented rather than application oriented. Data is integrated, non-volatile, and time variant.

Database Instance. A specific independent implementation of a DBMS in a specific environment. For example, there might be an independent DB2 DBMS implementation on a Linux server in Boston supporting the Eastern offices, and another separate and independent DB2 DBMS on the same Linux server supporting the western offices. They would represent two instances of DB2.

Database Partition. Part of a database that consists of its own data, indexes, configuration files, and transaction logs.

DataBlades. These are program modules that provide extended capabilities for Informix databases, and are tightly integrated with the DBMS.

DB Connect. Enables connection to several relational database systems and the transfer of data from these database systems into the SAP Business Information Warehouse.
Enrichment. The creation of derived data. An attribute level transformation performed by some type of algorithm to create one or more new (derived) attributes.

Event. A signal to the background processing system that a certain status has been reached in the SAP system. The background processing system then starts all processes that were waiting for this event.

Event Queue. The queue onto which the queue manager puts an event message after it detects an event. Each category of event (queue manager, performance, or channel event) has its own event queue.

Execution group. A named grouping of message flows that have been assigned to a broker.

Extenders. These are program modules that provide extended capabilities for DB2, and are tightly integrated with DB2.

InfoCube. Star schema; a fact table with several dimension tables; consists of key figures and characteristics; often has an aggregate summary table.

InfoProvider. Generic term for objects for which queries are created and executed in SAP BW. There are two types of InfoProvider: objects that contain physical data and objects that do not contain physical data. Data targets, such as InfoCubes, ODS objects, and InfoObjects (characteristics with attributes, texts, or hierarchies) belong to the first type, and InfoSets, RemoteCubes, SAP RemoteCubes, and MultiProviders belong to the second type. InfoProviders are the objects or views that are relevant for reporting.

InfoSet. Describes data sources that are usually defined as joins between ODS objects and/or InfoObjects (characteristics with master data).
Glossary 361
InfoSource. An intermediary supplying updates to InfoCubes and ODS; contains no data; a collection of rules and objects.

Input node. A message flow node that represents a source of messages for the message flow.

Instance. A complete database environment.

IVews. InfoObjects (master data) with properties, text, and hierarchies.

Java Database Connectivity. An application programming interface that has the same characteristics as ODBC but is specifically designed for use by Java database applications.

Java Development Kit. Software package used to write, compile, debug, and run Java applets and applications.

Java Message Service. An application programming interface that provides Java language functions for handling messages.

Java Runtime Environment. A subset of the Java Development Kit that allows you to run Java applets and applications.

Master data. Data that remains unchanged over a long period of time (attributes, texts, and hierarchies; similar to data warehouse dimensions).

Materialized Query Table. A table where the results of a query are stored for later reuse.

Measure. A data item that measures the performance or behavior of business processes.

Message broker. A set of execution processes hosting one or more message flows.

Message domain. The value that determines how the message is interpreted (parsed).

Message flow. A directed graph that represents the set of activities performed on a message or event as it passes through a broker. A message flow consists of a set of message processing nodes and message processing connectors.

Message parser. A program that interprets the bit stream of an incoming message and creates an internal representation of the message in a tree structure. A parser is also responsible for generating a bit stream for an outgoing message from the internal representation.

Message processing node connector. An entity that connects the output terminal of one message processing node to the input terminal of another.

Message processing node. A node in the message flow, representing a well defined processing stage. A message processing node can be one of several primitive types, or it can represent a subflow.

Message Queue Interface. The programming interface provided by the WebSphere MQ queue managers. This programming interface allows application programs to access message queuing services.

Message repository. A database holding message template definitions.

Message set. A grouping of related messages.

Message type. The logical structure of the data within a message.

Meta Data. Typically called data (or information) about data. It describes or defines data elements.

MOLAP. Multi-dimensional OLAP. Can be called MD-OLAP. It is OLAP that uses a multi-dimensional database as the underlying data structure.
Predefined message. A message with a structure that is defined before the message is created or referenced.

Process. An activity within or outside an SAP system with a defined start and end time.

Process Variant. Name of the process. A process can have different variants. For example, in the loading process, the name of the InfoPackage represents the process variants. The user defines a process variant for the scheduling time.

Primary Key. Field in a database table that uniquely identifies each record in the table.

PSA. Persistent staging area: flat files that hold extract data that has not yet been cleaned or transformed.

Pushdown. The act of optimizing a data operation by pushing the SQL down to the lowest point in the federated architecture where that operation can be executed. More simply, a pushdown operation is one that is executed at a remote server.

Queue Manager. A subsystem that provides queuing services to applications. It provides an application programming interface so that applications can access messages on the queues that are owned and managed by the queue manager.

Queue. A WebSphere MQ object. Applications can put messages on, and get messages from, a queue. A queue is owned and managed by a queue manager. A local queue is a type of queue that can contain a list of messages waiting to be processed. Other types of queues cannot contain messages, but are used to point to other queues.

RemoteCube. An InfoCube whose transaction data is managed externally rather than in SAP BW.

ROLAP. Relational OLAP. Multi-dimensional analysis using a multi-dimensional view of relational data. A relational database is used as the underlying data structure.

Roll-up. Iterative analysis, exploring facts at a higher level of summarization.

SAP BW. Business Information Warehouse.

SAPS. SAP unit of measure for computing power.

Shared nothing. A data management architecture where nothing is shared between processes. Each process has its own processor, memory, and disk space.

Slice and Dice. Analysis across several dimensions and across many categories of data items. Typically to uncover business behavior and rules.

Static SQL. SQL that has been compiled prior to execution. Typically provides best performance.

Subflow. A sequence of message processing nodes that can be included within a message flow.

Subject Area. A logical grouping of data by categories, such as customers or items.

Synchronous Messaging. A method of communication between programs in which a program places a message on a message queue and then waits for a reply before resuming its own processing.

Thread. In WebSphere MQ, the lowest level of parallel execution available on an operating system platform.

Type Mapping. The mapping of a specific data source type to a DB2 UDB data type.

Unit of Work. A recoverable sequence of operations performed by an application between two points of consistency.

User Mapping. An association made between the federated server user ID and password and the data source (to be accessed) user ID and password.
Abbreviations and acronyms
ABAP      Advanced Business Application Programming
ACS       access control system
ADK       Archive Development Kit
AIX       Advanced Interactive eXecutive from IBM
ALE       Application Link Enabling
AMI       Application Messaging Interface
API       Application Programming Interface
AQR       automatic query re-write
AR        access register
ARM       automatic restart manager
ART       access register translation
ASCII     American Standard Code for Information Interchange
AST       Application Summary Table
BAPI      Business Application Programming Interface
BEx       Business Explorer
BI        Business Intelligence
BIW       Business Information Warehouse (SAP)
BW        Business Information Warehouse (SAP)
BLOB      Binary Large OBject
CCMS      Computing Center Management System
CLI       Call Level Interface
CLOB      Character Large OBject
CLP       Command Line Processor
CORBA     Common Object Request Broker Architecture
CPU       Central Processing Unit
CSA       Common Storage Area
DB        Database
DBA       Database Administrator
DB2       Database 2™
DB2 UDB   DB2 Universal DataBase
DB2 II    DB2 Information Integrator
DBMS      DataBase Management System
DCE       Distributed Computing Environment
DCM       Dynamic Coserver Management
DCOM      Distributed Component Object Model
DDL       Data Definition Language
DLL       Dynamically Linked Library
DIMID     Dimension Identifier
DML       Data Manipulation Language
DRDA®     Distributed Relational Database Architecture
DSA       Dynamic Scalable Architecture
DSN       Data Source Name
DSS       Decision Support System
EAI       Enterprise Application Integration
EBCDIC    Extended Binary Coded Decimal Interchange Code
EDA       Enterprise Data Architecture
EDU       Engine Dispatchable Unit
EGM       Enterprise Gateway Manager
EJB       Enterprise Java Beans
ER        Enterprise Replication
The publications listed in this section are considered particularly suitable for a
more detailed discussion of the topics covered in this redbook.
IBM Redbooks
For information on ordering these publications, see “How to get IBM Redbooks”
on page 372. Note that some of the documents referenced here may be available
in softcopy only.
DB2 UDB/WebSphere Performance Tuning Guide, SG24-6417
DB2 UDB Backup and Recovery with ESS Copy Services, SG24-6557
Other publications
The publications listed below are relevant as information sources. Information
from them may have been referenced, or used, in the development of this
redbook:
1. Getting Started with SAP R/3 and IBM DB2 Universal Database on HP Servers, IBM Technical White Paper.
2. SAP Business Warehouse implementation on DB2 UDB Sun Cluster, IBM
Technical White Paper.
3. SAP R/3 Performance Optimization, ISBN 0321112350.
4. IBM DB2 Universal Database Administration Guide, Version 8.1, SC09-4821.
5. DB2 Universal Database Performance Tuning and Monitoring Workshop, IBM
Learning Services.
6. Performance Study for SAP Business Information Warehouse on DB2
Universal Database EEE for Linux, UNIX, and Windows, an IBM paper.
7. HACMP - Backbone Infrastructure Detailed Design, a technical IBM
document by Craig Wilson and Chris Hough.
8. Data Warehousing with mySAP Business Intelligence, an SAP Technical Brief.
Index 375
N
NetWeaver - see SAP NetWeaver

O
ODS - see Operational Data Store
OLAP 15, 123
OLAP Engine 39
online functions
   buffer pool configuration 77
   drop container 76
   index reorganization 76
   load 76
   parameter changes 77
Operational Data Store 23, 36
   ODS object 23, 25, 36
   ODS tables 90–91
optimizer - see DB2 UDB ESE

P
parallel processing 55
   query 85
parallelism 48, 85
   database backup/restore 85
   inter-partition parallelism 59, 85, 91, 93, 95
   intra-partition parallelism 56, 85, 100
partitioning 48, 51, 55, 58–59, 64, 90
   distribute the workload 94
   hash partitioning 52
   logical partitioning 51, 94
   multiple database partitions 94
   partition map 52
   partitioning keys 52, 95
   physical partitioning 51
   re-partitioning 116
Persistent Staging Area 23, 25, 90, 120
pipeline processing 56
PSA - see Persistent Staging Area

Q
Queries with hierarchies 101
Query Designer 39
query graph model - see DB2 UDB ESE

R
RAID data storage 186
Read Stability 54
record-based indexes 72
Redbooks Web site 372
   Contact us xv
Remote Function Call 40
RemoteCube 26
reorganization 75
Repeatable Read 54
roll-up 34

S
SAP Business Information Warehouse xi, 1
   3-tier system environment 83
   Additional Dialog Instance 179
   administration - see SAP BW Administration
   Administrator Workbench 39, 239
   an overview 7
   anticipated growth 17
   anticipated growth of users 17
   architecture 21
   benchmark tools 123
   business content 41
   Business Content Add-on 173
   Business Explorer 24, 39
   BW related definitions 172
   Central Instance 110, 132
   client copy 169
   component of business intelligence 14
   components 114
      Central Instance 114
      Database Instance 114
      Dialog Instance 114
      Gateway Instance 114
   concurrent queries 82
   continuous growth 82
   customer scenario 15
   data flow 24
   Database Instance 110, 132
   DB2 synergy 81
   deployed on DB2 81
   Dialog Instances 132
   Front-ends 132
   Gateway Instance 132
   information model - see information model
   installation Notes 134, 138
   maintenance window 82
   Management Summary 2
   MultiCubes 123
   naming convention 199
   OLAP engine 39
   instance distribution 132
   Logical Partitioning 130
   logical partitions 123
   Logical Volumes 138
   Log-on to the SAP System 164
   number of processors 123
   Phased Implementation 124
   physical database structure 160
   post-installation activities 163
   production support 130
   QuickSizer 122
   required documentation 138
   Roll-Out Implementation 124
   SAN Switches 138
   SAP BW queries 122
   SAP Patch Manager 175
   SAP system ID 140
   sapdata directories 160
   SAPinst GUI 141
   SAPinstGUI 150
   SAPS 123
   sizing 122
   sizing guide 122
   Starting the SAP System 164
   Stopping the SAP System 164
   System Administration Assistant 164
   system landscape 131
   system planning 137
   the value 82
   tight integration 86
   value details 89
   value proposition 87
   Volume Groups 138
   Web Application Server 132
SAP BW Performance 249
   a few tips 259
   aggregate table 255
   analysis and tuning 250
   Analyzing the SQL Statements 283
   Application Monitor 291
   bufferpools 264
   BW Monitor 275
   catalog tablespace 263
   common workload monitor 278
   configuration parameters 256
   configuring R/3 memory 258
   container types 262
   cost-based optimizer 266
   current database statistics 254
   data distribution 262
   DB2 configuration hints 260
   DB2 parameters 263
   DB2 UDB ESE 260
   DB2COCKPIT 280
   DB2PERF 280
   distribute the tablespaces 261
   extending tablespaces 262
   Health Check 250
   isolation level 266
   lock list 266
   MAXLOCKS parameter 266
   memory management 263
   Missing Indexes 288
   number of containers 262
   number of I/O servers 265
   number of R/3 instances 260
   number range buffering 260
   overflow records 268
   package cache 265
   page cleaners 265
   page size considerations 263
   partition the database 261
   performance monitoring 268
   prefetchers 265
   Proactive monitoring 272
   SAP BW requirements 255
   SAP memory 258
   SAP Notes 250
   soft checkpoints 266
   sort heap 264
   system bottlenecks 271
   systems configuration 250
   the approach 250
   too many processes 260
   transaction analysis 283
   Transaction logging 265
   tuning components 257
   unit of work size 254
   using VARCHAR 268
   work processes 259
SAP Exchange Infrastructure 14
SAP Master Data Management 14
SAP Mobile Infrastructure 14
SAP NetWeaver 10, 12, 82, 143
   components 14
   Developer Studio 143
   Developer Workplace 143
   Enterprise Services Architecture 12