PARTICIPANT HANDBOOK
INSTRUCTOR-LED TRAINING
Course Version: 17
Course Duration: 2 Day(s)
Material Number: 50155423
SAP Copyrights, Trademarks and
Disclaimers
No part of this publication may be reproduced or transmitted in any form or for any purpose without the
express permission of SAP SE or an SAP affiliate company.
SAP and other SAP products and services mentioned herein as well as their respective logos are
trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other
countries. Please see https://www.sap.com/corporate/en/legal/copyright.html for additional
trademark information and notices.
Some software products marketed by SAP SE and its distributors contain proprietary software
components of other software vendors.
National product specifications may vary.
These materials may have been machine translated and may contain grammatical errors or
inaccuracies.
These materials are provided by SAP SE or an SAP affiliate company for informational purposes only,
without representation or warranty of any kind, and SAP SE or its affiliated companies shall not be liable
for errors or omissions with respect to the materials. The only warranties for SAP SE or SAP affiliate
company products and services are those that are set forth in the express warranty statements
accompanying such products and services, if any. Nothing herein should be construed as constituting an
additional warranty.
In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business
outlined in this document or any related presentation, or to develop or release any functionality
mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’
strategy and possible future developments, products, and/or platform directions and functionality are
all subject to change and may be changed by SAP SE or its affiliated companies at any time for any
reason without notice. The information in this document is not a commitment, promise, or legal
obligation to deliver any material, code, or functionality. All forward-looking statements are subject to
various risks and uncertainties that could cause actual results to differ materially from expectations.
Readers are cautioned not to place undue reliance on these forward-looking statements, which speak
only as of their dates, and they should not be relied upon in making purchasing decisions.
TARGET AUDIENCE
This course is intended for the following audiences:
● Support Consultant
● Developer
● IT Administrator
● IT Support
● System Administrator
Lesson 1
Handling System Offline Situations
Lesson 2
Handling System Hang but Reachable Situations
Lesson 3
Analyzing a Suddenly Slow System
UNIT OBJECTIVES
LESSON OBJECTIVES
After completing this lesson, you will be able to:
● Handle system offline situations
The following issues can cause an SAP HANA system to go offline (that is, from the end-user
perspective, the SAP HANA system seems to hang):
● ........................................
● ........................................
● ........................................
● ........................................
● ........................................
Usually, in a system-down scenario, the system cannot be accessed through SQL or any
other connection method. This makes analyzing the root cause more difficult, but not
impossible. Several small tests, performed in the right order, help you quickly exclude areas
that aren't causing the problem. Such a workflow should become your standard way of
approaching a system that is down.
Because SAP HANA cockpit might only be able to partially connect to the SAP HANA system,
you should use the following quick tests to roughly determine the area that causes the
problem. As soon as you have found the problem area you should investigate more deeply,
but not forget that getting the system up and running again has the highest priority.
Question: What tests can you perform to find the problem area?
● ........................................
● ........................................
● ........................................
● ........................................
● ........................................
Answer: What tests can you perform to find the problem area?
Caution:
The following checks help you quickly identify which parts are broken and which
are working. With these tests, you are not supposed to perform a deep root cause
analysis; for that, other and better-suited tools are available.
In today's world, where almost every device is connected to the network, it's extremely
important that the network is up and running correctly. In an SAP HANA database system, the
network is important as well. End users connect to the database to execute all kinds of queries.
This can be done directly using SQL or via a middleware application. The SAP HANA database
itself can be set up as a multi-host scale-out system that distributes the data over several
servers. Without a network, external end-user connections and internal server-to-server
connections would fail.
Because external and internal network connections are important for an SAP HANA system,
you should test both by pinging SAP HANA and non-SAP HANA hosts in your network. If all
the hosts can be reached, the network is available and can be excluded as a cause.
ping <SAP HANA host>
ping <internal host>
ping <external host>
Using ping, you can verify that the remote hosts are reachable, but the network packets
might still be taking a detour due to a routing problem in the network. You can check the
network path to a remote host using the following command:
traceroute <SAP HANA host>
traceroute <internal host>
Hint:
If in your company the end users are connecting to the network using a virtual
desktop infrastructure (VDI) solution or are in a dedicated network, then you
should test the network connections from within these infrastructures as well.
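Put together, the ping and traceroute checks above can be scripted into a small triage loop. This is a minimal sketch; the host names in the example call are placeholders, and it assumes plain `ping` and `traceroute` are available on the admin host.

```shell
#!/bin/sh
# Sketch of the reachability checks above; host names are placeholders.
check_hosts() {
  for h in "$@"; do
    if ping -c 1 -W 2 "$h" >/dev/null 2>&1; then
      echo "$h: reachable"
    else
      # Unreachable: suggest inspecting the route for routing problems.
      echo "$h: NOT reachable - inspect the path with: traceroute $h"
    fi
  done
}

# Example call (placeholder hosts):
# check_hosts hana01.example.com app01.example.com gateway.example.com
```

Running the loop against the SAP HANA hosts, one internal host, and one external host in a single call gives you the whole network picture at once.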
As the SAP HANA hosts normally run 24/7, check whether there have been unplanned and
unexplained restarts. You can check this with the following command:
last | grep boot
Looking at the Linux system log files to analyze the system is one of the most important tasks
when troubleshooting a system. Since the move from syslog to systemd, kernel messages
and messages of system services are handled by systemd.
Systemd was introduced in SLES 12 and RHEL 7 and replaces the traditional init scripts.
Systemd also introduced its own logging system, called the journal.
Systemd manages the journal as a system service under the name systemd-journald.service
and it is switched on by default. In a systemd-enabled Linux system, the systemd-journald
service collects all messages from the kernel, boot process, syslog, user processes, standard
input, and system service errors in a centralized location.
You can check the last 50 boot error messages in the journal with the following command:
journalctl -n 50 -p err -b
Hint:
You can check the last 50 kernel error messages in the journal with the following
command:
journalctl -n 50 -p err -k
Avoiding storage problems is part of every layer in the Linux software and hardware stack.
Modern hard drives are capable of detecting and correcting minor errors in block reads. SAN
and NAS systems have built-in error correction and redundancy to handle power and hardware
failures. Modern Linux file systems are all journal-based and can correct errors caused by
power failures. Last but not least, databases also support many techniques to survive power
failures and incorrect service shutdowns.
If the SAP HANA database system stopped due to a power, hardware, or software failure, you
should check that all file systems are available again after the server has restarted. Depending
on the storage system used, you can then investigate the storage problem more deeply.
Note:
In the scope of this course we will not investigate storage system problems. For
this you need to contact your storage vendor and get the support information you
need.
To check if all the SAP HANA services and hosts are available on the Linux host, you can
execute the following commands:
As <sid>adm user:
sapcontrol -nr <instance number> -function GetProcessList
sapcontrol -nr <instance number> -function GetSystemInstanceList
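As a quick triage aid, the GetProcessList output can be filtered for services that are not GREEN. The awk filter below is a sketch; the sample input is invented to illustrate the comma-separated format that sapcontrol prints, and in a real run you would pipe the sapcontrol output in instead.

```shell
#!/bin/sh
# Sketch: print services whose dispstatus is not GREEN.
# Input format mirrors `sapcontrol -nr <nr> -function GetProcessList`.
non_green() {
  awk -F', ' 'NF >= 3 && $1 != "name" && $3 != "GREEN" { print $1 ": " $3 }'
}

# Demo on an invented sample (a real run would pipe sapcontrol output):
non_green <<'EOF'
name, description, dispstatus, textstatus, starttime, elapsedtime, pid
hdbdaemon, HDB Daemon, GREEN, Running, 2024 01 01 08:00:00, 1:00:00, 4242
hdbindexserver, HDB Indexserver, GRAY, Stopped, , ,
EOF
# → hdbindexserver: GRAY
```

An empty result means all listed services report GREEN; anything printed is a service to investigate first.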
You also need to check whether the system can be reached over the SQL interface. When you
are already connected to the SAP HANA host via an SSH session, check the SQL interface
with the following command:
Note:
The default port number range for tenant databases is 3<instance>40 -
3<instance>99.
As <sid>adm user:
Enter your password when requested. You are now in the HDBSQL terminal. From the
HDBSQL terminal you can get SAP HANA connection information by executing the command:
\s
Caution:
It's important to test all your tenants, because the tenants have different SQL
ports and can be stopped independently of a running SAP HANA database
system.
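The hdbsql call for each tenant can be assembled from the host, SQL port, and database user. The helper below only builds the command string (a dry run, so nothing connects); the host and user names are placeholders, and the ports follow the default tenant range 3<instance>40 - 3<instance>99 from the note above.

```shell
#!/bin/sh
# Sketch: assemble an hdbsql connect command per tenant (dry run only).
# Arguments: <host> <sql_port> <db_user>; all values below are placeholders.
hdbsql_cmd() {
  echo "hdbsql -n $1:$2 -u $3"
}

# One line per tenant SQL port you want to test:
hdbsql_cmd hana01.example.com 30041 SYSTEM   # → hdbsql -n hana01.example.com:30041 -u SYSTEM
hdbsql_cmd hana01.example.com 30044 SYSTEM
```

Running the printed commands from the end-user LAN and from the application server network covers the network-side test described next.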
Checking the SQL connection only from the local host isn't sufficient, because SAP HANA SQL
traffic could be blocked somewhere on the network. To rule this out, you should also perform
an HDBSQL connection test over the network, from the end-user LAN and from the application
server network.
From the SAP S/4HANA ABAP application server, as <sid>adm user:
Enter your password when requested. You are now in the HDBSQL terminal. From the
HDBSQL terminal you can get SAP HANA connection information by executing the command:
\s
If the issue is due to a hardware or a software failure, it is important to save log files on the
Linux operating system or at the storage system level for later analysis.
For further specific steps and guidance on pro-active or reactive actions you can take, see
SAP Note 1999020 — SAP HANA: Troubleshooting when the database is no longer reachable.
The Cockpit Manager is used by the cockpit administrator to register databases and to create
groups and cockpit users for accessing SAP HANA cockpit.
The first step in administrating a tenant or the SYSTEMDB is to register it in the
Cockpit Manager. As soon as a tenant is registered in the Cockpit Manager, the database
administrator can start to use it in the SAP HANA cockpit.
The SAP HANA cockpit home screen shows a high-level aggregated overview of all registered
systems. From this aggregated landscape overview, you can quickly drill down to a
detailed overview of an individual database. In the database overview screen, you find cards
for all important parts of the SAP HANA database. On a card, you see a mini graph of an
important KPI that belongs to the monitoring area the card displays. On these cards you will
also find links that start cockpit applications to analyze the measured KPIs further. Through
this drill-down you can easily find the cause of the problem.
The SAP HANA database explorer tool is integrated into the SAP HANA cockpit. The database
explorer allows you to query information about the database using SQL statements, and to
view information about your database's catalog objects.
In the overview, you can select the Database Directory tile or a dedicated group tile, like the
Group01 in this screenshot, to quickly see the status of the SAP HANA systems.
From the Database Directory tile, you can navigate to your SAP HANA system overview page,
where a detailed status of the selected SAP HANA system is displayed.
Database Directory
The Database Directory gives an aggregated overview of each database for which you are
responsible. In the Database Directory, you can see that a system or tenant is in trouble when
the status Stopped, Running with issues, or Unknown is displayed. To investigate these
problems in more detail, you start the cockpit System Overview page for this system or tenant
by selecting the corresponding line.
When the Database Directory shows that the whole SAP HANA database system is having
problems, it is important to quickly investigate the root cause of the problem so that you can
get the system up and running again.
Even when the SAP HANA database system is down, the cockpit can be used to investigate
the root cause of the problem. This system-down analysis is done via the SAP start service
connection.
When an important system is down, you want to start it again as soon as possible. This is a
logical course of action, but it can make root cause analysis more difficult, because during a
restart important low-level log or trace files can be overwritten. A best practice is therefore
to save all the important log and trace files for later investigation before starting the
system.
To support this, SAP HANA provides a full system information dump. This information dump
lets you control which logs to save, so you can use these saved logs to troubleshoot the issue
after you have restarted the SAP HANA database.
In the Database Directory, you can also specify the database user credentials required to drill
down to an individual database, which is necessary unless single sign-on is in effect for that
database.
Database Overview
The Database Directory shows a high-level status overview of all the databases belonging to
groups to which you have been granted access. For each database, you can drill down for
more information.
When you open the cockpit's Database Overview page for a system that shows the status
Stopped, No SQL access, or Unknown, it is very likely that you cannot connect to the SAP
HANA database using an SQL connection. The cockpit starts, but cannot retrieve the
monitoring data using SQL. As a result, almost all cards show the text Cannot load
data.
It is best to open the Database Overview page of the SYSTEMDB instead, because from this
page you can retrieve some information via the SAP host agent infrastructure. Via the
SYSTEMDB, you can get information on the status of the SAP HANA services.
This means that you cannot use the default monitoring cards to investigate the problem
further. Depending on the error situation, the SAP HANA cockpit presents the cards that are
relevant to the investigation. In a system-down situation, you will find the Manage full system
information dumps application available, but not Troubleshoot unresponsive systems, because
the SAP HANA cockpit cannot connect to the SAP HANA index server.
3. Choose Collect Diagnostics, and in the dropdown list choose Collect from Existing Files or
Create from Runtime Environment.
4. In the pop-up window choose the information items you want to collect. In the bottom-
right corner, choose Start Collecting.
Note:
If you are connected to the system database of a multiple-container system, only
information from the system views of the system database is collected.
Information from the system views of tenant databases is not collected,
regardless of this option setting.
Information from system views is collected through the execution of SQL statements, which
may impact performance. In addition, the database must be online, so this option is not
available in diagnosis mode.
The system collects the relevant information and saves it to a ZIP file. This may take some
time and can be allowed to run in the background.
If you are connected to the system database of a multiple-container system, information from
all tenant databases is collected and saved to separate ZIP files.
If you are logged on as the operating system user, <sid>adm, the fullSystemInfoDump.py
script is part of the server installation and can be run from the command line. It is located in
the directory $DIR_INSTANCE/exe/python_support.
Hint:
You can use the predefined shell alias cdpy to quickly navigate to the
python_support directory.
You can modify the command with several command line options. To see the available
options, specify the option --help. All options related to getting a system dump are fully
described in SAP Note 1732157.
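A typical invocation can be sketched as follows. The --days option is the one mentioned above, the script location follows the $DIR_INSTANCE/exe/python_support convention from the text, and the guard makes the snippet a harmless no-op on hosts where the script is absent. The fallback path is a placeholder.

```shell
#!/bin/sh
# Sketch: collect a dump restricted to the last two days of trace files.
# $DIR_INSTANCE is set in the <sid>adm environment; the fallback path
# below is a placeholder.
run_dump() {
  pydir="${DIR_INSTANCE:-/usr/sap/XXX/HDB00}/exe/python_support"
  if [ -f "$pydir/fullSystemInfoDump.py" ]; then
    ( cd "$pydir" && python fullSystemInfoDump.py --days=2 )
  else
    echo "fullSystemInfoDump.py not found under $pydir"
  fi
}
run_dump
```

Run it as the <sid>adm user so that $DIR_INSTANCE and the Python environment are set up correctly.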
If the system can be reached by SQL (and you have not specified the option --nosql), the
script starts collecting diagnosis information. If the system cannot be reached by SQL, the
script starts collecting support information, but does not export data from system views.
The script creates a ZIP file containing the collected information and saves it to the directory
$DIR_GLOBAL/sapcontrol/snapshots. $DIR_GLOBAL typically points to /usr/sap/
<SID>/SYS/global.
The name of the ZIP file is structured as follows:
fullsysteminfodump_<SID>_<DBNAME>_<HOST>_<timestamp>.zip.
The timestamp in the file name is in Coordinated Universal Time (UTC). The HOST and SID are
taken from the sapprofile.ini file.
The output directory for the ZIP file is shown as console output when the script is running, but
you can look it up with the command hdbsrvutil -z | grep DIR_GLOBAL= .
Note:
All of the following file types are collected unless the option --rtedump is
specified, in which case only runtime environment (RTE) dump files are created
and collected.
Log File
All information about what has been collected is shown as console output, and is written to a
file named log.txt, which is stored in the ZIP file.
Trace Files
Each of the following trace files is put into a file with the same name as the trace file. For
storage reasons, only the trace files from the last seven days are collected unabridged; older
trace files are not collected. You can change this behavior using the option --days, or with
the options --fromDate and --toDate.
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/
compileserver_alert_<SAPLOCALHOST>.trc
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/
compileserver_<SAPLOCALHOST>.<...>.trc
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/daemon_<SAPLOCALHOST>.<...>.trc
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/
indexserver_alert_<SAPLOCALHOST>.trc
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/
indexserver_<SAPLOCALHOST>.<...>.trc
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/
nameserver_alert_<SAPLOCALHOST>.trc
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/nameserver_history.trc
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/
nameserver_<SAPLOCALHOST>.<...>.trc
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/
preprocessor_alert_<SAPLOCALHOST>.trc
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/
preprocessor_<SAPLOCALHOST>.<...>.trc
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/
statisticsserver_alert_<SAPLOCALHOST>.trc
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/
statisticsserver_<SAPLOCALHOST>.<...>.trc
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/xsengine_alert_<SAPLOCALHOST>.trc
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/xsengine_<SAPLOCALHOST>.<...>.trc
Configuration Files
All configuration files are collected unabridged and stored in a file with the same name as
the .ini file:
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/attributes.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/compileserver.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/daemon.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/executor.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/extensions.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/filter.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/global.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/indexserver.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/inifiles.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/localclient.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/mimetypemapping.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/nameserver.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/preprocessor.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/scriptserver.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/statisticsserver.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/validmimetypes.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/exe/config/xsengine.ini
● $DIR_INSTANCE/<SAPLOCALHOST>/trace/backint.log
Crashdump Information
Crashdump files for services are collected unabridged.
Kerberos Files
The following Kerberos files are collected:
● /etc/krb5.conf
● /etc/krb5.keytab
System Views
If the collection of system views is not excluded (that is, the option --nosql is not specified),
all rows of the following system views (with the exceptions mentioned) are exported into a
CSV file with the name of the view:
Note:
If you are connected to the system database of a multiple-container system, only
information from the system views of the system database is collected.
Information from the system views of tenant databases is not collected,
regardless of this option setting.
Note:
If you trigger the collection of diagnosis information from the SAP HANA cockpit
for offline administration, information from system views cannot be collected
because it does not use an SQL connection.
● SYS.M_DATABASE_HISTORY
● SYS.M_DEV_ALL_LICENSES
● SYS.M_DEV_PLE_SESSIONS_
● SYS.M_DEV_PLE_RUNTIME_OBJECTS_
● SYS.M_EPM_SESSIONS
● SYS.M_INIFILE_CONTENTS
● SYS.M_LANDSCAPE_HOST_CONFIGURATION
● SYS.M_RECORD_LOCKS
● SYS.M_SERVICE_STATISTICS
● SYS.M_SERVICE_THREADS
● SYS.M_SYSTEM_OVERVIEW
● SYS.M_TABLE_LOCATIONS
● SYS.M_TABLE_LOCKS
● SYS.M_TABLE_TRANSACTIONS
● _SYS_EPM.VERSIONS
● _SYS_EPM.TEMPORARY_CONTAINERS
● _SYS_EPM.SAVED_CONTAINERS
● _SYS_STATISTICS.STATISTICS_ALERT_INFORMATION
● _SYS_STATISTICS.STATISTICS_ALERT_LAST_CHECK_INFORMATION
Note:
Only the first 2,000 rows are exported.
● _SYS_STATISTICS.STATISTICS_ALERTS
Note:
Only the first 2,000 rows are exported.
● _SYS_STATISTICS.STATISTICS_INTERVAL_INFORMATION
● _SYS_STATISTICS.STATISTICS_LASTVALUES
● _SYS_STATISTICS.STATISTICS_STATE
● _SYS_STATISTICS.STATISTICS_VERSION
The first 2,000 rows of all remaining tables in the schema _SYS_STATISTICS are exported,
ordered by the SNAPSHOT_ID column.
In the SAP HANA cockpit - Home screen, select the Database Directory or your personal group
tile. In the Database Directory screen, choose the Manage Databases link. To start the tenant
database, in the Manage databases screen, choose the Start button. This will perform a
normal tenant database start.
In the SAP HANA cockpit - Home screen, choose the Database Directory tile and choose the
SYSTEMDB that is stopped. In the Database Overview screen, search for the Services card. To
start the database system, in the Services card choose the Start Database button. This will
perform a normal database start of the SAP HANA database.
Note:
In newer versions of the SAP HANA cockpit, you can choose the Start Database
button directly from the Database Overview screen.
1. The data volume of each service is accessed to read and load the restart record.
● Write transactions that were open when the database was stopped are rolled back.
● Changes to committed transactions that were not written to the data area are rolled
forward.
The first column tables start being reloaded into memory as they are accessed for roll
forward.
Note:
Because a regular or "soft" shutdown writes a savepoint, there are no replay
log entries to be processed in this case.
7. Column tables that are marked for preload, and their attributes, are asynchronously
loaded in the background (if they have not already been loaded as part of log replay).
The preload parameter is configured in the metadata of the table. This feature is useful,
for example, to make tables and columns that are used by important business processes
available more quickly.
8. Column tables and their attributes that were loaded before restart, start reloading
asynchronously in the background (if they have not already been loaded as part of log
replay or because they are marked for preload).
During normal operation, the system tracks the tables that are currently in use. This list is
used as a basis for reloading tables after a restart.
Reloading column tables, as described in steps 7 and 8, restores the database to a fully
operational state more quickly. However, it does create performance overhead and may not
be necessary in non-production systems. You can deactivate the reload feature in the
indexserver.ini file by setting the reload_tables parameter in the sql section to false. In
addition, you can configure the number of tables whose attributes are loaded in parallel using
the tables_preloaded_in_parallel parameter in the parallel section of the
indexserver.ini file. This parameter also determines the number of tables that are preloaded in
parallel.
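The two parameters discussed above sit in different sections of indexserver.ini. A sketch of the relevant fragment; the value 5 is only an illustration, not a recommendation:

```ini
[sql]
reload_tables = false

[parallel]
tables_preloaded_in_parallel = 5
```

As with all ini changes, prefer setting these through the cockpit or SQL (ALTER SYSTEM ALTER CONFIGURATION) rather than editing the file by hand.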
Now that the SAP HANA database system is up and running, you can continue to investigate
the failure. If you cannot find the root cause of the failure, open an SAP support message and
attach the diagnosis information collected in the SAP HANA cockpit - Full system information
dump application.
LESSON SUMMARY
You should now be able to:
● Handle system offline situations
LESSON OBJECTIVES
After completing this lesson, you will be able to:
● Handle system hanging but reachable situations
There are various reasons for a system to hang, or seem to be hanging from an end-user
perspective. The database is said to be hanging when it no longer responds to queries that are
executed against it.
The source of the system standstill might be related to any of the components involved, for
example, the storage, OS and hardware, network, SAP HANA database or the application
layer. For troubleshooting it is essential to collect information about the context of the active
threads in the SAP HANA database.
● ........................................
● ........................................
● ........................................
● ........................................
● ........................................
The following list of issues can cause a system hang state, or a state where the system seems
to hang from the end-user perspective:
● Log volume full caused by either a full disk, a quota setting or failed log backups
● Savepoint lock conflict with long-running update
● Wrong configuration of transparent huge page or OS page cache
● The Translation Lookaside Buffer (TLB) shootdown
● High context switches caused by many SqlExecutor or JobExecutor threads
● Huge Multiversion Concurrency Control (MVCC) versions
● High system CPU usage caused by non-HANA applications
● Frequent Out of Memory (OOM) situations that lead to a performance drop
Note:
What does "Translation Lookaside Buffer (TLB) shootdown" mean?
A Translation Lookaside Buffer (TLB) is a cache of the translations from virtual
memory addresses to physical memory addresses. When a processor changes
the virtual-to-physical mapping of an address, it needs to tell the other processors
to invalidate that mapping in their caches.
As SQL statements usually cannot be executed for analysis, you should perform the following
steps if it is still possible to log on to the OS of the master host (for example, as the <sid>adm
user). Also see SAP Note 1999020: SAP HANA: Troubleshooting when the database is no
longer reachable for further specific steps and guidance on proactive or reactive actions you
can take.
● ........................................
● ........................................
● ........................................
● ........................................
The SAP HANASitter script checks, by default once an hour, whether SAP HANA is online and
primary. If so, it starts to track. Tracking includes regularly (by default, every minute) checking
whether SAP HANA is responsive. If it is not, it starts to record.
Recording can include writing call stacks of all active threads, recording run time dumps,
index server gstacks, and/or kernel profiler traces. By default, nothing is recorded.
If SAP HANA is responsive, the script checks many of the critical features of SAP HANA. By
default, it checks whether there are more than 30 active threads; if there are, it starts to
record.
When the script has finished recording, it exits. The script can be configured to restart using
the command line.
When the script has finished all the tests successfully, it sleeps for one hour, before it starts
all the checks again.
2. Create a user key (for example, SYSTEMKEY, but you can use a different name) in the
hdbuserstore.
Check if the SAP HANA File Systems Still Have Free Space
In a system-hang situation, the execution of SQL statements is probably no longer possible.
If you can still log on to the operating system, try to perform the following steps on the OS
of the master host.
In cases where logs cannot be written, all DML statements enter a wait state. This can also
cause opening new connections to fail, because the system internally executes DML
statements during connection setup. Typically, a full log volume is the cause.
The "log volume full" situation is caused either by the disk being full or by hitting the
quota setting. To investigate more deeply, perform the following steps:
1. Check for the Internal Disk-Full Event (Alert 30) in the indexserver trace.
2. Check whether the system is running out of disk space using the command df -h in an SSH
shell on the OS.
3. Check whether the system is running out of inodes (NFS) using the command df -i.
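The df checks in steps 2 and 3 can be combined into one helper. A minimal sketch, assuming a POSIX-compatible df; /hana/log is the usual SAP HANA log volume mount point, but the default here is / so the snippet runs anywhere.

```shell
#!/bin/sh
# Sketch: report space and inode usage of a file system in one line.
fs_usage() {
  vol=${1:-/}
  space=$(df -P "$vol" | awk 'NR==2 { print $5 }')    # Use% column
  inodes=$(df -Pi "$vol" | awk 'NR==2 { print $5 }')  # IUse% column
  echo "$vol space=$space inodes=$inodes"
}

# fs_usage /hana/log
fs_usage /
```

A space or inode value close to 100% on the log volume points directly at the "log volume full" scenario described above.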
Once you have resolved the issue (for example, freed up disk space), you may need to
manually mark the internal event as handled. You can do this on the Overview tab of the
Administration editor in the SAP HANA studio, or by executing the following SQL statements:
ALTER SYSTEM SET EVENT ACKNOWLEDGED '<host>:<port>' <id>
ALTER SYSTEM SET EVENT HANDLED '<host>:<port>' <id>
1. Use SAP Note 1813020: How to generate a runtime dump on SAP HANA to collect a
runtime dump. In the generated dump, look for the following combination of call stacks in
many threads:
…
DataAccess::SavepointLock::lockShared(…)
DataAccess::SavepointSPI::lockSavepoint(…)
…
and for one of the following call stacks, which indicates that a thread is in the savepoint phase:
…
DataAccess::SavepointLock::lockExclusive()
DataAccess::SavepointImpl::enterCriticalPhase(…)
…
2. If you are running SAP HANA 1.0 (revision 97 or older), check whether the symptoms match
the description in SAP Note 2214279: Blocking situation caused by waiting writer holding
consistent change lock. If so, apply the parameter cch_reopening_enabled as
described in the SAP Note.
See also SAP Note 2313619. It is useful to capture one or several runtime dumps (SAP Note
1813020) so that an accurate root cause analysis can be done later.
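Searching a saved runtime dump for the savepoint call-stack pattern above can be done with a simple grep. A sketch; the dump file path in the comment is a placeholder for whatever the runtime dump procedure produced on your system.

```shell
#!/bin/sh
# Sketch: list lines in a runtime dump that match the savepoint stacks.
savepoint_hits() {
  grep -n 'DataAccess::Savepoint' "$@" 2>/dev/null || echo "no match"
}

# Placeholder path; point this at your actual runtime dump files:
# savepoint_hits /usr/sap/<SID>/HDB00/<host>/trace/indexserver_*.trc
```

Many hits for lockShared combined with one thread in enterCriticalPhase is the pattern described in step 1 above.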
If you consider the possibility of an SAP HANA internal deadlock, you can also run the
deadlock detector functionality of hdbcons (SAP Note 2222218): hdbcons
'deadlockdetector wg -w -o <file_name>.dot'.
Note:
The generated DOT file can be converted to a PDF or a GIF file using the following
commands:
● To generate a PDF file: dot -Tpdf -o <pdffile> /usr/sap/<SID>/
HDB00/work/HA215_<SID>_DeadlockCheck.dot
● To generate a GIF file: dot -Tgif -o <giffile> /usr/sap/<SID>/
HDB00/work/HA215_<SID>_DeadlockCheck.dot
See SAP Note 1999020: SAP HANA: Troubleshooting when the database is no longer
reachable for further specific steps and guidance on the proactive or reactive actions you can
take.
To troubleshoot a system in a hang state, there is a function in SAP HANA cockpit called
Troubleshoot unresponsive systems. When using this function, information is collected
through the SAP host agent. The communication between the web browser and the SAP host
agent is always done over HTTPS, which requires that the SAP host agent has a Secure
Sockets Layer (SSL) certificate (PSE) in its security directory.
The information is collected into a file named emergency_info_<SID>.zip by the Python script
emergencyInfo.py. This script connects to the index server, using the hdbcons interface.
The script tries to collect information about the open connections, running transactions, and
threads. It also shows blocked transactions. If the index server is unavailable, no information
is shown.
The Troubleshoot Unresponsive System function organizes information about the system by
tab. You can diagnose the following:
● Connections
● Transactions
● Blocked transactions
● Threads
Connections Tab
Analyzing the sessions connected to your SAP HANA database helps you identify which
applications, or which users, are currently connected to your system, as well as what they are
doing in terms of SQL execution.
On the CONNECTIONS tab, you can see information about the current connections to the SAP
HANA server. This information includes connection start time, ID, user name, and status. If
there are many connections open to the server, it can lead to congestion and may result in the
server becoming unresponsive.
On the CONNECTIONS tab, you can use the Cancel Connection button to stop a single
connection. To do this, select the connection that you want to cancel, and choose Cancel
Connection.
You can stop all the transactions that are currently running by choosing Cancel All
Transactions.
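The same connection information is available through SQL, which is useful when you prefer a console over the cockpit. The following sketch queries the M_CONNECTIONS monitoring view and then cancels a single connection; the connection ID 300514 is a placeholder, not a value from this course.

```sql
-- List current connections with start time, user, and status
SELECT connection_id, start_time, user_name, connection_status
  FROM m_connections
 ORDER BY start_time;

-- Cancel the statement currently executing in one connection
ALTER SYSTEM CANCEL SESSION '300514';

-- Or close the connection entirely; its transaction is rolled back
ALTER SYSTEM DISCONNECT SESSION '300514';
```

Canceling or disconnecting sessions requires the system privilege SESSION ADMIN.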
Transactions Tab
On the TRANSACTIONS tab, you can see information about the current transactions in the
SAP HANA system. This information includes connection and transaction ID, allocated
memory, and user name. The information shown in the transactions table gives you a good
insight into current activity on the system.
Using the connection ID, you can link each transaction to the corresponding entry on the
CONNECTIONS tab.
You can stop all the transactions that are currently running by choosing Cancel All
Transactions.
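As a sketch of the same analysis in SQL, the following query joins M_TRANSACTIONS with M_CONNECTIONS to link each transaction to its connection and user; exact column names can differ between revisions, so verify them against the SAP HANA SQL and System Views Reference.

```sql
-- Current transactions, linked to their connection and user
SELECT t.host, t.port, t.connection_id, t.transaction_id,
       t.transaction_status, c.user_name
  FROM m_transactions t
  LEFT JOIN m_connections c
    ON t.connection_id = c.connection_id
 ORDER BY t.transaction_id;
```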
Blocked Transactions Tab
On the BLOCKED TRANSACTIONS tab, you can investigate whether there are blocked
transactions in your system.
Blocked transactions are transactions that cannot be processed further because they need to
acquire transactional locks (record or table locks) that are currently held by another
transaction. Transactions can also be blocked while waiting for other resources, such as the
network or disk access (database or metadata locks).
The type of lock held by the blocking transaction (record, table, or metadata) is indicated in
the Lock Type column.
The lock mode is indicated in the Transactional Lock Type column.
● Exclusive row-level locks prevent concurrent write operations on the same record. They
are acquired implicitly by update and delete operations, or explicitly with the SELECT FOR
UPDATE statement.
● Table-level locks prevent operations on the content of a table from interfering with changes
to the table definition (such as drop table or alter table). DML operations on the table content
require an intentional exclusive lock, while changes to the table definition (DDL operations)
require an exclusive table lock. There is also a LOCK TABLE statement for explicitly locking a
table. Intentional exclusive locks can be acquired if no other transaction holds an exclusive
lock for the same object. Exclusive locks require that no other transaction holds a lock for the
same object (neither intentional exclusive nor exclusive).
For more detailed analysis of blocked transactions, information about low-level locks is
available in the columns Lock Wait Name, Lock Wait Component, and Thread ID of Low-Level
Lock Owner. Low-level locks are locks acquired at the thread level. They manage code-level
access to a range of resources (for example, internal data structures, network, or disk). Lock
wait components group low-level locks by engine component or resource.
By choosing Cancel All Transactions, you can stop all currently running transactions.
Because the Delta Table Merge needs to lock tables to proceed, it is a common cause of
blocked transactions. Another job displayed by this monitor is the savepoint write, which
needs to acquire a global database lock in its critical phase. A common issue is a flaw in the
application coding that does not commit a write transaction. Such a transaction blocks any
other transaction that needs to access the same database object. To remedy the situation,
close the blocking transaction.
In the UI table, the blocked transactions are displayed directly beneath the blocking
transaction.
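If the cockpit is not available, the same blocking information can be read directly from the M_BLOCKED_TRANSACTIONS monitoring view. This sketch selects the columns that correspond to the Waiting Schema Name, Waiting Object Name, and Waiting Record ID fields discussed above; check the exact column names against your revision.

```sql
-- Blocked transactions and the transactions that hold the locks
SELECT blocked_transaction_id, lock_owner_transaction_id,
       waiting_schema_name, waiting_object_name, waiting_record_id,
       lock_type, lock_mode
  FROM m_blocked_transactions;
```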
First, determine whether only one, or a few, transactions are blocking many other
transactions. To do this, open the Blocked Transactions tab and check the number of
blocking transactions. If there are only a few blocking transactions, there is probably an issue
on the application side. To resolve the problem, use the following techniques:
1. If only one transaction is blocked, contact the application user and the developer. First,
ask the user to close the application; second, check whether there is a general issue with
the application code.
If you cannot contact the user, you can kill the transaction or kill the client process
that opened the session. The transaction is then rolled back. The session cancellation may
take some time to succeed. If it takes longer than 30 seconds, consider this a bug and
contact development support.
If the session cancellation takes too long or does not complete at all, you can kill the client
process that opened the session. This also terminates the blocking transaction. As a
prerequisite, you must have access to the client machine.
2. If a large number of transactions are blocked, you must find out whether a specific access
pattern is causing the issue. If multiple transactions try to access the same database
objects with write operations, they block each other. To check if this is happening, open
the Blocked Transaction Monitor and analyze the Waiting Schema Name, Waiting Object
Name, and Waiting Record ID columns. If you find a lot of transactions that are blocking
many other transactions, investigate whether you can do the following:
● If a background job that issues many write transactions (for example, a data load job)
is running, reschedule it to a period with a low user load.
● Partition tables that are accessed frequently to avoid clashes. See the SAP HANA
Administration Guide for more details on partitioning.
3. If you cannot identify specific transactions or specific database objects that lead to
transactions being blocked, you can assume that there is a problem with the database
itself or with its configuration, an issue with the delta merge (for example, mass write
operation on a column table), or a long savepoint duration.
Threads Tab
On the THREADS tab, you can identify which statements or procedures are being executed
and at what stage they are, who else is connected to the system, and whether any internal
processes are running as well. The information shown includes thread type, ID, and thread
status. You can also find information about the user and the waiting time. The information
shown in the table helps you to identify transactions with high average wait times. With the
user name, thread ID, and wait time columns, you can identify which thread is causing
problems.
On this tab, you can identify long-running threads and those threads that are blocked for an
inexplicably long period of time.
In the case of an emergency, choose Cancel All Transactions.
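The thread information shown on this tab comes from the M_SERVICE_THREADS monitoring view. As a sketch, the following query lists the ten longest-running active threads; the column names are as documented for recent revisions, but verify them against your system views reference.

```sql
-- The ten longest-running active threads
SELECT host, port, thread_id, thread_type, thread_state, duration
  FROM m_service_threads
 WHERE is_active = 'TRUE'
 ORDER BY duration DESC
 LIMIT 10;
```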
LESSON SUMMARY
You should now be able to:
● Handle system hanging but reachable situations
LESSON OBJECTIVES
After completing this lesson, you will be able to:
● Analyze a suddenly slow system
The following issues can cause a system to suddenly become slow (that is, from the end-user
perspective the SAP HANA system seems to be performing slowly):
● Hardware failures at the server level (read errors on bad memory)
● Hardware failures at the storage level (read errors on disk)
● Hardware failures at the network level (packet collisions on switches or a router)
● Software errors at the OS level (an OS command on Linux using 100% CPU, or memory
swapping to disk)
If the slow system is caused by something at the OS level, it is important to save log files on
the Linux OS or at storage system level for later analysis.
Usually, in a slow system scenario, the system can still be accessed through SQL, but it takes
longer to investigate. The normal SQL access should be used to further analyze the possible
root cause.
If the slow system is caused by something in the SAP HANA database, it is important to
investigate the SQL query and save log and trace files at the SAP HANA database level for
further analysis, before terminating sessions or threads.
To launch the overview, drill down to the name of the database from the Database Directory
or from a group. Unless your administrator has enabled single sign-on, you must connect to
the database with a database user that has the system privilege CATALOG READ and the
SELECT privilege on the _SYS_STATISTICS schema.
Manage Services
Manage Services provides you with detailed information about database services for an
individual database.
Note:
Not all the columns listed in the following figure are visible by default. You can add
and remove columns in the table personalization dialog, which you open by
choosing the personalization icon in the table toolbar.
The following list gives an overview of the information available on each service:
● Host: The name of the host on which the service is running
● Status: The status of the service. The following statuses are possible:
- Running
- Running with Issues
- Stopped
- Not Running
To investigate why the service is not running, you can navigate to the crashdump file,
created when the service stopped.
Note:
The crashdump file opens in the Trace tool of the SAP HANA Web-based
Development Workbench. For this, you need the role
sap.hana.xs.ide.roles::TraceViewer or the parent role
sap.hana.xs.ide.roles::Developer.
● Service: The service name, for example, indexserver, nameserver, xsengine, and so on
● Role: The role of the service in a failover situation. Automatic failover happens when the
service or the host on which the service is running fails. The following values are possible:
● Memory: Choosing the mini chart opens the Memory Analysis app for a more detailed
breakdown of memory usage. The following values are available:
- Used Memory (MB): The amount of memory currently used by the service. Choosing
the mini chart opens the Memory Analysis app for a more detailed breakdown of
memory usage.
- Peak Memory (MB): The highest amount of memory ever used by the service
- Effective Allocation Limit (MB): The effective maximum memory pool size that is
available to the process considering the current memory pool sizes of other processes
- Memory Physical on Host (MB): The total memory available on the host
- All Process Memory on Host (MB): The total used physical memory and swap memory
on the host
- Allocated Heap Memory (MB): The heap part of the allocated memory pool
- Allocated Shared Memory (MB): The shared memory part of the allocated memory pool
- Allocation Limit (MB): The maximum size of the allocated memory pool
- CPU Process (%): The CPU usage of the process
- CPU Host (%): The CPU usage on the host
- Memory Virtual on Host (MB): The virtual memory of the host
- Process Physical Memory (MB): The process physical memory used
- Process Virtual Memory (MB): The process virtual memory
- Shrinkable Size of Caches (MB): The memory that can be freed in the event of a
memory shortage
- Size of Caches (MB): The part of the allocated memory pool that can potentially be
freed in the event of a memory shortage
- Size of Shared Libraries (MB): The code size (including shared libraries)
- Size of Thread Stacks (MB): The size of the service thread call stacks
- Used Heap Memory (MB): The amount of the process heap memory used
- Used Shared Memory (MB): The amount of the process shared memory used
- SQL Port: The SQL port number
- Process ID: The process ID
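Most of the memory KPIs above can also be read from the M_SERVICE_MEMORY monitoring view. The following sketch converts a few of them to MB, matching the units shown in the cockpit; the column selection is an assumption to check against the system views reference for your revision.

```sql
-- Per-service memory KPIs, converted to MB
SELECT host, port, service_name,
       ROUND(total_memory_used_size / 1024 / 1024)     AS used_mb,
       ROUND(effective_allocation_limit / 1024 / 1024) AS eff_limit_mb,
       ROUND(heap_memory_used_size / 1024 / 1024)      AS heap_used_mb,
       ROUND(shared_memory_used_size / 1024 / 1024)    AS shared_used_mb
  FROM m_service_memory
 ORDER BY used_mb DESC;
```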
Operations on Services
As an administrator, you may need to perform certain operations on all or selected services,
for example, start missing services, or stop or kill a service.
You can perform several operations on database services from Manage Services. You can
trigger these operations by selecting the service, and choosing the required option in the
footer toolbar.
Choose Start Missing Services to start inactive services. This can only be performed on a
tenant database if you drill down to Manage Services through the system database.
Choose Stop Service to stop the selected service normally. The service is then typically
restarted.
Choose Kill Service to stop the selected service immediately and, if the related option is
selected, create a crashdump file. The service is then typically restarted.
Choose Add Service to add the service you selected from the list. This can only be performed
on a tenant database if you drill down to Manage Services through the system database.
Services cannot be added to the system database itself. To add a service, you must have the
EXECUTE privilege on the stored procedure SYS.UPDATE_LANDSCAPE_CONFIGURATION.
Choose Remove Service to remove the selected service. This can only be performed on a
tenant database if you drill down to Manage Services through the system database.
You can only remove services that have their own persistence. If data is still stored in the
service's persistence, it is re-distributed to other services.
You cannot remove the following services:
● Name server
● Master index server
● Primary index server on a host
To remove a service, you must have the EXECUTE privilege on the stored procedure
SYS.UPDATE_LANDSCAPE_CONFIGURATION.
Choose Reset Memory Statistics to reset all memory statistics for all services. This can only
be performed on a tenant database if you drill down to the Manage Services application
through the system database.
Peak used memory is the highest recorded value for used memory since the last time the
memory statistics were reset. This value is useful for understanding the behavior of used
memory over time and under peak loads. Resetting peak used memory allows you, for
example, to establish the impact of a certain workload on memory usage. If you reset peak
used memory and run the workload, you can then examine the new peak used memory value.
Choose Go To Alerts to display the alerts for this database.
The SAP HANA database provides several features in support of high availability, one of which
is service auto-restart. In the event of a failure, or an intentional intervention by an
administrator that disables one of the SAP HANA services, the service auto-restart function
automatically detects the failure and restarts the stopped service process.
The Memory Analysis app enables you to visualize and explore the memory allocation of every
service of a selected host during a specified time period. If you notice an increase in overall
memory usage, you can investigate whether it is due to a particular component,
subcomponent, or table.
The upper chart provides the following data:
● Global Allocation Limit: This is the global_allocation_limit for the host (as set in the
global.ini configuration file).
● Allocated Memory: This is the pool of memory pre-allocated by the host for storing in-
memory table data, thread stacks, temporary results, and other system data structures.
● Total Used Memory: This is the total amount of memory used by SAP HANA, including
program code and stack, all data and system tables, and the memory required for
temporary computations.
Move the vertical selection bar in the upper chart to populate the data in the lower chart. The
vertical selection bar snaps to the closest time for which there is collected data for the
components. When you select the Components tab, the lower chart displays the Used
Memory by Component.
The Components tab provides the following detailed information:
● Used Memory by Component: For the specific time (chosen by the vertical selection bar in
the upper chart), the components of the selected service are listed in descending order of
the amount of used memory.
● Used Memory by Type: This donut chart displays a visual representation of the types of
used memory for the specified time.
● Components Used Memory History: If you select the checkbox of one or more
components, the used memory history chart is populated.
The Subcomponents tab displays more detailed memory use. You can filter by component
type. You can move through the collected data points by using the arrow buttons. The
following information is displayed:
● Used Memory by Subcomponent: Subcomponents of the selected component are listed in
descending order of used inclusive memory for the specific time (chosen by the vertical
selection bar in the upper chart). By choosing a subcomponent, you can expand the list.
● Filter by Component Name: To further refine the displayed subcomponent data, select the
filter icon to specify one or more component names.
● Subcomponents Used Memory History: Selecting the checkbox of one or more
subcomponents populates the used memory history chart.
The Tables tab shows detailed statistics on the memory used by data tables. The Tables tab
shows the following information:
● Top Ten Tables by Size: This displays the breakdown of memory usage of the 10 highest
consuming tables for the specific time (chosen by the vertical selection bar in the upper
chart).
● Top Ten Tables by Growth: This displays the memory usage of the 10 tables with the
largest change in consumption for the selected time period. By hovering over the data, you
can see the Previous Size memory usage value from the beginning of the time period and
the Growth during the time period (where the current size of the table is the sum of
Previous Size and Growth).
The following system views provide information from which the current and historical
memory allocation is calculated:
● HOST_RESOURCE_UTILIZATION_STATISTICS
● HOST_SERVICE_MEMORY
● HOST_SERVICE_COMPONENT_MEMORY
● HOST_HEAP_ALLOCATORS
● GLOBAL_ROWSTORE_TABLES_SIZE_BASE
● HOST_COLUMN_TABLES_PART_SIZE
All views are in the _SYS_STATISTICS schema. For more information about these views, see
the SAP HANA SQL and System Views Reference Guide.
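As a sketch of how the Memory Analysis app derives its history, the following query reads one of these views for the last day; the column names are assumptions based on the view's documented layout, so verify them in the reference guide before use.

```sql
-- Used memory per component over the last 24 hours
SELECT server_timestamp, host, component,
       ROUND(used_memory_size / 1024 / 1024) AS used_mb
  FROM _sys_statistics.host_service_component_memory
 WHERE server_timestamp > ADD_DAYS(CURRENT_TIMESTAMP, -1)
 ORDER BY server_timestamp, used_mb DESC;
```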
Performance Monitor
Analyzing the performance of the SAP HANA database over time can help you to pinpoint
bottlenecks, identify patterns, and forecast requirements.
Use the Performance Monitor to visually analyze historical performance data across a range
of key performance indicators related to memory, disk, and CPU usage.
Open the Performance Monitor by choosing the chart or the Show All link on the Memory
Usage, CPU Usage, or Disk Usage card on the homepage of the SAP HANA cockpit.
The Performance Monitor can be reached through the Memory Usage, CPU Usage, and Disk
Usage cards. All three cards point to the same Performance Monitor, but the displayed data
(memory, CPU, or disk) depends on the selected database. The general working of the
Performance Monitor is the same for all three of the cards.
The Performance Monitor opens displaying the load graph for the selected database
(memory, CPU, or disk). The load graph initially visualizes resource usage of all hosts and
services listed on the left, according to the default key performance indicator (KPI) group of
the selected database.
You can customize the information displayed on the load graph in several ways, for example:
● Define the monitored time frame.
● Use the Add Chart button to create custom charts displaying the host and services
selection, as well as selected KPIs. For a list of all available KPIs, see Key Performance
Indicators.
● Update the displayed data by the selected refresh rate.
● Zoom into a specific time by changing the duration.
● In the Settings menu, customize your graphs by including hosts and services as well as
additional KPIs in the Charts tab. In the Alerts tab, configure alerts per category and
priority status.
The Sessions card displays the number of active and total sessions.
Open the Sessions card. The Sessions page allows you to monitor all sessions in the current
landscape. You can see the following information:
● Active/inactive sessions and their relation to applications
● Whether a session is blocked and, if so, which session is blocking it
● The number of transactions that are blocked by a blocking session
● Statistics such as average query runtime and the number of DML and DDL statements in a
session
● The operator currently being processed by an active session
To support monitoring and analysis, you can perform the following actions from the Sessions
page:
● To cancel a session, choose Cancel Sessions.
● To save the data sets as a text or HTML file, choose Save As.
The Threads card provides you with information about the number of currently active and
blocked threads in the database.
To open the Threads card, choose either the number of active or blocked threads on the card.
The 1,000 longest-running threads currently active in the database are listed. By default,
threads are listed in order of longest runtime. For each thread, you can see the duration, as
well as the name of the service that is executing it. You can identify the host, the port, and
the thread type, and whether the thread is related to a blocking transaction.
If a thread is involved in a blocked transaction, or is using an excessive amount of memory,
cancel the operation executing the thread by choosing Cancel Operations in the footer
toolbar.
Thread Details
The Threads card provides you with detailed information about the 1,000 longest-running
threads currently active in the database.
Note:
Not all of the columns listed in the following table are visible by default. You can
add and remove columns in the table personalization dialog, which you open by
choosing the personalization icon in the table toolbar.
● Transaction ID: Transaction ID
● Update Transaction ID: Update transaction ID
● Thread Status: Thread state
● Connection Transaction ID: Transaction object ID
● Connection Start Time: Connected time
● Connection Idle Time (ms): Time that the connection is unused and idle
● Connection Status: Connection status, 'RUNNING' or 'IDLE'
● Client Host: Host name of the client machine
● Client IP: IP address of the client machine
● Client PID: Client process ID
● Connection Type: Connection type, Remote, Local, History (remote), or History (local)
● Own Connection: TRUE if own connection, FALSE if not
● Memory Size per Connection: Allocated memory size per connection
● Auto Commit: Commit mode of the current transaction, TRUE if the current connection is
in auto-commit mode, FALSE otherwise
● Last Action: The last action done by the current connection, for example, ExecuteGroup,
CommitTrans, AbortTrans, PrepareStatement, CloseStatement, ExecutePrepared,
ExecuteStatement, FetchCursor, CloseCursor, LobGetPiece, LobPutPiece, LobFind,
Authenticate, Connect, Disconnect, ExecQidItab, CursorFetchItab, InsertIncompleteItab,
AbapStream, TxStartXA, TxJoinXA
● Current Statement ID: Current statement ID
● Current Operator Name: Current operator name
● Fetched Record Count: Sum of the record count fetched by select statements
● Sent Message Size (Bytes): Total size of messages sent by the current connection
● Sent Message Count: Total message count sent by the current connection
● Received Message Size (Bytes): Total size of messages/transactions received by the
current connection
● Received Message Count: Total message/transaction count received by the current
connection
Use the Top SQL Statements card to analyze the current most critical statements running in
the database.
The Top SQL Statements card displays the number of long-running statements and the long-
running blocking situations currently active in the database. Statements are ranked based on
a combination of the following criteria:
● The runtime of the current statement execution.
● The lock wait time of the current statement execution.
● The cursor duration of the current statement execution.
Open the Top SQL Statements card to list the 100 most critical statements currently active in
the database. By default, statements are listed in order of the longest runtime. For each
statement, you can see the full statement string, as well as the ID of the session in which the
statement is running. You can identify the application, the application user, and the database
user running the statement and whether the statement is related to a blocking transaction.
Optionally, you can activate monitoring of the memory consumption of statements by
choosing Enable Memory Tracking in the footer toolbar. Detailed information about the
memory consumption of statement execution is collected and displayed.
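Behind the Enable Memory Tracking button are the resource_tracking parameters in global.ini. As a sketch, the same settings can be activated with SQL; the change takes effect online through WITH RECONFIGURE.

```sql
-- Enable statement memory tracking for the whole system
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
  SET ('resource_tracking', 'enable_tracking') = 'on',
      ('resource_tracking', 'memory_tracking') = 'on'
  WITH RECONFIGURE;
```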
If a statement is involved in a blocked transaction or using an excessive amount of memory,
cancel the session that the statement is running in (or the blocking session) by choosing
Cancel Session in the footer toolbar.
LESSON SUMMARY
You should now be able to:
● Analyze a suddenly slow system
Learning Assessment
1. What kind of query language statements can be used in the SAP HANA Database
Explorer?
Choose the correct answer.
X A SQL
X B MDX
X C XML
X D TCL
2. In a system-down scenario, SQL can be used to access the system to further analyze the
problem.
Determine whether this statement is true or false.
X True
X False
3. Which analysis steps can be performed when the SAP HANA database cannot be reached
using SQL?
Choose the correct answer.
X D Check the hardware with the SAP HANA Hardware Configuration Check Tool.
4. The Troubleshoot Unresponsive Systems function collects its data through the
fullSystemInfoDump.py python script.
Determine whether this statement is true or false.
X True
X False
5. Which of the following KPIs are shown in the Manage Services application?
Choose the correct answers.
X A CPU usage
X B Service status
X C Disk usage
X D Lock status
6. Because SAP HANA is running in-memory, a slow system situation is always caused by a
problem at OS level.
Determine whether this statement is true or false.
X True
X False
1. What kind of query language statements can be used in the SAP HANA Database
Explorer?
Choose the correct answer.
X A SQL
X B MDX
X C XML
X D TCL
Correct! The SAP HANA Database Explorer supports only SQL. Read more about this in
the lesson "Handling System Offline Situations" of the course HA215.
2. In a system-down scenario, SQL can be used to access the system to further analyze the
problem.
Determine whether this statement is true or false.
X True
X False
Correct! When the SAP HANA database is down, you should use the Troubleshoot
Unresponsive System application to analyze the problem. Read more about this in the
lesson "Handling System Offline Situations" of the course HA215.
3. Which analysis steps can be performed when the SAP HANA database cannot be reached
using SQL?
Choose the correct answer.
X D Check the hardware with the SAP HANA Hardware Configuration Check Tool.
Correct! The deadlock detector functionality is part of the hdbcons tool, and can be used
for analyzing an unreachable system. Read more about this in the lesson "Handling
System Hang but Reachable Situations" of the course HA215.
4. The Troubleshoot Unresponsive Systems function collects its data through the
fullSystemInfoDump.py python script.
Determine whether this statement is true or false.
X True
X False
Correct! The Troubleshoot Unresponsive Systems function collects its data through the
SAP Host Agent. Read more on this in the lesson "Handling System Hanging but
Reachable Situations" of the course HA215.
5. Which of the following KPIs are shown in the Manage Services application?
Choose the correct answers.
X A CPU usage
X B Service status
X C Disk usage
X D Lock status
Correct! The CPU usage and Service status KPIs are shown in the Manage Services
application. Read more on this in the lesson "Analyzing a Suddenly Slow System" of the
course HA215.
6. Because SAP HANA is running in-memory, a slow system situation is always caused by a
problem at OS level.
Determine whether this statement is true or false.
X True
X False
Correct! A slow system can be caused by OS, network, hardware, other software, or the
SAP HANA database. The problem could be with any of these components. Read more on
this in the lesson "Analyzing a Suddenly Slow System" of the course HA215.
Lesson 1
Analyzing Memory Issues
Lesson 2
Analyzing CPU Issues
Lesson 3
Analyzing Expensive Statement Issues
Lesson 4
Analyzing Disk and I/O Issues
UNIT OBJECTIVES
LESSON OBJECTIVES
After completing this lesson, you will be able to:
● Analyze memory issues
You can find detailed information about memory consumption of individual SAP HANA
components and executed operations on the SAP HANA Performance Monitor - Memory
application.
Note:
See SAP Note 1704499 - "System Measurement for License Audit" for more
information about memory consumption with regards to SAP HANA licenses.
On a typical SAP HANA appliance, the resident memory part of the operating system and all
other running programs usually does not exceed 2 GB. The rest of the memory is therefore
dedicated to SAP HANA.
When memory is required for table growth or for temporary computations, the SAP HANA
code obtains it from the existing memory pool. When the pool cannot satisfy the request, the
SAP HANA memory manager will request and reserve more memory from the operating
system. At this point, the virtual memory size of SAP HANA processes grows.
Once a temporary computation completes or a table is dropped, the freed memory is
returned to the memory manager, which recycles it to its pool without informing the operating
system. Therefore, from SAP HANA's perspective, the amount of used memory shrinks, but
the process’s virtual and resident memory sizes are not affected. This creates a situation
where the used memory value may shrink to below the size of SAP HANA's resident memory.
This is normal.
Note:
The memory manager may also choose to return memory back to the operating
system, for example, when the pool is close to the allocation limit and contains
large unused parts.
Figure 39: SAP HANA Memory Usage and the Operating System
Note:
SAP HANA really consists of several separate processes, so the figure shows all
SAP HANA processes combined.
Because the code and program stack size are about 6 GB, almost all of the used memory is
used for table storage, computations, and database management.
Memory is a fundamental resource of the SAP HANA database. Understanding how the SAP
HANA database requests, uses, and manages this resource is crucial to understanding SAP
HANA.
SAP HANA provides various memory usage indicators that enable monitoring, tracking, and
alerting. The most important indicators are those for used memory and peak used memory.
SAP HANA contains its own memory manager and memory pool. Certain external indicators
can be misleading when estimating the real memory requirements of an SAP HANA
deployment. Examples of these indicators include the size of resident memory at the host
level, and the size of virtual and resident memory at the process level.
Note:
The SAP HANA database loads column-store tables into memory column by
column only upon use. This is sometimes called "lazy loading". This means that
columns that are never used will not be loaded and memory waste is avoided.
When the SAP HANA database runs out of allocatable memory, it will try to free
up some memory by unloading unimportant data (such as caches) and even
table columns that have not been used recently. Therefore, to measure precisely the
total, or worst-case, amount of memory used for a particular table, ensure that the
table is first fully loaded into memory, for example, with the LOAD statement.
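As a sketch, the LOAD statement forces a column table fully into memory, and M_CS_TABLES shows the load state and in-memory size afterwards; MYSCHEMA.MYTABLE is a placeholder, not a table from this course.

```sql
-- Fully load a column table into memory before measuring it
LOAD myschema.mytable ALL;

-- Verify the load state and measure the in-memory size
SELECT table_name, loaded,
       ROUND(memory_size_in_total / 1024 / 1024) AS memory_mb
  FROM m_cs_tables
 WHERE schema_name = 'MYSCHEMA'
   AND table_name  = 'MYTABLE';
```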
Allocator statistics are saved in the system views M_HEAP_MEMORY (allocated memory by component) and
M_CONTEXT_MEMORY (allocated memory that can be associated with a connection, a
statement, or a user). Both views have a reset feature so that statistics can be captured for a
specific period of time. The embedded statistics service also includes a view which tracks
memory allocation per host, HOST_HEAP_ALLOCATORS.
Note:
For full details of these views, see the SAP HANA SQL and System Views
Reference.
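The reset feature mentioned above works through the corresponding _RESET views. As a sketch: reset the statistics, run the workload you want to measure, then read the deltas; verify the column names against the reference for your revision.

```sql
-- Reset heap allocator statistics, run the workload, then read deltas
ALTER SYSTEM RESET MONITORING VIEW SYS.M_HEAP_MEMORY_RESET;

-- ... run the workload to be measured, then:
SELECT category,
       ROUND(SUM(exclusive_size_in_use) / 1024 / 1024) AS used_mb
  FROM m_heap_memory_reset
 GROUP BY category
 ORDER BY used_mb DESC;
```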
Allocator statistics are saved automatically for each logical core, and on systems with a large
number of logical cores, the statistics can consume a significant amount of memory. To save
memory, statistics logging can be reduced so that statistics are saved only for each NUMA
node, or only for each statistics object. An example of using the lscpu command to
retrieve details of the physical and logical CPU architecture is given in the section Controlling
CPU Consumption.
You can configure this feature by setting values for the following two configuration
parameters in the global.ini file:
● The parameter pool_statistics_striping can reduce the amount of memory
consumed by the component-specific allocator statistics (rows in M_HEAP_MEMORY with
the category PoolAllocator).
● The parameter composite_statistics_striping can reduce the amount of memory
consumed by statement-specific allocator statistics (rows in M_CONTEXT_MEMORY).
The parameters can be set to one of the following values (the configuration can be changed
online, but the change will only affect newly created statistic objects):
Value Effect
auto (default value): Let the system decide the statistics strategy. By default, SAP HANA
tries to utilize as much memory as possible for maximum performance.
core: The system allocates one stripe per logical core.
numa: The system allocates only one stripe per NUMA node.
none: The system creates a single stripe per statistics object.
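Because the change can be made online, the parameters can be set with SQL. The following is a sketch; the section name memorymanager is an assumption, so verify it against the documentation for your revision:

```sql
-- Reduce allocator statistics to one stripe per NUMA node.
-- Section name 'memorymanager' is an assumption; verify for your revision.
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
  SET ('memorymanager', 'pool_statistics_striping')      = 'numa',
      ('memorymanager', 'composite_statistics_striping') = 'numa'
  WITH RECONFIGURE;
```

Remember that the change only affects newly created statistics objects.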
If more memory is required, the memory manager increases the pool size by requesting more
memory from the operating system, up to a predefined allocation limit.
Alternatively, you can define this limit as a flexible percentage of the available main memory
size. If you enter a percentage value, the precise limit is calculated automatically by the
system. A percentage value is particularly useful when running SAP HANA in a virtual
environment: if you change the size of the VM container where the system runs, the allocation
limit automatically adjusts to the correct percentage of the new container size.
Note:
Changing this parameter does not require a restart.
There is normally no reason to change the value of this parameter. However, on development
systems with more than one SAP HANA system installed on a single host, you could limit the
size of the memory pool to avoid resource contention or conflicts.
A change may also be necessary to remain in compliance with the memory allowance of your
license if you purchased a license for less than the total amount of physical memory available.
This is illustrated in the following examples:
To change the global memory allocation limit, you must do the following:
Caution:
As of SPS04 and the introduction of the parameter configuration framework,
SAP recommends not to edit the INI files directly when SAP HANA is online.
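In line with this recommendation, the limit can be changed online with SQL instead of editing the file. The following is a sketch; the values and the hostname are examples, and the limit can be given in MB or as a percentage:

```sql
-- Limit the memory pool to 90% of physical memory (example value).
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
  SET ('memorymanager', 'global_allocation_limit') = '90%'
  WITH RECONFIGURE;

-- Host-specific override for one host ('myhost01' is a placeholder).
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'HOST', 'myhost01')
  SET ('memorymanager', 'global_allocation_limit') = '524288'
  WITH RECONFIGURE;
```

The SYSTEM layer applies to all hosts; the HOST layer overrides it for the named host only.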
If you only enter a value for the system, it is used for all hosts. For example, if you have five
hosts and you set the limit to 5 GB, the database can use up to 5 GB on each host (25 GB in
total). If you enter a value for a specific host, that value is used for the host in question and
the system value is used for all other hosts. This is only relevant for multiple-host
(distributed) systems.
In addition to the global allocation limit, each service running on the host has an allocation
limit: the service allocation limit. Given that collectively, all services cannot consume more
memory than the global allocation limit, each service has what is called an effective allocation
limit. The effective allocation limit of a service specifies how much physical memory a service
can consume in reality, considering the current memory consumption of other services.
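The limits per service, including the effective allocation limit, can be read from the monitoring view M_SERVICE_MEMORY, for example:

```sql
-- Allocation limits per service, including the effective limit
SELECT HOST, SERVICE_NAME,
       ALLOCATION_LIMIT,
       EFFECTIVE_ALLOCATION_LIMIT,
       TOTAL_MEMORY_USED_SIZE
  FROM M_SERVICE_MEMORY;
```

Comparing TOTAL_MEMORY_USED_SIZE with EFFECTIVE_ALLOCATION_LIMIT shows how close each service is to its real ceiling.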
You can find the Performance Monitor, with all preselected memory indicators, in the Memory
Usage card of the SAP HANA cockpit.
The Performance Monitor provides an overview of the general memory situation, with time-
based statistics for the following indicators:
● Database resident memory
● Total resident memory
● Physical memory size
● Database used memory
● Database allocation limit
SAP Note 1969700 - SQL Statement Collection for SAP HANA contains several commands
that are useful to analyze memory-related issues. Based on your needs, you can configure
restrictions and parameters in the sections marked /* Modification section */.
The most important memory-related analysis queries are listed here. Note that some queries
have version-specific variations identified in the file names:
● HANA_Memory_Overview_1.00.vv
Provides an overview of current memory information.
● HANA_Memory_TopConsumers_1.00.vv
Displays the areas with the current highest memory requirements: columnstore and
rowstore tables, heap, code, and stack.
● HANA_Memory_SharedMemory*
Shows the currently used and allocated shared memory per host and service.
● HANA_Memory_TopConsumers_History_1.00.vv (+_ESS)
Displays the areas with the highest historical memory requirements: columnstore and
rowstore tables, heap, code, and stack. Optionally, it can include results for the Embedded
Statistics Server.
In the case of critical memory issues, you can often find more detailed information in logs and
trace files, as follows:
● Identify memory-related errors in the SAP HANA system alert trace files. Search for the
strings memory, allocat, or OOM.
Note:
The search is not case-sensitive.
In SAP HANA Cockpit, the number of Out of Memory (OOM) events is displayed on the
Memory Usage card. Use the Analyze Memory History application to investigate the root
cause of the OOM events. You can find the Analyze Memory History application by choosing
the More button in the Memory Usage card.
Select the Out of Memory Events tab to display, on the lower chart, the number of unique out-
of-memory events that have occurred in the time range specified in the header. The number
of events shown depends on your selected time range, not on the vertical selection bar. The
list
shows the following information on the OOM events:
● Occurrences: The number of times a specific OOM event has been triggered
● Last Occurrence: The time and date of the most recent occurrence of the OOM event
● Last Reason: The parameter that triggered the most recent occurrence of the OOM event
● Statement: The SQL statement related to the OOM event
● Statement Hash: The unique identifier for the OOM event. To open the Workload Analyzer
and investigate the event, choose the OOM identifier.
Hint:
When investigating from the SYSTEMDB, if an event has a corresponding OOM
dump file, you can select View Trace to launch the Dump Viewer in the SAP
Database Explorer.
In the Memory Statistics charts you can choose to display historical data for a time range
between 24 hours and six weeks. To display a date range longer than six weeks (42 days), you
can use SQL to update the RETENTION_DAYS_CURRENT value in the table
"_SYS_STATISTICS"."STATISTICS_SCHEDULE".
If you need help from SAP Customer Support to perform an in-depth analysis, add the
following valuable information to the ticket:
● Diagnosis information (full system info dump)
● Performance trace, which provides detailed information on the system behavior, including
statement execution details
The trace output is written to the trace file perftrace.tpt, which must be sent to SAP Customer
Support.
If specific SAP HANA system components need deeper investigation, SAP Customer Support
may ask you to raise the corresponding trace levels to INFO or DEBUG. To do this, follow
these steps:
Some trace components, for example join_eval = DEBUG, can create many megabytes of
trace information. They require an increase of the values maxfiles and maxfilesize in the
trace section of the global.ini file.
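As a sketch, a component trace level and the trace file limits can be raised via SQL. join_eval is the component from the example above; the service ini file and the example values should be adjusted to your case:

```sql
-- Raise the trace level of a single component to DEBUG.
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
  SET ('trace', 'join_eval') = 'debug'
  WITH RECONFIGURE;

-- Enlarge the trace files to hold the extra output (example values).
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
  SET ('trace', 'maxfiles')    = '20',
      ('trace', 'maxfilesize') = '50000000'
  WITH RECONFIGURE;
```

Remember to reset the trace level to its default after the analysis, as DEBUG traces grow quickly.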
LESSON SUMMARY
You should now be able to:
● Analyze memory issues
LESSON OBJECTIVES
After completing this lesson, you will be able to:
● Analyze CPU issues
CPU-related Issues
This lesson covers the troubleshooting of high CPU consumption on the system.
A constantly high CPU consumption leads to a considerably slower system, where no more
requests can be processed. From an end-user perspective, the application behaves slowly, is
unresponsive, or can seem to hang.
Note:
Optimal CPU use is the desired behavior for SAP HANA. Therefore, high CPU
consumption is nothing to worry about unless the CPU becomes a bottleneck. SAP
HANA is optimized to consume all the memory and CPU available. The software
parallelizes queries as much as possible to ensure optimal performance. Therefore,
if the CPU usage is near 100% during a query execution, it does not always mean
that there is an issue.
The following sources alert you to high CPU consumption on your SAP HANA database:
● Alert 5 (Host CPU usage) for current or past CPU usage
● The displayed CPU usage on the overview screen
The load graph in the figure, Indicators of CPU-related Issues, shows high CPU consumption,
or high consumption in the past.
Choose the CPU Usage card to see detailed CPU usage. In the detailed graph, several CPU-
related KPIs are shown.
On the left side of the figure, KPI Details of CPU-related Issues, the legend shows which color
represents which KPI. By default all KPIs are shown, which can make the graph appear
cluttered. Use the checkboxes in the legend to show or hide KPIs.
You can display a specific time period to investigate by using the From and To fields.
You can rearrange the display order of the charts using the Settings button in the top-right
corner. You can also delete the charts, if needed.
The Thread Monitor now shows, in milliseconds, the CPU time of each thread running on SAP
HANA. A high CPU time for related threads indicates that an operation is causing increased
CPU consumption.
To identify the expensive statements causing the high resource consumption, turn on the
expensive statement trace. This trace is accessed using the Monitor Expensive Statements
link in the Monitoring section of the SAP HANA database Overview screen. Start the expensive
statements trace and specify a reasonable run time. If possible, add further restrictive criteria
such as database user, or application user, to narrow down the amount of information traced.
Note:
When resource_tracking is activated, the CPU time for each statement is
shown in the CPU_TIME column.
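Statement-level CPU time requires resource tracking to be switched on. A sketch, assuming the parameter names below exist in the resource_tracking section of global.ini for your revision:

```sql
-- Enable resource tracking and CPU time measurement.
-- Parameter names assumed from the resource_tracking section of global.ini;
-- verify for your revision.
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
  SET ('resource_tracking', 'enable_tracking')           = 'on',
      ('resource_tracking', 'cpu_time_measurement_mode') = 'on'
  WITH RECONFIGURE;
```

Once enabled, the CPU_TIME column of the expensive statements trace is populated.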
As soon as this is clarified and you agree on how to resolve the situation, two options are
available:
● On the client side, end the process calling the affected threads.
● Cancel the operation that is related to the affected threads.
To do this, select the identified thread in the Thread card and choose Cancel.
For further analysis on the root cause, open a ticket in SAP HANA Development Support and
attach the full system info dump, if available.
Background operations, such as backups and delta merges, can also cause a resource
shortage when running in parallel. Historical information about background jobs can be
obtained from the following system views:
● M_BACKUP_CATALOG
● M_DELTA_MERGE_STATISTICS
LESSON SUMMARY
You should now be able to:
● Analyze CPU issues
LESSON OBJECTIVES
After completing this lesson, you will be able to:
● Analyze expensive statement issues
SQL statements processing a high amount of data, or using inefficient processing strategies,
can be responsible for increased memory requirements.
From the trace file, you can analyze the response time of SQL statements.
● Expensive Statements Trace
On the Expensive Statements tab, you can view a list of all SQL statements that exceed a
specified response time.
In addition, you can analyze the SQL plan cache, which provides a statistical overview of the
statements that are executed in the system.
Use the Monitor Statements page to analyze the current most critical statements running in
the database.
Analyzing the current most critical statements running in the SAP HANA database can help
you identify the root cause of poor performance, CPU bottlenecks, or Out of Memory
situations. Enabling memory tracking allows you to monitor the amount of memory used by
single statement executions.
The SQL Statements card displays the number of long-running statements and long-running
blocking situations currently active in the database. Statements are ranked based on a
combination of the following criteria:
● Runtime of the current statement execution
● Lock wait time of the current statement execution
● Cursor duration of the current statement execution
Open the Monitor Statements page by choosing either the long-running statements or long-
running blocked statements on the card. The Monitor Statements page allows you to analyze
the most critical statements currently running in the database. Here you can see the following
information:
● The 100 most critical statements, listed in order of the longest runtime
● The full statement string and ID of the session in which the statement is running
● The application, the application user, and the database running the statement
● Whether a statement is related to a blocking transaction
To support monitoring, you can perform the following actions on the Monitor Statements
page:
● If a statement is in a blocked transaction or using an excessive amount of memory, you
can cancel the session that the statement is running in (or the blocked session) by
choosing Cancel Session in the footer toolbar.
● To access information about the memory consumption of statements, choose Enable
Memory Tracking in the footer toolbar.
● To set up or modify workload classes, choose a statement's Workload Class name. To
create a new workload class, choose New, or, to select a workload class from a list, choose
Existing, and fill out the fields.
You can find the Expensive Statements trace in the Monitor Statements application.
Use the Expensive Statements trace to analyze individual SQL queries whose execution time
was above a configured threshold. The Expensive Statements trace records information and
displays it on the Expensive Statements tab.
To support monitoring and analysis, you can perform the following actions on the Expensive
Statements Trace page:
● The expensive statements trace is deactivated by default. To activate and configure it, in
the footer bar, choose Configure Trace.
● Define the monitored date.
● Filter expensive statements, refresh the list, choose the sorting parameter, and filter by
parameter.
● To save the data sets as text or HTML files, choose Save As...
● To configure the threshold parameters, choose Configure Trace and enter information on
the Configure Expensive Trace page.
● To open an expensive statement with the SQL analyzer, next to the statement string,
choose More.
● Set up or modify workload classes by choosing a statement's workload class name. To
create a new workload class, choose New, or to select a workload class from a list, choose
Existing, and fill out the fields.
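The same trace can also be activated and read with SQL. A sketch, using the expensive_statement section of global.ini (the threshold is in microseconds; the values are examples):

```sql
-- Activate the expensive statements trace for statements
-- running longer than one second (1,000,000 microseconds).
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
  SET ('expensive_statement', 'enable')             = 'true',
      ('expensive_statement', 'threshold_duration') = '1000000'
  WITH RECONFIGURE;

-- Traced statements can then be read from the monitoring view.
SELECT START_TIME, DURATION_MICROSEC, STATEMENT_STRING
  FROM M_EXPENSIVE_STATEMENTS
 ORDER BY DURATION_MICROSEC DESC;
```

Choose the threshold so that only genuinely expensive statements are recorded; a very low value can produce a large trace volume.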
You can find the SQL Plan Cache tab in the Monitor Statements application.
Technically, the plan cache stores compiled execution plans of SQL statements for reuse,
which gives a performance advantage over recompilation at each invocation. For monitoring
reasons, the plan cache keeps statistics about each plan, for example, the number of
executions, minimum/maximum/total/average runtime, and lock/wait statistics. Analyzing
the plan cache is very helpful as one of the first steps in performance analysis because it gives
an overview of the statements that are executed in the system.
Note:
Due to the nature of a cache, seldom-used entries are removed from the plan
cache.
The SQL plan cache is useful for observing overall SQL performance because it provides
statistics on compiled queries. You can get an insight into frequently executed queries and
slow queries with a view to finding potential candidates for optimization.
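A sketch of such a first analysis step against the monitoring view M_SQL_PLAN_CACHE:

```sql
-- Candidates for optimization: plans with the highest total runtime
SELECT STATEMENT_HASH,
       EXECUTION_COUNT,
       TOTAL_EXECUTION_TIME,
       AVG_EXECUTION_TIME,
       STATEMENT_STRING
  FROM M_SQL_PLAN_CACHE
 ORDER BY TOTAL_EXECUTION_TIME DESC
 LIMIT 10;
```

Sorting by AVG_EXECUTION_TIME instead highlights individually slow statements, while TOTAL_EXECUTION_TIME highlights statements whose cumulative load is highest.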
To support monitoring and analysis, you can perform the following actions on the SQL Plan
Cache page:
● To open an SQL statement with the SQL analyzer, next to the statement string, choose
More.
● To save the data sets as a text or HTML file, choose Save As...
● The collection of SQL plan cache statistics is enabled by default. To disable it, choose
Configure.
LESSON SUMMARY
You should now be able to:
● Analyze expensive statement issues
LESSON OBJECTIVES
After completing this lesson, you will be able to:
● Analyze disk and I/O issues
Although SAP HANA is an in-memory database, I/O still plays a critical role in system
performance. From an end user perspective, if there are issues with I/O performance, an
application, or the system as a whole, it runs sluggishly, is unresponsive, or can seem to hang.
In certain scenarios, data is read from or written to disk, for example during the COMMIT
transaction. Normally, this is done asynchronously, but at certain points, synchronous I/O is
performed. Even during asynchronous I/O, important data structures may be locked.
Here are some details for each of the scenarios:
Savepoint
A savepoint ensures that all changed, persistent data since the last savepoint is written to
disk.
By default, the SAP HANA database triggers savepoints at five minute intervals. Data is
automatically saved from memory to the data volume located on disk. Depending on the
type of data, the block sizes vary between 4 KB and 16 MB.
Savepoints run asynchronously to SAP HANA update operations. Database update
transactions only wait at the critical phase of the savepoint, which usually takes
microseconds.
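The savepoint interval is controlled by a parameter in the persistence section of global.ini; a sketch (300 seconds is the default):

```sql
-- Savepoint interval in seconds (default 300 = five minutes).
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
  SET ('persistence', 'savepoint_interval_s') = '300'
  WITH RECONFIGURE;

-- Recent savepoint statistics, including durations, can be
-- inspected in the monitoring view M_SAVEPOINTS.
SELECT * FROM M_SAVEPOINTS;
```

Shortening the interval reduces restart time at the cost of more frequent I/O; lengthening it does the opposite.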
Write Transactions
All changes to persistent data are captured in the redo log. SAP HANA asynchronously
writes the redo log with I/O orders of 4 KB to 1 MB size into log segments. Transactions
writing a Commit into the Redo log wait until the buffer containing the commit has been
written to the log volume.
Delta Merge
The delta merge itself takes place in-memory. Updates to column store tables are stored
in the delta storage. During the delta merge, these changes are applied to the main
storage, where they are stored, read, optimized, and compressed. After the delta merge
is complete, the new main storage is persisted in the data volume, that is, written to disk.
The delta merge does not block parallel read and update transactions.
Data Backup
For a data backup, the current payload of the data volumes is read and copied to backup
storage. When writing a backup, it is essential that there are no collisions with other
transactional operations running against the database on the I/O connection.
Log Backup
Log backups store the content of a closed log segment. They are automatically and
asynchronously created by reading the payload from the log segments, and writing them
to the backup area.
Snapshot
SAP HANA database snapshots are used by certain operations, such as backup and
system copy. They are created by triggering a system-wide consistent savepoint. The
system keeps the blocks belonging to the snapshot at least until the drop of the
snapshot. Detailed information about snapshots can be found in the SAP HANA
Administration Guide.
Database Restart
At database startup, the services load their persistence, including catalog and row store
tables, into memory. This means that the persistence is read from the storage.
Additionally, the redo log entries written after the last savepoint are read from the log
volume and replayed in the data area in-memory. When this is finished, the database is
accessible. The bigger the row store, the longer it takes for the system to become
available for operation again.
Database Recovery
The restore of a data backup reads the backup content from the backup device and
writes it to the SAP HANA data volumes. The I/O write orders of the data recovery, have
a size of 64 MB. The redo log can be replayed during a database recovery, that is, the log
backups are read from the backup device and the log entries get replayed.
Failover (Host Auto-FailOver)
On the standby host, the services run in idle mode. Upon failover, the data and log
volumes of the failed host are automatically assigned to the standby host. The standby
host then has read and write access to the files of the failed active host. Row and Column
Store tables (the latter on demand) must be loaded into memory. The log entries have to
be replayed.
Takeover (System Replication)
The secondary system is already running, services are active but cannot accept SQL, and
thus are not usable by the application. As in the database restart (described earlier), the
row store tables need to be loaded into memory from persistent storage. If the table
preload option is used, then most of the column store tables are already in-memory. During
takeover, the replicated redo logs, shipped since the last data transport from primary to
secondary, must be replayed.
Identifies large diagnosis files. Unusually large files can indicate a problem with the
database.
Alert 52 - Crashdump files
Identifies new crashdump files that have been generated in the trace directory of the
system.
Alert 53 - Pagedump files
Identifies new pagedump files that have been generated in the trace directory of the
system.
Alert 54 - Savepoint duration
Identifies long-running savepoint operations.
Alert 60 - Sync/Async read ratio
Identifies a bad trigger asynchronous read ratio. This means that asynchronous reads are
blocking and behave almost like synchronous reads. This might have a negative impact on
SAP HANA I/O performance in certain scenarios.
Alert 61 - Sync/Async write ratio
Identifies a bad trigger asynchronous write ratio. This means that asynchronous writes
are blocking, and behave almost like synchronous writes. This may have a negative
impact on SAP HANA I/O performance in certain scenarios.
Alert 77 - Database disk usage
Determines the total used disk space of the database. All data, logs, traces and backups
are considered.
Alert 89 - Missing volume files
Determines if there is any volume file missing.
Alert 113 - Host open file count
Determines what percentage of total open file handles are in use. All processes are
considered, including non-SAP HANA processes. Compare
M_HOST_RESOURCE_UTILIZATION.OPEN_FILE_COUNT with
M_HOST_INFORMATION.VALUE of M_HOST_INFORMATION.KEY open_file_limit.
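The comparison described above can be sketched as a single query (open_file_limit is the key named in the text):

```sql
-- Open file handles in use versus the host's limit
SELECT r.HOST,
       r.OPEN_FILE_COUNT,
       i.VALUE AS OPEN_FILE_LIMIT
  FROM M_HOST_RESOURCE_UTILIZATION AS r
  JOIN M_HOST_INFORMATION          AS i
    ON i.HOST = r.HOST
 WHERE i.KEY = 'open_file_limit';
```

Dividing OPEN_FILE_COUNT by OPEN_FILE_LIMIT gives the percentage that the alert evaluates.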
Note:
For more information about SAP HANA alerts, see SAP Note 2445867 - "How-To:
Interpreting and Resolving SAP HANA Alerts".
In SAP HANA cockpit, disk-related information is found via the Disk Usage card and the
Monitor Disk Volume application.
The Monitor Disk Volume application provides information about the size of the data and log
volumes, the storage locations, the disk storage throughput, and the used page statistics in
the data volume.
The Disk Usage card opens the Performance Monitor showing the Disk Size and Disk Used
information over time. This information helps you to understand when the growth of the used
disk space started and when the system ran out of space. By adding additional KPIs like Data
Read/Write and Log Read/Write, you can determine if the disk full event is caused by SAP
HANA writing huge amounts of data or log information.
Note:
For more information about disk I/O analysis, see SAP Note 1999930 - "FAQ:
SAP HANA I/O Analysis".
LESSON SUMMARY
You should now be able to:
● Analyze disk and I/O issues
Learning Assessment
1. Which KPIs are shown, by default, in the Performance Monitor started from the Memory
Usage card in SAP HANA Cockpit 2.0?
Choose the correct answers.
2. The statistics service is a central element of the internal monitoring infrastructure of SAP
HANA. It collects statistical and performance information using SQL.
Determine whether this statement is true or false.
X True
X False
3. Which SAP HANA information sources inform you about high CPU consumption on your
SAP HANA database?
Choose the correct answers.
4. In SAP HANA, performance issues are nothing to worry about because the whole database
is running in-memory.
Determine whether this statement is true or false.
X True
X False
5. Which tools can you use to monitor the Average Execution Time of individual SQL queries?
Choose the correct answers.
6. The SQL Statements card displays the number of blocked transactions currently on hold
in the database.
Determine whether this statement is true or false.
X True
X False
7. Which operations in the SAP HANA database result in disk I/O on the Redo Log Volume?
Choose the correct answers.
X A Database restart
8. A disk-full event causes the SAP HANA database to stop and the problem must be
resolved before database operations can continue.
Determine whether this statement is true or false.
X True
X False
1. Which KPIs are shown, by default, in the Performance Monitor started from the Memory
Usage card in SAP HANA Cockpit 2.0?
Choose the correct answers.
Correct! The total resident memory and the physical memory size are shown as a KPI in
the Performance Monitor. Read more on this in the lesson "Analyzing Memory Issues" of
the course HA215.
2. The statistics service is a central element of the internal monitoring infrastructure of SAP
HANA. It collects statistical and performance information using SQL.
Determine whether this statement is true or false.
X True
X False
Correct! The embedded statistics service is part of the indexserver. Read more on this in
the lesson "Analyzing Memory Issues" of the course HA215.
3. Which SAP HANA information sources inform you about high CPU consumption on your
SAP HANA database?
Choose the correct answers.
Correct! High CPU consumption is shown by the CPU graph, the Alerts card, and the host
CPU usage alert. Read more on this in the lesson "Analyzing CPU Issues" of the course
HA215.
4. In SAP HANA, performance issues are nothing to worry about because the whole database
is running in-memory.
Determine whether this statement is true or false.
X True
X False
Correct! SAP HANA is optimized to consume all memory and CPU available. Even if you
have enough memory and CPU resources available, you can experience performance
issues due to badly written queries. Read more on this in the lesson "Analyzing CPU
Issues" of the course HA215.
5. Which tools can you use to monitor the Average Execution Time of individual SQL queries?
Choose the correct answers.
Correct! The response time of individual SQL queries can be monitored using the Monitor
Statements - Active Statements and Monitor Statements - SQL plan cache applications.
Read more about this in the lesson "Analyzing Expensive Statements Issues" of the course
HA215.
6. The SQL Statements card displays the number of blocked transactions currently on hold
in the database.
Determine whether this statement is true or false.
X True
X False
Correct! The SQL Statements card displays the currently running statements. Read more
about this in the lesson "Analyzing Expensive Statements Issues" of the course HA215.
7. Which operations in the SAP HANA database result in disk I/O on the Redo Log Volume?
Choose the correct answers.
X A Database restart
Correct! A database restart and a scale-out host failover are operations that result in disk
I/O on the Redo Log volume. Read more about this in the lesson "Analyzing Disk and I/O
Issues" of the course HA215.
8. A disk-full event causes the SAP HANA database to stop and the problem must be
resolved before database operations can continue.
Determine whether this statement is true or false.
X True
X False
Correct! The SAP HANA database cannot continue in a disk-full situation because it
cannot write its data to disk. This must be fixed before the SAP HANA database can
continue operation. Read more about this in the lesson "Analyzing Disk and I/O Issues" of
the course HA215.
Lesson 1
Configuring the SAP HANA Alerting Framework 111
Lesson 2
Setting up SAP HANA Workload Management 123
Lesson 3
Using SAP HANA Capture and Replay 151
UNIT OBJECTIVES
LESSON OBJECTIVES
After completing this lesson, you will be able to:
● Configure SAP HANA alerting framework
Alert Monitoring
Alert Monitoring
As an administrator, you actively monitor the status of the system and its services and the
consumption of system resources. However, you are also alerted to critical situations, for
example: a disk is becoming full, CPU usage is reaching a critical level, or a server has
stopped.
A summary of all alerts in the database is available on the homepage of the SAP HANA
cockpit. To get more information about these alerts, and to analyze the historical occurrence
of alerts, you can drill down into the Alerts application.
In addition, several configuration options are available so that you can tailor alerts in the SAP
HANA database to your needs. For example, you can change the alerting thresholds, setup e-
mail notification of alerts, and switch particular alerts on or off.
On the Alerts card, the alerts are counted and grouped by the ten most important alert
categories defined in SAP HANA. Use the View By selector to switch between Alert Categories
and Alert KPA (Key Performance Area). You can refresh the displayed data by using the SAP
HANA Cockpit Refresh - Now button in the top right corner.
Priority Description
Information: Action recommended to improve system performance or stability
Low: Medium-term action required to mitigate the risk of downtime
Medium: Short-term action required (few hours, days) to mitigate the risk of downtime
High: Immediate action required to mitigate the risk of downtime, data loss, or data
corruption
Alert Details
To open the Alerts app, on the Overview page of the SAP HANA cockpit, choose the Alerts
card. All of the latest alerts are displayed in list format on the left.
Find and select the alert that you want to analyze using the options available for filtering,
searching, and sorting. Detailed information about the alert is shown on the right, including a
graph displaying how often the alert has been issued over a certain time frame.
Select the time frame that you want to analyze. By default, the number of occurrences per
hour over the last 24 hours is displayed.
When you select an alert, detailed information about the alert is displayed on the right. The
following detailed information about an alert is available:
● Category
Displays the category of the alert checker that issued the alert.
Alert checkers are grouped into categories, for example, those related to memory usage,
those related to transaction management, and so on.
● Next Scheduled Run
Displays when the related alert checker is next scheduled to run.
If the alert checker has been switched off (alert checker status Switched Off) or it failed the
last time it ran (alert checker status Failed), this field is empty because the alert checker is
no longer scheduled.
● Interval
Displays the frequency at which the related alert checker runs.
If the alert checker has been switched off (alert checker status Switched Off) or it failed the
last time it ran (alert checker status Failed), this field is empty because the alert checker is
no longer scheduled.
● Alerting Host and Port
Displays the name and port of the host that issued the alert.
In a system replication scenario, alerts issued by secondary system hosts can be identified
here. This allows you to ensure availability of secondary systems by addressing issues
before an actual failover.
● Alert Checker
Displays the name and description of the related alert checker.
● Proposed Solution
Displays the possible ways of resolving the problem identified in the alert, with a link to the
supporting app, if available.
● Past Occurrences of Alert
A configurable graphical display that indicates how often the alert occurred in the past.
The statistics service collects and evaluates information about status, performance, and
resource consumption from all components belonging to the system. In addition, it performs
regular checks and, when configurable threshold values are exceeded, issues alerts. For
example, if 90% of available disk space is used, a low-priority alert is issued; if 98% is used, a
high-priority alert is issued.
Monitoring and alert information are stored in database tables in a dedicated schema
(_SYS_STATISTICS). From there, the information can be accessed by administration tools,
such as SAP HANA cockpit, or SAP HANA studio. The data from system views is evaluated
against certain threshold values, which can then trigger configured follow-up actions, such as
an email notification.
The statistics service is implemented by a set of tables and SQLScript procedures in the
master index server and by the statistics scheduler thread that runs in the master name
server. The SQLScript procedures either collect data (data collectors) or evaluate alert
conditions (alert checkers). Procedures are invoked by the scheduler thread at regular
intervals, which are specified in the configuration of the data collector or alert checker. Data
collector procedures read system views and tables, process the data (for example, if the
persisted values need to be calculated from the read values) and store the processed data in
measurement tables for creating the measurement history.
This scheduler thread is part of the statistics server that runs in the nameserver service. Via
TREXNet, calls are sent to the indexserver to invoke the SQLScript procedures.
In the case of multi-database systems, note the following:
● SystemDB: All threads run in the nameserver.
● Tenant DBs: All threads run in the indexserver.
Alert checker procedures are scheduled independently of the data collector procedures. They
read current data from the original system tables and views, not from the measurement
history tables. After reading the data, the alert checker procedures evaluate the configured
alert conditions. If an alert condition is fulfilled, a corresponding alert is written to the alert
tables. From there, it can be accessed by monitoring tools that display the alert.
You can change the retention period of individual data collectors with the following SQL
statement:
UPDATE _SYS_STATISTICS.STATISTICS_SCHEDULE set
RETENTION_DAYS_CURRENT=<retention_period_in_days> where
ID=<ID_of_data_collector>;
Note:
To determine the IDs of data collectors, execute the following statement:
SELECT * from _SYS_STATISTICS.STATISTICS_OBJECTS
where type = 'Collector';
Monitoring tools such as the SAP HANA cockpit allow administrators in the system database
to access certain alerts occurring in individual tenant databases. However, this access is
restricted to alerts that identify situations with a potentially system-wide impact, for example,
the physical memory on a host is running out. Alerts that expose data in the tenant database
(for example, table names) are not visible to the system administrator in the system
database.
Configuring Alerts
From the Alert Details or the Configure Alerts link, on the SAP HANA Cockpit – Overview page,
you can open the Configure Alerts app. In the Configure Alerts app, there are several
configuration options available so that you can tailor alerting in the SAP HANA database to
your needs.
You can configure the following:
● Change the threshold values that trigger alerts of different priorities.
● Set up email notifications so that specific people are informed when alerts are issued.
● Header information
The name of the alert checker, its status, and the last time it ran.
● Description
A description of what the alert checker does, for example, what performance indicator it
measures or what setting it verifies.
● Alert Checker ID
The unique ID of the alert checker.
● Category
The category of the alert checker.
Alert checkers are grouped into categories, for example, those related to memory usage,
those related to transaction management, and so on.
● Threshold Values for Prioritized Alerting
The values that trigger high, medium, low, and information alerts issued by the alert
checker. The threshold values and the unit depend on what the alert checker does. For
example, alert checker 2 measures what percentage of disk space is currently used, so its
thresholds are percentage values.
Note:
Thresholds can be configured for any alert checker that measures variable values
that should stay within certain ranges, for example, the percentage of physical
memory used, or the age of the most recent data backup. Many alert checkers
verify only whether a certain situation exists or not. Threshold values cannot be
configured for these alert checkers. For example, alert checker 4 detects service
restarts. If a service was restarted, an alert is issued.
The alert checker remains disabled for a specific length of time before it is automatically
re-enabled. The length of time is calculated based on the values in the following columns of
the table STATISTICS_SCHEDULE (_SYS_STATISTICS):
- INTERVALLENGTH
- SKIP_INTERVAL_ON_DISABLE
Caution:
If you switch off alerts, you may not be warned about potentially critical
situations in your system.
You can switch an alert checker on again at any time. You may also want to switch on alert
checkers that the system has disabled, such as checkers with the status Failed. The system
automatically disables alert checkers when they fail to run, for example, due to a shortage of
system resources.
The system automatically switches failed alert checkers back on after a certain length of time.
For more information, see Alert Checker Statuses.
You can disable an alert for a particular table or schema. You can do this for the alerts
"Record count of non-partitioned column-store tables" (ID 17) and "Table growth of non-
partitioned column-store tables" (ID 20).
To exclude an alert from being issued for a particular table, use the following SQL statement:
INSERT INTO _sys_statistics.statistics_exclude_tables VALUES
(<alert_id>, '<schema_name>', '<table_name>')
To exclude an alert from being issued for all tables of a particular schema, use the following
SQL statement:
INSERT INTO _sys_statistics.statistics_exclude_tables VALUES
(<alert_id>, '<schema_name>', null)
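For illustration, suppose a hypothetical schema SALES contains a large table ORDERS for which alert 17 should be suppressed. The insert could then look as follows, and the exclusion can later be removed again by deleting the corresponding row (a sketch; the column names in the DELETE are assumed from the VALUES order, not taken from the original text):

```sql
-- Hypothetical example: suppress alert 17 for table SALES.ORDERS
INSERT INTO _sys_statistics.statistics_exclude_tables VALUES
(17, 'SALES', 'ORDERS');

-- Sketch: remove the exclusion again (column names assumed)
DELETE FROM _sys_statistics.statistics_exclude_tables
WHERE ALERT_ID = 17 AND SCHEMA_NAME = 'SALES' AND TABLE_NAME = 'ORDERS';
```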
The configured recipients receive an email when an alert checker issues an alert. If the alert
checker issues the same alert the next time it runs, no further emails are sent. However, when
the alert checker runs and does not issue an alert, indicating that the issue is resolved or no
longer occurring, a final email is sent.
Note:
If you want to manually run an alert checker with the status Switched Off or Failed,
you must switch it back on first.
LESSON SUMMARY
You should now be able to:
● Configure SAP HANA alerting framework
LESSON OBJECTIVES
After completing this lesson, you will be able to:
● Set up SAP HANA workload management
Workload Management
The load on an SAP HANA system can be managed by selectively applying limitations and
priorities to how resources (such as the CPU, the number of active threads, and memory) are
used. Settings can be applied globally or at the level of individual user sessions by using
workload classes.
On an SAP HANA system, thanks to the capabilities of the platform, there are many different
types of workload, from simple or complex statements to potentially long-running data
loading jobs. These workload types must be balanced with the system resources that are
available to handle concurrent work. For simplicity, we classify workload queries as
transactional (OLTP) or analytic (OLAP). With a transactional query the typical response time
is measured in milliseconds and these queries are normally executed in a single thread.
However, analytic queries normally feature more complex operations using multiple threads
during execution: this can lead to higher CPU usage and memory consumption compared
with transactional queries.
To manage the workload of your system, aim to ensure that the database management
system is running in an optimal way given the available resources. The goal is to maximize the
overall system performance by balancing the demand for resources between the various
workloads, not just to optimize for one particular type of operation. If you achieve this,
requests will be carried out in a way that meets your performance expectations and you will
be able to adapt to changing workloads over time. Besides optimizing for performance, you
can also optimize for robustness so that statement response times are more predictable.
The figure, Types of Workload, shows different types of workload, such as Extract, Transform,
and Load (ETL) operations (used in data warehouses to load new data in batches from source
systems) as well as analytic and transactional operations.
When we discuss workload management we are really talking about stressing the system in
terms of its resource utilization. The main resources we look at (shown in the previous figure)
are CPU, memory, disk I/O, and network. In the context of SAP HANA, disk I/O comes into
play for logging, for example, in an OLTP scenario many small transactions result in a high
level of logging compared to analytic workloads (although SAP HANA tries to minimize this).
With SAP HANA, network connections between nodes in a scale out system can also be
optimized, for example, statement routing is used to minimize network overheads.
However, when we try to influence workload in a system, the main focus is on the available
CPUs and memory being allocated and utilized. Mixed transactional and analytic workloads
can, for example, compete for resources and at times require more resources than are readily
available. If one request dominates, there may be a queuing effect, so the next request may
have to wait until the previous one is ready. Such situations need to be managed to minimize
the impact on overall performance.
The OS level settings are rather static, with a low granularity. The more dynamic and more
granular settings can be done at the SAP HANA system level, and even more so at the SAP
HANA session level by using SAP HANA workload classes.
All of these options have default settings which are applied during the SAP HANA installation.
These general-purpose settings may provide you with a perfectly acceptable performance, in
which case the workload management features described in this chapter may be
unnecessary. Before you begin workload management, ensure that the system generally is
well configured: that SQL statements are tuned, that in a distributed environment tables are
optimally distributed, and that indexes have been defined as needed.
If you have specific workload management requirements, the following table outlines a
process of looking at ever more fine-grained controls that can be applied with regard to CPU,
memory and execution priority.
1. First, look at how the system is currently performing in terms of CPU usage and memory
consumption. What kinds of workloads are running on the system? Are there complex,
long running queries that require lots of memory?
2. When you have a broad understanding of the activity in the system you can drill down to
the details, such as business importance. Are statements being run that are strategic or
analytic in nature, compared to standard reporting that may not be so time-critical? Can
those statements be optimized to run more efficiently?
3. When you have a deeper understanding of the system, you have a number of ways to
influence how it handles the workload. You can map the operations to available resources,
such as CPU and memory, and determine the priority that requests get by, for example,
using workload classes.
This section lists some of the most useful views available which you can use to analyze your
workload and gives suggestions for how to improve performance. Refer also to the scenarios
section for more details of how these analysis results can help you to decide which workload
management options to apply.
If these views indicate problems with statements you can use workload classes to tune the
statements by limiting memory or parallelism.
Consider also the setting of any session variables (in M_SESSION_CONTEXT) which might
have a negative impact on these statements. The following references provide more detailed
information on this:
● SAP Note 2215929: Using Client Info to set Session Variables and Workload Class settings
describes how client applications set session variables for dispatching workload classes.
● SAP HANA Developer Guide (Setting Session-Specific Client Information)
These views provide detailed information on the threads that are active in the context of a
particular service and information about locks held by threads.
If these views show many threads for a single statement, and the general system load is high
you can adjust the settings for the set of 'execution' ini-parameters as described in the topic
Controlling Parallel Execution.
Using the configuration option, we first analyze how the system CPUs are configured. Then,
based on the information returned, apply affinity settings in the daemon.ini file to bind specific
processes to logical CPU cores. Processes must be restarted before the changes become
effective. This approach applies primarily to the use cases of SAP HANA tenant databases
and multiple SAP HANA instances on one server. You can use this, for example, to partition
the CPU resources of the system by tenant database.
Note:
As an alternative to applying CPU affinity settings you can achieve similar
performance gains by changing the parameter max_concurrency in the section
[execution] of the global.ini configuration file. This may be more convenient
and can be done while the system is online.
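As a sketch of this alternative, the parameter can be changed online with a statement along the following lines (the value 16 is purely illustrative; choose a value based on your core count and workload):

```sql
-- Illustrative only: limit JobExecutor parallelism to 16 logical cores
-- in the [execution] section of global.ini; takes effect without a restart
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
SET ('execution', 'max_concurrency') = '16' WITH RECONFIGURE;
```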
To make the changes described here, you require access to the operating system of the SAP
HANA instance to run the Linux lscpu command and you require the privilege INIFILE
ADMIN.
Information about the SAP HANA system topology is also available from SAP HANA
monitoring views as described in a later subsection, SAP HANA Monitoring Views for CPU
Topology Details.
Hint:
For more information, see SAP Note 2470289: FAQ: SAP HANA Non-Uniform
Memory Access (NUMA).
For Xen and VMware, the users in the VM guest system see what is configured in the VM host.
So the quality of the reported information depends on the configuration of the VM guest.
Therefore, SAP cannot give any performance guarantees in this case.
Configuration Steps
To confirm the physical and logical details of your CPU architecture, analyze the system using
the lscpu command. This command returns a list of details of the system architecture.
The following table gives an overview of the most useful values, based on a sample system
with 2 physical chips (sockets), each containing 8 physical cores. These are hyper-threaded
to give a total of 32 logical cores.
Item  Field          Value
1     Architecture   x86_64
4     CPUs           32
8     Socket(s)      2
9     NUMA node(s)   2
● Item 4-5: This sample server has 32 logical cores, numbered 0 to 31.
● Item 6-8: Logical cores (threads) are assigned to physical cores. Assigning multiple
threads to a single physical core is referred to as hyper-threading.
In this example, there are two sockets, each socket contains eight physical cores (total 16).
Two logical cores are assigned to each physical core, thus, each core exposes two
execution contexts for the independent and concurrent execution of two threads.
● Item 9: In this example there are two Non-uniform Memory Access (NUMA) nodes, one for
each socket. Other systems may have multiple NUMA nodes per socket.
● Items 21-22: The system numbers the 32 logical cores and assigns each of them to one of
the two NUMA nodes.
Note:
Even on a system with 32 logical cores and two sockets, the assignment of logical
cores to physical CPUs and sockets can be different. It is important to collect the
assignment in advance before making changes.
You can perform a more detailed analysis by using the system commands
described in the next step. These commands provide detailed information for each
core, including how CPU cores are grouped as siblings.
In addition to the lscpu command, you can use the set of system commands in the /sys/
devices/system/cpu/ directory tree. For each logical core, there is a numbered
subdirectory beneath the node (/cpu12/ in the following examples).
The following examples show how to retrieve this information. The following table provides
details of some of the more useful commands:
cat /sys/devices/system/cpu/present
cat /sys/devices/system/cpu/cpu12/topology/thread_siblings_list
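As a small convenience, the per-core queries above can be wrapped in a loop that prints the sibling list for every logical core at once (a sketch assuming the standard Linux /sys layout):

```shell
# Show the logical cores known to the kernel (for example, a range such as 0-31)
cat /sys/devices/system/cpu/present

# For every logical core, print which logical cores share its physical core
# (its hyper-threading siblings)
for d in /sys/devices/system/cpu/cpu[0-9]*; do
    echo "$(basename "$d"): $(cat "$d/topology/thread_siblings_list")"
done
```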
Other Linux commands which are relevant here are sched_setaffinity and numactl. The
command sched_setaffinity limits the set of CPU cores available (by applying a CPU
affinity mask) for execution of a specific process (this could be used, for example, to isolate
tenants) and numactl controls NUMA policy for processes or shared memory.
Based on the results returned you can use the affinity setting to restrict CPU usage of SAP
HANA server processes to certain CPUs or ranges of CPUs. You can do this for the following
servers: nameserver, indexserver, compileserver, preprocessor, and xsengine. Each server
has a section in the daemon.ini file.
The affinity setting is applied by the TrexDaemon when it starts the other SAP HANA
processes using the command sched_setaffinity. Changes to the affinity settings take
effect only after restarting the SAP HANA process.
The following examples show the syntax for the ALTER SYSTEM CONFIGURATION commands
required.
Example 1
To restrict the nameserver to two logical cores of the first CPU of socket 0 (see line 21 in the
example), use the following affinity setting:
ALTER SYSTEM ALTER CONFIGURATION ('daemon.ini', 'SYSTEM') SET
('nameserver', 'affinity') = '0,16'
Example 2
To restrict the preprocessor and the compileserver to all remaining cores (that is, all except 0
and 16) on socket 0 (see line 21 in the example), use the following affinity settings:
ALTER SYSTEM ALTER CONFIGURATION ('daemon.ini', 'SYSTEM') SET
('preprocessor', 'affinity') = '1-7,17-23'
Example 3
To restrict the indexserver to all cores on socket 1 (see line 22 in the example), use the
following affinity settings:
ALTER SYSTEM ALTER CONFIGURATION ('daemon.ini', 'SYSTEM') SET
('indexserver', 'affinity') = '8-15,24-31'
Example 4
To set the affinity for two tenant databases, called DB1 and DB2 respectively, in a tenant
database setup, use the following affinity settings:
ALTER SYSTEM ALTER CONFIGURATION ('daemon.ini', 'SYSTEM') SET
('indexserver.DB1', 'affinity') = '1-7,17-23';
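The example above shows only the setting for DB1. A complementary setting for DB2 might, for instance, bind its indexserver to the cores of socket 1, following the same socket layout used in Example 3 (an illustrative sketch, not part of the original example):

```sql
-- Illustrative: bind tenant DB2's indexserver to socket 1 (cores 8-15, 24-31)
ALTER SYSTEM ALTER CONFIGURATION ('daemon.ini', 'SYSTEM') SET
('indexserver.DB2', 'affinity') = '8-15,24-31';
```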
Example 5
In this scenario, tenant NM1 already exists. Here, we add another tenant, NM2:
CREATE DATABASE NM2 ADD AT LOCATION 'host:30040' SYSTEM USER PASSWORD
Manager1;
Set the configuration parameter to bind CPUs to specific NUMA nodes on each tenant. As
shown in Example 4, you can use the dot notation ('indexserver.<tenant>') to identify the
specific tenant.
To assign affinities to multiple indexservers of the same tenant on the same host execute the
following SQL statements on the SYSTEMDB to apply the instance_affinity[port]
configuration parameter:
Example 6
In this scenario an indexserver is already running on tenant NM1 on port 30003. Here, we add
another indexserver on a different port:
ALTER DATABASE NM1 ADD 'indexserver' AT LOCATION 'host:30040';
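The corresponding instance_affinity[port] statement is not shown in this excerpt. Based on the parameter name given above, it might look like the following sketch, where the port matches the newly added indexserver and the core range is illustrative; verify the exact syntax against the SAP HANA Administration Guide before use:

```sql
-- Hypothetical sketch: bind the NM1 indexserver on port 30040
-- to socket-1 cores via the instance_affinity[port] parameter
ALTER SYSTEM ALTER CONFIGURATION ('daemon.ini', 'SYSTEM') SET
('indexserver.NM1', 'instance_affinity[30040]') = '8-15,24-31';
```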
In this example, table T1 will be processed by NUMA node 1 if possible, and otherwise by any
of NUMA nodes 3-5. Preferences are saved in the system table NUMA_NODE_PREFERENCE_.
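The preference statement being described is not shown in this excerpt. Based on the NUMA NODE syntax used elsewhere in this lesson, it would plausibly take the following form (a reconstruction, not the original statement):

```sql
-- Plausible reconstruction: prefer NUMA node 1 for table T1,
-- falling back to any of nodes 3-5
ALTER TABLE T1 NUMA NODE ('1', '3-5')
```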
Use the following statement to remove any preferences for an object:
ALTER TABLE T1 NUMA NODE NULL
By default, preferences are only applied the next time the table is loaded. You can use the
ALTER TABLE statement with the keyword IMMEDIATE to apply the preference immediately
(the default value is DEFERRED):
ALTER TABLE T1 NUMA NODE ('3') IMMEDIATE
Granularity
NUMA node location preferences can be applied at any of the following levels:
● Table (column store only)
● Table partition (range partitioning only)
● Column
If multiple preferences for a column or partition have been defined, the column preference is
applied first, then the partition preference, then the table preference.
The following example shows the statement being used to apply a preference for column A in
table T1:
CREATE COLUMN TABLE T1(A int NUMA NODE ('2'), B varchar(10))
The following examples show statements to apply a preference for partition A in table T1:
CREATE COLUMN TABLE T1(A int, B varchar(10)) PARTITION BY RANGE(A)
(PARTITION VALUE = 2 NUMA NODE ('4'))
ALTER TABLE T1 ADD PARTITION (A) VALUE = 3 NUMA NODE ('1') IMMEDIATE
You can also identify a partition by its logical partition ID number and set a preference by
using ALTER TABLE as shown here:
ALTER TABLE T1 ALTER PARTITION 2 NUMA NODE ('3')
Transferring Preferences
Using the CREATE TABLE LIKE statement, the new table can be created with or without the
NUMA preference. In the following example, any preference that has been applied to T2 will
(if possible) be applied to the new table T1. The system checks the topology of the target
system to confirm that it has the required number of nodes. If not, the preference is ignored:
CREATE TABLE T1 LIKE T2
The keyword WITHOUT can be used as shown in the following example to ignore any
preference which has been applied to T2 when creating the new table T1:
CREATE TABLE T1 LIKE T2 WITHOUT NUMA NODE
A similar approach is used with the IMPORT and EXPORT statements: any preferences are
saved in the exported table definition and applied, if possible, in the target environment when
the table is imported. In this case you can use the IGNORE keyword to import a table and
ignore any node preferences:
IMPORT SYSTEM."T14" FROM '/tmp/test/' WITH REPLACE THREADS 40 IGNORE
NUMA NODE
The monitoring view M_HOST_INFORMATION is refreshed once per minute. For most keys,
you require the INIFILE ADMIN privilege to view the values.
Select one or more key names for a specific host to retrieve the corresponding values:
select * from SYS.M_HOST_INFORMATION where key in
('cpu_sockets','cpu_cores','cpu_threads');
Caution:
The settings described here should only be modified when other tuning
techniques like remodeling, repartitioning, and query tuning have been applied.
Modifying the parallelism settings requires a thorough understanding of the
actual workload because they have an impact on the overall system behavior.
Modify the settings iteratively by testing each adjustment.
On systems with highly concurrent workloads, too much parallelism of single statements may
lead to poor performance. Note also that partitioning tables influences the degree of
parallelism for statement execution. In general, adding partitions tends to increase
parallelism. You can use the parameters described in this section to adjust the CPU utilization
in the system.
Two thread pools control the parallelism of the statement execution. Generally, target thread
numbers applied to these pools are soft limits, meaning that additional available threads can
be used if necessary and deleted when no longer required:
● SqlExecutor
This thread pool handles incoming client requests and executes simple statements. For
each statement execution, an SqlExecutor thread from a thread pool processes the
statement. For simple OLTP-like statements against column store, as well as for most
statements against row store, this will be the only type of thread involved. By OLTP, we
mean short-running statements that consume relatively few resources. However, even
OLTP systems like SAP Business Suite may generate complex statements.
● JobExecutor
The JobExecutor is a job dispatching subsystem. Almost all remaining parallel tasks are
dispatched to the JobExecutor and its associated JobWorker threads.
In addition to OLAP workload, the JobExecutor also executes operations like table
updates, backups, memory garbage collection, and savepoint writes.
You can set a limit for both SqlExecutor and JobExecutor to define the maximum number of
threads. You can use this, for example, on a system where OLAP workload would normally
consume too many CPU resources to apply a maximum value to the JobExecutor to reserve
resources for OLTP workload.
Caution:
Lowering the value of these parameters can have a negative effect on the parallel
processing of the servers, and reduce the performance of the overall system.
Adapt and test these values iteratively.
For more information, see Understand your Workload, and SAP Note 2222250:
FAQ SAP HANA Workload Management.
A further option to manage statement execution is to apply a limit to an individual user profile
for all statements in the current connection using the THREADLIMIT parameter.
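Applying such a per-user limit might look like the following sketch. The user name is purely illustrative, and the parameter name is taken from the text above; newer SAP HANA revisions may instead use STATEMENT THREAD LIMIT, so verify the exact syntax for your revision:

```sql
-- Hypothetical user REPORTING_USER: cap thread usage for this user's
-- connections (parameter name as given in the text; verify per revision)
ALTER USER REPORTING_USER SET PARAMETER THREADLIMIT = '20';
```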
The following SqlExecutor parameters are in the sql section of the indexserver.ini file:
sql_executors sets a soft limit on the target number of logical cores for the SqlExecutor
pool.
● This parameter sets the target number of threads that are immediately available to accept
incoming requests. Additional threads will be created if needed, and deleted if no longer
needed.
● The parameter is initially not set (0); the default value is the number of logical cores in the
system. As each thread allocates a particular amount of main memory for the stack,
reducing the value of this parameter can help to reduce the memory footprint.
max_sql_executors sets a hard limit on the maximum number of logical cores that can be
used.
● In normal operation new threads are created to handle incoming requests. If a limit is
applied here, SAP HANA will reject new incoming requests with an error message if the
limit is exceeded.
● The parameter is initially not set (0) so no limit is applied.
Caution:
SAP HANA will not accept new incoming requests if the limit is exceeded. Use
this parameter with extreme care.
max_concurrency sets a soft limit on the target number of JobExecutor threads (in the
execution section of the global.ini file).
● This parameter sets the size of the thread pool used by the JobExecutor to parallelize
execution of database operations. Additional threads will be created if needed and deleted
if no longer needed. You can use this to limit resources available for JobExecutor threads,
thereby saving capacity for SqlExecutors.
● The parameter is initially not set (0); the default value is the number of logical cores in a
system. Especially on systems with at least 8 sockets, consider setting this parameter to a
reasonable value between the number of logical cores per CPU and the overall number of
logical cores in the system. In a system that supports tenant databases, a reasonable
value is the number of cores divided by the number of tenant databases.
max_concurrency_hint limits the number of logical cores for job workers, even if more
active job workers are available.
● This parameter defines the number of jobs to create for an individual parallelized
operation. The JobExecutor proposes the number of jobs to create for parallel processing,
based on the recent load on the system. Multiple parallelization steps may result in far
more jobs being created for a statement (and hence higher concurrency) than specified by
this parameter.
● The default is 0 (no limit is applied but the hint value is never greater than the value for
max_concurrency). On large systems (systems with more than 4 sockets) setting this
parameter to the number of logical cores of one socket may result in better performance,
but testing is necessary to confirm this.
You can also create exceptions to these limits for individual users (for example, to ensure an
administrator is not prevented from doing a backup) by setting a different statement memory
limit for each individual.
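Such a per-user exception might be sketched as follows. The user name is illustrative, and the value is in GB; this is a sketch of the user-parameter mechanism described above, to be verified against your SAP HANA revision:

```sql
-- Hypothetical administrator user: allow up to 50 GB per statement,
-- overriding the global statement_memory_limit for this user only
ALTER USER BACKUP_ADMIN SET PARAMETER STATEMENT MEMORY LIMIT = '50';
```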
These limits only apply to single SQL statements, not the system as a whole. Tables which
require much more memory than the limit applied here may be loaded into memory. The
parameter global_allocation_limit limits the maximum memory allocation limit for the
system as a whole.
You can view the (peak) memory consumption of a statement in
M_EXPENSIVE_STATEMENTS.MEMORY_SIZE.
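Assuming expensive statements tracing is enabled, the statements with the highest peak memory consumption can be listed with a query along these lines (a sketch using the columns named above):

```sql
-- Sketch: show the ten most memory-intensive traced statements
SELECT STATEMENT_STRING, MEMORY_SIZE
FROM M_EXPENSIVE_STATEMENTS
ORDER BY MEMORY_SIZE DESC
LIMIT 10;
```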
To be able to set memory limits for SQL statements, enable the following parameters:
● In the global.ini file, in the resource_tracking section:
- enable_tracking = on
- memory_tracking = on
● statement_memory_limit defines the maximum memory allocation per statement in GB.
- The value defined for this parameter can be overridden by the corresponding workload
class property STATEMENT_MEMORY_LIMIT.
After setting this parameter, statements that exceed the limit you have set on a host are
stopped by running out of memory.
● statement_memory_limit_threshold defines the maximum memory allocation per
statement as a percentage of the global allocation limit. The default value is 0%
(statement_memory_limit is always respected).
- In the global.ini file, expand the memorymanager section and set the parameter as a
percentage of the global allocation limit.
- This parameter provides a means of controlling when statement_memory_limit is
applied. If this parameter is set, when a statement is issued the system will determine if
the amount of memory it consumes exceeds the defined percentage value of the overall
global_allocation_limit parameter setting. The statement memory limit is only
applied if the current SAP HANA memory consumption exceeds this statement
memory limit threshold as a percentage of the global allocation limit.
- This is a way of determining if a particular statement consumes a large amount of
memory compared to the overall system memory available. In this case, to preserve
memory for other tasks, the statement memory limit is applied and the statement fails
with an exception.
- Note that the value defined for this parameter also applies to the workload class
property STATEMENT_MEMORY_LIMIT.
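Putting the steps above together, the configuration could be sketched with SQL like the following. All values are illustrative; the parameter names and sections are those listed above:

```sql
-- Illustrative: enable resource and memory tracking (prerequisites)
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET
('resource_tracking', 'enable_tracking') = 'on',
('resource_tracking', 'memory_tracking') = 'on' WITH RECONFIGURE;

-- Illustrative: cap single statements at 50 GB, applied only once overall
-- memory use exceeds 80% of the global allocation limit
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET
('memorymanager', 'statement_memory_limit') = '50',
('memorymanager', 'statement_memory_limit_threshold') = '80' WITH RECONFIGURE;
```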
In the case of rejected requests, an error message stating that the server is temporarily
overloaded is returned to the client: 1038,'ERR_SES_SERVER_BUSY','rejected as server is
temporarily overloaded'.
The load on the system is measured by background processes which gather a set of
performance statistics covering available capacity for memory and CPU usage. The statistics
are moderated by a configurable averaging factor (exponentially weighted moving average) to
minimize volatility, and the moderated value is used in comparison with the threshold
settings.
The admission control filtering process does not apply to all requests. In particular, requests
that release resources will always be executed, for example, commit, rollback, and
disconnect. The filtering also depends on user privileges: administration requests from
SESSION_ADMIN and WORKLOAD_ADMIN are always executed.
There are some situations where it is not recommended to enable admission control, for
example, during planned maintenance events such as an upgrade or the migration of an
application. In these cases it is expected that the load level is likely to be saturated for a long
time and admission control could therefore result in the failure of important query executions.
Limits can be applied at two levels so that firstly, new requests are queued until adequate
processing capacity is available or a timeout is reached, and secondly, a higher threshold can
be defined to determine the maximum workload level above which new requests are rejected.
If requests have been queued, items in the queue are processed when the load on the system
reduces below the threshold levels. If the queue exceeds a specified size or if items are
queued for longer than a specified period of time, they are rejected.
To monitor the admission control feature, you can use the SAP HANA cockpit or use the
following public monitoring views that are available:
● M_ADMISSION_CONTROL_STATISTICS
● M_ADMISSION_CONTROL_QUEUES
● Extended M_CONNECTIONS.CONNECTION_STATUS for queueing status
In the Workload Admission Control Setting application, you can configure the threshold
values for admission control that determine when requests are queued or rejected. These
thresholds are defined as configuration parameters.
The admission control feature is enabled by default and the related threshold values and
configurable parameters are available in the indexserver.ini file. A pair of settings is available
for both memory and CPU that define firstly the queuing level (the default value is 90%) and
secondly, the rejection level (the default is not active). Two parameters are available to
manage the statistics collection process by defining how frequently statistics are collected
and setting the averaging factor that is used to moderate volatility. These parameters,
available in the Workload Admission Control Setting application and in the
admission_control section of the INI file, are summarized in the following table.
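As a sketch of how such a threshold could be changed with SQL (the parameter name queue_cpu_threshold is an assumption for illustration; check the admission_control section of your indexserver.ini for the exact names):

```sql
-- Lower the CPU queuing threshold from the default of 90% to 85%
-- (parameter name assumed; verify in the admission_control section)
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'DATABASE')
  SET ('admission_control', 'queue_cpu_threshold') = '85'
  WITH RECONFIGURE;
```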
Queue Management
If requests have been queued, items in the queue are processed when capacity becomes
available. A background job continues to evaluate the load on the system in comparison to the
thresholds. When the load is reduced enough, queued requests are submitted in batches on
an oldest-first basis.
The queue status of a request is visible in the M_CONNECTIONS view. The connection status
value is set to queuing in the column M_CONNECTIONS.CONNECTION_STATUS.
There are several configuration parameters (in the admission_control section of the INI
file) to manage the queue and how the requests in the queue are released. You can apply a
maximum queue size or a queue timeout value. If either of these limits are exceeded, requests
which would otherwise be queued, are rejected. An interval parameter is available to
determine how frequently to check the server load so that de-queueing can start, and a de-
queue batch size setting is also available.
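A sketch of adjusting the queue limits follows. The parameter names max_queue_size and queue_timeout are assumptions for illustration only; verify the exact names in the admission_control section of your indexserver.ini before applying anything:

```sql
-- Illustrative only: cap the queue size and the time a request may wait
-- (both parameter names are assumed, not confirmed by this text)
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'DATABASE')
  SET ('admission_control', 'max_queue_size') = '500',
      ('admission_control', 'queue_timeout') = '300'
  WITH RECONFIGURE;
```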
Note:
If Admission Control has been configured and is active, it takes precedence over
any other time-out value which may have been applied. This means that other
timeouts that apply to a query (such as a query timeout) will not be effective until
the query has been de-queued or rejected by the queue timeout.
applied by the SQL command ALTER USER. However, workload class settings only apply for
the duration of the current session, whereas changes applied to the user persist. More
detailed examples of precedence are given in a separate section.
To apply workload class settings, client applications can submit client attribute values
(session variables) in the interface connect string as one or more property-value pairs. The
key values which can be used to work with workload classes are: database user, client,
application name, application user, and application type.
Based on this information the client is classified and mapped to a workload class. If it cannot
be mapped, it is assigned to the default workload class. The configuration parameters
associated with the workload class are read and this sets the resource variable in the session
or statement context.
The list of supported applications includes HANA WebIDE (XS Classic), HANA Studio, ABAP
applications, Lumira, and Crystal Reports. Full details of the session variables available in
each supported client interface which can be passed in the connect string are given in SAP
Note 2331857 SAP HANA workload class support for SAP client applications.
Caution:
Workload classes cannot be used on an Active/Active (read-only) secondary
node.
Required Privilege
Managing workload classes requires the WORKLOAD ADMIN privilege. Changes of workload
classes or mappings will only be applied when a (connected) database client reconnects. In
terms of the privileges of the executing user (DEFINER or INVOKER), the workload mapping is
always determined on the basis of invoking user, regardless of whether the user has definer or
invoker privileges.
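Granting the privilege follows the usual system privilege syntax, for example (the user name is illustrative):

```sql
-- Allow WLM_ADMIN_USER to create and manage workload classes and mappings
GRANT WORKLOAD ADMIN TO WLM_ADMIN_USER;
```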
The ABAP server sets the client context information automatically for all ABAP applications.
Users, classes, and mappings are interrelated: if you drop a user in the SAP HANA database,
all related workload classes are dropped, and if you drop a workload class, the related
mappings are also dropped.
Note:
In a scale-out environment, workload classes are created for the complete SAP
HANA database and do not have to be created for each single node. However,
restrictions defined in these workload classes are applied to each single node
and not to the complete SAP HANA database.
You can classify workloads based on user and application context information and apply
configured resource limitations (for example, a statement memory limit). Workload classes
allow SAP HANA to influence dynamic resource consumption on the session or statement
level. When a request from an application arrives in SAP HANA, the corresponding workload
class is determined based on the information given by the session context such as application
name, application user name and database user name. Once the corresponding workload
class is determined, the application request can have its resources limited according to the
workload class definition.
Statement memory limits will not apply if memory tracking is inactive in SAP HANA cockpit.
You can activate memory tracking in the Configuration settings.
You can use workload classes to set values for the properties listed here. Each property also
has a default value, which is applied if no class can be mapped or if no other value is defined.
For all of the following parameters, although you can enter values that include decimal
fractions (such as 1.5 GB), these numbers are rounded down, and the resulting whole number
is the effective value that is applied.
● Workload Class Name: A name for the new workload class.
● Execution Priority: To support better job scheduling, this property prioritizes
statements in the current execution. Priority values of 0 (lowest priority) to 9 (highest)
are available. The default value is 5.
● Limit Type: Individual Statement Limit or Total Aggregate Statement Limit.
● Statement Memory Limit: Displayed if Individual Statement Limit is the specified limit
type. Maximum amount of memory the statement may use, as either an absolute or relative
value.
● Total Memory Limit: Displayed if Total Aggregate Statement Limit is the specified limit
type. Maximum amount of memory all statements may use, as either an absolute or relative
value.
● Statement Thread Limit: Displayed if Individual Statement Limit is the specified limit
type. Maximum number of parallel threads the statement may execute, as either an absolute
or relative value.
● Total Thread Limit: Displayed if Total Aggregate Statement Limit is the specified limit
type. Maximum number of parallel threads all statements may execute, as either an absolute
or relative value.
● Query Timeout: The amount of time in seconds before the query times out. (Available for
databases running SAP HANA SPS 03 or higher.)
Note:
For thread and memory limits, workload classes can contain either the statement-
level properties or the aggregated total properties, but not both. For the
aggregated limits, the full set of three properties must be defined: TOTAL
STATEMENT THREAD LIMIT, TOTAL STATEMENT MEMORY LIMIT, and
PRIORITY.
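Following the rule in the note above, a class using the aggregated limits must define all three total properties together. A sketch (the class name and values are illustrative):

```sql
-- Aggregate limits: priority, total memory, and total thread limit
-- must all be defined together
CREATE WORKLOAD CLASS "MyTotalClass" SET
  'PRIORITY' = '5',
  'TOTAL STATEMENT MEMORY LIMIT' = '50',
  'TOTAL STATEMENT THREAD LIMIT' = '40';
```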
Example
You can set values for one or more resource properties in a single SQL statement. The
following example creates a workload class called MyWorkloadClass with values for all three
properties:
CREATE WORKLOAD CLASS "MyWorkloadClass" SET 'PRIORITY' = '3',
'STATEMENT MEMORY LIMIT' = '2', 'STATEMENT THREAD LIMIT' = '20';
(Precedence example, table fragment: QueryTimeout 25 / 25 / 25 / 25*; statement_timeout
(ini) 10 / 10* / 10* / 10 (ignored).)
For more information, see Setting Session-Specific Client Information in the SAP HANA
Developer Guide.
The properties supported are listed in the following table in order of importance. The
workload class with the greatest number of properties matching the session variables passed
from the client is applied. If two workload classes have the same number of matching
properties, they are matched in the following order of importance.
Example
This example creates a workload mapping called MyWorkloadMapping that applies the
values of the MyWorkloadClass class to all sessions where the application name value is
HDBStudio:
CREATE WORKLOAD MAPPING "MyWorkloadMapping" WORKLOAD CLASS
"MyWorkloadClass" SET 'APPLICATION NAME' = 'HDBStudio';
This example applies more restrictive limits than those already defined and by default,
workload class hints can only be used in this way. The hint is ignored if any of the new values
weaken the restrictions or if any values are invalid. You can change this default behavior by
switching the following configuration parameter in the session_workload_management
section of the indexserver.ini file: allow_more_resources_by_hint. If this parameter is set
to True then any hint can be applied.
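A workload class can be applied to an individual statement with the WORKLOAD_CLASS hint. A sketch (the schema and table names are illustrative):

```sql
-- Run this query under the limits of MyWorkloadClass, provided the
-- class is more restrictive than the one already mapped to the session
SELECT * FROM "MY_SCHEMA"."MY_TABLE"
  WITH HINT( WORKLOAD_CLASS("MyWorkloadClass") );
```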
In these system views the field WORKLOAD_CLASS_NAME shows the effective workload
class used for the last execution of that statement:
● M_ACTIVE_STATEMENTS
● M_PREPARED_STATEMENTS
● M_EXPENSIVE_STATEMENTS (enable_tracking and memory_tracking must first be
enabled in the global.ini file for this view)
● M_CONNECTIONS
If no workload class is applied, these views display the pseudo-workload class value
_SYS_DEFAULT.
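For example, to see which class governed currently active statements (column names follow the documented monitoring views; verify them in your revision):

```sql
-- _SYS_DEFAULT appears where no workload class could be mapped
SELECT CONNECTION_ID, WORKLOAD_CLASS_NAME, STATEMENT_STRING
  FROM M_ACTIVE_STATEMENTS;
```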
LESSON SUMMARY
You should now be able to:
● Set up SAP HANA workload management
LESSON OBJECTIVES
After completing this lesson, you will be able to:
● Capture and replay a SAP HANA workload
The SAP HANA Capture and Replay performance management tool allows you to capture the
workload of a source system and to replay the captured workload on a target system without
the original client applications.
Moreover, you can use the tool to analyze the captured workload and the reports generated
after replaying the workload. Comparing the performance between the source and target
systems can help you to find the root cause of performance differences.
The following changes may require a check of the existing system, concerning both
performance and stability:
● Hardware change
● SAP HANA revision upgrade
● SAP HANA INI file change
● Table partitioning change
● Index change
● Landscape reorganization for SAP HANA scale-out systems
● Applying hints to queries
What Is a Workload?
A workload in the context of SAP HANA can be described as a set of requests with common
characteristics.
In the context of SAP HANA capture and replay, workload can mean any change to the
database via SQL statements that come from SAP HANA client interfaces such as JDBC,
ODBC, or DBSL. The workload can be created by applications or clients (for example, SAP
NetWeaver or Analytic).
You can look at the details of a workload in several ways. Firstly, you can look at the source of
requests and determine if applications or application users generate a high workload for the
system. You can examine what kinds of SQL statements are generated. Are they simple or
complex statements? Is there a prioritization of work done based on business importance?
For example, does one part of the business need to have more access at peak times? You can
then look at what kind of service level objectives the business has in terms of response times
and throughput.
The main steps involved in the capture and replay process are:
1. Capture
In this step the tool automatically collects the execution context information together with
the incoming requests to the database. The captured workload file stores the start times
of the SQL statements.
A database backup is recommended after starting the capture, to ensure that the source
and target systems are in a consistent state.
2. Preprocess
In this step the tool reconstructs and optimizes the captured workload file so it can be
replayed on a target system. This process is a one-time operation and the stored
preprocessed workload file can be replayed multiple times.
3. Replay
The replayer is a service on operating system level that needs to be started before
replaying.
The tool replays the preprocessed file based on the SQL statement timestamp or on the
transactional order. Together with the collected execution context it allows you to
accurately simulate the database workload.
4. Analyze
For a final analysis, you can generate comparison reports displaying a capture-replay or a
replay-replay comparison. You can analyze the statements based on results or on
performance.
Note:
We recommend that you perform a full database backup after starting the capture
step. This backup needs to be restored on the target system before the replay
phase starts to ensure that the source and target systems are in a consistent
state.
What Are the Landscape Requirements for Using SAP HANA Capture and Replay?
You can use a two- or three-system setup as a SAP HANA Capture and Replay landscape. The
following figure shows both the two-tier setup and the three-tier setup.
In a two-system setup, you need a source and a target system. The control system and the
target system share the same host.
In a three-system setup, a separate control system is added. This control system is the
system running the cockpit and is the system for storing intermediate preprocessed or replay
results.
The advantage of a three-system setup is that the replay results are stored in a separate
control system. This means the capture information is not lost when recovering the target
system.
Consider the following recommendations when using SAP HANA capture and replay:
● Check the disk performance to ensure that there is sufficient bandwidth for capturing and
preprocessing workloads without any performance bottlenecks. If disk performance is not
sufficient, the active capture can impact the source system.
● Check the available disk space in combination with the characteristics of the workload that
should be captured. The required disk space is highly dependent on the type of workload
being captured.
● Use the disk space that is dedicated to the database instance itself.
● One replayer service is sufficient to execute a replay successfully. For better scalability and
performance in large workload scenarios, multiple replayers can be used for all replaying
purposes. When using multiple replayers, distribute and divide all involved components
(for example, target instance, control instance, one or more replayers) on different hosts
and systems. Doing so will reflect the initial captured workload as realistically as possible.
This will also reduce the effect which the resource consumption of the components may
have on a replay.
● Use a separate control and target instance for replaying workloads. If a replayed statement
causes a crash, it is displayed in the replay report. If you use the same control and
target instance, the replay report entry for the statement causing the crash cannot be
successfully sent to the control instance.
● Use the secure store for saving passwords and authenticating users.
● The target system should meet the same privacy and security prerequisites as the source
system. Since the target system processes the same data as the source system, it should
meet an appropriate security level depending on data criticality.
Unnecessary network connections to the target system should not be allowed. Users
registered on the source system might be able to access the target system after a replay
has been completed.
● Regarding version dependencies, the following rules apply:
- Target system >= control system and replayers >= source system.
- The source system should be revision 122.14 or higher for captures with transactional
replay enabled.
● To trigger replays, the control system and target system must be registered in the same
SAP HANA cockpit. The user in the SAP HANA cockpit must be able to access both
systems.
When registering the target system, the cockpit does not store the credentials.
To capture a workload on the source system, you need a user with WORKLOAD CAPTURE
ADMIN system privilege. Additionally, you can add the optional privileges INIFILE ADMIN and
BACKUP OPERATOR. These two additional privileges let you change parameter values in the
optional filters on the capture configuration page, and start database backups while the
workload capture is running.
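The privileges above can be granted with standard SQL, for example (the user name is illustrative, and the two optional grants may be omitted):

```sql
-- Required to capture a workload on the source system
GRANT WORKLOAD CAPTURE ADMIN TO CAPTURE_USER;
-- Optional: allow parameter changes and backups during a capture
GRANT INIFILE ADMIN TO CAPTURE_USER;
GRANT BACKUP OPERATOR TO CAPTURE_USER;
```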
3. To start configuring the new capture, on the Capture Management page, choose New
Capture.
On the Configure New Capture page, it is mandatory to enter the name of the new capture.
You can customize other optional settings before you start the capture.
The Capture Monitor opens, displaying monitoring information such as duration, the
number of captured statements, or disk space. You can stop the capture or you can let it
run for as long as you wish. If you did not create a backup when starting the capture, you
can also start a full backup from the Capture Monitor page.
Note:
By default, the captured workload file is stored under the trace directory
$DIR_INSTANCE/<host name>/trace with a CPT file extension.
After the capture is complete, the new captured workload has the status Captured. By
choosing the new captured workload, the Capture Report opens, displaying information about
the captured workload. You can continue analyzing the captured workload. For more
information, see Analyze a Captured Workload.
Hint:
To ensure that the source system and the target system are in a consistent
state for capture and replay, we recommend performing a full database
backup after starting the capture. A full database backup is only required the
first time, because incremental backups can be used once the system has
been initialized. For more information, see SAP HANA Backup and Recovery.
Analyze Workload provides information on the number of traced files, as well as information
on the number of loaded files. To analyze the traced workload data, the file must be loaded
into the database.
Prerequisites
You have the system privilege WORKLOAD ANALYZE ADMIN.
2. To load the workload into the SAP HANA database, choose Load Workload.
4. In the Workload Analysis page, several analysis charts and information tables are
displayed.
Note:
The replayer is not part of the SAP HANA database services that are running as
daemon processes. You must start and stop the replayer yourself.
Prerequisites
● A user with the WORKLOAD REPLAY ADMIN system privilege to control the replayer.
● Store the logon credentials in the secure store.
● When using multiple replayers, distribute and divide all involved components (for example,
target instance, control instance, and one or more replayers) on different hosts and
systems.
The replay process is performed by the replayer, which must be running before you start the
replay. The replayer is a service at operating system level that reads SQL commands from the
preprocessed workload file and executes them one-by-one in timestamp-based order. All
preprocessed workloads can be replayed as often as necessary.
Hint:
SAP recommends performing the preprocessing step in the target system or in a
separate control system, not in the production system. The preprocessing may
require significant computing power.
You can preprocess a captured workload and replay the preprocessed workload using the
Replay Workload card.
Prerequisites
● A user with WORKLOAD REPLAY ADMIN system privilege.
2. Check the status of the captured workload that you want to preprocess. The status should
be Not Preprocessed.
5. The preprocessing starts. The runtime depends on the size of the captured workload. You
can manually refresh the screen by using the refresh button at the top right of the screen.
Preprocess Destination: After the preprocessing is complete, the preprocessed workload file
is stored by default in the $SAP_RETRIEVAL_PATH/trace directory. Because the default
trace directory generally resides in the same storage area as the data and log volumes,
preprocessing workloads may affect performance across the entire system. Enter a
different destination to achieve a better distribution of the disk I/O between the data and
log volumes and the preprocessed files.
Note:
As an example, an executed statement A has a runtime of 10 ms during capture
and 12 ms during replay. If the difference in runtime on the target system is
lower than the configured threshold value, the statement is no longer listed in
the replay report as slower or faster, but as comparable.
Hint:
Manually copy the captured workload files and the database backup from the
source system to the control or target system.
Prerequisites
● A user with WORKLOAD REPLAY ADMIN and WORKLOAD ANALYZE ADMIN system
privileges.
● The target system meets the same security and privacy prerequisites as the source
system. Since the target system processes the same data as the source system, it should
meet an appropriate security level depending on data criticality.
● You have preprocessed the captured workloads using the Replay Workload card.
● The replayer is running.
Caution:
Do not allow unnecessary network connections to the target system. Users
registered on the source system could access the target system after the replay
is complete.
2. Choose a replay candidate with the status Preprocessed to start configuring it for the
replay. The Replay Configuration page opens, allowing you to configure various mandatory
and optional settings.
If a database backup is available, restore the database before starting the replay in the target
system. When running a replay on a target system that has been restored using a backup
taken automatically during the capture process, activate the Synchronize Replay with Backup
option.
If no database backup, or only an outdated one, is available, you can still restore the
database or manually export parts of the data before starting the replay in the target
system. When running a replay on a target system that was restored using old backups or
that contains only smaller manual exports of data, deactivate the Synchronize Replay with
Backup option.
The Replay Management screen opens, displaying in the Replay List tab the workloads that
are being replayed. To access the Replay Monitor, choose the running replay. The monitoring
view provides information such as duration, number of statements, size, and other details
about the replay in progress. You can navigate away from the monitoring view using the arrow
at the top right and return at any time.
If you have already replayed preprocessed workloads, you can generate comparison reports
for further analysis. For more information, see Generating Comparison Reports.
● Instance Number: Enter the target instance number (for example, 42) where the capture is
replayed.
● Database Mode: Choose between Single Container or Multiple Containers.
Replayer Options
● User Name: Enter a database user who has WORKLOAD REPLAY ADMIN privilege and is
used for the final preparation steps in the target instance.
● Password: Enter the database user password.
● Request Rate: Modify the rate at which the statements are replayed.
You can decrease the wait time between statements during the replay. For example,
statement B starts one second after statement A has been triggered. When setting the
request rate from 1x to 2x, this difference is only 0.5 seconds.
● Consistent with Backup: This option allows you to synchronize the replay with an existing
database backup.
The option is turned on by default, allowing the replayer to compare each statement with
the database backup. This makes it possible to verify that there are no duplicate
inserts and that the backup and replay are aligned. A backup is required for this option to
work correctly.
If the option is turned off, the replayer replays statements, even if no backup is present.
This is important for scenarios in which you use only single tables, or smaller data exports,
which are not considered a complete backup.
● Collect Explain Plan: Collect the output of the EXPLAIN PLAN command for captured
statements. You can use this information for comparison after the replay.
● Transactional Replay: This option enables guaranteed transactional consistency during a
replay.
Caution:
Enabling this option may cause overhead to query runtime as transactional
consistency needs to be checked constantly.
Replayer Information
● Replayer List: Select a running replayer that is used to connect to the target system and
facilitates the replay.
● User Authentication: Enter the password or the securestore key for the database users
captured in the source system. For a realistic replay, all users that are part of the workload
which you have chosen to replay, must be authenticated.
To reset the password for the database users captured in the source system, select Reset
Password, then choose the user. This can be helpful when you do not know the actual
password of each user. On the Reset Password window, set a new password for all selected
users, and choose Confirm. All selected user passwords in the defined target system are
changed as defined in this step.
Note:
Storing the replay results outside the database can be useful when the target and
control systems are the same. In such a setup, the previous replay results of the
control system could be overwritten after recovering the target system from the
database backup.
Prerequisites
● A user with WORKLOAD REPLAY ADMIN and WORKLOAD ANALYZE ADMIN system
privileges.
● You have replayed preprocessed workloads using the Replay Workload card.
You can start the comparison of the replayed workloads from the Replay List in the Replay
Management.
2. Select one entry from the displayed list and choose Close. The Select Target Replay dialog
opens, allowing you to select the replayed workload that you want to compare with the
previously selected workload. The list displays replayed workloads based on the same
initial captured workload.
3. Select one or more entries from the displayed list and choose Compare Replays. The
Comparison Report opens, displaying a comparison of the selected replayed workloads.
Overview Tab
The Overview tab displays an overall comparison of the SQL statements involved in the
capturing and replaying process in the following blocks:
● Result Comparison: In a result-based comparison, you get an overview of the statements
with identical or different results. Choose the block to open the Result Comparison tab
directly.
● Performance Comparison: In a performance-based comparison, you get an overview of the
statements based on a comparison of runtimes. Choose the block to open the Performance
Comparison tab directly.
● Different Statements: Displays the top SQL statements that have different results from the
selected baseline in descending order. You can choose each row to open the Execution
Detail page for the selected SQL statement. Use the drop-down arrow to filter the
statements by time or by the number of records that have different results.
● Slower Statements: Displays the top SQL statements that have a different performance
ordered by the difference in execution time. You can choose each row to open the
Execution Detail page for the selected SQL statement. To view KPI details for each
statement, choose the icon to the right.
● Verification Skipped: Displays the distribution of reasons for statements with skipped
result comparison.
● Replay Failed Statements: Displays the distribution of reasons for the statements, which
failed during replay. Use the drop-down arrow to filter the statements by time or error
code.
● Capture Information: Displays information on the capture system, capture options, and
the properties of the capture file.
● Replay Information: Displays information on the replay system and the replay options. If
the comparison was made between two replays, the information is displayed in a Baseline
Replay Information block and in a Target Replay Information block.
Load Tab
The Load tab includes load charts comparing both the captured and the replayed workloads
after a capture-replay comparison, or the baseline and the target workloads after a replay-
replay comparison. The KPIs can be toggled independently for both the capture and replay
aspects, making it easier to compare them with each other. Additional KPIs can be added
using the Show More KPIs button at the top right of the load chart.
In the first part of the course HA215, we looked at specific troubleshooting and system
analysis when the SAP HANA database system is offline, hanging, or slow. We discussed how
to investigate the system status and how to generate a full system dump of the trace files.
In the second part, we looked at how to perform a performance root cause analysis on issues
regarding high memory, CPU and disk utilization. We also looked at how to identify expensive
SQL statements.
In the final part, we looked at the tools alerting framework, workload management, and the
capture and replay functionality provided by SAP HANA.
Using the SAP HANA Alerting Framework allows you to be informed up-front about possible
problems. Alerts are shown in SAP HANA cockpit, but they can also be sent using email.
SAP HANA workload management lets you control what is executed on the SAP HANA
database when the system is under high load. You can set up rules that allow SAP HANA to
decide what can be executed when the system load is high.
SAP HANA Capture and Replay lets you capture the current workload on your SAP HANA
production system and then replay this workload on a test system. This allows you to
investigate regression and/or performance degradation problems after a change to the SAP
HANA database hardware and software configuration. The tool can also be used to test the
performance and stability of a new SAP HANA version or support package stack.
LESSON SUMMARY
You should now be able to:
● Capture and replay a SAP HANA workload
Learning Assessment
1. When configuring SAP HANA alerts, you can only enter one email recipient per alert.
Determine whether this statement is true or false.
X True
X False
2. Which protocol is used by the statistics service to collect statistical and performance
information?
Choose the correct answers.
X A JSON
X B MDX
X C SNMP
X D SQL
4. CPU-binding is a resource pooling technique at the SAP HANA kernel level, and it can be
configured within the SAP HANA database.
Determine whether this statement is true or false.
X True
X False
X A Index changes
6. The SAP HANA Capture and Replay tool allows you to capture real system workload in a
productive environment, and preprocess and replay the captured workload on a different
target system.
Determine whether this statement is true or false.
X True
X False
1. When configuring SAP HANA alerts, you can only enter one email recipient per alert.
Determine whether this statement is true or false.
X True
X False
Correct! You can configure more than one email recipient per alert. Read more about this
in the lesson "Configuring SAP HANA Alerting Framework" of the course HA215.
2. Which protocol is used by the statistics service to collect statistical and performance
information?
Choose the correct answers.
X A JSON
X B MDX
X C SNMP
X D SQL
You are correct! The statistics service uses SQL to read the data from the SAP HANA
monitoring views. Read more about this in the lesson "Configuring SAP HANA Alerting
Framework" of the course HA215.
Correct! The query type and business importance are characteristics of the workload.
Read more about this in the lesson "Setting up SAP HANA Workload Management" of the
course HA215.
4. CPU-binding is a resource pooling technique at the SAP HANA kernel level, and it can be
configured within the SAP HANA database.
Determine whether this statement is true or false.
X True
X False
X A Index changes
You are correct! Changes to an index and/or the table distribution influence the way SAP
HANA accesses data, so this can have an impact on performance. Read more about
this in the lesson "Using SAP HANA Capture and Replay" of the course HA215.
6. The SAP HANA Capture and Replay tool allows you to capture real system workload in a
productive environment, and preprocess and replay the captured workload on a different
target system.
Determine whether this statement is true or false.
X True
X False
Correct! The capture and replay does not need to be performed on the same server. Read
more about this in the lesson "Using SAP HANA Capture and Replay" of the course HA215.