Document 1070954.1 https://support.oracle.com/epmos/faces/DocumentDisplay?_adf.ctrl-stat...

Copyright (c) 2022, Oracle. All rights reserved. Oracle Confidential.

Oracle Exadata Database Machine EXAchk (Doc ID 1070954.1)

In this Document

Purpose
Scope
Details
Exachk Overview
Documentation
Installing Exachk
Latest Available Exachk Version
Checking Current Exachk Version
Installing/Updating AHF in an Exadata on-prem deployment
Installing/updating AHF in an Exadata Cloud deployment (Exadata Cloud@Customer, Exadata
Cloud Service)
Running Exachk
Use Case - General Maintenance and Configuration Compliance
Use Case - Troubleshooting any Exadata Stability or Performance Problem
Use Case - Patching, Upgrading, and Other Planned Maintenance
Special Use Cases
Large Exadata OVM Systems Logically Divided into Smaller Systems
Skipping Specific Best Practice Checks
General Notes on the Exachk Report
Verify the Exachk Version Used to Generate the Report
Show the Appropriate Level of Detail
Review Important Sections of the Report
Exachk Health Check Catalog
Troubleshooting Exachk Report Results
Killed processes or skipped checks due to timeouts
A false result where the summary check finding differs from the view detail data
Skipped checks due to “file not found”
References

APPLIES TO:

Gen 1 Exadata Cloud at Customer (Oracle Exadata Database Cloud Machine)
Gen 2 Exadata Cloud at Customer
Oracle Platinum Services - Version N/A and later
Oracle Cloud Infrastructure - Exadata Cloud Service
Oracle Exadata Storage Server Software - Version 12.1.1.1.0 and later
Oracle Solaris on SPARC (64-bit)
Linux x86-64
See "Scope" for additional supported products data

1 of 10 11/02/2022 14:37

PURPOSE

This document describes how to obtain, install, execute, and update Exachk for Oracle Exadata Database Machine based
implementations. It also describes common use cases and best practices to ensure that your Exadata deployment and
configurations remain compliant with Exadata Maximum Availability Architecture (MAA) best practices.

SCOPE

This document applies to all Oracle Exadata Database Machine based implementations:

• Exadata Cloud at Customer, Exadata Cloud Service, and Autonomous Database Services
• Exadata On-Premises
◦ Bare Metal
◦ Virtual Machines

DETAILS

Exachk Overview

Exachk holistically evaluates Exadata Database Machine engineered systems.

It includes:

• Configuration checks for Database Servers, Storage Servers, and Network Fabric Switches:
◦ Firmware
◦ Operating System (e.g. Oracle Linux)
◦ Exadata software
◦ Grid Infrastructure and ASM
◦ Database
• MAA Scorecard:
◦ MAA Configuration Review
◦ Exadata Software Planner
◦ Exadata Critical Issue alerts
• Automatic Correction (when applicable):
◦ Configuration Correction
◦ Critical Issue Avoidance
• Prerequisite checks for DB and GI software updates
• Prerequisite checks for DB and GI upgrades
• Prerequisite checks for application continuity readiness

The holistic Exachk report provides customers a self-managed way to keep their Exadata systems compliant with and
optimized to Exadata MAA best practices. In addition to the automatic correction feature, individual checks have
explanations, recommendations, and manual verification commands so that customers and administrators can evaluate the
risks of and self-correct reported conditions.

The Exachk report integrates with Oracle’s ownership, tuning, and optimization of the entire Exadata platform, and is a key
benefit for Exadata customers.

Exachk content is continuously expanded and improved via an MAA / Exadata development collaborative process.

ALWAYS use the latest available Exachk version. The target release schedule for Exachk is three months between versions.
In certain cases, such as the issuance of new Exadata Critical Issues (Document 1270094.1), interim releases may appear
as needed.

MAA and Exadata development recommend that the latest Exachk be executed with the following frequency:

• Perform a full run at least once per month
• Perform a “-profile exatier1” run under the following circumstances:
◦ Immediately after deployment
◦ After a configuration change
◦ Before and immediately after any planned maintenance activity

For all Exadata cloud deployments, customers execute exachk in any customer managed Exadata VM cluster. Oracle
cloud operations personnel execute exachk for any Oracle managed Exadata infrastructure and systems.

Documentation

User's Guide and What's New - Autonomous Health Framework Compliance Checks and Diagnostics

Detailed Features and Fixes - Version History

Installing Exachk

Exachk is now a component of the Autonomous Health Framework (AHF), along with Trace File Analyzer (TFA). Install the
current version of AHF to get the latest Exachk and TFA releases.

ALWAYS use the latest available AHF/Exachk version.

Latest Available Exachk Version

(NOTE: active software support is required in order to download EXAchk)

Production Releases:

• On-Premise
◦ AHF/TFA/EXAchk 21.4.1 for Linux
◦ AHF/TFA/EXAchk 21.4.1 for Supercluster

• Cloud Certified
◦ AHF/TFA/EXAchk 21.2.8 for Oracle Cloud

Checking Current Exachk Version

Recent Exadata on-premises and Exadata Cloud Service deployments have AHF/Exachk installed as part of the deployment
procedure. Older deployments may have an AHF/Exachk installation already.

To see if an AHF/Exachk installation already exists, execute the following tfactl command as the root userid on one of the
database servers in a cluster:

# tfactl version -all


TFA Version : 201100
TFA Build ID : 20200401033759
TFA Build Label : TFA_MAIN_GENERIC_200331.0101

EXACHK VERSION: 20.1.1_20200401

AHF VERSION: 20.1.1

In the above example, a full AHF installation exists and the Exachk and AHF versions match.
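
The comparison described above can also be scripted. The following is a minimal sketch, assuming the output has the same layout as the sample shown; `versions_match` is a hypothetical helper name and is not part of AHF.

```shell
#!/bin/sh
# Hypothetical helper: reads text laid out like the sample
# "tfactl version -all" output on stdin and reports whether the
# Exachk version (without its _YYYYMMDD suffix) equals the AHF version.
versions_match() {
    awk -F': *' '
        /^EXACHK VERSION/ { v = $2; sub(/_.*/, "", v); gsub(/[ \t\r]/, "", v); exachk = v }
        /^AHF VERSION/    { v = $2; gsub(/[ \t\r]/, "", v); ahf = v }
        END {
            if (exachk != "" && exachk == ahf) { print "match " ahf; exit 0 }
            print "mismatch exachk=" exachk " ahf=" ahf
            exit 1
        }'
}
```

As root, it could be used as `tfactl version -all | versions_match`; a non-zero exit status suggests that the installation should be reviewed.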

To find the root of the AHF/Exachk file structure, execute the following command:

# cat /etc/oracle.ahf.loc
/opt/oracle.ahf

Further, if the full AHF installation was successful, Exachk should be scheduled to execute the exatier1 profile every day at
02:00. You can verify the autorun configuration using this command:

# exachk -get all -id autostart_client_exatier1


------------------------------------------------------------
ID: exachk.autostart_client_exatier1
------------------------------------------------------------
AUTORUN_FLAGS = -usediscovery -profile exatier1 -syslog -dball -showpass -tag autostart_client_exatier1
COLLECTION_RETENTION = 1
AUTORUN_SCHEDULE = 0 2 * * *
------------------------------------------------------------
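
The AUTORUN_SCHEDULE value is a standard five-field cron expression (minute, hour, day of month, month, day of week), so "0 2 * * *" means 02:00 every day. A minimal sketch for checking this automatically, assuming the output layout shown above; both function names are hypothetical helpers, not exachk features.

```shell
#!/bin/sh
# Hypothetical helper: print the cron expression found in exachk
# autorun settings read on stdin, with trailing whitespace removed.
autorun_schedule() {
    sed -n '/^AUTORUN_SCHEDULE/{s/^AUTORUN_SCHEDULE *= *//;s/[[:space:]]*$//;p;}'
}

# Hypothetical helper: succeed only when the schedule read on stdin
# is the daily 02:00 run described above.
is_daily_0200() {
    [ "$(autorun_schedule)" = "0 2 * * *" ]
}
```

For example, `exachk -get all -id autostart_client_exatier1 | is_daily_0200` returns success when the default schedule is in place.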

If it is necessary to do an initial installation of AHF, perform the steps in the following section.

NOTE: If you installed EXAchk using the extract method into the /opt/oracle.ahf directory and wish to upgrade to the
latest AHF to take advantage of the latest TFA features, remove the extract-based EXAchk install (using rm
/opt/oracle.ahf/*) before running the ahf_setup command to upgrade/install AHF.

Installing/Updating AHF in an Exadata on-prem deployment


Exadata Bare Metal host or Exadata Virtual DomU host

As the root user on one BM database server (or domU/VM in Virtual deployment) in the cluster, perform the following
steps:

1. Stage the latest on-premise AHF source file to a temporary directory and unzip it.

unzip AHF-LINUX_<version>.zip

2. Execute the following AHF setup command, from the location in which AHF is staged:

./ahf_setup -ahf_loc /opt -data_dir <ORACLE_BASE of Grid owner>

e.g.:

./ahf_setup -ahf_loc /opt -data_dir /u01/app/grid

Exadata Dom0 (Xen or KVM) Host

AHF can be installed on an Exadata Xen dom0 or KVM host so that Exachk can be run against the Exadata infrastructure.
Because there is no /u01 filesystem on a Xen dom0/KVM host, the AHF source code is installed into /opt/oracle.ahf and the
data goes to the same location. Further, because no Clusterware runs on a Xen dom0/KVM host, AHF must be installed locally
on each database server in the cluster. The AHF/TFA daemon does not run on an Exadata Xen dom0/KVM host; only the exachk
command is available.

As the root userid on each database server Xen dom0/KVM Host in the cluster, perform the following steps:

1. Stage the latest on-premise AHF source file to a temporary directory and unzip it.

unzip AHF-LINUX_<version>.zip

2. Execute the following AHF setup command, from the location in which AHF is staged:

./ahf_setup -ahf_loc /opt -silent -local -data_dir /opt

Installing/updating AHF in an Exadata Cloud deployment (Exadata Cloud@Customer, Exadata Cloud Service)

Because of the /opt and /u01/app/oracle available space consideration, the AHF source code is installed into
/opt/oracle.ahf, and the data into /u02. Further, because of the root userid access restrictions, AHF must be
installed/updated locally on each database server in the cluster as root user, independently, by following the root access
guidelines for cloud systems. Once every database server has the daemon running, the individual daemons will use TFA
socket access to find each other and establish a cluster wide view of the environment.

As the root userid on each database server in the cluster, perform the following steps:

1. Stage the latest cloud certified AHF source file to a temporary directory and unzip it.

unzip AHF-LINUX_<version>.zip

2. Set necessary permissions, if needed:

ls -l / | grep u02

If not already owned by root, do:

chown root:oinstall /u02

3. Execute the following AHF setup command, from the location in which AHF is staged:

./ahf_setup -ahf_loc /opt -silent -local -data_dir /u02

After AHF is installed local to each database server, the TFA daemons will discover each other (typically within 5 to 10
minutes).
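
The three installation variants above differ only in the ahf_setup arguments. The following sketch collects them in one place; it only prints the command for review and does not execute it. The function name and the deployment labels (onprem, dom0, cloud) are hypothetical and chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the ahf_setup command from this note for a
# given deployment type. $1 = onprem | dom0 | cloud (labels invented for
# this sketch); $2 = ORACLE_BASE of the Grid owner (onprem only).
ahf_setup_cmd() {
    case "${1:-}" in
        onprem)
            [ -n "${2:-}" ] || { echo "ORACLE_BASE of Grid owner required" >&2; return 1; }
            printf './ahf_setup -ahf_loc /opt -data_dir %s\n' "$2" ;;
        dom0)
            printf './ahf_setup -ahf_loc /opt -silent -local -data_dir /opt\n' ;;
        cloud)
            printf './ahf_setup -ahf_loc /opt -silent -local -data_dir /u02\n' ;;
        *)
            echo "unknown deployment type: ${1:-}" >&2; return 1 ;;
    esac
}
```

For example, `ahf_setup_cmd onprem /u01/app/grid` prints the on-prem command shown earlier. Remember that the dom0 and cloud variants must be run on every database server, while the on-prem variant is run on one node.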

Running Exachk

Once Exachk is installed, with the exception of one specific virtualized environment case (which is discussed later), the
same commands can be used in all Exadata environments. For all the following use cases, usage of the root userid is
implicit.

Before executing Exachk, always execute the following command to compare the currently installed version to the most
recent version available in this MOS note:

# exachk -v
EXACHK VERSION: 20.1.0(BETA)_20200220

If a more recent version is available, follow the instructions in the “Installing Exachk” section above.
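
The version comparison can be automated. The following is a minimal sketch, assuming the `exachk -v` output format shown above and that versions are dotted numbers; `installed_version` and `version_lt` are hypothetical helper names, and the latest version must still be read from this note by hand.

```shell
#!/bin/sh
# Hypothetical helper: extract the dotted numeric version from a line
# such as "EXACHK VERSION: 20.1.0(BETA)_20200220" read on stdin.
installed_version() {
    sed -n 's/^EXACHK VERSION: *\([0-9][0-9.]*\).*/\1/p'
}

# Hypothetical helper: succeed when dotted version $1 is strictly lower
# than $2, comparing field by field as numbers (missing fields count as 0).
version_lt() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            if (x[i] + 0 < y[i] + 0) exit 0
            if (x[i] + 0 > y[i] + 0) exit 1
        }
        exit 1   # versions are equal
    }'
}
```

A possible use, with 21.4.1 taken from the on-premise release listed in this note: `version_lt "$(exachk -v | installed_version)" 21.4.1 && echo "update AHF/Exachk"`.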

Use Case - General Maintenance and Configuration Compliance

Execute the following command at least once a month:

# exachk

Schedule fixes for any issues uncovered as part of regular maintenance, thereby keeping the Exadata environment up to date
and stable.

Use Case - Troubleshooting any Exadata Stability or Performance Problem

While not primarily designed as a diagnostic tool, an Exachk report provides a lot of background information that may be
useful, and may point out certain classes of problems. If time is of the essence, execute the exatier1 profile (or use the last
overnight exatier1 execution):

# exachk -profile exatier1

If there is time available, then collect a full Exachk report:

# exachk

Use Case - Patching, Upgrading, and Other Planned Maintenance

Use the same commands and timing for this group of use cases. A full report should be run far enough in advance so that
the output can be analyzed and corrective action taken prior to the actual maintenance window. Then shortly before the
maintenance activity begins, execute the exatier1 profile to make sure nothing critical has changed. Immediately after the
maintenance activity, execute the exatier1 profile again to pick up any critical issues that may have become relevant
because of the maintenance activity.

Several weeks before:

# exachk

Shortly before and immediately after:

# exachk -profile exatier1

Special Use Cases

Large Exadata OVM Systems Logically Divided into Smaller Systems

If the environment consists of a large Exadata OVM system that is logically divided into multiple smaller OVM systems, such
as a full rack divided into two half racks, then Exachk must be told the configuration of the target logical dom0. If this is
not done, the Exachk discovery process will communicate with all the components of the full rack, since they are physically
interconnected. For example:

# exachk -profile exatier1 \
-clusternodes randomadm01,randomadm02 \
-cells randomceladm01,randomceladm02,randomceladm03 \
-ibswitches randomsw-ibs0,randomsw-ibb0,randomsw-iba0

Skipping Specific Best Practice Checks

In general, keep Exadata machines current with recommended best practices by implementing corrective actions indicated
in the latest available Exachk release.

However, for specific reasons after careful testing and risk analysis, there may be one or more configuration settings
maintained by an organization at a setting other than the recommended best practice setting. Rather than having out of
compliance settings listed in the report and manually ignoring them, it is possible to leave the specific check or checks out
of the report.

If you suspect that a check is giving a false result, do not simply skip that check. Open a service request with Oracle
Support for further investigation. Provide the exachk report and zip file with details showing why the check result is
incorrect.

To skip a check, you must first find the check id of the check. In the Exachk report “Report Feature” section, check the box
next to “Show Check Ids” to make check ids visible in each check summary line. For example:

Check Id                          Status    Type      Message                                           Status On             Details
9CC87B4EC33DAE8AE053D598EB0A65EF  CRITICAL  OS Check  System is exposed to Exadata Critical Issue EX57  All Database Servers  View

If there is only one check that you wish to skip, you can do so directly on the Exachk command line. For example:

# exachk -excludecheck 9CC87B4EC33DAE8AE053D598EB0A65EF

If there are only a few, you may use a comma-separated list (no spaces after commas!) on the command line. For
example:

# exachk -excludecheck 9CC87B4EC33DAE8AE053D598EB0A65EF,22B973A23F934D53E0530C98EB0A0463

Rather than enter some number of check ids to exclude on the command line every time, you may create a specific file in
the Exachk installation directory to store the check ids. Create the file as an ordinary text file owned by the userid that
owns the Exachk installation directory, and place one check id per line. For example:

# cat /etc/oracle.ahf.loc
/opt/oracle.ahf
# cd /opt/oracle.ahf/exachk
# cat excluded_check_ids.txt
29D9ABC312636BA6E0530B98EB0AC0F4
B5D069540D986319E0431EC0E50A82B1
F084AE43E5770CF4E04312C0E50A4A37
9ADA9729FCD46EBBE040E50A1EC02350
5D691B1A8146F67CE053D398EB0A8822
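
If the check ids are kept one per line in a file, the comma-separated command line form can be generated from it rather than typed by hand. The following is a minimal sketch; `exclude_arg` is a hypothetical helper name, and it simply drops blank lines and stray whitespace so that no spaces end up after the commas.

```shell
#!/bin/sh
# Hypothetical helper: read one check id per line on stdin and print
# them as a single comma-separated list with no spaces.
exclude_arg() {
    awk '
        { gsub(/[ \t\r]/, "") }                            # strip whitespace
        NF { out = out (out == "" ? "" : ",") $0 }         # skip blank lines
        END { print out }'
}
```

For a one-off run with ids from an arbitrary file, this could be used as `exachk -excludecheck "$(exclude_arg < my_ids.txt)"`, where my_ids.txt is a placeholder file name.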

Regardless of the exclusion method used, an “Excluded Checks” section is added to the Exachk report that lists the
excluded check or checks. For example:

Excluded Checks

Skipping CHECK ID: BF7AE780E1252F69E0431EC0E50AE447 ( High Redundancy Controlfile) on <HOSTNAME> because its excluded

General Notes on the Exachk Report

Verify the Exachk Version Used to Generate the Report

First and foremost, review the Exachk version listed at the top of the report:

Exachk Version 20.1.1_20200320

Compare it to the most recent version in this MOS note. If a more recent version is available, follow the
instructions in the “Installing Exachk” section.

Show the Appropriate Level of Detail

Each Exachk report, by default, displays only the CRITICAL check failures. All the other result categories of checks are
present, but not displayed. To make all checks visible, select “All” in the Report Feature section of the report.

In general, any link in an Exachk report may be clicked to see additional data.

Review Important Sections of the Report

• Each report has a “Cluster Summary” section and score at the top of the report.
• The table of contents lists the sections included in the report. Clicking on a link in the TOC takes you to that section
of the report.

• There will be a summary line for each check in the report. For example:

Check Id                          Status    Type      Message                                           Status On             Details
9CC87B4EC33DAE8AE053D598EB0A65EF  CRITICAL  OS Check  System is exposed to Exadata Critical Issue EX57  All Database Servers  View

Clicking on the view link in the check summary line will expand information on the check, recommended corrective
actions, and data details.

• The Maximum Availability Architecture (MAA) Scorecard section of the report contains important data and directives
related specifically to MAA configurations. This section also contains the recommended software and firmware
version matrix in the “SOFTWARE MAINTENANCE BEST PRACTICES” section. The software maintenance best
practice guide is derived from MOS documents 888828.1 (Exadata), 2333222.1 (Exadata Cloud) and 1270094.1
(Exadata Critical Issues). There may be a delay (up to one Exachk release) between latest information found in the
parent MOS notes and Exachk.
• Another important source of configuration data is the “Infrastructure Software and Configuration Summary” which
summarizes key configuration characteristics across the environment.
• In most cases, there will also be a “Cluster Verification Utility (CVU) result” section detailing the findings from the
CVU utility.
• If there are any “Killed Processes” or “Skipped Checks” sections, both of which indicate issues with the exachk run,
they will be listed both in the table of contents and towards the end of the report. Review the "Troubleshooting
Exachk Report Results" section below.

For complete examples of Exachk report content, please refer to the “ORAchk and Exachk User's Guide”.

Exachk Health Check Catalog

The exachk Health Check Catalog ("Catalog") is an interactive html file that lists exachk checks as of the current release.
There is a User Guide accessible via the link to the immediate right of the top title once you open the Catalog in a browser.
The Catalog is cumulative and keyword searchable.

To see the list of Exadata checks:

1) Deselect "All" under the "Engineered Systems" pull down menu.
2) Select "Exadata" under the "Engineered Systems" pull down menu.
3) Leave all other pull down menus at the default of "All".

To see the list of "Critical" Exadata checks:

1) Deselect "All" under the "Engineered Systems" pull down menu.
2) Select "Exadata" under the "Engineered Systems" pull down menu.
3) Deselect "All" under the "Profiles" pull down menu.
4) Select "EXATIER1" under the "Profiles" pull down menu.
5) Leave all other pull down menus at the default of "All".

The Catalog may be accessed either directly in your browser or downloaded from the "Health Check Catalogs" section of:

Autonomous Health Framework (AHF) - Including TFA and ORAchk/EXAChk (Doc ID 2550798.1)

Troubleshooting Exachk Report Results

The three most common classes of problems are:

1. Killed processes or skipped checks due to timeouts
2. A false result where the summary check finding differs from the view detail data
3. Skipped checks due to “file not found”

Killed processes or skipped checks due to timeouts

The first thing to try is to extend the default timeout values and rerun Exachk:

# export RAT_PASSWORD_CHECK_TIMEOUT=40
# export RAT_TIMEOUT=270
# exachk

If the number of killed processes or skipped checks decreases, then extend the timeouts one more time and run Exachk
again. If there are still killed processes or skipped checks, then please file a service request and work with Oracle Support
to diagnose and correct the issue. Killed processes or skipped checks do not always indicate a problem with the Exachk
program or the individual checks. There may be an environment specific matter affecting a given check.
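
To avoid leaving the extended timeouts exported in the root shell, the two variables can be set for a single command only. The following is a minimal sketch; `with_timeouts` is a hypothetical wrapper name, and the only values taken from this note are 40 and 270.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with the two Exachk timeout
# variables set for that command only.
# $1 = RAT_PASSWORD_CHECK_TIMEOUT, $2 = RAT_TIMEOUT, rest = command.
with_timeouts() {
    pw_timeout=$1
    rat_timeout=$2
    shift 2
    RAT_PASSWORD_CHECK_TIMEOUT=$pw_timeout RAT_TIMEOUT=$rat_timeout "$@"
}
```

The first retry would be `with_timeouts 40 270 exachk`. This note does not give values for the second extension; doubling them (`with_timeouts 80 540 exachk`) is one possible choice and is an assumption here.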

A false result where the summary check finding differs from the view detail data

Typically, there are two cases that fall in this category.

1. The check view detail contradicts a summary result. For example, the summary data indicates FAIL, but the view
detail data clearly indicates a PASS condition.
2. There is missing data in the view detail data.

For both cases, the first thing to do is to try the manual commands shown in the report for the check (if provided) and
observe the results. If the manual results indicate the check should pass, you may safely ignore the summary finding for
the time being. If the manual results indicate the check should fail, then take the recommended corrective action.

However, for both of these types, please file a service request and work with Oracle Support to diagnose and correct the
issue. These classes of false results do not always indicate a problem with the Exachk program or the individual checks.
There may be an environment specific matter affecting a given check.

Skipped checks due to “file not found”

These may also be due to timeout issues. Extend the timeouts, as shown above, and retry Exachk. If the number of these
messages decreases, then extend the timeouts one more time and run Exachk again.

If there are still “file not found” messages, then please file a service request and work with Oracle Support to diagnose and
correct the issue. “file not found” messages do not always indicate a problem with the Exachk program or the individual
checks. There may be an environment specific matter affecting a given check.

Exachk support for Solaris on Exadata (Terminal Release):

Version Notes
Exadata Storage Server Software version 12.1.1.1.1 on Exadata hardware version X4 or lower is the terminal release for
Solaris on Exadata.

Exachk version 18.3.0 is the terminal release of Exachk for Solaris on Exadata, provided only for backwards compatibility.
No new Exachk development has occurred for Solaris on Exadata since the terminal versions, there are no bug fixes
provided, and support is limited to known workarounds.

Initial Deployment
Please refer to the documentation provided with the Exachk 18.3.0 kit.

Download
Download Exachk version 18.3.0 from here: Exachk version 18.3.0

REFERENCES

NOTE:757552.1 - Oracle Exadata Best Practices
