Oracle Healtcheck Service Example

Healthcheck report
for Oracle Database
Oracle Cloud Systems Hub

June 2021
Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

What is the Healthcheck service?
The Oracle Healthcheck Service offers a FREE
diagnostic report for your database environment.
With the metric data files that you provide, our Oracle Cloud
Systems Hub engineers can analyze the system
and database performance and overall health.
What data sources can I use for data collection?
AWR Miner StatsPack
AWR classic EXAchk

HTML report ORAchk
FOUR SIMPLE STEPS
STEP 4. SHOWCASE
We share thoughts on the analysis
results and plan together for the future.
As trusted advisors, we come with
many recommendations and proposals
STEP 3. ANALYSIS
Our Solution Engineer teams digest
the data and construct a presentation
for pointing out the problems and
solutions for your database systems
STEP 2. COLLECTION
You collect diagnostic information
from your systems via an SQL script
and pass it on for analysis
STEP 1. DISCOVERY
By engaging with our sales teams by
requesting this FREE service, you identify
which systems are viable for a check-up
What will I receive?
An analysis of the system will be provided in a presentation along with
relevant database best practice recommendations and if needed, a proposal
for system upgrade or refresh.
The only requirement is running an Oracle certified

SQL script on your database environment
which collects diagnostic data from the
database AWR repository, or provide
your own AWR reports.
No user or confidential data

will be queried.
AGENDA
System Overview Healthcheck Analysis Recommendations Solution proposal

A brief recap of the current The analysis of the diagnostic data Our advice on what can be improved A one-in all solution to all the
infrastructure, system characteristics provided, structured by main areas of or needs further investigation problems discovered before
and resources allocated utilization

SYSTEM OVERVIEW
CURRENT ENVIRONMENT:
Database Name Database Version
- 3 Exadata database machines: DBM01 12.1.0.2.0
HYPERNDB 12.1.0.2.0
IBPSFSS 12.1.0.2.0
SOA 12.1.0.2.0
- X4-2 Full Rack (with X6 and X7 upgrades) HYPEDEV 12.1.0.2.0
OBIEREPO 12.1.0.2.0
OFSAPROD 12.1.0.2.0
- X6-2 Half Rack with virtualization FBNOFSA 12.1.0.2.0
TESTEBS 12.1.0.2.0
PROD 12.1.0.2.0
FCCMUAT 12.1.0.2.0
- X7-2 Eight Rack for FCCMDB instance FBNSIT 12.1.0.2.0
MEDATDB 12.1.0.2.0
OFSADEV 12.1.0.2.0
EBSTST 12.1.0.2.0
- Database version for all instances is aligned at 12.1.0.2.0 OFSAUAT 12.1.0.2.0
OBIEPROD 12.1.0.2.0
DISC 12.1.0.2.0
IBPSPMH 12.1.0.2.0
- Out of all databases analyzed, only 3 have serious performance
AGENCYDB 12.1.0.2.0
issues:
DBM01, PROD, OBIEREPO
- The majority of instances have low resource consumption

SYSTEM OVERVIEW
MULTIPLE WAIT
EVENTS IDENTIFIED
Administrative wait events
regularly occurring
DATABASE VERSION
UP TO DATE
RDBMS Software is 19.0.0
No database upgrade is BACKUP STRATEGY TO
recommended.
BE REVISED
Most of the waits are related to
Backup activity.
HARDWARE UPGRADE CONSIDERED CPU USAGE IS

The system is clearly busy and during high
activity, the CPU is getting close to 100%. PEAKING AT 100%
Memory consumption is also high and CPU usage varies from to 6% at
during backup processing, the database is low to 100% high with 27%
facing real performance issues. average.
AREAS FOR POTENTIAL IMPROVEMENT

AGENDA
System Overview Healthcheck Analysis Recommendations Solution proposal

A brief recap of the current The analysis of the diagnostic data Our advice on what can be improved A one-in all solution to all the
infrastructure, system characteristics provided, structured by main areas of or needs further investigation problems discovered before
and resources allocated utilization

CPU UTILIZATION
100
90
80
73,4 73,4 73,4 73,4
70
0% - 5% Database CPU
60
49,5 49,5 49,5 49,5 49,5 49,5 Overall Host CPU utilization can be seen
50
on the graphic on the left side:
38,3
40
- 73% / 49% / 38% Peak CPU
30
25,2 24,5 24,5 24,5 24,9 24,9 24,9 24,9 24,9 24,9 24,9 - 20% - 25% Average CPU
20 - 2 – 6% Low CPU
10
5,7 5,7 5,7
3 3
1,7 1,7 1,7 1,7 1,7 1,7
0
serverq008 serverq008 serverp009 serverp011 serverp009 serverp005 serverq008 serveri013 serverp009 serverq003 serverq001
baseq044 baseq042 basep051 basep085 basep055 basep031 baseq046 basei112 basep053 baseq018 baseq001
LOW HIGH Average

MEMORY UTILIZATION
80
71 72
70 DATABASE MEMORY
66 67
64 Database Memory utilization varies between 16
and 72GB RAM as shown in the left side chart.
60 58
50
HOST MEMORY
Overall Host memory utilization
can be seen on the graphic below:
39 40
40 37
34 35 OS and
32 32 Memory processes,
Free; 11% 15%
30
20 17 18 17
16
13
11
9 8 8 8
10
5 5 5 5 6 5 6 6
3 2 Memory used -
DB, 79%
0
SGA PGA Total memory used

DATABASE SPACE UTILIZATION
1200
Database Name Database Size (GB)

1000 971,38
baseq044 9318,57 910,1
862,11
baseq042 1631,01
810,51
basep051 1674,02 800 756,31
basep085 4689,28
basep055 4572,36 618,15

600 559,46
basep031 6929,91
515,66
baseq046 4371,64
basei112 9065,67 400

336,66 333,09
basep053 9418,85
241,13
baseq018 2866,78 206,76
200
baseq001 7527,19
62065,26GB
*for the databases analyzed – not total on the machine

STORAGE UTILIZATION
FRA free DBFS allocated
FRA used 4% 1%
5%
1000,00
943,37
Database free
900,00 26%
Database used
800,00 64%
700,00
600,00
500,00 471,68
Exadata Storage TB
Total raw capacity 943.37
400,00
Cell Failure Coverage* Disk
314,83
Available raw capacity 943.37
300,00
Available space after redundancy 471.68
Database used 314.83
200,00
Database free 129.57
129,57
FRA used 23.46
100,00
FRA free 22.68
23,46 22,68
3,82 DBFS allocated 3.82
0,00
Available raw Available space Database used Database free FRA used FRA free DBFS allocated
capacity after redundancy

AVERAGE IO UTILIZATION
5145,1
319,3 baseq001 READ IOPS
1062,5 135,7 baseq018
3934 384,2 basep053 Unused; 273.580
3981,5 383,6 basei112
5452,5 290,7 baseq046
Write Iops Max 855,9 Write MB Max 53,8 basep031
2900,4 180,1 basep055
2167,8 232,1 basep085
1917 61,7 basep051
1489,2 45,9 Consumed; 17.920
baseq042
6390,2 322,4 baseq044
4489,2 3469,1
3303,1 349
1147,8 WRITE IOPS
4585,8
24798,5 1490,4
10512 1113,8
Unused; 3.814
Read Iops Max 3632,3 Read MB Max 304,4
10987,2 2961,3
6631,7 864,9
13743,7 410,3
12636,7 302,9
7500,4 1163,5
Consumed; 5.816
0 5000 10000 15000 20000 25000 30000 0 1000 2000 3000 4000

PEAK IO UTILIZATION
baseq001
48909,2 319,3
baseq018
11666,1 135,7
basep053
9128,7 384,2
basei112
8988,9 383,6
baseq046
18372,8 290,7
Write Iops Max Write MB Max basep031
3017,8 53,8
15912,2 basep055
180,1
8072,9 232,1 basep085
6887 61,7 basep051
5129,3 45,9 baseq042
13672,4 322,4 baseq044
42265,8 3469,1
26094,3 349
29538,3 1147,8
134252,8 1490,4
150128,6 1113,8
Read Iops Max 22837,5 Read MB Max 304,4
117062,1 2961,3
22701,9 864,9
81382,3 410,3
87291
302,9
25070,3
1163,5
0 50000 100000 150000 200000 0 500 1000 1500 2000 2500 3000 3500 4000

DATABASE WAIT EVENTS COMPARED
4
3,61
3,5
Concurrency: A high number of wait events can be seen here
due to sessions being blocked, while waiting on the same
3 resource.
2,69
CPU: DB CPU wait events are not particularly indicating a
2,51 performance degradation. Having a high percentage of DB CPU
2,5
means that the database is processing database user-level calls.
User IO: wait events mainly consist of events like “dbfile

2
sequential read” and “dbfile scattered read” which translates to
index usage and full table scans. It is advised to make sure that
common queries have their execution plans optimized, in order
1,43
1,5 1,36 to reduce IO operations and maintain database performance. In
1,26 this instance’s case, User IO wait time is very low.
1,12
1 Other: These wait events resulted from unusual errors and

0,73 0,72 internal database engine processes being delayed. Always
0,62 check alert log for internal errors and abnormal behavior after
0,53
0,5
0,47 0,44 having performance problems.
0,26 0,29 0,25
0,190,22
0,07 0,08 0,06 0,04 0,07 0,06
0,01 0,01 0 0,01
0
DWH14OO EPOR14OO CONDB19C TOPA11OO
Sys I/O Other Applic Commit Concurcy Config Cluster

DATABASE WAIT EVENT SUBCATEGORIES
60
50 48,7
BIGGEST ISSUES DETECTED:
40
19% backup contention
30 18% slow storage IO
20 19 18,59
6% direct path reads
10
5,27
3,02
⇒ 43%
1,01 1,05
Total performance degradation due to storage
0
Backup: MML DB CPU db file sequential direct path read imm op io done resmgr: cpu
Write Backup read quantum
piece

WAIT EVENTS – DB CPU AND USER IO
70
64 The following graphic represents the percentage of DB Time used on DB CPU

and User IO.
60
Having a high percentage of DB CPU means that the database is processing
55 55 database user-level calls and it is a sign of healthy database activity.
50
50 49
47
46 User IO is mainly consisting of the bellow events:
43
42 • “direct path read”, “direct path write” , “ direct path read temp”
40 40
https://docs.oracle.com/database/121/TGDBA/pfgrf_instance_tune.htm#TGDBA
40 38
37 94485
34 34
32 • “direct path write temp” waits are typically caused by sort operations that
cannot be completed in memory, spilling over to disk.
30 29 https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=132051023
27 512846&parent=EXTERNAL_SEARCH&sourceId=PROBLEM&id=1576956.1&_afrWind
26
25 owMode=0&_adf.ctrl-state=1brk1dwyfh_4
23
• “db file sequential read” - Is one of the most common wait event.
20 https://docs.oracle.com/database/121/TGDBA/pfgrf_instance_tune.htm#TGDBA
16 94485
• “db file scattered read” A scattered read is usually a multiblock read. It can
10
occur for a fast full scan (of an index) in addition to a full table scan.
https://docs.oracle.com/database/121/TGDBA/pfgrf_instance_tune.htm#TGDBA
94479
0
DB CPU % User IO%

WAIT EVENTS – CONCURRENCY/APPLICATION
50 49
45 The following graphic shows the concurrency and application waits on each of the
most used databases in the analysis.
40 These wait events are caused either by sessions running user application code (
for example row level locking or explicit lock commands ) or waits for internal
database resources ( library cache locks or latches ).
35
Application is mainly consisting of the bellow events:

30 • enq: TM – contention” - indicates the time spent waiting for a TM lock that is used to
coordinate activities for many base table / partition operations. To resolve this type
of event, you can follow the recommendation from the My Oracle Support Doc ID
25 1905174.1
• “enq: TX – row lock contention” - Reducing this type of waits typically involves
20 altering application code to avoid the contention scenario/s. . To resolve this type of
event, you can follow the recommendation from the My Oracle Support Doc ID
15 15 1966048.1
15
11 Concurrency is mainly consisting of the bellow events:

10 9 • “library cache lock” - To optimize the functioning of the database and diminish the
7 7 7 library cache lock waits you can follow the recommendation from the My Oracle
Support Doc ID 34578.1
5 3 3
2 • “cursor: pin S wait on X” - To optimize the functioning of the database and diminish
1 1 1 1 1
0 0 0 0 0 0 the cursor: pin S wait on X waits you can follow the recommendation from the My
0 Oracle Support Doc ID 1298015.1
Concurrency Wait % Application Wait %

WAIT EVENTS – NETWORK/OTHER
20
18
18
16
14 Network: Waits related to network messaging. It is advised to investigate

13 database links before looking into the network first. Usually, network
waits are related to the time it takes to pack messages for the network
12 before they are sent. 95% of the cases it is a database link issue.
10 Configuration: Waits caused by inadequate configuration of database or

instance resources. Database initialization parameters must be sized
appropriately for the resources allocated to the server. It is advised to
8
check for undersized init parameters and use advisory statistics for
guidance.
6
Other: Wait events resulted from unusual errors. Always check alert log
4 4 4
4 for internal errors and abnormal behavior.
3
2
2
1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0
0
baseq044 baseq042 basep051 basep085 basep055 basep031 baseq046 basei112 baseq018
Network % Configuration % Other %

MAIN ACTIVITY VARIABLES
Executions LogonsTotal
12000 600
10000 500
8000 400
6000 300
9627 499
4000 200 406 367 364
6885 342
285 243 231 285
2000 100 216 185 216
2041 1580 1550
0 307 623 365 190 767 1826 257 0
Redo MBs Commits

140 300
120 250
100
200
80
150
60 109 118 248
40 85 100
55 65
20 50
26 16
0 5,3 11 5,4 5,5 8 0 4,5 4,6 4,5 4,6 4,8 4,6 4,5 4,5 4,8 4,6 4,5

AGENDA
Recommendations
Our advice on what can be improved
or needs further investigation

NOTABLE EVENTS
Recurring Network events Backup Restore throttle sleep
consist of 7% activity on and latch: redo allocation +
instance basep055
concurrency events on instance
basep085 – 20% of total
Constant high commit rate of 21% activity
total on instance basep053
Significant application
Spikes – 25% on instance baseq001
Concurrency (pin S wait on X)
on basei112 – 5% total
database activity
It is recommended to investigate the issues

mentioned above to prevent further database
performance degradation

MEMORY ADVISOR RECOMMENDATIONS
+32% +45% +61% +17%
Consider increasing SGA with Consider increasing SGA with ~4GB on Consider increasing SGA with
~10GB on basep051 for a 32% basep085 and PGA with ~0.8GB for a Consider increasing PGA with
10% and 35% Physical Reads and MB
~10GB on baseq046 for a 60- ~1.1GB on baseq044 for a 17%
decrease in Physical Reads.
Read/Written to disk. 61% decrease in Physical Reads. increase in MB R/W performance.

EXACHK OBSERVATIONS
Status: CRITICAL | Type: Database Check
Message Status On Solution
Database parameter CLUSTER_INTERCONNECTS should be a
Database parameter CLUSTER_INTERCONNECTS is not set to colon delimited string of the IP addresses returned from
od01db01:captdb, od01db02:DEV2, od01db02:captdb
the recommended value sbin/ifconfig for each cluster_interconnect interface returned by
oifcfg.
One or more disk groups which contain critical files do not use
All Databases Solution not yet available...
high redundancy
Place the control file in DATA because both DATA and RECO are
Database control files are not configured as recommended All Databases
high redundancy disk groups
Status: CRITICAL | Type: OS Check
Message Status On Solution

System is exposed to Exadata Critical Issue EX44 All Database Servers Solution not yet available...
System is exposed to Exadata Critical Issue DB41 All Database Servers Solution not yet available...
One or more InfiniBand network parameters on Database
All Database Servers Solution not yet available...
Servers are not as expected
TCP Selective Acknowledgment is not enabled od01db01 Solution not yet available...
If the value is incorrect in /etc/sysctl.conf and matches the value

The vm.min_free_kbytes configuration is not set as in active memory, edit the /etc/sysctl.conf file to include the line
All Database Servers
recommended 'vm.min_free_kbytes = 9correct_value_for_your_configuration:'
and reboot the database server.

SUMMARY OF FINDINGS
Findings No of events Exadata’s proposed solution
High CPU Utilization 3 IORM, RDMA, PMEM, RoCE, Smart Flash Cache, Exadata Smart
Scan
Active Sessions Surge 5
Waits on Network 3
Exafusion , Smart Fusion Block Transfer, RDMA, PMEM, RoCE
Waits on Cluster 7
Waits on Commit 1
RDMA, PMEM, RoCE
Increased Redo Generation Rate 4
Increased Physical Reads 6

Exadata Smart Scan, Storage Indexes, HCC, In-Memory
Columnar, Smart Flash Cache
Increased Waits on I/O 1
Configuration Problems 2 Exadata Automatic Indexing
Exadata Smart Scan, Storage Indexes, HCC, In-Memory

Other wait events 4
Columnar Cache
Concurrency Issues 14 Automatic Diagpack Upload for Oracle ASR
Application generated waits 8 Faster Performance for Large Analytic Queries and Large Loads

AGENDA
Solution proposal
A one-in all solution to all the
problems discovered before

PROPOSED SOLUTION – ORACLE EXADATA
Current Infrastructure Proposed Solution

95% PEAK CPU 50%
85% MEMORY UTILIZATION 35%
75% IO UTILIZATION 20%
50% NETWORK ACTIVITY 15%
20% SYSTEM HEALTH 85%

Oracle Healtcheck Service Example

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Oracle Healtcheck Service Example

Uploaded by

Copyright:

Available Formats

Healthcheck report

for Oracle Database

Oracle Cloud Systems Hub

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

What data sources can I use for data collection?

AWR Miner StatsPack

AWR classic EXAchk

The only requirement is running an Oracle certified

No user or confidential data

System Overview Healthcheck Analysis Recommendations Solution proposal

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

- The majority of instances have low resource consumption

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

HARDWARE UPGRADE CONSIDERED CPU USAGE IS

AREAS FOR POTENTIAL IMPROVEMENT

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

System Overview Healthcheck Analysis Recommendations Solution proposal

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

LOW HIGH Average

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

SGA PGA Total memory used

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

Database Name Database Size (GB)

basep055 4572,36 618,15

basei112 9065,67 400

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

User IO: wait events mainly consist of events like “dbfile

1 Other: These wait events resulted from unusual errors and

Sys I/O Other Applic Commit Concurcy Config Cluster

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

64 The following graphic represents the percentage of DB Time used on DB CPU

DB CPU % User IO%

Application is mainly consisting of the bellow events:

11 Concurrency is mainly consisting of the bellow events:

Concurrency Wait % Application Wait %

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

14 Network: Waits related to network messaging. It is advised to investigate

10 Configuration: Waits caused by inadequate configuration of database or

Network % Configuration % Other %

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

Redo MBs Commits

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

It is recommended to investigate the issues

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

+32% +45% +61% +17%

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

Status: CRITICAL | Type: OS Check

Message Status On Solution

If the value is incorrect in /etc/sysctl.conf and matches the value

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

Increased Physical Reads 6

Configuration Problems 2 Exadata Automatic Indexing

Exadata Smart Scan, Storage Indexes, HCC, In-Memory

Concurrency Issues 14 Automatic Diagpack Upload for Oracle ASR

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

Copyright © 2021, Oracle and/or its affiliates. All rights reserved.

Current Infrastructure Proposed Solution

85% MEMORY UTILIZATION 35%