
1. Explain the advantages of using ASM?

I. High availability with reduced downtime: ASM provides hot-swappable disks. You can add or
remove disks from a disk group while a database is online. When you add or remove disks from a
disk group, ASM automatically redistributes the file contents and eliminates the need for downtime.

II. Data redundancy: ASM provides flexible server-based mirroring options. The ASM normal and high
redundancy disk groups enable two-way and three-way mirroring respectively. Alternatively, with external
redundancy you can rely on a RAID storage subsystem to perform the mirroring.

III. Reduced administrative overhead: ASM accomplishes this by consolidating data storage into a
small number of disk groups. This enables you to consolidate the storage for multiple databases and
provides improved I/O performance.

IV. Load balancing: ASM provides dynamic parallel load balancing, which helps prevent hot spots. I/O is
spread evenly across the available disks.

V. Compatibility with other storage systems: ASM can coexist with other storage management options
such as raw disks and third-party file systems. This feature simplifies integrating ASM into pre-existing
environments.

VI. ASM uses Oracle Managed Files (OMF) for simplified database file management.

VII. It has easy-to-use management interfaces such as SQL*Plus, the ASMCMD command-line interface,
and Oracle Enterprise Manager (EM). Using the OEM wizard you can easily migrate non-ASM database
files to ASM.

VIII. In an ASM instance, you can store database data files, redo log files, archive log files, backup files,
Data Pump dump files, change-tracking files, and control files of one or more Oracle databases. Another
feature of ASM is the elimination of fragmentation.
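The online add/remove capability in point I can be sketched with SQL like the following (the disk group name and device path are hypothetical):

```sql
-- Add a disk to an existing disk group while databases stay online;
-- ASM automatically rebalances extents onto the new disk.
ALTER DISKGROUP data ADD DISK '/devices/diskc1' NAME data_d3;

-- Watch the resulting rebalance until no rows are returned
SELECT operation, state, power, sofar, est_work
  FROM v$asm_operation;
```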

2. Explain ASM architecture?


Oracle ASM architecture has the following components
A. Oracle ASM Instance
B. Oracle ASM Disk Groups
C. Oracle ASM Failure Groups
D. Oracle ASM Disks
E. Oracle ASM Files

A. Oracle ASM Instance: An Oracle ASM instance has a System Global Area (SGA) and background
processes that are similar to those of Oracle Database, but the Oracle ASM SGA is much smaller than a
database SGA, and Oracle ASM has a minimal performance effect on a server. An Oracle ASM instance
mounts ASM disk groups to make Oracle ASM files available to database instances. Oracle ASM and
database instances require shared access to the disks in a disk group. Oracle ASM instances manage the
metadata of the disk groups and provide file layout information to the database instances.
The Oracle ASM metadata includes the following information:
1. The disks that belong to a disk group.
2. The amount of space that is available in a disk group.
3. The filenames of the files in a disk group.
4. The location of disk group data file extents.
5. A redo log that records information about atomically changing metadata blocks.
6. Oracle ADVM volume information

There is one Oracle ASM instance for each cluster node. If there are several database instances for
different databases on the same node, then the database instances share the same single Oracle ASM
instance on that node. If the Oracle ASM instance on a node fails, then all of the database instances on that
node also fail.
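The disk group metadata listed above (membership, free space) is exposed through dynamic views on the ASM instance; a sketch:

```sql
-- From the ASM instance: which disks belong to which disk group,
-- and how much space each group has free (metadata items 1-2 above)
SELECT g.name AS diskgroup, d.name AS disk, g.free_mb
  FROM v$asm_diskgroup g
  JOIN v$asm_disk d ON d.group_number = g.group_number;
```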

B. Oracle ASM Disk Groups: A disk group consists of multiple ASM disks and contains the metadata
that is required for the management of space in the disk group. The Disk group components are ASM
disks, ASM files, and allocation units. Any Oracle ASM file is completely contained within a single disk
group. However, a disk group might contain files belonging to several databases and a single database can
use files from multiple disk groups.
Mirroring protects data integrity by storing copies of data on multiple disks. When you create a disk group,
you specify an Oracle ASM disk group type based on one of the following three redundancy levels:

External redundancy: Oracle ASM does not provide mirroring and relies on the storage subsystem (for
example, a RAID array) to provide redundancy.

Normal redundancy: Oracle ASM provides two-way mirroring by default, which means that all files are
mirrored so that there are two copies of every extent. A loss of one Oracle ASM disk is tolerated.

High redundancy: Oracle ASM provides triple mirroring by default, which means that all files are mirrored
so that there are three copies of every extent. A loss of two Oracle ASM disks in different failure groups is
tolerated.

Normal redundancy disk groups require at least two failure groups. High redundancy disk groups require at
least three failure groups. Disk groups with external redundancy do not use failure groups.

C. Oracle ASM Failure Groups: A failure group is a subset of the disks in an ASM disk group. Failure
groups are used to store mirror copies of data. When Oracle ASM allocates an extent for a normal
redundancy file, Oracle ASM allocates a primary copy and a secondary copy. Oracle ASM chooses the
disk on which to store the secondary copy so that it is in a different failure group than the primary copy.
There are always failure groups, even if they are not explicitly created. If you do not specify a failure group
for a disk, then Oracle automatically creates a new failure group containing just that disk, except for disk
groups containing disks on Oracle Exadata cells.
D. Oracle ASM Disks: Oracle ASM disks are the storage devices that are provisioned to Oracle ASM disk
groups. Oracle ASM spreads the files proportionally across all of the disks in the disk group. This
allocation pattern maintains every disk at the same capacity level and ensures that all of the disks in a disk
group have the same I/O load.
An Oracle ASM file consists of one or more file extents. A file extent consists of one or more allocation
units, so every Oracle ASM disk is divided into allocation units (AUs) within the ASM disk group.
When you create a disk group, you can set the Oracle ASM allocation unit size with the AU_SIZE disk
group attribute. The values can be 1, 2, 4, 8, 16, 32, or 64 MB, depending on the specific disk group
compatibility level.
E. Oracle ASM Files: Files that are stored in Oracle ASM disk groups are called Oracle ASM files. Each
Oracle ASM file is contained within a single Oracle ASM disk group. Oracle Database communicates with
Oracle ASM in terms of files.
The following are the files which are stored in ASM Disk groups
1. Control files
2. Data files, temporary data files, and data file copies
3. SPFILEs
4. Online redo logs, archive logs, and Flashback logs
5. RMAN backups
6. Disaster recovery configurations
7. Change tracking bitmaps
8. Data Pump dumpsets
Oracle ASM automatically generates Oracle ASM file names as part of file creation and tablespace
creation. Oracle ASM file names begin with a plus sign (+) followed by a disk group name.

The contents of Oracle ASM files are stored in a disk group as a set, or collection of extents that are stored
on individual disks within disk groups. Each extent resides on an individual disk. Extents consist of one or
more allocation units (AU).

Extent size always equals the disk group AU size for the first 20,000 extent sets (0 – 19,999).
Extent size equals 4 * AU size for the next 20,000 extent sets (20,000 – 39,999).
Extent size equals 16 * AU size for extent sets 40,000 and higher.

Background Processes:
1. ARBx: These are the slave processes that do the rebalance activity.
2. RBAL: This opens all device files as part of discovery and coordinates the rebalance activity for disk
groups.
3. ASMB: At database instance startup, ASMB connects as a foreground process to the ASM instance.
All communication between the database and ASM instances is performed via this bridge. This includes
physical file changes such as data file creation and deletion. Over this connection, periodic messages are
exchanged to update statistics and to verify that both instances are healthy.
4. SMON: This process is the system monitor and also acts as a liaison to the Cluster Synchronization
Services (CSS) process (in Oracle Clusterware) for node monitoring.
5. PSP0: This process spawner is responsible for creating and managing other Oracle processes.

3. How does ASM mirroring work?


ASM mirrors data at the extent level: there is a primary copy on one disk and a mirror copy on
another disk. Mirroring protects data integrity by storing copies of data on multiple
disks. Failure groups are used to place the mirrored copies so that each copy is on
a disk in a different failure group.

The following are the three redundancy levels which we can define while creating the
disk group.

Normal for 2-way mirroring. Data written to one failure group is mirrored to another
failure group, so normal redundancy disk groups require at least two failure groups.

High for 3-way mirroring. Data written to one failure group is mirrored to two other
failure groups. High redundancy disk groups require at least three failure groups.

External to not use Oracle ASM mirroring. Useful when the disk group contains RAID
devices; no FAILGROUP clause is needed.

Example:
SQL> CREATE DISKGROUP diskgr1 NORMAL REDUNDANCY
  2  FAILGROUP controller1 DISK
  3  '/devices/diska1' NAME dasm_d1
  4  FAILGROUP controller2 DISK
  5  '/devices/diskb1' NAME dasm_d2
  6  ATTRIBUTE 'au_size'='4M';

4. Explain different RAID levels?

What is RAID?
RAID (redundant array of independent disks) is a setup consisting of multiple disks
for data storage. They are linked together to prevent data loss and/or speed up
performance. Having multiple disks allows the employment of various techniques
like disk striping, disk mirroring, and parity.

RAID 0: Striping
RAID 0, also known as a striped set or a striped volume, requires a minimum of two
disks. The disks are merged into a single large volume where data is stored evenly
across the number of disks in the array.

This process is called disk striping and involves splitting data into blocks and writing
it simultaneously/sequentially on multiple disks. Configuring the striped disks as a
single partition increases performance since multiple disks do reading and writing
operations simultaneously. Therefore, RAID 0 is generally implemented to improve
speed and efficiency.

Advantages of RAID 0
 Cost-efficient and straightforward to implement.
 Increased read and write performance.
 No overhead (total capacity use).

Disadvantages of RAID 0
 Doesn't provide fault tolerance or redundancy.

When Raid 0 Should Be Used


RAID 0 is used when performance is a priority and reliability is not. If you want to
utilize your drives to the fullest and don't mind losing data, opt for RAID 0.

RAID 1: Mirroring
RAID 1 is an array consisting of at least two disks where the same data is stored on
each to ensure redundancy. The most common use of RAID 1 is setting up a
mirrored pair consisting of two disks in which the contents of the first disk are
mirrored on the second. This is why such a configuration is also called mirroring.

Unlike RAID 0, where the focus is solely on speed and performance, the primary
goal of RAID 1 is to provide redundancy. It minimizes the risk of data loss and
downtime because a failed drive can be replaced by its replica.

Advantages of RAID 1
 Increased read performance.
 Provides redundancy and fault tolerance.
 Simple to configure and easy to use.
Disadvantages of RAID 1
 Uses only half of the storage capacity.
 More expensive (needs twice as many drives).
 May require powering down the computer to replace a failed drive.

When Raid 1 Should Be Used


RAID 1 is used for mission-critical storage that requires a minimal risk of data loss.
Accounting systems often opt for RAID 1 as they deal with critical data and require
high reliability.

It is also suitable for smaller servers with only two disks, as well as if you are
searching for a simple configuration you can easily set up (even at home).

RAID 0 – provides striping capability.
RAID 1 – provides mirroring capability.
RAID 0+1 – first striped, then mirrored.
RAID 1+0 – first mirrored, then striped – most commonly used for Oracle databases.
RAID 5 – striping with single distributed parity.
RAID 6 – striping with double distributed parity.

5. How does ASM interact with the database?


The first time an RDBMS instance tries to access an ASM file, it needs to establish a
connection to the local ASM instance.
The ASMB process contacts CSS using the disk group name and gets the connection string.
Using that connection string, a bequeath connection is established between the ASM and
RDBMS instances.
The RDBMS authenticates itself to the ASM instance via operating system (OS)
authentication by connecting as SYSDBA. This initial connection between the ASM
instance and the RDBMS instance is known as the umbilicus, and it remains active as long
as the RDBMS instance has any ASM files open.
ASMB is the RDBMS-side process and UFG (umbilicus foreground process) is the ASM-side
process. Both communicate through this umbilicus.

When the database opens a data file, ASM ships the file's extent map to the RDBMS
instance, where it is stored in the SGA.
Using that extent map, the RDBMS can do I/O on the ASM files directly, without going
through the ASM instance.

6. Can I keep disks of different sizes in a disk group?


Though we can keep different-sized disks in a disk group, the extent distribution will be
unbalanced, which induces unbalanced I/O across the disk group.
Oracle always recommends using disks of the same size in a disk group.
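Whether disks in a group are the same size can be checked directly; a sketch:

```sql
-- Compare disk sizes within each disk group; unequal TOTAL_MB values
-- indicate the imbalance described above.
SELECT group_number, name, total_mb, free_mb
  FROM v$asm_disk
 ORDER BY group_number, name;
```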

7. What is the ASM power limit?


The ASM_POWER_LIMIT parameter controls the throughput and speed of the rebalance
operation.
In 12c the value varies from 1 to 1024; a value of 0 disables the automatic rebalance
operation.
If ASM_POWER_LIMIT is 10, then 10 ARBn processes will be created to do the rebalance
job.
Please note that a higher value of ASM_POWER_LIMIT can cause more CPU load, so always
use a balanced value and try to run rebalances during non-peak hours.
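A quick sketch of setting the power limit (the disk group name and the values are assumptions):

```sql
-- Instance-wide default power for future rebalance operations
ALTER SYSTEM SET asm_power_limit = 4;

-- Power can also be given per operation, e.g. when adding a disk
ALTER DISKGROUP data ADD DISK '/devices/diskc1' REBALANCE POWER 8;
```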

8. What are some important ASM background processes?


RBAL – opens all the devices upon discovery and coordinates the rebalance operation.
ARBn – the slave processes that do the rebalance operation.
DBWR – writes ASM instance metadata changes in the SGA buffer to disk.
LGWR – manages the active change directory (ACD) and flushes the ACD change records to
disk.
PMON – manages processes and process death.
GMON – manages disk-level activities like disk drop/add/offline/online.
MARK – Mark Allocation unit, coordinates the updates to the staleness registry.
SMON – the system monitor; also acts as a liaison to the Cluster Synchronization Services
(CSS) process for node monitoring.
PING – monitors network latency.

9. What are the parameters of an ASM instance?


INSTANCE_TYPE (set to ASM)
ASM_DISKSTRING
ASM_POWER_LIMIT
ASM_DISKGROUPS
INSTANCE_NAME
LARGE_POOL_SIZE
PROCESSES
 
10. There is a CONTROL_FILES parameter for the ASM instance. If I delete
that, what will happen?
In some versions of ASM there is a CONTROL_FILES parameter for the ASM instance. It is a
dummy parameter and is of no use.
In the latest version (19c), this parameter is no longer present.

11. How can I improve the I/O performance between the database
instance and the ASM instance?

The ASM instance is not in the I/O path. The database performs I/O directly against the
ASM disks; it does not go through the ASM instance. In other words, there is no relation
between the database instance and the ASM instance in terms of I/O.
The database instance only gets the extent map information (ASM metadata) from the
ASM instance.

12. What is the minimum number of ASM disk groups we should
have?

Two disk groups – typically one for database files and one for the fast recovery area.

13. What are the different header statuses of ASM disks?


CANDIDATE – the disk is available and can be added to a disk group (usually seen on
Solaris SPARC, HP-UX).
PROVISIONED – same as a candidate disk, but the disk was provisioned using ASMLIB
(usually seen on Linux).
MEMBER – the disk is already part of a disk group.
FORMER – the disk was formerly part of a disk group but is currently not part of any
disk group.
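The header status can be checked directly; a sketch:

```sql
-- List disk paths with their header status; CANDIDATE/PROVISIONED
-- disks are the ones eligible to be added to a disk group.
SELECT path, header_status, mode_status, mount_status
  FROM v$asm_disk
 ORDER BY path;
```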

14. Can we add a MEMBER disk to an ASM disk group?

No. A MEMBER disk already belongs to a disk group, so adding it to another disk group
fails. It can only be forced in with the FORCE option, which overwrites the existing disk
header and should be used with extreme caution.

15. Explain the different redundancy types of ASM disk groups.


The redundancy levels are:
 EXTERNAL redundancy – Oracle ASM does not provide mirroring redundancy
and relies on the storage system to provide RAID functionality. Any write
error causes a forced dismount of the disk group. All disks must be located
to successfully mount the disk group.
 NORMAL redundancy – Oracle ASM provides two-way mirroring by default,
which means that all files are mirrored so that there are two copies of every
extent. A loss of one Oracle ASM disk is tolerated. You can optionally choose
three-way or unprotected mirroring. A file specified with HIGH redundancy
(three-way mirroring) in a NORMAL redundancy disk group provides additional
protection from a bad disk sector on one disk plus the failure of another disk.
However, this scenario does not protect against the failure of two disks.
 HIGH redundancy – Oracle ASM provides three-way (triple) mirroring by
default. A loss of two Oracle ASM disks in different failure groups is tolerated.
16. What is multipathing? How does ASM work with multipathing?
Generally an I/O path consists of components like the LUN, adapter, cable, switches, etc.,
configured between the storage and the server.
In a high-availability environment we use multiple components, which gives us multiple I/O
path options between storage and server; this helps with load balancing and failover in
case any path component fails.
Different vendors provide multipathing solutions, for example EMC PowerPath (emcpower).
Let's say the first path to a disk is /dev/rdsk/c12900c4
and the second path to the disk is /dev/rdsk/c1300c4.
EMC will then create a pseudo device named /dev/rdsk/emcpower1.
ASM doesn't provide multipathing on its own, but we can incorporate a third-party solution
with ASM.
So we can use ASMLIB to make /dev/rdsk/emcpower1 an ASM disk and use it.
For example, if you are configuring ASMLIB with EMC’s PowerPath, you can use
the following setup:
ORACLEASM_SCANEXCLUDE=sd
ORACLEASM_SCANORDER=emcpower
ORACLEASM_SCANORDER dictates the ASMLIB scan order of the devices, using the
generic prefix.
ORACLEASM_SCANEXCLUDE indicates which devices should not be discovered by
ASMLIB.
 

17. What is disk_repair_time in asm?


DISK_REPAIR_TIME
This attribute specifies the time interval to repair a disk and bring it back online
before initiating the drop operation. This attribute can only be set when altering a
disk group and is only applicable to normal and high redundancy disk groups.

Fast Mirror resync in ASM 11g new feature


In Oracle 10g, if a disk of a disk group became unavailable even for a short period of time, due to a cable
disconnect or any other reason, the server removed the disk from the disk group instead of waiting for the
disk to become available again, and ASM started a rebalance operation for the disk group. After re-adding
the disk, another rebalance was needed. To avoid this extra I/O, 11g introduced Fast Mirror
Resync in ASM.
With 11g Fast Mirror Resync, if the disk becomes available within the time limit set by the
DISK_REPAIR_TIME attribute, the resync operation copies only the extents modified while the disk was
temporarily unavailable.

The default of DISK_REPAIR_TIME is 3.6 hours.

To use this feature, both compatibility attributes must be at least 11.1:

ALTER DISKGROUP data SET ATTRIBUTE 'compatible.asm'='11.1';


ALTER DISKGROUP data SET ATTRIBUTE 'compatible.rdbms'='11.1';

The disk group attribute DISK_REPAIR_TIME determines how long ASM waits before a disk is
permanently dropped from an ASM disk group after it was taken offline for whatever reason. The default
of DISK_REPAIR_TIME is 3.6 hours (216 minutes).

To disable this 11g feature:


Set the DISK_REPAIR_TIME attribute to the value 0.
To change the attribute from the default 3.6 hours to 24 hours:

alter diskgroup data set attribute 'disk_repair_time'='24h';

 
Drop a disk before the repair time expires:

ALTER DISKGROUP data OFFLINE DISK disk1_001 DROP AFTER 0m;


Check the status of the ASM disk group disks and the repair time value:

-- Online the disk


ALTER DISKGROUP data ONLINE DISK disk1_001;

select * from v$asm_disk_stat;

-- check compatibility
select * from v$asm_attribute;

col name format a8
col header_status format a7
set lines 2000
col path format a10
select name,path,state,header_status,repair_timer,mode_status,mount_status
  from v$asm_disk;

NAME     PATH       STATE    HEADER_  REPAIR_TIMER  MODE_ST  MOUNT_S
-------- ---------- -------- -------  ------------  -------  -------
DATA1    ORCL:DATA1 NORMAL   MEMBER              0  ONLINE   CACHED
DATA2    ORCL:DATA2 NORMAL   MEMBER              0  ONLINE   CACHED
DATA3    ORCL:DATA3 NORMAL   MEMBER              0  ONLINE   CACHED
DATA4               NORMAL   UNKNOWN          1200  OFFLINE  MISSING

Here we see the value "1200" under the REPAIR_TIMER column; this is the time in seconds after which
the disk will be dropped automatically. This time is calculated using the value of a disk group attribute
called DISK_REPAIR_TIME, discussed below.

In 10g, if a disk goes missing, it would immediately get dropped and REBALANCE operation would
kick in immediately whereby ASM would start redistributing the ASM extents across the available
disks in ASM diskgroup to restore the redundancy.

DISK_REPAIR_TIME
Starting with 11g, Oracle provides a disk group attribute called DISK_REPAIR_TIME, with a
default value of 3.6 hours. This means that if a disk goes missing, it is not dropped
immediately; ASM waits for the disk to come back online or be replaced. This feature
helps in scenarios where a disk is plugged out accidentally, or a storage server/SAN gets
disconnected or rebooted, leaving an ASM disk group without one or more disks. While the
disk(s) remain unavailable, ASM keeps track of the extents that would have been written to
the missing disks and starts writing them as soon as the missing disk(s) come back online
(this is the fast mirror resync feature). If the disk(s) do not come back online within the
DISK_REPAIR_TIME threshold, they are dropped and a rebalance starts.

FAILGROUP_REPAIR_TIME

Starting with 12c, another attribute can be set for the disk group: FAILGROUP_REPAIR_TIME,
with a default value of 24 hours. It is similar to DISK_REPAIR_TIME but applies to the whole
failure group. In Exadata, all disks belonging to a storage server can belong to one failure
group (to avoid a mirror copy of an extent being written to a disk on the same storage
server), and this attribute is quite handy in an Exadata environment when a complete
storage server is taken down for maintenance or some other reason.

In the following we can see how to set values for the diskgroup attributes explained above.

SQL> col name format a30

SQL> select name,value from v$asm_attribute where group_number=3 and name like
'%repair_time%';

NAME                           VALUE

------------------------------ --------------------

disk_repair_time               3.6h

failgroup_repair_time          24.0h

SQL> alter diskgroup data set attribute 'disk_repair_time'='1h';

Diskgroup altered.

SQL>  alter diskgroup data set attribute  'failgroup_repair_time'='10h';


Diskgroup altered.

SQL> select name,value from v$asm_attribute where group_number=3 and name like
'%repair_time%';

NAME                           VALUE

------------------------------ --------------------

disk_repair_time               1h

failgroup_repair_time          10h

ORA-15042

If a disk is offline/missing from an ASM diskgroup, ASM may not mount the diskgroup automatically
during instance restart. In this case, we might need to mount the diskgroup manually, with FORCE
option.

SQL> alter diskgroup data mount;

alter diskgroup data mount

ERROR at line 1:

ORA-15032: not all alterations performed

ORA-15040: diskgroup is incomplete

ORA-15042: ASM disk "3" is missing from group number "2"

SQL> alter diskgroup data mount force;

Diskgroup altered.

Monitoring the REPAIR_TIME

After a disk goes offline, the clock starts ticking, and the value of REPAIR_TIMER can be monitored to
see the time remaining before the disk must be made available to avoid an auto drop of the disk.

SQL> select name,path,state,header_status,repair_timer,mode_status,mount_status
       from v$asm_disk;

NAME     PATH       STATE    HEADER_  REPAIR_TIMER  MODE_ST  MOUNT_S
-------- ---------- -------- -------  ------------  -------  -------
DATA1    ORCL:DATA1 NORMAL   MEMBER              0  ONLINE   CACHED
DATA2    ORCL:DATA2 NORMAL   MEMBER              0  ONLINE   CACHED
DATA3    ORCL:DATA3 NORMAL   MEMBER              0  ONLINE   CACHED
DATA4               NORMAL   UNKNOWN           649  OFFLINE  MISSING

--We can confirm that no rebalance has started yet by using following query

SQL> select * from v$asm_operation;

no rows selected

If we are able to make this disk available/replaced before DISK_REPAIR_TIME lapses, we can bring
this disk back online. Please note that we would need to bring it ONLINE manually.

SQL> alter diskgroup data online disk data4;

Diskgroup altered.

select name,path,state,header_status,repair_timer,mode_status,mount_status
  from v$asm_disk;

NAME     PATH       STATE    HEADER_  REPAIR_TIMER  MODE_ST  MOUNT_S
-------- ---------- -------- -------  ------------  -------  -------
DATA1    ORCL:DATA1 NORMAL   MEMBER              0  ONLINE   CACHED
DATA2    ORCL:DATA2 NORMAL   MEMBER              0  ONLINE   CACHED
DATA3    ORCL:DATA3 NORMAL   MEMBER              0  ONLINE   CACHED
DATA4               NORMAL   UNKNOWN           465  SYNCING  CACHED

--Syncing is in progress, and hence no rebalance would occur.

SQL> select * from v$asm_operation;

no rows selected
-- After some time, everything would become normal.

select name,path,state,header_status,repair_timer,mode_status,mount_status
  from v$asm_disk;

NAME     PATH       STATE    HEADER_  REPAIR_TIMER  MODE_ST  MOUNT_S
-------- ---------- -------- -------  ------------  -------  -------
DATA1    ORCL:DATA1 NORMAL   MEMBER              0  ONLINE   CACHED
DATA2    ORCL:DATA2 NORMAL   MEMBER              0  ONLINE   CACHED
DATA3    ORCL:DATA3 NORMAL   MEMBER              0  ONLINE   CACHED
DATA4    ORCL:DATA4 NORMAL   MEMBER              0  ONLINE   CACHED

If the disk cannot be made available or replaced, either ASM will auto drop the disk after
DISK_REPAIR_TIME has lapsed, or we drop the ASM disk manually. A rebalance occurs after the
disk drop.
Since the disk status is OFFLINE, we need to use the FORCE option to drop the disk. After dropping
the disk, the rebalance starts and can be monitored from the V$ASM_OPERATION view.

SQL> alter diskgroup data drop disk data4;

alter diskgroup data drop disk data4

ERROR at line 1:

ORA-15032: not all alterations performed

ORA-15084: ASM disk "DATA4" is offline and cannot be dropped.

SQL> alter diskgroup data drop disk data4 force;

Diskgroup altered.

select group_number,operation,pass,state,power,sofar,est_work from v$asm_operation;

GROUP_NUMBER OPERA PASS      STATE  POWER  SOFAR  EST_WORK
------------ ----- --------- ------ ------ ------ --------
           2 REBAL RESYNC    DONE        9      0        0
           2 REBAL REBALANCE DONE        9     42       42
           2 REBAL COMPACT   RUN         9      1        0

Later we can replace the faulty disk and add the new disk back into this disk group. Adding
the disk back initiates a rebalance once again.

SQL> alter diskgroup data add disk 'ORCL:DATA4';

Diskgroup altered.

SQL> select group_number,operation,pass,state,power,sofar,est_work from v$asm_operation;

GROUP_NUMBER OPERA PASS      STATE  POWER  SOFAR  EST_WORK
------------ ----- --------- ------ ------ ------ --------
           2 REBAL RESYNC    DONE        9      0        0
           2 REBAL REBALANCE RUN         9     37     2787
           2 REBAL COMPACT   WAIT        9      1        0

18. What is an Allocation Unit (AU) in ASM?


 This is the fundamental unit of allocation in a disk group.
 The default AU is 1 MB; it can be changed to 2, 4, 8, 16, 32, or 64 MB.
 In Exadata the default AU is 4 MB.
 AU_SIZE is a disk group attribute, so each disk group can have its own AU
size.
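The AU size of each mounted disk group can be checked, and it can only be chosen at creation time; a sketch (the disk group name and path are hypothetical):

```sql
-- AU size is reported in bytes per disk group
SELECT name, allocation_unit_size
  FROM v$asm_diskgroup;

-- AU size must be set when the disk group is created
CREATE DISKGROUP data EXTERNAL REDUNDANCY
  DISK '/devices/diska1'
  ATTRIBUTE 'au_size' = '4M';
```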
19. What is ASM extent?
An extent consists of one or more AUs. An ASM file consists of one or more ASM extents.
From 11g onwards we have variable extent sizes:
 The first 20,000 extent sets have an extent size equal to the disk group AU_SIZE.
 The next 20,000 extent sets have an extent size equal to 4 * AU_SIZE.
 Extent sets 40,000 and higher have an extent size equal to 16 * AU_SIZE.
Why variable extent sizes?
For each extent there is an extent map entry in the shared pool. If large databases used the
default extent size (equal to AU_SIZE), the SGA memory requirement would be very high
because a lot of extent map entries would need to be stored in memory.
With the variable extent size feature, fewer extents are needed to describe an ASM file,
and less memory is needed to manage the extent maps in the shared pool.

20. For very large databases, should we use a small AU or a large AU?
If the database is very big, then a larger AU is recommended, because of:
 Reduced SGA size to manage the extent maps in the RDBMS instance.
 Increased file size limits.
 Reduced database open time, because VLDBs usually have many big
data files (in 11g this has been mitigated by fetching extents only on
demand).
In Oracle 11g, only the first 60 extents in the extent map are sent at file-open time. The
rest are sent in batches as required by the RDBMS.

21. When does rebalance happen?


Changes like adding or dropping disks trigger a rebalance operation. The rebalance operation
provides an even distribution of file extents and space usage across all the disks of a
disk group, which is necessary for ASM to provide balanced I/O.

22. Explain in detail how rebalance works?


Let's say a new disk has been added. Below is the sequence of the rebalance:
 This triggers the RBAL process to create the rebalance plan and then begin
coordinating the redistribution.
 RBAL calculates the estimated time and work required to perform the task and
then messages the ASM Rebalance (ARBx) processes to handle the request.
ASM_POWER_LIMIT decides how many ARBn processes are created.
 The Continuing Operations Directory (COD) is updated to reflect a rebalance
activity. This is important because, should the rebalance fail in the
middle, another instance can use the COD to either complete or
roll back the rebalance operation.
 RBAL distributes plans to the ARBs. In general, RBAL generates a plan per file;
however, larger files can be split among ARBs.
 ARBx performs the rebalance on these extents. Each extent is locked, relocated,
and unlocked.
We can monitor the rebalance operation from v$asm_operation.

23. Let’s say currently rebalance is running with power limit of 5.


After running for 1 hour, we found that it is slow and we need to
complete it quickly. Can we increase the power limit and what will
be the impact on ongoing rebalance operation?
Yes, we can increase the power limit of an ongoing rebalance operation. As soon as we
increase the power limit, additional ARB processes are created and start working on the
rest of the rebalance operation.
What if we reduce the power limit to 2? -> In that case the rebalance will be handled
by 2 ARB processes. The extra ARB processes that were already running will finish their
current extent relocations and then exit.
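Assuming the rebalance is still running on diskgroup DATA (a hypothetical name), the power can be changed on the fly with the same REBALANCE clause:

```sql
-- raise the power of the ongoing rebalance from 5 to 10
ALTER DISKGROUP data REBALANCE POWER 10;
```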
 

24. In your diskgroup all the disks are of the same size, but you still
find that the disks are not balanced. What could be the reason?
1. Either the ASM disk was added with a rebalance power of 0 (zero).
2. Or a previous rebalance was aborted for some reason and was never
completed afterwards.
 

25.  What will happen if the server crashes in the middle of a
rebalance operation?
The rebalance is tracked in the Continuing Operations Directory (COD). When the ASM
instance restarts (or, in RAC, when a surviving instance performs recovery), the COD
entry is used to restart and complete the rebalance operation.

26.  Difference between ASMLIB and ASM Filter Driver (AFD)?
Both provide device-name persistence and simplified ASM disk management. AFD
(introduced in 12c) additionally filters I/O: it rejects non-Oracle writes to ASM disks,
protecting them from accidental overwrite (for example by dd or mkfs). AFD is Oracle's
recommended replacement for ASMLIB.

27.  Is there any dependency between ASM and ASSM?


No, there is no relation between them. ASSM (Automatic Segment Space Management)
is a tablespace-level feature, while ASM is a storage-management feature.

28.  Explain in detail what happens when an ASM file is CREATED by
an Oracle database.
Suppose a database wants to create a datafile inside ASM.
1. First a file-creation request is sent from the RDBMS to ASM through the Onnn
process; it includes information like DG_NAME, FILE_TYPE, SIZE, block size etc.
2. ASM uses this information to allocate the file (considering the redundancy
and striping details as per the template information received).
3. After allocating the file, ASM sends the extent map information to the RDBMS
instance.
4. ASM creates a COD entry to track the pending file creation.
5. The RDBMS instance then initializes the file and completes the process.
6. After ASM receives the confirmation from the RDBMS instance, LGWR flushes the
ACD records, and then DBWR writes the allocation table and file directory
information.
If the file creation is aborted in the middle, ASM uses the COD data to roll back the
operation.
29.  Explain in detail what happens when an ASM file is OPENED by
an Oracle database.
When the RDBMS needs to open an ASM file:
The RDBMS sends an open-file request with the file name to ASM through the O0nn process.
ASM gets the extent map information from the file directory and ships it to the RDBMS
instance, which then performs I/O directly against the disks.

30.  Explain in detail what happens when an ASM file is DELETED by
an Oracle database.
31.  My ASM/grid version is 19c. Can I have one 19c and one 12c
database in that?
Yes. The grid/ASM version can be equal to or higher than the database version, so a 19c
ASM instance can serve both 19c and 12c databases (subject to the diskgroup's
compatible.rdbms attribute not being set higher than the oldest database version).
32.  Explain what you know about ASM metadata?
                  ASM stores metadata to describe and track diskgroup contents. All of the
metadata that describes the composition and contents of an ASM diskgroup is stored
within the diskgroup itself, which makes each diskgroup self-describing.
ASM has two main classes of metadata:
Physical Metadata
Virtual Metadata
Physical Metadata:
These are stored at fixed locations on the disk. The fixed locations are necessary for ASM
bootstrapping. The physical metadata structures are:
 Disk Header
 Allocation Table (AT)
 Free Space Table (FST)
 Partnership Status Table (PST)
Virtual Metadata:
 File Directory – stores information about the ASM files (name, size, type, striping
info, redundancy) in a diskgroup.
 Disk Directory – stores information about the disks in a diskgroup.
 Active Change Directory (ACD)
 Continuing Operations Directory (COD)
 Template Directory
 Alias Directory
 Attribute Directory
 Staleness Directory
 Staleness Registry
33. What is allocation Table?
 Each ASM disk has an Allocation Table (AT) to track free and allocated space
within the disk.
 The table contains one Allocation Table Entry (ATE) for each Allocation Unit (AU).
 ATEs are grouped into Allocation Table Blocks (ATBs).
 Unallocated AUs are marked as free in the Allocation Table. The free extents
are kept in a linked list to facilitate quickly finding a free AU for allocation.

34. What is free space Table(FST)?


 The FST indicates which Allocation Table Blocks (ATBs) contain free AUs.
 Whenever a disk is selected for allocation, ASM consults the FST so that it
can skip ATBs that are fully occupied.
35. What is the Partnership Status Table (PST)?
The PST tracks diskgroup membership and the status (online/offline) of each disk, along
with the disk partnership information used for mirroring. A majority of PST copies must
be available for a diskgroup to mount.

36. What is the default asm metadata block size?


The metadata block size is 4K. Note that the ASM metadata block size is independent of
the block sizes of the Oracle database files.
 

37. What is active change directory(ACD) ?


Active Change Directory:
 It is similar to the redo log in a database.
 When the ASM instance needs to make an atomic change to multiple
metadata blocks, a log record is written into the ASM Active Change Directory
(ACD).
 ASM uses the ACD to perform crash recovery and instance recovery to
ensure that the ASM metadata is consistent.

38. What is the Continuing Operations Directory (COD)?


 This is similar to undo in a database.
 Long-running operations like CREATE/DROP DISK and rebalance activities are
tracked via this directory.
 If a long-running process dies before completing its work, the recovery
process checks this directory and either completes or rolls back the task
accordingly.
2 types of continuing operations:
Background:
 Diskgroup rebalance is a background operation.
 If the rebalance fails or the ASM instance crashes, the surviving instance
refers to this directory to complete the process.
Rollback:
 These are performed by a foreground process, on behalf of a database
instance.
 If the creation of an ASM file fails, the partially created file needs to be
deleted.
 Examples – create/delete/drop file.
39. What is the Staleness Registry?
 The staleness registry tracks allocation units that become stale while a disk is
offline (possible only with normal or high redundancy).
 The staleness directory contains metadata to map the slots in the staleness registry
to a particular disk and RDBMS instance.
 When a disk goes offline, each RDBMS instance gets a slot in the staleness
registry for that disk. The slot has a bit for each allocation unit on the offline
disk. When an RDBMS instance write is targeted at an offline disk, that
instance sets the corresponding bit in the staleness registry.
 When the disk comes back online, ASM checks which AUs have this bit set and
copies the mirror extents for them.
40. Explain how extent relocation happens on ASM disks.
Relocation is the act of moving extents from one disk to another in a diskgroup, which
mostly happens during a rebalance operation.
Relocation happens on a per-extent basis.
There are two scenarios:
File closed:
For a given extent, if the file is closed, ASM can relocate the extents without sending
any messages to other ASM or RDBMS instances.
File open:
 For a given extent, if the file is open, the ASM instance handling the
relocation first sends a message to all the ASM instances that it is planning
to relocate the extent. Those ASM instances in turn send messages to their
RDBMS clients over the umbilicus connection.
 Now the RDBMS delays writes to this extent until the relocation is completed.
(The chances are small that the RDBMS is writing to the extent that is
being relocated at the same time.)
 ASM then does the actual relocation. During this time, if any RDBMS
instance wants to read that extent, it can still read from the old location.
But if a write request comes, it must wait until the relocation is completed.
 Once the relocation is done, ASM releases the old AUs to the free pool.
 

41.Explain how asm crash recovery happens?


ASM crash recovery is similar to database instance crash recovery.
Two virtual metadata directories are responsible for crash recovery:
Active Change Directory (ACD) – which is like redo in a database
Continuing Operations Directory (COD) – which is like undo in a database
 ASM crash recovery is essentially diskgroup recovery. During recovery,
the surviving instance first applies the ACD records associated with the
crashed instance. Applying the ACD records ensures that the ASM cache is in a
consistent state.
 After ACD recovery is completed, COD recovery happens if any long-running
operation, like a file creation or disk addition, was going on during the instance crash.
42. Does rebalance happen periodically?
No, rebalance doesn't happen periodically. It is triggered by storage reconfiguration,
such as adding or dropping disks.
43. Suppose a user is running a create datafile command in the
database, but in the middle the server crashes. What will happen
to the create datafile command?
It will be rolled back; ASM uses the COD entry for the pending file creation to delete the
partially created file.

44. How do you know whether rebalance is going on or not?
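A sketch of the usual check — v$asm_operation (gv$asm_operation in RAC) returns one row per running operation, and no rows when nothing is running:

```sql
-- non-empty result means a rebalance (or other long-running operation) is in progress
SELECT inst_id, group_number, operation, state, power
  FROM gv$asm_operation;
```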


45. What is the use of the kfed utility?
kfed reads (and, in repair scenarios, writes) ASM disk-header metadata directly from
disk. It works even when the ASM instance is down, which is why the clusterware uses it
during bootstrap to locate the voting disk and the ASM spfile.
46. What is the + sign in every ASM diskgroup?
The + sign is the root of the ASM file namespace; every fully qualified ASM file name
starts with +DISKGROUP_NAME, just as / is the root of a Unix file system.
47. What is the use of the incarnation number in an ASM file name?
A fully qualified ASM file name ends in file_number.incarnation_number. The incarnation
number, derived from the time the file was created, guarantees uniqueness if a file
number is ever reused.
48. Let's say you have a diskgroup with normal redundancy, and one
disk of that diskgroup goes offline temporarily. In the meantime
the diskgroup receives a lot of transactions. How will that disk
be synced with its mirror disks?
Using the staleness registry and staleness directory: writes targeted at the offline disk
set bits in the staleness registry, and when the disk comes back online only the stale
AUs are copied from their mirror extents.

49. Can I export (expdp) data to an ASM diskgroup?


Yes, we can. For that, create the directory object with an ASM path:
create directory EXPDIR as '+FRA/BACKUP';
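A fuller sketch (the user, schema and dump-file names are hypothetical). Note that the expdp log file cannot be written into ASM, so it should go to an OS directory or be suppressed with NOLOGFILE:

```sql
-- in the database: directory object pointing into ASM
create directory EXPDIR as '+FRA/BACKUP';
-- from the shell (hypothetical credentials and schema):
-- expdp system/password schemas=scott directory=EXPDIR dumpfile=scott.dmp nologfile=y
```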

50. What is Oracle ACFS and in which scenarios is ACFS useful?


ACFS (ASM Cluster File System) is a general-purpose cluster file system built on ASM
dynamic volumes (ADVM). It is useful for files that cannot live directly inside an ASM
diskgroup – application files, scripts, trace/log files, GoldenGate files, or a shared
Oracle home – while still getting ASM's striping, mirroring and cluster-wide access.
51. What is Flex ASM?

With Oracle Flex ASM, clients can connect to a remote ASM instance over a network
connection (the ASM network). If a server running an ASM instance fails, Oracle
Clusterware will start a new ASM instance on a different server to maintain the
cardinality. If a 12c database instance is using a particular ASM instance, and that
instance is lost because of a server crash or ASM instance failure, then the 12c database
instance will reconnect to an existing ASM instance on another node. These features are
collectively called Oracle Flex ASM.

52. What are the advantages of using ASM instead of a traditional
raw file system?
The advantages from question 1 apply: built-in mirroring, automatic rebalancing and
even I/O distribution, raw-device-like performance without file-system overhead, online
disk addition/removal, and simplified, consolidated storage management.
53. What is asm striping?
Oracle ASM splits files into stripes and spreads them evenly across all the disks of a disk group.
Why striping?
 To balance load across disks
 To reduce I/O latency
Two types of striping in ASM:
COARSE GRAINED:
 In coarse-grained striping, the stripe size is the same as the AU size of the
diskgroup (i.e. 1 MB by default).
 It is helpful for high-volume I/O.
 It balances load across disks.
 Datafiles, redo log files and tempfiles are coarse grained.
FINE GRAINED:
 The stripe size is 128 KB.
 Helpful for low-latency activities.
 Useful for small files with small I/Os.
 Currently only the control file is fine grained, as the control file is small and
its I/O can still be spread across disks.
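The striping attribute of each file type can be checked from the template directory; a sketch (the group number 1 is a hypothetical example):

```sql
-- STRIPE shows COARSE or FINE for each file-type template in the diskgroup
SELECT name, stripe, redundancy
  FROM v$asm_template
 WHERE group_number = 1
 ORDER BY name;
```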
 

54. Why is the control file fine grained?


The control file is small and its I/Os are small and latency-sensitive. Fine-grained
striping (128 KB stripes) spreads even this small file across many disks and reduces I/O
latency, whereas a 1 MB coarse stripe would concentrate it on very few disks.

55.  Suppose we need to add 5 disks of 1 TB each to an existing
diskgroup. Should I add them one by one, or add all 5 disks at a
time? Which method creates less overhead?
It is a best practice to add and drop multiple disks in a single statement, so that only
one rebalance is triggered and ASM can reorganize partnership information within the
ASM metadata more efficiently.
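A sketch of adding all five disks in one statement, so that only a single rebalance runs (the diskgroup name and device paths are hypothetical):

```sql
ALTER DISKGROUP data ADD DISK
  '/dev/mapper/disk1', '/dev/mapper/disk2', '/dev/mapper/disk3',
  '/dev/mapper/disk4', '/dev/mapper/disk5'
  REBALANCE POWER 8;
```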

56.  What is the benefit of using asmlib?


 It simplifies ASM disk administration (including device-name persistence across
reboots).
 ASMLIB efficiently handles file descriptors and thus reduces the number of
open file descriptors on the system, making it less likely to run out of global
file descriptors. Also, the open and close operations are reduced, ensuring
orderly cleanup of file descriptors when the storage configuration changes.
 

57.What is fast rebalance feature of oracle 11g asm?


Usually, when we add or drop disks in a diskgroup, a rebalance is initiated and
messaging starts between all the active ASM instances. If the rebalance operation is a
big one, this messaging between the ASM instances can cause delay.
So in a situation where users don't need to access the diskgroup during the maintenance,
we can use the fast rebalance feature: the rebalance is performed by only one ASM
instance, with no inter-instance messaging.
Note – the diskgroup is NOT accessible to databases while it is mounted in restricted mode.
Steps for fast rebalance in a 2-node RAC:
1. Dismount the diskgroup from all the ASM instances.
2. Mount the diskgroup on only one ASM instance in restricted mode (ALTER
DISKGROUP DATA MOUNT RESTRICTED).
3. Do the storage activities like add/drop disk (this starts the rebalance).
4. Once the rebalance is completed, dismount the diskgroup from that node
and mount the diskgroup on all nodes of the cluster.
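The steps above can be sketched as follows (the diskgroup name and device path are hypothetical; run each statement on the instances indicated by the comments):

```sql
-- on every ASM instance
ALTER DISKGROUP data DISMOUNT;
-- on one ASM instance only
ALTER DISKGROUP data MOUNT RESTRICTED;
ALTER DISKGROUP data ADD DISK '/dev/mapper/disk6';   -- rebalance runs on this instance only
-- wait until v$asm_operation returns no rows, then
ALTER DISKGROUP data DISMOUNT;
-- finally, mount normally on all ASM instances
ALTER DISKGROUP data MOUNT;
```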
 
58.What is asm proxy?
 With the introduction of Flex ASM in Oracle 12c, the hard dependency between
ASM and its clients has been relaxed, i.e. not all nodes in a cluster need to
have an active ASM instance.
 In such a scenario, to make ACFS services available on nodes without
an ASM instance, a new instance type has been introduced by Flex ASM.
 The ASM proxy instance works on behalf of a real ASM instance. The ASM
proxy instance fetches the metadata about ACFS volumes and file systems
from an ASM instance and caches it.
 If an ASM instance is not available locally, the ASM proxy instance connects to
another ASM instance over the network to fetch the metadata.
 Additionally, if the local ASM instance fails, the ASM proxy instance can
fail over to another surviving ASM instance on a different server, resulting in
uninterrupted availability of shared storage and ACFS file systems.
59. How can we check whether Flex ASM is enabled or not? Also, if I
have a 5-node RAC and I want to keep 3 ASM instances on any 3
of the 5 nodes, how will I do it?
ASMCMD> showclustermode
ASM cluster : Flex mode enabled
[root@dbatestbin]# ./srvctl modify asm -count 3
[root@dbatest bin]# ./srvctl config asm
ASM home: /crsapp/app/oracle/grid/19c_home
Password file: +DATA/paramfile/orapwASM
ASM listener: LISTENER
ASM instance count: 3
Cluster ASM listener: ASMNET1LSNR_ASM

60. How flex asm works? 


With Oracle Flex ASM, clients can connect to a remote ASM instance over a network
connection (the ASM network). If a server running an ASM instance fails, Oracle
Clusterware will start a new ASM instance on a different server to maintain the
cardinality. If a 12c database instance is using a particular ASM instance, and that
instance is lost because of a server crash or ASM instance failure, then the 12c database
instance will reconnect to an existing ASM instance on another node. These features are
collectively called Oracle Flex ASM.
Flex ASM requires a separate listener (the ASM listener) to be configured on a port
number that is not being used by any other listener. The other important thing is that
Flex ASM requires a dedicated network over which the ASM instances and their clients
communicate. You can also reuse the private interconnect network (used for inter-node
communication) as the network over which the ASM instances and their clients
communicate.
61. How can we convert standard asm to flex asm?
[oracle]$ asmca -silent -convertToFlexASM -asmNetworks eth1/192.168.1.0 -
asmListenerPort 1529
[root]# /crsapp/app/oracle/cfgtoollogs/asmca/scripts/converttoFlexASM.sh
 
srvctl config asm
We can change the asm cardinality i.e If we have 3 node RAC, and we want ASM to run
on 2 node RAC, then
[oracle@]$ srvctl modify asm -count 2
 
Alternatively we can use the GUI method i.e using ASMCA utility.

62. Suppose the spfile location inside the GPnP profile is missing.
Will ASM start during cluster startup?
Yes, provided a parameter file can be found elsewhere. When an Oracle ASM instance
searches for an initialization parameter file, the search order is:
1. The location of the initialization parameter file specified in the Grid Plug and
Play (GPnP) profile.
2. If the location has not been set in the GPnP profile, then the search order
changes to:
a. SPFILE, then PFILE, in the Oracle ASM instance home. For example,
the SPFILE for Oracle ASM has the following default path in the
Oracle Grid Infrastructure home in a Linux environment:
$GRID_HOME/dbs/spfile+ASM.ora
Note:
A PFILE or SPFILE is required if your configuration uses non-default initialization
parameters for the Oracle ASM instance.
 

63. A user ran select * from EMP, where the datafile is in ASM.
Explain how it will get the data from the ASM disks.
The database server process does the I/O directly. Using the extent map it received
from ASM when the file was opened, it computes which disk and allocation unit hold the
required blocks and reads them straight from the ASM disks. ASM is not in the I/O path
for ordinary reads and writes.

64. How do you estimate how long the rebalance will take?
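One common way, assuming a rebalance is in flight, is to query the estimate columns of gv$asm_operation (EST_MINUTES is ASM's own projection and is refined as the operation proceeds):

```sql
-- SOFAR/EST_WORK show progress in AUs; EST_MINUTES is the remaining-time estimate
SELECT inst_id, operation, state, power,
       sofar, est_work, est_rate, est_minutes
  FROM gv$asm_operation;
```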
 

65. What are the different phases of diskgroup rebalance?


 Planning
 File extents relocation
 Disk compacting
66. What is this compact phase in asm rebalance?  Is it really
necessary and can we disable it?
In the compact phase, data is moved to the outer tracks of the ASM disks, because the
outer region (the "hot" tracks) of a spinning disk has higher bandwidth and greater
speed.
The compact phase can add noticeable time to a rebalance operation, and with flash
disks there is no benefit to compacting, so we can choose to disable it depending on the
Oracle version.
Oracle 12c onward (we can alter it at the diskgroup level):
ALTER DISKGROUP <dg> SET ATTRIBUTE '_rebalance_compact'='FALSE';
Prior to 12c (we need to change a parameter at the instance level):
alter system set "_disable_rebalance_compact"=true;

67. What is a flex ASM diskgroup?
A flex diskgroup (introduced in 12.2) organizes files into file groups (typically one per
database or PDB) and supports quota groups, allowing redundancy and space quotas to
be managed per file group instead of per diskgroup.

Oracle RAC Interview Questions


1. Why are we using a VIP in RAC? Because before 10g there was
no concept of a VIP.
If a user connects to an instance using the physical IP and the node goes down, there is
no way for the user to know whether the node is available or not, so the connection has
to wait a long time until it is timed out by the network (TCP timeout).
However, if we use a logical VIP (on top of the physical IP), then when the node goes
down, CRS fails the VIP over to a surviving node, and the user gets a connection error
quickly (like TNS no listener) instead of hanging.

2. If i have a 8 node RAC, then how many scan listeners are


required?

3 SCAN listeners are sufficient for any RAC setup. It is not mandatory for a SCAN listener
to run on every node.

3. How SCAN knows which node has least load?


The Load Balancing Advisory provides load information to the SCAN listeners.

4. Explain how client connection is established in  RAC database ?


The LREG process on each instance registers the database services of the node with the
default local listener and the SCAN listeners. The listeners store the workload
information of each node.
So when a client tries to connect using the scan_name and port:
1. scan_name is resolved through DNS, which redirects the client to one of the
(up to 3) SCAN IPs.
2. The client connects to the corresponding SCAN listener.
3. The SCAN listener compares the workload of the instances, and if it
determines that node 1 has the least load, it sends the VIP address and port
of that node's local listener to the client.
4. The client connects to that local listener, and a dedicated server process is
created.
5. The client connection becomes successful and it starts accessing the database.
5. What are current blocks, CR blocks and PIs in RAC?
Data block requests from the global cache are of two types:
current block (CUR) -> When we want to update data, Oracle must locate the most
recent version of the block in the cache; this is known as the current block.
consistent read (CR) -> When we want to read data, only committed data is provided
(with the help of undo); this is known as a consistent read.
past image (PI) -> When node A wants to update a block that is present on node B, and
node B has also updated the block, node B sends the current copy of the block to node A
and keeps a past image (PI) of the block until the block is written to disk. When a
checkpoint writes the block to disk, the PI images can be discarded.
There can be multiple CR copies of a block, but there is always only one current block.
There can be multiple SCUR (shared current) copies, but only one XCUR (exclusive
current).

6. What is gc buffer busy wait?


It means a session is trying to access a buffer in the buffer cache, but that particular
buffer is currently busy with a global cache operation.
During that time, a gc buffer busy wait is reported.
Example –
 Let's say session A wants to access block id 100, but currently that block is in
the buffer cache of instance B.
 So session A requests instance B's LMS process to transfer the block.
 While the transfer is going on, another session also tries to access that block.
But as that buffer is already busy in a global cache operation, the session has
to wait on the gc buffer busy wait event.
Reasons – concurrency related, right-hand index growth.

Other reasons might be lack of CPU or a slow interconnect.

8. What are some RAC specific parameters ?


 undo_tablespaces
 cluster_database
 cluster_interconnects
 remote_listener
 thread
 cluster_database_instances
9. Why RAC has separate redo thread for each node?
In RAC, each instance has its own LGWR process, so there has to be a separate set of
online redo logs for each instance (called a thread), so that each LGWR writes to its own
redo logs.

10. Why RAC has separate undo tablespace for each node?
If we kept only one undo tablespace, it would need more coordination between the
nodes and would increase the traffic on the interconnect.

11. Explain about local_listener and remote_listener parameter in


RAC?

In RAC, the local_listener parameter points to the node VIP, and remote_listener is set to
the SCAN.
The purpose of the remote listener is to connect all instances with all listeners, so the
instances can propagate their load balancing advisories to all listeners. A listener uses
the advisories to decide which instance should service a client request. If the listener
learns from the advisories that its local instance is least loaded and should service the
client request, the listener passes the client request to the local instance. If the local
instance is overloaded, the listener can use a TNS redirect to send the client request to a
less loaded (remote) instance. This phenomenon is also called server-side load balancing.
12. What are the local registry and cluster registry?
The Oracle Cluster Registry (OCR) is the cluster-wide repository of resource
configuration, stored on shared storage (ASM); the Oracle Local Registry (OLR) is a
per-node file on the local file system holding the resources needed to bootstrap the
stack (see questions 28–30 below).
13. What is client-side load balancing and server-side load
balancing?
Client-side load balancing is configured in the client's connect descriptor
(LOAD_BALANCE=ON): the client picks one of the listed addresses at random.
Server-side load balancing is done by the listener, which uses the load balancing
advisory to hand the connection to the least loaded instance.
14. What are the RAC related background processes?
LMON –
 (Global Enqueue Service Monitor) It manages global enqueues and resources.
 LMON detects instance transitions and performs reconfiguration of GES
and GCS resources.
 It usually does the job of dynamic remastering.
LMD ->
 Referred to as the GES (Global Enqueue Service) daemon, since its job is to
manage global enqueue and global resource access.
 The LMD process also handles deadlock detection and remote enqueue
requests.
LCK0 (Instance Lock Manager) -> This process manages non-cache-fusion resource
requests such as library cache and row cache requests.
LMS (Global Cache Service process) ->
 Its primary job is to transport blocks across the nodes for cache-fusion
requests.
 GCS_SERVER_PROCESSES -> the number of LMS processes, specified as an
init.ora parameter.
 Increase this parameter if global cache activity is very high.
ACMS:
 Atomic Controlfile to Memory Service.
 Ensures a distributed SGA memory update is either globally committed on
success or globally aborted if a failure occurs.
RMSn: Oracle RAC Management Processes (RMSn)
 Usually help in the creation of services when a new instance is added.
LMHB:
 Global Cache/Enqueue Service Heartbeat Monitor.
 LMHB monitors the heartbeat of the LMON, LMD, and LMSn processes to
ensure they are running normally without blocking or spinning.
15. What is TAF?
TAF provides run-time failover of connections. There are different options we can specify
while creating a TAF policy.
Let's say we created a TAF policy with the SELECT option. Now suppose a user connects
using the TAF service and runs a select statement. While the select statement is running,
the node on which it is running crashes. The select statement is then transparently failed
over to another node, where it completes and the results are fetched.

16. What is the Flex Cluster introduced in Oracle 12c?
A Flex Cluster has two types of nodes: hub nodes, which have direct access to shared
storage, and leaf nodes, which have no direct storage access and connect to the cluster
through a hub node. This allows large clusters without every node needing storage
connections.
17. ASM is running, but the database is not coming up. What might
be the issue?
18. Can we start CRS in exclusive mode, and what is its purpose?
Yes: crsctl start crs -excl (optionally with -nocrs) starts the stack in exclusive mode on
one node, with CSS and ASM up but no cluster resources. It is used for maintenance
tasks such as restoring the OCR or recreating the voting disk.
19. If CRS is not coming up, then what are the things you will start
looking into?
20. What data we need to check in vmstat and iostat output?
21. Explain different ways to find master node in oracle rac?
 
1. Grep the ocssd log file: grep -i "master node" ocssd.log | tail -1
2. Grep the crsd log file: grep MASTER crsd.log | tail -1
3. Query the V$GES_RESOURCE view.
4. ocrconfig -showbackup – the node that stores the OCR backups is the master
node.
22. What is cache fusion in Oracle RAC, and its benefits?
Cache fusion is the mechanism by which data blocks are shipped between instance
buffer caches over the private interconnect (by the LMS processes), instead of being
written to and re-read from disk. It makes the caches of all instances behave like one
global cache and greatly reduces disk I/O for inter-instance block sharing.
23. Explain split brain in Oracle RAC.
Split brain occurs when cluster nodes lose communication with each other (for example,
on interconnect failure) and each subset of nodes believes it is the surviving cluster,
risking uncoordinated writes to shared storage. Oracle resolves it using the voting disk:
the subset with the majority of votes survives, and the other nodes are evicted.
24. Difference between crsctl and srvctl?
crsctl manages the clusterware stack itself (start/stop CRS, check the stack, manage
voting disks), usually as root, while srvctl manages the resources registered in the
cluster (databases, instances, listeners, services, diskgroups), usually as the oracle/grid
owner.
25. I want to run a parallel query in a RAC database, but I need to
make sure the parallel slave processes run only on the node
where I am running the query, and do not spill over to other nodes.
We can set the parallel_force_local parameter to TRUE at the session level and then run
the parallel query. All the PX processes will run only on that node.
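A minimal sketch (the table name big_table is hypothetical):

```sql
ALTER SESSION SET parallel_force_local = TRUE;
-- PX slaves for this statement are now confined to the local instance
SELECT /*+ PARALLEL(t, 8) */ COUNT(*) FROM big_table t;
```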

26. My clusterware version is 11gR2. Can I install a 12c database?
Is the vice versa possible (clusterware version 12c and database
version 11g)?
The clusterware version must be the same as or higher than the database version. So an
11g database can run on 12c grid, but a 12c database will not work on 11g grid.

27. What are the storage structures of a clusterware?


2 shares storage structure – OCR , VD
2 local storage structure – OLR, GPNP profile.

28. What is OLR and why it is required?


While starting, clusterware needs to access the OCR to know which resources it needs to
start. However, the OCR file is stored inside ASM, which is not accessible at this point
(because the ASM resource itself is defined in the OCR file).
To avoid this chicken-and-egg problem, the resources that need to be started on the
node are also stored in an operating-system file called the OLR (Oracle Local Registry).
Each node has its own OLR file.
So when we start the clusterware, this file is accessed first.

29. What is OCR and what it contains?


The OCR is the central repository for CRS, which stores the metadata, configuration and
state information for all cluster resources defined in the clusterware:
node membership information
status of cluster resources like databases, instances, listeners and services
ASM diskgroup information
information about the OCR and voting disk, their locations and backups
VIP and SCAN VIP details.

30. Who updates OCR and how/when it gets updated?


The OCR is updated by client applications and utilities through the CRSd process:
1. tools like DBCA, DBUA, NETCA, ASMCA, CRSCTL and SRVCTL, through the CRSd process
2. CSSd during cluster setup
3. CSSd during node addition/deletion.
Each node maintains a copy of the OCR in memory. Only one CRSd (the master) performs
reads and writes to the OCR file. Whenever some configuration is changed, the CRSd
process refreshes the local OCR cache and the remote OCR caches, and updates the OCR
file on disk.
So whenever we get cluster information using srvctl or crsctl, it uses the local OCR cache
for fetching the data; but when configuration is modified, the master CRSd process
updates the physical OCR file.

31. What is the purpose of Voting disk?


The voting disk stores information about the nodes in the cluster and their heartbeats,
as well as cluster membership information.

32. Why we need voting disk?

Oracle Clusterware uses the VD to determine which nodes are members of a cluster.
Oracle Cluster Synchronization Service daemon (OCSSD) on each cluster node updates
the VD with the current status of the node every second. The VD is used to determine
which RAC nodes are still in the cluster should the interconnect heartbeat between the
RAC nodes fail.

33. What is GPNP profile?


The Grid Plug and Play (GPnP) profile is a small XML file present on the local file system.
Each node has its own GPnP profile.
The GPnP profile is managed by the GPnP daemon.
It stores information required to start the cluster, like the ASM diskstring and the ASM
spfile location:
– storage to be used for CSS
– storage to be used for ASM: SPFILE location, ASM diskstring
– public and private network details.
When clusterware is started, it needs the voting disk (which is inside ASM). So first it
checks the GPnP profile to get the voting disk location (asm_diskstring is defined inside
the GPnP profile). As ASM is not up at this point, the voting disk is read using the kfed
utility. (We can run kfed even when the ASM instance is down.)
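A sketch of that bootstrap read (the device path is hypothetical; the kfdhdb.* field names come from kfed's header dump, where vfstart/vfend locate the voting file and spfile locates the ASM spfile):

```shell
# dump the ASM disk header even while ASM is down
kfed read /dev/oracleasm/disks/DISK1 | grep -E 'kfdhdb\.(grpname|vfstart|vfend|spfile)'
```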
34. What are the software stacks in oracle clusterware?
From 11g onward, there are two stacks in the clusterware (CRS):
1. The lower stack is the High Availability Services stack (managed by the ohasd
daemon).
2. The upper stack is the CRSD stack (managed by the CRSd daemon).
35. What are the role of CRSD,CSSD,CTSSD, EVMD, GPNPD
CRSD – Cluster Ready Services daemon – manages the cluster resources based on the
OCR information. This includes start, stop and failover of resources. It monitors database
instances, ASM instances, listeners, services etc. and automatically restarts them when
failures occur.
CSSD -> Cluster Synchronization Services – manages the cluster configuration, such as
which nodes are part of the cluster. When a node is added or deleted, it informs the
other nodes. It is also responsible for node eviction when the situation requires it.
CSSD has 3 processes:
the CSS daemon (ocssd),
the CSS agent (cssdagent) – the cssdagent process monitors the cluster and provides
I/O fencing,
the CSS monitor (cssdmonitor) – monitors internode cluster health.
 CTSSD -> provides time management for the cluster. If NTP is running on the
server, CTSS runs in observer mode.
 
 EVMD -> Event Manager – a background process that publishes Oracle
Clusterware events, manages message flow between the nodes, and logs
relevant information to a log file.
 
 oclskd -> Cluster Kill daemon – used by CSS to reboot a node based on
requests from other nodes in the cluster.
 
 Grid IPC daemon (gipcd): a helper daemon for the communications
infrastructure.
 
 Grid Plug and Play daemon (GPNPD): provides access to the Grid Plug and
Play profile, and coordinates updates to the profile among the nodes of the
cluster to ensure that all of the nodes have the most recent profile.
 
 Multicast Domain Name Service (mDNS): Grid Plug and Play uses the mDNS
process to locate profiles in the cluster; it is also used by GNS to perform
name resolution.
 Oracle Grid Naming Service (GNS): handles requests sent by external DNS
servers, performing name resolution for names defined by the cluster.
36. The ASM spfile is stored inside an ASM diskgroup, so how does
clusterware start the ASM instance (as the ASM instance needs the
spfile at startup)?
So here is the sequence of cluster startup:
ohasd is started by init.ohasd.
ohasd accesses the OLR file (stored on the local file system) to initialize the ohasd process.
ohasd starts gpnpd and cssd.
The cssd process reads the GPnP profile to get information like the asm_diskstring and
the ASM spfile location.
cssd scans all the ASM disk headers, finds the voting disk location, reads it using kfed,
and joins the cluster.
To read the spfile, it is not necessary to open the diskgroup: all the information needed
is stored in the ASM disk header. OHASD reads the header of the ASM disk containing
the spfile (the location is retrieved from the GPnP profile), and the contents of the spfile
are read using kfed. Using this ASM spfile, the ASM instance is started.
Now that the ASM instance is up, the OCR can be accessed, as it is inside an ASM
diskgroup. So OHASD will start CRSD.
So below are the 5 important files it accesses, in order:
FILE 1 : OLR ( ORACLE LOCAL REGISTRY )   ——————————-> OHASD Process
FILE 2 : GPNP PROFILE ( GRID PLUG AND PLAY ) ————————> GPNPD process
FILE 3 : VOTING DISK —————————————————————-> CSSD Process
FILE 4 : ASM SPFILE ——————————————————————> OHASD Process
FILE 5 : OCR ( ORACLE CLUSTER REGISTRY ) ——————————> CRSD Process

37. Explain RAC startup sequence?


The init process spawns init.ohasd (configured in /etc/init), which starts the OHASd
process. From there the stack comes up in the sequence described in the previous
answer (OLR -> GPnP profile -> voting disk -> ASM spfile -> OCR).

38. What is GES and GCS?

GES and GCS are two important parts of the GRD (Global Resource Directory).
GES and GCS have memory structures in the GRD, which is distributed across the
instances and stored partly in the shared pool.
Global Enqueue Service (GES) handles the enqueue mechanism in Oracle RAC. It
performs concurrency control on dictionary cache locks, library cache locks and
transaction locks. This mechanism ensures that all the instances in the cluster know the
locking status of each other, i.e. if node 1 wants to lock a table, it needs to know what
type of lock is present on the other nodes. Its background processes are LCK0, LMD and
LMON.
Global Cache Service (GCS) handles block management. It maintains and tracks the
location and status of blocks. It is responsible for block transfer across instances. LMS is
its primary background process.

39. What is dynamic remastering?


Mastering a block means the master instance keeps track of the state of the block until
remastering happens due to one of a few scenarios, such as an instance crash.
The GRD stores useful information like the data block address, block status, lock information, SCN, past
images, etc. Each instance holds part of the GRD in its SGA, i.e., the instance which
is the master of a block or resource maintains the GRD entry for that resource in its SGA.
Mastership of a resource is decided based on demand. If a particular resource is
mostly accessed from node 1, then node 1 becomes the master of that resource. And
if after some time node 2 starts heavily accessing the same resource, then all the resource
information is moved to node 2's GRD.
LMON, LMD, and LMS are responsible for dynamic remastering.

Remastering can happen due to below scenarios.


1. Resource affinity – > GCS keeps track of the number of GCS requests per
instance and per object. If one instance is accessing an object's blocks much
more heavily than the other nodes, then GCS can decide to migrate all of
that object's resources to the heavily accessing instance.
2. Manual remastering – > We can manually remaster an object.
3. Instance crash – > If an instance crashes, then its GRD data is
remastered across the surviving instances in the cluster.
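A hedged sketch of the manual remastering option (the object id 12345 is a placeholder, and oradebug lkdebug is an undocumented interface, so try it only on test systems):

```
SQL> oradebug setmypid
SQL> oradebug lkdebug -m pkey 12345
```

After this, gv$gcspfmaster_info should show the current instance as the master of that object.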
40. How instance recovery happens in oracle RAC?
When one of the instances crashes in RAC, the node failure is detected by the
surviving instances. The GRD resources are then redistributed across the surviving
instances. The instance which first detects the crash reads the online redo log
thread of the crashed instance. The SMON of that instance reads the redo to roll
forward (i.e., to apply both committed and uncommitted changes). Once the roll forward is
done, it rolls back the uncommitted transactions using the UNDO tablespace of the failed
instance.
Sequence
1. Normal RAC operation, all nodes are available.
2. One or more RAC instances fail.
3. Node failure is detected.
4. Global Cache Service (GCS) reconfigures to distribute resource management
to the surviving instances.
5. The SMON process in the instance that first discovers the failed instance(s)
reads the failed instance(s) redo logs to determine which blocks have to be
recovered.
6. SMON issues requests for all of the blocks it needs to recover.  Once all
blocks are made available to the SMON process doing the recovery, all other
database blocks are available for normal processing.

7. Oracle performs roll forward recovery against the blocks, applying all redo
log recorded transactions.
8. Once redo transactions are applied, all undo records are applied, which
eliminates non-committed transactions.
9. Database is now fully available to surviving nodes.
 

41. What is TAF in oracle RAC?


TAF (Transparent Application Failover) automatically reconnects a session to a surviving instance when its instance fails.
Failover methods: BASIC (the failover connection is created only at failover time) and PRECONNECT (a backup connection is created in advance).
Failover types: SELECT (in-flight SELECT statements resume after failover) and SESSION (only the session is re-established).
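These options are set in the client connect descriptor. A minimal sketch of a tnsnames.ora entry enabling TAF (the alias, host, and service names are assumptions):

```
TESTDB_TAF =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = myrac-scan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = testdb_svc)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 30)(DELAY = 5))
    )
  )
```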

42. Can we have multiple SCAN(name) in a RAC? 


From 12c onwards, we can have multiple SCANs, each on a different subnet. As part of
installation, only one SCAN is configured. Post installation, we can configure another
SCAN with a different subnet (if required).

43. In RAC, where we define the SCAN?


We can define the SCAN with either of the below 2 options:
1. Using corporate DNS
2. Using Oracle GNS (Grid Naming Service)
44. What does the g stand for in views like gv$session, gv$sql, etc.?
45. What is load balancing advisory?
46. What is ACMS?
47. What are some RAC related wait events?
48. What is the role of LMON background process?
49. What is gc cr 2 way and gc cr 3 way?
 

50. What is HAIP?


HAIP, High Availability IP, is the Oracle solution for load balancing and failover of
private interconnect traffic. Traditionally, host-based solutions such as bonding (Linux) are
used to implement high availability for the private interconnect; HAIP is
Oracle's own alternative.
Essentially, even if one of the physical interfaces goes offline, private interconnect traffic can
be routed through the other available physical interface. This leads to a highly available
architecture for private interconnect traffic.
The ora.cluster_interconnect.haip resource will pick up a  highly available virtual IP (the
HAIP) from “link-local” (Linux/Unix)  IP range (169.254.0.0 ) and assign to each private
network.   With HAIP, by default, interconnect traffic will be load balanced across all
active interconnect interfaces. If a private interconnect interface fails or becomes non-
communicative, then Clusterware transparently moves the corresponding HAIP address
to one of the remaining functional interfaces.
$ crsctl stat res ora.cluster_interconnect.haip -init
NAME=ora.cluster_interconnect.haip
TYPE=ora.haip.type
TARGET=ONLINE STATE=ONLINE on dbhost1
 
For example, if during installation we specified the private interconnect as 192.168.1.0 (ens225),
then when the cluster starts, a new VIP in the 169.254.* range is assigned, so
gv$cluster_interconnects shows the ip_address as 169.254.*.

NOTE – For HAIP to fail over, there has to be another physical interconnect
interface available.
 

51. What is node eviction and in which scenarios does node eviction


happen?
Ocssd.bin is responsible for both the disk heartbeat and the network heartbeat.
There is a maximum allowed delay for each heartbeat: the delay of the network heartbeat is called
MC (misscount), and the disk heartbeat delay is called IOT (I/O timeout). Both parameters
are in seconds; by default misscount < disktimeout.
[grid@Linux-01 ~]$ crsctl get css misscount
CRS-4678: Successful get misscount 30 for Cluster Synchronization Services.
 
[grid@Linux-01 ~]$ crsctl get css disktimeout
CRS-4678: Successful get disktimeout 200 for Cluster Synchronization Services.
Eviction occurs when CSSD detects a heartbeat problem, i.e., when it loses communication
with another node or stops receiving heartbeat info from it; CSS then initiates node eviction.
Node eviction is used for I/O fencing the node, so that users doing I/O cannot
access the malfunctioning system, i.e., to avoid split-brain syndrome.
During node eviction the node is rebooted automatically, after which it tries to rejoin the
cluster.
From 12c onwards:
1. If the sub-clusters are of different sizes, the functionality is the same as
earlier: the bigger one survives and the smaller one is evicted.
2. If the sub-clusters have unequal node weights, the sub-cluster having the
higher weight survives so that, in a 2-node cluster, the node with the lowest
node number might be evicted if it has a lower weight.
3. If the sub-clusters have equal node weights, the sub-cluster with the lowest
numbered node in it survives so that, in a 2-node cluster, the node with the
lowest node number will survive.
The best part is that you can use the crsctl command to assign weights, instructing
clusterware to consider your preferences when making the eviction decision.
52. What is rebootless node fencing?
Prior to 11.2.0.2, if failures occurred in RAC components like the private interconnect
or voting disk accessibility, Oracle Clusterware performed a fast reboot of the node to
avoid split brain. The problem with a node reboot is that any non-cluster
processes running on the node are also aborted. Also,
with a reboot, the resources need to be remastered, which can be expensive.
And if there was only a temporary issue or blockage in the I/O, clusterware
could misjudge the situation and still initiate a reboot.
To avoid this, from 11.2.0.2 onward, this method has been improved and is known as
rebootless node fencing.
1. First, clusterware determines which node is to be evicted.
2. Then I/O-generating processes are killed on the problematic node.
3. Clusterware resources are stopped on the problematic node.
4. The OHASD process keeps running and continuously tries to start CRS, until the issue
is resolved.
 
But if, due to any issue, clusterware is unable to stop the processes on the problematic
node (i.e., rebootless fencing fails), then a fast reboot is initiated by CSSD.

53. In case of node eviction due to private interconnect failure in a 2


node/3 node RAC, how does Oracle decide which node to evict?
3 NODE RAC:
Let's say the nodes are A, B, and C, and the network heartbeat of node A fails. Node B
and node C won't be able to ping node A, but B and C can communicate with
each other. So B and C will each have 2 votes (one for the self ping and one for the ping to the other
node).
But A will have only one vote (the self ping). Since A has fewer votes, Oracle decides that
A needs to be evicted.
2 NODE RAC:
Let's say the nodes are A and B. If the network heartbeat fails, then A and B won't be able to ping
each other, so both A and B will have one vote each. So which node gets evicted? Here the
quorum disk comes into play. This quorum disk (voting disk) also represents one vote, so
both A and B will try to acquire that vote. Whichever node acquires the quorum vote gets 2 votes
and stays in the cluster, and the other one gets evicted.

54. How can we improve global cache performance?


1. We can increase the number of LMS processes, by increasing the
gcs_server_processes parameter.
2. We can set "_high_priority_processes"="LMS*|LGWR*" (an underscore parameter; change it only under Oracle Support guidance).
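A hedged sketch of raising the LMS count (the value 4 is illustrative; the parameter is gcs_server_processes and changing it requires an instance restart):

```sql
SHOW PARAMETER gcs_server_processes;
ALTER SYSTEM SET gcs_server_processes = 4 SCOPE = SPFILE SID = '*';
-- restart the instance(s) for the change to take effect
```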
55. What is gc block lost?

It indicates an issue with the interconnect. If a requested block is not received by the
requesting instance within 0.5 seconds, the block is considered to be lost.

56. What is MTU? What MTU is recommended in oracle


RAC?
MTU means maximum transmission unit.
Usually the standard is 1500 bytes, but a database block is typically 8K. So during block
transfer between nodes, one block cannot be transferred in a single packet; it gets broken into
pieces and these smaller packets are transferred.
So Oracle recommends using a 9000-byte MTU (jumbo frames) for the interconnect.
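A hedged sketch of verifying the interconnect MTU (the interface and host names are assumptions; ping -M do is Linux syntax that forbids fragmentation, so an ~8K payload only passes when jumbo frames are active):

```shell
ip link show ens225 | grep -o 'mtu [0-9]*'   # current MTU on the private NIC
ping -M do -s 8192 -c 3 node2-priv           # fails with "message too long" at MTU 1500
```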
 

57. Suppose you have only one voting disk. i.e the diskgroup on
which voting disk resides is in external redundancy. And if that
voting disk is corrupted. What will be your action plan?
 
     – stop crs on all the nodes ( crsctl stop crs -f )
     – start crs in exclusive mode on one of the nodes ( crsctl start crs -excl )
     – start the asm instance on that node using a pfile, as the asm spfile is inside an asm diskgroup
     – create a new diskgroup NEW_VOTEDG
     – move the voting disk to the NEW_VOTEDG diskgroup ( crsctl replace votedisk
+NEW_VOTEDG ) – it will automatically recover the voting disk from the latest OCR
backup
     – stop crs on that node
     – restart crs on all the nodes
     – start the cluster on all the nodes
 

58. Where is the OLR stored? When is the OLR backup created?


By default, OLR is located at Grid_home/cdata/host_name.olr
The OLR is backed up after an installation or an upgrade. After that time, you can only
manually back up the OLR. Automatic backups are not supported for the OLR.

59. Explain the backup frequency of OCR.


An OCR backup is created automatically every four hours.

ocrconfig -showbackup -> lists the available OCR backups.
ocrconfig -manualbackup -> takes a manual OCR backup.

60. Why we need odd number of voting disks in RAC?


A node must be able to access strictly more than half of the voting disks at any time. So
if you want to be able to tolerate a failure of n voting disks, you must have at least 2n+1
configured (n=1 means 3 voting disks).
So whether you have 3 disks or 4 disks, only the failure of 1 disk will be tolerated; the
even count just wastes a disk.
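The 2n+1 rule can be checked with simple arithmetic: with n voting disks, up to (n-1)/2 failures are tolerated.

```shell
# A strict majority of n disks must stay reachable, so up to (n-1)/2 may fail.
for n in 1 3 4 5 15; do
  echo "$n voting disk(s): tolerates $(( (n - 1) / 2 )) failure(s)"
done
```

Note that 3 and 4 disks tolerate the same single failure, which is why odd counts are recommended.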

61. How can I get the cluster name in RAC?


olsnodes -c

62. What is disktimeout and miscount ?


Misscount is the maximum network heartbeat delay. By default misscount is set to 30
seconds. If the nodes are unable to communicate with each other over the private
interconnect for 30 seconds (the misscount value), then node eviction will be initiated.
Disktimeout is the maximum voting disk heartbeat delay; the default is 200 seconds. If
CSSD is unable to write to more than half of the voting disks, then eviction will happen.
 

63. If the OLR file is missing, how can you restore it from backup?
# crsctl stop crs -f
# touch $GRID_HOME/cdata/<node>.olr
# chown root:oinstall $GRID_HOME/cdata/<node>.olr
# ocrconfig -local -restore $GRID_HOME/cdata/<node>/backup_<date>_<num>.olr
# crsctl start crs
64. Someone deleted the olr file by mistake and currently no
backups are available . What will be the impact and how can you
fix it?
If the OLR is missing and the cluster is already running, the cluster will keep running fine. But if
you try to restart it, it will fail.
So you need to do the below activities.
On the failed node:
# $GRID_HOME/crs/install/rootcrs.pl -deconfig -force
# $GRID_HOME/root.sh

65. Explain the steps for node addition in oracle rac.


 Run gridsetup.sh from any of the existing nodes, select the add node
option, and then proceed with the rest of the steps.
 Now extend the oracle_home to the new node using the addnode.sh script (from an
existing node).
 Now run dbca from an existing node and add the new instance.
Follow this below link for step by step details.
How to add a node in oracle RAC
66. Explain the steps for node deletion.
 Delete the instance using dbca
 Deinstall ORACLE_HOME using $ORACLE_HOME/deinstall
 Run gridsetup.sh and select the delete node option
Follow the below link for step by step details:
How to delete a node in oracle RAC
67. asm spfile location is missing inside gpnp profile, Then how
will asm instance startup?
For this, we need to understand the ASM spfile search order:
1. First it checks for the ASM spfile location inside the gpnp profile.
2. If no entry is found inside the gpnp profile, then it checks the default path
$ORACLE_HOME/dbs/spfile+ASM.ora, or a pfile.
68. How do you find out issues with the private interconnect?
You can use traceroute to check whether there is any issue with data transfer over the private interconnect.
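A few more hedged checks that are commonly useful (interface and host names are assumptions):

```shell
oifcfg getif                   # confirm which interface is registered as the cluster interconnect
ping -c 3 node2-priv           # basic reachability over the private network
traceroute node2-priv          # routing path; should be a single direct hop
netstat -su                    # look for UDP packet errors / receive drops
```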

69. How to apply patch manually in RAC?


First do the patch conflict check against the Oracle Home.
Then run rootcrs.sh -prepatch to unlock the CRS (without unlocking the CRS, the opatch utility
cannot do any modification to the grid home).
Then opatch apply (to apply the patch).
Then rootcrs.sh -postpatch to lock the CRS again.
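A hedged sketch of that flow (the staging path and patch id are assumptions):

```shell
# grid owner: conflict check
$GRID_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -ph /stage/45372828
# root: unlock the grid home
$GRID_HOME/crs/install/rootcrs.sh -prepatch
# grid owner: apply the patch
$GRID_HOME/OPatch/opatch apply /stage/45372828
# root: lock the grid home again and restart the stack
$GRID_HOME/crs/install/rootcrs.sh -postpatch
```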
70. Let's say you applied a patch on node 2 and ran rootcrs.sh -
postpatch, and now it shows a patch mismatch. But when you checked
the oracle inventory (opatch lsinventory), the patches are the same across
both nodes. Then what will you do?

In this case, you can run the kfod command to find the missing patch.
Action plan:
Run kfod op=PATCHES on all the nodes and see whether any patch is missing on a node.
Let's say you found that patch 45372828 is missing on node 2. Then
on node 2, as the root user, run the below command:
root# $GRID_HOME/bin/patchgen commit 45372828
After that you can run the below commands to verify whether the patch level is the same:
kfod op=PATCHLVL
kfod op=PATCHES
After the confirmation, you can run rootcrs.sh -patch.

71.How you troubleshoot, if the cluster node gets rebooted.


72. In a 12c two node RAC, what will happen if I unplug the
network cable for the private interconnect?
Rebootless node fencing will happen, i.e., on the node which is going to be evicted, all
cluster services will be brought down, and the services will be moved to the surviving
node. CRS will keep attempting a restart until the private interconnect
issue is fixed. Please note – the node will not be rebooted; only the cluster services will go
down.
However, prior to 11.2.0.2, in this situation a node reboot would occur.

73.In a rac system , What will happen if i kill the pmon process?
The pmon will be restarted automatically.

74. Can we see DRM (Dynamic Resource Mastering) related


information in oracle RAC?
Yes, we can see DRM-related data in gv$gcspfmaster_info by passing the object id.
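A hedged query sketch (in this view the object is identified by its data object id; column names may vary slightly by version):

```sql
-- Objects whose mastership has moved between instances
SELECT data_object_id, current_master, previous_master, remaster_cnt
  FROM gv$gcspfmaster_info
 ORDER BY remaster_cnt DESC;
```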
 

75. What is Grid infrastructure Management Repository(GIMR)?


Grid Infrastructure Management Repository (GIMR) is a centralised infrastructure
database for diagnostic and performance data and resides in Oracle GI Home. It is a
single instance CDB with a single PDB and includes partitioning (for data lifecycle
management).

76. What is Rapid Home Provisioning?


77. If MGMTDB is not coming up for any reason, then what will be
the impact on the existing databases?
No impact on the existing databases; it will just give a warning.

78. What is the maximum number of voting disks we can


configure?
We can configure up to 15 voting disks (from 11g onward).

79. What is node weightage?


Prior to 12cR2, during node eviction, the node with the lower node number (i.e., the node that joined
the cluster first) survives.
But in 12cR2 the node weightage concept was introduced, i.e., the node running more
services or workload will survive the eviction.
There is also an option to assign weight to a database or service by setting
-css_critical YES via srvctl.
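A hedged sketch of both ways to influence the decision (the database name is an assumption; the server-level command runs as root and takes effect after a stack restart):

```shell
srvctl modify database -db proddb -css_critical YES   # favor the node running this DB
crsctl set server css_critical YES                    # mark the whole server as critical
```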

80. OCR file has been corrupted, there is no valid backup of OCR.
What will be the action plan?
In this case, we need to deconfigure and reconfigure the cluster:
deconfig can be done using the rootcrs.sh -deconfig option,
and reconfig can be done using the gridsetup.sh script.

80. Suppose someone has changed the permission of files inside


grid_home. How will you fix it?
You can run the rootcrs.sh -init command to revert the permissions.
# cd <GRID_HOME>/crs/install/
# ./rootcrs.sh -init
Alternatively, you can check the below files under <GRID_HOME>/crs/utl/<hostname>/:
– crsconfig_dirs, which has all directories listed in <GRID_HOME> and their permissions
– crsconfig_fileperms, which has the list of files, their permissions, and their locations in
<GRID_HOME>.

81. Can i have 7 voting disks in a 3 node RAC? Let’s say in your
grid setup currently only 3 voting disks are present. How can we
make it 7?
82. I have a 3 node RAC. where node 1 is master node. If node 1 is
crashed. Then out of node2 and node3 , which node will become
master?
83. Is dynamic remastering good or bad?
84. What will happen if I kill the crs process in oracle rac node?
85. What will happen if I kill the ohasd process in oracle rac node?
86. What will happen if I kill the database archiver process in
oracle rac node?
It will be restarted.

87. What is application continuity  and transactional dataguard in


oracle rac?
88. What is this recovery buddy feature in oracle 19c?
Usually, when an instance crashes in RAC, one node is elected among the surviving
nodes to do the recovery, and that elected node reads the redo logs of the
crashed instance and performs the recovery.
However, in 19c, each instance is the recovery buddy of another instance, e.g.:
Instance A is the recovery buddy of instance B.
Instance B is the recovery buddy of instance C.
Instance C is the recovery buddy of instance A.
The buddy instance tracks the block/redo changes of its mapped instance
and keeps them in its SGA (in a hash table).
So the recovery buddy feature helps reduce recovery time, as it eliminates the election
and redo-read phases.

89. What is nodeapps?


Nodeapps are a standard set of Oracle application services which are started automatically
for RAC.
Nodeapps include: VIP, network, admin helper, and ONS.

90. CSSD is not coming up ? What you will check and where you
will check.
1. The voting disk is not accessible.
2. There is an issue with the private interconnect.
3. The AUTO_START attribute is set to NEVER on the ora.cssd resource. (To fix the issue,
change it to always using crsctl modify resource.)

91. How you check the cluster status?


crsctl stat res -t
crsctl check crs
crsctl stat res -t -init

92. crsctl stat res -t -init command, what output it will give?
93. What are the different types of heart beats in Oracle RAC?
There are two types of heart beat.
Network heartbeat is across the interconnect. Every second, a sending thread of
CSSD sends a network TCP heartbeat to itself and all other nodes, and a receiving thread
of CSSD receives heartbeats. If network packets are dropped or corrupted, the
error-correction mechanism of TCP retransmits the packet; Oracle does
not retransmit in this case. In the CSSD log, you will see a WARNING message about a
missing heartbeat if a node does not receive a heartbeat from another node for 15
seconds (50% of misscount). Another warning is reported in the CSSD log if the same node is
missing for 22 seconds (75% of misscount), and similarly at 90% of misscount; when
the heartbeat has been missing for 100% of the misscount (i.e., 30 seconds by
default), the node is evicted.
Disk heartbeat is between the cluster nodes and the voting disk. The CSSD process on each
RAC node maintains a heartbeat in a block of one OS block size, at a specific offset in the
voting disk, via read/write system calls (pread/pwrite). In addition to maintaining its
own disk block, each CSSD process also monitors the disk blocks maintained by the CSSD
processes running on the other cluster nodes. The written block has a header area with the
node name and a counter which is incremented with every beat (pwrite) from the
other nodes. If a node has not written a disk heartbeat within the I/O timeout, the node is declared
dead. Nodes that are in an unknown state, i.e., cannot be definitively said to be dead, and
are not in the group of nodes designated to survive, are evicted: the node's kill block
is updated to indicate that it has been evicted.
Reference – https://databaseinternalmechanism.com/oracle-rac/network-disk-
heartbeats/
[grid@Linux-01 ~]$ crsctl get css misscount
CRS-4678: Successful get misscount 30 for Cluster Synchronization Services.
[grid@Linux-01 ~]$ crsctl get css disktimeout
CRS-4678: Successful get disktimeout 200 for Cluster Synchronization Services.
 

94. What is preferred and available in a service in RAC?


95. Suppose I am running a insert statement by connecting to a
database using service . And if that instance is crashed, then what
will happen to the insert statement?
96. can there be gc-4-way wait events in 4 node rac?
97. suppose in a 5 node rac, i ran srvctl stop database -d
<TESTDB>, then how this information will be conveyed to other
nodes? Provide the sequence of steps it will happen after this
step.
98. Can there be multiple consistent read blocks?

Overview:
Below are the questions asked to a candidate with 10 years of experience in Oracle DBA
and postgres . There were 2 rounds.
 

Round 1 :
  What are your day to day activities, and what is your role in the team?
 Explain the cluster startup sequence.
 Suppose users are connected to node 1, and the OS team wants to do emergency
maintenance and needs to reboot. What will happen to the transactions
on node 1, and can we move them to node 2?
 What are the advantages of partitioning?
 Can we convert non-partitioned table to partitioned table online( in
production)?
 What are the cloning methods. And what is the difference between duplicate
method and restore method?
 How can we open the standby database as read write for testing and Once
testing is done need to revert to previous position. But without using
snapshot command?
 Explain how you did grid upgrade.
 Which advanced security features you have used ?
 What is audit purging?
 Lets says a query was running fine for last few days , but today it is not
performing well. How you will troubleshoot it?
 What is oracle in memory ?
 Explain what type of patches you applied on grid.
 As per your resume, you have worked on Ansible. How good are you with
Ansible, and what type of scripts have you written?
 How can you troubleshoot a shell script?

2nd round:
 Explain ASM architecture , means when ever user runs a query, how it gets
the data from ASM.
 Why is ASM performance better?
 Why is VIP used? Before 10g VIP was not used; what benefits does VIP
provide?
 What is Current block and CR block in RAC
 What wait events are there in RAC??
 What is gc buffer busy wait
 What is gc 2 way current wait, gc 3 way current wait.
 What is node eviction. In which cases it happens.
 How you do load balancing using services in your project. Explain
 What activities you did using goldengate
 How you did upgrade/migration using goldengate.
 What is goldengate instantiation.
 What is bounded recovery in goldengate
 What is the difference between CSN in goldengate vs SCN in database?
 Which type of os tool/commands you have used for os
monitoring/performance. And how you read it.
 How you troubleshoot I/O issues in oracle
 What type of SQL performance tools do you use?
 How you migrate a plan from one db to another db
 What is the dbms package for baseline and sql tuning?
 Lets say, despite importing the baseline, query is not picking the plan, What
can the be reason?
 What is index fast full scan and index range scan?
 What is the purpose of standby redolog.
 Any automation tools you have used?
 Did you worked on jenkins or gitlab?
 What activities you did in postgres.
 How you implement backup in postgres.
 Is your postgres cluster active active or active passive?
 Can we create a active active postgres cluster?
 Do you remember any postgres extensions?
 What type of scripts you have written on ansible?
 Tell me some modules of ansible.
 How can we encrypt the server credentials in ansible.

First round:
oracle dba question
 What happens during hot backup?
  Why hot backup generates lot of redo?
  Difference between smallfile tablespace and bigfile tablespace. What are the pros and
cons of each type of tablespace?
  Configure controlfile autobackup on – what is its use, and is it mandatory?
  Can I drop the controlfile of asm instance
  How asm interacts with database
  How I improve the i/o performance between asm and database instance
  Difference between expired and obsolete
  What is partition pruning?
  Can I restore a database from obsolete backup?
  Difference between retention redundancy 7 and a recovery window of 7 days
  What is the role of the background process LMON?
  What is i/o fencing
  How can you find the master node in oracle RAC?
  Between raw device and ASM, whose performance is better?
  Can I have disks of different sizes in a diskgroup?
  What is the minimum number of diskgroups in RAC and why?
  A user is complaining that he is getting a blocked session in RAC, but when we
checked by connecting to that instance, we don't see any blocking
session. Even the sid is not available. What could be the issue?
  What does g stand for in gv$sql?
  Different parameters in an ASM instance
  What is hugepages and how it is related to RAC?
  What is the default pagesize in linux
  Where can you find the oratab entry in sun solaris.
  Can I move the oratab entry to /etc/oratab
  What is ACMS?
  Are you aware of hangcheck?
  How comfortable you are with shell scripting?
  Yesterday the RMAN backup ran for 10 min, but today it is taking more than 1
hour and is still running. Note – the database size is the same, and load is normal.
  Do you know what is hangcheck timer
  What is load balancing advisory?
Postgres question:
 What is ctid in postgres
  I want to rebuild all the indexes in postgres. How can I do it?
  How can we reset a password in postgres?
  How can you check the version in postgres?
  Explain datatypes in postgres
  What type of querying language postgres support?
  What is the latest postgres version. And which postgres version currently
you are using?
  Which tools do you use to manage and monitor postgres?
  Compare postgres and oracle
  While dropping a database in postgres, I am getting error and unable to
drop. What could be the issue?
 
2nd round:
ROUND 1: ( Technical Round)
 Background processes in oracle database
  Explain about dbwr and lgwr
  What is the shared pool? What does it do?
  What is the role of PGA
  What it contains?
  What is the persistent area in PGA
  Explain how update statement works.
  Let's say one user is updating and another one is selecting. How will the select get the
data?
  What is touch count, and related questions
  What redo buffer contains.
  How recovery happens?
  What type of issues you face in oracle database
  How can we control the DB writer processes? What should be the value of
db_writer_processes?
  Different status of buffer in buffer cache.
  Different status of redo logfile
  Can I drop an online redo log from oracle?
  What core dba issues you face
  How transaction recovery happens
  How update statement work flow happens
  RAC STARTUP SEQUENCE.
  DIFFERENT BACKGROUND PROCESSES in RAC.
  Role of ocr, vd.
  How many voting disks for an 8-node RAC?
  Which files are read while starting the cluster, and what happens next?
  Why OLR  is required?
  What is gpnp profile
  Role of LMS, LMON, LMD, LCK,
  What is dynamic remastering
  What happens during instance reconfiguration
  Which process responsible for instance reconfig
  What is GCS,GES and GRD and which processes are responsible for this.
  What is past image
  Instance recovery in RAC
  Which process does node eviction? Which node gets evicted?
  Different protection modes in dataguard
  Which process gets the data from primary to standby
  Can we convert protection mode
  AFFIRM and NOAFFIRM
  What type of issues come in standby dataguard
  What are some parameters in dataguard
  What is fal_server
  What is log_archive_dest_1 and log_archive_dest_2?
  What is db_name and db_unique_name
  New features of oracle 19c dataguard
  What is far sync
  What is db_flashback_retention_target
  Different between dataguard and active dataguard
  Difference between force logging and supplemental logging
  Difference between classic and integrated
  What is the parallel integrated apply
  What is the coordinated integrated apply
  What is handle collision
  What is the common issues in goldengate and how you handle it
  What is discard file
  Where LCRs are stored
  Let's say the processes limit was reached and you are unable to log in with sysdba. What
will you do?
  What is huge pages and why we need to enable hugepages.
  If database is running slow, what are the things we need to check.
  Why is dbwr called the lazy writer?
 What is the voting disk timeout value?
ROUND 2:( Technical Round)
 How select statement processing happens?
 How insert statement processing happens?
 Explain cache fusion
 Explain cluster startup sequence
 How asm gets started
 Explain flex asm
 How to enable flex asm
 What is asm proxy?
 How you can recover undo tablespace corruption.
 While installing grid, what happens when you do root.sh script.
 In flex asm , how database connects with asm
 If you lost your OLR , How will you troubleshoot?
 If your cluster node gets rebooted, which logs will you usually observe?
 Explain how you apply patch manually in RAC
 Let's say you applied a patch on node 2 and ran rootcrs.sh -post, and now
the command is not coming out and errors with a patch mismatch. When you
checked the oracle inventory, you found that the patches are the same. How will you
troubleshoot it?
 Explain steps for node addition
 Explain cache fusion. Which processes are responsible for cache fusion?
 Explain write write scenarios in cache fusion
 What is dynamic resource remastering? Is it good or bad?
 If I remove the entry for spfile in gpnp profile, then will the crs start?
 How you check the private interconnect issues.
 How to do db upgrade.
 What happens during db upgrade
 How many phases are there in db upgrade
 If I don’t do the timezone upgrade, will the database work.
 What is far sync?
 Which process send the redologs to standby
 Which process received the data in standby
 If block corruption happens in standby, how will you recover it?
 What is multi instance redo apply . And how I can enable multi instance
redo apply in dataguard.
 What is sql quarantine .
ROUND 3 (Managerial Round):
Mostly questions about:
 The candidate's education and work experience history.
 Some high-level questions on database migration.
 High-level questions on RAC node eviction.
 Questions about real-time RAC issues faced in the past.
 Questions about certifications.
 Whether there is any knowledge of Oracle Cloud or other technologies.
 Whether there is any knowledge of Exadata.
 How good you are at WebLogic troubleshooting.
 Some discussion about salary and work-related topics.

 In Ansible, what type of activities have you done?
 What is a role in Ansible?
 How can you print the output of a task in Ansible?
 How do you handle errors in Ansible?
 Explain the Postgres architecture.
 How does a connection get established in Postgres?
 What are the different components?
 Where is the data stored in Postgres?
 What are vacuum and vacuum full, and what is the difference between them?
 What is MVCC? What are its advantages and disadvantages?
 What is the visibility map?
 What happens in the backend when we do vacuuming?
 What is the difference between awk and sed?
 What is the use of xargs?
 What is tee?
 How to list all the files inside a directory and its sub-directories recursively?
 Which MongoDB versions have you worked with?
 What are some parameters in the mongod.conf file?
 What is bindIp in mongod.conf?
 What are the different protection modes?
 What are the different types of standby databases?
 What is a snapshot standby database and how does it work?
 If the standby is set up in maximum protection mode and the standby goes
down, what will happen?
 How to fix a missing archive log on the standby?
 Explain the different migration processes.
 How can you do a cross-platform migration?
 How did you do a migration using transportable tablespaces?
 What is endian format?
 What are the different methods of cache fusion?
 Which processes are responsible for cache fusion?
 Where is the voting disk path stored?
 How can I read the GPnP profile?
 What is the OLR, and which data is stored in it?
 Why is the voting disk path not present in the OLR?
 In RAC, one node is down and you get an error like "VIP already in use".
What might be the issue?
 If a node is not coming up, how do you troubleshoot it?
 What RAC scenarios have you faced in your environment?
 What proactive tasks have you done in your environment to improve
performance?

RMAN Interview QA

1. What is the difference between an expired and an obsolete backup?


Expired means the backup piece is no longer available at its physical location
(the RMAN repository still lists it, but a crosscheck could not find the file).
Obsolete means the backup piece is no longer required according to the
configured retention policy.

2. What is block change tracking?


Usually, when we take an incremental backup, RMAN has no information about
which blocks have changed since the last backup, so it scans all the blocks
(all the data) and then takes the necessary backup. Scanning all the blocks
takes a lot of time when the DB size is large.
But if we enable block change tracking, RMAN knows which blocks got changed and
backs up just those blocks, without scanning all the blocks.
3. How block change tracking works internally?
The block change tracking file stores the changes as a bitmap: there is one bit
for each chunk of 32K (i.e., 4 contiguous blocks of 8K). So even if only one
block out of those 4 blocks changes, the bit changes too; RMAN considers the
whole chunk as changed, and the incremental backup includes it.
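The mechanics above can be tried with the following commands (the tracking file path is illustrative, not from the source):

```sql
-- Enable block change tracking; the file path is an example.
ALTER DATABASE ENABLE BLOCK CHANGE TRACKING
  USING FILE '/u01/app/oracle/bct/change_tracking.f';

-- Verify the status of change tracking.
SELECT status, filename FROM v$block_change_tracking;

-- Disable it again if needed.
ALTER DATABASE DISABLE BLOCK CHANGE TRACKING;
```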

4. What is the difference between a differential backup and a
cumulative backup in incremental backup?
Differential Backup:
A differential backup backs up all blocks changed after the most recent
incremental backup at level 1 or 0.
Cumulative Backup:
A cumulative backup backs up all blocks changed after the most recent
incremental backup at level 0 (i.e., the full backup).
So a cumulative backup will be larger, but restore/recover will be quicker.
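As a sketch, the two flavours are selected with the CUMULATIVE keyword in RMAN:

```sql
-- Level 0 base backup.
BACKUP INCREMENTAL LEVEL 0 DATABASE;

-- Differential level 1 (the default): blocks changed since the most
-- recent level 1 or level 0 backup.
BACKUP INCREMENTAL LEVEL 1 DATABASE;

-- Cumulative level 1: blocks changed since the most recent level 0 backup.
BACKUP INCREMENTAL LEVEL 1 CUMULATIVE DATABASE;
```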
5. Can we take RMAN backup when the database is down?
For an RMAN backup, the database has to be in mount or open state, so RMAN
cannot back up a completely shut-down database. An RMAN "cold" backup is taken
in mount state; only a pure OS-level cold backup requires the database to be
shut down completely.

6. What happens when we put the database in hot backup mode, i.e.,
alter database begin backup? Why does it generate a lot of redo?
 DBWn checkpoints the tablespace (writes out all dirty blocks as of a given
SCN)
 CKPT stops updating the Checkpoint SCN field in the datafile headers and
begins updating the Hot Backup Checkpoint SCN field instead
 LGWR begins logging full images of changed blocks the first time a block is
changed after being written by DBWn
Why so much redo?

Full block image logging during backup eliminates the possibility that the backup will
contain unresolvable split blocks. To understand this reasoning, you must first
understand what a split block is. Typically, Oracle database blocks are a multiple of O/S
blocks. For example, most Unix filesystems have a default block size of 512 bytes, while
Oracle’s default block size is 8k. This means that the filesystem stores data in 512 byte
chunks, while Oracle performs reads and writes in 8k chunks or multiples thereof. While
backing up a datafile, your backup script makes a copy of the datafile from the
filesystem, using O/S utilities such as copy, dd, cpio, or OCOPY. As it is making this copy,
your process is reading in O/S-block-sized increments. If DBWn happens to be writing a
DB block into the datafile at the same moment that your script is reading that block’s
constituent O/S blocks, your copy of the DB block could contain some O/S blocks from
before the database performed the write, and some from after. This would be a split
block. By logging the full block image of the changed block to the redologs, Oracle
guarantees that in the event of a recovery, any split blocks that might be in the backup
copy of the datafile will be resolved by overlaying them with the full legitimate image of
the block from the archivelogs. Upon completion of a recovery, any blocks that got
copied in a split state into the backup will have been resolved by overlaying them with
the block images from the archivelogs. All of these mechanisms exist for the benefit of
the backup copy of the files and any future recovery. They have very little effect on the
current datafiles and the database being backed up. Throughout the backup,
server processes read datafiles and DBWn writes to them, just as when a backup
is not taking place. The only difference in the open database files is the
frozen Checkpoint SCN and the active Hot Backup Checkpoint SCN.
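For reference, a user-managed hot backup looks like this (the tablespace name and copy destination are illustrative):

```sql
ALTER TABLESPACE users BEGIN BACKUP;
-- Copy the datafiles with an OS utility while backup mode is active,
-- e.g. from a shell: cp /u01/oradata/users01.dbf /backup/
ALTER TABLESPACE users END BACKUP;

-- Or for the whole database at once:
ALTER DATABASE BEGIN BACKUP;
ALTER DATABASE END BACKUP;
```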
7. What is a snapshot control file?
The snapshot control file is a copy of the control file that RMAN creates
before the actual backup starts, in order to have a read-consistent view of the
control file during the backup and during a resync catalog.
This snapshot ensures that the backup is consistent to a point in time. So if
you add a tablespace after the backup has started, the new file will not be
backed up.

8. What is the difference between the validate and crosscheck
commands?
VALIDATE BACKUPSET - checks whether the backup sets can be restored or not.
CROSSCHECK - goes through the headers of the specified files to check whether
they still exist on disk or tape.

9. what is the use of CONTROL_FILE_RECORD_KEEP_TIME?


This parameter defines for how many days we want to keep the backup records
(reusable) in the controlfile.
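For example (the value 30 is just an illustration):

```sql
-- Show the current setting.
SHOW PARAMETER control_file_record_keep_time;

-- Keep reusable backup records in the controlfile for 30 days.
ALTER SYSTEM SET control_file_record_keep_time = 30 SCOPE=BOTH;
```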

10. Should we set "configure controlfile autobackup" to ON or OFF,
and what is its significance? What is the default value?

It should always be ON, so that RMAN automatically backs up the controlfile and
spfile after each backup and after structural changes. The default value is
OFF.
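A minimal configuration sketch (the format path is illustrative):

```sql
-- Enable automatic controlfile/spfile backup after each backup
-- or structural change.
CONFIGURE CONTROLFILE AUTOBACKUP ON;

-- Optionally control where the autobackup goes.
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/backup/cf_%F';
```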

11.Can we restore a database from obsolete backup?


Yes, we can restore a database from an obsolete backup, provided the backup
pieces still exist physically. For that we need to catalog those files
explicitly.
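A sketch of cataloging backup pieces back into the repository (the directory and file names are examples):

```sql
-- Register all backup pieces found under the given directory.
CATALOG START WITH '/backup/old_backups/' NOPROMPT;

-- Or catalog a single backup piece.
CATALOG BACKUPPIECE '/backup/old_backups/db_full_01.bkp';
```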

12. How can we take an RMAN backup in parallel?

Yes, we can: either by allocating multiple channels or by configuring the
parallelism setting for the device type.
 
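Both approaches, sketched:

```sql
-- Persistent setting: 4 channels for disk backups.
CONFIGURE DEVICE TYPE DISK PARALLELISM 4;

-- Or allocate channels explicitly inside a run block.
RUN {
  ALLOCATE CHANNEL c1 DEVICE TYPE DISK;
  ALLOCATE CHANNEL c2 DEVICE TYPE DISK;
  BACKUP DATABASE;
}
```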
13. What are the different types of retention policy? Explain about
them.
Recovery Window based retention policies.
In the recovery window, Oracle checks the current backup and looks for its relevance
backwards in time.
Lets say you have set recovery window of 7 days. So 7 days  recovery window doesn’t
mean that it will delete copies older than 7 days. It will retain the backups in such a
manner that, you should be able to recover your database to any point in last 7 days.
Example –  > configure retention policy to recovery window of 7 days;
Here our recover window is 7 days .
Now assume that you started on July 1st and have taken backups on the 8th, 15th, 22nd
and 29th of July. Assuming the current date is the 25th of July, according to the recovery
window of seven days, the point of recoverability goes up to the 18th of July. This means
that to ensure the recoverability, the backup taken on 15th will be kept by Oracle so that
you can recover up to that point.
Redundancy Based retention policies:
A redundancy-based retention policy specifies how many backups of each datafile must
be retained. After that the older backup will be obsolete.

CONFIGURE RETENTION POLICY TO REDUNDANCY 2;

No retention policy:
Means the backups will not be obsolete at all.
CONFIGURE RETENTION POLICY TO NONE;

14. What happens when we open the database with resetlogs?

The database is opened as a new incarnation: the log sequence number is reset
to 1, the online redo logs get a new timestamp and SCN, and a new incarnation
record is registered in the controlfile (and in the recovery catalog, if one is
used).

15. What is an incarnation number?


When we open the database with resetlogs, a new incarnation number is created;
we can say a new version of the database is created.
The log sequence number resets to 1 and the online redo logs get a new
timestamp and SCN.

16. What is the purpose of the resync catalog command?

It synchronizes the recovery catalog with the target database controlfile, so
that new backups, archived logs, and structural changes recorded in the
controlfile are reflected in the catalog.
17. What is the meaning of the setting "configure backup
optimization on" in RMAN?
With backup optimization ON, RMAN skips files that are identical to files
already backed up to the same device type, for example read-only tablespace
datafiles and archived logs that are already backed up.
18. Difference between backup piece and backup set?
Backup Set
Logical structure where the backup is stored. It is a logical container. A backup set can
store one or multiple database files, spfiles, control files, etc. Do not think physically. One
backup set can be stored in one or multiple files. Each of those files is called a backup
piece. Usually, one backup set has only one backup piece.
Backup Piece
Physical structure where the backup is stored. A backup set is composed of one
or more physical binary pieces. If you back up to disk, each file generated is
a backup piece.

19. How can I take an RMAN backup to multiple directories in the
local file system?
Yes, we can do that, by allocating channels with different FORMAT destinations.
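One way to do this is to give each channel its own FORMAT destination (the paths are illustrative):

```sql
RUN {
  ALLOCATE CHANNEL c1 DEVICE TYPE DISK FORMAT '/backup1/db_%U';
  ALLOCATE CHANNEL c2 DEVICE TYPE DISK FORMAT '/backup2/db_%U';
  BACKUP DATABASE;
}
```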

20. What is the difference between MAXPIECESIZE and MAXSETSIZE?

MAXPIECESIZE - limits the size of each backup piece.
MAXSETSIZE - the maximum size of a backup set. Please remember, the MAXSETSIZE
value should be larger than the size of your largest data file.

21. What are the things we can do to improve the performance of an
RMAN backup?

Use more channels, parallelism, and multi-section backups (SECTION SIZE).

22. Let's say there is a requirement that your daily RMAN backup
must complete within 40 minutes, and if it exceeds that, the backup
should be stopped automatically. Can we do that?
Yes, we can use the DURATION clause:

backup duration 00:40 database;

23. How can I take an RMAN backup of the spfile?

RMAN> backup spfile;
Also, when controlfile autobackup is ON, the spfile is backed up automatically
along with the controlfile.

24. What is an image copy in rman?


An image copy is an exact copy of a datafile, including the free space. Image
copies are not stored in RMAN backup pieces but as actual datafiles, and are
therefore bit-for-bit copies.
This is usually helpful when you want to move a database from a non-ASM to an
ASM file system.
25. How can we recover from loss of online redolog?
In this case, first you need to find the status of the redo log group that is
lost/corrupt (CURRENT, ACTIVE, or INACTIVE).
Your action plan depends upon that redo log group's status.

26. What is incomplete recovery? How does it work? What are the
different types of incomplete recovery scenarios?
Incomplete recovery means you don't have (or don't want to apply) all the redo
or archive logs needed to recover all the committed transactions, i.e., you
recover the database to a specific point in time in the past. Incomplete
recovery is therefore also called point-in-time recovery.
We do incomplete recovery in 2 cases:
 You don’t have all the redo required to perform a complete recovery. You’re
missing either archived redo log files or online redo log files (current or
unarchived) that are required for complete recovery. This situation could arise
because the required redo files are damaged or missing.
 You purposely want to roll the database back to a point in time. For example,
you would do this in the event somebody accidentally truncated a table and
you intentionally wanted to roll the database back to just before the truncate
table command was issued.

27. Can we restore one table or table partition from rman backup?
Explain how it works internally?
Yes, from 12c onwards we can do it with the RECOVER TABLE command. Internally,
RMAN creates an auxiliary instance, restores the required tablespaces into it,
exports the table with Data Pump, and imports it into the target database.
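A sketch of the 12c command (the table name, timestamp, and auxiliary destination are illustrative):

```sql
RECOVER TABLE scott.emp
  UNTIL TIME "to_date('2023-01-10 09:00:00','yyyy-mm-dd hh24:mi:ss')"
  AUXILIARY DESTINATION '/u01/aux'
  REMAP TABLE scott.emp:emp_recovered;
```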
28. When we duplicate a database using the RMAN active duplication
method, does the DBID get changed for the new database?
Yes, RMAN DUPLICATE assigns a new DBID to the duplicated database (except when
duplicating for a standby, where the DBID stays the same).
29. Can we change the DBID to a value of our own choice?
The DBNEWID (nid) utility can change the DBID, but the new DBID value is
generated by the utility; it is the database name that can be set to a value of
our choice.

30. What is the use of the NOFILENAMECHECK option in the RMAN
duplicate command?

NOFILENAMECHECK: prevents RMAN from checking whether the datafiles and online
redo log files of the source database are in use when the source database files
share the same names as the duplicate database files. You are responsible for
determining that the duplicate operation will not overwrite useful data. This option is
necessary when you are creating a duplicate database in a different host that has the
same disk configuration, directory structure, and filenames as the host of the source
database. For example, assume that you have a small database located in the /dbs
directory of host1:
/oracle/dbs/system_prod1.dbf
/oracle/dbs/users_prod1.dbf
/oracle/dbs/rbs_prod1.dbf
Assume that you want to duplicate this database to host2, which has the same file
system /oracle/dbs/*, and you want to use the same filenames in the duplicate database
as in the source database. In this case, specify the NOFILENAMECHECK option to avoid
an error message. Because RMAN is not aware of the different hosts, RMAN cannot
automatically determine that it should not check the filenames. If duplicating
a database on the same host as the source database, make sure that
NOFILENAMECHECK is not set; otherwise, RMAN may signal an error.

31. Suppose you are running an active duplication from a database
of 40 TB, and after 95 percent of the cloning is completed, it fails
due to a network error. Do we need to start the cloning again from
the start?
32. We need to duplicate a bigfile tablespace of size 500 GB. Can
we use multiple channels to improve the speed? Is there any other
option to improve the performance?
By default, only one channel works on each datafile during duplication. So even
if we allocate 5 channels, only one channel will process the single bigfile
datafile, and the extra channels will not improve the speed.
In this case we can use the SECTION SIZE attribute in the duplicate command.
Let's say the section size we defined is 1 GB and we have 10 channels. During
duplication the datafile will be broken into sections of 1 GB, and each channel
will process one section (i.e., all 10 channels will be active in parallel,
each processing 1 GB of data).
COMMAND ->
RMAN> duplicate target database to test_db from active database section size 1024M;

33. Difference between restore and recover?


 Restore: It is the act that involves the restoration of all files that will be
required to recover your database to a consistent state, for example, copying
all backup files from a secondary location such as tape or storage to your
stage area
 Recovery: It is the process to apply all transactions recorded in your archive
logs, rolling your database forward to a point-in-time or until the last
transaction recorded is applied, thus recovering your database to the point-
in-time you need
34. Someone deleted some data from a table 5 mins back.
Database is archivelog mode , But flashback mode is not enabled.
Can we retrieve the table data ?
Yes, we can recover that data using Flashback Query / Flashback Table, which
read from undo, as long as the undo data has not yet been overwritten (i.e.,
within the undo retention period).
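A sketch using Flashback Query and Flashback Table against undo (the table name and interval are illustrative):

```sql
-- Look at the data as it was 5 minutes ago.
SELECT * FROM scott.emp AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '5' MINUTE);

-- Or flash the whole table back (requires row movement).
ALTER TABLE scott.emp ENABLE ROW MOVEMENT;
FLASHBACK TABLE scott.emp TO TIMESTAMP (SYSTIMESTAMP - INTERVAL '5' MINUTE);
```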

35. Can I take the full backup from the primary and the incremental
backup from the standby?

Yes, we can take backups from both the standby and the primary database, as
both databases are identical and have the same DBID.

36. Have you heard of the Zero Data Loss Recovery Appliance (ZDLRA)?

37.Can we duplicate a database by skipping few tablespaces?
Yes, we can skip a few tablespaces, but we cannot skip SYSTEM, SYSAUX, UNDO, or
any tablespace containing SYS objects.

38. For a schema, its tables are under tablespace DATA and its
indexes are under tablespace IDX. Now if I run RMAN duplicate
skipping the IDX tablespace, what will happen? Will it work?
We cannot skip the IDX tablespace, as the set is not self-contained: there is a
dependency between objects in the DATA and IDX tablespaces. So we either need
to skip both or duplicate both.

39.How can we check the integrity of rman backup?


The VALIDATE command can be used to check the integrity of an RMAN backup.
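Typical validation commands, as a sketch (the backup set number is illustrative):

```sql
-- Check that the backups needed to restore the database are usable.
RESTORE DATABASE VALIDATE;

-- Validate a specific backup set.
VALIDATE BACKUPSET 5;

-- Scan datafiles and archive logs for corruption without producing output.
BACKUP VALIDATE DATABASE ARCHIVELOG ALL;
```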

40. What are the different encryption methods offered by RMAN?

Transparent encryption (using the Oracle keystore/wallet), password-based
encryption, and dual mode (a combination of both).
41.Different between backupset and image copy?
42. What is the difference between a consistent backup and an
inconsistent backup?
Consistent Backup
A consistent backup of a database or part of a database is a backup in which all
read/write datafiles and control files are checkpointed with respect to the same system
change number (SCN). Oracle determines whether a restored backup is consistent by
checking the datafile headers against the datafile header information contained in the
control file
Inconsistent Backup
An inconsistent backup is a backup in which all read/write datafiles and control files have
not been checkpointed with respect to the same SCN. For example, one read/write
datafile header may contain an SCN of 100 while other read/write datafile headers
contain an SCN of 95 or 90. Oracle cannot open the database until all of these header
SCNs are consistent, that is, until all changes recorded in the online redo logs have been
applied to the datafiles on disk.

43. What are RTO and RPO?

RPO (Recovery Point Objective) is the maximum amount of data loss that is
acceptable, measured in time. RTO (Recovery Time Objective) is the maximum time
allowed to restore the service after a failure.
44. Currently we have 30 days of archive logs present, so we used
"delete archivelog all completed before 'sysdate-3'" to delete files
older than 3 days. But this command is not deleting files older than
14 days. Why?
You need to check your CONTROL_FILE_RECORD_KEEP_TIME. If it is set to 14 days,
then only 14 days of archive log information will be stored in the controlfile;
records for older archive logs are aged out of the controlfile.
RMAN gets its data from the controlfile, so since only 14 days of information
is present there, the RMAN delete command ignores files older than 14 days.
The default value of CONTROL_FILE_RECORD_KEEP_TIME is 7 days.

45. Explain how RMAN works internally when you run an RMAN
database backup.
Let's say you connect with RMAN to the target DB and run "backup database":
rman target /
RMAN> backup database;
1. RMAN makes a bequeath connection with the target database.
2. It then connects as the internal database user SYS and spawns multiple
channels as mentioned in the script.
3. RMAN makes a call to sys.dbms_rcvman to request database schema
information from the controlfile (like datafile info and SCNs).
4. After getting the datafile list, it prepares for backup. To guarantee
consistency, it either builds or refreshes the snapshot control file.
5. RMAN then calls the sys.dbms_backup_restore package to create the
backup pieces.
6. If controlfile autobackup is set to ON, it also takes a backup of the
spfile and controlfile into a backup set.
7. During backup, datafile blocks are read into a set of input buffers, where they
are validated/compressed/encrypted and copied to a set of output buffers.
The output buffers are then written to backup pieces on either disk or tape
(DEVICE TYPE DISK or SBT).
46. An RMAN full database backup started at 5:00 AM; while it was
still running at 5:30 AM, a new datafile was added. The backup
completed at 6:00 AM. Will the new datafile be part of this RMAN
backup set?
No, it will not be part of this backup: RMAN works from the datafile list in
the snapshot control file taken at the start of the backup.

47. What are the different phases of an RMAN backup?

1. Read Phase: a channel reads blocks from disk into input I/O buffers.
2. Copy Phase: a channel copies blocks from input buffers to output buffers
and performs additional processing (validation, compression, encryption) on
the blocks.
3. Write Phase: a channel writes the blocks from output buffers to the storage
media. The write phase can take either of two mutually exclusive forms,
depending on the type of backup media (disk or SBT/tape).
48. Who updates the block change tracking (BCT) file, and how?

The CTWR background process updates the block change tracking file.

49. Let's say I took a full backup at 5 AM, and after 2 hours (7 AM)
I added 2 datafiles. Now at 9 AM the DB crashed/got corrupted, so I
need to restore/recover the database. Can I use the full backup to
restore/recover the database including those datafiles which were
added later?
Yes. We restore the full backup, then recover using the archive logs to the
current point; during media recovery, Oracle recreates the datafiles that were
added after the backup and applies the redo to them.

50. How can i recover from undo block corruption?


First you need to find which datafile and block are corrupted, using
V$DATABASE_BLOCK_CORRUPTION, and then connect with RMAN and run the block
recover command.
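A sketch of the commands involved (the file and block numbers are illustrative):

```sql
-- In SQL*Plus: list corrupted blocks found by previous backup/validate runs.
SELECT file#, block#, blocks, corruption_type
FROM v$database_block_corruption;

-- In RMAN: repair a specific block, or everything in the corruption list.
RECOVER DATAFILE 4 BLOCK 123;
RECOVER CORRUPTION LIST;
```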

51. What is the relation between large pool size and RMAN backup?
When RMAN uses I/O slaves, the backup I/O buffers are allocated from the large
pool; if the large pool is not configured, they are taken from the shared pool,
which can put pressure on it.
52. How can we take an RMAN cold backup?
Even if the database is in noarchivelog mode, you can start the database in
mount stage (after a clean shutdown) and take a consistent backup.
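A cold (consistent) RMAN backup sketch for a noarchivelog database:

```sql
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
BACKUP DATABASE;
ALTER DATABASE OPEN;
```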
53. Suppose someone dropped a table mistakenly in the production
database. What will be your immediate action plan to recover that
table?
If the recycle bin is enabled, the quickest option is FLASHBACK TABLE ... TO
BEFORE DROP. Otherwise, we can use RMAN table-level recovery (12c onwards) or
restore the table via a point-in-time recovery into an auxiliary database.
54. What is backup optimization in RMAN?
With backup optimization ON, RMAN skips files that are identical to files it
has already backed up to the same device type (subject to the retention
policy), for example read-only tablespace datafiles and already backed-up
archive logs.

Postgres DBA Interview Questions



POSTGRES ARCHITECTURE & INTERNALS:


1. How connection is established in postgres?
The supervisor process is called postgres server( earlier it was known as postmaster) and
listens at a specified TCP/IP port for incoming connections. Whenever a connection
request comes, it spawns a new backend process. Those backend processes
communicate with each other and with other processes of the instance
using semaphores and shared memory to ensure data integrity throughout concurrent
data access.
Once a connection is established, the client process can send a query to the backend
process it’s connected to. The backend process parses the query, creates an execution
plan, executes the plan, and returns the retrieved rows to the client by transmitting them
over the established connection.
2. Explain the postgres architecture ?
3. What are some important background processes in postgres?
BGWriter ->  It writes dirty/modified buffers to disk

WALWriter -> periodically writes the WAL buffers out to disk; WAL is also
flushed at every transaction commit.

SysLogger -> error reporting and logging.

Checkpointer -> performs checkpoints, writing all dirty buffers to disk.

Stats Collector -> It collects and reports information about database activities. The
permanent statistics are stored in pg_catalog schema in the global  subdirectory.

Archiver -> when the database is in archive mode, it captures the WAL data of
each segment file once it is filled and saves that data somewhere before the
segment file is recycled for reuse.
4. What are the memory components in postgres?
Shared Memory:
 shared_buffers ->
o −  Sets the number of shared memory buffers used by the
database server
o −  Each buffer is 8K bytes
o −  Minimum value must be 16 and at least 2 x max_connections
o −  Default setting is managed by dynatune.
o −  6% – 25% of available memory is a good general guideline
o −  You may find better results keeping the setting relatively low
and using the operating system cache more instead
 wal_buffers ->
o  Number of disk-page buffers allocated in shared memory for
WAL data
o −  Each buffer is 8K bytes
o −  Needs to be only large enough to hold the amount of WAL
data created by a typical transaction since the WAL data is
flushed out to disk upon every transaction commit
o −  Minimum allowed value is 4
o −  Default setting is -1 (auto-tuned)
 clog buffer
Private Memory used by each server process:
 Temp_buffer -> for temp table operations.
 work_mem ->
o −  Amount of memory in KB to be used by internal sorts and hash
tables before switching to temporary disk files
o −  Minimum allowed value is 64 KB
o −  It is set in KB and the default is managed by dynatune
o −  Increasing the work_mem often helps in faster sorting
o −  work_mem settings can also be changed on a per session basis
 maintenance_work_mem ->
− Maximum memory in KB to be used in maintenance operations such as
VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY − Minimum
allowed value is 1024 KB

− It is set in KB and the default is managed by dynatune


− Performance for vacuuming and restoring database dumps can be
improved

by increasing this value

 autovacuum_work_mem ->
− Maximum amount of memory to be used by each autovacuum worker
process

− Default value is -1, indicates that maintenance_work_mem to be used


instead

5. When does the WAL writer write data to the WAL segments?

The WAL writer flushes the WAL buffers to the WAL segment files periodically
(every wal_writer_delay milliseconds) and when the WAL buffers fill up; in
addition, WAL is flushed to disk at every transaction commit.

6. When does the bgwriter write data to disk?

The background writer writes dirty shared buffers to disk between checkpoints,
paced by bgwriter_delay and the LRU parameters, so that backend processes
rarely have to write buffers themselves.
7. What are some WAL-related parameters in postgresql.conf?

max_wal_size -> the total size of WAL segments (a soft limit). When this size
is about to be exceeded, a checkpoint occurs.

wal_keep_segments / wal_keep_size -> the minimum number (or, from v13, total
size) of past WAL segments kept for standby servers.

max_wal_senders = 10 (maximum number of WAL sender processes)

wal_level -> how much information is written to the WAL

8. What is checkpoint? When checkpoint happens in postgres?


A checkpoint is a point in the transaction log sequence at which all data files have been
updated to reflect the information in the log. All data files will be flushed to disk.

1. pg_start_backup,
2. CREATE DATABASE,
3. pg_ctl stop|restart,
4. pg_stop_backup,
5. When we issue checkpoint command manually.
For periodic checkpoints, the parameters below play an important role.

 checkpoint_timeout = 5min
 max_wal_size = 1GB
A checkpoint is begun every checkpoint_timeout seconds, or if max_wal_size is
about to be exceeded, whichever comes first. The default settings are 5 minutes
and 1 GB, respectively.

With these default values, PostgreSQL will trigger a CHECKPOINT every 5
minutes, or after the WAL grows to about 1 GB on disk.
9. Lets say your work_mem is 4MB. However your query is doing
heavy sorting, for which work_mem required might be more than
4MB. So will the query execution be successful?
Yes, it will be successful. Once the query has utilized work_mem fully, it
starts spilling to temporary files on disk.
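The spilling is visible in the execution plan; a sketch (the table and column names are hypothetical):

```sql
SET work_mem = '4MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM big_table ORDER BY some_column;
-- With insufficient work_mem the plan shows something like:
--   Sort Method: external merge  Disk: 102400kB
-- instead of:
--   Sort Method: quicksort  Memory: 25kB
```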

10. What are the different phases of statement processing?
Parse (syntax check and parse tree generation), rewrite (apply rules and
views), plan/optimize (choose the cheapest execution plan), and execute (run
the plan and return rows).

11. Which parameters controls the behaviour of BGWriter?


bgwriter_delay – size of sleep delay when number of processed buffers exceeded.
bgwriter_lru_maxpages – number of processed buffers after bgwriter delays.
bgwriter_lru_multiplier – multiplier used by bgwriter to calculate how many buffers need
to be cleaned out in the next round.
These settings are used to make the bgwriter more or less aggressive: lower
values of maxpages and multiplier make the bgwriter lazier, while higher
maxpages and multiplier with low delays make it more diligent.

12. Does Postgres support direct I/O?

No, Postgres does not support direct I/O; it relies on the operating system
page cache for file I/O.

13. How authentication happens in postgres?


The pg_hba.conf file is used for defining host-based authentication. In this
file we define details like connection type, database, user, client IP address,
and authentication method.
Different types of methods in pg_hba.conf file are .

 Trust: for this option users can connect to the database without specifying a
password. When using this option one should be cautious.
 Reject: This option rejects a connection to a database(s) for a user for a
particular record in the file.
 Password: this option prompts the user for a password before connecting to
the database. When this method is specified the password is not encrypted
between the client and the database.
 Md5: this option prompts the user for a password before connecting to the
database. When this method is specified the client is required to supply a
double-MD5-hashed password for authentication.
 Ident –  Obtain the operating system user name of the client by contacting
the ident server on the client and check if it matches the requested database
user name. Ident authentication can only be used on TCP/IP connections.
When specified for local connections, peer authentication will be used
instead.
If you make any changes to the pg_hba.conf file, you need to reload (or
restart) the cluster for the changes to take effect.
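A sample pg_hba.conf fragment (the addresses, database, and user names are illustrative):

```
# TYPE  DATABASE  USER      ADDRESS         METHOD
local   all       postgres                  peer
host    appdb     appuser   10.0.0.0/24     md5
host    all       all       0.0.0.0/0       reject
```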

14. What is the significance of pg_ident.conf file?


Just like we have OS-authenticated DB users in Oracle, Postgres has a similar
concept: we can provide a mapping between OS users and Postgres DB users inside
the pg_ident.conf file.

15. What is this wal_level parameter , different values of wal_level?


   Wal_level determines how much information is written to the WAL
   The default value is replica, adds logging required for WAL archiving as well
as information required to run read-only queries on a standby server
  The value logical is used to add information required for logical decoding
  The value minimal, removes all logging except the information required to
recover from a crash or immediate shutdown
  This parameter can only be set at server start
16. What is visibility map in postgres?
Every heap relation (i.e., table) has a visibility map associated with it. It
holds the visibility of each page in the table file. The visibility of pages
determines whether each page has dead tuples; vacuum processing can skip a page
that does not have dead tuples.

Every visibility map has 2 bits per page.

The first bit, if set, indicates that the page is all-visible( means those pages need not to
be vacuumed)

The second bit, if set means, all tuples on this page has been frozen. ( no need to
vacuum)

Note -> Visibility map bits are set by the VACUUM operation; if data on a page
is modified, the bits are cleared.

The all-visible bit also makes index-only scans possible.

17. What is free space mapping(FSM) in postgres?


Each table/index has a free space map (FSM) file. It keeps information about
which pages have free space in the respective table/index. The extension of the
FSM file is .fsm.

When a tuple is inserted, Postgres uses the FSM of the respective table to
select a page where it can be inserted.

The FSM files are updated during insertion of new records, and the VACUUM
process also updates the free space map when it reclaims space.

We can use pg_freespacemap extension to look into the freespace usage.
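A sketch using the extension (the table name is hypothetical):

```sql
CREATE EXTENSION IF NOT EXISTS pg_freespacemap;

-- Free space available in each page of the table.
SELECT * FROM pg_freespace('my_table');
```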

18. What is the init fork (initialization fork)?

Each unlogged table has an initialization fork (a file with the _init suffix).
After a crash, the init fork is copied over the main fork, resetting the
unlogged table to empty, since unlogged tables are not WAL-logged.

19. What is TOAST?
TOAST (The Oversized-Attribute Storage Technique) is how Postgres handles
values too large to fit in a single page: large column values are compressed
and/or split into chunks that are stored out of line in an associated TOAST
table.
20. What happens in the background during the vacuuming process?
21. What is transaction id wraparound? How to overcome it.
Assume that tuple Tuple_1 is inserted with a txid of 100, i.e., the t_xmin of
Tuple_1 is 100. The server has been running for a very long period and Tuple_1
has not been modified. The current txid is 2.1 billion + 100 and a SELECT
command is executed. At this time, Tuple_1 is visible because txid 100 is in
the past. Then the same SELECT command is executed when the current txid is
2.1 billion + 101. Now Tuple_1 is no longer visible, because txid 100 appears
to be in the future under wraparound comparison. This is the so-called
transaction wraparound problem in PostgreSQL.

To deal with this problem, PostgreSQL introduced a concept called frozen txid, and
implemented a process called FREEZE.
In PostgreSQL, a frozen txid, which is a special reserved txid 2, is defined such that it is
always older than all other txids. In other words, the frozen txid is always inactive and
visible.

In version 9.4 or later, the XMIN_FROZEN bit is set in the t_infomask field of
the tuple rather than rewriting the tuple's t_xmin to the frozen txid.

22. What is the significance of search_path in Postgres?

search_path is the ordered list of schemas that Postgres searches when an
object name is referenced without a schema qualifier; the first schema in the
list is also where new unqualified objects are created.
23. What is multi-version concurrency control (MVCC)?
MVCC provides concurrency control by maintaining multiple versions of the same
tuple, so that read and write operations on the same tuple do not block each
other. An UPDATE in Postgres means a new row version is inserted and the old
one is invalidated (effectively a delete followed by an insert). The old tuples
can still be visible to other transactions, depending on the isolation level.

24. What are the advantages and disadvantages of MVCC?


The advantage is read consistency. Because of MVCC, readers and writers don't
block each other: if a row has been modified by a user but not committed, and
another user tries to access that row, he will get the old committed data.

The disadvantage is that it causes bloating, since more storage is needed to
keep multiple versions of the data. If your database does a lot of DML
activity, Postgres has to keep all the old (updated or deleted) tuple versions,
and a maintenance activity like VACUUM is needed to remove the dead tuples.

25. Difference between pg_log, pg_clog, pg_xlog?

pg_log holds the server's message log files, pg_clog holds transaction commit-status
data, and pg_xlog holds the WAL segments. Note that in PostgreSQL 10, pg_clog was
renamed to pg_xact and pg_xlog to pg_wal (and the default log directory is simply
called log).
26. What is ctid?
It is similar to ROWID in Oracle.

It gives the physical location of the row version within its table. Note that although the
ctid can be used to locate the row version very quickly, a row's ctid will change if it is
updated or moved by VACUUM FULL.
27. What is oid?
Every row of the system catalogs has an object identifier called oid. User tables no
longer get a per-row oid by default (the WITH OIDS option was removed in PostgreSQL 12).
28. difference between oid and relfilenode?
29. Difference between postgresql.conf and postgresql.auto.conf file?
postgresql.conf is the configuration file of the postgres cluster. But when we make any
config change using the ALTER SYSTEM command, those parameters are written to the
postgresql.auto.conf file.

When postgres starts, it first reads postgresql.conf and then postgresql.auto.conf, so
the settings in the auto file take precedence.
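A short sketch of this behaviour (the parameter and value are just examples):

```sql
-- Written to postgresql.auto.conf, not postgresql.conf:
ALTER SYSTEM SET work_mem = '64MB';

-- Remove the override from postgresql.auto.conf again:
ALTER SYSTEM RESET work_mem;

-- Re-read both files without a restart (for reloadable parameters):
SELECT pg_reload_conf();
```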

30. What is timeline in postgres?


The timeline in postgres is used to distinguish the original cluster from a recovered one.
When we initialize a cluster, the timelineid is set to 1; after a database recovery it is
incremented to 2.

31. what is full page write in postgres?


The full_page_writes parameter is set to on (the default value).

When this parameter is on, the PostgreSQL server writes the entire content of each disk
page to WAL during the first modification of that page after a checkpoint. This is needed
because a page write that is in process during an operating system crash might be only
partially completed, leading to an on-disk page that contains a mix of old and new data.
The row-level change data normally stored in WAL will not be enough to completely
restore such a page during post-crash recovery. Storing the full page image guarantees
that the page can be correctly restored.
32. What is xmin and xmax in postgres?
xmin and xmax are fields in the row (tuple) header.

When a row is inserted, xmin is set to the transaction id that
performed the INSERT command, while xmax is null (shown as 0).
When a row is deleted, the xmax of the current version is set to the
transaction id that performed the DELETE.
On an UPDATE, the new row version gets an xmin equal to the xmax of the previous version.
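These hidden columns can be selected directly, which makes the behaviour easy to observe (the table and values are hypothetical):

```sql
CREATE TABLE demo(id int, val text);
INSERT INTO demo VALUES (1, 'a');

-- xmin = txid of the INSERT, xmax = 0 while the row is live:
SELECT xmin, xmax, ctid, * FROM demo;

-- An UPDATE creates a new row version: the old version's xmax
-- and the new version's xmin are both set to the UPDATE's txid.
UPDATE demo SET val = 'b' WHERE id = 1;
SELECT xmin, xmax, ctid, * FROM demo;
```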

33. What is a ring buffer in postgres?


The ring buffer is a temporary buffer area used for performing large read/write
operations on tables, so that such bulk operations do not evict the whole shared
buffer cache.

A ring buffer is allocated for the conditions below:

1. Bulk reads
2. Bulk writes (like CTAS, COPY FROM, ALTER TABLE)
3. During the autovacuum process

Once these processes are completed, the ring buffer is released.

34. What do you mean by postgres cluster?


A postgres cluster is a collection of databases stored in one data_directory. One
postgres instance manages one postgres cluster.

Note – Don't confuse this with an Oracle RAC cluster, where multiple instances run from
multiple nodes for one database.
 

35. What is single user mode in postgres?


The primary use for this mode is during bootstrapping by initdb. Sometimes it is used for
disaster recovery.

36. What is a timeline history file?


When we do a PITR, a timeline history file is created and it contains the information below:

- timelineId – timelineId of the archive logs used to recover.
- LSN – LSN location where the WAL segment switch happened.
- reason – human-readable explanation of why the timeline was changed.
37. Are you aware of archive_library feature in postgres 15?
38. What is the use of wal files/segments?
39. What is LSN?
40. What are the three special transaction ids in  postgres?
0 – > invalid txid

1 – > bootstrap txid (used during the initdb command)

2 – > frozen txid

ADMINISTRATION:
1. What is the default port in postgres?
Default is 5432.

2. What is the default block size in postgres? Can we set a different block size?
The default block size is 8 KB. It can only be changed at compile time (by building
postgres with the --with-blocksize option), not on a running cluster.
3. What is a tablespace?
A tablespace is used to map a logical name to a physical location on disk. In the simplest
terms, a tablespace is a location on disk where database objects like tables and indexes
are stored.

It is also important to note that the name of the tablespace must not start with pg_, as
these are reserved for the system tablespaces.

Tablespace list can be found using below command:

postgres=# select * from pg_tablespace;


(OR)

postgres=# \db+
4. What are the tablespaces created by default after installing
postgres cluster?
pg_global – > PGDATA/global – > used for cluster-wide tables and the shared system catalogs
pg_default – > PGDATA/base directory – > it stores databases and relations

Whenever a user creates a table or index, it is created under pg_default, as that is the
default_tablespace setting.

However, you can change the default_tablespace setting as below:

alter system set default_tablespace=ts_postgres;
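Creating the tablespace first is a prerequisite for the command above; a minimal sketch (the name and directory path are hypothetical):

```sql
-- The location must be an existing, empty directory owned by
-- the postgres OS user:
CREATE TABLESPACE ts_postgres LOCATION '/pgdata/ts_postgres';

-- A table can also be placed there explicitly:
CREATE TABLE t1(id int) TABLESPACE ts_postgres;
```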

5. What is the use of temporary tablespace? 


It is used to store temporary objects like temporary tables, as well as temporary files
for sorts and hashes. If we don't set the temp_tablespaces parameter explicitly,
temporary objects are created under the pg_default tablespace.

postgres=# SELECT name, setting FROM pg_settings WHERE name='temp_tablespaces';
6.What are the default databases created after setting up postgres
cluster? 
postgres=# select datname from pg_database;
datname
———–
postgres
template1
template0
 
7.What is the significance of template0 and template1 databases?
What the difference between these two?
Whenever we create a new database, it is created using the template1 database. If you
modify anything in template1, like creating a new extension there, those changes will be
reflected in databases created afterwards.

template0 – > No changes are allowed to this database. If you mess up your template1
database, you can revert by creating a new one from template0.

You can also see that for template1, datallowconn is true, but for template0 it is false.

 
8. What are common database object names in postgres compare
to industry terms.
 

9. Difference between user and schema in postgres?


If you are familiar with Oracle, you know that there user and schema are the same
thing. But in postgres they are different.

A schema is a collection of database objects, whereas a user is used to connect to the
postgres cluster. A single user can connect to any database in the cluster (provided
permission is given). A schema, however, is local to a particular database.

The diagram below should clear your doubts:


|-------------------------------------------|---|
| PostgreSQL instance                       |   |
|-------------------------------------------| U |
|     Database 1      |     Database 2      | S |
|---------------------|---------------------| E |
| Schema 1 | Schema 2 | Schema 1 | Schema 2 | R |
|----------|----------|----------|----------| S |
| t1,t2,t3 | t1,t2,t3 | t1,t2,t3 | t1,t2,t3 |   |
-------------------------------------------------

10. Difference between role and user in postgres? Can we convert


a role to user?
Role and user are almost the same in postgres; the only difference is that a role cannot
log in, but a user can. We can say a user is a role with the LOGIN privilege.

And yes, we can convert a role to a user by granting it login:

alter role <role_name> login;

(Conversely, alter role <role_name> nologin; turns a login user back into a plain role.)

11. What are the different modes to stop a postgres cluster using pg_ctl?

Postgres has 3 shutdown modes.

smart – > This is the normal mode, i.e. shutdown waits for all existing connections to be
terminated, which might take a lot of time. No new connections are allowed in the
meantime.

fast – > It terminates all existing sessions and performs a checkpoint. It is comparably
quick, but if there are a lot of transactions that need to be written to disk, the
checkpoint might take a lot of time.

immediate – > It aborts the instance, i.e. it terminates all existing sessions and shuts
down the instance without performing a checkpoint. It is the quickest, but crash
recovery will happen upon instance startup.

So if you have a lot of pending transactions in your database, the best way is to perform
a checkpoint manually and then stop the server:

postgres=# CHECKPOINT;  – > run as superuser

enterprisedb$ pg_ctl stop -m fast

12. Can you give comparison of shutdown modes between oracle


and postgres?
ORACLE              POSTGRES
shutdown normal     smart
shutdown immediate  fast
shutdown abort      immediate

13. Can we stop a particular database in postgres cluster?


No, we cannot stop/start a particular database in postgres. We can only shut down the
whole postgres cluster.

14. What are foreign data wrappers? What is their use?

A foreign data wrapper (FDW) lets us access data in a remote database (which can be any
type of database – oracle/postgres/mysql etc.) through ordinary tables in the local
database.
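A minimal sketch with the built-in postgres_fdw extension (the server address, credentials and table names are hypothetical):

```sql
CREATE EXTENSION postgres_fdw;

-- Describe the remote cluster:
CREATE SERVER remote_pg
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host 'remote-host', port '5432', dbname 'appdb');

-- Map the local user to a remote login:
CREATE USER MAPPING FOR CURRENT_USER
  SERVER remote_pg
  OPTIONS (user 'app_user', password 'secret');

-- Expose one remote table locally:
CREATE FOREIGN TABLE remote_orders (
  id  int,
  amt numeric
) SERVER remote_pg OPTIONS (schema_name 'public', table_name 'orders');

SELECT count(*) FROM remote_orders;  -- runs against the remote db
```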

15. What is the use of shared_preload_libraries in the postgresql.conf file?

Usually when we add extensions like pg_stat_statements, we need to add the library
name to the shared_preload_libraries parameter.

Because these extensions use shared memory, we need to restart the postgres cluster
for the change to take effect.

The reason we preload these libraries is to avoid the library startup cost at the time
the library is first used.
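A sketch of the typical workflow, using pg_stat_statements as the example:

```sql
-- In postgresql.conf (or via ALTER SYSTEM), then restart the cluster:
--   shared_preload_libraries = 'pg_stat_statements'

-- After the restart, create the extension's SQL objects:
CREATE EXTENSION pg_stat_statements;

-- Top statements by total execution time
-- (the column is total_time on versions before 13):
SELECT query, calls, total_exec_time
FROM   pg_stat_statements
ORDER  BY total_exec_time DESC
LIMIT  5;
```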
16. What different types of streaming replication are present in postgres? And which
parameters control that?

18. What is random_page_cost?
It is a planner cost parameter giving the estimated cost of fetching a page
non-sequentially (default 4.0, versus seq_page_cost = 1.0). On fast SSD storage it is
often lowered (e.g. to 1.1) so that the planner favours index scans more.
19.Difference between explain and explain analyze in postgres?


Explain – > Generates the query plan by estimating the cost, without running the query.

Explain analyze – > It actually executes the query and reports run-time statistics of the
executed query, which gives more accurate plan details. Please be careful while running
INSERT/UPDATE/DELETE-like DML commands with explain analyze, as it will run the query
and cause data changes.
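One common way to get EXPLAIN ANALYZE output for DML without persisting the change is to wrap it in a transaction and roll back (the table and predicate are hypothetical):

```sql
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
  UPDATE accounts SET balance = balance + 10 WHERE id = 42;
ROLLBACK;  -- the update is executed for timing, then undone
```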

20. What are the different datatypes in postgres?


Boolean

char, varchar, text

int,float(n)

uuid

date

21. What is the maximum file size of a table or index in postgres? Can we increase that?
The maximum size of a single file is 1GB. If a table is bigger, it is spread across multiple
segment files: say the file name is 19870; once it reaches 1GB, a new file is created as
19870.1, and so on. The segment size can only be changed at compile time (--with-segsize).

22. Is there a way we can rebuild/reorg a table online to release free space?
As we know, vacuum full is used to rebuild a table and release free space back to the
operating system. However, this method puts an exclusive lock on the table.

So we can use the pg_repack extension to rebuild a table online.


23. How pg_repack works internally?
24. How can i check the version of postgres?
cat PG_VERSION (inside the data directory),
or: select version();

25. What is the latest version of postgres in market?


Postgres 15.

26. While dropping a postgres database I am getting an error. What might be the issue?
To drop a database, you need to fire the command after connecting to a different
database in the same postgres cluster, with superuser (or owner) privilege. The drop will
also fail if any other session is still connected to the target database.

27. What are some key difference between oracle and postgres?
28. Between postgres and nosql database like mongodb , which
one is better?
29. What is table partitioning in postgres? What are the
advantages?
30. How do you monitor long running queries in postgres?
We can use the pg_stat_activity view to track them.
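A typical query for this (the 5-minute threshold is just an example):

```sql
-- Sessions whose current statement has been running > 5 minutes:
SELECT pid,
       now() - query_start AS runtime,
       state,
       query
FROM   pg_stat_activity
WHERE  state = 'active'
  AND  now() - query_start > interval '5 minutes'
ORDER  BY runtime DESC;
```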

31. How to kill a session in postgres?


pg_terminate_backend(pid). The pg_terminate_backend command sends the SIGTERM signal
to the backend process, whereas pg_cancel_backend() sends the SIGINT signal to the
backend process.

32. What is an extension in postgres? Which extensions have you used?
An extension packages additional SQL objects and code that can be loaded into a
database with CREATE EXTENSION. Some I have used:
pg_stat_statements

pg_track_settings

pg_profile

33. What are the popular tools for managing backup and recovery
in postgres?
edb bart , barman etc

34. How can we use connection pooling in postgres?


pgbouncer is used for connection pooling

35. Which utility is used to upgrade postgres cluster?


pg_upgrade utility is used to upgrade postgres version.

36. How can we encrypt specific columns in postgres?


The pgcrypto extension can be used.
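A minimal symmetric-encryption sketch with pgcrypto (the key and table are hypothetical; real deployments should manage the key outside the database):

```sql
CREATE EXTENSION pgcrypto;

CREATE TABLE users (
  id  int,
  ssn bytea    -- encrypted column
);

-- Encrypt on insert:
INSERT INTO users
VALUES (1, pgp_sym_encrypt('123-45-6789', 'my_secret_key'));

-- Decrypt on read:
SELECT id, pgp_sym_decrypt(ssn, 'my_secret_key') AS ssn FROM users;
```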

37. What is the use of the pgbench utility?

pgbench is postgres's built-in benchmarking tool; it initializes a small TPC-B-like schema
and measures transactions per second under a configurable number of client connections.
38. What is difference between pg_cancel_backend vs
pg_terminate_backend?
pg_cancel_backend will cancel the running SQL query of that session, but the session will
remain intact.

pg_terminate_backend will kill the whole session, i.e. the pid.

39. Do we have synonyms in postgres?

40. How can we convert a non-partitioned table to a partitioned table in postgres?

41. Can we create an invisible index in postgres?
Yes, in postgres we can create a hypothetical index. For that we need a third-party
extension called hypopg:
SELECT * FROM hypopg_create_index('CREATE INDEX ON t1 (a)');

42. Suppose we have some views using a table, and we drop that table – what will be the
impact on those views?

43. Difference between open source installation and edb installation?

44. What is the difference between pgpool and pgbouncer?
Though both are connection pooling tools, pgpool additionally offers load balancing and
failover handling, while pgbouncer is a lightweight pooler focused purely on connection
pooling.

MAINTENANCE:
1. What is vacuuming in postgres?
Whenever tuples are deleted or become obsolete due to an update, they are not removed
physically. These tuples are known as dead tuples, and vacuuming is used to clear those
dead tuples and release the space.

The main tasks of vacuuming are:

1. Removing dead tuples
2. Freezing transaction ids
3. Updating the FSM and VM of tables
2. Difference between vacuum and vacuum full?
VACUUM:
- Vacuum without the full option will reclaim the space and re-use it. However, that free
space is not returned to the operating system.
- It does not put an exclusive lock on the table, so there is no impact on normal reads
and writes of the database.
- We can also run this in parallel, called parallel vacuuming.
Below is the VACUUM sequence:
1. Puts a shared lock on the table.
2. Scans all the pages to get the tuples and freezes old tuples if necessary.
3. Removes the index tuples pointing to dead tuples in the table.
4. Removes the dead tuples.
5. Updates both the FSM and VM of the respective table.
6. Updates the statistics and system catalog tables.
VACUUM FULL:
- Vacuum with the full option will reclaim more space and release it to the operating
system.
- However, we need additional storage, as it rewrites the entire data to a new disk file
without any free space.
- It also puts an exclusive lock on the table for which vacuuming is in progress.
- This method is much slower.
Below is the VACUUM FULL sequence:
1. It puts an exclusive lock on the table and creates a new table file.
2. Copies the live tuples from the old table to the new one.
3. After the copy is done, it removes the old table files, rebuilds the indexes and
updates the VM and FSM.

3. What do you mean by vacuum freeze? And when do we use it?

VACUUM (FREEZE) aggressively freezes all tuples in the table (equivalent to running with
vacuum_freeze_min_age = 0), marking them as frozen so they survive txid wraparound. It
is used when a table's or database's txid age is approaching autovacuum_freeze_max_age.
4. What is auto vaccuming in postgres?
Autovacuum vacuums a table when its dead tuples reach 20 percent of the table, and
analyzes the table when changed rows reach 10 percent of the table.

The four parameters below control the behaviour of autovacuuming:

autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50

The vacuum trigger condition is: dead tuples > autovacuum_vacuum_threshold +
autovacuum_vacuum_scale_factor * number of tuples in the table.
5. How can we increase the performance of auto vacuuming?
The autovacuum_max_workers parameter helps; it allows up to the defined number of
autovacuum worker processes to run in parallel. Cost-based settings like
autovacuum_vacuum_cost_limit can also be raised so each worker does more work per cycle.

6. How can i disable autovacuuming?


At table level:
ALTER TABLE table_name SET (autovacuum_enabled = false);

7. What is data fragmentation or bloating in postgres?


8. Difference between normal transaction id and frozen transaction
id?
9. How are statistics updated in postgres?
10. What CLUSTER command does? When and how it helps.?
As we know, table data may or may not be stored in an ordered fashion, whereas in an
index the entries are always ordered. So we can sort the data inside the table by
referring to an index.

The command is – > CLUSTER <table_name> USING <index_name>;

In Oracle, this is measured by the clustering factor: a low clustering factor means the
table data is in an orderly manner.

If we are accessing only a single row, the clustering factor doesn't impact performance.
However, if you are accessing multiple rows using an index, a good clustering factor will
improve performance.

NOTE – > CLUSTER will put exclusive lock on the table during this activity.

11. What is reindexing and when we should do reindex?


reindex means recreating the index.

We should do reindex, when

1. Index is bloated

2. when we alter its storage parameter , for example fillfactor.

1. How can I rebuild an index?

For all indexes of a database:
REINDEX DATABASE <db_name>;

For the system catalogs of a database:
REINDEX SYSTEM <db_name>;

For all indexes of a schema:
REINDEX SCHEMA schema_name;

For all indexes of a table:
REINDEX TABLE table_name;

For a particular index:
REINDEX INDEX index_name;
2. Will there be any impact on the database if I run the reindex command?
The REINDEX command puts an exclusive lock on the table, so for big indexes on a live
system it might be an issue.
However, from postgres 12, the REINDEX CONCURRENTLY option is available, which does the
reindex without blocking writes on the table.

2. What happens internally when we rebuild indexes?


 

12. Let's say autovacuuming is already running in the background and you run the vacuum
command manually on the same table. What will happen?
In this case, postgres will terminate the autovacuum worker process and give priority to
the manual vacuum command, i.e. autovacuuming will never block normal manual
database operations.

13. Can i disable autovacuum for few tables in the database?


Yes you can do with below command.

alter table test2 set (autovacuum_enabled = off);

14. What is fill factor and how does it impact autovacuuming performance?

15. What exactly happens when we do vacuum full?
1. It puts an exclusive lock on the table

2.Creates a new table file.


3.Copies the live tuples from old table to new table.

4. Once the copy is done, it removes the old files.

5. Then it updates the indexes, VM and FSM.

6. Updates the table stats and associated system catalogs.

REPLICATION:
1. Explain the basic streaming replication architecture.
1. When we start the standby server, the wal receiver process gets started on the
standby.
2. The wal receiver sends a connection request to the primary.
3. When the primary receives the wal receiver's connection request, it starts a walsender
process and a connection is established between the wal sender and wal receiver.
4. Now the wal receiver sends the standby's latest LSN to the primary.
5. If the standby's LSN < the primary's LSN, then the wal sender sends the required WAL
data to keep the standby in sync.
6. The received wal data is replayed on the standby.
2. How can you check whether the database is primary or
standby(replication).
method 1 :
We can check using the query below. If the output is t (true), it is a standby (slave)
server. If false, it is the primary (master).

select pg_is_in_recovery();

method 2:
You can run below query.

select * from pg_stat_replication;

3. I have a master and slave setup with streaming replication. Now


I want to break the replication and open the standby server for
some testing purpose and once the testing is done, the server
again need to be a standby of that primary with replication
enabled. How can i do that?
pg_rewind.

4. How pg_rewind works?


5. How can i do switchover in postgres without efm?
In postgres itself there is nothing called switchover; we can only do a failover and then
rebuild the old primary. But with EFM we can do a switchover using the efm promote
command (with its switchover option).

6.Difference between standby.signal and recovery.signal?


- recovery.signal: tells PostgreSQL to enter normal archive recovery
(example – point in time recovery).
- standby.signal: tells PostgreSQL to enter standby mode.
7. In streaming replication which additional processes run on
primary and standby?
wal sender on primary

wal receiver on standby

8. What is the default streaming replication mode SYNC or


ASYNC?
ASYNC is the default one.

9. Explain different types of replications methods available in


postgres?
10. What is the significance of the parameter
synchronous_commit?
Below are the values we can set for synchronous_commit:

1. off – > commit doesn't wait for the transaction record to be flushed to local disk.
2. local – > commit waits until the transaction record is flushed to the local disk.
3. on – > commit waits until the standby servers mentioned in
synchronous_standby_names confirm that the data is flushed to the standby's disk.
4. remote_write – > commit waits until the standby servers mentioned in
synchronous_standby_names confirm that the data is written to the OS, but not
necessarily to disk.
5. remote_apply – > commit waits until the standby servers mentioned in
synchronous_standby_names have applied those changes to the database.
11. Suppose A is primary and B is standby ? Now failover
happened and then A becomes standby and B becomes primary.
Now i want to make the A as primary again and B as standby . How
can i achieve that?
12. Explain the architecture of EFM?
You need one master, one slave and one witness server.

13. What are some common reasons which cause streaming


replication to fail.
14. What are the changes related to recovery.conf happened in
postgres 12.
From postgres 12 onwards, the parameters of recovery.conf are now part of the
postgresql.conf file. Instead of a recovery.conf file, only two signal files are used,
standby.signal and recovery.signal (note both are empty files).

15. When we do failover, what happens in the timeline.


Timeline gets changed. i.e new timeline id is selected.

16. Like snapshot standby in oracle, can we use slave server in


postgres as snapshot standby and revert it back to slave server
once testing is done?
pg_rewind

17. Is there any mandatory parameter for pg_rewind to work?
Yes – the target cluster must have been started with wal_log_hints = on, or have been
initialized with data checksums enabled.

18. Is replication possible between different versions of postgres?
Streaming (physical) replication requires the same major version on both ends, but
logical replication can work across major versions.
19. What is a replication slot?
Suppose the standby database is down for a long time, and the WAL on the primary
reaches the max wal_keep_segments size; the primary will then start deleting old WAL
data. As the primary is not tracking the standby, if it removes WAL data that has not
been sent to the standby yet, then when we start the standby, it will fail.

To avoid this issue, we can create a replication slot. A replication slot ensures that WAL
data which has not yet been applied on the standby will not be deleted.
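Creating and inspecting a physical slot looks like this (the slot name is hypothetical; the standby references it through primary_slot_name):

```sql
-- On the primary:
SELECT pg_create_physical_replication_slot('standby1_slot');

-- Monitor slots; an inactive slot with an old restart_lsn is the
-- classic cause of unbounded pg_wal growth:
SELECT slot_name, active, restart_lsn FROM pg_replication_slots;

-- Drop a slot that is no longer needed:
SELECT pg_drop_replication_slot('standby1_slot');
```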

20. Do you see any disadvantage of replication slots?

An orphaned replication slot can cause unbounded pg_wal disk growth, which is not
desirable.

21. How many types of replication slots are there?

There are two types:

1. Physical replication slots (used by streaming replication)

2. Logical replication slots (used by logical decoding/replication)
 
MONITORING:
BACKUP & RECOVERY:
1. Explain different backup methods in postgres?
Reference – https://www.tutorialdba.com/p/postgresql-backup-recovery-overview.html

2. Difference between pg_dump and pg_dumpall?

Both are logical backup utilities.

pg_dump – > used for taking a backup of databases, schemas or tables. The backup format
can be plain sql, custom, tar or directory.

pg_dumpall – > used for taking a backup of the complete cluster, including globals like
roles and tablespaces. The backup format is plain sql.

3. Can I take a backup of the complete cluster using pg_dump?

4. What is the use of the --globals-only option in pg_dumpall?

5. What does pg_basebackup do internally?

6. Explain how postgres does crash recovery?
In case of a crash of the postgres instance, when the instance starts again it will replay
the WAL files from the last checkpoint's redo point and recover the instance.

7. Can we take incremental backups with postgres? And how?

You can use third party tools like BART and barman.

8. Which important parameters are used for point in time recovery?

9. What do you mean by continuous archiving in postgres?

10. Can I take an export of only users using pg_dump?
No – roles are global objects, so they are dumped with pg_dumpall --globals-only, not
with pg_dump.


3. Different types of indexes present in postgres?
- BTREE
- HASH
- GIST
- BRIN
- GIN
4. Does create index put a lock on the table? If yes, is there any workaround to avoid
this locking?
Yes, the create index command takes a lock that blocks writes on the table (reads are
still allowed). You can add the CONCURRENTLY keyword, which avoids blocking writes on the
table.
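A quick sketch (the table and column are hypothetical; note that CONCURRENTLY cannot run inside a transaction block and performs two table scans, so it is slower):

```sql
-- Blocks writes on orders while it builds:
CREATE INDEX idx_orders_cust ON orders (customer_id);

-- Builds without blocking writes (slower, two scans):
CREATE INDEX CONCURRENTLY idx_orders_cust2 ON orders (customer_id);
```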
5. What is index only scan in postgres?
 

PATCH & UPGRADE:


1. What is a major version upgrade and a minor version upgrade in postgres?
A major version upgrade means upgrading from postgres 12 to postgres 13, or 13 to 14,
and so on.
In a major upgrade, the postgres binary and data_directory paths get changed.

The pg_upgrade tool is used for major version upgrades.

A minor version upgrade means upgrading from, say, 13.1 to 13.3.

In this case, the pg binary and data_directory paths are not changed; the new binaries
are installed in place and the server is restarted.
 
2. Explain how you do a major upgrade in postgres?
STEPS:
1. Install new pg binary:
2. Initialize new pg binary:
3. shutdown both old and new pg cluster
4. run pg_upgrade with -c option for verify
5. run the actual pg_upgrade ( either with or without link option)
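The steps above can be sketched as follows (all paths and versions are hypothetical and depend on the installation layout):

```shell
# 1-2. New binaries installed and new cluster initialized, e.g.:
/usr/pgsql-14/bin/initdb -D /pgdata/14/data

# 3. Stop both clusters, then 4. dry-run check only (-c):
/usr/pgsql-14/bin/pg_upgrade \
  -b /usr/pgsql-13/bin -B /usr/pgsql-14/bin \
  -d /pgdata/13/data   -D /pgdata/14/data   -c

# 5. Actual upgrade (optionally with --link to avoid copying data):
/usr/pgsql-14/bin/pg_upgrade \
  -b /usr/pgsql-13/bin -B /usr/pgsql-14/bin \
  -d /pgdata/13/data   -D /pgdata/14/data   --link
```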
3. What is the use of the link option in pg_upgrade?
While doing the upgrade, if we use the link option, data is not copied from the old
directory to the new directory; only hard links are created in the new data
directory.

For using the link option, both the old and new data_directory need to be on the same
file system.

4. What are the advantages and disadvantages of using the link option?
pros: As data is not copied, it saves a lot of time in the upgrade (especially for large
databases).

cons: If we use the link option, then once the new cluster is started we won't be able to
go back to using the old cluster.

PERFORMANCE RELATED:
 

5. How do you find which objects are bloated in the db?
The pgstattuple extension can be used to measure the dead-tuple percentage of a table or
index; there are also well-known estimation queries based on pg_class and pg_stats.

Oracle Performance Tuning


Interview Questions And Answers

1. Difference between a local partitioned index and a global partitioned index?

In a local partitioned index, there is a one-to-one relation between data partitions and
index partitions, i.e. for each table partition there is one respective index partition
(based on the same partition key as the table).
But in a global partitioned index, there is no one-to-one relation, i.e. a table can have
20 partitions while the global partitioned index has 4 partitions. Also, the
"highvalue=MAXVALUE" clause is mandatory for creating a global partitioned index.

2. What is prefixed and non-prefixed partitioned Index.


If the index columns are the same as the partition key, or include its left-most column,
then we call it a prefixed index. If the indexed column is either not the leading edge of
the partition key or not part of the partition key at all, we call it non-prefixed.
Prefixed indexes can use partition pruning, but non-prefixed partitioned indexes cannot.
For a global partitioned index, there are only prefixed indexes; Oracle doesn't support
non-prefixed global partitioned indexes.

3. Can we create a unique local partition index?


Yes, we can create a unique local partitioned index, but only if the partition key columns
are a subset of the index key columns.

4. What is a partial Index? What are the advantages?


From 12c onwards, an index can be created on a subset of partitions. This can be used
when we know which partitions are widely used and which are hardly used, and it helps
save storage space.

5. What is a b tree index and where it is used?


6. What is a BITMAP index tree and in which scenario this index is
used?
 

7. When we should do index rebuild? What happens when we do


index rebuild?
 

8. What is IOT( Index Organised Table)?  When we should use this?


A primary key is a must in an IOT table.
An index-organized table is most effective when the primary key constitutes a large part
of the table's columns. It also suits data that is mostly static.

9. How to rebuild an INDEX?


 There are two ways to rebuild an index. 
OFFLINE:(During this the exclusive lock will be applied on the table, which will impact
DDL and DMLS.)
ALTER INDEX SCHEMA_NAME.INDEX_NAME REBUILD;
Alternatively we can drop and recreate the index .
ONLINE:(DML and DDLS operations will work as usual, without any impact).
ALTER INDEX SCHEMA_NAME.INDEX_NAME REBUILD ONLINE;
10. How can we improve the speed of index creation on large
table?
First, we can use parallel in the create index syntax to improve the speed. However,
make sure to revert the parallelism to default once the index is created.
create index TEST_INDX_object on test(object_name) parallel 12;
alter index TEST_INDX_object noparallel;

Apart from this, we can add the nologging parameter in the create index statement.

Create index generates a lot of redo, and these logs are of no use, so we can create the
index with nologging as below:

create index TEST_INDX_object on test(object_name) nologging parallel 12;

11. What is clustering factor?


The clustering factor indicates the relation between the table order and the index order.
In the index, the key values are stored in an ordered fashion, but in the table the rows
are stored in the order of insert.

A low CLUSTERING_FACTOR means the data is in an ordered manner in the table, i.e. a good
clustering factor. The minimum clustering factor is equal to the number of blocks of the
table.

A high CLUSTERING_FACTOR means the data is randomly distributed, i.e. a bad clustering
factor. The maximum clustering factor is equal to the number of rows of the table.

Please note – rebuilding the index will not improve the clustering factor. You need to
recreate the table (in index order) to fix this.

QUERY TO FIND – > SELECT INDEX_NAME, CLUSTERING_FACTOR FROM DBA_INDEXES;


11. How oracle calculates clustering factor of an index?
These diagrams will help you in understanding how the clustering factor is calculated.

For calculating the CF, Oracle fully scans the index leaf blocks. The leaf blocks contain
rowids, so from a rowid it can find the table block the row lives in. While scanning the
index, whenever there is a change in block id, it increments the CF by one.
GOOD clustering factor:

In the diagram below, for the first 4 rows the block is 15, so the CF is 1 at that point.
But the fifth row is in a different block, so the CF is incremented by 1. Similarly, with
every block change, the CF is incremented by one.

Bad clustering factor:

In the diagram below, the block id changes very frequently from row to row, so the CF is
very high, which is bad.
 

Diagram reference – https://techgoeasy.com/clustering-factor/


12. What is rowid ?
Rowid is a pseudocolumn which contains the information below. ROWID is used to get the
exact location of a row in the database; it is the fastest way of locating a row.

When we create an index, the rowids and the column values are stored in the leaf blocks
of the index.

1. object_id
2. datafile number in which the row resides (i.e. relative file number)
3. block number
4. position of the row in the block
13. What data is stored in the index leaf block?
Index leaf block stores the indexed column values and its corresponding rowid( which is
used to locate the actual row).
14. How oracle uses index  to retrieve data?
When query hits the database, The optimizer creates an execution plan involving the
index. Then the index is used to retrieve the rowid. And using the rowid, the row is
located in the datafile and block.

15. Explain the scenario where , the data will be retrieved using
only index ,without accessing the table at all?
In case of covering index, i.e when we are only retrieving the indexed column data.

16. What is a covering index?


A covering index means that the query fetches only indexed column data; in that case
access to the table is not required, because all the needed column data is already
inside the index.

select emp_id,emp_name from emp_list where emp_name='RAJU';

Here an index is already present on (emp_id, emp_name).

17. What is cardinality in oracle?


Cardinality refers to the uniqueness of the data in a particular column of a table. If the
table column has more number of distinct values, then cardinality is high. If distinct
values are less, then cardinality is low.

18. Why my query is doing full table scan , despite having index on
the predicate.
In the scenarios below the optimizer might not use the index:

- a condition like: where index_column like '%id' (leading wildcard)
- huge data is requested from the table
- a very high degree of parallelism is defined on the table
- searching for a null value (with a B-tree index)
19. Difference between invisible index and unusable index?
When we make an index invisible, the optimizer will not use it, but the index will still
be maintained internally by oracle, i.e. for all dml activity on the table the index is
updated. If we want the optimizer to use it again, we can make it visible using the alter
index command.

But when we make an index unusable, the optimizer will not use it and the index will no
longer be maintained by oracle. So if we want to use the index again, we need to rebuild
it. This is often used in large environments as a first step toward dropping an index:
in a large database, dropping an index can take a long time, so it can first be made
unusable and then dropped during low business hours.
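
The two states can be toggled with alter index commands, sketched below (emp_idx is a hypothetical index name):

```sql
alter index emp_idx invisible;       -- still maintained, just ignored by the optimizer
alter index emp_idx visible;         -- optimizer can use it again

alter index emp_idx unusable;        -- no longer maintained by DML
alter index emp_idx rebuild online;  -- required before it can be used again
```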

20. Why does moving a table make its indexes unusable? Do you know
any other scenarios which will make an index unusable?
When we move a table, the rows are moved to a different location and get new rowids, but
the index still points to the old rowids. So we need to rebuild the index, which makes the
index entries use the new set of rowids for the table rows.

Other operations that make an index unusable include dropping or truncating a table partition.

21. Do you know any scenario where making an index unusable
can be helpful?
In DWH environments, when huge data loading operations are performed on tables having a
lot of indexes, performance slows down, as all the respective indexes need to be updated.
To avoid this, we can make the indexes unusable and load the data. Once loading is
completed, we can recreate/rebuild the indexes.

21. Difference between primary key and unique key?

A unique key constraint can contain null values, but primary key columns must be not
null.

22. How does an index work on a query with a null value? Let's say I am
running a query select * from emp where dept_name is null; and an
index is present on dept_name. In that case, will oracle use the
index?
First we need to find what type of index it is. If the index is a B-tree, then null values
are not stored in it, so oracle will do a TABLE FULL SCAN.

But if the index is a BITMAP index, then the index will be used even for null values (i.e.
it will do an index scan), because a bitmap index stores null values as well.

23. Between b-tree index and bitmap index, which index creation is
faster?
Bitmap index creation is faster.

24. Can I create a bitmap index on a partitioned table?

A bitmap index can be created on a partitioned table, but it must be a local partitioned
index.
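
A minimal sketch (sales and region are hypothetical names):

```sql
-- Bitmap indexes on partitioned tables must be LOCAL (one index segment per partition)
create bitmap index sales_region_bix on sales (region) local;
```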

25. Can I create two indexes on the same column in oracle?

Prior to 12c it was not allowed to have two indexes on the same column. From 12c onwards
we can create two indexes on the same column, but the index types must be different (i.e.
if I already have a B-tree index on a column, the other index should be a bitmap), and
only one of the two can be visible at any time.

26. What is the difference between heap organized table and index
organized table?
 

27. What is an overflow in IOT table? Explain its purpose.


 

28. Can we create a secondary index on an IOT?

Yes, we can create a secondary index (either B-tree or bitmap) on an IOT.

29. Is it possible to convert a heap organized table to an index
organized table and vice versa?
30. In which scenarios should I use an IOT?
Mostly for a table with a small number of columns, where most of the columns are in the
primary key and are frequently accessed.

Also used where the requirement is fast access to primary-key column data.

It is not recommended in DWH environments, because they involve bulk data loading, which
makes the physical guesses go stale very quickly.

31. When should I rebuild an IOT index? Explain different ways to
do it.
32. We want to find the list of unused indexes in a database. How can
I do that?
We can alter the index with the MONITORING USAGE clause. After that, usage of the
respective index will be tracked.
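
As a sketch (emp_idx is a hypothetical index name; from 12.2 the automatically populated dba_index_usage view is an alternative):

```sql
alter index emp_idx monitoring usage;

-- v$object_usage shows monitored indexes of the current schema
select index_name, monitoring, used
from   v$object_usage
where  index_name = 'EMP_IDX';

alter index emp_idx nomonitoring usage;  -- stop tracking
```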

33. Explain the difference between coalescing and shrinking for
reducing index fragmentation?
Coalescing combines adjacent leaf blocks into a single block and puts the freed leaf
blocks on the free list of the index segment, where they can be reused by the index in the
future. But it does not release the free space back to the database.

Shrinking, on the other hand, does release the free space back to the database.
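
The two operations, sketched (emp_idx is a hypothetical index name):

```sql
alter index emp_idx coalesce;      -- merge adjacent leaf blocks; space stays in the segment
alter index emp_idx shrink space;  -- compact and return free space to the tablespace
```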

34. What are the different index scanning methods?

INDEX UNIQUE SCAN: used with an equality predicate on a unique/primary key index; returns
at most one rowid.

INDEX RANGE SCAN: scans a bounded range of index keys in order.

INDEX FULL SCAN: reads the whole index in key order using single-block reads.

INDEX FAST FULL SCAN: reads the whole index with multiblock reads; results are not in key
order.

INDEX SKIP SCAN: scans a composite index by logically skipping the leading column.

35. Which index scan method causes the db file scattered read wait
event (ideally indexes cause db file sequential read)?
Index fast full scan causes the db file scattered read wait event, because this method
uses multiblock read operations to read the index. This type of operation can also run in
parallel.

36. If I use a not-equal condition in the where clause, will it pick the
index? Like select * from table_tst where object_id <> 100?
If this query returns a high number of rows, the optimizer will use a FULL TABLE SCAN,
i.e. the index will not be used.

And if you try to force the query to use the index with an index hint, it will do an
INDEX FULL SCAN, but not an index range scan.

37. If I query with a wildcard character in the where clause, will the
optimizer pick the index?
It depends on where the wildcard character is placed.

If the query is select * from emp where emp_name like 'SCO%'; -> then it will use the
index.

If the query is select * from emp where emp_name like '%SCO'; -> this will not use the
index.
38. What is a function-based index?
If the query applies a function to an indexed column in the where clause, then a normal
index will not be used.

i.e. for a query like select * from emp where lower(emp_name)='vikram'; the optimizer
will not use a normal index.

So in this case, we can create a function-based index as below, which will be used by the
optimizer.

create index fun_idx on emp(lower(emp_name));


39. What are the index-related initialization parameters?
optimizer_index_caching

optimizer_index_cost_adj

39. If given an option between index rebuild and coalesce, which
one should you choose?
Coalesce is generally preferred over index rebuild, because an index rebuild needs twice
the space to complete the activity, generates a lot of redo during the activity, and can
impact CPU usage.

39. Should we rebuild indexes regularly?

It is advised not to rebuild indexes regularly, or at all, unless a performance
assessment finds an issue with an index.

The general belief is that an index rebuild balances the index tree, improves the
clustering factor and reuses deleted leaf blocks. But this is a myth: a B-tree index is
always balanced, and rebuilding an index doesn't improve the clustering factor.

39. Does rebuilding an index help the clustering factor?

Rebuilding an index never has an influence on the clustering factor; improving it instead
requires a table re-organization.

39. What is index block splitting? What are the different types of splits?

Index entries are managed in an orderly fashion in the index structure. So if a new key is
being added and there is no free space for the key, the index block will be split.

Two types of splits:

1. 50-50 split -> To accommodate the new key, a new block is created; 50 percent
of the entries stay in the existing leaf block and 50 percent go to the new
block.
2. 90-10 split -> If the new index entry is the right-most value (as when the
indexed value is sequentially increasing, like a transaction_date), the leaf
block is split and the new index key is added to the new leaf block.
HISTOGRAM and STATISTICS
1. What are statistics in oracle? What type of data in stores?
2. Difference between statistics and histogram?
3. What is a histogram? What are the different types of histograms?
A histogram is a type of column statistic which provides more information about the data
distribution in a table column.
Pre-12c, there were only 2 types of histograms:

Height based,

Frequency based

From 12c, another two types of histograms were introduced, apart from the above two:

Top N frequency

Hybrid

5. What is the need for histograms?

By default the oracle optimizer assumes that data is distributed uniformly, and it
calculates cardinality accordingly.

But if the data is non-uniform, i.e. skewed, the cardinality estimate will be wrong. This
is where histograms come into the picture: they help in calculating the correct
cardinality of the filter or predicate columns.

-- When no histograms are present:


SQL> select column_name,num_distinct,num_buckets,histogram,density from
dba_tab_col_statistics where table_name='HIST_TEST';

COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM DENSITY


----------------------- ------------ ----------- --------------- ----------
OBJECT_NAME 107648 1 NONE 9.2895E-06
OWNER 120 1 NONE .008333333
OBJECT_TYPE 48 1 NONE .020833333
OBJECT_ID 184485 1 NONE 5.4205E-06

SQL> SELect count(*) from hist_test where object_type='TABLE';

COUNT(*)
----------

17933

Execution Plan
----------------------------------------------------------
Plan hash value: 3640793332
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 1 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
|* 2 | INDEX RANGE SCAN| TEST_IDX2 | 3843 | 34587 | 1 (0)| 00:00:01 | ---
optimizer estimates incorrect number of rows.
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):


---------------------------------------------------

2 - access("OBJECT_TYPE"='TABLE')

--- Gather stats again:

SQL>
SQL> BEGIN
DBMS_STATS.GATHER_TABLE_STATS (
ownname => 'SYS',
tabname => 'HIST_TEST',
cascade => true, ---- For collecting stats for respective indexes
granularity => 'ALL',
estimate_percent =>dbms_stats.auto_sample_size,
degree => 8);
END;
/ 2 3 4 5 6 7 8 9 10

PL/SQL procedure successfully completed.

SQL> SELect count(*) from hist_test where object_type='TABLE';

COUNT(*)
----------
17933

Execution Plan
----------------------------------------------------------
Plan hash value: 3640793332

-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 1 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
|* 2 | INDEX RANGE SCAN| TEST_IDX2 | 17933 | 157K| 1 (0)| 00:00:01 | ---
Optimizer using the exact number of rows.
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):


---------------------------------------------------
2 - access("OBJECT_TYPE"='TABLE')

SQL> select column_name,num_distinct,num_buckets,histogram,density from


dba_tab_col_statistics where table_name='HIST_TEST';

COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM DENSITY


----------------------- ------------ ----------- --------------- ----------
OBJECT_NAME 107648 1 NONE 9.2895E-06
OWNER 120 1 NONE .008333333
OBJECT_TYPE 48 48 FREQUENCY 2.7102E-06 --- Frequency histogram is present
OBJECT_ID 184485 1 NONE 5.4205E-06

6. What changes to histograms were introduced in the 12c version? And
can you explain some problems with 11g histograms?
In 12c two new histograms have been introduced: TOP N frequency and Hybrid.

7. Explain how oracle decides which histogram to create?


NDV -> Number of distinct values

n – >  number of buckets.

 
8. When does oracle create a histogram?
When we run the gather stats command with method_opt set to AUTO, oracle checks the
sys.col_usage$ table to see whether columns of the table have been used as a join or
predicate (i.e. in a where clause or join). If such columns are present, oracle creates
histograms for them (only for the columns having skewed data).

i.e. if a column has not been used in a join or where clause of any query, then even if
we run gather stats, a histogram will not be created for it.

Lets see the demo:

--- HISTO_TAB is a new table.

SQL> select column_name,num_distinct,num_buckets,histogram,density from


dba_tab_col_statistics where table_name='HISTO_TAB';

no rows selected

-- Run stats:

SQL> SQL> BEGIN


DBMS_STATS.GATHER_TABLE_STATS (
ownname => 'SYS',
tabname => 'HISTO_TAB',
cascade => true, ---- For collecting stats for respective indexes
granularity => 'ALL',
estimate_percent =>dbms_stats.auto_sample_size,
degree => 8);
END;
/
2 3 4 5 6 7 8 9 10

PL/SQL procedure successfully completed.

SQL> SQL> SQL> select column_name,num_distinct,num_buckets,histogram,density from


dba_tab_col_statistics where table_name='HISTO_TAB';

COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM DENSITY


----------------------- ------------ ----------- --------------- ----------
LN_NUM 9398 254 NONE .000034
LN_NUM2 137 1 NONE .00729927
MODIFICATION_NUM 478 1 NONE .00209205
ORDER_ID 637056 1 NONE 1.5697E-06

-- Now run a query with a column as predicate.

SQL> select count(*) from HISTO_TAB where ln_Num=1;

COUNT(*)
----------
625700

SQL> select dbms_stats.report_col_usage(OWNNAME=>'SYS',TABNAME=>'HISTO_TAB') from


dual;

DBMS_STATS.REPORT_COL_USAGE(OWNNAME=>'SYS',TABNAME=>'HISTO_TAB')
--------------------------------------------------------------------------------
LEGEND:
.......

EQ : Used in single table EQuality predicate


RANGE : Used in single table RANGE predicate
LIKE : Used in single table LIKE predicate
NULL : Used in single table is (not) NULL predicate
EQ_JOIN : Used in EQuality JOIN predicate
NONEQ_JOIN : Used in NON EQuality JOIN predicate
FILTER : Used in single table FILTER predicate
JOIN : Used in JOIN predicate
GROUP_BY : Used in GROUP BY expression
...............................................................................

###############################################################################

COLUMN USAGE REPORT FOR SYS.HISTO_TAB


.....................................

1. LN_NUM : EQ --- >>

###############################################################################

-- Run stats again:


SQL> SQL> BEGIN
DBMS_STATS.GATHER_TABLE_STATS (
ownname => 'SYS',
tabname => 'HISTO_TAB',
cascade => true, ---- For collecting stats for respective indexes
granularity => 'ALL',
estimate_percent =>dbms_stats.auto_sample_size,
degree => 8);
END;
/
2 3 4 5 6 7 8 9 10

PL/SQL procedure successfully completed.

SQL> SQL> select column_name,num_distinct,num_buckets,histogram,density from


dba_tab_col_statistics where table_name='HISTO_TAB';

COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM DENSITY


----------------------- ------------ ----------- --------------- ----------
LN_NUM 9398 254 HYBRID .000034. --- >> THIS ONE
LN_NUM2 137 1 NONE .00729927
MODIFICATION_NUM 478 1 NONE .00209205
ORDER_ID 637056 1 NONE 1.5697E-06
SQL> SQL>

64. What is the significance of the method_opt parameter in the gather
stats command?
The method_opt parameter of the dbms_stats procedures controls the below things.

1. Creation of histograms.

2. Creation of extended statistics.

3. On which columns base statistics will be gathered.

Syntax ->

FOR ALL [INDEXED|HIDDEN] COLUMNS SIZE [size_clause]

First part (controls collection of base statistics for columns):
FOR ALL COLUMNS - gathers base statistics for all columns

FOR ALL INDEXED COLUMNS - gathers base statistics only for columns included in indexes

FOR ALL HIDDEN COLUMNS - gathers base statistics only for virtual columns

Second part (the SIZE clause controls the creation of histograms):

In the size_clause we define the number of buckets. The maximum is 254 buckets (2048
from 12c).

AUTO means -> oracle will create histograms automatically as per the usage recorded in
the col_usage$ table.

If we say method_opt => 'for all columns size 1', it means base statistics will be
collected for all columns, and as the bucket size is 1, histograms will not be created.
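
A few typical method_opt settings, sketched (table and column names are hypothetical):

```sql
-- let oracle decide histograms based on column usage (the default)
exec dbms_stats.gather_table_stats(user, 'EMP', method_opt => 'FOR ALL COLUMNS SIZE AUTO');

-- base statistics only, no histograms at all
exec dbms_stats.gather_table_stats(user, 'EMP', method_opt => 'FOR ALL COLUMNS SIZE 1');

-- force a histogram with up to 254 buckets on one skewed column
exec dbms_stats.gather_table_stats(user, 'EMP', method_opt => 'FOR COLUMNS dept_id SIZE 254');
```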

9. What are  extended statistics and expression statistics?


10. How do statistics become stale?
By default the stale_percent preference is set to 10%. So when 10 percent of the rows of
a table get changed, its statistics become stale.
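
The threshold can be inspected and changed per table, sketched below (EMP is a hypothetical table name):

```sql
-- current stale threshold for one table
select dbms_stats.get_prefs('STALE_PERCENT', user, 'EMP') from dual;

-- lower it so stats go stale after 5% of rows change
exec dbms_stats.set_table_prefs(user, 'EMP', 'STALE_PERCENT', '5');
```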

11. Is there any way we can configure the database so that the
statistics will never go stale (despite having thousands of
transactions)?
From 19c onwards it is possible. 19c introduced real-time statistics, meaning that for
DML activities the statistics are collected in real time (so stale stats never occur).

This feature is controlled by the OPTIMIZER_REAL_TIME_STATISTICS parameter (TRUE/FALSE).


12. For large schemas, gather stats can take a lot of time. How can
we speed up the gather statistics process?
We can use the DEGREE parameter to create multiple parallel threads to increase the
speed. The other option is concurrent statistics gathering: if we set the CONCURRENT
preference, statistics will be gathered for multiple tables of a schema at the same time
(by creating multiple scheduler jobs).

We can combine both options to speed things up (but be aware that this can increase the
load on the system).
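
Both knobs together, sketched (SCOTT is a hypothetical schema name):

```sql
-- gather several tables at once via scheduler jobs
exec dbms_stats.set_global_prefs('CONCURRENT', 'ALL');

-- and parallelize each individual gather with 8 slaves
exec dbms_stats.gather_schema_stats('SCOTT', degree => 8);
```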

GENERIC:
1. What is an ITL?
2. Let's say a user wants to update 50 rows in a table (i.e. one block);
how many ITL slots are required?
We need one ITL slot per transaction, i.e. ITL slots are not allocated per row. Each slot
is for one transaction id, so a single transaction needs a single slot regardless of how
many rows it touches.

34. What is a deadlock in oracle?

A deadlock occurs when session A holds a resource requested by session B while, at the
same time, session B holds a resource requested by session A.

This error usually occurs due to bad application design.

34. What will be the impact if we compile a package/procedure
during peak business hours?
34. A database has SGA of 20G and a user ran a select query on a
big table, whose size is of 100 GB, will the user get the data?
34. What is the significance of
DB_FILE_MULTIBLOCK_READ_COUNT?
This parameter specifies how many blocks will be fetched with each I/O operation. It
allows full scan operations to complete faster.

34. What is db time and elapsed time in AWR?


34. I ran one insert statement in a session but didn't commit it.
Then I ran one DDL statement (an alter table statement) in the same
session. Once the alter was done, I closed the session. So now
what will happen to the insert transaction?
DDL statements cause an implicit commit. So the insert statement was committed when the
DDL statement ran.

34. As we know, awr reports only a few top sql queries. But I
have a requirement that a specific sql query should be reported in
the awr report, whether it is a top sql or not. Can we do that?
Yes, we can do this. We need to get the sql_id of the sql query and mark it as colored
using the dbms_workload_repository package.
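
Sketched below; the sql_id value is a placeholder:

```sql
-- force this statement into every AWR snapshot, top SQL or not
exec dbms_workload_repository.add_colored_sql('0abc123def456');

-- undo it later
exec dbms_workload_repository.remove_colored_sql('0abc123def456');
```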

34. Dynamic sampling?


34. What are the different levels of sql trace?
10046 - sql trace (level 1: basic, level 4: with bind values, level 8: with waits,
level 12: binds and waits)

10053 - optimizer trace

34. What is the optimizer_mode parameter?

ALL_ROWS

FIRST_ROWS

FIRST_ROWS_n (n = 1, 10, 100, or 1000)
34. Is it true that parallel query scans use direct path reads,
bypassing the buffer cache?
Yes, parallel scans use direct path reads, bypassing the data buffer, so they don't have
to worry about fetching blocks from the buffer cache.

But what if there is a dirty buffer in the buffer cache which has not been written to
disk? If the direct read happened straight from disk, the parallel query would give wrong
results. To fix this, the parallel query first issues a segment checkpoint, so that dbwr
writes all the dirty buffers of that segment to disk.

Note that with a parallel scan, because direct path reads happen, the
db_file_multiblock_read_count parameter will be used optimally.

34. Difference between freelist management and automatic
segment space management (ASSM)?
 

35. What is a direct path read?

In a direct path read, the server process reads the data block from disk directly into
process-private memory, i.e. the PGA, bypassing the SGA.

Direct path reads happen during the below activities:

 Reads from a temporary tablespace.
 Parallel queries.
 Reads from a LOB segment.
36. What is a bind? How does the use of binds help?
A bind variable is a placeholder in a sql statement, which can be replaced with any valid
value. Usually the application triggers queries with bind variables and the bind values
are passed at run time.

select * from emp_name where emp_id=:1

The advantage is that binds allow sharing of parent cursors in the library cache, i.e.
they help avoid hard parsing.

37. What is bind peeking? What is the problem with bind peeking?
What did oracle do to fix this issue?
The problem with binds was that the query optimizer doesn't know the literal values. So,
especially when we use range predicates like < and > or between, the best plan may vary
with the literal values passed in the where clause (i.e. full table scan versus use of an
index), and the optimizer might produce wrong estimates when using binds.

To overcome this, the bind peeking concept was introduced in oracle 9i. (As per the
dictionary, peeking means to glance quickly.) With this, before the execution plan is
prepared, the optimizer peeks into the literal values and uses them to prepare the
execution plan. Now that the optimizer is aware of the literal values, it can produce a
good execution plan.

But the problem with bind peeking was that the optimizer peeks into the literal values
only on the first execution, and subsequent executions use the same execution plan as the
first one, even though a better plan might exist for them.

To reduce this problem, oracle introduced adaptive cursor sharing in 11g.

38. What is adaptive cursor sharing?

We can call it bind-aware cursor sharing. With this feature, on every parse the optimizer
estimates the selectivity of the predicate, and based on that the right child cursor is
used/shared.

Adaptive cursor sharing was introduced in oracle 11g. It kicks in when the application
query uses binds or when cursor_sharing is set to FORCE.
39. What is the significance of the cursor_sharing parameter? What is
the default one?
Below are the 3 possible values of the cursor_sharing parameter. The default is EXACT
(SIMILAR is deprecated in recent releases).

EXACT

FORCE

SIMILAR

40. Can I run sql tuning advisor on a standby database?

Yes, we can, using the database_link_to parameter while creating the tuning task.

A public database link needs to be created on the primary with the user sys$umf,
pointing to the primary db itself.

41. Can I get the sql_id of a query before it is executed?

From 18c onwards you can set "feedback only sql_id" in sql*plus and run the query:

SQL> set feedback only sql_id

SQL> select * from dba_raj.test where owner ='public';

1 row selected.

SQL_ID: 11838osd6slfjh88

42. How can we improve the speed of a dbms_redef operation?

You can run the below alter session commands before starting the dbms_redef process.

ALTER SESSION FORCE PARALLEL DML PARALLEL 8;

ALTER SESSION FORCE PARALLEL QUERY PARALLEL 8;

43. How can I encrypt an existing table without downtime?

You can use the dbms_redef method to do this online.

44. What is the difference between PID and SPID in v$process?

PID -> an oracle internal counter, which oracle uses for its own purposes. For every new
process, PID increments by 1.

SPID -> the operating system process id. SPID is what DBAs mostly use for tracing or
killing sessions.

45. Do you know where awr data is stored?

It is stored in the SYSAUX tablespace.
46.  What are the different access methods in explain plan.

47. How many types of joins does oracle have?

NESTED LOOP:
Nested loops joins are useful when joining small subsets of data. For every row in the
first table (the outer table), Oracle accesses all the rows in the second table (the
inner table) looking for a match.

HASH JOIN:
Hash joins are used for joining large data sets. The optimizer uses the smaller of the
two tables or data sources to build a hash table, based on the join key, in memory. It
then scans the larger table and performs the same hashing algorithm on the join
column(s). It then probes the previously built hash table for each value and, if they
match, it returns a row.

SORT MERGE JOIN:
Sort merge joins are useful when the join condition between two tables is an inequality
condition such as <, <=, >, or >=. Sort merge joins can perform better than nested loop
joins for large data sets. The join consists of two steps: sort both inputs on the join
key, then merge the sorted lists together.

REFERENCE -> https://sqlmaria.com/2021/02/02/explain-the-explain-plan-join-
methods/
48.  What is lost write in oracle?
49. Which checks are performed in the parsing phase of a query?
SYNTAX CHECK: checks whether the syntax is correct or not.

SEMANTIC CHECK: checks whether the user has permission to access the table and whether
the table and column names are correct.

SHARED POOL CHECK: checks whether to do a soft parse or a hard parse. To explain in
detail: whenever a sql query comes in, the db calculates the hash value of that statement
and then searches for that hash value in the shared pool (specifically the shared sql
area). If it is already present, oracle tries to reuse the same execution plan; in that
case we call it a soft parse. If the hash value is not found, or the existing plan cannot
be reused, then a hard parse happens.

50. What are some of the os commands to monitor an oracle server?

TOP:
Provides information like load average, cpu and memory usage, and the processes
consuming the most cpu.

VMSTAT (virtual memory statistics):
Details like cpu run queues, swap, disk operations per second, paging details and load
information.
usage -> vmstat 5 10

vmstat -a (to get active and inactive memory information)

SAR (for linux):

System Activity Report -> can be used to report cpu usage (system, user, idle) and
iowaits. Apart from that, we can get historical reports from sar using a command like

sar -u -s 06:30:00 -e 07:15:00

It is also recommended to install oswatcher on the servers. Using the files oswatcher
collects, we can generate different graphs for a particular duration.
51. What is ASH?
Every second, Active Session History polls the database to identify the active sessions
and dumps relevant information about each of them. A session is deemed active if it is
consuming cpu or waiting on a non-idle wait event.

We can view this information in gv$active_session_history. After a period, the data is
flushed to dba_hist_active_session_history.

52.  Explain about ORA-1555 snapshot too old error. What is the
action plan for this error?
 

53. Explain about the ORA-4031, unable to allocate shared memory
error?
WAIT EVENTS:
1. When does the log file sync wait event occur?
When a user session commits or rolls back. When the user commits, LGWR writes the
contents of the redo log buffer to the online redo log files, and once the writing is
done it posts the confirmation back to the user. While the user session is waiting for
that confirmation from LGWR, it waits on the log file sync event.

If the application performs a lot of commits, log file sync can be a top wait event in
AWR.

This wait event can be reduced by checking with the application whether unnecessary
commits can be removed and commits done in batches.

Slow I/O to the redo disks can also cause this issue.

2. What do you know about the log file parallel write wait event?

This wait event occurs while the log buffer contents are being written to the redo log
files. It might be the top event when the application commits a lot or the database is
in hot backup mode. Slow I/O to the redo disks can also cause this.

One mitigation is to increase the log buffer size and reduce the number of commits. Also
try to avoid keeping the database in hot backup mode.

3. How can you fix the log file switch (checkpoint incomplete) wait
event?
To complete a checkpoint, dbwr must write every associated dirty buffer to disk, and
every datafile and controlfile must be updated with the latest checkpoint number.

Let's say LGWR finished writing to log file 2 and is ready to switch to log file 1 and
start writing. However, DBWR is still writing checkpoint-related dirty buffers belonging
to logfile 1 to disk. LGWR cannot overwrite logfile 1 until that checkpoint is completed
(i.e. until dbwr has finished its writing).

To fix this, increase the redo log size or add more log groups.

4. log buffer space?


4. What are some buffer cache related wait events?
db file parallel read

db file sequential read

db file scattered read

free buffer wait

latch cache buffer chain

latch cache buffer lru chain

5. When does the db file scattered read wait event occur?

We can call this a multiblock read. While doing a full table scan or an index fast full
scan, a lot of blocks need to be fetched from disk into the buffer cache, and they end up
in the buffer cache in a scattered manner (because these buffers in the buffer cache will
not be contiguous). During this activity the process needs to find a lot of free
buffers, which leads to this wait event.

Solution:

Avoid full table scans where possible.

For large tables, consider using partitioning.

Check whether proper indexing is in place to avoid full scans.

6. What is the db file parallel read wait event?


 

7. What is the db file sequential read wait event?

Sequential read means single-block read. This is largely due to index scans, for example
an index full scan or an index scan with an order by clause.

When the rows in the table are in random order, the clustering factor will be very high,
i.e. more table blocks need to be visited to fetch the rows pointed to by the index
blocks. Also, when index blocks are fragmented, more blocks need to be visited, which
can cause this wait event.

9. redo allocation latch contention?


10.  redo copy latch contention?
8. What is buffer busy wait? What should be the approach to fix it?
Buffer busy waits are due to excessive logical I/O.

This wait happens when a server process has got the latch on the hash bucket, but
another session is holding the block in the buffer (either it is writing to that block,
or the buffer is being flushed to disk). A related wait is known as read by other
session.

To troubleshoot this issue, first we need to find the responsible object. We can get the
p1 (file_id), p2 (block), p3 (reason) details of the wait from v$session_wait, and by
using the p1 and p2 values we can get the segment details from dba_extents.

Then we need to check what type of segment it is.

If it is undo -> we need to increase the size of the undo tablespace.

If it is a data block -> the query might be inefficient, or we may need to spread the
hot rows across more blocks (for example by increasing PCTFREE).
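
The lookup described above can be sketched as follows (the bind values are the p1/p2 numbers read from the wait):

```sql
-- what the session is waiting on right now
select sid, event, p1 as file_id, p2 as block_id, p3 as reason
from   v$session_wait
where  event = 'buffer busy waits';

-- map file#/block# to the owning segment
select owner, segment_name, segment_type
from   dba_extents
where  file_id = :p1
and    :p2 between block_id and block_id + blocks - 1;
```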

11. LATCH: CACHE BUFFER CHAINS (CBC) wait event?

Each buffer in the buffer cache has an associated element in the buffer header array
(these details are visible in v$bh).

The buffer header array is allocated in the shared pool. These buffer headers store
attribute and status details of the buffers. They are chained together using a doubly
linked list and linked to a hash bucket. There are multiple hash buckets, and these
buckets or chains are protected by the cache buffers chains latch.

If a process wants to search or modify a buffer chain, it first needs to get the latch,
i.e. the CBC latch.

So when multiple users want to access the same block, or another block on the same hash
bucket, there will be contention for the latch. When the contention is severe (i.e.
processes are finding it difficult to get the latch), this LATCH: CBC wait event occurs.

It is not only caused by accessing the same block: when simultaneous inserts and updates
run on the same block, cloned buffer copies also get attached to the same hash bucket,
increasing the chain length. So a process, after getting the latch, takes more time to
scan through the long chain.

This wait event occurs when logical I/O is high, so to avoid it we need to find a way to
reduce the logical I/O.

Solution:

The solution depends on which query is causing the issue.

If the issue is due to many users accessing the same data set, we can increase the
PCTFREE of the table/index, so that there are fewer rows per block and the data spreads
across multiple chains.

9.  LATCH: CACHE BUFFER LRU CHAINS?


This wait event occurs when physical I/O is high.

When a requested block is not available in the buffer cache, the server process needs to
read the block from disk into the buffer cache. For that it needs a free buffer, so the
server process searches the LRU list to find one. To search or modify the LRU list, it must
first acquire a latch, called the cache buffers lru chain latch.

Also, when DBWR writes buffers to disk, it first scans the LRU list to find the dirty
buffers that need to be flushed. For this it also needs to acquire the same latch.

Below are some of the activities which are responsible for this wait event.

Small buffer cache: dirty blocks need to be written to disk very frequently, which
increases the latch contention.

Many large full table scans: the server process needs to find a lot of free buffers to read
the data from disk, so the latch must be acquired very often.
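To see whether full scans dominate the workload, the instance-wide statistics can be checked; this is a sketch using standard v$sysstat counters:

-- High 'table scans (long tables)' relative to short-table scans suggests
-- missing indexes or queries doing more multiblock I/O than necessary.
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('table scans (long tables)',
                'table scans (short tables)',
                'physical reads');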

8. What is the free buffer waits wait event?

A server process waits on this event when it needs a free buffer to read a block into the
buffer cache but cannot find one, and must wait for DBWR to flush dirty buffers to disk.

Reasons:
 DBWR is unable to keep up with the write workload
 The buffer cache is too small
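A quick way to quantify the problem and estimate the benefit of a larger cache, as a sketch using standard dynamic views:

-- Accumulated free buffer waits since instance startup.
SELECT event, total_waits, time_waited
FROM   v$system_event
WHERE  event = 'free buffer waits';

-- The buffer cache advisory estimates physical reads at other cache sizes
-- (requires DB_CACHE_ADVICE to be enabled).
SELECT size_for_estimate, estd_physical_reads
FROM   v$db_cache_advice
WHERE  name = 'DEFAULT';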
10. Why does the wait event enq: TX – allocate ITL entry occur and
how do you fix it?
The Interested Transaction List (ITL) is an internal structure in each block. When a process
wants to change data in a block, it needs to get an empty ITL slot to record that its
transaction is interested in modifying the block. Once the initial slots are used up, new
slots are carved out of the free space in the block. The initial number of ITL slots is
controlled by INITRANS; its default value is 1 for a table and 2 for an index.

So when concurrent DMLs run against a block and there is no free ITL slot (and no free
space left to create one), sessions wait for a free ITL slot, which causes this wait event.
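To find which segments are actually suffering from ITL waits before changing anything, the segment-level statistics can be queried; a sketch:

-- Segments with ITL waits recorded since instance startup.
SELECT owner, object_name, value AS itl_waits
FROM   v$segment_statistics
WHERE  statistic_name = 'ITL waits'
AND    value > 0
ORDER  BY value DESC;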

To fix this, we can increase the INITRANS value; if the issue is still present, increase
PCTFREE as well.

Steps to increase INITRANS:
ALTER TABLE EMPLOYEE INITRANS 60;
The problem is that the new value applies only to new blocks. To apply the change to
all existing blocks, we need to move the table and rebuild the dependent indexes.

ALTER TABLE EMPLOYEE MOVE ONLINE;

ALTER INDEX EMP_ID REBUILD ONLINE;

If the issue is still present after updating INITRANS, then update PCTFREE and move the
table and rebuild the indexes again.

ALTER TABLE EMPLOYEE PCTFREE 20;


ALTER TABLE EMPLOYEE MOVE ONLINE;

ALTER INDEX EMP_ID REBUILD ONLINE;

11. What is enq: TX – index contention?


When many insert and delete operations happen on a table, contention can arise on the
respective indexes.

This is because inserts cause index block splits, and if other transactions access the
index during a split, they have to wait.

There are a few solutions that can be applied, but any change should be tested
thoroughly first.

1. Create a reverse key index (this helps insert-heavy workloads, but
might hurt index range scans)
2. Increase the PCTFREE value (this reduces index block splits)
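The two options above can be sketched as follows (EMPLOYEE, EMP_ID and the PCTFREE value are illustrative names and settings):

-- 1. Reverse key index: reverses the key bytes so sequential values
--    spread across many leaf blocks, reducing right-hand block splits.
--    Note: this disables index range scans on the key.
CREATE INDEX emp_id_rev ON employee (emp_id) REVERSE;

-- 2. Rebuild an existing index with a higher PCTFREE to leave free space
--    in each leaf block at build time and reduce splits.
ALTER INDEX emp_id REBUILD PCTFREE 20 ONLINE;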
12. What are different types of mutex wait events?
 cursor: mutex X
 cursor: mutex S
 cursor: pin S
 cursor: pin X
 cursor: pin S wait on X
 library cache: mutex X
 library cache: mutex S
13. What causes the library cache lock wait event?
Below are some reasons that can cause this wait event.

Reasons
 Small shared pool
 Poor cursor sharing, i.e. too much hard parsing
 Library cache objects being invalidated and reloaded frequently
 A huge number of child cursors for a parent cursor (i.e. a high version count)
 Application queries not using bind variables
 DDL run during busy activity
Solutions:
 Increase the shared pool size
 Always run critical DDLs during non-busy hours
 Avoid object compilations during peak time
 Check for queries using literals (check the CURSOR_SHARING parameter)
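To spot the high-version-count and literal-SQL symptoms listed above, something like the following can be used (the version_count threshold is illustrative; SHOW PARAMETER is SQL*Plus syntax):

-- SQL with an unusually high number of child cursors, often caused by
-- bind mismatches or literal-heavy workloads.
SELECT sql_id, version_count, executions
FROM   v$sqlarea
WHERE  version_count > 50
ORDER  BY version_count DESC;

-- Current cursor sharing setting (EXACT / FORCE).
SHOW PARAMETER cursor_sharing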
 

14. Can a full table scan cause db file sequential read?


Yes, it can happen if the table has chained or migrated rows: the scan itself uses
multiblock I/O (db file scattered read), but fetching the continuation piece of a chained
row requires an extra single-block read, which shows up as db file sequential read.
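To check whether a table actually has chained rows, the CHAIN_CNT column can be inspected; as a sketch (EMPLOYEE is an illustrative table name, and note that CHAIN_CNT is populated by ANALYZE, not by DBMS_STATS):

-- ANALYZE populates CHAIN_CNT in USER_TABLES.
ANALYZE TABLE employee COMPUTE STATISTICS;

SELECT table_name, num_rows, chain_cnt
FROM   user_tables
WHERE  table_name = 'EMPLOYEE';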
