
August 2009

Master of Science in Information Technology (MScIT-NEW) – Semester 3


MT0046 – Oracle9i – 4 Credits

(Book ID: B0894)


Assignment Set – 1 (40 Marks)

[Book ID: B0894]

Each question carries 5 marks

1. Mention the different modular database application programs in PL/SQL

Structure of PL/SQL-Blocks
PL/SQL is a block-structured language. Each block builds a (named) program unit, and blocks can be nested.
Blocks that build a procedure, a function, or a package must be named. A PL/SQL block has an optional declare
section, a part containing PL/SQL statements, and an optional exception-handling part. Thus the structure of a
PL/SQL block looks as follows (brackets [ ] enclose optional parts):
[<Block header>]
[declare
<Constants>
<Variables>
<Cursors>
<User defined exceptions>]
begin
<PL/SQL statements>
[exception
<Exception handling>]
end;

The block header specifies whether the PL/SQL block is a procedure, a function, or a package.

If no header is specified, the block is said to be an anonymous PL/SQL block. Each PL/SQL block again builds a
PL/SQL statement. Thus blocks can be nested like blocks in conventional programming languages. The scope of
declared variables (i.e., the part of the program in which one can refer to the variable) is analogous to the scope
of variables in programming languages such as C or Pascal.
Declarations
Constants, variables, cursors, and exceptions used in a PL/SQL block must be declared in the declare section of
that block. Variables and constants can be declared as follows:
<variable name> [constant] <data type> [not null] [:= <expression>];
Valid data types are SQL data types (see Section 1.1) and the data type boolean. Boolean data may only be true,
false, or null. The not null clause requires that the declared variable must always have a value different from
null. <expression> is used to initialize a variable.
If no expression is specified, the value null is assigned to the variable. The clause constant states that once a
value has been assigned to the variable, the value cannot be changed (thus the variable becomes a constant).
Example:
declare
hire_date date; /* implicit initialization with null */
job_title varchar2(80) := 'Salesman';
emp_found boolean; /* implicit initialization with null */
salary_incr constant number(3,2) := 1.5; /* constant */
...
begin . . . end;
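
As an illustration of the three kinds of named (modular) blocks, here is a minimal sketch of a procedure, a function, and a package; the names raise_salary, get_salary, emp_pkg and the emp table are hypothetical:

create or replace procedure raise_salary(p_empno number, p_incr number) is
begin
  /* a procedure: performs an action, returns no value */
  update emp set sal = sal + p_incr where empno = p_empno;
end raise_salary;
/
create or replace function get_salary(p_empno number) return number is
  v_sal number;
begin
  /* a function: computes and returns a single value */
  select sal into v_sal from emp where empno = p_empno;
  return v_sal;
end get_salary;
/
create or replace package emp_pkg is
  /* a package: groups related procedures and functions under one name */
  procedure raise_salary(p_empno number, p_incr number);
  function get_salary(p_empno number) return number;
end emp_pkg;
/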

2. Write the syntax of creating PL/SQL trigger

Use the CREATE TRIGGER statement to create and enable a database trigger, which is
• A stored PL/SQL block associated with a table, a schema, or the database or
• An anonymous PL/SQL block or a call to a procedure implemented in PL/SQL or Java

Oracle automatically executes a trigger when specified conditions occur.


When you create a trigger, Oracle enables it automatically. You can subsequently disable and enable a
trigger with the DISABLE and ENABLE clauses of the ALTER TRIGGER or ALTER TABLE statement.

Prerequisites
Before a trigger can be created, the user SYS must run a SQL script commonly called
DBMSSTDX.SQL. The exact name and location of this script depend on your operating system.
• To create a trigger in your own schema on a table in your own schema or on your own schema
(SCHEMA), you must have the CREATE TRIGGER privilege.
• To create a trigger in any schema on a table in any schema, or on another user's schema
(schema.SCHEMA), you must have the CREATE ANY TRIGGER privilege.
• In addition to the preceding privileges, to create a trigger on DATABASE, you must have the
ADMINISTER DATABASE TRIGGER system privilege.
If the trigger issues SQL statements or calls procedures or functions, then the owner of the trigger must
have the privileges necessary to perform these operations. These privileges must be granted directly to
the owner rather than acquired through roles.

Syntax
create_trigger::=  dml_event_clause::=  referencing_clause::=
(These three syntax diagrams from the original reference do not reproduce in this text-only format.)
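
In outline, the diagrams correspond to the following simplified textual form (a sketch covering the common DML-trigger case; the full grammar has more options than shown):

CREATE [OR REPLACE] TRIGGER [schema.]trigger
  {BEFORE | AFTER | INSTEAD OF}
  {DELETE | INSERT | UPDATE [OF column [, column]...]}
    [OR {DELETE | INSERT | UPDATE [OF column...]}]...
  ON [schema.]{table | view}
  [REFERENCING {OLD [AS] old_name | NEW [AS] new_name}...]
  [FOR EACH ROW]
  [WHEN (condition)]
pl/sql_block | CALL procedure_name(...)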
Semantics

OR REPLACE
Specify OR REPLACE to re-create the trigger if it already exists. Use this clause to change the
definition of an existing trigger without first dropping it.

schema
Specify the schema to contain the trigger. If you omit schema, then Oracle creates the trigger in your
own schema.

trigger
Specify the name of the trigger to be created.
If a trigger produces compilation errors, then it is still created, but it fails on execution. This means it
effectively blocks all triggering DML statements until it is disabled, replaced by a version without
compilation errors, or dropped. You can see the associated compiler error messages with the SQL*Plus
command SHOW ERRORS.
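
As an illustration of the semantics above, a minimal row-level trigger (the emp and emp_audit tables are hypothetical):

create or replace trigger emp_audit_trg
after update of sal on emp
for each row
begin
  /* record each salary change; :old and :new reference the row values */
  insert into emp_audit (empno, old_sal, new_sal, changed_on)
  values (:old.empno, :old.sal, :new.sal, sysdate);
end;
/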

3. Differentiate between shared pool and large pool of oracle memory component

Introduction
The Oracle shared pool provides critical services for sharing complex objects among large
numbers of users. DBAs often spend much time learning about shared pool memory
management and configuring and tuning shared pool usage.
The Automatic Shared Memory Management (ASMM) feature introduced in 10g Release 1 eased
this problem by providing the DBA a simple, automatic self-tuning mechanism for configuring the
shared memory components of the SGA, including the buffer cache and shared pool.
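
As a sketch, ASMM is enabled by setting a single target parameter (the 512M value is illustrative only; this applies from 10g onwards, not to 9i itself):

-- A nonzero SGA_TARGET lets Oracle size the shared pool, buffer cache,
-- large pool and Java pool automatically within the target.
ALTER SYSTEM SET SGA_TARGET = 512M SCOPE = BOTH;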

Purpose of the Shared Pool


The shared pool was introduced as a feature of the Oracle database in version 7, primarily as a
repository for shared SQL and PL/SQL. Although it has evolved since then, it still serves in
its primary role as a cache for cursors shared between sessions connected to the database.
At the most fundamental level, the shared pool is a metadata cache. Whereas the buffer
cache is almost completely utilized for the caching of data, the shared pool is used for
caching complex objects describing where the data is stored, how it relates to other data and
how it can be retrieved.
Much of the shared pool usage is to support the execution of shared SQL and PL/SQL
packages; but in order to build a cursor or compile a PL/SQL procedure, we need to know all
about the database objects referenced by the SQL or PL/SQL being compiled. For example,
in order to build a cursor for a simple select statement from a table, we need to know
metadata about the table, including column names, data types, indexes and optimizer
statistics. All of this additional metadata is also cached in the shared pool, independently of
the cursors or program units.

Components of the Shared Pool


Generally, what is of far greater interest to the DBA are the allocations of memory that can be
aged in and out of the cache, since building new objects in the shared pool is expensive and
impacts scalability and performance. Memory allocations for objects that can be rebuilt are
sometimes referred to as 'recreatable'.

V$SGASTAT
The easiest way to examine the contents of the shared pool at a given time is to query the fixed
view V$SGASTAT. Since Oracle keeps running totals of component memory allocations in the
shared pool, selecting from this view is inexpensive and will not impact the production system.
This view has been enhanced in 10g Release 2 to provide a finer granularity of memory
allocation data.
Example:

SELECT *
FROM (SELECT name, bytes/(1024*1024) mb
      FROM v$sgastat
      WHERE pool = 'shared pool'
      ORDER BY bytes DESC)
WHERE ROWNUM <= 10;

NAME                           MB
--------------------------  ----------
ktcmvcb                      6856.96056
KGH: NO ACCESS               1610.02487
sql area                      571.81263
free memory                  502.999672
CCursor                      301.799118
PCursor                       176.69886
library cache                 112.56636
kglsim object batch          77.3775787
gcs resources                75.5597076
sql area:PLSQL               51.2854691

Multiple selects from V$SGASTAT will show that some components remain constant in size.
Large Pool
The large pool exists to reduce the strain on the shared pool. The following will use the large pool if it is
configured:
• Shared server processes – the session UGAs are stored here
• Parallel execution servers – if you have enabled parallel query, the parallel servers communicate
via the large pool
• I/O slave processes – if you have enabled DB writer slaves or tape I/O slaves
• RMAN – the RMAN channel processes use the large pool for caching buffers during backup and
restore operations

Large pool configuration: show parameter large_pool_size;
Large pool size: select * from v$sgastat where pool = 'large pool';
Java Pool
Java stored procedures are loaded from the data dictionary into the shared pool; the purpose of the Java
pool is to provide room for the runtime memory structures used by a Java application. It is controlled
by three instance parameters:
• JAVA_POOL_SIZE – dynamically sizes the Java pool
• JAVA_MAX_SESSIONSPACE_SIZE – maximum amount of space allowed for any one session (static)
• JAVA_SOFT_SESSIONSPACE_LIMIT – a message is written to the trace file when the limit is
breached, which can help with monitoring (static)

Java pool configuration: show parameter java_pool_size;
Java pool size: select * from v$sgastat where pool = 'java pool';
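
To compare the pools side by side, the totals can be aggregated from V$SGASTAT (a sketch; rows with a null pool column belong to fixed SGA components outside the pools):

select pool, sum(bytes)/(1024*1024) mb
from v$sgastat
group by pool;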

4. How background processes are useful in oracle database? Explain

To maximize performance and accommodate many users, a multiprocess Oracle Database system uses
background processes. Background processes consolidate functions that would otherwise be handled by
multiple database programs running for each user process. Background processes asynchronously
perform I/O and monitor other Oracle Database processes to provide increased parallelism for better
performance and reliability.
Table 4-4 describes the basic Oracle Database background processes, many of which are discussed in
more detail elsewhere in this book. The use of additional database server features or options can cause
more background processes to be present. For example, when you use Advanced Queuing, the queue
monitor (QMNn) background process is present. When you specify the FILE_MAPPING initialization
parameter for mapping datafiles to physical devices on a storage subsystem, then the FMON process is
present.
Table 4-4 Oracle Database Background Processes
Process Name Description
Database writer (DBWn) The database writer writes modified blocks from the
database buffer cache to the datafiles. Oracle Database
allows a maximum of 20 database writer processes
(DBW0-DBW9 and DBWa-DBWj). The
DB_WRITER_PROCESSES initialization parameter
specifies the number of DBWn processes. The database
selects an appropriate default setting for this
initialization parameter or adjusts a user-specified
setting based on the number of CPUs and the number
of processor groups.

For more information about setting the
DB_WRITER_PROCESSES initialization parameter,
see the Oracle Database Performance Tuning Guide.

Log writer (LGWR) The log writer process writes redo log entries to disk.
Redo log entries are generated in the redo log buffer of
the system global area (SGA). LGWR writes the redo
log entries sequentially into a redo log file. If the
database has a multiplexed redo log, then LGWR
writes the redo log entries to a group of redo log files.
See Chapter 10, "Managing the Redo Log" for
information about the log writer process.

Checkpoint (CKPT) At specific times, all modified database buffers in the
system global area are written to the datafiles by
DBWn. This event is called a checkpoint. The
checkpoint process is responsible for signalling DBWn
at checkpoints and updating all the datafiles and control
files of the database to indicate the most recent
checkpoint.

System monitor (SMON) The system monitor performs recovery when a failed
instance starts up again. In an Oracle Real Application
Clusters database, the SMON process of one instance
can perform instance recovery for other instances that
have failed. SMON also cleans up temporary segments
that are no longer in use and recovers dead transactions
skipped during system failure and instance recovery
because of file-read or offline errors. These transactions
are eventually recovered by SMON when the
tablespace or file is brought back online.

Process monitor (PMON) The process monitor performs process recovery when a
user process fails. PMON is responsible for cleaning up
the cache and freeing resources that the process was
using. PMON also checks on the dispatcher processes
(described later in this table) and server processes and
restarts them if they have failed.

Archiver (ARCn) One or more archiver processes copy the redo log files
to archival storage when they are full or a log switch
occurs. Archiver processes are the subject of Chapter
11, "Managing Archived Redo Logs".

Recoverer (RECO) The recoverer process is used to resolve distributed
transactions that are pending because of a network or
system failure in a distributed database. At timed
intervals, the local RECO attempts to connect to remote
databases and automatically complete the commit or
rollback of the local portion of any pending distributed
transactions. For information about this process and
how to start it, see Chapter 33, "Managing Distributed
Transactions".

Dispatcher (Dnnn) Dispatchers are optional background processes, present
only when the shared server configuration is used.
Shared server was discussed previously in "Configuring
Oracle Database for Shared Server".

Global Cache Service (LMS) In an Oracle Real Application Clusters environment,
this process manages resources and provides
inter-instance resource control.
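
To see which background processes are actually running in an instance, you can query the V$BGPROCESS fixed view (a sketch; the PADDR filter excludes processes that are defined but not started):

select name, description
from v$bgprocess
where paddr <> '00';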

5. Explain the notions of transparency and site autonomy.

Transparency means that a distributed database behaves, from the user's point of view, like a
single local database: the system hides the physical location and distribution of the data, so
applications can query remote data without knowing which site stores it.

Site autonomy means that each server participating in a distributed database is administered
independently from all other databases. Although several databases can work together, each database is
a separate repository of data that is managed individually. Some of the benefits of site autonomy in an
Oracle Database distributed database include:

• Nodes of the system can mirror the logical organization of companies or groups that need to
maintain independence.
• Local administrators control corresponding local data. Therefore, each database administrator's
domain of responsibility is smaller and more manageable.
• Independent failures are less likely to disrupt other nodes of the distributed database. No single
database failure need halt all distributed operations or be a performance bottleneck.
• Administrators can recover from isolated system failures independently from other nodes in the
system.
• A data dictionary exists for each local database. A global catalog is not necessary to access
local data.
• Nodes can upgrade software independently.

Although Oracle Database permits you to manage each database in a distributed database system
independently, you should not ignore the global requirements of the system. For example, you may
need to:

• Create additional user accounts in each database to support the links that you create to facilitate
server-to-server connections.
• Set additional initialization parameters such as COMMIT_POINT_STRENGTH and OPEN_LINKS.

6. Describe the Oracle Architecture with a neat labeled diagram

All buildings, from the home you live in to the place you work, have
an architecture that keeps the building together. This architecture
(if it’s well designed), will keep the building from falling apart,
keep you nice and warm, and provide you with easy access to
facilities like water fountains and restrooms.
A well-crafted architecture will also help you work more efficiently.
The same is true for databases. Despite the name “Oracle”, there is
no magic!
The architecture that Oracle has built its database around is
designed to perform quickly, efficiently and without errors.
In this section we will introduce you to the Oracle architecture.
This architecture includes the following components:
* The System Global Area (SGA) and other memory areas that utilize
RAM
* Database related background processes
* Tablespaces and Datafiles
* Database related files
* The instance and the database
Next we will look at each of these components in more detail. Finally
we put the components together into a single “big picture” view, and
see how all these pieces fit together to complete the Oracle puzzle.
Keep in mind that the discussion in this chapter is limited to the
overall database architecture. In later chapters we will get into the
finer details of managing these structures.
With this in mind, let’s drill down a bit deeper into the
architecture of Oracle and learn more about memory structures, the
part of Oracle that utilizes your system’s RAM.
Diving into the RAM Pools
All computers have memory. Memory should not be confused with disk
storage. Memory is volatile, which means its content is lost after
the power is removed. Memory is also very fast. RAM memory is
expressed in nanoseconds (billionths of a second) and disk speed is
in milliseconds (thousandths of a second). In Oracle, RAM speed is
hundreds of times faster than disk speed, so undersized RAM pools can
severely limit performance.
Disk storage is non-volatile. This means that the data stored on a
disk will remain after the power is turned off. Disks are always
slower than RAM, but disks are hundreds of times cheaper than RAM.
There is a trade-off between memory and disks: Memory is fast but
expensive (about $1,000 per gigabyte), whereas disks are slower but
very cheap. Thus, memory is used for short-term storage of
information that is frequently needed and disks are used for long-
term storage of information.
Oracle has a number of memory areas that it uses to store
information. In this section we will address the main Oracle memory
areas. They are called:
* The System Global Area (SGA) – RAM areas for the Oracle programs.
* The Program Global Area (PGA) – Private RAM areas for individual
client connections to Oracle
The Instance and the Database
We are almost at the end of our introduction to the Oracle database
architecture. We can’t complete this journey, however, until we
define two more terms, instance and database. The Oracle instance is
the combination of the Oracle SGA and the related background
processes (the programs, PMON, SMON, etc.). When the SGA RAM memory
is successfully allocated and the Oracle database processes are
running normally, the instance is said to be “up”.
However, an instance is not the same thing as a database. You can
have the instance running, but the database might not be mounted or
open. The Oracle database includes the physical files we have
discussed: the datafiles, the control file, and the redo log files.
When the Oracle instance is running, it can attach itself to a
database, “mount” the control file, and finally “open” the datafiles
and redo log files. This is an important distinction because many
Oracle operations must be done with the instance started but the
database not open.
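
A sketch of how these stages map to standard SQL*Plus commands:

-- Start the instance only: allocate the SGA, start the background processes
STARTUP NOMOUNT
-- Mount: read the control file and associate the instance with the database
ALTER DATABASE MOUNT;
-- Open: open the datafiles and redo log files for normal use
ALTER DATABASE OPEN;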
Oracle Architecture Concepts
In the previous section we discussed the Oracle physical
architecture. Things like files, programs and hardware are all
considered physical pieces (or physical properties) of the Oracle
database. In this section we are concerned with the logical pieces of
the Oracle database.
Oracle segregates “physical components” (the .dbf files on disk) by
mapping them into “logical” containers called tablespaces. In turn,
we allocate our tables and indexes inside these tablespaces, and
Oracle takes care of the interface to the physical disk files.

7. Explain the individual components of the Architecture.


The Oracle database has a logical layer and a physical layer. The physical layer consists of the files that
reside on the disk; the components of the logical layer map the data to these physical components.

The Physical Layer

The physical layer of the database consists of three types of files:


• One or more datafiles--Datafiles store the information contained in the database. You
can have as few as one datafile or as many as hundreds of datafiles. The information for a single
table can span many datafiles or many tables can share a set of datafiles. Spreading tablespaces
over many datafiles can have a significant positive effect on performance. The number of
datafiles that can be configured is limited by the Oracle parameter MAXDATAFILES.
• Two or more redo log files--Redo log files hold information used for recovery in the
event of a system failure. Redo log files, known as the redo log, store a log of all changes made
to the database. This information is used in the event of a system failure to reapply changes that
have been made and committed but that might not have been made to the datafiles. The redo log
files must perform well and be protected against hardware failures (through software or
hardware fault tolerance). If redo log information is lost, you cannot recover the system.
• One or more control files--Control files contain information used to start an instance,
such as the location of datafiles and redo log files; Oracle needs this information to start the
database instance. Control files must be protected. Oracle provides a mechanism for storing
multiple copies of control files.

The Logical Layer


The logical layer of the database consists of the following elements:
• One or more tablespaces.

• The database schema, which consists of items such as tables, clusters, indexes, views,
stored procedures, database triggers, sequences, and so on.

Tablespaces and Datafiles


The database is divided into one or more logical pieces known as tablespaces. A
tablespace is used to logically group data together. For example, you can create one tablespace for
accounting and a separate tablespace for purchasing. Segmenting groups into different tablespaces
simplifies the administration of these groups (see Figure 2.1). Tablespaces consist of one or more
datafiles. By using more than one datafile per tablespace, you can spread data over many different disks
to distribute the I/O load and improve performance.

Figure 2.1: The relationship between the database, tablespaces, and datafiles.


As part of the process of creating the database, Oracle automatically creates the SYSTEM tablespace for
you. Although a small database can fit within the SYSTEM tablespace, it's recommended that you
create a separate tablespace for user data. The SYSTEM tablespace is where the data dictionary is kept.
The data dictionary contains information about tables, indexes, clusters, and so on.
Datafiles can be operating system files or, in the case of some operating systems, RAW devices.
Datafiles and data access methods are described in detail on Day 12.
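
To see the tablespace-to-datafile mapping in a running database, a simple dictionary query (a sketch):

select tablespace_name, file_name, bytes/(1024*1024) mb
from dba_data_files
order by tablespace_name;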

The Database Schema

The database schema is a collection of logical-structure objects, known as schema objects,
that define how you see the database's data. These schema objects consist of structures such as tables,
clusters, indexes, views, stored procedures, database triggers, and sequences.
• Table--A table, which consists of a tablename and rows and columns of data, is the basic
logical storage unit in the Oracle database. Columns are defined by name and data type. A table
is stored within a tablespace; often, many tables share a tablespace.
• Cluster--A cluster is a set of tables physically stored together as one table that shares a
common column. If data in two or more tables is frequently retrieved together based on data in
the common column, using a clustered table can be quite efficient. Tables can be accessed
separately even though they are part of a clustered table. Because of the structure of the cluster,
related data requires much less I/O overhead if accessed simultaneously.
• Index--An index is a structure created to help retrieve data more quickly and efficiently
(just as the index in this book allows you to find a particular section more quickly). An index is
declared on a column or set of columns. Access to the table based on the value of the indexed
column(s) (as in a WHERE clause) will use the index to locate the table data.

NOTE: A new feature in Oracle8 is the index-only table. In an index-only table, the data
and index are stored together. This is discussed in detail on Day 13, "Using Indexes and
Sequences."

• View--A view is a window into one or more tables. A view does not store any data; it
presents table data. A view can be queried like a table and, subject to some restrictions, updated and deleted through as well.
Views are typically used to simplify the user's perception of data access by providing limited
information from one table, or a set of information from several tables transparently. Views can
also be used to prevent some data from being accessed by the user or to create a join from
multiple tables.
• Stored procedure--A stored procedure is a predefined SQL query that is stored in the
data dictionary. Stored procedures are designed to allow more efficient queries. Using stored
procedures, you can reduce the amount of information that must be passed to the RDBMS and
thus reduce network traffic and improve performance.
• Database trigger--A database trigger is a procedure that is run automatically when an
event occurs. This procedure, which is defined by the administrator or developer, is run
(triggered) whenever that event occurs. The triggering event could be an insert, a deletion, or
even a selection of data from a table.
• Sequence--The Oracle sequence generator is used to automatically generate a unique
sequence of numbers in cache. By using the sequence generator you can avoid the steps
necessary to create this sequence on your own such as locking the record that has the last value
of the sequence, generating a new value, and then unlocking the record.
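
A minimal sketch of a sequence in use (order_seq and the orders table are hypothetical):

create sequence order_seq start with 1 increment by 1 cache 20;

-- NEXTVAL returns a new unique number; CURRVAL re-reads the last one fetched
insert into orders (order_id, order_date)
values (order_seq.nextval, sysdate);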

Segments, Extents, and Data Blocks


Within Oracle, the space used to store data is controlled by the use of logical structures. These
structures consist of the following:
• Data blocks--A block is the smallest unit of storage in an Oracle database. The database
block contains header information concerning the block itself as well as the data.
• Extents--Extents consist of data blocks.
• Segments--A segment is a set of extents used to store a particular type of data, as shown
in Figure 2.2.

Figure 2.2: Segments, extents, and data blocks.


Segments
An Oracle database can use four types of segments:
• Data segment--Stores user data within the database.
• Index segment--Stores indexes.
• Rollback segment--Stores rollback information used when data must be rolled back.
• Temporary segment--Created when a SQL statement needs a temporary work area; these
segments are destroyed when the SQL statement is finished. These segments are used during
various database operations, such as sorts.

Extents
Extents are the building blocks of segments; in turn, they consist of data blocks. An extent is used to
minimize the amount of wasted (empty) storage. As more and more data is entered into tablespaces in
your database, the extents used to store that data can grow or shrink as necessary. In this manner, many
tablespaces can share the same storage space without preallocating the divisions between those
tablespaces.
At tablespace-creation time, you can specify the minimum number of extents to allocate as well as the
number of extents to add at a time when that allocation has been used. This arrangement gives you
efficient control over the space used in your database.
Data Blocks
Data blocks are the smallest pieces of an Oracle database; they are physically stored on disk. Although
the data block in most systems is 2KB (2,048 bytes), you can change this size for efficiency depending
on your application or operating system.
NOTE: Oracle blocks do not need to be, and may not be the same as, operating system data
blocks. In fact, in most cases they are not.

The Oracle Instance

The Oracle instance consists of the Oracle processes and shared memory necessary to access
information in the database. The instance is made up of the user processes, the Oracle background
processes, and the shared memory used by these processes (see Figure 2.3).

The Oracle Memory Structure

Oracle uses shared memory for several purposes, including caching of data and indexes as
well as storing shared program code. This shared memory is broken into various pieces, or memory
structures. The basic memory structures associated with Oracle are the System Global Area (SGA) and
the Program Global Area (PGA).

8. Discuss the features of the heterogeneous services

Oracle9i, like previous versions, supports heterogeneous services to allow data in non-Oracle databases
to be queried using SQL. This support has been in the form of transparent gateways, which are vendor
specific, or generic connectivity, which uses ODBC or OLEDB to make the connections. The
functionality supported by generic connectivity is typically more limited than that possible when using
vendor-specific gateways, but it is quick and simple to configure. The steps listed below can be used to
connect Oracle to any ODBC-compliant database (MS Access, SQL Server, etc.) on the local server:
• Make sure a valid username and password are present for the non-Oracle database
(MyUser/MyPassword).
• Create an ODBC DataSource for the non-Oracle database (Access1).
• Create a file called: "ORACLE_HOME\hs\admin\initACCESS1.ora" which contains:

HS_FDS_CONNECT_INFO = Access1 #ODBC DSN


HS_FDS_TRACE_LEVEL = OFF

• Add the following entry to the tnsnames.ora file:

ACCESS1.WORLD =
(DESCRIPTION=
(ADDRESS=(PROTOCOL=tcp)(HOST=localhost)(PORT=1521))
(CONNECT_DATA=(SID=ACCESS1))
(HS=OK)
)

• Add the following entry into the listener.ora file:

(SID_DESC=
(SID_NAME=ACCESS1)
(ORACLE_HOME=D:\Oracle\Ora\9011)
(PROGRAM=hsodbc)
)
• Reload the listener: lsnrctl reload
• Create a database link using:

CREATE DATABASE LINK access1.world
CONNECT TO "MyUser" IDENTIFIED BY "MyPassword"
USING 'ACCESS1.WORLD';

• Query the table using:

SELECT *
FROM mytable@ACCESS1.WORLD;

With MS Access the first query is slow due to the Access application being loaded. As expected,
subsequent queries do not display a similar lag.
August 2009
Master of Science in Information Technology (MScIT-NEW) – Semester 3
MT0046 – Oracle9i – 4 Credits
(Book ID: B0894)
Assignment Set – 2 (40 Marks)

[Book ID: B0894]

Each Question carries 8 marks

1. List the objectives of Query processing


2. Explain the features of RMAN Repository
3. Discuss the relative advantages of centralized and distributed databases
4. What benefits does strict two-phase locking provide? What disadvantages result?
5. What are characteristics of distributed query processor? Explain
1. List the objectives of Query processing ?

Ans. :
The main objective of query processing in a distributed environment is to transform a high-level query on
a distributed database, which is seen as a single database by the users, into an efficient execution
strategy expressed in a low-level language on local databases.
An important point of query processing is query optimization. Because many execution strategies
are correct transformations of the same high level query, the one that optimizes (minimizes)
resource consumption should be retained.

Good measures of resource consumption are:

i. The total cost that will be incurred in processing the query. It is the sum of all times incurred
in processing the operations of the query at various sites and in intersite communication.
ii. The response time of the query. This is the time elapsed for executing the query. Since
operations can be executed in parallel at different sites, the response time of a query may be
significantly less than its cost.

Obviously, the total cost should be minimized.

i. In a distributed system, the total cost to be minimized includes CPU, I/O, and communication
costs. These costs can be reduced by lowering the number of I/O operations through fast access
methods to the data and efficient use of main memory. The communication cost is the time
needed for exchanging the data between sites participating in the execution of the query.
ii. In centralized systems, only CPU and I/O costs have to be considered.
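
A commonly used weighted cost model makes this concrete (a sketch; the unit costs C_CPU, C_I/O, C_MSG and C_TR are system-specific constants, and the message and transfer terms drop out in the centralized case):

Total cost = C_CPU * #instructions + C_I/O * #I/Os
           + C_MSG * #messages + C_TR * #bytes transferred

where C_MSG is the fixed cost of initiating a message and C_TR the cost of transmitting one unit of data between two sites.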

Objectives of query processing
– transform a high-level query on a DDB into an efficient execution strategy in a low-level language
on local DBs:

high-level user query → query processor → low-level data manipulation commands

General steps in query processing
– Query specified by a high-level query language
» SQL, which is declarative, is the most widely used
– Query execution plan
» the low-level physical steps for executing the high-level (declarative) user query
– Query optimization
» how do we determine the “best” execution plan?

The query passes through successive representations:

relational calculus query on the DDB
→ relational algebraic query on the DDB
→ algebraic query on fragments
→ optimized fragment query with communication operations
→ optimized local queries

Selecting alternatives
– A calculus query may have many equivalent algebra queries
– Select the execution strategy that minimizes execution time or total resource consumption
» this is a computationally intractable problem, so choose a solution close to the optimum
– In a DDB, communication cost for data exchange occurs, so also
» select the best sites to process data, and possibly
» the way data should be transferred

2. Explain the features of RMAN Repository ?

Ans.
The RMAN repository is the collection of metadata about the target databases that RMAN uses for
backup, recovery, and maintenance. RMAN always stores this information in records in the control
file. The version of this information in the control file is the authoritative record of RMAN's backups
of your database. This is one reason why protecting your control file is an important part of your
backup strategy. RMAN can conduct all necessary backup and recovery operations using just the
control file to store the RMAN repository information, and maintains all records necessary to meet
your configured retention policy.

You can also create a recovery catalog, an external Oracle database in which to store this
information. The control file has finite space for records of backup activities, while a recovery
catalog can store a much longer history. The added complexity of operating a recovery catalog
database can be offset by the convenience of having the extended backup history available if you
have to do a recovery that goes further back in time than the history in the control file.

There are also a few features of RMAN that only function when you use a recovery catalog. For
example, RMAN stored scripts are stored in the recovery catalog, so commands related to them
require the use of a recovery catalog. Other RMAN commands are specifically related to managing
the recovery catalog and so are not available (and not needed) if RMAN is not connected to a
recovery catalog.

The recovery catalog's version of the RMAN repository is maintained solely by RMAN. The target
instance never accesses it directly. RMAN propagates information about the database structure,
archived redo logs, backup sets, and datafile copies into the recovery catalog from the target
database's control file after any operation that updates the repository, and also before certain
operations.

Storage of the RMAN Repository in the Control File


Because most information in the recovery catalog is also available in the target database's control
file, RMAN supports an operational mode in which it uses the target database control file instead of
a recovery catalog. This mode is especially appropriate for small databases where installation and
administration of a separate recovery catalog database is burdensome. The only RMAN feature
that is not supported in NOCATALOG mode is stored scripts.

Types of Records in the Control File


When you do not use a recovery catalog, the control file is the exclusive source of information
about backups and copies as well as other relevant information. The control file contains two types
of records: circular reuse records and noncircular reuse records.

Circular Reuse Records


Circular reuse records contain noncritical information that is eligible to be overwritten if the need
arises. These records contain information that is continually generated by the database. Circular
reuse records are arranged in a logical ring. When all available record slots are full, the database
either expands the control file to make room for a new record or overwrites the oldest record. The
CONTROL_FILE_RECORD_KEEP_TIME initialization parameter specifies the minimum age in
days of a record before it can be reused.

Noncircular Reuse Records


Noncircular reuse records contain critical information that does not change often and cannot be
overwritten. Some examples of information in noncircular reuse records include datafiles, online
redo logs, and redo threads.

Recovery Without a Recovery Catalog


To make it easier to restore and recover the database without using a recovery catalog, Oracle
recommends that you:

• Enable the control file autobackup feature, which causes RMAN to automatically back up
the control file, and also enables RMAN to restore the control file autobackup without
access to a repository

• Keep a record of your DBID, which you may need to recover your database in the event
that you lose your control file

• Use a minimum of two multiplexed or mirrored control files on separate disks

• Keep all Recovery Manager backup logs.

If you lose the current control files, then you can restore a control file autobackup even if you do
not use a recovery catalog.
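
A minimal sketch of the first recommendation, using standard RMAN configuration commands (the format string is illustrative):

RMAN> CONFIGURE CONTROLFILE AUTOBACKUP ON;
RMAN> CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/backup/cf_%F';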

Storage of the RMAN Repository in the Recovery Catalog


It is recommended that you store the recovery catalog in a dedicated database. If you store the
recovery catalog alongside other data in some other database, then if you lose that other database
you will lose your recovery catalog as well. This will make your recovery more difficult.

Registration of Databases in the Recovery Catalog
The enrolling of a database in a recovery catalog is called registration. You can register more
than one target database in the same recovery catalog. For example, you can register databases
prod1, prod2, and prod3 in a single catalog owned by catowner in the database catdb. Because
RMAN distinguishes databases by unique database identifier (DBID), each database registered in
a given catalog must have a unique DBID.
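
For example, registering prod1 in the catalog owned by catowner might look like this (the connect strings are illustrative):

RMAN> CONNECT TARGET /
RMAN> CONNECT CATALOG catowner/password@catdb
RMAN> REGISTER DATABASE;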

Contents of the Recovery Catalog


The recovery catalog contains information about RMAN operations, including:

• Datafile and archived redo log backup sets and backup pieces

• Datafile copies

• Archived redo logs and their copies

• Tablespaces and datafiles on the target database

• Stored scripts, which are named user-created sequences of RMAN commands

• Persistent RMAN configuration settings

Resynchronization of the Recovery Catalog


The recovery catalog obtains crucial RMAN metadata from the target database control file.
Resynchronization of the recovery catalog ensures that the metadata that RMAN obtains from the
control file stays current. Resynchronizations can be full or partial.

Backups of the Recovery Catalog


A single recovery catalog is able to store information for multiple target databases. Consequently,
loss of the recovery catalog can be disastrous. You should back up the recovery catalog
frequently.

If the recovery catalog is destroyed and no backups of it are available, then you can partially
reconstruct the catalog from the current control file or control file backups. Nevertheless, you
should always aim to have a valid, recent backup of the catalog.

Compatibility of the Recovery Catalog


When you use RMAN with a recovery catalog in an environment where you have run past versions
of the database, you can wind up with versions of the RMAN client, recovery catalog database,
recovery catalog schema, and target database that all originated in different releases of the
database.

Media Management
Oracle's Media Management Layer (MML) API lets third-party vendors build a media manager,
software that works with RMAN and the vendor's hardware to allow backups to sequential media
devices such as tape drives. The media manager handles loading, unloading and labeling of
sequential media such as tapes. You must install media manager software to use RMAN with
sequential media devices.

When backing up or restoring, the RMAN client connects to the target instance and directs the
instance to send requests to its media manager. No direct communication occurs between the
RMAN client and media manager.

Performing Backup and Restore with a Media Manager


Before performing backup or restore to a media manager, you must allocate one or more channels
to handle the communication with the media manager. You can also configure default channels for
use with the media manager, which will be applied for all backup and recovery tasks that use the
media manager where you do not explicitly allocate channels.

For example, this sequence of commands would configure channels for the media manager and
back up the database to the media manager:

RMAN> CONFIGURE DEVICE TYPE sbt PARALLELISM 1;


RMAN> CONFIGURE DEFAULT DEVICE TYPE TO sbt;
RMAN> CONFIGURE CHANNEL DEVICE TYPE sbt PARMS 'ENV=(NSR_SERVER=bksvr1)';
RMAN> BACKUP DATABASE;
When RMAN executes the BACKUP DATABASE command, it sends the backup request to the
database server session performing the backup. The database server session identifies the output
channel as a media management device and makes a request to the media manager to write the
output.

RMAN does not issue specific commands to load, label, or unload tapes. When backing up, RMAN
gives the media manager a stream of bytes and associates a unique name with that stream. When
RMAN needs to restore the backup, it asks the media manager to retrieve the byte stream. All
details of how and where that stream is stored are handled entirely by the media manager.

The media manager labels and keeps track of the tape and names of files on each tape, and
automatically loads and unloads tapes, or signals an operator to do so.

Some media managers support proxy copy functionality, in which they handle the entire data
movement between datafiles and the backup devices. Such products may use technologies such
as high-speed connections between storage and media subsystems to reduce load on the primary
database server. RMAN provides a list of files requiring backup or restore to the media manager,
which in turn makes all decisions regarding how and when to move the data.

3. Discuss the relative advantages of centralized and distributed databases

Ans. :
A distributed database is a database that is under the control of a central database management
system (DBMS) in which storage devices are not all attached to a common CPU. It may be stored
in multiple computers located in the same physical location, or may be dispersed over a network of
interconnected computers.

Collections of data (e.g., in a database) can be distributed across multiple physical locations. A
distributed database can reside on network servers on the Internet, on corporate intranets or
extranets, or on other company networks. Replication and distribution of databases improve
database performance at end-user worksites. [1]

Relative Advantages of Distributed Databases (compared with centralized databases)


• Reflects organizational structure — database fragments are located in the departments
they relate to.
• Local autonomy — a department can control the data about them (as they are the ones
familiar with it.)
• Protection of valuable data — if there were ever a catastrophic event such as a fire, all of
the data would not be in one place, but distributed in multiple locations.
• Improved performance — data is located near the site of greatest demand, and the
database systems themselves are parallelized, allowing load on the databases to be
balanced among servers. (A high load on one module of the database won't affect other
modules of the database in a distributed database.)
• Economics — it costs less to create a network of smaller computers with the power of a
single large computer.
• Modularity — systems can be modified, added and removed from the distributed database
without affecting other modules (systems).
• Reliable transactions - due to replication of the database, a single-site failure does not halt
the whole system. All transactions follow the ACID properties: Atomicity (the transaction takes
place as a whole or not at all), Consistency (maps one consistent DB state to another), Isolation
(each transaction sees a consistent DB), and Durability (the results of a committed transaction
must survive system failures).

4. What benefits does strict two-phase locking provide? What disadvantages result?

Ans.

The management of distributed transactions must deal with interrelated problems such as reliability,
concurrency control and the efficient utilization of the resources of the complete system. Two kinds
of protocols are used: the two-phase commit protocol for recovery and two-phase locking for
concurrency control.

The basic idea of locking is that whenever a transaction accesses a data item, it locks it, and that a
transaction which wants to lock a data item which is already locked by another transaction must
wait until the other transaction has released the lock (unlock).

A sophisticated locking mechanism known as 2-phase locking, which incorporates all the said
principles, is normally used.

According to this there are two separate phases:

i. Growing phase: for each transaction there is a first phase during which new locks are acquired.
ii. Shrinking phase: a second phase during which locks are only released.

In strict two-phase locking, a transaction additionally holds all its locks until it commits or aborts.
The benefit is that this avoids cascading aborts and guarantees recoverable, serializable schedules;
the disadvantage is reduced concurrency, because other transactions must wait longer for locks to
be released, which also increases the likelihood of deadlock.

LOCKING IN CENTRALIZED DATABASE:


i. Lock mode: a transaction locks a data item in one of the following modes:
• Shared mode: to read the data item.
• Exclusive mode: to edit (write) the data item.
ii. Well-formed transactions: a transaction is well formed if it always locks a
data item in shared mode before reading it, and it always locks a data item in exclusive mode
before writing it.
iii. Compatibility rules exist between lock modes:
• A transaction can lock a data item in shared mode if it is not locked at all or it is locked
in shared mode by another transaction.
iv. Conflicts: two transactions conflict if they want to lock the same data item in
two incompatible modes. There are two types of conflict: Read-Write conflict and Write-Write conflict.
v. Granularity of locking: this term relates to the size of objects that are locked with a single lock
operation.

SERIALIZABILITY IS ENSURED IF THE FOLLOWING RULES ARE FOLLOWED:

• Transactions are well-formed
• Compatibility rules are observed
• Each transaction does not request new locks after it has released a lock.

SYNTAX:
(Begin application)
Begin transaction
Acquire locks before reading or writing
Commit
Release locks
(End application)
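
In Oracle terms, shared and exclusive locks can be taken explicitly; a minimal sketch following this pattern (the emp table is hypothetical):

-- shared (read) lock on a whole table
lock table emp in share mode;

-- exclusive row locks are held until commit or rollback,
-- which is exactly the strict two-phase pattern
select sal from emp where empno = 7369 for update;
update emp set sal = sal * 1.1 where empno = 7369;
commit; -- releases all locks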

5. What are characteristics of distributed query processor? Explain

Ans.

It is very difficult to give the characteristics that differentiate centralized and distributed query
processors. Still, some of them are listed here. Of these, the first four are common to
both, and the next four are particular to distributed query processors.

Characterization of Query Processing


i. Languages
ii. Types of optimization
iii. Optimization timing
iv. Statistics
v. Decision sites
vi. Exploitation of the network topology
vii. Exploitation of replicated fragments
viii. Use of semijoins

i. Languages:
The input language to the query processor can be based on relational calculus or relational
algebra. In a distributed context, the output language is generally some form of relational algebra
augmented with communication primitives.

ii. Types of Optimization


Types of optimizers
– Cost-based: exhaustive search
» can potentially be optimal
» combinatorial complexity in the number of relations
– Heuristics
» may not be optimal, in general
» commonly used heuristics:
_ regroup common sub-expressions
_ perform selection, projection first
_ reorder operations to reduce intermediate relation size
_ replace a join by a series of semijoins
iii. Optimization timing
– Static
» optimization before execution, i.e., at compile time
_ needs to estimate the size of the intermediate results
_ good for queries executed multiple times
» (ex) R*
– Dynamic
» optimization at query execution time
_ potentially exact information on intermediate results
_ must be repeated for each execution
_ good for ad-hoc queries
» (ex) Distributed INGRES
– Hybrid
» compile using a static algorithm
» if the error in estimated sizes > threshold, reoptimize at run time
» (ex) MERMAID
iv. Statistics
– effectiveness of query optimization relies on statistics on the DB
– relation
» cardinality, size of a tuple, join selectivity
– attribute
» cardinality of domain, actual number of distinct values
– common assumptions
» independence between different attribute values
» uniform distribution of attribute values within their domain
– statistics need to be periodically updated

v. Decision sites
– who makes the decision?
– Centralized
» a single site determines the “best” schedule
» simple, but needs knowledge about the entire DDB
– Distributed
» cooperation among sites to determine the schedule
» needs only local information
» cost of cooperation
– Hybrid
» one site makes the major decision, and other sites can make local decisions
» (ex) System R*

vi. Network topology
– Wide area networks (WAN)
» low bandwidth, low speed, high protocol overhead
» communication cost will dominate; ignore all other cost factors
» global schedule to minimize communication cost
» local schedules according to centralized query optimization
– Local area networks (LAN)
» communication costs are comparable to I/O costs
» I/O and CPU costs need to be considered
» broadcasting can be exploited (e.g., for joins)
vii. Exploitation of replicated fragments
– when fragments are replicated at several sites, the query processor must also choose among
the replicas, which affects the communication cost of the strategy
viii. Use of semijoins
– good for reducing the size of data transfer
– (ex) SDD-1 makes an extensive use of semijoins
» SDD-1 was designed for slow WANs
– (ex) System R* does not employ semijoins
» System R* assumes faster networks
Finally, the layers of distributed query processing:
– Query decomposition: decompose the distributed calculus query into an
algebraic query on global relations
– Data localization: transform the distributed query into a fragment query
– Global query optimization: optimize the fragment query (e.g., join ordering) together with the
communication operations
– Local query optimization: optimize each local query at its site
