
# mc0077 set 2 july 2011


01/03/2013


July 2011
Master of Computer Application (MCA) – Semester 4
MC0077 – 4 Credits
(Book ID: B0882)
Assignment Set – 2
1. Describe the following with suitable examples:
o Cost Estimation
o Measuring Index Selectivity
Ans:
Cost Estimation
One of the hardest problems in query optimization is accurately estimating the costs of alternative query plans. Optimizers cost query plans using a mathematical model of query execution costs that relies heavily on estimates of the cardinality, or number of tuples, flowing through each edge in a query plan. Cardinality estimation in turn depends on estimates of the selection factor of predicates in the query. Traditionally, database systems estimate selectivity through fairly detailed statistics on the distribution of values in each column, such as histograms. This technique works well for estimating the selectivity of individual predicates. However, many queries have conjunctions of predicates, such as
select count(*) from R where R.make='Honda' and R.model='Accord'
Query predicates are often highly correlated (for example, model='Accord' implies make='Honda'), and it is very hard to estimate the selectivity of the conjunction in general. Poor cardinality estimates and uncaught correlation are among the main reasons why query optimizers pick poor query plans. This is one reason why a DBA should regularly update the database statistics, especially after major data loads or unloads.
The cardinality of a set is a measure of the "number of elements of the set". There are two approaches to cardinality: one which compares sets directly using bijections and injections, and another which uses cardinal numbers.
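The effect of correlated predicates on a cardinality estimate can be illustrated with a small Python sketch. The table contents below are invented for illustration and are not taken from the assignment; the sketch assumes a naive optimizer that multiplies the selectivities of the individual predicates as if they were independent.

```python
# Hypothetical toy table R(make, model); the rows are invented for illustration.
rows = (
    [("Honda", "Accord")] * 20
    + [("Honda", "Civic")] * 20
    + [("Toyota", "Camry")] * 30
    + [("Ford", "Focus")] * 30
)

def count(predicate):
    """Number of rows in the toy table that satisfy the predicate."""
    return sum(1 for r in rows if predicate(r))

total = len(rows)
n_make = count(lambda r: r[0] == "Honda")     # rows matching make='Honda'
n_model = count(lambda r: r[1] == "Accord")   # rows matching model='Accord'

# Independence assumption: sel(make) * sel(model) * total,
# written as n_make * n_model / total to avoid rounding noise.
estimated = n_make * n_model / total

# True cardinality of the conjunction.
actual = count(lambda r: r[0] == "Honda" and r[1] == "Accord")

print(f"estimated rows: {estimated}, actual rows: {actual}")
```

Because every Accord in this toy data is a Honda, the independence assumption underestimates the result size, which is the kind of error that can lead an optimizer to a poor plan.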
Measuring Index Selectivity

Index Selectivity
B*Tree indexes improve the performance of queries that select a small percentage of rows from a table. As a general guideline, we should create indexes on tables that are often queried for less than 15% of the table's rows. This value may be higher in situations where all data can be retrieved from an index, or where the indexed columns can be used for joining to other tables.
The ratio of the number of distinct values in the indexed column or columns to the number of records in the table represents the selectivity of an index. The ideal selectivity is 1. Such selectivity can be reached only by unique indexes on NOT NULL columns.

Example with good Selectivity

If a table has 100′000 records and one of its indexed columns has 88′000 distinct values, then the selectivity of this index is 88′000 / 100′000 = 0.88.
Oracle implicitly creates indexes on the columns of all unique and primary keys that you define with integrity constraints. These indexes are the most selective and the most effective in optimizing performance. The selectivity of an index is the percentage of rows in a table having the same value for the indexed column. An index's selectivity is good if few rows have the same value.

If an index on a table of 100′000 records had only 500 distinct values, then the index's selectivity is 500 / 100′000 = 0.005, and in this case a query which uses such an index will return, on average, 100′000 / 500 = 200 records for each distinct value. It is evident that a full table scan is more efficient than using such an index, where much more I/O is needed to repeatedly scan the index and the table.
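The arithmetic in the two examples above can be checked with a short Python sketch. The function names are hypothetical helpers introduced here for illustration; they simply apply the ratio defined earlier (distinct values divided by number of records).

```python
def index_selectivity(distinct_values: int, total_rows: int) -> float:
    """Selectivity = distinct indexed values / rows in the table."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return distinct_values / total_rows

def avg_rows_per_value(distinct_values: int, total_rows: int) -> float:
    """Average number of rows returned for one indexed value."""
    if distinct_values <= 0:
        raise ValueError("distinct_values must be positive")
    return total_rows / distinct_values

# Example with good selectivity: 88'000 distinct values in 100'000 rows.
good = index_selectivity(88_000, 100_000)

# Example with poor selectivity: 500 distinct values in 100'000 rows.
poor = index_selectivity(500, 100_000)
rows_per_value = avg_rows_per_value(500, 100_000)

print(good, poor, rows_per_value)
```

A value close to 1 means each lookup touches very few rows; a value close to 0 means each lookup returns a large share of the table, which is where a full table scan tends to win.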
How to Measure Index Selectivity?

Manually measure index selectivity
The ratio of the number of distinct values to the total number of rows is the selectivity of the columns. This method is useful for estimating the selectivity of an index before creating it.
select count(distinct job) "Distinct Values" from emp;
select count(*) "Total Number Rows" from emp;
Selectivity = Distinct Values / Total Number Rows = 5 / 14 ≈ 0.36
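The manual measurement can be tried without an Oracle instance. The following is a minimal sketch using Python's built-in sqlite3 module, not Oracle; the emp rows are sample data assumed here so that the table has 14 rows and 5 distinct jobs, matching the figures above.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, job text)")

# Assumed sample data: 14 employees spread over 5 distinct jobs.
jobs = (
    ["CLERK"] * 4
    + ["SALESMAN"] * 4
    + ["MANAGER"] * 3
    + ["ANALYST"] * 2
    + ["PRESIDENT"]
)
conn.executemany(
    "insert into emp values (?, ?)",
    [(f"E{i}", job) for i, job in enumerate(jobs, start=1)],
)

# The same two queries as in the text, run through sqlite3.
distinct_values = conn.execute("select count(distinct job) from emp").fetchone()[0]
total_rows = conn.execute("select count(*) from emp").fetchone()[0]

selectivity = distinct_values / total_rows
print(f"{distinct_values} / {total_rows} = {selectivity:.2f}")
```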
Automatically measuring index selectivity
We can determine the selectivity of an index by dividing the number of distinct indexed values by the number of rows in the table.
create index idx_emp_job on emp(job);
analyze table emp compute statistics;
select distinct_keys from user_indexes
where table_name = 'EMP'
and index_name = 'IDX_EMP_JOB';
select num_rows from user_tables
where table_name = 'EMP';
Selectivity = DISTINCT_KEYS / NUM_ROWS = 5 / 14 ≈ 0.36
Selectivity of each individual Column
Assuming that the table has been analyzed, it is also possible to query USER_TAB_COLUMNS to investigate the selectivity of each column individually.
select column_name, num_distinct from user_tab_columns
where table_name = 'EMP';
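SQLite has no USER_TAB_COLUMNS view, but the same per-column figures can be approximated by counting distinct values directly. The sketch below uses Python's sqlite3 module; the three-column emp table and its 14 rows are sample data assumed for illustration, and `pragma table_info` stands in for the Oracle data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (empno integer, job text, deptno integer)")

# Assumed sample rows: 14 employees, 5 jobs, 3 departments.
sample = [
    (7369, "CLERK", 20), (7499, "SALESMAN", 30), (7521, "SALESMAN", 30),
    (7566, "MANAGER", 20), (7654, "SALESMAN", 30), (7698, "MANAGER", 30),
    (7782, "MANAGER", 10), (7788, "ANALYST", 20), (7839, "PRESIDENT", 10),
    (7844, "SALESMAN", 30), (7876, "CLERK", 20), (7900, "CLERK", 30),
    (7902, "ANALYST", 20), (7934, "CLERK", 10),
]
conn.executemany("insert into emp values (?, ?, ?)", sample)

num_rows = conn.execute("select count(*) from emp").fetchone()[0]

# Column names come from SQLite's own catalog, so they are safe to interpolate.
columns = [row[1] for row in conn.execute("pragma table_info(emp)")]

column_selectivity = {}
for col in columns:
    num_distinct = conn.execute(
        f'select count(distinct "{col}") from emp'
    ).fetchone()[0]
    column_selectivity[col] = num_distinct / num_rows

for col, sel in column_selectivity.items():
    print(f"{col}: {sel:.2f}")
```

In this sample the unique empno column reaches the ideal selectivity of 1, while deptno, with only three distinct values, would be a weak candidate for a B*Tree index.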

2. Describe the following:
o Statements and Transactions in a Distributed Database
o Heterogeneous Distributed Database Systems
Ans:
Statements and Transactions in a Distributed Database

The following sections introduce the terminology used when discussing statements and transactions in a distributed database environment.
Remote and Distributed Statements
A Remote Query is a query that selects information from one or more remote tables, all of which reside at the same remote node.
A Remote Update is an update that modifies data in one or more tables, all of which are located at the same remote node.