o Analyze Statement
GATHER_STATS_JOB
Optimizer statistics are automatically gathered by the job GATHER_STATS_JOB. This job
gathers statistics on all objects in the database that have:
o Missing statistics
o Stale statistics
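The stop_on_window_close attribute controls whether GATHER_STATS_JOB continues when the
maintenance window closes. The default setting for the stop_on_window_close attribute is
TRUE, so the Scheduler terminates the job when the window closes.
The remaining objects are then processed in the next maintenance window.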
To turn off automatic statistics gathering, disable the job:
BEGIN
  DBMS_SCHEDULER.DISABLE('GATHER_STATS_JOB');
END;
/
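As a minimal sketch, the job's current state can be checked in the Scheduler dictionary view and the job re-enabled later. This assumes a 10g database and a session connected as the job owner (SYS) or with equivalent Scheduler privileges.

```sql
-- Check whether the automatic statistics job is currently enabled.
SELECT job_name, enabled, state
  FROM dba_scheduler_jobs
 WHERE job_name = 'GATHER_STATS_JOB';

-- Re-enable the job once manual control is no longer needed.
BEGIN
  DBMS_SCHEDULER.ENABLE('GATHER_STATS_JOB');
END;
/
```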
o Automatic statistics gathering relies on the modification monitoring feature,
described in "Determining Stale Statistics". If this feature is disabled, then the
automatic statistics gathering job is not able to detect stale statistics. This
feature is enabled when the STATISTICS_LEVEL parameter is set to TYPICAL or ALL.
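The following sketch shows one way to confirm that modification monitoring is active and to inspect the DML counters it maintains. SCOTT is a placeholder schema, and the queries assume access to V$PARAMETER and the DBA_ views.

```sql
-- Modification monitoring is active when this returns TYPICAL or ALL.
SELECT value
  FROM v$parameter
 WHERE name = 'statistics_level';

-- Flush the in-memory DML counters so the dictionary view is current.
BEGIN
  DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;
END;
/

-- DML activity recorded since statistics were last gathered.
SELECT table_name, inserts, updates, deletes, timestamp
  FROM dba_tab_modifications
 WHERE table_owner = 'SCOTT'   -- placeholder schema
 ORDER BY table_name;
```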
o Volatile tables that are being deleted or truncated and rebuilt during the
course of the day.
o Objects that are the target of large bulk loads which add 10% or more
to the object's total size.
The statistics on these tables can be set to values that represent the
typical state of the table. You should gather statistics on the table when
the table has a representative number of rows, and then lock the
statistics.
Statistics generated on the table during the overnight batch window may not be
the most appropriate statistics for the daytime workload.
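The gather-then-lock approach for a volatile table might look like the sketch below. SCOTT.ORDER_QUEUE is a hypothetical table; run the block at a time when it holds a representative number of rows.

```sql
BEGIN
  -- Gather while the table holds a representative number of rows.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'SCOTT',          -- placeholder schema
    tabname => 'ORDER_QUEUE');   -- hypothetical volatile table

  -- Lock the statistics so that later gathering runs leave them untouched.
  DBMS_STATS.LOCK_TABLE_STATS(
    ownname => 'SCOTT',
    tabname => 'ORDER_QUEUE');
END;
/
```

To refresh the statistics later, call DBMS_STATS.UNLOCK_TABLE_STATS first, gather again, and then re-lock.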
If you choose not to use automatic statistics gathering, then you need to
manually collect statistics in all schemas, including system schemas. If the data
in your database changes regularly, you also need to gather statistics regularly to
ensure that the statistics accurately represent characteristics of your database
objects.
When gathering statistics manually, you not only need to determine how to
gather statistics, but also when and how often to gather new statistics.
For partitioned tables, there are often cases in which only a single
partition is modified. In those cases, statistics can be gathered only
on those partitions rather than gathering statistics for the entire
table. However, gathering global statistics for the partitioned table
may still be necessary.
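A sketch of partition-level gathering followed by a global refresh, using the partname and granularity parameters of GATHER_TABLE_STATS. The SALES table and its SALES_Q4 partition are hypothetical names.

```sql
BEGIN
  -- Refresh statistics only for the partition that changed.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname     => 'SCOTT',      -- placeholder schema
    tabname     => 'SALES',      -- hypothetical partitioned table
    partname    => 'SALES_Q4',   -- hypothetical partition
    granularity => 'PARTITION');

  -- Refresh the table-level (global) statistics when they are also needed.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname     => 'SCOTT',
    tabname     => 'SALES',
    granularity => 'GLOBAL');
END;
/
```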
The GATHER_DATABASE_STATS or GATHER_SCHEMA_STATS procedures gather new
statistics for tables with stale statistics when the OPTIONS parameter is set
to GATHER STALE or GATHER AUTO. If a monitored table has been
modified more than 10%, then its statistics are considered stale
and gathered again.
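For illustration, a schema-level call that regathers only stale statistics could be written as follows. SCOTT is a placeholder schema, and 'GATHER AUTO' can be substituted for 'GATHER STALE'.

```sql
BEGIN
  -- Gather statistics only for objects whose statistics are stale.
  DBMS_STATS.GATHER_SCHEMA_STATS(
    ownname => 'SCOTT',          -- placeholder schema
    options => 'GATHER STALE');
END;
/
```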
DBMS_STATS
As database administrator, you can generate statistics that quantify the data
distribution and storage characteristics of tables, columns, indexes, and
partitions. The cost-based optimization approach uses these statistics to
calculate the selectivity of predicates and to estimate the cost of each
execution plan. Selectivity is the fraction of rows in a table that the SQL
statement's predicate chooses. The optimizer uses the selectivity of a
predicate to estimate the cost of a particular access method and to determine
the optimal join order and join method.
The statistics are stored in the data dictionary and can be exported from one
database and imported into another. For example, you might want to transfer
your statistics to a test system to simulate your production environment.
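One way to move statistics to a test system is through a user statistics table, sketched below. The table name STATS_XFER and the statid PROD are illustrative choices. Moving the table between databases (with the export/import utilities or a database link) is left out.

```sql
-- On the source (production) database: stage the schema statistics.
BEGIN
  DBMS_STATS.CREATE_STAT_TABLE(
    ownname => 'SCOTT',
    stattab => 'STATS_XFER');    -- illustrative staging table name
  DBMS_STATS.EXPORT_SCHEMA_STATS(
    ownname => 'SCOTT',
    stattab => 'STATS_XFER',
    statid  => 'PROD');          -- illustrative identifier
END;
/

-- On the test database, after SCOTT.STATS_XFER has been copied over:
BEGIN
  DBMS_STATS.IMPORT_SCHEMA_STATS(
    ownname => 'SCOTT',
    stattab => 'STATS_XFER',
    statid  => 'PROD');
END;
/
```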
You should gather statistics periodically for objects where the statistics
become stale over time because of changing data volumes or changes in
column values. New statistics should be gathered after a schema object's
data or structure are modified in ways that make the previous statistics
inaccurate. For example, after loading a significant number of rows into a
table, collect new statistics on the number of rows. After updating data in a
table, you do not need to collect new statistics on the number of rows, but
you might need new statistics on the average row length.
Table statistics
o Number of rows
o Number of blocks
o Average row length
Column statistics
o Number of distinct values (NDV) in column
o Number of nulls in column
o Data distribution (histogram)
Index statistics
o Number of leaf blocks
o Levels
o Clustering factor
System statistics
o I/O performance and utilization
o CPU performance and utilization
Generating Statistics
Because the cost-based approach relies on statistics, you should generate
statistics for all tables and clusters and all indexes accessed by your SQL
statements before using the cost-based approach. If the size and data
distribution of the tables change frequently, then regenerate these statistics
regularly to ensure the statistics accurately represent the data in the tables.
Some statistics are computed exactly, such as the number of data blocks
currently containing data in a table or the depth of an index from its root block
to its leaf blocks.
To estimate statistics, Oracle selects a random sample of data. You can specify
the sampling percentage and whether sampling should be based on rows or
blocks. Oracle Corporation recommends
using DBMS_STATS.AUTO_SAMPLE_SIZE for the sampling percentage. When in
doubt, choose row sampling.
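The sampling recommendation above can be expressed as in this sketch, which lets Oracle choose the sample size and uses row sampling rather than block sampling. SCOTT.EMPLOYEES is the example table used elsewhere in these notes.

```sql
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SCOTT',
    tabname          => 'EMPLOYEES',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,  -- Oracle picks the sample size
    block_sample     => FALSE);                       -- row sampling, not block sampling
END;
/
```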
When you generate statistics for a table, column, or index, if the data
dictionary already contains statistics for the object, then Oracle updates the
existing statistics. Oracle also invalidates any currently parsed SQL
statements that access the object.
When you associate a statistics type with a column or domain index, Oracle
calls the statistics collection method in the statistics type, if you analyze the
column or domain index.
System Statistics
System statistics describe the system's hardware characteristics, such as I/O and
CPU performance and utilization, to the query optimizer. When choosing an
execution plan, the optimizer estimates the I/O and CPU resources required for
each query. System statistics enable the query optimizer to more accurately
estimate I/O and CPU costs, enabling the query optimizer to choose a better
execution plan.
Unlike table, index, or column statistics, Oracle does not invalidate already
parsed SQL statements when system statistics get updated. All new SQL
statements are parsed using new statistics.
Workload Statistics
To gather workload statistics, either:
o Run the dbms_stats.gather_system_stats('start') procedure at the beginning of
the workload window, then the dbms_stats.gather_system_stats('stop') procedure
at the end of the workload window.
o Run dbms_stats.gather_system_stats('interval', interval=>N), where N is the
number of minutes after which statistics gathering stops automatically.
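The start/stop variant can be sketched as two separate calls that bracket a representative workload. How long to wait between them is up to the administrator.

```sql
-- At the beginning of the workload window:
BEGIN
  DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'START');
END;
/

-- ... the representative workload runs here ...

-- At the end of the workload window:
BEGIN
  DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'STOP');
END;
/
```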
In release 10.2, the optimizer uses the value of mbrc when performing full table
scans (FTS). The value of db_file_multiblock_read_count is set to the maximum
allowed by the operating system by default. However, the optimizer uses mbrc=8
for costing. The "real" mbrc is actually somewhere in between, since serial
multiblock read requests are processed by the buffer cache and split into two or
more requests if some blocks are already pinned in the buffer cache, or when the
segment size is smaller than the read size. The mbrc value gathered as part of
workload statistics is thus useful for FTS estimation.
During the gathering of workload statistics, it is possible that mbrc and
mreadtim will not be gathered if no table scans are performed during
serial workloads, as is often the case with OLTP systems. On the other hand, FTS
occur frequently on DSS systems but may run in parallel and bypass the buffer
cache. In such cases, sreadtim will still be gathered, since index lookups are
performed using the buffer cache. If Oracle cannot gather or validate gathered
mbrc or mreadtim, but has gathered sreadtim and cpuspeed, then only sreadtim and
cpuspeed are used for costing.
Noworkload Statistics
Noworkload statistics consist of I/O transfer speed, I/O seek time, and CPU
speed (cpuspeednw). The major difference between workload statistics and
noworkload statistics lies in the gathering method.
Noworkload statistics gather data by submitting random reads against all data
files, while workload statistics use counters updated when database activity
occurs. ioseektim represents the time it takes to position the disk head to read
data. Its value usually varies from 5 ms to 15 ms, depending on disk rotation speed
and the disk or RAID specification. The I/O transfer speed represents the speed
at which one operating system process can read data from the I/O subsystem.
Its value varies greatly, from a few MBs per second to hundreds of MBs per
second. Oracle uses relatively conservative default settings for I/O transfer
speed.
In Oracle 10g, Oracle uses noworkload statistics and the CPU cost model by
default. The values of noworkload statistics are initialized to defaults at the first
instance startup:
ioseektim = 10ms
iotrfspeed = 4096 bytes/ms
cpuspeednw = gathered value, varies based on system
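As a sketch, noworkload statistics can be regathered explicitly and the resulting values inspected in SYS.AUX_STATS$. The query assumes the session has SELECT access to that SYS-owned table.

```sql
-- Gather noworkload statistics (random reads against the data files).
BEGIN
  DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'NOWORKLOAD');
END;
/

-- Inspect the current system statistics, including ioseektim,
-- iotrfspeed and cpuspeednw.
SELECT pname, pval1
  FROM sys.aux_stats$
 WHERE sname = 'SYSSTATS_MAIN'
 ORDER BY pname;
```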
LOCK_SCHEMA_STATS
LOCK_TABLE_STATS
UNLOCK_SCHEMA_STATS
UNLOCK_TABLE_STATS
Estimate statistics for tables and relevant indexes whose statistics are too
out of date to trust.
How Dynamic Sampling Works
For a query that normally completes quickly (in less than a few seconds), you will
not want to incur the cost of dynamic sampling. However, dynamic sampling can
be beneficial under any of the following conditions:
You control dynamic sampling with the OPTIMIZER_DYNAMIC_SAMPLING parameter, which can be
set to a value from 0 to 10. The default is 2.
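The level can be changed for a session or for a single statement with the DYNAMIC_SAMPLING hint, as sketched here. Level 4 is an arbitrary illustrative value, and scott.employees with a salary column is a hypothetical table.

```sql
-- Raise the dynamic sampling level for the current session.
ALTER SESSION SET OPTIMIZER_DYNAMIC_SAMPLING = 4;

-- Or request a sampling level for one table in one query.
SELECT /*+ DYNAMIC_SAMPLING(e 4) */ COUNT(*)
  FROM scott.employees e       -- hypothetical table
 WHERE e.salary > 5000;        -- hypothetical column
```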
Partitioned schema objects can contain multiple sets of statistics. They can have
statistics that refer to any of the following:
Unless the query predicate narrows the query to a single partition, the optimizer
uses the global statistics. Because most queries are not likely to be this
restrictive, it is most important to have accurate global statistics. Intuitively, it
can seem that generating global statistics from partition-level statistics is
straightforward; however, this is true only for some of the statistics. For
example, it is very difficult to figure out the number of distinct values for a
column from the number of distinct values found in each partition, because of
the possible overlap in values. Therefore, actually gathering global statistics
with the DBMS_STATS package is highly recommended, rather than calculating
them with the ANALYZE statement.
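The overlap problem can be seen by comparing the sum of the per-partition distinct-value counts with the global count. For example, if each of four quarterly partitions contains the same 50 region codes, the partition counts sum to 200 while the global number of distinct values is 50. The query below is a sketch against a hypothetical SCOTT.SALES table whose statistics have already been gathered.

```sql
-- Compare summed partition-level NDV with the global NDV per column.
SELECT p.column_name,
       SUM(p.num_distinct) AS sum_of_partition_ndv,
       MAX(t.num_distinct) AS global_ndv
  FROM dba_part_col_statistics p
  JOIN dba_tab_col_statistics  t
    ON t.owner       = p.owner
   AND t.table_name  = p.table_name
   AND t.column_name = p.column_name
 WHERE p.owner      = 'SCOTT'   -- placeholder schema
   AND p.table_name = 'SALES'   -- hypothetical partitioned table
 GROUP BY p.column_name
 ORDER BY p.column_name;
```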
GATHER_DATABASE_STATS: Statistics for all objects in a database
GATHER_SYSTEM_STATS: CPU and I/O statistics for the system
EXEC DBMS_STATS.gather_database_stats;
EXEC DBMS_STATS.gather_database_stats(estimate_percent => 15);
EXEC DBMS_STATS.gather_schema_stats('SCOTT');
EXEC DBMS_STATS.gather_schema_stats('SCOTT', estimate_percent => 15);
EXEC DBMS_STATS.gather_table_stats('SCOTT', 'EMPLOYEES');
EXEC DBMS_STATS.gather_table_stats('SCOTT', 'EMPLOYEES', estimate_percent => 15);
EXEC DBMS_STATS.gather_index_stats('SCOTT', 'EMPLOYEES_PK');
EXEC DBMS_STATS.gather_index_stats('SCOTT', 'EMPLOYEES_PK', estimate_percent => 15);
This package also gives you the ability to delete statistics:
EXEC DBMS_STATS.delete_database_stats;
EXEC DBMS_STATS.delete_schema_stats('SCOTT');
EXEC DBMS_STATS.delete_table_stats('SCOTT', 'EMPLOYEES');
EXEC DBMS_STATS.delete_index_stats('SCOTT', 'EMPLOYEES_PK');
DBA_TABLES
DBA_OBJECT_TABLES
DBA_TAB_STATISTICS
DBA_TAB_COL_STATISTICS
DBA_TAB_HISTOGRAMS
DBA_INDEXES
DBA_IND_STATISTICS
DBA_CLUSTERS
DBA_TAB_PARTITIONS
DBA_TAB_SUBPARTITIONS
DBA_IND_PARTITIONS
DBA_IND_SUBPARTITIONS
DBA_PART_COL_STATISTICS
DBA_PART_HISTOGRAMS
DBA_SUBPART_COL_STATISTICS
DBA_SUBPART_HISTOGRAMS
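The dictionary views listed above expose the gathered statistics. The queries below are a sketch of how to read the table, column, and index statistics for the SCOTT.EMPLOYEES example. The column lists assume the 10g versions of these views.

```sql
-- Table statistics: row count, blocks, average row length, staleness.
SELECT table_name, num_rows, blocks, avg_row_len, last_analyzed, stale_stats
  FROM dba_tab_statistics
 WHERE owner = 'SCOTT'
 ORDER BY table_name;

-- Column statistics: distinct values, nulls, histogram type.
SELECT column_name, num_distinct, num_nulls, histogram
  FROM dba_tab_col_statistics
 WHERE owner = 'SCOTT'
   AND table_name = 'EMPLOYEES'
 ORDER BY column_name;

-- Index statistics: levels, leaf blocks, clustering factor.
SELECT index_name, blevel, leaf_blocks, clustering_factor
  FROM dba_ind_statistics
 WHERE table_owner = 'SCOTT'
   AND table_name = 'EMPLOYEES'
 ORDER BY index_name;
```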
Gather statistics during the day. Gathering ends after 720 minutes, and the
results are stored in the mystats table:
BEGIN
DBMS_STATS.GATHER_SYSTEM_STATS(
gathering_mode => 'interval',
interval => 720,
stattab => 'mystats',
statid => 'OLTP');
END;
/
Gather statistics during the night. Gathering ends after 720 minutes, and the
results are stored in the mystats table:
BEGIN
DBMS_STATS.GATHER_SYSTEM_STATS(
gathering_mode => 'interval',
interval => 720,
stattab => 'mystats',
statid => 'OLAP');
END;
/
If appropriate, you can switch between the statistics gathered. It is possible to
automate this process by submitting a job to update the dictionary with
appropriate statistics.
During the day, the following job imports the OLTP statistics for the daytime
run:
VARIABLE jobno number;
BEGIN
  DBMS_JOB.SUBMIT(:jobno,
    'DBMS_STATS.IMPORT_SYSTEM_STATS(''mystats'',''OLTP'');',
    SYSDATE, 'SYSDATE + 1');
  COMMIT;
END;
/
During the night, the following job imports the OLAP statistics for the
nighttime run:
BEGIN
  DBMS_JOB.SUBMIT(:jobno,
    'DBMS_STATS.IMPORT_SYSTEM_STATS(''mystats'',''OLAP'');',
    SYSDATE + 0.5, 'SYSDATE + 1');
  COMMIT;
END;
/