© 2008, 2009 Oracle Corporation – Proprietary and Confidential
Exadata Oracle Database Machine Overview
Student Manual
For Oracle employees and authorized partners only. Do not distribute to third parties.
Module Agenda
Why Exadata?
Database Machine
Requirements with Oracle Database 11g Release 2
• Data integrity
• MVRC (multi-version read consistency)
• Data Guard
• ASM
Database Machine
Requirements with Oracle Database 11g Release 2
• Data integrity
• Performance
• MVRC
• Efficient caching
• Bitmap indexes
• Partitioning
• Materialized views
Database Machine
Requirements with Oracle Database 11g Release 2
• Data integrity
• Performance
• Scalability
• RAC
• Powerful platforms
Database Machine
Requirements
• Data integrity
• Performance
• Scalability
Oracle delivers!
Database Machine
What more?
• Efficiency

  Efficiency = Resources / Demands
Business impact of efficiency
• Business queries if 10 TB of data must be scanned:

  Data    Bandwidth    Query Rate           Business Impact
  10 TB   1 GB/sec     3 queries/work day   Don't Ask
  10 TB   10 GB/sec    3 queries/hour       Ask Tomorrow
  10 TB   100 GB/sec   35 queries/hour      Ask Anything

Exadata Database Machine
Extreme efficiency
Exadata Hardware Architecture
Scalable Grid of industry standard servers for Compute and Storage
• Eliminates the long-standing tradeoff among Scalability, Availability, and Cost
Exadata Smart Flash Cache
Extreme Performance OLTP & DW
Exadata Intelligent Storage
Traditional Scan Processing

SELECT customer_name
FROM calls
WHERE amount > 200;

• With traditional storage, all database intelligence resides in the database hosts
• The table extents are identified, the I/Os are issued and executed, and 1 terabyte of data is returned to the hosts
• The DB host reduces the terabyte of data to the 1,000 customer names that are returned to the client
• A very large percentage of the data returned from storage is discarded by the database servers
• Discarded data consumes valuable resources and impacts the performance of other workloads
Exadata Smart Scan Processing
Reduces demand

SELECT customer_name
FROM calls
WHERE amount > 200;

• Only the relevant columns (customer_name) and required rows (WHERE amount > 200) are returned to the hosts
Additional Smart Scan functionality
Reduces demand
• Join filtering
  • Filtering is performed within the Exadata storage cells
  • Join predicates are transformed into filters
• Backups
  • Only changed blocks are returned
• CREATE TABLESPACE (file creation)
  • Formatting of tablespace extents eliminates the I/O associated with the creation and writing of tablespace blocks
• Smart Scan offload for encrypted tablespaces and columns
• Offload of Data Mining model scoring
Exadata Hybrid Columnar Compression
Highest Capacity, Lowest Cost
Exadata Storage Index
Transparent I/O Elimination with No Overhead
• Exadata Storage Indexes maintain summary information about table data in memory
• They store the MIN and MAX values of columns
• Typically one index entry for every 1 MB of disk
• Disk I/Os are eliminated if the MIN and MAX can never match the "where" clause of a query
• Completely automatic and transparent

Example: for a table with columns A, B, C, D, the first storage region has Min B = 1 and Max B = 5; the second has Min B = 3 and Max B = 8.

SELECT * FROM Table WHERE B < 2;  -- only the first set of rows can match
Exadata I/O Resource Management
Mixed Workloads and Multi-Database Environment
• Database A:
• Reporting: 60% of I/O resources
• ETL: 40% of I/O resources
• Database B:
• Interactive: 30% of I/O resources
• Batch: 70% of I/O resources
Exadata benefits
Exadata Benefits
Fast Predictable Performance
Brian Camp
SVP, Infrastructure Services
KnowledgeBase Marketing
Performance: Query Throughput
[Chart: query throughput, and query throughput with flash, for the Hitachi USP V, Teradata 2550, Netezza TwinFin 12, and the Sun Oracle Database Machine; the Database Machine leads by a wide margin]

Why is Oracle Faster?
• DB processing in storage
• Better compression (10x)
• Smart Flash Cache
• Faster interconnect (40 Gb/sec)
• More disks
• Faster disks (15K RPM)
Exadata Performance Scales
[Chart: table scan time]
• Exadata delivers brawny hardware for use by Oracle's brainy software
Exadata Hardware Architecture
Scalable Grid of industry standard servers for Compute and Storage
• Eliminates the long-standing tradeoff among Scalability, Availability, and Cost
Standardized and Simple to Deploy
Paul Hartley
General Manager
LGR Telecommunications
Exadata Storage Server Building Block
• High-performance storage server built from industry standard components
  • Hardware by Sun
  • Software by Oracle
• 12 disks: 600 GB 15,000 RPM High Performance SAS or 2 TB 7,200 RPM High Capacity SAS
• 4 x 96 GB Flash Cards
New - Exadata Database Machine X2-8 Full Rack
Extreme Performance for Consolidation, Large OLTP and DW
Exadata Product Capacity

                            X2-8 Full Rack   X2-2 Full Rack   X2-2 Half Rack   X2-2 Quarter Rack
Raw Disk(1)  High Perf Disk   100 TB           100 TB           50 TB            21 TB
             High Cap Disk    336 TB           336 TB           168 TB           72 TB

1 – Raw capacity calculated using 1 GB = 1000 x 1000 x 1000 bytes and 1 TB = 1000 x 1000 x 1000 x 1000 bytes.
2 – User Data: actual space for end-user data, computed after single mirroring (ASM normal redundancy) and after allowing space for database structures such as temp, logs, undo, and indexes. Actual user data capacity varies by application. User Data capacity calculated using 1 TB = 1024 x 1024 x 1024 x 1024 bytes.
Exadata Product Performance

                                 X2-8 Full Rack   X2-2 Full Rack   X2-2 Half Rack   X2-2 Quarter Rack
Raw Disk Data    High Perf Disk  25 GB/s          25 GB/s          12.5 GB/s        5.4 GB/s
Bandwidth(1,4)   High Cap Disk   14 GB/s          14 GB/s          7 GB/s           3 GB/s
Database Server Operating System Choices
Exadata Licensing
Database nodes
Required Products
Oracle Database 11g Enterprise Edition
Oracle Exadata Storage Server Software
Highly recommended products
RAC
Partitioning Option
Other Recommended Software
Advanced Compression Option
Enterprise Manager Packs: Diagnostics, Provisioning, Tuning
OLAP Option
Data Mining Option
Advanced Security Option
Real Application Testing
Oracle Business Intelligence Enterprise Edition Plus
Exadata summary
Exadata Database Machine Summary
Extreme Performance for all Data Management
Smart features
Module Agenda
Smart Scans
Smart Scan
Traditional Scan Processing
• I/Os are issued and executed: 1 terabyte of data is returned to the hosts
• Table extents and metadata are sent to the cells
Exadata Smart Scan Processing

SELECT customer_name
FROM calls
WHERE amount > 200;

• Smart Scan example:
  • Table extents and metadata are sent to the cells
  • Smart Scan processing on the Exadata Storage cells scans data blocks to identify the relevant rows and columns
  • Smart Scan identifies the rows and columns within the terabyte table that match the request
  • Only the relevant rows and columns are returned to the database server: about 2 MB of data instead of a terabyte
  • Blocks are not returned when Smart Scan is used; blocks are returned when that is appropriate
  • The database server only has to assemble the returned relevant data, consolidated from all cells, into the result set
  • No wasted I/O bandwidth or database server CPU
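The effect described above can be checked from the database side; the statistic names below are the cell-related statistics as of Oracle Database 11.2, and the query itself is the example from the slides:

```sql
-- Run the offload-eligible query, then compare how many bytes were
-- eligible for predicate offload with how many actually crossed the
-- interconnect after Smart Scan filtering
SELECT customer_name FROM calls WHERE amount > 200;

SELECT name, value
FROM   v$sysstat
WHERE  name IN ('cell physical IO bytes eligible for predicate offload',
                'cell physical IO interconnect bytes returned by smart scan');
```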
Smart Scan at work
Join filtering
• Join predicates are transformed into filters, and the filtering is performed within the Exadata storage cells
[Diagram sequence: Step 1 performs a full access of one table to build the filter conditions; in Step 2, Smart Scan processing applies those conditions in the cells, so no full access of the joined table is needed]
Offload of Data Mining model scoring

SELECT cust_id
FROM   customers
WHERE  region = 'US'
AND    prediction_probability(churnmod, 'Y' USING *) > 0.8;

• The scoring function is executed in Exadata
Smart Scan and EXPLAIN PLAN
• For a single query, EXPLAIN PLAN reveals offload eligibility: the operation name and predicate information will use the keyword STORAGE

CELL_OFFLOAD_PLAN_DISPLAY
• Controls whether the offload status of a step in an execution plan is displayed
• Set with the ALTER SYSTEM or ALTER SESSION commands
• Values:
  • AUTO (default) – displays the predicates if a cell is present and the table is on the cell
  • ALWAYS – shows the option whether a cell is present or not
  • NEVER – does not display the offload status
• Be aware – the optimizer does not control whether processing is actually offloaded, just whether it is eligible
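A minimal sketch of the mechanics, reusing the calls query from earlier slides (exact plan output varies by version):

```sql
-- Show storage predicates in plans even when no cell is present
ALTER SESSION SET cell_offload_plan_display = ALWAYS;

EXPLAIN PLAN FOR
  SELECT customer_name FROM calls WHERE amount > 200;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
-- Offload-eligible steps appear as TABLE ACCESS STORAGE FULL,
-- with the filter repeated in a storage(...) predicate section
```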
Monitoring Smart Scan
Efficiency
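Per-statement efficiency can be monitored as well; the V$SQL columns below exist as of 11.2, and the LIKE pattern is illustrative:

```sql
-- Compare bytes eligible for offload with bytes actually shipped
-- over the interconnect for a given cached statement
SELECT sql_id,
       io_cell_offload_eligible_bytes,
       io_interconnect_bytes
FROM   v$sql
WHERE  sql_text LIKE 'SELECT customer_name%';
```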
Smart Scans
Building on benefits
[Chart: 20 GB read with Storage Indexes, 5 GB with Smart Scans; subsecond on the Database Machine]
Compression
Module Agenda
Oracle compression options

Oracle Database Compression
[Diagram: compression options by release; OLTP table compression is part of the Advanced Compression Option in Oracle Database 11g]
Advanced Compression
Compress All Your Data: 4x compression
Advanced Compression Option
Table Compression
• Oracle Database 11g extends table compression for
OLTP (and other) data
• Support for conventional DML operations
• Average storage savings of 2-4x
• New algorithm significantly reduces write overhead
• Improved performance for queries accessing large
amounts of data
• Compression enabled at either the table or partition
level
• Completely transparent to applications
Table Compression
Block-Level Batch Compression
[Diagram: duplicate values such as "Jack Smith" are replaced with references to a local symbol table stored within the block]
Table Compression Syntax
OLTP Table Compression Syntax:
CREATE TABLE emp (
emp_id NUMBER
, first_name VARCHAR2(128)
, last_name VARCHAR2(128)
) COMPRESS FOR OLTP;
* http://www.oracle.com/technology/products/database/compression/compression-advisor.html
Advanced Compression Option
SecureFiles
• Next-generation high performance LOB
• Superset of LOB interfaces allows easy migration from LOBs
• Transparent deduplication, compression, and encryption
• Leverage the security, reliability, and scalability of database
• Enables consolidation of file data with associated relational data
• Single security model
• Single view of data
• Single management of data
• Scalable to any level using SMP scale-up or grid scale-out
• SecureFiles standard with Oracle Database 11g
• Compression and deduplication with Advanced Compression
Option
• Encryption with Advanced Security Option
SecureFiles
Deduplication
[Diagram: a secure hash is computed for each LOB; duplicate LOBs are detected and stored only once]
[Charts: OLTP and batch workload throughput with compression and with no compression; redo transport rate (Mbit/sec) with and without compression]

Advanced Compression in the real world
Advanced Compression
Oracle's Internal E-Business Application DB
• Oracle's internal E-Business Suite production system deployed ACO in 2009
• 4-node Sun E25K RAC, 11g Release 1
• Average overall storage savings: 3x
  • Table compression: 4x
  • Index compression: 2x
  • LOB compression: 2.3x
• 65 TB of realized storage savings across primary, standby, and test systems
• Additional benefits were also accrued in dev clones and backups
• Payroll, Order-2-Cash, AP/AR batch flows, and self-service flows run without regression; queries involving full table scans show speedup
Advanced Compression
Oracle's Internal Beehive Email DB
• Production system on 11g Release 1 and Exadata for primary and standby
• Uses Exadata Storage Servers for storage
• Average compression ratio: 2x
• Storage savings add up with standby, mirroring, and the flash recovery area
• Compression went into production in 2009
• Consolidates 90K employees on this email server, with more being migrated
• Savings as of April 2010:
  • Beehive saved 365 TB of storage using Advanced Compression
  • Incrementally saves 2.6 TB/day based on database size growth
  • Savings are higher with the Sun user migration
• Compression also helped improve performance by caching only compressed emails in memory and reducing I/O latencies
Advanced Compression
SAP R/3, BW at a Leading Global Company
• Compression on SAP databases at a leading global company
• Oracle Database 11g Release 2
• SAP R/3 DB
  • 4.67 TB uncompressed
  • 1.93 TB compressed
  • 2.4x compression ratio
• SAP BW DB
  • 1.38 TB uncompressed
  • 0.53 TB compressed
  • 2.6x compression ratio
• Leverages 11g compression for tables, indexes, and LOB data
Exadata Hybrid Columnar Compression
Exadata Hybrid Columnar Compression
• New in Exadata Version 2
• Hybrid columnar compressed tables
  • A new approach to compressed table storage
  • Useful for data that is bulk loaded and queried, where update activity is light
• How it works
  • Tables are organized into Compression Units (CUs)
  • CUs are a multiple of the database block size
  • Within a Compression Unit, data is organized by column instead of by row
  • Column organization brings similar values close together, enhancing compression
• Typical reduction: 10x to 15x
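A sketch of the DDL involved (table names illustrative; EHCC requires Exadata storage, and the data should be bulk loaded):

```sql
-- Warehouse compression: optimized for scan and query performance
CREATE TABLE sales_hist
  COMPRESS FOR QUERY HIGH
  AS SELECT * FROM sales;

-- Archive compression: optimized for maximum storage savings
ALTER TABLE sales_2004 MOVE COMPRESS FOR ARCHIVE HIGH;
```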
Exadata Hybrid Columnar Compression
Compression Units
• Compression Unit
• Logical structure spanning multiple database blocks
• Data organized by column during data load
• Number of rows for a CU determined at load, based on row size
and estimated compression
• Each column compressed separately
• All column data for a set of rows stored in compression unit
Logical Compression Unit
Can mix OLTP and hybrid columnar compression by partition for ILM
Exadata Hybrid Columnar Compression
Warehouse Compression
• 10x average storage savings
  • A 100 TB database compresses to 10 TB
  • Reclaim 90 TB of disk space: room for 9 more '100 TB' databases
• 10x average scan improvement
  • 1,000 IOPS reduced to 100 IOPS
Exadata Hybrid Columnar Compression
Archive compression
• Compression algorithm optimized for maximum storage
savings
• Benefits any application with data retention requirements
• Best approach for ILM and data archival
• Minimum storage footprint
• No need to move data to tape or less expensive disks
• Data is always online and always accessible
• Run queries against historical data (without recovering from tape)
• Update historical data
• Supports schema evolution (add/drop columns)
Exadata Hybrid Columnar Compression
Archive compression
• Optimal workload characteristics for Archive compression
• Any application (OLTP, Data Warehouse)
• Cold or historical data
• Data loaded with bulk load operations or compressed using in-database bulk compression operations
• Minimal access and update requirements
• OLTP Applications
• Table partitioning
• Heavily accessed data
• Partitions using OLTP Table Compression
• Cold or historical data
• Partitions using Online Archival Compression
• Data Warehouses
• Table partitioning
• Heavily accessed data
• Partitions using Warehouse Compression
• Cold or historical data
• Partitions using Online Archival Compression
EHCC benefits
Efficient data movement
• Read/Write compressed data to disk
• Write compressed data to ASM mirrors
• Read/Write compressed data in Flash Cache
• 10x improvement for Flash price performance
• Send compressed data over InfiniBand
• Write compressed data to Redo Logs
• Send compressed data to standby database
• 10x reduction in WAN bandwidth cost: makes ADG appealing for DW
• Write compressed data to Backups
Oracle Confidential
EHCC benefits
Efficient queries
• Compression Ratios
• Query High: 11x
• Archive High: 16x
• Load Performance
• data pump loading from flat file
• 28% increase in elapsed time
• Query Performance
• 40% faster to execute 60
queries in customer workload
EHCC benefits
Table scan performance
• Table scans of EHCC data run significantly faster than
uncompressed
• Sample test run (uses Call Data Record data, 46
columns)
• Compression ratio: 14x
• Load takes 55% more time
• Table Scan runs 5.5x faster (less disk I/O)
Exadata Hybrid Columnar Compression
Estimating savings
• EHCC Compression Advisor
• Runs on any 11.2 setup (non-Exadata too)
• Given a sample of customer data, provides compression ratio
estimates
• Patch available for 11.2 (8896202)
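The advisor is exposed through the DBMS_COMPRESSION package in 11.2; a minimal sketch, assuming a scratch tablespace named USERS and a table named SALES (both illustrative):

```sql
SET SERVEROUTPUT ON
DECLARE
  l_blkcnt_cmp   PLS_INTEGER;
  l_blkcnt_uncmp PLS_INTEGER;
  l_row_cmp      PLS_INTEGER;
  l_row_uncmp    PLS_INTEGER;
  l_cmp_ratio    NUMBER;
  l_comptype_str VARCHAR2(100);
BEGIN
  -- Estimate the EHCC "query high" ratio from a sample of the table
  DBMS_COMPRESSION.GET_COMPRESSION_RATIO(
    scratchtbsname => 'USERS',
    ownname        => USER,
    tabname        => 'SALES',
    partname       => NULL,
    comptype       => DBMS_COMPRESSION.COMP_FOR_QUERY_HIGH,
    blkcnt_cmp     => l_blkcnt_cmp,
    blkcnt_uncmp   => l_blkcnt_uncmp,
    row_cmp        => l_row_cmp,
    row_uncmp      => l_row_uncmp,
    cmp_ratio      => l_cmp_ratio,
    comptype_str   => l_comptype_str);
  DBMS_OUTPUT.PUT_LINE('Estimated ratio: ' || l_cmp_ratio);
END;
/
```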
Storage indexes
Storage indexes
Exadata Storage Index 11.2
Transparent I/O Elimination with No Overhead
• Exadata Storage Indexes maintain summary information about table data in memory
• They store the MIN and MAX values of columns
• Typically one index entry for every 1 MB of disk
• Disk I/Os are eliminated if the MIN and MAX can never match the "where" clause of a query
• Completely automatic and transparent

Example: for a table with columns A, B, C, D, the first storage region has Min B = 1 and Max B = 5; the second has Min B = 3 and Max B = 8.

SELECT * FROM Table WHERE B < 2;  -- only the first set of rows can match
Storage indexes
How they work
[Diagram: a table with columns Order_date, Ship_date, Cust_ID, Prod_ID, and Amount stored in chunks; the storage index keeps per-chunk ranges such as Cust_ID 10075–20098, Prod_ID 20010–32932, and dates 05-SEP-2009 to 07-OCT-2009, and a predicate outside a chunk's range leads to elimination of data chunk #2]
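The I/O a storage index avoids is visible from the database side as a cell statistic (statistic name as of 11.2):

```sql
SELECT name, value
FROM   v$sysstat
WHERE  name = 'cell physical IO bytes saved by storage index';
```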
Resource Manager
Module Agenda
Resource Manager overview

Resource Manager
Overview
• OLTP group, mapped by attributes such as:
  • service = 'Customer_Service'
  • client program name = 'Siebel Call Center'
  • Oracle username = 'Mark Marketer'
• Reports group:
  • module name = 'AdHoc'
• Low-Priority group:
  • query has been running > 1 hour
  • estimated execution time of query > 1 hour
Creating Resource Plans
• Priority-Based Plan
  • Priority 1: OLTP
  • Priority 2: Reports
  • Priority 3: Ad-Hoc
• Ratio-Based Plan
  • OLTP 60%, Reports 30%, Low-Priority 10%
• Hybrid Plan
  • Level 1: OLTP 90%
  • Level 2: Reports 60%, Low-Priority 40%
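A single-level (ratio-based) plan like the one above can be sketched with DBMS_RESOURCE_MANAGER; plan and group names are illustrative, and every plan must include a directive for OTHER_GROUPS:

```sql
BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.CREATE_PLAN(
      plan => 'DAYTIME_PLAN', comment => 'ratio-based example');
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP('OLTP',    'interactive work');
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP('REPORTS', 'reporting work');
  -- 60/30/10 split at level 1
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(plan => 'DAYTIME_PLAN',
      group_or_subplan => 'OLTP',         comment => '', mgmt_p1 => 60);
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(plan => 'DAYTIME_PLAN',
      group_or_subplan => 'REPORTS',      comment => '', mgmt_p1 => 30);
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(plan => 'DAYTIME_PLAN',
      group_or_subplan => 'OTHER_GROUPS', comment => '', mgmt_p1 => 10);
  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/
```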
Enable Resource Management
• Manually
• Set resource_manager_plan parameter
• Automatically
• Set resource plan for a scheduler window
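Manual enablement is a one-line change (plan name illustrative):

```sql
-- Enable a plan instance-wide; an empty string disables Resource Manager
ALTER SYSTEM SET resource_manager_plan = 'DAYTIME_PLAN';
```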
Contending CPU
workloads
Resource Manager
Contending CPU workloads
[Chart: CPU usage for OLTP only, Reports only, and ETL + Reports; under contention a workload can fall to roughly 40% of its standalone throughput]
• What if you cannot tolerate performance degradations for certain workloads?

Resource Manager
Contending CPU workloads
[Chart: with OLTP or Reports prioritized, the prioritized workload receives 80–90% of the CPU while the other workloads receive the remaining 10–20%]
• With Resource Manager, you control how CPU resources should be allocated
Resource Manager
CPU management details

Parallel execution workload management
Parallel execution: potential problems
• Automatic parallel statement management was introduced in 11.2.0.1
• Goals:
  1. Run enough parallel statements to fully utilize system resources
  2. Ensure an appropriate degree of parallelism for all statements
• Enable by setting parallel_degree_policy = AUTO
Parallel Statement Queuing
[Diagram: parallel statements wait in a FIFO queue until enough parallel servers are free, e.g. a DOP-8 statement queues while 128 servers are busy]
Parallel Statement Queuing
With Resource Manager
• Potential problems without Resource Manager:
  • One Consumer Group can flood the system and the queue with queries
  • Critical queries are forced to queue
  • Critical queries are stuck behind batched queries
• With Resource Manager you can:
  • Limit the DOP for queries from a Consumer Group
  • Limit the percentage of parallel servers a Consumer Group can use
  • Reserve parallel servers for critical parallel queries
  • (Coming soon) Separate queues per Consumer Group
  • (Coming soon) The Resource Plan specifies from which queue parallel statements are issued next
Parallel Statement Queuing
With Resource Manager
[Diagram: separate queues for the Tactical (T), Normal (N), and Ad-Hoc (A) Consumer Groups feed the parallel query selection step ahead of the running queries, under the current Resource Plan: Priority 1: Tactical; Priority 2, 70%: Normal; Priority 2, 30%: Ad-Hoc]
Test Results: 2 Concurrent Workloads
Database consolidation
Database consolidation challenges
Service levels
[Chart: % disk utilization over 14 minutes with no limit and with 75%, 50%, and 25% I/O limits]
Server consolidation
Server Consolidation
Challenges
• A common theme in today's data centers
• Many test, development, and small production databases
  • Low loads
  • Not critical
• Cannot fully utilize today's powerful servers!
• Solution – server consolidation
• Run multiple database instances on the same server
• But there may be problems
• Contention for CPU, memory, and I/O
• Unexpected workload surges on one instance can wreak
havoc on other databases
Server consolidation challenge
Instance Caging
• Just 2 steps:
  1. Set the cpu_count parameter
     • The maximum number of CPUs the instance can use at any time
  2. Set the resource_manager_plan parameter
     • Enable any CPU resource plan
     • E.g. the out-of-box plan DEFAULT_PLAN
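The two steps above, as statements (the cpu_count value is illustrative):

```sql
-- 1. Cap the instance at 4 CPUs
ALTER SYSTEM SET cpu_count = 4;
-- 2. Enable any CPU resource plan, e.g. the out-of-box DEFAULT_PLAN
ALTER SYSTEM SET resource_manager_plan = 'DEFAULT_PLAN';
```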
Instance caging
Over-provisioning approach
• Scenario:
  • Multiple database instances sharing a server (32 CPUs in total)
  • Performance-critical databases that cannot afford any interference from each other
• Use Instance Caging to partition the CPUs
[Diagram: instances A, B, C, and D stacked against the 32 available CPUs; sum of cpu_counts = 12]
Instance Caging
Results
• Swingbench OLTP application plus 2 sysbench applications on a 4-CPU Linux server
[Chart: per-instance CPU utilization and throughput as the cpu_count pair <Instance 1>,<Instance 2> is varied from 0,6 through 3,3 to 6,0]
Exadata I/O Resource Manager
Module Agenda

Managing the I/O Bandwidth with IORM
• I/O Resource Manager provides a way to manage how multiple workloads and databases share the available I/O bandwidth
[Diagram: a production database running critical reports and a development database share Exadata storage across the InfiniBand switch/network; the desired bandwidths are 1000 MB/s and 800 MB/s, but without management the actual bandwidths come out as 800 MB/s and 100 MB/s]
When Does I/O Resource Manager
Help the Most?
• Conflicting Workloads
• Multiple consumer groups in a Database (e.g. ad hoc queries,
critical reports)
• Multiple databases (e.g. production, test)
• Concurrent database administration - backups, ETL, file
creation
• I/O is a bottleneck
• Significant proportion of the wait events are for I/O
• Any data warehouse workload!
IORM resource management
IORM Possible Scenarios
• I/O Resource Manager operates both inside one database and across multiple databases
• Example consumer group mappings:
  • Priority DSS: service = 'PRIORITY'; Oracle username = 'LARRY'
  • DSS: Oracle username = 'DEV'; client program name = 'ETL'
  • Maintenance: function = 'BACKUP'; query has been running > 1 hour
Creating Resource Plans
• Priority-Based Plan
  • Priority 1: Priority DSS
  • Priority 2: DSS
  • Priority 3: Maintenance
• Ratio-Based Plan
  • Priority DSS 60%, DSS 30%
[Diagram: multiple databases sharing storage, e.g. a Customer Service standby database, a Sales development database, and a Sales test database]
IORM Resource Management
Inter-database
[Diagram: I/O requests from the Sales database (P) are scheduled ahead of development requests (D) under a Sales-Priority plan]
IORM examples

IORM Possibilities
• Priority-Based Plan
  • Priority 1: Interactive
  • Priority 2: Batch

Scenario 1: OLTP vs Report
Results
• I/O Resource Manager boosts OLTP performance by 408%!
• The report has only a small effect on OLTP performance (8%)
  • Report data uses significant disk space, resulting in longer seek times
• The storage system is fully utilized
  • OLTP workload: 376 IOPS per disk
  • Report workload: 5 MBps per disk
[Chart: OLTP throughput in TPS, a more than 4x improvement with IORM]
• Priority-Based Plan
  • Priority 1: Production Data Warehouse
  • Priority 2: Development Data Warehouse

Scenario 2: DSS Query vs DSS Query
Results
• I/O Resource Manager improves critical query time by 41%
• The non-critical query has only a small effect on the critical query (9%)
  • Report data uses significant disk space, resulting in longer seek times
• Running the queries together is 17% more efficient than running them serially
[Chart: critical query elapsed time in seconds for the critical query alone, both queries without IORM, and both queries with IORM; a 41% improvement with IORM]
IORM at work
How IORM operates
[Diagram: with a traditional storage server, high-priority (H) and low-priority (L) I/O requests from the RDBMS land in the single disk queue in arrival order, e.g. H L H L L L]
I/O Scheduling
The Exadata way
• Exadata executes requests based on the user's prioritization scheme
• Exadata may internally queue I/O requests to prevent a low-priority intensive workload from flooding the disk
[Diagram: in the cell, the I/O Resource Manager keeps a high-priority workload queue (H) and a low-priority workload queue (L) and feeds the disk queue predominantly from the high-priority queue, e.g. H L H H]
IORM Resource Plans
• I/O Resource Manager issues enough I/O requests to the disk to keep it busy and efficient
• One queue for each consumer group
• When IORM is ready to issue the next request, it uses the Resource Plan to select a consumer group queue
• The percentage for each queue is determined by the overall resource plan
[Diagram: in the cell, the Priority DSS (P) and DSS (D) consumer group queues feed the disk queue, e.g. P D P P]
IORM
Background I/Os
Enabling IORM
Enabling IORM
Steps
• Define consumer groups with DBRM
  • You must assign sessions to consumer groups, either manually or through consumer group mapping rules
• Create the intra-database plan with Database Resource Manager
• [Assign categories to consumer groups with DBRM]
• [Create the inter-database plan with CellCLI]
• Enable the plan with the RESOURCE_MANAGER_PLAN parameter
• Enable the IORMPLAN on all cells
  • DBPLAN and CATPLAN
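A sketch of the cell-side step (database names and percentages illustrative; the dbplan requires an allocation for other, and `-` is CellCLI's line-continuation character):

```
CellCLI> ALTER IORMPLAN                                    -
         dbplan=((name=prod,  level=1, allocation=70),     -
                 (name=dev,   level=1, allocation=30),     -
                 (name=other, level=2, allocation=100)),   -
         catplan=''

CellCLI> LIST IORMPLAN DETAIL
```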
Enabling IORM
Flash Cache
Module Agenda

[Diagram: Exadata cell hardware and storage layout: a disk controller HBA with 512 MB battery-backed cache, 2 quad-core Intel® Xeon® E5540 processors, flash disks, and hard disks; each cell disk is divided into grid disks, which are grouped into ASM disk groups]
Flash Cache
Creating grid disks
• Flash-based cell disks and grid disks
CellCLI> LIST CELLDISK DETAIL
name: FD_00_cell01
diskType: FlashDisk
. . .
name: CD_00_cell01
diskType: HardDisk
. . .
Flash Cache at work
Read operations
[Diagram: the database server's read request goes to CellSRV, which checks the Flash Cache and serves the data from flash or, on a miss, from disk]

Flash Cache at work
Write operations
[Diagram: the database server's write goes to CellSRV, which writes the data to disk]

Flash Cache at work
Post operation (read or write)
[Diagram: after the operation, CellSRV decides whether to place the data in the Flash Cache]
name: cell01_FLASHCACHE
cellDisk: FD_00_cell01,FD_01_cell01
. . .
FD_14_cell01,FD_15_cell01
creationTime: 2009-10-19T17:18:35-07:00
id: b79b3376-7b89-4de8-8051-6eefc
size: 365.25G
status: normal
Flash Cache monitoring
MS Metrics
• Get overall statistics for Smart Flash Cache on a Cell
name: FC_BYKEEP_USED
description: "Megabytes used for keep objects on FlashCache"
Flash Cache monitoring
Finding if an object is cached
• Cell-level caching statistics for a DB object
• AWR report
STATISTIC_NAME VALUE
------------------------ ------
optimized physical reads 743502
Flash Response monitoring
Mapping cell disks
Flash Cache troubleshooting
Data integrity
Protection
• Flash Cache is less stable than disk
• Flash Cache includes read-cache-verification
• A few 'check bytes' are stored in memory for every 4 KB of data written to flash
• During flash reads the 'check bytes' are verified
• If verification fails, data is read from disk
• Checking for Flash Cache errors
• CellCLI> LIST METRICCURRENT FC_IO_ERRS
• Check the CELLSRV alert.log file
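The read-cache-verification idea above can be sketched as follows. This is a hedged illustration of the mechanism, not Exadata's actual code; the class and its behavior are invented for the example:

```python
# Sketch of read-cache verification: a small checksum ("check bytes") is kept
# in memory for every 4 KB block written to flash; on a flash read the
# checksum is recomputed, and on a mismatch the data is re-read from disk.
import hashlib

BLOCK = 4096

def check_bytes(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()[:4]   # a few bytes suffice

class FlashCacheSketch:
    def __init__(self, disk):
        self.disk = disk          # authoritative copy, dict addr -> bytes
        self.flash = {}           # cached copy
        self.checks = {}          # in-memory check bytes
        self.fallback_reads = 0

    def write(self, addr, data):
        self.disk[addr] = data
        self.flash[addr] = data
        self.checks[addr] = check_bytes(data)

    def read(self, addr):
        if addr in self.flash:
            data = self.flash[addr]
            if check_bytes(data) == self.checks[addr]:
                return data
            self.fallback_reads += 1   # verification failed: go to disk
        return self.disk[addr]

cache = FlashCacheSketch(disk={})
cache.write(0, b"x" * BLOCK)
assert cache.read(0) == b"x" * BLOCK
cache.flash[0] = b"y" * BLOCK          # simulate flash corruption
assert cache.read(0) == b"x" * BLOCK   # served from disk instead
```

The failed verification is what FC_IO_ERRS would count on a real cell.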
Oracle Exadata Database Machine performance
Module Agenda
InfiniBand: roughly 2.5 GB/s
Sun Oracle Database Machine
Data flow enhancement
• The Sun Oracle Database Machine implements best
practices for throughput
• Balanced configuration, designed to avoid bottlenecks at all
points in the data flow
• Software designed to reduce volume of data required to flow
• Elimination of
• Excess rows (predicate evaluation, join filtering, storage
indexes)
• Excess columns (column projection)
• Software designed to reduce disk I/O
• Storage indexes, Exadata Smart Flash Cache
• Software designed to efficiently allocate data flow bandwidth
• IORM
• Exchange 1:
• Scales horizontally to a maximum aggregate rate of roughly
67 GB/s
• Achieving this maximum theoretical rate involves parallel
scanning of Flash and HDD on all cells in a full rack.
• At this rate, data cannot flow through Exchange 2. That is,
the data cannot leave the cells at this rate.
• Think of a query that looks for a non-existent needle in
a haystack:
• SQL> SELECT base FROM payroll WHERE base > 8000000;
• Many chefs
Database Machine data flows
Scaling – Exchange 2
• Exchange 2:
• Scales out to a maximum aggregate rate of roughly 20 GB/s
• This is the aggregate rate of data flow between the storage
grid and the database grid.
• The difference between Exchange 1 and Exchange 2 is the
Smart Scan effectiveness.
• Cells must reduce payload through projection/filtration to fit
within Exchange 2 bandwidth.
• The aggregate out-flow rate from Exchange 2 must fit
within about 20 GB/s
Database Machine data flows
Scaling – Exchange 2
• Exchange 2:
• All hosts in the database grid must participate in order to
accommodate maximum Exchange 2 data flow
• That is, a configuration with fewer than 8 hosts cannot ingest this flow of data
• NOTE: A single Oracle foreground (no PQ) can drive
storage at roughly 20 GB/s but no data can flow from the
storage grid to the single foreground process at this rate.
Think of a fully offloaded query.
• 1 order, many eaters
Database Machine data flows
Scaling – Exchange 3
• Exchange 3:
• Scales horizontally to a maximum aggregate rate of roughly
20 GB/s
• This is the aggregate rate of data flow between the database
grid and the storage grid.
• PQ server must have sufficient CPU bandwidth else disk I/O
is throttled.
• All database hosts must participate to realize maximum
theoretical Exchange 3 bandwidth.
• Many orders, 1 dish
Sun Oracle Database Machine
Data flow maximums
[Diagram: InfiniBand data flow maximums — Exchange 2 carries roughly 2.5 GB/s per cell, about 20 GB/s aggregate.]
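The relationship between the two exchange limits can be checked with back-of-the-envelope arithmetic. The decomposition of the ~20 GB/s Exchange 2 limit as 8 database hosts at ~2.5 GB/s of InfiniBand payload each is an assumption consistent with the "all hosts must participate" remark, not an official specification:

```python
# Rough check of the aggregate figures quoted in the data-flow slides.
cells, db_hosts = 14, 8              # full rack
ib_per_node = 2.5                    # GB/s, rough payload rate per node

exchange1 = 67.0                     # GB/s, combined Flash+HDD scan inside the cells
exchange2 = db_hosts * ib_per_node   # cells -> database grid, assumed host-limited

# why Smart Scan must shrink the payload: cells can produce data more than
# 3x faster than it can leave them
headroom = exchange1 / exchange2
```

The `headroom` ratio is exactly the "Smart Scan effectiveness" gap between Exchange 1 and Exchange 2.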
Case studies
Case studies
Principles
• Complex Query
• Synopsis:
• 5-table join – Busy database CPUs
• Heavy predicate evaluation – Busy Storage Server CPUs
• See next slide
Case studies
Complex query
select custid, sum(refund_amt) returns from CUST_SERVICE cs where return_dt > ( SYSDATE - 180)
and CLUB_CARD_NUM > 0 and CC_NUMBER > 0 and cs.club_card_num not like '%A%' group by custid
),
xxx as ( select cmr.custid, cf.aff_cc_num, cmr.returns, sum(os.trans_amt) from yyy cmr, CF_BASE2 cf, OS_BASE2 os
where cf.custid = cmr.custid and cf.custid = os.custid and os.club_points_earned > 0 and os.STORE_CODE > 0
and os.TRANS_ID > 0 and os.CUSTID > 0 and ( cf.club_card_num not like '%A%' and cf.AFF_CC_NUM not like '%A%'
and cf.CUST_SHIPTO_DETAIL2 not like '%NO DETAIL%' and cf.CUSTDETAIL1 not like '%NO DETAIL%'
) and os.trans_dt > ( SYSDATE - 180) group by cmr.custid, cf.aff_cc_num, cmr.returns
having (returns / sum(os.trans_amt) * 100) > 2
)
select card_no, sum(purchase_amt) sales
from ACT_BASE2 act where act.purchase_dt > ( SYSDATE - 180)
and
card_no in ( select aff_cc_num from xxx) and act.merchant_code not in
( select merchant_code from PARTNER_MERCHANTS where store_zip > 0 and store_name not like '%ACME%')
and MERCHANT_CITY not like '%Frankfort%' and MERCHANT_STATE not like '%KY%'
and MCC > 1 and CARD_TYPE not like '%NO CARD%' and PURCHASE_AMT > 0
group by card_no
having sum(purchase_amt) > 10;
Case study 1
Results
• Light, lightweight scan
• Fully offloaded – no data returned to server
• Caveats: this full rack is configured with 450 GB SAS drives and is missing one 96 GB Flash card.
• Maximum HDD disk throughput is 20 GB/s, Combined
Flash+HDD is 54 GB/s.
SQL> SELECT MAX(MCC) FROM ALL_CARD_TRANS WHERE
MCC < 0;
Storage | Query | From storage (GB/s) | iDB (GB/s) | Cells CPU %busy | DB CPU %busy | Query Tm (sec) | CPU seconds | Effective GB/s
FLASH+HDD | Light, lightweight scan | 49 | 0.007 | 35 | 3 | 51 | 4194 | -
FLASH+HDD | Light, lightweight scan (HCC 6:1) | 23 | 0.007 | 90 | 2 | 20 | 4124 | 125
HDD | Light, lightweight scan | 20 | 0.003 | 15 | 3 | 125 | 4682 | -
HDD | Light, lightweight scan (HCC 6:1) | 19 | 0.012 | 70 | 5 | 24 | 3956 | 104
Case study 1
Results
[Charts: MB/sec and %CPU over time (3:30:34–3:31:00) for the cells and the database grid during the scan.]
• Heavy filtration/projection throttles I/O
• If the throughput you are measuring does not meet your expectations, remember the 3 Data Flow Exchanges
CellCLI, DCLI and ADRCI
Module Agenda
• CellCLI
• DCLI
• ADRCI
Exadata software architecture
Exadata software architecture
[Diagram: cell software stack — CellCLI communicates with the Management Server; CellSRV (incorporating IORM) serves iDB requests against the disks; the Restart Server monitors and restarts the cell services.]
CellCLI
CellCLI
Overview
• Command line utility for managing cell resources
• CellCLI runs on the cell
• Run locally from a shell prompt
• Run remotely via ssh or dcli
• Run automatically by EM agent with Exadata EM plugin
• Can run non-interactively
CellCLI>
CellCLI
Syntax
• Commands are not case-sensitive
• The - character continues a command onto the next line
• The ; command terminator is optional
• REM, REMARK, or -- indicates a comment
CellCLI
Commands
HELP [topic]
Available Topics:
ALTER
ALTER ALERTHISTORY
ALTER CELL
ALTER CELLDISK
ALTER GRIDDISK
ALTER IORMPLAN
ALTER LUN
ALTER THRESHOLD
ASSIGN KEY
CALIBRATE
CREATE
CREATE CELL
CREATE CELLDISK
CREATE GRIDDISK
…
CellCLI>
CellCLI
Object commands
• List and change cell resources
• Syntax: <verb> <object-type> [ALL |object-name] [<options>]
• Generic verbs: ALTER, CREATE, DROP, and LIST
used to change, create, remove, and display objects
CellCLI> create griddisk all prefix=data
GridDisk data_CD_1_stsd2s3 successfully created
GridDisk data_CD_2_stsd2s3 successfully created
GridDisk data_CD_3_stsd2s3 successfully created
GridDisk data_CD_4_stsd2s3 successfully created
GridDisk data_CD_5_stsd2s3 successfully created
...
DCLI
DCLI
Overview
• The DCLI script runs commands on multiple cells in
parallel threads.
• File copy and command execution occur on a set of cells in
parallel.
• Command output is collected and displayed after file copy
and command execution is finished on all cells.
• Setup:
• Copy DCLI from a cell (/opt/oracle/cell/cellsrv/bin/dcli) to the host from which management is done
• Create a file containing the list of cells to which commands are issued, e.g. mycells
• Run "dcli -k -g mycells" to create ssh key equivalence on the cells
DCLI
Return codes
• DCLI returns one of the following values
• 0 – The command(s) were copied and run on all designated cells
• 1 – One or more cells could not be reached or returned a non-zero
error code
• 2 – A local error prevented execution of any commands
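When driving DCLI from a wrapper script, the documented exit codes can be mapped to messages. The helper below is a hypothetical sketch (dcli itself ships only on Exadata systems, so invoking it is not shown):

```python
# Hypothetical helper: interpret the documented DCLI return codes when
# scripting around dcli.
DCLI_STATUS = {
    0: "commands copied and ran on all designated cells",
    1: "one or more cells unreachable or returned a non-zero code",
    2: "a local error prevented execution of any commands",
}

def describe_dcli_rc(rc: int) -> str:
    return DCLI_STATUS.get(rc, f"unexpected dcli return code {rc}")
```

A wrapper would pass `subprocess.run([...]).returncode` through this lookup before deciding whether to retry.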
DCLI
Example 1
$ scp celladmin@stsd2s3:/opt/oracle/cell/cellsrv/bin/dcli .
dcli 100% 32KB 31.6KB/s 00:00
$ dcli -g mycells -k
celladmin@stsd2s1's password:
celladmin@stsd2s2's password:
stsd2s1: ssh key added
stsd2s2: ssh key added
stsd2s3: ssh key already exists
options:
  --version            show program's version number and exit
  -c CELLS             comma-separated list of cells
  -f FILE              file to be copied
  -g GROUPFILE         file containing list of cells
  -h, --help           show help message and exit
  -k                   push ssh key to cell's authorized_keys file
  -l USERID            user to login as on remote cells (default: celladmin)
  -n                   abbreviate non-error output
  -r REGEXP            abbreviate output lines matching a regular expression
  -s SSHOPTIONS        string of options passed through to ssh
  --scp=SCPOPTIONS     string of options passed through to scp if different from sshoptions
  -t                   list target cells
  -v                   print extra messages to stdout
  --vmstat=VMSTATOPS   vmstat command options
  -x EXECFILE          file to be copied and executed
DCLI
Example 3
$ dcli -c mycells cellcli -e create griddisk all prefix="data", size=120G
stsd2s1: GridDisk data_CD_2_stsd2s1 successfully created
stsd2s1: GridDisk data_CD_3_stsd2s1 successfully created
stsd2s1: GridDisk data_CD_4_stsd2s1 successfully created
stsd2s1: GridDisk data_CD_5_stsd2s1 successfully created
stsd2s1: GridDisk data_CD_6_stsd2s1 successfully created
...
$ dcli -c mycells 'cellcli -e alter griddisk all availableTo=\"+ASM,dbm\"'
stsd2s1: GridDisk data_CD_1_1_stsd2s1 successfully altered
stsd2s1: GridDisk data_CD_2_stsd2s1 successfully altered
stsd2s1: GridDisk data_CD_3_stsd2s1 successfully altered
stsd2s1: GridDisk data_CD_4_stsd2s1 successfully altered
...
$ dcli -g mycells cellcli -e assign key for dbm='1212824bf214e59f3b60d1553b784cf0'
stsd2s1: Key for dbm successfully altered
stsd2s2: Key for dbm successfully altered
stsd2s3: Key for dbm successfully altered
ADRCI
ADRCI
Utility Overview
Exadata and availability
Module Agenda
• RAC
• ASM
[Diagram: each physical disk in a cell is carved into grid disks 1…n plus a system area; grid disks are presented to ASM disk groups.]
Exadata storage
Cell disks
[Diagram: grid disks from multiple Exadata cells form hot and cold ASM disk groups; the disks of each cell constitute an ASM failure group within a disk group.]
• ASM mirroring is used to protect against disk failures
• ASM failure groups are used to protect against cell
failures
Exadata storage
ASM interactions
[Diagram: a grid disk goes from ONLINE to OFFLINE when its cell becomes unreachable, and back to ONLINE when the cell becomes reachable again.]
Exadata Storage Server
Availability – Case 2
[Diagram: altering a grid disk or cell disk to inactive takes it from ONLINE to OFFLINE; altering it back to active brings it ONLINE again.]
Exadata disk availability
Automatic disk online
[Diagram: OFFLINE triggers — cell unreachable, disk pulled out, disk inactivated, user offline; matching ONLINE/SYNC triggers — cell reachable, disk pushed in, disk activated, disk group mounted.]
What is automated?
[Diagram: automated ONLINE/OFFLINE and ADD/DROP disk transitions.]
Planned Downtime – Data Changes – Online Redefinition
• Network considerations
• Configure InfiniBand for a local standby database for approximately 2 GB/sec of bandwidth using IPoIB
• Otherwise, GigE will provide approximately 120 MB/sec
• Standbys can use the VIP interface or a dedicated interface
• Use a dedicated network interface for redo transport
• Refer to Support Note 960510.1 for complete details
Data Guard Redo transport
Configuration best practices
Backup and recovery
Module Agenda
• Backup rates
• 18 TB/Hr full image backups
• 10-46 TB/Hr effective backup rate for incremental backups
• Restore rates
• 24 TB/Hr restore rates
• Recovery rates
• 2.1 TB/Hr recovery rates
• The above rates pertain to physical files; with compression, effective backup/restore rates multiply accordingly
• It all comes down to the bandwidth of your slowest component
Backup, restore and recovery operations
Database Machine
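The "slowest component" rule above can be made concrete with a toy model: the end-to-end backup rate is the minimum across every stage in the path, and compression multiplies the effective rate because fewer physical bytes move. The stage names and rates below are illustrative, not quoted limits:

```python
# Toy model: the achievable backup rate is the minimum of all stages in the
# path; compression multiplies the *effective* (logical) rate.
def effective_rate(stage_rates_tb_per_hr, compression_ratio=1.0):
    return min(stage_rates_tb_per_hr) * compression_ratio

# e.g. disk scan 18 TB/h, backup network 10 TB/h, media servers 12 TB/h
bottleneck = effective_rate([18, 10, 12])                          # network-bound
with_oltp_compression = effective_rate([18, 10, 12],
                                       compression_ratio=3)        # ~3x OLTP comp.
```

Here upgrading anything other than the 10 TB/h network stage buys nothing, which is the point the slide is making.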
• Simple operations with standard RMAN commands
• Automatically parallelized across all storage servers
• Data aware
• Detection of block corruptions
• Auto repair and manual block repair options
• Integrated and transparent
• OLTP and data warehouse databases
• RAC, Data Guard, flashback technologies, ASM, Exadata
• Oracle native compression capabilities
• OLTP (typically 3 X compression)
• Exadata Hybrid Columnar Compression (typically 10-15 X
compression)
Best practices for disk-based backup and recovery
Disk-based backup and recovery
Exadata Storage Server Grid Disk layout
The faster (outer) 40% of the disk is assigned to the DATA Area
The slower (inner) 60% of the disk is assigned to the RECO Area
InfiniBand Network
Disk Based Backup & Recovery
Alternative FRA on Exadata
Best practices for tape-based backup and recovery
MAA Validated Architecture
The faster (outer) 80% of the disk is assigned to the DATA Area
The slower (inner) 20% of the disk is assigned to the RECO Area
• Benefits
• Fault isolation from Exadata Storage Server
• Maximizes Database Machine capacity and bandwidth
• Move backup off-site easily
• Keep multiple copies of backups in a cost effective manner
• Trade-offs
• Disk-based solutions have better recovery times for data and
logical corruptions and certain tablespace point in time
recovery scenarios
• No differential incremental backups are available
Tape-based backup & recovery
Configuration best practices for tape
• Ethernet or InfiniBand based configuration only
• Hardware changes to Database Machine are not supported
• Smaller databases can use Gigabit Ethernet
• Use a dedicated interface for the transport to eliminate impact to
client access network
• Typically a dedicated backup network is in place
• Maximum throughput with the GigE network is 120 MB/sec X
Number of Database Servers
• For a full Database Machine, 960 MB/sec possible
• Use InfiniBand for best performance
• Bigger database needing faster backup rates
• Lower CPU overhead
Tape-based backup & recovery
InfiniBand configuration best practices for tape
• http://www.oracle.com/technology/products/bi/db/exadata/pdf/
maa_tech_wp_sundbm_backup_final.pdf
3rd Party Media Management Vendor
No additional complexity
• http://www.oracle.com/technology/deploy/availability/pdf/maa
_wp_dr_dbm.pdf
Best practices for data loading
Module Agenda
[Diagram: DBFS architecture — clients connect over SQL*Net to a DBFS instance in the Oracle database; the DBFS content repository package exposes create, open, read, write, and list operations backed by SecureFiles and metadata tables.]
DBFS performance expectations
Oracle Database File System (DBFS)
Performance expectations
• Full Rack Oracle Database Machine can load a little more
than 5TB/h (V2)
• Data staged in DBFS
• DBFS tablespaces reside on the same disks as the data warehouse tablespaces
• Data is loaded into a normal redundancy ASM disk group, so writes are doubled
• Total I/O == 15.6 TB/h, or 4.4 GB/s
Oracle Database File System (DBFS)
Performance expectations
• 4.4 GB/second:
• 1.5 GB/s flowing from a file system housed in one Oracle database
• 2.9 GB/second of writes (ASM normal redundancy)
• Could you achieve the same outside of DBFS on Database
Machine?
• 1.5 GB/s supply-side is 13 active line-rate GbE paths, or
• 2 active IB paths with NFS via TCPoIB from a high end NAS
device
• DBFS solves the problem without any additional resources
outside the rack
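The arithmetic above checks out: a load staged in DBFS on the same disks means one read stream plus a doubled (normal-redundancy) write stream, i.e. 3x the load rate in physical I/O. The 5.2 TB/h figure is an assumption consistent with "a little more than 5 TB/h":

```python
# Verify the DBFS load-rate arithmetic from the slides.
load_tb_per_hr = 5.2                                  # assumed load rate
total_io_tb_per_hr = load_tb_per_hr * 3               # 1x read + 2x mirrored write
total_io_gb_per_s = total_io_tb_per_hr * 1024 / 3600  # TB/h -> GB/s
read_gb_per_s = load_tb_per_hr * 1024 / 3600          # the ~1.5 GB/s supply side
```

The remaining ~2.9 GB/s is the doubled write stream, matching the breakdown on the slide.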
Configuration and execution
Configuration
DBFS
• House DBFS in a dedicated database
• Use DBCA with OLTP template to create database
• AMM or ASMM are fine, though ASMM is preferred
• 8GB SGA buffer pool, 1GB shared pool
• Redo logs should be at least 2GB
• Create bigfile tablespace for the file system (8K, 16K
blocksize)
• Create a DBFS user (e.g., dbfs identified by dbfs)
• Grant create session, create table, create procedure and
dbfs_role to DBFS user
• Grant quota unlimited on the DBFS tablespace to DBFS user
DBFS
Implementation
• Create DBFS File System
• cd to $ORACLE_HOME/rdbms/admin
• Start SQL*Plus
SQL>@dbfs_create_filesystem_advanced.sql <TS Name> <FS
Name>\ nocompress nodeduplicate noencrypt non-partition
• Mount the file system
$ nohup $ORACLE_HOME/bin/dbfs_client dbfs@ -o\
allow_root,direct_io /data <passwd.txt &
DBFS
Implementation
• Move flat files to DBFS using FTP, SCP
• Define external tables with CREATE command
• You can move compressed files to save network bandwidth
• Use preprocessor directive to decompress external tables
Data loading best practices
External Tables
• Full usage of SQL capabilities directly on the data
• Automatic use of parallel capabilities (just like a table)
• No need to stage the data again
• Better allocation of space when storing data
• High watermark brokering
• Additional capabilities
• Optional sorting at load time (think improved compression)
Data loading best practices
Direct Path loads
• Data is written directly to the database storage using
multiple blocks per I/O request using asynchronous
writes
• Data bypasses buffer caches
• A CTAS command always uses direct path
• An INSERT AS SELECT needs an APPEND hint to
go direct
[Diagram: partition exchange loading — the Sales table is partitioned by day (May 19th 2008 through May 24th 2008):]
2. Use a CTAS command to create the non-partitioned table TMP_SALES; TMP_SALES now has all the data
3. Create indexes
4. Alter table Sales exchange partition May_24_2008 with table TMP_SALES
5. Gather statistics
[Diagram: each RAC node runs a DBFS client against its own DBFS instance; all nodes mount the same content repository file system FS1 at /data/FS1.]
[Diagram: a provider system runs the dbfs_client executable (OCI), connecting over SQL*Net to DBFS content repository FS1 in the Oracle database.]
[Chart: %CPU and MB/s compared for scp versus dbfs_client "injection" when moving files into DBFS.]
Consolidation of mixed workloads
Module Agenda
Consolidation challenges
Typical consolidation challenges
• Packaged applications
• Schema name collisions
• Different SLAs
• 24/7 versus 8/5
• Daytime (<2 secs) vs Night (batch only)
• Workload types: OLTP, DW, hybrid
• Sizing for availability
• Predictable response times
• Application tier scalability
Key question
Consolidation of mixed workloads
Consolidation configuration options
High-level consolidation options
Database
• Single RAC database
• Place all schemas in one database
• Pro:
• Better resource control
• Less overhead
• Focus on one database's performance and management
• Con:
• One set of instance-level params
• Outage affects all tenants
• Migration to single database can be challenging, since they
were separate for a reason
High-level consolidation options
Database
• Multiple RAC databases
• Move databases with minimal changes
• Pro:
• Flexibility for different params and versions/patches
• Simple platform migration
• Security and isolation more easily achieved
• Con:
• Resource control more difficult
• More moving parts to manage
• Most common choice
• RAC One Node
• Single-instance databases, one cluster
High-level consolidation options
Storage
• Single diskgroup, all cells
• Stripe all data across all cells (DBFS, DATA, RECO only)
• Pro:
• Maximum throughput/bandwidth
• Centralized resource control
• Less overhead
• Simpler management
• Con:
• Loss of two cells (normal redundancy) may cause an outage
• Recommended option
High-level consolidation options
Storage
• Segregate groups of cells
• Isolated environments, more simultaneous failures tolerated
• Pro:
• Can sustain more simultaneous failures (potentially)
• Little chance of one database impacting other
• Con:
• Reduced throughput/bandwidth
• Management overhead
• Fewer cell CPUs available for decompression
• Sizing such environments is difficult, especially performance
High-level consolidation options
Storage
If you are going to run mixed workloads successfully on one Exadata system, one of the following has to be true:
1. All priority workloads are OLTP.
2. All priority workloads are data warehouse.
3. The Exadata Smart Flash Cache is mainly used by OLTP workloads, via:
1. The KEEP attribute on objects
2. Flash disks
High-level consolidation options
Exadata Smart Flash Cache considerations
Consolidation tools
Oracle features
Tools for successful consolidation
Consolidation sizing
Sizing for consolidation
Considerations
• Cumulative resource requirements
• Utilize AWR to determine current requirements
• Sizing presentation describes sizing in general
• Database Machines will probably have
• Faster CPUs
• Faster storage – more IOPS
• Greater network bandwidth – more MBs/second
• Reductions in amount of data moved to CPU
Sizing for the Database Machine
Module Agenda
• Sizing options
• Comparative sizing method
Sizing challenges
Sizing challenges
What's the big deal?
• One of three answers –
• Quarter
• Half
• Full
• A simple answer does not imply a simple process for
arriving at the answer
Sizing challenges
Issues
• Capacity sizing is simple
• A single number for each resource category
• Workload sizing is complex
• Reflects cumulative amounts of resource consumption across a
broad range of heterogeneous database operations
• Real world workload sizing is even more complex
• Interaction of workload demands, both average and peak, over
time against resources
• Additional resource demands stem from the interactions of different workloads
Sizing challenges
Impact of sizing decision
• System sizing drives solution price
• Under-sizing will reduce price & under-cut competitors
• Under-sizing will reduce pricing/discount pressure
• Under-sizing will result in business impact
• Impact can be MANY TIMES greater than system price
• Example: $60,000 under-investment ==> $1M+ business loss
• Can result in serious customer satisfaction issues or even
lawsuits
Sizing challenges
Impact of sizing decision
• Process is key to customer satisfaction
• Need a defined & documented process
• Process must produce the same results given same inputs
• Need to retain historical sizing documents
• Accuracy & transparency
• Must be reasonably accurate (often a RANGE of sizes)
• Must be able to explain the process (transparency)
Sizing options
Sizing processes
Analytic processes
• Comparative sizing
• System refresh
• System replacement
• Competitive
• Predictive sizing
• New application deployments
• Depend on accurate metrics of workload and real-world
comparisons
• Use predictive sizing to check comparative approach
• Hybrid approach
• Very scalable method for sizing customer systems
• Produces relatively accurate result
Sizing processes
Benchmark-based sizing
Comparative sizing
method
Comparative Sizing
Steps
• Server H/W vendor & model (HP, Dell, Sun, IBM, etc.)
Sun – SPECint Comparison
System | Processor | CINT2006_rates | Equivalent Database Machine Cores
E25K | UltraSPARC IV+ (1.95 GHz) | 1230 / 144 cores = 8.5/core | 0.32
Note: These are the best-case numbers on a per-core basis. Database Machine CPU is 26.6/core.
IBM Power – SPECint Comparison
System | Processor | CINT2006_rates | Equivalent Database Machine Cores
pSeries | Power 7 Eight-Core (3.86 GHz) | 652 / 16 cores = 40.8/core | 1.53
Note: These are the best-case numbers on a per-core basis. Database Machine CPU is 26.6/core.
Note: IBM does have faster per-core numbers for the Power7, but those come from a quad-core version where they effectively plug in an 8-core chip, turn off 4 cores, and run the remaining 4 cores at a faster speed. This is a benchmark special and not cost-effective for the customer.
HP Itanium – SPECint Comparison
System | Processor | CINT2006_rates | Equivalent Database Machine Cores
Integrity | Itanium Quad-core 9350 (1.73 GHz) | 134 / 8 cores = 16.75/core | 0.63
Integrity | Itanium Dual-core 9050 (1.6 GHz) | 53.9 / 4 cores = 13.5/core | 0.50
Superdome | Intel Itanium 2 (1.66 GHz) | 1650 / 128 cores = 12.9/core | 0.48
Note: These are the best-case numbers on a per-core basis. Database Machine CPU is 26.6/core.
AMD Opteron – SPECint Comparison
System | Processor | CINT2006_rates | Equivalent Database Machine Cores
HP DL185 G5 | Opteron Dual-Core 2222 (3.0 GHz) | 61 / 4 cores = 15.25/core | 0.57
HP DL385 G5p | Opteron Quad-Core 2389 (2.9 GHz) | 143 / 8 cores = 17.9/core | 0.67
HP DL585 G5 | Opteron Six-Core 8439 (2.8 GHz) | 416 / 24 cores = 17.3/core | 0.65
HP DL385 G7 | Opteron 12-core 6176 (2.3 GHz) | 398 / 24 cores = 16.6/core | 0.63
Note: These are the best-case numbers on a per-core basis. Database Machine CPU is 26.6/core.
Intel – SPECint Comparison
System | Processor | CINT2006_rates | Equivalent Database Machine Cores
HP DL380 G5 | Xeon Dual-Core X5270 (3.5 GHz) | 90.7 / 4 cores = 22.7/core | 0.85
HP DL380 G5 | Xeon Quad-Core X5365 (3.0 GHz) | 116 / 8 cores = 14.5/core | 0.55
HP DL360 G5 | Xeon Quad-Core X5470 (3.33 GHz) | 150 / 8 cores = 18.75/core | 0.70
HP DL360 G6 | Xeon Quad-Core X5570 (2.93 GHz) | 251 / 8 cores = 31.4/core | 1.18
Note: These are the best-case numbers on a per-core basis. Database Machine CPU is 26.6/core.
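The "Equivalent Database Machine Cores" column in the tables above is simply the per-core CINT2006_rate divided by the Database Machine's 26.6/core figure, which a short script can verify against a few of the rows:

```python
# Recompute the "Equivalent Database Machine Cores" column of the SPECint
# comparison tables: per-core CINT2006_rate / 26.6.
DBM_PER_CORE = 26.6

def dbm_equivalent_cores(cint_rate, cores):
    return round(cint_rate / cores / DBM_PER_CORE, 2)

x5570 = dbm_equivalent_cores(251, 8)      # Xeon X5570 row
power7 = dbm_equivalent_cores(652, 16)    # Power 7 eight-core row
itanium = dbm_equivalent_cores(134, 8)    # Itanium 9350 row
```

Values above 1.0 (e.g. X5570, Power7) mean one core of the comparison system is worth more than one Database Machine core in this metric; values below 1.0 mean the opposite.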
Comparative sizing
Validate storage requirements
• Gather inputs for DB node sizing: server (CPU), disk size, utilization, growth, X-factor, compression, report feedback
• Disk capacity is fixed by rack size; compare need vs capacity (SAS is assumed)
• Compression: assume not more than 2X with Advanced Compression (not HCC)
• SAS vs SATA: assume SAS unless there is demand for SATA
• Expansion cabinet: must justify why extra cells would be okay
• Storage sizing by utilization would be ideal, but cannot be collected in all cases
• Key peak stats: % CPU busy
Migrating to the Database Machine
Module Agenda
• Migration strategies
• Migration methods overview
• Physical migration
• Logical migration
• Migration methods in practice
• Bulk data movement
Initial considerations
Database Machine software
Considerations
Migration strategies
Migration strategy
Migration method considerations
• Determine what to migrate
• Because of Exadata's unique features (e.g. Smart Scan), expect differences between the source and Exadata warehouse databases
• Fewer indexes, fewer materialized views, potentially different
partitioning strategy, compression
• Avoid methods that migrate what you will discard
• Consider configuration of source system
• Not all migration methods available for all source environments
• Non-Oracle: Not covered in this presentation, although many
methods work if you take into consideration platform differences
• Oracle: Source database version and platform matters
• Target system fixed: 11.2, ASM, Linux x86-64
Migration strategy
Migration method considerations
• Minimize downtime
• Yes, but implementing best practices is more important (your
future performance depends on it)
Migration methods overview
Migration methods
Overview
http://www.oracle.com/technology/products/bi/db/exadata/pdf/migration-to-exadata-whitepaper.pdf
Migration methods
Migration method choice
Physical migration
Physical migration
Basics
• Physical standby
• Transportable database (TDB)
• Transportable tablespaces (TTS)
Logical migration
Migration methods
Logical migration
• Logical standby
• GoldenGate / Streams
• Data Pump
• Create Table As Select (CTAS) or Insert As Select
(IAS)
Logical migration methods
Logical standby
• Overview
• Steps depend on starting point - See following slides
1. Source database 11.2
2. Source database < 11.2 (including HP DBM)
• Source system criteria
• Linux (check Note 413484.1 for cross-platform support)
• Outage time
• Typically Data Guard switchover + application failover
• Consider
• Archivelog mode, LOGGING, and supplemental logging required
• Data type support
• Can the apply process catch up?
Logical migration methods
Logical standby – source system 11.2
• Overview
• Create logical standby on 11.2 DBM
• Change table storage characteristics, as desired (Note:737460.1)
• Data Guard switchover
• When to use this method
• Table storage characteristics will be changed
• If not, use physical standby method
Logical migration
Logical standby – source system < 11.2
• Overview
• Create Exadata database
• Import user data into Exadata using Data Pump
• Network mode - Direct import from source via dblink
• Can result in large UNDO on target
• File mode - Export to dump file(s), transfer file(s), Import
• Source system criteria
• 10.1 or later on any platform
• Outage time
• Network mode - 1x data movement
• File mode - 3x data movement and 2x staging space
Logical migration methods
CTAS / IAS
• Overview
• Create Exadata database
• CTAS or IAS
• From external tables in DBFS staging area
• From dblink to source database
• Source system criteria
• Any version or platform
• Outage time
• Significant (3x) variation depending on partitioning (and what
scheme), compression, target data type
• Consider
• Use DBFS for staging external tables, not local filesystem
• Dblink - Manually parallelize
Logical migration
Method selection
Migration methods in
practice
Migration methods
In practice
• Restriction
• RDBMS 11.1 cannot use Exadata 11.2
• RDBMS 11.2 cannot use Exadata 11.1
• Option #1 - Physical Standby + Database Upgrade
• Option #2 – Logical Standby source system < 11.2
• Reduce downtime – rolling database upgrade
Migration Scenario
From 10gR2 / 11gR1 on Big Endian
• Performance criteria
• Network
• Protocol
• Source system
• Target system (i.e. DBM)
Use IB network
Bulk data movement
Protocol
• TCP over IB (TCPoIB)
• On source system
• Use IPoIB connected mode (CM)
• Set large MTU (65520)
• DBM DB servers already configured
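On a Linux source system, the two settings above are typically applied as shown below. This is an illustrative config fragment only: the interface name ib0 is an assumption, and the DBM database servers ship preconfigured.

```shell
# Switch the IPoIB interface to connected mode (CM) ...
echo connected > /sys/class/net/ib0/mode

# ... then raise the MTU to the large value recommended above
ip link set dev ib0 mtu 65520

# Verify both settings took effect
cat /sys/class/net/ib0/mode
ip link show ib0
```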
• Source system
• I/O subsystem must deliver
• Fast IB network can't compensate for slow I/O
• CPU usage varies
• Data transfer with very fast networks can cause high CPU usage
• One CPU may be pegged while others have headroom (e.g. interrupt handling)
• Use mpstat(1) to investigate
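The mpstat(1) pattern described above — one pegged CPU while the others have headroom — can be screened for programmatically. A minimal sketch operating on per-CPU idle percentages (however you collected them); the thresholds are illustrative:

```python
# Sketch: flag CPUs that are pegged while others still have headroom
# (the classic single-CPU interrupt-handling pattern). Input is a map of
# CPU id -> idle%. Thresholds are illustrative assumptions.

def pegged_cpus(idle_pct, pegged_below=5.0, headroom_above=50.0):
    """Return CPUs that are ~100% busy while at least one CPU is mostly idle."""
    if not any(v > headroom_above for v in idle_pct.values()):
        return []  # everything is busy: not the skewed pattern
    return sorted(c for c, v in idle_pct.items() if v < pegged_below)

if __name__ == "__main__":
    # e.g. CPU 0 handling all network interrupts, CPUs 1-3 mostly idle
    sample = {0: 1.2, 1: 88.0, 2: 91.5, 3: 90.0}
    print(pegged_cpus(sample))   # [0]
```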
Bulk data movement
Target system
Why you don't need a Database Machine
Module Agenda
• The goal
• Build your own database machine
• Do you need the Database Machine?
Magic?
Looking back
The goal
The goal
Winter Corporation Exadata Proof of Concept
• Workload
• Execute 4 complex, concurrent queries
• Vast amounts of data; query I/O rate peaked at 14 GB/s and queries complete in 99 seconds
• Sun Oracle Database Machine completes these queries in 48 seconds, without using any V2 software features
• 48 seconds == 20.8 GB/s
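The arithmetic behind "48 seconds == 20.8 GB/s", for reference. The scan volume is not stated on the slide, so the ~1 TB figure used here is inferred from 20.8 GB/s x 48 s (an assumption):

```python
# Sketch of the throughput arithmetic above. The implied scan volume
# (~998 GB, roughly 1 TB) is an inference from 20.8 GB/s x 48 s.

SCAN_GB = 20.8 * 48   # implied scan volume in GB

def throughput_gbps(seconds, scan_gb=SCAN_GB):
    """Average scan throughput in GB/s for a given elapsed time."""
    return round(scan_gb / seconds, 1)

if __name__ == "__main__":
    print(throughput_gbps(48))   # 20.8 GB/s for the 48-second run
    print(throughput_gbps(99))   # average rate implied for the 99-second run
```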
The goal
Don't forget!
• Remember:
• The Sun Oracle Database Machine is a balanced configuration, so you must guarantee I/O throughput capability in every section of the machine you build
• The Database Machine also provides balance across components and high availability
Is it worth it?
Typical DW technical architecture
Hardware needed to achieve 6 GB/s
[Diagram: Ethernet network and switch-vendor interconnect components]
Typical DW technical architecture
Hardware needed to match X2-8 (2x cores)
Database Machine
Hardware needed to achieve 18 GB/s
Is it worth it?
Database versus purpose-built
"And this is a big deal: it takes months and months, and lots of negotiating with lots of vendors, and at the end of the day they have this completely unique system that they built, and it's really good, but they're the only ones in the world who have this unique system. Which means that if there's any problem, they're going to be the first ones to find it, right?"
-- Andrew Mendelsohn
Is it worth it?
Well, is it?