Information Lifecycle Management (ILM) for Business Data

October 2006

ILM for Business Data
• This talk focuses on ILM for Business Data

Financial Data, Customer Data, Product Data, People Data

What Drives ILM?
• Reduce cost to retain data
  • Vast amounts of data are retained by enterprises for business and regulatory reasons
  • Need to optimize the cost of retaining data in the database to avoid skyrocketing costs

Active Data

Less Active Data

Historical Data

Database Implementation without ILM
[Diagram: data lifecycle stages (Active, Less Active, Historical, Archive) shown against two storage options: High Performance Storage Tier and Tape Archive]

Match Lifecycle to Storage to Optimise Cost
[Diagram: data lifecycle stages (Active, Less Active, Historical, Offline Archive) matched to the High Performance Storage Tier, Low Cost Storage Tier, Online Archive Storage Tier, and Offline Archive]

Key Idea – Compliance
• Must guarantee compliance with government and business mandates
• Intelligent centralized data platform allows you to define and enforce compliance policies

Businesses Impacted

Sarbanes Oxley
  • All U.S. public companies and private foreign issuers
Documentation Regulations
  • Pharmaceuticals
  • Aerospace/Defense
  • Petrochemical
  • Telcos
UK/PRO (Standard)
  • All companies doing business in UK public sector
SEC Rule 17a-3/4
  • All U.S. companies engaged in broker-dealer activities
European Data Privacy Directive
  • All companies doing business in Europe handling PII
NASD 3010, 3110, NYSE 342
  • All member companies
Gramm Leach Bliley
  • Banks and financial services companies doing business in U.S.
AUS/PRO (Standard)
  • All companies doing business in AUS public sector
HIPAA
  • Healthcare
  • Insurance
  • All U.S. businesses handling medical records
DoD 5015.2 (Standard)
  • U.S. DoD
  • U.S. Federal Agencies
  • Business dealing with U.S. Federal Agencies
CA Breach Law
  • All U.S. companies doing business in CA handling PII
[Timeline axis: 1990, 2000, 2004]

ILM and Oracle
• From the database perspective, ILM is a set of policies and techniques for Managing Data
• Managing data is Oracle’s core competency

• Oracle platform can be used to implement ILM policies and techniques for business data
• Some new technology but mostly an application of existing data management capabilities

Oracle is Ideal for Business ILM
• Understands Business Data
• Hardware Independent
• Central Point of Control
• Customizable
[Diagram: stack of Interfaces, Applications, Database, Storage]

Oracle is Ideal for Business ILM
• Oracle Database is the best place to implement business ILM
• Application Transparent ILM
  • Oracle classifies business data transparently to the applications
• Fine Grained ILM
  • Oracle manages the lifecycle of groups of business data down to the level of individual rows
• Low Cost ILM
  • Oracle can use low cost storage to reduce the cost of retaining data
• Enforceable Compliance Policies
  • Oracle has sophisticated techniques to define and enforce data policies

Active

Less Active

Historical

Implementing Business ILM using the Oracle Database Platform

4 Steps to Business ILM
1. Define Data Classes
2. Create Storage Tiers for the Data Classes
3. Create Data Access and Migration Policies
4. Define and Enforce Compliance Policies

[Diagram: Active, Less Active, and Historical data classes; High Performance, Low Cost, and Online Archive storage tiers; Offline Archive]

Step 1 to Business ILM
1. Define Data Classes

Define Classes of Data

[Chart: Activity and Data Volume versus age of data in Months and Years; labels: Active, Less Active, High Volume, Low Volume]

First understand data as part of a business process
  • How is it used?
  • How long must it be kept?
  • How does access vary over time?

• Choose classification based on this understanding
• Classification by Age is most common
• Others are possible
  • Privacy
  • Product ID
• Consider a Hybrid Classification
  • Classify by business attribute AND age
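
A hybrid classification (business attribute AND age) can be sketched in a few lines. The thresholds and the warranty attribute below are hypothetical values chosen for illustration; real boundaries would come from the business process analysis described above.

```python
from datetime import date

# Hypothetical age thresholds in months; not taken from the presentation.
ACTIVE_MAX_MONTHS = 3
LESS_ACTIVE_MAX_MONTHS = 12


def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end (never negative)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the last month is not yet complete
    return max(months, 0)


def classify(order_date: date, today: date, under_warranty: bool = False) -> str:
    """Hybrid rule: a business attribute can override the age-based class."""
    if under_warranty:  # hypothetical business attribute
        return "Active"
    age = months_between(order_date, today)
    if age <= ACTIVE_MAX_MONTHS:
        return "Active"
    if age <= LESS_ACTIVE_MAX_MONTHS:
        return "Less Active"
    return "Historical"
```

For example, `classify(date(2004, 5, 1), date(2006, 10, 15))` falls into the Historical class on age alone, while the same order with `under_warranty=True` stays Active.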

Separate Data by Class
• The goal of ILM is to apply different policies to different classes of data
• To treat data classes differently, you must physically separate data by class
  • Data classes must be mapped to data attributes, e.g. Order date
• Table Partitions enable you to separate data by data attribute
  • Can manage each class (partition) as a unit
  • Store, move, archive, search, query
• Partitions are transparent to the application
• Use ILM Assistant to define a Lifecycle
[Diagram: All Orders table partitioned into Q1 Orders, Q2 Orders, Q3 Orders, Q4 Orders, and Previous Orders]

Step 2 to Business ILM

2. Create Storage Tiers for the Data Classes

[Diagram: High Performance Storage Tier, Low Cost Storage Tier, Online Archive Storage Tier, Offline Archive]

Create Physical Storage Tiers
• Create separate storage areas for high performance and low cost storage
• High Performance Storage Tier uses
  • High performance storage arrays
  • Disks optimized for throughput
• Low Cost Storage Tier uses
  • Modular arrays for reduced cost
  • Large capacity commodity ATA disks

Online Archive Storage Tier
• The Online Archive Storage Tier is
  • Very large
  • Low Activity
  • Read-only or Read-mostly
• Store in low cost storage tier
  • Information is still online and always readable
  • No delay when data needed, always available
  • Storage cost is almost same as tape
• Leverage usage patterns to further reduce size & cost
  • Defragment and Compress
  • Rows, tables, files
  • Declare read-only

Assign Classes to Storage Tiers
• Assign data partitions to appropriate storage tiers
[Diagram: partitions of the All Orders table (Q1 to Q4 Orders, Previous Orders) assigned to tiers: Active Data on the High Performance Storage Tier, Less Active Data on the Low Cost Storage Tier, Historical Data (Previous Orders) on the Online Archive Storage Tier]

Storage – Sample Device Costs
Storage Tier Vendor EMC DMX 1000 73GB, <6TB HP XP128 73GB, < 6TB IBM DS8300 73GB, <6TB EMC CX500 146GB, <4TB HP EVA 3000 146GB, <4TB IBM DS4300 146GB, <4TB min $/GB $26.90 max$/GB $34.10 $23.90 $28.80 $10.40 $8.90 $7.50

$29.00

$18.90 $22.80 $8.20

$7.00

$7.00 $5.90

• Prices in high end range typically represent a system configured for performance • Prices in the low end range typically represent a system configured for capacity

Tiered Storage – Sample Costs
Storage Tier                  Single Tier   Multiple Tiers   Multiple Tiers w/Compression
High Performance (2550 GB)    $74,300.00
High Performance (50 GB)                    $1,450.00        $1,450.00
Low Cost (500 GB)                           $3,500.00        $3,500.00
Online Archive (2000 GB)                    $14,000.00       $5,600.00
Total                         $74,300.00    $18,950.00       $10,550.00

• Usage of appropriate storage tiers substantially reduces total cost of ownership (from $74,300 to $18,950 in this example)
• Compression reduces TCO even more (to $10,550 in this example)
• Full application transparency
• Use ILM Assistant to view and project cost savings
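
The totals and savings in the sample can be checked with a small calculation. This is only a sketch: the dollar figures are copied from the sample cost slide, while the dictionary and function names are made up for illustration.

```python
# Sample costs in US dollars, copied from the sample cost slide.
SINGLE_TIER = {"High Performance (2550 GB)": 74_300.00}
MULTI_TIER = {
    "High Performance (50 GB)": 1_450.00,
    "Low Cost (500 GB)": 3_500.00,
    "Online Archive (2000 GB)": 14_000.00,
}
MULTI_TIER_COMPRESSED = {
    "High Performance (50 GB)": 1_450.00,
    "Low Cost (500 GB)": 3_500.00,
    "Online Archive (2000 GB)": 5_600.00,
}


def total(costs: dict) -> float:
    """Sum the cost of every tier in one configuration."""
    return sum(costs.values())


def savings_factor(baseline: dict, alternative: dict) -> float:
    """How many times cheaper the alternative is than the baseline."""
    return total(baseline) / total(alternative)
```

With these figures, `savings_factor(SINGLE_TIER, MULTI_TIER)` comes to roughly 3.9 and `savings_factor(SINGLE_TIER, MULTI_TIER_COMPRESSED)` to roughly 7.0.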

Step 3 to Business ILM
3. Create Data Access and Migration Policies

[Diagram: Active, Less Active, and Historical data classes; High Performance, Low Cost, and Online Archive storage tiers; Offline Archive]

Define Access Policies
• Access Policies determine data visibility
  • Only authorized data
  • Only recent data
• Special users or operations access historical data
  • Hiding historical data speeds up data scans and maintenance
• Implement Access Policies using Views and Virtual Private Database
• Transparent to the application
[Diagram: Authorized Users and Special Users reach the High Performance, Low Cost, and Online Archive storage tiers through an Access Policy]
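
The view-based half of this idea can be illustrated with any SQL engine. The sketch below uses Python's built-in sqlite3 module rather than Oracle, so it shows only the general technique (a view that exposes recent rows); the table, the view name, and the cutoff date are hypothetical, and Virtual Private Database has no equivalent here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, '2003-02-11', 120.0),
        (2, '2006-07-30', 75.5),
        (3, '2006-09-14', 310.0);
    -- Hypothetical policy: ordinary users see only orders from 2006 onward.
    CREATE VIEW recent_orders AS
        SELECT order_id, order_date, amount
        FROM orders
        WHERE order_date >= '2006-01-01';
""")

ALLOWED_RELATIONS = {"orders", "recent_orders"}


def visible_order_ids(relation: str) -> list:
    """Return the order ids visible through the given table or view."""
    if relation not in ALLOWED_RELATIONS:  # never interpolate untrusted names
        raise ValueError(f"unknown relation: {relation}")
    rows = conn.execute(f"SELECT order_id FROM {relation} ORDER BY order_id")
    return [row[0] for row in rows]
```

Ordinary users would query `recent_orders`, while special users or operations read the base table and so also see the historical row.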

Migrate Data between Classes
• Periodically move data between storage tiers as access patterns change
  • e.g. MOVE PARTITION holding Q2 Orders from high performance storage tier to low cost storage tier
• Move important data on demand
• UPDATE of partition key will cause row to move to a new partition
  • e.g. product warranty expires
• ILM Assistant will advise when it is time to move data
[Diagram: Q2 Orders partition moving from the High Performance Storage Tier to the Low Cost Storage Tier]

Step 4 to Business ILM
4. Define and Enforce Compliance Policies

[Diagram: High Performance Storage Tier, Low Cost Storage Tier, Online Archive Storage Tier, Offline Archive]

Centralized Compliance Enforcement
• It is impractical to manage policy and ensure compliance on decentralized and fragmented data
• Cannot trust everyone to know the rules, and abide by them

• Oracle Database Platform provides an intelligent central location to store data and enforce data policies for compliance
  • Policies are customizable for business-specific needs
  • Centralized policy definition and enforcement
  • Centralized access control using Enterprise Users
  • Flexibility to respond to changing regulations

Elements of Compliance Policy
• Retention
• Ensure data is retained unmodified for a specific time period

• Immutability
• Prove to external party that data is complete and unmodified

• Privacy
• Protect personal and other sensitive data

• Auditing
• Track and report changes to important data

• Expiration
• Expunge stale data to limit liability

Data Retention
• Business and regulatory needs dictate specific retention periods
  • Sarbanes-Oxley, HIPAA, European Data Privacy Directive, UK PRO, DOD5015.2-STD
• Oracle security can enforce that users can create specific kinds of data, but not delete or modify the data
  • Privileges to create, modify, and delete can be independently granted to specific users for specific tables
  • More granular control than file systems
• Data can be periodically purged using privileged procedures
• Changes in privileges can be audited
• Centrally administered
• Use ILM Assistant to describe a Lifecycle

Immutability
• Security enforcement prevents unauthorized changes, but sometimes you must prove that enforcement has not been bypassed, even by order of the CEO
  • Must prove no data tampering at database level or below, whether intentional or accidental
• Oracle can generate a digital signature to prove that data has not been modified
  • Signature can be generated on the result of any query
  • e.g. all the orders booked last quarter
  • Signature will change if even a single bit of the data is changed
  • Signature can be time stamped and stored, or given to a third party
  • Prevents covering up illegal or unethical activity
  • ILM Assistant can generate & compare
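
The principle behind signing a query result can be sketched with a plain hash. This is not how Oracle or the ILM Assistant computes its signature; it is a minimal illustration using SHA-256 from Python's standard library over rows from an in-memory SQLite table, with a hypothetical orders table. A digest alone detects change; a true digital signature would additionally involve a private key.

```python
import hashlib
import sqlite3


def result_digest(conn, query, params=()):
    """SHA-256 digest over the rows returned by a query, in query order."""
    digest = hashlib.sha256()
    for row in conn.execute(query, params):
        # repr() gives a deterministic text form of each row tuple
        digest.update(repr(row).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, quarter TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "2006Q3", 120.0), (2, "2006Q3", 75.5), (3, "2006Q2", 310.0)],
)

# ORDER BY keeps the row order, and therefore the digest, deterministic.
LAST_QUARTER = (
    "SELECT order_id, quarter, amount FROM orders WHERE quarter = ? ORDER BY order_id"
)

before = result_digest(conn, LAST_QUARTER, ("2006Q3",))
conn.execute("UPDATE orders SET amount = 75.6 WHERE order_id = 2")  # tampering
after = result_digest(conn, LAST_QUARTER, ("2006Q3",))
```

Storing `before` with a timestamp, or handing it to a third party, lets a later recomputation reveal that the order data was altered, since `after` no longer matches.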

Privacy
• Oracle provides many methods to ensure data privacy
• Virtual Private Database (VPD)
  • With "Column Relevance", VPD can be configured such that the policy is enforced only when a critical column is selected (e.g. salary or credit limit)
  • Centrally managed control of data access by user

Auditing
• Oracle can audit any change or access to data in the database
• Common problem is that auditing every access produces a flood of audit records
• Fine Grained Auditing (FGA) specifies the conditions necessary for an audit record to be generated
  • Create relevant and meaningful audit trails
• FGA policies are bound to a table or view
  • Audit when a specific column has been selected or updated
  • e.g. any access to credit card, salary, etc.
  • Audit changes to “closed” transactions
  • Detect changes to financial books that have been closed
  • Audit actions by administrators or highly privileged users

Expiration
• Data eventually expires
• No longer needed for legal or business reasons • Becomes a potential liability

Permanently Remove Expired Data
• Periodically the partition containing expired data can be dropped
  • DROP PARTITION
• Files containing partition can then be expunged
• Backups of partition can be independently expunged
DIGITAL DATA STORAGE
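
Deciding which partitions are eligible for DROP PARTITION is a date comparison against the retention period. The sketch below assumes a hypothetical seven-year retention period and made-up partition names; it only selects candidates and does not touch a database.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Return the same calendar day a number of years later."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:  # 29 February landing in a non-leap year
        return d.replace(year=d.year + years, day=28)


def expired_partitions(partitions: dict, retention_years: int, today: date) -> list:
    """partitions maps a partition name to the last date it covers.

    A partition is expired once its newest row has passed the retention period.
    """
    return sorted(
        name
        for name, last_day in partitions.items()
        if add_years(last_day, retention_years) < today
    )
```

With yearly partitions for 1998, 1999, and 2005, a seven-year retention period, and a run date of 15 October 2006, only the 1998 partition is returned, because the 1999 partition must be kept until the end of 2006.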

Additional Features for Implementing ILM

Automatic Storage Management
• ASM enables the use of inexpensive modular storage for low cost storage tier
• Resiliency
  • ASM mirrors data across inexpensive modular storage arrays
• Performance
  • ASM stripes data to ensure no hot spots
• Availability
  • ASM allows online addition and removal of disks
[Diagram: Disk Group]

ASM Disk Groups per Storage Tier & Partitions
• Each Tier uses ASM for load balancing within the tier
• Partitions are in different disk groups
• Data is moved between disk groups using
  • Partition Move Operation, or
  • Online Reorganization of tables, or
  • Tablespace Copy followed by “rename”
[Diagram: Disk Group P (High Performance Storage Tier) holds Current Month and Last 11 months; Disk Group L (Low Cost Storage Tier) holds Years 2002, 2001, 2000; Disk Group H (Online Archive Tier) holds Years 1995-1999]

Prevent Data Changes
• Sometimes data reaches a point when it must never be changed, e.g. the electronic record of a bank cheque
• Data can be placed in a Read-Only Tablespace, which means:
  • Backup
    • Only needs to be backed up once
  • Read-only Media
    • Can place data on a CDROM or WORM
  • Deferred Read-Only mode does not open files unless they are actually accessed

Read Only

Table Level ILM
• ILM can also be done at the table level
• Measure the I/O rate and I/O rate per gigabyte of database tables
  • Use v$segment_statistics
• Tables with low I/O density can go on slower and lower cost storage
  • Use “alter table move” or online redefinition to move the table to a different storage tier
[Diagram: High I/O Tables on the High Performance Storage Tier; Low I/O Tables on the Low Cost Storage Tier]
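
I/O density is simply the I/O rate divided by the table size. The sketch below assumes the rates and sizes have already been collected (for instance from v$segment_statistics); the table names, numbers, and threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    name: str
    io_per_second: float  # measured I/O rate
    size_gb: float        # segment size; assumed to be greater than zero

    @property
    def io_density(self) -> float:
        """I/O rate per gigabyte of stored data."""
        return self.io_per_second / self.size_gb


def assign_tiers(tables, density_threshold: float) -> dict:
    """Map each table to a tier based on its I/O density."""
    return {
        t.name: "High Performance" if t.io_density >= density_threshold else "Low Cost"
        for t in tables
    }
```

A small, busy table (900 I/Os per second over 30 GB, density 30) lands on the high performance tier, while a large, quiet one (40 I/Os per second over 400 GB, density 0.1) is a candidate for low cost storage when the threshold is set at, say, 5.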

Moving Non-Partitioned Data
• Move files to a storage device using datafile copy
  • ALTER TABLESPACE RENAME DATAFILE
• Move tables or partitions using Online Redefinition
  • Package DBMS_REDEFINITION

Moving Data to a New External Home
• Data is often moved from one database to another, e.g. move orders over 12 months old from OLTP system to Data Warehouse
• Data can be relocated using
  • Oracle Streams
    • Changes made to the OLTP database are replicated in the data warehouse
  • Transportable Tablespaces
    • Move the entire partition to another database, e.g. all of this week's orders
  • Data Pump
    • Fast unload and load of data in a file
  • Oracle Warehouse Builder
    • Build custom application to move data

Reducing Storage Requirements
With so much data being retained, can it be stored and managed more efficiently by reducing its storage requirements?
• Remove Unused Space
  • Online Segment Shrink
    • Removes unused space within a segment
  • Move Partition or Table to a new tablespace to eliminate all intra- or inter-segment unused space
• Table Compression
  • Use when data is no longer changing to compress row data
• Summarize (Materialized View)
  • In the long term, a summarized value may be sufficient rather than each individual record

Securing & Protecting Data

• Oracle Database 10g protects against:
  • Unauthorised access
  • Hardware failures
  • Human errors and software failure
  • Data Corruption
  • Complete Site Failure

Protect from Human Error using Flashback
• A new strategy for point in time recovery
• Flashback Log captures old versions of changed blocks
  • Think of it as a continuous backup
  • Replay log to restore DB to time
  • Restores just changed blocks
• It’s fast - recover in minutes, not hours
• It’s easy - single command restore
  • Flashback Database to ‘2:05 PM’
• “Rewind” button for the Database
[Diagram: on each disk write of a new block version, the old block version goes from the Data Files to the Flashback Log]

Corruption Protection using Flash Recovery Area
• On-disk Backups using RMAN are an important part of ILM
• Fully automatic disk based backup and recovery
• Nightly incremental backup rolls forward recovery area backup
  • Changed blocks are tracked in production DB
  • Full scan is never needed
  • Dramatically faster (20x)
  • Blocks validated to prevent corruption of backup copy
• Use low cost ATA disk array for recovery area
[Diagram: Database Area (data on all storage tiers); nightly incremental applied and validated in the Flash Recovery Area (Low-Cost Storage Tier); weekly tape archive to the Offline Archive Tier]

Site Failure Protection – Data Guard
• Data Guard ensures Business Continuity through standby databases
  • Physical (identical copy)
  • Logical
• Transfer processing to the standby sites for routine maintenance or when a disaster occurs
[Diagram: Production Database ships transactions to the Standby Database (Real Time Apply, No Delay)]

Highest Data Protection Using Low Cost Storage

• Flashback: Human Error Protection
• Flash Recovery Area: Corruption Protection
• Data Guard: Site Failure Protection
• ASM Mirroring: Storage Failure Protection

Combine the Features to Achieve Any Level of Data Protection

ILM & Oracle RAC
Real Application Clusters
• Run All Your Applications
• Scalable
• Highly Available
• No Single Point of Failure
• Add Low Cost Capacity on Demand
  • Servers or Storage
• Manageable
• Secure
• Drive and Exploit Industry Advances in Clustering
[Diagram: Users, Network, Centralized Management Console, Interconnect, Shared Cache, Storage Area Network; No Single Point Of Failure]

What Happens to Data at Lifecycle End

Data Archiving
What happens to data when it is unlikely to be needed again?
• Archive to Tape
• Archive to a Database
• Efficiently Remove Archive Data using Drop Partition

Archive to a Database
• Online Database Archive
  • Central Archive for all systems
  • No delay when data needed, always available
  • Storage cost is almost same as tape
  • Only backup when data is added
  • Always Readable
  • No impact on Production system
• Use Oracle Warehouse Builder to create custom loading
[Diagram: Main Archive Database]

Archive to Tape
• Archive expired data in Database Format for rapid unload
  • Transportable Tablespaces
    • Archive Oracle data files directly
    • Reload files if they are later needed
  • Data Pump export format
    • Uses Oracle native data types for high performance
• Archive using XML functionality built into the Database
[Diagram: Expired Data moves from the Online Archive Storage Tier to the Offline Archive]

ILM Assistant

Oracle ILM Assistant

• Manages your ILM environment via a GUI
• Advises how to
  • Partition a Table
  • When data needs to be moved
  • Generate Scripts to move data
• Define Lifecycle Definitions
• Illustrates Storage Costs & Savings
• Manage Security & Compliance
• Calendar of Events
• Simulates impact of partitioning on a table
• Tool downloaded from OTN
• Requirements
  • Oracle APEX
  • Oracle 9i or greater

Cost Savings

Cost Savings by Storage Tier

Lifecycle Events

Security & Compliance

Summary

Partitioning for ILM Customer Success: NASDAQ
• Nasdaq implemented partitioning to enable information lifecycle management, supporting two-tier storage that moves aged-out data to less expensive storage quickly and easily while keeping recent data fresh and up-to-date with better performance
• Saved millions of dollars in storage costs and eliminated the need to buy additional hardware
http://www.oracle.com/customers/snapshots/nasdaq.pdf

ILM and Oracle
• ILM is a data management strategy designed to
• Reduce Cost of Retaining Data • Comply with Legal, Regulatory, and Business Mandates

• From the database perspective, ILM is a set of policies and techniques for Managing Data
• Managing data is Oracle’s core competency

• Oracle platform can be used to implement ILM policies and techniques for business data
• Some new technology but mostly an application of existing data management capabilities

Benefits of Oracle for ILM
Over 25 years of investment in Data Management
Performance - Fastest and most functional access to data
Security - Retained data uses same security as current data
Safety - Full protection from corruptions, errors, disasters
Flexibility - Easily adapts to changing requirements
Hardware - Total hardware independence
Longevity - Oracle Databases will be supported for decades
Simplicity - No specialized data stores to manage
Open - Standard SQL interfaces
Consistency - Data is transactionally consistent

Conclusion
• Oracle Database for ILM
  • Implements policies in an application-transparent fashion
  • Stores maximum data for lowest cost
  • Centralizes compliance enforcement
  • ILM Assistant can model future environment
  • Understanding of data and knowledge of access patterns makes it the best location to implement business ILM
[Diagram: Financial, Customer, and Product Data; Active, Less Active, and Historical data on the High Performance, Low Cost, and Online Archive storage tiers; Offline Archive]

Further Information
OTN: white papers, presentations, eSeminars
http://www.oracle.com/technology/deploy/ilm/index.html
