
SESSION CODE: #DAT314

Data Warehousing with FastTrack and PDW


Nicholas Dritsas Principal Program Manager SQL Server Customer Advisory Team Microsoft

Agenda
SQL Server Data Warehouse Offering Overview
Fast Track Offering
  Motivation
  Balanced Architecture Approach for DW
  Example FastTrack Reference Architectures
  Optimizing Storage, Load and Maintenance
  Case Studies
Parallel Data Warehouse Offering Overview

Microsoft Data Warehousing - Products Positioning


[Diagram: four offerings positioned by Scale, Complexity, HA by default, and SW-HW integration]
1 SQL Server 2008 R2: Minimal HW tune up/optimization. Supports mixed workloads.
2 SQL Server 2008 R2 with Fast Track Reference Architecture: Balanced solution for mostly scan-centric workloads.
3 PDW: Max HW tune up for most DW scenarios.
4 PDW with Hub-and-spoke: Most flexible architecture for handling all DW scenarios.

New in SQL Server 2008 Data Warehousing Enablers


High speed Adapters
Data Compression
Star Join Query Optimization
MERGE SQL Statement
Backup Compression
Parallel Query Enhancements
Change Data Capture (CDC)
Resource Governor
Scale-out Shared Databases
Persistent Lookups
Policy Based Administration
Partition-Aligned Indexed Views
Data Mining Improvements
Data Profiling
New - Report Builder 2.0

Included at no charge! No Fee Based Options:
  Compression
  Partitioning
  Advanced Security
  Manageability
  ETL
  Business Intelligence

SQL Server Relational Data Warehouses Today


Hundreds of deployments > 1 TB
Dozens of deployments > 5 TB
A wide variety of approaches
Synergy with the SQL Server BI Stack
Momentum!
Steady stream of enabling features
  Resource Governor, Compression, Star Query, ...
Next scale breakthrough coming with Parallel Data Warehouse this year

Some SQL Data Warehouses Today

Big SAN
Biggest 64-core Server
Connected together!

What's wrong with this picture???

System out of balance


This server can consume 16 GB/Sec of IO, but the SAN can only deliver 2 GB/Sec
  Even when the SAN is dedicated to the SQL Data Warehouse, which it often isn't
  Lots of disks for Random IOPS, BUT
  Limited controllers
  Limited IO bandwidth

System is typically IO bound and queries are slow
  Despite significant investment in both Server and Storage

The Alternative: A Balanced System


Design a server + storage configuration that can deliver all the IO bandwidth that CPUs can consume when executing a SQL Relational DW workload
Avoid sharing storage devices among servers
Avoid overinvesting in disk drives
  Focus on scan performance, not IOPS
Layout and manage data to maximize range scan performance and minimize fragmentation

What is FastTrack Data Warehouse?


A method for designing a cost-effective, balanced system for Data Warehouse workloads
Reference hardware configurations developed in conjunction with hardware partners using this method
Best practices for data layout, loading and management
Relational Database Only
  Not SSAS, IS, RS

Data Warehouse Workload Characteristics


Scan Intensive
Hash Joins
Aggregations

SELECT L_RETURNFLAG, L_LINESTATUS,
       SUM(L_QUANTITY) AS SUM_QTY,
       SUM(L_EXTENDEDPRICE) AS SUM_BASE_PRICE,
       SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS SUM_DISC_PRICE,
       SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)*(1+L_TAX)) AS SUM_CHARGE,
       AVG(L_QUANTITY) AS AVG_QTY,
       AVG(L_EXTENDEDPRICE) AS AVG_PRICE,
       AVG(L_DISCOUNT) AS AVG_DISC,
       COUNT(*) AS COUNT_ORDER
FROM LINEITEM
GROUP BY L_RETURNFLAG, L_LINESTATUS
ORDER BY L_RETURNFLAG, L_LINESTATUS

Balanced Architecture Components


Balanced System - CPU


Determine your data consumption rate, per CPU core, for your query mix
Simple example:
  Assume TPCH query 2 is your average query
  Run the query on a test server with data fully cached in memory
  Execute parallel query using MAXDOP 4
  Observe 100% CPU on 4 cores
  Time the query and observe # pages read (SET STATISTICS IO ON; SET STATISTICS TIME ON)
  Per Core Consumption = (# Logical Reads * 8K) / (CPU Time)
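The per-core formula above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample measurement are hypothetical, and it assumes the CPU time reported by SET STATISTICS TIME is the total across all cores used by the query, so dividing by it gives a per-core rate.

```python
PAGE_BYTES = 8 * 1024  # SQL Server data pages are 8 KB


def per_core_consumption_mb_s(logical_reads: int, cpu_time_ms: float) -> float:
    """Per Core Consumption = (# Logical Reads * 8K) / (CPU Time), in MB/s."""
    if cpu_time_ms <= 0:
        raise ValueError("cpu_time_ms must be positive")
    megabytes_read = logical_reads * PAGE_BYTES / (1024 * 1024)
    return megabytes_read / (cpu_time_ms / 1000.0)


# Hypothetical measurement, not taken from the session:
# 1,000,000 logical reads and 40,000 ms of total CPU time.
rate = per_core_consumption_mb_s(1_000_000, 40_000)
print(f"{rate:.1f} MB/s per core")
```

With these made-up inputs the result lands close to the 200 MB/s per core figure that the deck uses for its reference architectures.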

You can get more sophisticated


Realize that queries performing complex calculations, format conversions, multi-dimension hash joins, etc. will be more CPU-intensive than others
  Complex queries will consume data at a slower per-core rate than simpler queries
Alternative: Measure per-core data consumption for a variety of queries, and take the weighted average
  A standard approach to capacity planning
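A minimal sketch of the weighted average, assuming each measured query is given a weight that reflects how often it runs in the workload. The rates and weights below are hypothetical.

```python
def weighted_consumption_rate(measurements):
    """Weighted average of per-core rates.

    measurements: list of (rate_mb_s, weight) pairs, where weight is the
    query's relative share of the workload.
    """
    total_weight = sum(weight for _, weight in measurements)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(rate * weight for rate, weight in measurements) / total_weight


# Hypothetical mix: simple scans are fast per core, complex joins are slower.
query_mix = [(300.0, 5), (200.0, 3), (100.0, 2)]
print(weighted_consumption_rate(query_mix))
```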

Or you can leave it to us


We've measured a mix of TPCH queries that reflect a prototype Data Warehouse workload
Concluded that SQL Server 2008 R2 on current x64 cores consumes ~200 MB/Sec per core on average for this workload
We use this as a basis for the published reference architectures
Your mileage will vary!
  For precise system sizing, measure your own workload

Balanced System - Determine Storage Sizing


CPU core count and consumption rate for workload will determine # of controllers and enclosures needed to provide aggregate throughput
# of controllers will determine minimum disk count for delivering the scan bandwidth
Determine desired per-disk capacity based on expected data volume
  Leave enough room for TempDB and for extra copies of the largest tables in the system, for maintenance activities
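The sizing chain above (cores, then throughput, then enclosures, then disks) can be sketched as follows. The defaults are assumptions drawn from figures quoted elsewhere in this deck: 200 MB/s per core, 800 MB/s per enclosure, and 4 RAID1 data pairs per enclosure as in the case-study system. Real sizing should use measured values.

```python
import math


def size_storage(cores, mb_s_per_core=200.0, mb_s_per_enclosure=800.0,
                 raid1_pairs_per_enclosure=4):
    """Derive enclosure and data-disk counts from the CPU consumption rate."""
    required_mb_s = cores * mb_s_per_core
    enclosures = math.ceil(required_mb_s / mb_s_per_enclosure)
    raid1_pairs = enclosures * raid1_pairs_per_enclosure
    return {
        "required_mb_s": required_mb_s,
        "enclosures": enclosures,
        "raid1_pairs": raid1_pairs,
        "data_disks": raid1_pairs * 2,  # each RAID1 pair is two drives
    }


print(size_storage(8))
```

Log drives and hot spares are left out of the count on purpose, since they do not contribute scan bandwidth.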

Balanced System - IO Stack


Use a 2x quad-core server as a building block / starting point
Ensure that the per-core data consumption rate can be delivered by all elements of the IO stack
[Diagram: maximum theoretical throughput for IO stack components sized for an 8 CPU core Fast Track system, two 4-core CPU sockets (assumes 200 MB/s per core)]

Balanced System - Determine Storage Sizing (2)


Keep in mind theoretical maximums are just that: theoretical
Some testing/validation may be needed
[Diagram: observed bandwidth realized on an 8 core Fast Track system (two 4-core CPU sockets) running SQLIO]

Balanced System - Scaling the IO Stack


[Diagram: one server with eight 4-core CPU sockets and eight HBAs, connected through a fiber switch to eight storage enclosures; each enclosure has two storage processors and five RAID-1 pairs]


Using a Preconfigured FastTrack Reference Architecture


Guesstimate of 200 MB/sec per core for an average DW workload
  Equates to 800 MB/sec of enclosure bandwidth per quad-core CPU
Estimate total bandwidth needed under query concurrency
  Derives CPU count
  Derives total Storage profile

Published Reference Architectures


Balanced System Examples -- HP / Dell / IBM, 8 to 48 core


Optimizing Storage Layout for Scan Intensive Workloads


LUN configuration is based on RAID1 pairs
  Optimal for scan type access patterns
Striping across storage is accomplished via SQL Server data files
Observed throughput for a single RAID pair >= 130 MB/s
[Diagram: one storage enclosure with storage processors SP A and SP B; disks 01-10 in RAID groups GP01-GP05; LUN0 (Logs) and LUN1-LUN8; HS marks the hot spare]

Storage Layout Implications for SQL Server


Create a SQL data file per LUN, for every filegroup
TempDB filegroups share same LUNs as other databases
Log on separate disks, within each enclosure
  Striped using SQL Striping
  Log may share these LUNs with load files, backup targets

Storage Layout Implications for SQL Server


[Diagram: data files for each filegroup spread across LUN 1, LUN 2, LUN 3, ... LUN 16]
Permanent_DB, Permanent FG: Permanent_1.ndf, Permanent_2.ndf, Permanent_3.ndf, ... Permanent_16.ndf
Stage Database, Stage FG: Stage_1.ndf, Stage_2.ndf, Stage_3.ndf, ... Stage_16.ndf
TempDB: TempDB.mdf (25GB), TempDB_02.ndf (25GB), TempDB_03.ndf (25GB), ... TempDB_16.ndf (25GB)
Local Drive 1
Log LUN 1: Permanent DB Log, Stage DB Log
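The "one data file per LUN, for every filegroup" rule can be written out as a small mapping. This sketch only generates file names in the pattern shown on the slide (Permanent_1.ndf through Permanent_16.ndf); the function name and the LUN labels are illustrative, and no paths or sizes are implied.

```python
def data_files_per_lun(file_prefix: str, lun_count: int) -> dict:
    """Map each LUN to the single data file that a filegroup keeps on it."""
    return {
        f"LUN {lun}": f"{file_prefix}_{lun}.ndf"
        for lun in range(1, lun_count + 1)
    }


# One file per LUN for each filegroup, so both filegroups span all 16 LUNs.
permanent = data_files_per_lun("Permanent", 16)
stage = data_files_per_lun("Stage", 16)
print(permanent["LUN 1"], stage["LUN 16"])
```

Because every filegroup has a file on every LUN, SQL Server spreads allocations over all of them, which is the striping the slides refer to.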

How Scans are Optimized


SQL Server issues a large number of asynchronous read-ahead requests when performing scans
  Attempts to issue I/O at rate needed to keep CPUs busy
Size of I/O issued is dependent on continuity of underlying data pages
  I/O size can be any multiple of 8K up to 512K
Average request size that will be issued by read-ahead operations can be determined by looking at avg_fragment_size_in_pages exposed by sys.dm_db_index_physical_stats
  Values >= 64 pages will mean I/O sizes issued by read-ahead should be at or near 512K
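As a rough rule of thumb, the relationship between fragment size and read-ahead I/O size can be sketched like this. It is an upper-bound estimate under the assumptions stated on the slide (8K pages, 512K maximum request), not a model of the storage engine; the function name is hypothetical.

```python
PAGE_KB = 8
MAX_READ_AHEAD_PAGES = 64  # 64 pages * 8 KB = 512 KB


def expected_read_ahead_kb(avg_fragment_size_in_pages: float) -> float:
    """Estimate the typical read-ahead request size from the average fragment size."""
    pages = min(max(avg_fragment_size_in_pages, 1.0), MAX_READ_AHEAD_PAGES)
    return pages * PAGE_KB


for fragment_pages in (4, 16, 64, 200):
    print(fragment_pages, expected_read_ahead_kb(fragment_pages))
```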

Read-Ahead in Action
Clustered index: Key Order
1. Next range of page requests is determined by looking at B-Tree for next range of key values
2. Pages for the range are sorted
3. I/O issued for each contiguous range of pages (up to 64 pages in a single request)

Heap: Allocation Order
Scan GAM pages to determine next range of pages
I/O issued for each contiguous range of pages (up to 64 pages in a single request)
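The "sort the pages, then issue one I/O per contiguous range" step can be illustrated with a short sketch. It is a simplified model, not SQL Server's implementation: page numbers are plain integers and the 64-page cap comes from the slide.

```python
def coalesce_reads(page_ids, max_pages=64):
    """Group page numbers into (start_page, page_count) requests.

    Pages are sorted first; each request covers a contiguous run of pages
    and never exceeds max_pages.
    """
    requests = []
    for page in sorted(set(page_ids)):
        if requests:
            start, count = requests[-1]
            if page == start + count and count < max_pages:
                requests[-1] = (start, count + 1)  # extend the current run
                continue
        requests.append((page, 1))  # start a new request
    return requests


# Contiguous pages become one large request; a gap forces a new one.
print(coalesce_reads([33, 31, 32, 36, 37]))
```

A fragmented index produces many short runs and therefore many small requests, which is why the later slides concentrate on avoiding fragmentation.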

Techniques to Maximize Scan Throughput


-E startup parameter
Minimize use of NonClustered indexes on Fact Tables
Load techniques to avoid fragmentation
  Load in Clustered Index order (e.g. date) when possible
Index Creation always MAXDOP 1, SORT_IN_TEMPDB
Isolate volatile tables in separate filegroup
Isolate staging tables in separate filegroup or DB
Periodic maintenance

Conventional data loads lead to fragmentation


Bulk Inserts into Clustered Index using a moderate batchsize parameter
  Each batch is sorted independently
Overlapping batches lead to page splits
[Diagram: pages 1:31 through 1:40 shown out of physical sequence along the key order of the index]

Alternatives for loading


Use a heap
  Practical if queries need to scan whole partitions
or use a batchsize = 0
  Fine if no parallelism is needed during load
or use a Two-Step Load
  1. Load to a Staging Table (heap)
  2. INSERT-SELECT from Staging Table into Target CI
  Resulting rows are not fragmented
  Can use Parallelism in step 1, essential for large data volumes
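A toy model can show why the two-step load helps. It assumes, as a deliberate simplification, that rows are laid down physically in arrival order and counts the places where key order runs backwards as a crude stand-in for fragmentation. It does not model real page splits, and all names are hypothetical.

```python
def out_of_order_pairs(physical_keys):
    """Count adjacent positions where the key sequence goes backwards."""
    return sum(1 for a, b in zip(physical_keys, physical_keys[1:]) if b < a)


def batched_load(batches):
    """Each batch is sorted on its own and written after the previous batch."""
    placed = []
    for batch in batches:
        placed.extend(sorted(batch))
    return placed


def two_step_load(batches):
    """Step 1: stage every batch in a heap. Step 2: write once in key order."""
    staged = [key for batch in batches for key in batch]
    return sorted(staged)


overlapping_batches = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
print(out_of_order_pairs(batched_load(overlapping_batches)))
print(out_of_order_pairs(two_step_load(overlapping_batches)))
```

In the toy model the overlapping batches leave the keys interleaved, while the single ordered write in step 2 does not.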

Two-Step Load Variations


To achieve high parallelism during historical load
  Typically into a partitioned table
  Use a Staging Table (heap) that is partitioned identically to the Target Table
  Use multiple concurrent streams to load the Staging Table with moderate batchsize (SSIS, Bulk Insert, etc)
  INSERT-SELECT separate partitions into the Target Table, potentially in parallel
    Use ALTER TABLE ... SET (LOCK_ESCALATION = AUTO)
Note: If memory is limited, TempDB could be heavily used for sorting

Two-Step Load Variations (cont.)


To avoid most TempDB space and TempDB IO during load
  Use a partitioned Staging Table that is also indexed identically to Target Table
  Load Staging Table using moderate batchsize (< 1M rows)
  Final INSERT-SELECTs will avoid any sort!
    However the staging loads will be logged
Note: Parallelism will be limited if load batches overlap

Other fragmentation best practices


Avoid Autogrow of filegroups
  Pre-allocate filegroups to desired long-term size
  Manually grow in large increments when necessary
Keep volatile tables in a separate filegroup
  Tables that are frequently rebuilt or loaded in small increments
If historical partitions are loaded in parallel, consider separate filegroups for separate partitions to avoid extent fragmentation

Sometimes fragmentation can't be avoided


If incremental loads overlap data already present in the Clustered Index, page splits will occur anyway
Periodic table maintenance can reduce the fragmentation
Partitioning on history (date key) can help minimize needed maintenance operations

Maintenance considerations
Use ALTER INDEX ... REBUILD WITH (MAXDOP = 1, SORT_IN_TEMPDB = ON)
  Single threaded, which avoids creating new extent fragmentation
  Can rebuild just the current partition
Avoid ALTER INDEX ... REORGANIZE
  Pages will become physically ordered, but significant extent fragmentation may occur

Handling long-term accumulation of fragmentation


Sometimes it may be best to start fresh:
  Create a new filegroup to replace the old
  Create a new copy of the table in new filegroup
    With matching Partitions and Clustered Index
  INSERT-SELECT from old to new (avoids a sort)
  Build secondary indexes
  Drop original table and rename the new
All but final step can be performed online


Case 1: Insurance Claims - High-volume loads in a short load window


Example: Load and enrich 50 GB of incremental data in less than 1 hour
Only possible with a highly parallel load design
Use partitioned destination table
  Partitioned by equal ranges of customer key
  But a Clustered Index on Date
  # partitions = # cores
Parallel loading to staging table first
Separate filegroups per partition prevent interleaving during load

System Design

[Diagram: MSA2000 DAE with drive groups Pri_A, Pri_B, Pri_C, Pri_D, Log, Hot Spare, Hot Spare]
Primary Storage: 8 Drives (4 RAID1 Pairs)
Logs: 2 Drives (1 RAID1 Pair)
Spares: 2 Drives

Results
Measure | Existing Appliance | SQL Server Fast Track DW | Comparison
Loading Subject Area 1 | 5:10:21 total time | 51:31 total time | SQL Server 6x faster
Loading Subject Area 2 | 4:36:08 total time | 1:50.01 total time | SQL Server 2.5x faster
Query times Subject Area 1 | 3:03 avg query time (using 9 benchmark queries) | 0:15 avg query time (using 9 benchmark queries) | SQL Server 12x faster
Query times Subject Area 2 | 56:44 avg query time (using 4 benchmark queries) | 8:09 avg query time (using 4 benchmark queries) | SQL Server 7x faster

Price per TB (8TB) Cal: $22K / TB
Price per TB (16TB) Cal: $13K / TB
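The comparison factors can be checked against the reported times with a few lines of arithmetic. One assumption is made here: the Fast Track time shown as "1:50.01" is read as 1 hour, 50 minutes, 1 second, which is the reading that matches the stated 2.5x.

```python
def to_seconds(clock: str) -> int:
    """Convert 'h:mm:ss' or 'm:ss' to seconds."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def speedup(existing: str, fast_track: str) -> float:
    """How many times faster the Fast Track run was."""
    return to_seconds(existing) / to_seconds(fast_track)


results = {
    "Loading Subject Area 1": speedup("5:10:21", "51:31"),
    "Loading Subject Area 2": speedup("4:36:08", "1:50:01"),  # assumed h:mm:ss
    "Query times Subject Area 1": speedup("3:03", "0:15"),
    "Query times Subject Area 2": speedup("56:44", "8:09"),
}
for name, factor in results.items():
    print(f"{name}: {factor:.1f}x")
```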

Case 2: Telecom -- Initial Data Load


Load 400 GB to new Clustered Index on an 8-core server in under 7 hours
Target table designed with 8 partitions of evenly spaced historical ranges
3-step load process leveraging partitioning
  Load, Index, Switch
  All steps use parallelism
  Minimal logging

Case 2: Telecom -- Initial Data Load (cont.)


Data Size: 400G (50G * 8)
Bulk Insert 8 files to match core count, and partition the final table according to core count
1 Heap Table per destination partition, and final table is assumed to be empty
Create Clustered Index on the Heap Tables, and 1:1 switch each into the final Partitioned Table
SSIS Package Attributes/MaxConcurrentExecutables: 8
Use MAXDOP=1: minimal fragmentation

1. Bulk Insert
2. Create Clustered Index
3. Switch
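The shape of this load (eight independent streams, each running the same three steps in order) can be sketched with a thread pool. This is only an outline of the control flow: the steps are recorded as text rather than executed against a database, and the staging table names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def load_one_partition(partition_no: int) -> list:
    """Return the ordered steps for one partition (one stream per core)."""
    heap = f"stage_heap_{partition_no}"  # hypothetical staging table name
    return [
        f"BULK INSERT file {partition_no} into {heap}",
        f"CREATE CLUSTERED INDEX on {heap} with MAXDOP 1",
        f"SWITCH {heap} into partition {partition_no} of the final table",
    ]


def run_initial_load(partition_count: int = 8) -> list:
    """Run all partition streams concurrently, mirroring 8 concurrent executables."""
    with ThreadPoolExecutor(max_workers=partition_count) as pool:
        return list(pool.map(load_one_partition, range(1, partition_count + 1)))


plan = run_initial_load()
for steps in plan:
    print(" -> ".join(steps))
```

Parallelism comes from running the streams side by side, while each index build stays single-threaded to keep fragmentation low, as the slide notes.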


SQL Server Parallel Data Warehouse


A data warehouse appliance with massive scalability

Massive Scale-Out of SQL Server through Massively Parallel Processing (MPP) system: 10s of TB to 100s of TB to PB
Choice of hardware vendor - Reference Architectures from HP, Bull, EMC, Dell, IBM
Low cost of ownership through industry standard hardware
Simplified deployment & maintenance via appliance model
Integration with existing SQL Server 2008 data warehouses via Hub & Spoke Architecture
Deep integration with Microsoft BI

Parallel Data Warehouse Appliance Hardware Architecture


[Diagram: appliance components, each node running SQL Server]
Control Nodes (Active / Passive), serving Client Drivers
Management Servers, serving Data Center Monitoring
Landing Zone, serving the ETL Load Interface
Backup Node, serving the Corporate Backup Solution
Database Servers with Storage Nodes
Spare Database Server
Networks: Corporate Network, Private Network, Dual Fiber Channel, Dual Infiniband

Question & Answer Session

2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Resources

www.msteched.com/Australia
Sessions On-Demand & Community

www.microsoft.com/australia/learning
Microsoft Certification & Training Resources

http://technet.microsoft.com/en-au
Resources for IT Professionals

http://msdn.microsoft.com/en-au
Resources for Developers
