
Module 2

Planning Data Warehouse Infrastructure

Module Overview
Considerations for Data Warehouse Infrastructure
Planning Data Warehouse Hardware

Lesson 1: Considerations for Data Warehouse Infrastructure
System Sizing Considerations
Data Warehouse Workloads
Typical Server Topologies for a BI Solution

Scaling-out a BI Solution
Planning for High Availability

System Sizing Considerations

Data Volume

Number of Users

Analysis/Report Complexity

Availability Requirements

Data Warehouse Workloads


Workloads on the DW:

ETL:
Control flow tasks
Data query and insert
Network data transfer
In-memory data pipeline
SSIS Catalog or MSDB I/O

Data Models (Analysis Services):
Processing
Aggregation storage (multidimensional on disk, tabular in memory)
Query execution

Reporting:
Client requests
Data source queries
Report rendering
Caching
Snapshot execution
Subscription processing
Report Server Catalog I/O

Operations and Maintenance:
OS activity
Logging
SQL Server Agent Jobs
SSIS packages
Indexes
Backups

Typical Server Topologies for a BI Solution

Single Server Architecture (few servers) versus Distributed Architecture (many servers)

Considerations when choosing between few and many servers:
Hardware costs
Software license costs
Configuration complexity
Scalability & Performance
Flexibility
Scaling-out a BI Solution
Data Warehouse

Integration Services

Analysis Services

Reporting Services

Planning for High Availability

Data Warehouse

Analysis Services

AlwaysOn Failover Cluster


RAID Storage

Integration Services

AlwaysOn Availability Group

AlwaysOn Failover Cluster

Reporting Services

NLB Report Servers


AlwaysOn Availability Group or AlwaysOn Failover Cluster

Lesson 2: Planning Data Warehouse Hardware


SQL Server Fast Track Reference Architectures
Core-Balanced System Architecture
Demonstration: Calculating Maximum Consumption Rate
Determining Processor and Memory Requirements
Determining Storage Requirements
Considerations for Storage Hardware
SQL Server Data Warehouse Appliances
SQL Server Parallel Data Warehouse

SQL Server Fast Track Reference Architectures


Pre-tested and approved hardware specifications and guidance
Available from multiple hardware vendors in partnership with Microsoft
Support for a range of data warehouse sizes
Tools provided to calculate required specification

Core-Balanced System Architecture


[Diagram: core-balanced system architecture]
Server: Windows Server and SQL Server, two quad-core CPUs, dual-port FC HBAs (2 x FC port per processor)
Per-core MCR = 200 MB/s; total MCR = 1600 MB/s
Fiber switch between the server and storage
Storage: three storage enclosures, each with storage processors and 4-spindle RAID 10 disk groups
Max I/O rates labeled in the diagram: 2000 MB/s, 2000 MB/s, 1800 MB/s
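The balance check behind this diagram can be sketched in a few lines of Python. The per-core MCR and the three max I/O rates are taken from the diagram labels, and the core count of 8 assumes the two quad-core CPUs shown. Which component each rate belongs to is an assumption here, because the labels do not make the assignment explicit; the dictionary keys are hypothetical names.

```python
# Minimal sketch of a core-balanced check (component mapping is assumed).
PER_CORE_MCR_MB_S = 200      # per-core MCR from the diagram
CORES = 2 * 4                # two quad-core CPUs

# Assumed assignment of the diagram's "Max I/O Rate" labels to components.
component_max_io_mb_s = {
    "fc_hbas": 2000,
    "fiber_switch": 2000,
    "storage": 1800,
}


def total_mcr(cores, per_core_mcr):
    """Total rate at which the CPUs can consume data, in MB/s."""
    return cores * per_core_mcr


def bottleneck(components):
    """Return the (name, rate) of the slowest component in the I/O path."""
    name = min(components, key=components.get)
    return name, components[name]


total = total_mcr(CORES, PER_CORE_MCR_MB_S)
slowest_name, slowest_rate = bottleneck(component_max_io_mb_s)

# The system is treated as balanced when every component in the I/O path
# can deliver at least as much data as the cores can consume.
is_balanced = slowest_rate >= total
print(f"Total MCR: {total} MB/s")
print(f"Slowest component: {slowest_name} at {slowest_rate} MB/s")
print(f"Balanced: {is_balanced}")
```

With the figures from the diagram, every labeled rate exceeds the total MCR, so under these assumptions the CPUs are not starved by the I/O path.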

Demonstration: Calculating Maximum Consumption Rate


In this demonstration, you will see how to:
Create tables for benchmark queries
Execute a query to retrieve I/O statistics
Calculate MCR from the I/O statistics
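As a rough sketch of the last step, the calculation below uses the formula commonly given in the Fast Track guidance: logical reads divided by CPU time in seconds, multiplied by the 8 KB page size and divided by 1024 to give MB/s. The formula is not stated on this slide, and the sample statistics are hypothetical values rather than results from the demonstration.

```python
# Sketch: derive MCR from SET STATISTICS IO / TIME output.
PAGE_SIZE_KB = 8  # SQL Server data pages are 8 KB


def mcr_mb_per_s(logical_reads, cpu_time_ms):
    """Estimate Maximum Consumption Rate in MB/s.

    logical_reads: pages read from the buffer cache by the benchmark query
    cpu_time_ms:   CPU time reported for the query, in milliseconds
    """
    if cpu_time_ms <= 0:
        raise ValueError("cpu_time_ms must be positive")
    cpu_seconds = cpu_time_ms / 1000
    return (logical_reads / cpu_seconds) * PAGE_SIZE_KB / 1024


# Hypothetical statistics for illustration only.
sample_mcr = mcr_mb_per_s(logical_reads=25600, cpu_time_ms=1000)
print(f"MCR: {sample_mcr:.0f} MB/s")
```

Running the benchmark query several times and averaging the results is a reasonable way to smooth out variation before using the figure for sizing.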

Determining Processor and Memory Requirements


Estimating CPU Requirements:
Determine core MCR
Apply formula to estimate required number of cores:

((Average query size in MB / MCR) x Concurrent users) / Target response time

Spread cores across CPUs based on the number of storage arrays
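The core-count formula above can be expressed as a small Python function. This is a sketch under stated assumptions: MCR is in MB/s per core, the target response time is in seconds, and the result is rounded up to a whole core. The input values in the example are hypothetical.

```python
import math


def estimate_cores(avg_query_mb, mcr_mb_s, concurrent_users, target_response_s):
    """Apply ((avg query size / MCR) x concurrent users) / target response time."""
    if mcr_mb_s <= 0 or target_response_s <= 0:
        raise ValueError("MCR and target response time must be positive")
    cores = (avg_query_mb / mcr_mb_s) * concurrent_users / target_response_s
    return math.ceil(cores)  # round up: a fraction of a core cannot be bought


# Hypothetical workload: 8000 MB average query, 200 MB/s per-core MCR,
# 10 concurrent users, 20-second target response time.
cores_needed = estimate_cores(8000, 200, 10, 20)
print(f"Estimated cores: {cores_needed}")
```

The intermediate term (average query size / MCR) is the time one core needs to scan one query's data, which is why dividing by the target response time yields a core count.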

Estimating RAM Requirements:


Use a minimum of 4 GB per core (or 64-128 GB per socket)
Target 20% of data volume
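The two RAM guidelines can be compared side by side with a short sketch. The slide does not say how to combine them; taking the larger of the two figures is an assumption made here for illustration, and the inputs are hypothetical.

```python
GB_PER_CORE_MIN = 4          # minimum of 4 GB per core
DATA_VOLUME_PERCENT = 20     # target 20% of data volume


def ram_from_cores(cores):
    """Minimum RAM in GB based on core count."""
    return cores * GB_PER_CORE_MIN


def ram_from_data_volume(data_volume_gb):
    """Target RAM in GB based on data volume."""
    return data_volume_gb * DATA_VOLUME_PERCENT / 100


def suggested_ram_gb(cores, data_volume_gb):
    """Assumption: use the larger of the two guideline figures."""
    return max(ram_from_cores(cores), ram_from_data_volume(data_volume_gb))


# Hypothetical server: 20 cores and a 1000 GB data warehouse.
print(f"By cores: {ram_from_cores(20)} GB")
print(f"By data volume: {ram_from_data_volume(1000)} GB")
print(f"Suggested: {suggested_ram_gb(20, 1000)} GB")
```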

Determining Storage Requirements


Data Warehouse

Determine initial data volume
Number of fact table rows x row size
Use 100 bytes per row as an estimate if unknown
Add 30-40% for dimensions and indexes

Project data growth
Number of new fact rows per month

Factor in compression
Typically 3:1

Other storage requirements

Configuration databases
Log files
TempDB
Staging tables
Backups
Analysis Services models
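The data warehouse estimate above can be worked through in Python. This is a sketch with several assumptions: the dimension and index overhead is applied to projected growth as well as to the initial volume, sizes use decimal gigabytes, and all inputs are hypothetical. It covers only the data warehouse itself, not the other storage requirements listed.

```python
def estimate_dw_storage_gb(
    initial_fact_rows,
    new_rows_per_month,
    months,
    row_bytes=100,          # fallback estimate when row size is unknown
    overhead_percent=40,    # 30-40% for dimensions and indexes
    compression_ratio=3,    # typically 3:1
):
    """Estimate compressed data warehouse size in decimal GB."""
    total_rows = initial_fact_rows + new_rows_per_month * months
    raw_bytes = total_rows * row_bytes
    with_overhead = raw_bytes * (100 + overhead_percent) / 100
    compressed = with_overhead / compression_ratio
    return compressed / 1e9


# Hypothetical warehouse: 1 billion initial fact rows,
# 10 million new rows per month, projected over 36 months.
size_gb = estimate_dw_storage_gb(1_000_000_000, 10_000_000, 36)
print(f"Estimated compressed size: {size_gb:.1f} GB")
```

Log files, TempDB, staging tables, backups, and Analysis Services models need to be sized separately and added to this figure.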

Considerations for Storage Hardware


Use more, smaller disks instead of fewer, larger disks
Use the fastest disks you can afford
Consider solid state disks, especially for random I/O
Use RAID 10, or at minimum RAID 5
Consider a dedicated storage area network for manageability and extensibility
Balance I/O across enclosures, storage processors, and disk groups

SQL Server Data Warehouse Appliances


Pre-built hardware and software solutions based on tested configurations
Part of a range of SQL Server-based appliances
Available from multiple hardware vendors

SQL Server Parallel Data Warehouse


Special SQL Server edition only available in hardware appliances
Massively parallel processing
Shared-nothing architecture
Dedicated control nodes, compute nodes, and storage nodes

[Diagram: Parallel Data Warehouse appliance. Components: Management Servers, Landing Zone (ETL Interface), Backup Nodes, Control Node Cluster, Database servers (compute nodes), Storage Arrays. Interconnects: Infiniband, Dual Fiber Channel]

Lab: Planning Data Warehouse Infrastructure


Exercise 1: Planning Data Warehouse Hardware

Logon Information
Virtual Machine: 20463C-MIA-SQL
User Name: ADVENTUREWORKS\Student
Password: Pa$$w0rd
Estimated Time: 30 Minutes

Lab Scenario
You are planning a data warehouse solution for
Adventure Works Cycles, and have been asked to
specify the hardware required. You must design a
SQL Server-based solution that provides the right
balance of functionality, performance, and cost.

Lab Review
Review DW Hardware Spec.xlsx in the
D:\Labfiles\Lab02\Solution folder. How does the
hardware specification in this workbook compare to
the one you created in the lab?

Module Review and Takeaways


Review Question(s)
