
Secrets of SSAS Storage

Stacia Misner
Principal Consultant
Data Inspirations

November 6-9, Seattle, WA

Stacia Misner

Consultant, Educator, Mentor, Author
SQL Server MVP, SSAS Maestro

BIA-300-HD

Overview



Storage Architecture
Partitions
Aggregations
Storage Impact on Queries


Storage Architecture
Query Architecture
Dimension Storage
Measure Group Storage


Query Architecture
Client Application
  MDX Query
Analysis Services Server
  Query Parser
  Formula Engine
    Populate Axes
    Compute Cell Data
    Calculation Engine
    Formula Engine Cache
  Subcube Operations
  Storage Engine
    Storage Engine Cache
    Dimension Data
      Attribute Store
      Hierarchy Store
    Measure Group Data
      Aggregations
      Fact Data

Dimension Storage: Attribute Store
Name Hash
Key Hash
Key Store
  DataID
  Attribute Key
Member Property Store
  DataID
  Attribute Property
Ordered by DataID for fast random access
Relationship Store
Bitmap Indexes
  DataIDs

Dimension Storage: Hierarchy Store
Applies only to natural hierarchies
Set Store
  Path of each member
  DataID
Structure Store (Parent level)
  Level Index
  DataID
  Parent DataID
  FirstChild DataID
  Children Count
Structure Store (Child level)
  Level Index
  DataID
  Parent DataID
  FirstChild DataID
  Children Count

Measure Group Storage
STORAGE ENGINE CACHE
  Loads as queries execute
  Clears with cleaner thread or processing of partition
AGGREGATION DATA
  Responds to request with aggregated values in storage
  Summarizes lower level aggregated values on-the-fly as needed
FACT DATA
  Scans MOLAP partitions and partition segments in parallel
  Uses bitmap indexes to scan pages to find requested data

Partitions
Partitioning Strategy
Partition Storage
Partition Design
Merging Partitions

Partitioning Strategy
Storage, Query, Processing

PARTITION      STORAGE   AGGREGATION
Current Year   MOLAP     20%
Prior Year     HOLAP     30%
History        ROLAP     15%

Partition Storage: MOLAP
Multidimensional OLAP
Storage of data and aggregations in cache
Highly compressed and indexed
[Diagram labels: RDBMS; Analysis Server; UDM (Unified Dimensional Model); MOLAP cache (Data, Aggs); MDX]

Partition Storage: HOLAP
Hybrid OLAP
Storage of aggregations only in cache
Translation of MDX to SQL for retrieval of detail data
[Diagram labels: RDBMS (Data); SQL; Analysis Server; UDM; MOLAP cache (Aggs); MDX]

Partition Storage: ROLAP
Relational OLAP
Relational storage of data and aggregations
Real-time analysis
[Diagram labels: RDBMS (Data/Aggs); SQL; Analysis Server; UDM; MDX]

ROLAP Dimension
Dimension data exists in relational storage only
Analysis Services maintains a cache of requested dimension data
Use case: Hundreds of millions of members in dimension

Partition Design: Table Binding
Use one source fact table per partition
Insulate partition from table changes by using a view

Partition Design: Query Binding
Use separate queries to same fact table for each partition
Configure as last step in cube design to use most recent DSV
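A minimal sketch of query binding, assuming a hypothetical fact table dbo.FactSales with an integer OrderDateKey column in yyyymmdd form and a made-up Sales_<year> naming pattern (none of these names come from the slides). It generates one source query per yearly partition, with half-open date ranges so that no fact row can land in two partitions.

```python
def partition_queries(table, date_key, years):
    """Return {partition_name: SELECT statement}, one query per year.

    Each WHERE clause covers [Jan 1 of the year, Jan 1 of the next year),
    so the ranges neither overlap nor leave gaps between adjacent years.
    """
    queries = {}
    for year in sorted(years):
        lower = year * 10000 + 101          # e.g. 20080101
        upper = (year + 1) * 10000 + 101    # e.g. 20090101
        queries[f"Sales_{year}"] = (
            f"SELECT * FROM {table} "
            f"WHERE {date_key} >= {lower} AND {date_key} < {upper}"
        )
    return queries


if __name__ == "__main__":
    plan = partition_queries("dbo.FactSales", "OrderDateKey", [2006, 2007, 2008])
    for name, sql in plan.items():
        print(name, "->", sql)
```

Overlapping WHERE clauses would double-count facts across partitions, which is why the sketch derives both bounds from the same year value instead of typing each range by hand.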

Merging Partitions
[Diagram: partitions 2008, 2007, 2006, History, 2005]
Use Management Studio interface for manual merge
Execute XMLA script for automated merge in Execute DDL Task (SSIS)
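For the automated merge, a script along these lines could be generated and handed to the Execute DDL Task. This is a sketch only: the database, cube, measure group, and partition IDs are hypothetical, and the element layout follows the XMLA MergePartitions command as commonly documented, so verify it against your server version before relying on it.

```python
import xml.etree.ElementTree as ET

XMLA_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"


def _partition_ref(parent, tag, database, cube, measure_group, partition):
    """Append a <Source> or <Target> element that identifies one partition."""
    ref = ET.SubElement(parent, tag)
    for name, value in (
        ("DatabaseID", database),
        ("CubeID", cube),
        ("MeasureGroupID", measure_group),
        ("PartitionID", partition),
    ):
        ET.SubElement(ref, name).text = value
    return ref


def merge_partitions_xmla(database, cube, measure_group, sources, target):
    """Build a MergePartitions command that folds `sources` into `target`."""
    root = ET.Element("MergePartitions", {"xmlns": XMLA_NS})
    sources_el = ET.SubElement(root, "Sources")
    for partition in sources:
        _partition_ref(sources_el, "Source", database, cube, measure_group, partition)
    _partition_ref(root, "Target", database, cube, measure_group, target)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical IDs: fold the 2005 partition into the History partition.
    print(merge_partitions_xmla("DW", "Sales", "Fact Sales",
                                ["Sales_2005"], "Sales_History"))
```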

Aggregations
Aggregation Concepts
Aggregation Usage
Aggregation Wizard Optimizations
Aggregation Designer
Usage Based Optimization

Aggregation Concepts
Aggregation design
Indexed view for aggregation (ROLAP only)
Aggregation-aware queries
Aggregation-less queries
• Sales by country, gender, or both
• Sales by category
• Sales by year
• Sales by group
• Sales by gender and year
• Sales by category and year
• Sales by country and category
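The difference between a query that can use an aggregation and one that cannot can be sketched in a few lines. The fact rows below are made up, and the single pre-computed aggregation (sales by country and category) is an assumption chosen for illustration; the real storage engine works on compressed, indexed structures and not on Python dictionaries.

```python
from collections import defaultdict

# Made-up fact rows: (country, category, gender, sales)
FACTS = [
    ("US", "Bikes", "F", 100),
    ("US", "Bikes", "M", 150),
    ("US", "Helmets", "F", 20),
    ("CA", "Bikes", "M", 80),
]
COLUMNS = ("country", "category", "gender")


def summarize(rows, row_columns, group_by):
    """Sum the last field of each row, grouped by the named columns."""
    idx = [row_columns.index(c) for c in group_by]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


# Pre-computed aggregation: sales by country and category.
AGG_COLUMNS = ("country", "category")
AGG = summarize(FACTS, COLUMNS, AGG_COLUMNS)


def answer(group_by):
    """Use the aggregation when it covers the request, else scan fact data."""
    if set(group_by) <= set(AGG_COLUMNS):
        agg_rows = [key + (value,) for key, value in AGG.items()]
        return "aggregation", summarize(agg_rows, AGG_COLUMNS, group_by)
    return "fact data", summarize(FACTS, COLUMNS, group_by)


if __name__ == "__main__":
    print(answer(("country",)))   # rolled up from the aggregation
    print(answer(("gender",)))    # gender is not in the aggregation
```

Sales by country is answered by summarizing the smaller country-and-category aggregation, while sales by gender falls back to the fact rows because the aggregation does not carry that attribute.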

[Figure: Aggregation Concepts]

Aggregation Usage
Default
Full
• Include in every aggregation (or a lower level attribute)
• Use only for most commonly used attributes, sparingly
None
• Exclude in every aggregation
• Use for infrequently used attributes
Unrestricted
• No constraints
• Analysis Services decides whether to include or exclude
Rule of thumb: use only 5-10 Unrestricted attributes per dimension

Aggregation Usage Default Rules

ATTRIBUTE                                          FULL   NONE   UNRESTRICTED
Granularity attribute for measure group                          X
Special dimension types*                                  X
Natural hierarchies with attribute relationships                 X
Non-aggregatable attributes                        X
All others                                                X

*Many-to-many dimension, unmaterialized referenced dimension, data mining dimension

Aggregation Wizard Optimizations
Remove unneeded attributes
Add user hierarchies for natural hierarchies where possible
Ensure correct attribute relationships
Set AttributeHierarchyEnabled to False for “member properties”
Configure AggregationUsage in advance
Use correct estimates for partition rows and attributes within a partition

Aggregation Designer
[Screenshot: Profiler: Aggregation hit]
[Screenshot: Profiler: No aggregation hit]

Usage Based Optimization

SERVER PROPERTY                            SETTING
Log\QueryLog\CreateQueryLogTable           true
Log\QueryLog\QueryLogConnectionString      Data source=localhost;Initial Catalog=DW
Log\QueryLog\QueryLogSampling              1
Log\QueryLog\QueryLogTableName             OlapQueryLog

Clear property value to disable logging
AS creates if doesn’t exist, else logs to table
OlapQueryLog records deleted when:
• Measures added to or removed from measure group
• Dimensions added to or removed from measure group
• Attributes added to or removed from dimension

Usage Based Optimization
[Screenshots: OLAPQueryLog; Filter Criteria; Query Details]
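To get a feel for what the query log holds before running the wizard, the most frequently requested subcubes can be summarized directly. The sketch below uses an in-memory SQLite table as a stand-in for the SQL Server log table; the column names (MSOLAP_Database, MSOLAP_ObjectPath, MSOLAP_User, Dataset, StartTime, Duration) follow the OlapQueryLog layout as I understand it, and every row is made up.

```python
import sqlite3

# Made-up sample rows; Dataset holds the requested subcube as a bit string.
SAMPLE_ROWS = [
    ("DW", "SRV.DW.Sales.Fact Sales", "alice", "0100,001", "2012-11-06 09:00:00", 120),
    ("DW", "SRV.DW.Sales.Fact Sales", "bob", "0100,001", "2012-11-06 09:05:00", 80),
    ("DW", "SRV.DW.Sales.Fact Sales", "alice", "0010,100", "2012-11-06 09:10:00", 500),
]


def make_log():
    """Create an in-memory stand-in for the query log table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE OlapQueryLog (MSOLAP_Database TEXT, MSOLAP_ObjectPath TEXT, "
        "MSOLAP_User TEXT, Dataset TEXT, StartTime TEXT, Duration INTEGER)"
    )
    conn.executemany("INSERT INTO OlapQueryLog VALUES (?, ?, ?, ?, ?, ?)", SAMPLE_ROWS)
    return conn


def top_datasets(conn, limit=5):
    """Return (Dataset, hit count, total duration), most frequent first."""
    return conn.execute(
        "SELECT Dataset, COUNT(*) AS Hits, SUM(Duration) AS TotalDuration "
        "FROM OlapQueryLog GROUP BY Dataset "
        "ORDER BY Hits DESC, TotalDuration DESC LIMIT ?",
        (limit,),
    ).fetchall()


if __name__ == "__main__":
    for row in top_datasets(make_log()):
        print(row)
```

Against the real log in SQL Server, the same grouping would use TOP in place of LIMIT.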

Storage Impact on Queries
Which Engine is the Bottleneck?
Query Analysis
Partitioning
Aggregation Design

Which Engine is the Bottleneck?
Time to execute query (cold cache)
Storage Engine time = sum of elapsed time for each Query Subcube event
Formula Engine time = Total execution time (Query End event) - Storage Engine time
Bottleneck is engine consuming 30% or more of total query execution time
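The arithmetic above can be sketched as a small helper. The function name and the millisecond inputs are hypothetical; the durations would come from a Profiler trace of a cold-cache run.

```python
def engine_breakdown(total_ms, subcube_ms, threshold=0.30):
    """Split total query time between the storage and formula engines.

    total_ms   -- duration reported by the Query End event
    subcube_ms -- durations of the individual Query Subcube events
    threshold  -- share of total time at which an engine counts as a bottleneck
    """
    if total_ms <= 0:
        raise ValueError("total_ms must be positive")
    storage = sum(subcube_ms)
    formula = total_ms - storage
    shares = {
        "storage engine": storage / total_ms,
        "formula engine": formula / total_ms,
    }
    bottlenecks = [name for name, share in shares.items() if share >= threshold]
    return {"storage_ms": storage, "formula_ms": formula,
            "shares": shares, "bottlenecks": bottlenecks}


if __name__ == "__main__":
    # Made-up trace: 1000 ms total, three subcube events.
    result = engine_breakdown(1000, [100, 50, 50])
    print(result["bottlenecks"], result["shares"])
```

With a 30% threshold both engines can qualify at once, so the helper returns a list and does not pick a single winner.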

Prepare for Query
Clear Cache
Reload MDX Script (without caching)
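Clearing the cache is typically done with an XMLA ClearCache command run from a Management Studio XMLA window. The sketch below builds that command as a string; the database ID is hypothetical and the element layout should be checked against your server version.

```python
import xml.etree.ElementTree as ET

XMLA_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"


def clear_cache_xmla(database_id):
    """Build a ClearCache command scoped to one Analysis Services database."""
    root = ET.Element("ClearCache", {"xmlns": XMLA_NS})
    obj = ET.SubElement(root, "Object")
    ET.SubElement(obj, "DatabaseID").text = database_id
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(clear_cache_xmla("DW"))  # "DW" is a placeholder database ID
```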

Profiler Results Begin
Query Begin indicates successful query parsing
Serialize Results Current counts members on each axis

Storage Engine Reads Partitions

Query Subcube Event
Sum each Query Subcube Event to compute total Storage Engine query time
Review TextData for vectors returned to formula engine (very cryptic)

Query Subcube Verbose Event
Use Query Subcube Verbose Event to understand vectors

Subcube Details

VALUE   RESULT
0       Default member returned
*       All members returned
+       Selected members returned
-       Slice below granularity returned
4       Single member’s DataID
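The lookup table above translates directly into a small decoder. This sketch assumes the vector has already been split into one token per attribute; the exact layout of the TextData column is not shown in the slides, so the tokenization is left to the caller.

```python
MEANINGS = {
    "0": "Default member returned",
    "*": "All members returned",
    "+": "Selected members returned",
    "-": "Slice below granularity returned",
}


def describe_token(token):
    """Translate one subcube vector value into its meaning."""
    if token in MEANINGS:
        return MEANINGS[token]
    if token.isdigit():
        # Any other number is the DataID of a single member.
        return f"Single member with DataID {token}"
    raise ValueError(f"Unrecognized subcube value: {token!r}")


def describe_vector(tokens):
    """Describe every attribute position in an already-tokenized vector."""
    return [describe_token(t) for t in tokens]


if __name__ == "__main__":
    for line in describe_vector(["0", "*", "4", "+"]):
        print(line)
```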

Profiler Results End
Serialize Results Current reports total number of cells in query results
Query End reports total query duration

Duration Analysis

Partitioning
Create multiple partitions to enable optimal scans of fact data
Partition by one or more attributes used in many queries, such as Year
Define slice
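Why a defined slice helps can be illustrated with a toy model of partition elimination. The partition names and year sets below are made up, and the real storage engine decides this from partition metadata and indexes, not from Python sets.

```python
# Hypothetical partitions, each with the set of years its slice covers.
PARTITIONS = {
    "Sales_2006": {2006},
    "Sales_2007": {2007},
    "Sales_2008": {2008},
    "Sales_History": {2003, 2004, 2005},
}


def partitions_to_scan(partitions, requested_years):
    """Return only the partitions whose slice overlaps the requested years."""
    requested = set(requested_years)
    return [name for name, slice_years in partitions.items()
            if slice_years & requested]


if __name__ == "__main__":
    print(partitions_to_scan(PARTITIONS, [2008]))        # one partition
    print(partitions_to_scan(PARTITIONS, [2005, 2006]))  # two partitions
```

A query filtered to a single year touches one partition in this model, which is the benefit of partitioning by an attribute that appears in many queries.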

Aggregation Design
Aggregation Design Wizard (Good Practice)
• View aggregation candidates
• Update member and fact record counts
• Develop and apply aggregation design
Usage-Based Optimization Wizard (Best Practice!)
• Capture query sampling in usage log
• Tune aggregation performance to actual usage
Aggregation Utility (2005) / Aggregation Designer (2008)
• Override aggregation design algorithm

Resources
Analysis Services 2008 Performance Guide
http://tinyurl.com/9vqqmuy
Analysis Services 2008 R2 Operations Guide
http://tinyurl.com/8wvdyg4
Stacia Misner
smisner@datainspirations.com
blog.datainspirations.com
Twitter: @StaciaMisner

PASS Resources
Free SQL Server and BI training
Free 1-day Training Events
Regional Event
Local and Virtual User Groups
Free Online Technical Training
This is Community
Learning Center

Thank you for attending this session and the 2012 PASS Summit in Seattle