Module 1: Planning a BI Solution

Contents:

Module Overview

Lesson 1: Elements of a BI Solution

Lesson 2: The Microsoft BI Platform

Lesson 3: Planning a BI Project

Lab: Planning a BI Solution

Module Review and Takeaways

Module Overview

Business Intelligence (BI) is an increasingly important IT service
in many businesses. In the past, BI solutions were primarily the
preserve of large corporations; but as data storage, analytical,
and reporting technologies become more affordable, many small
and medium-sized organizations are able to take advantage of BI
solutions.

As a SQL Server database professional, you may be required to
participate in, or perhaps even lead, a project with the aim of
implementing an effective BI solution. Therefore, it is important
that you have a good understanding of the various elements that
comprise a BI solution, the business and IT personnel typically
involved in a BI project, and the Microsoft products that you can
use to implement the solution.

Objectives
After completing this module, you will be able to:

• Describe the elements of a typical BI solution.

• Describe key considerations for planning a BI project.

• Select appropriate Microsoft technologies for a BI solution.

Lesson 1: Elements of a BI Solution

Although there’s no single definitive template for a BI solution,
there are some common elements that are typical across most BI
implementations. Being familiar with these common elements
will help you identify the key components required for your
specific BI solution.

Lesson Objectives
After completing this lesson, you will be able to:

• Describe the common elements in a typical BI solution.

• Describe the role of business data sources in a BI solution.

• Describe the role of a data warehouse in a BI solution.

• Describe the role played by an extract, transform, and load (ETL)
process in a BI solution.

• Describe the role played by analytical models in a BI solution.

• Describe reporting and analysis in a BI solution.

Overview of a BI Solution

Fundamentally, all BI solutions are designed to take data
generated by business operations, structure it into an
appropriate format for consistent analysis and reporting, and
use the information gained by examining the data to improve
business performance. No two BI solutions are identical, but
most include the following elements:

• Business data sources. The data that will ultimately provide the
basis for business decision making through the BI solution
usually resides in existing business applications or external data
sources (which may be commercially available data sets or data
exposed by business partner organizations).
• A data warehouse. To make it easier to analyze and report on the
business as a whole, the business data is typically consolidated
into a data warehouse. Depending on the size of the organization,
and the specific BI methodology adopted, this may be a single,
central database that is optimized for analytical queries; or a
distributed collection of data marts, each pertaining to a specific
area of the business.
• Extract, transform, and load (ETL) processes. To get the
business data from the data sources into the data warehouse, an
ETL process periodically extracts data from the source systems,
transforms the structure and content of the data to conform to the
data warehouse schema, and loads it into the data warehouse.
ETL processes are often implemented within a wider enterprise
integration management (EIM) framework that ensures the
integrity of data across multiple systems through master data
management (MDM) and data cleansing.
• Analytical data models. The data warehouse schema is usually
optimized for analytical querying and in some cases you may
decide to perform all analysis and reporting directly from the
data warehouse itself. However, it is common to build analytical
data models on top of the data warehouse to abstract the
underlying data tables, add custom analytical values such as key
performance indicators, and aggregate the data for faster
analytical processing.
• Reporting. Most BI solutions include a reporting element that
enables business users to view reports containing business
information. Most reporting solutions provide a set of standard
business reports that are generated on a regular basis, and some
also empower users to perform self-service reporting in order to
generate their own custom reports. Reports can be created
directly from the data warehouse or from the analytical data
models built on it, depending on your specific business
requirements and constraints.
• Analytical Information Worker Tools. In addition to reports,
most BI solutions deliver analytical information to business users
through information worker tools. These tools might be locally
installed applications, such as Microsoft Excel; or interactive
dashboards in web-based applications, such as Microsoft
SharePoint Server.

Business Data Sources

Most businesses use software applications to process business
operations. For example, a retail business might use a point-of-
sale (POS) system to process sales transactions as customers
purchase goods, and an inventory management system to
perform stock control operations, such as ordering new stock as
goods sell out. Most organizations also use a human resources
system to manage employee records, and many sales-oriented
businesses use customer relationship management (CRM)
systems to manage customer contact details and sales
opportunities.

The systems used in an organization might be purpose-built
applications or based on simple documents, such as
spreadsheets. In some cases, business operations might be
automated by sensors or plant machinery. Regardless of the
specific implementation, all of these systems generate some
form of business data; and this data is the starting point for any
BI solution.

Business data sources for a BI solution typically include some or
all of the following:

• Application databases, often implemented as relational databases
in systems such as SQL Server, Oracle, or Microsoft Access
• Proprietary data stores, such as those used by many commercial
financial accounting applications
• Documents such as Excel workbooks

• Sensor readings emitted by plant machinery, which may be
captured as a data stream using technologies such as Microsoft
SQL Server StreamInsight
• External data sources such as cloud-based databases or web
services
• Master data hubs that contain definitive data values for core
business entities

One of the first tasks in any BI project is to audit the available
data sources and try to identify:

• The specific data that is stored in each source.

• The volume of data currently stored and being generated by
ongoing operations.
• The data types and range of values for important business data
fields.
• Business-specific values used to indicate key information (for
example, a POS system may use numeric codes to indicate
payment types, such as 0 for cash, 1 for credit, and so on).
• Common errors, reliability issues, and missing or null values in
the data.
• Data duplication or inconsistencies across multiple systems.

• Existing data integration processes.

• Data source usage patterns and update periodicity.

• Technologies that can be used to extract the source data to a
staging database.
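
Several of the audit tasks listed above (checking value ranges, spotting business-specific codes, and counting missing values) can be approximated with a small profiling script. The following Python sketch is illustrative only: the sample rows, the `payment_type` field, and the `profile_column` helper are hypothetical and are not part of any Microsoft tool.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarize one column of a source extract: row count, nulls, distinct values."""
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": dict(Counter(non_null)),  # value -> number of occurrences
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }


# Hypothetical POS extract: payment_type uses numeric codes (0 = cash, 1 = credit).
pos_rows = [
    {"sale_id": 1, "payment_type": 0},
    {"sale_id": 2, "payment_type": 1},
    {"sale_id": 3, "payment_type": None},
    {"sale_id": 4, "payment_type": 1},
]

profile = profile_column(pos_rows, "payment_type")
print(profile)
```

Running a profile like this against each important field gives an early view of the codes in use and of how much cleansing the ETL process is likely to need.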

The Data Warehouse

The data warehouse is the central data repository on which all
reporting and analysis is based. Typically, it contains numeric
business measures that are important to the business, such as
revenue, cost, or profit; and the key business entities or
dimensions by which those measures can be aggregated, such as
fiscal period, customer, or product.

Kimball and Inmon Methodologies

Typically, a data warehouse is implemented as a relational
database in which the business data has been denormalized into
a star schema consisting of fact tables that contain numeric
measures and dimension tables that contain attribute hierarchies
across which the measures are aggregated. This approach
reflects the dimensional model methodology promoted by Ralph
Kimball, and is the most common approach adopted by
Microsoft SQL Server customers. In the Kimball methodology,
the data warehouse may consist of multiple data marts, each
dealing with a specific area of the business. The fact and
dimension tables in these data marts are conformed so that they
share the same grain (granularity) and dimension attributes
across all data marts. This enables the data marts to be
independent data stores that can be logically viewed as a single
enterprise data warehouse (EDW). The Kimball methodology is
often referred to as a bottom-up approach.
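
To make the star schema idea concrete, the following Python sketch builds a tiny fact table and two dimension tables and aggregates the measures across one dimension. It uses the SQLite engine from the Python standard library purely for convenience, not SQL Server, and all table names, column names, and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables hold the attributes used to group measures.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT, name TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, fiscal_year INTEGER, fiscal_quarter TEXT);

    -- The fact table holds numeric measures plus a key for each dimension.
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        date_key INTEGER REFERENCES dim_date(date_key),
        revenue REAL,
        cost REAL
    );

    INSERT INTO dim_product VALUES
        (1, 'Bikes', 'Road Bike'), (2, 'Bikes', 'Mountain Bike'), (3, 'Helmets', 'Sport Helmet');
    INSERT INTO dim_date VALUES (20130101, 2013, 'Q1'), (20130401, 2013, 'Q2');
    INSERT INTO fact_sales VALUES
        (1, 20130101, 1000.0, 600.0),
        (2, 20130101, 1500.0, 900.0),
        (3, 20130401, 200.0, 80.0),
        (1, 20130401, 500.0, 300.0);
""")


def revenue_by_category(connection):
    """Aggregate revenue and profit across the product dimension."""
    query = """
        SELECT p.category, SUM(f.revenue), SUM(f.revenue - f.cost)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """
    return connection.execute(query).fetchall()


print(revenue_by_category(conn))
```

The same fact table could be grouped by `fiscal_quarter` by joining to `dim_date` instead, which is the flexibility the dimensional model is meant to provide.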

An alternative data warehouse design, popularized by Bill
Inmon, is the corporate information factory (CIF) model. In the
CIF model, the enterprise data warehouse stores the business
data in a normalized relational schema. This is then used to feed
departmental data marts, in which specific subsets of the data
are exposed in a star schema. The dependency of the data marts
on a central EDW leads many to refer to the Inmon methodology
as a top-down approach.

Common Implementations
Although the Kimball and Inmon methodologies in their pure
form are designed for BI solutions that distribute the data across
multiple departmental data marts, it is common for
organizations to begin with a Kimball-style data mart for a
subset of the business that expands over time into a single,
central data warehouse database for the entire enterprise. The
availability of inexpensive storage and the increasing power of
server hardware mean that a single data warehouse can support
a huge volume of data and heavy user workloads.

In very large enterprises, a federated approach is often used in
which a hub-and-spoke architecture synchronizes departmental
data marts with a central enterprise data warehouse.

Note: SQL Server can be used to support both Kimball and
Inmon style data warehouse solutions. In response to the
more common use of the Kimball methodology, the SQL
Server database engine has been designed to optimize star-
join queries and most documentation about data
warehouse implementation in SQL Server assumes a
dimensional model rather than a normalized EDW. In
deference to these facts, this course focuses on a Kimball
style data warehouse. However, you should investigate the
details of both approaches and consider which best suits
your specific business requirements and constraints.

Extract, Transform, and Load Processes

If the data warehouse is the central “brain” of the BI solution,
the ETL system is the heart. It pumps business data through the
solution to keep the data warehouse up to date and ensure that
the BI solution as a whole continues to deliver value to the
business. A significant proportion of the effort to design and
build an effective BI solution goes into the ETL subsystem.

ETL Processes in a Data Warehousing Solution


ETL is about more than just copying data from a source to the
data warehouse. ETL processes must extract data efficiently,
filtering extraction operations to include only new or changed
rows wherever possible. Typically, the extracted data is initially
loaded into a staging area so that loads to the data warehouse
tables can be synchronized across data extracted from multiple
sources and performed at the most appropriate time.
Additionally, the ETL process typically applies transformations
to the data before it is loaded, in order to ensure that the data
values and schema conform to the data warehouse dimensional
model. Finally, when loading the data warehouse, an important
part of the ETL process is to handle slowly changing dimensions
(dimension records that must be updated while retaining
historical versions). When loading large volumes of fact or
dimension data, the ETL process must minimize the adverse
impact on data warehouse queries and ensure that the data is
loaded in the least amount of time possible.
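
The handling of slowly changing dimensions described above can be sketched in a few lines. The following Python example shows one common pattern, often called a Type 2 change, in which the existing dimension row is expired and a new version is appended. The `apply_scd_type2` function, the column names, and the sample customer are hypothetical; in practice this logic would normally live in the ETL tool rather than in hand-written code.

```python
from datetime import date


def apply_scd_type2(dimension, incoming, load_date):
    """Apply one source row to a dimension while retaining historical versions.

    dimension: list of dicts with customer_id, city, valid_from, valid_to, is_current.
    incoming: dict with customer_id and city as extracted from the source system.
    """
    current = next(
        (row for row in dimension
         if row["customer_id"] == incoming["customer_id"] and row["is_current"]),
        None,
    )
    if current is not None and current["city"] == incoming["city"]:
        return dimension  # nothing changed, so no new version is needed
    if current is not None:
        # Expire the existing version instead of overwriting it.
        current["valid_to"] = load_date
        current["is_current"] = False
    dimension.append({
        "customer_id": incoming["customer_id"],
        "city": incoming["city"],
        "valid_from": load_date,
        "valid_to": None,
        "is_current": True,
    })
    return dimension


dim_customer = [
    {"customer_id": 42, "city": "Seattle", "valid_from": date(2012, 1, 1),
     "valid_to": None, "is_current": True},
]
apply_scd_type2(dim_customer, {"customer_id": 42, "city": "Portland"}, date(2013, 6, 1))
```

After the call, the dimension holds two rows for customer 42: the expired Seattle version and the current Portland version, so facts recorded before the move can still be reported against the old city.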

Another consideration for ETL is the logging strategy that you
will use to record ETL activity and provide troubleshooting
information in the event of a failure somewhere in the ETL
process.

Enterprise Integration Management


ETL is a subset of a larger framework for managing data known
as enterprise integration management (EIM). Software vendors
and database professionals differ on the specific details of the
elements that comprise an EIM solution, but in the SQL Server
platform, EIM is generally considered to include:

• ETL capabilities provided by SQL Server Integration Services
(SSIS).
• Data cleansing and matching capabilities provided by Data
Quality Services (DQS).
• Master data management (MDM) capabilities provided by Master
Data Services (MDS).

Using ETL for Application Data Integration


In addition to populating and refreshing the data warehouse,
ETL processes can be used to synchronize data across multiple
business applications―for example, to refresh product data in
an e-commerce system from a centrally managed catalog. To
accomplish this, you can use SSIS or other synchronization
technologies such as SQL Server replication. When planning a BI
solution in environments where data is transferred between
source systems, it is important to understand the lineage of the
data and to be aware of the schedule on which these data
transfers occur.

Managing Data Quality


In any system that relies on user input, there is a risk that the
quality of the data will be impaired because of erroneous or
duplicate data entry. Although most applications perform a
degree of input validation, there is always a possibility that some
data is invalid (for example, a user may enter “New Yrk” instead
of “New York”), inconsistent (for example, one user may enter
“CA” and another user may enter “California”), or duplicated (for
example, an existing customer may re-register on an e-
commerce site with a different email address because the user
had forgotten his or her password). Data quality technologies,
such as DQS, enable you to automate the identification of invalid
or inconsistent column values and duplicate rows. This makes it
possible to cleanse and de-duplicate source data before loading
it into a data warehouse, improving the integrity of the business
analysis and reporting provided by your BI solution.
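
The three kinds of data quality problem described above (invalid, inconsistent, and duplicated values) can be illustrated with a short Python sketch. This is not how DQS works internally; it is a simplified stand-in that uses the standard library `difflib` module for fuzzy matching. The reference lists, function names, and sample customers are all hypothetical.

```python
import difflib

# Hypothetical reference data standing in for a curated list of valid values.
KNOWN_CITIES = ["New York", "Los Angeles", "Chicago"]
STATE_CODES = {"california": "CA", "new york": "NY"}


def clean_city(value, cutoff=0.8):
    """Return the closest known city name, or the original value if none is close."""
    matches = difflib.get_close_matches(value.strip(), KNOWN_CITIES, n=1, cutoff=cutoff)
    return matches[0] if matches else value


def clean_state(value):
    """Map full state names to two-letter codes and upper-case existing codes."""
    text = value.strip()
    if text.lower() in STATE_CODES:
        return STATE_CODES[text.lower()]
    return text.upper() if len(text) == 2 else text


def find_duplicates(customers):
    """Group customer IDs that share a normalized name and postal code."""
    groups = {}
    for customer in customers:
        key = (customer["name"].strip().lower(), customer["postal_code"].strip())
        groups.setdefault(key, []).append(customer["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


customers = [
    {"id": 1, "name": "Dana Lee", "postal_code": "10001", "email": "dana@example.com"},
    {"id": 2, "name": "Sam Ortiz", "postal_code": "90001", "email": "sam@example.com"},
    {"id": 3, "name": "dana lee ", "postal_code": "10001", "email": "dlee@example.org"},
]

print(clean_city("New Yrk"), clean_state("California"), find_duplicates(customers))
```

Matching on name and postal code rather than email address is a deliberate choice here, because the re-registration scenario in the text involves the same person using a different email address.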

Master Data Management


In an organization with multiple business applications, it is
possible for the same business entity to be represented in
multiple data sources. This presents the challenge of identifying
the definitive version of the entity. For example, an organization
might store customer details in a CRM system, an order
processing system, and an e-commerce site profile management
system. If the same customer exists in all three systems, and the
address data does not match, it is difficult to establish which of
the addresses on record is correct. By implementing a master
data management system, the organization can establish a
definitive master record and use it to ensure the integrity of data
across all systems.
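
One simple way to picture how a definitive master record might be derived is a "most recent non-empty value wins" rule applied across the source systems. The Python sketch below assumes that rule for illustration only; real master data management solutions, including MDS, typically combine business rules with review by data stewards. The `build_master_record` function and the sample records are hypothetical.

```python
from datetime import date


def build_master_record(records):
    """Pick, for each field, the non-empty value from the most recently updated source."""
    master = {}
    # Process oldest first so that newer sources overwrite older ones.
    for record in sorted(records, key=lambda r: r["updated"]):
        for field, value in record["fields"].items():
            if value:  # skip empty values so they never overwrite real data
                master[field] = value
    return master


# The same customer as recorded in three hypothetical systems.
customer_records = [
    {"source": "crm", "updated": date(2012, 3, 1),
     "fields": {"address": "1 Main St", "phone": "555-0100"}},
    {"source": "orders", "updated": date(2013, 1, 15),
     "fields": {"address": "9 Oak Ave", "phone": ""}},
    {"source": "web_profile", "updated": date(2012, 11, 2),
     "fields": {"address": "5 Pine Rd", "phone": None}},
]

master = build_master_record(customer_records)
print(master)
```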

Analytical Data Models


Most corporate BI solutions include analytical data models that
provide information workers with a way to “slice and dice” the
measures in the data warehouse by aggregating them across the
dimensions. Often, these analytical models are referred to as
cubes. Technically, a cube is a specific organization of measures
and dimensions in a multidimensional data model, but the word
“cube” is commonly used as a generic term to refer to any data
model that enables users to aggregate measures by business
entities.

Benefits of Analytical Data Models


It is possible to create BI solutions that support reporting and
analysis directly from tables or views in the data warehouse.
However, in most scenarios, creating a separate analytical data
model layer results in the following benefits:

• The data model abstracts the underlying data warehouse tables,
which enables you to create models that reflect how business
users perceive the business entities and measures regardless of
the data warehouse table schema. If necessary, you can modify or
expand the underlying data warehouse without affecting the data
model used by business users to perform analysis.
• Because the data model reflects the users’ view of the business,
data analysis is easier for information workers with little or no
understanding of database schema design. You can use
meaningful names for tables and fields and define hierarchies
based on attributes in dimension tables that make the data more
intuitive for business users.
• You can add custom logic to a data model that adds business
value when analyzing the data. For example, you can define key
performance indicators (KPIs) that make it easier to compare
actual business measures with targets.
• Although the SQL Server database engine can provide extremely
high query performance, a data warehouse typically contains a
massive volume of data. Because most analysis involves
aggregating measures across multiple dimensions, the processing
overhead for complex queries can result in unacceptable
response times―especially when many users access the data
concurrently. A data model typically pre-aggregates the data,
which provides vastly superior performance for analytical
queries.
• Data models are a common feature in BI solutions, and a number
of standards have been established. By creating a data model, you
can expose your analytical data through a standard interface to
be consumed by client applications, such as Microsoft Excel or
third-party analytical tools.
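
Two of the benefits listed above, pre-aggregation and custom logic such as KPIs, can be illustrated with a short Python sketch. It is a conceptual model only and does not reflect how Analysis Services stores or processes data. The function names, the 90 percent warning threshold, and the sample figures are assumptions made for the example.

```python
from collections import defaultdict


def pre_aggregate(fact_rows, dimension_key):
    """Sum revenue once per dimension member so later lookups avoid rescanning facts."""
    totals = defaultdict(float)
    for row in fact_rows:
        totals[row[dimension_key]] += row["revenue"]
    return dict(totals)


def kpi_status(actual, target, warning_ratio=0.9):
    """Classify a measure against its target as on target, warning, or off target."""
    if actual >= target:
        return "on target"
    if actual >= target * warning_ratio:
        return "warning"
    return "off target"


fact_rows = [
    {"region": "North", "revenue": 1200.0},
    {"region": "North", "revenue": 800.0},
    {"region": "South", "revenue": 950.0},
]
targets = {"North": 1800.0, "South": 1000.0}

revenue_by_region = pre_aggregate(fact_rows, "region")
statuses = {region: kpi_status(revenue_by_region[region], target)
            for region, target in targets.items()}
print(revenue_by_region, statuses)
```

Because the totals are computed once and stored, each KPI lookup reads a single value instead of scanning every fact row, which is the essence of the performance benefit described above.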

Types of Analytical Data Model


SQL Server 2012 supports two kinds of analytical models:

• Multidimensional data models. Multidimensional data models
have been supported in every version of SQL Server Analysis
Services since the release of SQL Server 7.0. You can use a
multidimensional data model to create an Analysis Services
database that contains one or more cubes, each of which provides
aggregations of measures in measure groups across multiple
dimensions.
• Tabular data models. Tabular data models were first introduced
with PowerPivot in SQL Server 2008 R2, and they are enhanced in
SQL Server 2012. From the point of view of a user performing
analysis, tabular models provide similar functionality to a
multidimensional model (in fact, in many cases, the two models
are indistinguishable from one another). For BI developers,
tabular models do not require as much online analytical
processing (OLAP) modeling knowledge as multidimensional
models, because they are based on relationships between
multiple tables of data.

Note: Multidimensional and tabular models are compared
in more detail in Module 5: Designing Analytical Data
Models.

Reporting and Analysis

The primary purpose of a BI solution is to help organizations
track and improve business performance through reporting and
analysis.

Reporting
Reporting is the communication of information gained from BI.
Most organizations rely on reports to summarize business
performance and activities. Consequently, most BI solutions
include a reporting element that generates these reports. Typical
reports include financial and management reports (such as cash
flow, profit and loss, balance sheet, open orders, and other
accounts-based reports), and other reports, depending on the
nature of the business (for example, a retail business might
require stock inventory reports, whereas a technical support call
center might require a report that shows call log data).

In some scenarios, users might need to view reports
interactively in a web browser or custom application; whereas
in others, the reports might be required to be sent as email
attachments in specific formats (such as Excel workbooks or
Word documents). In many cases, the reports might need to be
printed (for example, to send a physical report to customers or
shareholders). When planning a reporting solution, you must
take into consideration the reports that are required, the
audiences for those reports, and how they will be delivered.

Regardless of the specific reports that are required, or how they
will be distributed and consumed, there are two common
approaches to report generation in most BI solutions:

• IT-provided reports. Traditionally, standard business reports are
created by a specialist report developer and automatically
generated with current data as requested or on a regular basis.
Although the reports themselves may be developed by a business
user with report development skills, they are generally supported
by IT and delivered through the organization’s reporting
infrastructure.
• Self-service reporting. As business users have become more
technically proficient and report authoring tools have become
easier to use, many organizations supplement standard reports
with the ability for users to create their own reports with no
intervention from IT. For self-service reporting to be effective,
some initial work needs to be done to design and implement a
suitable reporting infrastructure; but after that is in place, the
users can benefit from the ability to customize the reports they
use without placing an additional burden on the IT department.

Analysis
Analysis is the interpretation of business data delivered by the
BI solution. For some business users, notably business analysts,
performing analysis is a discrete activity that involves using
specialist analytical tools to examine data in analytical models.
For others, analysis is simply a part of everyday work and takes
the form of using reports or dashboards as a basis for business
decision making.

In general, when planning a BI solution, you should consider the
following kinds of analytical requirements:

• Interactive analysis. Some BI solutions must support interactive
“slice and dice” analysis in business tools such as Microsoft Excel
or specialist data analysis tools. The resulting information can
then be published as a report.
• Dashboards and scorecards. Commonly, analytical data can be
summarized in a dashboard or scorecard and embedded into
business applications or portals, such as SharePoint Server sites.
These kinds of solutions might provide some interactivity to
enable users to “drill down” into specific details, or they may
simply show important KPIs.
• Data mining. Most analysis and reporting concerns historical
data, but a BI solution can also support predictive analysis by
using that historical data to determine trends and patterns.
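
As a very small illustration of using historical data to determine a trend, the following Python sketch fits a straight line to past values by least squares and projects the next period. This is a deliberately simple stand-in and is not one of the data mining algorithms provided by Analysis Services. The function names and the quarterly sales figures are hypothetical.

```python
def fit_linear_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept).

    Requires at least two values.
    """
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast_next(values):
    """Project the value for the period immediately after the historical series."""
    slope, intercept = fit_linear_trend(values)
    return slope * len(values) + intercept


quarterly_sales = [100.0, 110.0, 120.0, 130.0]  # hypothetical historical measure
print(forecast_next(quarterly_sales))
```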

Data Sources
You can access data for analysis and generate reports from
virtually any data source; but in a BI solution, reports are
commonly based on one of the following data sources:

• Analytical data models. If you have created analytical data
models in your BI solution, you can use them as a source for
analysis and reports. This approach enables you to take
advantage of the benefits of data models in your reporting
solution as described in the previous topic.
• The data warehouse. You can create analytical reports directly
from the data warehouse (or a departmental data mart). This
enables you to express queries in Transact-SQL, which may be
more familiar to a report developer than a data modeling query
language (such as MDX or DAX).
Note: Considerations for designing a reporting solution are
discussed in more depth later in this course.

Lesson 2: The Microsoft BI Platform

Microsoft products provide the IT infrastructure for many
organizations worldwide. Therefore, it makes sense
for many of these organizations to consider using the Microsoft
platform for BI and benefiting from the close integration and
common infrastructure capabilities of the various products that
can be used to deliver a BI solution.

As a Microsoft BI professional, you need to know which products
can be used to implement the various elements of a BI solution,
and how those products can be integrated to work together.

Lesson Objectives
After completing this lesson, you will be able to:

• Describe the role of Windows Server in a BI solution.

• Describe the role of SQL Server in a BI solution.

• Describe the role of SharePoint Server in a BI solution.

• Describe the role of Office applications in a BI solution.

• Identify hardware appliances that can be used in a BI solution.

• Describe considerations for upgrading from previous versions of
SQL Server.

Windows Server

Microsoft Windows Server 2012 is the foundation for a
Microsoft-based enterprise solution and provides a number of
core infrastructure services, including:

• Network communication and management.

• Active Directory–based authentication and security management.

• Core application services, such as the Microsoft .NET Framework
and Internet Information Services (IIS).
• Storage management, including NTFS disk volumes and Storage
Spaces.
• Failover Clustering.

• Virtualization.

Windows Server 2012 Editions


Windows Server 2012 is available in the following editions:

• Windows Server 2012 Datacenter. This edition provides all
features of Windows Server and is optimized for highly
virtualized environments.
• Windows Server 2012 Standard. This edition provides all
features of Windows Server and is designed for physical or
minimally virtualized environments.
• Windows Server 2012 Essentials. This edition is designed for
small businesses with up to 25 users and 50 client devices.
• Windows Server 2012 Foundation. This edition is designed for
environments with up to 15 users.

Note: Although Windows Server 2012 includes
comprehensive infrastructure management tools, large
enterprises might consider also using Microsoft System
Center 2012 products to manage enterprise infrastructure.

SQL Server

Microsoft SQL Server 2012 provides the core data services for a
BI solution. These services include:

• The SQL Server database engine, which is used for application
databases, operations databases, and the data warehouse
throughout the BI solution.
• SQL Server Integration Services (SSIS), which is used as the
primary platform for ETL processes.
• Data Quality Services (DQS), which provides data cleansing and
matching capabilities.
• Master Data Services (MDS), which provides master data
management capabilities.
• SQL Server Analysis Services (SSAS), which provides a storage
and query processing engine for multidimensional and tabular
data models.
• SQL Server Reporting Services (SSRS), which provides a platform
for publishing and delivering reports that users can consume
through a native web-based interface or have delivered by way of
subscriptions.

SQL Server 2012 Editions


SQL Server 2012 is available in the following core editions:

• SQL Server 2012 Enterprise. You should use this edition for data
warehouses and BI solutions that require advanced SSIS features,
such as fuzzy logic and change data capture (CDC) components.
• SQL Server 2012 Business Intelligence. You should use this
edition for servers hosting SSIS, DQS, and MDS. You should also
use this edition for SSRS and SSAS solutions that require more
than 16 processor cores or if you need to support tabular data
models, PowerPivot for SharePoint, Power View for SharePoint,
or advanced data mining.
• SQL Server 2012 Standard. You can use this edition for solutions
that require basic SSRS reporting, SSAS multidimensional models,
and basic data mining.

Note: SQL Server 2012 is also available in Web and Express
editions, but these are generally not appropriate for BI
solutions. A special edition of SQL Server named Parallel
Data Warehouse provides support for massively parallel
processing (MPP) data warehouse solutions, but this edition
is only available pre-installed on an enterprise data
warehouse appliance from selected Microsoft hardware
partners.

SharePoint Server

Microsoft SharePoint Server 2013 provides enterprise
information sharing services through collaborative websites.
SharePoint Server provides the following BI capabilities:

• Excel Services. Users can view and interact with Excel
workbooks that are shared in a SharePoint document library
through a web browser―including workbooks that use data
connections to query data in a data warehouse or Analysis
Services data model.
• PowerPivot for SharePoint. Users can share and interact with
Excel workbooks that contain a PowerPivot tabular data model.
This enables business users to create and share their own
analytical data models.
• Integration with SSRS. You can deliver and manage reports and
data alerts through SharePoint document libraries instead of the
native Report Manager interface provided with SSRS.
• Power View. Power View is an interactive data visualization
technology through which users can graphically explore a data
model in a web browser.
• PerformancePoint Services. PerformancePoint Services enables
BI developers to create dashboards and scorecards that deliver
KPIs and reports through a SharePoint site.

Office Applications

Microsoft Office 2013 provides productivity applications that
business users can use to consume and interact with BI data.
These applications include:

• Microsoft Excel. Excel is the most commonly used data analysis
tool in the world, and can be used to:
o Import data from a data warehouse and use it to create charts and reports.

o Create interactive PivotTables and PivotCharts from analytical data
models in SSAS or PowerPivot for SharePoint.

o Create PowerPivot workbooks that contain tabular data models without
requiring SSAS.

o Create Power View visualizations from tabular models in the workbook or
external tabular models.

• Microsoft Word. Word is a document authoring tool. In a BI
scenario, users can export SSRS reports in Word format and use
Word’s editing and reviewing tools to enhance them.
• Microsoft PowerPoint. PowerPoint is a widely used presentation
tool. Users can save Power View visualizations as PowerPoint
presentations, and present business data in a dynamic,
interactive format.
• Microsoft Visio. Visio is a diagramming tool that can be used to
visualize data mining analyses.

Upgrading to SQL Server 2012

If you have an existing BI solution that is based on a previous
version of SQL Server, you should consider the following general
guidelines before upgrading to SQL Server 2012:

• Decide whether you want to upgrade in-place by using SQL
Server setup, or use a side-by-side upgrade approach in which a
new instance of SQL Server is installed and databases, SSIS
packages, reports, and other objects are migrated to it from the
existing instance. In-place upgrades are generally simpler to
perform, but side-by-side upgrades enable you to keep the
existing instance operational during the upgrade process and give
you the option to revert to the existing instance in the event of a
problem.
• Verify that the hardware and edition of Windows Server you plan
to use are adequate for SQL Server 2012. SQL Server product
documentation (Books Online) includes details of minimum
hardware and software requirements for each SQL Server
component.
• Verify that upgrading is possible from your current installation,
or plan to upgrade by installing a new installation of SQL Server
and migrating databases, SSIS packages, reports, and other
objects. You can upgrade 32-bit installations of previous versions
of SQL Server to SQL Server 2012 on the 32-bit subsystem
(WOW64) of a 64-bit server, and 64-bit installations of previous
versions must be upgraded to SQL Server 2012 64-bit. You can
upgrade from the following previous versions of SQL Server
(Books Online includes a table showing specific edition upgrade
paths):
o SQL Server 2005 SP4 or later.

o SQL Server 2008 SP2 or later.


o SQL Server 2008 R2 SP1 or later.

o Some editions of SQL Server 2012 can be upgraded to a different edition.

• Verify that the SQL Server features you currently use are
supported in SQL Server 2012. Books Online contains a list of
deprecated and discontinued features as well as breaking
changes.
• Use Upgrade Advisor to prepare for upgrades. Upgrade Advisor is
a tool provided on the SQL Server 2012 installation media that
you can use to analyze an existing SQL Server installation and
identify any issues that could potentially prevent a successful
upgrade.
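
The minimum service pack levels listed above can be captured as a simple lookup when auditing many instances. The following is a minimal Python sketch, not a Microsoft tool: the function name and the version labels are hypothetical, and it checks only the version and service pack level. It does not cover the edition-specific upgrade paths documented in Books Online.

```python
# Minimum service pack required for each previous version,
# taken from the list of supported versions above.
MIN_SERVICE_PACK = {
    "2005": 4,      # SQL Server 2005 SP4 or later
    "2008": 2,      # SQL Server 2008 SP2 or later
    "2008 R2": 1,   # SQL Server 2008 R2 SP1 or later
}


def meets_minimum_service_pack(version: str, service_pack: int) -> bool:
    """Return True if the version/service pack pair is at or above the listed minimum.

    Versions that are not in the list (for example, SQL Server 2000)
    return False. Edition-specific restrictions are not checked.
    """
    minimum = MIN_SERVICE_PACK.get(version)
    if minimum is None:
        return False
    return service_pack >= minimum
```

A check like this only flags instances that need a service pack before upgrading; Upgrade Advisor remains the tool for finding specific blocking issues.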

Upgrading SQL Server Database Engine Instances


When planning to upgrade from a previous version of the
database engine, consider using the Distributed Replay Utility to
run a typical database workload from your current solution on a
proof-of-concept SQL Server 2012 instance, and compare
performance and behavior to ensure compatibility.

Upgrading SQL Server Analysis Services


SQL Server 2012 Analysis Services can be installed in one of two
modes: multidimensional mode and tabular mode. Because
tabular mode did not exist in previous versions, existing
Analysis Services instances can only be upgraded to SQL Server
2012 in multidimensional mode.

Upgrading SQL Server Integration Services


Consider the following guidelines for upgrading to SQL Server
2012 Integration Services from previous versions:

• Upgrading SSIS does not replace the previous instance of the SSIS
service or tools. However, after upgrading, you cannot use the old
version of the tools to create, manage, or run SSIS packages. If you
have upgraded SSIS and want to use a command line utility such
as DTExec.exe, you should enter the full path to the required
version of the utility.
• Upgrading SSIS does not upgrade existing packages to the new
format used in SQL Server 2012. You should upgrade these
packages by using the SSIS Package Upgrade Wizard.
• Support for Data Transformation Services (DTS) packages created
in SQL Server 2000 has been discontinued in SQL Server 2012. If
you have existing DTS packages, you must migrate them to SQL
Server 2005, 2008, or 2008 R2 Integration Services packages
before they can be migrated to the SQL Server 2012 Integration
Services format. If this is not possible, you must recreate your
packages in SQL Server 2012 Integration Services after upgrading.
• Support for ActiveX script components in SSIS packages has been
discontinued in SQL Server 2012. ActiveX scripts in existing
packages can be upgraded to Visual Studio Tools for Applications
(VSTA) by using the SSIS Package Upgrade Wizard.
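
To illustrate the guidance above about entering the full path to the required version of DTExec.exe, the following Python sketch builds a command line for a specific version. The folder paths are assumed default installation locations (the 100 folder for SQL Server 2008 and 2008 R2, the 110 folder for SQL Server 2012), and the package path in the usage note is hypothetical; verify the locations on your own servers.

```python
from pathlib import PureWindowsPath

# Assumed default DTS\Binn folders; actual locations depend on how
# SQL Server was installed on the server.
DTEXEC_DIRS = {
    "2008": PureWindowsPath(r"C:\Program Files\Microsoft SQL Server\100\DTS\Binn"),
    "2012": PureWindowsPath(r"C:\Program Files\Microsoft SQL Server\110\DTS\Binn"),
}


def dtexec_command(version: str, package_path: str) -> list[str]:
    """Build an argument list that runs a package file with a specific DTExec.exe."""
    exe = DTEXEC_DIRS[version] / "DTExec.exe"
    # /File tells dtexec to load the package from the file system.
    return [str(exe), "/File", package_path]
```

For example, dtexec_command("2012", r"C:\Packages\LoadSales.dtsx") returns a list that can be passed to subprocess.run on the server, so that the SQL Server 2012 version of the utility is used regardless of which version appears first in the PATH environment variable.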

Upgrading Master Data Services


Master Data Services was first introduced in SQL Server 2008 R2,
and consists of a SQL Server database and an IIS web
application. There is no direct upgrade path for Master Data
Services components, so you must use the following technique to
upgrade to SQL Server 2012:
1. Upgrade the existing SQL Server 2008 R2 database engine
instance to SQL Server 2012, or install a new instance of SQL
Server 2012. If you install a new instance, you can move the
existing Master Data Services database to the new instance or
continue to host it in the SQL Server 2008 R2 instance.
2. Add the Master Data Services feature to the SQL Server 2012
instance.
3. Use SQL Server 2012 Master Data Services Configuration
Manager to upgrade the existing Master Data Services database
to the new version of the schema.
4. Use SQL Server 2012 Master Data Services Configuration
Manager to create a new Master Data Services web application
and associate it with the upgraded database.

Upgrading SQL Server Reporting Services


When planning to upgrade to SQL Server 2012 Reporting
Services, consider the following guidelines:

• Similar to other components of SQL Server, you can perform an
in-place upgrade, or you can install a new instance and migrate
existing reports, data sets, report parts, and data sources by
attaching the existing report server database to the new server.
• Back up the report server encryption key before upgrading, and
restore it to the new instance if you are upgrading by installing a
new instance.
• You cannot perform an in-place upgrade that changes the
installation mode. For example, you cannot use SQL Server Setup
to upgrade SQL Server 2008 Reporting Services in native mode to
SQL Server 2012 Reporting Services in SharePoint Integrated
mode.
• The format used for reports was updated in SQL Server 2008 R2.
The compiled versions of reports are automatically updated the
first time they are run on an upgraded report server. The source
report definition language (RDL) files are not upgraded.
• You can perform an in-place upgrade from SQL Server 2008 R2
Reporting Services integrated with a SharePoint Server 2010 farm
to SQL Server 2012 or SQL Server 2012 SP1 Reporting Services
with no downtime.
• You can perform an in-place upgrade from SQL Server 2005 SP4
or 2008 SP2 Reporting Services integrated with a SharePoint
Server 2007 farm to SQL Server 2012 or SQL Server 2012 SP1
Reporting Services with a SharePoint Server 2010 farm, but
downtime is required because both SQL Server and SharePoint
must be upgraded. You should consider performing a new
installation of both products and migrating content and reports.
• SQL Server 2012 SP1 is required for integration with a SharePoint
Server 2013 farm. In-place upgrade from previous versions of
SQL Server and SharePoint is not supported, but you can install a
new SharePoint Server 2013 farm, migrate the SharePoint content
and configuration databases to the new farm, install SQL Server
2012 Reporting Services in SharePoint integrated mode, and
migrate the existing reporting services objects.

Upgrading PowerPivot
If you have an existing SQL Server 2008 R2 version of
PowerPivot installed in a SharePoint Server 2010 farm, consider
the following guidelines for upgrading it:
• You must apply SQL Server 2008 R2 SP1 to the PowerPivot
instance of SQL Server Analysis Services before upgrading.
• You must apply SharePoint Server 2010 SP1 and the SharePoint
Server 2010 August 2010 cumulative update or later to all
SharePoint servers in the farm before upgrading.
• Use SQL Server 2012 Setup to upgrade the PowerPivot instance of
Analysis Services.
• Use the PowerPivot Configuration Tool or PowerShell cmdlets to
upgrade the solutions and websites in the farm.
• Remove the SQL Server 2008 R2 version of the PowerPivot add-in
from all installations of Microsoft Excel 2007 or Excel 2010 and
replace it with the SQL Server 2012 version of the add-in. By
default, Excel 2013 includes the SQL Server 2012 PowerPivot add-
in.

SQL Server Appliances

Microsoft has partnered with a number of hardware vendors to
create the following SQL Server-based appliances that are
optimized for specific BI workloads:

• Enterprise data warehouse appliances. Enterprise data
warehouse appliances include SQL Server Parallel Data
Warehouse edition, which uses massively parallel processing
(MPP) technologies to support high performance data warehouses
that scale up to 500 TB of data or more.
• Departmental and mid-market data warehouse appliances.
For organizations that don’t require the scalability of an
enterprise data warehouse, appliances for smaller data
warehouses that need to store up to 5 TB of data can provide a
solution that is fast and simple to deploy.
• Business decision appliances. Installing and configuring an
effective BI server that includes SharePoint and PowerPivot
requires significant product expertise and can be a
time-consuming task. Business decision appliances provide an
out-of-the-box solution that can be used to provision a
self-service BI server easily and quickly.

When planning a SQL Server-based BI solution, you should
consider the following potential benefits of using appliances:

• Massive scalability of enterprise data warehouses that goes
beyond what is possible to achieve with a symmetric
multiprocessing (SMP) architecture.
• Rapid time to solution compared to designing and creating a
custom server build.
• Pre-tested and optimized hardware and software configurations
that are specifically designed for BI workloads.

• Single-source purchasing and support.
• Additional value-add services, including installation, training, and
consulting services.

Note: Microsoft partners with multiple hardware vendors
to offer appliance-based BI solutions. For more information
about the appliances from specific vendors, see
http://www.microsoft.com/sqlserver/en/us/solutions-technologies/appliances.aspx.

Lesson 3: Planning a BI Project

Statistics show that a surprisingly high number of BI projects in
organizations throughout the world fail. Often projects are
abandoned before completion, fail to deliver all of the originally
specified deliverables, or simply do not deliver a solution that
adds value to the business. In many cases, the fundamental
cause of failure is that the project was insufficiently envisioned
or that key stakeholders were not included in the planning.

Careful planning can help to ensure that a BI project runs
smoothly with a successful outcome. By applying some common
best practices, you can increase the likelihood that your BI
project will not be added to the long list of BI project failures.

Lesson Objectives
After completing this lesson, you will be able to:

• Describe key features of a BI project.

• Identify common project personnel in a BI project.

• Describe the role of business stakeholders in a BI project.

• Identify infrastructure commonly used in a BI project.

• Plan pilot solutions as part of a BI project.

BI Project Overview

There are numerous frameworks for planning and managing IT
projects, and many organizations have a policy to use a specific
approach when implementing a new solution. Whichever
approach you use, a BI project must start with the business
requirements and use these to inform the design of the overall
technical architecture, the data warehouse and ETL, and the
reporting and analysis that the solution will provide.

Business Requirements
The most important thing to consider when planning a BI project
is that the core purpose of the project is to improve the business.
More than any other type of IT project, BI projects are closely
and inseparably bound to business processes and goals. Most IT
projects require a deep understanding of technology, but in a BI
project you must also have detailed knowledge of how various
business processes work and interact with one another, and
what the commercial aims of the business are.

Understanding the overall structure, processes, and goals of the
business makes it easier to gather, interpret, and prioritize the
business requirements for the BI solution. Typically, BI
requirements are fundamentally about being able to quantify
core business metrics across various aspects of the business in
order to measure business performance and inform business
decisions. For example, a requirement might be that the solution
enables sales managers to see monthly sales revenue by
salesperson in order to reward success and identify employees
who need additional support or motivation. Another
requirement might be to view quarterly order amounts by
product line in order to plan more efficient manufacturing
based on demand trends. Only after you have identified the
specific business requirements for your BI solution can you start
considering the design of the infrastructure, data warehouse and
ETL solution, and analytical reports.

Technical Architecture and Infrastructure Design


When you have a good understanding of the business
requirements, you can start to plan the overall solution
architecture. Based on the business requirements, you can
identify the required elements of the BI solution (as described in
the first lesson in this module) and consider the software
products you want to use to implement those elements (as
described in the second lesson in this module).

After selecting the technologies you intend to use, you can start
to design the infrastructure for the BI solution, including server
hardware and configuration, security, and high availability
considerations.

Note: Server hardware and infrastructure considerations
are discussed in Module 2: Planning BI Infrastructure.

Data Warehouse and ETL Design


The business requirements determine the data that the BI
solution must include―specifically, the numeric measures that
users need to aggregate (for example, revenue or profit), and the
business dimensions across which they must be aggregated (for
example, salesperson or product line). When you have identified
the data that is required to meet the business requirements, you
can start to design the data warehouse in which that data will be
stored and the ETL process that will be used to populate and
refresh the data warehouse from the business applications
where the data currently resides.
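
The relationship between measures and dimensions can be illustrated with a small in-memory example. This is a minimal Python sketch with hypothetical sample rows, not part of the course lab: revenue is the numeric measure, and salesperson and month are the dimensions across which it is aggregated. In the actual solution, this aggregation would be performed against the data warehouse or an analytical data model.

```python
from collections import defaultdict

# Hypothetical order rows: (salesperson, month, revenue).
orders = [
    ("Alice", "2013-01", 1200.0),
    ("Alice", "2013-01", 800.0),
    ("Bob", "2013-01", 500.0),
    ("Alice", "2013-02", 300.0),
]


def revenue_by_salesperson_and_month(rows):
    """Sum the revenue measure for each (salesperson, month) combination."""
    totals = defaultdict(float)
    for salesperson, month, revenue in rows:
        totals[(salesperson, month)] += revenue
    return dict(totals)


monthly_revenue = revenue_by_salesperson_and_month(orders)
```

Each key in the result corresponds to one combination of dimension values, which is the level of detail that the requirement "monthly sales revenue by salesperson" asks for.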

Note: Data warehouse design is discussed in Module 3:
Designing a Data Warehouse. ETL design is discussed in
Module 4: Designing an ETL Solution.

Reporting and Analysis Design


In many cases, the business requirements for a BI solution are
actually expressed as specifications for reports or analytical data
sets. Business users often describe the reports they want to be
able to consume, the dashboards they want to be able to view, or
the PivotTables they want to be able to create. Using the business
requirements to identify the information that users want to get
out of the BI solution helps you engage with the users to refine
your understanding of how they want to consume or interact
with that information, which in turn enables you to design an
appropriate solution for analysis and reporting.

Note: Considerations for designing analytical data models
are discussed in Module 5: Designing Analytical Data
Models. Solutions for delivering analysis and reports are
discussed in Module 6: Planning a BI Delivery Solution,
Module 7: Designing a Reporting Services Solution, Module
8: Designing an Excel-Based Reporting Solution, and Module
9: Planning a SharePoint Server BI Solution.

Monitoring and Optimization


Performance optimization is a consideration all through the
design and implementation of the various elements of the BI
solution, and your project planning should include consideration
of how performance of the overall solution will be monitored
and optimized as the volumes of data and users grow. In
particular, you must consider how you will measure
performance, what expectations users have for performance,
and how you will identify performance degradation.

Note: Performance monitoring and optimization is
discussed in Module 10: Monitoring and Optimizing a BI
Solution.

Operations and Maintenance

When planning a BI solution, it is easy just to focus on the
functional requirements of the solution. However, it is important
to also consider operational requirements and factor them into
the design. Operational requirements for a BI solution include a
number of ongoing tasks, such as index maintenance in the data
warehouse, backup procedures for all databases and data stores
used in the solution, scheduled automation of ETL tasks, data
model processing, logging and auditing, and many others.

Note: Considerations for operations and maintenance are
discussed in Module 11: Planning BI Operations.

Project Infrastructure

It is easy to focus on the infrastructure requirements of the
solution you intend to build and overlook the infrastructure
required to actually build it. In the same way that a construction
project to build an office building requires a site office, parking
facilities for the construction crew, and so on, a BI project
requires hardware and software resources for the project team
to use during the development of the solution.

Project Management Infrastructure

From the beginning, a project requires infrastructure to enable
team members to communicate, collaborate, and document
project planning information. Examples of this kind of
infrastructure include:

• Office productivity applications.

• Microsoft Project.

• A SharePoint Server site for the project.

Design and Development Tools


When designing the BI solution, the team will need tools such as:

• Microsoft Visio to support diagrammatic design.

• SQL Server Data Tools to develop data models, reports, and SSIS
packages.
• Other Microsoft Visual Studio components to develop custom
application components.
• Team Foundation Server (TFS) to provide source control and issue
tracking capabilities.

Development and Test Infrastructure


In addition to development tools, the project will require servers
on which to develop and test the various elements of the
solution. These servers will require the same software used in
the production solution―for example, SQL Server and
SharePoint Server; and the test servers should be configured as
similarly as possible to the envisioned production infrastructure.
However, considering the complexity of some enterprise-scale BI
solutions, you may choose to simplify the test environment―for
example, by provisioning a single-server installation of
SharePoint Server instead of a multi-server farm, combining SQL
Server components on a single server instead of provisioning
dedicated servers, and using standalone servers instead of
failover clusters.

Project Personnel

A BI project involves several roles. These roles typically include:

• A project manager. Coordinates project tasks and schedules and
ensures that the project is completed on time and within budget.
• A BI solution architect. Has overall responsibility for the
technical design of the data warehousing solution.
• A data modeler. Designs the data warehouse schema and
analytical data models.
• A database administrator. Designs the physical architecture and
configuration of the data warehouse database. In addition,
database administrators who have responsibility for data sources
that are used in the data warehousing solution must be involved
in the project to provide access to the data sources that the ETL
process uses.
• An infrastructure specialist. Implements the server and
network infrastructure for the data warehousing solution.
• An ETL developer. Builds the ETL workflow for the data
warehousing solution.
• A report developer. Creates the reporting elements of the BI
solution.
• Business users. Provide requirements and help to prioritize the
business questions that the data warehousing solution will
answer. Often, the team includes a business analyst as a full-time
member to help interpret the business questions and ensure that
the solution design meets the needs of the users.
• Testers. Verify the business and operational functionality of the
solution as it is developed.

Note: The list in this topic is not exhaustive and represents
roles that must be performed, not necessarily individual
people. In some cases, multiple roles may be performed by a
single person―though in general, you should avoid having
testers validate their own development work.

In addition to the technical project personnel listed here, the
project team should include business stakeholders from the very
beginning of the planning phase. The roles performed by
business stakeholders are discussed in the next topic.

Business Stakeholders
The previous topic described the technical roles required in a BI
project. However, the project team should also include
representatives from key areas of the business to help ensure
that the solution meets the business requirements and to help
promote user acceptance.

Executive Sponsor
The culture of each organization is unique, but in almost all
businesses, a BI project will face personality clashes and political
obstacles that must be navigated to create a solution that is in
the best interests of the business as a whole. Employees tend to
focus on their own specific areas of the business, and they can
often be resistant to changes that affect their day-to-day
activities or to what they see as external interference in their
responsibilities.

The challenge of obtaining “buy-in” from business users is easier
to overcome if the project has an executive sponsor who has
aligned the project goals with the strategic aims of the business
and can champion the project at the highest level of the
organization. When the BI project team meets resistance or
contradictory views from business users, the executive sponsor
can use his or her influence to resolve the issue.

Business Users
Although executive sponsorship is essential to drive the project
forward, it is important to take into account the input from
business users. A solution that is imposed on users without
consultation is unlikely to gain acceptance, and in most cases it
is unlikely that the primarily technical members of the project
team have sufficient knowledge of the business to create a useful
solution even if users could be persuaded to accept it.

Businesses are complex ecosystems in which many processes
interact to achieve multiple objectives. In some organizations,
the business processes are formally defined and documented,
but even when this is the case, it is likely that day-to-day
activities vary, often significantly, from “official” practices.
Generally, business users have a better insight into how the
business processes actually work, what the various data
elements used in those processes actually mean, and how
important they are than can be gained by a technical architect
examining existing systems and their documentation.

For example, suppose an existing system for processing sales
includes a data field named SZ_Code with values such as
STD-140 and SPC-190. The usage of this field is not listed in the
application documentation, yet you see that it is used in
approximately 75 percent of sales orders. Only a business user
who is familiar with the sales order process could tell you that
the field represents a size code for products that are available in
multiple sizes, and that the value STD-140 represents a standard
size of 140 centimeters, whereas SPC-190 means that the product
was ordered in a special-order size of 190 centimeters that had
to be custom made.
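
Once a business user has explained a field like this, the knowledge can be recorded as an explicit transformation rule for the ETL process. The following Python sketch is one possible way to decode the SZ_Code values described above. The function name and the output field names are hypothetical, and the sketch assumes that STD and SPC are the only prefixes in use, which would need to be confirmed with the business users.

```python
# Meaning of each prefix, as explained by the business user in the example.
SIZE_TYPES = {
    "STD": "standard",
    "SPC": "special order",
}


def parse_size_code(code: str) -> dict:
    """Split an SZ_Code value such as 'STD-140' into a size type and a size in centimeters."""
    prefix, _, size = code.partition("-")
    if prefix not in SIZE_TYPES or not size.isdigit():
        # Unknown values are surfaced rather than guessed, so that
        # they can be reviewed with someone who knows the sales process.
        raise ValueError(f"Unrecognized size code: {code!r}")
    return {"size_type": SIZE_TYPES[prefix], "size_cm": int(size)}
```

Raising an error for unrecognized values, instead of silently passing them through, gives the project team a list of exceptions to take back to the business users.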

Data Stewards
Some information workers have particularly detailed knowledge
of the business processes and data in a specific area of the
business. By formally including these people in the BI project
team, you can have them adopt the role of data steward
(sometimes referred to as data governor or data curator) for the
data elements used in their area of the business. A data steward
can provide valuable services to the project, including:

• Representing the interests of a specific business area while the BI
solution is planned. For example, ensuring that all of the data
elements that are important to that business area are included in
the data warehouse design and that the reports required by that
business area are considered.
• Validating and interpreting data values in the source systems that
will be used to populate the data warehouse, and helping to
identify the appropriate transformations and exceptions that will
need to be implemented.
• Taking ongoing responsibility for maintaining a Data Quality
Services knowledge base for the business area, so that data values
can be cleansed and matched effectively.
• Taking ongoing responsibility for maintaining relevant business
entities in a Master Data Services model to ensure consistency of
data definitions across the organization.

Project Scope
From the very beginning of a project, it is important to prioritize
the business requirements in terms of their value to the
business, and the feasibility of meeting them with specific
constraints, such as available data, budget, and project
deadlines. This enables you to scope the project in a way that
maximizes the chances of it successfully delivering value to the
business.

Initial Scoping
After the initial requirements gathering is accomplished, the
project team and business stakeholders must negotiate the
importance or value of the requirements. At this stage, you may
be able to judge the feasibility of meeting some objectives, but
others will require further investigation to identify suitable
source data or to estimate the effort required.

You can use a matrix to record the relative value and feasibility
of each requirement as it is agreed by the team members. It
is likely that there will be some disagreements about the
importance of some objectives, and feasibility may not be easy to
assess. In these cases, you should make a note of the issues and
move on. At this stage, it is important to get a comprehensive
view of the potential project scope―further iterations of the
design process will gradually resolve prioritization conflicts and
help clarify feasibility.

Refining the Scope


After the initial scoping discussion, the feasibility of the
requirements that were identified can be investigated. Typically
this investigation involves:

• Using the techniques for auditing data sources discussed in the
first lesson of this module to determine whether sufficient data is
available and accessible to meet the requirements.
• Estimating the development effort and skills required for each of
the requirements.

As the investigations reveal more information, the team should
meet to refine the matrix created during the initial scoping
exercise.

Identifying a Pilot Scope


When the scope is considered to be well defined, the team
should examine the requirements in the high value, high
feasibility quadrant of the matrix and further prioritize them to
determine a set of requirements that can be addressed in a pilot
or proof-of-concept solution.
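
The value and feasibility matrix can be kept in any form the team finds convenient, such as a spreadsheet. As an illustration only, the following Python sketch shows one way to place requirements in quadrants and pick pilot candidates. The 1 to 5 scoring scale, the threshold of 3, and the sample requirements are all assumptions made for this example and are not prescribed by the course.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    name: str
    value: int        # assumed scale: 1 (low) to 5 (high), agreed by the team
    feasibility: int  # assumed scale: 1 (low) to 5 (high)


def quadrant(req: Requirement, threshold: int = 3) -> str:
    """Return the matrix quadrant for a requirement."""
    value = "high value" if req.value >= threshold else "low value"
    feasibility = (
        "high feasibility" if req.feasibility >= threshold else "low feasibility"
    )
    return f"{value}, {feasibility}"


def pilot_candidates(reqs: list, threshold: int = 3) -> list:
    """Return high value, high feasibility requirements, highest scores first."""
    chosen = [
        r for r in reqs if r.value >= threshold and r.feasibility >= threshold
    ]
    return sorted(chosen, key=lambda r: (r.value, r.feasibility), reverse=True)


# Hypothetical requirements for illustration.
requirements = [
    Requirement("Monthly revenue by salesperson", value=5, feasibility=4),
    Requirement("Quarterly orders by product line", value=4, feasibility=5),
    Requirement("Real-time inventory dashboard", value=5, feasibility=1),
    Requirement("Legacy report cleanup", value=2, feasibility=5),
]

pilot = pilot_candidates(requirements)
```

The sort order here favors value over feasibility when both are high; a team might reasonably choose a different tie-breaking rule, and the final pilot scope remains a matter for negotiation with the business stakeholders.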

Using a pilot project enables you to reduce the time it takes for
the BI project to add value to the business. By prioritizing the
requirements based on their value and feasibility, you can
quickly demonstrate the effectiveness of the BI initiative without
losing the momentum the project has built up during the initial
scoping phase. In most cases, the pilot focuses on a related set of
requirements, often in a specific, high-profile area of the
business. However, because you have used the scoping phase to
consider all requirements, you can design the pilot with
extensibility in mind, ensuring that the design of the pilot will
support the addition of the other highly important business
requirements at a later stage.

After scoping the pilot, you can start designing the solution.
However, you must make sure that the project team carefully
considers the following questions:

• How will the pilot incorporate user acceptance testing (UAT)?
Instead of delivering the solution to all users in the affected area
of the business, you may want to enroll a subset of users in the
pilot program with a particular focus on providing feedback on
the usability and usefulness of the solution. Often, these users can
provide valuable feedback that results in improvements to the
design of reports, data models, dashboards, SharePoint document
library structures, and other user-visible aspects of the solution.
• How will you measure the success of the pilot? Other than
qualitative measures based on feedback from users, you should
consider quantitative goals for the pilot. The criteria for success
should ultimately be aligned with the business goals, so you need
to be able to measure the effects of the solution in terms of
revenue growth, increased profitability, reduced costs, increased
customer satisfaction survey scores, or whatever quantifiable
goal the BI solution is intended to help the business achieve.
Therefore, you should determine a realistic time interval over
which the success of the project should be assessed.
Lab: Planning a BI Solution

Scenario
Adventure Works Cycles is a multinational corporation that
manufactures and sells bicycles and cycling accessories. The
company sells its products through an international network of
resellers, and in recent years has developed a direct sales
channel through an e-commerce website.

The company is financially sound and has a strong order book;
however, sales volumes have remained relatively static for the
past few years. Senior management is under pressure from
shareholders to develop a strategy for growth that will drive
increased revenue and profit. Management believes that a key
factor in their growth strategy is investment in technology that
improves collaboration between the various divisions of the
company, and enables them to track and share key business
performance metrics.

Objectives
After completing this lab, you will be able to:

• Identify and prioritize business requirements.

• Determine software for a BI solution.

Estimated Time: 60 Minutes

Start 20467B-MIA-DC and 20467B-MIA-SQLBI, and then log on to
20467B-MIA-SQLBI as ADVENTUREWORKS\Student with the
password Pa$$w0rd.

Exercise 1: Scoping a BI Solution

Scenario
Adventure Works employees have identified a number of
business areas that could benefit from greater access to key
business information through analytics and reporting. To meet
these needs, the senior management at Adventure Works has
approved the scoping of an enterprise BI initiative to provide
decision makers with the information they need to make the
business more efficient.

As a BI consultant, you have been asked to review the business
requirements and assist in the scoping and design of a pilot BI
solution.

The main tasks for this exercise are as follows:

1. Prepare the Lab Environment
2. Review Interview Transcripts
3. Prioritize Business Requirements

Task 1: Prepare the Lab Environment

• Read the lab and exercise scenarios.

• Ensure that the 20467B-MIA-DC and 20467B-MIA-SQLBI virtual
machines are both running, and then log on to 20467B-MIA-SQLBI
as ADVENTUREWORKS\Student with the password Pa$$w0rd.
