(BA05SG01)
Edition 1.0
Dec - 2013
Business Intelligence 2013
Trademarks
The following terms are trademarks of the International Business Machines Corporation in the United
States, other countries, or both:
RACF®, Advanced Peer-to-Peer Networking®, AD/Cycle®, AIX®, C/370™, CICS®, CICSPlex®, Domino®, DB2®, DFS™, DFSMSdfp™, DFSMSdss™, DFSMShsm™, DFSORT™, DRDA®, Encina®, Enterprise Storage Server®, Enterprise Systems Architecture/390®, ECKD™, ESCON®, FlashCopy®, FICON®, Geographically Dispersed Parallel Sysplex™, GDDM®, GDPS®, HiperSockets™, IBM®, IMS™, Language Environment®, Lotus®, Multiprise®, MVS™, MVS/ESA™, MVS/XA™, NetRexx™, NetView®, Open Class®, OS/390®, Parallel Sysplex®, Processor Resource/Systems Manager™, PR/SM™, QMF™, Redbooks™, RMF™, S/360™, S/370™, S/390®, Sysplex Timer®, System z9™, System/360™, System/370™, System/390®, SAA®, Tivoli®, VisualAge®, VSE/ESA™, VTAM®, WebSphere®, z/Architecture™, z/OS®, z/VM®, z/VSE™, zSeries®, z9™
Disclaimer
The information contained in this document has not been submitted to any formal IBM test and is distributed on an “as
is” basis without any warranty either express or implied. The use of this information or the implementation of any of
these techniques is a customer responsibility and depends on the customer’s ability to evaluate and integrate them
into the customer’s operational environment. While each item may have been reviewed by IBM for accuracy in a
specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers
attempting to adapt these techniques to their own environments do so at their own risk.
© Copyright International Business Machines Corporation 2012, 2013. This document may not be reproduced in
whole or in part without the prior written permission of IBM.
Table of Contents
Building Reports
Building a Report
Drill-up, Drill-down Capabilities
Lab Exercise
Objective
Charts
Grouping Data
Summarization of Data Items
Chapter 1
Introduction to BI
“The process of taking large amounts of data, analyzing that data and presenting a high-level set of
reports that condense the essence of that data into the basis of business actions, enabling
management to make fundamental daily business decisions” (Stackowiak et al., 2007).
“The way and method of improving business performance by providing powerful assists for
executive decision makers to enable them to have actionable information at hand” (Cui et al.,
2007).
“The process of collection, treatment and diffusion of information that has an objective: the
reduction of uncertainty in the making of all strategic decisions” (Zeng et al., 2006).
“A business management term used to describe applications and technologies which are used to
gather, provide access to and analyze data and information about an enterprise, in order to help
make better-informed business decisions” (Experts).
Concept
Business intelligence is defined as the application of a set of methodologies and technologies that transform
raw data into meaningful and useful information for business purposes, enabling more effective strategic and
operational insights and decision making (Gartner Group, 1996).
Business Intelligence is an integrated solution for companies, within which the business
requirement is definitely the key factor that drives technology innovation.
How to identify and creatively address key business issues is therefore the major challenge of a BI
application to achieve real business impact.
A complete BI solution includes an effective data warehouse and also a reactive component capable of
monitoring time-critical operational processes, allowing tactical and operational decision makers to tune
their actions according to the company strategy (Golfarelli et al., 2004).
BI provides an in-depth analysis of detailed business data, including database and application
technologies as well as analysis practices (Gangadharan and Swamy, 2004).
It encompasses knowledge management, enterprise resource planning, performance
measurement, decision support systems and data mining.
It helps in the strategic management of a business.
It coordinates different areas, within the business or outside, so that they work together.
It helps to attain true business knowledge, i.e., identifying, creating, representing, distributing and
enabling adoption of insights and experiences.
It provides performance measurement, as well as information to management about progress towards
business goals.
It analyzes data to arrive at optimal decisions and to discover business knowledge.
The data components of a BI architecture include the data sources that corporate executives and
other end users need to access and analyze to meet their business requirements. Important
criteria in the source selection process include data currency, data quality and the level of detail in
the data. Both structured and unstructured data may be required as part of the BI architecture, as
well as information from both internal and external sources.
Information management architectural components are used to transform raw transaction data into
a consistent and coherent set of information that is suitable for BI users. For example, this part of
BI architecture typically includes data integration, data cleansing and the creation of data
dimensions and business rules that conform to the architectural guidelines.
It may also define structures for data warehousing or for a data federation approach that
aggregates information in virtual databases instead of physical data warehouses or data marts.
The technology components are used to present information to business users and enable them to
analyze data. This includes the BI software suite or BI tools to be used within an organization, as
well as the supporting IT infrastructure, i.e., hardware, database software and networking devices.
There are various types of BI applications that can be built into an architecture: reporting, ad hoc
query, data mining and data visualization tools, plus online analytical processing (OLAP) software,
business intelligence dashboards and performance scorecards.
BI Components
Business intelligence users are beginning to demand real time business intelligence relating to their
business, particularly in front line operations. They will come to expect up to date and fresh information in
the same fashion as they monitor stock quotes online. Monthly and even weekly analysis will not suffice. In
the near future, companies will become dependent on real time business information just like getting
information from the internet in one or two clicks.
Also in the near future, information will become more democratized where end users from throughout the
organization will be able to access information. So, in the future, the capability requirements of BI will
increase in the same way that consumer expectations will increase. It is therefore imperative that
companies increase at the same pace or even faster to stay competitive.
New options:
SaaS (Software as a Service) is a software delivery method that provides access to software and its
functions remotely, as a web-based service.
SaaS allows organizations to access business functionality at a cost lower than paying for licensed
applications. Also, because the software is hosted remotely, users don’t need to invest in additional
hardware. SaaS removes the need for organizations to handle installation, setup and daily upkeep and
maintenance.
In this option, the business organization rents or leases storage space from a third-party provider. Data is
transferred from the client (business organization) to the service provider via the internet, and the client
accesses the stored data whenever required, using software provided by the storage provider.
SaaS is popular with small and medium businesses because there are usually no start-up costs. The
software and associated data are centrally hosted on the ‘cloud’.
SaaS, IaaS (Infrastructure as a Service), PaaS (Platform as a Service), DaaS (Desktop as a Service), etc.
are all considered part of cloud computing.
Cloud computing, or the Cloud, is a colloquial expression used to describe a variety of different types of
computing concepts that involve a large number of computers connected through a real-time
communication network, typically the internet.
Cloud computing is mostly used to sell hosted services in the sense of “application service provisioning”
that run “client server software” at a remote location. End users access cloud based applications through a
web browser or mobile application while the business software and the user’s data are stored on servers at
a remote location.
Resource pooling
The provider’s computing resources are pooled to serve multiple consumers.
Rapid elasticity
Capabilities can be elastically provisioned and released.
Measured service
Cloud systems automatically control and optimize resource use.
Criticism
With SaaS, users do not have a copy of the executable file; it is on the server, where users can’t
see or touch it. SaaS gives the service operator the power to change the software in use.
Users must send their data to the server in order to use it. This has the same effect as spyware. The
service operator gains unjust power over the users.
Reporting: The process involves extract, transform and load (ETL) procedures in coordination with a data
warehouse (database) and then using reporting tools (software).
Online analytical processing (OLAP): This tool enables users (management) to analyze multidimensional
data interactively from multiple perspectives.
Data mining: The computational process of discovering patterns in large data sets. It uses information
from past data to analyze the outcome of a particular problem that may arise.
Process mining: A process management technique used for extracting knowledge from event logs
recorded by an information system.
Complex event processing: A method of tracking and processing information about things that happen
and deriving a conclusion from them.
Benchmarking: Comparing one’s business processes with best practices from other industries.
The end user should not assume that any two tools of business intelligence would do the same
thing in the same way. The chance is very slim.
The end user must have a clear idea as to what kind of output he needs and the analysis he needs
to complete. If need be, he can take the help of someone with expertise.
The end user should assume that anything beyond what he has been asked to do will take longer
than he imagined.
Another assumption that he needs to discard is about the placement capabilities of BI output.
The important thing for the end user to accomplish is to create the output with clean, accurate data and
to present it in a clear way that is logical to interpret.
Don’t assume that the output can be produced only in a particular way. It is important to make
absolutely no assumptions about any tool and the user’s abilities to learn and drive it.
ETL tools
ETL tools and processes are responsible for the extraction of data from one or many source systems, the
transformation of that data from many different formats into a common format, and the loading of that data
into a data warehouse (Schink, 2009).
The extracted data has to be deemed central to the business. The ETL tools manipulate the data and
present it as information that is then used for managerial decision making.
ETL solutions are divided into three distinct stages that find and convert data from various sources and insert
the resulting product into a data warehouse. The three stages of ETL are:
The extraction stage: This stage involves obtaining access to data originating from different
sources. These sources are often distributed across multiple platforms and can be part of a
customer’s information system.
The transformation stage: This stage transforms the extracted data and is considered the most
complex stage of the ETL process. It converts the data into the schema of the data warehouse
to which it is to be loaded, so that the individual tables or data sets arriving from different
sources converge.
The transformation phase is usually performed by means of traditional programming languages,
script languages or the SQL language.
The load stage: The load stage pushes the transformed data into the data warehouse, loading it with
data that has been aggregated and filtered.
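The three stages described above can be sketched as a small pipeline. This is a minimal illustration, not production code; the CSV source, the field names and the assumption of US-style month/day dates are all hypothetical:

```python
import csv
import io
import sqlite3

# Hypothetical source: a CSV export from an operational system.
SOURCE_CSV = """order_id,order_date,amount
1,05/09/99,120.50
2,06/10/99,80.00
"""

def extract(raw):
    """Extraction stage: obtain rows from a source system (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows):
    """Transformation stage: convert fields to the warehouse schema
    (ISO dates, numeric amounts) -- usually the most complex stage."""
    out = []
    for r in rows:
        month, day, year = r["order_date"].split("/")  # assumes US-style dates
        out.append({
            "order_id": int(r["order_id"]),
            "order_date": f"19{year}-{month}-{day}",
            "amount": float(r["amount"]),
        })
    return out

def load(rows, conn):
    """Load stage: push the transformed rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS fact_orders "
                 "(order_id INTEGER, order_date TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO fact_orders VALUES (:order_id, :order_date, :amount)", rows)

conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
print(conn.execute("SELECT COUNT(*), SUM(amount) FROM fact_orders").fetchone())
```

In practice each stage would handle many heterogeneous sources, but the extract, transform, load ordering stays the same.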
The requirement that a BI system be able to extract data in different formats from disparate sources,
transform it into like formats and then load it into the appropriate data warehouse has traditionally made
the ETL process the most expensive aspect of a BI system. In some cases, a BI system may have a
dedicated but separate data warehouse that acts as a staging area.
Generally, there are four categories of ETL tools.
Tools that address the extraction and loading aspects of the ETL process.
Tools that provide a preference for the data types and format to be extracted and loaded.
Tools that offer a balance across all tool functions; the lack of emphasis on any one function may
result in poorer handling of a large volume of data formats.
Tools that emphasize the integration of data into the data warehouse.
Data warehouse
A data warehouse is a collection of relevant business data that is organized and validated (Cody et al.,
2002) so that it can be analyzed to support business decision making. Data warehouses are populated with
data that has been extracted from distributed databases, often heterogeneous and in some cases external
to the organization using it. Data warehouses are subject-oriented databases that are integrated
into an information system. They are time relevant, meaning that they are snapshots of a point in time
within the information systems, and they are not updatable, so as to maintain the integrity of the historical
point at which the snapshot of data is taken.
Data warehouses are offline, meaning that they reside on a different system than the data of which
they are storing a snapshot. New data warehouses are constantly being loaded with business-critical data
to ensure that up-to-the-minute data is available for decision making. The criticality of that data is
predetermined by the organization using the data warehouse. The data is historical: a summary of prior
transactions that, when analyzed, can provide a wealth of knowledge from which managerial decisions can
be made. These decisions are based on the history of the business that the data warehouse holds.
A data warehouse, in addition to providing the snapshot of historical data, also provides room for the
thematic storage of aggregated information: data that has been processed by an ETL tool and then loaded
into the appropriate data warehouse.
A well-implemented data warehouse is easy to use, allows for quick information recovery, stores more
information, improves productivity, allows for better decisions and increases an organization’s competitive
advantage. To conclude, the key role of a data warehouse is to provide an understanding of business
problems, opportunities and performance based on compelling BI, facilitating decision making.
A querying and reporting tool helps to run regular reports, create organized listings and perform cross
tabular reporting and querying.
The common query tools are (a) the SQL (Structured Query Language) window, (b) query and reporting
from DB2 and (c) SAS/English software.
The query and reporting tools are meant to allow the users to interact directly with the organization’s data.
To do that, they have to be designed to work closely with the data warehouses or other data sources. A
query is nothing more than an electronic request for specific information from a database. Queries hold an
important place in the world of database technology because they are the most basic link between the user
and the insights contained in the database.
Common query tools provide a graphical interface to make it easy to see what data fields are available.
The latest generation tools also offer easy to use drag and drop capabilities to allow users to adjust their
queries.
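To make the idea of a query concrete, here is a minimal example using Python’s built-in sqlite3 module; the sales table, its columns and its rows are invented for illustration:

```python
import sqlite3

# Build a tiny illustrative data source in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("East", "Widget", 100.0),
    ("West", "Widget", 150.0),
    ("East", "Gadget", 200.0),
])

# The query: an electronic request for specific information, with a
# parameter supplied by the user (here, the region of interest).
rows = conn.execute(
    "SELECT product, revenue FROM sales WHERE region = ? ORDER BY revenue DESC",
    ("East",),
).fetchall()
for product, revenue in rows:
    print(f"{product}: {revenue}")
```

A business query tool generates this kind of SQL behind the scenes from the fields a user selects on a report.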
Reporting tools
Retrieving the right information and presenting it in a meaningful, understandable way is the function of
reporting tools. They are usually bundled with query tools and also have a graphical interface. Beyond
simple presentation, reporting systems help organize and distribute query results in the form of reports.
The OLAP attempts to analyze complex data in real time on a database that is constantly updated with
transactional data. The OLAP optimizes the searching of huge data files by means of automatic generation
of SQL queries.
OLAP allows users to access, analyze and model business problems and to share information that is
stored in data warehouses. OLAP offers techniques for analyzing and drilling into data, and the tools are
mainly used for interactive report generation. OLAP tools use data mining techniques and statistical
methods to create readable reports quickly; these are used for forecasting and can further assist in
strategic decision making. The reports are generated based on a manager’s predefined criteria.
OLAP enables managers to analyze data from multiple perspectives and explore it in order to discover
hidden information.
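The multidimensional roll-up and drill-down that OLAP tools perform can be sketched with plain Python dictionaries; the fact rows and the dimensions (year, quarter, region) are hypothetical:

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, sales).
facts = [
    (2012, "Q1", "East", 100),
    (2012, "Q2", "East", 120),
    (2012, "Q1", "West", 90),
    (2013, "Q1", "East", 130),
]

def rollup(rows, keyfunc):
    """Aggregate the sales measure over whichever dimensions keyfunc selects."""
    totals = defaultdict(int)
    for year, quarter, region, sales in rows:
        totals[keyfunc(year, quarter, region)] += sales
    return dict(totals)

# Roll up to the year level...
print(rollup(facts, lambda y, q, r: y))
# ...then drill down into 2012 by quarter.
print(rollup([f for f in facts if f[0] == 2012], lambda y, q, r: q))
```

A real OLAP engine pre-computes and caches many such aggregates in a cube, but the analysis pattern is the same: pick dimensions, aggregate the measure, then drill to a finer level of detail.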
Data Mining
Data mining techniques are designed to identify relationships and rules within a data warehouse, then
create a report of these relationships and rules. The data mining process involves discovering various
patterns, generalizations, regularities and rules in data sources. Knowledge from data mining may be used
to predict an outcome of a decision and can also describe reality. The ‘predictions’ generated by data
mining use known variables to predict the outcome of a situation, while ‘reality’ is measured by graphing,
tabling and creating formulae based on the existing data.
There are several basic strategies for data mining. The most common are (a) classification, (b) estimation,
(c) prediction, (d) time series analysis and (e) market basket analysis.
These strategies can be aligned with the needs of an organization and help decision making by discovering
various patterns, generalizations, regularities and rules in data sources.
For example, classification can be used to classify unstructured data, such as email, as spam or a
legitimate piece of correspondence.
Data mining discovers hidden patterns in data. Data mining operates at a detail level instead of a summary
level. In other words, data mining predicts while OLAP forecasts.
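As a toy illustration of the classification strategy, the following sketch labels email as spam or legitimate using a keyword score. The term list and threshold are invented; a real data mining tool would learn such rules from historical data rather than hard-code them:

```python
# Illustrative keyword list (an assumption, not from the text).
SPAM_TERMS = {"winner", "free", "prize", "urgent"}

def classify(message, threshold=2):
    """Classify a message as spam if it contains at least
    `threshold` distinct spam terms."""
    words = set(message.lower().split())
    return "spam" if len(words & SPAM_TERMS) >= threshold else "legitimate"

print(classify("You are a winner claim your free prize"))  # spam
print(classify("Meeting moved to 3pm"))                    # legitimate
```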
At senior managerial levels, BI systems provide the input to strategic and tactical decisions and at the
lower managerial levels help individuals do their day-to-day job.
On a strategic level, BI systems create the information used to forecast future results based on historical
results; on the tactical level, they provide a basis for decision making to optimize actions for overall
company performance; and on an operational level, they provide just-in-time analysis of departmental
performance. BI systems can be used to guide and improve decision making at all levels: strategic, tactical
and operational. This is done by taking advantage of all available information and through the use of
analyses such as OLAP.
Management level
A BI system allows managers to make decisions using real-time data by monitoring the competition, carrying
out constant analysis of numerous data and considering different variants of organizational performance.
BI system – A set of integrated tools, technologies and programmed products used to collect,
integrate, analyze and make data available (Koronios and Yeoh, 2010).
Data – Conversations, graphics, images, movies, news items, text, videos and web pages used as
input for analysis for the purpose of decision making (Negash, 2004).
Data mining – Tools specifically designed to identify patterns, relationships and rules within the
data warehouse (Hevner and March, 2005).
Data warehouse – A subject-oriented collection of data used to support decision making in an
organization (Anderson et al., 2008).
Data warehousing – A systematic approach to collecting relevant business data in order to
organize and validate the data so that it can be analyzed to support business decision making
(Cody et al., 2002).
Decision support system (DSS) – A set of tools that analyze data and present it in such a way as
to support decisions (Airineri and Homocianu, 2009).
Delivery of information – The task of the presentation component of a BI system. This presentation
includes graphics and multimedia interfaces that allow information to be presented in a
comfortable and accessible form (Olszak and Ziemba, 2007).
Extract, transform and load (ETL) – Processes and tools used to extract data from legacy systems
and external sources, then transform and pre-process the data into a useful format to load
into data warehouse structures (Hevner and March, 2005).
Business intelligence hardware – Infrastructure that exists in an organization and is used to
support decision making. This infrastructure includes servers (file and compute), network
equipment and workstations (Arnott et al., 2004).
BI software – Software that is used in an organization to support decision making. This software
includes OLAP, ETL, data mining and other analytical utilities (Arnott et al., 2004).
Online analytical processing (OLAP) – Tools that allow users to analyze multidimensional data
known as cubes. Cubes are data extracted from the data warehouse and used by managers in
decision-making situations (Hevner and March, 2007).
Operational decisions – Decisions that are related to and affect the ongoing operations of an
organization, based on up-to-date financial data, sales and cooperation with suppliers and
customers (Olszak and Ziemba, 2007).
Searching – The collection of raw, unprocessed data from a set of source systems and data
structures. Data is moved from these sources (internal or external) into a data warehouse through
the ETL process to deliver meaningful information for managerial decision support (Watson and
Wixom, 2007).
Strategic decisions – Decisions that set objectives and that are made to realize those objectives,
e.g., development of future results based on historical results, profitability of offers and the
effectiveness of distribution channels (Olszak and Ziemba, 2007).
Tactical decisions – Decisions related to marketing, sales, finance and capital management.
Tactical decisions are often used to support strategic decisions (Olszak and Ziemba, 2007).
Chapter 2
Elements of Business Intelligence Solutions
Getting to the data is just one capability of the business query tool; the other aspect is presenting and
formatting the data in a meaningful way, loosely referred to as ‘reporting’.
The most basic of formatting capabilities allow for changing the font of column headings and making them
bold and centered. Simple report styles include displaying information in a cross-tab report, a chart or a
master-detail report with groupings and subtotals. Tools may provide a set of templates to create nicely
formatted reports that use a consistent corporate look and feel. More complex formatting capabilities
include the ability to present multiple charts on a page, perhaps coming from different data sources.
Examples of business query tools include,
An ad hoc query may be exploratory in nature as users try to find the root cause of a problem. As users
explore the data, corollary ad hoc queries may arise. It is important to recognize the iterative nature of
business intelligence and, therefore, the necessity of a flexible tool.
Business query tools allow business users to access a data source via business terms without having to
write any SQL (Structured Query Language). The data source could be a data warehouse, or it might be
direct access to an operational system.
A key feature of a business query tool is that it has a business view that hides the complexity of the
physical database. It uses business terminology rather than the physical field names in the RDBMS.
Business query tools generate the SQL behind the scenes so that users don’t need to learn how to write
SQL code.
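The business-view idea can be sketched as a simple mapping from business terms to physical names; the cryptic table and column names below are hypothetical stand-ins for a real RDBMS schema:

```python
# A hypothetical "business view": friendly terms mapped to physical
# table and column names, so users never see the RDBMS schema.
BUSINESS_VIEW = {
    "table": "T_SLS_FCT_01",
    "fields": {
        "Customer": "CUST_NM",
        "Revenue": "SLS_AMT",
        "Region": "RGN_CD",
    },
}

def build_query(selected, filter_field, filter_value):
    """Generate parameterized SQL behind the scenes from business terms."""
    cols = ", ".join(BUSINESS_VIEW["fields"][name] for name in selected)
    table = BUSINESS_VIEW["table"]
    col = BUSINESS_VIEW["fields"][filter_field]
    return f"SELECT {cols} FROM {table} WHERE {col} = ?", (filter_value,)

sql, params = build_query(["Customer", "Revenue"], "Region", "EMEA")
print(sql)  # SELECT CUST_NM, SLS_AMT FROM T_SLS_FCT_01 WHERE RGN_CD = ?
```

A real business query tool does far more (joins, aggregation, security filters), but the separation of business terminology from physical names is the core of the approach.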
Reporting
Whereas business query and reporting tools allow for basic report formatting, production reporting tools
have much more sophisticated formatting and design capabilities. This category of tools is sometimes
known as pixel-perfect, operational or enterprise reporting.
Key differences between business query tools and production reporting tools.
Online analytical processing (OLAP) is a capability that focuses on analyzing and exploring data, whereas
query and reporting tools put greater emphasis on accessing data for monitoring purposes. In other
words, OLAP focuses on ‘why something is happening’ rather than ‘what is happening’. To find out the
‘why’, users have to navigate and ‘drill’ within the data.
OLAP provides interactive analysis by different dimensions (e.g., geography, product, time) and different
levels of detail (e.g., year, quarter, month).
As the technology and users have evolved and matured, the distinction between OLAP and reporting has
increasingly blurred. OLAP users want highly formatted reports that are based on multidimensional data,
whereas ‘reporting’ users immediately want to drill down when they see a problem with a particular metric
in a report.
The following characteristics distinguish OLAP tools from business query tools.
In understanding OLAP requirements, it is important to distinguish between OLAP platform issues and
OLAP user interface issues.
OLAP platforms
The OLAP platform is about how the data is stored to allow for multidimensional analysis. On the one
hand, users should not have to worry at all about how the data is stored, replicated and cached. And
yet, the OLAP architecture affects what users can analyze and how. The OLAP architecture also
influences which OLAP front ends a user can use.
Just as business query and reporting tools allow users to retrieve data from relational databases without
knowing SQL, OLAP viewers allow users to access data in an OLAP database without knowing
multidimensional expressions (MDX is also a query language, similar to SQL).
Many of the leading BI suite vendors offer OLAP viewers for third-party OLAP data sources, sometimes via
the business query and reporting tools, via a production reporting tool or via a special OLAP viewer
(e.g., Microsoft ProClarity and Panorama NovaView).
BI dashboards are similar to car dashboards: they provide multiple indicators or reports in a highly visual
way. A dashboard may be composed of several such elements.
A key characteristic of dashboards is that they present information from multiple data sources.
Development
In the late 1980s, Executive Information Systems (EIS) tried to deliver similar capabilities. They were often
custom coded, inflexible and based on quarterly data. Nowadays, dashboards are user-built, flexible and
sometimes updated in real time. They also increasingly leverage advanced visualization capabilities that
facilitate greater insights and convey more information in less space.
While some dashboards may offer general packages, others provide specialized controls that focus on
certain areas of the business. But most dashboards share a set of common features.
Scorecards
The terms ‘dashboards’ and ‘scorecards’ are often used interchangeably, although they are indeed
different things.
Scorecards are designed to measure progress towards meeting strategic business goals, whereas
dashboards are designed only for the ongoing monitoring of specific activities. Dashboards are not meant
to convey information about how closely operations match the intent of the business and its associated
targets.
Scorecards are designed to help managers and executives quickly identify company initiatives that need
extra attention. They are also an excellent way to distribute accountability to mid-level managers.
Strategic scorecards contain metrics from four key areas that drive the success of a business (people,
customers, financial, operations) and will include strategy maps to show how the metrics relate to one
another.
Fig. 1. Balanced scorecard measuring health services to displaced persons in Aceh, Indonesia, August
2007 (Source: WHO)
Metadata models
Metadata is ‘data about data’. It defines the contents and locations of the data (or data model) in the data
warehouse, relationships between the operational databases and the data warehouse and the business
views of the data in the warehouse as accessible to the end-user tools. Metadata is searched by users to
find the subject areas and the definition of the data. A repository is a place where this data is managed and
maintained.
An enterprise data warehouse normally includes data from various sources. These sources include
operational data as well as third party external data. Metadata can be extracted from the various tools,
applications and vendor solutions that are used to supply data to the warehouse. RDBMS, modeling tools
and enterprise resource planning (ERP) systems are all examples of and important contributions to
metadata.
As data is stored and propagated throughout a typical warehouse environment, each physical data
store adds more metadata. Further, each component of warehousing, such as ETL, data management and
data access, affects the type of metadata being generated.
The metadata structures the information in the data warehouse into categories, topics, groups, hierarchies
etc. It provides information about the data within a data warehouse.
Metadata defines how the transformed data is to be interpreted, for example whether 5/9/99 means
September 5, 1999 or May 9, 1999.
Metadata gives information about related data in the data warehouse.
Metadata estimates response time by showing the number of records to be processed in a query.
Metadata holds calculated fields and pre-calculated formulae to avoid misinterpretation and contains
historical changes of a view.
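The date example above can be made concrete with a small sketch of metadata-driven interpretation, where the recorded format decides whether 5/9/99 is September 5 or May 9 (the field name and metadata layout are assumptions for illustration):

```python
from datetime import date

# Metadata recorded for the source feed tells the warehouse how to
# interpret an ambiguous date string such as "5/9/99".
FIELD_METADATA = {"order_date": {"format": "DD/MM/YY", "century": 1900}}

def interpret_date(value, field):
    """Parse a date string according to the metadata for its field."""
    meta = FIELD_METADATA[field]
    a, b, year = (int(p) for p in value.split("/"))
    day, month = (a, b) if meta["format"] == "DD/MM/YY" else (b, a)
    return date(meta["century"] + year, month, day)

# With DD/MM/YY metadata, "5/9/99" means September 5, 1999, not May 9.
print(interpret_date("5/9/99", "order_date"))
```

Without the metadata entry, the same string could legitimately be loaded as either date; this is exactly the kind of misinterpretation the metadata repository exists to prevent.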
The data warehouse administrator’s perspective of metadata is a full repository and documentation of all
contents and process within the data warehouse; from an end user perspective, metadata is the road map
through the information in the data warehouse.
Metadata provides the pointer to the data warehouse for decision support. Therefore, it acts as a logical
link between the decision support system application and the data warehouse. Thus, any data warehouse
design should ensure that there is a mechanism that populates and maintains the metadata repository, and
that all access paths to the data warehouse use metadata as their entry point. In other words, no direct
access to the data warehouse data (especially updates) should be permitted if it does not use metadata
definitions to gain access. The software tool should provide a facility for metadata definition in a metadata
repository.
Automating data gathering and analysis is crucial for enterprises to retain or gain a competitive edge.
Some of the main automated tasks are summarized below.
BI alerts, which make it possible for business users to constantly monitor business information, e.g. inventory falling below a certain level.
BI guided analysis, an extension of a BI alert that provides additional supporting information about the business issue that raised the alert, e.g. a workflow that leads the business user through a series of steps that help them investigate the problem in more detail.
BI-driven risk analysis, which evaluates an alert and makes recommendations on what action could be taken to address the situation, e.g. risk assessment of granting a loan or credit card.
BI-driven decision agents, which automatically take action to fix the problem, e.g. stopping the use of a fraudulent credit card.
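A BI alert of the first kind can be sketched as a simple threshold check; the inventory feed, item names and reorder level below are illustrative assumptions, not a real alerting API:

```python
# Hypothetical threshold-based BI alert over a simple inventory feed.
REORDER_LEVEL = 50

def check_inventory(stock_levels):
    """Return an alert message for every item below the reorder level."""
    return [
        f"ALERT: {item} stock is {qty}, below reorder level {REORDER_LEVEL}"
        for item, qty in stock_levels.items()
        if qty < REORDER_LEVEL
    ]

alerts = check_inventory({"widget-a": 120, "widget-b": 35})
```

A guided-analysis extension would attach supporting detail (recent sales, supplier lead times) to each alert rather than just the message.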
Types
Basically, passive mobile BI revolves around a ‘push’ technique. Passive mobile BI can be divided into two parts.
Event based alerts sent to a mobile device.
Reports ‘pushed’ to a mobile device after being refreshed.
However, passive mobile BI was not enough to support the real-time analytical requirements that users run into.
For example, a salesman in the field may be able to look at the latest price for a service, but he will not be able to efficiently sort through information for different customers to arrive at a competitive price.
Users
There are four categories: Executives, Field workers, Business analysts and Clerical staff. Passive type
only caters to clerical staff. Active mobile BI is required for the other three.
Usage models
There are three distinct models: exception based, pushed information and pulled information.
Before an enterprise decides to enable mobile BI as part of IT solution, the following factors need to be
evaluated for success.
BI maturity – The enterprise needs to have a working business intelligence solution in place.
Need for mobility.
Cost of handsets – affects ‘return on investment’ if the user base is large.
Cost of mobile client – again affects ‘return on investment’ if the users are on varied types of
mobile devices.
Cost of the solution and availability of the sponsors.
A mobile device communicates with an application server over a secure wireless network offered by a mobile service provider. Users can work with data and applications from application servers both online and offline.
Data components can be stored on the mobile device to allow users to work independently, despite connectivity and bandwidth issues.
Users can get live alerts on business-critical KPIs. Client-based applications have more access to the peripherals, allowing more elaborate functionality, storage access, multimedia display and so on.
Disconnected BI
This is about viewing analytics data, interactive dashboards and queries even when the mobile device is
not connected to the network. Typically, disconnected users connect their personal machines (usually
laptops) to a server, hosted by the vendor. After downloading a BI Disconnected Analytics application, they
can disconnect their laptops from the network (for example, to travel on a business trip) and still view
dashboards and queries on their laptop machines.
This approach has limited capability in terms of availability of data and its interactivity, because of the difficulty of making the data available on the mobile device itself.
One way to make disconnected BI work is to make reports accessible from the mobile browser as distributable, minimal content that lets the business user gain insight at a glance. These are called downloaded offline reports. They give a snapshot of the intended content and do not replace the connected version of the reports, at least for now.
Collaborative BI
Collaborative BI is a growing trend that merges BI and social media tools and amounts to business users
determining what the most valuable and relevant data in their organization is and sharing it to improve
decision making across the board.
Various BI systems are emerging nowadays. Among them are the following characterizing trends.
BI as a service, where BI applications are hosted as a service provided to business users across the
internet.
Real-time BI, where information about business operations is delivered as they occur, with near zero
latency.
Situational BI, where information in an enterprise data warehouse is completed and enriched by correlating it with the external information that may come from the corporate intranet.
Pervasive BI, where information can be easily and timely accessed through devices with different
computations and visualization capabilities and with sophisticated and customizable presentations, by
everyone in the organization.
Collaborative BI, where company information assets are empowered thanks to co-operation and data
sharing with other companies and organizations, so that decision making process is extended beyond the
company boundaries.
Collaboration is working together to achieve a goal. Teams that work collaboratively can obtain greater
resources, recognition and reward when facing competition for finite resources.
Indeed, cooperation is seen by companies as one of the major means for increasing flexibility, competitiveness and efficiency so as to survive in today’s uncertain and changing market. Companies need strategic information about the outer world, and users need to access information wherever it can be found. In inter-business collaborative contexts, this is particularly relevant, since the companies organize and coordinate to share opportunities, respecting their own autonomy while pursuing a common goal.
Data warehouse integration is an enabling technique for collaborative BI. It provides a broader base for
decision support and knowledge discovery than each single data warehouse could offer.
Benefits of collaborative BI
Challenges
Too many conflicting voices can make it difficult to come to an agreement on an issue.
Collaboration capabilities
A common infrastructure for producing and developing enterprise reports, scorecards, dashboards, ad-hoc
analysis and OLAP analysis.
Rich visualization, interactive dashboards, a vast range of animated charting options, to increase user
adoptions.
Web based service oriented architecture that integrates with existing IT infrastructure.
(Wayne Eckerson)
Real-time monitoring capabilities
In today’s competitive environment, analyzing data to predict market trends of products and services and to
improve the performance of enterprise systems is an essential business survival activity. However, it is
becoming clear that business success requires such data analysis to be carried out in real-time and that
actions in response to analysis results must also be performed in real-time in order to meet the rapid
change in demand from customers and regulators alike.
There are two main reasons that make real-time BI a necessity. Firstly, the conditions and environments in which businesses operate are in a constant state of flux. Sales patterns change from place to place and from time to time. Currency valuations shift and alter profit margins. Suppliers change delivery schedules and their prices. Customers become more educated and therefore more demanding.
Secondly, advances in technology, especially the internet, make real-time business seemingly achievable. Indeed, the internet has revolutionized information sharing. Vast amounts of data are available; almost all company data sources could be made accessible over the internet, and it is easy to capture all sorts of data and store it cheaply.
However, real-time data, analysis and actions pose a great challenge to the vendors.
Current BI systems suffer from two bottlenecks in realizing this vision. First, the transition from data into information is handled by analysts and therefore takes time. Second, current BI solutions do not go beyond generating reports; transitions from information to actions can only be manual and require long periods.
The challenge, therefore, is to use intelligent technologies to model the manual intervention present in current systems and to automate both the flow of information from the operational to the tactical to the strategic layer, taking data to the information stage, and the actions necessary to translate strategic objectives back into operational drivers and effect strategic decisions in real time. Additional challenges exist in the areas of on-demand infrastructure configuration, data integration, unified data models and on-demand data warehousing.
Keeping all these in mind, IBM has developed the IBM Cognos real-time monitoring tool, an actionable business intelligence solution that addresses an enterprise’s real-time monitoring needs for the operational front lines. It delivers self-service, interactive dashboards with easy-to-develop operational KPIs and measures to support the organization’s operational and time-sensitive KPI monitoring agenda. It also provides analysis and exception management through proactive alerts.
Business managers and analysts gain immediate insight into up-to-the-moment changing conditions within an organization’s operational environments. In addition, business users can view real-time information for a holistic view of their business.
The BI SDK provides positional data and information on events generated by the people, equipment and
sensors associated with business rules in location-based applications.
The SDK is designed to provide developers with all the tools and level of information they need, to build
powerful, custom applications to suit the needs of any enterprise. Whether for security, personal safety or
logistics processes optimization, the SDK provides event data to create custom location-aware
applications.
The BI SDK allows third-party systems to fully exploit the events that occur in location-aware applications. For example, airport customers have used the system to automatically direct cameras towards the zones where alarms are triggered. In an office, the removal of an anti-theft key triggers an alarm installed on the floor. In a hospital, a message is sent to the head nurse when the number of intravenous pumps on a specific unit reaches a minimum load.
The SDK comes with several source code examples and uses the Microsoft .NET Framework.
(www.purlink.ca)
The SDK provides a platform-independent automation interface for working with IBM Cognos services and components. Developers can create custom reports, manage deployment and integrate security and portal functionality to suit their needs, locale and existing software infrastructure. The SDK uses a collection of cross-platform web services, libraries and programming interfaces.
With the SDK, we can choose to automate only a specific task, or we can program the entire process from modeling through to reporting, scheduling and distribution. (IBM Cognos)
Through the SAP Business Intelligence Java SDK, analytical applications can be built that access, manipulate and display both multidimensional (OLAP) and tabular (relational) data. It provides additional capabilities and simplifies the task of implementing client applications based on these interfaces. (SAP Java SDK)
Web applications
Communication and information dissemination can be done using e-mail, instant and text messaging, and through web conferences and seminars. Social networking and the associated communities and profiles enable the sharing of expertise and best practices. A desktop application is another easy-to-use interface.
Chapter 3
Building BI Project
[Diagrams: stages of a business intelligence project, including Business Analysis, Data Analysis, Application Prototyping, Metadata Repository Analysis, Design, ETL Design, Construction, ETL Development, Implementation, Deployment, Release and Evaluation]
The two diagrams above give a general idea of the various stages of business intelligence program / project management.
Among the various stages, BI project planning defines the project in terms of
Project Scope
On BI projects, scope must be measured by the number of ‘data elements’ that have to be extracted from
the source systems, transformed and cleansed, and loaded into the BI target databases. This is so
because analyzing and preparing source data takes much longer than providing data access and enabling
data analysis through queries and reports.
Project risks
The project manager must identify situations which may trigger a risk and incorporate a mitigation plan as
well as a contingency plan. Some common project risks include
Change control procedures – Business managers and IT managers frequently put the project team under
unwanted pressure to incorporate scope changes without slipping the schedule.
Issues management procedures – Every issue must be assigned to a person who has the responsibility for
its resolution.
Project planning is not a one-time activity. Since a project plan is based on estimates, frequently no more
than best guesses, project plans must be adjusted constantly.
Task dependencies
Finish to Start indicates that task 2 cannot start until task 1 is finished.
Start to Start indicates that task 2 can start at the same time as task 1.
Finish to Finish indicates that task 2 cannot finish until task 1 finishes.
Start to Finish indicates that task 2 cannot finish until task 1 starts.
‘Finish to Start’ and ‘Start to Start’ are the most common task dependencies.
Once the task dependencies are identified and the resources leveled, use the critical path method diagram to find the critical activities.
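The critical path method amounts to finding the longest chain of durations through the finish-to-start network. A minimal Python sketch, with made-up tasks and durations:

```python
from functools import lru_cache

# Illustrative finish-to-start network: durations in days, and the
# predecessors each task must wait for. Task names are invented.
durations = {"A": 3, "B": 2, "C": 4, "D": 1}
predecessors = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}

@lru_cache(maxsize=None)
def earliest_finish(task):
    """Earliest finish = latest predecessor finish plus own duration."""
    start = max((earliest_finish(p) for p in predecessors.get(task, [])), default=0)
    return start + durations[task]

# Project length is the longest path; here A -> C -> D gives 3 + 4 + 1 = 8.
project_length = max(earliest_finish(t) for t in durations)
```

Tasks whose earliest finish lies on that longest path are the critical activities: any slip in them slips the whole project.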
Project schedules
The most commonly used diagram representing the project schedule is a Bar chart or Gantt chart.
The second most commonly used method is Critical Path Method (CPM).
The more complex method is PERT (Program Evaluation and Review Technique).
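PERT works from three time estimates per task, conventionally weighted as (optimistic + 4 × most likely + pessimistic) / 6. A one-function sketch, with illustrative figures:

```python
# PERT three-point estimate: a weighted mean that pulls the expected
# duration toward the most likely value.
def pert_estimate(optimistic, most_likely, pessimistic):
    return (optimistic + 4 * most_likely + pessimistic) / 6

# e.g. a task estimated at 2 days best case, 4 days most likely,
# 12 days worst case: (2 + 16 + 12) / 6 = 5.0 days expected.
expected = pert_estimate(2, 4, 12)
```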
Project resources
Normally, in any project, resources mean men, money and materials. But in a BI project, resources mean human resources only, because the major applications are to be built with a technology architecture base.
To identify resources required, the project manager lists out the major tasks of the project and the kind of
required skill for carrying out the tasks. The project manager assigns a role to cover the task.
A role is a standard combination of abilities and sets of skills that commonly go together.
Once the project manager determines all the roles, it is time to match them with the available resources. If you are tempted to start with the people and fill in the roles underneath them, you may go wrong. Instead, if you start with project roles and then match them with the people available in the resource pool, you have a better chance of fitting the right person to the job, as well as identifying skills missing from your team. It is better to understand what you need first, without considering what resources you have available. Only when the roles are defined should you start adding real live people into your plans.
Team Members
Most business intelligence projects include the following professionals, who are each expert in one or another key area of the project.
Project manager
Business analyst
BI developers
Database administrators
Data administrators
ETL developers
Testers.
Project Tasks
After the BI project’s roles are developed and its resources allocated, the major tasks in the project plan
are to be listed out. Then, they are to be reviewed and details added. After several passes (reviews) the
plan will contain a list of specific tasks, broken down into subtasks. These tasks have to be connected in a
logical order.
First Review
Important project milestones are to be identified and inserted in the plan. Milestones are broad targets.
Reaching them indicates the completion of a discrete goal or task.
The milestones have to be aligned with the overall strategy and architecture of the project. They need not
be always technology based. The following non-technical milestones are common in a BI project.
Once the milestones are decided, start adding the high-level tasks required to reach each milestone. The high-level tasks are all technical activities, such as writing code modules, installing software packages, performing necessary analyses and regression testing, used to assess the development team’s progress.
Second Review
Linkages: When tasks are linked, it means that they can be completed only in a given order. Usually it means that a first task must be completed before starting a second task.
Constraints: There are external factors which affect the completion dates of certain tasks.
These linkages and constraints are listed out to the best of one’s knowledge and ability.
Third Review
After breaking down the main tasks into more detailed subtasks, the bar chart is prepared to get an idea of how tasks link together in dependencies and extend over a calendar.
Conclusion
Once the project begins, continue to revise the list of tasks, the dependencies, the time estimates and
other things. The project plan continues to be revised.
But when things go wrong, even a well-planned project can turn into a nightmare if no contingency plans are in place.
Contingency Planning
A contingency plan is a branch off the main sequence of tasks and events and it is designed to get the
project back on track and performing.
The contingency plan’s value is unmistakable. To work out the contingency plan, you have to first discuss with your lead technology team and define what conditions must exist for you to implement the plan successfully. You must also discuss with them the probable causes for the plan to go wrong and how to prevent them.
Check points
It is advisable for you and your project leaders to take a breather and evaluate how things are going at predetermined times. It helps to have a preset list of metrics (checks) that indicate the well-being of a project.
The best project managers have mitigation strategies worked out ahead of time in case specific problems emerge for the project. Prepare a mitigation module for the project plan, a remedy for fixing whatever the problem is, and be ready to activate it if necessary.
Risk Assessment
Technology Risks
How mature are the selected technologies within the market place?
How mature are the selected technologies within the organization?
How many different technologies co-exist?
Do we have incompatible operating systems?
Do we have incompatible database management systems?
Complexity Risks
How complex is the overall IT environment?
How complex is the BI application itself?
How extensively will workflow have to change?
How many sites will be supported?
What is the degree of distribution of data, processes and controls?
Integration Risks
How many interfaces will the BI application have?
Are there external interfaces?
How much source-data redundancy exists?
Can the primary keys from various data sources be matched?
Organization Risks
How much risk will business management tolerate?
How much risk will IT management tolerate?
How much financial and moral support can we expect when the project encounters hurdles?
Returns
There are three ways in which BI solutions can directly contribute to value and thereby increase the ROI.
Increase revenue
Decrease costs
Improve efficiencies
Making money from the information itself – for example, by reselling customer demographic
information or customer buying patterns, provided the data privacy governance allows it.
Identifying new or wider market opportunities
Increasing the market range or penetration
Measuring Success
Measuring the success of a BI solution can be divided into two broad perspectives: intrinsic or perceived value, and financial return. Current BI projects are rarely funded solely on account of perceived value; more often they require financial justification to ensure that the proposed investment reflects the best use of the organization’s capital.
The Net Present Value (NPV) is a sophisticated budgeting technique that is used to analyze the profitability of an investment in terms of today’s money value and therefore its success.
The Internal Rate of Return (IRR) is another sophisticated budgeting technique. The concept of IRR is simple: an organization should invest in an internal project if it can earn a higher return than investing its capital elsewhere. IRR assumes that the cost of capital remains constant over time.
Profitability index is another method closely aligned to the NPV and IRR. It provides a relative
index value for comparison of projects.
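These measures can be sketched in a few lines of Python. The cash flows below are purely illustrative, and the IRR here is found by simple bisection rather than a financial library:

```python
# NPV, IRR and profitability index over an illustrative cash-flow stream.
def npv(rate, cashflows):
    """cashflows[0] is the (negative) initial investment at t = 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

def irr(cashflows, lo=0.0, hi=1.0, tol=1e-6):
    """Bisect for the rate where NPV = 0, assuming one sign change in [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, cashflows) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def profitability_index(rate, cashflows):
    """Present value of future inflows divided by the initial outlay."""
    return npv(rate, [0] + cashflows[1:]) / -cashflows[0]

# Invest 1000 now, recover 500 per year for three years.
flows = [-1000, 500, 500, 500]
value = npv(0.10, flows)   # positive, so the project clears a 10% hurdle rate
```

Here the IRR comes out around 23%, so the project is attractive whenever capital would otherwise earn less than that elsewhere.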
ROI is the most commonly accepted financial measure for evaluating the success of a project. It allows us to weigh the benefit of a project against the initial costs and to evaluate the NPV of the projected cash flow derived from the gains.
However, most companies do not conduct ROI analysis for BI, perhaps because a majority believe that the intangible benefits of BI are more important than the tangible benefits. When conducting ROI studies for BI, respondents said that they believe the most significant cost metrics are the direct costs of software licenses and hardware acquisition.
The key benefit metrics are improved access to data and improved quality of data for decision making. The greatest hurdle to conducting ROI assessments for BI is that real costs and tangible benefits are hard to quantify, and those hurdles seem consistent for both large and small companies.
A typical example of how to calculate the ROI is given at the end of this chapter.
In spite of all the above methods, the key method for measuring the success of business intelligence solutions in an organization is the rate of end-user adoption; that is, how many people who have access to BI software actually use that software and access the reports it generates on a regular basis, to help them make operational and strategic business decisions.
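As a rough sketch, the adoption rate is a simple ratio of active users to licensed users; the figures below are illustrative:

```python
# End-user adoption rate: the share of licensed users who actively use
# the BI software and its reports.
def adoption_rate(active_users, licensed_users):
    return active_users / licensed_users

# e.g. 150 of 600 licensed users log in and run reports regularly.
rate = adoption_rate(150, 600)
```

A low ratio signals that the solution, however technically sound, is not yet supporting day-to-day decisions.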
To increase user acceptance of a BI system, it is advisable to consult business users at an early stage of the BI lifecycle. This provides insight into the business process and into what the users need from the BI system. There are several methods for gathering this information, such as questionnaires and interview sessions.
While gathering the user requirements, the local IT department should also be consulted in order to determine to what degree it is possible to fulfill the business needs. Users can tell designers what they need the software to do and how it needs to look to support their activities; therefore the user community must be heavily represented in the requirements-gathering process. This sort of user-centered approach throughout the design and development stage may further increase the chance of user adoption of the BI system.
The purpose is to get the best ideas from all the employees and prioritize them, after consultation with
them, at the same time maintaining the integrity of the project plan and meeting the budget and timeline.
The usual methods adopted for gathering the requirements include the following.
Group meetings - meeting with a group of stakeholders and thrashing out the requirements.
Individual interviews - with representatives of the user community.
Questionnaires - sent to the various users.
Brainstorming - with the users.
Prototypes and pilot storyboarding - showing possible scenarios of how the applications might look and asking for feedback.
The result of all these exercises is a determination of what data the users will need to do their jobs well and what data they would like to have.
Group sessions
Individual meetings with certain stakeholders
After gathering the requirements, a report should always be prepared. The report should reflect information that is timely, accurate, high-value and actionable.
Validating BI Requirements
After all the requirements of BI have been collected, one may find that a few of them may seem to be
working at cross purposes.
Therefore, the next step ‘validating the requirement’ comes into play. It is to ensure that the requirements
all make sense when viewed together, and in the context of the underlying business processes.
This stage is about laying out all requirements, finding commonalities, building a list of unknowns and
questions (ultimately seeking answers for them) and mapping out the data needs of the project to make
sure that they are feasible.
When requirements don’t agree with each other, or don’t appear to correspond correctly with the reality of existing business procedures, the issue has to be flagged and noted. After the double-checking is carried out, the business analyst does some research and makes a recommendation as to how to move forward.
The validation step can take as long as gathering the initial requirements; don’t rush or take shortcuts. All disputes, questions and resolutions must be documented.
Prioritizing BI Requirements
In the process of collecting business requirements, the focus has to be not only on the details of the project
itself, but also on the priorities of the requirements.
Too many features, too few resources or too little time means that the work may have to be done in
phases. Judging how important requirements are relative to each other is the task of prioritization. This will
be performed by analysts with input from stakeholders.
The most common technique for prioritizing requirements is to divide them into categories ranked by how necessary they are to the project. There are three classifications in this regard.
Must: These requirements must be there; if not, the solution does not work.
Should: These should have a solid reason to be included.
Could: These requirements are nice-to-haves.
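The three categories lend themselves to a simple ranking; the requirement names in this sketch are made up for illustration:

```python
# Rank requirements by their Must/Should/Could category so the
# must-haves surface first when work has to be phased.
PRIORITY = {"must": 0, "should": 1, "could": 2}

requirements = [
    ("export dashboards to PDF", "could"),
    ("load daily sales data", "must"),
    ("drill down by region", "should"),
]

ranked = sorted(requirements, key=lambda r: PRIORITY[r[1]])
# 'must' items come first; 'could' items can be cut if time runs short.
```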
After the requirements are vetted, it is preferable to run the requirements documentation by a group of
high-level stakeholders. The reason is to take one final look at the business needs, to make sure that
everything is fine.
The final critical step in the requirements gathering cycle is to get final sign-off on the requirements
document from the client organizations, stakeholders or project sponsors.
The goal is to create a requirements baseline and even if a stakeholder dreams up a new feature for the BI
environment, the requirements document cannot be altered.
Changing Requirements
Ideally, the requirements, once they are fully defined in the early stages of the implementation, would never change. But the reality is that change is inevitable. BI projects are time-consuming, and during that time a company’s business needs may change. Most likely, the cause of changes in the requirements is part of the design process: as the design matures, certain needs come out into the open that were not obvious during the early stages.
Some project managers opine that changing requirements are actually a good sign because they show that
the stakeholders are actively engaged.
It is absolutely essential that you have a plan to deal with changes in requirements. In particular, there should be a plan to manage change requests so that they can be tracked and prioritized. Additionally, an impact assessment is to be carried out to see how the suggested change affects the rest of the project requirements and, ultimately, the overall implementation schedule.
Changing requirements during development is expensive, and it is to be examined whether the change is worth making now or can wait some more time.
A successful BI solution is one that meets its business objectives. For that purpose, there are certain things which can enhance the chances of success. They are:
To be realistic in setting up the goals, i.e. the approach has to be grounded in reality and the goal has to be well defined.
BI solution should aim to solve a problem that is overdue and which is really important to the
company.
While solving an existing problem, it should be borne in mind that each piece of the solution should
be a step to the success of the next.
BI Design
A successful BI design is similar to any good product-marketing effort. Both are about knowing your customers: what motivates them to buy the product or use it, their opinions or notions about it, what conditions may affect their using it, and their experience with similar products.
The best BI design is one that treats the users of business intelligence tools as customers. Since the
customers of any product may have different needs, designers also should prepare for varying usage
patterns among user groups. There are three types of user groups.
These are the analysts who will utilize the tools to the fullest capabilities, taking advantage of advanced
features.
They study the data architecture and know where to find even small but key data.
They develop and administer their own reports, using report-design tools and advanced queries.
They have high expectations about BI solution’s robustness, thoroughness and usability. They find
innovative approaches to finding BI insights and applying them to business problems.
Design Aspect
Consult them while designing the BI solution. Ask them what kinds of tools they would like for exploring the data and building their own reporting and analysis tools. Get a clear understanding of each feature’s importance. That will help in prioritizing the development effort according to what the experts want.
Design Aspect
During design, use the input from business users to help you create your standard reports. Don’t focus on technical details; ask about accessibility, appearance and so on, and seek advice on how to make the basic features more user-friendly.
Design Aspect
Ask them how to make expert features user-friendly. A good, flexible design can make an enormous impact with them. Turn complex guided-analysis pathways and drill-down exploration of data into a comfortable design.
Designing the data environment: The BI tools have to use the collected, aggregated operational data stored in a data warehouse or a data mart in a BI environment. Therefore, the data analysts in your company have to decide on the data environment that best suits the needs of the BI tools. The data environment’s conception and design will largely be a function of several factors.
Data Sources: What information do they contain, and how accessible is it to the BI system?
Query Requirements: What kinds of questions will the users pose? Are there specific reporting patterns that need to be considered?
Performance Requirements: How fast will your users expect their answers?
The ETL Design: The ‘Extract, Transform and Load’ processes are the steps that the system takes to move information from the source data systems to the data environment. Hence the ETL design is one of the most important factors in making the BI environment function successfully. Extracting and loading simply move data, but transforming is what makes the data usable. The state of the data sources will ultimately determine the design of the ETL process that harvests the data on a set schedule, cleanses it, transforms it into a single format the front-end systems can use, and loads it into the final data environment. Therefore, put your team’s top talent on the task of designing sound ETL processes.
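The three steps can be sketched end to end in Python. The source rows, the day/month/year date convention and the in-memory target list are all assumptions standing in for real systems:

```python
from datetime import datetime

# Minimal extract-transform-load sketch over an illustrative source.
source_rows = [
    {"customer": "acme", "order_date": "5/9/99"},
    {"customer": "globex", "order_date": "12/1/99"},
]

def extract():
    return list(source_rows)                       # stand-in for reading a source system

def transform(rows):
    cleaned = []
    for row in rows:
        # Assumed source convention: day/month/year; the target gets one format.
        dt = datetime.strptime(row["order_date"], "%d/%m/%y")
        cleaned.append({"customer": row["customer"].upper(),
                        "order_date": dt.strftime("%Y-%m-%d")})
    return cleaned

def load(rows, target):
    target.extend(rows)                            # stand-in for a warehouse insert

warehouse = []
load(transform(extract()), warehouse)
```

The transform step is where the design effort concentrates: it is the only step that changes the data, here by standardizing names and dates.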
When the data gets into the final data environment, what the users do with your data will ultimately determine the kind of data environment that gets built.
Best practices dictate that your users’ needs must influence the design of the data environment.
Designing the front-end environment:
When the source data flows into the target data environment by means of ETL processes, it is necessary to design ways for users to access the stored data. In most BI environments, the front-end systems include the reporting, querying and all levels of general and custom analysis packages.
For any BI solution, the core of the front end is the reporting environment. For the creation of any standard report, the first step is to formulate the queries to get the appropriate data; then, with the help of a front-end tool, merge the data sets into one according to your requirements.
Although many people are competent at creating reports, doing it well requires a combination of
talents well grounded in information architecture, business understanding and even graphic
design. Report writers are specialists; a good one adds value to the BI process.
Ad-hoc information access design:
If your environment is to support users who want more than standard reports, you will need to
create a robust ad-hoc reporting environment.
Ad-hoc reports are extremely valuable for information consumers; they allow workers to operate
outside the information contained in standard reports. An ad-hoc reporting environment exposes
the key entities of the database to the user, who can then pick and choose among them to create
a custom report.
A company that relies on ad-hoc reporting tools must design a logical data model that makes
sense to its information consumers.
There are a number of BI front-end tools that specialize in ad-hoc reporting, for example
Microsoft's Report Builder product.
OLAP Design:
An OLAP front-end tool gives users a multidimensional view of the underlying data, where
information can be rearranged easily to support the fast-moving business needs.
Designing an OLAP front-end solution involves four main steps:
Identify the variables to be tracked, measured or described.
Group the variables together in logical domains of association.
Define the relationships between the major entities.
Define the formulae and derivations that serve to aggregate and combine the measures in a useful
way for the end users. Consult an OLAP expert if necessary.
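The four design steps can be sketched as plain Python data structures. Every name here (the measures, the dimension levels, the margin formula) is a hypothetical example chosen only to make the steps concrete.

```python
# A sketch of the four OLAP design steps. All names are illustrative.

# Step 1: identify the variables to be tracked or measured.
measures = ["revenue", "cost", "units_sold"]

# Step 2: group variables into logical domains of association (dimensions).
dimensions = {
    "Product": ["product_line", "product_type", "product"],
    "Geography": ["region", "country", "city"],
    "Time": ["year", "quarter", "month"],
}

# Step 3: define relationships between the major entities; in this simple
# model every measure is analyzable across every dimension.
relationships = {m: list(dimensions) for m in measures}

# Step 4: define formulae that combine the measures usefully for end users.
derivations = {"margin": lambda revenue, cost: revenue - cost}

print(derivations["margin"](1000, 750))  # 250
```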
Testing:
Testing is an essential part of the design and deployment process for BI, because mistakes
detected at this point are far cheaper and easier to fix than those discovered after deployment.
Document your testing. Have Quality Assurance professionals create a test plan, and test cases,
while the developers are still building and coding the system.
Pilot Project:
A pilot project is more than a dry run of a BI application. It is essentially a small-scale demonstration
designed to show that BI can add value in a certain domain of the business.
Proof of Concept:
A proof of concept differs from a pilot project in its goals: it applies a particular solution to a business
problem to see how well the proposed system can withstand the accompanying demands.
Annexure
Expenses
Hardware $250,000
Software $400,000
Labor $750,000
Total $1,400,000
Assumption
Count   Position                  % Time
2       BI Operational support    100
1       Database administrator    10
1       System administrator      10
1       Network administrator     8
1       Trainer                   25
Solution
Analysis: Since the ROI is greater than 100%, the payback period for the BI project is better than the 3-year
period used in the calculation.
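The arithmetic behind this analysis can be sketched from the Annexure figures. The cumulative benefit figure ($3,000,000 over three years) is a hypothetical assumption introduced only to illustrate the calculation; the expense figures are taken from the Annexure above.

```python
# ROI / payback sketch. The 3-year benefit is an assumed, illustrative figure.

total_cost = 250_000 + 400_000 + 750_000       # hardware + software + labor

benefit_3yr = 3_000_000                         # assumed cumulative benefit
roi = (benefit_3yr - total_cost) / total_cost   # > 1.0 means ROI over 100%

annual_benefit = benefit_3yr / 3
payback_years = total_cost / annual_benefit     # well under the 3-year horizon

print(round(roi * 100), "% ROI;", payback_years, "year payback")
```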
Chapter 4
Report Authoring
Task flow
To customize development intelligence report definitions, you have to install 'Rational Reporting for
Development Intelligence', and to customize document-style reports, you have to install 'Rational Publishing
Engine' on the system.
A report author can create reports using 'Report Studio' and 'Query Studio'. You can use 'Report Studio' to
create reports to track the progress of your project as well as to identify and track business problems. You
can use 'Query Studio' to create a simple report such as a graph or a chart.
Building a Report
When you create a report, you are actually creating a report specification. This report specification defines
the queries and prompts that are used to retrieve the data and the layouts and styles used to present the
data.
Creating a report includes
Creating basic reports
Understanding how to create accessible reports
Viewing lineage information for data items
Accessing the IBM WebSphere Business Glossary
Creating reports for IBM Cognos Go! Mobile
Creating reports for mobile devices
Specify the data package that will provide items for the report.
The packages are based on models that are created in the modeling tool, Framework Manager. A
model is a set of related objects such as query subjects, dimensions, filters and calculations.
Refresh the package to ensure that you are working with the latest version.
Choose a basic report layout.
Add data
Select the data items you want to appear in your report. If it is a relational style report or dimensional style
report, you add data under the relevant heading.
Validate the report so that it contains no errors.
Save the report.
Run the report.
A multidimensional database (MDB) is a type of database that is optimized for data warehouse and online
analytical processing (OLAP) applications. Multidimensional databases are frequently created using input
from existing relational databases. Whereas a relational database is typically accessed using an SQL query,
a multidimensional database allows a user to ask questions like 'How many Lenovo laptops have been sold
in Bangalore city so far this year?' and similar questions related to summarizing business operations and
trends. An OLAP application that accesses data from an MDB is known as a MOLAP application.
Conceptually, a MDB uses the idea of a data cube to represent the dimensions of data available to a user.
For example, 'sales' could be viewed in the dimensions of product model, geography, time or some
additional dimension. In this case 'sales' is known as the 'measure attribute' of the data cube and the other
dimensions are seen as 'feature attributes'.
The multidimensional database management system rapidly processes the data in the database so that
the answers can be generated quickly.
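The cube idea can be demonstrated with a toy example in pure Python: facts carry feature attributes (product, city, year) plus the 'sales' measure, and a query sums the measure over any slice of the cube. All data values here are invented.

```python
# A toy multidimensional "cube": facts with feature attributes and a
# 'sales' measure. Products, cities and figures are illustrative only.

facts = [
    {"product": "laptop",  "city": "Bangalore", "year": 2013, "sales": 40},
    {"product": "laptop",  "city": "Bangalore", "year": 2012, "sales": 25},
    {"product": "laptop",  "city": "Chennai",   "year": 2013, "sales": 30},
    {"product": "desktop", "city": "Bangalore", "year": 2013, "sales": 10},
]

def slice_sum(facts, **criteria):
    """Sum the measure over the facts matching the given dimension values."""
    return sum(f["sales"] for f in facts
               if all(f[k] == v for k, v in criteria.items()))

# "How many laptops have been sold in Bangalore city this year?"
print(slice_sum(facts, product="laptop", city="Bangalore", year=2013))  # 40
```

Fixing fewer criteria widens the slice; `slice_sum(facts, year=2013)` rolls the measure up across all products and cities for that year.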
While building reports, choose the source of data and the authoring tool for the report carefully, since the
choice of data source and authoring tool are interrelated.
The simplest reports that you can generate are based on work item lists. You can create reports about
work items by exporting a work item query to Microsoft Excel. Work item lists are best suited to tables and
charts.
You can report status and historical trend data by using the OLAP data cube, which is optimized for reporting.
The OLAP data cube is best suited to reports that provide aggregated information such as the number of
work items that meet a set of criteria. If you want to create reports that show trends over time such as
progress charts, you can most easily create them from the OLAP data cube.
You can use the relational warehouse database to create reports that provide line-item details. These
include reports that contain titles of work items and more complex reports that do not include trends or
historical data.
Your choice of data source depends not only on the kind of data that you want to show but also on the tool
that you use to create reports.
Authoring tool    Work item query results   OLAP data cube   Relational warehouse database
Excel             Yes                       Yes              No
Report Builder    No                        Yes              Yes
Report Designer   No                        Yes              Yes
(These are all Microsoft tools.)
Relational reporting style:
This style is recommended for relational data sources. Relational data is best represented by lists. This
data is organized in 'Report Studio' by query items.
In relational reporting, you summarize data by using headers and footers in lists, summary functions,
member summaries and within-detail aggregation. You focus data in relational reporting with summary or
detail filters.
Dimensional reporting style:
This style is recommended for dimensionally modeled relational (DMR) and OLAP data sources.
Dimensional data is best represented by crosstabs, maps and charts. This data is shown in 'Report Studio'
in dimensions, hierarchies, levels and members.
In dimensional reporting, you summarize data by using member summaries and within set aggregates. You
focus data in dimensional reporting by adding only the relevant members to the edge of a crosstab or to
the context filter. You can also enable drilling up and drilling down in dimensional reports.
‘Analysis studio’ and the Report studio Express authoring mode are also available with dimensional data.
Reports can be created in 'Report Studio' using either a relational reporting style or a dimensional reporting
style, depending upon the type of data source that will be used. It is important to choose a reporting style
to ensure that you make the most of the data and to avoid mixing dimensional and relational concepts.
The following table outlines the best practices for both reporting styles.
For example, when you add the 'Product type' query item to a list, all product types appear when the
report is run. Likewise, when you add the 'Product type' level to a crosstab, all product types appear
when the report is run.
Report Types
List
A list report is a simple and common report that is used to view the detailed listing of any business entity
within a business process. The data is represented in a simple tabular format, in rows and columns. The
selected data fields act as the columns, and each set of single-column values acts as a row in the tabular
matrix. So each column holds the values from a specific database item or other pertinent field from the
data source to which the report is connected.
Case I
In one Super-Retail shop, a simple list report displays the revenue generated per day for each quarter in
Store A of the Central Region division.
Crosstab
Crosstab reports also display data in row and column format and are used to display datasets that have
relationships between more than one data item from the data source, with summarized values at the
intersections of the rows and columns. They are also known as matrix reports, as they are compact in their
display of data.
Case 1
The crosstab report displays summary values of cost and revenue generated per fiscal quarter for the year
2012 by stores in each geographic division.
The year 2012 is the crosstab node and the fiscal quarters are the crosstab node members along the column.
The store's geographic region and the stores within each region are the crosstab node and its members,
respectively, along the row.
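How a crosstab summarizes values at row/column intersections can be sketched in a few lines. The stores, quarters and revenue figures are invented for illustration; the point is that multiple facts landing on the same intersection are aggregated into one cell.

```python
# A crosstab sketch: each (row, column) intersection holds a summarized
# value. Stores, quarters and revenues are illustrative data.
from collections import defaultdict

facts = [
    ("Store A", "Q1", 100), ("Store A", "Q2", 120),
    ("Store B", "Q1", 80),  ("Store B", "Q2", 90),
    ("Store A", "Q1", 50),   # a second fact for the same intersection
]

crosstab = defaultdict(int)
for store, quarter, revenue in facts:
    crosstab[(store, quarter)] += revenue   # summarize at the intersection

print(crosstab[("Store A", "Q1")])  # 150
```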
Charts
The datasets in charts are represented similarly to crosstab reports, with the summary values
displayed at the intersections of the queried data items.
Types of chart
Micro Chart
Pie Chart
Column Chart
Marimekko Chart
Progressive Column Chart
Pareto Chart
Bar Chart
Line Chart
Area Chart
Radar Chart etc.
Axes provide reference lines that give a frame of reference for measurement or comparison. Normally
the x-axis has a qualitative item along it, such as product, and the y-axis has quantifiable values along it,
such as sales figures.
A data series is a group of related data points marked and plotted on a chart. Each series has a unique
color or pattern.
Categories are groups of related data series plotted along the x-axis. Stacked data plots are used
when multiple groups need to be displayed.
The legend is the color key assigned to data series and categories as a visual aid.
The following table depicts the different types of charting components and their intended usage.
Chart component            Intended usage
Pie, Stacked component     To display an entity's parts relative to a whole
Bar, Column, Line, Area    To display time-series trends of the entity values
3-dimension, Radar         To display value variance
(Fig. 4-05)
Maps
Maps are used to represent data points over a display of a geographic area; they display data
in the form of spatial information. They may be literal geographic maps of countries, layout plans of
constructed spaces, the track of a race course, auditorium seating plans and so on.
The maps display the data points as values at the intersections of the data items, similar to a crosstab.
The crosstab's tabular data is thus presented in a spatial informational context.
Case 1
In Super-Retail stores, the effective usage of floor space for inventory, walk areas and staff space can be
assessed by analysis of the floor-space index map.
In addition to grouping, you can also specify the sort order of data items. In lists, you can sort data items
within groups as well as ungrouped items. In crosstabs, you can sort row and column items.
Product line and Product type are from one dimension, and Region and Country are from another dimension.
If you group the first three, you will encounter errors. Also, when working with dimensional data sources, you
can group or sort only in the order of the existing hierarchy. If you change the order, an error will show up.
You can omit columns from the grouping. For example, if one level hierarchy contains the levels Country,
State and City, and another level hierarchy contains the level Product, then:
Country, State, City is valid.
Country, Product, State is not valid, because a level from one level hierarchy was inserted
between two levels from another level hierarchy.
Country, City and Product is valid, even though 'State' is omitted.
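The rule above can be sketched as a small validity check: levels within a hierarchy may be skipped, but a level from a different hierarchy must not interrupt a run of levels from the same hierarchy. The hierarchy definitions are the ones from the example; the function itself is a hypothetical illustration, not a Report Studio API.

```python
# Checks the grouping rule from the example above: once a hierarchy has
# been left behind, the grouping must not return to it.

hierarchies = {
    "Geography": ["Country", "State", "City"],
    "Product line": ["Product"],
}

def hierarchy_of(level):
    return next(h for h, levels in hierarchies.items() if level in levels)

def valid_grouping(order):
    finished = set()          # hierarchies we have already left behind
    current = None
    for level in order:
        h = hierarchy_of(level)
        if h != current:
            if h in finished:  # returned to an interrupted hierarchy: error
                return False
            if current is not None:
                finished.add(current)
            current = h
    return True

print(valid_grouping(["Country", "State", "City"]))     # True
print(valid_grouping(["Country", "Product", "State"]))  # False
print(valid_grouping(["Country", "City", "Product"]))   # True
```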
You can sort items to view them in the preferred order. By default, ‘Report studio’ retrieves items in the
order defined in the data source.
OLAP data sources always have a defined order. Relational and dimensionally modeled relational data
sources may not always have a defined order. The data modeler defines the sorting options in the model.
You cannot sort data items from different dimensions that are intermixed.
Filtering Reports
Filters are used to limit the results that are displayed in a report, when an analysis is run, so that the results
answer a particular question. Only those results that match the criteria are shown.
Filters are applied on a column-level basis and can be applied directly to attribute columns and measure
columns. Filters are applied before the query is aggregated.
When working with dimensional data, you can use ‘context filter’ or ‘slicer filter’, to quickly focus your report
on a particular view of the data.
Context filters are different from other filters. When you filter data, members that do not meet the filter
criteria are removed from the report. A context filter does not remove members from a report; instead, their
values are filtered out and you see blank cells.
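The contrast can be sketched in a few lines: a regular filter drops members, while a context filter keeps every member and blanks out the values that fall outside the context. The product lines and revenue figures are invented for illustration.

```python
# Regular filter vs. context filter, on invented report data.

data = {"Camping": 120, "Golf": 80, "Outdoor": 45}

# Regular filter: members failing the criterion disappear from the report.
regular = {k: v for k, v in data.items() if v >= 100}

# Context filter: every member stays; out-of-context values become blank.
context = {k: (v if v >= 100 else None) for k, v in data.items()}

print(regular)   # {'Camping': 120}
print(context)   # {'Camping': 120, 'Golf': None, 'Outdoor': None}
```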
You can add calculations to lists, crosstabs and all other data containers. You can also add calculations
directly to a page header, body or footer. However you must first associate a query to the relevant page.
Calculations make a report more meaningful by deriving additional information from the data source. For
example, you create an invoice report and you want to see the total sale amount for each product ordered.
Create a calculated column that multiplies the Product price by the Quantity ordered.
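The invoice example can be sketched directly: the calculated column derives a total from two existing data items. The line items and prices are invented for illustration.

```python
# The calculated column from the example: total = price * quantity.
# Invoice line items are illustrative data.

invoice = [
    {"product": "Tent",  "price": 250.0, "quantity": 2},
    {"product": "Stove", "price": 40.0,  "quantity": 3},
]

for line in invoice:
    line["total"] = line["price"] * line["quantity"]   # the calculation

print([line["total"] for line in invoice])  # [500.0, 120.0]
```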
For relational data sources, if an expression is used in multiple reports or by different report authors, ask
your modeler to create the expression as a stand-alone object in the model and include it in the relevant
package. If you start with an embedded calculation, you can later convert it into a stand-alone calculation.
In 'Report Studio', you can create the following types of calculation when working with dimensional data
sources:
If you want to create a calculation specifically for one query subject or dimension, you can embed the
calculation directly in that object. For query subjects, the calculation can be done for either data source
query subjects or model query subjects. However it is recommended that you apply calculations in model
query subjects wherever possible. This allows for better maintenance and change management.
Conditional Formatting
Formatting data in a report improves readability. For example, you can show all date values in the order
Year, Month and Day. If you do not set 'Data format' properties here, data is formatted according to the
properties set in the model. If the properties were not set in the model, data is formatted according to the
'International Components for Unicode' (ICU) formats.
You can also format data, based on a condition or specify the format for a particular object.
Formatting data includes
Conditional Formatting
You can define conditions to control what users see when they run a report. Conditions can apply to
specific items in a report. For example, you can define a conditional style to highlight exceptional data,
such as product revenue exceeding the target.
Conditions can apply at the report layout level. Conditional layouts are useful for delivering reports to a
multilingual audience. For example, you can have text items, such as titles and cover pages, appearing in
the same language as the data in the report.
If a report contains both conditional styles and style variables, the style variables are applied before the
conditional style.
You can specify which objects are rendered when a report is run. You can:
Add variables
Hide and show objects
Add conditional rendering
Add multiple layouts
Create multilingual reports
In a report, data can be summarized to obtain totals, averages, aggregates and so on. You can also add
both detail and summary aggregation.
Detail aggregation, which is supported only for relational data sources, specifies how a data item is totaled
at the lowest level in a report. In lists, detail aggregation specifies how the values that appear in the rows
are totaled. In crosstabs, detail aggregation specifies how the values in cells are totaled.
Summary aggregation, which is supported for all the data sources, specifies how data items are totaled in
the headers and footers of a list and in the total rows and columns of a crosstab.
You can add simple summaries to the groups in a report by using the 'aggregate' button. This button
provides a subset of the aggregate functions available in 'Report Studio'. For list reports, a 'Custom' option
is also available so that you can add your own aggregate function in the expression of the data item.
You can drill down to see more detail about the children of the data in the workspace. For example, you
can review revenue for each product line and then drill down to analyze the revenue for each product. In
other words, drilling down is exploring for more detail.
You can drill up to compare the results of the parent data in your workspace. For example, you can
examine revenue for a single product and drill up to see revenue for the entire product line for comparison.
In other words, drilling up is to get an overall picture.
You can drill up or down on a row or column heading or you can drill on a crosstab cell. If you drill on a
crosstab cell that appears within a row and a column, you can choose whether you want to drill up or down
on the row or the column. For example, a cell appears in the 'January' column and in the 'Quantity' row. When
you right-click the cell and choose 'drill down', you can click January or Quantity.
When working with dimensional or dimensionally modeled relational (DMR) data sources, you can create
reports that allow the reader to view more general or more detailed information on your data within a
predefined dimensional hierarchy (such as Year-Quarter-Month) without having to create different reports.
You can link groups of data items from different queries so that when you drill up or drill down in one query,
the data item also drills up or drills down in the linked query. For example, if your report shows a list and a
crosstab that each include the product line data item, you can link the product line data item from the
crosstab query to the product line data item from the list query, so that when you drill up in the crosstab
product line data item, the list product line data item also gets drilled up.
Drill-through Capability
Using drill-through access, you can move from one report to another within a session while maintaining
your focus on the same piece of data. For example, you select a product in a ‘Sales Report’ and move to
an ‘inventory report’ about that product.
Drill-through access helps you to build business intelligence applications that are bigger than a single
report. Drill-through applications are a network of linked reports that users can navigate, retaining their
context and focus, to explore and analyze information.
Drill-through access works by defining the target report or object and then using information from the
source report to filter the target. When setting up drill-through access, you define:
The source
The target
How the data in the packages that contain these objects is related
Whether to run the target report or to open it
If the target is being run, in what format to run it and what filters to use on the target
Running Reports
Before you begin, make sure that you have validated the report.
Open the report in 'Report Studio'.
Click Run > Run options. Set the following options in the options window, or leave the default
values.
Data mode – Specifies how much data is returned by the report query. You can select all data, limited
data (due to filters) or no data.
Language – Specifies the language of the report output.
Prompt – Specifies whether you are prompted for values on each run defined in the report.
Click 'OK'.
Click the 'Run' icon in the toolbar.
You cannot produce a report in CSV format if you have more than one query defined in the report, unless
the additional queries are used for prompts. The report runs in the IBM Cognos Viewer.
Chapter 5
BI Deployment, Administration and Security
The shape of the BI solution is driven by the key decision of whether to build a centralized or a decentralized
architecture. Each involves a different approach to the implementation process.
It’s possible to have a highly decentralized departmental implementation of BI. For example, imagine an
international organization that wants to introduce BI for its sales team, but the sales team has sub-units
that work on different products with wildly different sales dynamics. Perhaps they’re in different countries
and don't even speak the same language. In that case, a one-size-fits-all solution wouldn't work; you'd
have to allow those sub-units to customize the BI solution as needed.
Autocratic: Decisions are made from the top down, with little room for interpretation. Autocratic
cultures lend themselves to centralized BI; the apparatus is already in place for centrally controlled
strategy and application administration, and usually that’s what their people are used to.
Entrepreneurial: Business decisions are made throughout the company and innovation is
encouraged. Where departments and teams have the authority to dictate the terms of how their
supporting tools are built, decentralized BI systems are more likely to flourish.
Organizational structure: In most cases, but not all, the organizational structure aligns with the
company's culture. For example, if decisions are made at the top, the structure of power and
communication radiates from a handful of top executives down to the rest of the organization in a rigid way,
while entrepreneurial companies are likely to be matrix-managed rather than hierarchical.
It’s important to be aware of organizational factors because they will dictate how data moves between
business units and teams, and that will ultimately guide the centralized versus decentralized decision.
Since BI typically needs to bring lots of disparate data together, a company where teams are not in the
habit of sharing data or working together for the greater good could make it difficult to install a centralized
BI solution. You’ll want to understand how the business units work together to perform common functions,
as well as where they are geographically.
It’s probable the answer lies somewhere in between centralized and decentralized BI architecture. For
example, a company might install an enterprise-wide, centralized, homogeneous BI system, but allow one
key team with a unique function — say, the corporate-strategy group or the sales-operations crew —to
build their BI solution as they see fit.
Therefore the first key decision is setting policy to specify how coupled your BI environment should be; it’s
definitely as important as any decision you’ll make throughout the life of your project.
BI Architectural Alternatives
Once either a centralized or decentralized model has been decided, the other general architecture choices
can be made. Here, a number of vital factors need to be considered, such as these:
The goal is to produce a short list of architecture alternatives that satisfy all the bare minimum
requirements, and hopefully supply some nice-to-have features as well.
Each short list will include the querying, reporting, analysis, and other frontend tools that the end-users
throughout the company will be using. There will also be the underlying database technology — not just
software, but configuration options and architectural considerations as well. Finally, ensure that any
hardware requirements are put on the short list as well. For example, if the short list includes a solution
that involves creating a new centralized data-warehouse environment, the entry on the short list should
include:
A basic analysis of existing processing and storage capacity (relative to the minimum amount needed)
For BI implementations, the architectural choices will almost always start with three major categories of
technology:
Hardware
A discussion of what, if any, hardware changes are required (for example, will you have to increase server
processing horsepower to handle the complexity of the transactions expected?).
Data management
A small list of data-management options (such as how the data will be transported and transformed, and
what the target database will need to be able to do.)
End-user tools
Recommendations of one or more tools that meet the system’s general business needs directly (such as
an advanced statistical package, or a managed reporting solution to meet the needs of the prospective
user groups.)
If you have a rough idea of the requirements, then the end-user tools are a good place to start.
You can develop a list of available software that will meet your querying, reporting, and analysis needs,
then work backward to expand the evaluation and identify compatible data-handling technology that can
support the tools. Then you'll look for the hardware required to run it.
You’ll want to look at solution elements like these:
- Operating systems
- Network protocols
- Server hardware
- Primary database vendor
- Data warehouse and Extract, Transform, and Load (ETL) processes
- Kinds of front-end BI tools you absolutely must have
- Kinds of front-end BI tools that would be nice to have
At this point it makes sense to have in-depth discussions with candidate vendors and consultants so that
you can get detailed information on their product capabilities, plus a full menu of their support options to go
along with their wares. Invite them for a discussion of your situation and take careful note of how they
would approach the challenges you face.
It helps if the vendors have an idea of what you’re trying to do, so you should be prepared to share a little
information with them about your project. Project details will help the third parties tailor their pitch toward
what you actually need.
Evaluating possible solutions without considering your budget affords you a certain amount of freedom to
isolate the best alternatives, and identify the most important components without constraint. But sooner or
later the issue must be addressed.
Software licenses: Don’t be haphazard in your approach to buying licenses. Many vendors have complex
and confusing schemes that can lead you to pay for more seats than you end up needing. A software
partner with simple and flexible licensing can be worth its weight in gold.
Hardware-acquisition costs: Buying new gear for your BI solution can be an expensive proposition,
especially where it concerns high-performance servers and networking gear. Having a scalable solution is
a good way to save money; you can start small and work your way up to the hardware capacity you need,
but not until you actually need it.
Service and maintenance costs: Many vendors make their money not from the initial purchase of their
product, but from ongoing fees exacted from customers for supporting their products. Make sure you
account for all outlays over the life of the products you’re buying for your BI system, not just the big check
at the beginning.
Training and support costs: Complex software means the end-users of your business-intelligence
system will need help before they get full use out of it. It’s important that quality education be made
available for the user community, and that costs money.
Technology Risks
With the architecture selection made, and the solution coming into tighter focus, it’s time to start working
through how to make the vision a reality. The goal is not to create a full project plan with step-by-step
instructions; instead, the roadmap must include what you’re going to deliver and when. The roadmap is
where you lay out a strategy for building your business intelligence solution in a way that keeps momentum
up, maintains support throughout the organization, doesn’t use up resources too quickly, and tolerates
occasional failures.
It’s almost always beneficial to start with a limited solution that grows into a full BI implementation after
several subsequent phases. That is, you should look for objectives within your grasp for the early phases
of the project. The initial goal should be to start building a solution at the intersection of the highest-value,
least risky functions that are also the easiest to deliver:
It's almost always better to improve existing systems than to build brand-new ones. If (for
example) you already have a sales analytics module up and running, it probably makes sense to make the
early phases of your project include an upgrade to advanced sales analytics before building the HR
function afresh.
It's not always easy to decide where your priorities should be. It is better to lay out the key variables listed
in the previous section, along with a few that are peculiar to the situation. Work through the possible first
steps and grade them based on how they fall in each category, that is, highest value, least risk and ease
of delivery. From that grading, you'll get the optimal combination of functions for Phase I.
For example, a 'Productivity Analytics' candidate might grade out as: value 4, risk 2, ease of delivery 1,
for a total score of 7.
The grading determines the sensible initial steps for the BI project.
The next decision will be the items for Phase II and their grading.
A contingency plan should also be in place.
Incremental Rollout
When planning the implementation, use the same iterative approach used while developing the BI
application and the metadata repository. The iterative approach, or incremental rollout, works well because
it reduces the risk of exposing potential defects in the BI application to the entire organization. In addition, it
gives you the opportunity to informally demonstrate the BI concepts and the BI tool features to the
business people who were not directly involved in the BI project.
Treat the business people as customers—keeping customer care in mind. Trouble-free implementation,
interactive training, and ongoing support will help you get their buy-in. Always ask yourself, "What is in it for
the customers?" Take the opportunity to test your implementation approach. You may consider adjusting
your implementation approach or modifying the BI application prior to the full rollout (e.g., change
cumbersome logon procedures). It may be necessary to duplicate implementation activities at multiple
sites. Adding these sites slowly over time is easier than launching them all at the same time.
Introduction: BI is a very resource-intensive task. The art of extracting information from a
potentially large amount of data requires an adequate amount of processing power and exercises all
the important subsystems of a computer: CPU, memory, disk and network. Having the right
amount of capacity in these four aspects of the system is the key to success.
Sizing: Sizing a BI deployment requires a reasonable degree of planning so that calculations and
predictions can be made about the needs of the system. The number of users and the needs of
those users are required to predict load on the system. The types of data sources also have an
effect on the load and needs of the systems.
Once the user requirements are defined, a system can be specified that will achieve the required amount
of processing. The final step in sizing is to apply it to the hardware scenario. The deployment hardware
can range from many small machines to one large machine. The sizing exercise includes the allocation of
BI services to the nodes in the system, taking into account the CPU, memory, disk, and network
capabilities of the nodes to be used in the construction of the system landscape.
Prerequisites
1. The goal of the sizing exercise is to calculate the peak load that will be placed on the system. In
order to proceed with the sizing steps, you need to know the following information. Gathering this
data accurately is the most important part of this sizing exercise since all the sizing calculations
derive from this information.
2. Users: How many information customers, business users, and expert users will there be for each
type of BI tool? An average user workflow is also important to know. If you expect users to open
five BI documents and refresh them all at the same time, that is five times the load of one user,
and the system must be sized to handle that load.
3. It is very important to know whether the common user workflow will include refreshing reports
and, if so, how frequently. Will reports be scheduled to run at night and only be viewed during
the day, or will users refresh them every hour? This is an important part of the load prediction
and thus of the sizing estimates.
Data Sources
What type of data sources will be used: direct access SQL databases or others? It is important to know
which types of data sources will be used for the majority of BI processing so that peak load can be
predicted.
Document Size:
The relative size of each document is important to know. Will most documents be large and require a lot
of processing? This should be determined for each BI tool expected to be used.
Rating:
In order to know how much computing resources are needed for a deployment, you need to know the
computer’s performance rating.
Sizing Steps
The basic approach to sizing is working through the requirements of all the tiers in the deployment,
accounting for the processing and memory requirements of each sub system. Most sub systems and
services have special considerations that need to be taken into account.
Once the processing and memory requirements are determined, the next step is to fit the services to the
hardware landscape. This requires knowing the core rating of the computer to be used as well as the
amount of memory available to the machine or machines.
For each tier, namely a) Application Tier, b) Intelligence Tier, and c) Processing Tier, the processing power
and memory requirements need to be determined. This is the core of the sizing exercise. Once these
numbers are determined, scale-out to the deployment hardware can be done.
The processing power is calculated and specified in a processing unit which is a computer independent
unit. It takes into account the I/O processing capabilities of the computer in addition to its raw computation
abilities. Computers can have different processing abilities even with the same type of CPU.
Measurements
Are there any plans in place for measuring and monitoring system resources such as disk, cycles, and so
on? How will it be known if one user or one group is taking far more resources than others, and far more
than anticipated? To assist in such endeavours, some vendors offer powerful management tools.
The following information will be required to monitor and manage the environment through these tools.
What is the operational state of the physical hardware, nodes and network?
What is the state of the software services that are executing in the environment?
How to ascertain the performance and load on the computing infrastructure?
If a failure occurs, how quickly can an alert action be generated to respond to the event?
If a failure occurs within the environment, how does it impact business operations?
How to measure current work load and predict future capacity requirements?
How many users are actively using the computing environment?
What services are they using?
Are there any attempts to access the environment without proper authentication or authorization?
These questions and the corresponding answers are integral to successful management of the BI
environment.
Performance measurement focuses on results and allows users to analyze those results through charts,
grids, trends, and so on, whereas performance management is concerned with the business processes and
day-to-day activities that lead to strategic goals.
It is vitally important to use appropriate monitoring and alert utilities that will detect any resource problem
and sound an alarm in case of an actual or pending system failure or resource shortfall. Selecting an
appropriate monitoring utility is especially critical in a distributed environment.
This is achieved by creating an EPM Center of Excellence, which brings together people, processes, and
technologies to promote collaboration and the use of best practices throughout an organization. Centers
of Excellence consist of cross-functional teams, which provide consistency and efficiency in the company's
use of the EPM system. They deliver repeatable and successful deployments in a way that benefits the
entire organization rather than a single project.
End-User Provisos
The initial end users should be required to submit a list of activities that they are going to perform in the
early stages. The list should include the following:
OLAP Implementation
OLAP may be defined in terms of just five keywords: Fast Analysis of Shared Multidimensional
Information. Fast, in that even the most complex queries take no more than five seconds to process.
Analysis refers to the process of analyzing information of all relevant kinds in order to process complex
queries and establish clear criteria for the results of such queries. The information to be used for analysis is
generally obtained from a shared source, such as a data warehouse. The information may be related in
more than one or two dimensions. For example, a particular set of business data may be related, variously,
to sales figures, market trends, consumer buying patterns, supplier conditions and the liquidity of the
business. Presented in such a multidimensional detail, such information can be useful and vital to
managerial decision-making.
OLAP TECHNIQUES
Online analytical processing, or OLAP, is implemented in many different ways. The most common
approach, however, is to retrieve information from corporate databases, such as data warehouses, and
store it temporarily in OLAP multidimensional databases for retrieval by front-end systems. The
multidimensional database is optimized for fast retrieval, and several techniques for speeding up data
retrieval and analysis are used on the procedural side of database management.
OLAP implementation depends not only on the technological basis (the software that is going to be used)
but even more on the business models and data sources. Each industry sector or business area is specific
and requires detailed modeling to create multidimensional cubes for easy data loading and report building.
With huge amounts of data collected in various kinds of applications, the data warehouse is becoming a
mainstream information repository for decision support and data analysis, mainly because a data
warehouse facilitates online analytical processing (OLAP). It is important to study methods for supporting
a data warehouse, and in particular its OLAP operations, efficiently. Data cube technologies support OLAP
and data mining. There are two popular data cube technologies: Relational OLAP (ROLAP) and
Multidimensional OLAP (MOLAP).
“Data warehousing” refers to the process of constructing and using data warehouses. The construction of a
data warehouse requires data integration, data cleaning and data consolidation. It is constructed based on
the multidimensional data model.
A data cube in data warehousing is not necessarily a three-dimensional (3D) geometric structure; it is
essentially "N"-dimensional. The edges of the cube are called "dimensions", which are the entities with
respect to which an organization wants to keep records. Each dimension may be associated with a
"dimension table". For example, a dimension table for "Product" may contain attributes such as
"description", "brand", "category", and so on. In addition, a data cube in data warehousing is mainly
constructed to measure the company's performance. Thus a typical multidimensional data model is
organized around a theme, which is represented by a "fact table" of numerical measures: the objects of
analysis.
For example, a fact table may contain ‘sales’, ‘budget’, ‘revenue’, ‘inventory’ etc. Each of the numerical
measures depends on a set of dimensions which provide context for that measure. Therefore, the
dimensions together are assumed to uniquely determine the measure, which is a value in the
multidimensional space of dimensions.
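Because the dimensions together uniquely determine each measure, a data cube can be pictured as a mapping from dimension coordinates to a numeric value. The toy Python sketch below uses invented product, region, and quarter values to show a roll-up that aggregates the measure over the dimensions you do not want to keep.

```python
# Sketch: a data cube as a mapping from dimension coordinates to a measure.
# Dimension values and sales figures are invented for illustration.

cube = {
    # (product, region, quarter) -> sales
    ("Tent",  "East", "Q1"): 1200,
    ("Tent",  "West", "Q1"):  800,
    ("Stove", "East", "Q1"):  450,
}

def rollup(cube, keep):
    """Aggregate the measure over the dimensions not listed in `keep`.

    `keep` is a tuple of dimension positions to retain; keep=(1,) retains
    only the region dimension and sums sales over product and quarter.
    """
    out = {}
    for coords, measure in cube.items():
        key = tuple(coords[i] for i in keep)
        out[key] = out.get(key, 0) + measure
    return out

print(rollup(cube, keep=(1,)))   # sales by region
```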
Implementation
Data warehouses might be implemented on standard or extended relational DBMS called Relational OLAP
(ROLAP) servers. These servers support extensions to SQL and special access and implementation
methods to efficiently implement the multidimensional data model and operations. They assume that data
is stored in relational databases. On the other hand, Multidimensional OLAP (MOLAP) servers directly
store multidimensional data in some special data structures (such as arrays) and implement the OLAP
operations over the specified data structures.
Most data warehouses use a 'star schema' to represent the multidimensional data model (Kimball 96).
The following figure shows a typical star schema representation. The database consists of one large
dominant table in the centre, called the 'fact table', which is the only table with multiple arrows connecting
it to other tables. The other tables, called the 'dimension tables', each have only a single join attaching them
to the fact table. Each 'tuple' in the fact table consists of a pointer (a foreign key, which may be generated
for efficiency) to each of the dimension tables that provide its multidimensional coordinates, and stores the
numeric measures for those coordinates. Each dimension table consists of columns that correspond to
attributes of the dimension.
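The star schema just described can be sketched in a few lines of Python: a fact table whose rows carry foreign keys into dimension tables plus a numeric measure. The table contents and column names below are made up for illustration; a real warehouse would express the same join in SQL.

```python
# Minimal star-schema sketch: one fact table with foreign keys into two
# dimension tables. All data is invented for illustration.

product_dim = {1: {"brand": "GO", "category": "Tents"},
               2: {"brand": "GO", "category": "Stoves"}}
store_dim   = {10: {"region": "East"}, 20: {"region": "West"}}

# Each fact row holds foreign keys (its multidimensional coordinates)
# plus the numeric measure for those coordinates.
fact_sales = [
    {"product_id": 1, "store_id": 10, "revenue": 500},
    {"product_id": 1, "store_id": 20, "revenue": 300},
    {"product_id": 2, "store_id": 10, "revenue": 150},
]

def revenue_by(attr, dim, fk):
    """Join the fact table to one dimension and total revenue by `attr`."""
    totals = {}
    for row in fact_sales:
        key = dim[row[fk]][attr]
        totals[key] = totals.get(key, 0) + row["revenue"]
    return totals

print(revenue_by("region", store_dim, "store_id"))   # {'East': 650, 'West': 300}
```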
The popular conceptual model that influences the front-end tools, database design, and query engines for
OLAP is the multidimensional view of data in the warehouse, with its stress on aggregation of measures by
one or more dimensions as one of the key operations.
Expanding BI potential
Collect the right data from the right source and stage it to be cleansed and transferred properly
using data integrity and integration tools.
Store data with the proper structure and format.
Analyze stored data using tools such as OLAP.
Use this data.
Benefits:
Improved availability, security and quality of information.
Improved business insight across all functional areas.
The foundation for corporate alignment and improved strategic communication.
There are four ways to expand a BI system:
1. Increasing the number of people using the BI system without any changes to the data or the
analysis capabilities it supports.
2. Expanding the depth or breadth of the data that is available in the system.
3. Expanding the types of analytics offered to end users.
4. A combination of the above three.
Option 1: Once it is ascertained how many more business users will access the system, the BI system
infrastructure and technical support resources can be enhanced accordingly.
Options 2, 3 and 4: These require an assessment of the expanded BI requirements: the number of
business users, the types of analytics the new users would require, and whether more data for reporting
and analysis is needed.
Predictive Analysis
BI can do more for the users, when it is powered with predictive analysis. Predictive analysis goes beyond
traditional analysis of past and present data; it helps you to ‘see ahead’ so that you can create more
accurate forecasts and predictions. Predictions can be set across many areas of an organization and
across many types of organization.
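As a minimal illustration of "seeing ahead", the sketch below fits a least-squares trend line to a short series of past values and extrapolates one period forward. The revenue figures are invented, and real predictive analysis uses far richer models and far more data than a straight-line fit.

```python
# Toy predictive-analysis sketch: least-squares trend plus extrapolation.
# The quarterly figures are invented for illustration.

def fit_trend(values):
    """Return (slope, intercept) of the least-squares line through
    the points (0, values[0]), (1, values[1]), ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def forecast(values, periods_ahead=1):
    """Extrapolate the fitted trend line beyond the last observed period."""
    slope, intercept = fit_trend(values)
    return intercept + slope * (len(values) - 1 + periods_ahead)

quarterly_revenue = [100, 110, 120, 130]   # a perfectly linear toy series
print(forecast(quarterly_revenue))          # extrapolates to 140.0
```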
Text Mining
With the extensive inclusion of documents, especially text, in business systems, data mining no longer
covers the full scope of BI. Data mining cannot extract useful details from large collections of unstructured
and semi-structured written material in natural languages. The most pressing issue is to draw the potential
BI from text. To gain competitive advantage for the business, it is necessary to develop a new, powerful
tool, text mining, to expand the scope of BI.
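A very first step in mining text is simply counting which terms occur most often across a collection of documents. The toy sketch below does that with a small, invented stop-word list; real text mining adds stemming, entity extraction, sentiment scoring, and much more.

```python
# Toy text-mining sketch: term frequencies over a tiny document collection.
# The documents and the stop-word list are invented for illustration.

import re
from collections import Counter

STOP_WORDS = {"the", "a", "of", "and", "to", "is", "in", "for"}

def term_frequencies(documents):
    """Count non-stop-word terms across all documents."""
    counts = Counter()
    for doc in documents:
        for word in re.findall(r"[a-z]+", doc.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts

docs = ["The supplier delayed the tent shipment.",
        "Tent sales fell after the shipment delay."]
print(term_frequencies(docs).most_common(3))
```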
Authentication: verifies that a user is a bona fide user, making sure that he or she is authentic. This is
done with user credentials (a user name and password) that are stored in a secured authentication
source. If the credentials are valid, the user is granted access, and the next layer of security,
authorization, comes into play.
Authorization: defines the level of access that the user has to the functionality and content. It is the job of
administrator to assign specific permissions to specific users. Managing individual user access and
permissions can become extremely complex. To simplify, users are combined into groups and roles.
Users: are the individual people who have user names and passwords and have permission to access the
application.
Groups: are a collection of users that have the same access permissions—for example, everyone in the
Sales group can see content related to Sales.
Roles: are a collection of users that have the same access to functionality—for example, everyone in the
‘Authors’ role can create reports.
For all practical purposes, groups and roles behave in the same manner. The only technical difference is
that groups limit their membership to users and other groups, while roles can contain all three: users,
groups, and other roles.
When a user belongs to more than one group or role, permissions for all of the groups and roles are
merged into one set and applied to that user.
Access Permission: Groups allow you to organize your security structure and place users with common
interests (for example, sales and marketing) together. Roles allow you to assign functionality (such as
Authors, OLAP analysis and so on) to groups and users.
Access permissions allow you to determine what level of interaction your users, groups and roles can have
with the objects.
Read---Allows users to view any property of the object as well as create shortcuts to the object.
Write---Allows users to delete objects and modify the properties of an object, such as the name or
description.
Execute---Allows users to run reports and access data sources.
Set Policy--- Allows users to modify the security settings of an object.
Traverse--- Allows users to view the contents of an object and execute reports.
Access permissions can be set in one of three ways: Granted, Denied, or Not Set.
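The merging of permissions across a user's groups and roles can be sketched as below. The precedence assumed here (an explicit Denied wins over Granted, and anything beats Not Set) is a common convention, but the exact rules vary by product, so check the product documentation.

```python
# Sketch of merging access permissions across a user's groups and roles.
# Precedence assumed: Denied > Granted > Not Set. This is a common
# convention, not necessarily the rule of any specific product.

GRANTED, DENIED, NOT_SET = "granted", "denied", "not set"

def effective_permission(settings):
    """Merge the permission settings from all groups and roles a user is in."""
    if DENIED in settings:
        return DENIED
    if GRANTED in settings:
        return GRANTED
    return NOT_SET

# A user in Sales (granted Read) and Contractors (denied Read):
print(effective_permission([GRANTED, DENIED]))    # denied
print(effective_permission([NOT_SET, GRANTED]))   # granted
```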
A typical scenario is a user logging in to their machine where they get authenticated by the operating
system (OS) security. Next, the user opens a browser and accesses a protected resource on a web server.
Access to that resource is governed by web server security. Ideally, there is some SSO from the operating
system to the web server security. If not, the user would get prompted to authenticate to the web server’s
security layer.
If this protected resource happens to be an IBM Cognos BI Gateway, another SSO hop would be from the
web server security to IBM Cognos BI. In this scenario, there are two hops for overall SSO: from the OS
to the web server, and then from the web server to IBM Cognos BI. This assumes the SSO to IBM Cognos
BI is based on information provided by the web server security, but it could instead be based on the OS's
security, if a token is available.
Server Administration
Server Administration is performed by the System Administrator. It involves maintaining all of the software
and configuration settings.
Normally, the software and configuration settings need not be disturbed. However, the System Administrator
can manage the metrics monitoring the system, manage the software and configuration properties,
manage gateways, set logging levels, tune the system, view the status of jobs, and monitor the system.
Status and monitoring allow you to keep an eye on the health of your installation by showing you where
system performance is slowing down.
Status shows how the system is running. You can see current, past, and upcoming activities. The
activities screen displays a bar chart showing jobs that have succeeded, failed, or been cancelled.
You can view the run history or the history details. Status viewing is extremely useful for jobs that have
failed, since you can view the 'run history'. For future events, you can view what is scheduled to occur,
to ensure that the system will not be overloaded with requests that would adversely affect system
parameters.
Monitoring the system is done using metrics. Threshold values, set as targets, show whether the
system is working as designed. If any part of the system displays a red light, that portion can be addressed
appropriately to restore the system to peak performance. Additionally, logging can be used to evaluate the
system's performance.
Metrics and logging offer a complete view of the system. Use metrics to determine which portion of the
system is running poorly. Then you can view the logs to determine the reasons.
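A threshold check of the kind described above can be sketched as follows. The metric names and threshold values are invented for illustration; in practice they come from the product's metric definitions and your own targets.

```python
# Sketch of threshold-based metric monitoring: compare each metric against
# target thresholds and report a status colour, as a dashboard might.
# Metric names and thresholds below are invented for illustration.

def metric_status(value, green_max, amber_max):
    """Return 'green', 'amber', or 'red' for a lower-is-better metric."""
    if value <= green_max:
        return "green"
    if value <= amber_max:
        return "amber"
    return "red"

metrics = {
    "queue_length":      (3,   5, 10),   # (current value, green_max, amber_max)
    "response_time_sec": (7.5, 2, 5),
}

for name, (value, green_max, amber_max) in metrics.items():
    print(name, metric_status(value, green_max, amber_max))
```

A red result would then prompt the administrator to consult the logs, as the text suggests, to determine the underlying cause.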
Portlet Integration
Portals are used to display and organize content inside a web-based application.
Auditing
IBM Cognos Platform provides a complete auditing capability that permits administrators to report on and
manage system usage. The information logged by IBM Cognos Platform auditing can be used to support
administrative requirements such as:
Capacity planning
Licensing conformance reporting
Performance monitoring
Identifying unused content
The idea of not backing up BI target databases, on the assumption that the data can be rebuilt because it
is derived from other systems, is foolish when dealing with a very large database (VLDB). Although
backing up a database is time consuming and takes the database offline for several hours, the alternative
of reloading years' worth of data will take much longer, if it can be done at all. Not every organization
opts to keep all source extract files for years.
It is mandatory to back up the BI target database on a regular basis. But many of the hardware platforms
on which BI applications reside often have limitations on the amount of data that can be backed up on a
regular basis. These limitations are due to the slow speed of data transfer between the server and the
backup device.
Incremental backup: This strategy envisages backing up only the actual changes to a database
since the last update, rather than the entire database. It may not be advisable, however, because
there are multiple databases in the BI decision-support environment and the summarized data must
stay synchronized with the detail data: until the backups of all databases are completed, none of
the databases can be refreshed.
High-speed mainframe backup: Another possibility is to use the mainframe transfer utilities to pass
the BI data back to the mainframe for a high-speed backup which is supported only on the
mainframe. This is an expensive but a robust solution.
Partial backup: Another strategy relies on partitioning the database tables by date to support
partial backup. While one partition is being backed up, the other can remain available. A big
drawback of this strategy is that if the table is partitioned by a date column for backup purposes
(which means it is clustered by the date column) it cannot be clustered in any other way for
access purposes. This can affect performance while running the reports and queries.
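The incremental idea in the first strategy above can be sketched as follows: only rows modified since the previous backup are copied. The table and the timestamp column are invented for illustration; real implementations rely on DBMS-native backup facilities rather than row-by-row scans.

```python
# Sketch of incremental backup: select only rows changed since the last
# run, identified here by a hypothetical last-modified timestamp column.

def incremental_backup(table, last_backup_time):
    """Return only the rows modified after the previous backup."""
    return [row for row in table if row["modified"] > last_backup_time]

table = [
    {"id": 1, "modified": 100},
    {"id": 2, "modified": 205},
    {"id": 3, "modified": 310},
]

# The last backup ran at time 200, so only ids 2 and 3 need copying.
changed = incremental_backup(table, last_backup_time=200)
print([row["id"] for row in changed])   # [2, 3]
```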
Business intelligence
LAB Exercise
Objective
This exercise course on Report Authoring uses IBM Cognos BI Developer Edition as the BI tool for
illustrating the creation of basic reports. Different types of functionality and formats are discussed in the
following sections.
The Microsoft Internet Explorer web browser is well supported by Cognos, and participants are advised to
use it as their web browser. For the purposes of illustration, the sample database that is available in the
IBM Cognos BI Developer Edition tool is used throughout this course.
This course uses IBM Cognos Report Studio, which is a Web-based report authoring tool that professional
report authors and developers use to build sophisticated, multiple-page, multiple-query reports against
multiple databases. With Report Studio, you can create any reports that an organization requires, such as
invoices, statements, and weekly sales and inventory reports.
Reports can contain any number of report objects, such as lists, grouped lists, crosstabs and charts, as
well as non-BI components such as images, logos etc.
IBM Cognos BI can report from a wide variety of data sources, both relational and dimensional. Database
connections are created in the Web administration interface, and are used for modeling, for authoring, and
for running the application.
To use data for authoring and viewing, the business intelligence studios need a subset of a model of the
metadata (called a package). The metadata may need extensive modeling in Framework Manager.
Reports, scorecards, analysis, dashboards and more are created in the business intelligence studios of
IBM Cognos BI. Which studio you use depends on the content, lifespan, and audience of the report, and
whether the data is modeled dimensionally or relationally. For example, self-service reporting and
analysis are done through Query Studio and Analysis Studio, and scheduled reports are created in
Report Studio.
Report Studio reports and scorecards are usually prepared for a wider audience, published to IBM Cognos
Connection or another portal, and scheduled there for bursting, distribution, and so on. You can also use
Report Studio to prepare report templates.
You deliver content from the IBM Cognos portal or other supported portals, and view information that has
been saved to portals, or delivered by other mechanisms. You can also run reports, analyses, scorecards,
and more from within the business intelligence studio in which they were created.
Content pane
Source Tab
Data Item Tab
Toolbox Tab
Properties window
Report Window
Explorer Bar
Report Studio Tool bar
This window consists of three tabs, and their contents are used to create or modify reports.
Source Tab: Contains the selected Package and its content. The Package consists of data items from the
meta-model like query subjects, query items, predefined calculations and filters. The user can navigate the
contents of the package similar to an explorer to identify the required data components for their reports.
Data item tab: Contains the queries that are used by the report in development. This tab provides
navigation among the report components included in building the report.
Tool box tab: Contains report object components that are made available by the BI tool to build a report.
Examples are text, picture, table, list, chart objects and so on.
Properties Window: Lists the customizable properties of a selected report object. The options in this
window allow the user to control the appearance and behavior for a given report object within the report.
Examples are manipulating properties like text source, position, color, border etc. for a text item component
in a report.
Properties Pane
The Properties pane lists the properties that you can set for a report object. Obtain additional information
about a property by selecting it and pressing F1. For example, you can view the list of objects that use
each property. When you specify a value for a property, press Enter, click another property, or save the
report to ensure that the value is saved.
Tip: To view a description of the currently selected property at the bottom of the pane, from the View menu,
click Property Descriptions.
Report Window
This window is the canvas where the report is actually created with data items and layout formats by the
user.
The Explorer Bar works in conjunction with the Report Window while a report is being built.
Page Explorer: Used to work on the report layout graphically by navigating the different layers of a report.
Title pages, report pages and prompt pages are the basic layout layers that a user can navigate from this
tab while creating a report.
Query Explorer: Used to understand and customize the SQL of the report in development.
Condition Explorer: Used to view and customize the conditions that are included in the report.
Layout
A layout is a set of pages that defines the appearance and formatting of a report. When you design the
layout of a report, you:
present the data in a meaningful way by using lists, crosstabs, charts, and maps
add formatting, such as borders, color, images, and page numbers
specify how the data flows from one page to the next
Pages
Pages are containers for the layout objects that you use to build a report. A page is made up of the
following mandatory and optional components:
page header (optional)
page body (mandatory)
page footer (optional)
When you run a report, the amount of data queried often exceeds one page. As a result, a page will repeat
until all the data is shown. You have control over how data flows from one page to the next. For example,
here are alternative representations of a report that contains a chart and a lengthy list.
1. The chart appears on its own page. The list begins on the next page and fills subsequent pages
until all rows appear.
2. The chart and the first few rows of the list appear on the first page. The rest of the data in the list
appears on the subsequent pages.
Objects
You add layout objects to a page when you create a report. Below are objects that you will use often when
building reports in IBM Cognos Report Studio:
list
crosstab
Add a crosstab to show data in a grid with dimensions along the rows and columns and measures in the
cells or intersection points.
chart
map
repeater
Add a repeater to show each instance of a certain column or data item in a separate frame.
text
block
Add a block to hold text or other information. Blocks are often used to lay out horizontal bands of
information.
table
In addition to text and images, the Toolbox tab contains other objects that you can add to the report layout.
Queries
Queries determine what data items appear in the report. Sometimes you want detailed rows of data, which
you obtain by using a simple SELECT statement. Other times you must calculate totals or averages using
summary functions and grouped columns or must apply filters to show only the data you want.
IBM Cognos Report Studio automatically creates the queries you need as you build reports. However, you
can modify these queries or create your own custom queries to get the results you want.
Queries specify what data appears in the report. In IBM Cognos Report Studio, queries can be created
and modified using the Query Explorer, on the Explorer bar.
You can find them in the Public Folders tab in IBM Cognos Connection.
To make designing examples faster, especially financial examples, some general information about The
Great Outdoors Company is useful.
Revenue for The Great Outdoors Company comes from corporate stores and from franchise operations.
The revenues are consolidated from the wholly-owned subsidiaries. There are six distinct organizations,
each with its own departments and sales branches. Five of these are regionally-based companies. The
sixth, GO Accessories:
has its own collection of products, differentiated from the other GO companies by brand, name,
price, color and size
sells from a single branch to all regions and retailers
functions both as an operating company based in Geneva, and as a part owner of the three GO
subsidiaries in Europe.
The diagram below illustrates the consolidated corporate structure, including the percentage changes in
ownership for GO Central Europe, and shows the reporting currency and GL prefix for each subsidiary.
Each corporation has the same departmental structure and the same GL structure, shown in the table
below. Divisions may not report in the same currencies. For example, the Americas subsidiary reports in
US dollars, but the Corporate division local currency is Canadian dollars, and the Operations division local
currency is pesos.
Each corporation has a complete chart of accounts. Most of the accounts, such as those under
non-personnel expenses, are at the department level and contain only summary amounts.
For example, although each marketing department has expenses, the cost is unspecified at the transaction
level where marketing promotions occur.
Employees
The Great Outdoors data contains a full list of employees in all divisions, departments, and locations.
Data is available for reports about bonuses (Global Bonus report) and sales commissions (Sales
Commissions for Central Europe report), training (Employee Training by Year report), and performance
reviews and employee satisfaction surveys (Employee Satisfaction).
In the GO Data Warehouse (analysis) package, groups of measures and the related dimensions are
organized into folders. The employees are organized in hierarchies for region, manager, position or
department, and organization, to make different kinds of aggregation easy to report on.
Aggregation has been defined for the Employee Position Summary measures, so that Position count and
Planned position count aggregate correctly at each level of time: monthly, quarterly, or yearly. For
example, see the Planned Headcount report.
Data about sales and marketing is available for all of the companies in the Great Outdoors group. GO
Accessories has richer details to support analysis examples. For example, see the Revenue vs % Gross
Profit by Product Brand analysis, based on the Sales and Marketing cube. Marketing and sales campaigns
are tied to the Great Outdoors regional companies.
Overall, the GO companies have experienced solid growth across most product lines (Sales Growth Year
Over Year), in all regions (Revenue by GO Subsidiary 2005), because of factors like an increase in repeat
business and new or improved products, such as the high margin sunglasses product line.
In the product lines sold by the five regional companies (all but GO Accessories) promotions have had
mixed success (Promotion Success by Campaign, Bundle and Quarter).
Sales Outlets
Revenue from the corporate outlets is available at the transaction level. Revenue from the franchise outlets
is available at the consolidated level only (Sales and Marketing cube). Metrics about retailers show that the
number of new retail outlets has dropped over the time period covered by this data.
GO Accessories sells worldwide, and sells only accessories. Transaction data for GO Accessories is the
primary source for analysis of product by brand, color and size. The other five subsidiaries in the group of
companies are regional and sell all product lines for retailers in their region.
For example, the report Top 10 Retailers in 2005 uses sparklines and list data to review revenues at the
retailer level.
Answering some of these questions involves working with other Cognos BI components. For example,
Framework Manager is the metadata tool used to model your enterprise data. Models created in
Framework Manager provide the data items that you use in reports.
Creating a report
When you create a report, you are actually creating a report specification. The report specification defines
the queries and prompts that are used to retrieve data and the layouts and styles used to present the data.
For simplicity, the report specification is given the same name as the report.
Specify the package that will provide items for the report. The packages that you use to generate reports
are based on models that are created in the modeling tool, IBM Cognos Framework Manager.
A model is a set of related objects, such as query subjects, dimensions, filters, and calculations. When you
open a package in IBM Cognos Business Intelligence, these model objects are visible in the left frame.
The package must be previously created and published to the IBM Cognos Connection portal.
Steps
1. Open IBM Cognos Report Studio with the package you want to use.
2. In the Welcome dialog box, choose whether to open a new or existing report or template:
To create a new report or template, click Create a new report or template and choose a basic
report layout.
Language option: You can specify a language other than the default language for your package by clicking
on the ellipsis (...) button to the right of the Package field, clicking the Language ellipsis button in the
Report Package dialog box, highlighting the desired language, and then clicking OK.
As stated on the language selection dialog box, items such as separators and decimals may need to be
manually updated for language-specific syntax rules as a result of your selection.
To open an existing report or template, click Open an existing report or template and select a report.
Results
Objects from the selected package, such as query items, appear on the Source tab.
Refresh the Package
If the package that a report is using has changed, refresh it to ensure that you are working with the latest
version.
Procedure
On the Source tab, click the refresh button.
Tip: You can also close and reopen the report to upgrade it to the latest version of the package.
IBM Cognos Report Studio includes several basic report layouts that include report objects, such as lists,
crosstabs, maps, repeaters, and headers and footers. You can also choose to start with a blank report or
open an existing report.
Tip: The Budget vs. Actual sample report in the GO Data Warehouse (analysis) package is based on a
basic report layout.
Steps
2. If you want to change the package, click the ellipsis (...) button and click a different package.
Tip: In the Type box, click Templates to see only existing templates.
Results
The basic report layout appears in the report page.
Add Data
Select the data items you want to appear in your report.
Procedure
From the Source tab, drag data items to the report object. A flashing black bar indicates where you can
drop a data item. Data items in the report appear on the Data Items tab.
Data Source Icons
Each object in the data source has a representative icon. You can insert all of the following objects in a
report, except for packages and dimensions.
In relational data sources, query item, which represents a column of qualitative data in
the database, such as product name or country or region.
A member is a unique item within a hierarchy. For example, Camping Equipment and 4
Man tent are members of the Products Hierarchy.
Dimension, which represents a broad grouping of descriptive data about a major aspect
of a business, such as products, dates, or markets.
Level, which is a set of members that have common attributes. For example, a
geographical dimension might contain levels for country or region or city.
Multiple levels can exist within a level hierarchy, beginning with the root level. The root
level is the parent and rollup of all members in the first level. It is used to obtain a rollup
of all values across the hierarchy and to provide a convenient point to start drilling.
For example, a Years level hierarchy may contain the following levels:
Model Filter
Model Calculation
Folder, used to organize data items. You cannot import a folder into your report.
You can insert a single data item anywhere in your report using the singleton object. The singleton object
retrieves only the first row value for that query. Inserting a single data item is useful when you want to show
a value that is independent from the rest of the values in the report or when you want to insert some
boilerplate text, such as a company name and address. For example, you can add the total revenue value
in the header of each page in a report.
You can associate multiple singleton objects with a single query in relational reporting and dimensional
reporting to optimize performance, such as when all the data items in the singleton are from the same
database table. In addition, two or more singletons can reference data items from the same query. This is
useful when using a single query is more efficient to display a set of single values than using multiple
queries.
You can also filter the data item in the singleton. For example, you can show the total revenue for only the
year 2007.
Queries that are associated to a singleton object are not supported when producing report output in
delimited text (CSV) format.
Tip: The Returns by Damage, Failed Orders and Complaints in 2006 sample report in the GO Data
Warehouse (analysis) package includes a singleton.
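The 2007 example above would typically be implemented as a filter on the singleton's query. As an illustrative sketch (the item names are hypothetical and depend on your package), the filter expression might look like:

```
[Sales (query)].[Time dimension].[Year] = 2007
```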
Steps
2. From the Source tab, drag a data item into the Singleton container.
Tip: To create a singleton, you can also drag a data item anywhere in your report layout.
3. To change the query associated with the singleton object, in the Properties pane, double-click the Query
property and make changes.
Results
When the report is run, the first row value for the data item is retrieved.
Steps
A message box appears indicating whether any errors were found in the report.
2. If you require more detail from the validation process, from the Tools menu, click Validate Options and
do the following:
The Interactive data check box controls how queries will be processed during validation.
Clear the Interactive data check box to set the Execution Optimization property to All Rows.
A message box appears indicating whether any errors were found in the report.
Revalidate your report.
3. If you encounter validation errors and want IBM Cognos Report Studio to identify incorrect objects in
your report, from the Tools menu, click Auto Correct.
Report Studio provides a list of such objects that you can remove individually to successfully run your
report.
In some cases, the information or error message is linked to the location of the issue in your report. To go
to the location of the issue, click the message, and then click Select. If only warnings and information
appear in the dialog box, these will disappear when you click OK.
Save your report to preserve the modifications you made. Reports are saved to the IBM Cognos Business
Intelligence server. You can also save your report on your computer.
Steps
1. From the File menu, click Save or click Save As to save a copy of the report under a different name.
2. If you are saving the report for the first time, specify where to save the report and type a file name.
3. Click Save.
Run your report to see the data that is retrieved. Save time by validating it first to check for errors.
You can also run a report or a group of reports in IBM Cognos Connection. Furthermore, you can
save report outputs to a file system.
You can specify not to render a report page if it does not contain any data.
When you run a report in an export format such as PDF, delimited text (CSV), or Microsoft Excel
spreadsheet software (XLS), the IBM Cognos report name is used as the exported file name.
If you run a report that uses functions or features not supported by the data source, an error
message appears. We recommend that you periodically test your reports while you author them in
IBM Cognos Report Studio to ensure that you do not encounter multiple error messages when you
run the report.
Steps
1. Open a report.
2. If you want to clear parameter values stored on the IBM Cognos Business Intelligence server, from the
File menu, click Clear Parameter Values.
Parameter values stored by the IBM Cognos BI server include signon, validation, and prompt information.
For example, if you define two data source connections in IBM Cognos Connection that point to the same
data source, you are prompted to choose one when you run a report. This information is stored so that you
are not prompted each time you run the report.
3. If you want to view only the tabular data, from the Run menu, click View Tabular Data. If the report
contains multiple queries, you must first click an object, such as a list or crosstab, that uses the query for
which you want to view the tabular data.
Use this command to ensure that the right results appear. For example, you create a calculation and you
want to ensure it is giving you the results you want.
Tip: You can also view tabular data in Query Explorer, which is useful when you are building queries.
4. If you want to set run options, from the Run menu, click Run Options.
The default value is the value of the selected corresponding run option in IBM Cognos Connection.
Note: The run options that you set apply only to the current session. When you close Report Studio, the
options return to the default settings.
6. From the Run menu, click one of the options to produce the report in the format you want.
You can produce a report in HTML, PDF, CSV, various Microsoft Excel formats, and XML. You cannot
produce a report in CSV format if you have more than one query defined in the report unless the additional
queries are used for prompts.
Results
The report runs in IBM Cognos Viewer. Once the report has finished running, you can run the report again
in the same format or in a different format. If you run the report again in CSV or XLS format, the report will
appear in a new browser window.
The options available in IBM Cognos Viewer depend on the capabilities set by the administrator for each
user.
Query subject
Query item
Measure
Dimension
Hierarchy
Level
Property/attribute
Top node member
Member
Level item
Steps
Right-click the data item and click Glossary.
Results
The IBM Cognos Business Glossary appears.
Use list reports to show detailed information from your database, such as product lists and customer lists.
A list report is a report that shows data in rows and columns. Each column shows all the values for a data
item in the database or a calculation based on data items in the database.
Format lists to give them the appearance you want. You can specify formatting for individual column titles
or bodies or for all columns in a list. When you specify formatting for all columns in a list, the formatting is
automatically applied to new columns you subsequently add.
Steps
To format the text in the column title or the data in the column body, click the unlock button in the
toolbar and then click the title or body.
To format the whole column, click the select ancestor button in the title bar of the Properties pane
and click List Column.
For example, to specify a background color, click Background Color and choose the color.
To format list column bodies, click the select ancestor button in the title bar of the Properties pane
and click List Columns Body Style.
To format list column titles, click the select ancestor button in the title bar of the Properties pane
and click List Columns Title Style.
To format entire columns, click the select ancestor button in the title bar of the Properties pane and
click List Columns.
To format the entire list, click the select ancestor button in the title bar of the Properties pane and
click List.
For example, to specify a background color, click Background Color and choose the color.
Formatting for specific columns overrides formatting for entire columns. For example, you specify red as
the background color for a specific column body and green as the background color for all columns. When
you run the report, the specific column body is red and the remaining columns in the list are green. New
columns added to the list will be green.
In list reports, we recommend that you avoid using set expressions. In a list, set expressions, such
as topCount, may produce fewer rows than in the corresponding crosstab.
For example, the following list report includes Year in the first column, followed by a column
containing an expression that returns the top three months by revenue. Only three rows appear in the
report and they correspond to the top three months across all years. If you group by Year, the report still
shows only three months. However, in the corresponding crosstab report, you see three months for each
year.
In crosstab reports in IBM Cognos 8, set evaluation is always done in the context of what the set is nested
under in the report. However, in list reports set evaluation is done independently of the grouping. For
example, if the first column contains Country instead of Year, you see the top three months for each
country (across all years) in both cases.
In the case of different dimensions, you can force context independence by replacing [Revenue] in the
topCount expression with tuple ([Revenue], X), where X is the default member of the hierarchy that
contains Month.
However, for nested levels in the same hierarchy, there is no such workaround at this time.
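The workaround described above can be sketched as a report expression. Assuming hypothetical data items, and with X standing for the default member of the hierarchy that contains Month, the column expression would change from:

```
topCount([Month], 3, [Revenue])
```

to:

```
topCount([Month], 3, tuple([Revenue], X))
```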
Use repeaters to repeat items when you run the report. For example, you can use repeaters to create
mailing labels, including customer names and addresses.
To build a repeater, drag the Repeater or Repeater Table object from the Toolbox tab to the work area.
Use repeaters to repeat items across a single row without a particular structure. For example, you want to
create a list that contains Year and Product line. For each year, you want all product lines to appear in a
single row. To do this, create a list with Year as a column and with a repeater as a second column. Then
insert Product line into the repeater. Use repeater tables to repeat items in a table structure. Drop the items
in the repeater, and modify the properties of the repeater to obtain the results you want. For example, you
can specify how many frames appear per page in a repeater table by typing values in the Across and
Down properties.
You can convert a list into a repeater table to take advantage of an existing list.
You are a report author at The Great Outdoors Company, which sells sporting equipment. You are
requested to create mailing labels for all of the company's retailers.
Steps
3. In the New dialog box, click Repeater Table and click OK.
4. Click the repeater, click the select ancestor button in the Properties pane title bar, and click Repeater
Table.
6. In the Insertable Objects pane, on the Toolbox tab, drag the Table object to the repeater.
8. Click the table, ensure that you see Table Cell in the Properties pane title bar, and modify the following
properties:
Double-click Background Image, click Browse, click logo.jpg, and click OK.
In the Background Image dialog box, under Position, click the align top right button.
Under Tiling, click Do not tile and click OK.
Double-click Size & Overflow, and in the Height box, type 175, and click OK.
9. Click the table, click the select ancestor button in the Properties pane title bar, and click Table.
Double-click Border.
In the Style box, click Solid line.
In the Width box, click 1 pt.
In the Color box, click Black.
Under Preview, click the apply all borders button and click OK.
Double-click Font, and in the Size box, click 8pt, and click OK.
11. In the Insertable Objects pane, on the Toolbox tab, drag the Block object to the repeater 8 times to
create 8 blocks.
13. Click the first block, and, in the Properties pane, specify properties for the first block:
Double-click the Padding property, type 35 in the box on the right, click mm as the unit, and click
OK.
Set the Horizontal Alignment property to Center.
14. In the Insertable Objects pane, on the Source tab, expand Sales and Marketing (query), Sales (query),
and Retailer site and drag seven data items to the remaining seven blocks.
15. Ctrl+click the seven blocks to select them, and in the Properties pane, specify properties for the seven
blocks:
Double-click the Padding property, type 25 in the box on the left, click mm as the unit, and click
OK.
Set the Horizontal Alignment property to Left.
Results
When you run the report, each page contains 10 mailing labels in two columns.
Crosstabs
Use crosstab reports to show information in a more compact form than in a grouped list.
For example, create a crosstab report to show total sales by product line generated by each sales
representative.
Like list reports, crosstab reports are reports that show data in rows and columns. However, the values at
the intersection points of rows and columns show summarized information rather than detailed information.
F. Indenting data
When you add data items to crosstabs, you create crosstab nodes and crosstab node members. These
objects allow you to easily create crosstabs by dragging data items to them.
Crosstabs are dimensional objects that have row edges and column edges. Each edge is composed of a
set of crosstab nodes. Each crosstab node contains the following:
One or more crosstab node members.
Zero or one nested crosstab node, which contains one or more crosstab node members or nested
crosstab nodes.
Each crosstab node member refers to a data item that contains an expression to define the members in the
crosstab.
Crosstab node 1 contains a single node member for the total. This node refers to the data item
Total(Product line).
Crosstab node 2 contains a crosstab node member that refers to the data item Product line. This member
has a nested crosstab node containing a crosstab node member that refers to the data item Product type.
Crosstab node 3 contains a single node member for the average. This node refers to the data item
Average(Product line).
Crosstab node 4 contains a crosstab node member that refers to the data item Order year. This member
has two nested crosstab nodes. The first node contains a crosstab node member that refers to the data
item Order month. The second node contains a crosstab node member for the total. This node refers to the
data item Total(Order month).
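The node structure described above can be modeled as a small sketch. This is not IBM Cognos internals, just an illustrative data structure showing how each edge is a list of crosstab nodes, each node holds members, and each member refers to a data item and may carry nested nodes:

```python
# Illustrative sketch only (not IBM Cognos internals): model an edge as a
# list of crosstab nodes, each holding members that refer to data items
# and may carry nested crosstab nodes.
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Member:
    data_item: str                       # expression defining the members
    nested: List["Node"] = field(default_factory=list)


@dataclass
class Node:
    members: List[Member]


def paths(node: Node, prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield the chain of data items behind each innermost heading."""
    for m in node.members:
        p = prefix + (m.data_item,)
        if m.nested:
            for child in m.nested:
                yield from paths(child, p)
        else:
            yield p


# Row edge from the example: nodes 1 to 3
row_edge = [
    Node([Member("Total(Product line)")]),                        # node 1
    Node([Member("Product line",
                 nested=[Node([Member("Product type")])])]),      # node 2
    Node([Member("Average(Product line)")]),                      # node 3
]

# Column edge: node 4, with two nodes nested under Order year
col_edge = [
    Node([Member("Order year",
                 nested=[Node([Member("Order month")]),
                         Node([Member("Total(Order month)")])])]),
]

print(list(paths(col_edge[0])))
# [('Order year', 'Order month'), ('Order year', 'Total(Order month)')]
```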
Tip: Nodes are also created when you add data items to charts.
Crosstab nodes can be placed anywhere in the crosstab. For example, in the previous diagram, you can
drag Order month under Average(Product line) to create a row edge.
To move nested items from one edge to another, ensure that you select the crosstab node and not the
crosstab node member. For example, in the previous diagram, you want to move Product line and Product
type to the column edge. To do this, click Product line and, in the Properties pane, click the select ancestor
button and click Crosstab Node. Both Product line and Product type are selected.
Create a single-edge crosstab report to show data in a list-like form. For example, to show the quantity of
products sold for each year and for each order method, you could create a crosstab with Order Year and
Order Method as rows and Quantity as the measure.
Steps
3. In the Insertable Objects pane, on the Source tab, drag data items to the Rows or Columns drop zone.
A black bar indicates where you can drop the data item.
If you dragged the data item in step 3 to the Rows drop zone, drag the additional items above or
below the first item.
If you dragged the data item in step 3 to the Columns drop zone, drag the additional items to the left
or right of the first item.
5. To add measures to the crosstab, drag the measures to the Measures drop zone.
Results
When you run the report, a crosstab is produced that has only one edge.
Nest data in a crosstab report to compare information by using more than one data item in a column or
row. For example, a report shows the number of sales by product line for the past fiscal year.
You decide to add a data item to further break down the number of sales by quarter.
When nesting columns in a crosstab report, there are four distinct drop zones where you can insert a new
data item. The drop zone you choose will define the relationship between the data item and the column.
Rows
The following relationships are created when you insert a data item as a row:
Inserting a data item to the left or right of a column creates a parent-child relationship between
them.
When you insert a data item to the left of a column, the data item becomes a parent to the column. When
you insert a data item to the right of a column, the data item becomes a child of the column.
Inserting a data item above or below a column creates a union relationship between them.
Columns
The following relationships are created when you insert a data item as a column:
Inserting a data item to the left or right of a column creates a union relationship between them.
Inserting a data item above or below a column creates a parent-child relationship between them.
When you insert a data item above a column, the data item becomes a parent to the column. When you
insert a data item below a column, the data item becomes a child of the column.
For example, you have a crosstab with Product line as rows and Quantity and Revenue as nested rows.
For columns, you have Order method with Country as a nested column. In this crosstab,
Steps
1. In the Insertable Objects pane, on the Source tab, click the data item to add.
2. Drag the data item to the report as a nested column or nested row.
A black bar indicates where you can drop the data item.
Tip: If you add more than one measure to a crosstab, all measures appear as columns. You cannot have
one measure appear as a row and another as a column. To make all measures appear as rows, swap
columns and rows.
You can specify the default measure for crosstabs and certain chart types. Specify the default measure to
be used when the measures cannot be determined by what is on the edges. For example, you create a
crosstab with Order method as rows and Product line as columns. You add Quantity and Revenue as
nested rows, making Order method their parent. You then add Country under Order method. Since there is
no measure specified for Country, you specify the default measure so that data is returned for each
country.
In crosstabs, Report Studio automatically sets the default measure when you insert a measure into the
crosstab cells.
Steps
Click any part of the crosstab, and then click the select ancestor button in the title bar of the
Properties pane.
Click Crosstab.
Set the Default Measure property to the default measure.
2. To specify the default measure for a chart, drag the measure to the Default measure box in the chart.
Swap columns and rows to look at information from a different perspective. This may help you discover
high and low points in the data that you hadn't previously noted.
Steps
From the toolbar, click the swap rows and columns button .
Results
In the report, the rows become the columns and the columns become the rows.
You can indent crosstab node members to set them apart from surrounding data.
When you insert a hierarchy in rows, all members are automatically indented according to their level. By
default, the first member in a set is not indented. If you insert a hierarchy in columns, the members are not
automatically indented. You can change the indentation properties of members in the report.
Relative indentation means that the member will shift by one tab when the member's level increases
relative to the previous member in the hierarchy.
Indenting based on the level in the hierarchy means that the member will shift the number of tabs
equivalent to the level in the hierarchy.
Steps
2. In the Properties pane, set the Level Indentation property to your indentation type, length, and direction.
Change a list into a crosstab to view your data from a different perspective.
Steps
Results
The list becomes a crosstab with the columns you selected in step 2 appearing as columns and nested
columns. The unselected columns, except for measures, appear as rows and nested rows. If you have one
measure, it becomes the cells of the crosstab. If you have more than one measure, they appear as
columns.
Tip: To make all measures appear as rows, swap columns and rows.
Example 1:
You are a report author at The Great Outdoors Company, which sells sporting equipment. You are
requested to create a report showing sales by order method to determine which methods are generating
the most revenue and the highest sales volume.
Steps
Click Product type and drag it just to the right of Product line.
Expand Sales fact and drag Quantity to the Measures drop zone to add it as a measure.
Tip: Drop Revenue into the crosstab when you see a vertical bar between the Product type column and the
Order method column.
5. Click any part of the crosstab, and then click the select ancestor button in the title bar of the Properties
pane.
6. Click Crosstab.
Results
When you run the report, you can see that for camping equipment, Cooking Gear generated the highest
sales volume for the Special order method and Tents generated the most revenue. Tents ordered by the
Web order method generated the largest revenue.
Example 2
You are a report author at The Great Outdoors Company, which sells sporting equipment. You are
requested to create a report showing sales for each product line by quarter and by order method. Since the
report will have columns with data from different dimensions, you create a discontinuous crosstab report.
Discontinuous crosstabs are also known as disconnected or disjoint crosstabs or crosstabs with unrelated
columns.
Steps
Expand Order method and drag Order method to the Columns area.
5. Pause the pointer over the query explorer button and click Query1.
6. On the Toolbox tab, drag Data Item to the Data Items pane.
7. In the Expression Definition box, type the following and click OK:
9. Pause the pointer over the page explorer button and click Page1.
10. On the Source tab, expand Retailer site and drag Region to the left of Order method.
Tip: Drop Region into the crosstab when a flashing vertical bar appears between the Product line column
and the Order method column. Otherwise, Region may appear as a nested row instead of a column.
13. From the Data Items pane, drag Region to the Sort List pane.
14. Click the sort order button to sort quarters in ascending order and click OK.
Charts
You can use IBM Cognos Report Studio to create many chart types, such as column, bar, area, and line
charts.
The following shows the most common chart objects as they appear in your report output in IBM Cognos
Viewer.
You can click in different areas of the chart to get access to different properties.
You can set the color, style, weight, and transparency of chart axis lines, and specify where to display
major and minor gridlines on the chart.
Data Series
A data series is a group of related data points that are plotted in a chart. Each series has a unique color or
pattern and is described in the legend. You can plot one or more data series in a chart; pie charts have
only one data series.
Categories
Categories are groups of related data from the data series that are plotted on the X-axis. Categories of
multiple data series are shown together using clustered and stacked data markers.
Axes
The primary axis (or Y-axis) refers to measures of quantitative data, such as sales figures or quantities.
Charts can have more than one primary axis.
The category axis (X-axis or ordinal axis) plots qualitative data, such as products or regions. It runs
horizontally, except in bar charts.
Major gridlines extend from the tick marks on an axis and run behind the data markers.
Legend
A legend is a key to the patterns or colors assigned to the data series or categories in a chart.
Charts use graphical elements such as columns, horizontal bars, points, bubbles, lines, and areas as
visual representations of data points.
Before creating charts, review the available chart types to select the best chart for your needs. Also review
the chart objects that make up charts.
Tip: To view the available chart types, from the File menu, click New, and then double-click the chart icon.
You can also view the available chart types by adding a Chart object from the Toolbox tab in the Insertable
Objects pane to an existing report.
To create charts, drag data items from the Source tab to the Measures, Data series, and Categories drop
zones.
Steps
4. Drag product lines to the Categories drop zone of the chart.
Results
The chart now displays a clustered bar chart, which shows The Great Outdoors Company's revenue for four
years by product line.
In the example chart, the data series are order years 2004, 2005, 2006, and 2007.
In the example chart, the categories are the product lines of The Sample Outdoors Company in clustered
columns.
The following shows the same chart as it appears in the IBM Cognos Report Studio user interface. The Y-
axis is selected, and the axis titles are displayed.
More customizations
As you author your chart and edit the chart properties, IBM Cognos Report Studio provides you with a
preview of your chart, using simulated data. This allows you to view your style changes without running
your report. You need to run your report to see the chart with your actual data. You can disable the chart
preview in the options.
After you create a basic chart, modify the various chart objects to customize your chart.
Experiment with the other chart types, namely pie, column, bar, line, area, radar, and bubble charts.
Grouping Data
Group data items in a list report to remove duplicate values.
For example, you have a report that shows products purchased. For each product, the product type is also
shown. You group the Product type column to show only one instance of each product type in the list.
Tip: The Succession Report sample report in the GO Data Warehouse (analysis) package includes
grouping.
You can also group data items in repeaters and in page sets.
In addition to grouping, you can specify the sort order of data items. In lists, you can sort data items within
groups as well as ungrouped items. In crosstabs, you can sort row and column items.
Steps
1. Click the column on which to group. You can click either the column heading or one of the column cells.
A symbol appears indicating that the column is grouped. You can control when its values appear by setting
the group span, and the column appears in the list of headers and footers that you can add to the report.
You can perform more advanced groupings in a list to get the results you want. For example, you can:
view the entire grouping structure and make changes to the grouping order
group a data item that appears in a query but not in the layout
specify a list of properties for a group
Steps
2. Click the select ancestor button in the title bar of the Properties pane and click List.
4. To change the grouping order, in the Groups pane, click the grouped item in the Groups folder and drag
it to the grouping hierarchy.
5. To create a new grouping, in the Data Items pane, click the data item and drag it to the Groups folder in
the Groups pane.
Tip: Click overall to specify a list of data items for the overall group. The overall group is the grouping of all
data into one group. You can specify a list of data items for the overall group to add overall totals to the list.
When columns are grouped, you can choose how often to show column names by changing the group
spanning. For example, when country and city are both grouped, you can choose to show the country
name each time the city changes.
Spanning one grouped column by another column is helpful if the second column contains many items.
Steps
2. In the Properties pane, set the Group Span property to the column to span.
You are a report author at The Great Outdoors Company, which sells sporting equipment. You are
requested to create a report that lists all product orders organized by order number.
To make the report easy to read, you group the Order number column so that only one instance of each
order number appears. However, because each order contains different products, the same order date
appears for each product. You decide to show the order date only when the order number changes.
Steps
4. In the Insertable Objects pane, on the Source tab, expand Sales and Marketing (query) and Sales
(query).
8. Expand Sales fact and add Quantity and Revenue by double-clicking them.
11. In the Properties pane, set the Group Span property to Order number.
Results
When you run the report, the first row of the Order date column appears for each Order number row.
You are a report author at The Great Outdoors Company, which sells sporting equipment. You are
requested to create a report that lists all products sold by the company, organized by the product line and
product type.
To make the report easier to read, you group the Product line and Product type columns so that only one
instance of each column appears. However, because some product lines contain many different product
types and products, you decide to show the product line for each product type.
Steps
4. In the Insertable Objects pane, on the Source tab, expand Sales and Marketing (query).
5. Expand Sales (query) and Product and add the following data items by double-clicking them:
Product line
Product type
Product name
9. In the Properties pane, set the Group Span property to Product type.
10. Click any part of the list and then click the select ancestor button in the title bar of the Properties pane.
Results
When you run the report, the product line appears whenever the product type changes.
Create sections in a report to show a data item as the heading of a section. When you run the report,
separate sections appear for each value.
Creating sections is similar to creating headers by grouping on a data item. The difference is that section
headers appear outside the list, crosstab, chart, or repeater. In addition, you can group data items only in
lists.
Steps
If the column is in a list or repeater, this menu option will create sections without creating a master detail
relationship. This can improve performance when running the report. If the column is in a crosstab or chart,
this menu option will create sections using a master detail relationship.
Sections are created, and a section header appears. The data container that contains the column used to
create sections is embedded in a list.
3. To add or remove section headers and footers, from the Structure menu, click Headers & Footers, click
List Headers & Footers, and select or clear the appropriate check boxes. The section header and footer
check boxes appear under Overall header.
Remove Sections
You can remove sections and reinsert the data item that was used to create the sections in the data
container.
Steps
2. Expand the page containing the data container with the section.
3. Expand Page Body until you see the data container in which you added the section.
The data container is nested in the List Column Body object of the List object that is created when you add
a section.
8. Drag the data item that was used as a section header back into the data container.
9. If you created sections in a list without creating a master detail relationship, click the data item, and then
click the group/ungroup button to ungroup the item.
In a report, you can add both detail and summary aggregation. Detail aggregation, which is supported only
for relational data sources, specifies how a data item is totaled at the lowest level in a report.
In lists, detail aggregation specifies how the values that appear in the rows are totaled.
In crosstabs, detail aggregation specifies how the values in the cells are totaled.
For example, detail aggregation for a measure like Revenue might be Total in both lists and crosstabs. In
the following list report, this means that the values you see for the Revenue column represent the total
revenue for each product type.
Summary aggregation, which is supported for all data sources, specifies how data items are totaled in the
headers and footers of a list and in the total rows and columns of a crosstab.
For list reports, these summary aggregates only summarize the data that is visible on that page of the
report.
You can specify detail and summary aggregation in different ways by using any of the following:
How data is aggregated also depends on the type of data that you are aggregating. Aggregation rules are
applied differently to facts, identifiers, and attributes.
For example, if you aggregate a data item that represents part numbers, the only aggregate rules that
apply are count, count distinct, count non-zero, maximum, and minimum.
You can add simple summaries to the groups in a report by using the aggregate button. This button
provides a subset of the aggregate functions available in Report Studio. For list reports, a Custom option is
also available so that you can add your own aggregate function in the expression of the data item.
The aggregate button sets the rollup aggregate property (p. 158) for the data item to the selected summary
aggregate, and places the data item into an appropriate footer. A footer is created for each group as well
as an overall footer, unless they already exist.
Steps
In the Properties pane, under Text Source, set the Source Type property to the type of source that will
define the label.
For example, set it as Data Item Value to produce a dynamic label for the summary based on data item
values.
This property depends on the source type you chose. For example, if you chose Data Item Value as the
source type, set the Data Item Value property to the data item to use to define the label.
In lists, the summary appears as a footer. If the column to which you added a summary is grouped, group
and overall summaries appear.
Try outs
1. To change a summary, select it and, in the Properties pane, under Data Item, click Rollup
Aggregate Function and select a different function.
For example, you have a crosstab with Product line as rows, Order year as columns, and Revenue
as the measure.
For Product line, you can add the Total summary as a header, which will total all revenue for each
order year.
You can then add the Average summary as a footer, which will give the average revenue of all
product lines for each order year.
Set the Auto Group & Summarize query property to specify whether Report Studio should group non-fact
data items (identifiers and attributes) and apply aggregate functions to aggregate fact data items in lists.
If you are using an OLAP data source, data is always summarized regardless of how this property is set.
Steps
1. Pause the pointer over the query explorer button and click a query.
2. In the Properties pane, set the Auto Group & Summarize property:
To group non-fact data items and apply aggregate functions to aggregate fact data items in lists,
set this property to Yes.
To render detail rows, set this property to No.
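The effect of the Yes and No settings can be sketched with a small pandas analogy (illustrative only: the data and column names below are hypothetical, and in Report Studio the grouping and aggregation are performed by the query engine, not by client code):

```python
# Analogy for the Auto Group & Summarize query property.
import pandas as pd

# Hypothetical detail rows, one per order line.
detail = pd.DataFrame({
    "Product line": ["Camping", "Camping", "Golf"],
    "Quantity":     [10, 5, 7],
})

# Auto Group & Summarize = Yes: non-fact items are grouped and
# fact items are aggregated.
summarized = detail.groupby("Product line", as_index=False)["Quantity"].sum()

# Auto Group & Summarize = No: the detail rows are rendered as-is.
print(summarized)
print(detail)
```

With Yes, the list shows one row per product line (Camping 15, Golf 7); with No, it shows all three detail rows.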
When working with relational data sources and list reports, for each data item in a report, you can specify
detail and summary aggregation properties to manage summaries without having to create complex data
item expressions.
Steps
1. Click the data item for which to set detail or summary aggregation.
2. In the Properties pane, set the Aggregate Function or the Rollup Aggregate Function property to a
function.
You can use summary functions in data item expressions. The summary functions in the Expression Editor
that have the same name as the summary functions available through the Aggregate Function and Rollup
Aggregate Function properties operate the same way.
Average
Calculated
Count
Count distinct
Custom
Maximum
Minimum
etc.
For example, in a relational report, setting the Aggregate Function property to Total is the same as
changing the expression of the data item to total ([Revenue]).
In general, report maintenance is easier if the Aggregate Function and Rollup Aggregate Function
properties are used rather than adding aggregate functions to data item expressions.
Use summary functions in expressions if the required summary is not supported as an aggregate or rollup
aggregate property or if the complexity of the expression cannot be managed using the data item
properties.
Add an aggregate function to a data item expression if one of the following conditions applies:
The underlying data source is relational, and you want to use database vendor-specific aggregate
functions.
You want to use aggregate functions that require more than one parameter, such as percentile.
You require aggregate expressions that are not available in the aggregation properties, such as a ‘for’
clause.
Example
Your report uses Product line, Product type, and Quantity. You want to calculate the percentage of the
parent quantity that is represented by each product type. This requires you to compare the total quantity for
each product type to the total quantity for the parent product line.
In other words, your expression requires aggregates at different group levels. You use aggregate functions
in the data item expression with a for clause to specify the group level as follows:
total ([Quantity] for [Product type]) / total ([Quantity] for [Product line])
Tip: The Total Revenue by Country sample report in the GO Data Warehouse (query) package includes a
total aggregate function.
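The percentage-of-parent logic can be sketched in pandas (an analogy, not Cognos expression code; the data and column names are hypothetical), with a grouped `transform("sum")` playing the role of each `for` clause's group level:

```python
# Analogy for: total([Quantity] for [Product type]) /
#              total([Quantity] for [Product line])
import pandas as pd

rows = pd.DataFrame({
    "Product line": ["Camping", "Camping", "Golf"],
    "Product type": ["Tents", "Packs", "Putters"],
    "Quantity":     [30, 10, 20],
})

# Total at the product type level (the numerator's "for" clause).
type_total = rows.groupby("Product type")["Quantity"].transform("sum")

# Total at the parent product line level (the denominator's "for" clause).
line_total = rows.groupby("Product line")["Quantity"].transform("sum")

rows["Pct of line"] = type_total / line_total
```

Each product type's share of its parent line follows directly: Tents accounts for 30 of Camping's 40 units, or 0.75.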
Steps
2. In the Properties pane, double-click the Expression property and add aggregate functions to the
expression for the data item.
In Report Studio, you can use summary and detail filters, which are best suited to list reports in the
relational reporting style. Summary filters are applied to summaries, and detail filters are applied to details.
You can also use prompts, or prompt controls, to allow users to provide input for the filter.
Add a filter expression to focus a report and minimize processing time by excluding unwanted data.
For example, you can filter data to show customers who placed purchase orders that were valued at over
one thousand dollars during the past year. When you run the report, you see only the filtered data.
Detail filters are applied to the data source. They are Boolean expressions used to exclude database rows
based on non-aggregated values or to exclude lowest level groups based on leaf aggregates.
You can specify whether detail filters are applied before or after aggregation using the Application property.
Tip: The 2005 Sales Summary sample report in the GO Sales (analysis) package includes a detail filter.
Summary filters are Boolean expressions used to exclude specific groups based on the group aggregates.
They are based on summary calculations.
Steps
1. To add a filter that was created in the package, in the Insertable Objects pane, on the Source tab, drag
the filter to the report.
You must drag the filter to an object that is linked to a query. If there is more than one query defined in the
report, drag the filter to an object that is linked to the query you want to filter.
If you have more than one query defined in the report, you must first click an object that is linked to the
query you want to filter.
To add a filter that will apply to detail values, click the Detail Filters tab.
To add a filter that will apply to summary values, click the Summary Filters tab.
To add data items that are not shown in the report, on the Source tab, double-click data items.
To add data items that are in the report but not necessarily in the model (such as calculations), on
the Data Items tab, double-click data items.
To add data items from a specific query in the report, on the Queries tab, double-click data items.
To add functions, summaries, and operators, on the Functions tab, double-click elements.
To add a value that is derived from a parameter, on the Parameters tab, double-click the
parameter. Parameters define prompts, drill-through reports, and master detail relationships.
7. In the Usage box, specify whether the filter is required, optional, or not to be used.
8. If you created a detail filter, in the Application box, click one of the following options:
To apply the filter to individual records in the data source, click Before auto aggregation.
For example, for a specific product type within a product line, you want to filter out individual orders that
generated less than one thousand dollars in revenue.
To apply the filter to data after the query has grouped and summarized at the lowest level of detail,
click After auto aggregation.
For example, you want to filter out the product types that generated less than ten thousand dollars in
revenue within a product line.
For example, you have a list that contains the data items Product line, Product type, and Revenue. The
revenue values you see are aggregated to the product type level. If you create a detail filter on Revenue
and you choose to filter values before auto aggregation, you are filtering non-aggregated revenue values. If
you choose to filter values after auto aggregation, you are filtering revenue values aggregated to the
product type level.
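The difference between the two options can be sketched in pandas (a hypothetical analogy; in Report Studio the filtering is applied by the query engine, and the data and column names below are invented):

```python
# Analogy for detail filters applied before vs. after auto aggregation.
import pandas as pd

orders = pd.DataFrame({
    "Product type": ["Tents", "Tents", "Packs"],
    "Revenue":      [800, 1500, 400],
})

# Before auto aggregation: filter the individual records first,
# then group and summarize what remains.
before = (orders[orders["Revenue"] >= 1000]
          .groupby("Product type", as_index=False)["Revenue"].sum())

# After auto aggregation: group and summarize first,
# then filter the aggregated values.
after = orders.groupby("Product type", as_index=False)["Revenue"].sum()
after = after[after["Revenue"] >= 1000]
```

Before aggregation, the 800 order is dropped and Tents totals 1500; after aggregation, Tents totals 2300 and only the Packs group (400) is filtered out.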
9. If you created a summary filter, click the ellipsis (...) button under Scope and select the grouping level at
which to apply the filter.
For example, a revenue report is grouped on product line and product type. You can choose to filter total
revenue for either the product line or product type level.
If the query in which you are creating the summary filter is linked to more than one data container, the data
item that you select as the grouping level must be grouped in all data containers linked to the query.
Otherwise, the report will not run successfully.
Tip: To filter at the overall level, do not click a level. For example, if a report has an overall footer that
shows the total revenue for all products, by not choosing a level you will apply the filter to the footer.
In addition, if you are using a dimensional data source, excluding a parent level excludes its children, and
excluding all children excludes the parent.
Sorting Data
You can sort items to view them in your preferred order.
By default, Report Studio retrieves items in the order defined in the data source. Relational and
dimensionally-modeled relational data sources may not always have a defined order. The data modeler
defines the sorting options in the model.
You can sort items in a list in ascending or descending order based on a value or a caption, such as
revenue or employee name. You can also perform advanced sorting to sort columns within groups or to
sort a row or column using another data item.
Tip: The Sales Growth Year Over Year sample report in the GO Data Warehouse (analysis) package
includes sorting.
Steps
2. Click the sort button and click Sort Ascending or Sort Descending.
Tip: If you are sorting an extended data item, click the sort button, click Layout, and then click Sort
Ascending or Sort Descending.
An arrow appears beside the data item to indicate that a sort order was set.
When you specify a sort order for more than one column, the columns are sorted in the order in which they
were inserted in the report.
For example, you add columns A, B, and C to a report and specify a sort order for each. When you run the
report, column A is sorted first, then column B, and then column C. You can change the order in which the
columns are sorted in the Advanced Sorting options.
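The effect of insertion order on multi-column sorting can be sketched in pandas (hypothetical data; the order of the `by` list plays the role of the order in which the columns were inserted):

```python
# Analogy for sorting columns A, then B, then C.
import pandas as pd

df = pd.DataFrame({"A": [2, 1, 1], "B": [5, 9, 3], "C": [0, 1, 2]})

# Column A is sorted first; B breaks ties within A; C breaks ties within B.
sorted_df = df.sort_values(by=["A", "B", "C"]).reset_index(drop=True)
```

Reordering the `by` list corresponds to changing the column order in the Advanced Sorting options.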
If you use relational data, in a list, you can sort columns within groups and change the sort order of
columns. In a crosstab, you can sort a row or column by another item, such as Order year by Revenue.
Steps
1. Click a column.
2. From the Data menu, click Sort, and then click Advanced Sorting.
Tip: If you are sorting a set, click Sort, click Layout, and then click Advanced Sorting.
In the Groups pane, under the Groups folder, expand the folder of the grouped column.
In the Data Items pane, drag the data items to sort to the Sort List folder.
Tip: You can also drag data items from the Detail Sort List folder.
4. To change the sort order of columns, in the Groups pane, change the order of columns in the Sort List
folder of a group or in the Detail Sort List folder.
Tip: Add items to the Detail Sort List folder to sort items that are not groups.
From the Data Items pane, drag the item on which to sort to the Sort List pane.
For example, you have a crosstab with Product line as rows, Order year as columns, and Revenue as the
measure. To sort Order year by Revenue, drag Revenue to the Sort List pane.
Calculations
Create calculations to make a report more meaningful by deriving additional information from the data
source.
For example, you create an invoice report and you want to see the total sale amount for each product
ordered. Create a calculated column that multiplies the product price by the quantity ordered.
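As an analogy, the invoice calculation above might be sketched in pandas (the data and column names are hypothetical; in Report Studio the calculation would be a data item expression such as [Price] * [Quantity]):

```python
# Calculated column: product price multiplied by quantity ordered.
import pandas as pd

invoice = pd.DataFrame({
    "Product":  ["Tent", "Lantern"],
    "Price":    [250.0, 40.0],
    "Quantity": [2, 3],
})

invoice["Sale amount"] = invoice["Price"] * invoice["Quantity"]
```

The derived column gives the total sale amount for each product ordered.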
For relational data sources, if an expression is used in multiple reports or by different report authors, ask
your modeler to create the expression as a standalone object in the model and include it in the relevant
package.
You can add calculations to lists, crosstabs, and all other data containers. You can also add calculations
directly to a page header, body, or footer. However, you must first associate a query to the page.
Steps
3. To create a calculation that contains run-time information, such as current date, current time, and user
name, drag Layout Calculation to the report.
To add data items that are not shown in the report, on the Source tab, double-click data items.
To add data items that are in the report but not necessarily in the model, such as calculations, on
the Data Items tab, double-click data items.
To add data items from a specific query, on the Queries tab, double-click data items.
To add functions, summaries, and operators, on the Functions tab, double-click elements.
To add a value that is derived from a parameter, on the Parameters tab, double-click a parameter.
A report specification is an XML file that you can view (Tools, Show Specification). In addition, you can
view the specification for a selected object with the Show Specification (Selection) menu option.
You can also programmatically create or modify reports by using an editing tool to work with report
specifications. You then use the IBM Cognos Software Development Kit to implement the reports in your
IBM Cognos Business Intelligence environment. This is useful if, for example, you must make the same
modification in many reports. Rather than opening each report and making the change, you can automate
the process using the Software Development Kit, thereby saving you time.
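A batch edit of that kind can be sketched with Python's standard XML tooling (this is not IBM Cognos Software Development Kit code, and the element names below are purely illustrative; real report specifications use a versioned, namespaced schema):

```python
# Hypothetical sketch: applying the same change to a report
# specification programmatically instead of editing it by hand.
import xml.etree.ElementTree as ET

# Invented, minimal stand-in for a report specification.
spec = "<report><title>Old Title</title><layout/></report>"
root = ET.fromstring(spec)

# The modification that would otherwise be made report by report.
root.find("title").text = "New Title"

updated = ET.tostring(root, encoding="unicode")
```

Looping this over many saved specifications is the kind of repetitive change the Software Development Kit is designed to automate.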
Tips
When you are viewing the report specification, you cannot modify or copy parts of it.
You can also modify the XML code in a report specification by saving the report specification on
your computer.