BI 11g Design Best Practices and Performance Tuning


Nicolas Barasz and Paul Benedict
Customer Engineering & Advocacy Lab
April 2014
Agenda

• Repository design best practices


• Dashboards and reports design best practices
• Reading query log
• Performance tuning (relational database)
• Performance tuning (multi-dim database)
• Troubleshooting
• 10g Upgrade considerations
Agenda

• Repository design best practices


• Physical Layer
• Business Model
• Presentation Layer
• Dashboards and reports design best practices
• Reading query log
• Performance tuning (relational database)
• Performance tuning (multi-dim database)
• Troubleshooting
• 10g Upgrade considerations
Create Aliases for all tables

• Create aliases for all tables and prefix their names with text that
reflects the type of table, e.g. Dim_, Fact_, or Fact_Agg.
• Create joins between the alias tables, not the “master” ones.

[Diagram: original tables vs. aliases]
Avoid Circular Joins

• Circular joins may have a big impact on data integrity. They can
always be avoided by creating aliases.
Connection Pool Configuration

• Use native database drivers instead of ODBC.
• Set the maximum number of connections high enough. A good starting
point is 10-20% of the number of concurrent users multiplied by the
number of queries executed on a dashboard page. Note that due to the
use of expandable hierarchies and selection steps, the number of
queries executed in parallel in 11g is often greater than in 10g.
• Use a separate connection pool for initialization blocks.
Query Limits

A user who has access to Answers can significantly slow down the
BI Server and the database with a bad report that extracts millions
of records. To prevent this, enable query limits. If there is no
specific user requirement, use 100,000 rows and 1 hour as a starting
point.
Agenda

• Repository design best practices


• Physical Layer
• Business Model
• Presentation Layer
• Dashboards and reports design best practices
• Reading query log
• Performance tuning (relational database)
• Performance tuning (multi-dim database)
• Troubleshooting
• 10g Upgrade considerations
Business Model Design

[Diagram: dimension and fact logical tables mapped to OLAP, OLTP,
ODBC, CSV, and XML physical sources]
Business Model Design
• Logical star schemas only: no snowflaking! The only exception is
the BM for Siebel Marketing list formats.
Dimension Sources per Level

• Create a logical table source in the dimension at each level that
matches the level of a fact LTS. This was recommended in 10g, but it
is mandatory in 11g.
Logical Tables

• Use a separate dimension logical table for each dimension; do not
combine/merge them into one.
• The same goes for facts: we do not want to end up with a single
fact logical table called “Facts – Stuff”!
• Have a separate logical table for “compound” facts (which combine
facts from multiple LTS).
• Prefix logical table names with either:
 Dim –
 Fact –
 Fact Compound –
Logical Table Columns

• Try to assign business columns as dimension primary keys.
• Rename logical columns to use presentation names.
• Keep only required columns.
Logical Table Columns

• Do not assign logical primary keys on fact logical tables.
• Create “dummy” measures to separate facts into various groups if
need be.
• Make sure almost every fact logical column has an aggregation rule
set.
Level Keys

• The primary key of each level must always be unique.
• The primary key of the lowest level of the hierarchy must always
be the primary key of the logical table.
Missing Dimensional Hierarchies

• Always create a dimension hierarchy for all dimensions, even if
there is only one level in the dimension.
• BI Server may need it to select the most optimized logical table
source.
• It may be useful when BI Server performs a join between two result
sets, when two fact tables are used in a report.
• It is necessary for level-based measures.
• It is needed to set the content level of logical table sources.
Missing Dimensional Hierarchies

• Always configure drill-down, even if there is only one level in
the dimension. It may be useful for instance to drill down from
contact type to contact name.
• Always specify the number of elements per level. BI Server will
use it to identify aggregate tables and mini-dimensions. It does not
need to be accurate; a rough estimate is fine.
Content Level

Always specify the content level in all logical table sources, both
in facts and dimensions.
• It allows BI Server to select the most optimized LTS in queries.
• It helps the consistency checker find issues in the RPD
configuration, preventing runtime errors.
Implicit fact

Set up an implicit fact column for each presentation folder.
• It prevents users from getting wrong results if they create a
report without a fact column.
• Use a constant as the implicit fact column to optimize
performance.
Canonical Time Dimension

Each Business Model should include a main time dimension connected
to almost all fact tables. This is necessary for reports that
include multiple facts. It is also much easier for end users than
having a time dimension per fact table.
Consistency Check Manager

Fix almost all errors, warnings, and best practices detected by the
Consistency Check Manager.
• If there is a message, it means that something is wrong in the
configuration. It will have consequences, even if the first reports
show no problem.
• When there are too many messages, it is difficult to see which
ones are important.
Agenda

• Repository design best practices


• Physical Layer
• Business Model
• Presentation Layer
• Dashboards and reports design best practices
• Reading query log
• Performance tuning (relational database)
• Performance tuning (multi-dim database)
• Troubleshooting
• 10g Upgrade considerations
Simple Presentation Folders
• Small presentation folders are easier to understand and to
manipulate.
• Try to limit the number of fact tables; keep the ones that have a
lot of common dimensions and are “linked” from a business
perspective.
• Configure presentation folders specific to each type of user.
Canonical Time Dimension

• The “canonical” time dimension should always be the very first
presentation table.
• “Secondary” time dimensions can be given their own presentation
tables further down.
Homogeneous Presentation Folders
• List the dimension presentation tables first, starting with the
canonical time dimension.
• Place the measures/facts at the bottom. Do not mix dimension and
fact columns in the same presentation table.
• Naming of presentation tables/columns should be consistent across
all folders. This is very important; otherwise prompt values cannot
be retrieved when you navigate from one report to another report
based on another presentation folder.
• Make it easy to distinguish between dimensions and facts.
Object Descriptions
• Add descriptions to presentation folders to explain the purpose of
each folder within Answers.
• Add descriptions to presentation tables and columns so that they
appear in Answers when users roll over them with the mouse. For each
column, explain the data content, for instance the calculation
formula.
Global Recommendations
• To satisfy all your drill-down requirements, you don’t need to have all your
reporting objects in a single Subject Area / Presentation Folder

• For example, if you want to drill from a summary “Orders” report down to
“Order Item” level, you don’t need to create a single Subject Area that
contains both Order and Order Item objects

• You can start by creating a report against the “Orders” Subject Area and
then you can drill-down to another report defined against “Order Items”
Subject Area

• You just need to ensure the Presentation Table/Column names that are
being “prompted” have the same names in both Subject Areas

• If the Presentation Table/Column names aren’t the same then use Aliases
to make them the same!
Agenda

• Repository design best practices


• Dashboards and reports design best practices
• Reading query log
• Performance tuning (relational database)
• Performance tuning (multi-dim database)
• Troubleshooting
• 10g Upgrade considerations
Delete Unused Views

Each view may have a cost on performance, upgrade, and maintenance,
even if it is not included in the compound layout. Delete all unused
views, including table views.
Default values in Dashboard prompts

Put a default value in dashboard prompts.
• If you know what users will select most often, use it as the
default value.
• If you do not know, then put a dummy value so that the report does
not return anything. If necessary, customize the “no result” view to
tell users to select a value in the prompt.
• There is nothing worse than executing a useless long query that
returns all data from the database because there is no default
filter. It costs a lot of resources both on the database and on BI
Server.
Hierarchies and attribute columns

Never mix hierarchies and attribute columns of the same dimension in
a report. This leads to misunderstandings and unexpected behaviors,
in particular when hierarchical prompts are used.

Note that selection steps generated by hierarchical prompts apply to
hierarchies only, not to attribute columns.

Adding filters on attribute columns works fine though, even if you
use the hierarchy in the report. But do not include the attribute
column in the columns selected.
Hierarchies and attribute columns
Groups and Calculated Items

It is important to understand the differences between two types of
selection steps: groups and calculated items.
• Performance considerations
• Calculated items are computed on the Presentation Server. They are
executed on the (normally small) result set retrieved from BI
Server. Usually they do not have any impact on performance.
• Groups are computed on the database. They generate additional
logical and physical queries. They have a significant impact on
resources required on the database, and therefore on global
performance.
Groups and Calculated Items

• Functionality perspective
• Calculated item formulas are applied to the result set exactly as
they are. Aggregation rules used to compute the metrics on BI Server
are not considered.
• Groups generate a query with a filter based on the members
selected. Aggregation rules are applied on BI Server as usual.
Filter or Selection Step?

Applying filters in reports may seem similar to selection steps. But
is it really the same? Let's study an example:
Filter or Selection Step?

Looking at a simple table, it seems identical:
Filter or Selection Step?
But see what happens when columns are removed from tables:
Filter or Selection Step?
Filter or Selection Step?

Filters:
• Are always applied on all views.
Selection Steps:
• Are applied only if the corresponding column is included in the
view.
• May generate additional logical and physical queries.
Prompts or Hierarchical Prompts?

11g hierarchical prompts look great for end users. But they are
often misunderstood:

• Hierarchical prompts generate selection steps. This impacts the
report layout, as it includes the members that must be shown in the
report.
• Normal prompts generate filters. Filters do not impact the report
layout but only the data retrieved from the database.
Prompts or Hierarchical Prompts?
Prompts or Hierarchical Prompts?

Hierarchical prompt:
Prompts or Hierarchical Prompts?

Normal prompt:

Selection steps are not filters. Hierarchical prompts do not behave
like normal prompts. Choose wisely.
General reports best practices

• Do not put too many pages per dashboard; all pages should be
visible.
• Dashboards should be as interactive as possible: column selectors,
drill-down, guided navigation… Interactivity is one of the best
assets of Oracle BI. Use it.
• Do not overuse the new expandable level-based hierarchies, as they
tend to generate many physical queries. Often one query is necessary
for each level shown, more if multiple fact LTS are used.
Agenda

• Repository design best practices


• Dashboards and reports design best practices
• Reading query log
• Performance tuning (relational database)
• Performance tuning (multi-dim database)
• Troubleshooting
• 10g Upgrade considerations
Reading Query Log

The query log can be retrieved from Administration\Manage Sessions
or from the NQQuery.log file. The log level can be defined globally
in the RPD (Tools\Options\Repository\System logging level), or using
a session variable in the RPD. Note that this session variable can
be overridden at report level by adding the prefix
SET VARIABLE LOGLEVEL=3;
Log level 3 is usually enough for performance tuning and basic
troubleshooting. Log level 5 is required to see calculations
performed on BI Server. Log level 7 is used by the development team
only.
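As a sketch, here is how such a prefix combines with a logical SQL statement. The subject area and column names are illustrative, not from this deck; in Answers the prefix goes in the Advanced tab before the generated logical SQL.

```sql
-- Raise the log level for this request only, overriding the RPD setting
SET VARIABLE LOGLEVEL=3;
SELECT "Dim - Time"."Year", "Fact - Revenue"."Revenue"
FROM "Sales"
ORDER BY 1
```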
Reading Query Log
[2014-04-14T07:06:54.000-06:00] [OracleBIServerComponent]
[TRACE:5] [USER-0] [] [ecid:
61834cf427fc85a5:529015b6:14546f55499:-8000-
000000000001ab12,0:1:33:3] [tid: 1fa4] [requestid:
4dd80013] [sessionid: 4dd80000] [username: weblogic]

• Timestamp: start of query execution by BI Server
• TRACE: log level
• Requestid: ID of the logical query. It can be used to track all
elements of this query in NQQuery.log.
• Username: user who executed the query.
Reading Query Log

[Screenshot: a logical SQL query from the log, with numbered
callouts referenced on the next slide]

Reading Query Log
1. Variables: Variables set for this particular query. The most
common variables are:
• QUERY_SRC_CD: the origin of the query, Prompt, Report…
• SAW_SRC_PATH: the catalog path to the query if it is saved
• SAW_DASHBOARD: the catalog path to the dashboard that
included this query
• SAW_DASHBOARD_PG: Name of the dashboard page
2. Columns selected by the user.
3. Sort Key: sort key defined in the RPD for the columns selected
4. REPORT_XXX or any AGGREGATE function: additional aggregation
requested for a measure at the level specified in the BY clause.
This usually comes from the view definition (sub-total, excluded
column…). Users can also put it in a column's formula.
5. From: subject area.
6. Fetch: maximum number of rows retrieved.
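As an illustrative sketch (the paths and column names are invented, not from the deck), these elements typically appear together at the top of a level-3 log entry:

```sql
SET VARIABLE QUERY_SRC_CD='Report',                   -- origin of the query (1)
    SAW_SRC_PATH='/shared/Sales/Revenue by Year',     -- saved report path (1)
    SAW_DASHBOARD='/shared/Sales/_portal/Overview',   -- containing dashboard (1)
    SAW_DASHBOARD_PG='Summary';                       -- dashboard page (1)
SELECT "Dim - Time"."Year",                           -- columns selected (2)
       "Fact - Revenue"."Revenue"
FROM "Sales"                                          -- subject area (5)
```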
Reading Query Log
• Logical Request (before navigation): this step rewrites the
logical SQL after adding more elements like security filters.
• "The logical query block fail to hits or seed the cache in
subrequest level due to [[ only one subrequest ]]": this message
means that the query could not be split into multiple sub-queries to
be stored in cache.
• Execution Plan: this tracks all the steps required during the
execution. Note in particular the database ID. When the database ID
is 0:0,0, it means that the step is done on BI Server.
Reading Query Log

• This ID identifies the physical query. It can be used to track the
performance of this query in the log.
• Number of rows and bytes retrieved by this query.
• Duration of this physical query.
• Total number of physical queries.
• Number of rows returned by BI Server to Presentation Server.
Reading Query Log

• Elapsed Time: total time between start and end of this query. This
includes fetching all rows.
• Response Time: time between the start of the query and the
beginning of data fetch by Presentation Server. When there is a
significant difference between Elapsed Time and Response Time, it
usually comes from the time needed to fetch all rows. Results may
sometimes be displayed without waiting for all rows to be fetched.
• Compilation Time: time spent by BI Server to compile the query. It
should almost always be less than 2s.
Cautions about the Query Log

• Logging is a single-threaded activity. Under adverse
circumstances, it can be a performance bottleneck at levels >2.
• Times listed/computed are when entries are written to the log,
which is almost always when they occur, unless other bottlenecks are
impacting logging.
• Query logging is diagnostic; it is not intended for collecting
usage information.
Log Level

In a production environment, set the BI Server log level to 0. When
there are a lot of reports running in parallel, query logging may
cause performance issues.
Creating a session log from the Query Log

Generic usage:
F:\middleware\Oracle_BI1\bifoundation\server\bin\nqlogviewer
-u<user name> -f<log input file name> -o<output result file name>
-s<session id> -r<request id>

Example:
F:\middleware\Oracle_BI1\bifoundation\server\bin\nqlogviewer -f
f:\shared\RFAxx\407\nqquery.log -s adb20000 -r adb2002a -o
f:\shared\RFAxx\407\q2.txt
Agenda

• Repository design best practices


• Dashboards and reports design best practices
• Reading query log
• Performance tuning (relational database)
• Performance tuning (multi-dim database)
• Troubleshooting
• 10g Upgrade considerations
Methodology

This section describes how to analyze performance issues from the BI
Server and SQL generation. It does not cover performance issues from
the network, Presentation Server, or browser.

1. Get the query log with at least log level 3.
2. Check whether time is spent on BI Server or on the database
(response time and physical query duration versus compilation time).
Normally, time spent on BI Server should not exceed a few seconds.
Otherwise, analyze the steps done on BI Server to find the cause
(log level 5 required).
Methodology

3. Look at the physical SQL for a first level of verification:
• Are all tables included in this query really necessary? Do we have
tables that are joined but are not included in the select clause and
do not have filters applied (real filters, not join conditions)?
• How many physical queries/sub-queries are generated? More
precisely, how many times do we read a fact table? In a perfect
world, we read only one fact table and only once. If there are more,
find out why and see if some could be removed. Check for excluded
columns, non-additive aggregation rules (REPORT_AGGREGATE,
count(distinct)...), selection steps, sub-queries in the report, set
operators (UNION), totals, sub-totals, multiple views, etc.
• Are there any outer joins? Where do they come from? Could they be
removed by changing the design?
Methodology

4. If optimizing the SQL is not enough, look with a DBA at the
execution plan and find the root cause of the performance issue.
Globally there are mainly four ways to improve performance at this
point:
• Reducing the volume of IOs by improving the data access path.
• Reducing the volume of IOs by reducing the volume of data read.
Review the filters applied.
• Increasing parallelism (number of threads used to read big
tables).
• Improving IO speed (hardware improvement, in-memory...).
Methodology

Reducing the volume of data read can be done by reviewing the data
model, for instance:
• Aggregate table creation.
• Fragmentation. For instance, if most of the time only data of the
current year/quarter/month is selected, we can split the fact into
two tables.
• Denormalization (to reduce the number of joins).
• Normalization (to reduce the number of columns in the table). For
instance, a big table with 500 columns could be split into two
tables, one with columns often used and another with columns rarely
used.
Level-Based Hierarchies

Queries using level-based hierarchies generate in the logical SQL
one sub-query for each level used in the report. Therefore the cost
on performance can be significant.
Level-Based Hierarchies

With relational databases the number of physical sub-queries is
usually proportional to the number of logical sub-queries. In the
previous example 3 physical sub-queries are generated.

The number of physical sub-queries can sometimes be reduced by the
BI Server cache. If sub-request caching is enabled
(DISABLE_SUBREQUEST_CACHING=NO in NQSConfig.ini), BI Server can
re-use previously cached data and execute only the physical
sub-queries for data that are not in cache.
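A sketch of the corresponding NQSConfig.ini entry; check the exact placement within your file's existing sections before editing:

```ini
; NQSConfig.ini - BI Server configuration
; NO enables sub-request caching, so cached sub-query results can be reused
DISABLE_SUBREQUEST_CACHING = NO;
```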
Skipped/Ragged Hierarchies

Selecting the Skipped/Ragged options significantly increases the
cost of hierarchies on performance. Additional logical SQL
sub-queries are required in case there is a null value in a level
displayed or at any lower level. The example below generates 5
logical SQL sub-queries although only 3 levels are displayed.
Value-Based Hierarchies

With value-based hierarchies, there is only 1 logical SQL query no
matter how many levels are displayed.
Value-Based Hierarchies

On the physical side, even with a relational database, there is only
one sub-query executed on the fact table. Multiple sub-queries are
usually required on the dimension table, but these should be very
fast since they read the dimension only. Value-based hierarchies are
very efficient regarding performance.
RPD Opaque Views

• Opaque views push the SQL statement as a sub-select into the main
SQL generated for the query.
• All tables used in the opaque view definition are always queried
together, even if some of them are not really necessary.
• They should be used as a last resort only, for instance when
variables must be included in SQL with multiple levels of
aggregation.
RPD Opaque Views

• When possible, replace the view with aliases of the corresponding
physical tables. Filters may be applied in logical table sources or
in physical joins.
• Or create a physical table instead, loaded in the ETL process.
• Or create a materialized view (in the RPD, materialized views
should be created as normal physical tables).
Database Features

Depending on your configuration, you may enable some parameters in
the database features:
• PERF_PREFER_MINIMAL_WITH_USAGE: enable this parameter if your
database optimizer cannot properly handle the WITH clause, for
instance on Oracle Database 10g (sometimes also useful on Oracle
Database 11g). But be careful, as it may have a negative performance
impact on reports that use COUNT(distinct).
• PERF_PREFER_INTERNAL_STITCH_JOIN: this parameter may sometimes be
enabled to work around database optimizer bugs. Note that it may
significantly increase the workload on BI Server. It is usually not
recommended.
Count(distinct)

• Whenever possible, replace it with Count(). Count(distinct) has a
high performance cost on the database.
• If there are multiple LTS, the aggregation rule must be specified
for each LTS.
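For example, when the fact table's grain already guarantees one row per service request, both formulas below return the same value, but the second avoids the expensive distinct operation. The column names are illustrative:

```sql
-- expensive: the database must de-duplicate before counting
COUNT(DISTINCT "Fact - Service Requests"."SR Number")
-- cheaper, and equivalent when SR Number is unique at this grain
COUNT("Fact - Service Requests"."SR Number")
```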
Base Measure, Case when, Filter Using

Users want to filter the values for a measure. For instance, they
want the number of opened and closed service requests.

There are multiple ways to do that. But each option has
consequences…
Base Measure, Case when, Filter Using

First approach: use the base measure with filters in the report
Base Measure, Case when, Filter Using

Second approach: use a “case when” statement in the Logical Table
Source
Base Measure, Case when, Filter Using

Third approach: use a “Filter Using” statement in the logical column
Base Measure, Case when, Filter Using

Solution: Base Measure
  Benefits: flexible; perfectly optimized; good for users' education
  Downside: cannot always be used, depending on report configuration
  Rank: 1 – should be used most of the time

Solution: Case When
  Benefits: simple physical query; always works
  Downside: no automatic where clause; needs filters in reports for
  good performance
  Rank: 2 – should be used from time to time

Solution: Filter Using
  Benefits: where clause added automatically
  Downside: where clause quickly becomes HUGE
  Rank: 3 – should be used rarely
IndexCol

Sometimes the formula or columns used vary depending on a
session/presentation variable.
If you use a “case when” statement, then the entire formula is
pushed to the physical query. But by using the function IndexCol,
only the required column/expression is pushed to the database.
Combined with the new 11g features in prompts (allow selection in a
list of custom values), it allows users to modify report structure
very significantly without any increased cost on performance. This
function can be used in the RPD or directly in reports.

INDEXCOL(
  CASE VALUEOF(NQ_SESSION."PREFERRED_CURRENCY")
    WHEN 'USD' THEN 0 WHEN 'EUR' THEN 1 WHEN 'AUD' THEN 2 END,
  "01 - Sample App Data (ORCL)".""."BISAMPLE"."F19 Rev. (Converted)"."Revenue_Usd",
  "01 - Sample App Data (ORCL)".""."BISAMPLE"."F19 Rev. (Converted)"."Revenue_Eur",
  "01 - Sample App Data (ORCL)".""."BISAMPLE"."F19 Rev. (Converted)"."Revenue_Aud")
Mini-Dimensions

• Mini-dimension tables include combinations of the most queried
attributes of their parent dimensions.
• They must be small compared to the parent dimension, so they can
include only columns that have a relatively small number of distinct
values.
Mini-Dimensions

Mini-dimensions are joined both to the main fact table and to
aggregate tables.
Mini-Dimensions

• They improve query performance because BI Server will often use
this small table instead of the big parent dimension.
• They increase the usage of aggregate tables. Due to the level of
aggregation, aggregate tables cannot be joined to the parent
dimension. But they can be joined to the mini-dimension instead.
This allows reports to use the aggregate table even if they use some
columns from the corresponding dimension.
Override Default Aggregation Rule

It is possible to improve performance by overriding the default
aggregation rule for a column in reports when:
• The aggregation rule for all metrics used in this column's formula
is SUM
• AND although a formula is applied on this/these metric(s), it is
still possible to aggregate the global formula using a SUM
• AND there are multiple levels of aggregation in the report, like
multiple views or totals/sub-totals
In this case, overriding the default aggregation rule will reduce
the number of physical queries executed.
Override Default Aggregation Rule

In the following example, the formula used for the metric is
ifnull(Revenue,0). There is a pivot table with a total. Note that
the aggregation used in the logical SQL is “REPORT_AGGREGATE”.
Override Default Aggregation Rule

Note the two sub-queries included in the physical SQL:


Override Default Aggregation Rule

Next, let’s override the aggregation rule:


Override Default Aggregation Rule

The logical SQL now shows REPORT_SUM:


Override Default Aggregation Rule

The physical SQL now includes only one query:


Override Default Aggregation Rule

Overriding the default aggregation rule Count(Distinct) with Sum:

Most of the time, when the aggregation rule is Count(Distinct), a
separate physical query is required for each level of sub-total.
However, in some reports, due to the dimensions selected and the
structure of the report, applying a Sum on the main result set to
compute sub-totals provides the same result.

In this case, overriding the default aggregation rule with Sum may
greatly improve the report's performance.
Excluded columns

Delete columns that are excluded from all views.
• They increase the volume of data retrieved.
• They make BI Server compute results at multiple levels of
aggregation, impacting resources needed both on the database and on
BI Server.
• They may have an impact on results when using complex
aggregations.
General Performance Tips

• Avoid using a filter based on another report.
• Use sub-totals and grand totals only if necessary. Each total
means an additional level of aggregation and may have an impact on
performance.
• Do not show more than 6 reports per page (depending on the
performance of the reports).
Agenda

• Repository design best practices


• Dashboards and reports design best practices
• Reading query log
• Performance tuning (relational database)
• Performance tuning (multi-dim database)
• Troubleshooting
• 10g Upgrade considerations
Methodology

When OBIEE uses Essbase as a data source, there are additional
design considerations that may have a big impact on performance.

The design solutions to improve performance change depending on the
use case. So the objective here is not to provide best practices
that should always be applied.

Instead, the following slides present a tuning methodology and
multiple techniques. It is up to developers to study multiple
options, study the OBIEE session log, and select the best one for
their use case.
Methodology

1. Simplify the MDX generated.
2. Reduce the number of MDX queries generated.
3. Make sure that optimal filters/selections are applied in MDX.
4. Perform tuning with a DBA on the Essbase side and/or check on
Essbase why performance is still bad.
5. Modify the OBIEE report based on feedback from the Essbase DBA.

Prerequisites: being able to understand MDX queries and OBIEE
session logs.
Reduce the number of selection steps

Optimizing the selection step definition tends to reduce the number
of MDX queries and to simplify them.

For instance
is more optimized than

Each use case is unique. The objective is to simplify MDX queries
and at the same time apply optimal filters/selections.
Case statement

The Case statement is not supported in MDX. It is always applied on
BI Server.

• The main benefit of using a Case statement in report formulas is
that it cannot be included in MDX and therefore may help simplify
the MDX query.
• The main drawback of using a Case statement in report formulas is
that it cannot be included in MDX and therefore prevents applying
optimal selections in MDX queries.

Each use case is unique. The objective is to simplify MDX queries
and at the same time apply optimal filters/selections.
Case statement
There are restrictions:

• If the case statement does not combine multiple members, the base
column used in the case statement should also be included in the
query and in the views as a separate (hidden) column.
• If the case statement combines multiple members, then the base
column cannot be included in the view without impacting the level of
aggregation. In this case:
• If the aggregation rule of the measure is not External
Aggregation, the base column should not be in the query.
• If the aggregation rule of the measure is External Aggregation,
the base column must be included in the query and should be excluded
from the view. The aggregation rule of the measure must be changed
from Default into a simple internal aggregation rule (SUM, MAX,
MIN). This works only if the internal aggregation rule can be used
to combine members and provides correct results.
Case statement: example 1

A user requested a report that shows the revenue by year and by LOB
for some LOBs, and groups the remaining LOBs together:
Case statement: example 1
First option based on the FILTER function

The MDX query is more complicated than required. This is not
optimized. There is no real filter since all LOBs are selected.
Case statement: example 1
Second option based on a CASE statement

The LOB column has been added to the query (and excluded from the
view). The case statement combines multiple members (Games, TV,
Services) and Revenue is defined with External Aggregation. But the
measure Revenue is additive, so the aggregation rule has been
changed into SUM.
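A sketch of the kind of CASE formula meant here. The grouped member names follow the example, but the exact column reference and group label are illustrative:

```sql
CASE WHEN "Products"."LOB" IN ('Games', 'TV', 'Services')
     THEN 'Other LOBs'          -- combine these members into one group
     ELSE "Products"."LOB"      -- keep the remaining LOBs as-is
END
```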
Case statement: example 1

The MDX query is much simpler:
Case statement: example 2

A developer applied a case statement to rename brands. A dashboard
prompt allows users to select the brand:
Case statement: example 2

Due to the case statement, the filter on ‘Brand2’ is not applied in
the MDX query. All brands are selected. This is not optimized.

The developer should remove the case statement and instead rename
members in Essbase or create Essbase aliases.
FILTER function

Unlike the Case statement, the FILTER function can be shipped to
Essbase.

• The main benefit of using the FILTER function in report formulas
is that the selection is applied in the MDX query and therefore may
reduce the volume of data calculated/retrieved in Essbase.
• The main drawback of using the FILTER function is that it may
significantly increase the complexity of the MDX query. Sometimes it
may even increase the number of MDX queries executed.

Each use case is unique. The objective is to simplify MDX queries
and at the same time apply optimal filters/selections.
FILTER function: example

A user requested a report that shows the total revenue for the brand
BizTech and the total revenue for one specific customer:
FILTER function: example
First option based on a Case statement

The MDX query is simple. But it returns 2995 rows (all combinations
of all brands and customers) instead of 1 row. This is not
optimized.
FILTER function: example
Second option based on FILTER

Filters are applied in MDX and only 1 row is returned. This is
optimized.
FILTER_METRIC_SPLITTING_LEVEL

Starting from 11.1.1.7.140425, a new parameter can be used to modify MDX generation.

When it is activated, the BI Server generates multiple simpler MDX queries instead of a single complicated query. Unit tests showed significant performance improvements with this solution.

However, the high number of MDX queries generated may cause scalability issues in environments with many concurrent users, so this setting must be tested properly with a high concurrency level.
FILTER_METRIC_SPLITTING_LEVEL

This feature is managed by the variable FILTER_METRIC_SPLITTING_LEVEL: value 0 means disabled, 1 means enabled.

This variable can be created as a session variable, or set in the opmn.xml file:

<ias-component id="coreapplication_obis1" inherit-environment="true">
  <environment>
    <variable id="FILTER_METRIC_SPLITTING_LEVEL" value="1"/>
  </environment>
</ias-component>
Security filters

Usually in OBIEE, security filters are defined in Application Role permissions under Manage/Identity using the Administration Tool.

When Essbase is the data source, this is not recommended. Instead, security filters should be defined directly in Essbase.

As a consequence, the user's login must be provided to Essbase in the connection pool, and the BI Server cache becomes user specific.

Use OBIPS Formatting Features
Instead of Doing Formatting in Column
Expressions

Consider applying (conditional) formatting from the report UI rather than in the report SQL when possible.

For example, replace NULL values with a marker or a different value such as 'N/A', 0, etc. using the formatting features in OBIPS rather than with column expressions.
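A hedged illustration of the pattern to avoid (the column names are hypothetical):

```sql
-- Avoid embedding display logic in the column expression:
IFNULL("Facts"."Revenue", 0)

-- Prefer setting the column's data format / conditional format in the
-- Analysis editor instead; the logical SQL, and therefore the generated
-- MDX, then stays a plain column reference:
"Facts"."Revenue"
```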

Selection Step Condition vs Filter
A selection step 'Condition' usually generates a first MDX query to retrieve the members matching the condition. This list of members is then included as an input in a second MDX query.

This is perfect when the number of selected members is small and the first query runs fast. But if the number of members selected by the condition is huge (thousands of members, for instance), passing them as a parameter in the second query may cause performance issues.

In this case it is probably better to apply global filters in the report instead of using a selection step condition (when possible).

Selection Step Condition: example

The user requested a report that shows the quantity by customer and brand only for customers with revenue >= 100 000:
Selection Step Condition: example
The first MDX query runs fine, but the second query has a huge number of members in it:
Selection Step Condition: example

If this is causing a performance issue, the condition may be replaced by a filter:
Selection Step Condition: example

The filter is performed by the BI Server, and the MDX generated is simpler:
Essbase Calculated Members or OBIEE Formulas?

Often, calculations can be defined either on the Essbase side by creating calculated members, or on the OBIEE side by applying formulas in reports or the RPD.

If the MDX takes too much time when using calculated members, try using base members and performing the calculation in OBIEE. This may improve performance very significantly.
Avoid CAST Expressions

For physical columns, the BI Server can automatically convert the returned value to the data type specified in the Physical layer of the RPD.

For example, if a Qtr column is mapped as INT in the Physical layer metadata, values will be converted to integer even if they are returned as text by Essbase. Setting the desired data type in the Physical layer avoids the need for an explicit cast, which allows for better MDX query generation.
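A minimal sketch of the two approaches (the column names are hypothetical):

```sql
-- Explicit cast in the report formula (avoid where possible):
CAST("Time"."Qtr" AS INT)

-- Preferred: map the Qtr physical column as INT in the Physical layer.
-- The BI Server then converts the text value returned by Essbase
-- automatically, and no CAST appears in the logical SQL:
"Time"."Qtr"
```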
External or Explicit Aggregation Rule
Whenever possible, explicit aggregation rules (SUM, for instance) should be used. Be sure to match what is specified in the physical and logical layers.

Using explicit aggregation allows data to be aggregated either in Essbase or on the BI Server, and therefore provides more flexibility than External Aggregation.

There are some potential issues when specifying external aggregation and using these metrics in derived expressions (CASE, for instance).
Include Null Values

The option available in the Analysis properties to include null values may have an impact on performance, depending on the number of dimensions and the volume of data selected. Avoid using it unless it is really necessary.
Database Features

• PERF_PREFER_SUPPRESS_EMPTY_TUPLES: This is for Essbase only. If enabled, instead of applying NON EMPTY on the axis, which may contain a very sparse set, each cross-join of two dimensions will have empty tuples suppressed before cross-joining another dimension.
IndexCol

Sometimes the formula or column used varies depending on a session/presentation variable.

If you use a CASE WHEN statement, the entire formula is pushed to the physical query. But by using the IndexCol function, only the required column/expression is pushed to the database.

Combined with the new 11g prompt features (allowing selection from a list of custom values), it lets users modify report structure very significantly without any extra cost in performance. This function can be used in the RPD or directly in reports.

INDEXCOL(
  CASE VALUEOF(NQ_SESSION."PREFERRED_CURRENCY")
    WHEN 'USD' THEN 0 WHEN 'EUR' THEN 1 WHEN 'AUD' THEN 2 END,
  "01 - Sample App Data (ORCL)".""."BISAMPLE"."F19 Rev. (Converted)"."Revenue_Usd",
  "01 - Sample App Data (ORCL)".""."BISAMPLE"."F19 Rev. (Converted)"."Revenue_Eur",
  "01 - Sample App Data (ORCL)".""."BISAMPLE"."F19 Rev. (Converted)"."Revenue_Aud")
Agenda

• Repository design best practices
• Dashboards and reports design best practices
• Reading query log
• Performance tuning (relational database)
• Performance tuning (multi-dim database)
• Troubleshooting
• 10g Upgrade considerations
Troubleshooting

When a report returns an error message or provides wrong results, follow these steps:
1. Simplify (or ask the customer to simplify) the report as much as possible. Keep the smallest number of columns and unions, the simplest formulas, just one view... Even the remaining view should be as simple as possible (no total/sub-total, no sort order, etc.). The objective is to get the simplest possible report that still reproduces the issue.
2. Using this simplified version of the report, retrieve a screenshot of the results, the corresponding query log, the report XML, and the RPD.
3. Analyze the log. If the issue seems to come from the RPD, check the definition of the corresponding columns and LTS. Verify that best practices are applied.
Troubleshooting

4. If the issue comes from the report's structure or if the cause is unknown, try reproducing it on SampleApp. To do so, edit the report XML: replace the name of the subject area and the names of the columns with the SampleApp subject area and columns, and modify the filters in order to retrieve data.
5. If the issue is reproducible on SampleApp, it comes either from a bug or from the report's definition. You can run additional tests in this simple environment to find the cause and a solution or workaround. If you are stuck, raise an SR or a bug.
6. If the issue is not reproducible on SampleApp, then it probably comes either from the RPD or from the data. You can search for special data (special characters, null values...) by running queries directly on the database and/or by running the logical SQL query in Administration/Issue SQL.
Agenda

• Repository design best practices
• Dashboards and reports design best practices
• Reading query log
• Performance tuning (relational database)
• Performance tuning (multi-dim database)
• Troubleshooting
• 10g Upgrade considerations
10g Upgrade considerations

There are many modifications to existing functionality and algorithms between 10g and 11g.

Depending on the configuration, these modifications may significantly change the results in reports. They may impact both the data and the format of a report.

The list of examples mentioned here is NOT exhaustive.


Calculated Items

Calculated items with the option "Hide Details" checked in 10g, as shown above, appear in all views in 11.1.1.7.
Calculated Items

10g 11.1.1.7
Calculated Items
To replicate the 10g behavior in 11g, you must:
• Add a new column identical to the one used to compute the calculated item.
• In all views except the one that includes the calculated item, replace the old column with the new one.
Calculated Items
To identify 10g reports with calculated items and the "hide details" option selected, you can run a basic text search in all 10g catalog files (use the search pattern "*." to select report files only).

To identify all reports with calculated items, search for the string:
calcItem

To identify reports with calculated items and the 'hide details' option selected, search for:
hideDetails="true"
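The search can be scripted; the sketch below runs against a scratch directory standing in for a 10g catalog root (the path and file contents are illustrative only — in a real catalog, point grep at your catalog directory):

```shell
# Scratch directory standing in for a 10g catalog root; the two files
# play the role of extensionless catalog report files.
mkdir -p /tmp/cat10g_demo
printf '<saw:calcItem hideDetails="true"/>' > /tmp/cat10g_demo/report1
printf '<saw:report/>' > /tmp/cat10g_demo/report2

# All reports containing calculated items:
grep -rl 'calcItem' /tmp/cat10g_demo
# Only those with the 'hide details' option selected:
grep -rl 'hideDetails="true"' /tmp/cat10g_demo
```

Both searches list report1 only; report2 contains neither string.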
Report-Based Totals

This option did not work in 10g and is fixed in 11g. It is selected by default.
• It may change the values significantly. 11g values are often better than 10g, but not always...
• Depending on the report, it may be hard to explain the results to users.
• It may be removed from tables and pivot tables, but not from charts.
Report-Based Totals

What does the Report-Based Total option really do?
Sort Orders

Sort orders in 11g are very often different from 10g.
• In 11g, a sort defined in the Criteria tab is not necessarily applied to pivot tables, especially if the sorted column is excluded from the pivot table. The sort order has to be defined in the pivot table itself. Note that when the column you want to sort by is not in the pivot table, you have to add it, apply the sort, and then hide the column.
Sort Orders

• 10g bugs are fixed when a sort key is created in the RPD configuration (example: month name, sorted by month number). In 10g, the sort was sometimes not applied if the sort column was not included in the report. In 11g, even if the sort column is not included in your report, the sort key defined in the RPD is always applied.
Sort Orders

• In some circumstances, the sort order defined in 10g was not applied properly. For instance, you select the sort order Ascending, and instead the result is sorted in descending order. Users in 10g often adapted their sort orders in reports without even noticing the issue, just by looking at the results. This is fixed in 11g. So sometimes the sort order in the report definition is Descending, the 10g results are sorted Ascending, and the 11g results are sorted Descending.
Sort Orders

• Sort in graphs: in 11g it is not possible to sort data in a graph using a column that is not included in the view. You have to add the column to the view (it can be hidden) to apply the sort order defined on that column.
Total with Union/Running Aggregates

When a result set is computed with multiple queries (UNION) or with running aggregates (MAVG, MSUM, RSUM, RCOUNT, RMAX, RMIN), 11g does not apply any default aggregation rule for totals. The aggregation rule must be specified manually in tables/pivot tables.

This is necessary for totals, sub-totals, or when some columns are excluded from the view.
Generated SQL

SQL generation in 11g is different from 10g. The objective is to get more optimized SQL in 11g. However, this may lead to differences in results if the RPD configuration or table content is not consistent.

10g 11g
Analyzing Catalog Upgrade Logs

• The main log for the catalog upgrade is
$MW_HOME/instances/instance1/diagnostics/logs/OracleBIPresentationServicesComponent/coreapplication_obips1/webcatupgrade0.log

• In this log file, search for the keyword "error". Do not pay attention to other messages.

• For each error/warning there is a global error message with the path of the object (report, iBot...). Next comes the XML of the object before/after the upgrade; the "after upgrade" XML is available only for warnings. After that, there is a detailed error message describing the issue.
Analyzing Catalog Upgrade Logs

• Datatype error: Type:InvalidDatatypeValueException, Message:Value '-2147483648' must be greater than or equal to MinInclusive '0'
  • The segment count has an invalid value.
• Required attribute 'guid' was not provided:
  • The iBot has been upgraded, but some of the recipients were not found in the list of users available in the authentication source. Check whether the user is still able to authenticate, and if not, delete the user from the webcat.
Analyzing Catalog Upgrade Logs

• No character data is allowed by content model:
  • The report XML is invalid and should be fixed. Remove the unwanted characters.

There are many different error messages about invalid XML. Note that it is very often faster to delete and recreate the report in 10g or in 11g than to spend a lot of time trying to fix the XML error.
Graph Engine

The graph engine software used in 11g is not the same as the one in 10g.

Although the upgrade process tries to match the graph properties selected in 10g with the ones available in 11g as closely as possible, a number of differences are to be expected.

The 11g graph engine has some new options that were not available in 10g, and some options that existed in 10g are no longer available.
Graph Engine, Miscellaneous

• The ranges for the numeric axis labels in graphs have changed
from 10g to 11g due to a different automatic axis range calculation
engine.
• Hidden columns used for labels in 10g are not displayed in 11g. If
you have a column that is used as the label for a graph, but the
column is hidden from the graph, then in 11g, the labels are not
displayed.
• Some axis labels might be skipped as a result of the automatic
label layout algorithm in use for 11g. The option that prevented
skipping labels in 10g does not exist in 11g. It is possible to see all
labels by modifying the size of the graph and labels.
Graph Engine, Miscellaneous

10g 11g
Graph Engine, Miscellaneous

• You cannot rotate graph labels for the y-axis other than 0-90 or -90.
You cannot perform 45-degree rotations.
• In 10g, graphs do not always honor criteria-level formats or other
global data formats for columns. Data labels and numeric axis
labels do not consistently follow this formatting. This issue has
been addressed in 11g.
• In 10g, pie graphs display absolute values, including negative
values. Negative values are interpreted as positive values and
those slices are displayed. In 11g, slices are not displayed for
negative values. When all the values are negative, the graph is not
displayed. In 11g, the legend is displayed for negative values.
Graph Engine, Miscellaneous

• When a stacked bar graph is upgraded from 10g to 11g, the order
or position of the series might change. However, the legend view is
upgraded without any change. This might cause a mismatch
between the legend that is displayed in the legend view and the
color that is displayed in the graph. To resolve this, either change
the color in the graph or update the legend to match the color in
the graph. In addition, the stacking order in the bar graph changes
when you include a column in Vary Color By. For other cases, the
order and coloring is maintained. The legend is incorrect or
mismatched when you specify conditional formatting on the column
in Vary Color By.
Default number of rows

• In 10g the number of rows displayed was limited only in the table view. In 11g this number of rows is limited in all views. Parameters in instanceconfig.xml allow you to change this limit.

• The number of records that can be exported is limited as well. A parameter available in EM sets the maximum number of rows exported, but it does not override the maximum number of rows per view. So both parameters (MaxVisibleRows per view and the global export limit) have to be modified.
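A sketch of the relevant instanceconfig.xml fragment; the element names follow the 11g OBIPS configuration schema, and the values are placeholders to adapt to your environment:

```xml
<!-- Inside the <ServerInstance> element of instanceconfig.xml (OBIPS). -->
<Views>
  <Table>
    <!-- Upper limit on rows a single table view will render. -->
    <MaxVisibleRows>50000</MaxVisibleRows>
  </Table>
  <Pivot>
    <MaxVisibleRows>50000</MaxVisibleRows>
  </Pivot>
</Views>
```

Remember that the export limit set in Enterprise Manager must be raised as well, since neither setting overrides the other.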
Default number of rows
Font weight and alignment

If the font size is not explicitly set, it relies on the setting of the nearest ancestor element in the HTML that has a font size specified. The behavior of the font is then non-deterministic, and since the parent element changed between 10g and 11g, the rendering is impacted. For instance, the following text is in a dashboard page:

<span style="font-weight: bold;">Multi-segments choice</span>

In 10g, its closest ancestor element is 8pt, but in 11g it is 9pt, so the text renders in 9pt. The solution is to add font-size:8pt to the span so that it won't be affected by changes made to the framework.
Hidden but included data is not
displayed
In 10g, if a column is hidden but included in a pivot table, the data is displayed in the pivot table. In 11g, if the column is hidden at the criteria level, the data is not displayed.
iBots => Agents

Options available in 11g agents are significantly different from 10g iBots, in particular for script management. So script options on 10g iBots are not available after the upgrade: they can still be executed, but cannot be modified.

A new agent must be created in 11g if you need to modify these options.
Multiple column selectors

In 10g, column selectors included just a list of the columns selected. In 11g, however, column selectors also include the properties of each available column. If multiple column selectors include the same column, they may conflict with each other after the upgrade.

Whenever possible, merge all column selectors to keep only one per report before the upgrade. If that is not possible, at least make sure that the same column is not included in two column selectors.
Upgrading one report only

Note that it is possible to upgrade just one single report. This can be very useful for testing or to maintain consistency between 10g and 11g environments. To upgrade one report, copy/paste the XML from the Advanced tab in Answers from one environment to the other. When the XML is applied in the 11g environment, it is upgraded automatically.
Spaces in Column Names

In 10g, a column with leading or trailing spaces produced a warning in the consistency checker. In 11g, this is considered an error, so it is mandatory to remove all leading and trailing spaces from column names.

The main impact is that all reports using these columns have to be modified. The easiest solution is to use a simple text search-and-replace tool that can operate on multiple files at the same time: identify the column's previous name in a report XML and replace it with the new one.
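A hedged sketch of such a bulk replacement with sed; the file, path, and column names are illustrative only, and you should always work on a copy of the catalog:

```shell
# Scratch file standing in for an extracted report XML; the real column
# name here has a trailing space that 11g rejects.
mkdir -p /tmp/reports_demo
printf '<sqlExpression>"Sales"."Revenue "</sqlExpression>' > /tmp/reports_demo/r1.xml

# Replace the old name (trailing space) with the trimmed one in every
# report file of the directory, in place:
sed -i 's/"Revenue "/"Revenue"/g' /tmp/reports_demo/*.xml
```

After the replacement, the file references "Revenue" without the trailing space.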
Clean 10g Catalog

A number of issues during catalog upgrade are caused by obsolete elements that should be deleted.
• Unused views: pivot tables may include calculated items. Even if the views are obsolete and not included in the compound layout, the calculated items will be propagated to all other views during the upgrade. Delete all unused pivot table views before the upgrade.
• Obsolete reports: old catalogs usually include many reports that are no longer used. These reports may include errors that will have to be analyzed and fixed during the upgrade. The number of reports also impacts the duration of the upgrade. Delete obsolete reports.
Clean 10g Catalog

• Old users: error messages will appear during catalog upgrade for each user in the catalog who can no longer be authenticated. These users' folders cannot be upgraded, and they also increase the upgrade duration significantly. Delete old accounts before or after the upgrade.

• As described in other slides, a number of reports have to be modified so that their behavior does not change in 11g. To reduce the duration of the freeze period (the time between the last catalog extract from 10g and the 11g production roll-out), do as many modifications as possible in 10g before the upgrade.
UA or Manual Catalog Upgrade

The Upgrade Assistant copies the catalog before starting the upgrade process. For big catalogs, a number of problems may happen during this phase (not enough space, network issues...). Even if the copy fails, the upgrade will start.

It is possible to copy the catalog and start the upgrade process manually instead.
UA or Manual Catalog Upgrade

• Copy the 10g catalog to a new location on the 11g server
• Stop the 11g Presentation Server
• Update the 11g catalog location using Enterprise Manager
• Add/modify these flags in instanceconfig.xml:
<Catalog>
  <UpgradeAndExit>true</UpgradeAndExit>
</Catalog>
• Start the Presentation Server – this will upgrade the catalog and shut down automatically
• Remove the UpgradeAndExit flag from instanceconfig.xml
• Start the Presentation Server again
Data Type Conversion

In 10g there were issues with data type management. When dividing an integer by an integer, the result (integer or decimal) differed depending on where the calculation was done: on the BI Server or on some databases the result was an integer, but when pushed to an Oracle database the result was a decimal.

To fix this and comply with the ANSI standard, in 11g integer/integer = integer for all data sources.
Data Type Conversion

This may cause some discrepancies between 10g and 11g results. Adding a CAST function in calculations may be required in 11g to get the same results as in 10g.

It is also possible to go back to the 10g behavior by creating the session variable DISABLE_FLOOR_IN_DIVISION with value 1. Note that this brings back the same inconsistencies as in 10g.
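For instance (with hypothetical column names — the first expression illustrates the 11g ANSI rule, the second the explicit cast workaround):

```sql
-- 11g on any data source: INT / INT stays INT, so 7 / 2 yields 3
"F"."Units" / "F"."Orders"

-- Cast one operand to get decimal results, as Oracle returned in 10g:
CAST("F"."Units" AS DOUBLE) / "F"."Orders"
```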
