
SAP BW Release 3.5
07.10.2004

Copyright

Copyright 2002 SAP AG. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the
express permission of SAP AG. The information contained herein may be changed without prior
notice.

Some software products marketed by SAP AG and its distributors contain proprietary software
components of other software vendors.

Microsoft, WINDOWS, NT, EXCEL, Word, PowerPoint and SQL Server are registered
trademarks of Microsoft Corporation.

IBM, DB2, DB2 Universal Database, OS/2, Parallel Sysplex, MVS/ESA, AIX, S/390,
AS/400, OS/390, OS/400, iSeries, pSeries, xSeries, zSeries, z/OS, AFP, Intelligent Miner,
WebSphere, Netfinity, Tivoli, Informix and Informix Dynamic Server™ are trademarks of IBM
Corporation in USA and/or other countries.

ORACLE is a registered trademark of ORACLE Corporation.

UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group.

Citrix, the Citrix logo, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame,
MultiWin and other Citrix product names referenced herein are trademarks of Citrix Systems, Inc.

HTML, DHTML, XML, XHTML are trademarks or registered trademarks of W3C, World Wide Web
Consortium, Massachusetts Institute of Technology.

JAVA is a registered trademark of Sun Microsystems, Inc.

JAVASCRIPT is a registered trademark of Sun Microsystems, Inc., used under license for
technology invented and implemented by Netscape.

SAP, SAP Logo, R/2, RIVA, R/3, SAP ArchiveLink, SAP Business Workflow, WebFlow,
SAP EarlyWatch, BAPI, SAPPHIRE, Management Cockpit, mySAP, mySAP.com, and other
SAP products and services mentioned herein as well as their respective logos are
trademarks or registered trademarks of SAP AG in Germany and in several other countries
all over the world. MarketSet and Enterprise Buyer are jointly owned trademarks of
SAP Markets and Commerce One. All other product and service names mentioned are the
trademarks of their respective owners.

Scoring BW 3.5
-3- 07.10.2004

Contents

An Introduction to Weighted Score Tables
Use and Applications
Typical Input
Typical Output
Data Mining Functions in the Analysis Process Designer (APD)
Settings for Weighted Score Tables
    Model Fields for Weighted Score Tables
    Field Parameters for Outlier Treatment
    Field Parameters for Treating Missing Values
    Model Parameters for Weighted Score Tables
Algorithm
Recommendations for Weighted Score Tables


An Introduction to Weighted Score Tables


A weighted score table is a method of evaluating alternatives when the importance of each criterion
differs. In a weighted score table, each alternative is given a score for each criterion. These scores are
then weighted by the importance of each criterion. All of an alternative's weighted scores are then
added together to calculate that alternative's total weighted score. The alternative with the highest total
score should be the best alternative.
You can use weighted score tables to make predictions about future customer behavior. You create a
model in the data mining application to make predictions. After a model has been created based on
historical data, it can then be applied to new data to make predictions. The prediction, that is, the
output of the model, is called a score. You can create a single score for your customers by taking into
account different dimensions.
SAP's Weighted Score Tables method allows you to define your own valuation function by first
assigning weights to the individual model fields and then creating a weighted total from these model
fields.


Use and Applications


Weighted Score Table
A beverage outlet wants to attract the younger end of the market by introducing a product from a
higher price category into its product range. To determine potential customers, customer data
(including attributes like age, income, and drink expenditure) is valuated directly using the weighted
score tables. The age group 10-19 is valuated with 15, 20-29 with 10, 30-39 with 5, and so forth. The
customer incomes are valuated continuously by taking the respective figure as the value in each case
(that is, an income of 50,000 is valuated with 50,000). A weight is then assigned to each attribute: 2 for
age and 0.0001 for income. Thus, the score for a 25-year-old customer with an income of 40,000 is
calculated as follows: (2 x 10) + (0.0001 x 40,000) = 24.
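
The calculation above can be retraced with a minimal Python sketch. The function and constant names are hypothetical and chosen only for illustration; the age groups beyond 30-39 are elided in the text ("and so forth") and are therefore not filled in here.

```python
# Partial scores per age group as given in the text; further groups are
# elided there ("and so forth") and deliberately left out of this sketch.
AGE_GROUP_SCORES = {(10, 19): 15, (20, 29): 10, (30, 39): 5}

# Weights assigned to the two attributes in the text.
WEIGHT_AGE = 2
WEIGHT_INCOME = 0.0001


def age_partial_score(age):
    """Look up the partial score of the age group that contains `age`."""
    for (low, high), score in AGE_GROUP_SCORES.items():
        if low <= age <= high:
            return score
    raise ValueError(f"no age group defined in this sketch for age {age}")


def customer_score(age, income):
    """Weighted total: the income is valuated continuously with its own figure."""
    return WEIGHT_AGE * age_partial_score(age) + WEIGHT_INCOME * income


# The 25-year-old customer with an income of 40,000 from the text.
print(round(customer_score(25, 40_000), 6))
```

The income weight of 0.0001 scales the raw income figure down so that it does not dominate the age score.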

Advertising Measures
You can use Weighted Score Tables to define a customer valuation that is dependent on the
characteristics and key figures of the customers. You can use valuation to support advertising
measures, to make service offers or rebates, or to select customers of interest for other purposes.


Typical Input
The following table contains data that could make up part of the typical input data for weighted score
tables. The customer ID is the table's key, and State, Salary, and Status are discrete fields. The
remaining fields are continuous.

Customer ID  State  Salary           No. of Children  Status
1            WA     USD50K - USD70K  1                BRONZE
2            BC     USD70K - USD90K  1                BRONZE
3            WA     USD50K - USD70K  1                BRONZE
4            BC     USD10K - USD30K  4                NORMAL
5            CA     USD30K - USD50K  3                SILVER
6            WA     USD70K - USD90K  3                BRONZE
7            OX     USD30K - USD50K  2                BRONZE
8            DF     USD50K - USD70K  2                BRONZE
9            BC     USD10K - USD30K  5                NORMAL
10           OR     USD30K - USD50K  4                GOLDEN
11           CA     USD50K - USD70K  4                BRONZE
12           WA     USD50K - USD70K  1                BRONZE


Typical Output
The results of a weighted score table can be displayed in tabular form showing the individual records
and their result values.

In the above example, the partial score for each model field (Number of Children, State, Salary,
and Status) is displayed.

Data Mining Functions in the Analysis Process Designer (APD)

The Analysis Process Designer (APD) is the application environment for the SAP data mining solution.
From SAP BW Release 3.5, data mining functions are fully integrated into the APD. You can perform
the following functions in the APD:

Creating and changing data mining models


Training data mining models with SAP BW data (data mining model as data target in the analysis
process)
Execution of data mining methods such as prediction with decision tree, with cluster model and
integration of data mining models from third parties (data mining model as a transformation in the
analysis process)
Visualization of data mining models

For more information, see SAP Library at help.sap.com under SAP NetWeaver -> Release 04 ->
Information Integration -> SAP Business Information Warehouse -> BI Platform -> Analysis Process
Designer / Data Mining


Settings for Weighted Score Tables


The input data for SAP's Weighted Score Table is divided into two parts:

Model Fields
Model Parameters

Model Fields for Weighted Score Tables


Model fields are the attributes that define the object, and the predictable field is the class label. In
the Model Fields screen, you can add the fields that are required for creating the model. You must
define the content type for each model field.

(1) Content Type


The content type defines the data in a model field. There are 4 content types for model fields used in
decision tree classification.
Key field: The key field acts as the record identifier. This field does not have any influence on the
outcome.
Discrete: Also referred to as categorical, the data in the model field for this content type contains a
finite set of values. For example, a model field Gender has two values: Male and Female. Attributes
like Color, Gender, and Status are examples of discrete attributes.

Continuous: Continuous data can have any value in an interval of real numbers. This implies that the
value does not have to be an integer. Attributes having an infinite set of possible real values are called
Continuous. Typically, they have a Minimum and a Maximum value, and attribute values can be
anything within this interval. Attributes like Salary, Sales Revenue, and Quantity Sold are examples of
Continuous attributes. You can discretize a Continuous attribute by defining fixed intervals. For
example, if the salary ranges from $100 to $20000, then we can form intervals like $0-$2000,
$2000-$4000, $4000-$6000, ..., $18000-$20000. An attribute value will fall into one of these intervals.
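
The discretization described above can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, and since the text does not say to which interval a border value such as $2000 belongs, the sketch assumes that each interval includes its left border (and that the last interval also includes $20000).

```python
from bisect import bisect_right

# Interval borders from the text: $0-$2000, $2000-$4000, ..., $18000-$20000.
BORDERS = list(range(0, 20001, 2000))


def salary_interval(salary):
    """Return the (low, high) interval that contains `salary`.

    Assumption (not stated in the text): intervals include their left border;
    the last interval additionally includes the upper border 20000.
    """
    if not BORDERS[0] <= salary <= BORDERS[-1]:
        raise ValueError(f"salary {salary} lies outside the defined intervals")
    # bisect_right gives the index of the first border greater than salary;
    # clamp it so that the maximum value falls into the last interval.
    index = min(bisect_right(BORDERS, salary), len(BORDERS) - 1)
    return BORDERS[index - 1], BORDERS[index]


print(salary_interval(100))     # lowest salary from the text
print(salary_interval(4500))
print(salary_interval(20000))   # highest salary from the text
```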

(2) Field Parameters for Weighted Score Tables


You can specify the parameter values for each model field, except the Key Field. You must specify the
partial weights for the model fields depending on the type of the model field.


Discrete: For model fields of type discrete, you specify the individual values of the field. As described
below, you can define a common partial weight for the remaining values. This weight is
applied only if you have set the Treat as separate instance indicator on the Outlier treatment tab page.
For more detailed information about handling outliers, see the section Field Parameters for Outlier
Treatment.
For a model field of type Discrete, the field parameters are as follows:
Weight of Model Field: You can define the weight of a model field.
Value: You enter the values for the model field in this column.
Partial Score: For each model field value that you enter, you must specify a partial weight. The
weight of the model field determines the share that these partial weights have in the score.
Partial Score for remaining values: You can define a single partial weight for all the remaining values in the
dataset.
The score is calculated as follows:
Score (Field1, Field2, ...) = Weight1 x Partial Weight1 (Field1) + Weight2 x Partial Weight2 (Field2) + ...


Field Parameters for type Continuous

In the case of continuous model fields, you specify partial weights for individual threshold values. You
must also specify how the partial weight is determined for values that lie between the threshold values.
You have the following options:
Function is piecewise constant: Set this indicator to define the function of the partial weights as
piecewise constant. The function is then constant between each pair of threshold values. That is,
depending on the setting, the partial weight of either the left threshold or the right threshold is used.
Alternatively, if you do not set the Function is piecewise constant indicator, linear interpolation is
applied to calculate the partial weight continuously between each pair of threshold values.
In each case, you have to specify at least two threshold values because the value range used for
outlier treatment lies above the largest threshold and beneath the smallest threshold.

The score value to be defined is dependent on the discrete model field Status and the
continuous model field Salary. The weighting of these two model fields should be 3 and 1
respectively. In the model field Status, the data to be processed takes the values gold,
silver, bronze, copper, and iron. The following partial weightings could then be specified:
Value    Partial Weighting
Gold     10
Silver   6
Bronze   4

The partial weighting 2 can be assigned to the remaining values. For the model field
Salary, the threshold values and corresponding partial weightings could be assigned as
follows:

Threshold Value  Partial Weighting
0                0
10 000           10
25 000           20
50 000           30

The partial weightings function should be piecewise constant and take the partial
weighting of the left threshold value in the interval between two threshold values. In this
way, the score value (silver, 40 000) = 3 x 6 + 1 x 20 = 38 is obtained. If the partial
weightings function for the salary should be continuous instead of piecewise constant,
then linear interpolation between the threshold values 25 000 and 50 000 gives the partial
weighting 20 + 10 x (15 000 / 25 000) = 26, which produces the score value
(silver, 40 000) = 3 x 6 + 1 x 26 = 44.
If the Treat as separate instance option was selected in outlier handling for the model field
Status, then the function produces the score value (iron, 10 000) = 3 x 2 + 1 x 10 = 16.
If the Constant extrapolation option was chosen in outlier handling for the model field
Salary, then the function produces the score value (silver, 60 000) = 3 x 6 + 1 x 30 = 48.
If the Extrapolation option is chosen, the last linear segment is continued beyond 50 000,
which produces the score value (silver, 60 000) = 3 x 6 + 1 x 34 = 52.
For more detailed information about how to treat outliers, see the section Field Parameters for
Outlier Treatment.
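
The score values in this example can be reproduced with the following Python sketch. It is not SAP code: all names are hypothetical, the piecewise constant variant always uses the left threshold, and the sketch assumes that the Extrapolation option continues the outermost linear segment (which matches the value 34 above).

```python
from bisect import bisect_right

STATUS_WEIGHT, SALARY_WEIGHT = 3, 1            # weights of the two model fields
STATUS_PARTIAL = {"gold": 10, "silver": 6, "bronze": 4}
STATUS_REMAINING = 2                           # partial weighting for remaining values
THRESHOLDS = [0, 10_000, 25_000, 50_000]       # salary threshold values
PARTIALS = [0, 10, 20, 30]                     # partial weightings at the thresholds


def status_partial(status, treat_as_separate_instance=True):
    """Partial weighting of a discrete value; unlisted values share one weighting."""
    if status in STATUS_PARTIAL:
        return STATUS_PARTIAL[status]
    if treat_as_separate_instance:
        return STATUS_REMAINING
    raise ValueError(f"outlier {status!r}: other treatments are not modeled here")


def salary_partial(salary, piecewise_constant=True, linear_extrapolation=False):
    """Partial weighting of a continuous value, including outlier treatment."""
    inside = THRESHOLDS[0] <= salary <= THRESHOLDS[-1]
    if not inside and not linear_extrapolation:
        # Constant extrapolation: keep the weighting of the nearest outer threshold.
        return PARTIALS[0] if salary < THRESHOLDS[0] else PARTIALS[-1]
    left = bisect_right(THRESHOLDS, salary) - 1    # index of the left threshold
    if inside and piecewise_constant:
        return PARTIALS[left]
    # Linear interpolation; outliers reuse the outermost segment (extrapolation).
    i = min(max(left, 0), len(THRESHOLDS) - 2)
    slope = (PARTIALS[i + 1] - PARTIALS[i]) / (THRESHOLDS[i + 1] - THRESHOLDS[i])
    return PARTIALS[i] + slope * (salary - THRESHOLDS[i])


def score(status, salary, **salary_options):
    return (STATUS_WEIGHT * status_partial(status)
            + SALARY_WEIGHT * salary_partial(salary, **salary_options))


print(score("silver", 40_000))                             # piecewise constant
print(score("silver", 40_000, piecewise_constant=False))   # linear interpolation
print(score("iron", 10_000))                               # remaining value
print(score("silver", 60_000))                             # constant extrapolation
print(score("silver", 60_000, linear_extrapolation=True))  # extrapolation
```

The five calls correspond, in order, to the five score values derived in the example.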


Field Parameters for Outlier Treatment


The parameters for the model fields offer control options for treating outliers.
Which values are treated as outliers?
For discrete model fields, outliers are values that do not belong to the values specified explicitly or to
the most frequently occurring values.
For continuous model fields, outliers are values falling outside of the outer borders that are
determined during the definition of the value ranges, either explicitly or automatically.
You can make an outlier treatment setting to decide whether processing is stopped, the record is
ignored, or the default score is set when a record occurs containing an outlier.

Continuous Model Fields


For continuous model fields, you can specify that an extrapolation is applied.
Extrapolation: Whenever a linear regression model is fit to a group of data, the range of the data
should be carefully specified. Using the regression method, it is possible to make predictions for
values outside this specified range of data. This process is known as extrapolation. If you choose the
option Extrapolation, the outliers, that is, the values lying outside the specified range of values, are
not treated separately; the function is continued linearly beyond the outer borders.
Constant Extrapolation: If you select this option, the function is continued as a constant beyond
the outer borders.


Discrete Model Fields


Treat as separate instance: This option is valid only for discrete model fields. By setting this option,
all outliers are treated as a single, common remainder.

Field Parameters for Treating Missing Values


The parameters for the model fields offer control options for handling missing values.


For treating missing values, you first have to set the appropriate indicator and identify a missing value.
If, for example, the size of family is denoted by a numeric value and NA has been used to denote a
value that is unknown, you can enter NA as the Missing Value. You define a separate treatment for
this value accordingly.
You can make a setting to decide whether processing is stopped, the record is ignored, or the default
score is set when a value defined in this way occurs. Using the option Replace by value, you can
substitute the missing value with another value.
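
The four treatment options can be illustrated with a short Python sketch. Everything here is hypothetical and chosen for illustration: the enum, the function names, and the placeholder partial score for the family size are assumptions, not SAP settings. The default score of 0 corresponds to the documented default of the Default Score model parameter.

```python
from enum import Enum


class Treatment(Enum):
    STOP = "stop processing"
    IGNORE = "ignore record"
    DEFAULT_SCORE = "set default score"
    REPLACE = "replace by value"


MISSING_VALUE = "NA"   # marker entered as the Missing Value
DEFAULT_SCORE = 0      # default of the Default Score model parameter


def partial_score(family_size):
    """Hypothetical placeholder valuation of the family size."""
    return 2 * family_size


def score_records(records, treatment, replacement=None):
    """Score (key, family_size) records, applying the chosen missing value treatment."""
    results = {}
    for key, family_size in records:
        if family_size == MISSING_VALUE:
            if treatment is Treatment.STOP:
                raise ValueError(f"missing value in record {key}")
            if treatment is Treatment.IGNORE:
                continue
            if treatment is Treatment.DEFAULT_SCORE:
                results[key] = DEFAULT_SCORE
                continue
            family_size = replacement   # Treatment.REPLACE
        results[key] = partial_score(family_size)
    return results


records = [(1, 3), (2, "NA"), (3, 5)]
print(score_records(records, Treatment.IGNORE))
print(score_records(records, Treatment.DEFAULT_SCORE))
print(score_records(records, Treatment.REPLACE, replacement=4))
```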

Model Parameters for Weighted Score Tables

Default Score
You use this parameter to specify a default output value for weighted score tables. If required, this
value is always set whenever a record does not fulfill certain conditions (for example, it has missing
data or outliers). The default value for this field is 0 (zero).


Algorithm
Weighted Score Tables
A function f that is defined by weighted score tables is a linear combination of functions of one
variable each:
f(x_1, ..., x_n) = w_1 f_1(x_1) + ... + w_n f_n(x_n)
The weights w_1, ..., w_n are arbitrary numbers. Each of the functions f_1, ..., f_n is mapped to exactly one
model field. The arguments x_1, ..., x_n of these functions are the values that the model fields can
take.
For discrete model fields, the score table of the model field is used to directly assign a function value
f_i(x_i) to individual values x_i of the model field. A common function value can be assigned to values
that are not listed explicitly in the table.
For continuous model fields, the score table of the model field is also used to directly assign a function
value f_i(x_i) to individual values x_i of the model field. Between two such points, either a linear
interpolation is made, or the function value from the left or right point is taken. Respectively, either a
polygon line or a piecewise constant function is defined. Depending on the option selected by the user,
the function is continued as linear or constant beyond the outer points.
Let us assume that you would like to valuate your customer data on the basis of the fields Occupation
and Age. For this, you could define a weighted score table function as follows:
Score (Occupation, Age) = w1 x ps1 (Occupation) + w2 x ps2 (Age)

w1 and w2 stand for the weights you give the two fields, such as:

w1 := Weight for occupation := 2

w2 := Weight for age := 5

ps1 and ps2 stand for the functions with which you define partial scores for both fields, as in the
following table:

Occupational Group  ps1 (Occupational Group)
Employee            5
Civil servant       7
Self-employed       10
Other               2

Age  ps2 (Age)
0    0
20   6
30   10
50   4
65   2

Ages falling between those specified above should be interpolated. For example, the age 25 lies halfway
between the ages 20 and 30, so ps2 (25) = 8. This then gives you, for example:
Score (Employee, 25) = 2 x ps1 (Employee) + 5 x ps2 (25) = 2 x 5 + 5 x 8 = 50
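
The linear combination described in this section can be sketched generically in Python: each model field contributes a weight and a one-variable function. The helper names are hypothetical, the row "Other" is assumed to stand for the common function value of all unlisted occupations, and outlier treatment for ages outside 0 to 65 is left out.

```python
from bisect import bisect_right


def discrete(table, remaining):
    """Partial score function of a discrete field with a common value for the rest."""
    return lambda value: table.get(value, remaining)


def interpolated(points):
    """Partial score function of a continuous field (polygon line through points)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    def f(value):
        if not xs[0] <= value <= xs[-1]:
            raise ValueError("outlier treatment is not modeled in this sketch")
        i = min(bisect_right(xs, value) - 1, len(xs) - 2)
        return ys[i] + (value - xs[i]) * (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])

    return f


# Model fields: name -> (weight w_i, partial score function ps_i), from the tables above.
MODEL = {
    "occupation": (2, discrete(
        {"Employee": 5, "Civil servant": 7, "Self-employed": 10}, remaining=2)),
    "age": (5, interpolated([(0, 0), (20, 6), (30, 10), (50, 4), (65, 2)])),
}


def weighted_score(record):
    """f(x_1, ..., x_n) = w_1 f_1(x_1) + ... + w_n f_n(x_n)"""
    return sum(w * f(record[name]) for name, (w, f) in MODEL.items())


print(weighted_score({"occupation": "Employee", "age": 25}))
```

Keeping the weight and the partial score function together per field mirrors the structure of the formula: adding a model field means adding one entry to the dictionary.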

Recommendations for Weighted Score Tables


Start with small models containing few model fields.
For continuous model fields, you can choose to perform valuations as continuous or in stages. If
you opt for valuation in stages, you must specify whether the stages you choose contain the left
interval border or the right interval border.
Do not use any model fields with the weight 0. Remove all such fields from the model.
The structure on which the function is based is defined as the weighted sum of functions that each
depend on one individual model field. This produces certain dependencies. If, for example, the function
f is to be defined for gender and age, then male and female customers have profiles that differ only by
one additive constant. If f(male, 20) = 15 and f(female, 20) = 10, then the following applies for every
age x: f(male, x) - f(female, x) = f(male, 20) - f(female, 20) = 5. If f(male, 50) = 20, then it also applies
that f(female, 50) = 15.
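
The additive constant follows directly from the weighted-sum structure. As a short derivation, assuming gender is the first model field (weight w_1, function f_1) and age the second (weight w_2, function f_2), and using the amsmath align* environment:

```latex
\begin{align*}
f(\text{male}, x) - f(\text{female}, x)
  &= \bigl(w_1 f_1(\text{male}) + w_2 f_2(x)\bigr)
   - \bigl(w_1 f_1(\text{female}) + w_2 f_2(x)\bigr) \\
  &= w_1 \bigl(f_1(\text{male}) - f_1(\text{female})\bigr)
\end{align*}
```

The age term w_2 f_2(x) cancels, so the difference does not depend on the age x. A valuation in which the gap between male and female customers changes with age cannot be represented by a weighted score table on these two fields.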
