Lift Chart (Training Dataset) : A) Answer

a) Answer
Y Xun X normal Actual Y Pcut = 0.4 Pcut = 0.5
0 20 0 0 0 0
0 22 0.125 0 0 0
0 24 0.25 0 0 0
0 26 0.375 0 0 0
0 28 0.5 0 1 1
0 28.6 0.5375 0 1 1
0 27 0.4375 0 1 0
1 27.4 0.4625 1 1 0
1 28 0.5 1 1 1
1 28.4 0.525 1 1 1
1 29 0.5625 1 1 1
1 30 0.625 1 1 1
1 32 0.75 1 1 1
1 34 0.875 1 1 1
1 36 1 1 1 1
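The X normal column is consistent with min-max scaling of Xun over its observed range (20 to 36), and the Pcut columns with predicting class 1 whenever X normal is at or above the cutoff. The following is a minimal Python sketch under those two assumptions; the function names are hypothetical, and the "at or above" tie rule is inferred from the rows where X normal = 0.5.

```python
# Xun values from the a) table, in row order.
XUN = [20, 22, 24, 26, 28, 28.6, 27, 27.4, 28, 28.4, 29, 30, 32, 34, 36]


def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def classify(scores, pcut):
    """Predict class 1 when the score is at or above the cutoff (assumed tie rule)."""
    return [1 if s >= pcut else 0 for s in scores]


x_normal = min_max_normalize(XUN)
pred_04 = classify(x_normal, 0.4)
pred_05 = classify(x_normal, 0.5)

for xun, xn, p4, p5 in zip(XUN, x_normal, pred_04, pred_05):
    print(xun, round(xn, 4), p4, p5)
```

Under these assumptions the printed columns should line up with the X normal, Pcut = 0.4 and Pcut = 0.5 columns of the table.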
B) Answers
Confusion matrices for the a) rule (X normal):
Pcut value = 0.4
Actual class   Predicted 1   Predicted 0
1              8             0
0              3             4
Accuracy 0.80   Sensitivity 1.00   Specificity 0.57
Pcut value = 0.5
Actual class   Predicted 1   Predicted 0
0              2             5
Pcut value = 0.6
Actual class   Predicted 1   Predicted 0
1              4             4
0              0             7
Specificity 1.00
c) Answer
[Figure: Lift chart (training dataset). x-axis: # cases; y-axis: Cumulative; series: Cumulative Y when sorted using predicted values, Cumulative Y using average.]
When we choose 5 cases at random, we can expect to capture only a little over 2 successes. But when we take the best 5 cases, we capture almost 5 successes, so the prediction is shown to work correctly.
d) Answer
Y Xun X normal Pi Actual Y Pcut = 0.4 Pcut = 0.5
0 20 0 0.1464 0 0 0
0 22 0.125 0.1938 0 0 0
0 24 0.25 0.2500 0 0 0
0 26 0.375 0.3232 0 0 0
0 28 0.5 0.5000 0 1 1
0 28.6 0.5375 0.5968 0 1 1
0 27 0.4375 0.3750 0 0 0
1 27.4 0.4625 0.4032 1 1 0
1 28 0.5 0.5000 1 1 1
1 28.4 0.525 0.5791 1 1 1
1 29 0.5625 0.6250 1 1 1
1 30 0.625 0.6768 1 1 1
1 32 0.75 0.7500 1 1 1
1 34 0.875 0.8062 1 1 1
1 36 1 0.8536 1 1 1
Confusion matrices for the d) rule (Pi):
Pcut value = 0.4
Actual class   Predicted 1   Predicted 0
0              2             5
Pcut value = 0.5
Actual class   Predicted 1   Predicted 0
0              2             5
Pcut value = 0.6
Actual class   Predicted 1   Predicted 0
0              0             7
When the Pcut value is 0.5, the two methods show the same sensitivity and specificity. However, when we choose other Pcut values, d)'s Pi shows better results.
Therefore, the d) rule would be better.
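The comparison of the two rules can be checked by rebuilding the confusion matrices from the a) and d) tables. This is a small Python sketch, assuming a case is predicted as class 1 when its score is at or above Pcut; the helper names `confusion` and `metrics` are hypothetical.

```python
# Actual class and the two scores, in the row order of the tables.
Y = [0] * 7 + [1] * 8
X_NORMAL = [0, 0.125, 0.25, 0.375, 0.5, 0.5375, 0.4375,
            0.4625, 0.5, 0.525, 0.5625, 0.625, 0.75, 0.875, 1]
PI = [0.1464, 0.1938, 0.2500, 0.3232, 0.5000, 0.5968, 0.3750,
      0.4032, 0.5000, 0.5791, 0.6250, 0.6768, 0.7500, 0.8062, 0.8536]


def confusion(actual, scores, pcut):
    """Return (TP, FN, FP, TN), predicting 1 when score >= pcut."""
    tp = fn = fp = tn = 0
    for a, s in zip(actual, scores):
        p = 1 if s >= pcut else 0
        if a == 1 and p == 1:
            tp += 1
        elif a == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity and specificity from the four cell counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
    }


for pcut in (0.4, 0.5, 0.6):
    for name, scores in (("X normal", X_NORMAL), ("Pi", PI)):
        cells = confusion(Y, scores, pcut)
        print(pcut, name, cells, metrics(*cells))
```

With these inputs the two rules give identical counts at Pcut = 0.5, while Pi produces fewer false positives at 0.4 and more true positives at 0.6, which is the pattern the answer describes.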
XLMiner : Multiple Linear Regression
Inputs
Data
Training data used for building the model ['2007007723_이승훈_Data Mining_HW#3.xlsx']'data'!$A$34:$D$48
# Records in the training data 15
Variables
# Input Variables 1
Input variables Pi
Output variable Y
Constant term present Yes
Output options chosen

Summary report of scoring on training data
Lift charts on training data
The Regression Model
Input variables   Coefficient   Std. Error   p-value      SS
Constant term     -0.32129854   0.24628934   0.21466681   4.26666689
Pi                1.69143438    0.44914669   0.00235517   1.94783044

Residual df          13
R-squared            0.5217402941
Std. Dev. estimate   0.37060273
Residual SS          1.78550291

Training Data scoring - Summary Report
Total sum of squared errors   RMS Error      Average Error
1.78550285446                 0.3450123529   4.1354902E-08

Elapsed Time
Overall (secs) 4.00

Date: 09-Oct-2013 20:01:05 (Ver: 12.5.3E)
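The summary figures in this report are tied together by a few standard identities, and a short Python sketch can confirm that they agree with one another. It assumes that the SS value on the Pi row is the regression sum of squares, that R-squared is regression SS over total SS, and that the RMS error divides the residual SS by the number of records while the standard deviation estimate divides it by the residual degrees of freedom. The function name `fit_summary` is hypothetical.

```python
import math

# Values copied from the report above.
N_RECORDS = 15
RESIDUAL_DF = 13
SS_REGRESSION = 1.94783044  # SS on the Pi row (assumed to be the regression SS)
SS_RESIDUAL = 1.78550291    # Residual SS


def fit_summary(ss_reg, ss_res, n, df):
    """Derive R-squared, the std. dev. estimate and the RMS error from sums of squares."""
    return {
        "r_squared": ss_reg / (ss_reg + ss_res),
        "std_dev_estimate": math.sqrt(ss_res / df),
        "rms_error": math.sqrt(ss_res / n),
    }


summary = fit_summary(SS_REGRESSION, SS_RESIDUAL, N_RECORDS, RESIDUAL_DF)
for name, value in summary.items():
    print(name, round(value, 6))
```

If the assumptions hold, the three derived values should match the reported R-squared, Std. Dev. estimate and RMS Error to within rounding.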
XLMiner : Multiple Linear Regression - Lift chart for training data
[Figure: Lift chart (training dataset). x-axis: # cases; y-axis: Cumulative; series: Cumulative Y when sorted using predicted values, Cumulative Y using average.]
[Figure: Decile-wise lift chart (training dataset). x-axis: Deciles; y-axis: Decile mean / Global mean.]

Decile   Mean           Std.Dev.       Min.   Max.   Decile mean / Global mean
1        1              0              1      1      1.875
2        1              0              1      1      1.875
3        1              0              1      1      1.875
4        1              0              1      1      1.875
5        1              0              1      1      1.875
6        0              0              0      0      0
7        1              0              1      1      1.875
8        0              0              0      0      0
9        1              0              1      1      1.875
10       0.1666666667   0.3726779962   0      1      0.3125

Serial no.   Predicted value   Actual value   Cumulative actual value        Cumulative actual value
             (training data)   (training data)   (sorted using predicted values)   (using average)
1            1.1224310107      1              1                              0.5333333333
2            1.0423125458      1              2                              1.0666666667
3            0.947277245       1              3                              1.6
4            0.8234248295      1              4                              2.1333333333
5            0.7358479475      1              5                              2.6666666667
6            0.6881910802      0              5                              3.2
7            0.6581382797      1              6                              3.7333333333
8            0.52441865        0              6                              4.2666666667
9            0.52441865        1              7                              4.8
10           0.3606462198      1              8                              5.3333333333
11           0.3129893525      0              8                              5.8666666667
12           0.2254124705      0              8                              6.4
13           0.101560055       0              8                              6.9333333333
14           0.0065247542      0              8                              7.4666666667
15           -0.073593711      0              8                              8

Date: 09-Oct-2013 20:01:06 (Ver: 12.5.3E)
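The two lift-chart series can be reproduced from the d) table and the fitted coefficients. The Python sketch below scores each case with the reported regression model, sorts the cases by predicted value in descending order, and accumulates the actual Y values; the baseline is the overall success rate times the number of cases taken. The function name `lift_curve` is hypothetical, and keeping tied scores in their original row order is an assumption that happens to match the report.

```python
# Coefficients from the regression report (Y regressed on Pi).
INTERCEPT, SLOPE = -0.32129854, 1.69143438

# Actual class and Pi, in the row order of the d) table.
Y = [0] * 7 + [1] * 8
PI = [0.1464, 0.1938, 0.2500, 0.3232, 0.5000, 0.5968, 0.3750,
      0.4032, 0.5000, 0.5791, 0.6250, 0.6768, 0.7500, 0.8062, 0.8536]


def lift_curve(actual, predicted):
    """Return (cumulative actuals sorted by prediction, cumulative average baseline)."""
    # sorted() is stable, so tied predictions keep their original row order.
    order = sorted(range(len(actual)), key=lambda i: -predicted[i])
    cumulative, running = [], 0
    for i in order:
        running += actual[i]
        cumulative.append(running)
    rate = sum(actual) / len(actual)
    baseline = [rate * (k + 1) for k in range(len(actual))]
    return cumulative, baseline


predicted = [INTERCEPT + SLOPE * p for p in PI]
cumulative, baseline = lift_curve(Y, predicted)

for k, (c, b) in enumerate(zip(cumulative, baseline), start=1):
    print(k, c, round(b, 4))
```

At 5 cases the sorted series reaches 5 successes while the average baseline is about 2.67, which is the gap the lift chart displays.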
DataSource
WorkBook Path D:\2013 2학기 수업\Data Mining\HW3
WorkBook Name 2007007723_이승훈_Data Mining_HW#3.xlsx
Training Range [data]!$A$34:$D$48
#Training Rows 15
#Variables in Data set 4
#Selected Variables 2
Data Dictionary
Variables in Data Set   Y            Xun          X normal     Pi
Variable Type*          Continuous   Continuous   Continuous   Continuous
Variable Data Type      Number       Number       Number       Number
Mining Schema
Selected Variables Pi Y
Variable Type Input Output
Inputs Normalised No
Model
Input Variables Coefficient

Constant Term -0.32129854
Pi 1.69143438
Date: 09-Oct-2013 20:01:08 (Ver: 12.5.3E)
*This is an indication of how XLMiner stores this variable for later retrieval; it does not necessarily reflect what type of variable was originally input.
Inputs
Data
Training data used for building the model ['2007007723_이승훈_Data Mining_HW#3.xlsx']'data'!$A$2:$D$16
# Records in the training data 15
Variables
# Input Variables 1
Input variables X normal
Output variable Y
Output options chosen

Summary report of scoring on training data
Lift charts on training data
The Regression Model
Input variables   Coefficient   Std. Error   p-value      SS
Constant term     -0.14704199   0.22532152   0.52539462   4.26666689
X normal          1.3562299     0.40151784   0.00494938   1.74501586

Residual df          13
R-squared            0.4674149604
Std. Dev. estimate   0.39108503
Residual SS          1.98831749

Training Data scoring - Summary Report
Total sum of squared errors   RMS Error      Average Error
1.988317449163                0.3640803436   -9.8333333E-09

Elapsed Time
Overall (secs) 2.00

Date: 09-Oct-2013 20:01:32 (Ver: 12.5.3E)
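The coefficients in this report can be checked with an ordinary least squares fit of Y on X normal. The sketch below uses the closed-form formulas for simple linear regression in plain Python; it illustrates the calculation and does not claim to reproduce XLMiner's internal procedure, and the function name `simple_ols` is hypothetical.

```python
# Actual class and X normal, in the row order of the a) table.
Y = [0] * 7 + [1] * 8
X_NORMAL = [0, 0.125, 0.25, 0.375, 0.5, 0.5375, 0.4375,
            0.4625, 0.5, 0.525, 0.5625, 0.625, 0.75, 0.875, 1]


def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; also return R-squared."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    r_squared = (slope * sxy) / syy  # regression SS over total SS
    return intercept, slope, r_squared


intercept, slope, r_squared = simple_ols(X_NORMAL, Y)
print(round(intercept, 6), round(slope, 6), round(r_squared, 6))
```

The fitted intercept, slope and R-squared should agree with the Constant term, X normal coefficient and R-squared in the report to several decimal places.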
XLMiner : Multiple Linear Regression - Lift chart for training data
[Figure: Lift chart (training dataset). x-axis: # cases; y-axis: Cumulative; series: Cumulative Y when sorted using predicted values, Cumulative Y using average.]
[Figure: Decile-wise lift chart (training dataset). x-axis: Deciles; y-axis: Decile mean / Global mean.]

Decile   Mean           Std.Dev.       Min.   Max.   Decile mean / Global mean
1        1              0              1      1      1.875
2        1              0              1      1      1.875
3        1              0              1      1      1.875
4        1              0              1      1      1.875
5        1              0              1      1      1.875
6        0              0              0      0      0
7        1              0              1      1      1.875
8        0              0              0      0      0
9        1              0              1      1      1.875
10       0.1666666667   0.3726779962   0      1      0.3125

Serial no.   Predicted value   Actual value   Cumulative actual value        Cumulative actual value
             (training data)   (training data)   (sorted using predicted values)   (using average)
1            1.20918791        1              1                              0.5333333333
2            1.0396591725      1              2                              1.0666666667
3            0.870130435       1              3                              1.6
4            0.7006016975      1              4                              2.1333333333
5            0.6158373288      1              5                              2.6666666667
6            0.5819315812      0              5                              3.2
7            0.5649787075      1              6                              3.7333333333
8            0.53107296        0              6                              4.2666666667
9            0.53107296        1              7                              4.8
10           0.4802143388      1              8                              5.3333333333
11           0.4463085912      0              8                              5.8666666667
12           0.3615442225      0              8                              6.4
13           0.192015485       0              8                              6.9333333333
14           0.0224867475      0              8                              7.4666666667
15           -0.14704199       0              8                              8

Date: 09-Oct-2013 20:01:33 (Ver: 12.5.3E)
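The decile-wise lift values can be reproduced from the actual values in sorted order. The binning rule in the Python sketch below is an assumption inferred from the reported decile means: with 15 records, each of the first nine deciles appears to hold one record and the tenth holds the remaining six. The function name `decile_lift` is hypothetical, and the sketch expects at least as many records as bins.

```python
# Actual Y values in descending order of predicted value (from the lift chart data).
SORTED_ACTUAL = [1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0]


def decile_lift(sorted_actual, n_bins=10):
    """Return decile mean / global mean for each bin.

    Assumed binning: the first n_bins - 1 bins each take len // n_bins records
    and the last bin takes whatever is left.
    """
    n = len(sorted_actual)
    size = n // n_bins
    global_mean = sum(sorted_actual) / n
    lifts = []
    for d in range(n_bins):
        start = d * size
        end = start + size if d < n_bins - 1 else n
        group = sorted_actual[start:end]
        lifts.append((sum(group) / len(group)) / global_mean)
    return lifts


lifts = decile_lift(SORTED_ACTUAL)
for decile, lift in enumerate(lifts, start=1):
    print(decile, round(lift, 4))
```

A value above 1 means the decile contains a higher share of successes than the data set as a whole; here the global mean is 8/15, so a decile made up entirely of successes has a lift of 1.875.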
DataSource
WorkBook Path D:\2013 2학기 수업\Data Mining\HW3
WorkBook Name 2007007723_이승훈_Data Mining_HW#3.xlsx
Training Range [data]!$A$2:$D$16
#Training Rows 15
#Variables in Data set 4
#Selected Variables 2
Data Dictionary
Variables in Data Set   Y            Xun          a) answer     X normal
Variable Type*          Continuous   Continuous   Categorical   Continuous
Variable Data Type      Number       Number       String        Number
Mining Schema
Selected Variables X normal Y
Variable Type Input Output
Inputs Normalised No
Model
Input Variables Coefficient

Constant Term -0.14704199
X normal 1.3562299
Date: 09-Oct-2013 20:01:33 (Ver: 12.5.3E)
*This is an indication of how XLMiner stores this variable for later retrieval; it does not necessarily reflect what type of variable was originally input.
