Boston Housing

Group I

Presented By:

•Abdullah Shaukat
•Adil Nasser
•Fawad Tariq
•Mohsin Mahmood
•Muhammad Bin Waseem
•Shiza Moeen
Contents
• Introduction to Data Set
• Model Results Explanation
• Classification Tree
• Logistic Regression
• Comparison & Conclusion
Introduction to Data Base
• This dataset contains information collected by the US Census Service concerning housing in the area of Boston, Massachusetts. It was obtained from the StatLib archive (http://lib.stat.cmu.edu/datasets/boston). The dataset has 506 cases.
• There are 14 attributes in each case of the dataset.
Introduction to Data Base

Attribute   Description
CRIM        Per capita crime rate by town
ZN          Proportion of residential land zoned for lots over 25,000 sq. ft.
INDUS       Proportion of non-retail business acres per town
CHAS        Charles River dummy variable (1 if tract bounds river; 0 otherwise)
NOX         Nitric oxides concentration (parts per 10 million)
RM          Average number of rooms per dwelling
AGE         Proportion of owner-occupied units built prior to 1940
DIS         Weighted distances to five Boston employment centers
RAD         Index of accessibility to radial highways
TAX         Full-value property-tax rate per $10,000
PTRATIO     Pupil-teacher ratio by town
B           1000(Bk - 0.63)^2, where Bk is the proportion of blacks by town
LSTAT       % lower status of the population
MEDV        Median value of owner-occupied homes in $1000s
CAT.MEDV    Indicates whether the median value (MEDV) is above or below $30,000
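As a minimal sketch of how the CAT.MEDV target relates to MEDV, the snippet below flags a tract as 1 when its median value exceeds $30,000. The function name `cat_medv`, the sample values, and the use of a strict "greater than" at the boundary are assumptions made for illustration; the slides only state that the flag separates values above and below $30,000.

```python
# Sketch: derive the binary CAT.MEDV target from MEDV (recorded in $1000s).
# Treating exactly $30,000 as class 0 is an assumption, not stated in the slides.
THRESHOLD = 30.0  # $30,000 expressed in $1000s


def cat_medv(medv: float, threshold: float = THRESHOLD) -> int:
    """Return 1 if the median home value is above the threshold, else 0."""
    return 1 if medv > threshold else 0


# Illustrative MEDV values (hypothetical, not taken from the slides).
sample_medv = [24.0, 34.7, 50.0, 30.0]
print([cat_medv(v) for v in sample_medv])  # prints [0, 1, 1, 0]
```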
Classification Tree – Results

Training: Classification Summary

Confusion Matrix
Actual\Predicted    0      1
0                   257    0
1                   0      47

Error Report
Class      # Cases   # Errors   % Error
0          257       0          0
1          47        0          0
Overall    304       0          0

Metrics
Metric                  Value
Accuracy (#correct)     304
Accuracy (%correct)     100
Specificity             1
Sensitivity (Recall)    1
Precision               1
F1 score                1
Success Class           1
Success Probability     0.5

Validation: Classification Summary

Confusion Matrix
Actual\Predicted    0      1
0                   155    10
1                   5      32

Error Report
Class      # Cases   # Errors   % Error
0          165       10         6.060606061
1          37        5          13.51351351
Overall    202       15         7.425742574

Metrics
Metric                  Value
Accuracy (#correct)     187
Accuracy (%correct)     92.57425743
Specificity             0.939393939
Sensitivity (Recall)    0.864864865
Precision               0.761904762
F1 score                0.810126582
Success Class           1
Success Probability     0.5
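The reported metrics follow directly from the confusion matrix counts. The sketch below recomputes them for the classification tree's validation partition using the standard definitions, with class 1 as the success class; the function name `classification_metrics` is an illustrative choice, not part of the tool that produced the report.

```python
def classification_metrics(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Compute standard binary-classification metrics from confusion matrix counts."""
    total = tn + fp + fn + tp
    sensitivity = tp / (tp + fn)   # recall for the success class (1)
    specificity = tn / (tn + fp)   # recall for class 0
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy_pct": 100.0 * (tn + tp) / total,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Classification tree, validation partition (counts taken from the slide).
tree_validation = classification_metrics(tn=155, fp=10, fn=5, tp=32)
for name, value in tree_validation.items():
    print(f"{name}: {value:.4f}")
```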
Classification Tree – Lift Charts
Classification Tree – Decile Charts
Classification Tree – Tree
Logistic Regression – Results

Training: Classification Summary

Confusion Matrix
Actual\Predicted    0      1
0                   251    6
1                   7      40

Error Report
Class      # Cases   # Errors   % Error
0          257       6          2.33463035
1          47        7          14.89361702
Overall    304       13         4.276315789

Metrics
Metric                  Value
Accuracy (#correct)     291
Accuracy (%correct)     95.72368421
Specificity             0.976653696
Sensitivity (Recall)    0.85106383
Precision               0.869565217
F1 score                0.860215054
Success Class           1
Success Probability     0.5

Validation: Classification Summary

Confusion Matrix
Actual\Predicted    0      1
0                   155    10
1                   3      34

Error Report
Class      # Cases   # Errors   % Error
0          165       10         6.060606061
1          37        3          8.108108108
Overall    202       13         6.435643564

Metrics
Metric                  Value
Accuracy (#correct)     189
Accuracy (%correct)     93.56435644
Specificity             0.939393939
Sensitivity (Recall)    0.918918919
Precision               0.772727273
F1 score                0.839506173
Success Class           1
Success Probability     0.5
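The "Success Probability 0.5" row is the cutoff applied to each predicted probability of class 1. A rough sketch of that step is shown below: probabilities are thresholded and tallied into a confusion matrix with the same row and column order as the slides. The probabilities and labels are hypothetical, and assigning a probability of exactly 0.5 to class 1 is an assumption.

```python
from collections import Counter

CUTOFF = 0.5  # "Success Probability" from the report


def classify(prob: float, cutoff: float = CUTOFF) -> int:
    """Assign class 1 when the predicted probability reaches the cutoff."""
    return 1 if prob >= cutoff else 0  # ">=" at the boundary is an assumption


def confusion_matrix(actual, probs, cutoff: float = CUTOFF):
    """Return [[TN, FP], [FN, TP]] with rows = actual, columns = predicted."""
    counts = Counter((a, classify(p, cutoff)) for a, p in zip(actual, probs))
    return [
        [counts[(0, 0)], counts[(0, 1)]],
        [counts[(1, 0)], counts[(1, 1)]],
    ]


# Hypothetical labels and predicted probabilities, for illustration only.
actual = [0, 0, 1, 1, 0, 1]
probs = [0.10, 0.62, 0.48, 0.91, 0.30, 0.75]
print(confusion_matrix(actual, probs))  # prints [[2, 1], [1, 2]]
```

Lowering the cutoff would trade specificity for sensitivity, which is one way to tune either model if missed high-value tracts are the costlier error.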
Logistic Regression – Lift Charts
Logistic Regression – Decile Charts
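A decile chart is usually read through its first bar: the lift in the top 10% of cases ranked by predicted probability. The sketch below computes that value as the success rate in the top decile divided by the overall success rate. This is the common textbook definition and is assumed here; the slides do not spell out the formula, and the data below is hypothetical.

```python
def first_decile_lift(actual, probs) -> float:
    """Success rate among the top 10% of cases by score, relative to the overall rate."""
    ranked = sorted(zip(probs, actual), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, len(ranked) // 10)           # size of the first decile
    top_rate = sum(a for _, a in ranked[:n_top]) / n_top
    overall_rate = sum(actual) / len(actual)
    return top_rate / overall_rate


# Hypothetical example: 20 cases, 4 successes, scores already in descending order.
actual = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
probs = [1.0 - 0.05 * i for i in range(20)]
print(first_decile_lift(actual, probs))  # top 2 cases are both successes: 1.0 / 0.2
```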
Comparison & Conclusion

Algorithm              Overall Error Rate (%)   First-Left Decile   Transparency
Classification Tree    7.43                     5.2                 High
Logistic Regression    6.44                     5.2                 Medium
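The overall error rates in the comparison match the validation partitions of the two models. As a quick check, the sketch below recomputes them from the validation confusion matrices reported earlier; the helper name `overall_error_rate` is illustrative.

```python
# Validation confusion matrices from the slides: rows = actual, columns = predicted.
validation_matrices = {
    "Classification Tree": [[155, 10], [5, 32]],
    "Logistic Regression": [[155, 10], [3, 34]],
}


def overall_error_rate(matrix) -> float:
    """Percentage of misclassified cases (off-diagonal counts over the total)."""
    total = sum(sum(row) for row in matrix)
    errors = matrix[0][1] + matrix[1][0]
    return 100.0 * errors / total


error_rates = {name: overall_error_rate(m) for name, m in validation_matrices.items()}
for name, rate in error_rates.items():
    print(f"{name}: {rate:.2f}%")
# Classification Tree: 7.43%
# Logistic Regression: 6.44%
```

The two models make the same 10 false-positive errors on class 0; the gap comes from false negatives (5 versus 3), a difference of two validation cases.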
Thank you
