
CSW 33035

BUSINESS INTELLIGENCE

ASSIGNMENT 2: CASE STUDY

NAME : MUHAMMAD SYAZWAN BIN ZURI

NO. MATRIC : 067636

PROGRAM : ISMSKKRK 3

LECTURER NAME: DR. MOKHAIRI


NO. CONTENTS

1. Introduction, regression model and other related topics

2. Print screens (from Weka, from Excel)

3. Interpreting the model and completing the table

4. Conclusion

5. References

1. INTRODUCTION

• REGRESSION MODEL
A regression model provides a function that describes the relationship between one or
more independent variables and a response, dependent, or target variable.

Types of Regression:

1. Linear

A linear regression model describes the relationship between the input and the
output as a straight line. It is the simplest type to understand and the easiest to
observe in real-world data. Even when a relationship is not exactly linear, people
tend to recognise the pattern and approximate it with a rough linear model.

2. Multiple

Multiple regression means that more than one input variable may affect the target
variable. In an email marketing campaign, for example, the number of emails sent in
the previous month could be combined with an additional variable to predict the
response.

3. Non-Linear

A linear model is appropriate for some parts of the email marketing example above.
However, as more emails are sent during a campaign, the share of recipients who
respond falls relative to the total number of emails sent. A non-linear regression
model is needed to represent this.

4. Stepwise Regression Modeling

Unlike the previous items, which are specific kinds of models, stepwise regression is
a method for building a model. If there are several potential inputs, the analyst may
start with the input variable that is most strongly associated with the target. The
next step is to add or remove variables one at a time to improve the model's
accuracy.
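
The linear case above can be illustrated with a minimal sketch in Python. It fits a straight line by ordinary least squares to the house size and selling price columns of the table in Section 3. The function name fit_line is an assumption made for this example, and the result is not expected to match the Weka output, which uses all attributes.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# House size (square feet) and selling price (RM) from the table in Section 3.
house_size = [5000, 5500, 4000, 3000, 2000, 4000, 3000]
selling_price = [550000, 530000, 300000, 140000, 130000, 135000, 140000]

slope, intercept = fit_line(house_size, selling_price)
print(f"price ~ {slope:.2f} * size + {intercept:.2f}")
```

A multiple regression extends the same idea by giving every input variable its own coefficient.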

• DATASET
A data set is a collection of figures or values related to a specific topic. For
instance, a data set may contain each student's test results in a certain class, or the
quantity of fish that each dolphin in an aquarium consumes.

• CROSS VALIDATION
Cross-validation is a statistical technique used to assess the effectiveness (or
accuracy) of machine learning models. It serves as a safeguard against overfitting in
prediction models, especially when the available data is scarce. In cross-validation,
the data is divided into a predetermined number of folds (or divisions). Each fold is
held out in turn as the test set while the model is trained on the remaining folds,
and the error estimates are then averaged.
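
As a rough sketch of how the folds are formed, the Python code below splits 7 rows (the size of the data set used later in this report) into 3 folds. The function name k_fold_indices is an assumption for this example. The sketch does not shuffle the rows, so the folds will not necessarily match the ones Weka builds internally.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_rows))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# 7 rows and 3 folds, as in the case study.
for train, test in k_fold_indices(7, 3):
    print("train:", train, "test:", test)
```

Each row appears in exactly one test fold, so every row is used once for testing and in the other rounds for training.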

2. PRINT SCREENS (FROM WEKA, FROM EXCEL)

Print screen from Weka for Selling Price.

Print screen from Weka for House Size.

Print screen from Excel for Selling Price.

Print screen from Excel for House Size.

3. INTERPRETING THE MODEL AND COMPLETING THE TABLE

House size      Lot size         Bedrooms   Bathrooms   Bungalow   Selling price
(square feet)   (square meter)                                     (RM)
5000            1050             4          4           1          550000
5500            950              4          3           1          530000
4000            800              5          4           1          300000
3000            500              3          2           0          140000
2000            200              2          2           0          130000
4000            400              2          2           1          135000
3000            345              4          4           1          140000
Figure 1

Figure 2

Enter the data in Excel, then save the file as a CSV file, as shown in Figure 3.

Figure 3
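
The CSV file can also be produced without Excel. The Python sketch below builds the same content as text using the standard csv module. The column names in HEADER are assumptions for this example; the report's own file may use different names.

```python
import csv
import io

# Column names are assumptions for this sketch.
HEADER = ["HouseSize", "LotSize", "Bedrooms", "Bathrooms", "Bungalow", "SellingPrice"]

# Rows copied from the table above.
ROWS = [
    [5000, 1050, 4, 4, 1, 550000],
    [5500, 950, 4, 3, 1, 530000],
    [4000, 800, 5, 4, 1, 300000],
    [3000, 500, 3, 2, 0, 140000],
    [2000, 200, 2, 2, 0, 130000],
    [4000, 400, 2, 2, 1, 135000],
    [3000, 345, 4, 4, 1, 140000],
]


def to_csv_text(header, rows):
    """Return the header and rows as comma-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv_text(HEADER, ROWS)
print(csv_text)
```

Writing csv_text to a file with a .csv extension gives a file that can be opened in Weka in the same way as the one saved from Excel.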

Figure 4

Open Weka, choose Open file, select CSV as the file type, and then carry out the next
step as shown in Figure 5.

Figure 5

• In the case study, the first value to be predicted is the selling price.

House size      Lot size         Bedrooms   Bathrooms   Bungalow   Selling price
(square feet)   (square meter)                                     (RM)
3000            950              3          4           1          ???
2000            500              3          3           1          ???

Choose linear regression with 3-fold cross-validation. Figure 6 shows the resulting
formula for the selling price.

Figure 6

Create a new sheet in Excel with the case study data to be completed, as shown in
Figure 7.

Figure 7

As shown in Figure 8, copy the formula from Weka and paste it into Excel, starting
with the ‘=’ symbol, and replace each attribute name in the formula with the
corresponding Excel cell reference (column and row).

Figure 8
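
The step of replacing attribute names with cell values amounts to evaluating a linear formula: the intercept plus each coefficient multiplied by the matching attribute value. The Python sketch below shows this for the two incomplete rows. The coefficients and intercept are placeholders chosen only so that the code runs; they are not the values from the Weka output and must be replaced with the ones shown in Figure 6.

```python
# Placeholder values only: replace them with the coefficients and the
# intercept that Weka reports for the selling price model.
COEFFICIENTS = {
    "house_size": 1.0,
    "lot_size": 1.0,
    "bedrooms": 1.0,
    "bathrooms": 1.0,
    "bungalow": 1.0,
}
INTERCEPT = 0.0


def predict(row, coefficients, intercept):
    """Evaluate intercept + sum(coefficient * attribute value) for one row."""
    return intercept + sum(coefficients[name] * row[name] for name in coefficients)


# The two rows with unknown selling price from the case study table.
new_houses = [
    {"house_size": 3000, "lot_size": 950, "bedrooms": 3, "bathrooms": 4, "bungalow": 1},
    {"house_size": 2000, "lot_size": 500, "bedrooms": 3, "bathrooms": 3, "bungalow": 1},
]

for house in new_houses:
    # With the placeholder coefficients these numbers are not real predictions.
    print(predict(house, COEFFICIENTS, INTERCEPT))
```

If Weka leaves an attribute out of its formula, the same function still works with that attribute removed from the coefficient dictionary.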

Figure 9

Figure 9 shows the predicted selling prices.

• The next value to be predicted is the house size.

House size      Lot size         Bedrooms   Bathrooms   Bungalow   Selling price
(square feet)   (square meter)                                     (RM)
???             450              3          2           0          200000
???             550              4          4           1          340000

For the next case, create the new data in Excel as shown in Figure 10.

Figure 10

In Weka, change the target attribute from selling price to house size and click Start
to display the formula for house size.

Figure 11
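
Changing the target means that house size becomes the output and the remaining columns, including selling price, become the inputs. A minimal Python sketch of that rearrangement is given below. The column names and the function name split_features_target are assumptions for this example.

```python
# Column names are assumptions for this sketch.
COLUMNS = ["house_size", "lot_size", "bedrooms", "bathrooms", "bungalow", "selling_price"]

# Rows copied from the training table in this section.
ROWS = [
    [5000, 1050, 4, 4, 1, 550000],
    [5500, 950, 4, 3, 1, 530000],
    [4000, 800, 5, 4, 1, 300000],
    [3000, 500, 3, 2, 0, 140000],
    [2000, 200, 2, 2, 0, 130000],
    [4000, 400, 2, 2, 1, 135000],
    [3000, 345, 4, 4, 1, 140000],
]


def split_features_target(rows, columns, target):
    """Separate the chosen target column from the remaining input columns."""
    t = columns.index(target)
    feature_names = [name for name in columns if name != target]
    features = [[value for i, value in enumerate(row) if i != t] for row in rows]
    targets = [row[t] for row in rows]
    return feature_names, features, targets


names, inputs, outputs = split_features_target(ROWS, COLUMNS, "house_size")
print(names)       # input columns, now including selling_price
print(inputs[0])   # inputs of the first house
print(outputs[0])  # house size of the first house
```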

Copy the formula and paste it, starting with the ‘=’ symbol, into a new Excel column,
as shown in Figure 12.

Figure 12

Figure 13

Figure 13 shows the predicted house sizes.

4. CONCLUSION

In this case study, the missing values in the incomplete tables were predicted using
the formulas produced by Weka. The linear regression model built in Weka gives
predictions that are close to the actual values. For example, for the house with the
following data:

House size = 5000

Lot size = 1050

Bedrooms = 4

Bathrooms = 4

Bungalow = 1

Selling price (RM) = 550000

the predicted selling price was RM 512649.004, which is close to the actual price of
RM 550000.
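
How close the prediction is can be checked with a short calculation. The Python sketch below uses only the two figures quoted above; the variable names are chosen for this example.

```python
actual_price = 550000         # RM, from the original table
predicted_price = 512649.004  # RM, value obtained from the Weka formula

absolute_error = actual_price - predicted_price
relative_error = absolute_error / actual_price

print(f"absolute error: RM {absolute_error:.3f}")
print(f"relative error: {relative_error:.2%}")
```

The difference is about RM 37351, or roughly 6.8% of the actual price.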

5. REFERENCES

• https://www.mygreatlearning.com/blog/cross-validation/
• Linear Regression & Prediction | WEKA Way! Retrieved from
https://medium.com/@rahulvaish/linear-regression-prediction-weka-way-3fdc1643e1b6
• What is linear regression? Retrieved from
regression#:~:text=Resources-,What%20is%20linear%20regression%3F,is%20called%20the%20independent%20variable.
