
ISYS3374 Business Analytics – Third Assessment

Note: You need to submit your answers in a Word document, transferring the results from the Excel file
into it. In addition, you must submit your Excel file (we prefer a single Excel file with one or
more worksheets for each question), but note that only the Word document will be marked. If you think
any question is unclear or has an issue, please state your assumptions (if any) and clearly explain them in
your report.

You need to add the coversheet and sign it. Please write the names of your tutor and your lecturer
on the coversheet.
The analysis and the answers must be your own individual work, without consulting any other person. Also,
you are not allowed to help or advise other students.

SECTION A: DISCUSSION QUESTIONS

1- Explain the confusion matrix in classification methods and provide an example of how you
interpret its numbers. (3 marks)
2- Give two practical examples of applications of classification methods in your discipline. Provide
detailed explanations. You need to explain why you think classification can be used in those cases;
you do not need to provide data or solve them. (2 marks)
3- Give two examples related to your discipline in which you need to apply oversampling when
partitioning the data before building the model. You need to provide detailed explanations. (5 marks)
4- Assume one of the explanatory variables (named X1) in your logistic regression is a categorical
variable with the following levels: low, average and high, and another explanatory variable
(named X2) is also categorical with the following levels: Sydney, Melbourne and Brisbane.
Explain how you will use them in developing your logistic regression model. How many
coefficients will you have in your final model? (6 marks)
(3+2+5+6 = 16 marks)

SECTION B: QUANTITATIVE QUESTIONS


5- The first worksheet of the Excel file (provided for this assessment) contains 500 records of clients
who have purchased special products from an e-Business website. Each record includes data
on the types of product purchased (between 1 and 5), purchase amount ($), age, gender, family size of
the customer, whether the client has a membership and whether the customer has a discount card.
a) Explain the steps to develop a KNN model to predict which customers will spend
more than $1000. (Write your answer as: Step 1-… Step 2- … and so on. You don’t need
to run XLMiner and report the results.) (8 marks)
b) Develop a predictive model to predict the spend amount of a new female customer
aged 28 who lives in a family of size 3, is not a member and holds a discount
card of type 3. (6 marks)
(8+6 = 14 marks)

6- A company provides maintenance service for washing machines in Victoria. The collected data
are presented in the Excel file (second worksheet).

a) Assume the manager has asked you to analyse the data and provide some insights and
recommendations. The report should not exceed 2 pages. (8 marks)
b) Build a model to predict the repair time for a future service booking that needs to be done
by John and is an Electrical repair. Would you suggest assigning this service to the
morning shift or the afternoon shift? (6 marks)
c) What other data would you recommend the manager add to this dataset in future for
better analysis, and what kind of analysis do you think would be useful based on them? (2
marks)
(8+6+2 = 16 marks)


7- In worksheet 3, a dataset from a blood bank is presented. The data were recorded for apheresis blood
donations made by a group of donors over a period of time. The donor ID is unique for each donor.
A donor might have donated more than once in this period. At each donation, the blood total
protein level of the donor was recorded. Use the dataset to answer the following questions:
a) There are some missing values for blood type. Think about how you can fill in the missing
values. Explain your approach (step by step), then apply it to fill in as many of
the missing values as possible. (Save the results in an Excel worksheet and
name it Question 3 Part a.) (4 marks)
b) Calculate the average of total protein for each blood type. Explain your approach (step
by step). Report the results in a worksheet and name it Question 3 Part b. (2 marks)
c) Calculate the range of total protein for each blood type. Explain your approach (step by
step). Report the results in a worksheet and name it Question 3 Part c. (5 marks)
d) Is total protein declining with age? (2 marks)
e) Present the two best visualisation tools for this data that you think provide useful information.
(4 marks)

(4+2+5+2+4= 17 marks)

The data presented in worksheet 4 are the results of a 4-year study conducted to assess how age, weight,
and gender influence the risk of diabetes. Risk is interpreted as the probability (times 100) that the patient
will have diabetes over the next 4-year period.
a) What predictive model do you suggest to relate the risk of diabetes to the person’s age, weight
and gender? Why? (8 marks)
b) Develop an estimated multiple regression model that relates the risk of diabetes to the
person’s age, weight, gender and lifestyle. Present the regression formula as a
mathematical equation. Interpret the coefficients of the regression and comment on the
strength of the regression. (4 marks)
c) What is the risk percentage of diabetes over the next 4 years for a 59-year-old man who lives
in a small town and weighs 72 kg? (4 marks)
(8+4+4= 16 marks)

8- Matthew has a new job as a business analyst. He plans to invest 10 percent of his annual
after-tax salary into a retirement account at the end of every year for the next 30 years. Suppose that
the annual return is 5% and his current salary before tax is 80k, which grows 3% per year. Tax
applies at 15% on salary up to 50k, 20% on salary between 50k and 80k,
and 25% on any salary above 80k (for example, if his salary
is 105k, he pays 15% tax on his first 50k, 20% on the next 30k and 25% on the remaining
25k of his salary). Then:
a) Create a spreadsheet which shows Matthew the balance of the retirement account for various
levels of annual investments and returns. (3 marks)
b) If Matthew aims to have $1,500,000 at the end of the 30th year, what percentage of his
salary should he invest annually? (3 marks)
(3+3 = 6 marks)
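
As an illustration of the marginal tax rule described in Question 8, the following Python sketch computes the tax on a given salary and checks it against the 105k example in the question. The names BRACKETS, income_tax and after_tax_salary are hypothetical and chosen only for this sketch; the assessment itself asks for a spreadsheet, so treat this as a cross-check under the stated brackets rather than a required method.

```python
# Marginal tax brackets from Question 8: (upper limit of bracket, rate in percent).
# The last bracket has no upper limit.
BRACKETS = [(50_000, 15), (80_000, 20), (float("inf"), 25)]


def income_tax(salary):
    """Return the tax owed on `salary` under the marginal brackets above."""
    tax = 0.0
    lower = 0
    for upper, rate_percent in BRACKETS:
        if salary > lower:
            # Only the slice of salary that falls inside this bracket is taxed at its rate.
            taxable = min(salary, upper) - lower
            tax += taxable * rate_percent / 100
        lower = upper
    return tax


def after_tax_salary(salary):
    """Return the salary remaining after tax."""
    return salary - income_tax(salary)


# Example from the question: 15% of 50k + 20% of 30k + 25% of 25k.
print(income_tax(105_000))  # 19750.0
```

The same bracket-by-bracket logic can be reproduced in a spreadsheet with nested MIN/MAX expressions, one term per bracket.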

9- A company blends two materials, A and B, to produce two types of fertilizer. Fertilizer 1 must
be at least 45% A and sells for $65 per kilogram. Fertilizer 2 must be at least 70% B and
sells for $48 per kilogram. The price of material A is $10 per 100 kilograms and the price of
material B is $14 per 100 kilograms; if more than 10,000 kilograms are purchased, the price is
reduced by 10%. The company's total budget for raw materials is $1600.
a) Write the linear optimization model for the company to make the best decision.
b) Solve the model, present the results and interpret them.
c) Rewrite the model if the 10% discount applies only to the amount purchased over 10000 kilograms
(for example, if the company purchases 1001 kg of A, the total price is 1000*10+1*9).
(5+5+5 = 15 marks)

TOTAL MARKS= 100

