
GROUP ASSIGNMENT COVER SHEET

STUDENT DETAILS

Student name: Trần Nguyễn Bình Khang Student ID number: 31231020532

Student name: Võ Phương Nghi Student ID number: 31231020933

Student name: Nguyễn Trần Phước Tài Student ID number: 31231024094

Student name: Nguyễn Đình Viên Student ID number: 31231024778

Student name: Hoàng Ngọc Hải Yến Student ID number: 31231020764

UNIT AND TUTORIAL DETAILS

Unit name: Statistics for Business Unit number: SB-DH49-12


Tutorial/Lecture: Group Assignment Class day and time: Friday, 8:00AM - 11:15AM
Lecturer or Tutor name: Dr. Le Anh Tuan

ASSIGNMENT DETAILS

Title: Group Assignment 1


Length: Due date: 8/3/2024 Date submitted: 8/3/2024

DECLARATION
I hold a copy of this assignment if the original is lost or damaged.
I hereby certify that no part of this assignment or product has been copied from any other student’s work or
from any other source except where due acknowledgement is made in the assignment.
I hereby certify that no part of this assignment or product has been submitted by me in another
(previous or current) assessment, except where appropriately referenced, and with prior permission
from the Lecturer / Tutor / Unit Coordinator for this unit.
No part of the assignment/product has been written/produced for me by any other person except where
collaboration has been authorised by the Lecturer / Tutor / Unit Coordinator concerned.
I am aware that this work may be reproduced and submitted to plagiarism detection software programs for
the purpose of detecting possible plagiarism (which may retain a copy on its database for future
plagiarism checking).
Student’s signature: Trần Nguyễn Bình Khang
Student’s signature: Võ Phương Nghi
Student’s signature: Nguyễn Trần Phước Tài
Student’s signature: Nguyễn Đình Viên
Student’s signature: Hoàng Ngọc Hải Yến
Question 1: Prepare a histogram to describe the distribution of Employees using
Sturges’ Rule.

Step 1: Calculate the number of classes and interval width by using Sturges' Rule.
k: number of classes
w: interval width
n: sample size
We used Descriptive Statistics in Excel to obtain the sample size, maximum value
and minimum value, which are the inputs to Sturges' Rule.

Sturges' Rule:
$$k = 1 + 3.3 \times \log_{10}(n) = 1 + 3.3 \times \log_{10}(994) = 10.89 \approx 11$$
→ interval width:
$$w = \frac{\text{maximum} - \text{minimum}}{k} = \frac{3519 - 3}{11} = 319.64 \approx 320$$
Both the number of classes and the interval width are rounded up, giving 11 classes
with a width of 320.
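
The calculation in Step 1 can be checked with a short Python sketch. The function names are illustrative only and are not part of the Excel workflow; the inputs are the sample size, minimum and maximum reported above, and both results are rounded up as in the text.

```python
import math


def sturges_classes(n: int) -> int:
    """Number of classes k = 1 + 3.3 * log10(n), rounded up."""
    return math.ceil(1 + 3.3 * math.log10(n))


def interval_width(minimum: float, maximum: float, k: int) -> int:
    """Interval width w = (maximum - minimum) / k, rounded up."""
    return math.ceil((maximum - minimum) / k)


# Values taken from Step 1: n = 994, minimum = 3, maximum = 3519.
n, minimum, maximum = 994, 3, 3519
k = sturges_classes(n)
w = interval_width(minimum, maximum, k)
print(k, w)
```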

Step 2: Creating and customizing the histogram


1. Create a histogram from the data set by using the built-in histogram chart in
Excel.
2. Customize the number of bins and the bin width by right-clicking on the horizontal
axis, clicking on Format Axis, and then adjusting the number of bins and the bin
width to match the values from Step 1.
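
As a rough cross-check of the Excel histogram, the class frequencies can also be counted directly. The sketch below is only an illustration: the `employees` list is hypothetical sample data, not the assignment data set, and the classes are assumed to be closed on the left and open on the right, which may differ from how Excel places values that fall exactly on a bin boundary.

```python
def bin_counts(values, start, width, k):
    """Count values per class; class i covers [start + i*width, start + (i+1)*width)."""
    counts = [0] * k
    for v in values:
        i = int((v - start) // width)
        i = min(max(i, 0), k - 1)  # clamp so out-of-range values land in an end class
        counts[i] += 1
    return counts


# Hypothetical employee counts, for illustration only.
employees = [3, 45, 120, 310, 322, 640, 980, 1500, 3519]

# Start at the minimum (3) with k = 11 classes of width 320, as in Step 1.
counts = bin_counts(employees, start=3, width=320, k=11)
for i, c in enumerate(counts):
    lower = 3 + i * 320
    print(f"{lower:>5} to {lower + 320:>5}: {c}")
```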
Question 2
Step 1: Select every numerical variable in the data set.
Step 2: Use the Data Analysis tool in Excel to generate Descriptive Statistics for each
variable.
Step 3: Combine the statistics of all variables into a single, easy-to-read
table.
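
The same kind of summary table can be sketched in Python with the standard `statistics` module. The column names and values below are hypothetical placeholders, and the selection of measures is an assumption; Excel's Descriptive Statistics output contains additional items such as kurtosis and skewness.

```python
import statistics


def describe(values):
    """Return a few summary measures for one numerical variable."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
    }


# Hypothetical columns, for illustration only.
data = {"Assets": [10, 20, 30, 40, 50], "Sales": [2, 4, 4, 6, 9]}

# One row of statistics per variable, collected into a single table.
summary = {name: describe(column) for name, column in data.items()}
for name, stats in summary.items():
    print(name, stats)
```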
Question 3
Using an Excel PivotTable, we created a table showing average sales by industry and
year.
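
The grouping that the pivot table performs can be sketched as follows. The field names (`industry`, `year`, `sales`) and the rows are hypothetical; the sketch only shows the idea of averaging one value over every combination of two categories.

```python
from collections import defaultdict


def average_by(rows, row_key, col_key, value_key):
    """Average value_key for every (row_key, col_key) combination."""
    totals = defaultdict(lambda: [0.0, 0])  # (row, col) -> [running sum, count]
    for r in rows:
        cell = totals[(r[row_key], r[col_key])]
        cell[0] += r[value_key]
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical records, for illustration only.
rows = [
    {"industry": "Service", "year": 2020, "sales": 100.0},
    {"industry": "Service", "year": 2020, "sales": 300.0},
    {"industry": "Service", "year": 2021, "sales": 250.0},
    {"industry": "Manufacturing", "year": 2020, "sales": 400.0},
]

pivot = average_by(rows, "industry", "year", "sales")
for (industry, year), avg in sorted(pivot.items()):
    print(industry, year, avg)
```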

Question 4
Using Megastat, we created a table with Q1, Q2, Q3 and IQR. We then used these values to
calculate the limits using the interquartile rule.
$$\text{Upper limit} = Q_3 + 1.5 \times IQR = 284.75 + 1.5 \times 235.75 = 638.375$$
$$\text{Extreme upper limit} = Q_3 + 3 \times IQR = 284.75 + 3 \times 235.75 = 992$$
$$\text{Lower limit} = Q_1 - 1.5 \times IQR = 49 - 1.5 \times 235.75 = -304.625$$
$$\text{Extreme lower limit} = Q_1 - 3 \times IQR = 49 - 3 \times 235.75 = -658.25$$
Using Excel filters, we identified 0 low extremes, 0 low outliers, 34 high
outliers (values higher than the upper limit but not higher than the extreme
upper limit) and 41 high extremes (values higher than the extreme upper limit).
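
The limits above can be reproduced with a small Python sketch. The function and label names are illustrative, and the treatment of values that fall exactly on a limit (counted as inside it) is an assumption rather than something stated in the assignment.

```python
def iqr_limits(q1, q3):
    """Inner (1.5 x IQR) and outer (3 x IQR) limits from the quartiles."""
    iqr = q3 - q1
    return {
        "extreme_lower": q1 - 3 * iqr,
        "lower": q1 - 1.5 * iqr,
        "upper": q3 + 1.5 * iqr,
        "extreme_upper": q3 + 3 * iqr,
    }


def classify(value, limits):
    """Label one observation relative to the limits."""
    if value > limits["extreme_upper"]:
        return "high extreme"
    if value > limits["upper"]:
        return "high outlier"
    if value < limits["extreme_lower"]:
        return "low extreme"
    if value < limits["lower"]:
        return "low outlier"
    return "not an outlier"


# Quartiles reported above: Q1 = 49, Q3 = 284.75.
limits = iqr_limits(49, 284.75)
print(limits)
print(classify(700, limits), "|", classify(1000, limits), "|", classify(100, limits))
```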

Question 5
After excluding all the outliers identified in Question 4, we used Megastat to find Q1, Q2,
Q3 and IQR and to create a box-and-whisker plot of the employee observations for each
country.
Comparison:
Based on the 4 box-and-whisker plots, we can draw the following conclusions:
- Malaysia and the Philippines both have the smallest number of employees
recorded for a specific industry (3 employees).
- Vietnam has the largest number of employees recorded for a specific industry
(550 employees).
- Vietnam and Indonesia both have the widest range in the number of employees
recorded for industries (534 employees).
- Vietnam has the largest median number of employees (200) among the 4
countries.
- The IQRs of Indonesia and the Philippines (195.5 and 159.25 respectively) are
much higher than those of Malaysia and Vietnam (66 and 80 respectively), which
means the middle 50% of employee counts is much more spread out in Indonesia
and the Philippines than in Malaysia and Vietnam.
- The box-and-whisker plots show that the data for all 4 countries are
right-skewed.

Question 6

Relationship between Assets and Revenue: Positive and very low correlation
Relationship between Assets and Sales: Positive and very low correlation
Relationship between Assets and ROA: Negative and very low correlation
Relationship between Revenue and Sales: Positive and very low correlation
Relationship between Revenue and ROA: Positive and very low correlation
Relationship between Sales and ROA: Positive and low correlation
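
The direction and strength of each relationship are usually read from a correlation coefficient. Assuming the Pearson coefficient is the measure behind the descriptions above, it can be sketched in Python as follows; the data are hypothetical and chosen so that the two results are exactly +1 and -1.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values, for illustration only.
assets = [1, 2, 3, 4, 5]
sales = [2, 4, 6, 8, 10]   # rises with assets
roa = [5, 4, 3, 2, 1]      # falls as assets rise

r_assets_sales = pearson_r(assets, sales)
r_assets_roa = pearson_r(assets, roa)
print(r_assets_sales, r_assets_roa)
```

A positive coefficient corresponds to a positive relationship and a negative one to a negative relationship; values close to 0 indicate a weak linear relationship.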
Question 7

a.
The event of selecting a service firm is 𝐴.
The probability of randomly selecting a service firm is: $P(A) = 30.58\%$
b.
The event of selecting a service firm located in Indonesia is $B$.
The probability of randomly selecting a service firm located in Indonesia is:
$P(B) = 33.60\%$

c.
The event of selecting a medium-sized firm is $C_1$.
The event of selecting a firm that has parking complaints is $C_2$.
The probability of randomly selecting a medium-sized firm that has parking complaints
is: $P(C_1 \cap C_2) = 2.01\% + 0.40\% + 0.60\% + 1.71\% \approx 4.73\%$

d.
The event of selecting a non-small firm is $D_1$.
The probability of randomly selecting a non-small firm is:
$P(D_1) = 74.648\%$
The event of selecting a firm with complaints about location is $D_2$.
The probability of randomly selecting a firm with complaints about location is:
$P(D_2) = 15.091\%$
The event of selecting a firm with complaints about services is $D_3$.
The probability of randomly selecting a firm with complaints about services is:
$P(D_3) = 50.00\%$
The probability of selecting a non-small firm that has complaints about
location or services is: $P(D_1 \cap (D_2 \cup D_3)) = 49.296\%$
Assuming each firm is recorded under a single complaint category, $D_2$ and $D_3$ are
mutually exclusive, so $P(D_2 \cup D_3) = P(D_2) + P(D_3)$.
The probability of randomly selecting a non-small firm or a firm with complaints about
location or services is:
$P(D_1 \cup D_2 \cup D_3) = P(D_1) + P(D_2) + P(D_3) - P(D_1 \cap (D_2 \cup D_3))$
$= 74.648\% + 15.091\% + 50.00\% - 49.296\% = 90.443\%$
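
The arithmetic in part d can be checked with the sketch below. It rests on two assumptions that are not stated explicitly in the answer: that location and services complaints are mutually exclusive categories, and that the 49.296% figure is the probability of a non-small firm having either of those two complaint types. The function name is illustrative.

```python
def union_probability(p_d1, p_d2, p_d3, p_d1_and_complaint):
    """P(D1 or D2 or D3) in percent, with D2 and D3 assumed mutually exclusive.

    p_d1_and_complaint is P(D1 and (D2 or D3)).
    """
    p_complaint = p_d2 + p_d3  # P(D2 or D3) under mutual exclusivity
    return p_d1 + p_complaint - p_d1_and_complaint


# Percentages reported in part d.
result = union_probability(74.648, 15.091, 50.00, 49.296)
print(round(result, 3))
```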

Question 8
a. Prepare a decision tree

b. The probability that a firm chosen at random is a good firm is:


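$$
\begin{aligned}
P(\text{Good}) &= P(\text{Vietnam}) \times P(\text{Good} \mid \text{Vietnam}) + P(\text{Malaysia}) \times P(\text{Good} \mid \text{Malaysia}) \\
&\quad + P(\text{Philippines}) \times P(\text{Good} \mid \text{Philippines}) + P(\text{Indonesia}) \times P(\text{Good} \mid \text{Indonesia}) \\
&= 0.25 \times (0.48 + 0.6 + 0.35 + 0.5) \\
&= 0.4825 = 48.25\%
\end{aligned}
$$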
c. If a firm is rated as “Bad”, the probability that it operates in Vietnam is:
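$$P(\text{Vietnam} \mid \text{Bad}) = \frac{P(\text{Vietnam}) \times P(\text{Bad} \mid \text{Vietnam})}{P(\text{Bad})} = \frac{0.25 \times 0.52}{1 - 0.4825} \approx 0.251 = 25.1\%$$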
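
Parts b and c can be verified together with a short sketch of the total probability rule and Bayes' rule. Each country is given probability 0.25, as in the calculation above. Only Vietnam's 0.48 is confirmed by the 0.52 used in part c; the assignment of 0.6, 0.35 and 0.5 to Malaysia, the Philippines and Indonesia is assumed from the order of the terms in the formula.

```python
# Prior probability of each country (0.25 each, as used in part b).
p_country = {"Vietnam": 0.25, "Malaysia": 0.25, "Philippines": 0.25, "Indonesia": 0.25}

# P(Good | country); the mapping of values to countries is an assumption.
p_good_given = {"Vietnam": 0.48, "Malaysia": 0.6, "Philippines": 0.35, "Indonesia": 0.5}

# Total probability rule: P(Good) = sum of P(country) * P(Good | country).
p_good = sum(p_country[c] * p_good_given[c] for c in p_country)

# Complement: P(Bad) = 1 - P(Good).
p_bad = 1 - p_good

# Bayes' rule: P(Vietnam | Bad) = P(Vietnam) * P(Bad | Vietnam) / P(Bad).
p_vietnam_given_bad = p_country["Vietnam"] * (1 - p_good_given["Vietnam"]) / p_bad

print(round(p_good, 4), round(p_bad, 4), round(p_vietnam_given_bad, 3))
```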
