STUDENT DETAILS
ASSIGNMENT DETAILS
DECLARATION
I hold a copy of this assignment if the original is lost or damaged.
I hereby certify that no part of this assignment or product has been copied from any other student’s work or
from any other source except where due acknowledgement is made in the assignment.
I hereby certify that no part of this assignment or product has been submitted by me in another
(previous or current) assessment, except where appropriately referenced, and with prior permission
from the Lecturer / Tutor / Unit Coordinator for this unit.
No part of the assignment/product has been written/ produced for me by any other person except where
collaboration has been authorised by the Lecturer / Tutor /Unit Coordinator concerned.
I am aware that this work may be reproduced and submitted to plagiarism detection software programs for
the purpose of detecting possible plagiarism (which may retain a copy on its database for future
plagiarism checking).
Student’s signature: Trần Nguyễn Bình Khang
Student’s signature: Võ Phương Nghi
Student’s signature: Nguyễn Trần Phước Tài
Student’s signature: Nguyễn Đình Viên
Student’s signature: Hoàng Ngọc Hải Yến
Question 1: Prepare a histogram to describe the distribution of Employees using
Sturges’ Rule.
Step 1: Calculate the number of classes and interval width by using Sturges' Rule.
k: Number of classes
w: interval width
n: sample size
We use Descriptive Statistics in Excel to obtain the sample size, the maximum value
and the minimum value, which are the inputs for Sturges' Rule.
Sturges' Rule:
k = 1 + 3.3 × log₁₀(n) = 1 + 3.3 × log₁₀(994) = 10.89 ≈ 11
→ interval width: w = (maximum − minimum) / k = (3519 − 3) / 11 = 319.64 ≈ 320
Round up the number of classes and interval width to get the desired result.
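The calculation in Step 1 can be checked with a short script. This is only a sketch: the function names are ours, the logarithm is taken as base 10, and both results are rounded up as described above.

```python
import math


def sturges_classes(n: int) -> int:
    """Number of classes from k = 1 + 3.3 * log10(n), rounded up."""
    return math.ceil(1 + 3.3 * math.log10(n))


def interval_width(minimum: float, maximum: float, k: int) -> int:
    """Class width (maximum - minimum) / k, rounded up to a whole number."""
    return math.ceil((maximum - minimum) / k)


# Values reported in Step 1: n = 994, minimum = 3, maximum = 3519.
n, minimum, maximum = 994, 3, 3519
k = sturges_classes(n)
w = interval_width(minimum, maximum, k)
print(k, w)  # 11 320
```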
Question 4
Using Megastat, we created a table with Q1, Q2, Q3 and IQR. We then used these values to
calculate the limits using the interquartile rule.
Upper limit = Q3 + 1.5 × IQR = 284.75 + 1.5 × 235.75 = 638.375
Extreme upper limit = Q3 + 3 × IQR = 284.75 + 3 × 235.75 = 992
Lower limit = Q1 − 1.5 × IQR = 49 − 1.5 × 235.75 = −304.625
Extreme lower limit = Q1 − 3 × IQR = 49 − 3 × 235.75 = −658.25
Using Excel filters, we identified 0 low extremes, 0 low outliers, 34 high outliers
(values higher than the upper limit but lower than the extreme upper limit) and
41 high extremes (values higher than the extreme upper limit).
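The limits and the outlier categories above can be reproduced with a small script. It is a sketch only: the function names and category labels are ours, and the three values passed to classify at the end are hypothetical examples, not observations from the dataset.

```python
def iqr_limits(q1: float, q3: float) -> dict:
    """Limits from the interquartile rule (1.5 x IQR and 3 x IQR)."""
    iqr = q3 - q1
    return {
        "extreme_lower": q1 - 3 * iqr,
        "lower": q1 - 1.5 * iqr,
        "upper": q3 + 1.5 * iqr,
        "extreme_upper": q3 + 3 * iqr,
    }


def classify(value: float, limits: dict) -> str:
    """Label a value using the four limits."""
    if value < limits["extreme_lower"]:
        return "low extreme"
    if value < limits["lower"]:
        return "low outlier"
    if value > limits["extreme_upper"]:
        return "high extreme"
    if value > limits["upper"]:
        return "high outlier"
    return "not an outlier"


# Q1 and Q3 as reported by Megastat in this question.
limits = iqr_limits(q1=49, q3=284.75)
print(limits)

# Hypothetical values, used only to illustrate the categories.
for value in (100, 700, 1500):
    print(value, classify(value, limits))
```

In Excel the same split is obtained by filtering the column against each limit in turn.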
Question 5
Excluding all the outliers based on Question 4, we used Megastat to identify Q1, Q2,
Q3, IQR and create a box-and-whiskers plot for the employee observations of each
country.
Comparison:
Based on the 4 box-and-whisker plots, we can draw some conclusions:
- Malaysia and the Philippines share the smallest number of employees
recorded for a specific industry (3 employees).
- Vietnam has the largest number of employees recorded for a specific industry
(550 employees).
- Vietnam and Indonesia share the widest spread in the number of employees
recorded for industries, each with a range of 534 employees.
- Vietnam has the largest median (200 employees) among the 4 countries.
- The IQRs of Indonesia and the Philippines (195.5 and 159.25 respectively) are
much higher than those of Malaysia and Vietnam (66 and 80 respectively), which
means the middle 50% of employee counts in Indonesia and the Philippines is much
more spread out than in Malaysia and Vietnam.
- The box-and-whisker plots show that the data for all 4 countries are
right-skewed.
Question 6
Relationship between Assets and Revenue: Positive and very low correlation
Relationship between Assets and Sales: Positive and very low correlation
Relationship between Assets and ROA: Negative and very low correlation
Relationship between Revenue and Sales: Positive and very low correlation
Relationship between Revenue and ROA: Positive and very low correlation
Relationship between Sales and ROA: Positive and low correlation
Question 7
a.
The event of selecting a service firm is A.
The probability of randomly selecting a service firm is: P(A) = 30.58%
b.
The event of selecting a service firm located in Indonesia is B.
The probability of randomly selecting a service firm located in Indonesia is:
P(B) = 33.60%
c.
The event of selecting a medium-sized firm is C1.
The event of selecting a firm that has parking complaints is C2.
The probability of randomly selecting a medium-sized firm that has parking complaints
is: P(C1 ∩ C2) = 2.01% + 0.40% + 0.60% + 1.71% = 4.73%
d.
The event of selecting a non-small firm is D1.
The probability of randomly selecting a non-small firm is:
P(D1) = 74.648%
The event of selecting a firm with complaints about location is D2.
The probability of randomly selecting a firm with complaints about location is:
P(D2) = 15.091%
The event of selecting a firm with complaints about services is D3.
The probability of randomly selecting a firm with complaints about services is:
P(D3) = 50.00%
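We treat "complaints about location or services" as the combined event D2 ∪ D3. This
assumes each firm is recorded with a single complaint type, so D2 and D3 are mutually
exclusive and P(D2 ∪ D3) = P(D2) + P(D3).
The probability of selecting a firm that is a non-small firm and has complaints about
location or services is: P(D1 ∩ (D2 ∪ D3)) = 49.296%
The probability of randomly selecting a non-small firm or a firm with complaints about
location or services is:
P(D1 ∪ D2 ∪ D3) = P(D1) + P(D2) + P(D3) − P(D1 ∩ (D2 ∪ D3))
= 74.648% + 15.091% + 50.00% − 49.296% = 90.443%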
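The arithmetic in part d can be checked with the addition rule for two events, taking "complaints about location or services" as one combined event. The sketch below rests on two assumptions of ours: each firm has a single complaint type (so the location and services percentages can simply be added), and the 49.296% figure is the share of firms that are non-small and fall in that combined complaint group. The function and variable names are illustrative.

```python
def union_of_two(p_a: float, p_b: float, p_a_and_b: float) -> float:
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


# Percentages reported in part d.
p_non_small = 74.648
p_location = 15.091
p_services = 50.00
p_overlap = 49.296  # assumed: non-small AND (location or services)

# Assumed mutually exclusive complaint types, so the two simply add.
p_location_or_services = p_location + p_services

p_union = union_of_two(p_non_small, p_location_or_services, p_overlap)
print(f"{p_union:.3f}%")  # 90.443%
```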
Question 8
a. Prepare a decision tree