
MID SEMESTER EXAM 2020/2021

POSTGRADUATE PROGRAM ON GEOMATICS ENGINEERING


FACULTY OF CIVIL, PLANNING, and GEO ENGINEERING, ITS
Course / Code : Applied Statistics and Probability/ RM185101
Type : TAKE HOME TEST
Submission date : Friday/ 20 November 2020
Lecturer : Ira Mutiara Anjasmara, ST, M.Phil, Ph.D
Name : Muhammad Nahdi Febriansyah
NRP : 6016201006

Answer the following questions.


(10) 1. Explain the purpose of using the standard normal distribution table, the Student (t) distribution
table, and the chi-square (χ²) distribution table.
Answer:
• The standard normal distribution table gives the probability (area under the curve)
corresponding to a z-score. A normal distribution is defined by two parameters: the
mean (µ) and the standard deviation (σ). The table is used for large samples (n > 30).
Probabilities for every normal distribution are computed from the standard normal
distribution, but since most real data sets do not follow a standard normal
distribution, we must first transform the real normal distribution (with mean µ and
standard deviation σ) so that it has a mean of 0 and a standard deviation of 1.
• The Student (t) distribution table gives the t-score corresponding to a particular
area (α) in the upper tail. The value depends on the degrees of freedom (v) and on α.
The t distribution is used for small samples (n < 30) drawn from a population that
has a normal distribution.
• The chi-square (χ²) distribution table gives the value χ²(v, α) corresponding to the
area in the upper tail (α) for a specific number of degrees of freedom (v). The numbers
in the main body of the table are the χ² scores corresponding to those particular
values of v and α. The table is used when we wish to make inferences about the
population variance using the sample variance.
(10) 2. Explain the following terms:
– hypothesis testing
– type I error
– significance level in statistical test
Answer:
– hypothesis testing
Hypothesis testing is a process for making a statistical decision based on information
contained in a sample. A statistical hypothesis is an assumption, statement or question
concerning one or more populations, which may or may not be true. The truth of the
assumption can only be known for certain if we examine the whole population, which is
usually impractical. Thus, the aim of hypothesis testing is to decide, based on random
samples, whether the assumption should be rejected or not.
– type I error
A type I error is the error of rejecting H0 (the null hypothesis) when it is in fact true.

– significance level in statistical test


The level of significance (α) is the probability threshold used to decide whether a result is
statistically significant; it is the probability of making a type I error. A small value of α
means a small chance of wrongly rejecting a true H0, and thus a large chance (1 - α) of
correctly not rejecting it.
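
The link between α and the type I error can be illustrated with a small simulation. This is only a sketch under stated assumptions: the samples are drawn from a normal population for which H0 is true, the function name `type_one_error_rate` and all parameter values (sample size, number of trials, seed) are made up for illustration and are not part of the exam.

```python
import random
from statistics import NormalDist, fmean


def type_one_error_rate(alpha, n=30, trials=4000, mu=0.0, sigma=1.0, seed=1):
    """Estimate how often a two-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        # H0 is true here: the sample really comes from N(mu, sigma)
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (fmean(sample) - mu) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # a type I error: true H0 rejected
    return rejections / trials


rate = type_one_error_rate(alpha=0.05)
print(f"observed type I error rate: {rate:.3f}")
```

The observed rejection rate should come out close to the chosen α, which is what "significance level" means in practice.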

(20) 3. The Geomatics Engineering Department has recently purchased an Electronic Distance
Measurement (EDM) instrument with a specified standard deviation of 5 mm.
A group of students is testing the new EDM by measuring a 250.00 m baseline. After 40
measurements it was found that the mean length of the baseline is 249.997 m.
At the 1% level of significance, does the test show that the EDM works according to the given
specification?
Answer:
We have:
µ = 250 m; x̄ = 249.997 m; n = 40; α = 0.01; σ = 5 mm = 0.005 m

Step 1
- Alternative hypothesis: Ha : µ≠250
- Null hypothesis: H0: µ = 250

Step 2
Determine number of tails:
This is a 2-tailed test, because the alternative hypothesis (µ ≠ 250) allows a difference in either direction.

Step 3
Determine level of significance:
α = 0.01, which means the confidence level is 99%.

Step 4
Determine the critical value of z:
We have a 2-tailed test, so we need to find zα/2 = z0.01/2 = z0.005. From the standard normal
distribution table, we have: z0.005 = z(0.5 – 0.005) = z(0.495) =2.58
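
The table lookup in Step 4 can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is an assumption made for illustration. Printed tables are rounded, so 2.58 in the table corresponds to about 2.576 here.

```python
from statistics import NormalDist


def z_critical(alpha, tails=2):
    """Upper critical value of the standard normal distribution.

    For a two-tailed test the area alpha is split equally between both tails.
    """
    return NormalDist().inv_cdf(1 - alpha / tails)


print(round(z_critical(0.01), 3))  # two-tailed, alpha = 0.01
print(round(z_critical(0.05), 3))  # two-tailed, alpha = 0.05
```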

Step 5
Determine the rejection region:
The null hypothesis will be rejected if the test statistic falls in either tail of the normal z curve.
Since the alternative hypothesis is µ ≠ 250, the tails lie on the left and right of the curve;
therefore the rejection region is z > +2.58 on the RHS or z < -2.58 on the LHS.

Step 6
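Determine the test statistic (z-score) of the sample data:
$$ z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{249.997 - 250}{0.005 / \sqrt{40}} \approx -3.79 $$
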
Step 7
Compare the test statistic against its critical value:
z = -3.79 < -2.58; therefore z, and hence x̄, the sample mean, lies in the rejection
region (LHS).
Hence, we reject H0 at the 0.01 significance level.
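
Steps 4 to 7 can be reproduced in a few lines. This is a minimal sketch assuming a two-tailed z-test with known σ, as set up in the steps above; the function name `two_tailed_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_test(x_bar, mu0, sigma, n, alpha):
    """Return (z, critical value, reject?) for H0: mu = mu0 vs Ha: mu != mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))         # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    return z, z_crit, abs(z) > z_crit


# Values from question 3, all lengths in metres (5 mm = 0.005 m)
z, z_crit, reject = two_tailed_z_test(249.997, 250.0, 0.005, 40, 0.01)
print(f"z = {z:.3f}, critical value = +/-{z_crit:.3f}, reject H0: {reject}")
```

Working in a single unit matters here: entering σ as 5 instead of 0.005 would shrink z by a factor of 1000 and reverse the decision.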

Step 8
Our sample measurement is incompatible with the supposed population mean at the 1% level
of significance. According to the calculation, the test shows that the EDM does not work
according to the given specification.

(20) 4. A laser scanner e-Axis THX-1138 is tested on a 200 m baseline. After 100 measurements, it is
found that the mean value of the baseline is 200.001 m with a standard deviation of 5 mm.
Answer the following questions based on the null hypothesis “there is no significant difference
between the already known baseline and the result of the new measurements”.
a. Find the 95% confidence limits from the result of the new measurements.
b. Based on the answer [a.], will the null hypothesis be rejected or not?
c. Find the 99% confidence limits from the result of the new measurements.
d. Based on the answer [c.], will the null hypothesis be rejected or not?
Answer:
a. We have:
µ = 200 m, σ = 5 mm = 0.005 m, x̄ = 200.001 m, n = 100, α = 0.05.
Step 1
- Alternative hypothesis: Ha : µ≠200
- Null hypothesis: H0: µ = 200

Step 2
Determine number of tails:
This is a 2-tailed test, because the alternative hypothesis (µ ≠ 200) allows a difference in either direction.

Step 3
Determine level of significance:
α = 0.05, which means the confidence level is 95%.

Step 4
Determine the critical value of z:
We have a 2-tailed test, so we need to find zα/2 = z0.05/2 = z0.025. From the standard normal
distribution table, we have: z0.025 = z(0.5 – 0.025) = z(0.475) = 1.96

Step 5
Determine the rejection region:
The null hypothesis will be rejected if the test statistic falls in either tail of the normal z curve.
Since the alternative hypothesis is µ ≠ 200, the tails lie on the left and right of the curve;
therefore the rejection region is z > +1.96 on the RHS or z < -1.96 on the LHS.

Step 6
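Determine the test statistic (z-score) of the sample data:
$$ z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{200.001 - 200}{0.005 / \sqrt{100}} = 2 $$
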
Step 7
Compare the test statistic against its critical value:
z = 2 > 1.96; therefore z, and hence x̄, the sample mean, lies in the rejection
region (RHS).
Hence, we reject H0 at the 0.05 significance level.

Step 8
Our sample measurement is incompatible with the supposed population mean at the 95%
confidence level (5% significance level).
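
Part a asks for the confidence limits themselves. A minimal sketch of that calculation follows, assuming limits of the form x̄ ± z(α/2)·σ/√n with the 5 mm standard deviation treated as known (n = 100 is a large sample). The function name `confidence_limits` is hypothetical, and the same call with 0.99 gives the limits asked for in part c.

```python
from math import sqrt
from statistics import NormalDist


def confidence_limits(x_bar, sigma, n, confidence):
    """Two-sided confidence limits for the mean, sigma treated as known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Values from question 4, all lengths in metres (5 mm = 0.005 m)
for level in (0.95, 0.99):
    lower, upper = confidence_limits(200.001, 0.005, 100, level)
    inside = lower <= 200.0 <= upper
    print(f"{level:.0%} limits: {lower:.5f} m to {upper:.5f} m, 200 m inside: {inside}")
```

If the known baseline length of 200 m falls outside the limits, H0 is rejected at the corresponding significance level; if it falls inside, H0 is not rejected.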

b. Based on the answer [a.], we reject the null hypothesis (H0) at the 5% significance level
(95% confidence level), because z, and hence x̄, the sample mean, lies in the rejection region.

c. We have:
µ = 200 m, σ = 5 mm = 0.005 m, x̄ = 200.001 m, n = 100, α = 0.01.
Step 1
- Alternative hypothesis: Ha : µ≠200

- Null hypothesis: H0: µ = 200

Step 2
Determine number of tails:
This is a 2-tailed test, because the alternative hypothesis (µ ≠ 200) allows a difference in either direction.

Step 3
Determine level of significance:
α = 0.01, which means the confidence level is 99%.

Step 4
Determine the critical value of z:
We have a 2-tailed test, so we need to find zα/2 = z0.01/2 = z0.005. From the standard normal
distribution table, we have: z0.005 = z(0.5 – 0.005) = z(0.495) = 2.58

Step 5
Determine the rejection region:
The null hypothesis will be rejected if the test statistic falls in either tail of the normal z curve.
Since the alternative hypothesis is µ ≠ 200, the tails lie on the left and right of the curve;
therefore the rejection region is z > +2.58 on the RHS or z < -2.58 on the LHS.

Step 6
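Determine the test statistic (z-score) of the sample data:
$$ z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{200.001 - 200}{0.005 / \sqrt{100}} = 2 $$
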
Step 7
Compare the test statistic against its critical value:
-2.58 < z = 2 < 2.58; therefore z, and hence x̄, the sample mean, does not lie in the
rejection region.
Hence, we do not reject H0 at the 0.01 significance level.

Step 8
Our sample measurement is compatible with the supposed population mean at the 99%
confidence level (1% significance level).

d. Based on the answer [c.], we do not reject the null hypothesis (H0) at the 1% significance level
(99% confidence level), because z, and hence x̄, the sample mean, does not lie in the rejection region.
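
The two decisions (reject at 5%, do not reject at 1%) are consistent with each other, which can be seen from the p-value of the test. The sketch below is an illustration added under the same assumptions as the z-test above (two-tailed, σ treated as known); the function name `two_tailed_p_value` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p_value(x_bar, mu0, sigma, n):
    """p-value of a two-tailed z-test of H0: mu = mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Values from question 4, all lengths in metres
p = two_tailed_p_value(200.001, 200.0, 0.005, 100)
print(f"p-value = {p:.4f}")
print("reject at 5%:", p < 0.05)
print("reject at 1%:", p < 0.01)
```

A p-value between 0.01 and 0.05 means the result is significant at the 5% level but not at the stricter 1% level.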

(20) 5. A study focused on the relationship between length of putt and percentage of putts made by
professional golfers gave the results shown in the table:
Length of Putt (metres)    Percentage of Putts made (%)
 2                          93.3
 3                          83.1
 4                          74.1
 5                          58.9
 6                          54.8
 7                          53.1
 8                          46.3
 9                          31.8
10                          33.5

a. Develop a scatter diagram for the data.


b. Use the method of least squares to develop an estimated regression line for the relationship.
c. Draw your estimated regression line on the same graph as the scatter diagram. Does it appear
to provide a good fit?
d. What does the scatter diagram indicate about the relationship between the two variables?
e. Compute SSE, SSR, SST.
f. Calculate r2. Comment on the goodness of fit.
Answer:

a. Develop a scatter diagram for the data.

b. Use the method of least squares to develop an estimated regression line for the relationship.
           X           Y    X^2         Y^2          XY
           2        93.3      4     8704.89      186.60
           3        83.1      9     6905.61      249.30
           4        74.1     16     5490.81      296.40
           5        58.9     25     3469.21      294.50
           6        54.8     36     3003.04      328.80
           7        53.1     49     2819.61      371.70
           8        46.3     64     2143.69      370.40
           9        31.8     81     1011.24      286.20
          10        33.5    100     1122.25      335.00
Total     54       528.9    384    34670.35     2718.90
Average    6    58.76667

• Covariance:

$$ S_{xy} = \frac{1}{9-1}\left[2718.9 - \frac{54 \times 528.9}{9}\right] = -56.812 $$

• Variance:

$$ S_x^2 = \frac{1}{9-1}\left[384 - \frac{54^2}{9}\right] = 7.5 $$

• Gradient:

b1 = Sxy / S2x = -56.812 / 7.5 = -7.575

• Y-intercept:

b0 = Ȳ - b1·X̄ = 58.76667 - (-7.575)(6) = 104.217

So, the equation of the regression line is: Ŷ = -7.575x + 104.217
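
The hand calculation can be checked with a short script. This is a sketch of the same covariance/variance route used above, not necessarily how the original spreadsheet was set up; the names `least_squares`, `putt_length` and `pct_made` are chosen for illustration.

```python
# Data from the table in question 5
putt_length = [2, 3, 4, 5, 6, 7, 8, 9, 10]                          # metres
pct_made = [93.3, 83.1, 74.1, 58.9, 54.8, 53.1, 46.3, 31.8, 33.5]   # percent


def least_squares(x, y):
    """Return (intercept b0, slope b1) of the least-squares line y = b0 + b1*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    s_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)  # sample covariance
    s_xx = (sum_xx - sum_x ** 2 / n) / (n - 1)     # sample variance of x
    b1 = s_xy / s_xx
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


b0, b1 = least_squares(putt_length, pct_made)
print(f"Y_hat = {b1:.3f}x + {b0:.3f}")
```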

c. Draw your estimated regression line on the same graph as the scatter diagram. Does it appear
to provide a good fit?

Yes, it appears to provide a good fit.

d. What does the scatter diagram indicate about the relationship between the two variables?
The scatter diagram indicates that the data points lie almost on a straight line
(linear data), so the relationship between Length of Putt and Percentage of Putts made
is strong and close to linear. The relationship is negative: the percentage of putts made
decreases as the length of the putt increases.

e. Compute SSE, SSR, SST.


I calculate SSE, SSR, SST using Excel, and here are the results.
          X         Y   X^2        Y^2        XY      Yp    Y-Yp   (Y-Yp)^2   Yp-Yrt   (Yp-Yrt)^2    Y-Yrt   (Y-Yrt)^2
          2      93.3     4    8704.89    186.60   89.07    4.23      17.92    30.30       918.09    34.53     1192.55
          3      83.1     9    6905.61    249.30   81.49    1.61       2.59    22.73       516.43    24.33      592.11
          4      74.1    16    5490.81    296.40   73.92    0.18       0.03    15.15       229.52    15.33      235.11
          5      58.9    25    3469.21    294.50   66.34   -7.44      55.38     7.58        57.38     0.13        0.02
          6      54.8    36    3003.04    328.80   58.77   -3.97      15.73     0.00         0.00    -3.97       15.73
          7      53.1    49    2819.61    371.70   51.19    1.91       3.64    -7.58        57.38    -5.67       32.11
          8      46.3    64    2143.69    370.40   43.62    2.68       7.20   -15.15       229.52   -12.47      155.42
          9      31.8    81    1011.24    286.20   36.04   -4.24      17.99   -22.73       516.43   -26.97      727.20
         10      33.5   100    1122.25    335.00   28.47    5.03      25.33   -30.30       918.09   -25.27      638.40
Total    54     528.9   384   34670.35   2718.90                   145.8225             3442.8375              3588.66
Average   6  58.76667
Notes: Yp = Ŷ (predicted value); Yrt = Ȳ (mean of Y)
• SSE

SSE = Σ(Y - Ŷ)² = 145.823

• SSR

SSR = Σ(Ŷ - Ȳ)² = 3442.838

• SST

SST = Σ(Y - Ȳ)² = 3588.660

f. Calculate r2. Comment on the goodness of fit.

$$ r^2 = \frac{SSR}{SST} = \frac{3442.838}{3588.660} = 0.959 $$

The value of r², the coefficient of determination, is used to evaluate the
goodness of fit of the regression relationship. Large values of the coefficient of
determination imply that the least squares line provides a good fit to the data. Here
r² = 0.959 is close to 1, which indicates a very good fit: about 95.9% of the variation
in the percentage of putts made is explained by the length of the putt.
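
The sums of squares and r² reported for parts e and f can be reproduced without a spreadsheet. The sketch below refits the line from the raw data so that it is self-contained; the variable names are chosen for illustration only.

```python
# Data from the table in question 5
x = [2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [93.3, 83.1, 74.1, 58.9, 54.8, 53.1, 46.3, 31.8, 33.5]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares slope and intercept
b1 = (sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
      / sum((a - x_mean) ** 2 for a in x))
b0 = y_mean - b1 * x_mean
y_hat = [b0 + b1 * a for a in x]                   # predicted values

sse = sum((b - p) ** 2 for b, p in zip(y, y_hat))  # error (residual) sum of squares
ssr = sum((p - y_mean) ** 2 for p in y_hat)        # regression sum of squares
sst = sum((b - y_mean) ** 2 for b in y)            # total sum of squares
r2 = ssr / sst

print(f"SSE = {sse:.3f}, SSR = {ssr:.3f}, SST = {sst:.3f}, r^2 = {r2:.3f}")
```

As a consistency check, SSE + SSR should equal SST up to rounding.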

(20) 6. The travel time of GPS signals depends on the amount of water in the atmosphere. A weather
balloon recorded the moisture content at different altitudes in the atmosphere:
altitude (km) % humidity
(x) (y)
1 2.6
1.5 4.5
2 7.8
2.5 12.2
3 20.5
3.5 30.2
4 50.1
4.5 78.2

A linear regression model fitted to this data gives the dependence of humidity on altitude as:
ŷ = 19.8x − 28.8 with r² = 0.845
a. Compute and plot the residuals for this model.
b. Do there appear to be any outliers in this data set? Discuss your answer.
Answer:
a. Compute and plot the residuals for this model.
X Y Yp Y-Yp
1 2.6 -9 11.6
1.5 4.5 0.9 3.6
2 7.8 10.8 -3
2.5 12.2 20.7 -8.5
3 20.5 30.6 -10.1
3.5 30.2 40.5 -10.3
4 50.1 50.4 -0.3
4.5 78.2 60.3 17.9
Notes: Yp = 𝑌̂
where, 𝜀 = Y - 𝑌̂
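
The residual column above can be recomputed directly from the given model ŷ = 19.8x − 28.8. This is a minimal sketch; the function name `residuals` and the rounding to one decimal place are choices made for illustration.

```python
# Data from the table in question 6
altitude = [1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5]                # km
humidity = [2.6, 4.5, 7.8, 12.2, 20.5, 30.2, 50.1, 78.2]   # percent


def residuals(x, y, slope=19.8, intercept=-28.8):
    """Residuals e = y - y_hat for the fitted line given in the question."""
    return [round(yi - (slope * xi + intercept), 1) for xi, yi in zip(x, y)]


res = residuals(altitude, humidity)
for xi, e in zip(altitude, res):
    print(f"x = {xi:>3}: residual = {e:+.1f}")
```

The residuals are positive at both ends of the altitude range and negative in the middle, a pattern that may suggest a curved rather than linear relationship.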

b. Do there appear to be any outliers in this data set? Discuss your answer.


           Value       Number of Y accepted    Number of Y rejected
1 sigma    26.41585    5                       3
2 sigma    52.83169    7                       1
3 sigma    79.24754    8                       0

Using the 1σ, 2σ and 3σ limits, we can determine which data are outliers: if a data
value exceeds the chosen limit, it can be categorised as an outlier. At the 3σ limit,
none of the eight values is rejected.
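
The figures in the table appear to be consistent with taking σ as the sample standard deviation of the humidity values and comparing each Y value directly against k·σ. That reading is an assumption, since the answer does not state how the limits were applied; the sketch below reproduces the counts under it.

```python
from statistics import stdev

# Humidity values (y) from question 6
humidity = [2.6, 4.5, 7.8, 12.2, 20.5, 30.2, 50.1, 78.2]

sigma = stdev(humidity)  # sample standard deviation (n - 1 in the denominator)

counts = []
for k in (1, 2, 3):
    limit = k * sigma
    # ASSUMPTION: a value is "accepted" when it does not exceed k*sigma
    accepted = sum(1 for y in humidity if y <= limit)
    rejected = len(humidity) - accepted
    counts.append((accepted, rejected))
    print(f"{k} sigma = {limit:.5f}: accepted = {accepted}, rejected = {rejected}")
```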
