
Part 2A

About the data:


PM10 is one of the parameters used to judge the quality of the air. PM10 refers
to particles with a diameter of less than 10 micrometres. The Government of India
continuously monitors the air quality of different cities in India. The data
analysed here was taken from an analysis report [1] produced by the Government of
India. It comprises air quality data for the year 2015 from “Graphite India,
White Field Road, Bangalore”, which is considered an industrial area of
Bangalore.

Data and Analysis:


The data comprises 119 PM10 measurements taken throughout the year.
Summary statistics of the data are tabulated below.

Summary statistics

Mean 180.4369748
Standard Error 7.441923828
Median 169
Mode 137
Standard Deviation 81.18180462
Sample Variance 6590.485401
Kurtosis 0.512480187
Skewness 0.715125982
Range 434
Minimum 27
Maximum 461
Sum 21472
Count 119
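
The tabulated figures can be cross-checked against each other with a few lines of Python. This is only a minimal sketch using values copied from the table above; the variable names are illustrative and not part of the original analysis.

```python
import math

# Values copied from the summary statistics table.
count = 119
total = 21472
std_dev = 81.18180462
minimum, maximum = 27, 461

mean = total / count                    # Sum / Count
std_error = std_dev / math.sqrt(count)  # Standard Deviation / sqrt(Count)
variance = std_dev ** 2                 # Sample Variance = (Standard Deviation)^2
value_range = maximum - minimum         # Maximum - Minimum

print(f"mean={mean:.4f} std_error={std_error:.4f} "
      f"variance={variance:.2f} range={value_range}")
```

Each derived quantity should agree with the corresponding row of the table up to rounding.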

For the proportion, we consider values where PM10 > 135 μg/m3. The following data
set is used in the further steps.
Here k = 10000 + (100 - x) * 20,
and x = last three digits of roll number = 232.
x 232
n 232
k 7360

Mean = μ 180.4369748
Population variance = σ2 6590.485401
Population std deviation = σ 81.18180462
PM10 > 135 80
PM10 <= 135 39
Population proportion = π 0.672268908
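
As a quick check of the arithmetic above, the number of rows k and the population proportion π can be recomputed from the stated inputs. This is a small sketch; the variable names are illustrative.

```python
# Inputs stated in the text.
x = 232                      # last three digits of the roll number
k = 10000 + (100 - x) * 20   # number of simulated rows

above = 80                   # readings with PM10 > 135
not_above = 39               # remaining readings
pi = above / (above + not_above)

print(k, round(pi, 9))
```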

• A data set of size k × n = 7360 × 232 has been generated.

• We calculated estimators of the mean, variance and proportion for each row.
• We plotted a histogram of each estimator (figures below).

[Figure: histogram "Average Estimator"; y-axis: Frequency; bin labels range from about 160.75 to 199.87]
[Figure: histogram "Proportion Estimators"; y-axis: Frequency]
[Figure: histogram "Variance Estimators"; y-axis: Frequency]
[Figure: histogram "Std Deviation Estimators"; y-axis: Frequency]
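
The resampling step described above can be sketched as follows. The 119 original readings are not reproduced in this report, so the population below is a hypothetical stand-in, and drawing each row with replacement from the population is an assumption about how the k × n data set was generated. The function name `simulate_estimators` is illustrative, and k is reduced to 500 to keep the run short.

```python
import random
import statistics

def simulate_estimators(population, k, n, threshold, seed=0):
    """Draw k rows of n values (with replacement) and return per-row estimators."""
    rng = random.Random(seed)
    means, variances, proportions = [], [], []
    for _ in range(k):
        row = rng.choices(population, k=n)                 # one simulated row
        means.append(statistics.fmean(row))                # x-bar
        variances.append(statistics.variance(row))         # s^2 (divisor n - 1)
        proportions.append(sum(v > threshold for v in row) / n)  # p
    return means, variances, proportions

# Hypothetical stand-in for the 119 PM10 readings (NOT the real data).
stand_in_rng = random.Random(1)
population = [stand_in_rng.randint(27, 461) for _ in range(119)]

means, variances, proportions = simulate_estimators(
    population, k=500, n=232, threshold=135)

# Averages of the estimators, to compare with the population values.
print(statistics.fmean(means), statistics.fmean(population))
print(statistics.fmean(variances), statistics.pvariance(population))
```

Histograms of `means`, `variances` and `proportions` would then play the role of the figures above.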

• In the next step, we constructed confidence intervals for μ, π and σ2 for all
the data sets.
A sample of the confidence intervals is shown below.

Confidence level = 88%
Chi-square critical values: 265.3263 (upper), 198.5627 (lower); σ2 = 6590.485401

s2         LL         UL         Is σ2 in interval?
7430.553   6497.24    8681.833   inside
5997.521   5244.203   7007.483   inside
6527.389   5707.518   7626.579   inside
6338.024   5541.938   7405.326   inside
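
The interval limits in this table can be recomputed from s2, assuming the two leading figures (265.3263 and 198.5627) are the chi-square critical values for the 88% interval. In this sketch the function name and the `scale` parameter are illustrative. The tabulated limits appear to be reproduced with a multiplier of n = 232, whereas the textbook chi-square interval for a variance uses n - 1 = 231, so both are printed.

```python
def variance_interval(s2, scale, chi2_upper, chi2_lower):
    """Return (LL, UL) = (scale * s2 / chi2_upper, scale * s2 / chi2_lower)."""
    return scale * s2 / chi2_upper, scale * s2 / chi2_lower

CHI2_UPPER, CHI2_LOWER = 265.3263, 198.5627   # assumed critical values (from the table)
SIGMA2 = 6590.485401                          # population variance
N = 232                                       # row length

for s2 in (7430.553, 5997.521, 6527.389, 6338.024):
    ll, ul = variance_interval(s2, N, CHI2_UPPER, CHI2_LOWER)            # as tabulated
    ll_tb, ul_tb = variance_interval(s2, N - 1, CHI2_UPPER, CHI2_LOWER)  # textbook n - 1
    verdict = "inside" if ll <= SIGMA2 <= ul else "outside"
    print(f"{s2:10.3f} {ll:10.3f} {ul:10.3f} {verdict}  "
          f"(n-1: {ll_tb:.3f}, {ul_tb:.3f})")
```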

Part 2A- SUMMARY


4a

From the histograms, we can see that the distributions of all four estimators,
i.e. x-bar, s2, s and p, are approximately normal. Hence, we can say that the
results are consistent with the central limit theorem (CLT).
4b
• We calculated the expected value of each estimator and compared it with the
actual population value. We got the following results.

Particulars   Mean      Variance   Proportion
Population    180.437   6,590.49   0.67227
Sample        180.435   6,549.41   0.67200
Delta         0.002     41.071     0.000
Delta %       0%        1%         0%

Since the percentage differences are small (at most about 1%), the results are
consistent with all three being unbiased estimators.

5b
• Now we check the validity of the confidence coefficient, i.e. whether the
population parameters fall within the confidence intervals.

Particulars           Mean     Proportion   Variance
Confidence interval   92%      93%          88%
Outside range         2        523          1,163
Inside range          7,358    6,837        6,197
Total values          7,360    7,360        7,360
% outside range       0.03%    7.11%        15.80%
% inside range        99.97%   92.89%       84.20%

It can be seen that the observed coverage broadly agrees with the nominal
confidence level for the proportion (92.89% against 93%) and the variance
(84.20% against 88%); for the mean, the observed coverage (99.97%) is higher
than the nominal 92%.
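
The coverage percentages follow directly from the counts in the table. A minimal sketch, with illustrative dictionary names, that recomputes them and sets them next to the nominal confidence levels:

```python
# Counts and nominal levels copied from the table above.
TOTAL = 7360
nominal = {"mean": 0.92, "proportion": 0.93, "variance": 0.88}
outside = {"mean": 2, "proportion": 523, "variance": 1163}

# Observed coverage = share of intervals that contain the population parameter.
coverage = {name: 1 - outside[name] / TOTAL for name in outside}

for name in nominal:
    print(f"{name:10s} nominal={nominal[name]:.2%} observed={coverage[name]:.2%}")
```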


Part 2B- 1
• For this part, we take one row as the sample data set.

Testing of Hypothesis: Mean

H0: μ <= 160
Ha: μ > 160

x bar 180.46
s/sqrt(n) 11.84814
z 1.726853
p value 0.042716

Level of significance 0.05

Power calculation
Power of the test P[Reject H0 | H0 is false]
Assume μ 195
alpha 0.05
z alpha 1.645
Reject H0 if x bar > 179.4884588
z -1.30919613
P[Type-II error] = β 0.095233961
Power = 1-β 90%

Since the p value is less than alpha, we reject the null hypothesis at the 5%
significance level.
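
The test statistic and the power calculation above can be reproduced with Python's standard library. This is a sketch under the assumption that a standard normal reference distribution is used throughout; the function names are illustrative. With a normal tail the p value comes out close to, but not exactly equal to, the reported 0.042716; one possible explanation (an assumption, not confirmed by the source) is that the spreadsheet used a t distribution for that figure.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def upper_tail_z_test(x_bar, mu0, std_error):
    """Return (z, p value) for H0: mu <= mu0 against Ha: mu > mu0."""
    z = (x_bar - mu0) / std_error
    return z, 1 - STD_NORMAL.cdf(z)

def upper_tail_power(mu0, mu1, std_error, alpha):
    """Return (critical x-bar, beta, power) when the true mean is mu1."""
    critical = mu0 + STD_NORMAL.inv_cdf(1 - alpha) * std_error
    beta = STD_NORMAL.cdf((critical - mu1) / std_error)  # P[fail to reject | mu = mu1]
    return critical, beta, 1 - beta

z, p_value = upper_tail_z_test(x_bar=180.46, mu0=160, std_error=11.84814)
critical, beta, power = upper_tail_power(mu0=160, mu1=195,
                                         std_error=11.84814, alpha=0.05)

print(f"z={z:.6f} p={p_value:.4f}")
print(f"reject if x-bar > {critical:.4f}; beta={beta:.4f}; power={power:.4f}")
```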

Testing of Hypothesis: Proportion


H0: p0 <= 0.6
Ha: p0 > 0.6

p 0.655172
standard error 0.032163
z 1.71538
p value 0.043138

Level of significance 0.05

Since the p value is less than alpha, we reject the null hypothesis at the 5%
significance level.
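
The figures in this test are consistent with a one-sample z test for a proportion whose standard error is evaluated at the hypothesised value 0.6 with n = 232. A minimal sketch under that assumption (the function name is illustrative):

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(p_hat, p0, n):
    """Upper-tail z test for a proportion; returns (standard error, z, p value)."""
    std_error = sqrt(p0 * (1 - p0) / n)   # standard error under H0
    z = (p_hat - p0) / std_error
    p_value = 1 - NormalDist().cdf(z)     # P[Z > z]
    return std_error, z, p_value

std_error, z, p_value = proportion_z_test(p_hat=0.655172, p0=0.6, n=232)
print(f"se={std_error:.6f} z={z:.5f} p={p_value:.6f}")
```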

Testing of Hypothesis: Variance

H0: Var <= 7000
Ha: Var > 7000

s2 7430.553
Chi2 245.2082
p value 0.24863

Level of significance 0.05

Since the p value is greater than alpha, we fail to reject the null hypothesis
at the 5% significance level.
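
The chi-square statistic above matches (n - 1) * s2 / σ0² with n = 232. The sketch below recomputes it and the upper-tail p value using only the standard library; `chi2_sf` is a hand-written helper (not a library function) based on the power series of the regularized lower incomplete gamma function, which is adequate for values like these but loses precision far out in the upper tail.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P[X > x] for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series: gamma(a, z) = z^a * e^-z * sum_{n>=0} z^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return 1.0 - lower

n, s2, sigma0_sq = 232, 7430.553, 7000
chi2_stat = (n - 1) * s2 / sigma0_sq
p_value = chi2_sf(chi2_stat, df=n - 1)
print(f"chi2={chi2_stat:.4f} p={p_value:.4f}")
```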

Part 2B- 2
Delhi PM10 level (Jan- Jun) Delhi PM10 level (Jul- Dec)
83 308
102 209
114 287
109 273
127 297
85 261
Up to 60 points Up to 59 points
Part 2C-1
In this part, we find the power of the test when the unit digit of the polled
votes has a 15% chance of being 0. Using the binomial approximation, we
calculated the power as 0.858.

Power 0.858

The detailed calculation is shown in the Excel file.

Part 2C-1
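For the hypothesis test, we assume under the null hypothesis that the elections
were not rigged, i.e. that every unit digit is equally likely.
Null hypothesis H0: p0 = p1 = p2 = ... = p9
Alt hypothesis Ha: at least one pi differs from the others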

We tabulate the data and test the null hypothesis.

Unit digit 0 1 2 3 4 5 6 7 8 9
Actual votes 37 25 26 30 27 33 21 27 23 23
Expected votes 27.2 27.2 27.2 27.2 27.2 27.2 27.2 27.2 27.2 27.2
Actual proportion 0.1360 0.0919 0.0956 0.1103 0.0993 0.1213 0.0772 0.0993 0.0846 0.0846
Expected proportion 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
Test statistic (individual) 3.5309 0.1779 0.0529 0.2882 0.0015 1.2368 1.4132 0.0015 0.6485 0.6485

TS 8.0000
df 9
p-value 0.53
p-value>alpha

As the p value is greater than alpha, we fail to reject the null hypothesis.
Hence, we find no evidence that the proportions differ, i.e. no evidence that
the elections were rigged.
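
The test statistic in the table is the usual chi-square goodness-of-fit sum of (observed - expected)² / expected over the ten digits. A minimal standard-library sketch that recomputes it from the counts above; `chi2_sf` is a hand-written helper (not a library function) that evaluates the upper-tail probability through the series for the regularized lower incomplete gamma function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P[X > x] for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of the series
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return 1.0 - total * math.exp(a * math.log(z) - z - math.lgamma(a))

observed = [37, 25, 26, 30, 27, 33, 21, 27, 23, 23]   # counts for digits 0..9
expected = sum(observed) / len(observed)              # equal share under H0

contributions = [(o - expected) ** 2 / expected for o in observed]
test_statistic = sum(contributions)
df = len(observed) - 1
p_value = chi2_sf(test_statistic, df)

print(f"TS={test_statistic:.4f} df={df} p={p_value:.2f}")
```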
