Basic Statistics - All Calculations

Measure of Central Tendency
Mean
Median
Mode
Measure of Dispersion/Spread
Range
Min
Max
Variance
Std Deviation
Measure of Shape
Skewness
Kurtosis
Mean
62 58
4 52
83 43
24 29
40 69 Average 49 (SUM(A3:A21))/(COUNT(A3:A21))
73 7 49
87 26 Average is the same as mean
2 14
54 65
5 69
74 93
83 65
87 33
54 61
94 50
26 6
14 24
9 96
56 0
58
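The SUM/COUNT formula above can be reproduced outside the spreadsheet. The following is a minimal Python sketch, assuming the first column of the sheet (19 values) is the range A3:A21; the variable names are illustrative only.

```python
from statistics import mean

# First column of the sheet above (assumed to be A3:A21, 19 observations).
column_a = [62, 4, 83, 24, 40, 73, 87, 2, 54, 5,
            74, 83, 87, 54, 94, 26, 14, 9, 56]

# Same arithmetic as (SUM(A3:A21))/(COUNT(A3:A21)).
manual_mean = sum(column_a) / len(column_a)

# The standard library gives the same result.
library_mean = mean(column_a)

print(manual_mean, library_mean)
```

Both values agree with the average of 49 shown in the sheet.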
Median
Odd count Even Count
62 58 Always sort the data in ascending order
4 52 Find the count, and divide the count by 2, to get the index value of the median data point (rounded up when the count is odd)
83 43 Check the value at that index in the data.
24 29
40 69
73 7 62 2
87 26 4 4
2 14 83 5
54 65 24 9
5 69 40 14 (COUNT(E32:E50))/2 10
74 93 73 24
83 65 87 26 The value at the 10th index i
87 33 2 40
54 61 54 54 50% count of the data will be
94 50 5 54 10th index
26 6 74 56
14 24 83 62
9 96 87 73
56 0 54 74
58 94 83
26 83
14 87
9 87
56 94
58 0
52 6
43 7 For even number of observations, we have 2 centr
29 14
69 24 COUNT(E54:E73)/2
7 26 (COUNT(E54:E73)/2)+1
26 29
14 33 Find the 10 & 11 index value in the data
65 43
69 50 10th index Average of these 2 values will be the median
93 52 11th index
65 58 (50+52)/2 51
33 58
61 61 So the median here is 51
50 65
6 65
24 69
96 69
0 93
58 96
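The sort-and-index procedure described above can be sketched in Python as follows. This is an illustration under the assumption that the odd-count example uses the 19-value column and the even-count example uses the 20-value column; `manual_median` is a hypothetical helper name, not part of the sheet.

```python
from statistics import median

# Odd count (19 values) and even count (20 values), as in the sheet.
column_a = [62, 4, 83, 24, 40, 73, 87, 2, 54, 5,
            74, 83, 87, 54, 94, 26, 14, 9, 56]
column_b = [58, 52, 43, 29, 69, 7, 26, 14, 65, 69,
            93, 65, 33, 61, 50, 6, 24, 96, 0, 58]

def manual_median(values):
    """Sort ascending, then pick the centre value (odd count)
    or average the two centre values (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2  # 0-based index of the upper centre value
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(manual_median(column_a))  # odd count: 10th sorted value
print(manual_median(column_b))  # even count: mean of 10th and 11th values
print(median(column_b))         # standard-library cross-check
```

The results match the sheet: 54 for the odd-count column and 51 for the even-count column.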
Mode
0
1
2
4
9 84 Most frequent value in the data is the Mode value
20
27
28
30
32
33
40
41
45
54
59
61
66
75
84
84
87
96
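A small Python sketch of the same idea, using the 23 values listed above; `Counter` is used here only to show the frequency count behind the result.

```python
from collections import Counter
from statistics import mode

# Values from the Mode sheet above.
values = [0, 1, 2, 4, 9, 20, 27, 28, 30, 32, 33, 40, 41, 45,
          54, 59, 61, 66, 75, 84, 84, 87, 96]

# Count how often each value occurs and take the most frequent one.
most_common_value, occurrences = Counter(values).most_common(1)[0]

# The standard library returns the same value directly.
library_mode = mode(values)

print(most_common_value, occurrences, library_mode)
```

Only 84 occurs more than once, so it is the mode.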
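Index value of the median in the data: (COUNT(E32:E50))/2, rounded up, gives 10
The value at the 10th index is 54, hence 54 is the median
50% count of the data will be either equal to or lesser than the median value
For even number of observations, we have 2 centre values: the values at index COUNT/2 and (COUNT/2)+1. Find these index values in the data; the average of these 2 values will be the median.
A smaller even-count example (10 sorted values running from 10 to 16): COUNT/2 = 5 and (COUNT/2)+1 = 6
5th value 12
6th value 13
Median (12+13)/2 = 12.5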
Range
58
52
43 Range is the difference between maximum value & minimum value
29
69 96 MAX(A3:A22)-MIN(A3:A22)
7
26 Therefore our data points are spread over a span of 96 between the min & the max
14
65
69
93
65
33
61
50
6
24
96
0
58
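The MAX minus MIN formula above, as a minimal Python sketch over the same 20 values (assumed to be the range A3:A22):

```python
# The 20 values from the Range sheet above.
column_b = [58, 52, 43, 29, 69, 7, 26, 14, 65, 69,
            93, 65, 33, 61, 50, 6, 24, 96, 0, 58]

data_min = min(column_b)
data_max = max(column_b)

# Same as MAX(A3:A22)-MIN(A3:A22).
data_range = data_max - data_min

print(data_min, data_max, data_range)
```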
Variance & Std Deviation

X X-Mean (X-Mean)^2
58 12.1 146.41
52 6.1 37.21
43 -2.9 8.41
29 -16.9 285.61
69 23.1 533.61
7 -38.9 1513
26 -19.9 396.01
14 -31.9 1018
65 19.1 364.81
69 23.1 533.61
93 47.1 2218
65 19.1 364.81
33 -12.9 166.41
61 15.1 228.01
50 4.1 16.81
6 -39.9 1592
24 -21.9 479.61
96 50.1 2510
0 -45.9 2107
58 12.1 146.41
Mean 45.9
Sum of (X-Mean)^2 14666 SUM(D27:D46)
n-1 19 COUNT(D27:D46)-1
Variance 771.88 D49/D50
Std Dev 27.783 SQRT(D52)
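The table above computes the sample variance (dividing by n-1) step by step. A minimal Python sketch of the same steps, with the standard library used as a cross-check; the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev, variance

# The X column from the table above.
x = [58, 52, 43, 29, 69, 7, 26, 14, 65, 69,
     93, 65, 33, 61, 50, 6, 24, 96, 0, 58]

mean_x = sum(x) / len(x)

# Sum of squared deviations from the mean, i.e. SUM of the (X-Mean)^2 column.
sum_sq_dev = sum((value - mean_x) ** 2 for value in x)

# Sample variance divides by n - 1, as COUNT(...)-1 does in the sheet.
manual_variance = sum_sq_dev / (len(x) - 1)
manual_std_dev = sqrt(manual_variance)

print(mean_x, sum_sq_dev, manual_variance, manual_std_dev)
print(variance(x), stdev(x))  # library equivalents (also use n - 1)
```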
A firm is starting a delivery service for a new client between 2 points.

Since it is a new client, the firm wants to send the more consistent delivery boy to deliver the product.
Delivery boy 1 (Time in minutes): 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 60
Delivery boy 2 (Time in minutes): 34, 14, 31, 59, 11, 50, 27, 33, 53, 34, 13, 13, 42, 29, 33, 42, 34, 33, 44, 21
Delivery No. Boy 1 Boy 2
1 12 34
2 13 14
3 17 31
4 21 59
5 24 11
6 24 50
7 26 27
8 27 33
9 27 53
10 30 34
11 32 13
12 35 13
13 37 42
14 38 29
15 41 33
16 43 42
17 44 34
18 46 33
19 53 44
20 60 21
Mean 32.5 32.5

Variance 166.16 183.74
Since the variance of delivery boy 1 is lesser, he is more consistent as compared to delivery boy 2, hence we should hire boy 1.
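The comparison can be checked with a short Python sketch. It assumes the sample variance (n-1 denominator) is what the sheet reports, which the figures 166.16 and 183.74 are consistent with.

```python
from statistics import mean, variance

# Delivery times in minutes, from the table above.
boy_1 = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
         32, 35, 37, 38, 41, 43, 44, 46, 53, 60]
boy_2 = [34, 14, 31, 59, 11, 50, 27, 33, 53, 34,
         13, 13, 42, 29, 33, 42, 34, 33, 44, 21]

# Both have the same mean, so the mean alone cannot separate them.
mean_1, mean_2 = mean(boy_1), mean(boy_2)

# Lower variance means the delivery times are more consistent.
var_1, var_2 = variance(boy_1), variance(boy_2)
more_consistent = "Boy 1" if var_1 < var_2 else "Boy 2"

print(mean_1, mean_2, round(var_1, 2), round(var_2, 2), more_consistent)
```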
Data (150 sorted values, read down each column):
0 2 10 22 33 57 82 115
0 2 10 22 34 58 82 117
0 2 11 22 35 59 86 118
0 2 11 23 35 61 86 123
0 3 12 23 37 62 87 127
0 3 12 23 37 63 91 128
0 3 13 24 37 64 94 133
0 4 14 25 38 66 99 136
0 5 15 26 38 66 100 139
0 6 15 27 40 66 100 183
1 6 16 28 43 68 102
1 7 16 28 44 68 102
1 7 18 30 46 68 105
1 8 18 31 48 71 106
1 8 18 31 49 77 107
2 8 19 31 53 77 107
2 9 20 31 54 78 107
2 9 21 31 54 79 108
2 9 22 31 54 80 112
2 9 22 33 55 81 115
Bins: 0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190
5
6
7
14
24 Range 91
26
29
33
43
50
52
58
58
61
65
65
69
69
93
96
Mean 43.413 Median 31 Mode 0
Std Dev 40.685

-3 Std Dev -78.64
-2 Std Dev -37.95
-1 Std Dev 2.72
Mean 43.41
+1 Std Dev 84.09
+2 Std Dev 124.78
+3 Std Dev 165.46
Empirical Rule
Approx. 68% of the data is expected to be within 1 std deviation away from the mean
Upper bound 84.098 Values within range 98 Proportion 0.6533
Approx. 95% of the data is expected to be within 2 std deviations away from the mean
Upper bound 124.78 Values within range 144 Proportion 0.96
Approx. 99.7% of the data is expected to be within 3 std deviations away from the mean
Upper bound 165.47 Values within range 149 Proportion 0.9933
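The bounds and proportions above follow from simple arithmetic on the mean and standard deviation. A minimal Python sketch, assuming the sheet's mean (43.413), standard deviation (40.685) and the counts 98, 144 and 149 out of 150 values; the counts are taken from the sheet, not recomputed here.

```python
mean_value = 43.413
std_dev = 40.685
total_count = 150

# Number of observations inside mean +/- k std deviations, as read from the sheet.
within_counts = {1: 98, 2: 144, 3: 149}

bounds = {}
proportions = {}
for k, count in within_counts.items():
    lower = mean_value - k * std_dev
    upper = mean_value + k * std_dev
    bounds[k] = (lower, upper)
    proportions[k] = count / total_count
    print(f"{k} std dev: [{lower:.2f}, {upper:.2f}] -> {proportions[k]:.4f}")
```

The observed proportions (about 0.65, 0.96 and 0.99) are close to the 68%, 95% and 99.7% of the rule, even though this data set is right skewed and the rule is stated for roughly bell-shaped data.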
Bin Frequency
0 10
10 32
20 15
30 16
40 17
50 5
60 8
70 10
80 6
90 6
100 5
110 8
120 5
130 3
140 3
150 0
160 0
170 0
180 0
190 1
More 0
[Figure: Histogram; x-axis: Bin, y-axis: Frequency]

Sepal.Length
5.1
4.9
4.7 Min 4.3
4.6 Max 7.9
5 Mean 5.843333
5.4 Std. Dev 0.828066
4.6
5 3.359135 4.187201 5.015267 5.84 6.671399 7.499466 8.327532
4.4 3 2 1 Mean 1 2 3
4.9
5.4
4.8
4.8
4.3
5.8
5.7
5.4
5.1
5.7
5.1
5.4
5.1
4.6
5.1
4.8
5
5
5.2
5.2
4.7
4.8
5.4
5.2
5.5
4.9
5
5.5
4.9
4.4
5.1
5
4.5
4.4
5
5.1
4.8
5.1
4.6
5.3
5
7
6.4
6.9
5.5
6.5
5.7
6.3
4.9
6.6
5.2
5
5.9
6
6.1
5.6
6.7
5.6
5.8
6.2
5.6
5.9
6.1
6.3
6.1
6.4
6.6
6.8
6.7
6
5.7
5.5
5.5
5.8
6
5.4
6
6.7
6.3
5.6
5.5
5.5
6.1
5.8
5
5.6
5.7
5.7
6.2
5.1
5.7
6.3
5.8
7.1
6.3
6.5
7.6
4.9
7.3
6.7
7.2
6.5
6.4
6.8
5.7
5.8
6.4
6.5
7.7
7.7
6
6.9
5.6
7.7
6.3
6.7
7.2
6.2
6.1
6.4
7.2
7.4
7.9
6.4
6.3
6.1
7.7
6.3
6.4
6
6.9
6.7
6.9
5.8
6.8
6.7
6.7
6.3
6.5
6.2
5.9
The same 150 values are grouped again using wider bins:
Bins: 0 20 40 60 80 100 120 140 160 180 200
Bin Frequency
0 10
20 47
40 33
60 13
80 16
100 11
120 13
140 6
160 0
180 0
200 1
More 0
[Figure: Histogram; x-axis: Bin, y-axis: Frequency]
Right Skewed Data
Mean 43.41333
Median 31
Mode 0
Properties of Skewed Data
Right Skewed Mode<Median<Mean
Left Skewed Mode>Median>Mean
Normal Mode=Median=Mean
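The ordering rule above can be turned into a quick check. This is only a rule-of-thumb sketch: it compares the mean with the median (the mode is left out for simplicity), and `describe_skew` and its `tolerance` parameter are hypothetical names introduced for illustration.

```python
def describe_skew(mean_value, median_value, tolerance=1e-9):
    """Classify the shape from the mean/median ordering (rule of thumb only)."""
    if mean_value > median_value + tolerance:
        return "right skewed"   # long tail on the high side pulls the mean up
    if mean_value < median_value - tolerance:
        return "left skewed"    # long tail on the low side pulls the mean down
    return "approximately symmetric"

# Summary values from the sheet: Mean 43.41333, Median 31.
shape = describe_skew(43.41333, 31)
print(shape)
```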
Bin Frequency Relative Frequency Cumulative Frequency Cumulative Relative Frequency
0 10 0.066667 10 0.066667
20 47 0.313333 57 0.38
40 33 0.22 90 0.6
60 13 0.086667 103 0.686667
80 16 0.106667 119 0.793333
100 11 0.073333 130 0.866667
120 13 0.086667 143 0.953333
140 6 0.04 149 0.993333
160 0 0 149 0.993333
180 0 0 149 0.993333
200 1 0.006667 150 1
Total 150 1
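The relative and cumulative columns can be derived from the frequency column alone. A minimal Python sketch, using the frequencies for bins 0 to 200 from the table above:

```python
from itertools import accumulate

bins = [0, 20, 40, 60, 80, 100, 120, 140, 160, 180, 200]
frequencies = [10, 47, 33, 13, 16, 11, 13, 6, 0, 0, 1]

total = sum(frequencies)

# Relative frequency: share of all observations that fall in each bin.
relative = [f / total for f in frequencies]

# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: running total as a share of all observations.
cumulative_relative = [c / total for c in cumulative]

for row in zip(bins, frequencies, relative, cumulative, cumulative_relative):
    print(row)
```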
Probability is the chance of an event occurring.
There are 2 types of probability distributions: Discrete Probability Distribution & Continuous Probability Distribution
Probability = (Number of ways/outcomes where the event occurs) / (Total number of possible outcomes)
Probability will ALWAYS sum up to 1, and will lie in the range of 0 & 1
Where probability closer to 0 talks about less likely events, & probability closer towards 1 talks about more certain events.
0 0.25 0.5 0.75 1
Unlikely/Impossible Events (near 0) Equal Chance (0.5) Certain/Most likely events (near 1)
Dice
Outcome Probability
1 0.166667
2 0.166667
3 0.166667
4 0.166667
5 0.166667
6 0.166667
Total 1
P(Even number) 0.5
P(Prime number) 0.5
P(x<6) 0.833333
Tossing a coin 10 times Head = 0.5 Tail = 0.5
Number of heads Probability
0 0.000977
1 0.009766
2 0.043945
3 0.117188
4 0.205078
5 0.246094
6 0.205078
7 0.117188
8 0.043945
9 0.009766
10 0.000977
P(Less than 4 heads) 0.171875
P(7 or less) 0.945313
P(Greater than 6 heads) 0.171875
P(Greater than 2 heads) 0.945313
Team A 0.7 Team B 0.3
Outcome count Probability
0 0.00243
1 0.02835
2 0.1323
3 0.3087
4 0.36015
5 0.16807
Total 1
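The coin-toss probabilities above are binomial probabilities. The sketch below recomputes them in Python; it also assumes, as the numbers suggest, that the Team A table is a binomial distribution with 5 trials and a success probability of 0.7. `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Tossing a fair coin 10 times: probability of 0..10 heads.
coin = [binomial_pmf(k, 10, 0.5) for k in range(11)]

p_less_than_4 = sum(coin[:4])      # 0, 1, 2 or 3 heads
p_7_or_less = sum(coin[:8])        # 0 to 7 heads
p_greater_than_6 = sum(coin[7:])   # 7 to 10 heads

# Assumed reading of the Team A table: n = 5, p = 0.7.
team_a = [binomial_pmf(k, 5, 0.7) for k in range(6)]

print(p_less_than_4, p_7_or_less, p_greater_than_6)
print([round(p, 5) for p in team_a])
```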
Continuous Probability Distribution
Sepal.Length Sepal.Width Petal.Length Petal.Width Species Z-Score The temperature of bangalore lies between 20 to 2

5.1 3.5 1.4 0.2 setosa -0.90 The temperature of bangalore tomorrow will be ex
4.9 3 1.4 0.2 setosa -1.14
4.7 3.2 1.3 0.2 setosa -1.38 4.3
4.6 3.1 1.5 0.2 setosa -1.50 7.9 4
5 3.6 1.4 0.2 setosa -1.02 4.5
5.4 3.9 1.7 0.4 setosa -0.54 5
4.6 3.4 1.4 0.3 setosa -1.50 5.5
5 3.4 1.5 0.2 setosa -1.02 6
4.4 2.9 1.4 0.2 setosa -1.74 6.5
4.9 3.1 1.5 0.1 setosa -1.14 7
5.4 3.7 1.5 0.2 setosa -0.54 7.5
4.8 3.4 1.6 0.2 setosa -1.26 8
4.8 3 1.4 0.1 setosa -1.26
4.3 3 1.1 0.1 setosa -1.86 Mean 5.843333
5.8 4 1.2 0.2 setosa -0.05 Stdev 0.828066
5.7 4.4 1.5 0.4 setosa -0.17
5.4 3.9 1.3 0.4 setosa -0.54 P(X<=4.6) -1.51 0.0655
5.1 3.5 1.4 0.3 setosa -0.90
5.7 3.8 1.7 0.3 setosa -0.17 What is the probability of a randomly selected flow
5.1 3.8 1.5 0.3 setosa -0.90
5.4 3.4 1.7 0.2 setosa -0.54 P(X>=6.1) 0.32
5.1 3.7 1.5 0.4 setosa -0.90
4.6 3.6 1 0.2 setosa -1.50 P(X<=680)
5.1 3.3 1.7 0.5 setosa -0.90 Mean 711
4.8 3.4 1.9 0.2 setosa -1.26 St Dev 29
5 3 1.6 0.2 setosa -1.02 z -1.07
5 3.4 1.6 0.4 setosa -1.02
5.2 3.5 1.5 0.2 setosa -0.78
5.2 3.4 1.4 0.2 setosa -0.78 P(X>700) 0.648 64.8
4.7 3.2 1.6 0.2 setosa -1.38 z -0.38
4.8 3.1 1.6 0.2 setosa -1.26
5.4 3.4 1.5 0.4 setosa -0.54
5.2 4.1 1.5 0.1 setosa -0.78
5.5 4.2 1.4 0.2 setosa -0.41 Degrees of Freedom
4.9 3.1 1.5 0.2 setosa -1.14 a 450
5 3.2 1.2 0.2 setosa -1.02 b 250
5.5 3.5 1.3 0.2 setosa -0.41 c 300 500
4.9 3.6 1.4 0.1 setosa -1.14 d 100
4.4 3 1.3 0.2 setosa -1.74 e -600
5.1 3.4 1.5 0.2 setosa -0.90
5 3.5 1.3 0.3 setosa -1.02
4.5 2.3 1.3 0.3 setosa -1.62
4.4 3.2 1.3 0.2 setosa -1.74 Sample Mean
5 3.5 1.6 0.6 setosa -1.02 Sample Std Dev
5.1 3.8 1.9 0.4 setosa -0.90 Population St Dev
4.8 3 1.4 0.3 setosa -1.26 n
5.1 3.8 1.6 0.2 setosa -0.90 Dev
4.6 3.2 1.4 0.2 setosa -1.50
5.3 3.7 1.5 0.2 setosa -0.66 t
5 3.3 1.4 0.2 setosa -1.02
7 3.2 4.7 1.4 versicolor 1.40 CI 1571
6.4 3.2 4.5 1.5 versicolor 0.67 Acceptable Margin of Error= 1 - Confide
6.9 3.1 4.9 1.5 versicolor 1.28
5.5 2.3 4 1.3 versicolor -0.41 What if we don't have info on Populatio
6.5 2.8 4.6 1.5 versicolor 0.79
5.7 2.8 4.5 1.3 versicolor -0.17 Dev
6.3 3.3 4.7 1.6 versicolor 0.55 t
4.9 2.4 3.3 1 versicolor -1.14
6.6 2.9 4.6 1.3 versicolor 0.91 CI 1514.97
5.2 2.7 3.9 1.4 versicolor -0.78
5 2 3.5 1 versicolor -1.02
5.9 3 4.2 1.5 versicolor 0.07
6 2.2 4 1 versicolor 0.19 Degrees of Freedom
6.1 2.9 4.7 1.4 versicolor 0.31 a 100
5.6 2.9 3.6 1.3 versicolor -0.29 b 50
6.7 3.1 4.4 1.4 versicolor 1.03 c 180
5.6 3 4.5 1.5 versicolor -0.29 d 150
5.8 2.7 4.1 1 versicolor -0.05 e 20
6.2 2.2 4.5 1.5 versicolor 0.43
5.6 2.5 3.9 1.1 versicolor -0.29
5.9 3.2 4.8 1.8 versicolor 0.07
6.1 2.8 4 1.3 versicolor 0.31
6.3 2.5 4.9 1.5 versicolor 0.55
6.1 2.8 4.7 1.2 versicolor 0.31
6.4 2.9 4.3 1.3 versicolor 0.67
6.6 3 4.4 1.4 versicolor 0.91
6.8 2.8 4.8 1.4 versicolor 1.16
6.7 3 5 1.7 versicolor 1.03
6 2.9 4.5 1.5 versicolor 0.19
5.7 2.6 3.5 1 versicolor -0.17
5.5 2.4 3.8 1.1 versicolor -0.41
5.5 2.4 3.7 1 versicolor -0.41
5.8 2.7 3.9 1.2 versicolor -0.05
6 2.7 5.1 1.6 versicolor 0.19
5.4 3 4.5 1.5 versicolor -0.54
6 3.4 4.5 1.6 versicolor 0.19
6.7 3.1 4.7 1.5 versicolor 1.03
6.3 2.3 4.4 1.3 versicolor 0.55
5.6 3 4.1 1.3 versicolor -0.29
5.5 2.5 4 1.3 versicolor -0.41
5.5 2.6 4.4 1.2 versicolor -0.41
6.1 3 4.6 1.4 versicolor 0.31
5.8 2.6 4 1.2 versicolor -0.05
5 2.3 3.3 1 versicolor -1.02
5.6 2.7 4.2 1.3 versicolor -0.29
5.7 3 4.2 1.2 versicolor -0.17
5.7 2.9 4.2 1.3 versicolor -0.17
6.2 2.9 4.3 1.3 versicolor 0.43
5.1 2.5 3 1.1 versicolor -0.90
5.7 2.8 4.1 1.3 versicolor -0.17
6.3 3.3 6 2.5 virginica 0.55
5.8 2.7 5.1 1.9 virginica -0.05
7.1 3 5.9 2.1 virginica 1.52
6.3 2.9 5.6 1.8 virginica 0.55
6.5 3 5.8 2.2 virginica 0.79
7.6 3 6.6 2.1 virginica 2.12
4.9 2.5 4.5 1.7 virginica -1.14
7.3 2.9 6.3 1.8 virginica 1.76
6.7 2.5 5.8 1.8 virginica 1.03
7.2 3.6 6.1 2.5 virginica 1.64
6.5 3.2 5.1 2 virginica 0.79
6.4 2.7 5.3 1.9 virginica 0.67
6.8 3 5.5 2.1 virginica 1.16
5.7 2.5 5 2 virginica -0.17
5.8 2.8 5.1 2.4 virginica -0.05
6.4 3.2 5.3 2.3 virginica 0.67
6.5 3 5.5 1.8 virginica 0.79
7.7 3.8 6.7 2.2 virginica 2.24
7.7 2.6 6.9 2.3 virginica 2.24
6 2.2 5 1.5 virginica 0.19
6.9 3.2 5.7 2.3 virginica 1.28
5.6 2.8 4.9 2 virginica -0.29
7.7 2.8 6.7 2 virginica 2.24
6.3 2.7 4.9 1.8 virginica 0.55
6.7 3.3 5.7 2.1 virginica 1.03
7.2 3.2 6 1.8 virginica 1.64
6.2 2.8 4.8 1.8 virginica 0.43
6.1 3 4.9 1.8 virginica 0.31
6.4 2.8 5.6 2.1 virginica 0.67
7.2 3 5.8 1.6 virginica 1.64
7.4 2.8 6.1 1.9 virginica 1.88
7.9 3.8 6.4 2 virginica 2.48
6.4 2.8 5.6 2.2 virginica 0.67
6.3 2.8 5.1 1.5 virginica 0.55
6.1 2.6 5.6 1.4 virginica 0.31
7.7 3 6.1 2.3 virginica 2.24
6.3 3.4 5.6 2.4 virginica 0.55
6.4 3.1 5.5 1.8 virginica 0.67
6 3 4.8 1.8 virginica 0.19
6.9 3.1 5.4 2.1 virginica 1.28
6.7 3.1 5.6 2.4 virginica 1.03
6.9 3.1 5.1 2.3 virginica 1.28
5.8 2.7 5.1 1.9 virginica -0.05
6.8 3.2 5.9 2.3 virginica 1.16
6.7 3.3 5.7 2.5 virginica 1.03
6.7 3 5.2 2.3 virginica 1.03
6.3 2.5 5 1.9 virginica 0.55
6.5 3 5.2 2 virginica 0.79
6.2 3.4 5.4 2.3 virginica 0.43
5.9 3 5.1 1.8 virginica 0.07
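The Z-Score column and the side calculations in this sheet use (X-Mean)/Stdev and a normal-distribution lookup. A minimal Python sketch, assuming the data are treated as normally distributed; `z_score` is an illustrative helper, and an exact CDF can differ slightly from a value read off a printed z-table (the sheet shows 0.0655 for the sepal-length case).

```python
from statistics import NormalDist

def z_score(x, mean_value, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean_value) / std_dev

# Example from the sheet: Mean 711, St Dev 29.
example = NormalDist(mu=711, sigma=29)
z_680 = z_score(680, 711, 29)
p_at_most_680 = example.cdf(680)
p_above_700 = 1 - example.cdf(700)

# Sepal length: Mean 5.843333, Stdev 0.828066.
sepal = NormalDist(mu=5.843333, sigma=0.828066)
z_first_row = z_score(5.1, 5.843333, 0.828066)
p_sepal_below_4_6 = sepal.cdf(4.6)

print(round(z_680, 2), round(p_at_most_680, 4), round(p_above_700, 3))
print(round(z_first_row, 2), round(p_sepal_below_4_6, 4))
```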
The temperature of Bangalore lies between 20 to 25 degrees
The temperature of Bangalore tomorrow will be exact 23 degrees (or 23.1, 23.5, 22.6)
Sepal.Length histogram (Min 4.3, Max 7.9)
Bin Frequency
4 0
4.5 5
5 27
5.5 27
6 30
6.5 31
7 18
7.5 6
8 6
More 0
[Figure: Histogram; x-axis: Bin, y-axis: Frequency]
What is the probability of a randomly selected flower to have a sepal length of less than 4.6?
Z-Score (X-Mean)/Stdev
Standard error Stdev/sqrt(n) n=Sample Size
t-Stat (X-Mean)/(stdev/sqrt(N))
Confidence interval example
Sample Mean 1990
Sample Std Dev 2833
Population St Dev 2500
n 140 df 139
Standard error (Population St Dev/sqrt(n)) 211.2886
t 1.984 (From the table)
CI 1571 to 2409 95% times the average balance maintained will lie between $1571 & $2409
Acceptable Margin of Error= 1 - Confidence Level
What if we don't have info on Population Parameter?
Standard error (Sample Std Dev/sqrt(n)) 239.4322
t 1.984
CI 1514.97 to 2465.03 95% times the average balance maintained will lie between $1514.97 & $2465.03
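The two confidence intervals in this sheet (sample mean 1990, n = 140, critical value 1.984) can be reproduced with a short Python sketch. The critical value is taken from the sheet as given rather than computed, and `confidence_interval` is a hypothetical helper name.

```python
from math import sqrt

def confidence_interval(sample_mean, std_dev, n, critical_value):
    """Return (lower, upper) = mean -/+ critical_value * std_dev / sqrt(n)."""
    standard_error = std_dev / sqrt(n)
    margin = critical_value * standard_error
    return sample_mean - margin, sample_mean + margin

# Using the population standard deviation (2500).
ci_population = confidence_interval(1990, 2500, 140, 1.984)

# Using the sample standard deviation (2833) when the population value is unknown.
ci_sample = confidence_interval(1990, 2833, 140, 1.984)

print([round(v, 2) for v in ci_population])
print([round(v, 2) for v in ci_sample])
```

The wider interval in the second case reflects the larger standard deviation used, not a different confidence level.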
Situation 1
Null Hypothesis: The person is innocent. (In reality: the null is true)
Alternate Hypothesis: The person is guilty.
There is no sufficient evidence to prove the person guilty, hence we will accept the null hypothesis & the person will be let free.
Hence the decision made is correct.
Situation 2
Null Hypothesis: The person is innocent. (In reality: the null is false)
Alternate Hypothesis: The person is guilty.
There is sufficient evidence to prove the person guilty, hence we will accept the alternate hypothesis & the person will be punished.
Hence the decision made is correct.
Situation 3
Null Hypothesis: The person is innocent. (In reality: the null is true)
Alternate Hypothesis: The person is guilty.
There is evidence to prove the person guilty, hence we will accept the alternate hypothesis & the person will be punished.
Hence we are making a Type 1 Error.
Situation 4
Null Hypothesis: The person is innocent. (In reality: the null is false)
Alternate Hypothesis: The person is guilty.
There is no sufficient evidence to prove the person guilty, hence we will accept the null hypothesis & the person will be set free.
Hence we are making a Type 2 Error.
H0 is Null Hypothesis
H1 is Alternate Hypothesis
Reality vs Decision Made
Reality Accept H0 Reject H0
H0 is True Correct Type 1 Error
H0 is False Type 2 Error Correct
We usually must not say 'Accept Null Hypothesis', because the fact mentioned in Null Hypothesis is the default belief.
So if there is no sufficient evidence from the sample to prove the Null as wrong, we would rather say 'Fail to reject the Null'.
One tailed tests
Variables like:
Sales
Profitability
Scores in an exam etc
Which are beneficial when on the upside, and detrimental when on the downside.
Variables like:
Loss
Number of defects
Which are detrimental when on the upside, and beneficial when on the downside.
Two tailed tests
Variables like:
Body temperature
Rainfall
Which are good enough only when in a specified safe range, but detrimental when on either of the extremes, higher or lower.
Sample Mean 1990
Sample SD 2833
Population SD 2500
Sample size 140
Standard error (Sample SD/sqrt(n)) 239
Mean 1990
Mean -/+ 1 standard error 1751 2229
Mean -/+ 2 standard errors 1511.1355993 2468.8644007
Null & Alternate Hypothesis
z-score / t-test
0.05 Alpha/ Significance level
p-value Probability Value
p-value > 0.05 Fail to reject the Null Hypothesis
p-value < 0.05 Reject the Null Hypothesis in favour of the Alternate Hypothesis
In a company, we have 500 employees

Average salary of employees 40000
SD of the salaries 2000
If I pick a random employee, what is the probability that his salary is 35000.
t-stat (35000-40000)/(2000/sqrt(50))
-17.67766953
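The t-stat arithmetic and the alpha rule can be sketched in Python as below. The sample size of 50 is taken from the sqrt(50) in the formula above; `decide` is a hypothetical helper that applies the 0.05 significance level and uses the "fail to reject" wording.

```python
from math import sqrt

# Values from the salary example: X = 35000, mean = 40000, SD = 2000, n = 50.
x, mean_salary, sd_salary, n = 35000, 40000, 2000, 50

standard_error = sd_salary / sqrt(n)
t_stat = (x - mean_salary) / standard_error

def decide(p_value, alpha=0.05):
    """Apply the significance-level rule to a p-value."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

print(round(t_stat, 8))
print(decide(0.03), "|", decide(0.20))
```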
a 6
b 11
c 17
d 32
e -16
50 50
t 2.2360679775
df 79
ANOVA
Null Hypothesis: The means of the 3 samples are almost the same, and the different diets have equal impact on weight loss.
Alternate Hypothesis: At least 1 diet plan reduces weight significantly more than the others.
S.No Atkins GM South Beach

1 6 4 6
2 2 4 7
3 3 5 5
4 4 7 6
5 2 8 8
6 3 5 7
7 3 3 8
8 2 7 9
9 7 10 6
10 8 4 5
11 10 5 5
12 4 2 12
Mean 4.5 5.3333333333 7
Grand Mean 5.61111111111111
Sum of Squares Within
Atkins D= (X-Mean) Square of D

6 1.5 2.25
2 -2.5 6.25
3 -1.5 2.25
4 -0.5 0.25
2 -2.5 6.25
3 -1.5 2.25
3 -1.5 2.25
2 -2.5 6.25
7 2.5 6.25
8 3.5 12.25
10 5.5 30.25
4 -0.5 0.25
Average 4.5 77
SS Within Groups (SSE, Sum of Squares Error) 179.66666667
Degrees of Freedom (k-1) (Numerator) 2
SS Between Groups
Atkins GM South Beach

6 4 6
2 4 7
3 5 5
4 7 6
2 8 8
3 5 7
3 3 8
2 7 9
7 10 6
8 4 5
10 5 5
4 2 12
Mean 4.5 5.3333333333 7
Grand Mean 5.61111111111111
D -1.1111111111 -0.2777777778 1.3888888889
Square of D 1.2345679012 0.0771604938 1.9290123457
(D^2) * n 14.814814815 0.9259259259 23.148148148
SSC 38.888888889
Degrees of Freedom (N-k) 33
Mean Square Between = SSC / (k-1)

k= No. Of samples
Degrees of Freedom
MSB 19.444444444 Numerator for F-Stat
Mean Square Error = SSE/ (n-k)

n= total number of observations
MSE 5.4444444444 Denominator for F-Stat

F - statistic MSB/MSE
3.5714285714 F-Stat
F- Critical 3.31 F-Crit

F-Stat> F-Crit Reject the Null Hypothesis & accept Alternate
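The sums of squares and the F-statistic above can be recomputed from the raw diet data. This is a minimal pure-Python sketch of a one-way ANOVA; the F-critical value is not computed here and would still come from a table or from Excel.

```python
# Weight-loss data for the three diets, from the table above.
groups = {
    "Atkins": [6, 2, 3, 4, 2, 3, 3, 2, 7, 8, 10, 4],
    "GM": [4, 4, 5, 7, 8, 5, 3, 7, 10, 4, 5, 2],
    "South Beach": [6, 7, 5, 6, 8, 7, 8, 9, 6, 5, 5, 12],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = sum(all_values) / len(all_values)
k = len(groups)          # number of samples
n = len(all_values)      # total number of observations

# SS within: squared deviations of each value from its own group mean.
ss_within = 0.0
# SS between: group size times squared deviation of group mean from grand mean.
ss_between = 0.0
for values in groups.values():
    group_mean = sum(values) / len(values)
    ss_within += sum((v - group_mean) ** 2 for v in values)
    ss_between += len(values) * (group_mean - grand_mean) ** 2

msb = ss_between / (k - 1)   # Mean Square Between
mse = ss_within / (n - k)    # Mean Square Error
f_stat = msb / mse

print(round(ss_between, 4), round(ss_within, 4), round(f_stat, 4))
```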
In Excel
Anova: Single Factor
SUMMARY
Groups Count Sum Average Variance
Atkins 12 54 4.5 7
GM 12 64 5.3333333333 5.1515151515
South Beach 12 84 7 4.1818181818
ANOVA
Source of Variation SS df MS F
Between Groups 38.888888889 2 19.444444444 3.5714285714
Within Groups 179.66666667 33 5.4444444444
Total 218.55555556 35
Shelf 1
Shelf 2
Shelf 3
Shelf 4 Highest Sales
Shelf 5
Sum of Squares Within
GM D=X-Mean Square of D South Beach D=X-Mean Square of D

4 -1.3333333333 1.7777777778 6 -1 1
4 -1.3333333333 1.7777777778 7 0 0
5 -0.3333333333 0.1111111111 5 -2 4
7 1.6666666667 2.7777777778 6 -1 1
8 2.6666666667 7.1111111111 8 1 1
5 -0.3333333333 0.1111111111 7 0 0
3 -2.3333333333 5.4444444444 8 1 1
7 1.6666666667 2.7777777778 9 2 4
10 4.6666666667 21.777777778 6 -1 1
4 -1.3333333333 1.7777777778 5 -2 4
5 -0.3333333333 0.1111111111 5 -2 4
2 -3.3333333333 11.111111111 12 5 25
Average 5.3333333333 56.666666667 Average 7 46
Degrees of Freedom (k-1): Number of samples minus 1
Degrees of Freedom (N-k): Total elements minus number of samples
5 kms
2 kms
P-value F crit
0.0394405888 3.2849176511
Observed Frequencies
Non-smoker Smoker Total
Athlete 14 4 18
Non-athlete 0 10 10
Total 14 14 28
Expected Frequencies
Non-smoker Smoker Total
Athlete 9 9 18
Non-athlete 5 5 10
Total 14 14 28
Athlete/Non-smoker = ((14-9)^2)/9 2.78

Non-Athlete/Non-smoker = ((0-5)^2)/5 5.00
Athlete/Smoker = ((4-9)^2)/9 2.78
Non-Athlete/Smoker = ((10-5)^2)/5 5.00
Chi-Square-stat 15.56
df=(number of rows−1)⋅(number of columns−1) 1
Critical 3.84
Test>Crit Reject the Null Hypothesis in favour of the Alternate Hypothesis
Null Hypothesis The 2 variables are independent & there is no significant association between smoking and being a professional athlete
Alternate Hypothesis The 2 variables are dependent & there is a statistically significant association between smoking and being a professional athlete
In Excel
P-Value 0.0000801
P-value is less than 0.05, so reject null hypothesis.
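The expected frequencies, the chi-square statistic and the p-value can be recomputed from the observed table. The sketch below is pure Python; the p-value line uses a closed form that is valid only for 1 degree of freedom (as in this 2x2 table), which is an assumption of this example and not a general method.

```python
from math import erfc, sqrt

# Observed frequencies: rows = Athlete, Non-athlete; columns = Non-smoker, Smoker.
observed = [[14, 4],
            [0, 10]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic = sum of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(len(observed))
    for j in range(len(observed[0]))
)

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Valid for df = 1 only: P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = erfc(sqrt(chi_square / 2))

print(expected, round(chi_square, 2), df, p_value)
```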
