
Q1) Identify the Data type (Discrete or Continuous) for the Following:

Activity                                  Data Type
Number of beatings from Wife              Discrete
Results of rolling a die                  Discrete
Weight of a person                        Continuous
Weight of Gold                            Continuous
Distance between two places               Continuous
Length of a leaf                          Continuous
Dog's weight                              Continuous
Blue Color                                Discrete (categorical)
Number of kids                            Discrete
Number of tickets in Indian railways      Discrete
Number of times married                   Discrete
Gender (Male or Female)                   Discrete (categorical)

Q2) Identify the data type of each item below as Nominal, Ordinal, Interval, or Ratio.

Data                              Data Type
Gender                            Nominal
High School Class Ranking         Ordinal
Celsius Temperature               Interval
Weight                            Ratio
Hair Color                        Nominal
Socioeconomic Status              Ordinal
Fahrenheit Temperature            Interval
Height                            Ratio
Type of living accommodation      Nominal
Level of Agreement                Ordinal
IQ (Intelligence Scale)           Interval
Sales Figures                     Ratio
Blood Group                       Nominal
Time Of Day                       Interval
Time on a Clock with Hands        Interval
Number of Children                Ratio
Religious Preference              Nominal
Barometer Pressure                Ratio
SAT Scores                        Interval
Years of Education                Ratio

Q3) Three coins are tossed. Find the probability that two heads and one tail are
obtained.
Soln:
When 3 coins are tossed, the number of possible outcomes is 2^3 = 8,
where H = HEAD
and T = TAIL.
So the possibilities are
(HHH, HHT, HTH, THH, HTT, THT, TTH, TTT)
The outcomes that have exactly two heads and one tail are HHT, HTH, THH.
Therefore, the probability of obtaining two heads and one tail is 3/8 or 0.375
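The count above can be checked by brute force. The following is a minimal Python sketch (the variable names are illustrative, not part of the assignment) that enumerates all outcomes of three coin tosses and keeps those with exactly two heads.

```python
from fractions import Fraction
from itertools import product

# All 2^3 ordered outcomes of tossing three coins.
outcomes = list(product("HT", repeat=3))

# Outcomes with exactly two heads (and therefore one tail).
favourable = [o for o in outcomes if o.count("H") == 2]

probability = Fraction(len(favourable), len(outcomes))
print(len(outcomes), ["".join(o) for o in favourable], probability)
```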

Q4) Two dice are rolled. Find the probability that the sum is
a) Equal to 1
b) Less than or equal to 4
c) Divisible by 2 and 3
Soln:
The number of possible outcomes is 36:
(1,1)(1,2)(1,3)(1,4)(1,5)(1,6)
(2,1)(2,2)(2,3)(2,4)(2,5)(2,6)
(3,1)(3,2)(3,3)(3,4)(3,5)(3,6)
(4,1)(4,2)(4,3)(4,4)(4,5)(4,6)
(5,1)(5,2)(5,3)(5,4)(5,5)(5,6)
(6,1)(6,2)(6,3)(6,4)(6,5)(6,6)

a) Equal to 1
The smallest possible sum is 2, so no outcome qualifies.
0/36 = 0

b) Less than or equal to 4
(1,1)(1,2)(1,3)(2,1)(2,2)(3,1)
6/36 = 1/6

c) Sum is divisible by 2 and 3 (that is, by 6, so the sum is 6 or 12)
(1,5)(5,1)(2,4)(4,2)(3,3)(6,6)
6/36 = 1/6
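As a cross-check on the three counts, here is a short Python sketch that enumerates the 36 rolls and tests each condition on the sum. The helper name `prob` is an assumption made for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of rolling two dice.
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability that the sum of the two dice satisfies `event`."""
    hits = [r for r in rolls if event(sum(r))]
    return Fraction(len(hits), len(rolls))

p_sum_is_1 = prob(lambda s: s == 1)
p_sum_le_4 = prob(lambda s: s <= 4)
p_sum_div_2_and_3 = prob(lambda s: s % 2 == 0 and s % 3 == 0)
print(p_sum_is_1, p_sum_le_4, p_sum_div_2_and_3)
```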

Q5) A bag contains 2 red, 3 green and 2 blue balls. Two balls are drawn at
random. What is the probability that none of the balls drawn is blue?
Soln:
Total balls = 7, of which 5 are not blue.
Probability that the first ball drawn is not blue = 5/7

After one non-blue ball is removed, 6 balls remain and 4 of them are not blue,
so the probability that the second ball is also not blue = 4/6
Probability that neither ball is blue = (5/7)*(4/6) = 20/42 = 10/21
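The same answer can be reached by counting combinations instead of multiplying sequential probabilities. The sketch below computes it both ways; it assumes the balls are drawn without replacement, as the solution above does.

```python
from fractions import Fraction
from math import comb

total, blue = 7, 2
non_blue = total - blue

# Sequential view: first draw non-blue, then second draw non-blue.
p_sequential = Fraction(non_blue, total) * Fraction(non_blue - 1, total - 1)

# Counting view: choose 2 of the 5 non-blue balls out of all pairs of 7.
p_combinations = Fraction(comb(non_blue, 2), comb(total, 2))

print(p_sequential, p_combinations)
```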

Q6) Calculate the Expected number of candies for a randomly selected child
Below are the probabilities of count of candies for children (ignoring the nature of
the child, generalized view)

CHILD   Candies count   Probability
A       1               0.015
B       4               0.20
C       3               0.65
D       5               0.005
E       6               0.01
F       2               0.120

Soln:
The probabilities sum to 1, so the expected value is the probability-weighted sum of the candy counts.
Expected number of candies for a randomly selected child
= (1*0.015)+(4*0.20)+(3*0.65)+(5*0.005)+(6*0.01)+(2*0.120)
= 0.015+0.8+1.95+0.025+0.06+0.24 = 3.09
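The weighted sum can be written as a few lines of Python. This is only a sketch; the dictionary layout is an illustrative choice, and the values are the ones from the table in the question.

```python
# Candy count and probability per child, taken from the Q6 table.
candies = {
    "A": (1, 0.015),
    "B": (4, 0.20),
    "C": (3, 0.65),
    "D": (5, 0.005),
    "E": (6, 0.01),
    "F": (2, 0.120),
}

# A valid probability distribution must sum to 1.
total_probability = sum(p for _, p in candies.values())

# Expected value: sum of (value * probability).
expected = sum(count * p for count, p in candies.values())
print(round(total_probability, 6), round(expected, 6))
```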

Q7) Calculate Mean, Median, Mode, Variance, Standard Deviation, and Range
for the Points, Score, and Weigh columns of the given dataset,
and comment about the values / draw inferences.
Use Q7.csv file
Points Score Weigh
3.9 2.62 16.46
3.9 2.875 17.02
3.85 2.32 18.61
3.08 3.215 19.44
3.15 3.44 17.02
2.76 3.46 20.22
3.21 3.57 15.84
3.69 3.19 20
3.92 3.15 22.9
3.92 3.44 18.3
3.92 3.44 18.9
3.07 4.07 17.4
3.07 3.73 17.6
3.07 3.78 18
2.93 5.25 17.98
3 5.424 17.82
3.23 5.345 17.42
4.08 2.2 19.47
4.93 1.615 18.52
4.22 1.835 19.9
3.7 2.465 20.01
2.76 3.52 16.87
3.15 3.435 17.3
3.73 3.84 15.41
3.08 3.845 17.05
4.08 1.935 18.9
4.43 2.14 16.7
3.77 1.513 16.9
4.22 3.17 14.5
3.62 2.77 15.5
3.54 3.57 14.6
4.11 2.78 18.6
Sorted (each column sorted independently, in ascending order)
Points Score Weigh
2.76 1.513 14.5
2.76 1.615 14.6
2.93 1.835 15.41
3 1.935 15.5
3.07 2.14 15.84
3.07 2.2 16.46
3.07 2.32 16.7
3.08 2.465 16.87
3.08 2.62 16.9
3.15 2.77 17.02
3.15 2.78 17.02
3.21 2.875 17.05
3.23 3.15 17.3
3.54 3.17 17.4
3.62 3.19 17.42
3.69 3.215 17.6
3.7 3.435 17.82
3.73 3.44 17.98
3.77 3.44 18
3.85 3.44 18.3
3.9 3.46 18.52
3.9 3.52 18.6
3.92 3.57 18.61
3.92 3.57 18.9
3.92 3.73 18.9
4.08 3.78 19.44
4.08 3.84 19.47
4.11 3.845 19.9
4.22 4.07 20
4.22 5.25 20.01
4.43 5.345 20.22
4.93 5.424 22.9
Points

Mean = 3.596

Median = (3.69 + 3.70)/2 = 3.695

Mode = 3.07 and 3.92 (each occurs three times, so the data is bimodal)

Range = 4.93 - 2.76 = 2.17

points (x)   x-avg       (x-avg)^2
2.76         -0.83656    0.699837
2.76         -0.83656    0.699837
2.93         -0.66656    0.444306
3            -0.59656    0.355887
3.07         -0.52656    0.277268
3.07         -0.52656    0.277268
3.07         -0.52656    0.277268
3.08         -0.51656    0.266837
3.08         -0.51656    0.266837
3.15         -0.44656    0.199418
3.15         -0.44656    0.199418
3.21         -0.38656    0.149431
3.23         -0.36656    0.134368
3.54         -0.05656    0.003199
3.62         0.023438    0.000549
3.69         0.093437    0.008731
3.7          0.103438    0.010699
3.73         0.133438    0.017806
3.77         0.173438    0.030081
3.85         0.253438    0.064231
3.9          0.303438    0.092074
3.9          0.303438    0.092074
3.92         0.323438    0.104612
3.92         0.323438    0.104612
3.92         0.323438    0.104612
4.08         0.483438    0.233712
4.08         0.483438    0.233712
4.11         0.513438    0.263618
4.22         0.623438    0.388674
4.22         0.623438    0.388674
4.43         0.833438    0.694618
4.93         1.333438    1.778056
Sum of (x-avg)^2 = 8.862322

Variance = 8.862322/31 = 0.28588 (sample variance, dividing by n - 1 = 31)

Standard deviation = sqrt(0.28588) = 0.5347

Score

Mean = 3.217

Median = (3.215 + 3.435)/2 = 3.325

Mode = 3.44

Range = 5.424 - 1.513 = 3.911

score (sorted): 1.513, 1.615, 1.835, 1.935, 2.14, 2.2, 2.32, 2.465, 2.62, 2.77, 2.78, 2.875, 3.15, 3.17, 3.19, 3.215, 3.435, 3.44, 3.44, 3.44, 3.46, 3.52, 3.57, 3.57, 3.73, 3.78, 3.84, 3.845, 4.07, 5.25, 5.345, 5.424
Sum of (x-avg)^2 = 29.678748

Variance = 29.678748/31 = 0.95737

Standard deviation = 0.97845

Weigh

Mean = 17.84875

Median = (17.6 + 17.82)/2 = 17.71

Mode = 17.02 and 18.9 (each occurs twice, so the data is bimodal)

Range = 22.9 - 14.5 = 8.4

Weigh (x)    x-avg       (x-avg)^2
14.5         -3.34875    11.21413
14.6         -3.24875    10.55438
15.41        -2.43875    5.947502
15.5         -2.34875    5.516627
15.84        -2.00875    4.035077
16.46        -1.38875    1.928627
16.7         -1.14875    1.319627
16.87        -0.97875    0.957952
16.9         -0.94875    0.900127
17.02        -0.82875    0.686827
17.02        -0.82875    0.686827
17.05        -0.79875    0.638002
17.3         -0.54875    0.301127
17.4         -0.44875    0.201377
17.42        -0.42875    0.183827
17.6         -0.24875    0.061877
17.82        -0.02875    0.000827
17.98        0.13125     0.017227
18           0.15125     0.022877
18.3         0.45125     0.203627
18.52        0.67125     0.450577
18.6         0.75125     0.564377
18.61        0.76125     0.579502
18.9         1.05125     1.105127
18.9         1.05125     1.105127
19.44        1.59125     2.532077
19.47        1.62125     2.628452
19.9         2.05125     4.207627
20           2.15125     4.627877
20.01        2.16125     4.671002
20.22        2.37125     5.622827
22.9         5.05125     25.51513
Sum of (x-avg)^2 = 98.98815

Variance = 98.98815/31 = 3.1931

Standard deviation = 1.78694

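Inferences:

For Points and Score, mean < median, which by the mean-median rule of thumb suggests the data is negatively skewed.

For Weigh, mean > median, which suggests the Weigh distribution is positively skewed. The gaps between mean and median are small in all three columns (for example 17.85 vs 17.71 for Weigh), so a computed skewness coefficient is a more reliable check than this rule alone.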
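The hand calculations for the three columns can be cross-checked with Python's standard `statistics` module. This is a sketch under the assumption that the three lists below match the rows of Q7.csv as listed in the question; `summarize` is a hypothetical helper name, and `variance`/`stdev` use the sample (n - 1) formula, as the solution does.

```python
import statistics as st

# Columns of Q7.csv, copied from the rows listed in the question.
points = [3.9, 3.9, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,
          3.07, 3.07, 3.07, 2.93, 3.0, 3.23, 4.08, 4.93, 4.22, 3.7, 2.76,
          3.15, 3.73, 3.08, 4.08, 4.43, 3.77, 4.22, 3.62, 3.54, 4.11]
score = [2.62, 2.875, 2.32, 3.215, 3.44, 3.46, 3.57, 3.19, 3.15, 3.44, 3.44,
         4.07, 3.73, 3.78, 5.25, 5.424, 5.345, 2.2, 1.615, 1.835, 2.465, 3.52,
         3.435, 3.84, 3.845, 1.935, 2.14, 1.513, 3.17, 2.77, 3.57, 2.78]
weigh = [16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.0, 22.9, 18.3,
         18.9, 17.4, 17.6, 18.0, 17.98, 17.82, 17.42, 19.47, 18.52, 19.9,
         20.01, 16.87, 17.3, 15.41, 17.05, 18.9, 16.7, 16.9, 14.5, 15.5,
         14.6, 18.6]

def summarize(values):
    """Descriptive statistics asked for in Q7 (sample variance, n - 1)."""
    return {
        "mean": st.mean(values),
        "median": st.median(values),
        "modes": sorted(st.multimode(values)),  # all most-frequent values
        "variance": st.variance(values),
        "std_dev": st.stdev(values),
        "range": max(values) - min(values),
    }

summary = {name: summarize(col)
           for name, col in [("Points", points), ("Score", score), ("Weigh", weigh)]}
for name, stats in summary.items():
    print(name, stats)
```

`multimode` is used rather than `mode` because a column can have more than one most-frequent value, which a single-valued mode would hide.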

Q8) Calculate Expected Value for the problem below

a) The weights (X) of patients at a clinic (in pounds) are
108, 110, 123, 134, 135, 145, 167, 187, 199
Assume one of the patients is chosen at random. What is the Expected
Value of the Weight of that patient?
Soln:
Each of the 9 patients is equally likely to be chosen (probability 1/9), so the expected value equals the mean of the weights.
Expected value = 1308/9 = 145.33 pounds

Q9) Calculate Skewness, Kurtosis & draw inferences on the following data
a) Cars speed and distance
Use Q9_a.csv

Speed has negative skewness and negative kurtosis. A negative kurtosis value
indicates the data is platykurtic.

Dist has positive skewness and positive kurtosis. A positive kurtosis value indicates
the data is leptokurtic.
b) SP and Weight (WT)
Use Q9_b.csv

SP has positive skewness and positive kurtosis. A positive kurtosis value indicates the
data is leptokurtic.

WT has negative skewness and positive kurtosis. A positive kurtosis value indicates
the data is leptokurtic.
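For reference, skewness and kurtosis can be computed from the central moments. The sketch below uses the population-moment definitions, with kurtosis reported as excess kurtosis (0 for a normal distribution), which is the convention under which negative values are possible. Statistical packages may apply small-sample corrections, so their output can differ slightly. The sample list is hypothetical and is not taken from Q9_a.csv or Q9_b.csv.

```python
def central_moment(xs, k):
    """k-th central moment of the data (dividing by n)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** k for x in xs) / n

def skewness(xs):
    """Moment coefficient of skewness: m3 / m2^(3/2)."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    """Excess kurtosis: m4 / m2^2 - 3 (zero for a normal distribution)."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3

# Hypothetical sample with a long right tail, for illustration only.
sample = [2, 3, 3, 4, 4, 4, 5, 5, 9]
print(skewness(sample), excess_kurtosis(sample))
```

A positive skewness indicates a longer right tail; a positive excess kurtosis indicates a leptokurtic shape and a negative one a platykurtic shape.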

Q10) Draw inferences about the following boxplot & histogram

- The histogram has a long tail to the right. Therefore the data is positively skewed,
i.e., mean > median.

- The outliers lie on the right, i.e., on the higher side.

- In the box plot the median is closer to Q1, which is consistent with mean > median. Therefore the data
is positively skewed.

Q11) Suppose we want to estimate the average weight of an adult male in
Mexico. We draw a random sample of 2,000 men from a population of
3,000,000 men and weigh them. We find that the average person in our
sample weighs 200 pounds, and the standard deviation of the sample is 30
pounds. Calculate the 94%, 98%, and 96% confidence intervals.
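One way to approach this is the interval mean ± z * (s / sqrt(n)). The Python sketch below assumes a normal (z) critical value is acceptable because the sample is large (n = 2000); with a t critical value on 1999 degrees of freedom the intervals would be almost identical. The function name `confidence_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, sample_sd = 2000, 200.0, 30.0
standard_error = sample_sd / sqrt(n)

def confidence_interval(level):
    """Two-sided z interval for the mean at the given confidence level."""
    # Critical value leaves (1 - level) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

intervals = {level: confidence_interval(level) for level in (0.94, 0.96, 0.98)}
for level, (low, high) in intervals.items():
    print(f"{level:.0%}: ({low:.2f}, {high:.2f})")
```

Under these assumptions the intervals come out to roughly (198.74, 201.26) at 94%, (198.62, 201.38) at 96%, and (198.44, 201.56) at 98%.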
Q12) Below are the scores obtained by a student in tests

34,36,36,38,38,39,39,40,40,41,41,41,41,42,42,45,49,56

1) Find mean, median, variance, standard deviation.
2) What can we say about the student marks?

Soln:
1) Mean = sum of all observations of the data / count of the data
= 738/18
Mean = 41

Median: the count of the data is 18, which is even, so the median is the average of
the two middle observations: the 9th (18/2 = 9) and the 10th (18/2 + 1 = 10).
(9th + 10th obs)/2 = (40 + 41)/2 = 40.5
Median = 40.5

Variance
marks (x)   x-avg   (x-avg)^2
34          -7      49
36          -5      25
36          -5      25
38          -3      9
38          -3      9
39          -2      4
39          -2      4
40          -1      1
40          -1      1
41          0       0
41          0       0
41          0       0
41          0       0
42          1       1
42          1       1
45          4       16
49          8       64
56          15      225
Total               434

Sum (x-avg)^2 = 434
Count - 1 = 18 - 1 = 17
Variance = sum of squared differences (Sum (x-avg)^2) / (count - 1) = 434/17 = 25.52941
Standard deviation = sqrt(variance) = 5.053

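2) The standard deviation (5.05) is small relative to the mean (41), so most marks lie
close to the mean: 15 of the 18 scores fall between 36 and 45. The mean (41) is slightly
above the median (40.5) because of the two high scores, 49 and 56, which indicates a
slight positive skew.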
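The table-based calculation can be mirrored step by step in Python. This is a minimal sketch that follows the same route as the solution (deviations from the mean, squared, summed, divided by n - 1); the variable names are illustrative.

```python
from math import sqrt

marks = [34, 36, 36, 38, 38, 39, 39, 40, 40, 41, 41, 41, 41, 42, 42, 45, 49, 56]
n = len(marks)

mean = sum(marks) / n

# Even count: the median is the average of the two middle values.
ordered = sorted(marks)
median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

# Squared deviations from the mean, as in the (x-avg)^2 column.
squared_devs = [(x - mean) ** 2 for x in marks]
variance = sum(squared_devs) / (n - 1)  # sample variance
std_dev = sqrt(variance)

print(mean, median, sum(squared_devs), round(variance, 5), round(std_dev, 3))
```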

Q13) What is the nature of skewness when mean, median of data are equal?
- If the mean and median of the data are equal, the skewness is approximately zero,
so the distribution is symmetric. A normal distribution is one example of such a distribution.
Q14) What is the nature of skewness when mean > median?
- Positively skewed distribution
Q15) What is the nature of skewness when median > mean?
- Negatively skewed distribution
Q16) What does a positive kurtosis value indicate for the data?
- Positive kurtosis means the data is leptokurtic: the curve is more peaked,
with heavier tails, than a normal curve.
Q17) What does a negative kurtosis value indicate for the data?
- Negative kurtosis means the data is platykurtic: the curve is flatter,
with lighter tails, than a normal curve.

Q18) Answer the below questions using the below boxplot visualization.

What can we say about the distribution of the data?
- The data is not normally distributed
What is the nature of skewness of the data?
- Negatively skewed

What will be the IQR of the data (approximately)?
Q1 is approximately 10
Q3 is approximately 18
IQR = Q3 - Q1 = 18 - 10 = 8

Q19) Comment on the below Boxplot visualizations?

Draw an Inference from the distribution of data for Boxplot 1 with respect to
Boxplot 2.
1- The medians of both box plots lie in the same range (250-275).
2- The data in both plots appears roughly symmetric (not skewed).
3- Neither plot has outliers.
Q20) Calculate probability from the given dataset for the below cases

Data set: Cars.csv

Calculate the probability of MPG of Cars for the below cases.
MPG <- Cars$MPG
a. P(MPG > 38)
b. P(MPG < 40)
c. P(20 < MPG < 50)
Q21) Check whether the data follows normal distribution
a) Check whether the MPG of Cars follows Normal Distribution
Dataset: Cars.csv
- MPG of cars follows normal distribution

b) Check whether the Adipose Tissue (AT) and Waist Circumference (Waist)
from wc-at data set follow Normal Distribution
Dataset: wc-at.csv
- Adipose tissue does not follow normal distribution
- Waist circumference does not follow normal distribution

Q22) Calculate the Z scores of 90% confidence interval, 94% confidence
interval, 60% confidence interval
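A two-sided confidence level c corresponds to the standard normal quantile at 1 - (1 - c)/2. The Python sketch below uses `statistics.NormalDist` from the standard library; `z_critical` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_critical(level):
    """Two-sided critical z value for the given confidence level."""
    # Half of the excluded probability goes in each tail.
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

z_scores = {level: z_critical(level) for level in (0.90, 0.94, 0.60)}
for level, z in z_scores.items():
    print(f"{level:.0%}: z = {z:.4f}")
```

This gives z of about 1.645 for 90%, 1.881 for 94%, and 0.842 for 60%.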
Q 23) Calculate the t scores of 95% confidence interval, 96% confidence
interval, 99% confidence interval for sample size of 25

Q24) A Government company claims that an average light bulb lasts 270
days. A researcher randomly selects 18 bulbs for testing. The sampled bulbs
last an average of 260 days, with a standard deviation of 90 days. If the
company's claim were true, what is the probability that 18 randomly selected
bulbs would have an average life of no more than 260 days?

Hint:
rcode → pt(tscore, df)

df → degrees of freedom
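The hint points at R's `pt`. As an illustration in Python without third-party libraries, the sketch below computes the t score and then approximates the Student t CDF by numerically integrating its density. The integration routine `t_cdf` is an assumption of this sketch (a simple midpoint rule), not a library function; in practice a statistics library's t CDF would be used instead.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=10_000):
    """P(T <= t), by midpoint-rule integration from 0 to |t| plus symmetry."""
    a = abs(t)
    h = a / steps
    area = sum(t_pdf((i + 0.5) * h, df) for i in range(steps)) * h
    return 0.5 + area if t >= 0 else 0.5 - area

claimed_mean, sample_mean, sample_sd, n = 270, 260, 90, 18

# t score = (sample mean - claimed mean) / (s / sqrt(n)), with df = n - 1.
t_score = (sample_mean - claimed_mean) / (sample_sd / sqrt(n))
probability = t_cdf(t_score, df=n - 1)
print(round(t_score, 4), round(probability, 4))
```

With t of about -0.471 and 17 degrees of freedom, the probability of a sample mean of 260 days or less comes out to roughly 0.32.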
