
DIGITAL ASSIGNMENT-1

NAME:MICAH JOSEPH
REGISTRATION NO. : 19BIT0404

STATISTICS LAB -2

Problem 1: The following table shows the scores (X) of 10 students on a
Zoology test and their scores (Y) on a Botany test. The maximum score in each
test was 50. Obtain the least squares equation of the line of regression of X on Y.
If it is known that the score of a student in Botany is 28, estimate his/her
score in Zoology.

X 34 37 36 32 32 36 35 34 29 35
Y 37 37 34 34 33 40 39 37 36 35

Sol)
R code
> x=c(34,37,36,32,32,36,35,34,29,35)
> y=c(37,37,34,34,33,40,39,37,36,35)
> fit=lm(x~y)
> fit

Call:
lm(formula = x ~ y)

Coefficients:
(Intercept)            y
    18.9167       0.4167
The equation of the line of regression of X on Y is X = 18.9167 + 0.4167Y. Substituting
Y = 28 gives X = 18.9167 + 0.4167(28), so the estimated score of the student in Zoology
is 30.58333.
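The same estimate can be obtained from the fitted model instead of substituting by hand. This is a minimal sketch, assuming the x and y vectors from the transcript; the names new_student and zoology_hat are illustrative and do not appear in the original.

```r
# Zoology (x) and Botany (y) scores from the problem statement
x <- c(34, 37, 36, 32, 32, 36, 35, 34, 29, 35)
y <- c(37, 37, 34, 34, 33, 40, 39, 37, 36, 35)

# Regression of X on Y: x is the response, y is the predictor
fit <- lm(x ~ y)

# new_student (hypothetical name) holds the Botany score to predict from
new_student <- data.frame(y = 28)
zoology_hat <- unname(predict(fit, newdata = new_student))
print(zoology_hat)  # about 30.58, in line with the value reported above
```

The column name in newdata has to match the predictor name used in the formula (here y), otherwise predict() cannot locate the new value.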
Problem 2: Twelve recruits were subjected to a selection test to ascertain their
suitability for a certain course of training. At the end of training they were
given a proficiency test. The marks scored by the recruits are recorded
below. Calculate the rank correlation coefficient.

A)
CODE :-
> selection =c(44,49,52,54,47,76,65,60,63,58,50,67)
> proficiency =c(48,55,45,60,43,80,58,50,77,46,47,65)
> cor.test(selection,proficiency,method ="spearman")

OUTPUT :-
Spearman's rank correlation rho

data: selection and proficiency
S = 80, p-value = 0.01102
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
0.7202797
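As a cross-check, the statistic S and rho reported by cor.test() can be reproduced from the rank differences with rho = 1 - 6*sum(d^2) / (n*(n^2 - 1)). This is a sketch that assumes no tied marks, which holds for this data; the variable names d, n and S are illustrative.

```r
selection   <- c(44, 49, 52, 54, 47, 76, 65, 60, 63, 58, 50, 67)
proficiency <- c(48, 55, 45, 60, 43, 80, 58, 50, 77, 46, 47, 65)

# Difference in ranks for each recruit
d <- rank(selection) - rank(proficiency)
n <- length(d)

# S is the sum of squared rank differences (the "S" in the cor.test output)
S <- sum(d^2)

# Spearman's formula, valid when there are no ties
rho <- 1 - 6 * S / (n * (n^2 - 1))
print(c(S = S, rho = rho))
```

When ties are present the shortcut formula is no longer exact, and cor(x, y, method = "spearman") is the safer route.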
Problem 3: The sale of a product in lakhs of rupees (Y) is expected to be influenced by two
variables, namely the advertising expenditure X1 (in '000 Rs) and the number of
salespersons X2 in a region. Sample data on 8 regions of a state gave the
following results:

Area Y X1 X2
1 110 30 11
2 80 40 10
3 70 20 7
4 120 50 15
5 150 60 19
6 90 40 12

> #19BIT0404
> #MICAH JOSEPH
> date()
[1] "Thu Aug 20 11:56:57 2020"
> Y=c(110,80,70,120,150,90)
> X1=c(30,40,20,50,60,40)
> X2=c(11,10,7,15,19,12)
> input_data=data.frame(Y,X1,X2)
> input_data
Y X1 X2
1 110 30 11
2 80 40 10
3 70 20 7
4 120 50 15
5 150 60 19
6 90 40 12
> RegModel <- lm(Y~X1+X2, data=input_data)
> RegModel

Call:
lm(formula = Y ~ X1 + X2, data = input_data)

Coefficients:
(Intercept) X1 X2
21.716 -1.664 12.015

> summary(RegModel)

Call:
lm(formula = Y ~ X1 + X2, data = input_data)

Residuals:
1 2 3 4 5 6
6.0448 4.7015 -2.5373 1.2687 -0.1493 -9.3284

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 21.7164 9.8944 2.195 0.1157
X1 -1.6642 0.7078 -2.351 0.1002
X2 12.0149 2.3950 5.017 0.0153 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7.158 on 3 degrees of freedom
Multiple R-squared: 0.9645, Adjusted R-squared: 0.9409
F-statistic: 40.78 on 2 and 3 DF, p-value: 0.006682
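The R-squared figures in the summary can be traced back to the residuals. The following is a rough sketch, assuming the six observations used in the transcript; sse, sst, n and p are illustrative names, with p = 2 being the number of predictors.

```r
Y  <- c(110, 80, 70, 120, 150, 90)
X1 <- c(30, 40, 20, 50, 60, 40)
X2 <- c(11, 10, 7, 15, 19, 12)
RegModel <- lm(Y ~ X1 + X2)

# Residual and total sums of squares
sse <- sum(residuals(RegModel)^2)
sst <- sum((Y - mean(Y))^2)

# Multiple R-squared: share of the variation in Y explained by the model
r_squared <- 1 - sse / sst

# Adjusted R-squared penalises for the number of predictors p
n <- length(Y)
p <- 2
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

print(round(c(r_squared, adj_r_squared), 4))
```

With only 3 residual degrees of freedom, the gap between the two figures is a reminder that the fit rests on very few observations.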

Calculate the multiple correlation coefficients R1.23, R2.31, R3.12 and
the partial correlation coefficients r12.3, r13.2, r23.1, taking r12 = 0.98,
r13 = 0.44 and r23 = 0.54.
> #19BIT0404
> #MICAH JOSEPH
> date()
[1] "Thu Aug 20 17:23:41 2020"
> r12=0.98
> r13=0.44
> r23=0.54
> R1_23=sqrt((r12^2+r13^2-2*r12*r23*r13)/(1-r23^2))
> R1_23
[1] 0.9857139
> R2_13=sqrt((r12^2+r23^2-2*r12*r23*r13)/(1-r13^2))
> R2_13
[1] 0.9874611
> R3_12=sqrt((r13^2+r23^2-2*r12*r23*r13)/(1-r12^2))
> R3_12
[1] 0.7018014
> r12.3<-(r12-r13*r23)/(sqrt(1-r13^2)*sqrt(1-r23^2))
> r12.3
[1] 0.9822531
> r13.2<-(r13-r12*r23)/(sqrt(1-r12^2)*sqrt(1-r23^2))
> r13.2
[1] -0.5325716
> r23.1<-(r23-r12*r13)/(sqrt(1-r12^2)*sqrt(1-r13^2))
> r23.1
[1] 0.608844
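The six expressions typed in the transcript follow two patterns, so they can be wrapped in two small functions to avoid retyping errors. This is a sketch only: multiple_R and partial_r are hypothetical helper names, not base R functions, and the argument order is a convention chosen here.

```r
# Multiple correlation R_a.bc from the three pairwise correlations
multiple_R <- function(r_ab, r_ac, r_bc) {
  sqrt((r_ab^2 + r_ac^2 - 2 * r_ab * r_ac * r_bc) / (1 - r_bc^2))
}

# Partial correlation r_ab.c (correlation of a and b with c held fixed)
partial_r <- function(r_ab, r_ac, r_bc) {
  (r_ab - r_ac * r_bc) / (sqrt(1 - r_ac^2) * sqrt(1 - r_bc^2))
}

r12 <- 0.98; r13 <- 0.44; r23 <- 0.54

R1_23 <- multiple_R(r12, r13, r23)
R2_13 <- multiple_R(r12, r23, r13)
R3_12 <- multiple_R(r13, r23, r12)

r12_3 <- partial_r(r12, r13, r23)
r13_2 <- partial_r(r13, r12, r23)
r23_1 <- partial_r(r23, r12, r13)

print(c(R1_23, R2_13, R3_12, r12_3, r13_2, r23_1))
```

Underscores are used in the result names because a dot in a name such as r12.3 is legal in R but easy to misread as an operator.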
Challenging tasks:
2) In a study of the relationship between level of education and income, the
following data were obtained. Find the relationship between them and
comment. Also illustrate the relationship using a boxplot.

> #19BIT0404
> #MICAH JOSEPH
> date()
[1] "Thu Aug 20 12:33:08 2020"
> x=c(5,6,1.5,3.5,3.5,7,1.5)
> y=c(3,5.5,7,5.5,4,2,1)
> res<-cor.test(x,y,method="pearson")
> res

Pearson's product-moment correlation

data: x and y
t = -0.39585, df = 5, p-value = 0.7085
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.8197648 0.6661915
sample estimates:
cor
-0.1743193
> boxplot(x,y)
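To support the comment the task asks for, the relevant numbers can be pulled out of the cor.test() result rather than read off the printout. This is a minimal sketch; the 5% significance level and the names r_hat, p_val and significant are assumptions made for illustration.

```r
x <- c(5, 6, 1.5, 3.5, 3.5, 7, 1.5)
y <- c(3, 5.5, 7, 5.5, 4, 2, 1)

res <- cor.test(x, y, method = "pearson")

# Components of the returned "htest" object
r_hat <- unname(res$estimate)  # sample correlation coefficient
p_val <- res$p.value           # two-sided p-value for H0: correlation = 0
ci    <- res$conf.int          # 95 percent confidence interval

# Assumed decision rule: reject H0 only if p < 0.05
significant <- p_val < 0.05
print(list(r = r_hat, p = p_val, significant = significant))
```

Here the correlation is weakly negative (about -0.17), the p-value is about 0.71 and the confidence interval contains 0, so the data give no evidence of a linear relationship between the two variables.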

Q2)

RCODE
FORMULA: Y = b0 + b1X1 + b2X2 + ... + bnXn
R Code:
# Response variable (30 observations)
Y <- c(158,152,159,157,157,153,156,158,159,158,154,158,161,159,157,
       156,155,158,154,157,159,160,156,155,150,158,159,152,146,155)
# First predictor
X1 <- c(51,52,48,50.5,57,60,44,51.5,57,56,54,58,59,57.5,49.5,
        51,54,56,59.5,52,45,53,49,50,51,41,48,45,38,45)
# Second predictor
X2 <- c(20.43,22.51,18.99,20.49,23.12,25.63,18.08,20.63,22.55,22.43,
        22.77,23.23,22.76,22.4,20.08,20.96,22.48,22.43,25.09,21.1,
        17.8,20.7,20.13,20.81,22.67,16.42,18.99,19.48,17.83,18.73)
# Multiple linear regression of Y on X1 and X2
model <- lm(Y ~ X1 + X2)

SCREENSHOT
OBSERVATION COPY
