BRM Lab

BUSINESS RESEARCH METHODOLOGY LAB
(Using MS Excel and R Studio)
PRACTICAL FILE
Submitted for partial fulfillment for the award of the Degree
of
BACHELOR OF BUSINESS ADMINISTRATION

{BBA (G) 2022 – 2025}
Under the guidance of
Dr. AANCHAL AGGARWAL

Submitted by
“YASH AGGARWAL”
“08329801722”
VIVEKANANDA SCHOOL OF BUSINESS STUDIES

VIVEKANANDA INSTITUTE OF PROFESSIONAL STUDIES-TC
(Affiliated to Guru Gobind Singh Indraprastha University)

INDEX
TOPIC
 Descriptive statistics
 Histogram frequency distribution
 Correlation (Positive, Negative, zero)
HYPOTHESIS TESTING
 One sample t test using dummy (one-tail)
 One sample t test using dummy (two-tail)
 Two sample t test (one-tail)
 Two sample - t test (two tail)
 Paired Sample t test (one-tail)
 Paired Sample t test (two-tail)
 Two sample z test
 F test
 ANOVA – Single Factor
 ANOVA – Two Factor without replication
 ANOVA – Two Factor with replication
 Chi-square test
 Regression
HYPOTHESIS TESTING in R Studio
 How to install R Studio
 Introduction to R studio
 Import of Data Sheet in R studio
 Descriptive statistics
 Correlation
 Hypothesis Testing: One sample T test (one tail)
 Hypothesis Testing: Two sample T test (alpha=10%)
 Hypothesis Testing: Paired Sample T test
 Hypothesis Testing: F test
 Hypothesis Testing: One-way ANOVA

Descriptive Analysis
Step 1: Go to Data –> Data Analysis –> Descriptive Statistics
Step 2: Enter the input range, tick "Labels in first row" if the selection includes a heading,
select an output range and tick "Summary statistics"

Step 3: Click OK.

Histogram Analysis
Step 1: Go to the Data tab –> Data Analysis –> Histogram
Step 2: Enter the input range and bin range, then tick Labels, Pareto (sorted histogram),
Cumulative Percentage and Chart Output

Step 3: Enter the output range and click OK.

Correlation:
The correlation coefficient r (a value between -1 and +1) tells you how strongly two variables
are related to each other, and in which direction.
a. Positive Correlation –
What is the correlation between advertisement of a product in a month and its sales in crores?
Advertisement in month    Sales in crores
32                        5
54                        10
67                        15
65                        20
98                        24
112                       34
101                       25
34                        34
Step 1:

Step2:
Step3:
Result:
                          Advertisement in month    Sales in crores
Advertisement in month    1
Sales in crores           0.485149134               1

Inference:
Here r = +0.485, therefore there is a moderate positive correlation between advertisements and sales.
b. Negative Correlation –
What is the correlation between no of cigarettes in a week and life expectancy?
Cigarettes    Life expectancy
5             80
23            78
25            60
48            53
17            85
8             84
4             73
26            79
11            81
19            75
14            68
35            72
29            58
4             92
23            65
Step 1:
Step 2:
Step 3:
Result:
                   Cigarettes      Life expectancy
Cigarettes         1
Life expectancy    -0.71343017     1
Inference:
Here r = -0.71, therefore there is a negative correlation between number of cigarettes in a
week and life expectancy.
c. No/Zero Correlation –
What is the correlation between shoe size and IQ level?
Shoe size    IQ level
1            4
2            5
3            4
4            5
5            4
6            5
7            4
Step 1:
Step 2:
Step 3:
Result:
             Shoe size    IQ level
Shoe size    1
IQ level     0            1
Inference:
Here r = 0, therefore there is no (zero) correlation between shoe size and IQ level.
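The three coefficients above can be cross-checked outside Excel. The following is a minimal sketch in R, assuming the three tables are typed in by hand as vectors (the vector names are illustrative and are not part of the original worksheets). cor() returns the Pearson coefficient, which is what Excel's Correlation tool reports.

```r
# Data typed in from the three correlation tables (vector names are illustrative)
ads   <- c(32, 54, 67, 65, 98, 112, 101, 34)
sales <- c(5, 10, 15, 20, 24, 34, 25, 34)

cigarettes <- c(5, 23, 25, 48, 17, 8, 4, 26, 11, 19, 14, 35, 29, 4, 23)
life_exp   <- c(80, 78, 60, 53, 85, 84, 73, 79, 81, 75, 68, 72, 58, 92, 65)

shoe_size <- 1:7
iq_level  <- c(4, 5, 4, 5, 4, 5, 4)

# Pearson correlation coefficient for each pair of variables
r_pos  <- cor(ads, sales)
r_neg  <- cor(cigarettes, life_exp)
r_zero <- cor(shoe_size, iq_level)

print(round(c(positive = r_pos, negative = r_neg, zero = r_zero), 4))
```

The sign of each value gives the direction of the relationship and its distance from 0 gives the strength.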
Hypothesis Testing
(i) T-Test
One Sample t-test using dummy (one tailed):
Problem: Suppose that we want to hypothesize that the mean number of TV hours watched
per week is greater than 28.5 at α=0.05
Hours Dummy
25.7 0
38.5 0
29.3 0
25.1
30.6
34.6
30
39
33.7
31.6
25.9
34.4
26.9
23
31.1
29.3
34.5
31.2
33.2
30.2
36.5
37.5
27.6
24.6
23.9
27
29.5
30
29.6
HYPOTHESIS TESTING:
Null Hypothesis: The mean number of TV hours watched per week is not greater than 28.5
Alternate Hypothesis: The mean number of TV hours watched per week is greater than 28.5
H0: µ ≤ 28.5
H1: µ > 28.5
Step 1:
Step 2:
Step 3:
Output:
t-Test: Two-Sample Assuming Equal Variances
Hours Dummy
Mean 30.48275862 0
Variance 19.13362069 0
Observations 29 3
Pooled Variance 17.85804598
Hypothesized Mean Difference 28.5
df 30
t Stat 0.773637505
P(T<=t) one-tail 0.222599519
t Critical one-tail 1.697260887
P(T<=t) two-tail 0.445199038
t Critical two-tail 2.042272456
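DECISION RULE:
If t-stat is greater than t-critical, reject the null hypothesis. If p(t) is less than α, reject the
null hypothesis.
INFERENCE:
Since t stat (0.77) is less than t Critical (1.70), do not reject the null hypothesis.
Since P (0.22) is greater than α (0.05), do not reject the null hypothesis.
CONCLUSION:
On this output, the mean number of TV hours watched per week is not shown to be greater than
28.5. Note that the equal-variances tool pools the dummy column (3 observations, zero variance)
into the standard error, which lowers t stat; the direct one-sample test on the same data in the
R Studio section (t = 2.441, df = 28, p-value = 0.01061) rejects the null hypothesis.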
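The gap between the dummy-column output and a direct one-sample test can be reproduced in a few lines. This is a sketch in R, assuming the Hours column is typed in as a vector named hours and the dummy column as three zeros (both names are illustrative); it runs the pooled two-sample test that Excel produced and then the plain one-sample test.

```r
# Hours column from the table above (29 observations); names are illustrative
hours <- c(25.7, 38.5, 29.3, 25.1, 30.6, 34.6, 30, 39, 33.7, 31.6,
           25.9, 34.4, 26.9, 23, 31.1, 29.3, 34.5, 31.2, 33.2, 30.2,
           36.5, 37.5, 27.6, 24.6, 23.9, 27, 29.5, 30, 29.6)
dummy <- c(0, 0, 0)

# Pooled (equal variances) two-sample test against a hypothesized difference of 28.5,
# mirroring the Excel dummy-column set-up
pooled <- t.test(hours, dummy, mu = 28.5, alternative = "greater", var.equal = TRUE)

# Direct one-sample test of H1: mean > 28.5
direct <- t.test(hours, mu = 28.5, alternative = "greater")

cat("pooled dummy test: t =", round(pooled$statistic, 4), " p =", round(pooled$p.value, 4), "\n")
cat("one-sample test:   t =", round(direct$statistic, 4), " p =", round(direct$p.value, 4), "\n")
```

The first line of output should match the Excel t Stat, and the second should match the R Studio result for the same data.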
(ii) T-test
One sample t-test using dummy (two tailed)
Problem: To test whether there is a significant difference between the mean age of the
population and the hypothesized mean age of 40.
Age Dummy
42 0
76 0
56 0
67
65
65
89
45
45
65
78
55
44
65
76
89
54
56
56
76
45
HYPOTHESIS TESTING:
Null Hypothesis: There is no significant difference between the population mean age and the
hypothesized mean age (40)
Alternate Hypothesis: There is a significant difference between the population mean age and
the hypothesized mean age (40)
H0: µ = 40
H1: µ ≠ 40
Step1:
Step2:
Step 3:
Output:
Age Dummy
Mean 62.33333333 0
Variance 208.6333333 0
Observations 21 3
Hypothesized Mean Difference 40
df 22
t Stat 2.627378828
P(T<=t) one-tail 0.007690983
P(T<=t) two-tail 0.015381965
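DECISION RULE:
If |t-stat| is greater than t-critical, reject the null hypothesis. If p(t) is less than α, reject the
null hypothesis.
INFERENCE:
Since t stat (2.63) is greater than t Critical (2.07), reject the null hypothesis.
Since the two-tail P (0.015) is less than α (0.05), reject the null hypothesis.
CONCLUSION:
There is a significant difference between the population mean age and the hypothesized mean
of 40 at α = 0.05.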
(iii) T-test
Two sample t-test (one tail)
Problem: To analyse that the time spent by full time students in studying statistics is greater
than the time spent by part time students.
Full Part
time time
3.2 3.1
1.5 3.4
6.5 4.6
0.2 2.8
3.7 2.3
3.3 1.5
1.7 3.8
3.6 9.5
3.8 4.3
5.3 2.7
6.9 1.6
3.6 1.6
1.7 3.2
1.2 4.2
7.2 3.9
3.9 1.2
1.9 0
5.3 0
HYPOTHESIS TESTING:
Null Hypothesis: The time spent by full time students studying statistics is not more than the
time spent by part time students
Alternate Hypothesis: The time spent by full time students studying statistics is more than
the time spent by part time students
H0: µf ≤ µp; µf - µp ≤ 0
H1: µf > µp; µf - µp > 0
Step 1:
Step 2:
Step 3:
Output:
Full time Part time

Mean 3.583333 2.983333333
Variance 4.133235 4.566176471
Observations 18 18
df 34
t Stat 0.863063
P(T<=t) one-tail 0.197075
P(T<=t) two-tail 0.39415
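DECISION RULE:
If t-stat is greater than t-critical, reject the null hypothesis. If p(t) is less than α, reject the
null hypothesis.
INFERENCE:
Since the one-tail P (0.197) is greater than α (taking α = 0.05, as in the other tests), do not
reject the null hypothesis.
CONCLUSION:
The time spent by full time students studying statistics is not shown to be more than the time
spent by part time students.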
(iv) T-test
Two sample t-test (two tail)
Problem: Two types of drugs were used on 7 patients for reducing their weight. Drug A was
imported and drug B was indigenous. The decrease in the weight after using drugs for six
months was as follows:
Is there a significant difference in the efficiency of the two drugs?

Drug A Drug B
10 8
12 9
13 12
11 14
14 15
12 10
13 9
HYPOTHESIS TESTING:
Null Hypothesis: There is no significant difference in the efficiency of the two drugs
Alternate Hypothesis: There is a significant difference in the efficiency of two drugs
H0: µa = µb; µa - µb = 0
H1: µa ≠ µb; µa - µb ≠ 0
Step 1:
Step 2:
Step 3:
Output:
Drug A Drug B
Mean 12.14285714 11
Variance 1.80952381 7.333333333
Observations 7 7
Pooled Variance 4.571428571
Hypothesized Mean Difference 0
df 12
t Stat 1
P(T<=t) one-tail 0.168524529
t Critical one-tail 1.782287556
P(T<=t) two-tail 0.337049058
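DECISION RULE:
If |t-stat| is greater than t-critical, reject the null hypothesis. If p(t) is less than α, reject the
null hypothesis.
INFERENCE:
Since t stat (1) is less than the two-tail t Critical (2.18), do not reject the null hypothesis.
Since the two-tail P (0.34) is greater than α (0.05), do not reject the null hypothesis.
CONCLUSION:
There is no significant difference in the efficiency of the two drugs.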
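The drug comparison can be checked with R's built-in t.test(). This is a minimal sketch, assuming the two columns are typed in as vectors named drug_a and drug_b (illustrative names) and that α = 0.05 as in the rest of this file; var.equal = TRUE requests the pooled-variance test that Excel's "Two-Sample Assuming Equal Variances" tool runs.

```r
# Weight decrease for the 7 patients on each drug (vector names are illustrative)
drug_a <- c(10, 12, 13, 11, 14, 12, 13)
drug_b <- c(8, 9, 12, 14, 15, 10, 9)

# Pooled-variance two-sample t-test; two-sided is the default alternative
res <- t.test(drug_a, drug_b, var.equal = TRUE)

# Two-tail critical value at alpha = 0.05 (alpha is an assumption carried over
# from the other tests in this file)
t_crit <- qt(1 - 0.05 / 2, df = res$parameter)

cat("t =", round(res$statistic, 4),
    " df =", res$parameter,
    " p (two-tail) =", round(res$p.value, 4),
    " t critical =", round(t_crit, 4), "\n")
```

For a one-tail version, such as the full time versus part time problem, the same call takes alternative = "greater".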
T-test of 2 samples
Problem: To determine which subject means differ from one another, we apply the t-test for
two samples assuming equal variances to each pair:
1. Economics and Science
2. Science and History
3. Economics and History
1. Economics and Science –

Null hypothesis: There is no difference between the means of the subjects
Alternate hypothesis: There is a difference between the means of the subjects
H0: µ1 = µ2, µ1 – µ2 = 0
H1: µ1 ≠ µ2, µ1 – µ2 ≠ 0
2. Science and History –

H0: µ2 = µ3, µ2 – µ3 = 0
H1: µ2 ≠ µ3, µ2 – µ3 ≠ 0
3. Economics and History –

H0: µ1 = µ3, µ1 – µ3 = 0
H1: µ1 ≠ µ3, µ1 – µ3 ≠ 0
Output:
(i) Economics & Science

DECISION RULE:
If |t-stat| is greater than t-critical, reject the null hypothesis.
If p(t) is less than α, reject the null hypothesis.
INFERENCE:
Since |t stat| (4.43) is greater than t Critical (2.144), reject the null hypothesis.
Since p(t) (0.0005) is less than α (0.05), reject the null hypothesis.
CONCLUSION:
There is a significant difference between the mean marks of the students in subjects –
economics and science
(ii) Science & History
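DECISION RULE:
If |t-stat| is greater than t-critical, reject the null hypothesis.
If p(t) is less than α, reject the null hypothesis.
INFERENCE:
Since t stat (4.95) is greater than the two-tail t Critical (2.144 for df = 14), reject the null
hypothesis.
Since p(t) (0.0002) is less than α (0.05), reject the null hypothesis.
CONCLUSION:
There is a significant difference between the mean marks of the students in subjects – science
and history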
(iii) Economics & History

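DECISION RULE:
If |t-stat| is greater than t-critical, reject the null hypothesis.
If p(t) is less than α, reject the null hypothesis.
INFERENCE:
Since t stat (1.62) is less than t Critical (2.11), do not reject the null hypothesis.
Since p(t) (0.12) is greater than α (0.05), do not reject the null hypothesis.
CONCLUSION:
There is no significant difference between the mean marks of the students in subjects –
economics and history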
(v) Paired sample t-test (one tail)
Problem: Is there sufficient evidence to suggest that the mean time to exhaustion is greater
after chocolate milk than after carbohydrate replacement drink? Use a significance level of
0.1. (Use µcm-µcd in hypothesis statements)
Cyclist Chocolate Milk Carbohydrate Replacement Drink
1 50.46 42.9
2 47.08 50.1
3 57.51 41.67
4 46.6 32.69
5 29.1 46.33
6 57.5 31.63
7 23.87 20.61
8 28.65 14.99
9 35.37 20.11
HYPOTHESIS TESTING:
Null Hypothesis: Mean time to exhaustion is not greater after chocolate milk than after
carbohydrate replacement drink
Alternate Hypothesis: Mean time to exhaustion is greater after chocolate milk than after
carbohydrate replacement drink
H0: µcm ≤ µcd or µcm - µcd ≤ 0
H1: µcm > µcd or µcm - µcd > 0
Step 1:
Step 2:
Step 3:
Output:
t-Test: Paired Two Sample for Means
Chocolate Milk Carbohydrate Replacement Drink

Mean 41.79333333 33.44777778
Variance 164.53125 160.9338194
Observations 9 9
Pearson Correlation 0.508406248
df 8
t Stat 1.979280834
P(T<=t) one-tail 0.0415706
P(T<=t) two-tail 0.083141199
DECISION RULE:
If t-stat is greater than t-critical, reject the null hypothesis. If p(t) is less than α, reject the
null hypothesis.
INFERENCE:
Since t stat (1.98) is greater than t Critical (1.40), reject the null hypothesis.
Since the one-tail P (0.04) is less than α (0.1), reject the null hypothesis.
CONCLUSION:
Mean time to exhaustion is greater after chocolate milk than after carbohydrate replacement
drink.
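The paired one-tail test can be reproduced with t.test(..., paired = TRUE). This is a sketch in R, assuming the two columns for the nine cyclists are typed in as vectors named choc_milk and carb_drink (illustrative names); the order of the arguments fixes the difference as µcm - µcd, as the problem asks.

```r
# Time to exhaustion for the 9 cyclists (vector names are illustrative)
choc_milk  <- c(50.46, 47.08, 57.51, 46.6, 29.1, 57.5, 23.87, 28.65, 35.37)
carb_drink <- c(42.9, 50.1, 41.67, 32.69, 46.33, 31.63, 20.61, 14.99, 20.11)

# Paired test of H1: mean(choc_milk - carb_drink) > 0
res <- t.test(choc_milk, carb_drink, paired = TRUE, alternative = "greater")

# One-tail critical value at alpha = 0.1, df = n - 1
t_crit <- qt(1 - 0.1, df = length(choc_milk) - 1)

cat("t =", round(res$statistic, 4),
    " p (one-tail) =", round(res$p.value, 4),
    " t critical =", round(t_crit, 4), "\n")
```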
(vi) Paired sample t-test (two tail)
Problem: Determine whether there is a significant difference between the time to finish the
race when the race is completed with local shoes and with branded shoes.
Athlete Local shoes Branded shoes
1 3.2 3.1
2 1.5 3.4
3 6.5 4.6
4 0.2 2.8
5 3.7 2.3
6 3.3 1.5
7 1.7 3.8
8 3.6 9.5
9 3.8 4.3
10 5.3 2.7
11 6.9 1.6
12 3.6 1.6
13 1.7 3.2
14 1.2 4.2
15 7.2 3.9
HYPOTHESIS TESTING:
Null Hypothesis: There is no significant difference between the time to finish the race with
local shoes and with branded shoes
Alternate Hypothesis: There is a significant difference between the time to finish the race
with local shoes and with branded shoes
H0: µL = µB, µL - µB = 0
H1: µL ≠ µB, µL - µB ≠ 0
Step 1:
Step 2:
Step 3:
Output:
t-Test: Paired Two Sample for Means
Local shoes Branded shoes

Mean 3.56 3.5
Variance 4.598285714 3.76
Observations 15 15
Pearson Correlation -0.022160001
df 14
t Stat 0.079506488
P(T<=t) one-tail 0.468877535
P(T<=t) two-tail 0.93775507
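DECISION RULE:
If |t-stat| is greater than t-critical, reject the null hypothesis. If p(t) is less than α, reject the
null hypothesis.
INFERENCE:
Since the two-tail P (0.94) is greater than α (0.05), do not reject the null hypothesis.
CONCLUSION:
There is no significant difference between the time to finish the race when the race is
completed with local shoes and with branded shoes.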
Z-Test
(i) Z-test
Problem: The net annual returns (the returns on investment after deducting all relevant fees)
in percentage are given. Can investors do better by buying mutual funds directly from banks
or other financial institutions than by purchasing mutual funds through brokers? Can we
conclude at the 5% significance level that directly-purchased mutual funds outperform
mutual funds bought through brokers?
Direct Broker
9.33 3.24
6.94 -6.76
16.17 12.8
16.97 11.1
5.94 2.73
12.61 -0.13
3.33 18.22
16.13 -0.8
11.2 -5.75
1.14 2.59
4.68 3.71
3.09 13.15
7.26 11.05
2.05 -3.12
13.07 8.94
0.59 2.74
13.57 4.07
0.35 5.6
2.69 -0.85
18.45 -0.28
4.23 16.4
10.28 6.39
7.1 -1.9
-3.09 9.49
5.6 6.7
5.27 0.19
8.09 12.39
15.05 6.54
13.21 10.92
1.72 -2.15
14.69 4.36
-2.97 -11.07
10.37 9.24
-0.63 -2.67
-0.15 8.97
0.27 1.87
4.59 -1.53
6.38 5.23
-0.24 6.87
10.32 -1.69
10.29 9.43
4.39 8.31
-2.06 -3.99
7.66 -4.44
10.83 8.63
14.48 7.06
4.8 1.57
13.12 -8.44
-6.54 -5.72
-1.06 6.95
HYPOTHESIS TESTING:
Null Hypothesis: Investors do not do better by buying mutual funds directly from banks or
other financial institutions than by purchasing mutual funds through brokers
Alternate Hypothesis: Investors do better by buying mutual funds directly from banks or
other financial institutions than by purchasing mutual funds through brokers
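H0: µD ≤ µB; µD - µB ≤ 0
H1: µD > µB; µD - µB > 0
(µD = mean return of directly purchased funds, µB = mean return of funds bought through brokers)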
Step 1:
Step 2:
Step 3:
Output:
z-Test: Two Sample for Means
Direct Broker
Mean 6.6312 3.7232
Known Variance 36.7384 42.4725
Observations 50 50
Hypothesized Mean Difference 0
z 2.310398694
P(Z<=z) one-tail 0.010433046
z Critical one-tail 1.644853627
P(Z<=z) two-tail 0.020866091
z Critical two-tail 1.959963985
DECISION RULE:
If z-stat is greater than z-critical, reject null hypothesis. If p(z) is less than α, reject Null
hypothesis
INFERENCE:
Since z stat (2.31) is greater than z Critical (1.64), reject null hypothesis.
Since P (0.01) is less than α (0.05), reject null hypothesis.
CONCLUSION:
Investors do better by buying mutual funds directly from banks or other financial institutions
than by purchasing mutual funds through brokers
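Base R has no built-in two-sample z-test, so the statistic can be computed by hand from the summary figures in the output above. This is a sketch, assuming the means, known variances and sample sizes are copied from that output (the variable names are illustrative); it applies z = (x̄D - x̄B - 0) / sqrt(σD²/nD + σB²/nB).

```r
# Summary figures copied from the z-Test output (variable names are illustrative)
mean_direct <- 6.6312;  var_direct <- 36.7384;  n_direct <- 50
mean_broker <- 3.7232;  var_broker <- 42.4725;  n_broker <- 50

# Standard error of the difference in means with known variances
se <- sqrt(var_direct / n_direct + var_broker / n_broker)

# z statistic for a hypothesized mean difference of 0
z <- (mean_direct - mean_broker - 0) / se

# Upper-tail p-value and critical value for H1: direct > broker at alpha = 0.05
p_one_tail <- pnorm(z, lower.tail = FALSE)
z_crit     <- qnorm(1 - 0.05)

cat("z =", round(z, 4), " p (one-tail) =", round(p_one_tail, 4),
    " z critical =", round(z_crit, 4), "\n")
```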
(ii) F-test
Problem: Determine whether variance of class1 is greater than the variance of Class2
Class 1 Class 2
65 76
76 54
65 67
76 65
56 76
45 66
HYPOTHESIS TESTING:
Null Hypothesis: Variance of class1 is not greater than variance of class 2
Alternate Hypothesis: Variance of class1 is greater than variance of class 2
H0: Var1 ≤ Var2
H1: Var1 > Var2
Step 1:
Step 2:
Step 3:
Output:
F-Test Two-Sample for Variances
Class1 Class2
Mean 63.83333333 67.33333333
Variance 142.9666667 67.06666667
Observations 6 6
df 5 5
F 2.131709742
P(F<=f) one-tail 0.212888468
F Critical one-tail 5.050329058
DECISION RULE:
If f-stat is greater than f-critical, reject null hypothesis.
If p(f) is less than α, reject Null hypothesis
INFERENCE:
Since f stat (2.13) is less than f Critical (5.05), do not reject the null hypothesis.
Since P (0.21) is more than α (0.05), do not reject the null hypothesis.
CONCLUSION:
Variance of class1 is not greater than variance of class 2
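The one-tail F test can be reproduced with var.test(). This is a sketch in R, assuming the two columns are typed in as vectors named class1 and class2 (illustrative names); alternative = "greater" matches H1: Var1 > Var2, whereas the default would be a two-sided test.

```r
# Marks for the two classes (vector names are illustrative)
class1 <- c(65, 76, 65, 76, 56, 45)
class2 <- c(76, 54, 67, 65, 76, 66)

# One-tail F test of H1: var(class1) > var(class2)
res <- var.test(class1, class2, alternative = "greater")

# Upper-tail critical value at alpha = 0.05 with (5, 5) degrees of freedom
f_crit <- qf(1 - 0.05, df1 = length(class1) - 1, df2 = length(class2) - 1)

cat("F =", round(res$statistic, 4),
    " p (one-tail) =", round(res$p.value, 4),
    " F critical =", round(f_crit, 4), "\n")
```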
(i) ANOVA Test
ANOVA-Single Factor
Problem: To test whether there is a significant difference between the mean marks of the
students in subjects - economics, science and history
Economics    Science    History
42           69         35
53           54         40
49           58         53
53           64         42
43           64         50
44           55         39
45           56         55
52                      39
54                      40
HYPOTHESIS TESTING:
Null Hypothesis: There is no significant difference between the mean marks of the students in
subjects - economics, science and history
Alternate Hypothesis: There is a significant difference between the mean marks of the
students in subjects - economics, science and history
H0: µe = µs = µh
H1: at least one of the means is different
Step 1:
Step 2:
Step 3:
Output:
Anova: Single Factor
SUMMARY
Groups Count Sum Average Variance
Economics 9 435 48.33333 23.5
Science 7 420 60 32.33333
History 9 393 43.66667 50.5
ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 1085.84 2 542.92 15.19623 7.16E-05 3.443357
Within Groups 786 22 35.72727
Total 1871.84 24
DECISION RULE:
If f-stat is greater than f-critical, reject null hypothesis.
If p(f) is less than α, reject Null hypothesis
INFERENCE:
Since f stat (15.19623) is greater than f Critical (3.443357), reject null hypothesis.
Since p (f)(7.16E-05) is less than α (0.05), reject null hypothesis.
CONCLUSION:
There is a significant difference between the mean marks of the students in subjects -
economics, science and history
(ii) ANOVA TEST
ANOVA- Two Factor without replication
Problem: To test whether or not marks of students differ with respect to student and subject
both.
student economics science history

a 42 69 35
b 53 54 40
c 49 58 53
d 53 64 42
e 43 64 50
HYPOTHESIS TESTING:
Row wise:
Null Hypothesis: There is no significant difference in marks of students.
Alternate Hypothesis: There is significant difference in marks of students.
Column Wise:
Null Hypothesis: There is no significant difference in marks for three subjects- Economics,
Science and History.
Alternate Hypothesis: There is significant difference in marks for three subjects-
Economics, Science and History.
Step 1:
Step 2:
Step 3:
Output:
Anova: Two-Factor Without Replication
SUMMARY Count Sum Average Variance
a 3 146 48.66667 322.3333
b 3 147 49 61
c 3 160 53.33333 20.33333
d 3 159 53 121
e 3 157 52.33333 114.3333
economics 5 240 48 28
science 5 309 61.8 34.2
history 5 220 44 54.5
ANOVA
Source of Variation SS df MS F P-value F crit
Rows 60.93333 4 15.23333 0.300263 0.869889 3.837853
Columns 872.1333 2 436.0667 8.595269 0.010172 4.45897
Error 405.8667 8 50.73333
Total 1338.933 14
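The two-factor table without replication corresponds to an additive two-way model with one observation per cell. This is a sketch in R, assuming the marks are typed into a long-format data frame named marks with columns student, subject and score (all illustrative names); aov(score ~ student + subject) then gives the Rows (student) and Columns (subject) lines of the ANOVA table.

```r
# One mark per student and subject, typed in column by column (names are illustrative)
marks <- data.frame(
  student = factor(rep(c("a", "b", "c", "d", "e"), times = 3)),
  subject = factor(rep(c("economics", "science", "history"), each = 5)),
  score   = c(42, 53, 49, 53, 43,   # economics
              69, 54, 58, 64, 64,   # science
              35, 40, 53, 42, 50)   # history
)

# Additive model: no interaction term, because there is only one value per cell
fit <- aov(score ~ student + subject, data = marks)
tab <- summary(fit)[[1]]   # rows: student, subject, Residuals
print(tab)
```

Read against the F crit values in the Excel output at α = 0.05, the student (row) F of about 0.30 is below 3.84 while the subject (column) F of about 8.60 is above 4.46, so only the difference between subjects is significant.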
(iii) ANOVA TEST

ANOVA- Two Factor with replication
Problem: A two-way ANOVA with replication is performed when you have two groups and the
individuals within each group are measured on more than one thing (e.g., taking two tests).
Here the marks of School A and School B are compared across three subjects.
            economics    science    history
SCHOOL A    42           69         35
            53           54         40
            49           58         53
            53           64         42
            43           64         50
SCHOOL B    44           55         39
            45           56         55
            52           0          39
            54           0          40
            0            0          0
Hypothesis Testing:
Row wise:
H0: There is no significant difference between School A and School B
H1: There is a significant difference between School A and School B
Column wise:
H0: There is no significant difference between economics, science and history
H1: There is a significant difference between economics, science and history
Interaction wise:
H0: There is no significant difference between School A and School B subject-wise (in
conjunction with subjects)
H1: There is a significant difference between School A and School B subject-wise (in
conjunction with subjects)
Step 1:
Step 2:
Step 3:
Output:
Anova: Two-Factor With Replication
SUMMARY economics science history Total

SCHOOL A
Count 5 5 5 15
Sum 240 309 220 769
Average 48 61.8 44 51.26667
Variance 28 34.2 54.5 95.6381
SCHOOL B
Count 5 5 5 15
Sum 195 111 173 479
Average 39 22.2 34.6 31.93333
Variance 494 924.2 420.3 579.4952
Total
Count 10 10 10
Sum 435 420 393
Average 43.5 42 39.3
Variance 254.5 861.5556 235.5667
ANOVA
Source of
Variation SS df MS F P-value F crit
Sample 2803.333 1 2803.333 8.6027 0.007272 4.259677
Columns 90.6 2 45.3 0.139014 0.870912 3.402826
Interaction 1540.467 2 770.2333 2.363646 0.115611 3.402826
Within 7820.8 24 325.8667
Total 12255.2 29
Decision Rule:
If f-stat is greater than f-critical, reject the null hypothesis.
If p(f) is less than α, reject the null hypothesis.
Inference:
Row Wise
Since f stat (8.6027) is greater than f-critical (4.259677), reject the null hypothesis.
Since p value (0.007) is less than α (0.05), we will reject the null hypothesis.
Column Wise
Since f stat (0.139) is less than f-critical (3.402826), do not reject the null hypothesis.
Since p value (0.871) is greater than α (0.05), we will not reject the null hypothesis.
Interaction Wise
Since f stat (2.364) is less than f-critical (3.402826), do not reject the null hypothesis.
Since p value (0.116) is greater than α (0.05), we will not reject the null hypothesis.
Conclusion:
Row Wise
There is enough evidence that marks of students differ significantly school wise.
Column Wise
There is not enough evidence of a difference between the marks of the three subjects,
i.e., Economics, Science and History.
Interaction Wise
There is no significant difference between the marks of School A and School B subject
wise (in conjunction with subjects).
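The with-replication table corresponds to a two-way model with an interaction term. This is a sketch in R, assuming the 30 worksheet cells are typed in exactly as shown, including the zeros that pad School B (the object names score, school and subject are illustrative); school * subject expands to both main effects plus their interaction.

```r
# 5 rows per school and subject, typed in as in the worksheet (zeros included)
score <- c(42, 53, 49, 53, 43,  44, 45, 52, 54, 0,   # economics: School A, then School B
           69, 54, 58, 64, 64,  55, 56, 0, 0, 0,     # science
           35, 40, 53, 42, 50,  39, 55, 39, 40, 0)   # history
school  <- factor(rep(rep(c("A", "B"), each = 5), times = 3))
subject <- factor(rep(c("economics", "science", "history"), each = 10))

# Main effects of school (Sample) and subject (Columns) plus their interaction
fit <- aov(score ~ school * subject)
tab <- summary(fit)[[1]]   # rows: school, subject, school:subject, Residuals
print(tab)
```

Because the design is balanced, the three F values should agree with the Sample, Columns and Interaction lines of the Excel output.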
CHI SQUARE TEST
Problem Statement- To analyse that there is a significant relationship between gender and
newspaper brand.
Null Hypothesis : There is no significant relationship between gender and newspaper brand
Alternate Hypothesis : There is a significant relationship between gender and newspaper
brand
Observed
Count of Newspaper (Column Labels)
Row Labels     Economic Times    Hindustan Times    The Indian Express    Times of India    Grand Total
female         13                16                 8                     6                 43
male           16                15                 11                    7                 49
Grand Total    29                31                 19                    13                92
Expected values
Expected values = row total * column total / Grand total
Row Labels     Economic Times    Hindustan Times    The Indian Express    Times of India
female         13.55435          14.48913           8.880435              6.076087
male           15.44565          16.51087           10.11957              6.923913
Chi square test
Each cell = (Observed - Expected)² / Expected
Row Labels     Economic Times    Hindustan Times    The Indian Express    Times of India
female         0.022672          0.157548           0.087289              0.000953
male           0.019896          0.138256           0.076601              0.000836
χ² = 0.50404971
Degrees of freedom
(rows - 1) * (columns - 1)
(2-1)*(4-1) = 3
p-value = 0.918
Decision rule:
If p-value is less than alpha, reject the null hypothesis.
Inference:
Here p-value (0.918) is greater than alpha (0.05), so do not reject the null hypothesis.
Conclusion:
There is no significant relationship between gender and newspaper brand.
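The expected counts, the χ² statistic and the p-value can all be obtained in one call. This is a sketch in R, assuming the observed counts are typed into a 2 x 4 matrix named observed (an illustrative name); chisq.test() derives the expected values as row total * column total / grand total.

```r
# Observed counts: rows are gender, columns are newspaper brand
observed <- matrix(c(13, 16,  8, 6,
                     16, 15, 11, 7),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(gender    = c("female", "male"),
                                   newspaper = c("Economic Times", "Hindustan Times",
                                                 "The Indian Express", "Times of India")))

# Pearson chi-square test of independence
# (correct = FALSE only matters for 2 x 2 tables and is set for clarity)
res <- chisq.test(observed, correct = FALSE)

print(round(res$expected, 4))   # expected counts under independence
cat("chi-square =", round(res$statistic, 4),
    " df =", res$parameter,
    " p-value =", round(res$p.value, 3), "\n")
```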
Regression Analysis
Problem: To check whether there is a significant relationship between umbrellas sold and
rainfall. Determine the regression equation for the same.
Umbrellas sold (Y) Rainfall (X)

5 80
23 78
25 60
48 53
17 85
8 84
4 73
26 79
11 81
19 75
14 68
35 72
29 58
4 92
23 65
HYPOTHESIS TESTING:
Null Hypothesis: There is no significant relationship between umbrellas sold and rainfall
Alternate Hypothesis: There is a significant relationship between umbrellas sold and rainfall
Y = Dependent Variable
X = Independent Variable
b = Slope
a = Intercept
Step 1:
Step 2:
Step 3:
Output:
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.713430174
R Square 0.508982614
Adjusted R Square 0.471212046
Standard Error 9.056631043
Observations 15
ANOVA
df SS MS F Significance F
Regression 1 1105.306644 1105.306644 13.47564 0.002822343
Residual 13 1066.293356 82.02256585
Total 14 2171.6
                Coefficients    Standard Error    t Stat        P-value       Lower 95%     Upper 95%     Lower 95.0%    Upper 95.0%
Intercept       78.978421       16.3974364        4.81651028    0.00033673    43.5539133    114.402929    43.5539133     114.402929
Rainfall (X)    -0.8102233      0.22071407        -3.6709183    0.00282234    -1.2870471    -0.3333996    -1.2870471     -0.3333996
Equation:
Y = bX + a
Y = -0.8102X + 78.978
[Figure: Rainfall (X) Line Fit Plot, showing Umbrellas sold (Y) and Predicted Umbrellas sold (Y) against Rainfall (X), with linear trendline f(x) = −0.810223313272095 x + 78.9784209692747]
Decision Rule:
If p is less than alpha, reject the null hypothesis.
Inference:
Here the p-value of the slope (0.0028, equal to Significance F) is less than alpha (0.05), so we
reject the null hypothesis.
Conclusion:
There is a significant relationship between umbrellas sold and rainfall
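The regression equation and its p-value can be reproduced with lm(). This is a sketch in R, assuming the two columns are typed in as vectors named umbrellas and rainfall (illustrative names); the formula umbrellas ~ rainfall fits Y = a + bX by least squares.

```r
# Data typed in from the regression table (vector names are illustrative)
umbrellas <- c(5, 23, 25, 48, 17, 8, 4, 26, 11, 19, 14, 35, 29, 4, 23)      # Y
rainfall  <- c(80, 78, 60, 53, 85, 84, 73, 79, 81, 75, 68, 72, 58, 92, 65)  # X

# Simple linear regression of Y on X
fit   <- lm(umbrellas ~ rainfall)
coefs <- summary(fit)$coefficients   # estimate, std. error, t value, p-value

a <- coefs["(Intercept)", "Estimate"]
b <- coefs["rainfall", "Estimate"]

cat("Y =", round(b, 4), "* X +", round(a, 3), "\n")
cat("slope p-value =", signif(coefs["rainfall", "Pr(>|t|)"], 4),
    " R square =", round(summary(fit)$r.squared, 4), "\n")
```

The slope is negative, so the fitted line falls as rainfall rises, as the line fit plot shows.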
R Studio
HYPOTHESIS TESTING in R Studio
 How to Install R Studio?
In order to install R Studio, we first need to install R. The steps to install R are as follows:
1. Go to CRAN, click Download R for Windows, click Base, and download the installer for the
latest R version.
2. Right-click the installer file and select Run as Administrator from the pop-up menu.
3. Select the language to be used during installation.
This doesn’t change the language used by R; all messages and Help files remain in English.
4. Follow the instructions of the installer.
You can safely use the default settings and just keep clicking Next until R starts installing.
After installing R, we can set up R Studio. The steps are as follows:
1. Install R. Leave all default settings in the installation options.

2. Open RStudio.
3. Go to the “Packages” tab and click on “Install Packages”. ...
4. Start typing “Rcmdr” until you see it appear in a list. ...
5. Wait while all the parts of the R Commander package are installed.
R and RStudio
R is a programming language used for statistical computing while RStudio uses the R
language to develop statistical programs. In R, you can write a program and run the code
independently of any other computer program. RStudio however, must be used alongside R in
order to properly function. Often referred to as an IDE, or integrated development
environment, RStudio allows users to develop and edit programs in R by supporting a large
number of statistical packages, higher quality graphics, and the ability to manage your
workspace.
R and RStudio are not separate versions of the same program, and cannot be substituted for
one another. R may be used without RStudio, but RStudio may not be used without R.
The Advantages of RStudio
1) RStudio is designed to make it easy to write scripts.
As soon as you create a new script, the windows within your RStudio session adjust
automatically so you can see both your script and the results in your console when you run
your syntax.
Even better is the ability to call up potential syntax options while you are writing just by
using the tab key.
For example, suppose I am trying to access a variable in a data set called “teachers”, but I
haven’t memorized the variable names:
2) RStudio makes it convenient to view and interact with the objects stored in your
environment.
In the basic R GUI, you can always list the objects you have stored in your environment. But
RStudio has a very useful “Environment” window available.
This shows all of the objects that you have stored, including data; scalars, vectors, and
matrices; model outputs; etc., along with a summary of the information that is stored in those
objects.
You can even click on your data sets directly to open them and view them as spreadsheets.
3) RStudio makes it easy to set your working directory and access files on your computer.
Especially if you are working in Windows, one of the most tedious parts of programming in
R is setting your working directory to access your files.
With RStudio, you can navigate to folders on your computer in the “Files” window, view any
files you have in that folder, and set that folder as the working directory.
Command :
setwd("c:/Documents/my/working/directory")
Set a default working directory
A default working directory is a folder where RStudio goes, every time you open it. You can
change the default working directory from RStudio menu under: Tools –> Global options –>
click on “Browse” to select the default working directory you want.
4) RStudio makes graphics much more accessible for a casual user.
The basic R GUI requires you to go to some lengths to save graphics as you go. But RStudio
has a window that does exactly that.
You can easily click back and forth between plots, change the sizes of your plot without
rerunning the code, and export or copy plots to include in other documents.
Four Panes in RStudio
RStudio is a four-pane workspace for 1) creating files containing R script, 2) typing R
commands, 3) viewing command histories, 4) viewing plots and more.
Top-left panel: Code editor allowing you to create and open a file containing R script. The R
script is where you keep a record of your work. An R script can be created as follows: File –>
New –> R Script.
Bottom-left panel: R console for typing R commands
Top-right panel:
Workspace tab: shows the list of R objects you created during your R session
History tab: shows the history of all previous commands
Bottom-right panel:
Files tab: shows the files in your working directory
Plots tab: shows the history of plots you created. From this tab, you can export a plot to a PDF
or an image file
Packages tab: shows the external R packages available on your system. If checked, the package is
loaded in R.
IMPORT OF DATA SHEET IN R STUDIO
Step 1:
Step 2:
Step 3 [Output]:
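The import can also be done from the console instead of the menus. The following is a self-contained sketch, assuming the sheet has first been saved as a CSV file; the file and the object name sheet are illustrative (a temporary file is created here only so that the example runs on its own), and an Excel workbook would normally be read with an add-on package such as readxl instead.

```r
# Create a small illustrative CSV in a temporary location (stand-in for the real data sheet)
path <- tempfile(fileext = ".csv")
write.csv(data.frame(Hours = c(25.7, 38.5, 29.3)), path, row.names = FALSE)

# Import the sheet: the first row is read as column names by default
sheet <- read.csv(path)

str(sheet)          # structure of the imported data frame
mean(sheet$Hours)   # columns are accessed with the $ operator
```

Once imported, the data frame appears in the Environment window and its columns can be passed to the functions used in the following sections.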
Descriptive statistics using r-studio
Step 1:
Step 2:
Step 3:
Output:
Coding:
For Summary Statistics:
summary(one_sample_t_test_Rstudio$Hours)
For Standard Deviation:
sd(one_sample_t_test_Rstudio$Hours)
For Variance:
var(one_sample_t_test_Rstudio$Hours)
Result:
For Summary Statistics:
summary(one_sample_t_test_Rstudio$Hours)
Min. 1st Qu. Median Mean 3rd Qu. Max.

23.00 27.00 30.00 30.48 33.70 39.00
For Standard Deviation:

sd(one_sample_t_test_Rstudio$Hours)
[1] 4.374199
For Variance:
var(one_sample_t_test_Rstudio$Hours)
[1] 19.13362
Correlation using R-Studio
Step 1:
Step 2:
Step 3:
Output:
Coding:
cor.test(corelation$`Advertisement in month`,corelation$`Sales in crores`)
Result:
Pearson's product-moment correlation
data: corelation$`Advertisement in month` and corelation$`Sales in crores`
t = 1.359, df = 6, p-value = 0.223
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.3335576 0.8866886
sample estimates:
cor
0.4851491
Inference:
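Here r = +0.485, therefore there is a positive correlation between
advertisements and sales in the sample. The p-value (0.223) is greater than 0.05,
so this correlation is not statistically significant at the 5% level.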
Hypothesis Testing using R-Studio
(i) One sample t test

Problem: Suppose that we want to hypothesize that the mean
number of TV hours watched per week is greater than 28.5
HYPOTHESIS TESTING:
Null Hypothesis: Mean number of TV hours watched per week is not greater than 28.5
Alternate Hypothesis: Mean number of TV hours watched per week is greater than 28.5
Step 1:
Step 2:
Step 3:
Coding:
t.test(one_sample_t_test_Rstudio$Hours,alternative = "greater",mu=28.5)
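Result:
One Sample t-test
data: one_sample_t_test_Rstudio$Hours
t = 2.441, df = 28, p-value = 0.01061
alternative hypothesis: true mean is greater than 28.5
95 percent confidence interval:
29.10098 Inf
sample estimates:
mean of x
30.48276
DECISION RULE:
If p-value is less than α, reject the null hypothesis.
INFERENCE:
Since p-value (0.01061) is less than α (0.05), reject the null hypothesis.
CONCLUSION:
Mean number of TV hours watched per week is greater than 28.5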
(ii) Two Sample T-test
Problem: To analyse whether there is a significant difference between
the marks scored by class groups A & B in mathematics at α=10%
HYPOTHESIS TESTING:
Null Hypothesis: There is no significant difference between the marks scored by class
groups A & B in mathematics
Alternate Hypothesis: There is a significant difference between the marks scored by class
groups A & B in mathematics
Step 1:
Step 2:
Step 3:
Coding:
t.test(twosample_t_test2$`Group A`,twosample_t_test2$`Group B`,conf.level = 0.90)
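Result:
Welch Two Sample t-test
data: twosample_t_test2$`Group A` and twosample_t_test2$`Group B`
t = 1.7863, df = 26.177, p-value = 0.08565
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
0.3200806 13.7851826
sample estimates:
mean of x mean of y
82.47368 75.42105
DECISION RULE:
If p-value is less than α, reject the null hypothesis.
INFERENCE:
Since p-value (0.08565) is less than α (0.10), reject the null hypothesis.
CONCLUSION:
There is a significant difference between the marks scored by class groups A & B in
mathematics at α=10%.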
(iii) Paired Sample T-Test
Problem: Determine whether there is a significant difference between
the time to finish the race when the race is completed with local shoes
and with branded shoes.
HYPOTHESIS TESTING:
Null Hypothesis: There is no significant difference between the time to finish the race
with local shoes and with branded shoes.
Alternate Hypothesis: There is a significant difference between the time to finish the race
with local shoes and with branded shoes.
Step 1:
Step 2:
Step 3:
Coding:
t.test(PT_TEST_R_STUDIO_1_$`Local shoes`, PT_TEST_R_STUDIO_1_$`Branded shoes`, paired = TRUE)
Result:
Paired t-test
data: PT_TEST_R_STUDIO_1_$`Local shoes` and PT_TEST_R_STUDIO_1_$`Branded shoes`
t = 0.079506, df = 14, p-value = 0.9378
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
-1.558575 1.678575
sample estimates:
mean difference
0.06
DECISION RULE:
If p-value is less than α, reject the null hypothesis.
INFERENCE:
Since p(t) (0.9378) is greater than α (0.05), do not reject the null hypothesis.
CONCLUSION:
There is no significant difference between the time to finish the race when the race is
completed with local shoes and with branded shoes.
F Test Using R-Studio
Problem: Determine whether variance of Class1 is greater than
variance of Class2 in mathematics.
HYPOTHESIS TESTING:
Null Hypothesis: Variance of Class1 is not greater than variance of
Class2 in mathematics.
Alternate Hypothesis: Variance of Class1 is greater than variance of
Class2 in mathematics.
Step 1:
Step 2:
Step 3:
Coding:
var.test(f_test$Class1,f_test$Class2)
Result:
F test to compare two variances
data: f_test$Class1 and f_test$Class2
F = 2.1317, num df = 5, denom df = 5, p-value = 0.4258
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.2982922 15.2340118
sample estimates:
ratio of variances
2.13171
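DECISION RULE:
If the p-value is less than α, reject the null hypothesis.
INFERENCE:
Since the p-value (0.4258) is greater than α (0.05), we fail to reject the null hypothesis.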
CONCLUSION:
Variance of Class1 is not greater than variance of class2 in
mathematics.
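The problem is stated as a one-tailed test (Class1 variance greater than Class2 variance), whereas var.test() reports a two-sided p-value by default. As a rough sketch, the one-tailed p-value can be recovered from the reported F statistic and degrees of freedom. The variable names are illustrative, and α = 0.05 is an assumption, since this section does not state the level.

```r
# Values copied from the var.test() result above.
f_stat <- 2.1317
df_num <- 5
df_den <- 5
alpha  <- 0.05   # assumed significance level

# One-tailed p-value for H1: variance of Class1 > variance of Class2.
p_one_sided <- pf(f_stat, df_num, df_den, lower.tail = FALSE)

# Two-sided p-value, which is what var.test() prints by default.
p_two_sided <- 2 * min(p_one_sided, 1 - p_one_sided)

decision <- if (p_one_sided < alpha) "reject H0" else "fail to reject H0"
cat("one-tailed p:", round(p_one_sided, 4),
    "| two-sided p:", round(p_two_sided, 4), "|", decision, "\n")

# With the raw data loaded, the one-tailed test can be requested directly:
# var.test(f_test$Class1, f_test$Class2, alternative = "greater")
```

The one-tailed p-value is half the two-sided one here, and the conclusion is unchanged either way.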
ANOVA using R-Studio
Problem: To test whether the mean marks of the students in the subjects economics, science
and history are all equal.
Step 1:
Step 2:
Step 3:
Coding:
combinedgroup=data.frame(cbind(ANOVA$Economics,ANOVA$Science,ANOVA$History))
summary(combinedgroup)
stack(combinedgroup)
stackedgroup=stack(combinedgroup)
anovaresult=aov(values~ind, data=stackedgroup)
summary(anovaresult)
Result:
> combinedgroup=data.frame(cbind(ANOVA$Economics,ANOVA$Science,ANOVA$History))
> summary(combinedgroup)
       X1              X2             X3
 Min.   :42.00   Min.   :54.0   Min.   :35.00
 1st Qu.:44.00   1st Qu.:55.5   1st Qu.:39.00
 Median :49.00   Median :58.0   Median :40.00
 Mean   :48.33   Mean   :60.0   Mean   :43.67
 3rd Qu.:53.00   3rd Qu.:64.0   3rd Qu.:50.00
 Max.   :54.00   Max.   :69.0   Max.   :55.00
                 NA's   :2
> stack(combinedgroup)
values ind
1 42 X1
2 53 X1
3 49 X1
4 53 X1
5 43 X1
6 44 X1
7 45 X1
8 52 X1
9 54 X1
10 69 X2
11 54 X2
12 58 X2
13 64 X2
14 64 X2
15 55 X2
16 56 X2
17 NA X2
18 NA X2
19 35 X3
20 40 X3
21 53 X3
22 42 X3
23 50 X3
24 39 X3
25 55 X3
26 39 X3
27 40 X3
> stackedgroup=stack(combinedgroup)
> anovaresult=aov(values~ind, data=stackedgroup)
> summary(anovaresult)
Df Sum Sq Mean Sq F value Pr(>F)
ind 2 1086 542.9 15.2 7.16e-05 ***
Residuals 22 786 35.7
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
2 observations deleted due to missingness
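DECISION RULE:
If the p-value is less than α, reject the null hypothesis.
INFERENCE:
Since the p-value (7.16e-05) is less than α (0.05), reject the null hypothesis.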
CONCLUSION:
The mean marks of the students in economics, science and history are not all equal; at least
one subject mean differs significantly from the others.
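Because cbind() drops the column names, the output above labels the subjects X1, X2 and X3 (Economics, Science and History, in that order). A minimal sketch that keeps the subject names is shown below; it re-enters the marks from the stack() listing above, with NA for the two missing Science scores. The object names are illustrative, and α = 0.05 is an assumption.

```r
# Marks transcribed from the stack() output above (X1, X2, X3 in order).
marks <- data.frame(
  Economics = c(42, 53, 49, 53, 43, 44, 45, 52, 54),
  Science   = c(69, 54, 58, 64, 64, 55, 56, NA, NA),
  History   = c(35, 40, 53, 42, 50, 39, 55, 39, 40)
)

# Long format: one row per mark, with the subject name in column 'ind'.
stacked <- stack(marks)

# One-way ANOVA; rows with NA are dropped automatically by aov().
fit <- aov(values ~ ind, data = stacked)
anova_table <- summary(fit)[[1]]

f_value <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]

alpha <- 0.05   # assumed significance level
decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"
cat("F =", round(f_value, 2), "| p =", signif(p_value, 3), "|", decision, "\n")
```

Naming the columns when the data frame is built makes the summary and ANOVA output easier to read, since each row is labelled with its subject.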