ABSTRACT:
Research design and the formulation of hypotheses are two important parts of the research process. In
the course of research work, the tasks of designing a research approach, research instruments, a
sampling plan and information-gathering methods play a very crucial role.
Reference: All the materials are prepared by referring to Business Statistics written by P. C. Tulsian
and B. Jhunjhunwala.
Uses of F-Test
F test is used –
1. For test of hypothesis of equality between two variances.
2. For test of hypothesis of equality amongst several sample means.
Properties of F-Test
1. Range – The value of F ranges from 0 to ∞. The value of F can never be
negative since both terms of the F-ratio are squared values.
2. Shape – The shape of the F distribution curve depends upon the number of
degrees of freedom for the first term and that for the second term. In
general, the F curve is skewed to the right.
3. Critical Value – For the same probability value, the critical value of F for the lower
area is the reciprocal of the critical value of F for the upper area with the degrees of freedom $\nu_1$ and $\nu_2$ interchanged.
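The reciprocal property of the critical value can be written compactly as below. This is only a sketch of the notation: the symbols $\nu_1$, $\nu_2$ (degrees of freedom of the first and second term) and $\alpha$ (the tail probability) are assumed here, and $F_{\alpha}$ denotes the value of F that cuts off an area $\alpha$ in the upper tail.

```latex
F_{1-\alpha}(\nu_1, \nu_2) = \frac{1}{F_{\alpha}(\nu_2, \nu_1)}
% Example: the upper 5% table value of F for (2, 12) degrees of freedom is about 3.89,
% so the lower 5% value of F for (12, 2) degrees of freedom is about 1 / 3.89 = 0.257.
```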
Analysis of Variance
Analysis of variance is based on the ratio of two variances: (i) the variance between samples and (ii) the variance within
samples. Its purpose is to find out the influence of the different forces working on
the samples.
It is used in agricultural experiments and in the natural and physical sciences.
Classification Model
There may be a one-way classification model or a two-way classification model.
One-way Classification Model
The one-way classification model is designed to study the effect of one factor in an
experiment. For example, the influence of the application of one or more types of
fertilizers may be considered on several pieces of land. It is designed to test the
null hypothesis that the arithmetic means of the populations from which the k
samples are randomly drawn are equal to one another.
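Step-4: Square these differences and obtain their total, i.e. $\sum (\bar{X}_j - \bar{\bar{X}})^2$, for
each sample, where $\bar{X}_j$ is the mean of sample j, $\bar{\bar{X}}$ is the grand mean, and the sum runs over the items of that sample.
Step-5: Calculate the sum of squares between the samples (SSB) as follows:
$SSB = \sum (\bar{X}_1 - \bar{\bar{X}})^2 + \sum (\bar{X}_2 - \bar{\bar{X}})^2 + \dots + \sum (\bar{X}_c - \bar{\bar{X}})^2$
Step-6: Calculate the difference between the various items in a sample
and the mean value of the respective sample.
Step-7: Square these differences and obtain their total for each sample, i.e.
$\sum (X_j - \bar{X}_j)^2$
Step-8: Calculate the sum of squares within the samples (SSW) as follows:
$SSW = \sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2 + \dots + \sum (X_c - \bar{X}_c)^2$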
Step-9: Prepare the ANOVA table as follows:
Source of variation   Sum of squares   Degrees of freedom   Mean squares            Computed value of F   Table value of F
Between samples       SSB              c – 1                MSB = SSB / (c – 1)     F = MSB / MSW
Within samples        SSW              n – c                MSW = SSW / (n – c)
Total                 SST              n – 1
Step-10: Compare the computed value of F with the table value of F for the
given degrees of freedom at a given critical level (generally we
take the 5% level of significance) and interpret the same as follows:
Case: (a) If the computed value of F is greater than the table value of F.
Interpretation: The difference in the variances is significant; it could not have arisen due to
fluctuation of random sampling, and hence we reject the null hypothesis.
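The direct method described in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's own procedure: the function name `one_way_anova` and the small data set are hypothetical, and the comparison with the table value of F is left to the reader.

```python
def one_way_anova(samples):
    """Direct-method one-way ANOVA on a list of samples (lists of numbers)."""
    c = len(samples)                              # number of samples
    n = sum(len(s) for s in samples)              # total number of items
    means = [sum(s) / len(s) for s in samples]    # mean of each sample
    grand_mean = sum(sum(s) for s in samples) / n

    # SSB: squared deviation of each sample mean from the grand mean,
    # counted once for every item in that sample.
    ssb = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # SSW: squared deviation of each item from its own sample mean.
    ssw = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)

    msb = ssb / (c - 1)                           # mean square between samples
    msw = ssw / (n - c)                           # mean square within samples
    return {"SSB": ssb, "SSW": ssw, "MSB": msb, "MSW": msw,
            "F": msb / msw, "df": (c - 1, n - c)}


# Hypothetical data, for illustration only: three samples of three items each.
result = one_way_anova([[8, 10, 12], [7, 9, 11], [12, 14, 16]])
print(result)
```

The computed F is then compared with the table value of F for the degrees of freedom returned in `df`.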
Illustration-3
The following table gives the yield on 15 fields under three varieties of seeds
(viz. A, B, C);
YIELDS
A B C
9500 9300 10000
9600 9800 10300
9800 9200 9700
9100 10000 10300
9500 9000 10700
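$SSB = \left[ \frac{(\sum X_A)^2}{n_A} + \frac{(\sum X_B)^2}{n_B} + \frac{(\sum X_C)^2}{n_C} \right] - \frac{T^2}{N}$
where T is the grand total of all items and N is the total number of items.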
Step-6: Calculate sum of squares within samples (SSW) as follows:
SSW = SST – SSB
Step-8: Compare the computed value of F with the table value of F for the
given degrees of freedom at a given critical level (generally we
take the 5% level of significance) and interpret the same as follows:
Case: (a) If the computed value of F is greater than the table value of F.
Interpretation: The difference in the variances is significant; it could not have arisen due to
fluctuation of random sampling, and hence we reject the null hypothesis.
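As a check on Illustration-3, the short-cut method (correction factor $T^2/N$, then SSW = SST – SSB) can be sketched as follows. The variable names are assumptions made for this example; only the yield figures come from the illustration.

```python
# Yields from Illustration-3 (15 fields, three varieties of seeds).
yields = {
    "A": [9500, 9600, 9800, 9100, 9500],
    "B": [9300, 9800, 9200, 10000, 9000],
    "C": [10000, 10300, 9700, 10300, 10700],
}

c = len(yields)                                   # number of samples
n = sum(len(v) for v in yields.values())          # total number of items
T = sum(sum(v) for v in yields.values())          # grand total
cf = T ** 2 / n                                   # correction factor T^2 / N

sst = sum(x ** 2 for v in yields.values() for x in v) - cf      # total sum of squares
ssb = sum(sum(v) ** 2 / len(v) for v in yields.values()) - cf   # between samples
ssw = sst - ssb                                                  # within samples

msb = ssb / (c - 1)
msw = ssw / (n - c)
f_ratio = msb / msw

print(f"SSB={ssb:.0f} SSW={ssw:.0f} F={f_ratio:.4f} df=({c - 1}, {n - c})")
```

With (2, 12) degrees of freedom the 5% table value of F is about 3.89, so a computed F well above that would point to rejecting the hypothesis of equal mean yields.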
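Source of variation         Sum of squares   Degrees of freedom   Mean squares                      Variance ratio
Between columns (samples)   SSC              c – 1                MSC = SSC / (c – 1)               F = greater variance / smaller variance *
Between rows                SSR              r – 1                MSR = SSR / (r – 1)               F = greater variance / smaller variance **
Residual / Error            SSE              (c – 1)(r – 1)       MSE = SSE / [(c – 1)(r – 1)]
Total                       SST              rc – 1
* Greater or smaller variance out of MSC and MSE.
** Greater or smaller variance out of MSR and MSE.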
Step-9: Compare the computed value of F with the table value of F for the
given degrees of freedom at a given critical level (generally we
take 5% level of significance) and interpret the same as follows:
Case: (a) If the computed value of F is greater than the table value of F.
Interpretation: The difference in the variances is significant; it could not have arisen due to
fluctuation of random sampling, and hence we reject the null hypothesis.
Illustration-5
The following table gives per hectare yield for three varieties of wheat each
grown on five plots:
Per hectare yield (in tons)
Plot of Land Variety of wheat
A B C
1 5 3 10
2 6 5 13
3 8 2 7
4 1 10 13
5 5 0 17
Test at 5% level of significance.
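A possible way to work Illustration-5 as a two-way classification is sketched below. It assumes that the varieties are the columns and the plots are the rows, and it follows the convention of placing the greater variance over the smaller one; the helper name `variance_ratio` is hypothetical.

```python
# Per hectare yield from Illustration-5: rows are plots 1-5, columns are varieties A, B, C.
table = [
    [5, 3, 10],
    [6, 5, 13],
    [8, 2, 7],
    [1, 10, 13],
    [5, 0, 17],
]

r = len(table)                                    # number of rows (plots)
c = len(table[0])                                 # number of columns (varieties)
T = sum(sum(row) for row in table)                # grand total
cf = T ** 2 / (r * c)                             # correction factor

sst = sum(x ** 2 for row in table for x in row) - cf
ssc = sum(sum(row[j] for row in table) ** 2 for j in range(c)) / r - cf  # between columns
ssr = sum(sum(row) ** 2 for row in table) / c - cf                        # between rows
sse = sst - ssc - ssr                                                     # residual / error

msc = ssc / (c - 1)
msr = ssr / (r - 1)
mse = sse / ((c - 1) * (r - 1))


def variance_ratio(ms, ms_error):
    """Greater variance divided by smaller variance."""
    return max(ms, ms_error) / min(ms, ms_error)


f_varieties = variance_ratio(msc, mse)
f_plots = variance_ratio(msr, mse)
print(f"SSC={ssc:.3f} SSR={ssr:.3f} SSE={sse:.3f}")
print(f"F (varieties)={f_varieties:.4f}  F (plots)={f_plots:.4f}")
```

When the error variance turns out to be the greater one, as it does for the plots here, the degrees of freedom used to read the table value of F are interchanged as well.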
List of Formulae
1. Value of F: $F = \frac{\hat{\sigma}_1^2}{\hat{\sigma}_2^2}$
2. ANOVA Table for One Factor Analysis of Variance
Source of variation   Sum of squares   Degrees of freedom   Mean squares            Variance ratio
Between samples       SSB              c – 1                MSB = SSB / (c – 1)     F = MSB / MSW
Within samples        SSW              n – c                MSW = SSW / (n – c)
Total                 SST              n – 1
The hypotheses of non-parametric tests are concerned with something other than
the value of a population parameter. Hence non-parametric tests do not
depend on whether the observed population fits any parametric
distribution.
Non-parametric tests make very few assumptions, and as such they
have wide acceptability.
Non-parametric tests are very simple to use. In certain cases, even when the use
of a parametric test is justified, a non-parametric test may be easier to use.
Advantages of Non-Parametric Test
1) It is a distribution-free test.
2) It is more robust.
3) Non-parametric tests can be used for very small sample sizes.
4) Non-parametric tests can be used for attributes.
5) Non-parametric tests can be used for making judgments about individuals.
6) They are very easy to calculate.
7) They can be used with limited information.
Have both judges rated the contestants in the same manner, or do they
differ, if the significance level is 0.05?
Illustration-2
Use the sign test to see if there is a difference between the number of days until
collection of an account receivable, before and after a new collection policy.
Take
Before 30 28 34 35 40 42 33 38
After 32 29 33 32 37 43 40 41
Before 34 45 28 27 25 41 36
After 37 44 27 33 30 38 36
Illustration-3
The following data represent the rate of defective work of workers A to L before
and after the takeover of a company.
A B C D E F G H I J K L
Before 8 7 6 9 7 10 8 6 5 8 10 8
After 6 5 8 6 9 8 10 7 5 6 9 8
On the basis of a paired sign test (using the 0.10 level of significance), state whether
the takeover has made any change.
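The paired sign test used in Illustration-2 and Illustration-3 can be sketched as below. This is an assumed implementation, not the source's worked solution: ties are dropped, the plus and minus signs are counted, and an exact two-tailed binomial probability with p = 0.5 is computed. The function name `sign_test` is hypothetical.

```python
from math import comb


def sign_test(before, after):
    """Return (plus, minus, two-tailed exact p-value) for paired data."""
    plus = sum(1 for b, a in zip(before, after) if a > b)    # increases
    minus = sum(1 for b, a in zip(before, after) if a < b)   # decreases
    n = plus + minus                                         # ties are dropped
    k = min(plus, minus)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-tailed test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return plus, minus, min(1.0, 2 * tail)


# Illustration-2: days until collection, before and after the new policy.
before_2 = [30, 28, 34, 35, 40, 42, 33, 38, 34, 45, 28, 27, 25, 41, 36]
after_2 = [32, 29, 33, 32, 37, 43, 40, 41, 37, 44, 27, 33, 30, 38, 36]

# Illustration-3: rate of defective work of workers A to L.
before_3 = [8, 7, 6, 9, 7, 10, 8, 6, 5, 8, 10, 8]
after_3 = [6, 5, 8, 6, 9, 8, 10, 7, 5, 6, 9, 8]

result_2 = sign_test(before_2, after_2)
result_3 = sign_test(before_3, after_3)
print(result_2)
print(result_3)
```

The hypothesis of no change is rejected only if the p-value falls below the chosen level of significance (0.10 in Illustration-3).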
Notes:
1) Values for Spearman’s rank correlation are given for both tails
combined. Hence at the 0.10 significance level, the table shows the value for which the right tail
and the left tail each cover an area of 0.05.
2) When there are extreme values in the original data, rank correlation
can produce more useful results than the ordinary correlation method.
When n is large (commonly taken as n > 30), $r_s$ can be treated as normally distributed with the following standard
error.
Standard error of the coefficient of rank correlation: $\sigma_{r_s} = \frac{1}{\sqrt{n - 1}}$
After calculating the standard error, the result can be standardized by using the
formula $Z = \frac{r_s}{\sigma_{r_s}}$
Then the computed value of Z can be compared with the critical value of Z to
check the hypothesis.
Illustration-4
Determine whether the students who score well in Mathematics also score well in
Physics (
Student Rank in Mathematics Rank in Physics
A 3 6
B 2 2
C 1 3
D 5 4
E 4 1
F 6 5
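A short sketch of the rank correlation calculation for Illustration-4 follows. It assumes the usual formula $r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$, which is not shown in this section, and the function names are hypothetical. The large-sample Z is included only to illustrate the standardization; with n = 6 the table of critical values for rank correlation would be used instead.

```python
from math import sqrt


def spearman_rank_correlation(rank_x, rank_y):
    """Spearman's rank correlation from two lists of ranks (no ties assumed)."""
    n = len(rank_x)
    d_squared = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def z_for_large_n(r_s, n):
    """Standardized value Z = r_s / (1 / sqrt(n - 1)); meant for large n only."""
    return r_s * sqrt(n - 1)


# Ranks of students A to F from Illustration-4.
rank_maths = [3, 2, 1, 5, 4, 6]
rank_physics = [6, 2, 3, 4, 1, 5]

r_s = spearman_rank_correlation(rank_maths, rank_physics)
print(f"r_s = {r_s:.4f}")
```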
Illustration-5
The manager (training) of a marketing company wanted to assess the
performance of ten salesmen. For this he compared their ranks in the training
program with their ranks in the field. The results of his evaluation are given below.
Comment on the relationship / association between training
and performance (use the 5% level of significance).
Illustration-6
Given below are the marks obtained by 11 students in a test. Find the rank
correlation and test, at the 5% level of significance, whether there is any correlation
between the scores in the two subjects.
Illustration-7
A teacher in a school believes that students who finish exams more quickly than
others have better exam scores. The following set of data shows the score and
order of finishing for 12 students on an exam. Using the rank correlation method, do
these data indicate that the first students to complete an exam have higher
grades? (
Mann-Whitney U Test
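This test is used to determine whether two populations have the same
mean. It is a non-parametric test of the hypothesis, as against the
parametric tests of hypothesis discussed in the earlier chapter. However, the
Mann-Whitney U test (or simply the U test) is restricted to only two populations.
U statistic: $U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$
where $n_1$ and $n_2$ are the sizes of the two samples and $R_1$ is the sum of the ranks of the first sample in the combined ranking.
If both $n_1$ and $n_2$ are larger than 10, then the U statistic can be approximated by the
normal distribution and standardized as follows:
$Z = \frac{U - \mu_U}{\sigma_U}$, with $\mu_U = \frac{n_1 n_2}{2}$ and $\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$
If the computed value of Z (in absolute value) is less than the critical value of Z for the corresponding level of significance, then the null
hypothesis is accepted (i.e. both populations have the same mean);
otherwise the null hypothesis is rejected.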
Illustration 8
A large hospital hires most of its doctors from two major universities. Over the
last year, the hospital has been conducting a test for newly recruited doctors to
determine which school educates better. Based on the following scores, help the
human resource department of the hospital to decide whether the universities
differ in quality. (
Test Score
University A: 99 83 89 64 98 85 61 79 91 87 88
University B: 96 90 97 94 86 95 68 78 93 56 76 84
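The U test for Illustration 8 can be sketched as follows. The sketch assumes the form $U = n_1 n_2 + n_1(n_1 + 1)/2 - R_1$ with the normal approximation $\mu_U = n_1 n_2 / 2$ and $\sigma_U = \sqrt{n_1 n_2 (n_1 + n_2 + 1)/12}$; the function name `mann_whitney_u` is hypothetical, and tied scores (none occur in this data) would receive average ranks.

```python
from math import sqrt


def mann_whitney_u(sample1, sample2):
    """Return (U, Z) using ranks of the combined samples and the normal approximation."""
    combined = sorted(sample1 + sample2)
    # Assign ranks 1..N; equal values share the average of their ranks.
    rank_of = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        rank_of[combined[i]] = (i + 1 + j) / 2
        i = j

    n1, n2 = len(sample1), len(sample2)
    r1 = sum(rank_of[x] for x in sample1)         # sum of ranks of the first sample
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    mu_u = n1 * n2 / 2
    sigma_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, (u - mu_u) / sigma_u


university_a = [99, 83, 89, 64, 98, 85, 61, 79, 91, 87, 88]
university_b = [96, 90, 97, 94, 86, 95, 68, 78, 93, 56, 76, 84]

u_stat, z_value = mann_whitney_u(university_a, university_b)
print(f"U = {u_stat:.1f}, Z = {z_value:.4f}")
```

The computed Z is then compared with the critical value of Z for the level of significance stated in the problem; for a two-tailed test at the 5% level that critical value is 1.96.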
SUMMARY:
1. Research design is a comprehensive plan of the sequence of operations that a
researcher intends to carry out to achieve the research objectives. It involves selecting the
most appropriate methods and techniques to solve the problem under investigation.
2. A hypothesis is a tentative generalization, the validity of which remains to be seen. In its
most elementary stage the hypothesis may be any hunch, guess or imaginative idea which
becomes the basis of further investigation. It analyses: how powerful is my study (test)? How
many observations do I need for what I want to get from the study? Answers to
these questions enable researchers to use research resources efficiently.
3. Tests of hypothesis can be carried out on one or two samples. One-sample tests are used to
test whether the population parameter is different from a specified value. Two-sample tests are
used to detect the difference between the parameters of two populations.