
# SIKKIM MANIPAL UNIVERSITY

ASSIGNMENT

Name: Dhananjay Kumar
Roll No.: 510911381
Course: MBA-Semester-3
Subject: Research Methodology
Subject Code: MB0035-Set-2
Centre Code: 1799

1. Write short notes on the following:
a. Null hypothesis
b. What is explanatory research?
c. What is random sampling?
d. Rank order correlation

a. A null hypothesis is a hypothesis (within the frequentist context of statistical
hypothesis testing) that might be falsified using a test of observed data. Such a test works
by formulating a null hypothesis, collecting data, and calculating a measure of how
probable that data was assuming the null hypothesis were true. If the data appears very
improbable (usually defined as a type of data that should be observed less than 5% of the
time) then the experimenter concludes that the null hypothesis is false. If the data looks
reasonable under the null hypothesis, then no conclusion is made. In this case, the null
hypothesis could be true, or it could still be false; the data gives insufficient evidence to
make any conclusion. The null hypothesis typically proposes a general or default
position, such as that there is no relationship between two quantities, or that there is no
difference between a treatment and the control. The term was originally coined by
English geneticist and statistician Ronald Fisher.
In some versions of statistical hypothesis testing (such as developed by Jerzy Neyman
and Egon Pearson), the null hypothesis is tested against an alternative hypothesis. This
alternative may or may not be the logical negation of the null hypothesis. The use of
alternative hypotheses was not part of Ronald Fisher's formulation of statistical
hypothesis testing, though alternative hypotheses are standardly used today.

For instance, one might want to test the claim that a certain drug reduces the chance of
having a heart attack. One would choose the null hypothesis "this drug does not reduce
the chances of having a heart attack" (or perhaps "this drug has no effect on the chances
of having a heart attack"). One should then collect data by observing people both taking
the drug and not taking the drug in some sort of controlled experiment. If the data is very
unlikely under the null hypothesis one would reject the null hypothesis, and conclude that
its negation is true. That is, one would conclude that the drug does reduce the chances of
having a heart attack. Here "unlikely data" would mean data where the percentage of
people taking the drug who had heart attacks was much less than the percentage of people
not taking the drug who had heart attacks. Of course one should use a known statistical
test to decide how unlikely the data was and hence whether or not to reject the null
hypothesis.
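
The drug example above can be sketched in code. This is only an illustration: the counts are hypothetical, and the one-sided two-proportion z-test is one possible choice of "known statistical test", not one the assignment prescribes. The function name is likewise an assumption.

```python
import math

def one_sided_two_proportion_p_value(events_drug, n_drug, events_control, n_control):
    """P-value for H0: the drug does not reduce the heart-attack rate.

    Uses a one-sided two-proportion z-test with a pooled standard error.
    """
    p_drug = events_drug / n_drug
    p_control = events_control / n_control
    pooled = (events_drug + events_control) / (n_drug + n_control)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_drug + 1 / n_control))
    z = (p_drug - p_control) / se
    # Lower-tail probability of the standard normal distribution:
    # small when the drug group has far fewer heart attacks.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical data: 30 of 1000 people on the drug and 60 of 1000 controls had heart attacks.
p_value = one_sided_two_proportion_p_value(30, 1000, 60, 1000)
reject_null = p_value < 0.05  # the usual 5% threshold mentioned in the text
print(p_value, reject_null)
```

If the two groups had identical rates, the p-value would be 0.5 and the null hypothesis would not be rejected, which matches the "no conclusion is made" case described earlier.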

b. Exploratory research provides insights into and comprehension of an issue or
situation. It should draw definitive conclusions only with extreme caution. Exploratory
research is a type of research conducted because a problem has not been clearly defined.
Exploratory research helps determine the best research design, data collection method
and selection of subjects. Given its fundamental nature, exploratory research often
concludes that a perceived problem does not actually exist.

Exploratory research often relies on secondary research such as reviewing available
literature and/or data, or qualitative approaches such as informal discussions with
consumers, employees, management or competitors, and more formal approaches through
in-depth interviews, focus groups, projective methods, case studies or pilot studies. The
Internet allows for research methods that are more interactive in nature: for example, RSS feeds
efficiently supply researchers with up-to-date information; major search engine search
results may be sent by email to researchers by services such as Google Alerts;
comprehensive search results are tracked over lengthy periods of time by services such as
Google Trends; and Web sites may be created to attract worldwide feedback on any
subject.

The results of exploratory research are not usually useful for decision-making by
themselves, but they can provide significant insight into a given situation. Although the
results of qualitative research can give some indication as to the "why", "how" and
"when" something occurs, they cannot tell us "how often" or "how many."

Exploratory research is not typically generalizable to the population at large.

c. Random Sampling is that part of statistical practice concerned with the selection
of an unbiased or random subset of individual observations within a population of
individuals intended to yield some knowledge about the population of concern, especially
for the purposes of making predictions based on statistical inference. Sampling is an
important aspect of data collection.

Researchers rarely survey the entire population for two reasons (Adèr, Mellenbergh, &
Hand, 2008): the cost is too high, and the population is dynamic in that the individuals
making up the population may change over time. The three main advantages of sampling
are that the cost is lower, data collection is faster, and since the data set is smaller it is
possible to ensure homogeneity and to improve the accuracy and quality of the data.

Each observation measures one or more properties (such as weight, location, color) of
observable bodies distinguished as independent objects or individuals. In survey
sampling, survey weights can be applied to the data to adjust for the sample design.
Results from probability theory and statistical theory are employed to guide practice. In
business and medical research, sampling is widely used for gathering information about a
population.
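
A minimal sketch of simple random sampling, assuming a hypothetical population of 500 numbered individuals. The function name and the seed are illustrative only.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw a sample without replacement; every individual is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, sample_size)

population = list(range(1, 501))          # hypothetical IDs 1..500
sample = simple_random_sample(population, 50, seed=42)

# The sample mean serves as an estimate of the population mean.
sample_mean = sum(sample) / len(sample)
population_mean = sum(population) / len(population)
print(sample_mean, population_mean)
```

Fixing the seed makes the draw reproducible. In a real survey the sampling frame would be the list of actual population members.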
d. Rank-order correlation is the most commonly used method of computing a
correlation coefficient between the ranks of scores on two variables. In statistics,
Spearman's rank correlation coefficient or Spearman's rho, named after Charles
Spearman and often denoted by the Greek letter ρ (rho) or as r_s, is a non-parametric
measure of statistical dependence between two variables. It assesses how well the
relationship between two variables can be described using a monotonic function. If there
are no repeated data values, a perfect Spearman correlation of +1 or −1 occurs when each
of the variables is a perfect monotone function of the other.

The Spearman correlation coefficient is often thought of as being the Pearson correlation
coefficient between the ranked variables. In practice, however, a simpler procedure is
normally used to calculate ρ. The n raw scores X_i, Y_i are converted to ranks x_i, y_i, and the
differences d_i = x_i − y_i between the ranks of each observation on the two variables are
calculated.

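If there are no tied ranks, then ρ is given by:

$$\rho = 1 - \frac{6 \sum_i d_i^2}{n(n^2 - 1)}$$

If tied ranks exist, Pearson's correlation coefficient between ranks should be used for the
calculation:

$$\rho = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}}$$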

One has to assign the same rank to each of the equal values. It is an average of their
positions in the ascending order of the values.
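
The ranking procedure and both ways of computing ρ can be sketched as follows. The scores are hypothetical and the function names are illustrative. Tied values receive the average of their positions, as described above.

```python
import math

def average_ranks(values):
    """Rank values in ascending order; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # positions are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """General case (ties allowed): Pearson correlation between the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def spearman_rho_no_ties(x, y):
    """Shortcut formula 1 - 6*sum(d^2) / (n(n^2 - 1)); valid only without tied ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical scores with no ties: both routes give the same value.
x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]
print(spearman_rho(x, y), spearman_rho_no_ties(x, y))
```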

## 2. Elaborate the format of a research report touching briefly on the mechanics of writing.

A research report is a means for communicating research experience to others. It is
a formal statement of the research process and its results. It narrates the
problem studied, the methods used for studying it, and the findings and conclusions of the
study.

## The format of a research report is given below:

1. Prefatory Item
Title page
Declaration
Certificates
Preface/ acknowledgment
List of tables
List of graphs/ figures/ charts
Abstract or synopsis

2. Body of the Report
Introduction
Theoretical background of the topic
Statement of the problem
Review of literature
The Scope of the study
The objectives of the study
Hypothesis to be tested
Definition of the concepts
Models if any
Design of the study
Methodology
Method of data collection
Sources of data
Sampling Plan
Data collection instruments
Field work
Data processing and analysis plan
Overview of the report
Limitation of the study
Result: Findings and discussions
Summary, conclusions and recommendations

3. Reference Material

Bibliography
Appendix
Copies of data collection instruments
Technical details on sampling plan
Complex tables
Glossary of new terms used.

Mechanics of Writing:

A research report requires clear organization. Each chapter may be divided into
two or more sections with appropriate headings, and in each section marginal headings and
paragraph headings may be used to indicate subject shifts. Physical presentation is
another aspect of organization. A page should not be fully filled in from top to bottom.
Wider margins should be provided on both sides and on top and bottom as well.
A centered section heading is placed in the center of the page and is usually in solid font
size. It is separated from other textual material by two or three line spaces.
A marginal heading is used for a subdivision in each section. It starts from the left
margin without leaving any space.
A paragraph heading is used to head an important aspect of the subject matter discussed in a
subdivision. There is some space between the margin and this heading.
The presentation should be free from spelling and grammar errors. If the writer is not strong
in grammar, the manuscript should be corrected by a language expert.
Follow the rules of punctuation.
Use the present tense for presenting the findings of the study and for stating generalizations.
Do not use masculine nouns and pronouns when the content refers to both genders.
Do not abbreviate words in the text; spell them out in full.
A footnote citation is indicated by placing an index number, i.e., a superscript or numeral,
at the point of reference.
The reference style should have a clear format and be used consistently.

## 3.Discuss the importance of case study method.

Case study is a method of exploring and analyzing the life of a social unit or
entity, be it a person, a family, an institution or a community. A case study depends
upon the wit, common sense and imagination of the person doing it. The
investigator makes up his procedure as he goes along. Efforts should be made to ascertain
the reliability of life history data by examining the internal consistency of the
material. A judicious combination of data collection techniques is a prerequisite for
securing data that are culturally meaningful and scientifically significant. Case study is of
particular value when a complex set of variables may be at work in generating observed
results and intensive study is needed to unravel the complexities. Case documents
hardly fulfill the criteria of reliability, adequacy and representativeness, but to exclude
them from any scientific study of human life would be a blunder, inasmuch as these
documents are necessary and significant both for theory building and practice. In
business research in particular, in-depth analysis of selected cases helps unravel such
complexities.

Let us discuss the criteria for evaluating the adequacy of the case
history or life history which is of central importance for case study.

John Dollard has proposed seven criteria for evaluating such adequacy as
follows:
i) The subject must be viewed as a specimen in a cultural series. That is, the
case drawn out from its total context for the purposes of study must be
considered a member of the particular cultural group or community. The
scrutiny of the life histories of persons must be done with a view to identify
the community values, standards and their shared way of life.
ii) The organic motors of action must be socially relevant. That is, the action of the
individual cases must be viewed as a series of reactions to social stimuli or situations. In
other words, the social meaning of behaviour must be taken into consideration.
iii) The strategic role of the family group in transmitting the culture must be
recognized. That is, in case of an individual being the member of a family, the role of
the family in shaping his behaviour must never be overlooked.
iv) The specific method of elaboration of organic material into social
behaviour must be clearly shown. That is, case histories that portray in detail
how basically a biological organism, the man, gradually blossoms forth into a
social person, are especially fruitful.
v) The continuous related character of experience from childhood through
adulthood must be stressed. In other words, the life history must be a
configuration depicting the inter-relationships between the person's various
experiences.
vi) The social situation must be carefully and continuously specified as a factor.
One of the important criteria for the life history is that a person's life must be
shown as unfolding itself in the context of and partly owing to specific social
situations.
vii) The life history material itself must be organised according to some
conceptual framework.

4. Give the importance of frequency tables and discuss the principles of table
construction, frequency distribution and class interval determination.

1) Every table should have a title. The title should represent a
succinct description of the contents of the table. It should be clear
and concise. It should be placed above the body of the table.
2) A number facilitating easy reference should identify every table.
The number can be centered above the title. The table numbers
should run in consecutive serial order. Alternatively, tables in chapter
1 may be numbered as 1.1, 1.2, ..., in chapter 2 as 2.1, 2.2,
2.3, and so on.
3) The caption (or column heading) should be clear and brief.
4) The units of measurement under each heading must always be
indicated.
5) Any explanatory footnotes concerning the table itself are placed
directly beneath the table, and to avoid any possible
confusion with the textual footnotes, reference symbols such as the
asterisk (*), dagger (†) and the like may be used.
6) If the data in a series of tables have been obtained from different
sources, it is ordinarily advisable to indicate the specific source in a
place just below the table.
7) Usually lines separate columns from one another. Lines are
always drawn at the top and bottom of the table and below the
captions.
8) The column may be numbered to facilitate reference.
9) All column figures should be properly aligned. Decimal points and
plus and minus signs should be in perfect alignment.
10) Columns and rows that are to be compared with one another
should be brought close together.
11) Totals of rows should be placed in the extreme right column and
totals of columns at the bottom.
12) In order to emphasize the relative significance of certain
categories, different kinds of type, spacing and indentations can be
used.
13) The arrangement of the categories in a table may be chronological,
geographical, alphabetical or according to magnitude. Numerical
categories are usually arranged in descending order of magnitude.
14) Miscellaneous and exceptional items are generally placed in the last
row of the table.
15) Usually the larger number of items is listed vertically. This means
that a table's length is more than its width.
16)Abbreviations should be avoided whenever possible and ditto
marks should not be used in a table.
17)The table should be made as logical, clear, accurate and simple as
possible.

In statistics, a frequency distribution is a tabulation of the values that one or more
variables take in a sample. Managing and operating on frequency tabulated data is much
simpler than operating on raw data. There are simple algorithms to calculate the median,
mean, standard deviation etc. from these tables.
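
As a rough illustration of such algorithms, the sketch below computes the mean, median and (population) standard deviation directly from an ungrouped frequency table. The table values and the function name are hypothetical.

```python
import math

def frequency_table_stats(table):
    """table: list of (value, frequency) pairs.

    Returns (mean, median, population standard deviation) without expanding the raw data.
    """
    rows = sorted(table)
    n = sum(f for _, f in rows)
    mean = sum(v * f for v, f in rows) / n
    variance = sum(f * (v - mean) ** 2 for v, f in rows) / n

    def value_at(position):
        """Value at a 1-based position in the sorted raw data, found via cumulative frequency."""
        cumulative = 0
        for v, f in rows:
            cumulative += f
            if position <= cumulative:
                return v

    if n % 2 == 1:
        median = value_at((n + 1) // 2)
    else:
        median = (value_at(n // 2) + value_at(n // 2 + 1)) / 2
    return mean, median, math.sqrt(variance)

# Hypothetical table: the value 1 occurs twice, 2 three times, 3 four times, 4 once.
table = [(1, 2), (2, 3), (3, 4), (4, 1)]
mean, median, std = frequency_table_stats(table)
print(mean, median, std)
```

For grouped data (class intervals) the same idea applies with class midpoints in place of exact values, at the cost of some approximation.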

Statistical hypothesis testing is founded on the assessment of differences and similarities
between frequency distributions. This assessment involves measures of central tendency
or averages, such as the mean and median, and measures of variability or statistical
dispersion, such as the standard deviation or variance.

A frequency distribution is said to be skewed when its mean and median are different.
The kurtosis of a frequency distribution is the concentration of scores at the mean, or how
peaked the distribution appears if depicted graphically, for example, in a histogram. If
the distribution is more peaked than the normal distribution it is said to be leptokurtic; if
less peaked it is said to be platykurtic.

Letter frequency distributions are also used in frequency analysis to crack codes and refer
to the relative frequency of letters in different languages.

## Principles of class interval determination:

In musical set theory, an interval class (often abbreviated: ic), also known as unordered
pitch-class interval, interval distance, undirected interval, or (completely incorrectly)
interval mod 6 (Rahn 1980, 29; Whittall 2008, 273–74), is the shortest distance in pitch
class space between two unordered pitch classes. For example, the interval class between
pitch classes 4 and 9 is 5 because 9 − 4 = 5 is less than 4 − 9 = −5 ≡ 7 (mod 12).
The largest interval class is 6 since any
greater interval n may be reduced to 12 − n.

The concept of interval class was created to account for octave, enharmonic, and
inversion equivalency.

## 5.Write short notes on the following:

a. Type I error and type II error
b. One tailed and two tailed test
c. Selecting the significance level

Ans.
a. Type I error and type II error

In statistics, the terms type I error (also α error, false alarm rate (FAR) or false
positive) and type II error (β error, miss rate or false negative) are used to describe
possible errors made in a statistical decision process. In 1928, Jerzy Neyman (1894-1981)
and Egon Pearson (1895-1980), both eminent statisticians, discussed the problems
associated with "deciding whether or not a particular sample may be judged as likely to
have been randomly drawn from a certain population" (1928/1967, p. 1), and identified
"two sources of error", namely:

Type I (α): reject the null hypothesis when the null hypothesis is true, and
Type II (β): fail to reject the null hypothesis when the null hypothesis is false.

Type I error, also known as an "error of the first kind", an α error, or a "false
positive": the error of rejecting a null hypothesis when it is actually true. Plainly
speaking, it occurs when we are observing a difference when in truth there is none, thus
indicating a test of poor specificity. An example of this would be if a test shows that a
woman is pregnant when in reality she is not. Type I error can be viewed as the error of
excessive credulity.

Type II error, also known as an "error of the second kind", a β error, or a "false
negative": the error of failing to reject a null hypothesis when it is in fact not true. In
other words, this is the error of failing to observe a difference when in truth there is one,
thus indicating a test of poor sensitivity. An example of this would be if a test shows that
a woman is not pregnant, when in reality, she is. Type II error can be viewed as the error
of excessive skepticism.
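
The meaning of the type I error rate can be illustrated with a small simulation. This is a sketch under stated assumptions: the data are drawn from a standard normal distribution so that the null hypothesis (mean = 0) is true by construction, a two-sided z-test with known standard deviation 1 is used, and the function name, sample size and seed are arbitrary choices.

```python
import math
import random

def type_i_error_rate(trials=4000, n=30, z_crit=1.96, seed=0):
    """Fraction of trials in which a true null hypothesis (mean = 0) is wrongly rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z statistic for the sample mean when the standard deviation is known to be 1.
        z = (sum(sample) / n) / (1.0 / math.sqrt(n))
        if abs(z) > z_crit:  # 1.96 is the two-sided 5% critical value of the normal distribution
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(rate)  # expected to land near 0.05, the chosen alpha
```

Every rejection counted here is a false positive, so the observed fraction estimates α. Estimating β would require simulating data for which the null hypothesis is false.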

b. One-tailed and two-tailed tests

A one- or two-tailed t-test is determined by whether the total area of α is placed in
one tail or divided equally between the two tails. The one-tailed t-test is performed if the
results are interesting only if they turn out in a particular direction. The two-tailed t-test is
performed if the results would be interesting in either direction. The choice of a one- or
two-tailed t-test affects the hypothesis testing procedure in a number of different ways.

TWO-TAILED t-TESTS

A two-tailed t-test divides α in half, placing half in each tail. The null hypothesis in
this case is a particular value, and there are two alternative hypotheses, one positive and
one negative. The critical value of t, t_crit, is written with both a plus and minus sign (±).
For example, the critical value of t when there are ten degrees of freedom (df=10) and α is
set to .05 is t_crit = ±2.228. In the sampling distribution model used in a two-tailed t-test,
a rejection region of area α/2 lies in each tail.

ONE-TAILED t-TESTS

There are really two different one-tailed t-tests, one for each tail. In a one-tailed t-test, all
the area associated with α is placed in either one tail or the other. Selection of the tail
depends upon which direction t_obs would be (+ or -) if the results of the experiment came
out as expected. The selection of the tail must be made before the experiment is
conducted and analyzed.

In a one-tailed t-test in the positive direction, the value t_crit is positive. For example,
when α is set to .05 with ten degrees of freedom (df=10), t_crit is equal to +1.812.

In a one-tailed t-test in the negative direction, the value t_crit is negative. For example,
when α is set to .05 with ten degrees of freedom (df=10), t_crit is equal to -1.812.

## Comparison of One and Two-tailed t-tests

1. If t_obs = 3.37, then significance would be found in the two-tailed and the positive
one-tailed t-tests. The one-tailed t-test in the negative direction would not be significant,
because α was placed in the wrong tail. This is the danger of a one-tailed t-test.

2. If t_obs = -1.92, then significance would only be found in the negative one-tailed t-test.
If the correct direction is selected, it can be seen that one is more likely to reject the null
hypothesis. The significance test is said to have greater power in this case.
The selection of a one or two-tailed t-test must be made before the experiment is
performed. It is not "cricket" to find that t_obs = -1.92, and then say "I really meant to do
a one-tailed t-test." Because reviewers of articles submitted for publication are sometimes
suspicious when a one-tailed t-test is done, the recommendation is that if there is any
doubt, a two-tailed test should be done.
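
The comparison can be checked mechanically. The sketch below hard-codes the critical values quoted in the text for df = 10 and α = .05; the function and constant names are illustrative only.

```python
# Critical values for df = 10 and alpha = .05, as quoted in the text.
T_CRIT_TWO_TAILED = 2.228
T_CRIT_ONE_TAILED = 1.812

def is_significant(t_obs, test):
    """Return True if t_obs falls in the rejection region of the chosen test."""
    if test == "two-tailed":
        return abs(t_obs) > T_CRIT_TWO_TAILED
    if test == "positive":
        return t_obs > T_CRIT_ONE_TAILED
    if test == "negative":
        return t_obs < -T_CRIT_ONE_TAILED
    raise ValueError(f"unknown test: {test}")

for t_obs in (3.37, -1.92):
    results = {test: is_significant(t_obs, test)
               for test in ("two-tailed", "positive", "negative")}
    print(t_obs, results)
```

For t_obs = 3.37 the two-tailed and positive one-tailed tests are significant; for t_obs = -1.92 only the negative one-tailed test is.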

c. Selecting the significance level

Significance is commonly designated as:

plain ol' "significance"
"statistical significance"
"probability". This word, "probability", is the source of the letter that represents
significance, the letter "p".

The p value identifies the likelihood that a particular outcome may have occurred by
chance.

## 6. Explain Karl Pearson's coefficient of correlation. Calculate Karl Pearson's coefficient for the following data:

| X (Ht), cm | Y (Wt), kg |
|---|---|
| 174 | 61 |
| 175 | 65 |
| 176 | 67 |
| 177 | 68 |
| 178 | 72 |
| 182 | 74 |
| 183 | 80 |
| 186 | 87 |
| 189 | 92 |
| 193 | 95 |

In statistics, the Pearson product-moment correlation coefficient (sometimes
referred to as the PMCC, and typically denoted by r) is a measure of the correlation
(linear dependence) between two variables X and Y, giving a value between +1 and −1
inclusive. It is widely used in the sciences as a measure of the strength of linear
dependence between two variables.
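Pearson's correlation coefficient between two variables is defined as the covariance of the
two variables divided by the product of their standard deviations:

$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}$$

The above formula defines the population correlation coefficient, commonly represented
by the Greek letter ρ (rho). Substituting estimates of the covariances and variances based
on a sample gives the sample correlation coefficient, commonly denoted r:

$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2} \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$

An equivalent expression gives the correlation coefficient as the mean of the products of
the standard scores. Based on a sample of paired data (X_i, Y_i), the sample Pearson
correlation coefficient is

$$r = \frac{1}{n - 1} \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{s_X} \right) \left( \frac{Y_i - \bar{Y}}{s_Y} \right)$$

where

$$\frac{X_i - \bar{X}}{s_X}, \quad \bar{X}, \quad s_X$$

are the standard score, sample mean, and sample standard deviation.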
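
The calculation asked for in the question can be sketched as below. It assumes the heights in the data table read 174, 175, 176, 177, 178, 182, 183, 186, 189 and 193 cm (the table is split across two lines in the source, so this reading is a reconstruction), paired in order with the listed weights. The function name is illustrative.

```python
import math

# Assumed reading of the data table (heights in cm, weights in kg).
heights_cm = [174, 175, 176, 177, 178, 182, 183, 186, 189, 193]
weights_kg = [61, 65, 67, 68, 72, 74, 80, 87, 92, 95]

def pearson_r(x, y):
    """Sample Pearson correlation: sum of cross-deviations over the root of the product of squared deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

r = pearson_r(heights_cm, weights_kg)
print(round(r, 3))
```

With these values the means are 181.3 cm and 76.1 kg, the sums of squared deviations are 372.1 and 1264.9, the sum of cross-deviations is 676.7, and r comes out at about 0.986, indicating a strong positive linear relationship between height and weight.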