
# Research Design Lecture Notes UNIT II

The Design of Research - Research Design; Features of a Good Design; Different Research Designs; Measurement in Research; Data Types; Sources of Error.


12/27/2013


UNIT-II BBA II SEM RESEARCH METHODOLOGY
Prof. Amit Kumar, FIT Group of Institutions
Concept Of Sample
A sample is a subset of a population. Typically, the population is very large, making a census or a complete enumeration of all the values in the population impractical or impossible. The sample represents a subset of manageable size. Samples are collected and statistics are calculated from the samples so that one can make inferences or extrapolations from the sample to the population. This process of collecting information from a sample is referred to as sampling.
Sample Size
Sample size determination is the act of choosing the number of observations to include in a statistical sample. The sample size is an important feature of any empirical study in which the goal is to make inferences about a population from a sample. In practice, the sample size used in a study is determined based on the expense of data collection and the need to have sufficient statistical power. In complicated studies there may be several different sample sizes involved: for example, in survey sampling involving stratified sampling there would be a different sample size for each stratum. In a census, data are collected on the entire population, hence the sample size is equal to the population size. In experimental design, where a study may be divided into different treatment groups, there may be different sample sizes for each group.
Sample sizes may be chosen in several different ways:

Expedience - for example, including those items readily available or convenient to collect. A choice of small sample sizes, though sometimes necessary, can result in wide confidence intervals or risks of errors in statistical hypothesis testing.

Using a target variance for an estimate to be derived from the sample eventually obtained.

Using a target for the power of a statistical test to be applied once the sample is collected.

Larger sample sizes generally lead to increased precision when estimating unknown parameters. For example, if we wish to know the proportion of a certain species of fish that is infected with a pathogen, we would generally have a more accurate estimate of this proportion if we sampled and examined 200, rather than 100 fish. Several fundamental facts of mathematical statistics describe this phenomenon, including the law of large numbers and the central limit theorem.
In some situations, the increase in accuracy for larger sample sizes is minimal, or even non-existent. This can result from the presence of systematic errors or strong dependence in the data, or if the data follow a heavy-tailed distribution.
Sample sizes are judged based on the quality of the resulting estimates. For example, if a proportion is being estimated, one may wish to have the 95% confidence interval be less than 0.06 units wide. Alternatively, sample size may be assessed based on the power of a hypothesis test. For example, if we are comparing the support for a certain political candidate among women with the support for that candidate among men, we may wish to have 80% power to detect a difference in the support levels of 0.04 units.
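The two sizing criteria mentioned above (a 95% confidence interval narrower than 0.06 units, and 80% power to detect a difference of 0.04 units) can be turned into rough sample sizes. The following is a minimal sketch, not a definitive method: it assumes the usual normal approximation, a worst-case proportion of 0.5 for the interval, and hypothetical support levels of 0.48 and 0.52 for the two groups. The function names are illustrative only.

```python
import math
from statistics import NormalDist


def n_for_ci_width(total_width, confidence=0.95, p=0.5):
    """Sample size so a normal-approximation CI for a proportion is at most total_width wide."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    half_width = total_width / 2
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


def n_per_group_for_power(p1, p2, power=0.80, alpha=0.05):
    """Per-group sample size for a two-sided two-proportion z-test (unpooled variance)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


ci_n = n_for_ci_width(0.06)                  # assumed worst case p = 0.5
power_n = n_per_group_for_power(0.48, 0.52)  # assumed support levels, 0.04 apart
print(ci_n, power_n)
```

Under these assumptions the interval criterion needs a little over a thousand observations, while the power criterion needs roughly 2,450 per group, which illustrates how detecting a small difference can be far more demanding than estimating a single proportion.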
Sampling Process
The sampling process comprises several stages:

Defining the population of concern

Specifying a sampling frame, a set of items or events possible to measure

Specifying a sampling method for selecting items or events from the frame

Determining the sample size

Implementing the sampling plan

Sampling and data collecting
Population definition
Successful statistical practice is based on focused problem definition. In sampling, this includes defining the population from which our sample is drawn. A population can be defined as including all people or items with the characteristic one wishes to understand. Because there is very rarely enough time or money to gather information from everyone or everything in a population, the goal becomes finding a representative sample (or subset) of that population.
Sometimes that which defines a population is obvious. For example, a manufacturer needs to decide whether a batch of material from production is of high enough quality to be released to the customer, or should be sentenced for scrap or rework due to poor quality. In this case, the batch is the population.
Although the population of interest often consists of physical objects, sometimes we need to sample over time, space, or some combination of these dimensions. For instance, an investigation of supermarket staffing could examine checkout line length at various times, or a study on endangered penguins might aim to understand their usage of various hunting grounds over time. For the time dimension, the focus may be on periods or discrete occasions.
In other cases, our 'population' may be even less tangible. For example, Joseph Jagger studied the behaviour of roulette wheels at a casino in Monte Carlo, and used this to identify a biased wheel. In this case, the 'population' Jagger wanted to investigate was the overall behaviour of the wheel (i.e. the probability distribution of its results over infinitely many trials), while his 'sample' was formed from observed results from that wheel. Similar considerations arise when taking repeated measurements of some physical characteristic such as the electrical conductivity of copper.
This situation often arises when we seek knowledge about the cause system of which the observed population is an outcome. In such cases, sampling theory may treat the observed population as a sample from a larger 'superpopulation'. For example, a researcher might study the success rate of a new 'quit smoking' program on a test group of 100 patients, in order to predict the effects of the program if it were made available nationwide. Here the superpopulation is "everybody in the country, given access to this treatment" - a group which does not yet exist, since the program isn't yet available to all.
Sampling frame
In the most straightforward case, such as the sentencing of a batch of material from production (acceptance sampling by lots), it is possible to identify and measure every single item in the population and to include any one of them in our sample. However, in the more general case this is not possible. There is no way to identify all rats in the set of all rats. Where voting is not compulsory, there is no way to identify in advance which people will actually vote at a forthcoming election. These imprecise populations are not amenable to sampling in any of the ways described below, to which statistical theory could be applied.
As a remedy, we seek a sampling frame which has the property that we can identify every single element and include any in our sample. The most straightforward type of frame is a list of elements of the population (preferably the entire population) with appropriate contact information. For example, in an opinion poll, possible sampling frames include
Probability and non-probability sampling
A probability sampling scheme is one in which every unit in the population has a chance (greater than zero) of being selected in the sample, and this probability can be accurately determined. The combination of these traits makes it possible to produce unbiased estimates of population totals, by weighting sampled units according to their probability of selection.
Example: We want to estimate the total income of adults living in a given street. We visit each household in that street, identify all adults living there, and randomly select one adult from each household. (For example, we can allocate each person a random number, generated from a uniform distribution between 0 and 1, and select the person with the highest number in each household.) We then interview the selected person and find their income. People living on their own are certain to be selected, so we simply add their income to our estimate of the total. But a person living in a household of two adults has only a one-in-two chance of selection. To reflect this, when we come to such a household, we would count the selected person's income twice towards the total. (The person who is selected from that household can be loosely viewed as also representing the person who isn't selected.)
In the above example, not everybody has the same probability of selection; what makes it a probability sample is the fact that each person's probability is known. When every element in the population does have the same probability of selection, this is known as an 'equal probability of selection' (EPS) design. Such designs are also referred to as 'self-weighting' because all sampled units are given the same weight.
Probability sampling includes: Simple Random Sampling, Systematic Sampling, Stratified Sampling, Probability Proportional to Size Sampling, and Cluster or Multistage Sampling. These various ways of probability sampling have two things in common:

1. Every element has a known nonzero probability of being sampled, and

2. the procedure involves random selection at some point.
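The household income example above can be sketched in a few lines. This is an illustration under stated assumptions, not the notes' own procedure: the street and its incomes are hypothetical, and `rng.choice` stands in for the "highest random number" rule, since both give each adult in a household the same chance of selection. Each selected income is weighted by the inverse of its selection probability, which is the household size.

```python
import itertools
import random

# Hypothetical street: each inner list holds the incomes of the adults in one household.
households = [[30000], [20000, 40000], [10000, 50000, 60000]]
true_total = sum(sum(h) for h in households)


def estimate_total(households, rng):
    """Pick one adult per household at random and weight by 1 / P(selection) = household size."""
    return sum(len(h) * rng.choice(h) for h in households)


one_estimate = estimate_total(households, random.Random(0))

# Enumerate every equally likely combination of selected adults
# to get the exact expected value of the estimator.
all_estimates = [
    sum(len(h) * income for h, income in zip(households, picks))
    for picks in itertools.product(*households)
]
expected_estimate = sum(all_estimates) / len(all_estimates)
print(true_total, one_estimate, expected_estimate)
```

A single estimate can be well off the true total, but averaging over all possible selections recovers it exactly, which is what "unbiased" means here.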
Nonprobability sampling is any sampling method where some elements of the population have no chance of selection or where the probability of selection can't be accurately determined. It involves the selection of elements based on assumptions regarding the population of interest, which forms the criteria for selection. Hence, because the selection of elements is nonrandom, nonprobability sampling does not allow the estimation of sampling errors. These conditions give rise to exclusion bias, placing limits on how much information a sample can provide about the population. Information about the relationship between sample and population is limited, making it difficult to extrapolate from the sample to the population.
Example: We visit every household in a given street, and interview the first person to answer the door. In any household with more than one occupant, this is a nonprobability sample, because some people are more likely to answer the door (e.g. an unemployed person who spends most of their time at home is more likely to answer than an employed housemate who might be at work when the interviewer calls) and it's not practical to calculate these probabilities.
Nonprobability sampling methods include accidental sampling, quota sampling and purposive sampling. In addition, nonresponse effects may turn any probability design into a nonprobability design if the characteristics of nonresponse are not well understood, since nonresponse effectively modifies each element's probability of being sampled.
Sampling methods
Within any of the types of frame identified above, a variety of sampling methods can be employed, individually or in combination. Factors commonly influencing the choice between these designs include:

Nature and quality of the frame

Availability of auxiliary information about units on the frame

Accuracy requirements, and the need to measure accuracy

Whether detailed analysis of the sample is expected

Cost/operational concerns
Simple random sampling
In a simple random sample ('SRS') of a given size, all subsets of the frame of that size are given an equal probability of selection. Each element of the frame thus has an equal probability of selection: the frame is not subdivided or partitioned. Furthermore, any given pair of elements has the same chance of selection as any other such pair (and similarly for triples, and so on). This minimises bias and simplifies analysis of results. In particular, the variance between individual results within the sample is a good indicator of variance in the overall population, which makes it relatively easy to estimate the accuracy of results.
However, SRS can be vulnerable to sampling error because the randomness of the selection may result in a sample that doesn't reflect the makeup of the population. For instance, a simple random sample of ten people from a given country will on average produce five men and five women, but any given trial is likely to overrepresent one sex and underrepresent the other. Systematic and stratified techniques, discussed below, attempt to overcome this problem by using information about the population to choose a more representative sample.
SRS may also be cumbersome and tedious when sampling from an unusually large target population. In some cases, investigators are interested in research questions specific to subgroups of the population. For example, researchers might be interested in examining whether cognitive ability as a predictor of job performance is equally applicable across racial groups. SRS cannot accommodate the needs of researchers in this situation because it does not provide subsamples of the population. Stratified sampling, which is discussed below, addresses this weakness of SRS.
Simple random sampling is always an EPS design (equal probability of selection), but not all EPS designs are simple random sampling.
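As a small illustration of the points above, the sketch below draws a simple random sample from a hypothetical frame of 1,000 unit identifiers and computes how often a sample of ten would split exactly five and five. The second calculation assumes a large population that is half men and half women, so a binomial approximation applies; the frame and seed are illustrative only.

```python
import math
import random

# Hypothetical frame: 1,000 unit identifiers.
frame = list(range(1, 1001))

rng = random.Random(42)      # fixed seed only so the sketch is repeatable
srs = rng.sample(frame, 10)  # every 10-element subset of the frame is equally likely

# Chance that a sample of ten splits exactly five/five, assuming a large
# population that is half men and half women (binomial approximation).
p_even_split = math.comb(10, 5) / 2 ** 10
print(srs, p_even_split)
```

Under that assumption only about a quarter of samples are perfectly balanced, so roughly three samples in four overrepresent one sex, even though the design itself is unbiased.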
Systematic sampling
Systematic sampling relies on arranging the target population according to some ordering scheme and then selecting elements at regular intervals through that ordered list. Systematic sampling involves a random start and then proceeds with the selection of every kth element from then onwards. In this case, k = (population size / sample size). It is important that the starting point is not automatically the first in the list, but is instead randomly chosen from within the first to the kth element in the list. A simple example would be to select every 10th name from the telephone directory (an 'every 10th' sample, also referred to as 'sampling with a skip of 10').
As long as the starting point is randomized, systematic sampling is a type of probability sampling. It is easy to implement and the stratification induced can make it efficient, if the variable by which the list is ordered is correlated with the variable of interest. 'Every 10th' sampling is especially useful for efficient sampling from databases.
Example: Suppose we wish to sample people from a long street that starts in a poor district (house #1) and ends in an expensive district (house #1000). A simple random selection of addresses from this street could easily end up with too many from the high end and too few from the low end (or vice versa), leading to an unrepresentative sample. Selecting (e.g.) every 10th street number along the street ensures that the sample is spread evenly along the length of the street, representing all of these districts. (Note that if we always start at house #1 and end at #991, the sample is slightly biased towards the low end; by randomly selecting the start between #1 and #10, this bias is eliminated.)
However, systematic sampling is especially vulnerable to periodicities in the list. If periodicity is present and the period is a multiple or factor of the interval used, the sample is especially likely to be unrepresentative of the overall population, making the scheme less accurate than simple random sampling.
Example: Consider a street where the odd-numbered houses are all on the north (expensive) side of the road, and the even-numbered houses are all on the south (cheap) side. Under the sampling scheme given above, it is impossible to get a representative sample; either the houses sampled will all be from the odd-numbered, expensive side, or they will all be from the even-numbered, cheap side.
Another drawback of systematic sampling is that even in scenarios where it is more accurate than SRS, its theoretical properties make it difficult to quantify that accuracy. (In the two examples of systematic sampling that are given above, much of the potential sampling error is due to variation between neighbouring houses - but because this method never selects two neighbouring houses, the sample will not give us any information on that variation.)
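The street examples above can be reproduced with a short sketch. It assumes a frame of house numbers #1 to #1000 and a skip of 10, as in the text; the function name and the seed are illustrative only. The last step checks the periodicity problem: with odd houses on the north side and even houses on the south, every selected house falls on the same side.

```python
import random


def systematic_sample(frame, k, rng):
    """Take every k-th element after a random start among the first k elements."""
    start = rng.randrange(k)  # 0-based index of the first selected element
    return frame[start::k]


houses = list(range(1, 1001))  # house numbers #1 to #1000
sample = systematic_sample(houses, 10, random.Random(7))

# Periodicity problem: with odd houses on one side and even houses on the other,
# a skip of 10 can only ever select houses from a single side.
sides = {"north" if h % 2 else "south" for h in sample}
print(len(sample), sample[:3], sides)
```

Whatever the random start, the sample contains 100 evenly spaced houses but only one side of the street, because the skip of 10 is a multiple of the period of 2.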