UNIT-II

Concept Of Sample

BBA II SEM

RESEARCH METHODOLOGY
A sample is a subset of a population. Typically, the population is very large, making a census or a complete enumeration of all the values in the population impractical or impossible. The sample represents a subset of manageable size. Samples are collected and statistics are calculated from the samples so that one can make inferences or extrapolations from the sample to the population. This process of collecting information from a sample is referred to as sampling.

Sample Size

Sample size determination is the act of choosing the number of observations to include in a statistical sample. The sample size is an important feature of any empirical study in which the goal is to make inferences about a population from a sample. In practice, the sample size used in a study is determined by the expense of data collection and the need for sufficient statistical power. In complicated studies there may be several different sample sizes involved: for example, in a survey involving stratified sampling there would be a different sample size for each stratum. In a census, data are collected on the entire population, hence the sample size is equal to the population size. In experimental design, where a study may be divided into different treatment groups, there may be different sample sizes for each group.

Sample sizes may be chosen in several different ways:

• Expedience - for example, including those items readily available or convenient to collect. A choice of small sample sizes, though sometimes necessary, can result in wide confidence intervals or risks of errors in statistical hypothesis testing.
• Using a target variance for an estimate to be derived from the sample eventually obtained.
• Using a target for the power of a statistical test to be applied once the sample is collected.

Larger sample sizes generally lead to increased precision when estimating unknown parameters. For example, if we wish to know the proportion of a certain species of fish that is infected with a pathogen, we would generally have a more accurate estimate of this proportion if we sampled and examined 200 rather than 100 fish. Several fundamental facts of mathematical statistics describe this phenomenon, including the law of large numbers and the central limit theorem. In some situations, the increase in accuracy for larger sample sizes is minimal, or even non-existent. This can result from the presence of systematic errors or strong dependence in the data, or if the data follow a heavy-tailed distribution.

Sample sizes are judged by the quality of the resulting estimates. For example, if a proportion is being estimated, one may wish to have the 95% confidence interval be less than 0.06 units wide. Alternatively, sample size may be assessed based on the power of a hypothesis test. For example, if we are comparing the support for a certain political candidate among women with the support for that candidate among men, we may wish to have 80% power to detect a difference in the support levels of 0.04 units.

Sampling Process

The sampling process comprises several stages:

• Defining the population of concern
• Specifying a sampling frame, a set of items or events possible to measure
• Specifying a sampling method for selecting items or events from the frame
• Determining the sample size
• Implementing the sampling plan
• Sampling and data collecting

Population definition

Successful statistical practice is based on focused problem definition. In sampling, this includes defining the population from which our sample is drawn. A population can be defined as including all people or items with the characteristic one wishes to understand. Because there is very rarely enough time or money to gather information from everyone or everything in a population, the goal becomes finding a representative sample (or subset) of that population. Sometimes that which defines a population is obvious. For example, a manufacturer needs to decide whether a batch of material from production is of high enough quality to be released to the customer, or should be sentenced for scrap or rework due to poor quality. In this case, the batch is the population. Although the population of interest often consists of physical objects, sometimes we need to sample over time, space, or some combination of these dimensions. For instance, an investigation of supermarket staffing could examine checkout line length at various times, or a study on endangered penguins might aim to understand their usage of various hunting grounds over time. For the time dimension, the focus may be on periods or discrete occasions. In other cases, our 'population' may be even less tangible. For example, Joseph Jagger studied the behaviour of roulette wheels at a casino in Monte Carlo, and used this to identify a biased wheel. In this case, the 'population' Jagger wanted to

Prof. Amit Kumar FIT Group of Institutions

investigate was the overall behaviour of the wheel (i.e. the probability distribution of its results over infinitely many trials), while his 'sample' was formed from observed results from that wheel. Similar considerations arise when taking repeated measurements of some physical characteristic such as the electrical conductivity of copper. This situation often arises when we seek knowledge about the cause system of which the observed population is an outcome. In such cases, sampling theory may treat the observed population as a sample from a larger 'superpopulation'. For example, a researcher might study the success rate of a new 'quit smoking' program on a test group of 100 patients, in order to predict the effects of the program if it were made available nationwide. Here the superpopulation is "everybody in the country, given access to this treatment" - a group which does not yet exist, since the program isn't yet available to all.

Sampling frame

In the most straightforward case, such as the sentencing of a batch of material from production (acceptance sampling by lots), it is possible to identify and measure every single item in the population and to include any one of them in our sample. However, in the more general case this is not possible. There is no way to identify all rats in the set of all rats. Where voting is not compulsory, there is no way to identify which people will actually vote at a forthcoming election (in advance of the election). These imprecise populations are not amenable to sampling in any of the ways below, to which we could apply statistical theory. As a remedy, we seek a sampling frame which has the property that we can identify every single element and include any in our sample. The most straightforward type of frame is a list of elements of the population (preferably the entire population) with appropriate contact information. For example, in an opinion poll, possible sampling frames include an electoral register and a telephone directory.

Probability and non-probability sampling

A probability sampling scheme is one in which every unit in the population has a chance (greater than zero) of being selected in the sample, and this probability can be accurately determined. The combination of these traits makes it possible to produce unbiased estimates of population totals, by weighting sampled units according to their probability of selection.

Example: We want to estimate the total income of adults living in a given street. We visit each household in that street, identify all adults living there, and randomly select one adult from each household. (For example, we can allocate each person a random number, generated from a uniform distribution between 0 and 1, and select the person with the highest number in each household.) We then interview the selected person and find their income. People living on their own are certain to be selected, so we simply add their income to our estimate of the total. But a person living in a household of two adults has only a one-in-two chance of selection. To reflect this, when we come to such a household, we would count the selected person's income twice towards the total. (The person who is selected from that household can be loosely viewed as also representing the person who isn't selected.)

In the above example, not everybody has the same probability of selection; what makes it a probability sample is the fact that each person's probability is known. When every element in the population does have the same probability of selection, this is known as an 'equal probability of selection' (EPS) design. Such designs are also referred to as 'self-weighting' because all sampled units are given the same weight. Probability sampling includes: Simple Random Sampling, Systematic Sampling, Stratified Sampling, Probability Proportional to Size Sampling, and Cluster or Multistage Sampling. These various ways of probability sampling have two things in common:

1. Every element has a known nonzero probability of being sampled.
2. They involve random selection at some point.

Nonprobability sampling is any sampling method where some elements of the population have no chance of selection, or where the probability of selection can't be accurately determined. It involves the selection of elements based on assumptions regarding the population of interest, which forms the criteria for selection. Hence, because the selection of elements is nonrandom, nonprobability sampling does not allow the estimation of sampling errors. These conditions give rise to exclusion bias, placing limits on how much information a sample can provide about the population. Information about the relationship between sample and population is limited, making it difficult to extrapolate from the sample to the population.

Example: We visit every household in a given street, and interview the first person to answer the door. In any household with more than one occupant, this is a nonprobability sample, because some people are more likely to answer the door (e.g. an unemployed person who spends most of their time at home is more likely to answer than an employed housemate who might be at work when the interviewer calls) and it's not practical to calculate these probabilities. Nonprobability sampling methods include accidental sampling, quota sampling and purposive sampling. In addition, nonresponse effects may turn any probability design into a nonprobability design if the characteristics of nonresponse are not well understood, since nonresponse effectively modifies each element's probability of being sampled.

Sampling methods

Within any of the types of frame identified above, a variety of sampling methods can be employed, individually or in combination. Factors commonly influencing the choice between these designs include:

• Nature and quality of the frame
• Availability of auxiliary information about units on the frame
• Accuracy requirements, and the need to measure accuracy
• Whether detailed analysis of the sample is expected
• Cost/operational concerns

Simple random sampling

In a simple random sample ('SRS') of a given size, all such subsets of the frame are given an equal probability. Each element of the frame thus has an equal probability of selection: the frame is not subdivided or partitioned. Furthermore, any given pair of elements has the same chance of selection as any other such pair (and similarly for triples, and so on). This minimises bias and simplifies analysis of results. In particular, the variance between individual results within the sample is a good indicator of variance in the overall population, which makes it relatively easy to estimate the accuracy of results.

However, SRS can be vulnerable to sampling error because the randomness of the selection may result in a sample that doesn't reflect the makeup of the population. For instance, a simple random sample of ten people from a given country will on average produce five men and five women, but any given trial is likely to overrepresent one sex and underrepresent the other. Systematic and stratified techniques, discussed below, attempt to overcome this problem by using information about the population to choose a more representative sample. SRS may also be cumbersome and tedious when sampling from an unusually large target population.

In some cases, investigators are interested in research questions specific to subgroups of the population. For example, researchers might be interested in examining whether cognitive ability as a predictor of job performance is equally applicable across racial groups. SRS cannot accommodate the needs of researchers in this situation because it does not provide subsamples of the population. Stratified sampling, which is discussed below, addresses this weakness of SRS. Simple random sampling is always an EPS design (equal probability of selection), but not all EPS designs are simple random sampling.

Systematic sampling

Systematic sampling relies on arranging the target population according to some ordering scheme and then selecting elements at regular intervals through that ordered list. Systematic sampling involves a random start and then proceeds with the selection of every kth element from then onwards, where k = (population size / sample size). It is important that the starting point is not automatically the first in the list, but is instead randomly chosen from within the first to the kth element in the list. A simple example would be to select every 10th name from the telephone directory (an 'every 10th' sample, also referred to as 'sampling with a skip of 10'). As long as the starting point is randomized, systematic sampling is a type of probability sampling. It is easy to implement, and the stratification induced can make it efficient if the variable by which the list is ordered is correlated with the variable of interest. 'Every 10th' sampling is especially useful for efficient sampling from databases.

Example: Suppose we wish to sample people from a long street that starts in a poor district (house #1) and ends in an expensive district (house #1000). A simple random selection of addresses from this street could easily end up with too many from the high end and too few from the low end (or vice versa), leading to an unrepresentative sample. Selecting (e.g.) every 10th street number along the street ensures that the sample is spread evenly along the length of the street, representing all of these districts. (Note that if we always start at house #1 and end at #991, the sample is slightly biased towards the low end; by randomly selecting the start between #1 and #10, this bias is eliminated.)

However, systematic sampling is especially vulnerable to periodicities in the list. If periodicity is present and the period is a multiple or factor of the interval used, the sample is especially likely to be unrepresentative of the overall population, making the scheme less accurate than simple random sampling.

Example: Consider a street where the odd-numbered houses are all on the north (expensive) side of the road, and the even-numbered houses are all on the south (cheap) side. Under the sampling scheme given above, it is impossible to get a representative sample; either the houses sampled will all be from the odd-numbered, expensive side, or they will all be from the even-numbered, cheap side.

Another drawback of systematic sampling is that even in scenarios where it is more accurate than SRS, its theoretical properties make it difficult to quantify that accuracy. (In the two examples of systematic sampling given above, much of the potential sampling error is due to variation between neighbouring houses - but because this method never selects two neighbouring houses, the sample will not give us any information on that variation.)
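A minimal sketch of the 'random start, then every kth element' scheme described above (the helper name is illustrative, and it assumes the population size is an exact multiple of the sample size):

```python
import random

def systematic_sample(population, sample_size):
    """Pick a random start within the first k elements, then every kth element."""
    k = len(population) // sample_size      # sampling interval k = N / n
    start = random.randint(0, k - 1)        # randomized start avoids the 'always #1' bias
    return population[start::k][:sample_size]

houses = list(range(1, 1001))               # house #1 (poor end) .. #1000 (expensive end)
sample = systematic_sample(houses, 100)     # an 'every 10th' sample along the street
print(sample[:3], len(sample))
```

Because the start is randomized between the 1st and the kth element, every house has the same one-in-ten probability of selection, making this an EPS probability sample.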

As described above, systematic sampling is an EPS method, because all elements have the same probability of selection (in the example given, one in ten). It is not 'simple random sampling' because different subsets of the same size have different selection probabilities - e.g. the set {4,14,24,...,994} has a one-in-ten probability of selection, but the set {4,13,24,34,...} has zero probability of selection. Systematic sampling can also be adapted to a non-EPS approach; for an example, see the discussion of PPS samples below.

Stratified sampling

Where the population embraces a number of distinct categories, the frame can be organized by these categories into separate "strata." Each stratum is then sampled as an independent sub-population, out of which individual elements can be randomly selected. There are several potential benefits to stratified sampling.

First, dividing the population into distinct, independent strata can enable researchers to draw inferences about specific subgroups that may be lost in a more generalized random sample.

Second, utilizing a stratified sampling method can lead to more efficient statistical estimates (provided that strata are selected based upon relevance to the criterion in question, instead of availability of the samples). Even if a stratified sampling approach does not lead to increased statistical efficiency, such a tactic will not result in less efficiency than would simple random sampling, provided that each stratum is proportional to the group's size in the population.

Third, it is sometimes the case that data are more readily available for individual, pre-existing strata within a population than for the overall population; in such cases, using a stratified sampling approach may be more convenient than aggregating data across groups (though this may potentially be at odds with the previously noted importance of utilizing criterion-relevant strata).

Finally, since each stratum is treated as an independent population, different sampling approaches can be applied to different strata, potentially enabling researchers to use the approach best suited (or most cost-effective) for each identified subgroup within the population.

There are, however, some potential drawbacks to using stratified sampling. First, identifying strata and implementing such an approach can increase the cost and complexity of sample selection, as well as leading to increased complexity of population estimates. Second, when examining multiple criteria, stratifying variables may be related to some, but not to others, further complicating the design and potentially reducing the utility of the strata. Finally, in some cases (such as designs with a large number of strata, or those with a specified minimum sample size per group), stratified sampling can potentially require a larger sample than would other methods (although in most cases, the required sample size would be no larger than would be required for simple random sampling).

A stratified sampling approach is most effective when three conditions are met:

1. Variability within strata is minimized.
2. Variability between strata is maximized.
3. The variables upon which the population is stratified are strongly correlated with the desired dependent variable.

Advantages over other sampling methods

1. Focuses on important subpopulations and ignores irrelevant ones.
2. Allows use of different sampling techniques for different subpopulations.
3. Improves the accuracy/efficiency of estimation.
4. Permits greater balancing of statistical power of tests of differences between strata by sampling equal numbers from strata varying widely in size.

Disadvantages

1. Requires selection of relevant stratification variables, which can be difficult.
2. Is not useful when there are no homogeneous subgroups.
3. Can be expensive to implement.

Post-stratification

Stratification is sometimes introduced after the sampling phase in a process called "post-stratification". This approach is typically implemented due to a lack of prior knowledge of an appropriate stratifying variable, or when the experimenter lacks the necessary information to create a stratifying variable during the sampling phase. Although the method is susceptible to the pitfalls of post hoc approaches, it can provide several benefits in the right situation. Implementation usually follows a simple random sample. In addition to allowing for stratification on an ancillary variable, post-stratification can be used to implement weighting, which can improve the precision of a sample's estimates.

Oversampling

Choice-based sampling is one of the stratified sampling strategies. In choice-based sampling, the data are stratified on the target and a sample is taken from each stratum so that the rare target class will be more represented in the sample. The model is then built on this biased sample. The effects of the input variables on the target are often estimated with more precision with the choice-based sample, even when a smaller overall sample size is taken, compared

to a random sample. The results usually must be adjusted to correct for the oversampling.

Probability proportional to size sampling

In some cases the sample designer has access to an "auxiliary variable" or "size measure", believed to be correlated to the variable of interest, for each element in the population. These data can be used to improve accuracy in sample design. One option is to use the auxiliary variable as a basis for stratification, as discussed above. Another option is probability-proportional-to-size ('PPS') sampling, in which the selection probability for each element is set to be proportional to its size measure, up to a maximum of 1. In a simple PPS design, these selection probabilities can then be used as the basis for Poisson sampling. However, this has the drawback of variable sample size, and different portions of the population may still be over- or under-represented due to chance variation in selections. To address this problem, PPS may be combined with a systematic approach.

Example: Suppose we have six schools with populations of 150, 180, 200, 220, 260, and 490 students respectively (total 1500 students), and we want to use student population as the basis for a PPS sample of size three. To do this, we could allocate the first school numbers 1 to 150, the second school 151 to 330 (= 150 + 180), the third school 331 to 530, and so on to the last school (1011 to 1500). We then generate a random start between 1 and 500 (equal to 1500/3) and count through the school populations by multiples of 500. If our random start was 137, we would select the schools which have been allocated numbers 137, 637, and 1137, i.e. the first, fourth, and sixth schools.

The PPS approach can improve accuracy for a given sample size by concentrating the sample on large elements that have the greatest impact on population estimates. PPS sampling is commonly used for surveys of businesses, where element size varies greatly and auxiliary information is often available - for instance, a survey attempting to measure the number of guest-nights spent in hotels might use each hotel's number of rooms as an auxiliary variable. In some cases, an older measurement of the variable of interest can be used as an auxiliary variable when attempting to produce more current estimates.

Cluster sampling

Sometimes it is more cost-effective to select respondents in groups ('clusters'). Sampling is often clustered by geography, or by time periods. (Nearly all samples are in some sense 'clustered' in time - although this is rarely taken into account in the analysis.) For instance, if surveying households within a city, we might choose to select 100 city blocks and then interview every household within the selected blocks. Clustering can reduce travel and administrative costs. In the example above, an interviewer can make a single trip to visit several households in one block, rather than having to drive to a different block for each household. It also means that one does not need a sampling frame listing all elements in the target population. Instead, clusters can be chosen from a cluster-level frame, with an element-level frame created only for the selected clusters. In the example above, the sample only requires a block-level city map for initial selections, and then a household-level map of the 100 selected blocks, rather than a household-level map of the whole city.

Cluster sampling generally increases the variability of sample estimates above that of simple random sampling, depending on how the clusters differ between themselves, as compared with the within-cluster variation. For this reason, cluster sampling requires a larger sample than SRS to achieve the same level of accuracy - but cost savings from clustering might still make this a cheaper option.

Cluster sampling is commonly implemented as multistage sampling. This is a complex form of cluster sampling in which two or more levels of units are embedded one in the other. The first stage consists of constructing the clusters that will be used to sample from. In the second stage, a sample of primary units is randomly selected from each cluster (rather than using all units contained in all selected clusters). In following stages, in each of those selected clusters, additional samples of units are selected, and so on. All ultimate units (individuals, for instance) selected at the last step of this procedure are then surveyed. This technique is thus essentially the process of taking random subsamples of preceding random samples. Multistage sampling can substantially reduce sampling costs, where the complete population list would otherwise need to be constructed (before other sampling methods could be applied). By eliminating the work involved in describing clusters that are not selected, multistage sampling can reduce the large costs associated with traditional cluster sampling.

Quota sampling

In quota sampling, the population is first segmented into mutually exclusive sub-groups, just as in stratified sampling. Then judgement is used to select the subjects or units from each segment based on a specified proportion. For example, an interviewer may be told to sample 200 females and 300 males between the ages of 45 and 60. It is this second step which makes the technique one of nonprobability sampling. In quota sampling the selection of the sample is non-random. For example, interviewers might be tempted to interview those who look most helpful. The problem is that these samples may be biased because not

everyone gets a chance of selection. This nonrandom element is its greatest weakness, and quota versus probability sampling has been a matter of controversy for many years.

Convenience sampling or Accidental Sampling

Convenience sampling (sometimes known as grab or opportunity sampling) is a type of nonprobability sampling which involves the sample being drawn from that part of the population which is close to hand. That is, a population is selected because it is readily available and convenient. It may be through meeting the person, or including a person in the sample when one meets them, or finding them through technological means such as the internet or phone. The researcher using such a sample cannot scientifically make generalizations about the total population from this sample because it would not be representative enough. For example, if the interviewer were to conduct such a survey at a shopping center early in the morning on a given day, the people that he/she could interview would be limited to those present there at that given time, which would not represent the views of other members of society in such an area, if the survey were to be conducted at different times of day and several times per week. This type of sampling is most useful for pilot testing. Several important considerations for researchers using convenience samples include:

1. Are there controls within the research design or experiment which can serve to lessen the impact of a non-random convenience sample, thereby ensuring the results will be more representative of the population?
2. Is there good reason to believe that a particular convenience sample would or should respond or behave differently than a random sample from the same population?
3. Is the question being asked by the research one that can adequately be answered using a convenience sample?

In social science research, snowball sampling is a similar technique, where existing study subjects are used to recruit more subjects into the sample. Some variants of snowball sampling, such as respondent-driven sampling, allow calculation of selection probabilities and are probability sampling methods under certain conditions.

Different Types of Sample

There are five different types of sample, explained below with their advantages and disadvantages.

1. Simple Random Sample

Obtaining a genuine random sample is difficult. We usually use Random Number Tables, and use the following procedure:

1. Number the population from 0 to n.
2. Pick a random place in the number table.
3. Work in a random direction.
4. Organise numbers into the required number of digits (e.g. if the size of the population is 80, use 2 digits).
5. Reject any numbers not applicable (in our example, numbers between 80 and 99).
6. Continue until the required number of samples has been collected.
7. If the sample is "without replacement", discard any repetitions of any number.
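The steps above can be sketched roughly as follows, using the example of a population of 80 and a sample of 5; the digit table here is mocked with a random stream, and the function name is illustrative:

```python
import random

def srs_from_digit_table(digits, population_size, sample_size):
    """Group the digit stream into 2-digit numbers (steps 1-4), reject numbers
    outside the population (step 5), skip repeats for sampling without
    replacement (step 7), and stop once enough are collected (step 6)."""
    width = len(str(population_size - 1))
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        number = int(digits[i:i + width])
        if number >= population_size or number in chosen:
            continue
        chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen

table = ''.join(random.choice('0123456789') for _ in range(200))  # mock number table
picked = srs_from_digit_table(table, 80, 5)
print(picked)    # five distinct numbers in the range 0..79
```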

Advantages: The sample will be free from bias (i.e. it's random!).

Disadvantages: Difficult to obtain. Due to its very randomness, "freak" results can sometimes be obtained that are not representative of the population. In addition, these freak results may be difficult to spot. Increasing the sample size is the best way to eradicate this problem.
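The link between sample size and precision was quantified earlier: for an estimated proportion, one may target a 95% confidence interval less than 0.06 units wide. Under the usual normal approximation (taking p = 0.5 as the worst case, an assumption), the required sample size can be sketched as:

```python
import math

def sample_size_for_proportion(width, z=1.96, p=0.5):
    """Smallest n giving an approximate 95% CI for a proportion no wider
    than `width` (normal approximation; p = 0.5 is the worst case)."""
    half_width = width / 2
    return math.ceil(z**2 * p * (1 - p) / half_width**2)

# For a 95% CI less than 0.06 units wide (the example given earlier):
print(sample_size_for_proportion(0.06))   # 1068
```

Halving the interval width roughly quadruples the required sample size, which is why eradicating "freak" results by enlarging the sample can become expensive.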

2. Systematic Sample

With this method, items are chosen from the population according to a fixed rule, e.g. every 10th house along a street. This method should yield a more representative sample than the random sample (especially if the sample size is small). It seeks to eliminate sources of bias, e.g. an inspector checking sweets on a conveyor belt might unconsciously favour red sweets. However, a systematic method can also introduce bias, e.g. the period chosen might coincide with the period of a faulty machine, thus yielding an unrepresentative number of faulty sweets.


Advantages: Can eliminate other sources of bias.

Disadvantages: Can introduce bias where the pattern used for the samples coincides with a pattern in the population.
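The coincidence-of-patterns problem can be demonstrated with the odd/even street example from the earlier systematic sampling discussion: because the street's period (2) divides the sampling interval (10), every possible sample falls entirely on one side of the road.

```python
# Odd house numbers on the north (expensive) side, even on the south (cheap) side.
houses = list(range(1, 101))
north_side = {h for h in houses if h % 2 == 1}

# Try every possible random start for an 'every 10th' sample.
for start in range(1, 11):
    sample = houses[start - 1::10]
    sides = {h in north_side for h in sample}
    assert len(sides) == 1    # each sample is all-north or all-south, never mixed
print("all ten possible samples are one-sided")
```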

3. Stratified Sampling

The population is broken down into categories, and a random sample is taken of each category. The proportions of the sample sizes are the same as the proportion of each category to the whole.

Advantages: Yields more accurate results than simple random sampling. Can show different tendencies within each category (e.g. men and women).

Disadvantages: Nothing major, hence it's used a lot.

4. Quota Sampling

As with stratified samples, the population is broken down into different categories. However, the size of the sample of each category does not reflect the population as a whole. This can be used where an unrepresentative sample is desirable (e.g. you might want to interview more children than adults for a survey on computer games), or where it would be too difficult to undertake a stratified sample.

Advantages: Simpler to undertake than a stratified sample. Sometimes a deliberately biased sample is desirable.

Disadvantages: Not a genuine random sample. Likely to yield a biased result.
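A proportional stratified sample, as described above, can be sketched as follows (the helper name is hypothetical; note that rounding each stratum's share may not always sum exactly to the target size):

```python
import random

def stratified_sample(strata, total_sample_size):
    """Sample each stratum in proportion to its share of the population."""
    population_size = sum(len(units) for units in strata.values())
    return {name: random.sample(units,
                                round(total_sample_size * len(units) / population_size))
            for name, units in strata.items()}

# 600 men and 400 women; a sample of 100 keeps the 60:40 proportion.
strata = {'men': list(range(600)), 'women': list(range(400))}
result = stratified_sample(strata, 100)
print(len(result['men']), len(result['women']))   # 60 40
```

A quota sample would use the same category structure but fix the per-category counts by judgement rather than by population proportion.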

5. Cluster Sampling

Used when populations can be broken down into many different categories, or clusters (e.g. church parishes). Rather than taking a sample from each cluster, a random selection of clusters is chosen to represent the whole. Within each cluster, a random sample is taken.

Advantages: Less expensive and time-consuming than a fully random sample; can show "regional" variations.
Disadvantages: Not a genuine random sample; likely to yield a biased result (especially if only a few clusters are sampled).
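Stratified and cluster sampling, as described above, can both be sketched in Python. This is an illustrative sketch: the helper names and the men/women split are my own, while the proportional-allocation rule and the church-parish example follow the text.

```python
import random

def stratified_sample(strata, total_n):
    """Sample each stratum in proportion to its share of the population."""
    pop_size = sum(len(members) for members in strata.values())
    return {name: random.sample(members, round(total_n * len(members) / pop_size))
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, per_cluster):
    """Randomly choose whole clusters, then take a random sample within each."""
    chosen = random.sample(sorted(clusters), n_clusters)
    return {name: random.sample(clusters[name], per_cluster) for name in chosen}

# Stratified: a 10-person sample preserves the 60/40 split of the population.
population = {"men": [f"m{i}" for i in range(60)],
              "women": [f"w{i}" for i in range(40)]}
picked = stratified_sample(population, 10)
print({name: len(s) for name, s in picked.items()})   # {'men': 6, 'women': 4}

# Cluster: choose 2 of 5 church parishes, then 5 people within each chosen parish.
parishes = {p: [f"{p}-{i}" for i in range(30)] for p in "ABCDE"}
chosen = cluster_sample(parishes, 2, 5)
print({name: len(s) for name, s in chosen.items()})
```

The contrast between the two is visible in the code: stratified sampling touches every category, while cluster sampling discards most clusters entirely, which is why the notes flag it as cheaper but more prone to bias.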

Types of Data:

Data can be classified as either primary or secondary.

Primary and Secondary Data

Primary Data


Primary data means original data that has been collected specially for the purpose in mind: an authorized organization, an investigator or an enumerator collects the data for the first time from the original source. Data collected this way is called primary data. Research in which one gathers this kind of data is referred to as field research. For example: your own questionnaire.

Secondary Data
Secondary data is data that has been collected for another purpose and is being reused, usually in a different context. When we apply a statistical method to primary data collected for another purpose, we refer to it as secondary data; one purpose's primary data is another purpose's secondary data. Research in which one gathers this kind of data is referred to as desk research. For example: data from a book.

METHODS OF PRIMARY DATA COLLECTION

1) OBSERVATION METHOD
Observation becomes a scientific tool and a method of data collection for the researcher when it serves a formulated research purpose, is systematically planned and recorded, and is subjected to checks and controls on validity and reliability. Under the observation method, information is sought by way of the investigator's own direct observation, without asking the respondent. Example: in a study relating to consumer behaviour, the investigator, instead of asking the brand of wristwatch used by the respondent, may himself look at the watch.

ADVANTAGES
1. The method eliminates subjective bias.
2. The information obtained under this method relates to what is currently happening; it is not complicated by past behaviour or by future intentions and attitudes.
3. This method is independent of respondents' willingness to respond, and as such is relatively less demanding of active co-operation on the part of the respondents than the interview or questionnaire method.
4. This method is particularly suitable in studies which deal with subjects who are not capable of giving verbal reports of their feelings for one reason or another.

DISADVANTAGES
1. It is an expensive method.
2. The information provided by this method is very limited.
3. Sometimes unforeseen factors may interfere with the observational task.
4. The fact that some people are rarely accessible to direct observation creates an obstacle for this method to collect data effectively.

2) SURVEYS
Surveys are concerned with describing, recording, analyzing and interpreting conditions that exist or existed. The researcher does not manipulate the variables or arrange for events to happen. Surveys are only concerned with conditions or relationships that exist, opinions that are held, processes that are going on, effects that are evident, or trends that are developing. They are primarily concerned with the present, but at times they consider past events and influences as they relate to current conditions.
1. Survey-type researches usually have larger samples because percentages of responses generally happen to be low, as low as 20 to 30%, especially in mailed questionnaire studies. Thus, the survey method gathers data from a relatively large number of cases at a particular time; it is essentially cross-sectional.

2. Surveys are conducted in the case of descriptive research studies, and are usually appropriate in the social and behavioural sciences because many types of behaviour that interest the researcher cannot be arranged in a realistic setting.
3. Surveys are an example of field research and are concerned with hypothesis formulation and testing, and with analysis of the relationships between non-manipulated variables.
4. Surveys may be either census or sample surveys. They may also be classified as social surveys, economic surveys, and public opinion surveys. Whatever their type, the method of data collection happens to be either observation, interview, questionnaire, opinionnaire or some projective technique. The case method may also be used.
5. In the case of surveys, the research design must be rigid, must make economical provision for protection against bias and must maximize reliability; the aim is to obtain complete and accurate information.
6. Possible relationships between the data and the unknowns in the universe can be studied through surveys.

3) OBSERVATIONS
Observation is a primary method of collecting data by human, mechanical, electrical or electronic means, with direct or indirect contact. As per Langley P, "Observations involve looking and listening very carefully. We all watch other people sometimes but we do not usually watch them in order to discover particular information about their behavior. This is what observation in social science involves." Observation is the main source of information in field research. The researcher goes into the field and observes the conditions in their natural state. There are many types of observation: direct or indirect, participant or non-participant, obtrusive or non-obtrusive, structured or non-structured. Observation is important because the actual behavior of people is observed, not what people say they did or feel. For example, people may say they value health but pick up food they know to be fatty. It is useful when the subject cannot provide information or can only provide inaccurate information, as with people addicted to drugs. At the same time, in observation the researcher does not get any insight into what people may be thinking.

4) IN-DEPTH TECHNIQUES
In a survey, usually general questions are asked to learn what customers or subjects do and think. But if one wants to know why they feel that way, one has to conduct in-depth research. In a survey, answers may depend on the mood of the respondent; as such, the survey shows how one feels at one particular time. In in-depth research, long and probing interviews are taken to find out about customer satisfaction and loyalty, usage, awareness, brand recognition etc., as discussed below.

5) FOCUS GROUP
A group is formed of 8 to 12 persons, selected keeping in view the targeted market. The group members are asked about their attitude towards a concept, product, service, packaging or advertisement. Questions are asked in an interactive group setting where members are free to talk to each other, and a moderator guides the discussion. Through a one-way mirror, the client or its representative observes the discussion and interprets facial expressions and body language. There are some drawbacks: the moderator or researcher has less control, and the discussion can drift into irrelevant topics. Moreover, individual members consciously or unconsciously conform to what they perceive to be the consensus of the group, a situation called "groupthink". Technology has given rise to the "modern group", where group members participate online and can share financial and operating data, pictures, voices, drawings etc.

6) PANELS
These are more or less like focus groups, but focus groups are formed for a one-time discussion to decide a particular issue. On the contrary, panels are of a long-term nature, meeting frequently to resolve an issue. These can be

current customers or potential customers, and can be static or dynamic (members coming and going). Another difference is that panels are selected by the organizers through certain criteria such as education, exposure and interest. Of course, there are exceptions, like the Microsoft panel for research and evaluation of their software, which is formed through open invitation.

7) IN-DEPTH INTERVIEWS
In-depth interviews, also called one-to-one interviews, are expensive in terms of time and money but are good for exploring several factors. Problems identified in an interview may be a symptom of a more serious problem. The interviews may be conducted face to face or by telephone, or they could be computer-assisted interviews. Nowadays, television interviews have become more common. But interviews are fraught with bias from three sources: the interviewer, the interviewee and the interview setting. The interviewer may misinterpret the response or distort it while writing it down, and may unintentionally encourage certain responses through gestures and facial expressions. The interviewee may not give his or her true opinion or may avoid difficult questions. The setting may be good or bad, creating comfort or discomfort; it may be open or in the presence of colleagues or seniors, or the level of trust may be inadequate. In order to minimize bias, the interviewer should have knowledge, skill and confidence, and rapport and trust should be established in the interview.

8) PROJECTIVE METHODS
A projective method is a psychological test in which a subject's responses to ambiguous or unstructured standard stimuli, such as a series of cartoons, abstract patterns, or incomplete sentences, are analyzed in order to determine underlying personality traits and feelings. This entails indirect questioning, which enables the respondent to "project beliefs and feelings onto a third party". The respondents are expected to interpret the situation through their own experience, attitude and personality, and so express hidden opinions and emotions. Of the many techniques, word association, sentence completion and ink-blot tests are very common. In these techniques, both verbal and non-verbal responses (hesitation, time-lag and facial expression) are noted and interpreted. Such tests are useful for finding out consumer preferences, buying attitudes and behavior. Eventually, these are used for product development or for finding out the reason for failure of an apparently efficient product.

Secondary Data Collection Methods:
The four methods of secondary data collection are as follows:
1) Internet search: using online resources to gather data for research purposes. This method is not usually very reliable and requires appropriate citation and critical analysis of findings.
2) Library search and indexing: this technique requires going through written texts that have already done similar work and utilizing their research for your dissertation.
3) Data collection organizations: for example, Gallup and AC Nielsen conduct research on a recurrent basis across a wide array of topics.
4) Newspapers, magazines, journals and other similar periodicals.

Questionnaire construction
A questionnaire is a series of questions asked to individuals to obtain statistically useful information about a given topic. When properly constructed and responsibly administered, questionnaires become a vital instrument by which statements can be made about specific groups of people or entire populations. Questionnaires are frequently used in quantitative marketing research and social research. They are a valuable method of collecting a wide range of information from a large number of individuals, often referred to as respondents. Adequate questionnaire construction is critical to the

success of a survey. Inappropriate questions, incorrect ordering of questions, incorrect scaling, or bad questionnaire format can make the survey valueless, as it may not accurately reflect the views and opinions of the participants. A useful method for checking a questionnaire and making sure it is accurately capturing the intended information is to pretest it among a smaller subset of target respondents.

Questionnaire construction issues
• Know how (and whether) you will use the results of your research before you start. If, for example, the results won't influence your decision, or you can't afford to implement the findings, or the cost of the research outweighs its usefulness, then save your time and money; don't bother doing the research.
• The research objectives and frame of reference should be defined beforehand, including the questionnaire's context of time, budget, manpower, intrusion and privacy.
• How (randomly or not) and from where (your sampling frame) you select the respondents will determine whether you will be able to generalize your findings to the larger population.
• The nature of the expected responses should be defined and retained for interpretation of the responses, be they preferences (of products or services), facts, beliefs, feelings, descriptions of past behavior, or standards of action.
• Unneeded questions are an expense to the researcher and an unwelcome imposition on the respondents. All questions should contribute to the objective(s) of the research.
• If you "research backwards" and determine what you want to say in the report (e.g., Package A is more/less preferred by X% of the sample vs. Package B, and Y% compared to Package C), then even though you don't know the exact answers yet, you will be certain to ask all the questions you need, and only the ones you need, in such a way (metrics) as to write your report.
• The topics should fit the respondents' frame of reference. Their background may affect their interpretation of the questions. Respondents should have enough information or expertise to answer the questions truthfully.
• The type of scale, index, or typology to be used should be determined.
• The level of measurement you use will determine what you can do with and conclude from the data. If the response option is yes/no, then you will only know how many or what percent of your sample answered yes or no; you cannot conclude what the average respondent answered.
• The types of questions (closed, multiple-choice, open) should fit the statistical data analysis techniques available and your goals.
• Questions and prepared responses to choose from should be neutral as to intended outcome. A biased question or questionnaire encourages respondents to answer one way rather than another. Even questions without bias may leave respondents with expectations.
• The order or "natural" grouping of questions is often relevant; previous questions may bias later questions.
• The wording should be kept simple: no technical or specialized words.
• The meaning should be clear. Ambiguous words, equivocal sentence structures and negatives may cause misunderstanding, possibly invalidating questionnaire results. Double negatives should be reworded as positives.
• If a survey question actually contains more than one issue, the researcher will not know which one the respondent is answering. Care should be taken to ask one question at a time.
• The list of possible responses should be collectively exhaustive. Respondents should not find themselves with no category that fits their situation. One solution is to use a final category for "other ________".
• The possible responses should also be mutually exclusive. Categories should not overlap. Respondents should not find themselves in more than one category, for example in both the "married" category and the "single" category; there may be a need for separate questions on marital status and living situation.
• Writing style should be conversational, yet concise, accurate and appropriate to the target audience.
• Many people will not answer personal or intimate questions. For this reason, questions about age, income, marital status, etc. are generally placed at the end of the survey. This way, even if the respondent refuses to answer these "personal" questions, he/she will have already answered the research questions.
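The "collectively exhaustive" and "mutually exclusive" checks can be automated for numeric response categories such as age brackets. This is an illustrative sketch (the function name and brackets are my own, not from the notes): it flags overlapping brackets and values no bracket covers.

```python
def check_categories(brackets, domain):
    """Report overlaps (not mutually exclusive) and gaps (not collectively
    exhaustive) for response categories given as (low, high) inclusive ranges."""
    brackets = sorted(brackets)
    # Adjacent sorted brackets overlap if one ends at or after the next begins.
    overlaps = [(a, b) for a, b in zip(brackets, brackets[1:]) if a[1] >= b[0]]
    covered = set()
    for lo, hi in brackets:
        covered.update(range(lo, hi + 1))
    gaps = sorted(set(domain) - covered)
    return overlaps, gaps

# Flawed age brackets: 18-25 and 25-34 both claim age 25 (a respondent aged 25
# fits two categories), and nobody aged 65+ has a category at all.
ages = [(18, 25), (25, 34), (35, 44), (45, 64)]
overlaps, gaps = check_categories(ages, range(18, 81))
print(overlaps)   # [((18, 25), (25, 34))]
print(gaps[:3])   # [65, 66, 67]
```

Fixing the brackets to (18, 24), (25, 34), ... and adding a final "65 and over" (or "other") category clears both problems.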

Prof. Amit Kumar FIT Group of Institutions

Page 11

UNIT-II

BBA II SEM

RESEARCH METHODOLOGY
• "Loaded" questions evoke emotional responses and may skew results.
• Presentation of the questions on the page (or computer screen) and the use of white space, colors, pictures, charts, or other graphics may affect the respondent's interest or distract from the questions.
• Numbering of questions may be helpful.
• Questionnaires can be administered by research staff, by volunteers, or self-administered by the respondents. Clear, detailed instructions are needed in each case, matching the needs of each audience.

Types of questions
1. Contingency questions - A question that is answered only if the respondent gives a particular response to a previous question. This avoids asking questions of people to whom they do not apply (for example, asking men if they have ever been pregnant).
2. Matrix questions - Identical response categories are assigned to multiple questions. The questions are placed one under the other, forming a matrix with response categories along the top and a list of questions down the side. This is an efficient use of page space and respondents' time.
3. Closed-ended questions - Respondents' answers are limited to a fixed set of responses. Most scales are closed-ended. Other types of closed-ended questions include:
o Yes/no questions - The respondent answers with a "yes" or a "no".
o Multiple choice - The respondent has several options from which to choose.
o Scaled questions - Responses are graded on a continuum (example: rate the appearance of the product on a scale from 1 to 10, with 10 being the most preferred appearance). Examples of types of scales include the Likert scale, semantic differential scale, and rank-order scale.
4. Open-ended questions - No options or predefined categories are suggested. The respondent supplies their own answer without being constrained by a fixed set of possible responses. Examples of types of open-ended questions include:
o Completely unstructured - For example, "What is your opinion on questionnaires?"
o Word association - Words are presented and the respondent mentions the first word that comes to mind.
o Sentence completion - Respondents complete an incomplete sentence. For example, "The most important consideration in my decision to buy a new house is . . ."
o Story completion - Respondents complete an incomplete story.
o Picture completion - Respondents fill in an empty conversation balloon.
o Thematic apperception test - Respondents explain a picture or make up a story about what they think is happening in the picture.

Question sequence
• Questions should flow logically from one to the next. The researcher must ensure that the answer to a question is not influenced by previous questions.
• Questions should flow from the more general to the more specific.
• Questions should flow from the least sensitive to the most sensitive.
• Questions should flow from factual and behavioural questions to attitudinal and opinion questions.
• Questions should flow from unaided to aided questions.
• According to the three-stage theory (also called the sandwich theory), initial questions should be screening and rapport questions; in the second stage you ask all the product-specific questions; in the last stage you ask demographic questions.
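A contingency question is essentially skip logic, and it can be represented as data. This is an illustrative sketch (the field names and helper are my own): each question may carry a condition naming an earlier question and the answer that makes it applicable, and inapplicable questions are filtered out.

```python
# Each question is a dict; "ask_if" holds an optional contingency:
# (earlier question id, answer that makes this question applicable).
questions = [
    {"id": "q1", "text": "Have you ever been pregnant?", "options": ["yes", "no"]},
    {"id": "q2", "text": "How many times?", "ask_if": ("q1", "yes")},
]

def applicable(questions, answers):
    """Return only the questions that apply, given the answers collected so far."""
    out = []
    for q in questions:
        cond = q.get("ask_if")
        if cond is None or answers.get(cond[0]) == cond[1]:
            out.append(q)
    return out

print([q["id"] for q in applicable(questions, {"q1": "no"})])    # ['q1']
print([q["id"] for q in applicable(questions, {"q1": "yes"})])   # ['q1', 'q2']
```

This mirrors the pregnancy example in the notes: a respondent who answers "no" (or a male respondent) is never shown the follow-up question.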

