# PRESENTED BY: ASHWINI POKHARKAR, ROHIT PANDEY, SWAPNIL MUKE, APOORVA DAVE, PEEYUSH KHANDEKAR, SHAILAJA PATIL

• To understand sampling and its advantages
• To know the sampling process
• To know the types of sampling, i.e. probability and non-probability sampling
• To understand the factors to consider when determining sample size
• To know sampling errors

Sampling is the process of selecting a small number of elements from a larger defined target group of elements, such that the information gathered from the small group allows judgments to be made about the larger group.

Advantages of sampling:
• Saves time and effort
• Saves money
• More accurate measurement:
  I. Inspection fatigue is reduced.
  II. Sampling error can be studied and controlled, and probability statements can be made about its magnitude.

[Figure: the sampling process. Labels: Units of Analysis (people); Target Population of Interest; Sampling Frame (list or rule defining the population); Actual Population to Which Generalizations Are Made (defined/listed by the sampling frame); Method of Selection; Target Sample (list of target sample); Response Rate; Sample (the people actually studied); Generalization.]

Population of interest is entirely dependent on the Management Problem, Research Problems, and Research Design.
Some Bases for Defining Population:
◦ Geographic Area
◦ Demographics
◦ Usage/Lifestyle
◦ Awareness

Sampling frame:
• A list of population elements (people, companies, houses, cities, etc.) from which units to be sampled can be selected.
• It is difficult to get an accurate list.
• Sample frame error occurs when certain elements of the population are accidentally omitted or not included on the list.

Probability sampling:
• Simple random sampling
• Systematic random sampling
• Stratified random sampling
• Cluster sampling

Non-probability sampling:
• Convenience sampling
• Snowball sampling
• Judgment sampling
• Quota sampling

PROBABILITY SAMPLES
• A probability sample is one in which each element of the population has a known non-zero probability of selection.
• Not a probability sample if some elements of the population cannot be selected (have zero probability).
• Not a probability sample if probabilities of selection are not known.

SAMPLING FRAME IS CRUCIAL IN PROBABILITY SAMPLING
• If the sampling frame is a poor fit to the population of interest, random sampling from that frame cannot fix the problem.
• The sampling frame is non-randomly chosen. Elements not in the sampling frame have zero probability of selection.
• Generalizations can be made ONLY to the actual population defined by the sampling frame.

SIMPLE RANDOM SAMPLING
• Each element in the population has an equal probability of selection, and each combination of elements has an equal probability of selection.
• Names drawn out of a hat
• Random numbers to select elements from an ordered list
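The "random numbers to select elements from an ordered list" idea can be sketched in a few lines of Python. This is only an illustration: the 64-element frame and the function name `simple_random_sample` are assumptions, not part of the slides.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements so that every subset of size n is equally likely."""
    rng = random.Random(seed)          # seeded only to make the sketch repeatable
    return rng.sample(list(frame), n)  # sampling without replacement

# Hypothetical sampling frame: an ordered list of 64 numbered elements.
frame = list(range(1, 65))
sample = simple_random_sample(frame, 8, seed=42)
print(sample)
```

Because `random.sample` draws without replacement, no element can appear twice, which mirrors drawing names out of a hat.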

STRATIFIED RANDOM SAMPLING-1
• Divide the population into groups that differ in important ways.
• The basis for grouping must be known before sampling.
• Select a random sample from within each group.
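A minimal sketch of these three steps, assuming a made-up population of 30 people tagged "urban" or "rural" and an equal number drawn from each stratum (the slides do not say how the sample is allocated across strata, so equal allocation is an assumption here):

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Group units by stratum, then draw a simple random sample inside each group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # grouping basis is known up front
    return {
        name: rng.sample(members, min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }

# Hypothetical population: (id, region) pairs, 20 urban and 10 rural.
people = [(i, "urban" if i % 3 else "rural") for i in range(1, 31)]
picked = stratified_sample(people, lambda person: person[1], 4, seed=1)
for region, members in picked.items():
    print(region, members)
```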

SYSTEMATIC RANDOM SAMPLING
• Each element has an equal probability of selection, but combinations of elements have different probabilities.
• Population size N, desired sample size n, sampling interval k=N/n.
• Randomly select a number j between 1 and k, sample element j and then every k th element thereafter: j+k, j+2k, ...
• Example: N=64, n=8, k=64/8=8. Random j=3.
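The slide's example (N=64, n=8, random start j=3) can be worked through in code. This sketch assumes N is an exact multiple of n, as in the example; the function name `systematic_sample` is illustrative.

```python
import random

def systematic_sample(N, n, j=None, seed=None):
    """Return the 1-based positions j, j+k, j+2k, ... with interval k = N // n."""
    k = N // n                                   # sampling interval
    if j is None:
        j = random.Random(seed).randint(1, k)    # random start between 1 and k
    return [j + i * k for i in range(n)]

# The slide's example with the random start fixed at j = 3.
positions = systematic_sample(64, 8, j=3)
print(positions)  # [3, 11, 19, 27, 35, 43, 51, 59]
```

Only k different samples are possible (one per starting point), which is why combinations of elements do not all have the same probability.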

RANDOM CLUSTER SAMPLING
• Population is divided into groups, usually geographic or organizational.
• Some of the groups are randomly chosen.
• In pure cluster sampling, the whole cluster is sampled.
• In simple multistage cluster sampling, there is random sampling within each randomly chosen cluster.
Contd.

• Population is divided into groups.
• Some of the groups are randomly selected.
• For a given sample size, a cluster sample has more error than a simple random sample.
• Cost savings of clustering may permit a larger sample.
• Error is smaller if the clusters are similar to each other.
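The difference between pure and simple multistage cluster sampling can be sketched as follows. The six villages of ten households each are invented for illustration, and `cluster_sample` is a hypothetical helper name.

```python
import random

def cluster_sample(clusters, n_clusters, n_within=None, seed=None):
    """Randomly choose clusters; take all members (pure) or a random subset (multistage)."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # stage 1: pick clusters
    result = {}
    for name in chosen:
        members = clusters[name]
        if n_within is None:
            result[name] = list(members)                # pure: whole cluster
        else:                                           # stage 2: sample inside the cluster
            result[name] = rng.sample(members, min(n_within, len(members)))
    return result

# Hypothetical population: 6 villages, 10 households in each.
villages = {f"village_{v}": [f"v{v}_h{h}" for h in range(1, 11)] for v in range(1, 7)}
pure = cluster_sample(villages, 2, seed=7)
multistage = cluster_sample(villages, 2, n_within=3, seed=7)
print({name: len(members) for name, members in pure.items()})
print(multistage)
```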

STRATIFIED CLUSTER SAMPLING
• Reduce the error in cluster sampling by creating strata of clusters.
• Sample one cluster from each stratum.
• Combines the cost savings of clustering with the error reduction of stratification.

• Convenience sampling relies upon convenience and access.
• Judgment sampling relies upon the belief that participants fit characteristics.
• Quota sampling emphasizes representation of specific characteristics.
• Snowball sampling relies upon respondent referrals of others with like characteristics.

• Sampling error is any type of bias that is attributable to mistakes in either drawing a sample or determining the sample size.
• Sampling error is the amount of inaccuracy in an estimated value that arises because only a portion of a population is observed.
• It occurs due to faulty selection of random data while sampling.
• Data collected in the sample may not truly represent the basic structure of the population.

• Non-sampling error covers the deviations from the true value that are not a function of the sample chosen, including systematic errors.
• Non-sampling errors are present in sample surveys as well as census surveys.
Non-sampling error occurs when there is:
• An aim of the survey that is not very clear
• An imperfect questionnaire
• Non-coherent answers by respondents
• Inadequate knowledge
• Prestige problems

How many completed questionnaires do we need to have a representative sample? Generally the larger the better, but that takes more time and money.
Three criteria usually need to be specified to determine the appropriate sample size:
◦ The level of precision.
◦ Confidence level.
◦ Degree of variability.

The level of precision, sometimes called sampling error, is the range in which the true value of the population is estimated to be. This range is often expressed in percentage points (e.g., ±5 percent), in the same way that results for political campaign polls are reported by the media. Thus, if a researcher finds that 60% of farmers in the sample have adopted a recommended practice with a precision rate of ±5%, then he or she can conclude that between 55% and 65% of farmers in the population have adopted the practice.

The confidence or risk level is based on ideas encompassed under the Central Limit Theorem. The key idea encompassed in the Central Limit Theorem is that when a population is repeatedly sampled, the average value of the attribute obtained by those samples is equal to the true population value. Furthermore, the values obtained by these samples are distributed normally about the true value, with some samples having a higher value and some obtaining a lower score than the true population value.

In a normal distribution, approximately 95% of the sample values are within two standard deviations of the true population value (e.g., mean). In other words, this means that, if a 95% confidence level is selected, 95 out of 100 samples will have the true population value within the range of precision specified.
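The "95 out of 100 samples" reading can be checked with a small simulation. This is a sketch under assumptions: the true proportion (60%), the sample size (400), the number of repeated samples (2000) and the function name `coverage` are all chosen for illustration, and the margin uses the usual normal approximation z·sqrt(p(1-p)/n).

```python
import math
import random

def coverage(p_true=0.6, n=400, trials=2000, z=1.96, seed=0):
    """Fraction of repeated samples whose +/- margin interval contains p_true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        successes = sum(rng.random() < p_true for _ in range(n))   # one simulated sample
        p_hat = successes / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)            # range of precision
        if p_hat - margin <= p_true <= p_hat + margin:
            hits += 1
    return hits / trials

rate = coverage()
print(f"share of samples whose interval contains the true value: {rate:.3f}")
```

The printed share should land close to 0.95, though it will not be exactly 0.95 because the simulation itself is a finite sample.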

The degree of variability in the attributes being measured refers to the distribution of attributes in the population. The more heterogeneous a population, the larger the sample size required to obtain a given level of precision. The less variable (more homogeneous) a population, the smaller the sample size. Note that a proportion of 50% indicates a greater level of variability than either 20% or 80%. This is because 20% and 80% indicate that a large majority do not or do, respectively, have the attribute of interest.
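One common way to put a number on this is the quantity p(1-p), the variance of a yes/no attribute with proportion p. The short sketch below (the helper name `variability` is illustrative) shows that it peaks at 50% and is equal for 20% and 80%.

```python
def variability(p):
    """Variance of a yes/no attribute held by a proportion p of the population."""
    return p * (1 - p)

for p in (0.2, 0.5, 0.8):
    print(f"p = {p:.1f}  ->  p(1-p) = {variability(p):.2f}")
```

Since the required sample size grows in proportion to p(1-p), assuming p = 0.5 gives the most conservative (largest) sample size.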

• Larger sample sizes obviously produce better, more accurate estimates about populations.
• However, the extra cost and effort is not always needed, as smaller sample sizes can also produce significant results.
• It may be hard to find a random sample of people.

Factors to consider when determining sample size:
• Variability of the population characteristic under investigation
• Level of confidence desired in the estimate
• Degree of precision desired in estimating the population characteristic

The confidence interval (also called margin of error) is the plus-or-minus figure usually reported in newspaper or television opinion poll results. For example: if you use a confidence interval of 4 and 47% of your sample picks an answer, you can be "sure" that if you had asked the question of the entire relevant population, between 43% (47-4) and 51% (47+4) would have picked that answer.

The formula for calculating the sample size for a simple random sample without replacement is as follows:

$n = (z/m)^2 \, p(1-p)$

where z is the z value (e.g., 1.645 for 90% confidence level, 1.96 for 95% confidence level, and 2.575 for 99% confidence level), m is the margin of error (e.g., .07 = ±7%, .05 = ±5%, and .03 = ±3%), and p is the estimated value for the proportion of a sample that will respond a given way to a survey question (e.g., .50 for 50%).

Using our factors for the principal investigator population, PIs1, and solving for the sample size equation, we find: $n = (1.96/.05)^2 (.25) = (39.2)^2 (.25) = 1536.64 (.25) = 384.16 \approx 384$
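The calculation above can be reproduced with a short function. The name `sample_size` is illustrative; the inputs are the z values, margins of error and 50% proportion quoted in the text.

```python
def sample_size(z, m, p):
    """Sample size n = (z/m)^2 * p * (1 - p), before any finite population correction."""
    return (z / m) ** 2 * p * (1 - p)

# The worked example: 95% confidence (z = 1.96), +/-5% margin, p = 0.50.
n = sample_size(z=1.96, m=0.05, p=0.50)
print(f"n = {n:.2f}")  # n = 384.16, which the text reports as 384

# How the requirement changes with the margins of error listed in the text.
for m in (0.07, 0.05, 0.03):
    print(f"m = {m:.2f}  ->  n = {sample_size(1.96, m, 0.50):.1f}")
```

Tightening the margin of error from ±5% to ±3% nearly triples the required sample, because n grows with 1/m².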

Thus, the sample for PIs1 = 384, without using the finite population correction factor (FPC, explained below). The sample size equation solving for n' (the new sample size) when taking the FPC into account is $n' = n / (1 + n/N)$, where n is the sample size based on the calculations above, and N is the population size.

Calculating the new sample size for PIs1 using the formula above, we find: $n' = 384 / (1 + 384/N) = 375.37$
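The finite population correction is easy to apply in code. The population size for PIs1 does not appear in these slides, so the sketch below uses a purely hypothetical N = 10,000 and will not reproduce the 375.37 figure; `fpc_adjusted` is an illustrative name.

```python
def fpc_adjusted(n, N):
    """Corrected sample size n' = n / (1 + n/N) for a finite population of size N."""
    return n / (1 + n / N)

n = 384                    # uncorrected sample size from the earlier calculation
N_hypothetical = 10_000    # assumption for illustration only, not the PIs1 population

n_new = fpc_adjusted(n, N_hypothetical)
print(f"n' = {n_new:.2f}")

# The correction fades as the population gets large relative to the sample.
for N in (1_000, 10_000, 1_000_000):
    print(f"N = {N:>9,}  ->  n' = {fpc_adjusted(n, N):.1f}")
```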

MONGA  WIKIPEDIA  RESEARCH METHODOLOGY C.R.KOTHARI .S. STATISTIC AND ECONOMICS BY-G.