
BOOTSTRAPPING

Bootstrapping is a resampling technique that builds a sampling distribution for a statistic from
the empirical data rather than relying on a theoretical sampling distribution whose assumptions
may not hold. The bootstrap procedure consists of the following steps:

1. Pseudo-Population. Define a pseudo-population distribution for resampling. This is
usually defined as the empirical distribution of the sample data or of some appropriate
transformation of the data.
2. Resampling. Draw, with replacement, N independent random observations from the
pseudo-population, where N is the size of the original sample. These N observations comprise
a bootstrap resample. Compute the statistics of interest (e.g., mean, median, mode, standard
deviation, residual, r, R², b) for the resample.
3. Evaluation. Repeat the resampling, typically at least 1000 times, to produce multiple sets
of bootstrapped values for the statistics of interest. The distribution of bootstrapped
values is the bootstrap sampling distribution of the statistics. The mean or median of that
distribution serves as the bootstrap estimate of the population value. The upper and lower
tails of the distribution can be used for significance testing by establishing whether the
null hypothesis value falls below the 2.5th percentile or above the 97.5th percentile, for
example.
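
The three steps above can be sketched in a few lines of Python. This is a minimal illustration
rather than a reference implementation: the data values, the null-hypothesis value, the fixed
seed, and the helper names bootstrap_distribution and percentile are all assumptions made for
the example, and the linear interpolation rule for percentiles is one common convention among
several.

```python
import random
import statistics

def bootstrap_distribution(sample, statistic, n_resamples=1000, seed=0):
    """Return the statistic computed on n_resamples bootstrap resamples, sorted."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    values = []
    for _ in range(n_resamples):
        # Step 2: draw N observations with replacement from the sample
        resample = rng.choices(sample, k=n)
        values.append(statistic(resample))
    # Step 3: the sorted values form the bootstrap sampling distribution
    return sorted(values)

def percentile(sorted_values, p):
    """Linearly interpolated percentile (p in 0..100) of pre-sorted values."""
    position = (len(sorted_values) - 1) * p / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

# Step 1: the pseudo-population is the sample itself (hypothetical data)
sample = [12.1, 9.8, 11.4, 10.3, 13.7, 8.9, 10.9, 12.6, 9.5, 11.8]

boot_means = bootstrap_distribution(sample, statistics.mean)
low = percentile(boot_means, 2.5)
high = percentile(boot_means, 97.5)

null_value = 10.0  # hypothetical null-hypothesis value for the mean
reject_null = null_value < low or null_value > high
```

Any function of a list of numbers can be passed in place of statistics.mean, for example
statistics.median or statistics.stdev, which is what makes the procedure attractive for
statistics that lack a convenient theoretical sampling distribution.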

The success of the bootstrap method depends on how well the sample distribution resembles the
population distribution. This, in turn, depends on two things: (1) whether the sample is
randomly drawn; and (2) the sample size N, with larger N giving a closer resemblance. This may
make the bootstrap sound like simply another asymptotic method. However, there is evidence for
the superiority of the bootstrap for small samples. How small the sample can be is less
certain.

Reading Results in SPSS:

• Bias is the difference between the average value of the statistic across the bootstrap
samples and the value in the Statistic column. In this case, the mean value of Churn within
last month is computed for each of the 1000 bootstrap samples, the average of these means is
then computed, and the value in the Statistic column is subtracted from that average.

• Std. Error is the bootstrap estimate of the standard error of the mean value of Churn within
last month, that is, the standard deviation of the means across the 1000 bootstrap samples.

• The lower bound of the 95% bootstrap confidence interval is an interpolation of the 25th and
26th mean values of Churn within last month, if the 1000 bootstrap samples are sorted in
ascending order. The upper bound is an interpolation of the 975th and 976th mean values.
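
The three quantities described above can be reproduced by hand, as in the sketch below. The
churn data are simulated and hypothetical (not the data behind the SPSS output), and the helper
name interpolate is made up for the example. The text does not state the exact interpolation
rule SPSS applies, so the fractional ranks 0.025 × (B + 1) and 0.975 × (B + 1) are an
assumption, chosen only because they fall between the 25th/26th and 975th/976th sorted values.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical 0/1 churn indicator for 200 customers (not the handout's data)
churn = [1 if rng.random() < 0.27 else 0 for _ in range(200)]

statistic = statistics.mean(churn)  # the value shown in the "Statistic" column

# Mean of each of 1000 bootstrap resamples, sorted in ascending order
boot_means = sorted(
    statistics.mean(rng.choices(churn, k=len(churn))) for _ in range(1000)
)

# Bias: average of the bootstrap means minus the original statistic
bias = statistics.mean(boot_means) - statistic

# Std. Error: standard deviation of the bootstrap means
std_error = statistics.stdev(boot_means)

def interpolate(sorted_values, rank):
    """Value at a fractional 1-based rank, interpolating between neighbours."""
    lower = int(rank)
    fraction = rank - lower
    below = sorted_values[lower - 1]
    above = sorted_values[min(lower, len(sorted_values) - 1)]
    return below + fraction * (above - below)

# Assumed ranks: 25.025 (between the 25th and 26th means) and
# 975.975 (between the 975th and 976th means) when B = 1000
b = len(boot_means)
ci_lower = interpolate(boot_means, 0.025 * (b + 1))
ci_upper = interpolate(boot_means, 0.975 * (b + 1))
```

With an unbiased statistic such as the mean, the bias should come out close to zero and much
smaller than the standard error.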

References:
Yung, Y.-F., & Chan, W. (1999). Statistical analyses using bootstrapping: Concepts and
implementation. In R. H. Hoyle (Ed.), Statistical strategies for small sample research.
Thousand Oaks, CA: Sage Publications.

Mooney, C. Z., & Duval, R. D. (1993). Bootstrapping: A nonparametric approach to statistical
inference. Newbury Park, CA: Sage Publications.
