A confidence interval is the numerical interval around the mean of a sample from a population that has a certain confidence of including the mean of the entire population. “Say what?” OK, let’s take it one point at a time.

LIMITS OF CONFUSION

Say you collect 30 water samples from a lake. Oh wait. That use of the word sample will be confusing to some people. A sample is a portion of a population, but the word can refer to an individual piece of a population or a collection of pieces of the population (http://statswithcats.wordpress.com/2010/07/0). It’s like the word fish: one fish, two fish, school of fish, and so on.

Anyway, say you collect 30 aliquots (i.e., samples) of water from the lake and analyze the aliquots for iron content. Then, you sum the 30 iron concentrations and divide by 30 to get the mean iron concentration of your collection of aliquots (i.e., sample). But you don’t really care about the mean iron concentration of your collection of 30 aliquots. What you want to know is the average iron concentration of all the water in the lake. No problem. You can use the mean iron concentration of the 30 aliquots as an approximation of the mean iron concentration of the lake (population).

Now, that would be fine for most people except for neurotic individuals who don’t understand the Central Limit Theorem. These persons have a couple of options. They can go back to the lake and collect 30 more aliquots of water (this is sometimes referred to as a working vacation if the collection of fish samples is also involved), then recalculate the mean, and see what they get. They can do the same thing again, and again, and again (referred to as a vacation if the consumption of beer and potato chips is involved, http://statswithcats.wordpress.com/2010/07/26/samples-and-potato-chips/) until they have a good idea of what the lake’s mean iron concentration might be.

(Note: If the neurotic individuals can get someone else to pay for everything, they are called consultants. If the neurotic individuals can get everyone else to pay for everything, they are called politicians.)

For people who can’t afford to collect more samples of samples, there’s an alternative approach called resampling. It’s the computer equivalent of a cushy government contract for data collection. In a resampling approach, you would collect the 30 aliquots of lake water, analyze them for iron content, and calculate the mean of your sample. Then you would have specialized software randomly select a certain number of the original 30 samples (the process is called bootstrapping or jackknifing depending on how it’s done; feel free to google away) to create a new dataset, from which you could calculate a new mean iron concentration. Then do that again, and again, and again until you have enough means to say how variable the mean iron concentration is.

Cat whiskers are like confidence intervals. They let the cat know how big its spread is.

A third alternative, which involves no fishing, less computer time, and as much beer as you need, is to calculate a confidence interval. First, calculate the mean and standard deviation of the 30 iron concentrations. Then calculate a confidence interval around the sample mean using the formula

Sample Mean ± (Sample Standard Deviation DIVIDED BY square root of the Number of Samples) TIMES a t-value
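As a rough sketch, the formula translates into a few lines of Python. The function name confidence_interval and the input numbers are hypothetical. The t-value is passed in by the caller rather than computed, and the standard deviation is assumed to be the usual sample standard deviation (n minus 1 in the denominator).

```python
import math
import statistics

def confidence_interval(values, t_value):
    """Return (lower, upper) confidence limits around the sample mean."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    half_width = t_value * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Made-up measurements and an arbitrary t-value, just to exercise the function.
lower, upper = confidence_interval([48, 50, 52, 47, 53], t_value=2.0)
print(f"lower limit: {lower:.2f}, upper limit: {upper:.2f}")
```

Supplying the t-value as an argument keeps the sketch free of third-party libraries; statistical software can look it up for you instead.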

In the lake example, the mean, standard deviation, and number of samples would be calculated from the iron concentrations determined in the aliquots of lake water. The t-value would be calculated using software or selected from a table of values of the t-distribution on the basis of:

Degrees-of-freedom: the number of samples minus one. In this case, 30 water aliquots minus 1 equals 29.

Alpha: one minus the confidence that you won’t find any estimates of the mean outside the interval you calculate, divided by the number of limits you will calculate; in this case, two, because you want upper and lower limits. For 99% confidence, that is (1 - 0.99) / 2 = 0.005.

The boundaries of a confidence interval are called the upper confidence limit and the lower confidence limit.

For example, if:

Mean iron concentration were 50

Standard deviation of iron concentration were 10

t-value for 29 degrees-of-freedom (based on 30 iron concentrations) and alpha of .005 (based on 99% confidence for a two-sided limit) were 2.756

the 99% lower confidence limit would be 44.97 (i.e., 50 - (2.756 * (10/√30)))

and the 99% upper confidence limit would be 55.03 (i.e., 50 + (2.756 * (10/√30)))

You would have about 99% confidence that this interval would include the mean iron concentration of the lake.

But what if you think 45 to 55 is too wide a range for the lake’s mean iron concentration? What can you do? You could go back to the lake and collect another 30 samples and try again. Better yet, you could go back to the lake and take 120 or even more samples (http://statswithcats.wordpress.com/2010/07/11/30-samples-standard-suggestion-or-superstition/), but that’s a lot of expensive work vacation.

Look back at the formula for the confidence limits. The limits are calculated from the mean, the standard deviation, the number of samples, and the t-value. If you’re not going back to the lake, you can’t change the mean, the standard deviation, or the number of samples. That leaves the t-value. The t-value would be based on the degrees-of-freedom and the confidence. The degrees-of-freedom are determined from the number of samples, so that’s still no help. BUT, the choice of the confidence is yours.
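To see how much the choice of confidence matters, here is a small Python sketch using the example's summary values (mean 50, standard deviation 10, 30 aliquots). The t-values are my own readings from a standard two-sided t-table for 29 degrees-of-freedom, so treat them as assumptions to check against your own table or software. The names T_29 and limits are illustrative only.

```python
import math

MEAN, SD, N = 50, 10, 30  # summary values from the lake example

# Assumed t-values for 29 degrees-of-freedom, keyed by two-sided confidence level.
T_29 = {0.99: 2.756, 0.95: 2.045, 0.90: 1.699, 0.50: 0.683}

def limits(confidence):
    """Return (lower, upper) confidence limits for a tabulated confidence level."""
    half_width = T_29[confidence] * SD / math.sqrt(N)
    return MEAN - half_width, MEAN + half_width

for confidence in T_29:
    alpha = (1 - confidence) / 2  # the article's alpha: split across two limits
    lower, upper = limits(confidence)
    print(f"{confidence:.0%} confidence (alpha {alpha:.3f}): {lower:.2f} to {upper:.2f}")
```

Nothing changes between the rows except the t-value, so the narrowing of the interval comes entirely from accepting less confidence.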

Consider this. If you choose the confidence level to be:

99%, the confidence limits would be 44.97 to 55.03

95%, the confidence limits would be 46.27 to 53.73

90%, the confidence limits would be 46.90 to 53.10

Or for that matter,

50%, the confidence limits would be 48.75 to 51.25

although it wouldn’t be very useful if your interval only had a 50% chance of including the real mean iron concentration of the lake.

Consider the analogy of a nearsighted man playing a ring-toss game at a carnival. The location of the peg he will toss his ring at is like the mean of a population of possible measurements. The diameter of the peg is like the inherent variability of the population of measurements. The fuzziness with which he sees the peg because of his nearsightedness is like the additional variation associated with sampling, measurement, and environmental variability (http://statswithcats.wordpress.com/2010/08/01/there%E2%80%99s-something-about-variance/). The size of the ring he tosses is like the size of the confidence interval. If he wanted to be very confident that he could toss a ring over the peg, he would use a large ring to give him that confidence (i.e., the higher the confidence, the larger the confidence interval).

The man cannot change the location and diameter of the peg (i.e., the population values are fixed). However, he would have a greater chance of success if he could see better (i.e., extraneous variation in the samples is controlled, http://statswithcats.wordpress.com/2010/09/05/the-heart-and-soul-of-variance-control/ ; http://statswithcats.wordpress.com/2010/09/19/it%E2%80%99s-all-in-the-technique) or if he could use a very large ring (i.e., a relatively wide confidence interval). If the ring (the confidence interval) becomes too large, though, the game becomes meaningless. Thus, there must be some limits on how large the ring should be.

By convention, most statistical inferences, including confidence intervals, use a 95% confidence level. Sometimes either a 90% level (resulting in a smaller confidence interval) or a 99% level (resulting in a larger interval) is used. A 90% level would be more appropriate when the consequences of not including the true population value in the interval are relatively minor. Confirmatory inferences, on the other hand, often use a 99% confidence level. When in doubt, use 95%.

Some people dislike putting confidence limits around means they calculate. Limits show how imprecise data, and statistics calculated from them, actually are. But if you are going to make an informed decision, you have to know not just the evidence, but the reliability

Obsidian in a 90% confidence drawer.

