# Volume 3 Issue 10 October 2011 Researchers’ Corner

Confidence Level and Confidence Interval
A novice researcher who has not been exposed to the background is often confused by terms like confidence level and confidence interval. Further, terms like significance level, p-value, α-value, margin of error and so on appear in research papers as well as in the sampling and testing of data. Two concepts fundamental to all of these are the ‘precision’ and ‘reliability’ of statistical predictions.

In day-to-day life, we encounter plenty of predictions and guesswork by all sorts of people, ranging from professional astrologers to renowned futurologists. Predictions about who will win an election or a cricket match, or whether it will rain on a particular day, are quite common. Are these predictions scientifically based? Do they have enough precision and reliability for us to accept them confidently and act on them? How do we judge their precision and significance? These are natural questions any rationally thinking person would ask.

Let us look, from a layman’s angle, at what these two attributes of predictions, namely precision and reliability, mean and try to understand them with a simple example. Suppose a teacher asks four of his students to predict how many marks (out of 100) they will score in the examination just completed, before the results are announced. Their predictions are shown in the table. Column 2 of the table records the prediction of each student. The students are also asked to judge the chance of the prediction coming true, and this chance, as a percentage, is shown in column 4.

| (1) Student | (2) Predicted marks | (3) Range of prediction | (4) Chance (in %) that the prediction comes true |
|---|---|---|---|
| A | 75 | 0 | 50.0 |
| B | <90 | 0 - 89 | 99.9 |
| C | >70 | 71 - 100 | 95 |
| D | 70±5 | 65 - 75 | 98 |

All predictions in the table except that of student ‘A’ fall in a range of marks. It is quite natural that the chance of a prediction coming true increases as the range of the prediction widens. On the other hand, a binary prediction like true/false or pass/fail, as well as a pin-pointed prediction like that of student ‘A’, will certainly have a lower chance of coming true than a prediction that gives a range. The prediction of student ‘B’ is very liberal in the sense that he may score anywhere from 0 to 89 marks, and hence the chance of this happening is as high as 99.9 percent. The prediction of student ‘C’ is challenging, as he sets a lower limit of more than 70 marks, and the chance of it coming true is reasonably high (95%). Lastly, the prediction of student ‘D’ is very reasonable in terms of range, and the chance of it coming true is very high (98%).

Thus the predictions of ‘C’ and ‘D’ are quite meaningful in that they combine a precise range of prediction with a high chance of occurring. The range within which the expected or predicted value falls is called the ‘precision’ of the prediction, and the chance of the predicted value falling in that range is called the ‘reliability’ of the prediction. The reliability is expressed as the confidence level, and its complement is the significance level. That is, if the confidence level is 98%, the significance level is 2%. The confidence level tells how sure we can be; it is expressed as a percentage and represents how often the prediction lies within the confidence interval (i.e., the range). So any prediction should strike a balance between precision and confidence level. For example, a very precise prediction with a low confidence level, like that of student ‘A’, and a very imprecise prediction with a high confidence level, like that of student ‘B’, are both of little use in practical situations.
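The relation between confidence level, significance level and the width of the range can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes a two-sided interval and a normally distributed sampling error, and the function names (`significance_level`, `z_critical`, `interval_half_width`) and the standard error of 2.0 marks are hypothetical choices made for the example, not taken from the article.

```python
from statistics import NormalDist


def significance_level(confidence_level: float) -> float:
    """Return the significance level (alpha) as 1 minus the confidence level.

    Both values are fractions, so a 98% confidence level is passed as 0.98.
    """
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must lie strictly between 0 and 1")
    return 1.0 - confidence_level


def z_critical(confidence_level: float) -> float:
    """Two-sided critical value of the standard normal distribution.

    Assumption: the sampling error is approximately normal.
    """
    alpha = significance_level(confidence_level)
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def interval_half_width(confidence_level: float, standard_error: float) -> float:
    """Half-width of the interval: critical value times standard error."""
    return z_critical(confidence_level) * standard_error


if __name__ == "__main__":
    # Hypothetical standard error of 2.0 marks, used only for illustration.
    for level in (0.50, 0.95, 0.98, 0.999):
        alpha = significance_level(level)
        half_width = interval_half_width(level, standard_error=2.0)
        print(f"confidence={level:.3f}  significance={alpha:.3f}  "
              f"half-width={half_width:.2f} marks")
```

Running the loop shows the trade-off described above: for the same standard error, asking for a higher confidence level always gives a wider, that is less precise, range.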

This is what the theory of sampling distributions reveals: the range within which the results are expected to fall is the confidence interval, and the margin of error is half the width of that interval (the ‘±’ part of a prediction such as 70±5). By repeated sampling and/or increasing the sample size, the margin of error can be decreased (that is, precision can be increased). In practice, a researcher need not take samples repeatedly to arrive at the desired confidence interval or margin of error, as standard tables and even websites are available for obtaining them. The wider the confidence interval we are willing to accept, the more certain we can be that the answers of the whole population lie within that range. The confidence interval and the margin of error tell us the amount of error that we can tolerate, while the confidence level tells us how much uncertainty we can tolerate. A lower margin of error or a higher confidence level requires a larger sample size.
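The link between sample size, margin of error and confidence level can be made concrete with a small Python sketch. It is not the author's method: it assumes a simple random sample from a large population, a survey question answered as a proportion, the normal approximation, and the most conservative proportion of 0.5. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def z_critical(confidence_level: float) -> float:
    """Two-sided critical value of the standard normal distribution."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence_level) / 2.0)


def margin_of_error(confidence_level: float, sample_size: int,
                    proportion: float = 0.5) -> float:
    """Margin of error for an estimated proportion (normal approximation).

    Assumption: proportion=0.5 is the worst case, giving the widest margin.
    """
    z = z_critical(confidence_level)
    return z * math.sqrt(proportion * (1.0 - proportion) / sample_size)


def required_sample_size(confidence_level: float, margin: float,
                         proportion: float = 0.5) -> int:
    """Smallest sample size whose margin of error does not exceed `margin`."""
    z = z_critical(confidence_level)
    return math.ceil(z ** 2 * proportion * (1.0 - proportion) / margin ** 2)


if __name__ == "__main__":
    # A larger sample gives a smaller margin of error at the same confidence level.
    for n in (100, 400, 1600):
        print(f"n={n:5d}  margin at 95% = {margin_of_error(0.95, n):.3f}")

    # A smaller margin or a higher confidence level needs a larger sample.
    for level, margin in ((0.95, 0.05), (0.95, 0.03), (0.99, 0.05)):
        n = required_sample_size(level, margin)
        print(f"confidence={level:.2f}  margin={margin:.2f}  sample size={n}")
```

Under these assumptions, a margin of error of 5 percentage points at a 95% confidence level needs a sample of 385, which is the kind of figure the standard tables and websites mentioned above report.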

In future issues, we will look at the relation between sampling and precision, as well as how to determine the confidence interval or margin of error and the sample size.

M S Sridhar, sridhar@informindia.co.in

http://informindia.co.in/NewsLetter/Jgatenewsletter-october2011.swf