
PARAMETER ESTIMATION AND HYPOTHESES TESTING

Jothi Malar Panandam

When we conduct research, we are often interested in a particular population. However, we are, generally, unable to study the entire population (every member in the population) due to the cost, or amount of time and labour involved, or even due to the size and distribution of the population that makes this literally impossible. As such we use samples from the population and try to estimate the numeric values of characteristics (referred to as parameters) of the population, or make inferences about the population based on the sample statistics.

Parameter Estimation

The parameters (e.g. mean values) of a population are therefore generally unknown. However, they may be estimated from information obtained from random samples. If a single numerical value from a random sample, such as the mean birth weight of a sample of Boer goats, is used as an indicator of the population parameter (here, the mean birth weight of Boer goats in Malaysia), this is called point estimation. The sample mean is an unbiased estimator of the population mean. However, we must be aware that, due to sampling variation, the actual mean birth weight of Boer goats in Malaysia may differ from the estimated value. In fact, the mean birth weight cannot be specified with certainty unless the birth weight has been recorded for every Boer goat in the country.

The population parameter may also be estimated from the sample information as a range that is likely to include its value, rather than as a single value. For example, the mean birth weight of Boer goats in Malaysia may be estimated as lying within a range of values determined from the birth weights of the animals in a sample of Boer goats. This range or interval is called the confidence interval, and this form of inference about a population parameter is called interval estimation. The width of the confidence interval depends on the size of the sample(s) used in its estimation and on the desired level of confidence, that is, the probability that the parameter lies within the confidence interval. Other things being equal, a bigger sample gives a narrower interval, and a higher level of confidence gives a wider one. Often in interval estimation we use a confidence level of 0.95, or 95 % (i.e. 1 - α expressed as a percentage, where α = 0.05). This means that the confidence interval has been estimated from the sample statistics in such a way that we are 95 % confident that it includes the population parameter between its limits.
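As a rough illustration of point and interval estimation, the following Python sketch computes a sample mean and a 95 % confidence interval for the mean. The birth weights are hypothetical values made up for this example, and the critical value 2.262 is the tabled two-sided Student's t value for 9 degrees of freedom; the use of a t-based interval is an assumption of this sketch, not something prescribed in the text above.

```python
import math
import statistics

# Hypothetical birth weights (kg) of a random sample of 10 Boer kids.
# These numbers are illustrative only; they are not real data.
birth_weights = [3.2, 3.6, 3.4, 3.9, 3.1, 3.5, 3.8, 3.3, 3.7, 3.5]

n = len(birth_weights)
sample_mean = statistics.mean(birth_weights)   # point estimate of the population mean
sample_sd = statistics.stdev(birth_weights)    # sample standard deviation (n - 1 divisor)
standard_error = sample_sd / math.sqrt(n)      # standard error of the mean

# Two-sided Student's t critical value for 95 % confidence and
# n - 1 = 9 degrees of freedom, read from a standard t table.
T_CRIT_95_DF9 = 2.262

margin = T_CRIT_95_DF9 * standard_error
ci_lower = sample_mean - margin
ci_upper = sample_mean + margin

print(f"Point estimate of the mean: {sample_mean:.2f} kg")
print(f"95 % confidence interval: {ci_lower:.2f} to {ci_upper:.2f} kg")
```

Because the standard error is the standard deviation divided by the square root of n, a larger sample with similar variability would shrink the margin and so narrow the interval.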

Hypothesis Testing

We conduct experimental research because we wish to test a research hypothesis. The research hypothesis may be defined as a statement of belief about one or more populations, based on theory and previous research findings. For example, our research hypothesis may be that there would be a difference in the growth rates of Brakmas calves fed on two rations formulated using different feedstuffs, or that the Boer cross has a higher twinning rate than purebred Boer goats.

Statistical hypotheses, on the other hand, are used when the data from the experiment are analysed statistically to evaluate the research hypothesis. There are two types of statistical hypotheses, the null hypothesis and the alternative hypothesis. The null hypothesis, symbolised as Ho, is the hypothesis to be tested. It is the hypothesis of 'no difference' or 'no association', as it presumes that there is no difference among the treatment effects or the populations being compared, or that there is no association between the variables of interest. It is often the complement of the research hypothesis. For example, if we want to show that the two rations affect the growth rate of the Brakmas calves differently, Ho is formulated to state that there is no difference between the growth rates of the Brakmas calves fed on the two rations. This implies that any observed difference in the growth rates of the two treatment groups is merely due to sampling effect or random variation.

The alternative hypothesis (Ha) is the statement of what is taken to be true if the null hypothesis is rejected based on the sample data. Often, this hypothesis supports the research hypothesis. For the earlier example, Ha would state that there is a difference between the growth rates of the Brakmas calves fed on the two rations.

If we were to conduct a feeding trial on a random sample of male Brakmas calves of the same age to compare the effects of the two rations, it is very unlikely that the mean growth rates of the two groups of animals would be exactly the same. There are two possible explanations for the observed difference.
It could mean that there is no difference in the effects of the two rations on the growth of the calves, and the observed difference between the sample means is merely due to chance or sampling error. Alternatively, it could mean that there is a real difference in the effects of the two rations on growth rate.

Since only samples of the populations are used in the experiment, the decision to reject or not reject the null hypothesis is subject to error. In drawing our conclusion, we can make two kinds of error. We may reject a true null hypothesis and commit a Type I error, or we may fail to reject a false null hypothesis and commit a Type II error.

Inferential statistics (e.g. t test, analysis of variance) are used to decide whether or not to reject the null hypothesis. These use the rules of probability, parameter estimation, and the characteristics of theoretical distributions for this purpose. If the statistical test suggests that the observed results differ markedly from those expected due to chance or sampling error alone, the null hypothesis is rejected, and the observed difference is said to be statistically significant. When the observed results do not support the rejection of the null hypothesis, it does not necessarily mean that Ho is true. It could be that the sample size used in the experiment was not large enough to detect the difference or the association.

Inferential statistics cannot indicate whether a Type I or a Type II error has actually occurred; they merely show how likely these errors are to occur. The only way to reduce the risk of both types of error at the same time is to increase the sample size.
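The behaviour of the two error types can be explored with a small simulation. The Python sketch below is only illustrative: it assumes normally distributed observations with a known standard deviation of 1, uses a two-tailed z test rather than the t test mentioned above, and the group sizes, the true difference of 0.5 and the function names are all hypothetical choices made for this example.

```python
import random
from statistics import NormalDist

ALPHA = 0.05
SIGMA = 1.0                                    # assumed known standard deviation
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)   # two-tailed critical value

def rejects_null(n, true_difference, rng):
    """Simulate one two-group trial; return True if Ho is rejected."""
    group_a = [rng.gauss(0.0, SIGMA) for _ in range(n)]
    group_b = [rng.gauss(true_difference, SIGMA) for _ in range(n)]
    mean_a = sum(group_a) / n
    mean_b = sum(group_b) / n
    standard_error = SIGMA * (2 / n) ** 0.5    # SE of a difference of two means
    z = (mean_b - mean_a) / standard_error
    return abs(z) > Z_CRIT

def rejection_rate(n, true_difference, trials=2000, seed=1):
    """Proportion of simulated trials in which Ho is rejected."""
    rng = random.Random(seed)
    rejections = sum(rejects_null(n, true_difference, rng) for _ in range(trials))
    return rejections / trials

# Ho is true (no real difference): every rejection is a Type I error.
type_i_rate = rejection_rate(n=10, true_difference=0.0)

# Ho is false (real difference of 0.5): every non-rejection is a Type II error.
type_ii_small_sample = 1 - rejection_rate(n=10, true_difference=0.5)
type_ii_large_sample = 1 - rejection_rate(n=100, true_difference=0.5)

print(f"Type I error rate (n = 10 per group): {type_i_rate:.3f}")
print(f"Type II error rate (n = 10 per group): {type_ii_small_sample:.3f}")
print(f"Type II error rate (n = 100 per group): {type_ii_large_sample:.3f}")
```

In a run of this kind the Type I error rate should stay close to the chosen level of significance regardless of sample size, while the Type II error rate should fall as the groups get larger.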

Level of Significance and P Value

The level of significance (α) used in hypothesis testing states the chance of rejecting the null hypothesis when it is in fact true (the chance of a Type I error). It is general practice to use a level of significance of 0.05 or 0.01. A 0.05 or 5 % level of significance implies that the risk of rejecting a true null hypothesis is 0.05. In other words, when the null hypothesis is true, we would correctly retain it about 95 % of the time.

Consider the experiment comparing the effects of two rations on the growth rate of Brakmas calves. If the statistical test shows the difference in the growth rates to be statistically significant at the 0.05 significance level, it means that, if there were truly no difference between the rations, the probability of observing a difference at least this large through sampling error alone would be not more than 0.05. Therefore, we may conclude with reasonable confidence that the observed difference reflects a real difference between the animals on the two rations.

Often when conducting statistical analysis using computer software we come across P values. In inferential statistics, the distribution of the test statistic values (e.g. t distribution, F distribution) is considered. The P value is the probability, if the null hypothesis is true, of obtaining a test statistic value at least as extreme as the value calculated from the sample data. The null hypothesis is rejected when the P value for a statistical test is less than the chosen α value.
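One way to see what a P value measures is to compute it directly by reshuffling the data. The Python sketch below uses an exact permutation test, which is a different technique from the t test and F test mentioned above and is used here only because it makes the definition concrete. The growth rates are hypothetical values invented for this example, and the function name is likewise an assumption.

```python
from itertools import combinations
from statistics import mean

# Hypothetical average daily gains (kg/day) of Brakmas calves on two rations.
# These numbers are illustrative only; they are not real data.
ration_a = [0.52, 0.61, 0.58, 0.49, 0.55, 0.60]
ration_b = [0.66, 0.59, 0.71, 0.63, 0.68, 0.62]

def permutation_p_value(group_a, group_b):
    """Two-tailed exact permutation P value for a difference in means.

    Under Ho the ration labels are interchangeable, so every way of
    splitting the pooled animals into two groups is equally likely.
    The P value is the share of splits whose mean difference is at
    least as extreme as the one actually observed.
    """
    pooled = group_a + group_b
    n_a = len(group_a)
    n_b = len(group_b)
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance so that ties with the observed value are counted.
        if difference >= observed - 1e-12:
            extreme += 1
        splits += 1
    return extreme / splits

ALPHA = 0.05
p = permutation_p_value(ration_a, ration_b)
decision = "reject Ho" if p < ALPHA else "do not reject Ho"
print(f"P value = {p:.4f}; at alpha = {ALPHA}: {decision}")
```

The decision rule in the last lines is the one described above: Ho is rejected only when the P value is smaller than the chosen α.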

One- and Two-Tailed Tests

In a statistical test we may choose to reject Ho based on the extreme values on one side or on both sides of the probability distribution of the test statistic. If we choose to reject Ho based on the extreme values on one side of the distribution, the test is called a one-tailed or one-sided test. In this case the rejection region lies on one side of the distribution, with an area equal to α.

Figure 1. Rejection region in a one-tailed test with level of significance α


Consider the research hypothesis that the Boer cross has a higher twinning rate than purebred Boer goats. Here we are interested in whether one particular breed type (Boer cross) has a higher twinning rate than the other. The statistical hypotheses tested would be as follows:

Ho: There is no difference in the twinning rates of Boer crosses and purebred Boer goats.
Ha: The Boer cross has a higher twinning rate than purebred Boer goats.

In the case of two-tailed tests, the rejection region lies on both sides of the distribution of the test statistic, with an area of α/2 at each end.

Consider the research hypothesis that the two rations would affect the growth rate of the Brakmas calves differently. In this case we are not expecting a particular ration to produce a higher growth rate than the other. The statistical hypotheses tested would be as follows:

Ho: There is no difference in the growth rates of Brakmas calves fed on the two rations.
Ha: There is a difference in the growth rates of Brakmas calves fed on the two rations.
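The practical effect of choosing a one-tailed or a two-tailed test can be sketched numerically. The Python example below assumes, purely for illustration, that the test statistic follows a standard normal distribution under Ho (in practice a t distribution is often used instead), and the observed value of 1.80 and the function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()   # assumed distribution of the test statistic under Ho

def critical_values(alpha, tails):
    """Return the cut-off value(s) that bound the rejection region."""
    if tails == 1:
        # The whole area alpha sits in the upper tail.
        return (STANDARD_NORMAL.inv_cdf(1 - alpha),)
    # An area of alpha / 2 sits in each tail.
    half = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    return (-half, half)

def p_value(z, tails):
    """P value for an upper one-tailed test or a two-tailed test."""
    if tails == 1:
        return 1 - STANDARD_NORMAL.cdf(z)
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))

alpha = 0.05
z_observed = 1.80   # hypothetical value of the test statistic

for tails in (1, 2):
    cutoffs = ", ".join(f"{c:.3f}" for c in critical_values(alpha, tails))
    p = p_value(z_observed, tails)
    decision = "reject Ho" if p < alpha else "do not reject Ho"
    print(f"{tails}-tailed: critical value(s) {cutoffs}; P = {p:.4f}; {decision}")
```

Under these assumptions the same observed statistic falls inside the one-tailed rejection region but outside the two-tailed one, which is why the direction of Ha should be decided from the research hypothesis before the data are examined.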