# α = significance level, μ0 = null hypothesis value of the mean, p0 = null hypothesis value of the proportion

Proportions Problems: parameter = proportion p

Point Estimate: $\hat{p}$

se: $se = \sqrt{\hat{p}(1-\hat{p})/n}$

MoE: $z \cdot se$

Confidence Interval: $\hat{p} \pm z \cdot se$

Sample Size for estimating population proportion with MoE = m: $n = \hat{p}(1-\hat{p})\,z^2/m^2$ (if you do not know $\hat{p}$, use 0.50, which will guarantee the sample size is large enough)

*USE 1-PropZInt for (single) Population Proportion Confidence Interval. USE 1-PropZTest for (single) proportions hypothesis test.*
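As a rough illustration of the proportion formulas, the following Python sketch computes the standard error, margin of error, confidence interval, and required sample size. The function names (`z_critical`, `proportion_ci`, `sample_size_for_proportion`) are made up for this example, and the z critical value is taken from the standard normal distribution in Python's `statistics` module rather than from a calculator.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided z critical value, e.g. 0.95 -> about 1.96."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def proportion_ci(successes, n, confidence=0.95):
    """Return (p_hat, se, moe, (lower, upper)) for one sample proportion."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    moe = z_critical(confidence) * se
    return p_hat, se, moe, (p_hat - moe, p_hat + moe)


def sample_size_for_proportion(m, confidence=0.95, p_guess=0.50):
    """Smallest n giving margin of error m; p_guess=0.50 is the safe default."""
    z = z_critical(confidence)
    return math.ceil(p_guess * (1 - p_guess) * z ** 2 / m ** 2)


# Hypothetical data: 60 successes in a sample of 100.
p_hat, se, moe, interval = proportion_ci(60, 100)
n_needed = sample_size_for_proportion(0.03)  # MoE of 3 points at 95%
```

The sample size is rounded up because rounding down would give a margin of error slightly larger than the target.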
Means Problems: parameterMean μ Point Estimate
Sample Size for estimating a Population

se

MoE

Interval
or ( –

Mean, margin of error mn=(σ2z2)/m2

For a single hypothesis population Mean test statistic t=

μ0)/

where μ0=the null value

When asked for critical value use Solver: tcdf and enter the df and C.I.

σ=population standard deviation

*USE TInterval for (one sample) Confidence Interval & TTest for (one sample) or (matched-pairs) population mean*
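The formulas for means can be sketched in Python in the same way. This is only an illustration: the function names and the small data set are invented, and because Python's standard library has no t distribution, the t critical value is passed in by hand (here 2.776, the usual table value for df = 4 at 95% confidence).

```python
import math
from statistics import NormalDist, mean, stdev


def mean_ci(data, t_crit):
    """CI for a mean: x_bar +/- t_crit * s / sqrt(n).

    t_crit must be supplied by the caller (t table or calculator)
    for df = n - 1.
    """
    n = len(data)
    x_bar = mean(data)
    se = stdev(data) / math.sqrt(n)  # stdev divides by n - 1
    moe = t_crit * se
    return x_bar, se, (x_bar - moe, x_bar + moe)


def t_statistic(x_bar, mu0, s, n):
    """One-sample test statistic t = (x_bar - mu0) / (s / sqrt(n))."""
    return (x_bar - mu0) / (s / math.sqrt(n))


def sample_size_for_mean(sigma, m, confidence=0.95):
    """n = sigma^2 z^2 / m^2, rounded up to a whole subject."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(sigma ** 2 * z ** 2 / m ** 2)


# Hypothetical sample of five observations.
x_bar, se, interval = mean_ci([4, 6, 5, 7, 8], t_crit=2.776)
n_needed = sample_size_for_mean(sigma=10, m=2)
```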

Point Estimate: (for a population proportion, it is the sample proportion, symbolized by $\hat{p}$) a single number that is our "best guess" for the parameter. It does not tell how close the estimate is likely to be to the parameter; an interval estimate is more useful because it incorporates a margin of error, which helps gauge the accuracy of the point estimate. Ex.: of the possible responses, 627 picked "definitely or probably should be," and 546 picked "probably or definitely should not be." 627 + 546 = 1173 is the sample size, and 627 answered "definitely or probably should be," so the point estimate is 627/1173 ≈ 0.53. A POINT ESTIMATE alone may be highly inaccurate, especially with a small sample.

Interval Estimate: an interval of numbers within which the parameter value is believed to fall (it indicates precision by giving an interval of numbers around the point estimate).

Confidence Interval: (a function of 3 things: the data in the sample, the confidence level, and the sample size) an interval of values that contains the most believable (plausible) values for the population parameter. It is constructed by taking a point estimate and adding and subtracting a MoE. The MoE is based on the SE of the sampling distribution of that point estimate. This "most informative estimation" method constructs an interval of numbers, called the confidence interval, within which the unknown parameter value is believed to fall. Ex.: with a 95% confidence interval we have "95% confidence" (a long-run interpretation): in the long run, about 95% of such intervals would give correct results, containing the population proportion.

Sampling distribution of the sample proportion: has a standard deviation called the standard error and a mean equal to the population proportion. It is approximately normal for large random samples because of the central limit theorem.
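To make the poll example and the long-run reading of "95% confidence" concrete, here is a small Python sketch. It computes a 95% interval from the 627 of 1173 responses, then simulates many samples from a population whose true proportion is known and counts how often the interval captures it. The simulation settings (true proportion 0.5, n = 200, 2000 repetitions, fixed seed) are assumptions chosen only for illustration.

```python
import math
import random
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def wald_interval(successes, n, z=Z95):
    """p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - moe, p_hat + moe


# Poll from the notes: 627 of 1173 respondents.
poll_low, poll_high = wald_interval(627, 1173)


def coverage(true_p, n, trials, seed=0):
    """Fraction of simulated intervals that contain true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(successes, n)
        hits += low <= true_p <= high
    return hits / trials


long_run = coverage(true_p=0.5, n=200, trials=2000)
```

The poll interval runs from roughly 0.51 to 0.56, and `long_run` should come out close to 0.95, which is what "95% confidence" refers to.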
Confidence level: the "likelihood" that the interval contains the parameter. We must compromise between the desired margin of error and the desired confidence of a correct inference: to achieve greater confidence, we make the sacrifice of a larger margin of error and a wider confidence interval.

MoE for a C.I.: found by multiplying the critical value by the se. The C.I. for a population proportion is $\hat{p} \pm z \cdot se$; the C.I. for a population mean is $\bar{x} \pm t \cdot se$. (Critical values of z: 1.645 for 90%, 1.96 for 95%, 2.576 for 99% confidence intervals.)

A GOOD Estimator of a parameter has 2 desirable properties: 1) A good estimator has a sampling distribution that is centered at the parameter, in the sense that the parameter is the mean of that sampling distribution; an estimator with this property is said to be unbiased. 2) A good estimator has a small standard error compared to other estimators.

MoE: (depends on the standard error of the sampling distribution of the point estimate) how close the sample proportion is likely to be to the population proportion. The MoE measures how accurate the point estimate is likely to be in estimating the parameter. It increases as the confidence level increases and decreases as the sample size increases.

z-score: measures the number of standard errors between the sample proportion $\hat{p}$ and the null hypothesis value p0.

Ex.: In a test of hypothesis, the null hypothesis is that the population mean is less than or equal to 45 and the alternative hypothesis is that the population mean is greater than 45. The test is to be made at the 2.5% significance level. A sample of 81 elements selected from this population produced a mean of 47.3 and a standard deviation of 4.5. What is the critical value of z? 1.96

Both z and t statistics have the form: test statistic = (estimate − null value)/standard error.

Completely Randomized Design: subjects are randomly assigned to one of the treatments.
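The worked example (null mean 45, right-tailed test at the 2.5% level, n = 81, sample mean 47.3, standard deviation 4.5) can be checked with a few lines of Python. This is a sketch using the standard normal distribution from the `statistics` module; the variable names are chosen here for readability, and the sample standard deviation is used with z, as the example does for its large sample.

```python
import math
from statistics import NormalDist

alpha = 0.025
n, x_bar, s, mu0 = 81, 47.3, 4.5, 45

# Right-tailed test: all of alpha sits in the upper tail.
z_crit = NormalDist().inv_cdf(1 - alpha)

se = s / math.sqrt(n)          # 4.5 / 9 = 0.5
z = (x_bar - mu0) / se         # (estimate - null value) / standard error
reject_null = z > z_crit

# Two-sided critical values for common confidence levels.
ci_critical = {
    level: NormalDist().inv_cdf(1 - (1 - level) / 2)
    for level in (0.90, 0.95, 0.99)
}
```

Here `z_crit` is about 1.96 and the test statistic is about 4.6, so the null hypothesis is rejected.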
Null Hypothesis: a statement that the parameter takes a particular value (means "no effect" or of no consequence). In hypothesis testing, the null hypothesis is a claim about a population parameter that is assumed to be true until it is declared false. Whenever the null is not rejected, the alternative is not accepted. For a two-sided test, a null hypothesis is rejected at the 5% significance level if and only if a 95% confidence interval does not include the hypothesized value of the parameter. The null and alternative divide all possibilities into two non-overlapping sets. A two-tailed test is one where results in either of two directions can lead to rejection of the null hypothesis.

Alternative Hypothesis: states that the parameter falls in some alternative range of values (where the burden of proof lies). It is a claim about a population parameter that will be true if the null hypothesis is false.

Statistical Inference: uses sample statistics to make decisions and predictions about population parameters. There are, generally speaking, two types of statistical inference: estimation (confidence intervals) and hypothesis testing.

Ordinal Variable: a categorical variable for which the categories are ordered from low to high in some sense.

Case-Control Study: a retrospective study. Subjects who have a response outcome of interest (e.g., cancer) serve as cases; other subjects not having that outcome serve as controls. The cases and controls are compared on an explanatory variable, such as whether they were smokers.
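The link between a two-sided test at the 5% level and a 95% confidence interval, noted under the null hypothesis above, can be checked numerically. The sketch below assumes a z-based test and interval for a mean that share the same standard error; the function name and the numbers passed to it are illustrative only.

```python
from statistics import NormalDist


def two_sided_test_and_ci(estimate, se, null_value, alpha=0.05):
    """Return (reject_null, null_value_outside_ci) using one z critical value."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z = (estimate - null_value) / se
    reject_null = abs(z) > z_crit
    low, high = estimate - z_crit * se, estimate + z_crit * se
    outside_ci = not (low <= null_value <= high)
    return reject_null, outside_ci


# Estimate 47.3 with standard error 0.5; try several hypothesized values.
results = {
    mu0: two_sided_test_and_ci(47.3, 0.5, mu0)
    for mu0 in (45, 46, 46.5, 47, 48, 48.5, 50)
}
```

For every hypothesized value the two flags agree: the test rejects exactly when the interval leaves the value out.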

Standard Error: an estimated standard deviation of a sampling distribution (depends on the sample size).

Error Probability: the probability that the method results in an incorrect inference, that is, that the data generate a confidence interval that does NOT contain the population proportion.

Degrees of Freedom (df): one less than the sample size, df = n − 1.

Statistic: describes a sample.

Parameter: describes a population (e.g., σ and μ).

Mean: one way to summarize the center of the observations for a quantitative variable.

Bayesian Statistics: statistical inference based on the subjective definition of probability.

Proportion: equals the number of items in a category divided by the sample size; it summarizes the relative frequency of observations in a category for a categorical variable.

Properties of the t distribution (for the same confidence level, the t-score is slightly larger than the z-score):

1) The t distribution is bell shaped and symmetric about 0.
2) The probabilities depend on the degrees of freedom, df. The t distribution has a slightly different shape for each distinct value of df, and different t-scores apply for each df value.
3) The t distribution has thicker tails and is more spread out than the standard normal distribution. The larger the df value, the closer it gets to the standard normal. When df is about 30 or more, the t and z distributions are nearly identical.
4) A t-score multiplied by the standard error gives the margin of error for a confidence interval for the mean.

The t confidence interval does NOT work well when the data contain extreme outliers. However, when estimating the mean, the t distribution accounts for the extra variability due to using the sample SD s to estimate the population SD in finding a standard error. As the sample size increases, the t distribution becomes more similar to the normal distribution.

Cohort Study Design: a group of subjects is studied over time; at the beginning none have the disease.
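The t distribution properties above (symmetry, thicker tails, and convergence to the standard normal as df grows) can be seen by evaluating the t density directly. The sketch below writes out the standard density formula with `math.lgamma` because Python's standard library has no built-in t distribution; `t_pdf` is a helper name made up for this example.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


# Height of each curve far out in the tail (3 standard units from 0).
normal_tail = NormalDist().pdf(3.0)
tail_heights = {df: t_pdf(3.0, df) for df in (2, 5, 30, 1000)}

# Height at the center: the t curve is lower there, since more of its
# probability sits in the tails.
center_gap = NormalDist().pdf(0.0) - t_pdf(0.0, 5)
```

The tail heights shrink toward `normal_tail` as df increases, which is why t-scores approach z-scores for large samples.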
Matched Pairs: each subject in one sample is matched with a subject in the other sample (e.g., a set of married couples with the men in one sample and the women in the other), or there are two observations for a particular subject, both coming from the same person.
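A matched-pairs analysis reduces to a one-sample t test on the differences within each pair. The following Python sketch shows that reduction; the function name and the before/after scores are made up for illustration, and the resulting t would still be compared against a t critical value for df = n − 1 from a table or calculator.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(first, second, mu0=0.0):
    """t statistic and df for the mean of the paired differences."""
    if len(first) != len(second):
        raise ValueError("matched pairs need equal-length samples")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    t = (mean(diffs) - mu0) / se
    return t, n - 1


# Hypothetical before/after scores for five subjects.
before = [10, 12, 9, 14, 11]
after = [12, 15, 10, 15, 14]
t, df = paired_t_statistic(after, before)
```

Working with differences removes the subject-to-subject variation, which is the main advantage of a matched-pairs design.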

Crossover Design (Really Good Design): a matched-pairs design in which subjects cross over during the experiment from using one treatment to using another treatment.