
QXD 04/13/2005 03:28 PM Page 1

**How to Make a Decision With Statistics**

O U T L I N E

1.1 Introduction: Statistics and the Scientific Method
1.2 Decisions, Decisions
1.3 The Language of Statistical Decision Making
1.4 What’s in the Bag?
1.5 Selecting Two Vouchers*
1.6 Significant versus Important

O B J E C T I V E S

■ Introduce the scientific method and the role of statistics in decision making.
■ Distinguish between a population and a sample.
■ Learn how to formulate and write the null and alternative hypotheses.
■ Understand what it means to say data are statistically significant.
■ Learn about the two types of error in a statistical decision problem: Type I error and Type II error.
■ Understand the meaning of the significance level, p-value, and power of a test.
■ Discover the effect of the sample size in a decision-making process.
■ Understand the difference between significant versus important.

**1.1 INTRODUCTION: STATISTICS AND THE SCIENTIFIC METHOD**

Statistics is the science of data. The word science comes from the Latin word “scientia,” meaning knowledge. The scientific method is a procedure for systematically pursuing knowledge. Together with the scientific method, statistics provides us with a collection of principles and procedures for obtaining and summarizing information in order to make decisions. The scientific method is an iterative process for learning about the world around us.

*Optional sections


The scientific method is composed of the following steps:

Step 1 Formulate a theory.
Step 2 Collect data to test the theory.
Step 3 Summarize the results.
Step 4 Interpret the results and make a decision.

[Figure: the scientific method drawn as a cycle. Formulate theories, Collect data, Summarize results, Interpret results & make decision.]

We start with a theory, that is, a proposed but unverified statement. Suppose that we are making a product and recently some customers have returned the product, reporting that it did not work as they expected. We recognize this as an opportunity to improve. The feedback offered by the customers may give rise to a theory about what is causing the product to malfunction. Since many customers reported a broken spring mechanism, it is hypothesized that the spring coil should be thicker.

We wish to test this theory, so we begin to experiment. We collect data to help us check the theory. We might make a change in the production process of our product, and then measure the performance of the product made under the change. These measurements are the data.

We look at the data and summarize the results. We might summarize the percentage of product produced under this process change that still does not operate properly.

We interpret the results and use the data to confirm or refute the theory. If the percentage of malfunctioning product has been sufficiently reduced, we might conclude that the theory has been supported. The process change is implemented and becomes the new standard for making the product. If the percentage of improperly operating product has not been sufficiently reduced, the theory may not be supported and a new theory should be developed and tested.

It would be nice if data could prove conclusively that a theory is either true or false, but in this uncertain world, that is generally not the case. Most theories are in a permanent state of uncertainty. There are always new observations about the world around us and new data coming in. Also, scientists are always thinking of new ways to test old theories or to interpret data, which may lead to exposing weaknesses of old theories, making it time to test them again.
If we cannot conclude whether or not a theory is absolutely true, at least it would be nice to be able to quantify how much “faith” we had in our decision, for example by saying something like “We are 95% confident in our conclusion.” This is where statistics and its collection of methods play a role. The ability to make such confidence statements stems from the use of statistics at every step of the scientific method. A theory is rejected if it can be shown statistically that the data we observed would be very unlikely to occur if the theory were in fact true. A theory is accepted if it is not rejected by the data.

The scientific method is an iterative process of learning. A conclusion may be that we need to update the theory and gather more data. The results do not necessarily give definite answers. Instead, the results may suggest new theories or decisions that are subject to further testing. The scientific method and the road map for studying statistics with this text are best represented with a circle. The various components in the circle are connected, and the circle does not end, just as learning is a never-ending process.

So, where do we start? We start here in Chapter 1 by providing you with an overview of the components of this statistical decision-making process. The goal of this chapter is to introduce you to some of its elements by involving you in the process. We will ask you to think critically about the information being presented. You will begin to learn how to examine assumptions, discern hidden values, evaluate evidence, and assess conclusions. The remaining chapters of this text reinforce and build on these various elements. Previous users of this text have commented that Chapter 1 presents some complex issues in statistics. However, mastering these ideas early on smoothes out the path through the remaining chapters.

**1.2 DECISIONS, DECISIONS**

Every day—in fact, almost constantly—we gather information to make decisions. Consider the following question: Can I cross the street at this intersection? What information, or data, would you need to answer this question? Without consciously thinking about it, your mind is processing the answers to a couple of questions: How many cars are approaching and at what speed are they traveling? Are there any obstacles or weather conditions that will impede my progress across the street? We wouldn’t think of this simple, everyday decision as being a problem in statistics. However, what if you wanted to ask more complex questions, such as those in the following four examples:

■ Suppose that you are a student at the University of Michigan and you are interested in gathering information about the student population registered full time during the winter term. Some of the questions you might be interested in are as follows: What percent of students are women? African-American? registered Democrats? vegetarian? married?

■ Suppose that you are in the market for a new car. You have decided to purchase a General Motors car and will consider all such models produced in the current model year. The answers to the following questions will influence your purchase decision: What model is more efficient in average miles per gallon? What is the average price of all the models? How many inches of leg room are there for the driver?

■ The makers of Advil® claim that “for pain, nothing is proven more effective or longer lasting than Advil” (based on studies conducted by Whitehall Laboratories, the makers of Advil). As a consumer who does experience headaches, you might ask “What is meant by ‘more effective’?” “Was Advil compared to all headache relievers?” “How were the comparisons carried out?” and “Does this mean Advil will work better for you?” “For every type of headache you have?”

■ Many small-scale studies have found and reported that fish eaters live longer. Another study contradicts the popular belief that fish is good for the heart. What is a health-conscious person to do? How do you as a consumer weigh the information being cited daily? You learn to ask questions, such as “How was the study conducted?” and “What type of and how many subjects were used?” In the recent fish study, researchers followed the eating habits of 44,895 men and found that those who ate a lot of fish were just as likely to experience heart trouble as those who ate only small amounts. So, do these results extend to men of all ages? to women? What was the working definition of “experiencing heart trouble”?

In the first two examples, the group of individuals or objects under study is very large. Because it would be inefficient and expensive to question each student at the University of Michigan, or to obtain information on each car produced by General Motors in the current model year, we need to devise a reliable method for drawing conclusions based on a manageable amount of data.

The last two examples illustrate how information can be a powerful and common tool of persuasion. We know how to discount some kinds of information. In the Advil example, are you surprised that the results of a study conducted by the makers of Advil are in support of Advil? If a company conducts a study comparing its product with a competitor’s and the results show that the competitor’s product is better, would these results be reported? If Advil is better, how can a competitor, such as Tylenol®, also claim to be the best and have studies supporting its claims?

The example about eating fish illustrates that you should consider the weight of the evidence. You shouldn’t base your decisions on every study that comes along. If there are, for example, three independent studies that have similar results, the weight of the evidence is stronger than one study that stands alone. Research is cumulative and no one study is definitive. A study provides clues, not absolute answers.

Over 100 years ago, H. G. Wells, author of such classics as The Time Machine and The War of the Worlds, stated a prediction based on the following theory: “Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.” Indeed, data collection and reporting of results are increasingly being used to confirm or refute theories. Individuals in nearly all job markets are finding the need to be able to evaluate data intelligently, that is, to use statistical reasoning to interpret the meaning of data and make decisions.

We begin cycling through the scientific method by first establishing some standard terminology.

**1.3 THE LANGUAGE OF STATISTICAL DECISION MAKING**

Learning statistics is like learning a new language whose new terminology involves special phrases, symbols, and definitions. The meaning of a word in everyday language can be different from how it is defined in statistics. In this section, we will discuss the statistical terminology involved in the decision-making process. There are many components in the decision-making process. The decision-making process involves theories, data, a measure of “likeliness,” and the possibility of errors. Although we don’t always think of these components from their statistical viewpoint, they are always present. In the following sections, we will discuss these components and introduce their corresponding more formal and accepted terminology.

1.3.1 Testing Theories

The group of objects or individuals under study is called the population. In the student example, the population consists of all University of Michigan students registered full time during the winter term. In the car example, the population consists of all car models produced by General Motors in the current model year. A part of the population is called a sample. We use the information from our sample to generalize and make decisions about the whole population, a process called statistical inference. The validity of the resulting generalizations will depend on the validity of the sample selected.

DEFINITIONS: The population is the entire group of objects or individuals under study, about which information is desired. A sample is a part of the population that is actually used to get information. Statistical inference is the process of drawing conclusions about the population based on information from a sample of that population.

“Should I take Tylenol or Advil to relieve my headache fast?” One theory is that Tylenol is just as good as Advil, while another is that Advil is better than Tylenol. If we have in mind theories about a population that we want to test, those theories in statistics are called hypotheses. In statistics, we start with two hypotheses about the same population, each of which has a special name. We generally call the conventional belief (that is, the status quo, or prevailing viewpoint) the null hypothesis (denoted by H0 and read “H naught”). The null hypothesis is generally a theory which states that there is nothing happening (no effect, no difference, or no change) in the population. The alternative hypothesis (denoted by H1 and read “H one”)* is an alternative to the null hypothesis. It is the statement that something is happening (there is an effect, a difference, a change) in the population. Let’s look at some examples.

*Sometimes the alternative hypothesis is denoted by Ha.

Example 1.1 ◆ Stating the Hypotheses: Three Different Settings

Problem 1 Aspirin Cuts Cancer Risk

According to the American Cancer Society, the lifetime risk of developing colon cancer is 1 in 16. A study suggests that taking an aspirin every other day for 20 years can cut your risk of colon cancer nearly in half. However, the benefits may not kick in until at least a decade of use.
(a) Write the null and the alternative hypotheses for this setting.
(b) Is this a one-sided to the right, one-sided to the left, or two-sided alternative hypothesis? Hint: look at the alternative hypothesis.

Solution 1

(a) H0: Taking an aspirin every other day for 20 years will not change the risk of getting colon cancer, which is 1 in 16.
H1: Taking an aspirin every other day for 20 years will reduce the risk of getting colon cancer, so the risk will be less than 1 in 16.
(b) The alternative hypothesis is one-sided to the left.

Problem 2 Average Life Span

Suppose that you work for a company that produces cooking pots with an average life span of seven years. To gain a competitive advantage, you suggest using a new material that claims to extend the life span of the pots. You want to test the hypothesis that the average life span of the cooking pots made with this new material increases.
(a) Write the null and the alternative hypotheses for this setting.
(b) Is this a one-sided to the right, one-sided to the left, or two-sided alternative hypothesis? Hint: look at the alternative hypothesis.

Solution 2

(a) H0: The average life span of the new cooking pots is seven years.
H1: The average life span of the new cooking pots is greater than seven years.
(b) The alternative hypothesis is one-sided to the right.

Problem 3 Poll Results

Based on a previous poll, the percentage of people who said that they plan to vote for the Democratic candidate was 50%. The presidential candidates will have daily televised commercials and a final political debate during the week before the election. You want to test the hypothesis that the population proportion of people who say they plan to vote for the Democratic candidate has changed.
(a) Write the null and the alternative hypotheses for this setting.
(b) Is this a one-sided to the right, one-sided to the left, or two-sided alternative hypothesis? Hint: look at the alternative hypothesis.

Solution 3

(a) H0: The percentage of the Democratic votes in the upcoming election will be 50%.
H1: The percentage of the Democratic votes in the upcoming election will be different from 50%.
(b) The alternative hypothesis is two-sided.

What We’ve Learned: The null hypothesis is that which represents the status quo and it will not be rejected unless the data provided convincing evidence against it. The alternative hypothesis (or research hypothesis) will only be accepted if the data provide convincing evidence for it.

Note: The null and alternative hypotheses should both be statements about the same population. In the average-life-span example, it would not be correct to have H0 be a statement about pots made with the original material and H1 be a statement about pots made with the new material. Both H0 and H1 are statements about the same population of pots made with the new material.

The test is said to be one-sided to the left, one-sided to the right, or two-sided according to the direction shown in the alternative hypothesis.

■ If the sign in the alternative hypothesis is less than (<), we have a one-sided test to the left. This is the case in Problem 1. Smaller observed values show support for the alternative hypothesis.
■ If the sign in the alternative hypothesis is greater than (>), we have a one-sided test to the right. This is the case in Problem 2. Larger observed values show more support for the alternative hypothesis.
■ If the sign in the alternative hypothesis is “not equal to” (≠), we have a two-sided test. Values that are very large or very small would show support for the alternative hypothesis.

DEFINITIONS: The null hypothesis, denoted by H0, is the conventional belief (the status quo, or prevailing viewpoint) about a population. The alternative hypothesis, denoted by H1, is an alternative to the null hypothesis: the statement that there is an effect, a difference, a change in the population.

Let’s Do It! 1.1 Fair Die?

In a famous die experiment, out of 315,672 rolls, a total of 106,656 resulted in either a 5 or a 6. If the die is “fair” (that is, each of the six outcomes has the same chance of occurring), then the true proportion of 5’s or 6’s should be 1/3. However, a close examination of a real die reveals that the “pips” are made by small indentations into the faces of the die. Sides 5 and 6 have more indentations than the other faces, and so these sides should be slightly lighter than the other faces. This suggests that the true proportion of 5’s or 6’s may be a bit higher than the “fair” value, 1/3. State the appropriate null and alternative hypotheses for assessing if the data provide compelling evidence for the competing theory.

H0: The die is fair; that is, the indentations have no effect, and the proportion of 5’s or 6’s is ____.
H1: The die is not fair; that is, the indentations do have an effect, and the proportion of 5’s or 6’s is ____.

Let’s Do It! 1.2 Stress Can Cause Sneezes

Excerpts from the article “Stress Can Cause Sneezes” (New York Times, January 21, 1997) are shown below.

FROM THE NEW YORK TIMES
Stress can cause sneezes
Winter can give you a cold because it forces you indoors with coughers, sneezers and wheezers. Toddlers can give you a cold because they are the original Germs “R” Us. But can going postal with the boss or fretting about marriage give a person a post-nasal drip? Yes, say a growing number of researchers. Dr. Sheldon Cohen, psychology professor at Carnegie Mellon University in Pittsburgh, said his most recent studies suggest that stress doubles a person’s risk of getting a cold.

Studies suggest that stress doubles a person’s risk of getting a cold. Acute stress, lasting maybe only a few minutes, can lead to colds. One mystery that is still prevalent in cold research is that while many individuals are infected with the cold virus, very few actually get the cold. On average, up to 90% of people exposed to a cold virus become infected, meaning that the virus multiplies in the body, but only 40% of those exposed actually become sick. One researcher thinks that the accumulation of stress predisposes an infected person to illness. The percentage of people exposed to a cold virus who actually get a cold is 40%. The researcher would like to assess if stress increases this percentage. So, the population of interest is people who are under (acute) stress. State the appropriate hypotheses for assessing the researcher’s theory regarding this population.

H0: ____
H1: ____

Tip: Having trouble determining the alternative hypothesis? Ask yourself “Why is the research being conducted?” The answer is generally the alternative hypothesis, which is why the alternative hypothesis is often referred to as the research hypothesis.

1.3.2 How Do We Decide Which Theory to Support?

In statistics, the decision-making process is basically as follows: We have two competing theories or ideas about the same population of interest. To learn which of these two theories seems more reasonable, we gather some information, or data. We look at these data. Are the data more likely to be observed if the first theory is true or if the second theory is true? If the data are unlikely to occur if the first theory is true, then we reject that theory and support the second theory. When we support a theory, or hypothesis, it is customary to say that we accept or favor that theory. But do keep in mind that accepting a theory does not necessarily imply that the theory is true. We prefer to avoid the term accept and state the decision as either rejecting the null hypothesis or failing to reject the null hypothesis.

The logic behind statistical decision making is based on the “rare event” concept. Since the null hypothesis is generally the status quo, we start by assuming that the null hypothesis is true. We then assess whether or not the observed result is extreme (that is, rare or very unlikely) under the null hypothesis. Example 1.2 provides an introduction to the idea of assessing how unusual the data are under the null hypothesis.


Example 1.2 ◆ Package of Balls

Problem

Suppose that you are shown a closed package containing five balls. The package is on sale because the label, which describes the colors of the balls in the package, is missing. The salesperson states that most of the packages of balls sold in this store contain one yellow ball and four blue balls, for a proportion of yellow balls of 1/5. You wish to test the hypotheses that the proportion of yellow balls in the package is 1/5 (or 0.20) versus the proportion of yellow balls in the package is more than 0.20.

(a) Write the null and the alternative hypotheses for this setting.
(b) Is this a one-sided to the right, one-sided to the left, or two-sided test?

Suppose that you are allowed to collect some data. You are permitted to mix the balls in the package well, and then, without looking inside the package, reach in, select one ball, and record its color. After replacing the selected ball, you are allowed to repeat this procedure exactly the same way for a total of five observations. This is called selecting with replacement.

(c) Suppose that the data are as follows: The first ball is yellow, the second ball is yellow, the third ball is yellow, the fourth ball is yellow, and the fifth ball is yellow. Do you reject the null hypothesis that the contents of the package are one yellow ball and four blue balls? Why?
(d) Suppose that the data are as follows: The first ball is blue, the second ball is blue, the third ball is blue, the fourth ball is blue, and the fifth ball is blue. Do you reject the null hypothesis that the contents of the package are one yellow ball and four blue balls? Why?
(e) In part (c) or part (d), would the decision you made be a correct decision?

Solution

(a) H0: The proportion of yellow balls in the package is 1/5 (or 0.20).
H1: The proportion of yellow balls in the package is more than 0.20.
(b) The alternative hypothesis is one-sided to the right.
(c) The observed data are possible under the null hypothesis of one yellow ball and four blue balls. You could have picked the one yellow ball all five times! However, observing five yellow balls in a row is very unlikely to occur if the package indeed contains only one yellow ball and four blue balls. Thus, you are more inclined to reject this hypothesis based on the observed data.
(d) The observed data are again possible under the null hypothesis of one yellow ball and four blue balls. Observing five blue balls in a row is very likely to occur if the package indeed contains only one yellow ball and four blue balls. Thus, you are more inclined not to reject this hypothesis based on the observed data.
(e) The only way you can know if you made the correct decision is to purchase the package and look inside! You certainly could have made an error in either case.

What We’ve Learned: The decision that is made is a function of what data is obtained. There is uncertainty in the fact that the truth is unknown and that repeated samples of the same size can lead to different results.
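The intuition in parts (c) and (d) of Example 1.2 can be checked with a short calculation. The sketch below assumes that the null hypothesis is true (one yellow ball among five) and that the five draws with replacement are independent. The function name `prob_run` is illustrative and does not come from the text.

```python
from fractions import Fraction


def prob_run(p_color: Fraction, n_draws: int) -> Fraction:
    """Chance of drawing the same color on every one of n_draws independent
    draws with replacement, when a single draw has chance p_color."""
    return p_color ** n_draws


# Assumption: H0 is true, so the package holds 1 yellow and 4 blue balls.
P_YELLOW = Fraction(1, 5)
P_BLUE = 1 - P_YELLOW

p_five_yellow = prob_run(P_YELLOW, 5)  # data in part (c)
p_five_blue = prob_run(P_BLUE, 5)      # data in part (d)

print(f"P(five yellow | H0) = {p_five_yellow} = {float(p_five_yellow):.5f}")
print(f"P(five blue   | H0) = {p_five_blue} = {float(p_five_blue):.5f}")
```

Under these assumptions, five yellow balls in a row would occur about 3 times in 10,000 packages of this kind, which is why that outcome counts as strong evidence against H0. Five blue balls in a row would occur roughly one time in three, an outcome that is entirely unremarkable if H0 is true.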


In Example 1.2, the data presented in the two scenarios were somewhat extreme cases: observing five yellow balls in a row and observing five blue balls in a row. There are certainly other possible outcomes, each providing some evidence as to which hypothesis should be supported. Example 1.3 helps us to think about how much evidence is enough evidence to reject the null hypothesis.

Example 1.3 ◆ Is the New Drug Better?

Problem

Suppose that you have developed a new and very expensive drug intended to cure some disease. You wish to assess how well your new drug performs compared to the standard drug by testing the following hypotheses:

H0: The new drug is as effective as the standard drug.
H1: The new drug is more effective than the standard drug.

A study is conducted in which the investigator administers the new drug to some number of patients suffering from the disease and the standard drug to another group of patients suffering from the disease. (In Chapter 3, we will discuss why two groups are needed in studies like this and how to allocate the patients to the two treatment groups.) The proportion of cures for both drugs is recorded. Based on this information, we have to decide which hypothesis to support.

(a) If the proportion of subjects cured with the new drug was exactly equal to the proportion of subjects cured with the standard drug, which hypothesis would you support?
(b) If 75% of the subjects were cured with the new drug while 55% of the subjects were cured with the standard drug, for a difference in cure rates of 20%, which hypothesis would you support? (Define the difference in cure rates as the percent of subjects cured with the new drug less the percent cured with the standard drug.)
(c) If the difference in cure rates was equal to 2%, which hypothesis would you support?
(d) How large a difference in the cure rates is needed for you to feel confident in rejecting the null hypothesis?

Solution

(a) If the cure rate for the new drug were the same as or less than the cure rate for the standard drug, we would not have enough evidence to reject the null hypothesis. Although the null hypothesis states that the two drugs are equally effective, any evidence that the new drug is no better than the standard drug would not support the alternative hypothesis.
(b) If 75% of the subjects who received the new drug were cured while 55% of the subjects who received the standard drug were cured, we might reject the null hypothesis and conclude that the new drug is more effective than the standard drug. However, a difference in cure rates of 20% is not proof that H1 is true. It is possible that the observed difference in this study occurred just by chance even though the new drug is really not more effective.
(c) If the difference in cure rates is 2%, we might not reject the null hypothesis and may conclude that the new drug is not more effective than the standard drug. However, the 2% difference in cure rates is not proof that H1 is false. It is possible that the observed difference in this study occurred just by chance even though the new drug is really more effective. We must remember that a study provides clues, not absolute answers.
(d) The required difference will depend on a number of things: How much data was gathered? Were there 50 subjects assigned to each drug? Or 500? How were the subjects assigned to each group?

What We’ve Learned: In real situations the decision maker makes a choice among various alternative courses of action. According to the consequences of the decision, one researcher might decide to reject the null hypothesis and another one might decide not to reject it. The concepts of the level of significance (α), the p-value, and the power of the test, which are introduced in the next sections, will clarify why this can happen.
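Part (d) of Example 1.3 asks how the amount of data affects what counts as convincing. As a rough sketch, suppose H0 is true and both drugs cure 55% of patients (the 55% figure is borrowed from part (b); treating it as the common true rate is an assumption made here for illustration). The hypothetical helper `chance_of_gap` computes how often a gap of at least 20 percentage points in favor of the new drug would appear purely by chance, with 50 and with 500 patients per group, assuming independent patients and binomial counts.

```python
from math import comb


def chance_of_gap(n: int, cure_rate: float, gap_count: int) -> float:
    """P(new-drug cures - standard-drug cures >= gap_count) when both groups
    have n patients and the SAME true cure rate (that is, H0 is true)."""
    # Binomial probabilities for the number of cures in one group of size n.
    pmf = [comb(n, k) * cure_rate**k * (1 - cure_rate) ** (n - k)
           for k in range(n + 1)]
    # tail[k] = P(cures >= k); tail[n + 1] stays 0.
    tail = [0.0] * (n + 2)
    for k in range(n, -1, -1):
        tail[k] = tail[k + 1] + pmf[k]
    # Add up over every possible number of cures in the standard-drug group.
    return sum(pmf[s] * tail[s + gap_count]
               for s in range(n + 1 - gap_count))


ASSUMED_RATE = 0.55  # assumption: common true cure rate under H0

for n in (50, 500):
    gap = n // 5  # a 20-percentage-point gap, expressed as a patient count
    print(f"n = {n:3d} per group: chance of a gap this large = "
          f"{chance_of_gap(n, ASSUMED_RATE, gap):.2e}")
```

Under these assumptions, a 20-point gap is uncommon but plausible with 50 patients per group, and vanishingly rare with 500 per group. The same observed difference is therefore much stronger evidence against H0 when it comes from a larger study.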

Newspapers and articles often state such phrases as “but the results were not statistically significant” or “there was a statistically meaningful difference between the two groups.” In general, the alternative hypothesis is the new theory, the researcher’s claim. Thus, the researcher would like to have the data support his or her theory. The researcher would like to reject the null hypothesis. The data are said to be statistically significant if they are very unlikely to be observed under the null hypothesis. If the data are statistically significant, then our decision would be to reject H0.

DEFINITION: The data collected are said to be statistically significant if they are very unlikely to be observed under the assumption that H0 is true. If the data are statistically significant, then our decision would be to reject H0.

If the word significant is used in an article or a report, determine whether the word is being used in the statistical sense or is just being used in the usual sense to try to convince you that the result is important.

Let’s Do It! 1.3 Complaints about Chips

Last month, a large supermarket chain received many customer complaints about the quantity of chips in 16-ounce bags of a particular brand of potato chip. Wanting to assure its customers that they were getting their money’s worth, the chain decided to test the following hypotheses concerning the true average weight (in ounces) of a bag of such potato chips in the next shipment received from the supplier:

H0: The average weight is at least 16 ounces.
H1: The average weight is less than 16 ounces.

If there is evidence in favor of the alternative hypothesis, the shipment would be refused and a complaint registered with the supplier. Some bags of chips were selected from the next shipment and the weight of each selected bag was measured. The researcher for the supermarket chain stated that the data were statistically significant. What hypothesis was rejected? Was a complaint registered with the supplier? Could there have been a mistake? If so, describe it.


et's Do I

1.4

L

The Titanic, the largest ship that had ever been built up to that time, left Southampton, England toward New York on Wednesday, April 10, 1912. It was carrying many rich and famous people of both England and the United States. Many thought that the Titanic could not sink. It made a stop at Cherbourg, in France, where it took on many third-class passengers, and a brief stop at Queenstown, Ireland, and set its course across the Atlantic. On the evening of April 14 it struck an iceberg and in less than 2 1 hours, by 2:15 A.M. of the next morning, it 2 sank. Only 710 survived among the 2201 passengers and crew. A question of interest may be whether or not everybody had an equal chance of surviving. In the language of the testing our hypotheses might be: H0: Men and women had the same chance of surviving. H1: Men and women did not have the same chance of surviving. Let pM represent the proportion of all men that survived and pW represent the proportion of all women that survived. (a) Using the notation for proportions given above, rewrite H0 and H1. H0: H1: (b) Suppose the data obtained were statistically significant.What hypothesis was accepted?

**1.3.3 What Errors Could We Make?
**

One principle of the American justice system is that the defendant in a trial should be considered innocent until proven guilty. What would the null and alternative hypotheses be in the context of a criminal trial? The null hypothesis would be the status quo that the defendant is innocent. The alternative hypothesis would be the result that the prosecutor is trying to establish, namely, that the defendant is guilty.

H0: The defendant is innocent.
H1: The defendant is guilty.

The defendant and prosecutor present their cases. The jury must weigh the evidence presented, using it to assess whether or not there is enough doubt about the defendant’s innocence to deliver a guilty verdict. The justice system is not perfect. If the jury delivers a guilty verdict, but the defendant is truly innocent, an error has occurred. If the jury determines that there is reasonable doubt and delivers an innocent verdict, but the defendant is truly guilty, again there would be an error.

In statistical terms, these errors have special names. If we reject the null hypothesis H0 when in fact it is true, we have committed an error called a Type I error. If we fail to reject the null hypothesis H0 when in fact it is false, we have also committed an error, called a Type II error.

DEFINITION: Rejecting the null hypothesis H0 when in fact it is true is called a Type I error. Failing to reject the null hypothesis H0 when in fact it is false is called a Type II error.
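The two definitions can be restated as a small lookup. The sketch below is only an illustration; the function name `classify_decision` and its two boolean arguments are assumptions made for this example and do not come from the text.

```python
def classify_decision(h0_is_true: bool, reject_h0: bool) -> str:
    """Label the outcome of a decision, following the Type I / Type II definitions."""
    if h0_is_true and reject_h0:
        return "Type I error"   # rejected H0 although H0 was true
    if not h0_is_true and not reject_h0:
        return "Type II error"  # failed to reject H0 although H0 was false
    return "No error"


# Trial analogy: H0 = "the defendant is innocent"; rejecting H0 = a guilty verdict.
print(classify_decision(h0_is_true=True, reject_h0=True))    # innocent but convicted
print(classify_decision(h0_is_true=False, reject_h0=False))  # guilty but set free
```

Note that each error type is tied to one state of the truth: a Type I error requires H0 to be true, and a Type II error requires H0 to be false.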


We can summarize the two types of error in hypothesis testing in the following table:

| Your Decision Based on the Data | The True Hypothesis: Null H0 is true | The True Hypothesis: Alternative H1 is true |
|---|---|---|
| Null H0 is supported | No error | Type II error |
| Alternative H1 is supported | Type I error | No error |

The Type II error entry in the preceding table (upper right corner) is interpreted as follows: The true hypothesis (column heading) is H1, but your decision (row heading) was to support H0, which is an error.

If one of these errors occurs, it does not imply that we, in our decision-making process (i.e., the scientific method), did anything wrong. It only means that we were misled by the information collected. Remember that a Type I error can only be made if the null hypothesis is true. A Type II error can only be made if the alternative hypothesis is true.

A defendant is considered innocent until proven guilty. The burden of proof lies with the prosecutor. The jurors are instructed to find sufficient evidence, beyond a reasonable doubt, in order to return a guilty verdict. In the trial setting, a Type I error would occur if an innocent person is falsely convicted and the guilty party remains free. A Type II error would occur if a guilty person is set free. Convicting an innocent person is considered a serious consequence.

In hypothesis testing, we begin by assuming that the null hypothesis is true. The burden of the proof lies with the data. Failing to reject the null hypothesis means that we did not have sufficient evidence to reject that theory. Thus, rejecting the null hypothesis is a stronger statement than failing to reject it. Generally, the null hypothesis is selected to be the hypothesis that may produce serious consequences to you if you erroneously reject it.

Example 1.4 ◆ Rain, Rain, Go Away!

Problem
You plan to walk to a party this evening. Are you going to carry an umbrella with you? You don’t want to get wet if it should rain. So you wish to test the following hypotheses:

H0: Tonight it is going to rain.
H1: Tonight it is not going to rain.

(a) Describe the two types of error that you could make when deciding between these two hypotheses.
(b) What are the consequences of making each type of error?
(c) You learn from the noon weather report that there is a 70% chance of rain tonight. What does this mean?

Solution
(a) Type I error: Decide that it is not going to rain, when in fact it is going to rain. Type II error: Decide that it is going to rain, when in fact it is not going to rain.

(b) The consequence of making a Type I error is that you are going to get wet, since you are not going to carry an umbrella. The consequence of making a Type II error is that you will carry around an umbrella you are not going to need.
(c) It means that from records of days with similar atmospheric conditions, 70% of such days had rain and 30% did not. Therefore, it may rain tonight or it may not.

What We’ve Learned: The null hypothesis was chosen to be “H0: Tonight it is going to rain” since the consequence of getting wet was considered more important than carrying around an unnecessary umbrella. In situations like this one, with no experimental data, the decision is based on analysis of the alternative actions and the consequences of those actions.

Let’s Do It! 1.5
Which Error Is Worse?

Recall that a Type I error occurs if you reject the null hypothesis H0 when it is true. A Type II error occurs if you do not reject the null hypothesis H0 when it is false. For each set of hypotheses that follows, decide whether a Type I error or a Type II error would be more serious. Answers may vary, but be prepared to discuss your answer.

(a) H0: The water is contaminated. H1: The water is not contaminated.
A ____ error would be more serious because ____
(b) H0: The parachute works. H1: The parachute does not work.
A ____ error would be more serious because ____
(c) H0: The ship is unsinkable. H1: The ship is sinkable.
A ____ error would be more serious because ____
(d) H0: The shirt is not washable. H1: The shirt is washable.
A ____ error would be more serious because ____
(e) H0: The value of my investment-stock portfolio is going to increase over the next six months. H1: The value of my investment-stock portfolio is not going to increase over the next six months.
A ____ error would be more serious because ____

Let’s Do It! 1.6
Testing a New Drug

In the previous section, two drugs were being compared, a new drug and a standard drug, to treat a particular disease. The hypotheses were as follows:

H0: The new drug is as effective as the standard drug.
H1: The new drug is more effective than the standard drug.

What are the two types of error that you could make when deciding between these two hypotheses?
A Type I error occurs if you reject the null hypothesis H0 when it is true. A Type I error in this situation would be ____.
A Type II error occurs if you do not reject the null hypothesis H0 when it is false. A Type II error in this situation would be ____.
What are the consequences of making a Type I error? What are the consequences of making a Type II error? Which error might be considered more severe from an ethical point of view?

To know the true proportion of patients suffering from the disease who would be cured using the new drug, we would need to administer the new drug to all such patients. However, this is not possible. A study is conducted in which the investigator administers the new drug to some number of patients suffering from the disease and the standard drug to another group of patients suffering from the disease. The proportion of cures for both drugs is recorded. We must remember that the data are a sample, and depending on which sample we get, we could make a wrong decision. This is one reason why we continue to cycle through the scientific method as part of the process of learning.

Think About It
If the consequences of making a Type I error are considered very serious, why not set the chance of making the Type I error to zero?

Why not? To achieve a value of zero for a Type I error, you would never reject the null hypothesis, and you would never support any new, alternative theory. We must be willing to accept some small chance of making an error. We usually want to protect the prevailing viewpoint by ensuring a small chance of committing a Type I error.

In statistics, the chance of a Type I error occurring is called the level of significance, and it is denoted by the Greek letter alpha, α. The chance of a Type II error occurring is denoted by the Greek letter beta, β.

α = level of significance = the chance of a Type I error occurring = the chance of rejecting the null hypothesis H0 when it is true.
β = the chance of a Type II error occurring = the chance of failing to reject the null hypothesis H0 when the alternative hypothesis H1 is true.

DEFINITION: The significance level number α is the chance of committing a Type I error, that is, the chance of rejecting the null hypothesis when it is in fact true.

When do you use α = 0.01 instead of α = 0.05? The level of significance is related to the chance of making a mistake when deciding between two hypotheses. In general, you select the level of significance to be extremely small to reduce the chance of rejecting the null hypothesis mistakenly. The standard significance level used by most researchers is the value of α = 0.05. Sometimes this level is set at 0.01 and sometimes at 0.10. However, there are situations for which a different value may be more appropriate. ◆

In general, the level of α is fixed in advance because it depends on the consequences of committing the Type I error. Since both α and β represent the chance of making an error, ideally, you want both to be small, close to zero. We will see how α and β are related through examples shortly.* We will also learn how the significance level may be set in advance and used by statisticians to determine a rule for deciding whether or not to reject H0. The significance level can also be used as a statement regarding how much evidence against the null hypothesis is required in order to reject that hypothesis, that is, how unusual the data must be to reject the null hypothesis.

Finally, there is another term to define in hypothesis testing that is directly related to the chance of a Type II error occurring. Sometimes the alternate hypothesis H1 is the theory that the researcher is hoping the data will support. The researcher would like to reject H0 when indeed the alternative H1 is true. The chance of making that correct decision, rejecting H0 when indeed the alternative H1 is true, is called the power of the test. In particular, researchers would prefer a test that has higher power.

DEFINITION: The power of the test is defined as 1 − β. The power of the test is the chance of rejecting the null hypothesis when the alternative hypothesis is true.

The power of the test also depends on the significance level α of the test. For a fixed sample size, the larger the value of α, the smaller the value of β, and thus the larger the power of the test. For a fixed value of α, the better test will be the one that has the smallest value of β. Equivalently, for a fixed value of α, the better test will be the one that has the largest power (closer to one).

*The calculation of the chance of a Type II error, β, is a more advanced statistical concept in general and is not required for the statistical methods presented in the later chapters. In the examples and exercises of this chapter, the computation of β is quite easy and is included for completeness.

**1.4 WHAT’S IN THE BAG?**

In this section, we will discover many aspects of the decision-making process. We will have two competing hypotheses about the contents of a shown bag. We will be allowed to gather some data from the shown bag in order to make a decision about its contents. We will develop various rules for how that decision should be made and compare the rules in terms of the chances of the two types of errors occurring, α and β. We will see how α and β are related. Finally, we will focus on what the observed data tell us by assessing how extreme it would be to assume that the null hypothesis was true.

There are two bags; call them Bag A and Bag B. The outsides of the two bags look alike. Each bag contains 20 vouchers of the same size and shape. They differ in terms of the vouchers’ face values and their frequencies as follows:

| Bag A Face Value | Bag A Frequency | Bag B Face Value | Bag B Frequency |
|---|---|---|---|
| -$1000 | 1 | $10 | 1 |
| $10 | 7 | $20 | 1 |
| $20 | 6 | $30 | 2 |
| $30 | 2 | $40 | 2 |
| $40 | 2 | $50 | 6 |
| $50 | 1 | $60 | 7 |
| $60 | 1 | $1000 | 1 |

It may be helpful to look at a graph of these data. These plots show the possible voucher values and the frequency of each voucher value. This type of graph is called a frequency plot.

[Frequency Plot for Bag A and Frequency Plot for Bag B: one X per voucher, stacked above a horizontal scale of face values -$1000, $10, $20, $30, $40, $50, $60, $1000.]

DEFINITION: A frequency plot displays a set of observations by representing each observation value with an X positioned along a horizontal scale. If there are two or more observations with the same value, the X’s are stacked vertically.

Bag A has a total value of -$560, while Bag B has a total value of $1890. You will be shown only one of the bags. You will be allowed to gather some data on the basis of which you must decide whether to keep the shown bag or reject the shown bag and take the other bag. Once your decision is made and you have selected your bag, you are required to receive the sum of the face value of the vouchers in your bag (i.e., you win $1890 if the bag you pick is Bag B, or you pay $560 if the bag you pick is Bag A). Obviously, you do not want Bag A.
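The stated totals of -$560 and $1890 can be checked directly from the frequency tables. The following is a minimal sketch; the dictionary layout and the helper names `voucher_count` and `total_value` are choices made for this illustration only.

```python
# Face value -> number of vouchers, copied from the Bag A and Bag B tables.
bag_a = {-1000: 1, 10: 7, 20: 6, 30: 2, 40: 2, 50: 1, 60: 1}
bag_b = {10: 1, 20: 1, 30: 2, 40: 2, 50: 6, 60: 7, 1000: 1}


def voucher_count(bag):
    """Number of vouchers in the bag."""
    return sum(bag.values())


def total_value(bag):
    """Sum of the face values of all vouchers in the bag."""
    return sum(value * count for value, count in bag.items())


print(voucher_count(bag_a), total_value(bag_a))  # 20 -560
print(voucher_count(bag_b), total_value(bag_b))  # 20 1890
```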

The data will consist of selecting just one voucher from the shown bag (without looking in the bag) after mixing the contents of the bag well. In this case, we say the sample size, denoted by n, is equal to one, or n = 1.

DEFINITION: The number n of observations in a sample is called the sample size.

Based on this one observation, you must decide between the following hypotheses about the shown bag:

H0: The shown bag is Bag A.
H1: The shown bag is Bag B.

Note that both of these two hypotheses are statements about the same population, the shown bag. They are not statements about the bag that is not shown. Since Bag A is the undesirable bag, it is selected to be the null hypothesis.

What are the two errors that you could make when deciding between these two hypotheses?

Type I error = Reject H0 when H0 is true = You decide the shown bag is Bag B when in fact it is Bag A = You keep the shown bag, which is Bag A, and must pay $560.
Type II error = Stay with H0 when H1 is true = You decide the shown bag is Bag A when in fact it is Bag B = You select the other bag, which is Bag A, and must pay $560.

How will you decide based on one observation from the shown bag whether or not to reject H0? Consider first the obvious choices: If the one voucher you select is -$1000, then you will know that the shown bag is Bag A, and you will decide to stay with H0. Likewise, if the one voucher you select is $1000, then you will know that the shown bag is Bag B and will decide to reject H0. In each of these two cases, you will not have made an error.

Think About It
What if the voucher you select is $60? Would this observation lead you to think the shown bag is Bag A or Bag B? Why? How would you answer these questions if the voucher you select is $10?

In the preceding discussion, a decision rule is being formed. A decision rule is a formal rule that states, based on the data obtained, when you would reject the null hypothesis H0.

DEFINITION: A decision rule is a formal rule that states, based on the data obtained, when to reject the null hypothesis H0. Generally, it specifies a set of values based on the data to be collected, which are contradictory to the null hypothesis H0 and which favor the alternative hypothesis H1.

**1.4.1 Forming a Decision Rule**

Let’s develop a more formal and comprehensive decision rule based on the possible values that you could select from the shown bag. To do so, we need to examine the chances that a certain voucher value will be selected for each of the two possible bags. Since in Bag A there are 20 vouchers, exactly 7 of which have a value of $10, the chance that we select one of the 7 $10 vouchers, given that the shown bag is Bag A, is 7/20 (or 0.35). The other chances are computed similarly and are summarized as shown in the accompanying table.

| Face Value | Chance if Bag A | Chance if Bag B |
|---|---|---|
| -$1000 | 1/20 | 0 |
| $10 | 7/20 | 1/20 |
| $20 | 6/20 | 1/20 |
| $30 | 2/20 | 2/20 |
| $40 | 2/20 | 2/20 |
| $50 | 1/20 | 6/20 |
| $60 | 1/20 | 7/20 |
| $1000 | 0 | 1/20 |

The voucher values of $30 and $40 have the same chance of occurring, whether from Bag A or Bag B. The voucher values of $10, $20, $50, and $60 give you some clue as to which bag you have been shown. If you select a $10 or a $20 voucher, you might conclude that the bag is Bag A, since such observations are more likely to occur from Bag A. If you select a $50 or $60 voucher, you might conclude that the bag is Bag B, since these observations are more likely to occur from Bag B. In other words, in this scenario, it is the larger voucher values that are extreme or unlikely under the null hypothesis H0 and thus show the most support for the alternative hypothesis H1. We thus define the direction of extreme for this scenario to be to the right, corresponding to values that are large.

DEFINITION: The direction of extreme corresponds to the position of the values that are more likely under the alternative hypothesis H1 than under the null hypothesis H0. If the larger values are more likely under H1 than under H0, then the direction of extreme is said to be to the right.

The direction of extreme may not always be to the right, which we shall see later in Section 1.4.2. We use the concept of determining values that are extreme in the development of a decision rule. We will first consider the most extreme value.

DEFINITION: The value under the null hypothesis H0 that is least likely, but at the same time is very likely under the alternative hypothesis H1, is called the most extreme value. (Note that in some cases it may not be possible to find the most extreme value, but determining the direction of extreme will be sufficient to establish a decision rule.)
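The table of chances follows mechanically from the voucher frequencies: each chance is a frequency divided by the 20 vouchers in the bag. A short sketch, in which the helper name `chance` is an assumption made for this illustration:

```python
from fractions import Fraction

# Face value -> number of vouchers, as given for Bag A and Bag B.
bag_a = {-1000: 1, 10: 7, 20: 6, 30: 2, 40: 2, 50: 1, 60: 1}
bag_b = {10: 1, 20: 1, 30: 2, 40: 2, 50: 6, 60: 7, 1000: 1}


def chance(bag, value):
    """Chance of drawing a voucher with this face value in a single draw."""
    return Fraction(bag.get(value, 0), sum(bag.values()))


# One row per possible face value: chance if Bag A, chance if Bag B.
for value in sorted(set(bag_a) | set(bag_b)):
    print(value, float(chance(bag_a, value)), float(chance(bag_b, value)))
```

`Fraction` keeps the arithmetic exact; the values are converted to decimals only for display, so 7/20 appears as 0.35.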

Looking at the frequency plots for Bags A and B, of all the possible values specified under the null hypothesis (Bag A), the value that is both unlikely to come from Bag A and most likely to come from Bag B is the voucher value of $60. This most extreme value, indicated in the plots with an arrow, leads us to our first decision rule.

[Frequency Plot for Bag A and Frequency Plot for Bag B, with an arrow marking the value $60 on the horizontal scale of face values.]

H0: The shown bag is Bag A.
H1: The shown bag is Bag B.

Decision Rule 1: Reject H0 if you select a $60 or a $1000 voucher, or reject H0 if your selected voucher is ≥ $60.

Note that this decision rule is consistent with the direction of extreme being to the right: we reject the null hypothesis for values that are as large as or larger than $60. In this case, the only voucher value larger than $60 is the $1000 voucher, which could only come from Bag B, so we certainly would reject H0 if we selected it. The extreme value of $60 in this decision rule is sometimes referred to as the cutoff value, or critical value. You can think of the critical value as the trigger point that prompts you to change your decision from staying with H0 to rejecting H0. To any decision rule there corresponds a rejection region. A rejection region is the set of values for which you would reject H0. For Decision Rule 1, the rejection region is $60 or larger.

DEFINITIONS: A rejection region is the set of values for which you would reject the null hypothesis H0. An acceptance region is the set of values for which you would not reject the null hypothesis H0. A cutoff value, or critical value, is a value that marks the starting point of a set of values that comprise the rejection region.

It is possible that we may make a mistake if we use Decision Rule 1. In particular, a voucher value of $60 is possible from both Bag A and Bag B. Let’s see what the chances of committing these errors, α and β, would be if this decision rule was used. A Type I error is defined as rejecting H0 when H0 is true. For Decision Rule 1, a Type I error corresponds to selecting a $60 or larger voucher from Bag A. A Type II error is defined as failing to reject H0 when H1 is true. For Decision Rule 1, a Type II error corresponds to selecting a voucher less than $60 (namely, a -$1000, $10, $20, $30, $40, or $50 voucher) from Bag B.
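For a direction of extreme to the right, a cutoff value splits the possible voucher values into a rejection region and an acceptance region. The sketch below shows that split for Decision Rule 1; the function name `regions` is an assumption made for this illustration.

```python
# All face values that can occur in either bag.
possible_values = [-1000, 10, 20, 30, 40, 50, 60, 1000]


def regions(cutoff):
    """Split the possible values at the cutoff (direction of extreme: right)."""
    rejection = [v for v in possible_values if v >= cutoff]
    acceptance = [v for v in possible_values if v < cutoff]
    return rejection, acceptance


rejection, acceptance = regions(60)  # Decision Rule 1
print(rejection)   # [60, 1000]
print(acceptance)  # [-1000, 10, 20, 30, 40, 50]
```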

Since a Type I error could only occur if H0 is true, we find the chance of this error by looking at the frequency plot for Bag A.

α = chance of rejecting H0 when H0 is true
= chance of selecting a $60 or $1000 voucher from Bag A
= 1/20 = 0.05

Since a Type II error could only occur if H1 is true, we find the chance of this error by looking at the frequency plot for Bag B.

β = chance of staying with H0 when H1 is true
= chance of selecting a -$1000, $10, $20, $30, $40, or $50 voucher from Bag B
= 12/20 = 0.60

[Frequency Plot for Bag A, with α = 1/20 marked at $60 and above, and Frequency Plot for Bag B, with β = 12/20 marked below $60.]

The significance level α is 0.05. However, the value of β is 0.60. There would be a 60% chance of committing the error of staying with H0 when H1 is true. Are you satisfied with these levels for α and β? Would you prefer that either or both of them be lower? There are many possible decision rules, and the levels of α and β will depend on which decision rule is used. If the decision rule is changed, then the levels of α and β will generally change, and these levels are related to each other.

Decision Rule 1 resulted in a very large value of β because the rejection region was small, containing only the voucher values of $60 or $1000. So let’s consider enlarging the rejection region. To enlarge the rejection region, we need to find the next most extreme value among the remaining possible values under the null hypothesis that shows the most support from the alternative hypothesis. Among the remaining possible values under the null hypothesis (-$1000, $10, $20, $30, $40, and $50), which value is least likely under the null hypothesis, but at the same time shows the most support for the alternative hypothesis? The answer is $50, with the largest chance from Bag B of 6/20. This next most extreme value gives us a new rejection region, namely, voucher values of $50 or more. The corresponding decision rule is as follows:

Decision Rule 2: Reject H0 if your selected voucher is ≥ $50.
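The α and β calculations for a rule of the form "reject H0 if the voucher is at least the cutoff" can be reproduced from the bag contents. This is a sketch under the scenario's assumptions (one voucher, direction of extreme to the right); the helper names `chance_at_least` and `alpha_beta` are hypothetical.

```python
from fractions import Fraction

# Face value -> number of vouchers, as given for Bag A (H0) and Bag B (H1).
bag_a = {-1000: 1, 10: 7, 20: 6, 30: 2, 40: 2, 50: 1, 60: 1}
bag_b = {10: 1, 20: 1, 30: 2, 40: 2, 50: 6, 60: 7, 1000: 1}


def chance_at_least(bag, cutoff):
    """Chance that a single voucher drawn from the bag is >= cutoff."""
    hits = sum(count for value, count in bag.items() if value >= cutoff)
    return Fraction(hits, sum(bag.values()))


def alpha_beta(cutoff):
    """Error chances for the rule 'reject H0 if the voucher is >= cutoff'."""
    alpha = chance_at_least(bag_a, cutoff)     # reject H0 although Bag A is shown
    beta = 1 - chance_at_least(bag_b, cutoff)  # stay with H0 although Bag B is shown
    return alpha, beta


alpha, beta = alpha_beta(60)  # Decision Rule 1
print(float(alpha), float(beta))  # 0.05 0.6
```

Calling `alpha_beta(50)` gives the corresponding values for the enlarged rejection region of $50 or more.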

Note that this decision rule is consistent with the direction of extreme being to the right: we reject the null hypothesis for values as large as or larger than $50. The extreme value of $50 for this decision rule is the cutoff value, or critical value. Values larger than $50 are even more extreme than $50. The values for α and β corresponding to this decision rule are provided next.

Looking at the frequency plot for Bag A, we have

α = chance of rejecting H0 when H0 is true
= chance of selecting a $50, $60, or $1000 voucher from Bag A
= 2/20 = 0.10

Looking at the frequency plot for Bag B, we have

β = chance of staying with H0 when H1 is true
= chance of selecting a -$1000, $10, $20, $30, or $40 voucher from Bag B
= 6/20 = 0.30

[Frequency Plot for Bag A, with α = 2/20 marked at $50 and above, and Frequency Plot for Bag B, with β = 6/20 marked below $50.]

By enlarging the rejection region, the significance level α has increased to 0.10, but the value of β has decreased to 0.30. If we feel that the value of β is still too large, we must consider enlarging the rejection region again. What is the next most extreme value among the remaining possible values under the null hypothesis that shows the most support from the alternative hypothesis? Among the remaining possible values under the null hypothesis (-$1000, $10, $20, $30, and $40), the values of $30 and $40 are both most likely, with a chance from Bag B of 2/20. However, maintaining the direction of extreme for this scenario, we select the next largest value of $40. The corresponding decision rule is as follows:

Decision Rule 3: Reject H0 if your selected voucher is ≥ $40.

The values for α and β corresponding to this decision rule are provided next.

Looking at the frequency plot for Bag A, we have

α = chance of rejecting H0 when H0 is true
= chance of selecting a $40, $50, $60, or $1000 voucher from Bag A
= 4/20 = 0.20

Looking at the frequency plot for Bag B, we have

β = chance of staying with H0 when H1 is true
= chance of selecting a -$1000, $10, $20, or $30 voucher from Bag B
= 4/20 = 0.20

[Frequency Plot for Bag A, with α = 4/20 marked at $40 and above, and Frequency Plot for Bag B, with β = 4/20 marked below $40.]

Once again, by enlarging the rejection region, the significance level α has increased to 0.20, and the value of β has decreased to 0.20. A summary of the three decision rules and the resulting values for α and β is given in the following table:

| Decision Rule | Rejection Region | We say ... | α | β |
|---|---|---|---|---|
| 1. Reject H0 if the selected voucher is $60 or more. | $60 or more | ... the critical cutoff value is $60 and larger values are even more extreme. | 0.05 | 0.60 |
| 2. Reject H0 if the selected voucher is $50 or more. | $50 or more | ... the critical cutoff value is $50 and larger values are even more extreme. | 0.10 | 0.30 |
| 3. Reject H0 if the selected voucher is $40 or more. | $40 or more | ... the critical cutoff value is $40 and larger values are even more extreme. | 0.20 | 0.20 |

The values for α and β were determined because we knew what vouchers were in Bag A and Bag B. The values for α and β depended on the contents of the bags and the applicable decision rule. Given a particular decision rule, we can compute the levels of α and β exactly.

So far, we have started with various rejection regions and have compared the corresponding decision rules in terms of the chances of committing a Type I and Type II error, α and β, respectively. We can also take the opposite approach. For a specified level of significance α, a corresponding decision rule can be determined. For example, for a 5% significance level we could determine the rule as Decision Rule 1. In this case, if the selected voucher from the shown bag is a $60 voucher, we would reject H0 and the result would be statistically significant. However, it may be that for some values of α, a decision rule with that exact significance level cannot be determined. As we can see from the preceding summary table, there is no decision rule having a significance level of exactly α = 0.15.

Therefore, finding a decision rule for a specified significance level α requires us to look for a rejection region that yields a significance level as close to α as possible without exceeding α. If the level of α = 0.15 were specified, we would use Decision Rule 2.

Summary of the Relationship between the Decision Rule and the Significance Level

Decision rule → significance level α
If you start with a particular decision rule, with a given cutoff value, you can obtain the corresponding significance level α. For the Bag A–Bag B scenario with n = 1, if the decision rule is reject H0 if the selected voucher is $60 or more, then the corresponding significance level is α = 1/20 = 0.05.

Significance level α → decision rule
You can also take the opposite approach. You can be given a specified significance level at which to perform the test, and for that level, determine the corresponding decision rule. That is, we can find the appropriate cutoff value that would yield a significance level equal to, or closest to without exceeding, α. For the Bag A–Bag B scenario with n = 1, if the significance level is α = 0.10, then the corresponding decision rule is as follows: Reject H0 if the selected voucher is $50 or more.

The summary table also shows us that for a fixed sample size (here n = 1), there is a relationship between α and β. Enlarging the rejection region will increase the level of α and decrease the level of β. For a given level of α, is there any way we can reduce the level of β? You can see the answer to this question in Section 1.5, in which we are allowed to select two vouchers instead of a single voucher.

**1.4.2 More on the Direction of Extreme**

In the current Bag A–Bag B scenario, the direction of the extreme values was to the right. Since the rejection region contained values that were in just one direction, we say that the rejection region was one-sided. In the next example, we will also have a one-sided rejection region, but the direction of the extreme values will be to the left.

Example 1.5 ◆ One-Sided Rejection Region to the Left

Problem
We again have two bags, Bag C and Bag D, each containing 15 vouchers. The outsides of the two bags look alike. The contents of each bag, in terms of the face value, the frequency (or number) of vouchers, and the corresponding chance of selecting each voucher value, are described next. The corresponding graphical displays are also provided. You will be shown only one of the bags. The null hypothesis will be that the shown bag is Bag C. The alternative hypothesis is that the shown bag is Bag D.

H0: The shown bag is Bag C.
H1: The shown bag is Bag D.

Both of these hypotheses are statements about the same population, the one shown bag.

             BAG C                              BAG D
Face Value   Frequency   Chance    Face Value   Frequency   Chance
$1           1           1/15      $1           5           5/15
$2           2           2/15      $2           4           4/15
$3           3           3/15      $3           3           3/15
$4           4           4/15      $4           2           2/15
$5           5           5/15      $5           1           1/15

[Figure: Frequency Plot of Bag C and Frequency Plot of Bag D, one X per voucher stacked above each face value from $1 to $5.]

You are allowed to select just one voucher from the shown bag and must decide if you will reject H0 or fail to reject H0.

(a) What is the direction of extreme? Hint: Look at the frequency plots and determine what values are contradictory to the null hypothesis (Bag C) and show the most support for the alternative hypothesis (Bag D).
(b) Develop the decision rule using the most extreme value.
(c) Calculate α, the level of significance for the rule in part (b).
(d) Calculate β, the chance of a Type II error.
(e) Calculate the power of the test.

Solution
(a) Since the smaller voucher values show the most support for the alternative hypothesis and against the null hypothesis, the direction of extreme for this scenario is one-sided to the left, corresponding to values that are small.
(b) The most extreme value is the value $1, because the voucher value of $1 is the least likely to come from Bag C and at the same time shows the most support for the alternative hypothesis of Bag D. This most extreme value leads to the following decision rule:

Decision Rule 1: Reject H0 if your selected voucher is ≤ $1.

The rejection region is one-sided, containing only those values as small as or smaller than the cutoff value of $1.
(c) The value of α corresponding to this decision rule is as follows:
α = chance of rejecting H0 when H0 is true = chance of selecting a $1 voucher from Bag C = 1/15 = 0.067.
(d) The value of β corresponding to this decision rule is as follows:
β = chance of not rejecting H0 when H1 is true = chance of selecting a $2, $3, $4, or $5 voucher from Bag D = 10/15 = 0.667.
(e) The power of this test is 1 − β = 5/15 = 0.333.
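The arithmetic in parts (c) through (e) can be checked with a short Python sketch. The voucher counts are copied from the Bag C and Bag D tables; the helper names `chance` and `error_rates` are illustrative only and do not come from the text.

```python
from fractions import Fraction

# Face value -> number of vouchers, copied from the Bag C and Bag D tables.
BAG_C = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}   # null hypothesis H0
BAG_D = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}   # alternative hypothesis H1

def chance(bag, region):
    """Chance that one voucher drawn from `bag` has a face value in `region`."""
    total = sum(bag.values())
    return Fraction(sum(count for value, count in bag.items() if value in region), total)

def error_rates(h0_bag, h1_bag, rejection_region):
    """Return (alpha, beta, power) for a one-voucher test (hypothetical helper)."""
    alpha = chance(h0_bag, rejection_region)   # reject H0 although H0 is true
    power = chance(h1_bag, rejection_region)   # reject H0 when H1 is true
    beta = 1 - power                           # fail to reject H0 although H1 is true
    return alpha, beta, power

# Decision Rule 1: reject H0 if the selected voucher is <= $1.
alpha, beta, power = error_rates(BAG_C, BAG_D, {1})
print(alpha, beta, power)   # 1/15 2/3 1/3
```

The fractions 1/15, 2/3, and 1/3 match the decimal values 0.067, 0.667, and 0.333 given in the solution.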

Finally, note that with Decision Rule 1, the significance level was small (0.067), while the value of β was somewhat large (0.667). What should we do to reduce the level of β? We need to enlarge the rejection region, which you are asked to do in Let's Do It! 1.7.

The power of the test is the chance of rejecting the null hypothesis when the alternative hypothesis is true. The larger the power, the better is your test. We will discover some properties of a power curve and calculate several points of a power curve in Examples 1.14 and 1.15.

There is a relationship between α, β, and the sample size n. What is it? If we know two of these values we could determine the value of the third. In statistics, we first fix the value of α and reach the desired value of β (and thus the power) by changing the sample size n (larger sample sizes will reduce the value of β and consequently increase the power of the test).

What We've Learned: The decision rule and computation of α, β, and the power of the test all depend on the direction of extreme. So it is important to correctly identify the direction of extreme.

Let's Do It! 1.7  Enlarging the Rejection Region

Recall the Bag C and Bag D scenario of the previous example.

             BAG C                              BAG D
Face Value   Frequency   Chance    Face Value   Frequency   Chance
$1           1           1/15      $1           5           5/15
$2           2           2/15      $2           4           4/15
$3           3           3/15      $3           3           3/15
$4           4           4/15      $4           2           2/15
$5           5           5/15      $5           1           1/15

[Figure: Frequency Plot of Bag C and Frequency Plot of Bag D, face values $1 to $5.]

H0: The shown bag is Bag C.
H1: The shown bag is Bag D.

Decision Rule 1 had a significance level α = 0.067 and a rather large level for β = 0.667. Enlarge the rejection region and give a new decision rule that will have a smaller level for β. To do this, you need to find the next most extreme value among the values $2, $3, $4, and $5.

Decision Rule 2: Reject H0 if your selected voucher is ______.

The rejection region for Decision Rule 2 is ______.

The values for α and β corresponding to Decision Rule 2 are as follows:
α = ______    β = ______

How did these values compare to those for Decision Rule 1?

Example 1.6  Two-Sided Rejection Region to the Right and to the Left

Problem
We have two bags, Bag E and Bag F, each containing 30 vouchers. The outsides of the two bags look alike. You will be shown only one of the bags. The null hypothesis will be that the shown bag is Bag E. The alternative hypothesis is that the shown bag is Bag F.

H0: The shown bag is Bag E.
H1: The shown bag is Bag F.

The contents of each bag, in terms of the face value, the frequency (or number) of vouchers, and the corresponding chance of selecting each voucher value, is described next. The corresponding graphical displays are also provided.

             BAG E                              BAG F
Face Value   Frequency   Chance    Face Value   Frequency   Chance
$1           1           1/30      $1           5           5/30
$2           2           2/30      $2           4           4/30
$3           3           3/30      $3           3           3/30
$4           4           4/30      $4           2           2/30
$5           5           5/30      $5           1           1/30
$6           5           5/30      $6           1           1/30
$7           4           4/30      $7           2           2/30
$8           3           3/30      $8           3           3/30
$9           2           2/30      $9           4           4/30
$10          1           1/30      $10          5           5/30

You are allowed to select just one voucher from the shown bag and must decide to reject or not to reject H0.

[Figure: Frequency Plot of Bag E and Frequency Plot of Bag F, one X per voucher stacked above each face value from $1 to $10.]

(a) What is the direction of extreme? Hint: Look at the frequency plots and determine what values are contradictory to the null hypothesis (Bag E) and show the most support for the alternative hypothesis (Bag F).
(b) Develop the decision rule using the most extreme value.
(c) Calculate α, the level of significance for the rule in part (b).
(d) Calculate β, the chance of a Type II error.
(e) Calculate the power of the test.

Note: You have to decide in advance whether your problem is a one-sided test or a two-sided test before you see the data.

Solution
(a) Looking at the frequency plots, the values that are contradictory to the null hypothesis (Bag E) and show the most support for the alternative hypothesis (Bag F) are both the smaller voucher values and the larger voucher values. The direction of extreme for this scenario is said to be two-sided to the left and to the right, corresponding to values that are very small and very large.
(b) The most extreme value is the voucher value that is both the least likely to come from Bag E and at the same time the most likely to come from Bag F. Here, we have two values that are tied for the label of most extreme, namely, $1 and $10. We include them both in forming the first decision rule:

Decision Rule 1: Reject H0 if your selected voucher is either ≤ $1 or ≥ $10.

In this case, we have two cutoff values, one for each of the two directions, namely, $1 and $10. The rejection region contains those values as small or smaller than $1 and as large or larger than $10.
(c) α = 2/30 = 0.067.
(d) β = 20/30 = 0.667.
(e) The power of the test is 1 − β = 10/30 = 0.333.

Note that with Decision Rule 1, the significance level was small (0.067), while the value of β was somewhat large (0.667). What should we do to reduce the level of β? We need to enlarge the rejection region. Enlarging the rejection region is left as an exercise at the end of this chapter.

What We've Learned: In the two-sided test you have to look at values that are large and small for the calculations of α and β, and the power is calculated again as 1 − β. There is also a variation of this example in which there are two possible bags for the alternative hypothesis, and the computation is more complex.
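For the two-sided rule, the same calculation simply sums the chances from both tails. The following Python sketch uses the counts from the Bag E and Bag F tables; the function names are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction

# Face value -> number of vouchers, copied from the Bag E and Bag F tables.
BAG_E = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 5, 7: 4, 8: 3, 9: 2, 10: 1}   # H0
BAG_F = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1, 6: 1, 7: 2, 8: 3, 9: 4, 10: 5}   # H1

def two_sided_region(values, low_cutoff, high_cutoff):
    """Face values that are <= low_cutoff or >= high_cutoff."""
    return {v for v in values if v <= low_cutoff or v >= high_cutoff}

def chance(bag, region):
    """Chance that one voucher drawn from `bag` has a face value in `region`."""
    total = sum(bag.values())
    return Fraction(sum(count for value, count in bag.items() if value in region), total)

# Decision Rule 1: reject H0 if the voucher is <= $1 or >= $10.
region = two_sided_region(BAG_E, 1, 10)
alpha = chance(BAG_E, region)        # 2/30, both tails of Bag E
power = chance(BAG_F, region)        # 10/30, both tails of Bag F
beta = 1 - power                     # 20/30

print(sorted(region), alpha, beta, power)   # [1, 10] 1/15 2/3 1/3
```

Passing wider cutoffs to `two_sided_region` shows how enlarging the rejection region trades a larger α for a smaller β.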

DEFINITIONS: A rejection region is called one-sided if its set of extreme values are all in one direction, either all to the right or all to the left. A rejection region is called two-sided if its set of extreme values are in two directions, both to the right and to the left.

Let's Do It! 1.8  The Case of the Asymmetric Hypotheses

The frequency plots below provide the models for a fair die and a die loaded in favor of outcomes 1 and 2. For example, the chance of obtaining a "2" when you roll a fair die once is 3/18 and the chance of obtaining a "2" when you roll the loaded die once is 5/18.

[Figure: frequency plots for A: Fair Die and B: Loaded Die, outcomes 1 to 6.]

Suppose you have a die and you do not know if it is the fair die or the loaded die. You will be allowed to roll the die once and based on the outcome of your roll you have to decide between the following hypotheses:

H0: The die is fair.
H1: The die is loaded (in favor of 1 and 2).

(a) The direction of extreme for this test is (circle one):
    one-sided to the right    one-sided to the left    two-sided    can't tell
(b) Consider the following decision rule: Reject H0 if you roll a 2 or "more extreme" value. Clearly indicate on the graphs and compute the values of α, β, and the power.
    α = ______    β = ______    power = ______
(c) Suppose the die is rolled and the result was statistically significant at the significance level α. Then the decision made was (circle one):
    The die is fair.    The die is loaded (in favor of 1 and 2).

1.4.3 How Unusual Are the Data? The p-value

In this section, we return to our original Bag A–Bag B scenario (page 16) and turn our attention away from the fixed decision rule, based on a fixed significance level. Instead we will focus on what the observed data tell us. The idea is that we select our one voucher and assess whether or not the observed value would lead us to reject the null hypothesis. Since H0 is generally the prevailing viewpoint, we assume for the moment that H0 is true, that is, that the bag you select your one voucher from is indeed Bag A.

Suppose that we select one voucher. The observed value is the value of the voucher that we have selected. We ask, "If H0 is true (the bag is Bag A), how likely is it that we would get such a voucher value or a voucher that is even more extreme, which shows even more evidence against H0?" That chance has a special name, called the p-value. We find the p-value after we select and look at a voucher value from the shown bag. Since H0 states that the bag is Bag A, and we are assuming it is true, we use only Bag A to determine the p-value. Since in our example the larger voucher values show more evidence against H0, we compute the chance of getting the observed voucher value or a larger voucher under the assumption that the bag is Bag A.

DEFINITION: The p-value is the chance, computed under the assumption that H0 is true, of getting the observed value plus the chance of getting all of the more extreme values.

The p-value is a value between 0 and 1 that measures how likely the observed result is, or something even more extreme, if the null hypothesis H0 is true. A small p-value indicates that the observed data, or data even more extreme, are unlikely if the null hypothesis is true. A p-value that is not small indicates that the observed data, or data more extreme, are somewhat likely if the null hypothesis H0 is true. We should understand that the smaller the p-value, the stronger is the evidence provided by the data against the null hypothesis H0. Therefore, a small p-value corresponds to the data showing stronger evidence against the null hypothesis or, equivalently, more support for the alternative hypothesis H1.

Let's Do It! 1.9  Chromium Supplements

According to a study in the American College of Sports Medicine Journal and Science in Sports and Exercise, men who trained while taking chromium supplements did not grow appreciably stronger than did men who trained without it. Sixteen men were monitored in the study. Half got the chromium supplements and the other half got a fake substitute. At the end of the training program, both groups had become stronger, but there was no statistically meaningful difference in strength gains, the study reported. Consider the following hypotheses:

H0: No difference in strength gains between the two groups.
H1: Chromium group had a higher strength gain.

Based on the stated results, which hypothesis was supported by the data? Would the p-value for testing the preceding hypotheses have been somewhat small? Explain.

so it is compared to the required significance level a to make a decision. would this be called a Type I error or a Type II error? (c) If the results of Study A are “statistically significant.0018 t! Three Studies 0. Alternative Hypothesis The true average lifetime is 6 54 months. However.” which hypothesis is supported? (d) For each study.33. (b) Suppose that Study A concluded that the data supported the alternative hypothesis that the true average lifetime is less than 54 months. you will always make the same decision either way. Let’s return to the Bag A–Bag B scenario.QXD 04/13/2005 03:28 PM Page 30 30 CHAPTER 1 HOW TO MAKE A DECISION WITH STATISTICS et's Do I 1. Recall that you were asked to decide between the following two theories: H0: The shown bag is Bag A. How small should the p-value be in order to reject the null hypothesis? The p-value is the observed significance level based on the data. p-value 0.33.ALIAMC01_0131497561. The true proportion of adults who work two jobs is … 0.3590 Study C (a) For which study do the results show the most support for the null hypothesis? Explain. As you will see. determine if the rejection region would have been one-sided to the right. but in fact the true average lifetime is greater than or equal to 54 months. The true proportion of adults who work two jobs is 7 0. versus H1: The shown bag is Bag B.) 
Study A: Study B: Study C: one-sided to the right one-sided to the right one-sided to the right one-sided to the left one-sided to the left one-sided to the left two-sided two-sided two-sided We may use a fixed level of significance a to make a decision.10 L The following table summarizes the hypotheses and results for three different studies: Null Hypothesis Study A Study B The true average lifetime is Ú 54 months.We will examine two approaches for making a decision: using a decision rule for a given significance level (called the classical approach) and determining the p-value to be compared to the significance level (called the p-value approach). . The p-value is what provides a measure of the strength of the data against H0.0251 0. The average time to relief for all Treatment I users is not equal to the average time to relief for all Treatment II users. (Circle your answer. people may have different opinions as to which level is the appropriate one to use. or two-sided.You were allowed to select only one voucher. The average time to relief for all Treatment I users is equal to the average time to relief for all Treatment II users. In our statistical language. so you only have to use one approach. one-sided to the left.

Suppose you select a voucher from the shown bag and it is a $30 voucher. If the significance level was set at 10%, what would you decide?

Classical Decision Rule Approach with α = 0.10
For a 10% level of significance, we know that the corresponding decision rule would be "Reject H0 if the selected voucher is ≥ $50." This was Decision Rule 2 given on page 22. Since the observed $30 voucher is smaller than $50, we fail to reject H0. Note that the observed $30 voucher is in the acceptance region.

p-value Approach with α = 0.10
The p-value is the chance of getting the observed result or something even more extreme if the null hypothesis H0 is true. In this case, the larger the value of the selected voucher, the more support for the alternative hypothesis, that is, that the bag is Bag B. So the p-value is the chance, assuming the shown bag is Bag A, of getting the observed $30 voucher or a voucher that is even more extreme, that is, a value larger than $30. Looking at the frequency plot for Bag A, we see that there are 6 vouchers out of the 20 that are valued at $30 or more. So the p-value is 6/20 or 0.30, which is larger than the significance level of 0.10. So if the shown bag is Bag A, we would expect to select a $30 voucher, or something even larger, about 30% of the time. The p-value of 0.30 is larger than α = 0.10, so we fail to reject H0.

[Figure: Frequency Plot for Bag A (H0: The shown bag is Bag A), face values from −$1000 to $1000, with the observed value $30 and the cutoff value $50 marked, the acceptance region to the left of the cutoff and the rejection region from the cutoff to the right; p-value = 6/20 = 0.30, α = 2/20 = 0.10.]

The observed $30 voucher is in the acceptance region and the p-value 6/20 = 0.30 is larger than α = 0.10, and our decision is to fail to reject H0.

Suppose you select a voucher from the shown bag and it is a $60 voucher. If the significance level was set at 10%, what would you decide?

Classical Decision Rule Approach with α = 0.10
For a 10% level of significance, we know that the corresponding decision rule would be "Reject H0 if the selected voucher is ≥ $50." Since the observed $60 voucher is larger than $50, we reject H0. Note that the observed $60 voucher is in the rejection region.

p-value Approach with α = 0.10
The p-value is the chance, assuming the shown bag is Bag A, of getting the observed $60 voucher or a voucher that is even more extreme, that is, a value larger than $60. Looking at the frequency plot for Bag A, we see that there is exactly 1 such voucher out of the 20. So the p-value is 1/20 or 0.05, which is smaller than the significance level of 0.10. So if the shown bag is Bag A, we would expect to select a $60 voucher, or something even larger, only 5% of the time. The p-value of 0.05 is smaller than α = 0.10, so we reject H0.

[Figure: Frequency Plot for Bag A (H0: The shown bag is Bag A), face values from −$1000 to $1000, with the cutoff value $50 and the observed value $60 marked, acceptance and rejection regions as before; p-value = 1/20 = 0.05, α = 2/20 = 0.10.]

The observed $60 voucher is in the rejection region and the p-value 1/20 = 0.05 is smaller than α = 0.10, and our decision is to reject H0.

The following table summarizes the decision that would be made for various observed voucher values:

Observed   Classical decision rule approach (α = 0.10):      p-value approach:
voucher    Reject H0 if observed voucher ≥ $50               Reject H0 if the p-value ≤ α (= 0.10)
$30        Since $30 < $50, we do not reject H0.             Since the p-value = 6/20 = 0.30 > 0.10, we do not reject H0.
$40        Since $40 < $50, we do not reject H0.             Since the p-value = 4/20 = 0.20 > 0.10, we do not reject H0.
$50        Since $50 ≥ $50, we reject H0.                    Since the p-value = 2/20 = 0.10 ≤ 0.10, we reject H0.
$60        Since $60 ≥ $50, we reject H0.                    Since the p-value = 1/20 = 0.05 ≤ 0.10, we reject H0.

We will make the same decision whether we use the classical decision rule approach, starting with a decision rule for the specified significance level and comparing the observed voucher value to the cutoff value in the decision rule, or use the p-value approach and compare the p-value directly to the significance level. In general, we will prefer to use the p-value approach. Once the p-value is reported, a decision can quickly be made at any desired significance level.

Summary of the Relationship between the p-value and the Significance Level α
If the p-value is ≤ α: the data are statistically significant at the given level α and we reject H0.
If the p-value is > α: the data are not statistically significant at the given level α and we do not reject H0.
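The agreement between the two approaches can be verified mechanically. The Python sketch below uses only the Bag A counts quoted in this passage (the number of vouchers, out of 20, valued at or above each observed amount); the names `classical_reject`, `p_value`, and `p_value_reject` are illustrative assumptions.

```python
from fractions import Fraction

TOTAL_VOUCHERS = 20            # Bag A holds 20 vouchers
ALPHA = Fraction(1, 10)        # significance level 0.10
CUTOFF = 50                    # rule: reject H0 if the selected voucher is >= $50

# Number of Bag A vouchers valued at or above each amount, as quoted in the text.
AT_OR_ABOVE = {30: 6, 40: 4, 50: 2, 60: 1}

def classical_reject(observed):
    """Classical approach: compare the observed voucher to the cutoff."""
    return observed >= CUTOFF

def p_value(observed):
    """Chance, under Bag A, of the observed voucher or a larger one."""
    return Fraction(AT_OR_ABOVE[observed], TOTAL_VOUCHERS)

def p_value_reject(observed):
    """p-value approach: compare the p-value to the significance level."""
    return p_value(observed) <= ALPHA

decisions = {}
for observed in sorted(AT_OR_ABOVE):
    decisions[observed] = (classical_reject(observed), p_value_reject(observed))
    print(f"${observed}: p-value = {float(p_value(observed)):.2f}, "
          f"classical reject = {decisions[observed][0]}, "
          f"p-value reject = {decisions[observed][1]}")
```

Exact fractions are used so that the boundary case, a p-value of 2/20 compared with α = 0.10, is decided without floating-point rounding.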

Think About It

The significance level is α = 0.10. The corresponding decision rule is: "Reject H0 if the selected voucher is $50 or more." A voucher is selected and it turns out to be $60. Your decision is to reject the null hypothesis and conclude that the data are statistically significant at the 10% level. You rejected H0.

Could you have made a mistake? (circle one)    Yes    No
What type of mistake could you have made? (circle one)    Type I error    Type II error
What is the chance that you have made a mistake? ______

It is important to distinguish between "before we observe the data and state a decision rule" and "after we have observed the data and made a decision." Before we observe the data, we can state a decision rule and compute the corresponding chance of committing a Type I error, namely, α. For Decision Rule 2, we had α = 0.10, which is the chance that you will make a Type I error. After we have looked at the data and have made our decision, our decision is either right or wrong. If the observed voucher from the shown bag is $60, the decision is to reject H0. The shown bag is either Bag A or Bag B. If the shown bag is actually Bag B, then we have not made a mistake. If the shown bag is actually Bag A, then we have made a mistake, a Type I error. If we have observed the data and decided to reject H0, the chance that we have made a Type I error is either 1 (because the shown bag was Bag A) or 0 (because the shown bag was Bag B).

Note: Once you have made a decision, the decision is either right or wrong. The chance that we have made a Type I error is not equal to the significance level.

Example 1.7  p-value for a One-Sided Rejection Region to the Left

Recall the two bags, Bag C and Bag D, from Example 1.5 on page 23. Each bag contains 15 vouchers and you will be shown only one of the bags. The frequency plots that show the contents of each bag, in terms of the face value, are provided. The null hypothesis will be that the shown bag is Bag C. The alternative hypothesis is that the shown bag is Bag D.

H0: The shown bag is Bag C.
H1: The shown bag is Bag D.

You are allowed to select just one voucher from the shown bag and must decide whether or not to reject H0.

[Figure: Frequency Plot of Bag C and Frequency Plot of Bag D, face values $1 to $5.]

Problem
Suppose the observed voucher value is $3.
(a) Calculate the corresponding p-value.
(b) Are the data statistically significant at the level α = 0.10? at the level α = 0.05? at the level α = 0.01?

Solution
(a) We wish to compute the corresponding p-value. First, we need to remember that the p-value is computed based on the assumption that H0 is true. Recall that the smaller voucher values are less likely under the null hypothesis (Bag C) and show the most support for the alternative hypothesis (Bag D). Thus the direction of extreme for this scenario is to the left, corresponding to values that are small. So we need to find the chance of getting the observed voucher value of $3 or something smaller under the frequency plot for Bag C. There are 6 vouchers (shown in blue) out of 15 that are $3 or less, for a p-value of 6/15 = 0.40.
(b) For all three significance levels the data are not statistically significant since the p-value is larger than 0.10. The large p-value implies that we do not have enough evidence to reject the null hypothesis. We would support the conclusion that the shown bag is Bag C.

What We've Learned: The logic of statistical inference as it pertains to this example is as follows: If A is true, then B is unlikely. And B did not happen. So, as a consequence, we cannot reject A. This logic, however, does not tell you whether A is true or false.

Note: The p-value is the chance, computed under the assumption that H0 is true, of getting the observed value plus the chance of getting all of the more extreme values. That is, the p-value is the chance to observe the actual observed data plus the chance to observe more unlikely data than the observed one, if the null hypothesis is true. It is a conditional statement.
The p-value is not the chance that the null hypothesis is true.
The p-value is not the chance that the null hypothesis is false.
The p-value is not the chance that the alternative hypothesis is true.

Let's Do It! 1.11  p-value for a One-Sided Rejection Region to the Left

Consider again the two bags, Bag C and Bag D, from Example 1.5.

H0: The shown bag is Bag C.
H1: The shown bag is Bag D.

You are allowed to select just one voucher from the shown bag and must decide whether or not to reject H0.

[Figure: Frequency Plot of Bag C and Frequency Plot of Bag D, face values $1 to $5.]

(a) Suppose that the observed voucher value is $2. Find the corresponding p-value. For the following significance levels, are the data statistically significant?

    Significance level α    Circle one
    0.10                    Yes    No
    0.05                    Yes    No
    0.01                    Yes    No

(b) Suppose that the observed voucher value is $1. Find the corresponding p-value. For the following significance levels, are the data statistically significant?

    Significance level α    Circle one
    0.10                    Yes    No
    0.05                    Yes    No
    0.01                    Yes    No

Example 1.8  p-value for a Two-Sided Rejection Region

Recall the two bags, Bag E and Bag F, from Example 1.6 on page 26. Each bag contains 30 vouchers, and you will be shown only one of the bags. The frequency plots that show the contents of each bag, in terms of voucher face values, are provided. The hypotheses are

H0: The shown bag is Bag E.
H1: The shown bag is Bag F.

You are allowed to select just one voucher from the shown bag and must decide whether or not to reject H0.

[Figure: Frequency Plot of Bag E and Frequency Plot of Bag F, face values $1 to $10.]

Problem
Suppose that the observed voucher value is $3.
(a) Compute the corresponding p-value.
(b) Are the data statistically significant at the level α = 0.10? at the level α = 0.05? at the level α = 0.01?

Solution
(a) The p-value is the chance, computed under the assumption that H0 is true, of getting the observed value $3 plus the chance of getting all of the more extreme values. Recall that both the smaller voucher values and the larger voucher values are less likely under the null hypothesis (Bag E) and show the most support for the alternative hypothesis (Bag F). Thus, the direction of extreme for this scenario is to the left and to the right, corresponding to the values that are very small or very large. In Example 1.6, the values of $1 and $10 were tied for the label of most extreme, and based on these two cutoff values we had Decision Rule 1: Reject H0 if your selected voucher is either ≤ $1 or ≥ $10. The rejection region consisted of two equal parts, one at each end of the possible voucher values. Just as our rejection region included values in both directions, so will the p-value be computed using values in both directions. A voucher value of $8 is just as extreme as a voucher value of $3: the chance of getting $3 or less from Bag E equals the chance of getting $8 or more from Bag E. So, we need to find the chance of getting the observed voucher value of $3 or less or getting an observed voucher value of $8 or more under the frequency plot for Bag E. There are 12 vouchers (shown in blue) out of 30 that are $3 or less or $8 or more, for a p-value of 12/30 = 0.40.
(b) At any of the significance levels α = 0.10, α = 0.05, or α = 0.01, the data are not statistically significant since the p-value is larger than 0.10. The large p-value implies that we do not have enough evidence to reject the null hypothesis. We would support the conclusion that the shown bag is Bag E.

What We've Learned: You don't need to construct rejection regions. You just need to compare the p-value with the significance level α. If the p-value ≤ α, we reject the null hypothesis; otherwise, we do not reject the null hypothesis.

Let's Do It! 1.12  p-value for a Two-Sided Rejection Region

Consider again the two bags, Bag E and Bag F, from Example 1.6.

H0: The shown bag is Bag E.
H1: The shown bag is Bag F.

You are allowed to select just one voucher from the shown bag and must decide whether or not to reject H0.

[Figure: Frequency Plot of Bag E and Frequency Plot of Bag F, face values $1 to $10.]

(a) Suppose that the observed voucher value is $2. Find the corresponding p-value. For the following significance levels, are the data statistically significant?

    Significance level α    Circle one
    0.10                    Yes    No
    0.05                    Yes    No
    0.01                    Yes    No

(b) Suppose that the observed voucher value is $10. Find the corresponding p-value. For the following significance levels, are the data statistically significant?

    Significance level α    Circle one
    0.10                    Yes    No
    0.05                    Yes    No
    0.01                    Yes    No

Let's recap the various directions of extreme.

In our original Bag A–Bag B scenario (pages 16–23 and 31–33), it was the larger values that were unlikely under the null hypothesis H0 and showed the most support for the alternative hypothesis H1. So the direction of extreme was to the right, corresponding to values that are large. A p-value would be computed using this direction of extreme indicated by H1, calculating the chance of getting the values that are equal to or larger than the observed value under the H0 picture.

In our Bag C–Bag D scenario (Examples 1.5 and 1.7, pages 23 and 33), it was the smaller values that were unlikely under the null hypothesis H0 and showed the most support for the alternative hypothesis H1. So, the direction of extreme was to the left, corresponding to values that are small. A p-value would be computed using this direction of extreme indicated by H1, calculating the chance of getting the values that are equal to or smaller than the observed value under the H0 picture.

In our Bag E–Bag F scenario (Examples 1.6 and 1.8, pages 26 and 35), it was both the smaller values and the larger values that were unlikely under the null hypothesis H0 and showed the most support for the alternative hypothesis H1. So the direction of extreme was to the left and to the right, corresponding to values that are very small and very large. A p-value would be computed using this direction of extreme indicated by H1, calculating the chance of getting the values from both ends of the H0 picture.

In the next examples, we will have more practice at picturing the p-value. We will need to have the H0 picture, know the value that was observed, and determine the direction of extreme by comparing the H1 picture to the H0 picture. Rather than working with more bags of vouchers and their corresponding frequency plots, we will simplify the H0 and H1 pictures by working with their smoothed versions.
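The three directions of extreme recapped above can be collected into one small Python sketch. The function `p_value` and its `direction` argument are illustrative assumptions; in particular, the two-sided branch assumes that the H0 picture is symmetric (as Bag E is), so that the value "just as extreme" on the other side is the mirror image of the observed value.

```python
from fractions import Fraction

# Face value -> number of vouchers under H0, from the Bag C and Bag E tables.
BAG_C = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
BAG_E = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 5, 7: 4, 8: 3, 9: 2, 10: 1}

def p_value(h0_bag, observed, direction):
    """Chance, under the H0 bag, of the observed value or a more extreme one."""
    if direction == "left":
        extreme = [v for v in h0_bag if v <= observed]
    elif direction == "right":
        extreme = [v for v in h0_bag if v >= observed]
    elif direction == "two-sided":
        # Assumption: H0 is symmetric, so mirror the observed value about the center.
        mirror = min(h0_bag) + max(h0_bag) - observed
        low, high = sorted((observed, mirror))
        extreme = [v for v in h0_bag if v <= low or v >= high]
    else:
        raise ValueError("direction must be 'left', 'right', or 'two-sided'")
    return Fraction(sum(h0_bag[v] for v in extreme), sum(h0_bag.values()))

p_left = p_value(BAG_C, 3, "left")            # Example 1.7: 6/15
p_two_sided = p_value(BAG_E, 3, "two-sided")  # Example 1.8: 12/30

for alpha in (Fraction(1, 10), Fraction(1, 20), Fraction(1, 100)):
    print(float(alpha), p_left <= alpha, p_two_sided <= alpha)
```

Both p-values equal 2/5 = 0.40, so neither result is statistically significant at the 0.10, 0.05, or 0.01 level, in agreement with Examples 1.7 and 1.8.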
The frequency plot of the contents of Bag E and its smoothed version are shown here. In Example 1.8, we found the p-value when the observed voucher value was $3. The voucher values (X's) that are as extreme or more extreme under the H0 picture are highlighted in both the frequency plot and the smoothed version. We will see more smoothed versions of frequency plots in Chapter 6 and learn why the total area under the curve is equal to 1. In Examples 1.9 through 1.11, we will depict the p-value by shading in the appropriate region under the smoothed H0 picture.

[Figure: Frequency Plot of Bag E and Smoothed Version of Frequency Plot of Bag E, face values $1 to $10.]

The following tips are helpful to think about as we go through the remaining p-value examples.

p-value Tips
■ The p-value is always computed using the H0 picture.
■ The H1 picture is used to determine the direction of extreme by comparing it with the H0 picture.

Example 1.9  Can You Picture the p-value?

Problem
Consider two bags, each containing many vouchers. The outsides of the two bags look alike. You will be shown only one of the bags. The smoothed versions of the frequency plots depicting the contents of each bag, in terms of the face value, are provided. You are allowed to select just one voucher from the shown bag and must decide whether or not to reject H0.

[Figure: Smoothed Version of Frequency Plot under H0 and Smoothed Version of Frequency Plot under H1, horizontal axis in $ from −3 to 7.]

(a) What is the direction of extreme?
(b) Suppose the selected voucher is $2. Shade the area that corresponds to the p-value.
(c) Suppose the selected voucher is −$1. Shade the area that corresponds to the p-value.
(d) Suppose the selected voucher is −$1. Will the p-value be less than 0.5 or more than 0.5?
(e) Suppose the decision rule is to reject the null hypothesis if you get a value of 2.6 or more extreme. Show α by shading its corresponding area under the H0 curve. Show β and the power of the test by shading their corresponding areas under the H1 curve.

Note: Remember that α, β, the p-value, and the power of a test are all defined as chances, and thus must be numbers between zero and one. In future chapters the curves that represent the models under the competing hypotheses will be called density curves, which are defined to be non-negative functions that have a total area under the curve of one.

Solution
(a) Since the picture for the alternative hypothesis is shifted to the right, the direction of extreme is to the right, to the larger values.
(b) If the selected voucher is $2, then the p-value is given by the darker blue-shaded area in the picture below.

[Figure: Smoothed Version of Frequency Plot under H0, p-value for observed value of $2, horizontal axis in $ from −3 to 3.]

(c) If the selected voucher is −$1, then the p-value is given by the darker blue-shaded area in the picture below.

[Figure: Smoothed Version of Frequency Plot under H0, p-value for observed value of −$1, horizontal axis in $ from −3 to 3.]

(d) The p-value will be greater than 0.5 since it is the area under the H0 curve to the right of −$1 and that area is more than 1/2 of the total area.

(e) The significance level α is shown below as the area under the H0 curve to the right of 2.6. The value of β is shown below as the area under the H1 curve to the left of 2.6, and thus the power of the test is shown below as the area under the H1 curve to the right of 2.6.

[Figure: Smoothed Version of Frequency Plot under H0 (significance level shaded) and Smoothed Version of Frequency Plot under H1 (power shaded); horizontal axis: voucher value ($), −3 to 7]

What We've Learned: The graphical pictures of the models under the competing hypotheses help us visualize the relationship between α, β, the p-value, and the power of the test.

Example 1.10 ◆ Can You Picture the p-value?

Problem
Consider two bags, each containing many vouchers. The outsides of the two bags look alike. You will be shown only one of the bags. The smoothed versions of the frequency plots depicting the contents of each bag, in terms of the face value, are provided. You are allowed to select just one voucher from the shown bag and must decide whether or not to reject H0.

[Figure: Smoothed Version of Frequency Plot under H0 and Smoothed Version of Frequency Plot under H1; horizontal axis: voucher value ($), −7 to 3]

(a) What is the direction of extreme?
(b) Suppose the selected voucher value is $2. Shade the area that corresponds to the p-value.
(c) Suppose the selected voucher value is −$1. Shade the area that corresponds to the p-value.

The p-value is always found under H0. The p-value is computed assuming that H0 is true.

Solution
(a) Since the picture for the alternative hypothesis is shifted to the left, the direction of extreme is to the left, to the smaller values.
(b) If the selected voucher is $2, then the p-value is given by the darker blue-shaded area in the picture below.

[Figure: Smoothed Version of Frequency Plot under H0, p-value for observed value of $2; horizontal axis: voucher value ($), −3 to 3]

(c) If the selected voucher is −$1, then the p-value is given by the darker blue-shaded area in the picture below.

[Figure: Smoothed Version of Frequency Plot under H0, p-value for observed value of −$1; horizontal axis: voucher value ($), −3 to 3]

What We've Learned: To shade the p-value just concentrate on the null hypothesis. You will shade from the observed value to the right if the direction of extreme is to the right. You will shade from the observed value to the left if the direction of extreme is to the left. In this problem the alternative hypothesis is shifted to the left of the null hypothesis, so the p-value is found by shading the area from the observed value and to the left under the null hypothesis.

Example 1.11 ◆ Can You Picture the p-value?

Problem
Consider two bags, each containing many vouchers. The outsides of the two bags look alike. You will be shown only one of the bags. The smoothed versions of the frequency plots depicting the contents of each bag, in terms of the face value, are provided. You are allowed to select just one voucher from the shown bag and must decide whether or not to reject H0.

[Figure: Smoothed Version of Frequency Plot under H0 and Smoothed Version of Frequency Plot under H1; horizontal axis: voucher value ($), −3 to 3]

(a) What is the direction of extreme?
(b) Suppose the selected voucher is $2. Shade the area that corresponds to the p-value.
(c) Suppose the selected voucher is −$1. Shade the area that corresponds to the p-value.

Solution
(a) In the picture for the alternative hypothesis, both the smaller and the larger values show the most evidence against the null hypothesis and in favor of the alternative hypothesis. Thus, the direction of extreme is two-sided, to the left and to the right.
(b) If the selected voucher is $2, then the p-value is given by the darker blue-shaded area in the picture below.

[Figure: Smoothed Version of Frequency Plot under H0; the p-value for the observed value of $2 is the sum of the two shaded areas; horizontal axis: voucher value ($), −3 to 3]

(c) If the selected voucher is −$1, then the p-value is given by the darker blue-shaded area in the picture below.

[Figure: Smoothed Version of Frequency Plot under H0; the p-value for the observed value of −$1 is the sum of the two shaded areas; horizontal axis: voucher value ($), −3 to 3]

1 5.5 4.0 5. (a) The direction of extreme is (circle your answer) one-sided to the right. The hypotheses for the distribution of the length of the part (in mm) are presented at the right as the smoothed versions of the frequency plots.5 4. but around a higher average of 4. and clearly label the region with a. and clearly label the region with b. et's Do I 1. one-sided to the left.7 4.4 4. (ii) In the picture. (d) Is the result in part (c) statistically significant? (circle your answer) Yes Explain your answer. H0 : Machine A (b) Consider the following decision rule: Reject H0 if the selected part length is 4.4 WHAT’S IN THE BAG? 43 What We’ve Learned: For a two-sided test.6 mm. one to the right. the significance level.3 4.2 4. In the symmetric case like this one. the p-value is the sum of two equal areas.8 4.9 shade in the region that corresponds to the p-value and clearly label the region with the p-value. shade in the region that correH1: Machine B sponds to a.7 4. Machine B makes parts whose lengths vary similarly.2 length 5. the p-value is the sum of the two areas. In the picture.9 extreme. but you’re not sure. t! Machine A or Machine B? 5.13 L Machine A makes parts whose lengths average around 4. H0: The parts are from Machine A. so it is the area of a one-sided test multiplied by two. 4.3 4.8 mm or more 4.4 4. H1: The parts are from Machine B.9 mm.1 5.2 4.6 4. (c) The selected part length is 4. You will test the following hypotheses by randomly selecting one part from the box and measuring it. Suppose you have a box of parts that you believe are from Machine A. two-sided.ALIAMC01_0131497561.7 mm. shade in the region that corresponds to b. and one to the left.0 5.8 4. (i) In the picture.2 length No .6 4. the chance of a Type II error.QXD 04/13/2005 03:29 PM Page 43 1.

Example 1.12 ◆ Can You Picture α, β, the Power of the Test, and the p-value?

Problem
Consider two bags, each containing many vouchers. The outsides of the two bags look alike. The smoothed versions of the frequency plots depicting the contents of each bag are provided.

[Figure: Smoothed Version of Frequency Plot under H0 and Smoothed Version of Frequency Plot under H1; horizontal axis: voucher value ($), −3 to 7]

(a) Consider the following decision rule: Reject H0 if you observe a value of 1.5 or more extreme. Shade in the area corresponding to α, β, and the power of the test.
(b) Suppose the observed value is 2. Shade in the area that corresponds to the p-value.
(c) Would an observed value of 2 be statistically significant? Explain.

Solution
(a) The figure below shows the areas corresponding to α, β, and the power of the test.

[Figure: Smoothed Version of Frequency Plot under H0 (significance level shaded) and Smoothed Version of Frequency Plot under H1 (β and power shaded); horizontal axis: voucher value ($), −3 to 7]

(b) The figure below shows the area corresponding to the p-value.

[Figure: Smoothed Version of Frequency Plot under H0 with the p-value shaded; horizontal axis: voucher value ($), −3 to 7]

(c) Yes. The p-value is less than α. The data are statistically significant.

What We've Learned: Both the level α and the p-value are represented by areas under H0. In general their areas start at different values, but both areas will either extend to the right (for a one-sided test to the right), or both will extend to the left (for a one-sided test to the left), or they will both extend in two directions (for a two-sided test). The area for the level α is the significance level, often 0.10, 0.05, or 0.01, and this level should be selected first in a hypothesis-testing setting. The p-value depends on the observed value.

The value for β and thus the power of the test are both represented by areas under H1. The area for β is under H1 but in the opposite direction from α. If α was the area to the right of a particular value under H0, β is the area under H1 from the same value but to the left. If α was the area to the left of a particular value under H0, β is the area under H1 from the same value but to the right.

Example 1.13 ◆ Can You Shade and Calculate the Values of α, β, the Power of the Test, and the p-value?

Problem
The pictures below present the smoothed versions of the frequency plots depicting the contents of two bags, Bag A and Bag B, in terms of the face value of the vouchers within. One of the bags will be shown to you and you will be allowed to select just one voucher from that shown bag. Based on the one voucher value, you must decide whether or not to reject H0.

H0: The shown bag is Bag A.
H1: The shown bag is Bag B.

[Figure: H0 and H1 pictures, each a right triangle over the interval from $0 to $12 with maximum height 1/6; the H0 triangle is tallest at $0 and the H1 triangle is tallest at $12]

(a) The direction of extreme is (select one)
    one-sided to the right, one-sided to the left, two-sided.
(b) Consider the following decision rule: Reject H0 if the selected voucher value is $10 or more extreme.
    (i) In the appropriate picture, shade in the region that corresponds to the significance level, clearly label the region with α, and calculate the value of α. Hint: You need to find the equation of the line represented by the hypotenuse of the triangle in H0.

    (ii) In the appropriate picture, shade in the region that corresponds to β, the chance of a Type II error, clearly label the region with β, and calculate the value of β. Hint: You need to find the equation of the line represented by the hypotenuse of the triangle in H1.
    (iii) In the appropriate picture, shade in the region that corresponds to 1 − β, the power of the test, clearly label the region as power, and calculate the value of the power of the test.
(c) The observed voucher value is 3. In the appropriate picture, shade in the region that corresponds to the p-value, clearly label the region as p-value, and calculate the p-value.
(d) Give a new decision rule that will result in a larger significance level α* compared with the decision rule in part (b).
(e) How does the level of β for the new rule in part (d), call it β*, relate to the level of β for the rule in part (b)? (select one)
    β* > β    β* < β    β* = β

Solution
(a) The direction of extreme is one-sided to the right.
(b) (i) The slope of the hypotenuse of the triangle in H0 is m = −(1/6)/12 = −1/72. The equation of the hypotenuse of this triangle is y = −(1/72)x + 1/6. The significance level α is the area under this H0 triangle to the right of 10. It has a base of 2 and a height of −(1/72)(10) + 1/6 = 2/72 = 1/36. Thus α = (1/2)(2)(1/36) = 1/36 = 0.028.

[Figure: H0 triangle with the area to the right of 10 shaded; α = 1/36 = 0.028]

    (ii) The slope of the hypotenuse of the triangle in H1 is m = 1/72. The equation of the hypotenuse of this triangle is y = (1/72)x. The chance of a Type II error, β, is the area under this H1 triangle to the left of 10. It has a base of 10 and a height of (1/72)(10) = 10/72. Thus β = (1/2)(10)(10/72) = 50/72 = 0.694.

[Figure: H1 triangle with the area to the left of 10 shaded; height 10/72 at x = 10; β = 0.694]

    (iii) The power of the test is 1 − β = 11/36 = 0.306.

[Figure: H1 triangle with the area to the right of 10 shaded; power = 1 − 0.694 = 0.306]
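The areas in part (b) can be checked with a few lines of exact arithmetic. The following is only a sketch: the two density functions are the hypotenuse equations given in the solution, while the helper name trapezoid_area and the variable names are illustrative choices, not part of the example.

```python
from fractions import Fraction


def h0_density(x):
    """Height of the H0 triangle: y = -(1/72)x + 1/6 on the interval [0, 12]."""
    return Fraction(-1, 72) * x + Fraction(1, 6)


def h1_density(x):
    """Height of the H1 triangle: y = (1/72)x on the interval [0, 12]."""
    return Fraction(1, 72) * x


def trapezoid_area(density, left, right):
    """Exact area under a straight-line density between left and right."""
    return Fraction(right - left) * (density(left) + density(right)) / 2


cutoff = 10  # decision rule: reject H0 if the voucher value is $10 or more

alpha = trapezoid_area(h0_density, cutoff, 12)  # area under H0 to the right of 10
beta = trapezoid_area(h1_density, 0, cutoff)    # area under H1 to the left of 10
power = 1 - beta                                # area under H1 to the right of 10

print("alpha =", alpha, "=", round(float(alpha), 3))
print("beta  =", beta, "=", round(float(beta), 3))
print("power =", power, "=", round(float(power), 3))
```

Because both pictures are bounded by straight lines, every shaded region is a triangle or a trapezoid, so the same helper would also work for a different cutoff or for an area such as a p-value.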

(c) The p-value will be the area under the H0 triangle to the right of 3. The height when x = 3 is given by −(1/72)(3) + 1/6 = 1/8 = 0.125. So the p-value = (1/2)(9)(0.125) = 0.5625.

[Figure: H0 triangle with the area to the right of 3 shaded; height 1/8 at x = 3; p-value = 0.5625]

(d) A new decision rule could be: Reject H0 if the selected voucher value is $8 or more extreme (or in this case, $8 or larger).
(e) The corresponding new β* would be less than the original β.

What We've Learned: To shade the areas for α, β, the p-value, and the power of the test we follow the same rules as with any other hypothesis testing models. In this example, to calculate the corresponding areas you need to obtain the equations of the lines that form the hypotenuses of the two triangles and use those equations to calculate the height of the triangles at various values for x.

Example 1.14 ◆ What Is the Effect on the Power If the Alternative Hypothesis Moves Farther and Farther from the Value in the Null Hypothesis?

Problem
Suppose the null hypothesis is that the population mean, represented by μ, is equal to 3. The alternative hypothesis is that the population mean μ is more than 3. A significance level α has been set and the corresponding decision rule is to reject the null hypothesis if the observed value is 3.64 or larger. Consider three possible values for the population mean under the alternative hypothesis. Under the alternative H1 the population mean is 3.5, under the alternative H2 the population mean is 4, and under the alternative H3 the population mean is even further from the null hypothesis at the value of 4.5.

(a) Suppose the population mean is really larger than 3. For which alternative would it be easier to reject the null hypothesis: when the alternative H1 is true or when the alternative H3 is true?
(b) As the alternative hypothesis moves farther and farther away from the null hypothesis (in the direction of the alternative), what do you think will happen to the power of the test? The power will (select one)
    increase    decrease    stay the same

Solution
(a) It would make most sense that it would be easier to reject the null hypothesis if the true alternative model were farther away from the null model, so for H3.
(b) The power will increase. Let's see why by examining some pictures.

[Figure: smoothed frequency plots under H0 (μ = 3) and under H1 (μ = 3.5), H2 (μ = 4), and H3 (μ = 4.5); under each alternative the power is shaded as the area to the right of 3.64]

In this example the power was the area under the alternative hypothesis to the right of 3.64. As the value in the alternative hypothesis moved to the right, the area under the alternative hypothesis to the right of 3.64 became larger.

What We've Learned: The farther the value stated in the alternative hypothesis is from the value in the null hypothesis, the greater the power of the test. Note that when the value for μ in the alternative hypothesis gets closer and closer to the value for μ in the null hypothesis, the power decreases and approaches the significance level α.

In the previous three examples we have pictured and computed the power of the test for a variety of null and alternative hypotheses. In particular, Example 1.14 showed the power of the test for various values of the parameter under the alternative hypothesis. If we were to compute the power of the test for these values and plot them on a graph, we would have three points of a curve that is called the power curve (or power function).

DEFINITIONS: For a given null hypothesis H0 and a given significance level α, the power curve is the graph that shows the power of the test against the various population values in the alternative hypothesis.

As we began to see in Example 1.14, the power of the test approaches the value 1 when the population value in the alternative hypothesis gets further away from the population value in the null hypothesis. When the population value in the alternative hypothesis gets closer and closer to the population value in the null hypothesis, the power approaches the significance level α. In the next example we will again compute the power of the test under various values of the alternative hypothesis, this time for smoothed frequency plots that are a rectangle. These values of the power will be plotted and our first power curve will be displayed.

Example 1.15 ◆ The Power Curve

Problem
A fast-food store manager wishes to test hypotheses regarding the distribution of service time for customers using the drive-through window. The hypotheses for the distribution of time (length of time between order taken and order received in minutes) are presented below as the smoothed version of the corresponding frequency plots. Note that the mean service time under H0 is μ = 2 and that the mean service time under H1 is μ = 4. Based on the service time for a randomly selected customer, you must decide whether or not to reject H0.

[Figure: smoothed frequency plots under H0 and under H1, each a rectangle of height 1/4 = 0.25; horizontal axis: service time x in minutes, 0 to 7]

(a) The direction of extreme is (select one)
    one-sided to the right, one-sided to the left, two-sided.
(b) Consider the following decision rule: Reject H0 if the observed service time is 3.2 minutes or more extreme.
    (i) In the picture, shade in the region that corresponds to the significance level, clearly label the region with α, and calculate the value of α.

    (ii) In the picture, shade in the region that corresponds to β, the chance of a Type II error, clearly label the region with β, and calculate the value of β.
    (iii) In the picture, shade in the region that corresponds to the power of the test, clearly label the region with power = 1 − β, and calculate the value of the power.
    (iv) Call H2 the alternative hypothesis where the rectangle has the base [2.5, 6.5]. Shade the power for this H2, and calculate the value of this power.
    (v) Call H3 the alternative hypothesis where the rectangle has the base [3, 7]. Shade the power for this H3, and calculate the value of this power.
    (vi) Call H4 the alternative hypothesis where the rectangle has the base [3.1, 7.1]. Shade the power for this H4, and calculate the value of this power.
    (vii) Draw the picture of the power curve using the values from (iii), (iv), (v), and (vi).
(c) The observed service time is 3.5 minutes. In the picture, shade in the region that corresponds to the p-value and clearly label the region with p-value.
(d) Is the observed result in part (c) statistically significant? Explain.

Solution
(a) The direction of extreme is one-sided to the right.
(b) (i) The significance level α = 0.8(0.25) = 0.2 and is shown below.

[Figure: H0 rectangle with the area to the right of 3.2 shaded; (0.8)(0.25) = 0.2; horizontal axis: service time x in minutes]

    (ii) The chance of a Type II error β = 1.2(0.25) = 0.3 and is shown below.

[Figure: H1 rectangle with the area to the left of 3.2 shaded; (1.2)(0.25) = 0.3; horizontal axis: service time x in minutes]

    (iii) The power of the test is 1 − β = 0.7 and is shown below.

[Figure: H1 rectangle with the area to the right of 3.2 shaded; power = 1 − 0.3 = 0.7; horizontal axis: service time x in minutes]

    (iv) The power of the test is 3.3(0.25) = 0.825 and is shown below.

[Figure: H2 rectangle over [2.5, 6.5] with the area to the right of 3.2 shaded; power = (3.3)(0.25) = 0.825; horizontal axis: service time x in minutes]

    (v) The power of the test is 3.8(0.25) = 0.95 and is shown below.

[Figure: H3 rectangle over [3, 7] with the area to the right of 3.2 shaded; power = (3.8)(0.25) = 0.95; horizontal axis: service time x in minutes]

    (vi) The power of the test is 3.9(0.25) = 0.975 and is shown below.

[Figure: H4 rectangle over [3.1, 7.1] with the area to the right of 3.2 shaded; power = (3.9)(0.25) = 0.975; horizontal axis: service time x in minutes]

    (vii) The power curve using the values from the previous parts is shown below.

[Figure: power curve; vertical axis: power of the test, 0 to 1.0; horizontal axis: mean, 0 to 5; plotted points (2, 0.2), (4, 0.7), (4.5, 0.825), (5, 0.95), and (5.1, 0.975)]

(c) The p-value will be the area to the right of 3.5 under the null hypothesis. It is shown below and computed to be p-value = 0.5(0.25) = 0.125.

[Figure: H0 rectangle with the area to the right of 3.5 shaded; p-value = 0.125; horizontal axis: service time x in minutes]

(d) Yes, since the p-value of 0.125 is less than the significance level α of 0.2.
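The rectangle areas in this example follow one pattern (width of the shaded strip times the height 0.25), so the power curve can be tabulated in a short sketch. The bases [0, 4] for H0 and [2, 6] for H1 are not stated in the problem; they are inferred here from the height of 0.25 and the stated means of 2 and 4. The function name uniform_tail_area is an illustrative choice.

```python
def uniform_tail_area(left, right, cutoff):
    """Area to the right of cutoff under a rectangle with base [left, right] and total area 1."""
    height = 1.0 / (right - left)
    shaded_width = max(0.0, right - max(left, cutoff))
    return shaded_width * height


cutoff = 3.2  # decision rule: reject H0 if the service time is 3.2 minutes or more

# H0 base inferred from height 0.25 and mean 2 (an assumption, not given in the text).
alpha = uniform_tail_area(0, 4, cutoff)

# Bases for H2, H3, H4 are given in the problem; the H1 base is inferred from its mean of 4.
alternatives = {"H1": (2, 6), "H2": (2.5, 6.5), "H3": (3, 7), "H4": (3.1, 7.1)}

power_curve = []
for name, (left, right) in alternatives.items():
    mean = (left + right) / 2
    power = uniform_tail_area(left, right, cutoff)
    power_curve.append((mean, power))
    print(f"{name}: mean = {mean:.1f}, power = {power:.3f}")

# p-value for an observed service time of 3.5 minutes, computed under H0.
p_value = uniform_tail_area(0, 4, 3.5)
print(f"alpha = {alpha:.3f}, p-value = {p_value:.3f}")
```

Plotting the pairs in power_curve against the mean gives the points of the power curve in part (vii).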

The contents of each bag. Let’s see what happens when we increase the sample size—that is. in terms of the face value and the frequency of voucher values. We then asked the question. 1. Our decision will be based on the average of the two selected vouchers. H1: The shown bag is Bag B. The power function is a non-decreasing function that approaches 1 when the value in the alternative hypothesis gets further from the value in the null hypothesis. is there any way we can reduce the level of b?” The answer is to increase the sample size n. We will prepare a table of the possible averages and their frequencies. such as. . we will list all the possible combinations of two vouchers that can be selected. is as follows: Bag A Face Value $1000 $10 $20 $30 $40 $50 $60 Frequency 1 7 6 2 2 1 1 Bag B Face Value $10 $20 $30 $40 $50 $60 $1000 Frequency 1 1 2 2 6 7 1 Bag A Bag B We were to be shown only one of the bags. We were allowed to gather some data and.4. Without replacement means that when you remove the first voucher from the bag. based on that data. because we thought it was Bag B. the sample size. you keep it in your hand and then select another voucher from the bag. you might select two vouchers from the bag at the same time.ALIAMC01_0131497561. a and b. we considered two bags—called Bag A and Bag B. there are a number of factors that influence the power function. As we did when we selected just one voucher. What are the possible averages that we could observe? It depends on whether we are selecting from Bag A or Bag B. and the inherent variability of the values in the population under study. What pairs of values could we select? *This section is optional. Consider selecting the two vouchers without replacement from the shown bag. instead of observing just one voucher. without replacement. In general. The data consisted of selecting just one voucher from the shown bag—that is. the sample size was n = 1. or to keep the shown bag. 
had to decide whether to take the other bag.5 SELECTING TWO VOUCHERS* In Section 1. Alternatively. For this fixed sample size. Each bag contained 20 vouchers. we can develop decision rules and compute the corresponding levels.QXD 04/13/2005 03:29 PM Page 52 52 CHAPTER 1 HOW TO MAKE A DECISION WITH STATISTICS What We’ve Learned: We have graphed our first power function. “For a given level of a. we are going to observe two vouchers. With this information. there was a relationship between a and b: Enlarging the rejection region would increase the level of a and decrease the level of b. We had to decide between the following hypotheses about the one shown bag: H0: The shown bag is Bag A. The outsides of the two bags looked alike. and then we will calculate the frequency that each combination can occur. because we thought the shown bag was Bag A. the significance level.

Table 1.1 provides a listing of the possible pairs of values that could be selected (Columns 1 and 2), the corresponding average of the pair of values (Column 3), and the number of ways (frequency) of selecting that pair from Bag A (Column 4) and from Bag B (Column 5). The pair of −$1000 and $10 from Bag A can occur in seven ways, but cannot occur at all if the bag is Bag B. The pair of $10 and $20 can occur in 42 ways from Bag A, but in only one way from Bag B. The pair of $30 and $50 can occur in two ways from Bag A and in 12 ways from Bag B. The pair of $20 and $20 can occur in 15 ways from Bag A, but cannot occur from Bag B (which has just one $20). Why are these the number of occurrences?

Think About It
To calculate the number of different ways that a certain pair can occur (for example, −$1000 and $10), think of the seven $10's in Bag A as if they were of different colors, so you can distinguish them.

The pair −$1000 and $10 can occur in seven ways from Bag A: the single −$1000 voucher can go with any one of the seven $10's.

The pair of $10 and $20 can occur in 42 ways from Bag A: there are six ways for the 1st $10 to go with a $20, six ways for the 2nd $10, and so on, so 6 ways for each of the 7 $10's gives 42 ways in total.

The pair of $20 and $20 can occur in 15 ways from Bag A: there are 5 ways for the 1st $20 to go with the other 5 $20's, 4 ways for the 2nd $20 to go with the other 4 $20's, and so on, so 5 + 4 + 3 + 2 + 1 = 15 ways.

Notice that in Table 1.1, there are two entries in Column 3 corresponding to an average of $20, namely, when you select the vouchers ($10, $30) or ($20, $20). For Bag A, an average of $20 can occur a total of 14 + 15 = 29 ways, and for Bag B, an average of $20 can occur a total of 2 + 0 = 2 ways.

ALIAMC01_0131497561.1 BAG A Numbers of ways of selecting the two values 0 7 6 2 2 1 1 0 21 42 14 14 7 7 0 15 12 12 6 6 0 1 4 2 2 0 1 2 2 0 0 1 0 0 0 0 BAG B Number of ways of selecting the two values 0 0 0 0 0 0 0 0 0 1 2 2 6 7 1 0 2 2 6 7 1 1 4 12 14 2 1 12 14 2 15 42 6 21 7 0 Possible two values selected $1000 $1000 $1000 $1000 $1000 $1000 $1000 $1000 $10 $10 $10 $10 $10 $10 $10 $20 $20 $20 $20 $20 $20 $30 $30 $30 $30 $30 $40 $40 $40 $40 $50 $50 $50 $60 $60 $1000 $1000 $10 $20 $30 $40 $50 $60 $1000 $10 $20 $30 $40 $50 $60 $1000 $20 $30 $40 $50 $60 $1000 $30 $40 $50 $60 $1000 $40 $50 $60 $1000 $50 $60 $1000 $60 $1000 $1000 Average of the two selected values $1000 $495 $490 $485 $480 $475 $470 0 $10 $15 $20 $25 $30 $35 $505 $20 $25 $30 $35 $40 $510 $30 $35 $40 $45 $515 $40 $45 $50 $520 $50 $55 $525 $60 $530 $1000 .QXD 04/13/2005 03:29 PM Page 54 54 CHAPTER 1 HOW TO MAKE A DECISION WITH STATISTICS Table 1.

Table 1.2 presents a condensed version of Table 1.1, combining the entries corresponding to the same average. Note that the number of ways to select two vouchers from a bag with 20 vouchers is 190. (See Exercise 1.55.)

Table 1.2

Average of the two    BAG A: number of ways    BAG B: number of ways
selected values       of selecting the         of selecting the
                      two values               two values
−$1000                  0                        0
−$495                   7                        0
−$490                   6                        0
−$485                   2                        0
−$480                   2                        0
−$475                   1                        0
−$470                   1                        0
$0                      0                        0
$10                    21                        0
$15                    42                        1
$20                    29                        2
$25                    26                        4
$30                    20                        9
$35                    17                       17
$40                     9                       20
$45                     4                       26
$50                     2                       29
$55                     1                       42
$60                     0                       21
$505                    0                        1
$510                    0                        1
$515                    0                        2
$520                    0                        2
$525                    0                        6
$530                    0                        7
$1000                   0                        0
Total                 190                      190

From Table 1.2, we are able to produce a frequency plot of the resulting possible averages for Bag A and for Bag B. The frequency plots for Bag A and Bag B show the possible averages and the frequency of each average when the sample size n = 2 for both bags. Examine the frequency plots that follow, and be sure you understand what they represent.
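The counts in Table 1.2 can be reproduced by listing every unordered pair of vouchers in each bag and tallying the averages. This is a sketch only: the bag contents come from the frequency tables at the start of Section 1.5, while the names bag_a, bag_b, and average_frequencies are illustrative.

```python
from collections import Counter
from itertools import combinations

# One list entry per voucher, following the frequency tables for the two bags.
bag_a = [-1000] + [10] * 7 + [20] * 6 + [30] * 2 + [40] * 2 + [50] + [60]
bag_b = [10] + [20] + [30] * 2 + [40] * 2 + [50] * 6 + [60] * 7 + [1000]


def average_frequencies(bag):
    """Count how many unordered pairs of distinct vouchers give each average."""
    return Counter((x + y) / 2 for x, y in combinations(bag, 2))


freq_a = average_frequencies(bag_a)
freq_b = average_frequencies(bag_b)

print(f"{'Average':>8} {'Bag A':>6} {'Bag B':>6}")
for avg in sorted(set(freq_a) | set(freq_b)):
    print(f"{avg:8.0f} {freq_a[avg]:6d} {freq_b[avg]:6d}")
print(f"{'Total':>8} {sum(freq_a.values()):6d} {sum(freq_b.values()):6d}")
```

Selecting pairs with combinations treats vouchers of the same face value as distinguishable, which is the "different colors" idea from the Think About It box. Averages that cannot occur in either bag (−$1000, $0, and $1000) simply do not appear in the printed listing.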

[Figure: Frequency Plot for Bag A and Frequency Plot for Bag B, with one X for each of the 190 possible pairs; horizontal axis: Possible Averages ($), with values −495, −490, −485, −480, −475, −470, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 505, 510, 515, 520, 525, 530]

1.5.1 Forming a Decision Rule

The two competing theories are as follows:

H0: The shown bag is Bag A.
H1: The shown bag is Bag B.

How will you decide, based on the average of two selected vouchers from the shown bag, whether or not to reject H0? Consider first the obvious choices: If we obtain an average of −$495, −$490, −$485, −$480, −$475, −$470, or $10, then we will know that the shown bag is Bag A, and we will not reject H0. Likewise, if we obtain an average of $60, $505, $510, $515, $520, $525, or $530, then we will know that the shown bag is Bag B, and we will decide to reject H0. In each of these two cases, you will not have made an error. However, any of the other possible averages, $15, $20, $25, $30, $35, $40, $45, $50, or $55, could result from either Bag A or Bag B.

Think About It
Suppose the average of the two selected vouchers is $55. Would this observation lead you to think the shown bag is Bag A or Bag B? Why? How would you answer if the average of the two selected vouchers is $15?

An average of $55 could result from either Bag A or Bag B. If the bag is Bag A, an average of $55 could occur in just 1 out of 190 possibilities, but if the bag is Bag B, an average of $55 could occur in 42 out of 190 possibilities. An average of $55 is more likely to be obtained if the bag is Bag B. Likewise, an average of $15 could result from either Bag A or Bag B. If the bag is Bag A, an average of $15 could occur in 42 out of 190 possibilities, but if the bag is Bag B, an average of $15 could occur in only 1 out of 190 possibilities. An average of $15 is more likely to be obtained if the bag is Bag A.

It is the larger voucher values that are extreme or contradictory to the null hypothesis H0 and thus show the most support for the alternative hypothesis H1. The direction of extreme is to the right, corresponding to values that are large, and we will use this direction to find extreme values for developing possible decision rules. We will first consider the most extreme value (in the direction of H1) among the possible values under the null hypothesis that shows the most support for the alternative hypothesis.

Looking at the frequency plots for Bag A and Bag B, of all the possible averages specified under the null hypothesis (Bag A), the average that is most likely to come from Bag B is the average of $55. This most extreme value leads us to our first decision rule.

Decision Rule 1: Reject H0 if your selected average of the two vouchers is ≥ $55.

[Table 1.2 repeated, with the Bag A frequency for averages of $55 or more marked as α and the Bag B frequencies for averages less than $55 marked as β]

We may make a mistake if we use Decision Rule 1. A Type I error could occur only if H0 is true, so we find the chance of this error by looking at the frequencies for Bag A. Since under the null hypothesis (Bag A) there is only one average greater than or equal to $55, the significance level α = 1/190 = 0.0053. A Type II error could only occur if H1 is true, so we find the chance of this error by looking at the frequencies for Bag B. Under the alternative hypothesis (Bag B), there are a total of 29 + 26 + 20 + 17 + 9 + 4 + 2 + 1 = 108 averages that are less than $55, so the chance of a Type II error is β = 108/190 = 0.568.

The significance level α is quite small, only 0.0053; however, the value of β is very large, 0.568. So let's consider enlarging the rejection region. Notice that enlarging the rejection region will increase the level for α and decrease the level for β. To enlarge the rejection region, we need to find the next most extreme average among the remaining possible averages under the null hypothesis that shows the most support for the alternative hypothesis.

Among the remaining possible averages under the null hypothesis, which value shows the most support for the alternative hypothesis? The answer is $50. In particular, an average of $50 is possible from either Bag A or Bag B, with the largest chance from Bag B of 29/190. This next most extreme value gives us a new rejection region, namely, averages of $50 or more. The corresponding decision rule is as follows:

Decision Rule 2: Reject H0 if your selected average of the two vouchers is ≥ $50.

Since under the null hypothesis (Bag A) there are three averages greater than or equal to $50, the significance level α = 3/190 = 0.0158. Under the alternative hypothesis (Bag B), there is a total of 26 + 20 + 17 + 9 + 4 + 2 + 1 = 79 averages that are less than $50, so the chance of a Type II error is β = 79/190 = 0.416.
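The two error chances for any cutoff can be counted directly from the 190 possible pairs in each bag. The sketch below assumes the bag contents listed at the start of Section 1.5; the function name error_chances is an illustrative choice.

```python
from fractions import Fraction
from itertools import combinations

# One list entry per voucher, following the frequency tables for the two bags.
bag_a = [-1000] + [10] * 7 + [20] * 6 + [30] * 2 + [40] * 2 + [50] + [60]
bag_b = [10] + [20] + [30] * 2 + [40] * 2 + [50] * 6 + [60] * 7 + [1000]


def pair_averages(bag):
    """Averages of all unordered pairs of vouchers drawn without replacement."""
    return [(x + y) / 2 for x, y in combinations(bag, 2)]


def error_chances(cutoff):
    """Return (alpha, beta) for the rule 'reject H0 if the average is >= cutoff'."""
    averages_a = pair_averages(bag_a)  # H0: the shown bag is Bag A
    averages_b = pair_averages(bag_b)  # H1: the shown bag is Bag B
    # Type I error: rejecting H0 when the bag really is Bag A.
    alpha = Fraction(sum(avg >= cutoff for avg in averages_a), len(averages_a))
    # Type II error: failing to reject H0 when the bag really is Bag B.
    beta = Fraction(sum(avg < cutoff for avg in averages_b), len(averages_b))
    return alpha, beta


for rule, cutoff in [("Decision Rule 1", 55), ("Decision Rule 2", 50)]:
    alpha, beta = error_chances(cutoff)
    print(f"{rule}: alpha = {float(alpha):.4f}, beta = {float(beta):.3f}")
```

Lowering the cutoff moves pairs from the "do not reject" side to the "reject" side in both bags at once, which is why α rises as β falls.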

Let's Do It! 1.14 Finding a 5% Level Decision Rule

Decision Rules 1 and 2 both resulted in actual significance levels that were less than 0.05. Continue to enlarge the rejection region to find a decision rule with a 5% significance level. Recall that if there is no possible cutoff value that will make the significance level exactly equal to 0.05, we use the value that yields a significance level that is as close to 0.05 as possible without exceeding 0.05.

Suppose that the specified significance level is α = 0.05.
The decision rule will be "Reject H0 if the average of the two vouchers is ________."
This rule results in a significance level of α = ________.
The corresponding level for β is ________.

Summary of the Relationship between the Decision Rule and the Significance Level

If you start with a particular decision rule, with a given cutoff value, you can obtain the corresponding significance level α. You can also take the opposite approach. You can be given a specified significance level for performing the test, and for that level, we can find the appropriate cutoff value that would yield a significance level equal to, or closest to without exceeding, α. That is, for a given significance level of α, determine the corresponding decision rule.

Decision rule → significance level α: For the Bag A–Bag B scenario with n = 2, if the decision rule tells us to reject H0 if the average of the two selected vouchers is $55 or more, then the corresponding significance level is α = 1/190 = 0.0053.

Significance level α → decision rule: For the Bag A–Bag B scenario with n = 2, if the significance level is α = 1/190 = 0.0053, then the corresponding decision rule tells us to reject H0 if the average of the two selected vouchers is $55 or more.

Back on page 22, we asked whether, for a given significance level of α, there was any way we could decrease the level of β. We can answer this inquiry by comparing the results for n = 1 and n = 2. The following table shows that, for the same significance level of α = 0.05, increasing the sample size from n = 1 to n = 2 reduces the level of β:

Significance level α = 0.05
Sample size n = 1    Decision rule: reject H0 if your result is ≥ $60.    β = 12/20 = 0.60
Sample size n = 2    Decision rule: reject H0 if your result is ≥ $45.    β = 53/190 = 0.279
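Going from a significance level to a decision rule is a small search: enlarge the rejection region one average at a time until α would exceed the stated level. The sketch below does this by listing every possible sample. It rests on two assumptions that are not restated on this page: the single-voucher contents of each bag (inferred here so that pairs reproduce the n = 2 frequencies, and worth checking against the original description of the bags), and a rejection region that is one-sided to the right. The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# ASSUMPTION: single-voucher contents (value -> count, 20 vouchers per bag),
# inferred so that all pairs reproduce the n = 2 frequency table.
BAG_A = {-1000: 1, 10: 7, 20: 6, 30: 2, 40: 2, 50: 1, 60: 1}
BAG_B = {10: 1, 20: 1, 30: 2, 40: 2, 50: 6, 60: 7, 1000: 1}


def sample_averages(bag, n):
    """Averages of every possible sample of n vouchers drawn without replacement."""
    vouchers = [value for value, count in bag.items() for _ in range(count)]
    return [Fraction(sum(s), n) for s in combinations(vouchers, n)]


def rule_for_level(level, n):
    """Largest one-sided rejection region 'average >= cutoff' with alpha <= level.

    Returns (cutoff, alpha, beta), or None if even the most extreme
    average already exceeds the level.
    """
    avg_a = sample_averages(BAG_A, n)
    avg_b = sample_averages(BAG_B, n)
    best = None
    for cutoff in sorted(set(avg_a), reverse=True):
        a = Fraction(sum(x >= cutoff for x in avg_a), len(avg_a))
        if a > level:
            break  # enlarging the region any further would exceed the level
        b = Fraction(sum(x < cutoff for x in avg_b), len(avg_b))
        best = (cutoff, a, b)
    return best


for n in (1, 2):
    cutoff, a, b = rule_for_level(Fraction("0.05"), n)
    print(f"n = {n}: reject H0 if average >= ${cutoff}; "
          f"alpha = {float(a):.4f}, beta = {float(b):.3f}")
```

Under these assumptions the search reproduces the tabled rules for α = 0.05 (≥ $60 for n = 1 and ≥ $45 for n = 2). Calling `sample_averages(BAG_A, 3)` lists the 1140 possible samples of size three.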

The following table shows that, for the same significance level α = 0.10, increasing the sample size from n = 1 to n = 2 reduces the level of β:

Significance level α = 0.10
Sample size n = 1    Decision rule: reject H0 if your result is ≥ $50.    β = 6/20 = 0.30
Sample size n = 2    Decision rule: reject H0 if your result is ≥ $40.    β = 33/190 = 0.174

If the sample size n is increased, more information is available on which to base the decision. For a fixed significance level, if the sample size n is increased, the value of β will decrease, and thus the power of the test, 1 - β, will increase. The effect of the sample size on the power of the test is further discussed through the next Think About It question.

Think About It
What are the possible values for the average of n = 2 vouchers selected from Bag A? What are the possible values for the average of n = 2 vouchers selected from Bag B? How do these possible values compare to those for each bag when the sample size was just n = 1?

For Bag A we see that when n = 2, the values varied from -$495 to $55, whereas for n = 1 the values varied more, from -$1000 to $60. For Bag B we see that when n = 2, the values varied from $15 to $530, whereas for n = 1 the values varied more, from $10 to $1000. For both bags, when the sample size is larger, the values are less spread out and more tightly clustered in the center.

If we were to continue this process and work out all the possible samples of size n = 3 vouchers (without replacement), we would have 1140 possible average values for Bag A and 1140 possible average values for Bag B. (Do you want to try it?) For n = 3, the possible values would be even less spread out than when n = 2 and would be even more concentrated in the center. We will show in Chapter 8 that in general, when the sample size increases, the frequency plot of the sample averages becomes less spread out and more concentrated around the center. We will return to the effect of increasing the sample size in Chapters 2, 8, and 9. In the next section we wrap up our n = 2 Bag A–Bag B scenario by looking at the p-value and using it to make our decision as to whether we think the shown bag is Bag A or Bag B.

**1.5.2 What's in the Bag? p-value When the Sample Size Is 2**

We again turn our attention to the observed data, to measuring how likely the observed result is if the null hypothesis is true, namely, the p-value. Suppose that we select two vouchers and observe their average. We ask, "If H0 is really true, how likely would we be to observe an average of this magnitude or larger (in the direction supporting the alternative hypothesis) just by chance?"

Suppose that the observed average of the two selected vouchers is $30. What would you decide? What is the p-value? Looking at the frequencies for Bag A, we see that under the null hypothesis there are a total of 20 + 17 + 9 + 4 + 2 + 1 = 53 ways to get an average of $30 or larger. Thus, the p-value is 53/190 = 0.279.

Would you reject the null hypothesis H0 at the 0.10 level of significance? The answer is no, because the p-value of 0.279 is larger than the significance level of 0.10. We also see that the observed $30 average lies in the acceptance region for a 10% level decision rule.

Suppose that the observed average of the two selected vouchers is $50. What would you decide? What is your p-value? Looking at the frequencies for Bag A, we see that under the null hypothesis there are only 2 + 1 = 3 ways to get an average of $50 or larger. Thus, the p-value is somewhat small, 3/190 = 0.0158. Would you reject the null hypothesis H0 at the 0.10 level of significance? The answer is yes, because the p-value is very small, less than the significance level of 0.10. Note that the observed $50 average lies in the rejection region for a 10% level decision rule.

Think About It
Suppose that the significance level was set at α = 0.05, that is, the chance of a Type I error was set at 0.05. Two vouchers were then selected and the observed average turned out to be $50. Since the p-value for this observed result is 0.0158, which is less than the significance level of α = 0.05, you would reject the null hypothesis and conclude that the data are statistically significant at the 5% level.
You rejected H0. Could you have made a mistake? (circle one)    Yes    No
What type of mistake could you have made? (circle one)    Type I error    Type II error
What is the chance that you have made a mistake? Explain.

Once you have made a decision, the decision is either right or wrong, and the chance that you made the mistake is either 0 or 1. If your answer to the last question in the "Think About It" box was not 0 or 1, but was 0.05, go back and read page 33 on the distinction between before and after you look at the data.

Another common misinterpretation of the meaning of a p-value is to associate it with the chance that the null hypothesis is true. There are two competing hypotheses, and one of them really is true. The p-value is a chance that is computed assuming that the null hypothesis is true; it is the chance of observing a result as extreme (or even more so). When you read the results of a study, you may be provided with a p-value, and the conclusion will be left up to you as the reader. You will need to consider the consequences of the two types of errors. It is also important to consider the sample size the study was based on. The p-value of a test depends on the sample size.

We must be careful: more data do not necessarily mean more understanding. Large and expensive data sets can be useless because there may have been no statistical design principles underlying the data collection, or those interpreting the results may have lacked the statistical expertise to make sense of the data. In Chapters 2 and 3, we will discuss the notion that not all data are good data and learn some guiding statistical principles behind the collection of good data.

**1.6 SIGNIFICANT VERSUS IMPORTANT**

When the observed data are very different from what would be expected under the null hypothesis (that is, the p-value is very small), we say the result is statistically significant. Such a difference between the observed data and the null hypothesis theory was unlikely to occur just by chance alone. However, the statistical term significant does not mean "important." A result may be statistically significant, but it may not be practically significant. From a statistical viewpoint, a key factor to consider when assessing the results is the sample size, as the following cases explain:

Case 1: With a large enough sample, even a small difference, a small amount of improvement, can be found to be statistically significant, that is, hard to explain by chance alone. This does not necessarily make it important.

Case 2: On the other hand, an important difference may not be statistically significant if the sample size is too small.

Consider a study conducted to compare two drugs for treating strep throat, a new drug and a standard drug. The hypotheses being tested were as follows:

H0: The new drug is as effective as the standard drug in terms of length of time to achieve a "cure."
H1: The new drug is more effective than the standard drug in terms of length of time to achieve a "cure."

Suppose that the average time until "cure" for subjects treated with the new drug was 6.5 days, while the average for the subjects treated with the standard drug was 7 days. The new drug cured patients an average of 0.5 days sooner as compared to the standard drug. Statistical analysis of these data indicated that this difference in time to cure was "statistically significant." The maker of this new drug launches a marketing campaign citing that a recent study has "proven" this new drug works faster than the standard drug. Doctors begin to prescribe the new drug to diagnosed patients.

Perhaps we should slow down a bit. The statistical analysis only tells us that the half-day difference is almost impossible to have occurred by chance if indeed the two drugs were equally effective. What other issues should be considered? Suppose that the side effects with the new drug, while tolerable, are generally more severe or more frequent. For patients who must pay for prescriptions, cost may also play a role.

Think About It
Case 1
The population is the 100 students in the Engineering 101 class. The hypotheses to be tested are as follows:

H0: The class consists of all males (100% males).
H1: The class does not consist of all males (<100% males).

The decision rule is "Reject H0 if you observe at least one female in your sample." Suppose that the class actually consists of 2 females and 98 males, that is, H1 is actually true.

(a) If you take a sample of size n = 2, how likely would it be to obtain a female in your sample and thus reject the null hypothesis?    Not Very Likely    Very Likely
(b) If you take a sample of size n = 80, how likely would it be to obtain a female in your sample and thus reject the null hypothesis?    Not Very Likely    Very Likely

With a large enough sample, even this small difference (100% males under the null hypothesis versus the true percentage of 98% males in the population) can be found to be statistically significant.

Think About It
Case 2
The population is the 100 students in the Engineering 101 class. The hypotheses to be tested are as follows:

H0: The class consists of all males (100% males).
H1: The class does not consist of all males (<100% males).

The decision rule is "Reject H0 if you observe at least one female in your sample." Suppose that the class actually consists of 30 females and 70 males, so again, H1 is actually true.

(a) If you take a sample of size n = 80, how likely would it be to obtain a female in your sample and thus reject the null hypothesis?    Not Very Likely    Very Likely
(b) If you take a sample of size n = 2, how likely would it be to obtain a female in your sample and thus reject the null hypothesis?    Not Very Likely    Very Likely

If the sample size is too small, even a large difference (100% males under the null hypothesis versus the true percentage of 70% males in the population) may not be statistically significant.
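The "how likely" questions in the two cases can be given numbers. The sketch below assumes the sample is a simple random sample drawn without replacement from the 100 students (the text does not say how the sample is chosen); the function name `chance_of_rejecting` is illustrative. The chance of rejecting H0 is one minus the chance that every sampled student is male.

```python
from fractions import Fraction
from math import comb


def chance_of_rejecting(females, n, class_size=100):
    """Chance that a simple random sample of n students includes at least one female.

    ASSUMPTION: sampling is without replacement, every sample equally likely.
    """
    males = class_size - females
    # comb(males, n) is 0 when n exceeds the number of males, so the
    # chance of seeing at least one female is then exactly 1.
    return 1 - Fraction(comb(males, n), comb(class_size, n))


for label, females in [("Case 1 (2 females)", 2), ("Case 2 (30 females)", 30)]:
    for n in (2, 80):
        chance = chance_of_rejecting(females, n)
        print(f"{label}, n = {n}: chance of rejecting H0 = {float(chance):.4f}")
```

Under this sampling assumption the chance is about 0.04 for Case 1 with n = 2 and about 0.96 with n = 80, while for Case 2 it is about 0.51 with n = 2 and exactly 1 with n = 80.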

Example 1.16 ◆ Circuit Boards

Problem
The introduction of printed circuit boards (PCBs) in the 1950s revolutionized the electronics industry. However, solder-joint defects on PCBs have plagued electronics manufacturers. A single PCB may contain thousands of solder joints. A particular company currently uses an X-ray-based inspection system that can inspect approximately 10 solder joints per second. The company manager is considering the purchase of the latest laser-based inspection equipment, which supposedly can inspect, on average, more than 10 solder joints per second. The manager wishes to test the following hypotheses:

H0: The new laser equipment is as fast as the current X-ray equipment in terms of the average number of joints inspected per second, that is, the average is 10.
H1: The new laser equipment is faster than the current X-ray equipment in terms of the average number of joints inspected per second, that is, the average is > 10.

Suppose that the true average number of solder joints inspected per second with the laser equipment is just 10.001. Also suppose that enough observations are taken so that the difference, however small, was detected. The data were found to be statistically significant.

(a) What decision would the manager have made?
(b) What factors would influence whether or not the small difference would be of any practical importance?

Solution
(a) The manager would decide to reject the null hypothesis and conclude that the new equipment was indeed faster.
(b) The manager may need to weigh the cost of implementing the new equipment (purchasing, training of employees, etc.) against the actual increase in speed. The producer of the new equipment is interested in being able to advertise that this new product is indeed faster.

What We've Learned: The manager and the producer have different needs. Since the manager and producer have different needs, the level of significance and the sample size selected would probably be different.

Example 1.17 ◆ AIDS Affliction Rate

Problem
Consider the following hypotheses:

H0: The country has a rate of AIDS affliction of 71.4 cases per 100,000 population.
H1: The country has a rate of AIDS affliction of more than 71.4 cases per 100,000 population.

If the affliction rate is actually higher than 71.4 cases per 100,000 population, then the government will provide funding for the implementation of new programs. In this case, being able to detect a small difference is of practical significance.

(a) Would you want to take a small sample size or a large sample size?
(b) Suppose the rate is actually higher than 71.4 cases per 100,000 population. What could happen if the test is conducted using a sample that is too small?

Solution
(a) You would want to gather many observations, a large sample size, to get all the possible information you can.
(b) If the sample size is too small, you may not be able to detect the difference, even though the difference is practically important. The result may not be statistically significant.

What We've Learned: If the sample size is too small, even a large difference may not be statistically significant.

The decision to reject H0 does not mean that H1 is true, but rather that the data were strong enough to detect the difference. Even though your data are statistically significant (H1 is supported), you still might not act on H1 because the difference may not be of practical importance. The decision to support H0 does not mean that H0 is true, but rather that the data failed to detect the difference. This can occur when not enough observations are taken, when the level of significance is small, or simply because the decision was based on just a sample from the population.

So what sample size should we use? In general, larger samples can give us a better idea about the population, but larger samples may mean more time and higher costs. However, with a large enough sample, even a small difference can be found to be statistically significant. In Chapter 2, we discuss methods for selecting a sample. In Chapters 9 and 10, we will discuss statistical techniques that help us determine the sample size needed.

**SUMMARY**

■ The goal of this first chapter was to get you acquainted with the reasoning used in the statistical decision-making process.

■ First, we determine the null and alternative hypotheses. Both hypotheses should be statements about the same population(s). Since many computations in testing require that we assume that the null hypothesis is true, we need to state it correctly. The alternative hypothesis generally provides us with the direction of extreme that is needed to carry out the test (one-sided to the right, one-sided to the left, or two-sided).

Directions of Extreme:
One-sided to the right (H1 lies to the right of H0). Key words in H1 are higher, more, greater, increase.
One-sided to the left (H1 lies to the left of H0). Key words in H1 are lower, less, fewer, decrease.
Two-sided (H1 lies on either side of H0). Key words in H1 are different from, a change.

■ We must understand that errors can be made in making a decision. A Type I error is rejecting a true null hypothesis, and a Type II error is failing to reject a false null hypothesis.

■ The chance that we will make a Type I error is called the significance level, denoted by α. This level is specified in advance before the data are collected.

■ Once we collect and summarize the data, we measure how unlikely the data are if the null hypothesis is true (the p-value). More formally, the p-value is the chance, computed under the assumption that H0 is true, of getting the observed value plus the chance of getting all of the more extreme values. The smaller the p-value, the stronger is the evidence provided by the data against the null hypothesis.

■ Based on this p-value, we can make a decision. If the p-value is less than or equal to the significance level, we say the data are statistically significant and we reject H0.

■ Once a decision has been made, the decision is either right or wrong.

■ Besides determining if the data are statistically significant, we should consider the role of the sample size in assessing practical significance.

Looking ahead: Issues revolving around how to collect the data are addressed in Chapter 2. In Chapter 2, we are introduced to the idea that samples taken from the same population using the same basic method will not all yield the same results. Studies are conducted to test hypotheses. Chapter 3 discusses various types of studies and the questions to ask to help us decide intelligently which results are worthy of our attention. In Chapters 4, 5, and 6, you will learn various ways for summarizing data using graphs, numerical measures, and overall models, and in Chapter 7 we study chance, more formally referred to as probability. In Chapter 8, we experience what kind of dissimilarity we should expect to see among the various samples that could be obtained from the same population. The ability to quantify the amount by which sample results are likely to differ, from each other and from the population, is what allows us to use the statistical methods presented in Chapters 9 through 15. These last chapters bring us back again to statistical inference and decision making, discussing formal estimation and hypothesis testing. Details on how to measure, present, and assess the significance of such relationships are discussed in Chapters 13 and 14. Your understanding of the material presented in this chapter provides you with a sound foundation for pursuing the knowledge presented in the remaining chapters.

**KEY TERMS**

Be sure you can describe, in your own words, and give an example of each of the following key terms from this chapter:

scientific method
data
population
sample
statistical inference
hypotheses
null hypothesis
alternative hypothesis
statistically significant
Type I error
Type II error
significance level
power of the test
frequency plot
sample size
decision rule
direction of extreme
most extreme value
rejection region
acceptance region
critical value (cutoff value)
one-sided rejection region
two-sided rejection region
p-value
smoothed frequency plot
power curve (power function)
practical significance versus importance

**EXERCISES**

1.1 Decide whether each of the following statements is true or false:
(a) The first stage (step) in the scientific method is to formulate a theory (hypothesis).
(b) A theory (hypothesis) is rejected if it can be shown statistically that the data we observed would be very unlikely to occur if the theory were in fact true.
(c) A well-planned study will always provide proof for a theory.
(d) The null hypothesis and the alternative hypothesis are each a statement (sentence) about the resulting sample.

1.2 In statistics, we use the symbols of H0 and H1. Write out an explanation of what these mean for a person who has not had statistics.

1.3 Commercial fishermen working certain parts of the Atlantic Ocean sometimes have trouble with the presence of whales. They would like to scare away the whales without frightening the fish. In the past, sonar researchers have determined that 40% of all whales seen in an area do leave on their own, probably to get away from the noise of the fishing boat. The fishermen plan to try a new technique to increase that figure. The technique involves transmitting the sounds of a killer whale underwater. State the appropriate null and alternative hypotheses for testing whether or not the new technique works in terms of the percent of whales that leave the area.

1.4 The accompanying article suggests that "a vaccine treatment for malignant melanoma, the most deadly skin cancer, greatly improves survival with no serious side effects." The vaccine was administered to patients whose melanoma had spread. One measure of the effect of this vaccine is the percentage of such patients who survived five years since treatment. The article states that of those receiving other treatments, 10% survive five years. State the appropriate hypotheses for assessing whether this vaccine is just as effective as other treatments, or if this vaccine is more effective than other treatments, in terms of the five-year survival rates (Source: USA Today, March 29, 1993).

Vaccine battles melanoma
By Tim Friend, USA TODAY
SAN DIEGO: A vaccine-style treatment for malignant melanoma, the most deadly skin cancer, greatly improves survival with no serious side effects, researchers reported Sunday. Results suggest it could be a major new treatment. Researchers gave the vaccine to 355 patients whose melanoma had spread. Findings presented at the American Cancer Society's Science Writers Seminar show:
27% survived five years compared with 10% getting other treatments.
22% in the latest stage had tumor regression; tumors in three disappeared.
Dr. Donald Morton, John Wayne Cancer Institute in Santa Monica, Calif., says the vaccine, made from tumor cells, stimulates the immune system to attack the melanoma. Dr. Laurence Meyer, University of Utah, thinks the vaccine may prevent melanoma in high-risk people or prevent recurrences.

1.5 For each set of hypotheses, decide whether a Type I error or a Type II error would be more serious. Explain your choice. Answers may vary.
(a) H0: The gun is loaded. H1: The gun is not loaded.
(b) H0: The dog bites. H1: The dog does not bite.
(c) H0: The mall is open. H1: The mall is closed.
(d) H0: The watch is not waterproof. H1: The watch is waterproof.

1.6 For each set of hypotheses, decide whether a Type I error or a Type II error would be more serious. Explain your choice. Answers may vary.
(a) H0: The electricity is turned on. H1: The electricity is not turned on.
(b) H0: The brakes are not operational. H1: The brakes are operational.
(c) H0: The snake is poisonous. H1: The snake is not poisonous.
(d) H0: It is safe to cross the street. H1: It is not safe to cross the street.

1.7 An international corporation plans to institute a drug-testing plan for its employees. An employee is assumed not to be a drug user, so the null hypothesis is "not a drug user," and the alternative hypothesis is "drug user." The test classifies a person as a drug user 4% of the time when the person is really not a drug user.
(a) Explain what a Type I error and a Type II error are in this situation.
(b) What is the chance of a Type I error in this situation?

1.8 The owner of a local nightclub has recently surveyed a sample of 100 patrons of the club. She would like to determine whether or not the mean age of her patrons is over 30 years. If so, she plans to remodel the club and alter the entertainment to appeal to the older crowd. If not, no changes will be made. Her hypotheses are
H0: The average age for all patrons is 30 years.
H1: The average age for all patrons is more than 30 years.
Explain the consequences of a Type I error.

1.9 A consumer protection agency received many complaints that the sodium content in a six-ounce can of Star Kist tuna is greater than the 250 mg that is stated on the label of each can. In response to these complaints, the protection agency tested the following hypotheses:
H0: The average sodium content of all six-ounce cans is 250 mg.
H1: The average sodium content of all six-ounce cans is greater than 250 mg.
The data from the study were not statistically significant.
(a) What hypothesis was supported?
(b) Was a complaint registered with the provider, the Star Kist Co.?
(c) Could a mistake have been made? If so, describe the mistake.

1.10 Suppose that you are an amateur gardener with a fondness for tomatoes and statistics. For the past few years you have always used one particular brand of fertilizer, Brand A, on your tomato plants, but now you think a new more expensive fertilizer, Brand B, may increase the yield you get from your tomato plants. Therefore, you decide that this year you will fertilize half your tomato plants with Brand A and half with Brand B and compare the average yields for the two types of fertilizer.
(a) What is your null hypothesis and your alternative hypothesis?
(b) Explain what Type I and Type II errors represent in this situation. Are they serious?

1.11 Find an article in a recent newspaper in which a hypothesis is stated. State the corresponding null and alternative hypotheses and write down in words the meaning of both a Type I error and a Type II error.

1.12 Pharmaco, Inc. is a drug company that has come up with a new drug, Septaphine, to fight high blood pressure. They claim that their new drug is much better for reducing blood pressure compared to Cephaline, the drug that has been used until now to treat high blood pressure. They decide to test their claim.
(a) State clearly the null and alternative hypotheses.
(b) Pharmaco, Inc., found that Septaphine was significantly better than Cephaline using a 10% level of significance.

(i) What hypothesis was supported?
(ii) Could a mistake have been made? If so, describe the mistake.

1.13 For a certain statistical test with a particular decision rule, α = 0.10 and β = 0.… Suppose that the sample size n will remain the same, but the decision rule is changed so that α is now 0.05. Determine which of the following is possible for the new value of β and explain your answer: 0.…

1.14 If you do not reject the null hypothesis, you may be making … (select your answer)
(a) a Type I error.
(b) a Type II error.
(c) a correct decision.
(d) (a) or (c)
(e) (b) or (c)

1.15 Recall Example 1.6 on the two-sided rejection region (page 26). The first decision rule being considered was as follows:
Decision Rule 1: Reject H0 if your selected voucher is either ≤ $1 or ≥ $10.
The rejection region contains those values as small or smaller than $1 and as large or larger than $10.
(a) Find the levels α and β corresponding to Decision Rule 1.
(b) Enlarge the rejection region to provide a Decision Rule 2. Compute the corresponding levels of α and β. How do they compare with those in part (a)?

1.16 In this exercise, we extend Example 1.6 from having two identical bags, Bag E and Bag F, to having three identical bags, Bag E, Bag G, and Bag H. Each bag contains 30 vouchers. The frequency plots that show the contents of each bag, in terms of the vouchers' face values, are provided.

[Frequency Plot of Bag E; horizontal axis: voucher value, $1 to $10]
[Frequency Plot of Bag G; horizontal axis: voucher value, $1 to $10]
[Frequency Plot of Bag H; horizontal axis: voucher value, $1 to $10]

The null hypothesis will be that the shown bag is Bag E. The alternative hypothesis is that the shown bag is either Bag G or Bag H, and you will be shown only one of the three bags.
H0: The shown bag is Bag E.
H1: The shown bag is Bag G or Bag H.
You are allowed to select just one voucher from the shown bag and must decide whether or not to reject H0. We still have that both the smaller voucher values and the larger voucher values are less likely under the null hypothesis (Bag E) and show the most support for the alternative hypothesis (either Bag G or Bag H). The direction of extreme for this scenario is to the left and to the right, corresponding to the values that are very small or very large.
(a) Consider the following decision rule: Reject H0 if your selected voucher is either ≤ $1 or ≥ $10. What is the significance level, α, corresponding to this decision rule?
(b) Suppose that the observed voucher value is $3. Compute the corresponding p-value.
(c) Recall that β is the chance of failing to reject H0 when H1 is true. If H1 is true, the shown bag is either Bag G or Bag H. To compute the value for β, we need one more piece of information, namely, which of these two alternative bags is the shown bag.
(i) Compute β assuming that the shown bag is Bag G.
(ii) Compute β assuming that the shown bag is Bag H.

1.17 Consider again the two identical bags, Bag C and Bag D, from the earlier example.
H0: The shown bag is Bag C.
H1: The shown bag is Bag D.

[Frequency Plot of Bag C; horizontal axis: voucher value, $1 to $5]
[Frequency Plot of Bag D; horizontal axis: voucher value, $1 to $5]

You are allowed to select just one voucher from the shown bag and must decide whether or not to reject H0. Knowing only the contents of each bag, can you compute β, the chance of a Type II error? If yes, compute it. If no, explain what other information is needed to do so. (You may wish to review Example 1.5 on page 23.)

1.18 In hypothesis testing, if a level of significance of 0.05 is chosen, the 0.05 indicates (select your answer)
(a) the chance of a Type II error is 0.05.
(b) if H0 is true, the chance of falsely rejecting it is 0.05.
(c) 95% of the time H0 is true.
(d) 5% of the time H0 is true.

1.19 Is a randomly chosen newborn baby equally likely to be a girl or a boy? Suppose that a researcher decides to investigate this question by checking the records of births for the past 10 years at a large metropolitan hospital. The null hypothesis can be stated as "The proportion of newborns that are girls is 0.50."
(a) Give the appropriate alternative hypothesis.
(b) Is the direction of extreme one-sided to the right, one-sided to the left, or two-sided?

1.20 Determine whether each of the following statements is true or false (a true statement must always be true; if the statement is false, explain why it is false):
(a) In hypothesis testing, if the null hypothesis is rejected, it is because it is not true.
(b) In hypothesis testing, if the significance level α = 0.05, then β = 1 - 0.05 = 0.95.
(c) In hypothesis testing, if the decision is to fail to reject H0, a Type II error could be made.
(d) In hypothesis testing, if the null hypothesis is true, a Type II error could be made.
(e) A one-sided test is used whenever the sample size is small.

1.21 Suppose that there are exactly two boxes, Box A and Box B. Each box contains 25 same-sized tokens. Each token is inscribed with a dollar value. The frequency table (distribution) for each box is as follows:

Value ($)            0    5   10   15   20   25   30
Box A frequency      1    1    2    3    4    6    8
Box B frequency      8    6    4    3    2    1    1

22 Suppose that there are exactly two bags. (b) What is the direction of extreme? (c) Write a decision rule so that the probability of a Type I error is as close to. (b) A Type II error occurs whenever we fail to reject the null hypothesis. The frequency table for each bag is as follows: Bag A Value ($) 2 4 6 8 10 12 14 16 18 Frequency 1 2 4 10 16 10 4 2 1 Bag B Value ($) 2 4 6 8 10 12 14 16 18 Frequency 10 7 5 2 2 2 5 7 10 Furthermore. (d) What is the numerical value of a for your decision rule? (e) What is the chance for committing a Type II error. Each bag contains 50 equal-sized tokens inscribed with a dollar value. You must judge (decide) the probable truth of the claim statistically by randomly selecting one token from the given bag and using the p-value approach. Using the decision rule from part (c). what is your decision? 1. b. .10 for a statistical test. (d) What is the numerical value for a for your decision rule in part (c)? (e) What is the chance of committing a Type II error using your decision rule? (f) Suppose that you select a token inscribed with $5. then b must be 0. (a) Draw a frequency plot for the two hypotheses. suppose that just one of these two bags is presented to you with the claim that the bag is Bag A.QXD 04/13/2005 03:29 PM Page 71 EXERCISES 71 Suppose that you are shown just one of the boxes. but no more than. (c) If you suspect that the average life expectancy is not 76 years (as it has been accepted in the past). Assume that you have selected a voucher inscribed with $14. You must decide (judge) the probable truth of this claim statistically by selecting at random one token from the shown box. if your rule is used? (f) Suppose that a token inscribed with $8 is chosen. 10%. (c) Give a reasonable decision rule that could be used in the statistical test. then the rejection region for your decision rule would be one-sided to the right. (i) What would be your decision? (ii) What type of error (mistake) could you have committed? 
(g) Suppose that a token inscribed with $16 is chosen. Bag A and Bag B. (a) Write the two hypotheses for the statistical test. What is the direction of extreme? (b) Write the two hypotheses. A claim is made that the box is Box A.90.ALIAMC01_0131497561. (i) What would be your decision? (ii) What type of error (mistake) could you have committed? 1.23 Decide whether each of the following statements is true or false: (a) If a is specified to be 0. but is more than 76 years.
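The Box A / Box B exercise (1.21) asks for a decision rule whose Type I error chance is as close to 10% as possible without exceeding it. The sketch below shows one way to search for such a rule from the frequency tables. It assumes the direction of extreme is to the left (Box B puts most of its tokens on small values); the helper names `left_tail` and `best_left_cutoff` are made up for illustration.

```python
from fractions import Fraction

# Frequency tables from Exercise 1.21 (25 tokens per box).
values = [0, 5, 10, 15, 20, 25, 30]
box_a = dict(zip(values, [1, 1, 2, 3, 4, 6, 8]))  # null model, H0: Box A
box_b = dict(zip(values, [8, 6, 4, 3, 2, 1, 1]))  # alternative model, H1: Box B


def left_tail(freq, cutoff):
    """Chance of drawing a token worth <= cutoff from a box with these counts."""
    total = sum(freq.values())
    return Fraction(sum(n for v, n in freq.items() if v <= cutoff), total)


def best_left_cutoff(null, max_alpha):
    """Largest cutoff whose left-tail chance under the null is <= max_alpha."""
    ok = [c for c in sorted(null) if left_tail(null, c) <= max_alpha]
    return ok[-1] if ok else None


# Assumption: rule has the form "reject H0 if the token value is <= cutoff".
cutoff = best_left_cutoff(box_a, Fraction(10, 100))
alpha = left_tail(box_a, cutoff)     # chance of rejecting H0 when Box A is shown
beta = 1 - left_tail(box_b, cutoff)  # chance of not rejecting H0 when Box B is shown

print(f"Reject H0 if value <= ${cutoff}; alpha = {alpha}, beta = {beta}")
```

Exact fractions are used so that the comparison with 10% is not affected by floating-point rounding.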

25 Suppose that there are two bags each containing 25 same-sized candies. (c) If each of the coins in the shown bag did have the same chance of being selected.24 Suppose that there are two identical jars. N. Jar A and Jar B. Is this the case here? Explain. each of the coins in the shown bag must have the same chance of being the selected coin.QXD 04/13/2005 03:29 PM Page 72 72 CHAPTER 1 HOW TO MAKE A DECISION WITH STATISTICS 1. and Q represent pennies. 1. would it be appropriate to say the direction of extreme for this test is one-sided to the right? Explain. . (a) What would be the two hypotheses? (b) In order to conduct this test. respectively. D. The distribution of the colors of the candies in each bag is given below. The hypotheses to be tested are as follows: H0: The bag that is shown is Bag X. H1: The bag that is shown is Bag Y. The plan is to select one coin from the jar and decide which jar has been placed before you. Bag X Value ($) Blue Brown Yellow Green Red Frequency 1 7 9 7 1 Bag Y Value ($) Blue Brown Yellow Green Red Frequency 9 3 1 3 9 (a) Following is a graph of the contents found in Bag X: Bag X X X X X X X X X X X X X X X X X X X X X X X X X X Blue Brown Yellow Green Red Make a similar graph to display the contents found in Bag Y.ALIAMC01_0131497561.26 What is a p-value? Write out an explanation of what this means for a person who has not had statistics. 1. that contain coins in the amounts shown by the following distributions: Jar A X X X X X X X X X X X X X X X X Jar B X X X X X X X X X X X X X X X X X X X X X X X X P N D Q P N D Q Note that P. nickels. and quarters. dimes. Suppose that you are presented with one of these two jars with the claim that the jar is Jar A. (b) Is it appropriate to say the direction of extreme for this test is two-sided? Explain.

$1. H0? Explain your answer. (a) Make a frequency plot for H0 and for H1. what is b? (d) Can you calculate a p-value? If yes. The bag is the “winning bag” containing the following vouchers: $1.29 1. $1.ALIAMC01_0131497561. $100. Boxes of rods that were produced by Machine B are to be held back (not sent to suppliers). $10.QXD 04/13/2005 03:29 PM Page 73 EXERCISES 73 1. which hypothesis was supported by the data? (b) Would the p-value for testing the hypotheses have been large or small? Explain. You are on a game show and you must choose whether to accept or reject a bag presented to you. $100. H1: Caesarean delivery rates for the two groups are not the same. what is a? (c) Your decision rule is to reject H0 if you pick a voucher of $1 or lower. Let the hypotheses be as follows: H0: The shown bag is the winning bag. $10. $10. Number of Flaws H1: The box of steel rods was produced by Machine B. $100. and a Type II error. premature delivery rates. $10.What is the p-value? (d) Are the data statistically significant at the level a from part (b)? 1.30 . and $1000. $100. These plots will 1 2 3 4 5 be used as the models for the number of visual flaws for rods produced Number of Flaws by the two machines. The frequency plots are based on a total of 15 rods X X X X X X X X X produced by Machine A and 15 rods from Machine B. $10. and Caesarean delivery rates. Calculate the chance of a Type I error. The bag is the “losing bag” containing the following vouchers: -$1000. and $100. (a) The direction of extreme for the test is (select one) to the left to the right two-sided can’t tell (b) It is decided to reject H0 if the observed number of visual flaws of the selected steel rod is 2 or more extreme.27 1. Half of the women in the study took prenatal vitamins containing 25 mg of zinc. If no. There are two possibilities for the contents of the shown bag. $1. calculate it. a. Consider the following hypotheses: H0: Caesarean delivery rates for the two groups are the same. $1. 
We wish to test the following X X X X hypotheses: X X X X X 1 2 3 4 5 H0: The box of steel rods was produced by Machine A. (b) Your decision rule is to reject H0 if you pick a voucher of $1 or lower. explain in one sentence why not. There was no significant difference in the Caesarean delivery rates between the two groups. You will select a rod from this box and the X X X number of visual flaws will be measured. Based on this decision rule. b. The other group received the same prenatal vitamins but without zinc. Machines A and B are the only two machines that produce steel rods at Machine A: X a steel manufacturing plant. Based on this decision rule. (c) The observed number of visual flaws of the selected steel rod was 4. The accompanying frequency plots give the X X distributions of the number of visual flaws in rods produced by X X X Machines A and B. $1. The results were reported in the Journal of the American Medical Association. H1: The shown bag is the losing bag. You have a box of steel Machine B: rods that no longer have the label that tells you which of the two X X X machines produced the rods. The study compared average birth weight. (a) Based on the stated results.28 Should the p-value be small or large to reject the null hypothesis. A study was conducted to assess whether taking supplements of zinc might help certain women deliver larger babies—particularly those women who are thin when they get pregnant.

1.31 Jaeyun has two dice. Die A is a fair die. That is, each of the six outcomes is equally likely. Die B is biased; that is, it is heavier on some sides so the opposite sides show up more often than others. The two dice look identical and are mixed so she cannot know which die is which. She decides to select one of the two dice and roll that die one time. Based on the outcome, she must determine which die it is. That is, she must test the following hypotheses:
H0: The selected die is Die A (fair).
H1: The selected die is Die B (biased).
The following table provides the chances of each of the six possible outcomes for each die:

Outcome                    1     2     3     4     5     6
Chance if Die A (fair)     1/6   1/6   1/6   1/6   1/6   1/6
Chance if Die B (biased)   3/10  2/10  2/10  1/10  1/10  1/10

(a) Since observing smaller outcomes (fewer dots) is more likely under the alternative hypothesis, the direction of extreme is (select one)
to the left    to the right    in both directions    can’t tell
(b) Jaeyun decides to use the following decision rule: Reject H0 if she rolls a 1 (or less). Calculate α, the significance level, and β, the chance of a Type II error.
(c) Jaeyun performs the experiment and she rolls a 2. Calculate the p-value and give her decision.

1.32 Suppose that there are two identical boxes, Box I and Box II, each containing 35 vouchers. The contents of each box are displayed at the right. You will be shown only one of the boxes and will be allowed to select just one voucher from the shown box at random to test the following hypotheses:
H0: The shown box is Box I.
H1: The shown box is Box II.
[Figure: frequency plots of Box I and Box II, 35 vouchers each; horizontal axis: voucher value, $2 to $12]
(a) The direction of extreme is (select one)
to the right    to the left    two-sided    can’t tell
(b) Consider the decision rule “Reject H0 if the observed voucher value is $4 or more extreme.”
(i) Circle, clearly label, and compute α, the chance of a Type I error.
(ii) Circle, clearly label, and compute β, the chance of a Type II error.
(c) The observed voucher value is $6. In the picture circle, clearly label, and compute the p-value.
(d) Is the observed result statistically significant? Explain.
(e) Give a new decision rule that will result in a larger significance level α as compared to the decision rule in part (b). Complete the new decision rule: Reject H0 if the observed voucher value is ...
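For the two-dice exercise (1.31), the quantities α, β, and the p-value are all left-tail sums over the outcome table. The following is a minimal sketch of that arithmetic, assuming the direction of extreme is to the left as stated in part (a); the function name `left_tail` is illustrative only.

```python
from fractions import Fraction

# Outcome chances from the table in Exercise 1.31.
die_a = {k: Fraction(1, 6) for k in range(1, 7)}  # H0: fair die
die_b = dict(zip(range(1, 7),
                 [Fraction(n, 10) for n in (3, 2, 2, 1, 1, 1)]))  # H1: biased die


def left_tail(model, cutoff):
    """Chance of rolling an outcome <= cutoff under the given model."""
    return sum(p for outcome, p in model.items() if outcome <= cutoff)


rule_cutoff = 1  # decision rule from part (b): reject H0 if the roll is 1 (or less)
alpha = left_tail(die_a, rule_cutoff)     # reject H0 although the die is fair
beta = 1 - left_tail(die_b, rule_cutoff)  # fail to reject although the die is biased

observed = 2  # roll from part (c)
p_value = left_tail(die_a, observed)  # chance of a roll this small or smaller under H0
reject_h0 = observed <= rule_cutoff

print(f"alpha = {alpha}, beta = {beta}, p-value = {p_value}, reject H0: {reject_h0}")
```

The same `left_tail` helper could be pointed at any other pair of one-draw models, provided the direction of extreme really is to the left.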

(i) What is the numerical value for the p-value? (ii) If the level of significance a is set at 10%.05. (i) What is the numerical value for the p-value? (ii) If the level of significance a is set at 10%.34 Answer each of the following questions regarding the p-value of a test: (a) Give two possible p-values for data that are statistically significant at 0. (b) Give two possible p-values for data that are statistically significant at 0.35 . The average lifetime for the population of 100-watt light bulbs produced by his competitor is about 40 hours. Each bag contains 40 equal-sized tokens inscribed with a dollar value. The frequency table for each bag is as follows: Bag A X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X 60 Bag B X X X X X X X X 62 64 66 68 70 72 74 76 $ X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X 60 62 64 66 68 70 72 74 76 $ Furthermore. is the $68 voucher statistically significant? 1. (a) Write the two hypotheses. Claude manages a plant that manufactures light bulbs. is the $68 voucher statistically significant? (iii) If the level of significance a is set at 5%.ALIAMC01_0131497561.01.33 Suppose that there are exactly two bags.10.01. Claude is interested in finding out the average lifetime of a new model of a 100-watt light bulb that he is producing. Claude wishes to test whether the average lifetime for his new-model bulbs is greater than that of his competition on average. is the $74 voucher statistically significant? (iii) If the level of significance a is set at 5%. (a) What is the population under study? (b) State the hypotheses to be tested. You must judge (decide) the probable truth of the claim statistically by randomly selecting one token from the given bag and using the p-value approach. Bag A and Bag B.QXD 04/13/2005 03:29 PM Page 75 EXERCISES 75 1. suppose that one of these two bags is presented to you with the claim that the bag is Bag A. (b) What is the direction of extreme? 
(c) Suppose that you select a voucher inscribed with $74. is the $74 voucher statistically significant? (d) Suppose that you select a voucher inscribed with $68. but not statistically significant at 0. 1. (c) Give two possible p-values for data that are not statistically significant at 0.

(c) The manufacturer decides to test his theory at a level α = 0.10. What is the chance of making a Type I error?
(d) He conducts a statistical test and the data were statistically significant at the α = 0.10 level. Give a possible p-value for this test.
(e) He conducts a statistical test and the data were statistically significant at the α = 0.10 level. Would the data be statistically significant at the α = 0.15 level? at the 0.05 level?

1.36 Suppose that there are exactly two bags, Bag A and Bag B. Each bag contains 50 equal-sized tokens inscribed with a dollar value. The frequency table for each bag is as follows:

Bag A
Value ($)   2   4   6   8   10  12  14  16  18
Frequency   14  13  11  4   3   2   1   1   1

Bag B
Value ($)   2   4   6   8   10  12  14  16  18
Frequency   1   1   1   2   3   4   11  13  14

Furthermore, suppose that one of these two bags is presented to you with the claim that the bag is Bag A. You must judge (decide) the probable truth of the claim statistically by randomly selecting one token from the given bag and using the p-value approach. Assume that you have selected a voucher inscribed with $14.
(a) Write the two hypotheses.
(b) What is the direction of extreme?
(c) What is the numerical value for the p-value?
(d) If the level of significance α is set at 10%, is the $14 voucher statistically significant?
(e) If the level of significance α is set at 5%, is the $14 voucher statistically significant?

1.37 Examine the following information from the article “Children Today Watching More and More TV, Study Shows” (Source: The Associated Press, December 30, 1997):
A new study of children’s viewing habits, conducted by the BJK&E Media Group of Manhattan, revealed that children aged 2–11 watch an average of 312.5 hours of kids’ programs each year. That’s up from 295.5 hours the year before. Boys watch an average of 53 more hours per year than girls. The study also found that 42% of children aged 6–11 have televisions in their bedroom. Nearly 25% of toddlers aged 2–5 also have their own TVs.
Consider the following hypotheses to be tested:
H0: The average number of hours of kids’ programs watched this year by children between 6 and 11 years old is the same as that for last year.
H1: The average number of hours of kids’ programs watched this year by children between 6 and 11 years old has increased as compared to that for last year.
The preceding test was not statistically significant at a significance level of 0.10.
(a) Which hypothesis was supported?
(b) What can you say about the p-value for this test? (Be as specific as possible.)
(c) The data were observed and a decision was made. The decision could be right or wrong. Which type of error could have been made? (select one)
Type I    Type II
(d) Is the rejection region one-sided to the right, one-sided to the left, or two-sided? Explain.
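The p-value approach in the Bag A / Bag B exercise (1.36) amounts to a tail sum under the null model. The sketch below works it through for the $14 voucher, assuming the direction of extreme is to the right (Bag B concentrates its tokens on large values); `right_tail_p_value` is a hypothetical helper name.

```python
from fractions import Fraction

# Frequency table for Bag A from Exercise 1.36 (50 tokens); Bag A is the null model.
values = [2, 4, 6, 8, 10, 12, 14, 16, 18]
bag_a = dict(zip(values, [14, 13, 11, 4, 3, 2, 1, 1, 1]))


def right_tail_p_value(null, observed):
    """Chance, under the null model, of a value as large as or larger than observed."""
    total = sum(null.values())
    return Fraction(sum(n for v, n in null.items() if v >= observed), total)


p_value = right_tail_p_value(bag_a, 14)

# Data are statistically significant at level alpha when p-value <= alpha.
significant_at_10 = p_value <= Fraction(10, 100)
significant_at_5 = p_value <= Fraction(5, 100)

print(f"p-value = {p_value} = {float(p_value):.2f}")
print(f"significant at 10%: {significant_at_10}, at 5%: {significant_at_5}")
```

Only the Bag A table is needed here, because a p-value is always computed under the assumption that H0 is true.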

Which hypothesis was supported? (c) What can you say about the p-value for this test? Be as specific as possible. but not at the 1% level. A college newspaper would like to include an article on the cost of attending college. A researcher decides to test the following hypotheses: H0: The true average cost is $350. Part of the article discusses the cost of off-campus housing. Based on the data collected. Which type of error could have been made? The following table provides information regarding the hypotheses and results for three different studies: Null Hypothesis Study A Study B The true average lifetime is 73 years.41 Study C . and the resulting data were statistically significant: H0: The average weight of a Bayer aspirin is 325 mg. (a) An error could have been made.38 A statistical test was performed to test the following hypotheses. The average time to relief for all Treatment A users is equal to the average time to relief for all Treatment B users. The average time to relief for all Treatment A users is not equal to the average time to relief for all Treatment B users. p-value 1.40. using well-written sentences. (a) Which hypothesis was supported? (b) What is the direction of extreme for this statistical test? (c) Suppose the significance level was 0.40 1. (f) According to part (b). the consequence of the error you chose in part (a).QXD 04/13/2005 03:29 PM Page 77 EXERCISES 77 1. (a) Clearly write the hypotheses to be tested. A reference book about the cost of attending a college reports an average cost of $350 for monthly rental of a one-bedroom apartment within 1 mile of campus. Give two possible values for the p-value. (c) What can you say about the p-value for this test? (d) Are the results statistically significant? Does exercise affect the birth weight of goats? A study was designed to address this question in which the pregnant goats were trained to walk on a treadmill.10. An error could have been committed. 
H1: The average weight of a Bayer aspirin is not 325 mg.05. If yes.39 1. a decision was made at the 1% level. (b) The data were not statistically significant at the 1% level. H1: The true average cost is higher than $350.ALIAMC01_0131497561. The true proportion of students who work a part-time job is less than 0. Would you need to change the possible p-value you gave in part (c)? If no.40. Alternative Hypothesis The true average lifetime is more than 73 years. Suppose that the mean birth weight for goats born to mother goats that did not exercise was 1600 grams. the null hypothesis was rejected at a = 0. give another value that will satisfy this statement on significance. (e) Suppose that the data were statistically significant at the 5% level. The true proportion of students who work a part-time job is 0. Could it have been a Type I error or a Type II error? (b) Explain. explain why not. (d) Provide a possible p-value for this test.

would this be called a Type I error or a Type II error? (d) For each study. The preceding test was statistically significant at a 5% significance level.43 Study B Study C The average time to relief for all new users is equal to the average time to relief for all standard treatment users. The true average income of adults who work two jobs is equal to $70.01.05.42 Having a volatile nature may be hazardous to your health. say University of Michigan researchers.05 and at a = 0. Alternative Hypothesis p-value 1. one-sided to the left. the data were statistically significant at a = 0.60. who found that hot-tempered men are more likely to suffer strokes than their cool-tempered counterparts. the data were not statistically significant at a = 0. For Study C. the data were statistically significant at a = 0.QXD 04/13/2005 03:29 PM Page 78 78 CHAPTER 1 HOW TO MAKE A DECISION WITH STATISTICS (a) Provide possible p-values for all three studies such that the following statements are true: For Study A. (iii) Study C results were not statistically significant at 10%.01. (b) For which study do the results show the most support for the null hypothesis? Explain. (iii) Study C is a one-sided test to the right.05. (ii) Study B is a one-sided test to the left.) (d) An error could have been made of which type? (select one) Type I Type II The following table provides some information about hypotheses for three different studies: Null Hypothesis Study A The true proportion of females is equal to 0. determine if the rejection region would have been one-sided to the right. (c) Suppose that Study A concluded that the data supported the alternative hypothesis that the true average lifetime is more than 73 years. . The researchers found that men with high anger levels were twice as likely to have a stroke than those less prone to explode. but not at 1%. (b) Give a possible p-value for each study such that (i) Study A results were statistically significant at 10%.000.ALIAMC01_0131497561. 
(ii) Study B results were statistically significant at 5% and at 1%. H1: Having a high anger level will increase the risk of having a stroke. This study explored the relationship between level of anger (measured using Spielberger’s Anger Expression Scale) and stroke risk (measured as the number of strokes over the past 10 years) in 2110 middle-aged men whose average age was 53. 1. but not at a = 0. In our statistical language. (a) The rejection region for the preceding test would have been: (select one) one-sided to the right one-sided to the left two-sided (b) Which hypothesis was supported? (c) What can you say about the p-value for this test? (Be as specific as possible. One set of hypotheses tested was as follows: H0: Having a high anger level will not change the risk of having a stroke. or two-sided. For Study B. but in fact it is not. (a) Give the appropriate alternative hypothesis for each study such that (i) Study A is a two-sided test.

but not at the 5% level. 0.11.) (b) What would a Type II error be in this context? (Give your answer in a nonstatistical manner. (d) The results are statistically significant at 0. 1 The p-value is 52. (c) The results are statistically significant at 0. or 0. which study results had the most support for the null hypothesis? Explain. (b) The significance level a is the chance that H0 is true. 0.47 1. Determine whether each of the following statements is true or false (a true statement is always true): (a) If the p-value is less than 0.04. You decide to conduct an experiment to test this ability. Consider the following hypotheses: H0: The subject does not have ESP. A hypothesis test was performed.60.05.45 1.05. You deal one card face down from a regular deck of 52 cards. never.01. all the time. (c) Sometimes yes. Suppose that you are asked to evaluate the abilities of an individual who claims to have perfect ESP (extrasensory perception). 1. the individual correctly identifies the hidden card. and the data were found to be statistically significant at the 5% level. What is the level of significance of this particular decision rule? (d) What is the chance of a Type II error for the decision rule given in part (c)? (e) When the experiment is carried out. (d) Suppose that Study A concluded that the data supported the alternative hypothesis when in fact the true proportion of females is equal to 0. What is the p-value? (f) When the experiment is carried out.48 .04. In our statistical language. Determine whether each of the following statements is true or false (a true answer is always true): (a) The chance that H0 is true is 0. (b) The decision was to reject H0.QXD 04/13/2005 03:29 PM Page 79 EXERCISES 79 (c) According to the p-values you entered in the last column of the table.05 level.46 1. H1: The subject does have ESP. (b) If the results from the study were significant at the 10% level. which of the following could have been the p-value? 0. (b) No. (a) Yes. 
The subject is then asked to say what the card is. (a) State the alternative hypothesis that was used in the study. Suppose that the p-value for a statistical test is found to be 0.) (c) Suppose that you decide to conclude that the individual has ESP if and only if he or she correctly identifies the card. would this be called a Type I error or a Type II error? 1.44 Proponents of a rehabilitation program at a local prison claim that the program has been successful because results of a study showed that the proportion of convicted persons who have been through the program and were later reconvicted is lower than the national proportion of convicted persons who are released and later reconvicted. the individual fails to identify correctly the hidden card.10.06.05.04 and the significance level was 0. (a) What would a Type I error be in this context? (Give your answer in a nonstatistical manner. Is this the chance that the null hypothesis is correct? Explain. then the results are statistically significant at the 0. sometimes no.ALIAMC01_0131497561. Will the same data be statistically significant at the 1% level? Select one and explain your answer.

(a) Suppose that you observed a yellow ball. Which of the following statements about the sample size is correct? (a) Your sample size should be large in order to find significance. Complete the following table by determining the decision rule and level of b for n = 1 and n = 2. what happens to the value of b? 1.05 Decision rule—reject H0 if your result is Decision rule—reject H0 if your result is b = b = 1.51 1. (b) Your sample size should be small in order to find significance. The frequency plots for the possible outcomes are provided on *These exercises follow from the optional Section 1. You are permitted to mix the balls in the package well and then. The decision rule is “Reject H0 if you observe at least one unhealthy student in your sample. Suppose that the data showed that the percentage of Republicans who are in favor of the death penalty is 42%. reach in and select one ball and record its color.5. (d) Do you think this result is practically significant? Explain.10. (a) State the corresponding null and alternative hypotheses. .”) A social research scientist wants to test that the percentage of Republicans who favor the death penalty is greater than the percentage of Democrats who are in favor of the death penalty.49 In Example 1. The hypotheses to be tested are as follows: H0: The class consists of all healthy students (100% healthy).” Suppose that your class actually consists of just one unhealthy student. The population is the 50 students in a class. 5 You are allowed to collect some data. both correspond to “one yellow and one blue. What is the corresponding p-value for this result? Is your result statistically significant at a 10% significance level? (Hint: List all possible outcomes and determine the number of ways each outcome could occur. increasing the sample size from n = 1 to n = 2 reduced the level of b (see page 60).QXD 04/13/2005 03:29 PM Page 80 80 CHAPTER 1 HOW TO MAKE A DECISION WITH STATISTICS 1.5. 
H1: The class does not consist of all healthy students ( 6100% healthy). Let’s verify that this holds if we fix the significance level to be 0. We saw that for a fixed significance level of 0. without looking inside the package.2 (page 8). What is the corresponding p-value for this result? Is your result statistically significant at a 10% significance level? (b) You are allowed to take two observations—that is. 5 H1: The proportion of yellow balls in the package is more than 1.50 1.05.53* The questions that follow are based on the Bag A–Bag B scenario when n = 2 vouchers are selected from the shown bag. (b) Will you need a one-sided or a two-sided rejection region? Explain. and the percentage of Democrats who are in favor of the death penalty is 40%. 0.52* Significance level A Sample size n = 1 Sample size n = 2 As the sample size is increased. Consider our Bag A–Bag B scenario in Section 1.ALIAMC01_0131497561. Note that order will not be important—that is. after replacing the first selected ball and mixing the contents. you reach in and select a second ball and record its color. (c) Do you think this result is statistically significant? Explain. getting a blue ball and then a yellow will be treated the same as getting a yellow ball and then a blue. the following hypotheses were stated regarding the contents of a package of balls: H0: The proportion of yellow balls in the package is 1. Suppose that the first ball is yellow and the second ball is blue.

In our Bag A–Bag B scenario with sample size of 2. we noted that the total number of ways of selecting 2 vouchers from a bag with 20 vouchers was 190 (see Table 1.25. (c) If the observed average voucher value is $3.54* Recall Example 1. (b) Consider the following decision rule: Reject H0 if the average voucher value is …$2. r 2 (a) Follow the aforementioned steps to verify that there are 190 ways to select 2 vouchers from a total of 20 vouchers in a bag.ALIAMC01_0131497561.5 (page 23) on the one-sided rejection region to the left. The steps for finding the number of combinations of 20 items taken 2 at a time are as follows: 1.5. Find the values of a and b corresponding to this decision rule. *These exercises follow from the optional Section 1. this is called finding the number 20 of combinations of 20 items taken 2 at a time.55* 2 0 MATH 3 2 ENTER the total number of items. Let’s extend this scenario by increasing the sample size from n = 1 to n = 2. Consider again the following hypotheses: H0: The shown bag is Bag A. find A 3 B . More formally. (a) Construct frequency plots for Bag C–Bag D for the case in which the sample size is n = 2 and the average of the two selected vouchers (selected without replacement) is used to make the decision.QXD 04/13/2005 03:29 PM Page 81 EXERCISES 81 page 56. is the null hypothesis rejected? Explain. what is the corresponding p-value? Many calculators can compute the number of ways of selecting 2 items (without replacement) from a total of 20 distinguishable items. (v) What is the cutoff value for this decision rule? Explain how you know this is the correct cutoff value. H1: The shown bag is Bag B.” (i) What is the significance level for this decision rule? (ii) What is the cutoff value for this decision rule? (iii) What is the value of b for this decision rule? (iv) State very clearly what the value of b represents in this situation. 
(b) Suppose that a new decision rule is given such that the significance level is 0.” The TI graphing calculators have an operation called nCr for finding the number of combinations. (i) What is the p-value if the two vouchers selected are $50 and $20? (ii) Based on your answer in (i). (b) Suppose that we were going to continue the Bag A–Bag B scenario to see what happens when we increase our sample size to 3. This is expressed as A 2 B and is read “20 choose 2. . This operation is found under the MATH PRB menu. [Hint: Think carefully about your answers to (i) through (iv). is the null hypothesis rejected? Explain. page 55).] 1. (a) Suppose the decision rule is “Reject H0 if the average of the two vouchers is at least $45. How many ways are there to select 3 vouchers 20 (without replacement) from the total of 20 vouchers in a bag? That is. (iii) What is the p-value if the two vouchers selected are $30 and $30? (iv) Based on your answer in (iii). The frequency table (so you won’t have to count all of the little X’s) is given on page 55. n 20 for the operation nCr the number of items you wish to select.2.
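The combination counts in Exercise 1.55 can also be checked without a graphing calculator. This short sketch uses Python's standard `math.comb` (available in Python 3.8 and later) as a stand-in for the calculator's nCr operation, and cross-checks it against the factorial formula n! / (r! (n − r)!).

```python
from math import comb, factorial


def n_choose_r(n, r):
    """Number of ways to select r items, without replacement, from n distinguishable items."""
    return factorial(n) // (factorial(r) * factorial(n - r))


pairs = comb(20, 2)    # ways to select 2 vouchers from a bag of 20
triples = comb(20, 3)  # ways to select 3 vouchers from a bag of 20

# The library function and the explicit formula should agree.
assert pairs == n_choose_r(20, 2)
assert triples == n_choose_r(20, 3)

print(pairs, triples)
```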

(i) Shade in the picture and compute the value of a.15 on page 49. (iv) Compute the power of the test.15 3 0. . in minutes) are H0: The population of values is represented by a rectangle with base from 3 to 7 and height 1>4. (d) Which of the following would lead to a reduction in the chance of committing a Type II error? Select one.5 years. (c) You select a chip at random from the population and the resulting lifetime is 0. (i) decreasing the sample size (ii) increasing the sample size (e) Is the following statement true or false? “The chance that the null hypothesis is true is equal to 1>7. (d) Are the data in part (c) statistically significant at the level found in part (b)? Explain. (i) In your sketch for part (a). You will be allowed to select one value from the population at random and must decide which of the two models to support. and clearly label the region with a.ALIAMC01_0131497561. (iii) Shade in the picture and compute the power of the test. shade in the region that corresponds to the chance of a Type II error.25 0 1 2 3 4 5 6 7 0 1 2 0.05 0. a. shade in the region that corresponds to the significance level. 1.08 4 5 6 7 (a) What is the height of the rectangle under the null hypothesis such that the total area of the rectangle will be one? (b) Consider the following decision rule: Reject the null hypothesis if the lifetime of the one selected chip is 1 year or more extreme. (a) Draw the pictures to represent H0 and H1. (i) In your sketch for part (a). and clearly label the region with b . (ii) Compute the p-value.4 0. (iii) In your sketch for part (a). (c) Suppose that the observed service time was 4.QXD 04/13/2005 03:29 PM Page 82 82 CHAPTER 1 HOW TO MAKE A DECISION WITH STATISTICS 1. Hint: See Example 1.56 Consider two competing models to describe a population of values for the lifetime of a computer chip. (ii) Compute the significance level.04 0. H0 density H1 density (Note: numbers represent areas or chances) 0. 
(ii) Shade in the picture and compute the value of b. shade in the region that corresponds to the p-value and clearly label the region as such. Shade in the picture and calculate the p-value for this test. The competing hypotheses for the distribution of time (length of time between order taken and order received. (b) The service time for a randomly selected customer will be obtained and the manager will test these hypotheses using the following decision rule: Reject H0 if the observed service time is 3.4 minutes or more extreme.57 A fast-food store manager wishes to test hypotheses regarding the distribution of service time for customers using the drive-through window.03 0.1 minutes. H1: The population of values is represented by a rectangle with base from 0 to 4 and height 1>4.
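For the drive-through exercise (1.57), the shaded regions are rectangles, so α, β, the power, and a p-value reduce to base times height. The sketch below is a hedged illustration under several assumptions: the direction of extreme is to the left (the H1 rectangle sits on smaller times), the decision rule is "reject H0 if the time is 3.4 minutes or less", and the observed time of 4.1 minutes is used only as an illustrative value because the figure is hard to read in this copy. The helper `left_tail_area` is not from the text.

```python
from fractions import Fraction


def left_tail_area(low, high, x):
    """P(X <= x) for a rectangular (uniform) density with base from low to high."""
    if x <= low:
        return Fraction(0)
    if x >= high:
        return Fraction(1)
    return Fraction(x - low) / (high - low)


H0 = (3, 7)  # rectangle with base 3 to 7, height 1/4
H1 = (0, 4)  # rectangle with base 0 to 4, height 1/4

cutoff = Fraction(34, 10)              # assumed rule: reject H0 if time <= 3.4 minutes
alpha = left_tail_area(*H0, cutoff)    # area of the rejection region under H0
beta = 1 - left_tail_area(*H1, cutoff) # area outside the rejection region under H1
power = 1 - beta

observed = Fraction(41, 10)            # illustrative observed time, an assumption
p_value = left_tail_area(*H0, observed)

print(f"alpha = {alpha}, beta = {beta}, power = {power}, p-value = {p_value}")
```

With a different observed time, only the `observed` line needs to change; the rule-based quantities α, β, and power do not depend on the data.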
