
469 / 470 BC - 399 BC

Socratic Ignorance "I know that I know nothing"

STATISTICS HISTORY

Some scholars pinpoint the origin of statistics to 1662, with the publication of Natural and Political Observations upon the Bills of Mortality by John Graunt. Its mathematical foundations were laid in the 17th century with the development of probability theory by Blaise Pascal and Pierre de Fermat. Probability theory arose from the study of games of chance. The method of least squares was first described by Carl Friedrich Gauss around 1794.

STATISTICS APPLICATIONS

Early applications of statistical thinking revolved around the needs of states to base policy on demographic and economic data, hence its stat- etymology. The scope of the discipline of statistics broadened in the early 19th century to include the collection and analysis of data in general. The term statistics is ultimately derived from the New Latin statisticum collegium ("council of state") and the Italian word statista ("statesman" or "politician"). Today, statistics is widely employed in government, business, and the natural and social sciences.

INTRO TO STATISTICS

WHAT IS STATISTICS ? VID

Resources

www.collegeboard.com www.whfreeman.com/tps3e

Register to take Online Quizzes Send me the result gil_op@hotmail.com

www.ti.com

"Imagination is more important than knowledge."

Einstein’s Riddle

ALBERT EINSTEIN IS SAID TO HAVE WRITTEN THIS RIDDLE EARLY IN THE 20th CENTURY. HE REPORTEDLY SAID THAT 98% OF THE WORLD POPULATION WOULD NOT BE ABLE TO SOLVE IT. ARE YOU IN THE TOP 2% OF INTELLIGENT PEOPLE IN THE WORLD? SOLVE THE RIDDLE AND FIND OUT.

There are no tricks, just pure logic, so good luck and don't give up.
1. In a street there are five houses, painted five different colours.
2. In each house lives a person of a different nationality.
3. These five homeowners each drink a different kind of beverage, smoke a different brand of cigar and keep a different pet.
THE QUESTION: WHO OWNS THE FISH?

HINTS

1. The Brit lives in a red house.
2. The Swede keeps dogs as pets.
3. The Dane drinks tea.
4. The Green house is next to, and on the left of the White house.
5. The owner of the Green house drinks coffee.
6. The person who smokes Pall Mall rears birds.
7. The owner of the Yellow house smokes Dunhill.
8. The man living in the centre house drinks milk.
9. The Norwegian lives in the first house.
10. The man who smokes Blends lives next to the one who keeps cats.
11. The man who keeps horses lives next to the man who smokes Dunhill.
12. The man who smokes Blue Master drinks beer.
13. The German smokes Prince.
14. The Norwegian lives next to the blue house.
15. The man who smokes Blends has a neighbour who drinks water.

Einstein's Riddle - ANSWER

The German owns the fish.
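The answer can be checked by brute force. The sketch below is one possible way to do it, not the method the slides intend: it tries every assignment of colours, nationalities, drinks, cigars and pets to house positions and keeps those that satisfy all 15 hints. It assumes that "the first house" is the leftmost one and that hint 4 means the Green house is immediately to the left of the White house. The names `next_to` and `solve_riddle` are made up for this illustration.

```python
from itertools import permutations


def next_to(a, b):
    """True when house positions a and b are adjacent."""
    return abs(a - b) == 1


def solve_riddle():
    """Return the nationality of the fish owner for every valid arrangement."""
    positions = range(5)  # houses numbered 0..4 from left to right (assumption)
    owners = []
    for red, green, white, yellow, blue in permutations(positions):
        if white - green != 1:  # hint 4 (read as "immediately left")
            continue
        for brit, swede, dane, norwegian, german in permutations(positions):
            if brit != red or norwegian != 0:  # hints 1 and 9
                continue
            if not next_to(norwegian, blue):  # hint 14
                continue
            for tea, coffee, milk, beer, water in permutations(positions):
                if dane != tea or green != coffee or milk != 2:  # hints 3, 5, 8
                    continue
                for pall_mall, dunhill, blends, blue_master, prince in permutations(positions):
                    if yellow != dunhill or blue_master != beer or german != prince:  # hints 7, 12, 13
                        continue
                    if not next_to(blends, water):  # hint 15
                        continue
                    for dogs, birds, cats, horses, fish in permutations(positions):
                        if swede != dogs or pall_mall != birds:  # hints 2 and 6
                            continue
                        if not next_to(blends, cats) or not next_to(horses, dunhill):  # hints 10, 11
                            continue
                        nationality = {
                            brit: "Brit", swede: "Swede", dane: "Dane",
                            norwegian: "Norwegian", german: "German",
                        }
                        owners.append(nationality[fish])
    return owners


print(solve_riddle())
```

Checking the cheap hints in the outer loops keeps the search small, because most partial assignments are rejected before the inner loops run.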

"Do not worry about your difficulties in Mathematics. I can assure you mine are still greater."

Fundamental Definitions

Statistics Population Sample

DEFINITIONS As with any new field, you have to learn the vocabulary in order to understand what is going on. In the beginning it seems like a lot, because the terms may be unfamiliar or may be words that you know but that are used in a different way. There is no way around it but to memorize the meanings. It's like learning a foreign language: you just have to learn what the new words mean.

Definition: Statistics refers to a set of methods and rules for organizing, summarizing and interpreting information.

POPULATIONS AND SAMPLES
In research, we are trying to find out general information about a class of people (e.g., why do people commit crimes?). As a researcher, I want to know something about people in general. This group of people in general is called a population.
Definition: A population is the set of all individuals of interest in a particular study.

E.g.:
All people in the United States
All students in school
Students in 3rd Semester

Population You want to know how many people in Mexico prefer Coke over Pepsi… Define the Population:

Give another example of population:

All individuals in the U.S. make up not only a large population but also a very diverse one, so it is hard to find factors that relate to them all. I could not include every individual from the group in my study. So I want to take a sample from the population and hope that it represents the whole group.

Definition: A sample is a set of individuals selected from a population, usually intended to represent the population in a research study.
How would you sample all people in the US?
Sample for Cola drinkers in Mexico?
Sample your own example:

When we describe information, we use different terms to represent populations and samples.
Definition: A parameter is a value which describes a population.
Define a parameter in the Cola drinkers in Mexico:
Define a parameter in your example:
Definition: A statistic is a value which describes a sample.
Define a statistic in the Cola drinkers in Mexico:
Define a statistic in your own example:

Once we have data, there are two things we can do with it. We can describe it and we can use it to make generalizations. These are the two different roles of statistics.
Definition: Descriptive statistics are statistical procedures that summarize, organize and simplify data.
Write an example:
Definition: Inferential statistics are techniques that allow us to study samples and then make generalizations about the populations from which they were selected.
Write an example:

There is usually some difference between the way the sample looks and the way the population looks. This difference is known as sampling error.
Definition: Sampling error is the discrepancy that exists between the sample statistic and the population parameter (e.g., the “margin of error” in voters’ polls).
Write an example:
We want to reduce sampling error whenever possible.

One way to try to ensure that our sample is representative is to use random selection.
Definition: Random selection or random sampling is a process for obtaining a sample from a population that requires that every individual in the population have the same chance of being selected for the sample. A sample obtained by this method is called a random sample.
Write an example:
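The definitions of parameter, statistic, sampling error and random sample can be seen together in a small simulation. This is only a sketch under made-up assumptions: the population of 10,000 cola drinkers, the 60% preference for Coke, and the sample size of 200 are all hypothetical and do not come from real data.

```python
import random
from statistics import mean

rng = random.Random(2024)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 people, 1 = prefers Coke, 0 = prefers Pepsi.
population = [1 if rng.random() < 0.6 else 0 for _ in range(10_000)]
parameter = mean(population)  # a parameter describes the whole population

# Simple random sample: every individual has the same chance of being selected.
sample = rng.sample(population, 200)
statistic = mean(sample)  # a statistic describes only the sample

# Sampling error: the discrepancy between the statistic and the parameter.
sampling_error = statistic - parameter

print(f"parameter = {parameter:.3f}")
print(f"statistic = {statistic:.3f}")
print(f"sampling error = {sampling_error:+.3f}")
```

Changing the seed draws a different sample, so the statistic and the sampling error change while the parameter stays the same.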

**The Scientific Method and the Design of Experiments**

The scientific method is a process for studying behavior that relies on objectivity. It requires that we try to keep personal biases from influencing the outcome of our studies.
Theory → Hypothesis → Data collection

Definition: Theory – an integrated and overarching set of principles that explains and predicts phenomena
Definition: Hypothesis – a specific testable prediction (usually) derived from a theory

**Mean, Median, Mode, and Range**

The "mean" is the "average" you're used to, where you add up all the numbers and then divide by the number of numbers. The "median" is the "middle" value in the list of numbers. To find the median, your numbers have to be listed in numerical order, so you may have to rewrite your list first. The "mode" is the value that occurs most often. If no number is repeated, then there is no mode for the list. The "range" is just the difference between the largest and smallest values.

**Example for Mean, Median, Mode, and Range**

Find the mean, median, mode, and range for the following list of values:
13, 18, 13, 14, 13, 16, 14, 21, 13

The mean is the usual average, so:
(13 + 18 + 13 + 14 + 13 + 16 + 14 + 21 + 13) ÷ 9 = 15

The median is the middle value, so we have to rewrite the list in order:
13, 13, 13, 13, 14, 14, 16, 18, 21
There are nine numbers in the list, so the middle one will be the (9 + 1) ÷ 2 = 10 ÷ 2 = 5th number. So the median is 14.

The mode is the number that is repeated more often than any other, so 13 is the mode.

The largest value in the list is 21, and the smallest is 13, so the range is 21 – 13 = 8.

mean: 15
median: 14
mode: 13
range: 8
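The worked example can be checked with Python's standard `statistics` module. This is a minimal sketch for verification; the dictionary name `summary` is just an illustrative choice.

```python
from statistics import mean, median, multimode

values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

summary = {
    "mean": mean(values),                # sum of the values divided by their count
    "median": median(values),            # middle value of the sorted list
    "mode": multimode(values),           # list of the most frequent value(s)
    "range": max(values) - min(values),  # largest minus smallest
}
print(summary)
```

`multimode` returns every value tied for most frequent, so for a list with no repeated number it returns all of the values, which corresponds to the "no mode" case described above.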

M&Ms Activity

DATA PRODUCTION

Producing data

Survey

Surveys are popular ways to gauge public opinion.
The idea of a survey:

1. Select a sample of people to represent a larger population.
2. Ask the individuals in the sample some questions and record their responses.
3. Use sample results to draw some conclusions about the population.

Observational study

In an observational study, we observe individuals and measure variables of interest but do not attempt to influence the responses.

Experiment

In an experiment, we deliberately do something to individuals in order to observe their responses.

**Example Observational study VS Experiment**

Census Phone survey Vaccines Interview Comparing two drugs Exam PAAR students

**Exercise Observational study Experiment**

Design an Observational Study.

Question? Population? Sample? Data production? Analysis? Conclusion?

Design an Experiment.

Question? Population? Sample? Data production? Analysis? Conclusion?

Homework Exercises

P1 P2 P3 P4 P5

(from pg 11)

DATA ANALYSIS

**Individuals and Variables**

Individuals are the objects described by a set of data (people, animals, things). Variables are any characteristics of an individual. A variable can take different values for different individuals.

Categorical variable. Places an individual into one of several groups or categories.
Quantitative variable. Takes numerical values for which arithmetic operations (adding, averaging, …) make sense.

**Id: Individuals, Variables (Categorical/Quantitative)**

Example. Education in the United States

| State | Region | Population (1000's) | SAT Verbal | SAT Math | Percent taking | Percent No HS | Teachers' pay ($1000) |
|---|---|---|---|---|---|---|---|
| CA | PAC | 35,894 | 499 | 519 | 54 | 18.9 | 54.3 |
| CO | MTN | 4,601 | 551 | 553 | 27 | 11.3 | 40.7 |
| CT | NE | 3,504 | 512 | 514 | 84 | 12.5 | 53.6 |

Distribution

The Distribution of a variable tells us what values the variable takes and how often it takes these values.

Describing Categorical variables
Bar graph
Side-by-side Bar graph

Do you wear your seat belt?

| Region | Percent wearing seat belts, 2003 | Percent wearing seat belts, 1998 |
|---|---|---|
| Northeast | 74 | 66.4 |
| Midwest | 75 | 63.6 |
| South | 80 | 78.9 |
| West | 84 | 80.8 |

[Side-by-side bar graph: Wearing Seat belts: 1998 vs 2003; vertical axis: Percent, 0 to 100]

[Bar graph: Percents of Front-seat Passengers Wearing Seat belts in 2003; vertical axis: Percent, 65 to 85]

**Describing Quantitative variables**

Dotplot

The number of goals scored by the US women's soccer team in 34 games played during the 2004 season

3 0 2 7 8 2 4 3 5 1 1 4 5 3 1 1 3 3 3 2 1 2 2 2 4 3 5 6 1 5 5 1 1 5

[Dotplot: goals scored by the US women's soccer team in 2004; horizontal axis: Goals scored, 0 to 10]

Use your TI 84 plus.
Find Mode: Mean: Median: Range:
Graph the distribution
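If a calculator is not at hand, the same exercise can be sketched in Python. The block below is an illustration rather than the method the slides ask for: it builds a text dotplot of the 34 values and computes the four summaries. The helper name `dotplot_lines` is made up for this example.

```python
from collections import Counter
from statistics import mean, median, multimode

# Goals scored in the 34 games listed above.
goals = [3, 0, 2, 7, 8, 2, 4, 3, 5, 1, 1, 4, 5, 3, 1, 1, 3,
         3, 3, 2, 1, 2, 2, 2, 4, 3, 5, 6, 1, 5, 5, 1, 1, 5]

counts = Counter(goals)  # how often each number of goals occurs


def dotplot_lines(counts, low, high):
    """One text row per value: the value, then one dot per game."""
    return [f"{value:2d} | " + "." * counts.get(value, 0)
            for value in range(low, high + 1)]


for line in dotplot_lines(counts, min(goals), max(goals)):
    print(line)

print("mean:", round(mean(goals), 2))
print("median:", median(goals))
print("mode:", multimode(goals))
print("range:", max(goals) - min(goals))
```

The dotplot is the distribution of the variable: each row shows a value the variable takes and how often it takes that value.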

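**Exploring Relationships between variables**

| Airline | On time | Delayed | Percent of late flights |
|---|---|---|---|
| Alaska Airlines | 3274 | 501 | 13.3% |
| America West | 6438 | 787 | 10.9% |

| Departure city | Alaska Airlines: On time | Alaska Airlines: Delayed | Alaska Airlines: % of late flights | America West: On time | America West: Delayed | America West: % of late flights |
|---|---|---|---|---|---|---|
| Los Angeles | 497 | 62 | 11.1% | 694 | 117 | 14.4% |
| Phoenix | 221 | 12 | 5.2% | 4840 | 415 | 7.9% |
| San Diego | 212 | 20 | 8.6% | 383 | 65 | 14.5% |
| San Francisco | 503 | 102 | 16.9% | 320 | 129 | 28.7% |
| Seattle | 1841 | 305 | 14.2% | 201 | 61 | 23.3% |
| Total | 3274 | 501 | 13.3% | 6438 | 787 | 10.9% |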

Many relationships between two variables are influenced by other variables lurking in the background.
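The airline data show how a lurking variable, here the departure city, can reverse a comparison, a pattern often called Simpson's paradox. The sketch below recomputes the percents from the on-time and delayed counts in the tables; the function name `pct_delayed` is made up for this illustration.

```python
cities = ["Los Angeles", "Phoenix", "San Diego", "San Francisco", "Seattle"]

# Flight counts per departure city, in the same order as `cities`.
on_time = {
    "Alaska Airlines": [497, 221, 212, 503, 1841],
    "America West": [694, 4840, 383, 320, 201],
}
delayed = {
    "Alaska Airlines": [62, 12, 20, 102, 305],
    "America West": [117, 415, 65, 129, 61],
}


def pct_delayed(late, punctual):
    """Percent of flights that were delayed."""
    return 100 * late / (late + punctual)


# City by city, compare the two airlines.
for i, city in enumerate(cities):
    alaska = pct_delayed(delayed["Alaska Airlines"][i], on_time["Alaska Airlines"][i])
    west = pct_delayed(delayed["America West"][i], on_time["America West"][i])
    print(f"{city:14s} Alaska {alaska:5.1f}%   America West {west:5.1f}%")

# Overall, the comparison reverses.
overall = {
    airline: pct_delayed(sum(delayed[airline]), sum(on_time[airline]))
    for airline in on_time
}
print(overall)
```

Alaska Airlines has the lower delay percent at every airport but the higher percent overall, because most of its flights leave from Seattle, where delays are common, while most America West flights leave from Phoenix, where delays are rare.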

**Comparing the percents of delayed flights for the two airlines at five airports.**

[Bar graph: percent of delayed flights, 0 to 35, for Alaska Airlines and America West at Los Angeles, Phoenix, San Diego, San Francisco and Seattle]

Class Exercises

EXERCISES

P.7 P.8 P.9 P.10 P.12

(from pg 19)

PROBABILITY

**Probability: What are the Chances?**

When you toss a coin…

What is the probability of getting heads?

[Graph: Proportion of Heads, 0 to 1, versus Number of Tosses, 1 to 1000]

Probability: what happens in the long run.

**The Big idea of Probability**

Chance behavior is unpredictable in the short run, but has a regular and predictable pattern in the long run.
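A short simulation can illustrate this long-run pattern. The sketch below assumes a fair coin and uses a pseudo-random generator in place of real tosses; the function name `running_proportion_of_heads` and the seed are arbitrary choices for this example.

```python
import random


def running_proportion_of_heads(n_tosses, seed=0):
    """Simulate fair coin tosses; return the proportion of heads after each toss."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    proportions = []
    for toss in range(1, n_tosses + 1):
        heads += rng.random() < 0.5  # True counts as 1 head
        proportions.append(heads / toss)
    return proportions


proportions = running_proportion_of_heads(1000)
for n in (1, 5, 10, 50, 100, 500, 1000):
    print(f"{n:5d} tosses: proportion of heads = {proportions[n - 1]:.3f}")
```

With few tosses the proportion can be far from 0.5; as the number of tosses grows it tends to settle near 0.5, which is what "probability is what happens in the long run" means.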

Games of Chance:

Texas Hold'em, Blackjack, Roulette, Dice

Probability quantifies the pattern of chance variation.

Basic Probability

The probability of a given event can be thought of as the number of ways that event can happen divided by the number of ways that any possible outcome could happen, provided all outcomes are equally likely. If we identify the set of all possible outcomes as the "sample space" and denote it by S, and label the desired event as E, then the probability for event E can be written

P(E) = n(E) / n(S)

where n(E) is the number of outcomes in E and n(S) is the number of outcomes in S.

How Likely? = What is the Probability?

Rolls in Craps

What is the Probability?

**iE: Betting Dice**

| | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | Snake Eyes | Ace Deuce | Easy Four | Five (Fever Five) | Easy Six | Natural or Seven Out |
| 2 | Ace Deuce | Hard Four | Five (Fever Five) | Easy Six | Natural or Seven Out | Easy Eight |
| 3 | Easy Four | Five (Fever Five) | Hard Six | Natural or Seven Out | Easy Eight | Nine (Nina) |
| 4 | Five (Fever Five) | Easy Six | Natural or Seven Out | Hard Eight | Nine (Nina) | Easy Ten |
| 5 | Easy Six | Natural or Seven Out | Easy Eight | Nine (Nina) | Hard Ten | Yo (Yo-leven) |
| 6 | Natural or Seven Out | Easy Eight | Nine (Nina) | Easy Ten | Yo (Yo-leven) | Boxcars |

**The following chart shows the dice combinations needed to roll each number**

| Dice Roll | Possible Dice Combinations |
|---|---|
| 2 | 1-1 |
| 3 | 1-2, 2-1 |
| 4 | 1-3, 2-2, 3-1 |
| 5 | 1-4, 2-3, 3-2, 4-1 |
| 6 | 1-5, 2-4, 3-3, 4-2, 5-1 |
| 7 | 1-6, 2-5, 3-4, 4-3, 5-2, 6-1 |
| 8 | 2-6, 3-5, 4-4, 5-3, 6-2 |
| 9 | 3-6, 4-5, 5-4, 6-3 |
| 10 | 4-6, 5-5, 6-4 |
| 11 | 5-6, 6-5 |
| 12 | 6-6 |
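The chart can be turned into probabilities by counting. The sketch below assumes two fair six-sided dice, so that all 36 ordered combinations are equally likely, and divides the number of combinations for each total by 36. The variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: every ordered pair (first die, second die).
rolls = list(product(range(1, 7), repeat=2))

# Number of combinations that produce each total.
ways = Counter(first + second for first, second in rolls)

# Probability of each total = combinations for that total / all combinations.
probability = {total: Fraction(count, len(rolls)) for total, count in ways.items()}

for total in sorted(probability):
    print(f"{total:2d}: {ways[total]} ways, probability {probability[total]}")
```

Using `Fraction` keeps the results exact, so a total of 7 shows up as 1/6 rather than a rounded decimal.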

**Exploring Probability. Playing Cards, Dice, Spinners, and Coins**

| Game | Theoretical Probability | # of attempts | # of wins | Experimental Probability | # of attempts | # of wins | Experimental Probability |
|---|---|---|---|---|---|---|---|
| Draw a card with a heart on it. Be sure to replace the drawn card and shuffle the cards before the next attempt. | 1 out of 4 or .25 | 4 | | | 100 | | |
| Roll the die – Roll the number 3. | 1 out of 6 or .17 | 4 | | | 100 | | |
| Spinner – Spin the color red. | 1 out of 6 or .17 | 4 | | | 100 | | |
| Coin toss – The coin must land on heads. | 1 out of 2 or .50 | 4 | | | 100 | | |
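The activity can also be tried as a simulation before doing it by hand. This sketch replaces the physical cards, die, spinner and coin with a pseudo-random generator and uses the theoretical probabilities from the table; the spinner is assumed to have six equal sections, one of them red. The function name `experimental_probability` is made up for this example.

```python
import random

# Theoretical win probability for each game, taken from the table above.
games = {
    "Draw a heart": 1 / 4,
    "Roll a 3": 1 / 6,
    "Spin red": 1 / 6,  # assumes six equal spinner sections
    "Coin lands heads": 1 / 2,
}


def experimental_probability(p_win, attempts, rng):
    """Play `attempts` independent rounds; return (wins, wins / attempts)."""
    wins = sum(rng.random() < p_win for _ in range(attempts))
    return wins, wins / attempts


rng = random.Random(7)  # fixed seed so the run is repeatable
for name, p_win in games.items():
    for attempts in (4, 100):
        wins, estimate = experimental_probability(p_win, attempts, rng)
        print(f"{name:17s} theoretical {p_win:.2f}  attempts {attempts:3d}  "
              f"wins {wins:3d}  experimental {estimate:.2f}")
```

With only 4 attempts the experimental probability can only be 0, .25, .50, .75 or 1, so it is often far from the theoretical value; 100 attempts usually lands closer.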

STATISTICAL INFERENCE

**Drawing Conclusions from Data**

Have you ever cheated on a test or exam? “Yes” 48%
Internet survey of 1200 students, aged 13 to 17, between January 23 and February 10, 2003.
If all 13 to 17 year-old students were asked the same question, would exactly 48% have answered “yes”? What about with a second sample or a third sample?
Variation is everywhere!

Probability provides a description of how the Sample results will vary in relation to the true Population percent. We rely on Probability to help us answer research questions with a known degree of confidence.

Based on the sampling method in the previous example, we can say the estimate of 48% is very likely to be within 3 percentage points of the true population percent. That is, we can be confident that between 45% and 51% of all teenage students would say that they have cheated on a test. Statistical inference allows us to use the results of properly designed experiments and observational studies to draw conclusions that go beyond the data themselves.
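The slides do not say how the 3% figure was obtained. One common way to get a number of that size, shown here only as a sketch, is the normal-approximation margin of error for a sample proportion at 95% confidence. It assumes a simple random sample, which an Internet survey may not be, and the multiplier `z = 1.96` and the function name `margin_of_error` are choices made for this example.

```python
from math import sqrt


def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion (assumes an SRS)."""
    return z * sqrt(p_hat * (1 - p_hat) / n)


p_hat, n = 0.48, 1200  # 48% "yes" among 1200 students
moe = margin_of_error(p_hat, n)

print(f"margin of error: {moe:.3f}")
print(f"interval: {p_hat - moe:.3f} to {p_hat + moe:.3f}")
```

Under these assumptions the margin comes out a little under 3 percentage points, consistent with the "within 3%" statement and the 45% to 51% range quoted above.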

REVIEW EXERCISES

Exercises. P13, P14, P17, P18
Chapter Exs. P19, P20, P22, P23, P26
On-line Quiz. *20% First Period grade
Chapter P: What is Statistics?
www.whfreeman.com/tps3e
Register as a Student
Instructor e-mail gil_op@hotmail.com
