
Lectures 1 to 3

Based on Chapter 1 of the textbook

COURSE OUTLINE

2

IMPORTANT NOTE

In this course, I base my lectures on the posted lecture notes, but of course I elaborate on them in class, and I solve problems on the board.

These notes are useful for review, but they are not meant to take the place of lectures. Students who rely on the notes alone traditionally do not do well on the exams in this course.

Everything on the slides, on the board, and explained in words by me is covered in your exams.

3

AVAILABLE

RESOURCES FOR

PRACTICE QUESTIONS

4

• Examples solved during lectures (the most important source).
• Sets of practice questions and their answers, posted by the instructor on Owl.
• Group assignments in the course.
• Bonus pop quizzes.
• The student solutions manual, a separate book bundled with your textbook. It contains detailed solutions to all even-numbered exercises and applications in the textbook.
• MyStatLab, which is bundled with your textbook and contains algorithmically generated tutorial exercises for unlimited practice. The student access code for MyStatLab is inside the student access kit. Inside the kit, you can also find the web address for online registration as well as step-by-step guidance. The CourseCompass Course ID is entezarkheir87955.

5

GROUP ASSIGNMENTS

• You have to form groups of 5 students.
• Each group has to give me its members' names (official names, not nicknames) on Thursday, September 19. Give me one page with the official names, either typed or written legibly by hand.
• If you have not chosen a group by September 19, I will go over the class list and group the remaining students in alphabetical order.
• Each group will solve the assignment and give me only one copy, typed (not handwritten) and stapled, on the due date at the very beginning of class.
• Late assignments will receive a mark of zero.

6

MY EXPECTATIONS FROM YOU FOR THIS COURSE ARE:

• Attend lectures on time, so that you do not miss a pop quiz and its bonus mark.
• Read the relevant chapters of the textbook before attending lectures.
• Be focused and attentive during lectures and ask questions if you have any. Remember that I also hold office hours for your questions.
• Solve practice questions.
• Participate in solving your group assignments and hand them in on time.
• Take the midterm and final exams.

7

SOME OF THE FORMULAS IN THE LECTURES

You are not expected to memorize some of the formulas. As I lecture, I identify them on the slides. I will give you these formulas in the exam without any explanation. You should know what they are and how to use them.

However, there are other formulas that you have to memorize. This is normal for an econometrics course.

This is an econometrics course filled with formulas and equations. Some of you might find the formulas hard to understand. Please let me know if you cannot follow me, or come to my office hours.

8

Do cigarette taxes lead to obesity?!

Entezarkheir, Sen, Wilson (2010)

Tax

=>

9

Lecture Goals for Students

After completing the first few lectures of the course, you should be able to:

• Explain key definitions:
  Population vs. Sample
  Parameter vs. Statistic
  Descriptive vs. Inferential Statistics
• Describe random sampling
• Explain the difference between descriptive and inferential statistics
• Identify types of variables and levels of measurement
• Create and interpret graphs to describe categorical variables: frequency distribution, bar chart, pie chart, Pareto diagram
• Create a line graph to describe time-series data
• Create and interpret graphs to describe numerical variables: frequency distribution, histogram, ogive, stem-and-leaf display
• Construct and interpret graphs to describe relationships between variables: scatter plot, cross table

11

What Are Data?

• Data values or observations are information collected regarding some subject.
• Data are often organized into a data table such as the one below.

12

What Are Data? Ctd.

• The characteristics recorded about each individual, case, or subject are called variables.
• Variables are usually shown as the columns of a data table and identify what has been measured.

13

Variable Types: Categorical and Numerical

• When a variable names categories and answers questions about how cases fall into those categories, it is called a categorical variable.
• When a variable has measured numerical values with units and tells us about the quantity of what is measured, it is called a quantitative or numerical variable.

14

Categorical variables

Categorical variables …

• arise from descriptive responses to questions

such as “What kind of advertising do you use?”

• may only have two possible values (like “Yes”

or “No”)

• may be a number like a telephone area code

15

Numerical Variables

Numerical or quantitative variables have units. The units indicate

• how each value has been measured
• the corresponding scale of measurement
• how much of something we have
• how far apart two values are

16

FURTHER CLASSIFICATIONS OF VARIABLES

Variables
• Categorical
• Numerical (quantitative)
  - Discrete (counted items). Examples: number of children, defects per hour
  - Continuous (measured characteristics). Examples: weight, voltage

17

MEASUREMENT LEVELS OF CATEGORICAL VARIABLES: ORDINAL AND NOMINAL

• Nominal data: the responses are words that describe the categories. Any numbers attached to them are used only for convenience and do not imply any ordering (Example: gender coded as 1 if female and 0 if male).
• Ordinal data: the values show a rank ordering of items and, as with nominal data, the values are words that describe responses (Example: product quality rating, 1: poor, 2: average, 3: good).

18

EXAMPLE 1

Upon visiting a newly opened Starbucks store, customers were given a brief survey. Is the answer to each of the following questions categorical or numerical? If categorical, give the level of measurement. If numerical, is it discrete or continuous?

a) Is this your first visit to this Starbucks store?
b) On a scale from 1 (very dissatisfied) to 5 (very satisfied), rate your level of satisfaction with today’s purchase.
c) What was the actual cost of your purchase today?

19

DECISION MAKING IN AN UNCERTAIN ENVIRONMENT

• Everyday decisions are based on uncertain or incomplete information. Example: Will the job market be strong when I graduate?
• Data are used to assist decision making. How?
• Statistics is a tool to help process, summarize, analyze, and interpret data.

20

SOME KEY DEFINITIONS

• A population is the collection of all items of interest or under investigation. N represents the population size.
• A sample is an observed subset of the population. n represents the sample size.
• A parameter is a specific characteristic of a population.
• A statistic is a specific characteristic of a sample.

21

POPULATION VS. SAMPLE

• Population: values calculated using population data are called parameters.
• Sample: values computed from sample data are called statistics.

22

EXAMPLE 2

Suppose we are interested in examining household income in Canada.

Population: All households in Canada

Sample: The group of households that participated in the LFS (Labour Force Survey by Statistics Canada)

23

WHY DO WE NEED SAMPLES?

• Getting access to the whole population is not always feasible, and it is very costly.
• Thus, we select a sample from the population to make a statement about that population.
• How valid is this statement? It is valid if the sample is representative of the population.
• How can we achieve a representative sample? One important principle is selecting a random sample.

24

RANDOM SAMPLING

Simple random sampling is a procedure in which

• each member of the sample is chosen strictly by chance,
• each member of the population is equally likely to be chosen,
• and every possible sample of n objects is equally likely to be chosen.

The resulting sample is called a random sample.
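As a rough illustration of simple random sampling, the sketch below draws n members without replacement from a small made-up population. The population list, the helper name `simple_random_sample`, and the seed are assumptions for illustration only; they do not come from the lecture.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every subset of size n is equally likely."""
    rng = random.Random(seed)          # a seed is used only to make the run repeatable
    return rng.sample(population, n)   # sampling without replacement


# Hypothetical population of N = 100 households (illustrative labels only).
households = [f"household_{i}" for i in range(1, 101)]

sample = simple_random_sample(households, n=10, seed=42)
print(sample)
```

Sampling without replacement is used here because each household should appear at most once in the sample.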

25

Now, once we have collected a random sample, we look for ways to illustrate the information in the data before calculating statistics and using them to make decisions under uncertainty.

Therefore, we can use descriptive statistics.

26

DESCRIPTIVE AND INFERENTIAL STATISTICS

Two branches of statistics:

• Descriptive statistics: graphical and numerical procedures to summarize and process data. It includes tables, graphs, the mean, etc.
• Inferential statistics: using data to make predictions, forecasts, and estimates to assist decision making. Inference is the process of drawing conclusions or making decisions about a population based on sample results.

27

DESCRIPTIVE STATISTICS: GRAPHICAL PRESENTATION OF DATA

• Data in raw form are usually not easy to use for decision making.
• Some type of organization is needed: a table or a graph.
• The type of organization to use depends on whether the variable is categorical or numerical.

28

DESCRIPTIVE STATISTICS: GRAPHICAL PRESENTATION OF DATA CTD.

Categorical variables:
• Frequency distribution
• Cross table
• Bar chart
• Pie chart
• Pareto diagram

Numerical variables:
• Line chart
• Frequency distribution
• Histogram and ogive
• Stem-and-leaf display
• Scatter plot

29

DESCRIPTIVE STATISTICS: TABLES AND GRAPHS FOR CATEGORICAL VARIABLES

Categorical data
• Tabulating data: frequency distribution table, cross table
• Graphing data: bar chart, stacked or component bar chart, pie chart, Pareto diagram

30

DESCRIPTIVE STATISTICS: FREQUENCY DISTRIBUTION FOR TABULATING A CATEGORICAL VARIABLE

• A frequency distribution table is a table used to organize data.
• The left column (called classes or groups) includes all possible responses on the variable being studied.
• The right column is a list of frequencies, or numbers of observations, for each class.

31

DESCRIPTIVE STATISTICS: THE FREQUENCY DISTRIBUTION TABLE FOR A CATEGORICAL VARIABLE

Example 3: Hospital Patients by Unit

| Hospital Unit (categorical variable) | Number of Patients (frequency) | Percent (rounded) |
|---|---|---|
| Cardiac Care | 1,052 | 11.93 |
| Emergency | 2,245 | 25.46 |
| Intensive Care | 340 | 3.86 |
| Maternity | 552 | 6.26 |
| Surgery | 4,630 | 52.50 |
| Total | 8,819 | 100.0 |

32

A NOTE

• Frequency is the number of observations in each category.
• Relative frequency is obtained by dividing each frequency by the total number of observations.
• Percent is obtained by dividing each frequency by the total number of observations and multiplying the resulting proportion by 100%.
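The three quantities in this note can be checked against the hospital data of Example 3. The following is a minimal sketch; the variable names are illustrative choices, and only the patient counts come from the slides.

```python
# Frequencies from Example 3 (number of patients per hospital unit).
patients = {
    "Cardiac Care": 1052,
    "Emergency": 2245,
    "Intensive Care": 340,
    "Maternity": 552,
    "Surgery": 4630,
}

total = sum(patients.values())  # total number of observations

# For each unit: (frequency, relative frequency, percent rounded to 2 decimals).
table = {}
for unit, freq in patients.items():
    relative = freq / total
    table[unit] = (freq, relative, round(relative * 100, 2))

for unit, (freq, relative, percent) in table.items():
    print(f"{unit:15s} {freq:6d} {relative:8.4f} {percent:6.2f}%")
print(f"{'Total':15s} {total:6d}")
```

The rounded percentages agree with the Percent column of the Example 3 table.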

33

DESCRIPTIVE STATISTICS: BAR CHART FOR A CATEGORICAL VARIABLE

• When we want to draw attention to the frequency of each category of a categorical variable (as listed in the frequency distribution table), we use a bar chart.
• The height of the rectangle for a category is the frequency of that category, that is, the number of observations in it.
• There is no need for the bars to touch each other.

34

EXAMPLE 3 CTD.

Bar chart for patient data

[Figure: bar chart "Hospital Patients by Unit". Horizontal axis: hospital unit (Cardiac Care, Emergency, Intensive Care, Maternity, Surgery). Vertical axis: number of patients per year, from 0 to 5,000. Bar heights are the frequencies 1,052; 2,245; 340; 552; 4,630.]

35

DESCRIPTIVE STATISTICS: PIE CHARTS FOR A CATEGORICAL VARIABLE

• If the goal is to draw attention to the proportion of frequencies in each category of the frequency table, a pie chart is appropriate.
• The circle, or pie, represents the total.
• The pieces of the pie display shares of the total (frequencies or percentages) for each category of the categorical variable.

36

EXAMPLE 3 CTD.

Pie chart for patient data

| Hospital Unit | Number of Patients | % of Total |
|---|---|---|
| Cardiac Care | 1,052 | 11.93 |
| Emergency | 2,245 | 25.46 |
| Intensive Care | 340 | 3.86 |
| Maternity | 552 | 6.26 |
| Surgery | 4,630 | 52.50 |

[Figure: pie chart "Hospital Patients by Unit" with slices Cardiac Care 12%, Emergency 25%, Intensive Care 4%, Maternity 6%, Surgery 53%. Percentages are rounded to the nearest percent.]


DESCRIPTIVE STATISTICS: CROSS TABLES FOR CATEGORICAL VARIABLES

• Cross tables (or contingency tables) list the number of observations for every combination of values for two categorical variables.
• If there are r categories for the first variable (rows) and c categories for the second variable (columns), the table is called an r × c cross table.
• When you want to display two categorical variables together, you describe them with cross tables and picture them with component bar charts.

38

EXAMPLE 4

3 x 3 Cross Table for Investment Choices by Investor (values in $1000’s)

| Investment Category | Investor A | Investor B | Investor C | Total |
|---|---|---|---|---|
| Stocks | 46 | 55 | 27 | 128 |
| Bonds | 32 | 44 | 19 | 95 |
| Cash | 15 | 20 | 33 | 68 |
| Total | 93 | 119 | 79 | 291 |
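The row and column totals of the cross table in Example 4 can be reproduced with a short sketch. The nested-dictionary layout and the variable names are assumptions made for illustration; the cell values are those of the example.

```python
# Cell values from Example 4 (in $1000's): rows are investment categories,
# columns are investors A, B, and C.
investments = {
    "Stocks": {"A": 46, "B": 55, "C": 27},
    "Bonds": {"A": 32, "B": 44, "C": 19},
    "Cash": {"A": 15, "B": 20, "C": 33},
}
investors = ["A", "B", "C"]

# Row totals: sum across investors for each category.
row_totals = {category: sum(row.values()) for category, row in investments.items()}

# Column totals: sum across categories for each investor.
col_totals = {
    investor: sum(row[investor] for row in investments.values())
    for investor in investors
}

grand_total = sum(row_totals.values())

print(row_totals)
print(col_totals)
print(grand_total)
```

The grand total can be obtained from either the row totals or the column totals, which is a quick consistency check on any cross table.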

39

EXAMPLE 4 CTD

To display the cross table, we use the stacked or component bar chart.

40

EXAMPLE 5 (Q 1.12 ON PAGE 14)

41

ANSWER TO EXAMPLE 5

[Figure: stacked bar chart "Employee Performance". Horizontal axis: experience (less than 3 months, 3 to 6 months, 6 to 9 months, 9 to 12 months). Vertical axis: number of employees, from 0 to 60. Stacked components: < 5 min, 5 to <10 min, 10 to <15 min.]

42

At home for extra practice:

• Read Example 1.2 on page 10 (cross table and component bar chart).
• Read Example 1.3 on page 11 (pie chart).

43

DESCRIPTIVE STATISTICS: PARETO DIAGRAM FOR A CATEGORICAL VARIABLE

• Many managers who need to identify the major causes of problems and correct them quickly at minimum cost use a special bar chart known as a Pareto diagram.
• The underlying idea is that in most cases a small number of factors (the “vital few”) are responsible for most of the problems, while the remaining factors (the “trivial many”) account for the rest.
• This graph is used to separate the “vital few” from the “trivial many”.

44

DESCRIPTIVE STATISTICS: PARETO DIAGRAM CTD.

• A Pareto diagram is used to portray categorical variables.
• A Pareto diagram is a bar chart in which the categories are shown in descending order of frequency.

45

EXAMPLE 6

400 defective items are examined for cause of defect. Display the Pareto diagram.

| Source of Manufacturing Error | Number of Defects |
|---|---|
| Bad Weld | 34 |
| Poor Alignment | 223 |
| Missing Part | 25 |
| Paint Flaw | 78 |
| Electrical Short | 19 |
| Cracked case | 21 |
| Total | 400 |

46

ANSWER TO EXAMPLE 6

Step 1: Sort the defect causes in descending order of frequency

Step 2: Determine the % in each category

| Source of Manufacturing Error | Number of Defects | % of Total Defects |
|---|---|---|
| Poor Alignment | 223 | 55.75 |
| Paint Flaw | 78 | 19.50 |
| Bad Weld | 34 | 8.50 |
| Missing Part | 25 | 6.25 |
| Cracked case | 21 | 5.25 |
| Electrical Short | 19 | 4.75 |
| Total | 400 | 100 |
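Steps 1 and 2 can be sketched in a few lines, together with the cumulative percentage that a Pareto diagram usually adds as a line. The variable names below are illustrative; the defect counts are those of Example 6.

```python
# Defect counts from Example 6, in the original (unsorted) order.
defects = {
    "Bad Weld": 34,
    "Poor Alignment": 223,
    "Missing Part": 25,
    "Paint Flaw": 78,
    "Electrical Short": 19,
    "Cracked case": 21,
}

total = sum(defects.values())

# Step 1: sort the causes in descending order of frequency.
ordered = sorted(defects.items(), key=lambda item: item[1], reverse=True)

# Step 2: percentage per category, plus the running (cumulative) percentage.
pareto = []
running_count = 0
for cause, count in ordered:
    running_count += count
    pareto.append((cause, count, 100 * count / total, 100 * running_count / total))

for cause, count, pct, cum_pct in pareto:
    print(f"{cause:18s} {count:4d} {pct:6.2f}% {cum_pct:7.2f}%")
```

The first two causes together already account for about three quarters of all defects, which is the "vital few" pattern the diagram is meant to reveal.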

47

ANSWER TO EXAMPLE 6 CTD.

Step 3: Show results graphically

Note: ignore the line for now.

[Figure: "Pareto Diagram: Cause of Manufacturing Defect". A bar graph shows the % of defects in each category (axis from 0% to 60%) in the order Poor Alignment, Paint Flaw, Bad Weld, Missing Part, Cracked case, Electrical Short. A line graph shows the cumulative % (axis from 0% to 100%).]

48

DESCRIPTIVE STATISTICS: GRAPHS TO DESCRIBE TIME-SERIES VARIABLES

• A line chart (time-series plot) is used to show the values of a variable over time.
• Time is measured on the horizontal axis.
• The variable of interest is measured on the vertical axis.

Note: Time-series data are numerical data.

49

EXAMPLE 7: LINE CHART

50

REMINDER: CLASSIFICATION OF VARIABLES

Data
• Categorical
• Numerical
  - Discrete
  - Continuous

51

REMINDER: TABLES AND GRAPHS FOR CATEGORICAL VARIABLES

Categorical variables
• Tabulating data: frequency distribution table, cross table
• Graphing data: bar chart, component bar chart, pie chart, Pareto diagram

52

DESCRIPTIVE STATISTICS: GRAPHS TO DESCRIBE NUMERICAL VARIABLES

Numerical data
• Frequency distributions and cumulative distributions
• Histogram
• Ogive
• Stem-and-leaf display

53

DESCRIPTIVE STATISTICS: FREQUENCY DISTRIBUTION TABLE FOR NUMERICAL VARIABLES

• As with categorical variables, we also use a frequency distribution table to organize the data for numerical variables.
• This table contains class groupings (categories or ranges within which the data fall) and the corresponding frequencies (the number of observations that fall within each class or category).
• In contrast to the case of categorical variables, the groups in the frequency distribution table for numerical variables are defined by numbers.

54

DESCRIPTIVE STATISTICS: FREQUENCY DISTRIBUTION TABLE FOR A NUMERICAL VARIABLE CTD.

Therefore, we need to know:

1. How many classes or groups do we have?
2. How wide should each class be?

We should also know that classes should have the same width. This makes comparison of groups easier.

We should also know that classes should never overlap and must be inclusive. Why?

55

DESCRIPTIVE STATISTICS: FREQUENCY DISTRIBUTION TABLE FOR A NUMERICAL VARIABLE CTD.

• No overlap, because we do not want to count the same observation in two different groups.
• Inclusive, because we want to include all the information that we have.

These points imply that, compared with the case of a categorical variable, we need more information to construct the frequency distribution table.

56

CLASS INTERVALS AND CLASS BOUNDARIES FOR THE FREQUENCY DISTRIBUTION TABLE OF NUMERICAL VARIABLES

• Each class grouping has the same width.
• Determine the width of each interval by

$$w = \text{interval width} = \frac{\text{largest number} - \text{smallest number}}{\text{number of desired intervals}}$$

• Use at least 5 but no more than 15-20 intervals.
• Intervals never overlap.
• Round up the interval width to get desirable interval endpoints.

57

EXAMPLE 8

A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature:

24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Find the frequency distribution table for this variable.

58

ANSWER TO EXAMPLE 8

• Sort the raw data in ascending order:
  12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Find the range: 58 - 12 = 46
• Select the number of classes: 5 (usually between 5 and 15)
• Compute the interval width: 10 (46/5 = 9.2, then round up)
• Determine the interval boundaries: 10 but less than 20, 20 but less than 30, . . . , 50 but less than 60
• Count the observations and assign them to classes
**

59

ANSWER TO EXAMPLE 8 CTD.

Data in ordered array:

12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
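The counting step can be sketched as follows, using the five classes chosen in the answer to Example 8 ("10 but less than 20" up to "50 but less than 60"). The variable names are illustrative, and the width of 10 and the starting boundary of 10 are taken from the slide rather than computed by a rounding rule.

```python
# Daily high temperatures from Example 8 (20 winter days).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

num_classes = 5                          # number of classes chosen on the slide
data_range = max(temps) - min(temps)     # 58 - 12 = 46
raw_width = data_range / num_classes     # 46 / 5 = 9.2
width = 10                               # 9.2 rounded up to a convenient value, as on the slide
first_lower = 10                         # lower boundary of the first class, as on the slide

# Classes are half-open: "lower but less than upper", so they never overlap.
classes = [(first_lower + i * width, first_lower + (i + 1) * width)
           for i in range(num_classes)]

frequency_table = {
    (lower, upper): sum(1 for t in temps if lower <= t < upper)
    for lower, upper in classes
}

for (lower, upper), freq in frequency_table.items():
    print(f"{lower} but less than {upper}: {freq}")
```

The frequencies add up to 20, so the classes are inclusive of every observation.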

60

DESCRIPTIVE STATISTICS: HISTOGRAM FOR A NUMERICAL VARIABLE

• A histogram is a graph that consists of vertical bars constructed on a horizontal line that is marked off with intervals for the variable being displayed (the intervals of the frequency distribution table).
• The interval endpoints are shown on the horizontal axis.
• The vertical axis is either the frequency, relative frequency, or percentage for each interval.
• A histogram displays the distribution of the variable.

61

EXAMPLE 8 CTD.: PLOT THE HISTOGRAM FOR THE MANUFACTURER OF INSULATION

[Figure: "Histogram: Daily High Temperature". Horizontal axis: temperature in degrees, with interval endpoints 0, 10, 20, 30, 40, 50, 60. Vertical axis: frequency. Bar heights: 0 (0 to 10), 3 (10 to 20), 6 (20 to 30), 5 (30 to 40), 4 (40 to 50), 2 (50 to 60). There are no gaps between bars.]

QUESTIONS FOR GROUPING DATA INTO INTERVALS FOR A FREQUENCY DISTRIBUTION

1. How wide should each interval be? (How many classes should be used?)

$$w = \text{interval width} = \frac{\text{largest number} - \text{smallest number}}{\text{number of desired intervals}}$$

2. How should the endpoints of the intervals be determined?

• Often answered by trial and error, subject to user judgment.
• The goal is to create a distribution that is neither too "jagged" nor too "blocky".
• The goal is to appropriately show the pattern of variation in the data.


HOW MANY CLASS INTERVALS?

• Many (narrow class intervals): may yield a very jagged distribution with gaps from empty classes, and can give a poor indication of how frequency varies across classes.
• Few (wide class intervals): may compress variation too much and yield a blocky distribution, and can obscure important patterns of variation.

[Figures: two histograms of temperature (horizontal axis) against frequency (vertical axis), one with narrow class intervals and one with wide class intervals. X-axis labels are upper class endpoints.]

SHAPE OF A DISTRIBUTION

• A histogram helps us determine the shape of the data distribution.
• We can visually determine whether the data are evenly spread around the middle or center.
• A distribution is said to be symmetric if the observations are balanced, or evenly distributed, around its center.
• If you fold the histogram along a vertical line through the middle and the edges match closely, you have a symmetric distribution.

65

SHAPE OF A DISTRIBUTION CTD.

• The (usually) thinner ends of a distribution are called tails.
• A distribution is skewed, or asymmetric, if the observations are not symmetrically distributed on either side of the center.
• If one tail stretches out farther than the other, the distribution is skewed to the side of the longer tail.

66

SHAPE OF A DISTRIBUTION CTD.

• A skewed-right distribution (sometimes called positively skewed) has a tail that extends farther to the right.
• A skewed-left distribution (sometimes called negatively skewed) has a tail that extends farther to the left.

67

A REMINDER

• Frequency is the number of observations in each category.
• Relative frequency is obtained by dividing each frequency by the total number of observations.
• Percent is obtained by dividing each frequency by the total number of observations and multiplying the resulting proportion by 100%.

68

THE CUMULATIVE FREQUENCY DISTRIBUTION

A cumulative frequency distribution contains the total number of observations whose values are less than the upper limit of each class in the frequency distribution table of a numerical variable.

Equivalently, the cumulative frequency of a class is the frequency of that class plus the frequencies of all preceding classes.

69

EXAMPLE 8 CTD.: MANUFACTURER OF INSULATION

Data in ordered array:

12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
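The cumulative frequencies and cumulative percentages for these data can be sketched as below, reusing the five classes from the answer to Example 8. The variable names are illustrative, and starting the ogive at the lower limit of the first class with 0% is a common convention assumed here, not something stated on the slide.

```python
from itertools import accumulate

# Daily high temperatures from Example 8 (20 winter days).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

# Classes from the answer to Example 8: "lower but less than upper".
classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]

frequencies = [sum(1 for t in temps if lower <= t < upper)
               for lower, upper in classes]

# Cumulative frequency: observations below the upper limit of each class.
cumulative_freq = list(accumulate(frequencies))

n = len(temps)
cumulative_pct = [100 * c / n for c in cumulative_freq]

# Points an ogive would connect: (upper limit, cumulative percent).
# The starting point (10, 0.0) is an assumed convention.
ogive_points = [(classes[0][0], 0.0)] + [
    (upper, pct) for (_, upper), pct in zip(classes, cumulative_pct)
]

for (lower, upper), f, cf, cp in zip(classes, frequencies, cumulative_freq, cumulative_pct):
    print(f"{lower} but less than {upper}: freq={f}, cum. freq={cf}, cum. %={cp:.0f}")
```

The last cumulative frequency equals the sample size, and the last cumulative percentage is 100.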

70

At home for extra practice:

• Read Example 1.9 on page 22 (frequency and cumulative frequency for numerical variables).

71

DESCRIPTIVE STATISTICS: OGIVE GRAPH FOR NUMERICAL VARIABLES

An ogive, sometimes called a cumulative line graph, is a line that connects points that are the cumulative percent of observations below the upper limit of each interval in a cumulative frequency distribution.

72

EXAMPLE 8 CTD.: INSULATION MANUFACTURER

Plot the ogive graph.

[Figure: "Ogive: Daily High Temperature". Horizontal axis: upper interval endpoints (0, 10, 20, 30, 40, 50, 60). Vertical axis: cumulative percentage, from 0 to 100.]

EXAMPLE 9

Suppose we have the following data:

17, 62, 15, 65, 28, 51, 24, 65, 39, 41, 35, 15, 39, 32,
36, 37, 40, 21, 44, 37, 59, 13, 44, 56, 12, 54, 64, 59

• Construct a frequency distribution table.
• Construct a histogram.

74

DESCRIPTIVE STATISTICS: STEM-AND-LEAF DIAGRAM FOR NUMERICAL VARIABLES

• A stem-and-leaf diagram is a simple way to see distribution details in a data set.
• It is only good for small data sets.

METHOD: Separate the sorted data series into leading digits (the stems) and trailing digits (the leaves).

• The number of digits (leaves) in each class indicates the class frequency.
• The individual digits indicate the pattern of values within each class.

75

EXAMPLE 10

Data in ordered array:

21, 24, 24, 26, 27, 27, 30, 32, 38, 41

Here, use the 10’s digit for the stem unit:

| Value | Stem | Leaf |
|---|---|---|
| 21 | 2 | 1 |
| 38 | 3 | 8 |

76

EXAMPLE 10 CTD.

Data in ordered array:

21, 24, 24, 26, 27, 27, 30, 32, 38, 41

Completed stem-and-leaf diagram:

| Stem | Leaves |
|---|---|
| 2 | 1 4 4 6 7 7 |
| 3 | 0 2 8 |
| 4 | 1 |
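The stem-and-leaf method of Example 10 can be sketched with integer division, where the quotient by 10 is the stem and the remainder is the leaf. The function name `stem_and_leaf` and its `stem_unit` parameter are illustrative assumptions.

```python
from collections import defaultdict


def stem_and_leaf(values, stem_unit=10):
    """Group sorted integer values by stem (leading digits) and list their leaves."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, stem_unit)  # e.g. 21 -> stem 2, leaf 1
        stems[stem].append(leaf)
    return dict(stems)


# Data from Example 10.
data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
display = stem_and_leaf(data)

for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

The number of leaves on each stem is the class frequency: six values in the 20s, three in the 30s, and one in the 40s.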

77

USING OTHER STEM UNITS

Using the 100’s digit as the stem, round off to the 10’s digit to form the leaves:

| Value | Stem | Leaf |
|---|---|---|
| 613 | 6 | 1 |
| 776 | 7 | 8 |
| ... | | |
| 1224 | 12 | 2 |

78

USING OTHER STEM UNITS CTD.

Using the 100’s digit as the stem, the completed stem-and-leaf display:

Data: 613, 632, 658, 717, 722, 750, 776, 827, 841, 859, 863, 891, 894, 906, 928, 933, 955, 982, 1034, 1047, 1056, 1140, 1169, 1224

| Stem | Leaves |
|---|---|
| 6 | 1 3 6 |
| 7 | 2 2 5 8 |
| 8 | 3 4 6 6 9 9 |
| 9 | 1 3 3 6 8 |
| 10 | 3 5 6 |
| 11 | 4 7 |
| 12 | 2 |
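The rounded leaves in this display can be reproduced with the sketch below. Rounding the last two digits to the nearest ten with halves rounded up is an assumption that happens to match every leaf on the slide; the function name is illustrative.

```python
def rounded_stem_and_leaf(values):
    """Use the hundreds as the stem and the rounded tens digit as the leaf."""
    stems = {}
    for value in sorted(values):
        stem, rest = divmod(value, 100)   # e.g. 776 -> stem 7, rest 76
        leaf = (rest + 5) // 10           # round to the nearest ten, halves up: 76 -> 8
        # Note: a rest of 95 or more would give a leaf of 10; none occurs in this data set.
        stems.setdefault(stem, []).append(leaf)
    return stems


# Data from the slide.
data = [613, 632, 658, 717, 722, 750, 776, 827, 841, 859, 863, 891,
        894, 906, 928, 933, 955, 982, 1034, 1047, 1056, 1140, 1169, 1224]

display = rounded_stem_and_leaf(data)
for stem, leaves in display.items():
    print(f"{stem:2d} | {''.join(str(leaf) for leaf in leaves)}")
```

Because the leaves are rounded, the original values cannot be recovered exactly from the display, which is the price of using a coarser stem unit.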

79

EXAMPLE 11

The data presented below were collected on the amount of time, in hours, it takes an employee to process an order at a local plumbing wholesaler.

2.8, 5.5, 4.9, 10.2, 0.5, 1.1, 13.2, 14.2, 14.2, 7.8,
8.9, 4.5, 3.7, 10.9, 15.2, 8.8, 11.2, 13.4, 18.2, 17.1

Construct a stem-and-leaf display of the data.

Answer:

This time the stem unit is 0.1.

80

So far

• we used bar charts, pie charts, and Pareto diagrams to describe a single categorical variable.
• we used component bar charts to describe two categorical variables.
• we used histograms, ogives, and stem-and-leaf displays to describe a single numerical variable.

Now we use scatter plots to describe two numerical variables.

81

Descriptive statistics: Scatter Diagrams or Plots for numerical variables

• Scatter diagrams are used for paired observations taken from two numerical variables.
• In a scatter diagram, one variable is measured on the vertical axis and the other variable is measured on the horizontal axis.

82

EXAMPLE 12: PLOT THE SCATTER DIAGRAM FOR THE TABLE.

Average SAT scores by state: 1998

| State | Verbal | Math |
|---|---|---|
| Alabama | 562 | 558 |
| Alaska | 521 | 520 |
| Arizona | 525 | 528 |
| Arkansas | 568 | 555 |
| California | 497 | 516 |
| Colorado | 537 | 542 |
| Connecticut | 510 | 509 |
| Delaware | 501 | 493 |
| D.C. | 488 | 476 |
| Florida | 500 | 501 |
| Georgia | 486 | 482 |
| Hawaii | 483 | 513 |
| … | … | … |
| W.Va. | 525 | 513 |
| Wis. | 581 | 594 |
| Wyo. | 548 | 546 |

LECTURE SUMMARY FOR STUDENTS

• Introduced key definitions:
  Population vs. Sample
  Parameter vs. Statistic
  Descriptive vs. Inferential statistics
• Described random sampling
• Examined the decision making process
• Reviewed types of data and measurement levels
• Data in raw form are usually not easy to use for decision making; some type of organization is needed: a table or a graph
• Techniques reviewed in this chapter:
  Categorical variables: frequency distribution, cross tables, bar chart, pie chart, Pareto diagram
  Numerical variables: line chart, frequency distribution, histogram and ogive, stem-and-leaf display, scatter plot
