Uploaded by Wei Cong

Attribution Non-Commercial (BY-NC)

- Collecting and presenting data to assist decision making
- Processing and analyzing data
- Obtaining reliable forecasts

- To inspect the incoming goods from a supplier (one-sample hypothesis testing)
- Developers of a new hypertension drug want to determine whether the drug lowers blood pressure (two-sample hypothesis testing)
- In marketing, statistics is used to evaluate whether higher spending on advertising is justified (simple linear regression)
- To forecast economic indices, such as GNP and GDP, that are related to many factors (multiple linear regression)

Key Definitions

A population (universe) is the collection of all members of a group

N represents the population size

n represents the sample size

A parameter is a numerical measure that describes a characteristic of a population

A statistic is a numerical measure that describes a characteristic of a sample

[Figure: a population (the letters a through z) and a sample drawn from it (b, c, g, i, n, o, r, u, y)]

Measures computed from sample data are called statistics

Examples

Population: All eligible voters
Sample: 1000 voters polled

Population: All light bulbs manufactured in a day
Sample: 100 light bulbs selected

Population: All patients with high blood pressure for a clinical study
Sample: 200 hypertension patients enrolled for a clinical study

Descriptive Statistics

Collecting, presenting, and characterizing data

Inferential Statistics

Drawing conclusions and/or making decisions concerning a population based only on sample data

Descriptive Statistics

Collect data

e.g., Survey

Present data

e.g., Tables and graphs

Characterize data

Inferential statistics

Estimation: e.g., estimate the population mean weight using the sample mean weight

Hypothesis testing: e.g., test the claim that earnings are higher for males than for females

- Less time consuming than a census
- Less costly to administer than a census
- Less cumbersome and more practical to administer than a census of the population

Types of Data

Data

- Categorical (defined categories). Examples: marital status, political party, eye color
- Numerical
  - Discrete (counted items). Examples: number of children, defects per hour
  - Continuous (measured characteristics). Examples: weight, distance

Numerical Data

41, 24, 32, 26, 27, 27, 30, 24, 38, 21

[Figure: histogram and table presentations of the numerical data]

Stem-and-Leaf Display

A simple way to see distribution details in a data set

METHOD: Separate the sorted data series into leading digits (the stem) and the trailing digits (the leaves)

Data in Raw Form (as Collected): 24, 26, 24, 21, 27, 27, 30, 41, 32, 38

Data in Ordered Array from Smallest to Largest: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41

Stem-and-Leaf Display:

2 | 1 4 4 6 7 7
3 | 0 2 8
4 | 1
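The method above can be sketched in a few lines of Python. This is only an illustration for two-digit data like the example; the function name `stem_and_leaf` is a hypothetical helper, not something from the slides.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group sorted two-digit values into (stem, leaves) rows."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # tens digit is the stem, units digit the leaf
        groups[stem].append(leaf)
    return [(stem, groups[stem]) for stem in sorted(groups)]


raw = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]
rows = stem_and_leaf(raw)
for stem, leaves in rows:
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Sorting first means each row's leaves come out in increasing order, which is what makes the display readable as a sideways histogram.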

What is a Frequency Distribution?

A frequency distribution is a list or a table containing class groupings (ranges within which the data fall) and the corresponding frequencies with which data fall within each grouping or category.

It condenses the raw data and allows for a quick visual interpretation of the data.

Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature.

24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Data in Ordered Array:

12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

- Find Range: 58 - 12 = 46
- Select Number of Classes: 5 (usually between 5 and 15)
- Compute Class Interval (Width): 10 (46/5, then round up)
- Determine Class Boundaries (Limits): 10, 20, 30, 40, 50, 60

Class       Frequency   Percentage
[10, 20)    3           15
[20, 30)    6           30
[30, 40)    5           25
[40, 50)    4           20
[50, 60)    2           10
Total       20          100
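As a rough check of the table, the counting step can be sketched in Python. The helper name `frequency_distribution` and its parameters are assumptions made for this illustration; the class start (10), width (10), and number of classes (5) are the values chosen in the example.

```python
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]


def frequency_distribution(values, start, width, num_classes):
    """Count values in half-open classes [start + k*width, start + (k+1)*width)."""
    counts = [0] * num_classes
    for v in values:
        index = (v - start) // width
        if not 0 <= index < num_classes:
            raise ValueError(f"{v} falls outside the class boundaries")
        counts[index] += 1
    return counts


counts = frequency_distribution(temps, start=10, width=10, num_classes=5)
percentages = [100 * c / len(temps) for c in counts]

for k, (c, p) in enumerate(zip(counts, percentages)):
    lower = 10 + 10 * k
    print(f"[{lower}, {lower + 10})  {c:>2}  {p:.0f}%")
```

Using half-open classes means a boundary value such as 30 is counted once, in [30, 40), which matches the class notation in the table.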

Histogram Example

Class       Class Midpoint   Frequency
[10, 20)    15               3
[20, 30)    25               6
[30, 40)    35               5
[40, 50)    45               4
[50, 60)    55               2

Distribution Shape

The shape of the distribution is said to be symmetric if the observations are balanced, or evenly distributed, about the center.

Symmetric Distribution

[Figure: histogram of a symmetric distribution; vertical axis: Frequency]

Distribution Shape

(continued)

The shape of the distribution is said to be skewed if the observations are not symmetrically distributed around the center.

Positively Skewed Distribution

A positively skewed distribution (skewed to the right) has a tail that extends to the right, in the direction of positive values.

[Figure: histogram of a positively skewed distribution; vertical axis: Frequency]

A negatively skewed distribution (skewed to the left) has a tail that extends to the left, in the direction of negative values.

[Figure: histogram of a negatively skewed distribution; vertical axis: Frequency]

[Figure: Histogram: Daily high temperature; horizontal axis: class midpoints 5 to 55 and "More"; vertical axis: Frequency]

Numerical description

Summary Measures

- Mean, Median, Mode
- Quartiles
- Variation: Range, Variance

Mean

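Mean (Arithmetic Mean) of Data Values

Sample mean (n = sample size):

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

Population mean (N = population size):

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$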

An example

TV watching hours/week: 5, 7, 3, 38, 7

Mean = (5 + 7 + 3 + 38 + 7)/5 = 60/5 = 12

With the extreme value 38 replaced by 8:

Mean = (5 + 7 + 3 + 8 + 7)/5 = 30/5 = 6

- The most common measure of central tendency, especially when n is large
- Affected by extreme values (outliers)

Median

- Robust measure of central tendency
- Not affected by extreme values

Ordered data 3, 5, 7, 7, 38: Median = 7

Ordered data 3, 5, 7, 7, 8: Median = 7

- If n is odd, the median is the middle number (i.e., the (n+1)/2-th measurement)
- If n is even, the median is the average of the n/2-th and (n/2 + 1)-th measurements
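The contrast between the mean and the median on the TV-watching data can be sketched in Python. The helper `median_by_position` is a hypothetical name that simply follows the odd/even position rule above; `statistics.mean` and `statistics.median` are standard-library functions used as a cross-check.

```python
from statistics import mean, median

hours = [5, 7, 3, 38, 7]                 # includes the extreme value 38
hours_without_outlier = [5, 7, 3, 8, 7]  # 38 replaced by 8


def median_by_position(values):
    """Median from the position rule: middle value, or average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]  # (n+1)/2-th measurement, 1-based
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


for label, data in [("with 38", hours), ("with 8", hours_without_outlier)]:
    print(label, "mean =", mean(data), "median =", median_by_position(data))
```

The mean moves from 12 to 6 when the extreme value changes, while the median stays at 7 in both cases.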

Mode

- A measure of central tendency
- Value that occurs most often
- Not affected by extreme values
- There may not be a mode
- There may be several modes
- Used for either numerical or categorical data

[Figure: two dot plots, one on a 0-to-14 axis with Mode = 9 and one on a 0-to-6 axis with No Mode]

- The mean is generally used, unless extreme values (outliers) exist
- When outliers exist, the median is often used instead, since the median is not sensitive to extreme values

Example: Median home prices may be reported for a region, because the median is less sensitive to outliers

Quartiles

Split ordered data into 4 quarters, each containing 25% of the observations, at Q1, Q2, and Q3.

Position of the i-th quartile:

$$\text{Position of } Q_i = \frac{i(n+1)}{4}$$

Quartiles are measures of noncentral location. Q1, Q2, and Q3 are called the 25th, 50th, and 75th percentiles, respectively. A pth percentile is the value of X such that p% of the measurements are less than X and (100-p)% are greater than X.

Data in Ordered Array: 3 6 6 12 12 12 15 15 18 21

$$\text{Position of } Q_1 = \frac{1(10+1)}{4} = 2.75$$

$$\text{Position of } Q_3 = \frac{3(10+1)}{4} = 8.25$$
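A fractional position has to be turned into a value somehow. The sketch below assumes linear interpolation between the two neighbouring observations, which is consistent with the value 15.75 that the box-and-whisker slide lists for this data, but the slides do not state the rule explicitly. The function name `quartile` is hypothetical; `statistics.quantiles` with `method="exclusive"` is a standard-library call that uses the same position convention and serves as a cross-check.

```python
from statistics import quantiles


def quartile(ordered, i):
    """Value at 1-based position i(n+1)/4, interpolating between neighbours."""
    n = len(ordered)
    position = i * (n + 1) / 4
    lower = int(position)        # whole part of the position
    fraction = position - lower  # fractional part used for interpolation
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


data = [3, 6, 6, 12, 12, 12, 15, 15, 18, 21]
q1, q2, q3 = (quartile(data, i) for i in (1, 2, 3))
print(q1, q2, q3)
print(quantiles(data, n=4, method="exclusive"))
```

Other textbooks round the position to the nearest observation instead of interpolating, so quartile values reported elsewhere for the same data may differ slightly.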

Box-and-Whisker Plot

Graphical display of data using the 5-number summary

Data in Ordered Array: 3 6 6 12 12 12 15 15 18 21

X smallest = 3, Q1 = 6, Median (Q2) = 12, Q3 = 15.75, X largest = 21

Suppose that you are a purchasing agent for a large manufacturing firm and that you regularly place orders with two different suppliers (A and B). The numbers of days required to fill orders are the following:

A: 9, 10, 10, 10, 10, 10, 11, 11, 11, 11
B: 7, 7, 8, 10, 10, 10, 11, 12, 13, 15

Supplier A: Mean = 10.3, Median = Mode = 10

Supplier B: Mean = 10.3, Median = Mode = 10

[Figure: histograms of delivery times for Supplier A and Supplier B; horizontal axis: # of days (7 to 15); vertical axis: Frequency]
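The supplier comparison can be reproduced with Python's standard `statistics` module. The helper `summarize` is a hypothetical name for this sketch; the spread is reported as largest minus smallest value, which is only one of several possible measures of variation.

```python
from statistics import mean, median, mode

supplier_a = [9, 10, 10, 10, 10, 10, 11, 11, 11, 11]
supplier_b = [7, 7, 8, 10, 10, 10, 11, 12, 13, 15]


def summarize(days):
    """Central tendency plus the largest-minus-smallest spread."""
    return {
        "mean": mean(days),
        "median": median(days),
        "mode": mode(days),
        "range": max(days) - min(days),
    }


summary_a = summarize(supplier_a)
summary_b = summarize(supplier_b)
print("A:", summary_a)
print("B:", summary_b)
```

Both suppliers share the same mean, median, and mode, yet Supplier B's delivery times are far more spread out, which is why central tendency alone is not enough to compare them.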

Measures of Variation

Variation

- Range
- Interquartile Range
- Variance
- Standard Deviation

Measures of variation give information on the spread or variability of the data values.

Range

- Easy to compute
- Difference between the largest and the smallest observations

Example: for observations between 7 and 12, Range = 12 - 7 = 5

- Ignores the way in which data are distributed: two data sets spread differently between 7 and 12 both have Range = 12 - 7 = 5

[Figure: dot plots of differently distributed data on a 7-to-12 axis, each with Range = 5]

Sensitive to outliers

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5

Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120

Range = 120 - 1 = 119

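Interquartile Range

Difference between the third and first quartiles (Q3 - Q1)

Data in Ordered Array: 3 6 6 12 12 12 15 15 18 21

For this data, Q1 = 6 and Q3 = 15.75, so the interquartile range is 15.75 - 6 = 9.75.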

Variance

Sample Variance:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

Population Variance:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

Standard Deviation

- Most widely used measure of variation
- Has the same units as the original data

Sample Standard Deviation:

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

Population Standard Deviation:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

Examples

Data set 11, 12, 13, 16, 16, 17, 18, 21 (n = 8)

$$\bar{X} = \frac{1}{8}(11 + 12 + \cdots + 21) = 15.5$$

$$s^2 = \frac{1}{7}\left[(-4.5)^2 + (-3.5)^2 + \cdots + (5.5)^2\right] = 11.14$$

$$s = \sqrt{s^2} = \sqrt{11.14} = 3.34$$
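The worked example can be checked step by step in Python. This sketch follows the definition directly (squared deviations from the sample mean, divided by n - 1); the variable names are illustrative, and `statistics.variance` and `statistics.stdev` are standard-library functions used only as a cross-check.

```python
from math import sqrt
from statistics import stdev, variance

data = [11, 12, 13, 16, 16, 17, 18, 21]
n = len(data)

x_bar = sum(data) / n                                  # sample mean
squared_deviations = [(x - x_bar) ** 2 for x in data]  # (X_i - X-bar)^2
s2 = sum(squared_deviations) / (n - 1)                 # sample variance
s = sqrt(s2)                                           # sample standard deviation

print(x_bar, round(s2, 2), round(s, 2))
print(round(variance(data), 2), round(stdev(data), 2))
```

Dividing by n - 1 rather than n is what distinguishes the sample variance from the population variance.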

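The sample variance can also be computed as

$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} X_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} X_i\right)^2\right]$$

which requires only $\sum_{i=1}^{n} X_i$ and $\sum_{i=1}^{n} X_i^2$.

Data set 11, 12, 13, 16, 16, 17, 18, 21

$$\sum_{i=1}^{8} X_i = 11 + 12 + \cdots + 21 = 124$$

$$\sum_{i=1}^{8} X_i^2 = 11^2 + 12^2 + \cdots + 21^2 = 2000$$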
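A short Python sketch of this sums-only form of the sample variance, applied to the same data set. The variable names are illustrative, and the result should agree with the value 11.14 obtained from the deviations.

```python
data = [11, 12, 13, 16, 16, 17, 18, 21]
n = len(data)

sum_x = sum(data)                  # sum of X_i
sum_x2 = sum(x * x for x in data)  # sum of X_i squared

# s^2 = [ sum(X_i^2) - (sum(X_i))^2 / n ] / (n - 1)
s2_shortcut = (sum_x2 - sum_x ** 2 / n) / (n - 1)

print(sum_x, sum_x2, round(s2_shortcut, 2))
```

This form needs only one pass over the data, but in floating-point arithmetic it can lose precision when the mean is large relative to the spread, so the deviation-based form is usually the safer choice in software.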

- Each value in the data set is used in the calculation
- Values far from the mean are given extra weight (because deviations from the mean are squared)

Visualizing variation
