
Quantitative Methods: Pilani

The document outlines the structure and rules for a Quantitative Methods course at BITS Pilani, emphasizing two-way communication and participation. It details the course's textbooks, coverage, evaluation scheme, and the importance of quantitative methods in business decision-making. Additionally, it introduces topics such as data collection, variable types, and sampling methods, along with practical applications using software like MS Excel.

Uploaded by

Sirisha Burugula

Quantitative Methods

Lecture-1

BITS Pilani
Pilani Campus

Important class rules

Rule #1: Two-way communication


– Answer questions using the chat box
• Type – “yes” or “y” in the chat box to confirm you understand
– I will wait for answers before moving forward

Rule #2: Participate


– Learning by doing (keep your spreadsheet open)
– Use chat box / raise your virtual hand to ask questions
• Do you know how to raise hand? (Type “y” in the chat box if yes)

Rule #3: Utilize Q&A


– Clarify your remaining questions at the end of the session
Textbooks

This course has two textbooks.

1. Business Statistics
▪ Business Statistics: A First Course by D.M. Levine,
T.C. Krehbiel, M.L. Berenson and P. K. Viswanathan.
Seventh edition.
Pearson Education.
Available at Amazon

2. Management Science/Optimisation
▪ Quantitative Methods for Business. David R
Anderson, Dennis J Sweeney, Thomas A Williams,
Jeffrey D Camm and Kipp Martin. Twelfth edition.
Cengage Learning. 2013.
Available at Amazon
Course coverage

1. Business Statistics
▪ Data collection, presentation, basic probability,
estimation, hypothesis testing, correlation, and
regression.

2. Management Science
▪ Optimisation techniques- Linear programming,
transportation problems, and assignment
problems.

3. Practice
▪ Solve problems in MS Excel/ any other software.

Syllabus - Refer to the course handout

Textbook #   Chapter #   Chapter Title
1            1           Defining and Collecting Data
1            2           Organizing and Visualizing Variables
1            3           Numerical Descriptive Measures
1            4           Basic Probability
1            5           Discrete Probability Distributions
1            6           The Normal Distribution
1            7           Sampling Distributions
1            8           Confidence Interval Estimation
1            9           Fundamentals of Hypothesis Testing: One-Sample Tests
1            10          Two-Sample Tests and ANOVA
1            11          Chi-Square Tests
1            12          Simple Linear Regression
2            7           An Introduction to Linear Programming
2            9           Linear Programming Applications in Marketing, Finance, and Operations Management
2            10          Distribution and Network Models

The slide also brackets these chapters into a Mid-Term Test syllabus and a Comprehensive Exam syllabus; refer to the course handout for the split.
Course handout and Evaluation

Course handout is available at Taxila.


▪ Textbooks, Reference books, Topics, Delivery plan …

Evaluation scheme (EC = Evaluation Component)

No     Name                    Weight   Date(s)
EC-1   Quizzes & Assignments   25%      Will be announced shortly by the faculty
EC-2   Mid-Semester Test       35%      As announced by WILP
EC-3   Comprehensive Exam      40%      As announced by WILP
Why Quantitative Methods?

• Start using data and quantitative analysis for your business decisions

• Decisions: Qualitative inputs vs. quantitative inputs

• Annual performance evaluation

• Important to see the big picture behind the numbers

• How does marketing budget affect sales?

• Cut-throat competition for the market share, profitability, goodwill, etc.

• Playing with trade-offs

• Making best use of the limited resources


Some more examples

Which customer segment is most satisfied with our service?

Which of the employees should be given a promotion under budget limitations?

Is there a relationship between the customer’s gender and the color preference of the car?

Is a newly developed Covid-19 Vaccine safe?

How do I create a portfolio for maximizing the return to risk ratio with limited budget?

What product mix should be manufactured in a factory?

How to split marketing budget for maximum returns: acquisition vs. retention?

QM can also be applied to intangible constructs
Anything that cannot be measured cannot be improved.

The pursuit of discovering the reasons behind any phenomenon

– Brand Recall

– Customer experience

– Employee Performance

– Investor Perceptions


Chapter-1: Defining and Collecting data


Topics

Chapter-1: Defining and Collecting Data


➢ Defining variables
▪ Collecting data
▪ Types of sampling methods
▪ Types of survey methods

Data and Variables
• Start with your business objective
• Define data that you want to collect to achieve that
• Example: Employee performance review
• What kind of data will help you achieve this objective?
• Collect data
• How do you collect this data?
• Organize and visualize
• What is the distribution of employee performance in your
department?
• Analyze to answer business questions
• Top 5% get 20% raise etc.
Data and Variables

• Variables
• Features, characteristics or columns of a record
• What are the variables in the sales record table shown
below?
• Type in the chat box!
• Data
• Values of those features
• What is data against features in the sales record table?

Customer   Customer Name   Annual Purchase
1          Rohit           Rs 1459.00
2          …               …
Variable Types

• Categorical
• Gender? Type of customer (business or home)?
• Numerical
• Discrete
• Countable: 0, 1, 2, 3, …
• Number of purchases in a year?
• Continuous
• Can take any value within a range
• Depends on the precision of the instrument (4.56 or 4.5678 etc.)
• Sales amount?

Class exercise
1. Everyone should download the store_inventory.xlsx file
2. Work with the data and confirm your understanding
3. Ask questions if you are not able to, or raise your virtual hand to ask the question

Which of the store_inventory.xlsx variables are Categorical, Discrete or Continuous?
Topics

Chapter-1: Defining and Collecting Data


✓ Defining variables
➢ Collecting data.
▪ Types of sampling methods
▪ Types of survey methods

Data Sources

• Primary sources
• You collect the data first hand (Examples?)
• Surveys
• Personal contact, phone, email, online etc.
• Experiments
• Randomized trials (Covid vaccine?)
• Business activity
• Production process
• Secondary sources
• You use data collected by other sources
• https://reuters.com/
• List of prospects (leads)

Is store_inventory.xlsx data primary or secondary?
Data Sources

• Structured data (?)
• Can be neatly organized as tables
• Unstructured data (?)
• Texts / Tweets
• Sentiment analysis
• Are customers saying positive things about my brand?
• Images
• Pictures of products to determine faults
• Videos
• Which ad evokes positive reactions?

Is store_inventory.xlsx data structured or unstructured?
Answering business questions with data

• Population or Sample?
• Population study: We use data from all the units for analysis
• Sample study and inference
• We use data from a carefully drawn sample of the population
• We infer population parameters from the sample analysis
• What is the mean height of this class?
• Is Population study possible? (Type “y” in the chat box if yes)
• Data from all the units can be collected and used for the analysis
• Exit poll (who is likely to win the election?)
• Is collecting data from entire population possible?
• A sample from the population is used to answer the question
Sampling Methods

• Let’s look at store inventory data again


• Let’s try to answer the question of
• “What is the average inventory quantity of items in the shop”?
• Can we use any type of sample to answer our question?
• If we take the first 10 items, will our estimate of the average for the whole store be accurate?
• What do we need to ensure that our answer is
approximately right?
• The sample should be “representative of the population”
• Do we understand this statement? (Type “y” if you do)
Type of Sampling Methods

• Non-probability sampling
• Judgement sample (when do we use this?)
• Convenience sample (When do we use this?)
• Probability Sampling
• Simple Random Sampling
• Systematic Sampling
• Stratified Sampling
• Cluster Sampling

Simple Random Sampling

• Every item has same chance of selection


• Sampling Frame: List of all units in the population
• Draw a random sample of “n” items from the “Sampling
Frame” of “Items”
• What is the probability that a given item would be drawn in the first
picking?
• 1/N
• Sampling with replacement
• Sampling without replacement

Simple Random Sampling

• Draw a random sample of 31 items from the store inventory data
• Create a new column, name it “Random Number”
• Generate random numbers with the formula =RANDBETWEEN(1,100)
• Populate this entire column with random numbers
• Sort data by column “Random Number”
• Pick the first 31 items
• Calculate the average of these 31 randomly picked items
• Calculate the population average
• How does the sample average compare with the population average?
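
The spreadsheet exercise can also be sketched in Python. This is only an illustration: the quantities below are made-up stand-ins for store_inventory.xlsx (its contents are not shown in these slides), and the fixed seed is there only to make the run repeatable.

```python
import random
from statistics import mean

# Hypothetical inventory quantities standing in for store_inventory.xlsx
rng = random.Random(42)
population = [rng.randint(1, 100) for _ in range(100)]

def simple_random_sample(items, n, rng):
    """Draw n items without replacement; every item has the same chance."""
    return rng.sample(items, n)

sample = simple_random_sample(population, 31, rng)
population_mean = mean(population)
sample_mean = mean(sample)
print(f"population mean = {population_mean:.2f}, sample mean = {sample_mean:.2f}")
```

Re-running with a different seed shows how much the sample average moves around the population average.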
Limitations of Simple Random Sampling

• May not be feasible


• Is drawing random sample of 10,000 residents from Mumbai’s
population possible? (~2.2 crores transient population)
• May be very expensive
• May not represent all logical groups of the population
• Is this 10,000 sample likely to represent extreme minority groups?
• With less than 1% population proportion?

Systematic Sampling

• Start from a randomly selected item


• Take every kth item from the sampling frame
• With/after-class practice
• Sort inventory data with random number column
• Take every 3rd item from the list
• Calculate average of inventory quantity from this systematic sample
• Calculate population average
• How does the sample average compare with population average?
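
A minimal Python sketch of the same practice, assuming a made-up list of quantities in place of the real inventory file. The function name `systematic_sample` is illustrative, not from the course material.

```python
import random
from statistics import mean

def systematic_sample(items, k, rng):
    """Start at a random position among the first k items, then take every k-th item."""
    start = rng.randrange(k)
    return items[start::k]

rng = random.Random(7)
# Hypothetical inventory quantities (not the real store_inventory.xlsx)
population = [rng.randint(1, 100) for _ in range(90)]
sample = systematic_sample(population, 3, rng)
print(len(sample), round(mean(sample), 2), round(mean(population), 2))
```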

Stratified Sampling

• Divide the population into logical subpopulations


• Members of a certain state, caste, religion, locality
• Draw a simple random sample from each sub-population
• With/after-class practice
• Use inventory data item types as sampling strata
• Sort inventory data by item type
• Draw a random sample of 4 items from each type
• Combine them to make a stratified sample
• Calculate average of inventory quantity from this stratified sample
• Calculate population average
• How does the sample average compare with population average?
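
The stratified steps can be sketched in Python as follows. The item types and quantities are invented for illustration, and `stratified_sample` is a hypothetical helper name.

```python
import random
from collections import defaultdict
from statistics import mean

def stratified_sample(records, key, n_per_stratum, rng):
    """Group records by key, then draw a simple random sample from each group."""
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)
    sample = []
    for group in strata.values():
        sample.extend(rng.sample(group, min(n_per_stratum, len(group))))
    return sample

rng = random.Random(3)
# Hypothetical records: item type and inventory quantity
types = ["grocery", "dairy", "stationery"]
inventory = [{"type": types[i % 3], "qty": rng.randint(1, 100)} for i in range(60)]
sample = stratified_sample(inventory, "type", 4, rng)
print(len(sample), mean(r["qty"] for r in sample), mean(r["qty"] for r in inventory))
```

Every item type is guaranteed to appear in the sample, which simple random sampling does not promise for small groups.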
Cluster Sampling

• In certain situations pure random sampling is
• Not desirable, infeasible or very expensive
• Examples – experiments
• We could choose cluster sampling in such cases
• Clusters are logically occurring groups
• Apartment communities in a city
• Certain block of residents
• Patients arriving in particular hospitals
• Clusters are defined and n-clusters are randomly chosen
• All units from chosen clusters become part of the sample
Cluster Sampling

• With/after-class practice: In the given inventory data


• Treat items from suppliers as clusters
• Make a list of unique clusters
• Randomly select 4 clusters
• Calculate average inventory quantity from the sample data
• Compare it to the population average
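
One way to sketch this practice in Python, assuming invented supplier names and quantities. `cluster_sample` is a hypothetical helper, not part of any library.

```python
import random
from statistics import mean

def cluster_sample(records, key, n_clusters, rng):
    """Randomly choose n_clusters groups and keep every record in them."""
    clusters = sorted({rec[key] for rec in records})
    chosen = set(rng.sample(clusters, n_clusters))
    return [rec for rec in records if rec[key] in chosen], chosen

rng = random.Random(11)
# Hypothetical records: 10 suppliers with 8 items each
suppliers = [f"S{i}" for i in range(1, 11)]
inventory = [{"supplier": suppliers[i % 10], "qty": rng.randint(1, 100)} for i in range(80)]
sample, chosen = cluster_sample(inventory, "supplier", 4, rng)
print(sorted(chosen), round(mean(r["qty"] for r in sample), 2),
      round(mean(r["qty"] for r in inventory), 2))
```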

Today’s content

✓ Why quantitative methods?


✓ Variables and data
✓ Survey methods
✓ Sampling methods introduction

Q&A

• Questions about
• Objectives
• Content
• Anything else

Quantitative Methods

Lecture-2

Organizing and Visualizing Variables


(Ch 2 & 3 Business Statistics, Levine et al.)
So far and the next

✓ Session 1:
√ Defining and collecting data
√ Survey and sampling methods
➢ Session 2:
➢ Organizing and visualizing variables
➢Ch 2 & 3 Business Statistics, Levine et al.

Focus and mode

▪ Concept clarity
▪ Ability to apply concepts
▪ Everything should flow from your business or managerial questions
▪ Active participation and discussions

Why organize?

▪ Start with the business question
▪ Making sense of the data
▪ Tabular presentation makes it easier to grasp data by features/variables
▪ Summary of important variables can be easily seen
▪ Max, min, spread, averages etc.
▪ Variable relationships can be explored
▪ Errors can be seen
▪ Data can be explored systematically
▪ Global summaries and drilling down

From raw data (50 values)
550 400 200 550 500 600 700 500 400 500
500 500 250 350 450 550 500 400 450 425
400 350 350 350 400 450 360 500 600 600
500 1000 450 600 525 400 300 600 400 400
800 450 650 400 300 400 450 400 375 500

To organized
Class       Frequency
≤ 250       2
251-500     35
501-750     11
751-1000    2
Total       50
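
The step from raw data to the frequency table can be reproduced with a short Python sketch. The 50 values are the ones on the slide. The slide labels the first class "< 250", and the sketch assumes it means 250 or less, since that is what makes the frequency 2.

```python
# The 50 raw values from the slide
raw = [550, 400, 200, 550, 500, 600, 700, 500, 400, 500,
       500, 500, 250, 350, 450, 550, 500, 400, 450, 425,
       400, 350, 350, 350, 400, 450, 360, 500, 600, 600,
       500, 1000, 450, 600, 525, 400, 300, 600, 400, 400,
       800, 450, 650, 400, 300, 400, 450, 400, 375, 500]

# (inclusive upper limit, label) for each class, in ascending order
classes = [(250, "<= 250"), (500, "251-500"), (750, "501-750"), (1000, "751-1000")]

def frequency_table(values, classes):
    """Count how many values fall in each class."""
    counts = {label: 0 for _, label in classes}
    for v in values:
        for upper, label in classes:
            if v <= upper:
                counts[label] += 1
                break
    return counts

table = frequency_table(raw, classes)
for label, count in table.items():
    print(f"{label:<10}{count:>4}")
print(f"{'Total':<10}{sum(table.values()):>4}")
```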
Why picturize?
▪ Visual summary helps
▪ Tabular data can be overwhelming
▪ May have millions of rows and hundreds of columns
▪ A visual summary can be revealing
▪ Spread
▪ Distribution
▪ Relationships
▪ Trends

[Figure: the raw data from the previous slide, redrawn as a pictorial presentation]
What are we organizing or picturizing?

▪ A variable or a set of variables together


▪ Variable type decides what we can do with it
▪ Can we find average of categorical variables?
▪ For gender variable, which takes two values: Male or Female
▪ Frequency or proportion makes more sense
▪ For height of the class
▪ Max, min, spread and average etc. would make sense

Variables

✓ From the previous session


✓ Categorical
✓ Numerical
• More nuanced categorization
• Based on scales
• Nominal (categorical)
• Ordinal
• Interval
• Ratio
– What is a “scale”? (Please type your answers in the chat box)

Variables

✓ Categorical
✓ Numerical
• Based on scales
– Nominal (categorical)
– Ordinal
– Interval
– Ratio

1. Nominal scale

Nominal (from ‘name’), also called Categorical variable.


▪ Engineer?- Yes/No.
▪ Gender- Male/Female.
▪ Material- Wood/Metal/Plastic/Steel/Bronze.
▪ Industry- Oil/Mining/Automobile/Media/IT/Food processing.
▪ Etc.

▪ Measurements
▪ Data can be categorized and counted; cannot be measured or ranked.

Nominal data- frequency and proportion table

Store inventory data
• Microsoft spreadsheet (MS Excel)
• Insert
• Pivot table
• Recommended
• Select entire table
• Start with blank table
• Add type to rows and values
• Value field settings can be changed to % of column as well
• You can work with multiple variables together by adding them to rows or columns
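
Outside a spreadsheet, the same frequency and proportion table can be sketched in Python. The item types below are invented for illustration and are not taken from the store inventory file.

```python
from collections import Counter

def frequency_and_proportion(values):
    """Return {category: (frequency, proportion)} for a nominal variable."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (n, n / total) for cat, n in counts.items()}

# Hypothetical item types
item_types = ["grocery", "dairy", "grocery", "stationery",
              "dairy", "grocery", "grocery", "dairy"]
table = frequency_and_proportion(item_types)
for cat, (n, p) in sorted(table.items()):
    print(f"{cat:<12}{n:>3}{p:>8.1%}")
```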
Nominal data- bar charts and pie charts

Variables

✓ Categorical
✓ Numerical
• Based on scales
✓ Nominal – Frequency, percentages, bar and pie charts
– Ordinal
– Interval
– Ratio

2. Ordinal data

Ordinal (from ‘Order’), also called Ranked data.


▪ Tall, taller, tallest.
▪ Big, bigger, biggest.
▪ Olympics: First, second, third, fourth….
▪ Thickness: very thick, thick, thin.
▪ Taste: Good, average, below average, bad.
▪ Temperature: freezing, cool, warm, hot.
▪ Customer satisfaction: not satisfied, somewhat satisfied, satisfied, highly satisfied.

▪ What could we do with ordinal data?
▪ Everything we can do with nominal data
▪ + Data can be ranked

Ordinal data

Price category:

Low: 0-25
Medium: 26-50
High: >50

fr: Year-1 Fresher

Ordinal data- Frequency & Cumulative frequency table & Ogive chart

An ogive is a cumulative frequency line.

In the Microsoft spreadsheet
• Choose the data and select
• Custom combo chart
• Bars for frequency &
• Add a line for cumulative frequency

Variables

✓ Categorical
✓ Numerical
• Based on scales
✓ Nominal – Frequency, percentages, bar and pie charts
✓ Ordinal – Nominal + cumulative frequency charts
– Interval
– Ratio

Interval scale

Numerical data, where zero is arbitrarily chosen.

▪ Intervals/differences between two points make logical sense
▪ Temperature measured in Centigrade.
▪ Temperature measured in Fahrenheit.
▪ Employee satisfaction measured on a 1 to 7 scale
1 2 3 4 5 6 7 (1 = Not satisfied, 7 = Highly satisfied)
▪ Customer satisfaction measured on a 1 to 5 scale
1 2 3 4 5

Zero is arbitrarily chosen:
▪ Zero degrees Centigrade/Fahrenheit is not zero temperature
▪ Zero Kelvin is zero temperature

US/UK/EU shoe sizes are on an Interval scale, because their zero size will not be of zero length.

Garment sizes may also be on an Interval scale.

Interval data

▪ What could we do with interval data?
▪ Everything we can do with ordinal data
▪ + Data distribution, histograms

Interval data- Histogram, Cumulative frequency

Fig: Histogram of number of products by price buckets (bins) (Store Inventory Data)

Variables

✓ Categorical
✓ Numerical
• Based on scales
✓ Nominal – Frequency, percentages, bar and pie charts
✓ Ordinal – Nominal + cumulative frequency charts
✓ Interval - Ordinal + Distribution chart (histogram)
– Ratio

Ratio scale

Numerical data, Zero means no value


▪ Height in mm/cm/m
▪ Weight in g/kg/tons
▪ Time in sec/min/hr
▪ Temperature in Kelvin
▪ Humidity in %
▪ Sales (nos, tons, or Rs)
▪ Charts
▪ Same as interval scale

Ratio scale charts
Raw data: 153 154 154 162 165 169 172 176 176 176 177 180 182 186 187 190

Range      Frequency
150-160    3
160-170    3
170-180    5
180-190    4
190-200    1
Total      16

Stem   Leaf
15     3 4 4
16     2 5 9
17     2 6 6 6 7
18     0 2 6 7
19     0

Stem-and-Leaf diagram is like a histogram, without losing the data.


Two variables- tables and charts

Two or more categorical variables- Column charts

Class    Gender f   Gender m   Total
fr       3          2          5
so       11         12         23
jr       7          11         18
sr       3          1          4
Total    24         26         50

Two variables are- Gender (f/m) and Class (fr, so, jr, sr).

Two numerical variables- Scatter plot

Two variables are- Height and GPA.

Two variables- Line chart

Two variables are- Year and Tourists .

Year           2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016
No (million)   2.54 2.38 2.73 3.46 3.92 4.45 5.08 5.28 5.17 5.78 6.31 6.58 6.97 7.68 8.03 8.80
Two or more variables- Stock chart

Date High Low Close


Aug 06, 2020 11,256.80 11,127.30 11,200.15
Aug 05, 2020 11,225.65 11,064.05 11,101.65
Aug 04, 2020 11,112.25 10,908.10 11,095.25
Aug 03, 2020 11,058.05 10,882.25 10,891.60
Jul 31, 2020 11,150.40 11,026.65 11,073.45
Jul 30, 2020 11,299.95 11,084.95 11,102.15
Jul 29, 2020 11,341.40 11,149.75 11,202.85
Jul 28, 2020 11,317.75 11,151.40 11,300.55
Jul 27, 2020 11,225.00 11,087.85 11,131.80
Jul 24, 2020 11,225.40 11,090.30 11,194.15
Jul 23, 2020 11,239.80 11,103.15 11,215.45
Jul 22, 2020 11,238.10 11,056.55 11,132.60
Jul 21, 2020 11,179.55 11,113.25 11,162.25
Jul 20, 2020 11,037.90 10,953.00 11,022.20
Jul 17, 2020 10,933.45 10,749.65 10,901.70

Three variables are- High, Low and Close prices.
Summary tables- One, Two, and Three variables

Frequency table
One nominal variable- Blood Group
Contingency table
Two nominal variables- Blood Group and Ethnicity
Contingency table
Three nominal variables- Blood Group, Gender and Rh
Visual representation- charts

1 Pie chart
2 Column chart
3 Side-by-side chart
4 Stacked row chart
5 Stacked column chart


Two variable- charts

Time Series chart Scatter plot (Chest-G vs. Length)



Q&A

Quantitative Methods

Lecture-3

Numerical Descriptive Measures


(Ch 2 & 3 Business Statistics, Levine et al.)
So far and the next

✓ Previous Sessions
√ Defining and collecting data
√ Survey and sampling methods
√ Organizing and visualizing variables
➢ Session 3:
➢ Numerical descriptive measures

Nominal / Categorical

▪ Meaning of numerical descriptive measures?


▪ Describe the data in a numerical way
▪ Contingency tables
▪ Relationship between two variables, cross tabulations
▪ % of female and male customers by product types
▪ Descriptive measures for nominal/categorical & ordinal variables?
▪ Frequency, cumulative frequency, proportions (%)
▪ Next: Numerical descriptive measures for
▪ Numerical variables (measured on interval or ratio scales)

Numerical measures

▪ Central tendency (?)


▪ Mean, mode and median
▪ Dispersion or spread
▪ Around the central point

Source: Self generated image

Central tendency measures for raw data

▪ Arithmetic Mean (AM)


▪ AM = sum of the values / number of observations.
▪ $\mathrm{AM} = \frac{1}{n}\sum_{i=1}^{n} x_i$
▪ $\sum_{i=1}^{n} x_i$: summation from the first value, $i = 1$, to the nth value, $i = n$
▪ $x_1 + x_2 + x_3 + \dots + x_n$
▪ n = total number of observations in your data
▪ Example: 1 3 4 8 9
▪ n (?)
▪ $x_3$ (?)
▪ AM (?)
▪ = (1+3+4+8+9)/5 = 25/5 = 5.
Central tendency measures for raw data

▪ Median- value of middle observation


▪ After sorting the data in ascending order.
▪ Position of the median = (n+1)/2-th value
▪ Examples:
▪ 1 3 4 8 9, Median value = (?)
▪ 1 3 [4] 8 9, Median value = 4
▪ 1 3 4 8 9 1, Median value = (?)
▪ Sorted: 1 1 3 * 4 8 9 (the middle falls between 3 and 4)
▪ Median value = 3.5 = (3+4)/2.

Central tendency measures for raw data

▪ Mode
▪ Value of most frequently occurring observation in the data
▪ Examples:
▪ 2 2 3 4 5, Mode value= (?)
▪ 2 2 3 4 5 (Mode value =2)
▪ 22 30 40 40 50 50 66 (Mode value= ?)
▪ 22 30 40 40 50 50 66 (Mode values= 40 and 50)
▪ Bimodal
▪ 1, 2, 3, 4, 5 (Mode value= ?)
▪ May not have any Mode
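
The examples from the mean, median and mode slides can be checked with Python's standard `statistics` module. The `modes` helper is an illustrative wrapper that follows the slide's convention of reporting no mode when every value occurs equally often.

```python
from statistics import mean, median, multimode

def modes(values):
    """Return all most-frequent values, or [] when every value occurs equally often."""
    m = multimode(values)
    return [] if len(m) == len(set(values)) else m

print(mean([1, 3, 4, 8, 9]))                 # arithmetic mean of the slide example
print(median([1, 3, 4, 8, 9]))               # odd n: the middle value
print(median([1, 3, 4, 8, 9, 1]))            # even n: average of the two middle values
print(modes([2, 2, 3, 4, 5]))                # one mode
print(modes([22, 30, 40, 40, 50, 50, 66]))   # bimodal
print(modes([1, 2, 3, 4, 5]))                # no mode
```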
When to use which measure

▪ Data Points: 2, 4, 6, 8, 10 (Which is the best measure?)


▪ Mean: 6, Median: 6, Mode: No mode
▪ Mean as the Best Measure (Balanced data)
▪ Data Points: 1, 2, 3, 4, 20
▪ Mean: 6, Median: 3, Mode: No mode
▪ Data with extreme values
▪ Data Points: 1, 2, 2, 2, 4, 6, 11
▪ Mean: 4, Median: 2, Mode: 2

▪ Mode is generally the only measure with nominal data


When to use which measure

▪ Mean
▪ Data should be approximately equally spread about the center
▪ Median
▪ When there are extreme values
▪ Skewed distribution of data
▪ Salary offered to college graduates
▪ Mode
▪ Rarely used for numerical data
▪ Who wins the election? (Modal voted candidate)
▪ For categorical data – it is the only measure available
▪ Highest frequency
Other means

▪ Geometric mean
▪ Only works with positive numbers
▪ High volatility, different units
▪ Harmonic mean
▪ When greater weight needs to be given to smaller values (outlier issues)
▪ Averages ratios and rates
▪ $\mathrm{HM} = n / \sum_{i=1}^{n} (1/x_i) = n / (1/x_1 + 1/x_2 + \dots + 1/x_n)$
▪ AM ≥ GM ≥ HM for positive values (equal only when all values are the same)
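
A quick numerical check of the three means, using made-up positive values and the standard `statistics` module (`geometric_mean` needs Python 3.8 or later).

```python
from statistics import mean, geometric_mean, harmonic_mean

values = [2, 4, 8]            # hypothetical positive values

am = mean(values)             # (2 + 4 + 8) / 3
gm = geometric_mean(values)   # cube root of 2 * 4 * 8 = 64
hm = harmonic_mean(values)    # 3 / (1/2 + 1/4 + 1/8)

print(f"AM = {am:.3f}, GM = {gm:.3f}, HM = {hm:.3f}")
```

With unequal values the ordering AM > GM > HM holds; with identical values all three coincide.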

Mean of Grouped Data

Grouped Mean = $\sum f x / \sum f$

f = Frequency (number of items in the group), x = Mid value

Small example
Raw data: 0 0 0 0 1 10 21, raw mean = 32/7 ≈ 4.6
Class    f   x    f.x
0-10     6   5    30
10-20    0   15   0
20-30    1   25   25
Grouped mean = 55/7 ≈ 7.9

Class      Frequency (f)   Mid Value (x)   f.x
[0-10]     20              5               100
(10-20]    10              15              150
(20-30]    15              25              375
(30-40]    25              35              875
(40-50]    10              45              450
           ∑f = 80                         ∑f.x = 1950

Mean = (?)
Mean = 1950/80 = 24.375

Note: Mean of raw data is generally not equal to mean of grouped data (why?)
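
The grouped mean can be computed directly from the class limits and frequencies. A minimal Python sketch using the two tables on this slide (the function name is illustrative):

```python
def grouped_mean(classes):
    """classes: list of (lower, upper, frequency); uses the class mid value as x."""
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / total_f

table = [(0, 10, 20), (10, 20, 10), (20, 30, 15), (30, 40, 25), (40, 50, 10)]
small = [(0, 10, 6), (10, 20, 0), (20, 30, 1)]

print(grouped_mean(table))            # 1950 / 80
print(round(grouped_mean(small), 2))  # 55 / 7
```

The small example differs from its raw mean of 32/7 because every value in a class is replaced by the class mid value.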
Median of Grouped Data

Median, $Me = l + h \times (N/2 - cf)/f$

Where,
l = lower limit of median class
h = width of median class
f = frequency of median class
cf = cumulative frequency of the class preceding the median class
N = $\sum f_i$

Note: Not in syllabus

Class Interval   Frequency (f)   Cum Freq
[0-10]           10              10
(10-20]          10              20
(20-30]          15              35
(30-40]          25              60
(40-50]          10              70

(Median class?)
Median point: (70+1)/2 = 35.5
Median class: 30-40
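
A sketch of the grouped median in Python, following the slide: the median class is located with the median point (N+1)/2, and the formula then uses N/2. The function name and tuple layout are assumptions made for illustration.

```python
def grouped_median(classes):
    """classes: (lower, upper, frequency) tuples sorted by lower limit."""
    n = sum(f for _, _, f in classes)
    position = (n + 1) / 2   # median point, used to find the median class
    cf = 0                   # cumulative frequency before the current class
    for lower, upper, f in classes:
        if cf + f >= position:
            h = upper - lower
            return lower + h * (n / 2 - cf) / f, (lower, upper)
        cf += f
    raise ValueError("no observations")

table = [(0, 10, 10), (10, 20, 10), (20, 30, 15), (30, 40, 25), (40, 50, 10)]
median, median_class = grouped_median(table)
print(median, median_class)
```

Here cf = 35 and N/2 = 35, so the estimate sits exactly at the lower limit of the median class.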

Mode of Grouped Data

Mode, $Mo = x_k + h \times (f_k - f_{k-1})/(2f_k - f_{k-1} - f_{k+1})$

Where,
$x_k$ = lower limit of the modal class interval
$f_k$ = frequency of the modal class
$f_{k-1}$ = frequency of the class preceding the modal class
$f_{k+1}$ = frequency of the class succeeding the modal class
h = width of the class interval

Class Interval   Frequency (f)   Cum Freq
[0-10]           10              10
(10-20]          10              20
(20-30]          15              35
(30-40]          25              60
(40-50]          10              70

(Modal class?)

Note:
1. What if the modal class is the first or the last class?
2. Not in syllabus

Few properties of Mean, Median, Mode

Mean (AM)
▪ AM is unique. 1 2 3 4. AM=2.5.
▪ Change in the value of any observation always affects AM. 1 2 3 4. AM=2.5. 1 2 3 8. AM=3.5.
▪ AM cannot be computed unless all values are available. 1 2 3 X. AM=?

Median
▪ Median is unique. 1 2 8 18.
▪ Change in the values may not affect Median. 1 2 3 4 5 6 7 8 9 and 1 2 3 4 5 6 7 8 10000000 have the same Median.
▪ Median may be computed even if values of all observations are not available. 1 2 3 4 5 6 7 9 X. The Median will remain the same even if the largest value X is unknown.

Mode
▪ Mode may not exist. 11 12 13 14.
▪ Multiple modes may exist. 11 12 13 14 15 15 16 16.
▪ Change in the values may not affect Mode. Blue Green Red Red Red Yellow. Mode remains Red even if Yellow is replaced by Blue.
▪ May be computed even if values of all observations are not available. 1 2 3 3 3 3 4 X. The Mode will remain the same even if value X is unknown.

Variation

Variation in daily life

[Figure: two charts labelled Manpower deployment, vertical axis 1 to 6. First series: 3 6 4 2 4 1 4 2 1 3. Second series: 3 3 3 3 3 3 3 3 3 3.]


Measuring variation
1. Range = Maximum − Minimum
2. Variance, population = $\sigma^2 = \frac{1}{N}\sum (x_i - \text{Mean})^2$
3. Standard deviation, population = $\sigma$
4. Coefficient of Variation, population = $\sigma/\text{Mean}$
5. Variance, sample = $s^2 = \frac{1}{N-1}\sum (x_i - \text{Mean})^2$
6. Standard deviation, sample = $s$
7. Coefficient of Variation, sample = $s/\text{Mean}$
8. Mean absolute deviation = $\frac{1}{N}\sum |x_i - \text{Mean}|$
9. Z score (how many standard deviations $x_i$ is away from the mean) = $(x_i - \text{Mean})/\sigma$
10. Quartiles (Q1, Q2, Q3): values below which the smallest 25%, 50% and 75% of observations fall
11. Inter-quartile range = Q3 − Q1
12. 5-number summary (Min., Q1, Q2, Q3, Max.): minimum, 3 quartiles, maximum
13. Boxplot (called Box and Whisker chart in MS Excel): plot of the 5-number summary
Why measure variation?
1. In order to make decisions, such as which stock to invest in
– Would you rather invest in a stock that shows great variation in price or smaller variation in price?
– Variation occurs on both the negative side and the positive side
2. In order to evaluate which process is better
– Generally, a process with lesser variation is better (credit risk assessment of a customer)
3. Product quality improvements
– Measuring the current state of variation in the process parameters
– Reducing variation from acceptable benchmarks


Measure of spread: Range
Range = Maximum Value – Minimum Value

Sample Data: 2, 5, 7, 10, 15, 20 (Min = 2, Max = 20)
Range? 20 – 2 = 18
Highest and Lowest variations (By Range)?

Range = Maximum Value – Minimum Value


[Figure: Price vs. Days for four series, A, B, C and D]
▪ Variation: C > B > D > A
Range- Uses and shortcomings

Uses
▪ Stock price- minimum and maximum in a day.
▪ Ambient temperature- minimum and maximum in a day.
▪ Blood pressure- high and low within few minutes.
▪ Range is computed only from two observations. Hence, easy to compute.

[Figures: Stock price during a day; Temperature during a day]

Shortcomings
Range is computed only from two observations, so it
• Does not capture distribution of variation
• Is not suitable for data with extreme values
31 41 5 9 26 53 58979 3 23 ...
Range = (58979-3) = 58976.


Measure of spread: Variance and Standard Deviation

Variance, population: $\sigma^2 = \frac{1}{N}\sum (x_i - \text{Mean})^2$

Sample Data   X-mean   (X-mean)^2
1             -10.0    100.0
5             -6.0     36.0
15            4.0      16.0
10            -1.0     1.0
15            4.0      16.0
20            9.0      81.0
N = 6, Mean = 11       Sum = 250.0

$\sigma^2 = \sum (x_i - \text{Mean})^2 / N$ = ?
$\sigma^2$ = 250/6 = 41.67

Variance, sample: $s^2 = \frac{1}{N-1}\sum (x_i - \text{Mean})^2$

Standard deviation, population = $\sigma$ (square root of variance)
The slides also call this RMSE (Root Mean Square Error), treating each deviation from the mean as an error
Standard deviation, sample = $s$
Coefficient of Variation, population = $\sigma/\text{Mean}$
Coefficient of Variation, sample = $s/\text{Mean}$
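
The numbers on this slide can be verified with Python's standard `statistics` module, where `pvariance`/`pstdev` divide by N and `variance`/`stdev` divide by N-1. The variable names are illustrative.

```python
from statistics import mean, pvariance, pstdev, variance, stdev

data = [1, 5, 15, 10, 15, 20]   # sample data from the slide

mu = mean(data)                 # 66 / 6
pop_var = pvariance(data)       # sum of squared deviations / N
sample_var = variance(data)     # sum of squared deviations / (N - 1)
pop_cov = pstdev(data) / mu     # coefficient of variation, population
sample_cov = stdev(data) / mu   # coefficient of variation, sample

print(f"mean = {mu}, population variance = {pop_var:.2f}, sample variance = {sample_var:.2f}")
print(f"CoV (population) = {pop_cov:.3f}, CoV (sample) = {sample_cov:.3f}")
```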
Variation and Standard Deviations

[Figure: six bar charts over categories A to L, vertical axis 0 to 100, with Std. Dev. = 0, 10, 30 (top row) and 14.98, 25.97, 25.97 (bottom row)]

Standard deviation = RMSE

Coefficient of Variation (CoV)

Higher variation, Red or Blue?

            Sensex    Healthcare
Minimum     35,414    16,169
Maximum     38,493    18,630
Mean        37,077    17,026
Range       3,079     2,461
Std. Dev.   841       715

▪ According to Range and Standard deviation, Sensex had higher variation than Healthcare index.
Coefficient of Variation, CoV

▪ Coefficient of Variation (CoV) = Standard deviation/Mean

                        Sensex    Healthcare
Minimum                 35,414    16,169
Maximum                 38,493    18,630
Mean                    37,077    17,026
Range                   3,079     2,461
Std. Dev.               841       715
CoV (=Std Dev/Mean)     0.023     0.042

▪ According to CoV, Healthcare index has higher variation.
▪ When means of two data sets differ a lot, or
▪ Units are different
▪ CoV may capture the variation better than Standard deviation.
• What is the unit of standard deviation?
• What is the unit of CoV?
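
The two CoV figures follow directly from the table's standard deviations and means. A short Python check (the function name is illustrative):

```python
def coefficient_of_variation(std_dev, mean):
    """CoV = standard deviation / mean (unit-free)."""
    return std_dev / mean

sensex = coefficient_of_variation(841, 37_077)
healthcare = coefficient_of_variation(715, 17_026)

print(f"Sensex CoV = {sensex:.3f}, Healthcare CoV = {healthcare:.3f}")
```

Sensex has the larger standard deviation, yet Healthcare has the larger CoV because its mean is less than half as large.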

Z score

Z score

How far is the observation from the mean, in terms of standard deviations?

Z score of 1, 2, 3, 4, 5? Mean = 3.

Z = (Observed value − Mean)/Standard deviation
= Error/Standard deviation

▪ Z-Score
▪ Not a summary measure
▪ It is computed for each data point.

Z scores- Example and Uses

Z scores of 1, 2, 3, 4, 5? Mean = 15/5=3, Standard deviation= 1.41

• What is the Z score of 1?


• (1-3)/1.41 = -1.41
• What is Z score of 3?
• (3-3)/1.41 = 0

Uses
▪ To identify Outliers (extreme values).
▪ Z value < -3 or > 3 are often considered Outliers.
▪ To read Normal distribution table (Chapter-6)
▪ To estimate Confidence Intervals (Chapter-8)
▪ To test Hypothesis (Chapter 9, 10, and 11).
[Figure: Z score axis from -3 to 3]
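
The Z scores for 1, 2, 3, 4, 5 can be computed for every data point at once. This sketch uses the population standard deviation, which is the 1.41 quoted on the slide; the function name is illustrative.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Z score of each value: (value - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]

z = z_scores([1, 2, 3, 4, 5])
print([round(v, 2) for v in z])
```

None of these values is beyond ±3, so by the rule of thumb above none would be flagged as an outlier.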

3. Shape of the data- Skewness

Symmetry (or lack) of frequency distributions

Distribution   MMM relationship        Shape                            Example                        Skewness
Left           Mean < Median < Mode    Tail on the left, -ive skewed    Japan’s population age         -1.28
Middle         Mean = Median = Mode    Symmetric                        Distribution of height         0
Right          Mean > Median > Mode    Tail on the right, +ive skewed   Wealth, income, family size    +1.28

Frequency distribution in the middle is symmetric, and the other two are not symmetric.
Skewness measures deviation from symmetry.
Computation of Skewness is not in the syllabus.
MS Excel function: =SKEW(Range).
Typical frequency distributions- MMM relationships

1. Symmetric, bell shaped, high concentration in the middle (lower spread). Mean = Median = Mode
2. Symmetric, bell shaped, less concentration in the middle (higher spread). Mean = Median = Mode
3. Symmetric and flattest. There is no mode. Mean = Median
4. Negatively skewed, tail on left side. Mean < Median < Mode
5. Positively skewed, tail on right side. Mode < Median < Mean
6. Another positively skewed, tail on right side. There is no mode. Median < Mean

Frequency distribution    MMM relationship
Negatively skewed         Mean < Median < Mode
Symmetric                 Mean = Median = Mode
Positively skewed         Mode < Median < Mean

HW…. A tale of 3 exams

What do results of these exams indicate?

34
BITS Pilani, Pilani Campus
BITS Pilani
Pilani Campus

Shape of the data- Kurtosis

Flatness (or Peakedness) of a frequency distribution

The frequency distribution on the right is flatter.

Kurtosis measures the Flatness (or Peakedness) of a frequency distribution.

Kurtosis of
Distribution on the left: 1.26
Distribution on the right: 0.067

Calculation of Kurtosis is not in the course.
MS Excel formula: =Kurt(Range).


BITS Pilani
Pilani Campus

Quartiles

Quartiles (from Quarters)

▪ Divide the sorted data into 4 quarters, with 25% of the observations in each quarter.
▪ Q1, First Quartile- Lowest 25% observations
▪ (N+1)/4th
▪ Q2, Second Quartile- Lowest 50% observations
▪ (N+1)/2th
▪ Q3, Third Quartile- Lowest 75% observations.
▪ 3/4(N+1)th
▪ Inter-Quartile Range (IQR) = Q3-Q1.
▪ Notice that Q2 = Median.
▪ Quartiles are used to study variation in the data, and to spot whether the distribution of data is symmetric.
▪ Outliers: More than Q3+1.5 IQR or less than Q1-1.5 IQR.
▪ Used in 5-number summary and Boxplots.

Dataset (sorted): 40, 50, 60, 80, 100, 110, 120, 180, 220, 300, 600, 700, 900, 910, 930, 950
Q1 = 90 = (80+100)/2
Q2 = 200 = (180+220)/2
Q3 = 800 = (700+900)/2
IQR = 800 - 90 = 710.
BITS Pilani
Pilani Campus

5-number summary

5-number summary

The dataset is summarized by the following 5 numbers-
1. Minimum
2. Q1- Quartile 1
3. Q2- Quartile 2
4. Q3- Quartile 3
5. Maximum

▪ Usage
▪ To study variation in the data,
▪ Quick symmetry assessment
▪ Used in Boxplots.

Dataset (sorted): 40, 50, 60, 80, 100, 110, 120, 180, 220, 300, 600, 700, 900, 910, 930, 950
Q1 = 90 = (80+100)/2, Q2 = 200 = (180+220)/2, Q3 = 800 = (700+900)/2

5-number summary of the above dataset: 40, 90, 200, 800 and 950
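The 5-number summary of this 16-value dataset can be sketched in Python as follows. This is an assumption about the method: the quartiles are taken as the medians of the lower and upper halves of the sorted data, which reproduces the slide's values (90, 200, 800); a strict (N+1)/4 position rule with interpolation would give slightly different quartiles for this dataset.

```python
import statistics

data = sorted([40, 50, 60, 80, 100, 110, 120, 180,
               220, 300, 600, 700, 900, 910, 930, 950])

half = len(data) // 2
lower_half = data[:half]    # smallest 8 observations
upper_half = data[-half:]   # largest 8 observations

q1 = statistics.median(lower_half)
q2 = statistics.median(data)
q3 = statistics.median(upper_half)
iqr = q3 - q1

five_numbers = (min(data), q1, q2, q3, max(data))

# Outlier fences as stated on the Quartiles slide.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("5-number summary:", five_numbers)
print("IQR:", iqr, "Outliers:", outliers)
```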


BITS Pilani
Pilani Campus

Boxplots

Boxplot
▪ A visual representation of the 5-number summary: a box over the 5 numbers- Min, Q1, Q2, Q3, Max.
▪ Boxplot is used to study variation in the data, and to spot whether the distribution of data is symmetric.
▪ Boxplot = ‘Box and Whisker’ in MS Excel 2019.

Dataset (sorted, descending): 950, 930, 910, 900, 700, 600, 300, 220, 180, 120, 110, 100, 80, 60, 50, 40
5 numbers: Max = 950, Q3 = 800 = (900+700)/2, Q2 = 200 = (220+180)/2, Q1 = 90 = (80+100)/2, Min = 40

[Figure: boxplot of the dataset with Min, Q1, Q2, Q3 and Max marked]

Wherever there are outliers, the whisker line ends at Q3+1.5IQR (if outliers are on the higher side) or at Q1-1.5IQR (if outliers are on the lower side)
Typical Boxplots

Boxplot No   Lowest 25%, Q1   Lowest 50%, Q2   Lowest 75%, Q3   Lowest 100%, (All)   Comments
1            25               5                5                25                   Symmetric, narrow IQR
2            20               20               20               20                   Symmetric
3            20               30               30               20                   Symmetric
4            35               15               15               35                   Symmetric
5            10               40               40               10                   Symmetric, wide IQR
6            25               5                5                25                   Symmetric, low (30) median
7            45               10               10               15                   Left skewed
8            20               5                10               45                   Right skewed

[Figure: the eight boxplots, numbered 1 to 8]
Take 5… Boxplots stories

Age

44
BITS Pilani, Pilani Campus
BITS Pilani
Pilani Campus

Covariance and Correlation


(will be done later in Chapter#12, Simple Linear Regression)

MS Excel functions….1/2

Excel function     Example                    Result
=Average(Range)    =Average(1,2,3,4)          2.5
=Median(Range)     =Median(6,4,5,3,2)         4
=Mode(Range)       =Mode(2,3,3,4,5,5,5,5)     5
=Var.p(Range)      =Var.p(1,2,3,4,10)         10.000
=Stdev.p(Range)    =Stdev.p(1,2,3,4,10)       3.162

MS Excel functions….2/2

Excel function             Example            Result
=Min(Range)                =Min(2,3,4,7,8)    2
=Max(Range)                =Max(2,3,4,7,8)    8
=Quartile.Inc(Array,#Q)    (Q = 1, 2, 3)

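For readers who want to cross-check the spreadsheet results outside Excel, the same example values can be recomputed with Python's standard `statistics` module. This is only a sketch of equivalent calls, not part of the course material; the variable names are illustrative.

```python
import statistics

average = statistics.mean([1, 2, 3, 4])                 # like =Average(1,2,3,4)
median = statistics.median([6, 4, 5, 3, 2])             # like =Median(6,4,5,3,2)
mode = statistics.mode([2, 3, 3, 4, 5, 5, 5, 5])        # like =Mode(2,3,3,4,5,5,5,5)
var_p = statistics.pvariance([1, 2, 3, 4, 10])          # like =Var.p(1,2,3,4,10)
stdev_p = statistics.pstdev([1, 2, 3, 4, 10])           # like =Stdev.p(1,2,3,4,10)
smallest = min([2, 3, 4, 7, 8])                         # like =Min(2,3,4,7,8)
largest = max([2, 3, 4, 7, 8])                          # like =Max(2,3,4,7,8)

print(average, median, mode, var_p, round(stdev_p, 3), smallest, largest)
```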
Numerical Description Measures

1. Mean, Median, Mode


2. Min, Max, Range
3. Variance, Stddev, CoV
4. 5 – Number summary
1. Min, Q1, Q2, Q3, Max
5. Inter Quartile Range: Q3-Q1
6. Skewness
7. Kurtosis
8. You can use MS Excel to calculate these descriptive measures
48
BITS Pilani, Pilani Campus
Q&A - Discussion on exam pattern

BITS Pilani, Pilani Campus


Quantitative Methods

Lecture-4
Probability – Part I
(Ch 4 Business Statistics, Levine et al.)
So far and the next

✓ Previous Sessions
√ Defining and collecting data
√ Survey and sampling methods
√ Organizing and visualizing variables
√ Numerical descriptive measures
➢ Today
➢ Probability (Part I)

3
BITS Pilani, Pilani Campus
Why learn probability?
▪ We ideally prefer certainty, however most things are uncertain.
▪ What is uncertainty?
▪ Faced with uncertainty, how do we make rational decisions?
▪ Should I hire the candidate I interviewed for the job?
▪ What is the chance of the prospective hire performing well in the job?
▪ Should I approve the loan application on the table?
▪ What is the chance of the loan seeker paying the loan back?
▪ Should I do an MBA?
▪ What is the chance of getting a better job if I invest money and 2 years in an MBA?
▪ Study of probability helps us make sense of uncertainty
▪ Helps us quantify “the questions about chance of an event happening”
▪ From “perhaps” or “likely” or “unlikely” to say “50%”, “70%” or “25%”

Better decisions can be taken if uncertainty is measured

▪ Rain: To carry umbrella or not?


▪ Farming: To grow Wheat or Gram in Rabi season?
▪ Safety: To wear a helmet or not?
▪ Insurance: To insure the shipment or not?
▪ Stock market: Which stock to buy/sell?
▪ International airport: Frisk the passenger at Green channel?
▪ Quality: Accept/Reject the consignment or Inspect more parts?
▪ Cricket: batting, betting or fielding?

▪ Commercial Bank: How much cash to keep in the branch?


▪ Blood Bank: How many units to keep?
▪ Insurance: What should be the premium?
▪ Maintenance: How many spare parts to stock?
▪ Emergency services-1: How many Fire Engines?
▪ Emergency services-2: How many Ambulances/Nurses in emergency?
▪ Pizza: Number of delivery boys?
5
BITS Pilani, Pilani Campus
Probability

▪ What is probability?
▪ Chance or likelihood of an event
▪ Random Experiment
▪ Exact outcome can not be predicted with certainty
▪ But outcome is always one of the possible knowns
▪ Sample space: Contains all possible outcomes/events
▪ Natural Experiment: Chance of rain for example
▪ Simple event and joint events
▪ One feature v/s more than one features
▪ Will it be a 4? Will it be 4 & 2 on two consecutive rolls?

6
BITS Pilani, Pilani Campus
Probability

▪ Probability is a number assigned to an event


▪ Represents likelihood of that event happening
▪ Always between 0 and 1
▪ Total probability law
▪ Total probability of all events in the sample space is 1
▪ ∑pi = 1, where pi = The probability of the event ‘i’
▪ Example:
▪ If a sample space has 6 possible events (E1, E2, …., E6)
▪ Can the probability of getting a 2 (E2) be -1.2? Or 2?
▪ Total probability of E1 through E6 = ∑P(Ei) = 1
▪ If the total probability of getting 1, 2, 3, 4 or 5 is 4/6, then
▪ What is the probability of E6?
▪ 1 – 4/6 = 2/6 (It’s a loaded dice, not a fair one)

7
BITS Pilani, Pilani Campus
Probability?

• For a sample space containing equally likely outcomes
– Prob = Number of favourable outcomes / Total number of possible outcomes
• A six-sided dice is tossed, probability of number 6?
• What is the sample space?
• Total number of possible outcomes?
• 6
• Number of favorable outcomes
• 1
• Probability?
• 1/6

• Probability of snowfall in Mumbai next month?


• What is the sample space? Are events equally likely?

8
BITS Pilani, Pilani Campus
Getting values of probability

1. A priori
▪ Classical/Equi-likely.
▪ Textbook examples of Coin tossing, Playing cards,
etc, or when you know nothing.

2. Empirical
▪ From historical data, observations, or experiments.
▪ Life tables in insurance, earthquakes, rainfall, twins,
quality, stock market …

3. Subjective
▪ Personal judgement.
▪ Outcome of an India vs Brazil cricket match. Covid-19 will be completely over in 2028…
BITS Pilani
Pilani Campus

A priori probability
Probability- a priori

Probability = Number of outcomes in which the event occurs / Total number of possible outcomes

Tossing a coin-
▪ Outcomes- Head or Tail.
▪ P(Head) = P(Tail) = ½.

Births-
▪ Outcomes- Male or Female.
▪ P(Male)=P(Female)= ½.

Playing cards-
▪ Outcomes- 26 Black or 26 Red.
▪ P(Red)= P(Black)= 26/52 = ½.

11

BITS Pilani, Pilani Campus


Probability- a priori …1

Probability                    Number of outcomes in which the event occurs    Total number of possible outcomes
1. P(4) = 1/6.                 1                                               1, 2, 3, 4, 5, 6.
2. P(5) = 1/6.                 1                                               1, 2, 3, 4, 5, 6.
3. P(Even) = 3/6.              3                                               1, 2, 3, 4, 5, 6.
4. P(<5) = 4/6.                4                                               1, 2, 3, 4, 5, 6.
5. P(<=5) = 5/6.               5                                               1, 2, 3, 4, 5, 6.
6. P(Divisible by 3) = 2/6.    2                                               1, 2, 3, 4, 5, 6.
7. P(Divisible by 5) = 1/6.    1                                               1, 2, 3, 4, 5, 6.
8. P(Prime) = 3/6.             3                                               1, 2, 3, 4, 5, 6.

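The counting behind this table can be sketched in Python: list the sample space, count the outcomes in which the event occurs, and divide. The helper name `probability` and the event labels are illustrative; `Fraction` reduces results automatically, so 3/6 is shown as 1/2.

```python
from fractions import Fraction

sample_space = [1, 2, 3, 4, 5, 6]   # one roll of a six-sided dice

def probability(event):
    """A priori probability: favourable outcomes / total possible outcomes."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

events = {
    "P(4)": lambda n: n == 4,
    "P(Even)": lambda n: n % 2 == 0,
    "P(<5)": lambda n: n < 5,
    "P(<=5)": lambda n: n <= 5,
    "P(Divisible by 3)": lambda n: n % 3 == 0,
    "P(Divisible by 5)": lambda n: n % 5 == 0,
    "P(Prime)": lambda n: n in (2, 3, 5),
}

results = {name: probability(test) for name, test in events.items()}
for name, value in results.items():
    print(name, "=", value)
```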


BITS Pilani
Pilani Campus

Empirical probability-
From experiments or observations
Empirical Probability
• The probability is computed from experiments,
observations, surveys, etc.
• Proportion of times an event occurs
• If a dice is rolled 10, 100, 500 or 1000 times
• Actual proportion of times a 4 would appear
• Will it be 1/6 or 0.167?

Empirical Test (Simulating rolls of a dice)


# Get a random number between 1 and 6. Do this 'n' times
# Calculate the proportion of times the number was 4.
# Now vary 'n' between 20 and 1000
# Chart n on the x-axis and the proportion prop on the y-axis.

Python code is shared on the teams files
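The shared course file is not reproduced here; the following is an independent minimal sketch of the steps listed above. The fixed seed, the helper name `proportion_of_fours` and the extra run with n = 100000 are assumptions added for illustration, and the chart is replaced by a printed table to keep the example dependency-free.

```python
import random

def proportion_of_fours(n, rng):
    """Roll a six-sided dice n times and return the share of rolls showing 4."""
    rolls = [rng.randint(1, 6) for _ in range(n)]
    return rolls.count(4) / n

rng = random.Random(42)  # fixed seed so the run is repeatable (illustrative choice)

proportions = {n: proportion_of_fours(n, rng)
               for n in (20, 100, 500, 1000, 100000)}

for n, prop in proportions.items():
    print(f"n = {n:>6}: proportion of 4s = {prop:.3f} (a priori value 1/6 = 0.167)")
```

For small n the proportion can be far from 1/6; it tends to settle near 1/6 only as n grows.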


14
BITS Pilani, Pilani Campus
Empirical probability Examples

▪ The probability is computed from experiments, observations, surveys, etc.

Item             Probability
Left-handed      1 : 10 persons
Twins            3 : 100 births
Breast Cancer    1 : 8 women in US
Lung Cancer      17.2 in 100 male smokers
                 11.6 in 100 female smokers

Empirical probability computation- Examples

S&P BSE Sensex observed for 26 days (25 day-to-day changes)- 11 times Down and 14 times Up.
▪ P(Down) = 11/25 = 44%.
▪ P(Up) = 14/25 = 56%.

Range    Frequency    Probability*
20-30    1            0.01
30-40    40           0.26
40-50    76           0.50
50-60    26           0.17
60-70    6            0.04
70-80    1            0.01
80-90    1            0.01
Total    151          1
* or Relative Frequency
Using probability to make decisions: Expected value
Expected value: E[X] = ∑ xi p(xi)
• xi = The values X takes
• p(xi) = The probability that X takes value xi
• ∑ xi p(xi) = x1p(x1) + x2p(x2) + … + xnp(xn)
• n: Total number of possible outcomes

• For Roulette-A
• Probability of Blue, Yellow, Red and Green is 9%, 6%, 70% and 15% respectively
• You are offered Rs 10 for Blue, Rs 20 for Yellow, Rs 30 for Red and Rs 40 for Green

• Game Fee is Rs 30
• Should you play this game? Should you offer this game?
• Expected value E[X]: x1p(x1) + x2p(x2) + x3p(x3) + x4p(x4)
• 10*0.09 + 20*0.06 + 30*0.70 + 40*0.15 = 0.9 + 1.2 + 21 + 6 = 29.1
• Note: Only likely to hold in a large number of trials
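The Roulette-A calculation can be written as a short Python sketch. The payouts, probabilities and fee are the ones on the slide; the dictionary layout and variable names are illustrative.

```python
game_fee = 30  # Rs, from the slide

# colour: (payout in Rs, probability), as given for Roulette-A
roulette_a = {
    "Blue": (10, 0.09),
    "Yellow": (20, 0.06),
    "Red": (30, 0.70),
    "Green": (40, 0.15),
}

# E[X] = sum of payout * probability over all outcomes
expected_payout = sum(payout * prob for payout, prob in roulette_a.values())

# Average gain per game for the player, after paying the fee
expected_net_gain = expected_payout - game_fee

print(f"Expected payout: Rs {expected_payout:.1f}")
print(f"Expected net gain per game for the player: Rs {expected_net_gain:.1f}")
```

With an expected payout of Rs 29.1 against a fee of Rs 30, the player loses about Rs 0.9 per game on average over many games, so the game favours whoever offers it.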
BITS Pilani
Pilani Campus

Subjective probability
Subjective probability

▪ Past experience, personal opinions and biases, etc.


1. Covid-19 will be completely over by 2028-
▪ ExpertA = 0.90. ExpertB = 0.10. ExpertC = 0.25. Layman = 0.30.

2. BSE Sensex will close 400-500 points up tomorrow-
▪ Broker-A: 0.40. Broker-B: 0.30. Investor-A: 0.60. MF-A: 0.30.

3. Sports betting-
▪ P(IndiaWillWin) = 0.40. BookieA.
▪ P(IndiaWillWin) = 0.45. BookieB.
▪ P(IndiaWillWin) = 0.70. BookieC.

BITS Pilani
Pilani Campus

Types of probability- for two or more events


BITS Pilani
Pilani Campus

Joint and Marginal probability

21
Joint Events

• M&R Electronics case from the book


• Purchase behavior of 1000 households
• HDTV
• What is the sample space?
• Related to purchase behavior of 1000 households
• Depends on how you frame the experiment
• Simple events? (Only one characteristic)
• A household planned to purchase, did not plan to purchase, actually
purchased, did not actually purchase
• Joint events? (More than one characteristics)
• Planned to purchase and actually purchased
BITS Pilani, Pilani Campus
Joint Events

• Simple probability
• p(y) = p(planned to purchase)
• 250/1000 = ¼ = 0.25
• Complement
• p(did not plan to purchase) = p(ȳ) = 1 – p(y)
• 0.75
• Joint Probability:
• p(planned to purchase & actually purchased) = p(x and y)
• 200/1000 = 1/5 = 0.2
• p(x ∩ y); ∩: Intersection symbol
• Addition Rule
• p(x or y) = p(x U y) = p(x) + p(y) – p(x∩y)
• U: Union symbol

[Figure: contingency table]
[Figure: Venn diagram with x = 300, x∩y = 200, y = 250]
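As a small worked sketch of the rules on this slide, the counts from the Venn diagram (x = 300 households that actually purchased, y = 250 that planned to purchase, 200 in both, out of 1000) can be plugged into the complement and addition rules. The variable names are illustrative.

```python
from fractions import Fraction

total = 1000
planned = 250                 # y: planned to purchase
purchased = 300               # x: actually purchased
planned_and_purchased = 200   # x ∩ y

p_y = Fraction(planned, total)                    # simple probability
p_x = Fraction(purchased, total)
p_x_and_y = Fraction(planned_and_purchased, total)  # joint probability

p_not_y = 1 - p_y                  # complement rule
p_x_or_y = p_x + p_y - p_x_and_y   # addition rule

print("p(y) =", float(p_y))
print("p(not y) =", float(p_not_y))
print("p(x and y) =", float(p_x_and_y))
print("p(x or y) =", float(p_x_or_y))
```

Subtracting p(x∩y) avoids counting the 200 households that both planned and purchased twice.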


Marginal Probability

• Marginal Probability
• Event A occurs with one or other of the events B1, B2, B3, …, Bn
• Marginal probability of A: P(A)
• Sum of joint probabilities with all B1, B2, …, Bn
• P(A) = P(A and B1) + P(A and B2) + … + P(A and Bn)
• B1, B2, …, Bn are mutually exclusive and collectively exhaustive events occurring with event A

Example (from the contingency table):
P(y) = p(y ∩ x) + p(y ∩ x̄)
= 200/1000 + 50/1000
= 0.2 + 0.05 = 0.25


Marginal and Joint…1

Marginal
P(Red) = 1/2
P(King) = 4/52 = 1/13
P(7) = 1/13
P(Picture) = 12/52
P(Diamond) = 13/52 = 1/4

Joint
P(Red and King) = 2/52
P(Diamond and Red) = 13/52
P(Picture and Red) = 6/52
P(Black and Red) = 0
P(<3 and Red) = 4/52

Marginal probability- only one event occurs. P(Red) means the probability that the card is of Red color.
Joint probability- both events occur. P(Red and King) means the probability that the card is of Red color and it is also a King.

[Figure: the four suits- Diamond, Club, Heart, Spade]


Marginal and Joint probability…2

Marginal
1. P(Red) = 26/52 = 1/2.
2. P(King) = 4/52.
3. P(7) = 4/52.
4. P(Picture) = 12/52.
5. P(Diamond) = 13/52.

Joint
6. P(Red and King) = 2/52.
7. P(Diamond and Red) = 13/52.
8. P(Picture and Red) = 6/52.
9. P(Black and Red) = 0/52.
10. P(<3 and Red) = 4/52.

Marginal probability- only one event occurs. P(Red) means the probability that the card is of Red color.
Joint probability- both events occur. P(Red and King) means the probability that the card is of Red color and it is also a King.

[Figure: the four suits- Diamond, Club, Heart, Spade]
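These card probabilities can be checked by enumerating a 52-card deck and counting. The sketch below assumes Ace counts as rank 1 (so "<3" means Ace or 2) and that picture cards are Jack, Queen and King; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

suits = {"Diamond": "Red", "Heart": "Red", "Club": "Black", "Spade": "Black"}
ranks = list(range(1, 14))  # assumption: Ace = 1, Jack = 11, Queen = 12, King = 13
deck = [(rank, suit) for rank, suit in product(ranks, suits)]

def prob(event):
    """Count the cards satisfying the event and divide by the deck size."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_red(card):
    return suits[card[1]] == "Red"

def is_king(card):
    return card[0] == 13

def is_picture(card):
    return card[0] >= 11

# Marginal probabilities: one characteristic
p_red = prob(is_red)
p_king = prob(is_king)
p_picture = prob(is_picture)

# Joint probabilities: both characteristics at once
p_red_and_king = prob(lambda c: is_red(c) and is_king(c))
p_picture_and_red = prob(lambda c: is_picture(c) and is_red(c))
p_black_and_red = prob(lambda c: (not is_red(c)) and is_red(c))
p_low_and_red = prob(lambda c: c[0] < 3 and is_red(c))

print(p_red, p_king, p_picture, p_red_and_king,
      p_picture_and_red, p_black_and_red, p_low_and_red)
```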


Q&A

27
Quantitative Methods

Lecture-5
BITS Pilani
Pilani Campus

1
BITS Pilani
Pilani Campus

Probability – Part II
(Ch 4 Business Statistics, Levine et al.)
So far and the next

✓ Previous Sessions
√ Defining and collecting data
√ Survey and sampling methods
√ Organizing and visualizing variables
√ Numerical descriptive measures
√ Probability (Part I)
➢ Today
➢ Probability (Part II)

3
BITS Pilani, Pilani Campus
• Two people
• Every person can either be a Yes or a No
• How many arrangements are possible?
• Yes, No
• Yes, Yes
• No, Yes
• No, No

Sample space – some more exploration
• Sample space depends on how you frame the experiment
• Let's consider only one feature of the previous example
• Frame 1: Nature generates a visitor out of its own magic with two possibilities
• Sample space: S = {Yes, No}
• Are these equally likely outcomes? Is probability P(Yes) = ½?
• Frame 2: Equally likely sample space
• Treat 1000 store visitors’ data like a pack of 1000 independent cards
• Each card is marked as one of the two values (250: Yes, 750: No)
• S = {750-No, 250-Yes}
• Nature randomly picks a card with replacement (every card is equally likely)
• What is the probability that the next visitor plans to purchase?
• Frame 3: Drawing 1000 independent persons from a population
• Sample space - {All permutations of a thousand Yes or No possibilities}
• Total: 2 x 2 x … x 2 (1000 times) = $2^{1000}$ values
BITS Pilani
Pilani Campus

Conditional probability

6
Conditional Probability

• Conditional Probability
• Probability of Event A occurring, given event B has occurred
• Narrows down the sample space
• Sample space of x | y?, x | ȳ?
• P(A | B)
• Probability of A given B
• P(A|B) = P(AB) / P(B)

Probability of “actual purchase” given “plans to purchase”
P(x | y) = p(x ∩ y) / p(y)
= (200/1000) / (250/1000)
= 200/250 = 0.8

Probability of “purchasing” given “does not plan to purchase”
P(x | ȳ) = p(x ∩ ȳ) / p(ȳ)
= (100/1000) / (750/1000)
= 100/750 = 0.133

Also can be computed from the new narrowed sample space
Sample space of x | y? and x | ȳ?
P(x | y) = 200/250 = 0.8
P(x | ȳ) = 100/750 = 0.133

[Figure: contingency table]
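The two conditional probabilities can be reproduced from the household counts with a minimal Python sketch. The counts (200, 50, 100, 650) are the ones used throughout this example; the dictionary keys and function name are illustrative.

```python
# (plan, purchase) -> number of households, out of 1000
counts = {
    ("planned", "purchased"): 200,
    ("planned", "not purchased"): 50,
    ("not planned", "purchased"): 100,
    ("not planned", "not purchased"): 650,
}
total = sum(counts.values())

def conditional(purchase, plan):
    """P(purchase | plan) = P(purchase and plan) / P(plan)."""
    joint = counts[(plan, purchase)] / total
    marginal = sum(n for (p, _), n in counts.items() if p == plan) / total
    return joint / marginal

p_purchase_given_planned = conditional("purchased", "planned")
p_purchase_given_not_planned = conditional("purchased", "not planned")

# Marginal P(purchased), for comparison with the conditional values
p_purchase = sum(n for (_, b), n in counts.items() if b == "purchased") / total

print(f"P(x | y)     = {p_purchase_given_planned:.3f}")
print(f"P(x | not y) = {p_purchase_given_not_planned:.3f}")
print(f"P(x)         = {p_purchase:.3f}")
```

Knowing that a household planned to purchase moves the probability of purchase from 0.3 to 0.8, which is what "narrowing the sample space" means in practice.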


Conditional Probability and sample space

Total: 1000
  Plans to purchase: 250
    Purchase: Yes 200
    Purchase: No 50
  Does not plan to purchase: 750
    Purchase: Yes 100
    Purchase: No 650

Probability of “actual purchase” given “plans to purchase”
From narrowed sample space = 200/250 = 0.8


Marginal, Joint and Conditional probability

P(Red | King) = 2/4 = ½
P(Red | 7) = 1/2
P(Picture | Black) = 6/26
P(Picture | Diamond) = 3/13

Conditional probability- P(King | Red) means the probability that the card is a King given that you have the information that the card is a Red card.

[Figure: the four suits- Diamond, Club, Heart, Spade]


Conditional probability

From 52 card deck
1. P(Red|King) = 2/4.
2. P(Red|8) = 2/4.
3. P(Picture|Red) = 6/26.
4. P(Picture|7) = 0/4.
5. P(=7|Red) = 2/26.
6. P(<7|Red) = 12/26.
7. P(<7|Diamond) = 6/13.
8. P(Picture|Red) = 6/26.
9. P(Red|Picture) = 6/12.
10. P(Red|Diamond) = 13/13 = 1.
11. P(Diamond|Red) = 13/26.

Contingency tables

           Red    Black    Total
King       2      2        4
NotKing    24     24       48
Total      26     26       52

           <7     Not <7   Total
Red        12     14       26
Black      12     14       26
Total      24     28       52

[Figure: frequencies on tree]

P(Red|King): If the card is a King, what is the probability it is of Red color? 2/4 = 1/2.
P(<7|Red): If the card is of Red color, what is the probability it is <7? 12/26.


Application of Conditional Probability

Issue of Credit card using P(Default | Event)


• Do you own a house?
• Are you a working professional?
• Do you have a PAN card?
• Is your CIBIL score higher than 800?
• Do you have a credit card from another bank?

▪ Insurance Premium Decisions P( Claim | Event)


▪ P ( Lung Cancer | Smokes)
▪ P (Lung Cancer | Does not smoke)
▪ P (Death in the insurance period | Age 1-50)
▪ P (Death in the insurance period | Age 80+)

11
BITS Pilani, Pilani Campus
Application of Conditional Probability
Medical diagnosis

▪ Narrowing down the probability of the patient having diabetes


▪ P (Diabetes)
▪ The doctor asks several questions
▪ Do you urinate a lot?
▪ P(HasDiabetes | UrinateALot) = a.
▪ P(HasDiabetes | DoesNotUrinateALot) = b
▪ Subsequent questions can further narrow down the probabilities

12
BITS Pilani, Pilani Campus
Independence

• Independence
• Events A and B are independent if information about one event having occurred does not help narrow down the sample space
• P(A | B) = P(A)
• ‘|’ is read as: “conditional on” or “given”

• Are “Purchase” and “Plans to Purchase” independent events?
• P(purchase)? P(purchase | plans to purchase)?
• P(purchase) = 300/1000 = 0.3 = 30%
• P(purchase | plans to purchase) = 200/250 = 0.8 = 80%
• The two probabilities differ, so “Purchase” and “Plans to Purchase” are not independent

• Toss of a coin - Are head and tail independent events?
• What is the probability P(Head)?
• What is the probability of getting a “Head” if you know that “Tail” has occurred?
• Zero
• Hence they are not independent

Total: 1000
  Plans to purchase: 250 (Purchase: Yes 200, Purchase: No 50)
  Does not plan to purchase: 750 (Purchase: Yes 100, Purchase: No 650)


Multiplication Rule

• Joint Probability
• P(A|B) = P(AB) / P(B)
• P(AB) = P(A|B).P(B)
• Note: P(AB) = P(A and B) = P(A∩B)
• For Independent Events
• P(A | B) = P(A)
• P(A and B) = P(A).P(B)

For a toss of a fair coin, what is the probability of getting 4 heads in a row?
• P(Head) = P(H) = ½
• 4 tosses are 4 independent events
• P(H, H, H, H) = P(H).P(H).P(H).P(H)
• = $(1/2)^4$
• = 1/16


Multiplication Rule and marginal probability

• Marginal probability and general multiplication rule


• If, P(A) = P(A∩B1) + P(A∩B2) + …… + P(A∩Bn)
• A always takes place with one of the mutually exclusive events B1, B2, …., Bn
• Then, P(A) = P(A|B1).P(B1) + P(A|B2).P(B2) + …… + P(A|Bn).P(Bn)

P(Purchased) =
P(Purchased | Planned:Yes)* P (Planned:Yes)
+
P(Purchased | Planned:No)* P (Planned:No)

p(x) => (200/250)*(250/1000) + (100/750)*(750/1000) = 0.3

Contingency Table
BITS Pilani, Pilani Campus
Multiplication Rule and marginal probability

• Marginal probability and general multiplication rule
• If, P(A) = P(A∩B1) + P(A∩B2) + …… + P(A∩Bn)
• Then, P(A) = P(A|B1).P(B1) + P(A|B2).P(B2) + …… + P(A|Bn).P(Bn)

Total: 1
  Plans to purchase: 0.25
    Purchase: Yes 0.8
    Purchase: No 0.2
  Does not plan to purchase: 0.75
    Purchase: Yes 0.133
    Purchase: No 0.867

Marginal Probability P(Purchased) =
P(Purchased | Planned:Yes) * P(Planned:Yes)
+
P(Purchased | Planned:No) * P(Planned:No)

P(purchased) = 0.8*0.25 + 0.133*0.75 = 0.3


Bayes’ Theorem

P (B|A) when P(A|B) is known


• From P(B|A) to P (A|B)
• From P(Test is positive | Covid) to => P ( Covid | Test is positive)

Updating estimated probability after actual experience


• You started with assuming the coin has 50% chance of being fair i.e.
P(Coin-Fair) = .5
• You see three tosses showing up as Heads
• You would like to update your probability estimates
• P (Coin-Fair | Three-Heads-In-A-Row)

BITS Pilani, Pilani Campus


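Bayes’ Theorem

• Finding P(B|A) when P(A|B) is known
• $P(B \mid A) = \frac{P(BA)}{P(A)}$
• $P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}$
• $P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A \mid B_1)\,P(B_1) + P(A \mid B_2)\,P(B_2) + \cdots + P(A \mid B_n)\,P(B_n)}$, where B is one of the mutually exclusive and collectively exhaustive events B1, B2, …, Bn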


Bayes’ Theorem
• Probability of infection in population: 0.03
• Test kit performance
• P(True Positive) = P(Positive Report | Disease) = 0.9
• P(False positive) = P(Positive Report | No disease) = 0.02
• What is the probability of disease if the result is positive?
• What is P(Disease | Positive Report), when P(Positive Report | Disease) is known?
• P(Disease | Positive Report) = P(Positive Report | Disease).P(Disease) / P(Positive Report)
• P(Positive Report) = P(Positive Report and Disease) + P(Positive Report and No disease)
• Marginal Probability of Positive Report = P(Positive Report)
• = P(Positive Report | Disease).P(Disease) + P(Positive Report | No disease).P(No disease) = 0.9*0.03 + 0.02*0.97 = 0.0464
• P(Disease | Positive Report) = 0.9*0.03 / 0.0464 = 0.5819

Tree: Disease (D): Y 0.03, N 0.97
  D = Y: TP 0.9, FN 0.1
  D = N: TN 0.98, FP 0.02
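The test-kit calculation can be followed step by step in a short Python sketch. The three input probabilities are the ones on the slide; the variable names are illustrative.

```python
p_disease = 0.03                  # prior: infection rate in the population
p_pos_given_disease = 0.9         # true positive rate
p_pos_given_no_disease = 0.02     # false positive rate

# Marginal probability of a positive report (total over Disease / No disease)
p_positive = (p_pos_given_disease * p_disease
              + p_pos_given_no_disease * (1 - p_disease))

# Bayes' theorem: P(Disease | Positive Report)
p_disease_given_positive = p_pos_given_disease * p_disease / p_positive

print(f"P(Positive Report) = {p_positive:.4f}")
print(f"P(Disease | Positive Report) = {p_disease_given_positive:.4f}")
```

Even with a 90% true positive rate, a positive report implies only about a 58% chance of disease here, because the disease is rare and false positives come from the much larger healthy group.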


CW/HW-1

Identify Worst Economist

▪ The track record of four economists in predicting the economy is given in the contingency tables.
▪ The track record of four astrologers in predicting whether Indian football team will win (replace Up with Win) or lose (replace Down with Lose) is given in the contingency tables.
▪ Identify the worst economist?
▪ Rank all economists and show all calculations.
Hint: Use conditional probability.

Economist A (rows: Forecast, columns: Actual)
Forecast    Up      Down    Total
Up          0.36    0.24    0.60
Down        0.24    0.16    0.40
Total       0.60    0.40    1.00

Economist B (rows: Forecast, columns: Actual)
Forecast    Up      Down    Total
Up          0.40    0.24    0.64
Down        0.20    0.16    0.36
Total       0.60    0.40    1.00

Economist C (rows: Forecast, columns: Actual)
Forecast    Up      Down    Total
Up          0.60    0.00    0.60
Down        0.00    0.40    0.40
Total       0.60    0.40    1.00

Economist D (rows: Forecast, columns: Actual)
Forecast    Up      Down    Total
Up          0.00    0.60    0.60
Down        0.24    0.00    0.24
Total       0.24    0.60    0.84
Imperfect ‘classification machines’

Quality inspection Medical diagnosis Banks

Economists/Astrologers Court Judgements

BITS Pilani, Pilani Campus


Some other applications
Updating a priori probabilities based on experience/evidence

• Is the coin fair?
• You start with 0.5 probability of the coin being fair
• For a fair coin, P(H) = P(T) = 0.5
• An unfair coin is assumed to give heads 80% of the time
• What is your updated probability estimate of the coin being fair, if you see 2 heads in a row?
• P(HH|F) = 0.25, P(HH|U) = 0.64, P(F|HH)?
• P(F|HH) = P(HH|F)*P(F) / P(HH)
• P(HH) = P(HH|F)*P(F) + P(HH|U)*P(U)
• 0.25*0.5/(0.125 + 0.32) = 0.125/0.445 = 0.281
• Updated priors P(F) = 0.28, P(U) = 0.72

Tree: Coin (C): F 0.5, U 0.5
  F: HH 0.25, NHH 0.75
  U: HH 0.64, NHH 0.36
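One way to see what "updated priors" means is to apply Bayes' theorem once per observed head, feeding each result back in as the next prior. The sketch below assumes, as on the slide, that the unfair coin gives heads 80% of the time; the function name is illustrative. Updating one head at a time ends at the same value as the single two-heads calculation.

```python
def update_after_head(p_fair, p_head_fair=0.5, p_head_unfair=0.8):
    """Return P(fair | one more head) given the current P(fair)."""
    numerator = p_head_fair * p_fair
    return numerator / (numerator + p_head_unfair * (1 - p_fair))

p_fair = 0.5          # starting belief that the coin is fair
history = [p_fair]
for _ in range(2):    # two heads observed in a row
    p_fair = update_after_head(p_fair)
    history.append(p_fair)

# Single-step calculation from the slide, for comparison
batch = 0.25 * 0.5 / (0.25 * 0.5 + 0.64 * 0.5)

print("Belief after each head:", [round(p, 4) for p in history])
print("Single-step result:", round(batch, 4))
```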
Counting Methods (1/4)

Probability = Number of outcomes in which the event occurs / Total number of possible outcomes

How to count these numbers?
Rule 1: “K” mutually exclusive and collectively exhaustive outcomes in “N” trials
– Total number of possible outcomes = $K^N$
– Total number of possible outcomes in 10 coin tosses?
– K = 2, N = 10
– Total number of possible outcomes = $2^{10}$ = 1024

Counting Methods (2/4)

Rule 2: If there are “K1” possible outcomes in the 1st trial, “K2” in the 2nd, and so on
– Total number of possible outcomes = K1*K2*….
Example:
Possible number plates if the number consists of 2 letters followed by 2 digits?
26*26*10*10 = 67600

Counting Methods (3/4)

Rule 3: Number of ways in which “n” unique numbers/items can be arranged
– $n!$ (n factorial)
– $n! = n \times (n-1) \times (n-2) \times \cdots \times 1$
– Example: How many ways can the letters A, B, C, D be arranged?
• 4! = 4*3*2*1 = 24
When some of the numbers/items are not unique (with repetition)
• $\frac{n!}{r_1!\,r_2!\cdots}$ if items 1, 2, etc. repeat r1, r2, … times
• Example: Possible unique 4-letter arrangements from “A”, “B”, “C” and “B”
• = 4!/2! = 24/2 = 12

Counting Methods (4/4)

Rule 4: Permutation
– Total possible arrangements of “r” objects chosen from “n” objects
– $^nP_r = \frac{n!}{(n-r)!}$
– How many ways can 2 letters taken from A, B, C and D be arranged?
• 4!/2! = 24/2 = 12
Rule 5: Combination
– Total possible ways of selecting “r” objects from “n” objects irrespective of order
– $^nC_r = \frac{n!}{r!\,(n-r)!}$
– How many ways can 2 letters be taken from A, B, C and D?
• 4!/(2! * 2!) = 24/(2*2) = 6

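The worked examples for the five counting rules can be verified with Python's standard library (`math.perm` and `math.comb` need Python 3.8 or later). The variable names are illustrative, and the brute-force enumeration is included only as a cross-check of the repetition formula.

```python
import math
from itertools import permutations

rule1 = 2 ** 10                    # K^N: outcomes of 10 coin tosses
rule2 = 26 * 26 * 10 * 10          # K1*K2*...: 2 letters followed by 2 digits
rule3 = math.factorial(4)          # n!: arrangements of A, B, C, D

# n!/(r1! r2! ...): arrangements of A, B, C, B, where B repeats twice
rule3_with_repeats = math.factorial(4) // math.factorial(2)
# Cross-check by listing every ordering and keeping only the distinct ones
distinct_arrangements = len(set(permutations("ABCB")))

rule4 = math.perm(4, 2)            # nPr: ordered pairs from A, B, C, D
rule5 = math.comb(4, 2)            # nCr: unordered pairs from A, B, C, D

print(rule1, rule2, rule3, rule3_with_repeats,
      distinct_arrangements, rule4, rule5)
```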
Situated Learning Assignment (3 Marks)
• Take a dataset from your work context
• Identify the data scale of the different variables
– Nominal, ordinal, interval, ratio
• Compute measures of central tendency and dispersion
– Arithmetic mean
– Dispersion: Next slide
– Calculations: By hand (a calculator is allowed but not MS Excel)
• Suitable graphs (using MS Excel)
– Depends on variable types
– Histograms for interval and ratio variables
• You can comment about skewness and kurtosis
Measuring variation
1. Range = Maximum - Minimum
2. Variance, population = $\sigma^2 = \frac{1}{N}\sum (x_i - \text{Mean})^2$
3. Standard deviation, population = σ
4. Coefficient of Variation, population = σ/Mean
5. Variance, sample = $s^2 = \frac{1}{N-1}\sum (x_i - \text{Mean})^2$
6. Standard deviation, sample = s
7. Coefficient of Variation, sample = s/Mean
8. Mean absolute deviation = $\frac{1}{N}\sum |x_i - \text{Mean}|$
9. Z score (how many std devns is xi away from the mean) = $(x_i - \text{Mean})/\sigma$
10. Quartiles (Q1, Q2, Q3): Smallest 25%, 50%, 75% observations.
11. Inter-quartile range = Q3 – Q1
12. 5-number summary (Min., Q1, Q2, Q3, Max.): Minimum, 3 Quartiles, Maximum
13. Boxplot (called Box and Whisker chart in MS Excel): Plot of 5-number summary
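To make the difference between the population (divide by N) and sample (divide by N-1) formulas concrete, here is a small Python sketch. It reuses the values 1, 2, 3, 4, 10 from the Var.p example earlier in the deck; the variable names are illustrative.

```python
import statistics

data = [1, 2, 3, 4, 10]
n = len(data)
mean = statistics.mean(data)

data_range = max(data) - min(data)

pop_variance = statistics.pvariance(data)      # divides by N
sample_variance = statistics.variance(data)    # divides by N - 1

pop_cov = statistics.pstdev(data) / mean       # sigma / Mean
sample_cov = statistics.stdev(data) / mean     # s / Mean

# Mean absolute deviation: average distance from the mean
mean_abs_deviation = sum(abs(x - mean) for x in data) / n

print(data_range, pop_variance, sample_variance,
      round(pop_cov, 3), round(sample_cov, 3), mean_abs_deviation)
```

The sample variance is always a little larger than the population variance for the same data, because the sum of squared deviations is divided by N-1 instead of N.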
Assignment Reporting Example
• How to calculate manually and report
• 5-point summary of dataset {10, 20, 30, 40, 50, 60}
– Min: 10
– Q1: (n+1)/4th number = 7/4th = 1.75th = 10 + (20-10)*0.75 = 17.5
– Q2 = (n+1)/2th number = 7/2th = 3.5th = 30 + (40-30)*0.5 = 35
– Q3 = (3/4)*(n+1)th = 5.25th = 50 + (60-50)*0.25 = 52.5
– Max: 60
– 5-point summary = {10, 17.5, 35, 52.5, 60}
• You can use MS Excel for graphs and to verify your results
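The manual steps above (find the q*(n+1)/4th position, then interpolate between the two neighbouring values) can be written as a small Python function. This is a sketch of that positional rule only; the function name `quartile` is illustrative, and spreadsheet quartile functions may use a different convention and return different values.

```python
def quartile(sorted_data, q):
    """q-th quartile (q = 1, 2, 3) at 1-based position q*(n+1)/4,
    interpolating linearly between the neighbouring observations."""
    position = q * (len(sorted_data) + 1) / 4
    whole = int(position)           # index of the observation just below
    fraction = position - whole     # how far to move towards the next one
    if whole < 1:
        return sorted_data[0]
    if whole >= len(sorted_data):
        return sorted_data[-1]
    below, above = sorted_data[whole - 1], sorted_data[whole]
    return below + (above - below) * fraction

data = sorted([10, 20, 30, 40, 50, 60])
five_point = [data[0], quartile(data, 1), quartile(data, 2),
              quartile(data, 3), data[-1]]

print("5-point summary:", five_point)
```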
Q&A

30
Quantitative Methods

Lecture-6
Discrete Probability Distributions


(Ch 5 Business Statistics, Levine et al.)
So far and the next

✓ Previous Sessions
√ Defining and collecting data
√ Survey and sampling methods
√ Organizing and visualizing variables
√ Numerical descriptive measures
√ Probability
➢ Today
➢ Discrete Probability Distributions

Sample space and Random Variable
• You have 2 machines in your factory
• Each machine can be “working” or “not-working” decided by a random
process
• Sample space?
S = { “Working, Working”,
“Working, Not-Working”,
“Not-Working, Working”,
“Not-working, Not-Working” }
– Does the sample space contain numbers?

• A “Random Variable (RV)” maps the sample space on a real number line (R)
• Can take some or all real values between -∞ and ∞

BITS Pilani, Pilani Campus


Sample space and Random Variable
• A “Random Variable (RV)” maps the sample space on a real number line (R)
• One to one or many to one mapping
• Can take some or all real values between -∞ and ∞
S={
“Working, Working”,
“Working, Not-Working”,
“Not-Working, Working”,
“Not-working, Not-Working” }

• Example 1:
• Random Variable: X, representing number of working machines on a given day.
• How many values can X take?
• X = { 0, 1, 2 }
• Sample space of the Random Variable X consists of Real Numbers 0, 1 and 2
[Figure: real number line R marked at -1, 0, 1, 2, 3]
Sample space and Random Variable
• A “Random Variable (RV)” maps the sample space on a real number line (R)
• Can take some or all real values between -∞ and ∞
S={
“Working, Working”,
“Working, Not-Working”,
“Not-Working, Working”,
“Not-working, Not-Working” }

• Example 2:
• Random Variable: Y, representing whether all the machines are working today.
• Takes value 1 if all the machines are working, 0 otherwise
• How many values can Y take?
• Y = { 0, 1 }
• Sample space of an RV consists of Real Numbers
[Figure: real number line R marked at -1, 0, 1, 2, 3]
Discrete and Continuous Random Variables
• A “Discrete Random Variable (RV)”
• Takes discrete values
• Previous example (Note: There were no values for X or Y between 0 and 1)
• Number of students present in this class on a given day, number of questions in an exam paper

• A “Continuous Random Variable (RV)” can take any of the infinitely many values between any two numbers
• Height of the students of this class
• Diameter of a machine component

Probability Distribution of a Discrete RV
• Probability distribution of a discrete RV is all the possible (Value, Probability) pairs
• Also called probability mass function (PMF)
• Example: You have two machines in your factory. Probability of a machine working on a given day is 0.8.
• Random variable X represents number of machines working on a given day.
• What is the probability distribution of X?
• {(0, 0.04), (1, 0.32), (2, 0.64)}

X        Probability    Cumulative Probability
0        0.04           0.04
1        0.32           0.36
2        0.64           1
Total    1

[Figure: bar chart of the probability distribution of X, P(X) against X]
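The three (Value, Probability) pairs can be derived by listing the four outcomes of the sample space and mapping each to the number of working machines. The sketch below assumes the two machines work or fail independently of each other, which the slide does not state explicitly but which reproduces its numbers; the names are illustrative.

```python
from itertools import product

p_working = 0.8
states = {"Working": p_working, "Not-Working": 1 - p_working}

# Map every outcome in the sample space to X = number of working machines.
pmf = {}
for outcome in product(states, repeat=2):      # two machines
    probability = 1.0
    for state in outcome:
        probability *= states[state]           # independence assumption
    x = outcome.count("Working")
    pmf[x] = pmf.get(x, 0.0) + probability

# Cumulative distribution P(X <= x)
cumulative = {}
running_total = 0.0
for x in sorted(pmf):
    running_total += pmf[x]
    cumulative[x] = running_total

for x in sorted(pmf):
    print(f"X = {x}: P = {pmf[x]:.2f}, cumulative = {cumulative[x]:.2f}")
```

Two different outcomes ("Working, Not-Working" and "Not-Working, Working") both map to X = 1, which is the many-to-one mapping mentioned earlier.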
Cumulative Probability Distribution
• Cumulative probability distribution is the distribution of P(X<=x)
• What is the cumulative probability distribution of X (Previous Example)?

X        Probability    Cumulative Probability
0        0.04           0.04
1        0.32           0.36
2        0.64           1
Total    1

• What is P(X <= 0.5)?
• What is P(X < 3)?

[Figure: step chart of the cumulative probability distribution of X, P(X<=x) against X]
Expected Value of a Random Variable

• $E[X] = \mu = \sum_{i=1}^{N} x_i \, P(X = x_i)$
• N = total number of values
• $x_i$ = ith value
• $P(X = x_i)$ = probability of X taking value $x_i$
• Also referred to as mean by some of the books

i    x_i    P(X = x_i)    x_i P(X = x_i)
1    0      0.04          0
2    1      0.32          0.32
3    2      0.64          1.28
N = 3, $E[X] = \mu = \sum_{i=1}^{3} x_i \, P(X = x_i)$ = 1.6

[Figure: bar chart of the probability distribution of X with E[X] marked]
Variance and Standard Deviation
• Variance: the probability-weighted sum of the squared deviations of each value of the random variable from the expected value of the probability distribution.
• $\sigma^2 = \sum_{i=1}^{N} (x_i - E[X])^2 \, P(X = x_i)$
• N = total number of values, $x_i$ = ith value
• $P(X = x_i)$ = probability of X taking the value $x_i$

i     x_i   P(X = x_i)   (x_i - E[X])   (x_i - E[X])^2   (x_i - E[X])^2 P(X = x_i)
1     0     0.04         -1.6           2.56             0.1024
2     1     0.32         -0.6           0.36             0.1152
3     2     0.64          0.4           0.16             0.1024
N = 3, E[X] = 1.6        Variance = $\sigma^2 = \sum_{i=1}^{3} (x_i - E[X])^2 \, P(X = x_i)$ = 0.32
Standard deviation = σ = √(σ²) = √0.32 ≈ 0.566
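The expected value, variance and cumulative probabilities of the two-machine example can be checked with a short script. This is a minimal sketch, not part of the original slides; the function names `expected_value`, `variance` and `cumulative` are illustrative choices.

```python
import math

# PMF from the slides: X = number of machines working out of two
pmf = {0: 0.04, 1: 0.32, 2: 0.64}

def expected_value(pmf):
    """Probability-weighted sum of the values."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Probability-weighted sum of squared deviations from E[X]."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def cumulative(pmf):
    """Running total of probabilities, i.e. P(X <= x) for each value x."""
    total, cdf = 0.0, {}
    for x in sorted(pmf):
        total += pmf[x]
        cdf[x] = total
    return cdf

mean = expected_value(pmf)
var = variance(pmf)
std = math.sqrt(var)
cdf = cumulative(pmf)
print(f"E[X] = {mean:.2f}, variance = {var:.2f}, standard deviation = {std:.3f}")
print({x: round(p, 2) for x, p in cdf.items()})
```

The same three functions work for any discrete distribution given as a value-to-probability dictionary.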
BITS Pilani, Pilani Campus
Parametric Probability Distributions (Models)
• Why?
• If you have 100 machines, what is the probability that more than 20 machines
would not work on a given day?
• What is the expected value of “Machines not working” on a given day. What is the
variance?

• Many real life situations are similar


• For a well defined problem that meets certain criteria
• Parametric probability models help us quickly calculate the probabilities,
variance, expected value etc. using a generic formula

12
BITS Pilani, Pilani Campus
BITS Pilani
Pilani Campus

Binomial distribution
Binomial Distribution
• Binomial Random Variable
• A random variable representing number of successful events in n trials
• Properties of a binomial distribution
– The sample has fixed number of observations: n
– Each observation consists of the possibility of two mutually exclusive and
collectively exhaustive outcomes
– Probability of the event of interest is constant across observations. So, P
(Event) is fixed
– Value of any observation is independent of any other observation

14
BITS Pilani, Pilani Campus
Binomial Distribution
For toss of a coin experiment, the Random Variable X represents number of heads in
10 trials
Is X a Binomial RV?
• Does X represent number of successful events in n trials?
• Does the sample have fixed number of observations: n?
• Does each observation consist of the possibility of “two” mutually exclusive and collectively
exhaustive outcomes?
• Is the probability of the event of interest constant across observations?
• Is the value of any observation independent of any other observation?
• X is a Binomial RV

15
BITS Pilani, Pilani Campus
Binomial Distribution
• If a random variable X is binomially distributed, then
• P(X = x | n, π)
– Probability of x events occurring, given the parameters
– n: total number of trials/observations and
– π: probability of the event, or probability of success
• $P(X = x \mid n, \pi) = C^n_x \, \pi^x (1 - \pi)^{n - x}$
• $C^n_x = \dfrac{n!}{x!\,(n - x)!}$

• E[X] = nπ
• Variance of a binomial distribution: σ² = nπ(1 - π)

16
BITS Pilani, Pilani Campus
Binomial Distribution
• Probability distribution of ‘x’ machines failing out of total ‘10’
• Or ‘x’ heads in 10 trials of a coin
• For different values of probability of the event (π)
[Figure: Binomial probability distributions for n = 10 with π = 0.2, π = 0.5 and π = 0.8 (x-axis: number of events; y-axis: probability)]

17
BITS Pilani, Pilani Campus
Binomial Distribution

• Binomial Random Variable


• Probability that exactly 2 machines out of 4 will fail today, given
probability of a machine failing on any given day is 0.2
• $P(X = x \mid n, \pi) = C^n_x \, \pi^x (1 - \pi)^{n - x}$
• n, x, π?
• 4, 2 and 0.2
• $P(X = 2 \mid 4, 0.2) = \dfrac{4!}{2!\,2!} (0.2)^2 (0.8)^2 = 6 \times 0.04 \times 0.64 = 0.1536$
• E[X] = nπ
• 4 × 0.2 = 0.8
Excel formula: BINOM.DIST(x, n, π, FALSE)
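As a cross-check outside the spreadsheet, the binomial formula can be evaluated directly. The sketch below is an illustration rather than part of the slides; `binom_pmf` is a hypothetical helper name, and it uses Python's built-in `math.comb` for the combination term, where C(4, 2) = 6 gives 6 × 0.04 × 0.64 = 0.1536.

```python
from math import comb

def binom_pmf(x, n, pi):
    """P(X = x | n, pi) = C(n, x) * pi^x * (1 - pi)^(n - x)."""
    return comb(n, x) * pi ** x * (1 - pi) ** (n - x)

# Exactly 2 of 4 machines fail, each failing with probability 0.2
p_two_fail = binom_pmf(2, 4, 0.2)
expected_failures = 4 * 0.2  # E[X] = n * pi

print(f"P(X = 2) = {p_two_fail:.4f}, E[X] = {expected_failures:.1f}")
```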
18
BITS Pilani, Pilani Campus
BITS Pilani
Pilani Campus

An application of Binomial distribution


Binomial Distribution

• Binomial Random Variable
• You have 50 machines in your factory; the probability of failure of a machine on a given day is 0.2
• One mechanic can fix a maximum of 5 machines on the day they go out of order
• You want at least a 95% chance that all the out-of-order machines are fixed the same day
• How many mechanics should you hire?
• What if you want to be 100% confident?

n = 50, π = 0.2, capacity of one mechanic = 5

Number of events   P(x)     Cumulative Prob   Mechanics Required
0                  0.0000   0.0000            0
1                  0.0002   0.0002            0.2
2                  0.0011   0.0013            0.4
3                  0.0044   0.0057            0.6
4                  0.0128   0.0185            0.8
5                  0.0295   0.0480            1
6                  0.0554   0.1034            1.2
7                  0.0870   0.1904            1.4
8                  0.1169   0.3073            1.6
9                  0.1364   0.4437            1.8
10                 0.1398   0.5836            2
11                 0.1271   0.7107            2.2
12                 0.1033   0.8139            2.4
13                 0.0755   0.8894            2.6
14                 0.0499   0.9393            2.8
15                 0.0299   0.9692            3

Excel formula: BINOM.DIST(x, n, π, FALSE) for P(x), or BINOM.DIST(x, n, π, TRUE) for the cumulative probability
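The staffing question amounts to finding the smallest number of failures x whose cumulative probability reaches the target, then dividing by the capacity of one mechanic and rounding up. The following is a minimal sketch under that reading; `binom_pmf` and `mechanics_needed` are illustrative names, not part of the course material.

```python
from math import ceil, comb

def binom_pmf(x, n, pi):
    """P(X = x | n, pi) for a binomial random variable."""
    return comb(n, x) * pi ** x * (1 - pi) ** (n - x)

def mechanics_needed(n, pi, capacity, confidence):
    """Smallest whole number of mechanics that covers the failures
    with at least the requested cumulative probability."""
    cumulative = 0.0
    for x in range(n + 1):
        cumulative += binom_pmf(x, n, pi)
        if cumulative >= confidence:
            return ceil(x / capacity)
    return ceil(n / capacity)

n_machines, p_fail, per_mechanic = 50, 0.2, 5
for_95_percent = mechanics_needed(n_machines, p_fail, per_mechanic, 0.95)
# Full certainty means covering the case where every machine fails
for_certainty = ceil(n_machines / per_mechanic)

print(for_95_percent, for_certainty)
```

The 100% case is computed separately on purpose: floating-point cumulative sums are not a reliable way to detect a probability of exactly 1.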


20
BITS Pilani, Pilani Campus
Binomial Distribution

• Binomial Random Variable (n = 50, π = 0.2; same table of P(x) and cumulative probabilities as on the previous slide)
• You have 50 machines in your factory; the probability of failure of a machine on a given day is 0.2
• How many machines do you expect to be out of order on any given day?
• E[X] = nπ = 50 × 0.2 = 10
• What is the chance that fewer than 5 machines would be out of order?
• P(X <= 4) = 0.0185
• What is the chance that exactly 5 or exactly 10 machines would be out of order?
• P(X = 5) + P(X = 10) = 0.0295 + 0.1398 = 0.1693
BITS Pilani, Pilani Campus
Other examples of binomial distributions

• For a shop floor manager, x employees calling in sick today out of a total of n
• For a rental company, x cars expected to be in good condition out of a total of n
• Expected number of defective parts out of a total of n coming off the assembly line
• Out of n customers approached, the number x expected to buy the product

Excel formula: BINOM.DIST(x, n, π, FALSE) or BINOM.DIST(x, n, π, TRUE)


22
BITS Pilani, Pilani Campus
BITS Pilani
Pilani Campus

Poisson distribution
Poisson Distribution

• Poisson Random Variable
– Count of the events of interest occurring in an “Area of Opportunity”
– Area of opportunity
• A certain time period, space, length etc.
• Examples of the “count of events” and “area of opportunity”
• Number of cars arriving at a toll booth in a 1-minute period
• Number of defects per square inch in a product
• Number of calls arriving at a call center in a 2-minute interval
• Number of car crashes in a month on the Mumbai Pune Highway

24
BITS Pilani, Pilani Campus
Poisson Distribution

Requirements
– Probability of the event is same for all the areas of opportunity
– Number of events occurring in one area of opportunity is independent of
number of events occurring in other areas of opportunity
– The probability that two or more events will occur in an area of opportunity
approaches zero as the area of opportunity becomes extremely small
– Examples
• Number of customers arriving each minute during the lunch hour in a bank, number of
defects per sq inch area etc.
– Has one parameter λ (Lambda): Expected number of events per unit

25
BITS Pilani, Pilani Campus
Examples

Is the following an example of a Poisson distribution?


– Number of customers arriving at a bank branch every minute during lunch hour
Requirements
– Is the probability of the event same for all the areas of opportunity?
– Are number of events occurring in one area of opportunity independent of
number of events occurring in other areas of opportunity?
– Does the probability of two or more events occurring in an area of opportunity
approach zero as the area of opportunity becomes extremely small?
– The problem can be modelled as Poisson distribution

26
BITS Pilani, Pilani Campus
Examples

Is the following an example of a Poisson distribution?

– Number of customers arriving at a bank branch every hour during the day time

Requirements
– Is the probability of the event the same for all the areas of opportunity?
– Not a good example of a Poisson distribution: the arrival rate is unlikely to be the same for every hour of the day

27
BITS Pilani, Pilani Campus
Poisson Distribution: Probability, Expectation and Variance

Poisson distribution model

• $P(X = x \mid \lambda) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$
– P(X = x | λ) = probability that X will take the value x in a given area of opportunity
– λ: Poisson distribution parameter, the expected value (mean) of X
– e: base of the natural logarithm (~2.718)
– x: number of events; can take any non-negative integer value, i.e. 0, 1, 2, …, ∞
– E[X] = λ
– Variance = λ

28
BITS Pilani, Pilani Campus
Examples

[Figure: Poisson probability distributions for λ = 2 and λ = 8 (x-axis: number of events, 0 to 20; y-axis: probability)]

29
BITS Pilani, Pilani Campus
What is the Probability distribution?

1. Avg. no. of accidents = 4/day (occurrence rate of accidents per day). Probability of 0, 1, 2, 3, 4, 5… accidents?
2. Avg. no. of potholes = 6/km (occurrence rate of potholes per km). Probability of 0, 1, 2, 3, 4, 5… potholes?
3. Avg. no. of goals = 3.2/match. Probability of 0, 1, 2, 3, 4, 5… goals?
4. Avg. no. of typos = 2.7/page. Probability of 0, 1, 2, 3, 4, 5… typos?
5. Avg. no. of teeth cavities = 3.28/patient. Probability of 0, 1, 2, 3, 4, 5… teeth cavities?

Solution of 1 above for the Poisson distribution (λ = 4):

x      Probability
0      0.0183
1      0.0733
2      0.1465
3      0.1954
4      0.1954
5      0.1563
6      0.1042
7      0.0595
8      0.0298
9      0.0132
10     0.0053
Sum    ≈ 1.0 (0.9972 for x = 0 to 10; the remainder is P(X > 10))
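The table for question 1 can be regenerated from the Poisson formula, and the same function answers questions 2 to 5 by changing λ. This is a small sketch for illustration; `poisson_pmf` is a hypothetical helper name.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x | lambda) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam ** x / factorial(x)

# Question 1: accidents occur at an average rate of 4 per day
accident_table = {x: poisson_pmf(x, 4) for x in range(11)}
for x, p in accident_table.items():
    print(f"{x:>2}  {p:.4f}")

# Probability mass covered by x = 0..10; the rest is P(X > 10)
covered = sum(accident_table.values())
print(f"Sum for x = 0..10: {covered:.4f}")
```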
30
BITS Pilani, Pilani Campus
BITS Pilani
Pilani Campus

An application of Poisson distribution


Ambulance service decision

• The average number of accidents reported in the factory is 4.2/month.
• Assume the distribution of accidents is Poisson.
• The manager of the factory has invited bids for an ambulance service to take injured workers to a nearby hospital. Bids submitted by three service providers are as follows:
• AlwaysFast services: Rs 3,000/case. (You can directly use the expected value formula here)
• BestAndFast services: Rs 4,000/case up to 3 cases in a month and Rs 2,000/case if cases in a month exceed 3.
• CarryFast services: A fixed amount of Rs 8,000/month and Rs 3,000/case if cases in a month exceed 5.

Which bid is the most economical?

Mean number of accidents (λ) = 4.2; CarryFast fixed cost = Rs 8,000

Number of                AlwaysFast              BestAndFast             CarryFast
Accidents   P(x)         Rate   Expected Cost    Rate   Expected Cost    Rate   Expected Cost
0           0.014996     3000       0.00         4000       0.00         0         0.00
1           0.062981     3000     188.94         4000     251.93         0         0.00
2           0.132261     3000     793.57         4000    1058.09         0         0.00
3           0.185165     3000    1666.49         4000    2221.98         0         0.00
4           0.194424     3000    2333.08         2000    1555.39         0         0.00
5           0.163316     3000    2449.74         2000    1633.16         0         0.00
6           0.114321     3000    2057.78         2000    1371.85         3000   2057.78
7           0.068593     3000    1440.45         2000     960.30         3000   1440.45
8           0.036011     3000     864.27         2000     576.18         3000    864.27
9           0.016805     3000     453.74         2000     302.49         3000    453.74
10          0.007058     3000     211.75         2000     141.16         3000    211.75
11          0.002695     3000      88.93         2000      59.29         3000     88.93
12          0.000943     3000      33.96         2000      22.64         3000     33.96
13          0.000305     3000      11.88         2000       7.92         3000     11.88
14          9.14E-05     3000       3.84         2000       2.56         3000      3.84
15          2.56E-05     3000       1.15         2000       0.77         3000      1.15

Variable (Exp)                   12599.57               10165.71                 5167.75
Fixed Cost                           0                      0                    8000
Total                            12599.57               10165.71                13167.75

Excel formula: POISSON.DIST(x, mean, FALSE)
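The bid comparison can be reproduced by weighting each provider's monthly charge by the Poisson probability of that many accidents. The sketch below is illustrative only. It assumes, as the expected-cost columns of the table do, that a provider's per-case rate applies to all cases in the month once the threshold is crossed, and it truncates the sum at 15 accidents like the table. The function names are hypothetical.

```python
from math import exp, factorial

LAMBDA = 4.2         # mean accidents per month (from the slide)
MAX_ACCIDENTS = 15   # the slide's table stops at 15

def poisson_pmf(x, lam):
    return exp(-lam) * lam ** x / factorial(x)

# Variable (per-case) charges; assumption: the rate applies to every case in the month
def always_fast(x):
    return 3000 * x

def best_and_fast(x):
    return (4000 if x <= 3 else 2000) * x

def carry_fast_variable(x):
    return 3000 * x if x > 5 else 0

def expected_cost(charge):
    """Sum of charge(x) * P(X = x) over the tabulated range."""
    return sum(charge(x) * poisson_pmf(x, LAMBDA) for x in range(MAX_ACCIDENTS + 1))

totals = {
    "AlwaysFast": expected_cost(always_fast),
    "BestAndFast": expected_cost(best_and_fast),
    "CarryFast": 8000 + expected_cost(carry_fast_variable),  # fixed fee added separately
}
cheapest = min(totals, key=totals.get)

for name, cost in totals.items():
    print(f"{name:<12} Rs {cost:,.2f}")
print("Most economical:", cheapest)
```

Under these assumptions the lowest expected monthly cost comes from BestAndFast, in line with the totals row of the table.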


BITS Pilani, Pilani Campus
Q&A

33
Quantitative Methods

Lecture-7
BITS Pilani
Pilani Campus

1
BITS Pilani
Pilani Campus

Normal Distribution
(Ch 6 Business Statistics, Levine et al.)
So far and the next

✓ Previous Session
√ Discrete Probability Distributions
√ Probability mass function
➢ Today
➢ Continuous and Normal Distributions

3
BITS Pilani, Pilani Campus
Continuous Random Variables
• A “Continuous Random Variable (RV)” can take any
(infinite values) between any two numbers
• Height/marks of the students of this class
• Diameter of a machine component
• Point probability is always zero (Why?)
• Probabilities are defined for finite intervals (small or large)
• Probability Distribution of a continuous variable
• It’s a mathematical function
• Probability Density Function (PDF): $f_X(x)$
• The y-axis is not probability
• It represents the values the PDF takes
• Cumulative Probability Distribution Function (CDF): $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
BITS Pilani, Pilani Campus
Calculating probabilities
• Probability is defined as the area under the PDF curve between two points
• $P(A \le X \le B) = \int_{A}^{B} f_X(x)\,dx$
• Easier to calculate using the CDF
• $P(A \le X \le B) = F_X(B) - F_X(A)$

[Figure: Uniform PDF]
BITS Pilani, Pilani Campus
Normal Distribution
• Also called the Gaussian distribution
• Probability Density Function (PDF)
• $f(X \mid \mu, \sigma) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{1}{2}\left(\frac{X - \mu}{\sigma}\right)^2}$
• µ = mean/expected value; σ² = variance
• X ~ N(µ, σ²): X is normally distributed with parameters µ and σ²
• Cumulative Probability Distribution Function (CDF):
• $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$

[Figure: Probability Density Function and CDF of a normal distribution (x-axis: Marks, 0 to 100)]
BITS Pilani, Pilani Campus
Normal Distribution
• Two parameters define a normal distribution: µ and σ2
• Higher σ increases the spread, change in µ shifts the curve

Image source: https://en.wikipedia.org/wiki/Normal_distribution


7
BITS Pilani, Pilani Campus
Calculating Normal Distribution Probabilities

• P(X >= A and X <= B)?
• Ex: If X is normally distributed with mean 50 and variance 100, i.e. X ~ N(50, 100)
• Then what is the value of P(X >= 35 and X <= 50)? P(X <= 50)?
• From the Probability Density Function (PDF):
• Area under the curve
• An easier way is to use the Cumulative Distribution Function (CDF):
• P(X >= 35 and X <= 50) = $F_X(50) - F_X(35)$

[Figure: CDF of X (x-axis: 0 to 100) with $F_X(35)$ = P(X <= 35) and $F_X(50)$ marked; their difference is P(X >= 35 and X <= 50)]
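The CDF-difference rule can be tried out with Python's standard library, whose `statistics.NormalDist` class takes the mean and the standard deviation (not the variance). A minimal sketch for the X ~ N(50, 100) example, so σ = 10:

```python
from statistics import NormalDist

# X ~ N(50, 100): mean 50, variance 100, hence standard deviation 10
marks = NormalDist(mu=50, sigma=10)

p_le_50 = marks.cdf(50)                       # P(X <= 50)
p_35_to_50 = marks.cdf(50) - marks.cdf(35)    # F(50) - F(35)

print(f"P(X <= 50) = {p_le_50:.4f}")
print(f"P(35 <= X <= 50) = {p_35_to_50:.4f}")
```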
BITS Pilani, Pilani Campus
Standardized Normal Distribution
• The distribution of Z values is called the Standardized Normal Distribution
• $Z_i = \dfrac{x_i - \mu}{\sigma}$
• Z ~ N(0, 1), µ = 0 and σ² = 1
• X-axis for a normal distribution:
• Values of the random variable X
• X-axis for the standardized normal distribution:
• Distance from the mean, measured in standard deviations
• For all normal RVs, the standardized normal distribution is the same, Z ~ N(0, 1) (Why?)

[Figure: Normal PDF (µ = 50, σ = 10; x-axis: Marks, 0 to 100) and Standardized Normal PDF (µ = 0, σ = 1; x-axis: Z, -5 to 5)]
BITS Pilani, Pilani Campus
Normal Distribution

• Other names- Gaussian, Bell-shaped curve, and Law of error.


• It is a continuous distribution- fractional values like 1.23, 6.66, etc. (distance, temperature) are allowed
on x-axis.
• The curve ranges from –infinity to +infinity on x-axis.
• Area under the curve represents probability.
• Total area under the curve is 1, that is, probability of all the events=1.
• Normal distribution curve is symmetric around Mean, and for a Normal distribution, Mean= Median=
Mode.
• Normal distribution is described by only two parameters- mean (μ) and standard deviation (σ). For
each mean and standard deviation, there is a separate curve.
• Normal curve with mean, μ= 0 and standard deviation, σ = 1 is called Z curve, or Standardized Normal
distribution.
• Equation of the normal distribution curve: $f(x) = \dfrac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$
• Since the above equation is not easy to work with, tables have been made with the area (probability) under the curve.
BITS Pilani, Pilani Campus
Calculating Normal probabilities
Example scenario
• Assuming students marks are normally distributed with µ=50 and σ=10
• X ~ N (50, 100)
• What is the probability of a student receiving 45 or less marks?
• Or proportion of the students receiving 45 or less marks [P(X<=45)]
• Steps
• Calculate Z values for cut off points
• Find the cumulative probability up to the cut-off points from the z cumulative
distribution table (standardized normal cumulative probability table)
• Use these probability values to solve the questions
• You can use this approach to solve any normal probability question
11
BITS Pilani, Pilani Campus
Calculating Normal probabilities
• RV: X ~ N(µ = 50, σ = 10)
• P(X <= 45.5)? = P(X < 45.5) + P(X = 45.5) = P(X < 45.5), because the point probability P(X = 45.5) is zero
• What % of the students have scored 45.5 or less?
• Steps
• Calculate the Z value of the value of interest
• (45.5 - 50)/10 = -0.45
• Find the cumulative probability up to that value from the Z table
• 0.3264 or 32.64%

[Figure: Standardized normal PDF and CDF (x-axis: Z, -5 to 5)]
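The two steps (standardize, then look up the cumulative probability) can be mirrored in code, with `statistics.NormalDist()` standing in for the printed Z table. This is a sketch for checking table look-ups, not a replacement for the table in the exam; `z_score` is an illustrative helper name.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Distance of x from the mean, in standard deviations."""
    return (x - mu) / sigma

standard_normal = NormalDist()        # Z ~ N(0, 1)

z = z_score(45.5, mu=50, sigma=10)    # step 1: standardize
p_via_z = standard_normal.cdf(z)      # step 2: cumulative probability for Z

# The same probability computed directly on the original scale
p_direct = NormalDist(mu=50, sigma=10).cdf(45.5)

print(f"Z = {z:.2f}, P(X <= 45.5) = {p_via_z:.4f}")
```

Both routes give the same number, which is why a single Z table serves every normal distribution.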
BITS Pilani, Pilani Campus
Calculating Normal probabilities
• If X ~ N(μ = 50, σ = 10), then P(X >= 45 and X <= 55)?
• What % of students have scores between 45 and 55?
• Steps
• Calculate the Z values of the values of interest
• P(X <= 45) and P(X <= 55), Z for 45 and 55?
• (45 - 50)/10 = -0.5 and (55 - 50)/10 = 0.5
• Find the cumulative probabilities up to the values of interest from the Z table
• P(Z <= 0.5)? P(Z <= -0.5)?
• P(Z <= 0.5) - P(Z <= -0.5) = 0.6915 - 0.3085 = 0.3830 or 38.30%

[Figure: Standardized normal PDF and Z-CDF with the area between Z = -0.5 and Z = 0.5 marked (x-axis: Z, -5 to 5)]
BITS Pilani, Pilani Campus
Calculating Normal probabilities
• If X ~ N(μ = 50, σ = 10), what % of the students have scored more than 55 marks?
• Steps
• Calculate the Z value of the value of interest (Z for X = 55?)
• Z = (55 - 50)/10 = 0.5
• Which of the shaded areas is P(Z >= 0.5)?
• P(Z >= 0.5) = 1 - P(Z <= 0.5), why?
• P(Z >= 0.5) = P(Z <= -0.5), why?
• Using the symmetry of the distribution
• Cumulative probability values for Z = -0.5? and Z = 0.5?
• Is P(Z <= -0.5) = 1 - P(Z <= 0.5)?
• 1 - 0.6915 = 0.3085

[Figure: Standardized normal PDF and Z-CDF with the tails beyond Z = -0.5 and Z = 0.5 shaded (x-axis: Z, -5 to 5)]
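The interval, upper-tail and symmetry results for Z = ±0.5 can be confirmed numerically. A minimal sketch using the standard library's `statistics.NormalDist` in place of the Z table:

```python
from statistics import NormalDist

z = NormalDist()  # standardized normal, Z ~ N(0, 1)

# Scores between 45 and 55 for X ~ N(50, 10) correspond to -0.5 <= Z <= 0.5
p_between = z.cdf(0.5) - z.cdf(-0.5)

# Scores above 55: the upper tail beyond Z = 0.5
p_above = 1 - z.cdf(0.5)

# Symmetry: the lower tail below -0.5 has the same area as the upper tail
p_lower_tail = z.cdf(-0.5)

print(f"P(-0.5 <= Z <= 0.5) = {p_between:.4f}")
print(f"P(Z >= 0.5) = {p_above:.4f}, P(Z <= -0.5) = {p_lower_tail:.4f}")
```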
BITS Pilani, Pilani Campus
How questions may be asked
• Given RV: X ~ N (µ, σ2)
• Probabilities for X between A and B [A and B can take any values between -∞, ∞]
• Proportions or values between a certain range
• Given proportions within ranges
• Do they represent normal distribution?
• Find whether proportions match with standard normal proportions under the curve

• Which interval will contain x% values [Top, bottom or around the mean]
• You want to award A grade to top 5% of the students, what will be the marks cut-off
• You want to promote top 5% or let-go bottom 5% of the performance score employees
• You want to report “95% range around the Mean”
• µ ± 1.96σ
• Using the distribution parameters, X values can be converted to Z values, and vice versa
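Cut-off questions run the Z table in reverse: start from a cumulative probability, find Z, then convert back with X = µ + Zσ. The sketch below assumes the marks distribution used elsewhere in these slides (µ = 50, σ = 10) purely as an illustration; `inv_cdf` is the standard library's inverse cumulative distribution function.

```python
from statistics import NormalDist

mu, sigma = 50, 10            # assumed marks distribution, for illustration
z = NormalDist()              # Z ~ N(0, 1)

# Top 5% of students: the cut-off has 95% of the area below it
z_top5 = z.inv_cdf(0.95)
a_grade_cutoff = mu + z_top5 * sigma

# Bottom 5%: by symmetry the Z value is the negative of the one above
bottom5_cutoff = mu + z.inv_cdf(0.05) * sigma

# Central 95% range around the mean: 2.5% in each tail
z_central = z.inv_cdf(0.975)
range_95 = (mu - z_central * sigma, mu + z_central * sigma)

print(f"A-grade cut-off: {a_grade_cutoff:.2f}")
print(f"Bottom 5% cut-off: {bottom5_cutoff:.2f}")
print(f"95% range: {range_95[0]:.1f} to {range_95[1]:.1f}")
```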
15
BITS Pilani, Pilani Campus
How questions may be asked
• Steps to make life easy:
• Always visualize the bell-shaped PDF
• Visualize the shaded areas of interest
• Find the cut-off z-values of interest
• Find the cumulative probabilities for the z-values of interest (from the Z cumulative probability table)
• That is, the area under the curve up to those points from -∞
• Work with these values to find the answer

[Figure: Shaded areas under the bell curve for X <= A, X >= A, and X between A and B]

16
BITS Pilani, Pilani Campus
Some key normal distribution probabilities
• For X ~N (μ, σ2)

• Values within 1, 2 and 3 standard deviations


• 68%: About 68% of values are within one standard deviation of the mean [μ - σ <= X <= μ + σ]
• What are the Z values of interest here?
• Z = -1 and Z = +1
• 95.5%: About 95.5% of values are within two standard deviations of the mean [μ - 2σ <= X <= μ + 2σ]
• What are the Z values of interest here?
• 99.7%: About 99.7% of values are within three standard deviations of the mean [μ - 3σ <= X <= μ + 3σ]
• Practically, most of the distribution is contained within 3 standard deviations of the mean

Image source: https://en.wikipedia.org/wiki/Normal_distribution
BITS Pilani, Pilani Campus
Some key normal distribution probabilities

• 90%, 95% and 99% intervals around the mean value


• 90% of values fall within 1.65 standard deviations of the mean [-1.65 <= Z <= 1.65]
• 95% of values fall within 1.96 standard deviations of the mean [-1.96 <= Z <= 1.96]
• 99% of values fall within 2.58 standard deviations of the mean [-2.58 <= Z <= 2.58]

Image source: https://en.wikipedia.org/wiki/Normal_distribution

Important homework: Verify all the statements on this and the previous slide using the Z cumulative probability table
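After doing the homework with the printed table, the answers can be double-checked in a few lines. This sketch computes the central area P(-k <= Z <= k) for each k mentioned on the two slides; `central_probability` is an illustrative name.

```python
from statistics import NormalDist

z = NormalDist()  # Z ~ N(0, 1)

def central_probability(k):
    """Area under the standard normal curve between -k and +k."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3, 1.65, 1.96, 2.58):
    print(f"within {k} standard deviations: {central_probability(k):.4f}")
```

The values for 1.65 and 2.58 come out slightly above 90% and 99% because those Z values are rounded.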
18
BITS Pilani, Pilani Campus
Q&A

19
Quantitative Methods

Lecture-8
BITS Pilani
Pilani Campus

1
BITS Pilani
Pilani Campus

Survey Errors and Sampling Distributions


(Ch 1 & Ch 7 Business Statistics, Levine et al.)
So far and the next

✓ Previous Sessions
√ Discrete Probability Distributions
√ Probability mass function (PMF) and Cumulative probability distribution
√ Continuous and Normal Distributions
√ Probability Density Function (PDF) and Cumulative Distribution Function (CDF)
√ Probabilities as area under the PDF curve (region of interest).
√ Use of cumulative Z tables to calculate normal probabilities
√ One sided Z table (+Ve or –Ve values) is sufficient as the distribution is symmetric

➢ Today
➢ Survey errors
➢ Sampling distributions
3
BITS Pilani, Pilani Campus
Setting the context for the next few sessions

• You want to understand characteristics of the population


• Proportion of votes party-A will get (exit polls)
• Mean fill volume in a bottle of your coconut-water brand
• Proportion of the market segment that prefers your brand
• However, practically, you can only work with small samples
• You would need to make inferences about the population from the sample
data
• The core concept is the probability distribution and
• Calculating the probability of a random variable lying within a certain range
• In the previous session, we learned this in detail for a normally distributed random variable

4
BITS Pilani, Pilani Campus
BITS Pilani
Pilani Campus

Surveys and Survey Errors


(Ch 1 Business Statistics, Levine et al.)
Census and Sampling

Census
▪ Entire population (population, tiger, agriculture, health facilities).
Sampling
▪ A portion of the population.
▪ Quality, voting, blood, soil, customer surveys, voice, interviews, ….

Why samples?
▪ Quicker.
▪ Cheaper.
▪ Some members of the population may not participate or may not be available.
▪ When tests are destructive.
▪ Scientifically chosen samples can give very good accuracy about the properties of the population.

[Figure: A population and a sample drawn from it, illustrated with numeric values (43, 54, 38, 22) and Yes/No responses]

6
BITS Pilani, Pilani Campus
Sampling methods

7
BITS Pilani, Pilani Campus
Objective of the sampling

➢ Main objective
➢ Estimating properties of the population
➢ Sample properties are a way to estimate the parameters of the
population
➢ Mean
➢ What is the mean life of a battery from brand-A?
➢ Proportions
➢ What proportion of the population prefers my product over the competitor’s?
➢ Only a sample drawn by probability sampling can be used for
estimating population parameters
8
BITS Pilani, Pilani Campus
Probability sampling

Simple random sampling
– p = 1/12 from population
Stratified sampling
– Proportional to strata; each stratum is homogeneous
Cluster sampling
– Each cluster is heterogeneous
Systematic sampling
– Every nth item/bottle
BITS Pilani, Pilani Campus


Survey errors
• Coverage error
• A portion of the population is excluded from the sampling frame

• Nonresponse error
• Some from the chosen sample may not respond

• Measurement error
• Approximate scales
• Psychological scales give approximate values
• Attitudes, strength of beliefs etc.
• Bad or leading question
• Do you like color and taste of the candy?
• Yes / No

• Sampling error
• You get a different sample each time you draw a sample
• Sample to sample random differences
BITS Pilani, Pilani Campus
BITS Pilani
Pilani Campus

Sampling Distributions
(Ch 7 Business Statistics, Levine et al.)
Sampling Distributions

– Sampling Distribution of the Mean
• Population: Normally Distributed
• Population: Not Normally Distributed
– Sampling Distribution of the Proportion
BITS Pilani, Pilani Campus
Population parameter(s) and Sample statistic(s)

• Population parameters
• Population mean (μ), standard deviation (σ)
• Population proportion (π)
• Sample statistics
• Mean: sample mean (X̄), standard deviation (S)
• Sample proportion (p)

Characteristic        Population    Sample
Mean                  µ             X̄
Variance              σ²            S²
Standard deviation    σ             S
Proportion            π             p
Size                  N             n
Item                  x_i           X_i

Characteristic        Parameter (population)         Statistic (sample)
Mean                  µ = (1/N) ∑ x_i                X̄ = (1/n) ∑ X_i
Variance              σ² = (1/N) ∑ (x_i - µ)²        S² = (1/(n-1)) ∑ (X_i - X̄)²
Standard deviation    σ = +√(σ²)                     S = +√(S²)

For the population variance, divide by N. For the sample variance, divide by n-1.
BITS Pilani, Pilani Campus
Population >> Samples >> Sampling Distribution

• Each time you draw a random sample from a population
• You get a different sample with different characteristics (mean, standard deviation etc.)
• The distribution of a sample statistic (X̄ or p) over all the possible samples from a population is called its sampling distribution
• Sampling distribution of the Mean (X̄)
• Sampling distribution of the Proportion (p)
• An easier way to understand
• Consider the sample statistic of interest (X̄, for example) as a random variable
• The PDF or PMF of that RV is called its sampling distribution

14
BITS Pilani, Pilani Campus
Population >> Samples >> Sampling Distribution

[Figure: Population (random variable X, e.g. age, height, weight; X ~ N(µ, σ²)) → samples of size n with means X̄_1, X̄_2, …, X̄_m and standard deviations S_1, S_2, …, S_m → sampling distribution of the Mean, X̄ ~ N(µ, σ_X̄²)]
BITS Pilani, Pilani Campus
Population >> Samples >> Sampling Distribution

[Figure: Population of size N with proportion π (e.g. proportion of females, of voters supporting candidate A, of products containing faults) → samples of size n with proportions p_1, p_2, …, p_m → sampling distribution of the Proportion (random variable p), with mean π and standard error $\sigma_p = \sqrt{\pi(1 - \pi)/n}$, assuming nπ and n(1 - π) are both >= 5]
BITS Pilani, Pilani Campus
Sampling Distribution of the Mean

• Unbiased property of the Sample Mean


• If population mean is μ
• Mean of the Sampling Distribution of the Mean is also μ
• Standard error:
• Standard deviation of the sampling distribution of the Mean
• Denoted by $\sigma_{\bar{X}}$
• $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$; n: sample size, σ: population standard deviation

17
BITS Pilani, Pilani Campus
Sampling Distribution of the Mean

[Figure: Population (random variable X, e.g. age, height, weight; X ~ N(µ, σ²)) → samples of size n with means X̄_1, X̄_2, …, X̄_m and standard deviations S_1, S_2, …, S_m → sampling distribution of the Mean, X̄ ~ N(µ, σ_X̄²) with $\sigma_{\bar{X}} = \sigma/\sqrt{n}$]
BITS Pilani, Pilani Campus
Sampling from a normally distributed population

• If the population is normally distributed, X ~ N(µ, σ²)
• Then the Sampling Distribution of the Mean will always be normally distributed, irrespective of the sample size n
• X̄ ~ N(µ, σ_X̄²)
• The standard error (σ_X̄) will depend on the sample size n
• $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
BITS Pilani, Pilani Campus
Z value for Sampling Distribution of the Mean

• $Z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
• $\bar{X} = \mu + Z\,\sigma_{\bar{X}} = \mu + Z\,\dfrac{\sigma}{\sqrt{n}}$
• Once you have the Z values of interest, the rest of the inference is similar to how we worked with normal distribution questions
• Z values can be positive or negative depending on which side of the mean you are looking at
BITS Pilani, Pilani Campus
Example
• A coconut water bottle filling machine is tuned for a normally distributed fill quantity, with µ of 1000 ml and σ of 100 ml
• For a quality check, a sample of 100 bottles is drawn and the sample mean is calculated. The machine must be stopped and retuned if the sample mean is outside the central 95% of the sampling distribution of the Mean (i.e. it lies in the left or right tail beyond this range)
• You found the sample mean to be 980 ml; should you halt the machine?
• µ = 1000, σ = 100, n = 100
• Standard error of the Mean: $\sigma_{\bar{X}} = \sigma/\sqrt{n}$ = ?
• 10
• Sampling distribution of the Mean ~ N(µ = 1000, σ_X̄ = 10)
• Value of interest, Z value?
• 980, -2
• Which cumulative probability entry will you look up in the Z table?
• 0.025 or 0.975
• What is your lower halt cut-off in ml?
• $Z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$, so $-1.96 = \dfrac{\bar{X} - 1000}{10}$
• X̄ = -19.6 + 1000 = 980.4 ml
• What is the upper cut-off in ml?
• Should you halt the process?
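The whole quality-check decision can be traced step by step: standard error, Z value of the observed mean, cut-offs for the central 95%, and the halt decision. The sketch below is one possible way to lay it out; the variable names are illustrative, and `inv_cdf(0.975)` plays the role of looking up 0.975 in the Z table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1000, 100, 100      # fill quantity parameters and sample size
sample_mean = 980                  # observed mean of the 100 bottles

standard_error = sigma / sqrt(n)                   # sigma / sqrt(n)
z_observed = (sample_mean - mu) / standard_error   # Z value of the observed mean

z_critical = NormalDist().inv_cdf(0.975)           # about 1.96 for the central 95%
lower_cutoff = mu - z_critical * standard_error
upper_cutoff = mu + z_critical * standard_error

# Halt when the sample mean falls in either tail
halt = not (lower_cutoff <= sample_mean <= upper_cutoff)

print(f"standard error = {standard_error:.1f}, Z = {z_observed:.2f}")
print(f"cut-offs: {lower_cutoff:.1f} ml to {upper_cutoff:.1f} ml")
print("halt the machine" if halt else "keep running")
```

Since 980 ml lies just below the lower cut-off of about 980.4 ml (equivalently, Z = -2 is beyond -1.96), this rule says to halt and retune.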
BITS Pilani, Pilani Campus
Sampling from a population that is not normally distributed

• The Central Limit Theorem
• If the sample size is large enough (n >= 30)
• The Sampling Distribution of the Mean is approximately normally distributed
• Regardless of the population distribution
• Why are we interested in normality?
• Rule of thumb: for a sample size of 30 or more, we consider the Sampling Distribution of the Mean to be approximately normally distributed
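A small simulation makes the Central Limit Theorem tangible. The sketch below draws many samples of size 30 from a strongly skewed population and looks at the resulting sample means. The exponential population with mean 1 and standard deviation 1, the number of repetitions and the seed are all arbitrary choices for illustration, not taken from the slides.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

def draw_sample_means(n, repeats):
    """Means of `repeats` samples of size n from an exponential population."""
    return [
        sum(random.expovariate(1.0) for _ in range(n)) / n
        for _ in range(repeats)
    ]

n = 30
sample_means = draw_sample_means(n, repeats=5000)

centre = mean(sample_means)         # should be close to the population mean, 1
spread = stdev(sample_means)        # should be close to sigma / sqrt(n)
theoretical_se = 1 / sqrt(n)

# Share of sample means within 1.96 standard errors of the population mean
within = sum(abs(m - 1) <= 1.96 * theoretical_se for m in sample_means) / len(sample_means)

print(f"mean of sample means = {centre:.3f}")
print(f"spread = {spread:.3f}, sigma/sqrt(n) = {theoretical_se:.3f}")
print(f"share within 1.96 standard errors = {within:.3f}")
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the population itself is far from normal.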

24
BITS Pilani, Pilani Campus
Central Limit Theorem: Empirical verification

Source: Levine et al., page 256


BITS Pilani, Pilani Campus
Sampling Distribution of the Proportion

• Categorical variables
• Female %, % who prefer Trump etc.
• If the population proportion is π
• The mean of the proportions of all the possible samples is also π
• Sample proportion: $p = \dfrac{X}{n}$, where X is the number of items with the characteristic of interest and n is the sample size
• Standard error of the Proportion:
• $\sigma_p = \sqrt{\dfrac{\pi(1 - \pi)}{n}}$
• n: sample size
BITS Pilani, Pilani Campus
Sampling Distribution of the Proportion

• The sampling distribution of the proportion follows a binomial distribution
• Sample proportion: $p = \dfrac{X}{n}$, where X is the number of items with the characteristic of interest and n is the sample size
• If both nπ and n(1 - π) are >= 5 (which is true in most practical cases)
• The sampling distribution of the proportion can be assumed to be a normal distribution with
• Mean: π, and
• Standard error of the Proportion:
• $\sigma_p = \sqrt{\dfrac{\pi(1 - \pi)}{n}}$
• n: sample size
BITS Pilani, Pilani Campus
Z value for Sampling Distribution of the Proportion

• Assuming both nπ and n(1 - π) are ≥ 5
• $Z = \dfrac{p - \pi}{\sigma_p} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}}$
• Once you have the Z values of interest, the rest of the inference is similar to how we worked with the sampling distribution of the Mean
BITS Pilani, Pilani Campus
A question from the book (7.14 p 262)

• A research study found that 80% of women say that having a flexible work schedule is either very important or extremely important to their career success.
• Suppose you select a sample of 100 working women.
• What is the probability that in the sample fewer than 85% say that having a flexible work schedule is either very important or extremely important to their career success?
• P(p < 0.85)
• What is the probability that in the sample between 75% and 85% say that having a flexible work schedule is either very important or extremely important to their career success?
• P(0.75 <= p <= 0.85)
• Calculate the Z values, $Z = \dfrac{p - \pi}{\sigma_p} = \dfrac{p - \pi}{\sqrt{\pi(1 - \pi)/n}}$, and use the Z table to find the answers
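A worked version of this question, using the normal approximation (valid here because nπ = 80 and n(1 - π) = 20 are both at least 5), can be sketched as follows. The variable names are illustrative, and `statistics.NormalDist` stands in for the Z table.

```python
from math import sqrt
from statistics import NormalDist

pi, n = 0.80, 100                          # population proportion and sample size

standard_error = sqrt(pi * (1 - pi) / n)   # sigma_p
z = NormalDist()                           # Z ~ N(0, 1)

z_85 = (0.85 - pi) / standard_error        # Z value for p = 0.85
z_75 = (0.75 - pi) / standard_error        # Z value for p = 0.75

p_below_85 = z.cdf(z_85)
p_between = z.cdf(z_85) - z.cdf(z_75)

print(f"sigma_p = {standard_error:.2f}, Z = {z_75:.2f} and {z_85:.2f}")
print(f"P(p < 0.85) = {p_below_85:.4f}")
print(f"P(0.75 <= p <= 0.85) = {p_between:.4f}")
```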
31
BITS Pilani, Pilani Campus
Exam related questions

• Syllabus
• Descriptive Statistics, Statistical charts, measures of central tendency and dispersion
• Data scales, Probability, Bayes Theorem
• Probability distributions
• Including discrete distributions and continuous distributions, and the mean and variance of a probability
distribution.
• Up-to session-7 (Normal distribution)
• Calculator is allowed
• Onscreen version or your own
• One sided Z cumulative probability table will be provided
• You can type your answers or upload handwritten answers after scanning
• All the best!

32
BITS Pilani, Pilani Campus
Q&A

33
Quantitative Methods

Lecture-9
BITS Pilani
Pilani Campus

1
BITS Pilani
Pilani Campus

Confidence Interval Estimation


(Ch 8 Business Statistics, Levine et al.)
So far and the next

✓ Previous Sessions
√ Probability Distributions
√ Sampling Distributions of the mean and the proportion
➢ Today
➢ Confidence Interval Estimation

Sampling distributions

• Sampling distribution of the Mean
• Distribution of all the possible sample means
• For normally distributed populations
• Sampling distribution of the Mean is always normal
• $\bar{X} \sim N(\mu, \sigma_{\bar{X}}^2)$, standard error of the Mean $= \sigma_{\bar{X}} = \sigma/\sqrt{n}$, where σ is the population standard deviation
• For populations that are not normally distributed
• Sampling distribution approaches normal as the sample size increases (Central Limit Theorem)
• Empirical rule of thumb: n ≥ 30

• Sampling distribution of the Proportion
• Distribution of all the possible sample proportions
• Has a binomial distribution
• For nπ and n(1 − π) ≥ 5, the sampling distribution of the Proportion can be assumed to be normally distributed
• $p \sim N(\pi, \sigma_p^2)$, standard error of the Proportion $= \sigma_p = \sqrt{\pi(1-\pi)/n}$, where π is the population proportion
Deductive and Inductive Reasoning

• Deductive reasoning
• Deriving specific conclusions from general principles or understanding
• Taking something as true and making deductions based on that
• So far we have been assuming that population parameter (mean or proportion) is known
• We have been trying to find the interval around the (known) population mean
• Top 5%, bottom 5%, within certain distance from the (known) mean (central 95% etc.)
• Inductive reasoning
• Deriving broader generalizations from something specific
• Generally all you know is the sample statistics from just one sample (sample mean, or proportion etc.)
• We need to draw inferences about the population parameters from the sample statistics
• From $\bar{X}$ to µ or from p to π
• Example:
• From an everyday sampling process you find that 5% of the products in the sample are defective
• What conclusions can you draw about the proportion of faulty products produced today?

Type of estimates
• Point estimate
• A single value computed from the sample (the sample mean or proportion)
• $\bar{X}$ or p as a point estimate of µ or π

• Interval estimate
• An interval constructed in such a way that you know the probability of the population parameter being in that interval
• Example:
• Given the sample mean $\bar{X}$ and sample standard deviation S,
“You would like to say with 95% confidence that the population mean lies in an interval between values L (lower limit) and H (higher limit)”
Confidence interval of the Mean
• When population standard deviation (σ) is known
• Drawing inference from the sampling distribution of the Mean and Z distribution tables

• When σ is not known


• Drawing inference from the sampling distribution of the Mean and Student's t-distribution tables
Confidence interval of the Mean: σ known
• If we know µ we can calculate the probability that $\bar{X}$ will lie in a given interval (previous session)
• Min $\bar{X}$ for top 5%
• Max $\bar{X}$ for bottom 5% or
• $\bar{X}$ does not lie in the top or bottom 2.5% tails (within 95% around the mean)

• Now given $\bar{X}$ and σ, you want to find the 95% probability range for µ
Confidence interval of the mean when σ is known (1/4)

• From the book (Ch 7, p 264 and Ch 8, p 272)


• Cereal filling example µ = 368gm, σ = 15gm, n=25
• 95% interval around the mean in the sampling distribution of the Mean?
• Standard error of the Mean? (standard deviation of the sampling distribution of the Mean)
• 15/5 = 3,
• 95% Interval around the mean (Critical Z value: 1.96)?
• How much area under the right tail (beyond the shaded region)?
• [368 – 1.96*3, 368 + 1.96*3] = [362.12, 373.88]
• 95% of all the possible samples will have means in this range
[Figure: sampling distribution of the Mean, with the $\bar{X}$ axis and the Z axis marked at −1.96 and +1.96]
Confidence interval of the means when σ is known (2/4)

• Now you draw five samples with $\bar{X}_1 = 362.3$, $\bar{X}_2 = 369.5$, $\bar{X}_3 = 360$, $\bar{X}_4 = 362.12$, $\bar{X}_5 = 373.88$
• µ = 368gm and the 95% interval around µ is [362.12, 373.88]
• Note two things,
• Some of these means ($\bar{X}$) are within the 95% interval around the mean
• $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_4$, $\bar{X}_5$ are within the interval around µ and
• Some are not
• $\bar{X}_3$
• Notice the $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$ intervals around the sample means $\bar{X}_4 = 362.12$ and $\bar{X}_5 = 373.88$
• Population mean is at the upper or lower boundary of these intervals: [356.24, 368] and [368, 379.76]
Confidence interval of the means when σ is known (3/4)

• What is special about $\bar{X}_4$ and $\bar{X}_5$?
• For all the sample means that lie between $\bar{X}_4$ and $\bar{X}_5$
• $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$ intervals would contain µ

• What proportion of the total sample means would fall between $\bar{X}_4$ and $\bar{X}_5$?
• 95% (Why?)

• 95% Confidence Interval
• An interval that contains the population mean with 95% probability
• You can say with 95% confidence that the population mean is within the L and H boundaries calculated from the sample mean
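The "95% (Why?)" answer can be illustrated with a small simulation. This is a sketch using the cereal-filling numbers from the slides (µ = 368, σ = 15, n = 25); the number of simulated samples and the random seed are arbitrary choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 368, 15, 25
standard_error = sigma / sqrt(n)       # 15 / 5 = 3
half_width = 1.96 * standard_error     # 5.88

# Draw many sample means directly from the sampling distribution of the Mean.
sample_means = NormalDist(mu, standard_error).samples(10_000, seed=1)

# Count the samples whose interval [mean - 5.88, mean + 5.88] contains mu.
covered = sum(1 for m in sample_means if m - half_width <= mu <= m + half_width)
coverage = covered / len(sample_means)

print(f"Share of intervals containing mu: {coverage:.3f}")
```

The printed share should be close to 0.95: an interval around a sample mean contains µ exactly when that sample mean lies within 1.96 standard errors of µ.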
Confidence interval of the Mean when σ is known (4/4)

• Confidence Interval for the Mean (σ known)
• $\bar{X} - Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
• $[\bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n},\ \bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n}]$ or [Lower Limit (LL), Upper Limit (UL)]
• $\bar{X}$: Sample mean, n: Sample size
• $Z_{\alpha/2}$: Z value corresponding to an upper tail probability of α/2 from the standardized normal distribution
• Z value for cumulative probability 1 − α/2 from the Z table
• α: Level of significance (e.g. 0.05 level of significance)
• Level of confidence: (1 − α) × 100%
• What will be the confidence for 0.05 level of significance?
• 95%
• Hence the 95% confidence interval estimate would be (leaving 2.5% in each tail):
• $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n} = \bar{X} \pm 1.96\,\sigma/\sqrt{n}$
• $Z_{\alpha/2}\,\sigma/\sqrt{n}$ is called the sampling error e
Most commonly used confidence intervals
• 95% confidence interval: $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$

• 90% confidence interval
• α? (Level of significance?)
• α/2?
• $Z_{\alpha/2} = 1.645$ (for the area corresponding to 1 − α/2 cumulative probability)
• $\bar{X} \pm 1.645\,\sigma/\sqrt{n}$

• 99% confidence interval
• α? (Level of significance?)
• α/2? (proportion of values in each tail)
• $Z_{\alpha/2} = 2.58$
• $\bar{X} \pm 2.58\,\sigma/\sqrt{n}$

• 80% confidence interval
• $\bar{X} \pm 1.28\,\sigma/\sqrt{n}$

[Figure: 95% confidence interval around $\bar{X}$, with lower limit $\bar{X} - 1.96\,\sigma/\sqrt{n}$ and upper limit $\bar{X} + 1.96\,\sigma/\sqrt{n}$]
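The critical Z values listed above can be reproduced without a table. A minimal sketch, assuming Python's standard-library `NormalDist.inv_cdf` as a stand-in for the Z table lookup; the helper name `z_critical` is illustrative.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Z value that leaves alpha/2 in the upper tail of the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for confidence in (0.80, 0.90, 0.95, 0.99):
    print(f"{confidence:.0%} confidence: Z = {z_critical(confidence):.3f}")
```

The 99% value comes out as about 2.576, which the slides round to 2.58.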


From the book (p 275)
• You draw a sample of 100 sheets from a paper manufacturing process
• Sample mean length is 10.998 inches
• Population standard deviation is known to be 0.02 inches
• What is the 95% confidence interval estimate of the population mean paper length?
• $\bar{X}$ and σ?
• 10.998 and 0.02
• n=?
• 100
• $Z_{\alpha/2} = 1.96$
• 95% confidence interval: $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$
• [10.9941 ≤ µ ≤ 11.0019]
• [10.9941, 11.0019]
• You can say with 95% confidence that the population mean
• Lies between 10.9941 and 11.0019
[Figure: 95% confidence interval around $\bar{X}$, with lower limit $\bar{X} - 1.96\,\sigma/\sqrt{n}$ and upper limit $\bar{X} + 1.96\,\sigma/\sqrt{n}$]
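The paper-length interval can be recomputed with a short sketch. The function name `mean_ci_sigma_known` is an illustrative choice; the inputs are the values given in the example.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_sigma_known(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    sampling_error = z * sigma / sqrt(n)
    return sample_mean - sampling_error, sample_mean + sampling_error

lower, upper = mean_ci_sigma_known(sample_mean=10.998, sigma=0.02, n=100)
print(f"95% confidence interval: [{lower:.4f}, {upper:.4f}]")
```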
Confidence interval of the mean when σ is unknown

• You generally do not know σ of the population (why?)

• To calculate accurate value of σ, you need to know the entire population


• If you know the entire population, you can calculate the exact value of µ

• Next question: How to estimate the confidence interval when σ is unknown

Confidence interval of the mean when σ is unknown

• When the sample standard deviation is used as an estimate of the population standard deviation,
• In place of the Z distribution we use Student's t-distribution
• There is one Student's t-distribution for each sample size (n) or degrees of freedom (n-1)
• The t-distribution approaches the Z distribution as n goes up

Source: T1 reference book, page 278

t-distribution
• Z-Statistic of the sampling distribution of the Mean
• $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, σ: population standard deviation

• t-Statistic of the sampling distribution of the Mean
• $t = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$, S: sample standard deviation, $S = \sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2/(n-1)}$

• t-statistic follows a t-distribution with n-1 degrees of freedom

• Degrees of freedom
• We have used S as an estimate of σ. For calculating S, we need a fixed value for $\bar{X}$.
• When we fix $\bar{X}$, only n-1 values of the sample are free to take random values; the nth value is not free, as $\bar{X}$ is fixed
• Hence n-1 degrees of freedom
• Example: If someone tells you that you are free to choose any 2 numbers but the mean needs to be 4
• How many numbers are you actually free to choose?
• Only one (why?), so you have only 1 degree of freedom; in general n-1 for a sample of size n
t-probabilities
• Similar to Z table but now with one more parameter
• Degrees of freedom
• There is one t-distribution (one row of the t-table) for every degree of freedom
• Only some key probability values are published
• Important:
• t-distribution assumes that variable X is normally distributed
• However, for large sample sizes, and when the population is not very skewed, you can still use t-tables for non-normal populations
[Figure: t-table highlighting the t-value corresponding to 95% confidence or 0.05 level of significance, $t_{\alpha/2} = t_{0.025}$]
Confidence interval of the Mean when σ is unknown
• Confidence Interval for the Mean (σ unknown)
• $\bar{X} - t_{\alpha/2}\dfrac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2}\dfrac{S}{\sqrt{n}}$
• $[\bar{X} - t_{\alpha/2}\,S/\sqrt{n},\ \bar{X} + t_{\alpha/2}\,S/\sqrt{n}]$ or [Lower Limit (LL), Upper Limit (UL)]
• $\bar{X}$: Sample mean, S: Sample standard deviation, n: Sample size
• $t_{\alpha/2}$: t-value corresponding to an upper tail probability of α/2 from the t-distribution, for n-1 degrees of freedom
• t value for cumulative probability 1 − α/2 from the t table
• α: Level of significance (e.g. 0.05 level of significance)
• Level of confidence: (1 − α) × 100%
• What will be the confidence for 0.05 level of significance?
• 95%
• Hence the 95% confidence interval estimate for sample size 100 (99 degrees of freedom) would be (leaving 2.5% in each tail):
• $\bar{X} \pm t_{\alpha/2}\,S/\sqrt{n} = \bar{X} \pm 1.9842\,S/\sqrt{n}$
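A minimal sketch of the σ-unknown interval follows. Python's standard library has no t-distribution, so the critical value 1.9842 (99 degrees of freedom) is taken from the slide and passed in by hand. The sample mean of 50 and sample standard deviation of 10 are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
from math import sqrt

def mean_ci_sigma_unknown(sample_mean, sample_sd, n, t_critical):
    """Confidence interval for the mean using S and a t critical value (n - 1 d.f.)."""
    sampling_error = t_critical * sample_sd / sqrt(n)
    return sample_mean - sampling_error, sample_mean + sampling_error

# Hypothetical sample summary (not from the slides): mean 50, S = 10, n = 100.
lower, upper = mean_ci_sigma_unknown(sample_mean=50, sample_sd=10, n=100, t_critical=1.9842)
print(f"95% confidence interval: [{lower:.4f}, {upper:.4f}]")
```

With the Z value 1.96 the interval would be slightly narrower; the t value is larger because S is only an estimate of σ.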
Testing normality assumption
t-distribution assumes that variable X is normally distributed
• However, for large sample sizes, and
• When population is not very skewed
• You can still use t-tables for non-normal populations

• How do you test the normality of the sample data?

Testing normality assumption

Are returns approx. normal?

• Mean = Mode = Median?
• Is range ~6σ? (29.16?)
• Is inter-quartile range ~1.33σ? (6.46?)
• Is skewness zero? (No, it is left skewed)
• Is kurtosis zero? (No, it is more peaked)

Cumulative probability tests

• Percent of total returns within ±1 and ±1.96 Z values
• Should be approx. 68.26% and 95% respectively
• Box plot to check skewness (Is it approximately symmetrical?)

Source: page 237, Levine et al.
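The checks on this slide can be sketched with the standard library. The data here are hypothetical: 2,000 simulated values standing in for the returns, generated with an arbitrary mean, spread and seed, since the slide's own data set is not reproduced in the text.

```python
from statistics import NormalDist, mean, median, stdev, quantiles

# Hypothetical data: 2,000 simulated returns (an assumption for illustration only).
returns = NormalDist(mu=0.5, sigma=4.0).samples(2_000, seed=42)

avg, mid, sd = mean(returns), median(returns), stdev(returns)
q1, _, q3 = quantiles(returns, n=4)   # quartile cut points Q1, Q2, Q3

range_in_sd = (max(returns) - min(returns)) / sd
iqr_in_sd = (q3 - q1) / sd
share_within_1sd = sum(abs(r - avg) <= sd for r in returns) / len(returns)
share_within_196sd = sum(abs(r - avg) <= 1.96 * sd for r in returns) / len(returns)

print(f"mean = {avg:.2f}, median = {mid:.2f}")
print(f"range / sd = {range_in_sd:.2f}  (roughly 6 expected)")
print(f"IQR / sd = {iqr_in_sd:.2f}  (roughly 1.33 expected)")
print(f"within ±1 sd: {share_within_1sd:.1%}, within ±1.96 sd: {share_within_196sd:.1%}")
```

For real data, large departures from these reference values suggest that the normality assumption deserves a closer look.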
Q-Q plot : Quantile-Quantile plot

Q-Q plot
• Z values on the x-axis
• Corresponding values (data sequences) on the y-axis
• Ideally the plot should be a straight line
• Significant deviation from the straight line indicates non-normality
(Refer to page 239, Levine et al.)

Source: page 237, Levine et al.
Confidence interval of the Proportion
• Confidence Interval for the Proportion
• Probability that the confidence interval would contain the population proportion
• Confidence Interval for the proportion
• $p \pm Z_{\alpha/2}\,\sigma_p = p \pm Z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}}$
• p: Sample proportion, n: Sample size
• $p = \dfrac{\text{Number of items having the characteristic of interest } (X)}{\text{Total number of items } (n)}$
• $Z_{\alpha/2}$: Z value corresponding to an upper tail probability of α/2 from the standardized normal distribution
• Z value for cumulative probability 1 − α/2 from the Z table
• α: Level of significance (e.g. 0.05 level of significance)
• Level of confidence: (1 − α) × 100%
• Subject to the assumption that both X and n − X are greater than 5
• np and n(1 − p) ≥ 5
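A short sketch of the proportion interval follows. The data are hypothetical (10 defective items in a sample of 100), chosen only for illustration, and the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x, n, confidence=0.95):
    """Confidence interval for the population proportion from x successes in n items."""
    if x <= 5 or n - x <= 5:
        raise ValueError("normal approximation needs both X and n - X greater than 5")
    p = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    sampling_error = z * sqrt(p * (1 - p) / n)
    return p - sampling_error, p + sampling_error

# Hypothetical sample: 10 defective products out of 100 inspected.
lower, upper = proportion_ci(x=10, n=100)
print(f"95% confidence interval for the proportion: [{lower:.4f}, {upper:.4f}]")
```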
When to use which distribution for estimation

Working with sampling distribution of the Mean


• When the population variance is known
• Population normally distributed: use z-statistic for any sample size
• Population is NOT normally distributed
• Sample size >= 30, use z-statistic
• When the population variance is not known
• For normally distributed population - use t-statistic
• Sample size >=30, t-statistic can be used even if population is not normally
distributed
• Barring heavily skewed populations
Working with sampling distribution of the Proportion
• Use Z statistic

Determining sample sizes


How large a sample do you need?

– Sampling error and width of the interval
• You can say the mean is between -∞ and +∞ with 100% confidence, but is it useful?
• If you decide on 95% confidence
– You need pragmatically narrow intervals to aid decision making
– For example, you may want the width of the interval to be no more than 2 mm
– Sampling error
• Sampling error $e = Z_{\alpha/2}\,\sigma/\sqrt{n}$
• This determines the width of the confidence interval for a predetermined α, i.e. $2 Z_{\alpha/2}\,\sigma/\sqrt{n}$
• Mean upper and lower limit estimates: $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$
• Proportion upper and lower limit estimates: $p \pm Z_{\alpha/2}\sqrt{p(1-p)/n}$
– Sample size should be sufficient to meet the business requirements related to confidence interval estimates
Sample size estimates for the Mean sampling error
o For sampling distribution of the Mean
o Sampling error $e = Z_{\alpha/2}\,\sigma/\sqrt{n}$
o $n = Z_{\alpha/2}^2\,\sigma^2/e^2$
o σ is estimated from the past data or from range/6 etc. (Why?)
o Practically 6 sigma around the mean contains most or all of the distribution
Sample size estimates for the Mean sampling error

o $n = Z_{\alpha/2}^2\,\sigma^2/e^2$
o Example:
o Historically marks of the students in Quant course have ranged from 0 to 100
o You want an interval estimate of the population mean marks with 0.05 significance or 95% confidence. However, for results to be meaningful, you want the confidence interval width to be 10 marks only. What is the sample size that you need for the study?
o α?
o e?
o $Z_{\alpha/2} = 1.96$
o With σ ≈ range/6 = 100/6 and e = 10/2 = 5: n = (1.96 × (100/6)/5)² = 42.68 ≈ 43
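The marks example can be worked through in a few lines. This is a sketch; the function name is illustrative, and the result is rounded up because a fractional observation cannot be sampled.

```python
from math import ceil

def sample_size_for_mean(z, sigma, e):
    """Smallest n for which the sampling error z * sigma / sqrt(n) is at most e."""
    return ceil((z * sigma / e) ** 2)

marks_range = 100 - 0
sigma_estimate = marks_range / 6    # range/6 rule of thumb from the slide
e = 10 / 2                          # interval width of 10 marks means e = 5
n_required = sample_size_for_mean(z=1.96, sigma=sigma_estimate, e=e)

print(f"Required sample size: {n_required}")
```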
Sample size estimates for the Proportion sampling errors

o For sampling distribution of the Proportion

– Sampling error $e = Z_{\alpha/2}\sqrt{\dfrac{\pi(1-\pi)}{n}}$

– $n = Z_{\alpha/2}^2\,\pi(1-\pi)/e^2$

– What if π is unknown or the past data is not available?

• π: Population proportion
• The best practice is to use 0.5
• Max value for π(1 − π) is 0.25 at π = 0.5
• It gives the widest interval
• However, the required sample size would be more
• Probability of 0.5 means no information on which way the event would fall.
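The effect of using π = 0.5 can be seen in a short sketch. The sampling error e = 0.05 (±5 percentage points) and the comparison value π = 0.2 are hypothetical choices for illustration.

```python
from math import ceil

def sample_size_for_proportion(z, pi, e):
    """Smallest n for which the sampling error z * sqrt(pi * (1 - pi) / n) is at most e."""
    return ceil(z ** 2 * pi * (1 - pi) / e ** 2)

# Hypothetical requirement: 95% confidence, sampling error of 5 percentage points.
n_conservative = sample_size_for_proportion(z=1.96, pi=0.5, e=0.05)
n_with_prior = sample_size_for_proportion(z=1.96, pi=0.2, e=0.05)

print(f"n with pi = 0.5 (no prior information): {n_conservative}")
print(f"n with pi = 0.2 (from past data):       {n_with_prior}")
```

Using 0.5 is the conservative choice: whatever the true π turns out to be, the sample is large enough.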
Q&A

Quantitative Methods

Lecture-10

Fundamentals of hypothesis testing


(Ch 9 Business Statistics, Levine et al.)
So far and the next

✓ Previous Sessions
√ Probability Distributions
√ Confidence Interval Estimation
➢ Today
➢ Hypotheses testing

Previous Class: Confidence Interval

• An interval estimate of the population parameter


• Based on the sample mean/proportion
• Infer an interval of values where population mean or proportion
is likely to lie
• With α significance or (1- α)*100% confidence

Hypothesis testing (1/3)

What does the word “Hypothesis” mean?

• A proposition made as a basis for reasoning, without any assumption of its truth
• You state a hypothesis and test whether it can be supported

As a business manager you may want to test if the assumed value reflects the reality
• Mean bottle fill volume is 1000 ml
• Mean component diameter is 0.2 mm
• Defect rate of the manufacturing process is < 1%
• A new drug is more effective than the old drug
• The weight reduction plan indeed reduces the weight (Wt_after < Wt_before)

Remember you are interested in the population parameters and not the sample statistics
• Sample statistics (mean, proportion, sd) are just a vehicle to infer the population parameters (mean, proportion)
Hypothesis testing (2/3)
Hypothesis consists of two statements
• Null Hypothesis (H0)
• Alternate Hypothesis (H1)
• Together they make the entire set of possibilities
Examples
• Mean bottle fill volume: H0: μ = 1000ml and H1: μ ≠ 1000ml
• Population error rate: H0: π ≤ 1% and H1 : π >1%
How to decide your null hypothesis
• Your alternate hypothesis should be set to enable a business decision (Should I stop my
machine for tuning)
• Null hypothesis should be set to be rejected (The new drug is not effective).
• Your assumption is set as an alternate hypothesis (The new drug is effective)

Hypothesis testing (3/3)
Null is never accepted (You only fail to reject it)!

• H0: All swans are white and H1: Not all swans are white
• If your sample has a million white swans, can you accept the null hypothesis?
• You can only say “You do not have sufficient evidence to reject it”
• It may be that you do not have sufficient sample size

Set null in order to reject it

• Null hypothesis cannot be accepted. We may not find enough evidence to reject it.
• Your alternate hypothesis should be set to enable a business decision.
• Null hypothesis should be set to be rejected (The new drug is not effective)
• Rejection of the null hypothesis should enable an action or help you decide
Critical Value and test statistic
• We use the statistical distribution of the sample statistic for testing the hypothesis (H0: μ = 1000ml)
• What is the sample statistic?
• What is the distribution of the sample statistic?
• If the sample statistic is sufficiently close to the null hypothesis value, you cannot reject it ($\bar{X}_2$).
• But if it is too far, you may have a reason to reject it ($\bar{X}_1$).
• How far is too far?
• Region of rejection based on critical values of the Z-distribution or t-distribution of the sample statistic.
• $\mu \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$ or $\mu \pm t_{\alpha/2}\,S/\sqrt{n}$
• If your sample mean is $\bar{X}_1$, will you reject the null hypothesis?
• If it is $\bar{X}_2$, will you reject the null hypothesis?
[Figure: sampling distribution under H0 with limits $\mu \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$; $\bar{X}_2$ lies inside the limits and $\bar{X}_1$ outside]
Critical Value Approach: Two-Tail Test
• When hypothesis is of the type H0: μ = 1000ml and H1: μ ≠ 1000ml
• You would like to reject the null hypothesis if the sample mean fill
volume is too high, or it is too low
• Area of rejection lies on both the sides of the sampling distribution
• The rejection region is divided into two parts (both left and right tail
containing α/2 area)
Example
• For 95% confidence or 0.05 significance
• Each tail (right and left rejection regions) will have 0.025 (2.5%) of the area under it.
• Critical Z values will be -1.96 and +1.96
• If your sample statistic falls in the region of rejection, you reject the null
Decision risks: Type-I error and Type-II errors
Type-I error
• When your null hypothesis is true
• If you get a sample mean $\bar{X}_1$, would you reject the null?
• Is the probability of drawing a sample with mean $\bar{X}_1$ or less zero?
• α is the probability of getting a sample mean which falls in the region of rejection, even if the null was true!
• Rejecting the null hypothesis when it is true is called a type-I error.
• Also called the false alarm.
• α, the level of significance, is the probability of a type-I error and (1 − α) is called the confidence coefficient
• Can be reduced by choosing a smaller α
• Changing α changes the width of the rejection regions
[Figure: sampling distribution under H0 with limits $\mu \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$; $\bar{X}_1$ falls in the rejection region and $\bar{X}_2$ in the region of non-rejection]
Type-II error
• Not rejecting the null hypothesis when it is indeed false is called a type-II error; its probability is β.
• β depends on the actual difference between the population parameter and the hypothesized value.
• (1 − β) is called the power of the statistical test.
• Probability of not making a Type-II error.
• The probability of rejecting the null hypothesis when it is false.
• Can't be controlled directly by the choice of α.
• When you reduce α, you increase β.
• Can be reduced by increasing sample size (why?)
• An increase in n reduces the standard error and makes the sampling distribution narrower.
• Makes the region of non-rejection narrower
• Which error should you focus on?
• Use your business situation to decide. One can be reduced at the expense of the other
[Figure: sampling distribution under H0 with limits $\mu \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$, alongside the distribution around the actual mean μ1, with $\bar{X}_1$ marked]
Hypothesis testing steps
1. State the null and the alternate hypotheses
2. Decide alpha and n, based on the costs associated with type-I and type-II errors
3. Collect the sample
4. Decide the appropriate test statistic and the sampling distribution
• Z or t-statistic
5. Calculate the critical values and areas of rejection and non-rejection
6. Find the sample statistic
7. Decide whether the sample statistic falls in the area of rejection/non-rejection

When to use which distribution for estimation

Working with sampling distribution of the Mean


• When the population variance is known, use z-statistic
• When the population variance is not known
• Use t-statistic

Working with sampling distribution of the Proportion


• Use Z statistic

Example 9.3 (p. 314 Levine et al.)
• The business problem: Is the population mean waiting time to place an order 4.5 minutes?
• The population is normally distributed, with a population standard deviation of 1.2 minutes.
• A sample of 25 orders during a one-hour period shows the sample mean to be 5.1 minutes.
• Determine whether there is evidence at the 0.05 level of significance that the population mean waiting time differs from 4.5 minutes.
Step 1: State the null and the alternate hypotheses
• H0: μ = 4.5 minutes and H1: μ ≠ 4.5 minutes
Step 2: Decide alpha and n
• α = 0.05, n = 25
Step 3: Collect the sample (we already have it).
Step 4: Decide the appropriate test statistic and the sampling distribution (Z or t?)
Step 5: Calculate the critical values and areas of rejection and non-rejection
• Critical Z? Critical area? ($\mu \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$)
• 4.5 ± 1.96 × 1.2/5 => region of non-rejection is between 4.03 and 4.97 (Z = -1.96 to Z = +1.96)
Step 6: Calculate the sample statistic ($\bar{X}$): 5.1
• The ZSTAT of the sample statistic, if the null was true? (5.1 – 4.5) / (1.2/5) = +2.5
Step 7: Make the decision! (Should we reject the null hypothesis?)
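Steps 5 to 7 of Example 9.3 can be sketched in code. The variable names are illustrative; the numbers are the ones given in the example.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 4.5, 1.2, 25, 0.05   # hypothesized mean, known sigma, sample size
sample_mean = 5.1

standard_error = sigma / sqrt(n)                      # 1.2 / 5 = 0.24
z_critical = NormalDist().inv_cdf(1 - alpha / 2)      # about 1.96 for a two-tail test

# Step 5: limits of the region of non-rejection around the hypothesized mean
lower = mu0 - z_critical * standard_error
upper = mu0 + z_critical * standard_error

# Step 6: Z value of the sample mean, assuming the null is true
z_stat = (sample_mean - mu0) / standard_error

# Step 7: reject the null if the statistic falls outside the critical values
reject_null = abs(z_stat) > z_critical

print(f"Region of non-rejection: [{lower:.2f}, {upper:.2f}]")
print(f"ZSTAT = {z_stat:.2f}, reject H0: {reject_null}")
```

Since the sample mean 5.1 is outside the region of non-rejection, the null hypothesis is rejected at the 0.05 level.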
p-value approach to Hypothesis testing
1. p-value is the observed level of significance
2. p-value is the probability of getting a test statistic equal to or more extreme than the currently calculated value, if the null hypothesis were correct
3. Test statistic: $\bar{X}_1$ = 5.1, Z = 2.5
4. p=?
5. Total area under the tails beyond Z ≥ 2.5 and Z ≤ -2.5
6. p = (1-0.9938)*2 = 0.0124
7. Chosen level of significance?
8. 0.05
9. In other words, you would have rejected the null hypothesis at any level of significance of 0.0124 or higher
10. Reject the null if p ≤ α
11. H0 must go if the p-value is less than or equal to the chosen level of significance
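The two-tail p-value for Z = 2.5 can be computed directly. A minimal sketch using the standard-library normal distribution in place of the Z table:

```python
from statistics import NormalDist

z_stat = 2.5
alpha = 0.05

# Area in both tails beyond +2.5 and -2.5 under the standard normal curve
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
reject_null = p_value <= alpha

print(f"p-value = {p_value:.4f}, reject H0 at alpha = {alpha}: {reject_null}")
```

The same p-value also shows that the null would not be rejected at a stricter level such as α = 0.01.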
Comparison of Confidence interval and Hypothesis testing approach

• What is the 95% confidence interval?
• $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$
• α = 0.05, n = 25
• $\bar{X}$ = 5.1
• Confidence interval
• 5.1 ± 1.96 × 1.2/5 => [4.63, 5.57]
• What’s the difference?
• For hypothesis testing you form an interval around the hypothesized mean
• For a confidence interval you form an interval around the sample mean
Hypothesis testing when σ is unknown
• When population standard deviation (σ) is known
• Drawing inference from the sampling distribution of the Mean and Z distribution tables

• When σ is not known


• Sampling distribution follows Student's t-distribution with n-1 degrees of freedom

• All the steps remain the same as in the Z statistic example. The only change is that we now use the t-statistic to determine the regions of non-rejection and rejection, or the p-value
• Test of normality is required for small samples
Critical Value Approach: One-Tail Test
• When you are interested in only one side of the distribution
• Example: You want to test whether the proportion of students
supporting a schedule change is more than 50%.
• When hypothesis is of the type H0: π ≤ .5 and H1: π > .5
• Now the region of rejection lies entirely in the right tail
• The rejection region is now right tail containing α area
Example
• For 95% confidence or 0.05 significance
• The right-tail rejection region will have 5% of the area under it.
• The critical Z value will be +1.645 (or ~+1.65)
• All other steps remain the same
• Find the Z value of the sample statistic
• Analyze whether it lies in the rejection region or not
Z test of hypothesis for proportions
• We use Z-test for the proportions
• $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}}$
• Sample proportion $= p = \dfrac{\text{Number of items having the characteristic of interest } (X)}{\text{Total number of items } (n)}$
• ZSTAT follows an approximately normal distribution, subject to the assumption that both X and n − X are greater than 5
• Hypothesis testing by either the critical value approach or the p-value approach
Example: 9.4 page 333 Levine et al. (One tail proportions)

• McDonald’s had a drive-through service which filled 90.9% of its drive-through orders correctly.
• After quality improvement efforts, a sample of 400 orders using the new process indicated that 378 orders were filled correctly.
• At the 0.01 level of significance, can you conclude that the new process has increased the proportion of orders filled correctly?
• Two tailed or one tailed test?
• Null hypothesis: H0? And the alternate hypothesis: H1?
• H0: π ≤ 0.909 and H1: π > 0.909
• Level of significance: α? Critical Z value?
• α = 0.01, Z critical = ~2.33
• Sample statistic (sample proportion p?) and corresponding ZSTAT value?
• Sample p = 378/400 = 0.945, $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}} = \dfrac{0.945 - 0.909}{\sqrt{0.909(1-0.909)/400}} = 2.503$
• p-value: (1 − 0.9938) = 0.0062
• If the p-value is less than α, the null hypothesis must be rejected (here 0.0062 < 0.01)
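The drive-through example can be recomputed as a sketch. The variable names are illustrative; the standard-library normal distribution replaces the Z table, so the p-value is based on the unrounded ZSTAT.

```python
from math import sqrt
from statistics import NormalDist

pi0, alpha = 0.909, 0.01     # hypothesized proportion and level of significance
x, n = 378, 400              # correctly filled orders and sample size

p = x / n                                        # sample proportion, 0.945
standard_error = sqrt(pi0 * (1 - pi0) / n)       # uses the hypothesized proportion
z_stat = (p - pi0) / standard_error

z_critical = NormalDist().inv_cdf(1 - alpha)     # one-tail (right) critical value
p_value = 1 - NormalDist().cdf(z_stat)           # area in the right tail
reject_null = z_stat > z_critical

print(f"ZSTAT = {z_stat:.3f}, critical Z = {z_critical:.2f}")
print(f"p-value = {p_value:.4f}, reject H0: {reject_null}")
```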
Example: 9.4 page 333 Levine et al. (One tail t-test)

• You want to test that the new process has a mean service time of less than 188.83 seconds.
• You collect the data by selecting a sample of n = 25 stores.
• You find that the sample mean service time at the drive-through equals 170.8 seconds and the sample standard deviation equals 21.3 seconds
• You decide to use α = 0.05. Population σ is unknown.
• Two tailed or one tailed test? Null hypothesis: H0? And H1?
• H0: μ ≥ 188.83 and H1: μ < 188.83
• Level of significance: α? Degrees of freedom? Critical t value?
• α = 0.05, df = 24, t-critical = -1.7109
• $t_{STAT} = \dfrac{170.8 - 188.83}{21.3/\sqrt{25}} = -4.23$
• p-value: 0.0001, from the Excel formula T.DIST(tSTAT, df, TRUE) for the lower tail (equivalently T.DIST.RT(ABS(tSTAT), df))
• tSTAT is in the region of rejection, so the null hypothesis must be rejected
• p-value is less than α, so the null hypothesis must be rejected
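The critical value approach for this example can be sketched as follows. Python's standard library has no t-distribution, so the critical value −1.7109 (24 degrees of freedom, α = 0.05) is taken from the slide rather than computed, and the p-value is left to the spreadsheet.

```python
from math import sqrt

mu0 = 188.83                  # hypothesized mean service time (seconds)
sample_mean, sample_sd, n = 170.8, 21.3, 25
t_critical = -1.7109          # lower-tail critical value for 24 d.f., from the slide

standard_error = sample_sd / sqrt(n)              # 21.3 / 5 = 4.26
t_stat = (sample_mean - mu0) / standard_error

# Lower-tail test (H1: mu < 188.83): reject when tSTAT is below the critical value
reject_null = t_stat < t_critical

print(f"tSTAT = {t_stat:.2f}, critical t = {t_critical}, reject H0: {reject_null}")
```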
Q&A

Quantitative Methods

Lecture-11

Two Sample Tests and ANOVA


(Ch 10 Business Statistics, Levine et al.)
So far and the next

✓ Previous Sessions
√ Probability Distributions
√ Confidence Interval Estimation
√ Hypothesis testing
➢ Today
➢ Two Sample Tests and ANOVA

Concept Alert! Inherent randomness
Population

• All outcomes of a random variable (X) under consideration.
• Population distributions can be discrete (binomial etc.) or continuous (normal etc.).

A fair dice population

• Random Var. X: The number that shows up on the top.
• The population: The distribution of, for example, a billion outcomes of X.
• µ ~ 3.5, σ ~ 1.71.

[Figure: bar chart of the fair-dice probability distribution over the outcomes 1 to 6, annotated $X_i = \mu + \varepsilon_i$]


Concept Alert! Inherent randomness
X: µ ~ 3.5, σ ~ 1.71.

Sampling error

• Say you draw one random sample of one of the outcomes, $X_i$.
• Will the value of $X_i$ be 3.5?
• Every time you throw this dice, you will get: $X_i = \mu + \varepsilon_i$.
• Mean + some random error ($\varepsilon_i$).
• This random error ($\varepsilon_i$) is due to inherent randomness, also called sampling error or random error.

[Figure: bar chart of the fair-dice probability distribution over the outcomes 1 to 6, annotated $X_i = \mu + \varepsilon_i$]


Concept Alert! Same or different populations?
Sampling error?

• From the population X, you draw a random sample of 100 outcomes.
• Every time you draw a sample and calculate the mean,
• You would get $\bar{X} = \mu + \varepsilon$: mean + some random error
• You want to find out the possible reason for the difference between µ and $\bar{X}$
• Is it because of inherent randomness or sampling error?
• Or, does the sample belong to a different population?
• The difference cannot be explained by inherent randomness.

[Figure: sampling distribution of $\bar{X}$ for n = 100, $\sigma_{\bar{X}} = 0.171$, with limits 3.5 ± 1.96 × 0.171 and the sample means 3.49, 3.83 and 5.8 marked]
Concept Alert! Same or different populations?
Questions we generally would be answering

• We would have some proven sampling distribution for our sample statistic
• We would ask: Is the sample statistic within the region of “non-rejection”?
• We assume that our sample statistic can be anywhere in the region of non-rejection because of the inherent randomness.
• That is the core of all hypothesis testing.
• For example, if you found the mean of 100 throws of dice to be 5.8, does it belong to the fair dice population?
• You can say with 95% confidence that the mean is too far to belong to the same population

[Figure: sampling distribution of $\bar{X}$ for n = 100, $\sigma_{\bar{X}} = 0.171$, with limits 3.5 ± 1.96 × 0.171 and the sample means 3.49, 3.83 and 5.8 marked]
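The fair-dice check can be sketched in a few lines. The helper name `consistent_with_fair_dice` is illustrative; the 1.96 limit and n = 100 follow the slides.

```python
from math import sqrt
from statistics import mean, pstdev

faces = [1, 2, 3, 4, 5, 6]
mu = mean(faces)                  # population mean of a fair dice: 3.5
sigma = pstdev(faces)             # population standard deviation: about 1.71

n = 100
standard_error = sigma / sqrt(n)  # about 0.171
lower = mu - 1.96 * standard_error
upper = mu + 1.96 * standard_error

def consistent_with_fair_dice(sample_mean):
    """True if the mean of n throws lies in the 95% region of non-rejection."""
    return lower <= sample_mean <= upper

print(f"Region of non-rejection: [{lower:.3f}, {upper:.3f}]")
for observed in (3.49, 5.8):
    print(f"Sample mean {observed}: consistent with a fair dice? {consistent_with_fair_dice(observed)}")
```

A mean of 3.49 is easily explained by inherent randomness, while 5.8 is far outside the limits.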
Practical Applications
• Is there really a difference beyond just plain randomness?

• Are the results from two marketing campaigns the same?

• Are the prices from two stores the same?

• Are the conversion rates the same for the “buy-now button” in 5 different colors on your website?
• Is stock portfolio-A riskier (more variance) than portfolio-B?

• Is there a statistical difference between the means, proportions, standard deviations etc.?

Comparing means of two populations


Comparing the means of two independent populations

Comparing the means of two populations

• Is there any difference between two population means?(H0?)


• H0: μ1 = μ2.
• Can also be written as: H0: μ1 - μ2 = 0
• H1: μ1 ≠ μ2
Independent populations?
• Units in one are not related to units in another (counter-example: members of a family are related).
Assumptions
• t-test: Normality of the populations (when σ is not known).
• Equal variance.
Difference in Means: Sampling Distribution

• Populations: $X_1 \sim N(\mu_1, \sigma^2)$ and $X_2 \sim N(\mu_2, \sigma^2)$
• Sampling Distribution of Differences in Means: $\bar{X}_1 - \bar{X}_2 \sim N(\mu_1 - \mu_2,\ \sigma_{\bar{X}_1 - \bar{X}_2}^2)$
Difference in Means: Sampling Distribution

• Standard error: $\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
• $S_p^2$: Pooled common variance estimate of the populations
• Why pooled?
• We have two estimates; pooling them gives an estimate from a larger sample
• $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$
• Confidence interval
• $(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
• $t_{\alpha/2}$: Upper tail critical value, with α significance, of the t distribution with n1 + n2 − 2 degrees of freedom (critical value for an upper tail area of α/2)
[Figure: sampling distribution of the $\bar{X}_1 - \bar{X}_2$ values, centered at µ1 − µ2]
Hypothesis testing:
Independent populations: Comparison of means
• Are the population means the same, or is one greater or lower?
• Two tailed: H0: μ1 = μ2 => μ1 - μ2 = 0
• One tail: H0: μ1 ≥ μ2 => μ1 - μ2 ≥ 0

Pooled Variance t-test

• Find the critical t-values that separate the region of non-rejection and the regions of rejection
• $t_{\alpha/2}$: Upper tail critical value, with α significance, with n1 + n2 − 2 degrees of freedom
• Decide whether the sample tSTAT falls in the region of rejection
• Alternatively, find the p-value for the tSTAT and compare it with the level of significance
• $t_{STAT} = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$
Is there a difference in store sales?
You have two stores that sell your coconut water.
You want to test if the sales at these two locations are the same or different at the 0.05 level of significance.
You draw a sample of 10 observations from each store's sales data.
You find that the sample means are 50.3 and 72 bottles resp., and
the sample standard deviations were found to be 18.73 and 12.54 resp.
• H0 and H1

• H0: μ1 = μ2 => μ1 - μ2 = 0 and H1: μ1 ≠ μ2 => μ1 - μ2 ≠ 0

• Test statistic: $\bar{X}_1 - \bar{X}_2$?

• 50.3 − 72 = −21.7

• Distribution of the test statistic (if null was true)?

• t-distribution with the difference of means equal to zero.

• Degrees of freedom for test statistic?

• n1 + n2 - 2 = 18
Example
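• Difference in sample means, $\bar{X}_1 - \bar{X}_2$: -21.7
• Standard error?
• Pooled variance: $S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = 254$
• Standard error: $\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = 7.13$
• Degrees of freedom: 18
• $t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = -3.044$
• Critical values: -2.1 and +2.1, p-value: 0.007
• p-value < α and tSTAT is in the region of rejection, so we reject H0: mean sales at the two stores differ.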
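The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch using only the summary statistics quoted on the slide; the variable names are illustrative, and the critical value 2.101 is taken from a t-table (df = 18, two-tailed α = 0.05) rather than computed.

```python
import math

# Summary statistics from the store-sales example
n1, mean1, sd1 = 10, 50.3, 18.73
n2, mean2, sd2 = 10, 72.0, 12.54

df = n1 + n2 - 2
# Pooled variance: weighted average of the two sample variances
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
# Under H0 the hypothesised difference (mu1 - mu2) is 0
t_stat = ((mean1 - mean2) - 0) / std_error

t_critical = 2.101  # t-table value, df = 18, two-tailed alpha = 0.05
reject_h0 = abs(t_stat) > t_critical

print(f"pooled variance = {pooled_var:.2f}, SE = {std_error:.2f}, t = {t_stat:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The sketch compares tSTAT with the table value instead of computing a p-value, which keeps it free of external libraries.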

Comparing the means of two related populations
When two populations (for sample 1 and sample 2) are not independent

• Outcomes of the first population are not independent of the outcomes of the second population

• Repeat measurements on the same units:

Before   After   Difference
X1b      X1a     D1 = X1a - X1b
X2b      X2a     D2 = X2a - X2b

• You treat the differences (D1, D2, …, Dn) as the population.
• Draw a random sample from that.
• Null hypothesis?
• H0: μD = 0 and H1: μD ≠ 0.
• tSTAT (with n-1 degrees of freedom): $t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$
Comparing the means of two related populations
When two populations (for sample 1 and sample 2) are not independent
• Are store-A book prices more than store-B prices?

• Suppose sample A consists predominantly of very expensive titles and sample B predominantly of inexpensive paperbacks.
• Would the results be valid?
• A better way may be to compare the prices of similar books.
• Another way is to pair the measurement units on some characteristic (paperback, hardbound, etc.)

Store-A   Store-B   Difference
Title1    Title1    D1 = T1a - T1b
Title2    Title2    D2 = T2a - T2b
Why use related populations?
• Samples may not be comparable

• Using unrelated items may make variance very high.

• By using related populations, we reduce overall variance by reducing the “unit level difference” between the two
samples.

• Lower variance reduces the β-error (probability of not rejecting a false null hypothesis).


Comparing proportions of two populations


Comparing the proportions of two independent populations

• Are population proportions same?

Two tailed:
H0: π1 = π2 => π1 - π2 = 0
H1: π1 ≠ π2 => π1 - π2 ≠ 0

One tail:
H0: π1 ≥ π2 => π1 - π2 ≥ 0
H1: π1 < π2 => π1 - π2 < 0
(or, for the other tail, H0: π1 ≤ π2 and H1: π1 > π2)

• The test statistic approximately follows the standard normal (Z) distribution
Comparing the proportions of two independent populations

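• Standard error: $\sigma_{p_1 - p_2} = \sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
• Pooled proportion estimate:
• $\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$, $p_1 = \frac{X_1}{n_1}$, $p_2 = \frac{X_2}{n_2}$
• Confidence interval
• $(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$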

All the steps are similar to the previous example. The only difference is that we now use a Z-test instead of a t-test.
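As an illustration of the Z-test steps, the sketch below borrows the "buy again" counts from the two-store survey used in the chi-square lecture (163 of 227 and 154 of 262). Pairing that data with this slide is an assumption made for illustration only; the variable names are hypothetical and 1.96 is the standard two-tailed critical value at α = 0.05.

```python
import math

# Assumed illustration data: customers who would buy again at each store
x1, n1 = 163, 227  # store 1: items of interest, sample size
x2, n2 = 154, 262  # store 2: items of interest, sample size

p1, p2 = x1 / n1, x2 / n2
p_bar = (x1 + x2) / (n1 + n2)  # pooled proportion estimate
std_error = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
# Under H0 the hypothesised difference (pi1 - pi2) is 0
z_stat = (p1 - p2) / std_error

z_critical = 1.96  # two-tailed, alpha = 0.05
reject_h0 = abs(z_stat) > z_critical

print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, Z = {z_stat:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

For a 2x2 table the square of this Z statistic equals the chi-square statistic computed from the same counts, so the two tests lead to the same decision.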


Comparing variances of two populations


Comparing variances of two independent populations
Testing for equality of variances
– H0: $\sigma_1^2 = \sigma_2^2$, or $\sigma_1^2 / \sigma_2^2 = 1$, and
– H1: $\sigma_1^2 \neq \sigma_2^2$

• F-Test
• If the two independent populations are normally distributed,
• the ratio of the sample variances follows an F-Distribution.
• $F_{STAT} = S_1^2 / S_2^2$
• Follows an F-Distribution with
• n1-1 numerator degrees of freedom and n2-1 denominator degrees of freedom
• The larger variance is taken as the numerator.
• The critical F-value is read from the F-Table.
• Reject the null if FSTAT > Fα/2 or if p-value < α.

Note: the picture depicts a one-tail F-test, though the discussion here is about the two-tail test.
Equality of variances, two independent populations
• You have two stores that sell your coconut water.
• You want to test whether sales at these two locations are the same or different at the 0.05 level of significance.
• You draw a sample of 10 days from each store.
• You find the mean sales volumes to be 50.3 and 72 bottles respectively.
• The sample standard deviations are 18.73 and 12.54 respectively.
• One of the important assumptions of the pooled t-test is that the population variances are the same.
• We will now test that assumption!
• H0: $\sigma_1^2 = \sigma_2^2$, or $\sigma_1^2 / \sigma_2^2 = 1$
• H1: $\sigma_1^2 \neq \sigma_2^2$
Equality of variances, two independent populations
F-Test
• $F_{STAT} = S_1^2 / S_2^2$ = 350.68/157.33 = 2.23
• Numerator and denominator degrees of freedom: 9 each
• Fcritical? (From the F-Table; two-tail test at .05 significance, i.e. upper-tail area 0.025): 4.03
• Should we reject the null?
• FSTAT < Fcritical
• p-value = 0.25 (p > α)
• We do not have sufficient evidence to reject the null hypothesis.
• We can proceed on the assumption that the variances are equal.
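The F statistic for this example can be reproduced as follows. This is a sketch built from the rounded standard deviations quoted on the slide, so the variances differ slightly from the 350.68 and 157.33 shown, which presumably come from the unrounded data. The critical value 4.03 is the table value quoted above, not something the code computes.

```python
# Summary statistics from the store-sales example
sd1, n1 = 18.73, 10
sd2, n2 = 12.54, 10

var1, var2 = sd1**2, sd2**2

# The larger sample variance goes in the numerator
if var1 >= var2:
    f_stat, df_num, df_den = var1 / var2, n1 - 1, n2 - 1
else:
    f_stat, df_num, df_den = var2 / var1, n2 - 1, n1 - 1

f_critical = 4.03  # F-table: upper-tail area 0.025 with 9 and 9 df
reject_h0 = f_stat > f_critical

print(f"F = {f_stat:.2f} with ({df_num}, {df_den}) degrees of freedom")
print("Reject H0" if reject_h0 else "Do not reject H0")
```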


ANOVA: Analysis of Variance


ANOVA: Analysis of variance
• So far, we learned about comparing two populations (means and variances).

• ANOVA helps compare means of more than two populations or groups.

• The criteria that separate the groups are called Factors.

• Each factor can have multiple levels (or categories).

One-way ANOVA:

• One factor

• Levels separate the groups

• Also known as completely randomized design

27
BITS Pilani, Pilani Campus
ANOVA: Analysis of variance
It’s a two Step Process

• Find whether the population means differ (H0: μ1 = μ2 = μ3 = μ4). H1?

• H1: Not all means are same or at least one mean is different from others

• If null is rejected, find out which means are different from others

• Second step is not in the syllabus, but we may review it if time permits.

28
BITS Pilani, Pilani Campus
ANOVA: Analysis of variance
• Assume you randomly select samples from “c” populations, which are normally distributed with equal variances.
• Null hypothesis: H0: μ1 = μ2 = … = μc (means of all populations are equal)
• Alternate hypothesis: Not all μj are equal (where j = 1, 2, …, c).
• At least one population mean is different from the other population means.
• We are going to partition the variation to test this hypothesis.

        Loc 1      Loc 2      Loc 3      Loc 4
        30.06      32.22      30.78      30.33
        29.96      31.47      30.91      30.29
        30.19      32.13      30.79      30.25
        29.96      31.86      30.95      30.25
        29.74      32.29      31.13      30.55
Mean    29.982     31.994     30.912     30.334
SD      0.164985   0.335306   0.142548   0.12522
ANOVA: Analysis of variance
We partition the total variation into two parts.
• Within-group variation: due to random variation or sampling error (SSW).
• Within a group, differences are due to sampling error or are random (Xi = µ + εi).
• Among-groups variation: due to variation between the groups (SSA).
• Total variation of the joint sample (SST) = SSW + SSA
ANOVA: Total Sum of Squares
Calculating the total sum of squares (total variation): SST
• Under the null hypothesis, all the group means are the same, so all values can be considered to come from one population.
• We calculate the grand mean $\bar{\bar{X}}$ and then the sum of squared deviations of all the values from it.
• $SST = \sum_{j=1}^{c}\sum_{i=1}^{n_j}(X_{ij} - \bar{\bar{X}})^2$ ($X_{ij}$: ith value from the jth group)
• Grand mean: $\bar{\bar{X}} = \frac{\sum_{j=1}^{c}\sum_{i=1}^{n_j} X_{ij}}{n}$
• n = total sample size (sum of all group sample sizes)
• Degrees of freedom associated with SST?
• n-1
• Total variance for ANOVA
• Total Mean Square: MST = SST / (n-1)
Among Group Variations
Calculating the among-group sum of squares: SSA
• Sum of squared deviations of each group mean from the grand mean
• Weighted by the group sample size (group sample sizes may differ)
• $SSA = \sum_{j=1}^{c} n_j(\bar{X}_j - \bar{\bar{X}})^2$
• Degrees of freedom associated with SSA: c-1
• Among-group variance for ANOVA
• Mean Square Among: MSA = SSA / (c-1)
Within Group Variations

Calculating the within-groups sum of squares: SSW
• Sum of squared deviations of individual values from their group means
• $SSW = \sum_{j=1}^{c}\sum_{i=1}^{n_j}(X_{ij} - \bar{X}_j)^2$
• Degrees of freedom associated with SSW: n-c
• (Each group contributes $n_j - 1$ degrees of freedom)
• Within-group variance for ANOVA
• Mean Square Within: MSW = SSW / (n-c)
Using F-Test for One-Way ANOVA
• We used the F-Test to test the equality of two variances.
• Under the null hypothesis all groups belong to the same population.
• Given this, the among-group variance estimate (MSA) and the within-group variance estimate (MSW) should be approximately equal.
• One-Way ANOVA test statistic: $F_{STAT} = \frac{MSA}{MSW}$
• Follows an F-Distribution with c-1 numerator degrees of freedom and n-c denominator degrees of freedom
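The sums of squares and the F statistic can be computed directly for the four-location data shown in the earlier ANOVA slides. This is a sketch under the assumption that those 20 values are the full sample; the critical value 3.24 is read from an F-table (α = 0.05, 3 and 16 degrees of freedom) and is not stated on the slide.

```python
# Data from the four-location example (5 observations per location)
groups = [
    [30.06, 29.96, 30.19, 29.96, 29.74],  # Loc 1
    [32.22, 31.47, 32.13, 31.86, 32.29],  # Loc 2
    [30.78, 30.91, 30.79, 30.95, 31.13],  # Loc 3
    [30.33, 30.29, 30.25, 30.25, 30.55],  # Loc 4
]

c = len(groups)                  # number of groups
n = sum(len(g) for g in groups)  # total sample size
grand_mean = sum(sum(g) for g in groups) / n
group_means = [sum(g) / len(g) for g in groups]

# Among-group, within-group and total sums of squares
ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

msa = ssa / (c - 1)
msw = ssw / (n - c)
f_stat = msa / msw

f_critical = 3.24  # F-table value: alpha = 0.05, 3 and 16 df (one-tailed)
print(f"SSA = {ssa:.4f}, SSW = {ssw:.4f}, SST = {sst:.4f}")
print(f"MSA = {msa:.4f}, MSW = {msw:.4f}, F = {f_stat:.2f}")
print("Reject H0" if f_stat > f_critical else "Do not reject H0")
```

The printed SST should equal SSA + SSW, which is a quick check that the partition of variation was computed correctly.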

34
BITS Pilani, Pilani Campus
Using F-Test for One-Way ANOVA
• Reject the null if FSTAT > Fα
• Why not Fα/2?
• MSW captures only the sampling error: $SSW = \sum_{j=1}^{c}\sum_{i=1}^{n_j}(X_{ij} - \bar{X}_j)^2$
• MSA additionally captures potential group differences, if they exist: $SSA = \sum_{j=1}^{c} n_j(\bar{X}_j - \bar{\bar{X}})^2$
• If not all means are equal, MSA tends to exceed MSW, so only a large FSTAT (the right tail) is evidence against the null.
• This concludes the first step.
• Rejection of the null hypothesis only means that not all means are equal.
• If the null was rejected, we still do not know which means differ.

35
BITS Pilani, Pilani Campus
ANOVA Table

• In Questions
• You may have to create the ANOVA Table, or
• You may be given an ANOVA Table to interpret

• There could be an additional column with p-value of the test statistic


• Remember: ANOVA uses one tailed F-Test (Entire α proportion is in the right tail)
36
BITS Pilani, Pilani Campus
ANOVA F-Test assumptions
• Independence
• Samples are independent of each other. Measurement of one is not related to the measurement in the other

• Normality
• Group populations are normally distributed

• Test is relatively robust and works well for approximately normal populations

• Homogeneity of variance
• Population variances are same

• For same groups sizes, the test is quite robust against small variations

• For different group sizes, this can bias our results

• Test of equal variance is an important requirement

• Levene's test (an F-test on absolute deviations from group medians) can be used to test for equal variances

37
BITS Pilani, Pilani Campus
When to use which test

• Compare two populations


• Pooled-variance t-test for mean differences between two independent populations
• Paired t-test for mean differences between two dependent populations
• F-test for differences in variances between two independent populations
• Z-test for differences between two proportions

• More than two populations


• One-Way ANOVA for difference in independent population means

38
BITS Pilani, Pilani Campus
Multiple Mean Comparisons
• Finding out whether the means are different is the first step.
• If the null is rejected, the second step is to find out which of the means are different.
• Instead of running individual t-tests (for difference in means) for every possible pair of means, we can run the joint Tukey-Kramer multiple comparisons procedure.
• It has two steps
• Compute the absolute mean differences $|\bar{X}_j - \bar{X}_i|$, where i and j refer to different groups (i ≠ j)
• How many pairs are there among c groups?
• c(c-1)/2 (why?)
• Compute the critical range for the Tukey-Kramer procedure for each pair of sample sizes
• Declare a group pair to be different if the absolute mean difference is more than the critical range
Multiple Mean Comparisons – Stores example
• Qα is 4.05
• Studentized range critical value for 0.05 significance, with 4 numerator and 16 denominator degrees of freedom
• Pair-wise ranges
• Which of the groups are different?
• Only in pair #3 is there no difference
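The pair-wise table from this slide did not survive extraction, so the sketch below rebuilds the comparison under two assumptions: that the example refers to the four-location ANOVA data, and that the critical range is the usual Tukey-Kramer form, $Q_\alpha\sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)}$. The group means, SSW = 0.7026 and Qα = 4.05 are taken from the slides; the names in the code are illustrative.

```python
import math
from itertools import combinations

# Group means and sizes from the four-location ANOVA example (assumed data)
group_means = {1: 29.982, 2: 31.994, 3: 30.912, 4: 30.334}
group_sizes = {1: 5, 2: 5, 3: 5, 4: 5}
msw = 0.7026 / 16  # SSW / (n - c) for the same data
q_alpha = 4.05     # studentized range value quoted on the slide


def critical_range(n_a, n_b):
    """Tukey-Kramer critical range for a pair of group sizes."""
    return q_alpha * math.sqrt(msw / 2 * (1 / n_a + 1 / n_b))


results = []
for a, b in combinations(sorted(group_means), 2):
    diff = abs(group_means[a] - group_means[b])
    cr = critical_range(group_sizes[a], group_sizes[b])
    results.append((a, b, diff, cr, diff > cr))

for a, b, diff, cr, differs in results:
    verdict = "different" if differs else "not different"
    print(f"Loc {a} vs Loc {b}: |diff| = {diff:.3f}, critical range = {cr:.3f} -> {verdict}")

not_different = [(a, b) for a, b, _, _, differs in results if not differs]
```

With pairs listed in the order (1,2), (1,3), (1,4), ..., the only pair that is not declared different is the third one, Loc 1 versus Loc 4, which agrees with the slide.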

40
BITS Pilani, Pilani Campus
Q&A

41
Quantitative Methods

Lecture-12
BITS Pilani
Pilani Campus

1
BITS Pilani
Pilani Campus

Chi-square tests
(Ch 11 Business Statistics, Levine et al.)
Today’s session

➢ Chi-square tests.
➢ Goodness of fit.
➢ Simultaneously testing the difference between two or more proportions.
➢ Test of independence.

3
BITS Pilani, Pilani Campus
The Concept behind chi-square test
• A population “X” has certain proportions of defined attributes.

• Say, 50% of humans in the population are Females, and 50% are Male.

• You draw a representative sample (say n=50) from this population.

• What is the expected proportion of Males and Females in this sample?

• 50% each.

• What are the expected frequencies of Males and Females in this sample?

• 25 each. Why?

• Observed frequencies would be slightly different, due to sampling error.

• However, the difference between observed and expected frequencies should be close to zero.

BITS Pilani, Pilani Campus


The concept behind chi-square test
• If you find that the differences between expected and observed frequencies are close to zero,
• you can assume that the sample is from the expected population.
• If you find that the differences are too large, you may conclude that it is not from the expected population.
• How large a difference is too large?
• That’s where the chi-square distribution comes to our help.
• Test statistic: $\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$ (chi-square statistic)
• $f_o$: observed frequency in the sample; $f_e$: expected frequency if the proportions were the same.
• If the null hypothesis of no difference is correct, $\chi^2_{STAT}$ follows a chi-square distribution with certain degrees of freedom (that we will discuss in each case).

BITS Pilani, Pilani Campus


The concept behind chi-square test
• Test statistic: $\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$ (chi-square statistic)
• $f_o$: observed frequency; $f_e$: expected frequency if the proportions were the same.
• Null hypothesis: no difference in proportions, samples from the same population, etc.
• If the null hypothesis of no difference is correct, $\chi^2_{STAT}$ follows a chi-square distribution with df degrees of freedom.
• If the null hypothesis is true, what should $\chi^2_{STAT}$ be?
• It should be close to zero, or in the region of non-rejection.
• If $\chi^2_{STAT}$ is in the region of rejection, we can reject the null of no difference.
• Notice that the distribution starts at 0 (the leftmost point).
• Rejection region means too far from zero.
• In the chi-square test, the entire α probability region is in the right tail.

BITS Pilani, Pilani Campus


BITS Pilani
Pilani Campus

Chi-Square test: Goodness of fit


Chi-square: Goodness of fit
Primary question

• Does the observed data fit the assumed distribution?

• Degrees of freedom: (Number_of_values) – 1 – (no_of_estimated_parameters)

• no_of_estimated_parameters: In normal distribution you may have to estimate the mean and the
standard deviation

BITS Pilani, Pilani Campus


The goodness of fit
Does the following sample represent a “uniform distribution”? Is there an equal proportion of each option/value in the population?

Option   Observed frequency (fo)   Expected frequency (fe)   (fo-fe)^2   (fo-fe)^2/fe
A        15                        20                        25          1.25
B        19                        20                        1           0.05
C        25                        20                        25          1.25
D        21                        20                        1           0.05
Total    80                        80                                    2.6

$\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = 2.6$

Degrees of freedom:
Number of values - 1 - no_of_estimated_parameters

BITS Pilani, Pilani Campus


The goodness of fit
• $\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = 2.6$ (from the table of options A to D)
• Degrees of freedom: Number of values - 1 - no_of_estimated_parameters = 4 - 1 - 0 = 3
• Significance: 0.05
• Critical value: 7.815
• Reject H0 if $\chi^2_{STAT} > \chi^2_{\alpha}$
• Conclusion: 2.6 < 7.815, so we do not reject H0; the data are consistent with a uniform distribution.
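The goodness-of-fit calculation for options A to D can be sketched in a few lines. The observed counts and the critical value 7.815 come from the slide; the variable names are illustrative.

```python
# Observed frequencies for options A-D
observed = {"A": 15, "B": 19, "C": 25, "D": 21}

n = sum(observed.values())
# Under a uniform distribution every option has the same expected frequency
expected = n / len(observed)

chi_sq = sum((fo - expected) ** 2 / expected for fo in observed.values())

# No parameters were estimated from the data for a uniform distribution
df = len(observed) - 1 - 0
chi_critical = 7.815  # chi-square table: alpha = 0.05, df = 3
reject_h0 = chi_sq > chi_critical

print(f"chi-square = {chi_sq:.2f}, df = {df}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```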

BITS Pilani, Pilani Campus


Goodness of fit examples: Fair dice?
• A die is rolled 600 times and the frequency of each outcome (1 to 6) is noted.

Outcome   Frequency
1         80
2         105
3         100
4         95
5         100
6         120
Total     600

• Is it a fair die?
• H0: Data comes from a fair-die population.
• How would you approach it?
• Observed frequencies are given. What are the expected proportions?
• What are the expected frequencies? 100 each (why?)
• Degrees of freedom: 5 (why?)
• Calculate the sample chi-square statistic, $\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$, and compare it with the critical value.
• Reject H0 if $\chi^2_{STAT} > \chi^2_{\alpha}$
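The slide leaves the calculation as an exercise; one way to work through it is sketched below. The significance level of 0.05 is an assumption (the slide does not state one), and 11.070 is the chi-square table value for that level with 5 degrees of freedom.

```python
# Observed frequencies of outcomes 1-6 in 600 rolls
observed = {1: 80, 2: 105, 3: 100, 4: 95, 5: 100, 6: 120}

n = sum(observed.values())
# A fair die gives each face probability 1/6
expected = n / len(observed)

chi_sq = sum((fo - expected) ** 2 / expected for fo in observed.values())

df = len(observed) - 1  # no parameters estimated from the data
chi_critical = 11.070   # chi-square table: assumed alpha = 0.05, df = 5
reject_h0 = chi_sq > chi_critical

print(f"expected frequency = {expected:.0f}, chi-square = {chi_sq:.2f}, df = {df}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

At this assumed significance level the statistic falls below the critical value, so the data would not give sufficient evidence that the die is unfair.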

BITS Pilani, Pilani Campus


Goodness of fit examples: Normal distribution?
• Your coconut water shop recorded the time taken to serve the customers (minutes).

• Is the time taken normally distributed?
• H0: Data fits a normal distribution.
• How will you approach it?
• Test statistic: $\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$
• Degrees of freedom:
• 7-1-2 = 4 (why?)
• Two parameters, µ and σ, need to be estimated to find expected frequencies.
• Find expected proportions for each bin (interval) (How?).
• Find expected frequencies for each interval (How?).
• Calculate the test statistic and critical value, and test the hypothesis (How?).
• Reject H0 if $\chi^2_{STAT} > \chi^2_{\alpha}$

BITS Pilani, Pilani Campus


BITS Pilani
Pilani Campus

Chi-square test: Difference in two proportions


Chi-square: Difference in two proportions
• You have two stores selling coconut water. You ran a satisfaction survey.

• You asked customers of each store: “Will you buy again?”

• You want to test whether the proportion of customers willing to buy again is the same for both stores.
• H0: π1 = π2
• H1: π1 ≠ π2
• Test statistic: $\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$ (chi-square statistic)
• We need to answer two questions.
• Degrees of freedom? 1 (Why?)
• How do we calculate $f_e$, the expected frequency if the proportions were the same at both stores?

BITS Pilani, Pilani Campus


Difference in two proportions
General layout:

Row                          Group-1 (Store-1)   Group-2 (Store-2)   Total
Items of interest (yes)      X1                  X2                  X
Items not of interest (no)   n1 - X1             n2 - X2             n - X
Total sample                 n1                  n2                  n

Survey data:

Row                 Store-1   Store-2   Total
Buy-again? (Yes)    163       154       317
Buy-again? (No)     64        108       172
Total sample        227       262       489

• How to calculate $f_e$?
• Under the null hypothesis, the proportion of items of interest would be the same in both stores.
• We can take the proportion from store-1, store-2, or from the total. Which estimate is better?
• Overall estimated proportion of the items of interest: $\bar{p} = \frac{X}{n}$
• X: total number of items of interest, n: total sample size.
• $f_e$ = column_total × $\bar{p}$ for items of interest, and column_total × $(1 - \bar{p})$ for items not of interest.

BITS Pilani, Pilani Campus


Chi-square: Difference in two proportions
Expected proportion: $\bar{p} = \frac{X}{n} = \frac{\text{total items of interest}}{\text{total sample size}} = \frac{317}{489} = 0.648$, and $(1 - \bar{p}) = 0.352$

Row                 Store-1 Observed   Store-1 Expected   Store-2 Observed   Store-2 Expected   Total
Buy-again? (Yes)    163                147.16             154                169.84             317
Buy-again? (No)     64                 79.84              108                92.16              172
Total sample        227                227                262                262                489

• Store-1: total customers (sample)?
• 227
• Expected proportion of the items of interest?
• 0.648
• Expected frequency?
• fe = 227 × (317/489) = 147.16 (using the unrounded proportion)

BITS Pilani, Pilani Campus


Chi-square: Difference in two proportions
Row                 Store-1 Observed   Store-1 Expected       Store-2 Observed   Store-2 Expected   Total
Buy-again? (Yes)    163                227 × .648 = 147.16    154                169.84             X = 317, $\bar{p}$ = .648
Buy-again? (No)     64                 227 × .352 = 79.84     108                92.16              172, $(1 - \bar{p})$ = .352
Total sample        227                227                    262                262                489

fo       fe       fo-fe    (fo-fe)^2   (fo-fe)^2/fe
163.00   147.16   15.84    250.91      1.70
154.00   169.84   -15.84   250.91      1.48
64.00    79.84    -15.84   250.91      3.14
108.00   92.16    15.84    250.91      2.72
Total                                  9.05
Chi-square: Difference in two proportions
• Test statistic: $\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$ (chi-square statistic)
• $f_o$: observed frequency; $f_e$: expected frequency if the proportions were the same.
• If the null hypothesis is correct, $\chi^2_{STAT}$ follows a chi-square distribution with 1 degree of freedom.
• Degrees of freedom = (nrows-1)*(ncols-1) if the data is presented in a contingency table.
• Row and column totals are fixed and hence one degree of freedom is lost in each.
• The mechanics of hypothesis testing remain the same.
• Calculate the test statistic ($\chi^2_{STAT}$) for our data.
• Compare it with the cutoff value of the χ2 distribution with 1 degree of freedom for a given α.
• Reject the null if $\chi^2_{STAT}$ is in the region of rejection, or if the p-value is less than α.
Chi-square: Difference in two proportions
• $\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = 9.05$ (the sum of the (fo-fe)^2/fe column: 1.70 + 1.48 + 3.14 + 2.72)
• Reject H0 if $\chi^2_{STAT} > \chi^2_{\alpha}$
• Should we reject the null hypothesis?
• Yes: at α = 0.05 with 1 degree of freedom the critical value is 3.841, and 9.05 > 3.841.
Conclusion: The proportion of customers who are willing to buy again differs between the two stores.
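The expected frequencies and the chi-square statistic for the two-store survey can be reproduced as follows. This sketch uses the observed counts from the slides; α = 0.05 and the table value 3.841 (1 degree of freedom) are assumptions, since the slide does not state the significance level.

```python
# Observed counts: rows = buy again (yes, no), columns = store 1, store 2
observed = [
    [163, 154],  # yes
    [64, 108],   # no
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
p_bar = row_totals[0] / n  # overall proportion of "yes"

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        # Same as column_total * p_bar for "yes", column_total * (1 - p_bar) for "no"
        fe = row_totals[i] * col_totals[j] / n
        chi_sq += (fo - fe) ** 2 / fe

df = (len(observed) - 1) * (len(observed[0]) - 1)
chi_critical = 3.841  # chi-square table: assumed alpha = 0.05, df = 1
reject_h0 = chi_sq > chi_critical

print(f"p_bar = {p_bar:.3f}, chi-square = {chi_sq:.2f}, df = {df}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```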

BITS Pilani, Pilani Campus


Chi-square: Difference in two proportions
• Reject H0 if $\chi^2_{STAT} > \chi^2_{\alpha}$
• Otherwise, do not reject the null hypothesis.
• If the null is true, expected and observed frequencies should be close to each other.
• $\chi^2_{STAT}$ should be close to zero.
• Notice that the left boundary of the distribution is zero.
• The entire α probability region is in the right tail.
• Too far from zero means the proportions are different.

BITS Pilani, Pilani Campus


BITS Pilani
Pilani Campus

Chi-square test: Difference in more than two proportions
Chi-square: Difference in more than two proportions
• H0: π1 = π2 = π3 =... = πc
• H1: at least one proportion is different from the others.

• Now the contingency table has two rows and more than two columns (‘c’ columns).
• Test statistic: $\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$ (chi-square statistic)
• $f_o$: observed frequency; $f_e$: expected frequency if the null hypothesis is true.
• Overall estimated proportion of items of interest: $\bar{p} = \frac{X}{n}$, X = total number of items of interest, n = total sample size.
• If the null hypothesis is correct, $\chi^2_{STAT}$ follows a chi-square distribution with c-1 degrees of freedom (why?).
• The mechanics of the hypothesis testing remain the same.


BITS Pilani, Pilani Campus
Chi-square: Difference in more than two proportions
• Assume we have three stores selling coconut water.
H0: π1 = π2 = π3
(The proportion of customers willing to buy again is the same in all the stores)

Row                 Store-1 Observed   Store-2 Observed   Store-3 Observed   Total
Buy-again? (Yes)    128                199                186                X = 513
Buy-again? (No)     88                 33                 66                 187
Total sample        216                232                252                700

Expected proportion: $\bar{p} = \frac{X}{n} = \frac{\text{total items of interest}}{\text{total sample size}} = \frac{513}{700} = 0.733$, and $(1 - \bar{p}) = 0.267$

BITS Pilani, Pilani Campus


Frequency table and chi-square statistic
Cell            fo    fe       (fo-fe)^2   (fo-fe)^2/fe
Store-1 / Yes   128   158.30   917.92      5.80
Store-1 / No    88    57.70    917.92      15.91
Store-2 / Yes   199   170.02   839.67      4.94
Store-2 / No    33    61.98    839.68      13.55
Store-3 / Yes   186   184.68   1.74        0.01
Store-3 / No    66    67.32    1.74        0.03
Total (χ2)                                 40.23

Test statistic: $\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$ (chi-square statistic)
Chi-square: Difference in more than two proportions

Row                 Store-1 Observed   Store-1 Expected   Store-2 Observed   Store-2 Expected   Store-3 Observed   Store-3 Expected   Total
Buy-again? (Yes)    128                158.30             199                170.02             186                184.68             X = 513
Buy-again? (No)     88                 57.70              33                 61.98              66                 67.32              187
Total sample        216                216                232                232                252                252                700

• Degrees of freedom?
• (n_rows-1)*(n_cols-1) = (2-1)*(3-1) = 2
• $\chi^2_{STAT}$ = 40.23
• Critical value $\chi^2_{\alpha=.05,\,df=2}$ = 5.991 (XLS formula: CHISQ.INV.RT(α, df))
• Reject H0 if $\chi^2_{STAT} > \chi^2_{\alpha}$
• Should we reject the null hypothesis?
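The same calculation extends to three stores. The sketch below wraps it in a small helper, `chi_square_statistic`, which is a hypothetical name introduced here for illustration; the observed counts and the critical value 5.991 are taken from the slides.

```python
def chi_square_statistic(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # expected frequency under H0
            stat += (fo - fe) ** 2 / fe
    return stat


# Rows = buy again (yes, no); columns = store 1, store 2, store 3
observed = [
    [128, 199, 186],
    [88, 33, 66],
]

chi_sq = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
chi_critical = 5.991  # alpha = 0.05, df = 2 (value quoted on the slide)
reject_h0 = chi_sq > chi_critical

print(f"chi-square = {chi_sq:.2f}, df = {df}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Since 40.23 is far above 5.991, the null hypothesis of equal proportions is rejected: at least one store differs from the others.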

BITS Pilani, Pilani Campus


Chi-square more than two proportions: Important notes

• For accurate results:

• No more than 20% of the cells should have an expected frequency below 5 (that is, at least 80% of the cells should have an expected frequency of 5 or more).
• Expected frequency in each cell must be at least 1.
• Columns can be merged if these criteria are not met.

BITS Pilani, Pilani Campus


BITS Pilani
Pilani Campus

Chi-square test: Test of independence


Chi-square: Test of independence
• Your coconut stores survey also had one additional question.

If you do not plan to purchase again, please let us know why?

1. Price?

2. Location?

3. Staff Behavior?

4. Others?

• Now you would like to test whether the reason for not buying again is independent of the store.

• In other words, are the proportions of these reasons the same across the stores?

BITS Pilani, Pilani Campus


Chi-square: Test of independence
• H0: Two categorical variables are independent (there is no relationship between them).

• H1: Two categorical variables are dependent. (there is a relationship between them).

In the test of proportions


• We have one factor with two or more levels (stores).
• We test whether proportion of items of interest is same across all levels (stores).

In the test of independence


• We have two factors (“stores” and “the reason for not buying again”). Each factor has two or more
levels.

BITS Pilani, Pilani Campus


Chi-square: Test of independence
• Assume we have three stores selling coconut water.

Why will not buy again?   Store-1 Observed   Store-2 Observed   Store-3 Observed   Total
Price                     23                 7                  37                 67
Location                  39                 13                 8                  60
Staff behavior            13                 5                  13                 31
Others                    13                 8                  8                  29
Total                     88                 33                 66                 187

• H0: “Reasons to not buy again” is independent of the stores (there is no relationship).
• H1: “Reasons to not buy again” is dependent on the stores (there is a relationship between them).

BITS Pilani, Pilani Campus


Chi-square: Test of independence
• H0: Two categorical variables are independent (there is no relationship between them).
• H1: Two categorical variables are dependent (there is a relationship between them).
• Now the contingency table can have more than two rows (‘r’ rows) and more than two columns (‘c’ columns).
• Test statistic: $\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$ (chi-square statistic)
• $f_o$: observed frequency; $f_e$: expected frequency if the null hypothesis is true.
• If the null hypothesis is correct, $\chi^2_{STAT}$ follows a chi-square distribution with (number of rows - 1)*(number of columns - 1) degrees of freedom.
• All the other mechanics of the hypothesis testing remain the same.

BITS Pilani, Pilani Campus


Chi-square: Test of independence
Calculating the expected frequency (from the observed store-by-reason table)

• Expected proportion: P(A and B) = P(A)*P(B) [under the null hypothesis of independence]
• P(Price and Store_1) = P(Price)*P(Store_1) = (67/187)*(88/187) = 0.169
• Expected frequency (first cell): 0.169*187 = 31.53
• Equivalently, fe = (row total × column total) / n = 67 × 88 / 187 = 31.53

Degrees of freedom?

• df = (nrows - 1)*(ncols - 1) = 3*2 = 6

BITS Pilani, Pilani Campus


Chi-square: Test of independence
Cell              fo   fe      (fo-fe)   (fo-fe)^2   (fo-fe)^2/fe
Store1/Price      23   31.53   -8.53     72.76       2.31
Store2/Price      7    11.82   -4.82     23.23       1.97
Store3/Price      37   23.65   13.35     178.22      7.54
Store1/Location   39   28.24   10.76     115.78      4.1
Store2/Location   13   10.59   2.41      5.81        0.55
Store3/Location   8    21.18   -13.18    173.71      8.2
Store1/Staff      13   14.59   -1.59     2.53        0.17
Store2/Staff      5    5.47    -0.47     0.22        0.04
Store3/Staff      13   10.94   2.06      4.24        0.39
Store1/Others     13   13.65   -0.65     0.42        0.03
Store2/Others     8    5.12    2.88      8.29        1.62
Store3/Others     8    10.24   -2.24     5.02        0.49
Total                                                27.41

• Test chi-square statistic: 27.41
• Chi-square cut-off (df: 6 and alpha: .05): 12.592
• Should we reject the null hypothesis?
• Conclusion: 27.41 > 12.592, so we reject H0; stores and reasons for not buying again are not independent.
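The expected-frequency rule P(A and B) = P(A)*P(B) reduces to (row total × column total) / n for each cell. A short sketch of the whole test follows, using the observed counts and the cut-off 12.592 from the slides; the variable names are illustrative.

```python
# Rows = reason (Price, Location, Staff behavior, Others); columns = stores 1-3
reasons = ["Price", "Location", "Staff behavior", "Others"]
observed = [
    [23, 7, 37],
    [39, 13, 8],
    [13, 5, 13],
    [13, 8, 8],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency under independence: n * P(row) * P(column)
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_sq = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)
chi_critical = 12.592  # alpha = 0.05, df = 6 (value quoted on the slide)
reject_h0 = chi_sq > chi_critical

for reason, exp_row in zip(reasons, expected):
    print(reason, [round(fe, 2) for fe in exp_row])
print(f"chi-square = {chi_sq:.2f}, df = {df}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```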


BITS Pilani, Pilani Campus
Degree of freedom, df- in statistics
1. Variance = Sum of squared deviations from the mean / df
▪ For population variance, df = number of observations.
▪ For sample variance, df = number of observations - 1.
▪ 1 df is lost since the mean is computed from the sample.

2. t distribution
▪ df = sample size - 1
▪ 1 df is lost since the standard deviation is computed from the sample.

3. F distribution
▪ dfN = sample sizeN - 1, dfD = sample sizeD - 1
▪ 1 df is lost each in the numerator and the denominator since the means are computed from the samples.

4. Chi-square distribution: df depends on the problem
▪ df = (No. of columns - 1)*(No. of rows - 1) in contingency tables.
▪ df = Number of frequencies - 1, if the data contains only one row or column.

BITS Pilani, Pilani Campus


BITS Pilani
Pilani Campus

Relationship between two variables


Ch 2 (Scatter Plots) and Ch 3 (Correlations and Covariance)
Business Statistics, Levine et al.
Relationship between two variables?
• What do we mean by “relationship between two variables”?

• Information about one variable conveys some information about the other.

Some examples

• Does knowing darkness of the clouds give information about chance of rain?

• Yes, there is a relationship. Positive: When one goes up, other also goes up on an average.

• Is there a relationship between number of absent days and total marks?

• Yes, there is a relationship. Negative: When one goes up, other goes down on an average.

• Is there a relationship between student weight and total marks? (No relationship)

• Knowing weight of a student does not give any information about her/his potential marks in the exam.

BITS Pilani, Pilani Campus


Scatter plots and correlations
• Scatter plot: Visualizing the relationship between Var-A and Var-B.

• Correlations: A way to quantify the nature of the linear relationship.

• What happens to the variable on the Y-axis when the variable on the X-axis increases?

• See it through a distribution lens and not through an absolute-value lens.

• A: Expected value: E[Y | X=400].

• B: Expected value: E[Y | X=1200].

• Does E[Y] increase linearly with X? (Positive correlation)

• Does E[Y] decrease linearly with X? (Negative correlation)

• When X increases, does E[Y] remain the same? (No correlation)

• Which type of correlation do 10-year bond yields and the US equity market have?

BITS Pilani, Pilani Campus


Relationship and Linear Relationship

• Linear relationship: you can draw a trend line.
• Relationship but not linear: you cannot draw a trend line; only a relationship curve can be drawn.

BITS Pilani, Pilani Campus


Linear relationship: Covariance
Measures the strength of the linear relationship between two variables.
Sample covariance: $cov(X,Y) = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$
n: Sample size

$\bar{X}$ = 5, $\bar{Y}$ = 7, n = 4

X   Y    $X - \bar{X}$   $Y - \bar{Y}$   $(X - \bar{X})(Y - \bar{Y})$
2   4    -3              -3              9
4   6    -1              -1              1
6   8    1               1               1
8   10   3               3               9

$\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})$ = 20
cov(X,Y) = 20/3 = 6.67

BITS Pilani, Pilani Campus


Issues with Covariance
• $cov(X,Y) = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$
• The value is unit dependent.
• Our example (centimeters v/s millimeters):

X (cm)   Y (cm)   X (mm)   Y (mm)
2        4        20       40
4        6        40       60
6        8        60       80
8        10       80       100

• For cm values: cov(X,Y) = 6.67 cm2
• For mm values: cov(X,Y) = 666.67 mm2
• This makes the interpretation difficult.
• The value is uncapped: it can become arbitrarily large as the scale of the data increases.
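The unit dependence is easy to see by computing the sample covariance for both versions of the data. A minimal sketch, with `sample_covariance` as an illustrative helper name:

```python
def sample_covariance(xs, ys):
    """Sample covariance with n - 1 in the denominator."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


# The same four observations in centimetres and in millimetres
x_cm, y_cm = [2, 4, 6, 8], [4, 6, 8, 10]
x_mm = [10 * x for x in x_cm]
y_mm = [10 * y for y in y_cm]

cov_cm = sample_covariance(x_cm, y_cm)
cov_mm = sample_covariance(x_mm, y_mm)

print(f"cov in cm^2 = {cov_cm:.2f}")
print(f"cov in mm^2 = {cov_mm:.2f}")
```

Multiplying both variables by 10 multiplies the covariance by 100, even though the relationship between X and Y has not changed.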

BITS Pilani, Pilani Campus


Linear relationship: Coefficient of Correlation
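• Measures the strength of the linear relationship between two variables.
• Coefficient of correlation: $r$ (or $\rho$) $= \frac{cov(X,Y)}{S_X S_Y} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2 \sum_{i=1}^{n}(Y_i - \bar{Y})^2}}$
• $S_X = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1}}$
• $S_Y = \sqrt{\frac{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}{n - 1}}$

$\bar{X}$ = 5, $\bar{Y}$ = 7

X      Y      $X - \bar{X}$   $Y - \bar{Y}$   $(X - \bar{X})^2$   $(Y - \bar{Y})^2$   $(X - \bar{X})(Y - \bar{Y})$
2      4      -3              -3              9                   9                   9
4      6      -1              -1              1                   1                   1
6      8      1               1               1                   1                   1
8      10     3               3               9                   9                   9
Sum:                                          20                  20                  20

r = ? r = 20 / sqrt(20 × 20) = 20/20 = 1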
Coefficient of correlation
• Unit Independent.
• Varies between -1 and +1
• -1: Perfect negative linear correlation
• 0: No linear correlation.
• +1: Perfect positive linear correlation.
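Unit independence can be checked on the cm and mm versions of the covariance example. This sketch uses the sums-of-squares form of the formula; `correlation` is an illustrative helper name.

```python
import math


def correlation(xs, ys):
    """Coefficient of correlation r from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


x_cm, y_cm = [2, 4, 6, 8], [4, 6, 8, 10]
x_mm = [10 * x for x in x_cm]
y_mm = [10 * y for y in y_cm]

r_cm = correlation(x_cm, y_cm)
r_mm = correlation(x_mm, y_mm)
r_negative = correlation(x_cm, [-y for y in y_cm])  # reversing Y flips the sign

print(r_cm, r_mm, r_negative)
```

The n - 1 terms cancel between the covariance and the two standard deviations, which is why the sums alone are enough.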

42
Q&A

43
Quantitative Methods

Lecture-13
BITS Pilani
Pilani Campus

1
BITS Pilani
Pilani Campus

Simple Linear Regression


(Ch 12 Business Statistics, Levine et al.)
Types of relationships
Linear relationships

• Positive Linear Relationship: When X increases, Y increases

• Negative Linear Relationship: When X increases, Y decreases

Non-linear relationships and No relationships

• Positive Curvilinear Relationships

• Negative Curvilinear Relationships

• U shaped Curvilinear Relationships

• No Relationships



Sample data and the scatter plot

Store   Customers (Lakhs)   Annual Sales (Lakhs)
1       3.7                 5.7
2       3.6                 5.9
3       2.8                 6.7
4       5.6                 9.5
5       3.3                 5.4
6       2.2                 3.5
7       3.3                 6.2
8       3.1                 4.7
9       3.2                 6.1
10      3.5                 4.9
11      5.2                 10.7
12      4.6                 7.6
13      5.8                 11.8
14      3                   4.1

[Figure: scatter plot of Annual Sales (Y) against Customers within a 5km radius (X)]

Is there a relationship between X and Y?
Predictions: Point Estimate

[Figure: scatter plot of Annual Sales (Y) against X: Customers within a 5km radius, with a horizontal line at $\bar{Y} = 6.63$ and the error of one point $Y_i$ marked]

What is the point estimate of Y, without taking the relationship between X and Y into account?

Mean is the point estimate without taking the relationship into account: $\bar{Y} = 6.63$

Error/Residual: $\varepsilon_i = Y_i - \bar{Y}$

Is every value of Y equal to $\bar{Y}$?



Taking the relationship into account: Fitting a line.

[Figure: Customers (Lakhs) X Line Fit Plot, showing Annual Sales (Lakhs) Y and Predicted Annual Sales (Lakhs) Y against Customers (Lakhs) X, with the mean line $\bar{Y} = 6.63$, an observed point $Y_i$, its fitted value $\hat{Y}$, the error explained by the line and the residual error]

Fitted line: $\hat{Y} = -1.21 + 2.07X$

At X = 5.3, $\hat{Y} = 9.8$

Is this estimate better than the point estimate?

Regression helps us estimate the equation of the linear relationship.



Simple Linear Regression
• Two variables (X and Y).

• They are assumed to have a linear relationship (increasing or decreasing).

• Y is our variable of business interest.

• We want to predict the value of Y, given a certain value of X.

• X is called the independent variable. Its value is determined outside the system (Exogenous).

• Y is called the dependent variable. It is sometimes also referred to as the outcome or response variable.

• Value of Y is determined within the system (Endogenous).

• When independent variable X changes, Y also changes in a predictable way.



Linear relationship: equation of a line
• Independent variable (X) is shown on the X axis.

• Dependent variable (Y) is shown on the Y axis.

• Equation of a line: $Y = \beta_0 + \beta_1 X$

Intercept ($\beta_0$)

• The point where the line meets the Y axis.

• Value of Y, when X is 0

Slope ($\beta_1$)

• When the value of X goes up by 1 unit, the value of Y changes by $\beta_1$ units.

• Green line: Y = 1 + 2X (intercept? Slope?)

• Intercept: $\beta_0 = 1$, Slope: $\beta_1 = 2$

• Red line: Y = 3 - 0.5X (Slope $\beta_1$ is negative: -0.5)

[Figure: the green and red lines plotted with X from 0 to 5 on the horizontal axis and Y from 0 to 6 on the vertical axis]


Simple Linear Regression Model
• Population regression equation.

• $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

• $\beta_0, \beta_1$ are the intercept and slope population parameters

• $\varepsilon_i$: Random Error

• Expected value of Y is the line.

• $E(Y) = \beta_0 + \beta_1 X$

• Line is the best average fit.



Fitting a regression line from sample data

• We fit a line based on the sample data. If certain assumptions are met, this line can be used to make population predictions.

[Figure: scatter plot of Annual Sales (Lakhs) against Customers within a 5 km radius, with the fitted line]

• Estimated regression line: $\hat{Y}_i = b_0 + b_1 X_i$

• Notice the “hat” on top of the dependent variable. “Hat” represents estimated values of Y.



How to fit a line? Which of the lines fits the best?

[Figure: scatter plot of Annual Sales (Lakhs) against Customers within a 5 km radius, with several candidate lines]

• Least Squares Method

• The line that minimizes the sum of squared residual errors is the best fit.

• $\min \sum_{\text{all values}} (Y_i - \hat{Y}_i)^2$

• $\min \sum_{\text{all values}} (Y_i - b_0 - b_1 X_i)^2$

• Solving it gives the formula for the intercept and the slope estimates ($b_0, b_1$).



Linear Regression: The coefficients
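Minimizing the sum of squared errors, we get

– $b_1 = \dfrac{SSXY}{SSX}$

– $SSXY = \sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y}) = \sum_{i=1}^{n} X_i Y_i - \dfrac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n}$

– $SSX = \sum_{i=1}^{n}(X_i-\bar{X})^2 = \sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}$

– $b_0 = \bar{Y} - b_1 \bar{X}$

– $\bar{Y} = \dfrac{\sum_{i=1}^{n} Y_i}{n}$

– $\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n}$

Worksheet columns: $X_i$, $Y_i$, $(X_i-\bar{X})$, $(Y_i-\bar{Y})$, $(X_i-\bar{X})^2$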



Linear Regression Example
• You are the CEO of a branded coconut water outlet business. You would like a strategy to identify where to open new stores.

• From your experience you find that your sales directly depend on the number of potential customers within a 5 km radius of the stores.

• You can find the number of potential customers within a 5 km radius by using a market research firm.

• You would like to build a linear regression model to be able to predict potential sales.

• Linear Model: An equation of the line that can help you predict the dependent variable.



Sample data

Store   Customers (Lakhs)   Annual Sales (Lakhs)
1       3.7                 5.7
2       3.6                 5.9
3       2.8                 6.7
4       5.6                 9.5
5       3.3                 5.4
6       2.2                 3.5
7       3.3                 6.2
8       3.1                 4.7
9       3.2                 6.1
10      3.5                 4.9
11      5.2                 10.7
12      4.6                 7.6
13      5.8                 11.8
14      3                   4.1

[Figure: scatter plot of Annual Sales (Lakhs) against Customers within 5kms radius]
Linear Regression: Working through a business problem

Store Customers (Lakhs) X Annual Sales (Lakhs) Y X-Xbar (X-Xbar)^2 Y-Ybar (X-Xbar)(Y-Ybar)
1 3.7 5.7 -0.08 0.0062 -0.93 0.073
2 3.6 5.9 -0.18 0.0319 -0.73 0.130
3 2.8 6.7 -0.98 0.9576 0.07 -0.070
4 5.6 9.5 1.82 3.3176 2.87 5.230
5 3.3 5.4 -0.48 0.2290 -1.23 0.588
6 2.2 3.5 -1.58 2.4919 -3.13 4.939
7 3.3 6.2 -0.48 0.2290 -0.43 0.205
8 3.1 4.7 -0.68 0.4605 -1.93 1.309
9 3.2 6.1 -0.58 0.3347 -0.53 0.306
10 3.5 4.9 -0.28 0.0776 -1.73 0.482
11 5.2 10.7 1.42 2.0205 4.07 5.787
12 4.6 7.6 0.82 0.6747 0.97 0.798
13 5.8 11.8 2.02 4.0862 5.17 10.454
14 3 4.1 -0.78 0.6062 -2.53 1.969
Mean 3.78 6.63 SSX 15.5236 SSXY 32.199
b1 2.0742
b0 -1.2088

Estimated regression line: $\hat{Y} = -1.2088 + 2.0742X$
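
The worked table above can be reproduced with a short plain-Python sketch of the least squares formulas ($b_1 = SSXY/SSX$, $b_0 = \bar{Y} - b_1\bar{X}$). The function name `fit_line` is an assumption made for illustration; the data are the 14 stores from the table.

```python
# Customers within a 5 km radius (Lakhs) and annual sales (Lakhs) for 14 stores.
customers = [3.7, 3.6, 2.8, 5.6, 3.3, 2.2, 3.3, 3.1, 3.2, 3.5, 5.2, 4.6, 5.8, 3.0]
sales = [5.7, 5.9, 6.7, 9.5, 5.4, 3.5, 6.2, 4.7, 6.1, 4.9, 10.7, 7.6, 11.8, 4.1]


def fit_line(xs, ys):
    """Least squares estimates (b0, b1) plus the SSX and SSXY sums."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ssx = sum((x - x_bar) ** 2 for x in xs)
    ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = ssxy / ssx            # slope
    b0 = y_bar - b1 * x_bar    # intercept
    return b0, b1, ssx, ssxy


b0, b1, ssx, ssxy = fit_line(customers, sales)
print(f"SSX={ssx:.4f}  SSXY={ssxy:.3f}  b1={b1:.4f}  b0={b0:.4f}")
```

The same numbers come out of the spreadsheet regression tool; the sketch only makes the intermediate sums explicit.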



Linear Regression: Predictions and Cautions
• Regression Equation: $\hat{Y} = -1.21 + 2.07X$

• X: Number of customers within a 5 km radius and $\hat{Y}$: Predicted sales.

• Your MR firm calculated potential customers within the 5 km range to be 4 Lakhs.

• What is your predicted annual sales?

• $\hat{Y} = -1.21 + 2.07 \times 4 = 7.07$ Lakh INR.

Caution:

• Interpolation v/s Extrapolation.

• You have used a range of X values, from your sample, to estimate the regression equation (2.2 – 5.8 Lakh)

• Predictions may be invalid outside this range of values. We should not use regression for extrapolation.

• Predictions within the range are called interpolations.
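
The interpolation caution can be built into the prediction itself. The sketch below is one possible way to do it, not a prescribed method: the function name and the choice to raise an error on extrapolation are assumptions, and the coefficients are the four-decimal estimates from the worked table, so the result shifts slightly depending on how the coefficients are rounded.

```python
B0, B1 = -1.2088, 2.0742   # intercept and slope estimates (four decimals)
X_MIN, X_MAX = 2.2, 5.8    # range of X observed in the sample (Lakhs)


def predict_sales(customers_lakhs):
    """Predict annual sales (Lakhs), refusing to extrapolate beyond the sample."""
    if not X_MIN <= customers_lakhs <= X_MAX:
        raise ValueError(
            f"X={customers_lakhs} is outside the sampled range "
            f"[{X_MIN}, {X_MAX}]; the prediction would be an extrapolation"
        )
    return B0 + B1 * customers_lakhs


prediction = predict_sales(4.0)   # interpolation: 4 Lakhs lies inside the range
print(f"Predicted annual sales: {prediction:.2f} Lakh INR")

try:
    predict_sales(8.0)            # extrapolation: rejected
except ValueError as error:
    print(error)
```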



Measures of Variation: Prediction without additional info.

[Figure: scatter plot of Annual Sales (Lakhs) Y against Customers within a 5 km radius, with a horizontal line at the mean $\bar{Y} = 6.63$]



Taking the relationship into account: Fitting a line.

[Figure: Customers (Lakhs) X Line Fit Plot, showing Annual Sales (Lakhs) Y and Predicted Annual Sales (Lakhs) Y against Customers (Lakhs) X, with the mean line $\bar{Y} = 6.63$. For an observed point $Y_i$ with fitted value $\hat{Y}$, the total error is split into the error explained by regression and the residual error.]



Measures of Variation: SST = SSR + SSE
• Total Sum Of Squares Variation (SST): Measure of variation of $Y_i$ around the mean

• Total Variation = Explained by the regression + Residual Variation

• $SST = \sum_{i=1}^{n}(Y_i - \bar{Y})^2$

• Regression Sum Of Squares Variation (SSR): Variation in Y explained by the regression on Variable X.

• $SSR = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2$

• Error Sum of Squares (SSE): Variation in Y not explained by X (due to other factors).

• $SSE = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2$

• SST = SSR + SSE
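
The decomposition can be checked numerically on the 14-store sample. This is a minimal plain-Python sketch; the variable names are assumptions, and the three sums of squares are computed here rather than taken from the slides.

```python
customers = [3.7, 3.6, 2.8, 5.6, 3.3, 2.2, 3.3, 3.1, 3.2, 3.5, 5.2, 4.6, 5.8, 3.0]
sales = [5.7, 5.9, 6.7, 9.5, 5.4, 3.5, 6.2, 4.7, 6.1, 4.9, 10.7, 7.6, 11.8, 4.1]

n = len(customers)
x_bar = sum(customers) / n
y_bar = sum(sales) / n

# Least squares slope and intercept.
ssx = sum((x - x_bar) ** 2 for x in customers)
ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(customers, sales))
b1 = ssxy / ssx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * x for x in customers]   # Y-hat for every store

sst = sum((y - y_bar) ** 2 for y in sales)                       # total variation
ssr = sum((y_hat - y_bar) ** 2 for y_hat in fitted)              # explained by regression
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(sales, fitted))   # residual variation

print(f"SST={sst:.3f}  SSR={ssr:.3f}  SSE={sse:.3f}  SSR+SSE={ssr + sse:.3f}")
```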



Linear Regression: The Coefficient of Determination
• Total Variation = Explained by the regression + Residual Variation

• Total Sum Of Squares Variation (SST): Measure of variation of Yi around the mean

• Regression Sum Of Squares Variation (SSR): Variation in Y explained by the regression on Variable X.

• Error Sum of Squares (SSE): Variation in Y not explained by X (Due to other factors).

• SST = SSR + SSE

What proportion of the total variation is explained by the regression?

• $r^2$ = SSR / SST; this is called “The Coefficient of Determination”.

• It is the proportion of variation in the values of Y that is explained by the linear relationship between the independent variable X and the dependent variable Y.

• Correlation Coefficient: $r = \pm\sqrt{r^2}$ (How do you know if it is +ve or –ve?)

• It takes the sign of the slope $b_1$.
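
As a worked illustration for the store example, the values below use sums of squares computed from the 14-store sample (SSR ≈ 66.785, SST ≈ 78.769); these two figures are not stated on the slides and are rounded to three decimals.

```latex
r^2 = \frac{SSR}{SST} \approx \frac{66.785}{78.769} \approx 0.848,
\qquad
r = +\sqrt{r^2} \approx +0.921 \quad \text{(positive because } b_1 = 2.0742 > 0\text{)}
```

Roughly 85% of the variation in annual sales is explained by the number of customers within the 5 km radius.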


Linear Regression: Standard Error of the Estimate
• Regression line does not predict values exactly.

• There is residual error.

• Standard error of the estimate of Y

• $S_{XY} = \sqrt{\dfrac{SSE}{n-2}}$

• SSE: Residual (error) sum of squares.

• $SSE = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2$

• $S_{XY}$: Standard error of the estimate.

• The standard deviation measures variation around the mean. $S_{XY}$ measures the variation around the regression line.
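
A minimal plain-Python sketch of the standard error of the estimate for the store data follows. It uses the residual (error) sum of squares, $\sum (Y_i - \hat{Y}_i)^2$, in the numerator and n - 2 in the denominator; the function name is an assumption made for illustration.

```python
from math import sqrt

customers = [3.7, 3.6, 2.8, 5.6, 3.3, 2.2, 3.3, 3.1, 3.2, 3.5, 5.2, 4.6, 5.8, 3.0]
sales = [5.7, 5.9, 6.7, 9.5, 5.4, 3.5, 6.2, 4.7, 6.1, 4.9, 10.7, 7.6, 11.8, 4.1]


def standard_error_of_estimate(xs, ys):
    """Square root of the residual sum of squares divided by n - 2."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ssx = sum((x - x_bar) ** 2 for x in xs)
    ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar
    # Residuals are measured around the regression line, not around the mean.
    residual_ss = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(residual_ss / (n - 2))


s_estimate = standard_error_of_estimate(customers, sales)
print(f"Standard error of the estimate: {s_estimate:.4f} Lakhs")
```

A typical store's sales therefore deviate from the fitted line by roughly one Lakh.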



Linear Regression: XLSX
• Data >> Enable Data Analysis ToolPak



Inference about the slope: t-test
• Regression line: $\hat{Y} = b_0 + b_1 X$

• $b_1$: Slope of the regression line.

• What does a zero population slope ($\beta_1 = 0$) mean?

• There is no linear relationship between X and Y.

• Regression output gives us the t-test results

• $H_0: \beta_1 = 0$

• Are the intercept $b_0$ and the slope $b_1$ significant (null of zero value is rejected) in the following output?

                      Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept             -1.20884       0.994874         -1.21507   0.247707   -3.37648    0.958806    -3.37648      0.958806
Customers (Lakhs) X   2.074173       0.253629         8.177972   3E-06      1.521562    2.626784    1.521562      2.626784
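
The standard errors and t statistics in the output can be reproduced by hand. The sketch below is an illustration under stated assumptions: it uses the textbook formulas $SE(b_1) = S/\sqrt{SSX}$ and $SE(b_0) = S\sqrt{1/n + \bar{X}^2/SSX}$, where S is the standard error of the estimate, and it compares |t| with 2.179, the two-tailed 5% critical value from a t table for 12 degrees of freedom, instead of computing a p-value.

```python
from math import sqrt

customers = [3.7, 3.6, 2.8, 5.6, 3.3, 2.2, 3.3, 3.1, 3.2, 3.5, 5.2, 4.6, 5.8, 3.0]
sales = [5.7, 5.9, 6.7, 9.5, 5.4, 3.5, 6.2, 4.7, 6.1, 4.9, 10.7, 7.6, 11.8, 4.1]

T_CRITICAL = 2.179  # t table value, 12 degrees of freedom, 5% two-tailed

n = len(customers)
x_bar = sum(customers) / n
y_bar = sum(sales) / n
ssx = sum((x - x_bar) ** 2 for x in customers)
ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(customers, sales))
b1 = ssxy / ssx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate from the residual sum of squares.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(customers, sales))
s = sqrt(sse / (n - 2))

se_b1 = s / sqrt(ssx)                        # standard error of the slope
se_b0 = s * sqrt(1 / n + x_bar ** 2 / ssx)   # standard error of the intercept

t_b1 = b1 / se_b1   # tests the null hypothesis of a zero slope
t_b0 = b0 / se_b0   # tests the null hypothesis of a zero intercept

slope_significant = abs(t_b1) > T_CRITICAL
intercept_significant = abs(t_b0) > T_CRITICAL
print(f"slope: t={t_b1:.3f}, significant={slope_significant}")
print(f"intercept: t={t_b0:.3f}, significant={intercept_significant}")
```

This agrees with the P-values in the output: the slope is significant (P-value 3E-06) while the intercept is not (P-value 0.2477).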



Inference about the slope: F-test
• The test uses the ratio of mean squares, $F = \dfrac{SSR/1}{SSE/(n-2)}$, built from SSR (Regression sum of squares) and SSE (Error sum of squares), to check the significance of the slope parameter.

• The F statistic follows an F distribution with 1 and n-2 numerator and denominator degrees of freedom, respectively.
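
As a worked illustration for the store example (n = 14), using sums of squares computed from the sample data (SSR ≈ 66.785, SSE ≈ 11.983, values not stated on the slides):

```latex
F = \frac{SSR/1}{SSE/(n-2)} \approx \frac{66.785}{11.983/12} \approx 66.88,
\qquad
F = t^2 \approx (8.178)^2 \approx 66.88
```

In simple linear regression the F statistic equals the square of the slope's t statistic, so the two tests lead to the same conclusion about the slope.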



Quantitative Methods

Lecture-14

Introduction to Linear Programming


(Ch 7 Quantitative Methods for Business, D. Anderson et al.)
Optimization
Two important features of business problems

• Maximization or minimization of certain outcomes.

• Maximization of revenue, profits etc.

• Minimization of cost, time etc.

There are constraints

• Limited budget, time, capacity, risk appetite, govt. regulations.

The goal

• Maximize or minimize your objective function subject to certain constraints.



Optimization examples
• How to decide portfolio of stocks
• To maximize returns.

• Given risk constraints and limited funds availability.

• How to allocate advertising budget amongst various media outlets


• To maximize sales.

• Subject to budget and media slots availability.

• How much to ship from which warehouse to which demand locations


• To minimize transportation costs.

• Subject to warehouse inventories and customer demand.



Problem formulation
• Mathematical model: State the problem by way of mathematical statements.

• Step1: Describe the objective - Example: Maximize revenue or minimize the cost.

• Step2: Describe each constraint – Example: Number of tons of material used <= total available.

• Step 3: Define the decision variables or controllable inputs. Example: Material produced (tons).

• Step 4: Write the objective in terms of decision variables. Example: Max 40P1 + 30P2.

• Step 5: Write the constraints in terms of decision variables. Example: 0.4P1 + .5P2 <=20

• Step 6: Add non-negativity constraints. Example: Non-negative quantities, or P1, P2 >= 0

• Step 7: Solve the model using an appropriate method.

• We will learn the Graphical Method and solving with the MS Excel Solver program.



The business problem
• Your factory uses three materials to produce two products.

• You would like to maximize the profit.

• Profit contributions of the first and second products are Rs 40 and Rs 30 per ton, respectively.

• Table 1 shows input materials required per ton of products.

Table 1                 Product 1 (P1)   Product 2 (P2)
Material 1 (M1) tons    0.4              0.5
Material 2 (M2) tons    0                0.2
Material 3 (M3) tons    0.6              0.3

• You only have 20 tons of M1, 5 tons of M2 and 21 tons of M3 available in your inventory.



Problem formulation
• Two products: P1 and P2
• Materials: M1, M2, M3.
• Profit contributions: P1 - Rs 40, P2 - Rs 30
• Input materials required to produce one ton of the products

Table 1   P1    P2
M1        0.4   0.5
M2        0     0.2
M3        0.6   0.3

• Availability: 20 tons of M1, 5 tons of M2 and 21 tons of M3

Step 1: What is your objective?
Maximize the profit

Step 2: What are your constraints?
Availability of input materials.

Step 3: What are your decision variables?
How much (tons) of each product to produce. P1 tons, P2 tons

Step 4: Write the objective in terms of decision variables.
Maximize 40P1 + 30P2

Step 5: Write the constraints in terms of decision variables.
M1 constraint: 0.4P1 + 0.5P2 <=20
M2 constraint: 0.2P2 <=5
M3 constraint: 0.6P1 + 0.3P2 <=21

Step 6: Add non-negativity constraints.
P1, P2 >=0
Linear Programming Model

• Maximize the profit,


• Subject to material availability and non negativity constraints.

Linear Programming Model (Standard Form)

Max 40P1 + 30P2

Subject to (s.t.)
M1) 0.4P1 + 0.5P2 <=20
M2) 0.2P2 <=5
M3) 0.6P1 + 0.3P2 <=21
P1, P2 >=0



Graphical Method
Linear Programming Mathematical Model

Max 40P1 + 30P2

Subject to (s.t.)
1. P1, P2 >=0
2. M1) 0.4P1 + 0.5P2 <=20
3. M2) 0.2P2 <=5
4. M3) 0.6P1 + 0.3P2 <=21

• Step 1: Draw a line for each constraint (considering inequalities as equalities)

• Step 2: Identify area on the graph that satisfies the constraint.

[Figure: constraint lines P1 = 0, P2 = 0, 0.4P1 + 0.5P2 = 20, 0.2P2 = 5 and 0.6P1 + 0.3P2 = 21, plotted with P1 on the horizontal axis and P2 on the vertical axis, both from 0 to 80]



Graphical Method
Linear Programming Mathematical Model

[Figure: the constraint lines P1 = 0, P2 = 0, 0.4P1 + 0.5P2 = 20, 0.2P2 = 5 and 0.6P1 + 0.3P2 = 21 with the objective line 40P1 + 30P2 = 1200, plotted with P1 on the horizontal axis and P2 on the vertical axis, both from 0 to 80]

• Step 3: Identify the area that satisfies all the constraints

• Step 4: Draw the objective line

Max 40P1 + 30P2
40P1 + 30P2 = 1200

• Step 5: Move this line in the direction of the objective.

• Step 6: Find the point in the feasible region that provides the optimum value.

• Intersection of the black and green lines, i.e. the M1 and M3 constraint lines: (25, 20)

• TIP: Find all the possible extreme points (nodes) and see which one maximizes the objective function.
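
The tip about extreme points can be sketched in plain Python for this two-variable model. This is an illustrative brute-force approach, not the Solver algorithm: it intersects every pair of constraint boundary lines, keeps the feasible intersections, and evaluates the profit at each. The function names and the tolerance are assumptions.

```python
from itertools import combinations

# Each constraint is (a, b, rhs), meaning a*P1 + b*P2 <= rhs.
CONSTRAINTS = [
    (0.4, 0.5, 20.0),   # M1
    (0.0, 0.2, 5.0),    # M2
    (0.6, 0.3, 21.0),   # M3
    (-1.0, 0.0, 0.0),   # P1 >= 0
    (0.0, -1.0, 0.0),   # P2 >= 0
]


def profit(p1, p2):
    """Objective function: 40*P1 + 30*P2."""
    return 40.0 * p1 + 30.0 * p2


def intersection(c1, c2):
    """Point where two boundary lines meet (Cramer's rule), or None if parallel."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)


def is_feasible(point, tol=1e-9):
    """True when the point satisfies every constraint within a small tolerance."""
    p1, p2 = point
    return all(a * p1 + b * p2 <= rhs + tol for a, b, rhs in CONSTRAINTS)


def extreme_points():
    """Feasible intersections of all pairs of boundary lines."""
    points = []
    for c1, c2 in combinations(CONSTRAINTS, 2):
        point = intersection(c1, c2)
        if point is not None and is_feasible(point):
            points.append(point)
    return points


best = max(extreme_points(), key=lambda p: profit(*p))
print(f"Best extreme point: P1={best[0]:.2f}, P2={best[1]:.2f}, profit={profit(*best):.2f}")
```

Enumerating corners this way is practical only for small models; problems with more variables are solved with a program such as Solver.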



Slack Variables
[Figure: the constraint lines P1 = 0, P2 = 0, 0.4P1 + 0.5P2 = 20, 0.2P2 = 5 and 0.6P1 + 0.3P2 = 21, plotted with P1 on the horizontal axis and P2 on the vertical axis, both from 0 to 80]

• Unused capacity associated with a constraint is called slack (a constraint with slack is redundant or non-binding)

• Optimum solution: P1=25, P2 = 20

• Which constraints are not binding?
  • M1) 0.4P1 + 0.5P2 <=20
  • M2) 0.2P2 <=5
  • M3) 0.6P1 + 0.3P2 <=21

• The value of M1 at the solution point
  • = 0.4*25 + 0.5*20 = 20 tons

• The value of M2 at the solution point
  • = 0.2*20 = 4 tons (1 ton is slack or excess/unused quantity available with M2)

• The value of M3 at the solution point
  • = 0.6*25 + 0.3*20 = 21 tons
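
The slack calculation can be sketched in plain Python. The function name, the report layout and the tolerance used to decide whether a constraint is binding are assumptions made for illustration.

```python
# Material usage per ton of (P1, P2) and the quantity available in inventory (tons).
RESOURCES = {
    "M1": ((0.4, 0.5), 20.0),
    "M2": ((0.0, 0.2), 5.0),
    "M3": ((0.6, 0.3), 21.0),
}


def slack_report(p1, p2, tol=1e-9):
    """Map each material to (tons used, slack in tons, whether the constraint is binding)."""
    report = {}
    for name, ((use_p1, use_p2), available) in RESOURCES.items():
        used = use_p1 * p1 + use_p2 * p2
        slack = available - used
        report[name] = (used, slack, abs(slack) <= tol)
    return report


report = slack_report(25, 20)   # the optimum found graphically
for name, (used, slack, binding) in report.items():
    status = "binding" if binding else "non-binding"
    print(f"{name}: used {used:.1f} t, slack {slack:.1f} t, {status}")
```

Only M2 has slack at the optimum, so M1 and M3 are the constraints that limit the profit.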
Graphical Solution Notes

• An optimal solution, if one exists, can always be found at an extreme point.

• You can quickly find values of the objective function at extreme points and evaluate the solution.

• Extreme points are intersections of the constraint lines, including the axes P1 = 0 and P2 = 0.

• The feasible region may not exist (No solution).

• The set of optimal solutions may be a line segment if the objective function line is parallel to a binding constraint (More than one solution).



Programmatic solution using MS XL Solver

• Enable “Solver” plugin.

• Let’s solve the example problem with MS XL Solver




Linear Programming Applications


(Ch 9 Quantitative Methods for Business, D. Anderson et al.)
Applications
Marketing Applications.

• Media selection.

• Marketing Research.

Financial Applications.

• Portfolio selection.

• Financial planning.

Operations Management Applications.

• Make or buy decisions.

• Production scheduling.

• Workforce assignment.

• Blending problems.
Overall approach
• Overall approach remains the same as the simple two variable problem that we solved.

• You would need to build a linear programming model.

• Identify and write the objective function (Max or Min).

• Identify and write the constraints including the non-negativity constraints.

• Problems with two variables can be solved using graphical method (carry your scale etc. for the
exam).

• Problems with more than two variables can be solved using computer programs ( Solver etc.).

• You would primarily be assessed on your ability to write the linear programming model.

• And/or solving the problem, if feasible, in the exam.




Marketing Applications
Marketing Applications: Media Selection

• Objective: A company would like to allocate budget to various media outlets to maximize the quality of
exposure etc.

• Media outlets? - News Paper, Internet Search, Daytime TV, Evening TV, Radio etc.

• Constraints: Minimum reach, budget, availability, quality of exposure requirements or company policy.



Marketing Applications: Media Selection

• Objective: To maximize the total quality exposure units.

• The company has set the following requirements

  • Total budget: $30,000.00

  • At least 10 TV commercials

  • At least 50,000 potential customers must be reached

  • No more than $18,000 be spent on TV

Source: All problems to be discussed today are from your reference book T2, Ch 9



Problem formulation: Objective Function
#Define the decision variables or controllable inputs.

• What is our controllable input?

• Units of advertisement to purchase at each media outlet.

• DTV (daytime TV), ETV (evening TV), DNP (daily newspaper), SNP (Sunday newspaper), R (radio)

# Write the objective in terms of decision variables.

• Maximize the quality exposure units.

Max 65DTV + 90ETV + 40DNP + 60SNP + 20R



Problem formulation: Constraints
Step 5: Write the constraints in terms of decision variables.

• Total budget: $30,000.00

1500DTV + 3000ETV+400DNP+1000SNP+100R <=30000

• At least 10 TV commercials

DTV + ETV >= 10

• At least 50,000 potential customers must be reached

1000DTV + 2000ETV + 1500DNP + 2500SNP + 300R >= 50000

• No more than $18,000 be spent on TV

1500DTV + 3000ETV <=18000

• Other constraints implicit in the problem.

DTV <=15, ETV <=10, DNP <=25, SNP <=4, R<=30

• Add Non-negativity constraints

DTV, ETV, DNP, SNP, R>=0


Problem formulation
Linear Programming Problem: Standard form.
Max 65DTV + 90ETV + 40DNP + 60SNP + 20R

s.t. (subject to)


• 1500DTV + 3000ETV + 400DNP + 1000SNP + 100R <=30000

• DTV + ETV >= 10

• 1000DTV + 2000ETV + 1500DNP + 2500SNP + 300R >= 50000

• 1500DTV + 3000ETV <=18000

• DTV <=15, ETV <=10, DNP <=25, SNP <=4, R<=30

• DTV, ETV, DNP, SNP, R>=0


Note:
• This problem is set up to maximize the quality exposure given minimum reach requirements.
• Alternatively, you may be asked to maximize the reach given quality constraints.
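
Before handing the model to a solver, it helps to check that a candidate plan satisfies every constraint. The sketch below encodes the coefficients of the standard form above and evaluates a plan; the function name and both example plans are hypothetical, chosen only for illustration, and neither is claimed to be the optimal solution.

```python
# name: (quality units per ad, cost per ad in $, customers reached per ad, max ads)
MEDIA = {
    "DTV": (65, 1500, 1000, 15),
    "ETV": (90, 3000, 2000, 10),
    "DNP": (40, 400, 1500, 25),
    "SNP": (60, 1000, 2500, 4),
    "R": (20, 100, 300, 30),
}


def evaluate_plan(plan):
    """Return (total quality exposure units, list of violated constraints)."""
    quality = sum(MEDIA[m][0] * units for m, units in plan.items())
    cost = sum(MEDIA[m][1] * units for m, units in plan.items())
    reach = sum(MEDIA[m][2] * units for m, units in plan.items())
    tv_ads = plan["DTV"] + plan["ETV"]
    tv_cost = MEDIA["DTV"][1] * plan["DTV"] + MEDIA["ETV"][1] * plan["ETV"]

    violations = []
    if cost > 30000:
        violations.append("total budget")
    if tv_ads < 10:
        violations.append("minimum TV commercials")
    if reach < 50000:
        violations.append("minimum reach")
    if tv_cost > 18000:
        violations.append("TV budget")
    for m, units in plan.items():
        if units < 0 or units > MEDIA[m][3]:
            violations.append(f"{m} availability")
    return quality, violations


# Hypothetical plans for illustration only.
plan_a = {"DTV": 10, "ETV": 0, "DNP": 25, "SNP": 2, "R": 30}
plan_b = {"DTV": 0, "ETV": 10, "DNP": 0, "SNP": 0, "R": 0}

quality_a, violations_a = evaluate_plan(plan_a)
quality_b, violations_b = evaluate_plan(plan_b)
print(quality_a, violations_a)
print(quality_b, violations_b)
```

A feasibility check like this catches formulation mistakes early; finding the best plan among the feasible ones is the job of the solver.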


