Lec 1

Business Statistics
Lecture 1
Muhammad Amin Qureshi
Presentation Credit: Asst. Prof. Ikram-e-Khuda

The Big Picture
[Diagram: Inputs → Random System performing a Random Experiment → Mutually Exclusive Outputs]
Outputs comprise the sample space and each output has its probability of occurrence
The real value assigned to each output is the random variable
Statistics uses probability information to help us understand random systems and ultimately draw inferences about them
Concept of a Random Variable
A function whose value is a real number determined by each element in the sample space is called a random variable.
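The definition above can be illustrated with a minimal sketch. The experiment (tossing a fair coin twice) and the names `sample_space` and `number_of_heads` are assumptions chosen for illustration; they do not come from the slides.

```python
from fractions import Fraction
from itertools import product

# Hypothetical experiment: toss a fair coin twice.
sample_space = ["".join(t) for t in product("HT", repeat=2)]  # HH, HT, TH, TT

def number_of_heads(outcome):
    """Random variable X: assigns a real number to each outcome."""
    return outcome.count("H")

# Assumption: the coin is fair, so every outcome is equally likely.
p_outcome = Fraction(1, len(sample_space))

# Probability distribution of X, built by adding up outcome probabilities.
distribution = {}
for outcome in sample_space:
    x = number_of_heads(outcome)
    distribution[x] = distribution.get(x, 0) + p_outcome

for x in sorted(distribution):
    print(x, distribution[x])
```

Here X takes only the values 0, 1 and 2, so it is a discrete random variable.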
Random Variable and Events
• A random variable is a variable whose domain is the set of basic events, and whose range (outcome) could be numerical or categorical.
• An event is an outcome or a union of outcomes, where the outcomes are the occurrences to which you can assign probabilities (or measures).
Random Variable Types
Discrete
Continuous
Measurement Scales of Sampling
• Measurement scales are used to categorize and/or quantify variables, which are the outcomes of a random experiment in a sampling process.
• Four scales of measurement are commonly used in statistical analysis:
• Nominal
• Ordinal
• Interval
• Ratio
Nominal Scale of Measurement
The nominal scale of measurement satisfies only the identity property of measurement.
Examples
• Gender
• Religion
• Political affiliation
• Color
Ordinal Scale of Measurement
The ordinal scale has the properties of both identity and magnitude.
An ordinal variable is a categorical variable for which the possible values are ordered.
Example
• Rank can be ordered as 1st, 2nd, 3rd, etc. (in a race).
• However, we cannot tell from this ordinal scale whether it was a close race or whether the winner won by a mile!
• Educational level might be ordered as
1: Elementary school education
2: High school graduate
3: Some college
4: College graduate
5: Graduate degree
Interval Scale of Measurement
• The interval scale of measurement has the properties of
• Identity
• Magnitude
• Equal intervals.
Example
• Temperature
• A temperature scale such as Celsius or Fahrenheit is made up of equal temperature units, so that the difference between 40 and 50 degrees is equal to the difference between 50 and 60 degrees (for example).
• With an interval scale, you know not only whether different values are bigger or smaller, you also know how much bigger or smaller they are.
• A true zero point is not defined: zero on such a scale does not mean the absence of the quantity being measured.
Ratio Scale of Measurement
The ratio scale of measurement satisfies all four of the properties of measurement: identity, magnitude, equal intervals, and a minimum value of zero.
Example
• Distance
• Length
• Height
• Width
• Area
• Age
• Cost price
• Selling price
Properties of Measurement Scales
Each scale of measurement satisfies one or more of the following properties of measurement.
• Identity. Each value on the measurement scale has a unique meaning.
• Magnitude. Values on the measurement scale have an ordered relationship to one another. That is, some values are larger and some are smaller.
• Equal intervals. Scale units along the scale are equal to one another. This means, for example, that the difference between 1 and 2 would be equal to the difference between 19 and 20.
• A minimum value of zero. The scale has a true zero point, below which no values exist.
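As a compact recap, the mapping between the four scales and the four properties described in these slides can be written as a small lookup table. The dictionary name `SCALE_PROPERTIES` and the helper `scales_with` are illustrative assumptions, not part of the lecture.

```python
# Properties satisfied by each scale, as stated in the slides.
SCALE_PROPERTIES = {
    "nominal": {"identity"},
    "ordinal": {"identity", "magnitude"},
    "interval": {"identity", "magnitude", "equal intervals"},
    "ratio": {"identity", "magnitude", "equal intervals", "true zero"},
}

def scales_with(prop):
    """Return the scales (weakest to strongest) that satisfy a property."""
    return [scale for scale, props in SCALE_PROPERTIES.items() if prop in props]

print(scales_with("equal intervals"))  # ['interval', 'ratio']
print(scales_with("true zero"))        # ['ratio']
```

Each scale inherits every property of the scale before it, which is why the ratio scale is the only one with all four.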
Summary of Different Data or Random Variable Types
• Qualitative
• Discrete (Integer Numbers)
• Finite set: Grouped or Ungrouped
• Nominal
• Ordinal
• Quantitative
• Continuous (Real Numbers)
• Finite set: Grouped or Ungrouped type
• Nominal
• Ordinal
• Scaled
• Interval (Zero point not defined)
• Ratio (Zero point defined)
• Infinite set
Descriptive Statistics vs. Statistical Inference
• Descriptive statistics gives first-hand knowledge about the data
• Presenting, organizing and summarizing the data
• Probability calculations
• By frequency distribution tables
• Graphing the data
• Measures of central tendency
• Measures of dispersion
• Five-number summary
• Measures of shape
• Inferential statistics makes inferences, conclusions and predictions about a population based on a sample of data taken from the population in question.
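A few of the descriptive measures listed above can be sketched with Python's standard `statistics` module. The data values are hypothetical, and the "inclusive" quartile method is only one of several conventions; the slides do not say which one the course uses.

```python
import statistics

# Hypothetical sample of ten observations.
data = [12, 15, 11, 19, 22, 14, 18, 25, 16, 13]

# Quartiles (cut points that split the sorted data into four parts).
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

summary = {
    "min": min(data),
    "Q1": q1,
    "median": median,
    "Q3": q3,
    "max": max(data),
    "mean": statistics.mean(data),    # central tendency
    "stdev": statistics.stdev(data),  # dispersion (sample standard deviation)
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The first five entries form the five-number summary; the mean and standard deviation are added for comparison.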
Descriptive vs. Inference Statistics
Population and Sampling
Population and sample are two basic concepts of statistics.
• Population
• Population is the collection of all individuals or items under consideration in a statistical study.
• Sample
• Sample is that part of the population from which information is collected.
• Sampling
• Sampling is the process by which inference is made to the whole by examining a part.
• With a single grain of rice, we can test if all the rice in the pot has boiled;
• from a cup of tea, a tea-taster determines the quality of the brand of tea; and
• a sample of moon rocks provides scientists with information on the origin of the moon.
• This process of drawing conclusions about the whole from a small sample is called sampling.
Sampling Types
• Probability Sampling
• Non-probability sampling
• The population always represents the target of an investigation. We learn about the population by sampling from the collection.
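A minimal sketch of probability sampling, assuming simple random sampling (every item has the same chance of selection). The population here is simulated and entirely hypothetical; it only shows how a sample statistic approximates a population value.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated order values.
population = [random.gauss(100, 15) for _ in range(10_000)]

# Simple random sample without replacement.
sample = random.sample(population, k=200)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

In practice the population mean is unknown, which is why the sample mean is used as an estimate of it.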
The Statistical Inference Process
• Graph your data (probability distributions)
• Look at the shape!
• Normal, skewed and kurtosis
• Estimations (estimating the population parameter from the sample)
• Point Estimation
• It gives a particular value as an estimate of the population parameter
• Mean/weighted mean, median, mode, quartiles, percentiles, range, IQR, etc.
• Interval Estimation
• It gives a range of values which is likely to contain the population parameter.
• This interval is called a confidence interval
• This procedure also tells how likely the interval is to contain the actual parameter
• That value is called the confidence level
• Usually α = 1 - confidence level. This implies that confidence level = 1 - α
• A confidence interval is a random interval
• Inference
• Hypothesis testing
• One-tailed and two-tailed tests
• Type 1 and Type 2 errors
• Drawing conclusions and predictions
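The relationship between a point estimate, an interval estimate and the confidence level can be sketched as follows. This is an illustration under stated assumptions: the sample values are hypothetical, and the interval uses the normal (z) approximation; for small samples a t-based interval is usually preferred.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample of 12 measurements.
sample = [48.2, 51.5, 49.9, 50.3, 47.8, 52.1, 50.7, 49.4, 51.0, 48.9, 50.2, 49.6]

confidence_level = 0.95
alpha = 1 - confidence_level  # alpha = 1 - confidence level

# Point estimate of the population mean.
point_estimate = statistics.mean(sample)

# Interval estimate: point estimate +/- z * standard error.
std_error = statistics.stdev(sample) / math.sqrt(len(sample))
z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
lower = point_estimate - z * std_error
upper = point_estimate + z * std_error

print(f"point estimate: {point_estimate:.3f}")
print(f"{confidence_level:.0%} confidence interval: ({lower:.3f}, {upper:.3f})")
```

A different sample would give a different interval, which is the sense in which a confidence interval is a random interval.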
Important Concepts and Formulae
1) Measures of position: Quartiles, percentiles and deciles
2) Measures of central tendency: Mean or Mathematical Expectation
3) Measures of dispersion: Variance and standard deviation
4) Measures of shape: normal, skewed and kurtosis

Analysis Grouped Data Formulae
Formula for Quartiles

Location or class boundary of quartile group
Value
Formula for Percentiles

Location or class boundary of percentile group
Value
Formula for Deciles

Location or class boundary of decile group
Value
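A minimal sketch of how position measures for grouped data are commonly computed, assuming the standard interpolation formula: Location = k × N / parts, then Value = L + ((Location - CF) / f) × h, where L is the lower class boundary of the group containing the location, CF is the cumulative frequency before that group, f is its frequency and h is its width. The symbol names, the function `grouped_quantile` and the frequency table are hypothetical and not taken from the slides.

```python
def grouped_quantile(boundaries, freqs, k, parts):
    """k-th quantile of grouped data (parts=4 quartiles, 10 deciles, 100 percentiles).

    boundaries: class boundaries, one more entry than freqs.
    freqs: frequency of each class.
    """
    if not 0 < k < parts:
        raise ValueError("k must satisfy 0 < k < parts")
    n = sum(freqs)
    location = k * n / parts  # position of the quantile in the ordered data
    cumulative = 0            # cumulative frequency before the current class
    for i, f in enumerate(freqs):
        if cumulative + f >= location:
            lower = boundaries[i]
            width = boundaries[i + 1] - boundaries[i]
            return lower + (location - cumulative) / f * width
        cumulative += f
    raise ValueError("location lies beyond the last class")

# Hypothetical frequency table: classes 0-10, 10-20, 20-30, 30-40.
boundaries = [0, 10, 20, 30, 40]
freqs = [5, 15, 20, 10]

q1 = grouped_quantile(boundaries, freqs, 1, 4)     # first quartile
d5 = grouped_quantile(boundaries, freqs, 5, 10)    # fifth decile (median)
p90 = grouped_quantile(boundaries, freqs, 90, 100) # 90th percentile
print(q1, d5, p90)  # 15.0 22.5 35.0
```

The same function covers quartiles, deciles and percentiles because only the number of parts changes.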
Mean or Average value of Dataset
Let the dataset be represented by the values x_i.
Here f_i is the ith frequency, corresponding to the ith class or group, called x_i.
Variance of Mean
Standard Deviation of Mean

The square root of the variance gives the standard deviation.
Error² = variance
Error = standard deviation
Predicted outcome = statistical model ± error of the model
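A minimal sketch of the grouped-data mean, variance and standard deviation, assuming the usual frequency-weighted formulas: mean = Σ f_i x_i / Σ f_i and variance = Σ f_i (x_i - mean)² / Σ f_i. The population form (dividing by Σ f_i) is an assumption; a sample variance would divide by Σ f_i - 1. The class midpoints and frequencies are hypothetical.

```python
import math

# Hypothetical grouped data: class midpoints x_i and frequencies f_i.
midpoints = [5, 15, 25, 35]
freqs = [5, 15, 20, 10]

n = sum(freqs)

# Mean: frequency-weighted average of the class midpoints.
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Variance: frequency-weighted average squared deviation from the mean.
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n

# Standard deviation: square root of the variance (the "error" of the mean model).
std_dev = math.sqrt(variance)

print(f"mean = {mean}, variance = {variance}, standard deviation = {std_dev}")
print(f"predicted outcome = {mean} ± {std_dev}")
```

With these numbers the mean is 22, the variance is 81 and the standard deviation is 9.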

Measures of Shapes
Frequency distributions can have any shape.
These shapes are visible on bar charts and histograms.
Of the many possible shapes, we are particularly interested in the shape of the normal distribution.
The normal distribution is a symmetric, bell-shaped curve.
This curve appears as the envelope of a bar chart or histogram.
Deviation from the normal distribution along the horizontal axis is called skewness. A distribution can be either positively or negatively skewed.
Deviation (along the vertical axis) from the height of the corresponding normal distribution is called kurtosis. There can be either negative kurtosis (platykurtic) or positive kurtosis (leptokurtic).
[Figure: bell-shaped normal curve compared with skewed data. Tail skewed towards the right: right-skewed data. Tail skewed towards the left: left-skewed data.]
Ref: https://develve.net/Skewness.html
[Figure: bell-shaped normal distribution curve, illustrating kurtosis.]
Ref: http://itfeature.com/statistics/measure-of-dispersion/measure-of-kurtosis
Characteristics of Distributions
Characteristics of Normal distributions

Mean is the middle value of the dataset
Mean = Median = Mode
Skewness = 0
Characteristics of Skewed distributions

Mean ≠ Median
Skewness > 0 => Positive skewness (Mean > Median)
Skewness < 0 => Negative skewness (Mean < Median)
Mathematical Calculation
Skewness
Here
is the total number of groups or classes.
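One common way to compute skewness for grouped data is the moment coefficient, skewness = m3 / m2^(3/2), where m_r = Σ f_i (x_i - mean)^r / Σ f_i. Whether this is the exact formula used in the lecture is an assumption; the function name `grouped_skewness` and both frequency tables are hypothetical.

```python
def grouped_skewness(midpoints, freqs):
    """Moment coefficient of skewness for grouped data (assumed formula)."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
    # Second and third central moments, weighted by class frequency.
    m2 = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n
    m3 = sum(f * (x - mean) ** 3 for f, x in zip(freqs, midpoints)) / n
    return m3 / m2 ** 1.5

midpoints = [5, 15, 25, 35]

symmetric = grouped_skewness(midpoints, [5, 15, 15, 5])     # mirror-image heights
left_skewed = grouped_skewness(midpoints, [5, 15, 20, 10])  # longer tail on the left

print(f"symmetric data:   {symmetric:.4f}")
print(f"left-skewed data: {left_skewed:.4f}")
```

A symmetric table gives a skewness of zero, while a longer left tail gives a negative value, matching the sign convention in the slides.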

Normal Shape
In summary, the following criteria can be used to check whether the shape of a frequency distribution is normal:
• Frequency distribution graph: all heights must be symmetric
• Mean = Median
• Skewness = 0
All three criteria must support each other.

Probability Calculations
• Basic probability of one event
• Probability of two events occurring simultaneously
• Mutually exclusive case
• Probability of two events occurring one after the other
• Conditional probabilities and decision trees
• Bayes theorem
Probability Review
• Probability of an event A is symbolized by P(A) and 0 ≤ P(A) ≤ 1.
Probability Formula 1
P(A) = (Total no. of favourable cases) / (Total sample space)
Probability Formula 2
P(A) = lim (n→∞) m/n
where m = no. of times event A occurs and n = no. of repetitions of the random experiment
• The probabilities of all mutually exclusive outcomes that together make up the sample space always sum to 1.
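The two probability formulas can be compared in a short sketch. The experiment (rolling a fair six-sided die, with event A = "the result is even") is a hypothetical example; the simulation only approximates the limit in Formula 2 because n is finite.

```python
import random
from fractions import Fraction

# Hypothetical experiment: roll a fair six-sided die; event A = "result is even".
sample_space = [1, 2, 3, 4, 5, 6]

# Formula 1 (classical): favourable cases / total sample space.
favourable = [x for x in sample_space if x % 2 == 0]
p_classical = Fraction(len(favourable), len(sample_space))

# Formula 2 (relative frequency): m / n for a large number of repetitions n.
random.seed(0)  # fixed seed so the sketch is repeatable
n = 100_000
m = sum(1 for _ in range(n) if random.choice(sample_space) % 2 == 0)
p_empirical = m / n

print(f"classical P(A) = {p_classical}")
print(f"empirical P(A) = {p_empirical:.4f} after {n} repetitions")
```

As n grows, the relative frequency m/n settles near the classical value of 1/2.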

Other Rules of Probability
Rules of Sum and Rules of Product
Addition Rule 1
Addition Rule 2
Multiplication Rule 1
Multiplication Rule 2
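The four rule names above are commonly stated as follows in introductory textbooks; the pairing of each numbered rule with a formula is an assumption here. Addition Rule 1 (mutually exclusive events): P(A or B) = P(A) + P(B). Addition Rule 2 (general): P(A or B) = P(A) + P(B) - P(A and B). Multiplication Rule 1 (independent events): P(A and B) = P(A) × P(B). Multiplication Rule 2 (dependent events): P(A and B) = P(A) × P(B|A). The sketch below checks each one by counting outcomes of a hypothetical two-dice experiment.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
space = list(product(range(1, 7), repeat=2))

def prob(event, given=lambda o: True):
    """P(event | given), found by counting equally likely outcomes."""
    conditioned = [o for o in space if given(o)]
    hits = [o for o in conditioned if event(o)]
    return Fraction(len(hits), len(conditioned))

def first_is_1(o): return o[0] == 1
def first_is_2(o): return o[0] == 2
def first_even(o): return o[0] % 2 == 0
def second_even(o): return o[1] % 2 == 0
def sum_is_8(o): return o[0] + o[1] == 8

# Addition rule, mutually exclusive events: P(A or B) = P(A) + P(B)
add_rule_1 = prob(first_is_1) + prob(first_is_2)
assert add_rule_1 == prob(lambda o: first_is_1(o) or first_is_2(o))

# Addition rule, general case: P(A or B) = P(A) + P(B) - P(A and B)
add_rule_2 = (prob(first_even) + prob(second_even)
              - prob(lambda o: first_even(o) and second_even(o)))
assert add_rule_2 == prob(lambda o: first_even(o) or second_even(o))

# Multiplication rule, independent events: P(A and B) = P(A) * P(B)
mult_rule_1 = prob(first_even) * prob(second_even)
assert mult_rule_1 == prob(lambda o: first_even(o) and second_even(o))

# Multiplication rule, dependent events: P(A and B) = P(A) * P(B | A)
mult_rule_2 = prob(first_even) * prob(sum_is_8, given=first_even)
assert mult_rule_2 == prob(lambda o: first_even(o) and sum_is_8(o))

print(add_rule_1, add_rule_2, mult_rule_1, mult_rule_2)  # 1/3 3/4 1/4 1/12
```

Each assertion compares the value given by the rule with the value obtained by counting outcomes directly.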
Bayes Theorem
Example
Let:
P(A) = Probability of someone suffering from the disease = 0.5% = 0.005
P(A') = Probability of someone not suffering from the disease = 1 – 0.005 = 0.995
P(B) = Probability of someone being positive in the test (this is required in part a)
P(B') = Probability of someone being negative in the test
P(B|A) = Probability of someone being positive in the test, given that they were suffering = 0.95 (given)
P(B'|A) = Probability of someone being negative in the test, given that they were suffering = 1 – 0.95 = 0.05
P(B|A') = Probability of someone being positive, given that they were not suffering = 0.1 (given)
P(B'|A') = Probability of someone being negative, given that they were not suffering = 1 – 0.1 = 0.9
Part a requires the “Total Probability” of someone being positive in the test, regardless of whether they were suffering or not.
The answer for part a will be the sum of the two branches of the probability tree that end in a positive test (highlighted in green on the slide):
P(A&B) = P(A).P(B|A)
P(A'&B) = P(A').P(B|A')
P(B) = P(A&B) + P(A'&B)
For comparison, the branch for a sufferer who tests negative is P(A&B') = P(A).P(B'|A).
Part b asks: given that a person got a positive test result, what is the probability that they were really suffering from the disease? => P(A|B) = P(A&B) / P(B)
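The arithmetic for both parts can be sketched with the values given in the example (0.005, 0.95 and 0.1). The variable names are illustrative assumptions.

```python
# Values given in the example.
p_a = 0.005             # P(A): person has the disease
p_b_given_a = 0.95      # P(B|A): positive test, given disease
p_b_given_not_a = 0.10  # P(B|A'): positive test, given no disease

p_not_a = 1 - p_a       # P(A')

# Part a: total probability of a positive test (sum of the two "positive" branches).
p_a_and_b = p_a * p_b_given_a
p_not_a_and_b = p_not_a * p_b_given_not_a
p_b = p_a_and_b + p_not_a_and_b

# Part b: Bayes theorem, P(A|B) = P(A&B) / P(B).
p_a_given_b = p_a_and_b / p_b

print(f"Part a: P(B)   = {p_b:.5f}")
print(f"Part b: P(A|B) = {p_a_given_b:.4f}")
```

With these inputs P(B) is about 0.104 and P(A|B) is about 0.046, so most positive results come from people without the disease because the disease is rare.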
