Statistics
Lecture 1
Muhammad Amin Qureshi
Exclusive
Experiment Outputs
Outputs comprise the sample space and each output has its probability of occurrence
Statistics uses information about probability to help us understand random systems and, ultimately, to draw inferences about them
Concept of a Random Variable
A function whose value is a real number determined by each element in the sample space is called a random variable.
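As a minimal sketch of this definition, the example below treats a random variable as an ordinary function on a sample space. The two-toss coin experiment and the name `number_of_heads` are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Assumed experiment: two tosses of a fair coin.
# The sample space is every possible ordered pair of results.
sample_space = list(product("HT", repeat=2))

def number_of_heads(outcome):
    """Random variable X: assigns a real number to each outcome."""
    return outcome.count("H")

# Each outcome is equally likely, so each has probability 1/4.
prob_per_outcome = Fraction(1, len(sample_space))

# Build the probability distribution of X by summing over outcomes.
distribution = {}
for outcome in sample_space:
    x = number_of_heads(outcome)
    distribution[x] = distribution.get(x, 0) + prob_per_outcome

for value in sorted(distribution):
    print(f"P(X = {value}) = {distribution[value]}")
```

The function maps outcomes to numbers; the probabilities come from the sample space, not from the function itself.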
Random Variable and Events
• A random variable is a variable whose domain is the set of basic
events, and whose range (outcome) could be numerical or
categorical.
Example
• Rank can be ordered as 1st, 2nd, 3rd, etc. (in a race)
• However, we cannot tell from this ordinal scale whether it was a close race or whether the
winner won by a mile!
Example
• Temperature
• Any temperature scale is made up of equal temperature units, so that the difference
between 40 and 50 degrees is equal to the difference between 50 and 60 degrees (for
example).
• With an interval scale, you know not only whether different values are bigger or smaller,
you also know how much bigger or smaller they are.
• A true (absolute) zero point is not defined on an interval scale!
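To illustrate why differences are meaningful on an interval scale but ratios are not, the sketch below expresses the same temperatures in Celsius and Fahrenheit. The helper `c_to_f` and the 20 versus 40 degree comparison are illustrative assumptions, not from the lecture.

```python
def c_to_f(celsius):
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Equal intervals: 40 -> 50 and 50 -> 60 are the same size in either unit.
diff_c_1 = 50 - 40
diff_c_2 = 60 - 50
diff_f_1 = c_to_f(50) - c_to_f(40)
diff_f_2 = c_to_f(60) - c_to_f(50)

# No true zero: the ratio of two readings depends on the unit chosen,
# so "40 degrees is twice as hot as 20 degrees" is not a valid statement.
ratio_c = 40 / 20
ratio_f = c_to_f(40) / c_to_f(20)

print("Celsius differences:", diff_c_1, diff_c_2)
print("Fahrenheit differences:", diff_f_1, diff_f_2)
print("Ratio in Celsius:", ratio_c)
print("Ratio in Fahrenheit:", ratio_f)
```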
Ratio Scale of Measurement
The ratio scale of measurement satisfies all four of the properties of
measurement: identity, magnitude, equal intervals, and a minimum value of
zero.
Example
• Distance
• Length
• Height
• Width
• Area
• Age
• Cost price
• Selling price
Properties of Measurement Scales
Each scale of measurement satisfies one or more of the
following properties of measurement.
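• Identity. Each value on the measurement scale has a unique meaning.
• Magnitude. Values on the measurement scale have an ordered relationship to one another, so
some values are larger and some are smaller.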
• Equal intervals. Scale units along the scale are equal to one another. This means, for
example, that the difference between 1 and 2 would be equal to the difference between
19 and 20.
• A minimum value of zero. The scale has a true zero point, below which no values exist.
Summary of Different Variable or Data Types
Random Variable
• Qualitative
• Discrete (Integer Numbers)
• Finite set: Grouped or Ungrouped
• Nominal
• Ordinal
• Quantitative
• Continuous (Real Numbers)
• Finite set: Grouped or Ungrouped type
• Nominal
• Ordinal
• Scaled
• Interval (Zero point not defined)
• Ratio (Zero point defined)
• Infinite set
Descriptive Statistics vs. Statistical Inference
• Descriptive statistics gives first-hand knowledge about the data
• Presenting, organizing and summarizing the data
• Probability calculations
• By frequency distribution tables
• Graphing the data
• Measures of central tendency
• Measures of Dispersion
• Five number summary
• Measures of shapes
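The descriptive tools listed above can be sketched with Python's standard library. The dataset below is hypothetical, and the population standard deviation is only one possible choice of dispersion measure.

```python
import statistics
from collections import Counter

# Hypothetical dataset, used only for illustration.
data = [2, 3, 3, 5, 7, 7, 7, 9]

# Frequency distribution table: value -> number of occurrences.
frequency_table = Counter(data)

# Measures of central tendency.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of dispersion.
data_range = max(data) - min(data)
std_dev = statistics.pstdev(data)  # population standard deviation

for value in sorted(frequency_table):
    print(f"{value}: {frequency_table[value]}")
print("mean:", mean, "median:", median, "mode:", mode)
print("range:", data_range, "std dev:", round(std_dev, 3))
```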
Population and Sampling
• The population always represents the target of an investigation. We learn about the population by sampling from the collection.
Sampling Types
• Probability Sampling
• Non-probability sampling
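As a rough sketch of the two sampling types, the example below draws a simple random sample (one form of probability sampling) and a convenience sample (one form of non-probability sampling). The population of 100 numbered units, the sample size and the seed are all assumptions made for illustration.

```python
import random

# Hypothetical population: 100 units labelled 1..100.
population = list(range(1, 101))

# Probability sampling: every unit has the same chance of selection.
# A fixed seed is used only so the sketch is reproducible.
rng = random.Random(42)
random_sample = rng.sample(population, k=10)

# Non-probability sampling: take whichever units are easiest to reach,
# here simply the first ten. Selection chances are not equal.
convenience_sample = population[:10]

print("simple random sample:", sorted(random_sample))
print("convenience sample:  ", convenience_sample)
```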
The Statistical Inference Process
• Graph your data (probability distributions)
• Look at the shape!
• Normal, skewed and kurtosis
• Interval Estimation
• It gives a range of values which is likely to contain the population parameter.
• This interval is called a confidence interval
• This procedure also tells us how likely the interval is to contain the actual parameter
• That value is called the confidence level
• Usually α = 1 - confidence level, which implies that confidence level = 1 - α
• A confidence interval is a random interval
• Inference
• Hypothesis testing
• One tailed and two tailed tests
• Type 1 and Type 2 errors
• Drawing conclusions and making predictions
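The statement that a confidence interval is a random interval can be illustrated with a small simulation. The sketch below assumes a normal population with a known standard deviation and uses a z-interval for the mean. The population parameters, sample size, number of trials and the helper `z_interval` are all hypothetical choices, not taken from the lecture.

```python
import random
from statistics import NormalDist, mean

def z_interval(sample, sigma, confidence=0.95):
    """Confidence interval for the mean, assuming sigma is known."""
    alpha = 1 - confidence                      # alpha = 1 - confidence level
    z = NormalDist().inv_cdf(1 - alpha / 2)     # two-sided critical value
    half_width = z * sigma / len(sample) ** 0.5
    centre = mean(sample)
    return centre - half_width, centre + half_width

# Assumed population parameters and simulation settings.
TRUE_MEAN, SIGMA, N, TRIALS = 10.0, 2.0, 30, 1000
rng = random.Random(0)  # fixed seed so the sketch is reproducible

hits = 0
for _ in range(TRIALS):
    sample = [rng.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    low, high = z_interval(sample, SIGMA)
    # Each new sample gives a different interval; count how often
    # the interval contains the fixed population mean.
    hits += low <= TRUE_MEAN <= high

coverage = hits / TRIALS
print("fraction of intervals containing the true mean:", coverage)
```

The population mean stays fixed while the interval changes from sample to sample, so the fraction of intervals that contain it should be close to the confidence level.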
Important Concepts and Formulae
1) Measures of position: quartiles, percentiles and deciles
Value
Value
Value
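As a hedged illustration of these measures of position, the sketch below computes quartiles, deciles and percentiles with `statistics.quantiles`. It assumes that function's default "exclusive" method, which places the i-th of k cut points at position i(n+1)/k in the sorted data. The lecture's own convention may differ, and the dataset is hypothetical.

```python
import statistics

# Hypothetical dataset: the integers 1..99.
data = list(range(1, 100))

# n=4 gives the three quartiles, n=10 the nine deciles,
# n=100 the ninety-nine percentiles.
quartiles = statistics.quantiles(data, n=4)
deciles = statistics.quantiles(data, n=10)
percentiles = statistics.quantiles(data, n=100)

print("quartiles:", quartiles)
print("5th decile:", deciles[4])
print("90th percentile:", percentiles[89])
```

The second quartile, the fifth decile and the fiftieth percentile all coincide with the median.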
Mean or Average value of Dataset
Let the dataset be represented by x_1, x_2, ..., x_n.
Variance of Mean
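A minimal sketch of the mean and the variance about the mean follows. The dataset is hypothetical. Both the population form (divide by n) and the sample form (divide by n - 1) are shown, since it is an assumption which of the two the lecture intends.

```python
import statistics

# Hypothetical dataset x_1, ..., x_n.
data = [4, 8, 6, 5, 3, 7]
n = len(data)

# Mean: sum of the values divided by the number of values.
mean = sum(data) / n

# Sum of squared deviations from the mean.
squared_deviations = sum((x - mean) ** 2 for x in data)

# Population variance divides by n; sample variance divides by n - 1.
population_variance = squared_deviations / n
sample_variance = squared_deviations / (n - 1)

print("mean:", mean)
print("population variance:", round(population_variance, 4))
print("sample variance:", round(sample_variance, 4))
```

The standard library functions `statistics.pvariance` and `statistics.variance` compute the same two quantities.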
Of the many shapes available, we are particularly interested in the shape of the normal distribution.
Deviation from the normal distribution along the horizontal axis is called skewness. A distribution can be either
positively or negatively skewed.
Deviation from the height of the corresponding normal distribution (along the vertical axis) is called kurtosis. It can be
either negative (platykurtic) or positive (leptokurtic).
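Skewness and kurtosis can be quantified in several ways. The sketch below uses the moment-based definitions (third and fourth standardized central moments, with 3 subtracted so that a normal distribution has kurtosis 0). This choice of definition, the helper names and the datasets are assumptions made for illustration.

```python
def central_moment(data, k):
    """k-th central moment: average of (x - mean)**k."""
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)

def skewness(data):
    """Positive for a longer right tail, negative for a longer left tail."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Negative: platykurtic. Positive: leptokurtic. Normal curve: 0."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

# Hypothetical datasets.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]
left_skewed = [-x for x in right_skewed]
heavy_tailed = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]

print("skewness (symmetric):", skewness(symmetric))
print("skewness (right-skewed):", round(skewness(right_skewed), 3))
print("skewness (left-skewed):", round(skewness(left_skewed), 3))
print("excess kurtosis (flat):", round(excess_kurtosis(symmetric), 3))
print("excess kurtosis (heavy-tailed):", round(excess_kurtosis(heavy_tailed), 3))
```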
[Figure: bell-shaped normal curve compared with skewed curves. Tail skewed towards the right, thus right-skewed data; tail skewed towards the left, thus left-skewed data.]
Ref: https://develve.net/Skewness.html
[Figure: bell-shaped normal distribution curve, for comparison of kurtosis.]
Ref: http://itfeature.com/statistics/measure-of-dispersion/measure-of-kurtosis
Characteristics of Distributions
Probability Formula 1
Let:
P(A) = Probability of someone suffering from the disease = 0.5% = 0.005
P(A') = Probability of someone not suffering from the disease = 1 – 0.005 = 0.995
P(B) = Probability of someone being positive in the test (this is required in part a)
P(B') = Probability of someone being negative in the test
P(B|A) = Probability of someone being positive in test, given that they were suffering = 0.95 (given)
P(B'|A) = Probability of someone being negative in test, given that they were suffering = 1 – 0.95 = 0.05
P(B|A') = Probability of someone being positive given that they were not suffering = 0.1 (given)
P(B'|A') = Probability of someone being negative given that they were not suffering = 1 – 0.1 = 0.9
Part a requires the “Total Probability” of someone being positive in the test, regardless of whether they were
suffering or not.
P(A&B) = P(A).P(B|A)
P(A'&B) = P(A').P(B|A')
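The total probability asked for in part a can be worked through numerically. The sketch below uses only the values given above. The variable names are illustrative, and the final step adds the two joint probabilities according to the law of total probability.

```python
# Given values from the problem statement.
p_a = 0.005              # P(A): suffering from the disease
p_b_given_a = 0.95       # P(B|A): positive test, given suffering
p_b_given_not_a = 0.10   # P(B|A'): positive test, given not suffering

# Complement of A.
p_not_a = 1 - p_a        # P(A') = 0.995

# Joint probabilities of a positive test with and without the disease.
p_a_and_b = p_a * p_b_given_a              # P(A&B)
p_not_a_and_b = p_not_a * p_b_given_not_a  # P(A'&B)

# Law of total probability: P(B) = P(A&B) + P(A'&B).
p_b = p_a_and_b + p_not_a_and_b

print("P(A&B)  =", round(p_a_and_b, 5))
print("P(A'&B) =", round(p_not_a_and_b, 5))
print("P(B)    =", round(p_b, 5))
```

Most of the positive results come from people who are not suffering, because that group is far larger.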