
MATH-361 PROBABILITY AND STATISTICS
INTRODUCTION

Course Title: Probability and Statistics
Course Code: Math-361
Credit Hours: 3-0
Text Book: Advanced Engineering Mathematics by Erwin Kreyszig
Reference Books: a) Probability and Statistics for Engineers and Scientists by Walpole
b) Probability and Statistics by Murray R. Spiegel
COURSE LEARNING OUTCOMES

Assigned PLOs: 1, 2

CLO  Course Learning Outcome                                                 Level of Learning  PLO
1.   Present sample data and extract its important features                  C4                 2
2.   Understand different discrete and continuous probability distributions  C2                 2
3.   Estimate different population parameters on the basis of samples        C3                 2
4.   Implement quality control measures                                      C3                 1
COURSE CONTENTS

Article  Topics                                                      Estimated Contact Hours
24.1     Graphical Representation of Data: Stem-and-Leaf Plot,       3
         Histogram, Boxplot; Mean, Standard Deviation, Variance
24.2     Sample Space, Experiment Outcomes, and Sampling with and    3
         without replacement, Set theory
24.3     Introduction to theory of Probability, Theorems of          3
         Probability, Conditional probability
24.4     Permutations and Combinations                               3
24.5     Random Variables and Probability Distributions              3
24.6     Mean and Variance of a Distribution, Expectation, Moments   3
24.7     Binomial, Poisson & Hypergeometric distributions            3
24.8     Normal distribution                                         3
24.9     Distributions of several Random Variables                   3
25.1     Random Sampling                                             3
25.2     Point estimation of Parameters                              3
25.3     Confidence intervals                                        3
25.4     Testing of hypothesis. Decisions                            3
25.5     Quality control, Control chart                              3
25.6     Acceptance sampling, errors & rectification                 3
25.7     Goodness of Fit, Chi-square test                            3
25.9     Regression Analysis                                         3
GRADING SCHEME

Quizzes = 10%
Assignment = 10%
One Hour Test = 30%
Final = 50%
WHY STUDY STATISTICS

When information is sought, statistical ideas suggest a typical collection process with four crucial steps.
1. Set clearly defined goals for the investigation.
2. Make a plan of what data to collect and how to collect it.
3. Apply appropriate statistical methods to efficiently extract information from the data.
4. Interpret the information and draw conclusions.
INTRODUCTION

Probability is the likelihood of something happening. When someone tells you the probability of something happening, they are telling you how likely that something is. When people buy lottery tickets, the probability of winning is usually stated, and sometimes it can be something like 1/10,000,000 (or even worse). This tells you that it is not very likely that you will win.

Statistics: The branch of mathematics that deals with the collection, organization, analysis, and interpretation of numerical data. Statistics is especially useful in drawing general conclusions about a set of data from a sample of the data.
STATISTICS

• Everything dealing with the collection, processing, analysis, and interpretation of numerical data belongs to the domain of statistics.
DESCRIPTIVE AND INFERENTIAL STATISTICS
Descriptive statistics uses the data to provide descriptions of the population, either through numerical calculations or graphs or tables.

Inferential statistics makes inferences and predictions about a population based on a sample of data taken from the population in question.
UNITS AND POPULATION OF UNITS

Unit: A single entity, usually an object or person, whose characteristics are of interest.

Population of units: The complete collection of units about which information is sought.
VARIABLE

Guided by the statement of purpose, we have a characteristic of interest for each unit in the population. The characteristic, which could be a qualitative trait, is called a variable if it can be expressed as a number.
EXAMPLES OF POPULATIONS,
UNITS, AND VARIABLES.
TWO BASIC CONCEPTS: POPULATION AND SAMPLE

A statistical population is the set of all measurements (or record of some quality trait) corresponding to each unit in the entire population of units about which information is sought.

A sample from a statistical population is the subset of measurements that are actually collected in the course of an investigation.
RELATIONSHIP BETWEEN PROBABILITY AND
INFERENTIAL STATISTICS

• Elements in probability allow us to draw conclusions about characteristics of hypothetical data taken from the population, based on known features of the population.
• The sample, along with inferential statistics, allows us to draw conclusions about the population, with inferential statistics making clear use of elements of probability.
DESCRIPTIVE STATISTICS

CLO1-PLO2: Present sample data and extract its important features
• Dot Plot
• Stem and Leaf Plot
• Frequency Distribution and Histogram Plot
• Descriptive Measures
➢ Sample Mean
➢ Sample Variance
➢ Sample Mode
➢ Sample Median
• Box Plot
DOT DIAGRAM

3, 6, −2, 4, 7, 4, 3

0.107, 0.196, 0.021, 0.283, 0.179, 0.854, 0.58, 0.19, 7.3, 1.18, 2.0
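
A dot diagram places one dot per observation above its value on a number line. The slide's own figure did not survive, so the following is only a minimal text sketch, assuming the first data set above is the one to be plotted; the helper name `dot_diagram` is made up for illustration.

```python
from collections import Counter


def dot_diagram(values):
    """Return one text row per integer from min to max, with one dot per observation."""
    counts = Counter(values)  # a missing value counts as 0
    rows = []
    for v in range(min(values), max(values) + 1):
        rows.append(f"{v:>3} | " + "." * counts[v])
    return rows


data = [3, 6, -2, 4, 7, 4, 3]  # first data set listed above
rows = dot_diagram(data)
for row in rows:
    print(row)
```

The repeated values 3 and 4 each show two dots, which is the kind of clustering a dot diagram is meant to reveal.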
STEM AND LEAF PLOT

• If a stem-and-leaf display has the row
1.2 | 0 2 3 5 8 with leaf unit = 0.01,
the corresponding data are 1.20, 1.22, 1.23, 1.25, and 1.28.

• If a stem-and-leaf display has two-digit leaves, as in
0.3 | 03 17 55 89 with first leaf digit unit = 0.01,
the corresponding data are 0.303, 0.317, 0.355, and 0.389.
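
Reading a row of a stem-and-leaf display amounts to appending each leaf's digits to the stem's digits. The sketch below shows this for the two rows above; it assumes the stem is written with its decimal point and the leaves simply continue its digits, and the function name `decode_row` is hypothetical.

```python
def decode_row(stem, leaves):
    """Append each leaf's digits to the stem's digits and parse the result as a number."""
    return [float(stem + leaf) for leaf in leaves]


one_digit = decode_row("1.2", ["0", "2", "3", "5", "8"])
two_digit = decode_row("0.3", ["03", "17", "55", "89"])
print(one_digit)
print(two_digit)
```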
FREQUENCY DISTRIBUTION AND
HISTOGRAM PLOT
• A frequency distribution is a table that divides a set of data into a suitable number of classes (categories), showing also the number of items belonging to each class.

• Such a table sacrifices some of the information contained in the data; instead of knowing the exact value of each item, we only know that it belongs to a certain class.
FREQUENCY DISTRIBUTION AND
HISTOGRAM PLOT

Numerical distribution: a frequency distribution in which the data are grouped according to size.

Categorical distribution: a frequency distribution in which the data are grouped according to some quality or attribute.
FREQUENCY DISTRIBUTION AND
HISTOGRAM PLOT
The following scores represent the final examination grades for an elementary statistics course:

23 60 79 32 57 74 52 70 82 36 80 77 81 95 41
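
As a rough illustration, the grades above can be grouped into a frequency distribution and drawn as a text histogram. The class width of 10 and the starting point of 20 are assumptions made here for the sketch; the slide does not state which classes were used.

```python
grades = [23, 60, 79, 32, 57, 74, 52, 70, 82, 36, 80, 77, 81, 95, 41]

START, WIDTH, N_CLASSES = 20, 10, 8  # assumed classes: 20-29, 30-39, ..., 90-99


def frequency_distribution(data, start, width, n_classes):
    """Count the items falling in each class [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_classes
    for x in data:
        counts[(x - start) // width] += 1
    return counts


counts = frequency_distribution(grades, START, WIDTH, N_CLASSES)
for i, c in enumerate(counts):
    lo = START + i * WIDTH
    print(f"{lo}-{lo + WIDTH - 1}: {c:2d} " + "*" * c)
```

Each row of asterisks plays the role of one histogram bar; the individual grades are no longer visible, only the class they belong to.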
FREQUENCY DISTRIBUTION AND
HISTOGRAM PLOT
• The class boundaries are the endpoints of the intervals that specify each class.
• The classes do not overlap, they accommodate all the data, and they are all of the same width.
FREQUENCY DISTRIBUTION AND
HISTOGRAM PLOT

• The midpoint of each class is called the class mark.
• The common interval between successive class marks is called the class interval of the distribution.
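
A small sketch of these two definitions follows. The class limits are hypothetical values picked for illustration and do not come from the slides.

```python
# Hypothetical class limits (lower, upper) for integer-valued data.
class_limits = [(20, 29), (30, 39), (40, 49), (50, 59)]

# Class mark = midpoint of each class.
class_marks = [(lo + hi) / 2 for lo, hi in class_limits]

# Class interval = common distance between successive class marks.
class_interval = class_marks[1] - class_marks[0]

print(class_marks)
print(class_interval)
```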
DISTRIBUTION SHAPES
FREQUENCY DISTRIBUTION

To illustrate the construction of a frequency distribution, let us consider the following 80 determinations of the daily emission (in tons) of sulfur oxides from an industrial plant:

15.8 26.4 17.3 11.2 23.9 24.8 18.7 13.9 9.0 13.2
22.7 9.8 6.2 14.7 17.5 26.1 12.8 28.6 17.6 23.7 26.8
22.7 18.0 20.5 11.0 20.9 15.5 19.4 16.7 10.7 19.1
15.2 22.9 26.6 20.4 21.4 19.2 21.6 16.9 19.0 18.5
23.0 24.6 20.1 16.2 18.0 7.7 13.5 23.5 14.5 14.4
29.6 19.4 17.0 20.8 24.3 22.5 24.6 18.4 18.1 8.3
21.9 12.3 22.3 13.3 11.8 19.3 20.0 25.7 31.8 25.9
10.5 15.9 27.5 18.1 17.9 9.4 24.1 20.1 28.5
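
One possible frequency distribution for these 80 values is sketched below. The seven classes 5.0-8.9, 9.0-12.9, ..., 29.0-32.9 are an assumption made here (equal width, non-overlapping, covering all the data); the slide itself does not list the classes.

```python
emissions = [
    15.8, 26.4, 17.3, 11.2, 23.9, 24.8, 18.7, 13.9, 9.0, 13.2,
    22.7, 9.8, 6.2, 14.7, 17.5, 26.1, 12.8, 28.6, 17.6, 23.7,
    26.8, 22.7, 18.0, 20.5, 11.0, 20.9, 15.5, 19.4, 16.7, 10.7,
    19.1, 15.2, 22.9, 26.6, 20.4, 21.4, 19.2, 21.6, 16.9, 19.0,
    18.5, 23.0, 24.6, 20.1, 16.2, 18.0, 7.7, 13.5, 23.5, 14.5,
    14.4, 29.6, 19.4, 17.0, 20.8, 24.3, 22.5, 24.6, 18.4, 18.1,
    8.3, 21.9, 12.3, 22.3, 13.3, 11.8, 19.3, 20.0, 25.7, 31.8,
    25.9, 10.5, 15.9, 27.5, 18.1, 17.9, 9.4, 24.1, 20.1, 28.5,
]

START, WIDTH, N_CLASSES = 5.0, 4.0, 7  # assumed classes, see the text above

counts = [0] * N_CLASSES
for x in emissions:
    counts[int((x - START) // WIDTH)] += 1  # index of the class containing x

for i, c in enumerate(counts):
    lo = START + i * WIDTH
    print(f"{lo:4.1f}-{lo + WIDTH - 0.1:4.1f}  {c:3d}  " + "*" * c)
```

The printed bars give a rough histogram; with these classes the largest count falls in the 17.0-20.9 class.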
DESCRIPTIVE MEASURES

Q1- In order to control costs, a company collects data on the weekly number of meals claimed on expense accounts. The numbers for five weeks are
15 14 2 7 and 13.
Find the mean and the median.

Q2- An engineering group receives email requests for technical information from sales and service persons. The daily numbers for six days are
11 9 17 19 4 and 15.
Find the mean and the median.
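
The two exercises can be checked with Python's standard `statistics` module. This is only a quick verification sketch; the variable names are chosen here for readability.

```python
import statistics

meals = [15, 14, 2, 7, 13]          # Q1: weekly meals claimed
requests = [11, 9, 17, 19, 4, 15]   # Q2: daily email requests

for name, data in [("Q1", meals), ("Q2", requests)]:
    print(name, "mean =", statistics.mean(data), "median =", statistics.median(data))
```

For Q1 the sample size is odd, so the median is the middle ordered value; for Q2 it is even, so the median is the mean of the two middle ordered values.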
QUARTILES AND PERCENTILES

• The median divides a set of data into halves.

• When an ordered data set is divided into quarters, the resulting division points are called sample quartiles.

• The first quartile, Q1, is a value that has one-fourth, or 25%, of the observations below its value. The first quartile is also the sample 25th percentile P0.25.
QUARTILES AND PERCENTILES

Sample percentiles
The sample 100 p-th percentile is a value such
that at least 100 p% of the observations are at or
below this value and at least 100 ( 1 − p )% are at
or above this value.

Sample quartiles
first quartile Q1 = 25th percentile
second quartile Q2 = 50th percentile
third quartile Q3 = 75th percentile
QUARTILES AND PERCENTILES
The following rule simplifies the calculation of sample percentiles.

Calculating the sample 100 p-th percentile

1. Order the n observations from smallest to largest.
2. Determine the product np. If np is not an integer, round it up to the next integer and find the corresponding ordered value. If np is an integer, say k, calculate the mean of the k-th and (k + 1)st ordered observations.
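
The rule above can be sketched as a short function. The name `sample_percentile` and the small tolerance used to decide whether np is an integer are choices made for this illustration, not part of the slides; library routines often use different interpolation conventions and may give slightly different values.

```python
import math


def sample_percentile(data, p):
    """Sample 100p-th percentile by the np rule, for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    ordered = sorted(data)            # step 1: order the observations
    n = len(ordered)
    np_ = n * p                       # step 2: the product np
    k = round(np_)
    # np is (numerically) an integer k: average the k-th and (k+1)st values.
    if 1 <= k < n and math.isclose(np_, k, abs_tol=1e-9):
        return (ordered[k - 1] + ordered[k]) / 2
    # Otherwise round np up and take that ordered value (1-based position).
    return ordered[math.ceil(np_) - 1]


print(sample_percentile([15, 14, 2, 7, 13], 0.5))       # np = 2.5, rounded up to 3
print(sample_percentile([11, 9, 17, 19, 4, 15], 0.5))   # np = 3, an integer
```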
QUARTILES AND PERCENTILES

Obtain the quartiles and the 97th percentile for the sulfur
emission data on page ??.
15.8 26.4 17.3 11.2 23.9 24.8 18.7 13.9 9.0 13.2 22.7
9.8 6.2 14.7 17.5 26.1 12.8 28.6 17.6 23.7 26.8 22.7
18.0 20.5 11.0 20.9 15.5 19.4 16.7 10.7 19.1 15.2 22.9
26.6 20.4 21.4 19.2 21.6 16.9 19.0 18.5 23.0 24.6 20.1
16.2 18.0 7.7 13.5 23.5 14.5 14.4 29.6 19.4 17.0 20.8
24.3 22.5 24.6 18.4 18.1 8.3 21.9 12.3 22.3 13.3 11.8
19.3 20.0 25.7 31.8 25.9 10.5 15.9 27.5 18.1 17.9 9.4
24.1 20.1 28.5
The ordered data are:


6.2 7.7 8.3 9.0 9.4 9.8 10.5 10.7 11.0 11.2 11.8 12.3 12.8 13.2 13.3 13.5 13.9 14.4
14.5 14.7 15.2 15.5 15.8 15.9 16.2 16.7 16.9 17.0 17.3 17.5 17.6 17.9 18.0 18.0 18.1
18.1 18.4 18.5 18.7 19.0 19.1 19.2 19.3 19.4 19.4 20.0 20.1 20.1 20.4 20.5 20.8 20.9
21.4 21.6 21.9 22.3 22.5 22.7 22.7 22.9 23.0 23.5 23.7 23.9 24.1 24.3 24.6 24.6 24.8
25.7 25.9 26.1 26.4 26.6 26.8 27.5 28.5 28.6 29.6 31.8
QUARTILES AND PERCENTILES

According to our calculation rule, np = 80(1/4) = 20 is an integer, so we take the mean of the 20th and 21st ordered observations:
Q1 = (14.7 + 15.2)/2 = 14.95.

Since np = 80(1/2) = 40, the second quartile, or median, is the mean of the 40th and 41st ordered observations:
Q2 = (19.0 + 19.1)/2 = 19.05,

while the third quartile is the mean of the 60th and 61st:
Q3 = (22.9 + 23.0)/2 = 22.95.

To obtain the 97th percentile P0.97, we determine that 0.97 × 80 = 77.6, which we round up to 78. Counting in to the 78th position, we obtain
P0.97 = 28.6.
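
As a check on this worked example, the sketch below applies the np rule to the 80 sulfur oxide values. The helper `sample_percentile` is a hypothetical name and a compact restatement of the rule, written here only so the block runs on its own.

```python
import math

emissions = [
    15.8, 26.4, 17.3, 11.2, 23.9, 24.8, 18.7, 13.9, 9.0, 13.2,
    22.7, 9.8, 6.2, 14.7, 17.5, 26.1, 12.8, 28.6, 17.6, 23.7,
    26.8, 22.7, 18.0, 20.5, 11.0, 20.9, 15.5, 19.4, 16.7, 10.7,
    19.1, 15.2, 22.9, 26.6, 20.4, 21.4, 19.2, 21.6, 16.9, 19.0,
    18.5, 23.0, 24.6, 20.1, 16.2, 18.0, 7.7, 13.5, 23.5, 14.5,
    14.4, 29.6, 19.4, 17.0, 20.8, 24.3, 22.5, 24.6, 18.4, 18.1,
    8.3, 21.9, 12.3, 22.3, 13.3, 11.8, 19.3, 20.0, 25.7, 31.8,
    25.9, 10.5, 15.9, 27.5, 18.1, 17.9, 9.4, 24.1, 20.1, 28.5,
]


def sample_percentile(data, p):
    """np rule: average two neighbours if np is an integer, else round np up."""
    ordered = sorted(data)
    np_ = len(ordered) * p
    k = round(np_)
    if 1 <= k < len(ordered) and math.isclose(np_, k, abs_tol=1e-9):
        return (ordered[k - 1] + ordered[k]) / 2
    return ordered[math.ceil(np_) - 1]


q1, q2, q3 = (sample_percentile(emissions, p) for p in (0.25, 0.50, 0.75))
p97 = sample_percentile(emissions, 0.97)
print(f"Q1 = {q1:.2f}, Q2 = {q2:.2f}, Q3 = {q3:.2f}, P0.97 = {p97:.1f}")
```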
BOX PLOT
MEAN. STANDARD DEVIATION. VARIANCE.
PROBLEMS

Q1- 20 21 20 19 20 19 21 19

Q2- 7 6 4 0 7 1 2 4 6 6

Q3. -0.52 0.11 -0.48 0.94 0.24 -0.19 -0.55
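
The slide does not state the task for these three data sets; assuming it is to compute the mean, variance, and standard deviation named in the heading, a minimal sketch with Python's `statistics` module could look like this. It uses the sample variance with divisor n − 1, which is also an assumption about the intended definition.

```python
import statistics

problems = {
    "Q1": [20, 21, 20, 19, 20, 19, 21, 19],
    "Q2": [7, 6, 4, 0, 7, 1, 2, 4, 6, 6],
    "Q3": [-0.52, 0.11, -0.48, 0.94, 0.24, -0.19, -0.55],
}

summary = {}
for name, data in problems.items():
    mean = statistics.mean(data)
    var = statistics.variance(data)  # sample variance, divisor n - 1 (assumed)
    std = statistics.stdev(data)     # square root of the sample variance
    summary[name] = (mean, var, std)
    print(f"{name}: mean = {mean:.4f}, variance = {var:.4f}, std dev = {std:.4f}")
```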


MATLAB COMMAND PROMPT
MATLAB COMMANDS SUMMARY