
INSTITUTE: UIE (AIT-CSE)

PROBABILITY AND STATISTICS


CST-229
Lecture – 1.5
TOPIC: Regression

Prepared by: Dr. Archana Sharma

DISCOVER . LEARN . EMPOWER


1
Course Objectives

CO Number | Title | Level
CO1 | To recall the basic concepts of Probability and Statistics | Remember
CO2 | To understand the ideas and classify the different distributions, based on different events | Understand
CO3 | To acquire knowledge on various techniques of Probability and Statistics to analyze the behavior of a distribution based on data available | Analysis
CO4 | To acquire knowledge on the broad perspective of probability theory | Understand
CO5 | To acquire knowledge on various discrete and continuous distributions along with their properties | Apply
2
Course Outcomes

After completing this course, the student will be able to:

• Formulate a statistical problem in mathematical terms from a real-life situation.
• Select an appropriate distribution for analyzing data specific to an experiment.
• Apply statistical hypothesis in general and in practice.
• Compute and interpret descriptive statistics using numerical and graphical techniques.
• Acquire knowledge on various discrete and continuous distributions along with their properties.
3
Topics to be covered:
• Definition of Regression
• Types of Regression
• Methods of Regression
• Uses of Regression Analysis

4
Regression
•  Regression analysis is a statistical tool that gives us the ability to
estimate the mathematical relationship between a dependent variable
(usually called y) and an independent variable (usually called x).
• The dependent variable is the variable for which we want to make a
prediction.
• While various non-linear forms may be used, simple linear regression
models are the most common.

5
Introduction
• The primary goal of quantitative analysis is to use current information about a phenomenon to predict its future behavior.
• Current information is usually in the form of a set of data.
• In a simple case, when the data form a set of pairs of numbers, we may interpret them as representing the observed values of an independent (or predictor) variable X and a dependent (or response) variable Y.

Lot size | Man-hours
30 | 73
20 | 50
60 | 128
80 | 170
40 | 87
50 | 108
60 | 135
30 | 69
70 | 148
60 | 132

6
OBJECTIVE:
Statistical relation between Lot size and Man-Hour

• The goal of the analyst who studies the data is to find a functional relation y = f(x) between the response variable y and the predictor variable x.

[Figure: scatter plot of Man-Hour (vertical axis, 0 to 180) against Lot size (horizontal axis, 0 to 90)]

7
Regression Function
• The statement that the relation between X and Y is statistical should be interpreted as providing the following guidelines:
1. Regard Y as a random variable.
2. For each X, take f(x) to be the expected value (i.e., mean value) of Y.
3. Given that E(Y) denotes the expected value of Y, call the equation
E(Y) = f(x)
the regression function.
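
As a rough illustration of guideline 2, the sketch below averages the observed man-hours at each distinct lot size from the Introduction data. These sample means are only crude estimates of E(Y) at each X, and the helper name conditional_means is an assumption made for this example, not something from the lecture.

```python
from collections import defaultdict

# Lot size (X) and man-hours (Y) pairs from the Introduction slide.
data = [(30, 73), (20, 50), (60, 128), (80, 170), (40, 87),
        (50, 108), (60, 135), (30, 69), (70, 148), (60, 132)]

def conditional_means(pairs):
    """Average the observed Y values at each distinct X (hypothetical helper)."""
    groups = defaultdict(list)
    for x, y in pairs:
        groups[x].append(y)
    return {x: sum(ys) / len(ys) for x, ys in sorted(groups.items())}

means = conditional_means(data)
for x, m in means.items():
    print(f"X = {x}: sample mean of Y = {m:.2f}")
```

Lot sizes 30 and 60 occur more than once in the data, so their averages are taken over several Y values, which is the sense in which Y is treated as a random variable for a fixed X.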

8
Historical Origin of Regression
• Regression Analysis was first
developed by Sir Francis Galton,
who studied the relation between
heights of sons and fathers.
• Heights of sons of both tall and
short fathers appeared to “revert”
or “regress” to the mean of the
group.

9
Regression Line
• If the scatter plot of our sample data suggests a linear relationship between two variables, i.e.
$y = \beta_0 + \beta_1 x$
we can summarize the relationship by drawing a straight line on the plot.
• The least squares method gives us the “best” estimated line for our set of sample data.

10
Types of Regression
• Regression analysis can be classified on the following bases
1. Change of Proportion
2. Number of Variables

11
Basis of Change in Proportions
• Linear Regression
• Non-Linear Regression

12
On the basis of Number of Variables
• Simple Regression
• Partial Regression
• Multiple Regression

13
Method of drawing regression Lines
• Free Hand Curve Method
• The method of Least Squares
• Regression equation through regression Coefficients

14
Method of Least Squares
• According to the least squares method, the line should be drawn through the plotted points in such a way that the sum of the squares of the deviations of the actual Y values from the computed values is the minimum or the least.
• The line which fits the points in the best manner should have $\sum (Y - \hat{Y})^2$ as minimum, where $\hat{Y}$ denotes the computed value of Y.
• A line fitted by this method is called the line of best fit.
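
The least squares criterion can be checked numerically. The following is a minimal sketch, assuming the lot size and man-hours data from the Introduction and the usual closed-form solution for a straight line; the function names sse and least_squares are chosen for this example only.

```python
# Lot size (X) and man-hours (Y) from the Introduction slide.
X = [30, 20, 60, 80, 40, 50, 60, 30, 70, 60]
Y = [73, 50, 128, 170, 87, 108, 135, 69, 148, 132]

def sse(a, b, xs, ys):
    """Sum of squared deviations of actual Y from the computed values a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def least_squares(xs, ys):
    """Closed-form intercept a and slope b that minimise sse for a straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

a, b = least_squares(X, Y)
best = sse(a, b, X, Y)

# Nudging the intercept or the slope away from the fit increases the sum of squares.
perturbed = [sse(a + da, b + db, X, Y)
             for da, db in [(1, 0), (-1, 0), (0, 0.05), (0, -0.05)]]
print(f"a = {a}, b = {b}, minimum sum of squares = {best}")
print("perturbed lines:", perturbed)
```

For this data the fitted line is Y = 10 + 2X with a minimum sum of squares of 60, and each perturbed line gives a larger value.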

15
Regression Equation of Y on X
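• It can be written as $\hat{Y} = a + bX$, where $\hat{Y}$ is the computed value of Y.
• There are two normal equations as follows:
$\sum Y = Na + b \sum X$
$\sum XY = a \sum X + b \sum X^2$
where N is the number of observed pairs.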
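
A minimal sketch of solving the two normal equations, assuming their standard form ΣY = Na + bΣX and ΣXY = aΣX + bΣX², and using the lot size data from the Introduction. The function name solve_normal_equations is hypothetical; the 2×2 system is solved by Cramer's rule.

```python
# Lot size (X) and man-hours (Y) from the Introduction slide.
X = [30, 20, 60, 80, 40, 50, 60, 30, 70, 60]
Y = [73, 50, 128, 170, 87, 108, 135, 69, 148, 132]

def solve_normal_equations(xs, ys):
    """Solve  sum(Y)  = N*a      + b*sum(X)
              sum(XY) = a*sum(X) + b*sum(X^2)   for a and b (Cramer's rule)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    det = n * sum_xx - sum_x * sum_x          # determinant of the 2x2 system
    a = (sum_y * sum_xx - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b

a, b = solve_normal_equations(X, Y)
print(f"Regression equation of Y on X: Y = {a} + {b} X")
```

Here ΣX = 500, ΣY = 1100, ΣX² = 28400 and ΣXY = 61800, which gives a = 10 and b = 2.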

16
Regression Equation of X on Y
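• It can be written as $\hat{X} = a + bY$, where $\hat{X}$ is the computed value of X. The constants a and b here are, in general, different from those in the equation of Y on X.
• There are two normal equations as follows:
$\sum X = Na + b \sum Y$
$\sum XY = a \sum Y + b \sum Y^2$
where N is the number of observed pairs.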
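
The sketch below fits the line of X on Y for the lot size data from the Introduction, using the deviation form of the slope (an equivalent way of solving the normal equations). It also checks the known identity that the product of the two regression coefficients equals the squared correlation coefficient. The names b_xy, b_yx and deviation_sums are assumptions made for this example.

```python
# Lot size (X) and man-hours (Y) from the Introduction slide.
X = [30, 20, 60, 80, 40, 50, 60, 30, 70, 60]
Y = [73, 50, 128, 170, 87, 108, 135, 69, 148, 132]

def deviation_sums(xs, ys):
    """Means and sums of products of deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy

mean_x, mean_y, sxy, sxx, syy = deviation_sums(X, Y)

b_xy = sxy / syy                  # slope of X on Y
a_xy = mean_x - b_xy * mean_y     # intercept of X on Y
b_yx = sxy / sxx                  # slope of Y on X, for comparison

r_squared = b_yx * b_xy           # product of the two regression coefficients
print(f"X on Y: X = {a_xy:.4f} + {b_xy:.4f} Y")
print(f"b_yx * b_xy = r^2 = {r_squared:.4f}")
```

The two regression lines coincide only when the correlation is perfect, which is why the line of X on Y is fitted separately rather than obtained by inverting the line of Y on X.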

17
Question

18
19
20
21
Uses of Regression
• Prediction of unknown value
• Nature of relationship
• Estimation of relationship
• Calculation of Coefficient of determination
• Helpful in calculation of error
• Policy Formation
• Test stone of Hypothesis
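
Two of these uses, prediction of an unknown value and calculation of the coefficient of determination, can be sketched with the lot size data from the Introduction. The lot size of 55 is a hypothetical value picked for illustration, and the function names are assumptions made for this example.

```python
# Lot size (X) and man-hours (Y) from the Introduction slide.
X = [30, 20, 60, 80, 40, 50, 60, 30, 70, 60]
Y = [73, 50, 128, 170, 87, 108, 135, 69, 148, 132]

def fit_line(xs, ys):
    """Least squares intercept and slope for the line of Y on X."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b * mean_x, b

def coefficient_of_determination(a, b, xs, ys):
    """1 - (unexplained variation / total variation) for the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

a, b = fit_line(X, Y)
new_lot_size = 55                       # hypothetical value, not in the data
predicted = a + b * new_lot_size        # prediction of an unknown value
r2 = coefficient_of_determination(a, b, X, Y)
print(f"Predicted man-hours for lot size {new_lot_size}: {predicted}")
print(f"Coefficient of determination: {r2:.4f}")
```

A coefficient of determination close to 1 indicates that the fitted line accounts for most of the variation in man-hours. A prediction far outside the observed lot sizes (20 to 80) would be less reliable.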

22
FAQ

• What is Regression analysis?


• What are the types of regression?
• What is the use of Regression analysis?

23
Practice Problem

1. Obtain the equations of the lines of regression for the data given below:
X | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Y | 9 | 8 | 10 | 12 | 11 | 13 | 14 | 16 | 15

2. From the following data of the ages of husbands and the ages of wives, find the two regression lines and the correlation coefficient. Also calculate the husband’s age when the wife’s age is 16.

Husband’s age | 22 | 23 | 23 | 24 | 26 | 27 | 27 | 28 | 30 | 30
Wife’s age | 18 | 20 | 21 | 20 | 21 | 22 | 23 | 24 | 25 | 26

24
References
• Book:
• SP GUPTA and VK KAPOOR (SULTAN CHAND PUBLICATION)
• Miller and Freund, Probability and Statistics for Engineers, Pearson, 2005
• Reference 1: NPTEL, KHAN ACADEMY, STATQUEST
• Reference 2: https://www.youtube.com/watch?v=FHLcT21Pzhs
• Reference 3: https://www.youtube.com/watch?v=8PJ24SrQqy8
• Reference 4: https://gradeup.co/study-notes-on-correlation-and-regression-i
• Reference 5: Introduction to Probability and Statistics (UDEMY)
• Reference 6: https://www.youtube.com/watch?v=JvS2triCgOY
• Reference 7: https://www.coursera.org/learn/basic-statistics
25
Topics covered

• Definition of Regression
• Types of Regression
• Methods of Regression
• Uses of Regression Analysis

26
THANK YOU

27
