
PRACTICAL BIOSTATISTICS

BMB-308
TUTORIAL
REPORT AND PRESENTATION

TOPIC: REGRESSION ANALYSIS
Group 6, presented by
1. M.D. Jahirul Islam (1039)
2. A.Z.M. Ariful Islam (1050)
3. M.D. Gulzar Hossain (1043)
4. M.D. Mostafizur Rahman (1220)
What is statistics?
Statistics is a field of study concerned with
1. Collection, organization, summarization and analysis of data.
2. Drawing inferences about a body of data.
What is biostatistics?
When the tools of statistics are applied to the analysis of data derived from the biological sciences and medicine, the field is called biostatistics.
Terms needed to understand regression analysis
Variable: An observed characteristic that takes different values in different persons, places or things.

In statistics, regression analysis includes any technique for modelling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables.

Most commonly, regression analysis estimates the conditional expectation of the dependent variable given the independent variables; that is, the average value of the dependent variable when the independent variables are held fixed.

The nature and strength of the relationship between variables are examined by regression and correlation analysis.

REGRESSION: Regression analysis is helpful in ascertaining the probable form of the relationship between variables. The ultimate objective of this analysis is usually to predict or estimate the value of one variable corresponding to a given value of another variable.
The purpose of regression is to describe mathematically the relation between the variables.
When the variables are perfectly correlated, the prediction is perfect; the less correlated the variables, the less accurate the prediction.
Steps in regression analysis
1. Determine whether or not the assumptions underlying a linear relationship are met in the data available for analysis.
2. Obtain the equation for the line that best fits the sample data.
3. Evaluate the equation to obtain some idea of the strength of the relationship and the usefulness of the equation for predicting and estimating.
4. If the data appear to conform satisfactorily to the linear model, use the equation obtained from the sample data to predict and to estimate.
When we use the equation to predict, we are predicting the value Y is likely to have when X has a given value. When we use it to estimate, we are estimating the mean of the subpopulation of Y values assumed to exist at the given value of X.
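Steps 2 to 4 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are hypothetical, the helper names fit_line and r_squared are invented here, and the line is fitted by ordinary least squares, which is the usual meaning of "the line that best fits" but is not named explicitly in the slides.

```python
from statistics import mean

def fit_line(xs, ys):
    """Step 2: least-squares slope and intercept for the sample data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Step 3: share of the variation in Y explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical sample data, not taken from the presentation.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)

# Step 4: use the fitted equation to predict Y at a new X value.
y_pred_at_6 = intercept + slope * 6
print(slope, intercept, r2, y_pred_at_6)
```

Step 1 (checking the assumptions) is not covered by this sketch; it is usually done by inspecting a scatter diagram and the residuals before the equation is trusted.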
Regression Equation
Because correlation assumes the variables are linearly related, the mathematical relation between the variables must be the equation of a line:
Y’ = slope * X + intercept
Y’ (read "Y prime") is the predicted value of the Y variable.
The slope is how steep the line is.
The intercept is where the line crosses the Y axis, that is, the value of Y’ when X = 0.
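As a small illustration of the regression equation, the sketch below evaluates Y’ for a given X. The function name predict and the values slope = 2 and intercept = 10 are hypothetical, chosen only to show how the equation is used.

```python
def predict(x, slope, intercept):
    """Return the predicted value Y' = slope * X + intercept."""
    return slope * x + intercept

# Hypothetical line with slope 2 and intercept 10.
y_at_0 = predict(0, slope=2, intercept=10)  # at X = 0 the prediction equals the intercept
y_at_3 = predict(3, slope=2, intercept=10)  # each unit of X adds one slope to Y'
print(y_at_0, y_at_3)
```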
The Slope
The slope is how steep the line is.
The slope is defined as the change in the Y axis value divided by the change in the X axis value.
By just looking at the lines, which one has the steepest slope?
[Figure: several straight lines of different steepness; X axis from 0 to 6, Y axis from 0 to 40]
Slope
Look at the left-most two points.
For the blue line the change in Y is 15 - 10 = 5. The change in X is 1 - 0 = 1. The slope is 5 / 1 = 5.
The slope of the green line is (12 - 10) / (1 - 0) = 2.
Black’s slope is 1.
Red’s slope is -1.
[Figure: blue, green, black and red lines; X axis from 0 to 6, Y axis from 0 to 40]
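The slope calculations for the blue and green lines can be reproduced with a short helper. The function name slope_between is hypothetical; the points are the left-most two points of each line as quoted in the slide.

```python
def slope_between(p1, p2):
    """Slope = change in Y divided by change in X between two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        raise ValueError("slope is undefined for a vertical line")
    return (y2 - y1) / (x2 - x1)

blue_slope = slope_between((0, 10), (1, 15))   # (15 - 10) / (1 - 0)
green_slope = slope_between((0, 10), (1, 12))  # (12 - 10) / (1 - 0)
print(blue_slope, green_slope)
```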
Intercept
The intercept is the Y axis value when X equals 0.
It is where the line strikes the Y axis when X = 0.
Blue’s intercept is 15.
Black and green’s intercept is 10.
Red’s intercept is 5.
[Figure: blue, green, black and red lines; X axis from 0 to 6, Y axis from 0 to 25]
CORRELATION: This analysis measures the strength of the relationship between variables.

The purpose of correlation is to determine whether two variables are linearly related to each other.
The correlation coefficient tells us:
the strength of the relation
the direction of the relation (direct, i.e. positive, or inverse, i.e. negative)
The correlation coefficient, however, does not tell us how the variables are related, i.e., it does not tell us how to predict the value of one variable given the value of the other.
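A minimal sketch of how a correlation coefficient can be computed is given below. It assumes the Pearson product-moment coefficient, which the slides do not name explicitly, and it uses hypothetical data; the function name pearson_r is invented for this example.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation: sign gives direction, magnitude gives strength."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: one perfectly direct and one perfectly inverse relation.
xs = [1, 2, 3, 4, 5]
r_direct = pearson_r(xs, [2, 4, 6, 8, 10])
r_inverse = pearson_r(xs, [10, 8, 6, 4, 2])
print(r_direct, r_inverse)
```

Both data sets give a coefficient of magnitude 1 although their slopes differ in sign, which illustrates why the coefficient alone does not supply the prediction equation.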
ASSUMPTIONS UNDERLYING SIMPLE LINEAR REGRESSION:
In the simple linear regression model there are two variables, X and Y. The variable X is usually referred to as the independent variable. Values of X are selected by the investigator, and corresponding to each preselected value of X, one or more values of Y are obtained. Y is called the dependent variable.
1. Values of the independent variable X are said to be fixed; this means that the values of X are preselected by the investigator. X is also referred to as a nonrandom variable or a mathematical variable.
2. The variable X is measured without error.
3. For each value of X there is a subpopulation of Y values; these subpopulations must be normally distributed.
4. The variances of the subpopulations of Y are all equal.
5. The means of the subpopulations of Y all lie on the same straight line. This is known as the assumption of linearity. This assumption may be expressed as μ_{y|x} = α + βx, where μ_{y|x} is the mean of the subpopulation of Y values for a particular value of X, and α and β are called the population regression coefficients. α and β represent the Y intercept and the slope, respectively.
6. The Y values are statistically independent: the values of Y observed at one value of X in no way depend on the values of Y observed at another value of X.
The regression model is y = α + βx + e, where e is the error term:
e = y - (α + βx)
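To make the model and its assumptions concrete, the sketch below simulates data that satisfy them: fixed X values, normally distributed errors with equal variance, and subpopulation means on the line α + βx. The parameter values (ALPHA, BETA, SIGMA), the X values and the seed are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical population parameters (not taken from the presentation).
ALPHA, BETA, SIGMA = 1.0, 2.0, 0.5

rng = random.Random(0)       # fixed seed so the sketch is repeatable
x_values = [1, 2, 3, 4]      # X is fixed: preselected by the investigator
n_per_x = 5                  # several Y values are observed at each X

data = []
for x in x_values:
    for _ in range(n_per_x):
        e = rng.gauss(0, SIGMA)      # normal error, same variance at every X
        y = ALPHA + BETA * x + e     # regression model: y = alpha + beta*x + e
        data.append((x, y, e))

def error_term(x, y, alpha=ALPHA, beta=BETA):
    """Recover the error term e = y - (alpha + beta*x)."""
    return y - (alpha + beta * x)

print(len(data), error_term(*data[0][:2]))
```

In real data α and β are unknown, so the errors cannot be computed exactly; the residuals from the fitted line are used in their place.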
The scatter diagram of the two variables, lipid peroxide (LPO) and bilirubin, shows that they are correlated, so we can carry out a regression analysis on these variables.
Number of observations: 20
The mean amount of LPO per individual is 1.5121.
The mean amount of bilirubin per individual is 0.2281.
The standard deviation of LPO is 0.7165.
The standard deviation of bilirubin is 0.7165.
Correlation coefficient:
The correlation coefficient between LPO and bilirubin is 0.708. This is positive and fairly close to 1, which means the two variables are strongly and directly correlated.
Coefficient of determination:
r² = 0.502
About 50% of the total variation in the dependent variable Y is explained by the regression on X.
So the regression model fits the sample data moderately well.
(The fit is considered poor when r² is less than 0.50, moderate when r² lies between 0.50 and 0.70, and good when r² lies between 0.70 and 1.0.)
Adjusted r² gives a more accurate value because it corrects for the sample size and the number of independent variables.
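The figures above can be checked with a short calculation. The sketch assumes the standard adjusted r² formula, 1 - (1 - r²)(n - 1)/(n - k - 1), with k = 1 independent variable; the slides do not report the adjusted value, so the number computed here is derived, not quoted. The function names are hypothetical, and the fit categories follow the thresholds stated in the slide.

```python
def adjusted_r2(r2, n, k):
    """Adjusted r-squared for n observations and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def fit_quality(r2):
    """Label the fit using the thresholds given in the slide."""
    if r2 < 0.50:
        return "poor"
    if r2 < 0.70:
        return "moderate"
    return "good"

r = 0.708                      # reported correlation coefficient
r2_from_r = r ** 2             # about 0.501; the slide reports 0.502 (rounding of r)
adj = adjusted_r2(0.502, n=20, k=1)
label = fit_quality(0.502)
print(round(r2_from_r, 3), round(adj, 3), label)
```

With only one independent variable and 20 observations the adjustment is small, but it still lowers the estimate to roughly 0.47.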
ANOVA: There is a significant effect of the independent variable on the dependent variable. So the slope β contributes significantly to the regression model.
COEFFICIENT: The intercept is 0.367: when there is no bilirubin in an individual's body, the model still predicts an LPO amount of 0.367.
The slope is 5.018: for each 1-unit increase in bilirubin, LPO increases by 5.018 units.
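The two coefficients define the fitted equation LPO’ = 0.367 + 5.018 * bilirubin. The sketch below evaluates it and runs two consistency checks against the summary statistics reported earlier. It assumes a least-squares fit, for which the line passes through the point of means and the slope equals r * s_y / s_x; the function name predict_lpo is hypothetical.

```python
INTERCEPT = 0.367   # reported intercept
SLOPE = 5.018       # reported slope

def predict_lpo(bilirubin):
    """Predicted LPO from the fitted equation."""
    return INTERCEPT + SLOPE * bilirubin

lpo_at_zero = predict_lpo(0.0)
# A least-squares line passes through (mean X, mean Y), so the prediction
# at the mean bilirubin (0.2281) should be close to the mean LPO (1.5121).
lpo_at_mean = predict_lpo(0.2281)
# A one-unit increase in bilirubin changes the prediction by the slope.
increase_per_unit = predict_lpo(1.2281) - predict_lpo(0.2281)

# SD of bilirubin implied by slope = r * s_y / s_x with r = 0.708 and
# s_y = 0.7165 (SD of LPO). This is a derived figure, not a reported one.
implied_sd_bilirubin = 0.708 * 0.7165 / SLOPE
print(lpo_at_zero, round(lpo_at_mean, 4), round(implied_sd_bilirubin, 4))
```

The prediction at the mean agrees with the reported mean LPO to within rounding. The implied standard deviation of bilirubin comes out near 0.10, which differs from the 0.7165 listed in the summary, so that figure may be worth rechecking against the original output.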
T statistic: The result is 0.226, which is greater than the level of significance α = 0.05. So we do not reject the null hypothesis at the 5% level of significance.
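For reference, the t test of the slope (null hypothesis β = 0) can be worked out from the reported r = 0.708 and n = 20 using t = r * sqrt(n - 2) / sqrt(1 - r²). This is a sketch under assumptions: the slides do not state which hypothesis the value 0.226 belongs to, so the calculation below does not attempt to reproduce it, and the critical value 2.101 is the standard two-sided 5% table value for 18 degrees of freedom.

```python
from math import sqrt

def t_from_r(r, n):
    """t statistic for H0: beta = 0 in simple linear regression."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

# Two-sided 5% critical value of Student's t with 18 degrees of freedom,
# taken from a standard t table.
T_CRIT_18_DF = 2.101

t_slope = t_from_r(0.708, 20)
reject_null = abs(t_slope) > T_CRIT_18_DF
f_stat = t_slope ** 2   # in simple regression the ANOVA F equals t squared
print(round(t_slope, 2), reject_null, round(f_stat, 2))
```

Under these assumptions the slope test rejects the null hypothesis, in line with the ANOVA result. The value 0.226 would then have to belong to a different test, for example the test of the intercept, but the slides do not confirm this.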
