

IJD® MODULE ON BIOSTATISTICS AND RESEARCH METHODOLOGY FOR THE DERMATOLOGIST
MODULE EDITOR: SAUMYA PANDA

Biostatistics Series Module 6: Correlation and Linear Regression


Avijit Hazra, Nithya Gogtay1

From the Department of Pharmacology, Institute of Postgraduate Medical Education and Research, Kolkata, West Bengal, 1Department of Clinical Pharmacology, Seth GS Medical College and KEM Hospital, Mumbai, Maharashtra, India

Address for correspondence: Dr. Avijit Hazra, Department of Pharmacology, Institute of Postgraduate Medical Education and Research, 244B Acharya J. C. Bose Road, Kolkata - 700 020, West Bengal, India. E-mail: blowfans@yahoo.co.in

Abstract

Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If the normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ), may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r² denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of a linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous.

Key Words: Bland–Altman plot, correlation, correlation coefficient, intraclass correlation coefficient, method of least squares, Pearson's r, point biserial correlation coefficient, Spearman's rho, regression

Introduction

The word correlation is used in day-to-day life to denote some form of association. In statistics, correlation analysis quantifies the strength of the association between two numerical variables. If the association is "strong," then an attempt may be made mathematically to develop a predictive relationship between the two variables so that, given the value of one, the value of the other may be predicted from it and vice versa. Defining this mathematical relationship as an equation is the essence of regression analysis. Correlation and regression analysis are therefore like two sides of the same coin.

The Scatter Plot

When exploring the relationship between two numerical variables, the first and essential step is to graphically depict the relationship on a scatter plot or scatter diagram or scattergram. This is simply a bivariate plot of one variable against the other. Before plotting, one or both variables may be logarithmically transformed to obtain a more normal distribution.

On a scatter diagram, it is customary to plot the independent variable on the X-axis and the dependent variable on the Y-axis.


However, the "independent" and "dependent" distinction can be puzzling at times. For instance, if we are exploring the relationship between age and stature in children, it is reasonable to assume that age is the "independent" variable on which height depends. So, customarily, we will plot age on the X-axis and height on the Y-axis. However, if we are exploring the relationship between serum potassium and venous plasma glucose levels, which variable do we treat as the "dependent" variable? In such cases, it usually does not matter which variable is attributed to a particular axis of the scatter diagram. If our intention is to draw inferences about one variable (the outcome or response variable) from the other (the predictor or explanatory variable), the observations from which the inferences are to be made are usually placed on the horizontal axis.

Once plotted, the closer the points lie to a straight line, the stronger is the linear relationship between the two variables. Examine the two scatter plots presented in Figure 1. In Figure 1a, the individual dots representing the paired xy values closely approximate a straight line, and the value of y increases as the value of x increases. This type of relationship is referred to as a direct linear relationship. In Figure 1b, the dots also approximate a straight line, though not as closely as in Figure 1a. Moreover, the value of y declines as that of x increases. This type of relationship is an inverse or reciprocal linear relationship.

Now inspect the scatter plot shown in Figure 2. In this instance, there is also a strong relation between the dose of the drug and the response – the response is low to begin with, rises steadily over the subsequent portion of the dose range, but then tends to decline with further increase in dose. It is clear that although the relationship is strong, it cannot be approximated by a single straight line but can be described by an appropriate curved line. The association, in this case, is curvilinear rather than linear. We are going to discuss correlation and regression assuming a linear relationship between the variables in question. Although many biological phenomena show nonlinear relationships, the mathematics of these is more complex and beyond the scope of this module.

Figure 1: Scatter diagram depicting direct and inverse linear relationships
Figure 2: Scatter diagram depicting a curvilinear relationship

The Correlation Coefficient

To quantify the strength of the relationship between two variables shown to have a linear relationship on the scatter plot, we calculate the correlation coefficient. The coefficient takes values only between −1 and +1, with the numerical magnitude depicting the strength of the relationship and the sign indicating its direction. Thus, the sign accompanying a correlation coefficient is not a + or − sign in the arithmetic sense. Rather, the plus sign denotes a direct relationship, whereas the minus sign denotes an inverse relationship.

If both variables x and y are normally distributed, we calculate Pearson's product moment correlation coefficient, or Pearson's correlation coefficient, or simply r (after Karl Pearson). It is calculated as the covariance of the two variables divided by the product of their standard deviations (SDs), and the term "product moment" in the name derives from the mathematical nature of this relationship.
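To make the definition concrete, the short Python sketch below (not part of the original article; the data and variable names are hypothetical) computes r as the covariance divided by the product of the SDs and checks the result against scipy.stats.pearsonr.

```python
import numpy as np
from scipy import stats

# Hypothetical paired observations (e.g., x = age in years, y = height in cm)
x = np.array([4, 5, 6, 7, 8, 9, 10, 11], dtype=float)
y = np.array([101, 108, 115, 121, 128, 133, 139, 146], dtype=float)

# Pearson's r = covariance(x, y) / (SD(x) * SD(y))
# (using the sample, n - 1, versions of covariance and SD consistently)
cov_xy = np.cov(x, y, ddof=1)[0, 1]
r_manual = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

# Library check: scipy also returns the P value of the test that the
# population correlation coefficient is zero
r_scipy, p_value = stats.pearsonr(x, y)

print(f"r (manual) = {r_manual:.4f}")
print(f"r (scipy)  = {r_scipy:.4f}, P = {p_value:.4g}")
```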
A value of r close to +1 indicates a strong direct linear relationship (i.e., one variable increases with the other, as in Figure 3a). A value close to −1 indicates a strong inverse linear relationship (i.e., one variable decreases as the other increases; Figure 3b). A value close to 0 indicates a random scatter of the values [Figure 3c]; alternatively, there could be a nonlinear relationship between the variables [Figure 3d]. The scatter plot is indispensable in checking the assumption of a linear relationship, and it is meaningless to calculate a correlation coefficient without such a relation between the two variables. Between the states of "no correlation at all" (r = 0) and "perfect correlation" (r = 1), interim values of the correlation coefficient are interpreted by convention. Thus, values >0.7 may be regarded as "strong" correlation, values between 0.50 and 0.70 may be interpreted as "good" correlation, values between 0.3 and 0.5 may be treated as "fair" or "moderate" correlation, and any value <0.30 would be poor correlation.

Figure 3: Scatter diagram depicting relationship patterns between two variables


However, we must remember that the interpretation of a correlation coefficient depends on the context and purposes. A correlation of 0.85 may be very low if one is verifying a physical law using high-quality instruments or trying to derive the standard curve for a quantitative assay, but may be regarded as very high in the clinical context.

Note that the correlation coefficient has no units and is, therefore, a dimensionless statistic. The positions of x and y can be interchanged on a scatter plot without affecting the value of r.

If one or both variables in a correlation analysis are not normally distributed, a rank correlation coefficient, which depends on the rank order of the values rather than the actual observed values, can be calculated. Examples include the Spearman's rho (ρ) (after Charles Edward Spearman) and Kendall's tau (τ) (after Maurice George Kendall) statistics. In essence, Spearman's rank correlation coefficient rho, which is the more frequently used nonparametric correlation, is simply Pearson's product moment correlation coefficient calculated for the rank values of x and y rather than their actual values. It is also appropriate to use ρ rather than r when at least one variable is measured on an ordinal scale or when the sample size is small (say, n ≤ 10); ρ is also less sensitive than r to deviations from a linear relation.
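As an illustrative aside (our own sketch, with hypothetical data), the following Python snippet shows that Spearman's rho is simply Pearson's r applied to the ranks, and that scipy.stats.spearmanr returns the same value directly.

```python
import numpy as np
from scipy import stats

# Hypothetical paired data with a monotonic but not strictly linear pattern
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 2.3, 3.0, 5.8, 6.0, 11.5, 12.0, 25.0])

# Spearman's rho = Pearson's r applied to the ranks of x and y
rho_via_ranks, _ = stats.pearsonr(stats.rankdata(x), stats.rankdata(y))

# Direct calculation for comparison
rho_direct, p_value = stats.spearmanr(x, y)

print(f"rho via ranks = {rho_via_ranks:.4f}")
print(f"rho direct    = {rho_direct:.4f}, P = {p_value:.4g}")
```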

Although less often used, Kendall's tau is another nonparametric correlation offered by many statistical packages. Some statisticians recommend that it should be used, rather than Spearman's coefficient, when the data set is small with the possibility of a large number of tied ranks. This means that if we rank all of the scores and many scores have the same rank, Kendall's tau should be used. It is also considered to be a more accurate gauge of the correlation in the underlying population.

Hypothesis Test of Correlation and Confidence Interval for Correlation Coefficient

When we calculate the correlation coefficient for sample data, we quantify the strength of the linear relationship between two variables measured in the sample. A high correlation coefficient value indicates a strong correlation for the sample. However, what about the relationship in the population as a whole, from which the sample has been drawn? A significance test, based on the t distribution, addresses this question. It allows us to test whether the association might have occurred merely by chance or whether it is a true association. This significance test is applied with the null hypothesis that the population correlation coefficient equals 0. For any given value of r, and taking the sample size into consideration, P values can be obtained from most statistical packages. As usual, P < 0.05 indicates that there is sufficient evidence to suggest that the true (population) correlation coefficient is not 0 and that the linear relationship between the two variables observed in the sample also holds for the underlying population.

However, although the hypothesis test indicates whether there is a linear relationship, it gives no indication of the strength of that association. This additional information can be obtained from the 95% confidence interval (CI) for the population correlation coefficient. Calculation of this CI requires r to be transformed to give a normal distribution by making use of Fisher's z transformation. The width of the CI depends on the sample size, and it is possible to calculate the sample size required for a given level of accuracy.
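A minimal sketch of how such a confidence interval may be computed via Fisher's z transformation is given below; the Python code and the example figures (r = 0.70, n = 30) are illustrative assumptions, not values from the article.

```python
import numpy as np
from scipy import stats

def pearson_ci(r, n, level=0.95):
    """Approximate CI for a correlation coefficient via Fisher's z transformation."""
    z = np.arctanh(r)                 # Fisher's z = 0.5 * ln((1 + r) / (1 - r))
    se = 1.0 / np.sqrt(n - 3)         # standard error of z
    z_crit = stats.norm.ppf(0.5 + level / 2)
    lo, hi = z - z_crit * se, z + z_crit * se
    return np.tanh(lo), np.tanh(hi)   # back-transform to the r scale

# Example: r = 0.70 observed in a sample of 30 pairs (hypothetical numbers)
print(pearson_ci(0.70, 30))           # roughly (0.45, 0.85)
```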


Coefficient of Determination

When exploring the linear relationship between numerical variables, a part of the variation in one of the variables can be thought of as being due to its relationship with the other variable, with the rest due to undetermined (often random) causes. A coefficient of determination can be calculated to denote the proportion of the variability of y that can be attributed to its linear relation with x. This is taken simply as r multiplied by itself, that is, r². It is also denoted as R².

For any given value of r, r² will denote a value that is closer to 0 and will be devoid of a sign. Thus, if r is +0.7 or −0.7, r² will be 0.49. We can interpret this figure of 0.49 as meaning that 49% of the variability in y is due to variation in x. Values of r² close to 1 imply that most of the variability in y is explained by its linear relationship with x. The value (1 − r²) has sometimes been referred to as the coefficient of alienation.

In statistical modeling, the r² statistic gives information about the goodness of fit of a model. In regression, it denotes how well the regression line approximates the real data points. An r² of 1 indicates that the regression line perfectly fits the data. Note that values of r² outside the range 0–1 can occur where it is used to measure the agreement between observed and modeled values, and the modeled values are not obtained by linear regression.
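The following hedged Python sketch (hypothetical data) obtains r from a simple regression fit and squares it to give the coefficient of determination described above.

```python
import numpy as np
from scipy import stats

# Hypothetical paired data
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.3, 2.9, 4.1, 4.8, 6.2, 6.8, 8.1, 8.7])

result = stats.linregress(x, y)
r = result.rvalue
r_squared = r ** 2          # coefficient of determination

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
print(f"About {100 * r_squared:.0f}% of the variability in y is "
      "attributable to its linear relation with x.")
print(f"Coefficient of alienation (1 - r^2) = {1 - r_squared:.3f}")
```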
Point Biserial and Biserial Correlation

The point biserial correlation is a special case of the product-moment correlation, in which one variable is continuous and the other variable is binary. The point biserial correlation coefficient measures the association between a binary variable x, taking values 0 or 1, and a continuous numerical variable y. It is assumed that for each value of x, the distribution of y is normal, with different means but the same variance. It is often abbreviated as rPB.

The binary variable frequently has categories such as yes or no, present or absent, success or failure, and so on.

If the variable x is not naturally dichotomous but is artificially dichotomized, we calculate the biserial correlation coefficient rB instead of the point-biserial correlation coefficient.

Although not often used, an example where we may apply the point biserial correlation coefficient would be in cancer studies. How strong is the association between administering the anticancer drug (active drug vs. placebo) and the length of survival after treatment? The value would be interpreted in the same way as Pearson's r. Thus, the value would range from −1 to +1, where −1 indicates a perfect inverse association, +1 indicates a perfect direct association, and 0 indicates no association at all. Take another example. Suppose we want to calculate the correlation between intelligence quotient and the score on a certain test, but not all the test scores are available to us, although we know whether each subject passed or failed. We could then use the biserial correlation.
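As a rough illustration of the cancer-study example above, the Python sketch below (with invented treatment codes and survival times) computes the point biserial correlation and confirms that it equals Pearson's r calculated on the 0/1 codes.

```python
import numpy as np
from scipy import stats

# Hypothetical data: x codes treatment group (0 = placebo, 1 = active drug),
# y is survival time in months after treatment
x = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
y = np.array([8.0, 11.0, 9.5, 12.0, 10.0, 14.0, 16.5, 13.5, 18.0, 15.0])

# Point biserial correlation: a special case of Pearson's r with a 0/1 variable
r_pb, p_value = stats.pointbiserialr(x, y)

# Identical result from the ordinary Pearson formula
r_pearson, _ = stats.pearsonr(x, y)

print(f"r_pb = {r_pb:.3f} (P = {p_value:.4g}); Pearson's r = {r_pearson:.3f}")
```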
The Phi Coefficient

The phi coefficient (also called the mean square contingency coefficient) is a measure of association for two binary variables. It is denoted as ϕ or rϕ.

Also introduced by Karl Pearson, this statistic is similar to Pearson's correlation coefficient in its derivation. In fact, a Pearson's correlation coefficient estimated for two binary variables will return the phi coefficient. The interpretation of the phi coefficient requires caution. It has a maximum value that is determined by the distribution of the two variables. If both have a 50/50 split, values of phi will range from −1 to +1.

Application of the phi coefficient is particularly seen in educational and psychological research, in which the use of dichotomous variables is frequent. Suppose we are interested in exploring whether, in a group of students opting for university courses, there is a gender preference for physical sciences or life sciences. Here, we have two binary variables – gender (male or female) and discipline (physical science or life science). We can arrange the data as a 2 × 2 contingency table and calculate the phi coefficient.
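A minimal sketch of this calculation, assuming a hypothetical 2 × 2 table of counts, is shown below; the standard 2 × 2 formula for phi is used and cross-checked against Pearson's r on the underlying 0/1 codes.

```python
import numpy as np

# Hypothetical 2 x 2 contingency table of gender versus chosen discipline:
#                 physical science   life science
#   male                 40                25
#   female               20                35
table = np.array([[40, 25],
                  [20, 35]], dtype=float)

a, b = table[0]
c, d = table[1]

# Phi coefficient for a 2 x 2 table
phi = (a * d - b * c) / np.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Cross-check: Pearson's r on the underlying 0/1 codes gives the same value
gender = np.repeat([0, 0, 1, 1], table.flatten().astype(int))
discipline = np.repeat([0, 1, 0, 1], table.flatten().astype(int))
r_check = np.corrcoef(gender, discipline)[0, 1]

print(f"phi = {phi:.3f}, Pearson's r on 0/1 codes = {r_check:.3f}")
```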
Simple Linear Regression

If two variables are highly correlated, it is then feasible to predict the value of one (the dependent variable) from the value of the other (the independent variable) using regression techniques. In simple linear regression, the value of one variable (x) is used to predict the value of the other variable (y) by means of a simple mathematical function, the linear regression equation, which quantifies the straight-line relationship between the two variables. This straight line, or regression line, is actually the "line of best fit" for the data points on the scatter plot showing the relationship between the variables in question.

The regression line has the general formula:

y = a + bx

where "a" and "b" are two constants denoting the intercept of the line on the Y-axis (y-intercept) and the gradient (slope) of the line, respectively. The other name for b is the "regression coefficient." Physically, "b" represents the change in y for every 1 unit change in x, while "a" represents the value that y would take if x were 0. Once the values of a and b have been established, the expected value of y can be predicted for any given value of x, and vice versa. Thus, a model for predicting y from x is established. There may be situations in which a straight line passing through the origin will be appropriate for the data, and in these cases, the equation of the regression line simplifies to y = bx.

But how do we fit a straight line to a scattered set of points which seem to be in a linear relationship? If the points are not all on a single straight line, we can, by eye estimation, draw multiple lines that seem to fit the series of data points on the scatter diagram. But which is the line of best fit? This problem had mathematicians stumped literally for centuries. The solution was in the form of the method of least squares, which was first published by the French mathematician Adrien-Marie Legendre in 1805 but used earlier by Carl Friedrich Gauss in Germany in 1795 to calculate the orbits of celestial bodies. Gauss developed the idea further and today is known as the father of regression.


Look at Figure 4. When we have a scattered series of dots which lie approximately but not exactly on a straight line, we can, by eye estimation, draw a number of lines that seem to fit the series. But which is the line of best fit? The method of least squares, in essence, selects the line that would provide the least sum of squares for the vertical residuals or offsets. In Figure 4, it is line B, as shown in the lower panel. For a particular value of x, the vertical distance between the observed and fitted value of y is known as the residual or offset. Since some of the residuals lie above and some below the line of best fit, we require a + or − sign to denote the residuals mathematically. Squaring removes the effect of the − sign. The method of least squares finds the values of "a" and "b" that minimize the sum of the squares of all the residuals. The method of least squares is not the only technique, but it is regarded as the simplest technique for linear regression, that is, for the task of finding the straight line of best fit for a series of points depicting a linear relationship on a scatter diagram.
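The least squares estimates can be written in closed form: b is the covariance of x and y divided by the variance of x, and a is the mean of y minus b times the mean of x. The Python sketch below (hypothetical height and weight data, not from the article) applies these formulas, shows the residuals whose squared sum is being minimized, and checks the result against scipy.stats.linregress.

```python
import numpy as np
from scipy import stats

# Hypothetical data: x = height (cm), y = weight (kg)
x = np.array([150, 155, 160, 165, 170, 175, 180, 185], dtype=float)
y = np.array([52.0, 55.5, 58.0, 63.0, 66.5, 71.0, 74.5, 79.0])

# Least squares estimates:
#   b = covariance(x, y) / variance(x),  a = mean(y) - b * mean(x)
# These are the values of a and b that minimize the sum of squared residuals.
b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
a = y.mean() - b * x.mean()

# Residuals (observed y minus fitted y) and their sum of squares
residuals = y - (a + b * x)
ss_res = np.sum(residuals ** 2)

# Library check
fit = stats.linregress(x, y)

print(f"manual: a = {a:.2f}, b = {b:.3f}, sum of squared residuals = {ss_res:.2f}")
print(f"scipy : a = {fit.intercept:.2f}, b = {fit.slope:.3f}")
```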
You may wonder why the statistical procedure of fitting a line is called "regression," which in common usage means "going backward." Interestingly, the term was used neither by Legendre nor by Gauss but is attributed to the English scientist Francis Galton, who had a keen interest in heredity. In Victorian England, Galton measured the heights of 202 fathers and their first-born adult sons and plotted them on a graph of median height versus height group. The scatter for fathers and sons approximated two lines that intersected at a point representing the average height of the adult English population. Studying this plot, Galton made the very interesting observation that tall fathers tend to have tall sons, but they are not as tall as their fathers, and short fathers tend to have short sons, but they are not as short as their fathers; in the course of just two or three generations, the height of individuals tended to go back or "regress" to the mean population height. He published a famous paper titled "Regression towards mediocrity in hereditary stature." This phenomenon of regression to the mean can be observed in many biological variables. The term regression subsequently somehow got attached to the procedure of line fitting itself.

Note that in our discussion above, we have discussed the predictive relationship between two numerical variables. This is simple linear regression. If the value of y requires more than one numerical variable for a reasonable prediction, we are encountering the situation called multiple linear regression. We will be discussing the basics of this in a future module.

Figure 4: The principle of the method of least squares for linear regression. The sum of the squared "residuals" is the least for the line of best fit
Figure 5: Examples of misleading correlations

Pitfalls in Correlation and Regression Analysis

Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, the conclusions drawn can be misleading. Both assume that the relationship between the two variables is linear. The observations have to be independent – they are not independent if there is more than one pair of observations (that is, repeat measurements) from one individual. For correlation, both variables should be random variables, although for regression, only the response variable y needs to be random.

Inspecting a scatter plot is of utmost importance before estimation of the correlation coefficient, for many reasons:


• A nonlinear relationship may exist between two variables that would be inadequately described, or possibly even undetected, by the correlation coefficient. For instance, the correlation coefficient, if calculated for the set of data points in Figure 2, would be almost zero, but we would be grossly wrong if we concluded that there is no association between the variables. The zero coefficient only tells us that there is no linear (straight-line) association between the variables, when in reality there is a clear curvilinear (curved-line) association between them
• An outlier may create a false correlation (a small numerical sketch after this list illustrates the effect). Inspect the scatter plot in Figure 5a. The r value of 0.923 suggests a strong correlation. However, a closer look makes it obvious that the series of dots is actually quite scattered, and the apparent correlation is being created by the outlier point. This kind of outlier is called a univariate outlier. If we consider the x value of this point, it is way beyond the range of the rest of the x values; similarly, the y value of the point is much beyond the range of y values for the rest of the dots. A univariate outlier is easy to spot by simply sorting the values or constructing box plots
• Conversely, an outlier can also spoil a correlation. The scatter plot in Figure 5b suggests only moderate correlation at an r value of 0.583, but closer inspection reveals that a single bivariate outlier reduces what is otherwise an almost perfect association between the variables. Note that the deviant case is not an outlier in the usual univariate sense. Individually, its x value and y value are unexceptional. What is exceptional is the combination of values on the two variables that it exhibits, making it an outlier, and this would be evident only on a scatter plot
• Clustering within data sets may also inflate a correlation. Look at Figure 5c. Two clusters are evident, and individually they do not appear to show strong correlation. However, combining the two suggests a decent correlation. This combination may be undesirable in real life. Clustering within data sets may be a pointer that the sampling has not really been random.
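The sketch below (entirely simulated data, not from the article) illustrates the outlier effect described in the second bullet: a single extreme point added to an otherwise unrelated pair of variables inflates Pearson's r dramatically.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical data: 20 points with essentially no relationship
x = rng.normal(50, 5, 20)
y = rng.normal(50, 5, 20)

r_without, _ = stats.pearsonr(x, y)

# Add a single univariate outlier, far beyond the range of both variables
x_out = np.append(x, 120)
y_out = np.append(y, 120)

r_with, _ = stats.pearsonr(x_out, y_out)

print(f"r without the outlier = {r_without:.3f}")   # typically near 0
print(f"r with one outlier    = {r_with:.3f}")      # spuriously large
```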
When using a regression equation for prediction, errors in prediction may not be just random but may also be due to inadequacies in the model. In particular, extrapolating beyond the range of observed data can be risky and is best avoided. Consider the simple regression equation:

Weight = a + b × height

Suppose we give a height value of 0. The corresponding weight value, strangely, is not 0 but equals a. What is the matter here? Is the equation derived through regression faulty? The fact is that the equation is not at fault, but we are trying to extrapolate its use beyond the range of values used in deriving the equation through least squares regression. This is a common pitfall. Equations derived from one sample should not be automatically applied to another sample. Equations derived from adults, for instance, should not be applied to children.
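A brief sketch, with made-up adult height and weight data, of why predictions outside the fitted range are untrustworthy: the intercept "a" is simply what the line returns at height 0, and it need not be a physiologically meaningful weight.

```python
import numpy as np
from scipy import stats

# Hypothetical adult data: height (cm) and weight (kg)
height = np.array([150, 158, 163, 170, 176, 182, 188], dtype=float)
weight = np.array([52, 58, 61, 68, 73, 80, 85], dtype=float)

fit = stats.linregress(height, weight)

def predict(h):
    return fit.intercept + fit.slope * h

print(f"weight = {fit.intercept:.1f} + {fit.slope:.2f} * height")
print(f"predicted weight at 175 cm: {predict(175):.1f} kg")   # within the fitted range, sensible
print(f"predicted weight at 0 cm:   {predict(0):.1f} kg")     # just the intercept 'a', meaningless
print(f"predicted weight at 100 cm: {predict(100):.1f} kg")   # outside the fitted range, unreliable
```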

Correlation is Not Causation

One of the common errors in interpreting the correlation coefficient is failure to consider that there may be a third variable related to both of the variables being investigated, which is responsible for the apparent correlation. Therefore, it is wrong to infer that there is a cause and effect relationship between two variables that are correlated, even if the correlation is strong. In other words, correlation does not imply causation.

As a now widely cited example, numerous epidemiological studies showed that women taking combined hormone replacement therapy (HRT) also had a lower-than-average incidence of coronary heart disease (CHD). The correlation was strong, leading researchers to propose that HRT was protective against CHD. However, subsequent randomized controlled trials showed that HRT causes a small but statistically significant increase in the risk of CHD. Reanalysis of the data from the epidemiological studies showed that women undertaking HRT were more likely to be from higher socioeconomic groups, with better-than-average diet and exercise regimens. The use of HRT and the decreased incidence of CHD were coincident effects of a common cause (i.e., the benefits associated with a higher socioeconomic status), rather than a direct cause and effect, as had been supposed.

Correlation may simply be due to chance. For example, one could compute r between the abdominal circumference of subjects and their shoe sizes, intelligence, or income. Irrespective of the value of r, these associations would make no sense.

For any two correlated variables, A and B, the following relationships are possible:
• A causes B or vice versa (direct causation)
• A causes C, which causes B, or the other way round (indirect causation)
• A causes B and B causes A (bidirectional or cyclic causation)
• A and B are consequences of a common cause but do not cause each other
• There is no causal connection between A and B; the correlation is just coincidence.

Thus, causality ascertainment requires consideration of several other factors, including temporal relationship, dose-effect relationship, effect of dechallenge and rechallenge, and biological plausibility. Of course, a strong correlation may be an initial pointer that a cause-effect relationship exists, but per se it is not sufficient to infer causality. There must also be no reasonable alternative explanation that challenges causality. Establishing causality is one of the most daunting challenges in both public health and drug research. Carefully controlled studies are needed to address this question.


Assessing Agreement

In the past, correlation has been used to assess the degree of agreement between sets of paired measurements. However, correlation quantifies the relationship between numerical variables and has limitations if used for assessing comparability between methods. Two sets of measurements would be perfectly correlated if the scatter diagram shows that they all lie on a single straight line, but they are not likely to be in perfect agreement unless this line passes through the origin. It is very likely that two tests designed to measure the same variable would return figures that are strongly correlated, but that does not automatically mean that the repeat measurements are also in strong agreement. Data which seem to be in poor agreement can produce quite high correlations. In addition, a change in the scale of measurement does not affect the correlation, but it can affect the agreement.

Bland–Altman Plot

Bland and Altman devised a simple but informative graphical method of comparing repeat measurements. When repeat measurements have been taken on a series of subjects or samples, the difference between pairs of measurements (Y-axis) is plotted against the arithmetic mean of the corresponding measurements (X-axis). The resulting scatter diagram is the Bland–Altman plot (after John Martin Bland and Douglas G. Altman, who first proposed it in 1983 and then popularized it through a Lancet paper in 1986); an example is given in Figure 6. The repeat measurements could represent the results of two different assay methods or scores from the same subjects by two different raters.

Computer software that draws a Bland–Altman plot can usually add a 'bias' line parallel to the X-axis. This represents the difference between the means of the two sets of measurements. Lines denoting the 95% limits of agreement (mean difference ± 1.96 SD of the differences) can be added on either side of the bias line. Alternatively, lines denoting the 95% confidence limits of the mean of the differences can be drawn surrounding the bias line.

Bland–Altman plots are generally interpreted informally. Three things may be looked at:
• How big is the average discrepancy between the methods, which is indicated by the position of the bias line. This discrepancy may be too large to accept clinically. However, if the differences within mean ± 1.96 SD are not clinically important, the two methods may be used interchangeably
• Whether the scatter around the bias line is too much, with a number of points falling outside the 95% agreement limit lines
• Whether the difference between the methods tends to get larger or smaller as the values increase. If it does, as is indicated in Figure 7, it indicates the existence of a proportional bias, which means that the methods do not agree equally through the range of measurements.

The Bland–Altman plot may also be used to assess the repeatability of a method by comparing repeated measurements on a series of subjects or samples by that single method. A coefficient of repeatability can be calculated as 1.96 times the SD of the differences between the paired measurements. Since the same method is used for the repeated measurements, it is expected that the mean difference should be zero. This can be checked from the plot.

Figure 6: Example of a Bland–Altman plot used to compare two test methods. The bias line with the limits of agreement is provided
Figure 7: Example of a Bland–Altman plot showing proportional bias. In this case, the difference between the methods first tends to narrow down and then increase as the value of measurements increases
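The quantities behind a Bland–Altman plot are easy to compute; the sketch below (hypothetical paired measurements, not from the article) obtains the bias, the 95% limits of agreement, and the coefficient of repeatability mentioned above.

```python
import numpy as np

# Hypothetical paired measurements of the same quantity by two methods
method_a = np.array([10.2, 11.0, 12.5, 9.8, 14.1, 13.0, 10.9, 12.2])
method_b = np.array([10.6, 10.7, 12.9, 10.3, 14.6, 12.8, 11.5, 12.0])

diff = method_a - method_b                 # plotted on the Y-axis
mean_pair = (method_a + method_b) / 2.0    # plotted on the X-axis

bias = diff.mean()                         # position of the bias line
sd_diff = diff.std(ddof=1)
loa_lower = bias - 1.96 * sd_diff          # 95% limits of agreement
loa_upper = bias + 1.96 * sd_diff

# For repeated measurements by a single method, 1.96 * SD of the differences
# is the coefficient of repeatability
repeatability = 1.96 * sd_diff

print(f"bias = {bias:.3f}")
print(f"95% limits of agreement: {loa_lower:.3f} to {loa_upper:.3f}")
print(f"coefficient of repeatability (same method repeated) = {repeatability:.3f}")
```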
Intraclass Correlation Coefficient

Although originally introduced in genetics to judge sibling correlations, the intraclass correlation coefficient (ICC) statistic is now most often used to assess the consistency, or conformity, of measurements made by multiple observers measuring the same parameter, or by two or more raters scoring the same set of subjects.

The methods of ICC calculation have evolved over time. The earliest work on intraclass correlations focused on paired measurements, and the first ICC statistics to be proposed were modifications of Pearson's correlation coefficient (which can be regarded as an interclass correlation).


Beginning with Ronald Fisher, the intraclass correlation has been regarded within the framework of analysis of variance, and its calculation is now based on the true (between-subject) variance and the variance of the measurement error (from repeat measurement).

The ICC takes a value between 0 and 1. Complete inter-rater agreement is indicated by a value of 1, but this is seldom achieved. Arbitrarily, the agreement boundaries proposed are <0.40: poor; 0.40–0.60: fair; 0.60–0.74: good; and >0.75: strong. Software may report two coefficients with their respective 95% CIs. The ICC for single measures is an index of the reliability of the ratings of a single, typical rater. The ICC for average measures is an index of the reliability of the different raters averaged together; this ICC is always slightly higher than the single measures ICC. Software may also offer different models for ICC calculation. One model assumes that all subjects were rated by the same raters; a different model may be used when this precondition is not true. The model may test for consistency, when systematic differences between raters are irrelevant, or absolute agreement, when systematic differences are relevant.
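The article does not commit to a particular ICC model; as one hedged illustration, the sketch below computes the one-way random-effects ICC (single-measure and average-measure forms) from the ANOVA mean squares for a hypothetical set of ratings.

```python
import numpy as np

# Hypothetical ratings: rows = 6 subjects, columns = 3 raters scoring the same subjects
ratings = np.array([
    [7.0, 8.0, 7.5],
    [5.0, 5.5, 6.0],
    [9.0, 8.5, 9.0],
    [4.0, 4.5, 4.0],
    [6.5, 7.0, 6.0],
    [8.0, 8.5, 8.5],
])

n, k = ratings.shape                       # subjects, raters
subject_means = ratings.mean(axis=1)
grand_mean = ratings.mean()

# One-way ANOVA mean squares: between-subject and within-subject (error)
ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
ms_within = np.sum((ratings - subject_means[:, None]) ** 2) / (n * (k - 1))

# One-way random-effects, single-measure ICC, often written ICC(1,1)
icc_single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Average-measure version, ICC(1,k): reliability of the k raters' mean rating
icc_average = (ms_between - ms_within) / ms_between

print(f"ICC single measure  = {icc_single:.3f}")
print(f"ICC average measure = {icc_average:.3f}")   # always a little higher
```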

Box 1: Examples of correlation and agreement analysis from published literature


Harbrecht BG, Rosengart MR, Bukauskas K, Zenati MS, Marsh JW Jr, Geller DA. Assessment of transcutaneous bilirubinometry
in hospitalized adults. J Am Coll Surg 2008;206:1129‑36.
Harbrecht et al. validated transcutaneous techniques to measure serum bilirubin in adults at risk of or diagnosed with hepatic
dysfunction. Eighty consecutive hospitalized adult patients from the general surgery, trauma surgery, and liver resection/
transplantation services of a tertiary care university medical center underwent TcB measurement from the forehead, sternum,
forearm, and deltoid. TcB measurements were repeated each time; serum bilirubin measurements were performed. They
found that TcB measurement from the forehead correlated with serum bilirubin better (r=0.963) than measurements from
the forearm (r=0.792), deltoid (r=0.922), or sternum (r=0.928). However, a Bland–Altman plot demonstrated that forehead
measurements became less accurate as the magnitude of hyperbilirubinemia increased. Therefore, the authors concluded that
in adult patients forehead TcB correlates best with serum bilirubin levels but becomes less reliable at higher values and further
refinements in transcutaneous bilirubinometry technology are needed.
Wöpking S, Scherens A, Haussleiter IS, Richter H, Schüning J, Klauenberg S, Maier C. Significant difference between three observers
in the assessment of intraepidermal nerve fiber density in skin biopsy. BMC Neurol 2009;9:13. doi: 10.1186/1471‑2377‑9‑13.
The determination of IENFD in skin biopsy is a useful method for the evaluation of different types of peripheral neuropathies.
To validate the method, it is necessary to determine interobserver reliability which was the aim of the study. Three observers
determined the IENFD and estimated the staining quality of the basement membrane for 120 skin biopsies (stained with indirect
immunofluorescence technique) from 68 patients. The authors found an unexpected significant difference in IENFD between the
observers and hence considered the results not to be in line with the high intraclass correlation coefficient (0.73) reported before
as index of interobserver reliability. The Bland–Altman plot also showed a divergence growing with rising mean. The authors
opined that difference in IENFD between the observers and the resulting low interobserver reliability is likely caused by different
interpretations of the standard counting rules. The standardization of the method is thus important and at least two observers
should analyze the skin biopsies.
Longo Imedio I, Serra‑Guillén C. Adaptation and validation of the Spanish version of the Actinic Keratosis Quality of Life
questionnaire. Actas Dermosifiliogr 2016;107:474‑81.
While there are questionnaires for evaluating the effects of skin cancer on patient quality of life, there were no specific
questionnaires available in Spanish for evaluating the quality of life in patients with actinic keratosis. The aim of this study was
to translate and culturally adapt the AKQoL questionnaire into Spanish. The original questionnaire was translated into Spanish
following the guidelines for the cross‑cultural adaptation of self‑report measures. Several measures of general reliability and validity
were used, including Cronbach’s α for internal consistency and a Bland–Altman plot for test‑retest reliability. To test concurrent
validity, the authors used the Spearman’s correlation coefficient to measure the correlation between AKQoL and Skindex‑29
scores. The final version of the translated questionnaire was administered to 621 patients with actinic keratosis. The Cronbach’s α
reliability coefficient was 0.84. The correlation between the score on the Skindex‑29 and on the AKQoL was however modest at rho
of 0.344 (P<0.05). The authors considered the translated version as a satisfactory alternative to be used in Spanish‑speaking actinic
keratosis patients.
Apa H, Gözmen S, Bayram N, Çatkoğlu A, Devrim F, Karaarslan U, et al. Clinical accuracy of tympanic thermometer and noncontact
infrared skin thermometer in pediatric practice: An alternative for axillary digital thermometer. Pediatr Emerg Care 2013;29:992‑7.
The aim of this study was to compare body temperature measurements by infrared tympanic and forehead noncontact thermometers
with the conventional axillary digital thermometer. A total of 1639 temperature readings were performed for every method
on fifty hospitalized children. The average difference between the mean (SD) of axillary and tympanic temperatures was
−0.20°C (0.61°C) (95% CI, −1.41°C–1.00°C). The average difference between the mean (SD) of axillary and forehead temperatures
was −0.38 (0.55°C) (95% CI, −1.47°C–0.70°C). The Bland–Altman plot showed that most of the data points were tightly clustered
around the zero line of the difference between the two temperature readings. The authors concluded that the infrared tympanic
thermometer could be a good option for fever assessment in children. The noncontact infrared thermometer is also useful, but it has
greater bias.
TcB: Transcutaneous bilirubin, IENFD: Intraepidermal nerve fiber density, AKQoL: Actinic keratosis quality of life, CI: Confidence interval,
SD: Standard deviation


Some published examples of the use of correlation and agreement analysis are provided in Box 1.

Finally, note that assessing agreement between categorical variables requires different indices, such as Cohen's kappa. This will be discussed in a future module.

Financial support and sponsorship
Nil.

Conflicts of interest
There are no conflicts of interest.

Further Reading
1. Linear regression and correlation. In: Samuels MA, Witmer JA, Schaffner AA. Statistics for the Life Sciences. 4th ed. Boston: Pearson Education; 2012. p. 493-549.
2. Correlation. In: Kirk RE. Statistics: An Introduction. 5th ed. Belmont: Thomson Wadsworth; 2008. p. 123-57.
3. Regression. In: Kirk RE. Statistics: An Introduction. 5th ed. Belmont: Thomson Wadsworth; 2008. p. 159-81.
4. Correlational techniques. In: Glaser AN. High-Yield Biostatistics. Baltimore: Lippincott Williams and Wilkins; 2001. p. 50-7.
5. Bewick V, Cheek L, Ball J. Statistics review 7: Correlation and regression. Crit Care 2003;7:451-9.
6. Giavarina D. Understanding Bland Altman analysis. Biochem Med (Zagreb) 2015;25:141-51.
