
Cronbach’s Alpha: Simple Definition, Use and Interpretation

Cronbach’s alpha, α (or coefficient alpha), developed by Lee Cronbach in 1951, measures
reliability, or internal consistency. “Reliability” is how well a test measures what it should. For
example, a company might give a job satisfaction survey to their employees. High reliability
means it measures job satisfaction, while low reliability means it measures something else (or
possibly nothing at all).

Cronbach’s alpha tests whether multiple-question Likert scale surveys are reliable. These
questions measure latent variables — hidden or unobservable variables such as a person’s
conscientiousness, neuroticism or openness. These are very difficult to measure in real life.
Cronbach’s alpha will tell you whether the test you have designed accurately measures the
variable of interest.

Cronbach’s Alpha Formula


The formula for Cronbach’s alpha is:

α = N·c̄ / (v̄ + (N − 1)·c̄)
Where:

N = the number of items.
c̄ = the average covariance between item-pairs.
v̄ = the average variance.
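To see the formula in action, here is a minimal Python sketch (not part of the original article) that computes alpha directly from N, c̄ and v̄, using made-up Likert responses:

```python
import numpy as np

# Toy data: 4 respondents x 3 items on a 5-point Likert scale (hypothetical numbers).
X = np.array([
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 4, 3],
], dtype=float)

N = X.shape[1]                                      # number of items
cov = np.cov(X, rowvar=False)                       # item covariance matrix
v_bar = cov.diagonal().mean()                       # average item variance
c_bar = (cov.sum() - cov.trace()) / (N * (N - 1))   # average covariance over item pairs

alpha = (N * c_bar) / (v_bar + (N - 1) * c_bar)
print(round(alpha, 3))
```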

SPSS Steps
While it’s good to know the formula behind the concept, in reality you won’t often need to
work it out by hand. You’ll usually calculate alpha in SPSS or similar software. In SPSS, the steps are:

Step 1: Click “Analyze,” then click “Scale” and then click “Reliability Analysis.”
Step 2: Transfer your variables (q1 to q5) into “Items.” The model default should be set to
“Alpha.”
Step 3: Click “Statistics” in the dialog box.
Step 4: Select “Item,” “Scale,” and “Scale if item deleted” in the “Descriptives for” box. Choose
“Correlations” in the “Inter-Item” box.
Step 5: Click “Continue” and then click “OK”.
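If you don’t have SPSS, the same numbers can be reproduced in code. Below is a sketch in Python/pandas (the q1–q5 column names are placeholders matching the walkthrough above, and the data is simulated) that computes alpha and the “scale if item deleted” values from Step 4:

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Variance-ratio form: K/(K-1) * (1 - sum of item variances / variance of total)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Simulated survey: five items driven by one latent trait plus noise.
rng = np.random.default_rng(0)
trait = rng.normal(size=(30, 1))
df = pd.DataFrame(trait + rng.normal(scale=0.8, size=(30, 5)),
                  columns=["q1", "q2", "q3", "q4", "q5"])

print("alpha:", round(cronbach_alpha(df), 3))
# "Scale if item deleted": recompute alpha with each item dropped in turn.
for col in df.columns:
    print(f"alpha without {col}:", round(cronbach_alpha(df.drop(columns=col)), 3))
```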

Rule of Thumb for Results


A rule of thumb for interpreting alpha for dichotomous questions (i.e. questions with two
possible answers) or Likert scale questions is:

Cronbach’s alpha    Internal consistency
0.9 ≤ α             Excellent
0.8 ≤ α < 0.9       Good
0.7 ≤ α < 0.8       Acceptable
0.6 ≤ α < 0.7       Questionable
0.5 ≤ α < 0.6       Poor
α < 0.5             Unacceptable

In general, a score of more than 0.7 is usually okay. However, some authors suggest higher
values of 0.90 to 0.95.

Avoiding Issues with Cronbach’s Alpha


Use the rules of thumb listed above with caution. A high value for alpha may mean that the items
in the test are highly correlated. However, α is also sensitive to the number of items in a test: a
larger number of items can result in a larger α, and a smaller number of items in a smaller α. If
alpha is high, this may reflect redundant questions (i.e. several questions asking the same thing).
A low value for alpha may mean that there aren’t enough questions on the test, so adding more
relevant items can increase alpha. Poor interrelatedness between test questions can also cause
low values, as can measuring more than one latent variable.

Confusion often surrounds the causes of high and low alpha scores. This can result in tests being
incorrectly discarded or wrongly labeled as untrustworthy. Psychometrics professor Mohsen
Tavakol and medical education professor Reg Dennick suggest that improving your knowledge
of internal consistency and unidimensionality will lead to the correct use of Cronbach’s alpha [1].

Unidimensionality in Cronbach’s alpha assumes the questions measure only one latent variable
or dimension. If you measure more than one dimension (either knowingly or unknowingly), the
test result may be meaningless. You could break the test into parts, measuring a different latent
variable or dimension with each part. If you aren’t sure whether your test is unidimensional, run
factor analysis to identify the dimensions in your test.
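As a quick screen (not a full factor analysis), you can inspect the eigenvalues of the item correlation matrix. A sketch with simulated single-trait data, using the Kaiser “eigenvalues greater than 1” heuristic:

```python
import numpy as np

# Simulated responses driven by one hypothetical latent variable.
rng = np.random.default_rng(1)
trait = rng.normal(size=(200, 1))
items = trait + rng.normal(scale=0.8, size=(200, 5))

R = np.corrcoef(items, rowvar=False)                # item correlation matrix
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]  # largest first
# Kaiser criterion: count eigenvalues > 1 as a rough estimate of dimensions.
print("factors suggested:", int((eigenvalues > 1).sum()))  # expect 1 here
```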

References

1. Tavakol, M., & Dennick, R. (2011). Making sense of Cronbach’s alpha. International Journal of
Medical Education, 2, 53–55 (editorial).
Cronbach's alpha
From Wikipedia, the free encyclopedia

A tau-equivalent measurement model is a special case of a congeneric measurement model,
assuming all factor loadings to be the same, i.e. λ₁ = λ₂ = ⋯ = λ_K.
In statistics (classical test theory), Cronbach's α (alpha)[1] is the trivial name used for tau-
equivalent reliability (ρ_T)[2] as a (lower-bound) estimate of the reliability of a psychometric
test. Synonymous terms are: coefficient alpha, Guttman's λ₃, Hoyt method and KR-20.[2]

It has been proposed that α can be viewed as the expected correlation of two tests that
measure the same construct. By using this definition, it is implicitly assumed that the average
correlation of a set of items is an accurate estimate of the average correlation of all items that
pertain to a certain construct.[3]

Cronbach's α is a function of the number of items in a test, the average covariance between
item-pairs, and the variance of the total score.

It was first named alpha by Lee Cronbach in 1951, as he had intended to continue with further
coefficients. The measure can be viewed as an extension of the Kuder–Richardson Formula 20
(KR-20), which is an equivalent measure for dichotomous items. Alpha is not robust against
missing data. Several other Greek letters have been used by later researchers to designate other
measures used in a similar context.[4] Somewhat related is the average variance extracted (AVE).
This article discusses the use of α in psychology, but Cronbach's alpha statistic is widely used
in the social sciences, business, nursing, and other disciplines. The term item is used throughout
this article, but items could be anything (questions, raters, indicators), and for all of these one
might ask to what extent they "measure the same thing." Items that are manipulated are
commonly referred to as variables.

It has been argued that the term "Cronbach's α" be abandoned in favour of "tau-equivalent
reliability" (ρ_T), and that in many cases an alternative approach, congeneric reliability (ρ_C),
should be used to calculate reliability instead.[2]

Scale purification, i.e. "the process of eliminating items from multi-item scales" (Wieland et al.,
2017), can influence Cronbach's alpha. A framework presented by Wieland et al. (2017)
highlights that both statistical and judgmental criteria need to be taken into consideration when
making scale purification decisions.[5]

Definition

Suppose that we measure a quantity X which is a sum of K components (K items or testlets):
X = Y₁ + Y₂ + ⋯ + Y_K. Cronbach's α is defined as

α = (K / (K − 1)) · (1 − Σᵢ σ²(Yᵢ) / σ²(X))

where σ²(X) is the variance of the observed total test scores, and σ²(Yᵢ) the variance of
component i for the current sample of persons.[6]
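As a sanity check on this definition, here is a small numpy sketch (toy numbers) of the variance-ratio form; it yields the same value as the covariance-based sketch earlier on this page, since the two forms are algebraically equivalent:

```python
import numpy as np

X = np.array([[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 4, 3]], dtype=float)
K = X.shape[1]

total_var = X.sum(axis=1).var(ddof=1)   # variance of observed total scores
item_vars = X.var(axis=0, ddof=1)       # variance of each component
alpha = K / (K - 1) * (1 - item_vars.sum() / total_var)
print(round(alpha, 3))
```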

If the items are scored 0 and 1, a shortcut formula is[7]

α = (K / (K − 1)) · (1 − Σᵢ pᵢqᵢ / σ²(X))

where pᵢ is the proportion scoring 1 on item i, and qᵢ = 1 − pᵢ. This is the same as KR-20.
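A sketch of the 0/1 shortcut with hypothetical right/wrong answers (population variance is used for the total so that it stays consistent with p·q, which is the population variance of a 0/1 item):

```python
import numpy as np

# 6 examinees x 4 dichotomous items (made-up data).
X = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
])
K = X.shape[1]
p = X.mean(axis=0)                  # proportion scoring 1 on each item
q = 1 - p
total_var = X.sum(axis=1).var()     # ddof=0, consistent with p*q

kr20 = K / (K - 1) * (1 - (p * q).sum() / total_var)
print(round(kr20, 3))
```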

Alternatively, Cronbach's α can be defined as

α = K·c̄ / (v̄ + (K − 1)·c̄)

where K is as above, v̄ the average variance of each component (item), and c̄ the
average of all covariances between the components across the current sample of persons (that is,
without including the variances of each component).

The standardized Cronbach's alpha can be defined as

α_standardized = K·r̄ / (1 + (K − 1)·r̄)

where K is as above and r̄ the mean of the K(K − 1)/2 non-redundant correlation coefficients (i.e.,
the mean of an upper triangular, or lower triangular, correlation matrix).
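In code, the standardized form needs only the correlation matrix; a minimal sketch with the same toy data as above:

```python
import numpy as np

X = np.array([[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 4, 3]], dtype=float)
R = np.corrcoef(X, rowvar=False)         # item correlation matrix
K = R.shape[0]
r_bar = (R.sum() - K) / (K * (K - 1))    # mean off-diagonal correlation
alpha_std = (K * r_bar) / (1 + (K - 1) * r_bar)
print(round(alpha_std, 3))
```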

Cronbach's α is related conceptually to the Spearman–Brown prediction formula. Both arise
from the basic classical test theory result that the reliability of test scores can be expressed as the
ratio of the true-score and total-score (error plus true score) variances:

ρ_XX′ = σ²(T) / σ²(X)
The theoretical value of alpha varies from 0 to 1, since it is the ratio of two variances and the
variance in the denominator is always at least as large as the variance in the numerator. However,
depending on the estimation procedure used, estimates of alpha can take on any value less than
or equal to 1, including negative values, although only positive values make sense.[8] Higher
values of alpha are more desirable. Some professionals, as a rule of thumb, require a reliability of
0.70 or higher, with 0.60 as the lowest acceptable threshold (obtained from a substantial sample),
before they will use an instrument.[9]

Although Nunnally (1978) is often cited when it comes to this rule, he has actually never stated
that 0.7 is a reasonable threshold in advanced research projects.[10] Obviously, this rule
should be applied with caution when α has been computed from items that systematically
violate its assumptions, such as the use of ordinal items, as Cronbach's alpha is only a lower limit
for a large sample within a metric space. Furthermore, the appropriate degree of reliability
depends upon the use of the instrument. For example, an instrument designed to be used as part
of a battery of tests may be intentionally designed to be as short as possible, and therefore
somewhat less reliable. Other situations may require extremely precise measures with very high
reliabilities. In the extreme case of a two-item test, the Spearman–Brown prediction formula is
more appropriate than Cronbach's alpha.[11]
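For that two-item case, the Spearman–Brown prediction is easy to compute directly; a one-function sketch (the r value is just an example):

```python
def spearman_brown(r: float, n: float) -> float:
    """Predicted reliability when a test with reliability r is lengthened n-fold."""
    return n * r / (1 + (n - 1) * r)

# Two-item test: "step up" the inter-item correlation to test length 2.
print(round(spearman_brown(0.55, 2), 3))  # hypothetical r = 0.55 -> ~0.71
```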

This has resulted in a wide variance of test reliability. In the case of psychometric tests, most fall
within the range of 0.75 to 0.83 with at least one claiming a Cronbach's alpha above 0.90.[12]

Internal consistency
Main article: Internal consistency

Cronbach's alpha will generally increase as the intercorrelations among test items increase, and is
thus known as an internal consistency estimate of reliability of test scores. Because
intercorrelations among test items are maximized when all items measure the same construct,
Cronbach's alpha is widely believed to indirectly indicate the degree to which a set of items
measures a single unidimensional latent construct. It is easy to show, however, that tests with the
same test length and variance, but different underlying factorial structures can result in the same
values of Cronbach's alpha. Indeed, several investigators have shown that alpha can take on quite
high values even when the set of items measures several unrelated latent
constructs.[1][13][14][15][16][17] As a result, alpha is most appropriately used when the items measure
different substantive areas within a single construct. When the set of items measures more than
one construct, coefficient ω_hierarchical is more appropriate.[18][19][20]

Alpha treats any covariance among items as true-score variance, even if items covary for
spurious reasons. For example, alpha can be artificially inflated by making scales which consist
of superficial changes to the wording within a set of items or by analyzing speeded tests.

A commonly accepted[citation needed] rule for describing internal consistency using Cronbach's alpha
is as follows,[21][22][23] though a greater number of items in the test can artificially inflate the
value of alpha[13] and a sample with a narrow range can deflate it, so this rule should be used
with caution.

Cronbach's alpha    Internal consistency
0.9 ≤ α             Excellent
0.8 ≤ α < 0.9       Good
0.7 ≤ α < 0.8       Acceptable
0.6 ≤ α < 0.7       Questionable
0.5 ≤ α < 0.6       Poor
α < 0.5             Unacceptable
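The table maps directly to a small helper; a sketch (the cutoffs are the rule-of-thumb values above, not fixed standards):

```python
def describe_alpha(alpha: float) -> str:
    """Qualitative label for alpha, per the rule-of-thumb table."""
    bands = [(0.9, "Excellent"), (0.8, "Good"), (0.7, "Acceptable"),
             (0.6, "Questionable"), (0.5, "Poor")]
    for cutoff, label in bands:
        if alpha >= cutoff:
            return label
    return "Unacceptable"

print(describe_alpha(0.83))  # Good
```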

Generalizability theory
Cronbach and others generalized some basic assumptions of classical test theory in their
generalizability theory. If this theory is applied to test construction, then it is assumed that the
items that constitute the test are a random sample from a larger universe of items. The expected
score of a person in the universe is called the universe score, analogous to a true score. The
generalizability is defined analogously as the variance of the universe scores divided by the
variance of the observable scores, analogous to the concept of reliability in classical test theory.
In this theory, Cronbach's alpha is an unbiased estimate of the generalizability. For this to be true,
the assumptions of essential τ-equivalence or parallelism are not needed. Consequently,
Cronbach's alpha can be viewed as a measure of how well the sum score on the selected items
captures the expected score in the entire domain, even if that domain is heterogeneous.

Intraclass correlation
Cronbach's alpha is said to be equal to the stepped-up consistency version of the intraclass
correlation coefficient, which is commonly used in observational studies. But this is only
conditionally true. In terms of variance components, this condition is, for item sampling: if and
only if the value of the item (rater, in the case of rating) variance component equals zero. If this
variance component is negative, alpha will underestimate the stepped-up intra-class correlation
coefficient; if this variance component is positive, alpha will overestimate this stepped-up
intra-class correlation coefficient.

Factor analysis
Cronbach's alpha also has a theoretical relation with factor analysis. As shown by Zinbarg,
Revelle, Yovel and Li,[19] alpha may be expressed as a function of the parameters of the
hierarchical factor analysis model which allows for a general factor that is common to all of the
items of a measure in addition to group factors that are common to some but not all of the items
of a measure. Alpha may be seen to be quite complexly determined from this perspective. That
is, alpha is sensitive not only to general factor saturation in a scale but also to group factor
saturation and even to variance in the scale scores arising from variability in the factor loadings.
Coefficient ω_hierarchical[18][19] has a much more straightforward interpretation as the proportion of
observed variance in the scale scores that is due to the general factor common to all of the items
comprising the scale.

Alternatives

Cronbach's α assumes that all factor loadings are equal. In reality this is rarely the case, and
hence it systematically underestimates the reliability. An alternative to Cronbach's α that does
not rely on this assumption is congeneric reliability (ρ_C).[2]
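Given loadings from a fitted one-factor model, congeneric reliability is straightforward to compute. A sketch assuming standardized items (so each error variance is 1 minus the squared loading), with made-up loadings:

```python
import numpy as np

def congeneric_reliability(loadings, error_vars):
    """rho_C = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    loadings = np.asarray(loadings, dtype=float)
    error_vars = np.asarray(error_vars, dtype=float)
    common = loadings.sum() ** 2
    return common / (common + error_vars.sum())

lam = np.array([0.7, 0.6, 0.8, 0.5])     # hypothetical standardized loadings
theta = 1 - lam ** 2                     # error variances for standardized items
print(round(congeneric_reliability(lam, theta), 3))  # ~0.75
```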
