1, May 1989 81
© Oxford University Press 1989
ABSTRACT Typologies play an important role in sociological theory and research. Basic to the use of (ideal)
types is the notion that a subject's overt behavior can be conceived of as governed by his/her belonging or
closeness to a particular underlying pure type. Many statistical techniques are in use to detect or construct these
fundamental types, especially factor analysis. Much less attention has been paid to the possibilities that latent
class analysis has to offer. Through an elaborate example, it is shown that the basic ideas of latent class analysis
correspond eminently well with the use social scientists make of (ideal) types. Several important extensions of
the basic latent class model along with significant new developments are discussed.
(joint) manifest variable ABCD, $\pi^{\bar{A}X}_{it}$ is the conditional probability of obtaining score i on A, given one belongs to latent class t of X, and the other symbols have obvious similar meanings):

$$\pi^{ABCD}_{ijkl} = \sum_{t=1}^{T} \pi^{ABCDX}_{ijklt} \qquad (1)$$

This equation represents the assumption that the population can be divided into T exhaustive and mutually exclusive categories of the latent variable X or, one might say, the assumption that X exists. The essential assumption of local independence is represented in equation (2):

$$\pi^{ABCDX}_{ijklt} = \pi^{X}_{t}\, \pi^{\bar{A}X}_{it}\, \pi^{\bar{B}X}_{jt}\, \pi^{\bar{C}X}_{kt}\, \pi^{\bar{D}X}_{lt} \qquad (2)$$

Among others, Goodman (1974a,b) has shown how to obtain (maximum likelihood) estimates of the probabilities in eqs (1), (2). By comparing the estimates $\hat{\pi}^{ABCD}_{ijkl}$ with the observed proportions $p^{ABCD}_{ijkl}$ or rather with the observed frequencies $f^{ABCD}_{ijkl}$ through standard chi-square testing procedures, one can test whether the model is empirically valid, that is, whether the postulated typology corresponds with reality.

In the next section matters will be clarified through an elaborate example of the standard latent class model.

THE STANDARD LATENT CLASS MODEL: AN EXAMPLE

European Value Survey: Variables and Data

The potentialities of latent class analysis will be illustrated by data from the European Value Survey, a recent survey into the basic value domains of almost all Western European countries, initiated by the so-called European Value Systems Study Group. (For more information, see among others: Stoetzel, 1983; Harding, Phillips and Fogarty, 1986; Halman, Heunks, De Moor and Zanders, 1987.)

From all value domains investigated in this survey, we have chosen 'politics' and 'religion' for illustrative purposes, restricting ourselves to the Dutch survey carried out in 1981 among 1,221 persons randomly drawn from the population of 18 years and older.1

The items selected to represent the political domain are both strictly political items, like party attachment, and items measuring the political stand a respondent takes with regard to important social and economic issues. The final choices were guided by the theoretical work of Zijderveld (1981, 1982) within the boundaries of what was available from the European Value Survey. The following nine items then have been chosen (more detailed information about the construction of these indicators may be obtained from the authors):

1. Party Closeness: Respondent (R) feels close to a particular party; does not feel close;
2. Left/right Self-placement: left; middle; right;
3. Political Interest: interested; not interested;
4. Trust in Parliament: R trusts Parliament; does not trust;
5. Political Protest Behavior: R participated at least by signing a petition; has not been involved in any kind of political protest;
6. Desirability of Societal Change: change is desirable; not desirable;
7. Importance of Equality vs. Freedom: equality among people more important than individual freedom; freedom more important;
8. Basis of Income Differences: income differences mainly based on performance differences between individuals; not mainly based on performance differences;
9. Running Business: owners to make all the decisions; owners and employees jointly to decide; employees decide; don't know who's to decide.

Conspicuously absent from this list is the variable Party Preference or Voting Behavior. This is due to the fact that the information regarding these questions was not available for the Dutch data.

As for the religious domain, items were selected from several important religious subdomains; viz. religious membership, religious activities, importance of religion for personal life and traditional religious beliefs. This led to the following ten indicators:

1. Religious Denomination: R belongs to a religious denomination; does not belong;
86 SEARCHING FOR IDEAL TYPES
2. Religious Organization/Church Membership: member; not a member;
3. Church Attendance: at least on Feast Days (Christmas etc.); never;
4. Religiosity: R considers him/herself a religious person; does not;
5. Religious Comfort: religion gives comfort and strength; does not;
6. Importance of God: God is important in R's life; not important;
7. Prayer: R takes moments of prayer; does not take;
8. Personal God: R believes in a personal God; does not believe;
9. Decalogue: at least eight of the Ten Commandments apply today; less than eight apply;
10. Traditionalism: with regard to the traditional Articles of Faith (belief in God, Life after Death, a Soul, The Devil, Hell, Heaven, Sin): traditional believer (R believes at least five items); intermediate (two to four items); nontraditional (at most one item).

The nine indicators of the political domain taken together make up a huge nine dimensional table with 1,536 cells (too large, of course, to show in this article). As there are only 1,221 respondents and because these indicators are correlated with each other, this table contains a large number of empty or nearly empty cells. The same applies of course to the ten dimensional table for the religious items where there are also 1,536 cells. This fact influences the stability of the estimates of the parameters of the latent class model and especially the possibility of testing the model using the traditional chi-square test statistics. We will return to this problem at the end of this article.

A Typology for the Political Domain

In applying the standard latent class model to the multidimensional table formed by the nine political indicators, one first has to decide upon the number of categories of the latent variable, that is, the number of latent classes. This decision can be made on theoretical grounds if one has a clear notion about the number of types. Otherwise, in a more exploratory fashion, one tries several models with a different number of latent classes, and chooses the model that fits the observed data and lends itself best to a theoretically meaningful interpretation.

We have tried a two, a three, and a four latent class model using, as everywhere in this article, the program LCAG (Hagenaars, 1988a; Hagenaars and Luijkx, 1987).2 The two latent class model yielded an interpretable solution in terms of a division of people into 'left' and 'right' ($L^2$, the likelihood-ratio chi-square, = 1,805.5, df = 1,510). The three latent class model fitted better ($L^2$ = 1,352.1, df = 1,497) and could be interpreted in terms of 'left', 'right' and 'middle'.

The four latent class model offered a still better and also more interesting possibility to set up a meaningful typology for the political domain. The results are presented in Table 1.

The last row of Table 1 presents the probabilities $\pi^{X}_{t}$ which indicate that the respondents are roughly evenly divided among the four latent classes. The latent classes 1 and 4 each contain about 30 per cent of all cases, and the latent classes 2 and 3 each about 20 per cent.

The remaining cell entries of Table 1 refer to the conditional probability ($\pi^{\bar{A}X}_{it}$) that a respondent obtains a particular score (i) on a manifest variable (A), given the particular latent class (t) he or she belongs to. For instance, the cells at the upper left corner show that a respondent who belongs to latent class 1 expresses with a probability of 0.84 the feeling that there is a party he or she feels close to; the (conditional) probability that people from latent class 1 will not feel close to any party then of course equals 0.16.

The conditional probability of feeling close to a party is the highest for the latent classes 1 and 2; it is small for latent class 4 and almost zero for latent class 3. Members of latent classes 1 and 2 are also more interested in politics than the members of the classes 3 and 4, as can be seen from the conditional probabilities with regard to the third item. Especially in latent class 2, one has a high probability of expressing political interest.

From the probabilities related to the second item it can be inferred that the first latent class may be characterized as the political right, the
EUROPEAN SOCIOLOGICAL REVIEW 87
[Table 1: body not recoverable from the source. Columns: Items; latent class 1 (Conservatives); 2 (Progressives); 3 (Individualists); 4 (Non-involved).]
$L^2$ = 1,227.3, df = 1,484.
Pearson chi-square = 1,602.7.
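The table size and the degrees of freedom quoted for the two, three and four latent class models of the political items can be checked by counting cells and free parameters. The sketch below assumes the usual count for an unrestricted latent class model (number of cells minus one, minus T - 1 class proportions, minus the free conditional probabilities in each class). The function names are illustrative and are not part of LCAG.

```python
from math import prod

def n_cells(categories):
    """Number of cells in the cross-classification of the manifest variables."""
    return prod(categories)

def lc_degrees_of_freedom(categories, n_classes):
    """df of an unrestricted latent class model: cells - 1 - free parameters."""
    class_params = n_classes - 1                                # latent class proportions
    item_params = n_classes * sum(c - 1 for c in categories)    # conditional probabilities
    return n_cells(categories) - 1 - class_params - item_params

# Category counts of the nine political items as listed in the text:
# item 2 has three categories, item 9 has four, all others are dichotomous.
political = [2, 3, 2, 2, 2, 2, 2, 2, 4]

print(n_cells(political))                              # 1536
for t in (2, 3, 4):
    print(t, lc_degrees_of_freedom(political, t))      # 1510, 1497, 1484
```

The same count applied to the ten religious items (nine dichotomies and one trichotomy) again gives 1,536 cells.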
second class as the political left, and the remaining classes 3 and 4 as the political middle. Parliament is not highly trusted, except by the political right (latent class 1); the political left (latent class 2) has a high probability of having been engaged in political protests, much higher than all other groups.

Interesting are the outcomes for the item 'societal change'. As might have been expected, the left people (from latent class 2) are the strongest proponents of social change, but the second place is occupied by latent class 3, one of the middle groups. Among the right group (latent class 1) and especially among the other middle group (latent class 4) the probability of advocating social change is much smaller.

With regard to the equality item, the latent classes 1 and 3 stress the importance of individual freedom, while the latent classes 2 and 4 are in favor of equality. More or less the same pattern is obtained for the income difference item: latent classes 1 and 3 see income differences as based mainly on differences in individual performances as opposed to latent class 2 and especially 4.

That business should be run by the owners alone is an opinion especially found among the right (latent class 1) and to a lesser extent among members of latent class 3; the left, latent class 2, relatively favors businesses as being run by the employees alone; latent class 4 relatively often replies 'don't know'.

What then is the overall pattern that arises from all these separate findings? In the first place it is clear that those who belong to the latent classes 1 or 2 are part of the traditional political system albeit in different ways. The first latent class consists of people who belong to the political right wing, who have a consistently conservative political outlook. The second latent
[Table 2: body not recoverable from the source. Columns: Items; latent class 1 (Religious); 2 (Middle); 3 (Non-religious); Factor-loading.]
$L^2$ = 1,046.1, df = 1,500.
Pearson chi-square = 2,085.6.
Note: (a) Except for the last item, all probabilities refer to the first, 'religious' category of each item; the last column refers to the outcomes of a principal components analysis, one factor solution.
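As an illustration of how a table of class-specific response probabilities is used, the following sketch multiplies class proportions and conditional probabilities under local independence (eq. (2)), sums over the classes (eq. (1)), and derives the probability of belonging to each class given a response pattern. All numbers are hypothetical placeholders for three dichotomous items and three classes; they are not the estimates of Table 2, and the function names are illustrative.

```python
# Hypothetical values, NOT the estimates of Table 2.
class_sizes = [0.40, 0.25, 0.35]      # latent class proportions, t = 1, 2, 3
# p_religious[t][item]: probability of the 'religious' answer per class and item
p_religious = [
    [0.95, 0.90, 0.92],   # class 1: religious
    [0.55, 0.50, 0.45],   # class 2: middle
    [0.05, 0.08, 0.03],   # class 3: non-religious
]

def joint_with_class(pattern, t):
    """Class proportion times the product of item probabilities (local independence)."""
    prob = class_sizes[t]
    for item, answer in enumerate(pattern):
        p = p_religious[t][item]
        prob *= p if answer == 1 else 1.0 - p   # 1 = religious, 2 = non-religious answer
    return prob

def pattern_probability(pattern):
    """Probability of a response pattern: sum of the joint probabilities over classes."""
    return sum(joint_with_class(pattern, t) for t in range(len(class_sizes)))

def posterior(pattern):
    """Probability of each latent class given the observed response pattern."""
    total = pattern_probability(pattern)
    return [joint_with_class(pattern, t) / total for t in range(len(class_sizes))]

post = posterior((1, 1, 2))
modal_class = post.index(max(post)) + 1   # class with the largest conditional probability
print([round(p, 3) for p in post], modal_class)
```

With these made-up numbers a respondent giving two religious answers and one non-religious answer is most likely a member of the middle class, which shows how mixed patterns are absorbed by an intermediate class.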
class may be characterized as left wing, as consisting of Progressives.

The other two groups are much less involved in politics (cf. items 1 to 5). The fact that they place themselves in the middle of the left/right scale may be more indicative of their feeling of not belonging anywhere on this scale than of their feeling that the middle forms their political home.

The members of latent class 3 share with the Conservatives (latent class 1) an individualistic orientation (see items 7 and 8), but the former are not as opposed to societal change and lay less emphasis on the owners for running business than the latter. They might perhaps best be labeled as Individualists.

The final latent class 4 consists of traditionalists with an egalitarian and perhaps even indifferent political orientation to the world. They do not protest, favor the least societal change, prefer equality above individual freedom and are the most opposed to basing income differences on individual differences in performance; also they have a relatively large share of the 'don't know' answers on item 9. Maybe the label politically Non-involved fits them best.

The fruitfulness of this typology ought to be verified by relating these types, both theoretically and empirically, to social background variables and to other typologies, for example, Zijderveld's typology mentioned above (that is, by relating the Non-involved to Zijderveld's Amoralists, the Individualists to his Immoralists, and the Conservatives and Progressives to his Moralists). In the next section, we will show how to extend the basic latent class model in order to investigate these kinds of relations empirically. But, first, the results of the latent class analysis of the religious items will be presented.

A Typology for the Religious Domain

The outcomes of the latent class analysis for the religious items are very clear and simple. A two-latent class solution divides the respondents into two groups which are of the same size: religious and nonreligious people ($L^2$ = 1,716.9, df = 1,512). The three latent class model presented in Table 2 clearly shows that there is in
fact an underlying (and continuous?) dimension 'religiosity'.

The probabilities that members of latent class 1 give religious answers are very high, almost 'perfect'; the corresponding probabilities for the third latent class are very low, almost zero; latent class 2 occupies an intermediate position on all items.

Contrary to the outcomes for the political items, where we did have nominal level items and found nonlinear relations between the latent and the observed variables, we now have ordinal level (dichotomous) items and obtain more or less linear relationships between the latent and manifest variables. In such circumstances factor analysis should give about the same results as latent class analysis. The application of an exploratory principal components analysis to the ten religious items pointed clearly toward a one factor solution. (The first unrotated factor explained 52.9 per cent of the variance of all items; the second (best) factor only 8.3 per cent!) The factor loadings are given in the last column of Table 2.

With these results the advantages of latent class analysis over factor analysis are nil. However, the outcomes might have been different. In factor analysis one assumes a priori continuous underlying variables and linear relationships among all variables. In latent class analysis no such a priori assumptions are made. From the religious items a different typology might have arisen such as: latent class 1: orthodoxly religious; latent class 2: nonorthodoxly religious; latent class 3: nonreligious. In order to find such a nominal level typology, involving nonlinear relations between the variables, one has to use latent class analysis.

ELABORATING TYPOLOGIES; EXTENSIONS OF THE BASIC LATENT CLASS MODEL

A latent class analysis as carried out in the previous section, is just the first step for constructing a meaningful typology. As we have said above: to be fruitful, a typology, for example, the one that has been found for the political items, ought to be related theoretically and empirically to typologies in other value domains and to the positions people occupy within social systems. In this section we will show how to extend the basic latent class model to include these kinds of relationships.

Estimating Individual Latent Scores

Usually further analyses are carried out by assigning the respondents to the latent classes on the basis of their scores on the indicators, that is, on the manifest variables (Lazarsfeld and Henry, 1968; Goodman, 1974a,b; Clogg, 1981; Schwartz, 1986). This of course parallels what is customary in common factor analysis, where most investigators will compute factor scores and use these 'observed' scores for further analyses.

Using the estimates of $\pi^{ABCDX}_{ijklt}$ (see eq. (1)) one can obtain for each respondent an estimate of the conditional probability that the respondent belongs to a particular latent class (belongs to category t of X), given the scores (i,j,k,l) on the manifest variables (A,B,C,D), using the formula:

$$\pi^{ABCD\bar{X}}_{ijklt} = \frac{\pi^{ABCDX}_{ijklt}}{\pi^{ABCD}_{ijkl}} \qquad (3)$$

The respondent will then be assigned to that latent class for which this conditional probability is largest. Once having obtained 'observed' scores X' for the latent variable X, the analyses proceed as usual. X' may be correlated with observed variables like education, occupation, age etc. and also with other 'observed' latent variables, say Y' and Z', resulting from latent class analyses of other value domains.

But, as in common factor analysis, one encounters two kinds of problems using this kind of 'observed' score (except for some extreme cases in which the latent variable is perfectly related to a manifest variable and where one does not really need a latent variable). (See among others, Steiger, 1979; Hagenaars, 1985.)

People are assigned to that latent class to which they most probably belong on the basis of their scores on the manifest variables. But as long as this modal probability is less than 1, there is a possibility of misclassification, that is, a possibility that a person will be assigned to the wrong latent class. The misclassification probabilities can be computed using equations like
eq. (3) (Clogg, 1981). For the three latent class model for the religious items (Table 2) the average probability of misclassification equals only 0.05; but for the four latent class model for the political items (Table 1) this probability is 0.21. In other words, one can expect that 5 per cent and 21 per cent respectively of all respondents will be assigned to the wrong latent class. Because of these misclassifications the 'observed' relation between X' and an 'external' variable, like age, may systematically differ from the true relation between the latent variable X and this external variable.

To the extent that misclassifications may occur, a second and even more fundamental problem arises, viz. that within the latent class model the individual scores on the latent variable X are not exactly identified. It is hypothetically possible to construct different sets of individual scores X (not X'), each set perfectly in correspondence with the estimated parameters of the latent class model, but not identical to each other and sometimes even negatively correlated (Hagenaars, 1985)! So, it becomes unclear what the meaning is of the 'observed' scores X' if the underlying individual scores on the true latent variable X, which the 'observed' scores are supposed to represent, are not uniquely identified themselves.

However, it is not necessary to work with the 'observed' scores X'. The relations between several latent variables and between a latent variable and observed, external variables can be unbiasedly estimated without one having to resort to individual scores.

Latent Class Models with More than One Latent Variable

Right from the beginning attention has been paid to latent class models with more than one latent variable (Lazarsfeld, 1950a,b; Wiggins, 1955). An example of a two latent variable model is given in Figure 2.

FIGURE 2 Two latent variable latent class model

In Figure 2 Y and Z are two latent variables; A to D are manifest variables. A and B are the indicators of Y; C and D are the indicators of Z. The scores on A and B are only directly influenced by the latent score on Y; the scores on C and D only by Z. Although A and B may be associated with Z, this is only because Y influences A and B and because Y and Z are associated with each other and not because of any direct effect from Z on A and B. The same applies of course to the relations between C, D and Y.

Goodman has shown how this two latent variable model can be formulated as a variant of the basic model in Figure 1 (eqs. (1), (2)). One may consider the joint latent variable YZ (with R x S categories) as the 'one' latent variable X with T = R x S categories. The relation between the two latent variables may be derived from the probability distribution $\pi^{X}_{t} = \pi^{YZ}_{rs}$ of the latent variable X = YZ.

In order to ensure that the scores on, for example, A depend only on Y and not on Z, as the model in Figure 2 implies, one has to put certain equality constraints on the parameters, on the conditional probabilities (eq. (2)). For instance: the conditional probability that a person obtains the score i on A may only depend on his 'score' on Y and not on his 'score' on Z. So, in case all variables are dichotomous one imposes the following kind of restrictions (here only shown with regard to the manifest variables A and C):

$$\pi^{\bar{A}YZ}_{1\,11} = \pi^{\bar{A}YZ}_{1\,12}, \qquad \pi^{\bar{A}YZ}_{1\,21} = \pi^{\bar{A}YZ}_{1\,22}$$
$$\pi^{\bar{C}YZ}_{1\,11} = \pi^{\bar{C}YZ}_{1\,21}, \qquad \pi^{\bar{C}YZ}_{1\,12} = \pi^{\bar{C}YZ}_{1\,22} \qquad (4)$$

In this way, the basic latent class model actually encompasses a very large variety of latent class models with several latent variables whose parameters can be routinely estimated using
Goodman's (1974a,b) algorithm as implemented in, for example, LCAG.

In order to relate the typologies for the religious and the political domain to each other, one has to set up a two latent variable model. The latent religiosity variable Y has three categories (as above) and influences directly the ten religious indicators indicated in Table 2; the latent political variable Z has four categories and directly determines the scores on the nine political indicators, presented in Table 1.

Although in principle the parameters of this model might have been estimated, the computing time would have been enormous. The 9 + 10 indicators together make up an observed frequency table of 1,536 x 1,536 = 2,359,296 cells; certainly in combination with the 3 x 4 = 12 latent classes this made the problem too large to handle adequately.

However, some simplifications could be introduced without seriously affecting the results. The latent class analysis of the religious items led to the conclusion that the underlying latent variable was a continuous variable 'religiosity', which had very strong, more or less linear relations with all manifest indicators.

So, it was decided to use only two latent classes for the religiosity latent variable (dichotomize the underlying religiosity continuum) and to select only three indicators for this domain, viz. the items 5, 6, and 8 in Table 2. These items are very strongly related to the latent variable and may represent the underlying dimension as well as the ten items. For the political domain all items were used.

Although we still had a very large observed frequency table (12,288 cells and only 1,221 respondents) and eight latent classes, this problem was manageable. The test statistics were: $L^2$ = 3,522.6, df = 12,226 (and Pearson chi-square: 12,930.1!). Before looking at the relations between the two latent variables, it was first checked whether the solution for this total model could be compared with the two separate solutions obtained above. Because in maximum likelihood estimation methods all parameters are estimated simultaneously and because the religious domain was only represented by three items, the results obtained here might have been different from the results in the previous section.

However, this was not the case. All parameters referring to the political domain were very much the same as the ones obtained before and also the outcomes for the religious domain reflected the structure found before. We therefore do not present these parameter estimates, except for the 'new' parameters $\pi^{YZ}_{rs}$. These parameters give us the distribution of the joint latent variable YZ, that is, the relationship between the two typologies. They are presented in Table 3.

TABLE 3 Relation between the political and the religious typology

                 Religious   Nonreligious   Total
Conservatives    0.21        0.10           0.31
Progressives     0.04        0.15           0.19
Individualists   0.06        0.16           0.22
Non-involved     0.10        0.18           0.28
Total            0.42        0.58           1.00

After percentaging Table 3 horizontally, one learns that 69 per cent of the Conservatives belong to the religious latent class; this percentage is much lower for all other political groups, the lowest figure being 19 per cent for the Progressives; the percentage 'member of the religious latent class' for the Individualists is 29 per cent and for the Non-involved 36 per cent.

The vertical percentages of course confirm this picture. Half (51 per cent) of religious people belong to the Conservative latent class, while among the nonreligious, the proportion Conservative only amounts to 17 per cent. For all other groups it applies that religious people have a lower probability of being a member than the nonreligious. The corresponding percentages are: with regard to the Progressives: 9 vs. 26 per cent; with regard to the Individualists: 16 vs. 27 per cent; with regard to the Non-involved: 24 vs. 30 per cent.

So, at least for the Netherlands, Conservatism and religiosity go hand in hand, while progressiveness and religiosity tend to exclude each other, as do, to a lesser extent, political apathy and individualism.

Relating Latent Variables to External, Manifest Variables

Further insight into the meaning of a typology may be obtained by relating this typology to a
number of background variables. Such a model has been depicted in Figure 3.

FIGURE 3 Latent class model with external variable

All variables in Figure 3 are manifest, except for X. Essential to this model is that the relations between the external variable Education (E) and the indicators A to D are not direct, but are completely mediated through the latent variable X. As Goodman (1974a) shows, all parameters of such a model, including the parameters concerning the relation between X and E, may be obtained by considering this model as a basic model (Figure 1) with one latent variable and five 'indicators' A to E.

It was decided to exemplify this approach by relating the political typology to the following three manifest variables:

Age: <31 years; 31-50 years; >50 years.
Religious Denomination: R does not belong to a religious denomination; R belongs.
Education: age at completion of formal education: <16 years; 16-18 years; >18 years.

The easiest way to estimate the relations between the latent variable X and the three external variables is to consider Age, Religious Denomination and Education as one joint variable ARE with 3 x 2 x 3 = 18 categories and perform a standard latent class analysis with one latent and ten manifest variables: the nine political indicators plus ARE. A disadvantage of this approach is that complicated higher order interactions between X and the external variables are automatically included. This may seriously complicate the interpretation of the results. We will return to this in the next section.

Our starting point now is an observed frequency table with 27,648 cells; the number of latent classes of course equals four. The test statistics are: $L^2$ = 5,342.4, df = 27,528 (Pearson chi-square = 29,008.0!). Again, the parameter estimates referring to the relationship between X and the nine political indicators are the same as obtained earlier. The estimates concerning the relationship between X, the political typology, and ARE are presented in Table 4 (albeit in the form of percentages instead of proportions).

Table 4 is a rather complicated table. Nevertheless, a few things immediately catch the eye. For example, as much as 25 per cent of the Non-involved are older, religious people with a low level of education while as much as 25 per cent of the Progressives consist of young, nonreligious people with a high educational level. But to get a better insight into this table one has to perform more extended analyses, starting with the bivariate tables, for example, education by political typology, and introducing step by step the other independent variables.3

These analyses made clear that religion was directly and rather strongly related to the typology in the sense that the Conservatives and the Progressives were each other's counterparts while the other two occupied an in-between position. The Conservatives tend to be members of a religious denomination and the Progressives tend to be nonmembers. This of course is in agreement with the results presented in Table 3.

The relations between education and the typology were very strong, especially with regard to belonging to the Non-involved or the Progressive type. The lower the educational level, the higher the chances of belonging to the Non-involved type. There is a strong opposite tendency for the chances of belonging to the Progressive type. The higher the education, the higher the chances of being Progressive. The Non-involved type has definitely the lowest education, while Progressives tend to be the most highly educated. Furthermore, the Conservatives and the Individualists mostly do not come from the least educated category, but in equal
TABLE 4 Relation between the political typology and background variables: age, education and religion; vertical percentages
[Table body not recoverable from the source. Columns: Age (a); Religious Denomination (b); Education (c); Conservatives; Progressives; Individualists; Non-involved.]
Notes: (a) Age: Y(oung): <31 years; M(iddle): 31-50 years; O(ld): >50 years.
(b) Religious Denomination: N(onreligious): R does not belong to a religious denomination; R(eligious): R belongs.
(c) Education: age at completion of education: L(ow): <16 years; M(iddle): 16-18 years; H(igh): >18 years.
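The joint external variable ARE, whose categories form the rows of Table 4, can be enumerated directly from the codes in notes (a) to (c). The sketch below builds the 18 categories and checks the size of the observed table and the degrees of freedom quoted in the text, assuming the usual parameter count of an unrestricted latent class model. Variable names are illustrative only.

```python
from itertools import product
from math import prod

age = ["Y", "M", "O"]          # note (a): young, middle, old
religion = ["N", "R"]          # note (b): nonreligious, religious
education = ["L", "M", "H"]    # note (c): low, middle, high

# Joint external variable ARE: one category per combination of codes.
are = ["".join(combo) for combo in product(age, religion, education)]
print(len(are))                # 18

# Nine political indicators (category counts as listed in the text) plus ARE.
manifest = [2, 3, 2, 2, 2, 2, 2, 2, 4, len(are)]
cells = prod(manifest)
print(cells)                   # 27648

# Four latent classes: class proportions plus conditional probabilities per class.
n_classes = 4
free_params = (n_classes - 1) + n_classes * sum(c - 1 for c in manifest)
print(cells - 1 - free_params) # 27528
```

Treating ARE as a single indicator is what makes the table so sparse: 27,648 cells for 1,221 respondents.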
amounts from the groups with average or high education.

There is no special connection between age and belonging to the Non-involved or Progressive types. The support for the Conservatives clearly increases with age. The Individualists belong to the middle and younger age groups, but definitely not to the oldest age group.

In sum, we have obtained the following extra characteristics for the types in the political domain. The Conservatives are the older, religious people, averagely educated. The Progressives are the nonreligious, more highly educated people, from all age groups. The Individualists come from the youngest and middle age groups and are to be found equally among the religious and nonreligious people with an average education. The overwhelming characteristic of the Non-involved is their low education; there is no special relation with religion or age.

RESTRICTED MODELS AND NEW DEVELOPMENTS

As has been shown above, the standard latent class model actually encompasses a large variety of models. The standard, one latent variable model can be transformed into a model with several latent variables and/or several external (background) variables. Many more modifications of the standard model are possible that are useful within the context of typology analysis.

If two or more items are equally reliable and valid indicators of the underlying typology (the latent variable X), one can put equality restrictions on the parameters to ensure the
equivalence of these indicators (Goodman, ables and the latent variable might show very
1974a,b). For example, in the case of one latent complicated higher order interactions. How-
variable religiosity (X) and two religious items A ever, it is possible to introduce the background
and B, one sets the probability of giving the variables as quasi-latent variables and to exclude
religious answer i on A equal to the probability from the model certain higher order interactions
of giving the religious answer i on B, for each among the (quasi-) latent variables (Hagenaars,
latent class t: $\pi^{\bar{A}X}_{it} = \pi^{\bar{B}X}_{it}$. 1988a). In this way it is possible to define
Through equality restrictions one can also 'modified path analysis models' (Goodman,
express the idea that the 'reliabilities' (Schwartz, 1973) with latent variables.
1985; 1986) of a particular item A are the same Apart from these modifications of the stand-
for all latent classes. For example, in the case of ard model there are several interesting new
dichotomous variables: the probability of developments. For instance it is possible to relax
obtaining the religious score 1 on A, given a the basic assumption of local independence
person is in the religious latent class 1 (X = 1), is (Hagenaars, 1988b); models have been
the same as the probability of choosing the developed in which item nonresponse is taken
nonreligious alternative 2 on A, given one is in into account (Fuchs, 1982; Fay, 1986;
the nonreligious latent class 2 (X = 2): trf? = Hagenaars, 1988c); procedures are available to
TT2 2 . set up models with ordinal (latent) variables
Another useful feature of latent class analysis connecting latent class and latent trait models
is the possibility to assign fixed values to the (Haberman, 1979; Clogg, 1979; Clogg and
parameters of the latent class model, especially Sawyer, 1981; Andrich, 1985).
the extreme values zero or one. Sometimes a At the heart of most of these developments is
manifest variable (A) can be conceived of as a the formulation of the latent class model as a
perfect indicator of the underlying typology (X). log-linear model with latent variables (see
This kind of relation between A and X can be especially, Haberman, 1979). For instance, the
taken into account by imposing the following standard model in Figure 1 can be formulated in
restrictions: irt* = 1 if i = t, -rrf^ = 0 other- log-linear terms as follows, using Goodman's
wise. The manifest variable A and the latent notation (Goodman, 1978):
variable X have then in fact become identical
variables, and hence X often will be called a
quasi-latent variable.
Quasi-latent variables can be used for many (5)
other purposes. Very often an investigator wants
to compare the typology found for a particular Especially Goodman in his many publications
group (e.g. nation) at one point in time with has shown the potentialities of log-linear
typologies found for other groups or other analysis for sociological research. Extending
periods of time. Such cross-cultural or over-time these insights to log-linear analysis with latent
comparisons can be made through a latent class variables, in other words to latent class analysis,
model in which the group variable (e.g. nation) gives an impression of the many possibilities a
or time variable (period) has been included as a modern researcher has working with (ideal)
quasi-latent variable. (More details are given by types. Standard, easy to use computer programs
Clogg and Goodman, 1985, and Hagenaars, in which the possibilities mentioned above have
1985, 1988a.) been implemented, are available. To mention
just a few: Haberman's LAT, Clogg's MLLSA,
Quasi-latent variables also play an important and Hagenaars' LCAG (Haberman, 1979;
part in setting up causal models with categorical Clogg, 1981; Hagenaars, 1988a; Hagenaars and
latent variables. As has been shown in the Luijkx, 1987).
previous section, background variables like age,
religion, and education can be related to the
latent variable. It was remarked there that the EVALUATION
relationships between these background vari- Latent class analysis appears to be extremely
well suited for discovering and testing the 'existence' of ideal types. The idea of latent classes which determine the observed reactions corresponds very well with the idea of underlying types which govern overt behavior. Moreover, latent class analysis does not presuppose interval measurement, linear relationships, or underlying normal distributions. These kinds of assumptions are common in alternative techniques like factor analysis, but are alien to the way most social scientists have been using ideal types.

Especially through the work of Haberman and Goodman, most statistical problems with latent class analysis have been overcome. It is known how and when to obtain estimates of identifiable parameters. Statistical testing procedures have been developed. Computer programs are available to routinely carry out the analyses.

Nevertheless, one big problem remains, as with all forms of table analysis. If the number of manifest variables increases somewhat, the number of cells increases enormously, and may well exceed the number of respondents. Under such circumstances the standard chi-square testing procedures are of no use. (For some recent developments and possible alternatives see, among others, Koehler, 1986.) In the examples above this came out clearly from the enormous differences between the values of the Pearson chi-squares and the log-likelihood chi-squares for several models. (So in spite of what we have said above, perhaps more attention ought to be paid to the possibilities of using individual latent scores.)

Although extremely small cell frequencies in principle affect the stability of the parameter estimates, it is more or less reassuring that, once a satisfactory solution was obtained, dividing the sample into more layers according to religion, age and education, which resulted in an enormous table, did not noticeably affect the parameter estimates.

So, many practical problems remain, but the possibilities offered by latent class analysis challenge the theorist to subject his theoretical notions to empirical tests more systematically; and on the other hand, they offer the empirical analyst an opportunity to perform analyses that are more in line with what is theoretically meaningful.

NOTES

1. The sample consists of about 1,000 randomly chosen adults and an extra quota sample of about 200 young adults aged 18-24 years. For more details about the sampling procedure see Harding, Phillips and Fogarty (1986). The EVSSG data set has been deposited with the Economic and Social Research Council Data Archive at the University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, England.
2. The program LCAG, along with the user manual, can be obtained at minimal charges for tape and mailing from: Jacques A. Hagenaars, Department of Sociology, Tilburg University, P.O. Box 90153, 5000 LE Tilburg, The Netherlands; electronic mail: R066JHAG@HTIKUB5.BITNET.
3. The results reported below are mainly based on a log-linear analysis of Table 4. Ideally this log-linear analysis ought to be carried out simultaneously with the latent class analysis. We will return to this in the next section.

REFERENCES

Adriaansens H P, Zijderveld A C. (1981): Vrijwillig Initiatief in de Verzorgingsstaat (Voluntary associations in the welfare state), Deventer: Van Loghum Slaterus.
Aldenderfer M S, Blashfield R K. (1984): Cluster Analysis, Beverly Hills: Sage Publications.
Andrich D. (1985): 'An Elaboration of Guttman Scaling with Rasch Models for Measurement', in Tuma N B, (ed), Sociological Methodology 1985, San Francisco: Jossey-Bass, pp. 33-80.
Barton A H. (1955): 'The Concept of Property-space in Social Research', in Lazarsfeld P F, Rosenberg M, (eds), The Language of Social Research, New York: The Free Press, pp. 40-53.
Bell D. (1974): The Coming of Post-industrial Society, London: Heinemann.
Bijnen E J. (1973): Cluster Analysis, Tilburg: Tilburg University Press.
Clogg C C. (1979): 'Some Latent Structure Models for the Analysis of Likert-type Data', Social Science Research, 8: 287-301.
Clogg C C. (1981): 'New Developments in Latent Structure Analysis', in Jackson D J, Borgatta E F, (eds), Factor Analysis and Measurement in Sociological Research, Beverly Hills: Sage Publications, pp. 215-48.
Clogg C C, Goodman L A. (1985): 'Simultaneous Latent Structure Analysis in Several Groups', in Tuma N B, (ed), Sociological Methodology 1985, San Francisco: Jossey-Bass, pp. 81-110.
Clogg C C, Sawyer D O. (1981): 'A Comparison of Alternative Models for Analyzing the Scalability of Response Patterns', in Leinhardt S, (ed), Sociological Methodology 1981, San Francisco: Jossey-Bass, pp. 240-80.
Everitt B. (1980): Cluster Analysis, New York: Halsted.
Fay R E. (1986): 'Causal Models for Patterns of Non-Response', Journal of the American Statistical Association, 81: 354-365.
Fuchs C. (1982): 'Maximum Likelihood Estimation and Model Selection in Contingency Tables with Missing Data', Journal of the American Statistical Association, 77: 270-78.
Gifi A. (1981): Non-linear Multivariate Analysis, Leyden: DSWO Press.
Goodman L A. (1973): 'The Analysis of Multidimensional Contingency Tables when Some Variables Are Posterior to Others; a Modified Path Analysis Approach', Biometrika, 60: 179-92.
Goodman L A. (1974a): 'The Analysis of Systems of Qualitative Variables When Some of the Variables Are Unobservable. Part I: A Modified Latent Structure Approach', American Journal of Sociology, 79: 1179-1259.
Goodman L A. (1974b): 'Exploratory Latent Structure Analysis Using Both Identifiable and Unidentifiable Models', Biometrika, 61: 215-31.
Goodman L A. (1978): Analyzing Qualitative/Categorical Variables; Log-linear Models and Latent Structure Analysis, Cambridge: Abt Books.
Haberman S J. (1979): Analysis of Qualitative Data. New Developments, New York: Academic Press.
Hagenaars J A P. (1985): Loglineaire Analyse van Herhaalde Surveys (Log-linear analysis of repeated surveys), dissertation, Tilburg University.
Hagenaars J A P. (1988a): 'LCAG-Loglinear Modelling with Latent Variables; a Modified LISREL Approach', in Saris W E, Gallhofer I, (eds), Sociometric Research. Volume 2. Data Analysis, London: Macmillan, pp. 111-130.
Hagenaars J A P. (1988b): 'Latent Structure Models with Direct Effects Between Indicators. Local Dependence Models', Sociological Methods and Research, 16: 379-405.
Hagenaars J A P. (1988c): 'Log-linear Analysis with Latent Variables and Missing Data', paper presented at the International Conference on Social Science Methodology, Dubrovnik, May-June, 1988.
Hagenaars J, Luijkx R. (1987): LCAG; Latent Class Models and Other Loglinear Models with Latent Variables, Manual LCAG, Working Paper Series # 17, Department of Sociology, Tilburg: University of Tilburg.
Halman L, Heunks F, De Moor R, Zanders H. (1987): Traditie, Secularisatie en Individualisering (Tradition, secularization, and individualization), Tilburg: Tilburg University Press.
Harding S, Phillips D. (1986): Contrasting Values in Western Europe, London: Macmillan.
Harman H H. (1976): Modern Factor Analysis, Chicago: University of Chicago Press.
Inglehart R. (1977): The Silent Revolution, New Jersey: Princeton University Press.
Kim J O, Mueller C W. (1978a): Introduction to Factor Analysis, Beverly Hills: Sage Publications.
Kim J O, Mueller C W. (1978b): Factor Analysis; Statistical Methods and Practical Issues, Beverly Hills: Sage Publications.
Klecka W R. (1980): Discriminant Analysis, Beverly Hills: Sage Publications.
Koehler K. (1986): 'Goodness-of-fit Tests for Log-linear Models in Sparse Contingency Tables', Journal of the American Statistical Association, 81: 483-493.
Kruskal J B, Wish M. (1978): Multidimensional Scaling, Beverly Hills: Sage Publications.
Lazarsfeld P F. (1950a): 'The Logical and Mathematical Foundation of Latent Structure Analysis', in Stouffer S, (ed), Measurement and Prediction, Princeton: Princeton University Press, pp. 362-412.
Lazarsfeld P F. (1950b): 'The Interpretation and Mathematical Foundation of Latent Structure Analysis', in Stouffer S, (ed), Measurement and Prediction, Princeton: Princeton University Press, pp. 413-72.
Lazarsfeld P F, Henry N W. (1968): Latent Structure Analysis, Boston: Houghton Mifflin Company.
De Leeuw J. (1984): Canonical Analysis of Categorical Data, Leyden: DSWO Press.
McCutcheon A L. (1987): Latent Class Analysis, Beverly Hills: Sage Publications.
McKinney J C. (1966): Constructive Typology and Social Theory, New York: Appleton Century Crofts.
Riley M W. (1963): Sociological Research, I; A Case Approach, New York: Harcourt, Brace & World.
Schwartz J E. (1985): 'The Neglected Problems of Measurement Error in Categorical Data', Sociological Methods and Research, 13: 435-66.
Schwartz J E. (1986): 'A General Reliability Model for Categorical Data Applied to Guttman Scales and Current-Status Data', in Tuma N B, (ed), Sociological Methodology 1986, Washington, DC: American Sociological Association, pp. 79-119.
Sjoberg G, Nett R. (1968): A Methodology for Social Research, New York: Harper & Row.
Steiger J H. (1979): 'The Relationship between External Variables and Common Factors', Psychometrika, 44: 157-67.
Stoetzel J. (1983): Les Valeurs du Temps Présent: une Enquête Européenne, Paris: Presses Universitaires de France.
Tatsuoka M M. (1971): Multivariate Analysis, New York: John Wiley.
Wiggins L M. (1955): Mathematical Models for the Analysis of Multi-wave Panels, Ph.D dissertation, Columbia University, Ann Arbor: University Microfilms.
Zijderveld A C. (1982): Reality in a Looking Glass: Rationality Through an Analysis of Traditional Folly, London: Routledge & Kegan Paul.

AUTHORS' ADDRESS

Jacques A. Hagenaars and Loek C. Halman, Department of Sociology, Tilburg University, P.O. Box 90153, 5000 LE Tilburg, The Netherlands.

Manuscript received: July, 1988.
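
As a small numerical illustration of the equality and fixed-value restrictions discussed under 'Restricted Models and New Developments', the following Python sketch may be helpful. It is not taken from LCAG, MLLSA, or LAT. The function name `joint_manifest` and all probabilities are hypothetical, chosen only for illustration. The sketch assumes two latent classes and two dichotomous items A and B, and it builds the joint distribution of A and B from the latent class proportions and the conditional response probabilities under local independence. Categories are indexed from 0 in the code, whereas the text numbers them from 1.

```python
from itertools import product


def joint_manifest(class_probs, cond_a, cond_b):
    """Return P(A=i, B=j) = sum over t of P(X=t) * P(A=i|X=t) * P(B=j|X=t).

    cond_a[t][i] is the conditional probability of score i on item A
    given latent class t; local independence is assumed.
    """
    n_a, n_b = len(cond_a[0]), len(cond_b[0])
    table = {}
    for i, j in product(range(n_a), range(n_b)):
        table[(i, j)] = sum(
            p_t * cond_a[t][i] * cond_b[t][j]
            for t, p_t in enumerate(class_probs)
        )
    return table


# Hypothetical latent class proportions (two classes).
class_probs = [0.6, 0.4]

# (i) Equality restriction: A and B are treated as equally reliable
# indicators, so both items share one set of conditional probabilities.
shared = [[0.9, 0.1], [0.2, 0.8]]
equal_table = joint_manifest(class_probs, shared, shared)

# (ii) Fixed-value restriction: A is a perfect indicator of X
# (probability 1 if i == t, 0 otherwise), so X becomes quasi-latent.
perfect_a = [[1.0, 0.0], [0.0, 1.0]]
quasi_table = joint_manifest(class_probs, perfect_a, shared)

# Marginal distribution of A under restriction (ii).
marginal_a = [sum(quasi_table[(i, j)] for j in range(2)) for i in range(2)]
```

Under restriction (i) the implied A by B table is symmetric, since the two items are interchangeable. Under restriction (ii) the marginal distribution of A coincides with the latent class proportions, which is the sense in which A and X have become identical variables.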