You are on page 1of 21

# JSS

Journal of Statistical Software
August 2009, Volume 31, Issue 4.

http://www.jstatsoft.org/

Gifi Methods for Optimal Scaling in R: The
Package homals
Jan de Leeuw

Patrick Mair

University of California, Los Angeles

WU Wirtschaftsuniversit¨at Wien

Abstract
Homogeneity analysis combines the idea of maximizing the correlations between variables of a multivariate data set with that of optimal scaling. In this article we present
methodological and practical issues of the R package homals which performs homogeneity
analysis and various extensions. By setting rank constraints nonlinear principal component analysis can be performed. The variables can be partitioned into sets such that
homogeneity analysis is extended to nonlinear canonical correlation analysis or to predictive models which emulate discriminant analysis and regression models. For each model
the scale level of the variables can be taken into account by setting level constraints. All
algorithms allow for missing values.

Keywords: Gifi methods, optimal scaling, homogeneity analysis, correspondence analysis, nonlinear principal component analysis, nonlinear canonical correlation analysis, homals, R.

1. Introduction
In recent years correspondence analysis (CA) has become a popular descriptive statistical
method to analyze categorical data (Benz´ecri 1973; Greenacre 1984; Gifi 1990; Greenacre and
Blasius 2006). Due to the fact that the visualization capabilities of statistical software have
increased during this time, researchers of many areas apply CA and map objects and variables
(and their respective categories) onto a common metric plane.
Currently, R (R Development Core Team 2009) offers a variety of routines to compute CA
and related models. An overview of functions and packages is given in Mair and Hatzinger
(2007). The package ca (Nenadic and Greenacre 2007) is a comprehensive tool to perform
simple and multiple CA. Various two- and three-dimensional plot options are provided.
In this paper we present the R package homals, starting from the simple homogeneity analysis,
which corresponds to a multiple CA, and providing several extensions. Gifi (1990) points

2

homals: Gifi Methods for Optimal Scaling in R

out that homogeneity analysis can be used in a strict and a broad sense. In a strict sense
homogeneity analysis is used for the analysis of strictly categorical data, with a particular
loss function and a particular algorithm for finding the optimal solution. In a broad sense
homogeneity analysis refers to a class of criteria for analyzing multivariate data in general,
sharing the characteristic aim of optimizing the homogeneity of variables under various forms
of manipulation and simplification (Gifi 1990, p. 81). This view of homogeneity analysis will
be used in this article since homals allows for such general computations. Furthermore, the
two-dimensional as well as three-dimensional plotting devices offered by R are used to develop
a variety of customizable visualization techniques. More detailed methodological descriptions
can be found in Gifi (1990) and some of them are revisited in Michailidis and de Leeuw (1998).

2. Homogeneity analysis
In this section we will focus on the underlying methodological aspects of homals. Starting
with the formulation of the loss function, the classical alternating least squares algorithm is
presented in brief and the relation to CA is shown. Based on simple homogeneity analysis
we elaborate various extensions such as nonlinear canonical analysis and nonlinear principal
component analysis. A less formal introduction to Gifi methods can be found in Mair and de
Leeuw (2009).

2.1. Establishing the loss function
Homogeneity analysis is based on the criterion of minimizing the departure from homogeneity.
This departure is measured by a loss function. To write the corresponding basic equations the
following definitions are needed. For i = 1, . . . , n objects, data on m (categorical) variables
are collected where each of the j = 1, . . . , m variable takes on kj different values (their levels
or categories). We code them using n × kj binary indicator matrices Gj , i.e., a matrix of
dummy variables for each variable. The whole set of indicator matrices can be collected in a
block matrix
h
i

G = G1 ... G2 ... · · · ... Gm .
(1)
In this paper we derive the loss function including the option for missing values. For a
simpler (i.e., no missings) introduction the reader is referred to Michailidis and de Leeuw
(1998, p. 307–314). In the indicator matrix missing observations are coded as complete zero
rows; if object i is missing on variable j, then row i of Gj is 0. Otherwise the row sum
becomes 1 since the category entries are disjoint. This corresponds to the first missing option
presented in Gifi (missing data passive 1990, p. 74). Other possibilities would be to add
an additional column to the indicator matrix for each variable with missing data or to add
as many additional columns as there are missing data for the j-th variable. However, our
approach is to define the binary diagonal matrix Mj if dimension n × n for each variable j.
The diagonal element (i, i) is equal to 0 if object i has a missing value on variable j and
equal to 1 otherwise. Based on Mj we can define M? as the sum of the Mj ’s and M• as their
average.
For convenience we introduce

Dj = G0j Mj Gj = G0j Gj ,

(2)

as the kj ×kj diagonal matrix with the (marginal) frequencies of variable j in its main diagonal.

. More detailed plot descriptions are given in Section 4. The category points are the centers of gravity of the object points that share the same category. the minimization problem is solved by the iterative alternating least squares algorithm (ALS. The loss in (3) pertains to the sum of squares of the line lengths in the graph plot. The larger the spread between category points the better a variable discriminates and thus the smaller the contribution to the loss. Minimizing the loss function Typically. .e... . we can think of G as the adjacency matrix of a bipartite graph in which the n objects and the kj categories (j = 1. . the loss corresponds to the sum over variables of the sum of squared line lengths. Moreover. Thus. Multiplying the scores by n/m gives a mean of 0 and a variance of 1 (i. Normalization: X (t+1) = M? 2 orth(M? 2 X . can be considered as the classical or standard homals plot. . . The problem of finding these solutions can be formulated by means of the following loss function to be minimized: ∆ σ(X. . . . 2.e. . Producing a star plot. sometimes quoted as reciprocal averaging algorithm). m ˜ (t) = M?−1 Pm G Y (t) 2. 2. let Yj be the unknown kj × p matrix containing the coordinates of the category projections into the same p-dimensional space (category quantifications). Update object scores: X j=1 j j 1 1 − − ˜ (t) ) 3. Ym ) = m X tr(X − Gj Yj )0 Mj (X − Gj Yj ) (3) j=1 We use the normalization u0 M• X = 0 and X 0 M• X = I in order to avoid the trivial solution X = 0 and Yj = 0. . The first restriction centers the graph plot (see Section 4) around the origin whereas the second p restriction makes the columns of the object score matrix orthogonal. m) are the vertices. The closeness of two objects in the plot is related to the similarity of their response patterns. Update category quantifications: Yj = Dj−1 G0j X (t) for j = 1. Geometry of the loss function In the homals package we use homogeneity analysis as a graphical method to explore multivariate data sets. i.3. . We give a graphical interpretation of the loss function in the following section.. Each iteration t consists of three steps: (t) 1. Y1 . they are z-scores). i. A “perfect” solution. In the corresponding graph plot an object and a category are connected by an edge if the object is in the corresponding category. connecting the object scores with their category centroid. Note that from an analytical point of view the loss function represents the sum-of-squares of (X − Gj Yj ) which obviously involves the object scores and the category quantifications. Furthermore. At iteration t = 0 we start with arbitrary object scores X (0) . without any loss at all. would imply that all object points coincide with their category points. we minimize simultaneously over X and Yj .2.Journal of Statistical Software 3 Now let X be the unknown n × p matrix containing the coordinates (object scores) of the object projections into Rp .e. The joint plot mapping object scores and the category quantifications in a joint space.

Using the block matrix notation. We can use QR decomposition. we can put the problem in Equation 3 into a SVD context (de Leeuw. X (t+1) = lsvec(X • • (5) In each iteration t we compute the value of the loss function to monitor convergence.4 homals: Gifi Methods for Optimal Scaling in R Note that matrix multiplications using indicator matrices can be implemented efficiently as cumulating the sums of rows over X and Y . P• denotes the average. the ALS algorithm only computes the first p dimensions of the solution. here denoted decomposition (SVD). Further elaborations can be found in Michailidis and de Leeuw (1998). The goodness-of-fit of a solution can be examined by means of a screeplot of the eigenvalues. The contribution of each variable to the final solution can be examined by means of discrimination measures defined by ||Gj Yj ||2 /n (see Meulman 1996). Also X 0 Pj X is the between-category dispersion for variable j. are used. G 0 M?−1 GY 2 = DY Λ . Compared to the classical SVD approach. and Wang 1999). Greenacre and Blasius 2006).e.. Usually. modified Gram-Schmidt. i. (6) 0 G X = DY Λ. (4) j=1 Again. by capitalizing on sparseness of G. Here orth is some technique which computes an orthonormal basis for the column space of a matrix. (8) (9) Here the eigenvalues Λ2 are the ratios along each dimension of the average between-category variance and the average total variance. Computing the homals solution in this way is the same as performing a CA on G. The package homals allows for imposing restrictions on the . let Pj denote the orthogonal projector on the subspace spanned by the columns of Gj . This leads to an increase in computational efficiency. multiple CA solves the generalized eigenproblem for the Burt matrix C = G0 G and its diagonal D (Greenacre 1984. because it replaces computation with sparse indicator matrices by computations with a dense average projector. In the homals package the left singular vectors of X as lsvec. Moreover. Thus. By means of the lsvec notation and including P• we can describe a complete iteration step as ˜ (t) ) = lsvec(M −1 P X (t) ). 3. the homals package is able to handle large data sets. Pj = Gj Dj−1 G0j . Note that Formula (5) is not suitable for computation. (7) or equivalently one of the two generalized eigenvalue problems GD−1 G0 X = M? XΛ2 . Michailidis. we have to solve the generalized singular value problem of the form GY = M? XΛ. the sum over the m projectors is P? = m X Pj = j=1 m X Gj Dj−1 G0j . Extensions of homogeneity analysis Gifi (1990) provides various extensions of homogeneity analysis and elaborates connections to other multivariate methods. or the singular value ˜ (t) . To simplify. Correspondingly.

which allows for the existence of object scores with a single category quantification. Thus. Our homals implementation allows for multiple rank restrictions. for a simple homals solution. 3. · · · i . . Its minimization leads to a n × p matrix of component scores and an m × p matrix of component loadings. nonlinear PCA (NLPCA) can be used. (12) Gathering the Aj ’s in a block matrix as well.. rj = kj − 1 iff kj ≤ p. and rj = p otherwise. Gm Z m . without loss of generality. principal components analysis (PCA) is a common technique to reduce the dimensionality of the data set. Thus.g.e. If no restrictions are imposed.. broad sense view in the Introduction). . having nonmetric variables. i.. NLPCA can be defined as homogeneity analysis with restrictions on the quantification matrix Yj .. In this case we say that all variables are single and the rank restrictions are imposed by Yj = zj a0j . 3.. Am i (13) . Multiple quantifications It is not necessarily needed that we restrict the rank of the score matrix to 1. We require.. Let us denote rj ≤ p as the parameter for the imposed restriction on variable j. The term “nonlinear” pertains to nonlinear transformations of the observed variables (de Leeuw 2006). Nonlinear principal component analysis Having a n × m data matrix with metric variables.Journal of Statistical Software 5 variable ranks and levels as well as defining sets of variables. We start our explanations with the simple case for rj = 1 for all j. · · · .1. A2 . (10) where zj is a vector of length kj with category quantifications and aj a vector of length p with weights. In Gifi terminology. Zj is kj × rj and Aj is p × rj . These options offer a wide spectrum of additional possibilities for multivariate data analysis beyond classical homogeneity analysis (cf. each quantification matrix is restricted to rank 1.. to project the variables into a subspace Rp where p  m. To establish the loss function for the rank constrained version we write r? for the sum of the rj and r• for their average. However... We can simply extend Equation 10 to the general case Yj = Zj A0j (11) where again 1 ≤ rj ≤ min (kj − 1. The block matrix G of dummy variables now becomes h ∆ Q = G1 Z1 .2... the p × r? matrix h ∆ A = A1 . The Eckart-Young theorem states that this classical form of linear PCA can be formulated by means of a loss function. G2 Z2 . we have the situation of multiple quantifications which implies imposing an additional constraint each time PCA is carried out. that Zj0 Dj Zj = I.. p). as e.

Then. optimal scaling attempts to do two things simultaneously: The transformation of the data by a transformation appropriate for the scale level (i. (15) σ(Z) = min σ(X. i. level constraints). Level constraints: Optimal scaling From a general point of view. we find h= 1 u0 D ju A0j Yj0 Dj u = 0. (19) . (18b) with W as a symmetric matrix of Langrange multipliers. For nominal variables. A corresponding example in terms of a lossplot is given in Section 4. we will proceed on our explanations by regarding this particular term only. Z. In fact. Z. Equation 17 is minimized under the conditions u0 Dj Zj = 0 and Zj0 Dj Zj = I. Y1 . the starting point is to split up Equation 14 into two separate terms. and numerical variables can be imposed on Zj in the following manner.3..A s=p+1 where the λs are the ordered singular values. and the fit of a model to the transformed data to account for the data. level constraints for nominal. by minimizing (14) over X and A we see that r? X ∆ λ2s (Z) + m(p − r• ). In this paper we will take into account the scale level of the variables in terms of restrictions within Zj .e. polynomial. 3. Solving. the rank restrictions Yj = Zj A0j affect only the second term and hence.e.. (17) j=1 Now. The stationary equations are Aj = Yj0 Dj Zj . A) = m X tr (X − Gj Zj A0j )0 Mj (X − Gj Zj A0j ) = j=1 = mtr X 0 M? X − 2tr X 0 QA + tr A0 A = = mp + tr (Q − XA)0 (Q − XA) − tr Q0 Q = = tr(Q − XA)0 (Q − XA) + m(p − r• ) (14) This shows that σ(X. To do this. (18a) 0 Yj Aj = Zj W + uh . Using Yˆj = Dj−1 G0j X this leads to Pm 0 j=1 tr(X − Gj Yj ) Mj (X − Gj Yj ) P ˆ ˆ 0 ˆ ˆ = m j=1 tr(X − Gj (Yj + (Yj − Yj ))) Mj (X − Gj (Yj + (Yj − Yj ))) Pm Pm ˆ 0 ˆ ˆ 0 ˆ (16) = j=1 tr(X − Gj Yj ) Mj (X − Gj Yj ) + j=1 tr(Yj − Yj ) Dj (Yj − Yj ). σ(Z. Obviously. · · · . A) = m X tr(Zj A0j − Yˆj )0 Dj (Zj A0j − Yˆj ).6 homals: Gifi Methods for Optimal Scaling in R results. Thus it is a simultaneous process of data transformation and data representation (Takane 2005). A) = X. all columns in Zj are unrestricted. ordinal. Equation 3 becomes σ(X. Ym ) ≥ m(p − r• ) and the loss is equal to this lower bound if we can choose the Zj such that Q is of rank p.

van der Burg. Y1 . Gvm . the first column in Zj denoted by zj0 is fixed and linear with the category numbers.. and Verdegaal 1988. Nonlinear canonical correlation analysis In Gifi terminology. Thus all p columns of Yj are polynomials of degree rj . the rest is free. Our implementation allows for multiple ordinal.Journal of Statistical Software ∆ ∆ 1/2 7 1/2 and thus. . In this section the relation to homogeneity analysis is shown. (22) 1 2 v The loss function can be stated as  0   mv mv K X X X ∆ 1 σ(X. (21) j=1 Since column zj0 is fixed. z0j0 Dj Zj = 0 is required as minimization condition. 3. Again (17) has to be minimized under the condition Zj0 Dj Zj = I (and optionally additional conditions on Zj ). . letting Z j = Dj Zj and Y j = Dj Yj . Hence. de Leeuw. . nonlinear canonical correlation analysis (NLCCA) is called “OVERALS” (van der Burg. This approach can be extended in order to compute p orthogonal vectors in K general matrices Gv . Moreover. h i ∆ Gv = Gv . Unlike in Gifi (1990) and Michailidis and de Leeuw (1998) it is not necessary to have rank 1 restrictions in order to allow for different scaling levels. and Aj = Y j Z j = Lr Λr O. For polynomial constraints the matrix Zj are the first rj orthogonal polynomials.. K v=1 j=1 j=1 (23) . Thus. the loss function in (17) changes to σ(Z. mv ) in set v. A) = m X tr(Zj A0j + zj0 a0j0 − Yˆj )0 Dj (Zj A0j + zj0 a0j0 − Yˆj ). and Dijksterhuis 1994). is p × (rj − 1). (20) If Y j = KΛL0 is the SVD of Y j . If the data set has variables with different scale levels.. . Zj A0j = Dj Kr Λr L0r . In the case of numerical variables. Gv . level constraints. . YK ) = tr X − Gvj Yvj  Mv X − Gvj Yvj  . Recall that the aim of homogeneity analysis is to find p orthogonal vectors in m indicator matrices Gj . it follows that 0 Y j Y j Z j = Z j W. 0 −1/2 −1/2 Zj = Dj Kr O. for the computation NLCCA between g = 1.. . then we see that Z j = Kr O with O as an arbitrary rotation matrix and Kr as the singular vectors corresponding with the r largest singular values. · · · . . .. . with aj0 as the first column. de Leeuw. Note that level constraints can be imposed additionally to rank constraints. we can also solve the problem tr(Zj0 Dj Yj Yj0 Dj Zj ) with Zj0 Dj Zj = I. each of dimension n × mv where mv is the number of variables (j = 1. K sets of variables. Having ordinal variables. If we minimize over Aj . the first column of Zj is constrained to be either increasing or decreasing. the rest is free. This is due to the fact that it has most of the other Gifi-models as special cases. multiple numerical etc.4.. . In order to minimize (21). The homals package allows for the definition of sets of variables and thus. . Thus. the homals package allows for setting level constraints for each variable j separately. . Zj is a kj × (rj − 1) matrix and Aj .

˜ A) ˆ ≤ σ(Z. We show below that we can still produce a sequence of normalized solutions with a non-increasing sequence of loss function values. Corresponding examples will be given in Section 4. All the transformations discussed above are projections on some convex cone Kj . Gvj is n × kj . A+ ) ≤ σ(Z + . A) (26) This seems to invalidate the usual convergence proof. Furthermore. The basic idea of the algorithm is to apply alternating least squares with rescaling. Moreover the first column z0 of Z is restricted by z0 ∈ K.8 homals: Gifi Methods for Optimal Scaling in R X is the n × p matrix with object scores. Clearly. and A is p × r. ˜ ˆ corresponding loss function value σ(Z. i. Z should also satisfy the common normalization conditions u0 DZ = 0 and Z 0 DZ = I. The normalization conditions are XM• X = I and u0 M• X = 0 where M• is the average of Mv . (27) Finally compute A+ by minimizing σ(Z + . Now update Z to Z + using the weighted Gram-Schmidt solution Z˜ = Z + S for Z where S is the Gram-Schmidt triangular matrix.5. Z is k × r. and Yvj is the kj × p matrix containing the coordinates. and keeping A fixed at A. The “non-standard” part of the algorithm is that we do not impose the normalization conditions if we minimize over Z. 2 sets this implies that G1 is n × 1 and G2 is n × m − 1. Cone restricted SVD In this final methodological section we show how the loss functions of these models can be solved in terms of cone restricted SVD. To improve it we first minimize over the nonSuppose (Z. ˆ A). A) = σ(Z. The first column z˜0 of Z˜ satisfies the cone constraint. ˆ A). σ(Z. ˆ A). A). For the sake of simplicity we drop the j and v indexes and we look only at the second term of the partitioned loss function (see Equation 17).e. Since NLPCA can be considered as special case of NLCCA. which is based on a non-increasing ˆ −1 )0 . (25) but Z˜ is not normalized. Then Z˜ Aˆ0 = Z + A0 . A+ ) ≤ σ(Z + . if the sets are treated in an asymmetric manner predictive models such as regression analysis and discriminant analysis can be emulated. i. ˆ This gives Z˜ and a normalized Z. (24) over Z and A. Thus we alternate minimizing over Z for fixed A and over A for fixed Z. with K as a convex cone. Since σ(Z + . σ(Z. For v = 1. Missing values are taken into account in Mv which is the element-wise minimum of the Mj in set v. ˆ σ(Z. (28) . But now also adjust Aˆ to A = A(S and thus ˜ A) ˆ = σ(Z + . A) we have the chain ˜ A) ˆ ≤ σ(Z. A). sequence of loss function values. satisfying the cone constraint. so does the first column of Z + . ˆ σ(Z + ... ˆ A) ˆ is our current best solution. all the additional restrictions for different scaling levels can straightforwardly be applied for NLCCA. where Yˆ is k × p. and because of the nature of Gram-Schmidt. Unlike classical canonical correlation analysis. A) = tr(ZA0 − Yˆ )0 D(ZA0 − Yˆ ).2. Observe that it is quite possible that ˆ > σ(Z. NLCCA is not restricted to two sets of variables but allows for the definition of an arbitrary number of sets. A) over A. ˆ σ(Z + . 3.e. for K = m.

In actual computation. The extended models can be computed by setting corresponding arguments: The rank argument (integer value) allows for the calculation of rank-restricted nonlinear PCA. Examples can be found in the corresponding help files. For variables with rank restrictions we first have to project the objects on the hyperplane spanned by the category quantifications. dynamic rgl plots (Adler and Murdoch 2009) with plot3d and static scatterplot3d plots (Ligges and M¨ achler 2003) with plot3dstatic:  Object plot ("objplot"): Plots the scores of the objects (rows in data set) on two or three dimensions. . plot. plot3dstatic and predict. The package offers a wide variety of plots. Three-dimensional plots are available. For some plot types three-dimensional versions are provided. summary.  Joint plot ("jointplot"): The object scores and category quantifications are mapped in the same (two.  Hull plot ("hullplot"): For a particular variable the object scores are mapped onto two dimensions.  Category plot ("catplot"): Plots the rank-restricted category quantifications for each variable separately. showing misclassification. and then compute distances in that plane.  Voronoi plot ("vorplot"): Produces a category plot with Voronoi regions.R-project. plot3d. We can then find out how well we have reconstructed the original data. The predict method works as follows. The convex hulls around the object scores are drawn with respect to each response category of this variable. In the plot method the user can specify the type of plot through the argument plot. some of them are discussed in Michailidis and de Leeuw (1998) and Michailidis and de Leeuw (2001).or three-dimensional) device. In any case we can make a square table of observed versus predicted for each variable. Finally. a joint plot is produced with additional connections between the objects and the corresponding response categories. and thus it also is not necessary to compute the Gram-Schmidt triangular matrix S.  Graph plot ("graphplot"): Basically. As a result. By default the level is set to nominal. the sets argument allows for partitioning the variables into two or more sets in order to perform nonlinear CCA. an object of class "homals" is created and the following methods are provided: print. The level argument (strings) allows to set different scale levels. 4. it is not necessary to compute A. Given a homals solution we can reconstruct the indicator matrix by assigning each object to the closest category point of the variable.org/package=homals.type. The R package homals At this point we show how the models described in the sections above can be computed using the package homals in R (R Development Core Team 2009) available from the Comprehensive R Archive Network at http://CRAN.Journal of Statistical Software 9 In any iteration the loss function does not increase. The core function of the package for the computation of the methodology above is homals().

no level or rank restrictions.  Scree plot ("screeplot"): Produces a scree plot based on the eigenvalues. A full description of the items can be found in the corresponding package help file. rep(TRUE.type = "spanplot". the object scores are mapped on two or three dimensions. economic.  Transformation plot ("trfplot"): Plots variable-wise the original (categorical) scale against the transformed (metric) scale Zj for each solution...  Projection plot ("prjplot"): For variables of rank 1 the object scores (two-dimensional) are projected onto the orthogonal lines determined by the rank restricted category quantifications. Simple homogeneity analysis The first example is a simple (i. var.  Vector plot ("vecplot"): For variables of rank 1 the object scores (two-dimensional) are projected onto a straight line determined by the rank restricted category quantifications. The votes selected cover a full spectrum of domestic. plot. it maps the object scores for each variable and it connects them by the shortest path within each response category.1.e. The data consists of 2001 senate votes on 20 issues selected by Americans for Democratic Action. In addition. Note that if rj > 1 only the first solution is taken. the variable Party containing 50 Republicans vs. active = c(FALSE. As a consequence.subset = 1. 2). sphere = FALSE. plot.10 homals: Gifi Methods for Optimal Scaling in R  Label plot ("labplot"): Similar to object plot. 20)). Due to non-responses we have several missings which are coded as NA. military.  Star plot ("starplot"): Again. . bgpng = NULL) plot(res. environmental and social issues.  Loss plot ("lossplot"): Plots the rank-restricted category quantifications against the unrestricted for each variable separately.dim = c(1. R> R> R> R> R> library("homals") data("senate") res <.type = "objplot". A three-dimensional version is provided. Democrat candidates have many more “yes” responses than Republican candidates.  Discrimination measures ("dmplot"): Plots the discrimination measures for each variable. no sets defined) threedimensional homogeneity analysis for the senate data set (Americans for Democratic Action 2002). the object scores are plotted but for each variable separately with the corresponding category labels.e. ndim = 3) plot3d(res. 49 Democrats and 1 Independent) is inactive and will be used for validation. these points are connected with the category centroid. The first column of the data set (i. We tried to select votes which display sharp liberal/conservative contrasts.  Loadings plot ("loadplot"): Plots the loadings aj and connects them with the origin. foreign.  Span plot ("spanplot"): Like label plot. plot. 4.homals(senate.

The west and the north branches are composed by Republicans. plot. + xlim = c(-2. 3). var. 3).type = "spanplot". + xlim = c(-2. ylim = c(-2. asp = 1) R> plot(res.dim = c(2.dim = c(1. ylim = c(-2. The two-dimensional slices show that Dimension 1 vs. Note that the 3D-plot is rotated in a way that Dimension 3 is horizontally aligned. 3).subset = 1.subset = 1. and Dimension 1 is the one aligned from front to back. asp = 1) Figure 1 shows four branches (or clusters) of senators which we will denote by north. south.type = "spanplot". Dimension 2 is vertically aligned. 3). the east and south branches by Democrats. 3). 2 does not distinguish . 3). 3). ylim = c(-2. asp = 1) R> plot(res. west and east. plot. plot. var. 3).Journal of Statistical Software 11 3 Span plot for Party ● 1 ● ● ●● ● ● ● ●● ●● ● ● ● ●●● ● ● ● ●● ● ●● ●●● ● ● 0 ● ●● ● ● ●● ● ●●● −1 Dimension 2 2 Category (D) Category (I) Category (R) ● ● ●●● ● ● ● ●● ● ● ● ●● ●● −2 ● −2 −1 0 1 2 3 Dimension 1 Category (D) Category (I) Category (R) 2 ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ●● ● ● ● ● ● ●● ● ●● ● 1 ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● −2 ● ●● ● −2 ● ●● ● ● ●● ● ● ● ● ● ● 0 ● ● ●● ● ● ● ● ● ● ● ● 0 ● ●● ● ● ● ● 3 Category (D) Category (I) Category (R) −1 ●● ●● ● ●● ● ● Span plot for Party Dimension 3 2 ●● 1 ● ●● −1 Dimension 3 3 Span plot for Party −2 −1 0 1 Dimension 1 2 3 −2 −1 0 1 2 3 Dimension 2 Figure 1: 3D object plot and span plots for senate data. + xlim = c(-2. plot.

this is the determining factor and not the party affiliation of the senator. Hence. R> p. Warner (R-VA) motion to authorize an additional round of U. plot. Item 19 has to be taken into account: V19: S 1438. It is well known that the response on this item mainly depends on whether there is a military base in the senator’s district or not.res <. the separation between Democrats and Republicans is obvious. those senators who have a military base in their district do not want to close it since such a base provides working places and is an important income source for the district.predict(res) R> p. To distinguish within north-west and south-east.res\$cl.type = "loadplot") Given a (multiple) homals solution.12 homals: Gifi Methods for Optimal Scaling in R Loadings plot ● V19 ● ● ●● ● ●● ● ● ● ● ● ● ● ● 5 6 ● 4 0 2 0 −2 −5 Dimension 3 Party ● Dimension 2 10 ● −4 −10 −6 −8 −4 −2 0 2 4 6 8 Dimension 1 Figure 2: Loadings plot for senate data. between Democrats and Republicans. respectively. as in the two bottom plots in Figure 1. R> plot3dstatic(res. A “yes” vote is a +. we can reconstruct the indicator matrix by assigning each object to the closest points of the variable. South-Wing Democrats and West-Wing Republicans voted with “no”.S. If Dimension 3 is involved. Republicans belonging to the north branch as well as Democrats belonging to the east branch gave a “yes” vote. This result is underpinned by Figure 2 where Item 19 is clearly separated from the remaining items.table\$Party . military base realignment and closures in 2003. Military Base Closures.

advice (teacher categorized the children into 7 possible forms of secondary education.6054 60.e. R> R> + R> R> R> data("galo") res <. FALSE). 2..6310 correctly classified teacher advice results. 4). 4). Predictive models and canonical correlation The sets argument allows for partitioning the variables into sets in order to emulate canonical correlation analysis and predictive models. IQ. plot. The reason is that in simple homals the classification rate is not the criterion to be optimized. A labeled plot is given for the IQs which shows that on the upper half of the horseshoe there are mainly children with IQ-categories 7-9. Ext = extended primary education. Man = manual. 4. The Voronoi plot in Figure 3 shows the Voronoi regions for the same variable.69 4 SES 0.54 3 advice 0. Rate %Cl.3705 37. we use the galo dataset (Peschar 1975) where data of 1290 school children in the sixth grade of an elementary school in the city of Groningen (Netherlands) were collected. the others. The variables are gender.type = "vorplot".subset = 3. including housekeeping. plot.18 2 IQ 0.6318 63. var. 5)) plot(res. If not. i. 3.7969 79. asp = 1) predict(res) Classification rate: Variable Cl.homals(galo. the interpretation in terms of canonical correlation is more appropriate. var. we can put this type of homals model into a predictive modeling context. Gen = general.Journal of Statistical Software 13 pre obs (D) (I) (R) (D) 49 1 0 (I) 0 1 0 (R) 0 9 40 From the classification table we see that 91% of the party affiliations are correctly classified.71 A rate of .subset = 2. if the variables are partitioned into asymmetric sets of one variable vs.2. In this example it could be of interest to predict advice from gender. active = c(rep(TRUE. As outlined above.05 5 School 0. asp = 1) plot(res. Rate 1 gender 0. For . and SES (whereas school is inactive). Grls = secondary school for girls. To demonstrate this. None = no further education. Agr = agricultural. Distinctions between these levels of intelligence are mainly reflected by Dimension 1. IQ (categorized into 9 ordered categories). Uni = preUniversity). Note that in the case of such a simple homals solution it can happen that a lower dimensional solution results in a better classification rate than a higher dimensional. sets = list(c(1.type = "labplot".0171 1. SES (parent’s profession in 6 categories) and school (37 different schools).

3 1.6 6.7 5.3 1.4 1.5 4.2 5.7 1.4 1.7 4.4 4.6 1.9 5.2 1.2 34.1 4.9 4. A hull plot for the response.6 5.1 5.9 6. the aim is to predict species from petal/sepal length/width.7 5 4.5 1.2 1.4 1.1 5.9 5.1 5.3 5.3 4.9 1. a label plot Petal Length and loss plots for all predictors are produced.1 6 −4 −4 virginica −4 −2 0 Dimension 1 2 4 −4 −2 0 2 4 Dimension 1 Figure 4: Hullplot and label plot for iris data.7 1.5 1. The polynomial level constraint is posed on the predictors and the response is treated as nominal.4 5.9 5.8 4.7 1.6 444.3 1.7 5.4 1.5 1. 3 Labplot for Petal. Using the classical iris dataset.Length 3 Hullplot for Species −1 2 1 virginica virginica virginica virginica setosa −2 virginica −2 virginica virginica −3 virginica 0 Dimension 2 0 setosa setosa setosa −3 Dimension 2 1 versicolor versicolor setosa 4.1 5.7 3.3 1.3 3. .14 homals: Gifi Methods for Optimal Scaling in R −2 0 2 4 0 −2 −4 ExtNoneNone Ext None Ext Ext Ext None Ext Ext Ext Ext Man Ext Man NoneExt Man ExtExt None Ext Man None None Ext Ext Man Man Man Ext Man Man None Ext Ext Ext Man Ext Ext Ext Man Ext Man Man Man Man Ext Man Ext Man Ext Man Man Man Ext Ext Man Ext Man Man Ext Man Man Ext Gen Gen Ext Gen Man Man Ext ManMan Man Gen Man Man Gen Ext Ext Man Man Ext Man Man Gen Gen Gen Gen Ext Ext Gen Gen Gen Gen Man Man Gen Man Gen Gen Gen Ext Gen Man Gen Grls Grls Grls Agr GenGenMan Gen Man Gen Gen Agr Gen Man Gen Gen Grls Grls Gen Ext Agr Gen Gen Gen Gen Uni Agr Gen GenUni Grls Gen Gen Grls Agr Uni Grls Agr Gen Gen GenUni Agr Uni Uni Uni Gen Gen Uni Uni Uni Uni Uni Uni Uni Uni Uni Uni Uni Uni Uni Uni Uni Uni 2 Labplot for IQ Dimension 2 0 −2 −4 Dimension 2 2 Voronoi plot for advice 2 3 3 2 3 3 3343 2 2 1 3 4 2 34542 34 44 2 2 1 1 4 53 335 4 44 2 2 3 5 4 4 45 553 34 44 5 55 3 44 4 5 555 5 6 4 3 3 4 6 3 5 6 4 6 5 5 45 65 66 444 56 66477 5 655564 64 44 6 7 5 5 5 55 6 5 7 4 6 66 7 77 7 4 6 6 6 66 8 4 77 677 5 5 7 77 7 77 766 66688 8 6 6 6678778 7 7 77 7 98 989898 98 9898 98 6 −2 0 Dimension 1 2 4 2 6 Dimension 1 Figure 3: Voronoi plot and label plot for galo data.1 4. the lower horseshoe half it can be stated that both dimensions reflect differences in lower IQ-categories.5 1.5 3.3 4.3 1.6 44 4.7 4.4 1.4 4.5 4.6 1.3 5.2 5.9 6.3 4.7 4.4 4.5 5.5 1.1 3.8 5.1 4.6 1.5 1.5 1.9 4.6 5.4 1.64.5 5.6 4.9 4.9 44.7 1.5 5.5 5.6 1.5 4.6 5.5 3.9 3.6 6.5 4.4 5.5 1.3 5.5 1.9 3.2 3.6 1.4 1.55 4.1 5.7 5.1 1 −1 2 versicolor versicolor versicolor versicolor versicolor 5 4.1 5 5.6 6.8 5.4 6.6 5.8 5.8 4.6 1.1 4.1 3.8 66.76.7 6.8 1.6 1.5 4.4 4.8 4.2 3.54.9 1.

6 4.4 4.5 ● ● 2.9 5.4 ● ● 4 ● 4.2 ● ●● 3. 3).8 ●5 1.2 2.4 4.4 2 2.7● ● ● ● 5. cex = 0.1 6.5 0.2 1.2 0.8 4.5 3.5 0.4 3.2 3.2 5.9 ● ●● 5.3 0 Dimension 2 2 0 −4 Dimension 1 4. 5).8 ● 6.5 ● ●●●●●●●4.7.Length Lossplot for Petal.2 4.2 5.3 4. R> R> + R> + R> + data("iris") res <.1 1.4 ● ● ●1.4 ● ● ●● 1.7 5. sets = list(1:4.3 ● ● 3.9 ● 4.4 3. The hullplot in Figure 4 shows that the species are clearly separated on the two-dimensional plane.7 ● 7.6 6.5 ● 1.8 5.9 0.3 4.9 ● ● 5.14. xlim = c(-3.5 3. ylim = c(-4.4 ● ● ●●● ● ●●6.4 ● 1.4 ● ● 4.6 2.3 ● ● 6.6 3.5 −4 −4 4 1.3 −2 Dimension 2 −2 −2 −4 ● 6.3 7.2 2.2 .2●● ● 5.1 6 ●5.6 ● 0.1 1.8 3.3 ● 0.1 ● 2.9 1.9 4● ● ● ● ● ● ●●●3.7 3.1 ● ● 4.3 ● 6.9 3.6 4.4 3.7 ● 2.2 1.3 ● 2.1 4.1 ●● ● 3.6 ● ● 1.2 ● 2.9 6.5 3.7●● 5.1 1.1 32.83.9 ● 5.7 ● ●1.8 5.44.1● ● 5.3 4 4. "nominal").7 ●6.6 ● 1.8 5.1 5.4 5.8 ● 0.6 4.homals(iris.6 2.3 ●● 5.8 ● ● ● ● 3.3 3.3 ● ●● ● ●● ●● ● 4.6 6● ● ● 6.4 ● ● ● 6.24.7 ● 3.9 5.3 3.5 ●6. cex = 0.3 0.5 6.5 3.8 ●● 3.7 ● 6 ● 6.21●●● 5.3 6.6 ● ● ● 6.Width 7 6.7 ● 5.Length 15 Lossplot for Sepal.3 ● ● ● ● ● ●7.1 ● 7.1 ● ●●6.7 ● ● 6.2 1.4 ● 7.5 2. 100% of the iris species are correctly classified.7 ● ● 5.3 ● ● 7.3 ● ● ● 2.3 5.6 1.3 11.1 ●2 ● ● ● 2.Journal of Statistical Software Lossplot for Sepal. 4).6● ● ●5.5 ●4.2 4 4.9 0.4 1. asp = 1) plot(res.6 5.5 ● 1.6 4.7 5.1 ● 1. var.2 3.type = "hullplot".7 6.4 ● ● ● 7.1 5.6 ●● ●● ● 7.4 2.8 ● ● 6.6 ● 2. asp = 1) For this two-dimensional homals solution.4● ● 0.2 5.3 ● ● ● 7.2 5.7 4.8 3.5 1.4 ● 1.4 5.5 1.6 5.1 ●● 6.type = "labplot".subset = 3.4 7.4 4.9 4.9 2.4 1● ● ● 6.54.8 5.9 ● −4 −2 0 2 4 −4 −2 Dimension 1 0 2 4 Dimension 1 Figure 5: Loss plots for iris predictors.9 4.3 ● 5.7 ● 6.3 ●● 2.3 5 1.6 ● ● ● −2 Dimension 2 0 −2 Dimension 2 2 ● ● 5.2● 2 ● 0.4 1 ● 1.8 ● ● ● 2. rank = 2.Width 2 ● ● ● 2.5 2.1 ●● 2.7.6 ● ● ● ● ●1.9 ● ● ●6.1 3.9 3 ●● ● 4.5 1. ylim = c(-4. plot.6 66.3 1.2 ● ● 34.6 7.9 ● ● −4 −4 ● 0 2 4 −2 0 2 Dimension 1 Lossplot for Petal.8 ● 1.subset = 5.7 3.7 3.7 3.6 ● ● ● ●●●7 4.7 0. var.1 5.5 1.9 4. 3).7 4.1 4.6 0 2 2 2. xlim = c(-3.9 7.7 ● 7.4 ● 6.9 3. plot.7 ● 2.5 .5 4.6 ● ● 1.4 ● ● 2.3 ● 1.8 ● 6. 3).1 4.8 6. itermax = 2000) plot(res. level = c(rep("polynomial". In the label plot the object scores are labeled with the response on Petal Length and .5 55 ● ●● 5.9 1.9 ● ● 3 3.2 7. 3).2 ● 2.

main = "Loadings Plot Linear Regression". 10). var. 10).lin.subset = 1:4. 10). and he discusses the systematic and accidental divergences (residuals).lin <. He applied this formula and the estimated constants to 65 experiments carried out by Neumann. sets = list(3.mon <. Iris versicolor have medium lengths. xlim = c(-10. sets = list(3.type = "lossplot". The implication of the polynomial level restriction for the fitted model is obvious. 1:2). The loss plots in Figure 5 show the fitted rank-2 solution (red lines) against the unrestricted solution. level = "ordinal".type = "loadplot".16 homals: Gifi Methods for Optimal Scaling in R 10 Loadings Plot Monotone Regression temperature ● density ● pressure ● 5 Dimension 2 5 ● density 0 pressure 0 Dimension 2 10 Loadings Plot Linear Regression ● −5 −5 ● temperature −10 −5 0 Dimension 1 5 10 −10 −5 0 5 10 Dimension 1 Figure 6: Loading plots for Neumann regression. plot. 3). Constraining the levels to be ordinal. plot. whereas iris virginica are composed by obervations with large petal lengths. In the homals package such a linear regression of density on temperature and pressure can be emulated by setting numerical levels. R> R> + R> + R> + R> + data("neumann") res. asp = 1) plot(res. 10).homals(neumann. asp = 1) To show another homals application of predictive (in this case regression) modeling we use the Neumann dataset (Wilson 1926): Willard Gibbs discovered a theoretical formula connecting the density.homals(neumann. we get a monotone regression (Gifi 1990).mon. main = "Loadings Plot Monotone Regression". it becomes obvious that small lengths form the setosa “cluster”. asp = 1) . R> plot(res. ylim = c(-5. level = "numerical". 1:2). plot. rank = 1) res. and the absolute temperature of a mixture of gases with convertible components. xlim = c(-10.type = "loadplot". rank = 1) plot(res. ylim = c(-5. + xlim = c(-3. the pressure.7. cex = 0. ylim = c(-5. 3).

3 linear monotone −0. monotone transformations are carried out (i.e. NLPCA on Roskam data Roskam (1968) collected preference data where 39 psychologists ranked all nine areas (see Table 1) of the Psychology Department at the University of Nijmengen.05 linear monotone −0.4 linear monotone −0.e. linear regression).15 transformed scale 0. The points in the loadings plot in Figure 6 correspond to regression coefficients. monotone regression)..1 −0.05 0.05 transformed scale 0. Numerical level restrictions lead to linear transformations of the original scale with respect to the homals scaling (i.1 0.Journal of Statistical Software Transformation plot for pressure 0..05 0. The impact of the level restrictions on the scaling is visualized in the transformation plots in Figure 7. Pertaining to ordinal levels.15 Transformation plot for temperature 17 78 100 110 120 130 140 150 160 185 66 93 113 original scale 164 197 243 335 432 original scale −0.3.15 transformed scale 0. 4. .0 0.2 0.15 Transformation plot for temperature 78 100 110 120 130 140 150 160 185 original scale Figure 7: Transformation plots for Neumann regression.

18 homals: Gifi Methods for Optimal Scaling in R SOC EDU CLI MAT EXP CUL IND TST PHY Social Psychology Educational and Developmental Psychology Clinical Psychology Mathematical Psychology and Psychological Statistics Experimental Psychology Cultural Psychology and Psychology of Religion Industrial Psychology Test Construction and Validation Physiological and Animal Psychology Table 1: Psychology areas in Roskam data. Note that the scale level is set to "ordinal".5. 2. 2. clinical and cultural psychology.type = "vecplot". ylim = c(-2. Physiological and animal psychology are somewhat separated from the other areas.5. xlim = c(-2. industrial psychology and test construction (both are close to the former two areas). plot.5). 2.5. + ylim = c(-2. plot. R> data("roskam") R> res <.5. Using this data set we will perform two-dimensional NLPCA by restricting the rank to be 1. the input data structure is a 9 × 39 data frame.5).homals(roskam. main = "Vector Plot Rater 2".5). level = "ordinal") R> plot(res. rank = 1. . Thus. asp = 1) The object plot in Figure 8 shows interesting rating “twins” of departmental areas: mathematical and experimental psychology.5). Note that the objects are the areas and the variables are the psychologists.type = "objplot".subset = 2. xlim = c(-2. educational and social psychology. 2. var. + asp = 1) R> plot(res. 2 Vector Plot Rater 2 2 Plot Object Scores 1 7 6 ● 5 ● 8 ● ● ● 2 −1 CLI CUL ● 3 4 1 2 56 78 3 1 0 Dimension 2 0 EDU SOC −1 Dimension 2 1 MAT EXP IND TST ● 4 ● 99 ● −2 −2 PHY −3 −2 −1 0 1 2 3 −3 −2 Dimension 1 −1 0 1 2 3 Dimension 1 Figure 8: Plots for Roskam data.

8(11). easy-touse routines which allow researchers from different areas to compute. Dunod. Academic Press. 5. Chapman & Hall/CRC. The package offers a broad variety of real-life datasets and furthermore provides numerous methods of visualization. Michailidis G. and NLPCA.). interpret. Americans for Democratic Action (2002). Ligges U.” Journal of Statistical Software. Further analyses of this dataset within a PCA context can be found in de Leeuw (2006). Chichester. cultural and clinical psychology rather than to methodological fields. “scatterplot3d – An R Package for Visualizing Multivariate Data. R package version 0. either in a two-dimensional or in a three-dimensional way. Future enhancements will be to replace indicator matrices by more general B-spline bases and to incorporate weights for observations.org/ v08/i11/. Multivariate Analysis. Gifi A (1990). pp. Discussion In this paper theoretical foundations of the methodology used in the homals package are elaborated and package application and visualization issues are presented. de Leeuw J. and visualize methods belonging to the Gifi family. M¨ achler M (2003). London. Analyse des Donn´ees. “Nonlinear Principal Component Analysis and Related Techniques. Blasius J (2006).R-project. New York. Greenacre MJ. Dekker. NLCCA. 57(1).jstatsoft. Chapman & Hall/CRC. Benz´ecri JP (1973). John Wiley & Sons. homals provides flexible. pp.org/package=rgl. rgl: 3D Visualization Device System (OpenGL). Multiple Correspondence Analysis and Related Methods. 1–20. J Blasius (eds. References Adler D. Boca Raton. Murdoch D (2009). 523–546. Design of Experiments. Wang D (1999). Multiple Correspondence Analysis and Related Methods. 107–134. “Correspondence Analysis Techniques. . predictive models.” In MJ Greenacre. URL http://www. Basically. Similarly. Theory and Applications of Correspondence Analysis. and Survey Sampling. a projection plot could be created. Nonlinear Multivariate Analysis. It can handle missing data and the scale level of the variables can be taken into account. Greenacre MJ (1984).). 1–17.84.” ADA Today. Boca Raton.” In S Ghosh (ed. To conclude. Paris. URL http://CRAN.Journal of Statistical Software 19 Obviously this rater is attracted to areas like social. The vector plot on the right hand side projects the category scores onto a straight line determined by rank restricted category quantifications. “Voting Record: Shattered Promise of Liberal Progress. homals covers the techniques described in Gifi (1990): Homogeneity analysis. de Leeuw J (2006).

” Statistical Science. 38–40. de Leeuw J (1998). 18. “Psychometrics Task View. Meulman JJ (1996). 1–13. “Fitting a Distance Model to Homogeneous Subsets of Variables: Points of View Analysis of Categorical Data. Michailidis G. 435–450. “Homogeneity Analysis with k Sets of Variables: An Alternating Least Squares Method with Optimal Scaling Features. Tjeek Willink. School. R: A Language and Environment for Statistical Computing. 16. University of Leiden. URL http://www.ucla.jstatsoft.” Journal of Classification. Statistical Computing Section.). “Empiricism and Rationalism. Greenacre MJ (2007). with Two. “OVERALS: Nonlinear Canonical Correlation with k Sets of Variables. 1479–1482. 307–336. 20(3). Metric Analysis of Ordinal Data in Psychology. Takane Y (2005). R Foundation for Statistical Computing. thesis. van der Burg E. de Leeuw J. “Correspondence Analysis in R. Wilson EB (1926). 47–57.” Science. American Statistical Association.and ThreeDimensional Graphics: The ca Package. van der Burg E.D. John Wiley & Sons. de Leeuw J (2009).” R News. Peschar JL (1975). URL http: //www.” Psychometrika.” Computational Statistics. 13.ucla. Vienna. pp.stat. Milieu. “The Gifi System of Descriptive Multivariate Analysis. de Leeuw J. Los Angeles E-mail: deleeuw@stat. Chichester.R-project.” In JSM 2008 Proceedings. de Leeuw J (2001). D Howell (eds.20 homals: Gifi Methods for Optimal Scaling in R Mair P. “Data Visualization Through Graph Drawing. R Development Core Team (2009). 64. Ph. Michailidis G.org/doc/Rnews/. 13.” In B Everitt. ISBN 3-900051-07-0. 249–266. “Optimal Scaling.org/. 141–163. 53. Nenadic O.org/v20/i03/. Mair P. Groningen. Verdegaal R (1988). Hatzinger R (2007).” Journal of Statistical Software. Roskam E (1968). Affiliation: Jan de Leeuw Department of Statistics University of California.” Computational Statistics & Data Analysis. Alexandria. 177–197. 7(3).edu URL: http://www.R-project.edu/~deleeuw/ . Beroep. Austria. URL http: //CRAN. Dijksterhuis G (1994). Encyclopedia of Statistics for Behavioral Sciences. “Rank and Set Restrictions for Homogeneity Analysis in R.

ac.at URL: http://statmath. Issue 4 August 2009 http://www.Mair@wu.wu.Journal of Statistical Software 21 Patrick Mair Department of Statistics and Mathematics WU Wirtschaftsuniversit¨ at Wien E-mail: Patrick.ac.org/ http://www.org/ Submitted: 2008-11-06 Accepted: 2009-07-23 .jstatsoft.amstat.at/~mair/ Journal of Statistical Software published by the American Statistical Association Volume 31.