
February 15, 2013

Version 1.2-13
Date 2012-02-19
Title Visualizing Categorical Data
Author David Meyer [aut, cre], Achim Zeileis [aut], Kurt Hornik [aut], Michael Friendly [ctb]
Maintainer David Meyer <David.Meyer@R-project.org>
Description Visualization techniques, data sets, summary and inference procedures aimed particularly at categorical data. Special emphasis is given to highly extensible grid graphics. The package was inspired by the book "Visualizing Categorical Data" by Michael Friendly.
LazyLoad yes
LazyData yes
Depends R (>= 2.4.0), MASS, grid, colorspace
Suggests KernSmooth, mvtnorm, kernlab, HSAUR, coin
Imports stats, utils, MASS, grDevices
License GPL-2
Repository CRAN
Date/Publication 2012-02-20 11:37:19
NeedsCompilation no

R topics documented:

agreementplot
Arthritis
assoc
assocstats
Baseball
BrokenMarriage
Bundesliga
Bundestag2005
Butterfly
cd_plot
CoalMiners
coindep_test
cotabplot
cotab_panel
co_table
DanishWelfare
distplot
doubledecker
Employment
Federalist
fourfold
goodfit
grid_barplot
grid_legend
Hitters
hls
HorseKicks
Hospital
independence_table
JobSatisfaction
JointSports
Kappa
labeling_border
labeling_cells_list
legends
Lifeboats
mar_table
mosaic
MSPatients
NonResponse
oddsratio
Ord_plot
OvaryCancer
Pairs plot panel functions for diagonal cells
Pairs plot panel functions for off-diagonal cells
pairs.table
plot.loglm
PreSex
Punishment
RepVict
Rochdale
rootogram
Saxony
SexualFun
shadings
sieve
SpaceShuttle
spacings
spine
strucplot
structable
struc_assoc
struc_mosaic
struc_sieve
Suicide
table2d_summary
ternaryplot
tile
Trucks
UKSoccer
VisualAcuity
VonBort
WeldonDice
WomenQueue
woolf_test
Index

agreementplot

Bangdiwala’s Observer Agreement Chart

Description

Representation of a k × k confusion matrix, where the observed and expected diagonal elements are represented by superposed black and white rectangles, respectively. The function also computes a statistic measuring the strength of agreement (relation of respective area sums).

Usage

    ## Default S3 method:
    agreementplot(x, reverse_y = TRUE, main = NULL,
                  weights = c(1, 1 - 1/(ncol(x) - 1)^2),
                  margins = par("mar"), newpage = TRUE, pop = TRUE,
                  xlab = names(dimnames(x))[2], ylab = names(dimnames(x))[1],
                  xlab_rot = 0, xlab_just = "center",
                  ylab_rot = 90, ylab_just = "center",
                  fill_col = function(j) gray((1 - (weights[j]) ^ 2) ^ 0.5),
                  line_col = "red", xscale = TRUE, yscale = TRUE, ...)

    ## S3 method for class 'formula'
    agreementplot(formula, data = NULL, ..., subset)

Arguments

x                     a confusion matrix, i.e., a table with equal-sized dimensions.
reverse_y             if TRUE, the y axis is reversed (i.e., the rectangles' positions correspond to the contingency table).
main                  user-specified main title.
weights               vector of weights for successive larger observed areas, used in the agreement strength statistic, and also for the shading. The first element should be 1.
margins               vector of margins (see par).
newpage               logical; if TRUE, the plot is drawn on a new page.
pop                   logical; if TRUE, all newly generated viewports are popped after plotting.
xlab, ylab            labels of x- and y-axis.
xlab_rot, ylab_rot    rotation angle for the category labels.
xlab_just, ylab_just  justification for the category labels.
fill_col              a function, giving the fill colors used for exact and partial agreement.
line_col              color used for the diagonal reference line.
xscale, yscale        logicals indicating whether the marginals should be added on the x-axis/y-axis, respectively.
formula               a formula, such as y ~ x. For details, see xtabs.
data                  a data frame (or list), or a contingency table from which the variables in formula should be taken.
subset                an optional vector specifying a subset of the rows in the data frame to be used for plotting.
...                   further graphics parameters (see par).

Details

Weights can be specified to allow for partial agreement, taking into account contributions from off-diagonal cells. A weight vector of length 1 means strict agreement only; each additional element increases the maximum number of disagreement steps.

Value

Invisibly returned, a list with components

Bangdiwala           the unweighted agreement strength statistic.
Bangdiwala_Weighted  the weighted statistic.
weights              the weight vector used.

Author(s)

David Meyer <David.Meyer@R-project.org>
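The unweighted agreement strength statistic mentioned in the Description (a ratio of area sums) can be sketched in base R. This is a minimal illustration, assuming the usual form of Bangdiwala's statistic, the sum of squared diagonal counts divided by the sum of products of the corresponding marginal totals; the matrix x and the helper bangdiwala() are hypothetical and not part of vcd.

```r
# Hypothetical 3 x 3 confusion matrix (rows: rater A, columns: rater B).
x <- matrix(c(20,  5,  1,
               4, 15,  6,
               2,  3, 24),
            nrow = 3, byrow = TRUE,
            dimnames = list(A = c("low", "mid", "high"),
                            B = c("low", "mid", "high")))

# Assumed form of the unweighted statistic: area of the observed
# diagonal squares divided by the area of the rectangles spanned by
# the marginal totals.
bangdiwala <- function(tab) {
  stopifnot(nrow(tab) == ncol(tab))
  sum(diag(tab)^2) / sum(rowSums(tab) * colSums(tab))
}

bangdiwala(x)
```

Under this definition a table with all counts on the diagonal gives a value of 1, and off-diagonal counts pull the value towards 0.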

References

Michael Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

    data("SexualFun")
    agreementplot(t(SexualFun))

    data("MSPatients")
    ## best visualized using a resized device, e.g., using:
    ## get(getOption("device"))(width = 12)
    pushViewport(viewport(layout = grid.layout(ncol = 2)))
    pushViewport(viewport(layout.pos.col = 1))
    agreementplot(t(MSPatients[,,1]), main = "Winnipeg Patients",
                  newpage = FALSE)
    popViewport()
    pushViewport(viewport(layout.pos.col = 2))
    agreementplot(t(MSPatients[,,2]), main = "New Orleans Patients",
                  newpage = FALSE)
    popViewport(2)
    dev.off()

Arthritis

Arthritis Treatment Data

Description

Data from Koch & Edwards (1988) from a double-blind clinical trial investigating a new treatment for rheumatoid arthritis.

Usage

    data("Arthritis")

Format

A data frame with 84 observations and 5 variables.

ID         patient ID.
Treatment  factor indicating treatment (Placebo, Treated).
Sex        factor indicating sex (Female, Male).
Age        age of patient.
Improved   ordered factor indicating treatment outcome (None, Some, Marked).

Source

Michael Friendly (2000), Visualizing Categorical Data: http://euclid.psych.yorku.ca/ftp/sas/vcd/catdata/arthrit.sas

References

G. Koch & S. Edwards (1988), Clinical efficiency trials with categorical data. In K. E. Peace (ed.), Biopharmaceutical Statistics for Drug Development, 403–451. Marcel Dekker, New York.

M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

    data("Arthritis")
    art <- xtabs(~ Treatment + Improved, data = Arthritis,
                 subset = Sex == "Female")
    art
    mosaic(art, gp = shading_Friendly)
    mosaic(art, gp = shading_max)

assoc

Extended Association Plots

Description

Produce an association plot indicating deviations from a specified independence model in a possibly high-dimensional contingency table.

Usage

    ## Default S3 method:
    assoc(x, row_vars = NULL, col_vars = NULL, compress = TRUE,
          xlim = NULL, ylim = NULL,
          spacing = spacing_conditional(sp = 0), spacing_args = list(),
          split_vertical = NULL, keep_aspect_ratio = FALSE,
          xscale = 0.9, yspace = unit(0.5, "lines"),
          main = NULL, sub = NULL, ...,
          residuals_type = "Pearson", gp_axis = gpar(lty = 3))

    ## S3 method for class 'formula'
    assoc(formula, data = NULL, ..., subset = NULL, na.action = NULL,
          main = NULL, sub = NULL)

Arguments

x                  a contingency table in array form with optional category labels specified in the dimnames(x) attribute, or an object inheriting from the "ftable" class (such as "structable" objects).
row_vars           a vector of integers giving the indices, or a character vector giving the names of the variables to be used for the rows of the association plot.
col_vars           a vector of integers giving the indices, or a character vector giving the names of the variables to be used for the columns of the association plot.
compress           logical; if FALSE, the space between the rows (columns) are chosen such that the total heights (widths) of the rows (columns) are all equal. If TRUE, the space between rows and columns is fixed and hence the plot is more "compressed".
xlim               a 2 × k matrix of doubles, k number of total columns of the plot. The columns of xlim correspond to the columns of the association plot, the rows describe the column ranges (minimums in the first row, maximums in the second row). If xlim is NULL, the ranges are determined from the residuals according to compress (if TRUE: widest range from each column, if FALSE: from the whole association plot matrix).
ylim               a 2 × k matrix of doubles, k number of total rows of the plot. The columns of ylim correspond to the rows of the association plot, the rows describe the column ranges (minimums in the first row, maximums in the second row). If ylim is NULL, the ranges are determined from the residuals according to compress (if TRUE: widest range from each row, if FALSE: from the whole association plot matrix).
spacing            a spacing object, a spacing function, or a corresponding generating function (see strucplot for more information). The default is the spacing-generating function spacing_conditional that is (by default) called with the argument list spacing_args (see spacings for more details).
spacing_args       list of arguments for the spacing-generating function, if specified (see strucplot for more information).
split_vertical     vector of logicals of length k, where k is the number of margins of x (default: FALSE). Values are recycled as needed. A TRUE component indicates that the corresponding dimension is folded into the columns, FALSE folds the dimension into the rows.
keep_aspect_ratio  logical indicating whether the aspect ratio should be fixed or not.
residuals_type     a character string indicating the type of residuals to be computed. Currently, only Pearson residuals are supported.
xscale             scale factor resizing the tile's width, thus adding additional space between the tiles.
yspace             object of class "unit" specifying additional space separating the rows.
gp_axis            object of class "gpar" specifying the visual aspects of the tiles' baseline.
formula            a formula object with possibly both left and right hand sides specifying the column and row variables of the flat table.
data               a data frame, list or environment containing the variables to be cross-tabulated, or an object inheriting from class table.
subset             an optional vector specifying a subset of observations to be used. Ignored if data is a contingency table.
na.action          an optional function which indicates what should happen when the data contain NAs. Ignored if data is a contingency table.
main, sub          either a logical, or a character string used for plotting the main (sub) title. If logical and TRUE, the name of the data object is used.
...                other parameters passed to strucplot.
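An association plot displays, for each cell, the signed Pearson residual under the independence model: observed minus expected count, divided by the square root of the expected count. The following base R sketch computes these residuals for a two-way table; it only illustrates the quantity being plotted and is not the code assoc itself runs. The object names expected and d are chosen for illustration.

```r
# Two-way table from the built-in HairEyeColor data (aggregated over Sex).
x <- margin.table(HairEyeColor, c(1, 2))

# Expected counts under independence of Hair and Eye.
expected <- outer(rowSums(x), colSums(x)) / sum(x)

# Signed Pearson residuals: (observed - expected) / sqrt(expected).
d <- (x - expected) / sqrt(expected)
round(d, 2)

# Their squares add up to Pearson's chi-squared statistic.
sum(d^2)
```

Positive entries of d correspond to cells with more observations than expected under independence, negative entries to cells with fewer.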

Details

Association plots have been suggested by Cohen (1980) and extended by Friendly (1992) and provide a means for visualizing the residuals of an independence model for a contingency table.

assoc is a generic function and currently has a default method and a formula interface. Both are high-level interfaces to the strucplot function, and produce (extended) association plots. Most of the functionality is described there, such as specification of the independence model, labeling, legend, spacing, shading, and other graphical parameters.

For a contingency table, the signed contribution to Pearson's χ2 for cell {ij...k} is

$d_{ij \ldots k} = (f_{ij \ldots k} - e_{ij \ldots k}) / \sqrt{e_{ij \ldots k}}$

where $f_{ij \ldots k}$ and $e_{ij \ldots k}$ are the observed and expected counts corresponding to the cell. In the association plot, each cell is represented by a rectangle that has (signed) height proportional to $d_{ij \ldots k}$ and width proportional to $\sqrt{e_{ij \ldots k}}$, so that the area of the box is proportional to the difference in observed and expected frequencies. The rectangles in each row are positioned relative to a baseline indicating independence ($d_{ij \ldots k} = 0$). If the observed frequency of a cell is greater than the expected one, the box rises above the baseline, and falls below otherwise.

Additionally, the residuals can be colored depending on a specified shading scheme (see Meyer et al., 2003). Package vcd offers a range of residual-based shadings (see the shadings help page). Some of them allow, e.g., the visualization of test statistics.

Unlike the assocplot function in the graphics package, this function allows the visualization of contingency tables with more than two dimensions. Similar to the construction of 'flat' tables (like objects of class "ftable" or "structable"), the dimensions are folded into rows and columns.

The layout is very flexible: the specification of shading, labeling, spacing, and legend is modularized (see strucplot for details).

Value

The "structable" visualized is returned invisibly.

Author(s)

David Meyer <David.Meyer@R-project.org>

References

Cohen, A. (1980), On the graphical display of the significant components in a two-way contingency table. Communications in Statistics - Theory and Methods, A9, 1025–1041.

Friendly, M. (1992), Graphical methods for categorical data. SAS User Group International Conference Proceedings, 17, 190–200. http://datavis.ca/papers/sugi/sugi17.pdf

Meyer, D., Zeileis, A., Hornik, K. (2003), Visualizing independence using extended association plots. Proceedings of the 3rd International Workshop on Distributed Statistical Computing, K. Hornik, F. Leisch, A. Zeileis (eds.), ISSN 1609-395X. http://www.ci.tuwien.ac.at/Conferences/DSC-2003/Proceedings/

Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

See Also

mosaic, strucplot, structable

Examples

    data("HairEyeColor")
    ## Aggregate over sex:
    (x <- margin.table(HairEyeColor, c(1, 2)))

    ## Ordinary assocplot:
    assoc(x)
    ## and with residual-based shading (of independence)
    assoc(x, main = "Relation between hair and eye color", shade = TRUE)

    ## Aggregate over Eye color:
    (x <- margin.table(HairEyeColor, c(1, 3)))
    chisq.test(x)
    assoc(x, main = "Relation between hair color and sex", shade = TRUE)

    # Visualize multi-way table
    assoc(aperm(HairEyeColor), expected = ~ (Hair + Eye) * Sex,
          labeling_args = list(just_labels = c(Eye = "left"),
                               offset_labels = c(right = -0.5),
                               offset_varnames = c(right = 1.2),
                               rot_labels = c(right = 0),
                               tl_varnames = c(Eye = TRUE))
    )

    assoc(aperm(UCBAdmissions), expected = ~ (Admit + Gender) * Dept,
          compress = FALSE,
          labeling_args = list(abbreviate = c(Gender = TRUE), rot_labels = 0)
    )

assocstats

Association Statistics

Description

Computes the Pearson chi-Squared test, the Likelihood Ratio chi-Squared test, the phi coefficient, the contingency coefficient and Cramer's V.

Usage

    assocstats(x)
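The statistics named in the Description have standard textbook definitions that can be written down in a few lines of base R. The sketch below uses those textbook formulas on a hypothetical 2 × 3 table; that assocstats computes exactly these quantities in exactly this way is an assumption, and all object names are illustrative.

```r
# Hypothetical 2 x 3 table of counts.
tab <- matrix(c(30, 10,  8,
                12,  9, 25),
              nrow = 2, byrow = TRUE,
              dimnames = list(Group   = c("A", "B"),
                              Outcome = c("None", "Some", "Marked")))

n        <- sum(tab)
expected <- outer(rowSums(tab), colSums(tab)) / n

# Pearson and likelihood-ratio chi-squared statistics.
x2 <- sum((tab - expected)^2 / expected)
g2 <- 2 * sum(tab * log(tab / expected))   # assumes no zero cells

# Association measures derived from the Pearson statistic
# (textbook definitions).
phi      <- sqrt(x2 / n)
cont     <- sqrt(x2 / (n + x2))
cramer_v <- sqrt(x2 / (n * min(dim(tab) - 1)))

c(X2 = x2, G2 = g2, phi = phi, contingency = cont, cramersV = cramer_v)
```

For a table with only two rows or two columns, as here, Cramer's V coincides with the phi coefficient under these definitions.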

Arguments

x  an r × c table.

Value

A list with components:

chisq_tests  a 2 × 3 table with the chi-squared statistics.
phi          The phi coefficient.
cont         The contingency coefficient.
cramer       Cramer's V.

Author(s)

David Meyer <David.Meyer@R-project.org>

References

Michael Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

    data("Arthritis")
    tab <- xtabs(~Improved + Treatment, data = Arthritis)
    summary(assocstats(tab))

Baseball

Baseball Data

Description

Baseball data.

Usage

    data("Baseball")

Format

A data frame with 322 observations and 25 variables.

name1     player's first name.
name2     player's last name.
atbat86   times at Bat: number of official plate appearances by a hitter. It counts as an official at-bat as long as the batter does not walk, sacrifice, get hit by a pitch or reach base due to catcher's interference.
hits86    hits.
homer86   home runs.
runs86    the number of runs scored by a player. A run is scored by an offensive player who advances from batter to runner and touches first, second, third and home base in that order without being put out.
rbi86     Runs Batted In: A hitter earns a run batted in when he drives in a run via a hit, walk, sacrifice (bunt or fly), fielder's choice, hit-batsman or on an error (when the official scorer rules that the run would have scored anyway).
walks86   A "walk" (or "base on balls") is an award of first base granted to a batter who receives four pitches outside the strike zone.
years     Years in the Major Leagues. Seems to count all years a player has actually played in the Major Leagues, not necessarily consecutive.
atbat     career times at bat.
hits      career hits.
homeruns  career home runs.
runs      career runs.
rbi       career runs batted in.
walks     career walks.
league86  player's league.
div86     player's division.
team86    player's team.
posit86   player's position (see Hitters).
outs86    number of putouts (see Hitters).
assist86  number of assists (see Hitters).
error86   number of errors (see Hitters).
sal87     annual salary on opening day (in USD 1000).
league87  league in 1987.
team87    team in 1987.

Source

SAS System for Statistical Graphics, First Edition, page A2.3

References

M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

See Also

Hitters

Examples

    data("Baseball")


BrokenMarriage

Broken Marriage Data

Description

Data from the Danish Welfare Study about broken marriages or permanent relationships depending on gender and social rank.

Usage

    data("BrokenMarriage")

Format

A data frame with 20 observations and 4 variables.

Freq    frequency.
gender  factor indicating gender (male, female).
rank    factor indicating social rank (I, II, III, IV, V).
broken  factor indicating whether the marriage or permanent relationship was broken (yes, no).

Source

E. B. Andersen (1991), The Statistical Analysis of Categorical Data, page 177.

References

E. B. Andersen (1991), The Statistical Analysis of Categorical Data. 2nd edition. Springer-Verlag, Berlin.

Examples

    data("BrokenMarriage")
    structable(~ ., data = BrokenMarriage)

Bundesliga

Results of the Fussball-Bundesliga (German Soccer League)

Description

Results from the first German soccer league (1963-2008).

Usage

    data("Bundesliga")

Format

A data frame with 14018 observations and 7 variables.

HomeTeam   factor. Name of the home team.
AwayTeam   factor. Name of the away team.
HomeGoals  number of goals scored by the home team.
AwayGoals  number of goals scored by the away team.
Round      round of the game.
Year       year in which the season started.
Date       starting time of the game (in "POSIXct" format).

Details

The data comprises all games in the first German soccer league since its foundation in 1963. The data have been queried online from the official Web page of the DFB and prepared as a data frame in R by Daniel Dekic, Torsten Hothorn, and Achim Zeileis (replacing earlier versions of the data in the package containing only subsets of years).

Each year/season comprises 34 rounds (except 1963, 1964, 1991) so that all 18 teams play twice against each other (switching home court advantage). In 1963/64, there were only 16 teams, hence only 30 rounds. In 1991, after the German unification, there was one season with 20 teams and 38 rounds.

Source

Homepage of the Deutscher Fussball-Bund (DFB, German Football Association): http://www.dfb.de/

References

Leonhard Knorr-Held (1999), Dynamic rating of sports teams. SFB 386 "Statistical Analysis of Discrete Structures", Discussion paper 98.

See Also

UKSoccer

Examples

    data("Bundesliga")

    ## number of goals per game poisson distributed?
    ngoals1 <- xtabs(~ HomeGoals, data = Bundesliga, subset = Year == 1995)
    ngoals2 <- xtabs(~ AwayGoals, data = Bundesliga, subset = Year == 1995)
    ngoals3 <- table(apply(subset(Bundesliga, Year == 1995)[,3:4], 1, sum))

    gf1 <- goodfit(ngoals1)
    gf2 <- goodfit(ngoals2)
    gf3 <- goodfit(ngoals3)

    summary(gf1)
    summary(gf2)
    summary(gf3)

    plot(gf1)
    plot(gf2)
    plot(gf3)

    Ord_plot(ngoals1)
    distplot(ngoals1)
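The question asked in the examples above, whether goals per game follow a Poisson distribution, can also be explored without vcd. The following base R sketch fits a Poisson distribution to a hypothetical frequency table and compares observed with fitted frequencies; the counts are invented for illustration, the last class is treated as exactly 6 goals when estimating the mean, and this is not a description of how goodfit is implemented.

```r
# Hypothetical frequency table: number of games with 0, 1, 2, ... goals.
goals <- 0:6
freq  <- c(58, 91, 78, 47, 21, 8, 3)

# Estimate the Poisson mean by the sample mean of the goal counts.
lambda <- sum(goals * freq) / sum(freq)

# Fitted frequencies; the last cell is made open-ended (6 or more goals)
# so that the fitted values sum to the number of games.
p <- dpois(goals, lambda)
p[length(p)] <- 1 - sum(p[-length(p)])
fitted <- sum(freq) * p

# Pearson chi-squared statistic; one degree of freedom is lost for
# the estimated parameter.
x2 <- sum((freq - fitted)^2 / fitted)
df <- length(freq) - 1 - 1
c(lambda = lambda, X2 = x2, df = df,
  p.value = pchisq(x2, df, lower.tail = FALSE))
```

A small p-value would suggest that a single Poisson distribution does not describe the counts well.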

Bundestag2005

Votes in German Bundestag Election 2005

Description

Number of votes by province in the German Bundestag election 2005 (for the parties that eventually entered the parliament).

Usage

    data("Bundestag2005")

Format

A 2-way "table" giving the number of votes for each party (Fraktion) in each of the 16 German provinces (Bundesland):

No  Name        Levels
1   Bundesland  Schleswig-Holstein, Mecklenburg-Vorpommern, ...
2   Fraktion    SPD, CDU/CSU, Gruene, FDP, Linke

Details

In the election for the German parliament "Bundestag", five parties obtained enough votes to enter the parliament: the social democrats SPD, the conservative CDU/CSU, the liberal FDP, the green party "Die Gruenen" and the leftist party "Die Linke".

The table Bundestag2005 gives the number of votes for each party (Fraktion) in each of the 16 German provinces (Bundesland). The provinces are ordered from North to South.

The data have been obtained from the German statistical office (Statistisches Bundesamt) from the Web page given below.

Note that the number of seats in the parliament cannot be computed from the number of votes alone. The examples below show the distribution of seats that resulted from the election.

Source

Der Bundeswahlleiter, Statistisches Bundesamt. http://www.bundeswahlleiter.de/bundestagswahl2005/

Examples

    ## The outcome of the election in terms of seats in the
    ## parliament was:
    seats <- structure(c(226, 61, 54, 51, 222),
                       .Names = c("CDU/CSU", "FDP", "Linke", "Gruene", "SPD"))

    ## Hues are chosen as metaphors for the political parties
    ## CDU/CSU: blue, FDP: yellow, Linke: purple, Gruene: green, SPD: red
    ## using the respective hues from a color wheel with
    ## chroma = 60 and luminance = 75
    parties <- rainbow_hcl(6, c = 60, l = 75)[c(5, 2, 6, 3, 1)]
    names(parties) <- names(seats)
    parties

    ## The pie chart shows that neither the SPD+Gruene coalition nor
    ## the opposition of CDU/CSU+FDP could assemble a majority.
    ## No party would enter a coalition with the leftists, leading to a
    ## big coalition.
    pie(seats, clockwise = TRUE, col = parties)

    ## The regional distribution of the votes, stratified by province,
    ## is shown in a mosaic display: first for the 10 Western then the
    ## 6 Eastern provinces.
    data("Bundestag2005")
    votes <- Bundestag2005[c(1, 3:5, 9, 11, 13:16, 2, 6:8, 10, 12),
                           c("CDU/CSU", "FDP", "SPD", "Gruene", "Linke")]
    mosaic(votes, gp = gpar(fill = parties[colnames(votes)]),
           spacing = spacing_highlighting, labeling = labeling_left,
           labeling_args = list(rot_labels = c(0, 90, 0, 0),
                                pos_labels = "center",
                                just_labels = c("center", "center", "center", "right"),
                                varnames = FALSE),
           margins = unit(c(2.5, 1, 1, 12), "lines"),
           keep_aspect_ratio = FALSE)

Butterfly

Butterfly Species in Malaya

Description

Data from Fisher et al. (1943) giving the number of tokens found for each of 501 species of butterflies collected in Malaya.

Usage

    data("Butterfly")

Format

A 1-way table giving the number of tokens for 501 species of butterflies. The variable and its levels are

No  Name     Levels
1   nTokens  0, 1, ..., 24

Source

Michael Friendly (2000), Visualizing Categorical Data, pages 21–22.

References

R. A. Fisher, A. S. Corbet, C. B. Williams (1943), The relation between the number of species and the number of individuals, Journal of Animal Ecology, 12, 42–58.

M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

    data("Butterfly")
    Ord_plot(Butterfly)

cd_plot

Conditional Density Plots

Description

Computes and plots conditional densities describing how the distribution of a categorical variable y changes over a numerical variable x.

Usage

    cd_plot(x, ...)

    ## Default S3 method:
    cd_plot(x, y, plot = TRUE, ylab_tol = 0.05,
            bw = "nrd0", n = 512, from = NULL, to = NULL,
            main = "", xlab = NULL, ylab = NULL,
            margins = c(5.1, 4.1, 4.1, 3.1), gp = gpar(),
            name = "cd_plot", newpage = TRUE, pop = TRUE, ...)

    ## S3 method for class 'formula'
    cd_plot(formula, data = list(), plot = TRUE, ylab_tol = 0.05,
            bw = "nrd0", n = 512, from = NULL, to = NULL,
            main = "", xlab = NULL, ylab = NULL,
            margins = c(5.1, 4.1, 4.1, 3.1), gp = gpar(),
            name = "cd_plot", newpage = TRUE, pop = TRUE, ...)

Arguments

x                      an object; the default method expects a single numerical variable.
y                      a "factor" interpreted to be the dependent variable.
formula                a "formula" of type y ~ x with a single dependent "factor" and a single numerical explanatory variable.
data                   an optional data frame.
plot                   logical. Should the computed conditional densities be plotted?
ylab_tol               convenience tolerance parameter for y-axis annotation. If the distance between two labels drops under this threshold, they are plotted equidistantly.
bw, n, from, to, ...   arguments passed to density.
main, xlab, ylab       character strings for annotation.
margins                margins when calling plotViewport.
gp                     a "gpar" object controlling the grid graphical parameters of the rectangles. It should specify in particular a vector of fill colors of the same length as levels(y). The default is to call gray.colors.
name                   name of the plotting viewport.
newpage                logical. Should grid.newpage be called before plotting?
pop                    logical. Should the viewport created be popped?

Details

cd_plot computes the conditional densities of x given the levels of y weighted by the marginal distribution of y. The densities are derived cumulatively over the levels of y.

This visualization technique is similar to spinograms (see spine), but conditional density plots do not discretize the explanatory variable; they use a smoothing approach instead. Furthermore, the original x axis and not a distorted x axis (as for spinograms) is used. This typically results in conditional densities that are based on very few observations in the margins: hence, the estimates are less reliable there.

Value

The conditional density functions (cumulative over the levels of y) are returned invisibly.

Author(s)

Achim Zeileis <Achim.Zeileis@R-project.org>

References

Hofmann, H., Theus, M. (2005), Interactive graphics for visualizing conditional distributions, Unpublished Manuscript.

See Also

spine, density
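The computation described in the Details, conditional densities of x given the levels of y, weighted by the marginal distribution of y and cumulated over the levels, can be sketched with density from base R. This is a minimal sketch on simulated data, assuming a shared bandwidth and a common evaluation grid for all densities; variable names are illustrative and the code is not taken from cd_plot.

```r
set.seed(1)
# Hypothetical data: a numeric x and a two-level factor y.
x <- c(rnorm(60, mean = 55, sd = 8), rnorm(40, mean = 65, sd = 8))
y <- factor(rep(c("no", "yes"), times = c(60, 40)))

# Evaluate every density on a common grid with a common bandwidth.
bw   <- bw.nrd0(x)
from <- min(x) - 3 * bw
to   <- max(x) + 3 * bw
dx   <- density(x, bw = bw, n = 512, from = from, to = to)

# P(y = level | x) = P(y = level) * f(x | level) / f(x)
cond <- sapply(levels(y), function(lev) {
  dl <- density(x[y == lev], bw = bw, n = 512, from = from, to = to)
  mean(y == lev) * dl$y / dx$y
})

# Cumulate over the levels of y; the last column should be close to 1.
cum <- t(apply(cond, 1, cumsum))
range(cum[, ncol(cum)])
```

Each row of cum corresponds to one grid point of x; plotting the columns against dx$x would give the boundaries between the shaded regions of a conditional density plot.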

Examples

    ## Arthritis data
    data("Arthritis")
    cd_plot(Improved ~ Age, data = Arthritis)
    cd_plot(Improved ~ Age, data = Arthritis, bw = 3)
    cd_plot(Improved ~ Age, data = Arthritis, bw = "SJ")
    ## compare with spinogram
    spine(Improved ~ Age, data = Arthritis, breaks = 3)

    ## Space shuttle data
    data("SpaceShuttle")
    cd_plot(Fail ~ Temperature, data = SpaceShuttle, bw = 2)

    ## scatter plot with conditional density
    cdens <- cd_plot(Fail ~ Temperature, data = SpaceShuttle, bw = 2, plot = FALSE)
    plot(I(-1 * (as.numeric(Fail) - 2)) ~ jitter(Temperature, factor = 2),
         data = SpaceShuttle, xlab = "Temperature", ylab = "Failure")
    lines(53:81, cdens[[1]](53:81), col = 2)

CoalMiners

Breathlessness and Wheeze in Coal Miners

Description

Data from Ashford & Snowden (1970) given by Agresti (1990) on the association between two pulmonary conditions, breathlessness and wheeze, in a large sample of coal miners.

Usage

    data("CoalMiners")

Format

A 3-dimensional array resulting from cross-tabulating variables for 16,330 coal miners. The variables and their levels are as follows:

No  Name            Levels
1   Wheeze          W, NoW
2   Breathlessness  B, NoB
3   Age             25-29, 30-34, ..., 60-64

Source

Michael Friendly (2000), Visualizing Categorical Data, pages 82–83, 319–322.

References

A. Agresti (1990), Categorical Data Analysis. Wiley-Interscience, New York.

J. R. Ashford & R. D. Snowdon (1970), Multivariate probit analysis, Biometrics, 26, 535–546.

M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

    data("CoalMiners")

    ## Fourfold display, both margins equated
    fourfold(CoalMiners, mfcol = c(2,4))

    ## Fourfold display, strata equated
    fourfold(CoalMiners, std = "ind.max", mfcol = c(2,4))

    ## Log Odds Ratio Plot
    summary(l <- oddsratio(CoalMiners))
    g <- seq(25, 60, by = 5)
    plot(l, xlab = "Age Group",
         main = "Breathlessness and Wheeze in Coal Miners")
    m <- lm(l ~ g + I(g^2))
    lines(fitted(m), col = "red")

coindep_test

Test for (Conditional) Independence

Description

Performs a test of (conditional) independence of 2 margins in a contingency table by simulation from the marginal distribution of the input table under (conditional) independence.

Usage

    coindep_test(x, margin = NULL, n = 1000,
                 indepfun = function(x) max(abs(x)), aggfun = max,
                 alternative = c("greater", "less"),
                 pearson = TRUE)

Arguments

x            a contingency table.
margin       margin index(es) or corresponding name(s) of the conditioning variables. Each resulting conditional table has to be a 2-way table.
n            number of (conditional) independence tables to be drawn.
indepfun     aggregation function capturing independence in (each conditional) 2-way table.
aggfun       aggregation function aggregating the test statistics computed by indepfun.
alternative  a character string specifying the alternative hypothesis; must be either "greater" (default) or "less" (may be abbreviated).
pearson      logical. Should the table of Pearson residuals under independence be computed and passed to indepfun (default) or the raw table of observed frequencies?
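The simulation idea in the Description can be illustrated for the unconditional case (no conditioning margin) with r2dtable from base R: draw tables with the observed margins under independence, compute the maximum absolute Pearson residual for each, and compare with the observed value. This is a hedged sketch, not the implementation of coindep_test; the helper names pearson_resid and stat, and the way the p value is formed, are assumptions.

```r
set.seed(42)
# A small 2 x 2 table of counts.
tab <- matrix(c(3, 1, 1, 3), nrow = 2,
              dimnames = list(Guess = c("Milk", "Tea"),
                              Truth = c("Milk", "Tea")))

# Pearson residuals of a two-way table under independence.
pearson_resid <- function(tab) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  (tab - expected) / sqrt(expected)
}

# Test statistic: largest absolute Pearson residual.
stat <- function(tab) max(abs(pearson_resid(tab)))

# Simulate tables with the same margins under independence.
n_sim <- 1000
sims  <- r2dtable(n_sim, rowSums(tab), colSums(tab))
dist  <- vapply(sims, stat, numeric(1))

# Share of simulated statistics at least as large as the observed one.
observed <- stat(tab)
p_value  <- mean(dist >= observed)
c(statistic = observed, p.value = p_value)
```

Replacing stat by function(tab) sum(pearson_resid(tab)^2) would give a simulated version of the Pearson chi-squared test in the same framework.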

org> See Also chisq. By default.Zeileis@R-project. the corresponding quantile function (for computing critical values). 1. "Tea")) ) ## compute maximum statistic coindep_test(TeaTasting) ## compute Chi-squared statistic coindep_test(TeaTasting. Alternatively. dimnames = list(Guess = c("Milk". . Other statistics can be computed by changing pearson to FALSE. a character string giving the name(s) of the data. r2dtable Examples TeaTasting <. Value A list of class "coindep_test" inheriting from "htest" with following components: statistic p. observed table of frequencies expected table of frequencies corresponding Pearson residuals the margin used a vector of size n with simulated values of the distribution of the statistic under the null. this uses a (double) maximum statistic of Pearson residuals.test. a character string indicating the type of the test. 3). margin can give several conditioning variables and then conditional independence in the resulting conditional table is tested.test(TeaTasting.matrix(c(3. 1. The function uses r2dtable to simulate the distribution of the test statistic under the null. the corresponding distribution function (for computing p values).value method data. Truth = c("Milk". the p value for the test. By changing indepfun or aggfun a (maximum of) Pearson Chi-squared statistic(s) can be computed or just the usual Pearson Chi-squared statistics and so on.test.20 Details coindep_test If margin is NULL this computes a simple independence statistic in a 2-way table. nr = 2. indepfun = function(x) sum(x^2)) ## use unconditional asymptotic distribution chisq. "Tea").name observed expctd residuals margin dist qdist pdist Author(s) Achim Zeileis <Achim. fisher. correct = FALSE) the value of the test statistic.

    chisq.test(TeaTasting)

    data("UCBAdmissions")

    ## double maximum statistic
    coindep_test(UCBAdmissions, margin = "Dept")

    ## maximum of Chi-squared statistics
    coindep_test(UCBAdmissions, margin = "Dept", indepfun = function(x) sum(x^2))

    ## Pearson Chi-squared statistic
    coindep_test(UCBAdmissions, margin = "Dept", indepfun = function(x) sum(x^2), aggfun = sum)

    ## use unconditional asymptotic distribution
    loglm(~ Dept * (Gender + Admit), data = UCBAdmissions)


cotabplot    Coplot for Contingency Tables

Description

  cotabplot is a generic function for creating trellis-like coplots (conditional plots) for contingency tables.

Usage

    cotabplot(x, ...)

    ## Default S3 method:
    cotabplot(x, cond = NULL, panel = cotab_mosaic, panel_args = list(),
      margins = rep(1, 4), layout = NULL,
      text_gp = gpar(fontsize = 12), rect_gp = gpar(fill = grey(0.9)),
      pop = TRUE, newpage = TRUE, ...)

    ## S3 method for class 'formula'
    cotabplot(formula, data = NULL, ...)

Arguments

  x           an object. The default method can deal with contingency tables in array form.
  cond        margin index(es) or corresponding name(s) of the conditioning variables.
  panel       panel function applied for each conditioned plot, see details.
  panel_args  list of arguments passed to panel if this is a panel-generating function inheriting from class "grapcon_generator".
  margins     either an object of class "unit" of length 4, or a numeric vector of length 4, giving the margins around the whole plot. The elements are recycled as needed.
  layout      integer vector (of length two), giving the number of rows and columns for the panel.
  text_gp     object of class "gpar" used for the text in the panel titles.

  rect_gp     object of class "gpar" used for the rectangles with the panel titles.
  pop         logical indicating whether the generated viewport tree should be removed at the end of the drawing or not.
  newpage     logical controlling whether a new grid page should be created.
  ...         further arguments passed to the panel-generating function.
  formula     a formula specifying the variables used to create a contingency table from data. It has to be of type ~ x + y | z where z is/are the conditioning variable(s) used.
  data        either a data frame, or an object of class "table" or "ftable".

Details

  cotabplot is a generic function designed to create coplots or conditional plots (see Cleveland, 1993, and Becker, Cleveland, Shyu, 1996) similar to coplot but for contingency tables.

  cotabplot takes on computing the conditioning information and setting up the trellis display, and then relies on a panel function to create plots from the full table and the conditioning information. A simple example would be a contingency table tab with margin names "x", "y" and "z". To produce this plot either the default interface can be used or the formula interface via

    cotabplot(tab, "z")
    cotabplot(~ x + y | z, data = tab)

  The panel function needs to be of the form

    panel(x, condlevels)

  where x is the full table (tab in the example above) and condlevels is a named vector with the levels (e.g., c(z = "z1") in the example above).

  Alternatively, panel can also be a panel-generating function of class "grapcon_generator" which creates a function with the interface described above. The panel-generating function is called with the interface

    panel(x, condvars, ...)

  where again x is the full table, condvars is now only a vector with the names of the conditioning variables (and not their levels, e.g., "z" in the example above). Further arguments can be passed to the panel-generating function via ... which also includes the arguments set in panel_args.

  Suitable panel-generating functions for mosaic, association and sieve plots can be found at cotab_mosaic.

  A description of the underlying ideas is given in Zeileis, Meyer, Hornik (2005).

Author(s)

  Achim Zeileis <Achim.Zeileis@R-project.org>

References

  Becker, R.A., Cleveland, W.S., Shyu, M.-J. (1996), The visual design and control of trellis display. Journal of Computational and Graphical Statistics, 5, 123–155.

  Cleveland, W.S. (1993), Visualizing Data, Summit, New Jersey: Hobart Press.

  Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

  Zeileis, A., Meyer, D., Hornik K. (2007), Residual-based shadings for visualizing (conditional) independence, Journal of Computational and Graphical Statistics, 16, 507–525.

See Also

  cotab_mosaic, cotab_coindep, co_table, coindep_test

Examples

    data("UCBAdmissions")

    cotabplot(~ Admit + Gender | Dept, data = UCBAdmissions)
    cotabplot(~ Admit + Gender | Dept, data = UCBAdmissions, panel = cotab_assoc)

    ucb <- cotab_coindep(UCBAdmissions, condvars = "Dept", type = "assoc",
      n = 5000, margins = c(3, 1, 1, 3))
    cotabplot(~ Admit + Gender | Dept, data = UCBAdmissions, panel = ucb)


cotab_panel    Panel-generating Functions for Contingency Table Coplots

Description

  Panel-generating functions visualizing contingency tables that can be passed to cotabplot.

Usage

    cotab_mosaic(x = NULL, condvars = NULL, ...)
    cotab_assoc(x = NULL, condvars = NULL, ylim = NULL, ...)
    cotab_sieve(x = NULL, condvars = NULL, ...)
    cotab_fourfold(x = NULL, condvars = NULL, ...)
    cotab_coindep(x, condvars,
      test = c("doublemax", "maxchisq", "sumchisq"),
      level = NULL, n = 1000, interpolate = c(2, 4),
      h = NULL, c = NULL, l = NULL, lty = 1,
      type = c("mosaic", "assoc"), legend = FALSE, ylim = NULL, ...)

Arguments

  x         a contingency table in array form.
  condvars  margin name(s) of the conditioning variables.
  ylim      y-axis limits for assoc plot. By default this is computed from x.
  test      character indicating which type of statistic should be used for assessing conditional independence.

  level, n, h, c, l, lty, interpolate
            variables controlling the HCL shading of the residuals, see shadings for more details.
  type      character indicating which type of plot should be produced.
  legend    logical. Should a legend be produced in each panel?
  ...       further arguments passed to the plotting function (such as mosaic or assoc or sieve respectively).

Details

  These functions of class "panel_generator" are panel-generating functions for use with cotabplot, i.e., they return functions with the interface

    panel(x, condlevels)

  required for cotabplot. The functions produced by cotab_mosaic, cotab_assoc and cotab_sieve essentially only call co_table to produce the conditioned table and then call mosaic, assoc or sieve respectively with the arguments specified.

  The function cotab_coindep is similar but additionally chooses an appropriate residual-based shading visualizing the associated conditional independence model. The conditional independence test is carried out via coindep_test and the shading is set up via shading_hcl.

  A description of the underlying ideas is given in Zeileis, Meyer, Hornik (2005).

Author(s)

  Achim Zeileis <Achim.Zeileis@R-project.org>

References

  Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

  Zeileis, A., Meyer, D., Hornik K. (2007), Residual-based shadings for visualizing (conditional) independence, Journal of Computational and Graphical Statistics, 16, 507–525.

See Also

  cotabplot, mosaic, assoc, sieve, co_table, coindep_test, shading_hcl

Examples

    data("UCBAdmissions")

    cotabplot(~ Admit + Gender | Dept, data = UCBAdmissions)
    cotabplot(~ Admit + Gender | Dept, data = UCBAdmissions, panel = cotab_assoc)
    cotabplot(~ Admit + Gender | Dept, data = UCBAdmissions, panel = cotab_fourfold)

    ucb <- cotab_coindep(UCBAdmissions, condvars = "Dept", type = "assoc",
      n = 5000, margins = c(3, 1, 1, 3))
    cotabplot(~ Admit + Gender | Dept, data = UCBAdmissions, panel = ucb)

co_table    Compute Conditional Tables

Description

  For a contingency table in array form, compute a list of conditional tables given some margins.

Usage

    co_table(x, margin, collapse = ".")

Arguments

  x         a contingency table in array form.
  margin    margin index(es) or corresponding name(s) of the conditioning variables.
  collapse  character used when collapsing level names (if more than 1 margin is specified).

Details

  This is essentially an interface to [ which is more convenient for arrays of arbitrary dimension.

Value

  A list of the resulting conditional tables.

Author(s)

  Achim Zeileis <Achim.Zeileis@R-project.org>

Examples

    data("HairEyeColor")
    co_table(HairEyeColor, 1)
    co_table(HairEyeColor, c("Hair", "Eye"))
    co_table(HairEyeColor, 1:2, collapse = "")
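To make the remark about [ concrete, the following base R sketch slices an array along one conditioning margin without knowing its number of dimensions in advance. The helper cond_tables is hypothetical and handles a single margin only; how co_table itself is implemented, and how it names the list entries for several margins, is not shown here.

```r
# Hypothetical helper (not part of vcd): one conditional table per level of
# a single conditioning margin, built with a programmatic call to `[`.
cond_tables <- function(x, margin) {
  stopifnot(length(margin) == 1)
  if (is.character(margin)) margin <- match(margin, names(dimnames(x)))
  levs <- dimnames(x)[[margin]]
  out <- lapply(levs, function(lev) {
    idx <- rep(list(TRUE), length(dim(x)))   # TRUE selects a whole dimension
    idx[[margin]] <- lev                     # fix the conditioning level
    do.call(`[`, c(list(x), idx, list(drop = TRUE)))
  })
  names(out) <- levs
  out
}

by_sex <- cond_tables(HairEyeColor, "Sex")
names(by_sex)
dim(by_sex$Male)
```

Building the index list with rep(list(TRUE), ...) is what makes the approach independent of the array's dimension, which is the convenience the Details section refers to.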

DanishWelfare    Danish Welfare Study Data

Description

  Data from the Danish Welfare Study.

Usage

    data("DanishWelfare")

Format

  A data frame with 180 observations and 5 variables.

  Freq     frequency.
  Alcohol  factor indicating daily alcohol consumption: less than 1 unit (<1), 1-2 units (1-2) or more than 2 units (>2). 1 unit is approximately 1 bottle of beer or 4cl 40% alcohol.
  Income   factor indicating income group in 1000 DKK (0-50, 50-100, 100-150, >150).
  Status   factor indicating marriage status (Widow, Married, Unmarried).
  Urban    factor indicating urbanization: Copenhagen (Copenhagen), Suburbian Copenhagen (SubCopenhagen), three largest cities (LargeCity), other cities (City), countryside (Country).

Source

  E. B. Andersen (1991), The Statistical Analysis of Categorical Data, page 205.

References

  E. B. Andersen (1991), The Statistical Analysis of Categorical Data. 2nd edition. Springer-Verlag, Berlin.

Examples

    data("DanishWelfare")
    ftable(xtabs(Freq ~ ., data = DanishWelfare))

distplot    Diagnostic Distribution Plots

Description

  Diagnostic distribution plots: poissonness, binomialness and negative binomialness plots.

Usage

    distplot(x, type = c("poisson", "binomial", "nbinomial"),
      size = NULL, lambda = NULL, legend = TRUE, xlim = NULL, ylim = NULL,
      conf_int = TRUE, conf_level = 0.95, main = NULL,
      xlab = "Number of occurrences", ylab = "Distribution metameter",
      gp = gpar(cex = 0.5), name = "distplot", newpage = TRUE,
      pop = TRUE, ...)

Arguments

  x           either a vector of counts, a 1-way table of frequencies of counts or a data frame or matrix with frequencies in the first column and the corresponding counts in the second column.
  type        a character string indicating the distribution.
  size        the size argument for the binomial and negative binomial distribution. If set to NULL and type is "binomial", then size is taken to be the maximum count. If set to NULL and type is "nbinomial", then size is estimated from the data.
  lambda      parameter of the poisson distribution. If type is "poisson" and lambda is specified a leveled poissonness plot is produced.
  legend      logical. Should a legend be plotted?
  xlim        limits for the x axis.
  ylim        limits for the y axis.
  conf_int    logical. Should confidence intervals be plotted?
  conf_level  confidence level for confidence intervals.
  main        a title for the plot.
  xlab        a label for the x axis.
  ylab        a label for the y axis.
  gp          a "gpar" object controlling the grid graphical parameters of the points.
  name        name of the plotting viewport.
  newpage     logical. Should grid.newpage be called before plotting?
  pop         logical. Should the viewport created be popped?
  ...         further arguments passed to grid.points.
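The "distribution metameter" on the y axis can be illustrated for the Poisson case. The sketch below uses the basic metameter from Hoaglin (1980), log(k! n_k / N), which is linear in the count k with slope log(lambda) and intercept -lambda when the data are Poisson. The function poissonness and its arguments are hypothetical, and distplot may apply additional adjustments, so this is an illustration under that assumption rather than a description of what the package computes.

```r
# Hypothetical helper (not part of vcd): basic poissonness metameter.
# counts: the distinct counts k; freq: how often each count was observed;
# N: total number of observations (defaults to sum(freq)).
poissonness <- function(counts, freq, N = sum(freq)) {
  metameter <- log(factorial(counts) * freq / N)
  fit <- lm(metameter ~ counts)
  list(metameter = metameter,
       slope = unname(coef(fit)[2]),        # estimates log(lambda)
       intercept = unname(coef(fit)[1]))    # estimates -lambda
}

# Frequencies that follow a Poisson(2) distribution exactly
k <- 0:6
exact <- poissonness(k, 1000 * dpois(k, lambda = 2), N = 1000)
exp(exact$slope)      # recovers lambda
-exact$intercept      # recovers lambda as well
```

With real data the points scatter around the line, and systematic curvature is the visual sign of a poor fit.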

Details

  distplot plots the number of occurrences (counts) against the distribution metameter of the specified distribution. If the distribution fits the data, the plot should show a straight line. See Friendly (2000) for details.

Author(s)

  Achim Zeileis <Achim.Zeileis@R-project.org>

References

  D. C. Hoaglin (1980), A poissonness plot, The American Statistician, 34, 146–149.

  D. C. Hoaglin & J. W. Tukey (1985), Checking the shape of discrete distributions. In D. C. Hoaglin, F. Mosteller, J. W. Tukey (eds.), Exploring Data Tables, Trends and Shapes, chapter 9. John Wiley & Sons, New York.

  M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

    ## Simulated data examples:
    dummy <- rnbinom(1000, size = 1.5, prob = 0.8)
    distplot(dummy, type = "nbinomial")

    ## Real data examples:
    data("HorseKicks")
    data("Federalist")
    data("Saxony")
    distplot(HorseKicks, type = "poisson")
    distplot(HorseKicks, type = "poisson", lambda = 0.61)
    distplot(Federalist, type = "poisson")
    distplot(Federalist, type = "nbinomial", size = 1)
    distplot(Federalist, type = "nbinomial")
    distplot(Saxony, type = "binomial", size = 12)


doubledecker    Doubledecker Plot

Description

  This function creates a doubledecker plot visualizing a classification rule.

Usage

    ## S3 method for class 'formula'
    doubledecker(formula, data = NULL, ..., main = NULL)

    ## Default S3 method:
    doubledecker(x, depvar = length(dim(x)),
      margins = c(1, 4, length(dim(x)) + 1, 1),
      gp = gpar(fill = rev(gray.colors(tail(dim(x), 1)))),
      labeling = labeling_doubledecker,
      spacing = spacing_highlighting,
      main = NULL, keep_aspect_ratio = FALSE, ...)

Arguments

  formula   a formula specifying the variables used to create a contingency table from data. The dependent variable is used last for splitting.
  data      either a data frame, or an object of class "table" or "ftable".
  x         a contingency table in array form, with optional category labels specified in the dimnames(x) attribute.
  depvar    dimension index or character string specifying the dependent variable. That will be sorted last in the table.
  margins   margins of the plot. Note that by default, all factor names (except the last one) and their levels are visualized as a block under the plot.
  gp        object of class "gpar" used for the tiles of the last variable.
  labeling  labeling function or corresponding generating function (see strucplot for details).
  spacing   spacing object, spacing function or corresponding generating function (see strucplot for details).
  main      either a logical, or a character string used for plotting the main title. If main is TRUE, the name of the data object is used.
  keep_aspect_ratio
            logical indicating whether the aspect ratio should be maintained or not.
  ...       Further parameters passed to mosaic.

Details

  Doubledecker plots visualize the dependence of one categorical (typically binary) variable on further categorical variables. Formally, they are mosaic plots with vertical splits for all dimensions (antecedents) except the last one, which represents the dependent variable (consequent). The last variable is visualized by horizontal splits, no space between the tiles, and separate colors for the levels.

Value

  The "structable" visualized is returned invisibly.

Author(s)

  David Meyer <David.Meyer@R-project.org>

References

  H. Hoffmann (2001), Generalized odds ratios for visual modeling. Journal of Computational and Graphical Statistics, 10, 4, 628–640.

  Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

See Also

  strucplot, mosaic

Examples

    data("Titanic")
    doubledecker(Titanic)
    doubledecker(Titanic, depvar = "Survived")
    doubledecker(Survived ~ ., data = Titanic)


Employment    Employment Status

Description

  Data from a 1974 Danish study given by Andersen (1991) on the employees who had been laid off. The workers are classified by their employment status on 1975-01-01, the cause of their layoff and the length of employment before they were laid off.

Usage

    data("Employment")

Format

  A 3-dimensional array resulting from cross-tabulating variables for 1314 employees. The variables and their levels are as follows:

  No  Name              Levels
  1   EmploymentStatus  NewJob, Unemployed
  2   EmploymentLength  <1Mo, 1-3Mo, 3-12Mo, 1-2Yr, 2-5Yr, >5Yr
  3   LayoffCause       Closure, Replaced

Source

  Michael Friendly (2000), Visualizing Categorical Data, pages 126–129.

References

  E. B. Andersen (1991), The Statistical Analysis of Categorical Data. Springer-Verlag, Berlin.

  M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

    data("Employment")

    ## Employment Status
    mosaic(Employment,
      expected = ~ LayoffCause * EmploymentLength + EmploymentStatus,
      main = "Layoff*EmployLength + EmployStatus")

    mosaic(Employment,
      expected = ~ LayoffCause * EmploymentLength + LayoffCause * EmploymentStatus,
      main = "Layoff*EmployLength + Layoff*EmployStatus")

    ## Stratified view
    grid.newpage()
    pushViewport(viewport(layout = grid.layout(ncol = 2)))
    pushViewport(viewport(layout.pos.col = 1))

    ## Closure
    mosaic(Employment[,,1], main = "Layoff: Closure", newpage = FALSE)
    popViewport(1)
    pushViewport(viewport(layout.pos.col = 2))

    ## Replaced
    mosaic(Employment[,,2], main = "Layoff: Replaced", newpage = FALSE)
    popViewport(2)


Federalist    'May' in Federalist Papers

Description

  Data from Mosteller & Wallace (1984) investigating the use of certain keywords ('may' in this data set) to identify the author of 12 disputed 'Federalist Papers' by Alexander Hamilton, John Jay and James Madison.

Usage

    data("Federalist")

Format

  A 1-way table giving the number of occurrences of 'may' in 262 blocks of text. The variable and its levels are

  No  Name  Levels
  1   nMay  0, 1, ..., 6

Source

  Michael Friendly (2000), Visualizing Categorical Data, page 19.

References

  F. Mosteller & D. L. Wallace (1984), Applied Bayesian and Classical Inference: The Case of the Federalist Papers. Springer-Verlag, New York, NY.

  M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

    data("Federalist")
    gf <- goodfit(Federalist, type = "nbinomial")
    summary(gf)
    plot(gf)


fourfold    Fourfold Plots

Description

  Creates an (extended) fourfold display of a 2 × 2 × k contingency table, allowing for the visual inspection of the association between two dichotomous variables in one or several populations (strata).

Usage

    fourfold(x,
      color = c("#99CCFF", "#6699CC", "#FFA0A0", "#A0A0FF", "#FF0000", "#000080"),
      conf_level = 0.95, std = c("margins", "ind.max", "all.max"),
      margin = c(1, 2), space = 0.2, main = NULL, sub = NULL,
      mfrow = NULL, mfcol = NULL, extended = TRUE, ticks = 0.15,
      p_adjust_method = p.adjust.methods, newpage = TRUE, fontsize = 12)

Arguments

  x  a 2 × 2 × k contingency table in array form, or a 2 × 2 matrix if k is 1. If length(dim(x)) > 3, dimensions 3:length(dim(x)) are silently raveled into a combined strata dimension with k = prod(dim(x)[-(1:2)]).

  color       a vector of length 6 specifying the colors to use for the smaller and larger diagonals of each 2 × 2 table. The first pair is used for the standard (non-extended) plots, the other two for the extended version: the second/third pair is used for tables with non-significant/significant log-odds ratios, respectively, the latter being visualized in brighter colors.
  conf_level  confidence level used for the confidence rings on the odds ratios. Must be a single non-negative number less than 1; if set to 0, confidence rings are suppressed.
  std         a character string specifying how to standardize the table. Must be one of "margins", "ind.max", or "all.max", and can be abbreviated by the initial letter. If set to "margins", each 2 × 2 table is standardized to equate the margins specified by margin while preserving the odds ratio. If "ind.max" or "all.max", the tables are either individually or simultaneously standardized to a maximal cell frequency of 1.
  margin      a numeric vector with the margins to equate. Must be one of 1, 2, or c(1, 2) (the default), which corresponds to standardizing only the row, only column, or both row and column in each 2 × 2 table. Only used if std equals "margins".
  space       the amount of space (as a fraction of the maximal radius of the quarter circles) used for the row and column labels.
  main, sub   character string for the fourfold plot title/subtitle.
  mfrow, mfcol
              a numeric vector with two components: nr and nc, indicating that the displays for the 2 × 2 tables should be arranged in an nr by nc layout, filled by rows/columns. The defaults are calculated to give a collection of plots in landscape orientation when k is not a perfect square.
  extended    logical; if TRUE, extended plots are plotted, i.e., colors are brighter for significant log-odds ratios, and ticks are plotted showing the direction of association for positive log-odds.
  ticks       the length of the ticks. If set to 0, no ticks are plotted.
  p_adjust_method
              method to be used for p-value adjustments for multi-stratum plots, as provided by p.adjust. The p-values are used for the 'visual' significance tests of the odds ratios. Use p_adjust_method = "none" to disable this adjustment.
  newpage     logical; if TRUE, grid.newpage() is called before plotting.
  fontsize    fontsize of main title. Other labels are scaled relative to this.

Details

  The fourfold display is designed for the display of 2 × 2 × k tables.

  Following suitable standardization, the cell frequencies f_ij of each 2 × 2 table are shown as a quarter circle whose radius is proportional to sqrt(f_ij) so that its area is proportional to the cell frequency. An association (odds ratio different from 1) between the binary row and column variables is indicated by the tendency of diagonally opposite cells in one direction to differ in size from those in the other direction; color is used to show this direction. Confidence rings for the odds ratio allow a visual test of the null of no association; the rings for adjacent quadrants overlap iff the observed counts are consistent with the null hypothesis.
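The quantities behind a single panel can be computed by hand. The sketch below is a simplified illustration for one 2 × 2 table: it derives the sample odds ratio, a standard asymptotic (Wald) confidence interval on the log scale, and quarter-circle radii proportional to the square root of the frequencies after scaling the largest cell to 1. The function fourfold_summary is hypothetical, and the confidence rings actually drawn by fourfold may be constructed differently.

```r
# Hypothetical helper (not part of vcd): numbers behind one fourfold panel.
fourfold_summary <- function(tab, conf_level = 0.95) {
  stopifnot(all(dim(tab) == c(2, 2)), all(tab > 0))
  or <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
  se <- sqrt(sum(1 / tab))                  # asymptotic s.e. of the log odds ratio
  z <- qnorm(1 - (1 - conf_level) / 2)
  list(odds_ratio = or,
       conf_int = exp(log(or) + c(-1, 1) * z * se),
       # radius ~ sqrt(frequency), so that area ~ frequency;
       # the largest cell gets radius 1
       radius = sqrt(tab / max(tab)))
}

adm <- margin.table(UCBAdmissions, c(1, 2))   # Admit x Gender, departments pooled
s <- fourfold_summary(adm)
s$odds_ratio
s$conf_int
s$radius
```

A confidence interval that excludes 1 corresponds to confidence rings for adjacent quadrants that do not overlap.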

  Typically, the number k corresponds to the number of levels of a stratifying variable, and it is of interest to see whether the association is homogeneous across strata. The fourfold display visualizes the pattern of association. Note that the confidence rings for the individual odds ratios are not adjusted for multiple testing.

References

  Friendly, M. (1994), A fourfold display for 2 by 2 by k tables. Technical Report 217, York University, Psychology Department, http://datavis.ca/papers/4fold/4fold.pdf.

  Friendly, M. (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

See Also

  mosaic, assoc

  p.adjust for methods of p value adjustment

Examples

    data("UCBAdmissions")

    ## Use the Berkeley admission data as in Friendly (1995).
    x <- aperm(UCBAdmissions, c(2, 1, 3))
    dimnames(x)[[2]] <- c("Yes", "No")
    names(dimnames(x)) <- c("Sex", "Admit?", "Department")
    ftable(x)

    ## Fourfold display of data aggregated over departments, with
    ## frequencies standardized to equate the margins for admission
    ## and sex.
    ## Figure 1 in Friendly (1994).
    fourfold(margin.table(x, c(1, 2)))

    ## Fourfold display of x, with frequencies in each table
    ## standardized to equate the margins for admission and sex.
    ## Figure 2 in Friendly (1994).
    fourfold(x)
    cotabplot(x, panel = cotab_fourfold)

    ## Fourfold display of x, with frequencies in each table
    ## standardized to equate the margins for admission, but not
    ## for sex.
    ## Figure 3 in Friendly (1994).
    fourfold(x, margin = 2)


goodfit    Goodness-of-fit Tests for Discrete Data

Description

  Fits a discrete (count data) distribution for goodness-of-fit tests.

Usage

    goodfit(x, type = c("poisson", "binomial", "nbinomial"),
      method = c("ML", "MinChisq"), par = NULL)

    ## S3 method for class 'goodfit'
    predict(object, newcount = NULL, type = c("response", "prob"), ...)

Arguments

  x         either a vector of counts, a 1-way table of frequencies of counts or a data frame or matrix with frequencies in the first column and the corresponding counts in the second column.
  type      a character string indicating which distribution should be fit (for goodfit) or indicating the type of prediction (fitted response or probabilities in predict) respectively.
  method    a character string indicating whether the distribution should be fit via ML (Maximum Likelihood) or Minimum Chi-squared.
  par       a named list giving the distribution parameters (named as in the corresponding density function); if set to NULL, the default, the parameters are estimated. If the parameter size is not specified and type is "binomial", it is taken to be the maximum count. If type is "nbinomial", then parameter size can be specified to fix it so that only the parameter prob will be estimated (see the examples below).
  object    an object of class "goodfit".
  newcount  a vector of counts. By default the counts stored in object are used, i.e., the fitted values are computed. These can also be extracted by fitted(object).
  ...       currently not used.

Details

  goodfit essentially computes the fitted values of a discrete distribution (either Poisson, binomial or negative binomial) to the count data given in x. If the parameters are not specified they are estimated either by ML or Minimum Chi-squared.

  To fix parameters, par should be a named list specifying the parameters lambda for "poisson" and prob and size for "binomial" or "nbinomial", respectively. If for "binomial", size is not specified it is not estimated but taken as the maximum count.

  The corresponding Pearson Chi-squared or likelihood ratio statistic, respectively, is computed and given with their p values by the summary method. The summary method always prints this information and returns a matrix with the printed information invisibly. The plot method produces a rootogram of the observed and fitted values.

  In case of count distributions (Poisson and negative binomial), the minimum Chi-squared approach is somewhat ad hoc. Strictly speaking, the Chi-squared asymptotics would only hold if the number of cells were fixed or did not increase too quickly with the sample size. However, in goodfit the number of cells is data-driven: Each count is a cell of its own. All counts larger than the maximal count are merged into the cell with the last count for computing the test statistic.
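The steps described above can be traced by hand for the Poisson case. The sketch below uses the horse kick frequencies as they are commonly tabulated (109, 65, 22, 3 and 1 corps-years with 0 to 4 deaths), estimates lambda by maximum likelihood, merges all larger counts into the last cell, and computes a Pearson Chi-squared statistic. It is a simplified illustration: combining the ML estimate with the Pearson statistic and the degrees of freedom used here are assumptions of this sketch, not necessarily what summary reports for a "goodfit" object.

```r
# Sketch in base R (not the package's code): Poisson fit to count frequencies.
counts <- 0:4
freq <- c(109, 65, 22, 3, 1)             # assumed tabulation of the horse kick data
n <- sum(freq)

lambda_hat <- sum(counts * freq) / n      # ML estimate = sample mean
p <- dpois(counts, lambda_hat)
p[length(p)] <- 1 - sum(p[-length(p)])    # merge all larger counts into the last cell
expected <- n * p

pearson <- sum((freq - expected)^2 / expected)
df <- length(freq) - 1 - 1                # cells - 1 - one estimated parameter
p_value <- pchisq(pearson, df, lower.tail = FALSE)

round(cbind(count = counts, observed = freq, fitted = expected), 2)
c(lambda = lambda_hat, X2 = pearson, df = df, p.value = p_value)
```

The estimate lambda_hat equals 0.61, the value passed as lambda in the leveled poissonness plot example for the same data.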

Value

  A list of class "goodfit" with elements:

  observed  observed frequencies.
  count     corresponding counts.
  fitted    expected frequencies (fitted by ML).
  type      a character string indicating the distribution fitted.
  method    a character string indicating the fitting method (can be either "ML", "MinChisq" or "fixed" if the parameters were specified).
  df        degrees of freedom.
  par       a named list of the (estimated) distribution parameters.

Author(s)

  Achim Zeileis <Achim.Zeileis@R-project.org>

References

  M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

    ## Simulated data examples:
    dummy <- rnbinom(200, size = 1.5, prob = 0.8)
    gf <- goodfit(dummy, type = "nbinomial", method = "MinChisq")
    summary(gf)
    plot(gf)

    dummy <- rbinom(100, size = 6, prob = 0.5)
    gf1 <- goodfit(dummy, type = "binomial", par = list(size = 6))
    gf2 <- goodfit(dummy, type = "binomial", par = list(prob = 0.6, size = 6))
    summary(gf1)
    plot(gf1)
    summary(gf2)
    plot(gf2)

    ## Real data examples:
    data("HorseKicks")
    HK.fit <- goodfit(HorseKicks)
    summary(HK.fit)
    plot(HK.fit)

    data("Federalist")
    ## try geometric and full negative binomial distribution
    F.fit <- goodfit(Federalist, type = "nbinomial", par = list(size = 1))
    F.fit2 <- goodfit(Federalist, type = "nbinomial")
    summary(F.fit)
    summary(F.fit2)
    plot(F.fit)
    plot(F.fit2)

grid_barplot    Barplot

Description

  Bar plots of 1-way tables in grid.

Usage

    grid_barplot(height, width = 0.8, offset = 0,
      names = NULL, xlim = NULL, ylim = NULL, xlab = "", ylab = "", main = "",
      gp = gpar(fill = "lightgray"), name = "grid_barplot",
      newpage = TRUE, pop = FALSE)

Arguments

  height   either a vector or a 1-way table of frequencies.
  width    width of the bars (recycled if needed to the number of bars).
  offset   offset of the bars (recycled if needed to the number of bars).
  names    a vector of names for the bars, if set to NULL the names of height are used.
  xlim     limits for the x axis.
  ylim     limits for the y axis.
  xlab     a label for the x axis.
  ylab     a label for the y axis.
  main     a title for the plot.
  gp       a "gpar" object controlling the grid graphical parameters of the rectangles.
  name     name of the plotting viewport.
  newpage  logical. Should grid.newpage be called before plotting?
  pop      logical. Should the viewport created be popped?

Details

  grid_barplot mimics (some of) the features of barplot, but currently it only supports 1-way tables.

Author(s)

  Achim Zeileis <Achim.Zeileis@R-project.org>

Examples

    grid_barplot(sample(1:6), names = letters[1:6])


grid_legend    Legend Function for grid Graphics

Description

  This function can be used to add legends to grid-based plots.

Usage

    grid_legend(x, y, pch, col, labels, frame = TRUE,
      hgap = unit(0.5, "lines"), vgap = unit(0.3, "lines"),
      default_units = "lines", gp = gpar(), draw = TRUE, title = "Legend:")

Arguments

  x, y           coordinates of the legend
  pch            integer vector of plotting symbols
  col            character vector of colors for the symbols
  labels         character vector of labels corresponding to the symbols
  frame          logical indicating whether the legend should have a border or not.
  hgap           object of class "unit" specifying the space between symbols and labels
  vgap           object of class "unit" specifying the space between the lines
  default_units  character string indicating the default unit
  gp             object of class "gpar" used for the legend
  draw           logical indicating whether the legend should be drawn or not.
  title          character string indicating the plot's title

Value

  Invisibly, the legend as a "grob" object.

Author(s)

  David Meyer <David.Meyer@R-project.org>

See Also

  legend

Examples

    data("Lifeboats")
    attach(Lifeboats)
    ternaryplot(Lifeboats[,4:6],
      pch = ifelse(side == "Port", 1, 19),
      col = ifelse(side == "Port", "red", "blue"),
      id = ifelse(men / total > 0.1, as.character(boat), NA),
      prop_size = 2,
      dimnames_position = "edge",
      main = "Lifeboats on Titanic")
    grid_legend(0.8, 0.9, c(1, 19), c("red", "blue"),
      c("Port", "Starboard"), title = "SIDE")


Hitters    Hitters Data

Description

  This data set is deduced from the Baseball fielding data set: fielding performance basically includes the numbers of Errors, Putouts and Assists made by each player. In order to reduce the number of observations, the data was compressed by calculating the mean number of errors, putouts and assists for each team and for only 6 positions (1B, 2B, 3B, C, OF, SS and UT). In addition, each of these three variables was scaled to a common range by dividing each variable by the maximum of the variable.

Usage

    data("Hitters")

Format

  A data frame with 154 observations and 4 variables.

  Positions  factor indicating the field position (1B=first baseman, 2B=second baseman, 3B=third baseman, C=catcher, OF=outfielder, SS=Short Stop, UT=Utility Players).
  Putouts    occur when a fielder causes an opposing player to be tagged or forced out.
  Assists    are credited to other fielders involved in making that putout.
  Errors     count the errors made by a player.

Source

  SAS System for Statistical Graphics, First Edition, Page A2.3

References

  M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

    data("Hitters")
    attach(Hitters)
    colors <- c("black","red","green","blue","red","black","blue")
    pch <- substr(levels(Positions), 1, 1)
    ternaryplot(Hitters[,2:4],
      pch = as.character(Positions),
      col = colors[as.numeric(Positions)],
      main = "Baseball Hitters Data")
    grid_legend(0.8, 0.9, pch, colors, levels(Positions),
      title = "POSITION(S)")
    detach(Hitters)


hls    HLS Color Specification

Description

  Create a HLS color from specifying hue, luminance and saturation.

Usage

    hls(h = 1, l = 0.5, s = 1)

Arguments

  h  hue value in [0, 1].
  l  luminance value in [0, 1].
  s  saturation value in [0, 1].

Details

  HLS colors are a similar specification of colors as HSV colors, but using hue/luminance/saturation rather than hue/saturation/value.

Author(s)

  Achim Zeileis <Achim.Zeileis@R-project.org>

See Also

  hsv, hcl2hex, polarLUV

Examples

    ## an HLS color wheel
    pie(rep(1, 12), col = sapply(1:12/12, function(x) hls(x)))

HorseKicks    Death by Horse Kicks

Description

  Data from von Bortkiewicz (1898), given by Andrews & Herzberg (1985), on number of deaths by horse or mule kicks in 10 (of 14 reported) corps of the Prussian army. 4 corps were not considered by Fisher (1925) as they had a different organization. This data set is a popular subset of the VonBort data.

Usage

    data("HorseKicks")

Format

  A 1-way table giving the number of deaths in 200 corps-years. The variable and its levels are

  No  Name     Levels
  1   nDeaths  0, 1, ..., 4

Source

  Michael Friendly (2000), Visualizing Categorical Data, page 18.

References

  D. F. Andrews & A. M. Herzberg (1985), Data: A Collection of Problems from Many Fields for the Student and Research Worker. Springer-Verlag, New York, NY.

  R. A. Fisher (1925), Statistical Methods for Research Workers. Oliver & Boyd, London.

  L. von Bortkiewicz (1898), Das Gesetz der kleinen Zahlen. Teubner, Leipzig.

  M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

See Also

  VonBort

Examples

    data("HorseKicks")
    gf <- goodfit(HorseKicks)
    summary(gf)
    plot(gf)

Hospital    Hospital data

Description

The table relates the length of stay (in years) of 132 long-term schizophrenic patients in two London mental hospitals with the frequency of visits.

Usage

data("Hospital")

Format

A 2-dimensional array resulting from cross-tabulating 132 patients. The variables and their levels are as follows:

No  Name             Levels
1   Visit Frequency  Regular, Less than monthly, Never
2   Length of Stay   2–9 years, 10–19 years, 20+ years

Details

Wing (1962), who collected this data, concludes that the longer the length of stay in hospital, the less frequent the visits. Haberman (1974) notes that this pattern does not increase from the "Less than monthly" to the "Never" group, which are homogeneous.

Source

S. J. Haberman (1974): Log-linear models for frequency tables with ordered classifications. Biometrics, 30:689–700.

References

J. K. Wing (1962): Institutionalism in mental hospitals. British Journal of Social Clinical Psychology, 1:38–51.

Examples

data("Hospital")
mosaic(t(Hospital), shade = TRUE)
mosaic(Hospital, shade = TRUE)
sieve(Hospital, shade = TRUE)
assoc(Hospital, shade = TRUE)

independence_table    Independence Table

Description

Computes a table of expected frequencies (under the null hypothesis of independence) from an n-way table.

Usage

independence_table(x, frequency = c("absolute", "relative"))

Arguments

x          a table.
frequency  indicates whether absolute or relative frequencies should be computed.

Value

A table with either absolute or relative frequencies.

Author(s)

David Meyer <David.Meyer@R-project.org>

Examples

data("MSPatients")
independence_table(MSPatients)
independence_table(MSPatients, frequency = "relative")

JobSatisfaction    Job Satisfaction Data

Description

Data from Petersen (1968) about the job satisfaction of 715 blue collar workers, selected from Danish Industry in 1968.

Usage

data("JobSatisfaction")

Format

A data frame with 8 observations and 4 variables.

Freq        frequency.
management  factor indicating quality of management (bad, good).
supervisor  factor indicating supervisor's job satisfaction (low, high).
own         factor indicating worker's own job satisfaction (low, high).

Source

E. B. Andersen (1991), The Statistical Analysis of Categorical Data, Table 5.4.

References

E. B. Andersen (1991), The Statistical Analysis of Categorical Data. 2nd edition. Springer-Verlag, Berlin.
E. B. Petersen (1968), Job Satisfaction in Denmark. (In Danish). Mentalhygiejnisk Forlag, Copenhagen.

Examples

data("JobSatisfaction")
structable(~ ., data = JobSatisfaction)
mantelhaen.test(xtabs(Freq ~ own + supervisor + management, data = JobSatisfaction))

JointSports    Opinions About Joint Sports

Description

Data from a Danish study in 1983 and 1985 about sports activities and the opinion about joint sports with the other gender among 16–19 year old high school students.

Usage

data("JointSports")

Format

A data frame with 40 observations and 5 variables.

Freq     frequency.
opinion  factor indicating opinion about sports joint with the other gender (very good, good, indifferent, bad, very bad).
year     factor indicating year of study (1983, 1985).
grade    factor indicating school grade (1st, 3rd).
gender   factor indicating gender (Boy, Girl).

Source

E. B. Andersen (1991), The Statistical Analysis of Categorical Data, page 210.

References

E. B. Andersen (1991), The Statistical Analysis of Categorical Data. 2nd edition. Springer-Verlag, Berlin.

Examples

data("JointSports")
tab <- xtabs(Freq ~ gender + opinion + grade + year, data = JointSports)
doubledecker(opinion ~ gender + year + grade, data = tab)
loglm(~ opinion * (gender + grade + year) + gender * year * grade, data = tab)

Kappa    Cohen's Kappa and Weighted Kappa

Description

Computes two agreement rates: Cohen's kappa and weighted kappa, and confidence bands.

Usage

Kappa(x, weights = c("Equal-Spacing", "Fleiss-Cohen"))

Arguments

x        a confusion matrix.
weights  either one of the character strings given in the default value, or a user-specified matrix with same dimensions as x.

Details

Cohen's kappa is the diagonal sum of the (possibly weighted) relative frequencies, corrected for expected values and standardized by its maximum value. The equal-spacing weights are defined by $1 - |i - j| / (r - 1)$, where $r$ is the number of columns/rows, and the Fleiss-Cohen weights by $1 - |i - j|^2 / (r - 1)^2$. The latter attaches greater importance to near disagreements.

Value

An object of class "Kappa" with three components:

Unweighted  numeric vector of length 2 with the kappa statistic (value component), along with Approximate Standard Error (ASE component).
Weighted    idem for the weighted kappa.
Weights     numeric matrix with weights used.

Note

The summary method also prints the weights. There is a confint method for computing approximate confidence intervals.

Author(s)

David Meyer <David.Meyer@R-project.org>

References

Cohen, J. (1960), A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46.
Everitt, B.S. (1968), Moments of statistics kappa and weighted kappa. The British Journal of Mathematical and Statistical Psychology, 21, 97–103.
Fleiss, J.L., Cohen, J., and Everitt, B.S. (1969), Large sample standard errors of kappa and weighted kappa. Psychological Bulletin, 72, 323–327.

See Also

agreementplot, confint

Examples

data("SexualFun")
Kappa(SexualFun)
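The definition in the Details can be written out in a few lines of base R. This is a minimal sketch under the formulas stated there, not the package's own code: `kappa_sketch` is a hypothetical name, it returns only the point estimate, and it computes no standard errors.

```r
# Point estimate of (weighted) kappa for a square confusion matrix x.
# Hypothetical helper for illustration; Kappa() also reports the ASE.
kappa_sketch <- function(x, weights = c("none", "equal", "fleiss")) {
  weights <- match.arg(weights)
  r <- nrow(x)
  stopifnot(r == ncol(x), r >= 2)
  p <- x / sum(x)                                   # relative frequencies
  d <- abs(outer(seq_len(r), seq_len(r), "-"))      # |i - j|
  w <- switch(weights,
              none   = diag(r),                     # only exact agreement counts
              equal  = 1 - d / (r - 1),             # equal-spacing weights
              fleiss = 1 - d^2 / (r - 1)^2)         # Fleiss-Cohen weights
  observed <- sum(w * p)                            # weighted observed agreement
  expected <- sum(w * outer(rowSums(p), colSums(p)))# agreement expected by chance
  (observed - expected) / (1 - expected)            # standardize by the maximum
}

# Two raters, two categories: 35 of 50 cases on the diagonal
ratings <- matrix(c(20, 5, 10, 15), nrow = 2)
kappa_sketch(ratings)
```

For the 2 x 2 table above, the observed agreement is 0.7 and the chance agreement is 0.5, so kappa is 0.2 / 0.5 = 0.4.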

labeling_border    Labeling Functions for Strucplots

Description

These functions generate labeling functions used for strucplots.

Usage

labeling_border(labels = TRUE, varnames = labels,
  set_labels = NULL, set_varnames = NULL,
  tl_labels = NULL, alternate_labels = FALSE, tl_varnames = NULL,
  gp_labels = gpar(fontsize = 12),
  gp_varnames = gpar(fontsize = 12, fontface = 2),
  rot_labels = c(0, 90, 0, 90), rot_varnames = c(0, 90, 0, 90),
  pos_labels = "center", pos_varnames = "center",
  just_labels = "center", just_varnames = pos_varnames,
  boxes = FALSE, fill_boxes = FALSE,
  offset_labels = c(0, 0, 0, 0), offset_varnames = offset_labels,
  labbl_varnames = NULL, labels_varnames = FALSE, sep = ": ",
  abbreviate_labs = FALSE, rep = TRUE, clip = FALSE, ...)
labeling_values(value_type = c("observed", "expected", "residuals"),
  suppress = NULL, digits = 1, clip_cells = FALSE, ...)
labeling_residuals(suppress = NULL, digits = 1, clip_cells = FALSE, ...)
labeling_conditional(...)
labeling_left(rep = FALSE, pos_varnames = "left",
  pos_labels = "left", just_labels = "left", ...)
labeling_left2(tl_labels = TRUE, clip = TRUE, pos_varnames = "left",
  pos_labels = "left", just_labels = "left", ...)
labeling_cboxed(tl_labels = TRUE, boxes = TRUE, clip = TRUE,
  pos_labels = "center", ...)
labeling_lboxed(tl_labels = FALSE, boxes = TRUE, clip = TRUE,
  pos_labels = "left", just_labels = "left",
  labbl_varnames = FALSE, ...)
labeling_doubledecker(lab_pos = c("bottom", "top"), dep_varname = TRUE,
  boxes = NULL, clip = NULL, labbl_varnames = FALSE,
  rot_labels = rep.int(0, 4),
  pos_labels = c("left", "center", "left", "center"),
  just_labels = c("left", "left", "left", "center"),
  varnames = NULL, gp_varnames = gpar(fontsize = 12, fontface = 2),
  offset_varnames = c(0, -0.6, 0, 0), tl_labels = NULL, ...)

Arguments

labels    vector of logicals indicating whether labels should be drawn for a particular dimension.

varnames          vector of logicals indicating whether variable names should be drawn for a particular dimension.
set_labels        An optional list with named components of character vectors replacing the labels of the so-specified variables. The component names must exactly match the variable names whose labels should be replaced.
set_varnames      An optional character vector with named components replacing the so-specified variable names. The component names must exactly match the variable names to be replaced.
tl_labels         vector of logicals indicating whether labels should be positioned on top (column labels) / left (row labels) for a particular dimension.
alternate_labels  vector of logicals indicating whether labels should be alternated on the top/bottom (left/right) side of the plot for a particular dimension.
tl_varnames       vector of logicals indicating whether variable names should be positioned on top (column labels) / on left (row labels) for a particular dimension.
gp_labels         list of objects of class "gpar" used for drawing the labels.
gp_varnames       list of objects of class "gpar" used for drawing the variable names.
rot_labels        vector of rotation angles for the labels for each of the four sides of the plot.
rot_varnames      vector of rotation angles for the variable names for each of the four sides of the plot.
pos_labels        character string of label positions ("left", "center", "right") for each of the variables.
pos_varnames      character string of variable names positions ("left", "center", "right") for each of the four sides of the plot.
just_labels       character string of label justifications ("left", "center", "right") for each of the variables.
just_varnames     character string of variable names justifications ("left", "center", "right") for each of the four sides of the plot.
boxes             vector of logicals indicating whether boxes should be drawn around the labels for a particular dimension.
fill_boxes        Either a vector of logicals, or a vector of characters, or a list of such vectors, specifying the fill colors for the boxes. "TRUE" and "FALSE" values are transformed into "grey" and "white", respectively. If fill_boxes is atomic, each component specifies a basic color for the corresponding dimension. This color is transformed into its HSV representation, and the value is varied from 50% to 100% to give a sequential color palette for the levels. For NA components, no palette is produced (no fill color). If fill_boxes is a list of vectors, each vector specifies the level colors of the corresponding dimension.
offset_labels, offset_varnames
                  numeric vector of length 4 indicating the offset of the labels (variable names) for each of the four sides of the plot.
labbl_varnames    vector of logicals indicating whether variable names should be drawn on the left (column variables) / on top (row variables) of the corresponding labels.

labels_varnames  vector of logicals indicating, for each dimension, whether the variable name should be added to the corresponding labels or not.
sep              separator used if any component of "labels_varnames" is TRUE.
abbreviate_labs  vector of integers or logicals indicating, for each dimension, the number of characters the labels should be abbreviated to. TRUE means 1 character, FALSE causes no abbreviation. Values are recycled as needed.
rep              vector of logicals indicating, for each dimension, whether labels should be repeated for all conditioning strata, or appear only once.
clip             vector of logicals indicating, for each dimension, whether labels should be clipped to not overlap.
lab_pos          character string switching between "top" or "bottom" position of the labels (only used for labeling_doubledecker).
dep_varname      logical or character string. If logical, this indicates whether the name of the dependent variable should be printed or not. A character string will be printed instead of the variable name taken from the dimnames.
value_type       character string specifying which values should be displayed in the cells.
suppress         numeric vector of length 2 specifying an interval of values that are not displayed. 0 values are never displayed. A single number, k, is treated as c(-k, k). The default for labeling residuals is c(-2, 2). Use suppress = 0 to show all non-zero values.
digits           integer specifying the number of digits used for rounding.
clip_cells       logical indicating whether the values should be clipped at the cell borders.
...              only used for labeling_conditional and labeling_doubledecker: parameters passed to labeling_cells and labeling_border.

Details

These functions generate labeling functions called by strucplot for their side-effect of adding labels to the plot. They suppose that a strucplot has been drawn and the corresponding viewport structure is pushed, since the positions of the viewports are used for the label positioning. Note that the functions can also be used 'stand-alone' as shown in the examples.

All values supplied to vectorized arguments can be 'abbreviated' by using named components which override the default component values. In addition, these defaults can be overloaded by the sequence of non-named components which are recycled as needed (see examples).

This help page only documents labeling_border and derived functions; more functions are described on the help page for labeling_cells and labeling_list.

labeling_left, labeling_left2, labeling_cboxed, and labeling_lboxed are really just wrappers to labeling_border, and good examples for the parameter usage.

labeling_residuals is a trivial wrapper for labeling_values, which in turn calls labeling_border by additionally adding the observed or expected frequencies or residuals to the cells.
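The rule for vectorized arguments described in the Details (unnamed components are recycled over all dimensions, named components then override single dimensions) can be mimicked in base R. The sketch below only illustrates that rule as it is worded here: `expand_arg` is a hypothetical helper, not a function of the package, and the internal implementation may differ.

```r
# Expand a partially specified argument to one value per dimension.
# Hypothetical helper illustrating the documented rule; not part of vcd.
expand_arg <- function(arg, dim_names, default) {
  # start from the default, one entry per dimension
  out <- setNames(rep(default, length.out = length(dim_names)), dim_names)
  nms <- names(arg)
  if (is.null(nms)) nms <- rep("", length(arg))
  # unnamed components overload the defaults and are recycled as needed
  unnamed <- arg[nms == ""]
  if (length(unnamed) > 0) out[] <- rep(unnamed, length.out = length(out))
  # named components override the entries of the matching dimensions
  named <- arg[nms != ""]
  if (length(named) > 0) out[names(named)] <- named
  out
}

dims <- c("Class", "Sex", "Age", "Survived")
expand_arg(c(Survived = TRUE), dims, default = FALSE)     # only Survived is TRUE
expand_arg(c(TRUE, Sex = FALSE), dims, default = FALSE)   # all TRUE except Sex
```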

Value

A function with arguments:

d               "dimnames" attribute from the visualized contingency table, or the visualized table itself from which the "dimnames" attributes will then be extracted.
split_vertical  vector of logicals indicating the split directions.
condvars        integer vector of conditioning dimensions.

Author(s)

David Meyer <David.Meyer@R-project.org>

References

Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

See Also

labeling_cells, labeling_list, structable, grid.text

Examples

data("Titanic")
mosaic(Titanic)
mosaic(Titanic, labeling = labeling_left)
labeling_left
mosaic(Titanic, labeling = labeling_cboxed)
labeling_cboxed
mosaic(Titanic, labeling = labeling_lboxed)
labeling_lboxed
data("PreSex")
mosaic(~ PremaritalSex + ExtramaritalSex | Gender + MaritalStatus,
       data = PreSex, labeling = labeling_conditional)
## specification of vectorized arguments
mosaic(Titanic, labeling_args = list(abbreviate_labs = c(Survived = TRUE)))
mosaic(Titanic, labeling_args = list(clip = TRUE, boxes = TRUE,
       fill_boxes = c(Survived = "green", "red")))
mosaic(Titanic, labeling_args = list(clip = TRUE, boxes = TRUE,
       fill_boxes = list(Sex = "red", "green")))
mosaic(Titanic, labeling_args = list(clip = TRUE, boxes = TRUE,

       fill_boxes = list(Sex = c(Male = "red", "blue"), "green")))
## change variable names
mosaic(Titanic, labeling_args = list(set_varnames = c(Sex = "Gender")))
## change labels
mosaic(Titanic, labeling_args = list(set_varnames = c(Survived = "Status"),
       set_labels = list(Survived = c("Survived", "Not Survived")), rep = FALSE))
## show frequencies
mosaic(Titanic, labeling = labeling_values)

labeling_cells_list    Labeling Functions for Strucplots

Description

These functions generate labeling functions that produce labels for strucplots.

Usage

labeling_cells(labels = TRUE, varnames = TRUE,
  abbreviate_labels = FALSE, abbreviate_varnames = FALSE,
  gp_text = gpar(), lsep = ": ", lcollapse = "\n",
  just = "center", pos = "center", rot = 0,
  margin = unit(0.5, "lines"), clip_cells = TRUE,
  text = NULL, ...)
labeling_list(gp_text = gpar(), just = "left", pos = "left",
  lsep = ": ", sep = " ", offset = unit(c(2, 2), "lines"),
  varnames = TRUE, cols = 2, ...)

Arguments

labels               vector of logicals indicating, for each dimension, whether labels for the factor levels should be drawn or not. Values are recycled as needed.
varnames             vector of logicals indicating, for each dimension, whether variable names should be drawn. Values are recycled as needed.
abbreviate_labels    vector of integers or logicals indicating, for each dimension, the number of characters the labels should be abbreviated to. TRUE means 1 character, FALSE causes no abbreviation. Values are recycled as needed.
abbreviate_varnames  vector of integers or logicals indicating, for each dimension, the number of characters the variable (i.e., dimension) names should be abbreviated to. TRUE means 1 character, FALSE causes no abbreviation. Values are recycled as needed.
gp_text              object of class "gpar" used for the text drawn.
lsep                 character that separates variable names from the factor levels.

sep         character that separates the factor levels (only used for labeling_list).
offset      object of class "unit" of length 2 specifying the offset in x- and y-direction of the text block drawn under the strucplot (only used for labeling_list).
cols        number of text columns (only used for labeling_list).
lcollapse   character that separates several variable name/factor level combinations. Typically a line break. (Only used for labeling_cells.)
just, pos   character string of length 1 (labeling_list) or at most 2 (labeling_cells) specifying the labels' horizontal position and justification (horizontal and vertical for labeling_cells).
rot         rotation angle in degrees, used for all labels (only used for labeling_cells).
margin      object of class "unit" (a numeric value is converted to "lines") specifying an offset from the cell borders (only used for labeling_cells).
clip_cells  logical indicating whether text should be clipped at the cell borders (only used for labeling_cells).
text        Optionally, a character table of the same dimensions as the contingency table whose entries will then be used instead of the labels. NA entries are not drawn. This allows custom cell annotations (see examples). Only used for labeling_cells.
...         Currently not used.

Details

These functions generate labeling functions that can add different kinds of labels to an existing plot. Typically they are supplied to strucplot which then generates and calls the labeling function. They assume that a strucplot has been drawn and the corresponding viewport structure is pushed, so that by navigating through the viewport tree the labels can be positioned appropriately.

This help page only documents labeling_list and labeling_cells; more functions are described on the help page for labeling_border.

The functions can also be used 'stand-alone' as shown in the examples.

Using labeling_list will typically necessitate a bottom margin adjustment.

Value

A function with arguments:

d               "dimnames" attribute from the visualized contingency table, or the visualized table itself from which the "dimnames" attributes will then be extracted.
split_vertical  vector of logicals indicating the split directions.
condvars        integer vector of conditioning dimensions.

Author(s)

David Meyer <David.Meyer@R-project.org>

References

Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

See Also

labeling_border, structable, grid.text

Examples

data("Titanic")
mosaic(Titanic, labeling = labeling_cells)
mosaic(Titanic, labeling = labeling_list)
## A more complex example, adding the observed frequencies
## to a mosaic plot:
tab <- ifelse(Titanic < 6, NA, Titanic)
mosaic(Titanic, pop = FALSE)
labeling_cells(text = tab, margin = 0)(Titanic)

legends    Legend Functions for Strucplots

Description

These functions generate legend functions for residual-based shadings.

Usage

legend_resbased(fontsize = 12, x = unit(1, "lines"), y = unit(0.1, "npc"),
  height = unit(0.8, "npc"), width = unit(0.7, "lines"),
  digits = 3, check_overlap = TRUE, text = NULL,
  steps = 200, ticks = 10, pvalue = TRUE, range = NULL)
legend_fixed(fontsize = 12, x = unit(1, "lines"), y = NULL,
  height = NULL, width = unit(1.5, "lines"), steps = 200,
  digits = 1, space = 0.05, text = NULL, range = NULL)

Arguments

fontsize       fontsize of title and p-value text.
x, y           objects of class "unit" indicating the coordinates of the title. For legend_fixed, the default for y is computed as to leave enough space for the specified text.
height, width  object of class "unit" indicating the height/width of the legend. For legend_fixed, the default for y is computed as to align upper margins of legend and actual plot.

digits         number of digits for the scale labels.
check_overlap  logical indicating whether overlap of scale labels should be inhibited or not.
space          For legend_fixed only: proportion of space between the tiles.
text           character string indicating the title of the legend.
steps          granularity of the color gradient.
ticks          number of scale ticks.
pvalue         logical indicating whether the p-value should be visualized under the scale or not.
range          Numeric vector of length 2 for setting the legend range. Computed from the residuals if omitted. NA values are replaced by the corresponding minimum / maximum of the residuals.

Details

These functions generate legend functions for residual-based shadings, visualizing deviations from expected values of an hypothesized independence model. Therefore, the legend uses a supplied shading function to visualize the color gradient for the residuals range. legend_fixed is inspired by the legend used in mosaicplot.

For more details on the shading functions and their return values, see shadings.

Value

A function with arguments:

residuals  residuals from the fitted independence model to be visualized.
shading    shading function computing colors from residuals (see details).
autotext   character vector indicating the title to be used when no text argument is specified. Allows strucplot to generate sensible defaults depending on the residuals type.

Author(s)

David Meyer <David.Meyer@R-project.org>

References

Meyer, D., Zeileis, A., Hornik, K. (2003), Visualizing independence using extended association plots. In: K. Hornik, F. Leisch, A. Zeileis (eds.), Proceedings of the 3rd International Workshop on Distributed Statistical Computing. ISSN 1609-395X. http://www.ci.tuwien.ac.at/Conferences/DSC-2003/Proceedings/
Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

See Also

structable, shadings

Examples

data("Titanic")
mosaic(Titanic, shade = TRUE, legend = legend_resbased)
mosaic(Titanic, shade = TRUE, legend = legend_fixed, gp = shading_Friendly)

Lifeboats    Lifeboats on the Titanic

Description

Data from Mersey (1912) about the 18 (out of 20) lifeboats launched before the sinking of the S. S. Titanic.

Usage

data("Lifeboats")

Format

A data frame with 18 observations and 8 variables.

launch  launch time in "POSIXt" format.
side    factor. Side of the boat.
boat    factor indicating the boat.
crew    number of male crew members on board.
men     number of men on board.
women   number of women (including female crew) on board.
total   total number of passengers.
cap     capacity of the boat.

Source

M. Friendly (2000), Visualizing Categorical Data: http://euclid.psych.yorku.ca/ftp/sas/vcd/catdata/lifeboat.sas

References

L. Mersey (1912), Report on the loss of the "Titanic" (S. S.). Parliamentary command paper 6452.
M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

data("Lifeboats")
attach(Lifeboats)
ternaryplot(
  Lifeboats[, 4:6],
  pch = ifelse(side == "Port", 1, 19),
  col = ifelse(side == "Port", "red", "blue"),
  id = ifelse(men / total > 0.1, as.character(boat), NA),
  prop_size = 2,
  dimnames_position = "edge",
  main = "Lifeboats on the Titanic"
)
grid_legend(0.8, 0.9, c(1, 19), c("red", "blue"), c("Port", "Starboard"), title = "SIDE")
detach(Lifeboats)

mar_table    Table with Marginal Sums

Description

Adds row and column sums to a two-way table.

Usage

mar_table(x)

Arguments

x    a two-way table.

Value

A table with row and column totals added.

Author(s)

David Meyer <David.Meyer@R-project.org>

Examples

data("SexualFun")
mar_table(SexualFun)
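For readers without the package at hand, the same kind of result can be sketched in base R. The helper name `add_totals` and the margin label "TOTAL" are choices made for this illustration only; the labels used by mar_table() may differ.

```r
# Append row and column totals to a two-way table or matrix.
# Hypothetical helper for illustration; label "TOTAL" is an assumption.
add_totals <- function(x) {
  stopifnot(length(dim(x)) == 2)
  with_rows <- cbind(x, TOTAL = rowSums(x))          # extra column: row sums
  rbind(with_rows, TOTAL = c(colSums(x), sum(x)))    # extra row: column sums and grand total
}

counts <- matrix(1:4, nrow = 2,
                 dimnames = list(rater1 = c("a", "b"), rater2 = c("u", "v")))
add_totals(counts)
```

Base R's addmargins() computes the same totals and also handles tables with more than two dimensions.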

mosaic    Extended Mosaic Plots

Description

Plots (extended) mosaic displays.

Usage

## Default S3 method:
mosaic(x, condvars = NULL,
  split_vertical = NULL, direction = NULL, spacing = NULL,
  spacing_args = list(), gp = NULL, expected = NULL, shade = NULL,
  highlighting = NULL, highlighting_fill = grey.colors,
  highlighting_direction = NULL,
  zero_size = 0.5, zero_split = FALSE, zero_shade = NULL,
  zero_gp = gpar(col = 0), panel = NULL, main = NULL, sub = NULL, ...)
## S3 method for class 'formula'
mosaic(formula, data, highlighting = NULL,
  ..., main = NULL, sub = NULL, subset = NULL, na.action = NULL)

Arguments

x           a contingency table in array form, with optional category labels specified in the dimnames(x) attribute, or an object of class "structable".
condvars    vector of integers or character strings indicating conditioning variables, if any. The table will be permuted to order them first.
formula     a formula specifying the variables used to create a contingency table from data. For convenience, conditioning formulas can be specified; the conditioning variables will then be used first for splitting. If any, a specified response variable will be highlighted in the cells.
data        either a data frame, or an object of class "table" or "ftable".
subset      an optional vector specifying a subset of observations to be used.
na.action   a function which indicates what should happen when the data contain NAs. Ignored if data is a contingency table.
zero_size   size of the bullets used for zero entries (if 0, no bullets are drawn).
zero_split  logical controlling whether zero cells should be further split. If FALSE and zero_shade is FALSE, only one bullet is drawn (centered) for unsplit zero cells. If FALSE and zero_shade is TRUE, a bullet for each zero cell is drawn to allow, e.g., residual-based shadings to be effective also for zero cells.
zero_shade  logical controlling whether zero bullets should be shaded. The default is TRUE if shade is TRUE or expected is not null or gp is not null, and FALSE otherwise.
zero_gp     object of class "gpar" used for zero bullets in case they are not shaded.

index. list of arguments for the generating function. an array of expected values of the same dimension as x. if any.mosaic 59 split_vertical vector of logicals of length k . gp. highlighting_direction Either "left". spacing spacing_args gp shade expected highlighting character vector or integer specifying a variable to be highlighted in the cells. or "bottom" specifying the direction of highlighting in the cells. highlighting_fill color vector or palette function used for a highlighted variable. Ignored if shade = FALSE. spacing object. optionally. observed. If TRUE and expected is unspeciﬁed. . The default is spacing_equal if x has two dimensions. "top". where k is the number of margins of x (default: FALSE). If logical and TRUE. object of class "gpar". and spacing_conditional if conditioning variables are speciﬁed using condvars or the formula interface. gp a gpar object for the tile. and else the total independence model. the name of the data object is used. mosaicplot is a base graphics implementation and mosaic is a much more ﬂexible and extensible grid implementation.. or corresponding generating function (see strucplot for more information). if speciﬁed (see strucplot for more information). spacing function. logical specifying whether gp should be used or not (see gp). For each component. and name called by the struc_mosaic workhorse for each tile that is drawn in the mosaic. "right". direction character vector of length k . spacing_increase for more dimensions. whereas "v" indicates vertical split(s). Other arguments passed to strucplot main. a corresponding conditional independence model. or alternatively the corresponding independence model speciﬁcation as used by loglin or loglm (see strucplot). Details Mosaic displays have been suggested in the statistical literature by Hartigan and Kleiner (1984) and have been extended by Friendly (1994). a default model is ﬁtted: if condvars (see strucplot) is speciﬁed.. 
shading function or a corresponding generating function (see details and shadings). or a character string used for plotting the main (sub) title. Ignored if direction is not NULL. expected. A TRUE component indicates that the tile(s) of the corresponding dimension should be split vertically. Components of "gpar" objects are recycled as needed along the last splitting dimension. a value of "h" indicates that the tile(s) of the corresponding dimension should be split horizontally. index is an integer vector with the tile’s coordinates in the contingency table. and name a label to be assigned to the drawn grid object. panel Optional function with arguments: residuals. either a logical. Values are recycled as needed. where k is the number of margins of x (values are recycled as needed). FALSE means horizontal splits. sub .

mosaic is a generic function which currently has a default method and a formula interface. Both are high-level interfaces to the strucplot function, and produce (extended) mosaic displays. Most of the functionality is described there, such as specification of the independence model, labeling, legend, spacing, shading, and other graphical parameters.

A mosaic plot is an area proportional visualization of a (possibly higher-dimensional) table of expected frequencies. It is composed of tiles (corresponding to the cells) created by recursive vertical and horizontal splits of a square. The area of each tile is proportional to the corresponding cell entry, given the dimensions of previous splits.

An extended mosaic plot, in addition, visualizes the fit of a particular log-linear model. Typically, this is done by residual-based shadings where color and/or outline of the tiles visualize sign, size and possibly significance of the corresponding residual.

The layout is very flexible: the specification of shading, labeling, spacing, and legend is modularized (see strucplot for details).

In contrast to the mosaicplot function in graphics, the splits start with the horizontal direction by default to match the printed output of structable.

Value

The "structable" visualized is returned invisibly.

Author(s)

David Meyer <David.Meyer@R-project.org>

References

Hartigan, J.A., and Kleiner, B. (1984), A mosaic of television ratings. The American Statistician, 38, 32–35.
Emerson, J. W. (1998), Mosaic displays in S-PLUS: A general implementation and a case study. Statistical Computing and Graphics Newsletter (ASA), 9, 1, 17–23.
Friendly, M. (1994), Mosaic displays for multi-way contingency tables. Journal of the American Statistical Association, 89, 190–200.
Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot", package = "vcd").

The home page of Michael Friendly (http://datavis.ca) provides information on various aspects of graphical methods for analyzing categorical data, including mosaic plots. In particular, there are many materials for his course "Visualizing Categorical Data with SAS and R" at http://datavis.ca/courses/VCD/.

See Also

assoc, strucplot, mosaicplot, structable, doubledecker
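The quantities behind a residual-based shading can be computed by hand. The following base R sketch uses the HairEyeColor data shipped with R, collapsed over Sex, and the total independence model; it illustrates the idea only and does not reproduce how strucplot or the shading functions work internally.

```r
# Two-way table of hair by eye color, summed over sex
tab <- margin.table(HairEyeColor, c(1, 2))
n   <- sum(tab)

# Expected frequencies under independence: product of the margins divided by n
expected <- outer(rowSums(tab), colSums(tab)) / n

# Pearson residuals: their sign and size are what a residual-based shading encodes
pearson <- (tab - expected) / sqrt(expected)

# Share of the unit square taken by each tile in a plot of the observed table
area_share <- tab / n

round(pearson, 2)
```

Cells with a large positive residual occur more often than independence predicts, cells with a large negative residual less often; the sum of the squared residuals is the Pearson chi-squared statistic.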

Examples

data("Titanic")
mosaic(Titanic)
## Formula interface for tabulated data plus shading and legend:
mosaic(~ Sex + Age + Survived, data = Titanic,
       main = "Survival on the Titanic", shade = TRUE, legend = TRUE)
data("HairEyeColor")
mosaic(HairEyeColor, shade = TRUE)
## Independence model of hair and eye color and sex. Indicates that
## there are significantly more blue eyed blond females than expected
## in the case of independence (and too few brown eyed blond females).
mosaic(HairEyeColor, shade = TRUE, expected = list(c(1,2), 3))
## Model of joint independence of sex from hair and eye color. Males
## are underrepresented among people with brown hair and eyes, and are
## overrepresented among people with brown hair and blue eyes, but not
## "significantly".
## Formula interface for raw data: visualize crosstabulation of numbers
## of gears and carburettors in Motor Trend car data.
data("mtcars")
mosaic(~ gear + carb, data = mtcars, shade = TRUE)
data("PreSex")
mosaic(PreSex, condvars = c(1,4))
mosaic(~ ExtramaritalSex + PremaritalSex | MaritalStatus + Gender, data = PreSex)
## Highlighting:
mosaic(Survived ~ ., data = Titanic)
data("Arthritis")
mosaic(Improved ~ Treatment | Sex, data = Arthritis, zero_size = 0)
mosaic(Improved ~ Treatment | Sex, data = Arthritis, zero_size = 0,
       highlighting_direction = "right")

MSPatients    Diagnosis of Multiple Sclerosis

Description

Data from Westlund & Kurland (1953) on the diagnosis of multiple sclerosis (MS): two samples of patients, one from Winnipeg and one from New Orleans, were each rated by two neurologists (one from each city) in four diagnostic categories.

Usage

data("MSPatients")

Format
A 3-dimensional array resulting from cross-tabulating 218 observations on 3 variables. The variables and their levels are as follows:

No  Name                     Levels
1   New Orleans Neurologist  Certain, Probable, Possible, Doubtful
2   Winnipeg Neurologist     Certain, Probable, Possible, Doubtful
3   Patients                 Winnipeg, New Orleans

Source
M. Friendly (2000), Visualizing Categorical Data: http://euclid.psych.yorku.ca/ftp/sas/vcd/catdata/msdiag.sas

References
K. B. Westlund & L. T. Kurland (1953), Studies on multiple sclerosis in Winnipeg, Manitoba and New Orleans, Louisiana, American Journal of Hygiene, 57, 380–396.
M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples
data("MSPatients")
## best visualized using a resized device, e.g. using:
## get(getOption("device"))(width = 12)
pushViewport(viewport(layout = grid.layout(ncol = 2)))
pushViewport(viewport(layout.pos.col = 1))
agreementplot(t(MSPatients[,,1]), main = "Winnipeg Patients", newpage = FALSE)
popViewport()
pushViewport(viewport(layout.pos.col = 2))
agreementplot(t(MSPatients[,,2]), main = "New Orleans Patients", newpage = FALSE)
popViewport(2)
dev.off()

NonResponse

Non-Response Survey Data

Description
Data about non-response for a Danish survey in 1965.

Usage
data("NonResponse")

Format
A data frame with 12 observations and 4 variables.
Freq       frequency.
residence  factor indicating whether residence was in Copenhagen, in a city outside Copenhagen or at the countryside (Copenhagen, City, Country).
response   factor indicating whether a response was given (yes, no).
gender     factor indicating gender (male, female).

Source
E. B. Andersen (1991), The Statistical Analysis of Categorical Data. Table 5.17.

References
E. B. Andersen (1991), The Statistical Analysis of Categorical Data. 2nd edition. Springer-Verlag, Berlin.

Examples
data("NonResponse")
structable(~ ., data = NonResponse)

oddsratio

(Log) Odds Ratios

Description
Computes (log) odds ratios and their asymptotic standard errors for (possibly) stratified data.

Usage
oddsratio(x, stratum = NULL, log = TRUE)
## S3 method for class 'oddsratio'
plot(x, conf_level = 0.95, type = "o", xlab = NULL, ylab = NULL, xlim = NULL, ylim = NULL, whiskers = 0.1, baseline = TRUE, transpose = FALSE, ...)

Arguments
x           a 2 by 2 by ... table.
stratum     vector of strata dimensions.
log         if FALSE, ordinary odds ratios are computed.
conf_level  if not NULL or FALSE, conf_level-% confidence intervals are plotted for each data point.
type        plot type.

xlab        label for the x-axis. Defaults to "Strata" if transpose is FALSE.
ylab        label for the y-axis. Defaults to "Strata" if transpose is TRUE.
xlim        x-axis limits. Ignored if transpose is FALSE.
ylim        y-axis limits. Ignored if transpose is TRUE.
baseline    if TRUE, a red dashed line is plotted at a value of 1 (in case of odds) or 0 (in case of log-odds).
transpose   if TRUE, the plot is transposed.
whiskers    width of the confidence interval whiskers.
...         other graphics parameters (see par).

Value
An object of class "oddsratio", which is simply a vector of (log) odds ratios with dimensionality depending on stratum, along with the following attributes:
ASE  a numeric vector with the asymptotic standard errors.
log  logical indicating whether log odds ratios or common odds ratios are computed.

Note
In case of zero entries, 0.5 will be added to the table.

The summary method prints the standard errors and, for log odds ratios, also computes and prints asymptotic z tests (standardized log odds ratios) and the corresponding p values. The plot method plots (log) odds ratios, computed by oddsratio for 2 × 2 × k tables, along with confidence intervals. There is a confint method for computing confidence intervals for the (log) odds ratios.

Author(s)
David Meyer <David.Meyer@R-project.org>

References
M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

See Also
confint

Examples
## load Coal Miners data
data("CoalMiners")
## compute log odds ratios
lor <- oddsratio(CoalMiners)
lor
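The quantities described above can be sketched by hand for a single 2 × 2 table. This is a minimal base R illustration under stated assumptions, not the package's own code: the helper name log_odds_ratio is hypothetical, and it uses the usual textbook formulas (log of the cross-product ratio, standard error as the square root of the summed reciprocal cell counts, and a two-sided z test).

```r
# Hypothetical helper (not part of vcd): log odds ratio, asymptotic standard
# error, z statistic and two-sided p value for one 2 x 2 table.
log_odds_ratio <- function(tab) {
  stopifnot(all(dim(tab) == c(2, 2)))
  # As in the Note: add 0.5 to the table if it contains zero entries.
  if (any(tab == 0)) tab <- tab + 0.5
  lor <- log(tab[1, 1] * tab[2, 2] / (tab[1, 2] * tab[2, 1]))
  ase <- sqrt(sum(1 / tab))
  z <- lor / ase
  c(lor = lor, ase = ase, z = z, p = 2 * pnorm(-abs(z)))
}

# Department A of the UCBAdmissions table that ships with base R.
res <- log_odds_ratio(UCBAdmissions[, , 1])
res
```

For a stratified 2 × 2 × k table the same helper could be applied to each stratum in turn, for example with apply(UCBAdmissions, 3, log_odds_ratio).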

## summary with z tests
summary(lor)
## confidence intervals
confint(lor)
## visualization
plot(lor, xlab = "Age Group", main = "Breathlessness and Wheeze in Coal Miners")
## add quadratic model
g <- seq(25, 60, by = 5)
m <- lm(lor ~ g + I(g^2))
lines(fitted(m), col = "red")

Ord_plot

Ord Plots

Description
Ord plots for diagnosing discrete distributions.

Usage
Ord_plot(obj, legend = TRUE, estimate = TRUE, tol = 0.1, type = NULL, xlim = NULL, ylim = NULL, xlab = "Number of occurrences", ylab = "Frequency ratio", main = "Ord plot", gp = gpar(cex = 0.5), name = "Ord_plot", newpage = TRUE, pop = TRUE, ...)
Ord_estimate(x, type = NULL, tol = 0.1)

Arguments
obj       either a vector of counts, a 1-way table of frequencies of counts or a data frame or matrix with frequencies in the first column and the corresponding counts in the second column.
legend    logical. Should a legend be plotted?
estimate  logical. Should the distribution and its parameters be estimated from the data? See details.
tol       tolerance for estimating the distribution. See details.
type      a character string indicating the distribution, must be one of "poisson", "binomial", "nbinomial" or "log-series" or NULL. In the latter case the distribution is estimated from the data. See details.
xlim      limits for the x axis.
ylim      limits for the y axis.

xlab      a label for the x axis.
ylab      a label for the y axis.
main      a title for the plot.
gp        a "gpar" object controlling the grid graphical parameters of the points.
name      name of the plotting viewport.
newpage   logical. Should grid.newpage be called before plotting?
pop       logical. Should the viewport created be popped?
...       further arguments passed to grid.points.
x         a vector giving intercept and slope for the (fitted) line in the Ord plot.

Details
The Ord plot plots the number of occurrences against a certain frequency ratio (see Friendly (2000) for details) and should give a straight line if the data come from a Poisson, binomial, negative binomial or log-series distribution. The intercept and slope of this straight line convey information about the underlying distribution.
Ord_plot fits a usual OLS line (black) and a weighted OLS line (red). From the coefficients of the latter the distribution is estimated by Ord_estimate as described in Table 2.10 in Friendly (2000). To judge whether a coefficient is positive or negative a tolerance given by tol is used. If none of the distributions fits well, no parameters are estimated. Be careful with the conclusions from Ord_estimate as it implements just some simple heuristics!

Value
A vector giving the intercept and slope of the weighted OLS line.

Author(s)
Achim Zeileis <Achim.Zeileis@R-project.org>

References
J. K. Ord (1967), Graphical methods for a class of discrete distributions, Journal of the Royal Statistical Society, A 130, 232–238.
Michael Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples
## Simulated data examples:
dummy <- rnbinom(1000, size = 1.5, prob = 0.8)
Ord_plot(dummy)

## Real data examples:
data("HorseKicks")
data("Federalist")
data("Butterfly")
data("WomenQueue")
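The straight-line property mentioned in the Details can be checked by hand. The sketch below is a base R illustration only: it assumes Ord's frequency ratio k * n_k / n_(k-1) and feeds it idealised frequencies that are exactly proportional to a Poisson probability function, so the ratio is constant and equal to the Poisson mean. It fits only an ordinary least-squares line; the weighted fit used by Ord_plot is not reproduced here.

```r
# Idealised frequencies: exactly proportional to a Poisson(lambda) pmf.
lambda <- 2
k <- 0:8
n_k <- 1000 * dpois(k, lambda)

# Ord's frequency ratio k * n_k / n_(k-1), defined for k >= 1.
kk <- k[-1]
ratio <- kk * n_k[-1] / n_k[-length(n_k)]

# Ordinary least-squares line through the points (kk, ratio).
fit <- lm(ratio ~ kk)
coef(fit)  # intercept close to lambda, slope close to zero
```

With real counts the points scatter around the line, which is why a tolerance such as tol is needed to decide whether a slope counts as zero, positive or negative.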

grid.newpage()
pushViewport(viewport(layout = grid.layout(2, 2)))
pushViewport(viewport(layout.pos.col=1, layout.pos.row=1))
Ord_plot(HorseKicks, main = "Death by horse kicks", newpage = FALSE)
popViewport()
pushViewport(viewport(layout.pos.col=1, layout.pos.row=2))
Ord_plot(Federalist, main = "Instances of 'may' in Federalist papers", newpage = FALSE)
popViewport()
pushViewport(viewport(layout.pos.col=2, layout.pos.row=1))
Ord_plot(Butterfly, main = "Butterfly species collected in Malaya", newpage = FALSE)
popViewport()
pushViewport(viewport(layout.pos.col=2, layout.pos.row=2))
Ord_plot(WomenQueue, main = "Women in queues of length 10", newpage = FALSE)
popViewport(2)

OvaryCancer

Ovary Cancer Data

Description
Data from Obel (1975) about a retrospective study of ovary cancer carried out in 1973. Information was obtained from 299 women, who were operated for ovary cancer 10 years before.

Usage
data("OvaryCancer")

Format
A data frame with 16 observations and 5 variables.
Freq       frequency.
stage      factor indicating the stage of the cancer at the time of operation (early, advanced).
operation  factor indicating type of operation (radical, limited).
survival   factor indicating survival status after 10 years (yes, no).
xray       factor indicating whether X-ray treatment was received (yes, no).

Source
E. B. Andersen (1991), The Statistical Analysis of Categorical Data. Table 6.4.

References
E. B. Andersen (1991), The Statistical Analysis of Categorical Data. 2nd edition. Springer-Verlag, Berlin.
E. B. Obel (1975), A Comparative Study of Patients with Cancer of the Ovary Who Have Survived More or Less Than 10 Years. Acta Obstetricia et Gynecologica Scandinavica, 55, 429-439.

Examples
data("OvaryCancer")
tab <- xtabs(Freq ~ xray + survival + stage + operation, data = OvaryCancer)
ftable(tab, col.vars = "survival", row.vars = c("stage", "operation", "xray"))

## model: ~ xray * operation * stage + survival * stage
## interpretation: treat xray, operation, stage as fixed margins,
## the survival depends on stage, but not xray and operation.
doubledecker(survival ~ stage + operation + xray, data = tab)
mosaic(~ stage + operation + xray + survival, split = c(FALSE, TRUE, TRUE, FALSE), data = tab, keep = FALSE, gp = gpar(fill = rev(grey.colors(2))))
mosaic(~ stage + operation + xray + survival, split = c(FALSE, TRUE, TRUE, FALSE), data = tab, keep = FALSE, expected = ~ xray * operation * stage + survival*stage)

Pairs plot panel functions for diagonal cells

Diagonal Panel Functions for Table Pairs Plot

Description
Diagonal panel functions for pairs.table.

Usage
pairs_barplot(gp_bars = NULL, gp_vartext = gpar(fontsize = 17), gp_leveltext = gpar(), just_leveltext = c("center", "bottom"), just_vartext = c("center", "top"), rot = 0, abbreviate = FALSE, check_overlap = TRUE, fill = "grey", var_offset = unit(1, "npc"), ...)
pairs_text(dimnames = TRUE, gp_vartext = gpar(fontsize = 17), gp_leveltext = gpar(), gp_border = gpar(), ...)
pairs_diagonal_text(varnames = TRUE, gp_vartext = gpar(fontsize = 17, fontface = "bold"), gp_leveltext = gpar(), gp_border = gpar(), pos = c("right", "top"), distribute = c("equal", "margin"), rot = 0, ...)
pairs_diagonal_mosaic(split_vertical = TRUE, margins = unit(0, "lines"), offset_labels = 0.4, offset_varnames = 0, gp = NULL, fill = "grey", ...)

Arguments
dimnames        vector of logicals indicating whether the factor levels should be displayed (only used for pairs_text).
varnames        vector of logicals indicating whether the variable names should be displayed (only used for pairs_diagonal_text).
gp_vartext      object of class "gpar" used for the factor names.
gp_leveltext    object of class "gpar" used for the factor levels, if any.
gp_border       object of class "gpar" used for the border (only used for pairs_text).
gp              object of class "gpar" used for the tiles (only used for pairs_diagonal_mosaic). If unspecified, the default is to set the fill component of this object to the fill argument.
gp_bars         object of class "gpar" used for bars (only used for pairs_barplot). If unspecified, the default is to set the fill component of this object to the fill argument.
fill            color vector or palette function used for the fill colors of bars (for pairs_barplot) or tiles (for pairs_diagonal_mosaic).
just_leveltext, just_vartext
                character string indicating the justification for variable names and levels.
pos             character string of length 2 controlling the horizontal and vertical position of the variable names (only used for pairs_diagonal_text).
distribute      character string indicating whether levels should be distributed equally or according to the margins (only used for pairs_diagonal_text).
rot             rotation angle for the variable levels.
abbreviate      integer or logical indicating the number of characters the labels should be abbreviated to. TRUE means 1 character, FALSE causes no abbreviation.
check_overlap   If TRUE, some levels will be suppressed to avoid overlapping.
var_offset      object of class "unit" specifying the offset of variable names from the bottom of the bar plots created by pairs_barplot. If numeric, the unit defaults to "npc".
split_vertical  vector of logicals of length k, where k is the number of margins of x (values are recycled as needed). A TRUE component indicates that the tile(s) of the corresponding dimension should be split vertically, FALSE means horizontal splits. Default is FALSE.
margins         either an object of class "unit" of length 4, or a numeric vector of length 4. The elements are recycled as needed. The four components specify the top, right, bottom, and left margin of the plot, respectively. When a numeric vector is supplied, the numbers are interpreted as "lines" units. In addition, the unit or numeric vector may have named arguments ('top', 'right', 'bottom', and 'left'), in which case the non-named arguments specify the default values (recycled as needed), overloaded by the named arguments.
offset_labels, offset_varnames
                numeric vector of length 4 indicating the offset of the labels (variable names) for each of the four sides of the plot.
...             other parameters passed to the underlying graphics functions.

Details
In the diagonal cells, the pairs plot visualizes statistics or information for each dimension (that is: the single factors) alone. pairs_text displays the factor's name, and optionally also the factor levels. pairs_barplot produces a bar plot of the corresponding factor, along with the factor's name.

Value
A function with one argument: the marginal table for the corresponding dimension.

Author(s)
David Meyer <David.Meyer@R-project.org>

See Also
pairs.table, pairs_assoc, pairs_mosaic

Examples

data("UCBAdmissions")
pairs(UCBAdmissions) # pairs_barplot is default
pairs(UCBAdmissions, diag_panel = pairs_text)
pairs(UCBAdmissions, diag_panel = pairs_diagonal_text)
pairs(Titanic, diag_panel = pairs_diagonal_text)
pairs(Titanic, diag_panel = pairs_diagonal_text(distribute = "margin"))
pairs(Titanic, diag_panel = pairs_diagonal_text(distribute = "margin", rot = 45))
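To make the input of a diagonal panel concrete, the following base R sketch computes the one-way marginal table for every dimension of a contingency table, which is the kind of object the Value section says a diagonal panel function receives. It is only an illustration with base functions; the variable name margins_by_dim is an assumption of this sketch, and how pairs.table builds these tables internally is not shown here.

```r
# One-way marginal table for each dimension of a contingency table.
margins_by_dim <- lapply(seq_along(dim(UCBAdmissions)),
                         function(i) margin.table(UCBAdmissions, i))
names(margins_by_dim) <- names(dimnames(UCBAdmissions))

# The table a diagonal cell for "Gender" would summarise, e.g. as a bar plot.
margins_by_dim$Gender
```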

Pairs plot panel functions for off-diagonal cells Off-diagonal Panel Functions for Table Pairs Plot

Description
Off-diagonal panel functions for pairs.table.

Usage
pairs_strucplot(panel = mosaic, type = c("pairwise", "total", "conditional", "joint"), legend = FALSE, margins = c(0, 0, 0, 0), labeling = NULL, ...)
pairs_assoc(...)
pairs_mosaic(...)
pairs_sieve(...)

Arguments
panel     function to be used for the plots in each cell, such as pairs_assoc, pairs_mosaic, and pairs_sieve.
type      character string specifying the type of independence model visualized in the cells.
legend    logical specifying whether a legend should be displayed in the cells or not.
margins   margins inside each cell (see strucplot).
labeling  labeling function or labeling-generating function (see strucplot).
...       pairs_mosaic, pairs_assoc, and pairs_sieve: parameters passed to pairs_strucplot. pairs_strucplot: other parameters passed to panel function.

Details
These functions wrap assoc, sieve, and mosaic, suppressing labeling and legend drawing and setting the margins to 0.

Value
A function with arguments:
x     contingency table.
i, j  cell coordinates.

Author(s)
David Meyer <David.Meyer@R-project.org>

References
Cohen, A. (1980), On the graphical display of the significant components in a two-way contingency table. Communications in Statistics - Theory and Methods, A9, 1025–1041.
Friendly, M. (1992), Graphical methods for categorical data. SAS User Group International Conference Proceedings, 17, 190–200. http://datavis.ca/sugi/sugi17.pdf

See Also
pairs.table, pairs_text, pairs_barplot, assoc, mosaic

Examples
data("UCBAdmissions")
data("PreSex")
pairs(PreSex)
pairs(UCBAdmissions)
pairs(UCBAdmissions, upper_panel_args = list(shade = FALSE))
pairs(UCBAdmissions, lower_panel = pairs_mosaic(type = "conditional"))
pairs(UCBAdmissions, upper_panel = pairs_assoc)
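The calling convention of an off-diagonal cell function (a table x plus cell coordinates i and j) can be mimicked in base R. The function below is a hypothetical stand-in, not one of the package's panels: it assumes the "pairwise" case, collapses the table to the bivariate margin for variables i and j, and returns the Pearson residuals from independence that a shaded plot would encode.

```r
# Hypothetical stand-in for an off-diagonal cell function: it receives the
# full table x and the cell coordinates i, j.
pairwise_residuals <- function(x, i, j) {
  tab <- margin.table(x, c(i, j))                      # collapse other variables
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  (tab - expected) / sqrt(expected)                    # Pearson residuals
}

# Cell (1, 2) of the UCBAdmissions pairs matrix: Admit by Gender.
r <- pairwise_residuals(UCBAdmissions, 1, 2)
r
```

The squared residuals sum to the Pearson chi-squared statistic of the collapsed table, which gives a quick sanity check against chisq.test(..., correct = FALSE).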


pairs.table

Pairs Plot for Contingency Tables

Description
Produces a matrix of strucplot displays.

Usage
## S3 method for class 'table'
pairs(x, upper_panel = pairs_mosaic, upper_panel_args = list(), lower_panel = pairs_mosaic, lower_panel_args = list(), diag_panel = pairs_barplot, diag_panel_args = list(), main = NULL, sub = NULL, main_gp = gpar(fontsize = 20), sub_gp = gpar(fontsize = 15), space = 0.3, newpage = TRUE, pop = TRUE, margins = unit(1, "lines"), ...)

Arguments
x                 a contingency table in array form, with optional category labels specified in the dimnames(x) attribute.
upper_panel       function for the upper triangle of the matrix, or corresponding generating function. If NULL, no panel is drawn.
upper_panel_args  list of arguments for the generating function, if specified.
lower_panel       function for the lower triangle of the matrix, or corresponding generating function. If NULL, no panel is drawn.
lower_panel_args  list of arguments for the panel-generating function, if specified.
diag_panel        function for the diagonal of the matrix, or corresponding generating function. If NULL, no panel is drawn.
diag_panel_args   list of arguments for the generating function, if specified.
main              either a logical, or a character string used for plotting the main title. If main is a logical and TRUE, the name of the object supplied as x is used.
sub               a character string used for plotting the subtitle. If sub is a logical and TRUE and main is unspecified, the name of the object supplied as x is used.
main_gp, sub_gp   object of class "gpar" containing the graphical parameters used for the main (sub) title, if specified.
space             double specifying the distance between the cells.
newpage           logical controlling whether a new grid page should be created.
pop               logical indicating whether all viewports should be popped after the plot has been drawn.
margins           either an object of class "unit" of length 4, or a numeric vector of length 4. The elements are recycled as needed. The four components specify the top, right, bottom, and left margin of the plot, respectively. When a numeric vector is supplied, the numbers are interpreted as "lines" units. In addition, the unit or numeric vector may have named arguments ('top', 'right', 'bottom', and 'left'), in which case the non-named arguments specify the default values (recycled as needed), overloaded by the named arguments.
...               For convenience, list of arguments for the panel-generating functions of upper and lower panels, if specified.

Details
This is a pairs method for objects inheriting from class "table" or "structable". It plots a matrix of pairwise mosaic plots.

Four independence types are distinguished: "pairwise", "total", "conditional" and "joint". The pairwise mosaic matrix shows bivariate marginal relations, collapsed over all other variables. The total independence mosaic matrix shows mosaic plots for mutual independence, i.e., for marginal and conditional independence among all pairs of variables. The conditional independence mosaic matrix shows mosaic plots for marginal independence given all other variables. The joint independence mosaic matrix shows mosaic plots for joint independence of all pairs of variables from the others.

This method uses panel functions called for each cell of the matrix which can be different for upper matrix, lower matrix, and diagonal cells. Correspondingly, for each panel parameter foo (= 'upper', 'lower', or 'diag'), pairs.table takes two arguments: foo_panel and foo_panel_args, which can be used to specify the parameters as follows:
1. Passing a suitable panel function to foo_panel which subsequently is called for each cell with the corresponding coordinates.
2. Passing a corresponding generating function (of class "panel_generator") to foo_panel, along with parameters passed to foo_panel_args, that generates such a function.
Hence, the second approach is equivalent to the first if foo_panel(foo_panel_args) is passed to foo_panel.

Author(s)
David Meyer <David.Meyer@R-project.org>

References
Cohen, A. (1980), On the graphical display of the significant components in a two-way contingency table. Communications in Statistics - Theory and Methods, A9, 1025–1041.
Friendly, M. (1992), Graphical methods for categorical data. SAS User Group International Conference Proceedings, 17, 190–200. http://datavis.ca/sugi/sugi17.pdf
Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").
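The equivalence of the two ways of specifying a panel can be illustrated with plain closures. This is a base R sketch under stated assumptions: make_panel is a hypothetical generating function (it is not given the "panel_generator" class and is not part of the package), and do.call stands in for whatever the pairs method does when it receives a generator together with an argument list.

```r
# Hypothetical generating function: returns a cell function of (x, i, j).
make_panel <- function(collapse = TRUE) {
  function(x, i, j) {
    if (collapse) margin.table(x, c(i, j)) else x
  }
}

# Approach 1: call the generator yourself and pass on the resulting function.
panel_a <- make_panel(collapse = TRUE)

# Approach 2: hand over the generator plus an argument list; the caller then
# builds the cell function, mimicked here with do.call().
panel_b <- do.call(make_panel, list(collapse = TRUE))

# Both routes give cell functions that behave identically.
identical(panel_a(Titanic, 1, 4), panel_b(Titanic, 1, 4))
```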

See Also
pairs_mosaic, pairs_assoc, pairs_sieve, pairs_diagonal_text, pairs_text, pairs_barplot, assoc, sieve, mosaic

Examples
data("UCBAdmissions")
data("PreSex")
data(HairEyeColor)
hec = structable(Eye ~ Sex + Hair, data = HairEyeColor)
pairs(PreSex)
pairs(UCBAdmissions)
pairs(UCBAdmissions, upper_panel_args = list(shade = TRUE))
pairs(UCBAdmissions, lower_panel = pairs_mosaic(type = "conditional"))
pairs(UCBAdmissions, diag_panel = pairs_text)
pairs(UCBAdmissions, upper_panel = pairs_assoc, shade = TRUE)
pairs(hec, highlighting = 2, diag_panel_args = list(fill = grey.colors))
pairs(hec, highlighting = 2, diag_panel = pairs_diagonal_mosaic, diag_panel_args = list(fill = grey.colors, alternate_labels = TRUE))

plot.loglm

Visualize Fitted Log-linear Models

Description
Visualize fitted "loglm" objects by mosaic or association plots.

Usage
## S3 method for class 'loglm'
plot(x, panel = mosaic, type = c("observed", "expected"), residuals_type = c("pearson", "deviance"), gp = shading_hcl, gp_args = list(), ...)

Arguments
x               a fitted "loglm" object, see loglm.
panel           a panel function for visualizing the observed values, residuals and expected values. Currently, mosaic and assoc in vcd.
type            a character string indicating whether the observed or the expected values of the table should be visualized.
residuals_type  a character string indicating the type of residuals to be computed.
gp              object of class "gpar", shading function or a corresponding generating function (see details and shadings). Ignored if shade = FALSE.
gp_args         list of arguments for the shading-generating function, if specified.
...             Other arguments passed to the panel function.

Details
The plot method for "loglm" objects by default visualizes the model using a mosaic plot (can be changed to an association plot by setting panel = assoc) with a shading based on the residuals of this model. The legend also reports the corresponding p value of the associated goodness-of-fit test. The mosaic and assoc methods are simple convenience interfaces to this plot method, setting the panel argument accordingly.

Value
The "structable" visualized is returned invisibly.

Author(s)
Achim Zeileis <Achim.Zeileis@R-project.org>

See Also
loglm, assoc, mosaic, strucplot

Examples
## mosaic display for PreSex model
data("PreSex")
fm <- loglm(~ PremaritalSex * ExtramaritalSex * (Gender + MaritalStatus), data = aperm(PreSex, c(3, 2, 4, 1)))
fm
## visualize Pearson statistic
plot(fm, split_vertical = TRUE)
## visualize LR statistic
plot(fm, split_vertical = TRUE, residuals_type = "deviance")

## conditional independence in UCB admissions data
data("UCBAdmissions")
fm <- loglm(~ Dept * (Gender + Admit), data = aperm(UCBAdmissions))
## use mosaic display
plot(fm, labeling_args = list(abbreviate = c(Admit = 3)))
## and association plot
plot(fm, panel = assoc)
assoc(fm)

PreSex

Pre-marital Sex and Divorce

Description
Data from Thornes & Collard (1979), reported in Gilbert (1981), on pre- and extra-marital sex and divorce.

Usage
data("PreSex")

Format
A 4-dimensional array resulting from cross-tabulating 1036 observations on 4 variables. The variables and their levels are as follows:

No  Name             Levels
1   MaritalStatus    Divorced, Married
2   ExtramaritalSex  Yes, No
3   PremaritalSex    Yes, No
4   Gender           Women, Men

Source
Michael Friendly (2000), Visualizing Categorical Data: http://euclid.psych.yorku.ca/ftp/sas/vcd/catdata/marital.sas

References
G. N. Gilbert (1981), Modelling Society: An Introduction to Loglinear Analysis for Social Researchers. Allen and Unwin, London.
B. Thornes & J. Collard (1979), Who Divorces?. Routledge & Kegan, London.
M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples
data("PreSex")
## Mosaic display for Gender and Premarital Sexual Experience
## (Gender Pre)
mosaic(margin.table(PreSex, c(3,4)), main = "Gender and Premarital Sex")
## (Gender Pre)(Extra)
mosaic(margin.table(PreSex, c(2,3,4)), expected = ~Gender * PremaritalSex + ExtramaritalSex, main = "PreMaritalSex*Gender +Sex")
## (Gender Pre Extra)(Marital)
mosaic(PreSex, expected = ~Gender*PremaritalSex*ExtramaritalSex + MaritalStatus, main = "PreMarital*ExtraMarital + MaritalStatus")
## (GPE)(PEM)
mosaic(PreSex, expected = ~ Gender * PremaritalSex * ExtramaritalSex + MaritalStatus * PremaritalSex * ExtramaritalSex, main = "G*P*E + P*E*M")
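The expected = ... arguments in the examples above name the margins that a log-linear model fits. As a rough base R illustration (an assumption of this sketch is that stats::loglin is an adequate stand-in; it is not claimed to be the exact code path used by mosaic), the following computes the expected frequencies for a joint independence model on the HairEyeColor data from base R, fitting the (Hair, Eye) margin and the Sex margin, and checks them against the closed-form product of margins.

```r
# Joint independence of Sex from (Hair, Eye): fitted margins [Hair Eye][Sex].
fit <- loglin(HairEyeColor, margin = list(c(1, 2), 3),
              fit = TRUE, print = FALSE)$fit

# The same expected frequencies in closed form: product of the two fitted
# margins divided by the total count.
n <- sum(HairEyeColor)
closed <- outer(margin.table(HairEyeColor, c(1, 2)),
                margin.table(HairEyeColor, 3)) / n

# Pearson residuals, the quantity a residual-based shading would colour by.
pearson <- (HairEyeColor - fit) / sqrt(fit)
round(range(pearson), 2)
```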

Punishment

Corporal Punishment Data

Description
Data from a study of the Gallup Institute in Denmark in 1979 about the attitude of a random sample of 1,456 persons towards corporal punishment of children.

Usage
data("Punishment")

Format
A data frame with 36 observations and 5 variables.
Freq       frequency.
attitude   factor indicating attitude: (no, moderate) punishment of children.
memory     factor indicating whether the person had memories of corporal punishment as a child (yes, no).
education  factor indicating highest level of education (elementary, secondary, high).
age        factor indicating age group in years (15-24, 25-39, 40-).

Note
Andersen (1991) erroneously indicates the total sum of respondents to be 783.

Source
E. B. Andersen (1991), The Statistical Analysis of Categorical Data. pages 207–208.

References
E. B. Andersen (1991), The Statistical Analysis of Categorical Data. 2nd edition. Springer-Verlag, Berlin.

Examples
data("Punishment", package = "vcd")
pun <- xtabs(Freq ~ memory + attitude + age + education, data = Punishment)
## model: ~ (memory + attitude) * age * education
## use maximum sum-of-squares test/shading
cotabplot(~ memory + attitude | age + education, data = pun, panel = cotab_coindep, n = 5000, type = "assoc", test = "maxchisq", interpolate = 1:2)

RepVict

Repeat Victimization Data

Description
Data from Reiss (1980) given by Fienberg (1980) about instances of repeat victimization for households in the U.S. National Crime Survey.

Usage
data("RepVict")

Format
A 2-dimensional array resulting from cross-tabulating victimization. The variables and their levels are as follows:

No  Name                  Levels
1   First Victimization   Rape, Assault, Robbery, Pickpocket, Personal Larceny, Burglary, Household Larceny, Auto Theft
2   Second Victimization  Rape, Assault, Robbery, Pickpocket, Personal Larceny, Burglary, Household Larceny, Auto Theft

Source
Michael Friendly (2000), Visualizing Categorical Data, page 113.

References
S. E. Fienberg (1980), The Analysis of Cross-Classified Categorical Data, MIT Press, Cambridge, 2nd edition.
A. J. J. Reiss (1980), Victim proneness by type of crime in repeat victimization. In S. E. Fienberg & A. J. J. Reiss (eds.), Indicators of Crime and Criminal Justice. U.S. Government Printing Office, Washington, DC.
M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples
data("RepVict")
mosaic(RepVict[-c(4,7),-c(4,7)], gp = shading_max, main = "Repeat Victimization Data")

Rochdale

Rochdale Data

Description
Information on 665 households of Rochdale, Lancashire, UK. The study was conducted to identify influence factors on economical activity of wives.

Usage
data("Rochdale")

Format
An 8-dimensional array resulting from cross-tabulating 665 observations on 8 variables. The variables and their levels are as follows:

No  Name              Levels
1   EconActive        yes, no
2   Age               <38, >38
3   HusbandEmployed   yes, no
4   Child             yes, no
5   Education         yes, no
6   HusbandEducation  yes, no
7   Asian             yes, no
8   HouseholdWorking  yes, no

Note
Many observations are missing: only 91 out of all 256 combinations contain information.

Source
Whittaker (1990).

References
H. Hofmann (2003), Constructing and reading mosaicplots. Computational Statistics & Data Analysis, 43, 4, 565–580.
J. Whittaker (1990), Graphical Models on Applied Multivariate Statistics, Wiley, New York.

Examples
data("Rochdale")
mosaic(Rochdale)

rootogram

Rootograms

Description
Rootograms of observed and fitted values.

Usage
## Default S3 method:
rootogram(x, fitted, names = NULL, scale = c("sqrt", "raw"), type = c("hanging", "standing", "deviation"), rect_gp = gpar(fill = "lightgray"), lines_gp = gpar(col = "red"), points_gp = gpar(col = "red"), pch = 19, xlab = NULL, ylab = NULL, ylim = NULL, name = "rootogram", newpage = TRUE, pop = TRUE, ...)

Arguments
x          either a vector or a 1-way table of frequencies.
fitted     a vector of fitted frequencies.
names      a vector of names passed to grid_barplot, if set to NULL the names of x are used.
scale      a character string indicating whether the values should be plotted on the raw or square root scale.
type       a character string indicating if the bars for the observed frequencies should be hanging or standing or indicate the deviation between observed and fitted frequencies.
rect_gp    a "gpar" object controlling the grid graphical parameters of the rectangles.
lines_gp   a "gpar" object controlling the grid graphical parameters of the lines.
points_gp  a "gpar" object controlling the grid graphical parameters of the points.
pch        plotting character for the points.
xlab       a label for the x axis.
ylab       a label for the y axis.
ylim       limits for the y axis.
name       name of the plotting viewport.
newpage    logical. Should grid.newpage be called before plotting?
pop        logical. Should the viewport created be popped?
...        further arguments passed to grid_barplot.

Details
The observed frequencies are displayed as bars and the fitted frequencies as a line. By default a sqrt scale is used to make the smaller frequencies more visible.
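The arithmetic behind a hanging rootogram on the square-root scale can be sketched in a few lines of base R. This is an illustration only, with toy counts chosen for the example and a Poisson fit by the sample mean; it assumes the usual construction in which each bar hangs down from the fitted value, so that the position of its lower end relative to zero shows the misfit. It does not draw anything and does not reproduce the package's plotting code.

```r
# Toy counts of 0, 1, 2, 3 and 4 events (illustrative values).
observed <- c(109, 65, 22, 3, 1)
k <- 0:4

# Poisson fit: estimate the mean, then scale the pmf to the sample size.
lambda <- sum(k * observed) / sum(observed)
fitted <- sum(observed) * dpois(k, lambda)

# Hanging bars on the square-root scale: the top of each bar sits on the
# fitted curve, the bottom is the deviation from the zero line.
top <- sqrt(fitted)
bottom <- sqrt(fitted) - sqrt(observed)
round(cbind(k, top, bottom), 2)
```

A good fit shows up as bar bottoms close to zero for every count; the square root keeps deviations for rare counts visible next to those for frequent ones.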

Author(s)
Achim Zeileis <Achim.Zeileis@R-project.org>

References
J. W. Tukey (1977), Exploratory Data Analysis. Addison Wesley, Reading, MA.
M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

See Also
grid_barplot

Examples
## Simulated data examples:
dummy <- rnbinom(200, size = 1.5, prob = 0.8)
observed <- table(dummy)
fitted1 <- dnbinom(as.numeric(names(observed)), size = 1.5, prob = 0.8) * sum(observed)
fitted2 <- dnbinom(as.numeric(names(observed)), size = 2, prob = 0.6) * sum(observed)
rootogram(observed, fitted1)
rootogram(observed, fitted2)

## Real data examples:
data("HorseKicks")
HK.fit <- goodfit(HorseKicks)
summary(HK.fit)
plot(HK.fit)
## or equivalently
rootogram(HK.fit)

data("Federalist")
F.fit <- goodfit(Federalist, type = "nbinomial")
summary(F.fit)
plot(F.fit)

Saxony

Families in Saxony

Description
Data from Geissler, cited in Sokal & Rohlf (1969) and Lindsey (1995) on gender distributions in families in Saxony in the 19th century.

Usage
data("Saxony")

Format
A 1-way table giving the number of male children in 6115 families of size 12. The variable and its levels are

No  Name    Levels
1   nMales  0, 1, ..., 12

Source
M. Friendly (2000), Visualizing Categorical Data, pages 40–42.

References
J. K. Lindsey (1995), Analysis of Frequency and Count Data. Oxford University Press, Oxford, UK.
R. R. Sokal & F. J. Rohlf (1969), Biometry. The Principles and Practice of Statistics. W. H. Freeman, San Francisco, CA.
M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples
data("Saxony")
gf <- goodfit(Saxony, type = "binomial")
summary(gf)
plot(gf)

SexualFun

Sex is Fun

Description
Data from Hout et al. (1987) given by Agresti (1990) summarizing the responses of married couples to the questionnaire item: Sex is fun for me and my partner: (a) never or occasionally, (b) fairly often, (c) very often, (d) almost always.

Usage
data("SexualFun")

Format
A 2-dimensional array resulting from cross-tabulating the ratings of 91 married couples. The variables and their levels are as follows:

No  Name     Levels
1   Husband  Never Fun, Fairly Often, Very Often, Always Fun
2   Wife     Never Fun, Fairly Often, Very Often, Always Fun

Source

M. Friendly (2000), Visualizing Categorical Data, page 91.

References

A. Agresti (1990), Categorical Data Analysis. Wiley-Interscience, New York.
M. Hout, O. D. Duncan, M. E. Sobel (1987), Association and heterogeneity: Structural models of similarities and differences, Sociological Methodology, 17, 145-184.
M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

data("SexualFun")

## Kappa statistics
Kappa(SexualFun)

## Agreement Chart
agreementplot(t(SexualFun), weights = 1)

## Partial Agreement Chart and B-Statistics
agreementplot(t(SexualFun),
              xlab = "Husband's Rating",
              ylab = "Wife's Rating",
              main = "Husband's and Wife's Sexual Fun")

shadings

Shading-generating Functions for Residual-based Shadings

Description

Shading-generating functions for computing residual-based shadings for mosaic and association plots.

Usage

shading_hcl(observed, residuals = NULL, expected = NULL, df = NULL,
  h = NULL, c = NULL, l = NULL, interpolate = c(2, 4), lty = 1,
  eps = NULL, line_col = "black", p.value = NULL, level = 0.95, ...)

shading_hsv(observed, residuals = NULL, expected = NULL, df = NULL,
  h = c(2/3, 0), s = c(1, 0), v = c(1, 0.5), interpolate = c(2, 4),
  lty = 1, eps = NULL, line_col = "black", p.value = NULL, level = 0.95, ...)

shading_max(observed = NULL, residuals = NULL, expected = NULL, df = NULL,
  h = NULL, c = NULL, l = NULL, lty = 1, eps = NULL, line_col = "black",
  level = c(0.9, 0.99), n = 1000, ...)

shading_Friendly(observed = NULL, residuals = NULL, expected = NULL, df = NULL,
  h = c(2/3, 0), lty = 1:2, interpolate = c(2, 4), eps = 0.01,
  line_col = "black", ...)

shading_sieve(observed = NULL, residuals = NULL, expected = NULL, df = NULL,
  h = c(260, 0), lty = 1:2, interpolate = c(2, 4), eps = 0.01,
  line_col = "black", ...)

shading_binary(observed = NULL, residuals = NULL, expected = NULL, df = NULL,
  col = NULL)

hcl2hex(h = 0, c = 35, l = 85, fixup = TRUE)

Arguments

observed     contingency table of observed values
residuals    contingency table of residuals
expected     contingency table of expected values
df           degrees of freedom of the associated independence model.
h            hue value in the HCL or HSV color description, has to be in [0, 360] for HCL and in [0, 1] for HSV colors. The default is to use blue and red for positive and negative residuals respectively. In the HCL specification it is c(260, 0) by default and for HSV c(2/3, 0).
c            chroma value in the HCL color description. This controls the maximum chroma for significant and non-significant results respectively and defaults to c(100, 20).
l            luminance value in the HCL color description. Defaults to c(90, 50) for small and large residuals respectively.
s            saturation value in the HSV color description. Defaults to c(1, 0) for large and small residuals respectively.
v            value in the HSV color description. Defaults to c(1, 0.5) for significant and non-significant results respectively.
interpolate  a specification for mapping the absolute size of the residuals to a value in [0, 1]. This can be either a function or a numeric vector. In the latter case, a step function with steps of equal size going from 0 to 1 is used.
lty          a vector of two line types for positive and negative residuals respectively. Recycled if necessary.
eps          numeric tolerance value below which absolute residuals are considered to be zero, which is used for coding the border color and line type. If set to NULL (default), all borders have the default color specified by line_col. If set to a numeric value, all border colors corresponding to residuals with a larger absolute value are set to the full positive or negative color, respectively; borders corresponding to smaller residuals are drawn with line_col and lty[1]. This is used principally in shading_Friendly.
line_col     default border color (for shading_sieve: default sieve color).
p.value      the p value associated with the independence model. By default, this is computed from a Chi-squared distribution with df degrees of freedom. p.value can be either a scalar or a function(observed, residuals, expected, df) that computes the p value from the data. If set to NA no inference is performed.
level        confidence level of the test used. If p.value is smaller than 1 - level, bright colors are used, otherwise dark colors are employed. For shading_max a vector of levels can be supplied. The corresponding critical values are then used as interpolate cut-offs.
n            number of permutations used in the call to coindep_test.
col          a vector of two colors for positive and negative residuals respectively.
fixup        logical. Should the color be corrected to a valid RGB value before conversion?
...          Other arguments passed to hcl2hex or hsv, respectively.

Details

These shading-generating functions can be passed to strucplot to generate residual-based shadings for contingency tables. strucplot calls these functions with the arguments observed, residuals, expected, df which give the observed values, residuals, expected values and associated degrees of freedom for a particular contingency table and associated independence model.

The shadings shading_hcl and shading_hsv do the same thing conceptually, but use HCL or HSV colors respectively. The former is usually preferred because HCL colors are perceptually based. Both shadings visualize the sign of the residuals of an independence model using two hues (by default: blue and red). The absolute size of the residuals is visualized by the colorfulness and the amount of grey, by default in three categories: very colorful for large residuals (> 4), less colorful for medium sized residuals (< 4 and > 2), grey/white for small residuals (< 2). More categories or a continuous scale can be specified by setting interpolate. Furthermore, the result of a significance test can be visualized by the amount of grey in the colors. If significant, a colorful palette is used; if not, the amount of color is reduced. See Zeileis, Meyer, and Hornik (2007) and diverge_hcl for more details.

The shading shading_max is applicable in 2-way contingency tables and uses a similar strategy as shading_hcl. But instead of using the cut-offs 2 and 4, it employs the critical values for the maximum statistic (by default at 90% and 99%). Consequently, color in the plot signals a significant result at 90% or 99% significance level, respectively. The test is carried out by calling coindep_test.

The shading shading_Friendly is very similar to shading_hsv, but additionally codes the sign of the residuals by different line types. See Friendly (1994) for more details. shading_sieve is similar, but uses HCL colors.

The shading shading_binary just visualizes the sign of the residuals by using two different colors (default: blue HCL(260, 50, 70) and red HCL(0, 50, 70)).

The color implementations employed are hsv from base R and polarLUV from the colorspace package. To transform the HCL coordinates to a hexadecimal color string (as returned by hsv), the function hex is employed. A convenience wrapper hcl2hex is provided.

Value

A shading function which takes only a single argument, interpreted as a vector/table of residuals, and returns a "gpar" object with the corresponding vector(s)/table(s) of graphical parameter(s).
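The step-function mapping described for a numeric interpolate argument can be sketched in base R. This only illustrates the documented behavior (equal-sized steps from 0 to 1 at the given cut-offs) and is not vcd's internal code; the helper name step_interpolate is hypothetical.

```r
## Hypothetical helper (not part of vcd): build the step function that a
## numeric `interpolate` vector is documented to describe, i.e. equal-sized
## steps going from 0 to 1 at the given cut-offs.
step_interpolate <- function(cutoffs) {
  heights <- seq(0, 1, length.out = length(cutoffs) + 1)
  stepfun(cutoffs, heights)
}

f <- step_interpolate(c(2, 4))   # the default cut-offs 2 and 4
abs_res <- c(0.5, 2.5, 6)        # absolute residuals: small, medium, large
f(abs_res)
```

With the default cut-offs, absolute residuals below 2 map to 0, those between 2 and 4 to 0.5, and larger ones to 1, which corresponds to the three categories described in Details.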

Author(s)

Achim Zeileis <Achim.Zeileis@R-project.org>

References

Friendly M. (1994), Mosaic Displays for Multi-Way Contingency Tables. Journal of the American Statistical Association, 89, 190–200.
Meyer D., Zeileis A., and Hornik K. (2006), The Strucplot Framework: Visualizing Multi-Way Contingency Tables with vcd. Journal of Statistical Software, 17(3), 1–48. URL http://www.jstatsoft.org/v17/i03/. See also vignette("strucplot", package = "vcd").
Zeileis A., Meyer D., Hornik K. (2007), Residual-Based Shadings for Visualizing (Conditional) Independence. Journal of Computational and Graphical Statistics, 16, 507–525.
Zeileis A., Hornik K. and Murrell P. (2008), Escaping RGBland: Selecting Colors for Statistical Graphics. Computational Statistics & Data Analysis, Forthcoming. Preprint available from http://statmath.wu-wien.ac.at/~zeileis/papers/Zeileis+Hornik+Murrell-2008.pdf.

See Also

hex, polarLUV, hsv, mosaic, assoc, strucplot, diverge_hcl

Examples

## load Arthritis data
data("Arthritis")
art <- xtabs(~Treatment + Improved, data = Arthritis)

## plain mosaic display without shading
mosaic(art)

## with shading for independence model
mosaic(art, shade = TRUE)
## which uses the HCL shading
mosaic(art, gp = shading_hcl)
## the residuals are too small to have color,
## hence the cut-offs can be modified
mosaic(art, gp = shading_hcl, gp_args = list(interpolate = c(1, 1.8)))
## the same with the Friendly palette
## (without significance testing)
mosaic(art, gp = shading_Friendly, gp_args = list(interpolate = c(1, 1.8)))

## assess independence using the maximum statistic
## cut-offs are now critical values for the test statistic
mosaic(art, gp = shading_max)

## association plot with shading as in base R
assoc(art, gp = shading_binary(col = c(1, 2)))

sieve

Extended Sieve Plots

Description

(Extended) sieve displays for n-way contingency tables: plots rectangles with areas proportional to the expected cell frequencies and filled with a number of squares equal to the observed frequencies. Thus, the densities visualize the deviations of the observed from the expected values.

Usage

## Default S3 method:
sieve(x, condvars = NULL, gp = NULL, shade = NULL, legend = FALSE,
  split_vertical = NULL, direction = NULL, spacing = NULL,
  spacing_args = list(), sievetype = c("observed","expected"),
  gp_tile = gpar(), main = NULL, sub = NULL, ...)

## S3 method for class 'formula'
sieve(formula, data, ..., main = NULL, sub = NULL, subset = NULL)

Arguments

x               a contingency table in array form, with optional category labels specified in the dimnames(x) attribute.
condvars        vector of integers or character strings indicating conditioning variables, if any. The table will be permuted to order them first.
formula         a formula specifying the variables used to create a contingency table from data. For convenience, conditioning formulas can be specified; the conditioning variables will then be used first for splitting. Formulas for sieve displays (unlike those for doubledecker plots) have no response variable.
data            either a data frame, or an object of class "table" or "ftable".
subset          an optional vector specifying a subset of observations to be used.
shade           logical specifying whether gp should be used or not (see gp). If TRUE and expected is unspecified, a default model is fitted: if condvars is specified, a corresponding conditional independence model, and else the total independence model. If shade is NULL (default), gp is used if specified.
sievetype       character string indicating whether rectangles should be filled according to observed or expected frequencies.
gp              object of class "gpar", shading function or a corresponding generating function (see details of strucplot and shadings). Components of "gpar" objects are recycled as needed along the last splitting dimension. The default is a modified version of shading_Friendly: if sievetype is "observed", cells with positive residuals are painted with a red sieve, and cells with negative residuals with a blue one. If sievetype is "expected", the sieves' color is gray. Ignored if shade = FALSE.
gp_tile         object of class "gpar", controlling the appearance of all static elements of the cells (e.g., border and fill color).
legend          either a legend-generating function, a legend function (see details of strucplot and legends), or a logical value. If legend is NULL or TRUE and gp is a function, legend defaults to legend_resbased.
split_vertical  vector of logicals of length k, where k is the number of margins of x (default: FALSE). Values are recycled as needed. A TRUE component indicates that the tile(s) of the corresponding dimension should be split vertically, FALSE means horizontal splits. Ignored if direction is not NULL.
direction       character vector of length k, where k is the number of margins of x (values are recycled as needed). For each component, a value of "h" indicates that the tile(s) of the corresponding dimension should be split horizontally, whereas "v" indicates vertical split(s).
spacing         spacing object, spacing function, or corresponding generating function (see strucplot for more information). The default is no spacing at all if x has two dimensions, and spacing_increase for more dimensions.
spacing_args    list of arguments for the generating function, if specified (see strucplot for more information).
main, sub       either a logical, or a character string used for plotting the main (sub) title. If logical and TRUE, the name of the data object is used.
...             Other arguments passed to strucplot

Details

sieve is a generic function which currently has a default method and a formula interface. Both are high-level interfaces to the strucplot function, and produce (extended) sieve displays. Most of the functionality is described there, such as specification of the independence model, labeling, legend, spacing, shading, and other graphical parameters.

The layout is very flexible: the specification of shading, labeling, spacing, and legend is modularized (see strucplot for details).

Value

The "structable" visualized is returned invisibly.

Note

To be faithful to the original definition by Riedwyl & Schüpbach, the default is to have no spacing between the tiles for two-way tables.

Author(s)

David Meyer <David.Meyer@R-project.org>

References

H. Riedwyl & M. Schüpbach (1994), Parquet diagram to plot contingency tables. In F. Faulbaum (ed.), Softstat '93: Advances in Statistical Software, 293–299. Gustav Fischer, New York.
M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.
David Meyer, Achim Zeileis, and Kurt Hornik (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

See Also

assoc, strucplot, mosaic, structable, doubledecker

Examples

data("HairEyeColor")

## aggregate over 'sex':
(tab <- margin.table(HairEyeColor, c(2,1)))

## plot expected values:
sieve(tab, sievetype = "expected", shade = TRUE)

## plot observed table:
sieve(tab, shade = TRUE)

## plot complete diagram:
sieve(HairEyeColor, shade = TRUE)

## an example for the formula interface:
data("VisualAcuity")
sieve(Freq ~ right + left, data = VisualAcuity)

## example with observed values in the cells:
sieve(Titanic, pop = FALSE, shade = TRUE)
labeling_cells(text = Titanic, gp_text = gpar(fontface = 2))(Titanic)
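The quantities a sieve display compares can be computed by hand for a two-way table. The following base R sketch assumes the total independence model and Pearson residuals; it illustrates the idea and is not how sieve computes them internally.

```r
## Two-way table: eye color by hair color, aggregated over sex
tab <- margin.table(HairEyeColor, c(2, 1))

## Expected frequencies under independence: (row total * column total) / n
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

## Pearson residuals: positive where a cell holds more observations
## than the independence model expects, negative where it holds fewer
pearson <- (tab - expected) / sqrt(expected)
round(pearson, 2)
```

The sign of each residual is what the default sieve shading encodes by color, and the squared residuals sum to Pearson's chi-squared statistic for the table.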

SpaceShuttle

Space Shuttle O-ring Failures

Description

Data from Dalal et al. (1989) about O-ring failures in the NASA space shuttle program. The damage index comes from a discussion of the data by Tufte (1997).

Usage

data("SpaceShuttle")

Format

A data frame with 24 observations and 6 variables.

FlightNumber  Number of space shuttle flight.
Temperature   temperature during start (in degrees F).
Pressure      pressure.
Fail          did any O-ring failures occur? (no, yes).
nFailures     how many (of six) O-rings failed?
Damage        damage index.


Source

Michael Friendly (2000), Visualizing Categorical Data: http://euclid.psych.yorku.ca/ftp/sas/vcd/catdata/orings.sas

References

S. Dalal, E. B. Fowlkes, B. Hoadley (1989), Risk analysis of the space shuttle: Pre-Challenger prediction of failure, Journal of the American Statistical Association, 84, 945–957.
E. R. Tufte (1997), Visual Explanations: Images and Quantities, Evidence and Narrative. Graphics Press, Cheshire, CT.
M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

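data("SpaceShuttle")

## observed failure proportions against launch temperature
plot(nFailures/6 ~ Temperature, data = SpaceShuttle,
     xlim = c(30, 81), ylim = c(0, 1),
     main = "NASA Space Shuttle O-Ring Failures",
     ylab = "Estimated failure probability",
     pch = 19, col = 4)

## binomial logit model for the number of failed O-rings (out of six)
fm <- glm(cbind(nFailures, 6 - nFailures) ~ Temperature,
          data = SpaceShuttle, family = binomial)

## fitted failure probability curve and reference line at 31 degrees F
lines(30 : 81,
      predict(fm, data.frame(Temperature = 30 : 81), type = "response"),
      lwd = 2)
abline(v = 31, lty = 3)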
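The fitted curve in the example can also be queried at a single temperature. The sketch below assumes the vcd package is installed (it supplies the SpaceShuttle data) and evaluates the same model at 31 degrees F, the value marked by the vertical line in the plot.

```r
## Assumes the vcd package is installed; it supplies the SpaceShuttle data
data("SpaceShuttle", package = "vcd")

## Binomial logit model: failed O-rings out of six, by launch temperature
## (rows with a missing nFailures, if any, are dropped by glm's default
## na.action)
fm <- glm(cbind(nFailures, 6 - nFailures) ~ Temperature,
          data = SpaceShuttle, family = binomial)

## Estimated per-ring failure probability at 31 degrees F
p31 <- predict(fm, data.frame(Temperature = 31), type = "response")
p31

## Sign of the temperature effect on the logit scale
coef(fm)[["Temperature"]]
```

Note that 31 degrees F lies below every temperature in the data, so this value is an extrapolation of the fitted model and should be read with that in mind.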


spacings

Spacing-generating Functions

Description

These functions generate spacing functions to be used with strucplot to obtain customized spaces between the elements of a strucplot.

Usage

spacing_equal(sp = unit(0.3, "lines"))
spacing_dimequal(sp)
spacing_increase(start = unit(0.3, "lines"), rate = 1.5)
spacing_conditional(sp = unit(0.3, "lines"), start = unit(2, "lines"), rate = 1.8)
spacing_highlighting(start = unit(0.2, "lines"), rate = 1.5)

Arguments

start  object of class "unit" indicating the start value for increasing spacings.
rate   increase rate for spacings.
sp     object of class "unit" specifying a fixed spacing.

Details

These generating functions return a function used by strucplot to generate appropriate spaces between tiles of a strucplot, using the dimnames information of the visualized table.

spacing_equal specifies one fixed space for all dimensions.

spacing_dimequal specifies a fixed space for each dimension.

spacing_increase creates increasing spaces for all dimensions, based on a starting value and an increase rate.

spacing_conditional combines spacing_equal and spacing_increase to create fixed spaces for conditioned dimensions, and increasing spaces for conditioning dimensions.

spacing_highlighting is essentially spacing_conditional but with the space of the last dimension set to 0. With a corresponding color scheme, this gives the impression of the last class being 'highlighted' in the penultimate class (as, e.g., in doubledecker plots).

Value

A spacing function with arguments:

d         "dim" attribute of a contingency table.
condvars  index vector of conditioning dimensions (currently only used by spacing_conditional).

This function computes a list of objects of class "unit". Each list element contains the spacing information for the corresponding dimension of the table. The length of each "unit" object is k − 1, where k is the number of levels of the corresponding factor.
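The two-stage design (a generating function that returns the actual spacing function) can be inspected directly. This is a minimal sketch assuming the vcd package is installed; it calls the returned function by hand with the "dim" attribute of a table, which is what strucplot is described as doing internally.

```r
## Assumes the vcd package is installed (it attaches grid as a dependency)
library(vcd)

## A generating function returns the actual spacing function ...
sp_fun <- spacing_equal(sp = unit(0.3, "lines"))

## ... which is then called with the table's "dim" attribute
sp <- sp_fun(dim(Titanic))      # Titanic has dimensions 4 x 2 x 2 x 2

## One "unit" object per dimension, each of length (number of levels - 1)
sapply(sp, length)
```

Passing spacing = spacing_equal (the generator) together with spacing_args = list(sp = ...) to strucplot, or passing spacing = spacing_equal(...) directly, should therefore lead to the same spacing function.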

Author(s)

David Meyer <David.Meyer@R-project.org>

References

Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

See Also

strucplot, doubledecker

Examples

data("Titanic")
strucplot(Titanic, spacing = spacing_increase(start = 0.5, rate = 1.5))
strucplot(Titanic, spacing = spacing_equal(1))
strucplot(Titanic, spacing = spacing_dimequal(1:4 / 4))
strucplot(Titanic, spacing = spacing_highlighting,
          gp = gpar(fill = c("light gray","dark gray")))
data("PreSex")
strucplot(aperm(PreSex, c(1,4,2,3)), spacing = spacing_conditional,
          condvars = 2)

spine

Spine Plots and Spinograms

Description

Spine plots are a special case of mosaic plots, and can be seen as a generalization of stacked (or highlighted) bar plots. Analogously, spinograms are an extension of histograms.

Usage

spine(x, ...)

## Default S3 method:
spine(x, y = NULL, breaks = NULL, ylab_tol = 0.05, off = NULL,
  main = "", xlab = NULL, ylab = NULL, ylim = c(0, 1),
  margins = c(5.1, 4.1, 4.1, 3.1), gp = gpar(), name = "spineplot",
  newpage = TRUE, pop = TRUE, ...)

## S3 method for class 'formula'
spine(formula, data = list(), breaks = NULL, ylab_tol = 0.05, off = NULL,
  main = "", xlab = NULL, ylab = NULL, ylim = c(0, 1),
  margins = c(5.1, 4.1, 4.1, 3.1), gp = gpar(), name = "spineplot",
  newpage = TRUE, pop = TRUE, ...)

Arguments

x                 an object, the default method expects either a single variable (interpreted to be the explanatory variable) or a 2-way table. See details.
y                 a "factor" interpreted to be the dependent variable
formula           a "formula" of type y ~ x with a single dependent "factor" and a single explanatory variable.
data              an optional data frame.
breaks            if the explanatory variable is numeric, this controls how it is discretized. breaks is passed to hist and can be a list of arguments.
ylab_tol          convenience tolerance parameter for y-axis annotation. If the distance between two labels drops under this threshold, they are plotted equidistantly.
off               vertical offset between the bars (in per cent). It is fixed to 0 for spinograms and defaults to 2 for spine plots.
main, xlab, ylab  character strings for annotation
ylim              limits for the y axis
margins           margins when calling plotViewport
gp                a "gpar" object controlling the grid graphical parameters of the rectangles. It should specify in particular a vector of fill colors of the same length as levels(y). The default is to call gray.colors.
name              name of the plotting viewport.
newpage           logical. Should grid.newpage be called before plotting?
pop               logical. Should the viewport created be popped?
...               additional arguments passed to plotViewport.

Details

spine creates either a spinogram or a spine plot. It can be called via spine(x, y) or spine(y ~ x) where y is interpreted to be the dependent variable (and has to be categorical) and x the explanatory variable. x can be either categorical (then a spine plot is created) or numerical (then a spinogram is plotted). Additionally, spine can also be called with only a single argument which then has to be a 2-way table, interpreted to correspond to table(x, y).

Spine plots are a generalization of stacked bar plots where not the heights but the widths of the bars correspond to the relative frequencies of x. The heights of the bars then correspond to the conditional relative frequencies of y in every x group. This is a special case of a mosaic plot with specific spacing and shading.

Analogously, spinograms extend stacked histograms. As for the histogram, x is first discretized (using hist) and then for the discretized data a spine plot is created.

Value

The table visualized is returned invisibly.
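The geometry described in Details can be worked out numerically for any 2-way table. The base R sketch below uses the built-in mtcars data purely as an illustration (it is not part of vcd) and computes the bar widths and segment heights a spine plot would display.

```r
## Illustrative 2-way table from base R data: x = cylinders, y = transmission
tab <- table(cyl = mtcars$cyl, am = mtcars$am)

## Bar widths: relative frequencies of the explanatory variable x
widths <- prop.table(margin.table(tab, 1))

## Segment heights within each bar: conditional relative frequencies
## of y given x (each row sums to 1)
heights <- prop.table(tab, 1)

widths
heights
```

Because the widths sum to 1 and each bar's segments sum to 1, the area of each segment equals the joint relative frequency of the corresponding (x, y) cell, which is why the spine plot is a special case of a mosaic plot.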

Author(s)

Achim Zeileis <Achim.Zeileis@R-project.org>

References

Hummel, J. (1996), Linked bar charts: Analysing categorical data graphically. Computational Statistics, 11, 23–33.
Hofmann, H., Theus, M. (2005), Interactive graphics for visualizing conditional distributions, Unpublished Manuscript.

See Also

cd_plot, mosaic, hist

Examples

## Arthritis data (dependence on a categorical variable)
data("Arthritis")
(spine(Improved ~ Treatment, data = Arthritis))

## Arthritis data (dependence on a numerical variable)
(spine(Improved ~ Age, data = Arthritis, breaks = 5))
(spine(Improved ~ Age, data = Arthritis, breaks = quantile(Arthritis$Age)))
(spine(Improved ~ Age, data = Arthritis, breaks = "Scott"))

## Space shuttle data (dependence on a numerical variable)
data("SpaceShuttle")
(spine(Fail ~ Temperature, data = SpaceShuttle, breaks = 3))

strucplot

Structured Displays of Contingency Tables

Description

This modular function visualizes certain aspects of high-dimensional contingency tables in a hierarchical way.

Usage

strucplot(x, residuals = NULL, expected = NULL, condvars = NULL,
  shade = NULL, type = c("observed", "expected"), residuals_type = NULL,
  df = NULL, split_vertical = NULL,
  spacing = spacing_equal, spacing_args = list(),
  gp = NULL, gp_args = list(),
  labeling = labeling_border, labeling_args = list(),
  core = struc_mosaic, core_args = list(),
  legend = NULL, legend_args = list(),
  main = NULL, sub = NULL, margins = unit(3, "lines"),
  title_margins = NULL, legend_width = NULL,
  main_gp = gpar(fontsize = 20), sub_gp = gpar(fontsize = 15),
  newpage = TRUE, pop = TRUE, keep_aspect_ratio = NULL, prefix = "", ...)

Arguments

x                  a contingency table in array form, with optional category labels specified in the dimnames attribute.
residuals          optionally, an array of residuals of the same dimension as x (see details).
expected           optionally, an array of expected values of the same dimension as x, or alternatively the corresponding independence model specification as used by loglin or loglm (see details).
df                 degrees of freedom passed to the shading functions used for inference. Will be calculated (and overwritten if specified) if both expected and residuals are NULL, or if expected is given a formula.
condvars           number of conditioning variables, if any; those are expected to be ordered first in the table. This information is used for computing the expected values, and is also passed to the spacing functions (see spacings).
shade              logical specifying whether gp should be used or not (see gp). If TRUE and expected is unspecified, a default model is fitted: if condvars is specified, a corresponding conditional independence model, and else the total independence model.
residuals_type     a character string indicating the type of residuals to be computed when none are supplied. If residuals is NULL, residuals_type must be one of "pearson" (default; giving components of Pearson's chi-squared), "deviance" (giving components of the likelihood ratio chi-squared), or "FT" for the Freeman-Tukey residuals. The value of this argument can be abbreviated. If residuals are specified, the value of residuals_type is just passed "as is" to the legend function.
type               a character string indicating whether the observed or the expected values of the table should be visualized.
split_vertical     vector of logicals of length k, where k is the number of margins of x (values are recycled as needed). A TRUE component indicates that the tile(s) of the corresponding dimension should be split vertically, FALSE means horizontal splits. Default is FALSE.
spacing            spacing object, spacing function, or a corresponding generating function (see details and spacings).
spacing_args       list of arguments for the spacing-generating function, if specified.
gp                 object of class "gpar", shading function or a corresponding generating function (see details and shadings). Components of "gpar" objects are recycled as needed along the last splitting dimension. Ignored if shade = FALSE.
gp_args            list of arguments for the shading-generating function, if specified.
labeling           either a logical, or a labeling function, or a corresponding generating function (see details and labelings). If FALSE or NULL, no labeling is produced.
labeling_args      list of arguments for the labeling-generating function, if specified.
core               either a core function, or a corresponding generating function (see details). Currently, generating functions for mosaic plots (struc_mosaic), association plots (struc_assoc), and sieve plots (struc_sieve) are provided.
core_args          list of arguments for the core-generating function, if specified.
legend             either a legend-generating function, or a legend function (see details and legends), or a logical. If legend is NULL or TRUE and gp is a function, legend defaults to legend_resbased.
legend_args        list of arguments for the legend-generating function, if specified.
main               either a logical, or a character string used for plotting the main title. If main is a logical and TRUE, the name of the object supplied as x is used.
sub                a character string used for plotting the subtitle. If sub is a logical and TRUE and main is unspecified, the name of the object supplied as x is used.
margins            either an object of class "unit" of length 4, or a numeric vector of length 4. The elements are recycled as needed. The four components specify the top, right, bottom, and left margin of the plot, respectively. When a numeric vector is supplied, the numbers are interpreted as "lines" units. In addition, the unit or numeric vector may have named arguments ('top', 'right', 'bottom', and 'left'), in which case the non-named arguments specify the default values (recycled as needed), overloaded by the named arguments.
title_margins      either an object of class "unit" of length 2, or a numeric vector of length 2. The elements are recycled as needed. The two components specify the top and bottom title margin of the plot, respectively. The default for each specified title is 2 lines (and 0 else), except when a legend is plotted and keep_aspect_ratio is TRUE: in this case, the default values of both margins are set so as to align the heights of legend and actual plot. When a numeric vector is supplied, the numbers are interpreted as "lines" units. In addition, the unit or numeric vector may have named arguments ('top' and 'bottom'), in which case the non-named argument specifies the default value (recycled as needed), overloaded by the named arguments.
legend_width       An object of class "unit" of length 1 specifying the width of the legend (if any). Default: 5 lines.
pop                logical indicating whether the generated viewport tree should be removed at the end of the drawing or not.
main_gp, sub_gp    object of class "gpar" containing the graphical parameters used for the main (sub) title, if specified.
newpage            logical indicating whether a new page should be created for the plot or not.
keep_aspect_ratio  logical indicating whether the aspect ratio should be fixed or not. If unspecified, the default is TRUE for two-dimensional tables and FALSE otherwise.
prefix             optional character string used as a prefix for the generated viewport and grob names.
...                For convenience, list of arguments passed to the labeling-generating function used.

Details

This function, usually called by higher-level functions such as assoc and mosaic, generates conditioning plots of contingency tables. First, it sets up a set of viewports for main- and subtitles, legend, and the actual plot region. Then, residuals are computed as needed from observed and expected frequencies, where the expected frequencies are optionally computed for a specified independence model. Finally, the specified functions for spacing, gp, main plot, legend, and labeling are called to produce the plot. The function invisibly returns the "structable" object visualized.

Most elements of the plot, such as the core function, the spacing between the tiles, the shading of the tiles, the labeling, and the legend, are modularized in graphical appearance control ("grapcon") functions and specified as parameters. For each element foo (= spacing, labeling, core, or legend), strucplot takes two arguments: foo and foo_args, which can be used to specify the parameters in the following alternative ways:

1. Passing a suitable function to foo which subsequently will be called from strucplot to compute shadings, labelings, etc.
2. Passing a corresponding generating function to foo, along with parameters passed to foo_args, that generates such a function. Generating functions must inherit from classes "grapcon_generator" and "foo".
3. Except for the shading functions (shading_bar), passing foo(foo_args) to the foo argument.
4. For shadings and spacings, passing the final parameter object itself; see the corresponding help pages for more details on the data structures.

If legends are drawn, a 'cinemascope'-like layout is used for the plot to preserve the 1:1 aspect ratio.

If type = "expected", the expected values are passed to the observed argument of the core function, and the observed values to the expected argument.

Although the gp argument is typically used for shading, it can be used for arbitrary modifications of the tiles' graphics parameters (e.g., for highlighting particular cells, etc.).

Value

Invisibly, an object of class "structable" corresponding to the plot.

Note

The created viewports, as well as the tiles and bullets, are named and thus can conveniently be modified after a plot has been drawn (and pop = FALSE).

Author(s)

David Meyer <David.Meyer@R-project.org>

References

Meyer D., Zeileis A., and Hornik K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

action a formula object with possibly both left and right hand sides specifying the column and row variables of the ﬂat table. a data frame.action) ## Default S3 method: structable(.5. pop = FALSE) grid. doubledecker..edit("rect:Class=1st. direction = NULL... list or environment containing the variables to be cross-tabulated. . spacing_args = list(start = . structable. split_vertical = FALSE) Arguments formula data subset na. gp = gpar(fill = "red")) structable Structured Contingency Tables Description This function produces a ‘ﬂat’ representation of a high-dimensional contingency table constructed by recursive splits (similar to the construction of mosaic displays). direction = NULL.5)) strucplot(Titanic. struc_mosaic. shadings. Usage ## S3 method for class ’formula’ structable(formula..Age=Adult. struc_assoc. na. Ignored if data is a contingency table .. spacings Examples data("Titanic") strucplot(Titanic) strucplot(Titanic. subset.5. mosaic. labelings. struc_sieve. spacing = spacing_increase.98 See Also structable assoc.5)) ## modify a tile’s color strucplot(Titanic. sieve. an optional vector specifying a subset of observations to be used.. legends. Ignored if data is a contingency table.Survived=Yes". core = struc_assoc) strucplot(Titanic. or an object inheriting from class table. a function which indicates what should happen when the data contain NAs. rate = 1. rate = 1. data. split_vertical = NULL.Sex=Male. spacing = spacing_increase(start = .

...               R objects which can be interpreted as factors (including character strings), or a list (or data frame) whose components can be so interpreted, or a contingency table object of class "table" or "ftable".
direction         character vector alternatively specifying the splitting direction ("h" for horizontal and "v" for vertical splits). Values are recycled as needed. If the argument is of length 1, the value is alternated for all dimensions.
split_vertical    logical vector indicating, for each dimension, whether it should be split vertically or not (default: FALSE). Values are recycled as needed. If the argument is of length 1, the value is alternated for all dimensions. Ignored if direction is provided.

Details

This function produces textual representations of mosaic displays, and thus 'flat' contingency tables. The formula interface is quite similar to the one of ftable, but also accepts the mosaic-like formula interface (empty left-hand side). Note that even if the ftable interface is used, the split_vertical or direction argument is needed to specify the order of the horizontal and vertical splits. If pretabulated data with a Freq column is used, then the left-hand side should be left empty; the Freq column will be handled correctly.

"structable" objects can be subset using the [ and [[ operators, using either level indices or names (see examples). The corresponding replacement functions are available as well. In addition, appropriate aperm, cbind, rbind, length, dim, and is.na methods do exist.

Value

An object of class "structable", inheriting from class "ftable", with the splitting information ("split_vertical") as additional attribute.

Author(s)

David Meyer <David.Meyer@R-project.org>

References

Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

See Also

strucplot, mosaic, ftable

Examples

structable(Titanic)
structable(Titanic, split_vertical = c(TRUE, TRUE, FALSE, FALSE))
structable(Titanic, direction = c("h","h","v","v"))
structable(Sex + Class ~ Survived + Age, data = Titanic)

## subsetting of structable objects
(hec <- structable(aperm(HairEyeColor)))

## The "[" operator treats structables as a block-matrix and selects parts of the matrix:
hec[1]
hec[2]
hec[1,c(2,4)]
hec["Male",c("Blue","Green")]

## replacement function:
tmp <- hec
(tmp[1,2:3] <- tmp[2,c(1,4)])

## In contrast, the "[[" operator treats structables as two-dimensional
## lists. Indexing conditions on specified levels and thus reduces the dimensionality:

## seek subtables conditioning on levels of the first dimension:
hec[[1]]
hec[[2]]

## Seek subtable from the first two dimensions, given the level "Male"
## of the first variable, and "Brown" from the second
## (the following two commands are equivalent):
hec[["Male"]][["Brown"]]
hec[[c("Male","Brown")]]

## Seeking subtables by conditioning on row and/or column variables:
hec[["Male","Hazel"]]
hec[[c("Male","Brown"),]]
hec[[c("Male","Brown"),"Hazel"]]

## a few other operations
t(hec)
dim(hec)
dimnames(hec)
as.matrix(hec)
length(hec)
cbind(hec[,1],hec[,3])

as.vector(hec) ## computed on the _multiway_ table
as.vector(unclass(hec))

struc_assoc    Core-generating Function for Association Plots

Description

Core-generating function for strucplot returning a function producing association plots.

Usage

struc_assoc(compress = TRUE, xlim = NULL, ylim = NULL,
            yspace = unit(0.5, "lines"), xscale = 0.9, gp_axis = gpar(lty = 3))

Arguments

compress    logical; if FALSE, the spaces between the rows (columns) are chosen such that the total heights (widths) of the rows (columns) are all equal. If TRUE, the space between the rows and columns is fixed and hence the plot is more "compressed".
xlim        either a 2 × k matrix of doubles, k the number of total columns of the plot, or a recycled vector from which such a matrix will be constructed. The columns of xlim correspond to the columns of the association plot, the rows describe the column ranges (minimums in the first row, maximums in the second row). If xlim is NULL, the ranges are determined from the residuals according to compress (if TRUE: widest range from each column, if FALSE: from the whole association plot matrix).
ylim        either a 2 × k matrix of doubles, k the number of total rows of the plot, or a recycled vector from which such a matrix will be constructed. The columns of ylim correspond to the rows of the association plot, the rows describe the column ranges (minimums in the first row, maximums in the second row). If ylim is NULL, the ranges are determined from the residuals according to compress (if TRUE: widest range from each row, if FALSE: from the whole association plot matrix).
xscale      scale factor resizing the tile's width, thus adding additional space between the tiles.
yspace      object of class "unit" specifying additional space separating the rows.
gp_axis     object of class "gpar" specifying the visual aspects of the tiles' baseline.

Details

This function is usually called by strucplot (typically when called by assoc) and returns a function used by strucplot to produce association plots.

Value

A function with arguments:

residuals         table of residuals.
observed          not used by struc_assoc.
expected          table of expected frequencies.
spacing           object of class "unit" specifying the space between the tiles.
gp                list of gpar objects used for drawing the tiles.
split_vertical    vector of logicals indicating, for each dimension of the table, the split direction.

Author(s)

David Meyer <David.Meyer@R-project.org>

References

Cohen, A. (1980), On the graphical display of the significant components in a two-way contingency table. Communications in Statistics - Theory and Methods, A9, 1025–1041.

Friendly, M. (1992), Graphical methods for categorical data. SAS User Group International Conference Proceedings, 17, 190–200. http://datavis.ca/sugi/sugi17.pdf

Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

See Also

assoc, strucplot, structable

Examples

## UCB Admissions
data("UCBAdmissions")
ucb <- aperm(UCBAdmissions)

## association plot for conditional independence
strucplot(ucb,
          expected = ~ Dept * (Admit + Gender),
          core = struc_assoc(ylim = c(-4, 4)),
          labeling_args = list(abbreviate = c(Admit = 3)))

struc_mosaic    Core-generating Function for Mosaic Plots

Description

Core-generating function for strucplot returning a function producing mosaic plots.

Usage

struc_mosaic(zero_size = 0.5, zero_split = FALSE, zero_shade = TRUE,
             zero_gp = gpar(col = 0), panel = NULL)

Arguments

zero_size     size of the bullets used for zero-entries in the contingency table (if 0, no bullets are drawn).
zero_split    logical controlling whether zero cells should be further split. If FALSE and zero_shade is FALSE, only one bullet is drawn (centered) for unsplit zero cells. If FALSE and zero_shade is TRUE, a bullet for each zero cell is drawn to allow, e.g., residual-based shadings to be effective also for zero cells.
zero_shade    logical controlling whether zero bullets should be shaded.
zero_gp       object of class "gpar" used for zero bullets in case they are not shaded.

panel         optional function with arguments: residuals, observed, expected, index, gp, and name, called by the struc_mosaic workhorse for each tile that is drawn in the mosaic. index is an integer vector with the tile's coordinates in the contingency table, gp a gpar object for the tile, and name a label to be assigned to the drawn grid object.

Details

This function is usually called by strucplot (typically when called by mosaic) and returns a function used by strucplot to produce mosaic plots.

Value

A function with arguments:

residuals         table of residuals.
observed          table of observed values.
expected          not used by struc_mosaic.
spacing           object of class "unit" specifying the space between the tiles.
gp                list of gpar objects used for drawing the tiles.
split_vertical    vector of logicals indicating, for each dimension of the table, the split direction.

Author(s)

David Meyer <David.Meyer@R-project.org>

References

Cohen, A. (1980), On the graphical display of the significant components in a two-way contingency table. Communications in Statistics - Theory and Methods, A9, 1025–1041.

Friendly, M. (1992), Graphical methods for categorical data. SAS User Group International Conference Proceedings, 17, 190–200. http://datavis.ca/sugi/sugi17.pdf

Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

See Also

mosaic, strucplot, structable

Examples

## Titanic data
data("Titanic")

## mosaic plot with large zeros
strucplot(Titanic, core = struc_mosaic(zero_size = 1))
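The "core-generating function" idea described above (a function that takes tuning parameters and returns the function that does the actual work) can be illustrated with a small base-R closure. This is only a sketch of the pattern under stated assumptions: make_core, run_core and the "generator" class tag are hypothetical names invented for this illustration, nothing is drawn, and the code is not how vcd implements struc_mosaic or strucplot.

```r
# Hypothetical generator: returns a closure that remembers zero_size.
# NOT vcd's struc_mosaic(); it only mimics the calling convention.
make_core <- structure(
  function(zero_size = 0.5) {
    # The returned function accepts the arguments listed under "Value".
    function(residuals = NULL, observed, expected = NULL,
             spacing = NULL, gp = NULL, split_vertical = NULL) {
      # Instead of drawing, report what a mosaic core would have to handle.
      list(zero_size = zero_size, zero_cells = sum(observed == 0))
    }
  },
  class = "generator"  # hypothetical tag marking "this is a generator"
)

# Hypothetical dispatcher mimicking the core / core_args convention:
# a tagged generator is first called with core_args, anything else is
# assumed to be a ready-made core function.
run_core <- function(observed, core, core_args = list()) {
  if (inherits(core, "generator")) core <- do.call(core, core_args)
  core(observed = observed)
}

a  <- run_core(Titanic, core = make_core)                        # defaults
b  <- run_core(Titanic, core = make_core,
               core_args = list(zero_size = 1))                  # generator + args
c2 <- run_core(Titanic, core = make_core(zero_size = 1))         # pre-built core
```

In this sketch the last two calls give identical results, which is the point of offering both calling styles: parameters can travel separately from the generator or be baked into the generated function.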

struc_sieve    Core-generating Function for Sieve Plots

Description

Core-generating function for strucplot returning a function producing sieve plots.

Usage

struc_sieve(sievetype = c("observed","expected"), gp_tile = gpar())

Arguments

sievetype    character string indicating whether rectangles should be filled according to observed or expected frequencies.
gp_tile      object of class "gpar", controlling the appearance of all static elements of the cells (e.g., border and fill color).

Details

This function is usually called by strucplot (typically when called by sieve) and returns a function used by strucplot to produce sieve plots.

Value

A function with arguments:

residuals         table of residuals.
observed          table of observed values.
expected          not used by struc_sieve.
spacing           object of class "unit" specifying the space between the tiles.
gp                list of gpar objects used for drawing the tiles.
split_vertical    vector of logicals indicating, for each dimension of the table, the split direction.

Author(s)

David Meyer <David.Meyer@R-project.org>

References

Riedwyl, H., and Schüpbach, M. (1994), Parquet diagram to plot contingency tables. In F. Faulbaum (ed.), Softstat '93: Advances in Statistical Software, 293–299. Gustav Fischer, New York.

Friendly, M. (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Meyer, D., Zeileis, A., and Hornik, K. (2006), The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1-48. URL http://www.jstatsoft.org/v17/i03/ and available as vignette("strucplot").

See Also

sieve, strucplot, structable

Examples

## Titanic data
data("Titanic")
strucplot(Titanic, core = struc_sieve)

Suicide    Suicide Rates in Germany

Description

Data from Heuer (1979) on suicide rates in West Germany classified by age, sex, and method of suicide.

Usage

data("Suicide")

Format

A data frame with 306 observations and 6 variables.

Freq         frequency of suicides.
sex          factor indicating sex (male, female).
method       factor indicating method used.
age          age (rounded).
age.group    factor. Age classified into 5 groups.
method2      factor indicating method used (same as method but some levels are merged).

Source

Michael Friendly (2000), Visualizing Categorical Data: http://euclid.psych.yorku.ca/ftp/sas/vcd/catdata/suicide.sas

References

J. Heuer (1979), Selbstmord bei Kindern und Jugendlichen. Ernst Klett Verlag, Stuttgart.

M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

data("Suicide")
structable(~ sex + method2 + age.group, data = Suicide)
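Suicide is a pretabulated data frame: one row per cell plus a Freq column. How such data turn into a flat table can be sketched with base R alone. The sketch below is an illustration under assumptions: it uses the Titanic table shipped with base R (converted to a Freq-style data frame) as a stand-in, because Suicide requires vcd, and it uses ftable rather than structable, so the layout is only analogous to what structable prints.

```r
# Stand-in for a pretabulated data frame with a Freq column
# (columns: Class, Sex, Age, Survived, Freq).
df <- as.data.frame(Titanic)

# Cross-tabulate three of the factors, summing Freq within each cell.
tab <- xtabs(Freq ~ Sex + Class + Survived, data = df)

# Flatten: Sex and Class index the rows, Survived the columns.
flat <- ftable(tab, row.vars = c("Sex", "Class"))
print(flat)
```

With Freq on the left-hand side of the xtabs formula the counts are summed rather than the rows counted, which is what a pretabulated data set needs.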

table2d_summary    Summary of a 2-way Table

Description

Prints a 2-way contingency table along with percentages, marginal, and conditional distributions.

Usage

table2d_summary(object, margins = TRUE, percentages = FALSE,
                conditionals = c("none", "row", "column"), ...)

Arguments

object          a r × c-contingency table
margins         if TRUE, marginal distributions are computed.
percentages     if TRUE, relative frequencies are computed.
conditionals    if not "none", the conditional distributions, given the row/column factor, are computed.
...             currently not used.

Value

Returns invisibly a r × c × k table, k depending on the amount of choices (at most 3).

Author(s)

David Meyer <David.Meyer@R-project.org>

See Also

mar_table, prop.table, independence_table

Examples

data("UCBAdmissions")
table2d_summary(margin.table(UCBAdmissions, 1:2))
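The quantities that table2d_summary reports (margins, percentages, conditional distributions) can each be computed with base R. The following is a minimal sketch of those computations, not a reimplementation of table2d_summary; the variable names are chosen for this illustration only, and the output layout differs from what the vcd function prints.

```r
# 2 x 2 table (Admit by Gender), collapsed over Dept.
tab <- margin.table(UCBAdmissions, 1:2)

# Marginal distributions: row and column totals appended as "Sum".
with_margins <- addmargins(tab)

# Relative frequencies, expressed as percentages of the grand total.
percentages <- round(100 * prop.table(tab), 1)

# Conditional distributions given the row factor and the column factor.
row_cond <- prop.table(tab, margin = 1)  # each row sums to 1
col_cond <- prop.table(tab, margin = 2)  # each column sums to 1

print(with_margins)
print(percentages)
```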

ternaryplot    Ternary Diagram

Description

Visualizes compositional, 3-dimensional data in an equilateral triangle.

Usage

ternaryplot(x, scale = 1, dimnames = NULL,
            dimnames_position = c("corner","edge","none"),
            dimnames_color = "black", id = NULL, id_color = "black",
            id_just = c("center", "center"),
            coordinates = FALSE, grid = TRUE, grid_color = "gray",
            labels = c("inside", "outside", "none"),
            labels_color = "darkgray", border = "black", bg = "white",
            pch = 19, cex = 1, prop_size = FALSE, col = "red",
            main = "ternary plot", newpage = TRUE, pop = TRUE, ...)

Arguments

x              a matrix with three columns.
scale          row sums scale to be used.
dimnames       dimension labels (defaults to the column names of x).
dimnames_position, dimnames_color
               position and color of dimension labels.
id             optional labels to be plotted below the plot symbols. coordinates and id are mutually exclusive.
id_color       color of these labels.
id_just        character vector of length 1 or 2 indicating the justification of these labels.
coordinates    if TRUE, the coordinates of the points are plotted below them. coordinates and id are mutually exclusive.
grid           if TRUE, a grid is plotted. May optionally be a string indicating the line type (default: "dotted").
grid_color     grid color.
labels, labels_color
               position and color of the grid labels.
border         color of the triangle border.
bg             triangle background.
pch            plotting character. Defaults to filled dots.
cex            a numerical value giving the amount by which plotting text and symbols should be scaled relative to the default. Ignored for the symbol size if prop_size is not FALSE.

prop_size      if TRUE, the symbol size is plotted proportional to the row sum of the three variables, i.e., represents the weight of the observation.
col            plotting color.
main           main title.
newpage        if TRUE, the plot will appear on a new graphics page.
pop            logical; if TRUE, all newly generated viewports are popped after plotting.
...            additional graphics parameters (see par)

Details

A point's coordinates are found by computing the gravity center of mass points using the data entries as weights. Thus, the coordinates of a point P(a, b, c), a + b + c = 1, are: $P(b + c/2,\ c\sqrt{3}/2)$.

Author(s)

David Meyer <David.Meyer@R-project.org>

References

M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

data("Arthritis")
## Build table by crossing Treatment and Sex
tab <- as.table(xtabs(~ I(Sex:Treatment) + Improved, data = Arthritis))

## Mark groups
col <- c("red", "red", "blue", "blue")
pch <- c(1, 19, 1, 19)

## plot
ternaryplot(
  tab,
  col = col,
  pch = pch,
  prop_size = TRUE,
  bg = "lightgray",
  grid_color = "white",
  labels_color = "white",
  main = "Arthritis Treatment Data"
)
## legend
grid_legend(0.8, 0.7, pch, col, rownames(tab), title = "GROUP")

## Titanic
data("Lifeboats")
attach(Lifeboats)
ternaryplot(
  Lifeboats[,4:6],
  pch = ifelse(side == "Port", 1, 19),
  col = ifelse(side == "Port", "red", "blue"),
  id = ifelse(men / total > 0.1, as.character(boat), NA),
  prop_size = 2,
  dimnames_position = "edge",
  main = "Lifeboats on Titanic"
)
grid_legend(0.8, 0.9, c(1, 19), c("red", "blue"), c("Port", "Starboard"), title = "SIDE")

## Hitters
data("Hitters")
attach(Hitters)
colors <- c("black","red","green","blue","red","black","blue")
pch <- substr(levels(Positions), 1, 1)
ternaryplot(
  Hitters[,2:4],
  pch = as.character(Positions),
  col = colors[as.numeric(Positions)],
  main = "Baseball Hitters Data"
)
grid_legend(0.8, 0.9, pch, colors, levels(Positions), title = "POSITION(S)")

tile    Tile Plot

Description

Plots a tile display.

Usage

## Default S3 method:
tile(x, tile_type = c("area", "squaredarea", "height", "width"),
     halign = c("left", "center", "right"),
     valign = c("bottom", "center", "top"),
     split_vertical = NULL, shade = FALSE,
     spacing = spacing_equal(unit(1, "lines")),
     set_labels = NULL, margins = unit(3, "lines"),
     keep_aspect_ratio = FALSE, legend_width = NULL,
     squared_tiles = TRUE, main = NULL, sub = NULL, ...)

## S3 method for class 'formula'
tile(formula, data, ..., main = NULL, sub = NULL,
     subset = NULL, na.action = NULL)

Arguments

x                    a contingency table, or an object coercible to one.
formula              a formula specifying the variables used to create a contingency table from data.
data                 either a data frame, or an object of class "table" or "ftable".
subset               an optional vector specifying a subset of observations to be used.
na.action            a function which indicates what should happen when the data contain NAs. Ignored if data is a contingency table.
tile_type            character string indicating how the tiles should reflect the table frequencies (see details).
halign, valign       character string specifying the horizontal and vertical alignment of the tiles.
split_vertical       vector of logicals of length k, where k is the number of margins of x (values are recycled as needed). A TRUE component indicates that the tile(s) of the corresponding dimension should be split vertically, FALSE means horizontal splits.
spacing              spacing object, spacing function, or corresponding generating function (see strucplot for more information).
set_labels           an optional character vector with named components replacing the so-specified variable names. The component names must exactly match the variable names to be replaced.
shade                logical specifying whether shading should be enabled or not (see strucplot). Default is FALSE.
margins              either an object of class "unit" of length 4, or a numeric vector of length 4. The elements are recycled as needed. The four components specify the top, right, bottom, and left margin of the plot, respectively. When a numeric vector is supplied, the numbers are interpreted as "lines" units. In addition, the unit or numeric vector may have named arguments ('top', 'right', 'bottom', and 'left'), in which case the non-named arguments specify the default values (recycled as needed), overloaded by the named arguments.
legend_width         an object of class "unit" of length 1 specifying the width of the legend (if any). Default: 5 lines.
keep_aspect_ratio    logical indicating whether the aspect ratio should be fixed or not. The default is FALSE to enable the creation of squared tiles.
squared_tiles        logical indicating whether white space should be added as needed to rows or columns to obtain squared tiles in case of an unequal number of row and column labels.
main, sub            either a logical, or a character string used for plotting the main (sub) title. If logical and TRUE, the name of the data object is used.
...                  other arguments passed to strucplot

Details

A tile plot is a matrix of tiles. For each tile, either the "width", "height", "area", or squared area is proportional to the corresponding entry. The first three options allow column-wise, row-wise and overall comparisons, respectively. The last variant allows comparing the tiles both column-wise and row-wise, considering either the width or the height, respectively.

In contrast to other high-level strucplot functions, tile also accepts a table with duplicated levels (see examples). In this case, artificial dimnames will be created, and the actual ones are drawn using set_labels.

Note that multiway-tables are first "flattened" using structable.

Value

The "structable" visualized is returned invisibly.

Author(s)

David Meyer <David.Meyer@R-project.org>

See Also

assoc, strucplot, mosaic, structable

Examples

data("Titanic")

## default plot
tile(Titanic)
tile(Titanic, type = "expected")
tile(Titanic, shade = TRUE)

## some variations
tile(Titanic, tile_type = "width", squared_tiles = FALSE)
tile(Titanic, tile_type = "height", squared_tiles = FALSE)
tile(Titanic, tile_type = "area", halign = "center", valign = "center")

## repeat levels
tile(Titanic[,,,c(1,2,1,2)])

Trucks    Truck Accidents Data

Description

Data from a study in England in two periods from November 1969 to October 1971 and November 1971 to October 1973. A new compulsory safety measure for trucks was introduced in October 1971. Therefore, the question is whether the safety measure had an effect on the number of accidents and on the point of collision on the truck.

Usage

data("Trucks")

Format

A data frame with 24 observations on 5 variables.

Freq         frequency of accidents involving trucks.
period       factor indicating time period (before, after) 1971-11-01.
collision    factor indicating whether the collision was in the back or forward (including the front and the sides) of the truck (back, forward).
parked       factor indicating whether the truck was parked (yes, no).
light        factor indicating light conditions: day light (daylight), night on an illuminated road (night, illuminate), night on a dark road (night, dark).

Source

E. B. Andersen (1991), The Statistical Analysis of Categorical Data, Table 6.8.

References

E. B. Andersen (1991), The Statistical Analysis of Categorical Data. 2nd edition. Springer-Verlag, Berlin.

Examples

data("Trucks")
tab <- xtabs(Freq ~ period + collision + light + parked, data = Trucks)
loglm(~ (collision + period) * parked * light, data = tab)
doubledecker(collision ~ parked + light + period, data = tab)
cotabplot(tab, panel = cotab_coindep)

UKSoccer    UK Soccer Scores

Description

Data from Lee (1997), on the goals scored by Home and Away teams in the Premier Football League, 1995/6 season.

Usage

data("UKSoccer")

Format

A 2-dimensional array resulting from cross-tabulating the number of goals scored in 380 games. The variables and their levels are as follows:

No    Name    Levels
1     Home    0, 1, ..., 4
2     Away    0, 1, ..., 4

Source

M. Friendly (2000), Visualizing Categorical Data, page 27.

References

A. J. Lee (1997), Modelling scores in the Premier League: Is Manchester United really the best?, Chance, 10(1), 15–19.

M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

See Also

Bundesliga

Examples

data("UKSoccer")
mosaic(UKSoccer, gp = shading_max, main = "UK Soccer Scores")

VisualAcuity    Visual Acuity in Left and Right Eyes

Description

Data from Kendall & Stuart (1961) on unaided vision among 3,242 men and 7,477 women, all aged 30-39 and employed in the U.K. Royal Ordnance factories 1943-1946.

Usage

data("VisualAcuity")

Format

A data frame with 32 observations and 4 variables.

Freq      frequency of visual acuity measurements.
right     visual acuity on right eye.
left      visual acuity on left eye.
gender    factor indicating gender of patient.

Source

M. Friendly (2000), Visualizing Categorical Data: http://euclid.psych.yorku.ca/ftp/sas/vcd/catdata/vision.sas

References

M. G. Kendall & A. Stuart (1961), The Advanced Theory of Statistics, Vol. 2. Griffin, London.

M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

data("VisualAcuity")
structable(~ gender + left + right, data = VisualAcuity)

VonBort    Von Bortkiewicz Horse Kicks Data

Description

Data from von Bortkiewicz (1898), given by Andrews & Herzberg (1985), on number of deaths by horse or mule kicks in 14 corps of the Prussian army.

Usage

data("VonBort")

Format

A data frame with 280 observations and 4 variables.

deaths    number of deaths.
year      year of the deaths.
corps     factor indicating the corps.
fisher    factor indicating whether the corresponding corps was considered by Fisher (1925) or not.

Source

Michael Friendly (2000), Visualizing Categorical Data: http://euclid.psych.yorku.ca/ftp/sas/vcd/catdata/vonbort.sas

References

D. F. Andrews & A. M. Herzberg (1985), Data: A Collection of Problems from Many Fields for the Student and Research Worker. Springer-Verlag, New York, NY.

R. A. Fisher (1925), Statistical Methods for Research Workers. Oliver & Boyd, London.

L. von Bortkiewicz (1898), Das Gesetz der kleinen Zahlen. Teubner, Leipzig.

M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

See Also

HorseKicks for a popular subsample.

Examples

data("VonBort")
## HorseKicks data
xtabs(~ deaths, data = VonBort, subset = fisher == "yes")

WeldonDice    Weldon's Dice Data

Description

Data from Pearson (1900) about the frequency of 5s and 6s in throws of 12 dice. Weldon tossed the dice 26,306 times and reported his results in a letter to Francis Galton on 1894-02-02.

Usage

data("WeldonDice")

Format

A 1-way table giving the frequency of a 5 or a 6 in 26,306 throws of 12 dice where 10 indicates '10 or more' 5s or 6s. The variable and its levels are

No    Name    Levels
1     n56     0, 1, ..., 10

Source

M. Friendly (2000), Visualizing Categorical Data, pages 20–21.

References

K. Pearson (1900), On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen by random sampling, Philosophical Magazine, 50 (5th series), 157–175.

M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

data("WeldonDice")
gf <- goodfit(WeldonDice, type = "binomial")
summary(gf)
plot(gf)

WomenQueue    Women in Queues

Description

Data from Jinkinson & Slater (1981) and Hoaglin & Tukey (1985) reporting the frequency distribution of females in 100 queues of length 10 in a London Underground station.

Usage

data("WomenQueue")

Format

A 1-way table giving the number of women in 100 queues of length 10. The variable and its levels are

No    Name      Levels
1     nWomen    0, 1, ..., 10

Source

M. Friendly (2000), Visualizing Categorical Data, pages 19–20.

References

D. C. Hoaglin & J. W. Tukey (1985), Checking the shape of discrete distributions. In D. C. Hoaglin, F. Mosteller, J. W. Tukey (eds.), Exploring Data Tables, Trends and Shapes, chapter 9. John Wiley & Sons, New York.

R. J. Jinkinson & M. Slater (1981), Critical discussion of a graphical method for identifying discrete distributions, The Statistician, 30, 239–248.

M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

data("WomenQueue")
gf <- goodfit(WomenQueue, type = "binomial")
summary(gf)
plot(gf)
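What a binomial fit to such a 1-way table involves can be sketched in base R: estimate the success probability, derive expected frequencies, and compare them with the observed ones. The sketch below rests on assumptions that are worth stating: the counts are simulated (they are NOT the WomenQueue data), the probability is estimated as mean/size, and the Pearson statistic is computed without pooling sparse cells. It is not meant to reproduce the output of goodfit.

```r
set.seed(1)
size <- 10                                   # queue length
# Hypothetical data: number of women in each of 100 simulated queues.
x <- rbinom(100, size = size, prob = 0.45)

# Observed frequencies for every possible count 0, 1, ..., 10.
observed <- as.vector(table(factor(x, levels = 0:size)))

# Estimate of the success probability: mean count / size.
p_hat <- mean(x) / size

# Expected frequencies under Binomial(size, p_hat).
expected <- sum(observed) * dbinom(0:size, size = size, prob = p_hat)

# Pearson chi-squared discrepancy (no pooling of sparse cells).
pearson <- sum((observed - expected)^2 / expected)

round(cbind(count = 0:size, observed, expected), 2)
```

Large gaps between the observed and expected columns would point to a lack of fit, for instance overdispersion relative to the binomial.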

woolf_test    Woolf Test

Description

Test for homogeneity on 2 × 2 × k tables over strata (i.e., whether the log odds ratios are the same in all strata).

Usage

woolf_test(x)

Arguments

x    A 2 × 2 × k table.

Value

A list of class "htest" containing the following components:

statistic    the chi-squared test statistic.
parameter    degrees of freedom of the approximate chi-squared distribution of the test statistic.
p.value      p-value for the test.
method       a character string indicating the type of test performed.
data.name    a character string giving the name(s) of the data.
observed     the observed counts.
expected     the expected counts under the null hypothesis.

References

Woolf, B. 1955. On estimating the relation between blood group and disease. Ann. Human Genet. (London) 19, 251-253.

See Also

mantelhaen.test

Examples

data("CoalMiners")
woolf_test(CoalMiners)
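The idea behind the test can be written out in a few lines of base R: compute the log odds ratio in each stratum, weight it by the inverse of its estimated variance, and sum the weighted squared deviations from the pooled value. The sketch below follows that textbook formulation. woolf_sketch is a hypothetical helper written for this illustration, the 0.5 adjustment for zero cells is an assumption, and the function returns a plain list rather than an "htest" object, so it should not be read as the vcd implementation.

```r
woolf_sketch <- function(x) {
  stopifnot(length(dim(x)) == 3, all(dim(x)[1:2] == 2))
  if (any(x == 0)) x <- x + 0.5          # assumed adjustment for zero cells
  k <- dim(x)[3]

  # Log odds ratio of each 2 x 2 stratum.
  log_or <- apply(x, 3, function(s)
    log((s[1, 1] * s[2, 2]) / (s[1, 2] * s[2, 1])))

  # Inverse-variance weights: Var(log OR) is approx. 1/a + 1/b + 1/c + 1/d.
  w <- apply(x, 3, function(s) 1 / sum(1 / s))

  # Weighted mean log odds ratio and weighted squared deviations from it.
  pooled <- weighted.mean(log_or, w)
  statistic <- sum(w * (log_or - pooled)^2)
  df <- k - 1

  list(statistic = statistic, parameter = df,
       p.value = pchisq(statistic, df, lower.tail = FALSE))
}

# UCBAdmissions (base R) is a 2 x 2 x 6 table: Admit x Gender x Dept.
res <- woolf_sketch(UCBAdmissions)
res
```

A small p-value here suggests that the odds ratios differ across strata, in which case a single pooled odds ratio would be a questionable summary.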

