
Package ‘tree’

September 21, 2011
Title Classification and regression trees
Version 1.0-29
Date 2011-07-24
Depends R (>= 2.7.0), grDevices, graphics, stats
Author Brian Ripley <ripley@stats.ox.ac.uk>
Maintainer Brian Ripley <ripley@stats.ox.ac.uk>
Description Classification and Regression Trees.
LazyLoad yes
License GPL-2 | GPL-3
Repository CRAN
Date/Publication 2011-07-25 09:18:44

R topics documented:
cv.tree
deviance.tree
misclass.tree
na.tree.replace
partition.tree
plot.tree
plot.tree.sequence
predict.tree
prune.tree
snip.tree
text.tree
tile.tree
tree
tree.control
tree.screens
Index

cv.tree                 Cross-validation for Choosing Tree Complexity

Description

Runs a K-fold cross-validation experiment to find the deviance or number of misclassifications as a function of the cost-complexity parameter k.

Usage

cv.tree(object, rand, FUN = prune.tree, K = 10, ...)

Arguments

object   An object of class "tree".
rand     Optionally, an integer vector of length equal to the number of cases used to create object, assigning the cases to different groups for cross-validation.
FUN      The function to do the pruning.
K        The number of folds of the cross-validation.
...      Additional arguments to FUN.

Value

A copy of FUN applied to object, with component dev replaced by the cross-validated results from the sum of the dev components of each fit.

Author(s)

B. D. Ripley

See Also

tree, prune.tree

Examples

data(cpus, package="MASS")
cpus.ltr <- tree(log10(perf) ~ syct + mmin + mmax + cach + chmin + chmax, data=cpus)
cv.tree(cpus.ltr, , prune.tree)
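As a further illustration (not part of the original manual), the sketch below cross-validates a classification tree using prune.misclass, which is documented under prune.tree. It assumes the MASS package's fgl data set is available; the seed is set only because the fold assignment is random.

```r
library(tree)
data(fgl, package = "MASS")          # forensic glass data from MASS
fgl.tr <- tree(type ~ ., fgl)        # grow a classification tree
set.seed(1)                          # fold assignment is random
fgl.cv <- cv.tree(fgl.tr, FUN = prune.misclass)
fgl.cv$size                          # candidate subtree sizes
fgl.cv$dev                           # cross-validated misclassification counts
fgl.cv$size[which.min(fgl.cv$dev)]   # size with the smallest CV error
```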

deviance.tree           Extract Deviance from a Tree Object

Description

Extract deviance from a tree object.

Usage

## S3 method for class 'tree'
deviance(object, detail = FALSE, ...)

Arguments

object   an object of class "tree".
detail   logical. If true, returns a vector of deviance contributions from each node.
...      arguments to be passed to or from other methods.

Details

The quantities returned are weighted by the observational weights if these are supplied in the construction of tree.

Value

The overall deviance, or a vector of contributions from the cases at each node. The overall deviance is the sum over leaves in the latter case.

misclass.tree           Misclassifications by a Classification Tree

Description

Report the number of mis-classifications made by a classification tree, either overall or at each node.

Usage

misclass.tree(tree, detail = FALSE)

Arguments

tree     Object of class "tree", representing a classification tree.
detail   logical. If false, report the overall number of mis-classifications; if true, report the number at each node.

Value

Either the overall number of misclassifications or the number for each node.

Author(s)

B. D. Ripley

See Also

tree

Examples

ir.tr <- tree(Species ~., iris)
misclass.tree(ir.tr)
misclass.tree(ir.tr, detail=TRUE)

na.tree.replace         Replace NAs in Predictor Variables

Description

Adds a new level called "NA" to any discrete predictor in a data frame that contains NAs. Stops if any continuous predictor contains an NA.

Usage

na.tree.replace(frame)

Arguments

frame    data frame used to grow a tree.

Details

This function is used via the na.action argument to tree.

Value

data frame such that a new level named "NA" is added to any discrete predictor in frame with NAs.

See Also

tree, na.omit
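A minimal sketch of na.tree.replace in action; the data frame and its NAs are invented here purely for illustration.

```r
library(tree)
# a discrete predictor containing an NA, plus a complete continuous one
df <- data.frame(g = factor(c("a", "b", NA, "a")),
                 x = c(1.5, 2.0, 3.5, 4.0))
levels(na.tree.replace(df)$g)   # the factor gains an extra level "NA"
```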

partition.tree          Plot the Partitions of a Simple Tree Model

Description

Plot the partitions of a tree involving one or two variables.

Usage

partition.tree(tree, label = "yval", add = FALSE, ordvars, ...)

Arguments

tree     An object of class "tree".
label    A character string giving the column of the frame component of tree to be used to label the regions.
add      If true, add to the existing plot, otherwise start a new plot.
ordvars  The ordering of the variables to be used in a 2D plot. Specify the names in a character string of length 2; the first will be used on the x axis.
...      Graphical parameters.

Details

This can be used with a regression or classification tree containing one or two continuous predictors (only).

If the tree contains one predictor, the predicted value (a regression tree) or the probability of the first class (a classification tree) is plotted against the predictor over its range in the training set.

If the tree contains two predictors, a plot is made of the space covered by those two predictors and the partition made by the tree is superimposed.

Value

None.

Author(s)

B. D. Ripley

See Also

tree

Examples

ir.tr <- tree(Species ~., iris)
ir.tr
ir.tr1 <- snip.tree(ir.tr, nodes = c(12, 7))
summary(ir.tr1)
par(pty = "s")
plot(iris[, 3], iris[, 4], type = "n", xlab = "petal length", ylab = "petal width")
text(iris[, 3], iris[, 4], c("s", "c", "v")[iris[, 5]])
partition.tree(ir.tr1, add = TRUE, cex = 1.5)

# 1D example
ir.tr <- tree(Petal.Width ~ Petal.Length, iris)
plot(iris[, 3], iris[, 4], type = "n", xlab = "Length", ylab = "Width")
partition.tree(ir.tr, add = TRUE, cex = 1.5)

plot.tree               Plot a Tree Object

Description

Plot a tree object on the current graphical device.

Usage

## S3 method for class 'tree'
plot(x, y = NULL, type = c("proportional", "uniform"), ...)

Arguments

x        an object of class "tree".
y        ignored. Used for positional matching of type.
type     character string. If this partially matches "uniform", the branches are of uniform length. Otherwise they are proportional to the decrease in impurity.
...      graphical parameters.

Value

An (invisible) list with components x and y giving the coordinates of the tree nodes. As a side effect, the value of type == "uniform" is stored in the variable .Tree.unif.? in the global environment, where ? is the device number.

Author(s)

B. D. Ripley

See Also

tree
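The two type settings can be compared side by side; a brief sketch:

```r
library(tree)
ir.tr <- tree(Species ~ ., iris)
op <- par(mfrow = c(1, 2))
plot(ir.tr)                    # default: branch lengths proportional
text(ir.tr)                    #   to the decrease in impurity
plot(ir.tr, type = "uniform")  # branches of uniform length
text(ir.tr)
par(op)
```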

plot.tree.sequence      Plot a Tree Sequence

Description

Allows the user to plot a tree sequence.

Usage

## S3 method for class 'tree.sequence'
plot(x, ..., type = "l", ylim = range(x$dev),
     order = c("increasing", "decreasing"))

Arguments

x      object of class tree.sequence. This is assumed to be the result of some function that produces an object with the same named components (size, deviance, k) as that returned by prune.tree.
order  The ordering of size on the plot. Use "decreasing" for the natural ordering of k and the amount of pruning. Only the first character is needed.
type, ylim, ...  graphical parameters.

Details

This function is a method for the generic function plot() for class tree.sequence. It can be invoked by calling plot(x) for an object x of the appropriate class, or directly by calling plot.tree.sequence(x) regardless of the class of the object.

Side Effects

Plots deviance or number of misclassifications (or total loss) versus size for a sequence of trees.

Examples

data(cpus, package="MASS")
cpus.ltr <- tree(log(perf) ~ syct + mmin + mmax + cach + chmin + chmax, data = cpus)
plot(prune.tree(cpus.ltr))
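A short sketch of the order argument, reusing the cpus example from this manual:

```r
library(tree)
data(cpus, package = "MASS")
cpus.ltr <- tree(log(perf) ~ syct + mmin + mmax + cach + chmin + chmax,
                 data = cpus)
# "decreasing" plots size in the natural ordering of k
plot(prune.tree(cpus.ltr), order = "decreasing")
```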

predict.tree            Predictions from a Fitted Tree Object

Description

Returns a vector of predicted responses from a fitted tree object.

Usage

## S3 method for class 'tree'
predict(object, newdata = list(),
        type = c("vector", "tree", "class", "where"),
        split = FALSE, nwts, eps = 1e-3, ...)

Arguments

object   fitted model object of class tree. This is assumed to be the result of some function that produces an object with the same named components as that returned by the tree function.
newdata  data frame containing the values at which predictions are required. The predictors referred to in the right side of formula(object) must be present by name in newdata. If missing, fitted values are returned.
type     character string denoting whether the predictions are returned as a vector (default) or as a tree object.
split    governs the handling of missing values. If false, cases with missing values are dropped down the tree until a leaf is reached or a node for which the attribute is missing, and that node is used for prediction. If split = TRUE, cases with missing attributes are split into fractional cases and dropped down each side of the split. The predicted values are averaged over the fractions to give the prediction.
nwts     weights for the newdata cases, used when predicting a tree.
eps      a lower bound for the probabilities, used if events of predicted probability zero occur in newdata when predicting a tree.
...      further arguments passed to or from other methods.

Details

This function is a method for the generic function predict() for class tree. It can be invoked by calling predict(x) for an object x of the appropriate class, or directly by calling predict.tree(x) regardless of the class of the object.
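The type values can be sketched on a tree grown from iris; predictions here are on the training data, so predict() is called without newdata and returns fitted values.

```r
library(tree)
ir.tr <- tree(Species ~ ., iris)
p.prob  <- predict(ir.tr)                  # matrix of class probabilities
p.class <- predict(ir.tr, type = "class")  # factor of predicted classes
p.leaf  <- predict(ir.tr, type = "where")  # row of frame reached by each case
table(p.class, iris$Species)               # training-set confusion matrix
```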

Value

If type = "vector": vector of predicted responses or, if the response is a factor, matrix of predicted class probabilities.

If type = "tree": an object of class "tree" is returned with new values for frame$n and frame$dev. This new object is obtained by dropping newdata down object. For factor predictors, if an observation contains a level not used to grow the tree, it is left at the deepest possible node and frame$yval or frame$yprob at that node is the prediction. If newdata does not contain a column for the response in the formula, the value of frame$dev will be NA, and if some values in the response are missing, some of the deviances will be NA.

If type = "class": for a classification tree, a factor of the predicted classes (that with highest posterior probability, with ties split randomly).

If type = "where": the nodes the cases reach.

References

Ripley, B. D. (1996) Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge. Chapter 7.

See Also

predict, tree

Examples

data(shuttle, package="MASS")
shuttle.tr <- tree(use ~ ., shuttle, subset = 1:253, mindev = 1e-6, minsize = 2)
shuttle.tr
shuttle1 <- shuttle[254:256, ]  # 3 missing cases
predict(shuttle.tr, shuttle1)

prune.tree              Cost-complexity Pruning of Tree Object

Description

Determines a nested sequence of subtrees of the supplied tree by recursively "snipping" off the least important splits.

Usage

prune.tree(tree, k = NULL, best = NULL, newdata, nwts,
           method = c("deviance", "misclass"), loss, eps = 1e-3)
prune.misclass(tree, k = NULL, best = NULL, newdata, nwts,
               loss, eps = 1e-3)

Arguments

tree     fitted model object of class tree. This is assumed to be the result of some function that produces an object with the same named components as that returned by the tree() function.
k        cost-complexity parameter defining either a specific subtree of tree (k a scalar) or the (optional) sequence of subtrees minimizing the cost-complexity measure (k a vector). If missing, k is determined algorithmically.
best     integer requesting the size (i.e. number of terminal nodes) of a specific subtree in the cost-complexity sequence to be returned. This is an alternative way to select a subtree than by supplying a scalar cost-complexity parameter k. If there is no tree in the sequence of the requested size, the next largest is returned.
newdata  data frame upon which the sequence of cost-complexity subtrees is evaluated. If missing, the data used to grow the tree are used. The response as well as the predictors referred to in the right side of the formula in tree must be present by name in newdata.
nwts     weights for the newdata cases.
method   character string denoting the measure of node heterogeneity used to guide cost-complexity pruning. For regression trees, only the default, deviance, is accepted. For classification trees, the default is deviance and the alternative is misclass (number of misclassifications or total loss).
loss     a matrix giving for each true class (row) the numeric loss of predicting the class (column). The classes should be in the order of the levels of the response. It is conventional for a loss matrix to have a zero diagonal. The default is 0–1 loss.
eps      a lower bound for the probabilities, used to compute deviances if events of predicted probability zero occur in newdata.

Details

Determines a nested sequence of subtrees of the supplied tree by recursively "snipping" off the least important splits, based upon the cost-complexity measure. prune.misclass is an abbreviation for prune.tree(method = "misclass") for use with cv.tree.

If k is supplied, the optimal subtree for that value is returned.

The function cv.tree() routinely uses the newdata argument in cross-validating the pruning procedure. These data are dropped down each tree in the cost-complexity sequence and deviances or losses calculated by comparing the supplied response to the prediction.

Value

If k is supplied and is a scalar, a tree object is returned that minimizes the cost-complexity measure for that k. If best is supplied, a tree object of size best is returned. Otherwise, an object of class tree.sequence is returned. The object contains the following components:

size      number of terminal nodes in each tree in the cost-complexity pruning sequence.
deviance  total deviance of each tree in the cost-complexity pruning sequence.
k         the value of the cost-complexity pruning parameter of each tree in the sequence.

A plot method exists for objects of this class. It displays the value of the deviance, the number of misclassifications or the total loss for each subtree in the cost-complexity sequence. An additional axis displays the values of the cost-complexity parameter at each subtree.
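A short sketch of selecting a subtree by its size via best, rather than by k; it assumes the MASS package's fgl data set.

```r
library(tree)
data(fgl, package = "MASS")
fgl.tr <- tree(type ~ ., fgl)
fgl.seq <- prune.misclass(fgl.tr)            # whole cost-complexity sequence
fgl.seq$size                                 # subtree sizes available
fgl.pr <- prune.misclass(fgl.tr, best = 5)   # the subtree with 5 leaves
summary(fgl.pr)
```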

this makes sense only if the tree has already been plotted.save digits Value A tree object containing the nodes that remain after specified or selected subtrees have been snipped off.cv <.cv$dev <.cv) 11 snip. Clicking any other button terminates the selection process.tree(tree. digits = getOption("digits") . the x and y coordinates selected interactively are saved in the object . nodes.tree(type ~ . An integer vector giving those nodes that are the roots of sub-trees to be snipped off. the user is invited to select nodes interactively. its number and the deviance of the current tree and that which would remain if that node were removed are printed. prune. fgl) plot(print(fgl. Usage snip.cv$dev <. Precision used in printing statistics for selected nodes. prune. D.tree(fgl. .tr. If nodes is not supplied. package="MASS") fgl.tree) for(i in 2:5) fgl.3) Arguments tree nodes xy.snip. Ripley An object of class "tree".tree Snip Parts of Tree Objects Description snip.tree)$dev fgl. If nodes is supplied..cv$dev/5 plot(fgl. xy.tree has two related functions. A node is selected by clicking with the left mouse button. If true.tree(fgl..tr <.tr. the user is invited to select a node at which to snip.fgl. it removes those nodes and all their descendants from the tree..cv$dev + cv.save = FALSE.fgl. Selecting the same node again causes it to be removed (and the lines of its sub-tree erased). If missing.cv.tr)) fgl.tree Examples data(fgl. Author(s) B.xy in the global environment.

. Details If pretty = then the level names of a factor split attributes are used unchanged. If pretty = NULL. the levels are presented by a. significant digits for numerical labels. label = "yval". See Details. .. . Usage ## S3 method for class ’tree’ text(x. xpd. Can be NULL to suppress node-labelling logical. graphical parameters such as cex and font. Ripley an object of class "tree" logical. splits = TRUE. If pretty is a positive integer. If TRUE the splits are labelled The name of column in the frame component of x. .12 See Also tree. to be used to label the nodes.tree text. abbreviate is applied to the labels with that value for its argument minlength. digits = getOption("digits") . .tree.3. . By default. b.) Arguments x splits label all pretty digits adj. If the lettering is vertical (par srt = 9 ) and adj is not supplied it is adjusted appropriately.. text. Value None... D. Author(s) B. xpd = TRUE. adj = par("adj"). only the leaves are labelled. but if true interior nodes are also labelled.tree Annotate a Tree Plot Description Add text to a tree plot. all = FALSE. . the manipulation used for split labels involving attributes. prune. pretty = NULL.

logical flag for drawing of axes for the barcharts. Author(s) B.tr) 13 tile. The principal effect is the plot. a factor variable to be displayed: by default it is the response factor of the tree.tr) text(ir. var. D. Ripley See Also tree. and plots barcharts of the set of frequencies underneath each leaf. .tile.tr <. Usage tile.tree Add Class Barcharts to a Classification Tree Plot Description This computes the frequencies of level of var for cases reaching each leaf of the tree.tree Examples ir.tree See Also plot. screen.tree(tree.tree(Species ~.arg axes Value A matrix of counts of categories (rows) for each leaf (columns). The screen to be used: default the next after the currently active screen. iris) plot(ir.arg = ascr + 1. axes = TRUE) Arguments tree var screen..screens fitted object of class "tree".

tr.) Arguments formula A formula expression.tr. A list as returned by tree. The left-hand-side (response) should be either a numerical vector when a regression tree will be fitted or a factor.).control. 31. control = tree.tree(fgl. The right-hand-side should be a series of numeric or factor variables separated by +. when a classification tree is produced.tr1 <. character string giving the method to use. x = FALSE. Vector of non-negative observational weights.screens() plot(fgl. fgl$type) close.. package="MASS") fgl. The default is na. subset. Both .screen(all = TRUE) tree tree Fit a Classification or Regression Tree Description A tree is grown by binary recursive partitioning using the response in the specified formula and choosing splits from the terms of the right-hand-side. The only other useful value is "model.are allowed: regression trees can have offset terms.5) fgl.action control method split .control(nobs.tr) plot(fgl. 26)) tree.. An expression specifying the subset of cases to be used. all=TRUE.action = na. Splitting criterion to use..tr1) text(fgl. weights and subset.tr). weights.partition". . Usage tree(formula. method = "recursive.14 Examples data(fgl.snip. model = FALSE. A function to filter missing data from the model frame.tree(fgl. node=c(1 8. wts = TRUE. A data frame in which to preferentially interpret formula. cex= . . split = c("deviance"..frame".pass. data. fractional weights are allowed. and .pass (to do nothing) as tree handles missing values (by dropping them down the tree as far as possible).tree(type ~ .tr <. data weights subset na. fgl) summary(fgl. na. there should be no interaction terms.. y = TRUE.tr1) tile. "gini").tr1. text(fgl.

and row. If model = TRUE. If y = TRUE. Splitting continues until the terminal nodes are too small or too few to be split. A tree with no splits is of class "singlenode" which inherits from class "tree". the model matrix. and model is used to define the model. Numeric variables are divided into X < a and X > a.. the data set split and the process repeated. This limit is imposed for ease of labelling. dev the deviance of the node. a matrix of fitted probabilities for each response level. the fitted value at the node (the mean for regression trees. If true. If true. minsize or mindev. then the formula and data arguments are ignored. Details A tree is grown by binary recursive partitioning using the response in the specified formula and choosing splits from the terms of the right-hand-side.. Factor predictor variables can have up to 32 levels.tree model 15 If this argument is itself a model frame. the (weighted) number of cases reaching that node. a majority class for classification trees) and split. x y wts .names giving the node numbers. the model frame. logical. the model frame is stored as component model in the result. logical. the practical limit is much less. . for classification trees. An integer vector giving the row number of the frame detailing the node to which each case is assigned. Normally used for mincut. Additional arguments that are passed to tree. ylevels. If the argument is logical and true. the matrix of variables for each case is returned. If x = TRUE. a two-column matrix of the labels for the left and right splits at the node. If true. the response. The split which maximizes the reduction in impurity is chosen. Value The value is an object of class "tree" which has components frame A data frame with a row for each node. n. The terms of the formula. The columns include var. If wts = TRUE.control. Tree growth is limited to a depth of 31 by the use of integers to label nodes. the weights are returned. the weights. the response variable is returned. 
but since their use in a classification tree with three or more levels in a response involves a search over 2(k−1) − 1 groupings for k levels. yval. the levels of an unordered factor are divided into two non-empty groups. the variable used at the split (or "<leaf>" for a terminal node). logical. Classification trees also have yprob. The matched call to Tree. where terms call model x y wts and attributes xlevels and.
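The components listed above can be examined directly; a minimal sketch (the values printed depend on the fit and are not reproduced here):

```r
library(tree)
ir.tr <- tree(Species ~ ., iris)
ir.tr$frame[1, c("var", "n", "dev", "yval")]  # the root node
ir.tr$frame$splits[1, ]   # left/right split labels at the root
table(ir.tr$where)        # cases assigned to each leaf row of frame
```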

Author(s)

B. D. Ripley

References

Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J. (1984) Classification and Regression Trees. Wadsworth.

Ripley, B. D. (1996) Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge. Chapter 7.

See Also

tree.control, prune.tree, predict.tree, snip.tree

Examples

data(cpus, package="MASS")
cpus.ltr <- tree(log10(perf) ~ syct+mmin+mmax+cach+chmin+chmax, cpus)
cpus.ltr
summary(cpus.ltr)
plot(cpus.ltr); text(cpus.ltr)
ir.tr <- tree(Species ~., iris)
ir.tr
summary(ir.tr)

tree.control            Select Parameters for Tree

Description

A utility function for use with the control argument of tree.

Usage

tree.control(nobs, mincut = 5, minsize = 10, mindev = 0.01)

Arguments

nobs     The number of observations in the training set.
mincut   The minimum number of observations to include in either child node. This is a weighted quantity; the observational weights are used to compute the 'number'. The default is 5.
minsize  The smallest allowed node size: a weighted quantity. The default is 10.
mindev   The within-node deviance must be at least this times that of the root node for the node to be split.

Details

This function produces default values of mincut and minsize, and ensures that mincut is at most half minsize.

To produce a tree that fits the data perfectly, set mindev = 0 and minsize = 2, if the limit on tree depth allows such a tree.

Value

A list:

mincut   The maximum of the input or default mincut and 1.
minsize  The maximum of the input or default minsize and 2.
nmax     An estimate of the maximum number of nodes that might be grown.
nobs     The input nobs.

Note

The interpretation of mindev given here is that of Chambers and Hastie (1992, p. 415), and apparently not what is actually implemented in S. It seems S uses an absolute bound.

Author(s)

B. D. Ripley

See Also

tree

tree.screens            Split Screen for Plotting Trees

Description

Splits the screen in a way suitable for using tile.tree.

Usage

tree.screens(figs, screen.arg = 0, ...)

Arguments

figs        A specification of the split of the screen, by default the whole display area. See split.screen for the allowed forms.
screen.arg  the screen to divide.
...         plot parameters to be passed to par.

Value

A vector of screen numbers for the newly-created screens.

Author(s)

B. D. Ripley

See Also

tile.tree, split.screen

Examples

data(fgl, package="MASS")
fgl.tr <- tree(type ~ ., fgl)
summary(fgl.tr)
plot(fgl.tr); text(fgl.tr, all = TRUE, cex = 0.5)
fgl.tr1 <- snip.tree(fgl.tr, node = c(108, 31, 26))
tree.screens()
plot(fgl.tr1)
text(fgl.tr1)
tile.tree(fgl.tr1, fgl$type)
close.screen(all = TRUE)

Index

∗Topic hplot
    partition.tree, plot.tree, plot.tree.sequence, text.tree, tile.tree, tree.screens
∗Topic tree
    cv.tree, deviance.tree, misclass.tree, na.tree.replace, partition.tree, plot.tree, plot.tree.sequence, predict.tree, prune.tree, snip.tree, text.tree, tile.tree, tree, tree.control, tree.screens

abbreviate; cv.tree; deviance.singlenode (deviance.tree); deviance.tree; misclass.tree; na.omit; na.tree.replace; partition.tree; plot.tree; plot.tree.sequence; predict.tree; print.summary.tree (tree); print.tree (tree); prune.misclass (prune.tree); prune.tree; residuals.tree (tree); snip.tree; split.screen; summary.tree (tree); text.tree; tile.tree; tree; tree.control; tree.screens