Factors with forcats :: CHEAT SHEET

The forcats package provides tools for working with factors, which are R's data structure for categorical data.
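As a quick orientation, here is a minimal sketch of what that data structure looks like in practice. It assumes forcats is installed and reuses the small example vector from this sheet; the variable names `codes`, `counts`, `is_a`, and `uniq` are only illustrative.

```r
library(forcats)

# Example factor from the sheet: four values, three declared levels.
f <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))

codes <- unclass(f)          # integer codes carrying a "levels" attribute
as.integer(f)                # 1 3 2 1
levels(f)                    # "a" "b" "c"

counts <- fct_count(f)       # tibble with columns f (level) and n (count)
is_a   <- fct_match(f, "a")  # TRUE FALSE FALSE TRUE
uniq   <- fct_unique(f)      # one value per level: a b c
```

The integer codes index into the levels vector, which is why printing `f` shows letters while `as.integer(f)` shows positions.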

Factors

R represents categorical data with factors. A factor is an integer vector with a levels attribute that stores a set of mappings between integers and categorical values. When you view a factor, R displays not the integers, but the levels associated with them.

Create a factor with factor()

factor(x = character(), levels, labels = levels, exclude = NA, ordered = is.ordered(x), nmax = NA)
Convert a vector to a factor. Also as_factor().
    f <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))

Return its levels with levels()

levels(x)
Return/set the levels of a factor.
    levels(f); levels(f) <- c("x","y","z")

Use unclass() to see its structure

Inspect Factors

fct_count(f, sort = FALSE, prop = FALSE)
Count the number of values with each level.
    fct_count(f)

fct_match(f, lvls)
Check for lvls in f.
    fct_match(f, "a")

fct_unique(f)
Return the unique values, removing duplicates.
    fct_unique(f)

Change the order of levels

fct_relevel(.f, ..., after = 0L)
Manually reorder factor levels.
    fct_relevel(f, c("b", "c", "a"))

fct_infreq(f, ordered = NA)
Reorder levels by the frequency in which they appear in the data (highest frequency first). Also fct_inseq().
    f3 <- factor(c("c", "c", "a"))
    fct_infreq(f3)

fct_inorder(f, ordered = NA)
Reorder levels by order in which they appear in the data.
    f2 <- factor(c("b", "a"))
    fct_inorder(f2)

fct_rev(f)
Reverse level order.
    f4 <- factor(c("a","b","c"))
    fct_rev(f4)

fct_shift(f)
Shift levels to left or right, wrapping around end.
    fct_shift(f4)

fct_shuffle(f, n = 1L)
Randomly permute order of factor levels.
    fct_shuffle(f4)

Change the value of levels

fct_recode(.f, ...)
Manually change levels. Also fct_relabel(), which obeys purrr::map syntax to apply a function or expression to each level.
    fct_recode(f, v = "a", x = "b", z = "c")
    fct_relabel(f, ~ paste0("x", .x))

fct_anon(f, prefix = "")
Anonymize levels with random integers.
    fct_anon(f)

fct_collapse(.f, ..., other_level = NULL)
Collapse levels into manually defined groups.
    fct_collapse(f, x = c("a", "b"))

fct_lump_min(f, min, w = NULL, other_level = "Other")
Lump together levels that appear fewer than min times. Also fct_lump_n(), fct_lump_prop(), and fct_lump_lowfreq().
    fct_lump_min(f, min = 2)

fct_other(f, keep, drop, other_level = "Other")
Replace levels with "Other".
    fct_other(f, keep = c("a", "b"))
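The functions that change level values can be tried on the sheet's example factor. This is a sketch rather than a reference: it assumes forcats is installed, and the result names (`recoded`, `relabeled`, `collapsed`, `lumped`, `othered`) are made up for illustration.

```r
library(forcats)

f <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))

# New names go on the left, existing levels on the right.
recoded <- fct_recode(f, v = "a", x = "b", z = "c")
levels(recoded)     # "v" "x" "z"

# Apply a function (here a formula shorthand) to every level.
relabeled <- fct_relabel(f, ~ paste0("x", .x))
levels(relabeled)   # "xa" "xb" "xc"

# Merge "a" and "b" into a single level named "x".
collapsed <- fct_collapse(f, x = c("a", "b"))
levels(collapsed)   # "x" "c"

# "b" and "c" each appear once, fewer than min = 2, so both become "Other".
lumped <- fct_lump_min(f, min = 2)
levels(lumped)      # "a" "Other"

# Keep "a" and "b"; every other level becomes "Other".
othered <- fct_other(f, keep = c("a", "b"))
levels(othered)     # "a" "b" "Other"
```

In each case the underlying values are unchanged in number and position; only the labels attached to them differ.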

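The reordering functions (fct_relevel(), fct_infreq(), fct_inorder(), fct_rev(), fct_shift()) change only the order of the levels, not the data values. A minimal sketch, assuming forcats is installed; the example factors follow the sheet's own `f`, `f2`, `f3`, and `f4`.

```r
library(forcats)

f  <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))
f2 <- factor(c("b", "a"))        # levels sort alphabetically: "a" "b"
f3 <- factor(c("c", "c", "a"))   # levels: "a" "c"
f4 <- factor(c("a", "b", "c"))

levels(fct_relevel(f, c("b", "c", "a")))  # "b" "c" "a"
levels(fct_relevel(f, "c"))               # "c" "a" "b" (moved to the front)
levels(fct_infreq(f3))                    # "c" "a" ("c" is most frequent)
levels(fct_inorder(f2))                   # "b" "a" (order of first appearance)
levels(fct_rev(f4))                       # "c" "b" "a"
levels(fct_shift(f4))                     # "b" "c" "a" (shift left by one)

# The values themselves are untouched by reordering.
as.character(fct_rev(f4))                 # "a" "b" "c"
```

Level order matters mostly for display: it controls the order of axes, legends, and table rows.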
Combine Factors

fct_c(...)
Combine factors with different levels. Also fct_cross().
    f1 <- factor(c("a", "c"))
    f2 <- factor(c("b", "a"))
    fct_c(f1, f2)

fct_unify(fs, levels = lvls_union(fs))
Standardize levels across a list of factors.
    fct_unify(list(f2, f1))

Change the order of levels (continued)

fct_reorder(.f, .x, .fun = median, ..., .desc = FALSE)
Reorder levels by their relationship with another variable.
    boxplot(data = PlantGrowth, weight ~ reorder(group, weight))

fct_reorder2(.f, .x, .y, .fun = last2, ..., .desc = TRUE)
Reorder levels by their final values when plotted with two other variables.
    ggplot(diamonds, aes(carat, price, color = fct_reorder2(color, carat, price))) + geom_smooth()

Add or drop levels

fct_drop(f, only)
Drop unused levels.
    f5 <- factor(c("a","b"), c("a","b","x"))
    f6 <- fct_drop(f5)

fct_expand(f, ...)
Add levels to a factor.
    fct_expand(f6, "x")

fct_explicit_na(f, na_level = "(Missing)")
Assigns a level to NAs to ensure they appear in plots, etc.
    fct_explicit_na(factor(c("a", "b", NA)))
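The add/drop functions can be checked by looking at levels() before and after each call. A small sketch, assuming forcats is installed; `with_na` is an illustrative name. Note that fct_explicit_na() is the name used in forcats 0.5.1, the version this sheet covers; forcats 1.0.0 deprecates it in favor of fct_na_value_to_level(), so a newer installation is expected to emit a deprecation warning here.

```r
library(forcats)

# "x" is declared as a level but never occurs in the data.
f5 <- factor(c("a", "b"), c("a", "b", "x"))
levels(f5)                    # "a" "b" "x"

f6 <- fct_drop(f5)            # remove levels with no observations
levels(f6)                    # "a" "b"

levels(fct_expand(f6, "x"))   # "a" "b" "x" (level added back, still unused)

# Turn NA values into a real level so they show up in counts and plots.
# forcats 0.5.1 name; newer versions warn and suggest fct_na_value_to_level().
with_na <- fct_explicit_na(factor(c("a", "b", NA)))
levels(with_na)               # "a" "b" "(Missing)"
```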

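To see what fct_reorder() and fct_reorder2() do without plotting, it can help to use a tiny hand-made data set where the summaries are easy to compute by eye. The vectors `group`, `value`, and `x` below are invented for illustration and are not part of the sheet; forcats is assumed to be installed.

```r
library(forcats)

group <- factor(c("a", "a", "b", "b", "c", "c"))
value <- c(5, 7, 1, 3, 2, 4)   # medians per group: a = 6, b = 2, c = 3
x     <- c(1, 2, 1, 2, 1, 2)   # a second variable, e.g. a time index

# Default: order levels by the median of `value`, ascending.
levels(fct_reorder(group, value))                 # "b" "c" "a"

# Same summary, descending.
levels(fct_reorder(group, value, .desc = TRUE))   # "a" "c" "b"

# fct_reorder2() with the default last2(): order by the `value` found at the
# largest `x` in each group (a = 7, b = 3, c = 4), descending by default.
levels(fct_reorder2(group, x, value))             # "a" "c" "b"
```

The descending default of fct_reorder2() is meant to line a legend up with the right-hand ends of the lines in a plot.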
CC BY SA Posit Software, PBC • info@posit.co • posit.co • Learn more at forcats.tidyverse.org • Diagrams inspired by @LVaudor on Twitter • forcats 0.5.1 • Updated: 2021-07
