Factors with forcats :: CHEAT SHEET

The forcats package provides tools for working with factors, which are R's data structure for categorical data.
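
A minimal base R sketch of what that data structure looks like in practice: a factor stores integer codes plus a levels attribute. No packages are assumed here. The vector f is the example used elsewhere in this sheet; the names codes and relabeled are illustrative only.

```r
# Base R only: build a factor and look at its two parts.
f <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))

lv    <- levels(f)       # the level labels: "a" "b" "c"
codes <- as.integer(f)   # the integer codes stored underneath: 1 3 2 1

# unclass() strips the factor class and shows codes plus the levels attribute.
str(unclass(f))

# Assigning to levels() renames the labels; the codes stay the same.
relabeled <- f
levels(relabeled) <- c("x", "y", "z")
as.character(relabeled)  # "x" "z" "y" "x"
```

Working on a copy (relabeled) keeps f unchanged for later examples.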

Factors

R represents categorical data with factors. A factor is an integer vector with a levels attribute that stores a set of mappings between integers and categorical values. When you view a factor, R displays not the integers, but the levels associated with them.

Create a factor with factor()

factor(x = character(), levels, labels = levels, exclude = NA, ordered = is.ordered(x), nmax = NA) Convert a vector to a factor. Also as_factor().
    f <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))

Return its levels with levels()

levels(x) Return/set the levels of a factor.
    levels(f); levels(f) <- c("x","y","z")

Use unclass() to see its structure

Inspect Factors

fct_count(f, sort = FALSE, prop = FALSE) Count the number of values with each level.
    fct_count(f)

fct_match(f, lvls) Check for lvls in f.
    fct_match(f, "a")

fct_unique(f) Return the unique values, removing duplicates.
    fct_unique(f)

Change the value of levels

fct_recode(.f, ...) Manually change levels. Also fct_relabel(), which obeys purrr::map syntax to apply a function or expression to each level.
    fct_recode(f, v = "a", x = "b", z = "c")
    fct_relabel(f, ~ paste0("x", .x))

fct_anon(f, prefix = "") Anonymize levels with random integers.
    fct_anon(f)

fct_collapse(.f, ..., other_level = NULL) Collapse levels into manually defined groups.
    fct_collapse(f, x = c("a", "b"))

fct_lump_min(f, min, w = NULL, other_level = "Other") Lump together levels that appear fewer than min times. Also fct_lump_n(), fct_lump_prop(), and fct_lump_lowfreq().
    fct_lump_min(f, min = 2)

fct_other(f, keep, drop, other_level = "Other") Replace levels with "Other".
    fct_other(f, keep = c("a", "b"))

Change the order of levels

fct_relevel(.f, ..., after = 0L) Manually reorder factor levels.
    fct_relevel(f, c("b", "c", "a"))

fct_infreq(f, ordered = NA) Reorder levels by the frequency in which they appear in the data (highest frequency first). Also fct_inseq().
    f3 <- factor(c("c", "c", "a"))
    fct_infreq(f3)

fct_inorder(f, ordered = NA) Reorder levels by the order in which they appear in the data.
    fct_inorder(f2)   # f2 is defined under Combine Factors

fct_rev(f) Reverse level order.
    f4 <- factor(c("a","b","c"))
    fct_rev(f4)

fct_shift(f, n = 1L) Shift levels to left or right, wrapping around end.
    fct_shift(f4)

fct_shuffle(f) Randomly permute order of factor levels.
    fct_shuffle(f4)
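
The level-reordering helpers (fct_relevel(), fct_infreq(), fct_inorder(), fct_rev(), fct_shift()) can be compared side by side on small vectors. This is a short sketch, assuming the forcats package is installed. The result names (relevelled, by_freq, and so on) are illustrative and not part of the package.

```r
library(forcats)

f  <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))
f3 <- factor(c("c", "c", "a"))

relevelled <- fct_relevel(f, c("b", "c", "a"))  # order given by hand
by_freq    <- fct_infreq(f3)                    # most frequent level first
by_order   <- fct_inorder(f)                    # order of first appearance
reversed   <- fct_rev(f)                        # reverse the current order
shifted    <- fct_shift(f)                      # n = 1L shifts levels left, wrapping

levels(relevelled)  # "b" "c" "a"
levels(by_freq)     # "c" "a"
levels(by_order)    # "a" "c" "b"
levels(reversed)    # "c" "b" "a"
levels(shifted)     # "b" "c" "a"

# Reordering changes only the level order, never the values themselves.
identical(as.character(relevelled), as.character(f))  # TRUE
```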

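The helpers that change level values (fct_recode(), fct_relabel(), fct_collapse(), fct_lump_min(), fct_other()) alter the labels and, where levels are merged, the values too. This is a brief sketch under the assumption that forcats is installed. The result names are illustrative only.

```r
library(forcats)

f <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))

recoded   <- fct_recode(f, v = "a", x = "b", z = "c")  # new_name = "old_name"
relabeled <- fct_relabel(f, ~ paste0("x", .x))         # function applied to each level
collapsed <- fct_collapse(f, x = c("a", "b"))          # merge "a" and "b" into "x"
lumped    <- fct_lump_min(f, min = 2)                  # levels seen < 2 times become "Other"
others    <- fct_other(f, keep = c("a", "b"))          # everything not kept becomes "Other"

as.character(recoded)    # "v" "z" "x" "v"
levels(relabeled)        # "xa" "xb" "xc"
as.character(collapsed)  # "x" "c" "x" "x"
as.character(lumped)     # "a" "Other" "Other" "a"
levels(others)           # "a" "b" "Other"
```
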
Change the order of levels (continued)

fct_reorder(.f, .x, .fun = median, ..., .desc = FALSE) Reorder levels by their relationship with another variable.
    boxplot(data = PlantGrowth, weight ~ fct_reorder(group, weight))

fct_reorder2(.f, .x, .y, .fun = last2, ..., .desc = TRUE) Reorder levels by their final values when plotted with two other variables.
    ggplot(diamonds, aes(carat, price, color = fct_reorder2(color, carat, price))) + geom_smooth()

Combine Factors

fct_c(...) Combine factors with different levels. Also fct_cross().
    f1 <- factor(c("a", "c"))
    f2 <- factor(c("b", "a"))
    fct_c(f1, f2)

fct_unify(fs, levels = lvls_union(fs)) Standardize levels across a list of factors.
    fct_unify(list(f2, f1))

Add or drop levels

fct_drop(f, only) Drop unused levels.
    f5 <- factor(c("a","b"), c("a","b","x"))
    f6 <- fct_drop(f5)

fct_expand(f, ...) Add levels to a factor.
    fct_expand(f6, "x")

fct_explicit_na(f, na_level = "(Missing)") Assign a level to NAs to ensure they appear in plots, etc.
    fct_explicit_na(factor(c("a", "b", NA)))
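
Combining factors and adding or dropping levels can be checked with the sheet's own example vectors (f1, f2, f5, f6). This is a small sketch, assuming forcats is installed. The names combined, unified, and expanded are illustrative.

```r
library(forcats)

f1 <- factor(c("a", "c"))
f2 <- factor(c("b", "a"))

# fct_c() concatenates the values and takes the union of the levels.
combined <- fct_c(f1, f2)
as.character(combined)  # "a" "c" "b" "a"
levels(combined)        # "a" "c" "b"

# fct_unify() gives every factor in the list the same set of levels.
unified <- fct_unify(list(f2, f1))
levels(unified[[1]])    # "a" "b" "c"
levels(unified[[2]])    # "a" "b" "c"

# Dropping and re-adding an unused level.
f5 <- factor(c("a", "b"), c("a", "b", "x"))
f6 <- fct_drop(f5)
levels(f6)              # "a" "b"
expanded <- fct_expand(f6, "x")
levels(expanded)        # "a" "b" "x"
```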

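fct_reorder() and fct_reorder2() are easier to follow on a tiny table than on a plot. The sketch below assumes forcats is installed and uses a made-up data frame d (hypothetical values, chosen only so the expected orders are easy to verify by hand).

```r
library(forcats)

# Hypothetical data: three groups, each observed at x = 1 and x = 2.
d <- data.frame(
  group = factor(c("a", "a", "b", "b", "c", "c")),
  x     = c(1, 2, 1, 2, 1, 2),
  value = c(5, 7, 1, 3, 9, 11),
  y     = c(5, 1, 1, 9, 3, 4)
)

# Median value per group: a = 6, b = 2, c = 10, so ascending order is b, a, c.
by_median      <- fct_reorder(d$group, d$value)
by_median_desc <- fct_reorder(d$group, d$value, .desc = TRUE)
levels(by_median)       # "b" "a" "c"
levels(by_median_desc)  # "c" "a" "b"

# fct_reorder2() with the default last2 uses y at the largest x per group:
# a = 1, b = 9, c = 4, and .desc = TRUE by default, giving b, c, a.
by_last <- fct_reorder2(d$group, d$x, d$y)
levels(by_last)         # "b" "c" "a"
```

Ordering by the last value is useful for line plots, where it makes the legend order match the order of the line ends.
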
RStudio® is a trademark of RStudio, PBC • CC BY SA RStudio • info@rstudio.com • 844-448-1212 • rstudio.com • Learn more at forcats.tidyverse.org • Diagrams inspired by @LVaudor on Twitter • forcats 0.5.1 • Updated: 2021-07
