library(tidyverse)
Renaming multiple variables with an external lookup table using tidy evaluation
Problem
Suppose a data frame has some columns that are already named correctly, while others need to be renamed. A lookup table with the new names for these columns is available.
Here is the data frame with three variables: var1, var2 and var4.
test_tib <- tribble(
  ~var1, ~var2, ~var4,
  "x", "a", 1L,
  "y", "b", 2L,
  "z", "c", 3L
)
test_tib
# A tibble: 3 × 3
var1 var2 var4
<chr> <chr> <int>
1 x a 1
2 y b 2
3 z c 3
Define the lookup table with the new names, then transform it into a named vector using deframe(). Note that deframe() takes the names from the first column of its input and the values from the second, so the first column must hold the new names and the second the current names.
new_names <- tribble(
  ~names_var, ~new_names_var,
  "var1", "Variable 1",
  "var2", "Variable 2",
  "var3", "Variable 3",
  "var4", "Variable 4"
)
new_names
# A tibble: 4 × 2
names_var new_names_var
<chr> <chr>
1 var1 Variable 1
2 var2 Variable 2
3 var3 Variable 3
4 var4 Variable 4
new_names_vec <- deframe(select(new_names, new_names_var, names_var))
new_names_vec
Variable 1 Variable 2 Variable 3 Variable 4
"var1" "var2" "var3" "var4"
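As a cross-check of the column order, the same named vector can be sketched in base R with setNames(). The object name new_names_vec_base is hypothetical and only used here for comparison; the lookup table is repeated so the snippet runs on its own.

```r
library(tibble)

# Lookup table as defined above: current names first, new names second
new_names <- tribble(
  ~names_var, ~new_names_var,
  "var1", "Variable 1",
  "var2", "Variable 2",
  "var3", "Variable 3",
  "var4", "Variable 4"
)

# deframe(): first column becomes the names, second column the values
new_names_vec <- deframe(new_names[, c("new_names_var", "names_var")])

# Base R equivalent: the values are the current names, the names are the new names
new_names_vec_base <- setNames(new_names$names_var, new_names$new_names_var)

# Compare the two vectors
identical(new_names_vec, new_names_vec_base)
```

Either way, the new names end up as the names of the vector, which is the `new = old` layout that rename() expects.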
Solution
We can solve this with tidy evaluation tools, namely the unquote-splice operator !!!, or with the tidyselect helper any_of(), which dplyr re-exports. Reading the article written by Tim Tiefenbach, I was able to come up with the solutions below.
Using !!!
Our goal is to unpack the vector of name pairs for the columns that are actually in our data frame. We can do this with the unquote-splice operator !!!, which splices the elements of the named vector into the dynamic dots ... of rename().
However, the lookup table also contains var3, which does not exist in the data frame, so rename() throws an error.
test_tib |>
  rename(!!!new_names_vec)
Error in `rename()`:
! Can't rename columns that don't exist.
✖ Column `var3` doesn't exist.
To fix this, keep only the elements of the named vector new_names_vec whose values are column names of test_tib.
test_tib |>
  rename(!!!new_names_vec[new_names_vec %in% names(test_tib)])
# A tibble: 3 × 3
`Variable 1` `Variable 2` `Variable 4`
<chr> <chr> <int>
1 x a 1
2 y b 2
3 z c 3
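To make the splicing step more concrete, the sketch below compares the spliced call with the call it stands for when written out by hand. The names present, spliced and manual are hypothetical and introduced only for this illustration; the data and the named vector are repeated so the snippet is self-contained.

```r
library(dplyr)
library(tibble)

test_tib <- tribble(
  ~var1, ~var2, ~var4,
  "x", "a", 1L,
  "y", "b", 2L,
  "z", "c", 3L
)

new_names_vec <- c(
  "Variable 1" = "var1",
  "Variable 2" = "var2",
  "Variable 3" = "var3",
  "Variable 4" = "var4"
)

# Keep only the pairs whose current name exists in the data frame
present <- new_names_vec[new_names_vec %in% names(test_tib)]

# Splice the remaining pairs into the dots of rename()
spliced <- rename(test_tib, !!!present)

# The same call with each pair passed as a separate named argument
manual <- rename(
  test_tib,
  `Variable 1` = "var1",
  `Variable 2` = "var2",
  `Variable 4` = "var4"
)

# Compare the two results
identical(spliced, manual)
```

Each element of the vector becomes one `new = "old"` argument, which is why the vector has to be filtered before it is spliced.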
Using any_of()
Instead of subsetting the vector by hand, you can use any_of(), which ignores names that are not present in the data frame.
test_tib |>
  rename(any_of(new_names_vec))
# A tibble: 3 × 3
`Variable 1` `Variable 2` `Variable 4`
<chr> <chr> <int>
1 x a 1
2 y b 2
3 z c 3
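If the same lookup table is applied to several data frames, the any_of() approach could be wrapped in a small helper. The function rename_from_lookup() and its arguments old and new are hypothetical, not part of dplyr; this is a minimal sketch that assumes the lookup table has one column of current names and one column of new names.

```r
library(dplyr)
library(tibble)

# Hypothetical helper: rename the columns of `data` that appear in `lookup`.
# `old` and `new` give the lookup columns holding current and new names.
rename_from_lookup <- function(data, lookup,
                               old = "names_var", new = "new_names_var") {
  # Named vector in the new = old layout expected by rename()
  name_map <- setNames(lookup[[old]], lookup[[new]])
  # any_of() ignores lookup entries that are missing from `data`
  rename(data, any_of(name_map))
}

test_tib <- tribble(
  ~var1, ~var2, ~var4,
  "x", "a", 1L,
  "y", "b", 2L,
  "z", "c", 3L
)

new_names <- tribble(
  ~names_var, ~new_names_var,
  "var1", "Variable 1",
  "var2", "Variable 2",
  "var3", "Variable 3",
  "var4", "Variable 4"
)

renamed <- rename_from_lookup(test_tib, new_names)
names(renamed)
```

Because any_of() is tolerant of missing columns, the helper does not report lookup entries that matched nothing; whether that silence is desirable depends on the use case.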
References
These examples are inspired by the article by Tim Tiefenbach mentioned above.
Citation
@online{lettry2023,
author = {Lettry, Layal Christine},
title = {Rename Variables in a Data Frame Using an External Lookup
Table},
date = {2023-10-08},
url = {https://rdiscovery.netlify.app/posts/2023-10-08_rename-columns-lookup/},
langid = {en}
}