Rename variables in a data frame using an external lookup table

unquote-splice, tidy evaluation, rename, any_of

Author: Layal Christine Lettry

Affiliations: cynkra GmbH; University of Fribourg, Dept. of Informatics, ASAM Group

Published: October 8, 2023

Addressing the challenge of renaming multiple variables through an external lookup table using tidy evaluation techniques

Problem

Suppose we have a data frame in which some columns already have the right names, while others need to be renamed. A lookup table that maps the current names to the new ones already exists, and we want to use it to rename those columns.

library(tidyverse)

Here is the data frame with three variables: var1, var2 and var4. Note that it has no column named var3.

test_tib <- tribble(
  ~var1,      ~var2,   ~var4,
  "x",        "a",     1L,
  "y",        "b",     2L,
  "z",        "c",     3L
)
test_tib
# A tibble: 3 × 3
  var1  var2   var4
  <chr> <chr> <int>
1 x     a         1
2 y     b         2
3 z     c         3

Define the lookup table with the new names, then transform it into a named vector using deframe(). deframe() uses the first column of its input as the names and the second column as the values, so the new names must come first and the current names second. This is why select() reorders the columns before the call to deframe().

new_names <- tribble(
  ~names_var, ~new_names_var,
  "var1",     "Variable 1",
  "var2",     "Variable 2",
  "var3",     "Variable 3",
  "var4",     "Variable 4"
)
new_names
# A tibble: 4 × 2
  names_var new_names_var
  <chr>     <chr>        
1 var1      Variable 1   
2 var2      Variable 2   
3 var3      Variable 3   
4 var4      Variable 4   
# Put the new names first so that they become the names of the vector
new_names_vec <- deframe(select(new_names, new_names_var, names_var))
new_names_vec
Variable 1 Variable 2 Variable 3 Variable 4 
    "var1"     "var2"     "var3"     "var4" 

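As a cross-check on the column order, the same vector can be built in base R with setNames(), whose first argument holds the values and whose second holds the names. This is only a minimal sketch and not part of the original workflow; the object names vec_deframe and vec_base are made up for illustration.

```r
library(dplyr)
library(tibble)

new_names <- tribble(
  ~names_var, ~new_names_var,
  "var1",     "Variable 1",
  "var2",     "Variable 2",
  "var3",     "Variable 3",
  "var4",     "Variable 4"
)

# deframe(): first column -> names, second column -> values
vec_deframe <- deframe(select(new_names, new_names_var, names_var))

# Base R equivalent: setNames(values, names)
vec_base <- setNames(new_names$names_var, new_names$new_names_var)

# Compare names and values of the two vectors
identical(names(vec_deframe), names(vec_base))
identical(unname(vec_deframe), unname(vec_base))
```

If the columns were passed to deframe() in their original order, names and values would be swapped, and rename() would later look for columns called Variable 1, Variable 2, and so on.
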
Solution

We can solve this with tidy evaluation tools, namely the unquote-splice operator !!!, or with the tidyselect helper any_of(), which dplyr re-exports. Reading the article written by Tim Tiefenbach, I was able to come up with the solutions below.

Using !!!

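Our goal is to unpack the vector of name pairs, restricted to the columns that are actually in our data frame. The unquote-splice operator !!! does the unpacking: it splices the elements of the named vector into the dynamic dots ... of rename(), so that each element is passed as a separate argument of the form new_name = "old_name".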
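To make the effect of the splice concrete, the sketch below compares a spliced call with the same call typed out by hand. It uses a shortened vector, small_vec, which is an illustrative name and contains only columns that exist in the data frame.

```r
library(dplyr)
library(tibble)

test_tib <- tribble(
  ~var1, ~var2, ~var4,
  "x",   "a",   1L,
  "y",   "b",   2L,
  "z",   "c",   3L
)

# Illustrative vector: every value is an existing column of test_tib
small_vec <- c("Variable 1" = "var1", "Variable 2" = "var2")

# Spliced version: each element becomes one argument of rename()
spliced <- test_tib |>
  rename(!!!small_vec)

# The same call with the arguments written out by hand
typed_out <- test_tib |>
  rename(`Variable 1` = "var1", `Variable 2` = "var2")

identical(names(spliced), names(typed_out))
```

Both calls should produce the same column names, because !!! turns each name-value pair of the vector into one argument of rename().
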

With the full lookup vector, however, rename() fails: var3 is listed in new_names_vec but is not a column of test_tib.

test_tib |>
  rename(!!!new_names_vec)
Error in `rename()`:
! Can't rename columns that don't exist.
✖ Column `var3` doesn't exist.

The fix is to keep only the elements of new_names_vec whose values (the current names) are columns of test_tib, and to splice that subset.

test_tib |>
  rename(!!!new_names_vec[new_names_vec %in% names(test_tib)])
# A tibble: 3 × 3
  `Variable 1` `Variable 2` `Variable 4`
  <chr>        <chr>               <int>
1 x            a                       1
2 y            b                       2
3 z            c                       3
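If the same renaming is needed in several places, the filter-then-splice step can be wrapped in a small function. The sketch below is one possible way to do this; rename_from_lookup() is a hypothetical helper, not a dplyr function, and it assumes that lookup is a named character vector of the form c(new_name = "old_name").

```r
library(dplyr)
library(tibble)

test_tib <- tribble(
  ~var1, ~var2, ~var4,
  "x",   "a",   1L,
  "y",   "b",   2L,
  "z",   "c",   3L
)

new_names_vec <- c(
  "Variable 1" = "var1",
  "Variable 2" = "var2",
  "Variable 3" = "var3",
  "Variable 4" = "var4"
)

# Hypothetical helper: rename the columns of `data` that appear in `lookup`
rename_from_lookup <- function(data, lookup) {
  # Keep only the pairs whose current name is a column of `data`
  present <- lookup[lookup %in% names(data)]
  # Splice the remaining pairs into rename()
  rename(data, !!!present)
}

renamed <- rename_from_lookup(test_tib, new_names_vec)
names(renamed)
```

The intermediate object present makes it easy to inspect which pairs are applied before the renaming happens.
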

Using any_of()

Instead of filtering the vector yourself, you can wrap it in any_of(), which ignores the elements that do not match any column and renames the rest.

test_tib |>
  rename(any_of(new_names_vec))
# A tibble: 3 × 3
  `Variable 1` `Variable 2` `Variable 4`
  <chr>        <chr>               <int>
1 x            a                       1
2 y            b                       2
3 z            c                       3
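For comparison, the strict counterpart all_of() requires every element of the vector to match a column. The sketch below contrasts the two helpers on the same data; the object names strict and lenient are illustrative, and tryCatch() is used only to capture the error instead of stopping the script.

```r
library(dplyr)
library(tibble)

test_tib <- tribble(
  ~var1, ~var2, ~var4,
  "x",   "a",   1L,
  "y",   "b",   2L,
  "z",   "c",   3L
)

new_names_vec <- c(
  "Variable 1" = "var1",
  "Variable 2" = "var2",
  "Variable 3" = "var3",
  "Variable 4" = "var4"
)

# all_of() is strict: var3 is missing, so this call signals an error,
# which tryCatch() returns as a condition object
strict <- tryCatch(
  test_tib |> rename(all_of(new_names_vec)),
  error = function(e) e
)
inherits(strict, "error")

# any_of() is lenient: the var3 pair is skipped
lenient <- test_tib |> rename(any_of(new_names_vec))
names(lenient)
```

all_of() is the safer choice when a missing column should be treated as a mistake in the lookup table, while any_of() suits lookup tables that are shared across data frames with different columns.
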

References

These examples are inspired by the article by Tim Tiefenbach mentioned in the Solution section.

Citation

BibTeX citation:
@online{lettry2023,
  author = {Lettry, Layal Christine},
  title = {Rename Variables in a Data Frame Using an External Lookup
    Table},
  date = {2023-10-08},
  url = {https://rdiscovery.netlify.app/posts/2023-10-08_rename-columns-lookup/},
  langid = {en}
}
For attribution, please cite this work as:
Lettry, Layal Christine. 2023. “Rename Variables in a Data Frame Using an External Lookup Table.” October 8, 2023. https://rdiscovery.netlify.app/posts/2023-10-08_rename-columns-lookup/.