What are the main differences between openxlsx and openxlsx2?

Introduction

Within this blog post, my aim is to elucidate the primary distinctions between openxlsx and openxlsx2. I’ve encountered these variances through a Github Gist, the informative vignette titled “Update from openxlsx”, and my personal hands-on experience.

The provided tables are designed to offer a clear overview of the differences between openxlsx and openxlsx2. They serve as a concise presentation, illustrating the content of the vignette titled “Update from openxlsx,” which can be explored in detail here.

Chaining versus Piping

In contrast to openxlsx, openxlsx2 enables the creation of workbooks, the addition of styles, data writing, and other operations through the use of chained or piped operations.

In their recent update on openxlsx2, Jordan Mark Barbone and Jan Marvin Garbuszus shared a significant development in this vignette section: the transition to R6 in openxlsx2. This shift signifies a fundamental change in the relationship between methods and objects within openxlsx2, where methods are now associated with objects rather than generics.

Therefore, functions can now be seamlessly chained. Moreover, operation pipelines can now be used in openxlsx2 as opposed to openxlsx. This change was triggered by the restructuration of the internal framework of openxlsx.

For those seeking a deeper understanding of R6, please refer to Chapter 6 of “Advanced R” by Hadley Wickham. This chapter provides comprehensive insights into the principles and applications of R6.

In the tables below, we’ve highlighted the key arguments for each function. For a deeper dive into the specifics of each function’s arguments, check out the examples in the following sections.

Kindly explore this specific vignette section for an example that contrasts openxlsx and openxlsx2. Unlike openxlsx, which lacks integration with pipes or chains, openxlsx2 offers flexibility in its application.

In the forthcoming examples, I will employ pipes and chains to illustrate diverse applications of the openxlsx2 functions. Feel free to use your preferred method!

Read an existing Excel Workbook

Read an existing Excel workbook
		openxlsx2
Action	openxlsx	chain	pipe
Load file		`wb_load()`	`wb_load()`
Read file		`wb_to_df()`	`wb_to_df()`
Read file	`read.xlsx()`	`read_xlsx()`	`read_xlsx()`
Read file	`readWorkbook()`	`wb_read()`	`wb_read()`

Please be aware that the recently introduced functions in openxlsx2, namely read_xlsx() and wb_read(), serve as mere wrappers for wb_to_df(). Their purpose is to facilitate a seamless transition between openxlsx and openxlsx2.

Example

In this example, I want to illustrate how to read an Excel file efficiently.

You can use wb_to_df(), wb_read() or read_xlsx() which almost have all arguments in common, since wb_read() and read_xlsx() are wrappers of wb_to_df().

When you read an Excel file, you can define the types of the columns directly. Please refer to the Details section for more information. Please keep in mind that the type argument must be a named numeric where each variables belongs to one of these instances:

0: character
1: numeric
2: date
3: posixt (datetime)
4: logical

file <- system.file("extdata", "openxlsx2_example.xlsx", package = "openxlsx2")

# With wb_load
my_wb_loaded <- wb_load(file)
my_wb_loaded

A Workbook object.
 
Worksheets:
 Sheets: Sheet1, Sheet2 
 Write order: 1, 2

wb_open(my_wb_loaded) # take a look at the raw xlsx file
my_df_loaded <- wb_to_df(my_wb_loaded,
  sheet = 1,
  skip_empty_cols = TRUE,
  skip_empty_rows = TRUE,
  show_formula = TRUE,
  na.strings = c("#NUM!", "#VALUE!"),
  types = c(
    "Var1" = 4, "Var2" = 1, "Var3" = 1, "Var4" = 0, "Var5" = 2,
    "Var6" = 0, "Var7" = 0, "Var8" = 3
  )
)
my_df_loaded

    Var1 Var2   Var3 Var4       Var5         Var6            Var7
3   TRUE    1   1.00    a 2023-05-29 3209324 This            E3/0
4   TRUE   NA     NA    b 2023-05-23         <NA>              C4
5   TRUE    2   1.34    c 2023-02-01         <NA>            <NA>
6  FALSE    2     NA <NA>       <NA>         <NA>           C6+E6
7  FALSE    3   1.56    e       <NA>         <NA>            <NA>
8  FALSE    1   1.70    f 2023-03-02         <NA>           C8+E8
10 FALSE    2  23.00    h 2023-12-24         <NA>    SUM(C10,E10)
11 FALSE    3  67.30    i 2023-12-25         <NA> PRODUCT(C11,E3)
12    NA    1 123.00 <NA> 2023-07-31         <NA>         E12-C12
                  Var8
3  1899-12-31 01:27:15
4  1899-12-31 14:02:57
5  1899-12-31 23:01:02
6  1899-12-31 17:24:53
7                 <NA>
8  1899-12-31 08:45:58
10                <NA>
11                <NA>
12                <NA>

# With read_xlsx
my_xslx_read <- read_xlsx(file,
  convert = FALSE # to keep all variables as characters
)
glimpse(my_xslx_read)

Rows: 10
Columns: 9
$ Var1 <chr> "TRUE", "TRUE", "TRUE", "FALSE", "FALSE", "FALSE", NA, "FALSE", "…
$ Var2 <chr> "1", NA, "2", "2", "3", "1", NA, "2", "3", "1"
$ NA   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ Var3 <chr> "1", "#NUM!", "1.34", NA, "1.56", "1.7", NA, "23", "67.3", "123"
$ Var4 <chr> "a", "b", "c", "#NUM!", "e", "f", NA, "h", "i", NA
$ Var5 <chr> "2023-05-29", "2023-05-23", "2023-02-01", NA, NA, "2023-03-02", N…
$ Var6 <chr> "3209324 This", NA, NA, NA, NA, NA, NA, NA, NA, NA
$ Var7 <chr> "#DIV/0!", "0", "#VALUE!", "2", NA, "2.7", NA, "25", "3", "122"
$ Var8 <chr> "6.0590277777777778E-2", "0.58538194444444447", "0.95905092592592…

# With wb_read
my_wb_read <- wb_read(file,
  detect_dates = TRUE
)
glimpse(my_wb_read)

Rows: 10
Columns: 9
$ Var1 <lgl> TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, NA, FALSE, FALSE, NA
$ Var2 <dbl> 1, NA, 2, 2, 3, 1, NA, 2, 3, 1
$ NA   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ Var3 <chr> "1", "#NUM!", "1.34", NA, "1.56", "1.7", NA, "23", "67.3", "123"
$ Var4 <chr> "a", "b", "c", "#NUM!", "e", "f", NA, "h", "i", NA
$ Var5 <date> 2023-05-29, 2023-05-23, 2023-02-01, NA, NA, 2023-03-02, NA, 2023-…
$ Var6 <chr> "3209324 This", NA, NA, NA, NA, NA, NA, NA, NA, NA
$ Var7 <chr> "#DIV/0!", "0", "#VALUE!", "2", NA, "2.7", NA, "25", "3", "122"
$ Var8 <time> 01:27:15, 14:02:57, 23:01:02, 17:24:53,       NA, 08:45:58,      …

Note that all returned objects have the class data.frame.

class(my_df_loaded)

[1] "data.frame"

class(my_xslx_read)

[1] "data.frame"

class(my_wb_read)

[1] "data.frame"

Write a File

Write an xlsx or xlsm workbook
		openxlsx2
Action	openxlsx	chain	pipe
Create temporary xlsm file		`temp_xlsx(macros = TRUE)`	`temp_xlsx(macros = TRUE)`
Create temporary xlsx file	`temp_xlsx()`	`temp_xlsx()`	`temp_xlsx()`
Write file	`write.xlsx()`	`write_xlsx()`	`write_xlsx()`

Example

Let’s say we want to put the information about palmerpenguins::penguins in a new Excel file.

Then, we would like to include palmerpenguins::penguins_raw in the same Excel file.

Eventually, we want an xlsm file.

# Write a data frame into an xlsx file
my_tempfile <- temp_xlsx()
basename(my_tempfile)

[1] "temp_xlsx_10197b87fb55.xlsx"

write_xlsx(penguins, my_tempfile, col_names = TRUE)
read_xlsx(my_tempfile) |>
  as_tibble()

# A tibble: 344 × 8
   species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
   <chr>   <chr>              <dbl>         <dbl>             <dbl>       <dbl>
 1 Adelie  Torgersen           39.1          18.7               181        3750
 2 Adelie  Torgersen           39.5          17.4               186        3800
 3 Adelie  Torgersen           40.3          18                 195        3250
 4 Adelie  Torgersen           NA            NA                  NA          NA
 5 Adelie  Torgersen           36.7          19.3               193        3450
 6 Adelie  Torgersen           39.3          20.6               190        3650
 7 Adelie  Torgersen           38.9          17.8               181        3625
 8 Adelie  Torgersen           39.2          19.6               195        4675
 9 Adelie  Torgersen           34.1          18.1               193        3475
10 Adelie  Torgersen           42            20.2               190        4250
# ℹ 334 more rows
# ℹ 2 more variables: sex <chr>, year <dbl>

# Write a list of data frames into an xlsx file
my_2nd_tempfile <- temp_xlsx()
basename(my_2nd_tempfile)

[1] "temp_xlsx_101951d88fdb.xlsx"

write_xlsx(
  list(
    "penguins" = penguins,
    "penguins_raw" = penguins_raw
  ),
  my_2nd_tempfile,
  start_col = c(1L, 2L),
  start_row = c(2L, 3L),
  col_names = TRUE
)
browseURL(my_2nd_tempfile) # open in Excel
read_xlsx(my_2nd_tempfile, sheet = "penguins") |>
  as_tibble()

# A tibble: 344 × 8
   species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
   <chr>   <chr>              <dbl>         <dbl>             <dbl>       <dbl>
 1 Adelie  Torgersen           39.1          18.7               181        3750
 2 Adelie  Torgersen           39.5          17.4               186        3800
 3 Adelie  Torgersen           40.3          18                 195        3250
 4 Adelie  Torgersen           NA            NA                  NA          NA
 5 Adelie  Torgersen           36.7          19.3               193        3450
 6 Adelie  Torgersen           39.3          20.6               190        3650
 7 Adelie  Torgersen           38.9          17.8               181        3625
 8 Adelie  Torgersen           39.2          19.6               195        4675
 9 Adelie  Torgersen           34.1          18.1               193        3475
10 Adelie  Torgersen           42            20.2               190        4250
# ℹ 334 more rows
# ℹ 2 more variables: sex <chr>, year <dbl>

browseURL(my_2nd_tempfile) # open in Excel
read_xlsx(my_2nd_tempfile, sheet = "penguins_raw") |>
  as_tibble()

# A tibble: 344 × 17
   studyName `Sample Number` Species         Region Island Stage `Individual ID`
   <chr>               <dbl> <chr>           <chr>  <chr>  <chr> <chr>          
 1 PAL0708                 1 Adelie Penguin… Anvers Torge… Adul… N1A1           
 2 PAL0708                 2 Adelie Penguin… Anvers Torge… Adul… N1A2           
 3 PAL0708                 3 Adelie Penguin… Anvers Torge… Adul… N2A1           
 4 PAL0708                 4 Adelie Penguin… Anvers Torge… Adul… N2A2           
 5 PAL0708                 5 Adelie Penguin… Anvers Torge… Adul… N3A1           
 6 PAL0708                 6 Adelie Penguin… Anvers Torge… Adul… N3A2           
 7 PAL0708                 7 Adelie Penguin… Anvers Torge… Adul… N4A1           
 8 PAL0708                 8 Adelie Penguin… Anvers Torge… Adul… N4A2           
 9 PAL0708                 9 Adelie Penguin… Anvers Torge… Adul… N5A1           
10 PAL0708                10 Adelie Penguin… Anvers Torge… Adul… N5A2           
# ℹ 334 more rows
# ℹ 10 more variables: `Clutch Completion` <chr>, `Date Egg` <date>,
#   `Culmen Length (mm)` <dbl>, `Culmen Depth (mm)` <dbl>,
#   `Flipper Length (mm)` <dbl>, `Body Mass (g)` <dbl>, Sex <chr>,
#   `Delta 15 N (o/oo)` <dbl>, `Delta 13 C (o/oo)` <dbl>, Comments <chr>

# Write a data frame into an xlsm file
my_3rd_tempfile <- temp_xlsx(macros = TRUE)
basename(my_3rd_tempfile)

[1] "temp_xlsx_10193f5470e0.xlsm"

write_xlsx(penguins, my_3rd_tempfile, col_names = TRUE)
read_xlsx(my_3rd_tempfile) |>
  as_tibble()

# A tibble: 344 × 8
   species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
   <chr>   <chr>              <dbl>         <dbl>             <dbl>       <dbl>
 1 Adelie  Torgersen           39.1          18.7               181        3750
 2 Adelie  Torgersen           39.5          17.4               186        3800
 3 Adelie  Torgersen           40.3          18                 195        3250
 4 Adelie  Torgersen           NA            NA                  NA          NA
 5 Adelie  Torgersen           36.7          19.3               193        3450
 6 Adelie  Torgersen           39.3          20.6               190        3650
 7 Adelie  Torgersen           38.9          17.8               181        3625
 8 Adelie  Torgersen           39.2          19.6               195        4675
 9 Adelie  Torgersen           34.1          18.1               193        3475
10 Adelie  Torgersen           42            20.2               190        4250
# ℹ 334 more rows
# ℹ 2 more variables: sex <chr>, year <dbl>

Parse XML Files

Aside from the transition to R6, another significant improvement in openxlsx2, compared to openxlsx, is the inclusion of pugixml to reliably parse XML files.

# source: https://www.geeksforgeeks.org/working-with-xml-files-in-r-programming/
path_xml <- file.path(
  here::here(),
  "posts", "2023-11-11_openxlsx2", "records.xml"
)

# Read an xml file
my_xml_file <- read_xml(path_xml)
my_xml_file

<RECORDS>
 <STUDENT>
  <ID>1</ID>
  <NAME>Alia</NAME>
  <MARKS>620</MARKS>
  <BRANCH>IT</BRANCH>
 </STUDENT>
 <STUDENT>
  <ID>2</ID>
  <NAME>Brijesh</NAME>
  <MARKS>440</MARKS>
  <BRANCH>Commerce</BRANCH>
 </STUDENT>
 <STUDENT>
  <ID>3</ID>
  <NAME>Yash</NAME>
  <MARKS>600</MARKS>
  <BRANCH>Humanities</BRANCH>
 </STUDENT>
 <STUDENT>
  <ID>4</ID>
  <NAME>Mallika</NAME>
  <MARKS>660</MARKS>
  <BRANCH>IT</BRANCH>
 </STUDENT>
 <STUDENT>
  <ID>5</ID>
  <NAME>Zayn</NAME>
  <MARKS>560</MARKS>
  <BRANCH>IT</BRANCH>
 </STUDENT>
</RECORDS>

# Return node
node_records <- xml_node(my_xml_file, level1 = "RECORDS")
node_records

[1] "<RECORDS><STUDENT><ID>1</ID><NAME>Alia</NAME><MARKS>620</MARKS><BRANCH>IT</BRANCH></STUDENT><STUDENT><ID>2</ID><NAME>Brijesh</NAME><MARKS>440</MARKS><BRANCH>Commerce</BRANCH></STUDENT><STUDENT><ID>3</ID><NAME>Yash</NAME><MARKS>600</MARKS><BRANCH>Humanities</BRANCH></STUDENT><STUDENT><ID>4</ID><NAME>Mallika</NAME><MARKS>660</MARKS><BRANCH>IT</BRANCH></STUDENT><STUDENT><ID>5</ID><NAME>Zayn</NAME><MARKS>560</MARKS><BRANCH>IT</BRANCH></STUDENT></RECORDS>"

node_student <- xml_node(my_xml_file, level1 = "RECORDS", level2 = "STUDENT")
node_student

[1] "<STUDENT><ID>1</ID><NAME>Alia</NAME><MARKS>620</MARKS><BRANCH>IT</BRANCH></STUDENT>"         
[2] "<STUDENT><ID>2</ID><NAME>Brijesh</NAME><MARKS>440</MARKS><BRANCH>Commerce</BRANCH></STUDENT>"
[3] "<STUDENT><ID>3</ID><NAME>Yash</NAME><MARKS>600</MARKS><BRANCH>Humanities</BRANCH></STUDENT>" 
[4] "<STUDENT><ID>4</ID><NAME>Mallika</NAME><MARKS>660</MARKS><BRANCH>IT</BRANCH></STUDENT>"      
[5] "<STUDENT><ID>5</ID><NAME>Zayn</NAME><MARKS>560</MARKS><BRANCH>IT</BRANCH></STUDENT>"

node_name <- xml_node(my_xml_file,
  level1 = "RECORDS", level2 = "STUDENT",
  level3 = "NAME"
)
node_name

[1] "<NAME>Alia</NAME>"    "<NAME>Brijesh</NAME>" "<NAME>Yash</NAME>"   
[4] "<NAME>Mallika</NAME>" "<NAME>Zayn</NAME>"

node_branch <- xml_node(my_xml_file,
  level1 = "RECORDS", level2 = "STUDENT",
  level3 = "BRANCH"
)
node_branch

[1] "<BRANCH>IT</BRANCH>"         "<BRANCH>Commerce</BRANCH>"  
[3] "<BRANCH>Humanities</BRANCH>" "<BRANCH>IT</BRANCH>"        
[5] "<BRANCH>IT</BRANCH>"

# Load character string to pugixml and returns an external pointer
as_xml(node_student)

<STUDENT>
 <ID>1</ID>
 <NAME>Alia</NAME>
 <MARKS>620</MARKS>
 <BRANCH>IT</BRANCH>
</STUDENT>
<STUDENT>
 <ID>2</ID>
 <NAME>Brijesh</NAME>
 <MARKS>440</MARKS>
 <BRANCH>Commerce</BRANCH>
</STUDENT>
<STUDENT>
 <ID>3</ID>
 <NAME>Yash</NAME>
 <MARKS>600</MARKS>
 <BRANCH>Humanities</BRANCH>
</STUDENT>
<STUDENT>
 <ID>4</ID>
 <NAME>Mallika</NAME>
 <MARKS>660</MARKS>
 <BRANCH>IT</BRANCH>
</STUDENT>
<STUDENT>
 <ID>5</ID>
 <NAME>Zayn</NAME>
 <MARKS>560</MARKS>
 <BRANCH>IT</BRANCH>
</STUDENT>

as_xml(node_name)

<NAME>Alia</NAME>
<NAME>Brijesh</NAME>
<NAME>Yash</NAME>
<NAME>Mallika</NAME>
<NAME>Zayn</NAME>

as_xml(node_branch)

<BRANCH>IT</BRANCH>
<BRANCH>Commerce</BRANCH>
<BRANCH>Humanities</BRANCH>
<BRANCH>IT</BRANCH>
<BRANCH>IT</BRANCH>

# Return value
xml_value(my_xml_file, level1 = "RECORDS", level2 = "STUDENT", level3 = "NAME")

[1] "Alia"    "Brijesh" "Yash"    "Mallika" "Zayn"

xml_value(my_xml_file, level1 = "RECORDS", level2 = "STUDENT", level3 = "BRANCH")

[1] "IT"         "Commerce"   "Humanities" "IT"         "IT"

xml_value(my_xml_file, level1 = "RECORDS", level2 = "STUDENT", level3 = "ID")

[1] "1" "2" "3" "4" "5"

xml_value(my_xml_file, level1 = "RECORDS", level2 = "STUDENT", level3 = "MARKS")

[1] "620" "440" "600" "660" "560"

# Return attributes
xml_attr(my_xml_file, level1 = "RECORDS", level2 = "STUDENT")

[[1]]
named character(0)

[[2]]
named character(0)

[[3]]
named character(0)

[[4]]
named character(0)

[[5]]
named character(0)

xml_attr(my_xml_file, level1 = "RECORDS")

[[1]]
named character(0)

Create an Excel Workbook from Scratch

Let’s say you would like to create a workbook called wb and a worksheet called ws where you want to write data x.

Create a workbook and write data
		openxlsx2
Action	openxlsx	chain	pipe
Create workbook	`createWorkbook()`	`wb_workbook()`	`wb_workbook()`
Add worksheet	`addWorksheet(wb, 'ws')`	`wb$add_worksheet('ws')`	`wb_add_worksheet('ws')`
Add data to worksheet	`writeData(wb, 'ws', x)`	`wb$add_data(x)`	`wb_add_data(x)`
Inspect workbook	`openXL(wb)`	`wb$open()`	`wb_open()`
Save workbook	`saveWorkbook(wb, file)	`wb$save(file)`	`wb_save(file)`

Example

In this example, our aim is to explore various options for creating, saving, and importing an Excel workbook.

Create a Workbook with Multiple Sheets

Chained

start_col_ws1 <- 3L
start_row_ws1 <- 10L

wb <- wb_workbook(
  creator = "My Name",
  title = "My Test",
  subject = "Create a Workbook",
  category = "Test",
  datetime_created = Sys.time(),
  theme = "Office Theme",
  keywords = "test",
  comments = "This is my test",
  manager = "Me",
  company = "My Company"
)$
  add_worksheet(
  sheet = "my_first_sheet",
  tab_color = "blue",
  zoom = 150L,
  visible = TRUE,
  orientation = "portrait",
  header = c("ODD HEAD LEFT", "ODD HEAD CENTER", "ODD HEAD RIGHT"),
  footer = c("ODD FOOT RIGHT", "ODD FOOT CENTER", "ODD FOOT RIGHT"),
  even_header = c("EVEN HEAD LEFT", "EVEN HEAD CENTER", "EVEN HEAD RIGHT"),
  even_footer = c("EVEN FOOT RIGHT", "EVEN FOOT CENTER", "EVEN FOOT RIGHT"),
  first_header = c("TOP", "OF FIRST", "PAGE"),
  first_footer = c("BOTTOM", "OF FIRST", "PAGE")
)$
  add_data(
  x = penguins,
  start_col = start_col_ws1,
  start_row = start_row_ws1,
  array = FALSE,
  col_names = TRUE,
  row_names = FALSE,
  with_filter = FALSE
)$
  add_worksheet(
  sheet = "my_second_sheet",
  tab_color = "red",
  zoom = 75L,
  visible = TRUE,
  orientation = "landscape"
)$
  add_data(
  x = penguins_raw,
  dims = wb_dims(2, 1),
  col_names = TRUE,
  row_names = TRUE,
  with_filter = TRUE
)

Piped

start_col_ws1 <- 3L
start_row_ws1 <- 10L

wb <- wb_workbook(
  creator = "My Name",
  title = "My Test",
  subject = "Create a Workbook",
  category = "Test",
  datetime_created = Sys.time(),
  theme = "Office Theme",
  keywords = "test",
  comments = "This is my test",
  manager = "Me",
  company = "My Company"
) |>
  wb_add_worksheet(
    sheet = "my_first_sheet",
    tab_color = "blue",
    zoom = 150L,
    visible = TRUE,
    orientation = "portrait",
    header = c("ODD HEAD LEFT", "ODD HEAD CENTER", "ODD HEAD RIGHT"),
    footer = c("ODD FOOT RIGHT", "ODD FOOT CENTER", "ODD FOOT RIGHT"),
    even_header = c("EVEN HEAD LEFT", "EVEN HEAD CENTER", "EVEN HEAD RIGHT"),
    even_footer = c("EVEN FOOT RIGHT", "EVEN FOOT CENTER", "EVEN FOOT RIGHT"),
    first_header = c("TOP", "OF FIRST", "PAGE"),
    first_footer = c("BOTTOM", "OF FIRST", "PAGE")
  ) |>
  wb_add_data(
    x = penguins,
    start_col = start_col_ws1,
    start_row = start_row_ws1,
    array = FALSE,
    col_names = TRUE,
    row_names = FALSE,
    with_filter = FALSE
  ) |>
  wb_add_worksheet(
    sheet = "my_second_sheet",
    tab_color = "red",
    zoom = 75L,
    visible = TRUE,
    orientation = "landscape"
  ) |>
  wb_add_data(
    x = penguins_raw,
    dims = wb_dims(2, 1),
    col_names = TRUE,
    row_names = TRUE,
    with_filter = TRUE
  )

Inspect and Write the Workbook Data

You can perform the following steps to examine the data and save it to a temporary xlsx file.

Chained

# Inspect the workbook
wb$open()

# Create a new temporary file
myfile <- temp_xlsx()

# Write the workbook into a temporary xlsx file
wb$save(file = myfile, overwrite = TRUE)

# Open the xlsx file
browseURL(myfile)

Piped

# Inspect the workbook
wb |>
  wb_open()

# Create a new temporary file
myfile <- temp_xlsx()

# Write the workbook into a temporary xlsx file
wb |>
  wb_save(file = myfile, overwrite = TRUE)

# Open the xlsx file
browseURL(myfile)

Please be aware that in order to view the headers and footers, you should click on “View” and select “Header and Footer” (for Mac users). This option enables you to distinguish between the landscape and portrait orientations on the two sheets.

Import a Workbook and Start Working on the Data with dplyr

Here, no operations are available for chaining.

# Import the data of your file into a data frame
my_df <- myfile |>
  wb_to_df(
    sheet = 1,
    detect_dates = TRUE,
    col_names = TRUE
  )

# Display the first few rows of the data frame
my_df |>
  head()

   species    island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
11  Adelie Torgersen           39.1          18.7               181        3750
12  Adelie Torgersen           39.5          17.4               186        3800
13  Adelie Torgersen           40.3          18.0               195        3250
14  Adelie Torgersen             NA            NA                NA          NA
15  Adelie Torgersen           36.7          19.3               193        3450
16  Adelie Torgersen           39.3          20.6               190        3650
      sex year
11   male 2007
12 female 2007
13 female 2007
14   <NA> 2007
15 female 2007
16   male 2007

# Provide a summary of the data frame structure
my_df |>
  glimpse()

Rows: 344
Columns: 8
$ species           <chr> "Adelie", "Adelie", "Adelie", "Adelie", "Adelie", "A…
$ island            <chr> "Torgersen", "Torgersen", "Torgersen", "Torgersen", …
$ bill_length_mm    <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, …
$ bill_depth_mm     <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, …
$ flipper_length_mm <dbl> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 186…
$ body_mass_g       <dbl> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, …
$ sex               <chr> "male", "female", "female", NA, "female", "male", "f…
$ year              <dbl> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…

# Provide the class of the data frame
class(my_df)

[1] "data.frame"

# Convert it into a tibble
my_tib <- as_tibble(my_df)
my_tib

# A tibble: 344 × 8
   species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
   <chr>   <chr>              <dbl>         <dbl>             <dbl>       <dbl>
 1 Adelie  Torgersen           39.1          18.7               181        3750
 2 Adelie  Torgersen           39.5          17.4               186        3800
 3 Adelie  Torgersen           40.3          18                 195        3250
 4 Adelie  Torgersen           NA            NA                  NA          NA
 5 Adelie  Torgersen           36.7          19.3               193        3450
 6 Adelie  Torgersen           39.3          20.6               190        3650
 7 Adelie  Torgersen           38.9          17.8               181        3625
 8 Adelie  Torgersen           39.2          19.6               195        4675
 9 Adelie  Torgersen           34.1          18.1               193        3475
10 Adelie  Torgersen           42            20.2               190        4250
# ℹ 334 more rows
# ℹ 2 more variables: sex <chr>, year <dbl>

Add a Drop-Down List to Your Workbook

Let’s assume you have a workbook called wb.

To add a drop-down list, you must first add the data to a separate sheet¹. We also want to keep it hidden, so we set the argument visible to "hidden". You can then create a drop-down list with wb_add_data_validation().

It is also possible to apply various styles created with wb_add_dxfs_style()² to the values of your drop-down list with wb_add_conditional_formatting().

Add a Drop-Down List to Your Workbook
		openxlsx2
Action	openxlsx	chain	pipe
Add a drop-down list	`dataValidation()`	`wb$add_data_validation()`	`wb_add_data_validation()`
Create a dxfs styling for the workbook	`createStyle()`	`wb$add_dxfs_style()`	`wb_add_dxfs_style()`
Add conditional formatting	`conditionalFormatting()`	`wb$add_conditional_formatting()`	`wb_add_conditional_formatting()`

Example

Chained

wb_ddl <- wb$
  add_worksheet("drop-down_list", visible = "hidden")$
  add_data(
  x = c("huge", "big", "normal", "small", "tiny"),
  start_col = 1L,
  start_row = 1L
)$
  add_data(
  sheet = "my_first_sheet",
  x = "size",
  start_col = which(names(penguins) == last(names(penguins))) + start_col_ws1,
  start_row = start_row_ws1
)$
  add_data_validation(
  sheet = "my_first_sheet",
  dims = wb_dims(
    x = penguins,
    cols = which(names(penguins) == last(names(penguins))) + start_col_ws1,
    from_row = start_row_ws1
  ),
  operator = "equal",
  type = "list",
  value = "'drop-down_list'!$A$1:$A$5"
)

Piped

wb_ddl <- wb |>
  wb_add_worksheet("drop-down_list", visible = "hidden") |>
  wb_add_data(
    x = c("huge", "big", "normal", "small", "tiny"),
    start_col = 1L,
    start_row = 1L
  ) |>
  wb_add_data(
    sheet = "my_first_sheet",
    x = "size",
    start_col = which(names(penguins) == last(names(penguins))) + start_col_ws1,
    start_row = start_row_ws1
  ) |>
  wb_add_data_validation(
    sheet = "my_first_sheet",
    dims = wb_dims(
      x = penguins,
      cols = which(names(penguins) == last(names(penguins))) + start_col_ws1,
      from_row = start_row_ws1
    ),
    operator = "equal",
    type = "list",
    value = "'drop-down_list'!$A$1:$A$5"
  )

Add Colors to Your Drop-Down Values

In order to make it look better, we can format each of the values of the drop-down list.

Chained

dims <- wb_dims(
  x = penguins,
  cols = which(names(penguins) == last(names(penguins))) + start_col_ws1,
  from_row = start_row_ws1
)

wb_cf <- wb_ddl$
  add_dxfs_style(
  name = "style_huge",
  bg_fill = wb_color("#AAAAAA")
)$
  add_dxfs_style(
  name = "style_big",
  bg_fill = wb_color("#6FA8DC")
)$
  add_dxfs_style(
  name = "style_normal",
  bg_fill = wb_color("#00AA00")
)$
  add_dxfs_style(
  name = "style_small",
  bg_fill = wb_color("#CCCC00")
)$
  add_dxfs_style(
  name = "style_tiny",
  bg_fill = wb_color("#CC0000"),
  font_color = wb_color("#EEEEEE")
)$
  add_conditional_formatting(
  dims = dims,
  type = "containsText",
  rule = "huge",
  style = "style_huge"
)$
  add_conditional_formatting(
  dims = dims,
  type = "containsText",
  rule = "big",
  style = "style_big"
)$
  add_conditional_formatting(
  dims = dims,
  type = "containsText",
  rule = "normal",
  style = "style_normal"
)$
  add_conditional_formatting(
  dims = dims,
  type = "containsText",
  rule = "small",
  style = "style_small"
)$
  add_conditional_formatting(
  dims = dims,
  type = "containsText",
  rule = "tiny",
  style = "style_tiny"
)

Piped

dims <- wb_dims(
  x = penguins,
  cols = which(names(penguins) == last(names(penguins))) + start_col_ws1,
  from_row = start_row_ws1
)

wb_cf <- wb_ddl |>
  wb_add_dxfs_style(
    name = "style_huge",
    bg_fill = wb_color("#AAAAAA")
  ) |>
  wb_add_dxfs_style(
    name = "style_big",
    bg_fill = wb_color("#6FA8DC")
  ) |>
  wb_add_dxfs_style(
    name = "style_normal",
    bg_fill = wb_color("#00AA00")
  ) |>
  wb_add_dxfs_style(
    name = "style_small",
    bg_fill = wb_color("#CCCC00")
  ) |>
  wb_add_dxfs_style(
    name = "style_tiny",
    bg_fill = wb_color("#CC0000"),
    font_color = wb_color("#EEEEEE")
  ) |>
  wb_add_conditional_formatting(
    dims = dims,
    type = "containsText",
    rule = "huge",
    style = "style_huge"
  ) |>
  wb_add_conditional_formatting(
    dims = dims,
    type = "containsText",
    rule = "big",
    style = "style_big"
  ) |>
  wb_add_conditional_formatting(
    dims = dims,
    type = "containsText",
    rule = "normal",
    style = "style_normal"
  ) |>
  wb_add_conditional_formatting(
    dims = dims,
    type = "containsText",
    rule = "small",
    style = "style_small"
  ) |>
  wb_add_conditional_formatting(
    dims = dims,
    type = "containsText",
    rule = "tiny",
    style = "style_tiny"
  )

Filter Data, Protect Worksheet and Freeze your a Worksheet Pane

Let’s say you have a workbook called wb and you would like to add some filters, to protect a worksheet and to freeze data.

Manage and Secure your Data
		openxlsx2
Action	openxlsx	chain	pipe
Filter data	`addFilter()`	`wb$add_filter()`	`wb_add_filter()`
Protect worksheet	`protectWorksheet()`	`wb$protect_worksheet()`	`wb_protect_worksheet()`
Freeze data	`freezePane()`	`wb$freeze_pane()`	`wb_freeze_pane()`

Example

It’s important to note that when you use wb_add_filter(), every column in between the specified ones will also inherit a filter. So, if you don’t want a filter to be added to certain columns, you’ll need to rearrange your dataset accordingly to achieve the desired outcome.

In the following example, we will make the following improvements:

Add a filter to the two first columns, namely species and island
Set the first active row to the one that contains data, just below the column names, and the first active column to to the island column (the second one)
Protect the second worksheet called “my_second_sheet”

Chained

wb_pff <- wb_cf$
  add_filter(
  sheet = "my_first_sheet",
  rows = start_row_ws1,
  cols = start_col_ws1 - 1 + c(
    which(names(penguins) == "species"),
    which(names(penguins) == "island")
  )
)$
  freeze_pane(
  first_active_row = start_row_ws1 + 1,
  first_active_col = start_col_ws1 + 1
)$
  protect_worksheet(
  sheet = "my_second_sheet"
)$
  protect_worksheet(
  sheet = "my_first_sheet",
  protect = TRUE,
  # Formatting cells / columns is allowed,
  # but inserting / deleting columns is protected:
  c(
    formatCells = FALSE, formatColumns = FALSE,
    insertColumns = TRUE, deleteColumns = TRUE
  )
)

Piped

wb_pff <- wb_cf |>
  wb_add_filter(
    sheet = "my_first_sheet",
    rows = start_row_ws1,
    cols = start_col_ws1 - 1 + c(
      which(names(penguins) == "species"),
      which(names(penguins) == "island")
    )
  ) |>
  wb_freeze_pane(
    first_active_row = start_row_ws1 + 1,
    first_active_col = start_col_ws1 + 1
  ) |>
  wb_protect_worksheet(
    sheet = "my_second_sheet"
  ) |>
  wb_protect_worksheet(
    sheet = "my_first_sheet",
    protect = TRUE,
    # Formatting cells / columns is allowed,
    # but inserting / deleting columns is protected:
    c(
      formatCells = FALSE, formatColumns = FALSE,
      insertColumns = TRUE, deleteColumns = TRUE
    )
  )

Make Your Data Shine

Let’s say you have a workbook called wb and you would like to some styling.

The following functions will help you format your data as you wish in your worksheet.

Format your Data
		openxlsx2
Action	openxlsx	chain	pipe
Add font	`addStyle()`, `createStyle()`	`wb$add_font()`	`wb_add_font()`
Add cell style	`addStyle()`, `createStyle()`	`wb$add_cell_style()`	`wb_add_cell_style()`
Add background fill color	`addStyle()`, `createStyle()`	`wb$add_fill()`	`wb_add_fill()`
Add number formatting	`addStyle()`, `createStyle()`	`wb$add_numfmt()`	`wb_add_numfmt()`
Set column width	`setColWidths()`	`wb$set_col_widths()`	`wb_set_col_widths()`
Add border	`addStyle()`, `createStyle()`	`wb$add_border()`	`wb_add_border()`

These helpers could be used to apply these formatting functions:

wb_dims(): Helper to specify the dims argument
int2col(): Convert integer to Excel column

Example

Chained

ws1_colnames <- wb_dims(
  rows = start_row_ws1,
  cols = seq_len(
    ncol(penguins) + 1
  ) +
    start_col_ws1 - 1
)

ws1_data <- wb_dims(
  x = penguins,
  from_row = start_row_ws1,
  rows = seq_len(nrow(penguins)),
  from_col = start_col_ws1,
  cols = seq_len(ncol(penguins) + 1)
)

col_bmg <- wb_dims(
  x = penguins,
  from_row = start_row_ws1,
  rows = seq_len(nrow(penguins)),
  cols = which(names(penguins) == "body_mass_g") +
    start_col_ws1 - 1
)

col_size <- wb_dims(
  x = penguins,
  cols = which(names(penguins) == last(names(penguins))) + start_col_ws1,
  from_row = start_row_ws1
)

wb_format <- wb_pff$
  add_font(
  sheet = 1,
  dims = ws1_colnames, # add font to the original penguins variables
  size = 12,
  bold = TRUE
)$
  add_border(
  sheet = 1,
  dims = ws1_data,
  bottom_border = "thin",
  left_border = "thin",
  right_border = "thin",
  top_border = NULL
)$
  add_border(
  sheet = 1,
  dims = ws1_colnames,
  bottom_border = "thick",
  left_border = "thick",
  right_border = "thick",
  top_border = "thick"
)$
  add_cell_style(
  dims = ws1_data,
  vertical = "top",
  horizontal = "left",
  wrap_text = TRUE,
  locked = FALSE
)$
  add_cell_style(
  dims = ws1_colnames,
  vertical = "center",
  horizontal = "center",
  wrap_text = TRUE
)$
  set_col_widths(
  cols = int2col(
    seq_len(ncol(penguins) + 1)
  ),
  widths = 22
)$
  add_numfmt(
  dims = col_bmg,
  numfmt = "#.0"
)$
  add_fill(
  dims = col_size,
  color = wb_color("#d9d2e9")
)

Piped

ws1_colnames <- wb_dims(
  rows = start_row_ws1,
  cols = seq_len(
    ncol(penguins) + 1
  ) +
    start_col_ws1 - 1
)

ws1_data <- wb_dims(
  x = penguins,
  from_row = start_row_ws1,
  rows = seq_len(nrow(penguins)),
  from_col = start_col_ws1,
  cols = seq_len(ncol(penguins) + 1)
)

col_bmg <- wb_dims(
  x = penguins,
  from_row = start_row_ws1,
  rows = seq_len(nrow(penguins)),
  cols = which(names(penguins) == "body_mass_g") +
    start_col_ws1 - 1
)

col_size <- wb_dims(
  x = penguins,
  cols = which(names(penguins) == last(names(penguins))) + start_col_ws1,
  from_row = start_row_ws1
)

wb_format <- wb_pff |>
  wb_add_font(
    sheet = 1,
    dims = ws1_colnames, # add font to the original penguins variables
    size = 12,
    bold = TRUE
  ) |>
  wb_add_border(
    sheet = 1,
    dims = ws1_data,
    bottom_border = "thin",
    left_border = "thin",
    right_border = "thin",
    top_border = NULL
  ) |>
  wb_add_border(
    sheet = 1,
    dims = ws1_colnames,
    bottom_border = "thick",
    left_border = "thick",
    right_border = "thick",
    top_border = "thick"
  ) |>
  wb_add_cell_style(
    dims = ws1_data,
    vertical = "top",
    horizontal = "left",
    wrap_text = TRUE,
    locked = FALSE
  ) |>
  wb_add_cell_style(
    dims = ws1_colnames,
    vertical = "center",
    horizontal = "center",
    wrap_text = TRUE
  ) |>
  wb_set_col_widths(
    cols = int2col(
      seq_len(ncol(penguins) + 1)
    ),
    widths = 22
  ) |>
  wb_add_numfmt(
    dims = col_bmg,
    numfmt = "#.0"
  ) |>
  wb_add_fill(
    dims = col_size,
    color = wb_color("#d9d2e9")
  )

Output

You can download the final Excel output here.

Acknowledgements

Thank you for reading. Enjoy using openxlsx2!

Many thanks to the openxlsx2 developers:

References

Footnotes

Please refer to my older blog post on openxlsx.demo.↩︎
The function wb_add_dxfs_style() creates a dxfs styling for the workbook. The element <dxfs> defines a specific formatting. Please refer to c-rex.net for more information.↩︎

Citation

BibTeX citation:

@online{lettry2023,
  author = {Lettry, Layal Christine},
  title = {Switch from Openxlsx to Openxlsx2},
  date = {2023-11-15},
  url = {https://rdiscovery.netlify.app/posts/2023-11-11_openxlsx2/},
  langid = {en}
}

For attribution, please cite this work as:

Lettry, Layal Christine. 2023. “Switch from Openxlsx to Openxlsx2.” November 15, 2023. https://rdiscovery.netlify.app/posts/2023-11-11_openxlsx2/.