5.2 Tidyverse Basics

Since tidyverse is a set of packages that work together, you will often want to load several of them at the same time. The tidyverse authors recognized this and defined a reasonable set of packages to attach at once when you load the tidyverse metapackage (i.e., a package whose purpose is to install and load a collection of other packages):

library(tidyverse)
-- Attaching packages --------------------------------------------- tidyverse 1.3.1 --
v ggplot2 3.3.5     v purrr   0.3.4
v tibble  3.1.6     v dplyr   1.0.7
v tidyr   1.1.4     v stringr 1.4.0
v readr   2.1.1     v forcats 0.5.1
-- Conflicts ------------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()

The packages in the Attaching packages section are those loaded by default, and each of them adds its own set of functions to your R environment. We will mention functions from many of these packages as we go through this chapter; for now, here is a table of the packages with a brief description of what each one does:

Package   Description
ggplot2   Plotting using the grammar of graphics
tibble    Simple and sane data frames
tidyr     Operations for making data “tidy”
readr     Read rectangular text data into tidyverse
purrr     Functional programming tools for tidyverse
dplyr     “A Grammar of Data Manipulation”
stringr   Makes working with strings in R easier
forcats   Operations for using categorical variables
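As a quick check of what "loaded by default" means in practice, the following sketch tests whether each package from the table is attached to the search path after library(tidyverse). The variable names core and attached are illustrative, and the sketch assumes the tidyverse is installed; newer tidyverse releases may attach additional packages beyond these eight.

```r
library(tidyverse)

# The eight packages listed in the table above
core <- c("ggplot2", "tibble", "tidyr", "readr",
          "purrr", "dplyr", "stringr", "forcats")

# TRUE for every package that is attached to the search path
attached <- paste0("package:", core) %in% search()
names(attached) <- core
attached

# A few of the functions that one of these packages contributes
head(ls("package:forcats"))
```

Any package reported as attached has its exported functions available by bare name, without a prefix.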

Notice the dplyr::filter() syntax in the Conflicts section. filter is defined as a function in both the dplyr package and the stats package, which ships with base R. The stats package is attached by default when you start R, so its filter function (which performs linear filtering on time series data) is already available. When dplyr is loaded, its own filter function masks the stats definition: the stats version still exists, but a bare call to filter now resolves to the dplyr version. This is why the tidyverse package reports a conflict when loaded:

-- Conflicts ------------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()

This is tidyverse telling you that the filter function has been redefined and you should make sure you are aware of that.
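To see the masking for yourself, you can ask R where a bare filter currently resolves. This is a minimal sketch using base R and utils helpers, assuming dplyr is installed; the variable names resolved_in and found_in are illustrative.

```r
library(dplyr)

# Name of the namespace that owns the function a bare filter() call reaches
resolved_in <- environmentName(environment(filter))
resolved_in

# Every attached package that defines filter, in search-path order;
# the first entry is the one that wins
found_in <- find("filter")
found_in
```

The stats definition appears later in the list, which is what "masked" means: it is still defined, but it is no longer the first match.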

However, if you do want to use the original filter function from stats, you can still access it using the :: namespace operator. This operator lets you “look inside” a package, for example dplyr, and access a function exported from that package's namespace, whether or not the package has been attached with library():

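library(dplyr)
# the filter() function definition is now from the dplyr package, and not from the stats package
df <- data.frame(x = c(1, 5, 10))  # small example data frame (illustrative)
# the following two lines of code execute the same function
filter(df, x > 2)
dplyr::filter(df, x > 2)
# to access the stats definition of the filter function, use the namespace operator;
# it expects a numeric series and filter coefficients (here, a 2-point moving average)
stats::filter(df$x, rep(1 / 2, 2))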

Any function that a package exports can be accessed using the :: operator, and it is often a good idea to do so, to ensure you are calling the function you intend.
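One place where explicit prefixes pay off is inside your own functions, which then behave the same regardless of which packages the caller has attached. The sketch below assumes dplyr is installed; keep_above and moving_average are hypothetical helpers written for illustration, not functions from any package.

```r
# Hypothetical helper: keep rows whose `value` column exceeds a threshold.
# The dplyr:: prefix guarantees the data-frame version of filter().
keep_above <- function(data, threshold) {
  dplyr::filter(data, value > threshold)
}

# Hypothetical helper: centred moving average over a numeric vector.
# The stats:: prefix guarantees the time-series version of filter().
moving_average <- function(x, width = 3) {
  as.numeric(stats::filter(x, rep(1 / width, width)))
}

df <- data.frame(id = 1:5, value = c(1, 2, 3, 4, 5))
kept <- keep_above(df, 2)
smoothed <- moving_average(df$value)
```

Neither helper calls library(), so both work even in a session where filter has been masked by yet another package.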