Bioconductor is an organized collection of R packages devoted strictly to biological data analysis methods. These packages are hosted and maintained outside of CRAN because the maintainers enforce more rigorous standards for code quality, testing, and documentation than are required of ordinary R packages. These stricter requirements arose from a recognition that software packages are generally only as usable as their documentation allows them to be, and that many if not most users of these packages are not statisticians or experienced computer programmers. Rather, they are people like us: biological analysis practitioners who may or may not have substantial coding experience but must analyze our data nonetheless. The excellent documentation and community support of the Bioconductor ecosystem are a primary reason why R is such a popular language in biological analysis.
Bioconductor is divided into roughly two sets of packages: core maintainer packages and user-contributed packages. The core maintainer packages are among the most critical, because they define a set of common objects and classes (e.g. the ExpressionSet class in the Biobase package) that all Bioconductor packages know how to work with. This common base provides consistency across Bioconductor packages, which streamlines the learning process. User-contributed and user-maintained packages provide specialized functionality for specific types of analysis.
Bioconductor itself must be installed prior to installing other Bioconductor packages. To install Bioconductor, you can run the following:
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version = "3.14")
Bioconductor undergoes regular releases, each identified by a version, e.g. version = "3.14" at the time of writing. Every Bioconductor package also has its own version, and each of those versions may or may not be compatible with a specific version of Bioconductor. To make matters worse, Bioconductor versions are only compatible with certain versions of R. Bioconductor version 3.14 requires R version 4.1.0 and will not work with older R versions. These constraints can cause major compatibility issues when you are forced to use older versions of R, as may sometimes be the case on managed compute clusters. There is no good general solution for ensuring that all your packages work together, but the best rule of thumb is to use the most current version of R and of all packages at the time you write your code.
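Before installing anything, it can help to check which versions you are actually running. The following is a minimal sketch, assuming the 4.1.0 requirement for Bioconductor 3.14 stated above; the variable names `required_r` and `r_ok` are only illustrative. `BiocManager::version()` may need internet access on some setups, so the call is wrapped in `tryCatch()`.

```r
# Minimum R version for Bioconductor 3.14, as stated in the text above.
required_r <- "4.1.0"

# getRversion() returns a version object that can be compared to a string.
r_ok <- getRversion() >= required_r
message("Running R ", as.character(getRversion()),
        if (r_ok) " (meets the " else " (older than the ",
        required_r, " requirement)")

# If BiocManager is installed, report the Bioconductor release in use.
if (requireNamespace("BiocManager", quietly = TRUE)) {
  bioc <- tryCatch(as.character(BiocManager::version()),
                   error = function(e) NA_character_)
  message("Bioconductor version: ", bioc)
  # BiocManager::valid() compares installed packages against this release.
  # It needs internet access, so it is left commented out in this sketch.
  # BiocManager::valid()
}
```

If `r_ok` is `FALSE`, installing Bioconductor 3.14 is not expected to work, and an older Bioconductor release matched to your R version is likely the more practical choice.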
The BiocManager package is the only Bioconductor package installable using install.packages(). After installing the BiocManager package, you may then install Bioconductor packages:
# installs the affy Bioconductor package for microarray analysis
BiocManager::install("affy")
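When a script depends on several Bioconductor packages, it may be convenient to install only those that are missing. The sketch below is one possible way to do this; `missing_packages()` and `install_if_missing()` are hypothetical helpers written for this example, not part of BiocManager. Note that `requireNamespace()` loads a package's namespace as a side effect of checking for it.

```r
# Return the subset of `pkgs` that cannot be loaded from the current library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install only what is missing.
# With dry_run = TRUE (the default) nothing is installed; the function
# just returns the names of the packages that would be installed.
install_if_missing <- function(pkgs, dry_run = TRUE) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0 && !dry_run) {
    BiocManager::install(to_install)
  }
  invisible(to_install)
}

# Report which of these packages would be installed, without installing them.
print(install_if_missing(c("affy", "Biobase")))
```

Calling `install_if_missing(c("affy", "Biobase"), dry_run = FALSE)` would then pass the missing names to `BiocManager::install()`, which accepts a character vector of package names.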
One key aspect of Bioconductor packages is consistent and helpful documentation; every package page on the Bioconductor.org site lists a block of code that will install the package, e.g. for the affy package. In addition, Bioconductor provides three types of documentation:
The thorough and useful documentation of packages in Bioconductor is one of the reasons why the package ecosystem, and R itself, are so widely used in biology and bioinformatics.
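Documentation for an installed package can also be reached from within R. The sketch below uses only functions from base R's utils package; `list_vignettes()` is a hypothetical helper defined here for illustration, and the affy package is assumed only as an example (the helper returns an empty table if it is not installed).

```r
# List the vignettes shipped with an installed package as a data frame.
# Returns zero rows if the package is not installed or has no vignettes.
list_vignettes <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(data.frame(Item = character(), Title = character()))
  }
  res <- utils::vignette(package = pkg)$results
  data.frame(Item = res[, "Item"], Title = res[, "Title"], row.names = NULL)
}

print(list_vignettes("affy"))

# Interactive alternatives, left commented out because they open a viewer:
# browseVignettes("affy")   # vignette index in a web browser
# help(package = "affy")    # index of the package's reference pages
# ?affy::ReadAffy           # help page for a single function
```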
The base Bioconductor packages define convenient data structures for storing and analyzing many types of data. Recall that earlier, in the Types of Biological Data section, we described five types of biological data: primary, processed, analysis, metadata, and annotation data. Bioconductor provides a convenient class for storing many of these different data types together in one place, specifically the SummarizedExperiment class in the package of the same name. An illustration of what a SummarizedExperiment object stores is below, from the Bioconductor maintainers' paper in Nature Methods:
As you can see from the figure, this class stores processed data (assays), metadata (colData and exptData), and annotation data (rowData). The SummarizedExperiment class is used ubiquitously throughout the Bioconductor package ecosystem, as are other core classes, some of which we will cover later in this chapter.
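To make the structure concrete, here is a minimal sketch that builds a small SummarizedExperiment by hand. All values, gene names, and sample names are made up for illustration, and the code only runs if the SummarizedExperiment package is installed. Note that, to the best of our knowledge, recent releases expose experiment-level metadata through metadata() rather than the exptData slot shown in the figure.

```r
if (requireNamespace("SummarizedExperiment", quietly = TRUE)) {
  library(SummarizedExperiment)

  # Toy processed data: 3 genes (rows) x 4 samples (columns); values are made up.
  counts <- matrix(1:12, nrow = 3,
                   dimnames = list(c("gene1", "gene2", "gene3"),
                                   c("s1", "s2", "s3", "s4")))

  # Sample metadata: one row per column of the assay matrix.
  col_data <- S4Vectors::DataFrame(
    condition = c("ctrl", "ctrl", "trt", "trt"),
    row.names = colnames(counts)
  )

  # Feature annotation: one row per row of the assay matrix.
  row_data <- S4Vectors::DataFrame(
    symbol = c("A", "B", "C"),
    row.names = rownames(counts)
  )

  se <- SummarizedExperiment(assays = list(counts = counts),
                             rowData = row_data,
                             colData = col_data)

  print(dim(se))              # features x samples
  print(assay(se, "counts"))  # the processed data matrix
  print(colData(se))          # sample metadata
  print(rowData(se))          # feature annotation

  # Subsetting keeps the assay and both annotation tables in sync.
  se_trt <- se[, se$condition == "trt"]
  print(colnames(se_trt))
}
```

The point of the container is visible in the last step: selecting samples by a column of colData subsets the assay matrix and the sample metadata together, so they cannot drift out of alignment.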