Introduction to renv

Kevin Ushey

2022-05-26

The renv package is a new effort to bring project-local dependency management to R projects. The goal is for renv to be a robust, stable replacement for the Packrat package, with fewer surprises and better default behaviors.

The philosophy underlying renv is that your existing workflows should keep working as they did before. renv manages library paths (and other project-specific state) to isolate your project’s R dependencies, and the tools you already use for managing R packages (e.g. install.packages(), remove.packages()) should continue to behave as usual.

Workflow

The general workflow when working with renv is:

  1. Call renv::init() to initialize a new project-local environment with a private R library,

  2. Work in the project as normal, installing and removing new R packages as they are needed in the project,

  3. Call renv::snapshot() to save the state of the project library to the lockfile (called renv.lock),

  4. Continue working on your project, installing and updating R packages as needed,

  5. Call renv::snapshot() again to save the state of your project library if your attempts to update R packages were successful, or call renv::restore() to revert to the previous state as encoded in the lockfile if your attempts to update packages introduced some new problems.
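Steps 4 and 5 above can be sketched as a small helper. This is only an illustration and not part of renv: update_with_rollback(), update and check are hypothetical names, and the sketch assumes that non-interactive calls with prompt = FALSE are acceptable in your project.

```r
# Hypothetical helper (not part of renv): attempt an update, then either
# snapshot the new state or restore the previous one from renv.lock.
update_with_rollback <- function(update, check,
                                 snapshot = function() renv::snapshot(prompt = FALSE),
                                 restore  = function() renv::restore(prompt = FALSE)) {
  # run the user-supplied update step, e.g. function() install.packages("dplyr")
  update()

  # 'check' should return TRUE when the project still works;
  # an error raised by 'check' is treated as a failure
  ok <- isTRUE(tryCatch(check(), error = function(e) FALSE))

  # keep the new state on success, otherwise revert to the lockfile
  if (ok) snapshot() else restore()
  invisible(ok)
}
```

A call might supply update = function() install.packages("dplyr") and a check that sources a project script and returns TRUE. The snapshot and restore arguments exist only so the behaviour can be swapped out or tested without touching a real project library.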

The renv::init() function attempts to ensure the newly-created project library includes all R packages currently used by the project. It does this by crawling R files within the project for dependencies with the renv::dependencies() function. The discovered packages are then installed into the project library with the renv::hydrate() function, which will also attempt to save time by copying packages from your user library (rather than reinstalling from CRAN) as appropriate.

Calling renv::init() will also write out the infrastructure necessary to automatically load and use the private library for new R sessions launched from the project root directory. This is accomplished by creating (or amending) a project-local .Rprofile with the necessary code to load the project when the R session is started.

If you’d like to initialize a project without attempting dependency discovery and installation – that is, you’d prefer to manually install the packages your project requires on your own – you can use renv::init(bare = TRUE) to initialize a project with an empty project library.

Reproducibility

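Using renv, it’s possible to “save” and “load” the state of your project library. More specifically, you can use renv::snapshot() to save the state of your project library to the lockfile (renv.lock), and renv::restore() to restore your project library to the state recorded in the lockfile.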

For each package used in your project, renv will record the package version, and (if known) the external source from which that package can be retrieved. renv::restore() uses that information to retrieve and reinstall those packages in your project.

Caveats

It is important to emphasize that renv is not a panacea for reproducibility. Rather, it is a tool that can help make projects reproducible by solving one small part of the problem: it records the version of R + R packages being used in a project, and provides tools for reinstalling the declared versions of those packages in a project. Ultimately, making a project reproducible requires some thoughtfulness from the user: what does it mean for a particular project to be reproducible, and how can renv (and other tools) be used to accomplish that particular goal of reproducibility?

There are still a number of factors that can affect whether a project could truly be reproducible in the future. For example:

  1. The results produced by a particular project might depend on other components of the system it’s being run on – for example, the operating system itself, the versions of system libraries in use, the compiler(s) used to compile R and the R packages used, and so on. Keeping a ‘stable’ machine image is a separate challenge, but Docker is one popular solution. See also vignette("docker", package = "renv") for recommendations on how Docker can be used together with renv.

  2. The R packages that the project depends on may no longer be available. If your project depends on R packages available on CRAN, it’s possible those packages may be removed in the future – either by request of the package maintainer, or by the maintainers of CRAN itself. This is quite rare, but needs consideration if reproducibility of a project is paramount.

In addition, be aware that package installation may fail if a package was originally installed through a CRAN-available binary, but that binary is no longer available. renv will attempt to install the package from sources in this situation, but attempts to install from source can (and often do) fail due to missing system prerequisites for compilation of a package. The renv::equip() function may be useful in these scenarios, especially on Windows: it will download external software commonly used when compiling R packages from sources, and instruct R to use that software during compilation.

A salient example of such a dependence on external system components is the rmarkdown package, which relies heavily on the pandoc command line utility. Because pandoc is not bundled with the rmarkdown package (it is normally provided by RStudio, or installed separately by the user), simply restoring an renv project using rmarkdown may not be sufficient: one also needs to ensure the project is run in an environment with the correct version of pandoc available.

Infrastructure

The following files are written to and used by projects using renv:

File               Usage
.Rprofile          Used to activate renv for new R sessions launched in the project.
renv.lock          The lockfile, describing the state of your project’s library at some point in time.
renv/activate.R    The activation script run by the project .Rprofile.
renv/library       The private project library.
renv/settings.dcf  Project settings – see ?settings for more details.

In particular, renv/activate.R ensures that the project library is made active for newly launched R sessions. This ensures that any new R processes launched within the project directory will use the project library, and hence are isolated from the regular user library.
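As a quick check of the mechanism described above, the following sketch tests whether a project’s .Rprofile refers to the activation script. It assumes the .Rprofile contains a line such as source("renv/activate.R"). has_renv_autoloader() is a hypothetical helper, not an renv function, and the demonstration uses a throwaway directory.

```r
# Hypothetical helper: does the project's .Rprofile mention renv/activate.R?
has_renv_autoloader <- function(project = getwd()) {
  profile <- file.path(project, ".Rprofile")
  if (!file.exists(profile))
    return(FALSE)
  any(grepl("renv/activate.R", readLines(profile, warn = FALSE), fixed = TRUE))
}

# demonstrate on a throwaway directory rather than a real project
project <- file.path(tempdir(), "autoloader-demo")
dir.create(project, showWarnings = FALSE)
writeLines('source("renv/activate.R")', file.path(project, ".Rprofile"))

has_renv_autoloader(project)  # TRUE
```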

For development and collaboration, the .Rprofile, renv.lock and renv/activate.R files should be committed to your version control system; the renv/library directory should normally be ignored. Note that renv::init() will attempt to write the requisite ignore statements to the project .gitignore.

Dependency Discovery

By default, renv::snapshot() will examine your project’s R files to determine which packages are used in your project, and will include only those packages (alongside their recursive dependencies) in the lockfile. This is done via a call to the renv::dependencies() function. We call this an “implicit” snapshot, since the packages your project depends on are implicit based on how packages appear to be used in your project. renv uses static analysis to determine which packages appear to be used; e.g. by scanning your code for calls to library() or require().
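To give a feel for what this kind of static analysis involves, here is a deliberately simplified sketch in base R. It is not renv’s implementation: find_attached_packages() is a hypothetical function that only looks at the first argument of library() and require() calls, so it would be misled by library(pkg, character.only = TRUE) and ignores pkg::fun() usages entirely.

```r
# Simplified illustration only; renv::dependencies() handles many more cases.
find_attached_packages <- function(code) {
  exprs <- parse(text = code, keep.source = FALSE)
  found <- character()

  visit <- function(node) {
    if (!is.call(node))
      return(invisible(NULL))

    fn <- node[[1]]
    is_attach <- is.name(fn) && as.character(fn) %in% c("library", "require")

    # library(dplyr) passes a symbol; library("dplyr") passes a string
    if (is_attach && length(node) >= 2 &&
        (is.name(node[[2]]) || is.character(node[[2]])))
      found <<- c(found, as.character(node[[2]]))

    # recurse into every element of the call, including nested calls
    for (i in seq_along(node)) {
      # skip empty arguments such as the second index in x[1, ]
      if (!identical(node[[i]], quote(expr = )))
        visit(node[[i]])
    }
  }

  for (expr in exprs)
    visit(expr)

  unique(found)
}

find_attached_packages('library(dplyr); if (require("data.table")) message("ok")')
# returns c("dplyr", "data.table")
```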

While useful, this approach is not 100% reliable in detecting the packages required by your project. If you find that renv’s dependency discovery is failing to discover one or more packages used in your project, one escape hatch is to include a file called _dependencies.R with code of the form:

library(<pkg>)

Ignore Files

By default, renv reads the .gitignore files in your project (if any) to infer which files should be ignored when scanning for dependencies. If you find that renv’s dependency discovery is scanning files you don’t want to be scanned, you can use an .renvignore file to instruct renv to ignore certain patterns of files in the project. For example, you might use:

/data

to tell renv not to scan files within the data folder.

If you’d prefer that renv ignored all folders by default, except for some subset of folders where you place your code files, you could use something like:

*
!/code

In this case, renv will only scan your code folder at the root of the project directory for dependencies.
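If you prefer to create the ignore file from R, a minimal sketch follows. The project directory here is a throwaway folder under tempdir(), used only for illustration; in practice the file would live at the root of your own project.

```r
# Write an .renvignore at the root of a throwaway project directory.
project <- file.path(tempdir(), "renvignore-demo")
dir.create(project, showWarnings = FALSE)

# ignore everything, then re-include the top-level code folder
writeLines(c("*", "!/code"), file.path(project, ".renvignore"))

# inspect the result
readLines(file.path(project, ".renvignore"))
```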

Explicit Snapshots

If you’d instead prefer to explicitly declare which packages are used in your project, you can do so by creating a DESCRIPTION file at your project root. These DESCRIPTION files should be formatted similarly to those used by default in R package development – see https://r-pkgs.org/description.html for more details.

In this case, your DESCRIPTION file might look like:

Type: project
Description: My project.
Depends:
    tidyverse,
    devtools,
    shiny,
    data.table

The packages used in your project can be part of either the Depends or Imports fields.
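The DESCRIPTION file above can also be written and inspected from R with base tools. The sketch below uses a throwaway directory and shows how the declared packages could be read back. The closing comment about renv::settings$snapshot.type("explicit") reflects our understanding of how explicit snapshots are enabled and should be checked against ?renv::settings.

```r
# Create a minimal project DESCRIPTION in a throwaway directory.
project <- file.path(tempdir(), "explicit-demo")
dir.create(project, showWarnings = FALSE)

desc <- cbind(
  Type        = "project",
  Description = "My project.",
  Depends     = "tidyverse, devtools, shiny, data.table"
)
write.dcf(desc, file.path(project, "DESCRIPTION"))

# Read the declared packages back from the Depends and Imports fields.
fields <- read.dcf(file.path(project, "DESCRIPTION"),
                   fields = c("Depends", "Imports"))
declared <- trimws(unlist(strsplit(fields[!is.na(fields)], ",")))
declared

# Assumption: explicit snapshots are then enabled for the project with
# renv::settings$snapshot.type("explicit"); see ?renv::settings.
```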

Collaborating

When sharing a project with other collaborators, you may want to ensure everyone is working with the same environment – otherwise, code in the project may unexpectedly fail to run because of changes in behavior between different versions of the packages in use. renv can help to make such collaboration easier – see vignette("collaborating", package = "renv") for more details.

Package Sources

renv is able to install and restore packages from a variety of sources, including CRAN, Bioconductor, GitHub, GitLab, Bitbucket, and other Git remotes.

renv uses an installed package’s DESCRIPTION file to infer its source. For example, packages installed from the CRAN repositories typically have the field:

Repository: CRAN

set, and renv takes this as a signal that the package was retrieved from CRAN.

Inferring Package Sources

The following fields are checked, in order, when inferring a package’s source:

  1. The RemoteType field; typically written for packages installed by the devtools, remotes and pak packages,

  2. The Repository field; for example, packages retrieved from CRAN will typically have the Repository: CRAN field,

  3. The biocViews field; typically present for packages installed from the Bioconductor repositories.

As a fallback, if renv is unable to determine a package’s source from the DESCRIPTION file directly, but a package of the same name is available in the active R repositories (as specified in getOption("repos")), then the package will be treated as though it was installed from an R package repository.

If all of the above methods fail, renv will finally check for a package available from the cellar. See the renv documentation on the package cellar for more details. The package cellar is typically used as an escape hatch, for packages which do not have a well-defined remote source, or for packages which might not be remotely accessible from your machine.
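The order of checks described in this section can be sketched as follows. This is a simplified illustration, not renv’s actual code: infer_source() is a hypothetical function, desc stands for a package’s DESCRIPTION fields as a named list, and available stands for the names of packages found in the active repositories. The cellar lookup is omitted.

```r
# Simplified sketch of the lookup order; renv's real logic covers more cases.
infer_source <- function(desc, available = character()) {
  has <- function(field) !is.null(desc[[field]]) && nzchar(desc[[field]])

  # 1. packages installed by devtools, remotes or pak
  if (has("RemoteType")) return(desc[["RemoteType"]])

  # 2. packages retrieved from a repository such as CRAN
  if (has("Repository")) return("Repository")

  # 3. packages from the Bioconductor repositories
  if (has("biocViews")) return("Bioconductor")

  # fallback: a package of the same name exists in the active repositories
  if (isTRUE(desc[["Package"]] %in% available)) return("Repository")

  "unknown"
}

infer_source(list(Package = "dplyr", Repository = "CRAN"))          # "Repository"
infer_source(list(Package = "mypkg", RemoteType = "github"))        # "github"
infer_source(list(Package = "skeleton"))                            # "unknown"
infer_source(list(Package = "skeleton"), available = "skeleton")    # "Repository"
```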

Unknown Sources

If renv is unable to infer a package’s source, it will inform you during renv::snapshot() – for example, if we attempted to snapshot a package called skeleton with no known source:

> renv::snapshot()
The following package(s) were installed from an unknown source:

        skeleton

renv may be unable to restore these packages in the future.
Consider reinstalling these packages from a known source (e.g. CRAN).

Do you want to proceed? [y/N]:

While you can still create a lockfile with such packages, restore() will likely fail unless you can ensure these packages are installed through some other mechanism.

Custom R Package Repositories

Custom and local R package repositories are supported as well. The only requirement is that these repositories are set as part of the repos R option, and that these repositories are named. For example, you might use:

repos <- c(CRAN = "https://cloud.r-project.org", WORK = "https://work.example.org")
options(repos = repos)

to tell renv to work with both the official CRAN package repository, as well as a package repository you have hosted and set up in your work environment.

Upgrading renv

After initializing a project with renv, that project will then be ‘bound’ to the particular version of renv that was used to initialize the project. If you need to upgrade (or otherwise change) the version of renv associated with a project, you can use renv::upgrade(). This will install the latest-available version of renv from your declared package repositories. Alternatively, if you’re currently using a development version of renv as installed from GitHub in your project, then renv will install the latest-available version of renv from GitHub.

With each commit of renv, we bump the package version and also tag the commit with the associated package version. This implies that you can call, for example:

renv::upgrade(version = "0.15.5")

to request the installation of that particular version of renv if so required.

Cache

One of renv’s primary features is the use of a global package cache, which is shared across all projects using renv. The renv package cache provides two primary benefits:

  1. Future calls to renv::restore() and renv::install() will become much faster, as renv will be able to find and re-use packages already installed in the cache.

  2. Because it is not necessary to have duplicate versions of your packages installed in each project, the renv cache should also help you save disk space relative to an approach with project-specific libraries without a global cache.

To understand the renv cache, we need to first understand what an R library is. An R library is, normally, a directory of installed R packages which can be loaded and used within an R session. These are the directories reported by e.g. .libPaths(), and R uses these directories when searching for packages to load (e.g. in response to a call to library(dplyr)).

When using renv with the global package cache, the project library is instead formed as a directory of symlinks (or, on Windows, junction points) into the renv global package cache. Hence, while each renv project is isolated from other projects on your system, they can still re-use the same installed packages as required.

In some cases, renv will be unable to directly link from the global package cache to your project library – for example, if the package cache and your project library live on different disk volumes. In such a case, renv will instead copy the package from the cache into the project library.
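To see whether the packages in a library are links or copies, you can inspect the library directory with base R. The sketch below uses a hypothetical helper, library_links(), built on Sys.readlink(). Note that Sys.readlink() returns empty strings on Windows, so this check is only informative on platforms with symbolic links.

```r
# Hypothetical helper: report, for each entry in a library, its link target (if any).
library_links <- function(library) {
  entries <- list.files(library, full.names = TRUE)
  targets <- Sys.readlink(entries)
  data.frame(
    package = basename(entries),
    linked  = !is.na(targets) & nzchar(targets),
    target  = targets,
    stringsAsFactors = FALSE
  )
}

# inspect the first library on the current library paths
head(library_links(.libPaths()[1]))
```

In a project using the cache, entries with linked equal to TRUE would be expected to point into the cache directory, while FALSE indicates a package installed or copied directly into the library.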

By default, renv generates its cache in the following folders:

Platform  Location
Linux     ~/.local/share/renv
macOS     ~/Library/Application Support/renv
Windows   %LOCALAPPDATA%/renv

If you’d like to share the package cache across multiple users, you can do so by setting the RENV_PATHS_CACHE environment variable to a shared path. This variable can be set in an R startup file to make it apply to all R sessions. For example, it could be set within a project-local .Renviron, the user-level .Renviron, or the site-wide Renviron.site file (see ?Startup).

You may also want to set RENV_PATHS_CACHE so that the global package cache can be stored on the same volume as the projects you normally work on. This is especially important when working with projects stored on a networked filesystem.
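For a single session, the variable can also be set from R before renv is used. The path below is a placeholder, not a recommended location. In a startup file such as .Renviron, the equivalent would be a line of the form RENV_PATHS_CACHE = /opt/local/renv/cache.

```r
# Hypothetical shared location; replace with a path all users can read and write.
Sys.setenv(RENV_PATHS_CACHE = "/opt/local/renv/cache")

# confirm what the current session will see
Sys.getenv("RENV_PATHS_CACHE")
```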

While we recommend enabling the cache by default, if you’re having trouble with renv when the cache is enabled, it can be disabled by setting the project setting renv::settings$use.cache(FALSE). Doing this will ensure that packages are then installed into your project library directly, without attempting to link and use packages from the renv cache.

If you find a problematic package has entered the cache (for example, an installed package has become corrupted), that package can be removed with the renv::purge() function. See the ?purge documentation for caveats and things to be aware of when removing packages from the cache.

Installation from Source

In the end, renv still needs to install R packages – either from binaries available from CRAN, or from sources when binaries are not available. Installation from source can be challenging for a few reasons:

  1. Your system will need to have a compatible compiler toolchain available. In some cases, R packages may depend on C / C++ features that aren’t available in an older system toolchain, especially in some older Linux enterprise environments.

  2. Your system will need the requisite system libraries, as many R packages contain compiled C / C++ code that depends on and links to these libraries.

Downloads

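By default, renv uses curl for file downloads when available. This allows renv to support a number of download features across multiple versions of R, including custom headers (used especially for authentication), connection timeouts, and download retries on transient errors.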

If curl is not available on your machine, it is highly recommended that you install it. Newer versions of Windows 10 come with a bundled version of curl.exe; other users on Windows can use renv::equip() to download and install a recent copy of curl. Newer versions of macOS come with a bundled version of curl that is adequate for usage with renv, and most Linux package managers have a modern version of curl available in their package repositories.

curl downloads can be configured through renv’s configuration settings – see ?renv::config for more details.

If you’ve already configured R’s downloader and would like to bypass renv’s attempts to use curl, you can use the R option renv.download.override. For example, executing:

options(renv.download.override = utils::download.file)

would instruct renv to use R’s own download machinery when attempting to download files from the internet (respecting the R options download.file.method and download.file.extra as appropriate). Advanced users can also provide their own download function, provided its signature matches that of utils::download.file().
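A custom download function might look like the sketch below. logged_download() is a hypothetical wrapper that reports each URL and then delegates to utils::download.file(). It assumes that forwarding the remaining arguments through ... is an acceptable way to match the expected signature, which you should verify for your version of renv.

```r
# Hypothetical wrapper: log each URL, then delegate to R's own downloader.
logged_download <- function(url, destfile, ...) {
  message("Downloading: ", url)
  utils::download.file(url, destfile, ...)
}

# ask renv to use the wrapper instead of curl
options(renv.download.override = logged_download)
```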

You can also instruct renv to use a different download method by setting the RENV_DOWNLOAD_METHOD environment variable. For example:

# use Windows' internal download machinery
Sys.setenv(RENV_DOWNLOAD_METHOD = "wininet")

# use R's bundled libcurl implementation
Sys.setenv(RENV_DOWNLOAD_METHOD = "libcurl")

Note that other features (e.g. authentication) may not be supported when using an alternative download file method – you will have to configure the downloader yourself if that is required. See ?download.file for more details.

Proxies

If your downloads need to go through a proxy server, then there are a variety of approaches you can take to make this work:

  1. Set the http_proxy and / or https_proxy environment variables. These environment variables can contain the full URL to your proxy server, including a username + password if necessary.

  2. You can use a .curlrc (_curlrc on Windows) to provide information about the proxy server to be used. This file should be placed in your home folder (see Sys.getenv("HOME"), or Sys.getenv("R_USER") on Windows); alternatively, you can set the CURL_HOME environment variable to point to a custom ‘home’ folder to be used by curl when resolving the runtime configuration file. On Windows, you can also place your _curlrc in the same directory where the curl.exe binary is located.

See the curl documentation on proxies and config files for more details.
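For the first approach, the environment variables can be set from R for the current session, or placed in a startup file such as .Renviron. The proxy address and credentials below are placeholders, not real values.

```r
# Placeholder proxy address and credentials; substitute your own values.
Sys.setenv(
  http_proxy  = "http://user:password@proxy.example.org:8080",
  https_proxy = "http://user:password@proxy.example.org:8080"
)

# confirm the values visible to the current session
Sys.getenv(c("http_proxy", "https_proxy"))
```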

As an example, the following _curlrc works when using authentication with NTLM and SSPI on Windows:

--proxy "your.proxy.dns:port"
--proxy-ntlm
--proxy-user ":"
--insecure

The curl R package also has a helper:

curl::ie_get_proxy_for_url()

which may be useful when attempting to discover this proxy address.

Authentication

Your project may make use of packages which are available from remote sources requiring some form of authentication to access – for example, a GitHub enterprise server. Usually, either a personal access token (PAT) or username + password combination is required for authentication. renv is able to authenticate when downloading from such sources, using the same system as the remotes package. In particular, environment variables are used to record and transfer the required authentication information.

Remote Source  Authentication
GitHub         GITHUB_PAT
GitLab         GITLAB_PAT
Bitbucket      BITBUCKET_USER + BITBUCKET_PASSWORD
Git Remotes    GIT_PAT / GIT_USER + GIT_PASSWORD

These credentials can be stored in e.g. .Renviron, or can be set in your R session through other means as appropriate.
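When debugging authentication problems, it can help to confirm which of these variables are actually visible to the R session. The following sketch reports only whether each variable is set, without printing its value.

```r
# Report which authentication variables are set, without revealing their values.
auth_vars <- c("GITHUB_PAT", "GITLAB_PAT", "BITBUCKET_USER",
               "BITBUCKET_PASSWORD", "GIT_PAT", "GIT_USER", "GIT_PASSWORD")

is_set <- nzchar(Sys.getenv(auth_vars, unset = ""))
names(is_set) <- auth_vars
is_set
```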

If you require custom authentication for different packages (for example, your project makes use of packages available on different GitHub enterprise servers), you can use the renv.auth R option to provide package-specific authentication settings. renv.auth can either be a named list associating package names with environment variables, or a function accepting a package name + record, and returning a list of environment variables. For example:

# define a function providing authentication
options(renv.auth = function(package, record) {
  if (package == "MyPackage")
    return(list(GITHUB_PAT = "<pat>"))
})

# use a named list directly
options(renv.auth = list(
  MyPackage = list(GITHUB_PAT = "<pat>")
))

# alternatively, set package-specific option
options(renv.auth.MyPackage = list(GITHUB_PAT = "<pat>"))

For packages installed from Git remotes, renv will attempt to use git from the command line to download and restore the associated package. Hence, it is recommended that authentication is done through SSH keys when possible.

Authentication with Custom Headers

If you want to set arbitrary headers when downloading files using renv, you can do so using the renv.download.headers R option. It should be a function that accepts a URL, and returns a named character vector indicating the headers which should be supplied when accessing that URL.

For example, suppose you have a package repository hosted at https://my/repository, and the credentials required to access that repository are stored in the AUTH_HEADER environment variable. You could define renv.download.headers like so:

options(renv.download.headers = function(url) {
  if (grepl("^https://my/repository", url))
    return(c(Authorization = Sys.getenv("AUTH_HEADER")))
})

With the above, renv will set the Authorization header whenever it attempts to download files from the repository at URL https://my/repository.
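Before relying on such a function, it can be worth calling it by hand to see what it returns for different URLs. The sketch below repeats the definition so that it is self-contained. The AUTH_HEADER value is a placeholder and the URLs are only examples.

```r
# Placeholder credential for illustration only.
Sys.setenv(AUTH_HEADER = "Bearer <token>")

options(renv.download.headers = function(url) {
  if (grepl("^https://my/repository", url))
    return(c(Authorization = Sys.getenv("AUTH_HEADER")))
})

headers <- getOption("renv.download.headers")

# matching URL: a named character vector with the Authorization header
headers("https://my/repository/src/contrib/PACKAGES")

# any other URL: NULL, so no extra headers are supplied
headers("https://cloud.r-project.org/src/contrib/PACKAGES")
```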

Shims

To help you take advantage of the package cache, renv places a couple of shims on the search path:

Function            Shim
install.packages()  renv::install()
remove.packages()   renv::remove()
update.packages()   renv::update()

In effect, calling install.packages() within an renv project will call renv::install() instead. This can be useful when installing packages which have already been cached. For example, if you use renv::install("dplyr"), and renv detects that the latest version on CRAN has already been cached, then renv will just install using the copy available in the cache – thereby skipping some of the installation overhead.

If you’d like to bypass these shims within a session, you can explicitly call the version of these functions from the utils package, e.g. with utils::install.packages(<...>).

If you’d prefer not to use the renv shims at all, they can be disabled by setting the R option options(renv.config.shims.enabled = FALSE), or by setting the environment variable RENV_CONFIG_SHIMS_ENABLED = FALSE. See ?config for more details.

History

If you’re using a version control system with your project, then as you call renv::snapshot() and later commit new lockfiles to your repository, you may find it necessary later to recover older versions of your lockfiles. renv provides the functions renv::history() to list previous revisions of your lockfile, and renv::revert() to recover these older lockfiles.

Currently, only Git repositories are supported by renv::history() and renv::revert().

Comparison with Packrat

renv differs from Packrat in the following ways:

  1. The renv lockfile renv.lock is formatted as JSON. This should make the lockfile easier to use and consume with other tools.

  2. renv no longer attempts to explicitly download and track R package source tarballs within your project. This was a frustrating default that operated under the assumption that you might later want to be able to restore a project’s private library without access to a CRAN repository. In practice, this is almost never the case, and the time spent downloading + storing the package sources seemed to outweigh the potential reproducibility benefits.

  3. Packrat tried to maintain a separate notion of so-called ‘stale’ packages; that is, R packages which were installed by Packrat but were not recorded in the lockfile for some reason. This distinction was (1) overall not useful, and (2) confusing. renv no longer makes this distinction: snapshot() saves the state of your project library to renv.lock, restore() loads the state of your project library from renv.lock, and that’s all.

  4. In renv, the global package cache is enabled by default. This should reduce overall disk-space usage as packages can effectively be shared across each project using renv.

  5. renv’s dependency discovery machinery is more configurable. The function renv::dependencies() is exported, and users can create .renvignore files to instruct renv to ignore specific files and folders in their projects. (See ?renv::dependencies for more information.)

Migrating from Packrat

The renv::migrate() function makes it possible to migrate projects from Packrat to renv. See the ?migrate documentation for more details. In essence, calling renv::migrate("<project path>") will be enough to migrate the Packrat library and lockfile such that they can then be used by renv.

Uninstalling renv

If you find renv isn’t the right fit for your project, deactivating and uninstalling it is easy.
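For a single project, one possible approach is sketched below. It assumes that renv::deactivate() removes the renv auto-loader from the project .Rprofile without touching other renv files (see ?renv::deactivate). remove_renv_from_project() and its arguments are hypothetical names used only for illustration.

```r
# Hypothetical helper: stop using renv in one project, leaving source code intact.
remove_renv_from_project <- function(project,
                                     deactivate = function(p) renv::deactivate(project = p),
                                     delete_files = FALSE) {
  # remove the renv auto-loader from the project .Rprofile
  deactivate(project)

  # optionally delete the project library, settings and lockfile as well
  if (delete_files) {
    unlink(file.path(project, "renv"), recursive = TRUE)
    unlink(file.path(project, "renv.lock"))
  }

  invisible(project)
}
```

With delete_files = FALSE the project can later be re-activated, since the renv folder and lockfile are kept. The deactivate argument exists only so the helper can be exercised without a real renv project.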

If you want to completely remove any installed renv infrastructure components from your entire system, you can do so with the following R code:

root <- renv::paths$root()
unlink(root, recursive = TRUE)

The renv package can then also be uninstalled via:

utils::remove.packages("renv")

Note that if you’ve customized any of renv’s infrastructure paths as described in ?renv::paths, then you’ll need to find and remove those customized folders as well.

Future Work

renv, like Packrat, is designed to work standalone without the need to depend on any non-base R packages. However, the following (future) integrations are planned:

These integrations will be optional (so that renv can always work standalone) but we hope that they will further improve the speed and reliability of renv.