Here we describe how to do auth with a package that uses gargle, without requiring any user interaction. This comes up in a wide array of contexts, ranging from simple rendering of a local R Markdown document to deploying a data product on a remote server.
We assume the wrapper package uses the design described in How to use gargle for auth in a client package. googledrive, which is used in the examples below, is one such package.
Full details on gargle::token_fetch(), which powers this strategy, are given in How gargle gets tokens.
When two computers are talking to each other, possibly with no human involvement, the most appropriate type of token to use is a service account token.
This requires some advance preparation, but that tends to pay off pretty quickly, in terms of having a much more robust auth setup.
Step 1: Get a service account and then download a token. This is described in the gargle article How to get your own API credentials, specifically in the Service account token section.
Step 2: Call the wrapper package’s main auth function proactively and provide the path to your service account token. Example using googledrive:
library(googledrive)
drive_auth(path = "/path/to/your/service-account-token.json")
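In a deployed setting it is often convenient to keep the path out of the code. The sketch below assumes the path is stored in an environment variable; the variable name MY_SERVICE_TOKEN_PATH and the helper service_token_path() are hypothetical, not gargle conventions. The helper fails early with a clear message if the variable is unset or the file is missing.

```r
# Hypothetical helper: resolve the service account token path from an
# environment variable. The variable name is an assumption made for this
# sketch, not something gargle or googledrive looks for.
service_token_path <- function(envvar = "MY_SERVICE_TOKEN_PATH") {
  path <- Sys.getenv(envvar, unset = "")
  if (!nzchar(path)) {
    stop("Environment variable ", envvar, " is not set.", call. = FALSE)
  }
  if (!file.exists(path)) {
    stop("No file found at ", path, call. = FALSE)
  }
  path
}

# Only attempt auth when googledrive is installed and the variable is set
if (requireNamespace("googledrive", quietly = TRUE) &&
    nzchar(Sys.getenv("MY_SERVICE_TOKEN_PATH"))) {
  googledrive::drive_auth(path = service_token_path())
}
```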
If this code is running on, e.g., a continuous integration service and you need to use an encrypted token, see the gargle article Managing tokens securely.
If the code is running on AWS, a special auth flow called workload identity federation is available. Learn more in the documentation for credentials_external_account().
For certain APIs, service accounts are inherently awkward, because you often want to do things on behalf of a specific user. Gmail is a good example. If you are sending email programmatically, there’s a decent chance you want to send it as yourself (or from some other specific email account) instead of from zestybus-geosyogl@fuffapster-654321.iam.gserviceaccount.com.
This is described as “impersonation”, which should tip you off that Google does not exactly encourage this workflow. Some details:
The subject argument of credentials_service_account() and credentials_app_default() is available to specify which user to impersonate, e.g. subject = "user@example.com". This argument first appeared in gargle 0.5.0, so it may not necessarily be exposed yet in user-facing auth functions like drive_auth(). If you need subject in a client package, that is a reasonable feature request.
If delegation of domain-wide authority is impossible or unappealing, you must use an OAuth user token, as described below.
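When a wrapper's auth function does not expose subject, one possible workaround is to build the token with gargle directly and pass it to the wrapper's token argument. This is a minimal sketch under that assumption; the wrapper name impersonated_drive_token() and the default Drive scope are illustrative choices, and the service account must already have been granted domain-wide delegation for the call to succeed.

```r
# Hypothetical wrapper around gargle::credentials_service_account().
# The function name and the default scope are illustrative assumptions.
impersonated_drive_token <- function(path, subject,
                                     scopes = "https://www.googleapis.com/auth/drive") {
  if (!file.exists(path)) {
    stop("No service account file found at ", path, call. = FALSE)
  }
  # Returns a token on success; gargle returns NULL if the token
  # cannot be obtained
  gargle::credentials_service_account(
    scopes = scopes,
    path = path,
    subject = subject
  )
}

# Usage (not run here): hand the resulting token to the wrapper package
# token <- impersonated_drive_token(
#   "/path/to/your/service-account-token.json",
#   subject = "user@example.com"
# )
# googledrive::drive_auth(token = token)
```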
Wrapper packages that use gargle::token_fetch() in the recommended way have access to the token search strategy known as Application Default Credentials.
You need to put the JSON corresponding to your service or external account in a very specific location or, alternatively, record the location of this JSON file in a specific environment variable.
Full details are in the credentials_app_default() section of the gargle article How gargle gets tokens.
If you have your token rigged properly, you do not need to do anything else, i.e. you do not need to call PACKAGE_auth() explicitly. Your token should just get discovered upon first need.
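Before relying on discovery, it can help to confirm that the environment variable route is set up as intended. The sketch below assumes the variable in question is GOOGLE_APPLICATION_CREDENTIALS, the name used by Google's Application Default Credentials; the helper adc_file_from_env() is hypothetical and only checks that the variable points at an existing file. It does not check the well-known file location, nor whether the JSON is valid.

```r
# Hypothetical helper: report the file named by the ADC environment
# variable, or NA if the variable is unset or points nowhere.
# Assumption: GOOGLE_APPLICATION_CREDENTIALS is the relevant variable.
adc_file_from_env <- function() {
  path <- Sys.getenv("GOOGLE_APPLICATION_CREDENTIALS", unset = "")
  if (nzchar(path) && file.exists(path)) path else NA_character_
}

adc_file <- adc_file_from_env()
if (is.na(adc_file)) {
  message("No usable file found via GOOGLE_APPLICATION_CREDENTIALS.")
} else {
  message("Application Default Credentials file: ", adc_file)
}
```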
For troubleshooting purposes, you can set a gargle option to see verbose output about the execution of gargle::token_fetch():
options(gargle_verbosity = "debug")
withr-style convenience helpers also exist: with_gargle_verbosity() and local_gargle_verbosity().
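To show what scoping the option to a single piece of code means, here is a base R sketch of the same idea. The function with_debug_verbosity() is hypothetical and written only for illustration; in real code the gargle helpers named above are the natural choice.

```r
# Hypothetical base R illustration of temporarily raising gargle's verbosity.
# It sets the option, evaluates the code, then restores the previous value.
with_debug_verbosity <- function(code) {
  old <- options(gargle_verbosity = "debug")
  on.exit(options(old), add = TRUE)
  force(code)  # 'code' is evaluated lazily, i.e. while the option is set
}

# The option is "debug" only while the expression runs
with_debug_verbosity(getOption("gargle_verbosity"))
```

The usual pattern would be to wrap the first call that triggers auth, so the debug output covers the token search and nothing else.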
If you somehow have the OAuth token you want to use as an R object, you can provide it directly to the token argument of the main auth function. Example using googledrive:
library(googledrive)
my_oauth_token <- # some process that results in the token you want to use
drive_auth(token = my_oauth_token)
gargle caches each OAuth user token it obtains to an .rds file, by default. If you know the filepath to the token you want to use, you could use readRDS() to read it and provide it as the token argument to the wrapper’s auth function. Example using googledrive:
# googledrive
drive_auth(token = readRDS("/path/to/your/oauth-token.rds"))
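Since readRDS() will happily return any R object, a small sanity check before auth can save some confusion. The sketch below assumes that gargle's OAuth user tokens inherit from httr's "Token2.0" class; both that assumption and the helper name read_cached_token() are ours, so adjust the check if your tokens report a different class.

```r
# Hypothetical helper: read a cached token and check that it looks like one.
# Assumption: OAuth user tokens inherit from the "Token2.0" class.
read_cached_token <- function(path) {
  token <- readRDS(path)
  if (!inherits(token, "Token2.0")) {
    stop("Object in ", path, " does not look like an OAuth token.", call. = FALSE)
  }
  token
}

# Usage (not run here):
# googledrive::drive_auth(token = read_cached_token("/path/to/your/oauth-token.rds"))
```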
How would you know this filepath? That requires some attention to the location of gargle’s OAuth token cache folder, which is described in the next section.
Full details are in the credentials_byo_oauth2() section of the gargle article How gargle gets tokens.
This is the least recommended strategy, but it appeals to many users, because it doesn’t require creating a service account. Just remember that the perceived ease of using the token you already have (an OAuth user token) is quickly cancelled out by the greater difficulty of managing such tokens for non-interactive use. You might be forced to use this strategy with certain APIs, such as Gmail, that are difficult to use with a service account.
Two main principles:
There are many ways to do this. We’ll work several examples that convey the range of what’s possible.
.Rmd to render
Step 1: Get that first token. You must run your code at least once, interactively, do the auth dance, and allow gargle to store the token in its cache.
library(googledrive)
# do anything that triggers auth
drive_find(n_max = 5)
Step 2: Revise your code to pre-authorize the use of that token next time. Now your .Rmd can be rendered or your .R script can run, without further interaction.
You have two choices to make:
Whether to set the gargle_oauth_email option or call PACKAGE_auth(email = ...). The option can be set in each .Rmd or .R or in a user-level or project-level .Rprofile startup file.
Which value of email to use:
email = TRUE works if we’re only going to find, at most, 1 token, i.e. you always auth with the same identity.
email = "jane@example.com" pre-authorizes use of a token associated with a specific identity.
email = "*@example.com" pre-authorizes use of a token associated with an identity from a specific domain; good for code that might be executed on the machines of both alice@example.com and bob@example.com.
This sets an option that allows gargle to use cached tokens whenever there’s a unique match:
options(gargle_oauth_email = TRUE)
This sets an option to use tokens associated with a specific email address:
options(gargle_oauth_email = "jenny@example.com")
This sets an option to use tokens associated with an email address with a specific domain:
options(gargle_oauth_email = "*@example.com")
This gets a token right now and allows the use of a matching token, using googledrive as an example:
drive_auth(email = TRUE)
This gets a token right now, for the user with a specific email address:
drive_auth(email = "jenny@example.com")
This gets a token right now, first checking the cache for a token associated with a specific domain:
drive_auth(email = "*@example.com")
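The three forms of email can be read as a filter over the identities found in the cache, with a token being used only when exactly one candidate remains. The base R sketch below illustrates that reading; it is not gargle's internal code, and match_cached_emails() and the example addresses are hypothetical.

```r
# Hypothetical illustration of how the three forms of 'email' narrow down
# cached identities. This is NOT gargle's implementation.
match_cached_emails <- function(cached, email) {
  if (isTRUE(email)) {
    return(cached)  # any cached identity is acceptable
  }
  # Treat the value as a glob, so "*@example.com" matches a whole domain
  # and a plain address matches only itself
  cached[grepl(utils::glob2rx(email), cached)]
}

cached <- c("alice@example.com", "bob@other.org")

match_cached_emails(cached, TRUE)             # two candidates, so no unique match
match_cached_emails(cached, "bob@other.org")  # one candidate
match_cached_emails(cached, "*@example.com")  # one candidate
```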
This is like the previous example, but with an added twist: we use a project-level OAuth cache. This is good for deployed data products.
Step 1: Obtain the token intended for non-interactive use and make sure it’s cached in a (hidden) directory of the current project. Using googledrive as an example:
library(googledrive)
# designate project-specific cache
options(gargle_oauth_cache = ".secrets")
# check the value of the option, if you like
gargle::gargle_oauth_cache()
# trigger auth on purpose --> store a token in the specified cache
drive_auth()
# see your token file in the cache, if you like
list.files(".secrets/")
Do this setup once per project.
Another way to accomplish the same setup is to specify the desired cache location directly in the call to the auth function:
library(googledrive)
# trigger auth on purpose --> store a token in the specified cache
drive_auth(cache = ".secrets")
If you are doing setup in a web-based environment, such as RStudio Server, you may also need to request out-of-band auth, whenever you are first acquiring a token. That is a separate issue, which is explained in Auth when using R in the browser.
Step 2: In all downstream use, announce the location of the cache and pre-authorize the use of a suitable token discovered there. Continuing the googledrive example:
library(googledrive)
options(
gargle_oauth_cache = ".secrets",
gargle_oauth_email = TRUE
)
# now use googledrive with no need for explicit auth
drive_find(n_max = 5)
Setting the option gargle_oauth_email = TRUE says that googledrive is allowed to use a token that it finds in the cache, without interacting with a user, as long as it discovers EXACTLY one matching token. This option-setting code needs to appear in each script, .Rmd, or app that needs to use this token non-interactively. Depending on the context, it might be suitable to accomplish this in a startup file, e.g. project-level .Rprofile.
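Because non-interactive use depends on exactly one matching token being found, a quick pre-flight check of the project cache can catch a missing or crowded cache before the real work starts. This is a rough sketch: preflight_cache() is a hypothetical helper, and counting files is only a proxy, since matching also depends on email, scopes, and OAuth app.

```r
# Hypothetical pre-flight check for a project-level token cache.
# Counting files is only a proxy for "exactly one matching token".
preflight_cache <- function(cache = ".secrets") {
  if (!dir.exists(cache)) {
    stop("Cache directory not found: ", cache, call. = FALSE)
  }
  n <- length(list.files(cache, all.files = TRUE, no.. = TRUE))
  if (n == 0L) {
    stop("No cached token files in ", cache, call. = FALSE)
  }
  if (n > 1L) {
    warning(n, " files found in ", cache,
            "; consider specifying an email explicitly.", call. = FALSE)
  }
  invisible(n)
}

# Usage (not run here): call before the first request in a script or app
# preflight_cache(".secrets")
```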
Here’s a variation where we say which token to use by explicitly specifying the associated email. This is handy if there’s a reason to have more than one token in the cache.
library(googledrive)
options(
gargle_oauth_cache = ".secrets",
gargle_oauth_email = "jenny@example.com"
)
# now use googledrive with no need for explicit auth
drive_find(n_max = 5)
Here’s another variation where we specify the necessary info directly in an auth call, instead of in options:
library(googledrive)
drive_auth(cache = ".secrets", email = TRUE)
# now use googledrive with no need for explicit auth
drive_find(n_max = 5)
Here’s one last variation that’s applicable when the local cache could contain multiple tokens:
library(googledrive)
drive_auth(cache = ".secrets", email = "jenny@example.com")
# now use googledrive with no need for explicit auth
drive_find(n_max = 5)
Be very intentional about paths and working directory. Personally I would use here::here(".secrets") everywhere above, to make things more robust.
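A minimal sketch of that suggestion follows. It assumes the here package is installed and that the project root is where .secrets lives; the fallback branch is our addition and still depends on the working directory, so it is included only to keep the example runnable without here.

```r
# Build the cache path relative to the project root when 'here' is available.
# The fallback (an assumption of this sketch) still depends on getwd().
project_cache <- if (requireNamespace("here", quietly = TRUE)) {
  here::here(".secrets")
} else {
  file.path(normalizePath(getwd()), ".secrets")
}

# Announce the cache location and pre-authorize a unique matching token
options(
  gargle_oauth_cache = project_cache,
  gargle_oauth_email = TRUE
)
```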
For troubleshooting purposes, you can set a gargle option to see verbose output about the execution of gargle::token_fetch():
options(gargle_verbosity = "debug")
withr-style convenience helpers also exist: with_gargle_verbosity() and local_gargle_verbosity().
For a cached token to be considered a “match”, it must match the current request with respect to user’s email, scopes, and OAuth app (client ID or key and secret). By design, these settings have very low visibility, because we usually want to use the defaults. If your token is not being discovered, consider if any of these fields might explain the mismatch.
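One way to inspect those fields is gargle's situation report for the token cache, which should list the email, OAuth app, and scopes of each cached token. The wrapper below is a hypothetical convenience that returns FALSE instead of failing when gargle is not installed or the cache directory does not exist.

```r
# Hypothetical wrapper around gargle::gargle_oauth_sitrep().
# Returns FALSE (invisibly) when there is nothing to inspect.
inspect_cache <- function(cache = ".secrets") {
  if (!requireNamespace("gargle", quietly = TRUE)) {
    message("The gargle package is not installed.")
    return(invisible(FALSE))
  }
  if (!dir.exists(cache)) {
    message("Cache directory not found: ", cache)
    return(invisible(FALSE))
  }
  gargle::gargle_oauth_sitrep(cache = cache)
  invisible(TRUE)
}

# Usage (not run here):
# inspect_cache(".secrets")
```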