Advanced Use Cases

Koen Hufkens

2022-08-17

This is a brief overview of some of the more advanced options in the ecmwfr package.

Setup

A ‘hidden’ feature of the wf_set_key() function is that it returns the user name upon success. This makes it easier to integrate into scripts shared between users who have different credentials.

# set a key to the keychain interactively
user <- wf_set_key(service = "webapi")

Formatting requests

The conversion from MARS or Python based queries (formed on the ECMWF or Copernicus CDS websites) to the list format used by ecmwfr can be automated with the RStudio based Addin.

Select the query you copied from an ECMWF or CDS based service and use Addin -> Mars to list (or ‘Python to list’) to convert it to a request list.

Using the Addin is the most reliable way to form a valid ecmwfr request and avoids typos, so we recommend it.

Piped requests

Another hidden feature of ecmwfr is that the request is the first argument of the wf_request() function. Any valid request list can therefore be piped into this function using the %>% pipe operator.

# %>% is provided by magrittr; load it if it is not already attached
library(magrittr)

list(stream  = "oper",
     levtype = "sfc",
     param   = "167.128",
     dataset = "interim",
     step    = "0",
     grid    = "0.75/0.75",
     time    = "00",
     date    = "2014-07-01/to/2014-07-02",
     type    = "an",
     class   = "ei",
     area    = "73.5/-27/33/45",
     format  = "netcdf",
     target  = "tmp.nc") %>%
  wf_request(user = user, path = "~")

Dynamic request functions / archetypes

Once a valid request has been created it can be turned into a dynamic function using archetypes. Archetype functions are built from a valid ecmwfr ECMWF or CDS request and a vector naming the fields that are to be set as dynamic.

The wf_archetype() function creates a new function whose parameters are the dynamic fields you assigned. The example below shows how to use it to generate the custom dynamic_request() function, with area and day as dynamic fields. We then use this new function to change the day field. The result is a regular request list, which can be piped (%>%) into the wf_request() function to retrieve the data.

# this is an example of a request
dynamic_request <- wf_archetype(
  request = list(
  "dataset_short_name" = "reanalysis-era5-pressure-levels",
  "product_type"   = "reanalysis",
  "variable"       = "temperature",
  "pressure_level" = "850",
  "year"           = "2000",
  "month"          = "04",
  "day"            = "04",
  "time"           = "00:00",
  "area"           = "70/-20/30/60",
  "format"         = "netcdf",
  "target"         = "era5-demo.nc"
  ),
  dynamic_fields = c("area","day"))

# change the day of the month
dynamic_request(day = "01")
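
To make the idea concrete, the following base R sketch mimics the pattern behind an archetype: a template request is wrapped in a function whose arguments overwrite the chosen fields. Note that make_archetype() is a hypothetical stand-in written for illustration only; it is not how wf_archetype() is implemented and the shortened template is not a complete CDS request.

```r
# Hypothetical stand-in for illustration only; not ecmwfr's wf_archetype().
make_archetype <- function(request, dynamic_fields) {
  function(...) {
    changes <- list(...)
    # only fields declared as dynamic may be overridden
    unknown <- setdiff(names(changes), dynamic_fields)
    if (length(unknown) > 0) {
      stop("not a dynamic field: ", paste(unknown, collapse = ", "))
    }
    # replace the matching fields, keep everything else from the template
    modifyList(request, changes)
  }
}

# shortened template request (illustrative, not a complete request)
template <- list(
  variable = "temperature",
  day      = "04",
  area     = "70/-20/30/60"
)

demo_request <- make_archetype(template, dynamic_fields = c("area", "day"))

# override both dynamic fields; "variable" is taken from the template
new_request <- demo_request(day = "01", area = "60/-10/40/30")
str(new_request)
```

Restricting the arguments to the declared dynamic fields means a mistyped field name fails early instead of silently producing an unintended request.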

Batch (parallel) requests

As of version 1.4.0 you can submit parallel batch requests. Archetypes, as discussed above, make it easy to request multiple data products, but those requests are processed sequentially. The ECMWF CDS infrastructure allows up to 20 parallel requests in your queue, so submitting jobs in parallel rather than one after the other can speed up downloads. The wf_request_batch() function implements parallel CDS requests using a list of requests (potentially generated by an archetype, as above).

# creating a list of requests using wf_archetype()
# setting the day value
batch_request <- list(
  dynamic_request(day = "01"),
  dynamic_request(day = "02")
)

# submit a batch job using 2 workers
# one for each in the list (the number of workers
# can't exceed 20)
wf_request_batch(
  batch_request,
  workers = 2,
  user = user
  )
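
For more than a handful of requests, writing the list by hand becomes tedious. The following is a minimal sketch of generating it with lapply(). Here build_request() is a hypothetical stand-in for an archetype function such as dynamic_request(), so that the snippet runs without credentials; giving every request its own target file name is also an assumption, made so that downloads would not overwrite each other.

```r
# Hypothetical stand-in for an archetype function such as dynamic_request();
# it returns a plain request list with the day (and target file) filled in.
build_request <- function(day) {
  list(
    variable = "temperature",
    year     = "2000",
    month    = "04",
    day      = day,
    target   = paste0("era5-demo-", day, ".nc")
  )
}

# zero-padded day strings "01" ... "05"
days <- sprintf("%02d", 1:5)

# one request per day
batch_request <- lapply(days, build_request)

# never ask for more workers than requests, or more than 20
workers <- min(length(batch_request), 20)
```

The resulting list and worker count could then be passed on as wf_request_batch(batch_request, workers = workers, user = user).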

Date specification

For those familiar with ECMWF MARS syntax: CDS/ADS does not accept date = "2000-01-01/to/2000-12-31" specifications at the moment. You can specify a single date via date = "2000-01-01", multiple days via date = ["2000-01-01","2000-01-02","2000-10-20"], or a range via date = "YYYY-MM-DD/YYYY-MM-DD", but not via ".../to/...". Specifying the date as a range allows you to sidestep the ERA5T restricted access issue.
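
The accepted forms can be put together in base R. This is a minimal sketch with illustrative variable names; treating a character vector as the R counterpart of the bracketed list of days is an assumption.

```r
start <- as.Date("2000-01-01")
end   <- as.Date("2000-01-05")

# a single date
single_date <- format(start, "%Y-%m-%d")

# several individual days as a character vector
# (assumption: the R counterpart of the bracketed list of days)
multiple_dates <- format(seq(start, end, by = "day"), "%Y-%m-%d")

# a range in the "YYYY-MM-DD/YYYY-MM-DD" form, without "/to/"
date_range <- paste(format(start, "%Y-%m-%d"),
                    format(end, "%Y-%m-%d"),
                    sep = "/")

# convert an existing MARS-style range to the CDS form
mars_range <- "2000-01-01/to/2000-12-31"
cds_range  <- sub("/to/", "/", mars_range, fixed = TRUE)
```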