This is a brief overview of some of the more advanced options in the
ecmwfr package.
A ‘hidden’ feature of the wf_set_key() function is that it returns the
user name upon success. This allows for easier integration in scripts
shared with users (who have different credentials).
# set a key to the keychain interactively
user <- wf_set_key(service = "webapi")
The conversion of MARS or Python based queries (formed on the ECMWF or
Copernicus CDS websites) to the list format used by ecmwfr can be
automated if you use the RStudio based Addin.
By selecting and using Addin -> Mars to list (or ‘Python to list’) you dynamically convert queries copied from either ECMWF or CDS based services.
Using the Addin is a sure way to form a proper ecmwfr request and
avoids typos. As such, we recommend the use of the Addin.
Another hidden feature of ecmwfr is that the request is the first
argument of the wf_request() function. This means that any valid list
can be piped into this function (using the %>% or pipe symbol).
# the %>% pipe is provided by the magrittr package
library(magrittr)

list(stream = "oper",
     levtype = "sfc",
     param = "167.128",
     dataset = "interim",
     step = "0",
     grid = "0.75/0.75",
     time = "00",
     date = "2014-07-01/to/2014-07-02",
     type = "an",
     class = "ei",
     area = "73.5/-27/33/45",
     format = "netcdf",
     target = "tmp.nc") %>%
  wf_request(user = user, path = "~")
Once a valid request has been created it can be made into a dynamic
function using archetypes. Archetype functions are built using a valid
ecmwfr ECMWF or CDS request and a vector naming the fields which are to
be set as dynamic.
The wf_archetype() function creates a new function that takes the
previously assigned dynamic fields as parameters. The example below
shows how to use the function to generate the custom dynamic_request()
function. This new function can then be used to alter the area and day
fields, and its output can be piped (%>%) into the wf_request()
function to retrieve the data.
# this is an example of a request
dynamic_request <- wf_archetype(
  request = list(
    "dataset_short_name" = "reanalysis-era5-pressure-levels",
    "product_type" = "reanalysis",
    "variable" = "temperature",
    "pressure_level" = "850",
    "year" = "2000",
    "month" = "04",
    "day" = "04",
    "time" = "00:00",
    "area" = "70/-20/30/60",
    "format" = "netcdf",
    "target" = "era5-demo.nc"
  ),
  dynamic_fields = c("area", "day")
)
# change the day of the month
dynamic_request(day = "01")
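To make the archetype idea concrete, the base R sketch below mimics the behaviour described above: it captures a template request and returns a function that overrides only the fields declared as dynamic. The helper make_archetype() and the name demo_request are hypothetical and written for illustration only; this is not the implementation of wf_archetype() and it makes no calls to ecmwfr.

```r
# Hypothetical stand-in illustrating the archetype idea (not ecmwfr code).
make_archetype <- function(request, dynamic_fields) {
  # every dynamic field must exist in the template request
  stopifnot(all(dynamic_fields %in% names(request)))

  function(...) {
    overrides <- list(...)
    # only fields declared as dynamic may be changed
    unknown <- setdiff(names(overrides), dynamic_fields)
    if (length(unknown) > 0) {
      stop("not a dynamic field: ", paste(unknown, collapse = ", "))
    }
    # replace the template values with the supplied ones
    utils::modifyList(request, overrides)
  }
}

demo_request <- make_archetype(
  request = list(
    day = "04",
    area = "70/-20/30/60",
    target = "era5-demo.nc"
  ),
  dynamic_fields = c("area", "day")
)

# inspect the template with only the day changed
str(demo_request(day = "01"))
```

In this sketch, fields that were not declared dynamic (such as target) cannot be overridden, which is one plausible way to keep a shared template from being altered by accident.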
As of version 1.4.0 you can submit parallel batch requests. Using
archetypes, as discussed above, it was already easy to request multiple
data products. However, these requests would go through sequentially.
The ECMWF CDS infrastructure allows up to 20 parallel requests in your
queue, so downloads can be faster when jobs are submitted in parallel
rather than sequentially. The wf_request_batch() function implements
parallel CDS requests, using lists of requests (potentially generated
by an archetype as described above).
# creating a list of requests using wf_archetype()
# setting the day value
batch_request <- list(
  dynamic_request(day = "01"),
  dynamic_request(day = "02")
)
# submit a batch job using 2 workers,
# one for each request in the list (the number of workers
# can't exceed 20)
wf_request_batch(
  batch_request,
  workers = 2,
  user = user
)
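For longer batches it can be convenient to build the request list programmatically rather than by hand. The sketch below uses a stand-in for the archetype so that it runs without credentials or network access; fake_dynamic_request() is a hypothetical name, and in real use the dynamic_request() function created by wf_archetype() would take its place. The worker count is capped at 20, the queue limit mentioned above.

```r
# Stand-in for the function returned by wf_archetype() (hypothetical,
# only here so that the example runs offline).
fake_dynamic_request <- function(day) {
  list(day = day)
}

# zero-padded day strings "01" ... "05"
days <- sprintf("%02d", 1:5)

# one request per day
batch_request <- lapply(days, fake_dynamic_request)

# one worker per request, but never more than 20
workers <- min(length(batch_request), 20)
```

The resulting batch_request and workers values could then be passed to wf_request_batch() in the same way as in the example above.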
For those familiar with ECMWF MARS syntax: CDS/ADS does not accept
date = "2000-01-01/to/2000-12-31" specifications at the moment. It is
possible to specify one specific date via date = "2000-01-01", multiple
days via date = ["2000-01-01","2000-01-02","2000-10-20"], or a range
via date = "YYYY-MM-DD/YYYY-MM-DD", but not via ".../to/...". Specifying
the date as a range allows you to sidestep the ERA5T restricted access
issue.
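The date strings described above can be generated from R Date objects with base R alone. This is a minimal sketch; the variable names are illustrative, and how the resulting values are placed in a request (for example under a date field) is assumed to follow the description above.

```r
start <- as.Date("2000-01-01")
end   <- as.Date("2000-12-31")

# a single date as "YYYY-MM-DD"
single_date <- format(start, "%Y-%m-%d")

# a range as "YYYY-MM-DD/YYYY-MM-DD" (note: no "/to/")
date_range <- paste(
  format(start, "%Y-%m-%d"),
  format(end, "%Y-%m-%d"),
  sep = "/"
)

# several individual days as a character vector
multiple_dates <- format(
  seq(start, by = "day", length.out = 3),
  "%Y-%m-%d"
)
```

Building the strings from Date objects rather than typing them avoids malformed dates and makes it easy to shift the period of interest.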