- `rlang` PR 1255.
- The `batchtools` template file can be brewed (#1359, @pat-s).
- `targets`.
- Add `NOTICE` and `inst/NOTICE` files to more explicitly credit code included from other open source projects. (Previously `drake` just had comments in the source with links to the various projects.)
- Use `dsl_sym()` instead of `as.symbol()` when constructing commands for `combine()` (#1340, @vkehayas).
- Add a `level_separation` argument to `vis_drake_graph()` and `render_drake_graph()` to control the aspect ratio of `visNetwork` graphs (#1303, @matthewstrasiotto, @matthiasgomolka, @robitalec).
- Deprecate `caching = "master"` in favor of `caching = "main"`.
- `.data` in the DSL (#1323, @shirdekel).
- Use `identical()` to compare file hashes (#1324, @shirdekel).
- Set `seed = TRUE` in `future::future()`.
- `parallelism = "clustermq"` with `caching = "worker"` (@richardbayes).
- Handle cases where `NROW()` throws an error (#1300, julian-tagell on Stack Overflow).
- Use a version of `lifecycle` that does not require badges to be in `man/figures`.
- Pass the `log_worker` argument of `clustermq::workers()` through `make()` and `drake_config()` (#1305, @billdenney, @mschubert).
- Set `as.is` to `TRUE` in `utils::type.convert()` (#1309, @bbolker).
- `cached_planned()` and `cached_unplanned()` now work with non-standard cache locations (#1268, @Plebejer).
- Set `use_cache` to `FALSE` more often (#1257, @Plebejer).
- Replace the `iris` dataset with the `airquality` dataset in all documentation, examples, and tests (#1271).
- Assign functions from `code_to_function()` to the proper environment (#1275, @robitalec).
- `tidyselect` (#1274, @dernst).
- `txtq` lockfiles (#1232, #1239, #1280, @danwwilson, @pydupont, @mattwarkentin).
- Add a `drake_script()` function to write `_drake.R` files for `r_make()` (#1282).
- Deprecate `expose_imports()` in favor of `make(envir = getNamespace("yourPackage"))` (#1286, @mvarewyck).
- Suppress the `r_make()` message if `getOption("drake_r_make_message")` is `FALSE` (#1238, @januz).
- Improve the `visNetwork` graph by using the hierarchical layout with `visEdges(smooth = list(type = "cubicBezier", forceDirection = TRUE))` (#1289, @mstr3336).
- Prevent `splice_inner()` from dropping formal arguments shared by `c()` (#1262, @bart1).
- Fix `subtarget_hashes.cross()` for crosses on a single grouping variable.
- Fix `group()` used with specialized formats (#1236, @adamaltmejd).
- `tidyselect` >= 1.0.0.
- The `.names` argument (#1240, @maciejmotyka, @januz).
- `drake_plan()` (#1237, @januz).
- Fix the names of `cross()` sub-targets (#1204, @psadil). Expansion order is the same, but names are correctly matched now.
- Never delete `file_out()` files in `clean()`, even when `garbage_collection` is `TRUE` (#521, @the-Hull).
- Respect `keep_going = TRUE` for formatted targets (#1206).
- Use `progress_bar` (instead of `progress`) so that `drake` works without the `progress` package (#1208, @mbaccou).
- `config$settings` (#965).
- Add `drake_done()` and `drake_cancelled()` (#1205).
- `drake_graph_info()` (#1207).
- When `verbose` is `2` (#1203, @kendonB).
- Deprecate the `jobs` argument of `clean()`.
- `drake_build()` or `drake_debug()` (#1214, @kendonB).
- Deprecate `hasty_build` (#1222).
- `config$settings` (#965).
- Throw an error when `file_in()`/`file_out()`/`knitr_in()` files are not literal strings (#1229).
- Ignore `file_out()` and `knitr_in()` in imported functions (#1229).
- Prohibit `knitr_in()` in dynamic branching (#1229).
- `target()`.
- Rename the progress functions (`progress()` => `drake_progress()`, `running()` => `drake_running()`, `failed()` => `drake_failed()`) (#1205).
- Bump the required `digest` version to 0.6.21 (#1166, @boshek).
- Allow the `depend` trigger to toggle invalidation from dynamic-only dependencies, including the `max_expand` argument of `make()`.
- Improve `session_info` argument parsing (and reduce calls to `utils::sessionInfo()` in tests).
- `tibble` 3.0.0.
- Implement `target(format = "file")` (#1168, #1127).
- Set `max_expand` on a target-by-target basis via `target()` (#1175, @kendonB).
- In `make()`, not in `drake_config()` (#1156).
- In `make(verbose = 2)`, remove the spinner and use a progress bar to track how many targets are done so far.
- `cli` (optional package).
- Deprecate `console_log_file` in favor of `log_make` as an argument to `make()` and `drake_config()`.
- `"loop"` and `"future"` parallel backends (#400).
- Choose the cache for the `loadd()` RStudio addin through the new `rstudio_drake_cache` global option (#1169, @joelnitta).
- Fix edge cases in `recoverable()`, e.g. dynamic branching + dynamic files.
- Throw an informative error from `drake_plan()` if a grouping variable is undefined or invalid (#1182, @kendonB).
- `drake_deps` and `drake_deps_ht` (#1183).
- Use `rlang::trace_back()` to make `diagnose()$error$calls` nicer (#1198).

These changes invalidate some targets in some workflows, but they are necessary bug fixes.
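Several entries above concern storage formats (`target(format = "file")`, formatted targets). A minimal sketch of how formats appear in a plan, assuming the drake API as documented; `produce_data()` and `write_report()` are hypothetical user functions, and `"fst"` is one of drake's documented formats:

```r
library(drake)

plan <- drake_plan(
  big_frame = target(
    produce_data(),           # hypothetical function returning a data frame
    format = "fst"            # store with fst for fast (de)serialization
  ),
  report = target(
    write_report(big_frame),  # hypothetical: the command returns the path(s) it wrote
    format = "file"           # track those returned paths as files (#1168)
  )
)

make(plan)
```

With `format = "file"`, the command's return value is treated as file paths, so changes to those files invalidate the target on the next `make()`.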
- `$<-()` and `@<-()` (#1144).
- `bind_plans()` (#1136, @jennysjaarda).
- `analyze_assign()` (#1119, @jennysjaarda).
- The `"running"` progress of dynamic targets.
- Add a `"fst_tbl"` format for large `tibble` targets (#1154, @kendonB).
- Add a `format` argument to `make()`, an optional custom storage format for targets without an explicit `target(format = ...)` in the plan (#1124).
- Add a `lock_cache` argument to `make()` to optionally suppress cache locking (#1129). (It can be annoying to interrupt `make()` repeatedly and unlock the cache manually every time.)
- Add `cancel()` and `cancel_if()` functions to cancel targets mid-build (#1131).
- Add a `subtarget_list` argument to `loadd()` and `readd()` to optionally load a dynamic target as a list of sub-targets (#1139, @MilesMcBain).
- `file_out()` (#1141).
- At the `drake_config()` level (#1156, @MilesMcBain).
- Deprecate the `config` argument in all user-side functions (#1118, @vkehayas). Users can now supply the plan and other `make()` arguments directly, without bothering with `drake_config()`. Now, you only need to call `drake_config()` in the `_drake.R` file for `r_make()` and friends. Old code with `config` objects should still work. Affected functions: `make()`, `outdated()`, `drake_build()`, `drake_debug()`, `recoverable()`, `missed()`, `deps_target()`, `deps_profile()`, `drake_graph_info()`, `vis_drake_graph()`, `sankey_drake_graph()`, `drake_graph()`, `text_drake_graph()`, and `predict_runtime()`. Needed to rename the `targets` argument to `targets_predict` and `jobs` to `jobs_predict`.
- `predict_workers()`: same argument name changes as `predict_runtime()`.
- The purpose of `drake_config()` is now to serve the functions `r_make()` and friends.
- Handle the `@` operator: in the static code analysis of `x@y`, do not register `y` as a dependency (#1130, @famuvie).
- `deps_profile()` (#1134, @kendonB).
- Improve `deps_target()` output (#1134, @kendonB).
- `drake_meta_()` objects.
- `drake_envir()` and `id_chr()` (#1132).
- Allow `drake_envir()` to select the environment with imports (#882).
- Adopt the `vctrs` paradigm and its type stability for dynamic branching (#1105, #1106).
- Treat `target` as a symbol by default in `read_trace()`. Required for the trace to make sense in #1107.
- The `"future"` backend (#1083, @jennysjaarda).
- Add a `log_build_times` argument to `make()` and `drake_config()`. Allows users to disable the recording of build times. Produces a speedup of up to 20% on Macs (#1078).
- `make()`, `outdated(make_imports = TRUE)`, `recoverable(make_imports = TRUE)`, `vis_drake_graph(make_imports = TRUE)`, `clean()`, etc. on the same cache.
- Add a `format` trigger to invalidate targets when the specialized data format changes (#1104, @kendonB).
- Add `cache_planned()` and `cache_unplanned()` to help selectively clean workflows with dynamic targets (#1110, @kendonB).
- `drake_config()` objects and `analyze_code()` objects.
- Add a `"qs"` format (#1121, @kendonB).
- Avoid `%||%` (`%|||%` is faster) (#1089, @billdenney).
- Avoid `%||NA` due to slowness (#1089, @billdenney).
- Speed up `is_dynamic()` and `is_subtarget()` (#1089, @billdenney).
- Use `getVDigest()` instead of `digest()` (#1089, #1092, https://github.com/eddelbuettel/digest/issues/139#issuecomment-561870289, @eddelbuettel, @billdenney).
- Use `backtick` and `.deparseOpts()` to speed up `deparse()` (#1086, https://stackoverflow.com/users/516548/g-grothendieck, @adamkski).
- Speed up `build_times()` (#1098).
- Use `mget_hash()` in `progress()` (#1098).
- Speed up `drake_graph_info()` (#1098).
- Speed up `outdated()` (#1098).
- In `make()`, avoid checking for nonexistent metadata for missing targets.
- `drake_config()`.
- `use_drake()` (#1097, @lorenzwalthert, @tjmahr).
- The spec is `drake`'s interpretation of the plan. In the plan, all the dependency relationships among targets and files are implicit. In the spec, they are all explicit. We get from the plan to the spec using static code analysis, e.g. `analyze_code()`.
- Prevent `drake::drake_plan(x = target(...))` from throwing an error if `drake` is not loaded (#1039, @mstr3336).
- Move the `transformations` lifecycle badge to the proper location in the docstring (#1040, @jeroen).
- Prevent `readd()`/`loadd()` from turning an imported function into a target (#1067).
- `disk.frame` targets with their stored values (#1077, @brendanf).
- Add a `subtargets()` function to get the cached names of the sub-targets of a dynamic target.
- Add `subtargets` arguments to `loadd()` and `readd()` to retrieve specific sub-targets from a parent dynamic target.
- Add `get_trace()` and `read_trace()` functions to help track which values of grouping variables go into the making of dynamic sub-targets.
- Add an `id_chr()` function to get the name of the target while `make()` is running.
- Support `plot(plan)` (#1036).
- `vis_drake_graph()`, `drake_graph_info()`, and `render_drake_graph()` now take arguments that allow behavior to be defined upon selection of nodes (#1031, @mstr3336).
- Add a `max_expand` argument to `make()` and `drake_config()` to scale down dynamic branching (#1050, @hansvancalster).
- `drake_config()` objects.
- Allow `prework` to be a language object, list of language objects, or character vector (#1 at pat-s/multicore-debugging on GitHub, @pat-s).
- `config$layout` supports internal modifications by reference. Required for #685.
- Make `dynamic` a formal argument of `target()`.
- `storr`s and decorated `storr`s (#1071).
- Use `setdiff()` and avoid `names(config$envir_targets)`.
- `dir_size()`: incurs rehashing for some workflows, but should not invalidate any targets.
- Add a `which_clean()` function to preview which targets will be invalidated by `clean()` (#1014, @pat-s).
- `storr` (#1015, @billdenney, @noamross).
- Add a `"diskframe"` format for larger-than-memory data (#1004, @xiaodaigh).
- Add a `drake_tempfile()` function to help with the `"diskframe"` format. It makes sure we are not copying large datasets across different physical storage media (#1004, @xiaodaigh).
- Add `code_to_function()` to allow for parsing script-based workflows into functions so `drake_plan()` can begin to manage the workflow and track dependencies (#994, @thebioengineer).
- `seed_trigger()` (#1013, @CreRecombinase).
- The `txtq` API inside the decorated `storr` API (#1020).
- Change the meaning of `max_expand` in `drake_plan()`. `max_expand` is now the maximum number of targets produced by `map()`, `split()`, and `cross()`. For `cross()`, this reduces the number of targets (less cumbersome) and makes the subsample of targets more representative of the complete grid. It also ensures consistent target naming when `.id` is `FALSE` (#1002). Note: `max_expand` is not for production workflows anyway, so this change does not break anything important. Unfortunately, we do lose the speed boost in `drake_plan()` originally due to `max_expand`, but `drake_plan()` is still fast, so that is not so bad.
- `NULL` targets (#998).
- Fix `cross()` (#1009). The same fix should apply to `map()` and `split()` too.
- Fix `map()` (#1010).
- `fst`-powered saving of `data.table` objects.
- Make `transform` a formal argument of `target()` so that users do not have to type "transform =" all the time in `drake_plan()` (#993).
- Move the documentation website from `ropensci.github.io/drake` to `docs.ropensci.org/drake`.
- Support `target(format = "keras")` (#989).
- Deprecate the `verbose` argument in various caching functions. The location of the cache is now only printed in `make()`. This made the previous feature easier to implement.
- `combine()` (#1008).
- `storr` (#968).
- Make `drake_plan(transform = slice())` understand `.id` and grouping variables (#963).
- Fix `clean(garbage_collection = TRUE, destroy = TRUE)`. Previously it destroyed the cache before trying to collect garbage.
- Ensure `r_make()` passes informative error messages back to the calling process (#969).
- Fix `map()` and `cross()` on topologically side-by-side targets (#983).
- Fix `dsl_left_outer_join()` so `cross()` selects the right combinations of existing targets (#986). This bug was probably introduced in the solution to #983.
- Make `progress()` more consistent, less dependent on whether `tidyselect` is installed.
- Add a `format` argument to `target()` (#971). This allows users to leverage faster ways to save and load targets, such as `write_fst()` for data frames and `save_model_hdf5()` for Keras models. It also improves memory because it prevents `storr` from making a serialized in-memory copy of large data objects.
- Add `tidyselect` functionality for `...` in `progress()`, analogous to `loadd()`, `build_times()`, and `clean()`.
- If the generic `do_stuff()` and the method `stuff.your_class()` are defined in `envir`, and if `do_stuff()` has a call to `UseMethod("stuff")`, then `drake`'s code analysis will detect `stuff.your_class()` as a dependency of `do_stuff()`.
- Track `file_in()` URLs. Requires the new `curl_handles` argument of `make()` and `drake_config()` (#981).
- `target()`, `map()`, `split()`, `cross()`, and `combine()` (#979).
- Do not remove `file_out()` files in `clean()` unless `garbage_collection` is `TRUE`. That way, `make(recover = TRUE)` is a true "undo button" for `clean()`. `clean(garbage_collection = TRUE)` still removes data in the cache, as well as any `file_out()` files from targets currently being cleaned.
- The menu in `clean()` only appears if `garbage_collection` is `TRUE`. Also, this menu is added to `rescue_cache(garbage_collection = TRUE)`.
- Move the history to `.drake/`. The old `.drake_history/` folder was awkward. Old histories are migrated during `drake_config()` and `drake_history()`.
- Avoid `.drake_history` in `plan_to_code()`, `plan_to_notebook()`, and the examples in the help files.
- Add `make(recover = TRUE)`.
- Add `recoverable()` and `r_recoverable()` to show targets that are outdated but recoverable via `make(recover = TRUE)`.
- Add `drake_history()`, powered by `txtq` (#918, #920).
- Add a `no_deps()` function, similar to `ignore()`. `no_deps()` suppresses dependency detection but still tracks changes to the literal code (#910).
- `transform_plan()`.
- Support a `seed` column of `drake` plans to set custom seeds (#947).
- Add a `seed` trigger to optionally ignore changes to the target seed (#947).
- In `drake_plan()`, interpret custom columns as non-language objects (#942).
- `clustermq` >= 0.8.8.
- Deprecate `ensure_workers` in `drake_config()` and `make()`.
- `make()` after `config` is already supplied.
- Prohibit calling `make()` from inside the cache (#927).
- Add a `CITATION` file with the JOSS paper.
- In `deps_profile()`, include the seed and change the names.
- `make()`: all this does is invalidate old targets.
- Use `set_hash()` and `get_hash()` in `storr` to double the speed of progress tracking.
- `$` (#938).
- Choose `xxhash64` as the default hash algorithm for non-`storr` hashing if the driver does not have a hash algorithm.

These changes are technically breaking changes, but they should only affect advanced users.
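The recovery and history features described above (`make(recover = TRUE)`, `recoverable()`, `drake_history()`) fit together roughly as in this sketch, assuming the documented drake API:

```r
library(drake)

plan <- drake_plan(
  x = runif(100),
  y = mean(x)
)

make(plan)                  # builds x and y; each build is logged in the history
drake_history()             # data frame with one row per recorded build
clean()                     # invalidates targets but old data may remain in the cache
recoverable()               # names of outdated targets that can be salvaged
make(plan, recover = TRUE)  # reuses the old values instead of rebuilding
```

Recovery works here because the commands and seeds are unchanged, so `drake` can match the invalidated targets back to cached values.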
- `rescue_cache()` no longer returns a value.
- `clustermq` (#898): suggest version >= 0.8.8 but allow 0.8.7 as well.
- Ensure `drake` recomputes `config$layout` when `knitr` reports change (#887).
- `make()` (#878).
- `r_drake_build()`.
- `r_make()` (#889).
- In `expose_imports()`: do not do the `environment<-` trick unless the object is a non-primitive function.
- `assign()` vs `delayedAssign()`.
- `file_in()` files and other strings (#896).
- Make `ignore()` work inside `loadd()`, `readd()`, `file_in()`, `file_out()`, and `knitr_in()`.
- Support URLs in `file_in()` and `file_out()`. `drake` now treats `file_in()`/`file_out()` files as URLs if they begin with "http://", "https://", or "ftp://". The fingerprint is a concatenation of the ETag and last-modified timestamp. If neither can be found or if there is no internet connection, `drake` throws an error.
- Add the memory strategies `"unload"` and `"none"`, which do not attempt to load a target's dependencies from memory (#897).
- Add `drake_slice()` to help split data across multiple targets. Related: #77, #685, #833.
- Add a `drake_cache()` function, which is now recommended instead of `get_cache()` (#883).
- Add an `r_deps_target()` function.
- `r_make()`, `r_vis_drake_graph()`, and `r_outdated()` (#892).
- Deprecate `get_cache()` in favor of `drake_cache()`.
- The `clean()` menu prompt.
- `drake_config()`.
- The `config` argument.
- Set `use_cache` to `FALSE` in `storr` function calls for saving and loading targets. Also, at the end of `make()`, call `flush_cache()` (and then `gc()` if garbage collection is enabled).
- Recommend `callr::r()` within commands as a safe alternative to `lock_envir = FALSE` in the self-invalidation section of the `make()` help file.
- `file_in()`/`file_out()`/`knitr_in()` files: we now rehash files if the file is less than 100 KB or the time stamp changed or the file size changed.
- Accommodate `rlang`'s new interpolation operator `{{`, which was causing `make()` to fail when `drake_plan()` commands are enclosed in curly braces (#864).
- Move "`config$lock_envir <- FALSE`" from `loop_build()` to `backend_loop()`. This makes sure `config$envir` is correctly locked in `make(parallelism = "clustermq")`.
- The `.data` argument of `map()` and `cross()` in the DSL.
- In `drake_plan()`, repair `cross(.data = !!args)`, where `args` is an optional data frame of grouping variables.
- Fix `file_in()`/`file_out()` directories for Windows (#855).
- Make `.id_chr` work with `combine()` in the DSL (#867).
- Do not use `make_spinner()` unless the version of `cli` is at least 1.1.0.
- Add `text_drake_graph()` (and `r_text_drake_graph()` and `render_text_drake_graph()`). Uses text art to print a dependency graph to the terminal window. Handy for when users SSH into remote machines without X Window support.
- Add a `max_expand` argument to `drake_plan()`, an optional upper bound on the lengths of grouping variables for `map()` and `cross()` in the DSL. Comes in handy when you have a massive number of targets and you want to test on a miniature version of your workflow before you scale up to production.
- Delay the initialization of `clustermq` workers for as long as possible. Before launching them, build/check targets locally until we reach an outdated target with `hpc` equal to `FALSE`. In other words, if no targets actually require `clustermq` workers, no workers get created.
- In `make(parallelism = "future")`, reset the `config$sleep()` backoff interval whenever a new target gets checked.
- Replace `CodeDepends` with a base R solution in `code_to_plan()`. Fixes a CRAN note.
- The DSL (transformations in `drake_plan()`) is no longer experimental.
- The `callr` API (`r_make()` and friends) is no longer experimental.
- Deprecate `evaluate_plan()`, `expand_plan()`, `map_plan()`, `gather_plan()`, `gather_by()`, `reduce_plan()`, and `reduce_by()`.
- Deprecate `deps()`, `max_useful_jobs()`, and `migrate_drake_project()`.
- Make `drake_plan(x = target(..., transform = map(...)))` avoid inserting extra dots in target names when the grouping variables are character vectors (#847). Target names come out much nicer this way, but those name changes will invalidate some targets (i.e. they need to be rebuilt with `make()`).
- Use `config$jobs_preprocess` (local jobs) in several places where `drake` was incorrectly using `config$jobs` (meant for targets).
- Allow `loadd(x, deps = TRUE, config = your_config)` to work even if `x` is not cached (#830). Required disabling `tidyselect` functionality when `deps` is `TRUE`. There is a new note in the help file about this, and an informative console message prints out on `loadd(deps = TRUE, tidyselect = TRUE)`. The default value of `tidyselect` is now `!deps`.
- Require `testthat` >= 2.0.1.9000.
- In `drake_plan()` transformations, allow the user to refer to a target's own name using a special `.id_chr` symbol, which is treated like a character string.
- Add a `transparency` argument to `drake_ggraph()` and `render_drake_ggraph()` to disable transparency in the rendered graph. Useful for R installations without transparency support.
- `vis_drake_graph()` and `drake_ggraph()` displays: only activated in `vis_drake_graph()` when there are at least 10 nodes distributed in both the vertical and horizontal directions.
- `vis_drake_graph()` and `render_drake_graph()`.
- `drake_plan()` (#847).
- `drake` plans (`drake_plan()`) inside `drake_config()` objects: when other bottlenecks are removed, this will reduce the burden on memory (re #800).
- The `targets` argument inside `drake_config()` objects. This is to reduce memory consumption.
- Deprecate the `layout` and `direction` arguments of `vis_drake_graph()` and `render_drake_graph()`. Direction is now always left to right and the layout is always Sugiyama.
- The cache log file (`drake_cache.csv` by default) now avoids issues with spaces (e.g. entry names with spaces in them, such as "file report.Rmd").
- In `drake` 7.0.0, if you run `make()` in interactive mode and respond to the menu prompt with an option other than `1` or `2`, targets will still build.
- Fix a bug in `drake_graph()`. The bug came from `append_output_file_nodes()`, a utility function of `drake_graph_info()`.
- `r_make(r_fn = callr::r_bg())` re #799.
- Allow `drake_ggraph()` and `sankey_drake_graph()` to work when the graph has no edges.
- Add a `use_drake()` function to write the `make.R` and `_drake.R` files from the "main example". Does not write other supporting scripts.
- With an `hpc` column in your `drake_plan()`, you can now select which targets to deploy to HPC and which to run locally.
- Add a `list` argument to `build_times()`, just like `loadd()`.
- `file_in()` and `file_out()` can now handle entire directories, e.g. `file_in("your_folder_of_input_data_files")` and `file_out("directory_with_a_bunch_of_output_files")`.
- `config` to HPC workers.
- `drake_ggraph()`.
- Supplying a `drake` plan to the `config` argument of a function.
- In `map()` and `cross()` transformations in the DSL, prevent the accidental sorting of targets by name (#786). Needed `merge(sort = FALSE)` in `dsl_left_outer_join()`.
- The `verbose` argument of `make()` now takes values 0, 1, and 2, and maximum verbosity in the console prints targets, retries, failures, and a spinner. The console log file, on the other hand, dumps maximally verbose runtime info regardless of the `verbose` argument.
- Functions compiled with `f <- Rcpp::cppFunction(...)` did not stay up to date from session to session because the addresses corresponding to anonymous pointers were showing up in `deparse(f)`. Now, `drake` ignores those pointers, and `Rcpp` functions compiled inline appear to stay up to date. This problem was more of an edge case than a bug.
- In `drake_plan()`, deprecate the `tidy_evaluation` argument in favor of the new and more concise `tidy_eval`. To preserve back compatibility for now, if you supply a non-`NULL` value to `tidy_evaluation`, it overwrites `tidy_eval`.
- Shrink `drake_config()` objects by assigning the closure of `config$sleep` to `baseenv()`.
- In `drake` plans, the `command` and `trigger` columns are now lists of language objects instead of character vectors. `make()` and friends still work if you have character columns, but the default output of `drake_plan()` has changed to this new format.
- All parallel backends (the `parallelism` argument of `make()`) except "clustermq" and "future" are removed. A new "loop" backend covers local serial execution.
- Remove previously deprecated functions (`built()`, `find_project()`, `imported()`, and `parallel_stages()`; full list at #564) and the single-quoted file API.
- Set the default of `lock_envir` to `TRUE` in `make()` and `drake_config()`. So `make()` will automatically quit in error if the act of building a target tries to change upstream dependencies.
- `make()` no longer returns a value. Users will need to call `drake_config()` separately to get the old return value of `make()`.
- Require the `jobs` argument to be of length 1 (`make()` and `drake_config()`). To parallelize the imports and other preprocessing steps, use `jobs_preprocess`, also of length 1.
- Imports are no longer kept in their own `storr` namespace. As a result, `drake` is faster, but users will no longer be able to load imported functions using `loadd()` or `readd()`.
- In `target()`, users must now explicitly name all the arguments except `command`, e.g. `target(f(x), trigger = trigger(condition = TRUE))` instead of `target(f(x), trigger(condition = TRUE))`.
- Error out in `bind_plans()` when the result has duplicated target names. This makes `drake`'s API more predictable and helps users catch malformed workflows earlier.
- `loadd()` only loads targets listed in the plan. It no longer loads imports or file hashes.
- The return values of `progress()`, `deps_code()`, `deps_target()`, and `predict_workers()` are now data frames.
- Set the default `hover` to `FALSE` in visualization functions. Improves speed.
- Fix `bind_plans()` to work with lists of plans (`bind_plans(list(plan1, plan2))` was returning `NULL` in `drake` 6.2.0 and 6.2.1).
- Ensure `get_cache(path = "non/default/path", search = FALSE)` looks for the cache in `"non/default/path"` instead of `getwd()`.
- `tibble`.
- Call `ensure_loaded()` in `meta.R` and `triggers.R` when ensuring the dependencies of the `condition` and `change` triggers are loaded.
- Add a `config` argument to `drake_build()` and `loadd(deps = TRUE)`.
- Add a `lock_envir` argument to safeguard reproducibility. More discussion: #619, #620.
- The new `from_plan()` function allows the users to reference custom plan columns from within commands. Changes to values in these columns do not invalidate targets.
- Add a menu about `make()` pitfalls in interactive mode (#761). Appears once per session. Disable with `options(drake_make_menu = FALSE)`.
- Add `r_make()`, `r_outdated()`, etc. to run `drake` functions more reproducibly in a clean session. See the help file of `r_make()` for details.
- `progress()` gains a `progress` argument for filtering results. For example, `progress(progress = "failed")` will report targets that failed.
- Drop `storr`'s key mangling in favor of `drake`'s own encoding of file paths and namespaced functions for `storr` keys.
- Prevent `.`, `..`, and `.gitignore` from being target names (consequence of the above).
- Use a single hash algorithm per `drake` cache, which the user can set with the `hash_algorithm` argument of `new_cache()`, `storr::storr_rds()`, and various other cache functions. Thus, the concepts of a "short hash algorithm" and "long hash algorithm" are deprecated, and the functions `long_hash()`, `short_hash()`, `default_long_hash_algo()`, `default_short_hash_algo()`, and `available_hash_algos()` are deprecated. Caches are still back-compatible with `drake` > 5.4.0 and <= 6.2.1.
- Allow the `magrittr` dot symbol to appear in some commands sometimes.
- Deprecate the `fetch_cache` argument in all functions.
- Remove `DBI` and `RSQLite` from "Suggests".
- Use `config$eval <- new.env(parent = config$envir)` for storing built targets and evaluating commands in the plan. Now, `make()` no longer modifies the user's environment. This move is a long-overdue step toward purity.
- The `codetools` package.
- Deprecate the `session` argument of `make()` and `drake_config()`. Details: in #623.
- Deprecate the `graph` and `layout` arguments to `make()` and `drake_config()`. The change simplifies the internals, and memoization allows us to do this.
- Allow users to run `make()` in a subdirectory of the `drake` project root (determined by the location of the `.drake` folder in relation to the working directory).
- The `verbose` argument, including the option to print execution and total build times.
- `mclapply()` or `parLapply()`, depending on the operating system.
- `build_times()`, `predict_runtime()`, etc. focus on only the targets.
- Deprecate `plan_analyses()`, `plan_summaries()`, `analysis_wildcard()`, `cache_namespaces()`, `cache_path()`, `check_plan()`, `dataset_wildcard()`, `drake_meta()`, `drake_palette()`, `drake_tip()`, `recover_cache()`, `cleaned_namespaces()`, `target_namespaces()`, `read_drake_config()`, `read_drake_graph()`, and `read_drake_plan()`.
- Deprecate `target()` as a user-side function. From now on, it should only be called from within `drake_plan()`.
- `drake_envir()` now throws an error, not a warning, if called in the incorrect context. Should be called only inside commands in the user's `drake` plan.
- Replace `*expr*()` `rlang` functions with their `*quo*()` counterparts. We still keep `rlang::expr()` in the few places where we know the expressions need to be evaluated in `config$eval`.
- The `prework` argument to `make()` and `drake_config()` can now be an expression (language object) or list of expressions. Character vectors are still acceptable.
- In `make()`, print messages about triggers etc. only if `verbose >= 2L`.
- Rename `in_progress()` to `running()`.
- Rename `knitr_deps()` to `deps_knitr()`.
- Rename `dependency_profile()` to `deps_profile()`.
- Rename `predict_load_balancing()` to `predict_workers()`.
- Deprecate `this_cache()` and defer to `get_cache()` and `storr::storr_rds()` for simplicity.
- Set the default `hover` to `FALSE` in visualization functions. Improves speed. Also a breaking change.
- Deprecate `drake_cache_log_file()`. We recommend using `make()` with the `cache_log_file` argument to create the cache log. This way ensures that the log is always up to date with `make()` results.

Version 6.2.1 is a hotfix to address the failing automated CRAN checks for 6.2.0. Chiefly, in CRAN's Debian R-devel (2018-12-10) check platform, errors of the form "length > 1 in coercion to logical" occurred when either argument to `&&` or `||` was not of length 1 (e.g. `nzchar(letters) && length(letters)`). In addition to fixing these errors, version 6.2.1 also removes a problematic link from the vignette.
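The `&&`/`||` failure mode behind the 6.2.1 hotfix is reproducible in base R without drake, and the fix pattern is the same one applied in the package:

```r
# Both operands of && must reduce to a single logical value.
# nzchar(letters) has length 26, so strict R builds refuse to coerce it:
#   nzchar(letters) && length(letters)   # error under length-1 condition checks

# Reducing each operand to one logical restores the intended check:
ok <- all(nzchar(letters)) && length(letters) > 0
ok  # TRUE
```

The general rule: wrap vector operands in `all()` or `any()` and compare lengths explicitly before combining them with `&&` or `||`.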
sep
argument to gather_by()
,
reduce_by()
, reduce_plan()
,
evaluate_plan()
, expand_plan()
,
plan_analyses()
, and plan_summaries()
. Allows
the user to set the delimiter for generating new target names.hasty_build
argument to make()
and drake_config()
. Here, the user can set the function
that builds targets in “hasty mode”
(make(parallelism = "hasty")
).drake_envir()
function that returns the
environment where drake
builds targets. Can only be
accessed from inside the commands in the workflow plan data frame. The
primary use case is to allow users to remove individual targets from
memory at predetermined build steps.tibble
2.0.0.0s
from
predict_runtime(targets_only = TRUE)
when some targets are
outdated and others are not.sort(NULL)
warnings from
create_drake_layout()
. (Affects R-3.3.x.)evaluate
,
formatR
, fs
, future
,
parallel
, R.utils
, stats
, and
stringi
.parse()
in code_dependencies()
.memory_strategy
(previously pruning_strategy
)
to "speed"
(previously "lookahead"
).drake_config()
(config$layout
) just to store the code analysis results.
This is an intermediate structure between the workflow plan data frame
and the graph. It will help clean up the internals in future
development.label
argument to future()
inside
make(parallelism = "future")
. That way , job names are
target names by default if job.name
is used correctly in
the batchtools
template file.dplyr
,
evaluate
, fs
, future
,
magrittr
, parallel
, R.utils
,
stats
, stringi
, tidyselect
, and
withr
.rprojroot
from “Suggests”.force
argument in all functions except
make()
and drake_config()
.prune_envir()
to
manage_memory()
.pruning_strategy
argument to
memory_strategy
(make()
and
drake_config()
).console_log_file
in
real time (#588).vis_drake_graph()
hover text to
display commands in the drake
plan more elegantly.predict_load_balancing()
and remove its
reliance on internals that will go away in 2019 via #561.worker
column of
config$plan
in predict_runtime()
and
predict_load_balancing()
. This functionality will go away
in 2019 via #561.predict_load_balancing()
to time
and
workers
.predict_runtime()
and
predict_load_balancing()
up to date.drake_session()
and rename to
drake_get_session_info()
.timeout
argument in the API of
make()
and drake_config()
. A value of
timeout
can be still passed to these functions without
error, but only the elapsed
and cpu
arguments
impose actual timeouts now.map_plan()
function to easily create
a workflow plan data frame to execute a function call over a grid of
arguments.plan_to_code()
function to turn
drake
plans into generic R scripts. New users can use this
function to better understand the relationship between plans and code,
and unsatisfied customers can use it to disentangle their projects from
drake
altogether. Similarly,
plan_to_notebook()
generates an R notebook from a
drake
plan.drake_debug()
function to run a target’s
command in debug mode. Analogous to drake_build()
.mode
argument to trigger()
to
control how the condition
trigger factors into the decision
to build or skip a target. See the ?trigger
for
details.sleep
argument to make()
and
drake_config()
to help the main process consume fewer
resources during parallel processing.caching
argument for the
"clustermq"
and "clustermq_staged"
parallel
backends. Now,
make(parallelism = "clustermq", caching = "main")
will do
all the caching with the main process, and
make(parallelism = "clustermq", caching = "worker")
will do
all the caching with the workers. The same is true for
parallelism = "clustermq_staged"
.append
argument to
gather_plan()
, gather_by()
,
reduce_plan()
, and reduce_by()
. The
append
argument control whether the output includes the
original plan
in addition to the newly generated rows.load_main_example()
,
clean_main_example()
, and
clean_mtcars_example()
- New `filter` argument to `gather_by()` and `reduce_by()` to restrict what we gather even when `append` is `TRUE`.
- `make(parallelism = "hasty")` skips all of `drake`'s expensive caching and checking. All targets run every single time and you are responsible for saving results to custom output files, but almost all the by-target overhead is gone.
- Call `path.expand()` on the `file` argument to `render_drake_graph()` and `render_sankey_drake_graph()`. That way, tildes in file paths no longer interfere with the rendering of static image files.
- Allow `evaluate_plan(trace = TRUE)` followed by `expand_plan()`, `gather_plan()`, `reduce_plan()`, `gather_by()`, or `reduce_by()`. The more relaxed behavior also gives users more options for constructing and maintaining their workflow plan data frames.
- In `"future"` parallelism, make sure files travel over network file systems before proceeding to downstream targets.
- Account for the case in which the `visNetwork` package is not installed.
- Skip `make_targets()` if all the targets are already up to date.
- `seed` argument in `make()` and `drake_config()`.
- Default the `caching` argument of `make()` and `drake_config()` to `"main"` rather than `"worker"`. The default option should be the lower-overhead option for small workflows. Users have the option to make a different set of tradeoffs for larger workflows.
- Allow the `condition` trigger to evaluate to non-logical values as long as those values can be coerced to logicals.
- Require the `condition` trigger to evaluate to a vector of length 1.
- `drake_plan_source()`.
- `make(verbose = 4)` now prints to the console when a target is stored.
- `gather_by()` and `reduce_by()` now gather/reduce everything if no columns are specified.
- Previously, `make(jobs = 4)` was equivalent to `make(jobs = c(imports = 4, targets = 4))`. Now, `make(jobs = 4)` is equivalent to `make(jobs = c(imports = 1, targets = 4))`. See issue #553 for details.
- `verbose` is at least 2.
- `load_mtcars_example()`.
- `hook` argument of `make()` and `drake_config()`.
- In `gather_by()` and `reduce_by()`, do not exclude targets with all `NA` gathering variables.
- `digest()` wherever possible. This puts old `drake` projects out of date, but it improves speed.
- The `stringi` package no longer compiles on R 3.2.0.
- In `code_dependencies()`, restrict the possible global variables to the ones mentioned in the new `globals` argument (turned off when `NULL`). In practical workflows, global dependencies are restricted to items in `envir` and proper targets in the plan. In `deps_code()`, the `globals` slot of the output list is now a list of candidate globals, not necessarily actual globals (some may not be targets or variables in `envir`).
- When calling `unlink()` in `clean()`, set `recursive` and `force` to `FALSE`. This should prevent the accidental deletion of whole directories.
- Previously, `clean()` deleted input-only files if no targets from the plan were cached. A patch and a unit test are included in this release.
- `loadd(not_a_target)` no longer loads every target in the cache.
- `igraph` vertex attribute (fixes #503).
- `knitr_in()` file code chunks.
- Avoid a `sort(NULL)` that caused warnings in R 3.3.3.
- `analyze_loadd()` was sometimes quitting with "Error: attempt to set an attribute on NULL".
- Do not run `digest::digest(file = TRUE)` on directories. Instead, set hashes of directories to `NA`. Users should still not use directories as file dependencies.
- Show output files in `vis_drake_graph()`. Previously, these files were missing from the visualization, but actual workflows worked just fine.
- Work around `codetools` failures in R 3.3 (add a `tryCatch()` statement in `find_globals()`).
- New `clustermq`-based parallel backend: `make(parallelism = "clustermq")`.
- `evaluate_plan(trace = TRUE)` now adds a `*_from` column to show the origins of the evaluated targets. Try `evaluate_plan(drake_plan(x = rnorm(n__), y = rexp(n__)), wildcard = "n__", values = 1:2, trace = TRUE)`.
- New functions `gather_by()` and `reduce_by()`, which gather on custom columns in the plan (or columns generated by `evaluate_plan(trace = TRUE)`) and append the new targets to the previous plan.
- Expose the `template` argument of `clustermq` functions (e.g. `Q()` and `workers()`) as an argument of `make()` and `drake_config()`.
- New `code_to_plan()` function to turn R scripts and R Markdown reports into workflow plan data frames.
- New `drake_plan_source()` function, which generates lines of code for a `drake_plan()` call. This `drake_plan()` call produces the plan passed to `drake_plan_source()`. The main purpose is visual inspection (we even have syntax highlighting via `prettycode`), but users may also save the output to a script file for the sake of reproducibility or simple reference.
- Deprecate `deps_targets()` in favor of a new `deps_target()` function (singular) that behaves more like `deps_code()`.
- `vis_drake_graph()` and `render_drake_graph()`.
- `vis_drake_graph()` using the "title" node column.
- `vis_drake_graph(collapse = TRUE)`.
- Make `dependency_profile()` show major trigger hashes side by side to tell the user whether the command, a dependency, an input file, or an output file changed since the last `make()`.
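The `*_from` trace idea can be sketched in base R. Below, `evaluate_wildcard()` is a hypothetical stand-in for `evaluate_plan()`, not drake's implementation: it expands each command over the wildcard values and records which original target produced each new one.

```r
# Sketch of wildcard evaluation with a trace column, in base R.
evaluate_wildcard <- function(plan, wildcard, values) {
  out <- do.call(rbind, lapply(seq_len(nrow(plan)), function(i) {
    data.frame(
      target = paste(plan$target[i], values, sep = "_"),
      command = vapply(
        values,
        function(v) gsub(wildcard, v, plan$command[i], fixed = TRUE),
        character(1)
      ),
      from = plan$target[i],  # the *_from-style indicator column
      stringsAsFactors = FALSE
    )
  }))
  rownames(out) <- NULL
  out
}

plan <- data.frame(
  target = c("x", "y"),
  command = c("rnorm(n__)", "rexp(n__)"),
  stringsAsFactors = FALSE
)
evaluate_wildcard(plan, "n__", c("1", "2"))
# four targets (x_1, x_2, y_1, y_2) with n__ substituted,
# plus a "from" column tracing each back to x or y
```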
- `txtq` package is installed.
- Improve the help files of `loadd()` and `readd()`, giving specific usage guidance in prose.
- `build_drake_graph()` and print to the console the ones that execute.
- `txtq` is not installed.
- Move `drake`'s code examples to the `drake-examples` GitHub repository and make `drake_example()` and `drake_examples()` download examples from there.
- New `show_output_files` argument to `vis_drake_graph()` and friends.
- `"clustermq_staged"` and `"future_lapply"`.
- Use the `igraph` attributes of the dependency graph to allow for smarter dependency/memory management during `make()`.
- Allow `vis_drake_graph()` and `sankey_drake_graph()` to save static image files via `webshot`.
- Deprecate `static_drake_graph()` and `render_static_drake_graph()` in favor of `drake_ggraph()` and `render_drake_ggraph()`.
- New `columns` argument to `evaluate_plan()` so users can evaluate wildcards in columns other than the `command` column of `plan`.
- `target()` so users do not have to (explicitly).
- New functions `sankey_drake_graph()` and `render_sankey_drake_graph()`.
- New functions `static_drake_graph()` and `render_static_drake_graph()` for `ggplot2`/`ggraph` static graph visualizations.
- New `group` and `clusters` arguments to `vis_drake_graph()`, `static_drake_graph()`, and `drake_graph_info()` to optionally condense nodes into clusters.
- New `trace` argument to `evaluate_plan()` to optionally add indicator columns to show which targets got expanded/evaluated with which wildcard values.
- Rename the `always_rename` argument to `rename` in `evaluate_plan()`.
- New `rename` argument to `expand_plan()`.
- `make(parallelism = "clustermq_staged")`, a `clustermq`-based staged parallelism backend (see #452).
- `make(parallelism = "future_lapply_staged")`, a `future`-based staged parallelism backend (see #450).
- Use `codetools` rather than `CodeDepends` for finding global variables.
- Detect `loadd()` and `readd()` dependencies in `knitr` reports referenced with `knitr_in()` inside imported functions. Previously, this feature was only available in explicit `knitr_in()` calls in commands.
- `drake_plan()`s.
- `inst/hpc_template_files`.
- Deprecate `drake_batchtools_tmpl_file()` in favor of `drake_hpc_template_file()` and `drake_hpc_template_files()`.
- New `garbage_collection` argument to `make()`. If `TRUE`, `gc()` is called after every new build of a target.
- `sanitize_plan()` in `make()`.
- Change `tracked()` to accept only a `drake_config()` object as an argument. Yes, it is technically a breaking change, but it is only a small break, and it is the correct API choice.
- `DESCRIPTION` file.
- `knitr` reports without warnings.
- For `lapply`-like backends, `drake` uses persistent workers and a main process. In the case of `"future_lapply"` parallelism, the main process is a separate background process called by `Rscript`.
- `make()`s. (Previously, there were "check" messages and a call to `staged_parallelism()`.)
- `make(parallelism = c(imports = "mclapply_staged", targets = "mclapply"))`.
- Targets are now kept in memory until no downstream target needs them (for `make(jobs = 1)`).
- Rework `predict_runtime()`. The new version is a more sensible way to go about predicting runtimes with multiple jobs, and likely to be more accurate.
- `make()` no longer leaves targets in the user's environment.
- Deprecate the `imports_only` argument to `make()` and `drake_config()` in favor of `skip_targets`.
- `migrate_drake_project()`.
- `max_useful_jobs()`.
- New `upstream_only` argument to `failed()` so users can list failed targets that do not have any failed dependencies. Naturally accompanies `make(keep_going = TRUE)`.
- Remove `plyr` as a dependency.
- `drake_plan()` and `bind_plans()`.
- New `target()` function to help create drake plans with custom columns.
- In `drake_gc()`, clean out disruptive files in `storr`s with mangled keys (re: #198).
- Deprecate `load_basic_example()` in favor of `load_mtcars_example()`.
- Base the `README.md` file on the main example rather than the mtcars example.
- New `README.Rmd` file to generate `README.md`.
- New `deps_targets()` function.
- Deprecate `deps()` in favor of `deps_code()`.
- New `pruning_strategy` argument to `make()` and `drake_config()` so the user can decide how `drake` keeps non-import dependencies in memory when it builds a target.
- `drake` plans to help users customize scheduling.
- New `makefile_path` argument to `make()` and `drake_config()` to avoid potential conflicts between user-side custom `Makefile`s and the one written by `make(parallelism = "Makefile")`.
- New `console` argument to `make()` and `drake_config()` so users can redirect console output to a file.
- New `show_source()` function, plus `readd(show_source = TRUE)` and `loadd(show_source = TRUE)`.
- The `!!` operator from tidyeval and `rlang` is parsed differently than in R <= 3.4.4. This change broke one of the tests in `tests/testthat/tidy-eval.R`. The main purpose of `drake`'s 5.1.2 release is to fix the broken test.
- Fix an `R CMD check` error from building the PDF manual with LaTeX.
- In `drake_plan()`, allow users to customize target-level columns using `target()` inside the commands.
- New `bind_plans()` function to concatenate the rows of drake plans and then sanitize the aggregate plan.
- New `session` argument to tell `make()` to build targets in a separate, isolated main R session. For example, `make(session = callr::r_vanilla)`.
- New `reduce_plan()` function to do pairwise reductions on collections of targets.
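The pairwise-reduction idea can be sketched in base R. `pairwise_reduce()` below is illustrative, not drake's implementation: it combines target names two at a time, generating one command per pairing, until a single target remains.

```r
# Sketch of a pairwise reduction over a collection of targets.
pairwise_reduce <- function(targets, op = "+") {
  rows <- list()
  step <- 0
  while (length(targets) > 1) {
    step <- step + 1
    # Group the current targets into consecutive pairs.
    pairs <- split(targets, ceiling(seq_along(targets) / 2))
    targets <- vapply(seq_along(pairs), function(i) {
      name <- paste0("step", step, "_", i)
      p <- pairs[[i]]
      # An unpaired leftover target just passes through.
      cmd <- if (length(p) == 2) paste(p[1], op, p[2]) else p[1]
      rows[[name]] <<- data.frame(target = name, command = cmd,
                                  stringsAsFactors = FALSE)
      name
    }, character(1))
  }
  do.call(rbind, rows)
}

pairwise_reduce(c("x_1", "x_2", "x_3", "x_4"))
# three rows: x_1 + x_2, x_3 + x_4, then step1_1 + step1_2
```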
- Prevent the dot symbol (`.`) from being a dependency of any target or import. This enforces more consistent behavior in the face of the current static code analysis functionality, which sometimes detects `.` and sometimes does not.
- New `ignore()` function to optionally ignore pieces of workflow plan commands and/or imported functions. Use `ignore(some_code)` to tell `drake` not to track dependencies in `some_code`, and to ignore changes in `some_code` when it comes to deciding which targets are out of date.
- Tell `drake` to only look for imports in environments inheriting from `envir` in `make()` (plus explicitly namespaced functions).
- Tell `loadd()` to ignore foreign imports (imports not explicitly found in `envir` when `make()` last imported them).
- Change `loadd()` so that only targets (not imports) are loaded if the `...` and `list` arguments are empty.
- Write a `.gitignore` file containing `"*"` to the default `.drake/` cache folder every time `new_cache()` is called. This means the cache will not be automatically committed to git. Users need to remove the `.gitignore` file to allow unforced commits, and then subsequent `make()`s on the same cache will respect the user's wishes and not add another `.gitignore`. This only works for the default cache; it is not supported for manual `storr`s.
- New `"future"` backend with a manual scheduler.
- Add `dplyr`-style `tidyselect` functionality in `loadd()`, `clean()`, and `build_times()`. For `build_times()`, there is an API change: for `tidyselect` to work, we needed to insert a new `...` argument as the first argument of `build_times()`.
- New `file_in()` for file inputs to commands or imported functions (for imported functions, the input file needs to be an imported file, not a target).
- New `file_out()` for output file targets (ignored if used in imported functions).
- New `knitr_in()` for `knitr`/`rmarkdown` reports. This tells `drake` to look inside the source file for target dependencies in code chunks (explicitly referenced with `loadd()` and `readd()`). Treated as a `file_in()` if used in imported functions.
- Change `drake_plan()` so that it automatically fills in any target names that the user does not supply. Also, any `file_out()`s become the target names automatically (double-quoted internally).
- Make `read_drake_plan()` (rather than an empty `drake_plan()`) the default `plan` argument in all functions that accept a `plan`.
- New `loadd(..., lazy = "bind")` option. That way, when you have a target loaded in one R session and hit `make()` in another R session, the target in your first session will automatically update.
- `dataframes_graph()`.
- `diagnose()` will take on the role of returning this metadata.
- Deprecate the `read_drake_meta()` function in favor of `diagnose()`.
- New `expose_imports()` function to optionally force `drake` to detect deeply nested functions inside specific packages.
- Change `drake_build()` to be an exclusively user-side function.
- New `replace` argument to `loadd()` so that objects already in the user's environment need not be replaced.
- New `seed` argument to `make()`, `drake_config()`, and `load_basic_example()`. Also hard-code a default seed of `0`. That way, the pseudo-randomness in projects should be reproducible across R sessions.
- New `drake_read_seed()` function to read the seed from the cache. Its examples illustrate what `drake` is doing to try to ensure reproducible random numbers.
- Support the `!!` operator for the `...` argument to `drake_plan()`. Suppress this behavior using `tidy_evaluation = FALSE` or by passing commands through the `list` argument.
- Preprocess commands with `rlang::expr()` before evaluating them. That means you can use the quasiquotation operator `!!` in your commands, and `make()` will evaluate them according to the tidy evaluation paradigm.
- New examples `drake_example("basic")`, `drake_example("gsp")`, and `drake_example("packages")` to demonstrate how to set up the files for serious `drake` projects. More guidance was needed in light of #193.
- `drake_plan()` in the help file (`?drake_plan`).
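The per-target seed idea can be sketched in base R. drake derives a target-specific seed from the global seed and the target's name; the derivation below (summing character codes) is illustrative only, not drake's actual algorithm.

```r
# Sketch: reproducible per-target random numbers from one global seed.
# target_seed() and draw() are hypothetical illustrations.
target_seed <- function(global_seed, target) {
  global_seed + sum(utf8ToInt(target))  # toy derivation, for illustration
}
draw <- function(global_seed, target, n = 3) {
  set.seed(target_seed(global_seed, target))
  rnorm(n)
}

# The same global seed and target name always give the same draws,
# across R sessions:
identical(draw(0, "x"), draw(0, "x"))  # TRUE
identical(draw(0, "x"), draw(0, "y"))  # FALSE: distinct targets differ
```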
- Move `drake` to the rOpenSci GitHub URL.
- `config` argument, which you can get from `drake_config()` or `make()`. Examples:
- Use `cache$exists()` instead.
- `make()` decides to build targets.
- Overhaul the `storr` cache in a way that is not back-compatible with projects from versions 4.4.0 and earlier. The main change is to make more intelligent use of `storr` namespaces, improving efficiency (both time and storage) and opening up possibilities for new features. If you attempt to run drake >= 5.0.0 on a project from drake <= 4.0.0, drake will stop you before any damage to the cache is done, and you will be instructed how to migrate your project to the new drake.
- Use `formatR::tidy_source()` instead of `parse()` in `tidy_command()` (originally `tidy()` in `R/dependencies.R`). Previously, `drake` was having problems with an edge case: as a command, the literal string `"A"` was interpreted as the symbol `A` after tidying. With `tidy_source()`, literal quoted strings stay literal quoted strings in commands. This may put some targets out of date in old projects, yet another loss of back compatibility in version 5.0.0.
- New `rescue_cache()` function, exposed to the user and used in `clean()`. This function removes dangling orphaned files in the cache so that a broken cache can be cleaned and used in the usual ways once more.
- Change the default `cpu` and `elapsed` arguments of `make()` to `NULL`. This solves an elusive bug in how drake imposes timeouts.
- New `graph` argument to functions `make()`, `outdated()`, and `missed()`.
- New `prune_graph()` function for `igraph` objects.
- Deprecate `prune()` and `status()`.
- Deprecate and rename the following functions:
  - `analyses()` => `plan_analyses()`
  - `as_file()` => `as_drake_filename()`
  - `backend()` => `future::plan()`
  - `build_graph()` => `build_drake_graph()`
  - `check()` => `check_plan()`
  - `config()` => `drake_config()`
  - `evaluate()` => `evaluate_plan()`
  - `example_drake()` => `drake_example()`
  - `examples_drake()` => `drake_examples()`
  - `expand()` => `expand_plan()`
  - `gather()` => `gather_plan()`
  - `plan()`, `workflow()`, `workplan()` => `drake_plan()`
  - `plot_graph()` => `vis_drake_graph()`
  - `read_config()` => `read_drake_config()`
  - `read_graph()` => `read_drake_graph()`
  - `read_plan()` => `read_drake_plan()`
  - `render_graph()` => `render_drake_graph()`
  - `session()` => `drake_session()`
  - `summaries()` => `plan_summaries()`
- Drop support for `output` and `code` as names in the workflow plan data frame. Use `target` and `command` instead. This naming switch has been formally deprecated for several months prior.
- New functions `drake_quotes()`, `drake_unquote()`, and `drake_strings()` to remove the silly dependence on the `eply` package.
- New `skip_safety_checks` flag to `make()` and `drake_config()`. Increases speed.
- In `sanitize_plan()`, remove rows with blank targets `""`.
- New `purge` argument to `clean()` to optionally remove all target-level information.
- New `namespace` argument to `cached()` so users can inspect individual `storr` namespaces.
- Change `verbose` to numeric: 0 = print nothing, 1 = print progress on imports only, 2 = print everything.
- New `next_stage()` function to report the targets to be made in the next parallelizable stage.
- New `session_info` argument to `make()`. Apparently, `sessionInfo()` is a bottleneck for small `make()`s, so there is now an option to suppress it. This is mostly for the sake of speeding up unit tests.
- New `log_progress` argument to `make()` to suppress progress logging. This increases storage efficiency and speeds some projects up a tiny bit.
- New `namespace` argument to `loadd()` and `readd()`. You can now load and read from non-default `storr` namespaces.
- Add `drake_cache_log()`, `drake_cache_log_file()`, and `make(..., cache_log_file = TRUE)` as options to track changes to targets/imports in the drake cache.
- `rmarkdown::render()`, not just `knit()`.
- `drake` properly.
- Allow `plot_graph()` to display subcomponents. Check out the arguments `from`, `mode`, `order`, and `subset`. The graph visualization vignette has demonstrations.
- New `"future_lapply"` parallelism: parallel backends supported by the `future` and `future.batchtools` packages. See `?backend` for examples and the parallelism vignette for an introductory tutorial. More advanced instruction can be found in the `future` and `future.batchtools` packages themselves.
- New `diagnose()` function.
- New `hook` argument to `make()` to wrap around `build()`. That way, users can more easily control the side effects of distributed jobs. For example, to redirect error messages to a file in `make(..., parallelism = "Makefile", jobs = 2, hook = my_hook)`, `my_hook` should be something like `function(code){withr::with_message_sink("messages.txt", code)}`.
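The hook idea — a function that receives the build code unevaluated and decides how to run it — can be sketched in base R without `withr`. `silent_hook()` below is illustrative, not drake's API:

```r
# Sketch: a hook wraps around the build and controls its side effects.
# Here the hook captures messages instead of letting them print.
silent_hook <- function(code) {
  log <- character(0)
  value <- withCallingHandlers(
    force(code),  # run the build (the promise is evaluated here)
    message = function(m) {
      log <<- c(log, conditionMessage(m))  # record the message
      invokeRestart("muffleMessage")       # and keep it off the console
    }
  )
  list(value = value, log = log)
}

out <- silent_hook({
  message("building target x")
  1 + 1
})
out$value  # 2
out$log    # the captured message, not printed to the console
```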
- `drake` was previously using the `outfile` argument for PSOCK clusters to generate output that could not be caught by `capture.output()`. It was a hack that should have been removed before.
- `make()` and `outdated()` print "All targets are already up to date" to the console.
- `"future_lapply"` backends.
- `plot_graph()` and `progress()`. Also see the new `failed()` function, which is similar to `in_progress()`.
- `parLapply` parallelism. The downside to this fix is that `drake` has to be properly installed. It should not be loaded with `devtools::load_all()`. The speedup comes from lightening the first `clusterExport()` call in `run_parLapply()`. Previously, we exported every single individual `drake` function to all the workers, which created a bottleneck. Now, we just load `drake` itself in each of the workers, which works because `build()` and `do_prework()` are exported.
- Default `overwrite` to `FALSE` in `load_basic_example()`.
- `report.Rmd` in `load_basic_example()`.
- `get_cache(..., verbose = TRUE)`.
- `lightly_parallelize()` and `lightly_parallelize_atomic()`. Now, processing happens faster, and only over the unique values of a vector.
- New `make_with_config()` function to do the work of `make()` on an existing internal configuration list from `drake_config()`.
- New `drake_batchtools_tmpl_file()` to write a `batchtools` template file from one of the examples (`drake_example()`), if one exists.

Version 4.3.0 has:

- Reproducible random numbers (#56)
- Automatic detection of `knitr` dependencies (#9)
- More vignettes
- Bug fixes
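The unique-values optimization behind `lightly_parallelize_atomic()` can be sketched in base R: process only the unique values of a vector, then map the results back with `match()`. This is the concept, not drake's code.

```r
# Sketch: compute over unique values only, then expand the results
# back to the full vector. process_unique() is illustrative.
process_unique <- function(x, f) {
  ux <- unique(x)
  res <- lapply(ux, f)  # could be parallel::mclapply() in real use
  res[match(x, ux)]     # expand back to the original length/order
}

x <- c("a", "b", "a", "a", "b")
process_unique(x, toupper)  # f runs twice, not five times
```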
Version 4.2.0 will be released today. There are several improvements to code style and performance. In addition, there are new features such as cache/hash externalization and runtime prediction. See the new storage and timing vignettes for details. This release has automated checks for back-compatibility with existing projects, and I also did manual back compatibility checks on serious projects.
Version 3.0.0 is coming out. It manages environments more intelligently so that the behavior of `make()` is more consistent with evaluating your code in an interactive session.
Version 1.0.1 is on CRAN! I’m already working on a massive update, though. 2.0.0 is cleaner and more powerful.