* tar_meta() default settings for tar_read() etc. (for million-target pipelines).
* Use the "terse" reporter by default if the calling session is non-interactive. This will hopefully avoid problems on CRAN for packages that use targets with the default settings.
* tar_renv() (#1506, @valentingar).
* rstudioapi::isAvailable() (#1519, @dipterix).
* Use qmethod = "escape" to avoid https://github.com/Rdatatable/data.table/issues/3509 (#1480, @koefoeden).
* Ensure error = "trim" does not hang when the errored target has a long chain of reverse dependencies (#1481, @koefoeden).
* Remove "rlib_error_package_not_found" from errors (#1484, @malcolmbarrett). This and #1354 are unfortunate consequences of #997.
* Call suppressPackageStartupMessages() once for the whole pipeline. Repeated target-specific calls may be slow, and the messages themselves are cumbersome. This is an appropriate tradeoff.
* tar_debug_instructions().
* Use cli::cli_text() instead of cli::cli_progress_output() (#1478, @dipterix).
* tar_make() (#1482).
* Check _targets/objects/ time stamps only for local builders mentioned in the metadata, as opposed to everything in that directory (#1482).
* If format is "file" and repository is not "local", the local file is no longer deleted after upload (#1467).
* rstudio_available() returns FALSE without error if rstudioapi is not installed.
* Add a new "terse" reporter, which is the "balanced" reporter without the progress bar. Make "terse" the default reporter.
* Deprecate the priority argument of tar_target(). Because of #1458, custom priorities no longer have an effect on execution order. However, up-to-date parallelized pipelines with 100000+ targets can now be checked around 10 times faster, so the tradeoff is worth it. As a workaround, you can send high-priority targets to one or more special crew controllers in a controller group (details: https://books.ropensci.org/targets/crew.html#heterogeneous-workers; see the sketch after this list).
* Keep format = "file" files on disk even for non-local repositories (#1467).
* In tar_option_get(), set repository_meta to "local" by default, regardless of repository (#1427).
* In tar_option_get(), set storage = "worker", retrieval = "auto", and memory = "auto" by default (#1426). For memory, "auto" is now equivalent to "transient" most of the time, but it is equivalent to "persistent" for non-dynamic targets that other targets dynamically branch over. For retrieval, the "auto" setting is new. It is equivalent to "worker" for most cases, but it aligns with "main" for dynamic branches that branch over non-dynamic targets. All this is to avoid re-reading the upstream target from disk every time a branch needs to run.
* tar_make() and tar_outdated().
* Default the garbage_collection argument of tar_option_get() to 0 (#1464).
* crew controllers are saturated.
* Choose default settings for storage, retrieval, and memory that balance resource tradeoffs for the most common pipelines (#1426).
* Run gc() from targets:::target_run() (#1464). There is no longer a separate gc() call on the main process.
* store_sync_file_meta() in the general case.
* Upload workspaces to the cloud if tar_option_get("repository_meta") is "aws" or "gcp". Download them with tar_workspace_download() and delete them with tar_destroy(destroy = "all") or tar_destroy(destroy = "cloud").
* format = "auto" (#1425, @paulseamer).
* store_read_path.tar_auto() (#1429, @paulseamer).
* Fix iteration = "group" branching problems.
* Call cli::style_reset() at the end of non-silent reporters (#1450, @r2evans).
* Scale the visNetwork graph based on the number of hierarchical levels and the maximum number of vertices per level (#1432).
* In tar_visnetwork(), choose the colors of the edges based on the origin vertices, not the destination vertices (#1433).
* In the "verbose" and "timestamp" reporters, print "dispatched pattern" messages, and print the total computation and storage summed over all the branches.
* Add a new "balanced" reporter with a cli progress bar (#1442).
* Deprecate the "forecast", "forecast_interactive", "verbose_positives", and "timestamp_positives" reporters (#1442).
* callr process (#1442).
* Add tar_option_with() (https://github.com/ropensci/tarchetypes/issues/215, @noamross).
* Use prettyunits to print elapsed times and file sizes.
* tar_make() error message.
* Add an on_worker argument to target_run() and builder_unload_value() so the latter only removes the target value if the target was actually run on a worker.
* R6 classes.
* crew task retries.
* Handle NA buckets in store_delete_objects.tar_aws() and store_delete_objects.tar_gcp().
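Below is a minimal sketch of that controller-group workaround, assuming a local crew setup. The controller names and compute_summary() are hypothetical, not part of the release itself.

```r
# _targets.R: route a high-priority target to a dedicated crew controller
# instead of relying on the deprecated priority argument.
library(targets)
library(crew)
tar_option_set(
  controller = crew_controller_group(
    crew_controller_local(name = "urgent", workers = 1L),
    crew_controller_local(name = "bulk", workers = 4L)
  )
)
list(
  tar_target(
    report_data,
    compute_summary(), # hypothetical user-defined function
    resources = tar_resources(
      crew = tar_resources_crew(controller = "urgent")
    )
  )
)
```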
These changes invalidate certain targets in a pipeline and cause them to rerun on the next tar_make().

* Shorten tar_repository_cas() output strings to reduce the size of pipeline metadata (#1390).
* Shorten tar_format() output strings to reduce the size of pipeline metadata (#1390).

tar_make() and tar_outdated() run much faster in this release. Extensive profiling was done on a real-world simulation pipeline with 66002 up-to-date targets. For tar_make() using all the default settings:
| Machine    | Before (seconds) | After (seconds) | Speedup  |
|------------|------------------|-----------------|----------|
| M2 Macbook | 413.16           | 35.538          | 11.62587 |
| RHEL9      | 450.66           | 94.08           | 4.790    |
And for tar_outdated() using all the default settings:
| Machine    | Before (seconds) | After (seconds) | Speedup  |
|------------|------------------|-----------------|----------|
| M2 Macbook | 91.314           | 16.636          | 5.48894  |
| RHEL9      | 167.809          | 37.395          | 4.487472 |
To take advantage of these speed gains for an existing pipeline, you may have to run tar_make() to convert the time stamps and file sizes to a new format. This initial tar_make() is slow, but subsequent tar_make() calls should be much faster than before the upgrade.
* Speed up tar_make() and tar_outdated() by avoiding excessive buffering and disk writes for metadata and reporters when the pipeline is just skipping targets.
* tar_runtime$file_info (#1398).
* Add a "forecast_interactive" reporter to tar_outdated() to choose "forecast" for interactive sessions and "silent" for non-interactive ones.
* Add a seconds_reporter_outdated argument to tar_config_set() with a default of 1 to control the time interval of the reporter of tar_outdated() and other passive algorithm functions.
* path vectors with cloud metadata (#1382, @n8layman).
* Use ps::ps_disk_partitions() and ps::ps_fs_mount_point().
* _targets/objects/ paths in metadata for CAS repositories (#1391).
* Require igraph >= 2.1.2.
* format = "file_fast" (#1339, @koefoeden).
* error = "trim" (#1340, @koefoeden).
* Allow garbage_collection to be a non-negative integer to control the frequency of garbage collection in a performant, convenient, unified way (#1351).
* Deprecate the garbage_collection argument of tar_make(), tar_make_future(), and tar_make_clustermq() (#1351).
* Profile target_run(), target_prepare(), and target_conclude() using autometric.
* Pass "vctrs_error_subscript_oob" to rlang::abort() (#1354, @Jiefei-Wang).
* Skip store_assert_format() and store_convert_object() if storage is "none".
* Add a list() method to tar_repository_cas() to make it easier and more efficient to specify custom CAS repositories (#1366).
* memory is "transient" (#1364).
* Replace the memory class with the new lookup class.
* Add memory = "auto" to select transient memory for dynamic branches and persistent memory for other targets (#1371).
* retrieval is "main" and only a bud is actually used. The same cannot be done with branches because each branch may need to be (un)marshaled individually.
* retrieval is "worker" and the whole pattern is part of the subpipeline.
* Migrate format = "qs" from qs to qs2 (#1373).
* Add tar_unblock_process().
* Add "keepNA" and "keepInteger" to .deparseOpts() (#1375). This may cause existing pipelines to rerun, but it makes add-ons like tarchetypes::tar_map() much easier to use.
* Wrap the tar_watch() UI module in bslib::page() (#1302, @kwbyron-lilly).
* callr_function in tar_make_as_job() argument list.
* Ensure storage = "worker" is respected when the process of storing an object generates an error (#1304, @multimeric).
* _targets.R pattern in tar_branches() (#1306, @multimeric, @mattwarkentin).
* tar_prune() (#1312, @benzipperer).
* Default the workspace_on_error option to TRUE (#1310, @hadley).
* Improve the error = "stop" error message.
* No longer write to _targets/objects for error = "null". Instead, switch to a special "null" storage format class if error is "null" and the target throws an error. This should allow users to more freely create new formats with tar_format() without worrying about how to handle NULL objects created by error = "null".
* Add format = "auto" (#1311, @hadley).
* Replace the pingr dependency with base::socketConnection() for local URL utilities (#1317, #1318, @Adafede).
* Add tar_repository_cas(), tar_repository_cas_local(), and tar_repository_cas_local_gc() for content-addressable storage (#1232, #1314, @noamross). See the sketch after this list.
* Add tar_format_get() to make implementing CAS systems easier.
* Add error = "trim" in tar_target() and tar_option_set() (#1310, #1311, @hadley).
* Deprecate format = "file_fast" in favor of the above (#1315).
* Deprecate trust_object_timestamps in favor of the more unified trust_timestamps in tar_option_set() (#1315).
* Consolidate the help files of tar_target() and tar_target_raw(). Same with tar_load() and tar_load_raw().
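As a quick illustration of the content-addressable storage entries above, here is a hedged sketch. It assumes tar_repository_cas_local() accepts the path of a local CAS directory; objects live at hash-derived paths, so rolling back the metadata keeps historical values readable with tar_read().

```r
# _targets.R: a minimal sketch of local content-addressable storage (CAS).
library(targets)
tar_option_set(repository = tar_repository_cas_local("cas"))
list(
  tar_target(data, runif(100)),
  tar_target(data_mean, mean(data))
)
```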
* Add a substitute argument to tar_format() to make it easier to write custom storage formats without metaprogramming.
* Use bslib in tar_watch().
* Speed up target_upstream_edges() and pipeline_upstream_edges() by avoiding data frames until the last minute (17% speedup for certain kinds of large pipelines).
* Default as_job to FALSE in tar_make() if rstudioapi and/or RStudio is not available.
* Use secretbase::siphash13() instead of digest(algo = "xxhash64", serializationVersion = 3) so hashes of in-memory objects no longer depend on serialization version 3 headers (#1244, @shikokuchuo). Unfortunately, pipelines built with earlier versions of targets will need to rerun.
* Detect when the pipeline metadata was created with an earlier version of targets and changes to the package will cause the current work to rerun (#1244). For the tar_make*() functions, utils::menu() prompts the user to give people a chance to downgrade if necessary.
* data.table::fread(), then convert them to the correct types afterwards.
* Add a tar_resources_custom_format() function which can pass environment variables to customize the behavior of custom tar_format() storage formats (#1263, #1232, @Aariq, @noamross).
* extras in tar_renv().
* tar_target() gains a description argument for free-form text describing what the target is about (#1230, #1235, #1236, @tjmahr). See the sketch after this list.
* tar_visnetwork(), tar_glimpse(), tar_network(), tar_mermaid(), and tar_manifest() now optionally show target descriptions (#1230, #1235, #1236, @tjmahr).
* tar_described_as() is a new wrapper around tidyselect::any_of() to select specific subsets of targets based on the description rather than the name (#1136, #1196, @noamross, @mattmoo).
* names argument (nudge users toward tidyselect expressions).
* Fix an arrow-related CRAN check NOTE.
* use_targets() only writes the _targets.R script. The run.sh and run.R scripts are superseded by the as_job argument of tar_make(). Users not using the RStudio IDE can call tar_make() with callr_function = callr::r_bg to run the pipeline as a background process. tar_make_clustermq() and tar_make_future() are superseded in favor of tar_make(use_crew = TRUE), so template files are no longer written for the former automatically.
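A small sketch of the description feature above; the target names, file, and description strings are hypothetical.

```r
# _targets.R: free-form target descriptions, selectable downstream.
library(targets)
list(
  tar_target(raw, read.csv("data.csv"), description = "raw survey data"),
  tar_target(clean, na.omit(raw), description = "cleaned survey data")
)
# Then, interactively, select targets by description rather than name:
# tar_manifest(names = tar_described_as(starts_with("cleaned")))
```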
Because of the changes below, upgrading to this version of targets will unavoidably invalidate previously built targets in existing pipelines. Your pipeline code should still work, but any targets you ran before will most likely need to rerun after the upgrade.

* In tar_seed_create(), use secretbase::sha3(x = TARGET_NAME, bits = 32L, convert = NA) to generate target seeds that are more resistant to overlapping RNG streams (#1139, @shikokuchuo). The previous approach used a less rigorous combination of digest::digest(algo = "sha512") and digest::digest2int(). A sketch of the idea appears after this list.
* deployment argument of tar_target() to reflect the advent of crew (#1208, @psychelzh).
* Restore cli.num_colors on exit in tar_error() and tar_warning() (#1210, @dipterix).
* Handle seconds_timeout if the crew controller is actually a controller group (#1207, https://github.com/wlandau/crew.cluster/discussions/35, @stemangiola, @drejom).
* tar_make() gains an as_job argument to optionally run a targets pipeline as an RStudio job.
* Require igraph version 2.0.0 or above because igraph::get.edgelist() was deprecated in favor of igraph::as_edgelist().
* Use a backlog to manage saturated crew controllers (or controller groups) (#1220). Use the new push_backlog() and pop_backlog() crew methods to make this smooth.
* Do not launch tar_make() if there is already a targets pipeline running on a local process on the same local data store. The local process is detected using the process ID and time stamp from tar_process() (with a 1.01-second tolerance for the time stamp).
* Suppress the pkgload::load_all() warning (#1218). Tried using .__DEVTOOLS__, but it interferes with reverse dependencies.
* Add an error message in tar_target_raw() to let users know that iteration = "group" is invalid for dynamic targets (ones with pattern = map(...) etc.; #1226, @bmfazio).
* Require clustermq version 0.9.2 or above.
* Improve tar_debug_instructions() tips for when commands are long.
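A sketch of the seed idea described above, assuming secretbase is installed. tar_seed_create() encapsulates the real logic; this only mirrors the approach.

```r
# With convert = NA, secretbase::sha3() returns an integer suitable for
# set.seed(), derived deterministically from the target name.
seed_for <- function(target_name) {
  secretbase::sha3(x = target_name, bits = 32L, convert = NA)
}
seed_for("my_target")
```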
Because of the changes below, upgrading to this version of targets will unavoidably invalidate previously built targets in existing pipelines. Your pipeline code should still work, but any targets you ran before will most likely need to rerun after the upgrade.

* See the tar_seed_create() help file for details and justification. Unfortunately, this change will invalidate all currently built targets because the seeds will be different. To avoid rerunning your whole pipeline, set cue = tar_cue(seed = FALSE) in tar_target(), as in the sketch after this list.
* Apply targets:::digest_chr64() in both cases before storing the result in the metadata.
* targets now tries to ensure that the up-to-date data objects in the cloud are in their newest versions. So if you roll back the metadata to an older version, you will still be able to access historical data versions with e.g. tar_read(), but the pipeline will no longer be up to date.
* Add tar_seed_create(), which creates target-specific pseudo-random number generator seeds.
* Expand the tar_seed_create() help file to justify and defend how targets and tarchetypes approach pseudo-random numbers.
* Add tar_seed_set(), which sets a seed and sets all the RNG algorithms to their defaults in the R installation of the user. Each target now uses tar_seed_set() to set its seed before running its R command (#1139).
* Deprecate tar_seed() in favor of the new tar_seed_get() function.
* tar_delete(), tar_destroy(), and tar_prune() now use efficient batched calls to delete_objects() instead of costly individual calls to delete_object() (#1171).
* Add a verbose argument to tar_delete(), tar_destroy(), and tar_prune().
* Add a batch_size argument to tar_delete(), tar_destroy(), and tar_prune().
* Add arguments page_size and verbose to tar_resources_aws() (#1172).
* Add a tar_unversion() function to remove version IDs from the metadata of cloud targets. This makes it easier to interact with just the current version of each target, as opposed to the version ID recorded in the local metadata.
* Support clustermq 0.9.0 (@mschubert).
* Deprecate tar_started() in favor of tar_dispatched() (#1192).
* Deprecate tar_built() in favor of tar_completed() (#1192).
* The crew scheduling algorithm no longer waits on saturated controllers, and targets that are ready are greedily dispatched to crew even if all workers are busy (#1182, #1192). To appropriately set expectations for users, reporters print "dispatched (pending)" instead of "dispatched" if the task load is backlogged at the moment.
* In the crew scheduling algorithm, waiting for tasks is now a truly event-driven process and consumes 5-10x less CPU resources (#1183). Only the auto-scaling of workers uses polling (with an inexpensive default polling interval of 0.5 seconds, configurable through seconds_interval in the controller).
* Add tar_config_projects() and tar_config_yaml() (#1153, @psychelzh).
* Call builder_wait_correct_hash() in target_conclude.tar_builder() (#1154, @gadenbuie).
* builder_error_null().
* Fix tar_meta_upload() and tar_meta_download() to avoid errors if one or more metadata files do not exist. Add a new argument strict to control error behavior.
* Add new arguments meta, progress, process, and crew to control individual metadata files in tar_meta_upload(), tar_meta_download(), tar_meta_sync(), and tar_meta_delete().
* Support crew 0.5.0.9003 (https://github.com/wlandau/crew/issues/131).
* Allow tar_read() etc. inside a pipeline whenever it uses a different data store (#1158, @MilesMcBain).
* Set seed = FALSE in future::future() (#1166, @svraka).
* Add a physics argument to tar_visnetwork() and tar_glimpse() (#925, @Bdblodgett-usgs).
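A minimal sketch of the tar_cue(seed = FALSE) workaround mentioned above; the target is hypothetical.

```r
# _targets.R: opt a target out of seed-based invalidation so the new
# seed algorithm does not force a rerun.
library(targets)
list(
  tar_target(
    simulation,
    rnorm(10),
    cue = tar_cue(seed = FALSE) # ignore seed changes for this target
  )
)
```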
Because of these changes, upgrading to this version of targets will unavoidably invalidate previously built targets in existing pipelines. Your pipeline code should still work, but any targets you ran before will most likely need to rerun after the upgrade.

* In the hash_deps() method of the metadata class, exclude symbols which are not actually dependencies, rather than just giving them empty strings. This change decouples the dependency hash from the hash of the target's command (#1108).
* Continuously upload metadata to the cloud during tar_make(), tar_make_clustermq(), and tar_make_future() (#1109). Upload the files to the repository specified in the repository_meta tar_option_set() option, and use the bucket and prefix set in the resources tar_option_set() option. repository_meta defaults to the existing repository tar_option_set() option. See the sketch after this list.
* Add tar_meta_download(), tar_meta_upload(), tar_meta_sync(), and tar_meta_delete() to directly manage cloud metadata outside the pipeline (#1109).
* tempdir() for #1103.
* Set path_scratch_dir_network() to file.path(tempdir(), "targets") and make sure tar_destroy("all") and tar_destroy("cloud") delete it.
* Render tar_mermaid() subgraphs with transparent fills and black borders.
* Fix database$get_data() to work with list columns.
* tarchetypes literate programming target factories like tar_render() and tar_quarto().
* In the hash_deps() method of the metadata class, use a new custom sort_chr() function which temporarily sets the LC_COLLATE locale to "C" for sorting. This ensures lexicographic comparisons are consistent across platforms (#1108).
* In tar_source(), use the file argument and keep.source = TRUE to help with interactive debugging (#1120).
* Deprecate seconds_interval in tar_config_get(), tar_make(), tar_make_clustermq(), and tar_make_future(). Replace it with seconds_meta (to control how often metadata gets saved) and seconds_reporter (to control how often to print messages to the R console) (#1119).
* Respect seconds_meta and seconds_reporter for writing metadata and console messages even for currently building targets (#1055).
* googleAuthR (#1112).
* For format = "url", only retry on the HTTP error codes above.
* Remove seconds_interval and seconds_timeout from tar_resources_url(), and implement max_tries arguments in tar_resources_aws() and tar_resources_gcp() (#1127).
* Use file and keep.source in parse() in callr utils and Target Markdown.
* Switch from the "file_fast" format to the "file" format for cloud targets.
* In tar_prune() and tar_delete(), do not try to delete pattern targets which have no cloud storage.
* Add arguments seconds_timeout, close_connection, and s3_force_path_style to tar_resources_aws() to support the analogous arguments in paws.storage::s3() (#1134, @snowpong).
* Add tar_prune_list() (#1090, @mglev1n).
* Wrap file.rename() in tryCatch() and fall back on a copy-then-remove workaround (@jds485, #1102, #1103).
* Stage temporary files in tools::R_user_dir(package = "targets", which = "cache") instead of tempdir(). tar_destroy(destroy = "cloud") and tar_destroy(destroy = "all") remove any leftover files from failed uploads/downloads (@jds485, #1102, #1103).
* Depend on paws.storage instead of all of paws.
* crew controllers.
* _targets.R file from use_targets().
* Make tar_crew() compatible with crew >= 0.3.0.
* Rename terminate to terminate_controller in tar_make().
* Add use_crew in tar_make() and add an option in tar_config_set() to make it configurable.
* target_prepare().
* Allow the label and level_separation arguments to be set through tar_config_set() (#1085, @Moohan).
* Decide nanonext usage in time_seconds_local() at runtime and not installation time. That way, if nanonext is removed after targets is installed, functions in targets still work. Fixes the CRAN issues seen in tarchetypes, jagstargets, and gittargets.
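A hedged sketch of the cloud metadata workflow above, assuming AWS credentials are configured. The bucket and prefix names are hypothetical.

```r
# _targets.R: store metadata in the same AWS bucket as the target data.
library(targets)
tar_option_set(
  repository = "aws",
  repository_meta = "aws",
  resources = tar_resources(
    aws = tar_resources_aws(bucket = "my-bucket", prefix = "my-project")
  )
)
# Outside the pipeline, reconcile local and cloud metadata:
# tar_meta_sync()
```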
* crew-related startup messages.
* cli colors and bullets to improve performance in RStudio.
* Use packageStartupMessage() for package startup messages.
* crew is used.
* Run gc() more appropriately when garbage_collection is TRUE in tar_target().
* Add garbage_collection arguments to tar_make(), tar_make_clustermq(), and tar_make_future() to add optional garbage collection before targets are sent to workers. This is different and independent from the garbage_collection argument of tar_target(). In high-performance computing scenarios, the former controls what happens on the main controlling process, whereas the latter controls what happens on the worker.
* Add garbage_collection and seconds_interval arguments to tar_make(), tar_make_clustermq(), tar_make_future(), and tar_config_set().
* tar_runtime object.
* Add the "file_fast" format and the trust_object_timestamps option in tar_option_set() as safer alternatives.
* Support crew controller groups (#1065, @mglev1n).
* Add tar_backoff() (see the sketch after this list). The backoff argument of tar_option_set() now accepts output from tar_backoff(), and supplying a numeric is deprecated.
* crew scheduling algorithm.
* Add tar_resources_network() to configure retries and timeouts for internal HTTP/HTTPS requests in specialized targets with format = "url", repository = "aws", and repository = "gcp". Also applies to syncing target files across network file systems in the case of storage = "worker" or format = "file", which previously had a hard-coded seconds_interval = 0.1 and seconds_timeout = 60.
* Deprecate seconds_interval and seconds_timeout in tar_resources_url() in favor of the new equivalent arguments of tar_resources_network().
* Avoid sending tasks to the crew controller when the controller is saturated (#1074, @mglev1n).
* crew controller.
* paws.common (@DyfanJones).
* Cache the files in _targets/objects/ in tar_callr_inner_try() and update the cache as targets are saved to _targets/objects/ to avoid the overhead of repeated calls to file.exists() and file.info() (#1056).
* Trust the time stamps to check whether files in _targets/objects/ are up to date (#1062). tar_option_set(trust_object_timestamps = FALSE) ignores the time stamps and recomputes the hashes.
* Write to _targets/meta/meta and _targets/meta/progress in timed batches instead of line by line (#1055).
* Avoid tempfile() when working with the scratch directory.
* Use nanonext::mclock() instead of proc.time() when there is no risk of forked processes.
* Replace withr with slightly faster/leaner base R alternatives.
* setwd() (#1057).
* Use tar_options methods in the internals instead of tar_option_get().
* gsub() in store_init().
* meta$get_record() in builder_should_run().
* cli::col_none() to reduce the number of ANSI characters printed to the R console.
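A sketch of the new backoff interface, assuming the min/max/rate arguments documented in tar_backoff().

```r
# Polling starts near min seconds and backs off exponentially toward max.
targets::tar_option_set(
  backoff = targets::tar_backoff(min = 0.001, max = 0.1, rate = 1.5)
)
```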
targets is moving to version 1.0.0 because it is significantly more mature than previous versions. Specifically:

* tar_make() now integrates with crew, which will significantly improve the way targets does high-performance computing going forward.
* targets has stabilized. There is still room for smaller new features, but none as large as crew integration, and none that will fundamentally change how the package operates.

* Support the crew package in tar_make() (#753). crew itself is still in its early stages and currently lacks the launcher plugins to match the clustermq and future backends, but long-term, crew will be the predominant high-performance computing backend.
* Add store_copy_object() to the store class to enable "fst_dt" and other formats to make deep copies when needed (#1041, @MilesMcBain).
* Add a copy argument to allow tar_format() formats to set the store_copy_object() method (#1041, @MilesMcBain). A custom-format sketch appears after this list.
* tar_format() when default methods are used.
* Add a change_directory argument to tar_source() (#1040, @dipterix).
* For format = "url" targets, implement retries and timeouts when connecting to URLs. The default timeout is 10 seconds, and the default retry interval is 1 second. Both are configurable via tar_resources_url() (#1048).
* Use parallelly::freePort() in tar_random_port().
* tar_script() example pipeline (#1033, @b-rodrigues).
* tar_destroy() help file (#988, @Sage0614).
* Add destroy = "user" in tar_destroy().
* Add the #!/bin/sh line to the top of the SLURM clustermq template file (#944, #955, @GiuseppeTT).
* tar_path_script().
* Rename tar_store() to tar_path_store() with deprecation.
* Rename tar_path() to tar_path_target() with deprecation.
* tar_path_script_support().
* tar_option_set() now supports a seed argument, and target-specific seeds are determined by tar_option_get("seed") and the target name. tar_option_set(seed = NA) disables seed-setting behavior but forcibly invalidates all the affected targets except when seed is FALSE in the target's tar_cue() (#882, @sworland-thyme, @joelnitta).
* Add a seed argument in tar_cue() to control whether targets update in response to changing or NA seeds (#882, @sworland-thyme, @joelnitta).
* Update the tar_github_actions() workflow file to use @v2 (#960, @kulinar).
* callr_function is NULL (#961).
* Make formats "feather", "parquet", "file", and "url" work with error = "null" (#969).
* Formats "keras" and "torch" are superseded by tar_format(). Documented in the tar_target() help file.
* Formats "keras" and "torch" are incompatible with error = "null". Documented in the tar_target() help file and in a warning thrown by tar_target() via tar_target_raw().
* Add a convert argument to tar_format() to allow custom store_convert_object() methods (#970).
* Use any_of() instead of all_of() in tests to ensure compatibility with tidyselect 1.1.2.9000 (#928, @hadley).
* Make run.R from use_targets() executable (#929, @petrbouchal).
* Add #!/usr/bin/env Rscript to the top of run.R from use_targets() (#929, @petrbouchal).
* Use skip_on_cran() to avoid https://github.com/r-lib/testthat/issues/1470#issuecomment-1248145555.
* names argument of tar_make() does not identify any such targets in the pipeline (#923, @llrs).
* Exclude .packageName, .__NAMESPACE__., and .__S3MethodsTable__. when importing objects from packages with the imports option of tar_option_set().
* imports option of tar_option_set() (#926, @joelnitta).
* tar_read() and tar_load() when the data store is missing.
* In the command column of tar_manifest() output, separate lines with "\n" instead of "\\n" so the text output is straightforward to work with.
* Add a drop_missing argument to tar_manifest() to hide/show columns with all NA values.
* Pass extra arguments to paws functions via ... in tar_resources_aws() (#855, @michkam89).
* Add tar_source() to conveniently source R scripts (e.g. in _targets.R).
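A sketch of a custom storage format with tar_format(), patterned after its help file and assuming qs is installed.

```r
# A qs-based storage format; compute() below is hypothetical.
format_qs <- targets::tar_format(
  read = function(path) qs::qread(path),
  write = function(object, path) qs::qsave(x = object, file = path)
)
# Usage in _targets.R:
# tar_target(x, compute(), format = format_qs)
```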
* Make targets messages the default theme color, and color warnings and errors red (#856, @gorkang).
* use_targets().
* tar_option_get("resources") (#892). See the revised "Resources" section of the tar_resources() help file for details.
* Add arguments legend and color to further configure tar_mermaid() (#848, @noamross).
* use_targets() now creates a job.sh script to run the pipeline as a cluster job (#839).
* use_targets(). Avoids defining a global variable for the file.
* use_targets() _targets.R file.
* Fix tar_mermaid() graph ordering.
* Sanitize node names in tar_mermaid() graphs to avoid JavaScript keywords.
* Call data.table::fread() with encoding equal to getOption("encoding") if available (#814, @svraka). Only works with UTF-8 and latin1 because that is what data.table supports.
* use_targets() now writes a _targets.R file tailored to the project in the current working directory (#639, @noamross).
* Rename the old use_targets() to use_targets_rmd().
* Handle cases where getOption("OutDec") is not "." to prevent time stamps from being corrupted (#433, @jarauh).
* Add tar_load_everything() to quickly load all targets (#823, @malcolmbarrett).
* Support tar_target(..., repository = "gcp") (#720, @markedmondson1234). Special thanks to @markedmondson1234 for the cloud storage utilities in R/utils_gcp.R.
* Support mermaid.js static graphs with tar_mermaid() (#775, @yonicd).
* Support tar_target(..., error = "null") to allow errored targets to return NULL and continue (#807, @zoews). Errors are still registered, those targets are not up to date, and downstream targets have an easier time continuing on.
* Add tar_assert_finite().
* tar_destroy(), tar_delete(), and tar_prune() now attempt to delete cloud data for the appropriate targets (#799). In addition, tar_exist_objects() and tar_objects() now report about target data in the cloud when applicable. Add a new cloud argument to each function to optionally suppress this new behavior.
* Add a zoom_speed argument to tar_visnetwork() and tar_glimpse() (#749, @dipterix).
* "verbose", "verbose_positives", "timestamp", and "timestamp_positives" reporters.
* Deprecate the "aws_*" storage format values in favor of a new repository argument (#803). In other words, tar_target(..., format = "aws_qs") is now tar_target(..., format = "qs", repository = "aws"), as in the sketch after this list. Internally, storage classes with multiple inheritance are created dynamically as opposed to having hard-coded source files. All this paves the way to add new cloud storage platforms without combinatorial chaos.
* Add class "tar_nonexportable" to format = "aws_keras" and format = "aws_torch" stores.
* tar_make_interactive_load_target().
* Support tar_target(format = tar_format(...)) (#736).
* Add tar_call() to return the targets function currently running (from _targets.R or a target).
* Add tar_active() to tell whether the pipeline is currently running. Detects if it is called from tar_make() or a similar function.
* Add Sys.getenv("TAR_PROJECT") to the output of tar_envvars().
* Set the store field of tar_runtime prior to sourcing _targets.R so tar_store() works in target scripts.
* Export the environment variables from tar_envvars() to targets run on parallel workers.
* Allow format = "file" targets to return character(0) (#728, @programLyrique).
* git checkout a different branch of your code and all your targets will stay up to date.
* paws (#711).
* Add a region argument to tar_resources_aws() to allow the user to explicitly declare a region for each AWS S3 bucket (@caewok, #681). Different buckets can now have different regions. This feature required modifying the metadata path for AWS storage formats. Before, the first element of the path was simply the bucket name. Now, it is internally formatted like "bucket=BUCKET:region=REGION", where BUCKET is the user-supplied bucket name and REGION is the user-supplied region name. The new targets is back-compatible with the old metadata format, but if you run the pipeline with targets >= 0.8.1.9000 and then downgrade to targets <= 0.8.1, any AWS targets will break.
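The format/repository split above looks like this in practice; compute() is hypothetical.

```r
# _targets.R: the old format = "aws_qs" becomes format + repository.
library(targets)
list(
  # Before: tar_target(x, compute(), format = "aws_qs")
  tar_target(x, compute(), format = "qs", repository = "aws")
)
```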
* Add new reporters "timestamp_positives" and "verbose_positives" that omit messages for skipped targets (@psanker, #683).
* Add tar_assert_file().
* Add tar_reprex() for creating easier reproducible examples of pipelines.
* Add tar_store() to get the path to the store of the currently running pipeline (#714, @MilesMcBain).
* Add a _targets/user/ folder to encourage gittargets users to put custom files there for data version control.
* tar_path() uses the current store path of the currently running pipeline instead of tar_config_get("store") (#714, @MilesMcBain).
* Adjust the .gitignore file inside the data store to allow the metadata to be committed to version control more easily (#685, #711).
* tar_target() and tar_target_raw() (@tjmahr, #679).
* Handle errors from target_should_run.tar_builder(). These kinds of errors sometimes come up with AWS storage.
* Only write _targets/.gitignore for new data stores so the user can delete the .gitignore file without it mysteriously reappearing (#685).
* Add arguments strict and silent to allow tar_load() and tar_load_raw() to bypass targets that cannot be loaded.
* tidyselect docs in tar_make() (#640, @dewoller).
* Use tar_dir() in tar_test() (#642, @billdenney).
* Improve the tar_assert_target_list() error message (@kkami1115, #654).
* tar_destroy() and related cleanup functions (@billdenney, #675).
* Hash only the local file for tar_target(target_name, ..., format = "aws_file"). Previously, _targets/objects/target_name was also hashed if it existed.
* Add a tar_config_unset() function to delete one or more configuration settings from the YAML configuration file.
* Support the TAR_CONFIG environment variable to set the default file path of the YAML configuration file with project settings (#622, @yyzeng, @atusy, @nsheff, @wdkrnls). If TAR_CONFIG is not set, the file path is still _targets.yaml.
* Support multiple projects in the configuration file (similar to the config package) and support the TAR_PROJECT environment variable to select the current active project for a given R session (see the sketch after this list). The old single-project format is gracefully deprecated (#622, @yyzeng, @atusy, @nsheff, @wdkrnls).
* Add retrieval = "none" and storage = "none" to anticipate loading/saving targets from other languages, e.g. Julia (@MilesMcBain).
* Add a tar_definition() function to get the target definition object of the current target while that target is running in a pipeline.
* tar_path() now returns the path to the staging file instead of _targets/objects/target_name. This ensures you can still write to tar_path() in storage = "none" targets and the package will automatically hash the right file and upload it to the cloud. (This behavior does not apply to formats "file" and "aws_file", where it is never necessary to set storage = "none".)
* Use eval(parse(text = ...), envir = tar_option_get("envir")) instead of source() in the _targets.R file for Target Markdown.
* RecordBatch and Table (@MilesMcBain).
* Let knitr load the Target Markdown engine (#469, @nviets, @yihui). The minimum knitr version is now 1.34.
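A sketch of the multi-project configuration described above; the project names and paths are hypothetical.

```r
# Register per-project settings in one _targets.yaml, then pick the
# active project with the TAR_PROJECT environment variable.
targets::tar_config_set(
  script = "targets_a.R",
  store = "store_a",
  project = "project_a"
)
Sys.setenv(TAR_PROJECT = "project_a")
targets::tar_config_get("store")
#> [1] "store_a"
```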
* In the tar_resources_future() help file, encourage the use of plan to specify resources.
* Ensure error = "continue" does not cause errored targets to have NULL values.
* Use the poll_connection, stdout, and stderr arguments of callr::r_bg() in tar_watch() (@mpadge).
* Add tar_started(), tar_skipped(), tar_built(), tar_canceled(), and tar_errored().
* Add tar_interactive(), tar_noninteractive(), and tar_toggle() to differentially suppress code in non-interactive and interactive mode in Target Markdown (#607, @33Vito).
* Handle future errors within targets (#570, @stuvet).
* Suppress messages if the message knitr chunk option is FALSE (#574, @jmbuhr).
* If tar_interactive is not set, choose interactive vs non-interactive mode based on isTRUE(getOption("knitr.in.progress")) instead of interactive().
* Allow tar_poll() to lose and then regain connection to the progress file.
* Make sure changes to the tar_group column of iteration = "group" data frames do not invalidate slices (#507, @lindsayplatt).
* Add a tar_interactive global option to select interactive mode or non-interactive mode (#469).
* Add arguments degree_from and degree_to to tar_visnetwork() and tar_glimpse() (#474, @rgayler).
* tar_config_set() (#476).
* Add a tar_script chunk option in Target Markdown to control where the {targets} language engine writes the target script and helper scripts (#478).
* Add arguments script and store to choose custom paths to the target script file and data store for individual function calls (#477).
* targets backends. Unavoidably, the path gets reset to _targets.yaml when the session restarts.
* Add _targets.yaml config options reporter_make, reporter_outdated, and workers to control function argument defaults shared across multiple functions called outside _targets.R (#498, @ianeveperry).
* Add tar_load_globals() for debugging, testing, prototyping, and teaching (#496, @malcolmbarrett).
* Restructure the resources argument of tar_target() to avoid conflicts among formats and HPC backends (#489). Includes user-side helper functions like tar_resources() and tar_resources_aws() to build the required data structures.
* Record skipped targets in _targets/meta/progress and display them in tar_progress(), tar_poll(), tar_watch(), tar_progress_branches(), tar_progress_summary(), and tar_visnetwork() (#514). Instead of writing each skip line separately to _targets/meta/progress, accumulate skip lines in a queue and then write them all out in bulk when something interesting happens. This avoids a lot of overhead in certain cases.
* Add a shortcut argument to tar_make(), tar_make_clustermq(), tar_make_future(), tar_outdated(), and tar_sitrep() to more efficiently skip parts of the pipeline (#522, #523, @jennysjaarda, @MilesMcBain, @kendonB).
* Support names and shortcut in graph data frames and graph visuals (#529).
* Apply allow and exclude to the network behind the graph visuals rather than the visuals themselves (#529).
* Update the tar_watch() app to show verbose progress info and metadata.
* Add the workspace_on_error argument of tar_option_set() to supersede error = "workspace". Helps control workspace behavior independently of the error argument of tar_target() (#405, #533, #534, @mattwarkentin, @xinstein). A sketch appears after this list.
* Add error = "abridge" in tar_target() and related functions. If a target errors out with this option, the target itself stops, any currently running targets keep running, and no new targets launch after that (#533, #534, @xinstein).
* Add a menu prompt to tar_destroy() which can be suppressed with TAR_ASK = "false" (#542, @gofford).
* Add functions tar_older() and tar_newer() to help users identify and invalidate targets at regular times or intervals.
* Deprecate the targets chunk option in favor of tar_globals (#469).
* Deprecate error = "workspace" in tar_target() and related functions. Use tar_option_set(workspace_on_error = TRUE) instead (#405, #533, @mattwarkentin, @xinstein).
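A minimal sketch of the workspace_on_error option above; failed_target is a hypothetical target name.

```r
# Save a workspace when a target errors, then load it to debug.
targets::tar_option_set(workspace_on_error = TRUE)
# After tar_make() reports an error:
# targets::tar_workspace(failed_target)
```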
* clustermq worker (@rich-payne).
* store_sync_file_meta.default() on small files.
* In tar_watch(), take several measures to avoid long computation times rendering the graph:
    * Add arguments display and displays to tar_watch() so the user can select which display shows first.
    * Make "summary" the default display instead of "graph".
    * Set outdated to FALSE by default.
* In tar_read() for targets with format = "aws_file", download the file back to the path the user originally saved it when the target ran.
* Replace the TAR_MAKE_REPORTER environment variable with targets::tar_config_get("reporter_make").
* Use eval(parse(text = readLines("_targets.R")), envir = some_envir) and related techniques instead of the less controllable source(). Expose an envir argument to many functions for further control over evaluation if callr_function is NULL.
* Drop out.attrs when hashing groups of data frames to extend #507 to expand.grid() (#508).
* Rename GITHUBPAT to GITHUB_TOKEN in the tar_github_actions() YAML file (#554, @eveyp).
* Respect the eval chunk option in Target Markdown (#552, @fkohrt).
* Record the time column for all builder targets, regardless of storage format.
* _targets.yaml to parallel workers.
* Add an exclude argument to tar_watch() and tar_watch_server() (#458, @gorkang).
* Write a .gitignore file to ignore everything in _targets/meta/ except .gitignore and _targets/meta/meta.
* Add knitr engines for pipeline construction and prototyping from within literate programming documents (#469, @cderv, @nviets, @emilyriederer, @ijlyttle, @GShotwell, @gadenbuie, @tomsing1). Huge thanks to @cderv on this one for answering my deluge of questions, helping me figure out what was and was not possible in knitr, and ultimately circling me back to a successful approach.
* Add use_targets(), which writes the Target Markdown template to the project root (#469).
* Add tar_unscript() to clean up scripts written by Target Markdown.
* tar_make() and tar_manifest().
* Throw informative error messages when pattern = slice() or pattern = sample() are invalid.
* In tar_target_raw(), assert that commands have length 1 when converted to expressions.
* tar_cue() (@maelle).
* Preserve dplyr groups and the "grouped_df" class in tar_group() (tarchetypes discussion #53, @kendonB).
* tar_read() and tar_read_raw().
* _targets.yaml). Fixes CRAN check errors from version 0.4.1.
* roxygen2 docstrings from shiny.
* Suggests: packages.
* targets.yaml in the callr process.
* Fix file.rename() errors when migrating staged temporary files (#410).
* Call assert_df() from store_assert_format() instead of store_cast_object(). Now those last two functions are not called at all if the target throws an error.
* Allow users to run tar_poll() at the same time as the pipeline (#393).
* Write the output of tar_renv() to _targets_packages.R (#397).
* outdated = FALSE in tar_visnetwork().
* Add tar_timestamp() and tar_timestamp_raw() to get the last modified timestamp of a target's data (#378).
* Add tar_progress_summary() to compactly summarize all pipeline progress (#380).
* Add the characters argument of tar_traceback() to cap the traceback line lengths (#383).
* tar_watch() (#382).
* Add tar_poll() to repeatedly poll runtime progress in the R console (#381). tar_poll() is a lightweight alternative to tar_watch().
* Add a tar_envvars() function to list values of special environment variables supported in targets. The help file explains each environment variable in detail.
* Support a YAML configuration file _targets.yaml (#297). New functions tar_config_get() and tar_config_set() interact with the _targets.yaml file (see the sketch after this list). Currently only supports the store field to set the data store path to something other than _targets/.
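A quick sketch of the configuration file support above.

```r
# Point the pipeline at a custom data store; the path is hypothetical.
targets::tar_config_set(store = "analysis/_targets")
targets::tar_config_get("store")
#> [1] "analysis/_targets"
```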
* deployment = "main" (#398, #399, #404, @pat-s).
* tar_traceback() (#383).
* In tar_watch(), use shinybusy instead of shinycssloaders and keep current output on display while new output is rendering (#386, @rcorty).
* Respect the AWS_DEFAULT_REGION environment variable (check_region = TRUE; #400, @tomsing1).
* In tar_meta(), return POSIXct times in the time zone of the calling system (#131).
* qs::qread() now that qs 0.24.1 requires stringfish >= 1.5.0 (#147, @glep).
* Ensure pattern = slice(...) can take multiple indexes (#406, #419, @djbirke, @alexgphayes).
* queue$enqueue() is now queue$prepend() and always inserts at the front of the queue (#371).
* Throw a warning if devtools::load_all() or similar is detected inside _targets.R (#374).
* Skip feather and parquet tests on CRAN.
* Add a backoff option in tar_option_set() to set the maximum upper bound (seconds) for the polling interval (#333).
* Add a tar_github_actions() function to write a GitHub Actions workflow file for continuous deployment of data analysis pipelines (#339, @jaredlander).
* Add the TAR_MAKE_REPORTER environment variable to globally set the reporter of the tar_make*() functions (#345, @alexpghayes), as in the sketch after this list.
* tar_make_clustermq() and tar_make_future() (#333).
* In tar_make_future(), try to submit a target every time a worker is polled.
* In tar_make_future(), poll workers in order of target priority.
* Keep targets internal objects out of the environment in order to avoid accidental massive data transfers to workers.
* Use rlang::check_installed() inside assert_package() (#331, @malcolmbarrett).
* tar_destroy(destroy = "process").
* In tar_watch(), increase default seconds to 15 (previously 5).
* In tar_watch(), debounce instead of throttle inputs.
* In tar_watch(), add an action button to refresh the outputs.
* tar_make(). Will help compute a cache key on GitHub Actions and similar services.
* Deprecate tar_deduplicate() due to the item above.
* tar_target_raw(), tar_meta(), and tar_seed() (#357, @alexpghayes).
* Redefine %||% and %|||% to conform to historical precedent.
* reporter = "silent" (#364, @matthiasgomolka).
* envir element.
* In tar_load(), subset metadata to avoid accidental attempts to load global objects in tidyselect calls.
* vctrs::vec_c() (#320, @joelnitta).
* Add a names argument to tar_objects() and tar_workspaces() with tidyselect functionality.
* Record process information (including the process ID and targets version) in _targets/meta/process and write new functions tar_process() and tar_pid() to retrieve the data (#291, #292).
* Add a targets_only argument to tar_meta().
* Add tar_helper() and tar_helper_raw() to write general-purpose R scripts, using tidy evaluation as a templating mechanism (#290, #291, #292, #306).
* Add tar_exist_meta(), tar_exist_objects(), tar_exist_progress(), tar_exist_process(), and tar_exist_script() (#310).
* Add a supervise argument to tar_watch().
* Add a complete_only argument to tar_meta() to optionally return only complete rows (no NA values).
* Catch callr errors and refer users to the debugging chapter of the manual.
* Use crayon if and only if the calling process is interactive (#302, @ginolhac). Can still be disabled with options(crayon.enabled = FALSE) in _targets.R.
* Throw informative errors for format = "url" when the HTTP response status code is not 200 (#303, @petrbouchal).
* Add extras packages to tar_renv() (to support tar_watch()).
* tar_watch() if _targets.R does not exist.
* names argument of tar_load() (#314, @jameelalsalam).
* nobody in custom curl handles (#315, @riazarbi).
* targets is somehow actively monitoring each job, e.g. through a connection or heartbeat (#318).
* Set errormode = "warn" in getVDigest() for files to work around https://github.com/eddelbuettel/digest/issues/49 for network drives on Windows. targets already runs those file checks anyway (#316, @boshek).
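A sketch of the TAR_MAKE_REPORTER variable described above.

```r
# Globally set the reporter of the tar_make*() functions.
Sys.setenv(TAR_MAKE_REPORTER = "summary")
targets::tar_make()
```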
* targets tried to load from.
* tar_test() now skips all tests on Solaris in order to fix the problems shown on the CRAN check page.
* Allow allow and exclude to work on imports in tar_visnetwork() and tar_glimpse().
* Put visNetwork legends on the right to avoid crowding the graph.
* Call force() on subpipeline objects to eliminate high-memory promises in target objects. Allows targets to be deployed to workers much faster when retrieval is "main" (#279).
* Update the tar_watch() app to tabulate progress on dynamic branches (#273, @mattwarkentin).
* Add columns type, parent, and branches in progress data for tar_watch() (#273, @mattwarkentin).
* Add a fields argument in tar_progress() and default to "progress" for back compatibility (#273, @mattwarkentin).
* Add a tar_progress_branches() function to tabulate branch progress (#273, @mattwarkentin).
* Add controls in tar_watch() to toggle automatic refreshing and force a refresh.
* Exclude .Random.seed by default in tar_visnetwork().
* tar_watch() app.
* Skip clustermq tests on Solaris.
* Remove if(FALSE) blocks from help files to fix "unexecutable code" warnings (tar_glimpse(), tar_visnetwork(), and tar_watch()).
* tar_edit(), tar_watch_ui(), and tar_watch_server().
* tar_workspace().
* CITATION.
* _targets.R (#253).
* Deprecate tar_pipeline() and tar_bind() because of the above (#253).
* visNetwork stabilization (#264, @mattwarkentin).
* visNetwork font size.
* error is "continue" (#267, @liutiming).
* tar_bind() (#245, @yonicd).
* igraph topological sort.
* Add a workspaces argument to tar_option_set() to specify which targets will save their workspace files during tar_make() (#214).
* Rename error = "save" to error = "workspace" so it is clearer that saving workspaces no longer duplicates data (#214).
* Allow users to choose what to destroy in tar_destroy().
* Remove tar_undebug() because it is redundant with tar_destroy(destroy = "workspaces").
* Add dynamic branching patterns head(), tail(), and sample() to provide functionality equivalent to drake's max_expand (#56). See the sketch after this list.
* Add a tar_pattern() function to emulate dynamic branching outside a pipeline.
* Add a level_separation argument to tar_visnetwork() and tar_glimpse() to control the aspect ratio (#226).
* Add an imports argument to tar_option_set() (#239).
* outdated is FALSE in tar_visnetwork().
* Adjust the colors in tar_visnetwork() to try to account for color blindness.
* tar_manifest().
* tar_renv() now invokes _targets.R through a background process just like tar_outdated() etc. so it can account for more hidden packages (#224, @mattwarkentin).
* Set deployment equal to "main" for all targets in tar_make(). This ensures tar_make() does not waste time waiting for nonexistent files to ship over a nonexistent network file system (NFS). tar_make_clustermq() or tar_make_future() could use NFS, so they still leave deployment alone.
* Add a size field to the metadata to allow targets to make better judgments about when to rehash files (#180). We now compare hashes to check file size differences instead of doing messy floating point comparisons with ad hoc tolerances. It breaks back compatibility with old projects, but the error message is informative, and this is all still before the first official release.
* storage, retrieval, and deployment settings (#183, @mattwarkentin).
* Move garbage_collection to a target-level setting, i.e. an argument to tar_target() and tar_option_set() (#194). Previously, it was an argument to the tar_make*() functions.
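A sketch of the new branching patterns, using tar_pattern() to preview branch structure outside a pipeline. The names x and y stand in for upstream targets with 2 and 5 branches.

```r
# Cross every branch of x with the first two branches of y.
targets::tar_pattern(cross(x, head(y, n = 2)), x = 2, y = 5)
```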
* Allow tar_name() and tar_path() to run outside the pipeline with debugging-friendly default return values.
* storage is "remote" (#182, @mattwarkentin).
* Use target$subpipeline rather than target$cache to make that happen (#209, @mattwarkentin).
* Add tar_bind() to combine pipeline objects.
* Add tar_seed() to get the random number generator seed of the target currently running (see the sketch after this list).
* Support custom future::plan()s through the resources argument of tar_target() (#198, @mattwarkentin).
* Use library() instead of require() in command_load_packages().
* Evaluate commands in targets$cache$targets$envir to improve convenience in interactive debugging (ls() just works now). This is reasonably safe now that the cache is populated at the last minute and cleared as soon as possible (#209, #210).
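A minimal sketch of tar_seed(); outside a pipeline it returns a debugging-friendly default.

```r
# _targets.R: a target that records the seed it ran with.
library(targets)
list(
  tar_target(seed_used, tar_seed())
)
```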