Title: | Targets for JAGS Pipelines |
---|---|
Description: | Bayesian data analysis usually incurs long runtimes and cumbersome custom code. A pipeline toolkit tailored to Bayesian statisticians, the 'jagstargets' R package leverages 'targets' and 'R2jags' to ease this burden. 'jagstargets' makes it super easy to set up scalable JAGS pipelines that automatically parallelize the computation and skip expensive steps when the results are already up to date. Minimal custom code is required, and there is no need to manually configure branching, so usage is much easier than 'targets' alone. For the underlying methodology, please refer to the documentation of 'targets' <doi:10.21105/joss.02959> and 'JAGS' (Plummer 2003) <https://www.r-project.org/conferences/DSC-2003/Proceedings/Plummer.pdf>. |
Authors: | William Michael Landau [aut, cre] , David Lawrence Miller [rev], Eli Lilly and Company [cph] |
Maintainer: | William Michael Landau <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.2.3 |
Built: | 2024-12-04 22:52:49 UTC |
Source: | https://github.com/ropensci/jagstargets |
Bayesian data analysis usually incurs long runtimes and cumbersome custom code. A pipeline toolkit tailored to Bayesian statisticians, the jagstargets R package leverages targets and R2jags to ease this burden. jagstargets makes it super easy to set up scalable JAGS pipelines that automatically parallelize the computation and skip expensive steps when the results are already up to date. Minimal custom code is required, and there is no need to manually configure branching, so usage is much easier than targets alone.
https://docs.ropensci.org/jagstargets/, tar_jags()
Targets to run a JAGS model once with MCMC and save multiple outputs.
tar_jags( name, jags_files, parameters.to.save, data = list(), summaries = list(), summary_args = list(), n.cluster = 1, n.chains = 3, n.iter = 2000, n.burnin = as.integer(n.iter/2), n.thin = 1, jags.module = c("glm", "dic"), inits = NULL, RNGname = c("Wichmann-Hill", "Marsaglia-Multicarry", "Super-Duper", "Mersenne-Twister"), jags.seed = 1, stdout = NULL, stderr = NULL, progress.bar = "text", refresh = 0, draws = TRUE, summary = TRUE, dic = TRUE, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = "qs", format_df = "fst_tbl", repository = targets::tar_option_get("repository"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Symbol, base name for the collection of targets. Serves as a prefix for target names. |
jags_files |
Character vector of JAGS model files. If you
supply multiple files, each model will run on the one shared dataset
generated by the code in |
parameters.to.save |
Model parameters to save, passed to
|
data |
Code to generate the |
summaries |
List of summary functions passed to |
summary_args |
List of summary function arguments passed to
|
n.cluster |
Number of parallel processes, passed to
|
n.chains |
Number of MCMC chains, passed to
|
n.iter |
Number of iterations (including warmup), passed to
|
n.burnin |
Number of warmup iterations, passed to
|
n.thin |
Thinning interval, passed to
|
jags.module |
Character vector of JAGS modules to load, passed to
|
inits |
Initial values of model parameters, passed to
|
RNGname |
Choice of random number generator, passed to
|
jags.seed |
Seeds to apply to JAGS, passed to
|
stdout |
Character of length 1, file path to write the stdout stream
of the model when it runs. Set to |
stderr |
Character of length 1, file path to write the stderr stream
of the model when it runs. Set to |
progress.bar |
Type of progress bar, passed to
|
refresh |
Frequency for refreshing the progress bar, passed to
|
draws |
Logical, whether to create a target for posterior draws.
Saves draws as a compressed |
summary |
Logical, whether to create a target to store a small data frame of posterior summary statistics and convergence diagnostics. |
dic |
Logical, whether to create a target with deviance information criterion (DIC) results. |
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Character of length 1, storage format of the non-data-frame
targets such as the JAGS data and any JAGS fit objects.
Please choose an all-purpose
format such as |
format_df |
Character of length 1, storage format of the data frame
targets such as posterior draws. We recommend efficient data frame formats
such as |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
The MCMC targets use R2jags::jags() if n.cluster is 1 and R2jags::jags.parallel() otherwise. Most arguments to tar_jags() are forwarded to these functions.
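The dispatch rule above can be sketched in base R. The helper mcmc_backend() below is hypothetical and not part of jagstargets: it only reports which R2jags function would be chosen for a given n.cluster. The burn-in line reproduces the default n.burnin = as.integer(n.iter/2) from the usage signature.

```r
# Hypothetical helper (not part of jagstargets): name the R2jags function
# that the MCMC targets would call for a given n.cluster.
mcmc_backend <- function(n.cluster = 1) {
  if (n.cluster == 1) "R2jags::jags" else "R2jags::jags.parallel"
}

backend_serial <- mcmc_backend(n.cluster = 1)
backend_parallel <- mcmc_backend(n.cluster = 4)

# Default burn-in from the usage signature: half of n.iter.
n.iter <- 2000
n.burnin <- as.integer(n.iter / 2)

print(c(backend_serial, backend_parallel))
print(n.burnin)
```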
tar_jags() returns a list of target objects. See the "Target objects" section for background.
The target names use the name argument as a prefix, and the individual elements of jags_files appear in the suffixes where applicable.
As an example, tar_jags(name = x, jags_files = "y.jags", ...) returns a list of targets::tar_target() objects:
x_file_y: reproducibly track the JAGS model file. Returns a character vector of length 1 with the path to the JAGS model file.
x_lines_y: read the contents of the JAGS model file for safe transport to parallel workers. Returns a character vector of lines in the model file.
x_data: run the R expression in the data argument to produce a JAGS dataset for the model. Returns a JAGS data list.
x_mcmc_y: run MCMC on the model and dataset. Returns an rjags object from R2jags with all the MCMC results.
x_draws_y: extract posterior samples from x_mcmc_y. Returns a tidy data frame of MCMC draws. Omitted if draws = FALSE.
x_summary_y: extract posterior summaries from x_mcmc_y. Returns a tidy data frame of posterior summaries. Omitted if summary = FALSE.
x_dic: extract deviance information criterion (DIC) info from x_mcmc_y. Returns a tidy data frame of DIC info. Omitted if dic = FALSE.
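The naming scheme above can be sketched for a single model file. The helper expected_targets() is hypothetical and not part of jagstargets. It assumes the suffix is the model file's base name without its extension, which matches the "y.jags" example but is not confirmed here for other paths.

```r
# Hypothetical helper (not part of jagstargets): list the target names
# described above for a single model file.
# Assumption: the suffix is the file's base name without its extension.
expected_targets <- function(name, jags_file,
                             draws = TRUE, summary = TRUE, dic = TRUE) {
  suffix <- tools::file_path_sans_ext(basename(jags_file))
  out <- c(
    paste(name, "file", suffix, sep = "_"),
    paste(name, "lines", suffix, sep = "_"),
    paste(name, "data", sep = "_"),
    paste(name, "mcmc", suffix, sep = "_")
  )
  # Optional targets are dropped when the matching argument is FALSE.
  if (draws) out <- c(out, paste(name, "draws", suffix, sep = "_"))
  if (summary) out <- c(out, paste(name, "summary", suffix, sep = "_"))
  if (dic) out <- c(out, paste(name, "dic", sep = "_"))
  out
}

all_targets <- expected_targets("x", "y.jags")
no_draws <- expected_targets("x", "y.jags", draws = FALSE)
print(all_targets)
```

In a real pipeline, targets::tar_manifest() lists the actual target names.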
Most jagstargets functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
if (identical(Sys.getenv("TAR_JAGS_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(jagstargets)
  # Do not use a temp file for a real project
  # or else your targets will always rerun.
  tmp <- tempfile(pattern = "", fileext = ".jags")
  tar_jags_example_file(tmp)
  list(
    tar_jags(
      your_model,
      jags_files = tmp,
      data = tar_jags_example_data(),
      parameters.to.save = "beta",
      stdout = R.utils::nullfile(),
      stderr = R.utils::nullfile()
    )
  )
}, ask = FALSE)
targets::tar_make()
})
}
An example dataset compatible with the model file from tar_jags_example_file(). The output has a .join_data element so the true value of beta from the simulation is automatically appended to the beta rows of the summary output.
tar_jags_example_data(n = 10L)
n |
Integer of length 1, number of data points. |
A list with the following elements:
n: integer, number of data points.
x: numeric, covariate vector.
y: numeric, response variable.
true_beta: numeric of length 1, value of the regression coefficient beta used in simulation.
.join_data: a list of simulated values to be appended as a .join_data column in the output of targets generated by functions such as tar_jags_rep_summary(). Contains the regression coefficient beta (numeric of length 1) and prior predictive data y (numeric vector).
The tar_jags_example_data() function draws a JAGS dataset from the prior predictive distribution of the model from tar_jags_example_file(). First, the regression coefficient beta is drawn from its standard normal prior, and the covariate x is computed. Then, conditional on the beta draws and the covariate, the response vector y is drawn from its Normal(x * beta, 1) likelihood.
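The simulation steps above can be sketched in base R. The function simulate_example_data() is a hypothetical re-creation, not the package's implementation. In particular, the evenly spaced covariate is an assumption, since the text only says that x "is computed".

```r
# Hypothetical re-creation of the simulation steps described above;
# the package's own implementation may differ.
simulate_example_data <- function(n = 10L) {
  # Draw the regression coefficient from its standard normal prior.
  beta <- stats::rnorm(n = 1, mean = 0, sd = 1)
  # Assumption: an evenly spaced covariate on [-1, 1].
  x <- seq(from = -1, to = 1, length.out = n)
  # Draw the response from the Normal(x * beta, 1) likelihood.
  y <- stats::rnorm(n = n, mean = x * beta, sd = 1)
  list(
    n = n,
    x = x,
    y = y,
    true_beta = beta,
    .join_data = list(beta = beta, y = y)
  )
}

set.seed(1)
sim <- simulate_example_data(n = 10L)
str(sim)
```

Because beta is stored alongside the data, a downstream summary can be compared against the value that generated y.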
List, dataset compatible with the model file from tar_jags_example_file(). The output has a .join_data element so the true value of beta from the simulation is automatically appended to the beta rows of the summary output.
tar_jags_example_data()
Overwrites the file at path with a built-in example JAGS model file.
tar_jags_example_file(path = tempfile(pattern = "", fileext = ".jags"))
path |
Character of length 1, file path to write the model file. |
NULL (invisibly).
path <- tempfile(pattern = "", fileext = ".jags")
tar_jags_example_file(path = path)
writeLines(readLines(path))
Run multiple MCMCs on simulated datasets and return DIC and the effective number of parameters for each run.
tar_jags_rep_dic( name, jags_files, parameters.to.save, data = list(), batches = 1L, reps = 1L, combine = TRUE, n.cluster = 1, n.chains = 3, n.iter = 2000, n.burnin = as.integer(n.iter/2), n.thin = 1, jags.module = c("glm", "dic"), inits = NULL, RNGname = c("Wichmann-Hill", "Marsaglia-Multicarry", "Super-Duper", "Mersenne-Twister"), jags.seed = NULL, stdout = NULL, stderr = NULL, progress.bar = "text", refresh = 0, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = "qs", format_df = "fst_tbl", repository = targets::tar_option_get("repository"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Symbol, base name for the collection of targets. Serves as a prefix for target names. |
jags_files |
Character vector of JAGS model files. If you
supply multiple files, each model will run on the one shared dataset
generated by the code in |
parameters.to.save |
Model parameters to save, passed to
|
data |
Code to generate the |
batches |
Number of batches. Each batch runs a model |
reps |
Number of replications per batch. Ideally, each rep
should produce its own random dataset using the code
supplied to |
combine |
Logical, whether to create a target to combine all the model results into a single data frame downstream. Convenient, but duplicates data. |
n.cluster |
Number of parallel processes, passed to
|
n.chains |
Number of MCMC chains, passed to
|
n.iter |
Number of iterations (including warmup), passed to
|
n.burnin |
Number of warmup iterations, passed to
|
n.thin |
Thinning interval, passed to
|
jags.module |
Character vector of JAGS modules to load, passed to
|
inits |
Initial values of model parameters, passed to
|
RNGname |
Choice of random number generator, passed to
|
jags.seed |
The |
stdout |
Character of length 1, file path to write the stdout stream
of the model when it runs. Set to |
stderr |
Character of length 1, file path to write the stderr stream
of the model when it runs. Set to |
progress.bar |
Type of progress bar, passed to
|
refresh |
Frequency for refreshing the progress bar, passed to
|
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Character of length 1, storage format of the data frames
of posterior summaries and other data frames returned by targets.
We recommend efficient data frame formats
such as |
format_df |
Character of length 1, storage format of the data frame
targets such as posterior draws. We recommend efficient data frame formats
such as |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
The MCMC targets use R2jags::jags() if n.cluster is 1 and R2jags::jags.parallel() otherwise. Most arguments to tar_jags() are forwarded to these functions.
tar_jags_rep_dic() returns a list of target objects. See the "Target objects" section for background.
The target names use the name argument as a prefix, and the individual elements of jags_files appear in the suffixes where applicable.
As an example, the specific target objects returned by tar_jags_rep_dic(name = x, jags_files = "y.jags") are as follows.
x_file_y: reproducibly track the JAGS model file. Returns a character vector of length 1 with the path to the JAGS model file.
x_lines_y: read the contents of the JAGS model file for safe transport to parallel workers. Returns a character vector of lines in the model file.
x_data: use dynamic branching to generate multiple JAGS datasets from the R expression in the data argument. Each dynamic branch returns a batch of JAGS data lists.
x_y: run JAGS on each dataset from x_data. Each dynamic branch returns a tidy data frame of DIC results for each batch of data.
x: combine all the batches from x_y into a non-dynamic target. Suppressed if combine is FALSE. Returns a long tidy data frame with all DIC info from all the branches of x_y.
Rep-specific random number generator seeds for the data and models are automatically set based on the batch, rep, parent target name, and tar_option_get("seed"). This ensures the rep-specific seeds do not change when you change the batching configuration (e.g. 40 batches of 10 reps each vs 20 batches of 20 reps each). Each data seed is in the .seed list element of the output, and each JAGS seed is in the .seed column of each JAGS model output.
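The two batching configurations mentioned above cover the same set of replications, which the sketch below checks in base R. The helper rep_layout() and its global index column are hypothetical illustrations and do not show how jagstargets derives its seeds. The point is only that a quantity keyed to the overall rep position is unaffected by how reps are grouped into batches.

```r
# Hypothetical helper (not part of jagstargets): enumerate every
# (batch, rep) pair together with its overall position among all reps.
rep_layout <- function(batches, reps) {
  layout <- expand.grid(rep = seq_len(reps), batch = seq_len(batches))
  # Overall position of each rep across all batches.
  layout$index <- (layout$batch - 1L) * reps + layout$rep
  layout[, c("batch", "rep", "index")]
}

config_a <- rep_layout(batches = 40, reps = 10)
config_b <- rep_layout(batches = 20, reps = 20)

# Both configurations enumerate the same overall rep positions.
print(nrow(config_a))
print(nrow(config_b))
print(identical(config_a$index, config_b$index))
```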
Most jagstargets functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
if (identical(Sys.getenv("TAR_JAGS_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(jagstargets)
  # Do not use a temp file for a real project
  # or else your targets will always rerun.
  tmp <- tempfile(pattern = "", fileext = ".jags")
  tar_jags_example_file(tmp)
  list(
    tar_jags_rep_dic(
      your_model,
      jags_files = tmp,
      data = tar_jags_example_data(),
      parameters.to.save = "beta",
      batches = 2,
      reps = 2,
      stdout = R.utils::nullfile(),
      stderr = R.utils::nullfile()
    )
  )
}, ask = FALSE)
targets::tar_make()
})
}
Run multiple MCMCs on simulated datasets and return posterior samples and the effective number of parameters for each run.
tar_jags_rep_draws( name, jags_files, parameters.to.save, data = list(), batches = 1L, reps = 1L, transform = NULL, combine = FALSE, n.cluster = 1, n.chains = 3, n.iter = 2000, n.burnin = as.integer(n.iter/2), n.thin = 1, jags.module = c("glm", "dic"), inits = NULL, RNGname = c("Wichmann-Hill", "Marsaglia-Multicarry", "Super-Duper", "Mersenne-Twister"), jags.seed = NULL, stdout = NULL, stderr = NULL, progress.bar = "text", refresh = 0, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = "qs", format_df = "fst_tbl", repository = targets::tar_option_get("repository"), error = targets::tar_option_get("error"), memory = "transient", garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Symbol, base name for the collection of targets. Serves as a prefix for target names. |
jags_files |
Character vector of JAGS model files. If you
supply multiple files, each model will run on the one shared dataset
generated by the code in |
parameters.to.save |
Model parameters to save, passed to
|
data |
Code to generate the |
batches |
Number of batches. Each batch runs a model |
reps |
Number of replications per batch. Ideally, each rep
should produce its own random dataset using the code
supplied to |
transform |
Symbol or |
combine |
Logical, whether to create a target to combine all the model results into a single data frame downstream. Convenient, but duplicates data. |
n.cluster |
Number of parallel processes, passed to
|
n.chains |
Number of MCMC chains, passed to
|
n.iter |
Number of iterations (including warmup), passed to
|
n.burnin |
Number of warmup iterations, passed to
|
n.thin |
Thinning interval, passed to
|
jags.module |
Character vector of JAGS modules to load, passed to
|
inits |
Initial values of model parameters, passed to
|
RNGname |
Choice of random number generator, passed to
|
jags.seed |
The |
stdout |
Character of length 1, file path to write the stdout stream
of the model when it runs. Set to |
stderr |
Character of length 1, file path to write the stderr stream
of the model when it runs. Set to |
progress.bar |
Type of progress bar, passed to
|
refresh |
Frequency for refreshing the progress bar, passed to
|
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Character of length 1, storage format of the data frames
of posterior summaries and other data frames returned by targets.
We recommend efficient data frame formats
such as |
format_df |
Character of length 1, storage format of the data frame
targets such as posterior draws. We recommend efficient data frame formats
such as |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
The MCMC targets use R2jags::jags() if n.cluster is 1 and R2jags::jags.parallel() otherwise. Most arguments to tar_jags() are forwarded to these functions.
tar_jags_rep_draws() returns a list of target objects. See the "Target objects" section for background.
The target names use the name argument as a prefix, and the individual elements of jags_files appear in the suffixes where applicable.
As an example, the specific target objects returned by tar_jags_rep_draws(name = x, jags_files = "y.jags") are as follows.
x_file_y: reproducibly track the JAGS model file. Returns a character vector of length 1 with the path to the JAGS model file.
x_lines_y: read the contents of the JAGS model file for safe transport to parallel workers. Returns a character vector of lines in the model file.
x_data: use dynamic branching to generate multiple JAGS datasets from the R expression in the data argument. Each dynamic branch returns a batch of JAGS data lists.
x_y: run JAGS on each dataset from x_data. Each dynamic branch returns a tidy data frame of draws for each batch of data.
x: combine all the batches from x_y into a non-dynamic target. Suppressed if combine is FALSE. Returns a long tidy data frame with all draws from all the branches of x_y.
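Conceptually, the combined target x holds the per-batch data frames from x_y stacked into one long data frame. The base R sketch below illustrates this with made-up values. The column names batch, rep, and beta are hypothetical and are not the package's actual output schema.

```r
# Hypothetical stand-ins for two dynamic branches of x_y.
# Column names and values are illustrative only.
batch1 <- data.frame(batch = 1L, rep = 1:2, beta = c(0.1, -0.3))
batch2 <- data.frame(batch = 2L, rep = 1:2, beta = c(0.7, 0.2))

# Row-bind the per-batch data frames into one long data frame,
# which is what a combined downstream target conceptually holds.
combined <- do.call(rbind, list(batch1, batch2))
print(combined)
```

The combined data frame is a second copy of rows already stored in the branches. For large posterior draws, that duplicated storage is a reason to leave combine at FALSE.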
Rep-specific random number generator seeds for the data and models are automatically set based on the batch, rep, parent target name, and tar_option_get("seed"). This ensures the rep-specific seeds do not change when you change the batching configuration (e.g. 40 batches of 10 reps each vs 20 batches of 20 reps each). Each data seed is in the .seed list element of the output, and each JAGS seed is in the .seed column of each JAGS model output.
Most jagstargets functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
if (identical(Sys.getenv("TAR_JAGS_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(jagstargets)
  # Do not use a temp file for a real project
  # or else your targets will always rerun.
  tmp <- tempfile(pattern = "", fileext = ".jags")
  tar_jags_example_file(tmp)
  list(
    tar_jags_rep_draws(
      your_model,
      jags_files = tmp,
      data = tar_jags_example_data(),
      parameters.to.save = "beta",
      batches = 2,
      reps = 2,
      stdout = R.utils::nullfile(),
      stderr = R.utils::nullfile()
    )
  )
}, ask = FALSE)
targets::tar_make()
})
}
Run multiple MCMCs on simulated datasets and return posterior summaries and the effective number of parameters for each run.
tar_jags_rep_summary( name, jags_files, parameters.to.save, data = list(), variables = NULL, summaries = NULL, summary_args = NULL, batches = 1L, reps = 1L, combine = TRUE, n.cluster = 1, n.chains = 3, n.iter = 2000, n.burnin = as.integer(n.iter/2), n.thin = 1, jags.module = c("glm", "dic"), inits = NULL, RNGname = c("Wichmann-Hill", "Marsaglia-Multicarry", "Super-Duper", "Mersenne-Twister"), jags.seed = NULL, stdout = NULL, stderr = NULL, progress.bar = "text", refresh = 0, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = "qs", format_df = "fst_tbl", repository = targets::tar_option_get("repository"), error = targets::tar_option_get("error"), memory = "transient", garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Symbol, base name for the collection of targets. Serves as a prefix for target names. |
jags_files |
Character vector of JAGS model files. If you
supply multiple files, each model will run on the one shared dataset
generated by the code in |
parameters.to.save |
Model parameters to save, passed to
|
data |
Code to generate the |
variables |
Character vector of model parameter names. The output posterior summaries are restricted to these variables. |
summaries |
List of summary functions passed to |
summary_args |
List of summary function arguments passed to
|
batches |
Number of batches. Each batch runs a model |
reps |
Number of replications per batch. Ideally, each rep
should produce its own random dataset using the code
supplied to |
combine |
Logical, whether to create a target to combine all the model results into a single data frame downstream. Convenient, but duplicates data. |
n.cluster |
Number of parallel processes, passed to
|
n.chains |
Number of MCMC chains, passed to
|
n.iter |
Number of iterations (including warmup), passed to
|
n.burnin |
Number of warmup iterations, passed to
|
n.thin |
Thinning interval, passed to
|
jags.module |
Character vector of JAGS modules to load, passed to
|
inits |
Initial values of model parameters, passed to
|
RNGname |
Choice of random number generator, passed to
|
jags.seed |
The |
stdout |
Character of length 1, file path to write the stdout stream
of the model when it runs. Set to |
stderr |
Character of length 1, file path to write the stderr stream
of the model when it runs. Set to |
progress.bar |
Type of progress bar, passed to
|
refresh |
Frequency for refreshing the progress bar, passed to
|
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Character of length 1, storage format of the data frames
of posterior summaries and other data frames returned by targets.
We recommend efficient data frame formats
such as |
format_df |
Character of length 1, storage format of the data frame
targets such as posterior draws. We recommend efficient data frame formats
such as |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
The MCMC targets use R2jags::jags() if n.cluster is 1 and R2jags::jags.parallel() otherwise. Most arguments to tar_jags() are forwarded to these functions.
tar_jags_rep_summary() returns a list of target objects. See the "Target objects" section for background.
The target names use the name argument as a prefix, and the individual elements of jags_files appear in the suffixes where applicable. As an example, the specific target objects returned by tar_jags_rep_summary(name = x, jags_files = "y.jags") are as follows.
x_file_y: reproducibly track the JAGS model file. Returns a character vector of length 1 with the path to the JAGS model file.
x_lines_y: read the contents of the JAGS model file for safe transport to parallel workers. Returns a character vector of lines in the model file.
x_data: use dynamic branching to generate multiple JAGS datasets from the R expression in the data argument. Each dynamic branch returns a batch of JAGS data lists.
x_y: run JAGS on each dataset from x_data. Each dynamic branch returns a tidy data frame of summaries for each batch of data.
x: combine all the batches from x_y into a non-dynamic target. Suppressed if combine is FALSE. Returns a long tidy data frame with all summaries from all the branches of x_y.
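The naming pattern above can be sketched in base R. The helper sketch_target_names() is hypothetical and not part of jagstargets. It assumes the suffix is the model file name without its directory or extension, which matches the documented example but is not confirmed for every input.

```r
# Hypothetical helper (not part of jagstargets): reproduce the documented
# naming pattern for a single JAGS model file.
sketch_target_names <- function(name, jags_files, combine = TRUE) {
  stopifnot(length(jags_files) == 1L)
  # Assumption: the suffix is the file name without directory or extension.
  suffix <- tools::file_path_sans_ext(basename(jags_files))
  out <- c(
    paste(name, "file", suffix, sep = "_"),  # tracks the model file
    paste(name, "lines", suffix, sep = "_"), # lines of the model file
    paste(name, "data", sep = "_"),          # batches of simulated datasets
    paste(name, suffix, sep = "_")           # per-batch model summaries
  )
  # The combined target carries the bare name and exists only if combine = TRUE.
  if (combine) {
    out <- c(out, name)
  }
  out
}

target_names <- sketch_target_names("x", "y.jags")
print(target_names)
```

In a real pipeline, targets::tar_manifest() lists the target names that were actually generated, so the result of a sketch like this can be checked against it.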
Rep-specific random number generator seeds for the data and models are automatically set based on the batch, rep, parent target name, and tar_option_get("seed"). This ensures the rep-specific seeds do not change when you change the batching configuration (e.g. 40 batches of 10 reps each vs 20 batches of 20 reps each). Each data seed is in the .seed list element of the output, and each JAGS seed is in the .seed column of each JAGS model output.
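To see why seeds can stay fixed when the batching changes, consider a toy sketch. It is an illustration only and not the algorithm jagstargets uses internally. The functions toy_seed() and rep_seeds() are hypothetical. The sketch assumes each seed depends on the parent target name and on the position of the rep across all batches, so any batching with the same total number of reps gives the same seeds.

```r
# Toy illustration only: this is NOT the seed algorithm used by jagstargets.
# toy_seed() is a hypothetical deterministic function of its inputs.
toy_seed <- function(parent, index, base_seed = 0L) {
  codes <- utf8ToInt(paste(parent, index, base_seed, sep = "|"))
  as.integer(sum(codes * seq_along(codes)) %% 100000L)
}

# One seed per rep, computed from the rep's global position across batches.
rep_seeds <- function(parent, batches, reps) {
  # rep varies fastest, so rows are ordered batch by batch.
  grid <- expand.grid(rep = seq_len(reps), batch = seq_len(batches))
  index <- (grid$batch - 1L) * reps + grid$rep
  vapply(index, function(i) toy_seed(parent, i), integer(1))
}

seeds_a <- rep_seeds("x_data", batches = 4, reps = 3)
seeds_b <- rep_seeds("x_data", batches = 2, reps = 6)

# Both configurations cover 12 reps and produce the same seeds in order.
same_seeds <- identical(seeds_a, seeds_b)
print(same_seeds)
```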
Most jagstargets functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
if (identical(Sys.getenv("TAR_JAGS_EXAMPLES"), "true")) {
  targets::tar_dir({ # tar_dir() runs code from a temporary directory.
    targets::tar_script({
      library(jagstargets)
      # Do not use a temp file for a real project
      # or else your targets will always rerun.
      tmp <- tempfile(pattern = "", fileext = ".jags")
      tar_jags_example_file(tmp)
      list(
        tar_jags_rep_summary(
          your_model,
          jags_files = tmp,
          data = tar_jags_example_data(),
          parameters.to.save = "beta",
          batches = 2,
          reps = 2,
          stdout = R.utils::nullfile(),
          stderr = R.utils::nullfile()
        )
      )
    }, ask = FALSE)
    targets::tar_make()
  })
}
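Once the pipeline has run, the combined summary target is a data frame with one row per variable and replication, so it can be aggregated across reps. The sketch below uses a hand-made mock data frame rather than real output. The column names variable, .rep, q5 and q95, and the value of truth, are assumptions made for illustration. Check the columns of your own output before adapting it.

```r
# Mock stand-in for the combined summary data frame. In a real project the
# data frame would come from targets::tar_read(your_model) after tar_make().
# Column names and values here are assumptions, not real jagstargets output.
mock_summary <- data.frame(
  variable = "beta",
  .rep = paste0("rep_", 1:4),
  q5 = c(-0.2, 0.1, 0.4, -0.5),
  q95 = c(1.1, 1.6, 1.9, 0.8)
)

# Hypothetical true parameter value used to simulate the datasets.
truth <- 1

# A rep "covers" the truth if the value lies inside its 90% interval.
mock_summary$covered <- mock_summary$q5 < truth & truth < mock_summary$q95

# Share of reps whose interval contains the truth.
coverage <- mean(mock_summary$covered)
print(coverage)
```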