Send long-running or parallel jobs to a Slurm workload manager (i.e. cluster)
using the slurm_call, slurm_apply, or slurm_map functions.
This package includes three core functions used to send computations to a
Slurm cluster: 1) slurm_call executes a function using a single set of
parameters (passed as a list), 2) slurm_apply evaluates a function in parallel
for each row of parameters in a given data frame, and 3) slurm_map evaluates a
function in parallel for each element of a list.
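As a rough illustration of the three call shapes, the sketch below prepares one job of each kind. It assumes the rslurm package is installed; add_pair, grid, and the job names are hypothetical, and submit = FALSE is used so that the submission scripts are only written to local _rslurm_<jobname> folders rather than sent to a cluster.

```r
library(rslurm)

# Hypothetical example function (not part of the package)
add_pair <- function(a, b) a + b

# 1) slurm_call: a single evaluation, parameters passed as a list
job_call <- slurm_call(add_pair, params = list(a = 1, b = 2),
                       jobname = "demo_call", submit = FALSE)

# 2) slurm_apply: one evaluation per row of a data frame
grid <- data.frame(a = 1:4, b = 5:8)
job_apply <- slurm_apply(add_pair, grid, jobname = "demo_apply",
                         nodes = 2, cpus_per_node = 1, submit = FALSE)

# 3) slurm_map: one evaluation per element of a list
job_map <- slurm_map(list(1, 2, 3), function(x) x^2, jobname = "demo_map",
                     nodes = 2, cpus_per_node = 1, submit = FALSE)
```

Dropping submit = FALSE on a machine with access to Slurm would submit the jobs instead of only writing the scripts.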
The functions slurm_apply and slurm_map automatically split the parameter rows
or list elements into equal-size chunks, each chunk to be processed by a
separate cluster node. They use functions from the parallel package to
parallelize computations across processors on a given node.
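The two levels of splitting described above can be pictured with plain R. This is only a conceptual sketch, not the package's internal code: params, nodes, cpus_per_node, and f are hypothetical, and mclapply stands in for whichever functions of the parallel package are actually used on a node.

```r
library(parallel)

# Hypothetical parameter table with 10 rows
params <- data.frame(x = 1:10, y = 10:1)
nodes <- 3           # assumed number of cluster nodes
cpus_per_node <- 2   # assumed number of cores used on each node

# Level 1: assign contiguous rows to one chunk per node
chunk_id <- sort(rep(seq_len(nodes), length.out = nrow(params)))
chunks <- split(params, chunk_id)

# Level 2: on one node, evaluate its chunk row by row across cores
# (mclapply relies on forking, so fall back to one core on Windows)
f <- function(x, y) x * y
cores <- if (.Platform$OS.type == "windows") 1 else cpus_per_node
node1 <- mclapply(seq_len(nrow(chunks[[1]])),
                  function(i) do.call(f, as.list(chunks[[1]][i, ])),
                  mc.cores = cores)
```

With 10 rows and 3 nodes the chunks cannot be exactly equal, so in this sketch the first chunk holds 4 rows and the others 3.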
The output of slurm_apply, slurm_map, or slurm_call is a slurm_job object that
serves as an input to the other functions in the package: print_job_status,
cancel_slurm, get_slurm_out and cleanup_files.
To be compatible with slurm_apply, a function may accept any number of
single-value parameters. The names of these parameters must match the column
names of the params data frame supplied. There are no restrictions on the
types of parameters passed as a list to slurm_call or slurm_map.
If the function passed to slurm_call or slurm_apply requires knowledge of any
R objects (data, custom helper functions) besides params, a character vector
corresponding to their names should be passed to the optional global_objects
argument.
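A minimal sketch of the global_objects argument follows. It assumes rslurm is installed; scale_factor, rescale, and f_scaled are hypothetical objects defined only for this illustration, and submit = FALSE again avoids an actual submission.

```r
library(rslurm)

# Hypothetical objects that exist only in the calling session
scale_factor <- 10
rescale <- function(x) x * scale_factor

# The function to parallelize depends on both objects above
f_scaled <- function(a) rescale(a)

job_glob <- slurm_apply(f_scaled, data.frame(a = 1:4),
                        jobname = "demo_globals",
                        nodes = 1, cpus_per_node = 1,
                        # Names (not values) of the objects the job needs
                        global_objects = c("rescale", "scale_factor"),
                        submit = FALSE)
```

Without the global_objects entry, the worker R sessions would presumably have no definition of rescale or scale_factor when evaluating f_scaled.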
When parallelizing a function, since any error will interrupt all calculations
for the current node, it may be useful to wrap expressions that may generate
errors in a try or tryCatch call. This ensures the computation continues with
the next parameter set after reporting the error.
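One way to apply this advice is sketched below in plain R. The function risky and its wrapper safe_risky are hypothetical; returning NA_real_ on error is just one possible convention for marking a failed parameter set.

```r
# Hypothetical function that fails for some inputs
risky <- function(x) {
  if (x < 0) stop("negative input: ", x)
  sqrt(x)
}

# Wrapper that reports the error and returns NA instead of stopping
safe_risky <- function(x) {
  tryCatch(risky(x), error = function(e) {
    message("skipped x = ", x, ": ", conditionMessage(e))
    NA_real_
  })
}

# Local check: the failing middle value does not interrupt the others
results <- sapply(c(4, -1, 9), safe_risky)
```

The wrapper, rather than the original function, would then be the one passed to slurm_apply, with a params data frame that has a column named x.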
The default output format for get_slurm_out (outtype = "raw") is a list where
each element is the return value of one function call. If the function passed
to slurm_apply produces a vector output, you may use outtype = "table" to
collect the output in a single data frame, with one row per function call.
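The difference between the two shapes can be pictured with hand-built stand-ins. The values below are made up and nothing here is retrieved from a job; the conversion with rbind is an illustration of the layout, not a claim about how get_slurm_out assembles its result.

```r
# Stand-in for a "raw" result: one list element per function call
raw_like <- list(c(s_m = 0.1, s_sd = 1.0),
                 c(s_m = 0.2, s_sd = 2.0))

# Stand-in for a "table" result: one row per call, one column per element
table_like <- as.data.frame(do.call(rbind, raw_like))
```

For a real job object, the corresponding calls would be get_slurm_out with outtype = "raw" or outtype = "table".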
Advanced options for the Slurm workload manager may accompany job submission
by slurm_call, slurm_map, and slurm_apply through the optional slurm_options
argument. For example, passing list(time = '1:30:00') for this option limits
the job to 1 hour and 30 minutes. Some advanced configuration must be set
through environment variables. On a multi-cluster head node, for example, the
SLURM_CLUSTERS environment variable must be set to direct jobs to a
non-default cluster.
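The sketch below combines both mechanisms. It assumes rslurm is installed; the cluster name "cluster2", the job name, and the doubling function are hypothetical, and submit = FALSE keeps the example from contacting a scheduler.

```r
library(rslurm)

# Hypothetical cluster name; set before submitting so Slurm commands see it
Sys.setenv(SLURM_CLUSTERS = "cluster2")

# Request a time limit of 1 hour and 30 minutes through slurm_options
job_opts <- slurm_apply(function(a) a * 2, data.frame(a = 1:4),
                        jobname = "demo_options",
                        nodes = 1, cpus_per_node = 1,
                        slurm_options = list(time = "1:30:00"),
                        submit = FALSE)
```

Other entries of the slurm_options list are given in the same name = value form.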
if (FALSE) {
  # Create a data frame of mean/sd values for normal distributions
  pars <- data.frame(par_m = seq(-10, 10, length.out = 1000),
                     par_sd = seq(0.1, 10, length.out = 1000))

  # Create a function to parallelize
  ftest <- function(par_m, par_sd) {
    samp <- rnorm(10^7, par_m, par_sd)
    c(s_m = mean(samp), s_sd = sd(samp))
  }

  # Submit the job, check its status, and collect the output as a data frame
  sjob1 <- slurm_apply(ftest, pars)
  print_job_status(sjob1)
  res <- get_slurm_out(sjob1, "table")

  # Compare the sample estimates with the input parameters
  # (column names differ and small sampling differences are expected)
  all.equal(pars, res)

  # Remove the temporary files created for the job
  cleanup_files(sjob1)
}