When you start divvy, you may provide a configuration file that specifies one or more compute packages. A compute package is just a set of variables that contains the information needed to run a job, such as a job submission template, the command that you use to submit a job (e.g. sbatch or qsub), and any other variables needed to fill the template (e.g. partition or account). You can find out how to write your own divvy config file, but for this tutorial, we'll just use the default.
Start by importing divvy, and then create a new ComputingConfiguration object. If you provide no arguments, you'll just get a few default packages:
import divvy
dcc = divvy.ComputingConfiguration()
Using default config file, no global config file provided in environment variable(s): ['DIVCFG', 'PEPENV']
Loading divvy config file: /home/nsheff/.local/lib/python2.7/site-packages/divvy/submit_templates/default_compute_settings.yaml
Available packages: default, local, slurm
Activating compute package 'default'
This loads the default compute package, and we see that a few other packages are available. We can explore the compute settings in the loaded (default) package like this:
dcc.compute
{'submission_command': 'sh', 'submission_template': '/home/nsheff/.local/lib/python2.7/site-packages/divvy/submit_templates/localhost_template.sub'}
Here you can see that a compute package is really a simple thing. In this case, it's just two key-value pairs. The submission_template key is a path to a template file, with these contents:
with open(dcc.compute.submission_template) as f:
    print(f.read())
#!/bin/bash
echo 'Compute node:' `hostname`
echo 'Start time:' `date +'%Y-%m-%d %T'`
{CODE} | tee {LOGFILE}
We can populate this simple template by passing values for the {VARIABLE} placeholders in the template:
dcc.write_script("test_local.sub", {"code": "run-this-command", "logfile": "logfile.txt"})
Writing script to /home/nsheff/code/divvy/docs_jupyter/test_local.sub
'test_local.sub'
Now let's look at the contents of our populated template:
with open("test_local.sub") as f:
    print(f.read())
#!/bin/bash
echo 'Compute node:' `hostname`
echo 'Start time:' `date +'%Y-%m-%d %T'`
run-this-command | tee logfile.txt
The write_script() function opens the template specified by the submission_template variable in the compute package, and then populates any template variables with values from the compute package. The original {CODE} and {LOGFILE} placeholders have been replaced by the values we passed to write_script().
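To make the substitution step concrete, here is a minimal sketch in plain Python. It is not divvy's implementation: the helper name populate_template is hypothetical, and it assumes that a lower-case key such as code maps to the upper-case placeholder {CODE}, as the example above suggests.

```python
def populate_template(template_text, variables):
    """Replace each {KEY} placeholder with the matching value, if one was given.

    Hypothetical helper for illustration only; not part of divvy.
    """
    populated = template_text
    for key, value in variables.items():
        # Assumption: placeholders are upper-case, dictionary keys are lower-case.
        populated = populated.replace("{" + key.upper() + "}", str(value))
    return populated


template = "#!/bin/bash\n{CODE} | tee {LOGFILE}\n"
script = populate_template(
    template, {"code": "run-this-command", "logfile": "logfile.txt"}
)
print(script)
```

Placeholders with no matching key are left untouched by this sketch, because str.replace only touches the keys it is given.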
The other variable in the compute package is submission_command, which contains the shell command that would be used to submit this populated template; in this case, it's simply sh, which runs the script in the console. We can activate a different compute package like this:
dcc.activate_package("slurm")
Activating compute package 'slurm'
True
It returns True to indicate that the activation succeeded. Activating a package changes our settings. Let's inspect the new package:
dcc.compute
{'submission_command': 'sbatch', 'submission_template': '/home/nsheff/.local/lib/python2.7/site-packages/divvy/submit_templates/slurm_template.sub'}
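As a rough illustration of how a submission_command and a script path fit together, the sketch below builds the command that would be run for a given package. It only constructs the argument list and does not execute anything. The helper build_submit_command, the plain dictionary standing in for the compute package, and the file name job.sub are all hypothetical, and this is not necessarily how divvy itself submits jobs.

```python
import shlex


def build_submit_command(compute_package, script_path):
    """Return the argument list for submitting script_path (illustration only)."""
    # shlex.split handles commands that carry their own flags, e.g. "qsub -V".
    return shlex.split(compute_package["submission_command"]) + [script_path]


# A plain dict standing in for the active compute package (assumption).
slurm_package = {
    "submission_command": "sbatch",
    "submission_template": "slurm_template.sub",
}
command = build_submit_command(slurm_package, "job.sub")
print(command)  # ['sbatch', 'job.sub']
```

A list like this could be handed to subprocess.run on a machine where the scheduler is installed.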
Now that we've activated the package of interest, let's take a peek at the now-active submission_template:
with open(dcc.compute.submission_template) as f:
    print(f.read())
#!/bin/bash
#SBATCH --job-name='{JOBNAME}'
#SBATCH --output='{LOGFILE}'
#SBATCH --mem='{MEM}'
#SBATCH --cpus-per-task='{CORES}'
#SBATCH --time='{TIME}'
#SBATCH --partition='{PARTITION}'
#SBATCH -m block
#SBATCH --ntasks=1
echo 'Compute node:' `hostname`
echo 'Start time:' `date +'%Y-%m-%d %T'`
{CODE}
This template has many more variables to populate. Any that we don't populate are simply left in the template as-is. Let's pass a value for the code variable and see how this changes the submission script output:
s = dcc.write_script("test_script.sub", {"code":"yellow"})
Writing script to /home/nsheff/code/divvy/docs_jupyter/test_script.sub
Here's the output. Notice that the {CODE} variable has been replaced with the word yellow:
with open("test_script.sub") as f:
    print(f.read())
#!/bin/bash
#SBATCH --job-name='{JOBNAME}'
#SBATCH --output='{LOGFILE}'
#SBATCH --mem='{MEM}'
#SBATCH --cpus-per-task='{CORES}'
#SBATCH --time='{TIME}'
#SBATCH --partition='{PARTITION}'
#SBATCH -m block
#SBATCH --ntasks=1
echo 'Compute node:' `hostname`
echo 'Start time:' `date +'%Y-%m-%d %T'`
yellow
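Because unpopulated variables stay in the script verbatim, it can be handy to check a generated script for leftovers before submitting it. The following is a small sketch of such a check. The function unpopulated_variables is hypothetical (not part of divvy), and it assumes placeholders are written as upper-case names in curly braces, as in the templates shown here.

```python
import re


def unpopulated_variables(script_text):
    """Return the sorted names of {NAME} placeholders still present in the text."""
    # Assumption: a placeholder is an upper-case identifier wrapped in braces.
    return sorted(set(re.findall(r"\{([A-Z][A-Z0-9_]*)\}", script_text)))


script_text = "#SBATCH --job-name='{JOBNAME}'\n#SBATCH --mem='{MEM}'\nyellow\n"
missing = unpopulated_variables(script_text)
print(missing)  # ['JOBNAME', 'MEM']
```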
You can also pass more than one dict object, in priority order, by passing a list. Here, we'll pass two dicts, and any values in the first will override values in the second:
s = dcc.write_script("test_script.sub", [{"code":"red"}, {"code": "yellow", "time": "now"}])
Writing script to /home/nsheff/code/divvy/docs_jupyter/test_script.sub
with open("test_script.sub") as f:
    print(f.read())
#!/bin/bash
#SBATCH --job-name='{JOBNAME}'
#SBATCH --output='{LOGFILE}'
#SBATCH --mem='{MEM}'
#SBATCH --cpus-per-task='{CORES}'
#SBATCH --time='now'
#SBATCH --partition='{PARTITION}'
#SBATCH -m block
#SBATCH --ntasks=1
echo 'Compute node:' `hostname`
echo 'Start time:' `date +'%Y-%m-%d %T'`
red
In this case the value red took priority for the code variable, because it came first; time was not set in the first entry, so its value from the second entry is kept. This allows for cascading, cumulative, priority-based variable replacement.
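The priority rule can be expressed in a few lines of plain Python. This is a sketch of the idea rather than divvy's own code, and merge_by_priority is a hypothetical name: the dicts are applied from lowest to highest priority, so a key in an earlier dict wins while keys that appear only in later dicts survive.

```python
def merge_by_priority(variable_dicts):
    """Merge dicts so that earlier entries override later ones (illustration only)."""
    merged = {}
    # Walk from the last (lowest priority) to the first (highest priority),
    # letting each higher-priority dict overwrite what came before it.
    for variables in reversed(variable_dicts):
        merged.update(variables)
    return merged


merged = merge_by_priority([{"code": "red"}, {"code": "yellow", "time": "now"}])
print(merged)  # {'code': 'red', 'time': 'now'}
```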