Apptainer (aka Singularity)
The Linux Foundation version of Singularity is now called Apptainer,
but is otherwise functionally identical.
This document explains how to take advantage of the Apptainer integration with the HTCondor container universe. We also suggest taking advantage of the file transfer mechanisms to make this, we hope, as easy as possible to use.
Container location
As explained elsewhere, the best way to use Apptainer with the CERN batch service is to have the container image unpacked into /cvmfs/unpacked.cern.ch. For the purposes of this example, we'll use /cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/batch-team/containers/plusbatch/cs8-full:latest
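Before submitting, it can be worth checking that the unpacked image is actually visible under /cvmfs. A minimal sketch (it assumes cvmfs is mounted on the machine you run it on; on machines without cvmfs it simply reports that the path is not visible):

```shell
#!/bin/bash
# Check whether the unpacked container image is visible on this machine.
IMAGE=/cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/batch-team/containers/plusbatch/cs8-full:latest

if [ -d "$IMAGE" ]; then
    echo "image found: $IMAGE"
else
    echo "image not visible here (is cvmfs mounted on this machine?)"
fi
```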
Input / Output data
When the container is running, the initial working directory inside the container is the initial working directory of the batch job. We can have HTCondor transfer data to that directory at the start of the job, and from it at the end. Standard HTCondor file transfer semantics apply, so in this example we'll transfer data to and from EOS using the file transfer plugin method.
Example submit file
```
executable = runme.sh
log = singularity.$(ClusterId).log
error = singularity.$(ClusterId).$(ProcId).err
output = singularity.$(ClusterId).$(ProcId).out
should_transfer_files = YES
MY.JobFlavour = "longlunch"
transfer_input_files = root://eosuser.cern.ch//eos/user/b/bejones/scripts/payload.py, root://eosuser.cern.ch//eos/user/b/bejones/data/input.txt
output_destination = root://eosuser.cern.ch//eos/user/b/bejones/results/$(ClusterId)/
transfer_output_files = output.txt
MY.XRDCP_CREATE_DIR = True
MY.SingularityImage = "/cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/batch-team/containers/plusbatch/cs8-full:latest"
queue
```
Let's go through that and explain a few things, since not all of it may be necessary for you, but some of it might be.
- The `executable` is a shell script, but later on we're running some Python (the `payload.py`). Sometimes it's useful to set up some environment in a script, then run something else that is being transferred as input.
- The `log` has the `$(ClusterId)` in its name, whereas `error` and `output` also have `$(ProcId)`. If you `queue` more than one job, a single log can be written to by each process, whereas `output` and `error` cannot: they must be unique per process.
- `transfer_input_files`: these are multiple files transferred from EOS. Note that this is for files stored in eosuser; if the instance is different, the server name will have to reflect that. These files end up in the working directory of the job.
- `output_destination`: a directory in EOS that you want the output sent to. In this case we're creating one specific to the submission with `$(ClusterId)`. By default you will get back the output, the error and anything else you write to the sandbox, unless you specify a list. We do that here with...
- `transfer_output_files`: the list of files to be sent back. Note that if a file is listed, it has to be produced, or the job will fail.
- `MY.XRDCP_CREATE_DIR`: this says that the output directory should be created by the plugin if it does not exist.
- `MY.SingularityImage`: this specifies the container image that we'll be using.
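For completeness, here is a sketch of what `runme.sh` might look like. This is a hypothetical example, not the author's actual script: in a real job, `payload.py` and `input.txt` arrive in the working directory via `transfer_input_files`, and the fallback branch below only exists so the sketch runs anywhere.

```shell
#!/bin/bash
# Hypothetical runme.sh: set up some environment, then run the transferred payload.
set -e

echo "Working directory: $(pwd)"

# In a real batch job these files are placed here by transfer_input_files.
# The stand-in below just makes the sketch self-contained outside the batch system.
if [ ! -f payload.py ]; then
    printf 'import sys\nfor line in open(sys.argv[1]):\n    print(line.strip())\n' > payload.py
    printf 'demo input\n' > input.txt
fi

# Write output.txt, the file later collected by transfer_output_files.
python3 payload.py input.txt > output.txt
echo "Produced output.txt with $(wc -l < output.txt) line(s)"
```

With the submit file saved as, say, `submit.sub` (a name we choose here, not one from the example above), the job would then be submitted in the usual way with `condor_submit submit.sub`.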