Apptainer (aka Singularity)
The Linux Foundation version of Singularity is now called apptainer, but it is otherwise functionally identical.
This document explains how to use the apptainer integration in the HTCondor container universe. We also suggest using the file transfer mechanisms, which we hope make this as easy as possible.
As explained elsewhere, the best way to use apptainer with the CERN batch service is to have the container image unpacked into /cvmfs/unpacked.cern.ch. For the purposes of this example, we'll use the image referenced in the submit file below.
Input / Output data
When the container is running, its initial working directory is the initial working directory of the batch job. HTCondor can transfer data to that directory at the start of the job, and from that directory at the end of the job. Standard HTCondor file transfer semantics apply, so in this example we'll transfer data to and from EOS using the file transfer plugin method.
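To make the working-directory behaviour concrete, here is a minimal sketch of what a payload such as payload.py might contain. The real contents of payload.py are not shown in this document, so the function name `process` and the processing step (counting lines and words) are purely illustrative assumptions. The only point is that input.txt is read from, and output.txt is written to, the job's working directory, which is where HTCondor places and collects transferred files.

```python
"""Hypothetical payload: reads input.txt from the job's working directory
and writes output.txt next to it. The processing itself is a placeholder."""
from pathlib import Path


def process(workdir: Path = Path(".")) -> Path:
    """Summarise input.txt into output.txt inside workdir; return the output path."""
    in_path = workdir / "input.txt"    # transferred in before the job starts
    out_path = workdir / "output.txt"  # collected when the job ends

    # A missing input.txt raises FileNotFoundError here, so a script calling
    # process() would exit non-zero instead of silently producing nothing.
    text = in_path.read_text()
    lines = text.splitlines()
    words = text.split()

    out_path.write_text(f"lines={len(lines)} words={len(words)}\n")
    return out_path
```

Calling `process()` with no arguments at the end of such a script would operate on the current working directory, which inside the container is the batch job's working directory.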
Example submit file
executable = runme.sh
log = singularity.$(ClusterId).log
error = singularity.$(ClusterId).$(ProcId).err
output = singularity.$(ClusterId).$(ProcId).out
should_transfer_files = YES
MY.JobFlavour = "longlunch"
transfer_input_files = root://eosuser.cern.ch//eos/user/b/bejones/scripts/payload.py, root://eosuser.cern.ch//eos/user/b/bejones/data/input.txt
output_destination = root://eosuser.cern.ch//eos/user/b/bejones/results/$(ClusterId)/
transfer_output_files = output.txt
MY.XRDCP_CREATE_DIR = True
MY.SingularityImage = "/cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/batch-team/containers/plusbatch/cs8-full:latest"
queue
Let's go through the submit file and explain a few things. Not all of it will be necessary for you, but some of it might be.
executable is a shell script, but later on we run some Python (the payload.py). Sometimes it's useful to set up some environment in a script, then run something else that is transferred as input.
log has $(ClusterId) in its name, whereas error and output also have $(ProcId). If you queue more than one job, the same log can be written to by every process, whereas error and output cannot be shared; they must be unique.
transfer_input_files lists the files to be transferred from EOS. Note that this example is for files stored in eosuser; if the instance is different, the server name has to reflect that. These files end up in the working directory of the job.
output_destination is a directory in EOS that you want the output sent to. In this case we create one specific to the submission with $(ClusterId). By default you get back output, error and anything else you write to the sandbox, unless you specify a list. We do that with...
transfer_output_files is the list of files to be sent back. Note that if a file is listed, it has to be produced or the job will fail.
MY.XRDCP_CREATE_DIR says that the output directory should be created by the plugin if it does not exist.
MY.SingularityImage specifies the container image that we'll be using.
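The point about $(ClusterId) and $(ProcId) in file names can be illustrated with a small simulation. This is not how HTCondor expands macros internally; the helper `expand` and the cluster number 1234 are made up for the illustration. It substitutes the two macros into the name templates from the submit file for three processes of one cluster, to show that the log name is shared while the error names are distinct.

```python
def expand(template: str, cluster_id: int, proc_id: int) -> str:
    """Substitute $(ClusterId) and $(ProcId) in a name template (illustration only)."""
    return (template
            .replace("$(ClusterId)", str(cluster_id))
            .replace("$(ProcId)", str(proc_id)))


# Name templates copied from the example submit file.
LOG = "singularity.$(ClusterId).log"
ERROR = "singularity.$(ClusterId).$(ProcId).err"

cluster_id = 1234  # hypothetical cluster number
procs = range(3)   # as if the submit file ended with "queue 3"

# Collect the distinct file names produced across all processes.
log_names = {expand(LOG, cluster_id, p) for p in procs}
err_names = {expand(ERROR, cluster_id, p) for p in procs}

print(sorted(log_names))  # one shared log name
print(sorted(err_names))  # one error file name per process
```

If the error template lacked $(ProcId), all three processes would map to the same file name, which is the situation the submit file avoids.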