Apptainer (aka Singularity)
The Linux Foundation version of Singularity is now called Apptainer, but it is otherwise functionally identical.
This document explains how to take advantage of the Apptainer integration in the HTCondor container universe. Additionally, we suggest taking advantage of the file transfer mechanisms to make this, we hope, as easy as possible to use.
Container location
As explained elsewhere, the best way to use Apptainer with the CERN batch service is to have the container image unpacked into `/cvmfs/unpacked.cern.ch`. For the purposes of this example, we'll use `/cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/batch-team/containers/plusbatch/cs8-full:latest`
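Before submitting, it can be worth a quick interactive sanity check of the image on a machine that has CVMFS mounted and apptainer installed (lxplus, for instance). A minimal sketch, using the example image path above:

```shell
# Quick interactive check of the unpacked image; run this somewhere with
# apptainer installed and /cvmfs mounted (e.g. lxplus).
IMAGE=/cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/batch-team/containers/plusbatch/cs8-full:latest

if command -v apptainer >/dev/null 2>&1 && [ -e "$IMAGE" ]; then
    # print the OS release the container provides
    apptainer exec "$IMAGE" cat /etc/os-release
else
    echo "apptainer or the CVMFS image is not available on this host"
fi
```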
Input / Output data
When the container is running, the initial working directory inside the container is the initial working directory of the batch job. We can have HTCondor transfer data to that directory at the start of the job, and from it at the end of the job. Standard HTCondor file transfer semantics apply, so in this example we'll transfer data to and from EOS using the file transfer plugin method.
Example submit file
```
executable              = runme.sh
log                     = singularity.$(ClusterId).log
error                   = singularity.$(ClusterId).$(ProcId).err
output                  = singularity.$(ClusterId).$(ProcId).out
should_transfer_files   = YES
MY.JobFlavour           = "longlunch"
transfer_input_files    = root://eosuser.cern.ch//eos/user/b/bejones/scripts/payload.py, root://eosuser.cern.ch//eos/user/b/bejones/data/input.txt
output_destination      = root://eosuser.cern.ch//eos/user/b/bejones/results/$(ClusterId)/
transfer_output_files   = output.txt
MY.XRDCP_CREATE_DIR     = True
MY.SingularityImage     = "/cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/batch-team/containers/plusbatch/cs8-full:latest"
queue
```
Let's go through that and explain a few things; not all of it may be necessary for you, but some of it might be.
- The `executable` is a shell script, but later on we're running some Python (the `payload.py`). Sometimes it's useful to set up some environment in a script, then run something else that is being transferred as input.
- The `log` has the `$(ClusterId)` in its name, whereas `error` and `output` also have `$(ProcId)`. If you `queue` more than one job, a single log can be written to by every process, whereas `output` and `error` cannot be shared; they must be unique per process.
- `transfer_input_files` — these are multiple files transferred from EOS. Note that this is for files stored in eosuser; if the instance is different, the server name will have to reflect that. These files end up in the working directory of the job.
- `output_destination` — a directory in EOS that you want the output sent to. In this case we're creating one specific to the submission with `$(ClusterId)`. By default you will get back output, error and anything else you write to the sandbox, unless you specify a list. We do specify one, though, with...
- `transfer_output_files` — the list of files to be sent back. Note that if a file is listed, it has to be produced, or the job will fail.
- `MY.XRDCP_CREATE_DIR` — this says that the output directory should be created by the plugin if it does not exist.
- `MY.SingularityImage` — this specifies the container image that we'll be using.
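If you `queue` more than one job, you may also want each process to write its output to its own location. A hedged sketch, reusing the example submit file above (the `$(ProcId)` subdirectory is our own convention, not a requirement):

```
output_destination    = root://eosuser.cern.ch//eos/user/b/bejones/results/$(ClusterId)/$(ProcId)/
transfer_output_files = output.txt
queue 3
```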