Apptainer (aka Singularity)

The Linux Foundation version of Singularity is now called Apptainer, but it is otherwise functionally identical.

This document explains how to take advantage of the apptainer integration in the HTCondor container universe. Additionally, we'll suggest taking advantage of the HTCondor file transfer mechanisms to make this as easy as possible to use.

Container location

As explained elsewhere, the best way to use apptainer with the CERN batch service is to have the container image unpacked into /cvmfs/unpacked.cern.ch. For the purposes of this example, we'll use /cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/batch-team/containers/plusbatch/cs8-full:latest
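If you have access to a machine with CVMFS mounted (for example lxplus), you can sanity-check the image interactively before submitting. This is a sketch, not part of the official workflow: the guard around `apptainer exec` is only there so it degrades gracefully on hosts without apptainer or the CVMFS mount.

```shell
#!/bin/bash
# Check the unpacked image from the example before using it in a job.
# The path is the one used throughout this document.
IMAGE=/cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/batch-team/containers/plusbatch/cs8-full:latest

if command -v apptainer >/dev/null 2>&1 && [ -d "$IMAGE" ]; then
  # Run a trivial command inside the container to confirm which OS it provides
  apptainer exec "$IMAGE" cat /etc/redhat-release
else
  echo "apptainer or the CVMFS image is not available on this host" >&2
fi
```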

Input / Output data

When the container is running, the initial working directory of the container is the initial working directory of the batch job. We can have HTCondor transfer data to that directory at the start of the job, and from that directory at the end of the job. Standard HTCondor file transfer semantics apply, so in this example we'll transfer data to and from EOS using the file transfer plugin method.

Example submit file

executable              = runme.sh
log                     = singularity.$(ClusterId).log
error                   = singularity.$(ClusterId).$(ProcId).err
output                  = singularity.$(ClusterId).$(ProcId).out
should_transfer_files   = YES
MY.JobFlavour           = "longlunch"
transfer_input_files    = root://eosuser.cern.ch//eos/user/b/bejones/scripts/payload.py, root://eosuser.cern.ch//eos/user/b/bejones/data/input.txt
output_destination      = root://eosuser.cern.ch//eos/user/b/bejones/results/$(ClusterId)/
transfer_output_files   = output.txt
MY.XRDCP_CREATE_DIR     = True
MY.SingularityImage     = "/cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/batch-team/containers/plusbatch/cs8-full:latest"
queue

Let's go through the submit file and explain a few things; not all of it may be necessary for you, but some of it might be.

  • The executable is a shell script, but later on we're running some python (the payload.py). It's often useful to set up some environment in a script and then run something else that has been transferred as input.
  • The log has $(ClusterId) in its name, whereas error and output also include $(ProcId). If you queue more than one job, all the processes can share a single log file, but each process needs its own output and error files, so their names must be unique.
  • transfer_input_files lists multiple files to be transferred from EOS. Note that this example is for files stored in eosuser; if your files are on a different instance, the server name will have to reflect that. These files end up in the working directory of the job.
  • output_destination is the directory in EOS that you want the output sent to. In this case we're creating one specific to the submission with $(ClusterId). By default you will get back the output, the error, and anything else written to the sandbox, unless you specify a list. We do that here with the next setting.
  • transfer_output_files is the list of files to be sent back. Note that any file listed here has to be produced, or the job will fail.
  • MY.XRDCP_CREATE_DIR tells the plugin to create the output directory if it does not exist.
  • MY.SingularityImage specifies the container image that we'll be using.
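Putting the pieces together, runme.sh (the executable above) might look like the following. This is a hypothetical sketch: the contents of payload.py are not shown in this document, so we only assume it reads input.txt and that the wrapper redirects its result to output.txt, the file named in transfer_output_files. The existence check is only there so the sketch degrades gracefully when run outside a real job.

```shell
#!/bin/bash
# runme.sh: set up some environment, then run the transferred payload.
# payload.py and input.txt arrive via transfer_input_files and land in
# the job's working directory, which is also the container's working directory.
set -e
export PYTHONUNBUFFERED=1   # example environment setup

if [ -f payload.py ]; then
  # Write the result to output.txt, the file listed in transfer_output_files;
  # if that file is not produced, the job will fail.
  python3 payload.py input.txt > output.txt
else
  echo "payload.py not found in the working directory" >&2
fi
```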

Last update: May 21, 2024