Submit Files

We saw the basics of a submit file in the quick start guide; here we'll drill down into some of the specifics. This is by no means an exhaustive guide, and as ever the HTCondor documentation is the recommended reference. Aside from some differences in the exact memory and CPU amounts you can request, you should be able to use at CERN an HTCondor submit file that you use elsewhere. There are also some local shortcuts that we've added to help people, and we detail those differences here too.

Default schedd mapping

Jobs launched from central services like lxplus or aiadm benefit from our internal configuration, which automatically maps a user to a specific schedd. This mapping ensures that users get a consistent view of their jobs by always interacting with the same schedd.

To see the current mapping, or to find out how to change it, please refer to the documentation for myschedd.

Universe

HTCondor allows submission to different platforms or architectures with the use of something it calls universes. In the submit file, you can specify the universe, like so:

universe = vanilla

The vanilla universe is the default that you will get if you specify nothing, and is in general the one that you'll want to use. However, we will run other universes, such as the docker universe, which allows jobs to be executed inside Docker containers.
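
As an illustration, a docker universe job could look roughly like the sketch below. The docker_image submit command is standard HTCondor; the image name and the script and file names are hypothetical placeholders, and the images actually usable on the CERN service may differ.

```
# Sketch only: image, script and file names are hypothetical placeholders
universe     = docker
docker_image = almalinux:9
executable   = myjob.sh
output       = myjob.out
error        = myjob.err
log          = myjob.log
queue
```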

Input, output and logs

Most jobs require input and output in order to run. HTCondor handles a job's standard streams and its logging through the following submit file directives:

input  = jobinput.txt
output = joboutput.txt
error  = joberr.txt
log    = joblog.txt

For applications that read standard input (stdin) during execution, you can specify an input file, which HTCondor pipes into the stdin of the running executable. The output file receives standard output (stdout) and the error file receives standard error (stderr). Finally, the log file records the job's status changes as reported by HTCondor.

Although HTCondor doesn't require a shared filesystem, the use of one is supported at CERN and enables some additional features, such as being able to use condor_wait on the log file to monitor job state transitions. In the above example the paths are relative, and are presumed to be on a shared filesystem that both the submission node and the condor scheduler can access. At present at CERN this effectively means AFS, though other shared filesystems will be supported as they become available.

There are a number of variables that can be used in the generation of the filenames, which is useful when a submit file is being used to generate multiple jobs. This is an example of such a submit file excerpt:

input  = input/job.$(ClusterId).$(ProcId).txt
output = output/job.$(ClusterId).$(ProcId).output
log    = log/job.$(ClusterId).$(ProcId).log

The variables that can be used in the submit file are detailed in the HTCondor documentation; using them as above keeps the files of each job separate. It is, though, sometimes useful to have a single log file for each submission, in which case condor_wait can watch individual jobs in that log file.
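
A single log file per submission can be obtained by leaving $(ProcId) out of the log path, so that all jobs of one cluster write to the same file. A minimal sketch, reusing the directory layout of the excerpt above:

```
# One log file per submission (cluster) instead of one per job
log = log/job.$(ClusterId).log
```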

Operating system requirements

The default operating system is set system-wide and is currently AlmaLinux9. To select a non-default operating system, please use a requirements attribute in your job submit file.
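
A minimal sketch of such a requirement, assuming the OpSysAndVer machine attribute that the condor_status query on this page reports. The value shown is the current default and merely stands in for whichever value condor_status lists for the OS you want:

```
# Replace the string with a value reported by condor_status for OpSysAndVer
requirements = (OpSysAndVer =?= "AlmaLinux9")
```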

To see the list of available operating systems, and the proportions, run:

$ condor_status -compact -af OpSysAndVer | sort | uniq -c

At the time of writing, though, there's only one answer.

While we're at it, we can look at the architecture instead, which is a bit less mono:

$ condor_status -compact -af Arch | sort | uniq -c
     19 aarch64
   3144 X86_64

(Guess the default!)

OS selection via Containers

Given that we are about to go through a period in which the operating systems on the worker nodes are less homogeneous, there are options to have the OS managed for you, containers being one of them.

The following can be added to your submit file:

MY.WantOS = "el9"

...with valid choices currently being el7, el8 and el9.

This will ensure your job runs in the correct OS version, in whatever flavour of Enterprise Linux clone is run for that release. Your job will match to a machine as usual; if that machine is running a different base OS, your job will run in a Singularity/Apptainer container.

Setting a requirement on OpSysAndVer, as described in the previous section, will ensure that you match on the base OS of the worker node. Combining the two options doesn't really make sense: at best, you would be reducing the pool of workers available to run a container that does not depend on the worker's OS version anyway.

CPU architecture requirements

The vast majority of the worker nodes in the CERN Batch service have processors of the x86_64 architecture, but since early 2023 we have also provided a limited number of ARM servers. Note that, by default, the CPU architecture of the worker node running your job will match the architecture of the submission node.

So if you want to run on an ARM worker node, submitting from an x86_64 node (such as lxplus), you can specify this requirement:

requirements = (Arch =?= "aarch64")

Note that the only ARM servers available are those running AlmaLinux9.

Other requirements

Some requirements can be used to target particular parts of the batch system. These should be used with care (and usually only on the advice of the Batch support team) since, by definition, they limit the nodes that are available for your job. However, it is sometimes necessary, for example if you need a new package that has so far been deployed only to the QA batch nodes.

Requirements can use any ClassAd attribute, and expressions can be chained:

requirements = ( (OpSysAndVer =?= "CentOS7") && (CERNEnvironment =?= "qa") )

Job Flavours

In order to help scheduling, and to provide priority to jobs that are shorter and more efficient, the maximum runtime of a job should be set. This can either be set directly in seconds, or a job can be assigned a "flavour" which will bucket the job into a max runtime. Jobs which exceed the maximum runtime will be terminated. The runtime is the wall time of the job (the elapsed actual time) rather than a calculated cpu time.

The job flavours are as follows:

espresso     = 20 minutes
microcentury = 1 hour
longlunch    = 2 hours
workday      = 8 hours
tomorrow     = 1 day
testmatch    = 3 days
nextweek     = 1 week

The default job flavour for a job submitted with no other information is "espresso".

Setting the job flavour in the submit file is achieved like this:

+JobFlavour = "longlunch"

The maximum runtime can also be set manually by placing the following in your submit file, replacing the placeholder with a number of seconds:

+MaxRuntime = <number of seconds>
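
For instance, a job expected to need up to two hours of wall time (an illustrative figure only) would convert that to 2 × 3600 = 7200 seconds:

```
# 2 hours of wall time, expressed in seconds
+MaxRuntime = 7200
```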

Note that if a job runs out of time, the partial stdout/stderr is not normally copied back. Benchmark jobs can help you choose an appropriate time limit for your jobs.

Resources and limits

As with any system of finite resources, there are limits which you need to be aware of. The time limit of a job is one such limit: the largest MaxRuntime a job can have is that of the longest job flavour, nextweek (1 week).

By default, a job will get a slot with one CPU core, 3 GB of memory and 20 GB of disk space. It is possible to ask for more CPUs or more memory, but the system will scale the number of CPUs you receive to respect the limit of 3 GB per core. To ask for more CPUs, add the following to the submit file:

RequestCpus = 4

This will result in a slot of 4 CPUs, 12 GB of memory and 80 GB of disk.

Note that memory is applied as a soft limit. This means that a job which exceeds its memory allocation may be terminated if there is memory pressure on the node. If you need additional memory, the safe thing to do is to request more slots.
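
As a worked example of this scaling: a job that needs roughly 10 GB of memory (an assumed figure) requires ceil(10 / 3) = 4 cores at 3 GB per core, so it would ask for four CPUs and receive 12 GB:

```
# ~10 GB needed at 3 GB per core -> 4 cores (12 GB of memory, 80 GB of disk)
RequestCpus = 4
```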

Warning

To run multicore jobs, a batch system has to "defragment" or drain slots so that they are not filled with single-core jobs. To keep scheduling efficient, this means in practice that multicore jobs of 4 or more cores are more likely to be scheduled quickly than 2- or 3-core jobs. Be careful that the memory scaling doesn't push your job requirements into an odd-shaped multicore request.

Submitting multiple jobs

We've talked about submitting multiple jobs in passing, and how the interpolated values such as $(ClusterId) and $(ProcId) can help. The mechanics of how to do it are quite simple. The queue directive can take an integer to submit multiple jobs, for example with the following submit file:

executable            = runmore.sh
input                 = input/mydata.$(ProcId)
arguments             = $(ClusterId) $(ProcId)
output                = output/hello.$(ClusterId).$(ProcId).out
error                 = error/hello.$(ClusterId).$(ProcId).err
log                   = log/hello.$(ClusterId).log
queue 150

The "queue 150" directive instructs HTCondor to run 150 jobs. Each job gets its own $(ProcId), counting up from 0 to 149, which can be interpolated into other directives in the submit file, for example to give each job a separate input file.

Note that the queue command has many other features, which are documented in the HTCondor documentation for condor_submit.
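
Two of those other forms are sketched below, using standard HTCondor queue syntax. The file names and variable names (input/*.dat, jobs.txt, infile, sample, events) are hypothetical placeholders. The first form queues one job per file matching a pattern; the second queues one job per line of a text file, assigning the comma-separated fields of each line to the named variables. The second is commented out to keep the sketch to a single queue statement.

```
executable = runmore.sh

# One job per file matching the pattern; $(infile) holds the matched path
arguments  = $(infile)
queue infile matching files input/*.dat

# Alternative: one job per line of jobs.txt, e.g. a line "run1, 5000"
# arguments = $(sample) $(events)
# queue sample, events from jobs.txt
```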

Last update: April 17, 2025