Python API
Why use the Python API
HTCondor has long had a Python API, but until recently it was not possible to use it to submit jobs to our HTCondor installation at CERN. Fortunately, the version that we are now running is capable of submitting jobs to run here at CERN with the credentials needed to interact with things like network filesystems.
The documentation for the API can be found here. There are tutorials, for example here, as well as fully-fledged Binder tutorials here. We would urge you to consult them. The point of this guide is to give you tips on how to use those bindings here at CERN.
Python job submission at CERN
Python submission: token management
In a Python shell, here is how to get your Kerberos token submitted to the schedd. It assumes that you are running on lxplus, or on a machine set up like lxplus to have SEC_CREDENTIAL_PRODUCER in the condor config (if you are running your own installation and can already submit jobs to the CERN batch system from the command line, this will also work via the API). It also assumes that you have valid Kerberos tickets.
import htcondor
col = htcondor.Collector()
credd = htcondor.Credd()
# passing None lets the configured SEC_CREDENTIAL_PRODUCER obtain the Kerberos credential
credd.add_user_cred(htcondor.CredTypes.Kerberos, None)
It should be noted at this point that there is a credd.check_user_cred() method, but it does not currently work with Kerberos tokens; this should be fixed upstream soon. For the time being, it is good practice to submit a token in each session. You can normally count on tokens being maintained for a few hours, but adding a token is relatively lightweight, so it can be repeated as necessary.
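Until check_user_cred() works for Kerberos tokens, one simple approach is to re-add the credential at the start of every session. The helper below is only a minimal sketch of that pattern (refresh_credential() is a hypothetical name, not part of the bindings), and it assumes you have valid Kerberos tickets:

import htcondor

def refresh_credential():
    """Store a fresh Kerberos credential on the credd for this session."""
    credd = htcondor.Credd()
    # as above, None lets the configured credential producer obtain the token
    credd.add_user_cred(htcondor.CredTypes.Kerberos, None)

# call once per session, before submitting jobs
refresh_credential()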
Submitting a job
With the credentials on the schedd, submitting with the Submit object is not a lot different from using a normal submit file.
>>> sub = htcondor.Submit()
>>> sub['Executable'] = "/afs/cern.ch/user/b/bejones/tmp/condor/hello.sh"
>>> sub['Error'] = "/afs/cern.ch/user/b/bejones/tmp/condor/error/hello-$(ClusterId).$(ProcId).err"
>>> sub['Output'] = "/afs/cern.ch/user/b/bejones/tmp/condor/output/hello-$(ClusterId).$(ProcId).out"
>>> sub['Log'] = "/afs/cern.ch/user/b/bejones/tmp/condor/log/hello-$(ClusterId).log"
>>> sub['MY.SendCredential'] = True
>>> sub['+JobFlavour'] = '"tomorrow"'
>>> sub['request_cpus'] = '1'
>>>
>>> schedd = htcondor.Schedd()
>>> with schedd.transaction() as txn:
... cluster_id = sub.queue(txn)
...
>>> print(cluster_id)
1034
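For a non-interactive script, the same steps can be combined. The following is a minimal sketch that simply reuses the credential step and the submit description shown above (the AFS paths and hello.sh executable are just the example values from this guide):

import htcondor

# make sure a Kerberos credential is on the credd before submitting
credd = htcondor.Credd()
credd.add_user_cred(htcondor.CredTypes.Kerberos, None)

# a Submit object can also be built directly from a dict of submit-file style key/values
sub = htcondor.Submit({
    "Executable": "/afs/cern.ch/user/b/bejones/tmp/condor/hello.sh",
    "Error": "/afs/cern.ch/user/b/bejones/tmp/condor/error/hello-$(ClusterId).$(ProcId).err",
    "Output": "/afs/cern.ch/user/b/bejones/tmp/condor/output/hello-$(ClusterId).$(ProcId).out",
    "Log": "/afs/cern.ch/user/b/bejones/tmp/condor/log/hello-$(ClusterId).log",
    "MY.SendCredential": "True",
    "+JobFlavour": '"tomorrow"',
    "request_cpus": "1",
})

schedd = htcondor.Schedd()
with schedd.transaction() as txn:
    cluster_id = sub.queue(txn)

print(f"submitted cluster {cluster_id}")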
Querying jobs
>>> print(cluster_id)
1034
>>> q = schedd.query(constraint='ClusterId == 1034', projection=['ClusterId','ProcId','Cmd','Args', 'JobStatus'])
>>> q
[[ ClusterId = 1034; ProcId = 0; ServerTime = 1622815338; Cmd = "/afs/cern.ch/user/b/bejones/tmp/condor/hello.sh"; JobStatus = 1 ]]
>>> for job in q:
... print(f'{job.get("ClusterId")}.{job.get("ProcId")} status: {htcondor.JobStatus(job.get("JobStatus")).name}')
...
1034.0 status: IDLE
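If your script needs to wait for the job, a simple (if naive) approach is to poll the same query until the job has left the queue. This is just a sketch, assuming the cluster_id from the submission above and an arbitrary 30-second polling interval:

import time
import htcondor

schedd = htcondor.Schedd()
constraint = f"ClusterId == {cluster_id}"

while True:
    ads = schedd.query(constraint=constraint, projection=["ProcId", "JobStatus"])
    if not ads:
        # no matching ads left: the job has finished and left the queue
        break
    for ad in ads:
        status = htcondor.JobStatus(ad["JobStatus"]).name
        print(f'{cluster_id}.{ad["ProcId"]} status: {status}')
    time.sleep(30)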