Exercise 7: Periodic Removal

This exercise examines how to force jobs to be removed, held, or released. The following expressions can be set in the submit file:

  • If the expression is true, HTCondor removes the job.
    periodic_remove  = (expression)
  • If the expression is true, HTCondor releases a held job, putting it in the idle state again.
    periodic_release = (expression)
  • If the expression is true, HTCondor puts the job in the hold state.
    periodic_hold    = (expression)

The condor_schedd scheduler evaluates these expressions every 5 minutes; if an expression is true, the corresponding action is executed. Job attributes such as EnteredCurrentStatus, JobStatus, JobCurrentStartDate, etc. can be used in the expressions. A non-exhaustive list can be found at http://research.cs.wisc.edu/htcondor/manual/latest/12_Appendix_A.html.
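JobStatus is stored as an integer code, and the submit file in this exercise tests for the value 5 (held). As a quick reference, here is a minimal shell sketch of the mapping between codes and state names. The function name job_status_name is hypothetical and not part of HTCondor; it only illustrates the codes.

```shell
#!/bin/bash
# Hypothetical helper (not an HTCondor tool): translate a JobStatus
# integer code into the name of the corresponding job state.
job_status_name() {
    case "$1" in
        1) echo "Idle" ;;
        2) echo "Running" ;;
        3) echo "Removing" ;;
        4) echo "Completed" ;;
        5) echo "Held" ;;
        6) echo "Transferring Output" ;;
        7) echo "Suspended" ;;
        *) echo "Unknown" ;;
    esac
}

job_status_name 5   # prints "Held"
```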

In this exercise, a job for the executable welcome.sh is submitted to the queue. If the job enters the hold state and remains there for more than 60 seconds, the scheduler releases it the next time it evaluates periodic_release.
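The release condition can be sketched in plain shell arithmetic to make the logic explicit. This is only an illustration of the comparison, not how HTCondor evaluates expressions internally; the function should_release and its three arguments (status code, time the status was entered, current time, all as Unix seconds) are assumptions made for this example.

```shell
#!/bin/bash
# Hypothetical helper mirroring the periodic_release expression of this exercise:
#   (JobStatus == 5) && (time() - EnteredCurrentStatus) > 60
# Arguments: status code, time the status was entered, current time (Unix seconds).
should_release() {
    local status=$1 entered=$2 now=$3
    if (( status == 5 && now - entered > 60 )); then
        echo "release"
    else
        echo "keep"
    fi
}

should_release 5 1000 1030   # held for 30 s  -> prints "keep"
should_release 5 1000 1061   # held for 61 s  -> prints "release"
should_release 2 1000 2000   # running job    -> prints "keep"
```

Because the scheduler only checks the expression at its evaluation interval, the actual release can happen noticeably later than 60 seconds after the job was held.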

The script welcome.sh contains a simple command:

#!/bin/bash

echo "welcome to HTCondor tutorial"

Run condor_submit exercise7.sub with the following submit description file to submit the job.

executable              = welcome.sh
arguments               = $(ClusterId)$(ProcId)
output                  = output/welcome.$(ClusterId).$(ProcId).out
error                   = error/welcome.$(ClusterId).$(ProcId).err
log                     = log/welcome.log

periodic_release        = (JobStatus == 5) && ((time() - EnteredCurrentStatus) > 60)
queue
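The submit file above writes to the directories output, error, and log. A minimal sketch of the preparation and submission steps follows, assuming these directories do not exist yet and that welcome.sh and exercise7.sub are in the current directory. The guard around condor_submit is an addition for this example, so that the script does nothing harmful on a machine without the HTCondor client tools.

```shell
#!/bin/bash
# Create the directories referenced by output, error and log in exercise7.sub.
mkdir -p output error log

# Make the script executable if it is present in the current directory.
if [ -f welcome.sh ]; then
    chmod +x welcome.sh
fi

# Submit only if the HTCondor client tools are available on this machine.
if command -v condor_submit >/dev/null 2>&1; then
    condor_submit exercise7.sub
    # Print cluster, process, status code and the time the status was entered.
    condor_q -af ClusterId ProcId JobStatus EnteredCurrentStatus
else
    echo "condor_submit not found; run this on a submit node"
fi
```

The condor_q line prints the same attributes that the periodic_release expression uses, which makes it easy to follow the job while the exercise runs.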

Note: periodic_release is useful only if the executable itself is correct and fails only under specific circumstances, such as problems on the execute machines. Otherwise, HTCondor will reschedule (not resubmit) the job, and the job will end up in the hold state again.