Exercise 8e: Script
In a DAG file you can specify the name of a script that runs either before (PRE script) or after (POST script) the job within a node.
The script can contain code that processes output files, cleans up, or removes files once jobs have finished.
SCRIPT [DEFER status time] PRE <name of Node> <name of script> [arguments]
OR
SCRIPT [DEFER status time] POST <name of Node> <name of script> [arguments]
DEFER status time: If the script exits with the value status, it is retried after time seconds.
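As an illustration of DEFER, the following is a minimal sketch of a PRE script that exits with status 4 while its input file is missing or empty. It assumes a DAG line such as SCRIPT DEFER 4 60 PRE C check_input.sh input.dat. The script name check_input.sh, the file input.dat, the status 4 and the 60 seconds are hypothetical choices and not part of this exercise.

```shell
#!/bin/bash
# check_input.sh: hypothetical PRE script, assumed to be declared as
#   SCRIPT DEFER 4 60 PRE C check_input.sh input.dat
# Exiting with the DEFER status asks for a later retry of the script.

DEFER_STATUS=4   # assumption: must match the status given after DEFER

check_input() {
    local input_file="$1"
    if [ -s "$input_file" ]; then
        return 0    # file exists and is non-empty: the PRE script succeeds
    fi
    echo "input '$input_file' not ready, deferring" >&2
    return "$DEFER_STATUS"
}

# Run the check only when a file name is passed; the exit status of the
# script is then the status returned by check_input.
if [ "$#" -gt 0 ]; then
    check_input "$1"
fi
```

Any exit value other than 0 and the DEFER status would count as an ordinary failure of the PRE script.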
Useful arguments passed to the PRE and POST scripts
- $JOB: The job name.
- $RETRY: The current retry count of the node. It starts at 0 and increases with each retry.
- $MAX_RETRIES: The maximum number of retries allowed for the node.
- $FAILED_COUNT: The number of failed nodes in the DAG.
- $DAG_STATUS: It is an integer that represents the status of the DAG.
Only for the Post script
- $JOBID: The job ID, in the form ClusterId.ProcId. If the node has more than one job, the ProcId value is that of the last job within the cluster.
- $RETURN: The value is 0 if all the jobs in a cluster have finished successfully. If a job fails, the value is that job's return value, and all remaining jobs of the same cluster are removed from the queue.
- $PRE_SCRIPT_RETURN: The POST script can use this variable to check whether the PRE script has failed and to assign success or failure to the node. If there is no PRE script, the value is -1.
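The macros above reach the script as ordinary positional arguments. The sketch below shows one way a POST script might use them, assuming a DAG line such as SCRIPT POST C post_check.sh $JOB $RETURN $RETRY $MAX_RETRIES. The script name post_check.sh and the argument order are hypothetical.

```shell
#!/bin/bash
# post_check.sh: hypothetical POST script, assumed to be declared as
#   SCRIPT POST C post_check.sh $JOB $RETURN $RETRY $MAX_RETRIES
# The macros are expanded before the script starts, so they arrive here
# as the positional arguments $1..$4.

post_check() {
    local job="$1" job_return="$2" retry="$3" max_retries="$4"
    if [ "$job_return" -eq 0 ]; then
        echo "node $job: job succeeded"
        return 0    # POST script succeeds
    fi
    echo "node $job: job failed with $job_return (retry $retry of $max_retries)"
    return 1        # POST script fails
}

# Run only when all four arguments are supplied; the exit status of the
# script is then the status returned by post_check.
if [ "$#" -eq 4 ]; then
    post_check "$@"
fi
```

Returning a fixed 1 instead of the job's own return value keeps the exit status inside the range a shell can report.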
In this exercise a PRE script is defined in the DAG file. The welcome.sh script is executed before Job C runs.
The script welcome.sh contains the following:
#!/bin/bash
echo "welcome to DAGs" > prescript.txt
JOB A A.sub
JOB B B.sub
JOB C C.sub
JOB D D.sub
PARENT A CHILD B C
PARENT C CHILD D
PARENT B CHILD D
SCRIPT PRE C welcome.sh
Note: If the Pre script of Node C fails, the job of the node does not run and the Post script, if one exists, is not executed.
To execute the Post script in all cases (success or failure of the Node), pass the -AlwaysRunPost argument to the condor_submit_dag command. The default value of DAGMAN_ALWAYS_RUN_POST is false.
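A submission with this argument could look like the sketch below. The wrapper function submit_dag_always_post and the file name diamond.dag are hypothetical. The function only prints the command when condor_submit_dag is not installed, so it can be tried on any machine.

```shell
#!/bin/bash
# Hypothetical wrapper around condor_submit_dag with -AlwaysRunPost.

submit_dag_always_post() {
    local dag_file="$1"
    if command -v condor_submit_dag >/dev/null 2>&1; then
        # Submit the DAG so that POST scripts run even if a PRE script fails.
        condor_submit_dag -AlwaysRunPost "$dag_file"
    else
        echo "would run: condor_submit_dag -AlwaysRunPost $dag_file"
    fi
}

# Submit only when a DAG file name is passed, for example:
#   ./submit.sh diamond.dag
if [ "$#" -gt 0 ]; then
    submit_dag_always_post "$1"
fi
```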
The following tables show under which circumstances the Pre script, Job and Node succeed (S) or fail (F).
- In the case there is no Post script and DAGMAN_ALWAYS_RUN_POST=false:
- In the case there is a Post script and DAGMAN_ALWAYS_RUN_POST=false:
|Pre script|Job|Post script|Node|