File Transfer to xrootd URL

EOS is widely used for storage at CERN, whereas the HTCondor service uses AFS as the shared filesystem between your submission directory and the schedd. While we do not allow EOS FUSE to serve as this filesystem, we do provide an HTCondor file transfer plugin that can integrate EOS into your workflow for local batch submission, using xrdcp to copy files to and from EOS.

Note

The transfer is authenticated with Kerberos, so this should be viewed as a way to use EOS at CERN, not as support for arbitrary xrootd URIs.

Usage via output_destination

In your condor submit file, you can specify an output_destination. A correctly formatted xrootd URI tells HTCondor to write your output files to this destination. Here is a basic example, where I want to write files to /eos/user/b/bejones/condor/xfer:

executable     = script.sh
log            = xfer.$(ClusterId).log
error          = xfer.$(ClusterId).$(ProcId).err
output         = xfer.$(ClusterId).$(ProcId).out
output_destination = root://eosuser.cern.ch//eos/user/b/bejones/condor/xfer/
queue

Using this example, all files written to the job's working directory (the output sandbox) will be transferred to that directory in EOS.
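
To make this concrete, here is a minimal sketch of what script.sh could look like; the script body and the result.txt filename are hypothetical. Anything the job writes to its scratch directory ends up in EOS under the output_destination:

#!/bin/bash
# Hypothetical payload: any file created here, in the job's
# scratch directory, is picked up and copied to EOS by the plugin.
echo "job on $(hostname) finished at $(date)" > result.txt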

However, we can also ask for destination directories to be created, which can help to avoid having too many files in one directory. We do this by adding an additional special attribute, MY.XRDCP_CREATE_DIR = True. Full example:

executable     = script.sh
log            = xfer.$(ClusterId).log
error          = xfer.$(ClusterId).$(ProcId).err
output         = xfer.$(ClusterId).$(ProcId).out
output_destination = root://eosuser.cern.ch//eos/user/b/bejones/condor/xfer/$(ClusterId)/
MY.XRDCP_CREATE_DIR = True
queue

Note the addition of a subdirectory named after the ClusterId in the output_destination.
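
After such a job completes, you can check that the per-cluster subdirectory was created with the standard xrootd client tools, for example (the cluster id 1234 below is a placeholder):

xrdfs eosuser.cern.ch ls /eos/user/b/bejones/condor/xfer/1234/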

We can also ask the plugin to import input files, though these need to be given as full xrootd URLs, for example:

executable     = script.sh
log            = xfer.$(ClusterId).log
error          = xfer.$(ClusterId).$(ProcId).err
output         = xfer.$(ClusterId).$(ProcId).out
output_destination = root://eosuser.cern.ch//eos/user/b/bejones/condor/xfer/$(ClusterId)/
transfer_input_files = root://eosuser.cern.ch//eos/user/b/bejones/condor/file.txt
MY.XRDCP_CREATE_DIR = True
queue
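
Transferred input files land in the job's scratch directory under their basename, so in this example the job sees the file as plain file.txt. A hypothetical script.sh could therefore simply do:

#!/bin/bash
# file.txt was fetched from EOS by the transfer plugin
# into the scratch directory before the job started.
wc -l file.txt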

Limitations

The user log (the log specified in the above examples) is not transferred, and so in the normal case it would be a file on AFS. There is, though, a way to avoid using AFS at all, by using condor_submit -spool.

Use with condor_submit -spool

condor_submit -spool uses the schedd to hold all the files, avoiding the use of shared filesystems altogether. Combined with output_destination it can be very convenient. The only limitation is the user log, which will be written on the schedd. If, after a job terminates or there is an error, you wish to consult the log, you need to transfer it back using the following command:

condor_transfer_data <job id>

This transfers the user log back to the initial working directory of the submitted job.
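
Putting this together, a spooled submission could look like the following sketch, where sub.sub is a hypothetical submit file using the output_destination example above, and 1234.0 stands in for the real job id:

condor_submit -spool sub.sub
condor_q 1234.0                 # wait until the job has completed
condor_transfer_data 1234.0     # fetch the spooled user log back
cat xfer.1234.log               # the log is now in the submit directory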

