File Transfer to xrootd URL

EOS is widely used at CERN, whereas the HTCondor service uses AFS as the shared filesystem between your submissions and the schedd. While we do not allow EOS FUSE to serve as this filesystem, we do provide a file transfer plugin that can integrate EOS into your workflow for local batch submission. The plugin uses xrdcp to copy files to and from EOS.

Note

This is authenticated with Kerberos and so should be viewed as a way to use EOS at CERN, not arbitrary global xrootd URIs.

The xrootd file transfer plugin may be explicitly configured in the submit file, or automatically imposed by use of the new (experimental) EosSubmit schedds.

Usage in the submit file

Output files

To use the xrootd transfer plugin for the output produced by the job, you can specify the output_destination attribute in the submit file. A correctly formatted xrootd URL tells HTCondor to write your output files to that destination. Here is a basic example, where I want to write files to /eos/user/b/bejones/condor/xfer:

executable     = script.sh
log            = xfer.$(ClusterId).log
error          = xfer.$(ClusterId).$(ProcId).err
output         = xfer.$(ClusterId).$(ProcId).out
output_destination = root://eosuser.cern.ch//eos/user/b/bejones/condor/xfer/
queue

With this example, all files written to the job's working directory (i.e. the output sandbox) will be copied to that directory in EOS.
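
As an illustration, script.sh could be as simple as the following sketch (hypothetical; any executable that writes files into its working directory behaves the same way):

#!/bin/bash
# Hypothetical payload: everything this script writes to the working
# directory (the output sandbox) is copied by the plugin to the
# output_destination URL when the job finishes.
echo "some result" > result.txt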

However, we can also ask for any directories to be created, which can help to avoid having too many files in one directory. We do this by adding the special attribute MY.XRDCP_CREATE_DIR = True. Full example:

executable     = script.sh
log            = xfer.$(ClusterId).log
error          = xfer.$(ClusterId).$(ProcId).err
output         = xfer.$(ClusterId).$(ProcId).out
output_destination = root://eosuser.cern.ch//eos/user/b/bejones/condor/xfer/$(ClusterId)/
MY.XRDCP_CREATE_DIR = True
queue

Note the addition of a subdir with the ClusterId in the output_destination.

We can also include the transfer_output_files attribute to explicitly indicate which output files to transfer (instead of all the files written in the job's working directory). E.g.:

transfer_output_files = fout1, fout2
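
For instance, a full submit file combining this with the earlier example might look as follows (a sketch; fout1 and fout2 are placeholders for files your job actually writes):

executable     = script.sh
log            = xfer.$(ClusterId).log
error          = xfer.$(ClusterId).$(ProcId).err
output         = xfer.$(ClusterId).$(ProcId).out
output_destination = root://eosuser.cern.ch//eos/user/b/bejones/condor/xfer/$(ClusterId)/
transfer_output_files = fout1, fout2
MY.XRDCP_CREATE_DIR = True
queue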

Input files

We can ask the plugin to import input files by setting the transfer_input_files attribute, though each file must be given as a full xrootd URL, for example:

executable     = script.sh
log            = xfer.$(ClusterId).log
error          = xfer.$(ClusterId).$(ProcId).err
output         = xfer.$(ClusterId).$(ProcId).out
output_destination = root://eosuser.cern.ch//eos/user/b/bejones/condor/xfer/$(ClusterId)/
transfer_input_files = root://eosuser.cern.ch//eos/user/b/bejones/condor/file.txt, root://eosuser.cern.ch//eos/user/b/bejones/condor/sub/file2.txt
MY.XRDCP_CREATE_DIR = True
queue

Notice that you could also use a root:// URL for the input attribute (the stdin file) if desired, but not for the executable or initialdir attributes.
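
For example, a submit file reading the job's stdin from EOS could look like this (a sketch; stdin.txt is an illustrative path):

executable     = script.sh
input          = root://eosuser.cern.ch//eos/user/b/bejones/condor/stdin.txt
log            = xfer.$(ClusterId).log
error          = xfer.$(ClusterId).$(ProcId).err
output         = xfer.$(ClusterId).$(ProcId).out
output_destination = root://eosuser.cern.ch//eos/user/b/bejones/condor/xfer/
queue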

User log file and spool submission

Even when output_destination is set, the user log file (the log attribute of the submit file) is not transferred with the plugin, since it is handled directly by the schedd. In the normal case the log file would thus be a file on AFS. There is, however, a way to avoid using AFS at all: condor_submit -spool.

As described in the dataflow page, condor_submit -spool causes input/output/log files to be staged to the schedd beforehand, avoiding the use of shared filesystems. However, if output_destination is set and xrootd URLs are used in transfer_input_files, input/output files are transferred by the plugin directly between EOS and the execution nodes, skipping the schedd. This means that the only files written on the schedd will be the executable and the user log file. This is not perfect, but it at least avoids the use of shared filesystems (AFS) altogether.

In this case, after a job terminates, if you wish to consult the log, it is necessary to transfer the log back using the following command:

condor_transfer_data <job-id>

This will transfer back the log file to the submit machine (in the initial working directory of the submitted job).
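
Putting it together, a spool-based workflow could look like this (a sketch; job.sub is a hypothetical submit file that sets output_destination, and 1234.0 stands for the job ID printed at submission time):

condor_submit -spool job.sub    # stage the executable to the schedd; no shared filesystem needed
condor_q 1234.0                 # watch the job until it completes
condor_transfer_data 1234.0     # fetch the user log back to the submit directory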

Automatic usage by EosSubmit schedds

A new (experimental) feature allows users to submit jobs from EOS using the xrootd transfer plugin without explicitly setting output_destination or using xrootd URLs for input files. This can be very convenient if submit files are generated programmatically or are in general difficult to modify.

This method has some advantages and disadvantages compared to explicit plugin usage in the submit file or, e.g., the use of -spool.

Please refer to EosSubmit schedds for information on how to use this feature.

