Using HTCondor

Disclaimer

In my opinion, the official documentation for HTCondor is very good, in particular the section on submitting a job. Despite that, it doesn't have any complete examples that will help the average person get started immediately. So here are some examples that I think are complete enough to adapt to one's own coding needs.

Submitting Jobs to the Queue

To run jobs on the tier3's worker nodes, a user needs an executable or wrapper script, as well as a submit script. The executable or wrapper script is what the user wants the worker nodes to run, while the submit script is what HTCondor uses to create the job.

Executable/Wrapper Script

This can be any form of executable a user wants to run on a worker node, ranging from a simple unix command to a compiled executable to a complex shell script (also known as a wrapper script). This executable or wrapper script should include everything the user wants to run on the worker node, including copying/moving files from one place to another. Note: It is best practice to copy any code that will be run, as well as any data that will be run over, to the worker node. This avoids having the worker nodes constantly hitting the home directories or work disks for i/o, which could cause the login nodes or work disks to become slow.

Wrapper Script Example:

#!/bin/bash
 
# Copy code and input files to worker node disk, then cd to that space.
cp -r /pathToCode/code $TMPDIR/
cp /pathToInputs/inputFile.root $TMPDIR/
cd $TMPDIR/
 
# Compile code
cd code/
make
cd ..
 
# Run code on input file and get output file
./code/framework inputFile.root outputFile.root
 
# Copy output file to workdisk
mv outputFile.root /pathToWorkDisk/
 
# Cleanup
rm -fr ./*

Submit Script

The submit script is what HTCondor uses to create a job. An example is given below:

# Specify condor environment job should use
universe = vanilla
# Specify executable or wrapper script job will run
executable = wrapper.sh
 
# Specify log, output, and error files location and names
# Any of the below could also be given a path to where the file should be created
log = job.log
output = job.out
error = job.error
 
# Request resources
request_cpus = 1
request_disk = 20MB
request_memory = 20MB
 
# Queue some number of jobs
queue 1

The example submit script above will be used to create one job that runs "wrapper.sh" on a worker node. The different variables are defined below:
  • universe: the condor environment the job will be run with. This should usually be vanilla, so no further description will be given.
  • executable: the executable or wrapper script to be run on the worker node.
  • log: the location and name of the condor log file. This records information on how long the job has been running, as well as information on resource usage such as memory and disk space.
  • output: the location and name of the file that records the standard output of the job.
  • error: the location and name of the file that records the standard error of the job.
  • request_cpus: used to request job slots with a certain number of cpus. Less useful in the current tier3 setup (each job slot is one core).
  • request_disk: used to request job slots with a certain amount of disk space. Again, less useful in the current tier3 setup. This should be set to how much disk space a job will use (so approximately code size + input file size + output file size).
  • request_memory: used to request job slots with a certain amount of memory. Again, less useful in the current tier3 setup. This should be set to the maximum memory a job will use. Finding this value usually requires submitting test jobs and looking at the log file.
  • queue: how many of this particular job should be created. Can be given more complex options for submitting multiple jobs, as described below.
To submit a job to condor, use the command "condor_submit <submitScriptName>". It can be useful to give submit scripts the ".sub" file extension to differentiate them from other files.
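
For example, if the submit script above were saved as "job.sub" (a hypothetical name), it would be submitted with:

condor_submit job.sub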

HTCondor Script Variables

A variable inside a submit script can be set and referenced like so.
X = 3
Y = $(X)
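
A variable can be used, for example, to avoid repeating a path throughout the submit script. The sketch below is only illustrative; the directory and file names are placeholders:

# Define a base directory once and reuse it below
basedir = /msu/data/t3work3/myAnalysis

executable = $(basedir)/wrapper.sh
log = $(basedir)/logs/job.log
output = $(basedir)/logs/job.out
error = $(basedir)/logs/job.error
queue 1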

Passing Arguments to the Executable

Some executables can take, or even require, arguments. Condor can pass these arguments to the executable with the "arguments" variable.

For example, given the wrapper script below:

#!/bin/bash
# For those unfamiliar with bash scripting: to reference arguments passed via the command line, use the built-in variables $1, $2, etc. to reference the 1st argument, 2nd argument, etc.

# Copy code, specific input file, and specific configuration file for code to worker node disk. Then cd to worker node disk space.
cp -r /pathToCode/code $TMPDIR/
cp /pathToInputs/inputFile-$1.root $TMPDIR/
cp /pathToConfigs/config-$2.cfg $TMPDIR/
cd $TMPDIR/
 
# Compile code
cd code/
make
cd ..
 
# Run code on input file, using configuration file, to make output file with specific name
./code/framework config-$2.cfg inputFile-$1.root outputFile-$3.root
 
# Copy output to work disk
cp outputFile-$3.root /pathToWorkDisk/
 
# Cleanup
rm -fr ./*

You could pass the necessary arguments via the submit script like so:

# Specify condor environment job should use
universe = vanilla
# Specify executable or wrapper script job will run
executable = wrapper.sh
# Pass Arguments to executable
arguments = ttbar-semileptonic lvbb 1Lep-ttbar
 
# Specify log, output, and error files location and names
# Any of the below could also be given a path to where the file should be created
log = job.log
output = job.out
error = job.error
 
# Request resources
request_cpus = 1
request_disk = 20MB
request_memory = 20MB
 
# Queue some number of jobs
queue 1

With the arguments above, the job will copy inputFile-ttbar-semileptonic.root and config-lvbb.cfg from the specified paths to the worker node disk, then copy the resulting output file, named outputFile-1Lep-ttbar.root, to the specified path.

Submitting Multiple Jobs

If one wants to submit multiple jobs, there are several ways to modify the submit script to do so.

Queueing Identical Jobs

If one wants to submit multiple identical jobs, they can simply give an integer N to the queue statement at the end of the submit script. Ex: The following will submit 100 identical jobs if used in a submit script.

queue 100
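
Note that identical jobs will all write to the same log, output, and error files unless those names are made unique. One way to handle this (a minimal sketch) is the built-in $(Process) macro, which expands to 0, 1, ..., N-1 for the N jobs in the cluster:

# Each of the 100 jobs gets its own output, error, and log file
output = job-$(Process).out
error = job-$(Process).err
log = job-$(Process).log
queue 100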

The "in" Keyword

To create a job for each item in a list, the "in" keyword can be used.

# The variable "var" is just an example, the name could be different.
# To reference the variable in the rest of the submit script, use $(var).
# The elements of this list are also just examples.
queue var in (
item1
item2
item3
item4
)
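
The variable can then be referenced anywhere else in the submit script, most commonly to pass each list item to the executable. A minimal sketch, assuming the wrapper script takes the item as its first argument:

# One job per list item, each receiving its item as an argument
arguments = $(var)
output = job-$(var).out
queue var in (
item1
item2
)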

The "matching" Keyword

Jobs can be created for each item matching an expression using the "matching" keyword. Ex: A submit script containing the queue statement below would create a job for each .root file in the directory from which "condor_submit" was called.

queue var matching *.root

You could also give a path to another directory.
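
For example, to create one job per .root file in some other directory (the path below is just a placeholder), something like the following should work:

queue var matching /msu/data/t3work3/pathToInputs/*.root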

The "from" Keyword

If more than one variable is required, the "from" keyword can be used. This takes a text file in which each line is a comma-separated list of values. Ex: Given the text file "List.txt" below:

500, 0.25, 0.25
500, 0.25, 0.5
500, 0.25, 0.75
500, 0.5, 0.25
500, 0.5, 0.5
500, 0.5, 0.75
500, 0.75, 0.25
500, 0.75, 0.5
500, 0.75, 0.75
1000, 0.25, 0.25
...
1500, 0.25,0.25
...

Here the first column represents the mass of a particle, and the second and third columns represent some couplings. One could create a job for each line by modifying the "queue" portion of the submit script like so:

queue mass,g1,g2 from List.txt
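
Each line of List.txt then defines one job, and the three variables can be referenced elsewhere in the submit script. A sketch, assuming the wrapper script takes the mass and the two couplings as command line arguments:

arguments = $(mass) $(g1) $(g2)
output = job-$(mass)-$(g1)-$(g2).out
queue mass,g1,g2 from List.txt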

Concurrency Limits

Some of the resources on the tier3 can only support so many jobs before they begin to get bogged down, namely the work disks (t3work1-9). Luckily, condor has a built-in way of limiting the number of jobs that run on a particular resource; it only requires the user to declare which resource they are using and how much of that resource a single job consumes. This is done by setting concurrency limits in your submit script: add the following line before the "queue" command.

Concurrency_Limits = <resource1 name>:<units needed by job>

Things to note about concurrency limits:
  • If multiple users declare they are using the same resource, the total number of jobs that run in parallel are determined aggregately, not per user.
  • To declare the use of multiple resources, use a comma separated list.
  • Using the wrong resource name does not cause an error.
  • Using a resource name does not actually claim anything physically; you could, for instance, limit the number of jobs you are running that use t3work9 by using the concurrency limits meant for t3work5.
  • If you do not use concurrency limits, then your jobs will begin running on any available job slots and might overwhelm the resources they use, thereby slowing down that resource for everyone.

Work Disks

Nearly all of the work disks are set up to use concurrency limits. Each disk has an associated amount of i/o that is used to limit the number of jobs running on it. This value is somewhat arbitrary; it does not actually represent real units of the disk's i/o. The names of these resources and the units of i/o they are given are listed here (may change soon):
  • DISK_T3WORK1: 10000
  • DISK_T3WORK2: 10000
  • DISK_T3WORK3: 10000
  • DISK_T3WORK4: 10000
  • DISK_T3WORK5: 10000
  • DISK_T3WORK6: 10000
  • DISK_T3WORK7: 10000
  • DISK_T3WORK8: 10000
  • DISK_T3WORK9: 10000
  • DISK_CYNISCA: 10000
  • DISK_HOME: 10000

A fairly simple calculation can be done to figure out how many units of i/o a job takes up if you already know how many jobs you want running at one time (10000/# of jobs to run at once). For instance, if I want to use t3work3 and only have 200 jobs using it at once, then I would insert the following in my submit script:

# 10000/200 = 50, so I would put...
Concurrency_Limits = DISK_T3WORK3:50

User Limits

Most users do not use the work disk concurrency limits as intended (to limit the number of jobs running that use a particular disk). Instead, they simply need a way to limit the total number of their own jobs that are running (without the concurrency limits of different users interfering with each other). If this is the only concurrency limit you'd like to set, add the following line to your submit script:

Concurrency_Limits = <username>:<1000/# of jobs>
# Example: this will limit the number of jobs I (forrestp) can run at once to 100 (1000/100 = 10).
Concurrency_Limits = forrestp:10

Note that while the disks have a resource limit of 10,000 units, the users have a resource limit of 1,000 units.

User Defined Resources

It's also possible for users to define their own resources, each with a resource limit of 1,000 units. There are at least two ways these could be useful: to set up group limits (VHres, single-top, etc.), or if you are running multiple sets of jobs at once for different projects.

If you'd like to set up group limits, i.e., everyone in your group limits their jobs collectively, simply pick a group name that you can all agree on, say "VHres". Next, pick how many jobs you'd all like to be able to run at once, maybe 250. Then insert the following line into all of the relevant submit scripts.
# 1000/250 = 4
Concurrency_Limits = VHres:4

If you are running multiple sets of jobs at once for different projects (your qualification task and main analysis for instance), you may want to limit the number of each that can run instead of using user limits. Simply pick a name for each ("qualtask" and "analysis"), tack your username to the front or back ("forrestp_qualtask" and "forrestp_analysis", note that this is simply to avoid two people giving the same name to their user-defined resources), pick how many of each you want to run at once (25 and 75), and insert the following lines into your submit scripts.
# In your qual task submit script
Concurrency_Limits = forrestp_qualtask:40    # 1000/25=40

# In your analysis submit script
Concurrency_Limits = forrestp_analysis:13    # 1000/75 = 13.3, but the value given needs to be an int.

Multiple Concurrency Limits

If you would like to submit jobs with limits on both the username and disk, or some user-defined resource, give the "Concurrency_Limits" variable in your submit script a comma separated list.

# Example, I (forrestp) want to limit the number of jobs I can run to 100, but also want to limit the number of jobs running on t3work4 to 50.
Concurrency_Limits = forrestp:10, DISK_T3WORK4:200

An example of where this would be useful is if you've split the dataset you run over across multiple disks. Say you've split it across t3work3, t3work4, and t3work5. If you'd like to limit the total number of jobs you can run at once to 100, but limit the number of jobs that use a particular disk to 50, there are two approaches you can take: use three different submit scripts (one for each disk), or use one submit script that changes the relevant variables (a bit more complicated).

Example 1 (three submit scripts)

In the submit script for jobs that use the dataset on t3work3, insert:
Concurrency_Limits = forrestp:10, DISK_T3WORK3:200

In the submit script for jobs that use the dataset on t3work4, insert:
Concurrency_Limits = forrestp:10, DISK_T3WORK4:200

In the submit script for jobs that use the dataset on t3work5, insert:
Concurrency_Limits = forrestp:10, DISK_T3WORK5:200

Example 2 (one submit script that changes the appropriate variables)

Warning: this is a simple example, and more or different changes may be needed for your submit script depending on its setup. Please use this only as a reference for how you might change it.
universe = vanilla
# This wrapper expects the location of the dataset as an argument
executable = wrapper.sh

log = job.log
output = job-$(dataset).out
error = job-$(dataset).err

request_cpus = 1
request_disk = 1GB
request_memory = 1GB

dataset = t3work3
arguments = /msu/data/t3work3/restOfPathToDataset
Concurrency_Limits = forrestp:10, DISK_T3WORK3:200
queue

dataset = t3work4
arguments = /msu/data/t3work4/restOfPathToDataset
Concurrency_Limits = forrestp:10, DISK_T3WORK4:200
queue

dataset = t3work5
arguments = /msu/data/t3work5/restOfPathToDataset
Concurrency_Limits = forrestp:10, DISK_T3WORK5:200
queue

Short, Medium, and Long Queues

MSU's tier3 is split up into three queues based on how long a job will take.
  • Short Queue: For jobs that will take about 3 hours or less to run. All job slots can run short jobs.
  • Medium Queue: For jobs that will take 3-48 hours to run. Almost all the job slots can run medium jobs.
  • Long Queue: For jobs that will take 2-7 days to run. A small fraction of the job slots can run long jobs.

By default a job is submitted to the short queue, but if a user knows their job will take more than three hours, they can submit it to the medium or long queue as follows. To submit a job to the medium queue, add the following line to your condor submit script before the queue command.

+IsMediumJob = true

To submit a job to the long queue, add the following line to your condor submit script before the queue command.

+IsLongJob = true
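
For example, the line goes anywhere before the queue statement of your submit script (a minimal sketch reusing the example script from earlier):

# ... universe, executable, log/output/error, resource requests as before ...
+IsMediumJob = true
queue 1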

Changing a Job's Queue

If a user submits jobs to one of these queues and some or all of the jobs exceed the allotted time, they will be put on hold. Users can change which queue their jobs are in with two steps: first using the condor_qedit command, and then condor_release.

To move a job from the short queue to the medium queue, use the following command.

condor_qedit <JobIdentifier> IsMediumJob true

Where <JobIdentifier> could be the process ID, the cluster ID, or the user's username. To move a job from the short queue to long queue, simply replace "IsMediumJob" with "IsLongJob" in the example above.

To move a job from the medium queue to the long queue, use the following command.

condor_qedit <JobIdentifier> IsMediumJob false IsLongJob true

Once you have changed the queue the jobs are in, release them with:

condor_release <JobIdentifier>

If a job exceeds its time limit and is put on hold, it will be restarted when released. If a user sees their job is approaching the time limit and would like to avoid restarting the job, they can switch queues using the condor_qedit command before the job is held.

The Bypass Queue

The bypass queue bypasses the time limit and node restrictions of the short, medium, and long queues. Using it is discouraged unless absolutely necessary.

In order to use the bypass queue, you must meet the following conditions:
  • You're crunched on time.
  • You need unlimited resources (no time limits and as many nodes as possible).
  • All T3 users agreed you can use it.

To use the bypass queue, insert the following line in your condor submit script.

+IsBypassJob = True

If you need to move already submitted jobs to the bypass queue, see the section "Changing a Job's Queue".

Misc.

Staggering jobs for the sake of the work disks

If all of a user's jobs start at the same time, they will also start copying files from the work disks at the same time. This can overwhelm a work disk and slow its reading and writing for everyone. To avoid this, it's best practice to stagger your jobs' start times. This amounts to adding a sleep statement to your wrapper script that scales with the process ID of the job.

An easy way to do this is to pass the process ID as an argument to the wrapper script inside of your submit script:
universe = vanilla
executable = wrapper.sh
arguments = $(process)

# rest of submit script

and then calling the sleep command at the very beginning of the wrapper script:
#!/bin/bash
sleep $((5*$1))

#rest of wrapper script


Checking the Status of Jobs & Job Slots

After you've submitted a job (or after it's finished running), you will probably want to check up on it from time to time or view its ClassAds. There are two commands for doing this: one for jobs currently in the queue (condor_q), and one for jobs that are no longer in the queue (condor_history).

In addition, you may want to check how many job slots there are on the tier3, check how many of those are available, or view a slot's ClassAds.

Jobs Currently in the Queue

To view the job queue on the login/submit node you are currently on, use the command:
condor_q

If you'd like to view only your jobs, use the command:
condor_q <username>

To view the job queues of every login/submit node at once, use the command:
condor_q -global

To view the ClassAds of a particular job, use the command:
condor_q -l <JobID>

To view the value of a particular ClassAd attribute, use the command:
condor_q -af <ClassAd>
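
For example, to print the status and memory usage of a single job (the job ID below is a placeholder; JobStatus and MemoryUsage are standard job ClassAd attributes):

condor_q 123456.0 -af JobStatus MemoryUsage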

Jobs No Longer in the Queue

If you'd like to view information about jobs that used to be on the queue, use the command:
condor_history

All of the options available in the condor_q examples work for this command as well.

Job Slots

To view the status of the job slots, use the command:
condor_status

To view the ClassAds of a particular job slot, use the command:
condor_status -l <JobSlot>


Editing Jobs in the Queue

If you'd like to edit the ClassAds of a job that has already been submitted, you can do so by holding it, editing it, and then releasing it. To hold a job, use the following command:
condor_hold <JobID, ClusterID, or username>

To edit a job, use the following command:
condor_qedit <JobID, ClusterID, or username> <ClassAd> <new value>

Once a job is edited, release it using the command:
condor_release <JobID, ClusterID, or username>
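
For example, the full hold/edit/release sequence to raise the requested memory of cluster 123456 (the cluster ID and value are placeholders; RequestMemory is a standard job attribute, given in MB) would look like:

condor_hold 123456
condor_qedit 123456 RequestMemory 2048
condor_release 123456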


Removing Jobs from the Queue

To remove a job (or set of jobs) from the queue, use the command:
condor_rm <JobID, ClusterID, or username>


DAGMan

If you have a job workflow that involves submitting some number of jobs and then running further jobs that use the previous jobs' results as inputs, then you may want to consider using DAGMan. DAGMan is a tool built into HTCondor that was made to handle complex job workflows. Its documentation can be found in the official HTCondor manual, and there is also a presentation on how to use it in the attachments of this page.
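
As a minimal sketch (the file names are placeholders), a two-step workflow where job B uses the output of job A can be described in a DAG file:

# workflow.dag
# Submit stepA.sub first; submit stepB.sub only once A has finished successfully
JOB A stepA.sub
JOB B stepB.sub
PARENT A CHILD B

The workflow is then submitted with "condor_submit_dag workflow.dag", and DAGMan submits and monitors the individual jobs in the correct order.
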
Topic attachments
  • FHP_MSU_Weekly_HTCondor.pdf: Talk on using HTCondor to submit jobs. (ForrestPhillips, 19 Sep 2017)
  • DagMan_FHP_MSUATLAS_09-28-2017.pdf: Talk on how to use DAGMan to manage complex job submissions. (ForrestPhillips, 06 Nov 2017)