Using Condor at FSU HEP


Condor is a sophisticated batch queuing system developed by the computer science
department at the University of Wisconsin.  Extensive tutorials and documentation are
available from http://www.cs.wisc.edu/condor/

To run a condor job, you must first write a job submission file.  The submission file
describes your executable and how the job should be run.

For example, I have written an example program called
simple.c.  To create an executable, I compile and link the program:

$ condor_compile gcc -o simple simple.c

Note: condor_compile is not necessary, but allows a job to be preempted.
In order for condor to run your job, you must give condor some information about it.
A job submission file looks like:

$ cat submit
Universe   = vanilla
Executable = simple
Arguments  = 4 10
Log        = simple.log
Output     = simple.out
Error      = simple.error
Queue

The first line tells condor which universe to run the job in.  For our cluster, this is vanilla.  The next two lines
in the file give the name of the executable (simple) and the arguments passed to it (4 10).  This makes the program
execute as:

$ simple 4 10

The Log, Output and Error entries name the files used to store run-time information.  The Log file
stores condor-specific information, such as where the program ran.  The Output and Error files store
the program's standard output and standard error, respectively.  The final line, Queue, tells condor to
queue the job for execution.
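
When the same program has to be run with different arguments, it can be convenient to generate
the submission file from a script.  The following is only a sketch of that idea: the function name
write_submit is hypothetical (it is not part of condor), and it assumes the Log, Output and Error
files should be named after the executable, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not a condor command): write a submission file.
# Usage: write_submit <executable> <submit-file> <arg>...
write_submit() {
    exe=$1
    out=$2
    shift 2            # the remaining parameters are the program arguments
    cat > "$out" <<EOF
Universe   = vanilla
Executable = $exe
Arguments  = $*
Log        = $exe.log
Output     = $exe.out
Error      = $exe.error
Queue
EOF
}

# Recreate the submission file shown above and print it.
write_submit simple submit 4 10
cat submit
```

Calling write_submit with other arguments (for example, write_submit simple submit2 8 20) gives a
second file describing a second run.  Note that the log and output file names would then be the
same for both runs, so they would need to be changed if the runs must not overwrite each other.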

To submit a job to condor:

$ condor_submit submit

Condor will store the output and log files in the directory from which the job was submitted.
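
After submitting, one way to follow a job is to list the queue with condor_q and to look at which
of the files named in the submission file have appeared.  The sketch below assumes the file names
from the example above (simple.log, simple.out, simple.error) and is run from the submission
directory; the function name report_job is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: show the condor queue and the state of the job's files.
report_job() {
    # condor_q lists queued and running jobs; skip it where condor is not installed.
    if command -v condor_q >/dev/null 2>&1; then
        condor_q || echo "condor_q failed"
    else
        echo "condor_q not found; skipping queue listing"
    fi
    # File names are assumed to match the Log/Output/Error entries of the submission file.
    for f in simple.log simple.out simple.error; do
        if [ -s "$f" ]; then
            echo "$f: present"
        elif [ -e "$f" ]; then
            echo "$f: empty"
        else
            echo "$f: not yet written"
        fi
    done
}

report_job
```

An empty simple.error together with a non-empty simple.out usually means the program wrote
nothing to standard error.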

Examples of these scripts can be found in ~jmcdon/condor/.

Currently condor is not available on all machines, only on those running WBL or Fedora Core 2 or better.
If there is significant use of condor, other machines, including desktops, can be added.  Currently, one can
submit jobs on:

lnxc0a
lnxc0b
lnxc0c
lnxc0d
lnxc13
lnxc14
lnxc17
lnxc18
lnxc19
lnxc20
lnxc22
lnxc23

This makes a total of 17 slots to which jobs can be submitted.

Currently, there is no priority scheme implemented, but condor has the capacity to provide one.   Improvements
and expansion of condor will be determined by its usage.