Using Condor at FSU HEP
Condor is a sophisticated batch queuing system written by the
computer science department at the University of Wisconsin.
Extensive tutorials and documentation are available from
http://www.cs.wisc.edu/condor/
To run a condor job, you must first write a job submission file.
The submission file describes your executable and how the job
should be run.
For example, I have an example program called simple.c. To create
an executable, I compile and link the program:
$ condor_compile gcc -o simple simple.c
Note: condor_compile is not necessary, but it allows a job to be
preempted.
In order for condor to run your job, you must give it some
information about the job. A job submission file looks like:
$ cat submit
Universe = vanilla
Executable = simple
Arguments = 4 10
Log = simple.log
Output = simple.out
Error = simple.error
Queue
The first line tells condor which universe to execute the job in.
For our cluster, this is vanilla. The next two lines give the name
of the executable (simple) and the arguments to simple (4 10).
This makes the program execute as:
$ simple 4 10
The Log, Output and Error tags name the files used to store
run-time information. The Log file stores condor-specific
information, such as where the program ran. The Output and Error
files store the program's standard output and standard error,
respectively. The final line, Queue, tells condor to queue the job
so that it can start running.
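The submission file described above can also be written from a small shell script, which is convenient when the arguments change between runs. This is only a sketch: EXE and ARGS are illustrative shell variable names, not condor settings, and the values are the ones from the example on this page.

```shell
#!/bin/sh
# Write the example submission file. EXE and ARGS are illustrative
# shell variables (an assumption of this sketch), not condor keywords.
EXE=simple
ARGS="4 10"

cat > submit <<EOF
Universe = vanilla
Executable = $EXE
Arguments = $ARGS
Log = $EXE.log
Output = $EXE.out
Error = $EXE.error
Queue
EOF

# Print the file so it can be checked before it is submitted.
cat submit
```

Changing ARGS and re-running the script produces a new submit file without editing it by hand.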
To submit a job to condor:
$ condor_submit submit
Condor will store the output and log files in the directory from which
the job was submitted.
Examples of these scripts can be found in ~jmcdon/condor/.
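After a job has been submitted, it can be followed with condor_q, one of the standard condor command-line tools, and by looking at the files named in the submission file. The sketch below assumes it is run from the submission directory and that the file names match the example (simple.log, simple.out, simple.error); show_job_files is a helper name made up for this example.

```shell
#!/bin/sh
# Print whichever of the example job's files exist so far.
# show_job_files is a hypothetical helper, not a condor command.
show_job_files() {
    for f in simple.log simple.out simple.error; do
        if [ -f "$f" ]; then
            echo "== $f =="
            cat "$f"
        fi
    done
}

# List the jobs in the queue, if condor is installed on this machine.
if command -v condor_q >/dev/null 2>&1; then
    condor_q
else
    echo "condor_q not found on this machine"
fi

show_job_files
```

The log file is usually the first place to look if the output file stays empty.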
Currently condor is not available on all machines, only on those
running WBL or Fedora Core 2 or later.
If there is significant use of condor, other machines, including
desktops, can be added. Currently, one can submit jobs on:
lnxc0a
lnxc0b
lnxc0c
lnxc0d
lnxc13
lnxc14
lnxc17
lnxc18
lnxc19
lnxc20
lnxc22
lnxc23
This makes a total of 17 slots to which jobs can be submitted.
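The machines and slots that condor currently knows about can be listed with condor_status, another of the standard condor tools. A minimal sketch, assuming it is run on one of the machines listed above; pool_summary is a helper name made up for this example.

```shell
#!/bin/sh
# List the machines and slots in the pool, if condor is installed.
# pool_summary is a hypothetical helper, not a condor command.
pool_summary() {
    if command -v condor_status >/dev/null 2>&1; then
        condor_status
    else
        echo "condor_status not found on this machine"
    fi
}

pool_summary || echo "condor_status could not query the pool"
```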
Currently, there is no priority scheme implemented, but condor has the
capacity to provide one. Improvements
and expansion of condor will be determined by its usage.