Lunarc Info

Submitting jobs

There are several ways of running jobs on Lunarc; the following information is relevant for the cluster named milleotto (milleotto.lunarc.lu.se). See http://www.lunarc.lu.se for more information.

Lunarc uses Torque, the same queueing system as on Selma, so everything that works on Selma should also work at Lunarc.
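
In case it is useful, here is a minimal sketch of a Torque job script and its submission. The script name, program name, queue and resource limits are placeholders, not Lunarc-specific values:

 #!/bin/sh
 #PBS -l nodes=1:ppn=1
 #PBS -l walltime=00:10:00
 cd $PBS_O_WORKDIR    # the directory the job was submitted from
 ./myprog             # hypothetical program name

Submit the script and check on it with the usual Torque commands:

 > qsub myjob.pbs
 > qstat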

The system also supports submitting a batch of unique jobs using mpiexec (see the mpiexec section below). Each job has access to three different directories:

  • /home/username/ -- backed up, but only 700 MB.
  • /disk/global/username/ -- no backup.
  • /disk/local/ -- the discs belonging to the individual nodes. On milleotto you don't seem to have direct access to this directory; your job can work there, but you can't see what it is doing until it has finished.

Jobs that use the disc a lot should be run on /disk/local/; jobs that don't can be run on /disk/global/ instead. You shouldn't run jobs directly from /home/. The network has been unstable at times, so using the local discs is not a bad idea: jobs running there won't crash as easily if the network goes down.

The scripts referenced in the sections below can be used to do the above.
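
As a rough sketch of the idea (this is not one of the actual Lunarc scripts; the paths and the program name are assumptions), a job script using the local disc might stage its data like this:

 #!/bin/sh
 #PBS -l nodes=1
 #PBS -l walltime=01:00:00
 WORKDIR=/disk/local/$USER/$PBS_JOBID          # unique scratch dir on the node
 mkdir -p $WORKDIR
 cp /disk/global/$USER/input.dat $WORKDIR/     # stage input to the local disc
 cd $WORKDIR
 /disk/global/$USER/myprog input.dat           # hypothetical program
 cp -r $WORKDIR /disk/global/$USER/results/    # copy results back when done
 rm -rf $WORKDIR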

Job Status

On top of Torque, there is an additional layer of administration at Lunarc called Maui. It provides more information about your jobs; these are the two most useful commands:

  • showq -- shows the whole queue and gives you an idea of when your job will start, finish, etc.
  • checkjob -- gives detailed information about a job: elapsed time, remaining time, which nodes are used, etc. Pass the job's id number as an argument.
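
For example (the job id is of course just a placeholder):

 > showq
 > checkjob 123456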

Docenten specifics

If you are running a job from /disk/local/ on Docenten (the older cluster) and want to have a look at your files, use rsh (instead of ssh) to access the individual nodes; you can find out which nodes a job uses with checkjob (see above). If you want to copy the files to /disk/global/ before the job is finished, use rcp. The script source:trunk/lunarc/cpjob.sh takes a job id as an argument and copies data from all nodes used by the job to the directory from which it is called.
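
A minimal sketch of the idea behind such a script (not cpjob.sh itself; the way node names are extracted from the checkjob output, and the directory layout on the nodes, are guesses):

 #!/bin/sh
 # Sketch: copy data from every node used by a job to the current directory.
 JOBID=$1
 # Assumes checkjob output contains node names of the form "n" followed by
 # digits -- this pattern is a guess and may need adjusting.
 for node in `checkjob $JOBID | grep -o 'n[0-9][0-9]*' | sort -u`; do
     rcp -r $node:/disk/local/$USER/$JOBID ./$node    # rcp, as on Docenten
 done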

mpiexec

The command mpiexec used in the scripts above requires that your program either uses an MPI package or "manually" handles the arguments passed by mpiexec. The script source:trunk/lunarc/runlocal.sh is an example of the latter, and the code in source:trunk/lunarc/mpich_main.cc shows what it might look like using the C++ MPI bindings. A program like mpich_main.cc should be compiled with mpiCC, which can only be used after it has been activated by the lines

 > . use_modules
 > module load mpich-gcc3
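
Putting the pieces together, the program might be compiled and submitted roughly like this (a sketch; the resource limits and file names are placeholders):

 > mpiCC -o mpich_main mpich_main.cc

 #!/bin/sh
 #PBS -l nodes=2:ppn=2
 #PBS -l walltime=01:00:00
 . use_modules
 module load mpich-gcc3    # make the MPI runtime available inside the job
 cd $PBS_O_WORKDIR
 mpiexec ./mpich_main      # starts one copy per allocated processor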

If your program doesn't take any arguments you can of course ignore the whole thing. If the different runs in a batch differ in a complex way, you can either start them individually as separate jobs or use a solution like localjob2.pbs; a sketch of the general idea follows.
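
As an illustration of the "manual" approach, a wrapper in the spirit of runlocal.sh could pick its parameters from the rank of the process. Note that the environment variable name used here ($MPIEXEC_RANK) is a guess and may differ on the actual system; runlocal.sh shows the real mechanism:

 #!/bin/sh
 # Hypothetical wrapper: each copy started by mpiexec runs the same
 # program with different parameters derived from the process's rank.
 RANK=${MPIEXEC_RANK:-0}       # variable name is a guess
 ./myprog --seed $RANK         # myprog and --seed are placeholders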

Compiling

If you want to use a newer version of gcc (at the time of writing, 4.1.2 instead of 3.4.6) the following commands are required

 > . use_modules
 > module load gcc
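
After loading the module, gcc and g++ should point to the newer version, which can be verified and used as usual (the file names here are placeholders):

 > gcc --version
 > g++ -O2 -o myprog myprog.cc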