Running jobs on the computers

This page collects useful commands and tips for running jobs on the department computers.

Information about the department computers

The command che2 can be used to:

  • List all department computers
  • List jobs running on the computers
  • List free computers

Run 'che2 -h' for more information. Note that che2 requires password-less ssh login to the other computers (see below for more information), except when it only lists all computers ('che2 -l').

Logging into another computer

Use the ssh command to log into another computer. For example, to log into the computer bob, run:

>> ssh bob

followed by typing your password. Using the -X flag allows you to run X applications on the remote computer.

To set up password-less ssh login, copy your $HOME/.ssh/id_dsa.pub file to $HOME/.ssh/authorized_keys. If you already have an authorized_keys file, append the contents of id_dsa.pub to it instead. Also make sure that the authorized_keys file has rw permissions for the user only (i.e. run chmod 600 authorized_keys).

If you do not have an id_dsa.pub file, or do not even have a $HOME/.ssh directory, you need to create them. Run

>> ssh-keygen -t dsa

and leave the passphrase empty (just press Enter when asked for one).
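
The key setup described above can be sketched as a small shell function. This is only a minimal sketch, assuming a POSIX shell: the function name append_pubkey is hypothetical (it is not an existing department script), and the default paths are the ones named above.

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# unless that exact line is already present, then restrict permissions.
# Usage: append_pubkey [public_key_file] [authorized_keys_file]
append_pubkey() {
    pubkey=${1:-$HOME/.ssh/id_dsa.pub}
    authfile=${2:-$HOME/.ssh/authorized_keys}

    # The public key must exist; create it first with ssh-keygen if not.
    [ -f "$pubkey" ] || { echo "missing $pubkey" >&2; return 1; }

    # Create the file if needed so that grep has something to read.
    touch "$authfile" || return 1

    # -F fixed strings, -x whole-line match, -q quiet, -f patterns from file:
    # append the key only if it is not already listed.
    grep -qxF -f "$pubkey" "$authfile" || cat "$pubkey" >> "$authfile"

    # rw permissions for the user only.
    chmod 600 "$authfile"
}
```

Calling append_pubkey without arguments uses the default paths. Running it twice does not add the key a second time.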

The first time you log into a new computer you need to answer yes to the prompt, which adds the computer to your list of known hosts. You can run the script 'add_to_known_hosts.pl' to do that automatically for all department computers.

Starting jobs manually

If you manually start jobs on other computers, remember to:

  • always run with the lowest priority (i.e. nice -n 19 'command')
  • make sure that the computer you run on has enough memory (RAM)
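
As a rough illustration of the two points above, here is a minimal sketch. The wrapper name low_priority and the program ./my_job are hypothetical placeholders, and 'free -m' assumes that the remote computer runs Linux.

```shell
# Hypothetical wrapper: run any command at the lowest priority.
low_priority() {
    nice -n 19 "$@"
}

# Possible usage, with bob as the remote computer and ./my_job as a
# placeholder for your own program:
#   ssh bob free -m                 # check the available memory first (Linux)
#   ssh bob 'nice -n 19 ./my_job'   # start the job at the lowest priority
```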

Submitting jobs using a program

There are scripts available that make it easier to run a large number of jobs. Below is a short description of the Robot.pl script.

  • Basic usage is: 'Robot.pl cmd.txt &', where cmd.txt is a plain text file with commands that you would like to run, one line for each command.
  • The commands are then distributed among free computers.
  • The Robot.pl script modifies the cmd.txt file, commenting out commands that have been submitted to a computer.
  • As long as the Robot.pl script is running you can add more lines to the cmd.txt file and they will be submitted to free computers.
  • Running the script as: 'Robot.pl -i cmd.txt' will enter a small shell with a limited set of functions to monitor and remove jobs.
  • Run 'Robot.pl -h' for more information.
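
When there are many similar jobs, the cmd.txt file can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the function name make_cmd_file, the program ./simulate and the input file names are all hypothetical placeholders.

```shell
# Hypothetical helper: write one command line per input file to a command file.
# Usage: make_cmd_file cmd.txt input1.dat input2.dat ...
make_cmd_file() {
    cmdfile=$1
    shift
    : > "$cmdfile"    # start with an empty file
    for input in "$@"; do
        # ./simulate is a placeholder for your own program.
        printf './simulate %s\n' "$input" >> "$cmdfile"
    done
}

# Possible usage:
#   make_cmd_file cmd.txt run1.dat run2.dat run3.dat
#   Robot.pl cmd.txt &
```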

Last modified on Nov 15, 2011, 9:59:50 AM