...

If you do try to run a long job in interactive mode, it will be killed after 10-15 minutes, and you may see a message like this:

Code Block

Message from root@login1.ls4.tacc.utexas.edu on pts/127 at 09:16 ...
Please do not run scripts or programs that require more than a few minutes of
CPU time on the login nodes.  Your current running process below has been
killed and must be submitted to the queues, for usage policy see
http://www.tacc.utexas.edu/user-services/usage-policies/
If you have any questions regarding this, please submit a consulting ticket.

...

First, let's make a very simple job to run. All we need to do is create a text file, which we will simply call commands. Each line in this file is a command, written exactly as you would type it into the terminal yourself.

Code Block
  nano commands

Add these lines to your "commands" file:

Code Block

date > date.out
ls > ls.out
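If you would rather not open an editor, the same two-line file can be written straight from the shell. This is only a sketch: it assumes you are in the directory where the commands file should live and that overwriting an existing file named commands is acceptable.

```shell
# Write the two example commands into a file named "commands",
# one command per line (this overwrites any existing file of that name).
cat > commands <<'EOF'
date > date.out
ls > ls.out
EOF

# Each line is one task, so the line count is the number of tasks.
# The $(( )) wrapper strips any padding that wc adds on some systems.
n_tasks=$(( $(wc -l < commands) ))
echo "commands contains $n_tasks task(s)"
```

Counting the lines is a quick sanity check that the file holds exactly the commands you meant to queue.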

...

TACC has supplied a sample launcher script, which we will modify to queue and execute our job. First, type:

Code Block

 module load launcher

Now let's copy the example launcher file.

Code Block
  cp $TACC_LAUNCHER_DIR/launcher.sge ./

There are a few things we should change inside of this file. Open the file using nano like so:

Code Block

 nano launcher.sge

First, let's change the name of the job.

...

Under the -V line, add two new lines like so (-M sets the address to notify, and -m be requests email when the job begins and ends):

Code Block

 #$ -M my_email@something.com
 #$ -m be

...

Change the line that says "setenv CONTROL_FILE" to say:

Code Block
  setenv CONTROL_FILE job.csh

...

Now that we have our job file and our launcher, we need to queue the launcher. Type:

Code Block

 qsub launcher.sge

Lonestar will check that everything you've specified is correct and, if it is, your job will be queued.

You can check the status of your job like so:

Code Block

 qstat

This will tell you each job's priority and what state it is in.

...

If you notice that your job is not going to run correctly, you can delete it like so:

Code Block
  qdel job-ID

You can obtain the job-ID by running qstat.

If you are nosy and want to see all of the jobs queued and running on Lonestar, then use this command:

Code Block
  showq

You can also see just your jobs in this format:

Code Block
  showq -u

You can make a job depend on another job, so that it starts only after the first job has completed, using this command:

Code Block
  qsub -hold_jid job-ID launcher.sge

...

While your job is running, TACC creates three files with names based on the -o field in the launcher. These files are named like so:

Code Block

 (job_name).e(job-ID)
 (job_name).pe(job-ID)
 (job_name).o(job-ID)
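To make the naming pattern concrete, the sketch below builds the three file names for one job. The job name my_job and the job ID 123456 are made-up values; substitute the name from your own launcher and the ID reported by qstat.

```shell
# Hypothetical values, for illustration only.
job_name="my_job"
job_id="123456"

# The three files follow the (job_name).<suffix>(job-ID) pattern.
file_e="${job_name}.e${job_id}"
file_pe="${job_name}.pe${job_id}"
file_o="${job_name}.o${job_id}"

printf '%s\n' "$file_e" "$file_pe" "$file_o"
```

Once a job has started, something like `ls my_job.*123456` (with your real name and ID) should list the same three files.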

...

We have created a Python script called launcher_creator.py that makes creating a launcher.sge script a breeze. You will probably want to use this for the rest of the course.

First, let's copy the script to the directory we've been doing everything in:

...

.

...

Now run the script with the -h option to show the help message:

Code Block
  module load python
 ./launcher_creator.py -h

Option  Name             Description
-n      name             The name of the job.
-a      allocation       The allocation you want to charge the run to.
-q      queue            The queue to submit to, like 'normal' or 'largemem', etc.
-w      wayness          Optional. The number of jobs in a job list you want to give to each node. (Default is 12 for Lonestar, 16 for Stampede.)
-N      number of nodes  Optional. Specifies a certain number of nodes to use. You probably don't need this option, as the launcher calculates how many nodes you need based on the job list (or Bash command string) you submit. It sometimes comes in handy when writing pipelines.
-t      time             Time allotment for job, format must be hh:mm:ss.
-e      email            Optional. Your email address if you want to receive an email from Lonestar when your job starts and ends.
-l      launcher         Optional. Filename of the launcher. (Default is <name>.sge)
-m      modules          Optional. String of module management commands. "module load launcher" is always in the launcher, so there's no need to include that.
-b      Bash commands    Optional. String of Bash commands to execute.
-j      Command list     Optional. Filename of list of commands to be distributed to nodes.
-s      stdout           Optional. Setting this flag outputs the name of the launcher to stdout.
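As an illustration of how these options combine, the sketch below assembles one possible invocation. Every value in it (job name, allocation, queue, time limit, file name) is a placeholder rather than a recommendation, and the script only prints the command so you can inspect it before running it yourself.

```shell
# Placeholder values; replace them with your own.
job_name="my_job"
allocation="MY_ALLOCATION"
queue="normal"
time_limit="01:00:00"      # must be hh:mm:ss
command_list="commands"

# Check the hh:mm:ss format before building the command.
case "$time_limit" in
    [0-9][0-9]:[0-5][0-9]:[0-5][0-9]) ;;
    *) echo "time must be hh:mm:ss, got: $time_limit" >&2; exit 1 ;;
esac

# Build the argument list so each value stays a single argument.
set -- ./launcher_creator.py -n "$job_name" -a "$allocation" \
       -q "$queue" -t "$time_limit" -j "$command_list"

# Print the command; run it with "$@" once it looks right.
cmd_line="$*"
echo "$cmd_line"
```

Only the required options and -j are used here; the optional ones (-w, -N, -e, -l, -m, -b, -s) can be appended in the same way.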

We should mention that launcher_creator.py does some under-the-hood magic for you and automatically calculates how many cores to request on Lonestar, assuming you want one core per process. You may not realize it, but you should be grateful: this saves you from ever having to think about a confusing calculation.
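For the curious, the arithmetic is probably along the following lines. This is a guess at the calculation, not the script's actual code: it assumes a Lonestar node has 12 cores (matching the default wayness of 12 listed above), that cores are requested in whole nodes, and that the job list contains 30 commands.

```shell
# Assumptions (not taken from launcher_creator.py itself):
cores_per_node=12      # assumed cores per Lonestar node
wayness=12             # tasks per node (the -w option)
n_tasks=30             # example: a job list with 30 commands

# Round up: a partly filled node still has to be requested whole.
nodes=$(( (n_tasks + wayness - 1) / wayness ))
cores=$(( nodes * cores_per_node ))

echo "$n_tasks tasks -> $nodes nodes -> request $cores cores"
```

With these numbers, 30 tasks need 3 nodes, so 36 cores would be requested even though only 30 are kept busy.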

...