...
TACC resources are partitioned into queues: named sets of compute nodes with different characteristics. The main ones on ls6 are listed below. Generally you use development (-q development) when you are writing and testing your code, then normal once you're sure your commands will execute properly.
...
- When you run a batch job, your project allocation gets "charged" for the time your job runs, in the currency of SUs (Service Units).
- SUs are tied to node hours: usually 1 SU = 1 "standard" node hour.
Tip | ||
---|---|---|
| ||
Jobs should consist of tasks that will run for approximately the same length of time. This is because the total node hours for your job is calculated as the run time for your longest running task (the one that finishes last). For example, if you specify 100 commands and 99 finish in 2 seconds but one runs for 24 hours, you'll be charged for 100 x 24 node hours even though the total amount of work performed was only ~24 hours. |
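The arithmetic in the Tip above can be sketched in a few lines of bash. This is only an illustration, not TACC's actual accounting: the function name `charged_node_hours` is made up here, and it assumes the job uses just enough nodes to hold all tasks at the requested wayness and is charged for every node until the longest task finishes.

```bash
#!/usr/bin/env bash
# Sketch of the charging arithmetic described in the Tip.
# Assumptions (not from TACC documentation):
#   nodes = tasks / wayness, rounded up
#   charge = nodes x wall-clock hours of the longest-running task

charged_node_hours() {
    local tasks=$1 wayness=$2 longest_hours=$3
    local nodes=$(( (tasks + wayness - 1) / wayness ))   # integer round-up
    echo $(( nodes * longest_hours ))
}

# The Tip's example: 100 commands, one per node, longest one runs 24 hours
charged_node_hours 100 1 24
```

Under these assumptions the example is charged 2400 node hours, even though 99 of the 100 nodes sat idle almost the whole time.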
...
tasks per node (wayness) | cores available to each task | memory available to each task |
---|---|---|
1 | 128 | ~256 GB |
2 | 64 | ~128 GB |
4 | 32 | ~64 GB |
8 | 16 | ~32 GB |
16 | 8 | ~16 GB |
32 | 4 | ~8 GB |
64 | 2 | ~4 GB |
128 | 1 | ~2 GB |
- In launcher_creator.py, wayness is specified by the -w argument.
  - The default is 128 (one task per core).
- A special case is when you have only 1 command in your job.
  - In that case, it doesn't matter what wayness you request.
  - Your job will run on one compute node, and have all cores available.
Your choice of the wayness parameter will depend on the nature of the work you are performing: its computational intensity, its memory requirements and its ability to take advantage of multi-processing/multi-threading (e.g. the bwa -t option or the hisat2 -p option).
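The wayness table is simply the node's resources divided evenly among its tasks. A minimal sketch, assuming the 128-core, ~256 GB node figures from the table (the function name `per_task_resources` is hypothetical and memory values are approximate):

```bash
#!/usr/bin/env bash
# Per-task share of one node, using the figures from the wayness table.
NODE_CORES=128     # cores per node
NODE_MEM_GB=256    # approximate memory per node, in GB

# Print the cores and approximate memory each task gets at a given wayness.
per_task_resources() {
    local wayness=$1
    local cores=$(( NODE_CORES / wayness ))
    local mem_gb=$(( NODE_MEM_GB / wayness ))
    echo "wayness=${wayness} cores_per_task=${cores} mem_per_task_gb=${mem_gb}"
}

for w in 1 2 4 8 16 32 64 128; do
    per_task_resources "$w"
done
```

For example, a program run with 8 threads and needing under ~16 GB would fit at a wayness of 16.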
...
Code Block | ||
---|---|---|
| ||
cat cmd*log
# or, for a listing ordered by command number (the 2nd space-separated field)
cat cmd*log | sort -k 2,2n |
The vertical bar ( | ) above is the pipe operator, which connects one program's standard output to the next program's standard input.
(Read more about the sort command at Some Linux commands: cut, sort, uniq, and more about Piping.)
You should see output something like that below.
...
Notice that there are 4 different host names. This expression:
Code Block | ||
---|---|---|
| ||
# the host (node) name is in the 11th field
cat cmd*log | awk '{print $11}' | sort | uniq -c |
should produce output something like this (read more about Piping a histogram)
Code Block | ||
---|---|---|
| ||
4 c303-005.ls6.tacc.utexas.edu
4 c303-006.ls6.tacc.utexas.edu
4 c304-005.ls6.tacc.utexas.edu
4 c304-006.ls6.tacc.utexas.edu |
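To try the same histogram pipeline without real job logs, here is a small self-contained sketch. The log lines are invented: only the field positions matter (command number in field 2, host name in field 11), and the host names `node-01` and `node-02`, the placeholder fields and the function name `host_histogram` are all hypothetical.

```bash
#!/usr/bin/env bash
# Build a few stand-in log files in a temporary directory.
tmpdir=$(mktemp -d)
for i in 1 2 3 4; do
    host="node-0$(( (i + 1) / 2 ))"   # commands 1-2 on node-01, 3-4 on node-02
    # Fields 3-10 are placeholders so the host lands in field 11
    echo "Command $i f3 f4 f5 f6 f7 f8 f9 f10 $host" > "$tmpdir/cmd$i.log"
done

# Count how many commands ran on each host (field 11).
host_histogram() {
    local dir=$1
    cat "$dir"/cmd*log | awk '{print $11}' | sort | uniq -c
}

host_histogram "$tmpdir"
```

The sort step matters because uniq -c only collapses adjacent identical lines.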
Some best practices
...
Here's an example directory structure:
$WORK/my_project
/01.original # contains or links to original fastq files
/02.fastq_prep # run fastq QC and trimming jobs here
/03.alignment # run alignment jobs here
/04.gene_counts # analyze gene overlap here
/51.test1 # play around with stuff here
/52.test2 # play around with other stuff here
...
Code Block | ||||
---|---|---|---|---|
| ||||
cd $WORK/my_project/02.fastq_prep
ls ../01.original/my_raw_sequences.fastq.gz |
...
Code Block | ||||
---|---|---|---|---|
| ||||
cd $WORK/my_project/02.fastq_prep
ln -sf ../01.original fq
ls ./fq/my_raw_sequences.fastq.gz |
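The numbered-directory layout and the relative symbolic link can be tried end to end in a throwaway location. This sketch uses a temporary directory instead of your real project area, and an empty file stands in for the fastq data; both are assumptions made only so the example is safe to run anywhere.

```bash
#!/usr/bin/env bash
# Create the example layout under a scratch temp directory (not your real project area).
project="$(mktemp -d)/my_project"
mkdir -p "$project"/{01.original,02.fastq_prep,03.alignment}

# Empty stand-in for a real raw sequence file
touch "$project/01.original/my_raw_sequences.fastq.gz"

# From the prep directory, link to the originals with a relative path
cd "$project/02.fastq_prep"
ln -sf ../01.original fq

# The file is now reachable through the short link name
ls ./fq/my_raw_sequences.fastq.gz
```

Because the link target is relative, the link keeps working if the whole my_project directory is moved or copied with its links preserved.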
...
Code Block | ||||
---|---|---|---|---|
| ||||
# navigate through the symbolic link in your Home directory
cd ~/scratch/core_ngs/slurm/simple
ls ../wayness
ls ../..
ls -l ~/.bashrc |
...