The modules system

Introduction

TACC computers, like all UNIX systems, set up a default environment and let you run additional commands to alter it. The default setup is controlled by system and user files that your shell executes before you see the shell prompt.

On all TACC computers, your environment is controlled with the modules utility. At login, module commands set up a basic environment for the default compilers, tools, and libraries, including variables such as $PATH, $MANPATH, and $LD_LIBRARY_PATH and directory locations such as $WORK, $SCRATCH, and $HOME. These environment variables are automatically kept up to date when system and application software is upgraded.

Each of the major TACC applications has a modulefile that sets, unsets, appends to, or prepends to environment variables such as $PATH, $LD_LIBRARY_PATH, $INCLUDE_PATH, and $MANPATH for the specific application. Each modulefile also sets application-specific functions and aliases. The module command takes a subcommand followed by its arguments; loading a module, for example, looks like this:

login1$ module load module_name
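To make the "prepends to $PATH" idea concrete, here is a minimal sketch of what loading and unloading a module amounts to for a single variable. The helper names prepend_to_path and remove_from_path and the directory /opt/apps/bwa/bin are hypothetical, chosen only for illustration; the real modules utility does considerably more bookkeeping than this.

```shell
# Hypothetical helper: put a directory at the front of PATH,
# roughly what a modulefile's "prepend" step does on load.
prepend_to_path() {
    PATH="$1:$PATH"
    export PATH
}

# Hypothetical helper: drop every occurrence of a directory from PATH,
# roughly what happens to that entry on unload.
# Note: this simple version also drops empty PATH entries.
remove_from_path() {
    local dir="$1" new="" entry IFS=":"
    for entry in $PATH; do
        if [ "$entry" != "$dir" ]; then
            new="${new:+$new:}$entry"
        fi
    done
    PATH="$new"
    export PATH
}

# Example (the directory is an assumption, not a real TACC location):
#   prepend_to_path /opt/apps/bwa/bin
#   remove_from_path /opt/apps/bwa/bin
```

Because the new directory goes at the front, its executables take precedence over any same-named programs later in $PATH, which is why loading a different version changes what a bare command name resolves to.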

A full and detailed manual for the modules utility is available, so we will concentrate on a few key use cases.

Loading software into your PATH

Let's say you need access to the BWA aligner for some analyses. Enter module load bwa at the command line for interactive use, or put it near the top of your job submission script for batch use. If you need a specific version, append it after the name, as in module load bwa/0.6.1 (this assumes that version is available on the system).
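For batch use, it can help to fail early with a clear message if the module command is not defined in the shell running your script. The following is a sketch only: require_module is a hypothetical helper name, and scheduler directives are left out because they depend on the system you submit to.

```shell
# Hypothetical helper: load a module, or report a clear error
# if the "module" command is not defined in this shell.
require_module() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found in this shell" >&2
        return 1
    fi
    module load "$1"
}
```

Near the top of a job script you might then write require_module bwa/0.6.1 || exit 1, so that the job stops before the first bwa command rather than failing partway through.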

What modules are available for use?

There are two answers to this. The simplest is to enter module avail. Try this at your Lonestar command prompt and you will see that there's a lot of software pre-built by TACC staff for your use. However, module avail gives you a big list, and it only shows modules available in the context of the C compiler that's currently active on the system. The second answer is a keyword search: the modules system shows you any module whose name, tags, or help text contains <keyword>. Please try the following: module key Genomics.
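Since module avail produces a long list, filtering it is often convenient. The sketch below assumes that the listing is written to standard error, as it is on many modules installations, which is why the stream is redirected before piping; find_module is a hypothetical helper name.

```shell
# Hypothetical helper: case-insensitive search of the "module avail" listing.
# Assumption: the listing goes to standard error, so merge it into
# standard output (2>&1) before handing it to grep.
find_module() {
    module avail 2>&1 | grep -i -- "$1"
}
```

For example, find_module bwa would print only the lines of the listing that mention bwa. Like module avail itself, this only sees modules visible under the currently active compiler.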

What modules do you have loaded?

This one is simple: module list. Please try this for yourself. Below is what one of our module listings looks like:

Currently Loaded Modules:
  1) TACC-paths  3) cluster-paths  5) mvapich2/1.6  7) tar/1.22  9) TACC      
  2) Linux       4) intel/11.1     6) gzip/1.3.12   8) cluster   10) irods/2.5
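Inside a script, the numbered two-column layout above is awkward to parse. As a sketch, and assuming your modules installation exports the colon-separated $LOADEDMODULES variable (common, but worth confirming with echo $LOADEDMODULES), the same information can be printed one module per line; loaded_modules is a hypothetical helper name.

```shell
# Hypothetical helper: print the currently loaded modules, one per line.
# Assumption: the modules utility maintains LOADEDMODULES as a
# colon-separated list such as "intel/11.1:mvapich2/1.6".
loaded_modules() {
    if [ -n "${LOADEDMODULES:-}" ]; then
        printf '%s\n' "$LOADEDMODULES" | tr ':' '\n'
    fi
}
```

Recording this output at the start of a job is a simple way to document the environment a result was produced in.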

Removing a loaded module from your environment

Let's say you are working on a large, unwieldy chunk of Python code that you've inherited from your predecessor (hey, this is a reality-based NSG course...). You are usually a Python 2.7.x user, which you invoke via module load python/2.7.1, but you suspect that your predecessor developed this code in Python 2.4.3. You could manually reconfigure your workspace, and probably end up tangled in the weeds somewhere, or you could simply run module unload python/2.7.1. TACC also maintains two branches of Python 2.7.x; you can swap between them with module swap python/2.7.1 python/2.7.1-epd, and back again by reversing the two names.

Module swapping is also handy for changing C compilers, which is otherwise quite a daunting task. Swap between GCC 4.4.3 and 4.4.5 via module swap gcc/4.4.3 gcc/4.4.5, or from Intel to GCC via module swap intel gcc/4.4.5. Since a lot of bioinformatics code only builds under GCC, you may see module swap intel gcc/4.4.5 in many instructions and tutorials related to bioinformatics at TACC.
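A build script that needs GCC cannot always know whether the Intel compiler is currently loaded. The sketch below picks between swap and load accordingly; use_gcc is a hypothetical helper name, and the check relies on the assumption that the modules utility exports a colon-separated $LOADEDMODULES variable.

```shell
# Hypothetical helper: make the given GCC module the active compiler.
# Assumption: LOADEDMODULES lists loaded modules separated by colons.
use_gcc() {
    case ":${LOADEDMODULES:-}:" in
        *:intel:*|*:intel/*)
            # Intel is loaded: replace it with the requested GCC module
            module swap intel "$1"
            ;;
        *)
            # Intel is not loaded: just load the requested GCC module
            module load "$1"
            ;;
    esac
}
```

Calling use_gcc gcc/4.4.5 before a build would then behave sensibly whether or not the Intel module happens to be loaded.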

We would prefer to always use the Intel C compiler, since it usually produces programs that run 5-30% faster than those built with GCC. If you have a favorite application that is only available under GCC at TACC, feel free to contact its developers and ask them to support the Intel compiler. We'll even help them update their code!