Using the stampede2 cluster
This page is a quick start for using the stampede2 cluster at TACC.
The stampede2 User Guide
For complete up-to-date information, always see: TACC's stampede2 User Guide
Start a new terminal window. On a Mac, click the magnifying glass at the right-hand side of the toolbar at the top of the screen and type "terminal". On Windows, use PuTTY or Cygwin.
Before logging onto TACC servers, multi-factor authentication must be set up. Click here for an overview of this process, and click here to begin setting it up.
You will need to provide your password and a TACC token to successfully log in to stampede2 or any other TACC cluster.
ssh <my_user_name>@stampede2.tacc.utexas.edu
Setting up a profile
There are many flavors of Linux/Unix shells. The default for TACC's Linux (and most other Linuxes) is bash (Bourne-again shell), which we will use throughout.
Whenever you log in via an interactive shell as you did above, a well-known script is executed by the shell to establish your favorite environment settings. We've set up a common profile for you to start with that will help you know where you are in the file system and make it easier to access some of our shared resources. To set up this profile, do the following steps after logging in:
Copy a preconfigured "profile" to use with your account
cdh
mv ~/.profile ~/.profile.old
cp /work2/projects/BioITeam/projects/courses/Core_NGS_Tools/tacc/bashrc.corengs.stampede2 ~/.profile
echo "export PATH=\$PATH:/work2/projects/BioITeam/common/bin/" >> ~/.profile
chmod 600 .profile
source .profile
ls
The chmod 600 .profile command marks the file as readable and writable only by you. The .profile script file will not be executed unless it has these permission settings. Note that the well-known filename is .profile (or .profile_user on some systems), which bash reads when you log in.
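To see what chmod 600 actually does, here is a minimal sketch that works on a scratch file instead of your real ~/.profile. The demo_dir and perms names are only for illustration.

```shell
# Work on a scratch copy so the real ~/.profile is untouched.
demo_dir=$(mktemp -d)
touch "$demo_dir/.profile"

# 600 = read+write for the owner, nothing for group or others.
chmod 600 "$demo_dir/.profile"

# The first 10 characters of `ls -l` are the file type and permission bits.
perms=$(ls -l "$demo_dir/.profile" | cut -c1-10)
echo "$perms"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

The printed string is -rw-------: a regular file, with rw for you and no access for anyone else. You can run the same ls -l on ~/.profile to check your own file.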
Notice that when you do a normal ls to list the contents of your home directory, this file doesn't appear. That's because it's a hidden "dot file": a file whose name begins with a period. To see these hidden files, use the -a (all) switch for ls.
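If you want to try the difference between ls and ls -a without touching your home directory, the following sketch sets up a scratch directory with one hidden and one visible file. The file names are made up for the example.

```shell
# Scratch directory containing one dot file and one ordinary file.
demo_dir=$(mktemp -d)
touch "$demo_dir/.profile" "$demo_dir/notes.txt"

# Plain ls skips names that begin with a period.
plain=$(ls "$demo_dir")

# ls -a lists everything, including . (this directory) and .. (its parent).
all=$(ls -a "$demo_dir")

echo "ls shows:"
echo "$plain"
echo "ls -a shows:"
echo "$all"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

Plain ls reports only notes.txt, while ls -a also reports .profile along with the . and .. entries.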
Here, you also got a chance to see several routine and important Unix commands in use: ls to list files, mv to move a file to a different location or, in this case, to rename it, and cp to make a copy of an existing file. You also saw two shortcuts for specifying a path: ~, which stands for your home directory, and the preset alias cdh, which changes to the home directory.
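The same backup-then-copy pattern can be practiced safely in a scratch directory. This is only a sketch: the settings file and its contents are invented for the example.

```shell
# Scratch directory with one small file to practice on.
demo_dir=$(mktemp -d)
echo "export EDITOR=nano" > "$demo_dir/settings"

# mv renames (or moves) a file; the old name no longer exists afterwards.
mv "$demo_dir/settings" "$demo_dir/settings.old"

# cp makes a second, independent copy under a new name.
cp "$demo_dir/settings.old" "$demo_dir/settings"

# Both names now exist.
listing=$(ls "$demo_dir")
echo "$listing"

# The shell replaces an unquoted ~ with your home directory before running the command.
echo ~

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

After the mv and cp, the listing contains both settings and settings.old, which mirrors what happened to ~/.profile and ~/.profile.old above.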
Transferring Files to and from stampede2
Obtaining the Path
It's a good idea to open 2 terminals for transferring files.
- One logged in on stampede2 with the current directory set to where you want to transfer files to or from
- One on your computer with the current directory set to where you want to transfer files to or from
On stampede2:
Go to the directory where you want your files to be or where you want to copy from.
Type
pwd
This gives the absolute path to your directory. It might start with "/home" or "/work" depending on what directory you're in.
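As a rough sketch of how that path gets used, assuming you transfer with scp from a terminal on your own computer: the path printed by pwd goes after the colon in the remote side of the command. The user name, remote directory, and file names below are hypothetical placeholders, and the commands are only printed, not run.

```shell
# Hypothetical values: replace with your own user name and the path
# that pwd printed on stampede2.
tacc_user="my_user_name"
remote_dir="/home/01234/my_user_name/results"   # made-up example path

# scp syntax is: scp SOURCE DESTINATION, where a remote side is user@host:path
upload_cmd="scp my_reads.fastq ${tacc_user}@stampede2.tacc.utexas.edu:${remote_dir}/"
download_cmd="scp ${tacc_user}@stampede2.tacc.utexas.edu:${remote_dir}/summary.txt ."

# Print the commands instead of running them, so nothing is copied by accident.
echo "$upload_cmd"
echo "$download_cmd"
```

Once the printed commands look right, you would type them in the terminal on your own computer. Each transfer asks for your password and TACC token, just like ssh.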
Mac/Linux
Windows
Modules
Modules are programs or sets of programs that have been set up to run on TACC. They make managing your computational environment very easy. All you have to do is load the modules that you need, and a lot of the advanced wizardry needed to set up the Linux environment has already been done for you. New commands just appear.
To see all modules available in the current context, type:
module avail
module load gatk
Why not load all the modules by default? Well, you may actually want many of the modules that you encounter in later tutorials to be loaded at login. The reason they are not loaded by default is to keep things lean for those people simulating hurricanes who don't want to load Bioperl every time they log in. Occasionally two different modules also don't play nicely together, and you will get messages that you have to "swap" one for another.
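One way to have a module loaded at login is to add its module load line to your .profile. The sketch below uses a hypothetical helper, add_line_once, that appends a line only if it is not already there, and it runs against a scratch file rather than your real profile.

```shell
# Append a line to a file only if that exact line is not already present.
# (add_line_once is a helper written for this example, not a TACC command.)
add_line_once() {
    file=$1
    line=$2
    # -q quiet, -x whole-line match, -F fixed string (no regular expressions)
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstrate on a scratch file rather than the real ~/.profile.
profile_copy=$(mktemp)
add_line_once "$profile_copy" "module load gatk"
add_line_once "$profile_copy" "module load gatk"   # second call changes nothing

# Count how many times the line appears in the file.
line_count=$(grep -c "module load gatk" "$profile_copy")
echo "$line_count"

# Clean up the scratch file.
rm -f "$profile_copy"
```

The count is 1 even though the helper was called twice, so running it again later will not pile up duplicate lines. To use it for real, you would pass ~/.profile as the first argument.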
Since module avail only shows modules in the current context (i.e. based on your currently loaded modules), to see all possible modules use:
module spider <freetext>
If you specify some text for <freetext>, you'll see all modules with that text anywhere in their title or description. For example, try to find the transcriptome assembler Trinity.
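Here is a small sketch of the Trinity search. The find_module wrapper is hypothetical and exists only so the snippet does something sensible on a machine where the module command is not installed, such as your laptop.

```shell
# Search for a module by name, but only if the module command is available.
# (find_module is a wrapper written for this example.)
find_module() {
    if command -v module >/dev/null 2>&1; then
        # module may write its report to stderr, so merge it into stdout.
        module spider "$1" 2>&1
    else
        echo "module command not found; run this on stampede2"
    fi
}

find_module trinity
```

On stampede2 you can simply type module spider trinity. The report should list any matching modules and their versions.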
Containers
A container is a way to encapsulate an application's code with all of its dependencies so that it can be run anywhere with little or no setup. TACC uses Singularity as its container solution. Biocontainers, a project that containerizes bioinformatics software with all its dependencies, is available on TACC clusters. More information about the bioinformatics tools present in Biocontainers can be found here.
module load biocontainers
Now let's go on to look at the directory structure at stampede2.