Introduction
OK. So you just read the latest issue of Bioinformatics (or did a Google search) and have discovered some new pieces of software that promise to slice and dice your data in new, interesting, and useful ways. Most often, these tools will be designed to run in a Linux environment. Unfortunately, the helpful support staff at TACC may not have had time to test these tools and make a proper module out of them (or maybe they didn't want to make 1,000+ modules for every piece of bioinformatics software out there). Perhaps there is a TACC module, but it was made a month or two back when the software was at version 1.01 and now it's at version 1.03, which has a bug fix or some nifty new bell and whistle.
The bottom line is that you are going to find yourself in a situation where module spider
will come up empty and you're on your own to install a piece of software that you are dying to try out on TACC.
Unfortunately, there is no double-click installer for TACC. Fortunately, a majority of the better and more mature programs out there (but by no means all bioinformatics software) can be fairly easily installed. If these instructions fail, you might need to find your nearest Linux guru. Or, you might try to consult Google and tinker with things a bit.
The overall steps for installing a program on a Linux system are:
...
For programs that are already compiled (converted from high level source code in a language like C into machine specific code), you are often given some choices and need to determine how to download the version that has the correct CPU architecture for your machine.
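As a minimal sketch of how you might check which CPU architecture you are on before choosing a download, the standard `uname -m` command prints the machine hardware name. The `flavor` labels below are illustrative assumptions; actual download pages name their builds differently.

```shell
# Print the machine hardware name, e.g. x86_64 for 64-bit Intel/AMD CPUs.
arch=$(uname -m)
echo "CPU architecture: $arch"

# Map it to a rough description of the build to look for.
# These labels are illustrative only; every download page words them differently.
case "$arch" in
  x86_64) flavor="64-bit Linux (x86_64)" ;;
  i?86)   flavor="32-bit Linux (x86)" ;;
  *)      flavor="other; check the download page for $arch" ;;
esac
echo "Look for a build labeled something like: $flavor"
```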
...
The website for the SSAHA2 read mapper has links to download executables compiled for several different architectures. Using commands that you have learned in earlier lessons, download the correct one to Lonestar and place it under the directory $HOME/local/bin
.
You can often right-click to copy the URL of a link on a website and then use ...
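For the download exercise above, the steps that follow the download itself might look roughly like this. It is only a sketch: `my_tool` is a hypothetical stand-in created on the spot, so that the example does not depend on any particular download URL.

```shell
# Create the personal bin directory if it does not exist yet.
mkdir -p "$HOME/local/bin"

# Stand-in for the real download: a tiny script named my_tool (hypothetical).
printf '#!/bin/sh\necho "my_tool ran"\n' > my_tool

# Downloaded files are usually not marked executable; fix that, then move it.
chmod +x my_tool
mv my_tool "$HOME/local/bin/"

# Run it by its full path to confirm that it works.
"$HOME/local/bin/my_tool"
```

With a real program, the `printf` line would be replaced by the actual download, and possibly by unpacking an archive.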
How the shell finds executables: $PATH
...
Instead of writing out the entire path to the executable to run it, as in one of these examples:
Code Block |
---|
login1$ /home1/01502/jbarrick/local/bin/ssaha2
login1$ $HOME/local/bin/ssaha2
|
Assuming you are using the bash shell, you can add the directory to your $PATH by editing your $HOME/.profile
or $HOME/.profile_user
configuration file. These files are basically just bash scripts that are run whenever you log in. You want to add a line that looks like this to the top of $HOME/.profile
:
Code Block |
---|
export PATH="$HOME/local/bin:$PATH"
|
...
Important! In order to have this change take effect, you must log out and log in again to force the shell to re-read the $HOME/.profile_user
file. Alternatively, you can use one of these commands to re-read it at any time:
Code Block |
---|
login1$ . $HOME/.profile_user
login1$ source $HOME/.profile_user
|
If your path is not working or you're curious about where else your shell is looking for commands, and in what order, then you might want to see the value of your $PATH
.
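One way to inspect the search path, shown here as a small sketch that uses only standard tools:

```shell
# Print the raw value: a colon-separated list of directories.
echo "$PATH"

# Print one directory per line, numbered in the order the shell searches them.
echo "$PATH" | tr ':' '\n' | nl
```

The shell runs the first match it finds, so directories nearer the top of the numbered list take priority.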
...
Warning! If you forget to include $PATH
on the right side in the above example, then you will tell your shell to not look in the usual places for executables any more. This means that ls
, cd
, and other common commands will no longer work without typing out their whole paths, e.g. /bin/ls
. This can be extremely confusing!
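To see this failure safely, you can try it inside a subshell so that your login shell is never affected. This sketch assumes that `ls` lives at `/bin/ls`, as in the example above, and that you have not installed your own `ls` in `$HOME/local/bin`.

```shell
# Run the experiment inside $( ... ), a subshell, so the broken PATH
# never leaks into the shell you are actually working in.
broken=$(
  PATH="$HOME/local/bin"    # oops: forgot ":$PATH" on the end
  if ls > /dev/null 2>&1; then echo "ls found"; else echo "ls not found"; fi
)
echo "With the broken PATH: $broken"

# Typing the whole path keeps working no matter what PATH says.
full=$(
  PATH="$HOME/local/bin"
  if /bin/ls > /dev/null 2>&1; then echo "/bin/ls works"; else echo "/bin/ls failed"; fi
)
echo "With the full path: $full"
```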
Handling multiple versions
If you install your own newer version of a command that is already available on TACC, then you might get confused about which version you are running when you type the command. You can see the whole path to the executable that will be run when you type a one-word command using the which
command.
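The following sketch imitates that situation with two scratch directories, each holding its own copy of a hypothetical program called `my_tool`. The directory names and version strings are made up for illustration, and the example changes `PATH` only for the current shell session.

```shell
# Make two scratch directories, each with its own "my_tool" (hypothetical name).
old_dir=$(mktemp -d)
new_dir=$(mktemp -d)
printf '#!/bin/sh\necho "version 1.01"\n' > "$old_dir/my_tool"
printf '#!/bin/sh\necho "version 1.03"\n' > "$new_dir/my_tool"
chmod +x "$old_dir/my_tool" "$new_dir/my_tool"

# Directories earlier in PATH win, so new_dir shadows old_dir here.
PATH="$new_dir:$old_dir:$PATH"

which my_tool    # prints the path inside $new_dir
my_tool          # prints: version 1.03
```

Swapping the order of `$new_dir` and `$old_dir` in the `PATH` line would make the older copy the one that runs.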
...
Case 2: Install from the source code
Note on TACC compilers
There are multiple compilers available on TACC:
intel or icc - the default compiler. Preferred for optimizing the speed of compiled executables.
gcc - the GNU compiler collection. Tends to be more compatible.
Be aware that if you compile libraries and programs that link to them, you generally must compile all components with the same compiler.
If you run into an error during compilation, try the gcc
compiler by loading its module. You may get a message like this:
Code Block |
---|
login1$ module load gcc
Error: You can only have one compiler module loaded at time.
You already have intel loaded.
To correct the situation, please enter the following command:
module swap intel gcc/4.4.5
Please submit a consulting ticket if you require additional assistance.
|
So, follow the directions:
Code Block |
---|
login1$ module swap intel gcc/4.4.5
|
You will need to do this to get breseq to compile in the next example.
Example: Install breseq from a source code archive
breseq is a tool developed by the Barrick lab. You might use it in a later lesson. It is a good example of a tool that can be downloaded and compiled.
breseq web page
breseq download page
breseq uses the common GNU build system install sequence. If you install other GNU tools then the same ./configure; make; make install
command sequence will often be used.
Code Block |
---|
login1$ cdw
login1$ wget http://breseq.googlecode.com/files/breseq-0.17d19.tar.gz
login1$ tar -xvzf breseq-0.17d19.tar.gz
login1$ cd breseq-0.17d19
login1$ ./configure --prefix=$HOME/local
login1$ make
login1$ make install
|
The extra option --prefix
to ./configure
sets where the executable and any other files associated with the program will be installed. If you leave off this flag, then it will try to install them in a system-wide location. You must have administrator privileges to do this and would generally have to substitute sudo make install
for the last step to get this to work. That won't work on TACC! (sudo
means "super-user do".)
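As a rough illustration of why the prefix matters, this sketch compares write permission on your own prefix with `/usr/local`, the usual default prefix for configure scripts generated by GNU Autoconf. It only checks permissions and installs nothing.

```shell
# The personal install location used throughout this lesson.
prefix="$HOME/local"
mkdir -p "$prefix"

# You own this directory, so "make install" can write into it.
[ -w "$prefix" ] && echo "can install into $prefix"

# An ordinary user normally cannot write here, which is why the
# default prefix fails without administrator rights.
[ -w /usr/local ] || echo "cannot write to /usr/local without administrator rights"
```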
For some other tools, the instructions may tell you to skip straight to make
, or you might also have to follow other instructions or install other programs or libraries that the tool you want to use needs in order to run. Generally, you can find this information in the online documentation or an INSTALL
file in the root of the downloaded code.
More Examples
Example: Install the latest version of Bowtie2
There is a newer version of Bowtie2 available than the one loaded into a module on TACC. You might want to use it because it includes some new bug fixes. You can download either a source code version to compile using the above instructions or a binary version of bowtie2. Try to get this running on your own.
Bowtie2 consists of multiple executables. You will need to copy or move all of them into ...
Other Cases
In other lessons we'll cover various deviations and elaborations on these two procedures in order to install specific programs, R modules, Perl modules, Python modules, etc.