...
- Familiarize yourself with the way course material will be presented.
- Log into stampede2.
- Change your stampede2 profile to the course specific format.
- Refresh understanding of basic linux commands with some course organization.
- Review use of the nano text editor program, and become familiar with several other text editor programs.
...
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
cp /corral-repl/utexas/BioITeam/scriptsgva_course/GVA2021.bashrc .bashrc
cp /corral-repl/utexas/BioITeam/scriptsgva_course/GVA2021.profile .profile
chmod 700 .bashrc
chmod 700 .profile |
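Copying a new file over .bashrc replaces whatever was there before. If you have customized your profile in the past, a minimal sketch like the following keeps a timestamped backup first. The helper name install_with_backup and the demonstration file names are hypothetical and not part of the course scripts.

```bash
# Hypothetical helper: copy SRC over DEST, keeping a timestamped backup of any existing DEST.
install_with_backup() {
    src="$1"; dest="$2"
    if [ -e "$dest" ]; then
        cp -p "$dest" "${dest}.bak.$(date +%Y%m%d%H%M%S)"
    fi
    cp "$src" "$dest"
    chmod 700 "$dest"
}

# Demonstration in a throwaway directory so nothing in your real home directory is touched.
demo_dir=$(mktemp -d)
echo "old settings" > "$demo_dir/.bashrc"
echo "course settings" > "$demo_dir/course.bashrc"
install_with_backup "$demo_dir/course.bashrc" "$demo_dir/.bashrc"
ls -a "$demo_dir"
```

On stampede2 you would pass the course file as the first argument and .bashrc (or .profile) as the second.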
...
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
ssh <username>@stampede2.tacc.utexas.edu |
If everything is working correctly you should now see this as your prompt:
No Format |
---|
tacc:~$ |
Warning |
---|
If you see anything besides " |
Setting up other shortcuts:
In order to make navigating to the different file systems on stampede2 a little easier ($SCRATCH and $WORK), you can set up some shortcuts with these commands that create folders that "link" to those locations. Run these commands when logged into stampede2 with a terminal, from your home directory.
Code Block | ||
---|---|---|
| ||
cdh
ln -s $SCRATCH scratch
ln -s $WORK work
ln -s $BI BioITeam
|
Several people report seeing an error message stating "ln: failed to create symbolic link 'BioITeam/BioITeam': Permission denied."
This is being investigated, but is not expected to impact today's tutorial.
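One possible explanation for this message (an assumption; the cause is still being investigated) is that a BioITeam link already exists in your home directory. When the link name given to ln -s is an existing directory, or a link to one, ln tries to create the new link inside it, which would produce the 'BioITeam/BioITeam' path seen in the error. A minimal sketch of a guard that only creates a link when nothing with that name exists yet, using a hypothetical helper called make_link:

```bash
# Hypothetical helper: create LINK -> TARGET only if LINK does not already exist.
make_link() {
    target="$1"; link="$2"
    if [ -L "$link" ] || [ -e "$link" ]; then
        echo "$link already exists, leaving it alone"
    else
        ln -s "$target" "$link"
    fi
}

# Demonstration in throwaway directories standing in for $HOME and $SCRATCH.
fake_home=$(mktemp -d)
fake_target=$(mktemp -d)
make_link "$fake_target" "$fake_home/scratch"
make_link "$fake_target" "$fake_home/scratch"   # second call is a no-op
```

With this guard, running the setup commands a second time leaves the existing links untouched instead of attempting to write into the linked directory.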
Understanding what your .bashrc file actually does.
...
title | While interesting and useful information to have, understanding it is not critical to variant analysis. I suggest you look through this information after you complete the rest of the tutorial, in your free time, or when you need to modify your profile or bashrc files in the future. |
---|
...
Let's look at what your .bashrc file actually does. Use the cat command to print the contents of the .bashrc file to the screen.
Code Block | ||||
---|---|---|---|---|
| ||||
cat .bashrc |
This will print several lines of text to the terminal window. Let's look at what some of these lines do with a little more information:
...
lines that start with #
- Any line that begins with a # symbol is "commented out". Anything after a # symbol will not be executed by any program. Programmers commonly make use of this behavior to leave notes for others, or even themselves at a later date, as to what particular lines of a script are actually doing.
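As a small illustration of the comment behavior described above, here is a sketch that is not taken from the course .bashrc; the variable name greeting is made up for the example. It also shows one caveat: a # inside quotes is ordinary text, not the start of a comment.

```bash
# This whole line is a comment and is never executed.
greeting="hello"   # A comment can also follow a command on the same line.
# echo "this command is commented out, so it prints nothing"
echo "$greeting"
echo "sample #1"   # The # inside the quotes is printed as part of the text.
```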
...
Section 1 has multiple lines involving "module load <NAME>"
This loads different modules by default. We have included basic ones that will help with basic TACC things. After we review the use of the nano text editor we'll go into more depth with TACC modules. But for now trust us when we say that not having to load a bunch of modules every time you log into TACC is a good thing.
- In previous years the module system was used more extensively. Here we will attempt to rely more on miniconda installations for increased portability.
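If you later add your own "module load" lines to .bashrc, it is easy to end up with the same line several times. The following is a minimal sketch, assuming a hypothetical helper named add_line_once, that appends a line to a file only when an identical line is not already present. It is demonstrated on a throwaway file rather than your real .bashrc, and the module name is only an example.

```bash
# Hypothetical helper: append LINE to FILE unless an identical line is already there.
add_line_once() {
    line="$1"; file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a throwaway file rather than your real ~/.bashrc.
demo_rc=$(mktemp)
add_line_once "module load bowtie" "$demo_rc"
add_line_once "module load bowtie" "$demo_rc"   # no duplicate is added
cat "$demo_rc"
```

In practice you would likely include the module's version number in the line you add.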
Section 2 has multiple lines starting with "export"
...
It is also likely or expected that upon logging in you see the following:
No Format |
---|
The following have been reloaded with a version change:
1) impi/18.0.2 => impi/17.0.3 2) intel/18.0.2 => intel/17.0.4 3) python2/2.7.15 => python2/2.7.14 |
These messages have to do with some of the core compilers and associated tools on TACC. You could use the module spider commands detailed below to find out more about any of these modules and track down why such changes are being made, but they are not a cause for concern.
Expand | ||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||||||||||||||||||||||||||
|
...
Expand | ||||
---|---|---|---|---|
| ||||
Komodo Edit is another free, full-featured text editor with syntax coloring for many programming languages and a remote file editing interface. It has versions for both Macintosh and Windows. Download the appropriate install image here. Once installed, start Komodo Edit and follow these steps to configure it:
When you want to open an existing file at stampede2, do the following:
To create and save a new file, do the following:
|
...
Note that this may not be a complete list, as it requires the name of the program or its description to contain the word "alignment". Looking through the results you may notice some of the programs you already know and use for aligning 2 sequences to each other, such as blast and clustalw. Try broadening your results a little by searching for "align" rather than "alignment" to see how important word choice is. When you compare the two sets of results you will see that one of the new results is:
...
Tip | ||
---|---|---|
| ||
While not always strictly necessary, using the version number (in this case " While it is tempting to only use "module load name" without the version numbers, using the version numbers can help keep track of what versions were used for referencing in your future publications, and make it easier to identify what went wrong when scripts that have been working for months or years suddenly stop working (i.e., TACC changed the default version of a program you are using). |
Since the module load command doesn't give any output, it is often useful to check what modules you have loaded with either of the following commands:
Code Block |
---|
module list
module list bowtie |
The first example will list all currently loaded modules, while the second will only list loaded modules containing bowtie in the name. If you see that you have loaded the wrong version of something, a module is conflicting with another, or you just don't feel like having it turned on anymore, use the following command:
Code Block |
---|
module unload bowtie
|
You will notice when you type module list that you have several different modules loaded already. These come from both TACC defaults (TACC, linux, etc.) and several that are used so commonly, both in this class and by biologists, that it becomes cumbersome to type "module load python3" all the time; we therefore have them turned on by default by putting them in our profile to load on startup. As you advance in your own data analysis you may start to find yourself constantly loading modules as well. When you become tired of doing this (or see jobs fail to run because the modules that load on the compute nodes are based on your .bashrc file plus commands given to each node), you may want to add additional modules to your .bashrc file. This can be done using the "nano .bashrc" command from your home directory.
2. Downloading from the web directly to TACC
When files are hosted online as direct downloads, you can use the wget (Web get) command to skip your local computer and download the file directly to TACC. Typically this makes use of the "Copy Link Address" option when you right click on a link in a web browser that would otherwise start a download to your computer.
Here we will download the installation file for miniconda (which we will use in the next section and throughout the course) using both scp and wget to compare and contrast their functionality.
Using wget.
In a new browser window or tab, navigate to https://docs.conda.io/en/latest/miniconda.html, right click on "Miniconda3 Linux 64-bit" in the Linux installers section, and choose copy link address.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
cd $WORK2
mkdir src
cd src |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
wget https://repo.anaconda.com/miniconda/Miniconda3-py39_4.9.2-Linux-x86_64.sh |
You should see a download bar showing you that the file has begun downloading. When it is complete, the ls command will show you a new installer file named 'Miniconda3-py39_4.9.2-Linux-x86_64.sh'.
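If the download page lists a SHA-256 hash next to the installer, comparing it against the downloaded file is a quick way to confirm the download is complete and unmodified. The following is a minimal sketch, assuming the sha256sum program is available (it is part of GNU coreutils on most Linux systems). The helper name verify_sha256 is hypothetical, and the demonstration uses a small throwaway file rather than the real installer.

```bash
# Hypothetical helper: succeed only if FILE's SHA-256 digest equals EXPECTED.
verify_sha256() {
    file="$1"; expected="$2"
    actual=$(sha256sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}

# Demonstration on a small throwaway file. For the real installer you would pass
# the installer's file name and the hash shown next to its download link.
demo_file=$(mktemp)
printf 'hello\n' > "$demo_file"
demo_hash=$(sha256sum "$demo_file" | awk '{print $1}')
if verify_sha256 "$demo_file" "$demo_hash"; then
    echo "checksum OK"
else
    echo "checksum MISMATCH"
fi
```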
Using scp.
This is not necessary if you followed the wget commands above. Again, in a new browser window or tab you would navigate to https://docs.conda.io/en/latest/miniconda.html, but instead of right clicking on "Miniconda3 Linux 64-bit" in the Linux installers section and choosing copy link address, you would simply left click and allow the file to download directly to your browser's Downloads folder. Using information from the SCP tutorial you would then transfer the local 'Miniconda3-py39_4.9.2-Linux-x86_64.sh' file to the stampede2 remote location '$WORK2/src'.
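The transfer itself would be run from a terminal on your local computer. One detail worth noting: $WORK2 is defined on stampede2, not on your local computer, so your local shell would expand it to an empty string; the remote directory needs to be written out in full (for example, by running echo $WORK2 on stampede2 first). The sketch below only prints the command rather than running it. The helper name show_scp_command, the user name, and the remote path are all placeholders.

```bash
# Hypothetical helper: print (not run) the scp command for a given user, file, and remote directory.
show_scp_command() {
    user="$1"; file="$2"; remote_dir="$3"
    printf 'scp %s %s@stampede2.tacc.utexas.edu:%s/\n' "$file" "$user" "$remote_dir"
}

# "my_username" and "/full/path/to/work2/src" are placeholders, not real values.
show_scp_command "my_username" "Miniconda3-py39_4.9.2-Linux-x86_64.sh" "/full/path/to/work2/src"
```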
Given that the wget command doesn't involve having to use MFA or the somewhat cumbersome use of two different windows, and is subject to many fewer typos, hopefully you see how wget is preferable, provided that left clicking on a link directly downloads a file.
Finishing conda installation, and
Regardless of what method you chose to use, the following set of commands will work to install conda. For later reference, if you are planning to install miniconda on other systems or your local laptop, the 'regular installation' links on this link may be useful.
Code Block | ||||
---|---|---|---|---|
| ||||
bash Miniconda3-py39_4.9.2-Linux-x86_64.sh |
Following the installation prompts you will need to:
- hit enter to page through the license agreement
- enter 'yes' to agree to said license agreement
- hit enter to confirm the default installation location
- enter 'yes' when asked whether to initialize Miniconda3 by running conda init
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
logout
#log back in using the ssh command.
conda config --set auto_activate_base false |
...
The first time you logged back in, your prompt should have looked like this:
...
No Format |
---|
(base) tacc:~$ |
The second time you logged back in, your prompt should go back to looking like it did before you installed conda:
No Format |
---|
tacc:~$ |
If your prompt is different, please get the instructor's attention.
...
Code Block | ||||
---|---|---|---|---|
| ||||
conda create --name GVA2021
# enter 'y' to proceed
conda activate GVA2021 |
This will once again change your prompt. This time the expected prompt is:
Again, if you see something different, you need to get the instructor's attention. For the rest of the course, it is assumed that your prompt will start with (GVA2021). If it does not, remember that you need to use the conda activate GVA2021 command to enter the environment.
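Besides looking at the prompt, a script can check which environment is active. conda sets the CONDA_DEFAULT_ENV variable to the name of the currently active environment, so a minimal sketch of such a check, using a hypothetical helper named in_conda_env, could look like this:

```bash
# Hypothetical helper: succeed if the named conda environment is the active one.
# conda sets CONDA_DEFAULT_ENV to the name of the currently active environment.
in_conda_env() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if in_conda_env "GVA2021"; then
    echo "GVA2021 is active"
else
    echo "GVA2021 is not active; run: conda activate GVA2021"
fi
```

A check like this at the top of an analysis script can catch the case where the environment was never activated in the current session.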
3. Using miniconda on TACC
...
In previous years, the pip installation program was used to install a few programs. While those programs will be installed through conda this year, the link here is provided to give a detailed walk through of how to use pip on TACC resources. This is particularly helpful for making use of the '--user' flag during the installation process as you do not have the expected permissions to install things in the default directories.
...