...

This portion of the class is devoted to making sure we all begin from the same starting point on stampede2. This tutorial was developed as a combined version of multiple other tutorials which were previously given credit here. Anyone wishing to use this tutorial is welcome to do so.

...

  1. Familiarize yourself with the way course material will be presented.
  2. Log into stampede2.
  3. Change your stampede2 profile to the course specific format.
  4. Refresh understanding of basic linux commands with some course organization.
  5. Review use of the nano text editor program, and become familiar with several other text editor programs.

...

Code Block
languagebash
titleCopy the course provided .profile file and change its name and permissions
collapsetrue
cp /corral-repl/utexas/BioITeam/gva_course/GVA2021.bashrc .bashrc
cp /corral-repl/utexas/BioITeam/gva_course/GVA2021.profile .profile
chmod 700 .bashrc
chmod 700 .profile

...

Code Block
languagebash
titleLog back in to stampede2
collapsetrue
ssh <username>@stampede2.tacc.utexas.edu

If everything is working correctly you should now see this as your prompt:  

No Format
tacc:~$

It is also likely or expected that upon logging in you see the following:

No Format
The following have been reloaded with a version change:
  1) impi/18.0.2 => impi/17.0.3     2) intel/18.0.2 => intel/17.0.4     3) python2/2.7.15 => python2/2.7.14

These messages concern some of the core compilers and associated tools on TACC. You could use the module spider commands detailed below to find out more about any of these modules and track down why such changes are being made, but they are not a cause for concern.


Warning

If you see anything besides "tacc:~$" as your prompt, get my attention and be ready to share your screen rather than continuing forward.



  • Setting up other shortcuts:

In order to make navigating to the different file systems on stampede2 a little easier ($SCRATCH and $WORK), you can set up some shortcuts with these commands that create folders that "link" to those locations. Run these commands when logged into stampede2 with a terminal, from your home directory.

Code Block
titleCreating shortcuts to the main stampede2 working directories
cdh
ln -s $SCRATCH scratch
ln -s $WORK work
ln -s $BI BioITeam

Several people report seeing an error message stating "ln: failed to create symbolic link 'BioITeam/BioITeam': Permission denied." This is being investigated, but is not expected to impact today's tutorial.
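The cause of that error is not confirmed here. One plausible explanation (an assumption) is that a BioITeam link already exists in your home directory, so running ln -s a second time follows the existing link and tries to create another link inside the shared directory, where you cannot write. The sketch below reproduces the idea in a throwaway sandbox. The names sandbox, target_dir and shortcut are made up for illustration and are not part of the course setup.

```bash
# Sandbox version of the shortcut commands; directory and link names are illustrative.
sandbox=$(mktemp -d)
mkdir "$sandbox/target_dir"
cd "$sandbox"

# Create the link only if nothing named 'shortcut' exists yet.
[ -e shortcut ] || ln -s "$sandbox/target_dir" shortcut
# Repeating the guarded command is a harmless no-op instead of nesting a second link.
[ -e shortcut ] || ln -s "$sandbox/target_dir" shortcut

# readlink reports where a symbolic link points.
link_target=$(readlink shortcut)
echo "shortcut -> $link_target"

# Count the entries inside the target; 0 means no nested link was created.
nested_count=$(ls -A "$sandbox/target_dir" | wc -l)
echo "entries inside target: $nested_count"
cd - > /dev/null
```

On stampede2 the same guard would read `[ -e BioITeam ] || ln -s $BI BioITeam`, and `readlink scratch work BioITeam` would show where each shortcut points.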

  • Understanding what your .bashrc file actually does.

Expand
titleWhile this is interesting and useful information to have, understanding it is not critical to variant analysis. I suggest you look through this information after you complete the rest of the tutorial, in your free time, or when you need to modify your .profile or .bashrc files in the future.
Info

Let's look at what your .bashrc file actually does. Use the cat command to print the contents of the .bashrc file to the screen.

Code Block
languagebash
titlePrint the contents of the .bashrc file to the screen
cat .bashrc

This will print several lines of text to the terminal window. Let's look at what some of these lines do with a little more information:

  • lines that start with #

    • Any line that begins with a # symbol is "commented out". Anything after a # symbol will not be executed by any program. Programmers commonly make use of this behavior to leave notes for others, or even for themselves at a later date, about what particular lines of a script are actually doing.
  • Section 1 has multiple lines involving "module load <NAME>"

    • This loads different modules by default. We have included basic ones that will help with basic TACC things. After we review the use of the nano text editor we'll go into more depth with TACC modules. But for now trust us when we say that not having to load a bunch of modules every time you log into TACC is a good thing.

    • In previous years the module system was used more extensively. Here we will attempt to rely more on miniconda installations for increased portability.
  • Section 2 has multiple lines starting with "export"

    • The export lines define shell variables, for example BI and PATH. You've already seen how using $BI can come in handy for accessing our shared course directory. As for PATH, that is a well-known environment variable that defines the set of directories where the shell will look when you type in a program's name. Our shared profile adds the common course directories that we copied at the start of this tutorial, and your local ~/local/bin directory (which does not exist yet), to the location list. You can see the entire list of locations by doing this:

      Code Block
      languagebash
      titleHow to see where the bash shell looks for programs
      echo $PATH

      As you can see, there are a lot of locations on the path. That's because when you load modules at TACC (see above), that mechanism makes the programs available to you by putting their installation directories on your $PATH.

  • umask 002

    • The umask command sets the default permissions of newly created files and directories, limiting the need to use the chmod command. umask functions as the inverse of chmod: the bits it names are removed from the default permissions (777 for directories, 666 for files). In this case the command umask 002 is the equivalent of the command chmod 775 for directories, and chmod 664 for files. In summary, having this command in your .bashrc gives both you and your group read and write access to all new files you create, while giving read-only access to everyone else.
  • PS1='tacc:\w$ '

    • The PS1='tacc:\w$ ' line is a special setting that tells the shell to display the current directory as part of its prompt. It saves you typing pwd all the time to see where you are in the directory hierarchy. Try using the mkdir command to make a new directory called tmp and change into that directory to see what it does to your prompt.

      Code Block
      languagebash
      titleSee how your prompt reflects your current directory
      collapsetrue
      mkdir tmp
      cd tmp
    • Your prompt should have changed from "tacc:~$" to "tacc:~/tmp$". Your prompt now tells you that you are in the tmp subdirectory of your home directory (~). See if you can figure out how to return to your home directory without expanding the code block. Expand the following code block to see the different ways of returning to your home directory.

      Code Block
      languagebash
      titleHow to return to your home directory
      collapsetrue
      cd
      cdh
      cd $HOME
      cd ~
      cd -

      The last example in the above code block will return you to your previous directory. In this case, that means the home directory, but it can be very useful in other situations when you change directories to do something in 1 place then need to hop back to where you were, or if you mistakenly leave a directory.
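To make the umask 002 description above concrete, here is a minimal sketch that creates a file and a directory in a throwaway location and prints their permissions. The names sandbox, example_file and example_dir are illustrative only, and stat -c is assumed to be the GNU version found on Linux systems.

```bash
# Sandbox demonstration of umask 002; the file and directory names are illustrative.
old_umask=$(umask)            # remember the current setting so it can be restored
sandbox=$(mktemp -d)
umask 002
touch "$sandbox/example_file"
mkdir "$sandbox/example_dir"

# stat -c '%a' (GNU coreutils) prints permissions in octal form.
file_mode=$(stat -c '%a' "$sandbox/example_file")
dir_mode=$(stat -c '%a' "$sandbox/example_dir")
echo "new file: $file_mode   new directory: $dir_mode"

umask "$old_umask"            # put the original setting back
rm -r "$sandbox"
```

With umask 002 in effect the two values should be 664 and 775, matching the chmod equivalents described above.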

...

Expand
Komodo Edit for Mac and Windows

Komodo Edit is another free, full-featured text editor with syntax coloring for many programming languages and a remote file editing interface. It has versions for both Macintosh and Windows. Download the appropriate install image here.

Once installed, start Komodo Edit and follow these steps to configure it:

  • Configure the default line separator for Unix
    • On the Edit menu select Preferences
    • Select the New Files Category
    • For Specify the end-of-line (EOL) indicator for newly created files select UNIX (\n)
    • Select OK
  • Configure a connection to TACC
    • On the Edit menu select Preferences
    • Select the Servers Category
    • For Server type select SFTP
    • Give this profile the Name of stampede2
    • For Hostname enter stampede2.tacc.utexas.edu
    • Enter your TACC user ID for Username
    • Leave Port and Default path blank
    • Select OK

When you want to open an existing file at stampede2, do the following:

  • Select the File menu -> Open -> Remote File
    • Select your stampede2 profile from the top Server drop-down menu
    • Once you log in, it should show you all the files and directories in your stampede2 $HOME directory
  • Navigate to the file you want and open it
    • Often you will use the work or scratch directory links to help you here

To create and save a new file, do the following:

  • From the Komodo Edit Start Page, select New File
    • Select the file type (Text is good for commands files)
  • Edit the contents
  • Select the File menu -> Save As Other -> Remote File
    • Select your stampede2 profile from the Server drop-down menu
    • Once you log in, it should show you all the files and directories in your stampede2 $HOME directory
  • Navigate to where you want to put the file and save it
    • Often you will use the work or scratch directory links to help you here

...

So you may be asking yourself what the point of using stampede2 is at all if it is fraught with so many issues. The answer comes in the form of compute nodes. There are nearly 6,000 compute nodes with different configurations, each of which can only be accessed by a single person for a specified amount of time. For the duration of the class, each student will interact with a single compute node using an interactive DEVelopment (iDEV) session, so that you get the immediate feedback of seeing commands run and know when to enter the next command. This is not the typical way you will analyze your own data. Friday's tutorial will deal with the queue system.

While stampede2 is tremendously powerful and will greatly speed up your analysis, it doesn't have much in the way of a GUI (graphical user interface). The lack of a GUI means it can't display graphs or other meaningful representations of our data that we are used to seeing. In order to do these types of things, we have to get our data off of stampede2 and onto our own computers. This course uses the scp ("secure copy") command exclusively to move files back to your local computer. As mentioned, there are other programs that can be configured to more easily transfer files back and forth as you progress in your analysis.

...

If (or when) you looked at what our edits to the .bashrc file did, you would have seen that section 1 has a series of "module load XXXX" commands, and a promise to talk more about them later. I'm sure you will be thrilled to learn that now is that time... As a "classically trained wet-lab biologist", one of the most difficult things I have experienced in computational analysis has been installing new programs to improve my analysis. Programs and their installation instructions tend (or appear) to be written by computational biologists in what at times feels like a foreign language, particularly when things start going wrong. Here we will discuss 3 ways of accessing new commands/programs/scripts and explain their benefits. This is an incomplete list of ways to install new programs, but it is meant to be a good working example that you can adapt to install other programs in your future work.

...

Note that this may not be an inclusive list as it requires the name of the program, or its description to contain the word "alignment". Looking through the results you may notice some of the programs you already know and use for aligning 2 sequences to each other such as blast and clustalw. Try broadening your results a little by searching for "align" rather than "alignment" to see how important word choice is. When you compare the two sets of results you will see that one of the new results is:

...

Here we will download the installation file for miniconda (which we will use in the next section and throughout the course) using both scp and wget to compare and contrast their functionality. 

Using wget.

In a new browser or tab navigate to https://docs.conda.io/en/latest/miniconda.html and right click on the "Miniconda3 Linux 64-bit" in the linux installers section and choose copy link address.

Code Block
languagebash
titleUsing the mkdir command to create a folder named 'src' inside of your $WORK2 directory
collapsetrue
cd $WORK2
mkdir src
cd src
Code Block
languagebash
titleUse the wget command to download the linux installer directly to your current directory
collapsetrue
wget https://repo.anaconda.com/miniconda/Miniconda3-py39_4.9.2-Linux-x86_64.sh

You should see a download bar showing you that the file has begun downloading. When it is complete, the ls command will show you a new file named 'Miniconda3-py39_4.9.2-Linux-x86_64.sh'.
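If you would rather not rely on reading the ls output by eye, a small check like the following can confirm that a download left a non-empty file behind. This is only a sketch: check_download is a hypothetical helper name introduced here, not something provided by the course or by wget.

```bash
# Hypothetical helper: report whether a downloaded file exists and is non-empty.
check_download() {
    local file="$1"
    if [ -s "$file" ]; then
        # wc -c counts the bytes in the file.
        echo "OK: $file ($(wc -c < "$file") bytes)"
    else
        echo "MISSING or EMPTY: $file" >&2
        return 1
    fi
}
```

From inside the src directory you could then run `check_download Miniconda3-py39_4.9.2-Linux-x86_64.sh`. For a stronger check, `sha256sum Miniconda3-py39_4.9.2-Linux-x86_64.sh` prints a hash that can be compared by eye against the one listed on the download page.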

Using scp.

This is not necessary if you followed the wget commands above. Again, in a new browser or tab you would navigate to https://docs.conda.io/en/latest/miniconda.html, but instead of right clicking on the "Miniconda3 Linux 64-bit" in the linux installers section and choosing copy link address, you would simply left click and allow the file to download directly to your browser's Downloads folder. Using information from the SCP tutorial, you would then transfer the local 'Miniconda3-py39_4.9.2-Linux-x86_64.sh' file to the stampede2 remote location '$WORK2/src'.

Given that the wget command doesn't involve MFA or the somewhat cumbersome use of 2 different windows, and is subject to many fewer typos, hopefully you see how wget is preferable, provided that left clicking on a link directly downloads a file.

Finishing conda installation, and 

Regardless of which method you chose, the following set of commands will work to install conda. For later reference, if you are planning to install miniconda on other systems or your local laptop, the 'regular installation' links on this page may be useful.


Code Block
languagebash
titleThe following command is then used to install miniconda
bash Miniconda3-py39_4.9.2-Linux-x86_64.sh

Following the installation prompts you will need to:

  1. hit enter to page through the license agreement
  2. enter 'yes' to agree to said license agreement
  3. hit enter to confirm the default installation location
  4. enter 'yes' when asked whether to initialize Miniconda3 by running conda init


Code Block
languagebash
titleAs per the directions at the end of the installation process, log out, log back in, and disable automatic activation of the conda base environment
logout
#log back in using the ssh command. 
conda config --set auto_activate_base false

For help with the ssh command please refer back to the Windows10 or MacOS tutorials. If you log out and back in one more time, what do you notice is different?

The first time you logged back in, your prompt should have looked like this:

No Format
(base) tacc:~$


The second time you logged back in, your prompt should go back to looking like it did before you installed conda:

No Format
tacc:~$


If your prompt is different, please get the instructor's attention.

Setting up your first environment

Now that you have installed conda, we want to get started with our first environment. More information about environments and their purpose can be found here, but for now we will just think about them as different sets of programs and relevant dependencies being installed together. 

Code Block
languagebash
titleusing the conda create command, make a new environment named "GVA2021", and activate it
conda create --name GVA2021
# enter 'y' to proceed
conda activate GVA2021

This will once again change your prompt. This time the expected prompt is:

No Format
(GVA2021) tacc:~$

Again, if you see something different, you need to get the instructor's attention. For the rest of the course, it is assumed that your prompt will start with (GVA2021). If it does not, remember that you need to use the conda activate GVA2021 command to enter the environment.
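Besides looking at the prompt, you can ask the shell which environment is active. conda activate sets the CONDA_DEFAULT_ENV variable to the name of the active environment, so a check along these lines is possible. This is a sketch only, and require_env is a hypothetical helper name, not part of conda.

```bash
# Hypothetical helper: succeed only when the named conda environment is active.
require_env() {
    local wanted="$1"
    # CONDA_DEFAULT_ENV is empty or unset when no environment is active.
    if [ "${CONDA_DEFAULT_ENV:-}" = "$wanted" ]; then
        echo "environment '$wanted' is active"
    else
        echo "environment '$wanted' is NOT active; run: conda activate $wanted" >&2
        return 1
    fi
}
```

Running `require_env GVA2021` at the start of a work session gives a quick reminder when the environment has not been activated yet.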

3. Using miniconda on TACC

The anaconda and miniconda interfaces to the conda system are becoming increasingly popular for controlling one's environment, streamlining new program installation, and tracking what versions of programs are being used. A comparison of the two different interfaces can be found here. The deciding factor on which interface we will use is hinted at, but not explicitly stated, in the referenced comparison: TACC does not have a GUI and therefore anaconda will not work, which is why we installed miniconda above.

Similar to the module system that TACC uses, the "conda" system allows for simple commands to download required programs/packages and modify environment variables (like $PATH discussed above). Two huge advantages of conda over the module system are: #1 instead of relying on the employees at TACC to take a program and package it for use in the module system, anyone (including the same authors publishing a new tool they want the community to use) can create a conda package for a program; #2 rather than being restricted to use on the TACC clusters, conda works on all platforms (including Windows and macOS), and deals with all the required dependency programs in the background for you.

Info
titleConda environments in the instructors work

In my own work, I recently remarked to my PI that "I wish I had started using this 5 years ago", and was reminded that "it didn't exist 5 years ago, at least in its current super usable and popular format". It is entirely possible that future classes will be taught with only minimal references to the TACC module system, and this year's course will feature far fewer than any previous year.

You may be thinking that, since the conda system can work on your personal computer, you could just work on your personal computer for the duration of this class and ignore all the ssh commands and working remotely. This is strongly not advised. While you would be able to use the same programs in both instances (in most cases), the tutorials are developed with the speed of the stampede2 system in mind and, with some exceptions, attempt to limit "waiting for something to finish" to the time it takes to read through the next block of text in the tutorial. If you were to do these tutorials on your personal computer, the run times would increase significantly and it would be difficult to keep up with the rest of the class.

In Friday's lecture I will explain why installing and using conda on your local computer is still a good idea and how I am currently using it in conjunction with TACC.

In the next tutorial we will start assessing the quality of some NGS reads using the fastqc program. Before we can use it, we must install it. Similar to the module system described above, to install a program via conda, we need 3 things:

  1. Tell bash we want to use the conda program.
  2. Tell conda we want to install a new program.
  3. Name the program we want to install.


Code Block
languagebash
titleattempt to install the fastqc program using conda
conda activate GVA2021 
conda install fastqc

If you have already activated your GVA2021 environment, the first line will not do anything. If you have not, you will see that your prompt has changed to say (GVA2021) on the far left of the line. As for the second command, like we saw with the module system above, things aren't quite this simple. In this particular case, we get a very helpful error message that can guide our next steps:

No Format
PackagesNotFoundError: The following packages are not available from current channels:

  - fastqc

Current channels:

  - https://repo.anaconda.com/pkgs/main/linux-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/linux-64
  - https://repo.anaconda.com/pkgs/r/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

More information about "channels" can be found here. By the end of this course you may find that the 'bioconda' channel is full of programs you want to use, and you may choose to permanently add it to your list of channels so that the above command conda install fastqc, and others used in this course, would work without the intermediate step of searching for the specific installation commands or finding what channel the program you want is in. Information about how to do this, as well as more detailed information about why it is bad practice to go around adding large numbers of channels, can be found here.

For now, use the error message you saw above to try to install the fastqc program yourself.

Expand
titleIf you are having trouble finding the fastqc page on anaconda, the answer is here, as well as a description of the most likely problem you encountered.

https://anaconda.org/bioconda/fastqc

If you were unable to find this page, the most likely error is that you entered fastqc into the search box, recognized that the result with 360,000+ downloads was likely the program you wanted, and clicked the first bit of hyperlink you found, which took you to the bioconda page instead of to the fastqc program. Personally, I think the entire box should be clickable to send you to the program page, but nobody has asked me.

Expand
titleClick here if you are unsure what command to use to install fastqc, or want to check your understanding
Code Block
languagebash
conda install -c bioconda fastqc

While there are two other possible commands listed, I tend to always start with the simplest command and work my way from there. The other two commands deal with accessing specific labels/versions of the program.

If all goes well, the installation command should give you the following output with you answering "y" when prompted if you actually want to install the packages:

No Format
The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    fastqc-0.11.9              |       hdfd78af_1         9.7 MB  bioconda
    font-ttf-dejavu-sans-mono-2.37|       h6964260_0         335 KB
    ------------------------------------------------------------
                                           Total:        10.0 MB

The following NEW packages will be INSTALLED:

  fastqc             bioconda/noarch::fastqc-0.11.9-hdfd78af_1
  font-ttf-dejavu-s~ pkgs/main/noarch::font-ttf-dejavu-sans-mono-2.37-h6964260_0
  openjdk            pkgs/main/linux-64::openjdk-8.0.152-h7b6447c_3


Proceed ([y]/n)? y


Downloading and Extracting Packages
fastqc-0.11.9        | 9.7 MB    | ####################################################################################################################################################################################### | 100% 
font-ttf-dejavu-sans | 335 KB    | ####################################################################################################################################################################################### | 100% 

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Github

This is about using the git clone command. Git is a command often used for collaborative program development or sharing of files. Some groups also put the programs or scripts associated with a particular paper on a github project and publish the link in their paper or on their lab website. 

Here we will clone the github repository for breseq which is developed by the Barrick lab here at UT and is used to comprehensively analyze haploid microbial genomes to identify all variants present. In some of the initial tutorials everyone will use a version of breseq that is available through the BioITeam, in the optional tutorials you may compile your own copy of breseq from this github project to underscore why binary files are typically preferred, or as a way of easily staying up to date on new developments with the program itself.

...

There are three commonly used methods to verify you have a given program installed. You should try all three in order for the fastqc program:

  1. Code Block
    languagebash
    title

...

cd $WORK
mkdir src
cd src

If you already have a src directory, you'll get a very benign error message stating that the folder already exists and thus can not be created. 

...

  1. The 'which' command can be used to search your $PATH variable for a command with a specific name, and return the location the command is stored in
    which fastqc
  2. Code Block
    languagebash
    title

...

git clone https://github.com/barricklab/breseq.git

You will see several download indicators increase to 100%, and when you get your command prompt back the ls command will show a new folder named 'breseq' containing a set of files. If you don't see said directory, or can't cd into that directory let the instructor know.

As with Trimmomatic, these files will require additional work that is specific to the program and is therefore beyond the scope of this tutorial. A link to the advanced tutorials for getting your own copy of breseq up and running will be added later in the week.

pip

This is about using the pip3 install command. pip is the standard package manager for the common programming language python. When labs put together new analysis programs/packages, increasingly they try to make these programs available for others to use via pip. The name pip3, rather than just pip, refers to the specific version of python (python 3).

Here we will install the multiqc analysis program which compiles reports from a program called fastqc about the quality of fastq files from multiple different samples at one time. In the later portion of the class you may choose to work with this program to get a better overall view of multiple fastq files all at once rather than clicking through individual reports.

Code Block
languagebash
titlePreferred simple installation
pip3 install --user multiqc

*note that the "--user" option in the above code is required while working on LS5 because individual users do not have access to core systems. If you have python3 on your personal computer and wanted to install multiqc (or any other package available through pip) you would typically omit the "--user"  flag.

...

  1. Many commands accept an option of '--version' to simply access the program and return what version of the program is installed
    fastqc --version
  2. Code Block
    languagebash
    titleNearly all commands/programs accept "-h" or "--help" options to give you basic information about how the command or program works
    fastqc --help

Throughout the course, you will routinely use the above 3 commands to make sure that you have access to a given program, that it is the correct version, and to get an idea of how to construct commands to perform a given analysis step. For now, be satisfied that you have correctly installed fastqc as long as your output is not one of the following. In the next tutorial we will actually use fastqc. Examples of output you do not want to see from the above commands:

  1. /usr/bin/which: no fastqc in (<large list of directories specific to your TACC account>)

  2. -bash: fastqc: command not found

  3. -bash: fastqc: command not found
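The first of these checks can also be wrapped in a small function so that a missing program produces one clear message instead of the errors listed above. The following is a sketch, with check_program a hypothetical helper name. It uses the shell builtin command -v, which, like which, looks a name up on your $PATH.

```bash
# Hypothetical helper: report where a program lives on $PATH, or fail clearly.
check_program() {
    local name="$1"
    local location
    # command -v prints the program's location and fails if it cannot be found.
    if location=$(command -v "$name"); then
        echo "$name found at: $location"
    else
        echo "$name not found on PATH" >&2
        return 1
    fi
}
```

For example, `check_program fastqc` should print a location inside your miniconda installation once the GVA2021 environment is active and fastqc has been installed.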

Github – an additional common method of getting files onto TACC

This is about using the git clone command. Git is a command often used for collaborative program development or sharing of files. Some groups also put the programs or scripts associated with a particular paper on a github project and publish the link in their paper or on their lab website. Github repositories are a great thing to add to a single location in your $WORK2 directory.

Here we will clone the github repository for the E. coli Long-Term Evolution Experiment (LTEE) originally started by Dr. Richard Lenski. These files will be used in some of the later tutorials, and are a good source of data for identifying variants in NGS data, as the variants are well documented and emerge in a controlled manner over the course of the evolution experiment. Initially cloning a github repository is exceptionally similar to using the wget command to download the repository: it involves typing 'git clone' followed by a web address where the repository is stored. As we did when installing miniconda with wget, we'll clone the repository into a 'src' directory inside of $WORK2.

Code Block
languagebash
titleVerify that multiqc was successfully installed
which multiqc
multiqc

...

Using the mkdir command to create a folder named 'src' inside of your $WORK2 directory
collapsetrue
cd $WORK2
mkdir src
cd src

If you already have a src directory, you'll get a very benign error message stating that the folder already exists and thus can not be created. 
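If you would rather not see that message at all, mkdir accepts a -p option that creates the directory only when it is missing and otherwise quietly succeeds. A minimal sketch follows, using a throwaway sandbox directory in place of $WORK2 (the sandbox name is illustrative).

```bash
# Sandbox stand-in for $WORK2; the location is temporary and illustrative.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src"       # first call creates src
mkdir -p "$sandbox/src"       # src already exists: no error message is printed
second_status=$?              # exit status of the second mkdir call
echo "exit status of second mkdir -p: $second_status"
cd "$sandbox/src"
pwd
```

On stampede2 the equivalent would be `mkdir -p $WORK2/src` followed by `cd $WORK2/src`.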

In a web browser navigate to github and search for 'LTEE-Ecoli' in the top right corner of the page. The only result will be for barricklab/LTEE-Ecoli; click the green box for 'clone or download' and either press control/command + C on the address listed, or click the clipboard icon to copy the repository address. This image may be helpful if you are having trouble locating the green box.

Code Block
languagebash
titleOnce you have copied the address and are in the $WORK2/src directory, clone the repository with 'git clone'
collapsetrue
git clone https://github.com/barricklab/LTEE-Ecoli.git

You will see several download indicators increase to 100%, and when you get your command prompt back the ls command will show a new folder named 'LTEE-Ecoli' containing a set of files. If you don't see said directory, or can't cd into that directory, let the instructor know. The multiqc tutorial can be found here.
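Beyond checking with ls, git itself can confirm that a folder is a clone and report where it came from. The sketch below runs in a throwaway sandbox, with an empty local repository standing in for the GitHub address so that it does not depend on network access. The names sandbox, origin.git and copy are illustrative.

```bash
# Sandbox sketch: an empty local repository stands in for the GitHub address.
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/origin.git"
# Cloning an empty repository only produces a warning, which is silenced here.
git clone --quiet "$sandbox/origin.git" "$sandbox/copy" 2> /dev/null

# A successful clone leaves a .git directory and remembers where it came from.
[ -d "$sandbox/copy/.git" ] && echo "clone directory looks like a git repository"
origin_url=$(git -C "$sandbox/copy" remote get-url origin)
echo "cloned from: $origin_url"
```

For the course repository, `git -C $WORK2/src/LTEE-Ecoli remote get-url origin` should print the address you copied from github.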

pip

In previous years, the pip installation program was used to install a few programs. While those programs will be installed through conda this year, the link here is provided to give a detailed walk through of how to use pip on TACC resources. This is particularly helpful for making use of the '--user' flag during the installation process as you do not have the expected permissions to install things in the default directories.

This concludes the linux and stampede2 refresher/introduction tutorial.

Genome Variant Analysis Course 2021 home.