...

If you run the which command above outside of an idev session, you should see 2 results. If you run it from inside an idev node, you get 1 result. On the head node (outside idev), one result points to the BioITeam, near where you keep finding your data (/corral-repl/utexas/BioITeam/). The "bin" folder specifically is full of binaries or (typically small) bash/python/perl/R scripts that someone has written to help the TACC community. The other result is in a folder specifically associated with the bioperl module.
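To make the idea of "2 results" concrete, here is a minimal sketch of what a lookup like which -a does: it walks the directories in $PATH in order and reports every executable with the requested name. The function name list_all_on_path is hypothetical and only for illustration; on the command line you would normally just use which -a.

```shell
# Print every executable called "$1" found on $PATH, in search order.
# The first line printed is the copy the shell would actually run.
# (list_all_on_path is a made-up helper name, not a TACC or BioITeam tool.)
list_all_on_path() (
    name=$1
    IFS=:          # split $PATH on colons (subshell keeps this change local)
    set -f         # do not expand wildcards in directory names
    for dir in $PATH; do
        [ -z "$dir" ] && dir=.   # an empty PATH entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
)

list_all_on_path bp_seqconvert.pl
```

If the script is not installed anywhere on your $PATH, the function simply prints nothing.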

If you try to run the BioITeam version of the script from the head node, you get the following error message:

...


Info
Why do you get 2 different results depending on whether you are inside or outside of an idev node?

This has to do with how compute nodes are configured. On stampede2, /corral-repl/ and all of its subdirectories are not accessible from compute nodes, so even though the BioITeam is in your $PATH, the command line on a compute node can't access it. This is why in later tutorials you have to log out of the idev session to copy new raw data files to work with.
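One way to see this for yourself is to test whether the directory is reachable from wherever you are logged in. This is only a sketch: check_accessible is a hypothetical helper name, and the path is the BioITeam bin folder mentioned in this tutorial.

```shell
# Report whether a directory exists and is readable from the current node.
# (check_accessible is a made-up helper name for illustration.)
check_accessible() {
    if [ -d "$1" ] && [ -r "$1" ]; then
        echo "accessible: $1"
    else
        echo "not accessible: $1"
    fi
}

check_accessible /corral-repl/utexas/BioITeam/bin
```

Based on the explanation above, you would expect "accessible" on the head node and "not accessible" inside an idev session.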

If you try to run the BioITeam version of the script from the head node (/corral-repl/utexas/BioITeam/bin/bp_seqconvert.pl), you get the following error message:

No Format
Can't locate Bio/SeqIO.pm in @INC (@INC contains: /corral-repl/utexas/BioITeam//local/share/perl5 /corral-repl/utexas/BioITeam//perl5/lib/perl5/x86_64-linux-thread-multi /corral-repl/utexas/BioITeam//perl5/lib/perl5 /corral-repl/utexas/BioITeam//perl5/lib64/perl5/auto /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at /corral-repl/utexas/BioITeam/bin/bp_seqconvert.pl line 8.
BEGIN failed--compilation aborted at /corral-repl/utexas/BioITeam/bin/bp_seqconvert.pl line 8.

...

After loading the bioperl module to get past the error message, run the BioITeam version of the script without any arguments to get the help message:

Code Block
module load bioperl
/corral-repl/utexas/BioITeam/bin/bp_seqconvert.pl
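The "Can't locate Bio/SeqIO.pm in @INC" error means Perl could not find the Bio::SeqIO module. If you want to check for that before running the script, a minimal sketch is to ask Perl to load the module and do nothing else. The helper name perl_module_available is hypothetical; the perl -M and -e options are standard Perl command-line switches.

```shell
# Succeeds (exit status 0) only if Perl can load the named module.
# (perl_module_available is a made-up helper name for illustration.)
perl_module_available() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

if perl_module_available Bio::SeqIO; then
    echo "Bio::SeqIO found"
else
    echo "Bio::SeqIO not found; try: module load bioperl"
fi
```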

If you find yourself needing to do lots of sequence conversions, and find the bp_seqconvert.pl script useful for them, you may want to add a 'module load bioperl/1.007002' line to your .bashrc file. Recall that because the bp_seqconvert.pl script exists as 2 different copies in 2 different locations, only the first one found in the PATH variable will be used. Using the which -a command, you can see that the copy used will be the module version unless you specifically invoke the BioITeam version. In this case it does not matter, as the scripts are the same.
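If you do decide to add the line to your .bashrc, one cautious approach is to append it only when it is not already there, so that repeating the step never creates duplicates. The sketch below assumes a hypothetical helper named add_line_once; it is not part of any TACC tooling.

```shell
# Append a line to a file only if that exact line is not already present.
# (add_line_once is a made-up helper name for illustration.)
add_line_once() {
    file=$1
    line=$2
    touch "$file"    # make sure the file exists before searching it
    # -q quiet, -x match the whole line, -F treat the text literally
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

You could then run: add_line_once ~/.bashrc 'module load bioperl/1.007002'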

Convert a gbk reference to an embl reference

...

Warning
IMPORTANT

This command can take a while (~5 minutes) and is extremely taxing. This is longer than we want to run a job on the head node (especially when all of us are doing it at once). In fact, in previous years, TACC has noticed the spike in usage when multiple students forgot to make sure they were on idev nodes and complained pretty forcefully to us about it. Let's not have this be one of those years. Use the hostname or showq -u command to make sure you are on an idev node.

Command | on idev node | on head node
hostname | lists a compute node starting with a C followed by a number before "stampede2.tacc.utexas.edu" | lists a login node plus number before "stampede2.tacc.utexas.edu"
showq -u | -bash: showq: command not found | shows you a summary of jobs you have (very likely empty during these tutorials)

If you are not sure if you are on an idev node or are seeing other output with one or both commands, speak up on zoom and I'll show(q) -u what you are looking for. Yes, your instructor likes bad puns. My apologies.

If you are not on an idev node, and need help to relaunch it, click over to the idev tutorial.
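If you would rather have the shell interpret the hostname for you, the check can be sketched roughly as below. The exact patterns are assumptions based only on the description in this section (compute nodes start with a "c" followed by a number, login nodes start with "login" plus a number); classify_node is a hypothetical helper name.

```shell
# Classify a hostname as a compute (idev) node or a login (head) node.
# The patterns are assumptions taken from the descriptions in this tutorial.
# (classify_node is a made-up helper name for illustration.)
classify_node() {
    case "$1" in
        [cC][0-9]*.stampede2.tacc.utexas.edu)  echo "compute" ;;
        login[0-9]*.stampede2.tacc.utexas.edu) echo "login" ;;
        *)                                     echo "unknown" ;;
    esac
}

classify_node "$(hostname)"
```

A result of "unknown" means the hostname did not match either assumed pattern, in which case it is safer to ask than to guess.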

...

  • In the bowtie2 example, we mapped in --local mode. Try mapping in --end-to-end mode (aka global mode).

  • Do the BWA tutorial so you can compare their outputs (note BWA has a conda package making it even easier to try).
    • Did bowtie2 or BWA map more reads?
    • In our examples, we mapped in paired-end mode. Try to figure out how to map the reads in single-end mode and create this output.
    • Which aligner took less time to run? Are there any options you can change that:
      • Lead to a larger percentage of the reads being mapped? (increase sensitivity)
      • Speed up run time without causing many fewer reads to be mapped? (increase performance)

...