...

We have previously covered using scp to transfer files, but here we present another detailed example.

To use scp, run it in a terminal on your desktop, not on the remote TACC system. Finding the files on the remote system can be tricky because your desktop does not know what $HOME, $WORK, and $SCRATCH mean (they are only defined on TACC).

To find the full path to your files, run the pwd command on TACC in the terminal window where you ran breseq (the current directory should contain an "output" folder). Rather than copying the entire folder, which can be rather large, we add a twist: compress the folder into a single archive with the tar command so that it is smaller and transfers faster:

Code Block
languagebash
titleCommand to type in TACC
tar -czvf output.tar.gz output  # the czvf options in order mean Create, gZip, Verbose, File (the archive name follows -f)
pwd

You can then copy and paste that information (in the correct position) into the scp command on your desktop's command line:

Code Block
languagebash
titleCommand to type in the desktop's terminal window
scp <username>@lonestar.tacc.utexas.edu:<the_directory_returned_by_pwd>/output.tar.gz .  # -r is not needed for a single file
tar -xvzf output.tar.gz  # the new "x" option at the front means eXtract
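
One way to avoid retyping the path is to let the TACC shell print the scp command for you. The sketch below assumes that your TACC login name is available in $USER and that output.tar.gz sits in the current directory; both are assumptions, so check the printed line before pasting it into your desktop terminal.

```shell
# Run on TACC, in the directory that holds output.tar.gz.
# Assumption: $USER matches your TACC login name; otherwise a placeholder is printed.
remote_user="${USER:-<username>}"
remote_host="lonestar.tacc.utexas.edu"   # host used in this lesson
archive="output.tar.gz"

# $(pwd) expands here, on TACC, so the printed path is absolute and
# needs no $HOME/$WORK/$SCRATCH lookup on the desktop.
scp_command="scp ${remote_user}@${remote_host}:$(pwd)/${archive} ."
echo "$scp_command"
```

The printed line is only text; nothing is transferred until you paste it into the desktop terminal.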

...

 

Examining breseq results

As before, copy the data back to your computer and examine the HTML output in a web browser.

Exercise: Can you figure out how to archive all of the output directories and copy only those files (and not all of the very large intermediate files) back to your machine, without deleting any files?

Code Block
titleClick here for a hint without the answer

Use the tar command again, but this time with a wildcard that matches each run directory, and include only the output directory inside each match.

Code Block
languagebash
titleclick here to check your solution, or get the answer
collapsetrue
tar -cvzf output.tgz output_*/output
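
To see what the wildcard picks up without touching real results, the pattern can be tried in a throwaway directory first. This is a minimal sketch: the run directories, the "intermediate" folder, and the file names are all invented for the demo and do not reflect breseq's actual layout.

```shell
# Throwaway sandbox; every name below is invented for the demo.
demo=$(mktemp -d)
cd "$demo" || exit 1

# Two fake run directories, each with a small output folder and a
# stand-in for the large intermediate files.
for run in output_a output_b; do
    mkdir -p "$run/output" "$run/intermediate"
    echo "<html></html>" > "$run/output/index.html"
    echo "pretend this is large" > "$run/intermediate/big.tmp"
done

# The wildcard expands to output_a/output and output_b/output only.
tar -czf output.tgz output_*/output

# -t lists the archive contents without extracting anything.
listing=$(tar -tzf output.tgz)
echo "$listing"
```

The listing should name only the output folders, and the original files stay in place because tar copies into the archive rather than moving anything. Remove the sandbox with rm -rf "$demo" when you are done.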
Here are the commands from the previous example, including the trick of compressing the output into a single archive before transfer, so you don't have to scroll back and forth. See if you can remember how to do it without going back over the lesson.

To use scp, run it in a terminal on your desktop, not on the remote TACC system. Finding the files on the remote system can be tricky because your desktop does not know what $HOME, $WORK, and $SCRATCH mean (they are only defined on TACC).

To find the full path to your files, run the pwd command on TACC in the terminal window where you ran breseq (each run directory should contain an "output" folder). Rather than copying the entire contents of each folder, which can be rather large, we add a twist: compress the output folders into a single archive with the tar command so that it is smaller and transfers faster:

Code Block
languagebash
titleCommand to type in TACC
tar -czvf output.tar.gz output_*/output  # the czvf options in order mean Create, gZip, Verbose, File (the archive name follows -f)
pwd

You can then copy and paste that information (in the correct position) into the scp command on your desktop's command line:

Code Block
languagebash
titleCommand to type in the desktop's terminal window
scp <username>@lonestar.tacc.utexas.edu:<the_directory_returned_by_pwd>/output.tar.gz .  # -r is not needed for a single file
tar -xvzf output.tar.gz  # the new "x" option at the front means eXtract

 

Click around in the results.

Optional: breseq utility commands

...

Additionally, the files in the data directory can be loaded in IGV if you copy them back to your desktop.

Optional Exercise: Running breseq in mixed population mode

The phage lambda data set you examined is actually a mixed population of many different phage lambda genotypes descended from a clonal ancestor. You ran breseq in a mode where it predicted consensus mutations in what it thinks is one uniform haploid genome. Actually, some individuals in the population have certain mutations and others do not, so you might have noticed when you looked at some of the alignments that there was a mixture of bases at a position.

We will talk more about analyzing mixed population data to predict rare variants in a later lesson. However, if you're curious, you can now experiment with running breseq in a mode where it estimates the frequencies of different mutations in the population. This process is most accurate for single nucleotide variants. Intermediate-frequency mutations are not (yet) predicted for every class of mutation; large structural variants, for example, are not covered.

Code Block
login1$ breseq --polymorphism-prediction --polymorphism-no-indels -r lambda.gbk lambda_mixed_population.fastq 

The option --polymorphism-prediction turns on these mixed population predictions. The option --polymorphism-no-indels turns off predictions of small insertions and deletions (which don't work as well for reasons too complicated to explain here). You're welcome to also try it without this option.
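
Since the next step compares this run with the earlier one, it may help to write each run to its own directory. The sketch below assumes that breseq's -o option selects the output directory (confirm with breseq --help on your installation); the directory names are invented. By default it only prints the commands, and it runs them only when DRY_RUN=0 is set.

```shell
# Prints each command; runs it only when DRY_RUN=0 (default is print-only).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Assumption: -o names the output directory. Directory names are hypothetical.
run breseq --polymorphism-prediction --polymorphism-no-indels \
    -o output_mixed_no_indels -r lambda.gbk lambda_mixed_population.fastq

# Same run without --polymorphism-no-indels, kept in a separate directory.
run breseq --polymorphism-prediction \
    -o output_mixed_with_indels -r lambda.gbk lambda_mixed_population.fastq
```

Keeping the two runs apart means neither overwrites the other or the consensus-mode output from before.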

Copy the resulting output directory back to your computer and examine the HTML output in a web browser. Compare it to the output from before.

 

Optional: Install breseq

...