...

Code Block
languagebash
titleStart an idev session
idev -m 90180 -N 1 -A OTH21164 -r CoreNGS-Tue       # or -A TRA23004 
# -or-
idev -m 90120 -N 1 -A OTH21164 -p development   # or -A TRA23004 

Data staging

Set ourselves up to process some yeast data in $SCRATCH, using some of the best practices for organizing our workflow.

...

One of the challenges of dealing with large data files, whether compressed or not, is finding your way around the data: locating and looking at the relevant pieces of it. Except for the smallest of files, you can't open them in a text editor, because those programs read the whole file into memory and will choke on sequencing data files! Instead we use various techniques to look at pieces of the files at a time. (Read more about commands for Displaying file contents)

The first technique is the use of pagers – we've already seen this with the more command. Review its use now on our small uncompressed file:

...

Read more about head and tail in Displaying file contents.
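
As a quick refresher, here is a minimal sketch of head and tail at work. The throwaway file built with mktemp and seq is a stand-in for illustration only, not part of the course data.

```bash
# Build a small throwaway file of 100 numbered lines (a stand-in for real data).
tmpfile=$(mktemp)
seq 1 100 > "$tmpfile"

# head -n N shows the first N lines; tail -n N shows the last N lines.
first_lines=$(head -n 4 "$tmpfile")
last_lines=$(tail -n 2 "$tmpfile")

# tail -n +N starts output at line N; piping to head takes a slice from the middle.
middle=$(tail -n +50 "$tmpfile" | head -n 3)

echo "$first_lines"
echo "$last_lines"
echo "$middle"

rm -f "$tmpfile"
```

None of these commands reads the whole file into memory, which is why they stay usable on very large files.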

zcat and gunzip -c tricks

...

Code Block
languagebash
titleSyntax for arithmetic on the command line
echo $((2368720 / 4))

Here's another trick: backtick evaluation. When you enclose a command expression in backtick quotes ( ` ), the enclosed expression is evaluated and its standard output is substituted into the string. (Read more about Quoting in the shell)
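
A minimal sketch of backtick evaluation combined with shell arithmetic is shown here. The eight lines of made-up text (the hypothetical sample_text variable) stand in for a two-record FASTQ file; they are an assumption for illustration, not course data.

```bash
# Eight lines of made-up text stand in for a two-record FASTQ file.
sample_text='l1\nl2\nl3\nl4\nl5\nl6\nl7\nl8\n'

# Backticks: the enclosed command runs, and its standard output replaces it.
line_count=`printf "$sample_text" | wc -l`
echo "The sample has $line_count lines"

# The substituted output can feed shell arithmetic directly.
records=$(( `printf "$sample_text" | wc -l` / 4 ))
echo "That is $records four-line records"

# $( ) is an equivalent form of the same substitution that is easier to nest.
records_alt=$(( $(printf "$sample_text" | wc -l) / 4 ))
echo "$records_alt"
```

The $( ) form behaves the same way as backticks here; which one to use is largely a matter of readability.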

...

In the code below we pipe the output from wc -l (number of lines in the FASTQ file) to awk, which executes its body (the statements between the curly braces, { }) for each line of input. Here the input is just one line, with one field: the line count. The awk body just divides the 1st input field ($1) by 4 and writes the result to standard output. (Read more about awk in Some Linux commands: awk)
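
The same wc -l to awk pipeline can be tried on its own with the self-contained sketch here. It assumes a tiny gzipped FASTQ file of three made-up reads (the hypothetical toy.fastq.gz), created on the spot so that it does not depend on the course data.

```bash
# Create a tiny gzipped FASTQ file with 3 made-up reads (4 lines each, 12 lines total).
tmpdir=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n@read3\nGGCC\n+\nIIII\n' \
  | gzip > "$tmpdir/toy.fastq.gz"

# gunzip -c writes the uncompressed text to standard output,
# wc -l emits one line holding the line count,
# and awk divides that 1st field ($1) by 4 to get the number of reads.
read_count=$(gunzip -c "$tmpdir/toy.fastq.gz" | wc -l | awk '{print $1 / 4}')
echo "$read_count"

rm -rf "$tmpdir"
```

Because awk treats $1 as a number in the division, any leading whitespace in the wc -l output does not matter.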

Setup (if needed)

Code Block
languagebash
# Setup (if needed)
# Location of the course materials
export CORENGS=/work/projects/BioITeam/projects/courses/Core_NGS_Tools
mkdir -p "$SCRATCH/core_ngs/fastq_prep"
cd "$SCRATCH/core_ngs/fastq_prep"
# Symlink the FASTQ files into the current directory instead of copying them
ln -sf "$CORENGS/yeast_stuff/Sample_Yeast_L005_R1.cat.fastq.gz"
ln -sf "$CORENGS/yeast_stuff/Sample_Yeast_L005_R2.cat.fastq.gz"


...