...
...
Expand | |||||
---|---|---|---|---|---|
| |||||
These are 101-base reads. wc -c counts the "invisible" newline character, so subtract 1 from the character count it returns for a line. Here's a way to strip the trailing newline character from the quality score string before calling wc -c to count the characters. We use the echo -n option, which tells echo not to include the trailing newline in its output. We generate that text using sub-shell evaluation (an alternative to backtick evaluation) of that zcat ... command:
|
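The newline-counting behavior described above can be sketched without the course data. This is a minimal illustration, assuming a made-up 10-base FASTQ record held in a shell variable (the record and the variable names are hypothetical, and sed stands in for the zcat ... pipeline used in the course):

```shell
# Hypothetical 4-line FASTQ record with a 10-base read (not from the course data)
fastq_record='@read1
ACGTACGTAC
+
IIIIIIIIII'

# Counting the quality line as printed includes its trailing newline
with_newline=$(echo "$fastq_record" | sed -n '4p' | wc -c)

# Sub-shell evaluation $( ) strips the trailing newline from the captured text,
# and echo -n does not add one back, so wc -c sees only the quality characters
qual=$(echo "$fastq_record" | sed -n '4p')
without_newline=$(echo -n "$qual" | wc -c)

echo "with newline: $with_newline; without newline: $without_newline"
```

The first count is one larger than the second, which is the off-by-one that the subtraction accounts for.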
...
- Here's one of those cases where you have to be careful about separating standard output and standard error.
- cutadapt writes its FASTQ output to standard output by default, and writes summary information to standard error.
- In this command we first redirect standard error to a log file named miRNA_test.cuta.log using the 2> syntax, in order to capture cutadapt diagnostics.
- Then the remaining standard output is piped to gzip, whose output is then redirected to a new compressed FASTQ file. (Read more about Standard streams and redirection)
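The redirection pattern in these bullets can be tried without cutadapt installed. In this sketch, fake_trimmer is a hypothetical stand-in function (not part of cutadapt or the course scripts) that writes data to standard output and a summary to standard error, and the file names are placeholders in a temporary directory:

```shell
# fake_trimmer is a hypothetical stand-in for cutadapt:
# "FASTQ" data goes to standard output, a summary goes to standard error
fake_trimmer() {
  echo "@read1"
  echo "ACGT"
  echo "Summary: 1 read processed" >&2
}

workdir=$(mktemp -d)

# 2> sends standard error to the log file; only standard output
# continues through the pipe to gzip, whose output is redirected to a file
fake_trimmer 2> "$workdir/test.log" | gzip -c > "$workdir/test.fastq.gz"

log_text=$(cat "$workdir/test.log")
data_text=$(gzip -dc "$workdir/test.fastq.gz")
rm -rf "$workdir"

echo "log file contents: $log_text"
echo "compressed file contents:"
echo "$data_text"
```

If the 2> redirection were omitted, the summary line would appear on the terminal instead of in the log, but it still would not end up in the compressed file, because a pipe only carries standard output.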
...
Expand | ||
---|---|---|
| ||
The cutadapt --help output describes its usage as follows:
From this we see that the
And this says that input reads can also be provided on standard input, if that argument is a hyphen ( - ). So input data can come:
What about cutadapt output (the trimmed reads)? The brackets around the usage -o option indicate that the resulting trimmed FASTQ can be written to a file, but is not by default. This implies that cutadapt by default writes its results to standard output. So output can go
Finally, as we've seen, cutadapt also writes diagnostic output. Where does it go? The usage line doesn't say anything about diagnostics explicitly. But in the Output section of cutadapt --help:
Careful reading of this suggests that:
|
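The hyphen-as-standard-input convention is shared by many Unix tools. As a small sketch, cat is used here as a stand-in (this is not a cutadapt command) to show that reading from a named file and reading from - on a pipe yield the same text; the two-line input is made up:

```shell
# Hypothetical two-line input standing in for FASTQ text
tmpfile=$(mktemp)
printf 'line1\nline2\n' > "$tmpfile"

# Input named as a file argument
from_file=$(cat "$tmpfile")

# Input supplied on standard input, with - in place of the file name
from_stdin=$(cat - < "$tmpfile")

rm -f "$tmpfile"
echo "from file:  $from_file"
echo "from stdin: $from_stdin"
```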
...
Code Block | ||||
---|---|---|---|---|
| ||||
mkdir -p $SCRATCH/core_ngs/cutadapt
cd $SCRATCH/core_ngs/cutadapt
cp $CORENGS/human_stuff/Sample_H54_miRNA_L004_R1.cat.fastq.gz .
cp $CORENGS/human_stuff/Sample_H54_miRNA_L005_R1.cat.fastq.gz .
cp $CORENGS/alignment/Yeast_RNAseq_L002_R1.fastq.gz .
cp $CORENGS/alignment/Yeast_RNAseq_L002_R2.fastq.gz . |
...
Tip | ||
---|---|---|
| ||
The BioITeam has a number of useful NGS scripts that can be executed by anyone on ls6 or stampede3. They are located in the /work/projects/BioITeam/common/script/ directory. For groups that participate in BRCF pods, the scripts are available in /mnt/bioi/script on any compute server. |
...
Or use this "cat to MARKER" trick, also known as a heredoc. The MARKER tag can be anything; below it is EOL.
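A minimal sketch of the heredoc pattern follows. The two echo lines are placeholder commands, not the course's actual commands file contents, and the output goes to a temporary file rather than a named commands file:

```shell
# Write everything between the two EOL markers into a file.
# The lines in between are placeholders, not the real course commands.
cmds_file=$(mktemp)
cat > "$cmds_file" <<EOL
echo "first command"
echo "second command"
EOL

# Confirm what was written
n_lines=$(wc -l < "$cmds_file")
first_line=$(head -1 "$cmds_file")
cat "$cmds_file"
rm -f "$cmds_file"
```

With an unquoted marker such as EOL, the shell still expands variables like $SCRATCH inside the heredoc body; quoting the marker ('EOL') would keep the text literal.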
...
Code Block | ||||
---|---|---|---|---|
| ||||
cd $SCRATCH/core_ngs/cutadapt
launcher_creator.py -j cuta.cmds -n cuta -t 01:00:00 -a OTH21164 -q normal
sbatch --reservation=CoreNGS-Wed cuta.slurm
showq -u
# or, if you're not on the reservation:
launcher_creator.py -j cuta.cmds -n cuta -t 01:00:00 -a OTH21164 -q development
sbatch cuta.slurm
showq -u |
...
- H54_miRNA_L004.cuta.log, H54_miRNA_L005.cuta.log, yeast_rnaseq.cuta.log
- these are the main execution log files, one for each trim_adapters.sh command
- H54_miRNA_L004.acut.pass0.log, H54_miRNA_L005.acut.pass0.log
- these are cutadapt statistics files for the single-end adapter trimming
- their contents will look like our small example above
- yeast_rnaseq.acut.pass1.log, yeast_rnaseq.acut.pass2.log
- these are cutadapt statistics files from paired-end trimming of the R1 and R2 adapters, respectively.
...
Expand | |||||
---|---|---|---|---|---|
| |||||
For more on printf, which is available in most programming languages, see https://en.wikipedia.org/wiki/Printf#Format_specifier or https://alvinalexander.com/programming/printf-format-cheat-sheet/ |
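A few common format specifiers can be tried directly with the shell's printf. The values below are arbitrary illustrations, not taken from the course output:

```shell
# %03d: integer, zero-padded to width 3
padded=$(printf "%03d" 7)

# %-8s: string left-justified in 8 columns; %5d: integer right-justified in 5
row=$(printf "%-8s|%5d" reads 42)

# %x: integer shown in hexadecimal
hex=$(printf "%x" 255)

echo "$padded"   # 007
echo "$row"      # reads   |   42
echo "$hex"      # ff
```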
...