2020 Catch-up
Environment setup
Directories and symlinks
Create the directories and symbolic links needed in your home directory:
cd
ln -s -f $SCRATCH scratch
ln -s -f $WORK work
ln -s -f /work/projects/BioITeam

mkdir -p ~/local/bin
cd ~/local/bin
ln -s -f /work/projects/BioITeam/common/bin/launcher_maker.py
ln -s -f /work/projects/BioITeam/ls5/opt/cutadapt-1.10/bin/cutadapt
ln -s -f /work/projects/BioITeam/ls5/opt/multiqc-1.0/multiqc
ln -s -f /work/projects/BioITeam/ls5/opt/samstat-1.09/samstat
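To confirm that the links were created and point at something that exists, a small check like the following may help. This is only a sketch: `check_links` is a hypothetical helper, not part of the course materials, and it assumes the link names used in the commands above.

```shell
# check_links BASE NAME...  (hypothetical helper, not from the course files)
# Reports whether each NAME under BASE is a working symlink, a broken
# symlink, or missing altogether.
check_links() {
    local base="$1"; shift
    local name
    for name in "$@"; do
        if [ -L "$base/$name" ] && [ -e "$base/$name" ]; then
            echo "ok      $name -> $(readlink "$base/$name")"
        elif [ -L "$base/$name" ]; then
            echo "broken  $name -> $(readlink "$base/$name")"
        else
            echo "missing $name"
        fi
    done
}

# Check the links created in the home directory and in ~/local/bin
check_links "$HOME" scratch work BioITeam
check_links "$HOME/local/bin" launcher_maker.py cutadapt multiqc samstat
```

A "broken" line usually means the target path does not exist on the system you are logged in to; "missing" means the `ln` command for that name was never run in that directory.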
.bashrc setup
If you already have a .bashrc set up, make a backup copy first. You can restore your original login script after class is over.
cd
cp .bashrc .bashrc.beforeNGS
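Restoring the backup after class could look like the sketch below. `restore_bashrc` is a hypothetical helper name; the only assumption carried over from this page is the backup file name `.bashrc.beforeNGS`. The function is defined but not called, so pasting it does not change anything until you run it.

```shell
# restore_bashrc [DIR]  (hypothetical helper; DIR defaults to $HOME)
# Copies DIR/.bashrc.beforeNGS back over DIR/.bashrc if the backup exists.
restore_bashrc() {
    local dir="${1:-$HOME}"
    if [ -f "$dir/.bashrc.beforeNGS" ]; then
        cp "$dir/.bashrc.beforeNGS" "$dir/.bashrc"
        echo "restored $dir/.bashrc"
    else
        echo "no backup found in $dir" >&2
        return 1
    fi
}
```

After class, run `restore_bashrc` with no arguments, then log off and log back in so the restored profile takes effect.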
Copy and configure the login profile for this class:
cd
cp /work/projects/BioITeam/projects/courses/Core_NGS_Tools/tacc/bashrc.corengs.ls5 .bashrc
chmod 600 .bashrc
Source it to make it active (if this doesn't work, log off then log back in):
source ~/.bashrc
Environment variables
General
export ALLOCATION=UT-2015-05-18
export BI=/corral-repl/utexas/BioITeam
export BIWORK=/work/projects/BioITeam
export CORENGS=$BIWORK/projects/courses/Core_NGS_Tools

export PATH=.:$HOME/local/bin:$PATH
# For cutadapt support:
export PYTHONPATH=$BIWORK/ls5/lib/python2.7/site-packages:$PYTHONPATH
# For MultiQC support:
export PYTHONPATH=$BIWORK/ls5/lib/python2.7/annab-packages:$PYTHONPATH
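Most later commands on this page depend on these variables, so it can be worth checking that they are exported in your current shell. The sketch below uses a hypothetical helper, `check_vars`, and relies on the standard `printenv` command, which only sees exported variables.

```shell
# check_vars NAME...  (hypothetical helper, not from the course files)
# Prints NAME=value for each exported variable and returns non-zero
# if any of the named variables is not in the environment.
check_vars() {
    local name value status=0
    for name in "$@"; do
        if value=$(printenv "$name"); then
            echo "$name=$value"
        else
            echo "$name is not set" >&2
            status=1
        fi
    done
    return "$status"
}

# Check the class variables; suggest re-sourcing the profile if any are missing
check_vars ALLOCATION BI BIWORK CORENGS || echo "Try: source ~/.bashrc"
```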
Turn on coloring by file type in the shell:
export LS_OPTIONS='-N --color=auto -T 0'
# For better colors using a dark background terminal, use this line:
export LS_COLORS=$LS_COLORS:'di=1;33:'
# For better colors using a white background terminal, use this line:
export LS_COLORS=$LS_COLORS:'di=1;34:'
TACC intro
Commands files
Simple commands
mkdir -p $SCRATCH/core_ngs/slurm/simple
cd $SCRATCH/core_ngs/slurm/simple
cp $CORENGS/tacc/simple.cmds .
Wayness commands
mkdir -p $SCRATCH/core_ngs/slurm/wayness
cd $SCRATCH/core_ngs/slurm/wayness
cp $CORENGS/tacc/wayness.cmds .
Start an idev session
To start a 3-hour idev (interactive development) session:
idev -p normal -m 180 -N 1 -n 24 -A UT-2015-05-18 --reservation=CCBB
You can tell you're in an idev session because the hostname command will return a compute node name (e.g. nid00438) instead of a login node name (e.g. login5).
The idev session will terminate when the requested time has expired, or when you use the exit command.
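The hostname check described above can be wrapped in a small function. This is a sketch only: `node_kind` is a hypothetical helper, and the `login*` and `nid*` patterns are assumptions based solely on the two example names given here (login5 and nid00438); other systems may name their nodes differently.

```shell
# node_kind [HOSTNAME]  (hypothetical helper)
# Classifies a host name as "login", "compute", or "unknown".
# Defaults to the current host; falls back to uname -n if the
# hostname command is not available.
node_kind() {
    local host="${1:-$(hostname 2>/dev/null || uname -n)}"
    case "$host" in
        login*) echo "login" ;;
        nid*)   echo "compute" ;;
        *)      echo "unknown" ;;
    esac
}

echo "This looks like a $(node_kind) node"
```

Running it before and after the idev command should show the answer change from login to compute, provided the node names follow the patterns assumed above.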
Working with FASTQ
Yeast data
Working with some yeast ChIP-seq FASTQ data:
# Area for "original" sequencing data
mkdir -p $WORK/archive/original/2018_05.core_ngs
cd $WORK/archive/original/2018_05.core_ngs
wget http://web.corral.tacc.utexas.edu/BioITeam/yeast_stuff/Sample_Yeast_L005_R1.cat.fastq.gz
wget http://web.corral.tacc.utexas.edu/BioITeam/yeast_stuff/Sample_Yeast_L005_R2.cat.fastq.gz

# Create a $SCRATCH area for FASTQ prep and link the yeast data there
mkdir -p $SCRATCH/core_ngs/fastq_prep
cd $SCRATCH/core_ngs/fastq_prep
ln -s -f $WORK/archive/original/2018_05.core_ngs/Sample_Yeast_L005_R1.cat.fastq.gz
ln -s -f $WORK/archive/original/2018_05.core_ngs/Sample_Yeast_L005_R2.cat.fastq.gz

# Copy over a small FASTQ file
cd $SCRATCH/core_ngs/fastq_prep
cp $CORENGS/misc/small.fq .
ATACseq data for MultiQC
Get some FastQC reports for MultiQC:
mkdir -p $SCRATCH/core_ngs/multiqc/fqc.atacseq
cd $SCRATCH/core_ngs/multiqc/fqc.atacseq
cp $CORENGS/multiqc/fqc.atacseq/*.html .
FASTQ files for cutadapt
For command-line cutadapt exploration:
cd $SCRATCH/core_ngs/fastq_prep
cp $CORENGS/human_stuff/Sample_H54_miRNA_L004_R1.cat.fastq.gz .
cp $CORENGS/human_stuff/Sample_H54_miRNA_L005_R1.cat.fastq.gz .
zcat Sample_H54_miRNA_L004_R1.cat.fastq.gz | head -2000 > miRNA_test.fq
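A FASTQ file normally stores each read as 4 lines, so the first 2000 lines taken with head correspond to 500 reads. The sketch below shows one way to count reads in a gzipped FASTQ file under that 4-line assumption. `count_reads` is a hypothetical helper, and the demo file is made-up data created in a temporary directory so the example runs anywhere.

```shell
# count_reads FILE.fastq.gz  (hypothetical helper)
# Counts reads by dividing the decompressed line count by 4.
# Assumes the usual 4-line-per-record FASTQ layout.
count_reads() {
    local lines
    lines=$(gzip -dc "$1" | wc -l)
    echo $(( lines / 4 ))
}

# Self-contained demo on a tiny made-up FASTQ file containing 2 reads
demo=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | gzip > "$demo/demo.fastq.gz"
count_reads "$demo/demo.fastq.gz"   # prints 2
```

The same function can be pointed at the miRNA files copied above, for example `count_reads Sample_H54_miRNA_L004_R1.cat.fastq.gz`.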
For batch cutadapt processing:
mkdir -p $SCRATCH/core_ngs/cutadapt
cd $SCRATCH/core_ngs/cutadapt
cp $CORENGS/human_stuff/Sample_H54_miRNA_L004_R1.cat.fastq.gz .
cp $CORENGS/human_stuff/Sample_H54_miRNA_L005_R1.cat.fastq.gz .
cp $CORENGS/yeast_stuff/Yeast_RNAseq_L002_R1.fastq.gz .
cp $CORENGS/yeast_stuff/Yeast_RNAseq_L002_R2.fastq.gz .
cp $CORENGS/tacc/cuta.cmds .
Alignment workflow
Alignment workflow setup
Starting files:
# FASTA (for building references)
mkdir -p $SCRATCH/core_ngs/references/fasta
cp $CORENGS/references/*.* $SCRATCH/core_ngs/references/fasta/

# FASTQ (to align)
mkdir -p $SCRATCH/core_ngs/alignment/fastq
cp $CORENGS/alignment/*fastq.gz $SCRATCH/core_ngs/alignment/fastq/
References
Get a copy of all references we build in the exercises (including FASTA):
mkdir -p $SCRATCH/core_ngs/references
rsync -ptlvrP $CORENGS/references/ $SCRATCH/core_ngs/references/
BWA PE alignment of yeast data
To jump into aligning paired-end (PE) yeast data with BWA:
# Pre-built references
mkdir -p $SCRATCH/core_ngs/references
rsync -avrP $CORENGS/references/ $SCRATCH/core_ngs/references/

# FASTQ (to align)
mkdir -p $SCRATCH/core_ngs/alignment/fastq
cp $CORENGS/alignment/*fastq.gz $SCRATCH/core_ngs/alignment/fastq/

# Alignment directory
mkdir -p $SCRATCH/core_ngs/alignment/yeast_bwa
cd $SCRATCH/core_ngs/alignment/yeast_bwa
ln -s -f ../fastq
ln -s -f ../../references/bwa/sacCer3

module load bwa
module load samtools
samtools manipulation of aligned yeast data
To jump into post-alignment manipulation of yeast_pairedend.bam with samtools:
mkdir -p $SCRATCH/core_ngs/alignment/yeast_bwa
cd $SCRATCH/core_ngs/alignment/yeast_bwa
cp $CORENGS/catchup/yeast_bwa/yeast_pairedend.bam .
module load samtools

# If the sorted, indexed BAM is needed:
cp $CORENGS/catchup/yeast_bwa/yeast_pairedend.sort* .
SAMTools and BEDTools
Setup for samtools
mkdir -p $SCRATCH/core_ngs/samtools
cd $SCRATCH/core_ngs/samtools
cp $CORENGS/catchup/for_samtools/* .
module load samtools