...
Code Block |
---|
language | bash |
---|
title | copy some files over to intersect |
---|
|
cd $SCRATCH/core_ngs/
mkdir intersect
cd alignment/intersect
samtoolscp view -b human_rnaseq_mem.sam > human_rnaseq_mem.bam
samtools view -b human_mirnaseq.sam > human_mirnaseq.bam
cp human_rnaseq_mem.bam ../intersect
cp human_mirnaseq.bam ../intersect
cd ../intersect/corral-repl/utexas/BioITeam/core_ngs_tools/alignment/bam/human_mirnaseq_hg19.bam .
cp /corral-repl/utexas/BioITeam/core_ngs_tools/alignment/bam/human_rnaseq_bwa.bam .
ls -lah
|
Expand |
---|
title | click here to see my output on copying files |
---|
|
Code Block |
---|
| -rw-rw-r--rwxrwxr-x 1 awh394 G-801021 16M19M May 22 18:3657 human_mirnaseq_hg19.bam
-rw-rw-r--rwxrwxr-x 1 awh394 G-801021 96.1M6M May 22 18:3657 human_rnaseq_membwa.bam |
|
Before we can intersect these files, we need to perform the pipeline we used in samtools to index, sort and filter the files, and bedtools to convert from BAM over to bed, then collapse down the files using merge. Below is a little workflow to help you through it on the files you just copied above.
Expand |
---|
title | Click here to see the samtools/bedtools workflow |
---|
|
My output (for length of bed files) is in the comments. Code Block |
---|
language | bash |
---|
title | a samtools/bedtools workflow |
---|
| module load samtools #if you haven't loaded it up this session
#sort both files
samtools sort human_mirnaseq_hg19.bam human_mirnaseq_hg19_sort # will take 1-2 minutes
samtools sort human_rnaseq_bwa.bam human_rnaseq_bwa_sort # will take 1-2 minutes
#index the new files
samtools index human_mirnaseq_hg19_sort.bam
samtools index human_rnaseq_bwa_sort.bam
#filter the sorted files, reindex the new filtered files
samtools view -b -F 0x04 -q 1 -o human_mirnaseq_hg19_sort.mapped.q1.bam human_mirnaseq_hg19_sort.bam
samtools view -b -F 0x04 -q 1 -o human_rnaseq_bwa_sort.mapped.q1.bam human_rnaseq_bwa_sort.bam
samtools index human_mirnaseq_hg19_sort.mapped.q1.bam
samtools index human_rnaseq_bwa_sort.mapped.q1.bam
#convert filtered bam files to bed format
module load bedtools #if you haven't loaded it in for this session
bedtools bamtobed -i human_mirnaseq_hg19_sort.mapped.q1.bam > human_mirnaseq_hg19_sort.mapped.q1.bed
bedtools bamtobed -i human_rnaseq_bwa_sort.mapped.q1.bam > human_rnaseq_bwa_sort.mapped.q1.bed
#check the length of the files:
wc -l *.bed
# #164806164806 human_mirnaseq_hg19_sort.mapped.q1.bed
# 22538 human_rnaseq_bwa_sort.mapped.q1.bed
#merge the bed files, check the length again
bedtools merge -s -c 4,5,6 -o count_distinct,sum,distinct -i human_mirnaseq_hg19_sort.mapped.q1.bed -o count_distinct,sum -i human_mirnaseq_hg19_sort.mapped.q1.bed | awk '{print $1 "\t" $2 "\t" $3 "\t" $5 "\t" $6 "\t" $4}' > human_mirnaseq_hg19_sort.mapped.q1.merge.bed
bedtools merge -s -c 4,5,6 -o count_distinct,sum,distinct -i human_rnaseq_bwa_sort.mapped.q1.bed | awk '{print $1 "\t" $2 "\t" $3 "\t" $5 "\t" $6 "\t" $4}' > human_rnaseq_bwa_sort.mapped.q1.merge.bed
wc -l *.merge.bed
# #1479414794 human_mirnaseq_hg19_sort.mapped.q1.merge.bed
# 7134 human_rnaseq_bwa_sort.mapped.q1.merge.bed |
|
...
Expand |
---|
title | click to copy over the merged bed files... |
---|
|
Code Block |
---|
| cds
cd intersect
cp /scratch/01786/awh394/core_ngs/day4_2015/intersect/human_mirnaseq_hg19_sort.mapped.q1.merge.bed .
cp /scratch/01786/awh394/core_ngs/day4_2015/intersect/human_rnaseq_bwa_sort.mapped.q1.merge.bed .
|
|
...