Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

In this lab, we'll look at how to use cummeRbund to visualize our gene expression results from cuffdiff.  CummeRbund is part of the tuxedo pipeline and it is an R package that is capable of plotting the results generated by cuffdiff.

Figure from: Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks, Trapnell et al, Nature Protocols, 2013. 

What can you do with cummeRbund?

CummeRbund is powerful package with many different functions. Cuffdiff produces a lot of output files and all the output files are related by sets by IDs. CummeRbund makes managing and querying this data easier by loading the data into multiple objects of several different classes. Because all of this gets stored in an sql database, you can access it quickly without loading everything in to memory.

  •   readCufflinks- functions designed to read all the output files that cuffdiff generates into an R data object (of class CuffSet).
    •  cuff_data <- readCufflinks(diff_out)
    • Image Added
    • Now you can access gene information in your differential output using genes(cuff_data), your isoform level output using isoforms(cuff_data), TSS related groups using tss(cuff_data) and so forth
  • Global statistics on data for quality, dispersion, distribution of gene expression scores etc.
    Image Added
  • Plot things for specific features or genes

Image Added

  • Get your data

Code Block
titleGet set up
cds
cd my_rnaseq_course
cp -r /corral-repl/utexas/BioITeam/rnaseq_course/diff_results .
cd diff_results
 
 

#We need the cuffdiff results because that is the input to cummeRbund.

...

And after executing the necessary commands, add the line:

Code Block
dev.off()

...

Exercise 1: Generate a scatterplot comparing genes across conditions C1 vs C2.

Code Block
pdf(file="scatterplot.pdf")

csScatter(genes(cuff_data), 'C1', 'C2')

dev.off()

...

Code Block
titleTo pull out significantly differentially expressed genes and isoforms
gene_diff_data  <- diffData(genes(cuff_data))
sig_gene_data  <- subset(gene_diff_data, (significant ==  'yes'))

#Count how many we have
nrow(sig_gene_data)

isoform_diff_data <-diffData(isoforms(cuff_data))
sig_isoform_data <- subset(isoform_diff_data, (significant == 'yes'))

#Count how many we have
nrow(sig_isoform_data)

...

 

Exercise 3: For a gene, regucalcin, plot gene and isoform level expression.

...

Code Block
titleTo plot gene level and isoform level expression for gene Rala
pdf(file="rala.pdf")
mygene2 <- getGene(cuff_data, 'Rala')
expressionBarplot(mygene2)
expressionBarplot(isoforms(mygene2))
dev.off()
  • Take cummeRbund for a spin...

CummeRbund is powerful package with many different functions. Above was an illustration of a few of them. Try any of the suggested exercises below to further explore the differential expression results with different cummeRbund functions.

...

Exercise 7: Pull out a subset of the genes using a lnlog_fold_change greater than 1.5. Generate expression bar plots for just those genes.

Expand
Solution
Solution
Code Block
titleOne possible solution
gene_diff_data  <- diffData(genes(cuff_data))
sig_gene_data  <- subset(gene_diff_data, (lnlog2_fold_change > 1.5))
head(sig_gene_data)
sig_geneids <- c(sig_gene_data$gene_id)

pdf("expressionboxplot2.pdf")
myGenes <- getGenes(cuff_data, sig_geneids)
expressionBarplot(myGenes)

dev.off()

The resultant graph is here.