...
Canonical ILLUMINA library design as of June 2012 (all 5'-3'), "TruSeq V3": NOTE all sequences shown are TOP STRAND 5' to 3'
...
- Red: P5 PCR primer/flowcell capture site: AATGATACGGCGACCACCGAGA
- Yellow: IndexRead2: NONE, as in do NOT put an index here. If you want to add an index here, use one of the "Dual-index" designs below.
- Green: Read1 primer site: either the small RNA sequencing primer site (NEB: TCTACACGTTCAGAGTTCTACAGTCCGACGATCA; Illumina lists CAGGTTCAGAGTTCTACAGTCCGACGATCA, but this is UNPROVEN) OR the standard TruSeq Read 1 primer site: TCTACACTCTTTCCCTACACGACGCTCTTCCGATCT. Which to choose? The TruSeq Read 1 primer site is complementary to the Read 2 primer site, so if you are designing amplicons do NOT use the TruSeq Read 1 primer site; use the small RNA sequencing primer site.
- The insert to be sequenced
- Cyan: Read2 primer site: then the Index read primer site: AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC (NOTE: the initial A is from the dA tailing of the insert and is not included in the index primer or adaptor sequences. Note also that the reverse complement of this is the Read 2 sequencing primer, but the Read 2 sequencing primer includes the T corresponding to the dA insert tail, so sequencing starts with the insert.)
- Blue: IndexRead1: the index sequence (usually 6 bp). See the many examples below in the Barcodes section. Within a lane, image analysis works best with as much base diversity as possible.
- Purple: P7 PCR primer/flowcell capture site: ATCTCGTATGCCGTCTTCTGCTTG
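As a rough illustration of how the elements above line up, the following Python sketch concatenates them in the listed order to give the top strand of a finished single-index construct. The sequences are copied from the list above; the function name `assemble_construct`, the constant names, and the example insert are hypothetical and are not part of any GSAF or Illumina tool.

```python
# Design elements copied from the list above (all top strand, 5' to 3').
P5 = "AATGATACGGCGACCACCGAGA"
READ1_TRUSEQ = "TCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
READ1_SMALL_RNA = "TCTACACGTTCAGAGTTCTACAGTCCGACGATCA"
# The leading A of this site comes from dA tailing of the insert (see the note above).
READ2_INDEX_SITE = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
P7 = "ATCTCGTATGCCGTCTTCTGCTTG"


def assemble_construct(insert, index, read1_site=READ1_TRUSEQ):
    """Return the top strand (5' to 3') of a single-index construct.

    Hypothetical helper: it only concatenates the elements in the order
    P5, Read1 site, insert, Read2/Index site, index, P7.
    """
    for name, seq in (("insert", insert), ("index", index)):
        if not seq or set(seq) - set("ACGT"):
            raise ValueError(f"{name} must be a non-empty string of A/C/G/T")
    return P5 + read1_site + insert + READ2_INDEX_SITE + index + P7


if __name__ == "__main__":
    # Made-up 8 nt insert and a 6 bp index, for illustration only.
    print(assemble_construct("ACGTACGT", "ATCACG"))
```

For amplicon designs, passing `read1_site=READ1_SMALL_RNA` follows the advice above about avoiding the TruSeq Read 1 primer site.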
...
Here is an example of a read-pair from an RNA-seq library generated from the NEB small RNA kit with an insert size of 62 nt:
...
After exhaustive searching of all 4096 6-mers, the following table lists all remaining 6 bp barcodes that have a Hamming distance of at least 3 from each other and from the 49 barcodes in the table above (NOTE: these have NOT been tested on the sequencer as of 2/7/12):
Sequence | GSAF name |
---|---|
AAACAC | UTBC50 |
TGAAGG | UTBC51 |
AACATA | UTBC52 |
CGCGTC | UTBC53 |
GATACA | UTBC54 |
GGTGTG | UTBC55 |
TAAGAA | UTBC56 |
AGCGAG | UTBC57 |
CGGTTA | UTBC58 |
AGCTTT | UTBC59 |
TGGTCT | UTBC60 |
TATCCC | UTBC61 |
TGTCGT | UTBC62 |
CCCCAC | UTBC63 |
ATACGA | UTBC64 |
CCCTTG | UTBC65 |
ACCGGC | UTBC66 |
TTACTG | UTBC67 |
GGAACT | UTBC68 |
GTTATT | UTBC69 |
AAAAGT | UTBC70 |
AAGGGA | UTBC71 |
AAGTAT | UTBC72 |
ACATCT | UTBC73 |
ACGATT | UTBC74 |
ACGCCG | UTBC75 |
ACTCTC | UTBC76 |
AGAATC | UTBC77 |
ATTGGG | UTBC78 |
CCGCGT | UTBC79 |
CGCCCT | UTBC80 |
CTGCAG | UTBC81 |
GAAGTT | UTBC82 |
GCACCC | UTBC83 |
GCAGGA | UTBC84 |
GCCGCG | UTBC85 |
GGCGGT | UTBC86 |
GTATTA | UTBC87 |
TACGTG | UTBC88 |
TCACAT | UTBC89 |
TCTATA | UTBC90 |
TGCAAA | UTBC91 |
TGGCAC | UTBC92 |
TGTTAG | UTBC93 |
TTCTAT | UTBC94 |
GGTACG | UTBC95 |
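The kind of search described above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used to build the table: it assumes a simple greedy pass over all k-mers in lexicographic order, so the barcodes it picks depend on that order and will generally differ from the table. The function names are hypothetical.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def extend_barcode_set(existing, length=6, min_dist=3):
    """Greedily add k-mers that keep every pair at least min_dist apart.

    Assumption: candidates are tried in lexicographic order (A < C < G < T).
    A different order would give a different, equally valid set.
    """
    chosen = list(existing)
    added = []
    for kmer in ("".join(p) for p in product("ACGT", repeat=length)):
        if all(hamming(kmer, b) >= min_dist for b in chosen):
            chosen.append(kmer)
            added.append(kmer)
    return added


if __name__ == "__main__":
    # Two barcodes from the table above, used here only as a starting set.
    start = ["AAACAC", "TGAAGG"]
    extra = extend_barcode_set(start)
    print(len(extra), "additional barcodes found")
    print("minimum pairwise distance:", min_pairwise_distance(start + extra))
```

`min_pairwise_distance` can also be used on its own to check a set of barcodes before pooling samples in a lane.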
Excruciating details - USE WITH CAUTION - RNA PCR primers are NOT current as of Dec. 2011
...
Code Block |
---|
Nextera adaptor style: 5'-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGT-3' ||||||||||||||||||| 3'-CAGAGCACCCGAGCCTCTACACATATTCTCTGTC-5' TruSeq truncated adaptor: 5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCT-3' ||||||||||||| 3'-CACTGACCTCAAGTCTGCACACGAGAAGGCTAGA-5' Nextera adaptor style, with primers overlaid: First cycle: 3'-GGCTCGGGTGCTCTG CAAAGC TAGAGCATACGGCAGAAGACGAAC-5' 5'-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGT-3' <UNKNOWN> 5'-CTGTCTCTTATACACATCTCCGAGCCCACGAGAC-3' ||||||||||||||||||| ||||||||||||||||||| 3'-CAGAGCACCCGAGCCTCTACACATATTCTCTGTC-5' <UNKNOWN> 3'-TGACAGAGAATATGTGTAGACTGCGACGGCTGCT-5' Second and subsequent cycles: AATGATACGGCGACCACCGAGATCTACAC ATCACG TCGTCGGCAGCGTC 3'-AGCAGCCGTCGCAGTCTACACATATTCTCTGTCA <UNKNOWN> 3'-TGACAGAGAATATGTGTAGACTGCGACGGCTGCT-5' |
...
Cautions, common mistakes, and lessons learned from failure
- Assembling the P7 side adaptor or primer wrong - the key thing to note is that the "canonical designs" are shown 5' to 3' across the entire finished sequencing construct. So if you're designing a reverse primer for the P7 side, you have to use the reverse complement of ALL 3 DESIGN ELEMENTS (flow cell binding site, barcode, and sequencing primer site) and make sure they're in the right order.
- Incorrect P5 dual-index design - the "ACAC" motif in the single index design MUST be repeated on both sides of an index within P5 - see the "dual index" designs specifically.
- Reverse-complemented barcode sequences in either P5 or P7 side indexes, especially from amplicons - the fact that the Illumina sequencers read i5 differently is a pain, so pay attention to that when submitting barcode sequences that will wind up in a sample sheet. And remember that the i7 index is read "forward, top strand" of the canonical design, which is the reverse complement of the sequence that appears in a reverse primer used when creating a library.
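The first and third cautions can be illustrated with a short Python sketch that builds a P7-side reverse primer by reverse-complementing the top-strand elements as one unit, then recovers the top-strand i7 index from the primer. The sequences come from the canonical single-index design on this page; the function names are hypothetical, and whether a real primer should keep the leading A of the Read2/Index site (which the design notes attribute to dA tailing) is a design decision this sketch does not settle.

```python
# Top-strand elements (5' to 3') from the canonical single-index design.
READ2_INDEX_SITE = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
P7 = "ATCTCGTATGCCGTCTTCTGCTTG"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def p7_reverse_primer(index, read2_site=READ2_INDEX_SITE, p7=P7):
    """Hypothetical helper: P7-side reverse primer, written 5' to 3'.

    The top strand reads: Read2/Index site, index, P7. Reverse-complementing
    the whole stretch flips both the order and each element, giving
    rc(P7) + rc(index) + rc(Read2/Index site).
    """
    return reverse_complement(read2_site + index + p7)


if __name__ == "__main__":
    top_strand_index = "ATCACG"  # example i7 index as written in the top-strand design
    primer = p7_reverse_primer(top_strand_index)
    print(primer)
    # The index appears in the primer as its reverse complement.
    in_primer = reverse_complement(top_strand_index)
    print("index as it appears in the primer:", in_primer)
    print("index read forward, top strand:", reverse_complement(in_primer))
```

Reverse-complementing the concatenated top strand in one step, rather than each element separately, makes it harder to get the element order wrong.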
...