...

For some of the discussions below, we'll use a couple of data files produced by the automated processing that the Genome Sequencing and Analysis Facility (GSAF) uses to deliver sequencing data to customers. These files describe customer samples (libraries of DNA molecules to be sequenced), which are grouped into sets called jobs and sequenced on GSAF's sequencing machines as part of runs.

Here are links to the files in case you need to download them after this class is over (you don't need to download them now, since we'll create symbolic links to them).

The files are in your ~/data directory:

  • joblist.txt - contains job name/sample name pairs, tab-delimited, no header
  • sampleinfo.txt - contains information about all samples sequenced on a particular run, along with the job each sample belongs to
    • columns (tab-delimited) are job_name, job_id, sample_name, sample_id, date_string
    • column names are in a header line
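
To get a feel for the two layouts described above, here is a minimal sketch. It does not touch the real files: it builds tiny stand-in copies in a temporary directory, and every value in them (jobA, sample1, the IDs, the dates) is made up for illustration. The variable names (demo_dir, job_names, and so on) are likewise just for this example. Only the column layout follows the description above.

```bash
# Made-up miniature stand-ins for the two files; the real ones live in ~/data.
# All job names, sample names, IDs and dates below are hypothetical.
demo_dir=$(mktemp -d)
printf 'jobA\tsample1\njobA\tsample2\njobB\tsample3\n' > "$demo_dir/joblist.txt"
printf 'job_name\tjob_id\tsample_name\tsample_id\tdate_string\n' > "$demo_dir/sampleinfo.txt"
printf 'jobA\t101\tsample1\t5001\t2020-01-15\n' >> "$demo_dir/sampleinfo.txt"
printf 'jobB\t102\tsample3\t5003\t2020-01-15\n' >> "$demo_dir/sampleinfo.txt"

# joblist.txt has no header, so every line is a job name/sample name pair.
# cut splits on tabs by default; sort -u keeps each job name once.
job_names=$(cut -f 1 "$demo_dir/joblist.txt" | sort -u)
echo "$job_names"

# sampleinfo.txt has a header line: look at it, then skip it with tail -n +2
# before pulling out the third column (sample_name).
header=$(head -n 1 "$demo_dir/sampleinfo.txt")
echo "$header"
sample_names=$(tail -n +2 "$demo_dir/sampleinfo.txt" | cut -f 3)
echo "$sample_names"

# Count the tab-delimited columns in the header line.
n_cols=$(echo "$header" | awk -F '\t' '{ print NF }')
echo "$n_cols"
```

The same commands should work on the real files by replacing "$demo_dir" with ~/data. The key difference to keep in mind is that sampleinfo.txt needs its header line skipped before processing the data rows, while joblist.txt does not.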

...