...

The output looks like this, where the hexadecimal 0x09 character is a Tab.

We will also use two data files from the automated processing pipeline at the GSAF (Genome Sequencing and Analysis Facility), which delivers sequencing data to customers. These files have information about customer Samples (libraries of DNA molecules to sequence on the machine), grouped into sets assigned as Jobs, and sequenced on the GSAF's sequencing machines as part of sequencer Runs.

...

A regular expression (regex) is a pattern of characters to search for and metacharacters that control and modify how matching is done.

The Intro Unix: Some Linux commands: Regular expressions section lists a nice set of "starter" metacharacters. Open that page now as a reference for this section.

...

  • -n tells perl to feed the input one line at a time (here 4 lines)
  • -e introduces the perl script
    • Always enclose a command-line perl script in single quotes to protect it from shell evaluation
    • perl has its own set of metacharacters that are different from the shell's
  • $_ is a built-in Perl variable holding the current line (including any invisible line-ending characters)
  • ~ is the perl pattern matching operator
    • =~ tests that the pattern matches
    • !~ tests that the pattern does not match
  • the forward slashes ("/  /") enclose the regex pattern
  • the pattern matching operation returns true or false, to be used in a conditional statement
    • here "print current line if the pattern matches"

...

Use perl pattern matching to count the number of Runs in joblist.txt that were not run in 2015.

Hint:


Code Block
languagebash
wc -l ~/data/joblist.txt
cat ~/data/joblist.txt | \
  perl -ne 'print $_ if $_ !~/SA15/;' | wc -l

# Of the 3841 entries in joblist.txt, 3088 were not run in 2015


...

Code Block
languagebash
for number in `seq 5`; do
  echo $number
done

for num in $(seq 5); do echo $num; done

Quotes matter

In the Review of some basics: Quoting in the shell section, we saw that double quotes allow the shell to evaluate certain metacharacters in the quoted text.
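As a quick reminder of the difference, here is a small sketch. The variable name and its value are hypothetical, chosen only to show how the same text behaves under each kind of quote.

```shell
name="joblist"             # hypothetical variable, for illustration only

double="File: $name.txt"   # double quotes: the shell expands $name
single='File: $name.txt'   # single quotes: the text is kept literally

echo "$double"
echo "$single"
```

The first echo shows the variable's value substituted into the text, while the second shows the literal characters $name.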

...

Hint:

Here's the weird bash syntax for arithmetic (integer arithmetic only!):

Code Block
languagebash
n=0
n=$(( $n + 5 ))
echo $n
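To see what "integer arithmetic only" means in practice, here is a short sketch with made-up values. Division discards any fractional part, and the % operator gives the remainder. Inside $(( )) variable names can be written with or without the leading $.

```shell
a=7
b=2
quotient=$(( a / b ))     # integer division truncates the fractional part
remainder=$(( a % b ))    # modulo gives what is left over
echo "$a / $b = $quotient remainder $remainder"
```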


...

In addition to the methods of writing multi-line text discussed in Intro Unix: Writing text: Multi-line text, there's another one that can be useful for composing a large block of text for output to a file. This is done using the heredoc syntax to define a block of text between two user-supplied block delimiters, sending the text to a specified command.
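The general shape of a heredoc can be sketched as follows. The EOF delimiter, the two lines of text, and the temporary output file are all arbitrary choices made for this illustration; any delimiter word works as long as the opening and closing ones match.

```shell
outfile=$(mktemp)    # temporary file, used here only for illustration

# Everything between the two EOF delimiters is sent to cat's standard input,
# and cat's output is redirected to the file
cat > "$outfile" <<EOF
This is line 1
This is line 2
EOF

cat "$outfile"
```

The closing delimiter must appear on a line by itself, with no leading spaces.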

...