...
One of the challenges of dealing with large data files, whether compressed or not, is finding your way around the data: locating and looking at the relevant pieces of it. Except for the smallest of files, you can't open them in a text editor, because those programs read the whole file into memory and will choke on sequencing data files! Instead we use various techniques to look at the files one piece at a time. (Read more about commands for Displaying file contents)
The first technique is the use of pagers – we've already seen this with the more command. Review its use now on our small uncompressed file:
...
Read more about head and tail in Displaying file contents.
zcat and gunzip -c tricks
...
    echo $((2368720 / 4))
Here's another trick: backtick evaluation. When you enclose a command expression in backtick quotes ( ` ), the enclosed expression is evaluated and its standard output is substituted into the string. (Read more about Quoting in the shell)
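As a minimal sketch of backtick evaluation, the snippet below builds a tiny made-up FASTQ file (two records, so eight lines) and substitutes the output of wc -l into a variable and into an arithmetic expression. The file contents and the variable names (demo_fastq, line_count, read_count) are assumptions for illustration only, not part of the course data.

```bash
# Hypothetical demo file: 2 FASTQ records = 8 lines (not the course data)
demo_fastq=`mktemp`
printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n' > "$demo_fastq"

# The command inside the backticks runs first; its standard output
# replaces the backtick expression before the assignment happens
line_count=`wc -l < "$demo_fastq"`
echo "The file has $line_count lines"

# Backticks can also feed shell arithmetic: lines / 4 = FASTQ records
read_count=$(( `wc -l < "$demo_fastq"` / 4 ))
echo "The file has $read_count reads"
```

Reading the file through input redirection (<) keeps the file name out of the wc output, so only the number is substituted.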
...
In the code below we pipe the output from wc -l (the number of lines in the FASTQ file) to awk, which executes its body (the statements between the curly braces, { }) for each line of input. Here the input is just one line, with one field: the line count. The awk body just divides the 1st input field ($1) by 4 and writes the result to standard output. (Read more about awk in Some Linux commands: awk)
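A pipeline of the kind described above might look like the following sketch. It assumes a small made-up FASTQ file with three records (12 lines) rather than the course data; the names demo_fastq and read_count are hypothetical.

```bash
# Hypothetical demo file: 3 FASTQ records = 12 lines (not the course data)
demo_fastq=`mktemp`
printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n@read3\nGGCC\n+\nIIII\n' > "$demo_fastq"

# wc -l reads standard input, so it prints only the line count (one field).
# awk runs its body once for that single input line and divides field 1 by 4.
cat "$demo_fastq" | wc -l | awk '{print $1 / 4}'

# The same pipeline captured with backticks so the result can be reused
read_count=`cat "$demo_fastq" | wc -l | awk '{print $1 / 4}'`
echo "Reads: $read_count"
```

Because wc -l is given no file name here, its output has exactly one field, which is why $1 is the line count.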
...