...
Code Block |
---|
|
# copy a small.fq file into a new ~/gzips directory
cd; mkdir -p ~/gzips
cp -p /stor/work/CCBB_Workshops_1/misc_data/fastq/small.fq ~/gzips/
cd ~/gzips
ls -lh # small.fq is 66K (~66,000) bytes long
wc -l small.fq # small.fq is 1000 lines long |
...
The gunzip command does the reverse: it decompresses the file, replacing it with a file of the same name without the .gz extension. gzip -d (decompress) does the same thing.
Code Block |
---|
|
# decompress the small.fq.gz file in place, producing the small.fq file
gunzip small.fq.gz
# or
gzip -d small.fq.gz |
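To see the in-place behavior without touching the workshop data, here is a minimal sketch that uses a scratch directory. The directory and the file name demo.txt are made up for illustration and are not part of the course files.

```shell
# Work in a throwaway directory (hypothetical demo files, not course data)
demo_dir=$(mktemp -d)
cd "$demo_dir"
printf 'line %s\n' 1 2 3 > demo.txt   # small 3-line sample file

gzip demo.txt         # replaces demo.txt with demo.txt.gz
ls                    # only demo.txt.gz is listed
gunzip demo.txt.gz    # replaces demo.txt.gz with demo.txt
ls                    # only demo.txt is listed
```

Because each command replaces its input file, only one of the two forms exists on disk at any time.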
Both gzip and gunzip also have a -c (or --stdout) option that tells the command to write to standard output, leaving the original file unchanged.
Code Block |
---|
|
cd ~/gzips # change into your ~/gzips directory
ls small.fq # make sure you have an uncompressed "small.fq" file
gzip -c small.fq > sm2.fq.gz # compress the "small.fq" into a new file called "sm2.fq.gz"
gunzip -c sm2.fq.gz > sm3.fq # decompress "sm2.fq.gz" into a new "sm3.fq" file
ls -lh |
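One way to convince yourself that -c leaves its input alone, and that the round trip is lossless, is to compare the restored file with the original. This is a sketch on scratch files whose names (orig.txt, copy.txt.gz, restored.txt) are invented for the example.

```shell
# Scratch directory with hypothetical demo files
demo_dir=$(mktemp -d)
cd "$demo_dir"
printf 'read_%s\n' 1 2 3 4 > orig.txt

gzip -c orig.txt > copy.txt.gz         # orig.txt is left in place
gunzip -c copy.txt.gz > restored.txt   # copy.txt.gz is left in place

ls                                     # all three files exist
# cmp is silent and returns success when the files are identical
cmp orig.txt restored.txt && echo "round trip OK"
```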
Both gzip and gunzip can also accept data on standard input. In that case, the output is always on standard output.
Code Block |
---|
|
cd ~/gzips # change into your ~/gzips directory
ls small.fq # make sure you have an uncompressed "small.fq" file
cat small.fq | gzip > sm4.fq.gz |
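Since both commands read standard input when no file is named, they can sit in the middle of a pipeline. The sketch below assumes a scratch directory and a made-up numbers.txt file generated with seq.

```shell
# Scratch directory with a hypothetical 100-line file
demo_dir=$(mktemp -d)
cd "$demo_dir"
seq 1 100 > numbers.txt

cat numbers.txt | gzip > numbers.gz    # compress data arriving on standard input
cat numbers.gz | gunzip | wc -l        # decompress from standard input, then count lines

# compress and decompress entirely within one pipeline; no files are written
head -50 numbers.txt | gzip | gunzip | tail -1
```

The last command prints the 50th line of numbers.txt, showing that the data passes through the compress/decompress pair unchanged.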
The good news is that most bioinformatics programs can accept data in compressed gzipped format. But how do you view these compressed files?
...
Code Block |
---|
|
cd ~/gzips
cat ../jabberwocky.txt | gzip > jabber.gz # make a compressed copy of the "jabberwocky.txt" file
less jabber.gz # use 'less' to view the compressed "jabber.gz" file
# (type 'q' to exit)
zcat jabber.gz | wc -l # count lines in the compressed "jabber.gz" file
zcat jabber.gz | tail -4 # view the last 4 lines of the "jabber.gz" file
zcat jabber.gz | cat -n # view "jabber.gz" text with line numbers
# (zcat does not have an -n option)
|
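zcat behaves like gunzip -c, so the same pipelines can also be written with gzip -cd if zcat is not available on a system. A minimal sketch on a made-up words.gz file:

```shell
# Scratch directory with a hypothetical compressed word list
demo_dir=$(mktemp -d)
cd "$demo_dir"
printf 'alpha\nbeta\ngamma\ndelta\n' | gzip > words.gz

gzip -cd words.gz | wc -l      # count lines without writing a decompressed file
gzip -cd words.gz | head -2    # view the first two lines
gzip -cd words.gz | grep -c a  # count the lines that contain the letter "a"
```

In each case the compressed file stays on disk unchanged; only the stream sent down the pipe is decompressed.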
Exercise 2-2
Display lines 7 - 9 of the compressed "jabber.gz" text
Expand |
---|
|
zcat jabber.gz | cat -n | tail -n +7 | head -3
- or -
zcat jabber.gz | cat -n | head -9 | tail -3 |
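As a cross-check on the two answers, the same line range can also be selected with sed -n '7,9p', which prints only lines 7 through 9. This sketch runs all three approaches on a made-up 12-line demo.gz file rather than on jabber.gz, and uses gzip -cd in place of zcat.

```shell
# Scratch directory with a hypothetical 12-line compressed file
demo_dir=$(mktemp -d)
cd "$demo_dir"
seq 1 12 | sed 's/^/verse /' | gzip > demo.gz

gzip -cd demo.gz | cat -n | tail -n +7 | head -3   # start at line 7, keep 3 lines
gzip -cd demo.gz | cat -n | head -9 | tail -3      # keep first 9 lines, then the last 3
gzip -cd demo.gz | cat -n | sed -n '7,9p'          # print only lines 7 through 9
```

All three pipelines should print the same three numbered lines.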
Working with 3rd party program I/O
...