Files and File systems
First, let's review Intro Unix: Files and File Systems. The most important takeaways are:
- Understanding the tree-like structure of directories and files in the file system hierarchy
- Absolute paths start with a slash ( / ), the root of the file system hierarchy
- More at: Intro Unix: Files and File Systems: The file system hierarchy
- Knowing how to navigate the file system using the cd (change directory) command, Tab key completion, and relative path syntax:
- use the dot ( . ) metacharacter for the current directory
- use the dot-dot ( .. ) metacharacters for the parent directory
- More at:
- Selecting multiple files using pathname wildcards (a.k.a. "globbing")
- asterisk ( * ) matches any number of characters, including none
- brackets ( [ ] ) match any single character listed between the brackets, including hyphen ( - ) delimited character ranges such as [A-G]
- braces ( { } ) enclose a list of comma-separated strings to match (e.g. {dog,pony})
- More at: Intro Unix: Files and File Systems: Pathname wildcards
- A basic understanding of file attributes such as
- file type (file, directory)
- owner and group
- permissions (read, write, execute) for the owner, group and everyone
- More at: Intro Unix: Files and File Systems: File attributes
- Familiarity with basic file manipulation commands (mkdir, cp, mv, rm)
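The pathname wildcards reviewed above can be tried safely in a scratch directory. The following is a minimal sketch, assuming a bash shell (brace expansion is not available in every shell); the file names are made up for illustration.

```shell
# Work in a throwaway directory so the demo touches nothing else
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch dog.txt pony.txt cat.txt A1.dat B2.dat H3.dat   # hypothetical file names

echo *.txt            # asterisk: prints cat.txt dog.txt pony.txt
echo [A-G]*.dat       # bracket range: prints A1.dat B2.dat (H3.dat is outside A-G)
echo {dog,pony}.txt   # braces (bash): expands to dog.txt pony.txt
```

Using echo in front of a wildcard is a cheap way to preview which files a pattern selects before passing it to cp, mv or rm.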
Working with remote files
...
Tip |
---|
When transferring files between your computer and a remote server, you always need to execute the command on your local computer. This is because your personal computer does not have an entry in the global hostname database (a.k.a. the Domain Name Service, or DNS), whereas the remote computer does. The global DNS database maps full host names to their IP (Internet Protocol) address. Computers that can be accessed from anywhere on the Internet have their host names registered in DNS. |
wget (web get)
The wget <url> command lets you retrieve the contents of a valid Internet URL (e.g. http, https, ftp).
...
- ln -s <path> says to create a symbolic link (symlink) to the specified file (or directory) in the current directory
- always use the -s option to avoid creating a hard link, which behaves quite differently
- the default link name corresponds to the last name component in <path>
- you can name the link file differently by supplying an optional link_file_name.
- it is best to change into (cd) the directory where you want the link before executing ln -s
- a symbolic link can be deleted without affecting the linked-to file
- the -f (force) option says to overwrite any existing symbolic link with the same name
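The ln -s behaviors listed above can be seen in a small scratch layout. This is only a sketch: the data and work directory names and the sample.txt file are invented for the demonstration.

```shell
# Scratch layout: "data" holds the real file, "work" is where the links go
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/data" "$demo_dir/work"
echo "hello" > "$demo_dir/data/sample.txt"

cd "$demo_dir/work"                    # change into the directory that should hold the links
ln -s ../data/sample.txt               # link name defaults to sample.txt
ln -s ../data/sample.txt renamed.txt   # explicit link file name
ln -sf ../data/sample.txt renamed.txt  # -f replaces the existing link instead of failing

cat renamed.txt                        # reads through the link to the real file
rm sample.txt                          # deletes only the link
ls -l . ../data                        # the linked-to file is still in ../data
```

Without -f, the third ln call would stop with a "File exists" error because renamed.txt was already created by the previous line.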
...
- find returns a list of matching file paths on its standard output
- ln wants its files listed as arguments, not on standard input
- so the paths are piped to the standard input of xargs
- xargs takes the data on its standard input and calls the specified command (here ln -sf -t .) with that data as the command's argument list.
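The find, xargs and ln pipeline described above can be rehearsed on dummy files before pointing it at real data. This sketch assumes GNU ln (the -t option is a GNU extension) and paths without spaces, since xargs splits its input on whitespace by default; the directory and file names are hypothetical.

```shell
# Build a small source tree with two matching files and one non-matching file
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/src/run1" "$demo_dir/src/run2" "$demo_dir/links"
touch "$demo_dir/src/run1/a.fastq.gz" "$demo_dir/src/run2/b.fastq.gz" "$demo_dir/src/run2/notes.txt"

cd "$demo_dir/links"
# find writes one matching path per line to standard output;
# xargs appends those paths to "ln -sf -t ." as arguments
find "$demo_dir/src" -name "*.fastq.gz" | xargs ln -sf -t .
ls -l   # shows symlinks a.fastq.gz and b.fastq.gz, but no notes.txt
```

The -t . option names the target directory up front, which is what allows xargs to append the list of paths at the end of the command line.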
...
Display lines 7 - 9 of the compressed "jabber.gz" text
Expand | ||
---|---|---|
| ||
zcat jabber.gz | cat -n | tail -n +7 | head -3 |
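To check the line arithmetic without the course data, the same pipeline can be run on a generated stand-in file. This is a sketch under stated assumptions: the numbered "line N" content is invented and only reuses the jabber.gz name, and tail is called in its portable -n +7 form (start output at line 7).

```shell
# Create a stand-in compressed file with 20 numbered lines
demo_dir=$(mktemp -d)
cd "$demo_dir"
seq 1 20 | sed 's/^/line /' | gzip > jabber.gz

# cat -n numbers the lines, tail -n +7 starts at line 7, head -3 keeps three lines
zcat jabber.gz | cat -n | tail -n +7 | head -3
```

The numbers added by cat -n make it easy to confirm that exactly lines 7, 8 and 9 come out of the pipeline.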
...
The cutadapt adapter trimming command reads NGS sequences from a FASTQ file, and writes adapter-trimmed reads to a FASTQ file. Find its usage.
Expand | ||
---|---|---|
| ||
cutadapt # overview; tells you to run cutadapt --help for details
Note that it also points you to https://cutadapt.readthedocs.io/ for full documentation.
|
Where does cutadapt write its output by default? How can that be changed?
Expand | ||
---|---|---|
| ||
The cutadapt usage says that output can be written to a file using the -o option
But the brackets around [-o output.fastq] suggest this is optional. Reading a bit further we see:
This suggests output can be specified in 2 ways:
|
...
Expand | ||
---|---|---|
| ||
The cutadapt usage says an input.fastq file is a required argument:
But again, reading a bit further we see:
This says that the input.fastq file can be provided in one of three compression formats. And the usage also suggests input can be specified in 2 ways:
|
Where does cutadapt write its diagnostic output by default? How can that be changed?
Expand | ||
---|---|---|
| ||
The cutadapt usage doesn't say anything directly about diagnostics:
But again, reading in the Output: options section:
Careful reading of this suggests that: When
|