About File Systems 

So far all the files we've been dealing with have been in your Home directory, which is your personal directory. All Unix systems will provide you with a Home directory of your own.

Files and directories in your Home directory are part of the overall file system hierarchy – a tree of directories and their files/sub-directories.

To see the partial file system hierarchy of your Home directory, you can use the tree command. (Note that tree is not always available on Linux systems; it is an add-on tool that must be installed separately).
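Since tree may be missing, here is a minimal sketch of a fallback. The function name show_tree is made up for this illustration, and the two-level depth limit (tree -L 2, find -maxdepth 2) is an assumption chosen to keep the output short. The find command is a standard tool that is not otherwise covered here.

```shell
# show_tree is a hypothetical helper name, not a standard Unix command.
# It lists a directory with tree when tree is installed, and with find otherwise.
show_tree() {
    if command -v tree >/dev/null 2>&1; then
        tree -L 2 "$1"          # -L 2 limits the display to two levels
    else
        find "$1" -maxdepth 2   # plain list of paths, no drawing characters
    fi
}
```

After defining it, show_tree ~ would list the top two levels of your Home directory either way.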

Just calling tree will produce output similar to this:

This display shows the hierarchical structure of files and sub-directories. At the top level, there are a number of files that we pre-positioned for you (haiku.txt, mobydick.txt, etc.) along with some we created (out.txt, newfile.txt, etc.). There are also two sub-directories, bedfiles and docs, each with several files in them. Notice how tree colors the directories differently to make them stand out.

Of course there are other areas of the file system hierarchy that tree can display. For example, the /stor/work/CBRS_unix directory:

tree /stor/work/CBRS_unix

This produces output similar to this:

There are two sub-directories under the /stor/work/CBRS_unix directory: fastq and unix, each with its own sub-directories and files. Again, notice how tree colors files with the extension .gz differently. These are FASTQ files produced by a Next Generation Sequencing (NGS) run in our Genome Sequencing and Analysis (GSAF) core facility.

Note that in Unix, and on Macs, directories are separated by forward slash ( / ) characters, unlike Windows where the backslash ( \ ) is used. The root of the entire file system hierarchy is that first forward slash.

Finding file systems

So if you're on a new system, how do you know what file systems are available to you? The df (disk free) command lists all the available file systems. As its name suggests, it also shows file system sizes, so it's a good idea to use df -h (human readable) to see sizes in a more readable form. There can also be a great many file systems on any given system, so pipe the output to more to page through it.

df -h | more

This produces a rather busy display like this one:

Fortunately, you can ignore most of it. Focus on the Mounted on and Size columns.

  • / (forward slash), under the Mounted on column:
    • / (forward slash) is the root of the file system where the operating system is installed
    • Note its Size is 98G with 47G used
  • Look for Mounted on entries with large Size numbers:
    • G for Gigabytes (about 10^9 bytes), T for Terabytes (about 10^12 bytes), P for Petabytes (about 10^15 bytes)
  • Ignore file systems with names like /run, /dev, /snap, /sys, /boot, /tmp, /var – these are system related
  • Here we see a number of file systems whose names start with /stor (/stor, /stor/home, /stor/work, etc.)
    • Note the large Size of /stor: 39T, a sign that it's a file system you want to know about.
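If the full listing is more than you need, df also accepts a file or directory name and then reports only the file system that holds it. A short sketch, assuming your Home directory and the root directory are the ones you care about:

```shell
# Report only the file system that contains your Home directory
df -h ~

# Report only the root file system
df -h /
```

Each command prints the same column headings as the full listing, followed by a single file system line.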

Here's a similar listing from the Lonestar5 compute cluster at TACC:

Here the big important file systems are /home1 (7.0T), /scratch (8.1P) and /work (6.8P). There's also /admin (3.5T) but its name suggests that normal users won't be able to access it.

Navigating the file system

Now that we know there are other places, how do we get there? Enter the cd (change directory) command:

  • cd <optional directory_name>
    • with no argument, always changes to your Home directory.

There are also some "special" built-in directory names:

  • ~ (tilde) means your Home directory
  • . (single period) means the current directory
  • .. (two periods) means the parent of the current directory (directory above it)
    • So ls .. means "list contents of the parent directory"

So these two expressions do the same thing – take you to your Home directory from wherever you are in the file system.

cd
cd ~
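Here is a short sketch that puts cd and the special directory names together. It assumes /tmp exists on your system (it does on most Unix systems) and uses pwd (print working directory), a standard command that shows where you currently are.

```shell
cd /tmp     # go to a directory outside your Home directory
pwd         # show the current directory
cd ..       # move up to the parent of the current directory
pwd
ls .        # list the contents of the current directory
cd ~        # return to your Home directory
pwd
```

Running pwd after each cd is a simple way to confirm that you ended up where you intended.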



Pathname wildcards ("globbing")

Since another goal of Unix is to type as little as possible, there are several metacharacters that serve as wildcards to represent sets of characters when typing file names. Using these metacharacters is called globbing (don't ask me why) and the pattern is called a glob.

  • asterisk ( * ) is the most common filename wildcard. It matches any number of any characters, including none.
  • brackets ( [ ] ) match any single character listed between the brackets.
    • and you can use a hyphen ( - ) to specify a range of characters (e.g. [A-G])
  • braces ( {  } ) enclose a list of comma-separated strings to match (e.g. {dog,pony})

And wildcards can be combined. Some examples:

ls *.txt        # lists all files with names ending in ".txt"
ls [a-z]*.txt   # does the same but only lists files starting with a lowercase letter
ls [ABhi]*      # lists all filenames whose 1st letter is A, B, h, or i
ls *.{txt,tsv}  # lists filenames ending in either .txt or .tsv
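To experiment safely, you can preview what a glob expands to with echo before handing it to another command. The sketch below works in a throwaway directory created with mktemp -d; the file names Notes.txt and data.tsv are made up for this illustration. The braces pattern assumes the bash shell, since some simpler shells do not expand braces.

```shell
# Work in a throwaway directory so no real files are touched
scratch=$(mktemp -d)
cd "$scratch"
touch haiku.txt mobydick.txt Notes.txt data.tsv

echo *.txt         # every name ending in .txt
echo [hm]*.txt     # names that start with h or m and end in .txt
echo [A-Z]*        # names whose first character falls in the range A-Z
echo *.{txt,tsv}   # names ending in .txt or .tsv (needs bash)
```

The shell expands each pattern into a list of matching names before echo runs, which is exactly what happens when the same pattern is given to ls.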

Exercise 4-6

Design a wildcard that will match the files haiku.txt and mobydick.txt but not jabberwocky.txt.

Answer

There are always multiple ways of doing things in Unix. Here are two possible answers:

ls [hm]*.txt
ls {haiku,mobydick}.txt
