...

"Input" refers to any input file, directory, or string which the command will process or act upon. Some commands require an explicit input, while others have a default value that will be used if no input is provided.

Unix is entirely case-sensitive, meaning command names, options, and input files or directories must be capitalized correctly. Commands are generally written in all lowercase letters to make them easier to type and remember, and it is a good idea to make directory and file names all lowercase as well.

...

The root directory contains a number of system directories, one of which holds the home directories. There is one home directory for each user on the system, and when a user launches the terminal, the cwd is set to their individual home directory by default. The current user's home directory is represented by a tilde (~).
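A quick sketch of tilde expansion (any shell; the actual home path will vary by user):

```shell
# The shell expands ~ to the current user's home directory.
echo ~    # prints the same path as echo "$HOME"
cd ~      # change into the home directory
pwd       # confirm the new cwd
```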

Changing directories (cd command)

...

The mv command is also used to rename files. When used this way, the mv command "moves" a file from one path to another, even if that path is within the same directory. To rename a file, add the new name to the end of the target destination:

...

mv /path/to/source/file.abc /path/to/target/directory/newFile.abc
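As a runnable sketch of a rename (the scratch directory and file names below are made up for illustration):

```shell
# Create a scratch directory and an empty file to rename.
mkdir -p /tmp/mv_demo
touch /tmp/mv_demo/file.abc

# "Move" the file onto a new name within the same directory.
mv /tmp/mv_demo/file.abc /tmp/mv_demo/newfile.abc

# Only the new name remains.
ls /tmp/mv_demo
```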

Calculating file and directory size (du command)

The du (disk usage) command calculates the size of a file, directory, or set of directories. To calculate the size of an individual file, simply run du followed by the path to the file:

...

Use the -s option to calculate the total size of a directory and its contents:
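A minimal sketch of both forms on scratch data (paths are illustrative, and the reported block counts will vary by system):

```shell
# Scratch file to measure.
mkdir -p /tmp/du_demo
printf 'hello\n' > /tmp/du_demo/a.txt

du /tmp/du_demo/a.txt   # size of one file
du -s /tmp/du_demo      # one summary line for the whole directory
du -sh /tmp/du_demo     # -h adds human-readable units (K, M, G)
```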

Calculating free disk space (df command)

...

The df (disk free) command is used to calculate the amount of free space on a given volume. When run without any options or arguments, it will display the free space and disk usage of all mounted volumes, as well as the local disk:

Image Added

This can be somewhat overwhelming. To display the free space and disk usage of one volume in particular, run df followed by the path to the volume's mount point. The mount point is listed in the final column ("Mounted on") of the default df output. For example, if you have the dps volume mounted at /dps:

Image Added

Like du, df reports sizes in raw disk blocks by default. Use the -h option to produce a more human-readable display (K, M, G):

Image Added
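The same sequence can be sketched against the root filesystem (output differs per machine; the /dps mount in the screenshots above is specific to that system):

```shell
df        # free space and usage for every mounted filesystem
df /      # just the filesystem containing / (its mount point)
df -h /   # -h switches the same report to human-readable units
```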

Searching within files (grep command)

The grep command is used to search for a specific string within a plain text or CSV file. grep has many options including case-insensitive search, inverse search, and regular expression search. See /wiki/spaces/utldigitalstewardship/pages/43057645 for a full guide to using grep.
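A few of those options can be sketched on a small made-up file:

```shell
# Scratch file with three lines.
printf 'apple\nBanana\ncherry\n' > /tmp/grep_demo.txt

grep apple /tmp/grep_demo.txt       # plain string search
grep -i banana /tmp/grep_demo.txt   # -i: case-insensitive search
grep -v apple /tmp/grep_demo.txt    # -v: inverse search (non-matching lines)
grep -c a /tmp/grep_demo.txt        # -c: count matching lines instead
```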

Displaying file contents (cat, less, and head/tail commands)

There are several commands that can be used to display the contents of plain text or CSV files right in the terminal window. Each displays information in slightly different ways.

To output every line from a file, run the cat (concatenate) command, followed by the path to a text or CSV file:

Image Added

If you want to preview the contents of a file without necessarily outputting every line to the terminal, run the less command, followed by the path to a file. This will fill your terminal window with the contents of the file, stopping once the window is full. You can then scroll down (and back up) through the text file using the down and up arrow keys. Hit the q key to exit the less output screen.

To quickly output only the first lines in a file, run the head command, followed by the path to a file. By default, head will output the first 10 lines of the file, but you can also specify the number of lines to output using the -n option:

Image Added

The tail command works just like head, but it displays the last lines of the file. Just like head, you can specify the number of lines to output using -n:

Image Added
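All three commands can be sketched against one scratch file of numbered lines:

```shell
# 15 numbered lines to display.
seq 15 > /tmp/lines.txt

cat /tmp/lines.txt         # every line in the file
head /tmp/lines.txt        # first 10 lines (the default)
head -n 3 /tmp/lines.txt   # first 3 lines
tail -n 3 /tmp/lines.txt   # last 3 lines
```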

Terminal multiplexer (tmux command)

Any time you have to run a process that will take a very long time, it's a good idea to use the tmux (terminal multiplexer) command to avoid accidentally interrupting the process. Using tmux you can begin a process, "detach" from the session to close the terminal window without killing the process, and then "reattach" to the session later to review the results of your process. tmux also allows you to run several commands simultaneously, one per tmux session.

To launch a new tmux session, simply enter tmux. Whenever the active terminal is a tmux session, a green stripe will be visible along the bottom of the terminal window:

Image Added

The session number is the value shown on the left side of the stripe. The first session launched will be called 0, and each subsequent session will be assigned the next largest number.

To detach from the session without interrupting an active process or erasing the terminal history, enter Ctrl+b, followed by d. This will return you to the main terminal window. To reattach to a previous tmux session, run tmux attach -t, followed by the session number (remember that the first session will be number 0, not 1). If there is only one session open, tmux attach will attach to it. If you aren't sure whether there are any tmux sessions active, run tmux ls:

Image Added

To scroll up and down through the terminal output of a tmux session, enter Ctrl+b, followed by [ (left square bracket). This will enable the cursor and allow you to scroll using the arrow keys or page up and page down. Press q to return to the last line of the terminal.

To close a tmux session (and kill whatever processes are running within it!), attach to the session and enter Ctrl+d. This cannot be undone, so before doing this, be sure that your process has finished, and that you have saved whatever terminal output you need.

Redirecting output and command pipelines (stdin and stdout)

You may encounter situations where you need to manipulate or reuse the output from a particular command. For example, the output from a given command may not be formatted the way you need it, or you may want to save the output from a command to a text file that you can review later. The terminal allows you to do this using stdin and stdout.

Every process on the terminal has a standard input and a standard output, called stdin and stdout for short. stdin is the stream of data a command reads as its input; when you type a command such as "ls -1 /dps/david/temp" at the prompt, your keystrokes are the stdin of the shell. stdout is the output stream the command writes, which normally appears in the terminal. In the following example, the typed command is underlined in red, while the stdout is all the text contained within the green box:

Image Added

You can sometimes manipulate the format and structure of stdout using command options (such as "-1" in this example, which changes the output to a single directory or file per line), but that's still relatively limited, since the available options vary from one command to the next. Most of the time, if you need to manipulate or reuse a command's output, the best approach is to redirect stdout and/or pipe commands.

Redirecting stdout

The simplest way of redirecting stdout is to save it to a text file, which you can open, review, and manipulate using any text editor. To redirect stdout to a text file, use the > (greater than sign), followed by an output path. Think of it like an arrow pointing to an output file. In the following example, the stdout from "ls -1 /dps/david/temp" is saved as "/dps/david/tempcontents.txt":

Image Added

You can open that file with a text editor and see that its contents look just like the output we'd ordinarily expect the command to produce in the terminal:

Image Added

This redirection method will always create a new file at the output path you provide. If a file already exists at that location, its contents will be overwritten with the stdout from your command, so be careful not to redirect to the path of any important existing file.

If you want to append your results to an existing file without overwriting it, use >> instead of >. In the following example, the contents of /dps/david/temp/ are saved to /dps/david/tempcontents.txt (shown in red), then the contents of /dps/david/temp/pdfs/ (shown in green) are added to the end of the same file:

Image Added

Image Added
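A runnable sketch of both redirection forms, using made-up scratch paths:

```shell
# Two scratch directories, one file in each.
mkdir -p /tmp/redir_demo/a /tmp/redir_demo/b
touch /tmp/redir_demo/a/one.txt /tmp/redir_demo/b/two.txt

ls -1 /tmp/redir_demo/a >  /tmp/redir_demo/contents.txt   # > creates (or overwrites) the file
ls -1 /tmp/redir_demo/b >> /tmp/redir_demo/contents.txt   # >> appends to the end of it

cat /tmp/redir_demo/contents.txt   # both listings, one after the other
```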

Command pipelines

Another option is to redirect stdout straight into another command using a "pipeline" of commands. Pipelines take the stdout from one command and feed it directly into a second command to form part of that command's stdin. Commands are separated using the pipe character ( | ), located above the backslash on a US keyboard layout. Command pipelines proceed from one command to the next from left to right, and there is no limit to the number of commands that can be chained together.

For example, if you wanted to find the first 25 lines in /dps/david/temp/manifests.txt that contain "tif", you could run the following command:

grep tif /dps/david/temp/manifests.txt

If you actually ran this command, however, you'd find that it produced far more than 25 lines. It also takes much longer than necessary, since it has to run through the entire file rather than stopping after 25 matches like you want. You could always save these results as a text file, open the file, and delete everything after the 25th match, but that's still quite time consuming.

A better approach is to combine the grep and head commands using a pipeline to stop the process after the 25th match:

Image Added

In this example, the stdout from the grep command has been fed directly into the head command that follows it. Ordinarily, when you run head you have to point it to a specific file to be read as an input, but in a command pipeline, you can omit that piece in order to use the previous command's stdout as the input.
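The same pattern can be sketched with a small made-up manifest (the real /dps paths from the screenshots aren't reproduced here):

```shell
# Made-up manifest of file names.
printf 'a.tif\nb.jpg\nc.tif\nd.tif\ne.png\n' > /tmp/manifest.txt

# grep's stdout feeds head's stdin; head stops after 2 matching lines.
grep tif /tmp/manifest.txt | head -n 2
```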

Remember that there is no limit to the number of commands you can combine in a pipeline. If you needed to strip these results of the first two columns and leave only the file paths, you could add the cut command:

Image Added

The cut command receives the stdout from the previous head command and (using the -f3 option) cuts that list down to just the third column. If you then needed to count the number of characters in these 25 file paths, you could add the wc command:

Image Added

The wc command receives the stdout from the previous cut command and (using the -c option) counts the number of characters it contains. If you wanted to save this final output to a text file, you can always redirect the final stdout using > followed by an output path.
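The full pipeline can be sketched end-to-end on a made-up tab-separated manifest (the columns here are assumed to be id, size, and path, purely for illustration):

```shell
# Made-up manifest: three tab-separated columns per line.
printf '1\t100\t/data/a.tif\n2\t200\t/data/b.tif\n3\t300\t/data/c.jpg\n' > /tmp/man.tsv

grep tif /tmp/man.tsv | head -n 2 | cut -f3           # keep only the third column
grep tif /tmp/man.tsv | head -n 2 | cut -f3 | wc -c   # count characters in those paths
```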

Looping through a file (while read method)

If you need to open a text file and perform some action using the value stored in each line, you can use the while read method to construct a loop. The loop reads a line from a text file, stores the value of the line, performs some action using that value, and then moves on to the next line to repeat the process. To do this, you will need to provide a "while read" statement, the command(s) you want to run using each line, and the file containing the lines you want to act on.

For example, the file /dps/david/temp/msg_files.txt contains a list of .msg files stored within /dps/david/temp/test:

Image Added

Image Added

If you needed to calculate the character count for each one of these files, you could run wc -c on each file individually, or you could integrate that command into a while read loop:

while read -r line; do wc -c "$line"; done < /dps/david/temp/msg_files.txt

Image Added

Every while read loop is made up of three main elements, separated by semicolons:

  1. A "while read" statement (underlined in red in the above example)
    1. "while read -r" tells the terminal to read each line in an input file (that will be named at the end)
    2. "line" is the name of the variable that will be used as a stand-in for the actual value of each line in the file. This variable name is entirely arbitrary, and you can choose whatever variable name seems most logical to you, as long as you reference it correctly in the next part of the loop.
  2. A "do" statement (underlined in green)
    1. "do" tells the terminal to run a given command for each line in the input file
    2. "wc -c" is the command we have chosen to run in this case. Any terminal command can be incorporated into a for loop, using its original options and syntax requirements. You can also build command pipelines and redirect output as part of a for loop.
    3. "$line" (wrapped in quotes) is the name of the variable that was assigned in the first part of the loop, used here as the input for the "wc -c" command. If you changed the variable name to "abc" in the first part, you would change it to "$abc" in this part.
  3. A "done" statement (underlined in blue)
    1. "done" closes the loop
    2. "< /dps/david/temp/msg_files.txt" is the input file containing the lines to be read and stored as a variable in part one, then acted upon in part two. Think of < as an arrow feeding the contents of the text file into the preceding script.

In the above example, the terminal opens the input file indicated at the very end of the script, reads the first line and stores the value found there as "line", runs "wc -c" on that value, outputs the result, then moves on to the value in the next line. It does this until it reaches the end of the input text file, at which point the loop closes.

While the specific command(s) you run using a while read loop may be different from the above example, all such loops adhere to this basic structure.
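The structure can be sketched end-to-end with a made-up file list (scratch paths only, standing in for the /dps examples above):

```shell
# Two scratch files, then a list of their paths, one per line.
mkdir -p /tmp/loop_demo
printf 'alpha\n' > /tmp/loop_demo/a.txt
printf 'beta\n'  > /tmp/loop_demo/b.txt
ls /tmp/loop_demo/*.txt > /tmp/loop_list.txt

# Read each path from the list and run wc -c on it.
while read -r line; do wc -c "$line"; done < /tmp/loop_list.txt
```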

macOS: Hidden .DS_Store and ._AppleDouble files

macOS creates, among other hidden files, .DS_Store files and sometimes "AppleDouble" files on drives or network shares that are not formatted with an Apple file system.

.DS_Store files are created by the macOS Finder and store metadata about folders, including, for instance, Finder window size, position, and configuration. Apple used to provide a technical bulletin explaining the use of these files and how to prevent them from being created on network shares, but the support document has been removed. These files can typically be removed safely.
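A common cleanup approach (not from this page, offered as a sketch) is to use find to list and then delete these files; the scratch paths below are made up:

```shell
# Scratch tree with two fake .DS_Store files and one real file.
mkdir -p /tmp/ds_demo/sub
touch /tmp/ds_demo/.DS_Store /tmp/ds_demo/sub/.DS_Store /tmp/ds_demo/keep.txt

find /tmp/ds_demo -name .DS_Store           # list them first to review
find /tmp/ds_demo -name .DS_Store -delete   # then remove them
```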

"AppleDouble" files appear to be duplicates of other files in the same folder, prefixed with a dot and underscore (._). These files contain the so-called resource fork of a file, which is saved as a separate file on non-Apple file systems: https://en.wikipedia.org/wiki/AppleSingle_and_AppleDouble_formats. Exercise care when working with these files, as they may contain metadata that is not included in the data fork portion of the file.