4. Working With Files and Directories

Creating directories

We now know how to explore files and directories, but how do we create them in the first place?

Step one: see where we are and what we already have

Let’s go back to our data-shell directory on the Desktop and use ls -F to see what it contains:

Create a directory

Let’s create a new directory called thesis using the command mkdir thesis (which has no output):

As you might guess from its name, mkdir means “make directory”. Since thesis is a relative path (i.e., it does not start with a leading slash, as an absolute path such as /what/ever/thesis does), the new directory is created in the current working directory:

Since we’ve just created the thesis directory, there’s nothing in it yet:
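The steps above can be sketched as a short session. As an assumption made only for this sketch, it runs in a throwaway directory created with mktemp -d instead of data-shell, so it is safe to try anywhere; the variable name scratch is hypothetical.

```shell
# Work in a throwaway directory so nothing in data-shell is touched.
scratch=$(mktemp -d)
cd "$scratch"

mkdir thesis      # prints nothing on success
ls -F             # thesis/  (the trailing slash marks a directory)
ls -F thesis      # prints nothing: the new directory is empty
```

In the lesson itself, the same mkdir and ls commands would be typed from inside data-shell.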

Create a text file

Let’s change our working directory to thesis using cd, then run a text editor called Nano to create a file called draft.txt.

On a Koa compute node do this first:

(The login nodes already provide the editors nano, vim, and emacs, so the module load above can be skipped there.)

Let’s type in a few lines of text.

Once we’re happy with our text, we can press Ctrl+O (press the Ctrl or Control key and, while holding it down, press the O key) to write our data to disk (we’ll be asked what file we want to save this to: press Return to accept the suggested default of draft.txt).


Once our file is saved, we can use Ctrl+X to quit the editor and return to the shell.

nano doesn’t leave any output on the screen after it exits, but ls now shows that we have created a file called draft.txt:

Creating Files a Different Way

We have seen how to create text files using the nano editor.

Now, try the following command:
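The command for this exercise is not reproduced in this text. As an assumption, one standard way to create a file without opening an editor is touch, which creates an empty file if none exists under that name. A minimal sketch, using a scratch directory and the hypothetical file name my_file.txt:

```shell
# Assumption: touch stands in for the command this exercise refers to.
scratch=$(mktemp -d)
cd "$scratch"

touch my_file.txt     # my_file.txt is a hypothetical name
ls -l my_file.txt     # the size column shows 0: the file exists but is empty
```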

Moving files and directories

First, let’s return to the data-shell directory.

In our thesis directory we have a file draft.txt which isn’t a particularly informative name, so let’s change the file’s name using mv, which is short for “move”:

The first argument tells mv what we’re “moving”, while the second is where it’s to go. In this case, we’re moving thesis/draft.txt to thesis/quotes.txt, which has the same effect as renaming the file. Sure enough, ls shows us that thesis now contains one file called quotes.txt:

One has to be careful when specifying the target file name, since mv will silently overwrite any existing file with the same name, which could lead to data loss. An additional option, mv -i (or mv --interactive), can be used to make mv ask you for confirmation before overwriting.
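To make both points concrete, here is a small sketch of a rename followed by a silent overwrite. It runs in a scratch directory rather than data-shell, and the file contents and the second file, notes.txt, are hypothetical stand-ins introduced only for illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir thesis
echo "draft text" > thesis/draft.txt    # stand-in content

# Rename: same directory, new name.
mv thesis/draft.txt thesis/quotes.txt
ls thesis                               # quotes.txt

# Silent overwrite: notes.txt is a hypothetical second file.
echo "important notes" > thesis/notes.txt
mv thesis/quotes.txt thesis/notes.txt   # no warning, no prompt
cat thesis/notes.txt                    # draft text ("important notes" is gone)
```

Typing mv -i for the second move instead would make mv ask before replacing notes.txt.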

Note that mv also works on directories.

Let’s move quotes.txt into the current working directory. We use mv once again, but this time we’ll just use the name of a directory as the second argument to tell mv that we want to keep the filename, but put the file somewhere new. (This is why the command is called “move”.) In this case, the directory name we use is the special directory name . that we mentioned earlier.

The effect is to move the file from the directory it was in to the current working directory. ls now shows us that thesis is empty:

Further, ls with a filename or directory name as an argument only lists that file or directory. We can use this to see that quotes.txt is still in our current directory:
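A minimal sketch of this move, again in a scratch directory with stand-in file contents (both are assumptions for illustration, not part of data-shell):

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir thesis
echo "draft text" > thesis/quotes.txt   # stand-in content

mv thesis/quotes.txt .    # . is the current working directory
ls thesis                 # prints nothing: thesis is now empty
ls quotes.txt             # quotes.txt
```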

Moving to the Current Folder

Recall that .. refers to the parent directory (i.e. one above the current directory) and that . refers to the current directory.

Copying files and directories

The cp command works very much like mv, except it copies a file instead of moving it. After a copy, we can check the result using ls with two paths as arguments; like most Unix commands, ls can be given multiple paths at once:

We can also copy a directory and all its contents by using the recursive option -r, e.g. to back up a directory:

We can check the result by listing the contents of both the thesis and thesis_backup directories:
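Both kinds of copy can be sketched as follows. The scratch directory, the file contents, and the target name quotations.txt are assumptions made for this example; thesis and thesis_backup follow the names used in the text.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir thesis
echo "draft text" > quotes.txt          # stand-in content

cp quotes.txt thesis/quotations.txt     # quotations.txt is a hypothetical name
ls quotes.txt thesis/quotations.txt     # both paths are listed: the original is still there

cp -r thesis thesis_backup              # copy the directory and everything in it
ls thesis thesis_backup                 # each listing shows quotations.txt
```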

Moving and Copying

Removing files and directories

Returning to the data-shell directory, let’s tidy up this directory by removing the quotes.txt file we created. The Unix command we’ll use for this is rm (short for ‘remove’):

We can confirm the file has gone using ls:

If we try to remove the thesis directory using rm thesis, we get an error message:

This happens because rm by default only works on files, not directories.

rm can remove a directory and all its contents if we use the recursive option -r, and it will do so without any confirmation prompts:

Given that there is no way to retrieve files deleted using the shell, rm -r should be used with great caution (you might consider adding the interactive option rm -r -i).
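The whole sequence can be tried safely in a scratch directory. In this sketch the files and their contents are stand-ins, and the echo messages are ours, not the error text that rm or ls prints.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir thesis
echo "draft text" > quotes.txt             # stand-in content
echo "draft text" > thesis/quotations.txt  # hypothetical file inside thesis

rm quotes.txt
ls quotes.txt 2>/dev/null || echo "quotes.txt is gone"

# Plain rm refuses to remove a directory and exits with an error.
rm thesis 2>/dev/null || echo "rm refused: thesis is a directory"

rm -r thesis      # removes thesis and its contents without asking
```

Typing rm -r -i thesis instead would ask for confirmation at each step, which is the safer habit while learning.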

Copy with Multiple Filenames

For this exercise, you can test the commands in the data-shell/data directory.

Using wildcards for accessing multiple files at once

Note that * is a wildcard, which matches zero or more characters.

Let’s consider the data-shell/molecules directory:

  1. *.pdb matches ethane.pdb, propane.pdb, and every file that ends with ‘.pdb’.
  2. On the other hand, p*.pdb only matches pentane.pdb and propane.pdb, because the ‘p’ at the front only matches filenames that begin with the letter ‘p’.

The character ? is also a wildcard, but it matches exactly one character.

So ?ethane.pdb matches only methane.pdb, whereas *ethane.pdb matches both ethane.pdb and methane.pdb.

Wildcards can be used in combination with each other.

For example, ???ane.pdb matches any three characters followed by ane.pdb, giving cubane.pdb, ethane.pdb, and octane.pdb.
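These patterns can be checked with echo, which simply prints whatever arguments the shell hands it. The sketch below assumes a scratch directory holding empty stand-in files named after the six molecules mentioned above; the real molecules directory may contain other files as well.

```shell
scratch=$(mktemp -d)
cd "$scratch"
# Empty stand-ins named after the files mentioned for the molecules directory.
touch cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb

echo *.pdb          # all six files
echo p*.pdb         # pentane.pdb propane.pdb
echo ?ethane.pdb    # methane.pdb
echo *ethane.pdb    # ethane.pdb methane.pdb
echo ???ane.pdb     # cubane.pdb ethane.pdb octane.pdb
```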

When the shell sees a wildcard, it expands the wildcard to create a list of matching filenames before running the command that was asked for. As an exception, if a wildcard expression does not match any file, Bash passes the expression to the command unchanged. For example, typing ls *.pdf in the molecules directory (which contains only files with names ending in .pdb) results in an error message saying that there is no file called *.pdf. In general, though, commands like wc and ls see the lists of filenames matching these expressions, not the wildcards themselves. It is the shell, not the other programs, that deals with expanding wildcards, and this is another example of orthogonal design.
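One way to see what a command actually receives is to let the shell count the arguments after expansion. This is a sketch under two assumptions: a scratch directory with two stand-in .pdb files, and Bash (or another POSIX shell) with its default settings, where an unmatched pattern is left as it is.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch ethane.pdb methane.pdb

# set -- replaces the positional parameters with the expanded arguments,
# so $# counts what a command would actually receive.
set -- *.pdb
echo "$# arguments: $*"     # 2 arguments: ethane.pdb methane.pdb

set -- *.pdf
echo "$# arguments: $*"     # 1 arguments: *.pdf

# ls receives the literal string *.pdf and cannot find a file with that name.
ls *.pdf 2>/dev/null || echo "ls found no file called *.pdf"
```

Shells configured differently (for example, Bash with the nullglob option set) treat the unmatched pattern in another way, so the second half of this sketch depends on the default behaviour.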

List filenames matching a pattern

More on Wildcards

Organizing Directories and Files

Reproduce a folder structure