adapted from Software Carpentry, The Hacker Within and other wonderful resources
The shell is a command processor that allows us to interact with the operating system. A terminal lets us run a shell.
You may have heard of some variations of the shell: csh (C shell), tcsh (TENEX C shell), and bash (Bourne-again shell).
In this course we will focus on bash as it is the most common, but the commands are very similar across all the shells.
http://en.wikipedia.org/wiki/Bash_(Unix_shell)
ubiquity :
Working in scientific computing, you will come across a command line interface sooner rather than later. It is the most basic interface to most computer clusters and high performance hardware.
power :
Knowing shell commands gives you greater flexibility and options, and it gives you a method of creating reproducible scripts when working with data.
Unix philosophy: Make each program do one thing well
Let's start with some basics, such as: Where am I? How do I orient myself on the filesystem?
Use the pwd command:
pwd
Can we just print something to the terminal? Yep, use echo:
echo 'Hello'
Who am I? When you log in to your computer, the system knows who you are (your username).
Use the whoami command:
whoami
What is in this directory?
To find out the contents of your current directory, use the ls (short for list) command:
ls
cd by itself takes you to your home directory:
cd
pwd
Once we are in your home directory, let's make a new directory to hold our bootcamp files.
To do this we use mkdir, short for "make directory":
mkdir bootcamp
How do we move into this new directory? We use the "relative" path, and cd:
cd bootcamp
pwd
To move back one directory:
cd ../
pwd
pwd gives you the full path to your current directory. Now check with your neighbor: what is their full path?
The cd command takes an argument, which is the directory name. Directories can be specified using either a relative path or a full (absolute) path. The directories on the computer are arranged into a hierarchy. The absolute path tells you where a directory is in that hierarchy. Navigate to the home directory. Now, enter the pwd command and you should see something like:
/home/me
which is the full name of your home directory. This tells you that you are in a directory called me, which sits inside a directory called home, which sits inside the very top directory in the hierarchy. The very top of the hierarchy is a directory called /, which is usually referred to as the root directory. So, to summarize: me is a directory in home, which is a directory in /.
Now enter the following command:
cd /home/me/bootcamp
This jumps to the bootcamp directory. Now go back to the home directory. We saw earlier that the command:
cd bootcamp
had the same effect: it took us to the bootcamp directory. But, instead of specifying the absolute path (/home/me/bootcamp), we specified a relative path. In other words, we specified the path relative to our current directory. An absolute path always starts with a /. A relative path does not. You can usually use either an absolute path or a relative path, depending on which is more convenient. If we are in the home directory, it is more convenient to just enter the relative path since it involves less typing.
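To make the difference concrete, here is a minimal sketch that reaches the same directory by both routes. It works in a throwaway directory created with mktemp, so the names demo_root, via_relative and via_absolute are made up for illustration and are not part of the lesson files.

```shell
# Work in a throwaway directory so nothing in your home directory changes.
# (demo_root and its contents are made up for this sketch.)
demo_root=$(mktemp -d "${TMPDIR:-/tmp}/path_demo.XXXXXX")
mkdir "$demo_root/bootcamp"

# Relative path: "bootcamp" is looked up inside the current directory.
cd "$demo_root"
cd bootcamp
via_relative=$(pwd)

# Absolute path: starts with / so it works no matter where we are.
cd /
cd "$demo_root/bootcamp"
via_absolute=$(pwd)

# Both routes should report the same location.
echo "relative route: $via_relative"
echo "absolute route: $via_absolute"
```

The relative route only works because we first moved into the parent directory; the absolute route would work from anywhere.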
1. List the contents of the /bin directory. Do you see anything familiar in there?
2. How does it compare to your neighbor?
Use cd to go into your bootcamp directory and verify that is where you are.
We have moved around and looked at a few things; let's add some files.
Download this into your bootcamp directory and unzip it.
You can use commands in the shell to grab the file:
OSX:
curl -O http://swcarpentry.github.io/2014-04-14-wise/advanced/shell/shelldata.zip
Linux:
wget http://swcarpentry.github.io/2014-04-14-wise/advanced/shell/shelldata.zip
You can also use the unzip command to unzip the file:
unzip shelldata.zip
NOTE: If you cannot or do not want to do the bonus, just be sure you have downloaded the file into your bootcamp directory and unzipped it.
You will now have a new directory data that contains some files
move to the new data directory
list contents (How many files are there?)
move back into the bootcamp directory
Previously we used ls to look at the contents of our working directory. Use ls to specifically look at data
ls data
To make an empty file use the touch command:
touch emptyfile
Now check that it exists using ls
Sometimes we want to have more information about the contents of a directory. We can use an argument to ls to get more information.
Use ls -l in the bootcamp directory:
ls -l
What does this tell us about our files? This is what I get:
total 48
drwxrwxr-x 5 cindeem staff 170 Apr 10 15:48 data
-rw-r--r-- 1 cindeem staff 0 Apr 12 16:53 emptyfile
-rw-r--r-- 1 cindeem staff 23581 Apr 12 16:43 shelldata.zip
Permissions tell you:
if an item is a directory (the first character is d)
drwxrwxr-x 5 cindeem staff 170 Apr 10 15:48 data
or a file (the first character is -)
-rw-r--r-- 1 cindeem staff 0 Apr 12 16:53 emptyfile
The remaining nine characters are permissions: r for read, w for write, x for executable
They come in three groups of three: for owner, group, and everyone else
drwxrwxr-x 5 cindeem staff 170 Apr 10 15:48 data
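As a small sketch of how to read that first column, the block below creates one file and one directory in a throwaway location and prints just the ten-character type-and-permission string for each. The chmod command is not covered in this lesson; it is used here only so the result is predictable, and the names notes.txt and results are made up.

```shell
# Throwaway directory; nothing here touches your bootcamp files.
demo_dir=$(mktemp -d "${TMPDIR:-/tmp}/perm_demo.XXXXXX")
cd "$demo_dir"

touch notes.txt
mkdir results

# chmod (not covered in this lesson) sets permissions explicitly.
# 640 = owner read+write, group read, everyone else nothing.
chmod 640 notes.txt
# 755 = owner read+write+execute, group and everyone else read+execute.
chmod 755 results

# Keep only the first 10 characters: 1 for the type, 9 for the permissions.
file_mode=$(ls -ld notes.txt | cut -c1-10)
dir_mode=$(ls -ld results | cut -c1-10)

echo "$file_mode notes.txt"   # -rw-r----- notes.txt
echo "$dir_mode results"      # drwxr-xr-x results
```

Reading left to right: the leading - or d is the type, then three characters each for owner, group, and everyone else.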
Using ls -l:
Warning: If you are using gitbash, man will not work. gitbash has only implemented a limited number of commands. Share with a Linux or OSX friend?
So how do you find out about all the useful flags for a command?
man prints out usage for a given command:
man ls
To move around in man
Use man ls to find out how to list the contents of a directory sorted in order of time modified.
When might ls -h be useful?
Make sure you are in the bootcamp directory, and remove the empty file you created previously:
ls emptyfile
rm emptyfile
ls emptyfile
You should see this
ls: emptyfile: No such file or directory
Now try it with the data directory
ls
rm data
ls
What happened? Why? Why is this good?
Let's create a new file, but we are going to do something odd: we are going to add a . (dot or period) to the beginning of the filename:
touch .hiddenfile
If we use ls again this file will not show up. This is because of the leading . in its name.
However, we can see these files using the -a flag:
ls -a
We can now see the hidden file (.hiddenfile). Often these files are configuration files or temporary files, and in general you do not edit or work with them. But they will be useful to know about, and sometimes you do want to access them.
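The following sketch repeats the experiment in a throwaway directory so the two listings can be compared side by side. The file name visible.txt and the variable names are made up for illustration.

```shell
# Throwaway directory with one ordinary file and one hidden file.
demo_dir=$(mktemp -d "${TMPDIR:-/tmp}/hidden_demo.XXXXXX")
cd "$demo_dir"
touch visible.txt .hiddenfile

# Plain ls skips names that begin with a dot.
plain_listing=$(ls)

# ls -a shows everything, including . (this directory) and .. (its parent).
full_listing=$(ls -a)

echo "ls shows:"
echo "$plain_listing"
echo "ls -a shows:"
echo "$full_listing"
```

Note that ls -a also lists . and .., which is why cd ../ works from any directory.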
Bonus Use an absolute path instead of moving to your home directory
When using any command in a terminal, the shell will try to guess what you are trying to type when you press the tab key. To see how this works:
go into the bootcamp directory
at the shell prompt enter ls d [tab]:
ls d[tab]
You should find it completed data for you (as this directory exists in the directory and is the only item that starts with d ).
Now try this in the data directory
ls data/d[tab]
ls data/d[tab][tab]
You had to hit the tab key twice. This is because there was not a unique option; instead there are multiple files that begin with d. Hitting tab twice shows your possible options. What do you think will happen if you try this?
ls di[tab]
When at an empty prompt, you can use your up arrow and down arrow to cycle through your previously used commands. This can save you a lot of typing.
In addition, there is a history command which will print out the history of your recently used shell commands:
history
Make sure you are in the bootcamp/data directory
There should be a file called dictionary.txt. How do we read the contents of this file?
cat is a tool to concatenate or list files. Use cat to look at the contents of dictionary.txt
cat dictionary.txt
That's a lot of text...
head allows us to look at just the first lines of a file
head dictionary.txt
tail allows us to look at the last few lines of a file
tail dictionary.txt
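By default head and tail show ten lines; the -n flag changes that. A minimal sketch on a made-up 20-line file (generated with seq, which is assumed to be installed, as it is on most Linux and OSX systems):

```shell
# Make a small file containing the numbers 1 to 20, one per line.
demo_file=$(mktemp "${TMPDIR:-/tmp}/lines_demo.XXXXXX")
seq 1 20 > "$demo_file"

# -n chooses how many lines to show instead of the default ten.
first_three=$(head -n 3 "$demo_file")
last_two=$(tail -n 2 "$demo_file")

echo "first three lines:"
echo "$first_three"    # 1, 2, 3
echo "last two lines:"
echo "$last_two"       # 19, 20
```

The same flags work on dictionary.txt, for example head -n 3 dictionary.txt.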
less allows you to view the contents of a text file, with control over navigating the file:
less dictionary.txt
to navigate:
Side note: cat, head, tail, and less are meant to be used to look at text files, not binary files. You will get unexpected results looking at binary files, but you will be able to see them.
Note sometimes it can be useful to use less to look at a binary file as many contain readable text, and can help you recognize if it is a valid file format.
cp dictionary.txt copied_dictionary.txt
This creates a new file with the same contents.
What if we wanted to make a copy of data, to data_copy?
cd ../
cp data data_copy
This raises an error, why?
To make this work we need to copy a directory recursively, and we do this using the -r flag
cp -r data data_copy
Check data_copy to see that it has the same files as data
mv is used to rename or change the location of a file or directory. We will use it to rename the directory data_copy to renamed_data
mv data_copy renamed_data
Use ls to see that you have successfully renamed the directory
Note: Move takes less time than copy for large files/directories. This is because copy (cp) writes new data to disk, while move (mv) just updates a pointer in the filesystem (as long as the source and destination are on the same filesystem).
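One way to see the "pointer" idea is to look at inode numbers, the filesystem's internal ID for a file's data, which ls -i prints. Inodes are not covered in this lesson; this is only a sketch, with made-up file names, assuming everything happens inside one directory on one filesystem.

```shell
# Throwaway directory with one small file.
demo_dir=$(mktemp -d "${TMPDIR:-/tmp}/mv_demo.XXXXXX")
cd "$demo_dir"
echo "some data" > original.txt

# ls -i prints the inode number before the file name; awk keeps just the number.
inode_original=$(ls -i original.txt | awk '{print $1}')

# cp writes a second, independent copy of the data, so it gets a new inode.
cp original.txt copy.txt
inode_copy=$(ls -i copy.txt | awk '{print $1}')

# mv only changes the name that points at the existing data.
mv original.txt renamed.txt
inode_renamed=$(ls -i renamed.txt | awk '{print $1}')

echo "original: $inode_original"
echo "copy:     $inode_copy (different: new data on disk)"
echo "renamed:  $inode_renamed (same: only the name changed)"
```

When the destination is on a different filesystem, mv has to copy the data and delete the original, so the speed advantage disappears.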
grep allows you to search for patterns
Pipe | allows you to take output from one command and feed it into another command
For example, we can use cat to list the contents of dictionary.txt, and then use the pipe | to send this to grep and look for words that contain egg
cat dictionary.txt | grep egg
To get all the words that start with y we use the caret symbol ^ (quoted, so the shell passes the pattern to grep unchanged)
cat dictionary.txt | grep '^y'
Another option is the brackets [ ], which allow you to look for multiple items.
For example, this allows us to find any lines that contain an x or a z
cat dictionary.txt | grep '[xz]'
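To check what each pattern matches without scrolling through the whole dictionary, here is a sketch on a tiny made-up word list. It passes the file name straight to grep, which gives the same result as piping from cat.

```shell
# A seven-word stand-in for dictionary.txt (the words are made up for this sketch).
word_file=$(mktemp "${TMPDIR:-/tmp}/words_demo.XXXXXX")
printf '%s\n' apple egg eggplant fox yak yellow zebra > "$word_file"

# Plain text matches anywhere in the line.
contains_egg=$(grep egg "$word_file")

# ^ anchors the match to the start of the line.
starts_with_y=$(grep '^y' "$word_file")

# [xz] matches a single character that is either x or z.
has_x_or_z=$(grep '[xz]' "$word_file")

echo "contains egg:  $(echo $contains_egg)"    # egg eggplant
echo "starts with y: $(echo $starts_with_y)"   # yak yellow
echo "has x or z:    $(echo $has_x_or_z)"      # fox zebra
```

The patterns are in single quotes so the shell does not try to interpret ^ or [ ] itself before grep sees them.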
In the shell you can use > or >> to redirect output to a new file. This works a little like a pipe (so note: if you are using a redirect, this is the one place you don't need an extra pipe)
e.g. DO NOT do this: cat file | >> newfile
To save output to a file, > puts output into a new file
cat dictionary.txt > newdictionary
To append to the end of the file, >> appends output to an existing file
cat dictionary.txt >> newdictionary
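The difference between > and >> is easiest to see on a file small enough to read at a glance. A minimal sketch, using a made-up file called log.txt in a throwaway directory:

```shell
demo_dir=$(mktemp -d "${TMPDIR:-/tmp}/redirect_demo.XXXXXX")
cd "$demo_dir"

# > creates log.txt and writes one line into it.
echo first > log.txt

# > again: the old contents are replaced, not kept.
echo second > log.txt
after_overwrite=$(cat log.txt)

# >> adds to the end and leaves what is already there alone.
echo third >> log.txt
after_append=$(cat log.txt)

echo "after the second >:"
echo "$after_overwrite"    # second
echo "after >>:"
echo "$after_append"       # second, then third
```

This is why > deserves some care: pointing it at an existing file silently throws away what was there.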
1. Put all words that start with g in dictionary.txt into a new file g_dictionary.txt
2. Append all words that start with h in dictionary.txt to g_dictionary.txt, check with less
3. Flashback: remove newdictionary
wc stands for word count, and can be used to count lines, words, and characters. Let's see how many words are in dictionary.txt and then compare that to g_dictionary.txt (the three columns of output are lines, words, and characters)
wc dictionary.txt g_dictionary.txt
850 852 5321 dictionary.txt
54 54 318 g_dictionary.txt
904 906 5639 total
You can also use this to count the number of items in a directory using the -l (for count lines) flag
ls | wc -l
7
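Here is a sketch of the individual wc flags on a two-line file whose counts can be checked by eye. The file names are made up; tr -d ' ' strips the padding that some versions of wc (OSX, for example) put in front of the number.

```shell
demo_dir=$(mktemp -d "${TMPDIR:-/tmp}/wc_demo.XXXXXX")
cd "$demo_dir"

# Two lines, three words, fourteen characters (newlines count too).
printf 'one two\nthree\n' > sample.txt
touch a.csv b.csv

# Reading with < keeps the file name out of the output.
line_count=$(wc -l < sample.txt | tr -d ' ')
word_count=$(wc -w < sample.txt | tr -d ' ')
char_count=$(wc -c < sample.txt | tr -d ' ')

# ls prints one name per line when piped, so wc -l counts directory items.
item_count=$(ls | wc -l | tr -d ' ')

echo "lines: $line_count"        # 2
echo "words: $word_count"        # 3
echo "characters: $char_count"   # 14
echo "items: $item_count"        # 3
```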
Wildcards (*) are used to match anything, but we can use them to match specific things.
Navigate to the data directory. Use a wildcard to find all the Excel files
ls *.xlsx
This will give us everything in the directory
ls *
This command:
ls /usr/bin/z*
lists every file in /usr/bin that starts with z.
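It helps to know that the shell itself expands a wildcard into the list of matching names before the command runs, so the same patterns work with any command, not just ls. A small sketch with made-up file names, using echo to show what each pattern expands to:

```shell
demo_dir=$(mktemp -d "${TMPDIR:-/tmp}/glob_demo.XXXXXX")
cd "$demo_dir"
touch data1.xlsx data2.xlsx notes.txt zebra.txt

# The shell replaces each pattern with the matching names, in sorted order.
excel_files=$(echo *.xlsx)
z_files=$(echo z*)
everything=$(echo *)

echo "*.xlsx -> $excel_files"   # data1.xlsx data2.xlsx
echo "z*     -> $z_files"       # zebra.txt
echo "*      -> $everything"    # data1.xlsx data2.xlsx notes.txt zebra.txt
```

The * can sit anywhere in the pattern: at the start, at the end, or in the middle.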
Do each of the following using a single ls command without navigating to a different directory.
1. List all of the files in /bin that contain the letters sh
2. List all of the files in /bin that start with d
3. List all of the files in /bin that start with d, using grep
If you followed the install instructions, you should have installed Anaconda for access to Python and Python libraries. One of the coolest tools you get with Anaconda is grin. grin works like grep, and recursively looks into the files in a directory to find your matching pattern. Give this a try.
cd to the bootcamp directory
grin egg
you should see something like this:
./data/dictionary.txt:
229 : egg
grin found the pattern egg in the data directory in the dictionary.txt file. The pattern occurred on line 229. (Is that cool or what?)
env prints your environment variables. Your system uses these to keep track of variables within and across programs. For example, let's find out who you are
env | grep USER
This gives us the username of the current user. Programs can take advantage of this to let us know who created certain files or ran different programs.
The environment variable PATH is important for your computer to find executable programs
env | grep PATH
You will see there are a few env variables that have PATH in them. We want the simple one, so we can do one of two things:
grep using ^ so it matches items that start with PATH
remember when we used echo? We can combine echo with $ to get the contents of environment variables
env | grep ^PATH
echo $PATH
Your system uses the directories defined in PATH to find programs it can run.
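PATH is a single line of directories separated by colons, which is hard to read. The sketch below prints one directory per line and then asks the shell where it finds ls. The tr and command -v tools are not covered in this lesson, and the variable name ls_location is made up.

```shell
# Print each PATH directory on its own line; tr swaps every ':' for a newline.
echo "$PATH" | tr ':' '\n'

# command -v reports what the shell would run for a given name.
# (In an interactive shell where ls is an alias, it prints the alias instead.)
ls_location=$(command -v ls)
echo "ls is found at: $ls_location"
```

The shell searches the PATH directories in order from first to last and runs the first match, so the directory shown for ls will be one of the lines printed above.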
You should have installed git as part of your pre-workshop installation. Type git at the prompt, what do you see?
git
git --version
Now let's clone the repo for the advanced track
go to the bootcamp directory (use cd)
use git to clone the repo
git clone https://github.com/swcarpentry/2014-04-14-wise.git
nano is a simple text editor. Much as Microsoft Word is an editor for documents, nano is an editor for plain text, which makes it suitable for code and scripts. To open nano, you can just type nano at the command line.
nano
This will open a blank text editor. If you use the menu to open dictionary.txt, it will open the text file.
All commands are at the bottom. To exit nano use Control-x.
If you already use a different editor, feel free to use your editor of choice instead of nano. But keep in mind that nano exists on most systems, although it will likely not work on Windows.
Remember when we used ls -a to find hidden files? We are going to edit one of those hidden files. First go to your home directory (a quick way to do this is to just type cd, as it always takes you home). Use pwd to verify:
cd
pwd
We are going to update your bash resource file.
The .bashrc or .bash_profile file in your home directory controls the behavior of your bash shell. This will not work with gitbash on Windows, apologies
We want to add some magic to make working with git (source control) easier.
https://github.com/git/git/blob/master/contrib/completion/git-prompt.sh
source ~/.git-prompt.sh
PS1='[\u@\h \W$(__git_ps1 " (%s)")]\$ '
source will execute the code in .git-prompt.sh, which is a shell script, inside your current shell
The other line will update your prompt when you are working within a git repository
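To see why the file is sourced rather than simply run, here is a sketch with a tiny made-up script standing in for .git-prompt.sh. It uses . (a dot), which is the portable spelling of source and behaves the same way in bash; the names greeting and say_hi are hypothetical.

```shell
# A two-line stand-in for a file like .git-prompt.sh (contents made up).
demo_script=$(mktemp "${TMPDIR:-/tmp}/source_demo.XXXXXX")
cat > "$demo_script" <<'EOF'
greeting="hello from the sourced file"
say_hi() { echo "$greeting"; }
EOF

# Running the script starts a separate shell; its definitions vanish when it ends.
bash "$demo_script"
before_source=${greeting:-unset}

# Sourcing runs the same lines inside the current shell, so the definitions stay.
. "$demo_script"
after_source=$(say_hi)

echo "after running:  greeting is $before_source"
echo "after sourcing: say_hi prints '$after_source'"
```

This is the reason the prompt line can call __git_ps1: sourcing the script leaves that function defined in your shell.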
Exit out of nano and save the file using Control-x
In your current shell, go back to the git repo you downloaded into bootcamp. Is anything different?
Open a new terminal/shell, go to the same repo directory in the bootcamp directory. Now what do you see?
Why might these be different?
You can use shell commands in an IPython notebook. There are generally two methods:
! (exclamation point) before your command
!ls
To open an ipython notebook
This will open a webpage, and it will list the available notebooks
Click on a notebook to open, and it will open in a tab
Before we meet again tomorrow, spend a little time going into the advanced repo directory, look for notebooks and see if you can open them.
Download a copy of the notebook showing how to use shell magic linked at the top of this page (it's the icon of an arrow pointing down)
Not all magic will work for you, but understanding how to navigate the ipython notebook will help during the workshop tomorrow.
%%bash
ls
Shell.html Shell.ipynb git-prompt.sh shelldata.zip