Shell Programming

ArticleCategory: [There are several article categories]

UNIX Basics

AuthorImage:[A picture of you]

[Photo of the Authors]

TranslationInfo:[Author and translator]

original in en Katja and Guido Socher 

AboutTheAuthor:[A short biography of the author]

Katja is the German editor of LinuxFocus. She likes Tux, film & photography and the sea. Her homepage can be found here.

Guido is a long time Linux fan and he likes Linux because it is designed by honest and open people. This is one of the reasons why we call it open source. His homepage is at linuxfocus.org/~guido.

Abstract:[A short summary should go here]

In this article we explain how to write little shell scripts and give many examples.

ArticleIllustration:[The title image of the article]

[Illustration]

ArticleBody:[The actual article. Headings within the article should be h2 or h3.]

Why shell programming?

Even though there are various graphical interfaces available for Linux, the shell is still a very neat tool. The shell is not just a collection of commands but a really good programming language. You can automate a lot of tasks with it, and it is very good for system administration. You can quickly try out whether your ideas work, which makes it useful for simple prototyping. It is also well suited to small utilities that perform relatively simple tasks where efficiency is less important than ease of configuration, maintenance and portability.
So let's see now how it works:

Creating a script

There are a lot of different shells available for Linux, but usually bash (the Bourne Again Shell) is used for shell programming as it is available for free and is easy to use. So all the scripts we will write in this article use bash (but most of the time they will also run with its older sister, the Bourne shell).
For writing our shell programs we can use any kind of text editor, e.g. nedit, kedit, emacs, vi, ... as with other programming languages.
The program must start with the following line (it must be the first line in the file):
    #!/bin/sh 
   
The #! characters tell the system that the first argument that follows on the line is the program to be used to execute this file. In this case /bin/sh is the shell we use.
When you have written your script and saved it you have to make it executable to be able to use it.
To make a script executable type
chmod +x filename
Then you can start your script by typing: ./filename

Comments

Comments in shell programming start with # and go until the end of the line. We really recommend you to use comments. If you have comments and you don't use a certain script for some time you will still know immediately what it is doing and how it works.

Variables

As in other programming languages you can't live without variables. In shell programming all variables have the datatype string and you do not need to declare them. To assign a value to a variable you write:
varname=value
To get the value back you just put a dollar sign in front of the variable:
#!/bin/sh
# assign a value:
a="hello world"
# now print the content of "a":
echo "A is:"
echo $a
Type these lines into your text editor and save the file, e.g. as first. Then make the script executable by typing chmod +x first in the shell and start it by typing ./first
The script will just print:
A is:
hello world
Sometimes it is possible to confuse variable names with the rest of the text:
num=2
echo "this is the $numnd"
This will not print "this is the 2nd" but "this is the " because the shell searches for a variable called numnd which has no value. To tell the shell that we mean the variable num we have to use curly braces:
num=2
echo "this is the ${num}nd"
This prints what you want: this is the 2nd

There are a number of variables that are always automatically set. We will discuss them further down when we use them the first time.

If you need to handle mathematical expressions then you need to use programs such as expr (see table below).
Besides the normal shell variables that are only valid within the shell program there are also environment variables. A variable preceded by the keyword export is an environment variable. We will not talk about them here any further since they are normally only used in login scripts.
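
The two points above (math with expr, and the difference between a normal and an exported variable) can be tried out with a small sketch. The variable names and values are made up for this illustration, and the backticks used to capture the output of a command are explained further down in this article.

```shell
#!/bin/sh
# a, b and greeting are hypothetical example variables
a=7
b=5
# expr prints the result, the backticks store it in a variable
sum=`expr "$a" + "$b"`
# the * is quoted so that the shell does not treat it as a wildcard
product=`expr "$a" "*" "$b"`
echo "sum is $sum, product is $product"

# a normal variable is not visible in a program started by the script ...
greeting="hello"
seen_before=`sh -c 'echo "$greeting"'`
# ... but an exported (environment) variable is
export greeting
seen_after=`sh -c 'echo "$greeting"'`
echo "before export: '$seen_before', after export: '$seen_after'"
```

The child shell started with sh -c only sees the value after the export, which is the practical difference between the two kinds of variables.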

Shell commands and control structures

There are three categories of commands which can be used in shell scripts:

1) Unix commands:
Although a shell script can make use of any Unix command, there are a number of commands which are used more often than others. These commands can generally be described as commands for file and text manipulation.
Command syntax            Purpose
echo "some text"          write some text on your screen
ls                        list files
wc -l file                count lines in file
wc -w file                count words in file
wc -c file                count number of characters in file
cp sourcefile destfile    copy sourcefile to destfile
mv oldname newname        rename or move file
rm file                   delete a file
grep 'pattern' file       search for strings in a file
                          Example: grep 'searchstring' file.txt
cut -b colnum file        get data out of fixed width columns of text
                          Example: get character positions 5 to 9
                          cut -b5-9 file.txt
                          Do not confuse this command with "cat" which is something totally different
cat file.txt              write file.txt to stdout (your screen)
file somefile             describe what type of file somefile is
read var                  read a line of input from the user and write it into a variable (var)
sort file.txt             sort lines in file.txt
uniq                      remove duplicate lines, used in combination with sort since uniq removes only duplicated consecutive lines
                          Example: sort file.txt | uniq
expr                      do math in the shell
                          Example: add 2 and 3
                          expr 2 "+" 3
find                      search for files
                          Example: search by name:
                          find . -name filename -print
                          This command has many different possibilities and options. It is unfortunately too much to explain it all in this article.
tee                       write data to stdout (your screen) and to a file
                          Normally used like this:
                          somecommand | tee outfile
                          It writes the output of somecommand to the screen and to the file outfile
basename file             return just the file name of a given name and strip the directory path
                          Example: basename /bin/tux
                          returns just tux
dirname file              return just the directory name of a given name and strip the actual file name
                          Example: dirname /bin/tux
                          returns just /bin
head file                 print some lines from the beginning of a file
tail file                 print some lines from the end of a file
sed                       sed is basically a find and replace program. It reads text from standard input (e.g. from a pipe) and writes the result to stdout (normally the screen). The search pattern is a regular expression (see references). This search pattern should not be confused with shell wildcard syntax. To replace the string linuxfocus with LinuxFocus in a text file use:
                          cat text.file | sed 's/linuxfocus/LinuxFocus/' > newtext.file
                          This replaces the first occurrence of the string linuxfocus in each line with LinuxFocus. If there are lines where linuxfocus appears several times and you want to replace all of them use:
                          cat text.file | sed 's/linuxfocus/LinuxFocus/g' > newtext.file
awk                       Most of the time awk is used to extract fields from a text line. The default field separator is space. To specify a different one use the option -F.
                          cat file.txt | awk -F, '{print $1 "," $3 }'
                          Here we use the comma (,) as field separator and print the first and third ($1 $3) columns. If file.txt has lines like:
                          Adam Bor, 34, India
                          Kerry Miller, 22, USA
                          then this will produce:
                          Adam Bor, India
                          Kerry Miller, USA
                          There is much more you can do with awk but this is a very common use.


2) Concepts: Pipes, redirection and backtick
They are not really commands but they are very important concepts.

pipes (|) send the output (stdout) of one program to the input (stdin) of another program.
    grep "hello" file.txt | wc -l
finds the lines with the string hello in file.txt and then counts the lines.
The output of the grep command is used as input for the wc command. You can concatenate as many commands as you like in that way (within reasonable limits).

redirection: writes the output of a command to a file or appends data to a file
> writes output to a file and overwrites the old file in case it exists
>> appends data to a file (or creates a new one if it doesn't exist already but it never overwrites anything).
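
The difference between > and >> is easiest to see by counting lines. The following is only a sketch: demo.log is a made-up file name, and the backticks used to store the line count in a variable are explained in the next section of this article.

```shell
#!/bin/sh
# demo.log is a hypothetical file name used only for this illustration
logfile=demo.log
echo "first line" > "$logfile"      # creates demo.log (or overwrites it)
echo "second line" >> "$logfile"    # appends to the file
echo "third line" >> "$logfile"
# sed removes the spaces that some versions of wc put around the number
lines_appended=`cat "$logfile" | wc -l | sed 's/ //g'`
echo "new start" > "$logfile"       # > throws the old content away
lines_overwritten=`cat "$logfile" | wc -l | sed 's/ //g'`
echo "after appending: $lines_appended lines"
echo "after overwriting: $lines_overwritten line"
rm "$logfile"
```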

Backtick
The output of a command can be used as command line arguments (not stdin as above, command line arguments are any strings that you specify behind the command such as file names and options) for another command. You can as well use it to assign the output of a command to a variable.
The command
find . -mtime -1 -type f -print
finds all files that have been modified within the last 24 hours (-mtime -2 would be 48 hours). If you want to pack all these files into a tar archive (file.tar) the syntax for tar would be:
tar cvf file.tar infile1 infile2 ...
Instead of typing it all in you can combine the two commands (find and tar) using backticks. Tar will then pack all the files that find has printed:
#!/bin/sh
# The ticks are backticks (`)  not normal quotes ('):
tar -zcvf lastmod.tar.gz `find . -mtime -1 -type f -print`
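
Assigning the output of a command to a variable, as mentioned above, works the same way. A minimal sketch using basename and dirname from the command table (the path /bin/tux is just an example string and does not need to exist):

```shell
#!/bin/sh
# store the output of a command in a variable
name=`basename /bin/tux`
dir=`dirname /bin/tux`
echo "the file $name lives in the directory $dir"
# the output can also be used directly inside another command line
echo "this script was started in `pwd`"
```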

3) Control structures
The "if" statement tests if the condition is true (exit status is 0, success). If it is the "then" part gets executed:
if ....; then
   ....
elif ....; then
   ....
else
   ....
fi
Most of the time a very special command called test is used inside if-statements. It can be used to compare strings or test if a file exists, is readable etc...
The "test" command is written as square brackets " [ ] ". Note that space is significant here: Make sure that you always have space around the brackets. Examples:
[ -f "somefile" ]  : Test if somefile is a file.
[ -x "/bin/ls" ]   : Test if /bin/ls exists and is executable.
[ -n "$var" ]      : Test if the variable $var contains something
[ "$a" = "$b" ]    : Test if the variables "$a" and  "$b" are equal
Run the command "man test" and you get a long list of all kinds of test operators for comparisons and files.
Using this in a shell script is straight forward:
#!/bin/sh
if [ "$SHELL" = "/bin/bash" ]; then
  echo "your login shell is the bash (bourne again shell)"
else
  echo "your login shell is not bash but $SHELL"
fi
The variable $SHELL contains the name of the login shell and this is what we are testing here by comparing it against the string "/bin/bash"

Shortcut operators
People familiar with C will welcome the following expression:
[ -f "/etc/shadow" ] && echo "This computer uses shadow passwords"
The && can be used as a short if-statement. The right side gets executed if the left is true. You can read this as AND. Thus the example is: "The file /etc/shadow exists AND the command echo is executed". The OR operator (||) is available as well. Here is an example:
#!/bin/sh
mailfolder=/var/spool/mail/james
[ -r "$mailfolder" ] || { echo "Can not read $mailfolder" ; exit 1; }
echo "$mailfolder has mail from:"
grep "^From " $mailfolder
The script tests first if it can read a given mailfolder. If yes then it prints the "From" lines in the folder. If it cannot read the file $mailfolder then the OR operator takes effect. In plain English you read this code as "Mailfolder readable or exit program". The problem here is that you must have exactly one command behind the OR but we need two:
-print an error message
-exit the program
To handle them as one command we can group them together using curly braces. Such a group behaves like an anonymous function. Functions in general are explained further down.
You can do everything without the ANDs and ORs using just if-statements but sometimes the shortcuts AND and OR are just more convenient.
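
To compare the two styles, here is the same test written once with the OR operator and once as an if-statement. This is only a sketch: the path /no/such/file is made up and assumed not to exist, and the result variables are hypothetical names.

```shell
#!/bin/sh
# /no/such/file is a made-up path that is assumed not to exist
f=/no/such/file

# short form with the OR operator
short_result="readable"
[ -r "$f" ] || short_result="not readable"

# the same logic written as an if-statement
if [ -r "$f" ]; then
    long_result="readable"
else
    long_result="not readable"
fi

echo "short form: $short_result"
echo "long form:  $long_result"
```

Both forms give the same result. The short form saves a few lines, the if-statement is easier to extend later.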

The case statement can be used to match (using shell wildcards such as * and ?) a given string against a number of possibilities.
case ... in
...) do something here;;
esac
Let's look at an example. The command file can test what kind of filetype a given file is:
file lf.gz

returns:

lf.gz: gzip compressed data, deflated, original filename, 
last modified: Mon Aug 27 23:09:18 2001, os: Unix
We now use this to write a script called smartzip that can uncompress bzip2, gzip and zip compressed files automatically:
#!/bin/sh
ftype=`file "$1"`
case "$ftype" in
"$1: Zip archive"*)
    unzip "$1" ;;
"$1: gzip compressed"*)
    gunzip "$1" ;;
"$1: bzip2 compressed"*)
    bunzip2 "$1" ;;
*) echo "File $1 can not be uncompressed with smartzip";;
esac

Here you notice that we use a new special variable called $1. This variable contains the first argument given to a program. Say we run
smartzip articles.zip
then $1 will contain the string articles.zip

The select statement is a bash specific extension and is very good for interactive use. The user can select a choice from a list of different values:
select var in ... ; do
  break
done
.... now $var can be used ....
Here is an example:
#!/bin/bash
# select is a bash extension, so we ask for bash explicitly
echo "What is your favourite OS?"
select var in "Linux" "Gnu Hurd" "Free BSD" "Other"; do
        break
done
echo "You have selected $var"
Here is what the script does:
What is your favourite OS?
1) Linux
2) Gnu Hurd
3) Free BSD
4) Other
#? 1
You have selected Linux
In the shell you have the following loop statements available:
while ...; do
 ....
done
The while-loop will run while the expression that we test for is true. The keyword "break" can be used to leave the loop at any point in time. With the keyword "continue" the loop continues with the next iteration and skips the rest of the loop body.
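
A small sketch of a while-loop that uses both keywords. The counter n and the variable printed are made-up names. The loop counts upwards, skips the number 3 with continue and stops with break once the counter is above 5:

```shell
#!/bin/sh
n=0
printed=""
while [ "$n" -lt 10 ]; do
    n=`expr "$n" + 1`
    if [ "$n" = "3" ]; then
        continue        # skip the rest of the loop body for 3
    fi
    if [ "$n" -gt 5 ]; then
        break           # leave the loop completely
    fi
    printed="$printed $n"
    echo "n is $n"
done
echo "printed:$printed"
```

The last line prints "printed: 1 2 4 5".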

The for-loop takes a list of strings (strings separated by space) and assigns them to a variable:
for var in ....; do
  ....
done
The following will e.g. print the letters A to C on the screen:
#!/bin/sh
for var in A B C ; do
  echo "var is $var"
done
A more useful example script, called showrpm, prints a summary of the content of a number of RPM-packages:
#!/bin/sh
# list a content summary of a number of RPM packages
# USAGE: showrpm rpmfile1 rpmfile2 ...
# EXAMPLE: showrpm /cdrom/RedHat/RPMS/*.rpm
for rpmpackage in $*; do
  if [ -r "$rpmpackage" ];then
    echo "=============== $rpmpackage =============="
    rpm -qi -p "$rpmpackage"
  else
    echo "ERROR: cannot read file $rpmpackage"
  fi
done
Above you can see the next special variable, $* which contains all the command line arguments. If you run
showrpm openssh.rpm w3m.rpm webgrep.rpm
then $* contains the 3 strings openssh.rpm, w3m.rpm and webgrep.rpm.

The GNU bash knows until-loops as well but generally while and for loops are sufficient.
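
For completeness, a minimal sketch of an until-loop. It has the same shape as a while-loop but runs as long as the test is not true. The counter is a made-up example:

```shell
#!/bin/sh
# count is a hypothetical counter used only for this illustration
count=0
until [ "$count" -eq 3 ]; do
    count=`expr "$count" + 1`
    echo "count is $count"
done
echo "finished with count $count"
```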

Quoting
Before passing any arguments to a program the shell tries to expand wildcards and variables. To expand means that the wildcard (e.g. *) is replaced by the appropriate file names or that a variable is replaced by its value. To change this behaviour you can use quotes: Let's say we have a number of files in the current directory. Two of them are jpg-files, mail.jpg and tux.jpg.
#!/bin/sh
echo *.jpg
This will print "mail.jpg tux.jpg".
Quotes (single and double) will prevent this wildcard expansion:
#!/bin/sh
echo "*.jpg"
echo '*.jpg'
This will print "*.jpg" twice.
Single quotes are most strict. They prevent even variable expansion. Double quotes prevent wildcard expansion but allow variable expansion:
#!/bin/sh
echo $SHELL
echo "$SHELL"
echo '$SHELL'
This will print:
/bin/bash
/bin/bash
$SHELL
Finally there is the possibility to take the special meaning of any single character away by preceding it with a backslash:
echo \*.jpg
echo \$SHELL
This will print:
*.jpg
$SHELL

Here documents
Here documents are a nice way to send several lines of text to a command. It is quite useful to write a help text in a script without having to put echo in front of each line. A "Here document" starts with << followed by some string that must also appear at the end of the here document. Here is an example script, called ren, that renames multiple files and uses a here document for its help text:
#!/bin/sh
# we have less than 3 arguments. Print the help text:
if [ $# -lt 3 ] ; then
cat <<HELP
ren -- renames a number of files using sed regular expressions

USAGE: ren 'regexp' 'replacement' files...

EXAMPLE: rename all *.HTM files to *.html:
  ren 'HTM$' 'html' *.HTM

HELP
  exit 0
fi
OLD="$1"
NEW="$2"
# The shift command removes one argument from the list of
# command line arguments.
shift
shift
# $* contains now all the files:
for file in $*; do
    if [ -f "$file" ] ; then
      newfile=`echo "$file" | sed "s/${OLD}/${NEW}/g"`
      if [ -f "$newfile" ]; then
        echo "ERROR: $newfile exists already"
      else
        echo "renaming $file to $newfile ..."
        mv "$file" "$newfile"
      fi
    fi
done
This is the most complex script so far. Let's discuss it a little bit. The first if-statement tests if we have provided at least 3 command line parameters. (The special variable $# contains the number of arguments.) If not, the help text is sent to the command cat which in turn sends it to the screen. After printing the help text we exit the program. If there are 3 or more arguments we assign the first argument to the variable OLD and the second to the variable NEW. Next we shift the command line parameters twice to get the third argument into the first position of $*. With $* we enter the for loop. Each of the arguments in $* is now assigned one by one to the variable $file. Here we first test that the file really exists and then we construct the new file name by using find and replace with sed. The backticks are used to assign the result to the variable newfile. Now we have all we need: The old file name and the new one. This is then used with the command mv to rename the files.

Functions
As soon as you have a more complex program you will find that you use the same code in several places and also find it helpful to give it some structure. A function looks like this:
functionname()
{
 # inside the body $1 is the first argument given to the function
 # $2 the second ...
 body
}
You need to "declare" functions at the beginning of the script before you use them.
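
A minimal sketch of a function that takes arguments. The function name greet and the strings passed to it are invented for this illustration:

```shell
#!/bin/sh
# greet is a hypothetical function name
greet()
{
    # $1 and $2 are the arguments given to the function,
    # not the arguments given to the script
    echo "hello $1, welcome to $2"
}

# the function is defined above, now we can use it like a command:
greet "Tux" "LinuxFocus"
greet "Katja" "the shell"
```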

Here is a script called xtitlebar which you can use to change the name of a terminal window. If you have several of them open it is easier to find them. The script sends an escape sequence which is interpreted by the terminal and causes it to change the name in the titlebar. The script uses a function called help. As you can see the function is defined once and then used twice:
#!/bin/sh
# vim: set sw=4 ts=4 et:

help()
{
    cat <<HELP
xtitlebar -- change the name of an xterm, gnome-terminal or kde konsole

USAGE: xtitlebar [-h] "string_for_titlebar"

OPTIONS: -h help text

EXAMPLE: xtitlebar "cvs"

HELP
    exit 0
}

# in case of error or if -h is given we call the function help:
[ -z "$1" ] && help
[ "$1" = "-h" ] && help

# send the escape sequence to change the xterm titlebar
# (printf is used because not every /bin/sh understands "echo -e"):
printf '\033]0;%s\007\n' "$1"
#
It's a good habit to always have extensive help inside the scripts. This makes it possible for others (and you) to use and understand the script.

Command line arguments
We have seen that $* and $1, $2 ... $9 contain the arguments that the user specified on the command line (the strings written behind the program name). So far we have only had very simple command line syntax (a couple of mandatory arguments and the option -h for help). But soon you will discover that you need some kind of parser for more complex programs where you define your own options. The convention is that all optional parameters are preceded by a minus sign and must come before any other arguments (such as file names).

There are many possibilities to implement a parser. The following while loop combined with a case statement is a very good solution for a generic parser:
#!/bin/sh
help()
{
  cat <<HELP
This is a generic command line parser demo.
USAGE EXAMPLE: cmdparser -l hello -f -- -somefile1 somefile2
HELP
  exit 0
}

while [ -n "$1" ]; do
case $1 in
    -h) help;shift 1;; # function help is called
    -f) opt_f=1;shift 1;; # variable opt_f is set
    -l) opt_l=$2;shift 2;; # -l takes an argument -> shift by 2
    --) shift;break;; # end of options
    -*) echo "error: no such option $1. -h for help";exit 1;;
    *)  break;;
esac
done

echo "opt_f is $opt_f"
echo "opt_l is $opt_l"
echo "first arg is $1"
echo "2nd arg is $2"
Try it out! You can run it e.g with:
cmdparser -l hello -f -- -somefile1 somefile2
It produces
opt_f is 1
opt_l is hello
first arg is -somefile1
2nd arg is somefile2
How does it work? Basically it loops through all arguments and matches them against the case statement. If it finds a matching one it sets a variable and shifts the command line by one (or by two if the option takes an argument). The Unix convention is that options (things starting with a minus) must come first. You may indicate the end of the options by writing two minus signs (--). You need this e.g. with grep to search for a string starting with a minus sign:
Search for -xx- in file f.txt:
grep -- -xx- f.txt
Our option parser can handle the -- too as you can see in the listing above.

Examples

A general purpose skeleton

Now we have discussed almost all components that you need to write a script. All good scripts should have help and you can as well have our generic option parser even if the script has just one option. Therefore it is a good idea to have a dummy script, called framework.sh, which you can use as a framework for other scripts. If you want to write a new script you just make a copy:
cp framework.sh myscript
and then insert the actual functionality into "myscript".
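
The listing of framework.sh itself is not part of this text. The following is only a sketch of what such a file could look like, assembled from the help function and the generic option parser shown above. The options -f and -l and the final echo line are placeholders.

```shell
#!/bin/sh
# framework.sh -- a sketch of a starting point for new scripts

help()
{
    cat <<HELP
framework.sh -- describe here what the script does

USAGE: framework.sh [-h] [-f] [-l value] [--] args...

OPTIONS: -h help text
HELP
    exit 0
}

error()
{
    # print an error and exit
    echo "$1"
    exit 1
}

# default values for the (placeholder) options
opt_f=0
opt_l=""
while [ -n "$1" ]; do
case $1 in
    -h) help;shift 1;; # function help is called
    -f) opt_f=1;shift 1;; # variable opt_f is set
    -l) opt_l=$2;shift 2;; # -l takes an argument -> shift by 2
    --) shift;break;; # end of options
    -*) error "error: no such option $1. -h for help";;
    *)  break;;
esac
done

# The main program: replace this line with the actual functionality
echo "opt_f is $opt_f, opt_l is $opt_l, remaining arguments: $*"
```

Setting the defaults before the loop means the main program can rely on opt_f and opt_l having a value even when no option was given.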

Let's now look at two more examples:

A binary to decimal number converter

The script b2d converts a binary number (e.g 1101) into its decimal equivalent. It is an example that shows that you can do simple mathematics with expr:
#!/bin/sh
# vim: set sw=4 ts=4 et:
help()
{
  cat <<HELP
b2d -- convert binary to decimal

USAGE: b2d [-h] binarynum

OPTIONS: -h help text

EXAMPLE: b2d 111010
will return 58
HELP
  exit 0
}

error()
{
    # print an error and exit
    echo "$1"
    exit 1
}

lastchar()
{
    # return the last character of a string in $rval
    if [ -z "$1" ]; then
        # empty string
        rval=""
        return
    fi
    # wc puts some space behind the output this is why we need sed:
    numofchar=`echo -n "$1" | wc -c | sed 's/ //g' `
    # now cut out the last char
    rval=`echo -n "$1" | cut -b $numofchar`
}

chop()
{
    # remove the last character in string and return it in $rval
    if [ -z "$1" ]; then
        # empty string
        rval=""
        return
    fi
    # wc puts some space behind the output this is why we need sed:
    numofchar=`echo -n "$1" | wc -c | sed 's/ //g' `
    if [ "$numofchar" = "1" ]; then
        # only one char in string
        rval=""
        return
    fi
    numofcharminus1=`expr $numofchar "-" 1` 
    # now cut all but the last char:
    rval=`echo -n "$1" | cut -b 1-${numofcharminus1}`
}
    

while [ -n "$1" ]; do
case $1 in
    -h) help;shift 1;; # function help is called
    --) shift;break;; # end of options
    -*) error "error: no such option $1. -h for help";;
    *)  break;;
esac
done

# The main program
sum=0
weight=1
# one arg must be given:
[ -z "$1" ] && help
binnum="$1"
binnumorig="$1"

while [ -n "$binnum" ]; do
    lastchar "$binnum"
    if [ "$rval" = "1" ]; then
        sum=`expr "$weight" "+" "$sum"`
    fi
    # remove the last position in $binnum
    chop "$binnum"
    binnum="$rval"
    weight=`expr "$weight" "*" 2`
done

echo "binary $binnumorig is decimal $sum"
#
The algorithm used in this script takes the decimal weight (1,2,4,8,16,..) of each digit starting from the right most digit and adds it to the sum if the digit is a 1. Thus "10" is:
0 * 1 + 1 * 2 = 2
To get the digits from the string we use the function lastchar. This uses wc -c to count the number of characters in the string and then cut to cut out the last character. The chop function has the same logic but removes the last character, that is it cuts out everything from the beginning to the character before the last one.
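
To follow the algorithm step by step, here is a shorter sketch that prints the weight and the sum for every digit of the example input 1101. Note that this is not the method of the script above: instead of wc and cut it uses the parameter expansions ${var%pattern} and ${var#pattern}, which are assumed to be available (they are part of bash and other POSIX shells but are not covered in this article).

```shell
#!/bin/sh
# trace the algorithm for the example input 1101
binnum=1101
sum=0
weight=1
while [ -n "$binnum" ]; do
    # everything except the last character:
    rest="${binnum%?}"
    # what remains when $rest is cut off at the front, i.e. the last digit:
    digit="${binnum#"$rest"}"
    if [ "$digit" = "1" ]; then
        sum=`expr "$weight" + "$sum"`
    fi
    echo "digit $digit, weight $weight, sum so far $sum"
    binnum="$rest"
    weight=`expr "$weight" "*" 2`
done
echo "the decimal value is $sum"
```

The digits are taken from the right, so the weights 1, 4 and 8 are added, which gives 13.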

A file rotation program
Perhaps you are one of those who save all outgoing mail to a file. After a couple of months this file becomes rather big, and loading it into your mail program becomes slow. The following script rotatefile can help you. It renames the mailfolder, let's call it outmail, to outmail.1. If there was already an outmail.1 then it becomes outmail.2, and so on.
#!/bin/sh
# vim: set sw=4 ts=4 et: 
ver="0.1"
help()
{
    cat <<HELP
rotatefile -- rotate the file name 

USAGE: rotatefile [-h]  filename

OPTIONS: -h help text

EXAMPLE: rotatefile out
This will e.g rename out.2 to out.3, out.1 to out.2, out to out.1
and create an empty out-file

The max number is 10

version $ver
HELP
    exit 0
}

error()
{
    echo "$1"
    exit 1
}
while [ -n "$1" ]; do
case $1 in
    -h) help;shift 1;;
    --) shift;break;;
    -*) echo "error: no such option $1. -h for help";exit 1;;
    *)  break;;
esac
done

# input check:
if [ -z "$1" ] ; then
 error "ERROR: you must specify a file, use -h for help" 
fi
filen="$1"
# rename any .1 , .2 etc file:
for n in 9 8 7 6 5 4 3 2 1; do
    if [ -f "$filen.$n" ]; then
        p=`expr $n + 1`
        echo "mv $filen.$n $filen.$p"
        mv "$filen.$n" "$filen.$p"
    fi
done
# rename the original file:
if [ -f "$filen" ]; then
    echo "mv $filen $filen.1"
    mv "$filen" "$filen.1"
fi
echo touch $filen
touch "$filen"

How does the program work? After checking that the user provided a filename we go into a for loop counting from 9 to 1. File 9 is now renamed to 10, file 8 to 9 and so on. After the loop we rename the original file to 1 and create an empty file with the name of the original file.

Debugging

The simplest debugging help is of course the command echo. You can use it to print specific variables around the place where you suspect the mistake. This is probably what most shell programmers use 80% of the time to track down a mistake. The advantage of a shell script is that it does not require any re-compilation and inserting an "echo" statement is done very quickly.

The shell has a real debug mode as well. If there is a mistake in your script "strangescript" then you can debug it like this:
sh -x strangescript
This will execute the script and show all the statements that get executed with the variables and wildcards already expanded.
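
The trace can also be limited to the suspicious part of a script. A minimal sketch, assuming the shell built-in set (set -x switches the trace on, set +x switches it off), which is not discussed elsewhere in this article. The variable name is made up:

```shell
#!/bin/sh
# name is a hypothetical variable used only for this demo
name="tux"
set -x          # switch the trace on
echo "hello $name"
set +x          # switch the trace off again
echo "this line is not traced"
```

The traced statement is printed with the variable already replaced by its value before the normal output of echo appears.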

The shell also has a mode to check for syntax errors without actually executing the program. To use this run:
sh -n your_script
If this returns nothing then your program is free of syntax errors.
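
The syntax check can be tried out on two throw-away scripts. This is only a sketch: good.sh and broken.sh are made-up file names, and 2>/dev/null (which hides the error message of the shell) is an addition not explained in this article.

```shell
#!/bin/sh
# good.sh and broken.sh are throw-away files created only for this demo
echo 'echo "hello"' > good.sh
echo 'if true; then echo "the fi is missing"' > broken.sh

# sh -n reports success for a script without syntax errors ...
sh -n good.sh 2>/dev/null && good_result="ok"
# ... and failure if the syntax is wrong
sh -n broken.sh 2>/dev/null || broken_result="syntax error"

echo "good.sh: $good_result"
echo "broken.sh: $broken_result"
rm good.sh broken.sh
```

Neither script is executed by sh -n, so the check is safe even for scripts that delete or rename files.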

We hope you will now start writing your own shell scripts. Have fun!

References