Practical Shell Scripting: 2008

Monday, December 29, 2008

Arrays in bash

Hello and welcome to this week's installment of Practical Shell Scripting. This week I'm going to tackle something that is common to most programming languages but is handled a little differently in bash shell scripts: arrays. Hopefully most of you know that an array is a collection of like pieces of data. The data could be integers, strings, floats, or whatever. For those of us who come from a C/C++ background, a few things about arrays in shell scripts are a little on the annoying side. Firstly, bash will let you mix data types, because bash (and, I believe, the Bourne and Korn shells as well) is what's referred to as a loosely typed language. In a language like C/C++, by contrast, you define an array as holding a specific data type. For example:

int MyNumbers[5]={1,3,5,7,11};

is a single-dimensional array that holds 5 integers, and if you try to introduce other variable types the code will fail to compile. Bash, on the other hand, will let you get away with a lot more. In bash it's also perfectly acceptable to declare your array without any predefined size or variable type. So, for instance, in bash an array might look like this:

SomeArray=(1 3 5 7 eleven)

However, the nice thing about bash arrays is that their index syntax is similar to that of C/C++ arrays, and in both languages the indexes start at 0. The only other tricky thing to remember is that in bash an array element is referenced using ${ }, as opposed to C/C++ where the array is just referenced directly. So to get the fifth element of the C/C++ array you would refer to it as:

MyNumbers[4];

While in Bash you would reference the fifth element as:

${SomeArray[4]}
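
The declaration and indexing described above can be tried out directly. This is only a minimal sketch: the variable names fifth and count are mine, chosen for illustration, and ${#SomeArray[@]} (the element count) is shown as an extra.

```shell
#!/bin/bash
# Declare an array with parentheses; no size or type is given.
SomeArray=(1 3 5 7 eleven)

# Indexes start at 0, so the fifth element lives at index 4.
fifth="${SomeArray[4]}"
echo "The fifth element is $fifth"

# ${#SomeArray[@]} expands to the number of elements in the array.
count="${#SomeArray[@]}"
echo "The array holds $count elements"
```

Leaving out the braces, as in $SomeArray[4], does not index the array: bash expands $SomeArray to the first element and then appends the literal text [4].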

There are also a few nifty little things about bash arrays that I'm just starting to explore myself, but here's some code to play with to give you some ideas of what you can do with bash arrays.

#!/bin/bash

##-----------------------------------------------------------------##
##
## A simple shell script to give examples of arrays
##
##-----------------------------------------------------------------##

##--- example one ----##
echo "EXAMPLE 1: array of integers"
Array=(1 3 5 7 eleven)

echo "Here are the values"
## quoting "${Array[@]}" keeps each element as a single word ##
for value in "${Array[@]}"
do
    echo "value is $value"
done

echo ""
echo "Here are the indexes"
## ${!Array[@]} expands to the list of indexes ##
for index in "${!Array[@]}"
do
    echo "index is $index"
done

echo ""
echo "Here's the index and its corresponding value"
for index in "${!Array[@]}"
do
    echo "index is $index, value is ${Array[$index]}"
done

echo ""
echo "So the value at index 2 is ${Array[2]}"

##--- example two ----##
echo ""
echo "EXAMPLE 2: array of strings"
NewArray=(Apples Oranges Pears Beans 99)
echo "NewArray[0] is ${NewArray[0]}"
echo "NewArray[1] is ${NewArray[1]}"
echo "NewArray[2] is ${NewArray[2]}"
echo "NewArray[3] is ${NewArray[3]}"
echo "NewArray[4] is ${NewArray[4]}"

Notice how you can mix strings in with integers in the array and bash doesn't even care. There are a couple of other neat tricks that you can do with arrays, but I'll save those for next week's entry. That's it for this week. As always, if you have any questions or comments please feel free to contact me, and happy coding.
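
One detail matters once array elements contain spaces: an unquoted ${Array[*]} is split on whitespace, while a quoted "${Array[@]}" keeps every element whole. Here is a small sketch of the difference, assuming a made-up array called Fruit (not part of the script above).

```shell
#!/bin/bash
# Hypothetical array whose second element contains a space.
Fruit=("Apples" "Blood Oranges" 99)

# Unquoted ${Fruit[*]} is split on whitespace, so "Blood Oranges"
# turns into two separate words.
unquoted=0
for value in ${Fruit[*]}; do
    unquoted=$((unquoted + 1))
done

# Quoted "${Fruit[@]}" yields exactly one word per element.
quoted=0
for value in "${Fruit[@]}"; do
    quoted=$((quoted + 1))
done

echo "unquoted loop ran $unquoted times, quoted loop ran $quoted times"
```

With the three elements above, the unquoted loop runs four times and the quoted loop runs three times.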

Monday, December 15, 2008

Condition testing and braces

Hello and welcome to this week's edition of Practical Shell Scripting. Recently I was handed a project that was a rewrite of some Perl code into shell script, and I came across some tricks that I was unaware of and wanted to share with you, in particular testing for conditions using metacharacters. I'm sure that all of you are aware of the simple test condition statements using brackets, such as:

if [ -f foobar ]; then

which tests for the existence of the regular file foobar. Something that you might not be aware of, and that I was previously unaware of, is that bash can also do much more complex metacharacter testing through the use of double brackets, i.e. [[ ]].

Here's an example of a simple test using the common metacharacter ^.

##----------------------------------------------------------------##
#!/bin/bash

STRING_TO_READ="foobar"

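## Single brackets do not support =~, so bash reports an error ##
## and this first test always falls through to the else branch ##
if [ $STRING_TO_READ =~ ^foo ]; then
    echo "We have foo"
else
    echo "no foo here"
fi

## Note the double brackets in the test condition statement ##
if [[ $STRING_TO_READ =~ ^foo ]]; then
    echo "We have foo"
else
    echo "no foo here"
fi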

##--------------------------------------------------------------------------------------------##

For those unfamiliar with metacharacters, the ^ symbol anchors a pattern to the start of a line (here, the start of the string). So here's the tricky part: by swapping the regular single test brackets for double brackets you gain the =~ operator, which matches a string against an extended regular expression, the same kind of pattern matching you might otherwise reach for in sed, grep, or other Unix tools. I'm just starting to play with this, but I'm sure that this tool will prove incredibly versatile in the future.
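
To take the double-bracket test one step further, here is a sketch that pulls pieces out of the matched string. The input string, the pattern, and the variable names word and year are invented for illustration. BASH_REMATCH is the array bash fills in after a successful =~ match.

```shell
#!/bin/bash
# Hypothetical input string for this sketch.
STRING_TO_READ="foobar-2008"

# Keeping the pattern in a variable avoids quoting surprises;
# quoting the pattern inside [[ ]] would make bash match it literally.
pattern='^foo([a-z]+)-([0-9]+)$'

if [[ $STRING_TO_READ =~ $pattern ]]; then
    # BASH_REMATCH[0] is the whole match; [1] and [2] are the
    # parenthesized groups, counted from the left.
    word="${BASH_REMATCH[1]}"
    year="${BASH_REMATCH[2]}"
    echo "matched: word=$word year=$year"
else
    echo "no match"
fi
```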

That's it for this week. Be sure to drop me a line if you have any questions or comments. Happy coding.

Monday, December 8, 2008

Simple loops and counters in bash

Welcome to this week's issue of Practical Shell Scripting. Last week I gave an example and explanation of a shell script that read from a text file line by line. This week I'm going to discuss basic looping and simple math comparison. First let's look at the basics of doing simple math and comparison testing. Then I'll give a real-world example of when this stuff might be used.

One of the confusing things for me is the syntax that bash uses for incrementing. The basic construct is just like that of BASIC, and I think maybe Pascal and COBOL, but it's been a while since I coded in those languages. Specifically, bash uses the word let to carry out arithmetic operations. However, here's where it gets a little funky. You would expect a simple statement like

let A=A+1

which is how this is done in many other languages, and bash does in fact accept it. The form I have settled on, though, is an expression like this:

let $((COUNTER=$COUNTER+1))

The $(( on the left must be written exactly like that, with no spaces between the characters, and the variable being assigned, COUNTER, comes right after it without a $. On the right-hand side the variable is written as $COUNTER, referencing it as one normally would (inside the double parentheses the $ is actually optional).

How and why this works puzzled me for a long time. The $(( )) construct is arithmetic expansion: bash evaluates the expression inside it, which performs the assignment, and then substitutes the result, so let is simply handed the new number. That makes the let redundant, since COUNTER=$((COUNTER+1)) on its own does the same job. There is another syntax for doing math that I have come across in some bash scripts; however, I have found that it doesn't always work with every version of bash, and the example that I have given seems to be very portable across older and newer versions.
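
The different ways of writing an increment can be compared side by side. This is a minimal sketch, assuming a reasonably recent bash; each statement adds 1 to the same counter.

```shell
#!/bin/bash
COUNTER=0

# The form used in this post: $(( )) performs the assignment and
# expands to the new value, so let is handed a plain number.
let $((COUNTER=$COUNTER+1))

# Plain let; no $ is needed inside the expression.
let COUNTER=COUNTER+1

# Arithmetic expansion on the right-hand side of an assignment.
COUNTER=$((COUNTER + 1))

# The (( )) arithmetic command with the += operator.
((COUNTER += 1))

echo "COUNTER is now $COUNTER"
```

After the four statements COUNTER holds 4. Of the four, COUNTER=$((COUNTER + 1)) is the one that also works in plain POSIX shells.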

I have one more topic that I need to bring up here before we get into the whole code example, and that is comparison operators. Bash uses a collection of abbreviation-like operators to compare integers; you see similar operators in Perl. However, for us C/C++ guys it looks pretty strange. The basic operators are:

-eq ## equal to
-ne ## not equal to
-le ## less than or equal to
-lt ## less than
-gt ## greater than
-ge ## greater than or equal to
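
Here is a short sketch of a few of these operators at work. The function name compare is hypothetical, made up for this illustration.

```shell
#!/bin/bash
# Hypothetical helper: report how the first integer compares to the second.
compare() {
    local a=$1 b=$2
    if [ "$a" -eq "$b" ]; then
        echo "equal"
    elif [ "$a" -lt "$b" ]; then
        echo "less"
    else
        echo "greater"
    fi
}

compare 3 10     # prints "less"
compare 10 10    # prints "equal"
compare 11 10    # prints "greater"
```

These operators compare integers only; for strings, the single-bracket test uses = and != instead.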

Now here's an example of a very simple while loop that uses both a math variable that gets incremented and a comparison operator to break the loop.

#!/bin/bash
##---------------------------------------------------------------------##
##
## This is a simple looping example with a counter and
## a math comparison operator.
##
##
##---------------------------------------------------------------------##

COUNTER=0
MAX_SIZE=10

while [ "$COUNTER" -lt "$MAX_SIZE" ]
do
    let $((COUNTER=$COUNTER+1))
    echo "$COUNTER"
    sleep 1
done

Now I'd like to share a real-world example that I had to write a few years back to do some housecleaning in a directory to make sure that we didn't fill up the drive. The company I work for does automated nightly builds using a cron job to make sure that the coders didn't break anything during the day's work. There are also additional day builds in the same directory to test quick fixes and branch merges. Each build takes up around 100 to 150 megabytes of space. The problem is that after a while the builds take up the whole drive, at which point the next nightly build fails and someone has to go about the time-consuming and tedious job of purging the old builds. So we came up with a rule that we would keep the 10 most recent builds and delete all the earlier ones. If anyone wanted a specific build saved, it was his responsibility to copy it elsewhere. The house-cleaning script runs before the regular builds to make sure that there is enough room.

#!/bin/bash
# this is the house-cleaning.sh
# it makes sure that there are not a lot
# of silly old builds lying around taking up
# space
#
#
#-----------------------------------------------------
#
# $Id: $
# $Change: $
# $Author: tglaser $
# $Date: 2008/11/06 $
# $NoKeyword: $
#
#
#-------------------------------------------------------

HOME=/home/tglaser
DERIVEDDIR="$HOME/BuildDir"

# bail out if the cd fails so that rm never runs in the wrong directory
cd "$DERIVEDDIR" || exit 1

pwd

counter=0
too_many_files=10   # keep the 10 newest builds

# ls -t lists newest first; this loop assumes build names contain no spaces
for file in `ls -t`
do
    echo "$file"
    let $((counter=$counter+1))
    if [ "$counter" -gt "$too_many_files" ]; then
        echo "there are $counter files or directories"
        echo "So we are going to delete $file"
        rm -rf "$file"
    else
        echo "you've still got room to work here"
        echo "No need to delete anything yet"
    fi

    echo " "
    echo " "
    sleep 3
done

cd "$HOME"

exit

##----------------------------------------------------##

The only important difference between this script and the first one is the for loop which I will go into in one of the next issues. So that's it for this week. As always if you have any questions or comments please feel free to contact me. Happy coding.

Monday, December 1, 2008

Reading a file line by line using bash

Welcome to this week's edition of Practical Shell Scripting. This week I'm going to give a small, simple example of how to read a file one line at a time using bash. As with my previous and hopefully future blog entries, I'm trying to keep this down to a single small topic and give a good example of it without clouding it with a lot of other things.

What prompted this week's entry was a side project I had over the weekend to update one of my servers at home, which required me to download a large number of files from an HTTP server. I did not want to click on each file and download it through the browser one at a time, so I viewed the source of the HTML page, saved it, and stripped out the HTML until I had a file that was nothing more than a list of all the files that I wanted to download. Once I had that file I fed it into a much more complicated script that retrieved those files from the HTTP server using wget.

For the sake of this example we're going to look at just a tiny part of that, which I've turned into a generic example to show how bash reads line by line from a file called grocery.list.

First, here's what's in the file called grocery.list :

apples
oranges
bananas
pears
juice

Now here's the bash script to read it:

#!/bin/bash
###-------------------------------------------------###
#
# this is a simple bash script to show
# how to read from a file
#
###-------------------------------------------------###

INPUT_FILE="$1"

# IFS= and -r keep leading whitespace and backslashes intact
while IFS= read -r curline; do
    echo "$curline"
done < "$INPUT_FILE"

...and that's all there is to it. What we have here is a simple looping construct that steps through the input file, given as the first argument ($1), one line at a time. In this example "curline" is the variable that the data is read into. This can be confusing because "curline" does not need to be declared ahead of time; in fact, in all the shell scripts I have seen I cannot recall a single instance of such a variable being declared up front. The other thing that has always been a little counterintuitive to me is that the file being read from is named at the bottom of the while loop, which takes its input through the redirection on the done line. Most of us who come from a more formal language like C or C++ are used to having the file declared and opened someplace closer to the top, so this always seems odd to me.

So to execute this example you would type:
./ReadLineExample.sh grocery.list

And the screen should show exactly what is in the file. While there are a great many tools in bash for parsing and working with files this is one of the most versatile since it allows simple usage of a looping construct.
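
The same loop can do more than echo each line back. Here is a sketch that numbers the lines as it reads them. The function name number_lines and the scratch file are my own inventions for illustration. IFS= and -r stop read from trimming whitespace or interpreting backslashes, and the extra test after || picks up a final line that lacks a trailing newline.

```shell
#!/bin/bash
# Hypothetical helper: print each line of a file prefixed by its line number.
number_lines() {
    local input_file=$1 curline n=0
    # IFS= preserves leading/trailing whitespace; -r keeps backslashes literal.
    # The || [ -n "$curline" ] part handles a last line with no newline.
    while IFS= read -r curline || [ -n "$curline" ]; do
        n=$((n + 1))
        echo "$n: $curline"
    done < "$input_file"
}

# Build a scratch copy of grocery.list so the sketch is self-contained.
list=$(mktemp)
printf 'apples\noranges\nbananas\npears\njuice\n' > "$list"
number_lines "$list"
rm -f "$list"
```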

So that's it for this week. As always if you have any questions or comments please feel free to contact me. Happy coding.

Monday, November 24, 2008

Working with getopts and optargs in bash

Welcome to this week's edition of Practical Shell Scripting. This week's edition is about getopts and optargs in bash. In the last couple of weeks I was given a job at work to create a shell script capable of taking command line arguments in the fashion of what C/C++ programmers know from argc and argv. That is, each option indicates what the value following it is and how it will be used. This is typically depicted like so:

#> ./foo -y 14

Where foo is the application or executable and -y tells you something about the next argument coming in. So maybe -y is for years, or something else defined internally by the application foo, and the argument following -y is that value. This is easily done in the C programming language, but I had never done it in shell scripts, and while it seems straightforward there was a gotcha that hung me up a bit and that I wanted to point out. So first, here's a slice of example code to illustrate the basics of implementing getopts and optargs in a shell script. As in all shell scripts, the # marks lines that are not interpreted by bash, used either as comments or to delineate the code in some fashion.

#!/bin/bash
###-----------------------------------------------------------###
# GetOptsExample.sh
# a small shell script to demonstrate getopts and optargs in bash
# --apg
#
###-----------------------------------------------------------------------------------###
PRODUCE=0

while getopts "a:o:b:" OPTION
do
    case $OPTION in
        a)
            ## -a is for apples ##
            echo "OPTION is a"
            PRODUCE=$OPTARG
            echo "The produce is $PRODUCE apples"
            ;;
        o)
            ## -o is for oranges ##
            echo "OPTION is o"
            PRODUCE=$OPTARG
            echo "The produce is $PRODUCE oranges"
            ;;
        b)
            ## -b is for beans ##
            echo "OPTION is b"
            PRODUCE=$OPTARG
            echo "The produce is $PRODUCE beans"
            ;;
    esac
done

###--------------------------------------------------------------------------------------###

So what we see here is basically a simple case statement that uses the variable OPTION to differentiate which options are incoming as arguments. The optargs, which getopts places in the variable OPTARG, are the parts that follow the options. This all seems straightforward enough. However, the gotcha in my original attempts was quite sneaky. It turns out that one critical aspect of the getopts line is the placement of the colon after each option. An option will only accept an argument if it is followed by a ":". So if you were to enter the line of code like this

while getopts "a:o:b" OPTION; #### -- note the missing ":" after the b option

any argument that you had hoped to pass along with -b would be lost: getopts treats b as a plain flag and leaves OPTARG empty. One easy way to test this out is to simply walk through each section in the case statement and echo out what its OPTARG value is.
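
The effect of the missing colon can be seen by running the same arguments through two option strings. This is just a sketch; the function name show_b is hypothetical, and OPTIND is made local so that getopts starts from the first argument on every call.

```shell
#!/bin/bash
# Hypothetical helper: parse the arguments with the given option string
# and report what OPTARG held when -b was seen.
show_b() {
    local optstring=$1 OPTION OPTIND=1
    shift
    while getopts "$optstring" OPTION; do
        case $OPTION in
            b) echo "b got: '${OPTARG:-}'" ;;
        esac
    done
}

show_b "a:o:b:" -b 12   # colon after b: prints b got: '12'
show_b "a:o:b"  -b 12   # no colon:      prints b got: ''
```

In the second call the 12 is not consumed by -b at all; getopts stops at it as the first non-option argument.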

So that's it for this week. As always, if you have any questions or comments please feel free to contact me. Happy coding.

Monday, November 17, 2008

An introduction to Unix shell scripting.

One of the classes that I took during my post-baccalaureate studies was entitled An Introduction to Unix Programming. Now, I had worked with Unix briefly during my undergrad studies, but never really considered it as a programming language. It's an operating system, right? However, while taking the class I came to an epiphany: one of the truly beautiful things about Unix is that it's a little of both. Since the guys who developed the operating system were programmers first and foremost, they built an operating system so rich in tools that it was essentially a programmer's operating system. This might explain why so many non-programmers have so many frustrations with Unix, and why so many geeks find themselves right at home in it. The gentleman who taught the class was an extremely experienced C and C++ programmer, and he made a statement about Unix that has stuck with me to this very day. He said, "The thing I like about Unix is that every day you work with it you find some tiny little jewel hidden in there." As I have been programming in Unix for these last 14 years, I can confirm that he was exactly right. There is not a day that passes that I do not find some cool little thing that I didn't know about, or some nifty little tool.

However, all that being said, Unix can be confusing, and its tools are all too often arcane and difficult to understand. My objective in starting this blog was twofold. First, I would like to help folks overcome some of the many pitfalls and gotchas that have cost me time and anguish over the years. Second, my hope is to create a reservoir for all the little code pieces that I and hopefully others can pull from for future efforts.
