Linux Development
Developers are, of course, encouraged to develop for a Linux system. There are a few cultural differences between Linux development and development elsewhere (for instance, free software, lack of universal standards, etc.), but today we will be discussing the technical side of Linux development.
One important thing to remember is that there is no .NET framework or Cocoa for Linux. There is no universal Linux framework. This can make life a little more difficult (though there are MANY external libraries available for use), but it does tend to encourage developers to do things their own way.
Languages
There are many programming languages available for Linux. A choice of language is partly a personal choice (what you prefer to write in), but can also be affected by what you're writing (you don't write a device driver in Java).
Many compilers are provided by GCC, the GNU Compiler Collection. For this reason, every Linux distro has (or can easily obtain) a C, C++, (limited) Java, Fortran, etc. compiler.
C
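C is the single most common programming language on Linux. Remember that Linux is modeled on UNIX, and C was created by Dennis Ritchie, one of the original UNIX developers, as the language UNIX itself was written in. (Brian Kernighan, who was also heavily involved in UNIX, co-wrote the standard book on the language with Ritchie.) The Linux kernel is written in C, and many in the Linux community hail it as the greatest language.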
The basic functionality of the C Standard Library is expanded on by the GNU C Library (glibc), which is available on every Linux system, and includes functions and structures for filesystem traversal, network programming, interfacing with the kernel, and just about all other such low-level work. It is very well documented online, and is a fantastic resource for C programming.
C++
C++ is another popular language on Linux, but is nowhere near C's popularity. It is, however, the preferred object-oriented language. Because C++ can use C libraries, glibc is available to it as well, and it can use most third-party C libraries for many things. Besides this, there isn't much to say.
Java
Java holds a strange position in the Linux community. Java is fairly popular, but it is also resented because the Sun Java implementation is not open-source. There is, however, a Java implementation available for Linux, and Java apps are certainly available, particularly in cross-platform applications (LimeWire, Azureus, etc.).
The GCC includes a compiler called gcj that attempts to implement the Java language, though it has not had full success, particularly with the GUI libraries. Further, as it has recently been announced that Java will be released under the GPL, it is likely that Java will see a great deal of advancement on Linux in the near future.
Perl
Perl is an extremely common programming language on Linux: it was designed with shell scripting and other UNIX tools in mind, and is very friendly to the Linux mindset. Knowing Perl is a huge plus for a Linux systems administrator, since it is a very useful scripting language. While Perl can be used for huge projects (such as Frozen Bubble), it is also common to see it in place of a shell script, depending on the complexity of the script.
C#
C# is an extremely rare language on Linux, but it is possible and available. The Mono project is attempting to implement a .NET-compatible framework, and has therefore included a C# compiler. So while C# apps can be written and distributed, again, this is particularly uncommon.
Visual Basic
Visual Basic is NOT available on Linux. Having said that, the Gambas project has developed a VB-like language that may be usable by those familiar with Visual Basic. However, Gambas-based applications are essentially non-existent.
Shell Scripting
Having discussed these languages, it is now time to turn towards shell scripting. Shell scripting is a useful technique that allows you to essentially write a program by tying together existing programs in new and exciting ways. Shell scripting is not a full programming language, and is certainly not to be used if you are working on a full and complex project. However, for repetitive tasks, or even tasks for which many pieces can be automated, a script is often the perfect route.
Each shell has its own scripting language: we will focus on the Bash shell, which is the shell that we have been using this whole time. What we are learning now will not work on the C shell (a popular shell for the BSD operating systems), and much will not work on the Bourne Shell (the predecessor to the Bash shell). But again, Bash is by FAR the most common shell on Linux systems.
Introduction
As I said, Bash scripting is all about tying existing applications together in new ways. But before we get to that, let's try the traditional first script. It looks like this:
#!/bin/bash
echo "Hello, world!"
Alrighty then, let's execute this script. We exit our editors, and run the command 'chmod +x FILE' to make the script executable. Then we run ./FILE to execute it. And lo and behold, "Hello, world!" prints to the screen.
Now, you may be wondering: what the hell? What does this script do? Let's look at it line-by-line.
Line 1 is the shebang line (the name comes from "hash bang", for the '#' and '!' characters). On Linux, a script in any scripting language normally starts with a shebang line. This tells the system which interpreter to run the script with. So this script says "Use /bin/bash" (which is where the Bash shell is located). A Perl script would start with:
#!/usr/bin/perl
And a Ruby script would start with:
#!/usr/bin/ruby
Our second line invokes the 'echo' program, which simply prints out its parameters.
You may also notice that syntax is pretty sparse. Scripting was designed this way. You don't need semicolons unless you have two statements on the same line, you don't need a special phrase to indicate the end of the script. The script just goes.
Alrighty! So that's pretty simple there.
Now let's change it up a little bit. Let's say we want to print our message twice:
#!/bin/bash
echo "Hello folks!"
echo "Hi again, folks!"
Now, if we want to say hello to the kiddies instead of to the folks, we need to change this twice. Which is where variables come in.
Variables
In Bash scripting, variables are entirely untyped. This means that a variable can store a number, string, array, etc. with no changes.
So let's change our program to use a variable instead:
#!/bin/bash
people=folks
echo "Hello, $people"
echo "Hi again, $people"
Now let's take a little look at this. We can probably guess that 'people' is the name of our variable, and that the line 'people=folks' is assigning a value. You can also probably guess that '$people' uses the value of the 'people' variable.
In Bash scripting, when you assign a variable, you can just do the assignment: Bash knows that you're assigning to a variable. When you are getting a value, however, you prefix the variable with a '$'. This is why, in the assignment line, 'people' is known to be a variable (it's being assigned to), but 'folks' is known to be text (there is no '$').
You may have also noticed that there are no quotes around the word 'folks'. Again, Bash is a very syntax-sparse language. There are different types of quotes that mean different things, but for a simple one-word string, you don't need them. (A value that contains spaces does need quotes.)
As a final note on the assignment, notice that the equals sign comes immediately after the variable name, with no space on either side. This is VERY important. If there were spaces, Bash would not see an assignment at all: it would try to run a command called 'people' and hand it '=' and 'folks' as arguments.
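Here is a small sketch of those assignment rules. The variable names 'audience' and 'greeting' are made up for illustration.

```shell
#!/bin/bash
# No spaces around '=': this is an assignment.
audience=folks

# A value containing spaces must be quoted, or Bash treats the
# second word as a command to run.
greeting="Hello there"

echo "$greeting, $audience"

# With spaces around '=', Bash would instead try to run a command
# called 'audience' with the arguments '=' and 'folks':
#   audience = folks
```

The last example is left as a comment because running it would only produce a "command not found" style error.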
While we're talking about variables, we should talk about the first type of special variable: parameter variables. As you've noticed, when we call a program from the terminal, we can give it parameters. You can do the same with a script. In a script, the parameters are variables $1 to $9 ($0 is the name of the script). You can have more than 9 parameters, but you need to access them differently.
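As a rough sketch of reaching past the ninth parameter: braces do the job. The helper name 'show_params' is hypothetical, and a function is used here only because its parameters ($1, $2, and so on) behave just like a script's.

```shell
#!/bin/bash
# show_params is a hypothetical helper; inside a function the
# numbered parameters work the same way as in a script.
show_params() {
    # $# holds the number of parameters that were passed.
    echo "count: $#"
    echo "first: $1"
    # Parameters past 9 need braces; $10 would mean "$1" followed by "0".
    echo "tenth: ${10}"
}

show_params a b c d e f g h i j
```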
So let's write a script that prints out its first argument:
#!/bin/bash
echo "You told me: $1"
Sure enough, when we run this script, it prints out the first argument.
Now, this is all kind of cool, but it sure would be nice if we could do something based on that parameter. And we can! We do this with conditionals.
Conditionals
So let's write a script that says hello if the parameter is "hello", and a mean message otherwise.
#!/bin/bash
if [ "$1" = "hello" ]; then
    echo "Hello there!"
else
    echo "You're a meanie!"
fi
WOAH! That's a lot of stuff there. What does all of this mean?
So let's start with the if statement. If statements are in virtually every language, so the basic form shouldn't be too difficult. You'll notice that we use square brackets instead of parentheses, but that isn't too tough to understand. However, NOTE THE SPACES. This is extremely important. Without the spaces, the script will have an error.
Let's look at our condition. We have surrounded everything in quotes, and there's a reason for this. Bash is a very stupid language. When it sees '$1', it immediately substitutes the value of $1 before doing anything else. If that value has a space in it, Bash sees two words before the equals sign, and it gets very confused. (The same happens if $1 is empty, because then there is nothing before the equals sign at all.) Therefore, you always surround your variables in quotes.
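To see the splitting for yourself, here is a minimal sketch. The helper 'count_args' and the variable 'name' are invented for this example.

```shell
#!/bin/bash
# count_args is a hypothetical helper that reports how many
# arguments it received.
count_args() {
    echo "$#"
}

name="two words"

count_args $name     # unquoted: the value is split into 2 arguments
count_args "$name"   # quoted: the value stays as 1 argument
```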
Also note the '='. In Bash, an equals sign is used for string comparison; there's a different construct for numbers.
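A quick sketch of why the difference matters, using "-eq", which is the numeric counterpart of '=' inside square brackets. The function name 'compare_demo' is made up for the example.

```shell
#!/bin/bash
# compare_demo is a hypothetical helper that runs both kinds of test.
compare_demo() {
    # '=' compares text, so "05" and "5" are different strings.
    if [ "05" = "5" ]; then
        echo "same string"
    else
        echo "different strings"
    fi

    # '-eq' compares numbers, so 05 and 5 are equal.
    if [ "05" -eq "5" ]; then
        echo "same number"
    else
        echo "different numbers"
    fi
}

compare_demo
```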
Next is the semicolon. But I said we don't need a semicolon! In this case, the 'if' and the 'then' are on the same line and are considered two statements. I could have put the 'then' on the next line and not used a semicolon.
Speaking of the 'then', Bash doesn't use curly brackets (that's a little bit of a lie, as we'll see later). If statements use 'then' and 'fi' ("if" backwards) to denote the body, and loops use 'do' and 'done'.
We then have an else statement, which means "If none of the above if statements was correct, then do this". Pretty simple.
And then we end with the 'fi' I mentioned before.
Now then, let's improve our script. If you give me the word "goodbye", we will say goodbye back. This construct looks like this:
#!/bin/bash
if [ "$1" = "hello" ]; then
    echo "Hello there!"
elif [ "$1" = "goodbye" ]; then
    echo "Goodbye, friend!"
else
    echo "You're a meanie!"
fi
Notice that we use 'elif' to mean the same as an 'else if' in most languages (or Perl's 'elsif'). Using this technique, we can build a huge tree of possible responses.
Loops
Now, there's one more important construct to talk about, and this is the loop. Bash has support for 3 loops: you probably know two of them, the for loop and the while loop.
The for loop is generally used to go through a list of things, and the while loop goes until a condition is met. I should note that the for loop is generally more like a foreach loop (if you're familiar with those).
Let's show an example of each of these:
#!/bin/bash
for number in 1 2 3 4 5; do
    echo "The number is $number"
done
You can probably figure out what this does. It takes the given list, assigns each element to $number, and does the body of the loop to each element.
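The list doesn't have to be typed out by hand. As a sketch, the special variable "$@" expands to all of the parameters, so a for loop can walk through whatever it was given. The helper name 'greet_all' is hypothetical.

```shell
#!/bin/bash
# greet_all is a hypothetical helper; "$@" expands to every
# argument, each one kept as a single list element.
greet_all() {
    for person in "$@"; do
        echo "Hello, $person"
    done
}

greet_all folks "the kiddies"
```

Note the quotes around "$@": without them, "the kiddies" would be split into two separate elements.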
Now let's see the while loop:
#!/bin/bash
number=0
while [ "$number" -ne "5" ]; do
    echo "The number is $number"
    number=$[number+1]
done
Now, there's a lot of new things here, so let's take this step-by-step.
First, note the 'while' loop. We give it a condition, and while that condition is true, we repeat the loop. The condition we have given is that $number doesn't equal 5. Yes, "-ne" stands for "not equal" (for numbers). This is equivalent to "!=" in most languages, but in Bash, "!=" is used for strings.
Next, let's look at the body of the loop. We are incrementing the number each time by using a funny construct. Let's look at an assignment like this:
number=1
number=$number+1
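This construct actually sets the value of $number to the string "1+1". To do math, we use $[...]. (This is an older form that Bash still accepts; $((...)) does the same job and is the form the Bash manual documents.) So the line: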
number=$[number+1]
actually increments $number.
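Here is the same idea as a small sketch using the $((...)) form of arithmetic. The variable name 'text' is made up for the example.

```shell
#!/bin/bash
number=1

# Plain assignment only builds a string.
text=$number+1
echo "$text"        # prints 1+1

# $(( ... )) evaluates the expression as arithmetic.
number=$((number + 1))
echo "$number"      # prints 2
```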
Now, I said that there are actually 3 loops out there. And there are. The third loop is the until loop, which is the opposite of while. It loops UNTIL the condition is true. So the above script could also be written:
#!/bin/bash
number=0
until [ "$number" -eq "5" ]; do
    echo "The number is $number"
    number=$[number+1]
done
This will loop UNTIL $number is 5, and is generally a more natural way to write this condition.
External Programs
Up to this point, all we've done is print out some text. Let's actually do something!
So for instance, let's take in a filename, and rename that file. The command to rename a file is "mv", remember?
#!/bin/bash
mv "$1" "$1.renamed"
That seems awfully simple, huh? I hope you remembered to use quotes. Remember: if variables have a space, things get confused if you don't use quotes. Bash is a stupid, stupid language. But do you realize that we can just invoke a program in the script and it works? Craziness!
But now let's try something really cool: storing the output of a command in a variable. Let's take in a filename, and instead of just adding something to the end, let's change the extension. To do this, we'll use sed.
#!/bin/bash
# Replace the last dot and everything after it with '.ext'.
newname=$(echo "$1" | sed -e 's/\.[^.]*$/.ext/')
mv "$1" "$newname"
Pretty nifty, eh? But what does this do?
Well, the only new thing here is the newname assignment. $(...) means "execute the commands inside here and store the output". You can also use backticks (`...`), but I find that $(...) makes the statement stand out a lot more, and is also much more difficult to confuse with regular quotes.
In this particular case, we are piping $1 into sed. This is because sed usually takes a filename: by using a pipe, we can just give it the value of $1. It is very common to pipe variables into programs when you script.
sed then runs a regular expression that replaces from the last dot to the end with '.ext'.
This all gets stored in $newname, which is then used with mv to rename the file.
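For comparison, here is a sketch that gets a similar result without sed, using Bash's own suffix-removal expansion (a swapped-in technique, not the one used in the script). The helper name 'change_ext' is hypothetical. Unlike the sed version, it tacks '.ext' onto names that have no dot at all.

```shell
#!/bin/bash
# change_ext is a hypothetical helper: it prints its first argument
# with everything after the last dot replaced by 'ext'.
change_ext() {
    # ${1%.*} removes the shortest trailing match of '.*',
    # that is, the last dot and whatever follows it.
    echo "${1%.*}.ext"
}

# Command substitution captures what the function prints.
newname=$(change_ext "notes.txt")
echo "$newname"
```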
Files and Redirection
Now then, you may be aware that when you run a command, you can redirect its output to a file. For instance, the command:
echo "Hello" > testfile
writes "Hello" to a file called testfile.
Similarly, you can redirect input to be from a file instead of from the terminal (rather like piping). For instance:
sed -e 's/Hello/Goodbye/' < testfile
In a script, we can do this, but we can actually open up whole new files! To do this, we take advantage of file descriptors.
If you've used C or C++, you know that you have several streams by default. stdin has file descriptor 0, stdout has 1, and stderr has 2. When we redirect input or output (as above), we attach these streams to different sources. For instance, stdin is usually connected to the keyboard. When we use '<' to redirect it, the program doesn't know that it is redirected, but it reads from a file instead of the keyboard. Yay!
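Because each stream has its own number, you can redirect them separately. A minimal sketch, with a made-up helper called 'report' (/dev/null is a special file that simply discards whatever is written to it):

```shell
#!/bin/bash
# report is a hypothetical helper that writes one line to stdout
# (descriptor 1) and one line to stderr (descriptor 2).
report() {
    echo "normal output"
    echo "error output" >&2
}

# Throw away stderr; only stdout reaches the terminal.
report 2> /dev/null

# Throw away stdout; only stderr reaches the terminal.
report > /dev/null
```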
If we want to open new files in our script, we can start with file descriptor 3. So let's write a script that keeps a log of its activities:
#!/bin/bash
exec 3> log
echo "$(date): Creating first file" >&3
touch file1
echo "$(date): Creating second file" >&3
touch file2
echo "$(date): Removing first file" >&3
rm file1
echo "$(date): Removing second file" >&3
rm file2
So what happened here?
The exec statement at the top opened file descriptor 3 for output (denoted by '>') and connected it to the file "log". Each of our echo statements is then redirected to that file to use as a log.
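The same trick works for reading, and a descriptor can be closed again when you are done with it. A rough sketch, assuming a throwaway file created with mktemp (the names 'logfile' and 'line' are made up for the example):

```shell
#!/bin/bash
# The log path is a throwaway temporary file made with mktemp.
logfile=$(mktemp)

# Open descriptor 3 for writing, write two lines, then close it.
exec 3> "$logfile"
echo "first entry" >&3
echo "second entry" >&3
exec 3>&-

# Open descriptor 4 for reading and read the first line back.
exec 4< "$logfile"
read -r line <&4
exec 4<&-

echo "Read back: $line"
rm "$logfile"
```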
Example
Alrighty. We have little time left, so I'd like to show you a script that I have given to many people and which is fairly useful.
In the terminal, if you use 'rm' to delete a file, the file is gone forever. Many people don't like this. Therefore, this script creates a trashcan and moves your files there instead of actually removing them:
#!/bin/bash
mkdir ~/.Trash &> /dev/null
until [ -z "$1" ]; do
    mv "$1" ~/.Trash/
    shift
done
Let's analyze this script for a moment.
The first command in this script creates the folder '.Trash' in your home directory. Then there's that funky bit on the end.
What this does is redirect both stdout and stderr to the file /dev/null. /dev/null is a special file that takes output and then eradicates it. So even though we are "writing" data to /dev/null, nothing actually gets written. The reason for this is that we don't care if the directory already exists (mkdir complains when it does). We just want it to be there.
The rest of the script is a simple until loop. There is some special syntax there, however. The "-z" test is true if the given string is empty. So this will loop until $1 is empty. Inside of the loop, we move the file named by $1 into ~/.Trash. We then run this command "shift". This takes each parameter variable and moves it down. $9 becomes $8, $3 becomes $2, etc. $1 is discarded. This way, we are able to loop through every parameter.
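One thing to watch for: the "-z" test also stops the loop early if one of the parameters happens to be an empty string. A sketch of one way around that is to count the remaining parameters with $# instead. The helper name 'list_args' is hypothetical, and it only prints its arguments rather than moving any files.

```shell
#!/bin/bash
# list_args is a hypothetical helper that walks its arguments
# with shift, stopping when the argument count reaches zero.
list_args() {
    while [ "$#" -gt 0 ]; do
        echo "argument: $1"
        shift
    done
}

# The empty second argument does not end the loop.
list_args one "" three
```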