Most Linux and UNIX system administrators rely on a diverse mix of shell scripts and tools such as grep, awk and cut. This classical approach has proven its merits, but the resulting scripts are generally hard to read and to maintain. One solution, especially in a complex environment, is to use a real programming language for system administration tasks instead of shell scripts. Traditionally, Perl has been very popular among sysadmins, but some people maintain that it is not much better than shell.
In this article we choose Ruby, a feature-rich but simple object-oriented programming language best known from the popular web application framework Ruby on Rails. Ruby has many built-in and external libraries that come in handy for typical system administration tasks such as file manipulation, text processing, log file analysis and logging into other servers. The language’s gentle learning curve, coupled with scripts that are easy to read and maintain, makes it a valid choice for sysadmins.
Simple but effective
Let’s start with some basic features of the language, to show you why Ruby is in many cases a better choice than Perl or a shell script. First of all, everything in Ruby is an object, even values that are primitive types in many other languages, such as numbers and strings. An array of numbers, for instance, is just an object with methods, and calling those methods takes a very simple syntax. For example, this is how you get the length of an array:
[1, 2, 3, 4, 5].length
And this is how you get the last element of an array:
[1, 2, 3, 4, 5].last
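The same holds for numbers and strings. As a minimal sketch (the values and variable names are arbitrary samples, not part of any real script), methods can be called directly on literals:

```ruby
# Numbers and strings are objects, so methods can be called on literals.
answer_is_even = 42.even?          # asks the integer itself
shouted        = "sysadmin".upcase # returns a new, upper-cased string
name_length    = "sysadmin".length # number of characters in the string

# Arrays offer many more methods besides length and last.
numbers  = [1, 2, 3, 4, 5]
largest  = numbers.max
reversed = numbers.reverse

puts answer_is_even
puts shouted
puts name_length
puts largest
puts reversed.inspect
```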
A second powerful feature is the code block, which allows for shorter and easier-to-grasp code. A code block is a function without a name that can be passed to another function as a parameter. For example, if you have an array of numbers, you could double each one in one line:
numbers = [1, 2, 3, 4, 5]
doubles = numbers.collect { |n| 2*n }
The method ‘collect’ invokes the block 2*n for each element n in the array numbers. As a result, ‘doubles’ has the content [2, 4, 6, 8, 10]. Most of the built-in Ruby classes such as Array have some methods that take a code block as a parameter, especially for looping, iterating, sorting and mapping. Once you get used to code blocks, they make programming in Ruby quite intuitive. For example, a code block comes in handy when querying a database:
#!/usr/bin/env ruby
require 'mysql'

db = Mysql.new("localhost", "root", "password")
db.select_db("myusers")
result = db.query("select name,email from users")
# Each row is an array holding the selected columns in order: name, email.
result.each { |row| puts "User #{row[0]} has email address #{row[1]}" }
The first line is the #! line, which lets the shell run the script directly if the file is executable. With the ‘require’ line, we indicate that we use the MySQL module (which we first have to install separately with our operating system’s package manager).
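Blocks are not limited to ‘collect’ and ‘each’. The following sketch shows the same style with selecting, sorting and mapping; the mount points and percentages are invented sample data, not output from a real system.

```ruby
# Invented sample data: disk usage in percent per mount point.
usage = { "/" => 71, "/home" => 93, "/var" => 48, "/tmp" => 12 }

# select keeps the pairs for which the block returns true.
nearly_full = usage.select { |mount, percent| percent > 90 }

# sort_by orders the pairs by the value the block returns
# (negated here so that the fullest file system comes first).
by_usage = usage.sort_by { |mount, percent| -percent }

# map (an alias of collect) turns each pair into a report line.
report = by_usage.map { |mount, percent| "#{mount}: #{percent}%" }

puts "Nearly full: #{nearly_full.keys.join(', ')}"
puts report
```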
Other features will be familiar from other programming languages or even shell scripts, for example running external commands with backticks, or regular expressions. This is how you use both features to handle an unknown host:
resolve = `host #{address}`
puts "Host #{address} does not exist" if resolve =~ /does not exist/
The variable ‘resolve’ is set to the output of the shell command ‘host’, run with the value of ‘address’ as its argument. This output is then matched against the regular expression /does not exist/ (the text that the host command prints on a Linux system; adapt this if you use another operating system). If it matches, a message is shown.
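A regular expression can also capture part of a command’s output instead of only testing it. The sketch below assumes the usual "NAME has address IP" line printed by the host command; the helper name first_address and the sample text are made up for illustration, and the sample uses an address from a documentation range.

```ruby
# Returns the first IPv4 address found in the output of `host`,
# or nil if the output contains no "has address" line.
def first_address(host_output)
  match = host_output.match(/has address (\d{1,3}(?:\.\d{1,3}){3})/)
  match && match[1]   # match[1] is the text captured by the parentheses
end

# Invented sample output, used here instead of a real DNS lookup.
sample = "www.example.com has address 192.0.2.10\n" \
         "www.example.com has IPv6 address 2001:db8::10\n"

puts first_address(sample)

# In a real script the output would come from backticks, for example:
#   first_address(`host #{address}`)
```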
Text processing
Ruby has great capabilities to process text, which is an important task in many UNIX workflows. The String class is a powerful instrument for this: it can hold, compare and manipulate textual data. Let’s show how we open a CSV (comma-separated values) file with bandwidth data of an OpenWrt router and filter the download data for a specific host on each day:
#!/usr/bin/env ruby
require 'csv'

ip = ARGV[0]

CSV.foreach('bandwidth.csv') do |row|
  if row[0].to_s == "download" && row[1].to_s == "day" && row[3].to_s == ip
    puts "From #{Time.at(row[4].to_i)} to #{Time.at(row[5].to_i)}: #{row[6].to_i/1048576} Mbytes"
  end
end
This script uses the csv library, whose CSV.foreach method reads the file row by row. The method takes a code block (here written as a do-end block instead of inside braces, which is equivalent) and executes it for each row. Inside the block, the fields of the current row are the elements of the ‘row’ array. We test each row against our requirements (it should hold daily download data for the given IP address), converting the fields to strings so that empty fields, which are read as nil, compare safely. For each match we print the begin and end time of the day (the file stores times in seconds since the epoch, so we convert the field to an integer and pass it to Time.at to get a human-readable form), followed by the number of megabytes downloaded. At the top of the script, the variable ‘ip’ is set to ARGV[0], the first argument that the user passes to the program on the command line.
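Filtering is only one use of such a loop. As a hedged sketch, the same column positions can be used to add up the download volume per host. The rows below are invented, the third column is a placeholder because the article does not describe it, and CSV.parse is used so that the example can read its data from a string.

```ruby
require 'csv'

# Invented sample rows. Column positions follow the script above:
# 0 direction, 1 interval, 3 host, 4 start, 5 end, 6 bytes.
# Column 2 is a placeholder.
SAMPLE = <<~DATA
  download,day,x,192.168.1.10,1230768000,1230854400,10485760
  download,day,x,192.168.1.11,1230768000,1230854400,2097152
  upload,day,x,192.168.1.10,1230768000,1230854400,524288
  download,day,x,192.168.1.10,1230854400,1230940800,5242880
DATA

# A hash with a default value of 0 lets us add to keys not seen before.
totals = Hash.new(0)

CSV.parse(SAMPLE) do |row|
  next unless row[0] == "download" && row[1] == "day"
  totals[row[3]] += row[6].to_i
end

totals.each do |host, bytes|
  puts "#{host}: #{bytes / 1_048_576} Mbytes"
end
```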
XML processing
Among the many tasks of a system administrator, one that recurs in a lot of scripts is converting data from one format to another. More and more file formats are based on XML. For this purpose, Ruby ships with the REXML (Ruby Electric XML) library. So let’s show how we convert our router’s CSV file to XML, with the daily download data for each host:
#!/usr/bin/env ruby
require 'csv'
require 'rexml/document'

xml = REXML::Document.new
root = xml.add_element("statistics")

CSV.foreach('bandwidth.csv') do |row|
  if row[0].to_s == "download" && row[1].to_s == "day" && row[3].to_s != 'COMBINED'
    ip = row[3].to_s
    # Reuse the host element for this IP address if it already exists.
    host = root.elements["host[@ip='#{ip}']"]
    if host.nil?
      host = root.add_element("host", {"ip" => ip})
    end
    day = host.add_element("day")
    day.attributes["begin"] = Time.at(row[4].to_i).to_s
    day.attributes["end"] = Time.at(row[5].to_i).to_s
    day.add_element("bandwidth", {"bytes" => row[6].to_i.to_s})
  end
end

puts xml
Besides the csv library, we also use rexml/document. We create an empty XML document and add an element ‘statistics’ as the root element. Then we read the CSV file as in our previous script, but instead of just printing our findings we build an XML document from them. For each IP address we add a host element, after first checking that it doesn’t exist yet. Then we add a day element to the host, and a bandwidth element to the day element. The beginning and end of the day and the number of bytes are added to their respective elements as attributes. At the end, we print the whole XML tree to stdout, so you can pipe it into another program (maybe one that reads an XML file and outputs an SVG graph) or redirect it to a file.
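REXML can read such a document back as well. The following sketch parses a small hand-written document in the same shape (statistics, host, day, bandwidth) and adds up the bytes per host. The XML text is invented sample data; the element and attribute names are the ones used in the script above.

```ruby
require 'rexml/document'

# Invented sample document in the shape produced by the script above.
xml_text = <<~XML
  <statistics>
    <host ip='192.168.1.10'>
      <day begin='day 1' end='day 2'><bandwidth bytes='10485760'/></day>
      <day begin='day 2' end='day 3'><bandwidth bytes='5242880'/></day>
    </host>
    <host ip='192.168.1.11'>
      <day begin='day 1' end='day 2'><bandwidth bytes='2097152'/></day>
    </host>
  </statistics>
XML

doc = REXML::Document.new(xml_text)
totals = {}

# elements.each takes an XPath expression and a block.
doc.elements.each("statistics/host") do |host|
  bytes = 0
  host.elements.each("day/bandwidth") do |bandwidth|
    bytes += bandwidth.attributes["bytes"].to_i
  end
  totals[host.attributes["ip"]] = bytes
end

totals.each { |ip, bytes| puts "#{ip}: #{bytes} bytes" }
```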
Working with files
Most system administrators do file manipulation manually in their favourite shell, but there comes a time when some things have to be automated. This can be done in a pure Bash script, but the file manipulation commands in Bash are not the easiest to read. The matching methods are easier to read in Ruby, although they are somewhat scattered among different classes and modules: File, Dir, FileUtils and Find. Unfortunately, these don’t behave in a very Ruby-like way: most of the methods are class methods instead of instance methods, which means that the object-oriented approach is not used consistently for files.
In our previous scripts, we already used the method File.open to read a file. We can also create a new file and write text to it:
file = File.new("temp.txt", "w")
file.puts("foobar")
file.close
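A common variant passes a code block to File.open; the file is then closed automatically when the block ends. The sketch below works in a temporary directory so that it leaves nothing behind; the file name is arbitrary.

```ruby
require 'tmpdir'

# Dir.mktmpdir creates a throwaway directory, removes it after the
# block, and returns the value of the block.
lines = Dir.mktmpdir do |dir|
  path = File.join(dir, "temp.txt")   # arbitrary file name

  # "w" creates or truncates the file. It is closed when the block
  # ends, even if an exception is raised inside the block.
  File.open(path, "w") do |file|
    file.puts("foobar")
  end

  # "a" appends to the existing content instead of overwriting it.
  File.open(path, "a") do |file|
    file.puts("second line")
  end

  # Read everything back as an array of lines without newlines.
  File.readlines(path).map(&:chomp)
end

puts lines.inspect
```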
We can also do a lot with file metadata:
file = File.open("temp.txt")
puts file.atime
file.chmod(0600)
file.chown(1000, 1000)
file.close
But most of the interesting methods are class methods, which take the filename as an argument…
File.dirname("/home/koan/temp.txt")
File.basename("/home/koan/temp.txt")
File.size("/home/koan/temp.txt")
File.exist?("/home/koan/temp.txt")
File.directory?("/home/koan/temp.txt")
File.executable?("/home/koan/temp.txt")
File.delete("/home/koan/temp.txt")
There’s also a Dir class for directories, a module FileUtils for copying, moving and removing files and so on, and a module Find that makes it easier to search recursively for files under a directory. We’ll illustrate this in a script that searches for backup files (*~) in a directory (recursively) and zips them for archival. The original files can then be deleted safely, but we leave this as an exercise for the reader.
#!/usr/bin/env ruby
require 'rubygems'
gem 'rubyzip'
require 'find'
require 'zip/zip'

directory = ARGV[0]
pattern = "*~"

Zip::ZipFile.open("backup.zip", Zip::ZipFile::CREATE) do |zipfile|
  Find.find(directory) do |path|
    filename = File.basename(path)
    if File.directory?(path)
      if filename == '.' or filename == '..'
        Find.prune
      end
    else
      if File.fnmatch(pattern, filename)
        puts path
        zipfile.add(path.sub(directory, ""), path)
      end
    end
  end
end
Here we see another way to install a non-standard Ruby library: using RubyGems, the programming language’s own package manager. The directory that we want to search in is the first argument to the script, and the search pattern is ‘*~’. Then we create a zip file and start searching in the directory with the Find.find method.
This will iterate through each file recursively. If the file is ‘.’ (the current directory) or ‘..’ (the parent directory), we don’t look any further into this directory. If it’s another directory, we don’t do anything with the directory itself, so the files inside it simply come next. Otherwise (the current file is a regular file), the filename is matched against the pattern; if it matches, the file is added to the zip file, with its path relative to the directory we searched in as its name in the archive.
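Dir and FileUtils, mentioned earlier, can handle a simpler variant of the same job without the zip library. The sketch below moves the backup files into an archive directory instead of zipping them. It runs in a temporary directory, and all file and directory names in it are invented for the demonstration.

```ruby
require 'fileutils'
require 'tmpdir'

moved = Dir.mktmpdir do |root|
  # Set up a small, invented directory tree with two backup files.
  FileUtils.mkdir_p(File.join(root, "etc", "conf.d"))
  FileUtils.touch(File.join(root, "etc", "hosts"))
  FileUtils.touch(File.join(root, "etc", "hosts~"))
  FileUtils.touch(File.join(root, "etc", "conf.d", "app.conf~"))

  archive = File.join(root, "archive")
  FileUtils.mkdir_p(archive)

  # "**" makes Dir.glob descend into subdirectories.
  backups = Dir.glob(File.join(root, "etc", "**", "*~"))

  # FileUtils.mv accepts a list of files and a target directory.
  FileUtils.mv(backups, archive)

  # Return the names that ended up in the archive directory.
  Dir.children(archive).sort
end

puts moved.inspect
```

Moving files flattens the directory structure, so two backup files with the same name would collide; the zip script above avoids this by storing relative paths.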
Advanced usage
Of course, Ruby is not the holy grail for sysadmins. There are a lot of cases where Perl or even shell scripts are better. For example, the number of Ruby libraries can’t match the breadth of Perl modules in CPAN, so if your task depends on one of those modules, it’s probably better to use Perl. However, Ruby is catching on quickly, and recently some powerful system administration tools have been written in Ruby, such as the configuration management tool Puppet, which we’ll talk about in another article.
The examples we have shown are of course rather basic, but you get the picture. If you want to do some serious sysadmin tasks in Ruby scripts, you have to take care of error handling and provide a robust command-line interface instead of just using ARGV[0]. The GetoptLong class can parse the command’s arguments and options in a structured way.
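As a rough sketch of what that could look like, the following code parses two options with GetoptLong and reports errors instead of crashing. The option names --ip and --file, the default file name and the helper parse_options are assumptions made for this example. The ARGV.replace line only simulates a command line so that the sketch runs on its own; a real script would leave ARGV untouched.

```ruby
require 'getoptlong'

# Parses the options from ARGV and returns them as a hash.
# Raises ArgumentError if the mandatory --ip option is missing.
def parse_options
  options = { file: "bandwidth.csv", ip: nil }

  parser = GetoptLong.new(
    ["--file", "-f", GetoptLong::REQUIRED_ARGUMENT],
    ["--ip",   "-i", GetoptLong::REQUIRED_ARGUMENT]
  )
  parser.quiet = true   # raise errors without printing them itself

  parser.each do |name, value|
    case name
    when "--file" then options[:file] = value
    when "--ip"   then options[:ip] = value
    end
  end

  raise ArgumentError, "option --ip is required" if options[:ip].nil?
  options
end

# Simulated command line, for demonstration only.
ARGV.replace(%w[--ip 192.168.1.10 --file router.csv])

begin
  options = parse_options
  puts "Reading #{options[:file]} for host #{options[:ip]}"
rescue GetoptLong::Error, ArgumentError => e
  warn "Error: #{e.message}"
  warn "Usage: script --ip ADDRESS [--file FILE]"
  exit 1
end
```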
When your Ruby sysadmin scripts begin to grow in complexity, it’s also a good idea to restructure them as Rake tasks. Rake originated as a simple Ruby build program with capabilities similar to the UNIX make command, but it’s also well suited to deployment tasks or database schema migrations.
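To give an impression of the style, here is a minimal sketch of three Rake tasks with dependencies. The task names are invented and the task bodies only record that they ran. The sketch is written as a plain Ruby script so that it can be run directly; in a real Rakefile you would drop the require line and the final invoke call and run ‘rake maintenance’ instead.

```ruby
require 'rake'

log = []   # records the order in which the tasks run

desc "Check preconditions"
task :check do
  log << "check"
end

desc "Rotate the logs (placeholder) after the check has passed"
task rotate: :check do
  log << "rotate"
end

desc "Full maintenance run"
task maintenance: [:check, :rotate] do
  log << "maintenance"
end

# Rake runs the prerequisites first, and each task at most once.
Rake::Task[:maintenance].invoke
puts log.join(" -> ")
```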