At the Forge

Blosxom

Reuven M. Lerner

Issue #121, May 2004

No modules, no SQL, no hassle. Create a blog with power features without even restarting the Web server.

Weblogs, or blogs, have grown dramatically in popularity over the past few years. Only a few people wrote blogs in the mid- and late 1990s, but now the blogging phenomenon is an overwhelming trend. Indeed, blogging is becoming so widespread that the New York Times Magazine published an article about it earlier this year—concentrating on high-school students who write their own blogs.

For example, as I write this, the Democratic primaries are in high gear, and every candidate has at least one official Weblog. Professional and armchair political commentators have set up their own blogs to analyze and counter claims the candidates make in their blogs and elsewhere.

Last month, we looked at COREBlog, a Zope product that makes it easy to create your own Weblog. Of course, COREBlog requires that you have a copy of Zope at your disposal and that you can install and modify products. Not everyone has this luxury, whereas almost every Web hosting provider makes it possible to run CGI programs written in Perl on your Web site. For this reason, many of the most popular blogging packages are small programs that do not raise the ire of an ISP.

This month, we look at Blosxom (pronounced blossom), a Weblog package written in Perl and designed to be run as a CGI program on a Web site. Blosxom was written by Rael Dornfest, a programmer at O'Reilly and Associates. I initially wrote off Blosxom as an unrealistic tool for blogging, assuming that its small size was indicative of its abilities. But Blosxom's power is not only in its strong feature set but in the way it allows us to mix and match functionalities.

Installation

Installing Blosxom should be a piece of cake for anyone with experience working with a Web server. It consists of a single CGI program written in Perl. In my case, all I had to do was copy the file, blosxom.cgi, to /usr/local/apache/cgi-bin, and I was up and running.

Of course, every piece of software requires at least a bit of configuration, and Blosxom is no exception. All of the configuration is handled by a few Perl variables at the top of the program. Comments make the purpose of each variable relatively clear. To configure Blosxom for my system, for example, I changed the following variables:

  • $blog_title: the title of the Weblog as it appears to users and in the RSS syndication feed.

  • $blog_description: blog description that appears on the front page and in the RSS feed.

  • $datadir: each entry in a Blosxom Weblog actually is a text file on disk somewhere; $datadir defines where those files should reside.

With those three elements defined, my Weblog was up and running.
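As a rough illustration, the edited lines near the top of blosxom.cgi might look something like the following. The values are placeholders of my own choosing for this example, not defaults that ship with Blosxom:

```perl
# Hypothetical values; substitute your own title, description and path.
$blog_title       = "My Weblog";
$blog_description = "Notes on Web development";
$datadir          = "/home/reuven/blosxom";   # assumed location of the entry files
```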

Adding Entries

I tested Blosxom by creating a simple text file in $datadir, introduction.txt:


This is a test entry.

<p>Hello!</p>

As Weblog entries go, this one was pretty boring. But it was interesting to see how the entry appeared in my Weblog: preceded by a date, followed by a timestamp and a permanent link, and with the first line boldfaced, as if it were a headline or title.

In other words, you can add entries to a Blosxom Weblog simply by creating new text files in the data directory. Any file ending with the value of $file_extension, which is txt by default, is considered a Weblog entry. This way, Emacs backup files, which end with ~, never are considered entries. But, if you are like me and have the habit of saving often while writing, you might be surprised to discover that your Weblog is being updated as you write it, live and for the whole world to see. If you want to work in the background, simply leave the .txt extension off the filename until you're ready to publish it.
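The extension rule can be sketched in a few lines of Perl. This is only my approximation of the idea; the function name is_entry is invented for the example, and Blosxom's own matching code may differ in detail:

```perl
use strict;
use warnings;

# Return 1 if $name would count as an entry for the given extension.
# Sketch only; is_entry is a hypothetical helper, not part of Blosxom.
sub is_entry {
    my ($name, $file_extension) = @_;
    return $name =~ /\.\Q$file_extension\E$/ ? 1 : 0;
}

for my $name ('introduction.txt', 'introduction.txt~', 'draft') {
    printf "%-20s %s\n", $name, is_entry($name, 'txt') ? 'published' : 'ignored';
}
```

Both the Emacs backup file and the extensionless draft fall through, which is what makes the "leave off .txt until you're ready" trick work.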

On my workstation, where I installed Blosxom in the main cgi-bin directory, I can see my Blosxom blog at http://localhost/cgi-bin/blosxom.cgi.

Blosxom assigns a date and time to an entry based on the modification time of the entry's file. Because I created the file on February 11, at 4PM, the Weblog entry was timestamped with that time. This also means you can change an entry's date retroactively by resetting the file's timestamp with the touch command, as in:

touch -t 200401011500 testing.txt

The above command modifies the date of the file testing.txt to 3PM on January 1, 2004. (If testing.txt does not exist already, it is created.) Although this might go against the etiquette of the Weblog universe, it certainly is possible.

More interestingly, you can set the time of a Weblog entry in the future, using the same touch command on the command line. If the $show_future_entries configuration variable is set to 1, entries with future dates are displayed right away. But in the default configuration, an entry is displayed only once its timestamp has passed. This means you can time-bomb your entries to appear at a particular time and date.
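The display rule for future-dated entries boils down to a single comparison. Here is a minimal sketch of that logic, assuming the entry's time is its file modification time in seconds since the epoch; is_visible is a name I made up, and this is not Blosxom's actual code:

```perl
use strict;
use warnings;

# Decide whether an entry should be displayed, given its modification time.
# Hypothetical helper illustrating the rule, not taken from blosxom.cgi.
sub is_visible {
    my ($mtime, $now, $show_future_entries) = @_;
    return 1 if $show_future_entries;   # future entries are always shown
    return $mtime <= $now ? 1 : 0;      # otherwise only once the time has passed
}

my $now      = time;
my $tomorrow = $now + 24 * 60 * 60;
print "tomorrow's entry, default config: ", is_visible($tomorrow, $now, 0), "\n";
print "tomorrow's entry, future enabled: ", is_visible($tomorrow, $now, 1), "\n";
```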

Flavours

If this were all that Blosxom provides, I would not be too impressed. But after examining it a bit more closely, I see that it contains a great deal of power. That power is there thanks to the combination of display templates (known as flavours, using the British spelling) and the ability to accept any number of plugin programs. The combination of these two features makes Blosxom quite extensible.

Blosxom comes with two flavours built-in, the default HTML flavour and the optional RSS flavour for the RSS syndication feed. You can view the RSS feed yourself by tacking ?flav=rss onto the end of your blog's URL. So, if you normally view your Weblog at http://localhost/cgi-bin/blosxom.cgi, you can view the RSS feed for the site at http://localhost/cgi-bin/blosxom.cgi?flav=rss. Alternatively, you can specify your preferred flavour by changing the suffix of the page you retrieve. Thus, we can see RSS with http://localhost/cgi-bin/blosxom.cgi/index.rss.

A complete flavour registry is available on the Blosxom Web site. But the basic idea is easy to grasp: in your data directory, alongside your Weblog entries, you create a template file whose name reflects the part of Blosxom's output you want to change.

The filename's suffix is the same as the flavour you want to modify. Thus, the file header.html changes the way the Weblog's header is displayed in the HTML flavour, and date.blah changes Blosxom's display of dates in the blah flavour. Users can set the flavour in the URL by adding the flav name-value pair (as we saw before), and the default is set in blosxom.cgi itself, with the variable $default_flavour. Because blog entries have a .txt suffix, you cannot have a txt flavour.
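To make the selection rules concrete, here is a small sketch of how a flavour might be chosen from a request. The names pick_flavour, $path_info and $flav_param are hypothetical stand-ins for what a CGI program reads from the URL, and the order in which the suffix and the flav parameter are checked is an assumption of this example:

```perl
use strict;
use warnings;

# Pick a flavour from the request, falling back to the default.
# Hypothetical helper; not Blosxom's own URL-parsing code.
sub pick_flavour {
    my ($path_info, $flav_param, $default_flavour) = @_;
    # A suffix such as /index.rss is checked first in this sketch.
    return $1 if defined $path_info && $path_info =~ /\.(\w+)$/;
    # Next comes the flav name-value pair, as in ?flav=rss.
    return $flav_param if defined $flav_param && length $flav_param;
    return $default_flavour;
}

print pick_flavour('/index.rss', undef, 'html'), "\n";   # from the suffix
print pick_flavour('',           'rss', 'html'), "\n";   # from the flav parameter
print pick_flavour('',           undef, 'html'), "\n";   # the default
```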

Each flavour file consists of an HTML snippet, along with the names of Perl variables whose values Blosxom substitutes into the snippet when it produces output. For example, story flavour files receive the variables $title and $body, among others. (A full list is available on the Blosxom Web site.) I thus can change my blog's output such that headlines are huge and right-aligned, followed by the body:


<p>
<h1 align="right">$title</h1>
<br />
$body
</p>

The above flavour inserts the $body variable, the contents of our blog story, verbatim into the HTML. This is fine if the blog author knows HTML and is willing to enter paragraph tags manually. But if we want to let people separate paragraphs with blank lines, we need to run a program on our story. Luckily, Blosxom makes it easy to write such programs with an extensible plugin architecture.
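Before moving on to plugins, it may help to see how variable names in a flavour file can turn into values. The following is a simplified stand-in for that substitution step, using a plain hash rather than Blosxom's real variables; fill_template is a name invented for this sketch:

```perl
use strict;
use warnings;

# Replace each $name in the template with the matching value from %vars.
# Simplified illustration; names with no value become the empty string.
sub fill_template {
    my ($template, %vars) = @_;
    $template =~ s/\$(\w+)/exists $vars{$1} ? $vars{$1} : ''/ge;
    return $template;
}

my $story = qq{<h1 align="right">\$title</h1>\n\$body\n};
print fill_template($story,
    title => 'This is a test entry.',
    body  => '<p>Hello!</p>',
);
```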

Plugins

Each plugin is a Perl program loaded with the require function, which reads and evaluates code in a particular file. So require "foo.pl" opens foo.pl and evaluates the code it contains. I normally suggest that people avoid require in favor of use, which calls require and then import on your behalf. However, because require executes at runtime, whereas use executes during the compilation phase, require is far easier to work with here, where the list of plugin files is not known until the program runs.
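The following sketch shows why runtime loading matters. It writes a throwaway file to a temporary directory and then loads it by a name that exists only in a variable, which use could not do. The package name hello and the file layout are assumptions made for the example:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write a tiny throwaway plugin to disk so the example is self-contained.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/hello";
open my $fh, '>', $file or die "cannot write $file: $!";
print {$fh} "package hello;\nsub start { 1 }\n1;\n";
close $fh or die "cannot close $file: $!";

# The filename is known only at runtime, so require (not use) loads it.
require $file;
my $status = hello->start() ? "loaded" : "disabled";
print "$status\n";
```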

Blosxom assumes that any file in the plugin directory, defined by the optional $plugin_dir variable, is a plugin. Plugins are both loaded and applied in alphabetical order, which means if you want to make sure a particular plugin is applied first or last, you might need to rename it.

Each plugin is nothing more than a simple Perl program that defines one or more subroutines. Every plugin must define the start subroutine, which simply returns 1. This allows Blosxom to determine that the plugin is alive, ready and willing to be invoked. A number of other plugin subroutines are available that each plugin optionally may define, ranging from entries (which returns a list of entries) to story (which allows you to modify the contents of a story). By breaking things down in this way, Blosxom allows for a tremendous amount of customization and sophistication, while keeping the core code small and compact.
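The loading and dispatch steps can be imitated in miniature. In this sketch the three plugins are defined inline rather than in separate files, and the idea that a plugin can decline to run by returning a false value from start is my assumption about how "ready and willing" is tested; none of this is Blosxom's own code:

```perl
use strict;
use warnings;

# Three stand-in plugins, defined inline so the sketch is self-contained.
# In Blosxom each would live in its own file in the plugin directory.
package zebra  { sub start { 1 } sub story { return "zebra" } }
package apple  { sub start { 1 } sub story { return "apple" } }
package asleep { sub start { 0 } sub story { return "asleep" } }   # declines to run

package main;

# Keep the plugins whose start() returns true, in alphabetical order,
# then call story() on each one that defines it.
my @active  = grep { $_->start() } sort qw(zebra apple asleep);
my @applied = map  { $_->story() } grep { $_->can('story') } @active;
print "applied: @applied\n";
```

Renaming zebra to something like 00zebra would move it to the front of the sorted list, which is the renaming trick mentioned above.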

So, what sorts of features can plugins provide? There seem to be only a few restrictions. You can change the source from which Weblog entries are retrieved, the way in which this list of entries is filtered, the templates used to display the entries and the contents of the entries themselves.

A large number of plugins are available from the Blosxom Web site. Some of them depend on other plugins, while others, such as the calendar, appear only if you are using a flavour that supports the plugin. Other plugins work immediately and merely need to be dropped into your plugin directory.

A simple example of a plugin that works out of the box is atomfeed, which provides an Atom syndication feed. Atom is a competitor to RSS that has been promoted by a number of heavy-hitting bloggers and programmers, in no small part because of the competing standards now evident in the RSS world. To get an Atom feed, simply copy the atomfeed plugin to your plugins directory. You then can retrieve your Atom feed with http://localhost/cgi-bin/blosxom.cgi?flav=atom or http://localhost/cgi-bin/blosxom.cgi/index.atom.

Writing Plugins

Listing 1 contains a simple filter, called egotrip, to make my name appear in boldface whenever it appears in a Weblog entry. Notice how the plugin must define its own package; this ensures that each plugin's subroutines are kept in a separate namespace and makes it possible for Blosxom to determine whether a package contains a particular method name.

The actual work is done in the story subroutine, which is passed six arguments when invoked by Blosxom, corresponding to a number of items having to do with the entry. In our case, we care about changing only the body of the entry, which is in the final variable, known as $body_ref. As its name implies, this is a scalar reference, which means we can access or modify its contents by dereferencing it with a second $ sign, as in $$body_ref. With that in mind, it should not come as a surprise that we can boldface every instance of my name with:


$$body_ref =~ s|Reuven|<b>Reuven</b>|g;
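Put together, a complete plugin built around that substitution might look roughly like the following. Treat it as a sketch of the general shape rather than a copy of the published listing; apart from $body_ref, the argument names are ones I have assumed for illustration:

```perl
package egotrip;

use strict;
use warnings;

# Tell Blosxom this plugin is ready to be used.
sub start { 1 }

# story receives six arguments; only the last one, a reference to the
# entry body, is used here. The other names are assumed for illustration.
sub story {
    my ($pkg, $path, $filename, $story_ref, $title_ref, $body_ref) = @_;
    $$body_ref =~ s|Reuven|<b>Reuven</b>|g;
    return 1;
}

1;
```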

Of course, we could make this step even more sophisticated and insert automatic hyperlinks to a number of different items:

$$body_ref =~ s|(Reuven Lerner)|<a href="http://www.lerner.co.il/">$1</a>|g;
$$body_ref =~ s|(Linux Journal)|<a href="http://www.linuxjournal.com/">$1</a>|g;

Indeed, a plugin of this sort already exists; it automatically creates links to the community-driven Wikipedia. Any text placed within [[brackets]] automatically is turned into a link to that on-line reference book.
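The bracket-to-link idea is easy to approximate with a substitution. This sketch is not the existing plugin's code; the function names are invented, and the URL pattern is an assumption that may not match what that plugin actually generates:

```perl
use strict;
use warnings;

# Build a link for one topic. The URL pattern is assumed for illustration.
sub wiki_link {
    my ($title) = @_;
    (my $page = $title) =~ tr/ /_/;   # spaces become underscores in the URL
    return qq{<a href="http://en.wikipedia.org/wiki/$page">$title</a>};
}

# Replace every [[Topic]] in the text with a link to that topic.
sub link_brackets {
    my ($text) = @_;
    $text =~ s/\[\[(.+?)\]\]/wiki_link($1)/ge;
    return $text;
}

print link_brackets("Blosxom is written in [[Perl]].\n");
```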

Notice how flavours are HTML templates into which we can instantiate Perl variable values, whereas plugins are Perl programs. This division between display and actions takes a little bit of time to grasp, but it shouldn't be too difficult.

As for our paragraph-separating problem from before, there's no need to reinvent the wheel. You simply can download a plugin, Blox, that allows you to separate paragraphs with blank lines when writing your blog entry. The plugin then separates paragraphs with the HTML of your choice. Blox is listed on Blosxom's plugin registry (see the on-line Resources section).
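For readers curious about what such a filter involves, here is a minimal sketch of the blank-line approach. It is not Blox's code, which is configurable and may work differently; the function name paragraphs and the fixed choice of p tags are assumptions of this example:

```perl
use strict;
use warnings;

# Wrap each blank-line-separated chunk of text in <p>...</p>.
# A sketch of the idea only, with the surrounding HTML hard-coded.
sub paragraphs {
    my ($body) = @_;
    my @out;
    for my $chunk (split /\n\s*\n/, $body) {
        $chunk =~ s/^\s+|\s+$//g;                  # trim surrounding whitespace
        push @out, "<p>$chunk</p>" if length $chunk;
    }
    return join "\n", @out;
}

print paragraphs("First paragraph.\n\nSecond paragraph,\nstill going.\n"), "\n";
```

Inside a plugin, a routine like this would be applied to $$body_ref from the story subroutine.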

The fact that Blosxom keeps all entries and flavours in a single directory is a bit disturbing to me and makes me wonder about the program's scalability. Even if my filesystem and Perl can handle that many files without too much trouble, do I really want to wade through them all? If and when this becomes a problem, an entries plugin probably can provide the right solution, scooping up files from multiple directories and returning an appropriate hash to Blosxom.
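The file-gathering half of such a plugin could be sketched with File::Find, as below. The function collect_entries is hypothetical, the demonstration directories are created on the fly, and how the resulting hash would be handed back to Blosxom is deliberately left out, because that depends on the plugin interface:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Collect every *.txt file under the given directories, mapped to its
# modification time. Hypothetical helper, not part of Blosxom.
sub collect_entries {
    my (@dirs) = @_;
    my %files;
    find(sub {
        return unless -f $_ && /\.txt$/;
        $files{$File::Find::name} = (stat _)[9];   # element 9 of stat is mtime
    }, @dirs);
    return %files;
}

# Build a small throwaway tree to demonstrate.
my $dir = tempdir(CLEANUP => 1);
mkdir "$dir/2004" or die "cannot create $dir/2004: $!";
for my $name ('intro.txt', '2004/testing.txt', '2004/draft') {
    open my $fh, '>', "$dir/$name" or die "cannot write $dir/$name: $!";
    close $fh;
}

my %entries = collect_entries($dir);
print "$_\n" for sort keys %entries;
```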

Conclusion

Blosxom is a powerful tool for creating a Weblog; it's more than it might appear at first glance. Blosxom consists of an easy-to-install, easy-to-configure CGI program written in Perl, but its true power lies in the fact that it lets you change every part of the display through a combination of flavours (display templates) and plugin routines. By mixing and matching existing flavours and plugins with something of your own, it can be easy to create your own Weblog.

Resources for this article: /article/7454.

Reuven M. Lerner, a longtime consultant in Web/database programming, now is a graduate student in Learning Sciences at Northwestern University in Evanston, Illinois. You can reach him at reuven@lerner.co.il.