At the Forge

Logging

Reuven M. Lerner

Issue #215, March 2012

Knowing how to read logs and write to them is an important skill for all Web developers.

When you're writing programs that operate on the command line, the error messages either appear in the same place as you're typing (that is, in the same terminal window) or are sent to a back-end logging system, such as the venerable “syslog” facility that Linux provides. But, one of the surprises and frustrations that Web developers encounter is that there is no such thing as a “terminal” in which their Web application is running. Rather, the Web server sends its error messages somewhere else entirely, to one or more files.

I often tell participants in my Web development courses that they should embrace their new best friend, the logfile. And, indeed, logfiles are a priceless tool in the hands of a developer, making it possible—like a doctor—to understand the source of a problem and then diagnose and fix it. Because there is no way to interact directly with a Web server, looking at the effects as displayed in the logfile is the best way to understand what is happening. Through the years, I've learned, however, that developers often are unfamiliar with many of the issues having to do with logging and how they can make the best use of those logs to ensure that their programs run as smoothly as possible.

So in this article, I look at a number of issues having to do with logging, as well as some strategies and techniques for getting the most out of your logs. Some of the things I mention here are specific to the Ruby on Rails framework, which I use in much of my day-to-day work, but several techniques are common to many packages or to programming in general.

Where Are You Logging?

A typical modern Web application consists of a database server, an HTTP server and an application. For example, I generally use PostgreSQL (database), Apache (HTTP) and Ruby on Rails (application), but it's not unusual for me to use other technologies, from MySQL and MongoDB to other languages, servers and systems. Each program has its own logfile, with its own configuration settings and output syntax.

The first thing you should do when working on a project, whether you are starting it or taking it over from someone else, is determine which logfiles are being written to. In a Web application that uses Apache, Rails and PostgreSQL, you'll have at least three different logfiles, one for each of these programs.

Sometimes these logfiles are placed under the /var/log directory in your filesystem, but not always. It's not unusual for a PostgreSQL logfile to be in the “data directory”, which on my systems often is in /usr/local/pgsql/data. Note that the default location for a program's logfiles might not match the location that your Linux distribution has adopted, so unless you're careful, you can end up with two separate logfile directories, one of which is ignored by the system.
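If you're not sure where a running program is writing its logs, one quick way to check is to ask lsof which files the process has open. For example, assuming the Apache processes on your system are named apache2 (on some distributions, they're named httpd instead):

sudo lsof -c apache2 | grep -i '\.log'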

Some programs can create more than one logfile. Apache is able to act as an HTTP server for multiple sites, and it's often helpful to be able to separate out logs for those sites. And, Apache also keeps its HTTP request log separate from its error log, meaning that if you want to understand fully what is happening on your site, you might need to piece together several logs just for the HTTP server.
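As an illustration, here is a sketch of per-site logging in an Apache virtual-host configuration. The ServerName and paths are placeholders, not taken from any particular system:

<VirtualHost *:80>
    ServerName example.com
    # Request log in the "combined" format, with the error log alongside it
    CustomLog /var/log/apache2/example.com-access.log combined
    ErrorLog  /var/log/apache2/example.com-error.log
</VirtualHost>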

Apache Logs

The two most important logs that Apache produces are the request log and the error log. A typical line in the request log (using the “combined” format, which adds referrer and user-agent information to each request) represents a single HTTP transaction and looks like this:

84.108.219.125 - - [09/Jan/2012:14:13:52 +0200] 
 ↪"GET / HTTP/1.1" 200 764 "http://linuxjournal.com/"
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.11 
 ↪(KHTML, like Gecko) Chrome/17.0.963.26 Safari/535.11"

The most important elements of this entry are as follows:

  • The request was made from IP address 84.108.219.125.

  • The request took place at 14:13:52, UTC + 2.

  • The GET method was used to request /, using HTTP 1.1.

  • The return code was 200.

  • The response contained 764 bytes.

  • The user came to this page from a link at LinuxJournal.com.

  • The user was browsing with the Chrome browser, on an Intel-based Mac. (The user-agent string mentions AppleWebKit and Safari because Chrome shares their engine lineage.)

The access log can be useful for a number of reasons. First, I'm often trying to debug the HTTP requests that a program makes to a server. This can be as mundane as when I click on a link in my browser or when I'm writing a program that is trying to retrieve information from a server. The HTTP access log shows me precisely what URL was sent to the server; I frequently have been able to debug problems with the requesting program as a result.

Of course, only GET requests, in which name-value pairs are part of the URL, are debugged easily in the access log. POST requests send their name-value pairs in the body of the HTTP request, which means you can't rely on the server log to display them.
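You can see the difference with curl, whose -d option sends name-value pairs in the body of a POST request (the URL here is, of course, just an example):

curl 'http://example.com/search?q=linux'     # q=linux shows up in the access log
curl -d 'q=linux' http://example.com/search  # q=linux travels in the body, unlogged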

It's also easy to assemble basic statistics about a site by looking at the Apache logs. For example, in my server's access log, the response code is the ninth space-separated field. By chaining together several command-line tools, I can count how many requests received each response code:

cut --fields=9 --delimiter=' ' access.log | sort | uniq -c

First, I use cut to divide the logfile into fields, separated by spaces. Next, I grab the 9th column, which corresponds to the HTTP response code. I then sort those codes and use uniq to count the number of times each one appears.
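If you would rather see the most frequent response codes first, an equivalent pipeline, using awk in place of cut and adding a final numeric sort, might look like this:

awk '{print $9}' access.log | sort | uniq -c | sort -rn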

The other important Apache logfile, and one that is crucial in debugging Web applications, is the error log. Not only does this file mention many errors and warnings that Apache has encountered, but it also contains anything that the Web application might have sent to the STDERR filehandle while executing. If something goes wrong on a Web application that I'm developing, my first instinct is to look at the error log. Often, the messages there will give me an indication of what is happening, including warnings about errors, incorrect HTTP headers or truncated requests.

Application Logs

For many years, the Apache error log was the best tool I had for debugging Web applications. Modern Web frameworks operate outside Apache (or any other HTTP server), though, and often have their own logfiles. Ruby on Rails, for example, has a separate logfile for each of its environments. Thus, under the log directory, you typically will find development.log, production.log and test.log.

The test environment is where you run your automated tests, and having a full-fledged logfile to open and peruse after tests have failed makes it relatively easy to figure out why they didn't work. The test logfile generally is full of enough detail that if something goes wrong, you can reproduce it in the console and understand the failure.

The development log also is full of useful information. It reflects whatever requests you make to the development server, but with far more insight into each one. For example, the default settings in Ruby on Rails give a huge amount of information about database access via ActiveRecord, including the SQL that ultimately is sent to the server and when objects are fetched from ActiveRecord's cache, rather than from the database itself. When I'm developing a Web app, it's very common for me to do a tail -f (more on that below) on the Web application log.

The production log, as its name implies, is written on the production server. On systems with more than one production server, it can become tricky to keep track of multiple logs, because you cannot know which production server will receive a given user's HTTP requests. In such instances, a system that aggregates logs in a single place, such as syslog or Graylog2 (which I haven't ever used, but it looks like a marvelous system), might well come in handy. When I have a small number of servers, I often take the easy way out, keeping a separate production log on each server and then watching each of them via tail -f to see where a problem might have cropped up.

Database Logs

Most modern Web applications use a database of some sort, often a relational database such as MySQL or PostgreSQL. Each of these databases has a variety of switches and configuration variables that control how and where its logfiles are written.

I tend to use PostgreSQL whenever possible and generally set up the logs the same way on all systems I run. I turn logging on and configure the logs to rotate once per day, such that I have the last seven days' worth of logs if I ever need to check on something.

One of the nicest things about PostgreSQL's logs is that you can decide what you want to log—connections, disconnections, query parse trees, statement durations, vacuum execution or even the output from the query optimizer. These settings are in the postgresql.conf configuration file, which normally sits in the data directory, alongside the security configuration file pg_hba.conf.

By setting the appropriate variables in postgresql.conf, you can set separate log levels for client connections (client_min_messages) and the logfile (log_min_messages), such that the logfile contains more detail than messages sent to database clients. You can ask for log messages to be displayed verbosely (log_error_verbosity), which gives a level of detail I've never found useful, but which undoubtedly comes in handy when doing low-level database hacking.

The log_statement parameter allows you to log DDL (data-definition statements, such as CREATE and ALTER), mod (statements that modify data) or all statements.

One of the most useful log-related settings in the PostgreSQL configuration file is log_min_duration_statement. By default, it is set to -1, meaning that it is deactivated. Given a positive number, it logs every statement that takes longer than that number of milliseconds to execute. I normally set this to 100 (that is, 100ms) and then can go through the logfile (using grep, searching for the word “duration”) to find which queries are taking the longest. That allows me to concentrate my efforts on the slowest queries, as identified by the database server itself.
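Putting these together, here is a sketch of the relevant portion of postgresql.conf. The parameter names are real, but the values shown reflect my preferences rather than PostgreSQL's defaults:

# Collect server messages into logfiles
logging_collector = on

# Rotate daily; a day-of-week filename plus truncation keeps
# the last seven days' worth of logs
log_filename = 'postgresql-%a.log'
log_rotation_age = 1d
log_truncate_on_rotation = on

# Give the logfile more detail than database clients receive
client_min_messages = notice
log_min_messages = warning

# Log data-definition statements, plus any statement over 100ms
log_statement = 'ddl'
log_min_duration_statement = 100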

How to Read Logs

One of the best ways to read through logfiles is also the easiest, namely the less command. less, which is a modern replacement for the traditional UNIX more command, lets you page through a file, backward and forward—and if the file grows while you're viewing it, less still will let you read through it, including to the current end of the file.

less has a very large number of options that you can apply. For years, my .zshrc profile has defined the PAGER environment variable to be:

less -RXs

This means it shows raw control characters (-R, useful for reading colorized logs and man pages), doesn't reset the terminal (-X, which I admit I use out of habit, rather than having demonstrated with certainty that this option is necessary) and squeezes multiple consecutive blank lines into a single one (-s). The fact that searching in less not only supports regular expressions, but also highlights any text that matches my search, makes it fairly easy to find my way around.
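The corresponding line in a .zshrc (or .bashrc) would look something like this:

export PAGER='less -RXs'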

However, I often want to read the logs as they come through. One easy way to do this is with tail -f. The tail command shows you the final lines of a file on your filesystem. The -f flag tells tail to keep reading from the file as it grows, printing new lines to the screen as they arrive. When I'm developing, I often have one terminal window open on the application's logfile with tail -f:

tail -f log/production.log

If you're on a heavily loaded server, the output from tail -f not only will add to the load on your machine, but also will scroll by far too quickly for you to read. The solution is to filter the output of tail -f through grep, looking for a particular piece of text that is of interest to you, such as the name of a variable, function or error message.

If you want to grab a bunch of lines, you can use grep's -A (after), -B (before) or -C (context = before + after) flags. For example, if you want to see the list of parameters sent to your Web application server every time the Home#index method is called, you can do this:

tail -f log/production.log | grep -A10 Home#index

Do you really need ten lines after the method name is printed in the logs? Probably not, but this ensures that you'll get all of the parameters, without having to read through lots of other stuff.

Writing to Logs

It might sound odd, but it takes some practice to learn how to write to logfiles well. I tend to write a lot of information to logs when I'm developing a Web application. I realize there are people for whom a debugger is the first tool of choice, but mine is the console that Rails provides, followed closely by the logfiles. From a Rails application, you can write to the log at any point using the “logger” object, which is available in all models and controllers. The logger object supports a method for each log level, from “debug” up to “fatal”, with “info”, “warn” and “error” in between. So to write “hello” into the logs, you would write:

logger.warn "hello"

Now, whatever you write to the logs is printed verbatim. You can, of course, write the value of a variable:

logger.warn some_variable

But, this is almost always a bad way to go. First, you want to be able to find what you've written to the logfile. Second, you're probably going to be writing multiple variable values, so you will want to indicate what you're printing here to distinguish it from other places. And finally, if the variable value is blank, or a newline, you'll want to know that—most easily by putting its value between delimiters. So when I want to print a variable value, I do something like this:

logger.warn "[ObjectName#method_name] some_variable 
 ↪= '#{some_variable}'"

Now I easily can find this value in the logfile; I can know in which method I was printing things, and I also can distinguish between different values—an empty string, a space or a newline.
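Because I write lines like this so often, I sometimes wrap the pattern in a small helper. Here is a hypothetical version (the name log_var is my own invention, not part of Rails) that could sit in a module mixed in to models and controllers:

# Hypothetical helper; the name and placement are mine, not Rails'.
def log_var(context, name, value)
  logger.warn "[#{context}] #{name} = '#{value}'"
end

# For example:
#   log_var "ObjectName#method_name", "some_variable", some_variable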

If it's a complicated variable, I sometimes use the built-in to_json or to_yaml methods that Rails provides:

logger.warn "[ObjectName#method_name] some_variable =
 ↪'#{some_variable.to_yaml}'"

But watch out! I've learned the hard way that certain objects cannot be printed to YAML. What happens in those cases is that you get an error message instead of a printout, and that can cause more head-scratching, trying to figure out how and where you introduced a new bug. Printing things simply, in non-YAML format, is generally a good idea, when you can get away with it.
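If you want YAML output but can't be sure it will succeed, one defensive approach (a sketch, not anything built in to Rails) is to fall back on inspect whenever to_yaml raises an exception:

dumped = begin
  some_variable.to_yaml
rescue StandardError
  some_variable.inspect  # fall back when the object can't be dumped to YAML
end

logger.warn "[ObjectName#method_name] some_variable = '#{dumped}'"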

Logging Every Action

On some projects, I go all-out, logging every HTTP request to the database. This is a tremendous help when debugging applications, although it does tend to have the side effect of slowing things down, because each request needs to write to the database. (Using a non-relational database probably would be a faster way to go about this, although I enjoy using the powerful queries that SQL provides when I need to pull information out.) When I do this, I typically create a logged_actions table in the database, containing the following columns:

  • user_id.

  • controller name.

  • action name.

  • logged_at (timestamp).

  • message (allows me to stick an arbitrary message into these logs).

  • ip_address (of the person requesting).

  • browser_info (from the user's browser).

  • URL.

  • params (all of the parameters sent, via GET and POST).

  • session (the user's entire session, written in YAML if possible).

  • cookies (the user's cookies, written in YAML if possible).

  • flash (contents of the Rails “flash”, a sort of temporary session).

  • referrer.

I then create a “before filter” in the application controller, such that every single request to the Web application results in an entry in the logged_actions table. Moreover, whenever I want to add some debugging or other information, I can either put it into the logs in the usual way or write it, with much more information and context, via my logged_actions table. This table, and the entries in it, have proved to be extremely valuable in many cases, allowing me to debug problems that otherwise would have been difficult to revisit or understand.
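Here is a sketch of what such a filter might look like. The LoggedAction model, the AppConfig toggle and current_user are all hypothetical, invented here to match the table and behavior described above:

class ApplicationController < ActionController::Base
  before_filter :log_action

  private

  # Hypothetical sketch: LoggedAction, AppConfig.log_actions? and
  # current_user are assumptions, not part of Rails itself.
  def log_action
    return unless AppConfig.log_actions?           # toggle logging without re-deploying

    LoggedAction.create(
      :user_id         => current_user.try(:id),   # assumes an authentication system
      :controller_name => controller_name,
      :action_name     => action_name,
      :logged_at       => Time.now,
      :ip_address      => request.remote_ip,
      :browser_info    => request.user_agent,
      :url             => request.url,
      :params          => params.to_yaml,
      :session         => (session.to_hash.to_yaml rescue nil),  # YAML "if possible"
      :cookies         => (request.cookies.to_yaml rescue nil),  # YAML "if possible"
      :flash           => (flash.to_hash.to_yaml rescue nil),
      :referrer        => request.referer
    )
  end
end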

When I'm not debugging, I often turn this feature off, given the combination of overhead that it causes and the potential privacy/security issues that it can sometimes raise. In some cases, I've configured this before filter based on a configuration parameter that I set elsewhere in the system, such that I can turn fine-grained logging on and off without having to modify and re-deploy the application.

Conclusion

If you are a Web developer, and if you aren't yet using logfiles as part of your development and debugging process, I strongly suggest you begin to do so. Logfiles are a gold mine of information in all cases, but when you start to track and write additional messages, you give yourself the ability to discover and fix problems even before your users are aware of them. Plus, if you keep track of users' parameters and request information, you often can discover, as I have, that the problem sometimes has to do with something in users' sessions or histories, or even the browsers they are using to visit your site.

Reuven M. Lerner is a longtime Web developer, architect and trainer. He is a PhD candidate in learning sciences at Northwestern University, researching the design and analysis of collaborative on-line communities. Reuven lives with his wife and three children in Modi'in, Israel.