At the Forge

Checking Your Ruby Code with metric_fu

Reuven M. Lerner

Issue #183, July 2009

By combining automated testing with automated code analysis, you can make your Ruby code easier to test and easier to maintain.

Among programmers, there has long been a dispute between those who want a language to constrain them and those who want great flexibility.

If you have been programming for a while, you'll understand the benefits that each side touts. A rigid language can check your code, often using a compiler and a strict type system, and find potential problems before they make their way into production systems. By contrast, a more flexible language is designed with the knowledge that a compiler and strict typing don't find all bugs and often force programmers to work around the system's constraints, rather than benefit from them.

This brief description is little more than a caricature of modern programmer attitudes. But, it does point to a tension programmers often face when choosing a language. How much do you want the language to constrain you, and what trade-offs are you willing to make? Would you rather have a strict language that doesn't let you express yourself the way you want or a flexible language that won't stop you from doing something foolish or dangerous?

Like many Web developers, I have come to prefer dynamic, flexible languages. I don't want the language to stop me preemptively from doing things, even if what I'm doing might seem crazy or weird. I've become quite a fan of Ruby over the last few years because of the balance it tries to strike.

However, the lack of a compiler or other tool to perform regular sanity checks does bother me somewhat. I wouldn't ever claim that a compiler is the only tool a programmer should use to test the code, but it does perform a first-pass inspection that can provide some useful feedback.

Fortunately, the Ruby community encourages the use of regular automated testing to ensure that code works in the way you expect. Done correctly, testing actually can be better than a compiler and strict typing. It can check the code at multiple levels, reflect actual use cases and serve as a sanity check not only for the code's syntax, but also for its logic and specification. Moreover, writing tests forces programmers to reflect on their work, chewing over how they have implemented a particular feature. Such reflection is an essential part of the learning process, and it offers programmers a chance to become better at their craft, as well as to write better programs.

Automated testing, accompanied by automated analysis, can thus help improve programmers, as well as the programs they write. So, I was delighted to discover metric_fu, a Ruby gem from Jake Scruggs and others that pulls together some of the best-known analysis tools in one convenient package for Rails programmers. The combination of these various tools—including rcov, Flay and Flog—makes it easy to locate potential problems in code you've written and improve it. Automated analysis tools won't ever provide you with 100%-accurate feedback, but it's always good to get this sort of input.

This month, I look at metric_fu and some of the code-analysis tools it makes available to Rails programmers. It's true that metric_fu is “just” a wrapper for these individual tools, but by making them so easily available and integrated with the rest of your testing, you'll constantly be in a position to understand where potential problems might lie and to fix issues before they cause you any real trouble.

Installing metric_fu

metric_fu is a Ruby gem, which means you can download and install it with:

sudo gem install metric_fu

The metric_fu gem specification automatically requires a number of other gems that it uses, including rcov and Flog. So installing the metric_fu gem should mean your system is ready, without the need for additional downloads and installations.

Assuming you are using metric_fu with Rails, you probably will want to tell Rails that it should look for and include the metric_fu gem. You can do this in modern versions of Rails by adding the following line to config/environment.rb:

config.gem 'jscruggs-metric_fu', :version => '0.9.0',
    :lib => 'metric_fu', :source => 'http://gems.github.com'

In other words, you want Rails to load the gem known as metric_fu, which can be downloaded from GitHub as jscruggs-metric_fu, version 0.9.0. If this gem is not installed, Rails will exit with an error.

Finally, you must add a line to your Rails application's Rakefile, telling it you want to load the Rake tasks associated with metric_fu:

require 'metric_fu'

Once this is complete, you should find a number of new tasks, all of whose names start with metrics, available in Rake. You can list them with:

rake -T | grep metrics

I typically run all the tests, which you can invoke with:

rake metrics:all

This runs all of the tools metric_fu works with, a list that has grown somewhat in the last year. At the time of this writing, running metrics:all includes:

  • churn: which files change the most?

  • coverage: which parts of your code are tested?

  • flay: which parts of your code are duplicated?

  • flog: is your code unnecessarily complex?

  • reek: does your code suffer from well-known bad practices?

  • saikuro: how complex is your code?

I cover a number of these tests in greater detail below. But, before continuing, it's important to note that metrics:all will fail to run all the tests if the rcov coverage tool encounters one or more errors. This isn't a problem if you test frequently, but it can bite you if you break a test and then run metrics:all.
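If that happens, it often is easiest to re-run the coverage report on its own, fix whatever test is broken and only then go back to the full suite. Assuming your version of metric_fu creates a separate Rake task for each metric (as it does for flay and reek), that is simply:

rake metrics:coverage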

When you run the full report with rake metrics:all, metric_fu puts all the output files under your application's tmp/metric_fu directory. Each test has its own separate subdirectory and produces output in HTML for easy reading with a Web browser. The fact that the files are put in tmp/metric_fu makes them easy to find and view on a local system, but it requires that you move them into a Web-accessible directory (for example, public/tmp/metric_fu) if you want to view them from a remote machine. It should go without saying that you don't want this information to appear on a Web site that is publicly viewable, so be sure to password-protect or delete these reports to avoid unpleasantness.
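If you do want to look at the reports from another machine, a short Rake task can copy them into place. The following is only a sketch; the metrics:publish task name and the public/tmp/metric_fu destination are my own choices, not something metric_fu provides:

namespace :metrics do
  desc 'Copy metric_fu reports into public/ for remote viewing'
  task :publish do
    require 'fileutils'
    FileUtils.rm_rf   'public/tmp/metric_fu'
    FileUtils.mkdir_p 'public/tmp'
    FileUtils.cp_r    'tmp/metric_fu', 'public/tmp/metric_fu'
  end
end

Whatever destination you choose, remember to keep it password-protected.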

Although metric_fu's defaults work for most initial cases, you may find yourself wanting to customize one or more of its tests. You can do this within your Rakefile by adding a MetricFu::Configuration block and invoking config.*, where * is one of the tests that metric_fu brings in. For example, you can customize which tests run for :all with:

MetricFu::Configuration.run do |config|
  config.metrics          = [:coverage, :flog]
end

If you modify config.metrics to include only a subset of metric_fu's tests, you may find yourself puzzled when other tests fail. For example, if you were to set config.metrics to the above value of [:coverage, :flog], invoking rake metrics:reek would fail, with Rake complaining that it wasn't able to find such a task.

Code Coverage

Perhaps the best-known member of the metric_fu family is rcov, the Ruby code-coverage checker, written by Mauricio Fernandez. rcov invokes all your automated tests and then produces a report indicating which lines of your source code files were untouched by those tests. This allows you to see precisely which lines of each file have been tested, letting you concentrate on those paths that are highlighted in red (that is, untested), rather than writing additional tests for code that already has been tested.

rcov, as invoked by metric_fu, produces two basic types of HTML output. One provides an overview of the files in your project. This output, with red and green bar graphs, shows the percentage of each file that has been covered by your tests. If any of your files has a graph whose bar is partly red, this tells you on which files to concentrate your initial effort.

But, once you have decided to make sure that a particular file has better test coverage, which lines do you improve? That's where rcov's individual file output comes in handy. It shows the source code of the file, with lines of the code in either green (to show that it was covered in tests) or red (to show that it was not). If you have any red lines, the idea is for you to add tests that force those lines to be covered next time around. And, of course, if there are red lines that don't need to be there, rcov has helped you refactor your code, making it leaner and meaner. Reading rcov's output is pretty simple—you want everything to be green, rather than red. Any red is an invitation to write more tests or realize that the code is no longer in use and can be removed.
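For example, if rcov shows that an error-handling branch in a controller never runs during your tests, the fix is a test that triggers it. Here is a sketch of such a functional test, in the Test::Unit style that Rails generates by default; the controller, action and flash message come from the code I discuss below in the section on Flay, but the test itself is hypothetical and simplified:

require 'test_helper'

class UploadControllerTest < ActionController::TestCase
  test "complains when no file is attached" do
    # redirect_to :back sends the user to the referring page,
    # so the test has to supply a referer
    @request.env['HTTP_REFERER'] = '/upload'

    post :advertiser_file_action, :filename => ''

    assert_equal 'No file was attached. Please try again.', flash[:notice]
    assert_redirected_to '/upload'
  end
end

Once a test like this is in place, the lines that handle a missing file show up in green, and you can move on to the next red section.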

One of the main reasons for testing your code is that it gives you some peace of mind when you make further changes. So, although you can refactor and otherwise change your code without 100% test coverage, it's always possible something will slip through the cracks. For that reason, rcov should be your first priority when using metric_fu. Once your code coverage is high enough to ensure that new problems and changes will be detected, you can try to make your code better, without changing what it does.

Flog

Another tool that comes with metric_fu is Flog, written by Ryan Davis. Flog produces what it calls a “pain report”, identifying code that it believes to be “tortured”—in such pain that you really should rescue it. Even if you disagree with some of its results, looking at Flog's output often can provide an interesting perspective on your code's complexity. It measures variable assignments, code branches (that is, if-then and case-when statements) and calls to other code, assigning a score to each of those. The total Flog score is the sum of the individual items that Flog finds.

As the Flog home page says, “the higher the score, the harder it is to test”. Even if you're not worried about testing, you certainly should consider other programmers who might work on your project. Complex code is hard to maintain, and maintaining software is (in my view) a bigger problem than writing it. So, by looking at Flog's output, you can get a sense of how hard your code will be for someone else to understand.

metric_fu provides an HTML version of Flog's output, but I demonstrate it here from the command line, where it can be run as:

flog *.rb

This produces a simple set of outputs, such as the following, which I got for a small project I recently worked on and didn't test or analyze much:

181.0: flog total
 60.3: flog/method average

 72.5: UploadController#advertiser_file_action
 70.1: UploadController#whitepage_listing_file_action

This would seem to indicate that my upload controller has two different methods, both of which have a relatively high level of complexity. I can get further information about these two methods by invoking Flog with the --details command-line argument. That gives me the following output, which I have truncated somewhat:

~/Consulting/Modiinfo/modiinfo/app/controllers$ flog --details upload_controller.rb
 181.0: flog total
  60.3: flog/method average

  72.5: UploadController#advertiser_file_action
  40.6: assignment
  17.3: branch
   4.8: split
   4.0: blank?
   3.2: strip
   3.2: params
   3.1: +
   3.0: map
   2.8: []
   2.1: downcase

In other words, a large proportion of Flog's high score results from the large number of variable assignments in UploadController#advertiser_file_action. And sure enough, I have a bunch of variable assignments in that method, which led to a high score. For example, I wanted to display the number of uploaded records to the end user, and, thus, had the following code, assigning values to instance variables:

if advertiser.save
  @number_of_successes = @number_of_successes + 1
else
  @number_of_failures = @number_of_failures + 1
  @error_messages[index] = advertiser.errors
  next
end

I find this code easy to read and maintain, but Flog thinks otherwise, preferring a more functional style of programming, with methods chained together. This is one case in which I'll take Flog's assertions and scores into consideration, but I'll apply my own judgment regarding the complexity of my code and whether it needs to be changed or updated.
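For what it's worth, the sort of chained style Flog seems to prefer might look something like the following sketch, in which the counts and error messages are derived from the save results after the fact, rather than accumulated along the way (the advertisers variable is hypothetical, and the per-record index bookkeeping is omitted):

results = advertisers.map { |advertiser| [advertiser, advertiser.save] }

successes, failures = results.partition { |advertiser, saved| saved }

@number_of_successes = successes.size
@number_of_failures  = failures.size
@error_messages      = failures.map { |advertiser, saved| advertiser.errors }

Whether that is actually easier to read than the original is exactly the sort of judgment call I prefer to make myself.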

Flay

One of my favorite tools that comes with metric_fu is Flay, also by Ryan Davis, which looks for duplicate code. One of the key principles of good coding is DRY (don't repeat yourself), and Flay makes it easy to find places where your code could use some extra DRY-ness. By running:

rake metrics:flay

you will get a nicely formatted report showing the places where your code has exact duplicates (which are embarrassing and problematic enough) and structural duplicates. So, if you have the same variable assignment in multiple controllers, Flay will find those for you and will point to the need for refactoring. For example, the simple project on which I hadn't yet run Flay had three methods, each of which contained the following identical code:

if params[:filename].blank?
  flash[:notice] = 'No file was attached. Please try again.'
  redirect_to :back
  return
end

If this sort of code appears three times in the same controller, it means some refactoring is in order. In this particular case, I can remove the problem by putting this code into a separate method and then defining a before_filter:

before_filter :check_for_blank_filename,
     :only => [:residence_file_action,
               :advertiser_file_action,
               :whitepage_listing_file_action]

Here is the method, which looks (not surprisingly) just like the code that was duplicated:

def check_for_blank_filename
  if params[:filename].blank?
    flash[:notice] = 'No file was attached. Please try again.'
    redirect_to :back
    return
  end
end

Re-running Flay indicates that I now have made my code DRY-er than before, increasing its readability and making it easier to test. Sure enough, the Flay score for this controller dropped from 392 to 221. The measures are meaningful only relative to one another, but it seems undeniable that the code is now better, and the numbers reflect that.

Flay can find subtler similarities as well, indicating where two pieces of code look similar to one another. For example, I had the following two lines in my code, in separate locations:

(name, telephone, address, url, email, category_string) =
    line.split("\t").map { |f| f.strip }

(company, telephone, address, url, email, category_string) =
    line.split("\t").map{ |f| f.strip}

Flay noted that this code is almost identical and can be refactored to be a bit DRY-er. Would I actually change this code? Maybe and maybe not, but at least I'm more fully aware of it, which is important in and of itself. If and when I spend time refactoring this code, Flay will point to the first and most necessary areas that need attention.
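When I do get around to it, the fix is straightforward. Something like the following sketch (the parse_listing_line method name is my own invention) lets both call sites share the parsing logic:

def parse_listing_line(line)
  line.split("\t").map { |f| f.strip }
end

(name, telephone, address, url, email, category_string) =
    parse_listing_line(line)

and similarly for the line that assigns to company.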

Reek

Finally, I should mention Reek, a tool written by Kevin Rutherford, which also is invoked by metric_fu. Reek looks for “code smells”, that is, code that doesn't follow commonly accepted style. This includes finding code duplication (similar to what Flay does), poorly named variables and methods that contain more than five lines of code, which it flags as long. It also tries to find cases in which a method sends more messages to another object than to itself, which it calls feature envy.

For example, consider the code I mentioned above, which read:

(company, telephone, address, url, email, category_string) =
    line.split("\t").map{ |f| f.strip}

Flay noticed that this code was duplicated. But beyond that, a one-letter variable name is almost always a bad idea, because it reduces the readability of the code. Sure enough, Reek will flag this code as having an “uncommunicative name” for the variable f.
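The fix is trivial; giving the block variable a descriptive name keeps Reek quiet and makes the intent a bit clearer. This is simply the same line with the variable renamed:

(company, telephone, address, url, email, category_string) =
    line.split("\t").map { |field| field.strip }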

Even if I'm not totally sold on “Reek-driven development”, as Rutherford describes on the Reek home page, Reek is a useful way to find potential problems and provide additional feedback on the program that I'm writing.

Conclusion

Because of its dynamism and flexibility, Ruby offers programmers the chance to do things that might lead to maintainability problems down the road. Fortunately, the Ruby community has produced a set of excellent tools for automated testing and analysis that make it possible to produce high-quality code that is easy for others to follow, test and maintain. metric_fu puts many of these tools into a single package, making it easy to run a variety of tests on your code.

Reuven M. Lerner, a longtime Web/database developer and consultant, is a PhD candidate in learning sciences at Northwestern University, studying on-line learning communities. He recently returned (with his wife and three children) to their home in Modi'in, Israel, after four years in the Chicago area.