Test your Web applications by controlling real browsers from Ruby.
I recently was visiting a new client when one of the staff developers asked me to help debug his automated test suite. I asked what test frameworks he used, and he responded by saying, “Cucumber and Watir”.
Now, I've long known about Cucumber, and I even wrote about it several years ago in this column. It's an advanced acceptance-testing framework that's especially popular among Rubyists who promote behavior-driven development (BDD), an offshoot of test-driven development (TDD), which looks at a system's functionality from the end user's perspective. Cucumber allows you to describe your tests in plain English, which makes the test specifications easy to understand, even for nontechnical users.
But Watir? I knew that Watir (Web Application Testing In Ruby, pronounced “water”) is a BSD-licensed framework that lets you interact with a Web browser from within a Ruby program, allowing you to test with a real browser (like the macro-based, cross-platform Selenium test tool) but with Ruby commands (like the popular Capybara test tool). I had last looked at Watir more than four years ago, at which time it had a fatal flaw: it worked only on Windows systems and only with Internet Explorer.
But it seems that in the past few years, Watir has grown and expanded, with a suite of software projects and products that are increasingly flexible and impressive in their scope, handling many different types of browsers, operating systems and conditions.
In this article, I take a look at Watir and how you can use it to test your Web applications more easily. My client's test suite, implemented with Watir, really impressed me, and it showed how with a bit of cleverness, you can take advantage of Watir's platform-independence to ensure that your application works on different browsers and operating systems with a minimum of hassle—and without having to worry if the JavaScript emulation package you're using will reflect the version that's in a user's browser accurately.
One of the confusing things about Watir is that it's the name of one particular open-source project, as well as a set of related projects that have the “watir” name in them. Each project has slightly different capabilities and dependencies, and is maintained by a different person or group.
Moreover, although documentation for Watir certainly exists, it's not nearly as centralized, organized or easy to follow as that of many other open-source projects that have reached this maturity level. Using Watir frequently requires that you understand the general world of Web browsers and automated testing frameworks, and that you figure out which version of Watir will do what you want.
For example, as I mentioned previously, Watir originally worked only on Windows and IE. This remains true for the “watir” system itself, also known as “watir-classic”.
You can use Watir on non-Windows machines by using Webdriver—part of Selenium, an Apache-licensed Web application testing framework. Just in case you missed that part, let me repeat it: Watir's cross-platform capabilities are made possible by Webdriver, which is developed by the maintainers of the “competing” Selenium test automation system.
(Wait, did I say that Webdriver is an open-source product? It is, but it's also a draft specification API from the W3C, the reference implementation of which is, you guessed it, the Webdriver software in the Selenium project.)
Because Selenium is widely used and well maintained, and because the Selenium team wants to have connections from many languages and to many browsers, it's generally safe to assume that you can control just about any Web browser programmatically using Webdriver. The Watir community has taken advantage of this, creating the watir-webdriver gem—basically, connecting the Watir API to the Webdriver back end. Of course, there are exceptions; I recently found that watir-webdriver did not yet support the just-released Firefox 19. A quick check on Freenode IRC's #watir channel confirmed that Webdriver doesn't yet support the most-recent Firefox version.
The bottom line is that for virtually any work you'll want to do on a Linux system, you'll need to install the watir-webdriver gem:
sudo gem install watir-webdriver -V
Once you have installed watir-webdriver, you can work with Chrome and Firefox. Actually, that's not entirely true. You'll also need to install chromedriver, a program that lets you control Chrome via Webdriver (see Resources for the download URL).
Once you have installed Watir, things should be much easier and more exciting. Fire up an interactive Ruby shell, either using IRB or the more modern Pry. If you're running Ruby 1.8, you first need to load the RubyGems package:
pry(main)> require 'rubygems'
And in all cases, you then need to load the watir-webdriver library, the gem installed above:
pry(main)> require 'watir-webdriver'
Now I'm going to create an instance of Watir::Browser, the class that represents a Web browser, which I can control via Ruby. I do this by passing the “new” constructor method a parameter indicating which kind of browser I want. For example, I can create a Chrome browser:
pry(main)> browser = Watir::Browser.new :chrome
In my case, this takes a little while to execute, as the browser binary starts up and then opens a communication channel, via Webdriver and Watir, with my Ruby instance. I often like to use the “inspect” method on a Ruby object to examine it. Here's what I get when I do this with my browser:
pry(main)> print browser.inspect
#<Watir::Browser:0x1b4e74e97c5b6930 url="about:blank" title="about:blank">
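Because the kind of browser is just a parameter to the constructor, one way to run the same script against several browsers is to pick that parameter at runtime. The following is a minimal sketch of that idea, not something Watir provides: the browser_name helper and the WATIR_BROWSER environment variable are names made up for this illustration, and the list of supported browsers simply mirrors the two mentioned above.

```ruby
# Hypothetical helper: choose which browser to drive from an environment
# variable, so the same script can run against Chrome or Firefox.
# WATIR_BROWSER is an illustrative name; Watir itself does not read it.
SUPPORTED_BROWSERS = [:chrome, :firefox].freeze

def browser_name(env = ENV, default = :chrome)
  raw = env['WATIR_BROWSER'].to_s.strip.downcase
  return default if raw.empty?

  name = raw.to_sym
  unless SUPPORTED_BROWSERS.include?(name)
    raise ArgumentError, "unsupported browser: #{raw}"
  end
  name
end

# With the gem loaded, the result would be passed to the constructor:
#   browser = Watir::Browser.new(browser_name)
```

Rejecting unknown names up front gives a clearer error than whatever the driver would report later.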
Not surprisingly, my browser instance has both a current URL and a title, neither of which is particularly exciting. So, let's point the browser to somewhere that's a bit more interesting:
pry(main)> browser.goto 'http://linuxjournal.com'
Within a few moments, I not only get control back at my interactive Ruby prompt, but the Web browser also has gone to the LJ home page. I can ask the browser for its title with the “title” method:
pry(main)> browser.title
=> "Linux Journal | The Original Magazine of the Linux Community"
I similarly can invoke the “url” method:
pry(main)> browser.url
=> "http://www.linuxjournal.com/"
Notice that although I told the browser to go to http://linuxjournal.com, it was redirected to http://www.linuxjournal.com, and the browser's current URL reflects this.
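In a test, a redirect like this means that comparing the requested URL and the browser's URL as plain strings will fail even though the browser ended up on the right site. Here is a small sketch of one way to compare them more loosely. The same_site? helper is hypothetical, uses only Ruby's standard URI library and assumes that a leading "www." should be ignored.

```ruby
require 'uri'

# Hypothetical helper: true when two URLs point at the same site,
# treating a leading "www." and letter case in the host as insignificant.
def same_site?(requested, actual)
  normalize = ->(url) { URI.parse(url).host.to_s.downcase.sub(/\Awww\./, '') }
  normalize.call(requested) == normalize.call(actual)
end

# In a live session this might be used as:
#   browser.goto 'http://linuxjournal.com'
#   same_site?('http://linuxjournal.com', browser.url)
```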
At this point, I can use the “html” method to get the HTML from the current browser window or the “text” method to retrieve a version of the current page, stripped of HTML tags. I also can go to the previous page (with the “back” method) or reload the current page (with the “refresh” method).
Now, let's say that I want to retrieve all of the headlines from the LJ site. From a quick inspection of the site, I can see that each headline is in an “h2” tag. I can ask Watir to retrieve all of the “h2” tags in a collection, from which I then can display the first headline's text. For example:
pry(main)> browser.h2s.first.text
=> "Kyle Rankin to Keynote SCALE 11x"
But why stop there? I can retrieve and display all of the current headlines:
pry(main)> browser.h2s.each {|h| puts h.text}
I'm not going to use up all of my column with a list of headlines, but you can be sure that this does print all of the headlines on the page. Actually, it does a little more than that: the above code prints the text of every h2 tag anywhere on the page, not just the headlines. Upon closer inspection, I don't really want all of the h2 tags on this page, but rather all of the h2 tags that are within the div whose ID is “content-area”.
So, what I need to do is tell Watir to grab the “content-area” div and then retrieve all of the h2s contained within it. To grab the content-area div, I use the “div” method, which takes a hash describing the attributes of the div I want:
browser.div(id: 'content-area')
That returns the div, which is a good start. But I want the h2s within the div, so I can just say:
browser.div(id: 'content-area').h2s
Yes, the “h2s” method that I used before, when executed within the context of a div, restricts the search to that div, rather than the entire browser window. So, I can display all of the latest headlines with:
browser.div(id: 'content-area').h2s.each {|h| puts h.text}
And sure enough, that works very nicely. Watir provides methods for you to retrieve many different types of elements from within the full browser context, or within a more restrictive context. Thus, you can find a single div with the singular “div” method or a number of them matching (optional) criteria with the plural “divs” method. You can retrieve the first h2, or the first h2 to match optional criteria, with the “h2” method, or all of the h2s that match optional criteria with the plural “h2s” method. The same is true for “span”, “h3”, “p” and even “a” tags. The singular method always returns a single element: if several elements match your criteria, you get the first of them.
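The scoped lookup above is easy to wrap in a small method, which keeps the page-specific details in one place. This is only a sketch: the headlines helper is a made-up name, it assumes that the object passed in behaves like a Watir::Browser, and the 'content-area' default is simply the ID from the LJ page discussed above.

```ruby
# Hypothetical helper: collect the headline text inside one container div.
# `browser` is expected to be a Watir::Browser (or anything that responds
# to `div` the same way); 'content-area' is the ID used on the LJ page.
def headlines(browser, container_id = 'content-area')
  browser.div(id: container_id).h2s
         .map { |h| h.text.strip }   # collections are enumerable
         .reject(&:empty?)           # skip h2 tags with no visible text
end

# In a live session:
#   headlines(browser).each { |title| puts title }
```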
Watir provides a number of tricks that let you find and work with Web pages. For example, I can search for all paragraphs whose text is the word “Linux”:
pry(main)> browser.ps(text: 'Linux').count
Not surprisingly, this returns a value of 0, because no paragraph consists solely of the word “Linux”. However, if I pass a regexp object rather than a string, I'll get back the “p” elements that contain the word “Linux” anywhere inside them:
pry(main)> browser.ps(text: /Linux/).count
=> 4
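The difference between the two calls is the difference between exact matching and pattern matching. The following plain-Ruby sketch mirrors that distinction without a browser. It is not Watir's own locator code, and the sample strings and the count_matching helper are invented for illustration.

```ruby
# Plain-Ruby illustration of exact versus pattern matching. This is not
# Watir's locator code; it only mirrors the distinction described above.
def count_matching(texts, criterion)
  # String#=== tests equality; Regexp#=== tests for a match anywhere.
  texts.count { |text| criterion === text }
end

paragraphs = [
  'Linux Journal is the original magazine of the Linux community.',
  'Subscribe today.',
  'Linux'
]

puts count_matching(paragraphs, 'Linux')   # prints 1 (exact match only)
puts count_matching(paragraphs, /Linux/)   # prints 2 (match anywhere)
```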
As you can see, one of the ways in which I check that my criteria are working is by using the “count” method. If I'm working with a single Watir element, I also can use the “flash” method to highlight the element briefly on the screen, in the browser. For example:
pry(main)> browser.h2(text: 'Help Us Feed You Pi!').flash
Although this makes the element extremely obvious in the browser for a few seconds, you do need to be looking at the browser, and have it scrolled to the appropriate element, to see it. You also can loop through a number of matches and flash each of them in turn:
pry(main)> browser.ps(text: /Linux/).each {|p| p.flash}
Of course, Watir can do much more than retrieve information from the browser. For example, let's assume that I want to execute a search on the LJ site. Looking at the home page, I see a text field and a search button next to it. If I enter text in that text field and click on the search button, I should see some results.
By using the “view source” feature in my browser, I see that the search field has a class of “gsc-input”. So, I focus on that element, which removes the grayed-out “hint” text:
pry(main)> browser.input(class:'gsc-input').focus
Now, I type into the element:
pry(main)> browser.input(class:'gsc-input').send_keys('Reuven')
Finally, I submit the search form:
pry(main)> browser.input(class:'gsc-search-button').click
Sure enough, the browser submits the form, and I get a list of ego-boosting results from the LJ site.
Notice that I used the “send_keys” method when typing into the form element. There are other ways to modify the text. For example, I can retrieve the text of the text field with the “text” method:
pry(main)> browser.input(class:'gsc-input').text
If I use the “text_field” method, I also can set the value of the text in the search field before clicking on the search button:
pry(main)> browser.text_field(class:'gsc-input').set 'Reuven'
pry(main)> browser.input(value:'Search').click
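In a test suite, steps like these tend to be repeated, so it can help to wrap them in a method. Here is a minimal sketch under a few assumptions: search_site is a hypothetical name, the object passed in behaves like a Watir::Browser, and the class names are the ones found on the LJ page above, so they will differ on other sites.

```ruby
# Hypothetical helper wrapping the search steps shown above. The class
# names 'gsc-input' and 'gsc-search-button' come from the LJ page as
# described in the text and are not general-purpose values.
def search_site(browser, term)
  browser.text_field(class: 'gsc-input').set(term)   # fill in the field
  browser.input(class: 'gsc-search-button').click    # submit the search
  browser.url                                        # where the browser ended up
end

# In a live session:
#   search_site(browser, 'Reuven')
```

Returning the resulting URL gives the caller something simple to check once the search has been submitted.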
One of the best things about Watir is that you're not working inside an emulation package, but rather an actual browser. This means that you can use and test things that require JavaScript. For example, if I go to the AirBNB.com home page, I see a huge photo behind the simple “Where do you want to go?” form. That photo changes every 30 seconds or so, but I can force it to move forward or backward by clicking on the < and > signs. Is there a way for me to move the image forward or backward without clicking on these buttons with the mouse?
The answer is yes, but first I have to find the elements that will take the click. Opening the HTML source, I found that the < and > buttons were defined as <i> elements inside a div whose class was “arrows”. I was able to grab that div with:
browser.div(class: 'arrows')
I then wanted to get at the <i> elements inside of the div. Using the “elements” method on the div, I grabbed the left-arrow:
browser.div(class: 'arrows').elements.first
Of course, I double-checked that I had grabbed the correct element with the “flash” method:
browser.div(class: 'arrows').elements.first.flash
Having established that it was the right item, I clicked on it:
browser.div(class: 'arrows').elements.first.click
Sure enough, that did it—the front page of AirBNB.com switched images right away, rather than waiting for the timeout to occur. In this way, you can test JavaScript and Ajax events, as they take place in an actual browser.
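The same lookup can be wrapped so that a test can move the slideshow in either direction. This is a sketch only. The click_arrow helper is a made-up name, and it assumes that the object passed in behaves like a Watir::Browser. It also assumes that the right arrow is the last element inside the “arrows” div, which the page source would need to confirm.

```ruby
# Hypothetical helper: click the previous or next arrow of the slideshow.
# As in the text, the arrows are taken to be the elements inside the div
# whose class is "arrows". Treating the last element as the right arrow
# is an assumption made for this sketch.
def click_arrow(browser, direction)
  arrows = browser.div(class: 'arrows').elements
  target = case direction
           when :left  then arrows.first
           when :right then arrows.last
           else raise ArgumentError, 'direction must be :left or :right'
           end
  target.click
end

# In a live session:
#   click_arrow(browser, :right)
```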
The good news is that Watir is a great tool for testing Web applications. I've been using it only a short while, and I'm already delighted with the sorts of sophisticated tests I can run. The fact that I can test different browsers automatically, work against actual sites and know that I'm using a real-world environment, rather than one designed for testing, is a big help. Watir has come a long way from its Windows-and-IE-only roots, and the developers deserve a great deal of credit.
That said, there are a number of issues with Watir, ranging from the difficult-to-follow installation instructions, to the confusingly large number of related (and unrelated) Ruby gems, to the API, which is well written, but poorly documented. It's frustrating, for example, that I can retrieve the contents of a text field using either the “input” or “text_field” methods, but that only text_field allows me to change the current text value.
Integrating Watir into existing test facilities, such as RSpec and Cucumber, appears to be possible, but the documentation for doing so (like much in the Watir world) is incomplete, somewhat out of date and spread across a number of sites.
Should you use Watir? Based on what I've seen, the answer is yes, although you should be prepared to spend some time reading blog posts, downloading a number of versions of Watir and trying to understand why things that should work don't. Once you're past the initial installation and learning curve, working with Watir is pleasant and straightforward, with an API that is second to none. I've started to incorporate Watir into my own test procedures, and I encourage you to consider doing the same on your own projects.