At the Forge: Testing Your Code with Python's pytest

Don't test your code? pytest removes any excuses. By Reuven M. Lerner

Software developers don't just write software; they also use software. So, they're the first to recognize, and understand, that software is complex and inevitably contains bugs.

But, just because bugs are inevitable doesn't mean that developers can't or shouldn't try to prevent them. And, thus, during the past few decades, there's been rapid growth in software testing. Testing is no longer seen as an optional or "nice to have" part of software development; it's considered an absolute must, a core part of the software development process. In many cases, the people in the Python courses I teach at various companies aren't developers per se, but instead testers: people with the full-time job of writing tests to ensure that the company's software is robust.

I must admit that even though I've been writing software for a long time, I have rarely been as good about testing as I'd like to be. Sure, when I'm working on a large, complex app, I'll write tests, but it always seemed to be a bit of a burden. I know that it's good for me, will save tons of time in the future, will make the software more robust and maintenance easier, but really, if I just want to get my program out the door, why test? And besides, the various test frameworks I've used through the years never struck me as very impressive or easy to use.

So for the past few years, I've been in a bit of a holding pattern. I want to test more, but testing is annoying, so I don't test, which makes it seem like even more of a burden, because it's not part of my regular process.

All of this has changed for me recently, thanks to my discovery (long after other people, I admit) of the pytest library for Python. pytest turns out to be easy to use, easy to work with and easy to integrate into my work. Part of the reason for this is that pytest abandons the Python idea of "there's only one way to do it", giving developers a great degree of flexibility and freedom in choosing how to write tests.

So in this article, I provide an introduction to pytest, showing how to start integrating it into your development process today. I plan to expand on this in my next article and describe some more advanced pytest features that you might need to use.

Basic pytest

The idea behind pytest is that if you want to test a function, you'll write a separate function to test it. In practice, you'll probably want to write more than one test function for each function you're testing.

For example, let's assume you have the following function that sums numbers:


def mysum(numbers):
    output = 0
    for one_number in numbers:
        output += one_number
    return output

How can you test this function? (And yes, I'm ignoring the "test-driven development" mode of testing, in which you first write the tests and then write the code. You certainly can do TDD with pytest, but that isn't my point right now.)

I put this function definition in mysum.py. I next can create a file called test_mysum.py in the same directory. Then, when I run pytest in the current directory, it'll run all of the files starting with test_. How might test_mysum.py look? Let's start with something simple:


from mysum import mysum

def test_sum_integers():
    assert mysum([0,1,2,3,4]) == 10

As you can see, my test file test_mysum.py is fairly short. But it contains an actual test, and it also points to how tests can and will be written.

First, you have to import the file that you want to test. This can be a simple "import XYZ" statement, or you can import names selectively from the module with "from X import Y". Either way, you'll need to import the functions and classes you'll be testing.

The tests themselves are written as Python functions whose names begin with test_. (Yes, this means that tests are written in files whose names begin with test_, and then with functions in those files whose names begin with test_.)
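The naming convention described above can be pictured with a short sketch. This is only an illustration of the two naming rules, not pytest's actual collection code (which is configurable and more involved); the helper names looks_like_test_file and looks_like_test_function are hypothetical.

```python
from fnmatch import fnmatchcase


def looks_like_test_file(filename):
    # By default pytest collects files named test_*.py or *_test.py
    return fnmatchcase(filename, "test_*.py") or fnmatchcase(filename, "*_test.py")


def looks_like_test_function(name):
    # This article's convention: test function names begin with test_
    return name.startswith("test_")


for filename in ["mysum.py", "test_mysum.py"]:
    print(filename, looks_like_test_file(filename))
```

Under these rules, mysum.py is left alone, while test_mysum.py and the test_sum_integers function inside it are picked up.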

In simple cases, these test functions take no parameters. The functions are called by pytest, and the key to the tests is the assert statement. Python's assert statement evaluates an expression. If the expression is true, nothing happens and execution continues; if it's false, assert raises an AssertionError exception. pytest counts a test function that finishes without raising an exception as a pass, and one that raises AssertionError as a failure.

So in the case of this example test function, I'm basically saying "if I call the function with one argument, the list [0,1,2,3,4], I'm expecting to get the integer 10 back as a result".
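To see what assert does on its own, outside pytest, here is a minimal sketch in plain Python. The try/except is only there to make the behavior visible; the variable name outcome is my own.

```python
def mysum(numbers):
    output = 0
    for one_number in numbers:
        output += one_number
    return output


# True expression: assert does nothing, and execution continues
assert mysum([0, 1, 2, 3, 4]) == 10

# False expression: assert raises AssertionError
try:
    assert mysum([1, 1]) == 3
except AssertionError:
    outcome = "failed"
else:
    outcome = "passed"

print(outcome)
```

When pytest runs a test function, an AssertionError like the second one is what it reports as a failed test.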

How do I run my test? I go into the directory where my files are located, and I type:


pytest

Sure enough, pytest notices that there's a file matching the "test_*" pattern, which it runs. After some initial boilerplate indicating my system's configuration, I get the following output:


collected 1 item

test_mysum.py .                             [100%]

================ 1 passed in 0.02 seconds =====================

In other words, there was one file (test_mysum.py). It contained a single test function, represented by a dot (.). And, 100% of those tests ran successfully—meaning, what I asserted is indeed what was actually returned.

But of course, a single test like this isn't enough. I should probably call the function with an empty list to make sure I get a 0 value back. So, let's add another test. Now test_mysum.py looks like this:


from mysum import mysum

def test_sum_integers():
    assert mysum([0,1,2,3,4]) == 10

def test_sum_nothing():
    assert mysum([]) == 0

And when I run the tests, I get:


collected 2 items

test_mysum.py ..                             [100%]

================= 2 passed in 0.10 seconds ==================

Let's add another test to see what happens if I invoke it with some floating-point numbers:


from mysum import mysum

def test_sum_integers():
    assert mysum([0,1,2,3,4]) == 10

def test_sum_floats():
    assert mysum([0.1,1.2,2.3,3.4,4.5]) == 11.5

def test_sum_nothing():
    assert mysum([]) == 0

And now, when I test things, I get:


collected 3 items

test_mysum.py ...                             [100%]

=================== 3 passed in 0.06 seconds ================

Sure enough, I've done a great job of testing so far.
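One caveat about the floating-point test: comparing floats with == can be fragile, because many decimal values can't be stored exactly. A hedged alternative, assuming you're willing to accept a small tolerance, is pytest.approx. The two test names below are hypothetical and aren't part of the test file above.

```python
import pytest


def mysum(numbers):
    output = 0
    for one_number in numbers:
        output += one_number
    return output


def test_sum_floats_exact_is_fragile():
    # 0.1 + 0.2 is stored as 0.30000000000000004, so exact equality fails
    assert mysum([0.1, 0.2]) != 0.3


def test_sum_floats_approx():
    # pytest.approx compares within a small tolerance
    assert mysum([0.1, 0.2]) == pytest.approx(0.3)
```

Both of these tests should pass, which shows that the approximate comparison succeeds in a case where the exact one would not.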

I should note that while I've used only a single assert statement in each function here, you definitely can have more than one. I prefer to keep each test function as focused as possible, and thus, I use as few assert statements as I can.
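For illustration, here is a sketch of a test function with several assert statements. The function name and the choice of inputs are my own. Note that a test stops at the first assert that fails, so any assertions after it aren't checked in that run, which is one reason to keep tests small.

```python
def mysum(numbers):
    output = 0
    for one_number in numbers:
        output += one_number
    return output


def test_sum_integers_from_different_iterables():
    # The same five numbers, passed as a list, a tuple and a range
    assert mysum([0, 1, 2, 3, 4]) == 10
    assert mysum((0, 1, 2, 3, 4)) == 10
    assert mysum(range(5)) == 10
```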

Failing Tests

What if a test fails? Let's give it a shot by deliberately introducing a test that will fail. In test_mysum.py, I've added:


def test_one_and_one_are_three():
    assert mysum([1,1]) == 3

When I run the tests, I get the following output:


test_mysum.py ...F                             [100%]

========================== FAILURES ==========================
____________________ test_one_and_one_are_three ____________

    def test_one_and_one_are_three():
>       assert mysum([1,1]) == 3
E       assert 2 == 3
E        +  where 2 = mysum([1, 1])

test_mysum.py:15: AssertionError
================== 1 failed, 3 passed in 0.30 seconds ========

First, you can see that four tests ran in test_mysum.py. The first three ran successfully and were represented by dots. The fourth test failed, though, and is represented by an F. "Failure" in this case means that the assert statement claimed that there would be one answer (3), but that running the function produced a different answer (2). pytest not only indicates that there was a failure, but it also indicates in which test function the error occurred and the line where it took place. This allows you to figure out where the problem lies.

In the case of failure, of course, there are two possibilities: the original code is wrong, or your test is wrong. Don't forget that tests are code, which means that they can be prone to problems too! However, if you write your tests cleanly and clearly (and before or as you write the code), I've found that most tests will be simple and straightforward, making it less likely that the tests have problems and easier to identify the location of bugs.

You can get even more detailed output by running pytest with the -v option (that is, pytest -v):


test_mysum.py::test_sum_integers PASSED                [ 25%]
test_mysum.py::test_sum_floats PASSED                [ 50%]
test_mysum.py::test_sum_nothing PASSED                [ 75%]
test_mysum.py::test_one_and_one_are_three FAILED       [100%]

============================= FAILURES =======================
_______________________ test_one_and_one_are_three ___________

    def test_one_and_one_are_three():
>       assert mysum([1,1]) == 3
E       assert 2 == 3
E        +  where 2 = mysum([1, 1])

test_mysum.py:15: AssertionError
==================  1 failed, 3 passed in 0.22 seconds =======

Now you can see precisely which tests passed and failed, as well as where the failures took place.

Parametrized Tests

The successful tests created so far (test_sum_nothing, test_sum_integers and test_sum_floats) are all great and useful. But if you're like me, you might be wondering why you need three separate test functions just to check those three similar, but not identical, invocations. The pytest people agree, and they suggest the use of "parametrized tests". The idea here is that you define the test a single time, but tell pytest which inputs and outputs to provide.

You can do this by applying a Python decorator to the test function. The decorator will take two arguments: a string with comma-separated names representing the parameters you want to pass to the test and a list of two-element tuples describing the inputs and outputs. For example, given all of these tests:


def test_sum_integers():
    assert mysum([0,1,2,3,4]) == 10

def test_sum_floats():
    assert mysum([0.1,1.2,2.3,3.4,4.5]) == 11.5

def test_sum_nothing():
    assert mysum([]) == 0

You can replace them all with a single test:


import pytest

from mysum import mysum

# Each tuple is one test case: (numbers, output)
@pytest.mark.parametrize('numbers,output', [
    ([], 0),
    ([10, 20, 30], 60),
    ([0.1, 1.2, 2.3, 3.4, 4.5], 11.5)])
def test_mysum(numbers, output):
    assert mysum(numbers) == output

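While this covers the same three kinds of input as before, it definitely looks a bit more complex. The decorator's first argument names the two parameters, numbers and output, that the test function accepts. Its second argument is a list with one tuple per test case. pytest calls test_mysum once for each tuple, passing the tuple's two elements as numbers and output.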

With this in place, you can now run your tests with the -v option, and you'll get the following output:


test_mysum.py::test_mysum[numbers0-0] PASSED         [33%]
test_mysum.py::test_mysum[numbers1-60] PASSED          [66%]
test_mysum.py::test_mysum[numbers2-11.5] PASSED         [100%]

======================== 3 passed in 0.12 seconds ============

If you're thinking, "wow, that looks a lot like the output from the three separate tests"—well, that's exactly right.
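Conceptually, parametrization resembles looping over a list of cases and running the same assertion on each, while reporting every case separately. The following is only a rough sketch of that idea in plain Python, not how pytest is implemented; CASES, check_mysum and run_cases are hypothetical names, and the last case is deliberately wrong to show that one failure doesn't stop the others.

```python
def mysum(numbers):
    output = 0
    for one_number in numbers:
        output += one_number
    return output


# (numbers, output) pairs; the last one is deliberately wrong
CASES = [([], 0), ([10, 20, 30], 60), ([1, 1], 3)]


def check_mysum(numbers, output):
    assert mysum(numbers) == output


def run_cases(cases):
    # Run every case, recording a separate result for each one
    report = []
    for numbers, output in cases:
        try:
            check_mysum(numbers, output)
        except AssertionError:
            report.append((numbers, output, "FAILED"))
        else:
            report.append((numbers, output, "PASSED"))
    return report


for numbers, output, status in run_cases(CASES):
    print(numbers, output, status)
```

With the decorator, pytest handles this bookkeeping for you and gives each case its own line in the report.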

Summary

There's much more to say about pytest, but what I've written here covers most of the cases you'll encounter in your day-to-day work. Next time, I plan to cover a few other topics, including how to deal with exceptions, user input and output, and checking the code coverage.

Resources

The pytest website is at https://docs.pytest.org/en/latest.

An excellent book on the subject is Brian Okken's Python Testing with pytest, published by Pragmatic Programmers. He also has many other resources about pytest and code testing in general.

About the Author

Reuven Lerner teaches Python, data science and Git to companies around the world. You can subscribe to his free, weekly "better developers" e-mail list, and learn from his books and courses at http://lerner.co.il. Reuven lives with his wife and children in Modi'in, Israel.
