In my last two articles I've described some of the ways Mypy, a type checker for Python, can help identify potential problems with your code. [See "Introducing Mypy, an Experimental Optional Static Type Checker for Python" and "Python's Mypy—Advanced Usage".] For people (like me) who have enjoyed dynamic languages for a long time, Mypy might seem like a step backward. But given the many mission-critical projects being written in Python, often by large teams with limited communication and Python experience, some kind of type checking is an increasingly necessary evil.
It's important to remember that Python, the language, isn't changing, and it isn't becoming statically typed. Mypy is a separate program, running outside Python, typically as part of a continuous integration (CI) system or invoked as part of a Git commit hook. The idea is that Mypy runs before you put your code into production, identifying where the data doesn't match the annotations you've made to your variables and function parameters.
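As a minimal sketch of that separation (the function name double is made up for illustration), the following code runs without complaint under Python even though one call ignores the annotation. A checker such as Mypy would be expected to flag that call before the code ever runs:

```python
def double(n: int) -> int:
    # The annotations describe the intended types; Python does not enforce them.
    return n * 2

ok = double(5)          # matches the annotation
sloppy = double("ab")   # runs anyway; a type checker is expected to flag this call

print(ok)       # 10
print(sloppy)   # abab
```

Python happily repeats the string, which is exactly the kind of silent mismatch an external checker is meant to catch.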
I'm going to focus on a few of Mypy's advanced features here. You might not encounter them very often, but even if you don't, they'll give you a better picture of the complexities associated with type checking, of how deeply the Mypy team is thinking about its work and of what tests need to be done. They'll also help you understand more about the ways people do type checking, and how to balance the beauty, flexibility and expressiveness of dynamic typing with the strictness and fewer errors of static typing.
When I tell participants in my Python classes that everything in Python is an object, they nod their heads, clearly thinking, "I've heard this before about other languages." But then I show them that functions and classes are both objects, and they realize that Python's notion of "everything" is a bit more expansive than theirs. (And yes, Python's definition of "everything" isn't as wide as Smalltalk's.)
When you define a function, you're creating a new object, one of type "function":
>>> def foo():
...     return "I'm foo!"
...
>>> type(foo)
<class 'function'>
Similarly, when you create a new class, you're adding a new object type to Python:
>>> class Foo():
...     pass
...
>>> type(Foo)
<class 'type'>
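Because functions and classes are ordinary objects, you can store them in data structures and assign them to other names. Here is a short sketch; the names greet and Greeter are invented for this example:

```python
def greet(name):
    return f"Hello, {name}!"

class Greeter:
    def __init__(self, name):
        self.name = name

# Functions and classes can sit in a list like any other object
things = [greet, Greeter]
names = [thing.__name__ for thing in things]
print(names)             # ['greet', 'Greeter']

# A function can be assigned to another variable and called through it
say_hi = greet
message = say_hi("world")
print(message)           # Hello, world!
```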
It's a pretty common paradigm in Python to write a function that, when it runs, defines and returns an inner function. This is also known as a "closure", and it has a few different uses. For example, you can write:
def foo(x):
    def bar(y):
        return f"In bar, {x} * {y} = {x*y}"
    return bar
You then can run:
b = foo(10)
print(b(2))
And you'll get the following output:
In bar, 10 * 2 = 20
I don't want to dwell on how all of this works, including inner functions and Python's scoping rules. I do, however, want to ask the question "how can you use Mypy to check all of this?"
You could annotate both x and y as int. And you can annotate the return value from bar as a string. But how can you annotate the return value from foo? Given that, as shown above, functions are of type function, perhaps you can use that. But function isn't actually a recognized name in Python.
Instead, you'll need to use the typing module, which comes with Python 3 so you can do this kind of type checking. And in typing, the name Callable is defined for precisely this purpose. So you can write:
from typing import Callable

def foo(x: int) -> Callable:
    def bar(y: int) -> str:
        return f"In bar, {x} * {y} = {x*y}"
    return bar

b = foo(10)
print(b(2))
Sure enough, this passes Mypy's checks. The function foo returns Callable, a description that includes both functions and classes.
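To see why a bare Callable covers both, here is a small sketch using the built-in callable() function. The names make_function and make_class are hypothetical, and the assumption is that a checker would accept both return values as Callable:

```python
from typing import Callable

def make_function() -> Callable:
    def inner() -> str:
        return "from a function"
    return inner

def make_class() -> Callable:
    class Inner:
        pass
    return Inner          # a class can be called too: calling it creates an instance

function_is_callable = callable(make_function())
class_is_callable = callable(make_class())
int_is_callable = callable(42)

print(function_is_callable)   # True
print(class_is_callable)      # True
print(int_is_callable)        # False
```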
But, wait a second. Maybe you don't only want to check that foo returns a Callable. Maybe you also want to make sure that it returns a function that takes an int as an argument. To do that, you'll use square brackets after the word Callable, putting two elements in those brackets. The first will be a list (in this case, a one-element list) of argument types. The second element in the brackets will describe the return type from the function. In other words, the code now will look like this:
#!/usr/bin/env python3
from typing import Callable

def foo(x: int) -> Callable[[int], str]:
    def bar(y: int) -> str:
        return f"In bar, {x} * {y} = {x*y}"
    return bar

b = foo(10)
print(b(2))
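The same bracketed form works for parameters. In this sketch, the helper names apply, describe and half are made up for illustration. The function apply accepts only callables that take an int and return a str, so a checker would be expected to accept describe and complain about half, even though Python itself runs both:

```python
from typing import Callable

def apply(func: Callable[[int], str], value: int) -> str:
    # func is declared as "takes one int, returns a str"
    return func(value)

def describe(n: int) -> str:
    return f"{n} squared is {n * n}"

def half(n: int) -> float:
    return n / 2

good = apply(describe, 4)   # matches Callable[[int], str]
bad = apply(half, 4)        # runs, but half returns a float, not a str

print(good)   # 4 squared is 16
print(bad)    # 2.0
```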
With all this talk of callables, you also should consider what happens with generator functions. Python loves iteration and encourages you to use for loops wherever you can. In many cases, it's easiest to express your iterator in the form of a function, known in the Python world as a "generator function". For example, you can create a generator function that returns the Fibonacci sequence as follows:
def fib():
    first = 0
    second = 1
    while True:
        yield first
        first, second = second, first+second
You then can get the first 50 Fibonacci numbers as follows:
g = fib()
for i in range(50):
    print(next(g))
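An alternative way to take a fixed number of values from an infinite generator is itertools.islice from the standard library. This sketch repeats the fib function so it can run on its own, and takes only ten values to keep the output short:

```python
from itertools import islice

def fib():
    first, second = 0, 1
    while True:
        yield first
        first, second = second, first + second

# islice stops after ten items, so the infinite loop in fib is never a problem
first_ten = list(islice(fib(), 10))
print(first_ten)   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```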
That's great, but what if you want to add Mypy checking to your fib function? It would seem that you can just say that the return value is an integer:
def fib() -> int:
    first = 0
    second = 1
    while True:
        yield first
        first, second = second, first+second
But if you try running this via Mypy, you get a pretty stern response:
atf201906b.py:4: error: The return type of a generator function should be "Generator" or one of its supertypes
atf201906b.py:14: error: No overload variant of "next" matches argument type "int"
atf201906b.py:14: note: Possible overload variant:
atf201906b.py:14: note: def [_T] next(i: Iterator[_T]) -> _T
atf201906b.py:14: note: <1 more non-matching overload not shown>
Whoa! What's going on?
Well, it's important to remember that the result of running a generator function is not whatever you're yielding with each iteration. Rather, the result is a generator object. The generator object, in turn, then yields a particular type with each iteration.
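You can confirm this at runtime. In the following sketch, calling fib produces a generator object rather than an int, and that object also counts as an Iterator, which is one of the "supertypes" the error message refers to:

```python
import types
from collections.abc import Generator, Iterator

def fib():
    first, second = 0, 1
    while True:
        yield first
        first, second = second, first + second

g = fib()
type_name = type(g).__name__
print(type_name)                            # generator
print(isinstance(g, types.GeneratorType))   # True
print(isinstance(g, Generator))             # True
print(isinstance(g, Iterator))              # True
print(isinstance(g, int))                   # False

# The integers only appear when you iterate over the generator object
first_three = [next(g), next(g), next(g)]
print(first_three)                          # [0, 1, 1]
```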
So what you really want to do is tell Mypy that fib will return a generator, and that with each iteration of the generator, you'll get an integer. You would think that you could do it this way:
from typing import Generator

def fib() -> Generator[int]:
    first = 0
    second = 1
    while True:
        yield first
        first, second = second, first+second
But if you try to run Mypy, you get the following:
atf201906b.py:6: error: "Generator" expects 3 type arguments, but 1 given
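It turns out that the Generator type can (optionally) get arguments in square brackets. But if you provide any arguments, you must provide three: the type yielded with each iteration, the type that the generator will receive if you invoke the send method on it, and the type that will be returned when the generator exits altogether.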
Since only the first of these is relevant in this program, you'll pass None for each of the other values:
from typing import Generator

def fib() -> Generator[int, None, None]:
    first = 0
    second = 1
    while True:
        yield first
        first, second = second, first+second
Sure enough, it now passes Mypy's tests.
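For contrast, here is a sketch of a generator in which all three type arguments matter. The function running_total is a made-up example, not something from Mypy's documentation: it yields an int, receives an int via send, and returns a str when it finishes:

```python
from typing import Generator

def running_total() -> Generator[int, int, str]:
    # Yield type: int, send type: int, return type: str
    total = 0
    count = 0
    while True:
        received = yield total      # the value passed to send() arrives here
        if received < 0:            # a negative number ends the generator
            break
        total += received
        count += 1
    return f"{count} numbers, total {total}"

gen = running_total()
yielded = [next(gen), gen.send(5), gen.send(7)]   # next() primes the generator
print(yielded)                                    # [0, 5, 12]

try:
    gen.send(-1)
except StopIteration as stop:
    summary = stop.value        # the generator's return value rides on StopIteration
print(summary)                  # 2 numbers, total 12
```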
You might think that Mypy isn't up to the task of dealing with complex typing problems, but it actually has been thought out rather well. And of course, what I've shown here (and in my previous two articles on Mypy) is just the beginning; the Mypy authors have solved all sorts of problems, from modules mutually referencing each other's types to aliasing long type descriptions.
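As one small illustration of aliasing, a long type description can be assigned to a name and reused. The alias Formatter and the functions below are hypothetical names chosen for this sketch:

```python
from typing import Callable, List

# An alias is just an assignment; Formatter now stands for the longer description
Formatter = Callable[[int], str]

def as_decimal(n: int) -> str:
    return str(n)

def as_hex(n: int) -> str:
    return hex(n)

def render(n: int, formatters: List[Formatter]) -> List[str]:
    # Apply each formatter to the same number
    return [formatter(n) for formatter in formatters]

rendered = render(255, [as_decimal, as_hex])
print(rendered)   # ['255', '0xff']
```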
If you're thinking of tightening up your organization's code, adding type checking via Mypy is a great way to go. A growing number of organizations are adding its checks, little by little, and are enjoying something that dynamic-language advocates have long ignored, namely that if the computer can check what types you're using, your programs actually might run more smoothly.
You can read more about Mypy here. That site has documentation, tutorials and even information for people using Python 2 who want to introduce Mypy via comments (rather than annotations).
You can read more about the origins of type annotations in Python, and how to use them, in PEP (Python enhancement proposal) 484, available online here.
See my previous two articles on Mypy: "Introducing Mypy, an Experimental Optional Static Type Checker for Python" and "Python's Mypy—Advanced Usage".