In my last article, I introduced Mypy, a package that enforces type checking in Python programs. Python itself is, and always will remain, a dynamically typed language. However, Python 3 supports "annotations", a feature that allows you to attach an object to variables, function parameters and function return values. These annotations are ignored by Python itself, but they can be used by external tools.
Mypy is one such tool, and it's an increasingly popular one. The idea is that you run Mypy on your code before running it. Mypy looks at your code and makes sure that your annotations correspond with actual usage. In that sense, it's far stricter than Python itself, but that's the whole point.
In my last article, I covered some basic uses for Mypy. Here, I want to expand upon those basics and show how Mypy really digs deeply into type definitions, allowing you to describe your code in a way that lets you be more confident of its stability.
Consider the following code:
x: int = 5
x = 'abc'
print(x)
This first defines the variable x, giving it a type annotation of int. It also assigns it the integer 5. On the next line, it assigns x the string abc. And on the third line, it prints the value of x.
The Python language itself has no problems with the above code. But if you run mypy against it, you'll get an error message:
mytest.py:5: error: Incompatible types in assignment
(expression has type "str", variable has type "int")
As the message says, the code declared the variable to have type int, but then assigned a string to it. Mypy can figure this out because, despite what many people believe, Python is a strongly typed language. That is, every object has one clearly defined type. Mypy notices this and then warns that the code is assigning values that are contrary to what the declarations said.
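The claim that every object has one clearly defined type can be checked at runtime. The following is a minimal sketch using only built-ins; the variable names are illustrative, and Mypy itself would still reject the rebinding of x for the reasons described above:

```python
# Every object carries exactly one type, which Python tracks at runtime.
x = 5
type_before = type(x)      # the int class

x = 'abc'                  # the name is rebound; neither object changes its type
type_after = type(x)       # the str class

# Strong typing: Python will not silently mix incompatible types.
try:
    'abc' + 5
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(type_before, type_after, mixed_ok)
```

The name x is free to point at different objects over time, which is the dynamic part. The objects themselves never change type, which is what gives Mypy something to check.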
In the above code, you can see that I declared x to be of type int at definition time, but then assigned a string to it, and then I got an error. What if I don't add the annotation at all? That is, what if I run the following code via Mypy:
x = 5
x = 'abc'
print(x)
You might think that Mypy would ignore it, because I didn't add any annotation. But actually, Mypy infers the type of value a variable should contain from the first value assigned to it. Because I assigned an integer to x in the first line, Mypy assumed that x should always contain an integer.
This means that although you can annotate variables, you typically don't have to. The main exception is when the first assignment has one type, a later assignment might have another, and you want Mypy to accept both.
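As a rough sketch of that exception, the earlier example can be annotated so that both assignments are acceptable. This assumes the Union type from the typing module, which is discussed further on in this article:

```python
from typing import Union

# Without the annotation, Mypy would infer "int" from the first assignment
# and reject the second. The explicit Union says that both types are fine.
x: Union[int, str] = 5
x = 'abc'
print(x)
```

Here the annotation widens what Mypy would otherwise have inferred, rather than restating it.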
Python's dict ("dictionary") type is probably the most important in the entire language. It would seem, at first glance, that name-value pairs aren't very exciting or important. But when you think about how often programs use name-value pairs (for variables, namespaces, user name-ID associations), it becomes clear just how necessary this can be.
Dictionaries also are used as small databases, or structures, for keeping track of data. For many people new to Python, it seems natural to define a new class whenever they need a new data type. But for many Python users, it's more natural to use a dictionary. Or if you need a collection of them, a list of dicts.
For example, assume that I want to keep track of prices on various items in a store. I can define the store's price list as a dictionary, in which the keys are the item names and the values are the item prices. For example:
menu = {'coffee': 5, 'sandwich': 7, 'soup': 8}
What happens if I accidentally try to add a new item to the menu, but mix up the name and value? For example:
menu[5] = 'muffin'
Python doesn't care; as far as it's concerned, you can have any hashable type as a key and absolutely any type as a value. But of course, you do care, and it might be nice to tighten up the code to ensure you don't make this mistake.
Here's a great thing about Mypy: it'll do this for you automatically, without you saying anything else. If I take the above two lines, put them into a Python file, and then check the program with Mypy, I get the following:
mytest.py:4: error: Invalid index type "int" for
↪"Dict[str, int]"; expected type "str"
mytest.py:4: error: Incompatible types in assignment
↪(expression has type "str", target has type "int")
In other words, Mypy noticed that the dictionary was (implicitly) set to have strings as keys and ints as values, simply because the initial definition was set that way. It then noticed that the code was trying to assign a new key-value pair with different types and pointed to the problem.
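To see why the check is worth having, here is a small sketch of what Python itself does with the same two lines at runtime. The key_types and value_types names are only for illustration:

```python
menu = {'coffee': 5, 'sandwich': 7, 'soup': 8}
menu[5] = 'muffin'   # Mypy flags this line; Python runs it without complaint

# Collect the distinct type names now present among the keys and the values.
key_types = sorted({type(k).__name__ for k in menu})
value_types = sorted({type(v).__name__ for v in menu.values()})

print(key_types)     # both int and str keys are present
print(value_types)   # both int and str values are present
```

Nothing fails here, so the mistake would only surface later, when some other code assumes every key is a string and every value is a number.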
Let's say, however, that you want to be explicit. You can do that by using the typing module, which defines annotation-friendly versions of many built-in types, as well as many new types designed for this purpose. Thus, I can say:
from typing import Dict
menu: Dict[str, int] = {'coffee': 5, 'sandwich': 7, 'soup': 8}
menu[5] = 'muffin'
In other words, when I define my menu variable, I also give it a type annotation. This type annotation makes explicit what Mypy inferred from the dict's definition, namely that keys should be strings and values should be ints. So, I got the following error message from Mypy:
mytest.py:6: error: Invalid index type "int" for
↪"Dict[str, int]"; expected type "str"
mytest.py:6: error: Incompatible types in assignment
↪(expression has type "str", target has type "int")
What if I want to raise the price of the soup by 0.5? Then the code looks like this:
menu: Dict[str, int] = {'coffee': 5, 'sandwich': 7, 'soup': 8.5}
And I end up getting an additional warning:
mytest.py:5: error: Dict entry 2 has incompatible type "str":
↪"float"; expected "str": "int"
As I explained in my last article, you can use a Union to define several different options:
from typing import Dict, Union
menu: Dict[str, Union[int, float]] = {'coffee': 5, 'sandwich': 7, 'soup': 8.5}
menu[5] = 'muffin'
With this in place, Mypy knows that the keys must be strings, but the values can be either ints or floats. So, this silences the complaint about the soup's price being 8.5, but retains the warning about the reversed assignment regarding muffins.
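The same Union can be reused wherever the menu is consumed. The sketch below is only an illustration: the Price alias and the total function are hypothetical names, not something from the code above:

```python
from typing import Dict, Union

# Hypothetical alias, so the Union does not have to be repeated everywhere.
Price = Union[int, float]

def total(prices: Dict[str, Price]) -> float:
    # Add up every price; float() normalizes an all-int sum to a float.
    return float(sum(prices.values()))

menu: Dict[str, Price] = {'coffee': 5, 'sandwich': 7, 'soup': 8.5}
print(total(menu))
```

Naming the Union once keeps the dictionary's annotation and the function's parameter annotation from drifting apart.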
In my last article, I showed how when you define a function, you can annotate not only the parameters, but also the return type. For example, let's say I want to implement a function, doubleget, that takes two arguments: a dictionary and a key. It returns the value associated with the key, but doubled. For example:
from typing import Dict

def doubleget(d: Dict[str, int], k) -> int:
    return d[k] * 2

menu: Dict[str, int] = {'coffee': 5, 'sandwich': 7, 'soup': 8}
print(doubleget(menu, 'sandwich'))
This is fine, but what happens if the user passes a key that isn't in the dict? This will end up raising a KeyError exception. I'd like to do what the dict.get method does, namely return None if the key is unknown. So, my implementation will look like this:
from typing import Dict

def doubleget(d: Dict[str, int], k) -> int:
    if k in d:
        return d[k] * 2
    else:
        return None

menu: Dict[str, int] = {'coffee': 5, 'sandwich': 7, 'soup': 8}
print(doubleget(menu, 'sandwich'))
print(doubleget(menu, 'elephant'))
From Python's perspective, this is totally fine; I'll get 14 back from the first call and None back from the second. But from Mypy's perspective, there is a problem: the annotation says that the function will always return an integer, and now it's returning None:
mytest.py:10: error: Incompatible return value type
↪(got "None", expected "int")
I should note that Mypy doesn't flag this problem when you call the function. Rather, it notices that you're allowing the function to return a None value in the function definition itself.
One solution is to use a Union type, as I showed earlier, allowing an integer or None to be returned. But that doesn't quite express what the goal is here. What I would like to do is say that it might return an integer, but it might not, meaning, more or less, that the returned integer is optional.
Sure enough, the typing module provides for this with its Optional type, which Mypy understands:
from typing import Dict, Optional

def doubleget(d: Dict[str, int], k) -> Optional[int]:
    if k in d:
        return d[k] * 2
    else:
        return None
Annotating the function's return type with Optional[int] says that if something is returned, it will be an integer. But it's also okay to return None.
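An Optional return value also affects the code that calls the function, since the caller has to deal with the None case before treating the result as an integer. The following is a sketch under a few assumptions: the describe helper is hypothetical, and the k: str annotation is my addition rather than part of the function above. It also shows that Optional[int] is shorthand for Union[int, None]:

```python
from typing import Dict, Optional, Union

# Optional[int] and Union[int, None] are the same type.
same = Optional[int] == Union[int, None]

def doubleget(d: Dict[str, int], k: str) -> Optional[int]:
    if k in d:
        return d[k] * 2
    return None

menu: Dict[str, int] = {'coffee': 5, 'sandwich': 7, 'soup': 8}

def describe(item: str) -> str:
    # Hypothetical helper that uses doubleget's result.
    result = doubleget(menu, item)
    if result is None:
        return item + ': not on the menu'
    # Past the None check, Mypy treats result as a plain int.
    return item + ': ' + str(result + 1)

print(same)
print(describe('sandwich'))
print(describe('elephant'))
```

Without the "is None" check, Mypy would object to result + 1, because None doesn't support addition.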
Optional is useful not only when you're returning values from a function, but also when you're defining variables or object attributes. It's pretty common, for example, for the __init__ method in a class to define all of an object's attributes, even those whose real values are only set later. Since you don't yet know what values you want to set, you use the None value. But of course, that then means the attribute might be equal to None, or it might be equal to (for example) an integer. By using Optional when setting the attribute, you signal that it can be either an integer or a None value.
For example, consider the following code:
class Foo():
    def __init__(self, x):
        self.x = x
        self.y = None

f = Foo(10)
f.y = 'abcd'
print(vars(f))
From Python's perspective, there isn't any issue. But you might like to say that both x and y must be integers, except for when y is initialized and set to None. You can do that as follows:
from typing import Optional

class Foo():
    def __init__(self, x: int):
        self.x: int = x
        self.y: Optional[int] = None
Notice that there are three type annotations here: on the parameter x (int), on the attribute self.x (also int) and on the attribute self.y (which is Optional[int]). Python won't complain if you break these rules, but if you still have the code that was run before:
f = Foo(10)
f.y = 'abcd'
print(vars(f))
Mypy will complain:
mytest.py:13: error: Incompatible types in assignment
↪(expression has type "str", variable has type
↪"Optional[int]")
Sure enough, you now can assign either None or an integer to f.y. But if you try to set any other type, you'll get a warning from Mypy.
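An Optional attribute needs the same care inside the class as an Optional return value does in a caller. As a sketch, here is the annotated class with one extra method. The total method is hypothetical and only illustrates the None check:

```python
from typing import Optional

class Foo():
    def __init__(self, x: int):
        self.x: int = x
        self.y: Optional[int] = None

    def total(self) -> int:
        # Hypothetical method: handle the "not set yet" case first.
        if self.y is None:
            return self.x
        # Past the check, Mypy treats self.y as an int.
        return self.x + self.y

f = Foo(10)
before = f.total()   # y is still None
f.y = 5              # an int is allowed by Optional[int]
after = f.total()
print(before, after)
```

Returning self.x + self.y without the check would be reported by Mypy, since self.y might still be None at that point.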
Mypy is a huge step forward for large-scale Python applications. It promises to keep Python the way you've known it for years, but with added reliability. If your team is working on a large Python project, it might well make sense to start incorporating Mypy into your integration tests. The fact that it runs outside the language means you can add Mypy slowly over time, making your code increasingly robust.
You can read more about Mypy here. That site has documentation, tutorials and even information for people using Python 2 who want to introduce mypy via comments (rather than annotations).