At the Forge

Django Models and Migrations

Reuven M. Lerner

Issue #253, May 2015

Django's migrations make it easy to define and update your database schema.

In my last two articles, I looked at the Django Web application framework, written in Python. Django's documentation describes it as an MTV framework, in which the acronym stands for model, template and views.

When a request comes in to a Django application, the application's URL patterns determine which view method will be invoked. The view method can then, as I mentioned in previous articles, directly return content to the user or send the contents of a template. The template typically contains not only HTML, but also directives, unique to Django, which allow you to pass along variable values, execute loops and display text conditionally.

You can create lots of interesting Web applications with just views and templates. However, most Web applications also use a database, and in many cases, that means a relational database. Indeed, it's a rare Web application that doesn't use a database of some sort.

For many years, Web applications typically spoke directly with the database, sending SQL via text strings. Thus, you would say something like:

s = "SELECT first_name, last_name FROM Users where id = 1"

You then would send that SQL to the server via a database client library and retrieve the results using that library. Although this approach does allow you to harness the power of SQL directly, it means that your application code now contains text strings with another language. This mix of (for example) Python and SQL can become difficult to maintain and work with. Besides, in Python, you're used to working with objects, attributes and methods. Why can't you access the database that way?

The answer, of course, is that you can't, because relational databases eventually do need to receive SQL in order to function correctly. Thus, many programs use an ORM (object-relational mapper), which translates method calls and object attributes into SQL. There is a well established ORM in the Python world known as SQLAlchemy. However, Django has opted to use its own ORM, with which you define your database tables, as well as insert, update and retrieve information in those tables.

So in this article, I cover how you create models in Django, how you can create and apply migrations based on those model definitions, and how you can interact with your models from within a Django application.

Models

A “model” in the Django world is a Python class that represents a table in the database. If you are creating an appointment calendar, your database likely will have at least two different tables: People and Appointments. To represent these in Django, you create two Python classes: Person and Appointment. Each of these models is defined in the models.py file inside your application.

This is a good place to point out that models are specific to a particular Django application. Each Django project contains one or more applications, and it is assumed that you can and will reuse applications within different projects.

In the Django project I have created for this article (“atfproject”), I have a single application (“atfapp”). Thus, I can define my model classes in atfproject/atfapp/models.py. That file, by default, contains a single line:

from django.db import models

Given the example of creating an appointment calendar, let's start by defining your Appointment model:

from django.db import models

class Appointment(models.Model):
starts_at = models.DateTimeField()
ends_at = models.DateTimeField()
meeting_with = models.TextField()
notes = models.TextField()
def __str__(self):
    return "{} - {}: Meeting with {} ({})".format(self.starts_at,
                          self.ends_at,
                          self.meeting_with,
                          self.notes)

Notice that in Django models, you define the columns as class attributes, using a Python object known as a descriptor. Descriptors allow you to work with attributes (such as appointment.starts_at), but for methods to be fired in the back. In the case of database models, Django uses the descriptors to retrieve, save, update and delete your data in the database.

The one actual instance method in the above code is __str__, which every Python object can use to define how it gets turned into a string. Django uses the __str__ method to present your models.

Django provides a large number of field types that you can use in your models, matching (to a large degree) the column types available in most popular databases. For example, the above model uses two DateTimeFields and two TextFields. As you can imagine, these are mapped to the DATETIME and TEXT columns in SQL. These field definitions not only determine what type of column is defined in the database, but also the way in which Django's admin interface and forms allow users to enter data. In addition to TextField, you can have BooleanFields, EmailFields (for e-mail addresses), FileFields (for uploading files) and even GenericIPAddressField, among others.

Beyond choosing a field type that's appropriate for your data, you also can pass one or more options that modify how the field behaves. For example, DateField and DateTimeField allow you to pass an “auto_now” keyword argument. If passed and set to True, Django automatically will set the field to the current time when a new record is stored. This isn't necessarily behavior that you always will want, but it is needed frequently enough that Django provides it. That's true for the other fields, as well—they provide options that you might not always need, but that really can come in handy.

Migrations

So, now you have a model! How can you start to use it? Well, first you somehow need to translate your model into SQL that your database can use. This means, before continuing any further, you need to tell Django what database you're using. This is done in your project's configuration file; in my case, that would be atfproject/atfproject/settings.py. That file defines a number of variables that are used throughout Django. One of them is DATABASES, a dictionary that defines the databases used in your project. (Yes, it is possible to use more than one, although I'm not sure if that's normally such a good idea.)

By default, the definition of DATABASES is:

DATABASES = {
'default': {
    'ENGINE': 'django.db.backends.sqlite3',
    'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
}
}

In other words, Django comes, out of the box, defined to use SQLite. SQLite is a wonderful database for most purposes, but it is woefully underpowered for a real, production-ready database application that will be serving the general public. For such cases, you'll want something more powerful, such as my favorite database, PostgreSQL. Nevertheless, for the purposes of this little experiment here, you can use SQLite.

One of the many advantages of SQLite is that it uses one file for each database; if the file exists, SQLite reads the data from there. And if the file doesn't yet exist, it is created upon first use. Thus, by using SQLite, you're able to avoid any configuration.

However, you still somehow need to convert your Python code to SQL definitions that SQLite can use. This is done with “migrations”.

Now, if you're coming from the world of Ruby on Rails, you are familiar with the idea of migrations—they describe the changes made to the database, such that you easily can move from an older version of the database to a newer one. I remember the days before migrations, and they were significantly less enjoyable—their invention really has made Web development easier.

Migrations are latecomers to the world of Django. There long have been external libraries, such as South, but migrations in Django itself are relatively new. Rails users might be surprised to find that in Django, developers don't create migrations directly. Rather, you tell Django to examine your model definitions, to compare those definitions with the current state of the database and then to generate an appropriate migration.

Given that I just created a model, I go back into the project's root directory, and I execute:

django-admin.py makemigrations

This command, which you execute in the project's root directory, tells Django to look at the “atfapp” application, to compare its models with the database and then to generate migrations.

Now, if you encounter an error at this point (and I often do!), you should double-check to make sure your application has been added to the project. It's not sufficient to have your app in the Django project's directory. You also must add it to INSTALLED_APPS, a tuple in the project's settings.py. For example, in my case, the definition looks like this:

INSTALLED_APPS = (
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
'atfapp'
)

The output of makemigrations on my system looks like this:

Migrations for 'atfapp':
  0001_initial.py:
- Create model Appointment

In other words, Django now has described the difference between the current state of the database (in which “Appointment” doesn't exist) and the final state, in which there will be an “Appointment” table. If you're curious to see what this migration looks like, you can always look in the atfapp/migrations directory, in which you'll see Python code.

Didn't I say that the migration will describe the needed database updates in SQL? Yes, but the description originally is written in Python. This allows you, at least in theory, to migrate to a different database server, if and when you want to do so.

Now that you have the migrations, it's time to apply them. In the project's root directory, I now write:

django-admin.py migrate

And then see:

Operations to perform:
  Apply all migrations: admin, contenttypes, auth, atfapp, sessions
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying atfapp.0001_initial... OK
  Applying sessions.0001_initial... OK

The above shows that the “atfapp” initial migration was run. But where did all of these other migrations come from? The answer is simple. Django's user model and other built-in models also are described using migrations and, thus, are applied along with mine, if that hasn't yet happened in my Django project.

You might have noticed that each migration is given a number. This allows Django to keep track of the history of the migrations and also to apply more than one, if necessary. You can create a migration, then create a new migration and then apply both of them together, if you want to keep the changes separate.

Or, perhaps more practically, you can work with other people on a project, each of whom is updating the database. Each of them can create their own migrations and commit them into the shared Git repository. If and when you retrieve the latest changes from Git, you'll get all of the migrations from your coworkers and then can apply them to your app.

Migrating Further

Let's say that you modify your model. How do you create and apply a new migration? The answer actually is fairly straightforward. Modify the model and ask Django to create an appropriate migration. Then you can run the newly created migration.

So, let's add a new field to the Appointment model, “minutes”, to keep track of what happened during the meeting. I add a single line to the model, such that the file now looks like this:

from django.db import models

class Appointment(models.Model):
starts_at = models.DateTimeField()
ends_at = models.DateTimeField()
meeting_with = models.TextField()
notes = models.TextField()
minutes = models.TextField()    # New line here!
def __str__(self):
    return "{} - {}: Meeting with {} ({})".format(self.starts_at,
                          self.ends_at,
                          self.meeting_with,
                          self.notes)

Now I once again run makemigrations, but this time, Django is comparing the current definition of the model with the current state of the database. It seems like a no-brainer for Django to deal with, and it should be, except for one thing: Django defines columns, by default, to forbid NULL values. If I add the “minutes” column, which doesn't allow NULL values, I'll be in trouble for existing rows. Django thus asks me whether I want to choose a default value to put in this field or if I'd prefer to stop the migration before it begins and to adjust my definitions.

One of the things I love about migrations is that they help you avoid stupid mistakes like this one. I'm going to choose the first option, indicating that “whatever” is the (oh-so-helpful) default value. Once I have done that, Django finishes with the migration's definition and writes it to disk. Now I can, once again, apply the pending migrations:

django-admin.py migrate

And I see:

Operations to perform:
  Apply all migrations: admin, contenttypes, auth, atfapp, sessions
Running migrations:
  Applying atfapp.0002_appointment_minutes... OK

Sure enough, the new migration has been applied!

Of course, Django could have guessed as to my intentions. However, in this case and in most others, Django follows the Python rule of thumb in that it's better to be explicit than implicit and to avoid guessing.

Conclusion

Django's models allow you to create a variety of different fields in a database-independent way. Moreover, Django creates migrations between different versions of your database, making it easy to iterate database definitions as a project moves forward, even if there are multiple developers working on it.

In my next article, I plan to look at how you can use models that you have defined from within your Django application.

Reuven M. Lerner is a Web developer, consultant and trainer. He recently completed his PhD in Learning Sciences from Northwestern University. You can read his blog, Twitter feed and newsletter at lerner.co.il. Reuven lives with his wife and three children in Modi'in, Israel.