This is the second article in our Django migrations series:
- Part 1: Django Migrations: A Primer
- Part 2: Digging Deeper Into Django Migrations (current article)
- Part 3: Data Migrations
- Video: Django 1.7 Migrations - A Primer
In the previous article in this series, you learned about the purpose of Django migrations. You have become familiar with fundamental usage patterns like creating and applying migrations. Now it’s time to dig deeper into the migration system and take a peek at some of its underlying mechanics.
By the end of this article, you’ll know:
- How Django keeps track of migrations
- How migrations know which database operations to perform
- How dependencies between migrations are defined
Once you’ve wrapped your head around this part of the Django migration system, you’ll be well prepared to create your own custom migrations. Let’s jump right in where we left off!
This article uses the bitcoin_tracker Django project built in Django Migrations: A Primer. You can either re-create that project by working through that article or download the source code for it.
How Django Knows Which Migrations to Apply
Let’s recap the very last step of the previous article in the series. You created a migration and then applied all available migrations with python manage.py migrate. If that command ran successfully, then your database tables now match your model’s definitions.
What happens if you run that command again? Let’s try it out:
$ python manage.py migrate
Operations to perform:
Apply all migrations: admin, auth, contenttypes, historical_data, sessions
Running migrations:
No migrations to apply.
Nothing happened! Once a migration has been applied to a database, Django will not apply this migration to that particular database again. Ensuring that a migration is applied only once requires keeping track of the migrations that have been applied.
Django uses a database table called django_migrations. Django automatically creates this table in your database the first time you apply any migrations. For each migration that’s applied or faked, a new row is inserted into the table.
For example, here’s what this table looks like in our bitcoin_tracker project:
ID | App | Name | Applied |
---|---|---|---|
1 | contenttypes | 0001_initial | 2019-02-05 20:23:21.461496 |
2 | auth | 0001_initial | 2019-02-05 20:23:21.489948 |
3 | admin | 0001_initial | 2019-02-05 20:23:21.508742 |
4 | admin | 0002_logentry_remove... | 2019-02-05 20:23:21.531390 |
5 | admin | 0003_logentry_add_ac... | 2019-02-05 20:23:21.564834 |
6 | contenttypes | 0002_remove_content_... | 2019-02-05 20:23:21.597186 |
7 | auth | 0002_alter_permissio... | 2019-02-05 20:23:21.608705 |
8 | auth | 0003_alter_user_emai... | 2019-02-05 20:23:21.628441 |
9 | auth | 0004_alter_user_user... | 2019-02-05 20:23:21.646824 |
10 | auth | 0005_alter_user_last... | 2019-02-05 20:23:21.661182 |
11 | auth | 0006_require_content... | 2019-02-05 20:23:21.663664 |
12 | auth | 0007_alter_validator... | 2019-02-05 20:23:21.679482 |
13 | auth | 0008_alter_user_user... | 2019-02-05 20:23:21.699201 |
14 | auth | 0009_alter_user_last... | 2019-02-05 20:23:21.718652 |
15 | historical_data | 0001_initial | 2019-02-05 20:23:21.726000 |
16 | sessions | 0001_initial | 2019-02-05 20:23:21.734611 |
19 | historical_data | 0002_switch_to_decimals | 2019-02-05 20:30:11.337894 |
As you can see, there is an entry for each applied migration. The table not only contains the migrations from our historical_data app, but also the migrations from all other installed apps.
The next time migrations are run, Django will skip the migrations listed in the database table. This means that, even if you manually change the file of a migration that has already been applied, Django will ignore these changes, as long as there’s already an entry for it in the database.
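The bookkeeping described above can be sketched with nothing but the standard library. This is a simplified illustration, not Django's actual implementation: the function names are made up for this example, the table layout only mirrors the four columns shown above, and the step that would really run the migration's operations is left as a comment.

```python
import sqlite3
from datetime import datetime, timezone


def ensure_table(conn):
    # Create the bookkeeping table on first use, like Django does
    # the first time migrations are applied.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS django_migrations ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "app TEXT NOT NULL, name TEXT NOT NULL, applied TEXT NOT NULL)"
    )


def applied_migrations(conn):
    # Return the set of (app, name) pairs that are already recorded.
    rows = conn.execute("SELECT app, name FROM django_migrations")
    return {(app, name) for app, name in rows}


def migrate(conn, available):
    # Apply every migration that has no row yet and record it.
    ensure_table(conn)
    done = applied_migrations(conn)
    newly_applied = []
    for app, name in available:
        if (app, name) in done:
            continue  # already recorded, so it is skipped
        # A real migration system would run the operations here.
        conn.execute(
            "INSERT INTO django_migrations (app, name, applied) VALUES (?, ?, ?)",
            (app, name, datetime.now(timezone.utc).isoformat()),
        )
        newly_applied.append((app, name))
    conn.commit()
    return newly_applied


conn = sqlite3.connect(":memory:")
available = [
    ("historical_data", "0001_initial"),
    ("historical_data", "0002_switch_to_decimals"),
]
first_run = migrate(conn, available)   # both migrations get recorded
second_run = migrate(conn, available)  # nothing left to apply
```

Note that the lookup is purely by app and name. The content of the migration file plays no role, which is why editing an already applied migration has no effect.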
You could trick Django into re-running a migration by deleting the corresponding row from the table, but this is rarely a good idea and can leave you with a broken migration system.
The Migration File
What happens when you run python manage.py makemigrations <appname>? Django looks for changes made to the models in your app <appname>. If it finds any, like a model that has been added, then it creates a migration file in the migrations subdirectory. This migration file contains a list of operations to bring your database schema in sync with your model definition.
Note: Your app has to be listed in the INSTALLED_APPS setting, and it must contain a migrations directory with an __init__.py file. Otherwise Django will not create any migrations for it.
The migrations directory is automatically created when you create a new app with the startapp management command, but it’s easy to forget when creating an app manually.
The migration files are just Python, so let’s have a look at the first migration file in the historical_data app. You can find it at historical_data/migrations/0001_initial.py. It should look something like this:
from django.db import models, migrations
class Migration(migrations.Migration):
    dependencies = []
    operations = [
        migrations.CreateModel(
            name='PriceHistory',
            fields=[
                ('id', models.AutoField(
                    verbose_name='ID',
                    serialize=False,
                    primary_key=True,
                    auto_created=True)),
                ('date', models.DateTimeField(auto_now_add=True)),
                ('price', models.DecimalField(decimal_places=2, max_digits=5)),
                ('volume', models.PositiveIntegerField()),
                ('total_btc', models.PositiveIntegerField()),
            ],
            options={
            },
            bases=(models.Model,),
        ),
    ]
As you can see, it contains a single class called Migration that inherits from django.db.migrations.Migration. This is the class that the migration framework will look for and execute when you ask it to apply migrations.
The Migration class contains two main lists:
- dependencies
- operations
Migration Operations
Let’s look at the operations list first. This list contains the operations that are to be performed as part of the migration. Operations are subclasses of the class django.db.migrations.operations.base.Operation. Here are the common operations that are built into Django:
Operation Class | Description |
---|---|
CreateModel | Creates a new model and the corresponding database table |
DeleteModel | Deletes a model and drops its database table |
RenameModel | Renames a model and renames its database table |
AlterModelTable | Renames the database table for a model |
AlterUniqueTogether | Changes the unique constraints of a model |
AlterIndexTogether | Changes the indexes of a model |
AlterOrderWithRespectTo | Creates or deletes the _order column for a model |
AlterModelOptions | Changes various model options without affecting the database |
AlterModelManagers | Changes the managers available during migrations |
AddField | Adds a field to a model and the corresponding column in the database |
RemoveField | Removes a field from a model and drops the corresponding column from the database |
AlterField | Changes a field’s definition and alters its database column if necessary |
RenameField | Renames a field and, if necessary, also its database column |
AddIndex | Creates an index in the database table for the model |
RemoveIndex | Removes an index from the database table for the model |
Note how the operations are named after changes made to model definitions, not the actions that are performed on the database. When you apply a migration, each operation is responsible for generating the necessary SQL statements for your specific database. For example, CreateModel would generate a CREATE TABLE SQL statement.
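To make the split between "what changed in the model" and "what SQL to run" more concrete, here is a toy stand-in for an operation. The class ToyCreateModel and its to_sql() method are invented for this illustration and are not part of Django, whose real operations delegate SQL generation to a database-specific schema editor. The column types are hard-coded in SQLite syntax as an assumption.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class ToyCreateModel:
    """Hypothetical, heavily simplified stand-in for a CreateModel operation."""

    app_label: str
    name: str
    fields: list  # list of (column name, SQL column definition) pairs

    def describe(self):
        # Named after the model change, not after the SQL statement.
        return f"Create model {self.name}"

    def to_sql(self):
        # Derive the table name from the app label and the model name.
        table = f"{self.app_label}_{self.name.lower()}"
        columns = ", ".join(f'"{col}" {definition}' for col, definition in self.fields)
        return f'CREATE TABLE "{table}" ({columns});'


operation = ToyCreateModel(
    app_label="historical_data",
    name="PriceHistory",
    fields=[
        ("id", "integer NOT NULL PRIMARY KEY AUTOINCREMENT"),
        ("date", "datetime NOT NULL"),
        ("price", "decimal NOT NULL"),
        ("volume", "integer unsigned NOT NULL"),
    ],
)
sql = operation.to_sql()

# Run the generated statement against an in-memory SQLite database
# to check that it is valid SQL.
conn = sqlite3.connect(":memory:")
conn.execute(sql)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name LIKE 'historical%'"
)]
```

A different database backend would need different column definitions for the same operation, which is the reason the operation itself only describes the model change.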
Out of the box, migrations support all the standard databases that Django supports. So if you stick to the operations listed here, then you can make more or less any change to your models that you want, without having to worry about the underlying SQL. That’s all done for you.
Note: In some cases, Django might not correctly detect your changes. If you rename a model and change several of its fields, then Django might mistake this for a new model.
Instead of a RenameModel and several AlterField operations, it will create a DeleteModel and a CreateModel operation. Instead of renaming the database table for the model, it will drop it and create a new table with the new name, effectively deleting all your data!
Make it a habit to check the generated migrations and test them on a copy of your database before running them on production data.
Django provides three more operation classes for advanced use cases:
- RunSQL allows you to run custom SQL in the database.
- RunPython allows you to run any Python code.
- SeparateDatabaseAndState is a specialized operation for advanced uses.
With these operations, you can make basically any change you want to your database. However, you won’t find these operations in a migration that has been created automatically with the makemigrations management command.
Since Django 2.0, there are also a couple of PostgreSQL-specific operations available in django.contrib.postgres.operations that you can use to install various PostgreSQL extensions:
- BtreeGinExtension
- BtreeGistExtension
- CITextExtension
- CryptoExtension
- HStoreExtension
- TrigramExtension
- UnaccentExtension
Note that a migration containing one of these operations requires a database user with superuser privileges.
Last but not least, you can also create your own operation classes. If you want to look into that, then take a look at the Django documentation on creating custom migration operations.
Migration Dependencies
The dependencies list in a migration class contains any migrations that must be applied before this migration can be applied.
In the 0001_initial.py migration you saw above, nothing has to be applied beforehand, so there are no dependencies. Let’s have a look at the second migration in the historical_data app. In the file 0002_switch_to_decimals.py, the dependencies attribute of Migration has an entry:
from django.db import migrations, models
class Migration(migrations.Migration):
    dependencies = [
        ('historical_data', '0001_initial'),
    ]
    operations = [
        migrations.AlterField(
            model_name='pricehistory',
            name='volume',
            field=models.DecimalField(decimal_places=3, max_digits=7),
        ),
    ]
The dependency above says that migration 0001_initial of the app historical_data must be run first. That makes sense, because the migration 0001_initial creates the table containing the field that the migration 0002_switch_to_decimals wants to change.
A migration can also have a dependency on a migration from another app, like this:
class Migration(migrations.Migration):
    ...
    dependencies = [
        ('auth', '0009_alter_user_last_name_max_length'),
    ]
This is usually necessary if a model has a Foreign Key pointing to a model in another app.
Alternatively, you can also enforce that a migration is run before another migration using the attribute run_before:
class Migration(migrations.Migration):
    ...
    run_before = [
        ('third_party_app', '0001_initial'),
    ]
Dependencies can also be combined so you can have multiple dependencies. This functionality provides a lot of flexibility, as you can accommodate foreign keys that depend upon models from different apps.
The option to explicitly define dependencies between migrations also means that the numbering of the migrations (usually 0001, 0002, 0003, …) doesn’t strictly represent the order in which migrations are applied. You can add any dependency you want and thus control the order without having to re-number all the migrations.
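One way to picture how dependencies and run_before determine the order is as a topological sort over a graph of migrations. The sketch below uses graphlib from the standard library (Python 3.9 or later) as a stand-in. Django has its own migration graph implementation, and the app name my_app as well as the build_plan() helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Each migration is identified by (app, name), like the tuples in
# the dependencies and run_before lists.
migrations = {
    ("historical_data", "0001_initial"): {
        "dependencies": [],
        "run_before": [],
    },
    ("historical_data", "0002_switch_to_decimals"): {
        "dependencies": [("historical_data", "0001_initial")],
        "run_before": [],
    },
    ("third_party_app", "0001_initial"): {
        "dependencies": [],
        "run_before": [],
    },
    # Hypothetical migration that must run before the third-party one.
    ("my_app", "0001_initial"): {
        "dependencies": [],
        "run_before": [("third_party_app", "0001_initial")],
    },
}


def build_plan(migrations):
    sorter = TopologicalSorter()
    for key, spec in migrations.items():
        # A dependency must come before this migration.
        sorter.add(key, *spec["dependencies"])
        # run_before is the same edge, declared from the other side.
        for later in spec["run_before"]:
            sorter.add(later, key)
    return list(sorter.static_order())


plan = build_plan(migrations)
for app, name in plan:
    print(f"{app}.{name}")
```

The exact position of unrelated migrations in the plan may vary. Only the relative order required by the declared edges is guaranteed, which matches the point that file numbering alone does not define the order.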
Viewing the Migration
You generally don’t have to worry about the SQL that migrations generate. But if you want to double-check that the generated SQL makes sense or are just curious what it looks like, then Django’s got you covered with the sqlmigrate management command:
$ python manage.py sqlmigrate historical_data 0001
BEGIN;
--
-- Create model PriceHistory
--
CREATE TABLE "historical_data_pricehistory" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"date" datetime NOT NULL,
"price" decimal NOT NULL,
"volume" integer unsigned NOT NULL
);
COMMIT;
Doing that will list the underlying SQL queries that the specified migration will generate, based on the database in your settings.py file. When you pass the parameter --backwards, Django generates the SQL to unapply the migration:
$ python manage.py sqlmigrate --backwards historical_data 0001
BEGIN;
--
-- Create model PriceHistory
--
DROP TABLE "historical_data_pricehistory";
COMMIT;
Once you see the output of sqlmigrate for a slightly more complex migration, you may appreciate that you don’t have to craft all this SQL by hand!
How Django Detects Changes to Your Models
You’ve seen what a migration file looks like and how its list of Operation classes defines the changes performed to the database. But how exactly does Django know which operations should go into a migration file? You might expect that Django compares your models to your database schema, but that is not the case.
When running makemigrations, Django does not inspect your database. Neither does it compare your model file to an earlier version. Instead, Django goes through the operations in all existing migration files and builds a project state of what the models should look like. This project state is then compared to your current model definitions, and a list of operations is created, which, when applied, would bring the project state up to date with the model definitions.
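The replay-and-compare idea can be sketched in plain Python. This is a toy model under loud assumptions: operations are simple tuples, models are dictionaries mapping field names to a type name, and the functions build_project_state() and detect_changes() are invented for this example. Django's real autodetector handles far more cases, such as renames.

```python
def build_project_state(migrations):
    # Replay every recorded operation, in order, on an empty state.
    state = {}
    for operations in migrations:
        for op in operations:
            kind = op[0]
            if kind == "CreateModel":
                _, model, fields = op
                state[model] = dict(fields)
            elif kind == "DeleteModel":
                _, model = op
                del state[model]
            elif kind in ("AddField", "AlterField"):
                _, model, field, definition = op
                state[model][field] = definition
            elif kind == "RemoveField":
                _, model, field = op
                del state[model][field]
    return state


def detect_changes(state, models):
    # Compare the replayed state with the current model definitions.
    ops = []
    for model, fields in models.items():
        if model not in state:
            ops.append(("CreateModel", model, dict(fields)))
            continue
        old = state[model]
        for field, definition in fields.items():
            if field not in old:
                ops.append(("AddField", model, field, definition))
            elif old[field] != definition:
                ops.append(("AlterField", model, field, definition))
        for field in old:
            if field not in fields:
                ops.append(("RemoveField", model, field))
    for model in state:
        if model not in models:
            ops.append(("DeleteModel", model))
    return ops


# The operations recorded so far (a simplified 0001_initial).
migrations = [[
    ("CreateModel", "PriceHistory", {
        "date": "DateTimeField",
        "price": "DecimalField",
        "volume": "PositiveIntegerField",
    }),
]]

# The models as they are defined right now: volume became a decimal.
current_models = {
    "PriceHistory": {
        "date": "DateTimeField",
        "price": "DecimalField",
        "volume": "DecimalField",
    },
}

new_operations = detect_changes(build_project_state(migrations), current_models)
```

Because this toy version has no notion of renaming, a renamed model shows up as a DeleteModel plus a CreateModel, which is the kind of misdetection you should watch for in generated migrations.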
Playing Chess With Django
You can think of your models like a chessboard, and Django is a chess grandmaster watching you play against yourself. But the grandmaster doesn’t watch your every move. The grandmaster only looks at the board when you shout makemigrations.
Because there’s only a limited set of possible moves (and the grandmaster is a grandmaster), she can come up with the moves that have happened since she last looked at the board. She takes some notes and lets you play until you shout makemigrations again.
When looking at the board the next time, the grandmaster doesn’t remember what the chessboard looked like the last time, but she can go through her notes of the previous moves and build a mental model of what the chessboard looked like.
Now, when you shout migrate, the grandmaster will replay all the recorded moves on another chessboard and note in a spreadsheet which of her records have already been applied. This second chessboard is your database, and the spreadsheet is the django_migrations table.
This analogy is quite fitting, because it nicely illustrates some behaviors of Django migrations:
- Django migrations try to be efficient: Just like the grandmaster assumes that you made the least number of moves, Django will try to create the most efficient migrations. If you add a field named A to a model, then rename it to B, and then run makemigrations, then Django will create a new migration to add a field named B.
- Django migrations have their limits: If you make a lot of moves before you let the grandmaster look at the chessboard, then she might not be able to retrace the exact movements of each piece. Similarly, Django might not come up with the correct migration if you make too many changes at once.
- Django migrations expect you to play by the rules: When you do anything unexpected, like taking a random piece off the board or messing with the notes, the grandmaster might not notice at first, but sooner or later, she’ll throw up her hands and refuse to continue. The same happens when you mess with the django_migrations table or change your database schema outside of migrations, for example by deleting the database table for a model.
Understanding SeparateDatabaseAndState
Now that you know about the project state that Django builds, it’s time to take a closer look at the operation SeparateDatabaseAndState. This operation can do exactly what the name implies: it can separate the project state (the mental model Django builds) from your database.
SeparateDatabaseAndState is instantiated with two lists of operations:
- state_operations contains operations that are only applied to the project state.
- database_operations contains operations that are only applied to the database.
This operation lets you make any kind of change to your database, but it’s your responsibility to make sure that the project state fits the database afterwards. Example use cases for SeparateDatabaseAndState are moving a model from one app to another or creating an index on a huge database without downtime.
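The separation can be illustrated with a small simulation. Everything here is an assumption made for the sake of the example and not Django's API: the project state is a plain dictionary, the database is an in-memory SQLite connection, state operations are callables, database operations are raw SQL strings, and the index name price_date_idx is hypothetical.

```python
import sqlite3


def separate_database_and_state(state, conn, state_operations, database_operations):
    # State operations only touch the in-memory project state ...
    for apply_to_state in state_operations:
        apply_to_state(state)
    # ... while database operations only touch the database.
    for sql in database_operations:
        conn.execute(sql)


# Toy project state: the model has no indexes recorded yet.
state = {"pricehistory": {"indexes": []}}

# Toy database with the matching table.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "historical_data_pricehistory" '
    '("id" integer PRIMARY KEY, "date" datetime NOT NULL)'
)

separate_database_and_state(
    state,
    conn,
    # Record the index in the project state.
    state_operations=[
        lambda s: s["pricehistory"]["indexes"].append("price_date_idx"),
    ],
    # Create the index with hand-written SQL.
    database_operations=[
        'CREATE INDEX "price_date_idx" ON "historical_data_pricehistory" ("date")',
    ],
)

# PRAGMA index_list returns one row per index; the name is the second column.
index_names = [
    row[1]
    for row in conn.execute("PRAGMA index_list('historical_data_pricehistory')")
]
```

Nothing in this sketch checks that the two sides agree. If the SQL created a differently named index, the state and the database would silently drift apart, which is the responsibility the operation hands over to you.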
SeparateDatabaseAndState is an advanced operation that you won’t need on your first day working with migrations, and maybe never at all. SeparateDatabaseAndState is similar to heart surgery. It carries quite a bit of risk and is not something you do just for fun, but sometimes it’s a necessary procedure to keep the patient alive.
Conclusion
This concludes your deep dive into Django migrations. Congratulations! You’ve covered quite a lot of advanced topics and now have a solid understanding of what happens under the hood of migrations.
You learned that:
- Django keeps track of applied migrations in the django_migrations table.
- Django migrations consist of plain Python files containing a Migration class.
- Django knows which changes to perform from the operations list in the Migration classes.
- Django compares your models to a project state it builds from the migrations.
With this knowledge, you’re now ready to tackle the third part of the series on Django migrations, where you’ll learn how to use data migrations to safely make one-time changes to your data. Stay tuned!