Data migrations are a convenient way to change the data in the database in conjunction with changes in the schema. They work like regular schema migrations: Django keeps track of dependencies, the order of execution, and whether the application has already applied a given data migration.
A common use case for data migrations is introducing new fields that are non-nullable. Another is creating a new field to store a cached count of something: we create the new field and then populate it with the initial count.
In this post we are going to explore a simple example that you can very easily extend and modify for your needs.
Data Migrations
Let’s suppose we have an app named blog, which is installed in our project’s INSTALLED_APPS.
The blog app has the following model definition:
blog/models.py

    from django.db import models


    class Post(models.Model):
        title = models.CharField(max_length=255)
        date = models.DateTimeField(auto_now_add=True)
        content = models.TextField()

        def __str__(self):
            return self.title

The application is already using this Post model; it’s already in production and there is plenty of data stored in the database.
| id | title | date | content |
|---|---|---|---|
| 1 | How to Render Django Form Manually | 2017-09-26 11:01:20.547000 | […] |
| 2 | How to Use Celery and RabbitMQ with Django | 2017-09-26 11:01:39.251000 | […] |
| 3 | How to Setup Amazon S3 in a Django Project | 2017-09-26 11:01:49.669000 | […] |
| 4 | How to Configure Mailgun To Send Emails in a Django Project | 2017-09-26 11:02:00.131000 | […] |
Now let’s say we want to introduce a new field named slug which will be used to compose the new URLs of the blog. The slug field must be unique and not null.
Generally speaking, always add new fields either as null=True or with a default value. If the default parameter can’t solve the problem, first create the field with null=True, then create a data migration to populate it. After that, create a new migration to set the field to null=False.
Here is how we can do it:
blog/models.py

    from django.db import models


    class Post(models.Model):
        title = models.CharField(max_length=255)
        date = models.DateTimeField(auto_now_add=True)
        content = models.TextField()
        slug = models.SlugField(null=True)

        def __str__(self):
            return self.title

Create the migration:

    python manage.py makemigrations blog

    Migrations for 'blog':
      blog/migrations/0002_post_slug.py
        - Add field slug to post

Apply it:

    python manage.py migrate blog

    Operations to perform:
      Apply all migrations: blog
    Running migrations:
      Applying blog.0002_post_slug... OK

At this point, the database already has the slug column.
| id | title | date | content | slug |
|---|---|---|---|---|
| 1 | How to Render Django Form Manually | 2017-09-26 11:01:20.547000 | […] | (null) |
| 2 | How to Use Celery and RabbitMQ with Django | 2017-09-26 11:01:39.251000 | […] | (null) |
| 3 | How to Setup Amazon S3 in a Django Project | 2017-09-26 11:01:49.669000 | […] | (null) |
| 4 | How to Configure Mailgun To Send Emails in a Django Project | 2017-09-26 11:02:00.131000 | […] | (null) |
Create an empty migration with the following command:

    python manage.py makemigrations blog --empty

    Migrations for 'blog':
      blog/migrations/0003_auto_20170926_1105.py

Now open the file 0003_auto_20170926_1105.py, and it should have the following contents:
blog/migrations/0003_auto_20170926_1105.py

    # -*- coding: utf-8 -*-
    # Generated by Django 1.11.5 on 2017-09-26 11:05
    from __future__ import unicode_literals

    from django.db import migrations


    class Migration(migrations.Migration):

        dependencies = [
            ('blog', '0002_post_slug'),
        ]

        operations = [
        ]

Then, in this file, we can create a function to be executed by the RunPython operation:
blog/migrations/0003_auto_20170926_1105.py

    # -*- coding: utf-8 -*-
    # Generated by Django 1.11.5 on 2017-09-26 11:05
    from __future__ import unicode_literals

    from django.db import migrations
    from django.utils.text import slugify


    def slugify_title(apps, schema_editor):
        '''
        We can't import the Post model directly as it may be a newer
        version than this migration expects. We use the historical version.
        '''
        Post = apps.get_model('blog', 'Post')
        for post in Post.objects.all():
            post.slug = slugify(post.title)
            post.save()


    class Migration(migrations.Migration):

        dependencies = [
            ('blog', '0002_post_slug'),
        ]

        operations = [
            migrations.RunPython(slugify_title),
        ]

In the example above we are using the slugify utility function. It takes a string as a parameter and transforms it into a slug. Some examples:

    from django.utils.text import slugify

    slugify('Hello, World!')
    'hello-world'

    slugify('How to Extend the Django User Model')
    'how-to-extend-the-django-user-model'

Anyway, the function that RunPython executes expects two parameters: apps and schema_editor. RunPython supplies both when the migration runs. Also remember to load models through apps.get_model('app_name', 'model_name') rather than importing them directly.
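One way to get a feel for this calling convention is to call the function by hand with stand-in objects. The sketch below is only an illustration and runs without Django: FakePost, FakeManager, FakeApps and simple_slugify are hypothetical stand-ins for the historical model, its manager, the app registry and django.utils.text.slugify. Inside a real migration Django supplies the real objects.

```python
import re


def simple_slugify(value):
    """Rough stand-in for django.utils.text.slugify (ASCII input only)."""
    value = re.sub(r"[^\w\s-]", "", value.lower())
    return re.sub(r"[-\s]+", "-", value).strip("-_")


class FakePost:
    """Stand-in for a historical Post instance."""

    def __init__(self, title):
        self.title = title
        self.slug = None
        self.saved = False

    def save(self):
        self.saved = True


class FakeManager:
    """Stand-in for Post.objects; only all() is implemented."""

    def __init__(self, rows):
        self._rows = rows

    def all(self):
        return list(self._rows)


class FakeApps:
    """Stand-in for the `apps` registry that RunPython passes in."""

    def __init__(self, rows):
        self._model = type("Post", (), {"objects": FakeManager(rows)})

    def get_model(self, app_label, model_name):
        assert (app_label, model_name) == ("blog", "Post")
        return self._model


def slugify_title(apps, schema_editor):
    # Same shape as the migration function, using the stand-in slugify.
    Post = apps.get_model("blog", "Post")
    for post in Post.objects.all():
        post.slug = simple_slugify(post.title)
        post.save()


posts = [FakePost("Hello, World!"), FakePost("How to Extend the Django User Model")]
slugify_title(FakeApps(posts), schema_editor=None)
print([p.slug for p in posts])
# ['hello-world', 'how-to-extend-the-django-user-model']
```

The function only touches apps.get_model and the objects it returns, which is why it keeps working even after the real Post model changes in later releases.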
Save the file and execute the migration as you would do with a regular model migration:

    python manage.py migrate blog

    Operations to perform:
      Apply all migrations: blog
    Running migrations:
      Applying blog.0003_auto_20170926_1105... OK

Now if we check the database:
| id | title | date | content | slug |
|---|---|---|---|---|
| 1 | How to Render Django Form Manually | 2017-09-26 11:01:20.547000 | […] | how-to-render-django-form-manually |
| 2 | How to Use Celery and RabbitMQ with Django | 2017-09-26 11:01:39.251000 | […] | how-to-use-celery-and-rabbitmq-with-django |
| 3 | How to Setup Amazon S3 in a Django Project | 2017-09-26 11:01:49.669000 | […] | how-to-setup-amazon-s3-in-a-django-project |
| 4 | How to Configure Mailgun To Send Emails in a Django Project | 2017-09-26 11:02:00.131000 | […] | how-to-configure-mailgun-to-send-emails-in-a-django-project |
Every Post entry has a value, so we can safely switch from null=True to null=False. And since all the values are unique, we can also add the unique=True flag.
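With four rows this is easy to confirm by eye; on a larger table it may be worth checking before tightening the constraints. Below is a minimal sketch, assuming the slug values have already been pulled into a plain list. check_slugs is a hypothetical helper, not part of Django, and the sample values are made up.

```python
from collections import Counter


def check_slugs(slugs):
    """Return (number of missing slugs, sorted list of duplicated slugs)."""
    missing = sum(1 for s in slugs if not s)
    counts = Counter(s for s in slugs if s)
    duplicates = sorted(s for s, n in counts.items() if n > 1)
    return missing, duplicates


# In a Django shell the values could come from:
#   slugs = list(Post.objects.values_list("slug", flat=True))
sample = ["how-to-render-django-form-manually", "hello-world", None, "hello-world"]
print(check_slugs(sample))
# (1, ['hello-world'])
```

A result of (0, []) suggests it is safe to add null=False and unique=True.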
Change the model:
blog/models.py

    from django.db import models


    class Post(models.Model):
        title = models.CharField(max_length=255)
        date = models.DateTimeField(auto_now_add=True)
        content = models.TextField()
        slug = models.SlugField(null=False, unique=True)

        def __str__(self):
            return self.title

Create a new migration:

    python manage.py makemigrations blog

This time you will see the following prompt:

    You are trying to change the nullable field 'slug' on post to non-nullable without a default; we can't do that (the database needs something to populate existing rows).
    Please select a fix:
     1) Provide a one-off default now (will be set on all existing rows with a null value for this column)
     2) Ignore for now, and let me handle existing rows with NULL myself (e.g. because you added a RunPython or RunSQL operation to handle NULL values in a previous data migration)
     3) Quit, and let me add a default in models.py
    Select an option:

Select option 2 by typing “2” in the terminal.

    Migrations for 'blog':
      blog/migrations/0004_auto_20170926_1422.py
        - Alter field slug on post

Now we can safely apply the migration:

    python manage.py migrate blog

    Operations to perform:
      Apply all migrations: blog
    Running migrations:
      Applying blog.0004_auto_20170926_1422... OK

Conclusions
Data migrations can be tricky. When creating a data migration for your project, always examine the production data first. The implementation of slugify_title I used in the example is a little naïve, because it could generate duplicate slugs on a large dataset, for instance when two posts share the same title. Always test data migrations in a staging environment first, to avoid breaking things in production.
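One possible way to guard against duplicates is to append a numeric suffix whenever a slug has already been used. This is a minimal sketch, not the approach used in the example project: unique_slug is a hypothetical helper, it expects an already slugified string, and it does not account for the slug field’s maximum length.

```python
def unique_slug(base, taken):
    """Return `base`, or `base-2`, `base-3`, ... if it is already in `taken`.

    `taken` is a set of slugs already assigned; the chosen slug is added to it.
    """
    candidate = base
    suffix = 2
    while candidate in taken:
        candidate = f"{base}-{suffix}"
        suffix += 1
    taken.add(candidate)
    return candidate


taken = set()
bases = ["hello-world", "hello-world", "django-tips", "hello-world"]
slugs = [unique_slug(base, taken) for base in bases]
print(slugs)
# ['hello-world', 'hello-world-2', 'django-tips', 'hello-world-3']
```

Inside slugify_title, the loop could call a helper like this with slugify(post.title) as the base and a single set shared across all posts.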
It’s also important to do it step by step, so you stay in control of the changes you are introducing. Note that here I created three migration files for a simple data migration.
As you can see, it’s fairly easy to create this type of migration. It’s also very flexible: you could, for example, load an external text file and insert its data into a new column.
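As a rough sketch of the external file idea, the snippet below reads id and slug pairs from CSV text into a dictionary. The file layout, the column names and the load_slug_map helper are all assumptions made for illustration; an in-memory buffer stands in for a real file so the example runs on its own.

```python
import csv
import io


def load_slug_map(fileobj):
    """Read `id,slug` rows (with a header line) into a {id: slug} dict."""
    reader = csv.DictReader(fileobj)
    return {int(row["id"]): row["slug"] for row in reader}


# Stand-in for an opened file such as open("slugs.csv", newline="");
# the file name and its columns are made up for this example.
data = io.StringIO("id,slug\n1,render-forms\n2,celery-rabbitmq\n")
slug_map = load_slug_map(data)
print(slug_map)
# {1: 'render-forms', 2: 'celery-rabbitmq'}

# Inside a RunPython function the mapping could then be applied roughly like:
#   Post = apps.get_model('blog', 'Post')
#   for post in Post.objects.filter(pk__in=list(slug_map)):
#       post.slug = slug_map[post.pk]
#       post.save()
```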
The source code used in this blog post is available on GitHub: https://github.com/sibtc/data-migrations-example