News

28th August 2015 by aegeuana_sjp_admin

Efficient SQL queries with Django

Creating/fetching/saving/deleting data from database is one of the most common tasks for the the web apps. Most of the time the web app is based on database and can not work without it. Therefore developers need to remember how important is to make efficient queries and not to slow down the application as it might become really frustrating for the end user to wait ages for the page to load.
The common problem during the development is that the database contains not enough data in to notice the problem. When the app goes live and more data is being saved into the database, everything becomes slower, when that happens it’s really hard to find quick solution to the problem. The best approach is to prevent this kind of situation from the development stage, by checking how many queries particular endpoint generates and limiting them to minimum. This approach will definitely save a lot of time and money when the app goes live.
Django provides really powerful and agile way to make queries to the database by using the models module. Although if we don’t know how to use it in the proper way the queries might become messy and inefficient. To demonstrate how Django database backend works, we’ll create a simple app with different type of relations between the models.
[code language=”python”]
from django.db import models
class Gender(models.Model):
name = models.CharField(max_length=100)
def __unicode__(self):
return self.name
class Color(models.Model):
name = models.CharField(max_length=100)
def __unicode__(self):
return self.color
class Info(models.Model):
gender = models.ForeignKey(‘app.Gender’)
color = models.ForeignKey(‘app.Color’)
age = models.PositiveIntegerField(default=0)
class Owner(models.Model):
name = models.CharField(max_length=100)
info = models.OneToOneField(‘app.Info’)
kittens = models.ManyToManyField(‘app.Kitten’)
def __unicode__(self):
return self.name
class Kitten(models.Model):
name = models.CharField(max_length=100)
info = models.OneToOneField(‘app.Info’)
def __unicode__(self):
return self.name
[/code]

Efficient object creation

If we’ve got to add many objects at once it it highly recommended to add them all at once. Let’s see how to add many `Color` type objects at once.
[code language=”python”]
color_names = [‘Black’, ‘White’, ‘Other’]
color_list = []
for c in color_names:
color_list.append(Color(name=c))
Color.objects.bulk_create(color_list)
[/code]
By using `bulk_create` method, the database will receive only one query.
This code will execute 1 SQL QUERY

Show Queries

Efficient object updating

To update many objects at once we can use `update` method. Let’s change colors of all `Color` objects to be the same.
[code language=”python”]
Color.objects.all().update(name=’Black’)
[/code]
This code will execute 1 SQL QUERY

Show Queries

Efficient object deleting

Django allows to perform delete from QuerySet. So by filtering objects and calling `delete` Django will select filtered objects collect the IDs to delete and execute DELETE in SQL.
[code language=”python”]
Color.objects.filter(name=’Black’).delete()
[/code]
This code will execute 3 SQL QUERIES

Show Queries

Efficient object filtering/fetching

These were the simplest examples. Whenever the models are more complicated it becomes harder to keep track about the amount of queries. To demonstrate how tricky it might be we need to populate the database with some test data.
[code language=”python”]
from app.models import *
import random
gender_names = [‘Male’, ‘Female’, ‘Other’]
color_names = [‘Black’, ‘White’, ‘Other’]
owner_names = [‘Mick’, ‘John’, ‘Paul’, ‘Jenny’, ‘Betty’, ‘Sarah’]
kitten_names = [‘Flappy’, ‘Fluffy’, ‘Naughty’, ‘Peele’, ‘Koko’, ‘Panny’]
# prepare `Gender` objects
g_list = []
for g in gender_names:
g_list.append(Gender(name=g))
# save objects in bulk
Gender.objects.bulk_create(g_list)
# prepare `Color` objects
c_list = []
for c in color_names:
c_list.append(Color(name=c))
# save `Color` objects
Color.objects.bulk_create(c_list)
# prepare ‘Kitten’ objects
k_list = []
for k in kitten_names:
i = Info.objects.create(
gender=Gender.objects.order_by(‘?’).first(),
color=Color.objects.order_by(‘?’).first(),
age=random.randrange(1,10)
)
k_list.appned(Kitten(
name=k,
info=i
))
# save ‘Kitten’ objects
Kitten.objects.bulk_create(k_list)
# create `Owner` objects
o_list = []
for o in owner_names:
i = Info.objects.create(
gender=Gender.objects.order_by(‘?’).first(),
color=Color.objects.order_by(‘?’).first(),
age=random.randrange(15,70)
)
o_obj = Owner.objects.create(
name=o,
info=i
)
# assign kittens to the owner
s_s = random.randrange(0, 2)
s_f = s_s + random.randrange(0, 4)
o_obj.kittens.add(*[k for k in Kitten.objects.filter()[s_s:s_f]])
[/code]

Fetch object with related models

Now we will try to fetch the `Owner` object with all `Kitten’s`, `Gender` and `Info` objects that belongs to it.
We can define helper function to print all the properties of the owner and all the kittens he owns
[code language=”python”]
def printOwner(owner):
print owner.name, owner.info.gender, owner.info.color, owner.info.age
for kitten in owner.kittens.all():
print kitten.name, kitten.info.gender, kitten.info.color, kitten.info.age
[/code]
Now select the owner object and print it’s properties
[code language=”python”]
owner = Owner.objects.get(name=’John’)
printOwner(owner)
[/code]
This code will execute 11 SQL QUERIES

Show Queries

By accessing the model object and it’s properties like that we did a LOT of queries even for a one object. If we would like to list this way all owners and the kittens belongs to them we could hit database thousands of times. Luckily Django provides two functions to perform JOIN on the related tables.

  • `select_related` is used to perform JOIN on ForeignKeys and OneToOne type of fields
  • `prefetch_related` is used to preform JOIN on ManyToMany fields and reverse relations(like reverse ForeignKey or reverse ManyToMany field)

It’s also possible to nest the JOINs and prefetch more nested objects.
[code language=”python”]
owner = Owner.objects.select_related(
‘info’, # JOIN `Info` object
‘info__gender’, # JOIN `Gender` object for `Info`
‘info__color’ # JOIN `Color` objects for `Info`
).prefetch_related(
‘kittens’, # JOIN `Kitten` objects
‘kittens__info’, # JOIN `Info` for `Kitten` objects
‘kittens__info__gender’, # JOIN `Gender` for `Info` for `Kitten` objects
‘kittens__info__color’ # JOIN `Color` for `Info` for `Kitten` objects
).get(name=’John’)
printOwner(owner)
[/code]
This code will execute 5 SQL QUERIES

Show Queries

This is a huge improvement even if we look for a single object case, however it becomes even more important when we take a look on listing the objects. The first case will hit database at least 10 times for each object when the second one will hit database only 5 times, no matter how many objects will be retrieved.

Reverse relations

Django provides reverse relations for the developers convenience, by default it’s name of the related model class plus `_set`. In case of `Kitten` and `Owner` model it would be `owner_set`. It’s easy to access reverse ForeignKey or reverse ManyToMany relation. As same as previous, to limit amount of queries we can specify to join object from reverse relation as well.
[code language=”python”]
kitten = Kitten.objects.select_related(
‘info’,
‘info__gender’,
‘info__color’
).prefetch_related(
‘owner_set’, # JOIN `Owner` for `Kitten`
‘owner_set__info’, # JOIN `Info` for `Owner` for `Kitten`
‘owner_set__info__gender’, # JOIN `Gender` for `Info` for `Owner` for `Kitten`
‘owner_set__info__color’ # JOIN `Color` for `Info` for `Owner` for `Kitten`
).get(name="Fluffy")
print kitten.name, kitten.info.gender.name
for owner in kitten.owner_set.all():
print owner.name, owner.info.color
[/code]
This code will execute 5 SQL QUERIES

Show Queries

Summary

Sometimes it’s hard to keep track, how many queries each endpoint is doing but it’s really worth to always pay attention and limit the database hits to minimum.

Leave a Reply

Your email address will not be published. Required fields are marked *