13 minutes read

In your journey of building Django applications, you've become adept at retrieving and filtering objects from a database. Yet, as application logic grows more complex, harnessing the power of more nuanced queries becomes essential. Obtaining objects with exact field values is a good starting point, but challenges arise when you need to filter based on conditions like greater or less than, substring matches, or even regular expressions. In such instances, field lookups prove invaluable.

Excluding data

The Quidditch league is gaining popularity thanks to its new site we are currently developing. Users are keen on knowing the result of each game and are eager to explore interesting statistics about previous tournaments. We are working with two models: Team and Game. Our database, impressively large, contains all games from 1674 to the present day, thanks to the archives from the Hogwarts library. Given the size and scope of this data, we prefer to keep it in the database to enable us to access all the necessary information quickly:

from django.db import models


class Team(models.Model):
    name = models.CharField(max_length=64)


class Game(models.Model):
    home_team = models.ForeignKey(Team, related_name='game_at_home', on_delete=models.CASCADE)
    home_team_points = models.IntegerField()
    rival_team = models.ForeignKey(Team, related_name='rival_game', on_delete=models.CASCADE)
    rival_team_points = models.IntegerField()
    date = models.DateField()

Since we haven't published all the data on the website, we provide you with a sample:

from datetime import date

falmouth_falcons = Team.objects.create(name="Falmouth Falcons")
montrose_magpies = Team.objects.create(name="Montrose Magpies")
tutshill_tornados = Team.objects.create(name="Tutshill Tornados")
appleby_arrows = Team.objects.create(name="Appleby Arrows") 

Game.objects.create(home_team=falmouth_falcons, home_team_points=15,
                    rival_team=montrose_magpies, rival_team_points=12, date=date(1674, 5, 6))

Game.objects.create(home_team=falmouth_falcons, home_team_points=34,
                    rival_team=tutshill_tornados, rival_team_points=8, date=date(1774, 9, 30))

Game.objects.create(home_team=appleby_arrows, home_team_points=10,
                    rival_team=montrose_magpies, rival_team_points=19, date=date(1779, 7, 15))

Game.objects.create(home_team=tutshill_tornados, home_team_points=7,
                    rival_team=appleby_arrows, rival_team_points=27, date=date(2018, 6, 25))

Game.objects.create(home_team=montrose_magpies, home_team_points=24,
                    rival_team=tutshill_tornados, rival_team_points=16, date=date(1907, 5, 12))

Migrate your models before using them!

If you're familiar with the filter method of Django's Object Manager, you'll easily follow what the exclude method does. When filtering out data, we look for objects that satisfy certain conditions. Conversely, the exclude method allows us to remove objects from a QuerySet when a specific condition is met. Compare two queries:

# All home games for Falmouth Falcons
Game.objects.filter(home_team=falmouth_falcons)

# All games excluding games where Falmouth Falcons was a home team
Game.objects.exclude(home_team=falmouth_falcons)

We divide all the games into two samples: those that satisfy the condition, and those that don't. That is the point of the exclude method.

The syntax rules for exclude are the same as for the filter method.

Field lookups

Filtering objects solely based on their value isn't always convenient, especially when objects need to satisfy a more complex condition. Consider, for example, retrieving all 20th-century Quidditch matches or all games where the home team scored more than 12 points. We'll start with this query:

great_score_at_home_games = Game.objects.filter(home_team_points__gt=12)

The parameter relies on a special syntax combining the field name, double underscores, and the field lookup. The latter is a special name for operations you want to perform when filtering data. The gt lookup is a shorthand for greater than. Thus, you can interpret this query as "filter games where the home team scored more than 12 points".

Let's look at another example:

from datetime import date

twentieth_century_games = Game.objects.filter(
    date__gte=date(1900, 1, 1), date__lte=date(1999, 12, 31)
)

The gte lookup is similar in function to gt. The difference is that gte means that the field value could be greater than or equal to the provided one. Accordingly, lte means less than or equal to.

This way, we confine the value of the date within upper and lower bounds. We could also write our query using another helper:

twentieth_century_games = Game.objects.filter(date__year__gte=1900, date__year__lte=1999)

We combine two lookups here. First, we extract a year from the date and then impose a condition on its value. The year lookup is only available for DateField and DateTimeField; we cannot use it for plain IntegerField or CharField because this wouldn't make any sense.

We could write our query more intuitively as follows:

twentieth_century_games = Game.objects.filter(date__year__range=(1900, 1999))

One more useful lookup is isnull, which helps retrieve objects with a Null value in their fields:

class Player(models.Model):
    name = models.CharField(max_length=64, null=True, default=None)

noname_player = Player.objects.create()
players_without_name = Player.objects.filter(name__isnull=True)

If we need to get players with names, we use:

players_with_name = Player.objects.filter(name__isnull=False)

The syntax rule for any field lookup is simple — double underscores should precede it. When chaining multiple lookups, double underscores are required each time. The range means that we are searching for values between two provided boundaries.

We do not recommend using double underscores in field names. Name all fields as you name all Python variables — join the parts with a single underscore mark.

Collections for lookups

You already know a lot about comparing fields to a single value. Django also allows us to use collections in our queries. For instance, you can filter all home games for both Falmouth Falcons and Montrose Magpies, while excluding all other options:

special_home_games = Game.objects.exclude(home_team=tutshill_tornados) \
                                 .exclude(home_team=appleby_arrows)

But what if we have multiple conditions? Should we resort to extensive copy-pasting, or is there a smarter move? We recommend the latter. Let's use a Python list to filter what we need:

special_home_games = Game.objects.filter(home_team__in=[falmouth_falcons, montrose_magpies])

If you're familiar with Python lists, you can apply that knowledge when using the in lookup. This works just like any Python collection — we verify if the collection contains the value, and if so, we keep the objects in our QuerySet.

Field lookups for foreign keys

By now, you've learned how to filter data and use it for another query. First, we store falmouth_falcons as a variable. Then we can filter the games where Falcons were the home team. But do we need to store it in a variable to make a query? The answer is no, as we can directly access fields of foreign keys through lookups.

falcons_home_games = Game.objects.filter(home_team__name="Falmouth Falcons")

Use double underscores again. Now, the structure consists of a foreign key field name, double underscores, and a foreign key model field.

To delve deeper, let's combine the access to the field of the foreign model with field lookups:

falcons_home_games = Game.objects.filter(home_team__name__contains="Falcons")

The use of double underscores is akin to access through a period, with fields resembling class attributes and lookups mirroring class methods. Hence, you can combine them just like you would combine accesses using a period.

We can do the same with many-to-many relationships. For example, let's consider the Stadium model, which contains information about its name and the teams that have played in it:

from django.db import models

class Stadium(models.Model):
    name = models.CharField(max_length=64)
    team = models.ManyToManyField(Team)

Now we can retrieve all the stadiums where Falmouth Falcons have played:

falmouth_falcons_stadiums = Stadium.objects.filter(team__name="Falmouth Falcons")

Be cautious and don't make convoluted queries with several instances of double underscores: two pairs are generally enough. Excessive usage may prolong processing time or cause your database to utilize a lot of memory for a single query.

Nested queries

Let's say you want to find all teams that have scored over 100 runs in home games. To accomplish this task, we can use nested queries.

A nested query is a query that is embedded within another query. It’s a useful tool that allows you to filter or annotate your main queryset based on the results of a separate queryset.

Here’s an example of how you can use a nested query in Django using the Subquery class:

from django.db.models import Subquery

# define a subquery
subquery = Game.objects.filter(home_team_points__gt=100).values('home_team')

# use a subquery in the main query
high_scoring_teams = Team.objects.filter(id__in=Subquery(subquery)).values('name')

In this example, the inner query (or subquery) selects home_team from the Game model where home_team_points is greater than 100. The outer query then selects name from the Team model, where id values match those returned by the subquery.

Nested queries in Django can be useful in many situations. Here are some scenarios where you might employ nested queries:

  • Data Filtering: Subqueries can filter data in the main query based on the subquery's results. This can be useful when you need to retrieve data meeting certain criteria that a simple filter cannot express.

  • Data Aggregation: Nested queries can aggregate data. For example, you can count the number of records matching a certain condition using a subquery, and then incorporate that information into your main query.

  • Performance Optimization: In some cases, nested queries can be more efficient than running multiple separate queries. However, this efficiency depends on your specific situation, so it's vital to monitor your queries' performance.

It's important to remember that nested queries, while powerful, can be complex and may slow down your application if misused. Therefore, it's paramount to carefully test and optimize your queries for maximum performance.

Get or 404

It can often happen that the object we want to get does not exist. Such requests also need proper handling, and one method is using the get_object_or_404 function. This convenient Django function attempts to retrieve an object with the provided parameters. However, if the targeted object is not found, it raises a 404 Not Found error, leading to a 404 error page being displayed to the user.

from django.shortcuts import get_object_or_404
from .models import Team, Game

def game_detail(request, game_id):
    game = get_object_or_404(Game, id=game_id)
    return render(request, 'game_detail.html', {'game': game})

In this example, get_object_or_404 is used to get a game with a specific id. If no game with that id is found, Django automatically returns a 404 error page.

This function is useful when you want to display the details of a specific object, such as a game. If the object does not exist, the user is automatically directed to a 404 error page, which indicates proper error management.

Remember that get_object_or_404 is just a convenient wrapper around Model.objects.get(). While it offers no additional functionality compared to a regular get() request, it does enhance your code's readability and cleanliness.

Conclusion

Mastering Django queries enables efficient data management and extraction from your database. As your application evolves, sophisticated queries and lookups become essential. Django provides powerful tools for complex data filtering and leveraging field lookups.

We've covered various aspects of querying data, from simple filters to more advanced techniques like using field lookups for foreign keys, collections, and nested queries. We've also discussed the importance of error management, highlighted with the introduction of the get_object_or_404 function, which offers a graceful solution when attempting to retrieve non-existent objects.

With a solid understanding of these querying techniques, you can build robust and efficient applications, ensuring a seamless experience for both developers and users alike through expressive and performance-conscious data retrieval.

6 learners liked this piece of theory. 0 didn't like it. What about you?
Report a typo