Explanation

django-readers provides a set of tools to help with the parts of your business logic that are responsible for reads from the database: selecting and transforming data before presenting it to clients. It can be used with views that render Django templates as well as Django REST framework views.

The functionality that django-readers provides is deliberately straightforward and interoperable with existing Django libraries, patterns and practices. You can choose to use just the parts of django-readers that appeal to you and make sense in your project.

A django-readers "spec" precisely specifies the data that your view depends on (which particular fields from which related models). Only this data will be fetched from the database, in the most efficient way possible. This is intended to avoid the N+1 queries problem and can dramatically improve the performance of your application.

However, django-readers is more than just this. It is also intended to suggest patterns which help with organising business logic in a Django codebase, following a function-oriented style. It introduces a few simple concepts, and provides some tools to assemble them together into a working application. It can easily be combined with existing patterns and libraries.

  • queryset preparation functions replace custom queryset methods and encapsulate data selection: filtering, annotation etc. They can be composed to express complex selection logic.
  • producer and projector functions replace model methods and encapsulate business logic for transforming and presenting data. They can be combined to form lightweight business objects (dictionaries) that are the right shape for the code that consumes them.
  • reader pairs combine queryset functions and producers (or projectors), expressing the dependencies between them.
  • a high-level spec provides a concise way to express exactly which data should be selected and projected at the point of use.

Motivation

Common Django practice encourages a "fat models" approach, in which most of the business logic of the application goes in the model layer (on the models themselves, or on custom managers or querysets). Experience suggests that this is often a bad idea, for several reasons:

First, it goes against the Single Responsibility Principle. Models are already responsible for mapping between your database tables and your application code and back again. This mapping is a highly complex task, and that's quite enough for one set of classes to be responsible for.

Second, it is bad for code organisation, particularly in larger projects. Your models.py becomes a trash pile onto which all business logic is dumped. Models and querysets grow to thousands of lines of code. The API surface area for each model becomes huge, and this entire surface is available to any part of your application that imports the model.

Third and worst, often model methods themselves perform queries against other models. This is a disaster for application performance, leading to inefficient query patterns that can be very difficult to fix. When they are fixed (through judicious use of select_related and prefetch_related on the queryset), the model methods become tightly coupled to the precise way that the query is built, resulting in unpredictable and brittle code.

django-readers encourages you to structure your code around plain functions rather than methods on classes. You can put these functions wherever you like in your codebase. Complex business logic is built by composing and combining these functions.

Note

YAGNI ("You Aren't Gonna Need It") is a well-understood mantra in software development. It means that you should only make changes to your software (including adding abstraction and generalising code structures) when you are sure that you absolutely need them, and not before. Related to this is the famous quote from Donald Knuth: "premature optimization is the root of all evil". This suggests that performance concerns usually fall under YAGNI: you shouldn't spend time making code fast until its slowness becomes a real problem for users.

As a counterpoint to this advice, Simon Willison coined the phrase PAGNI, "Probably Are Gonna Need It". PAGNI applies in situations "when the cost of adding something later is so dramatically expensive compared with the cost of adding it early on that it’s worth taking the risk [...] when you know from experience that an initial investment will pay off many times over."

django-readers (and its sister project django-zen-queries) were built partly as a result of a strong belief (gained through many years of experience) that understanding and controlling your application's database querying behaviour firmly falls into the PAGNI category. This is particularly true of applications that you know are going to be complex: any code abstraction patterns you decide to follow in order to manage the complexity must take into account query patterns or they are highly likely to cause terrible performance problems. This is the heart of the object-relational impedance mismatch.

Features and concepts

django-readers is organised in three layers of "reader functions". At the highest level of abstraction is django_readers.specs (the top layer), which depends on django_readers.pairs (the middle layer), which depends on django_readers.producers, django_readers.projectors and django_readers.qs (the bottom layer).

These layers can be intermingled in a real-world application. To explain each layer, it makes most sense to start at the bottom and work upwards.

Queryset preparation functions

A queryset preparation function is a function that accepts a queryset as its single argument, and returns a new queryset with some modifications applied. These functions are used to encapsulate database query logic that would traditionally live in a custom queryset method. If you were writing one of these yourself, it might look like this:

def prepare(queryset):
    return queryset.filter(name="shakespeare")

However, you don't usually need to write your own queryset functions: django-readers provides a library of functions under django_readers.qs which mirror all the default methods on the base QuerySet that return a new queryset, as well as some extra utility functions.

Queryset functions can be combined with the qs.pipe function (named following standard functional programming parlance). qs.pipe returns a new queryset function that calls each function in its argument list in turn, passing the return value of the first as the argument of the second, and so on. It literally "pipes" your queryset through its list of functions.

from django_readers import qs

recent_books_with_prefetched_authors = qs.pipe(
    qs.filter(publication_date__year=2020),
    qs.prefetch_related("authors"),
    qs.order_by("name"),
)

queryset = recent_books_with_prefetched_authors(Book.objects.all())
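
To make the "piping" idea concrete, here is a minimal sketch of the composition that qs.pipe is described as performing. It is not the library's implementation: the pipe function below is a simplified stand-in, and plain lists of dictionaries (with the hypothetical helpers only_recent and ordered_by_name) take the place of querysets and queryset functions so that the example runs without a database.

```python
from functools import reduce


def pipe(*functions):
    """Return a function that passes its argument through each function in turn."""

    def piped(value):
        # The return value of each function becomes the argument of the next.
        return reduce(lambda acc, fn: fn(acc), functions, value)

    return piped


# Stand-ins for queryset functions: each takes a list and returns a new list.
def only_recent(books):
    return [book for book in books if book["year"] == 2020]


def ordered_by_name(books):
    return sorted(books, key=lambda book: book["name"])


prepare = pipe(only_recent, ordered_by_name)

books = [
    {"name": "B", "year": 2020},
    {"name": "C", "year": 1999},
    {"name": "A", "year": 2020},
]
piped_result = prepare(books)
print(piped_result)
#  [{'name': 'A', 'year': 2020}, {'name': 'B', 'year': 2020}]
```

Because the result of pipe is itself a function of the same shape, piped functions can be nested inside other pipes.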

Producers and projectors

A producer is a function that accepts a model instance as its single argument, and returns a value representing a subset or transformation of the instance data.

Business logic that would traditionally go in model methods should instead go in producers.

from datetime import datetime


def produce_age(instance):
    return datetime.now().year - instance.birth_year


author = Author(name="Some Author", birth_year=1984)
print(produce_age(author))
#  37 (when run in 2021; the result depends on the current year)

The simplest producer is one that returns the value of an object attribute. django-readers provides a function to create producers that do this:

from django_readers import producers

author = Author(name="Some Author")
produce_name = producers.attr("name")
print(produce_name(author))
#  'Some Author'

Producers return a value, but in order to convert a model instance into a lightweight business object (a dictionary) suitable for passing around your project, this value must be combined with a name. This is the role of a projector. A projector function takes a model instance and returns a dictionary mapping keys to the values returned by producer functions.

These functions "project" your data layer into your application's business logic domain. Think of the dictionary returned by a projector (the "projection") as the simplest possible domain object. Generally speaking, it's not necessary to write your own projector functions. You can instead wrap a producer function with projectors.producer_to_projector:

from datetime import datetime
from django_readers import projectors


def produce_age(instance):
    return datetime.now().year - instance.birth_year


project_age = projectors.producer_to_projector("age", produce_age)

author = Author(name="Some Author", birth_year=1984)
print(project_age(author))
#  {'age': 37}  (when run in 2021)

Like queryset functions, projectors are intended to be composable: multiple simple projector functions can be combined into a more complex projector function that returns a dictionary containing the keys and values from all of its child projectors. This is done using the projectors.combine function:

from django_readers import producers, projectors

project = projectors.combine(
    projectors.producer_to_projector("name", producers.attr("name")),
    projectors.producer_to_projector("age", produce_age),
)
print(project(author))
#  {'name': 'Some Author', 'age': 37}  (when run in 2021)
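
The behaviour described above can be sketched in a few lines of plain Python. This is an illustration of the concept only, not the library's source: attr, producer_to_projector and combine below are simplified stand-ins named after their django-readers counterparts, produce_age_in is a hypothetical helper that fixes the reference year so the output is stable, and a SimpleNamespace stands in for a model instance.

```python
from types import SimpleNamespace


def attr(name):
    # Producer factory: the returned producer reads one attribute.
    def producer(instance):
        return getattr(instance, name)

    return producer


def producer_to_projector(key, producer):
    # Wrap a producer so its value is returned under a name (a dictionary key).
    def projector(instance):
        return {key: producer(instance)}

    return projector


def combine(*projectors):
    # Merge the dictionaries returned by each child projector.
    def combined(instance):
        result = {}
        for projector in projectors:
            result.update(projector(instance))
        return result

    return combined


def produce_age_in(year):
    # Hypothetical helper: a producer with a fixed reference year.
    def producer(instance):
        return year - instance.birth_year

    return producer


author = SimpleNamespace(name="Some Author", birth_year=1984)
project = combine(
    producer_to_projector("name", attr("name")),
    producer_to_projector("age", produce_age_in(2021)),
)
projection = project(author)
print(projection)
#  {'name': 'Some Author', 'age': 37}
```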

This composition generally happens at the place in your codebase where the domain model is actually being used (in a view, say). The projection will therefore contain precisely the keys needed by that view. This solves the problem of models becoming vast ever-growing flat namespaces containing all the functionality needed by all parts of your application.

Related objects can also be produced using the producers.relationship function, resulting in a nested projection:

project = projectors.combine(
    projectors.producer_to_projector("name", producers.attr("name")),
    projectors.producer_to_projector("age", produce_age),
    projectors.producer_to_projector(
        "book_set",
        producers.relationship(
            "book_set",
            projectors.combine(
                projectors.producer_to_projector(
                    "title",
                    producers.attr("title"),
                ),
                projectors.producer_to_projector(
                    "publication_date",
                    producers.attr("publication_date"),
                ),
            ),
        ),
    ),
)

print(project(author))
#  {
#      'name': 'Some Author',
#      'age': 37,
#      'book_set': [
#          {'title': 'Some Book', 'publication_date': 2019}
#      ]
#   }

Note above that the second argument to producers.relationship is a projector function to project each related object.
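
As a rough sketch of the idea, a relationship producer can be thought of as applying the child projector to each related object. The relationship function below is a simplified stand-in, not the library's implementation: it assumes the related objects are available as a plain list on the instance and covers only the to-many case, whereas with real models the related objects come from the Django ORM.

```python
from types import SimpleNamespace


def relationship(name, related_projector):
    # Producer factory: project every object found under the named attribute.
    def producer(instance):
        related_objects = getattr(instance, name)
        return [related_projector(obj) for obj in related_objects]

    return producer


def project_title(book):
    return {"title": book.title}


author = SimpleNamespace(
    name="Some Author",
    book_set=[
        SimpleNamespace(title="Some Book"),
        SimpleNamespace(title="Another Book"),
    ],
)

produce_books = relationship("book_set", project_title)
print(produce_books(author))
#  [{'title': 'Some Book'}, {'title': 'Another Book'}]
```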

Pairs

prepare and produce (and therefore also project) functions are intimately connected, with the produce/project functions usually depending on fields, annotations or relationships loaded by the prepare function. For this reason, django-readers expects these functions to live together in two-tuples: (prepare, produce) (a "producer pair") and (prepare, project) (a "projector pair"). Remember that the difference between produce and project is that the former returns a single value, whereas the latter returns a dictionary binding one or more names (keys) to one or more values.

In the example used above, the produce_age producer depends on the birth_year field:

age_pair = (qs.include_fields("birth_year"), produce_age)

django-readers includes some useful functions that create pairs. These attempt to generate the most efficient queries they can, which means loading only those database fields required to produce the value or values:

from django_readers import pairs

prepare, produce = pairs.field("name")
queryset = prepare(Author.objects.all())
print(queryset.query)
#  SELECT "author"."id", "author"."name" FROM "author"
print(produce(queryset.first()))
#  'Some Author'

When composing multiple pairs together, it is again necessary to wrap the producer to convert it to a projector, thus forming (prepare, project) pairs. This can be done with the pairs.producer_to_projector function:

prepare, project = pairs.combine(
    pairs.producer_to_projector("name", pairs.field("name")),
    pairs.producer_to_projector("birth_year", pairs.field("birth_year")),
)
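
To illustrate why keeping the two functions together is useful, here is a simplified sketch of how pairs might compose. It is an assumption-laden model rather than the library's code: field, producer_to_projector and combine are stand-ins named after the django-readers functions, and instead of modifying a queryset each prepare function simply adds to a set of required field names.

```python
from types import SimpleNamespace


def field(name):
    # The prepare half records the dependency; the produce half reads the value.
    def prepare(fields):
        return fields | {name}

    def produce(instance):
        return getattr(instance, name)

    return prepare, produce


def producer_to_projector(key, pair):
    prepare, produce = pair

    def project(instance):
        return {key: produce(instance)}

    return prepare, project


def combine(*pairs):
    def prepare(fields):
        # Apply every prepare function in turn, as a pipe would.
        for pair_prepare, _ in pairs:
            fields = pair_prepare(fields)
        return fields

    def project(instance):
        # Merge the dictionaries returned by every project function.
        result = {}
        for _, pair_project in pairs:
            result.update(pair_project(instance))
        return result

    return prepare, project


prepare, project = combine(
    producer_to_projector("name", field("name")),
    producer_to_projector("birth_year", field("birth_year")),
)

needed_fields = prepare(frozenset())
projection = project(SimpleNamespace(name="Some Author", birth_year=1984))
print(sorted(needed_fields))
#  ['birth_year', 'name']
print(projection)
#  {'name': 'Some Author', 'birth_year': 1984}
```

The combined prepare function knows about exactly the fields that the combined project function will read, which is the dependency that a pair expresses.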

Relationships can automatically be loaded and projected, too:

prepare, project = pairs.combine(
    pairs.producer_to_projector("name", pairs.field("name")),
    pairs.producer_to_projector("age", age_pair),
    pairs.producer_to_projector(
        "book_set",
        pairs.relationship(
            "book_set",
            pairs.combine(
                pairs.producer_to_projector("title", pairs.field("title")),
                pairs.producer_to_projector(
                    "publication_date", pairs.field("publication_date")
                ),
            ),
        ),
    ),
)

Again, only the precise fields that are needed are loaded from the database. All relationship functions take an optional to_attr argument which is passed to the underlying Prefetch object.

Note that django-readers always uses prefetch_related to load relationships, even in circumstances where select_related would usually be used (i.e. ForeignKey and OneToOneField), resulting in one query per relationship. This approach allows the code to be "fractal": the tree of (prepare, project) pairs can be recursively applied to the tree of related querysets. It is possible to use select_related but this must be done manually.

Specs

Manually assembling trees of pairs as seen above may seem long-winded. The specs module provides a layer of syntactic sugar that makes it much easier. This layer is the real magic of django-readers: a straightforward way of specifying the shape of your data in order to efficiently select and project a complex tree of related objects.

The resulting nested dictionary structure may be returned from a view as a JSON response (assuming all your producers return JSON-serializable values), or included in a template context in place of a queryset or model instance.

A spec is a list. Under the hood, the specs module is a very lightweight wrapper on top of pairs. Each item in the list undergoes a simple transformation to replace it with a pair function. See the reference documentation for specs for details.

The example from the last section may be written as the following spec:

from django_readers import specs

spec = [
    "name",
    {"age": age_pair},
    {
        "book_set": [
            "title",
            "publication_date",
        ]
    },
]

prepare, project = specs.process(spec)

queryset = prepare(Author.objects.all())
result = [project(instance) for instance in queryset]
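
The transformation from a spec to a projector can be approximated in plain Python. The sketch below is a deliberately reduced model, not what specs.process actually does: project_from_spec is a hypothetical function that handles only the three item shapes used in the example above (a field name, a named pair, and a nested list for a relationship), ignores the prepare half of each pair because there is no database, and treats related objects as a plain list.

```python
from types import SimpleNamespace


def project_from_spec(spec):
    projectors = []

    def add_field(name):
        projectors.append(lambda instance: {name: getattr(instance, name)})

    def add_pair(key, pair):
        _prepare, produce = pair  # the prepare half is unused in this sketch
        projectors.append(lambda instance: {key: produce(instance)})

    def add_relationship(key, child_spec):
        child = project_from_spec(child_spec)
        projectors.append(
            lambda instance: {key: [child(obj) for obj in getattr(instance, key)]}
        )

    for item in spec:
        if isinstance(item, str):
            add_field(item)
            continue
        for key, value in item.items():
            if isinstance(value, list):
                add_relationship(key, value)
            else:
                add_pair(key, value)

    def project(instance):
        result = {}
        for projector in projectors:
            result.update(projector(instance))
        return result

    return project


# Stand-in pair: no prepare function, and a fixed reference year for the age.
age_pair = (None, lambda instance: 2021 - instance.birth_year)

author = SimpleNamespace(
    name="Some Author",
    birth_year=1984,
    book_set=[SimpleNamespace(title="Some Book", publication_date=2019)],
)

spec = ["name", {"age": age_pair}, {"book_set": ["title", "publication_date"]}]
result = project_from_spec(spec)(author)
print(result)
#  {'name': 'Some Author', 'age': 37, 'book_set': [{'title': 'Some Book', 'publication_date': 2019}]}
```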

Note

The structure of this specification is heavily inspired by django-rest-framework-serialization-spec, minus the concept of "plugins", which are replaced with directly including (prepare, produce) pairs in the spec. It should be trivial to convert or "adapt" a serialization-spec plugin into a suitable django-readers pair.

Where should this code go?

We recommend that your custom functions go in a file called readers.py inside your Django apps. Specs should be declared at the point they are used, usually in your views.py.

What about other types of business logic?

You'll notice that django-readers is focused on reads: business logic which selects some data from the database and/or transforms it in such a way that it can be displayed to a user. What about other common types of business logic that involve accepting input from users and processing it?

django-readers doesn't provide any code to help with this, but we encourage you to follow the same function-oriented philosophy. Structure your codebase around functions which take model instances and encapsulate these sorts of write actions. You might choose to call them action functions and place them in a file called actions.py.
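
As one possible shape for such a function, here is a small sketch. Everything in it is hypothetical: publish_book is an invented action, and the Book dataclass is a stand-in for a Django model whose save method only counts calls, so the example runs without a database.

```python
from dataclasses import dataclass


@dataclass
class Book:
    # Hypothetical stand-in for a Django model; save() only records the call.
    title: str
    is_published: bool = False
    save_count: int = 0

    def save(self):
        self.save_count += 1


def publish_book(book):
    """Action function: mark a book as published and persist the change."""
    if book.is_published:
        raise ValueError(f"{book.title!r} is already published")
    book.is_published = True
    book.save()
    return book


book = publish_book(Book(title="Some Book"))
print(book.is_published, book.save_count)
#  True 1
```

The rule that a book cannot be published twice lives in one plain function rather than on the model, so callers import only the actions they need.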

The other common task needed is data validation. We'd suggest Django forms and/or Django REST framework serializers are perfectly adequate here.

Is django-readers a "service layer"?

Not really, although it does solve some of the same problems. It suggests alternative (and, we think, beneficial) ways to structure your business logic without attempting to hide or abstract away the underlying Django concepts, and so should be easily understandable by any experienced Django developer. You can easily "mix and match" django-readers concepts into an existing application.

If you are someone who feels more comfortable thinking in terms of established Design Patterns, you may consider the dictionaries returned from projector functions as simple Data Transfer Objects, and the idea of dividing read and write logic into readers and actions as a version of CQRS.

Is django-readers a serialization or data conversion library?

Not really, although again it does solve some of the same problems. django-readers is often compared to projects like attrs/cattrs and pydantic.

However, django-readers is focused on the shape of the data and how to extract it from the database (via the Django ORM) efficiently, rather than converting and validating the types. It eschews a class-oriented style in favour of plain, composable functions operating on plain data structures like dictionaries, and deliberately avoids static type annotations.

If your intention is to render your data to JSON, we recommend you use django-readers to project the model field values, and then lean on Django or Django REST framework's built-in rich encoders for converting types like datetime and UUID to JSON-friendly strings.
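
The division of labour can be sketched with the standard library alone. RichEncoder below is a hypothetical stand-in for a framework encoder, included only so that the example runs without Django. In a Django project, django.core.serializers.json.DjangoJSONEncoder plays this role, and its exact output format may differ from the one shown here.

```python
import json
import uuid
from datetime import date, datetime


class RichEncoder(json.JSONEncoder):
    """Stand-in encoder that converts a few rich types to strings."""

    def default(self, o):
        if isinstance(o, (datetime, date)):
            return o.isoformat()
        if isinstance(o, uuid.UUID):
            return str(o)
        return super().default(o)


# A projection as it might come out of a projector: values keep their Python types.
projection = {
    "id": uuid.UUID("12345678-1234-5678-1234-567812345678"),
    "title": "Some Book",
    "published_at": datetime(2019, 5, 1, 12, 30),
}

encoded = json.dumps(projection, cls=RichEncoder)
print(encoded)
#  {"id": "12345678-1234-5678-1234-567812345678", "title": "Some Book", "published_at": "2019-05-01T12:30:00"}
```

The projector stays concerned with the shape of the data, and type conversion happens once, at the point of encoding.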