A dirty patch to fix Django annotate related group by year/month/day related bug

Recently I was working on a Django project with a team, where we wanted to run some group-by queries for analytical data representation. As we already know that Django does not directly support group-by but there are ways one can achieve it by using Django values and annotate functions.

Model.objects.annotate(year=ExtractYear('timestamp'))
        .values('year')
        .annotate(ycount=Count('id'))

It was supposed to return a QuerySet that contains a count of entries that has been created in a specific year. Instead, it was returning a QuerySet that containing individual data.

During the first step of my investigation, I tried to log the SQL query that is associated with this query and it logged something like this.

SELECT EXTRACT(YEAR FROM `tablename`.`timestamp`) AS `year`, COUNT(`tablename`.`id`) AS `ycount` FROM `tablename` GROUP BY EXTRACT(YEAR FROM `tablename`.`timestamp`), `tablename`.`timestamp`

The SQL query that I wanted my ORM to create was:

SELECT EXTRACT(YEAR FROM `tablename`.`timestamp`) AS `year`, COUNT(`tablename`.`id`) AS `ycount` FROM `tablename` GROUP BY EXTRACT(YEAR FROM `tablename`.`timestamp`)

The difference was subtle, but the Django was grouping by using two fields and that was the reason behind this unintended result.

How can we possibly bypass this possible bug from Django? Since we had no way to group timestamps. The solution I had in mind is, while running queries, what if I can temporarily replace the value of timestamp on the runtime? Since values, or F does not allow to replace value of a field I had to rely on extra function that comes with Django.

Model.objects.annotate(year=ExtractYear('timestamp'))
             .values('year')extra(select = {'timestamp': 'year'})
             .annotate(ycount=Count('id'))

Which has produced the following SQL:

SELECT DATE(`tablename`.`timestamp`) AS `date`, MAX(`tablename`.`quantity`) AS `count` FROM `tablename` GROUP BY DATE(`tablename`.`timestamp`), (date)

It is probably not the ideal solution but it got things done before the Django team solves the problem. If you have a better solution in mind, I would love to talk about it and implement it.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s