Toggle Menu

Insights / Tech Tips / Optimizing ActiveRecord & SQL Queries

December 15, 2017

Optimizing ActiveRecord & SQL Queries

5 mins read

When first learning Rails, people tend to lean pretty heavily on Ruby’s Enumerable methods to form the queries they need. This is okay when you’re just starting and arguably if you’re just trying to get a small project out the door. But, eventually you’ll find yourself in a situation where response time matters and getting good at optimizing queries becomes really important.

I plan to write 4 posts on this topic but we can start by looking at how we can optimize Rails’ belongs_to association queries.

Domain model

Commonly, in Rails projects you’ll have a domain model as such:

[pcsh lang=”ruby” tab_size=”4″ message=”” hl_lines=”” provider=”manual”]

class Customer < ActiveRecord::Base

belongs_to :tier

end

[/pcsh]

[pcsh lang=”ruby” tab_size=”4″ message=”” hl_lines=”” provider=”manual”]

class Tier < ActiveRecord::Base

has_many :customers

end

[/pcsh]

Customers belonging to a free tier

The client wants a way to e-mail all of the customers belonging to a free tier (to send them an invite to upgrade their account, for example). Easily enough, this is achieved with the following:

[pcsh lang=”ruby” tab_size=”4″ message=”” hl_lines=”” provider=”manual”]

Customer.all.select { |customer| customer.tier.freemium }

[/pcsh]

So, we’re done, let’s push to production? Well not so fast. Let’s analyze this query a little bit in Rails console to see what it’s really doing to our database.

[pcsh lang=”ruby” tab_size=”4″ message=”” hl_lines=”” provider=”manual”]

2.3.0 :016 > Customer.all.select { |customer| customer.tier.freemium }

Customer Load (1.6ms)  SELECT "customers".* FROM "customers"

Tier Load (0.1ms)  SELECT  "tiers".* FROM "tiers" WHERE "tiers"."id" = ? LIMIT 1  [["id", 1]]

Tier Load (0.0ms)  SELECT  "tiers".* FROM "tiers" WHERE "tiers"."id" = ? LIMIT 1  [["id", 1]]

Tier Load (0.0ms)  SELECT  "tiers".* FROM "tiers" WHERE "tiers"."id" = ? LIMIT 1  [["id", 1]]

Tier Load (0.0ms)  SELECT  "tiers".* FROM "tiers" WHERE "tiers"."id" = ? LIMIT 1  [["id", 1]]

Tier Load (0.0ms)  SELECT  "tiers".* FROM "tiers" WHERE "tiers"."id" = ? LIMIT 1  [["id", 1]]

Tier Load (0.0ms)  SELECT  "tiers".* FROM "tiers" WHERE "tiers"."id" = ? LIMIT 1  [["id", 1]]

[/pcsh]

…redacted…

N+1 Anyone?

Looks like we have a classic N+1 query — for every Customer, we’re making a separate request to the Tiers table. Now imagine how inefficient this would be for a mid-sized project with thousands or millions of Customers.

What’s worse is that we don’t even need actual data from the Tiers table as our final result should just be a list of Customers. With each query, our database is sending back every one of these Tiers to the application where it is then unnecessarily built out into a full-fledged ActiveRecord object.

What SQL query do we really want anyway?

If you’re familiar with SQL at all, sometimes it’s easier to write up the optimized query that you want and to back your way into an ActiveRecord query (or not if it’s a really complex query). In our case the SQL query we want is:

[pcsh lang=”sql” tab_size=”4″ message=”” hl_lines=”” provider=”manual”]

SELECT *

FROM customers

INNER JOIN tiers

ON tiers.id = customer.tier_id

WHERE tiers.paid = false

[/pcsh]

ActiveRecord’s `#join`

The above query should give us everything we need in one database call. The main players are the JOIN and WHERE statements. Fortunately, ActiveRecord has convenience methods for these SQL statements. So, lets break this down one by one starting with the join clause.

[pcsh lang=”sql” tab_size=”4″ message=”” hl_lines=”” provider=”manual”]

Customer.all.joins(:tier)

[/pcsh]

That was easy. If we check this out on the console we should more or less get something like:

[pcsh lang=”sql” tab_size=”4″ message=”” hl_lines=”” provider=”manual”]

SELECT *

FROM customers

INNER JOIN tiers

ON tiers.id = customer.tier_id

[/pcsh]

Great, so the next step is to simply implement the WHERE clause:

[pcsh lang=”ruby” tab_size=”4″ message=”” hl_lines=”” provider=”manual”]

Customer.all.joins(:tier).where(tiers: { paid: false })

[/pcsh]

And, if we plug this into Rails console we’ll see that we’re right back to the SQL query we were aiming for.

Refactor with `#merge`

At this point it would probably be OK if we commit this code and push it up. But, we can make it just a little bit better by separating concerns between what a Customer and Tier should be doing. Let’s start by ensuring that a Tier is solely responsible for knowing whether or not it is free or paid:

[pcsh lang=”ruby” tab_size=”4″ message=”” hl_lines=”” provider=”manual”]

class Tier < ActiveRecord::Base

def self.freemium

where(paid: false)

end

end

[/pcsh]

Now, in the Customer model we can simply tell Tier what to do and not ask it about itself. The last step would be to use `#merge` to combine the results of `Tier.freemium` with our Customers table. Our complete solution is now:

[pcsh lang=”ruby” tab_size=”4″ message=”” hl_lines=”” provider=”manual”]

class Customer < ActiveRecord::Base

def self.not_paying

joins(:tier).merge(Tier.freemium)

end

end

[/pcsh]

And as a last check, we can try this out in the console one more time to ensure we’re getting the same SQL query from before — which it does!

[pcsh lang=”ruby” tab_size=”4″ message=”” hl_lines=”” provider=”manual”]

2.3.0 :016 > Customer.not_paying

Customer Load (0.5ms)  SELECT "customers".* FROM "customers" INNER JOIN "tiers" ON "tiers"."id" = "customers"."tier_id" WHERE "tiers"."paid" = ‘f'

[/pcsh]

Further Reading

Stay tuned – this article is part 1 of 4 in a series on Optimizing ActiveRecord:
Part 1: Querying “belongs_to” associations
Part 2: Querying “one-to-many” associations
Part 3: Using raw SQL for custom joins
Part 4: Using aggregations in ActiveRecord

You Might Also Like

Resources

Simplifying Tech Complexities and Cultivating Tech Talent with Dustin Gaspard

Technical Program Manager, Dustin Gaspard, join host Javier Guerra, of The TechHuman Experience to discuss the transformative...

Resources

How Federal Agencies Can Deliver Better Digital Experiences Using UX and Human-Centered Design

Excella UX/UI Xpert, Thelma Van, join host John Gilroy of Federal Tech Podcast to discuss...