Netskin Logo

Using UNION Queries with Rails

#sql
#rails
by Michael Blum on 18.10.2022

For one of our projects I needed to build an export function of a rather complex Single-Table-Inheritance association. For the sake of this article we simplify it a bit, and assume the following structure:

class Invoice < ActiveRecord::Base
end

class CustomerInvoice < Invoice
  belongs_to :customer
  delegate :name to: :customer
end

class BusinessInvoice < Invoice
  belongs_to :business
  delegate :name to: :business
end

Now imagine we want to get all Invoices in an index-view, along with some data from associated tables. Of course we need to eager-load the associations in our query, otherwise we will run in the N+1 Query - Problem.

So lets say we want to have a (haml)-view like this:

%table
  %tr
    %th Invoice-ID
    %th Name
    %th Amount
    %th ...
  @invoices.each.do |invoice|
    %tr
      %td= invoice.id
      %td= invoice.name
      %td= invoice.amount
      %td ...

As you can see in the 2nd column (labeled “name”) we have a problem: Depending on the invoice-STI-type we have to collect the data from 2 different associated tables. So here are some approaches to solve this problem:


def index
  @invoices = Invoice.all # This works! But we will have n+1 Database Queries, which exponentially slows the application down, depending on how many data we have

  @invoices = Invoice.includes(:customer, :business).all
  # This will throw an error:
  #ActiveRecord::AssociationNotFoundError: Association named 'business' was not found on Invoice; perhaps you misspelled it?

  @invoices = CustomerInvoice.includes(:customer).all + BusinessInvoice.includes(:business).all # This works! However concatenating the ActiveRecord collections feels is a bit clumsy. Also if we want to add some SQL functions like limit, offset and so on, because we have to to it in both queries

  customer_invoice_sql = CustomerInvoice.includes(:customer).to_sql
  business_invoice_sql = BusinessInvoice.includes(:business).to_sql
  @invoices = Invoice.find_by_sql("(#{customer_invoice_sql}) UNION (#{business_invoice_sql})") # This works! We are getting there! This has many advantages to the previous approach, since we are getting an Array of Arel-Objects. But `@invoices` is still an array:
  # [1] pry(#<Invoices::ExportForm>)> @invoices.class
  # => Array
end

The solution I eventually came up with uses SQL Union and ActiveRecord From. The final code looks like this:

  def index
    customer_invoice_sql = CustomerInvoice.includes(:customer).to_sql
    business_invoice_sql = BusinessInvoice.includes(:business).to_sql
    @invoices = Invoice.from("(#{customer_invoice_sql} UNION #{business_invoice_sql}) AS invoices")
    # [1] pry(#<Invoices::ExportForm>)> @invoices.class
    # => Invoice::ActiveRecord_Relation
  end

This has the major advantage, that you have an ActiveRecord-Relation as a return object instead of just an array! So you can use all your favorite Active record methods like (limit, order, pluck, …) on it, instead of dealing with a normal array. Also it is fast and easily scalable.

Happy Coding!

Resources

❮ How to Prevent force-push to master
Office 365 data in Ruby on Rails app using Microsoft Graph REST API ❯
Netskin Logo