Nested Single Table Inheritance doesn’t work well. Here’s what you must know to make it work or work around it.
Some context for illustration
I recently stumbled across the following scenario.
Initial specifications: a project owner creates a project and donors can contribute any amount of money to that project.
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
end

class User < ApplicationRecord
  # ...
end

class User::ProjectOwner < User
  # ...
end

class User::Donor < User
  # ...
end

class Project < ApplicationRecord
  # ...
end

class Contribution < ApplicationRecord
  # ...
end
Later, a little change was made to the specifications: a donor may either be a natural person (an individual human) or a legal person (a corporation or any other kind of legal entity).
Since both are donors and will share a significant amount of logic, it seems obvious that they are both specializations of User::Donor, hence:
class User::Donor::Natural < User::Donor
  # ...
end

class User::Donor::Legal < User::Donor
  # ...
end
So far, this is classic OOP and we rely on ActiveRecord’s STI mechanism to do its magic (type inference in .find and so forth).
Spoiler alert: it doesn’t work.
STI doesn’t play well with lazy code loading
This part is not specific to (nested) STI or to ActiveRecord but it’s worth knowing.
Given a recordless database (working on a new project):
User.count
# => 0
User.descendants
# => []
This is unexpected. I thought User.descendants would give me an array of all subclasses of User ([User::ProjectOwner, User::Donor, User::Donor::Natural, User::Donor::Legal]) but I get none of them. Why?
You don’t expect a constant to exist unless it has been defined, do you? Well, unless you load the file that defines it, it won’t exist.
Here is roughly how it goes:
Me: …start a rails console…
Me: User.descendants
Me: #=> []
Me: puts "Did you know: you can clap for this article up to 50 times ;)" if User::Donor < User
Code loader: Oh, this `User::Donor` const does not exist yet, let me infer which file is supposed to define it and try to load it for you.
Code loader: Ok I found it and loaded it, you can proceed
Me: #=> "Did you know: you can clap for this article up to 50 times ;)"
Me: User.descendants
Me: #=> [User::Donor]
Me: puts "Another Brick In The Wall" if User::Pink < User
Code loader: Oh, this `User::Pink` const does not exist yet, let me infer which file is supposed to define it and try to load it for you.
Code loader: Sorry, this `User::Pink` is nowhere to be found, I hope you know how to rescue from NameError.
Me: #=> NameError (uninitialized constant #<Class:0x00007fb42cb92ef8>::Pink)
Now you see why lazy loading doesn’t play nice with Single Table Inheritance: unless you’ve already accessed every single one of your STI subclasses const names to preload them, they won’t be known to your app.
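The exchange above can be reproduced without Rails. What follows is a minimal plain-Ruby sketch, not how the Rails code loader is actually implemented: Account, LAZY_DEFINITIONS and known_subclasses are hypothetical names, and const_missing stands in for the code loader.

```ruby
# Plain-Ruby sketch (no Rails): a class only learns about a subclass once the
# subclass definition has been evaluated. All names here are hypothetical.
class Account
  # Stand-in for the files the code loader has not read yet.
  LAZY_DEFINITIONS = {
    Donor: -> { Class.new(Account) },
    ProjectOwner: -> { Class.new(Account) }
  }.freeze

  # Subclasses registered so far.
  def self.known_subclasses
    @known_subclasses ||= []
  end

  # Ruby calls this hook whenever a subclass is created.
  def self.inherited(subclass)
    super
    Account.known_subclasses << subclass
  end

  # Called when a constant is missing: "load" its definition on first access.
  def self.const_missing(name)
    definition = LAZY_DEFINITIONS[name]
    return super unless definition

    const_set(name, definition.call)
  end
end

p Account.known_subclasses              # nothing has been loaded yet
Account::Donor                          # first access triggers the lazy definition
p Account.known_subclasses.map(&:name)  # only the constant touched so far is known
```

Accessing a constant that has no definition at all (Account::Pink, say) falls through to super and raises NameError, just like in the transcript.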
It’s not that STI doesn’t work at all, it’s just mildly frustrating because oftentimes we need to enumerate the STI hierarchy and there’s no easy, out-of-the-box way to do it.
Ruby on Rails’ guide mentions this issue and suggests an (incomplete) solution: https://guides.rubyonrails.org/autoloading_and_reloading_constants.html#single-table-inheritance
TL;DR: use a concern that collects all types from inheritance_column and force-preloads them.
Why it’s incomplete: because a subtype that has no record yet won’t be preloaded, which means there are things you won’t be able to do. For instance, you can’t rely on inflection to generate select options because recordless types won’t be listed in your options.
Another (really not recommended) solution would be to preload all your app’s classes. It’s killing a fly with a hammer.
My solution is based on the concern suggested by Rails’ guide but instead of collecting types from inheritance_column, I use an array that contains all of the STI’s subclasses. This way I can use inflection at will. I agree that it’s not 100% SOLID-compliant but it’s a trade-off I’m willing to make.
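To illustrate why a hard-coded list helps with select options, here is a small plain-Ruby sketch. USER_STI_TYPES and sti_select_options are hypothetical names of mine, and the label formatting is just one possible choice; in a Rails app you would probably reach for ActiveSupport helpers instead.

```ruby
# Hard-coded list of every STI type, including types that have no record yet.
USER_STI_TYPES = %w[
  User::ProjectOwner
  User::Donor
  User::Donor::Natural
  User::Donor::Legal
].freeze

# Builds [label, value] pairs for a select box (hypothetical helper).
def sti_select_options(type_names)
  type_names.map do |type_name|
    # Drop the base class name, then split CamelCase words with a space.
    without_base = type_name.split("::").drop(1).join(" ")
    label = without_base.gsub(/([a-z])([A-Z])/, '\1 \2')
    [label, type_name]
  end
end

sti_select_options(USER_STI_TYPES).each do |label, value|
  puts "#{label} => #{value}"
end
```

Because the list does not come from the database, a type without any record still shows up in the options.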
That being said, let’s talk about the main topic of this article.
STI + lazy loading + nested models = unpredictable behavior
Single Table Inheritance is made for one base class and any number of subclasses you want as long as they all directly inherit from the base class.
Take a look at the two following samples. The first one works perfectly fine while the second will give you headaches.
# Working example
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
end

class User < ApplicationRecord
end

class User::ProjectOwner < User
  has_many :projects, foreign_key: 'user_id'
end

class User::Donor < User
  has_many :contributions, foreign_key: 'user_id'
end

class Project < ApplicationRecord
  belongs_to :project_owner, class_name: 'User::ProjectOwner', foreign_key: 'user_id'
end

class Contribution < ApplicationRecord
  belongs_to :project
  belongs_to :donor, class_name: 'User::Donor', foreign_key: 'user_id'
end
# Not working example
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
end

class User < ApplicationRecord
end

class User::ProjectOwner < User
  has_many :projects, foreign_key: 'user_id'
end

class User::Donor < User
  has_many :contributions, foreign_key: 'user_id'
end

class User::Donor::Natural < User::Donor
end

class User::Donor::Legal < User::Donor
end

class Project < ApplicationRecord
  belongs_to :project_owner, class_name: 'User::ProjectOwner', foreign_key: 'user_id'
end

class Contribution < ApplicationRecord
  belongs_to :project
  belongs_to :donor, class_name: 'User::Donor', foreign_key: 'user_id'
end
Why does the first one work in a predictable manner and not the second? Find out for yourself by paying attention to the SQL queries:
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
end

class User < ApplicationRecord
end

class User::ProjectOwner < User
  has_many :projects, foreign_key: 'user_id'
end

class User::Donor < User
  has_many :contributions, foreign_key: 'user_id'
end

class Project < ApplicationRecord
  belongs_to :project_owner, class_name: 'User::ProjectOwner', foreign_key: 'user_id'
end

class Contribution < ApplicationRecord
  belongs_to :project
  belongs_to :donor, class_name: 'User::Donor', foreign_key: 'user_id'
end
# ...open a rails console...
project_owner = User::ProjectOwner.create
# => User::ProjectOwner(id: 1)
project = Project.create(project_owner: project_owner)
# => Project(id: 1, user_id: 1)
donor = User::Donor.create
# => User::Donor(id: 1)
contribution = Contribution.create(donor: donor, project: project, amount: 100)
# => Contribution(id: 1, user_id: 1, project_id: 1, amount: 100)
# ...CLOSE the current rails console...
# ...OPEN a NEW rails console...
Contribution.last.donor
Contribution Load (0.5ms) SELECT "contributions".* FROM "contributions" ORDER BY "contributions"."id" DESC LIMIT $1 [["LIMIT", 1]]
User::Donor Load (0.3ms) SELECT "users".* FROM "users" WHERE "users"."type" = $1 AND "users"."id" = $2 LIMIT $3 [["type", "User::Donor"], ["id", 1], ["LIMIT", 1]]
# => User::Donor(id: 1)
Now with a nested STI (base class, mid-level subclass and leaf-level subclasses):
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
end

class User < ApplicationRecord
end

class User::ProjectOwner < User
  has_many :projects, foreign_key: 'user_id'
end

class User::Donor < User
  has_many :contributions, foreign_key: 'user_id'
end

class User::Donor::Natural < User::Donor
end

class User::Donor::Legal < User::Donor
end

class Project < ApplicationRecord
  belongs_to :project_owner, class_name: 'User::ProjectOwner', foreign_key: 'user_id'
end

class Contribution < ApplicationRecord
  belongs_to :project
  belongs_to :donor, class_name: 'User::Donor', foreign_key: 'user_id'
end
# ...open a rails console...
project_owner = User::ProjectOwner.create
# => User::ProjectOwner(id: 1)
project = Project.create(project_owner: project_owner)
# => Project(id: 1, user_id: 1)
donor = User::Donor::Natural.create
# => User::Donor::Natural(id: 1)
contribution = Contribution.create(donor: donor, project: project, amount: 100)
# => Contribution(id: 1, user_id: 1, project_id: 1, amount: 100)
# ...CLOSE the current rails console...
# ...OPEN a NEW rails console...
Contribution.last.donor
Contribution Load (0.5ms) SELECT "contributions".* FROM "contributions" ORDER BY "contributions"."id" DESC LIMIT $1 [["LIMIT", 1]]
User::Donor Load (0.3ms) SELECT "users".* FROM "users" WHERE "users"."type" = $1 AND "users"."id" = $2 LIMIT $3 [["type", "User::Donor"], ["id", 1], ["LIMIT", 1]]
# => nil
See? The SQL query to find the donor associated with the contribution looks for the type User::Donor. Since my donor is a User::Donor::Natural, the record is not found. ActiveRecord isn’t aware that User::Donor::Natural is a subclass of User::Donor in the context of an STI unless I preload it first.
irb(main):001:0> User.all.pluck :id
(0.9ms) SELECT "users"."id" FROM "users"
=> [2, 1]
irb(main):002:0> User.exists?(1)
User Exists? (0.3ms) SELECT 1 AS one FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
=> true
irb(main):003:0> User::Donor.exists?(1)
User::Donor Exists? (0.7ms) SELECT 1 AS one FROM "users" WHERE "users"."type" = $1 AND "users"."id" = $2 LIMIT $3 [["type", "User::Donor"], ["id", 1], ["LIMIT", 1]]
=> false
irb(main):004:0> User::Donor::Natural.exists?(1)
User::Donor::Natural Exists? (1.3ms) SELECT 1 AS one FROM "users" WHERE "users"."type" = $1 AND "users"."id" = $2 LIMIT $3 [["type", "User::Donor::Natural"], ["id", 1], ["LIMIT", 1]]
=> true
irb(main):005:0> User::Donor.exists?(1)
User::Donor Exists? (2.1ms) SELECT 1 AS one FROM "users" WHERE "users"."type" IN ($1, $2) AND "users"."id" = $3 LIMIT $4 [["type", "User::Donor"], ["type", "User::Donor::Natural"], ["id", 1], ["LIMIT", 1]]
=> true
This is not okay to me. I would rather not take the risk of choosing an architecture whose behavior is uncertain because it depends on which classes happen to be loaded.
ActiveRecord could’ve been designed to produce the following SQL statement:
SELECT * FROM users WHERE ("users"."type" = 'User::Donor' OR "users"."type" LIKE 'User::Donor::%') AND "users"."id" = 1
Which would allow me to:
Request User.all and retrieve records of type: User, User::ProjectOwner, User::Donor, User::Donor::Natural, User::Donor::Legal
Request User::Donor.all and retrieve records of type: User::Donor, User::Donor::Natural, User::Donor::Legal, without code preloading
Request User::Donor::Natural.all and retrieve records of type: User::Donor::Natural
Request User::Donor::Legal.all and retrieve records of type: User::Donor::Legal
But it behaves otherwise:
SELECT * FROM users WHERE "users"."type" = 'User::Donor' AND "users"."id" = 1
Only once I have preloaded User::Donor’s subclasses does it start allowing me to request User::Donor.all and retrieve records of type: User::Donor, User::Donor::Natural, User::Donor::Legal.
SELECT * FROM users WHERE "users"."type" IN ($1, $2, $3) AND "users"."id" = 1 [["type", "User::Donor"], ["type", "User::Donor::Natural"], ["type", "User::Donor::Legal"]]
One can put the blame on lazy code loading but I don’t. I agree that inflection and lazy code loading cannot work hand in hand out of the box. Still, since a mid-level model can’t offer predictable, stable behavior, it would be better for AR’s documentation to explicitly discourage nested STIs.
I’d rather not have a feature than one I can’t rely on.
Why does it work fine from the base class of a regular STI and not from a mid-level one?
The answer is found in the source code of ActiveRecord.
When accessing the relation, ActiveRecord adds a type condition if needed:
# https://github.com/rails/rails/blob/6bc7c478ba469ad4b033125d6798d48f36d6be3e/activerecord/lib/active_record/core.rb#L306
def relation
  relation = Relation.create(self)

  if finder_needs_type_condition? && !ignore_default_scope?
    relation.where!(type_condition)
    relation.create_with!(inheritance_column.to_s => sti_name)
  else
    relation
  end
end
To determine whether the type condition is needed, it does a couple of checks regarding the distance between the current class and ActiveRecord::Base as well as the presence of an inheritance column.
# https://github.com/rails/rails/blob/6bc7c478ba469ad4b033125d6798d48f36d6be3e/activerecord/lib/active_record/inheritance.rb#L74
# Returns +true+ if this does not need STI type condition. Returns
# +false+ if STI type condition needs to be applied.
def descends_from_active_record?
  if self == Base
    false
  elsif superclass.abstract_class?
    superclass.descends_from_active_record?
  else
    superclass == Base || !columns_hash.include?(inheritance_column)
  end
end

def finder_needs_type_condition? #:nodoc:
  # This is like this because benchmarking justifies the strange :false stuff
  :true == (@finder_needs_type_condition ||= descends_from_active_record? ? :false : :true)
end
The type condition is built as follows:
# https://github.com/rails/rails/blob/6bc7c478ba469ad4b033125d6798d48f36d6be3e/activerecord/lib/active_record/inheritance.rb#L262
def type_condition(table = arel_table)
  sti_column = arel_attribute(inheritance_column, table)
  sti_names = ([self] + descendants).map(&:sti_name)

  predicate_builder.build(sti_column, sti_names)
end
To sum up:
When requesting from the base class (in my example: User), no type condition is added. Since it’s listing all records of the table, it gives access to all records whose class is or inherits from User. Perfect.
When requesting from a leaf subclass, the exact type must be matched for the record to be found. Logical.
When requesting from a mid-level subclass such as User::Donor (neither the base class User nor a leaf such as User::Donor::Natural), it depends. As expected, records of type User::Donor are loaded. On the other hand, records whose class inherits from User::Donor will be selected only if their class is preloaded.
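These three cases can be reproduced with a few lines of plain Ruby. This is only a sketch, not ActiveRecord’s code: Record stands in for ActiveRecord::Base, needs_type_condition? is a simplified assumption about when the condition is added, the top-level User, Donor and NaturalDonor classes are stand-ins, and a class definition plays the role of the code loader.

```ruby
# Plain-Ruby sketch of the three cases summarized above (no ActiveRecord).
class Record
  def self.direct_subclasses
    @direct_subclasses ||= []
  end

  # Ruby calls this hook whenever a subclass is defined.
  def self.inherited(subclass)
    super
    direct_subclasses << subclass
  end

  # Every subclass defined (that is, loaded) so far, at any depth.
  def self.descendants
    direct_subclasses.flat_map { |klass| [klass] + klass.descendants }
  end

  # Simplification: the STI base class (direct child of Record) lists the whole table.
  def self.needs_type_condition?
    superclass != Record
  end

  # Mirrors the idea of ([self] + descendants).map(&:sti_name).
  def self.type_condition
    return nil unless needs_type_condition?

    names = ([self] + descendants).map { |klass| "'#{klass.name}'" }
    names.one? ? "type = #{names.first}" : "type IN (#{names.join(', ')})"
  end
end

class User < Record; end
class Donor < User; end

BEFORE_LOADING = Donor.type_condition  # NaturalDonor is not defined yet
class NaturalDonor < Donor; end        # simulates the loader reading the file
AFTER_LOADING = Donor.type_condition

puts User.type_condition.inspect       # base class: no condition
puts BEFORE_LOADING                    # mid-level class, leaf not loaded
puts AFTER_LOADING                     # mid-level class, leaf loaded
puts NaturalDonor.type_condition       # leaf class: exact match
```

The mid-level class is the only one whose condition changes between two calls, which is exactly the unpredictability I’m complaining about.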
Is there a workaround?
There always is.
We could consider patching ActiveRecord, making it use LIKE in the SQL query as an alternative condition to the current strict string comparison. Problem: I didn’t run any benchmark but it would likely slow down database reads. Though it’s a working solution, it is probably inefficient, requires a lot of work to patch ActiveRecord and, frankly, we’re not even sure the Rails core team would merge such a patch.
Another workaround would be to override the default scope of User::Donor to make it use a LIKE statement as described above. I’m not a huge fan of default scopes because the day always comes when we need to use .unscope and voilà, it doesn’t work anymore. It’s not a sustainable solution IMO.
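For reference, the LIKE-based condition could be built along these lines. This is a plain-Ruby sketch under assumptions: sti_prefix_condition and matches_sti_prefix? are hypothetical helpers, the base type is a trusted class name (never user input), and backslash is assumed to be the LIKE escape character, as it is by default in PostgreSQL.

```ruby
# Builds the "exact type or any nested subtype" condition discussed above.
# Hypothetical helper: base_type must be a trusted class name, not user input.
def sti_prefix_condition(base_type, column: '"users"."type"')
  # Escape LIKE wildcards so "_" and "%" in a name are matched literally.
  escaped = base_type.gsub(/[\\%_]/) { |char| "\\#{char}" }
  "(#{column} = '#{base_type}' OR #{column} LIKE '#{escaped}::%')"
end

# Ruby-side equivalent of the same rule, handy for reasoning about matches.
def matches_sti_prefix?(stored_type, base_type)
  stored_type == base_type || stored_type.start_with?("#{base_type}::")
end

puts sti_prefix_condition("User::Donor")
puts matches_sti_prefix?("User::Donor::Natural", "User::Donor")
puts matches_sti_prefix?("User::ProjectOwner", "User::Donor")
```

In a Rails app the returned fragment could presumably be handed to where (for instance User.where(sti_prefix_condition("User::Donor"))), but I haven’t benchmarked or battle-tested that.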
Yet another solution could be to preload subclasses, for instance with the solution discussed earlier. I guess it’s an acceptable one.
One more solution is to roll back to a simpler architecture that leaves no room for behavior changes: no mid-level subclasses, no preloading required. How do I not repeat myself for the common code shared by User::Donor::Natural and User::Donor::Legal, you ask?
Using concerns.
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
end

class User < ApplicationRecord
  scope :donors, -> { where(type: ['User::DonorNatural', 'User::DonorLegal']) }
  scope :project_owners, -> { where(type: 'User::ProjectOwner') }
end

class User::ProjectOwner < User
end

class User::DonorNatural < User
  include User::DonorConcern
end

class User::DonorLegal < User
  include User::DonorConcern
end

module User::DonorConcern
  extend ActiveSupport::Concern

  included do
    has_many :contributions, foreign_key: 'user_id', inverse_of: :donor
  end
end

class Project < ApplicationRecord
  belongs_to :project_owner, class_name: 'User::ProjectOwner', foreign_key: 'user_id'
end

class Contribution < ApplicationRecord
  belongs_to :project
  belongs_to :donor, class_name: 'User', foreign_key: 'user_id', inverse_of: :contributions
end
This code is intentionally oversimplified (no validations whatsoever) to make this article easier to read, so there is still room for improvement. My goal is to give you the essential information so that you can choose your own favorite solution in an informed way.
My favorite solutions
When possible, I’d rather have a simpler architecture (no intermediate layers). The less complex it is, the fewer headaches I have.
When I must have this intermediate layer, I’ll preload all subclasses of my STI to avoid any behavior randomness. And I mean all subclasses of my STI, not just the ones having records in the database.
module UserStiPreloadConcern
  unless Rails.application.config.eager_load
    extend ActiveSupport::Concern

    included do
      cattr_accessor :preloaded, instance_accessor: false
    end

    class_methods do
      def descendants
        preload_sti unless preloaded
        super
      end

      def preload_sti
        user_subclasses = [
          "User::ProjectOwner",
          "User::Donor",
          "User::Donor::Natural",
          "User::Donor::Legal"
        ]

        user_subclasses.each do |type|
          type.constantize
        end

        self.preloaded = true
      end
    end
  end
end
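The price of a hard-coded list is that it can drift away from the actual classes. A small check like the following could catch that, for example in a test. It is a plain-Ruby sketch: invalid_sti_names is a hypothetical helper and the Member hierarchy is a stand-in for the real models.

```ruby
# Returns the names from the list that do not resolve to a subclass of base.
def invalid_sti_names(base, names)
  names.reject do |name|
    klass = Object.const_get(name) # raises NameError when the constant is missing
    klass.is_a?(Class) && klass < base
  rescue NameError
    false # unknown constant: keep the name in the "invalid" result
  end
end

# Stand-in hierarchy for the demonstration (plain Ruby, no ActiveRecord).
class Member; end
class Member::Donor < Member; end
class Member::Donor::Natural < Member::Donor; end

listed = %w[Member::Donor Member::Donor::Natural Member::Donor::Legal]
p invalid_sti_names(Member, listed) # Member::Donor::Legal was never defined
```

An empty result means every listed name resolves to a class of the hierarchy. It does not tell you whether a subclass is missing from the list, so it’s a partial safety net at best.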
Thanks for reading!