A Kautionary Tale about Mixins
Ruby is my favorite programming language. It is hands down the most expressive object-oriented language I’ve ever used. Matz’s concept of least surprise gives the language fantastic developer ergonomics. And, the Ruby community's mantra of "Matz is nice, so we are nice," (MINSWAN) really bears true. If I had to choose one programming language to use and one community to be a part of for the rest of my career, it would be Ruby.
Ruby is a very sharp tool. It’s not for everyone. Like a chef’s knife, in the hands of an expert, Ruby feels like an extension of the hand. And, like a chef’s knife, in the hands of a novice, it can inflict a whole lot of pain. That's not to say that Ruby is for experts only — just that you want to watch your fingers.
Meta-programming is one of those sharp edges. Novice meta-programmers often underestimate how hard it will be to debug code that doesn’t exist on disk. And, it is really hard. In fact, it’s kind of a right of passage in Ruby to write some meta-programmed code that you come back to later and throw away because you cannot understand it yourself.
But, for my money, module mixins are the largest trap that novice object-oriented programmers fall into. Every legacy Ruby application I’ve seen has had significant complexity problems caused by mixins.
A Kautionary Tale
So, I’d like to share a cautionary tale of mixins gone wrong. The names are changed, but the problems are drawn directly from production applications I’ve personally worked on. Consider yourself warned!
Our story begins with a now defunct e-commerce company named Koupon — the Internet’s first, and only discount outlet for Koncert tickets (with a "K").
It started as a small Rails application. In the beginning, their controller stack looked like this:
There was a single
KoncertsController that inherited from
ApplicationController, with a
show method to set the instance variables for the Koncert show view.
Koupon offered such good deals that they rapidly expanded from Koncerts into Komedy.
With the advent of a second controller, some duplication entered the application. So, an enterprising engineer dried up the code and introduced an abstract controller parent class.
Now, before you go there. I know. Lots of folks would just put it in the
ApplicationController. Neither approach has a good outcome. So, I'm going with the
AbstractKoncertsController. This is where they put a method to set the
@price instance variable.
But, look what happened. The
AbstractKoncertsController is now relying on an instance variable (`@koncert`) that is set in the child controllers.
So, there’s now an implicit dependency from the
AbstractKoncertsController down to its children.
No one noticed back then. But, I know now that those dotted red lines were sirens screaming for us to pay attention. They represented an inversion of the dependencies between the parent and its children. And, that's not what's meant by Inverson of Control. (But, that's a topic for another post.)
Over time, the pricing algorithm got significantly more complex. Eventually, it came to be the most prominent feature of the
AbstractKoncertController. So, an enterprising engineer decided to simplify the controller by moving the pricing code into a separate
This, of course, moved the implicit dependencies on the instance variables over to the module.
Again, no one noticed at the time, but I know now that moving those dotted lines out of the inheritance hierarchy should have been another flashing red light. In order to find where the
@price variable is set, you have to follow the inheritance hierarchy and all of the included modules.
By this point, Koupon Koncerts and Koupon Komedy had proved so successful that people started traveling to more shows. This presented Koupon with an chance to diversify. So, they branched out into Kondo rentals and engineering got busy building out the Kondos controller.
Naturally, they wanted to leverage their existing
So, they included the module in the new controller.
And, with multiple controllers now including the module, it made sense to promote the
include statement into
But, that didn’t work because the new controller used a different model. Koncerts and Komedy managed to get by with single table inheritance. But, Kondos were a different beast. In fact, the pricing algorithms were completely different.
So, an enterprising engineer added the new algorithm to the
set_price method in the
Pricing module and checked to see which instance variable was set in order to choose an algorithm.
Naturally, this meant that, in addition to the other instance variables, the module was now taking an implicit dependency on the
@kondo instance variable in the
So, there we were, eight classes (if you count the models) and one module into this codebase and we had nearly as many problems as there are circles on this diagram.
set_pricemethod was a mystery guest, meaning that it looked like a local method, but it wasn’t in the local file or even in the inheritance hierarchy, making it very hard to discover. You had to grep for it to find it.
@pricevariable never appeared in any of the controllers, yet it was an instance variable in all of them. This meant that both the module and the views needed the variable, but the controllers never mentioned it. So, someone refactoring the
set_pricemethod might have misunderstood its scope and made it a local variable, thus breaking all of the views.
Pricingmodule took implicit dependencies on instance variables in all three concrete controllers, meaning that someone refactoring one of the controllers might break the module without ever knowing it.
- We put the two pricing algorithms side-by-side and added a conditional. Naturally, when we expanded into another offering, that just encouraged us to add more conditional logic in that method.
- Testing the
set_pricemethod required all kinds of dark RSpec arts. Because the method requires an instance variable be present, RSpec needed to set it. And, because RSpec was setting the instance variable, the removal of the instance variable in the controller would not cause the test to fail.
- And, because the method returns nil, it's not testable using the return value. You had to check the value of the
- FInally, by this point, we'd managed to completely abuse the letter K, much to the chagrin of the eminently more flexible letter C.
How did we get there?
So, how did this happen? Well, it was primarily good intentions and bad decisions.
The engineer who dried up the code was doing so out of the desire to not repeat themself. The engineer who originally moved the pricing code into a module was doing so out of the desire to simplify the controller. And, the engineer who added the second pricing algorithm to the module was trying to colocate related code to make it cohesive.
They were all good ideas. And…
Those were all sub-optimal implementation choices.
The mixin made the code harder to reason about and harder to change by tightly coupling the
Pricing module to the controllers through non-obvious, implicit dependencies. In other words, there was was an undocumented relationship between the module and the controllers. All the engineers just had to know that in order to use the
Pricing module, you had to first set an instance variable in the controller.
In addition, because the
Pricing module referenced instance variables in the controllers, it was breaking encapsulation. That's the concept that says that only the object itself is allowed to modify its internal state. And, since the module is included in the controller class, it has access to and can modify any of the controller’s instance variables, thus breaking encapsulation.
Plus, co-locating code does not improve its cohesion. Cohesion is about code having only one reason to change. And, in this case, each algorithm had different reasons to change — artist limitations set limits on the discounts Koupon could offer for ticket sales (think Taylor Swift); and zoning laws and HOA rules set limits for Kondos. And, then there’s the conditional, which only served to encourage future engineers to add more algorithms to the
set_price method. That’s three different reasons to change, right there.
How did we fix it?
Well, sadly, Koupon didn’t make it. They couldn’t keep up with the competition due to how hard it was to change their codebase. But, I know now what I would have done.
Here's where we left off:
The first thing I would do would be to move the code that set the
@price instance variable back into the controllers. This makes it obvious that the variable is needed by the view. And, it reduces the implicit dependencies between the module and the controllers.
That's a good first step. But, we still have implicit dependencies on controller instance variables. Next, I would refactor the branches of the conditional into polymorphic pricing classes, one for each algorithm. This puts a testable, enforceable interface around each algorithm, preventing it from breaking encapsulation.
Now, all of the arrows are in the diagram represent explicit dependencies that are declared in code via inheritance or
.new. You no longer need to just know anything about the codebase and you can easily trace your way back to a method just by following the inheritance hierarchy.
So, you’re probably wondering what to do with your module mixins right about now. Should you use them?
The short answer is no.
The long answer is no, period.
Ok. Now that all the novices have left and it’s just us experts, let's get real.
We all know that modules as mixins are a great tool for adding functionality to a class — IF! — if you keep them stateless and only use them to add a single capability to a class, such as
— BUT! —
But, we also recognize that mixins should never reference an instance variable defined in the class that includes it. Nor should instance variables set inside the module be used outside the module. Doing either breaks encapsulation in a way that makes it super easy to introduce a bug during refactoring. So, keep your instance variables private, and keep your privates to yourself.
Why use a whole class for that?
I get this question a lot. Folks seem to think of classes as somehow heavier than modules. But, I've got news for you. They're both objects. And, they both have the same lifecycle.
But, the real answer is twofold: encapsulation and cohesion.
With a class, enapsulation is guaranteed. No one else can modify my internal state (sans meta-programming or mixins). With a module, I have direct access to the internals of any class that might include me. And, those classes have access to things I might define. At that point, there is no more encapsulation. This is the largest problem I see with using modules.
The second biggest problem I see is that modules tend to become junk drawers. Even a module like the Pricing module above, where there was a clear intention to only handle pricing concerns. As we saw, there were three different reasons to change that method. And, even if you split the method into two methods and eliminated the conditional, you would still have had two different reasons to change that file (and another for every other algorithm that you added later).
For those reasons — modules are objets, too; modules destroy encapsulation; and modules tend to become churn factory junk drawers — I always prefer using classes over modules.
What about class methods?
I get this a lot, too. People want to know why would I chose to use a class method on a class instead of just using a module class method. My answer is that a) I don't use modules for the reasons stated above; b) I wouldn't use a class method on a class to create a junk drawer, either; and c) I only use class methods to simplify APIs, and those class methods are creating an instance of the class under the hood.
Let me show you some examples. Here's how I use class methods to create a simple API for interacting with a class:
class Foo def self.bar(baz) self.new(baz).bar end attr_reader :baz def initialize(baz) @baz = baz end def bar puts baz end end Foo.bar("baz") #=> "baz"
And, here's a way I might use a module to create a namespace for an API or façade in front of a set of classes or a gem.
module MyGem def self.bar(baz) Foo.bar(baz) end end MyGem.bar("baz") #=> "baz"
This allows me to treat the underlying Foo class as an internal construct to the gem (even though, yes, this is Ruby and there really is no such thing as private). Note though, that's not a mixin. It's just a namespace. I would not include that module anywhere.
Modules as mixins are a very sharp tool that, in the hands of the wrong engineer, can lead to taking implicit dependencies between modules and classes in ways that make refactoring incredibly dangerous and testing very difficult and prone to false positives.
Do I use modules as mixins? Yes. But, I only do so in framework code so that I don't poop all over someone's inheritance hierarchy. And, I follow all the rules I mentioned above.
- Don't take implicit dependencies on instance variable in the class that includes the module.
- Don't take implicit dependencies on instace variables in a module from the classes that include it.
- Pay attention to cohesion. If two things can change for different reasons, they belong in two separate classes.
The Manufacturable gem that Fito von Zastrow and I wrote is a good example of how I use modules as mixins.
I hope you enjoyed this article. And, I hope that you learned something. If so, I'd love for you to subscribe to my occassional newsletter where you can get these articles delivered direclty to your inbox for free!