August 9, 2011

ActiveModel::Validations and Command/Query Separation

Command/Query Separation is a principle that states that “every method should either be a command that performs an action, or a query that returns data to the caller, but not both.” An easy way to think of this is that any method that modifies state should return null, and any method that returns a value shouldn’t modify state. (However, as an exception, I find it helpful for some methods that modify state to return a value indicating the result of the modification.)

Rails doesn’t adhere to Command/Query Separation in ActiveModel::Validations, and as a consequence, the behavior is unintuitive.

Consider a simple example of a Person model that is required to have a first_name attribute.

How it Works Now

This is how it works in Rails:

class Person
  validates_presence_of :first_name
end

p = Person.new

p.first_name = ""

p.errors
#=> none

p.valid?
#=> false

p.errors
#=> first name can't be blank

p.first_name = "Dan"

p.errors
#=> first name can't be blank

p.valid?
#=> true

p.errors
#=> empty

Intuitive Behavior

This is how I would expect the object to behave:

p = Person.new

p.first_name = ""

p.errors
#=> first name can't be blank

p.valid?
#=> false

p.errors
#=> first name can't be blank

p.first_name = "Dan"

p.errors
#=> empty

p.valid?
#=> true

p.errors
#=> empty

Deciding Between Command and Query

The unintuitive behavior is caused by the valid? method having a the side effect of modifying the errors state. It’s both a query (is this object valid?) and a command (run the validations). Eliminating the side effect would lead towards more intuitive behavior. So which should this be: a command or a query? I’ll show examples of both.

The Current Implementation

The current implementation in ActiveModel is:

def valid?
  errors.clear
  run_validations
  errors.empty?
end

(I simplified the code a little bit to remove lines that aren’t relevant to this topic.)

Making it a Query

We can make validations a query by not storing any state about the errors on the object, which will eliminate the side effect. We would need to run validations when the errors method is called instead of when valid? is called.

def errors
  run_validations # returns errors
end

def valid?
  errors.empty?
end

Of course, this would cause a performance problem if there are multiple calls to errors. Using the object like this would run the validations twice:

p = Person.new
puts "total errors: #{p.errors.size}"
puts "error on first name? #{p.errors.on(:first_name)}"

You could memoize it, but that may be difficult as you would need a way to know any time state changes on the object causing validations to need to be run again.

Making it a Command

A cleaner implementation is to replace valid? and errors with a validate method that returns the errors but does not store the errors on the object itself. The caller will need to assign the errors to a local variable to make multiple inquiries on the errors.

def validate
  run_validations_and_return_errors
end

p = Person.new
errors = p.validate
puts "total errors: #{errors.size}"
puts "error on first name? #{errors.on(:first_name)}"

Ambiguity of Command/Query

In this example of a validate method that returns an errors object, is validate a command or a query? There’s still disagreement between the naming and the behavior. Validate sounds like a command method (it’s a verb), but typically when you tell an object to do something (i.e. call a command method), it modifies some state. So that makes validate behave more like a query method that queries for errors. Unfortunately if you name it as a query method (e.g. ‘def errors’), the naming doesn’t give any indication that the query is expensive, and it would be easy to make repeat calls to object.errors without realizing that an expensive computation is happening on every call.

Therefore, I think the best solution is to name the method validate to guide the user of the class to the correct usage. Although it would be easy to do this…

p = Person.new
p.errors.size
p.errors.on(:first_name)

… and not realize that each call to errors is running the validations, I think it would be unlikely for somebody to do this without realizing it:

p = Person.new
p.validate.size
p.validate.on(:first_name)

Fixing this in Rails

I’d love to submit a patch if the core team is willing to change this. Of course, it would break backwards compatibility, so if Rails is using semantic versioning it may have to wait until Rails 4.

Reviewers

Thanks to Braintree devs Phinze, Tony, and Hammer for giving me feedback on a draft of this post.