Contrary to popular belief, I don’t think you should test as much as possible. I stand for smarter tests, not more numerous ones. But what makes tests smarter?

Of course, it depends on the system under test, but there are some rules of thumb. One of them being…

Test behavior, not implementation

In other words, tests should assert on the output, not on the way it is produced. Treat the thing under test as a black box.

Let’s nail this down on different levels of abstraction, starting with methods.

On the method level

You should give an input and test an output. Nothing more.

Take an example. Let’s test this method:

def nl2br(str)
  str.gsub(/\n/, '<br>')
end

Surprisingly often, I see people wanting to test it like this:

let(:input) { "line 1\nline 2\n" }
expect(nl2br(input)).to eq(input.gsub(/\n/, '<br>'))

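Some or all of the implementation is copied into the test code. There is some reasoning behind this, but it is generally a bad idea: the expected value is computed by the same logic that is under test, so the test passes even when that logic is wrong.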

Instead, I recommend showing explicitly what the resulting string looks like, without any further method calls. Like this:

expect(nl2br(input)).to eq('line 1<br>line 2<br>')

For more complex methods, tests will very rarely remain that simple. This is OK, as long as you don’t put the actual implementation into the test code.
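As a rough sketch of how this can scale to more cases: list explicit input/output pairs and compare against literal results. The extra cases (empty string, no newline, consecutive newlines) are my own illustration, and the check is written in plain Ruby so it runs without RSpec; in a spec, each pair would become its own expect(...).to eq(...).

```ruby
def nl2br(str)
  str.gsub(/\n/, '<br>')
end

# Explicit input => expected output pairs.
# No production logic is repeated on the "expected" side.
examples = {
  "line 1\nline 2\n" => 'line 1<br>line 2<br>',
  ''                 => '',
  'no newline'       => 'no newline',
  "a\n\nb"           => 'a<br><br>b'
}

examples.each do |input, expected|
  actual = nl2br(input)
  raise "nl2br(#{input.inspect}) returned #{actual.inspect}" unless actual == expected
end
```

Every expected value is a literal string, so a reader can see what the method does without running anything in their head.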

Let’s discuss why.

Two (and a half) reasons why you shouldn’t test implementation:

  1. Cementing the code
  2. Worse documentation
  3. (Worse design)

Cementing the code

Too much implementation in the tests leads to rigid code. When refactoring, developers end up typing every change twice: once in the production code and once in the tests. Those are two downsides of writing tests that many people complain about. They are completely valid points, but if you find yourself in that position, you are probably testing too much implementation. You want your tests to support your refactoring, not restrain you from it.

What’s more, you want to be able to refactor at any given point without changing tests and implementation at the same time. You should only alter one at a time. One supports and validates the other.

In short – testing behavior allows you to quickly refactor with the assurance that you didn’t break anything.
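To make this concrete, here is a minimal sketch with two interchangeable implementations. The split/join variant and the names nl2br_gsub, nl2br_split and behaves_like_nl2br? are hypothetical, invented only for this illustration. The same behavior check passes for both without being touched.

```ruby
# Original implementation from the article.
def nl2br_gsub(str)
  str.gsub(/\n/, '<br>')
end

# Hypothetical refactoring: same behavior, different internals.
# The -1 limit keeps trailing empty strings, so a final "\n" is preserved.
def nl2br_split(str)
  str.split("\n", -1).join('<br>')
end

# The behavior check knows only the input and the literal output...
def behaves_like_nl2br?(implementation)
  implementation.call("line 1\nline 2\n") == 'line 1<br>line 2<br>'
end

# ...so it holds for both versions without being edited.
[method(:nl2br_gsub), method(:nl2br_split)].each do |impl|
  raise "#{impl.name} changed behavior" unless behaves_like_nl2br?(impl)
end
```

Swapping one implementation for the other changes production code only; the check stays as it is and confirms the swap was safe.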

Worse documentation

Tests document the code. That is one of the easily forgotten or overlooked objectives.

How do you like your documentation? “Given X, the method produces… um… I don’t know, read the source code.” That doesn’t sound right. Some kind of elaboration instead of a particular output is fine, but not at such a low level.

Rather, you want to triangulate the reader’s understanding of the code. If someone doesn’t understand the source code, the tests give them another chance.

I like to think of a unit test as a living equivalent of code comments that explain how the code works:

# Examples:
#   issue.workflow_rule_by_attribute # => {'due_date' => 'required', 'start_date' => 'readonly'}

In that case, you probably want to see the output of the method, not the guts.

Worse design

There’s also the premise of TDD: that it leads to better design. That is because you don’t think ahead and instead focus on the present situation. You separate the phase of thinking (input-output) from the phase of implementation. In both phases, you should take the smallest steps possible.

I’m not one to advocate this idea, but it’s noteworthy, and it argues against testing implementation.

On the class level

For classes, the “test behavior, not implementation” rule means testing a class via its public interface.

The gain is the same:

  1. You are free to refactor at any given point in time
  2. You have better documentation of how to use a class

Therefore we should avoid testing private methods directly.
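A minimal sketch of what that means in practice. The Paragraph class and its escape helper are hypothetical, made up for this illustration. The private helpers are exercised only through the public to_html method, and the check is plain Ruby so it runs without RSpec.

```ruby
# Hypothetical class, used only for illustration.
class Paragraph
  def initialize(text)
    @text = text
  end

  # Public interface: the only thing a test should call.
  def to_html
    "<p>#{nl2br(escape(@text))}</p>"
  end

  private

  # Implementation details, free to be renamed, merged or extracted later.
  def escape(str)
    str.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
  end

  def nl2br(str)
    str.gsub(/\n/, '<br>')
  end
end

# Both private helpers are covered through the public method alone.
html = Paragraph.new("1 < 2\nok").to_html
raise "unexpected output: #{html}" unless html == '<p>1 &lt; 2<br>ok</p>'
```

If escape and nl2br were later merged into one method or moved to another object, this check would not need to change.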

Of course, some classes use dependencies, so their tests will be more complicated than input and output. You will probably want to mock something from time to time. No worries: I still consider the way a class communicates with the external world part of its “public interface”.
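Here is a sketch of that idea with a hand-rolled test double instead of a mocking library. SignupNotifier, FakeMailer and the deliver(to:, subject:) signature are all assumptions invented for this example. The check looks only at the message the class sends to its collaborator, not at how it builds that message.

```ruby
# Hypothetical class with an injected dependency.
class SignupNotifier
  def initialize(mailer)
    @mailer = mailer
  end

  def welcome(email)
    @mailer.deliver(to: email, subject: 'Welcome!')
  end
end

# Hand-rolled test double that records the messages it receives.
class FakeMailer
  attr_reader :deliveries

  def initialize
    @deliveries = []
  end

  def deliver(to:, subject:)
    @deliveries << { to: to, subject: subject }
  end
end

mailer = FakeMailer.new
SignupNotifier.new(mailer).welcome('ann@example.com')

# Assert on the outgoing message, which is part of the class's interface.
expected = [{ to: 'ann@example.com', subject: 'Welcome!' }]
raise "unexpected deliveries: #{mailer.deliveries}" unless mailer.deliveries == expected
```

With RSpec, the same intent is usually expressed with a double and expect(mailer).to receive(:deliver).with(...), but the point is the same: the outgoing call is observable behavior.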

On the system level

On the level of the whole system, almost everything seems like testing implementation. A test case is either end-to-end, or it examines implementation.

If it were possible, I would end-to-end test every possible case of the app’s usage. But it’s simply too expensive. The only reason to test individual components rather than the app as a whole is that end-to-end testing is expensive in general.

What should be the granularity of the units we test? In Rails, you could stop at testing every controller and model, as there is no app without controllers and models. In-between layers like decorators, service objects, etc. can be seen as implementation details and therefore tested indirectly. In reality, covering every branch of a decorator through, e.g., a model test takes too much orchestration.

Anyway, I would suggest starting with high-level tests. As the app grows, locate the units that concentrate more complicated logic and unit test them. This is somewhat related to the ATDD approach.


This is the first article on what you should not test, and probably not the last one. As I said at the beginning, I care about better tests, not more tests in general. As programmers, we should treat code as our enemy. Let’s have as little of it as possible.

EDIT (2018-07-20): I’ve discovered a nice blog post about tautological tests. It explains the topic even further.

EDIT (2018-09-26): This talk is excellent in why you shouldn’t be testing implementation.