There are many tutorials on how to do unit testing or test driven development for new code. Nevertheless, it can be difficult to get started with an existing code base that has no or just a few tests (“legacy code”). The code was not written with testability in mind, so writing tests is usually harder. I want to show you how to overcome some of the obstacles. I’ll use a real-world example from an open source project.
Why adding tests to legacy code is difficult
One reason is that it takes an investment in time. Usually you come across an existing class wanting to make some change, maybe it’s just a small tweak. You read the code and figure out where to put the change. Now you know that you should write tests before making that change. But for your small change the amount of work to get reasonable test coverage seems just too much.
You make that small change now and promise to write the tests sometime later. As time goes by, the code gets more complicated and writing the tests becomes more and more difficult.
Another reason is that legacy code is not structured for testability. You need to refactor it in order to make it testable. However, refactoring without tests is risky and takes more time because you must be very careful not to break things. This is the testing dilemma: in order to make it testable, you want to refactor, but you’re afraid to refactor without tests.
There are ways to resolve the dilemma, the classic book is Working Effectively with Legacy Code by Michael Feathers, 2004. This book is full of techniques to help you break dependencies that hinder unit testing, and to do so in a way that lets you be confident that you did not accidentally break things.
The example
Recently I came across Userbin, a service that makes it easy to incorporate two-factor authentication into your web application. They have an open-source Ruby client that is rather small, so I thought this would be a good example to show some techniques for adding tests to existing code.
Let’s start with a little context.
Userbin is a cloud service that adds multi-factor authentication, user activity monitoring, and real-time threat protection to your current authentication setup with minimal amount of configuration.
The Userbin ruby client lets you connect your application to the Userbin service API. You application can then require two-factor authentication for critical operations, such as money transfers.
When I looked at the Github page userbin/userbin-ruby, I saw 73% test coverage. Not bad already (even though coveralls.io gives it a red color).
Our goal is to increase test coverage and to do some refactoring once the code has unit tests.
Make it run
First step: run the unit tests in your own environment. I forked the repo and ran tests using the default Rake task:
rake
The detailed test coverage report is generated automatically, it can be seen in coverage/index.html
We’ll focus on the file client.rb
which has the lowest test coverage with 44.64%,
it has also the largest number of missed lines (31 of 56 lines of code).
I had never seen this code before, so I tried to figure things out as I went.
Here is the file on Github: lib/userbin/client.rb.
As you can see in the detailed coverage report,
essential methods like authorize
are not covered by tests.
I chose to start with something simple to learn my way around the library.
As a first step I concentrate on a group of short methods:
mfa_enabled?
, device_trusted?
, mfa_in_progress?
, mfa_required?
and has_default_pairing?
.
Here is one of them, mfa_enabled?
in
line 110:
So if there is a session token in the store then we ask it if MFA (multifactor authentication) is enabled.
The method returns false
if there is no session token.
Clearly there are two cases: with and without a session token. The Readme file says “You should call login as soon as the user has logged in to your application to start the Userbin session.” Example application code:
Here userbin
is the global Userbin client.
You can expect that the client will behave completely differently based on the existence of a session token:
- With a token, the session is active and some operations can be allowed or denied by the Userbin service.
- Without a token, there is no session, so everything should be forbidden
For each of these, you’ll have several examples that verify the expected behavior. This is the typical situation where you will group your examples by a context.
Use a context in RSpec to group examples that have a common precondition, such as being logged in.
The first test
Let’s start with the simpler context: we have no session token yet.
Create a new file spec/client_spec.rb
and add the first test:
I am deliberately ignoring the fact that Userbin::Client.new
needs some arguments.
Instead, I want to run a test, any test, as soon as possible:
Still being impatient, I tried to get away with two nil arguments, still got errors. By looking at the code for initialize , we find that at least the request object has to be somewhat usable:
Now instead of hunting for the proper request class and a legal way to instantiate it, we provide a double that is just enough to take us through the initialization:
Let’s take a minute and reflect: the request object is given to the client from the outside, we need it to create a client, but it is not something we want to test here. We may or may not have to learn more about the details of the request object later.
Use test doubles as substitutes for objects at the boundaries of the class under test.
Normally, we would refactor the setup code for dummy_request
and client
into a before(:each)
block
that would be reused across the examples for the similar methods device_trusted?
, mfa_in_progress?
etc.
We’ll go one step further here and use RSpec’s syntactic sugar to make the example even shorter and more readable.
The test subject (i.e. the thing we are testing here) is defined in the subject
block.
You can also access it as subject
from inside a test example.
The is_expected
method knows how to find the test subject, so we don’t have to mention the subject explicitely.
And be_mfa_enabled
is another piece of syntactic sugar that connects to the mfa_enabled?
method.
The use of subject as well as many more RSpec best practices is explained nicely in betterspecs.org.
The downside of the compact syntax is that a first-time reader of your test
will wonder where things like be_mfa_enabled
are defined.
Another stumbling block for you as a programmer is remembering (or looking up) where the spaces and underscores go when you are writing the tests.
Anyway, the real usefulness of the compact syntax shows when we add examples for the other methods:
Now this is somewhat close to a textual specification: you want all kinds of things to be false when there is no session token. In this form, it would be easy to discuss the specification with a coworker or a customer.
Let’s move on to the more interesting parts of the client.
For example, what should happen if someone tries to authorize with Userbin but has no session yet?
Looking into the code for authorize!
we find:
Let’s add a test for it:
The general pattern when testing for errors is
expect { some action }.to( raise_error(error class, error message) )
.
You can find a compact overview of RSpec’s expectation syntax in the
RSpec Cheatsheet.
A similar error is raised for the trust_device
method when there is no session token yet.
The test runs and produces a nice textual output:
Userbin::Client
without session token
should not be authorized
should not be mfa enabled
should not be device trusted
should not be mfa in progress
should not be mfa required
should not have default pairing
#trust_device
should raise Userbin::UserUnauthorizedError with message matching /Need to call login before trusting device/
#authorize!
should raise Userbin::UserUnauthorizedError with message matching /Need to call login before authorize/
Our process here is figuring out the specification by looking at the code. Then we could review the specifications (our RSpec examples) and check if every relevant case was covered, and, more importantly, if these specifications match the original requirements. Furthermore, the RSpec files give us set of executable specifications that we can use later to detect regressions, for example during refactoring.
Problems when running the test suite
When running the whole test suite using rake
we encounter the next surprise:
it says our request double was “leaked into another example”:
1) Userbin::User retrieves a user
Failure/Error: user = Userbin::User.find('9RA2j3cYDxt8gefQUduKnxUxRRGy6Rz4')
Double "request" was originally created in one example but has leaked
into another example and can no longer be used.
rspec-mocks' doubles are designed to only last for one example,
and you need to create a new one in each example you wish to use it for.
# ./spec/models/user_spec.rb:6:in `block (3 levels) in <top (required)>'
# ./spec/models/user_spec.rb:5:in `block (2 levels) in <top (required)>'
Altogether, there are three such errors. Test doubles (or mocks) are meant to be thrown away after the example is finished. This makes sense when you consider that mock objects return predefined answers and record the messages they were sent. So a double from one example should never show up in another one because answers or messages may be mixed up.
What happened here?
The name “request” points us to our own double that was used in client_spec.rb
(a case in point for naming your doubles).
Lines 5 and 6 in spec/models/user_spec.rb
look innocent:
It is not obvious how our double from client_spec.rb
is relevant here.
In order to investigate further, we switch on the backtrace output when running RSpec:
rspec --backtrace --fail-fast
Additionally, use the --fail-fast
switch to abort as soon as the first test fails.
The backtrace shows a little more (some lines omitted):
... several lines in rspec/mocks ...
# /Users/berth/.rbenv/versions/2.0.0-p353/lib/ruby/gems/2.0.0/gems/rspec-mocks-3.1.3/lib/rspec/mocks/test_double.rb:74:in `method_missing'
# /Users/berth/.rbenv/versions/2.0.0-p353/lib/ruby/gems/2.0.0/gems/rspec-mocks-3.1.3/lib/rspec/mocks/proxy.rb:182:in `message_received'
# /Users/berth/.rbenv/versions/2.0.0-p353/lib/ruby/gems/2.0.0/gems/rspec-mocks-3.1.3/lib/rspec/mocks/method_double.rb:73:in `proxy_method_invoked'
# /Users/berth/.rbenv/versions/2.0.0-p353/lib/ruby/gems/2.0.0/gems/rspec-mocks-3.1.3/lib/rspec/mocks/method_double.rb:60:in `block (2 levels) in define_proxy_method'
# ./lib/userbin/support/cookie_store.rb:10:in `[]'
# ./lib/userbin/token_store.rb:8:in `session_token'
# ./lib/userbin/client.rb:40:in `session_token'
# ./lib/userbin/request.rb:85:in `call'
# ./lib/userbin/request.rb:113:in `call'
# ./lib/userbin/request.rb:72:in `call'
# ./lib/userbin/request.rb:50:in `call'
# ./lib/userbin/request.rb:40:in `call'
# /Users/berth/.rbenv/versions/2.0.0-p353/lib/ruby/gems/2.0.0/gems/faraday-0.9.0/lib/faraday/rack_builder.rb:139:in `build_response'
... several lines in faraday ...
# /Users/berth/.rbenv/versions/2.0.0-p353/lib/ruby/gems/2.0.0/gems/her-0.7.2/lib/her/api.rb:94:in `request'
... several lines in her ...
# ./spec/models/user_spec.rb:6:in `block (3 levels) in <top (required)>'
# /Users/berth/.rbenv/versions/2.0.0-p353/lib/ruby/gems/2.0.0/gems/vcr-2.9.3/lib/vcr/util/variable_args_block_caller.rb:9:in `call'
... several lines in vcr ...
# ./spec/models/user_spec.rb:5:in `block (2 levels) in <top (required)>'
# /Users/berth/.rbenv/versions/2.0.0-p353/lib/ruby/gems/2.0.0/gems/rspec-core-3.1.7/lib/rspec/core/example.rb:152:in `instance_exec'
... several lines in rspec/core ...
# /Users/berth/.rbenv/versions/2.0.0-p353/lib/ruby/gems/2.0.0/gems/rspec-core-3.1.7/exe/rspec:4:in `<top (required)>'
# /Users/berth/.rbenv/versions/2.0.0-p353/bin/rspec:23:in `load'
# /Users/berth/.rbenv/versions/2.0.0-p353/bin/rspec:23:in `<main>'
So there is a lead: somehow the session_token
method in client.rb
is executed, that is an overlap with our previous test.
And finally, in ./lib/userbin/support/cookie_store.rb:10
we have an access to a @request
object:
@request.cookies[key]
How did our test double end up there?
Before we investigate that, let’s try to get an idea about what else goes on in the backtrace. We see some interesting libraries being used:
- Faraday is a HTTP client library
- Her is an ORM (Object Relational Mapper) that maps REST resources to Ruby objects
- VCR lets you record your test suite’s HTTP interactions and replay them during future test runs
Interpretation: Faraday is used to talk to Userbin’s API, Her wraps the API into a nice object model and VCR is used to provide canned responses to HTTP calls during unit tests.
We don’t have to chase for long, the first line in
Userbin::Client#initialize
says it all,
with a nice comment:
That means RequestStore.store[:userbin]
is like a global variable that holds on to the
Client object that we prepared for the test.
It was very kind of RSpec to warn us that this may not be what we expect in another test.
The fix is to remove the Client instance when we’re done with our tests in client_spec.rb
.
The after(:all)
hook is executed after all the examples in the context are finished.
You can use it as a cleanup procedure to release all kinds of resources as necessary.
Now the whole test suite works. Here is the whole file:
See the file client_spec.rb
on GitHub.
The tests are on the branch on Github: rspec-for-userbin-client
Summary
What did we accomplish? At first it doesn’t look like much: we are just testing one context (no session token) and only the more simple stuff. However:
- We started a spec file for Userbin::Client. The next developer will find it easier to extend this than to start from scratch.
- We overcame the obstacles in creating a client object and in leaking state across tests. Again a problem that the next developer will not have to solve.
- Test coverage for
client.rb
increased to 62.5% (21 of 56 lines missed), global test coverage is at 76.9%. While you shouldn’t write tests for changing a metric, it is nice to keep score this way.
Difficulties in testing often point to design problems. Here we encountered at least one: global state that makes it hard to isolate tests. The same global state works against the goal of a modular design. Fortunately for us, RSpec gave us the warning about the global state for free. Problems due to state leaking between tests are usually hard to debug. For example, they may depend on the order in which tests are run. Or a test may work standalone but fail when run in a test suite.
It looks like the Userbin client should be shared between requests (or Rack middlewares?) Maybe the storage of the client could be done differently. The Rack middleware architecture would suggest storing it in the environment.
In Part 2 we’ll create tests for the more interesting context: being logged-in to the Userbin service.