Classic: JUnit 3 on one page

Would you find a version of this for JUnit 4 useful?

1 year ago
Comments

Virtual Code Retreat session with Jon Kern

1 year ago
Comments

Mocks aren’t stubs, the remix

I know that Martin Fowler has written about this extensively, but I wanted to write a concise version, since this point seems to bear reiterating from time to time.

Mocks aren’t stubs.

Specifically, I recommend against setting an expectation on a method (“mocking” it) when you really want to simulate a specific response from that method for the current testing (“stubbing” it). That said, I must confess than I used mocks in place of stubs early in my test-driven development practice, and I believe that all programmers who try the so-called “mockist” style of design eventually use a mock where a stub would work better. If you’ve done this, or are doing this now, then don’t feel shame. Instead, consider changing your practice.

When one uses a mock in place of a stub, one tends to over-specify the collaboration between a Client and its Suppliers, and that leads to brittle tests. These tests tend to make hyperactive assertions, meaning assertions that fail even when the underlying production code works as expected. Such hyperactive assertions play the role of the boy who cried “wolf!”, distracting us with needless concern. Even when these superfluous mocks don’t over-specify a collaboration, they often duplicate specifications in other tests. Even when they don’t do that, they tend to encourage us to make two assertions (method expectation and checking the ModelAndView) or check irrelevant details multiple times, such as:

Writing one such test doesn’t seem to matter much, but when the method in question returns a collection, and you write the “zero, one, many, lots, oops” tests, you might mock that method exactly the same way five times. How many times do you need to check the same thing?

Now please go read, or re-read, Martin’s excellent article.

1 year ago
Comments

Simple advice for simpler tests

I have long preferred to start writing tests by writing the assertion. I use this technique in my classes, and I teach it as a Novice technique, but I have found the technique useful long past the Novice stage. I use a handful of guidelines to help me keep my tests clean.

  • Focus on what you want to check.
  • If you want to check more than one thing, write all those things down, then choose one thing to check.
  • Write the assertion or method expectation that you want to check.
  • Working backwards what you want to check to the action you need to execute, then to the objects you need to create.

I combine these specific guidelines for writing tests with the general guidelines I use to guide my design: the four elements of simple design. I find these guidelines help me write tests with a variety of valuable properties —

  1. The number of failing tests tends to accurately reflect the severity of my last mistake.
  2. The results of one test do not interfere with the results of other tests.
  3. I can easily discern the cause of most test failures.
  4. Localised production code mistakes tend not to raise alarm bells by causing a significant number of tests to fail.
  5. I tend not to need to change old tests when I add new behavior.

I recommend you try applying this technique diligently in your work for at least two weeks, just to see how it benefits you. Please comment with any observations you’ve made about the results, including how you feel about writing tests, differences in the design of your tests, and especially differences in the design of your production code.

1 year ago
Comments

The trouble with refactoring with failing tests

I strongly recommend that especially Novices only refactor when all tests pass. I can offer a straightforward explanation for this strong recommendation.

There are many, many more code bases that fail your tests than pass them.

For the mathematically-minded among you, I strongly believe that the space of “failing code bases” is much larger than the space of “passing code bases”, relative to your tests. I felt like the difference corresponded well to the difference between uncountably infinite and countably infinite, but as one of my commenters points out, the universe of code bases is countable, so that’s not it. I’d like to say “clearly” and move on, but I’d find that intellectually dishonest. I’ll think about how to support this hypothesis, but in the meantime, please suspend any disbelief you have about it.

The result of any change to a failing code base falls into one of these categories:

  1. Another passing code base.
  2. A code base that fails for exactly the same reasons it failed before.
  3. A code base that fails for new reasons, including the reasons it failed before.
  4. A code base that fails for new reasons, excluding the reasons it failed before.

Now consider that result of any change to a passing code base falls into one of these categories:

  1. Another passing code base.
  2. A failing code base, for any variety of reasons.

First, notice the difference in the variety of results: four cases for changing a failing code base, and two for changing a passing code base. Next, notice which of these results you can detect just by looking at the red/green bar. In the case of the failing code base, you’ll only notice 1 of the 4 results. In the case of the passing code base, you’ll notice both results. This last point wins the day for me.

If you prefer pictures, then these illustrate the difference.

When I try to working in a failing code base, it looks like this.

When I work in a passing code base, it looks like this.

Especially when I refactor, I prefer to have as few surprises as I can, and the passing code base seems to offer fewer surprises. Therefore, I remind myself to refactor only when all tests pass, and I recommend you do the same.

1 year ago
Comments

Injecting dependencies doesn’t have to hurt

I have noticed a recent swell of comments about the pain of dependency injection, and if this has caused you problems on your project, then I think I can help you by offering a few simple guidelines and a single goal for injecting dependencies.

The guidelines

I have formulated these as Novice or Advanced Beginner rules, and so I have worded them more strongly than I tend to word my advice.

  • When in doubt, inject each dependency directly into the method that requires it.
  • Only when you inject the same dependency into multiple methods of the same class, move the parameter into the constructor.
  • Do not use so-called “setter injection” except as an intermediate step in a refactoring aimed at injecting the dependency into a method or the constructor.
  • Stop using the Service Locator pattern, and instead inject the service into the client.
  • Stop instantiating collaborating services (in the Domain-Driven Design sense), and instead inject those services into the client.
  • When a constructor parameter list becomes uncomfortably long, split the class so that the new classes’ constructor parameter lists don’t overlap.
  • If you notice a class using the same dependency through two different object graph routes, split the class so that the new classes receive the dependency directly in their constructors.
  • If you notice a class using groups of dependencies at different times, split the class so that the new classes only use each cohesive group of dependencies.

All these guidelines have their roots in “remove duplication” and “fix bad names”, two of the four elements of simple design.

So why inject dependencies at all?

Some people inject dependencies in order to stub, mock or spy on a method for testing. I used to use that as a primary motivation, but over the years that motivation has evolved into something more widely useful. I inject dependencies for one reason: to move the things that change in the direction of the client and the things that don’t in the direction of the supplier. Or, if you prefer, “abstractions in code, details in data”. Or, if you prefer, to avoid abstractions depending on details. Any of those will do.

You don’t buy it?

No problem. I’ll write more articles in this space showing examples of each of the guidelines, and you’ll have plenty of opportunity to share your opinions about them.

1 year ago
Comments

“It works”

I have found only one truly universally useful definition for “it works”:

It passes all the design tests we’ve thought of so far.

Pithy, unambiguous, and not to be confused with “it’s done”:

It passes all the examples (business tests) we’ve thought of so far.

or with “it’s ready”:

It passes enough examples to make a significant stakeholder happy.

Notice that “it works”, “it’s done” and “it’s ready” might or might not overlap with each other.

1 year ago
Comments

A few TDD exercises

In the course of leading the Simple Approach to Modular Design workshop at YOW! Melbourne, I thought of a small number of TDD exercises that I encourage you to try. If you’d like to share your results, please fork the Gist below and show us where you reached.

This is part of the implementation of a first story in a point of sale application. The first story involves selling a single item by handling a barcode event and displaying the product’s price on a display, which I model as a wrapper for a text property. Working straightforwardly by implementing everything relatively simply, I end up at this point:

I argue that this implements the entire story, at least enough to satisfy customer-me.

From this point, I offer a few simple exercises. If you want to share your results, then please take frequent snapshots or add a comment to your work telling us what you were thinking and how you approached the problem.

  1. I want to display $0-priced products as “FREE”, rather than “$0.00”.
  2. I can change emptyBarcode() by passing null in for pricesByBarcode and the test still passes. What does that tell you about the design and how do you propose to improve it?
  3. I see duplication in the tests with the same value for pricesByBarcode in each test. Start removing duplication. Keep removing duplication. Where did you stop?

If enough people participate, providing their solutions, then I’ll provide my solutions soon thereafter.

1 year ago
Comments

What should I test? An example

Consider the following code:

Someone, somewhere, on a mailing list, and I apologise for forgetting who, posed the question, “What should I test here?” Here is one solution that someone offered:

It’s not bad, but I don’t like duplicating the irrelevant detail of where the current_user comes from. I would rather avoid that extra step by checking the user more directly:

Checking this results in the following design change:

I like this more, but I notice that we don’t check the special case of user.blank anywhere, so I add something to check that.

Of course, I don’t know the circumstances under which current_user could be blank, and frankly the mere possibility makes me nervous. I’d rather fix current_user so that it couldn’t answer blank, but I don’t know enough about the design to understand the impact of that assumption, so I’ll mark it as a smell and move on.

Now we have the problem that no-one has checked the other roles a user could have, so I add a check for that.

This check implements a kind of double-entry book-keeping for the list of acceptable roles in the production code. It makes me want to parameterize the list of acceptable roles in the production code, so I try this:

Now that I look at that, I highly doubt that a blank user would have any of those roles, so I look at the code for current_user to figure out how it gives me a blank user, then throw 1000 random roles at it to see whether it has any of them. It doesn’t, so I’ll spare you the code. Checking this assumption allows me to simplify the code, but complicates the checks.

I really dislike this revised check, because I don’t know whether to stub the blank? message or the role? message. I’d rather not know the difference. In fact, since I’m no longer checking for a blank user with a guard clause, I don’t have to check what happens with a blank user. If, somehow, we have a blank user with one of the acceptable roles, then something else is definitely broken, and I have already checked that. This means that I can delete the check for rejecting a blank user, leaving only the check for the specific roles.

Now I see that I’ve duplicated the acceptable roles in both the production code and the tests, and my experience has taught me to remove this duplication by moving the copy from the production code towards the copy in the tests. This involves passing the acceptable roles in to manage_samples_link_for as a parameter. This results in:

To avoid migrating clients too soon, I make the list of acceptable roles an optional parameter.

Once I see that pass, I push the optional parameter out towards the client.

And then once I see that pass, I remove the default value for acceptable_roles.

Now the list of acceptable roles in the test accidentally matches the list of acceptable role in manage_samples_link, which means that I should probably change the test, otherwise someone might believe that those values matter.

I could check that manage_samples_link_for invokes :role? on the user for each role until it finds the first acceptable role, but I don’t think I need to check specifically that manage_samples_link_for short-circuits its evaluation or not, so I stop here with manage_samples_link. Now I suppose I would benefit from checking that manage_samples_link uses manage_samples_link_for correctly.

Now I feel like I’ve separated the application-specific concerns from the more generic behavior. I suppose another step would include making the link_to itself a parameter. Since I don’t have enough evidence that that would help anything, I don’t do it yet. I do, however, notice that I could improve the production code a little.

I feel good about the results, so I stop here. Here are the checks I wrote:

I suppose that, if I felt the slightest bit concerned, I’d add this check:

Of course, this seems like designing a :roles? method on user, and since I don’t know enough about the user library here to make that decision, I remind myself to look into that later. In the meantime, I really ought to stop and commit these changes.

1 year ago
Comments

Using stubs in contract tests

I received this question by email, which asked about how to write contract tests for an interface whose implementers themselves have complex collaborators.

When you talked about contract test of the List interface it was pretty clear. Just call add() then check size of list and fact that it contains new item. However (“in reality”) object under test will often have its own collaborators. And here’s my confusion—how do we write contract test when we need to know implementation details? Wouldn’t it be just another collaboration test? Let me give you example.

Be careful: an interface has no collaborators. Classes declare their collaborators in the parameter list of the constructor (“constructor injection”), and interfaces have no constructors, so interfaces do not have collaborators, in general. An interface should describe what to do and with which inputs, but not which tools it needs to do the job (the collaborators).

Implementation details and the contract are disjoint: all behavior is either an implementation detail or part of the contract. No behavior can be both at the same time.

In our project we work with external system through two kind of API’s - Web Services and PL/SQL stored procedures. Over time PL/SQL API being substituted with Web Services so we need to make clients of these API’s independent from implementation. We introduced interface called Gateway (not exactly this name but meaning should be clear). Let’s assume for now that it has only one method doExternalCall(). Let’s also assume for sake of brevity that we interested only in one aspect of this method’s contract - handling of errors. It should throw ItemNotFoundException if there were no actual results returned from external system.

So I have a contract test: http://gist.github.com/627705

There are two implementation of Gateway interface - DBGateway and WSGateway. DBGateway uses DAO interface to call PL/SQL procedure and WSGateway uses WSProxy interface to call concrete Web Service. Information returned from each of these interfaces is completely different thus implementation of DBGateway and WSGateway differs significantly.

So each implementation’s tests have to know how to create a Gateway that will return no items. I suppose that could be with stubs for DBGateway and WSProxy.

Our goal would be to test implementation of some BusinessObject which uses Gateway as collaborator.

First we will do collaboration test. We will create mock object for Gateway interface and will set expectation to check that its methods were called with correct parameters. Also we would set return values in order to check that BusinessObject behaves correctly when Gateway throws ItemNotFoundException.

This sounds correct.

Next we need to write contract test for Gateway in order to make sure that it behaves according to expected contract. We can’t just instantiate DBGateway or WSGateway. We actually need to set stubs for their collaborators (DAO and WSProxy). Moreover, we would need to set return values from these collaborators to be different for different implementation of Gateway. And now it looks like we don’t test contract anymore, instead we test collaboration of concrete Gateway implementation with its dependency. If we have test for DBGateway we can’t use it as a test for WSGateway! It’s not like testing ArrayList and LinkedList. It turned out that instead of separate collaboration/contract tests we have just set of collaboration tests for different objects.

Aha! I think I see. Let me try to sketch a solution.

Here is DbGatewayContractTest: http://gist.github.com/627710

WsGatewayContractTest will be very similar.

Conclusion: using a stub doesn’t always mean “collaboration test”. I would like to see a test for Dao, however, that shows when find() returns null. This is a contract test for Dao.

1 year ago
Comments




Blue Theme by David