Book Review: Effective Unit Testing: A Guide for Java Developers

In the Preface for Effective Unit Testing: A Guide for Java Developers, author Lasse Koskela states that although the impetus for Effective Unit Testing was to "write a Java edition of Roy Osherove's book, The Art of Unit Testing with Examples in .NET," Effective Unit Testing ended up "having very little in common with Roy's book." Koskela further explains in the Preface that "this book is for the Java programmer," but adds that "writing good tests is a language-agnostic problem" and he recommends his book even for developers using languages other than Java.

Koskela does a nice job in the "Preface" of succinctly summarizing what the book is and isn't when he says, "I didn't want to give you a tutorial on JUnit or my favorite mock object library" and "I've tried to minimize the amount of technology-specific advice." This is important to note because Effective Unit Testing is not the book you'll want if you're looking for a book covering intricate details of JUnit, TestNG, Mockito, EasyMock, Hamcrest or other commonly used Java unit testing frameworks and tools. Instead, the focus of Effective Unit Testing: A Guide for Java Developers is on general unit testing concepts (with an emphasis on Java) that might be implemented with any of a variety of tools. That being stated, Koskela doesn't ignore these tools completely and does sprinkle JUnit code throughout the book along with code samples based on other unit testing tools such as JMock, Mockito, and Hamcrest.

Chapter 1: The Promise of Good Tests

In Chapter 1 of Effective Unit Testing, Koskela mixes brief historical testing anecdotes with basic introductory material and often-cited reasons why unit tests and automated tests are important (identifying bugs, improving design, avoiding scope creep, and learning from the test-writing experience). Koskela talks about why unit tests are more effective when used for design in addition to being used for quality assurance.

The "Factors of Productivity" section of Chapter 1 is useful for understanding certain measures by which one might determine whether unit tests are "effective." These include execution speed (performance), readability, reliability, and trustworthiness. Achieving these characteristics of effective unit tests is the focus of the book. A couple of concluding sections of the first chapter focus on using tests for design and employing behavior-driven development (BDD).

Among other introductory details, he articulates why "100% code coverage isn't the goal." Several well-known testing-related terms ["test-infected", "test-driven development" (TDD), and "accidental complexity"] are also introduced in this chapter along with references for additional details. In this first chapter, the author also introduces his "The Law of Two Plateaus" to differentiate between using unit tests solely for quality assurance versus using them for design in addition to quality assurance.

Chapter 1 is mostly introductory and probably doesn't hold a lot of new insights for someone who has worked with Java extensively and/or has written unit tests extensively. However, it does manage in 12 pages to meet the author's goal for it and the other two chapters (Chapter 2 and Chapter 3) of Part 1 of providing a "shared context" for the remainder of the book.

Chapter 2: In Search of Good

Chapter 2 delves deeper into the question of "What makes a test 'good'?" Koskela is quick to point out that there is some subjectiveness to this ("some of the quality of test code is in the eye of the beholder" as is the case for any code) and that different contexts can affect whether a particular unit test is good or not.

Koskela discusses in this chapter how virtues of regular source code are often valued virtues of test code. For example, he discusses that readable test code is maintainable test code and that appropriate structure plays a big role in making tests understandable. Koskela devotes a section of Chapter 2 to a smaller but significant issue I've seen repeatedly in writing and maintaining my own and others' unit tests: unit test methods that advertise testing something they don't really test (perhaps because they are named poorly) can be very costly. I like what Koskela titles that section, "It's not good if it's testing the wrong things." This sounds obvious, but there is deep truth to that simple statement.

Another section of the second chapter focuses on the principle that, for good tests, "independent tests run easily in solitude." Koskela provides a list of dependencies (such as "randomness" and "persistence") that to him are code smells indicating that something might be wrong with the unit test code. I like his "litmus test for a project's test infrastructure," which asks: "Can I check out a fresh copy from version control to a brand new computer I just unboxed, run a single command, lean back, and watch a full suite of automated tests run and pass?" This section does a nice job of covering why it's important that tests are independent and do not rely on being called in a certain order. In addition to pointing out some unit test dependency smells in this section, Koskela also provides some specific approaches that might be taken to address these.

One of the sections of the second chapter of Effective Unit Testing looks at why testing the wrong thing or even testing nothing at all ("happy tests") is problematic. There is coverage of why tests "need to be repeatable" along with references to Java-specific examples of tests that introduce things outside of the testing developer's control into the tests.
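To make the repeatability point concrete, here is a minimal sketch of my own (not taken from the book) that assumes a hypothetical Greeter class whose output depends on the time of day. Handing the class a java.time.Clock, rather than letting it read the system time itself, puts the time under the test's control. It deliberately avoids any test framework so that it stands alone.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalTime;
import java.time.ZoneOffset;

public class GreeterExample {

    /** Hypothetical class under test: its output depends on the time of day. */
    static class Greeter {
        private final Clock clock;

        Greeter(Clock clock) {
            this.clock = clock;
        }

        String greeting() {
            // Reads the time from the injected clock, never from the system directly.
            int hour = LocalTime.now(clock).getHour();
            return hour < 12 ? "Good morning" : "Good afternoon";
        }
    }

    /** Builds a Greeter whose clock is frozen at the given UTC instant. */
    static Greeter greeterAt(String isoInstant) {
        return new Greeter(Clock.fixed(Instant.parse(isoInstant), ZoneOffset.UTC));
    }

    public static void main(String[] args) {
        // The same frozen instants yield the same greetings on every run and every machine.
        System.out.println(greeterAt("2013-02-01T08:00:00Z").greeting());
        System.out.println(greeterAt("2013-02-01T15:00:00Z").greeting());
    }
}
```

Production code would pass Clock.systemDefaultZone() instead; the same idea applies to randomness by passing in a seeded java.util.Random.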

The last section of Chapter 2 (not counting the "Summary") introduces test doubles, the subject of Chapter 3. Koskela defines test doubles as "an umbrella term for ... stubs, fakes, or mocks." He adds that "test doubles" are "objects that you substitute for the real implementation for testing purposes." Koskela groups test doubles with testing frameworks and build tools as his "top three tools of the trade for software developers writing automated tests."

Chapter 3: Test Doubles

The third chapter is the concluding chapter and my favorite chapter of Part 1 ("Foundations"). The chapter is devoted to coverage of "test doubles", a term and concept introduced in Gerard Meszaros's xUnit Test Patterns: Refactoring Test Code. Koskela outlines five reasons developers might use test doubles, including "the most fundamental of the reasons for employing a test double - to isolate the code you want to test from its surroundings." After listing these five reasons for use of test doubles, Koskela describes each of these motivating reasons in greater detail. He then describes each of the types of test doubles and compares their strengths and weaknesses. Koskela's section "Guidelines for Using Test Doubles" introduces his "logic and heuristics" for "picking the [test double] option that results in the most readable test". These include five considerations plus the simplifying rule: "stub queries; mock actions" (attributed to J.B. Rainsberger, author of JUnit Recipes).
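As a rough illustration of "stub queries; mock actions," the following sketch is mine rather than the book's. The Checkout, PriceCatalog, and ReceiptPrinter names are hypothetical, and the doubles are hand-rolled instead of coming from JMock or Mockito. The query-style collaborator gets a stub with a canned answer, while the action-style collaborator gets a recording double (strictly a spy in Meszaros's vocabulary) whose calls are verified afterward.

```java
import java.util.ArrayList;
import java.util.List;

public class TestDoubleExample {

    /** Hypothetical query-style collaborator: returns data and has no side effects. */
    interface PriceCatalog {
        int priceInCents(String item);
    }

    /** Hypothetical action-style collaborator: invoked for its side effect. */
    interface ReceiptPrinter {
        void print(String line);
    }

    /** Hypothetical code under test. */
    static class Checkout {
        private final PriceCatalog catalog;
        private final ReceiptPrinter printer;

        Checkout(PriceCatalog catalog, ReceiptPrinter printer) {
            this.catalog = catalog;
            this.printer = printer;
        }

        void scan(String item) {
            printer.print(item + ": " + catalog.priceInCents(item));
        }
    }

    /** Hand-rolled recording double: remembers each call so a test can verify it. */
    static class RecordingPrinter implements ReceiptPrinter {
        final List<String> lines = new ArrayList<>();

        @Override
        public void print(String line) {
            lines.add(line);
        }
    }

    /** Runs Checkout in isolation from any real catalog or printer. */
    static List<String> scanWithDoubles(String item) {
        PriceCatalog stubCatalog = ignored -> 250;          // stub the query: canned answer
        RecordingPrinter printer = new RecordingPrinter();  // record the action for verification
        new Checkout(stubCatalog, printer).scan(item);
        return printer.lines;
    }

    public static void main(String[] args) {
        System.out.println(scanWithDoubles("coffee"));
    }
}
```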

The third chapter of Effective Unit Testing also touches on organizing unit tests with the Arrange-Act-Assert convention and likens this to behavior-driven development's Given-When-Then vocabulary. This chapter also demonstrates principles of the chapter with brief forays into JMock and Mockito code examples and a reference to J.B. Rainsberger's blog post JMock v. Mockito, but Not to the Death.
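For readers unfamiliar with the Arrange-Act-Assert layout, a small sketch of my own (not from the book) might look like the following. It checks a java.util.ArrayDeque used as a stack and returns a boolean instead of calling a framework assertion, purely to keep the example free of dependencies; the comments show how the three blocks line up with Given-When-Then.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ArrangeActAssertExample {

    /** Returns true when the check passes. */
    static boolean mostRecentlyPushedElementIsPoppedFirst() {
        // Arrange (Given): a stack that already holds one element
        Deque<String> stack = new ArrayDeque<>();
        stack.push("first");

        // Act (When): push another element and pop once
        stack.push("second");
        String popped = stack.pop();

        // Assert (Then): the newest element comes back and the older one remains
        return "second".equals(popped) && stack.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println(mostRecentlyPushedElementIsPoppedFirst());
    }
}
```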

Chapter 4: Readability

The fourth chapter of Effective Unit Testing is the first chapter of Part 2 ("Catalog"). As with all three chapters in Part 2, Chapter 4 looks at "test smells" that might indicate tests that are less effective than they could be. In this chapter's case, the test smells are those most closely associated with problems related to readability of unit tests.

Koskela starts Chapter 4 by articulating the difference between reading test code and running test code: "Reading the tests ... should provide the programmer with an understanding of what the code should do. Running those tests should tell the programmer what the code actually does."

The test smells that Koskela associates most closely with the "readability" portion of his Test Smells Catalog are:

  • Primitive Assertions
    • Assertion that "uses more primitive elements than the behavior it's checking"
    • Analogous to the primitive obsession code smell
  • Bitwise Assertions
    • Special case of Primitive Assertions that uses bitwise operators for "optimized test assertions" at the cost of readability and understandability
  • Hyperassertions
    • Assertion that "becomes brittle and hides its intent under its overwhelming breadth and depth"
  • Incidental Details
    • Incidental details make it difficult to identify the "intent, purpose, and meaning" of a unit test
  • Split Personality
  • Split Logic
    • Test code scattered over multiple files
  • Magic Numbers
    • Using numeric and String literals rather than using constants and variables with readable names
  • Setup Sermon
    • Too much code (often refactored from tests suffering incidental details smell) in the setup method
  • Overprotective Tests
    • Application of unnecessary/redundant guard or test assertions when condition would fail anyway

In each test smell case, Koskela provides examples of these test smells along with one or more ways (description and code examples) of addressing the smells.
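To give a flavor of the Primitive Assertions smell, here is a sketch of my own under stated assumptions: the grep method is a hypothetical stand-in for code under test, and both checks are plain boolean methods rather than framework assertions. The first check forces the reader to decode what indexOf(...) != -1 means, while the second states the same expectation at the level of the behavior being checked.

```java
public class PrimitiveAssertionExample {

    /** Hypothetical code under test: keeps only the lines that contain the keyword. */
    static String grep(String keyword, String text) {
        StringBuilder matches = new StringBuilder();
        for (String line : text.split("\n")) {
            if (line.contains(keyword)) {
                matches.append(line).append('\n');
            }
        }
        return matches.toString();
    }

    /** Smelly check: expressed with more primitive elements than the behavior it verifies. */
    static boolean primitiveCheck(String output) {
        return output.indexOf("1 needle") != -1 && output.indexOf("2 hay") == -1;
    }

    /** The same check phrased in terms of what the output should and should not contain. */
    static boolean intentRevealingCheck(String output) {
        return output.contains("1 needle") && !output.contains("2 hay");
    }

    public static void main(String[] args) {
        String output = grep("needle", "1 needle\n2 hay\n3 needle stack");
        System.out.println(primitiveCheck(output) + " " + intentRevealingCheck(output));
    }
}
```

Both checks accept and reject the same outputs; the difference lies only in how quickly a reader can see what is being asserted.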

Hamcrest is introduced in Chapter 4 as a partial solution to addressing test smells. There is also a unit test example that is built for testing JRuby source code.

In the fourth chapter, Koskela provides some memorable quotes. He articulates an opinion that I've long held regarding unit test code: "When weighing alternatives to expressing intent in test code, you should keep in mind that the nature and purpose of tests puts a higher value on readability and clarity than, say, code duplication or performance." He also writes, "A test that has never failed is of little value - it's probably not testing anything. On the other end of the spectrum, a test that always fails is a nuisance." Koskela also explains that "A test should have only one reason to fail" and explains that this is related to the Single Responsibility Principle.

Chapter 5: Maintainability

Chapter 5 continues the coverage of test smells, but moves from the focus of Chapter 4 on "readability" to focus instead on "maintainability." As he did in Chapter 4, Koskela enumerates several test smells most closely associated with test maintainability and uses code examples to demonstrate these smells and how to address them.

Chapter 5 focuses on the following "maintainability" test smells:

  • Duplication
    • Needless repetition that increases places where same change must be made and increases risk of not changing all necessary code
    • Duplication can be structural or semantic or both
  • Conditional Logic
    • "Conditional execution structures such as if, else, for, while, and switch" reduce the ability to use tests to "understand what the code does and what it should do"
  • Flaky Test
    • Tests that "fail intermittently," typically due to multithreading or race conditions
  • Crippling File Path
    • Absolute paths, especially hard-coded absolute paths, prevent unit tests from being run on others' machines
  • Persistent Temp Files
    • Files that are generated by unit tests may be less temporary than one realizes and interfere with later tests
  • Sleeping Snail
  • Pixel Perfection
    • Specialized version of Primitive Assertion and Magic Numbers test smells applied to exactly matching graphic representations in unit tests
  • Parameterized Mess
  • Lack of Cohesion in Methods
    • "Test methods in a test class are only interested in some of the fixture's objects"

As he did for the test smells covered in Chapter 4, Koskela provides code-based examples of each test smell discussed in Chapter 5 along with code-based examples of how to address each of the test smells.
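As one illustration of avoiding both the Crippling File Path and Persistent Temp Files smells, the following sketch is my own and not the book's. The countNonBlankLines method is a hypothetical piece of code under test. The check lets the platform choose the temporary file's location instead of hard-coding an absolute path, and it deletes the file in a finally block so that nothing lingers to interfere with later tests.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

public class TempFileExample {

    /** Hypothetical code under test: counts the non-blank lines in a file. */
    static long countNonBlankLines(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    /** Writes a fixture to a temp file, runs the code under test, and always cleans up. */
    static long countLinesInFixture() {
        try {
            // No hard-coded absolute path: the JVM picks a location valid on this machine.
            Path tempFile = Files.createTempFile("line-count-", ".txt");
            try {
                Files.write(tempFile, Arrays.asList("alpha", "", "beta"), StandardCharsets.UTF_8);
                return countNonBlankLines(tempFile);
            } finally {
                // Delete even if the check above throws, so the file never outlives the test.
                Files.deleteIfExists(tempFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countLinesInFixture());
    }
}
```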

Chapter 6: Trustworthiness

Chapter 6 finishes off Part 2 and the Catalog of Test Smells. Chapter 6's focus is on test smells closely associated with the degree of reliability and trustworthiness of tests. The test smells covered in this chapter are:
