Grails - Test data fixtures

Proposal for Test data fixtures in Grails

This proposal aims to aid unit tests that depend upon persistent domain objects. The idea is inspired by Rails' test fixtures.

Background

Unit testing is important. But when you're developing in a dynamic language/framework like Groovy/Grails, where there isn't an explicit compilation phase to flag up silly errors and IDE refactoring support lags behind what is available for statically typed languages, it's even more important to unit test.

Unit tests are only useful if the data they run on does not change between test runs. If your tests run on data that is not under the control of the testing mechanism itself and that data changes, it can break your tests, even without touching a line of code! Then what use are your tests? It's better to have a dedicated test database and test data.

Requirements

A "test fixtures" mechanism with the following characteristics:

  • Each domain model class has an associated list of test data fixtures
  • The test fixture data is human-readable "serialized" instances of the domain model objects (could use Groovy's literal Map (propertyName/value) or List (table columns), for example)
  • Each unit test class (or method) declares which domain model class's fixtures it requires
  • As part of each test method's lifecycle, the database is cleaned and the required test fixture data is loaded (could hook into setUp/tearDown)
  • The test fixture data is placed within the Grails application directory structure
  • The relationships of dependent domain objects can be expressed, i.e., the data of one domain model's test fixtures can depend upon another model's test fixtures
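
The clean-then-load lifecycle described in the requirements could be sketched roughly as follows. This is only an illustration under stated assumptions: FakeDatabase and runWithFixtures are hypothetical names, and an in-memory map stands in for the real test database.

```groovy
// Hypothetical in-memory stand-in for the test database: table name -> rows.
class FakeDatabase {
    Map tables = [:]

    void clean() { tables.clear() }

    void insert(String table, Map row) {
        if (!tables.containsKey(table)) {
            tables[table] = []
        }
        tables[table] << row
    }
}

// Hypothetical lifecycle hook: clean the store, load only the required
// fixtures, run the test body, then clean up again.
def runWithFixtures(FakeDatabase db, Map fixtures, List required, Closure testBody) {
    db.clean()
    required.each { model ->
        // Copy each row so a test cannot corrupt the fixture definitions.
        fixtures[model].each { row -> db.insert(model, new LinkedHashMap(row)) }
    }
    try {
        testBody(db)
    } finally {
        db.clean()
    }
}

def fixtures = [
    Country: [[name: "USA"], [name: "Spain"]],
    Region : [[name: "New York", country: "USA"]]
]
def db = new FakeDatabase()

runWithFixtures(db, fixtures, ["Country"]) { store ->
    assert store.tables.Country.size() == 2
    assert !store.tables.containsKey("Region")   // not declared, so not loaded
    store.tables.Country.clear()                 // a test may freely modify its data
}

runWithFixtures(db, fixtures, ["Country", "Region"]) { store ->
    assert store.tables.Country.size() == 2      // fresh data despite the earlier change
    assert store.tables.Region.size() == 1
}
```

In a real implementation the same two steps would presumably run from the test framework's setUp/tearDown rather than from a wrapper method.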

Possible additional nice to haves

  • Ant target to run a single test class
  • Ant target to copy the development schema to the test datasource
  • Ant target to export test/development database data to fixtures file

An example

Let's say I am modelling places and I have Country, Region and City domain models.

In order to test instances of these classes I would have separate test fixtures files for each model class: one for Country, one for Region, one for City. My Country test fixtures would include "USA", "England", "Spain", etc. My Region test fixtures might include instances for "New York" and "Connecticut", and my City test fixtures might include a "New York City", etc.

In my domain models, City belongs to a Region, and Region belongs to a Country. It is possible to determine the relationships of specific test fixture instances from the test fixtures data alone, by foreign keys or some shared symbol: for example, that the "New York City" City fixture belongs to the "New York" Region fixture, which in turn belongs to the "USA" Country fixture.

Now I have some complex logic in each of these classes that I want to test, so I want to write unit tests for each.

My Country class only depends on itself (not Region or City), at least for the sake of this argument, so I only need to declare that my Country tests use the Country fixtures.

Now, because the code in the City class I want to test does reference its parent Region and Country, in the tests for City I declare that I need the Country, Region and City fixtures.

Benefits

  • Tests don't have to worry about loading or cleaning data - they just test. Likewise the test framework takes care of loading and cleaning test data. This is good separation of concerns.
  • Each test runs on clean data, so tests are independent. Thus order is unimportant and it should be possible to run a single test class or even a single method on its own.
  • Test fixtures files are version controlled along with the tests and the code they test
  • Test fixture data is human readable and editable in a text editor along with the code and tests

Objections

Why not just maintain the data in a test schema?

  • It's fragile. Team members need to constantly share and update data with each other, which makes merging data painful.
  • Each test does not run on clean data, unless there's an explicit cleanup part of the test, which is not good separation of concerns and error-prone.
  • It's not easily version controlled.
  • It's not easily searchable and causes a mental "context switch" going from text editor to database client.
  • Grails is doing a good job of abstracting the database - do we want to force people to resort to the database client after all?

Why not just run tests on the development database?

  • It's fragile for the same reasons as above. Plus, the data will almost certainly change and break tests.

Isn't it going to be a pain maintaining both development and test data?

  • Well, there is an overhead for sure, but you only need as much test fixture data as you have functionality to test, and if you've already seen the unit testing light, then you'll know it's worth it.

What about this or that Java/Groovy framework that already does this kind of thing?

  • Great - maybe we can use it?

Solution discussion

This area is a whiteboard for discussing possible solutions

We need to agree on

  • A term for the fixtures
  • A file type and name convention
  • A syntax
  • A directory
  • Test case data declaration
  • Implementation details

Name

We think "dataset" may be a better name than "fixture"

File type and name convention

Probably groovy files named _domain model name + "Dataset"_.

Syntax

Maybe something like

class CountryDataset {
    def dataset = {
        [ italy: [name: "Italy", code: "IT"],
          france: [name: "France", code: "FR"]
        ]
    }
}

class RegionDataset {
    def dataset = {
        [ tuscany: [name: "Tuscany", country: italy],
          provence: [name: "Provence", country: france ]
        ]
    }
}

class CityDataset {
    def dataset = { 
        [ florence: [name: "Florence", region: tuscany],
          marseilles: [name: "Marseilles", region: provence]
        ]
    }
}

Data can be auto-generated:

class UserDataset {
    def dataset = { 
        def users = []
        for (i in 0..100) {
            users << [name: "user_${i}", password: "dontcare"]
        }
        return users
    }
}
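
The loop above returns a List, whereas the other datasets are Maps keyed by a symbolic name. If keyed entries turn out to be the convention, a generated dataset might look like the sketch below. The key format "user_N" is an assumption made for illustration.

```groovy
class UserDataset {
    def dataset = {
        // Build a Map of symbolic name -> property map for users 0..100.
        (0..100).collectEntries { i ->
            // toString() turns the GString into a plain String so that
            // lookups by String key behave as expected.
            String key = "user_${i}".toString()
            [key, [name: key, password: "dontcare"]]
        }
    }
}

def users = new UserDataset().dataset.call()
assert users.size() == 101              // 0..100 is an inclusive range
assert users.user_42.name == "user_42"
```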

Questions

What about the version and id properties? Can they be explicitly stated in the datasets? Test cases will certainly need to obtain known instances by id.

How are dependencies resolved? Is it necessary to prefix references with domain model name like

[ florence: [name: "Florence", region: Region.tuscany],

or can the Region dataset "instances" be somehow exposed to the City dataset instances, in this example?
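
One possible way to expose earlier dataset entries without a prefix is to use the closure's delegate. The sketch below is an assumption about how this could work, not an agreed design: loadDatasets is a hypothetical helper, and plain Maps stand in for persisted domain objects.

```groovy
class CountryDataset {
    def dataset = {
        [italy : [name: "Italy", code: "IT"],
         france: [name: "France", code: "FR"]]
    }
}

class RegionDataset {
    def dataset = {
        // 'italy' and 'france' are not defined in this class; they are
        // looked up on the closure's delegate when the closure runs.
        [tuscany : [name: "Tuscany", country: italy],
         provence: [name: "Provence", country: france]]
    }
}

// Hypothetical loader: evaluates the datasets in the given order and makes
// every entry loaded so far visible to the next dataset closure.
def loadDatasets(List datasetClasses) {
    def known = [:]
    datasetClasses.each { datasetClass ->
        Closure closure = datasetClass.newInstance().dataset
        // With the default owner-first strategy, names missing on the
        // dataset class fall through to the delegate Map.
        closure.delegate = known
        known.putAll(closure.call())
    }
    return known
}

def loaded = loadDatasets([CountryDataset, RegionDataset])
assert loaded.tuscany.country.is(loaded.italy)
assert loaded.provence.country.code == "FR"
```

Two caveats with this approach: the datasets must be loaded in dependency order, and a misspelled reference resolves silently to null because a Map delegate returns null for unknown keys. A shared namespace also means symbolic names must be unique across all datasets, which is one argument for the prefixed form.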

Should we go with Map literals or are constructors better? In other words,

florence: [name: "Florence", region: tuscany]

is less verbose but pretty close to

florence: new City(name: "Florence", region: tuscany)
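
The two forms are close in practice, because Groovy can build an object from a Map of property values when the class has a no-argument constructor. A minimal sketch, using plain hypothetical classes rather than real Grails domain classes:

```groovy
// Hypothetical stand-ins for the domain classes.
class Region { String name }
class City { String name; Region region }

def tuscany = new Region(name: "Tuscany")

// Constructor form, using Groovy's named-argument syntax.
def fromConstructor = new City(name: "Florence", region: tuscany)

// Map literal form: the same data, turned into an instance later.
def literal = [name: "Florence", region: tuscany]
def fromMap = new City(literal)

assert fromMap.name == fromConstructor.name
assert fromMap.region.is(tuscany)
```

So Map literals could stay as the dataset syntax, with the loader instantiating the domain class named by the dataset. The constructor form has the advantage that typos in class names are caught when the dataset file is compiled.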

A directory

We could reorganise the directory structure to house all test artifacts under grails-tests (like Rails):

grails-app                       -- for development/production env
grails-tests                     -- for test env
    unit
    webtest (ie, functional)
    datasets

or is there a Maven2-style structure?

Test case data declaration

Test case classes can declare that they require some or all fixtures, e.g.:

class CountryTests extends GroovyTestCase {
    def requireDataset = Country
}

or

class CityTests extends GroovyTestCase {
    def requireDataset = [ Country, Region, City ] // or lazily as "def requireDataset = ALL"
}
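
Since requireDataset may hold either a single class or a list, the test framework would need to normalise it before loading anything. A rough sketch of that step follows; requiredDatasets is a hypothetical helper, and the test classes are plain classes here (not GroovyTestCase subclasses) to keep the example self-contained.

```groovy
// Hypothetical stand-ins for the domain classes.
class Country {}
class Region {}
class City {}

class CountryTests { def requireDataset = Country }
class CityTests { def requireDataset = [Country, Region, City] }
class PlainTests {}   // declares no datasets at all

// Hypothetical helper: returns the declared datasets as a List,
// or an empty List when the test case declares none.
List<Class> requiredDatasets(Object testCase) {
    if (!testCase.hasProperty("requireDataset")) {
        return []
    }
    def declared = testCase.requireDataset
    return declared instanceof Collection ? declared as List : [declared]
}

assert requiredDatasets(new CountryTests()) == [Country]
assert requiredDatasets(new CityTests()) == [Country, Region, City]
assert requiredDatasets(new PlainTests()) == []
```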

Implementation details

We discussed DbUnit, but feel that a solution built on Hibernate is going to be more integrated and conceptually cohesive.