In this post I am going to suggest doing a little design a little upfront and violating the purity of agility in the process. I think that you will see the sense.

To be fair, I do not think that what I am going to say in this post is particularly original… But in my admittedly weak survey of agile methods I did not find these ideas clearly stated… So apologies up front to those who figured this out before me… And to those readers who know this already. In other words, I am assuming that enough of you database geeks are like me, just becoming agile literate, and may find this useful.

In my prior post here I suggested that refactoring, re-working previous stuff due to the lack of design, was the price we pay to avoid over-engineering in an agile project. Now I am going to suggest that some design could eliminate some refactoring to the overall benefit of the project. In particular, I am going to suggest a little database design in advance as a little cheat.

Generally an agile project progresses by picking a set of user stories from a prioritized backlog of stories and tackling development of code for those stories in a series of short, two-week, sprints.

Since design happens in real-time during the sprints it can be uncoordinated… And code or schemas designed this way are refactored in subsequent sprints. Depending on how uncoordinated the schemas become the refactoring can require a significant effort. If the coders are not data folks… And worse, they are abstracted away from the schema via an ORM layer, the schema can become very silly.

Here is what we are trying to do at the Social Security Administration.

In the best case we would build a conceptual data model before the first sprint. This conceptual model would only define the 5-10 major entities in the system… Something highly likely to stand up over time. Sprint teams would then have the ability to agile define new objects within this conceptual framework… And require permission only when new concepts are required.

Then, and this is a better case, we have data modelers working 1-2 sprints ahead of the coders so that there is a fairly detailed model to pin data to. This better case requires the prioritized backlog to be set 1-2 sprints in advance… A reasonable, but not certain, assumption.

Finally, we are hoping to provide developers with more than just a data model… What we really want is to provide an object model with basic CRUD methods in advance. This provides developers with a very strong starting point for their sprints and let’s the data/object architecture to evolve in a more methodological manner.

Let me be clear that this is a slippery slope. I am suggesting that data folks can work 2-4 weeks ahead of the coders and still be very agile. Old school enterprise data modelers will argue… Why not be way ahead and prescribe adherence to an enterprise model. You all will have to manage this slope as you see fit.

We are seeing improvement in the velocity and quality of the code coming from our agile projects… And in agile… code is king. In the new world an enterprise data model evolves evolve based on multiple application models and enterprise data modelers need to find a way to influence, not dictate, data architecture in an agile manner.