There are a lot of debates in the SOA space. One of those debates is about centralised vs. decentralised data. Recently I reviewed an architecture that on the face of it seemed to use decentralised data. But after digging a bit deeper I noticed that someone had been cheating a bit.
One of the key principles in SOA design is respecting service boundaries. We need to create loosely coupled services. I recently reviewed an architecture that seemed to tick all the boxes. Overall it was quite well done but I had a nagging feeling that something wasn’t quite right. I finally figured it out – and in hindsight it was obvious.
It turns out that even though service boundaries were well defined and adhered to at the front end, the back end was another story. I didn’t spot the problem immediately because services didn’t share physical data stores – they were physically separate databases. The trouble was that some clever developer decided to create cross-database queries. Service A needed a bit of data that was in Service B’s domain and rather than getting the data via a service call, the developer went via the back door!
The benefit of this approach is obvious: it’s quick. Quick at development time and quick at runtime (a cross-database query will usually perform far better than a service call).
The downside: it’s rubbish. It’s violating one of the fundamental principles of SOA. Now there’s tight coupling between two services and Service B is no longer autonomous. People working on it may have no idea that someone is relying on the schema not to change or the data store not to move.
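To make the coupling concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the architecture I reviewed: SQLite's ATTACH DATABASE stands in for two physically separate databases on one server, an in-process class stands in for Service B's real network endpoint, and every name (orders, customers, CustomerService, refactor_schema and so on) is hypothetical.

```python
import sqlite3


def make_server():
    """Hypothetical setup: 'main' is Service A's database, 'b' is Service B's."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS b")
    conn.execute("CREATE TABLE main.orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
    conn.execute("CREATE TABLE b.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO main.orders VALUES (1, 42)")
    conn.execute("INSERT INTO b.customers VALUES (42, 'Ada')")
    conn.commit()
    return conn


class CustomerService:
    """Stand-in for Service B: owns the customers table and hides its schema."""

    def __init__(self, conn):
        self._conn = conn
        self._name_column = "name"  # internal detail, not part of the contract

    def get_customer_name(self, customer_id):
        rows = self._conn.execute(
            f"SELECT {self._name_column} FROM b.customers WHERE id = ?",
            (customer_id,),
        ).fetchall()
        return rows[0][0] if rows else None

    def refactor_schema(self):
        """Service B's team renames a column; their own code changes with it."""
        rows = self._conn.execute("SELECT id, name FROM b.customers").fetchall()
        self._conn.execute("DROP TABLE b.customers")
        self._conn.execute(
            "CREATE TABLE b.customers (id INTEGER PRIMARY KEY, full_name TEXT)"
        )
        self._conn.executemany("INSERT INTO b.customers VALUES (?, ?)", rows)
        self._name_column = "full_name"


def order_via_back_door(conn, order_id):
    """Service A joins straight into Service B's table (the cheat)."""
    rows = conn.execute(
        "SELECT o.id, c.name FROM main.orders AS o "
        "JOIN b.customers AS c ON c.id = o.customer_id WHERE o.id = ?",
        (order_id,),
    ).fetchall()
    return rows[0] if rows else None


def order_via_service(conn, customers, order_id):
    """Service A reads only its own table and asks Service B for the rest."""
    rows = conn.execute(
        "SELECT id, customer_id FROM main.orders WHERE id = ?", (order_id,)
    ).fetchall()
    if not rows:
        return None
    found_id, customer_id = rows[0]
    return (found_id, customers.get_customer_name(customer_id))
```

Both functions return the same answer until someone calls refactor_schema(). After that, order_via_service keeps working because Service B's contract hasn't changed, while order_via_back_door raises sqlite3.OperationalError because the column it depended on no longer exists. Nothing inside Service B's code would have warned its team about that dependency.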
In SOA, distributed data does not (necessarily) mean physically distributed – it means logically distributed.
There is nothing wrong with multiple services keeping their data stores on the same physical server. Although I personally wouldn’t do it, you could even have multiple services in the same database on that server – provided each data store is an island (e.g. in SQL Server, making sure each service’s tables are owned by a different user).
As we’ve seen in this example, having physically separate data stores doesn’t mean that the architecture has decentralised data.
My advice: if you need centralised data then design for it from the start. If you design for decentralised data then don’t cheat. If you cheat, you’ll get the worst of both worlds.