I’ve spent the last few months studying Alexey Zimarev’s excellent book, Hands-On Domain-Driven Design with .NET Core (2019). Although it is a few years old now, it was clearly ahead of its time: the Event Sourced system it demonstrates is only now becoming popular. I’ve found several reliable sources that refer to Event Sourcing by name, but very few actual implementations, even today. Zimarev’s remains by far the most complete I have seen.
An Event Sourced system has some properties I haven’t seen before. First, it comes with a natural audit log. Every action taken on the domain is stored in the event store along with our choice of metadata, most obviously the user taking the action. The event store is an append-only log and cannot be queried the way a conventional database can, so the events must be projected into a readable form. While I’ve designed software using logical CQRS, this is the first implementation I’ve seen in which the command and query sides have genuinely different capabilities, with many options available for providing each. Zimarev uses RavenDB and EventStoreDB, a good choice today, as both can be deployed as cloud clusters in the AWS, GCP or Azure region of your choice. The specific choices for the command and query sides are quite involved in themselves, and are a subject for another article.
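To make the audit log and the projection concrete, here is a minimal in-memory sketch in Python. It is not Zimarev’s code and uses neither the EventStoreDB nor the RavenDB API; the class names, the event names (AdCreated, AdTitleChanged) and the fields are all hypothetical, chosen only to show the shape of the idea.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Event:
    stream_id: str            # the aggregate this event belongs to
    type: str                 # hypothetical event name, e.g. "AdCreated"
    data: dict[str, Any]      # the facts recorded by the event
    metadata: dict[str, Any]  # audit details, e.g. the acting user


class InMemoryEventStore:
    """Append-only log; a stand-in for a real event store."""

    def __init__(self) -> None:
        self._log: list[Event] = []

    def append(self, event: Event) -> None:
        self._log.append(event)

    def read_all(self, from_position: int = 0) -> list[Event]:
        return self._log[from_position:]


class AdProjection:
    """Folds events into documents; the dict stands in for a document database."""

    def __init__(self) -> None:
        self.documents: dict[str, dict[str, Any]] = {}
        self.position = 0  # how far into the log this projection has read

    def catch_up(self, store: InMemoryEventStore) -> None:
        for event in store.read_all(self.position):
            self._apply(event)
            self.position += 1

    def _apply(self, event: Event) -> None:
        if event.type == "AdCreated":
            self.documents[event.stream_id] = {
                "title": event.data["title"],
                "last_changed_by": event.metadata["user"],
            }
        elif event.type == "AdTitleChanged":
            doc = self.documents[event.stream_id]
            doc["title"] = event.data["title"]
            doc["last_changed_by"] = event.metadata["user"]


store = InMemoryEventStore()
read_model = AdProjection()

store.append(Event("ad-1", "AdCreated", {"title": "Bike"}, {"user": "alice"}))
store.append(Event("ad-1", "AdTitleChanged", {"title": "Red bike"}, {"user": "bob"}))

stale = dict(read_model.documents)  # the read side has not caught up yet
read_model.catch_up(store)          # now the documents reflect both events
```

The log keeps both events, including who made each change, while the read model holds only the latest state. The gap between the two append calls and catch_up is the write-to-read latency in miniature.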
The next thing that jumps out at me is that the system is truly asynchronous. I had to move away from the idea that I present an object to the server for processing and wait for some kind of result to come back (e.g. a database ID). Instead, the client generates all of its IDs in advance and simply issues commands to the server to keep server state in sync. (Side note: look into ULIDs instead of GUIDs for universal IDs.) This approach has its own challenges that must be understood. For example, the queryable database is now the “query” (“read”) side, and there is latency between when an event enters the store and when the read side is updated. This only matters if we insist on re-reading the object from the database to get something back, which is why client-side generation of universal IDs is so important here. The relationship between client and server is flipped on its head: the client holds the master state and issues commands to the server to persist it.
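As a rough illustration of client-side ID generation, the sketch below builds a ULID by hand (a 48-bit millisecond timestamp followed by 80 random bits, encoded as 26 Crockford base32 characters) and uses it in a fire-and-forget command. The CreateAd command, the Client class and the send callback are hypothetical names of my own; a real project would more likely use an existing ULID library.

```python
import os
import time
from dataclasses import dataclass
from typing import Callable, Optional

# Crockford base32 alphabet: no I, L, O or U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(timestamp_ms: Optional[int] = None) -> str:
    """Return a 26-character ULID: 48-bit timestamp, then 80 random bits."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    value = (timestamp_ms << 80) | int.from_bytes(os.urandom(10), "big")
    chars = []
    for _ in range(26):  # 26 characters * 5 bits cover the 128-bit value
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


@dataclass(frozen=True)
class CreateAd:
    """Hypothetical command; the ID is chosen by the client, not the server."""
    ad_id: str
    owner_id: str


class Client:
    def __init__(self, send: Callable[[CreateAd], None]) -> None:
        self.state: dict[str, dict[str, str]] = {}  # the client holds the master state
        self._send = send

    def create_ad(self, owner_id: str) -> str:
        ad_id = new_ulid()
        self.state[ad_id] = {"owner_id": owner_id}  # usable immediately, no round trip
        self._send(CreateAd(ad_id, owner_id))       # the server only persists it
        return ad_id


sent: list[CreateAd] = []
client = Client(sent.append)  # a list stands in for the network call
ad_id = client.create_ad("user-42")
```

Because the timestamp occupies the leading characters, ULIDs created in different milliseconds sort in creation order as plain strings, which is the main practical advantage over random GUIDs.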
Eventual consistency is a new challenge for many developers. We should not avoid systems just because they are not immediately consistent, however. We simply need to learn how to build them so that we can truly fire and forget our commands, and be notified only if something goes wrong. The performance potential here is enormous: how much database effort is spent on joins and indexes because object data has been split across multiple tables, when we could simply store the data the way it is actually used? While some data (accounting data, for example) seems better suited to tables, other data never needs to be anything but an object, and it is easiest to store it as a JSON document in a document database.
The biggest challenge I face right now is the migration of the legacy system. It is simply too big to flip the switch on day 1. It needs to migrate incrementally to the new system, which means that the existing version must be able to write to the event store, and the new version must project back to the original database in addition to the document database. Of course, this is a lot of work that will be obsolete as soon as the migration is finished, but we see the same technique in bridge maintenance. If we wish to upgrade a bridge while allowing people to continue to cross the river, we need to build a temporary bridge before closing the original for upgrades.
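The “temporary bridge” might look something like the following Python sketch, in which every event is projected both into the legacy relational schema and into the new document store. An in-memory SQLite database stands in for the legacy database and a dict for the document database; the customers table and the CustomerRegistered and CustomerRenamed events are hypothetical examples, not taken from any real system.

```python
import sqlite3
from typing import Any

Event = dict[str, Any]


def project_to_legacy(conn: sqlite3.Connection, event: Event) -> None:
    """Keep the original relational tables up to date during the migration."""
    if event["type"] == "CustomerRegistered":
        conn.execute(
            "INSERT INTO customers (id, name) VALUES (?, ?)",
            (event["id"], event["name"]),
        )
    elif event["type"] == "CustomerRenamed":
        conn.execute(
            "UPDATE customers SET name = ? WHERE id = ?",
            (event["name"], event["id"]),
        )


def project_to_documents(documents: dict[str, dict[str, Any]], event: Event) -> None:
    """Maintain the new read model as whole documents."""
    if event["type"] == "CustomerRegistered":
        documents[event["id"]] = {"id": event["id"], "name": event["name"]}
    elif event["type"] == "CustomerRenamed":
        documents[event["id"]]["name"] = event["name"]


def dispatch(event: Event, conn: sqlite3.Connection,
             documents: dict[str, dict[str, Any]]) -> None:
    # Both projections consume the same event, so the two read sides agree.
    project_to_legacy(conn, event)
    project_to_documents(documents, event)


legacy_db = sqlite3.connect(":memory:")
legacy_db.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT)")
documents: dict[str, dict[str, Any]] = {}

# Events may be written by either the legacy or the new application.
event_log: list[Event] = [
    {"type": "CustomerRegistered", "id": "c-1", "name": "Acme"},
    {"type": "CustomerRenamed", "id": "c-1", "name": "Acme Ltd"},
]
for event in event_log:
    dispatch(event, legacy_db, documents)

legacy_row = legacy_db.execute("SELECT id, name FROM customers").fetchone()
```

Once every legacy screen has been replaced, project_to_legacy and the legacy tables can be removed without touching the event log or the document projection.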
I don’t know what performance to expect at this point. Intuitively, it feels like a better design than using a relational database for absolutely everything. I know I no longer have to write error-prone mapping code to shove the data into an ORM. The more I work with document databases, the more I feel that relational databases are not an appropriate persistence mechanism for object-based data. An interesting thought is that the databases will do less work. Writing to event streams is not compute-intensive, nor is reading a selection of documents from a document database. I expect the costs of the two databases to be lower as a result, but I haven’t seen the actual resource requirements yet. The price we pay is the latency between the write and read sides, which is handled by building the system asynchronously and making the application client-centric rather than server-centric. The projection, of course, is handled by the application server, so the latency can be managed to an extent by increasing the resources available to the application server.
Those are my initial thoughts about an Event Sourced system. With any luck, I’ll have a chance to design one soon as our company updates its applications to be cloud-native. I hope to see some tangible results this year.