Event Sourcing: Connecting the Dots for a Better Future (Part 1)

By: Nataraj Sundar and John Long

Using an event-centric approach has enabled our team at eBay to scale to handle millions of events with the resiliency to recover from failures as quickly and reliably as possible. Though similar approaches have been widely adopted to augment large-scale data applications, for eBay's Continuous Delivery team, Event Sourcing is at the heart of decision-making and application development. To that end, we've built a system that continuously scales and tests our ability to handle an increasing volume of events and an ever-growing list of external data sources and partner integrations.

In the 2000 film Memento, the protagonist suffers from a condition where he can no longer create new long-term memories. Every few minutes or so, he forgets everything that has happened to him. He has no clue why he is where he is, why he feels how he feels, or even who the people around him are.

In software development, this is fairly close to the situation we find ourselves in when we use CRUD (create, read, update, delete). In the conventional approach to data, all we know is what the current state is. We don’t have any reliable record of what came before the current state, and we have no context to understand why the current values are what they are. Some code, somewhere, changed the values of our data, and we can often find ourselves asking “how could we possibly be in this state?”

In Memento, the hero tries to scribble notes to himself, take pictures, and even get tattoos to let himself understand what is happening. Yet he finds he cannot trust even these, as they can be forged or made under duress or even made to manipulate his future self. This is the best we can hope for from our application logs: a disorganized collection that may or may not represent how we arrived at our current state.

Compare all of this with the natural state of a human being. We remember the events that happened, we know how we got there, and we know why our current state is the way it is. In fact, our current state is inextricably linked to our past. Reality itself is a series of changes to an initial state, resulting in the world as we know it today.

An alternative to this state without context is something referred to as Event Sourcing. In this model, instead of having a single data model that we modify as various events happen, we store all of the events that can change our model as immutable objects. When we need to know our current state, we take all of our past events and calculate what the current state should be. Our present is literally a combination of our past.

Let’s take a real-world scenario, such as the NBA finals. If we were to track the score of a basketball game using CRUD, we would accurately determine who won a game, but not much else. Every time someone scored, we would update that team’s score appropriately.

It’s mostly functional, but rather boring, and it provides no insight into how the game is going.

Even a simple form of event sourcing changes our perspective completely. Imagine instead of updating our single table with new scores, we registered an event every time someone made a basket:

Instead of simply getting the current score from our database, we get all of the events in the game and add up the points for each team. We would do this every time we need to get the score.
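As a minimal sketch of that idea, assuming a made-up `BasketMade` event and invented team names and baskets (none of this is our production code), the score is never stored anywhere; it is recomputed from the events each time it is needed:

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: an event cannot be modified once recorded
class BasketMade:
    team: str    # hypothetical team label
    points: int  # 1, 2, or 3


def current_score(events):
    """Derive the score by adding up every basket recorded so far."""
    score = {}
    for event in events:
        score[event.team] = score.get(event.team, 0) + event.points
    return score


# Invented sample data, in the order the baskets were made.
game_events = [
    BasketMade("Home", 2),
    BasketMade("Away", 3),
    BasketMade("Home", 3),
    BasketMade("Away", 2),
    BasketMade("Home", 1),
]

print(current_score(game_events))  # {'Home': 6, 'Away': 5}
```

The only write operation is appending to `game_events`; everything else is a read-only calculation over that list.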

At first glance we are no better off, but we can now reproduce the state of the game at every point:
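One way to sketch this replay, again with a hypothetical `BasketMade` event and invented data, is to walk the events in order and take a snapshot of the score after each one:

```python
from typing import NamedTuple


class BasketMade(NamedTuple):
    team: str    # hypothetical team label
    points: int


def score_timeline(events, teams):
    """Return the score as it stood after each event, oldest first."""
    score = {team: 0 for team in teams}
    timeline = []
    for event in events:
        score[event.team] += event.points
        timeline.append(dict(score))  # copy, so later events don't alter the snapshot
    return timeline


# Invented sample data.
game_events = [
    BasketMade("Home", 2),
    BasketMade("Away", 3),
    BasketMade("Home", 3),
    BasketMade("Away", 2),
    BasketMade("Home", 1),
]

for snapshot in score_timeline(game_events, teams=["Home", "Away"]):
    print(snapshot)
```

Each printed line is the scoreboard at one moment of the game, reconstructed purely from the event list.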

This already tells us a far more interesting story. Did one team dominate throughout? Did the underdog rally from behind yet not hold the lead? Has it been a neck-and-neck battle with each team claiming and losing the lead?

This also gives us one functional advantage: if the refs review a play and reverse a basket, you have a record of what kind of shot it was and can adjust the score correctly. If there is any disagreement over the score, you have the exact record of how it was calculated.
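A reversal can itself be modeled as just another event. The sketch below assumes a hypothetical `BasketReversed` event that points back at the overturned basket by id; the event names, ids, and data are all illustrative:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasketMade:
    event_id: int
    team: str
    points: int


@dataclass(frozen=True)
class BasketReversed:
    basket_id: int  # event_id of the BasketMade the refs overturned


def current_score(events):
    """Fold over the log, subtracting any basket that was later reversed."""
    baskets = {}
    score = {}
    for event in events:
        if isinstance(event, BasketMade):
            baskets[event.event_id] = event
            score[event.team] = score.get(event.team, 0) + event.points
        elif isinstance(event, BasketReversed):
            original = baskets[event.basket_id]
            score[original.team] -= original.points
    return score


# Invented sample data.
log = [
    BasketMade(1, "Home", 3),
    BasketMade(2, "Away", 2),
    BasketMade(3, "Home", 2),
]
score_before_review = current_score(log)

# The original basket stays in the log; the reversal is appended after it.
log.append(BasketReversed(basket_id=1))
score_after_review = current_score(log)

print(score_before_review, score_after_review)
```

Because the reversed basket is still in the log, the point value to subtract comes from the record itself rather than from anyone's memory of the play.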

If we add just a little more data, we get an explosion of information:

Now we can get the traditional “box score” by breaking out the points for each player:
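As a rough sketch of how such a box score could be derived, assume each event also records the scorer and an optional assisting player. The field names, player names, and baskets below are invented for illustration:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BasketMade:
    team: str
    player: str                        # who scored
    points: int                        # 1, 2, or 3
    assisted_by: Optional[str] = None  # who passed the ball, if anyone


def box_score(events):
    """Build per-player totals from the same event log used for the score."""
    stats = defaultdict(lambda: {"points": 0, "assists": 0, "threes": 0})
    for event in events:
        line = stats[event.player]
        line["points"] += event.points
        if event.points == 3:
            line["threes"] += 1
        if event.assisted_by is not None:
            stats[event.assisted_by]["assists"] += 1
    return dict(stats)


# Invented sample data with hypothetical players.
game_events = [
    BasketMade("Home", "Ann", 3, assisted_by="Bea"),
    BasketMade("Away", "Cal", 2),
    BasketMade("Home", "Bea", 2, assisted_by="Ann"),
    BasketMade("Home", "Ann", 3),
]

for player, line in box_score(game_events).items():
    print(player, line)
```

Nothing is updated in place here: the box score is one more read-only view computed from the events.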

Now imagine if we had to create that box score using CRUD. We would need to update several records every time a basket was scored: the overall score, the player’s points total, the assisting player’s assists, the player’s 3-point or 2-point total, etc. The more data we want to collect, the more tables and relationships we need, and the system rapidly becomes more complicated and brittle as we track more and more information. And if we didn’t think of something we wanted to track ahead of time? Too bad.

Contrast this with our event-sourced approach. We don’t need to decide exactly what data to collect in advance; we can simply process our existing events in new ways. Want to know how often Durant goes on a rally during the third quarter when he had a slump during the second? The data is all there; you just need to ask your events.
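To make "asking your events" concrete, here is a small sketch of a question nobody planned for when the events were first recorded: one player's points per quarter. It assumes the events already carry a `quarter` field, and the players and numbers are invented:

```python
from collections import Counter
from typing import NamedTuple


class BasketMade(NamedTuple):
    quarter: int  # 1 to 4
    player: str   # hypothetical player name
    points: int


def points_by_quarter(events, player):
    """A new query written after the fact; the stored events are untouched."""
    totals = Counter()
    for event in events:
        if event.player == player:
            totals[event.quarter] += event.points
    return dict(totals)


# Invented sample data.
game_events = [
    BasketMade(1, "Ann", 2),
    BasketMade(2, "Bea", 3),
    BasketMade(3, "Ann", 3),
    BasketMade(3, "Ann", 2),
    BasketMade(4, "Bea", 2),
]

print(points_by_quarter(game_events, "Ann"))  # {1: 2, 3: 5}
```

Answering the new question required writing one function, not changing a schema or backfilling a table.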

This is not a new model, of course. Accountants have used Event Sourcing for millennia. When an accountant tracks a financial account, you will not see a pencil or an eraser. They do not simply write the current balance and then replace it with a new balance after subtracting the latest entry. Instead, they carefully write down each event as it happens, and then use those records to calculate the current balance.

And they are fanatical about this. Even when a mistake is made, no one goes back and modifies the entry that was incorrect. Take a look: this is an actual entry from one of the authors' bank accounts from several years ago. He was charged a bank fee of over 200 million dollars!

Now look at how they fixed this: not by erasing the error, but by creating another entry that tried to offset the first one:

You may notice that this too is not entirely correct, so the next day yet another correcting entry was added:
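The same pattern is easy to sketch in code. The amounts below are invented for illustration and are not the figures from the actual statement; the point is only that every fix is a new entry, and the balance is always the sum of all entries:

```python
from decimal import Decimal
from typing import NamedTuple


class Entry(NamedTuple):
    description: str
    amount: Decimal  # positive = credit, negative = debit


def balance(entries):
    """The balance is never stored; it is the sum of every entry ever made."""
    return sum((entry.amount for entry in entries), Decimal("0"))


# Hypothetical amounts: a 20.00 fee that was keyed in wildly wrong.
ledger = [
    Entry("Opening deposit", Decimal("1000.00")),
    Entry("Bank fee (entered incorrectly)", Decimal("-200000020.00")),
    Entry("Fee reversal (itself slightly off)", Decimal("200000000.20")),
    Entry("Adjustment to fee reversal", Decimal("-0.20")),
]

print(balance(ledger))      # 980.00
print(balance(ledger[:2]))  # the balance as it stood right after the error
```

No entry was edited or deleted, so the mistake, the imperfect fix, and the final adjustment all remain visible to anyone auditing the account.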

If you have watched any reasonable amount of science fiction, you should know exactly why we don’t go and try to change the past. Name one time that actually went well! No, like all good protagonists, we have to acknowledge that the past cannot be changed, and that all we can hope for is to do better in the future. It makes no more sense to modify the past in our software than it does in real life. If we go and change the past, the next thing you know Biff has married your mom and you no longer exist. More practically, it’s how things like Enron happen.

There are many other practical benefits to Event Sourcing. Since recording an event is a cheap, append-only write, we can easily scale the processing of incoming data. Since we can process each event sequentially, concurrency issues are also much easier to address. Finally, separating our code into parts that record information coming in, parts that calculate our final model, and parts that act when our model changes makes testing, designing, and debugging this code much simpler.
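That three-way separation might look roughly like the sketch below. `EventLog`, `project_score`, and `make_lead_watcher` are hypothetical names chosen for this example, and an in-memory list stands in for whatever durable store a real system would use:

```python
from typing import NamedTuple


class BasketMade(NamedTuple):
    team: str
    points: int


class EventLog:
    """Recording: append events and notify subscribers, nothing more."""

    def __init__(self):
        self._events = []
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def append(self, event):
        self._events.append(event)
        for handler in self._subscribers:
            handler(event)

    def replay(self):
        return tuple(self._events)  # read-only view of the history


def project_score(events):
    """Calculating: a pure function from events to the current model."""
    score = {}
    for event in events:
        score[event.team] = score.get(event.team, 0) + event.points
    return score


def make_lead_watcher(log, announcements):
    """Acting: note a message whenever a different team takes sole lead."""
    state = {"leader": None}

    def on_event(_event):
        score = project_score(log.replay())
        top = max(score.values())
        leaders = [team for team, points in score.items() if points == top]
        if len(leaders) == 1 and leaders[0] != state["leader"]:
            state["leader"] = leaders[0]
            announcements.append(f"{leaders[0]} takes the lead")

    return on_event


# Invented sample data.
announcements = []
log = EventLog()
log.subscribe(make_lead_watcher(log, announcements))
for event in [BasketMade("Home", 2), BasketMade("Away", 3),
              BasketMade("Home", 1), BasketMade("Home", 2)]:
    log.append(event)

print(project_score(log.replay()))
print(announcements)
```

Each piece can be tested on its own: `project_score` needs nothing but a list of events, and the watcher can be exercised by appending events to a fresh log.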

In our follow-up article, we go into the details of how we have used Event Sourcing here at eBay in the implementation of our continuous delivery system. The progress of code through a development pipeline is a natural match for Event Sourcing, and we would love to share our experience and challenges in making it work.

Tags: Coding Practices, Event Sourcing