Consistency and Aggregates in Event Sourcing

Learn how we ensure data consistency in event sourcing with effective use of aggregates, enhancing system reliability and performance.

Black Friday is coming soon, so let’s talk about warehouse management and event sourcing.

When designing an event-sourced system with aggregates, several very different designs are possible. If you think of an aggregate as a transaction boundary, then each decision about where to draw that boundary has its own implications.

The aggregate can also be a lifecycle boundary - events in a global unified stream can often only be discarded one aggregate root at a time.

In this sense, it is always very interesting when people come up with completely different solutions to the same problem. This is exactly what happened when Christian Folie and I were talking about an event-driven inventory problem.

Warehouse Management Domain

Let’s say we have a warehouse management solution that handles products, locations and sales, linking them together.

In this kata, we focus on locations. Each location is a place where products can be placed. A location can be a shelf, a table, a bin, a box, or one of many other variations.

For the purposes of this kata, we will assume that we are only concerned with box locations for now. These are the cartons that are placed on a picking cart.

Boxes are short-lived:

  • A customer order comes in.

  • The warehouse employee starts picking the items for the order. They take an empty box or bin and create an ID for it. At this point, we run "AddLocations".

  • Usually, warehouse workers have a batch picking cart, so they prepare a dozen boxes in advance. This way, they can go through the warehouse only once and prepare a dozen orders in one go. So the system creates and prints labels for a dozen boxes at once.

  • When picking is complete, the boxes are transferred to quality assurance and then shipped. They disappear from the system. In rare cases, if something goes wrong, they can live on for a few more days until the problem is fixed.

API for managing boxes

Let's define an API for the system that can handle location creation, using proto3 (why proto3? See the footnote below):

message AddLocationsReq {
  repeated string name = 1;
  // location details contents go here
}

message AddLocationsResp {
  repeated uint64 id = 1;
}

Note the repeated fields, which mean that each message can carry multiple location items. This allows creating a batch of locations at once. Warehouse management loves batching.

The service itself could look like this:

service InventoryService {
  rpc AddLocations(AddLocationsReq) returns (AddLocationsResp);
}

Given this design, we could have different implementations with different tradeoffs: both an eventually consistent system that treats each location as a separate aggregate, and an immediately consistent system that treats the entire warehouse as a single large aggregate.
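To make the second option a little more concrete, here is a minimal Python sketch of a warehouse-wide aggregate. The names `LocationAdded`, `Warehouse` and `replay`, as well as the idea that the aggregate's job is to hand out sequential IDs, are assumptions made for illustration only. They are not part of the API above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LocationAdded:
    """Hypothetical event: a box location was created."""
    id: int
    name: str


@dataclass
class Warehouse:
    """Single large aggregate: all locations share one event stream."""
    next_id: int = 1
    events: list = field(default_factory=list)

    def apply(self, event: LocationAdded) -> None:
        # State changes only by applying events, so a replay rebuilds it.
        self.next_id = max(self.next_id, event.id + 1)
        self.events.append(event)

    def add_locations(self, names: list[str]) -> list[int]:
        # The whole batch is decided against one aggregate in one step.
        new_events = [
            LocationAdded(id=self.next_id + i, name=name)
            for i, name in enumerate(names)
        ]
        for event in new_events:
            self.apply(event)
        return [event.id for event in new_events]


def replay(events: list[LocationAdded]) -> Warehouse:
    """Rebuild the aggregate from its stored events."""
    warehouse = Warehouse()
    for event in events:
        warehouse.apply(event)
    return warehouse
```

With one aggregate per location, there would be no shared `next_id` to protect. ID allocation would have to happen somewhere else, which is one of the tradeoffs to weigh.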

Let's ignore the implementation and focus on the API for now.

Consistency Semantics

Regardless of the implementation, this API can support both eventual and immediate consistency because of the behavior contract:

  • When executing a request, the client passes an “idempotency-key” header with a unique UUID, so that a retry after a failure can be recognized as the same request. See the Stripe documentation on idempotency.

  • If the service returns status code 202 (the same meaning as HTTP 202 Accepted), or if a transient failure occurs, the client should resend the same request with the same idempotency key.

An eventually consistent implementation can then always return status code 202 on the first attempt and instruct the client to try again with the same idempotency key. The client keeps retrying until the status is OK. An OK result also carries the response data (e.g., IDs for the newly created entities).
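The client side of this contract could be sketched as follows. This is an in-process illustration, not a real gRPC client: `FakeEventualService`, `call_with_retry` and the numeric status constants are hypothetical names chosen for the example.

```python
import uuid

OK, ACCEPTED = 200, 202  # assumed status codes, mirroring HTTP


class FakeEventualService:
    """Hypothetical stand-in for an eventually consistent server."""

    def __init__(self):
        self._pending = {}   # idempotency key -> requested names
        self._results = {}   # idempotency key -> assigned IDs
        self._next_id = 1

    def add_locations(self, names, idempotency_key):
        if idempotency_key in self._results:
            return OK, self._results[idempotency_key]
        if idempotency_key not in self._pending:
            # First attempt: accept the work and answer 202.
            self._pending[idempotency_key] = list(names)
            return ACCEPTED, None
        # Later attempt: pretend background processing has finished.
        names = self._pending.pop(idempotency_key)
        ids = list(range(self._next_id, self._next_id + len(names)))
        self._next_id += len(names)
        self._results[idempotency_key] = ids
        return OK, ids


def call_with_retry(service, names, max_attempts=5):
    """Resend the same request with the same key until the status is OK."""
    key = str(uuid.uuid4())  # one key per logical request, reused on retry
    for _ in range(max_attempts):
        status, ids = service.add_locations(names, idempotency_key=key)
        if status == OK:
            return ids
    raise TimeoutError("request was not completed in time")
```

A real client would also wait between attempts and treat transient transport errors the same way as a 202. Both are left out here to keep the sketch short.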

An immediately consistent implementation will always return OK on the first attempt. In case of a network problem, clients can still rely on the idempotency key to retrieve the results.
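One way the server side could honor the idempotency key is to remember the response it gave for each key. The following Python sketch assumes an in-memory dictionary and a hypothetical `ImmediateService` class. A real service would need durable storage and an expiry policy for old keys.

```python
class ImmediateService:
    """Hypothetical immediately consistent server with an idempotency cache."""

    def __init__(self):
        self._responses = {}  # idempotency key -> IDs already returned
        self._next_id = 1

    def add_locations(self, names, idempotency_key):
        # A repeated key means the client is retrying: replay the old answer.
        if idempotency_key in self._responses:
            return 200, self._responses[idempotency_key]
        ids = list(range(self._next_id, self._next_id + len(names)))
        self._next_id += len(names)
        self._responses[idempotency_key] = ids
        return 200, ids


service = ImmediateService()
first = service.add_locations(["box-1", "box-2"], "key-123")
retry = service.add_locations(["box-1", "box-2"], "key-123")
assert first == retry  # the retry creates no duplicate locations
```

If the first response is lost on the network, the retry with the same key returns the same IDs instead of creating a second batch of boxes.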

Kata

How would you design the locations part of an event-driven (because warehouse management loves audit logs and replication) warehouse management system?

We assume the following constraints:

  • We only deal with box locations.

  • Such locations are usually short-lived, existing for 1-2 days at most.

  • A single medium-sized warehouse can handle 10000 orders per day, so there are a great many locations.

  • A single location will probably have 10-30 events in its lifecycle.
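To get a feel for the volumes involved, the constraints can be turned into a quick back-of-the-envelope calculation. The only figure not taken from the list above is `BOXES_PER_ORDER`, which is an assumption (one box location per order).

```python
ORDERS_PER_DAY = 10_000          # from the constraints above
BOXES_PER_ORDER = 1              # assumption: one box location per order
EVENTS_PER_LOCATION = (10, 30)   # from the constraints above
LIFETIME_DAYS = (1, 2)           # from the constraints above

locations_per_day = ORDERS_PER_DAY * BOXES_PER_ORDER
events_per_day = tuple(locations_per_day * n for n in EVENTS_PER_LOCATION)
live_locations = tuple(locations_per_day * d for d in LIFETIME_DAYS)

print(f"new locations per day: {locations_per_day:,}")
print(f"events per day: {events_per_day[0]:,} to {events_per_day[1]:,}")
print(f"locations alive at once: {live_locations[0]:,} to {live_locations[1]:,}")
```

Under these assumptions, a single warehouse-wide stream would grow by 100,000 to 300,000 events per day, while per-location streams would each stay at 10-30 events but number in the tens of thousands.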

How would you design such an API? What would the aggregates look like? What stack would you use?

Footnote: Why gRPC/proto3 spec?

Because it is unambiguous and can be used to generate code contracts in any common language. For example, one person could implement service testers in Golang, someone else could implement one consistency flavor of the server in F#, and a third person could implement the other flavor in Python. We could then plug these together, compare the results and discuss them!

But that is not necessary. The logic remains the same whether the service is implemented in plain HTTP/JSON or something else. The only thing that would be lost here would be seamless interoperability between implementations written in different languages.
