guh.me - gustavo's personal blog

Building Event-Driven Microservices

These are my personal notes on the book Building Event-Driven Microservices.

Chapter 1: Why Event-Driven Microservices

Core Concepts

In event-driven microservices architecture, systems communicate by issuing and consuming events. Unlike messages in traditional message-passing systems, events are not destroyed upon consumption but remain available for other consumers.

Key Characteristics:

Communication Structures

Three Types of Communication Structures:

  1. Business Communication Structure - Communication between teams and departments
  2. Implementation Communication Structure - Data and logic pertaining to subdomain models
  3. Data Communication Structure - The process by which data is communicated across the business and between implementations

Conway’s Law and Communication Structures

Organizations’ communication structures greatly influence engineering implementations at both organizational and team levels.

Event-Driven Communication Benefits

Events Are the Basis of Communication:

Event Streams Provide Single Source of Truth:

Key Advantages:


Chapter 2: Event-Driven Microservice Fundamentals

Microservice Types

Consumer Microservices: Consume and process events from input streams
Producer Microservices: Produce events to streams for other services
Hybrid: Both consumer and producer (most common)

Topology Concepts

Microservice Topology: Event-driven topology internal to a single microservice
Business Topology: Set of microservices, event streams, and APIs fulfilling complex business functions

Event Structure

Events contain:

Table-Stream Duality

Materializing State from Entity Events:

Core Principles

Microservice Single Writer Principle:

Event Broker Features:

Consumption Patterns

Stream Consumption: Each consumer maintains its own offset pointer
Queue Consumption: Each event consumed by one and only one instance
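The two patterns can be sketched in a few lines of Python. `EventLog`, `stream_consume`, and `queue_consume` are illustrative stand-ins, not a real broker client:

```python
# Toy illustration of the two consumption patterns. All names here
# are hypothetical, not from the book or any broker library.

class EventLog:
    """An append-only log of events, as stored by an event broker."""
    def __init__(self):
        self.events = []

    def append(self, event):
        self.events.append(event)

def stream_consume(log, offsets, consumer_id):
    """Stream pattern: each consumer tracks its own offset; events
    remain in the log for other consumers to read independently."""
    offset = offsets.get(consumer_id, 0)
    batch = log.events[offset:]
    offsets[consumer_id] = len(log.events)
    return batch

def queue_consume(log, shared_offset):
    """Queue pattern: one shared cursor; each event is delivered to
    exactly one consumer instance."""
    if shared_offset[0] >= len(log.events):
        return None
    event = log.events[shared_offset[0]]
    shared_offset[0] += 1
    return event
```

Note that in the stream pattern both consumers see all three events, while in the queue pattern each event is handed out once.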


Chapter 3: Communication and Data Contracts

Fundamental Communication Problem

“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.” - Claude Shannon

Data Contracts

Two Components of Well-Defined Data Contract:

  1. Data Definition - What will be produced (fields, types, data structures)
  2. Triggering Logic - Why it is produced (business logic that triggered event creation)

Schema Management

Schema Benefits:

Compatibility Types:

Schema Evolution Best Practices

Event Design Principles

Tell the Truth, the Whole Truth, and Nothing but the Truth:

Best Practices:


Chapter 4: Integrating Event-Driven Architectures with Existing Systems

Data Liberation

Definition: Identification and publication of cross-domain data sets to corresponding event streams as part of migration strategy.

Goals:

Data Liberation Patterns

Three Main Patterns:

  1. Query-based - Extract data by querying underlying state store
  2. Log-based - Extract data by following append-only log for changes
  3. Table-based - Push data to output queue table, then emit to event streams

Critical Requirement: All patterns must produce events in sorted timestamp order using source record’s updated_at time.
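The query-based pattern, including the sorted-timestamp requirement, can be sketched with an in-memory SQLite table as the source state store. The `customers` table and its columns are hypothetical:

```python
# Minimal sketch of query-based data liberation. The key point is
# emitting changed rows in updated_at order, tracking a watermark
# so the next poll only fetches newer changes.
import sqlite3

def liberate_changes(conn, last_seen_ts):
    """Return rows updated since the last poll, sorted by updated_at
    so downstream consumers see events in timestamp order."""
    rows = conn.execute(
        "SELECT id, payload, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at ASC",
        (last_seen_ts,),
    ).fetchall()
    new_watermark = rows[-1][2] if rows else last_seen_ts
    return rows, new_watermark
```

A real implementation would emit each row to an event stream; here the rows are simply returned.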

Change-Data Capture

Benefits:

Outbox Pattern

Implementation:
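A minimal sketch of the pattern using SQLite: the business write and the outbox insert share one transaction, so an event is recorded if and only if the state change commits. Table names are hypothetical:

```python
# Outbox pattern sketch. A separate process (drain_outbox) later
# reads the outbox table and emits each row to the event stream.
import sqlite3

def place_order(conn, order_id, payload):
    with conn:  # one atomic transaction covers both writes
        conn.execute("INSERT INTO orders (id, payload) VALUES (?, ?)",
                     (order_id, payload))
        conn.execute("INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
                     ("order_placed", payload))

def drain_outbox(conn):
    """Read pending outbox rows in insertion order and delete them
    once they have been (notionally) published."""
    rows = conn.execute("SELECT rowid, event_type, payload FROM outbox "
                        "ORDER BY rowid").fetchall()
    conn.executemany("DELETE FROM outbox WHERE rowid = ?",
                     [(r[0],) for r in rows])
    conn.commit()
    return rows
```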

Event Sinking

Purpose: Consuming event data and inserting into data stores for non-event-driven applications

Use Cases:


Chapter 5: Event-Driven Processing Basics

Stateless Topologies

Key Concept: Building microservice topology requires event-driven thinking - code executes in response to event arrival.

Topology Components:

Stream Operations

Branching: Apply logical operator and output to new stream based on result
Merging: Consume from multiple streams, process, output to single stream

Important: When merging streams, define new unified schema representative of merged domain.
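Both operations can be sketched as plain functions; the predicate, the input streams, and the unified output schema are illustrative only:

```python
# Branching and merging sketch over in-memory lists of events.

def branch(stream, predicate):
    """Apply a logical operator to each event and route it to one of
    two output streams based on the result."""
    matched, unmatched = [], []
    for event in stream:
        (matched if predicate(event) else unmatched).append(event)
    return matched, unmatched

def merge(streams, to_unified):
    """Consume from multiple streams and output to a single stream,
    mapping each event into one unified schema for the merged domain."""
    return [to_unified(e) for stream in streams for e in stream]
```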

Partition Management

Consumer Groups: Each microservice maintains unique consumer group representing collective offsets

Partition Assignment Strategies: Ensure partitions are evenly distributed across consumer instances

Copartitioning: Event streams with same key, partitioner algorithm, and partition count guarantee data locality for consumer instances.
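The mechanics behind copartitioning are just a deterministic partitioner: two streams hashed with the same algorithm over the same partition count land matching keys on the same partition, so one consumer instance sees both sides. A sketch (the hashing choice here is illustrative, not any broker's actual partitioner):

```python
# Deterministic partitioner sketch: same key -> same partition,
# for any stream using this algorithm and partition count.
import hashlib

def partition_for(key, partition_count):
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % partition_count
```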

Failure Recovery

Stateless Recovery: Effectively same as adding new instance to consumer group - no state restoration required, immediate processing after partition assignment.


Chapter 6: Deterministic Stream Processing

Processing States

Two Main States:

  1. Near real-time processing (typical of long-running microservices)
  2. Historical processing (catching up to present time)

Determinism Goal

Microservice should produce same output whether processing in real-time or catching up to present time.

Timestamp Management

Critical Requirements:

Event Scheduling

Purpose: Process events consistently for reproducible results

Implementation: Select and dispatch event with oldest timestamp from all assigned input partitions.

When Needed: When order of event processing matters to business logic.
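The dispatch logic above can be sketched with a heap: peek at the head of each assigned partition and always emit the oldest timestamp next. Partition and event shapes are hypothetical:

```python
# Event-scheduling sketch: merge per-partition event iterators into
# one globally timestamp-ordered sequence.
import heapq

def schedule(partitions):
    """Yield (timestamp, event) pairs in global timestamp order,
    given each partition is already internally ordered."""
    heap = []
    iters = [iter(p) for p in partitions]
    for i, it in enumerate(iters):
        first = next(it, None)
        if first is not None:
            heapq.heappush(heap, (first[0], i, first[1]))
    while heap:
        ts, i, event = heapq.heappop(heap)
        yield ts, event
        nxt = next(iters[i], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt[0], i, nxt[1]))
```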

Time Types

Event Time: When event actually occurred (most accurate)
Processing Time: When event is processed by consumer
Ingestion Time: When event is ingested into event broker

Best Practice: Use event time when reliable, ingestion time as fallback.

Watermarks and Stream Time

Watermarks: Declaration that all events of time t and prior have been processed

Stream Time: Highest timestamp of processed events - never decreases, useful for coordinating between consumer instances.
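Stream time is simple to maintain, and tracking it also flags out-of-order arrivals as a side effect. A sketch (the class is illustrative):

```python
# Stream-time sketch: the highest event timestamp processed so far.
# It never decreases; an event below it arrived out of order.

class StreamTime:
    def __init__(self):
        self.current = float("-inf")

    def observe(self, event_ts):
        """Advance stream time; return True if the event is in order."""
        in_order = event_ts >= self.current
        self.current = max(self.current, event_ts)
        return in_order
```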

Out-of-Order Events

Definition: An event whose timestamp is earlier than that of events ahead of it in the stream.

Handling Strategies:

Windowing Functions

Tumbling Windows: Fixed size, non-overlapping windows
Sliding Windows: Fixed size with incremental step
Session Windows: Dynamically sized, terminated by timeout
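Tumbling windows are the simplest of the three: each event lands in exactly one fixed-size bucket keyed by the floor of its timestamp. A sketch, with window size and events purely illustrative:

```python
# Tumbling-window sketch: fixed-size, non-overlapping windows keyed
# by (timestamp // size) * size, the window's start time.
from collections import defaultdict

def tumbling_windows(events, size):
    """Group (timestamp, value) events into non-overlapping windows;
    returns {window_start: [values]}."""
    windows = defaultdict(list)
    for ts, value in events:
        windows[(ts // size) * size].append(value)
    return dict(windows)
```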

Reprocessing

Key Capability: Rewind consumer offsets and replay from arbitrary point in time.

Requirement: Event scheduling ensures same processing order during reprocessing as real-time.


Chapter 7: Stateful Streaming

State Management

Materialized State: Projection of events from source event stream (immutable)
State Store: Where service’s business state is stored (mutable)

Changelog Streams

Purpose: Record of all changes made to state store data

Benefits:

Important: Changelog streams should be compacted (only need most recent key/value pairs).
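Why compaction suffices is easy to see in code: replaying a changelog into a map keeps only the latest value per key, which is exactly what materializing the state store needs. A sketch:

```python
# Changelog materialization sketch: replay (key, value) changes into
# a dict. Only the most recent value per key survives, so a compacted
# changelog rebuilds the same state as the full one.

def materialize(changelog):
    """A value of None is a tombstone that deletes the key."""
    state = {}
    for key, value in changelog:
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state
```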

State Store Types

Internal State Store: Coexists in same container/VM as microservice business logic

Global State Store: Materializes all partitions for complete data copy on each instance

Scaling and Recovery

Process: New/recovered instance must materialize state before processing new events

Method: Reload changelog topic for each stateful store (quickest approach)

Hot Replicas: Multiple replicas for faster recovery

State Rebuilding vs Migration

Rebuilding Process:

  1. Stop microservice
  2. Reset consumer offsets to beginning
  3. Delete intermediate state
  4. Start new version - rebuild state from input streams
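The four steps can be sketched as a single routine; `service`, `consumer_offsets`, and `state_store` are hypothetical stand-ins for real deployment and broker operations:

```python
# State-rebuild sketch following the four steps above.

def rebuild_state(service, consumer_offsets, state_store, input_events):
    service["running"] = False            # 1. stop the microservice
    for partition in consumer_offsets:    # 2. reset offsets to beginning
        consumer_offsets[partition] = 0
    state_store.clear()                   # 3. delete intermediate state
    service["running"] = True             # 4. start and rebuild state
    for key, value in input_events:       #    by replaying input streams
        state_store[key] = value
    return state_store
```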

Transactions and Effectively-Once Processing

Goal: Updates to single source of truth consistently applied regardless of failures

Key Features:

Deduplication Challenges:


Chapter 8: Building Workflows with Microservices

Workflow Patterns

Choreography Pattern

Characteristics:

Benefits:

Challenges:

Orchestration Pattern

Characteristics:

Benefits:

Best Practice: Orchestrator’s bounded context limited to workflow logic, workers contain business fulfillment logic.

Distributed Transactions

Definition: Transaction spanning two or more microservices

Implementation: Often known as sagas in event-driven world

Best Practice: Avoid when possible due to significant risk and complexity

Requirements:

Compensation Workflows

Purpose: Handle workflows that don’t need perfect reversibility

Use Case: Customer-facing products where compensatory actions can remedy failures

Benefit: Alternative to complex distributed transactions


Chapter 9: Microservices Using Function-as-a-Service

FaaS Characteristics

Function Behavior:

Think of a FaaS function as a basic consumer/producer that regularly fails and must be restarted.

Design Principles

Bounded Context: Functions and internal event streams must strictly belong to bounded context

Consumer Groups: Each function-based microservice must have independent consumer group

Offset Commits: Best practice is commit only after processing completed

Cold Start vs Warm Start

Cold Start: Default state upon starting first time or after inactivity
Warm Start: Function revived from hibernation cache

Suitable Use Cases

Ideal for:

Security: Use strict access permissions - nothing outside bounded context allowed access

Communication Patterns

Event-Driven: Output of one function produced to event stream for consuming function

Request-Response: Direct calls between functions

Hybrid: Combination of both patterns

Important: Complete processing of one event before processing next to avoid out-of-order issues

Function Optimization

Tuning Considerations:


Chapter 10: Basic Producer and Consumer Microservices

BPC Characteristics

Basic Producer and Consumer (BPC) microservices:

What BPCs Don’t Include:

Suitable Use Cases

Simple Patterns:

Integration Scenarios:

Gating Pattern: Business processes not reliant on event order but requiring all events eventually arrive

Data Layer Heavy: When underlying data layer performs most business logic (geospatial, search, ML/AI)

Hybrid Applications

External Stream Processing: BPC can leverage external stream-processing systems for complex operations while maintaining access to language features and libraries

Example: External framework for complex aggregations, BPC for populating local data store and serving request-response queries

Limitations

BPCs require investment in libraries for:


Chapter 13: Integrating Event-Driven and Request-Response Microservices

Integration Necessity

Event-driven patterns cannot serve all business needs - request-response endpoints provide real-time data serving capabilities.

Use Cases for Request-Response

Integration Patterns

External System Integration:

Human Interface Integration:


Chapter 14: Supportive Tooling

Ownership and Governance

Explicit Ownership Tracking:

Event Stream Management

Creation and Modification Rights:

Schema Registry

Critical Service for schema management providing:

Benefits:

Schema Registry Features:


Chapter 15: Testing Event-Driven Microservices

Testing Levels

Unit Testing: Test smallest pieces of code to ensure expected functionality - foundation for larger tests

Topology Testing: More complex than unit tests - exercises entire topology as specified by business logic

Think of a topology as a single, large, complex function with many moving parts.

Schema Compatibility Testing

Automated Checks: Pull schemas from schema registry and perform evolutionary rule checking as part of code submission process

Ensures: Output schemas compatible with previous schemas according to stream evolution rules
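A toy version of such a check, greatly simplified from real evolution rules (e.g. Avro's): under backward compatibility, data written with the old schema must be readable with the new one, so new fields need defaults, while deleting fields is allowed. The dict-based schema shape here is an assumption for illustration:

```python
# Toy backward-compatibility check. Schemas are dicts of
# field -> {"default": ...} (optional field) or {} (required field).

def backward_compatible(old, new):
    """True if data written with `old` can be read with `new`."""
    for field, spec in new.items():
        if field not in old and "default" not in spec:
            return False  # new required field: old data can't supply it
    return True
```

A CI hook would run this (or a registry's real check) against every schema pulled from the registry before merging.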

Integration Testing

Two Main Flavors:

Local Integration Testing:

Remote Integration Testing:

Testing Strategy

Comprehensive Approach: