Contents
- Choreography vs Orchestration
- Saga Design — Order Example
- Order Service — Initiating the Saga
- Inventory Service — Reserve & Compensate
- Payment Service — Charge & Compensate
- Handling Failures & Compensations
- Pitfalls & Best Practices
| Aspect | Choreography | Orchestration |
|---|---|---|
| Coordinator | None — services react to events | Central saga orchestrator |
| Coupling | Loose — services only know about events | Tighter — orchestrator knows all services |
| Visibility | Hard — flow spread across services | Easy — flow defined in one place |
| Complexity | Lower for simple flows | Lower for complex flows |
| Best for | 2–4 services, simple flows | 5+ services, complex branching |
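For contrast with the event-driven flow used in the rest of this article, the orchestration column can be illustrated with a small sketch in which one component owns both the step order and the compensations. This is plain Java under assumed names (`SagaStep`, `SagaOrchestrator`); it is not tied to any framework and leaves out persistence and retries.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.BooleanSupplier;

// One step of an orchestrated saga: an action that reports success, plus its compensation.
// (Hypothetical type, for illustration only.)
record SagaStep(String name, BooleanSupplier action, Runnable compensation) {}

class SagaOrchestrator {
    // Runs the steps in order. On the first failure, compensates the steps that
    // already completed, in reverse order. Returns true if every step succeeded.
    static boolean run(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            if (step.action().getAsBoolean()) {
                completed.push(step);
            } else {
                while (!completed.isEmpty()) {
                    completed.pop().compensation().run();
                }
                return false;
            }
        }
        return true;
    }
}
```

The whole flow is visible in one method, which is the visibility advantage listed in the table; the price is that the orchestrator has to know every participating step.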
A simple order-placement saga across three services:
- Order Service creates the order (status: PENDING) and publishes OrderCreatedEvent.
- Inventory Service listens, reserves stock, publishes StockReservedEvent or StockReservationFailedEvent.
- Payment Service listens to StockReservedEvent, charges the customer, publishes PaymentSucceededEvent or PaymentFailedEvent.
- Order Service listens to the payment outcome — confirms the order (CONFIRMED) or cancels it (CANCELLED).
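- On failure, the failing service publishes a failure event. The Order Service then cancels the order and publishes OrderCancelledEvent, which the Inventory and Payment services use to undo their earlier steps (release stock, refund payment).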
// Shared event records (in a common library or separate module).
// In a real project each public record lives in its own file.
import java.math.BigDecimal;

public record OrderCreatedEvent(Long orderId, Long customerId,
        Long productId, int quantity, BigDecimal amount) {}
public record StockReservedEvent(Long orderId, Long productId, int quantity) {}
public record StockReservationFailedEvent(Long orderId, String reason) {}
public record PaymentSucceededEvent(Long orderId, String transactionId) {}
public record PaymentFailedEvent(Long orderId, String reason) {}
public record OrderConfirmedEvent(Long orderId) {}
public record OrderCancelledEvent(Long orderId, String reason) {}
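To see how these events chain together, the flow can be traced without Kafka. The sketch below is a toy, single-threaded stand-in: `MiniBus` and `SagaTrace` are hypothetical names, events are reduced to a topic name plus an order ID, and the stock and payment decisions are passed in as booleans rather than computed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

// Toy in-memory event bus (hypothetical): delivers each published event
// to the subscribers of its topic, in FIFO order.
class MiniBus {
    private final Map<String, List<Consumer<Long>>> subscribers = new HashMap<>();
    private final Queue<Map.Entry<String, Long>> pending = new ArrayDeque<>();
    final List<String> log = new ArrayList<>();

    void subscribe(String topic, Consumer<Long> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    void publish(String topic, Long orderId) {
        log.add(topic);
        pending.add(Map.entry(topic, orderId));
    }

    // Drains the queue until no handler publishes anything new.
    void run() {
        while (!pending.isEmpty()) {
            Map.Entry<String, Long> event = pending.poll();
            for (Consumer<Long> handler : subscribers.getOrDefault(event.getKey(), List.of())) {
                handler.accept(event.getValue());
            }
        }
    }
}

class SagaTrace {
    // Wires the three services as lambdas and returns the topics published, in order.
    static List<String> trace(boolean stockAvailable, boolean paymentAccepted) {
        MiniBus bus = new MiniBus();
        // Inventory Service
        bus.subscribe("order.created", id ->
                bus.publish(stockAvailable ? "stock.reserved" : "stock.reservation.failed", id));
        // Payment Service
        bus.subscribe("stock.reserved", id ->
                bus.publish(paymentAccepted ? "payment.succeeded" : "payment.failed", id));
        // Order Service
        bus.subscribe("payment.succeeded", id -> bus.publish("order.confirmed", id));
        bus.subscribe("payment.failed", id -> bus.publish("order.cancelled", id));
        bus.subscribe("stock.reservation.failed", id -> bus.publish("order.cancelled", id));

        bus.publish("order.created", 1L);
        bus.run();
        return bus.log;
    }
}
```

With this wiring, `SagaTrace.trace(true, false)` produces the sequence order.created, stock.reserved, payment.failed, order.cancelled, which is the compensation path described in the list above.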
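The Order Service starts the saga: it saves the order as PENDING, publishes OrderCreatedEvent, and later confirms or cancels the order when the outcome events arrive. Key points are noted in the inline comments.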
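import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import org.springframework.transaction.support.TransactionSynchronization;
import org.springframework.transaction.support.TransactionSynchronizationManager;

@Service
public class OrderService {

    private final OrderRepository orderRepo;
    private final KafkaTemplate<String, Object> kafka;

    public OrderService(OrderRepository orderRepo, KafkaTemplate<String, Object> kafka) {
        this.orderRepo = orderRepo;
        this.kafka = kafka;
    }

    @Transactional
    public Order createOrder(CreateOrderRequest req) {
        Order order = new Order(req.customerId(), req.productId(),
                req.quantity(), req.amount(), OrderStatus.PENDING);
        orderRepo.save(order);
        // Publish the event only AFTER the local transaction commits, so consumers
        // never see an order that was rolled back. The send can still fail after the
        // commit; the Outbox Pattern closes that gap.
        TransactionSynchronizationManager.registerSynchronization(new TransactionSynchronization() {
            @Override
            public void afterCommit() {
                kafka.send("order.created",
                        String.valueOf(order.getId()),
                        new OrderCreatedEvent(order.getId(), order.getCustomerId(),
                                order.getProductId(), order.getQuantity(), order.getAmount()));
            }
        });
        return order;
    }

    // Listen for saga completion events.
    // Note: the sends below still run inside the listener's transaction; apply the
    // same after-commit approach (or the Outbox Pattern) in production code.
    @KafkaListener(topics = "payment.succeeded", groupId = "order-service")
    @Transactional
    public void onPaymentSucceeded(PaymentSucceededEvent event) {
        orderRepo.updateStatus(event.orderId(), OrderStatus.CONFIRMED);
        kafka.send("order.confirmed", String.valueOf(event.orderId()),
                new OrderConfirmedEvent(event.orderId()));
    }

    @KafkaListener(topics = {"payment.failed", "stock.reservation.failed"}, groupId = "order-service")
    @Transactional
    public void onSagaFailed(Object event) {
        // Pattern matching for switch requires Java 21 or later
        Long orderId = switch (event) {
            case PaymentFailedEvent e -> e.orderId();
            case StockReservationFailedEvent e -> e.orderId();
            default -> throw new IllegalArgumentException("Unknown event: " + event);
        };
        orderRepo.updateStatus(orderId, OrderStatus.CANCELLED);
        kafka.send("order.cancelled", String.valueOf(orderId),
                new OrderCancelledEvent(orderId, "Saga failure"));
    }
}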
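Because outcome events can be redelivered or arrive late, it can help to make the order's status transitions explicit instead of overwriting the status unconditionally. The following is a minimal sketch, assuming a three-state `OrderStatus` like the one used above; `SagaOrder` and its `apply` method are hypothetical names, not part of the service shown.

```java
// Statuses used by the saga; PENDING is the only non-terminal state.
enum OrderStatus { PENDING, CONFIRMED, CANCELLED }

// Hypothetical helper that guards status transitions.
class SagaOrder {
    private OrderStatus status = OrderStatus.PENDING;

    OrderStatus status() {
        return status;
    }

    // Returns true if the status changed, false if the event was a duplicate
    // or arrived after the saga had already reached a terminal state.
    boolean apply(OrderStatus target) {
        if (target == OrderStatus.PENDING) {
            throw new IllegalArgumentException("Cannot move back to PENDING");
        }
        if (status != OrderStatus.PENDING) {
            return false; // already terminal: ignore late or repeated events
        }
        status = target;
        return true;
    }
}
```

A handler built on this guard would publish OrderConfirmedEvent or OrderCancelledEvent only when `apply` returns true, so a redelivered payment event does not trigger a second round of downstream work.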
The Inventory Service reserves stock when an order is created and releases it again if the order is cancelled. Key points are noted in the inline comments.
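import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class InventoryService {

    private final StockRepository stockRepo;
    private final KafkaTemplate<String, Object> kafka;

    public InventoryService(StockRepository stockRepo, KafkaTemplate<String, Object> kafka) {
        this.stockRepo = stockRepo;
        this.kafka = kafka;
    }

    @KafkaListener(topics = "order.created", groupId = "inventory-service")
    @Transactional
    public void onOrderCreated(OrderCreatedEvent event) {
        // Lock the stock row so concurrent reservations cannot oversell
        Stock stock = stockRepo.findByProductIdWithLock(event.productId())
                .orElseThrow(() -> new ProductNotFoundException(event.productId()));
        if (stock.getAvailable() < event.quantity()) {
            // Business failure: tell the Order Service so it can cancel the order
            kafka.send("stock.reservation.failed", String.valueOf(event.orderId()),
                    new StockReservationFailedEvent(event.orderId(), "Insufficient stock"));
            return;
        }
        // Reserve stock in the local transaction
        stock.reserve(event.quantity());
        stockRepo.save(stock);
        kafka.send("stock.reserved", String.valueOf(event.orderId()),
                new StockReservedEvent(event.orderId(), event.productId(), event.quantity()));
    }

    // Compensating transaction: release stock when the order is cancelled.
    // Assumes reservations are recorded per order, and that releasing is a no-op
    // when nothing was reserved (e.g. the order was cancelled for lack of stock).
    @KafkaListener(topics = "order.cancelled", groupId = "inventory-service")
    @Transactional
    public void onOrderCancelled(OrderCancelledEvent event) {
        stockRepo.releaseReservation(event.orderId());
    }
}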
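A call like `releaseReservation(event.orderId())` only works if each reservation is recorded against its order. One way to do that, sketched here with an in-memory map standing in for a reservations table, is to key reservations by order ID, which also makes both operations safe to repeat. `ReservationLedger` is a hypothetical class for a single product, not the repository used above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for one product's stock row plus a reservations table.
class ReservationLedger {
    private int available;
    private final Map<Long, Integer> reservedByOrder = new HashMap<>();

    ReservationLedger(int initialStock) {
        this.available = initialStock;
    }

    int available() {
        return available;
    }

    // Reserves stock for an order. A repeated call for the same order is a
    // no-op that reports success, so a redelivered event does not double-reserve.
    boolean reserve(long orderId, int quantity) {
        if (reservedByOrder.containsKey(orderId)) {
            return true;
        }
        if (quantity > available) {
            return false; // caller would publish StockReservationFailedEvent
        }
        available -= quantity;
        reservedByOrder.put(orderId, quantity);
        return true;
    }

    // Compensation: returns the reserved quantity to stock.
    // Safe to call twice, or for an order that never reserved anything.
    void release(long orderId) {
        Integer quantity = reservedByOrder.remove(orderId);
        if (quantity != null) {
            available += quantity;
        }
    }
}
```

In a database-backed version the map would be a table with a unique constraint on the order ID, updated in the same transaction as the stock row.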
The Payment Service charges the customer once stock is reserved and refunds the charge if the order is later cancelled. Key points are noted in the inline comments.
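import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class PaymentService {

    private final PaymentRepository paymentRepo;
    private final PaymentGateway gateway;
    // Assumed client for looking up order details from the Order Service;
    // the original listing used it without declaring it.
    private final OrderClient orderClient;
    private final KafkaTemplate<String, Object> kafka;

    public PaymentService(PaymentRepository paymentRepo, PaymentGateway gateway,
            OrderClient orderClient, KafkaTemplate<String, Object> kafka) {
        this.paymentRepo = paymentRepo;
        this.gateway = gateway;
        this.orderClient = orderClient;
        this.kafka = kafka;
    }

    @KafkaListener(topics = "stock.reserved", groupId = "payment-service")
    @Transactional
    public void onStockReserved(StockReservedEvent event) {
        // StockReservedEvent carries no customer or amount, so fetch the order
        Order order = orderClient.getOrder(event.orderId());
        try {
            String txId = gateway.charge(order.getCustomerId(), order.getAmount());
            Payment payment = new Payment(event.orderId(), txId, PaymentStatus.SUCCEEDED);
            paymentRepo.save(payment);
            kafka.send("payment.succeeded", String.valueOf(event.orderId()),
                    new PaymentSucceededEvent(event.orderId(), txId));
        } catch (PaymentDeclinedException ex) {
            // Business failure: the Order Service cancels the order in response
            kafka.send("payment.failed", String.valueOf(event.orderId()),
                    new PaymentFailedEvent(event.orderId(), ex.getMessage()));
        }
    }

    // Compensating transaction: refund when the order is cancelled after payment.
    // The status filter makes a redelivered cancellation a no-op.
    @KafkaListener(topics = "order.cancelled", groupId = "payment-service")
    @Transactional
    public void onOrderCancelled(OrderCancelledEvent event) {
        paymentRepo.findByOrderId(event.orderId())
                .filter(p -> p.getStatus() == PaymentStatus.SUCCEEDED)
                .ifPresent(p -> {
                    gateway.refund(p.getTransactionId());
                    p.setStatus(PaymentStatus.REFUNDED);
                    paymentRepo.save(p);
                });
    }
}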
Each service must handle two failure scenarios:
- Business failure — e.g., insufficient stock. Publish a failure event immediately so other services can compensate.
- Technical failure — e.g., Kafka is down when publishing. Use the Outbox Pattern to guarantee delivery (see the Outbox article).
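The Outbox Pattern mentioned in the second item can be outlined in a few lines. This is a simplified, in-memory sketch: the `outbox` list stands in for a database table that is written in the same transaction as the business change, and `Broker` is a hypothetical interface used in place of a Kafka producer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a message broker client.
// send may throw a RuntimeException if the broker is unavailable.
interface Broker {
    void send(String topic, String payload);
}

class OutboxRelay {
    // One row of the outbox "table".
    record OutboxRow(String topic, String payload) {}

    private final List<OutboxRow> outbox = new ArrayList<>();

    // Called inside the business transaction: the event is stored, not sent.
    void enqueue(String topic, String payload) {
        outbox.add(new OutboxRow(topic, payload));
    }

    int pending() {
        return outbox.size();
    }

    // Called periodically by a background job. A row is removed only after a
    // successful send, so a broker outage delays events instead of losing them.
    // Returns the number of rows sent in this pass.
    int flush(Broker broker) {
        int sent = 0;
        Iterator<OutboxRow> rows = outbox.iterator();
        while (rows.hasNext()) {
            OutboxRow row = rows.next();
            try {
                broker.send(row.topic(), row.payload());
            } catch (RuntimeException e) {
                break; // keep this row and those behind it; retry on the next flush
            }
            rows.remove();
            sent++;
        }
        return sent;
    }
}
```

Because the send and the row removal are not atomic, a crash between them can publish the same event twice, which is one reason consumers need to be idempotent.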
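# application.properties: Kafka listener settings for the Order Service
spring.kafka.consumer.group-id=order-service
# Commit the offset after each record has been processed
spring.kafka.listener.ack-mode=RECORD
# Number of concurrent listener threads. This does not configure retries:
# retry and dead-letter routing are set up in code on the listener container's
# error handler (for example DefaultErrorHandler with DeadLetterPublishingRecoverer).
spring.kafka.listener.concurrency=3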
// Dead-letter topic handler: log and alert on unrecoverable failures
@KafkaListener(topics = "order.created.DLT", groupId = "order-service-dlt")
public void handleDlt(OrderCreatedEvent event,
        @Header(KafkaHeaders.RECEIVED_TOPIC) String topic) {
    log.error("Failed to process event from topic={} event={}", topic, event);
    alertService.notifyOps("Saga DLT message on " + topic);
}
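The retry-then-dead-letter behaviour that feeds a DLT topic can be illustrated without Spring. The sketch below is plain Java under assumed names (`RetryingConsumer`, `maxAttempts`, `deadLetters`); in a Spring Kafka application this logic would normally come from the listener container's error handler rather than being written by hand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical wrapper: retries a handler a fixed number of times, then
// parks the event in a dead-letter list instead of blocking the stream.
class RetryingConsumer<T> {
    private final Consumer<T> handler;
    private final int maxAttempts;
    final List<T> deadLetters = new ArrayList<>();

    RetryingConsumer(Consumer<T> handler, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.handler = handler;
        this.maxAttempts = maxAttempts;
    }

    // Processes one event and returns the number of attempts made.
    int consume(T event) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handler.accept(event);
                return attempt;
            } catch (RuntimeException e) {
                // Treated as transient: try again. A real implementation
                // would wait (back off) between attempts.
            }
        }
        deadLetters.add(event); // retries exhausted
        return maxAttempts;
    }
}
```

Events that end up in `deadLetters` correspond to the messages the DLT listener above logs and alerts on.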
- Idempotent consumers: With the usual commit-after-processing setup, Kafka delivers at-least-once, so the same event can arrive more than once. Every event handler must be idempotent: check whether the saga step has already completed before acting.
- Publish events after commit: Never publish a Kafka event from inside a database transaction before it commits; if the transaction rolls back, consumers will have acted on data that does not exist. Use @TransactionalEventListener(phase = TransactionPhase.AFTER_COMMIT) or the Outbox Pattern.
- Correlate by saga ID: Use the orderId (or a dedicated sagaId) as the Kafka message key so that events for one saga on the same topic land on the same partition and keep their order.
- Track saga state: Store the current saga status in the database (e.g., OrderStatus) so you can query and monitor progress.
- Timeout compensation: Use a scheduler to detect stuck sagas (e.g., orders that have been PENDING for more than 10 minutes) and trigger cancellation.
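The timeout rule in the last item can be sketched as a pure function that a scheduler would call periodically. `PendingOrder` and `StuckSagaDetector` are hypothetical names, and the threshold is passed in as a parameter rather than fixed at the 10 minutes used as an example in the list.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical projection of an order that is still PENDING.
record PendingOrder(long orderId, Instant createdAt) {}

class StuckSagaDetector {
    // Returns the IDs of pending orders created earlier than (now - timeout).
    static List<Long> findStuck(List<PendingOrder> pending, Instant now, Duration timeout) {
        Instant cutoff = now.minus(timeout);
        return pending.stream()
                .filter(order -> order.createdAt().isBefore(cutoff))
                .map(PendingOrder::orderId)
                .toList();
    }
}
```

A scheduled job would cancel each returned order and publish OrderCancelledEvent so that the other services run their compensations. Passing `now` in explicitly keeps the function easy to test.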