In a monolith, you had transactions. In microservices, you have heartache.
When your transportation management system routes a delivery across multiple microservices—booking a shipment, reserving capacity, charging a customer—you can’t just wrap it in a database transaction. The booking service lives in a different database. So does the payment service. Welcome to the world of distributed transactions.
Each service still gets ACID guarantees inside its own database, but nothing makes the whole operation atomic across services. The requirement to keep your system consistent? That’s still there.
The Problem: Distributed Monstrosity
Imagine this scenario in our real-time logistics platform:
1. Customer requests a delivery from NYC to Boston.
2. Shipment Service creates a shipment order.
3. Capacity Service reserves truck space.
4. Billing Service charges the customer’s account.
5. Notification Service sends a confirmation.
What if step 3 fails? The shipment is booked, but the truck is full. Now you’ve sold something you can’t deliver. Or step 4 fails—the payment is declined. Your shipment is reserved, but unpaid.
In a monolith, you’d roll back the transaction. In microservices, you need something cleverer.
Enter the Saga Pattern
A Saga is a sequence of local transactions, each triggered by the completion of the previous one. If any step fails, you execute compensating transactions that undo the steps already committed, typically in reverse order.
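The definition above can be sketched in a few lines of Python. This is a minimal in-process illustration, not a production saga engine: `SagaStep`, `run_saga`, and the three example steps are hypothetical names, and the "services" are plain functions that append to a list.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]      # the local transaction
    compensate: Callable[[], None]  # undoes the action once it has committed


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            # The failed step never committed, so only earlier steps are undone.
            for done in reversed(completed):
                done.compensate()
            return False
        completed.append(step)
    return True


log: List[str] = []


def fail_payment() -> None:
    # Simulated failure in the billing step (assumption for the demo).
    raise RuntimeError("payment declined")


steps = [
    SagaStep("shipment", lambda: log.append("create shipment"),
             lambda: log.append("cancel shipment")),
    SagaStep("capacity", lambda: log.append("reserve capacity"),
             lambda: log.append("release capacity")),
    SagaStep("billing", fail_payment, lambda: log.append("refund payment")),
]

succeeded = run_saga(steps)
```

Here the billing step raises, so `run_saga` releases the capacity and then cancels the shipment; no refund is issued because the charge never went through.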
There are two flavors:
1. Choreography (Event-Driven)
Each service publishes an event when it completes. Other services listen and react.
If Billing fails, it publishes PaymentFailed. Capacity Service listens and releases the reservation. Shipment Service cancels the order.
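The failure path just described can be simulated with a toy in-process event bus. This is only a sketch under assumptions: `EventBus` stands in for a real message broker, and every event name other than `PaymentFailed` (`DeliveryRequested`, `ShipmentCreated`, `CapacityReserved`) is hypothetical.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Tuple

Handler = Callable[[dict], None]


class EventBus:
    """Toy in-process stand-in for a message broker (hypothetical)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)
        self._queue: Deque[Tuple[str, dict]] = deque()
        self.history: List[str] = []

    def subscribe(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def publish(self, event: str, payload: dict) -> None:
        self._queue.append((event, payload))

    def drain(self) -> None:
        # Deliver events in publication order until none remain.
        while self._queue:
            event, payload = self._queue.popleft()
            self.history.append(event)
            for handler in self._handlers[event]:
                handler(payload)


bus = EventBus()
state = {"shipment": None, "capacity": None}


def on_delivery_requested(payload: dict) -> None:  # Shipment Service
    state["shipment"] = "created"
    bus.publish("ShipmentCreated", payload)


def on_shipment_created(payload: dict) -> None:  # Capacity Service
    state["capacity"] = "reserved"
    bus.publish("CapacityReserved", payload)


def on_capacity_reserved(payload: dict) -> None:  # Billing Service
    # Simulate a declined charge (assumption for the demo).
    bus.publish("PaymentFailed", payload)


def release_capacity(payload: dict) -> None:  # Capacity Service compensation
    state["capacity"] = "released"


def cancel_shipment(payload: dict) -> None:  # Shipment Service compensation
    state["shipment"] = "cancelled"


bus.subscribe("DeliveryRequested", on_delivery_requested)
bus.subscribe("ShipmentCreated", on_shipment_created)
bus.subscribe("CapacityReserved", on_capacity_reserved)
bus.subscribe("PaymentFailed", release_capacity)
bus.subscribe("PaymentFailed", cancel_shipment)

bus.publish("DeliveryRequested", {"order_id": "ord-1"})
bus.drain()
```

No component holds the whole flow here: it exists only as the set of subscriptions, which is exactly why it is hard to see end to end.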
Pros: Loosely coupled, no central orchestrator.
Cons: Hard to see the full flow. Difficult to test. Requires robust error handling everywhere.
2. Orchestration (Centralized)
A central orchestrator (like Azure Durable Functions) tells each service what to do and when.
Pros: Clear, testable flow. Easy to see the entire transaction.
Cons: The orchestrator can become a bottleneck and a single point of failure unless it runs redundantly.
Implementation: Orchestrating with Azure Durable Functions
Here’s how we handle delivery booking using orchestration:
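As a minimal sketch of that flow (not the platform's production code), the Python below mirrors the generator style that Durable Functions orchestrators use in Python, where the orchestrator yields `context.call_activity(...)` and resumes with the result. `LocalContext`, `run_orchestration`, and every activity name except `ProcessPaymentActivity` are hypothetical stand-ins, so the example runs without the Azure SDK.

```python
from typing import Any, Callable, Dict, Generator, List, Tuple


class LocalContext:
    """Local stand-in for an orchestration context; NOT the Azure SDK class."""

    def __init__(self, input_: Any) -> None:
        self._input = input_

    def get_input(self) -> Any:
        return self._input

    def call_activity(self, name: str, input_: Any) -> Tuple[str, Any]:
        # Only describes the work; the driver below executes it.
        return (name, input_)


def book_delivery(context: LocalContext) -> Generator[Tuple[str, Any], Any, dict]:
    order = context.get_input()
    compensations: List[Tuple[str, Any]] = []  # undo actions for committed steps
    try:
        shipment_id = yield context.call_activity("CreateShipmentActivity", order)
        compensations.append(("CancelShipmentActivity", shipment_id))

        reservation_id = yield context.call_activity("ReserveCapacityActivity", shipment_id)
        compensations.append(("ReleaseCapacityActivity", reservation_id))

        yield context.call_activity("ProcessPaymentActivity", order)
        yield context.call_activity("SendConfirmationActivity", order)
        return {"status": "booked", "shipment_id": shipment_id}
    except Exception as exc:
        # Undo committed steps in reverse order. A failing compensation
        # propagates to the caller in this sketch.
        for name, arg in reversed(compensations):
            yield context.call_activity(name, arg)
        return {"status": "cancelled", "reason": str(exc)}


def run_orchestration(orchestrator: Callable, activities: Dict[str, Callable],
                      input_: Any) -> Tuple[dict, List[str]]:
    """Hypothetical local driver: runs each requested activity and feeds the
    result (or the raised exception) back into the orchestrator."""
    gen = orchestrator(LocalContext(input_))
    calls: List[str] = []
    try:
        request = next(gen)
        while True:
            name, arg = request
            calls.append(name)
            try:
                result = activities[name](arg)
            except Exception as exc:
                request = gen.throw(exc)
            else:
                request = gen.send(result)
    except StopIteration as done:
        return done.value, calls


def decline_payment(order: dict) -> None:
    # Simulated billing failure (assumption for the demo).
    raise RuntimeError("payment declined")


activities = {
    "CreateShipmentActivity": lambda order: "shp-1",
    "ReserveCapacityActivity": lambda shipment_id: "res-1",
    "ProcessPaymentActivity": decline_payment,
    "SendConfirmationActivity": lambda order: None,
    "CancelShipmentActivity": lambda shipment_id: None,
    "ReleaseCapacityActivity": lambda reservation_id: None,
}

result, calls = run_orchestration(book_delivery, activities,
                                  {"order_id": "ord-1", "amount": 120})
```

Because the whole flow lives in one function, the failure path is easy to read and test: the declined payment surfaces as an exception at the `yield`, and the orchestrator releases capacity and cancels the shipment before returning.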
When to Use Saga vs. Other Patterns
Use Saga when:
- You need eventual consistency (not immediate ACID)
- Transactions span multiple services
- You can design compensating transactions
- You’re okay with temporary inconsistency
Don’t use Saga when:
- You absolutely need immediate consistency (use distributed locks, two-phase commit, or keep the operation inside a single service’s database instead)
- Compensating transactions are impossible (you can’t “undo” a physical truck dispatch)
- Latency is critical: saga steps run one after another across services, so a caller waiting for the final outcome pays for every hop
Key Gotchas
Idempotency: Each activity must be idempotent. If Durable Functions retries ProcessPaymentActivity, you don’t want to charge twice.
Compensating Transactions Might Fail: What if the refund fails? You need a dead-letter mechanism.
Testing is Hard: Test the happy path and every possible failure scenario.
Monitoring is Critical: You need visibility into which saga is where in its lifecycle. Add Application Insights logging.
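The first two gotchas can be illustrated together. The sketch below assumes an idempotency key derived from the saga ID and a simple in-memory dead-letter list; `PaymentGateway`, `compensate_with_retry`, and `dead_letters` are hypothetical names, not part of Durable Functions.

```python
from typing import Callable, Dict, List


class PaymentGateway:
    """Hypothetical in-memory billing backend."""

    def __init__(self) -> None:
        self.charges: Dict[str, int] = {}  # idempotency key -> amount charged

    def charge(self, idempotency_key: str, amount: int) -> str:
        # A retry with the same key reuses the earlier charge instead of
        # charging the customer a second time.
        if idempotency_key not in self.charges:
            self.charges[idempotency_key] = amount
        return f"charge:{idempotency_key}"


dead_letters: List[dict] = []


def compensate_with_retry(name: str, action: Callable[[], None],
                          max_attempts: int = 3) -> bool:
    """Try a compensation a few times; park it for manual follow-up if it
    keeps failing."""
    last_error = ""
    for _ in range(max_attempts):
        try:
            action()
            return True
        except Exception as exc:
            last_error = str(exc)
    dead_letters.append(
        {"compensation": name, "attempts": max_attempts, "error": last_error}
    )
    return False


gateway = PaymentGateway()
first = gateway.charge("saga-42:payment", 120)
retry = gateway.charge("saga-42:payment", 120)  # simulated retry of the activity


def failing_refund() -> None:
    # Simulated outage in the billing service (assumption for the demo).
    raise ConnectionError("billing service unavailable")


refunded = compensate_with_retry("refund saga-42", failing_refund)
```

The retried charge is recorded once, and the refund that never succeeds ends up in `dead_letters` where an operator or a background job can pick it up. A real system would persist both the keys and the dead letters rather than hold them in memory.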
Lessons from Our Platform
We’ve learned that orchestration scales better for us than choreography. With 15+ microservices, trying to track the flow through events became a debugging nightmare. Durable Functions gave us a central, auditable record of each delivery transaction.
The trade-off? The orchestrator is now on the critical path for every booking. We run multiple instances and monitor it closely.
Remember: Distributed transactions are hard. The Saga pattern isn’t magic—it’s a framework for managing that hardness. Design your compensating transactions carefully, test thoroughly, and monitor relentlessly.
