Distributed Transaction Patterns: When ACID Goes to Sleep

In a monolith, you had transactions. In microservices, you have heartache.

When your transportation management system routes a delivery across multiple microservices (booking a shipment, reserving capacity, charging a customer), you can’t just wrap it in a database transaction. The Shipment Service has its own database. So does the Billing Service. Welcome to the world of distributed transactions.

The traditional ACID guarantees you relied on are gone. But the requirement to keep your system consistent? That’s still there.

The Problem: Distributed Monstrosity

Imagine this scenario in our real-time logistics platform:

1. Customer requests a delivery from NYC to Boston.
2. Shipment Service creates a shipment order.
3. Capacity Service reserves truck space.
4. Billing Service charges the customer’s account.
5. Notification Service sends a confirmation.

What if step 3 fails? The shipment is booked, but the truck is full. Now you’ve sold something you can’t deliver. Or step 4 fails because the payment is declined. The shipment is booked and the truck space is reserved, but nobody has paid for it.

In a monolith, you’d roll back the transaction. In microservices, you need something cleverer.

Enter the Saga Pattern

A Saga is a sequence of local transactions, each triggered by the completion of the previous one. If any step fails, you execute compensating transactions to undo the steps that already completed, typically in reverse order.
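As a minimal, framework-free sketch of that definition: the `SagaStep` and `Saga.RunAsync` names below are hypothetical helpers invented for illustration, not part of any library. Each step pairs a local transaction with its compensating transaction, and a failure triggers compensation of the completed steps, newest first.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: one saga step pairs a local transaction with the
// compensating transaction that undoes it.
public sealed class SagaStep
{
    public SagaStep(string name, Func<Task> execute, Func<Task> compensate)
    {
        Name = name;
        Execute = execute;
        Compensate = compensate;
    }

    public string Name { get; }
    public Func<Task> Execute { get; }
    public Func<Task> Compensate { get; }
}

public static class Saga
{
    // Runs the steps in order. On the first failure, compensates the steps
    // that already completed (newest first) and returns false.
    // A failing compensation is not handled here; its exception propagates.
    public static async Task<bool> RunAsync(IEnumerable<SagaStep> steps)
    {
        var completed = new Stack<SagaStep>();
        foreach (var step in steps)
        {
            try
            {
                await step.Execute();
                completed.Push(step);
            }
            catch (Exception)
            {
                while (completed.Count > 0)
                {
                    await completed.Pop().Compensate();
                }
                return false;
            }
        }
        return true;
    }
}
```

The step that failed is never pushed onto the stack, so it is not compensated; only work that actually committed gets undone.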

There are two flavors:

1. Choreography (Event-Driven)

Each service publishes an event when it completes. Other services listen and react.

Shipment Service publishes "ShipmentCreated"
    ↓
Capacity Service listens → reserves space → publishes "CapacityReserved"
    ↓
Billing Service listens → charges customer → publishes "PaymentProcessed"
    ↓
Notification Service listens → sends email

If Billing fails, it publishes PaymentFailed. Capacity Service listens and releases the reservation. Shipment Service cancels the order.

Pros: Loosely coupled, no central orchestrator.
Cons: Hard to see the full flow. Difficult to test. Requires robust error handling everywhere.
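To make the event flow concrete, here is a small sketch that uses an in-memory bus in place of a real message broker. The `EventBus` and `Choreography` types are hypothetical, and the event names `CapacityReleased`, `ShipmentCancelled`, and `CustomerNotified` are assumptions added for illustration; only `ShipmentCreated`, `CapacityReserved`, `PaymentProcessed`, and `PaymentFailed` come from the flow described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory bus standing in for a real broker.
public sealed class EventBus
{
    private readonly Dictionary<string, List<Action<string>>> _handlers =
        new Dictionary<string, List<Action<string>>>();

    // Every published event name, in order, so the flow can be inspected.
    public List<string> Published { get; } = new List<string>();

    public void Subscribe(string eventName, Action<string> handler)
    {
        if (!_handlers.TryGetValue(eventName, out var list))
        {
            list = new List<Action<string>>();
            _handlers[eventName] = list;
        }
        list.Add(handler);
    }

    public void Publish(string eventName, string deliveryId)
    {
        Published.Add(eventName);
        if (_handlers.TryGetValue(eventName, out var list))
        {
            foreach (var handler in list.ToArray())
            {
                handler(deliveryId);
            }
        }
    }
}

public static class Choreography
{
    // Wires the services to the bus. chargeSucceeds simulates the Billing outcome.
    public static EventBus Wire(bool chargeSucceeds)
    {
        var bus = new EventBus();

        // Capacity Service: reserve on ShipmentCreated, release on PaymentFailed.
        bus.Subscribe("ShipmentCreated", id => bus.Publish("CapacityReserved", id));
        bus.Subscribe("PaymentFailed", id => bus.Publish("CapacityReleased", id));

        // Billing Service: charge once capacity is reserved.
        bus.Subscribe("CapacityReserved", id =>
            bus.Publish(chargeSucceeds ? "PaymentProcessed" : "PaymentFailed", id));

        // Shipment Service: cancel the order when payment fails.
        bus.Subscribe("PaymentFailed", id => bus.Publish("ShipmentCancelled", id));

        // Notification Service: confirm only after payment succeeds.
        bus.Subscribe("PaymentProcessed", id => bus.Publish("CustomerNotified", id));

        return bus;
    }
}
```

Calling `Choreography.Wire(false).Publish("ShipmentCreated", "d-1")` and reading `Published` shows the compensation path. No single place in the code describes the whole flow, which is the visibility cost noted above.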

2. Orchestration (Centralized)

A central orchestrator (like Azure Durable Functions) tells each service what to do and when.

Durable Function:
  1. Call Shipment Service → create order
  2. Call Capacity Service → reserve space
  3. Call Billing Service → charge customer
  4. Call Notification Service → notify customer
  
  If any step fails, execute compensating transactions in reverse order.

Pros: Clear, testable flow. Easy to see the entire transaction.
Cons: The orchestrator can become a bottleneck and a single point of failure.

Implementation: Orchestrating with Azure Durable Functions

Here’s how we handle delivery booking using orchestration:

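using System;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

// BookDeliveryRequest, the DTOs, and the activities not shown here
// (ReserveCapacityActivity, ProcessPaymentActivity, SendNotificationActivity,
// CancelReservationActivity, CancelShipmentActivity) are omitted for brevity.
public static class BookDeliveryFunctions
{
    // Share one HttpClient; creating one per call can exhaust sockets under load.
    // PostAsJsonAsync and ReadAsAsync come from the Microsoft.AspNet.WebApi.Client package.
    private static readonly HttpClient Http = new HttpClient();

    [FunctionName("BookDeliveryOrchestrator")]
    public static async Task<object> RunOrchestrator(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // Orchestrator input is read from the context, not bound as a second parameter.
        var request = context.GetInput<BookDeliveryRequest>();
        var deliveryId = context.NewGuid().ToString();

        // Track which steps completed so that only those are compensated.
        ShipmentDto shipment = null;
        ReservationDto reservation = null;
        PaymentDto payment = null;

        try
        {
            // Step 1: Create shipment
            shipment = await context.CallActivityAsync<ShipmentDto>(
                nameof(CreateShipmentActivity),
                (request, deliveryId));

            // Step 2: Reserve capacity
            reservation = await context.CallActivityAsync<ReservationDto>(
                nameof(ReserveCapacityActivity),
                new { shipment, request });

            // Step 3: Process payment
            payment = await context.CallActivityAsync<PaymentDto>(
                nameof(ProcessPaymentActivity),
                new { request.CustomerId, request.Amount });

            // Step 4: Send notification
            await context.CallActivityAsync(
                nameof(SendNotificationActivity),
                new { deliveryId, request.CustomerEmail });

            return new { success = true, deliveryId, shipment, reservation, payment };
        }
        catch (Exception)
        {
            // Compensating transactions: undo completed steps in reverse order.
            // A step that never completed has nothing to undo.

            // Undo payment
            if (payment != null)
            {
                await context.CallActivityAsync(
                    nameof(RefundPaymentActivity),
                    request.CustomerId);
            }

            // Undo reservation (pass the reservation itself so the exact booking is cancelled)
            if (reservation != null)
            {
                await context.CallActivityAsync(
                    nameof(CancelReservationActivity),
                    reservation);
            }

            // Undo shipment
            if (shipment != null)
            {
                await context.CallActivityAsync(
                    nameof(CancelShipmentActivity),
                    deliveryId);
            }

            throw;
        }
    }

    [FunctionName(nameof(CreateShipmentActivity))]
    public static async Task<ShipmentDto> CreateShipmentActivity(
        [ActivityTrigger] IDurableActivityContext context)
    {
        // An activity takes a single input; multiple values travel as a tuple.
        var (request, deliveryId) = context.GetInput<(BookDeliveryRequest, string)>();

        // Call Shipment Service
        var response = await Http.PostAsJsonAsync(
            "https://shipment-service/api/shipments",
            new { deliveryId, request.Origin, request.Destination, request.Weight });

        // Throw on a non-success status so the orchestrator sees the failure and compensates.
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsAsync<ShipmentDto>();
    }

    [FunctionName(nameof(RefundPaymentActivity))]
    public static async Task RefundPaymentActivity(
        [ActivityTrigger] string customerId)
    {
        // Call Billing Service to refund
        var response = await Http.PostAsJsonAsync(
            "https://billing-service/api/refunds",
            new { customerId });

        // Surface a failed refund instead of silently ignoring it.
        response.EnsureSuccessStatusCode();
    }
}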

When to Use Saga vs. Other Patterns

Use Saga when:

Eventual consistency is acceptable (you don’t need immediate ACID guarantees)
Transactions span multiple services
You can design compensating transactions
You’re okay with temporary inconsistency

Don’t use Saga when:

You absolutely need immediate consistency (consider two-phase commit, distributed locks, or keeping the data in a single service’s database instead)
Compensating transactions are impossible (you can’t “undo” a physical truck dispatch)
Latency is critical: every saga step adds a network round trip, and a failure adds the compensation calls on top

Key Gotchas

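Idempotency: Each activity must be idempotent. Durable Functions guarantees at-least-once execution of activities, so ProcessPaymentActivity may run more than once, and you don’t want to charge twice.
Compensating Transactions Might Fail: What if the refund fails? Retry it with backoff, and when retries are exhausted, route it to a dead-letter mechanism so that someone can resolve it manually.
Testing is Hard: Test the happy path, a failure at each step, and a failure during compensation.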
Monitoring is Critical: You need visibility into which saga is where in its lifecycle. Add Application Insights logging.
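One common way to get the idempotency described above is to have the caller send an idempotency key and have the receiving service remember which keys it has already processed. The sketch below is an assumption-laden illustration: `IdempotentPaymentProcessor` is a hypothetical class, the in-memory dictionary stands in for a durable store (a real Billing Service would more likely rely on a unique constraint in its database), and using the delivery ID as the key is a suggestion rather than something the platform above is known to do.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical billing-side handler that deduplicates charges by idempotency key.
public sealed class IdempotentPaymentProcessor
{
    private readonly object _gate = new object();

    // In-memory stand-in for a durable store keyed by idempotency key.
    private readonly Dictionary<string, string> _paymentIdsByKey =
        new Dictionary<string, string>();

    // Number of charges actually performed (not the number of calls).
    public int ChargeCount { get; private set; }

    // Returns the payment ID for the key, charging only the first time the key is seen.
    // customerId and amount are unused in this sketch; a real charge would need them.
    public string Charge(string idempotencyKey, string customerId, decimal amount)
    {
        lock (_gate)
        {
            if (_paymentIdsByKey.TryGetValue(idempotencyKey, out var existing))
            {
                // Retry of a request already handled: return the earlier result.
                return existing;
            }

            // First time this key is seen: perform the charge (simulated here).
            ChargeCount++;
            var paymentId = "pay-" + ChargeCount;
            _paymentIdsByKey[idempotencyKey] = paymentId;
            return paymentId;
        }
    }
}
```

With this shape, a retried activity that sends the same key gets the original payment ID back and the customer is charged once.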

Lessons from Our Platform

We’ve learned that orchestration scales better for us than choreography. With 15+ microservices, trying to track the flow through events became a debugging nightmare. Durable Functions gave us a central, auditable record of each delivery transaction.

The trade-off? The orchestrator is now a critical path. We run multiple instances and monitor it closely.

Remember: Distributed transactions are hard. The Saga pattern isn’t magic—it’s a framework for managing that hardness. Design your compensating transactions carefully, test thoroughly, and monitor relentlessly.