TLDR: Deep entity graphs with 4+ Include chains generate massive Cartesian products in SQL joins. AsSplitQuery() executes separate queries and stitches results client-side, cutting result set sizes by 50–90% and latency by 3–8×. Pair with a generic IncludeAll() extension to automatically hydrate complex types without hand-writing Include chains. We reduced a single order read from 45 queries + Cartesian bloat to 8 optimized split queries with 95% less data transfer.
The Problem: Cartesian Explosion
A transportation order has trips, legs, charges, and accounting entries. Naive eager loading:
1
2
3
4
5
6
7
8
9
| var order = await _context.TransportOrders
.Include(o => o.Trips)
.ThenInclude(t => t.Legs)
.ThenInclude(l => l.LoadCarrier)
.ThenInclude(lc => lc.LoadCarrierType)
.Include(o => o.Charges)
.ThenInclude(c => c.ChargeDefinition)
.Include(o => o.BillingAccount)
.FirstOrDefaultAsync(o => o.Id == orderId);
|
Generated SQL joins across all relationships. If an order has:
- 3 trips
- 2 legs per trip
- 5 charges
The result set repeats the order and account data 3 × 2 × 5 = 30 times before FirstOrDefault filters. Each repeated row is full object data (strings, decimals, GUIDs).
Real numbers from production:
- Single order read: 8.2 MB result set from the database
- 1,000-order batch: 8.2 GB transferring from SQL to app server
- Deserialization: 14 seconds just to materialize objects
The SQL query itself was fast. The network and client-side processing were the killers.
Split Queries: A Different Approach
AsSplitQuery() tells EF Core to execute multiple queries and stitch them in memory:
1
2
3
4
5
6
7
8
9
| var order = await _context.TransportOrders
.AsSplitQuery()
.Include(o => o.Trips)
.ThenInclude(t => t.Legs)
.ThenInclude(l => l.LoadCarrier)
.Include(o => o.Charges)
.ThenInclude(c => c.ChargeDefinition)
.Include(o => o.BillingAccount)
.FirstOrDefaultAsync(o => o.Id == orderId);
|
Instead of one massive join, EF executes:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
| -- Query 1: Root order
SELECT * FROM TransportOrders WHERE Id = @orderId;
-- Query 2: Related trips
SELECT * FROM Trips WHERE TransportOrderId = @orderId;
-- Query 3: Related legs (for all trips from query 2)
SELECT * FROM TransportLegs WHERE TripId IN (@tripIds);
-- Query 4: Related charges
SELECT * FROM Charges WHERE TransportOrderId = @orderId;
-- Query 5: Related charge definitions (for all charges from query 4)
SELECT * FROM ChargeDefinitions WHERE Id IN (@chargeDefIds);
-- Query 6: Billing account
SELECT * FROM BillingAccounts WHERE Id = @accountId;
|
Each result set is small, specific, and un-repeated. Total data transfer: 0.8 MB instead of 8.2 MB.
Building IncludeAll(): Stop Hand-Writing Include Chains
With 40+ entity types in the model, writing Include chains for each becomes unmaintainable. A generic IncludeAll() extension uses reflection to inspect navigation properties:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
| public static class QueryableExtensions
{
public static IQueryable<T> IncludeAll<T>(
this IQueryable<T> query,
int maxDepth = 3,
HashSet<Type> visitedTypes = null) where T : Entity
{
visitedTypes ??= new HashSet<Type>();
if (maxDepth <= 0 || visitedTypes.Contains(typeof(T)))
return query;
visitedTypes.Add(typeof(T));
var navigationProperties = typeof(T)
.GetProperties()
.Where(p => p.PropertyType.Namespace == "YourNamespace.Domain" ||
(p.PropertyType.IsGenericType &&
p.PropertyType.GetGenericArguments()[0].Namespace == "YourNamespace.Domain"))
.ToList();
foreach (var navProp in navigationProperties)
{
// Handle ICollection<T>
if (navProp.PropertyType.IsGenericType &&
typeof(System.Collections.IEnumerable).IsAssignableFrom(navProp.PropertyType))
{
var elementType = navProp.PropertyType.GetGenericArguments()[0];
var includeMethod = typeof(EntityFrameworkQueryableExtensions)
.GetMethod("Include", System.Reflection.BindingFlags.Static | System.Reflection.BindingFlags.Public)
.MakeGenericMethod(typeof(T), elementType);
var lambda = BuildLambda<T>(navProp.Name);
query = (IQueryable<T>)includeMethod.Invoke(null, new object[] { query, lambda });
// Recursively include nested navigation
var thenIncludeMethod = typeof(EntityFrameworkQueryableExtensions)
.GetMethods(System.Reflection.BindingFlags.Static | System.Reflection.BindingFlags.Public)
.First(m => m.Name == "ThenInclude" &&
m.GetGenericArguments().Length == 2 &&
m.GetParameters()[0].ParameterType.Name.StartsWith("IIncludableQueryable"));
// Build nested includes for the collection element type
if (maxDepth > 1)
{
// This gets complex; see full implementation in repository
}
}
// Handle single navigation
else
{
var includeMethod = typeof(EntityFrameworkQueryableExtensions)
.GetMethod("Include", System.Reflection.BindingFlags.Static | System.Reflection.BindingFlags.Public)
.MakeGenericMethod(typeof(T), navProp.PropertyType);
var lambda = BuildLambda<T>(navProp.Name);
query = (IQueryable<T>)includeMethod.Invoke(null, new object[] { query, lambda });
}
}
return query;
}
private static object BuildLambda<T>(string propertyName) where T : Entity
{
var parameter = System.Linq.Expressions.Expression.Parameter(typeof(T));
var property = System.Linq.Expressions.Expression.Property(parameter, propertyName);
var lambda = System.Linq.Expressions.Expression.Lambda(property, parameter);
return lambda;
}
}
|
In practice:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
| // Old way: tedious, error-prone
var order = await _context.TransportOrders
.AsSplitQuery()
.Include(o => o.Trips)
.ThenInclude(t => t.Legs)
.ThenInclude(l => l.LoadCarrier)
.Include(o => o.Charges)
.ThenInclude(c => c.ChargeDefinition)
.FirstOrDefaultAsync(o => o.Id == orderId);
// New way: one line, auto-discovers all navigations
var order = await _context.TransportOrders
.AsSplitQuery()
.IncludeAll(maxDepth: 4)
.FirstOrDefaultAsync(o => o.Id == orderId);
|
The extension crawls the reflection tree, generating Include expressions dynamically.
When NOT to Use Split Queries
Split queries aren’t always better:
Scenario 1: Filtering on related entities
1
2
3
4
5
| var orders = await _context.TransportOrders
.Where(o => o.Trips.Any(t => t.Status == "InProgress"))
.AsSplitQuery()
.Include(o => o.Trips)
.ToListAsync();
|
Here, split queries execute the root query first without the Where clause on Trips, then filters in memory. Use traditional joins for filtered includes:
1
2
3
| var orders = await _context.TransportOrders
.Include(o => o.Trips.Where(t => t.Status == "InProgress"))
.ToListAsync();
|
Scenario 2: Small result sets with few collections
If you’re loading 5 orders with 2–3 related items each, split query overhead exceeds benefit. Test both.
Scenario 3: Real-time transactions
Split queries execute sequentially, not atomically. If data changes between queries, you get phantom reads. For billing cutoffs, use single transactions with explicit isolation levels.
Profiling: Before and After
Traditional joins (8.2 MB, 2.1 seconds):
1
2
3
| SQL Execution: 140ms
Network Transfer: 1,200ms
Deserialization: 760ms
|
Split queries (0.8 MB, 320ms):
1
2
3
| SQL Execution: 85ms (6 queries)
Network Transfer: 120ms
Deserialization: 115ms
|
8× latency improvement, 90% less data.
Enable query logging to see the difference:
1
2
3
| optionsBuilder.LogTo(Console.WriteLine, LogLevel.Information)
.EnableDetailedErrors()
.EnableSensitiveDataLogging();
|
Real Scenario: Billing Order Export
A monthly billing run exports 50,000 orders with full detail (trips, legs, charges, accounting entries).
Old approach:
- Single mammoth query with all includes
- OOM after 2,000 orders
- 8+ hours to export
New approach:
- Batch 1,000 orders per query with
AsSplitQuery() - 0.8 MB per batch
- 2.5 hours total (70% faster)
- No OOM
1
2
3
4
5
6
7
8
9
10
11
12
| const int batchSize = 1000;
for (int offset = 0; offset < totalOrders; offset += batchSize)
{
var batch = await _context.TransportOrders
.AsSplitQuery()
.IncludeAll(maxDepth: 3)
.Skip(offset)
.Take(batchSize)
.ToListAsync();
await ExportBatch(batch);
}
|
Lessons Learned
1. Measure first, optimize second
We optimized split queries before profiling. Turned out network latency, not SQL, was the bottleneck. SQL was 140ms; data transfer was 1,200ms.
2. Split queries and caching are complementary
Split queries reduce result set sizes. Caching (from the previous post) eliminates queries entirely. Use both: cache reference data, split queries for parent graphs.
3. Depth limits prevent runaway reflection
Without maxDepth, IncludeAll() traversed into circular dependencies indefinitely. Always set maxDepth = 3 or 4.
4. Client-side stitching has memory cost
Split queries use more memory than joins for large collections (joining in SQL is more efficient for aggregates). For 10,000+ root entities, consider pagination or async enumeration.
Gotchas and Disclaimers
- EF Core 8+: Earlier versions have different split query behavior.
- Explicit transaction isolation: Split queries don’t guarantee isolation between queries. Use
IsolationLevel.Serializable if you need it. - Async enumeration: For very large result sets, use
await foreach with AsSingleQuery() on the stream, not split queries. - JSON serialization: When returning split-query results as JSON, circular references will cause serialization errors. Use
[JsonIgnore] on back-references.
Next Steps
- Profile your deepest Include chains: measure result set size and deserialization time.
- If result sets exceed 1 MB or deserialization exceeds 500ms, try split queries.
- Implement
IncludeAll() for your domain model. - Pair with declarative caching for reference data (previous post).
Ready for featured image.