The default /health endpoint is a lie.
It returns 200 OK if the web server is running. That’s it. But what if the database is unreachable? What if the Redis cache is on fire? What if your app is consuming 99% of available memory and is about to crash?
In a distributed cloud environment, “running” is not the same as “healthy.” I learned this the hard way when our load balancer kept sending traffic to a zombie instance.
Going Deeper
For our latest project, MyDashboard, we needed better visibility. We used the Microsoft.Extensions.Diagnostics.HealthChecks library to build not just a “pulse” check, but a full “medical exam” for our API.
Here are three advanced checks we implemented because the basics weren’t enough:
1. The Migration Check
It’s a classic deployment nightmare: the code deploys successfully, but the database schema migration fails. The API starts up, but every request that touches the database fails because the Users table doesn’t have the IsActive column yet.
We wrote a custom check that queries EF Core’s __EFMigrationsHistory table.
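A minimal sketch of such a check, under a few assumptions: it relies on EF Core’s GetPendingMigrationsAsync, which compares the migrations compiled into the assembly against the rows in __EFMigrationsHistory, rather than querying the table by hand. The class name MigrationHealthCheck and the generic TContext are illustrative, not names from MyDashboard.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical check: unhealthy while the database schema lags behind the code.
public sealed class MigrationHealthCheck<TContext> : IHealthCheck
    where TContext : DbContext
{
    private readonly TContext _db;

    public MigrationHealthCheck(TContext db) => _db = db;

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // Migrations present in the assembly but missing from __EFMigrationsHistory.
            var pending = (await _db.Database
                .GetPendingMigrationsAsync(cancellationToken)).ToList();

            return pending.Count == 0
                ? HealthCheckResult.Healthy("Schema is up to date.")
                : HealthCheckResult.Unhealthy(
                    $"{pending.Count} pending migration(s): {string.Join(", ", pending)}");
        }
        catch (Exception ex)
        {
            // The database is unreachable or the history table could not be read.
            return HealthCheckResult.Unhealthy("Could not read migration history.", ex);
        }
    }
}
```

Assuming a context type called AppDbContext (a placeholder), registration would look like services.AddHealthChecks().AddCheck<MigrationHealthCheck<AppDbContext>>("migrations"). The health check service resolves checks inside a DI scope, so injecting a scoped DbContext works.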
2. The Memory Sentinel
Memory leaks in containerized apps can lead to sudden, silent kills by Kubernetes (OOMKilled). We added a check to report status based on GC memory usage.
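A sketch of what a GC-based check can look like. The thresholds below are made-up numbers for illustration and should be tuned against the container’s memory limit. Note that GC.GetTotalMemory covers only the managed heap, so it will understate what Kubernetes sees for the whole process.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical check: maps managed-heap usage onto Healthy / Degraded / Unhealthy.
public sealed class MemoryHealthCheck : IHealthCheck
{
    // Assumed thresholds, not values from the original project.
    private const long DegradedBytes = 512L * 1024 * 1024;
    private const long UnhealthyBytes = 1024L * 1024 * 1024;

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        // false = do not force a collection just to take a measurement.
        long allocated = GC.GetTotalMemory(forceFullCollection: false);

        var data = new Dictionary<string, object>
        {
            ["allocatedBytes"] = allocated,
            ["gen0Collections"] = GC.CollectionCount(0),
            ["gen1Collections"] = GC.CollectionCount(1),
            ["gen2Collections"] = GC.CollectionCount(2),
        };

        var status = allocated >= UnhealthyBytes ? HealthStatus.Unhealthy
                   : allocated >= DegradedBytes ? HealthStatus.Degraded
                   : HealthStatus.Healthy;

        return Task.FromResult(new HealthCheckResult(
            status,
            $"Managed heap: {allocated / (1024 * 1024)} MiB",
            data: data));
    }
}
```

Returning Degraded before Unhealthy gives the dashboard an early warning while the instance is still serving traffic.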
3. Readiness vs. Liveness
We split our checks. This is crucial for Kubernetes/Container Apps:
/health/live: “Am I completely broken?” (Fast, minimal checks). If this fails, restart the container.
/health/ready: “Can I do work?” (Checks DB, Redis, Auth). If this fails, stop sending me traffic, but don’t kill me yet.
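One common way to get this split in ASP.NET Core is to tag each check and filter by tag per endpoint. The sketch below assumes a minimal-hosting Program.cs. The check names, the tag names "live" and "ready", and the placeholder lambdas are illustrative stand-ins for the real DB, Redis, and Auth checks.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddHealthChecks()
    // Liveness: trivially cheap, only proves the process can answer a request.
    .AddCheck("self", () => HealthCheckResult.Healthy(), new[] { "live" })
    // Readiness: placeholder standing in for a real dependency check.
    .AddCheck("database", () => HealthCheckResult.Healthy("placeholder"), new[] { "ready" });

var app = builder.Build();

// The orchestrator restarts the container when this fails.
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = registration => registration.Tags.Contains("live")
});

// The orchestrator stops routing traffic when this fails, without a restart.
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = registration => registration.Tags.Contains("ready")
});

app.Run();
```

Keeping dependency checks out of the liveness endpoint matters: if the database goes down and liveness fails with it, every instance gets restarted at once, which fixes nothing.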
Dashboarding
Since the output is just JSON, we can visualize it easily. We pipe the output directly into Azure Application Insights.
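Worth noting: out of the box, MapHealthChecks writes only the overall status as plain text, so JSON output requires a custom ResponseWriter. A minimal sketch using System.Text.Json follows. The HealthJson class and the payload shape are assumptions for illustration, not the format MyDashboard emits.

```csharp
using System.Linq;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical writer: serializes the full HealthReport instead of a bare status string.
public static class HealthJson
{
    public static Task WriteAsync(HttpContext http, HealthReport report)
    {
        http.Response.ContentType = "application/json; charset=utf-8";

        var payload = new
        {
            status = report.Status.ToString(),
            totalDurationMs = report.TotalDuration.TotalMilliseconds,
            checks = report.Entries.Select(entry => new
            {
                name = entry.Key,
                status = entry.Value.Status.ToString(),
                description = entry.Value.Description,
                durationMs = entry.Value.Duration.TotalMilliseconds,
                data = entry.Value.Data
            })
        };

        return http.Response.WriteAsync(JsonSerializer.Serialize(payload));
    }
}
```

It plugs in through the endpoint options: new HealthCheckOptions { ResponseWriter = HealthJson.WriteAsync }. Each check’s Data dictionary ends up in the payload, so per-check details are available to whatever consumes the JSON.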
Don’t settle for 200 OK. Make your app tell you how it’s doing.