A Manifesto for Better Logging
Most logging is useless noise. Here's how I think about logs that actually help when things break at 3am.

I've debugged production incidents where logs told me exactly what went wrong. I've also debugged incidents where thousands of log lines told me absolutely nothing useful.
The difference isn't volume. It's intent.
Open any codebase. You'll find:
logger.info("Starting process")
logger.info("Process started")
logger.debug("Entering function X")
logger.info("Done")
Congratulations. You've documented that code runs. You've learned nothing about what it's doing or why it failed.
When something breaks at 3am, you don't need to know that processes started. You need to know what failed, for which request, with what inputs, and why.
When writing a log statement, imagine yourself debugging at 3am with an angry customer on hold. What would you wish you had logged?
Bad:
logger.info("Processing order")
Good:
logger.info("Processing order", {
  order_id: order.id,
  customer_id: order.customer_id,
  total: order.total,
  items_count: order.items.length,
  payment_method: order.payment_method
})
When that order fails, you'll know exactly which order, for whom, and what was special about it.
I've seen codebases where everything is INFO. Or where DEBUG is used for actual errors. The levels exist for a reason:
ERROR — Something broke. The operation failed. You might get paged.
WARN — Something's wrong but we handled it. Worth investigating.
INFO — Significant business events. The happy path.
DEBUG — Detailed technical info. Usually off in production.
If your production logs are 90% INFO, you're probably logging too much noise. If you have zero WARN, you're probably missing early warning signs.
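As a rough sketch of how those thresholds play out in practice — the `log` helper and the numeric level map below are illustrative assumptions, not a specific library's API:

```javascript
// A minimal sketch of level-aware logging. The `log` helper and LEVELS map
// are illustrative assumptions, not any particular library's API.
const LEVELS = { debug: 10, info: 20, warn: 30, error: 40 };
const MIN_LEVEL = "info"; // in production; flip to "debug" locally

function log(level, event, fields = {}) {
  if (LEVELS[level] < LEVELS[MIN_LEVEL]) return null; // below threshold: dropped
  const line = JSON.stringify({ level, event, ...fields });
  console.log(line);
  return line;
}

log("debug", "cache_lookup", { key: "user:42" });      // dropped at info level
log("warn", "retry_scheduled", { attempt: 2 });        // worth investigating
log("error", "payment_failed", { order_id: "o-123" }); // you might get paged
```

A setup like this makes the levels enforce themselves: DEBUG chatter never reaches production output, while WARN and ERROR always do.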
Unstructured logs are hell to query.
"User 12345 purchased item ABC for $99.99"
Good luck writing a query to find all purchases over $50.
Structured logs are queryable:
{
  "event": "purchase_completed",
  "user_id": "12345",
  "item_id": "ABC",
  "amount": 99.99,
  "currency": "USD"
}
Now you can: amount > 50 AND event = "purchase_completed"
Every log should be a structured event with consistent field names.
A single user request might hit 5 services and generate 50 log lines. How do you connect them?
Correlation ID. Also called trace ID, request ID.
Generate it at the edge. Pass it through every service. Include it in every log line.
{"correlation_id": "abc-123", "service": "api", "event": "request_received"}
{"correlation_id": "abc-123", "service": "payments", "event": "charge_initiated"}
{"correlation_id": "abc-123", "service": "notifications", "event": "email_queued"}
Now when something fails, you can trace the entire journey: correlation_id = "abc-123"
Most logs tell you what happened. Better logs tell you why it happened.
Just action:
logger.info("Routing to fallback service")
With decision context:
logger.info("Routing to fallback service", {
  reason: "primary_timeout",
  primary_latency_ms: 5200,
  timeout_threshold_ms: 5000,
  fallback_service: "backup-api-west"
})
When you're debugging why traffic went to the fallback, you'll know it was a timeout, not an error.
Sensitive data — No passwords, tokens, full credit card numbers, PII without consent. This seems obvious but I've seen it in production logs.
High-frequency noise — If something happens 10,000 times per second, sampling or aggregating is better than logging every instance.
Success without context — logger.info("Success") tells you nothing. Either add context or don't bother.
Someone else's logs — Don't log what downstream services already log. You'll double-count everything and confuse yourself.
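For the high-frequency case, sampling can be as simple as this sketch — the 1% rate and the `sampledLog` helper are illustrative assumptions; recording the rate in each line lets you scale counts back up later:

```javascript
// Sketch: log roughly 1 in 100 instances of a very hot event, and record the
// sample rate so downstream counts can be multiplied back out. The rate and
// helper name are illustrative assumptions.
const SAMPLE_RATE = 0.01;

function sampledLog(event, fields = {}, rate = SAMPLE_RATE) {
  if (Math.random() >= rate) return null; // drop ~99% of instances
  const line = JSON.stringify({ event, sample_rate: rate, ...fields });
  console.log(line);
  return line;
}

sampledLog("cache_hit", { key: "user:42" }); // emitted ~1% of the time
```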
For any significant operation, log what identifies it, the inputs that mattered, any decisions made along the way, the outcome, and how long it took. Not every function needs all five, but any operation a user would care about should have most of them.
Logs aren't just for debugging. They're also for analytics, auditing, and answering business questions.
If your logs can't answer "how many orders did we process yesterday?" and "why did user X get an error at 2pm?", they're not doing their job.
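That analytics question falls out of structured events almost for free. A toy sketch, with made-up log lines and assumed field names:

```javascript
// Toy sketch: with structured events, "how many orders did we process on a
// given day?" is just a filter over parsed log lines. Field names and the
// sample data are illustrative.
const lines = [
  '{"event":"order_processed","order_id":"o-1","ts":"2024-05-01T10:00:00Z"}',
  '{"event":"order_processed","order_id":"o-2","ts":"2024-05-02T11:00:00Z"}',
  '{"event":"payment_failed","order_id":"o-3","ts":"2024-05-01T12:00:00Z"}',
];

function countEvents(logLines, event, day) {
  return logLines
    .map((l) => JSON.parse(l))
    .filter((e) => e.event === event && e.ts.startsWith(day))
    .length;
}

console.log(countEvents(lines, "order_processed", "2024-05-01")); // prints 1
```

Try the same thing against "User 12345 purchased item ABC" prose logs and you're writing regexes instead.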
ELK, Datadog, CloudWatch, Splunk — the tool doesn't matter if your logs are garbage.
I've seen teams with expensive observability platforms and useless logs. I've seen teams with basic tooling and excellent debugging capability.
The discipline is what matters: structured events, consistent field names, real context, and a correlation ID on every line.
Logging is infrastructure. Treat it like you'd treat your database schema — with intention and consistency. The investment pays off the first time you debug a production issue in 10 minutes instead of 10 hours.