132 HTTP Client and Server Resilience
Resilience in HTTP systems is budget management: every inbound request has limited time, and each downstream call spends part of that budget.
Budget-First Thinking
total request budget: 1200ms
|-- auth --|-- cache --|-- db --|-- upstream API --|-- encode response --|
If one component consumes too much budget, upstream callers see latency spikes and timeouts.
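One way to make the budget concrete is to carry it as a context deadline and give each stage the smaller of its own cap and whatever time is left. The sketch below assumes hypothetical per-stage caps; `stageBudget` and `remaining` are illustrative helpers, not part of any library.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// stageBudget returns how long a stage may run: its own cap or the time left
// in the request budget, whichever is smaller. It never returns a negative value.
func stageBudget(remaining, stageCap time.Duration) time.Duration {
	if remaining <= 0 {
		return 0
	}
	if stageCap < remaining {
		return stageCap
	}
	return remaining
}

// remaining reports the time left before ctx's deadline, or fallback if the
// context carries no deadline.
func remaining(ctx context.Context, fallback time.Duration) time.Duration {
	if deadline, ok := ctx.Deadline(); ok {
		return time.Until(deadline)
	}
	return fallback
}

func main() {
	// Total request budget from the diagram above.
	ctx, cancel := context.WithTimeout(context.Background(), 1200*time.Millisecond)
	defer cancel()

	// Hypothetical per-stage caps, chosen only for illustration.
	stages := []struct {
		name  string
		limit time.Duration
	}{
		{"auth", 100 * time.Millisecond},
		{"cache", 50 * time.Millisecond},
		{"db", 300 * time.Millisecond},
		{"upstream API", 600 * time.Millisecond},
		{"encode response", 50 * time.Millisecond},
	}

	for _, s := range stages {
		budget := stageBudget(remaining(ctx, time.Second), s.limit)
		stageCtx, stageCancel := context.WithTimeout(ctx, budget)
		// A real stage would do its work here and pass stageCtx downstream.
		_ = stageCtx
		stageCancel()
		fmt.Printf("%-16s may use up to %s\n", s.name, budget)
	}
}
```

Because a derived context can never outlive its parent, a stage that overruns its cap fails on its own deadline instead of silently consuming the time reserved for later stages.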
Server-Side Guardrails
- ReadHeaderTimeout to protect against slowloris-style abuse.
- ReadTimeout and WriteTimeout to bound request/response processing.
- IdleTimeout to control keep-alive resource usage.
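The four server timeouts named above are fields on Go's `http.Server`. The following is a minimal sketch of setting all of them explicitly; the durations, the `newServer` helper, and the `/healthz` route are assumptions for illustration, not recommended values.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newServer returns an http.Server with every timeout set explicitly.
// The durations are illustrative assumptions, not recommendations.
func newServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           h,
		ReadHeaderTimeout: 2 * time.Second,  // bounds slow header delivery
		ReadTimeout:       5 * time.Second,  // bounds reading the whole request
		WriteTimeout:      10 * time.Second, // bounds writing the response
		IdleTimeout:       60 * time.Second, // bounds idle keep-alive connections
	}
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})

	srv := newServer(":8080", mux)
	fmt.Printf("header=%s read=%s write=%s idle=%s\n",
		srv.ReadHeaderTimeout, srv.ReadTimeout, srv.WriteTimeout, srv.IdleTimeout)

	// A real service would now call srv.ListenAndServe().
}
```

A zero value for any of these fields means no timeout of that kind is enforced, which is why the sketch sets each one rather than relying on the zero-value struct.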
Client-Side Guardrails
- Default http.Client timeout for hard cap.
- Per-call context.WithTimeout for operation-specific budgets.
- Retry only idempotent operations and only within remaining budget.
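The three client-side rules can be combined in one small helper. This is a sketch under stated assumptions: `shouldRetry`, `get`, `doOnce`, `maxAttempts`, and all durations are hypothetical, backoff between attempts is omitted for brevity, and a local `httptest` server stands in for a real dependency.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// client carries a hard cap that applies to every request it sends.
var client = &http.Client{Timeout: 2 * time.Second}

const maxAttempts = 3 // assumed upper bound on attempts

// isIdempotent reports whether the method is idempotent under HTTP semantics.
func isIdempotent(method string) bool {
	switch method {
	case http.MethodGet, http.MethodHead, http.MethodPut,
		http.MethodDelete, http.MethodOptions, http.MethodTrace:
		return true
	}
	return false
}

// shouldRetry allows another attempt only for idempotent methods, and only if
// the remaining budget still covers one full per-attempt timeout.
func shouldRetry(method string, remaining, perAttempt time.Duration) bool {
	return isIdempotent(method) && remaining >= perAttempt
}

// doOnce sends a single GET bounded by ctx and treats 5xx as an error.
func doOnce(ctx context.Context, url string) ([]byte, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 500 {
		return nil, fmt.Errorf("server error: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// get retries within ctx's deadline. Without a deadline it never retries,
// because there is no budget to check against.
func get(ctx context.Context, url string, perAttempt time.Duration) ([]byte, error) {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		attemptCtx, cancel := context.WithTimeout(ctx, perAttempt)
		body, err := doOnce(attemptCtx, url)
		cancel()
		if err == nil {
			return body, nil
		}
		lastErr = err
		deadline, ok := ctx.Deadline()
		if !ok || !shouldRetry(http.MethodGet, time.Until(deadline), perAttempt) {
			break
		}
	}
	return nil, lastErr
}

func main() {
	// Test server that fails once with 503, then succeeds.
	var calls int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&calls, 1) == 1 {
			http.Error(w, "try again", http.StatusServiceUnavailable)
			return
		}
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 1200*time.Millisecond)
	defer cancel()

	body, err := get(ctx, srv.URL, 300*time.Millisecond)
	fmt.Printf("body=%q err=%v\n", body, err)
}
```

The client-level timeout acts as a backstop for callers that forget to set a deadline, while the per-attempt context keeps a single slow call from spending the whole request budget.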
Failure Classification
Treat failures differently:
- Timeout/cancel: likely capacity or dependency latency issue.
- 5xx: downstream error, candidate for controlled retry.
- 4xx: usually caller/input issue, no retry.
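The classification above might be encoded as a small function that the retry logic and the metrics code both call. In this sketch the `failureClass` type, its constants, and `retryCandidate` are hypothetical names; `classOther` is an added bucket for transport errors that the list does not cover.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
)

type failureClass int

const (
	classNone        failureClass = iota // no failure
	classTimeout                         // timeout or cancellation
	classServerError                     // 5xx from downstream
	classClientError                     // 4xx, caller or input issue
	classOther                           // assumption: anything not listed above
)

// classify maps a response status code and transport error to a failure class.
// A non-nil err takes precedence over statusCode.
func classify(statusCode int, err error) failureClass {
	if err != nil {
		var netErr net.Error
		switch {
		case errors.Is(err, context.DeadlineExceeded), errors.Is(err, context.Canceled):
			return classTimeout
		case errors.As(err, &netErr) && netErr.Timeout():
			return classTimeout
		}
		return classOther
	}
	switch {
	case statusCode >= 500:
		return classServerError
	case statusCode >= 400:
		return classClientError
	}
	return classNone
}

// retryCandidate reports whether the class is eligible for a controlled retry.
// Only 5xx qualifies here, following the list above.
func retryCandidate(c failureClass) bool {
	return c == classServerError
}

func main() {
	fmt.Println(classify(0, context.DeadlineExceeded) == classTimeout)
	fmt.Println(retryCandidate(classify(503, nil)))
	fmt.Println(retryCandidate(classify(404, nil)))
}
```

Keeping the classification in one place means logs, metrics, and retry decisions all agree on what kind of failure occurred.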
Operational Insight
Services become stable when timeout and retry policy is explicit, consistent, and visible in logs/metrics. Hidden defaults are a common source of cascading failure.
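One way to keep the policy explicit and visible is to hold it in a single struct, reject unset values, and log it at startup. The `Policy` type, its fields, and the values below are assumptions made for this sketch.

```go
package main

import (
	"errors"
	"log"
	"time"
)

// Policy gathers the timeout and retry settings in one place so they can be
// validated and logged together. The type is hypothetical.
type Policy struct {
	ClientTimeout  time.Duration
	PerCallTimeout time.Duration
	MaxAttempts    int
}

// Validate rejects settings that would fall back to hidden defaults or that
// contradict each other.
func (p Policy) Validate() error {
	switch {
	case p.ClientTimeout <= 0:
		// A zero http.Client.Timeout means no timeout at all.
		return errors.New("client timeout unset: http.Client would apply no timeout")
	case p.PerCallTimeout <= 0 || p.PerCallTimeout > p.ClientTimeout:
		return errors.New("per-call timeout must be positive and within the client timeout")
	case p.MaxAttempts < 1:
		return errors.New("max attempts must be at least 1")
	}
	return nil
}

func main() {
	p := Policy{
		ClientTimeout:  2 * time.Second,
		PerCallTimeout: 300 * time.Millisecond,
		MaxAttempts:    3,
	}
	if err := p.Validate(); err != nil {
		log.Fatalf("invalid resilience policy: %v", err)
	}
	// Logging the effective policy at startup makes it visible to operators.
	log.Printf("resilience policy: client_timeout=%s per_call_timeout=%s max_attempts=%d",
		p.ClientTimeout, p.PerCallTimeout, p.MaxAttempts)
}
```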