132 HTTP Client and Server Resilience

Resilience in HTTP systems is budget management: every inbound request has limited time, and each downstream call spends part of that budget.

Budget-First Thinking

total request budget: 1200ms
|-- auth --|-- cache --|-- db --|-- upstream API --|-- encode response --|

If one component overspends its share, the stages after it are starved of time, and upstream callers see latency spikes and timeouts.

Server-Side Guardrails

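ReadHeaderTimeout: bounds the time allowed to read request headers, which protects against slowloris-style abuse.
ReadTimeout and WriteTimeout: bound reading the full request (including the body) and writing the response.
IdleTimeout: limits how long an idle keep-alive connection is held open.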

Client-Side Guardrails

http.Client.Timeout: a hard cap on every call. The zero value means no timeout, so set it explicitly.
Per-call context.WithTimeout: operation-specific budgets.
Retries: only for idempotent operations, and only within the remaining budget.

Failure Classification

Treat failures differently:

Timeout/cancel: likely capacity or dependency latency issue.
5xx: downstream error, candidate for controlled retry.
4xx: usually caller/input issue, no retry.

Operational Insight

Services become stable when timeout and retry policies are explicit, consistent, and visible in logs and metrics. Hidden defaults, such as a client with no timeout, are a common source of cascading failure.