Advanced Profiling (pprof/trace)
Performance: Deep Dives with pprof & trace
Optimization without measurement is just guesswork. Go has arguably the best built-in profiling tools of any compiled language.
1. pprof: The Sampling Profiler
pprof creates a statistical profile of your program.
Usage
Add an import and a goroutine to your main:

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()
	// ... app code ...
}
```

Now, while your app runs under load, grab a profile:
```shell
go tool pprof -http=:8081 "http://localhost:6060/debug/pprof/profile?seconds=30"
```

This opens a web UI. Look at:

* Flame Graph: the widest bars are using the most CPU.
* Alloc Space: who allocated the most memory (heap).
* Alloc Objects: who created the largest number of objects (high GC pressure).

Note that the allocation views come from the heap profile (`/debug/pprof/allocs`), not from the 30-second CPU profile above.
Mutex Profiling
If your CPU usage is low but performance is still poor, you have contention. Enable mutex profiling in code:

```go
runtime.SetMutexProfileFraction(1)
```

Then look at `/debug/pprof/mutex` to see who is waiting for locks.
2. Execution Tracer (go tool trace)
While pprof aggregates data, the tracer shows you a timeline of every event:

* When did a goroutine start?
* When did it block on a channel?
* When did GC pause execution?
Capture a trace:
```shell
curl -o trace.out "http://localhost:6060/debug/pprof/trace?seconds=5"
go tool trace trace.out
```

Use Case: debugging latency spikes. You might see a 100ms gap where nothing ran. Zooming in, you might find a stop-the-world GC event or a single mutex holding up 50 goroutines.
3. Benchmarks as Profiling Inputs
You can profile specific functions using benchmarks:
```shell
go test -bench=. -cpuprofile=cpu.out
go tool pprof -http=:8080 cpu.out
```

Summary
- pprof: For “Where is my CPU/RAM going?”
- trace: For “Why is there latency?” and “What is the scheduler doing?”
- Optimization loop: Measure -> Fix -> Verify (Benchmark).