Advanced Profiling (pprof/trace)

Performance: Deep Dives with pprof & trace

Optimization without measurement is just guesswork. Go has arguably the best built-in profiling tools of any compiled language.

1. pprof: The Sampling Profiler

pprof builds a statistical profile of your program by sampling call stacks (for CPU, roughly 100 times per second by default).

Usage

Import the package for its side effects and start a debug server in your main:

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
    go func() {
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()
    // ... app code ...
}

Now, while your app runs under load, grab a profile:

go tool pprof -http=:8081 "http://localhost:6060/debug/pprof/profile?seconds=30"

This opens a web UI; in the Flame Graph view, the widest bars are using the most CPU. For memory, grab the heap profile instead (http://localhost:6060/debug/pprof/heap) and look at:

  • Alloc Space: which functions allocated the most bytes on the heap.
  • Alloc Objects: which functions allocated the most objects (a high object count means high GC pressure).

Mutex Profiling

If CPU usage is low but throughput is still poor, goroutines are likely stuck waiting on locks (contention). Enable mutex profiling in code:

runtime.SetMutexProfileFraction(1)

Then look at /debug/pprof/mutex to see which call sites are waiting for locks.

2. Execution Tracer (go tool trace)

While pprof aggregates samples, the execution tracer shows you a timeline of every event:

  • When did a goroutine start?
  • When did it block on a channel?
  • When did GC pause execution?

Capture a trace:

curl -o trace.out "http://localhost:6060/debug/pprof/trace?seconds=5"
go tool trace trace.out

Use Case: Debugging latency spikes. You might see a 100ms gap where nothing ran; zooming in often reveals a stop-the-world GC phase or a single mutex holding up 50 goroutines.

3. Benchmarks as Profiling Inputs

You can profile specific functions using benchmarks:

go test -bench=. -cpuprofile=cpu.out
go tool pprof -http=:8080 cpu.out
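Benchmarks normally live in a _test.go file as BenchmarkXxx functions; this standalone sketch uses testing.Benchmark to compare two hypothetical implementations (buildNaive and buildBuilder are illustrative names, not from the text):

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// buildNaive concatenates in a loop, reallocating the string each time.
func buildNaive(n int) string {
	s := ""
	for i := 0; i < n; i++ {
		s += "x"
	}
	return s
}

// buildBuilder preallocates with strings.Builder, avoiding reallocation.
func buildBuilder(n int) string {
	var b strings.Builder
	b.Grow(n)
	for i := 0; i < n; i++ {
		b.WriteByte('x')
	}
	return b.String()
}

func main() {
	// testing.Benchmark runs a benchmark outside `go test`; in a real
	// project, write BenchmarkXxx funcs and use -cpuprofile as above.
	naive := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			buildNaive(1000)
		}
	})
	builder := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			buildBuilder(1000)
		}
	})
	fmt.Println("naive:  ", naive)
	fmt.Println("builder:", builder)
}
```

The resulting profile is focused on one code path, so flame graphs are far less noisy than a whole-app profile.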

Summary

  • pprof: For “Where is my CPU/RAM going?”
  • trace: For “Why is there latency?” and “What is the scheduler doing?”
  • Optimization loop: Measure -> Fix -> Verify (Benchmark).