Machine Learning & LLMs
AI & LLMs in Go
Python owns Model Training. Go owns Model Serving and Orchestration.
In 2026, you rarely train a model from scratch. You run a pre-trained model (Llama 3, Mistral) and build "agents" around it. Go is well suited to this high-concurrency glue code.
1. Running Local Models (Ollama)
The easiest way to integrate LLMs is Ollama. Ollama runs the model; Go talks to it via API.
Use LangChainGo (github.com/tmc/langchaingo):
```go
llm, err := ollama.New(ollama.WithModel("llama3"))
if err != nil {
	log.Fatal(err)
}
ctx := context.Background()
completion, err := llms.GenerateFromSinglePrompt(ctx, llm, "Explain Go channels")
```

2. Go 1.26 SIMD Support (Experimental)
Go 1.26 introduces experimental SIMD support via the simd and archsimd packages (enabled with GOEXPERIMENT=simd). This matters for:

* Vector Search: Cosine similarity between embeddings is essentially a dot product, and SIMD can speed that inner loop up by an order of magnitude or more.
* Inference: Running small neural nets directly in Go.
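For reference, here is the scalar baseline that SIMD accelerates: cosine similarity is dot(a, b) / (|a| · |b|), and the hot loop is the dot product (a sketch in plain Go, no experimental packages required):

```go
package main

import (
	"fmt"
	"math"
)

// dot is the scalar inner product; this is the loop that SIMD
// (experimental packages or hand-written intrinsics) vectorizes.
func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// cosine similarity between two embeddings: dot(a,b) / (|a| * |b|).
func cosine(a, b []float32) float32 {
	na := math.Sqrt(float64(dot(a, a)))
	nb := math.Sqrt(float64(dot(b, b)))
	return dot(a, b) / float32(na*nb)
}

func main() {
	a := []float32{1, 0, 0}
	b := []float32{0, 1, 0}
	fmt.Println(cosine(a, a)) // identical vectors -> 1
	fmt.Println(cosine(a, b)) // orthogonal vectors -> 0
}
```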
```shell
GOEXPERIMENT=simd go test ./...
```

RAG request path:

```
query text
      |
      v
embedding model --> vector (float32[])
      |
      v
SIMD dot product / ANN prefilter
      |
      v
top-k chunks --> prompt assembly --> LLM response
```
3. Embeddings & Vector Databases
Go is a natural fit for vector databases; Milvus and Weaviate are largely written in Go.
Workflow:

1. Receive text -> send it to an embedding model (via API or local binding).
2. Get back a []float32.
3. Store it in Postgres (pgvector) or Weaviate.
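For step 3, pgvector accepts a vector as a text literal like `[0.1,0.2,0.3]`, so no special driver is needed. A small formatting helper (the table and column names in the SQL comment are made up for illustration):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// vecLiteral renders a []float32 as the text literal pgvector accepts,
// e.g. [0.1,0.2,0.3].
func vecLiteral(v []float32) string {
	parts := make([]string, len(v))
	for i, f := range v {
		parts[i] = strconv.FormatFloat(float64(f), 'g', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

func main() {
	emb := []float32{0.1, 0.2, 0.3}
	// Hypothetical schema: CREATE TABLE chunks (id text, text text, embedding vector(3));
	// Pass vecLiteral(emb) as $3 through database/sql with any Postgres driver.
	sql := "INSERT INTO chunks (id, text, embedding) VALUES ($1, $2, $3::vector)"
	fmt.Println(sql)
	fmt.Println(vecLiteral(emb)) // [0.1,0.2,0.3]
}
```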
Concrete retrieval stub:
```go
type Chunk struct {
	ID    string
	Text  string
	Score float32
}

func topK(ctx context.Context, q []float32, k int) ([]Chunk, error) {
	// Use pgvector's distance operator and ORDER BY distance LIMIT k.
	return nil, nil // stub
}
```

Summary
Don’t use Python for the API layer of your AI app.

* Python: research code, PyTorch.
* Go: the API server, the RAG pipeline, the WebSocket handler.