AI & LLMs in Go

Python owns Model Training. Go owns Model Serving and Orchestration.

In 2026, you rarely train a model from scratch. You run a pre-trained model (Llama 3, Mistral) and build “Agents” around it. Go is perfect for this high-concurrency glue code.

1. Running Local Models (Ollama)

The easiest way to integrate LLMs is Ollama. Ollama runs the model; Go talks to it via API.

Use LangChainGo (github.com/tmc/langchaingo):

llm, err := ollama.New(ollama.WithModel("llama3"))
if err != nil {
    log.Fatal(err)
}
completion, err := llms.GenerateFromSinglePrompt(context.Background(), llm, "Explain Go channels")
if err != nil {
    log.Fatal(err)
}
fmt.Println(completion)

2. Go 1.26 SIMD Support (Experimental)

Go 1.26 introduces experimental SIMD support via the simd and archsimd packages (enabled with GOEXPERIMENT=simd). This is critical for:

* Vector Search: cosine similarity between embeddings is essentially a dot product, and SIMD makes that dot product dramatically faster.
* Inference: running small neural nets directly in Go.

GOEXPERIMENT=simd go test ./...

RAG request path:

query text
   |
   v
embedding model --> vector ([]float32)
   |
   v
SIMD dot product / ANN prefilter
   |
   v
top-k chunks --> prompt assembly --> LLM response

3. Embeddings & Vector Databases

Go is a standard language for vector databases: Weaviate is written in Go, and large parts of Milvus are too.

Workflow:

1. Receive text and send it to an embedding model (via API or a local binding).
2. Get back a []float32 embedding.
3. Store it in Postgres (pgvector) or Weaviate.

Concrete retrieval function (a sketch assuming database/sql, github.com/pgvector/pgvector-go, and a chunks table with an embedding vector column):

type Chunk struct {
    ID    string
    Text  string
    Score float32
}

// topK returns the k chunks nearest to q, using pgvector's cosine
// distance operator (<=>) with ORDER BY distance LIMIT k.
func topK(ctx context.Context, db *sql.DB, q []float32, k int) ([]Chunk, error) {
    rows, err := db.QueryContext(ctx,
        `SELECT id, text, embedding <=> $1 AS score
           FROM chunks ORDER BY score LIMIT $2`,
        pgvector.NewVector(q), k)
    if err != nil {
        return nil, err
    }
    defer rows.Close()
    var out []Chunk
    for rows.Next() {
        var c Chunk
        if err := rows.Scan(&c.ID, &c.Text, &c.Score); err != nil {
            return nil, err
        }
        out = append(out, c)
    }
    return out, rows.Err()
}

Summary

Don’t use Python for the API layer of your AI app.

* Python: research code, PyTorch.
* Go: the API server, the RAG pipeline, the WebSocket handler.