Ollama Provider

The Ollama provider implements the genai.Provider interface for Ollama, an open-source tool for running LLMs locally. It works by wrapping the OpenAI provider, since Ollama (v0.1.14+) exposes an OpenAI-compatible API at /v1/chat/completions.

Quick Start

import (
    "context"
    "fmt"

    "oss.nandlabs.io/golly/genai"
    "oss.nandlabs.io/golly/genai/impl"
)

provider := impl.NewOllamaProvider(nil)
defer provider.Close()

msg := genai.NewTextMessage(genai.RoleUser, "Hello! What is Go?")
resp, err := provider.Generate(context.Background(), "llama3", msg, nil)
if err != nil {
    panic(err)
}
for _, c := range resp.Candidates {
    for _, p := range c.Message.Parts {
        if p.Text != nil {
            fmt.Println(p.Text.Text)
        }
    }
}

Prerequisites

  1. Install Ollama: https://ollama.com/download
  2. Pull a model:
    ollama pull llama3
  3. Start Ollama (if not running as a service):
    ollama serve

The default endpoint is http://localhost:11434.

Configuration

Simple Constructor

// Connects to http://localhost:11434/v1 with no authentication
provider := impl.NewOllamaProvider(nil)

Full Configuration

import (
    "oss.nandlabs.io/golly/clients"
    "oss.nandlabs.io/golly/genai/impl"
    "oss.nandlabs.io/golly/rest"
)

provider := impl.NewOllamaProviderWithConfig(&impl.OllamaProviderConfig{
    Auth:        nil, // no auth for local
    BaseURL:     "http://localhost:11434/v1",
    Models:      []string{"llama3", "mistral", "codellama", "llava"},
    Description: "Local Ollama instance",
    Version:     "1.0.0",
    ExtraHeaders: map[string]string{
        "X-Request-Source": "dev-machine",
    },
}, &rest.ClientOpts{
    // Optional: configure timeouts for long-running local inference
})

Config Reference

Field         Type                  Default                    Description
-----         ----                  -------                    -----------
Auth          clients.AuthProvider  nil                        Authentication provider (nil for local, set for proxied)
BaseURL       string                http://localhost:11434/v1  Ollama OpenAI-compatible API URL
Models        []string              nil                        List of available model IDs (informational)
Description   string                "Ollama provider for local model inference..."  Provider description
Version       string                "1.0.0"                    Provider version
ExtraHeaders  map[string]string     nil                        Additional HTTP headers sent on every request

Authentication

Local (No Auth)

The default: no credentials are needed:

provider := impl.NewOllamaProvider(nil)

Behind a Reverse Proxy

When Ollama is deployed behind nginx, Caddy, or another proxy with authentication:

// HTTP Basic Auth proxy
provider := impl.NewOllamaProviderWithConfig(&impl.OllamaProviderConfig{
    Auth:    clients.NewBasicAuth("username", "password"),
    BaseURL: "https://ollama.internal.company.com/v1",
}, nil)

// Bearer token proxy
provider = impl.NewOllamaProviderWithConfig(&impl.OllamaProviderConfig{
    Auth:    clients.NewBearerAuth("my-proxy-token"),
    BaseURL: "https://ollama.internal.company.com/v1",
}, nil)

Behind an API Gateway

When Ollama is fronted by Kong, AWS API Gateway, or similar:

// API key gateway
provider := impl.NewOllamaProviderWithConfig(&impl.OllamaProviderConfig{
    Auth:    clients.NewAPIKeyAuth("X-API-Key", "my-gateway-key"),
    BaseURL: "https://api.company.com/ollama/v1",
}, nil)

// OAuth2 gateway
oauth := rest.NewOAuth2Provider(
    "https://auth.company.com/oauth/token",
    "client-id", "client-secret",
    "openid", // scopes
)
provider = impl.NewOllamaProviderWithConfig(&impl.OllamaProviderConfig{
    Auth:    oauth,
    BaseURL: "https://api.company.com/ollama/v1",
}, nil)

Supported Options

Since Ollama uses the OpenAI-compatible API, it supports the same options as the OpenAI provider. However, actual support depends on the model:

GenAI Option              Parameter             Type      Notes
------------              ---------             ----      -----
OptionMaxTokens           max_tokens            int       Maximum tokens in the response
OptionTemperature         temperature           float32   Sampling temperature
OptionTopP                top_p                 float32   Nucleus sampling
OptionSeed                seed                  int       Deterministic output
OptionStopWords           stop                  []string  Stop sequences
OptionFrequencyPenalty    frequency_penalty     float64   Penalise frequent tokens
OptionPresencePenalty     presence_penalty      float64   Penalise repeated tokens
OptionSystemInstructions  messages[0] (system)  string    Prepended as a system message
Note: Not all models support all options. For example, vision models like llava support image inputs, while text-only models like llama3 do not.

Generating Responses

Basic Generation

msg := genai.NewTextMessage(genai.RoleUser, "Write a haiku about Go programming.")
opts := genai.NewOptionsBuilder().
    SetMaxTokens(256).
    SetTemperature(0.8).
    Build()

resp, err := provider.Generate(ctx, "llama3", msg, opts)

System Instructions

opts := genai.NewOptionsBuilder().
    SetMaxTokens(1024).
    Build()
opts.Set(genai.OptionSystemInstructions, "You are a Linux sysadmin. Respond with shell commands and brief explanations.")

msg := genai.NewTextMessage(genai.RoleUser, "How do I find large files on disk?")
resp, err := provider.Generate(ctx, "llama3", msg, opts)

Streaming

msg := genai.NewTextMessage(genai.RoleUser, "Explain the differences between goroutines and threads.")
opts := genai.NewOptionsBuilder().SetMaxTokens(2048).Build()

respCh, errCh := provider.GenerateStream(ctx, "llama3", msg, opts)
for resp := range respCh {
    for _, c := range resp.Candidates {
        for _, p := range c.Message.Parts {
            if p.Text != nil {
                fmt.Print(p.Text.Text) // prints token by token
            }
        }
    }
}
if err := <-errCh; err != nil {
    // handle streaming error
}

Multi-Modal (Vision)

Use a vision-capable model like llava:

msg := genai.NewTextMessage(genai.RoleUser, "What's in this image?")
genai.AddBinPart(msg, "photo", jpegBytes, "image/jpeg")

resp, err := provider.Generate(ctx, "llava", msg, nil)

Architecture

┌──────────────────────┐
│   OllamaProvider     │  wraps
│                      │────────►  OpenAIProvider
│  Name() = "ollama"   │           (full implementation)
└──────────────────────┘
         │
         │  HTTP POST to /v1/chat/completions
         ▼
┌──────────────────────┐
│   Ollama Server      │
│  localhost:11434     │
│                      │
│  ┌────────────────┐  │
│  │  llama3        │  │
│  │  mistral       │  │
│  │  codellama     │  │
│  │  llava         │  │
│  └────────────────┘  │
└──────────────────────┘

The OllamaProvider embeds OpenAIProvider and overrides only Name(). All Generate and GenerateStream calls are delegated to the OpenAI implementation, which makes requests to Ollama’s OpenAI-compatible endpoint.

Popular Models

Model           Size       Use Case              Vision
-----           ----       --------              ------
llama3          8B         General purpose       No
llama3:70b      70B        High quality, slower  No
mistral         7B         Fast, good quality    No
codellama       7B–34B     Code generation       No
llava           7B–13B     Vision + text         Yes
phi3            3.8B       Small, fast           No
gemma2          9B/27B     Google's open model   No
deepseek-coder  6.7B–33B   Code generation       No
qwen2           7B–72B     Multilingual          No

Pull models with:

ollama pull llama3
ollama pull llava
ollama pull codellama

Error Handling

resp, err := provider.Generate(ctx, "llama3", msg, opts)
if err != nil {
    // Common errors:
    // - "openai API request failed: ..." โ€” Ollama not running or unreachable
    // - "openai API error [...]" โ€” model not found or invalid request
    log.Fatal(err)
}
Note: Errors use the "openai" prefix because the Ollama provider delegates to the OpenAI implementation. The error messages still accurately describe the issue.

Deployment Patterns

Local Development

provider := impl.NewOllamaProvider(nil)

Docker

# docker-compose.yml
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama

volumes:
  ollama_data:

provider := impl.NewOllamaProviderWithConfig(&impl.OllamaProviderConfig{
    BaseURL: "http://ollama:11434/v1", // Docker service name
}, nil)

Kubernetes with Auth Sidecar

provider := impl.NewOllamaProviderWithConfig(&impl.OllamaProviderConfig{
    Auth:    clients.NewBearerAuth(os.Getenv("OLLAMA_TOKEN")),
    BaseURL: os.Getenv("OLLAMA_URL"),
}, nil)