Skip to content

Run IronClaw with a 100% local model (Ollama)

IronClaw's whole premise is isolation you can prove and no data leaving your box. The last piece of that story is the model itself: instead of a hosted provider (Anthropic, OpenAI, Gemini, Vertex), you can point IronClaw at a self-hosted, OpenAI-compatible endpoint and run the entire stack — control-plane, sandbox, and model — on your own hardware with zero cloud API keys.

This tutorial uses Ollama, but the exact same steps work for LM Studio, vLLM, and llama.cpp — they all expose the OpenAI /v1 Chat Completions API.

By the end you'll have a real agent replying through a real per-session sandbox, powered by a model running locally, with no credential anywhere in the stack.

Why this works

Ollama serves the OpenAI-compatible API at http://localhost:11434/v1. IronClaw's local provider speaks that identical wire format. The host model-proxy allowlists your loopback host and forwards to it over plain HTTP (local servers serve no TLS); the sandbox stays network=none and credential-free, reaching the model only through the proxy socket — exactly as it does for a cloud provider.

Prerequisites

1. Pull a model with Ollama

ollama pull llama3.2          # ~2 GB; any chat model works (qwen2.5, mistral, …)
ollama serve &                # if it isn't already running as a service
curl -s http://localhost:11434/v1/models   # sanity check: the OpenAI-compatible API is up

2. Build IronClaw

git clone https://github.com/IronSecCo/ironclaw.git
cd ironclaw
CGO_ENABLED=1 go build -o bin/ ./cmd/controlplane ./cmd/ironctl

3. Point IronClaw at the local model (Terminal 1)

# No ANTHROPIC_API_KEY — there is no cloud credential in this posture.
export IRONCLAW_LOCAL_MODEL_URL=http://localhost:11434/v1   # the Ollama OpenAI endpoint
export IRONCLAW_LOCAL_MODEL=llama3.2                        # the model you pulled
export IRONCLAW_API_TOKEN=$(openssl rand -hex 32)           # bearer token for the API
echo "API token: $IRONCLAW_API_TOKEN"                       # copy this for Terminal 2

./bin/controlplane --dev --api-addr 127.0.0.1:8787

Setting --local-model-url (or the IRONCLAW_LOCAL_MODEL_URL env) does three things:

  1. Allowlists localhost:11434 on the model-proxy, so the sandbox may reach it.
  2. Marks it an insecure upstream — forwarded over plain HTTP, port preserved — because local servers serve no TLS.
  3. Makes it the deployment-default model, so the seeded dev-agent (and any agent group without a pinned provider) runs fully local.

You should see local model enabled host=localhost:11434 model=llama3.2 in the startup log. Leave this running.

Equivalent flags

--local-model-url http://localhost:11434/v1 --local-model llama3.2 are the flag form of the two env vars. The flag and env paths are interchangeable.

4. Chat with the local agent (Terminal 2)

Option A — the browser console

open http://127.0.0.1:8787/ui/      # Linux: xdg-open http://127.0.0.1:8787/ui/

Open the Chat tab, pick the "Dev Agent" group, and say hi. The reply is generated by your local Ollama model — first tokens may take a moment while the model warms up.

Option B — straight from the terminal

export IRONCLAW_API_TOKEN=<paste the token from Terminal 1>

curl -s -X POST http://127.0.0.1:8787/v1/ui/chat/send \
  -H "authorization: Bearer $IRONCLAW_API_TOKEN" -H 'content-type: application/json' \
  -d '{"agentGroupID":"dev-agent","text":"In one sentence, what is IronClaw?"}'

sleep 8   # local generation can be slower than a cloud API; give it a moment

curl -s -H "authorization: Bearer $IRONCLAW_API_TOKEN" \
  http://127.0.0.1:8787/v1/ui/chat/dev-agent/messages   # read the reply

The reply came back through IronClaw's encrypted per-session queue from a real sandbox — whose only network path was the host model-proxy, which forwarded to your machine's Ollama. No packet left the box.

What you just proved

  • A self-hoster ran the full chat → sandbox → reply path against a local model with no cloud API key.
  • The sandbox never held a credential and had network=none; the model-proxy was the single egress, and it pointed at localhost.

Going further

  • Pin local to specific agents. Instead of the deployment default, create an agent group on the local provider explicitly — it inherits the loopback host from IRONCLAW_LOCAL_MODEL_URL:

    ./bin/ironctl agent create --name "Local Bot" --provider local --model llama3.2
    
  • A local server that requires a key (e.g. a hardened vLLM): set IRONCLAW_LOCAL_MODEL_KEY and the host-proxy stamps it as a Bearer token. It stays host-side and never enters the sandbox. Most local servers (Ollama, LM Studio) need no key at all.

  • Run under docker compose. When the control-plane runs in a container, localhost is the container's loopback, not the host's. Point at the host instead — Docker Desktop exposes it as host.docker.internal:

    export IRONCLAW_LOCAL_MODEL_URL=http://host.docker.internal:11434/v1
    
  • Use a different runtime. LM Studio (http://localhost:1234/v1), vLLM, and llama.cpp's server are all OpenAI-compatible — point IRONCLAW_LOCAL_MODEL_URL at their /v1 and set the model id they serve.

Next steps