Run IronClaw with a 100% local model (Ollama)¶

IronClaw's whole premise is isolation you can prove and no data leaving your box. The last piece of that story is the model itself: instead of a hosted provider (Anthropic, OpenAI, Gemini, Vertex), you can point IronClaw at a self-hosted, OpenAI-compatible endpoint and run the entire stack — control-plane, sandbox, and model — on your own hardware with zero cloud API keys.

This tutorial uses Ollama, but the exact same steps work for LM Studio, vLLM, and llama.cpp — they all expose the OpenAI /v1 Chat Completions API.

By the end you'll have a real agent replying through a real per-session sandbox, powered by a model running locally, with no credential anywhere in the stack.

Why this works

Ollama serves the OpenAI-compatible API at http://localhost:11434/v1. IronClaw's local provider speaks that identical wire format. The host model-proxy allowlists your loopback host and forwards to it over plain HTTP (local servers serve no TLS); the sandbox stays network=none and credential-free, reaching the model only through the proxy socket — exactly as it does for a cloud provider.

Prerequisites¶

A Go toolchain with cgo (CGO_ENABLED=1) — same as the Quickstart. Or install a prebuilt binary and substitute ironclaw-controlplane for ./bin/controlplane below.
Ollama installed and running: https://ollama.com/download.

1. Pull a model with Ollama¶

ollama pull llama3.2          # ~2 GB; any chat model works (qwen2.5, mistral, …)
ollama serve &                # if it isn't already running as a service
curl -s http://localhost:11434/v1/models   # sanity check: the OpenAI-compatible API is up

2. Build IronClaw¶

git clone https://github.com/IronSecCo/ironclaw.git
cd ironclaw
CGO_ENABLED=1 go build -o bin/ ./cmd/controlplane ./cmd/ironctl

3. Point IronClaw at the local model (Terminal 1)¶

# No ANTHROPIC_API_KEY — there is no cloud credential in this posture.
export IRONCLAW_LOCAL_MODEL_URL=http://localhost:11434/v1   # the Ollama OpenAI endpoint
export IRONCLAW_LOCAL_MODEL=llama3.2                        # the model you pulled
export IRONCLAW_API_TOKEN=$(openssl rand -hex 32)           # bearer token for the API
echo "API token: $IRONCLAW_API_TOKEN"                       # copy this for Terminal 2

./bin/controlplane --dev --api-addr 127.0.0.1:8787

Setting --local-model-url (or the IRONCLAW_LOCAL_MODEL_URL env) does three things:

Allowlists localhost:11434 on the model-proxy, so the sandbox may reach it.
Marks it an insecure upstream — forwarded over plain HTTP, port preserved — because local servers serve no TLS.
Makes it the deployment-default model, so the seeded dev-agent (and any agent group without a pinned provider) runs fully local.

You should see local model enabled host=localhost:11434 model=llama3.2 in the startup log. Leave this running.

Equivalent flags

--local-model-url http://localhost:11434/v1 --local-model llama3.2 are the flag form of the two env vars. The flag and env paths are interchangeable.

4. Chat with the local agent (Terminal 2)¶

Option A — the browser console¶

open http://127.0.0.1:8787/ui/      # Linux: xdg-open http://127.0.0.1:8787/ui/

Open the Chat tab, pick the "Dev Agent" group, and say hi. The reply is generated by your local Ollama model — first tokens may take a moment while the model warms up.

Option B — straight from the terminal¶

export IRONCLAW_API_TOKEN=<paste the token from Terminal 1>

curl -s -X POST http://127.0.0.1:8787/v1/ui/chat/send \
  -H "authorization: Bearer $IRONCLAW_API_TOKEN" -H 'content-type: application/json' \
  -d '{"agentGroupID":"dev-agent","text":"In one sentence, what is IronClaw?"}'

sleep 8   # local generation can be slower than a cloud API; give it a moment

curl -s -H "authorization: Bearer $IRONCLAW_API_TOKEN" \
  http://127.0.0.1:8787/v1/ui/chat/dev-agent/messages   # read the reply

The reply came back through IronClaw's encrypted per-session queue from a real sandbox — whose only network path was the host model-proxy, which forwarded to your machine's Ollama. No packet left the box.

What you just proved¶

A self-hoster ran the full chat → sandbox → reply path against a local model with no cloud API key.
The sandbox never held a credential and had network=none; the model-proxy was the single egress, and it pointed at localhost.

Going further¶

Pin local to specific agents. Instead of the deployment default, create an agent group on the local provider explicitly — it inherits the loopback host from IRONCLAW_LOCAL_MODEL_URL:
```
./bin/ironctl agent create --name "Local Bot" --provider local --model llama3.2
```
A local server that requires a key (e.g. a hardened vLLM): set IRONCLAW_LOCAL_MODEL_KEY and the host-proxy stamps it as a Bearer token. It stays host-side and never enters the sandbox. Most local servers (Ollama, LM Studio) need no key at all.
Run under docker compose. When the control-plane runs in a container, localhost is the container's loopback, not the host's. Point at the host instead — Docker Desktop exposes it as host.docker.internal:
```
export IRONCLAW_LOCAL_MODEL_URL=http://host.docker.internal:11434/v1
```
Use a different runtime. LM Studio (http://localhost:1234/v1), vLLM, and llama.cpp's server are all OpenAI-compatible — point IRONCLAW_LOCAL_MODEL_URL at their /v1 and set the model id they serve.

Next steps¶

Harden it. This used --dev (loopback, in-memory registry, no gVisor). For the production seal, see Production deployment — the local-model env vars carry over unchanged.
Mix providers. Different agent groups can use different backends — see Model providers.
Understand the design. IronClaw, Explained · Architecture.