Run IronClaw with a 100% local model (Ollama)¶
IronClaw's whole premise is isolation you can prove and no data leaving your box. The last piece of that story is the model itself: instead of a hosted provider (Anthropic, OpenAI, Gemini, Vertex), you can point IronClaw at a self-hosted, OpenAI-compatible endpoint and run the entire stack — control-plane, sandbox, and model — on your own hardware with zero cloud API keys.
This tutorial uses Ollama, but the exact same steps work for LM Studio,
vLLM, and llama.cpp — they all expose the OpenAI /v1 Chat Completions API.
By the end you'll have a real agent replying through a real per-session sandbox, powered by a model running locally, with no credential anywhere in the stack.
Why this works
Ollama serves the OpenAI-compatible API at http://localhost:11434/v1. IronClaw's local
provider speaks that identical wire format. The host model-proxy allowlists your loopback
host and forwards to it over plain HTTP (local servers serve no TLS); the sandbox stays
network=none and credential-free, reaching the model only through the proxy socket — exactly as
it does for a cloud provider.
Prerequisites¶
- A Go toolchain with cgo (
CGO_ENABLED=1) — same as the Quickstart. Or install a prebuilt binary and substituteironclaw-controlplanefor./bin/controlplanebelow. - Ollama installed and running: https://ollama.com/download.
1. Pull a model with Ollama¶
ollama pull llama3.2 # ~2 GB; any chat model works (qwen2.5, mistral, …)
ollama serve & # if it isn't already running as a service
curl -s http://localhost:11434/v1/models # sanity check: the OpenAI-compatible API is up
2. Build IronClaw¶
git clone https://github.com/IronSecCo/ironclaw.git
cd ironclaw
CGO_ENABLED=1 go build -o bin/ ./cmd/controlplane ./cmd/ironctl
3. Point IronClaw at the local model (Terminal 1)¶
# No ANTHROPIC_API_KEY — there is no cloud credential in this posture.
export IRONCLAW_LOCAL_MODEL_URL=http://localhost:11434/v1 # the Ollama OpenAI endpoint
export IRONCLAW_LOCAL_MODEL=llama3.2 # the model you pulled
export IRONCLAW_API_TOKEN=$(openssl rand -hex 32) # bearer token for the API
echo "API token: $IRONCLAW_API_TOKEN" # copy this for Terminal 2
./bin/controlplane --dev --api-addr 127.0.0.1:8787
Setting --local-model-url (or the IRONCLAW_LOCAL_MODEL_URL env) does three things:
- Allowlists
localhost:11434on the model-proxy, so the sandbox may reach it. - Marks it an insecure upstream — forwarded over plain HTTP, port preserved — because local servers serve no TLS.
- Makes it the deployment-default model, so the seeded
dev-agent(and any agent group without a pinned provider) runs fully local.
You should see local model enabled host=localhost:11434 model=llama3.2 in the startup log. Leave
this running.
Equivalent flags
--local-model-url http://localhost:11434/v1 --local-model llama3.2 are the flag form of the two
env vars. The flag and env paths are interchangeable.
4. Chat with the local agent (Terminal 2)¶
Option A — the browser console¶
Open the Chat tab, pick the "Dev Agent" group, and say hi. The reply is generated by your local Ollama model — first tokens may take a moment while the model warms up.
Option B — straight from the terminal¶
export IRONCLAW_API_TOKEN=<paste the token from Terminal 1>
curl -s -X POST http://127.0.0.1:8787/v1/ui/chat/send \
-H "authorization: Bearer $IRONCLAW_API_TOKEN" -H 'content-type: application/json' \
-d '{"agentGroupID":"dev-agent","text":"In one sentence, what is IronClaw?"}'
sleep 8 # local generation can be slower than a cloud API; give it a moment
curl -s -H "authorization: Bearer $IRONCLAW_API_TOKEN" \
http://127.0.0.1:8787/v1/ui/chat/dev-agent/messages # read the reply
The reply came back through IronClaw's encrypted per-session queue from a real sandbox — whose only network path was the host model-proxy, which forwarded to your machine's Ollama. No packet left the box.
What you just proved¶
- A self-hoster ran the full chat → sandbox → reply path against a local model with no cloud API key.
- The sandbox never held a credential and had
network=none; the model-proxy was the single egress, and it pointed atlocalhost.
Going further¶
-
Pin local to specific agents. Instead of the deployment default, create an agent group on the local provider explicitly — it inherits the loopback host from
IRONCLAW_LOCAL_MODEL_URL: -
A local server that requires a key (e.g. a hardened vLLM): set
IRONCLAW_LOCAL_MODEL_KEYand the host-proxy stamps it as a Bearer token. It stays host-side and never enters the sandbox. Most local servers (Ollama, LM Studio) need no key at all. -
Run under
docker compose. When the control-plane runs in a container,localhostis the container's loopback, not the host's. Point at the host instead — Docker Desktop exposes it ashost.docker.internal: -
Use a different runtime. LM Studio (
http://localhost:1234/v1), vLLM, and llama.cpp's server are all OpenAI-compatible — pointIRONCLAW_LOCAL_MODEL_URLat their/v1and set the model id they serve.
Next steps¶
- Harden it. This used
--dev(loopback, in-memory registry, no gVisor). For the production seal, see Production deployment — the local-model env vars carry over unchanged. - Mix providers. Different agent groups can use different backends — see Model providers.
- Understand the design. IronClaw, Explained · Architecture.