Most AI assistants rely on cloud APIs — every message you type goes to OpenAI, Anthropic, or Google's servers. For some use cases that's fine. But for privacy-conscious users, offline setups, or anyone who wants zero recurring costs, a fully local AI stack is the answer.
OpenClaw + Ollama delivers exactly that. Ollama runs large language models locally on your machine. OpenClaw connects to it as the AI backend. The result: a personal AI assistant that never touches the internet for inference, costs nothing to run (beyond your hardware), and keeps every conversation on your own disk.
What You'll Need
- A machine with at least 8GB RAM (16GB recommended)
- Linux, macOS, or Windows (WSL2)
- OpenClaw installed
- Ollama installed
- 30 minutes
Step 1: Install Ollama
If you haven't already, install Ollama on your machine. The quickest way on Linux:
curl -fsSL https://ollama.ai/install.sh | sh
For macOS, download the installer from ollama.ai. For Windows, use WSL2 and the Linux install command.
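Once installed, it's worth sanity-checking the binary and the background server before moving on. These are standard Ollama CLI commands:

```shell
# Confirm the CLI is on your PATH; prints a version string
ollama --version

# The installer usually starts the Ollama server automatically.
# If it isn't running, start it in a separate terminal:
ollama serve
```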
Step 2: Pull a Model
Ollama supports dozens of models. For a balance of speed and capability with OpenClaw, start with Llama 3.2 (3B) or Mistral 7B:
ollama pull llama3.2:3b
For better results (at the cost of more RAM), use:
ollama pull mistral
ollama pull llama3.2:latest
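After a pull completes, you can confirm what's available locally and inspect a model before wiring it up:

```shell
# List downloaded models with their size and modification time
ollama list

# Show a model's parameters, template, and license
ollama show llama3.2:3b
```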
Step 3: Configure OpenClaw for Ollama
OpenClaw's config file lets you set Ollama as the AI provider. Edit your config file (usually at ~/.openclaw/config.yaml):
agents:
  defaults:
    model: ollama/llama3.2:3b
    provider: openai                      # OpenClaw uses OpenAI-compatible API format
    apiBase: http://127.0.0.1:11434/v1
    apiKey: ollama                        # Ollama doesn't require a real key
Ollama exposes an OpenAI-compatible API at http://127.0.0.1:11434/v1, so OpenClaw can treat it like any other OpenAI-style provider, just pointed at your local machine.
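You can verify the endpoint directly before touching OpenClaw. This assumes Ollama is running and llama3.2:3b has been pulled:

```shell
# Send a chat completion request in OpenAI format to the local Ollama server
curl -s http://127.0.0.1:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama3.2:3b",
        "messages": [{"role": "user", "content": "Reply with one short sentence."}]
      }'
```

If this returns a JSON response with a `choices` array, the local API is working and any failure from OpenClaw is a config issue rather than an Ollama issue.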
Step 4: Restart and Test
Restart the OpenClaw gateway:
openclaw gateway restart
Then send a message to your assistant. If Ollama is running and the model is loaded, you'll get responses from your local AI, with no internet connection required.
Why Go Local
- Privacy: No data ever leaves your machine
- Cost: Zero API fees — one-time hardware cost only
- Offline: Works without internet access
- No rate limits: Unlimited conversations
- Custom models: Run Llama, Mistral, Phi, Gemma, or anything Ollama supports
When to Use Local vs Cloud
Local models are great for: private conversations, sensitive data analysis, offline environments, cost-sensitive setups, and prototyping.
Cloud models (Claude, GPT-4) are better for: complex reasoning, long-form content generation, detailed coding assistance, and tasks that benefit from a larger model's capability.
You can configure OpenClaw to use both — switch between your local model and a cloud provider depending on the task. Set Ollama as the default and Claude/GPT-4 as a fallback for complex queries.
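As a sketch of what a dual setup could look like, assuming OpenClaw supports named agent blocks alongside defaults (everything beyond the defaults block below is hypothetical, including the agent name and keys):

```yaml
agents:
  defaults:          # everyday traffic stays local
    model: ollama/llama3.2:3b
    provider: openai
    apiBase: http://127.0.0.1:11434/v1
    apiKey: ollama
  research:          # hypothetical agent routed to a cloud model for hard tasks
    model: claude-sonnet
    provider: anthropic
    apiKey: ${ANTHROPIC_API_KEY}
```

Only the defaults block matches the format shown earlier; check OpenClaw's configuration docs for the exact keys for multi-provider setups.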
Performance Notes
- 3B parameter models run well on 8GB RAM machines and respond in 1-3 seconds
- 7B models need 12-16GB for comfortable use
- 13B+ parameter models are better suited for machines with 24GB+ RAM or a GPU
- For best performance, use a machine with CUDA or Metal GPU acceleration
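To check whether a loaded model is actually using your GPU, Ollama reports this at runtime:

```shell
# Show currently loaded models, their memory footprint,
# and whether they are running on CPU or GPU
ollama ps
```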
The beauty of this setup is that it scales with your hardware. Start with a tiny 3B model and upgrade as your infrastructure grows. OpenClaw stays the same — just change the model name in config.
Ready to get started with OpenClaw? Install OpenClaw →