Playbooks Tools Jun 07, 2026

How to Set Up OpenCode With a Local Model, No Terminal

A coding agent that runs as a normal desktop app, pointed at a model on your own laptop. No terminal, no subscription, no code leaving the building. Here is the exact setup we run.

Most open-source coding agents make you live in a terminal. Aider does. Pi does. Cline lives inside VS Code. OpenHands wants Docker. OpenCode is the one that ships as a normal desktop app: download it, double-click, point it at a model, start typing.

Pair that app with a model running on your own laptop, and you get something specific: an agentic coding assistant that reads your files, edits them, and runs commands, with nothing leaving your machine and no monthly bill attached. We run this setup daily in the studio. Here is how to build it yourself.

What you will end up with

  • OpenCode, the desktop app, on your Dock.
  • Ollama serving an open coding model locally.
  • The two wired together, so every prompt you type is answered by a model on your own disk.

No API keys. No cloud round-trips. No terminal, except for one optional command if you want to pull a model the fast way.

Before you start

You need a Mac, Windows, or Linux machine with at least 16GB of RAM. 8GB will technically run a small model, but coding work wants headroom. Apple Silicon (M1 through M4) is noticeably faster than Intel for this, though both work.

That is the whole list. You do not need Python, Homebrew, or a GPU server.

Step 1: Install Ollama

Ollama is the engine that downloads and serves the model. The easiest way to get it is the app installer.

  1. Go to https://ollama.com/download.
  2. Download the installer for your platform and run it.
  3. After install, Ollama runs quietly in the background as a service. There is nothing to keep open.

If you prefer one command on a Mac, brew install ollama does the same job. On Linux, curl -fsSL https://ollama.com/install.sh | sh installs it and sets up the service.

Step 2: Pull a coding model

This is the only step where a single command saves you time. Open a terminal once, pull a model, and you never have to touch the terminal again.

ollama pull qwen2.5-coder:7b

That gives you a capable, widely available coding model that runs comfortably on 16GB. In the studio our daily driver is Nemotron, which is stronger on agentic coding work; browse the exact tags at https://ollama.com/library and pull whichever fits your RAM. A rough guide: a 7B model is fine on 16GB, a larger model wants 32GB or more.
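
The rough guide above can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes a model quantized to about 4 bits per weight, a flat allowance for context and runtime overhead, and a reserve for everything else on the machine. All three numbers are illustrative assumptions, not figures published by Ollama, and real downloads vary with the quantization scheme.

```python
def estimate_model_gb(params_billions, bits_per_weight=4.0, overhead_gb=2.0):
    """Rough memory footprint in GB: quantized weights plus a flat overhead.

    bits_per_weight and overhead_gb are assumed values for illustration.
    """
    weights_gb = params_billions * bits_per_weight / 8  # billions of bytes = GB
    return weights_gb + overhead_gb


def fits_in_ram(params_billions, ram_gb, reserve_gb=6.0):
    """True if the estimate leaves reserve_gb free for the OS and other apps.

    reserve_gb is a hypothetical headroom figure, not a measured one.
    """
    return estimate_model_gb(params_billions) <= ram_gb - reserve_gb


if __name__ == "__main__":
    for size in (7, 14, 32):
        print(f"{size}B on 16GB: {fits_in_ram(size, 16)}")
```

Under these assumptions a 7B model passes on 16GB and a 32B model does not, which matches the guide above. Treat the result as a first filter and let ollama ps report the real number once the model is loaded.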

Confirm it landed:

ollama list

You should see your model in the list. That is the last command the setup itself requires; the others mentioned later in this guide are only for troubleshooting.

Step 3: Install OpenCode

  1. Go to https://opencode.ai/download.
  2. Download the desktop app for your platform. On Apple Silicon, take the arm64 build, not the x64 one.
  3. Open the download, drag OpenCode to Applications (Mac) or run the installer (Windows and Linux).
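  4. Launch it. On macOS, if Gatekeeper complains about an unidentified developer the first time, right-click the app and choose Open. On macOS 15 and later that shortcut no longer works; allow the app under System Settings > Privacy & Security instead.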

The app opens to a clean window. OpenCode also offers a CLI and IDE extensions during onboarding. Skip past them. The desktop app is the point.

Step 4: Point OpenCode at your local model

  1. In OpenCode, open the model-provider settings.
  2. Choose Ollama as the provider.
  3. Confirm the address is http://localhost:11434. That is where Ollama listens by default.
  4. Select the model you pulled in step 2.

If your model does not appear, Ollama is probably not running, or the provider address is wrong. Run ollama list in a terminal to confirm that Ollama responds and that the model is there, then reopen the provider settings.
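
If you would rather check from a script than a terminal, Ollama exposes the same model list over its local HTTP API. The sketch below assumes the default address from step 4 and the /api/tags endpoint described in Ollama's API documentation; the helper names are made up for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address, as in step 4


def model_names(tags_payload):
    """Extract model tags from a parsed /api/tags response."""
    return [m["name"] for m in tags_payload.get("models", [])]


def list_local_models(base_url=OLLAMA_URL, timeout=3):
    """Return the pulled model tags, or None if Ollama is not reachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return model_names(json.load(resp))
    except (OSError, ValueError):
        # OSError covers connection failures (URLError is a subclass);
        # ValueError covers a response that is not valid JSON.
        return None
```

A result of None suggests Ollama is not running or is listening somewhere else. An empty list suggests it is running but no model has been pulled yet.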

Step 5: Prove it works

In the app, type a real prompt against a folder you do not mind it touching:

Open the current folder, summarize what is in it, and suggest three things you could help me with.

OpenCode will ask permission before it reads files or runs anything. Approve each step. While it answers, watch your network monitor: the prompt and the response should generate no outbound traffic. The work is happening on your machine, against a model on your disk.
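
A complementary check is to confirm that the provider address itself points at your own machine. The sketch below is a minimal illustration with a hypothetical helper name. It only inspects the address string, performs no DNS lookup, and says nothing about what other traffic an app may generate.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_address(url):
    """True if the URL's host is localhost or a loopback IP (no DNS lookup)."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a named host other than localhost: treat as remote
```

For the address from step 4, is_local_address("http://localhost:11434") is True. An address on your LAN or a cloud endpoint returns False, which is a cue that prompts would leave the laptop.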

For a second test that proves it can write, not just read:

Add a function to utils.py that takes a list and returns the sum, and add a test for it.

Approve the edits, then open the file. The code is there.
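
What you find will differ from run to run and model to model, so treat the following as an illustration of the kind of result to expect, not as what OpenCode will produce. The names sum_list and test_sum_list are hypothetical, and the test is inlined here where a real project would keep it in a separate file.

```python
# utils.py (illustrative; a local model may name or structure things differently)
def sum_list(values):
    """Return the sum of the numbers in values; an empty list sums to 0."""
    total = 0
    for value in values:
        total += value
    return total


# test_utils.py would normally import sum_list from utils; inlined for brevity
def test_sum_list():
    assert sum_list([1, 2, 3]) == 6
    assert sum_list([]) == 0
    assert sum_list([-1.5, 1.5]) == 0
```

If the agent's version looks very different, that is fine as long as its own test passes when you ask it to run the test.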

What this is good for

  • Refactors, bug fixes, and new features in a normal app window.
  • Whole small projects, built conversationally.
  • Working on a plane, in a cabin, or during an outage. No connection required once the model is downloaded.
  • Sensitive code that should never touch a third-party server.

Honest trade-offs

The model is the ceiling. A weak local model is still a weak agent, and the very best cloud models are still ahead on the hardest problems. For most day-to-day coding, a good local coding model is enough, and the privacy and the zero bill change the math. If you hit a wall on a genuinely hard task, you can switch OpenCode to a cloud provider for that one job and switch back.

A few things that trip people up:

  • Wrong architecture. Apple Silicon Macs want the arm64 build.
  • Models do not appear. Ollama is not running, or the provider address is wrong. Check http://localhost:11434.
  • Memory is sticky. Once a model loads, Ollama keeps it in RAM for a while after the last request (five minutes by default). Run ollama ps to see what is loaded, and pick a smaller model if your machine feels tight.
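
The information ollama ps prints is also available over Ollama's local HTTP API, and a model can be asked to unload without waiting. The sketch below assumes the /api/ps and /api/generate endpoints behave as described in Ollama's API documentation: a size field in bytes, and a keep_alive of 0 meaning "unload now". The helper names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address


def summarize_loaded(ps_payload):
    """Turn a parsed /api/ps response into (name, size in GB) pairs."""
    return [
        (m["name"], round(m.get("size", 0) / 1e9, 1))
        for m in ps_payload.get("models", [])
    ]


def unload_request(model, base_url=OLLAMA_URL):
    """Build (but do not send) a request asking Ollama to unload a model."""
    body = json.dumps({"model": model, "keep_alive": 0}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result of unload_request to urllib.request.urlopen sends it. Building the request separately keeps the example easy to inspect before anything touches the running service.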

Where to go next

Once this is running, the natural next builds are wiring the same Ollama model into your editor for autocomplete, and pointing a documents tool at it so you can ask questions of your own files. Both reuse the model you already pulled here. We cover the editor setup in a separate Playbook.

You now have an AI coding agent running in a regular app window, answered by a model on your own laptop. No terminal. No subscription. We are curious about these things. You should be too.

Harness your curiosity.

— Stridenote · № 001