Jun 07, 2026

Run a Local ChatGPT With Open WebUI and Ollama

A private chat app that looks and works like ChatGPT, answered by a model on your own machine. Conversation history, model switching, file uploads, no subscription, no data leaving the building. Here is the exact setup we run.

Ollama is the easiest way to run a model on your own machine, but its native interface is a terminal. That is fine for a quick test and wrong for real work, and especially wrong for showing someone what local AI feels like. People expect a chat box, a sidebar of past conversations, a model picker, the ability to drop in a PDF and ask about it.

Open WebUI is that interface. It is a polished, browser-based chat app that points at your local Ollama and gives you conversation history, model switching, file uploads, and built-in document chat, all running on your own hardware. We install it on every Stridenote machine and every client demo laptop. Here is how to stand it up yourself.

What you will end up with

  • Ollama serving one or more open models locally.
  • Open WebUI running in a browser tab at http://localhost:3000.
  • A private chat experience that looks and feels like ChatGPT, with nothing leaving your laptop.

No API keys. No monthly bill. No data round-tripping to a third party.

Before you start

You need a Mac, Windows, or Linux machine with at least 16GB of RAM. The UI itself only wants about 4GB; the rest is for whatever model you run. Open WebUI’s recommended install path is Docker, so you want Docker Desktop installed (https://docker.com/products/docker-desktop); on Linux, plain Docker Engine works too. On Windows, Docker Desktop means the WSL2 backend.

You also need a running LLM backend before any of this matters. If you do not have Ollama yet, install it first: that is Step 1 below.

Step 1: Install Ollama and pull a model

Ollama is the engine that downloads and serves the model.

On a Mac:

brew install ollama

On Windows:

winget install Ollama.Ollama

On Linux:

curl -fsSL https://ollama.com/install.sh | sh

Or grab the installer for any platform from https://ollama.com/download. After install, Ollama runs as a background service and listens on http://localhost:11434. If you installed with Homebrew, you may need to start the service yourself with brew services start ollama.

Now pull a model. A general chat model is the right first choice:

ollama pull llama3.2

If you want a second model to switch between later, a coding model is a good companion. Browse tags at https://ollama.com/library and pull whatever fits your RAM. Confirm what you have:

ollama list

Open WebUI will only show models that already appear in this list, so pull before you connect.
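A quick sanity check before moving on: the sketch below asks Ollama's HTTP API which models it has, which is roughly what Open WebUI will do later. It assumes the default address mentioned above and that curl is installed. The function names and the OLLAMA_URL variable are ours, chosen for illustration; /api/tags is the Ollama endpoint that lists local models.

```shell
#!/bin/sh
# Sketch: confirm the Ollama service is up and see which models it serves.
# OLLAMA_URL is an assumption for this example; override it if you changed the port.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Print the URL of the endpoint that lists locally pulled models.
ollama_tags_url() {
  printf '%s/api/tags\n' "${OLLAMA_URL%/}"
}

# Prints the model list as JSON if the service answers within 5 seconds;
# returns non-zero if it does not.
ollama_is_up() {
  curl --silent --fail --max-time 5 "$(ollama_tags_url)"
}

# Usage (run by hand once Ollama is installed):
#   ollama_is_up && echo "Ollama is serving" || echo "Ollama is not reachable"
```

If the call fails, fix that before touching Docker; nothing in the later steps will work without it.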

Step 2: Start Open WebUI with Docker

One command brings up the whole app:

docker run -d -p 3000:8080 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

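A note on each piece, because it matters later. The -d flag runs the container in the background. The -p 3000:8080 flag maps port 3000 on your machine to port 8080 inside the container, which is why the app shows up at http://localhost:3000. The -v open-webui:/app/backend/data flag creates a named Docker volume where your conversation history lives. The --restart always flag brings the container back after a reboot, so the app is just there when you want it.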

On Windows PowerShell, the same command uses backtick line continuations instead of backslashes, or just put it on one line.
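The first start can take a little while before the page loads. A minimal sketch for checking on it, assuming the container name and port from the command above; wait_for_url is a hypothetical helper written for this guide, not part of Docker or Open WebUI.

```shell
#!/bin/sh
# Sketch: poll a URL until it answers, or give up after a number of attempts.
# wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]  (hypothetical helper for this guide)
wait_for_url() {
  url="$1"; attempts="${2:-30}"; delay="${3:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl return non-zero on HTTP errors as well as refused connections.
    if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (run by hand after the docker run command):
#   docker ps --filter name=open-webui     # is the container running?
#   docker logs --tail 20 open-webui       # what did it print on startup?
#   wait_for_url http://localhost:3000 && echo "Open WebUI is ready"
```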

Step 3: Create your account and connect Ollama

  1. Open http://localhost:3000 in your browser.
  2. Create the first account. This is a sign-up form with no email verification. The first account becomes the admin with full server access, so make it yours.
  3. Go to Settings, then Connections.
  4. Confirm the Ollama URL. Because Open WebUI is running inside Docker and Ollama is running on your host machine, the address it needs is http://host.docker.internal:11434, not localhost. This is the single most common stuck point, so check it here.

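On Linux with Docker Engine, host.docker.internal is not defined by default. Add --add-host=host.docker.internal:host-gateway to the docker run command in Step 2, then use the same host.docker.internal URL. If the connection still fails, Ollama may be listening only on 127.0.0.1; setting OLLAMA_HOST=0.0.0.0 for the Ollama service is the usual fix.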
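One way to keep a single start script that works on every platform is to add the extra flag only when the host is Linux. The sketch below does that by checking uname. The names extra_docker_args and start_open_webui are hypothetical; the rest of the command is the one from Step 2.

```shell
#!/bin/sh
# Sketch: build the Step 2 command, adding the host-gateway mapping only on Linux.

# Print the extra flag needed for a given OS name (as reported by `uname -s`).
extra_docker_args() {
  case "$1" in
    Linux) printf '%s\n' '--add-host=host.docker.internal:host-gateway' ;;
    *)     ;;  # Docker Desktop on macOS and Windows resolves the name already
  esac
}

# Start the container with the same flags as Step 2, plus the Linux extra if needed.
start_open_webui() {
  # Left unquoted on purpose: an empty result must expand to no argument at all.
  # shellcheck disable=SC2046
  docker run -d -p 3000:8080 \
    $(extra_docker_args "$(uname -s)") \
    -v open-webui:/app/backend/data \
    --name open-webui \
    --restart always \
    ghcr.io/open-webui/open-webui:main
}

# Usage (run by hand): start_open_webui
```

To test the connection from the container's side, docker exec open-webui curl -s http://host.docker.internal:11434/api/version should print a version string, assuming curl is present in the image.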

Step 4: Prove it works

Pick your model from the dropdown at the top of the chat, then send a real prompt:

Write a brief, professional email apologizing for missing a meeting.

Watch it stream. Open a new chat and notice it lands in the sidebar with the rest of your history. Now switch models in the dropdown and re-ask the same question. Comparing two answers side by side is the moment local AI stops feeling like a toy.

To test the document feature, upload a PDF using the file button, then ask a question about its contents. That is the built-in retrieval-augmented generation (RAG) working, no separate tooling required.

While all of this runs, watch your network indicator. It stays quiet. The work is happening on your machine, against a model on your disk.
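If an answer never arrives in the UI, it can help to send the same prompt straight to Ollama, which separates a model problem from a UI problem. A rough sketch, assuming the default Ollama address and a prompt without quotes or backslashes. The helper names are ours; /api/generate is Ollama's completion endpoint.

```shell
#!/bin/sh
# Sketch: send one prompt directly to Ollama's /api/generate endpoint, bypassing the UI.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Build the JSON request body. Assumes model and prompt contain no double quotes,
# backslashes, or newlines, since this helper does no JSON escaping.
generate_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}\n' "$1" "$2"
}

# POST the prompt and print the raw JSON reply; the answer is in its "response" field.
ask_ollama() {
  curl --silent --fail "${OLLAMA_URL%/}/api/generate" \
    --data "$(generate_payload "$1" "$2")"
}

# Usage (run by hand):
#   ask_ollama llama3.2 "Write a brief, professional email apologizing for missing a meeting."
```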

Gotchas

A few things trip people up, in roughly the order they happen.

  • Models do not appear in the dropdown. Open WebUI only lists models Ollama already has. Run ollama list to confirm, and pull anything missing.
  • The connection fails from Docker. This is almost always the localhost versus host.docker.internal issue from Step 3. Inside the container, localhost means the container, not your machine.
  • That first account is the admin. If other people sign up afterward, they become regular users, but anyone can sign up by default. If multi-user access matters, lock down sign-ups in the admin settings.
  • Your chats live in the Docker volume. Do not delete the open-webui volume unless you mean to erase your history with it.
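  • Updates are manual. Pull the new image with docker pull ghcr.io/open-webui/open-webui:main, then remove the container (docker stop open-webui, then docker rm open-webui) and re-run the command from Step 2. A restart alone keeps running the old image. Your history survives because it lives in the volume, not the container.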
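The last two gotchas go together: back up the volume, then update. A hedged sketch, assuming the container and volume names from Step 2. The function names and the backup filename are ours. A container keeps running the image it was created from, so the update removes and recreates it; the named volume carries your history across.

```shell
#!/bin/sh
# Sketch: back up the open-webui volume, then update the container to the latest image.
IMAGE="ghcr.io/open-webui/open-webui:main"

# Print a dated backup filename, e.g. open-webui-2026-06-07.tgz for 2026-06-07.
backup_name() {
  printf 'open-webui-%s.tgz\n' "$1"
}

# Archive the volume's contents into the current directory using a throwaway container.
# For a fully consistent copy, stop the open-webui container first.
backup_volume() {
  docker run --rm \
    -v open-webui:/data:ro \
    -v "$PWD":/backup \
    alpine tar czf "/backup/$(backup_name "$(date +%F)")" -C /data .
}

# Pull the new image, then recreate the container; the named volume is left untouched.
# On Linux, add the --add-host flag from Step 3 to the run command as well.
update_open_webui() {
  docker pull "$IMAGE" &&
  docker stop open-webui &&
  docker rm open-webui &&
  docker run -d -p 3000:8080 \
    -v open-webui:/app/backend/data \
    --name open-webui \
    --restart always \
    "$IMAGE"
}

# Usage (run by hand): backup_volume && update_open_webui
```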

The honest limit: the built-in document chat is good, not exhaustive. It is more than enough for “chat with this PDF,” but for heavy, multi-document workflows you would reach for a dedicated RAG tool. For everyday use it saves you from installing anything else.

The other honest note is the install path. Docker is the easy road here. If you refuse Docker and use the Python install (pip install open-webui then open-webui serve, which lands on http://localhost:8080), it can fight you on dependencies. We recommend Docker for a reason.
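If you do take the Python road, a virtual environment keeps its dependencies away from the rest of your system. The sketch below is for macOS and Linux. It assumes Python 3.11, which is the version we understand Open WebUI's pip install to target; check the current docs before relying on that. The VENV_DIR location and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: install Open WebUI into its own virtual environment instead of system Python.
VENV_DIR="${VENV_DIR:-$HOME/.venvs/open-webui}"   # hypothetical location

# Succeeds if a `python --version` string reports the given major.minor series.
python_matches() {
  case "$1" in
    "Python $2."*) return 0 ;;
    *)             return 1 ;;
  esac
}

# Create the environment and install the package into it.
install_open_webui() {
  python_matches "$(python3.11 --version 2>&1)" 3.11 || {
    echo "Python 3.11 not found; install it first" >&2
    return 1
  }
  python3.11 -m venv "$VENV_DIR" &&
  "$VENV_DIR/bin/pip" install open-webui
}

# Usage (run by hand): install_open_webui && "$VENV_DIR/bin/open-webui" serve
```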

Where to go next

You now have a private ChatGPT on your own machine. The natural next builds reuse the exact Ollama backend you just set up: wire the same model into your editor for code completion, or point a terminal coding agent at it. Both reuse the model you already pulled, and both are covered in their own Playbooks.

This is your private chat app. It looks the same as the one everyone knows, it works the same, and it runs on this laptop with no subscription and no data leaving the machine. We are curious about these things. You should be too.

Harness your curiosity.

— Stridenote · № 003