Stop paying monthly for AI: run it locally instead

Count your AI subscriptions. Then count what the same work costs on a machine you already own. The gap is the whole argument.

Written by Hillary

Read time: 4 min

Updated: Jun 19, 2026

Filed under: Notes · Opinion

Count your AI subscriptions. The chat tool. The coding assistant. The image generator. The transcription service. The one your team added last quarter that nobody owns. Add them up as a single monthly number.

Now hold that number against a different one: what the same work costs on a machine you already own. The gap between those two numbers is the whole argument, and most people have never done the arithmetic.

Why do AI subscriptions keep stacking up?

Subscriptions do not arrive as one decision. They accrue. A free tier here, a per-seat plan there, an upgrade because you hit a limit during a deadline. Each one is small enough to wave through, and together they become a fixed cost that grows every time you adopt a tool. The subscription stack eats your margin one reasonable yes at a time.

The pattern is quiet by design. Nobody signs off on a thousand-dollar-a-year AI budget in one meeting, but plenty of teams are paying exactly that without ever having decided to. The thing nobody mentions: you are renting capability that now runs fine on hardware you bought once. We made the same case in "why you should own your AI stack instead of renting it", and the maths has only moved further in local’s favour since.

What does the monthly math actually look like?

Run the numbers on a typical operator. A chat plan at roughly $20 a month. A coding assistant at another $20. An image generator between $10 and $30. Transcription at $15. One extra team seat at $30. None of those is reckless on its own. Those five alone total $95 to $115 a month. Add one more tool or a tier upgrade and the stack lands somewhere around $100 to $150 a month, which is $1,200 to $1,800 a year, every year, rising each time you add a tool.
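The arithmetic is easy to check for yourself. Here is a minimal sketch in Python that sums the five prices listed above; the dictionary keys and function names are illustrative, and you would swap in the prices from your own statements. On the listed prices alone it gives $95 to $115 a month, so the rounder $100 to $150 figure assumes a little headroom for one more tool or an upgrade.

```python
# Monthly prices in USD as (low, high), taken from the paragraph above.
# Replace these with the figures from your own bank statement.
STACK = {
    "chat plan": (20, 20),
    "coding assistant": (20, 20),
    "image generator": (10, 30),
    "transcription": (15, 15),
    "extra team seat": (30, 30),
}


def monthly_range(stack):
    """Return the (low, high) monthly total for a dict of price ranges."""
    low = sum(lo for lo, _ in stack.values())
    high = sum(hi for _, hi in stack.values())
    return low, high


def annual_range(stack):
    """Return the (low, high) yearly total: twelve monthly payments."""
    low, high = monthly_range(stack)
    return low * 12, high * 12


low, high = monthly_range(STACK)
year_low, year_high = annual_range(STACK)
print(f"Monthly: ${low} to ${high}")
print(f"Yearly:  ${year_low:,} to ${year_high:,}")
# Prints:
# Monthly: $95 to $115
# Yearly:  $1,140 to $1,380
```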

Now stretch that across three years and a small team. Five people each carrying a $120 stack is $600 a month, $7,200 a year, and $21,600 over three years. Not one dollar of it builds an asset you keep. The same budget buys several capable machines outright, with the models running on them at no marginal cost for as long as the hardware lasts. Hardware also holds value in a way a subscription never will: a machine you stop using for AI is still a machine.
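The team figures follow from two multiplications, sketched below. The `breakeven_months` helper goes one step further than the text: it is an illustrative addition, and the hardware price it takes is a hypothetical input, since no machine price is quoted here. It also ignores electricity and resale value, so treat the result as a rough guide.

```python
import math


def team_cost(per_person_monthly, people, years):
    """Return (monthly, yearly, total) subscription spend for a team."""
    monthly = per_person_monthly * people
    yearly = monthly * 12
    return monthly, yearly, yearly * years


def breakeven_months(hardware_price, monthly_subscriptions):
    """Months of cancelled subscriptions needed to cover a one-off purchase.

    hardware_price is a hypothetical input; plug in a real quote.
    Electricity and resale value are deliberately left out.
    """
    return math.ceil(hardware_price / monthly_subscriptions)


monthly, yearly, total = team_cost(per_person_monthly=120, people=5, years=3)
print(f"${monthly:,} a month, ${yearly:,} a year, ${total:,} over three years")
# Prints: $600 a month, $7,200 a year, $21,600 over three years
```

As a usage example, `breakeven_months(2400, 120)` returns 20: a machine at an assumed $2,400 would be covered by one person's $120 stack in 20 months.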

Against that, here is what we run at the studio. On a single Apple M4 Pro with 48 GB of unified memory, we use gemma4:31b for writing and glm-4.7-flash for coding, both running locally through Ollama. The recurring cost after the hardware is zero. No per-seat creep, no usage meter, no bill that climbs when we add a fifth use case. We broke the full comparison down in "the true cost of cloud AI versus running local".

Is running AI locally actually easy now?

A few years ago, running a capable model locally meant Python, model conversion, and patience. It was a project. Today it is close to one command. Install an engine, pull a model, and once the download finishes you are answering prompts on your own laptop within a minute, with nothing leaving the machine.

The path is genuinely short. Install Ollama once. Run a single pull command for the model you want: a 30-billion-parameter model if your machine has the memory for it, or a smaller 7B model if memory is tighter. Then ask it a question from the terminal, or point a chat front end at it for something that looks like the tool you are replacing. The first answer lands in seconds, and every answer after that is free. The same install handles drafting, coding, and asking questions of your own files, so one setup retires several subscriptions at once.
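Once a model has been pulled (for example with `ollama pull` followed by the model tag), anything that speaks HTTP can use it. Below is a minimal sketch using only the Python standard library against Ollama's local `/api/generate` endpoint. It assumes the Ollama server is running on its default port, 11434, and that the model tag you pass has already been pulled; the helper names `build_request` and `ask` are illustrative, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is running on this machine on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt, url=OLLAMA_URL):
    """Build a POST request asking a local model for one complete answer."""
    # "stream": False asks for a single JSON object instead of a token stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model, prompt, timeout=300):
    """Send the prompt to the local server and return the generated text."""
    request = build_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read())
    # The generated text is returned under the "response" key.
    return body["response"]
```

With the server running, `print(ask("gemma4:31b", "Summarise this paragraph in one line: ..."))` would send the prompt to the writing model mentioned earlier; substitute whichever tag you pulled. The request goes to localhost, so the prompt stays on the machine.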

That shift is the part the pricing pages would rather you not notice. The reason to rent was that local was hard. Local is not hard anymore.

What do you give up moving from cloud to local AI?

Be honest about the trade, both directions.

You give up the very top of the quality curve. On the hardest problems, a frontier cloud model is still ahead, and if your work lives there every day, keep paying for that one tool. We still keep one frontier subscription for the genuinely hard reasoning jobs, the kind that come up a few times a month rather than a few times an hour. That single retained tool is the exception that proves the rule: pay for the work local cannot do, and stop paying for the work it can.

You give up nothing else. For ordinary work (drafting, coding, transcription, asking questions of your own files), a model on your own machine is enough. And you get back things the subscription never offered: your data stays yours, the tool works on a plane, and the bill does not climb when you add the next capability. There is also a speed dividend people forget: a local model answers without a network round trip, a queue, or a rate limit waiting on the other end. The hidden charges are worse than they look once you count the free tiers, which we covered in "the hidden costs of free cloud AI tiers".

Should you cancel your AI subscriptions?

This is not “cancel everything today.” It is “stop paying by default.” Most people rent every AI capability because renting was once the only option, and they never revisited it. Revisit it. Keep the one cloud tool that earns its bill on work a local model genuinely cannot do, and move the rest onto a machine you already own.

Own your AI rather than renting it, unless the rent is buying you something local cannot. For most of what you do, it is not.

The direction of travel matters here. Every few months the open models get smaller for the same quality, or sharper for the same size, which means the slice of work that genuinely needs a cloud subscription keeps shrinking. The bill that felt unavoidable a year ago is already optional for most of what you do, and that boundary only moves one way.

Count the subscriptions. Run the local version once. Then decide.
