Studio · Builds & Tools · Jun 07, 2026

A Fully Local Image Pipeline With ComfyUI

How the studio built a repeatable image pipeline that runs entirely on one Mac: a node graph instead of a prompt box, models on disk, batch output, and an optional ffmpeg step to glue stills into a clip. Honest about the learning curve.

Written by H Hillary

Read time: 10 min

Updated: Jun 12, 2026

Filed under: Studio · Builds & Tools


We did not want a prompt box. We wanted a pipeline: a thing we could point at a brief on Monday and trust to produce the same kind of image on Friday, on our own machine, with the model files sitting on our own disk. That is a different goal from “type words, get a picture,” and it led us to ComfyUI, which is not the friendliest tool we run but is the one that gave us a pipeline we could actually keep.

This is the build log for that pipeline, including the part of the learning curve we will not pretend was painless.

The goal, stated plainly

Most image tools hide the work behind tabs and sliders. You type a prompt, you move some knobs, an image appears, and you have no real handle on the steps in between. That is fine for a one-off. It is the wrong shape for a studio that produces more than a handful of images a week and wants each one to come out of the same repeatable process.

Image generation is not one operation. It is a chain: load a model, encode the prompt, sample, decode, post-process, save. We wanted that chain laid out where we could see it, change one link, and rerun. ComfyUI is built exactly around that idea. Every step is a visual node you wire together on a canvas. The complexity is not a bug to apologize for. It is the thing we were buying.

The stack

Two pieces, both local.

ComfyUI, running its own server on http://localhost:8188. The node graph is the workspace. On Apple Silicon it picks up MPS (Metal Performance Shaders) for GPU acceleration automatically, which is the only reason a single Mac can carry this.
ffmpeg, installed with brew install ffmpeg and sitting on the PATH. It does no generation. It is the post step that turns a folder of stills into a clip when we need one.
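
Both pieces can be checked before a session starts. This is a minimal sketch using only the Python standard library, not part of our pipeline as described. The function names are ours for illustration, and the server check only asks whether anything answers HTTP at the address above.

```python
import shutil
import urllib.request


def tool_on_path(name):
    """Return the full path of a command-line tool, or None if it is not on PATH."""
    return shutil.which(name)


def server_is_up(url="http://localhost:8188", timeout=2.0):
    """Return True if something answers HTTP at url, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # Refused connections, timeouts and URLError are OSError subclasses;
        # a malformed URL raises ValueError.
        return False
```

Calling tool_on_path("ffmpeg") and server_is_up() tells you which of the two pieces is missing before you queue anything.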

The models live on disk under models/, and where they live matters more than it should. Checkpoints, VAEs, encoders, and LoRAs each go in their own subfolder. Put one in the wrong place and it simply does not appear in the UI, with no error to explain why. We learned that the boring way.
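
Because the UI stays silent about a misplaced file, a small audit script can say what the UI will not. The sketch below is an assumption-laden helper, not something ComfyUI ships. It covers only the subfolders this article uses (ComfyUI has others, so a "stray" here may be legitimate), and the list of file extensions is our guess at what counts as a model file.

```python
from pathlib import Path

# Only the subfolders this article relies on; ComfyUI has more.
EXPECTED_SUBFOLDERS = ("checkpoints", "diffusion_models", "text_encoders", "vae", "loras")
# Assumed model file extensions; adjust to what you actually download.
MODEL_SUFFIXES = {".safetensors", ".ckpt", ".pt", ".gguf"}


def audit_models(models_root):
    """Return (per_folder, strays).

    per_folder maps each expected subfolder to the model files inside it.
    strays lists model files found anywhere else under models_root.
    """
    root = Path(models_root)
    per_folder = {name: [] for name in EXPECTED_SUBFOLDERS}
    strays = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in MODEL_SUFFIXES:
            continue
        rel = path.relative_to(root)
        top = rel.parts[0]
        if len(rel.parts) > 1 and top in per_folder:
            per_folder[top].append(rel.as_posix())
        else:
            strays.append(rel.as_posix())
    return per_folder, strays
```

An empty list for a folder you expected to be populated, or a file in strays, is usually the whole explanation for a model that does not appear.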

Building the graph

The first real graph is the default one ComfyUI loads: load a checkpoint, encode a positive and a negative prompt, sample, decode, save. We ran it once with a starter model just to watch an image come out of the Save Image node and confirm the whole chain was wired and alive.

Then we built the graph we actually wanted, which for us is the Flux Schnell pipeline. That meant putting the Flux model in models/diffusion_models/, the T5 and CLIP encoders in models/text_encoders/, and the VAE in models/vae/. Once those files were in their correct folders and showed up in their nodes, the graph ran. The first Flux generation was slow, because the model is large (12 GB or more for the FP8 build) and has to load into memory first. Every generation after that was fast. That first-load lag caught us off guard until we understood it was a one-time cost per session, not the per-image speed.

The thing we did not expect to value as much as we do: once a graph is built, we save it as a JSON file. Dragging that file onto the canvas reloads the entire pipeline at once. That single habit is what turned a node graph into a pipeline. We do not rebuild anything. We drag in the workflow, edit the prompt node, and queue.
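
The same drag-in, edit, queue loop can be scripted. The sketch below is one way to do it, under assumptions: it expects the workflow exported in ComfyUI's API format, which is a different file from the JSON you drag onto the canvas, and the node id "6" in the usage note is hypothetical. It builds the request for the server's /prompt endpoint but leaves sending it to you.

```python
import copy
import json
import urllib.request


def load_workflow(path):
    """Read an API-format workflow JSON file into a dict keyed by node id."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def set_prompt_text(workflow, node_id, text):
    """Return a copy of the workflow with one CLIPTextEncode node's text changed."""
    updated = copy.deepcopy(workflow)
    node = updated[node_id]
    if node.get("class_type") != "CLIPTextEncode":
        raise ValueError(f"node {node_id} is not a CLIPTextEncode node")
    node["inputs"]["text"] = text
    return updated


def build_queue_request(workflow, server="http://localhost:8188"):
    """Build (but do not send) the POST request that queues a workflow."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With a workflow loaded, set_prompt_text(workflow, "6", "a red kite over a harbor") followed by urllib.request.urlopen(build_queue_request(...)) would queue one image. Find the real node id by opening the exported file.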

Daily use is two steps

Here is the part worth being honest about in both directions. Building the graph the first time is a learning curve, and the canvas genuinely feels intimidating when nodes are running off in every direction. But daily use is not that. Daily use is: edit the prompt in the text encode node, then click Queue Prompt. After that you wait and take the image out of the Save node. Two steps. The intimidating part is a one-time tax, not a recurring one.

For batch work we let the graph run a set rather than a single image: thumbnails, social cards, header variations, several at once. This is where exposing the chain pays back, because the same graph that makes one image makes fifty without us touching it between runs.
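
One way to picture a batch is as a list of copies of the same graph, each with its own prompt and output name. This is a sketch rather than our exact setup. It assumes an API-format workflow, and the node ids are whatever your export uses for the text encode and Save Image nodes.

```python
import copy


def batch_jobs(workflow, text_node, save_node, prompts):
    """Build one API-format workflow per entry in prompts.

    prompts maps an output name (used as the Save Image filename_prefix)
    to the prompt text for that image. The original workflow is not modified.
    """
    jobs = []
    for name, text in prompts.items():
        job = copy.deepcopy(workflow)
        job[text_node]["inputs"]["text"] = text
        job[save_node]["inputs"]["filename_prefix"] = name
        jobs.append(job)
    return jobs
```

Each job in the returned list can then be queued in turn, and the distinct prefixes keep thumbnails, social cards, and headers from colliding in the output folder.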

We also keep a small live trick for when we are showing the pipeline to someone: change one parameter, the seed or the sampler or the step count, and rerun. Watching the same prompt resolve differently because one node changed is the clearest possible demonstration of why the pipeline is built this way.
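
The same trick works as a sweep: hold everything fixed and vary one input on one node. The helper below is an illustrative sketch over an API-format workflow. The node id and input name are whatever your graph uses, for example a seed or step count on the sampler node.

```python
import copy


def vary_input(workflow, node_id, input_name, values):
    """Return one workflow copy per value, differing only in a single node input."""
    if input_name not in workflow[node_id]["inputs"]:
        raise KeyError(f"node {node_id} has no input named {input_name!r}")
    variants = []
    for value in values:
        variant = copy.deepcopy(workflow)
        variant[node_id]["inputs"][input_name] = value
        variants.append(variant)
    return variants
```

Refusing unknown input names is deliberate: a typo would otherwise add a new key the node ignores, and every variant would render the same image.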

The ffmpeg post step

Some jobs want motion, not a still. When we generate a sequence of frames out of ComfyUI, ffmpeg is the glue that assembles them into a clip. The pattern from our notes is short:

ffmpeg -framerate 24 -i frame_%04d.png -c:v libx264 -pix_fmt yuv420p output.mp4

That is the whole post step. ComfyUI produces the numbered frames, and ffmpeg stitches them at 24 frames per second into an MP4. ffmpeg does none of the AI work. It is the quietly essential tool that makes “I have a folder of stills, I need a clip” a thirty-second task instead of a trip into a video editor. Flag order matters with ffmpeg: input options go before the -i they apply to, and output options go after it, before the output file name. That is the one rule we keep in our heads. The rest we look up.
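
A gap in the frame numbering is the usual way this step goes wrong, because ffmpeg stops reading the sequence at the first missing file. The sketch below checks for gaps and builds the same command as an argument list. It assumes frames are named exactly as in the pattern above (frame_0001.png and so on). As far as we know, ComfyUI's Save Image node appends its own counter to the prefix, so the pattern may need adjusting or the files renaming first. The -start_number flag is our addition, for sequences that do not begin at 0.

```python
import re
from pathlib import Path


def frame_numbers(folder, prefix="frame_", digits=4, suffix=".png"):
    """Return the sorted frame numbers found for names like frame_0001.png."""
    name_re = re.compile(rf"{re.escape(prefix)}(\d{{{digits}}}){re.escape(suffix)}")
    numbers = []
    for path in Path(folder).iterdir():
        match = name_re.fullmatch(path.name)
        if match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)


def missing_frames(numbers):
    """Return the frame numbers absent between the first and last frame."""
    if not numbers:
        return []
    present = set(numbers)
    return [n for n in range(numbers[0], numbers[-1] + 1) if n not in present]


def ffmpeg_command(folder, start, fps=24, output="output.mp4"):
    """Build the ffmpeg call as an argument list suitable for subprocess.run."""
    return [
        "ffmpeg",
        "-framerate", str(fps),        # input option: goes before -i
        "-start_number", str(start),   # input option: number of the first frame
        "-i", str(Path(folder) / "frame_%04d.png"),
        "-c:v", "libx264",             # output options: go after -i
        "-pix_fmt", "yuv420p",
        output,
    ]
```

If missing_frames returns an empty list, passing ffmpeg_command(folder, numbers[0]) to subprocess.run(..., check=True) runs the stitch.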

What broke, and what we watched

The breakage was never the model. It was always the plumbing.

Models not appearing. Almost always a file in the wrong subfolder under models/. The directory structure is the contract, and the UI will not warn you when you break it.
A workflow that ran on one machine and not another. Different model file names, paths, or node versions break a shared graph. We document the exact file names a workflow expects, because the JSON alone is not portable.
Custom nodes breaking on update. ComfyUI updates often, and community node packs sometimes lag behind. When a workflow matters, we pin it rather than chase the latest.
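
Documenting the file names a workflow expects can be automated instead of done by hand. This is a minimal sketch, assuming an API-format workflow and assuming the loader nodes use the input names listed in LOADER_INPUTS, which match the stock loaders we know of. Custom nodes may use other names and would be missed.

```python
# Input names that carry a model file name on common loader nodes (assumed list).
LOADER_INPUTS = (
    "ckpt_name", "unet_name", "vae_name",
    "clip_name", "clip_name1", "clip_name2", "lora_name",
)


def required_model_files(workflow):
    """Return sorted (class_type, input_name, file_name) tuples for a workflow."""
    required = []
    for node in workflow.values():
        inputs = node.get("inputs", {})
        for key in LOADER_INPUTS:
            value = inputs.get(key)
            # Links to other nodes are lists; only plain strings are file names.
            if isinstance(value, str):
                required.append((node.get("class_type", "?"), key, value))
    return sorted(required)
```

Saving that list next to the workflow JSON gives the next machine a checklist to compare against its own models/ folder.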

What we watched, once the plumbing held, was a real pipeline. A prompt went in, the progress moved visibly through the nodes, and a consistent image came out the other end, every time, with nothing leaving the machine. On a Mac the unified memory means Flux eats into what is available for everything else while it is loaded, so we plan around having it resident. That is a known cost, not a surprise.

The takeaway

ComfyUI is the cornerstone of how we make images now, and it earned that spot for an unglamorous reason: it gave us a process we can repeat instead of a box we have to coax. The node graph is a real learning curve and we will not tell you otherwise. But the curve is paid once. After that, the daily reality is edit-prompt, queue-prompt, and the pipeline does the rest, with an ffmpeg line on the end for the days we need a clip instead of a still.

We did not tell you it is the easiest image tool. We told you what we built and why it stays in production: a repeatable pipeline, on our own disk, that we can hand to next week without rebuilding. We are curious about these things. You should be too.

Harness your curiosity.

— Stridenote · № 006

