Inference & serving — the open-weights stack — what's accelerating
Round two on local AI. The first special edition covered the engines — ds4, llmfit, omlx, NemoClaw, QwenPaw. This one is the layer above them: the serving front-ends, control panels and fully-local agents that turn an open-weights model into something you actually use. Smaller, more honest list — most of this bucket is local-adjacent rather than true serving, and it's labelled that way below.
Top mover
The de-facto front-end for self-hosted models — a polished UI that speaks Ollama and the OpenAI API alike. At 140k stars still adding ~144/day, it's the serving layer most open-weights deployments end up sitting behind. The engine gets the headlines; this is what users actually look at.
---
The serving + open-weights layer
A fully local "Manus" — an autonomous agent that thinks, browses and codes with no APIs and no monthly bill, paying only in electricity. It's the demand-side proof for this whole stack: people want agentic behaviour running entirely on open weights they host themselves.
A modern open-source VPS control panel with native AI-agent support — run Ollama models and deploy agents from a managed UI. The interesting move is infrastructure tooling treating local model-serving as a first-class workload rather than a bolt-on.
---
Local-adjacent, not serving
labelled honestlyThree fast climbers in this bucket aren't really inference/serving and shouldn't pad the list: tobi/qmd (⭐26,179 · ↑146.3/day · TypeScript) — actually the highest-velocity repo here — is an all-local CLI search engine for your docs and notes; iOfficeAI/AionUi (⭐27,698 · ↑91.4/day · TypeScript) is a local desktop client for OpenClaw, Claude Code, Codex and 20+ CLIs; PDFMathTranslate/PDFMathTranslate (⭐34,565 · ↑54.2/day · Python) is a layout-preserving PDF translator that can call Ollama. All local-first, none of them a serving engine — flagged so the ranking stays straight.
> The honest read: after removing the engines already covered in edition #1 and the off-theme tooling, the genuine open-weights serving layer is thin this round. open-webui dominates because there isn't yet a crowded field of credible self-hosted serving front-ends — a gap worth watching.
---
How this was made
Live GitHub pull, bucketed by inference/local-runtime keywords, each repo verified not-archived and pushed recently, ranked by stars/day, then curated for substance — and de-duplicated against the prior local-inference special edition so nothing repeats. Star counts pulled at publish — they move daily; re-verify before reposting.
Accelbrief · catch acceleration, not stars · all editions