EDITION 13 · VOICE & SPEECH AI · 2026·06·06 · 5 min read · links verified live

Voice & speech AI — what's accelerating

This is a smaller, more honest edition. The voice-and-speech bucket is thin, and keyword matching swept in several repos where voice is a side feature, not the point. Below are the ones genuinely built for synthesizing, cloning, and generating speech — the rest are named and set aside.

↑223/day · fastest climber in the edition

3 · picks that earned a slot

live · counts pulled at publish

5 min · to read the whole edition

Top mover

jamiepine/voicebox · USE · TypeScript · ▲ 222.9/day · ★ 29,426

"The open-source AI voice studio. Clone, dictate, create." A full local voice workspace built on Qwen3-TTS with CUDA and MLX backends — cloning, dictation, and generation in one app rather than a bare model checkpoint. The studio framing is why it's the fastest mover here: it's usable end-to-end, not just weights.

Who needs it: creators and developers who want voice cloning and TTS on their own hardware without a cloud API.

---

The speech stack

microsoft/VibeVoice · USE · Python · ▲ 170.1/day · ★ 48,315

Microsoft's "open-source frontier voice AI." Backing from a major lab is the signal — it pulls open speech synthesis toward the quality bar previously held by closed APIs, and the star base reflects that trust.

Who needs it: anyone wanting high-fidelity open TTS with a credible maintainer behind it.

OpenBMB/VoxCPM · USE · Python · ▲ 102.2/day · ★ 26,869

A tokenizer-free TTS model for multilingual speech generation, creative voice design, and true-to-life cloning. It surfaced in the image/video bucket by accident, but it's squarely a speech model — a strong one — so it earns a place here instead.

Who needs it: developers needing multilingual TTS and voice cloning from a research-grade model.

---

Voice as a feature, not the point

These climbed fast but aren't voice/speech tools — voice is a bolt-on, so they're named and set aside rather than ranked:

- Alishahryar1/free-claude-code: ⭐32,687 · ↑255.4/day. A free Claude-Code access wrapper that happens to support voice; it's a coding-agent client, not speech AI.
- hugohe3/ppt-master: ⭐24,748 · ↑139.0/day. Generates editable PowerPoint with voiced speaker notes: a slide tool with TTS attached, not a voice engine.
- slopus/happy: ⭐21,644 · ↑67.0/day. A mobile/web client for Codex and Claude Code with realtime voice as one feature among many.
- mudler/LocalAI: ⭐46,705 · ↑39.7/day. A general local-inference engine that runs voice among LLMs, vision, and image; capable, but voice isn't its focus.
- blakeblackshear/frigate: ⭐33,560 · ↑12.5/day. An NVR for camera object detection. No audio, no speech: a pure keyword false positive, dropped.

---

How this was made

Live GitHub pull, bucketed by voice/speech keywords, each repo verified not-archived and pushed recently, ranked by stars/day, then curated hard for fit. The bucket was small and noisy, so rather than pad it, mismatches were named and set aside and one mis-filed speech model was pulled in from a neighboring bucket. Star counts pulled at publish; they move daily, so re-verify before reposting.

1 · pull the firehose, verify live
2 · bucket by keyword
3 · rank by stars/day
4 · separate signal from noise, by hand
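For readers who want to reproduce the first three steps, here is a minimal sketch, not the pipeline actually used for this edition. It assumes stars/day means the star gain between two snapshots divided by the days between them. The keyword list, the 30-day recency cutoff, the 7-day window, and the sample repos are all hypothetical. The `archived` and `pushed_at` names mirror fields the GitHub REST API returns for a repository; the data here is hard-coded so the sketch runs offline.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical keyword list; the edition does not publish its real one.
VOICE_KEYWORDS = ("voice", "speech", "tts", "text-to-speech")


@dataclass
class Repo:
    full_name: str
    description: str
    archived: bool
    pushed_at: datetime
    stars_then: int  # star count at an earlier snapshot (assumed input)
    stars_now: int   # star count at publish


def stars_per_day(repo: Repo, window_days: float) -> float:
    """Assumed definition: stars gained over the window, per day."""
    return (repo.stars_now - repo.stars_then) / window_days


def matches_bucket(repo: Repo, keywords=VOICE_KEYWORDS) -> bool:
    """Naive substring match on name and description."""
    text = f"{repo.full_name} {repo.description}".lower()
    return any(k in text for k in keywords)


def is_live(repo: Repo, now: datetime, max_idle_days: int = 30) -> bool:
    """Not archived, and pushed within the (assumed) recency cutoff."""
    return not repo.archived and (now - repo.pushed_at) <= timedelta(days=max_idle_days)


def rank_bucket(repos, now: datetime, window_days: float = 7.0):
    """Steps 1 to 3: keep live repos, bucket by keyword, sort by stars/day."""
    candidates = [r for r in repos if is_live(r, now) and matches_bucket(r)]
    return sorted(candidates, key=lambda r: stars_per_day(r, window_days), reverse=True)


# Made-up sample data, for illustration only.
now = datetime(2026, 6, 6, tzinfo=timezone.utc)
repos = [
    Repo("example/voice-studio", "Local voice cloning studio", False, now - timedelta(days=2), 1000, 2400),
    Repo("example/tts-model", "Multilingual TTS model", False, now - timedelta(days=1), 5000, 5700),
    Repo("example/slide-maker", "Slides with voiced speaker notes", False, now - timedelta(days=1), 3000, 4750),
    Repo("example/old-speech", "Speech toolkit", True, now - timedelta(days=3), 100, 900),
    Repo("example/stale-voice", "Voice assistant", False, now - timedelta(days=90), 100, 900),
    Repo("example/camera-nvr", "Object detection for cameras", False, now - timedelta(days=1), 2000, 2100),
]
ranked = rank_bucket(repos, now)
for r in ranked:
    print(f"{r.full_name}: {stars_per_day(r, 7.0):.1f}/day")
```

In this toy data the slide tool ranks first because "voiced" contains "voice". That is the kind of keyword false positive step 4 exists to catch, and the sketch deliberately leaves that step to a human.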

Accelbrief · catch acceleration, not stars · all editions

