RAG & retrieval — what's accelerating
Retrieval is splitting into two camps: classic chunk-and-embed pipelines, and a new wave of reasoning-based indexes that try to skip the vector store entirely. The repos below are the fastest-climbing tools doing the actual retrieval work — not the tutorials teaching it.
Top mover
An open-source context database built specifically for agents, unifying how an agent stores and retrieves the context it accumulates across a task. The bet here is that agent memory and retrieval are one problem, not two — a single store for documents, history, and working context rather than bolting a vector DB onto a chat loop.
---
The retrieval stack
A library for pulling structured information out of unstructured text with LLMs, with precise source grounding back to the original span. The grounding is the point: extractions you can audit and trace, instead of a JSON blob you have to trust blindly.
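To make "source grounding" concrete, here is a minimal sketch of what an auditable extraction could look like. It is not this library's API: the `Extraction` record, `is_grounded`, `audit`, and the sample sentence are all hypothetical names and data chosen for illustration. The only idea shown is that each extraction carries character offsets, so it can be checked mechanically against the original text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Extraction:
    """Hypothetical record: an extracted value plus the span it claims to come from."""
    label: str  # kind of fact, e.g. "dose"
    text: str   # the value as reported by the model
    start: int  # character offset in the source (inclusive)
    end: int    # character offset in the source (exclusive)


def is_grounded(source: str, ex: Extraction) -> bool:
    """True only when the claimed span exists and contains exactly the extracted text."""
    in_bounds = 0 <= ex.start <= ex.end <= len(source)
    return in_bounds and source[ex.start:ex.end] == ex.text


def audit(source: str, extractions: List[Extraction]) -> Tuple[List[Extraction], List[Extraction]]:
    """Split extractions into those that trace back to the source and those that do not."""
    grounded = [e for e in extractions if is_grounded(source, e)]
    ungrounded = [e for e in extractions if not is_grounded(source, e)]
    return grounded, ungrounded


# Illustrative input, not taken from any real dataset.
source = "Patient was given 250 mg of amoxicillin twice daily."
extractions = [
    Extraction("dose", "250 mg", 18, 24),
    Extraction("drug", "amoxicillin", 28, 39),
    Extraction("dose", "500 mg", 18, 24),  # value does not match the span it cites
]

grounded, ungrounded = audit(source, extractions)
```

In this sketch the third extraction ends up in `ungrounded`, because the span it cites contains different text. That is the kind of check a bare JSON blob does not allow.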
A mature open-source RAG engine that now fuses retrieval with agent capabilities. The problem it targets is document parsing quality: deep layout and table understanding, so retrieval isn't poisoned by garbage chunks. The most battle-tested option in this list.
A document index for "vectorless," reasoning-based RAG — instead of embedding chunks, it builds a navigable structure the model reasons over to find relevant pages. The trade-off: you give up approximate-nearest-neighbor speed to avoid embedding drift and chunk-boundary failures on long, structured documents.
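The navigation idea can be sketched in a few lines. This is a toy under stated assumptions, not the project's actual index format or API: the `Node` structure, the `navigate` loop, and the sample report are hypothetical, and `keyword_chooser` is a crude word-overlap stand-in for the step where a real system would ask a language model which section to open next.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical index node: a section with a summary and the pages it covers."""
    title: str
    summary: str
    pages: range
    children: List["Node"] = field(default_factory=list)


# Assumed interface: given the question and candidate sections, return one to descend into.
Chooser = Callable[[str, List[Node]], Node]


def navigate(root: Node, question: str, choose: Chooser) -> Node:
    """Walk from the root to a leaf, letting `choose` pick a child at every level."""
    node = root
    while node.children:
        node = choose(question, node.children)
    return node


def _words(text: str) -> set:
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_chooser(question: str, candidates: List[Node]) -> Node:
    """Stand-in for the reasoning step: pick the section sharing the most words with the question."""
    q = _words(question)
    return max(candidates, key=lambda n: len(q & _words(n.title + " " + n.summary)))


# Illustrative document outline, invented for this example.
root = Node("Annual report", "full document", range(1, 101), [
    Node("Financial results", "revenue, margins and cash flow by segment", range(1, 61), [
        Node("Cloud segment", "cloud revenue growth and customer counts", range(10, 25)),
        Node("Hardware segment", "device shipments and hardware margins", range(25, 61)),
    ]),
    Node("Risk factors", "legal, regulatory and supply chain risk", range(61, 101)),
])

hit = navigate(root, "how fast did cloud revenue grow", keyword_chooser)
```

Here `hit` is the "Cloud segment" node. The cost structure is visible even in the toy: one decision per tree level instead of one nearest-neighbor lookup, with no embeddings to go stale and no arbitrary chunk boundaries.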
Ready-to-run templates for RAG and enterprise search over live data, kept in sync with sources like SharePoint. Built on Pathway's streaming engine, so the index updates as the source changes rather than going stale between batch re-ingests.
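The difference between a live index and batch re-ingest can be illustrated with a small sketch. It does not use Pathway's API. `LiveIndex` and the file names below are hypothetical, and a plain keyword index stands in for whatever retrieval structure a real pipeline maintains. The point is only that each change event updates the affected document in place.

```python
from collections import defaultdict
from typing import Dict, List, Set


class LiveIndex:
    """Toy keyword index that is updated per change event instead of rebuilt in batches."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}                       # doc_id -> current text
        self._postings: Dict[str, Set[str]] = defaultdict(set)  # term -> doc_ids

    @staticmethod
    def _terms(text: str) -> Set[str]:
        return set(text.lower().split())

    def delete(self, doc_id: str) -> None:
        """Remove a document and every posting that pointed at its old content."""
        old = self._docs.pop(doc_id, None)
        if old is None:
            return
        for term in self._terms(old):
            self._postings[term].discard(doc_id)

    def upsert(self, doc_id: str, text: str) -> None:
        """Insert or replace a document; stale terms from the previous version are dropped first."""
        self.delete(doc_id)
        self._docs[doc_id] = text
        for term in self._terms(text):
            self._postings[term].add(doc_id)

    def search(self, term: str) -> List[str]:
        return sorted(self._postings.get(term.lower(), set()))


# Illustrative change events, invented for this example.
index = LiveIndex()
index.upsert("policy.docx", "Remote work allowed two days")
index.upsert("handbook.docx", "Office hours and remote access")
before = index.search("remote")

# The source document is edited; the index follows immediately.
index.upsert("policy.docx", "Office attendance required")
after = index.search("remote")
```

After the edit, `policy.docx` no longer matches "remote", with no need to wait for a scheduled rebuild.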
A knowledge platform that turns raw documents into a queryable RAG service, a reasoning agent, and a self-maintaining wiki. Written in Go, which makes it lighter to deploy than the Python-heavy stacks — a single binary path to a hosted knowledge base.
---
Context: what's climbing but isn't infrastructure
The very fastest-moving repos in this bucket are learning material, not retrieval tools. They are worth tracking as a demand signal, not as something to build on:
- datawhalechina/hello-agents (⭐57,000 · ↑209.6/day): a build-agents-from-scratch tutorial. Trend signal, not infrastructure.
- Shubhamsaboo/awesome-llm-apps (⭐113,466 · ↑147.7/day): a 100+ app example collection to clone, not a library.
- microsoft/ai-agents-for-beginners (⭐66,567 · ↑119.9/day): a 12-lesson course.
- ruvnet/ruflo (⭐58,132 · ↑158.0/day): tagged agentic-rag, but it's a general agent meta-harness, not a retrieval layer.
That three of the four repos above are tutorials or example collections, and the fourth is a general agent harness, suggests the audience is still learning RAG faster than it is standardizing on any one engine.
---
How this was made
Live GitHub pull, bucketed by theme, verified not-archived and pushed recently, ranked by stars/day, curated for substance. Counts pulled at publish — they move daily.
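The filter-then-rank step described above can be sketched as follows. This is an assumption-laden illustration, not the actual pipeline: the `Repo` record, the 30-day freshness window, the reference date, and the two `example/...` entries are hypothetical. Stars and stars/day for the four named repos are the figures quoted in this edition. Their `archived` and `last_push` values are placeholders.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass(frozen=True)
class Repo:
    name: str
    stars: int
    stars_per_day: float
    archived: bool
    last_push: date


def rank(repos: List[Repo], today: date, max_idle_days: int = 30) -> List[Repo]:
    """Drop archived or stale repos, then sort by velocity (stars/day), not total stars."""
    cutoff = today - timedelta(days=max_idle_days)  # the 30-day window is an assumption
    live = [r for r in repos if not r.archived and r.last_push >= cutoff]
    return sorted(live, key=lambda r: r.stars_per_day, reverse=True)


today = date(2025, 1, 31)  # placeholder reference date
repos = [
    Repo("Shubhamsaboo/awesome-llm-apps", 113_466, 147.7, False, today),
    Repo("microsoft/ai-agents-for-beginners", 66_567, 119.9, False, today),
    Repo("ruvnet/ruflo", 58_132, 158.0, False, today),
    Repo("datawhalechina/hello-agents", 57_000, 209.6, False, today),
    # Hypothetical entries that the filters should remove:
    Repo("example/archived-repo", 90_000, 300.0, True, today),
    Repo("example/stale-repo", 80_000, 250.0, False, today - timedelta(days=90)),
]

ranked = [r.name for r in rank(repos, today)]
```

Ranking by velocity puts the repo with the fewest total stars of the four first, which is the point of the tagline: acceleration, not stars.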
Accelbrief · catch acceleration, not stars