repowise writes a documentation page for every module and notable file in your repo, then keeps each one current incrementally — on every commit it regenerates only the 3 to 10 pages a change affects, in under 30 seconds, and reuses every unchanged page from cache. Each page carries a confidence and freshness score with git-informed decay, so drift is visible instead of silent. The same index also generates CLAUDE.md and AGENTS.md, so one source of truth grounds your humans and your AI agents alike.
Auto-generated codebase documentation is reference material an engine writes directly from your source and git history — a page per module and file describing structure, intent, and architecture — rather than docs a person hand-writes and forgets. repowise generates it hierarchically, cites the source ranges it describes, scores each page for freshness, and regenerates pages as the code changes.

Why does living documentation matter?
Documentation is worthless the moment it falls behind the code. Most wiki tools hand you a snapshot and let it rot from the day it is built.
The cost is trust. A doc you cannot trust is a doc nobody reads, so the team re-derives the same context by hand, and the agent grepping your repo reconstructs it token by expensive token.
- Stale docs mislead: a page describing last quarter's architecture sends a new hire or an agent down the wrong path.
- Manual upkeep does not scale: nobody updates the wiki on a busy sprint, so drift is the default state.
- Snapshot wikis decay silently: there is no signal telling you which pages have gone out of date.
repowise reclaims the wedge competitors vacated — private, self-hostable docs that stay in sync, with freshness scored on every page rather than left to rot.
How does the auto-wiki work?
Index, then generate, then score, then regenerate. The wiki is built bottom-up from the same graph and git history that power every other repowise layer.
The pipeline is four stages.
| Stage | What happens |
|---|---|
| Index | repowise parses the repo into a tree-sitter dependency graph and reads its git history. Self-hosted, no code leaves your infrastructure. |
| Generate | An LLM writes pages hierarchically — symbols compose into file pages, files into module pages, modules into a repo overview — in the style you choose. |
| Score | Each page gets a confidence_score (how well source maps to claims) and a freshness_status (how recent it is vs the file's last meaningful change), decayed against git history. |
| Regenerate | On every commit, repowise diffs vs the last index, walks the graph to find affected pages, and regenerates only those — typically 3 to 10 pages in under 30 seconds. |
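The Regenerate stage can be pictured as a walk over reverse dependency edges. The sketch below is a minimal illustration, not repowise's actual code: `affected_pages` and the `importers` mapping are hypothetical names, and the real graph walk may weigh edges or stop earlier.

```python
from collections import deque


def affected_pages(changed_files, importers):
    """Breadth-first walk from the changed files to everything that depends on them.

    changed_files: file paths touched by the commit.
    importers: hypothetical mapping of a file to the files that import it.
    Returns paths in discovery order, each listed once.
    """
    order = []
    visited = set()
    queue = deque(changed_files)
    while queue:
        path = queue.popleft()
        if path in visited:
            continue
        visited.add(path)
        order.append(path)
        # Anything importing this file may describe behavior that just changed.
        queue.extend(importers.get(path, ()))
    return order


# Illustrative graph: two API modules import core/db.py, and app.py imports both.
importers = {
    "core/db.py": ["api/users.py", "api/orders.py"],
    "api/users.py": ["app.py"],
    "api/orders.py": ["app.py"],
}
pages = affected_pages(["core/db.py"], importers)
```

Here a change to `core/db.py` reaches four pages, while a change to `app.py` alone would reach only one, which is why most commits touch a handful of pages rather than the whole wiki.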
repowise writes three page types: file pages (one per source file, skipping fixtures, tiny files, and tests if configured), module pages (one per directory or logical cluster, with architecture and entry points), and symbol spotlights (for high-PageRank, many-caller symbols).
The prompt is why the output explains why, not just what. It carries the file's parsed structure and signatures, its imports and importers, top callers and callees with confidence, the 10 most significant commit messages (merges and dependency bumps filtered out), ownership and trend signals, and cross-references to existing pages.
Reproducible and idempotent. A page is keyed to its source content, so when the underlying code is unchanged the cached page is reused rather than re-billed to the LLM — re-running generation converges instead of churning. A cascade-budget knob caps how many pages one update may regenerate, so a giant refactor commit cannot blow up into hundreds of regenerations.
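A content-keyed cache with a regeneration cap might look like the sketch below. It assumes a SHA-256 hash of the source text as the key and a simple per-run counter as the budget; `page_key`, `regenerate`, and `fake_generate` are hypothetical names, and repowise's real key and budget semantics may differ.

```python
import hashlib


def page_key(source: str) -> str:
    """Derive the cache key from source content only (assumed scheme)."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def regenerate(sources, cache, generate, budget):
    """Reuse cached pages and generate at most `budget` new ones.

    sources: mapping of path to current source text.
    cache: mapping of content key to a previously generated page.
    generate: callable standing in for the LLM call.
    Returns (pages, deferred), where deferred lists paths skipped by the cap.
    """
    pages, deferred = {}, []
    spent = 0
    for path, source in sources.items():
        key = page_key(source)
        if key in cache:
            pages[path] = cache[key]  # unchanged content: no LLM call
        elif spent < budget:
            cache[key] = generate(path, source)
            pages[path] = cache[key]
            spent += 1
        else:
            deferred.append(path)  # over budget: left for a later run
    return pages, deferred


calls = []


def fake_generate(path, source):
    # Stand-in for the LLM; records which pages were actually generated.
    calls.append(path)
    return f"page for {path}"


cache = {}
sources = {"a.py": "x = 1", "b.py": "y = 2", "c.py": "z = 3"}
first_pages, first_deferred = regenerate(sources, cache, fake_generate, budget=2)
second_pages, second_deferred = regenerate(sources, cache, fake_generate, budget=2)
```

The first run generates two pages and defers the third; the second run reuses both cached pages and spends its budget only on the deferred one, so repeated runs converge.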
Resumable and crash-safe. The job system checkpoints each run to disk through a temp-file-plus-rename, so a crash mid-write cannot corrupt the checkpoint; on resume, completed pages are seeded from the vector store — the record of what was actually embedded and served — not from the checkpoint alone.
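The temp-file-plus-rename pattern is a standard way to make a write atomic. The sketch below shows the general technique with Python's standard library; the JSON format and the function names are assumptions for illustration, not repowise's checkpoint layout.

```python
import json
import os
import tempfile


def write_checkpoint(path, state):
    """Write `state` so readers see either the old file or the new one, never a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".checkpoint-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic swap on POSIX and Windows
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


def read_checkpoint(path):
    """Return the last fully written state, or None if no checkpoint exists yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return None
```

A crash before `os.replace` leaves the previous checkpoint untouched; a crash after it leaves the new one complete. Seeding completed pages from the vector store on resume is a separate step and is not shown here.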
How does the auto-wiki help you?
One index, several outcomes — fresh docs, grounded agents, and editor files generated for you.
Onboarding without a tribal-knowledge tax
A new engineer opens the module page, reads the architecture and entry points up front, and follows citations into the exact source ranges. The page reflects HEAD, not the repo at some forgotten indexing date.
- Architecture and entry points surface at the top of every module page.
- Citations link claims to the source ranges they describe.
- Freshness scoring tells you whether the page reflects what shipped this morning.
Grounding for your AI agents
The wiki is the retrieval layer behind repowise's MCP tools. get_answer and search_codebase return a cited answer, not a raw file dump, so your agent stops grepping and re-reading to reconstruct context it should have been handed.
Guardrail: this is documentation grounded in your source, not a chatbot guessing — every answer cites the ranges it draws from.
CLAUDE.md and AGENTS.md, generated from the same index
repowise generates CLAUDE.md for Claude and AGENTS.md for Codex from the indexed graph — an architecture summary, key modules, entry points, hotspots, and the MCP tool guide — so your coding agent gets a curated orientation file instead of an empty or hand-stale one.
- Both files are written inside managed markers, so your own content outside the markers is preserved.
- The CLAUDE.md in this very repository is generated output.
- update refreshes both as the code moves; --agents / --no-agents toggles the Codex file.
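Writing inside managed markers means replacing only the text between a begin and an end marker. The sketch below illustrates the idea; the marker strings and `update_managed_block` are hypothetical, since the source does not show the markers repowise actually writes.

```python
# Hypothetical marker strings, chosen for illustration only.
BEGIN = "<!-- BEGIN GENERATED -->"
END = "<!-- END GENERATED -->"


def update_managed_block(existing: str, generated: str) -> str:
    """Replace the managed block in `existing`, leaving everything outside it untouched."""
    block = f"{BEGIN}\n{generated.strip()}\n{END}"
    start = existing.find(BEGIN)
    end = existing.find(END, start + len(BEGIN)) if start != -1 else -1
    if start == -1 or end == -1:
        # No complete managed block yet: append one after the hand-written content.
        prefix = existing.rstrip("\n")
        return (prefix + "\n\n" if prefix else "") + block + "\n"
    return existing[:start] + block + existing[end + len(END):]


existing = (
    "# My notes\n\n"
    f"{BEGIN}\nold summary\n{END}\n\n"
    "Hand-written footer\n"
)
updated = update_managed_block(existing, "new summary")
```

Running the update again with the same generated text returns the same file, so repeated refreshes do not accumulate duplicate blocks.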
Freshness as a first-class signal
Every page tells you whether to trust it. Confidence and freshness ship by default through get_context(include=["freshness"]), and the staleness envelope (_meta.stale_warning) warns only when the index has actually diverged from HEAD, so silence means current. Four triggers keep the index in sync with the code:
| Trigger | What it does |
|---|---|
| Post-commit hook | repowise hook install runs update in the background after every local commit. |
| File watcher | repowise watch updates on save, between commits, without requiring a commit. |
| GitHub / GitLab webhook | repowise serve takes signed push events and re-syncs; on hosted, the GitHub App wires this automatically. |
| Polling fallback | A 15-minute background poll catches any missed webhook delivery. |
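For the webhook trigger, "signed" on the GitHub side means an HMAC-SHA256 of the raw request body, sent in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The sketch below shows how a receiver could check that header; it is an illustration of GitHub's documented scheme, not repowise's handler, and `is_valid_github_signature` is a hypothetical name. GitLab uses a different scheme that is not covered here.

```python
import hashlib
import hmac


def is_valid_github_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check a GitHub-style `sha256=<hex>` signature against the raw request body."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))


# Illustrative values: a shared secret and a minimal push payload.
secret = b"example-webhook-secret"
body = b'{"ref": "refs/heads/main"}'
header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
```

The check must run on the exact bytes received, before any JSON parsing, because re-serializing the payload can change the digest.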
Search that returns a cited answer
Search is hybrid RAG: SQLite FTS for exact keywords and a vector store for semantics, fused by reciprocal rank fusion, biased by PageRank, and expanded one hop along the dependency graph.
- Self-hosted vectors live in LanceDB and keywords in SQLite FTS — your code never leaves your machine.
- On the hosted plan the index is backed by Postgres with pgvector for embeddings.
- Voyage embeddings by default; OpenAI and Gemini are also supported.
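Reciprocal rank fusion merges ranked lists by summing 1 / (k + rank) for each document across the lists. The sketch below shows only that fusion step, assuming the conventional constant k = 60; the PageRank bias and one-hop graph expansion are left out, and the file names are made up for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list, best first.

    rankings: list of lists, each ordered from most to least relevant.
    k: damping constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Illustrative results from a keyword search and a vector search.
keyword_hits = ["auth/session.py", "auth/tokens.py", "api/login.py"]
vector_hits = ["auth/tokens.py", "docs/auth.md", "auth/session.py"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents that appear in both lists rise to the top even when neither search ranked them first, which is the point of fusing exact-keyword and semantic results.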
Walkthrough: generate the wiki, then keep it fresh
Step 1 — Index and generate the wiki. Run repowise init to parse the repo into a graph, read its git history, and generate a page for every module and notable file. Free-tier or air-gapped? repowise init --index-only skips LLM generation entirely and you can layer the wiki on later.
Step 2 — Browse the wiki. Open the wiki index, read the architecture and entry points per module, and follow citations into the source ranges each page describes.
Step 3 — Generate CLAUDE.md and AGENTS.md. The same index writes both editor files inside managed markers at the repo root — an architecture map, key modules, entry points, hotspots, and the MCP tool guide.
Step 4 — Keep it fresh on every commit. Run repowise update to diff since the last sync and regenerate only the affected pages — typically 3 to 10 in under 30 seconds. Wire a post-commit hook, a file watcher, or a webhook and the wiki tracks HEAD with no human in the loop.

Step 5 — Search the wiki. Query in the dashboard or over MCP. Hybrid RAG returns a cited answer instead of a file dump, in the dashboard and to your agent alike.
Proof: does the wiki stay true to the code?
Each figure stands alone and is reproducible on your own repo — the engine is open source under AGPL-3.0.
| Result | Value |
|---|---|
| Pages regenerated per commit | 3 to 10, the rest reused from cache |
| Time to re-sync after a typical commit | < 30s |
| Languages parsed into the graph | 15, full-tier depth for 9 including C#/.NET |
| Editor files from one index | 2 — CLAUDE.md (Claude) and AGENTS.md (Codex) |
| Selectable wiki styles | 4 (comprehensive, reference, tutorial, caveman); switching styles needs no re-index |
| Staleness envelope | every MCP response carries index_age_days + indexed_commit; stale_warning fires only on real divergence |
| Self-hosted storage | LanceDB (vectors) + SQLite FTS (keywords); hosted uses Postgres + pgvector |
| Resumable generation | checkpointed via temp-file-plus-rename; resume seeds from the vector store |
Try it on your repo
Generate a wiki that stays true to your code — and a CLAUDE.md your agent can trust. The engine is open source and runs locally.
pip install repowise
repowise init # parse, build the graph + git history, generate the wiki
repowise init --index-only # skip LLM generation (free-tier / air-gapped)
repowise update # diff since last sync, regenerate only affected pages
repowise hook install # auto-sync on every local commit
How each role uses this feature
Read fresh module pages and real why-context instead of grepping blind, and let the generated CLAUDE.md ground your coding agent on day one. Browse the wiki in the dashboard or pull cited answers over MCP with get_answer.
Self-host the whole wiki inside your firewall — vectors and keyword index live on your infrastructure and your code never leaves it. One generated index keeps docs provably tracking HEAD across every team and repo, no waitlist.