The dead-code detector we shipped with PageRank and git history
The first time we ran code hotspot detection on a real repository, the answer was not “delete this file.” It was “this file is quiet, this one is suspiciously quiet, and this one is probably carrying more than its line count suggests.” That distinction mattered immediately, because dead code is not a boolean property in a living codebase. It is a ranking problem under uncertainty.
We wanted a detector that could help humans and agents spend attention well. Not a linter that shouts “dead” or “alive” with false confidence. Not a cleanup script that turns one bad assumption into a deleted module. A ranking system: probably safe to ignore, quiet but active, and suspicious / needs review.
We stopped asking whether code was dead and started asking how likely it was to matter
The mistake in most dead-code detector conversations is treating “dead” like a crisp label. In practice, code sits on a spectrum:
- Probably safe to ignore: low structural importance, no recent activity, no meaningful ownership signal, no co-change history.
- Quiet but active: not central in the import graph, but still touched, still owned, still part of a living workflow.
- Suspiciously quiet: looks unused from the graph, but history says otherwise.
- Needs review: high blast radius, ambiguous usage, or signals that disagree.
That last category is the one that saves you from deleting the wrong file.
This is why we framed the problem as code hotspot detection instead of dead/alive classification. A boolean label creates bad cleanup decisions. It encourages false certainty, and false certainty is expensive when the output can influence refactors, deletions, or agent search paths.
The detector’s job is not to pronounce judgment. It is to sort attention.
[Figure: dead code is a ranking problem]
If you want the intuition from first principles, PageRank gives you a decent starting point. It was designed to surface important pages from link structure, and the same idea maps cleanly to code: files that are pointed to by many important files tend to matter more. But code is not the web. Imports are not citations. And quiet code is not necessarily dead code.
That is where history comes in.
PageRank over import edges gave us a first pass, and it failed in exactly the ways you would expect
We started with the import graph because it is the cleanest structural signal available. Nodes are files or symbols. Import edges connect dependents to dependencies. Run PageRank over that graph, and you get a rough estimate of structural importance: files that sit near the center of many dependency paths rise to the top.
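Here is roughly what that first pass looks like in code: a minimal sketch that assumes the import edges have already been extracted, with networkx doing the ranking. The file paths are illustrative, not from a real repo.

```python
# A minimal first pass: PageRank over import edges.
# Assumes (importer, imported) file pairs were already extracted;
# the edges below are illustrative only.
import networkx as nx

import_edges = [
    ("app/checkout.py", "payments/feature_flags.py"),
    ("app/checkout.py", "pricing/discount_gate.py"),
    ("app/api.py", "pricing/discount_gate.py"),
    ("app/api.py", "clients/generated/api_v2.py"),
]

# Edge direction is importer -> imported, so widely relied-on files
# accumulate rank from everything that depends on them.
graph = nx.DiGraph(import_edges)
structural_importance = nx.pagerank(graph, alpha=0.85)

for path, score in sorted(structural_importance.items(), key=lambda kv: -kv[1]):
    print(f"{score:.3f}  {path}")
```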
This is a useful first pass for code hotspot detection because it separates “widely relied on” from “tucked away.” It also helps with one of the oldest maintenance problems: the file that is not large, not noisy, and still somehow responsible for half the system.
The failure modes were predictable.
A low-traffic config adapter had low PageRank because almost nothing imported it directly. But it was the bridge between a feature flag system and a payment flow. It looked like dead code right up until you removed it and the feature gate stopped working.
A generated client had almost no structural importance in the import graph because the code that consumed it was thin. But it was generated from an API contract, and the contract changed often. Treating it as dead would have been nonsense.
A leaf module also fooled us. It had low PageRank, few importers, and a tiny surface area. But it was on the critical path for startup initialization, so “leaf” was a graph property, not a business property.
That is the central flaw of PageRank alone: it measures structural importance, not actual lifecycle. A file can be quiet because it is dead. It can also be quiet because it is stable, generated, abstracted behind a boundary, or simply not in the part of the system that churns.
[Figure: dependency graph]
Git history rescued the detector from false positives
The ranking got materially better when we added git history.
Not as a bolt-on score. As the signal that distinguishes quiet from dead.
We pulled in history depth, churn, recency, ownership concentration, and co-change patterns. The shape matters more than any single number. A file touched often over the last few months is not dead, even if its PageRank is low. A file with concentrated ownership and frequent co-changes is probably part of an active subsystem. A file with no recent churn, no co-change partners, and no clear owner starts to look like cleanup debt rather than quiet infrastructure.
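Extracting those signals does not require anything exotic. Below is a rough sketch built on a single `git log` pass; the six-month window, the header format, and the return shape are assumptions, not the production pipeline.

```python
# Sketch: per-file churn, recency, ownership concentration, and co-change
# from one git log pass. All thresholds and field names are illustrative.
import subprocess
from collections import Counter, defaultdict

def history_signals(repo_path: str, since: str = "6 months ago"):
    out = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}",
         "--name-only", "--pretty=format:%H|%an|%ad", "--date=short"],
        capture_output=True, text=True, check=True,
    ).stdout

    churn = Counter()                 # commits touching each file
    authors = defaultdict(Counter)    # file -> per-author commit counts
    last_touched = {}                 # file -> most recent commit date
    cochange = defaultdict(Counter)   # file -> files it changes with

    commit_files, author = [], None
    for line in out.splitlines():
        if line.count("|") == 2:                  # commit header line
            _, author, date = line.split("|")
            commit_files = []
        elif line.strip():                        # a file path in that commit
            path = line.strip()
            churn[path] += 1
            authors[path][author] += 1
            last_touched.setdefault(path, date)   # log is newest-first
            for other in commit_files:
                cochange[path][other] += 1
                cochange[other][path] += 1
            commit_files.append(path)

    ownership = {                                 # share held by the top author
        path: max(counts.values()) / sum(counts.values())
        for path, counts in authors.items()
    }
    return churn, last_touched, ownership, cochange
```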
The classic hotspot signal is churn × complexity. We used that intuition because it is hard to argue with: files that change often and are hard to understand deserve attention. But for dead-code detection, we inverted the question. Low churn is not proof of death. It is only evidence, and weak evidence at that.
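For reference, the classic score is cheap to compute once you have per-file churn. A small sketch follows, using radon's cyclomatic complexity as the proxy; the proxy choice and the function name are ours, not a standard.

```python
# Sketch: classic churn x complexity hotspot score. Uses radon's cyclomatic
# complexity as the proxy; churn is the per-file commit count from the git pass.
from pathlib import Path
from radon.complexity import cc_visit

def hotspot_score(path: str, churn: int) -> int:
    blocks = cc_visit(Path(path).read_text())            # functions/classes in the file
    complexity = sum(block.complexity for block in blocks) or 1
    return churn * complexity
```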
Ownership and co-change were the corrective signals. A module that almost never changes but always changes with the same adjacent files is probably not dead. A module with one dominant owner and a trail of related commits is not dead either. A module with no recent churn, no stable owner pattern, and no co-change partners is where suspicion starts to harden.
This is also where we got something wrong initially. We over-weighted recency. That made the detector too eager to call quiet code dead, especially in parts of the codebase that only change during releases or incident response. The fix was embarrassing in retrospect: if a file is quiet but historically central, treat silence as a clue, not a verdict.
Here is the pattern that worked better:
| file | PageRank | recent git activity | ownership concentration | final label | why it landed there |
|---|---|---|---|---|---|
| payments/feature_flags.py | low | medium | high | quiet but active | low graph centrality, but frequent recent edits and one clear owner |
| clients/generated/api_v2.py | low | high | high | quiet but active | generated client, strong churn from contract updates, not dead |
| legacy/email_helpers.py | low | none | none | probably safe to ignore | no recent churn, no owners, no co-change partners, no import pressure |
| startup/bootstrap_cache.py | low | low | medium | suspiciously quiet | looked isolated in graph, but historically co-changed with startup path files |
| pricing/discount_gate.py | medium | medium | high | needs review | structurally important and still actively changed |
The file that looked dead but was actually important was startup/bootstrap_cache.py. It had almost no PageRank and almost no recent edits, but the historical co-change pattern tied it to startup behavior. The file that was truly safe to ignore was legacy/email_helpers.py. It was quiet in every sense that mattered: no recent churn, no ownership signal, no downstream pressure.
That is the difference between a ranking system and a lint rule. One asks where attention is likely to pay off. The other asks whether a line of code matches a heuristic.
The files that looked dead but were actually load-bearing
The detector’s value became obvious when we started reviewing the top-ranked “dead” candidates by hand. The model was not trying to delete anything. It was trying to tell us where the uncertainty lived.
Here is a worked example from a small cleanup pass:
| file | PageRank | recent git activity | ownership concentration | final label | why it landed there |
|---|---|---|---|---|---|
| config/checkout_adapter.py | 0.07 | high | high | quiet but active | low graph centrality, but frequent edits around feature rollout and one owner |
| generated/openapi_client.py | 0.02 | high | medium | quiet but active | generated code, many recent contract-driven changes |
| core/old_metrics.py | 0.01 | none | none | probably safe to ignore | no churn, no owners, no co-change, no import pressure |
| auth/session_bridge.py | 0.05 | low | high | suspiciously quiet | low edits now, but historically tied to auth boundary changes |
| reports/unused_formatter.py | 0.00 | none | none | probably safe to ignore | isolated leaf, no history, no dependents |
The false-positive example mattered most: config/checkout_adapter.py. A naive detector would have called it dead because it was low in the import graph. But git history said otherwise. It was touched during rollout work, owned by a small group, and repeatedly co-changed with feature-flag code. It was quiet in the graph and active in the repo.
That is the category we care about most when the goal is reducing agent search space without deleting the wrong module. False positives hurt more than false negatives here. If we miss some dead code, the worst outcome is wasted attention later. If we mark live code as dead, we train humans and agents to ignore the wrong thing.
We also found that the label should carry a confidence tier and a cleanup-impact estimate. A file that is probably safe to ignore with low cleanup impact is a different operational decision from a file that is suspiciously quiet with high blast radius. The detector should say that plainly.
A representative output looked like this:
```
dead_code_ranked
  1. legacy/email_helpers.py
     label: probably_safe_to_ignore
     confidence: 0.91
     cleanup_impact: low
  2. startup/bootstrap_cache.py
     label: suspiciously_quiet
     confidence: 0.63
     cleanup_impact: high
  3. config/checkout_adapter.py
     label: quiet_but_active
     confidence: 0.88
     cleanup_impact: medium
  4. pricing/discount_gate.py
     label: needs_review
     confidence: 0.79
     cleanup_impact: high
```
That output is useful even if nobody deletes a line.
How we turned the score into cleanup guidance instead of a deletion list
This was a product decision as much as an algorithmic one.
We refused to present the detector as an automatic delete list. That would have been irresponsible, and it would have encouraged the wrong behavior. The right output is a prioritized review queue with confidence tiers and cleanup-impact estimates.
The practical categories are:
- Probably safe to ignore: low priority for cleanup, useful for keeping agents away from irrelevant files.
- Quiet but active: do not touch based on silence alone.
- Suspiciously quiet: inspect next, because the graph and history disagree.
- Needs review: high structural importance or ambiguous signals.
The point is not to tell someone “delete foo.py.” The point is to tell them “this file is a good candidate for human review, and this other file can stay out of the way.”
That matters because cleanup work is rarely isolated. A dead file can be harmless, but a file that looks dead can still be the only place a boundary is enforced, a contract is adapted, or a legacy integration is kept alive.
So the detector gives you a ranking, a confidence tier, and an estimated cleanup impact. It does not pretend to know the future.
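To make that concrete, here is a minimal sketch of how the signals could fold into a label, a confidence tier, and a cleanup-impact estimate. The thresholds and the confidence heuristic are illustrative placeholders, not the weights we actually shipped, and the inputs are assumed to be normalized to 0..1.

```python
# Sketch: combine normalized signals into a review-queue verdict.
# Thresholds, weights, and the confidence heuristic are illustrative.
from dataclasses import dataclass

@dataclass
class Verdict:
    label: str            # one of the four review-queue categories
    confidence: float
    cleanup_impact: str   # low | medium | high

def classify(pagerank: float, recent_churn: float,
             ownership: float, cochange: float) -> Verdict:
    structurally_quiet = pagerank < 0.2
    historically_active = max(recent_churn, ownership) > 0.4

    if not structurally_quiet:
        label = "needs_review"            # high blast radius, route to a human
    elif historically_active:
        label = "quiet_but_active"        # graph says quiet, history disagrees
    elif cochange > 0.2:
        label = "suspiciously_quiet"      # silent now, but co-change residue
    else:
        label = "probably_safe_to_ignore"

    # Placeholder confidence: the more history signals agree with "quiet",
    # the firmer the safe-to-ignore call, and vice versa.
    quiet_votes = sum(s < 0.2 for s in (recent_churn, ownership, cochange))
    if label == "probably_safe_to_ignore":
        confidence = 0.6 + 0.1 * quiet_votes
    elif label == "quiet_but_active":
        confidence = 0.6 + 0.1 * (3 - quiet_votes)
    else:
        confidence = 0.6

    if pagerank >= 0.2 or cochange > 0.5:
        impact = "high"
    elif ownership > 0.3:
        impact = "medium"
    else:
        impact = "low"
    return Verdict(label, round(confidence, 2), impact)
```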
How detector output becomes better agent context
This is where the dead-code detector stops being a cleanup toy and starts being agent infrastructure.
If an AI agent is trying to answer a question about a repo, the worst thing you can do is let it reread irrelevant files. The detector helps shrink the search space before the agent starts wandering. That means fewer files read, fewer tool calls, and less time spent rediscovering what the repo already knows.
In our workflow, the ranking feeds agent context selection. A get_dead_code-style query surfaces the ranked list. A get_context-style query can then pull ownership, freshness, symbols, and surrounding community for the files that deserve attention. The point is to keep the agent from treating every file as equally plausible.
The hooks matter here too. A PreToolUse hook can intercept Grep or Glob and inject the top related files from the local graph before the agent expands its search. That means the agent starts with context instead of rediscovering it by brute force. A PostToolUse hook can notice when a commit makes the wiki stale and tell the agent to refresh its assumptions.
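As a sketch of the PreToolUse side, assume the hook runner hands the pending tool call to a script as JSON on stdin and surfaces whatever the script prints; the ranking-file path and the JSON field names below are hypothetical.

```python
#!/usr/bin/env python3
# Sketch of a PreToolUse-style hook script. Assumes the runner passes the
# pending tool call as JSON on stdin and shows whatever we print back.
# The ranking file path and JSON fields are hypothetical.
import json
import sys
from pathlib import Path

RANKING = Path(".cache/dead_code_ranking.json")   # hypothetical precomputed output

def main() -> None:
    call = json.load(sys.stdin)
    if call.get("tool_name") not in {"Grep", "Glob"}:
        return                                    # only steer search-style tools

    ranking = json.loads(RANKING.read_text()) if RANKING.exists() else []
    # Point the agent at files the detector ranks as worth attention,
    # and away from the probably_safe_to_ignore tier.
    worth_reading = [r["file"] for r in ranking
                     if r["label"] in {"needs_review", "suspiciously_quiet"}][:10]
    ignorable = [r["file"] for r in ranking
                 if r["label"] == "probably_safe_to_ignore"][:10]

    print("Detector hint, start with:", ", ".join(worth_reading) or "(none)")
    print("Low-value for this search:", ", ".join(ignorable) or "(none)")

if __name__ == "__main__":
    main()
```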
That is a better use of code hotspot detection than a delete button. It helps agents avoid rereading dead ends, and it helps humans review the right files first.
[Figure: dead code ranking in the agent loop]
FAQ
How do you detect dead code without deleting the wrong file?
You do not try to prove deadness as a binary fact. You rank files by structural importance and history signals, then separate probably safe to ignore from quiet but active and suspiciously quiet. That keeps humans in the loop and makes false positives visible before anyone deletes anything.
Why is dead-code detection a ranking problem instead of a boolean check?
Because code lives under uncertainty. A file can be quiet because it is dead, or because it is generated, stable, boundary-adjacent, or only touched during rare workflows. Ranking lets you express that ambiguity instead of collapsing it into a yes/no label that creates bad cleanup decisions.
How does git history improve code hotspot detection?
Git history adds the temporal signal that the import graph cannot provide. Churn, recency, ownership concentration, and co-change patterns tell you whether a quiet file is actually inactive or just low-traffic. That is what separates dead from merely quiet.
Can PageRank find dead code in a codebase?
Not by itself. PageRank over the import graph is good at finding structurally important code, which is useful, but it cannot tell you whether a quiet file is dead, generated, or load-bearing in a rare path. It is a first pass, not a final judgment.
What should an engineering team do with dead-code detector output?
Use it to prioritize review, guide cleanup, and shrink agent search space. Treat the output as confidence-ranked attention guidance, not an automatic delete list. The best outcome is fewer files read and better decisions, even when nothing is removed.


