Hidden Coupling: Finding the Files That Always Change Together
Hidden coupling is the dependency that never shows up in imports. Two files have no direct dependency on paper, yet they still move together every few weeks because they encode one rule, one data shape, or one workflow. That matters because import graphs only show explicit links. They miss the files that change together, which is where refactors, bugs, and review gaps tend to hide. Git history can surface that pattern, and the co-change signal is strong enough that research on evolutionary coupling treats it as a real relationship, not a curiosity. (researchgate.net)
What hidden coupling is and why imports miss it
Static analysis is good at explicit structure. It can tell you that auth/service.ts imports jwt.ts, or that one package calls another through a symbol edge. It cannot tell you that auth.ts and middleware/session.ts have been edited in the same commit 31 times because a token shape, a header contract, or an error path ties them together. repowise calls that hidden coupling and surfaces it as a co-change pair: files that change together without an import link. (github.com)
That distinction matters. An import edge is a design-time promise. A co-change edge is a maintenance-time fact. If the same pair keeps showing up in git log, you are probably looking at a boundary that is fake, incomplete, or too spread out. Research on evolutionary coupling has long used support and confidence over version history to find these relationships, and more recent work points out that co-change frequency can reveal hidden dependencies that static analysis misses. (researchgate.net)
A practical rule: if two files routinely ship together, reviewers should treat them as one unit until proven otherwise. That is the core of a good coupling PR review. You are not asking, “Do these imports exist?” You are asking, “Will this change make the other file stale?” (researchgate.net)
[Figure: Hidden Coupling Map]
Co-change as a coupling signal
Git co-change analysis is simple in principle. For each commit, collect the touched files. Count repeated pairs. Rank pairs by frequency, recency, and confidence. Then filter out noise such as formatting sweeps, dependency bumps, and vendored churn. repowise does this over the last 500 commits by default and uses the signal in its git intelligence layer. (github.com)
The literature uses related ideas under names like evolutionary coupling, change patterns, and logical coupling. The jargon varies, but the practical outcome is the same: the files that change together are often the files that should be reviewed together, tested together, and sometimes split apart. Studies also warn that frequency alone can produce false positives, especially when the pair changes for broad maintenance work or once-off migrations. (researchgate.net)
Window selection
Window size decides what “together” means.
A short window, say 30 commits, reacts fast but overweights recent churn. A long window, say 1,000 commits, stabilizes the ranking but can bury a newly formed coupling. repowise’s default of 500 commits is a pragmatic middle ground for many repos because it covers enough history to catch repeating pairs without dragging ancient architecture changes into every result. (github.com)
I use three windows in practice:
- 30 commits for recent drift.
- 180 commits for medium-term behavior.
- 500 commits for durable pairs.
If a pair appears in all three windows, it is usually real. If it only appears in the shortest window, treat it as a candidate, not a conclusion.
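The three-window comparison can be sketched as below. This is a minimal illustration, not how repowise does it. It assumes `commits` is a list of file lists ordered newest first, and it treats a pair as "appearing" in a window when it ranks in that window's top `top_n` pairs by count. The names `top_pairs`, `classify`, and `top_n` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def top_pairs(commits, window, top_n):
    """Return the top_n most frequent file pairs in the newest `window` commits."""
    counts = Counter()
    for files in commits[:window]:
        counts.update(combinations(sorted(set(files)), 2))
    return {pair for pair, _ in counts.most_common(top_n)}


def classify(commits, windows=(30, 180, 500), top_n=20):
    """Label each ranked pair by the set of windows it shows up in."""
    seen = {}
    for window in windows:
        for pair in top_pairs(commits, window, top_n):
            seen.setdefault(pair, set()).add(window)

    labels = {}
    for pair, hit_windows in seen.items():
        if hit_windows == set(windows):
            labels[pair] = "durable"      # present in every window
        elif hit_windows == {min(windows)}:
            labels[pair] = "candidate"    # only recent churn so far
        else:
            labels[pair] = "mixed"
    return labels
```

Ranking per window is a deliberate choice here. With nested windows and a fixed count threshold, any pair that clears the threshold in the shortest window also clears it in the longer ones, so "only in the shortest window" would never happen.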
Confidence threshold
A raw count is not enough for a pair to qualify. You also want confidence that the pair changes together more often than chance or housekeeping would explain. A common first pass is:
- minimum co-change count: 3
- minimum shared-commit ratio: 0.2
- exclude commits that touch more than N files
- exclude mechanical commits
That last item matters. A repo-wide rename can make every pair look coupled. So can a lockfile update, code formatting, or a vendor sync. The goal is to keep the signal tied to meaning, not housekeeping. This is also why repowise filters out merges, dependency bumps, and lint-only commits when it generates significant commit history for a file. (github.com)
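A first-pass filter along those lines might look like the sketch below. Two things in it are assumptions rather than anything the thresholds above pin down: the shared-commit ratio is read as shared commits divided by commits that touched either file, and `max_files=25` is an arbitrary stand-in for N. Mechanical commits are assumed to be dropped before `commits` is built.

```python
from collections import Counter
from itertools import combinations


def strong_pairs(commits, min_count=3, min_ratio=0.2, max_files=25):
    """Return {pair: (shared, ratio)} for pairs that clear both thresholds.

    `commits` is an iterable of file lists, one list per commit.
    """
    file_counts = Counter()
    pair_counts = Counter()
    for files in commits:
        files = sorted(set(files))
        if len(files) > max_files:
            continue  # likely a sweep: rename, formatter run, vendor sync
        file_counts.update(files)
        pair_counts.update(combinations(files, 2))

    result = {}
    for (a, b), shared in pair_counts.items():
        # Commits touching either file = touched a + touched b - touched both.
        either = file_counts[a] + file_counts[b] - shared
        ratio = shared / either
        if shared >= min_count and ratio >= min_ratio:
            result[(a, b)] = (shared, ratio)
    return result
```

The ratio keeps a busy file from looking coupled to everything: a file touched in 200 commits that shares 3 of them with a partner scores far lower than a pair that shares 3 of 4.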
A sane confidence ladder looks like this:
| Signal | What it means | Action |
|---|---|---|
| 2 co-changes, 2 different authors | Weak but interesting | Watch list |
| 4–6 co-changes, same subsystem | Likely real | Add to review checklist |
| 10+ co-changes, no import edge | Hidden coupling | Consider refactor or explicit abstraction |
Running it yourself with git log and a script
You do not need a platform to find hidden coupling. Start with git log --name-only, then build file pairs per commit. The script can be plain Python or shell plus awk. Keep it boring.
Step-by-step
- Export commit file lists.
- Drop merge commits.
- Remove non-code churn.
- Count unordered file pairs.
- Rank by count and recency.
- Compare against your import graph.
- Review the top pairs manually.
A tiny Python sketch:
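from collections import Counter
from itertools import combinations
import subprocess

# Each commit prints one "COMMIT" marker line, then the files it touched.
log = subprocess.check_output(
    ["git", "log", "--no-merges", "--name-only", "--format=COMMIT"],
    text=True,
)

pairs = Counter()
files = []

def flush(files):
    # Count every unordered pair of files touched by one commit.
    for a, b in combinations(sorted(set(files)), 2):
        pairs[(a, b)] += 1

for line in log.splitlines():
    name = line.strip()
    if name == "COMMIT":
        flush(files)  # close out the previous commit
        files = []
    elif name and not name.endswith((".md", ".lock")):
        # Skip docs and lockfiles, which are mostly non-code churn.
        files.append(name)
flush(files)  # the last commit in the log has no trailing marker

for (a, b), count in pairs.most_common(50):
    print(count, a, b)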
That is usually enough to surface the first suspicious pairs. The top 10 of the 50 it prints are a reasonable place to start.
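The "compare against your import graph" step can be sketched as a set difference. This assumes you already have pair counts like the ones the script produces, plus a set of `(importer, imported)` edges from whatever static analysis tool you use. The function name `hidden_pairs` and the `min_count` default are hypothetical.

```python
def hidden_pairs(pair_counts, import_edges, min_count=3):
    """Return co-change pairs that have no import edge in either direction.

    `pair_counts` maps (file_a, file_b) to a co-change count.
    `import_edges` is an iterable of (importer, imported) tuples.
    """
    # frozenset ignores direction, so a -> b and b -> a both count as linked.
    linked = {frozenset(edge) for edge in import_edges}
    hidden = [
        (pair, count)
        for pair, count in pair_counts.items()
        if count >= min_count and frozenset(pair) not in linked
    ]
    # Most frequent pairs first, so the review queue starts at the top.
    return sorted(hidden, key=lambda item: -item[1])
```

Pairs that survive this filter are the ones worth a manual look. Pairs with an import edge may be ordinary dependency flow.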
What to look for
- A service file and a test helper that always move together.
- A route handler and a schema file with no import edge.
- A frontend client and a backend DTO that keep changing in lockstep.
- A config file and a parser that should have one source of truth.
If a pair keeps showing up, ask why. Often the answer is one of these:
- One file encodes a contract the other mirrors.
- One file contains behavior and the other contains tests that are too specific.
- One boundary was split by directory, not by responsibility.
- One abstraction is missing.
For a real example of how this shows up in a repo graph, the FastAPI dependency graph demo makes the static side obvious, while the hotspot analysis demo shows where churn piles up. Those are the places where hidden coupling tends to live. (github.com)
[Figure: Co-change Script Output]
Doing it automatically
Manual scripts are fine until the repo gets bigger or the team stops remembering to run them. That is where automatic hidden coupling detection helps.
repowise exposes co-change pairs through its git intelligence and surfaces them in generated docs, the knowledge map, and MCP tools. Its repository README describes co-change pairs as files that change together in the same commit without an import link, and it includes them in the generated CLAUDE.md so agents see the pattern before they edit code. (github.com)
The interesting part is not the ranking alone. It is the pairing of history with context. repowise can show the co-change partner next to direct dependencies, ownership, hotspots, and the file’s recent significant commits. That turns a vague smell into an actionable review queue. (github.com)
If you want the background on how the system is wired, the architecture page explains the layers behind the output, and the live examples show what the generated intelligence looks like on real repos. If you want to run it yourself, try repowise on your own repo — the MCP server is configured automatically. (github.com)
repowise co-change view
The co-change view is useful because it answers three questions fast:
- Which files move together?
- How often has the pair changed?
- Is there an import edge, or is this hidden coupling?
That last check matters. A pair with an import edge may be normal dependency flow. A pair with no import edge is where you start looking for split contracts, repeated fixes, or a missing abstraction. repowise explicitly labels this as hidden coupling in its git intelligence layer. (github.com)
The platform also supports a multi-repo workspace mode, which matters for systems split across backend, frontend, and shared libraries. Cross-repo co-changes can reveal API drift long before a release breaks. (github.com)
Bot-time PR check: “missing co-change partner”
A good PR bot should not ask for more noise. It should ask for the one missing file that history says belongs in the diff.
A coupling PR review can check this rule:
- If the PR touches a file with strong co-change partners, list those partners.
- If a partner is not in the diff and the change affects the shared contract, flag it.
- If the change is deliberate, require a note in the PR description.
repowise’s PR bot is designed to post one deterministic comment per PR and stay silent on green PRs. Its public repo describes comments that summarize health deltas, hotspot touches, hidden coupling, and dead-code changes. That is the right shape for this problem: deterministic, cheap, and narrow. (github.com)
That style also avoids the worst PR bot failure mode: vague prose. The bot should say, “You changed auth.ts; middleware/session.ts is a strong co-change partner and was not included.” Then the reviewer can decide whether the omission is fine.
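The mechanical half of that check is small. The sketch below is not repowise's implementation. It assumes a hypothetical `partners` map from each file to its co-change counts and a hypothetical `min_count` cutoff for "strong". It cannot tell whether a change affects the shared contract, so that judgment stays with the reviewer.

```python
def missing_partners(changed_files, partners, min_count=10):
    """List strong co-change partners that are absent from a diff.

    `partners` maps a file path to {partner_path: co_change_count}.
    """
    changed = set(changed_files)
    messages = []
    # Sort both levels so the same diff always yields the same comment.
    for path in sorted(changed):
        for partner, count in sorted(partners.get(path, {}).items()):
            if count >= min_count and partner not in changed:
                messages.append(
                    f"You changed {path}; {partner} is a strong co-change "
                    f"partner ({count} shared commits) and was not included."
                )
    return messages
```

An empty list means the bot stays silent, which matches the "quiet on green PRs" behavior described above.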
What to do with the findings
Hidden coupling detection is not a score to admire. It is a work queue.
1. Tighten the boundary
If two files always change together, they may deserve one abstraction, one module, or one owner. A stable boundary should reduce co-change over time.
2. Add explicit tests
If a pair exists because one file mirrors another contract, add tests that pin the contract. That lowers the chance that the pair drifts silently.
3. Split orchestration from logic
A common cause of co-change is mixed responsibilities. Move shared rules into a library, keep adapters thin, and stop duplicating validation in two places.
4. Assign review ownership
If the pair crosses team lines, add both owners to the review path. repowise’s ownership and bus-factor signals help here because hidden coupling often lines up with knowledge silos. (github.com)
5. Watch the trend
A single co-change pair can be harmless. A rising cluster is not. If a file starts accumulating new partners over the last 30 to 90 commits, that is usually a design smell or a change hot spot. repowise’s hotspot layer combines churn and complexity, which makes that trend easier to spot. (github.com)
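One rough way to watch that trend is to ask which partners are new. The sketch below assumes `commits` is a list of file lists ordered newest first. The function name `new_partners` and the `recent=90` default are hypothetical, chosen to match the 30 to 90 commit range mentioned above.

```python
def new_partners(commits, path, recent=90):
    """Files that co-changed with `path` in the newest `recent` commits
    but never in the older history."""

    def partners(chunk):
        found = set()
        for files in chunk:
            if path in files:
                found.update(f for f in files if f != path)
        return found

    return partners(commits[:recent]) - partners(commits[recent:])
```

A growing result across successive runs suggests the file is picking up responsibilities. A stable or empty result suggests the existing pairs are the whole story.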
A simple decision table works well:
| Finding | Likely cause | Next move |
|---|---|---|
| Pair changes with tests only | Narrow regression coverage | Add contract tests |
| Pair changes with config files | Hidden schema or feature flag link | Centralize config |
| Pair changes across services | API contract drift | Add shared schema or adapter |
| Pair changes in every refactor | Too much logic in one boundary | Split the module |
For teams already using agents, MCP makes this easier to wire into tooling. The MCP spec defines versioned protocol negotiation and a standard transport model, which is why repowise can surface these signals to Claude Code, Cursor, and other MCP clients in a structured way. At the time of writing, the current protocol version is 2025-11-25. (modelcontextprotocol.io)
FAQ
What is hidden coupling in code?
Hidden coupling is a relationship between files that change together even though they do not import each other. It usually shows up in git history as repeated co-changes. (github.com)
How do I find files that change together?
Start with git log --name-only, group files by commit, count pairs, and filter out mechanical changes. Tools like repowise automate that and show co-change pairs in the UI and MCP layer. (github.com)
What is git co-change analysis?
Git co-change analysis mines repository history for files that are edited in the same commit often enough to suggest an evolutionary dependency. Research commonly uses support and confidence, and later work warns that frequency alone can miss rare but real coupling. (researchgate.net)
How accurate is co-change detection?
It is useful, but not perfect. It can surface real hidden dependencies, and it can also flag broad maintenance work or migration commits. That is why confidence thresholds, commit filters, and manual review still matter. (researchgate.net)
Should a coupling PR review block the merge?
Not by default. It should raise a question, not force a verdict. If the missing partner is a real contract dependency, the PR needs a fix or a clear rationale. If the change is intentional and isolated, the note is enough. (github.com)
Can hidden coupling exist across repositories?
Yes. Multi-repo systems often have backend and frontend files that drift together through shared APIs or release timing. Workspace-style co-change analysis can catch that pattern. (github.com)
The useful habit is simple: every time a file changes, ask which other file has been paying that cost all along. Git history will usually answer before your imports do.


