47 dependents, 23 hotspots: a React refactor without folklore

repowise team · 11 min read
large codebase refactor

Forty-seven dependents is where a React refactor stops being a taste argument and starts being a large codebase refactor with real blast radius. The team in this case knew the change set was risky before they wrote a line of JSX, which is why the question was not “what should the architecture look like?” but “what do we touch first without breaking half the app?”

The answer was not folklore. It was a ranked list, built from dependency graphs, git history, and ownership signals, and it produced 23 hotspots before the first edit landed. That sequencing mattered more than the eventual refactor shape, because with 47 dependents the wrong first move would have turned the whole thing into a review swamp.

47 dependents meant this React refactor was a coordination problem, not a code-style cleanup

The module at the center of the change was not glamorous. It was a shared React surface that sat under a surprising amount of product code, and the dependency graph said 47 downstream dependents reached it directly or indirectly. That number changed the conversation immediately.

Nobody on the team could responsibly start by arguing about the ideal end state. Not because architecture questions were irrelevant, but because the blast radius was already known. A broad rewrite would have forced every reviewer to reason about hidden coupling at once, which is how refactors die: not from one bad file, but from 40 people holding different mental models of the same code.

So they treated the work like risk management. First identify what was actually central. Then rank the places where a mistake would hurt the most. Only after that did they decide which React files deserved attention.

[Figure: blast radius map]

That is the point where a large codebase refactor becomes legible. You stop asking whether the code “should” be cleaner, and start asking which files control behavior, which ones create coordination cost, and which ones can wait.

A useful internal check here is the same one you would use to audit a codebase you've never seen before: find the entry points, map the fan-out, then inspect where ownership and churn overlap. The surprise is usually not that the code is messy. It is that the mess is uneven.

The first pass: rank 23 hotspots before opening a single JSX file

The team’s first pass used a dependency graph and git history to identify 23 hotspots. Not 23 files they happened to dislike. 23 files that scored high on a mix of fan-out, churn, ownership gaps, and review pain.

The ranking logic was blunt on purpose:

  1. Start with the dependency graph.
  2. Pull the files with the largest blast radius.
  3. Overlay git history for churn and co-change.
  4. Check ownership signal and reviewer history.
  5. Keep only the files where multiple signals agreed.
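The steps above can be sketched as a small scoring pass. This is a hypothetical sketch, not repowise's actual implementation: the signal names, thresholds, and the "at least two signals agree" rule are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class FileSignals:
    path: str
    dependents: int        # fan-out from the dependency graph
    churn: int             # commits touching the file in the history window
    ownership_pct: float   # share of commits by the top author (0..1)
    review_rounds: float   # average review iterations per PR

def hotspot_signals(f: FileSignals) -> list[str]:
    """Return the independent signals this file trips.
    Thresholds are illustrative; tune them to your repo's distribution."""
    fired = []
    if f.dependents >= 15:
        fired.append("blast-radius")
    if f.churn >= 10:
        fired.append("churn")
    if f.ownership_pct < 0.25:
        fired.append("ownership-gap")
    if f.review_rounds >= 3:
        fired.append("review-pain")
    return fired

def rank_hotspots(files: list[FileSignals]) -> list[tuple[str, list[str]]]:
    # Step 5: keep only files where multiple signals agree,
    # then rank by how many signals fired.
    scored = [(f, hotspot_signals(f)) for f in files]
    kept = [(f.path, s) for f, s in scored if len(s) >= 2]
    kept.sort(key=lambda item: (-len(item[1]), item[0]))
    return kept

files = [
    FileSignals("src/components/SharedShell.tsx", 47, 22, 0.18, 4.0),
    FileSignals("src/utils/date.ts", 3, 14, 0.80, 1.0),  # noisy but isolated
]
print(rank_hotspots(files))  # only SharedShell.tsx survives the multi-signal filter
```

The multi-signal filter is what implements "central but stable stays out, noisy but isolated stays out": a file must be both reachable and risky to make the list.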

That last step mattered. A file could be central but stable. Another could be noisy but isolated. They did not want to confuse “annoying” with “urgent.”

| hotspot | dependents | ownership % | recent churn | co-change partner | why it was ranked early |
| --- | --- | --- | --- | --- | --- |
| src/components/SharedShell.tsx | 47 | 18% | high | src/routes/AppRouter.tsx | central fan-out plus weak ownership |
| src/hooks/useFeatureFlags.ts | 31 | 24% | high | src/components/Nav.tsx | changed with routing and auth work |
| src/utils/formatters.ts | 28 | 12% | medium | src/components/Table.tsx | broad utility reuse, review churn |
| src/components/FilterBar.tsx | 19 | 41% | high | src/pages/Search.tsx | repeated breakage in adjacent UI |
| src/state/queryClient.ts | 17 | 9% | high | src/api/client.ts | shared behavior, low bus factor |

The key move was choosing the first 23 hotspots before touching JSX. That sounds obvious until you watch teams do the opposite: open the most visible component, make a local fix, then discover the real dependency knot three PRs later.

The team got one thing wrong initially. The first ranking pass over-weighted churn. A file with a lot of commits is not always a dangerous file. Sometimes it is just a file people use as a dumping ground. Once the team added ownership percentage and co-change history, several “obvious” candidates fell out of the top tier.

[Figure: hotspot ranking sheet]

The practical effect was simple: they were no longer debating architecture in the abstract. They were deciding which files had earned the right to be touched first.

What the graph showed that grep could not: callers, callees, and hidden fan-out

Grep is good at finding symbols. It is bad at telling you what those symbols are doing to the rest of the app.

The team used a tree-sitter graph to get syntax-aware structure across React files, then walked callers and callees to understand fan-out from entry points. That exposed relationships grep would never have made obvious: a shared component used by multiple routes, a utility imported by both the UI and a background job, a hook that looked local but sat under a cross-cutting state layer.
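The graph walk itself is simple once the import edges exist. A minimal sketch, assuming an import graph has already been extracted (for example by a tree-sitter pass); the file names are illustrative:

```python
from collections import deque

def transitive_dependents(imports: dict[str, set[str]], target: str) -> set[str]:
    """Find every file that reaches `target` directly or indirectly.

    `imports` maps each file to the files it imports. Invert it to get
    dependent edges (who imports me), then breadth-first walk up from
    the target."""
    dependents: dict[str, set[str]] = {}
    for src, deps in imports.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(src)

    seen: set[str] = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for parent in dependents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

graph = {
    "AppRouter.tsx": {"SharedShell.tsx"},
    "Nav.tsx": {"SharedShell.tsx"},
    "Search.tsx": {"AppRouter.tsx"},
}
print(transitive_dependents(graph, "SharedShell.tsx"))
# {"AppRouter.tsx", "Nav.tsx", "Search.tsx"} — Search.tsx reaches it only indirectly
```

This is the step grep cannot do: Search.tsx never mentions SharedShell.tsx, yet it lands in the blast radius through the router.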

A small example made this concrete.

| node | role | downstream impact |
| --- | --- | --- |
| SharedShell.tsx | entry point | route layout, auth gate, nav, telemetry wrapper |
| useFeatureFlags.ts | shared hook | conditional rendering across 11 components |
| formatters.ts | utility | tables, detail views, export paths |
| FilterBar.tsx | UI control | search, analytics, saved views |

The worked example that changed the order of attack was SharedShell.tsx. On the surface it looked like a layout component. In the graph, it was an entry point that fed navigation, auth state, and telemetry into multiple branches. That meant a small change there could destabilize behavior in places that had never been in the same PR before.

Once they saw that, they stopped treating the refactor as “replace one component.” They treated it as “reduce the number of ways the component can surprise us.”

That is where dependency graphs earn their keep in a large codebase refactor. They do not tell you what to build. They tell you what not to break first.

If you want the mental model to scale, this is close to what indexing docs, graphs, and ownership at scale looks like in practice: a graph is only useful when it is paired with the other signals that explain why a node matters.

The 12 files that kept breaking reviews had one thing in common: nobody could answer who owned them

The most useful ownership signal was not “who committed here last.” It was whether anyone could confidently say who should review changes here.

Twelve files kept failing that test. They were not always the most complex files, but they had low ownership percentage, fragmented reviewer history, and a bus factor that was too close to one. In other words, they were shared in the worst way: everyone depended on them, and nobody felt responsible for them.
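One way to make "shared in the worst way" measurable is to compute ownership share and bus factor from per-file commit history. This is a hedged sketch: defining ownership % as the top author's commit share, and bus factor as the smallest set of authors covering half the commits, are assumptions; teams define both differently.

```python
from collections import Counter

def ownership_stats(commit_authors: list[str]) -> tuple[float, int]:
    """Given the author of each commit that touched a file, return
    (ownership share, bus factor).

    Ownership share: the top author's fraction of commits.
    Bus factor: the fewest authors who together account for at least
    half the commits."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    shares = sorted(counts.values(), reverse=True)

    ownership = shares[0] / total
    covered, bus_factor = 0, 0
    for share in shares:
        covered += share
        bus_factor += 1
        if covered * 2 >= total:
            break
    return ownership, bus_factor

# A file everyone touches and nobody owns: low share, scattered history
print(ownership_stats(["ana", "ben", "cho", "dee", "ana", "eli"]))
```

A low ownership share with a bus factor near one is the exact profile of the twelve files above: the one person who could review confidently is also the only one.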

That slowed the refactor more than the code itself did. Reviewers hesitated. Authors over-explained. Small changes got routed through too many people because there was no obvious owner to bless the direction.

| file group | ownership % | review churn | bus factor | impact on refactor |
| --- | --- | --- | --- | --- |
| shared layout | 14% | high | 1 | repeated review loops |
| feature flag layer | 19% | high | 2 | ambiguous contract changes |
| formatting utilities | 11% | medium | 1 | slow sign-off on trivial edits |
| route shell | 8% | high | 1 | no clear decision maker |

The lesson was not “add more owners” in the abstract. It was that weak ownership is a sequencing constraint. If a file has no clear reviewer, it should move earlier only if its blast radius justifies the coordination cost. Otherwise it becomes a bottleneck that masks the rest of the plan.

That is also why codebase visibility for engineering managers is not a luxury feature. When the team can see ownership percentage, review churn, and dependents in one place, the refactor plan stops being a social negotiation.

Co-change history separated real coupling from habit

Git history was the filter that kept the graph honest.

The team looked at co-change pairs over the same history window used for churn. That surfaced which files truly moved together during past work. It also exposed a common trap: broad cleanup PRs make unrelated files look coupled when they are only adjacent in habit.
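Mining co-change pairs is a few lines once commits are represented as sets of changed files (e.g. from `git log --name-only`). A sketch under one assumption: commits touching more than a cutoff number of files are treated as cleanup sweeps and skipped, which is one blunt way to dodge the trap described above.

```python
from collections import Counter
from itertools import combinations

def co_change_pairs(commits: list[set[str]], max_commit_size: int = 20) -> Counter:
    """Count how often file pairs change in the same commit.

    Commits touching more than `max_commit_size` files are skipped:
    broad cleanup or formatting sweeps make unrelated files look
    coupled when they are only adjacent in habit."""
    pairs: Counter = Counter()
    for files in commits:
        if len(files) > max_commit_size:
            continue  # likely a sweep, not evidence of real coupling
        for a, b in combinations(sorted(files), 2):
            pairs[(a, b)] += 1
    return pairs

history = [
    {"SharedShell.tsx", "AppRouter.tsx"},
    {"SharedShell.tsx", "AppRouter.tsx", "auth.ts"},
    {f"file{i}.ts" for i in range(50)} | {"formatters.ts", "README.md"},  # sweep, ignored
]
print(co_change_pairs(history)[("AppRouter.tsx", "SharedShell.tsx")])  # → 2
```

The size cutoff is the honesty filter in code form: frequent small-commit pairing counts as coupling; appearing together in one 50-file sweep does not.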

One pair mattered:

  • SharedShell.tsx and AppRouter.tsx had been changed together across several feature launches and auth fixes.
  • The significant commit messages were consistent: route gating, shell behavior, session handling.

That was real coupling. If one changed, the other often needed a corresponding change.

One pair did not:

  • formatters.ts and docs/README.md showed up together in a few historical commits.
  • The commit messages made it clear those were cleanup passes and documentation sweeps, not evidence of runtime dependency.

That distinction mattered because co-change history is easy to misuse. If you treat every repeated pair as a dependency, you end up prioritizing old PR habits instead of system behavior.

| pair | co-change history | significant commit messages | interpretation |
| --- | --- | --- | --- |
| SharedShell.tsx + AppRouter.tsx | frequent | auth, routing, shell | real coupling |
| FilterBar.tsx + Search.tsx | moderate | search UX, filters | likely coupling |
| formatters.ts + README.md | occasional | cleanup, docs | habit, not coupling |
| queryClient.ts + api/client.ts | frequent | request flow, retries | real coupling |

The before/after decision note that changed the plan was almost boring:

  • Before graph + git signals: FilterBar.tsx looked like an early candidate because it was visible and flaky.
  • After graph + git signals: it moved down, because its dependents were modest and its churn mostly tracked a neighboring search flow.
  • Meanwhile SharedShell.tsx moved up, because it had the largest blast radius, weak ownership, and repeated co-change with the routing layer.

That is the kind of evidence that persuades a skeptical staff engineer. Not “this feels central,” but “this file has the highest combination of dependents, churn, and coordination risk.”

If you want a broader frame for why this matters, codebase visibility for engineering managers is the right mental model: the point is not just seeing the code, but seeing which decisions are expensive to reverse.

What would have gone wrong with grep and a PR checklist

A grep-first workflow would have found symbols. It would not have found blast radius.

A PR checklist would have asked whether tests passed, whether types compiled, and whether docs were updated. Useful questions, all of them. None of them would have told the team which 23 hotspots to touch first, or which files were risky because nobody owned them, or which modules had hidden fan-out through entry points.

The misses would have been specific:

  • missed dependencies behind shared React entry points
  • missed owners on files with repeated review churn
  • missed coupling that only showed up in co-change history
  • missed the fact that some “obvious” files were noisy but not central
  • missed the reason the chosen order reduced risk: it started where blast radius and coordination cost were both highest

That last part is the difference between a refactor plan and a cleanup wishlist. The team did not start with the prettiest files. They started with the files that could most easily derail the rest of the work.

That is why the 47 dependents mattered so much. It forced the team to stop pretending the refactor was about code style and start treating it as a sequencing problem with measurable risk. Once they had the graph, the history, and the ownership signal, the first 23 hotspots were not a debate. They were the only sane place to begin.

FAQ

How do you choose the first files in a large codebase refactor?

Start with the files that have the highest blast radius, then rank them by ownership gaps, churn, and co-change history. The first files should be the ones where a mistake would hurt the most and where the evidence for prioritization is strongest.

How do dependency graphs help with a React refactor?

They show callers, callees, and fan-out around shared components, hooks, and utilities. In React, that often reveals that a file you thought was local is actually an entry point for many downstream dependents.

What is the difference between hotspot analysis and code ownership analysis?

Hotspot analysis asks where change risk is concentrated, usually by combining churn with complexity or dependents. Code ownership analysis asks who can confidently review or maintain the file, which is why ownership gaps can slow a refactor even when the code is not especially complex.

Why is grep not enough for planning a large React refactor?

Grep finds symbols, not system shape. It cannot tell you which files have the largest blast radius, which ones are weakly owned, or which ones are coupled through past changes rather than direct imports.

How did the team know the chosen order reduced risk?

Because the first 23 hotspots were selected before editing JSX, using dependency graph data, git history, and ownership signals together. That let them start with the files that were most central and most coordination-heavy, instead of discovering those risks mid-refactor.

Try repowise on your repo

One command indexes your codebase.