BLOG / ENGINEERING
Technical deep dives into dependency graphs, semantic search, and architecture

A reproducible defect-prediction study: 21 repos, 9 languages, 2,770 labeled files, ROC AUC 0.74, and 2.3x more defects caught under a fixed review budget.

We git-blamed 112,382 commits across 28 repos to test whether AI-agent code introduces more bugs than human code. After controlling for size, it doesn't, and its lines last longer.

I scored 21 repos six months before their bugs landed to test whether a deterministic code-health score predicts defects. AUC 0.737, and the honest caveats.

Complexity and code smells are the metrics everyone reaches for. Across 25 markers and 21 repos, the strongest defect predictors were evolutionary, not structural. The numbers, with file size controlled.

A code health scorer works only when its single number is built from signals that map to real maintenance cost. A file can look clean and still…

A deterministic PR review bot can do useful review work without calling an LLM once. That sounds odd until you break “review” into smaller jobs: parse the…

Co-change analysis of git history shows which files move together, who owns them, and what breaks next. See why vector-only retrieval misses refactor signals.

Why co-change analysis of git history overfit to a monorepo refactor, and the guardrails we added after 500 commits of history pointed at the wrong coupling.

Code hotspot detection as ranking, not a linter: PageRank plus git history sorted quiet files from risky ones, so you can ignore the right 3.

Every seasoned engineer has experienced the 'archaeology phase' of a project. You’re looking at a specific abstraction—perhaps a custom implementation of a m...

Every software engineer has experienced the 'wall of code' phenomenon. You join a new project, clone the repository, and find yourself staring at 100,000 lin...

Most software engineers spend upwards of 70% of their time reading code rather than writing it. Yet, the tools we use to navigate these complex systems have ...

As engineering teams scale, the primary bottleneck isn't usually writing code—it's understanding it. Between technical debt, architectural drift, and the 'bu...

Every developer has experienced 'grep fatigue.' You’re navigating a 100k+ LOC codebase, trying to find where a specific business logic—say, the grace per...