The Parable
Dave retired on a Friday. Thirty-one years as an HVAC technician. Cake, handshakes, a gift card to Bass Pro Shops.
On Monday, Kyle got the Hendersons' furnace call. This was a unit Dave had serviced since 2008. Short-cycling. Kyle spent two hours troubleshooting, called the office twice, replaced the flame sensor. Problem solved.
Except it wasn't. Dave had been monitoring a hairline crack in that heat exchanger for three years. He knew it caused intermittent faults when outdoor temps dropped below 25°F. He was waiting for the right moment to recommend replacement to Mrs. Henderson, a widow on a fixed income. Dave had built enough trust to have that conversation carefully.
None of that was written anywhere.
Three weeks later: callback. Another truck roll. At $84 per hour fully loaded, the cost of Dave's undocumented knowledge started compounding.
You've probably heard some version of this story before. The senior person retires, the institutional knowledge walks out, things fall apart. It's become a cliché in conversations about documentation and knowledge management.
But here's what the cliché misses: the most dangerous knowledge gaps aren't the obvious ones. They're not the cases where nothing is documented. They're the cases where almost everything is documented—except the one critical piece that nobody realized was missing.
I learned this the hard way.
The Hole
I spent years in software delivery before my current role, managing the systems that move work from specification to production. I thought I understood documentation gaps. Then I lived inside one for three weeks.
We were working with a client on a claims processing system. The requirement was Role-Based Authorization: controlling who could approve what based on their position. Standard stuff. Grace, our Business Analyst, documented the SOW and created the Jira tickets for the development team.
What Grace documented: role-based authorization.
What the client actually wanted: roles that connected to specific rights, with limit amounts attached to each role. Three layers, not one. A manager might have approval rights up to $10,000; a senior manager up to $50,000; anything above that needs director sign-off.
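To make the three layers concrete, here is a minimal sketch in Python. The two dollar limits and the role titles come from the example above. Everything else is my assumption for illustration and not the client's actual implementation: the identifiers, the single "approve" right, and treating a director as having no cap.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Role:
    name: str                # layer 1: the role itself
    rights: frozenset        # layer 2: what the role is allowed to do
    limit: Optional[int]     # layer 3: max amount in dollars; None = no cap (assumed)


# Hypothetical role table built from the amounts in the text.
ROLES = {
    "manager": Role("manager", frozenset({"approve"}), 10_000),
    "senior_manager": Role("senior_manager", frozenset({"approve"}), 50_000),
    "director": Role("director", frozenset({"approve"}), None),
}


def can_approve(role_name: str, amount: int) -> bool:
    """Return True if the role holds the approve right and the amount is within its limit."""
    role = ROLES.get(role_name)
    if role is None or "approve" not in role.rights:
        return False
    return role.limit is None or amount <= role.limit
```

A ticket that says only "role-based authorization" describes the keys of that table. The rights and the limits, the parts the client cared about most, are the values.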
Grace understood this. She explained it verbally to the developer assigned to the feature. He built it. It worked. The client used it for months.
But here's what didn't happen: the rights and limits logic never made it into the Jira tickets. It wasn't in the user manual. And the developer built it directly in production without committing the changes to version control.
According to our systems, that feature didn't exist.
Then two things happened in the same month. The developer resigned. And Grace took a three-week leave.
The client upgraded to a new version. The rights and limits functionality, the part that was never in the pipeline, disappeared. They called us: "The feature we've been using for months is broken."
The colleague covering for Grace couldn't debug it. There was nothing to debug. As far as our documentation, our tickets, and our version control showed, the feature had never been built. We were being asked to fix something that, according to every system we had, didn't exist.
But the client had been using it. They had screenshots. They had workflows built around it. It was real—it just wasn't recorded.
The Rebuild
We spent three weeks rebuilding that feature.
Not three weeks of development. Three weeks of archaeology.
First, we had to figure out what had actually been built. That meant going into the production database and reverse-engineering the logic from the data structures. What tables existed that weren't in our schema documentation? What relationships had been created? What did the data patterns tell us about the business rules that had been implemented?
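The first question, which tables exist that the documentation doesn't know about, reduces to a set difference. Here is a minimal sketch using SQLite from Python's standard library. The table names and the documented set are hypothetical stand-ins, and a real production database would expose its catalog differently (many engines offer an information_schema view instead of sqlite_master).

```python
import sqlite3


def undocumented_tables(conn, documented):
    """List tables present in the live database but absent from the documented set."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    live = {name for (name,) in rows}
    return sorted(live - set(documented))


# Stand-in for production: three tables exist, but only one was ever documented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE roles (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE role_rights (role_id INTEGER, right_name TEXT);
    CREATE TABLE role_limits (role_id INTEGER, max_amount INTEGER);
""")
DOCUMENTED = {"roles"}  # hypothetical contents of the schema documentation

print(undocumented_tables(conn, DOCUMENTED))  # ['role_limits', 'role_rights']
```

The query only tells you where to dig. Working out what the rows in those tables mean still takes the interviews and the guesswork.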
Then we had to interview the client. Not about what they wanted; we had the original SOW for that. We needed them to walk us through what they had been doing. "Show me your approval workflow. What happens when you click this button? What did you expect to see here?" We were reconstructing the feature from user behavior because we had no other source of truth.
We tried to contact the developer who'd built it. He was helpful where he could be, but he'd built it months ago, he'd already moved on mentally, and his memory was incomplete.
Three weeks. To rebuild something that had worked fine. Because the knowledge of how it worked existed in exactly two places: a developer who'd resigned and a BA who was unreachable.
The Pattern
Here's what I got wrong before this happened: I thought documentation failures were obvious. I thought they looked like empty wiki pages, missing READMEs, the senior engineer who refuses to write anything down.
Those failures exist. But they're visible. You can point at the gap. You can assign someone to fill it.
The failure we experienced was invisible. Grace had documented the requirement. The developer had built the feature. The client had been using it successfully. Every checkpoint looked green. The system was working.
The gap was in the delta, the difference between what was specified and what was actually built. That delta lived in a verbal conversation and a production deployment that bypassed version control. It was a two-person knowledge chain with no backup.
When I talk to engineering teams now, I see this pattern everywhere:
The PM who verbally clarified the edge case handling but didn't update the ticket. The architect who explained the "why" behind a design decision in a Slack thread that nobody will ever find again. The senior engineer who fixed the production bug directly and mentioned it in standup but never documented the root cause.
Each of these is a small delta. Individually, they're survivable. But they accumulate. And eventually, the two people who hold a critical piece of knowledge are unavailable at the same time (vacation, resignation, reorg, illness) and you're doing database forensics to understand your own system.
The Translation
If you're running a startup or managing an engineering team, you probably don't have a BA named Grace or a claims processing system. But you have the same architecture of risk.
Your founding engineer who built the authentication system and "just knows" why it handles session tokens that way? That's Grace. The design decision that made sense two years ago but was never written down? That's the verbal handoff about rights and limits. The direct production fix that bypassed the normal deployment pipeline because it was urgent? That's the commit that never made it to version control.
In HVAC, the cost is $84 an hour, fully loaded, every time a callback puts a truck back on the road. In software, the cost is harder to measure but often larger: three weeks of senior engineering time to rebuild something that worked fine. A client relationship strained because you can't explain what happened. A feature frozen because nobody wants to touch the code the departed engineer wrote.
And there's a version of this that matters for fundraising and exits. Technical Due Diligence 2.0, the AI-assisted audits that are becoming standard in 2026, can now scan codebases for exactly what I described: functions with no documentation, logic that doesn't match the specs, code that exists in production but not in version control. The acquirer's team no longer has to manually find your three-week holes. Their tools surface them automatically. Low documentation density, orphaned code, spec-to-implementation drift—it all shows up in the diligence report. And it affects valuation.
When an acquirer's engineering team asks "can we maintain this without your current team?" they're not guessing anymore. They're measuring.
The Dave Problem isn't a documentation problem. It's a knowledge architecture problem. And the solution isn't "write better docs."
What Actually Helps
After the three-week hole, we changed how we worked. Not with a documentation mandate—those don't stick. We changed the architecture of how knowledge moved through the team.
Nothing verbal for client-facing logic. If a requirement gets clarified or expanded in conversation, the conversation doesn't count until it's in the ticket. Not "I'll update it later." The meeting doesn't end until the ticket reflects what was agreed. This felt bureaucratic at first. Then we remembered the three weeks.
No production changes without version control. Full stop. Even hotfixes. Especially hotfixes. If it's urgent enough to fix directly in production, it's urgent enough to document immediately. The developer who bypassed version control wasn't being careless—he was being fast. But speed that creates invisible knowledge debt isn't actually fast.
Delta reviews, not just spec reviews. When a feature ships, someone other than the builder compares what was specified to what was built. Not to catch mistakes, but to catch undocumented enhancements: the things that got added, clarified, or adjusted during development and never made it back to the documentation. Those deltas are where the three-week holes hide.
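Stripped to its core, a delta review is a comparison of two lists. Here is a minimal sketch in Python, assuming both the spec and the shipped behavior have been written down as short labels. The labels paraphrase the claims-system story, and the function name and report keys are hypothetical.

```python
def delta_review(specified, built):
    """Compare what the spec lists against what the reviewer found in the build."""
    specified, built = set(specified), set(built)
    return {
        # Built but never specified: the invisible deltas.
        "undocumented": sorted(built - specified),
        # Specified but not found in the build.
        "unbuilt": sorted(specified - built),
    }


spec = ["role-based authorization"]
shipped = [
    "role-based authorization",
    "rights per role",
    "limit amount per role",
]

report = delta_review(spec, shipped)
print(report["undocumented"])  # ['limit amount per role', 'rights per role']
print(report["unbuilt"])       # []
```

The hard part is producing the "shipped" list honestly, which is why the reviewer should be someone other than the builder. Once the list exists, the comparison is trivial.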
The tooling has gotten better since then. Meeting transcription means conversations can be searchable. AI can generate summaries and flag when discussed items don't appear in tickets. Code analysis can identify undocumented functions.
But the tools only help if you understand what you're solving for. You're not solving for "more documentation." You're solving for "no invisible deltas." You're solving for "no knowledge that exists only in a two-person chain."
The Test
Here's a diagnostic you can run on your own team:
Pick a feature that shipped in the last six months. Something non-trivial—real business logic, not a button color change.
Now ask: if both the PM who specified it and the engineer who built it were unavailable starting tomorrow, how long would it take a new person to understand how it actually works? Not how it was supposed to work—how it actually works, including any adjustments made during development. What's their Time to First PR on that part of the codebase? What's the cognitive load to get there?
If the answer is "a few hours of reading docs and code," you're in good shape.
If the answer is "they'd have to talk to someone," your Onboarding Velocity is effectively zero for that feature. You have a Dave. Maybe that's fine for now. But write down who that Dave is, what they know that isn't recorded anywhere, and what happens when they're not available.
If the answer is "honestly, I'm not sure anyone fully understands that feature anymore," you might have a three-week hole waiting to open.
The Point
I opened with Dave because his story is recognizable. Everyone understands "the guy who knows everything retired." It's a clean narrative.
But the failures I've actually experienced aren't that clean. They're not about the senior person who refuses to document. They're about the systems that look fully documented until you discover the gap. The verbal clarification that seemed minor. The production fix that was going to be "cleaned up later." The knowledge chain that was only two people long.
Dave retired and everyone knew institutional knowledge walked out the door.
Grace went on vacation and nobody realized anything was missing—until the client called.
The second kind of failure is harder to see and harder to prevent. It requires thinking about knowledge not as something people write down, but as something that moves through systems. Where does it enter? Where does it get recorded? Where are the single points of failure?
Three weeks to rebuild something that worked fine. That's what the invisible delta costs.

THE DAVE AUDIT
Where are your three-week holes? The Dave Audit is a 10-question diagnostic to identify knowledge concentration risk before it becomes a crisis.

