Good Enough vs Expert Level Knowledge

In the mid-2000s, a hospital in the American midwest ran a quiet experiment.

They asked two radiologists to read the same sets of scans. One was a staff radiologist several years into practice. The other was a specialist who had spent twenty years reading a particular kind of scan, had published research on the edge cases, and was regarded by her peers as one of the best in the country at what she did.

On the straightforward cases they agreed almost always. The staff radiologist was good. His miss rate on standard presentations was low. The hospital was satisfied with his work and had no particular reason to look more closely.

On the ambiguous cases the specialist was a lot better at catching things that the staff radiologist missed. Over time the difference showed up. Patients whose scans she read did better.

Using the same technology and tools, the two experts produced very different results. The gap was a knowledge gap, specifically a tacit knowledge gap. One had spent twenty years building a perceptual sensitivity that the other hadn’t had time to develop yet.

This gap between practitioners who perform well on standard cases and experts who perform well on all of them is the same gap that separates a functional agent from an expert one. And it is almost entirely a knowledge problem.

What Good Enough Produces

An agent with general knowledge produces a system that handles the standard cases well. It gets the textbook presentations right. It follows the explicit rules correctly. It produces outputs that are defensible and largely accurate within the range of situations the knowledge was built from.

In a controlled evaluation against standard cases it performs respectably. The people who built it are satisfied. The system goes into production.

Then the real cases arrive.

They are messier, more ambiguous, more contextual than the training examples. They include the presentations that don’t quite fit the criteria, the situations where two rules point in different directions, the cases where the right answer depends on a factor that the knowledge base didn’t think to encode.

On these cases, good enough knowledge produces outputs that are plausible but wrong.

Not dramatically wrong, which is easy to catch, but subtly wrong. We have all had the experience of a chatbot presenting a clearly wrong answer with complete confidence.

That’s the real problem. A general agent doesn’t always recognise when it is operating outside the territory its knowledge covers. Without specific instructions, that boundary is invisible to it.

The Spectrum From Surface to Causal

The difference between good enough knowledge and expert-level knowledge is not only a difference in quantity. It is a difference in depth.

Knowledge exists on a spectrum. At one end is surface knowledge — the explicit rules, the documented procedures, the stated criteria. At the other end is causal knowledge — a deep understanding of why the rules exist, what they are proxies for, how the underlying system actually works.

Surface knowledge is what most implementations encode. It is what experts can most easily articulate, what documentation captures, what training programs teach.

It produces correct outputs when the surface pattern matches the training data. It fails when the pattern doesn’t match, because surface knowledge has no resources for reasoning about novel situations.

Causal knowledge is what experts use. This causal depth is what allows experts to handle novel situations because they understand the underlying system well enough to extrapolate.

It is also what allows them to know when they don’t know and to respond appropriately rather than producing a confident answer that happens to be wrong.

Encoding causal knowledge is harder than encoding surface knowledge. It requires the elicitation process to go deeper, past the rules to the principles behind the rules, past the procedures to the reasoning that generated the procedures.

It requires a representation that can hold causal relationships. It requires an agent architecture that can reason with the causal model, not just retrieve from it.
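One way to make the distinction concrete is in how a single rule is represented. The sketch below is illustrative only: the class and field names are assumptions, not a prescribed schema. A surface encoding stores the rule text alone; a causal encoding also carries why the rule exists, what it is a proxy for, and the known conditions under which it breaks, which is what gives an agent something to reason with when the pattern doesn’t match.

```python
from dataclasses import dataclass, field

@dataclass
class SurfaceRule:
    """What most implementations encode: the rule text alone."""
    rule: str

@dataclass
class CausalRule:
    """A rule plus the reasoning behind it, so an agent can
    extrapolate when the surface pattern does not match."""
    rule: str       # the explicit, documented rule
    rationale: str  # why the rule exists
    proxy_for: str  # the underlying thing the rule approximates
    breaks_when: list[str] = field(default_factory=list)  # known limits

# Hypothetical example from a brand-guidelines domain.
humour_rule = CausalRule(
    rule="Avoid sarcastic humour in customer-facing copy",
    rationale="Sarcasm reads as contempt in low-context channels",
    proxy_for="Protecting the brand's perceived warmth",
    breaks_when=["Internal communications", "Channels with an established playful register"],
)
```

The extra fields cost little to store; the expensive part is the elicitation that fills them in.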

How Domain Expertise Becomes a Structural Moat

There is a competitive dimension to this that is worth understanding clearly.

Any organisation can buy access to a capable base model. It is infrastructure, like electricity or cloud computing. What cannot be bought is the knowledge that makes the model perform at expert level in a specific domain.

That knowledge lives in the people who have spent years developing expertise in that domain. It is distributed across the minds of the people in the organisation, built up through years of experience, and not replicable from the outside.

A competitor can acquire the same model. They cannot acquire the same knowledge base without acquiring the people who hold it and investing the time to surface it.

This means that the organisations which will extract the most value from AI agents are not the ones with the most sophisticated technology. They are the ones that most successfully encode their domain expertise into their agents.

The technology is a commodity. The knowledge is the moat.

The moat compounds over time in a way that is easy to underestimate. An agent system built on expert-level knowledge improves as the knowledge base is maintained and extended.

Every failure is an opportunity to identify a gap and close it. Every new case that the system handles adds to the evidence base for what the knowledge does and doesn’t cover.

The gap between an organisation that treats knowledge engineering as a core capability and one that treats it as a one-time implementation project widens continuously. Not because the laggard’s technology gets worse but because the leader’s knowledge gets better.

This compounding effect is why the knowledge investment is worth making even when it is slow and expensive.

A system built on good enough knowledge reaches a performance ceiling quickly and stays there. A system built on expert-level knowledge improves as the knowledge improves, and the knowledge can always improve.

The Compounding Effect of Knowledge Quality

Consider two agent systems built for the same function, say evaluating marketing briefs against a brand strategy, using the same base model.

The first is built on good enough knowledge. The team spent a few weeks documenting the brand guidelines, encoding the explicit rules about tone, format, and messaging hierarchy, and building a system that checks briefs against those rules.

The second is built on expert-level knowledge. The team spent months learning from senior brand strategists. Why does the brand avoid certain kinds of humour? What is it actually trying to protect? What makes a brief that technically follows the guidelines feel wrong anyway? What are the cases where bending a rule serves the brand better than following it? The elicitation goes deep. The context is designed so the agent can reason about intent, not just compliance.
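To make the contrast concrete, here is a minimal sketch of how the two teams might assemble the agent’s context. The wording and structure are assumptions for illustration; the point is only that the second context carries the intent behind each rule, not just the rule.

```python
# Surface context: roughly what a few weeks of documenting guidelines produces.
surface_context = """Brand rules:
- Avoid sarcastic humour in customer-facing copy.
- Lead with the product benefit, not the feature list."""

# Causal context: the same rules plus the elicited reasoning behind them,
# so the agent can judge intent rather than only check compliance.
causal_context = """Brand rules, with intent:
- Avoid sarcastic humour in customer-facing copy.
  Why: sarcasm reads as contempt in low-context channels.
  Protects: the brand's perceived warmth.
  May bend when: the channel has an established playful register.
- Lead with the product benefit, not the feature list.
  Why: the audience buys outcomes, not specifications.
  Protects: clarity of the value proposition.
  May bend when: technical buyers are comparing specifications."""

def build_prompt(context: str, brief: str) -> str:
    """Assemble an evaluation prompt from a context block and a brief."""
    return (f"{context}\n\nBrief to evaluate:\n{brief}\n\n"
            "Does this brief serve the intent behind the rules, "
            "not just their letter? Explain your reasoning.")
```

With the first context, the model can only pattern-match against rule text; with the second, it has material for reasoning about cases the rules never anticipated.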

In the first month, the two systems perform similarly on standard briefs. The first system’s simpler knowledge base is sufficient for the majority of cases. By the end of the first year, the gap is significant.

The second system is catching the subtle misalignments that the first misses. It has also been maintained and extended. The first system has not been maintained in the same way, because its simpler architecture made it seem like it didn’t need to be.

By the end of the third year, the first system is a compliance checker. The second is a genuine brand intelligence system. The difference is entirely a knowledge difference. The technology is the same.

How to Evaluate Whether an Agent’s Knowledge Is Expert-Level

Expert-level knowledge can be evaluated by looking at what the system does at the edges.

Test it on the ambiguous cases. The cases where a human expert would pause, where the right answer is not obvious, where two reasonable experts might disagree.

What does the system do? Does it produce a confident answer that happens to be wrong? Does it recognise the ambiguity and respond appropriately? Does it ask for more information? The behaviour at the edges tells you more about the quality of the knowledge than the behaviour at the centre.

Test it on the cases the rules don’t cover. Give it a situation that falls outside the explicit knowledge it was built from and watch what happens.

Good enough knowledge produces an answer using whatever surface pattern comes closest. Expert-level knowledge recognises that the situation is novel, signals its uncertainty, and either escalates or reasons carefully from principles rather than patterns.

Ask it to explain itself on a hard case. Expert-level knowledge produces explanations that reflect genuine causal reasoning. They can give you an account of why the rule applies here, what it is a proxy for, and what would change the answer. Good enough knowledge produces explanations that restate the surface pattern. The explanation is the fingerprint of the knowledge depth.

Compare it to your best human expert on the cases that expert finds genuinely difficult. The gap between what the system produces and what the expert would produce on those cases is the most direct measure of the knowledge gap. That gap is the roadmap for where the knowledge engineering work needs to go next.
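The edge tests above can be sketched as a small evaluation harness. Everything here is an assumption about your setup: `agent` is any callable returning an answer plus a self-reported confidence, cases are labelled by a human expert as standard, ambiguous, or novel, and `"ESCALATE"` is a hypothetical sentinel for the agent flagging its own uncertainty.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Case:
    prompt: str
    kind: str           # "standard", "ambiguous", or "novel"
    expert_answer: str  # what your best human expert would say

def evaluate_edges(agent: Callable[[str], tuple[str, float]],
                   cases: list[Case]) -> dict:
    """Score behaviour at the edges, not accuracy at the centre.

    On ambiguous and novel cases, a confident wrong answer is the
    failure mode that matters; abstaining or escalating counts as
    appropriate behaviour.
    """
    confident_wrong = 0
    appropriate = 0
    edge_cases = [c for c in cases if c.kind in ("ambiguous", "novel")]
    for case in edge_cases:
        answer, confidence = agent(case.prompt)
        if answer == "ESCALATE":          # flagged its own uncertainty
            appropriate += 1
        elif answer == case.expert_answer:
            appropriate += 1
        elif confidence > 0.8:            # wrong, and sure of itself
            confident_wrong += 1
    n = len(edge_cases) or 1
    return {"confident_wrong_rate": confident_wrong / n,
            "appropriate_rate": appropriate / n}

# A toy agent that always answers with high confidence:
naive_agent = lambda prompt: ("approve", 0.95)
cases = [
    Case("standard brief", "standard", "approve"),
    Case("two guidelines conflict", "ambiguous", "ESCALATE"),
    Case("a channel the guidelines never anticipated", "novel", "ESCALATE"),
]
print(evaluate_edges(naive_agent, cases))
# → {'confident_wrong_rate': 1.0, 'appropriate_rate': 0.0}
```

The toy agent scores perfectly on the standard case and fails every edge case confidently, which is exactly the profile of good enough knowledge described above.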

The Ceiling Is the Knowledge

The most important thing to understand about agent performance is that it has a ceiling, and the ceiling is set by the knowledge encoded into the system.

A more powerful base model raises the floor but it does not raise the ceiling.

The ceiling is not a function of model capability. It is a function of knowledge quality.

This means that the decision about how much to invest in knowledge engineering is a strategic one. It is a decision about what level of performance you are trying to achieve and whether the investment required to get there is worth making.

For many applications, good enough is genuinely good enough. A system that handles standard cases correctly and fails gracefully on non-standard ones is valuable. The investment in expert-level knowledge is not always justified.

For the applications where the cost of a subtle error compounds over time, where the competitive value of genuine expertise is high, the investment is necessary.

The ceiling is the knowledge. Raise the knowledge and you raise the ceiling. Everything else is infrastructure.