Every Vertical Just Hit the Same Wall
Thirteen companies. Seven industries. One bottleneck.
PitchBook (Morningstar’s private markets research arm) just published Part II of their analyst note on agentic AI. They interviewed CEOs and leaders from 13 startups across cybersecurity, legal, healthcare, customer support, sales, robotics, data security, and AI governance.
The question they asked each company was essentially the same: what’s working, what’s stuck, and where does durable value accrue?
The answers are remarkably consistent. And they describe a structural problem that we’ve been thinking about for a long time.
The bottleneck is not the models
Every single interviewee said it. Different words, same conclusion.
HiddenLayer (AI security): “The bottleneck is less about capability and more about trust.”
WRITER (enterprise content): “Trust, not technology, is the real barrier. The enterprise is having an immune reaction to AI agents.”
Forethought (customer support): “The hardest problem is no longer language understanding. It is operational trust.”
Geordie.ai (AI governance): “There is no shared model for guardrails.”
PitchBook’s own synthesis: “Model capability is no longer the primary bottleneck. Across industries, operators point to governance, integration, and organizational readiness as the key constraints.”
This is significant. A year ago, the conversation was about model capability: which model is smartest, which has the longest context, which scores highest on benchmarks. That conversation is over. The models are good enough. The question now is: how do you trust them to act?
Governance can’t be retrofitted
The most important finding in the report comes from WRITER’s work with Vodafone. Their critical insight: “Build governance into the system from day one, not bolted on later.”
PitchBook goes further, calling governance architecture the moat that is “hardest to replicate because it can’t be retrofitted.” WRITER’s ISO/IEC 27001, 27701, and 42001 certifications took 12-18 months. Competitors stitching together platforms from multiple vendors can’t patch the security gaps between them.
This is a structural argument, not a marketing argument. If your system wasn’t designed for governance from the start, you can’t add it later without rebuilding. The architecture is the governance. Or it isn’t.
Jazz (data security) makes the same point from the other direction. Legacy DLP vendors built complex rule-based frameworks over decades. Their business model depends on the switching costs those frameworks create. Now they’re bolting AI on top as premium pricing tiers. The result: roughly one order of magnitude of noise reduction. Jazz, built from first principles, claims four to five orders of magnitude. “The incumbents aren’t incapable of building this. They’re unwilling to risk what they’d have to give up to do it.”
The innovator’s dilemma, in real time, across every vertical.
What enterprises actually need
Across all 13 interviews, respondents independently converge on the same set of trust requirements:
- Explainability of decision paths. Not “the model said so.” A traceable chain from input to decision to action, with the reasoning visible at every step.
- Audit trails. Append-only records that can’t be tampered with. Every decision recorded, every action logged.
- Confidence thresholds. Not binary trust (on/off) but graduated autonomy. Let the system observe first, then recommend, then act with guardrails, then operate independently.
- Reversibility. Agentic adoption accelerates where mistakes can be undone. Email security is further along than network enforcement. Outbound sales is further along than enterprise deal strategy. The pattern: the lower the rollback cost, the higher the autonomy.
PitchBook’s analyst puts it precisely: “The asymmetry in rollback cost, not model readiness, is often what determines the boundary of autonomous action.”
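To make the interaction between confidence thresholds and rollback cost concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Stage` and `Action` names, the 0-to-1 `rollback_cost` score, and the threshold arithmetic are hypothetical and do not come from the PitchBook report or from any company it interviewed. The only idea it encodes is the pattern described above: the cheaper a mistake is to undo, the less confidence is required before an agent moves up a stage.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stage(IntEnum):
    """The four autonomy stages from the list above, least to most autonomous."""
    OBSERVE = 0
    RECOMMEND = 1
    ACT_WITH_GUARDRAILS = 2
    INDEPENDENT = 3


@dataclass(frozen=True)
class Action:
    name: str
    confidence: float     # model confidence in [0, 1]
    rollback_cost: float  # hypothetical score in [0, 1]; 0 means trivially undone


def max_stage(action: Action) -> Stage:
    """Return the highest stage allowed. The thresholds are illustrative only."""
    # Confidence needed for full independence rises with rollback cost (0.5 to 1.0).
    required = 0.5 + 0.5 * action.rollback_cost
    if action.confidence < required - 0.2:
        return Stage.OBSERVE
    if action.confidence < required - 0.1:
        return Stage.RECOMMEND
    if action.confidence < required:
        return Stage.ACT_WITH_GUARDRAILS
    return Stage.INDEPENDENT


# Same confidence, different rollback cost, different ceiling on autonomy.
quarantine_email = Action("quarantine suspicious email", confidence=0.9, rollback_cost=0.1)
block_subnet = Action("block network segment", confidence=0.9, rollback_cost=0.9)
print(max_stage(quarantine_email).name, max_stage(block_subnet).name)
```

With identical confidence, the easily reversed action is allowed further up the ladder than the costly one, which is the asymmetry the analyst describes.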
Abnormal AI (email security) names the structural gap directly: “There’s a critical distinction between automation (human still approves) and autonomy (agent runs the loop). The cost of that single ‘approve’ button is the difference between a tool and a teammate. Genuine autonomy requires confidence thresholds, audit trails, and exception paths that most platforms haven’t built yet.”
Most platforms haven’t built them yet. The architectures they started with weren’t designed for them. Adding audit trails to a system that doesn’t record decisions is a retrofit. Adding governance to a system where effects happen directly requires rethinking the execution model.
Value is moving from models to systems
The report’s thesis: “Durable value is shifting toward workflow ownership, proprietary context, and platform-level control as models commoditize.”
Models have a 6-12 month competitive half-life. Distribution creates friction but not barriers. What compounds: governance architecture that takes years to build, organizational context that gets richer with every deployment, workflow integrations that require full process re-architecture to displace.
PitchBook’s closing line is worth reading twice: “Agentic AI is evolving from a software category into organizational infrastructure. The competitive dynamics that follow are less like SaaS, where a better product can displace an incumbent relatively quickly, and more like ERP, where deep operational embedding creates durable advantage that compounds over years.”
More like ERP than SaaS. That’s the market talking, not us.
What this means for how systems get built
A pattern emerges from these interviews. The companies that are furthest along (Abnormal AI in email security, WRITER in enterprise content, HiddenLayer in model protection) all made the same architectural choice early: they built governance into the execution model, not on top of it.
This matches something we’ve been working on with mashin. We started with a different question than most AI platforms. Not “how do we add governance to AI workflows?” but “what if programs produced intents instead of effects?”
Here’s what that means concretely. In most systems, when code says “send this email” or “call this API,” the action happens. The program is the effect. There’s no moment between deciding and doing where anything can intervene. There is no translation layer between what the program wants and what actually happens. And without a translation layer, there is nothing to govern.
Consider a specific scenario. You’re building an email triage agent for a law firm. The agent reads incoming mail, classifies it by urgency and practice area, drafts a response, and routes it to the right attorney. In a typical framework, the agent composes the email and sends it. If the governance check was supposed to catch privilege-waiver risk before that email went out, it had to be wired in correctly by the developer. If they forgot, or if the AI-generated code didn’t include the check, the email goes out.
In mashin, the agent can’t send the email directly. It produces an intent: “I want to send this email, to this person, with this content, because the incoming message was classified as urgent client matter.” That intent is a data structure. It passes through a governance interpreter before anything happens. The interpreter checks: does this agent have permission to send external email? Does the content contain privileged information? Does the confidence score meet the threshold for autonomous action, or does this need attorney review?
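The shape of that flow can be sketched in a few lines of Python. This is not mashin's implementation or syntax; it is an illustration under stated assumptions. The names `SendEmailIntent`, `Policy`, `interpret`, and `execute` are hypothetical, the privilege check is a naive keyword match standing in for a real classifier, and routing privileged content to review (rather than denying it outright) is our choice for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    NEEDS_REVIEW = "needs_review"
    DENY = "deny"


@dataclass(frozen=True)
class SendEmailIntent:
    """A description of a desired action. Constructing it sends nothing."""
    agent: str
    recipient: str
    body: str
    reason: str
    confidence: float


@dataclass(frozen=True)
class Policy:
    may_send_external: frozenset[str]    # agents permitted to send external mail
    privileged_markers: tuple[str, ...]  # naive stand-in for a privilege classifier
    autonomy_threshold: float            # minimum confidence for unattended sending


def interpret(intent: SendEmailIntent, policy: Policy) -> tuple[Verdict, str]:
    """Run the three checks from the text in order and explain the outcome."""
    if intent.agent not in policy.may_send_external:
        return Verdict.DENY, "agent lacks permission to send external email"
    lowered = intent.body.lower()
    if any(marker in lowered for marker in policy.privileged_markers):
        return Verdict.NEEDS_REVIEW, "possible privileged content"
    if intent.confidence < policy.autonomy_threshold:
        return Verdict.NEEDS_REVIEW, "confidence below autonomy threshold"
    return Verdict.ALLOW, "all checks passed"


def execute(intent: SendEmailIntent, policy: Policy, outbox: list[str]) -> Verdict:
    """The only path that produces the effect, and it always interprets first."""
    verdict, _why = interpret(intent, policy)
    if verdict is Verdict.ALLOW:
        # Stand-in for real delivery; an actual system would call a mail service.
        outbox.append(f"to={intent.recipient}: {intent.body}")
    return verdict


policy = Policy(frozenset({"triage-agent"}), ("attorney-client privileged",), 0.8)
outbox: list[str] = []
intent = SendEmailIntent(
    agent="triage-agent",
    recipient="client@example.com",
    body="We received your filing and will respond today.",
    reason="incoming message classified as urgent client matter",
    confidence=0.93,
)
print(execute(intent, policy, outbox), len(outbox))
```

The agent code never touches the outbox. It can only build an intent and hand it over, so there is no way to skip the check by forgetting to call it.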
The difference isn’t theoretical. It’s the difference between “we have a policy that says check for privilege” and “the system cannot send email without the check occurring.” One relies on developers remembering. The other relies on architecture.
That single choice (programs produce intents instead of effects) creates a translation layer between what the program wants and what actually happens. Governance is a property of that translation layer. It doesn’t need to be added or configured. It exists because the layer exists. And everything the PitchBook respondents are asking for falls out as a consequence:
- Audit trails? Every intent is a data structure. Recording it in an append-only ledger is a natural consequence, not a separate infrastructure concern.
- Explainable decision paths? The governance interpreter produces a decision record for every intent: what was requested, which rules applied, what was decided, why.
- Staged autonomy? Trust progression from “observe” to “recommend” to “act with guardrails” to “full autonomy” is a policy configuration on the governance interpreter. The architecture supports it because every action is already mediated.
- Reversibility? Intents that haven’t been executed can be inspected, modified, or denied. The gap between deciding and doing is where governance lives.
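As an illustration of the first two points, here is a minimal Python sketch of an append-only ledger of decision records. It is an assumption-laden example, not mashin's ledger: the `Ledger` class and its field names are hypothetical, and the tamper evidence comes from SHA-256 hash chaining, a common technique we have swapped in for the example. Hash chaining makes after-the-fact edits detectable; it does not by itself make them impossible.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Append-only list of decision records, each chained to the previous hash."""
    _entries: list[dict] = field(default_factory=list)

    @staticmethod
    def _digest(record: dict) -> str:
        # Hash every field except the hash itself, in a stable key order.
        body = {k: v for k, v in record.items() if k != "hash"}
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def append(self, requested: str, rules: list[str], decision: str, why: str) -> str:
        """Record what was requested, which rules applied, what was decided, and why."""
        prev = self._entries[-1]["hash"] if self._entries else "0" * 64
        record = {"requested": requested, "rules": rules,
                  "decision": decision, "why": why, "prev": prev}
        record["hash"] = self._digest(record)
        self._entries.append(record)
        return record["hash"]

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered record breaks it."""
        prev = "0" * 64
        for record in self._entries:
            if record["prev"] != prev or record["hash"] != self._digest(record):
                return False
            prev = record["hash"]
        return True


ledger = Ledger()
ledger.append("send_email to client@example.com", ["external_send", "privilege_check"],
              "needs_review", "possible privileged content")
ledger.append("send_email to client@example.com", ["external_send", "privilege_check"],
              "allow", "attorney approved revised draft")
print(ledger.verify())
```

Because each record already carries the request, the rules applied, the decision, and the reason, the same structure serves as both the audit trail and the explanation of the decision path.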
We took this seriously enough to formally verify the core properties. The governance kernel is extracted from 572 machine-checked Rocq theorems with zero admitted lemmas. The proofs establish that every program in the language is governed by construction: not by policy or convention, but by the structure of the system itself.
We’re not claiming to have solved every problem these 13 companies face. Integration challenges, organizational readiness, domain-specific trust calibration: those are real and hard. What we are saying is that the architectural foundation matters. When PitchBook says governance can’t be retrofitted, our experience building mashin confirms it. The decision has to be made at the foundation.
Thirteen companies across seven industries, all arriving at the same conclusion: the models work, the governance doesn’t, and whoever builds the governance layer well wins the durable position.
We think that layer needs to be architectural, not operational. That’s the bet we’re making.
The full PitchBook report, “Agentic AI: The Evolution to Autonomous Systems, Part II,” is available from PitchBook.