The conversation around AI security has become saturated with anxiety.
Each week brings fresh headlines warning of jailbreaks, prompt injection, rogue agents, and AI-powered cyber crime.
It’s easy to walk away with the impression that AI is inherently uncontrollable – something that must be locked down before it spirals beyond our grasp.
But as a security practitioner, I’ve learnt to be cautious about narratives built on hypotheticals.
Many of the loudest warnings rely on engineered demos or theoretical exploits. They raise valid concerns, but rarely answer a more fundamental question: what does the real attack surface of today’s AI systems actually look like?
So instead of adding another opinion to the pile, I ran the numbers.
To ground the debate in reality, I focused on the Model Context Protocol (MCP), a framework widely used to allow language models to interact with tools, APIs and external systems. MCP is open source, replicated across environments, and built for practical integration. In other words, it’s a strong test case for understanding real-world exposure.
There were no adversarial prompts and no artificial exploits in this research. We analysed active, runnable MCP servers, examined their tool schemas, and measured what capabilities they actually exposed.
What we found was striking in its familiarity.
The MCP servers that met our criteria exposed well-understood primitives: filesystem access, HTTP requests, database queries, local script execution, orchestration workflows and read-only API searches. These are not exotic AI-only risks. They are the same building blocks embedded in cloud automation, infrastructure-as-code, and modern DevOps stacks.
MCP doesn’t invent new capabilities. It structures existing ones.
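To make that concrete, here is a minimal sketch of what such a tool definition looks like, written as a plain Python dict. The field names follow MCP's tool-listing shape (a name, a description and a JSON Schema for inputs), but the tool itself and the classification heuristic are invented for illustration – they are not drawn from any specific server in our dataset.

```python
# Illustrative MCP-style tool definition: a filesystem read primitive.
# The tool name and schema are hypothetical, not taken from a real server.
read_file_tool = {
    "name": "read_file",
    "description": "Read the contents of a file from the local filesystem.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Path of the file to read"},
        },
        "required": ["path"],
    },
}

def capability_of(tool: dict) -> str:
    """Classify a tool by the primitive it exposes (toy heuristic on the name)."""
    name = tool["name"]
    if "read" in name or "write" in name:
        return "filesystem"
    if "fetch" in name or "http" in name:
        return "network"
    return "other"
```

Nothing in the definition is AI-specific: the exposed capability is ordinary file I/O, the same primitive found in any automation or DevOps tooling, just wrapped in a structured contract a model can call.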
One of the most surprising findings was what we didn’t see. Despite media warnings, arbitrary code execution was rare among operational MCP servers. High-severity risk was the exception, not the rule.
That matters.
It suggests that most real-world AI deployments are not as reckless as some narratives imply. The most common issues we observed were familiar ones: weak defaults, excessive permissions, and poor input handling. These are longstanding security challenges, not novel AI failures.
Where risk meaningfully increases is in composition.
Individually, most MCP servers presented low risk. But when orchestration enters the equation – when tools can be chained together – the attack surface expands.
We observed realistic combinations such as HTTP fetch paired with filesystem writes enabling persistence or content injection; database queries combined with orchestration enabling stealthy exfiltration; and planning logic linked with execution, creating multi-stage attack paths.
None of this is new in principle.
Adversaries have long chained primitives together in traditional environments. MCP reduces friction in assembling those components, but it does not fundamentally change the playbook.
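One way to reason about composition in practice – a toy sketch, where the capability labels and the list of risky pairs are my own illustrative assumptions, not a published taxonomy – is to flag servers whose combined tool set contains a known-dangerous pairing, even when each tool alone is benign:

```python
# Toy composition-risk check: individual capabilities may be low risk,
# but certain combinations enable multi-stage attack paths.
# The capability labels and risky-pair list below are illustrative assumptions.
RISKY_PAIRS = {
    frozenset({"http_fetch", "fs_write"}),     # remote content -> persistence / injection
    frozenset({"db_query", "orchestration"}),  # query results -> stealthy exfiltration
    frozenset({"planning", "execution"}),      # planning logic -> multi-stage attacks
}

def composition_risks(capabilities: set[str]) -> list[frozenset]:
    """Return the risky capability pairs present in a server's tool set."""
    return [pair for pair in RISKY_PAIRS if pair <= capabilities]
```

A server exposing only a read-only API search triggers nothing; one that can both fetch URLs and write files does – which mirrors the chains adversaries have always assembled by hand in traditional environments.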
That said, a critical counterpoint deserves attention. Secure-by-design architectures, particularly tightly scoped schemas, create clear boundaries. But as the ecosystem evolves, not every AI application will be built securely.
Some developers will ship systems with overly permissive tool access or weak schema constraints. In those environments, security teams may be forced to rely on non-deterministic, “best effort” defences such as prompt injection mitigation, rather than being able to influence inherently secure application design.
We must plan for that hybrid reality: championing architectural security wherever possible, while building resilient runtime controls to contain the fallout from insecure implementations.
As AI agents embed deeper into operational systems, control points are shifting. Historically, we validated inputs at the UI, enforced roles through IAM, and encapsulated logic in application code. With AI agents, security boundaries move to orchestration layers, schema contracts, tool composition workflows and execution sandboxes.
Security must follow that shift.
That means auditing tool chains, tightening schema definitions, isolating execution contexts and rigorously applying least privilege and defence-in-depth. It means treating orchestration workflows as critical automation infrastructure. And it means recognising that most AI tooling exposes capabilities we already understand.
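As one concrete illustration of what least privilege looks like at these new control points – a sketch with an assumed sandbox root, not a complete control – a file tool can refuse any path argument that resolves outside the directory it was granted, regardless of what the schema nominally permits:

```python
from pathlib import Path

def resolve_in_sandbox(root: str, user_path: str) -> Path:
    """Resolve a tool's path argument, rejecting escapes from the sandbox root.

    A least-privilege guard at the execution boundary: even if the tool
    schema accepts any string, the runtime refuses paths that resolve
    outside the directory the tool was granted (e.g. '../etc/passwd').
    """
    base = Path(root).resolve()
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise PermissionError(f"path escapes sandbox: {user_path}")
    return candidate
```

Guards like this are deliberately boring: they are the same path-traversal defences we have applied to web upload handlers for decades, relocated to the tool-execution layer where AI agents now operate.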
AI introduces scale and complexity. It does not repeal fundamental security principles.
The real challenge is not that AI is uncontrollable. It’s whether security teams can adapt existing controls quickly enough, and influence developers early enough, to ensure that secure design becomes the norm rather than an afterthought.
If we focus on measurable exposure instead of headline panic, we can separate signal from noise – and build AI systems that are resilient, not reckless.
You can read my full analysis here.