The Uncomfortable Truth About AI Safety
Your AI vendor’s “enterprise-grade security” might be more marketing than engineering.
I know that sounds cynical. But new research from the International AI Safety Report 2026 shows sophisticated attackers can bypass AI safety measures roughly 50% of the time. In just 10 attempts.
That’s not a theoretical risk. That’s a tested, documented, published-in-peer-review reality.
The report, led by Turing Award winner Yoshua Bengio and backed by over 30 countries, represents the largest global collaboration on AI safety to date. It categorises risks into three areas: malicious use, malfunctions, and systemic risks.
The finding that should keep UK SME owners awake at night: “sophisticated attackers can bypass safeguards ~50% of the time in 10 attempts.”
Not script kiddies. Not hobby hackers. Sophisticated attackers.
What “Bypass Safeguards” Actually Means
When AI vendors talk about “safety guardrails,” they usually mean:
Content filtering (blocking harmful outputs)
Access controls (who can use the system)
Audit logging (tracking what happened)
The research shows these can be circumvented through techniques like:
Jailbreak prompts — Carefully constructed inputs that trick the model into ignoring its constraints
Indirect prompt injection — Embedding malicious instructions in data the AI processes (emails, documents, web pages)
Model extraction — Querying the system repeatedly to reconstruct its training data or proprietary algorithms
University of Florida researchers recently published a technique called Head-Masked Nullspace Steering (HMNS) that probes LLMs by silencing active internal components. It exposes vulnerabilities in safety defences for models from Meta and Alibaba.
This isn’t fringe research. This is mainstream academic work with reproducible results.
Why UK SMEs Should Care (More Than Most)
UK businesses face a specific regulatory environment. GDPR isn’t optional. The UK Online Safety Act just expanded to include AI chatbots. The EU AI Act compliance deadlines are approaching for anyone trading with Europe.
But beyond compliance, there’s a business reality:
If you’re using AI for customer service, you’re potentially exposing customer data to systems that can be manipulated.
If you’re using AI for content generation, you’re potentially publishing outputs that have been poisoned by indirect injection.
If you’re using AI for decision support, you’re potentially acting on recommendations from compromised systems.
The vendors aren’t necessarily lying. They might genuinely believe their systems are secure. But belief isn’t engineering.
The Evidence Dilemma
The AI Safety Report highlights what it calls an “evidence dilemma” — rapid general-purpose AI evolution outpaces risk data.
By the time researchers have tested a model’s vulnerabilities, there’s a new version. By the time security patches are released, new attack vectors have emerged.
This isn’t like traditional software security where vulnerabilities can be catalogued and patched. The attack surface is linguistic. It’s creative. It evolves as fast as human language does.
What to Actually Look For
So what should UK SMEs do? Stop using AI? Of course not.
But stop treating vendor security claims as gospel. Here’s a practical audit framework:
1. Ask for specifics
When a vendor says “enterprise-grade security,” ask exactly what that means. What standards? What certifications? What testing?
If they can’t answer in detail, that’s a flag.
2. Check for security research
Has the vendor published security research? Do they have a bug bounty programme? Do they engage with the academic security community?
Vendors with nothing to hide tend to be transparent.
3. Review your own data handling
What data are you sending to AI systems? Do you have a data classification policy? Are you sending PII, financial data, or proprietary information to third-party APIs?
The best security practice is: don’t send sensitive data to systems you don’t fully trust.
4. Implement defence in depth
Don’t rely on vendor security alone. Layer your own controls:
Access logging and monitoring
Output review processes for AI-generated content
Human-in-the-loop for critical decisions
Regular security audits of your AI workflows
5. Stay current
The security landscape changes weekly. Subscribe to AI security research. Follow researchers like those at the University of Florida, MIT, and the AI Safety Institutes.
The OpenAI and Microsoft commitment to the UK AI Security Institute’s Alignment Project (announced February 20, 2026) is a positive step. But it’s a starting point, not a solution.
The Real Cost of Complacency
48% of cybersecurity professionals now rank agentic AI as the top enterprise threat for 2026. Above deepfakes. Above ransomware. Above phishing.
Why? Because agentic AI doesn’t just process information. It acts on it. It makes decisions. It triggers workflows.
A compromised AI agent can do more damage than a compromised human account because it operates at machine speed and scale.
Shadow AI — unsanctioned tools employees bring in — already links to over one-third of data breaches via unmanaged data.
The risk isn’t theoretical. It’s happening now.
The Vendor Responsibility Gap
Here’s what frustrates me: vendors are rushing to market with agentic AI features, but the security infrastructure to protect them doesn’t exist yet.
We’re in a race condition. Adoption is accelerating. Security is lagging.
The vendors will say: “But we have security teams!” Yes. And those teams are doing their best. But the attack surface is expanding faster than the defence surface.
As a business owner, you can’t outsource responsibility for this. The liability sits with you.
Your 7-Day Security Audit
Day 1: Inventory
List every AI tool your business uses
Categorise by data sensitivity handled
Note which have access to customer data, financial data, or proprietary information
Day 2: Vendor Review
Check security documentation for each vendor
Look for SOC 2, ISO 27001, or equivalent certifications
Search for “[vendor name] security vulnerability” and see what comes up
Day 3: Data Flow
Map what data goes where
Identify any PII or sensitive data flowing to third-party AI
Mark high-risk flows for immediate review
Day 4: Access Control
Review who has access to which AI tools
Check if offboarded employees still have access
Implement MFA where not already present
Day 5: Output Review
For AI-generated customer-facing content, implement review workflow
Document who reviews, what they check for, and how they sign off
Day 6: Incident Response
Draft a simple incident response plan for AI security issues
Who do you call? What do you shut down? How do you communicate?
Day 7: Policy Draft
Write a one-page AI usage policy for your team
What tools are approved? What data can they handle? What’s prohibited?
The Bottom Line
AI security isn’t about finding the perfect vendor. That vendor doesn’t exist.
It’s about:
Knowing the risks
Implementing layered defences
Staying current with the threat landscape
Building organisational resilience
The International AI Safety Report isn’t saying “don’t use AI.” It’s saying “use AI with your eyes open.”
For UK SMEs, that’s particularly important. Our regulatory environment demands accountability. Our customers deserve protection. Our businesses need resilience.
Trust your vendors. But verify. And build your own defences.


