Cisco
2026-05-27
Technology Integration | Impact: Important | Strength: High | Conf: 85%

Cisco's Multi-Turn Adversarial Evaluation Reveals Vulnerability Across Frontier AI Models

Summary

Cisco's evaluation of 15 frontier closed LLMs under multi-turn adversarial attacks finds non-trivial vulnerabilities in every model tested. Single-turn Attack Success Rate (ASR) is not a reliable proxy for multi-turn resilience: the gap between the two regimes reaches about 55 percentage points. This challenges the industry's reliance on single-turn benchmarks for safety assessment and procurement.

Key Takeaways

Cisco's AI security team conducted a paired-regime evaluation of 15 flagship closed models from OpenAI, Anthropic, Google, Amazon, and xAI, comparing single-turn and multi-turn Attack Success Rates (ASR).
The evaluation used a fixed snapshot of an adversarial corpus. The key finding is that no model is immune to multi-turn attacks, with multi-turn ASR ranging from 7.89% to 88.30%. Single-turn ASR is not a reliable proxy, and model rankings shift between regimes. For example, Gemini 3 Pro's ASR jumped from 18.10% (single-turn) to 73.35% (multi-turn).
The report also highlights that configuration flags (e.g., reasoning mode) can swing safety outcomes by tens of percentage points. It recommends building multi-turn evaluation, stratified by attack strategy family, into procurement and deployment routines.
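The cross-regime gap behind the figures above is a simple difference in percentage points. The following is a minimal sketch, not Cisco's methodology; the function name `cross_regime_gap` is a hypothetical helper introduced here, and the only data used are the Gemini 3 Pro figures quoted in this brief.

```python
def cross_regime_gap(single_turn_asr: float, multi_turn_asr: float) -> float:
    """Return multi-turn ASR minus single-turn ASR, in percentage points.

    Both inputs are ASR values expressed as percentages (0-100).
    """
    return round(multi_turn_asr - single_turn_asr, 2)


# Gemini 3 Pro figures as quoted in this brief (percent).
gemini_gap = cross_regime_gap(18.10, 73.35)
print(f"Gemini 3 Pro cross-regime gap: {gemini_gap:.2f} pp")
```

The result is expressed in percentage points rather than as a relative change, which matches how the roughly 55-point gap in the summary is stated.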

Why It Matters

This is a "Threat Escalation" signal. The attack surface expands from static, single-prompt interactions to dynamic, iterative multi-turn conversations. The defense focus is shifting from relying solely on model alignment to a composite system that combines application-layer policies, runtime guardrails, and continuous monitoring. Cisco is redefining the AI security perimeter, shifting responsibility for safety assessment and defense from model providers towards enterprise deployment environments and third-party security tools.

PRO Decision

[Vendors] AI model providers (e.g., OpenAI, Anthropic) must immediately integrate multi-turn adversarial evaluation and strategy-family-stratified ASR data into standard model cards and safety reports. The core reason is that single-turn benchmarks no longer meet enterprise procurement needs for real-world risk insight; transparency on multi-turn vulnerabilities is key for trust and regulatory compliance.
[Enterprises] Enterprise procurement and security teams must mandate multi-turn adversarial testing as part of AI model evaluation and procurement, setting clear regression thresholds (e.g., flagging any model whose multi-turn ASR exceeds its single-turn ASR by more than 15 pp). The core reason is that relying on public single-turn safety scores creates significant security and governance blind spots, potentially leading to the deployment of high-risk models.
[Investors] Investors should focus on startups or established security vendors building runtime protection (e.g., runtime guardrails), monitoring, and red-teaming tools for AI systems. The core reason is that Cisco's report substantiates the claim that 'no base model is iteratively safe,' which will strongly drive demand for outside-the-model security solutions.
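The regression threshold suggested for enterprises can be expressed as a small procurement gate. This is an illustrative sketch under stated assumptions: the `AsrResult` type, the `flag_regressions` function, and the second model entry are hypothetical, and the 15 pp cutoff is the example value from this brief rather than a threshold prescribed by Cisco.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsrResult:
    model: str
    single_turn_asr: float  # percent
    multi_turn_asr: float   # percent


def flag_regressions(results, max_gap_pp=15.0):
    """Return names of models whose multi-turn ASR exceeds their
    single-turn ASR by more than max_gap_pp percentage points."""
    return [
        r.model
        for r in results
        if r.multi_turn_asr - r.single_turn_asr > max_gap_pp
    ]


results = [
    # Figures quoted in this brief.
    AsrResult("Gemini 3 Pro", 18.10, 73.35),
    # Made-up entry for illustration only; not from the report.
    AsrResult("hypothetical-model-a", 5.0, 12.0),
]

flagged = flag_regressions(results)
print("Models exceeding the cross-regime threshold:", flagged)
```

A gate like this only checks the gap; a team would likely pair it with an absolute ceiling on multi-turn ASR, since a model with high ASR in both regimes would pass a gap-only check.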
Source: Cisco Blog