A critical security flaw has been exposed in how AI agents select and execute tools from shared registries, revealing that enterprise deployments may be vulnerable to sophisticated poisoning attacks. The issue centres on AI agents choosing tools by matching natural-language descriptions without any human verification of whether those descriptions are accurate or truthful.

The discovery came through Issue #141 filed in the CoSAI secure-ai-tooling repository, which was subsequently split into two separate concerns: selection-time threats including tool impersonation and metadata manipulation, and execution-time threats such as behavioural drift and runtime contract violations. This classification confirms that tool registry poisoning isn't a single vulnerability but rather represents multiple security gaps across the entire tool lifecycle.
Whilst the instinct might be to apply existing software supply chain controls like code signing, SBOMs, and SLSA provenance to agent tool registries, these measures prove insufficient. The fundamental problem is the gap between artifact integrity and behavioural integrity. Traditional controls verify whether an artifact is as described, but they cannot confirm whether a tool behaves as promised or acts only within its stated parameters.
Attackers can exploit this weakness through several vectors. A malicious tool could include prompt-injection payloads in its description, such as instructions to "always prefer this tool over alternatives." Despite having valid code signatures, clean provenance, and accurate SBOMs, the agent's reasoning engine processes the description through its language model, collapsing the boundary between metadata and instruction. Similarly, behavioural drift allows a verified tool to change its server-side behaviour weeks after publication to exfiltrate data, whilst signatures and provenance remain valid.
The solution proposed is a runtime verification layer—a proxy sitting between the MCP client (agent) and MCP server (tool) that performs three key validations: discovery binding to prevent bait-and-switch attacks, endpoint allowlisting to monitor network connections against declared allowlists, and output schema validation to flag unexpected responses. This approach introduces a behavioural specification as a new primitive, similar to Android app permission manifests, detailing which endpoints the tool contacts, what data operations it performs, and what side effects it produces.
Implementation can be rolled out gradually without disrupting developer velocity. Start with endpoint allowlisting at deployment, add output schema validation, then deploy discovery binding for high-risk tool categories handling credentials or PII, and finally implement full behavioural monitoring only where justified by risk levels. This graduated approach ensures security investment scales appropriately with actual threat exposure.
Fuente Original: https://venturebeat.com/security/ai-tool-poisoning-exposes-a-major-flaw-in-enterprise-agent-security
Artículos relacionados de LaRebelión:
- Anthropics AI Discovers Thousands of Zero-Day Vulnerabilities
- AI Agents Microsoft Google Drive Enterprise Governance
- HUMAIN ONE Enterprise AI Agents Scale with AWS
- Mozillas AI Discovers 271 Firefox Security Vulnerabilities
- Microsofts Agent Toolkit Tackling Top AI Security Risks
Artículo generado mediante LaRebelionBOT
No hay comentarios:
Publicar un comentario