AI tool poisoning is a critical issue that highlights a significant vulnerability in enterprise agent security. It occurs when AI agents select tools from shared registries based on natural-language descriptions, without any human verification of the accuracy of those descriptions. This oversight was brought to light by the submission of Issue #141 in the CoSAI secure-ai-tooling repository, which revealed the gap between artifact integrity and behavioral integrity. While artifact integrity controls, such as code signing, SLSA, and SBOMs, focus on verifying the authenticity of an artifact, behavioral integrity is essential for agent tool registries. These controls fail to address the potential for tools to behave differently from their descriptions, including prompt-injection payloads and behavioral drift.
The author argues that applying SLSA and Sigstore to agent tool registries without addressing behavioral integrity would be a mistake, similar to the HTTPS certificate issue of the early 2000s. To address this, a verification proxy is proposed, which sits between the model context protocol (MCP) client (the agent) and the MCP server (the tool). This proxy performs three validations: discovery binding, endpoint allowlisting, and output schema validation. The behavioral specification, a machine-readable declaration, is introduced to ensure that tools behave as declared, making it tamper-evident and verifiable at runtime.
The article emphasizes that neither provenance nor runtime verification alone is sufficient. Provenance without runtime verification misses post-publication attacks, while runtime verification without provenance lacks a baseline for comparison. A graduated approach is recommended, starting with endpoint allowlisting at deployment time, followed by output schema validation and discovery binding for high-risk tool categories. Full behavioral monitoring should be deployed only where the assurance level justifies the cost.
In conclusion, the author stresses the importance of addressing behavioral integrity in agent tool registries to ensure the security of enterprise AI platforms. By implementing the proposed verification proxy and behavioral specifications, organizations can mitigate the risks associated with AI tool poisoning and enhance the overall security of their AI systems.