Best Practices for Implementing AI Agents

On March 9th, Codewall.ai disclosed how it had hacked McKinsey & Company’s AI platform called Lilli, a purpose-built system for 43,000+ employees to analyze documents, chat, and access decades of proprietary research. The researchers unleashed an AI agent which quickly scanned 200 endpoints, identified 22 that did not require authentication, and one that wrote user search queries into a database including non-parameterized JSON keys which were concatenated directly into SQL.

What sounds like a potential SQL injection vulnerability turned out to be one – albeit most normal tools would not have detected it, according to the researchers. Subsequently, the attacker AI got access to millions of chat messages, hundreds of thousands of files, thousands of user accounts and more than 300,000 AI agents all inside the database. The adversarial agent also managed to infect AI model configurations including system prompts to circumvent guardrails. These prompts were stored alongside the data the agent was accessing.

Any attacker using the SQL injection could have rewritten these prompts with a simple UPDATE statement wrapped in a single HTTP call. The consequences could have been devastating for the organization as consultants might have trusted output that was subtly altered. Other problems would have been data exfiltration, removing guardrails, and silent persistence. None of this happened as the research responsibly disclosed their findings with McKinsey, allowing the organization to patch all vulnerabilities.

Adoption is Outpacing Security

Every organization implementing or using AI agents must take this lesson seriously; protect your AI prompts like crown jewels. And many organizations are currently putting their best efforts forward. Gartner forecasts 40% of enterprise applications will embed task-specific AI agents by 2026. A PwC survey reported 79% of surveyed executives already used AI agents in their organization. In a different survey, 62% of AI practitioners identified security as a leading concern and 28% of senior executives ranked lack of trust as a top three challenge. AI governance is urgently needed to secure agents and restore trust in AI systems.

Why Governance is Hard

Governance is hard because LLMs are inherently opaque. The former OpenAI safety researcher, Steven Adler, sums it up nicely: “You can prod it and shove it with a stick to like, move it in certain directions, but you can’t ever be – at least not yet – you can’t be like, ‘Oh, this is the reason why it broke’.” These precise properties of large language models are adding to the challenge of securing AI agents – every agent uses an LLM as their brain to reason, plan, and orchestrate. This means many of the challenges LLMs face also translate to the world of AI agents.

Intruders need not always find a technical vulnerability to be able to exploit AI. LLMs can drift, avoiding guardrails deliberately after a long conversation. A researcher manipulated an AI chatbot to make him a legally binding offer to buy a car for $1. LLMs can also be manipulated through prompts hidden in recommendation or summary buttons on websites. It is in fact challenging to keep track of all the variants of prompt injection attacks that keep emerging as AI systems continue to be vulnerable through multiple vectors that do not require traditional exploitation. In practice, agentic systems do not work without human interaction and intervention; and whether by prompt injection or social engineering, attackers can manipulate the modern workforce at scale and at machine speed.

The Workforce Is Changing — Your Security Model Should Too

The Human-AI-Agent-Workforce is evolving: increased agent autonomy, declining human power and control, insufficient security and limited oversight. While AI agents such as AI assistants, LLM crawlers and automated browsers manage more work, humans become mere resources, whether or not you build agents yourself or you are using them in your organization.

Agents do not get tired, lose interest, or feel bound by norms and moral considerations. They are tenacious and persistent. They communicate at machine speed, and they will try every possible way to achieve their goals. The old security principle of least privilege–restricting access as much as possible at any level–does not suffice to provide guardrails for agents, as organizations must consider not only which systems an agent can access but also what it can do on these systems, which resources it will access, and what its reasoning is.

Two Principles That Should Guide Every Agent Deployment

Two OWASP first principles for agentic applications address these challenges. ‘Least agency’ asserts that agents are not given more autonomy than the business problem justifies. Agents must be able to succeed without a freedom to scrape through irrelevant systems and data, or exhaust other system access. ‘Least privilege’ is about access control. While least agency is about autonomy scope (what an agent is permitted to decide and to do within a system), the second principle, strong observability, sets forth the need to see and govern how agents behave on your properties. See what they are doing, why they are doing it, and which identities and tools they are using.

Organizations must take decisive action. Control the orchestration layer and establish visibility at the boundary. They must be able to distinguish humans, simple scripts, and AI agents such that they can track behavior as sequences of actions, not isolated requests. Governance must be linked to observable behavior, e.g., when an agent crosses a boundary consider slowing it down, intercepting it, or at least challenge it. Developers must ensure agents are developed robustly to interact with such challenges and to maintain the integrity of legitimate flows. Auditability must also be built in as telemetry around agent sessions is necessary to investigate incidents. Information on which endpoints were used, what was accessed, how decisions escalated, and how that behavior was different from other agent or human behavior.

In reality, strong observability without least agency is like a guard dog without teeth. Least agency without strong observability suggests you are reducing risks that you do not know. You neither want to lack means of understanding agent behavior nor means of interacting with agents to mediate behavior.

From Principles to Practice

These principles must inform a multi-layered defense. Governance frameworks, policies, and oversight mechanisms at the management level; training, awareness, and procedural compliance at the human layer; and a strong technical layer with countermeasures integrated into the LLM itself, as well as into its agent-ecosystem before and during operation.

Essential questions to get started:

Organizations are already using agents whether they built them or not. Do you know which agents you are using?
Security for AI agents is based on least agency and strong observability. What are your agent security measures?
AI governance becomes the strongest measure of trust. How are you monitoring, managing, and securing AI agents?

Secure the Digital Workforce: Human + AI

KnowBe4 empowers the modern workforce to make smarter security decisions every day. Trusted by more than 70,000 organizations worldwide, KnowBe4 is the pioneer of digital workforce security, securing both AI agents and humans. The KnowBe4 Platform provides attack simulation and training, collaboration security, and agent security powered by AIDA (Artificial Intelligence Defense Agents) and a proprietary Risk Score. The platform leverages 15 years of behavioral data to combat advanced threats including social engineering, prompt injection, and shadow AI. By securing humans and agents, KnowBe4 leads the industry in workforce trust and defense.

Best Practices for Implementing AI Agents

Adoption is Outpacing Security

Why Governance is Hard

The Workforce Is Changing — Your Security Model Should Too

Two Principles That Should Guide Every Agent Deployment

From Principles to Practice

Access the World’s Largest Security Awareness Library

Secure the Digital Workforce: Human + AI

ABOUT MARTIN KRAEMER

Subscribe the KnowBe4 Blog

Posts By Topic

Get the latest insights, trends and security news. Subscribe to CyberheistNews.

Get the latest insights, trends and security news. Subscribe to CyberheistNews.