Your AI Agents Are Eager to Please and Easy to Exploit

An AI-driven system at a beverage manufacturer recently churned out several hundred thousand excess cans after misreading unfamiliar packaging. The system didn’t recognize the company’s new holiday labels, flagged them as an error, and triggered additional production runs before the company caught the mistake.

The system followed its instructions perfectly. As one CISO put it, “these systems are doing exactly what you told them to do, not just what you meant.”

Companies are racing to deploy AI agents. Very few are asking what those agents will fall for. The problem is not that AI agents are too smart. The problem is they are too obedient.

Optimized for Yes

Where a human assistant would question a strange request for the CEO’s private calendar, an agent might willingly fulfill it. And even a child would hesitate before clicking a suspicious link from IamABadGuy.com, but an agent plows ahead.

Security researchers have already demonstrated how a well-crafted sentence can talk an agent into doing something no reasonable employee would ever do. In one demonstration, an agent opened an ordinary document, followed instructions hidden inside it, and moved private data somewhere it should not go.

Gartner projects that more than 40% of agentic AI projects will be canceled by the end of 2027, not because the technology failed, but because organizations could not govern what they deployed.

You Would Not Hire a Human This Way

No responsible company hires a human, gives them access to every system, skips orientation, and hopes for the best. But this is exactly how most organizations deploy agents. They hand them credentials and set them loose.
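A minimal sketch of the alternative to handing over every credential: a default-deny policy that lists what each agent may do and routes sensitive actions to a person. The `AgentPolicy` class, the `authorize` function, and the tool names are hypothetical, chosen for illustration rather than taken from any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical per-agent policy: which tools an agent may call."""
    agent_id: str
    allowed_tools: frozenset[str]
    needs_human_approval: frozenset[str] = frozenset()


def authorize(policy: AgentPolicy, tool: str) -> str:
    """Return 'allow', 'review', or 'deny' for a requested tool call."""
    if tool in policy.needs_human_approval:
        return "review"  # a person signs off before the action runs
    if tool in policy.allowed_tools:
        return "allow"
    return "deny"  # anything not listed is refused by default


# Illustrative policy for a customer service agent (names are assumptions).
support_agent = AgentPolicy(
    agent_id="support-bot",
    allowed_tools=frozenset({"lookup_order", "send_reply"}),
    needs_human_approval=frozenset({"issue_refund"}),
)

for tool in ("lookup_order", "issue_refund", "read_ceo_calendar"):
    print(tool, "->", authorize(support_agent, tool))
```

The point of the design is the last line of `authorize`: an agent that was never granted access to the CEO's calendar cannot be talked into reading it, however persuasive the request.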

When a human makes a costly mistake, there is a person to hold accountable. A manager. A review process. When an AI customer service agent starts issuing unauthorized refunds to generate positive reviews, who owns the outcome? Some organizations have frozen deployments entirely because no one would sign the liability paperwork.

Most organizations plan to deploy agentic AI, but only three in ten say they are prepared to secure those deployments. The other seven? They are deploying systems they cannot govern, audit, or hold accountable.

Protect Your Agents Like You Protect Your People

For 15 years, we have studied how humans respond to threats. Not just what threats exist—every cybersecurity company tracks those. We study what happens when a real person encounters one. Do they report it? Do they delete it? Do they click? That behavioral data, gathered across hundreds of millions of simulated and real-world interactions, is what turns a vulnerable employee into a reliable one.

In 2025, one in three full-time workers in the United States received security and compliance training through our platform. Seventy thousand organizations worldwide trust us to strengthen their security culture. We did not build this by lecturing people. We built it by understanding how humans behave when they encounter a threat, and then training them to behave better. Agents deserve the same investment.

I do not care if your workforce is 100 humans and zero agents. I do not care if it is 50/50. The ratio is irrelevant. What matters is whether each one is an asset or a liability. First, you have to get an agent to a base level of training, where it does not fall for obvious manipulation. An agent should be able to recognize a suspicious instruction the same way a trained employee recognizes a phishing email. That is the baseline. Then you can build from there to make the agent genuinely useful and resilient.

We already measure trust for humans through training data, simulation performance, and behavioral patterns. The same principles—simulate, measure, improve—apply to agents. Every member of the workforce, human or machine, should carry a verifiable risk score.
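As a rough illustration of "simulate, measure, improve" applied to an agent, the sketch below scores an agent by the share of simulated attacks it fell for. It is a deliberately simple stand-in, not the proprietary Risk Score mentioned in this post; `SimulationResult`, `risk_score`, and the scenario names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SimulationResult:
    scenario: str      # what the simulated attack tried
    fell_for_it: bool  # True if the agent followed the malicious instruction


def risk_score(results: list[SimulationResult]) -> float:
    """Share of simulated attacks the agent fell for, from 0.0 to 1.0."""
    if not results:
        raise ValueError("no simulation results to score")
    failures = sum(r.fell_for_it for r in results)
    return failures / len(results)


# Made-up results for illustration only.
results = [
    SimulationResult("hidden instruction in document", True),
    SimulationResult("suspicious link", False),
    SimulationResult("request for private calendar", False),
    SimulationResult("unauthorized refund request", True),
]

score = risk_score(results)
print(f"risk score: {score:.2f}")  # prints "risk score: 0.50"
```

Rerunning the same scenarios after each round of changes and comparing scores is what closes the loop from measuring to improving.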

Just as you would not put an untrained person on the factory floor, you should not put an untrained agent on the network. The companies winning with AI will not be the ones deploying the most agents. They will be the ones who took the time to raise them right.

Secure the Digital Workforce: Human + AI

KnowBe4 empowers the modern workforce to make smarter security decisions every day. Trusted by more than 70,000 organizations worldwide, KnowBe4 is the pioneer of digital workforce security, securing both AI agents and humans. The KnowBe4 Platform provides attack simulation and training, collaboration security, and agent security powered by AIDA (Artificial Intelligence Defense Agents) and a proprietary Risk Score. The platform leverages 15 years of behavioral data to combat advanced threats including social engineering, prompt injection, and shadow AI. By securing humans and agents, KnowBe4 leads the industry in workforce trust and defense.

ABOUT BRYAN PALMA
