What AI Can’t Hide When It Writes a Phishing Email

Lead Analysts: Jeewan Singh Jalal, Prabhakaran Ravichandhiran, and Shikhar Dalela

Phishing has always been a game of impersonation. But for decades, the tell was in the details: a misspelled word here, an awkward sentence there, a logo that was just slightly off. Security awareness training built an entire doctrine around those cues. Spot the typo, avoid the trap.

That playbook is now obsolete.

KnowBe4's latest Phishing Trends Report found that 86% of phishing attacks observed in the last six months involved some level of AI assistance. The emails arriving in your employees' inboxes today are polished, correctly branded, grammatically flawless, and highly personalized.

But here's what threat actors didn't account for: AI leaves fingerprints. And KnowBe4 Threat Lab analysts know what to look for.

Campaign Summary

Vector and type: Email Phishing
Techniques: AI-assisted content generation, no-code phishing infrastructure, anti-analysis evasion
Bypassed SEG detection: Yes
Targets: Organizations globally across multiple sectors
Unique Finding: Identifiable LLM artifacts across all campaign stages (from email body to phish page source code)

The Threat Has Changed. Your Training Hasn't.

Before diving into the technical indicators, it's worth grounding the scale of what we're seeing. Phishing powered by AI isn't a niche or emerging threat.

It’s the new baseline.

Industry data supports what KnowBe4 analysts are seeing in the wild. AI-generated phishing now achieves click-through rates of roughly 54%, compared to 12% for traditionally crafted campaigns. Cisco Talos' Q1 2026 IR Trends report found that phishing has surged back to become the #1 vector for initial access, overtaking vulnerability exploitation. Approximately 2-4 billion phishing emails are sent out globally every single day.

The attack volume is only part of the story. What KnowBe4 Threat Lab set out to document isn't just that AI is being used.

It’s about how it's being used, what it leaves behind, and what that means for defenders.

Finding 1: Identifying AI Traces and Fingerprints

One of the most revealing artifacts our analysts identified appeared in the opening lines of an AI-generated extortion email. The message opened with: "Here is the message formatted and divided into sections:"

Example of a real-world phishing attempt that includes the LLM output preamble appearing as the second line of the email

When a threat actor prompts a language model to write a formatted phishing email, the model often prefaces its response with a structural summary of what it is about to produce. In this case, the attacker sent the email without removing that line first.

To a recipient, it reads as noise. To an analyst, it is a direct signature of AI generation.
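A preamble like the one quoted above can be checked for mechanically. The following is a minimal sketch, not KnowBe4's detection logic: the function name, the patterns, and the five-line window are all assumptions chosen for illustration, and only the first pattern is modelled on the line observed in this campaign.

```python
import re

# Hypothetical patterns. Only the first is modelled on the preamble quoted above;
# the second is an assumed variant and would need tuning against real mail.
PREAMBLE_PATTERNS = [
    re.compile(r"^here\s+is\s+the\s+(message|email|text)\b.*:$", re.IGNORECASE),
    re.compile(r"^(sure|certainly)[,!.]?\s+here(\s+is|'s)\b.*:$", re.IGNORECASE),
]


def find_llm_preamble(body: str, max_lines: int = 5):
    """Return (line_number, text) for the first preamble-like line near the top, else None."""
    seen = 0
    for number, raw in enumerate(body.splitlines(), start=1):
        # Drop surrounding whitespace and straight or curly double quotes.
        text = raw.strip().strip('"\u201c\u201d')
        if not text:
            continue
        seen += 1
        if seen > max_lines:
            break  # only the first few non-empty lines are inspected
        if any(pattern.match(text) for pattern in PREAMBLE_PATTERNS):
            return number, text
    return None
```

Restricting the check to the first few non-empty lines keeps the rule narrow, since a sentence ending in a colon is common deeper in legitimate mail.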

This wasn't an isolated slip. Across the same campaign, analysts identified two additional obfuscation techniques appearing in combination with the preamble: hidden anti-fingerprinting noise tokens (invisible characters injected into the email body to disrupt hash-based detection) and systematic Unicode homoglyph substitution (replacing standard characters with visually identical Unicode equivalents to defeat text-pattern matching).

Raw email source showing hidden anti-fingerprinting noise tokens injected throughout the body

Unicode homoglyph substitutions decoded

In isolation, each technique is documented and well-understood. Appearing together in an email that opens with an LLM generation preamble, they point to a coordinated AI-assisted pipeline.
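Both obfuscation techniques can be surfaced with the standard library alone. This is a rough sketch under stated assumptions: treating every Unicode format character (category Cf) as suspicious and flagging words that mix Latin with another script are heuristics picked for illustration, and both will produce false positives on legitimate multilingual mail.

```python
import unicodedata


def count_invisible(text: str) -> int:
    """Count Unicode format characters (category Cf), e.g. zero-width spaces and soft hyphens."""
    return sum(1 for ch in text if unicodedata.category(ch) == "Cf")


def mixed_script_words(text: str) -> list[str]:
    """Return words that combine Latin letters with letters from another script."""
    flagged = []
    for word in text.split():
        scripts = set()
        for ch in word:
            if ch.isalpha():
                # The first token of the character name is the script,
                # e.g. "LATIN", "CYRILLIC", "GREEK".
                scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
        if "LATIN" in scripts and len(scripts) > 1:
            flagged.append(word)
    return flagged
```

A word such as "password" written with a Cyrillic "а" would be returned by `mixed_script_words`, while a word written entirely in Cyrillic would not, which keeps ordinary non-English text out of the results.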

What this means for your team: Anti-spam filters trained on traditional signal patterns are not tuned to catch structural AI artifacts. Detection logic needs to evolve alongside attack methodology.

Finding 2: No-Code Phishing Pages

The second major finding doesn't require any coding expertise from the attacker. At all.

A pixel-perfect Microsoft 365 sign-in page hosted on enthusiastic-secure-link-go[.]base44[.]app. Every automated trust signal returns green.

Our analysts tracked credential harvesting pages built and hosted entirely on no-code platforms (examples include Base44, Beacons[.]ai, and Retool). The lure in this case was a pixel-perfect Microsoft 365 password-expiry notification. The credential harvesting page was indistinguishable from a genuine Microsoft login, with correct fonts, layout, branding, and flow.

The attacker built it with a text prompt and a free account.

A Meta Business Help Center impersonation page built and hosted on Beacons[.]ai (another platform used for credential harvesting)

A SharePoint document-sharing lure hosted on Retool (the platform’s own branding appears in the corner while the page harvests credentials)

The operational implication is significant: by hosting on legitimate, established platforms, attackers inherit those platforms' domain reputation. Email gateways pass the message. URL scanners return clean. Every automated heuristic signals green, because the infrastructure is legitimate – it's just being weaponized.

This isn't a gap in a specific vendor's detection capability. It's a structural weakness in reputation-based filtering as a category. When the attacker's infrastructure is built on the same trusted services your employees use every day, the signal is indistinguishable from the noise.
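Because reputation alone says little here, one option is to treat "hosted on a no-code app platform" as its own signal and combine it with content inspection. The sketch below is illustrative only: the suffix list is an assumption built from the platforms named in this report (the Retool suffix in particular is assumed rather than taken from an observed URL), and a real deployment would maintain its own list.

```python
from urllib.parse import urlsplit

# Assumed hosting suffixes for the platforms named in this report.
NO_CODE_SUFFIXES = ("base44.app", "beacons.ai", "retool.com")


def hosted_on_no_code_platform(url: str) -> bool:
    """True when the URL's hostname is one of the listed platforms or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    return any(host == suffix or host.endswith("." + suffix) for suffix in NO_CODE_SUFFIXES)
```

A match is not a verdict, since these platforms host plenty of legitimate apps. It is more useful as a reason to look harder at a page that also presents a login form for an unrelated brand.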

What this means for your team: Reputation-based URL filtering is no longer sufficient as a primary control for phishing defense. User behavior becomes a critical line of defense.

Finding 3: The Code Comments That Confessed Everything

In cybersecurity, a threat actor commenting on their own malicious code is roughly equivalent to a burglar leaving a signed note explaining how they picked the lock.

And yet, that's exactly what our analysts found.

A Google Cloud Platform billing phish (the dark-mode and emoji usage are indicators of LLM-generated content)

Following the call-to-action in a high-fidelity Google Cloud Platform billing phish, analysts landed on a credential harvesting page accompanied by a suite of linked JavaScript files. Those files were riddled with verbose, multi-lingual comments explicitly documenting the attack's logic.

Anti-bot detection JavaScript from the GCP phishing page (LLM-generated code, complete with Chinese language documentation of its own evasion logic)

One comment block, translated from Chinese, described the script’s purpose in full: anti-bot detection, “Zero False Positive Level.” That phrasing didn’t come from a developer. It came directly from a prompt. LLMs document what they’re asked to do, and this one did exactly that.

The same file included a block labeled “MERGED LOGIC,” signaling the model had been instructed to combine multiple source behaviors. Another comment spelled out the attacker’s Telegram exfiltration endpoint in plain English. The model built the trap and then labeled every part of it.

Two additional LLM indicators appeared consistently across this and related campaigns:

Emoji artifacts: AI uses emojis at a significantly higher rate than human developers, particularly in interfaces and code comments. The GCP phishing page was visually distinguished by emoji-heavy design choices and had at least one low-visibility text element suggesting the model failed to correctly assess background contrast for the text it inserted.

Over-engineered CSS: A single anchor tag in one campaign email contained 19 individual CSS properties — including orphans:2, widows:2, and font-variant-ligatures:normal, which are print-layout typography controls with no meaningful application in an email. LLMs produce "technically complete" code based on training data patterns, without the contextual judgment to know which properties are appropriate. The result is code that works but reads as obviously non-human in structure.
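The CSS observation lends itself to a simple check on inline style attributes. This is a sketch, assuming a naive semicolon split is good enough for email markup; the property set and the threshold of 12 are illustrative choices, with only orphans and widows taken from the campaign described above.

```python
# Properties that control paged or print layout. orphans and widows come from the
# campaign above; the page-break entries are assumed additions.
PRINT_LAYOUT_PROPS = {
    "orphans", "widows",
    "page-break-before", "page-break-after", "page-break-inside",
}


def parse_inline_style(style: str) -> dict[str, str]:
    """Split an inline style attribute into a {property: value} mapping."""
    props = {}
    for declaration in style.split(";"):
        name, sep, value = declaration.partition(":")
        if sep and name.strip():
            props[name.strip().lower()] = value.strip()
    return props


def style_anomalies(style: str, max_props: int = 12) -> dict:
    """Summarise how unusual an inline style looks for an email link."""
    props = parse_inline_style(style)
    return {
        "property_count": len(props),
        "print_layout": sorted(PRINT_LAYOUT_PROPS.intersection(props)),
        "over_threshold": len(props) > max_props,
    }
```

Run against the style attribute of each anchor tag, the 19-property link described above would exceed the assumed threshold and report both orphans and widows.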

What this means for your team: AI-generated attack code is self-annotating and structurally distinctive. Organizations investing in threat intelligence and incident response capabilities may find LLM artifacts useful for attribution and campaign clustering.

Finding 4: Fake Conversation Threads as Anti-Scanner Padding

The fourth technique our analysts documented is one of the more counterintuitive: AI-generated filler content inserted into phishing emails, hidden from recipients with CSS but still present in the markup that scanners read.

The Walmart reward lure as seen by the recipient (two large blocks of AI-generated conversation are hidden beneath using CSS)

In a Walmart reward-claim phishing campaign, analysts found two large hidden content blocks containing fabricated office lunch-order conversations. The blocks were concealed with display:none, overflow:hidden, max-height:0, and color:transparent.

The purpose is to pad the email with benign-seeming contextual text so heuristic scanners see an email full of normal-seeming language rather than a sparse lure with a suspicious link.
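One way to counter this padding is to extract exactly the text a recipient would never see and inspect it separately. The sketch below uses Python's built-in HTML parser and assumes reasonably well-formed markup with inline styles; the class name, the regular expression, and the list of void tags are illustrative, and styles applied through a stylesheet would not be caught.

```python
import re
from html.parser import HTMLParser

# Inline-style patterns matching three of the hiding tricks listed above.
HIDING = re.compile(
    r"display\s*:\s*none|max-height\s*:\s*0(?![\d.])|color\s*:\s*transparent",
    re.IGNORECASE,
)
# Tags with no closing tag; they must not affect the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input", "link", "meta", "source", "wbr"}


class HiddenTextCollector(HTMLParser):
    """Collect text that sits inside elements hidden by inline CSS."""

    def __init__(self):
        super().__init__()
        self.depth = 0  # greater than 0 while inside a hidden element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        if self.depth or HIDING.search(style):
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def hidden_text(html: str) -> str:
    """Return the concatenated text of all CSS-hidden elements."""
    parser = HiddenTextCollector()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

A large ratio of hidden text to visible text is a reasonable thing to alert on, and the extracted text can be passed to further checks rather than being allowed to dilute the score of the visible lure.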

The hidden conversation thread extracted from the Walmart phish (note the <recipient_email> placeholder repeated throughout, the evenly spaced timestamps, and the division calculation for leftover food portions)

Several AI generation artifacts appeared clearly in the hidden text:

Perfect-interval timestamps. The fabricated email chain showed four messages spaced at near-uniform gaps of 6 to 8 minutes (5:08, 5:14, 5:21, 5:29). Humans do not schedule casual dinner conversations on a precise timer.
SMS-style dialogue in email format. The content was classic rapid-fire messaging about takeout orders — the type of exchange that happens over text, not email. LLMs default to this pattern when prompted to "generate a casual conversation," because their training data includes SMS and chat logs.
Identity anchoring. The recipient's email address appeared 15 times in the hidden thread, with both sides of the fabricated exchange addressing each other by full email address. The attacker had apparently instructed the model to anchor the message to the recipient's identity to improve relevance scoring. Humans do not address each other by their full email address in dinner plans.
Mathematical slop. The conversation included a literal division calculation for dividing leftovers. Classic AI over-engineering.
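The timestamp artifact in particular can be measured rather than eyeballed. As a small sketch, assuming timestamps have already been extracted as hour:minute strings from a single day, the gaps between consecutive messages can be compared; the function names and the two-minute spread threshold are assumptions for illustration.

```python
from datetime import datetime


def gaps_in_minutes(stamps: list[str]) -> list[float]:
    """Return the gap in minutes between each pair of consecutive H:MM timestamps."""
    times = [datetime.strptime(stamp, "%H:%M") for stamp in stamps]
    return [(later - earlier).total_seconds() / 60 for earlier, later in zip(times, times[1:])]


def looks_metronomic(stamps: list[str], max_spread: float = 2.0) -> bool:
    """True when at least three gaps exist and they differ by no more than max_spread minutes."""
    gaps = gaps_in_minutes(stamps)
    return len(gaps) >= 3 and max(gaps) - min(gaps) <= max_spread


# The four timestamps from the hidden Walmart thread.
walmart_thread = ["5:08", "5:14", "5:21", "5:29"]
suspicious = looks_metronomic(walmart_thread)
```

Real threads tend to show a wide spread, with replies arriving seconds apart and then hours apart, so a narrow spread across several messages is a weak but cheap signal.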

What this means for your team: The "look for bad grammar" heuristic in user training is insufficient. Employees also need to be trained on URL verification and out-of-band confirmation as the primary trust signals.

Indicators of Compromise

All indicators are defanged. Refang them (replace [.] with .) before importing into threat intelligence platforms.

Type	Value	Role
Domain	akusu[.]co[.]in	Credential exfiltration endpoint
Domain	sleadtrack[.]com	Click-tracking redirect
Domain	bhuntr[.]com	Sextortion sender domain
Domain	brookstone12[.]com	Walmart reward redirect target
Domain	ssosecure[.]onlinemabethreturn[.]com	Credential redirect target
Domain	enthusiastic-secure-link-go[.]base44[.]app/	Base44 generated phishing page
URL	creative24[.]meliohbistro[.]com/wp-weblogin/	Microsoft 365 phishing page host
URL	bitly[.]cx/4g893	Phish redirector
Phone	+1 587 315 5093	Messaging number for crypto wallet drain
BTC Address	15q3oMhkVK24jT8245xCBwE5Aqgxic6ehF	Sextortion payment wallet
CDN	res[.]cloudinary[.]com/dhwr8jz0x/	Attacker-controlled hosting
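For teams importing the table above programmatically, refanging is a one-line string replacement. This is a minimal sketch that assumes the rows are available as tab-separated text; the function names and the dictionary keys are hypothetical and not tied to any particular threat intelligence platform.

```python
def refang(value: str) -> str:
    """Undo the [.] defanging used in the indicator table."""
    return value.replace("[.]", ".")


def load_indicators(tsv: str) -> list[dict]:
    """Parse tab-separated three-column rows, skipping the header row if present."""
    rows = []
    for line in tsv.strip().splitlines():
        parts = line.split("\t")
        if len(parts) != 3 or parts[0] == "Type":
            continue  # header or malformed row
        kind, value, note = (part.strip() for part in parts)
        rows.append({"type": kind, "value": refang(value), "note": note})
    return rows
```

Refanged values are live indicators, so they should only ever be written to systems that will not render them as clickable links.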

MITRE ATT&CK Mapping

T1566.001 (Phishing: Spearphishing Attachment): AI-generated lures distributed at scale
T1566.002 (Phishing: Spearphishing Link): High-fidelity brand impersonation delivered via no-code platforms
T1027 (Obfuscated Files or Information): Unicode homoglyphs, hidden CSS blocks, anti-fingerprinting noise tokens
T1583.001 (Acquire Infrastructure: Domains): Attacker-controlled redirect and exfiltration infrastructure
T1598 (Phishing for Information): Credential harvesting via replica login pages

What Security Teams Should Do Now

Visual polish is not legitimacy. A polished email with correct branding and no typos is not a clean bill of health — it may simply mean the attacker used a better tool.

Retire the "look for typos" heuristic. Security awareness training must explicitly address AI-polished lures and shift the focus to URL verification and out-of-band confirmation as the primary user-facing controls.

Verify the URL bar on every login page. A genuine Microsoft 365 sign-in is served from a Microsoft-owned domain such as login.microsoftonline.com. A genuine DocuSign page shows docusign.com. Anything else is not the real service, regardless of how good it looks.
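The same check can be expressed in tooling, with one caveat worth showing in code: the expected domain has to match the end of the hostname, not merely appear somewhere in the URL. This is a sketch with a hypothetical function name, and the expected domain for each brand is something the caller must supply.

```python
from urllib.parse import urlsplit


def is_expected_host(url: str, expected_domain: str) -> bool:
    """True when the URL's hostname is the expected domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    expected = expected_domain.lower()
    return host == expected or host.endswith("." + expected)
```

A substring test would accept a hostname like docusign.com.account-verify.example, which is exactly the kind of lookalike this check is meant to reject.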

Flag recipient email addresses in anomalous body positions. Insertion of the recipient's email address mid-sentence in hidden content blocks is a specific AI template artifact with no legitimate use case.
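As a rough illustration of that rule, the number of times the recipient's own address appears in body text (ideally the hidden portion) can be counted and compared with a threshold. The function names and the threshold of three are assumptions; the campaign described in this report showed 15 occurrences.

```python
import re


def recipient_mentions(text: str, recipient: str) -> int:
    """Count case-insensitive occurrences of the recipient's address in text."""
    return len(re.findall(re.escape(recipient), text, flags=re.IGNORECASE))


def flag_identity_anchoring(text: str, recipient: str, threshold: int = 3) -> bool:
    """True when the recipient's address appears at least threshold times."""
    return recipient_mentions(text, recipient) >= threshold
```

Signatures, unsubscribe footers, and quoted headers can legitimately contain the address once or twice, which is why the sketch uses a threshold rather than flagging any occurrence.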

Reassess reputation-based filtering. When attack infrastructure is built on legitimate no-code platforms, domain reputation signals are insufficient. Behavioral analysis and content inspection need to compensate.

Invest in updated detection logic. LLM artifacts are identifiable if your tooling is looking for them. Signature-based detection alone will not catch AI-generated attack variants.

For real-time updates and ongoing threat intelligence, follow KnowBe4 Threat Lab on X: @Kb4Threatlabs
