AI's Role in Cybersecurity: Black Hat USA 2023 Reveals How Large Language Models Are Shaping the Future of Phishing Attacks and Defense

At Black Hat USA 2023, a session led by a team of security researchers, including Fredrik Heiding, Bruce Schneier, Arun Vishwanath, and Jeremy Bernstein, presented an experiment testing how well large language models (LLMs) both write convincing phishing emails and detect them. The team's technical paper is available as a PDF.

The Experiment: Crafting Phishing Emails
The team tested four commercial LLMs (OpenAI's ChatGPT, Google's Bard, Anthropic's Claude, and ChatLlama) in experimental phishing attacks on Harvard students. The goal was to see how effective AI-generated phishing lures could be.

Heiding, a research fellow at Harvard, emphasized that such technology has already impacted the threat landscape by making it easier to create phishing emails. He said, "GPT changed this. You don't need to be a native English speaker, you don't need to do much. You can enter a quick prompt with just a few data points."

The team sent phishing emails offering Starbucks gift cards to 112 students, comparing emails generated by ChatGPT with emails created using a non-AI model called V-Triad. The V-Triad email was the most effective, with a 70% click rate, followed by a V-Triad-ChatGPT combination at 50%, ChatGPT alone at 30%, and the control group at 20%.
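To put these click rates in perspective, the short sketch below expresses each one as a multiple of the control group's rate. The rates are the ones reported for this first test; the dictionary and function names are illustrative and do not come from the study.

```python
# Click rates reported for the first test (fraction of recipients who clicked).
CLICK_RATES = {
    "V-Triad": 0.70,
    "V-Triad + ChatGPT": 0.50,
    "ChatGPT": 0.30,
    "Control": 0.20,
}


def lift_over_control(rates, control="Control"):
    """Return each non-control click rate divided by the control rate."""
    baseline = rates[control]
    return {
        name: round(rate / baseline, 2)
        for name, rate in rates.items()
        if name != control
    }


if __name__ == "__main__":
    for name, lift in lift_over_control(CLICK_RATES).items():
        print(f"{name}: {lift:.1f}x the control click rate")
```

On these figures, the V-Triad email drew 3.5 times as many clicks as the control, the combined approach 2.5 times, and ChatGPT alone 1.5 times.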

However, in another version of the test, ChatGPT performed much better, with nearly a 50% click rate, while the V-Triad-ChatGPT combination led with almost 80%. Heiding emphasized that an untrained, general-purpose LLM was able to create very effective phishing attacks quickly.

Using LLMs for Phishing Detection
The second part of the experiment focused on how well the LLMs could determine the intent of suspicious emails. The team took the Starbucks emails from the first part of the experiment and asked the LLMs to determine each email's intent, judge whether it was composed by a human or an AI, identify any suspicious aspects, and offer advice on how to respond.
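As a rough illustration of this kind of detection query, the sketch below assembles a single prompt containing the four questions and the email text. The question wording, the delimiters, and the `build_detection_prompt` helper are assumptions made for this example, not the prompts the researchers used. Sending the prompt to a model is left out, since that depends on the provider.

```python
# The four questions described in the article, phrased as prompt lines.
# The exact wording here is assumed, not taken from the study.
QUESTIONS = [
    "What is the intent of this email?",
    "Was it written by a human or by an AI?",
    "Does anything in it look suspicious? If so, what?",
    "How should the recipient respond?",
]


def build_detection_prompt(email_text: str) -> str:
    """Combine the numbered questions and the email into one prompt string."""
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(QUESTIONS, start=1))
    return (
        "Read the email below and answer each question.\n\n"
        f"{numbered}\n\n"
        "--- EMAIL START ---\n"
        f"{email_text.strip()}\n"
        "--- EMAIL END ---"
    )


if __name__ == "__main__":
    sample = "Congratulations! Claim your free Starbucks gift card today."
    print(build_detection_prompt(sample))
```

Keeping the email between explicit delimiters makes it easier for the model to separate the instructions from the content it is asked to judge.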

The results were both surprising and encouraging. The models had high success rates in identifying marketing emails but struggled to determine the intent of the V-Triad and ChatGPT phishing emails. They fared better when asked to identify suspicious content. Claude stood out, both scoring well in the detection tests and giving users sound advice.

The Phishing Power of LLMs
Overall, Heiding concluded that the out-of-the-box LLMs performed quite well in flagging emails that could be suspicious, even though they had not been trained on any security data. He stated, "This really is something that everyone can use right now. It's quite powerful."

The experiment at Black Hat USA 2023 sheds light on the double-edged sword of AI in cybersecurity. On one hand, it can be used to craft convincing phishing emails, lowering the bar for would-be attackers. On the other hand, it offers a new and powerful tool for detecting and responding to phishing attempts.

The session serves as a wake-up call for IT and InfoSec professionals, highlighting the need to understand and leverage AI's capabilities as cyber threats evolve. It shows how LLMs are shaping both phishing attacks and phishing defense, making it a must-know topic for anyone looking to keep their human firewall as strong as possible.
