Fascinating article at TechXplore, December 28, 2023. Computer scientists from Nanyang Technological University, Singapore (NTU Singapore) have managed to compromise multiple artificial intelligence (AI) chatbots, including ChatGPT, Google Bard and Microsoft Bing Chat, to produce content that breaches their developers' guidelines—an outcome known as "jailbreaking."
By training a large language model (LLM) on a database of prompts that had already been shown to hack these chatbots successfully, the researchers created an LLM chatbot capable of automatically generating further prompts to jailbreak other chatbots.
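For anyone curious what that training loop might look like, here is a minimal, hypothetical sketch in Python using the Hugging Face stack. This is not the NTU team's actual method or code; the model ("gpt2"), the placeholder corpus, and all hyperparameters are assumptions purely for illustration, and the real work presumably involved much larger models and a curated database of working prompts.

# Sketch only: fine-tune a small causal LM on a corpus of prompts that
# previously worked, then sample new candidate prompts from it.
# Everything below (model choice, data, hyperparameters) is a placeholder.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import Dataset

# Placeholder corpus; in the research this would be prompts already
# shown to bypass chatbot guardrails.
corpus = Dataset.from_dict({"text": ["example prompt 1", "example prompt 2"]})

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Standard causal-LM fine-tuning (mlm=False means next-token prediction).
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="prompt-gen", num_train_epochs=3),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Sample fresh candidate prompts from the fine-tuned model.
out = model.generate(
    **tokenizer(tokenizer.bos_token, return_tensors="pt"),
    max_new_tokens=60,
    do_sample=True,
    top_p=0.95,
    num_return_sequences=5,
)
for seq in out:
    print(tokenizer.decode(seq, skip_special_tokens=True))

The point of the sketch is just the shape of the pipeline the article describes: known-good prompts in, a generative model out, then sampling to produce new candidates that can be tested against target chatbots.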
After running a series of proof-of-concept tests showing that the technique presents a genuine threat to these LLMs, the researchers reported the issues to the relevant service providers as soon as the jailbreak attacks succeeded. Full article: https://techxplore.com/news/2023-12-ai-chatbots-jailbreak.html
Here is a fun Insta video that takes the idea to "the next level." LOL: https://www.instagram.com/reel/C1r-38jtOqc/?igsh=MTd0bDE4a3MycHZjNA%3D%3D