As generative AI systems like OpenAI’s ChatGPT and Google’s Gemini become more advanced, they are increasingly being put into use. Startups and tech companies are building AI agents and ecosystems on top of systems that can do boring chores for you: automatically thinking about calendar bookings and possible products to buy. But as these tools are given more freedom, the potential ways they can be attacked also increase.
Now, in a demonstration of the risks of a connected autonomous AI ecosystem, a team of researchers has created what they claim is the first-ever generative AI worm — one that can spread from one system to another, potentially stealing data or deploy malware on the system. process. “What this basically means is that now you have the ability to conduct a new type of cyberattack that you’ve never seen before,” said Ben Nassi, the Cornell Tech researcher behind the study.
Nassi created the worm with fellow researchers Stav Cohen and Ron Bitton and named it Morris II in honor of the original Morris computer worm that caused chaos on the Internet in 1988. In a research paper and website shared exclusively with WIRED, researchers demonstrate how an AI worm can attack a generative AI email assistant, steal data from emails, and send spam, compromising both ChatGPT and Gemini in the process. Some security protection.
This research was conducted in a test environment, rather than on a public email assistant, and was conducted at a time when large language models (LLMs) are increasingly becoming multimodal, capable of generating images and videos as well as text. While generative AI worms have yet to be discovered in the wild, multiple researchers say they are a security risk that startups, developers and technology companies should be concerned about.
Most generative AI systems work through input prompts—text instructions that tell the tool to answer a question or create an image. However, these tips can also be used as weapons against the system. Jailbreaking can cause a system to ignore its security rules and spew toxic or hateful content, while instant injection attacks can issue secret instructions to chatbots. For example, an attacker might hide text on a web page telling the LL.M. to act as a scammer and ask for your bank details.
To create generative AI worms, researchers turned to so-called “adversarial self-replicating cues.” This prompt triggers the generative AI model to output another prompt in its response, the researchers said. In short, the AI system is told to generate a set of further instructions in its reply. Researchers say this is broadly similar to traditional SQL injection and buffer overflow attacks.
To demonstrate how the worm works, the researchers created an email system that can send and receive messages using generative AI, plugging in ChatGPT, Gemini, and the open source LLM, LLaVA. They then found two ways to exploit the system: using text-based self-replication prompts and embedding self-replication prompts in image files.