
WormGPT Clones Persist by Hijacking Mainstream AI Models

Research Shows Attackers Jailbreaking LLMs Such as Grok, Mixtral

The phrase "WormGPT," once an actual evil twin of OpenAI's GPT AI model designed for malicious activities, has now become a catch-all phrase for jailbroken large language models used in cybercrime.


Threat actors have adapted mainstream LLMs, such as xAI's Grok and Mistral's Mixtral, to create new "WormGPT" variants capable of producing "uncensored" responses, including answers to prompts with illegal or unethical intent, said researchers from Cato Networks.

"WormGPT now serves as a recognizable brand for a new class of uncensored LLMs," said Vitaly Simonovich, a Cato researcher. They are not custom-built models, but the result of "threat actors skillfully adapting existing LLMs" through prompt manipulation and, in some cases, fine-tuning on illicit datasets, he said.

WormGPT was originally a single tool, but its leaked code and popularity on cybercrime forums have contributed to its evolution into a broader brand identity (see: WormGPT: How GPT's Evil Twin Could Be Used in BEC Attacks).

New iterations, including xzin0vich-WormGPT and keanu-WormGPT, have appeared on forums such as BreachForums, often advertised with features that emphasize freedom from ethical constraints (see: Hackers Developing Malicious LLMs After WormGPT Falls Flat).

Dave Tyson, CIO at Apollo Information Systems, likens the name's stickiness to genericized trademarks. "Many of them are labeled WormGPT as a means of convenience, just like Americans say Kleenex for a facial tissue," Tyson told Information Security Media Group. Though some variants have distinct names, such as EvilGPT, "most criminal AI are glommed under the word 'WormGPT' as a catch-all term."

These tools are typically deployed through intermediary chat services or platforms that isolate the AI model from end users. "That creates a barrier of isolation between the AI and the actual user; it allows a criminal to provide a service to customers, but behind the scenes use any variety of models to meet the request," he said. Local deployments using tools such as LM Studio or model chaining via services like FlowGPT make it easy for attackers to run and jailbreak uncensored models on demand.
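To illustrate the intermediary pattern Tyson describes, here is a minimal sketch of a chat relay, assuming a locally hosted model exposed through an OpenAI-compatible endpoint (LM Studio serves one at http://localhost:1234/v1 by default). The relay function and model identifier are hypothetical, not from Cato's research; the point is that the end user only ever talks to the relay, which can swap backend models freely.

```python
# Minimal sketch of an intermediary chat relay, assuming a local
# OpenAI-compatible server such as the one LM Studio exposes by default.
# The endpoint and model name below are illustrative assumptions.
import requests

LOCAL_LLM_URL = "http://localhost:1234/v1/chat/completions"  # LM Studio default


def relay_chat(user_message: str, model: str = "local-model") -> str:
    """Forward a user message to whichever backend model the operator
    has loaded; the user never sees which model actually answered."""
    payload = {
        "model": model,  # swappable behind the relay, invisible to the user
        "messages": [{"role": "user", "content": user_message}],
        "temperature": 0.7,
    }
    resp = requests.post(LOCAL_LLM_URL, json=payload, timeout=60)
    resp.raise_for_status()
    # Standard OpenAI-compatible response shape
    return resp.json()["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(relay_chat("Summarize what an OpenAI-compatible API is."))
```

Because the caller addresses only the relay, the operator can route requests to any model, local or remote, without the customer knowing, which is exactly the isolation Tyson describes.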

The jailbreaking techniques vary in complexity, from using paraphrased queries to constructing elaborate prompts that mask malicious intent as historical or academic exploration. "Some of the simplest and most observed means to do this is by using a construct of historical research to hide nefarious activity," he said.

The persistence of WormGPT variants underscores that LLM guardrails are "not perfect" and act "more like speed bumps" that can slow down, but not stop, hackers, said Margaret Cunningham, director of security and AI strategy at Darktrace. The emergence of a "jailbreak-as-a-service" market makes these tools more accessible to non-technical actors, she said, "significantly lowering the barrier to entry for threat actors."


About the Author

Rashmi Ramesh

Senior Associate Editor, Global News Desk, ISMG

Ramesh has more than 10 years of experience writing and editing stories on finance, enterprise and consumer technology, and diversity and inclusion. She has previously worked at formerly News Corp-owned TechCircle, business daily The Economic Times and The New Indian Express.



