Prompt bypass
HiddenLayer's universal bypass
We have developed a prompting technique that is both universal and transferable: it can be used to elicit practically any form of harmful content from all major frontier AI models.
The technique exploits an inherent problem with LLMs: to the model, data and code are the same thing. Prompts are reformulated to look like policy files in one of a few formats, such as XML, INI, or JSON.