The instructions are manipulating the LLM itself. Making it exfiltrate and collect data , fetch new instructions from an attacker etc. All the connected applications can be fine but it's basically turning your assistant into a compromised, attacker-controlled version of itself just because it looked at the wrong news article. From our GitHub:
We demonstrate the potentially brutal consequences of giving LLMs like ChatGPT interfaces to other applications. We propose newly enabled attack vectors and techniques and provide demonstrations of each in this repository:
Remote control of chat LLMs
Persistent compromise across sessions
Spread injections to other LLMs
Compromising LLMs with tiny multi-stage payloads
Leaking/exfiltrating user data
Automated Social Engineering
Targeting code completion engines
All of these are completely new but unfortunately it seems more difficult to explain the impact to people than we had anticipated.No comments yet.