The issue here is unavoidable because LLMs are broken by design. There is no encapsulation where you can separate instructions and data because LLMs are nothing more than next-token predictors and the input sequence MUST be a sequence. They can't build a model with one stream for instructions and another for data because the training data they stole from the internet and books is a single stream.