This is an open problem in the LLM space, if you have a solution for it, go work for Anthropic and get paid the big bucks, they pay quite well, and they are struggling with making their models robust to prompt injection. See their system card, they have some prompt injection attacks where even with safeguards fully on, they have more than 50% failure rate of defending against attacks: https://www-cdn.anthropic.com/c788cbc0a3da9135112f97cdf6dcd0...
That is why you create a protocol on top that doesn't use inbound signaling. That way the model is able to tell who is saying what.
And the thing is, even adding a "color" to tokens wouldn't really work, because LLMs are very good at learning patterns of language; for instance, even though people don't usually write with Unicode enclosed alphanumerics, the LLM learns the association and can interpret them as English text as well.
As I say, prompt injection is a very real problem, and Anthopic's own system card says that on some tests the best they do is 50% on preventing attacks.
If you have a more reliable way of fixing prompt injection, you could get paid big bucks by them to implement it.
The same thing could be said about the internet. When it comes down to the wire it's all 0s and 1s.