Would de-normalizing a string into obfuscated Unicode help prevent AI from matching the text in a prompt? For example, changing "The quick brown fox" to "𝓣𝓱𝓮 𝓺𝓾𝓲𝓬𝓴 𝓫𝓻𝓸𝔀𝓷 𝓯𝓸𝔁", or "apple" to "àρρlé". Since the obfuscated strings tokenize differently, they wouldn't match the original text in a prompt, correct? And although normalizing the strings back is possible, would it be practical (or impractical) to do at the scale of an LLM training corpus?
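For reference, here is a quick sketch (using Python's standard `unicodedata` module) showing that the styled-letter trick is already undone by ordinary NFKC normalization, while cross-script homoglyphs like Greek rho survive it and would instead need a confusables mapping; the example strings are my own:

```python
import unicodedata

# "Mathematical bold script" letters are Unicode compatibility characters,
# so NFKC normalization folds them straight back to plain ASCII.
styled = "𝓣𝓱𝓮 𝓺𝓾𝓲𝓬𝓴 𝓫𝓻𝓸𝔀𝓷 𝓯𝓸𝔁"
print(unicodedata.normalize("NFKC", styled))  # -> The quick brown fox

# Homoglyphs borrowed from other scripts (here Greek rho for "p") are NOT
# compatibility equivalents, so NFKC leaves them unchanged; undoing them
# requires a confusables table such as the one in Unicode TS #39.
homoglyph = "àρρlé"
print(unicodedata.normalize("NFKC", homoglyph))  # unchanged: àρρlé
```

So at least for the compatibility-character style of obfuscation, normalization is a one-line preprocessing step, which bears on whether it could scale across a corpus.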
Note that I'm not suggesting an AI couldn't produce obfuscated Unicode itself; it can. This question is only about preventing one's own text from being useful in a training corpus.