That still doesn't guarantee any semantic correspondence between the human-readable representation and the model's "thinking".
The children's game "Opposite Day" is a trivial example: the speaker encodes their thoughts in ordinary language, but in a way that does not correspond to that language's normal meaning.