2) If you just remove the tokens that represent first-person pronouns, you will severely harm the model's performance on almost every task that involves interaction (real or imagined) in a social context, from understanding a simple work letter to creative writing. If you instead try to train the LLM in a way that inhibits the "first-person behaviour", it may work, but it will be a lot harder, and you will probably run into problems with the model's performance or usability.
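To make the first option concrete, here is a minimal sketch of what "removing the tokens" amounts to at generation time: setting their logits to negative infinity before the softmax. The vocabulary, the logit values and the `ban_tokens` helper are all made up for illustration; a real tokenizer is far larger and usually splits pronouns into several token variants.

```python
import math

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["I", "me", "my", "you", "the", "letter", "wrote", "."]
FIRST_PERSON = {"I", "me", "my"}


def softmax(logits):
    """Turn raw scores into probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ban_tokens(logits, vocab, banned):
    """Set the logit of every banned token to -inf so it can never be sampled."""
    return [float("-inf") if tok in banned else x for tok, x in zip(vocab, logits)]


# Made-up next-token scores for a context where "I" is the natural continuation,
# e.g. the model is quoting a letter that begins with "I".
logits = [3.0, 0.5, 0.2, 1.0, 1.5, 0.1, 0.3, 0.0]

probs_before = softmax(logits)
probs_after = softmax(ban_tokens(logits, VOCAB, FIRST_PERSON))

best_before = VOCAB[probs_before.index(max(probs_before))]
best_after = VOCAB[probs_after.index(max(probs_after))]
print(best_before, best_after)
```

The mask is unconditional: it cannot tell the model talking about itself apart from the model quoting, summarising or writing dialogue for someone else, which is why the damage shows up in ordinary social-context tasks.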
To conclude - it's just not that easy.