I think where you are getting hung up is the idea of "better results". We as a community don't need to strive for "better results" we can easily say, hey we just want HN to be between people, if you have the LLM generate this hypothetical test, just tell people in your own words. Maybe forcing yourself to go through that exercise is better in the long run for your own understanding.