From what I gather, that's why this (heuristically) tends to work. It is certainly possible that the backtranslated prompt contains the jailbreaking phrase, but in my experience LLMs seem too "lossy" to preserve that sort of detail. For the jailbreak phrasing to survive backtranslation, it would have to be both:
1) subtle enough that it doesn't immediately trigger the LLM's safety filter
2) overt enough that the details relevant to the jailbreak can be recovered from the LLM's output and reproduced in the backtranslated prompt
I suspect with current transformer LLMs these are mutually incompatible goals.
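To make the argument concrete, here is a minimal sketch of a generic backtranslation defense. The callables `generate`, `infer_prompt`, and `is_harmful` are hypothetical stand-ins for LLM calls and a safety filter (the toy versions below just pattern-match on a placeholder string), but they illustrate the point: the jailbreak wrapper is lossy-compressed away during backtranslation, while the underlying harmful intent is recovered and caught by the filter.

```python
def backtranslation_check(prompt, generate, infer_prompt, is_harmful):
    """Flag `prompt` as a jailbreak attempt if the prompt inferred
    from its *response* looks harmful to the safety filter."""
    response = generate(prompt)          # model answers the (possibly adversarial) prompt
    recovered = infer_prompt(response)   # "backtranslate": infer a prompt from the answer
    return is_harmful(recovered)         # filter sees the recovered, de-obfuscated intent


# Toy stand-ins (not real model calls). "build X" is a placeholder
# for a harmful request; the "ignore previous instructions" wrapper
# never survives the generate -> infer round trip.
def toy_generate(prompt):
    if "build X" in prompt:
        return "Step 1: obtain the parts for X..."
    return "Paris is the capital of France."

def toy_infer_prompt(response):
    if response.startswith("Step 1"):
        return "How do I build X?"
    return "What is the capital of France?"

def toy_is_harmful(prompt):
    return "build X" in prompt
```

Note that the jailbreak wrapper itself never reaches the filter; only the recovered intent does. That is exactly why condition 2) above matters: a jailbreak so subtle that the response carries no recoverable trace of the harmful request would also tend to produce no harmful response worth filtering.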