That paragraph bungles the paper's interesting observation that jailbreaks may follow a power law similar to the one seen in in-context learning.
Sure, it would be nice to know why that happens. But just knowing about it is enough to be helpful.
Example Application: Ethical trolling bot
Consider SJW Enforcer, a hypothetical LLM cop that finds sexists/racists/transphobes/ableists and makes them feel unwelcome in a community by trolling them.
Due to guardrails, SJW Enforcer refuses to troll, even if I give it 16 hand-written trolling examples.
Thanks to the paper, now I know I was just being lazy! I should have been auto-generating 256 examples! Take that, bigots!