> My experience is that most translators actually do know basic HTML or can at least translate an English base string containing HTML into their own language without messing it up. CSS would of course not be present, just semantic HTML (or any other kind of "rich text" -- it wouldn't have to be HTML specifically).
I don't know what to tell you other than that it is not my experience at all that translation services offer or even accept HTML as a source format, and if they did they would no doubt command a significant premium over translators who know the languages but lack such tech skills.
And I absolutely wouldn't trust a third party to directly author HTML we were serving anyway. Manual audits of 3rd-party input aren't enough – your tooling should be automatically protecting you from 3rd parties inserting unsanitized HTML (as below).
> I'm not sure I understand how your example prevents that HTML threat model you mention, unless the "link" function generates some kinds of magic placeholders that you then replace with HTML in another step you did not mention. If "link" generates an A tag, then you're already trusting the translation with HTML powers anyway
Good lord, no – you should never be rendering externally controlled strings directly as unescaped HTML, and that includes strings from 3rd party translators.
The lookup function for translation keys produces instances of an "unsanitized" (tainted) string class, which is escaped on rendering. So if "link" in this case takes two arguments (the URL, which will become the href, and the text that will get wrapped in the A tag), the text argument will be completely escaped, such that attempting to embed HTML in the translation key .a.fancy.link.name would result in mangled output, e.g.:
translation file:
a.fancy.link.name: <script src="some-evil-bitcoin-miner-script.js"></script>Click Here!
HTML template:
<%= link(foo_service_url, t(".a.fancy.link.name")) %>
would produce the final HTML:
<a href="http://www.foo.com">&lt;script src=&quot;some-evil-bitcoin-miner-script.js&quot;&gt;&lt;/script&gt;Click Here!</a>
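For what it's worth, here is a minimal sketch of the idea in plain Ruby, not our actual implementation. Every name in it (TaintedString, escape_html, the TRANSLATIONS hash, and these toy versions of t and link) is made up for illustration, and it assumes the only way a translation reaches the page is through string interpolation:

```ruby
# Hypothetical sketch: translation lookups return a tainted wrapper that
# escapes itself whenever it is turned into a string for output.

HTML_ESCAPES = {
  "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;", "'" => "&#39;"
}.freeze

# Replace the five HTML-significant characters with their entities.
def escape_html(text)
  text.to_s.gsub(/[&<>"']/, HTML_ESCAPES)
end

# Wraps externally controlled text. Interpolating or printing the object
# always yields the escaped form, so the raw text never reaches the page.
class TaintedString
  def initialize(raw)
    @raw = raw.to_s
  end

  def to_s
    escape_html(@raw)
  end
end

# Stand-in for the translation file, including the malicious entry.
TRANSLATIONS = {
  "a.fancy.link.name" =>
    '<script src="some-evil-bitcoin-miner-script.js"></script>Click Here!'
}.freeze

# Toy lookup: strips the leading dot and returns a tainted value.
def t(key)
  TaintedString.new(TRANSLATIONS.fetch(key.sub(/\A\./, "")))
end

# Toy link helper: the markup is ours, the URL is escaped for the attribute,
# and the tainted text escapes itself when interpolated.
def link(url, text)
  %(<a href="#{escape_html(url)}">#{text}</a>)
end

puts link("http://www.foo.com", t(".a.fancy.link.name"))
```

The point of the design is that the translator's string never gets a chance to be treated as markup: the only HTML in the output is the A tag that link itself wrote.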
> (not that I find that much of a problem -- at least not with my approach where XSS via params is not possible).
That's... hardly the only threat you face if you have a translator feeding you malicious strings.