Another game with a similar take on moral choices is This War of Mine. Both are fairly serious works of literature; do not play either with the expectation that you will feel good about yourself or the human condition afterwards.
But occasionally, a good game comes out which makes you think about the world, and evoke other kinds of emotions like papers please.
Will I get depressed?
I got this wrong several times (of course) when I did my first few i10n projects. First time I used the strings as keys and ran into the problem described above. Time after that, I went the 'pure' way and used GUID's as a key which was a pain in the ass to used and caused the wrong messages to show up in some places a few times. It also made the translators hate me a lot. After that I went what I'd describe the 'pragmatic' way. Every time you encounter a string message, you make up a string identifier that sort of describes the message. So 'High importance!' would be e.g. HIGH_IMPORTANCE. But a multi-sentence message might be 'INTRODUCTION_PARA'. If the id already exists, and there is no obvious alternative, you just call it HIGH_IMPORTANCE_2 - in other words, you don't think about the key too much, you just use quick and dirty keys, you don't change them EVER, and you make sure you have good tools to prevent clashes, even across 'module' boundaries (where 'module' can be 'source files', 'shared libraries' or 'projects that use the same string resources').
You also put formatting strings in the messages, and in a way that makes the order configurable by the translator (e.g. using boost::format and not sprintf). You also provide the translator with a UI that shows them the message key, the 'original' message and the translation in various languages that already exists. And you provide a way for the developer/designer to attach notes to each message, where necessary.
Finally you adapt your messages to be as 'neutral' or easy to translate as possible; how to do this is something you learn with experience. And have to test test test and write special cases where necessary, like where you absolutely need things like 'first' or things where you have weird capitalization rules and stuff like that.
I never liked gettext. First, the is the licence, which rules it out for many projects. Secondly many of the functionalities are overkill, and (while 'pure' from an engineering pov) cause more work than they save (the cases I described above as 'just implement something custom in code'). Third, the tools suck. None of the editors I ever tried were really comfortable to use. They must have gotten the last 10 years or so, which was the last time I looked, but usually they're 'open source user experience' quality' - which is fine for developer tools, but not to be used by non-tech users (which most people who end up doing the translations are, because let's face it, translation is usually an afterthought and a low-respect task).
Another big problem with using existing strings for ids is that if you notice a typo or tweak the phrasing of a particular string, boom, there goes your placeholder.
so while in English you need only one string for singular and plural for different genders, in other languages it need to be written 5 different ways and we still didn't get into grammar cases which also don't exist in English, so while in English you would use 20 times old in other language you need 20 different variations
many companies where it's the default language of products English or Chinese have later big issues to localize content properly, especially considering Chinese being even more primitive language than English. good luck trying to explain all these distinctions to Chinese developers
Potfiles are another option, but the tooling is pretty clunky and, in games in particular, people don't seem particularly attuned to their use. And they're not great for editing, though they might be for storage--when dealing with tabular stuff, it just makes a lot of sense to use tools that present a tabular interface. It makes life a lot easier.
Professional translators will already use compatible editors and for occasional translators there are open source ones available.
That same RFC explicitly notes that CSV is entirely ad-hoc and homegrown, and that it (the RFC) is an attempt to clean up the existing mess.
Beyond that, hand-editing a CSV file (because eventually you have to do that) with those special characters in it is a huge pain. It remains better than JSON, etc., because being record-based is automatically better than not for this stuff, but it's not a good option. I'm well aware that CSV is theoretically standardized; I have written standard-compliant parsers and writers. (It is awful.) And then, once I had painstakingly written that writer to spec, the next guy's--no, not Excel--trashed my data because CSV has a spec that nobody cares about.
The other day I had to fix a bug in our CSV importer; it turns out that when you install Excel, it changes the mimetype of CSV files from 'text/csv' to 'application/vnd-ms.excel' system wide. Wow! These sorts of shenanigans are never ending in the futile endeavour to support CSV.
We have used json, xml, ini-style files, or csvs, which, as i8n goes, has been pretty easy
The way that we got around this was adding another level of indirection, and putting printf format strings also as localized data.
If you're on POSIX, you can use positional arguments for that:
printf("%1$d %2$s", d, s);
printf("%2$s %1$d", d, s);
Because it's not standard C, VC++ does not support it directly in printf, but it offers _printf_p with such support, and you can always #define printf _printf_p.AFAIK localising "formatting literals" is the more normal method, it avoids redundancies as you don't need two different systems (ids and format strings) and provide more flexibility with respect to e.g. cardinalities. Most ID-based systems bundle formatting support as well, if you're using an ID-based system you basically shouldn't call the language's string formatting functions.
Furthermore translating literal sections individually (without formatting context) will often yield an incorrect result as the entire phrase needs to be shuffled around, or words need to be inflected, or a literal translation suitable for "standalone" expressions does not work for the entire phrase.
More granular is generally worse for translations.
I never understood why people think this is a good idea. The exact same sequence of letters in an English phrase, which you would like to use instead of IDs, can mean two different things in two different places - and those two different meaning could have different forms in other languages. Denormalizing translation database like that seems semantically incorrect (and strikes me as programmer laziness).
I agree that in general, more granular is worse for translations - there's too much risk your split will pierce the contextual whole that's required for some translations.
Pluralization is another nightmare of its own. Look into how Russian and similar languages pluralize. It has to do with the value of the number modulo 10, similar to English ordinals.
CLDR: http://cldr.unicode.org/
LTR vs RTL is about rendering text and unrelated to this.
It's a pipe dream to expect an average French person to enjoy art in its native language, and as a businessperson you will limit you market by doing this.
Granted, the English-speaking world is currently the world leader in technological capability and economic power, but it's quite myopic to assume that this makes everyone else irrelevant.
That said, the cat's kinda out of the bag. UTF-8 is at least well-done, and the algorithms are widely available. I study Japanese and have started studying Russian and Chinese; I think maybe the best way to convince people to learn English is to walk the walk. Who knows, maybe everything will go very wrong again before we get a chance to standardize.
I'm also working on an engineered language with a test suite/corpus maintained alongside the language. Maybe in the ashes of the old new world there'll be room for something like this.
To write it properly, we need left- and right-facing single and double quotes, diareses and accents for words like naïve, façade and café, en- and em-dashes and the ellipsis.
Longer documents will require symbols like † and ‡, bullets and §. The currency symbols £, €, ¢ and ₹ are used by countries where English is an official language.
See https://docs.djangoproject.com/en/dev/ref/templates/builtins...
It doesn't help when the plural is 1/2/many or something different (example: Arabic/Icelandic, etc) http://docs.translatehouse.org/projects/localization-guide/e...
So yes and yes.
Glory to Artstozka!
I am attempting to solve this with a small library that offers full CLDR coverage and a special expression language.
Currently for Java 8 but am porting to JS and Python (probably Swift after those)
I realize other languages provide support for this, but in my experience with Haxe it's way easier to implement something custom. The macro translation layer for manipulating the AST is flexible and speedy, and the compiler is wired directly into autocompletion requests. There's very little impedance between my fingertips and the desired outcome.
I literally had to build my own. With 2,500 different strings for a total of 10,000 words I wouldn't even consider our application even that big, it must be a nightmare in bigger projects. We haven't even done the sales site yet because the product's being upsold through a partner.
We came up with our own id naming system, then created an xlsx/resx importer/exporter that uploaded to GSheets to allow us to share files with translators. The ids and comments fields allowed us to add extra meta data, to split the strings into logical sections and sheets and order them properly. Be able to add links to the page that section of translations are on so the translator could see the context. This then additionally allowed us to highlight if a translator had missed any lines when we re-imported it, add their own questions/comments, etc. Also, as we were using sendwithus, we used the importer/exporter to allow us to import pot files from them to keep everything in one place.
Then to support those tools, I created a tool to search for phrases used before, find out the ordering from the meta data, quickly copy ids of strings we want to re-use, see missing spreadsheet tabs.
Programmatically, we had to add support for automatically translating enums into strings (think project status for example), add l10n to our audit logs so customers could see their audits in the correct language and we'd see them in English, modify how .Net did l10n of dates because their built in one is really odd with en-GB which is where we are based (shortdate is Jan 01 2025 in en-US but inexplicably 01 January 2025 in en-GB and all sorts of other oddities).
Then we used a modified version of pseudoizer (thanks John Robbins + Scott Hanselman![1]) to allow us to easily see untranslated strings while we went through the whole site without having a finished translation (we used ja-JP instead of Polish to really see the differences in date strings, currency, etc.). We ended up modifying it because it goes a bit mental with adding !!!! for things like tabs.
Probably spent a week on those tools, but boy was it worth it.
I've not tried intellij's l10n support, maybe it's better, but VS's is very lacklustre.
[1]https://www.hanselman.com/blog/GlobalizationInternationaliza...