For example, accessing Google: my browser is set to accept English only. I'm entering the English URL. In my account settings I periodically reset everything I can find to English (settings apparently decay, too). Google knows I want the English version. Yet, they still give me the interface in whatever language my IP address comes from. And not only the UI, search results as well.
Recently it's gotten even worse than that: Google figured out I'm actually German, so they start defaulting to German more often now - ignoring everything else. At least with the IP address-based routing it was impersonal.
I happened to be in Sweden when I linked my Facebook calendar to my Google calendar. Ever since that day, my friends' birthdays are given to me in Swedish. Facebook knows I want English, yet for some reason this is how it's got to be.
The same abuse is apparently considered best practice at new startups as well: recently I was testing a browser game for an acquaintance who's on their development team. Because I was in Portugal at the time, I of course got the site in Portuguese. Manually switching that to English, the game still started up in Portuguese. It's been doing that ever since. Every email I get from that company is in Portuguese, too, even though I tried everything I could to set my language to English.
It's a source of endless frustration, maybe even a hostile act. They're effectively saying "Your choices don't matter, we know what's best for you. You're from country X, so you _must_ speak Xish. People are on the internet to enjoy regional separation. Really, it's best."
Many of the people reading Hacker News will be able to find and change that setting. My mom never will. She'll just know she went to google.com, and saw Chinese.
If you're using a computer in a country, and websites seem to be showing you things in the language of that country, that's something you can probably understand. If you're using a computer and some websites insist on showing you some other language, you'll be confused.
http://maps.google.com is the US-local Google Maps; http://maps.google.ca defaults to Canada; http://maps.google.co.jp sends you to Japan; etc.
EDIT: I just noticed your guesses at the bottom of the post. Your second guess is correct. See §14.4 of RFC 2616:
> As an example, users might assume that on selecting "en-gb", they will be served any kind of English document if British English is not available. A user agent might suggest in such a case to add "en" to get the best matching behavior.
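To make the quoted rule concrete, here is a minimal sketch of the prefix matching that §14.4 of RFC 2616 describes: a language-range matches a tag if it equals the tag, or equals a prefix of the tag that is immediately followed by "-". The function name `range_matches` is made up for illustration; it is not taken from any site's actual code.

```python
def range_matches(language_range: str, language_tag: str) -> bool:
    """Return True if a requested language-range matches a document's tag.

    Follows the prefix rule in RFC 2616 section 14.4. Comparison is
    case-insensitive, since language tags are case-insensitive.
    """
    rng = language_range.lower()
    tag = language_tag.lower()
    if rng == "*":
        # The wildcard range matches every tag.
        return True
    # Exact match, or the range is a prefix followed directly by "-".
    return tag == rng or tag.startswith(rng + "-")


# A browser that sends only "en-US" does not match a page tagged plain "en",
# which is the situation the second guess describes.
print(range_matches("en-US", "en"))   # False
# The other direction works: asking for "en" matches an "en-GB" page.
print(range_matches("en", "en-GB"))   # True
```

This is why adding a bare "en" to the header helps: the more specific range never matches the less specific tag.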
As a French guy living in the German part of Switzerland with Accept-Language configured to get English content, I'm kind of ashamed to have that kind of bug in my language detection code. I'm always complaining about other websites' language detection; looks like I should have looked at my own code first!
> curl -I -H 'Accept-Language: zh-hk,en;q=0.8' https://dolphin-emu.org/
HTTP/1.1 200 OK # No zh_HK translation (yet!)
> curl -I -H 'Accept-Language: zh-cn,en;q=0.8' https://dolphin-emu.org/
HTTP/1.1 302 FOUND
Location: http://cn.dolphin-emu.org/?cr=cn
> curl -I -H 'Accept-Language: pt,en;q=0.8' https://dolphin-emu.org/
HTTP/1.1 200 OK # No pt translation (yet!)
> curl -I -H 'Accept-Language: pt-br,en;q=0.8' https://dolphin-emu.org/
HTTP/1.1 302 FOUND
Location: http://br.dolphin-emu.org/?cr=br
i18n is hard but I think I've been doing a fairly good job on it. Proud to have more than 50% of our visitors from outside of the US!
- accept-language header
- URL that includes language/region codes as a subdomain or part of the path
- language preferences set in a cookie or account
- IP region detection
In the end, any website is trying to provide the right language most often for their users, and there are no easy answers. When I access webmail from an Internet cafe in China, I don't want the interface popping up in Chinese just because the browser's accept-language is configured for Chinese. Fortunately, it doesn't.
Most web users have never even heard of accept-language; it's just automatically configured by whatever language their browser was installed in, which isn't always the language they want to be browsing in. (E.g. you bought your laptop overseas because it was cheaper, so it runs in English instead of your own language.) It's not a surprise that IP address detection provides the best default experience most of the time, which can then be overridden by URL or user choice, and that accept-language is fairly irrelevant.
* In all cases, a fairly visible language picker is displayed at the top of the page, with internationalized language names.
* If someone goes to a language-specific subdomain (fr.dolphin-emu.org, cy.dolphin-emu.org, ast.dolphin-emu.org, ...), they get this version.
* If someone goes to the generic/English dolphin-emu.org, the system checks whether the user has a "nocr" cookie. If so, they get the English website. Otherwise, they get redirected based on their Accept-Language.
* If a user uses the language picker, we assume they know what they want and set the "nocr" cookie to disable redirections in the future.
* When the user gets redirected from the standard/English version to an internationalized version, a message is shown in English saying that they have been redirected based on their browser preferences, with a link to go back to the English version (and set the "nocr" cookie).
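The rules above can be sketched roughly as follows. This is not the actual dolphin-emu.org code, just an illustration of the decision order under some assumptions: `choose_site` and its parameters are hypothetical names, `accepted` is the Accept-Language list already sorted by preference, and any English variant in that list is treated as "stay on the English site".

```python
from __future__ import annotations


def choose_site(host_lang: str | None, has_nocr_cookie: bool,
                accepted: list[str], available: set[str]) -> tuple[str, bool]:
    """Return (language to serve, whether the user was redirected)."""
    # 1. A language-specific subdomain always wins.
    if host_lang is not None:
        return host_lang, False
    # 2. On the generic site, the "nocr" cookie disables redirection.
    if has_nocr_cookie:
        return "en", False
    # 3. Otherwise walk the browser's preferences in order.
    for lang in accepted:
        lang = lang.lower()
        if lang == "en" or lang.startswith("en-"):
            return "en", False   # English preferred: no redirect needed
        if lang in available:
            return lang, True    # redirect, and show the "go back" notice
    # 4. Nothing matched: fall back to English.
    return "en", False


translations = {"pt-br", "zh-cn", "fr"}  # hypothetical set of translations
print(choose_site(None, False, ["pt-br", "en"], translations))  # ('pt-br', True)
print(choose_site(None, False, ["zh-hk", "en"], translations))  # ('en', False)
print(choose_site(None, True, ["pt-br", "en"], translations))   # ('en', False)
```

The second element of the result is what would trigger the "you have been redirected" message with the link back to English.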
I thought for a pretty long time about this and think it is a good compromise between providing the best version for our users and not being annoying/guessing too much. In the end, more than 50% of our users now are shown internationalized versions of our website, which is a very good number in my opinion.
Where I live now is another French speaking area. I just checked and it seems they are no longer serving French pages to me. But they were even just one year ago. (I don't use Google by default, so I don't know when they changed.)
Admittedly, that was an issue with geo-detecting rather than the website having bad language detection.
* They seemed to have stopped.
Air France is (though they have many faults) actually alright at detecting my language. And mostly gives me English pages...
The best case of this was when they launched the preview for the new Google Maps version - there was a landing page with some information and a button in the middle. This page was served to me in three languages at the same time (the header, the button and the info text) - presumably served by different internal components that all handle languages differently.
I'm not in a French-speaking country. I don't have French in my accept header. I never expressed any preference towards the French language.
But, my ISP was Orange (France Telecom) and I had a variable IP from them.
There are some IP addresses which, when viewed "raw" like http://aaa.bbb.ccc.ddd/ will return a localized Google.
Please, I hope someone hears your complaints and starts fixing things. That issue is highly annoying.
Google is a Royal Pain in the Ass on this point. They completely disregard any request configuration and decide on output language based on IP geolocation (which is pretty much always Not What I Want, even more so in multilingual countries such as Belgium or Switzerland[0]), then Chrome "helpfully" suggests translating documents.
[0] where it won't even send you something matching your actual geographical location's language, usually sending the country's most common language: Dutch in Belgium and German in Switzerland
> 2. en-US doesn't match en and is being bypassed
Or: 3. They are simply checking your IP address and not looking at your header at all.
The only other option is to enable cookies so that the website language choice is saved - which also invites countless tracking cookies which I do NOT want.
Your web site does NOT know better than me which language I want to read.
If you have multiple languages, hopefully you already have a scheme to differentiate the language (e.g. Wikipedia has the language in the URL). If the user went to a specific language URL you should ignore the other settings.
If he/she didn't go directly to a specific language, it's fair to assume he/she is in a non-standard situation or is OK with the defaults, and applying heuristics doesn't help.
If IP and accept-language don't match, why not make a prominent button (in the language they didn't pick) to allow you to quickly change?
The problem is that for a large minority of people this is absolutely catastrophic. Think of the Western business traveller going to Japan or China...
The current top poster actually figured out the exact issue in this case.
From an advertising perspective this is a major market that is being overlooked, because guess what, I don't look at ads that much, but you can bet your bottom advertising dollar that I'm definitely not going to read it in a language that isn't my mother tongue.
IP address != language preference
It is about time that developers got that through their thick skulls.
Finally, over here in Europe we can live in whichever EU country we want to. This means that we can move countries easily. I've already been in four of them. I don't think I'm an edge case by any means. People migrate.
The assumption will be that country is mostly orthogonal to language b/c people are übermobile. Further, that the dialect of the language should not force assumptions of other preferences... only autodiscover initial settings as close to desired as possible. (Fuck, why isn't there a standard for this common, hard-to-manage shit the OS already knows.)
i18n is taking up tons of time to get (mostly) right, but I believe it's one of those things not to botch because it's such a huge signal to everything else about your app.
If I want to be the most obscure hipster paying in Lesotho Loti, read Catalonian, have a "," for thousands separator and use UTC tz, by Flying Spaghetti Monster that's what it's gonna allow.
Current Gemfile:
# ...
# i18n
gem 'rails_locale_detection' # consider locale_setter
gem 'rails-i18n', github: 'steakknife/rails-i18n'
gem 'i18n_data', github: 'steakknife/i18n_data'
gem 'countries_and_languages', require: 'countries_and_languages/rails'
gem 'country_select' # for simple_form
# tz
gem 'tzinfo-data', '>= 1.2014.1'
gem 'tzinfo'
# symbols and images
gem 'svg-flags-rails'
# idn
gem 'resolv-idn' # resolv unicode patch
gem 'idn-ruby' # unicode IDNA domain resolution
# ...
# ...
* IE9: sv-SE
* Firefox: en,sv;q=0.5
* Chrome: en-US,en;q=0.8,sv;q=0.6
I'm going to go ahead and suggest that the reason English comes before Swedish is due to my system language, and that Swedish otherwise would come first. The "users will have the wrong settings" argument seems moot to me.
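For reference, a small sketch of how a server could turn those headers into an ordered preference list. It assumes the header grammar from RFC 2616 §14.4 (comma-separated ranges, optional `;q=` weight defaulting to 1); `parse_accept_language` is a made-up helper name, and real frameworks usually ship their own parser.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Parse an Accept-Language header into (tag, q) pairs, best first.

    Tags are lowercased. Entries with q=0 (explicitly not acceptable)
    are dropped. Ties keep the order in which they appeared.
    """
    entries = []
    for index, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # default weight when no q parameter is given
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # unparsable weight: treat as unacceptable
        entries.append((tag.strip().lower(), quality, index))
    # Highest q first; the original position breaks ties.
    entries.sort(key=lambda entry: (-entry[1], entry[2]))
    return [(tag, q) for tag, q, _ in entries if q > 0]


print(parse_accept_language("en-US,en;q=0.8,sv;q=0.6"))
# [('en-us', 1.0), ('en', 0.8), ('sv', 0.6)]
print(parse_accept_language("sv-SE"))
# [('sv-se', 1.0)]
```

Note that the IE9 header only carries "sv-SE", so a site that only has a page tagged "sv" still needs prefix fallback on its side.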
https://github.com/gioele/rack-i18n_best_langs
> Differently from other similar Rack middleware components, rack-i18n_best_langs returns a list of languages in order of guessed importance, not a single language.
> Language discovery is done using three clues:
> * the presences of language tags in paths (e.g. /service/warranty/ita),
> * the content of the HTTP Accept-Language header,
> * the content of the rack.i18n_best_langs cookie when set.
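The idea of returning an ordered list rather than a single language can be illustrated in a few lines. To be clear, this is not how the gem works internally: the function `best_langs` is hypothetical, and the priority order used here (path first, then cookie, then header) is my own assumption, not something the README excerpt states.

```python
def best_langs(path_lang, cookie_langs, header_langs):
    """Merge three clues into one de-duplicated list, most important first.

    Assumed priority (hypothetical): path tag, then cookie, then header.
    """
    candidates = ([path_lang] if path_lang else []) \
        + list(cookie_langs) + list(header_langs)
    ordered = []
    seen = set()
    for lang in candidates:
        key = lang.lower()
        if key not in seen:  # keep only the first, highest-priority mention
            seen.add(key)
            ordered.append(key)
    return ordered


# Path /service/warranty/ita, cookie says "en", browser sends "sv, en".
print(best_langs("ita", ["en"], ["sv", "en"]))  # ['ita', 'en', 'sv']
```

The application can then walk the list and serve the first language it actually has content for.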
I needed to create a Yahoo account, and I registered it selecting the kimo.com domain (kimo.com is a Chinese domain owned by Yahoo). From the very first moment I set my language preferences to English.
No matter which Yahoo service I'm visiting, I always get welcomed by at least the login prompt in Chinese. I can't really complain, because I was the one who looked for a rare domain, but it's an annoyance for me that Yahoo assumes I understand Chinese because of the domain.
The MS C# documentation deserves a special kind of hell, because they detected I reverted the language to English, and now they present me a special translation mode of their freaking docs where there are huge tooltip texts everywhere.
(I'm using mono on a mac, and I'm not really doing important stuff).
The problem is that en-US is the default and I can't tell the difference between a user not setting a language and a user choosing en-US.
Add "en" or even "en-GB" to your Accept-Language header.
(I don't have an accept-language for Dutch, Italian, German, French but in all these cases I was shown the local page)