Hi!
A short time ago, I decided to build an API that guesses the gender of a first name. I thought this might be useful for segmenting user lists for campaigning, analytics or similar.
My first approach was to use a dataset of approved names from a few European countries. This was in the belief that most countries had lists like this (which they don't), and I planned to add them as I went along. I got wiser, and the first feedback I got also told me that the API should be able to make probabilistic guesses and, if possible, offer some sort of localization filter to achieve more accurate guesses.
I decided to switch to large, growing datasets of user profiles from social networks, where each entry contains a first name, a gender, a country_id and a language_id. Finally, I exposed this data model through http://genderize.io
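To make the idea concrete, here is a minimal sketch in Python of how a probabilistic guess with an optional country filter could be computed from entries like these. It is only an illustration, not the actual code behind the API: the Profile type, the guess_gender function and the sample rows are all made up for this example.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Profile(NamedTuple):
    # Hypothetical record shape mirroring the fields described above.
    first_name: str
    gender: str
    country_id: str
    language_id: str


def guess_gender(profiles, name: str, country_id: Optional[str] = None):
    """Return (gender, probability, sample_count) for a first name.

    The probability is the share of matching profiles that have the
    most common gender. Returns (None, 0.0, 0) when nothing matches.
    """
    wanted = name.strip().lower()
    counts = Counter(
        p.gender
        for p in profiles
        if p.first_name.lower() == wanted
        and (country_id is None or p.country_id == country_id)
    )
    total = sum(counts.values())
    if total == 0:
        return None, 0.0, 0
    gender, hits = counts.most_common(1)[0]
    return gender, hits / total, total


# Invented sample rows, for illustration only.
sample = [
    Profile("Robin", "male", "SE", "sv"),
    Profile("Robin", "male", "SE", "sv"),
    Profile("Robin", "male", "US", "en"),
    Profile("Robin", "female", "US", "en"),
    Profile("Robin", "female", "US", "en"),
]

print(guess_gender(sample, "robin"))
print(guess_gender(sample, "robin", country_id="US"))
```

With these invented rows the unfiltered guess and the US-only guess come out differently, which is the point of the localization filter: the same name can lean one way overall and another way within a single country.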
It responds in JSON. Simple example: http://api.genderize.io?name=robin
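For anyone who wants to try it from code, a request could look roughly like the sketch below. It uses only the Python standard library and the example URL above; the helper names build_url and fetch_guess are made up here, and the code makes no assumption about which keys the JSON contains, it just returns the parsed object.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the example above.
API_URL = "http://api.genderize.io"


def build_url(name: str) -> str:
    """Build the request URL for a first name (hypothetical helper)."""
    return API_URL + "?" + urlencode({"name": name})


def fetch_guess(name: str, timeout: float = 10.0):
    """Call the API and return the decoded JSON response as-is."""
    with urlopen(build_url(name), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Only the URL is printed here, so the snippet runs without network access.
# To make a real request: print(fetch_guess("robin"))
print(build_url("robin"))
```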
I am now looking for some feedback on my new approach. What do you think of this way of making guesses? What do you think of the API? Any feedback is welcome.
The API is completely free, by the way.