It guessed early 70s.
It's written to guess early/mid/late subsets of ten-year ranges, and it actually does have some data for kids/teens. Mostly it has fictional characters from movies, books and television, but some singers snuck in there when I had my family fill out Excel worksheets, heh!
Any advice on how I could collect this kind of data? A survey on Ask HN? Mechanical Turk?
EDIT: The only features it uses are the fictional names someone can rattle off in one sitting, with their specific age as the label. No gender or other demographic information. So that's all I would collect: a list of the fictional names a submitter can think of in one sitting, plus their age. It's basically the supervised topic classification seen in other ML tutorials, but with age as the topic label. I'm experimenting with enriching the data afterwards with media flags (book/movie/TV) to see if that feature improves its performance, but I'm teaching a class this week and don't have any spare time to work on it.
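
To make the setup concrete, here is a minimal sketch, not the actual code. It assumes a plain naive Bayes model over a bag of names, and it assumes early/mid/late means last digits 0-3, 4-6 and 7-9. The names `age_bucket` and `NameBagClassifier` are hypothetical, and the three sample submissions are made up purely for illustration.

```python
import math
from collections import Counter, defaultdict


def age_bucket(age):
    """Map an exact age to an early/mid/late slice of its decade (assumed cutoffs)."""
    decade = (age // 10) * 10
    last_digit = age % 10
    part = "early" if last_digit <= 3 else "mid" if last_digit <= 6 else "late"
    return f"{part} {decade}s"


class NameBagClassifier:
    """Naive Bayes over a bag of names, with add-one smoothing."""

    def __init__(self):
        self.label_counts = Counter()            # submissions seen per age bucket
        self.name_counts = defaultdict(Counter)  # bucket -> name -> count
        self.vocab = set()                       # every name seen in training

    @staticmethod
    def _normalize(name):
        return name.strip().lower()

    def fit(self, submissions):
        """submissions: iterable of (list_of_names, exact_age) pairs."""
        for names, age in submissions:
            label = age_bucket(age)
            self.label_counts[label] += 1
            for name in names:
                key = self._normalize(name)
                self.name_counts[label][key] += 1
                self.vocab.add(key)
        return self

    def predict(self, names):
        """Return the most likely age bucket, or None if nothing was trained."""
        total = sum(self.label_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior of the bucket
            denom = sum(self.name_counts[label].values()) + vocab_size
            for name in names:
                key = self._normalize(name)
                if key not in self.vocab:
                    continue  # names never seen in training carry no signal
                score += math.log((self.name_counts[label][key] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy submissions, invented for illustration only: (names listed in one sitting, age).
toy_submissions = [
    (["Luke Skywalker", "Indiana Jones", "Ferris Bueller"], 52),
    (["Harry Potter", "Katniss Everdeen", "Shrek"], 24),
    (["Captain Kirk", "Archie Bunker", "Columbo"], 71),
]

model = NameBagClassifier().fit(toy_submissions)
print(model.predict(["Columbo", "Captain Kirk"]))
```

Each submission is just one row of names plus an age, so a survey form or spreadsheet with those two fields would be enough to feed something like this. A media flag could later be added as an extra token per name, though whether that helps is an open question.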