As it stands, languages such as Chinese are intrinsically implicit in nature. In fact, the more adept you are at the language, the more you can express with less. If you follow the literature back a couple thousand years, the amount expressed in a few characters is absolutely astounding.
If you take the example they use at the bottom regarding wonton, it's downright criminal to map the grammar in such a hurried manner. For one, just from the romanization of wonton, the AI should be able to gauge that it's looking for 2 characters and not 1 (1 character per syllable). However, in the case of the menu, the wonton egg drop soup drops a character to save some space.
Taking a straightforward CFG (context-free grammar) approach will never result in an accurate translation. What may work well is to do multi-pass contextual analytic processing in parallel.
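To make the "1 character per syllable" point concrete, here is a rough sketch of the kind of check I have in mind. It is only a heuristic under loud assumptions: it treats each run of vowels in the romanization as one syllable, which is wrong for plenty of real words, and the function names are made up for illustration, not taken from the course or any library.

```python
import re
import unicodedata

# Assumption: one run of vowels is roughly one syllable, and one syllable
# is one character. This is a crude heuristic, not a real segmenter.
VOWEL_RUN = re.compile(r"[aeiou]+", re.IGNORECASE)


def strip_marks(text: str) -> str:
    """Remove tone marks and other combining accents from a romanization."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def estimate_character_count(romanized: str) -> int:
    """Estimate how many characters a romanized word should map to."""
    return len(VOWEL_RUN.findall(strip_marks(romanized)))


def looks_abbreviated(romanized: str, observed: str) -> bool:
    """True if the menu shows fewer characters than the romanization implies."""
    return len(observed) < estimate_character_count(romanized)


print(estimate_character_count("wonton"))
print(looks_abbreviated("wonton", "雲"))
```

A check like this would only be one pass among several: it flags that a menu entry showing a single character for "wonton" has probably dropped one, and a later contextual pass would have to work out which one.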
Developers of Google Translate told me that Finnish causes problems because the heavy inflection [1] in the Finnish language means a couple of orders of magnitude more translated material is needed to make the statistical approach work. At least at that time, such material was not easy to obtain. Interestingly enough, the official EU documents and meeting translations are one of the best sources of 1-to-1 translations, as they are translated into all the languages of the EU.
[1] http://en.wikipedia.org/wiki/Inflection#Uralic_languages_.28...
Other possible corpora are mostly literary, and those are of course subject to significant rewriting for stylistic reasons.
For example, the interfaces and processors are all very clearly defined and separated in those diagrams. Unfortunately, natural intelligence does not seem to work in the same way. The inputs to a real human do not get processed in the same places, even when they might be coming from the same sensor. Obviously the patellar reflex doesn't make it past the spinal cord, and I've never actually believed that the spectrum of intelligent behaviors can be sorted into "conscious" or "unconscious" categories, by including some sort of wet Boolean or whatever.
We could think of the brain's implementation as the sum of its internal and external interfaces, but how the hell would we model that without involving unreasonable error margins?
I believe the source of your disappointment is a matter of overall expectation about what AI research is intended for. The objective of AI is not to create intelligent beings; it's to model and create programs that solve narrowly predefined problems. AI as a field is not at all identical to AGI (artificial general intelligence). Whenever you're talking about brains or things like "common sense", that's AGI. Over the years, though, AI research has produced many good models for single components of our mental subsystems. But researchers have not actually concerned themselves with AGI until very recently.
http://dl.dropbox.com/u/5137/pdf-link/StanfordAI-UnitOne.pdf
Do you think you will be doing this more?
I would be incredibly appreciative if you were to set up a mailing list! :D
It's only an idea ;)
'Ooops
Our servers are off having a quick coffee break. Wait a second and refresh the page. If you still get this message, we apologize and ask that you try again a little later.'
EDIT: It seems that it isn't just that and the site is just flaky (overloaded, I guess).
Update: Now the videos are saying I need Flash 9 (the intro previously worked). Bizarre. I just went to YouTube to watch the videos. Unfortunately, they are not organized well or queued, so searching for them is a pain.
This page at least has them all easily accessible: http://www.youtube.com/user/knowitvideos#p/u
Regarding the lecture notes: is there a wiki we can all contribute to? Earlier today, the Google Doc was complaining about too many people editing the document.