What would be interesting here would be a difference analysis or regression giving the preference for any given key in a given language. E.g.: '|' is highly predictive of shell, '$' of perl, '()' for lisp. Might be fun to do in R.
I really need to do reading and research on this, but I'm pretty sure that's what Hidden Markov Models are for. You could watch a webpage go from HTML to javascript and back!