It seems to be a common pattern. When I ran it with the argument "A Einstein", I got "On the electrodynamics of moving bodies." as one of the results.
Edit: Also, for "P Erdos", I got "On a new law of large numbers."
If you’re using a small corpus and long Markov chains, you’ll end up with lots of actual strings from the corpus, and no fake ones. If this happens, experiment with the second parameter of the “MarkovGenerator” constructor.
For the authors you are using, the corpus is too small.
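To make the small-corpus effect concrete, here is a minimal word-level sketch. It assumes the knob that matters is the chain order (how many previous words the next word depends on). The names `build_model`, `generate` and `novel_fraction` are my own for illustration and are not the MarkovGenerator API. The first two titles come from this thread; the third is a made-up toy title.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"

# Toy corpus: two titles quoted in this thread plus one invented title.
TITLES = [
    "On the electrodynamics of moving bodies",
    "On a new law of large numbers",
    "On the theory of large bodies",  # made up for illustration
]

def build_model(titles, order):
    """Map each tuple of `order` consecutive words to the words seen after it."""
    model = defaultdict(list)
    for title in titles:
        words = [START] * order + title.split() + [END]
        for i in range(len(words) - order):
            model[tuple(words[i:i + order])].append(words[i + order])
    return model

def generate(model, order, rng, max_words=30):
    """Walk the chain from the start state until END (order must be >= 1)."""
    state = (START,) * order
    out = []
    while len(out) < max_words:
        nxt = rng.choice(model[state])
        if nxt == END:
            break
        out.append(nxt)
        state = state[1:] + (nxt,)
    return " ".join(out)

def novel_fraction(titles, order, samples=200, seed=0):
    """Fraction of generated strings that are NOT verbatim corpus titles."""
    rng = random.Random(seed)
    model = build_model(titles, order)
    corpus = set(titles)
    generated = [generate(model, order, rng) for _ in range(samples)]
    return sum(g not in corpus for g in generated) / samples

print("order 1:", novel_fraction(TITLES, 1))
print("order 3:", novel_fraction(TITLES, 3))  # 0.0: every sample is a corpus title
```

In this toy corpus no three-word window is shared between titles once they diverge, so at order 3 the chain can only replay a title it has already seen. At order 1 the shared words ("the", "of", "large") act as junctions, and the generator can splice titles together.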
As someone who isn't a high-energy physicist, I find it surprisingly hard! I usually do _worse_ than random chance, sometimes substantially so.
The author also has perhaps my favorite definition of a CFG: "The snarXiv is based on a context free grammar (CFG) — basically a set of rules for computer-generated mad libs."
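For anyone who wants to see the "mad libs" idea in code, here is a rough sketch of expanding a context-free grammar at random. The grammar and vocabulary below are invented for illustration and are not taken from the snarXiv source; `GRAMMAR` and `expand` are hypothetical names.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to a list of productions,
# and each production is a list of nonterminals and literal words.
GRAMMAR = {
    "TITLE": [
        ["ADJ", "NOUN", "in", "SETTING"],
        ["On", "the", "NOUN", "of", "NOUN"],
    ],
    "ADJ": [["Holographic"], ["Non-perturbative"]],
    # The third production is recursive: a NOUN may contain another NOUN.
    "NOUN": [["Instantons"], ["Branes"], ["ADJ", "NOUN"]],
    "SETTING": [["Flat", "Space"], ["Two", "Dimensions"]],
}

def expand(symbol, rng):
    """Recursively replace nonterminals with a randomly chosen production."""
    if symbol not in GRAMMAR:
        return [symbol]  # literal word: keep as-is
    production = rng.choice(GRAMMAR[symbol])
    words = []
    for part in production:
        words.extend(expand(part, rng))
    return words

rng = random.Random()
for _ in range(3):
    print(" ".join(expand("TITLE", rng)))
```

Each rule is a blank in the mad lib, and filling a blank can open up more blanks. Unlike a Markov chain, the output always follows the sentence templates, so it stays grammatical even with a tiny vocabulary.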
Reminds me of my own playing with Markov chains: http://williamedwardscoder.tumblr.com/post/13292744100/the-s...
I wrote a "Lorem ipsum" replacement based on Markov chains, with some public-domain books as the corpus: http://wordum.net