The Python byte-compiler, specifically, appears to be fairly slow. I hadn't really thought much about this, but while contributing to a benchmark yesterday (http://news.ycombinator.com/item?id=1800396), it wound up staring me in the face. There's actually surprisingly little difference performance-wise in running Lua from source vs. precompiled (both pretty fast), whereas the difference between Python and pyc in my benchmark was wider than every other possible pair in the chart except python interpreted vs. "echo Hello World". (I didn't have any JVM languages, though.)
I don't really do eval-based metaprogramming in Python, but do so on occasion in Lua. I thought I felt better about doing so because Lua is syntactically much simpler (and has scoping rules that make avoiding unexpected variable capture easy), but the Lua compiler itself also appears to be substantially faster than Python's. (It doesn't do much analysis, but still usually runs faster than Python.)
And yes, it's not as good as straight-up Lisp macros, but Lua's reflection also covers a lot of low-hanging fruit that macros would otherwise handle. The biggest thing lacking in Lua compared to Lisp is an explicit compile-time phase for static metaprogramming. (Code generation is an inferior alternative.) Lisp macros win big in part because they can avoid the overhead of parsing, but parsing Lua is fairly cheap thanks to its small, LL(1) grammar (http://www.lua.org/manual/5.1/manual.html#8).