Depending on how you're measuring it, I believe it could be because urxvt's output is buffered by default whereas xterm's isn't. That may mean 16.7ms (= 1/60Hz, i.e. one frame at 60Hz) is added to the measured latency, though the extra time would have no effect in practice.
ed: or maybe that's excluded from the measurement, but urxvt is double-buffered by default?
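
For reference, a quick sketch of where the 16.7ms figure comes from. The function names are made up for illustration, and the "one extra frame" cost is just my guess above about how buffering would show up in a measurement, not something I've verified for urxvt:

```python
def frame_interval_ms(refresh_hz: float) -> float:
    """Time between two screen refreshes, in milliseconds."""
    return 1000.0 / refresh_hz


def buffered_latency_ms(base_ms: float, refresh_hz: float = 60.0, frames: int = 1) -> float:
    """Measured latency if buffering delays output by `frames` refreshes.

    Assumption: buffering costs exactly `frames` whole refresh intervals.
    """
    return base_ms + frames * frame_interval_ms(refresh_hz)


if __name__ == "__main__":
    # One frame at 60Hz, rounded to one decimal place.
    print(round(frame_interval_ms(60), 1))
    # A hypothetical 5ms unbuffered latency with one buffered frame added.
    print(round(buffered_latency_ms(5.0), 1))
```

On a higher refresh rate display the same buffering would add less, e.g. about 6.9ms at 144Hz.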