I’m sure there are better ways. But this is something we’re trying, and in the early stages it seems to be working well for us. YMMV.
As for the metrics: in the studies I have seen, the effect pair programming has on them depends on the complexity of the task.
For simple tasks, pairs don’t complete work significantly faster or with significantly fewer defects (so you’re essentially spending 2x effort for the same result). But for complex tasks, pairs tend to complete the work slightly more quickly and with significantly fewer defects (so you’re spending nearly 2x effort for a better result).
Given that our team was taking a long time going back and forth in this code review stage, we decided it would be worth expending the effort up front to produce more correct code and (hopefully) reduce the time we were spending ensuring the code was correct after the fact.
I’m not sure that this will increase our team’s velocity, but I think it might increase morale. Nothing sucks harder than finishing a task and then being told it’s not good enough.