Not faster than dealing directly with ucontext from what I've seen; many wrap it directly and the rest tend to emulate using signals, setjmp and prayers.
ucontext modifies the signal mask, which requires a syscall. That’s very expensive (as shown by the Boost benchmark above) and usually unnecessary. “Emulate” is a rather negative-sounding way to describe running effectively the same code as ucontext does (which also typically happens to be the same as sigsetjmp) - it’s not like ucontext has some privileged permissions. It’s “just” a context switch.
I was under the impression that Boost Context is more than that, or at least that's what the amount of assembly code tells me, but I'll be the first to admit I don't have much patience for deciphering modern C++. I get that it's possible to manually invoke the same functionality as ucontext minus the signal mask handling, but I'm not convinced it would save enough cycles to pay for the added complexity.