I'd expect vsync-to-host to be the default, though something more like triple buffering should also work: the host atomically grabs the latest completed buffer and composites just in time, so that compositing finishes before the RAMDAC starts scanning out the first line. So roughly: grab at the start of vertical blank, spend vblank compositing, and get no tearing once scanout resumes.
In total that's on the order of a millisecond of additional latency if you pass the vsync timing alignment down, and no variable refresh rate.
But variable refresh rate is barely compatible with compositing in general anyway. Any technique for handling a VFR video stream in a window should also apply to the uncompressed, shared-memory Looking Glass stream.
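
To make the triple-buffering idea concrete, here is a minimal sketch of the "atomically grab the latest completed buffer" part. This is only an illustration under my own assumptions, not how Looking Glass actually does it: the class name `TripleBuffer` and the methods `publish` and `acquire_latest` are made up, a lock stands in for the atomic index swap, and frames are plain Python objects rather than shared-memory buffers.

```python
import threading


class TripleBuffer:
    """Hypothetical three-slot mailbox: one slot being written by the
    producer, one holding the latest completed frame, and one being
    read by the compositor."""

    def __init__(self):
        self._slots = [None, None, None]
        self._write = 0       # slot the producer fills next
        self._ready = 1       # slot holding the latest completed frame
        self._read = 2        # slot the consumer is compositing from
        self._fresh = False   # True when _ready is newer than _read
        self._lock = threading.Lock()  # stands in for an atomic swap

    def publish(self, frame):
        """Producer side: store a finished frame and mark it as latest."""
        # Only the producer touches the write slot, so no lock is needed here.
        self._slots[self._write] = frame
        with self._lock:
            self._write, self._ready = self._ready, self._write
            self._fresh = True

    def acquire_latest(self):
        """Consumer side: call at the start of vblank. Returns the newest
        completed frame, or the previous one if nothing new arrived."""
        with self._lock:
            if self._fresh:
                self._read, self._ready = self._ready, self._read
                self._fresh = False
        return self._slots[self._read]
```

The point of the three slots is that the producer never waits on the consumer and never writes into the slot being composited: frames published between two vblanks simply overwrite each other, and the consumer only ever sees the most recent complete one.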