I suggest experimenting with cyclictest from rt-tests. On all hardware I've tried, I see latency peaks of 30ms or more after running it in the background for only a short while. I can't comprehend how anybody could find this acceptable.
I do run linux-rt for this reason. Then again, while linux-rt provides the tools to make latency reasonable, the rest of the system hardly uses them.
As we move from the likes of Linux to better-architected systems, potentially based on seL4, I do hope the responsiveness will return to sanity. Until then, I'll have to keep going back to my Amiga hardware as a coping mechanism.
Because Linux is primarily funded by server companies, and servers are optimized almost exclusively for throughput?
But yeah, the point is clear: Current, popular desktop systems are pretty bad at responsiveness.
Basically everything is tuned for running web apps with loads of procs for people who don't really care about latency of 100s of millis.
Doing this decreases certain types of latency in certain situations. As an example, it tries to have interrupts disabled less frequently and for shorter intervals, and uses mutexes instead of spinlocks.
As a result, using linux-rt can provide a lower-latency experience compared to plain Linux.
Is extremely desirable. Those multi-ms peaks of latency Linux has are the ones that cause audio cuts and perceived hiccups.
Of course it doesn't matter perceptually if the average is 1µs or 5µs. It's all about the peaks, and keeping them bounded enough so that latency never crosses the perceptual threshold.
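To see the gap between average and peak on your own machine, here is a rough user-space sketch of the kind of measurement cyclictest makes. It is not cyclictest: it runs at normal priority, and the Python interpreter adds jitter of its own, so treat the numbers as an upper bound on what the kernel is doing. The function names, the 1ms interval and the iteration count are arbitrary choices for illustration.

```python
import time


def measure_wakeup_latency(interval_s=0.001, iterations=200):
    """Sleep until successive absolute deadlines and record how late
    each wake-up was, in microseconds."""
    latencies_us = []
    next_deadline = time.monotonic() + interval_s
    for _ in range(iterations):
        delay = next_deadline - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        # Lateness relative to the deadline we asked to be woken at.
        late_s = time.monotonic() - next_deadline
        latencies_us.append(max(late_s, 0.0) * 1e6)
        next_deadline += interval_s
    return latencies_us


def summarize(latencies_us):
    """Return the average and the worst-case (peak) latency."""
    return {
        "avg_us": sum(latencies_us) / len(latencies_us),
        "max_us": max(latencies_us),
    }


if __name__ == "__main__":
    stats = summarize(measure_wakeup_latency())
    print(f"avg {stats['avg_us']:.0f} us, max {stats['max_us']:.0f} us")
```

The average usually looks harmless; it is the `max_us` figure, especially under background load, that corresponds to the audible or visible hiccups.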
On Windows, DWM's display compositing adds one frame of latency to every window on screen. It's not possible to render a dragged object in any window that sticks to the mouse cursor without at least one frame of latency.
But when you drag whole windows around they do stick to the mouse cursor with apparently zero frames of latency; how does DWM do it? Easy, they cheat by disabling the hardware mouse overlay during window dragging so that the mouse cursor gets that extra frame of latency too. You can prove this by enabling "Night light" in settings; watch the mouse cursor change colors as it transitions from hardware overlay to software rendering when you start dragging a window.
I had to use a monitor running at 30Hz for a while (4K over HDMI 1.4) and while that was bad enough, the compositor’s lag meant all window contents had an extra (unnecessary IMO) delay of 33ms. Add on to that normal monitor input lag.
We’ll probably all shift to 120Hz w/ variable-rate refreshing as a new baseline standard over the next 10 years, as Apple seems to be heading in that direction. At 120Hz the lag of the compositor would be acceptable, but I’m worried that lazy graphics devs are going to use that as an excuse to add another frame of latency...
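For reference, the arithmetic behind those numbers as a small sketch. The function names are made up for illustration; the only assumption is that each extra compositor frame costs exactly one refresh period.

```python
def frame_period_ms(refresh_hz):
    """Duration of one display refresh, in milliseconds."""
    return 1000.0 / refresh_hz


def compositor_delay_ms(refresh_hz, extra_frames=1):
    """Delay added by buffering `extra_frames` whole frames."""
    return extra_frames * frame_period_ms(refresh_hz)


if __name__ == "__main__":
    for hz in (30, 60, 120):
        one = compositor_delay_ms(hz, 1)
        two = compositor_delay_ms(hz, 2)
        print(f"{hz} Hz: one frame = {one:.1f} ms, two frames = {two:.1f} ms")
```

Two frames of latency at 120Hz cost the same as one frame at 60Hz, so an extra buffered frame would give back everything the faster display gained.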
Yes. This concept is called hardware overlays and there are varying levels of support for it in different GPUs and compositors.
There are tradeoffs. Using multiple hardware overlays may cost extra power and/or memory bandwidth, the number of supported overlays may be very limited, alpha blending may not be supported, and the transforms that can be applied to overlays may be very limited. The extremely hardware specific nature of the restrictions and the lack of good APIs exposing overlays means they get much less use than they should.
AFAIK you can bypass this by using the DXGI flip model so no additional latency is incurred. There's still going to be 1 frame of latency from the vsync though.
>You can prove this by enabling "Night light" in settings; watch the mouse cursor change colors as it transitions from hardware overlay to software rendering when you start dragging a window.
Can't reproduce on my end. Maybe they updated the night light implementation so the hardware cursor is tinted as well.
Using the flip model only eliminates the latency if DWM promotes your window to a hardware overlay. On Nvidia systems this is simply not supported, so the latency is always there and it's impossible to get rid of it. Maybe DWM supports overlays on Intel or AMD, I'm not sure. It would be interesting for someone to test this.
> There's still going to be 1 frame of latency from the vsync though.
Vsync does not inherently require any extra latency. You can render as close to vsync as you like to reduce the latency an arbitrary amount. That's what VR compositors do. All you need to do is ensure you can't flip during scanout and you can't get tearing.
I thought this was a fact of all window managers?
I'd noticed it when making games in SDL / SDL2 on Linux and just assumed it was because the X server couldn't possibly wait on me to paint a frame before updating its own cursor
Any application can do it, which is why you get a laggy cursor in some games that opt to draw their own cursor.
https://www.forrestthewoods.com/blog/memory-bandwidth-napkin...
It can smooth things a bit but it's not that good a substitute for actually improving latency.
(There are probably consequences for coronavirus charts as well, since they're based on lagging data.)
Ultimately you want the lowest possible latency and prediction, because you can never get the latency to zero. Once the latency is small enough, prediction becomes a net win. For example, all VR devices do prediction for head and hand positions after lowering latency as much as possible elsewhere.
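As a minimal sketch of what "prediction" means here, assuming the simplest possible model (constant velocity between the last two samples). This is not how any particular VR runtime or the linked demo does it; `predict_linear` and its parameters are hypothetical names for illustration.

```python
def predict_linear(prev, curr, dt, lookahead):
    """Extrapolate an (x, y) position `lookahead` seconds ahead.

    prev, curr: the two most recent samples, taken `dt` seconds apart.
    Assumes the velocity seen between them stays constant.
    """
    vx = (curr[0] - prev[0]) / dt
    vy = (curr[1] - prev[1]) / dt
    return (curr[0] + vx * lookahead, curr[1] + vy * lookahead)


if __name__ == "__main__":
    # Pointer moved from (0, 0) to (10, 5) in 10 ms; guess where it
    # will be 20 ms from now.
    print(predict_linear((0.0, 0.0), (10.0, 5.0), 0.010, 0.020))
```

The error of this guess grows with the lookahead, which is why it only becomes a net win once the latency being hidden is already small.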
https://rauchg.com/2014/7-principles-of-rich-web-application...
In the limit I guess this boils down to "do no prediction" (which I also suppose is what the linked site's conclusion is).
I've been doing some WebGL work recently and I've noticed that while it reaches ~144 fps using requestAnimationFrame() in Firefox, there's a lot of stuttering. It's very smooth at 144 fps in Edge Chromium, while Edge Legacy is below 80 fps. As far as I can tell it's not CPU bound, and it's definitely not GPU bound. It would be nice if I could get it running smoothly in Firefox but I don't know what to investigate.
~10ms on Firefox, Linux, 60hz display.
that's definitely not what I am observing (https://streamable.com/9u4cpx). Enabling the predictive tracking, however, is quite nauseating especially in circular motions. Please don't play with your users' cursors !
> predictive tracking will feel much worse than direct (technically lagging) tracking when there is no system cursors to match.
Additionally, we can see the lag between the red box and your cursor in the video of your screen that you've uploaded.
But I noticed that if I rendered the effect with motion blur, it suddenly started to feel much smoother, and the perception of jerkiness was mostly gone. I felt that it completely restored my sense of control of the motion.
It’s surprising considering that motion blur actually adds one half frame of extra latency.
Since trying this, I’m bothered by how jerky fast mouse movement always feels in macOS. 60 fps leaves these enormous, ugly gaps between the pointer at each frame, and makes it hard to perceive the motion correctly. I can’t unsee these gaps now! I’m convinced that system-wide motion blur just for the pointer would be a simple way to make the whole OS feel much smoother and more responsive.
Then, going back to a 60hz display, I couldn't NOT see the gaps left by the cursor's movement. I had never before seen this as a problem, but seeing something better ruined 60hz for me.
What kind of algorithm could be used to improve the accuracy for curves?
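One possible answer, offered as a sketch rather than as what the demo does: fit a parabola through the last three evenly spaced samples and extrapolate along it, so that the change in velocity (the curvature) is carried forward too. `predict_quadratic` is a hypothetical name; the formula is ordinary second-order backward-difference extrapolation.

```python
def predict_quadratic(p0, p1, p2, dt, lookahead):
    """Extrapolate `lookahead` seconds past p2 along the parabola
    through three (x, y) samples taken `dt` seconds apart
    (p0 oldest, p2 newest)."""
    s = lookahead / dt  # lookahead measured in sample intervals

    def extrapolate(a, b, c):
        first_diff = c - b            # latest per-sample movement
        second_diff = c - 2 * b + a   # change in that movement
        return c + s * first_diff + 0.5 * s * (s + 1) * second_diff

    return (extrapolate(p0[0], p1[0], p2[0]),
            extrapolate(p0[1], p1[1], p2[1]))


if __name__ == "__main__":
    # Samples on the curve y = x^2; the next point on it is (3, 9).
    # A straight-line guess from the last two samples would give (3, 7).
    print(predict_quadratic((0.0, 0.0), (1.0, 1.0), (2.0, 4.0), 1.0, 1.0))
```

The catch is that higher-order extrapolation also amplifies noise in the samples, so in practice it would probably need smoothing or a short lookahead to avoid the overshoot people complain about in circular motions.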
This assumes compositors perform their work right after each display refresh. Compositors can decide to perform their work later, some amount of time before the next display refresh (e.g. a few milliseconds). This reduces latency because the new buffers submitted by clients (such as web browsers) can be displayed with less than 1 refresh period worth of latency. For instance, the browser can update its buffer at last display refresh + 8ms, then the compositor can composite at last display refresh + 13ms, and the new frame can be displayed at last display refresh + 16ms.
Here's for instance how Weston does it: [1]. Sway has a similar feature.
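The timing example above can be worked through with a small model. This is a sketch of the arithmetic only, not Weston's or Sway's code. It assumes a 60Hz display, that the compositor composites once per refresh cycle at a fixed offset after the refresh, and that compositing always finishes before the next refresh; all names are hypothetical.

```python
REFRESH_MS = 1000.0 / 60  # assumed 60 Hz display


def display_latency_ms(client_submit_ms, composite_offset_ms,
                       refresh_ms=REFRESH_MS):
    """Time from a client submitting a buffer until it is on screen.

    Times are offsets from the last display refresh. The compositor
    runs at `composite_offset_ms` after every refresh and picks up any
    buffer submitted by then; its output is shown at the next refresh.
    """
    cycle = 0
    # Find the first compositing pass that happens after the submission.
    while cycle * refresh_ms + composite_offset_ms < client_submit_ms:
        cycle += 1
    shown_at_ms = (cycle + 1) * refresh_ms
    return shown_at_ms - client_submit_ms


if __name__ == "__main__":
    early = display_latency_ms(8.0, 0.0)   # composite right after refresh
    late = display_latency_ms(8.0, 13.0)   # composite at refresh + 13 ms
    print(f"composite at +0 ms:  {early:.1f} ms until displayed")
    print(f"composite at +13 ms: {late:.1f} ms until displayed")
```

With the buffer submitted at +8ms, compositing right after the refresh misses it and costs a whole extra cycle, while compositing at +13ms gets it on screen at the very next refresh.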
>However since pointing with a cursor is such a core experience in these OS'es, the "screen compositor" usually have special code to draw the cursor on screen as late as possible—as close in time to an actual display refresh as possible—to be able to use the most recent position data from the input device driver.
That's not entirely true. Nowadays all GPUs have a feature called "cursor plane". This allows the compositor to configure the cursor directly in the hardware and to avoid drawing it. So when the user just moves the mouse around the compositor doesn't need to redraw anything, all it needs to do is update the cursor position in the hardware registers.
Compositors don't have code to draw the cursor as late as possible. Instead, they program the cursor position when drawing a new frame. (On some hardware this allows the compositor to "fixup" the cursor position in case some input events happen after drawing and before the display refresh.)
But in the end, all of this doesn't really matter. What matters is that the app draws before the compositor draws, thus the compositor will have a more up-to-date cursor position.
[1]: https://ppaalanen.blogspot.com/2015/02/weston-repaint-schedu...
1. The predictive checkbox improved tracking my cursor.
2. Disabling `requestAnimationFrame` improved it more.
This is not what I'd have expected, so I'll include details about my environment:
- macOS 10.15.4
- Safari 13.1
- 2019 16" MBP with maxed RAM and ~25GB swap
I have no idea whether the browser or the memory pressure made same-thread tracking more accurate, but something did.