Application/Compositor Synchronization

This blog entry continues an extended series of posts over the last couple of years. Older entries:

Frames not Idles (May 2009)
Timing Frame Display (June 2009)
What to do if you can’t do 60fps (June 2011)
Frame Timing the Simple Way (June 2011)

What we figured out in the last post was that if you can’t animate at 60fps, then from the point of view of achieving a smooth display, a very good thing to do is to just animate as fast as you can while still giving the compositor time to redraw. The process is represented visually below.

The top section shows a timeline of activity for the Application, Compositor, and X server. At the bottom, we show the contents of the application’s window pixmap, the back buffer, and the front buffer as time progresses. From this, we can get an idea of the time between the point where a user hits a key and the point where that displays on the screen: the input latency. The keystroke C almost immediately makes its way into a new application frame, and that new frame is almost immediately drawn by the compositor into the back buffer, and the back buffer is almost immediately swapped by the X server. On the other hand, the keystroke D suffers multiple delays.

What happens if we use the same algorithm when we’re unloaded – when the total drawing time is less than the interval between screen refreshes? Then it looks like:

This is basically working pretty well, but even though the application is drawing quickly and the entire system is unloaded, we still have a lot of input latency. If we plot the latency versus the application drawing time it looks like:

The shaded area shows the theoretical range of latencies, the solid line the theoretical average latency, and the points show min/max/avg latencies as measured in a simulation. (It should be mentioned that this is only the latency when we’re continually drawing frames. An isolated frame won’t have any problems with previously queued frames, so will appear on the screen with minimal latency.)

We could potentially improve this by having the application delay rendering a new frame – the compositor can use the time used to render the last frame to make a guess as to a “deadline” – a time by which the application needs to have the frame rendered. We can again look at a timeline plot and simulated latencies for this algorithm:
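A minimal sketch of the deadline idea, in Python. The function names, the millisecond units, and the safety margin are assumptions made for illustration; nothing here is an existing compositor API, and a real compositor might estimate the deadline differently.

```python
def frame_deadline(next_vblank_ms, last_composite_ms):
    """Compositor side: guess the latest time the application can finish.

    The compositor assumes its next redraw will take about as long as the
    last one (last_composite_ms) and must be done by next_vblank_ms.
    """
    return next_vblank_ms - last_composite_ms


def delayed_start(deadline_ms, last_draw_ms, margin_ms=1.0):
    """Application side: start drawing as late as seems safe.

    Assumes the next frame costs about as much as the previous one;
    margin_ms is a hypothetical cushion against guessing wrong.
    """
    return deadline_ms - last_draw_ms - margin_ms


# Next vblank at t=100ms, compositor needed 2ms, application needed 4ms.
deadline = frame_deadline(100.0, 2.0)
start = delayed_start(deadline, 4.0)
```

Input that arrives before `start` still makes it into the frame, which is where the latency improvement would come from.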

There are downsides to delaying frame rendering – the obvious one is that if we guess wrong and the application starts the frame too late, then we can entirely miss a frame. From a smoothness point of view this looks really bad. In general, an application should only use a deadline provided by the compositor if it has reason to believe that the next frame is roughly similar to the previous one. Another disadvantage is that the delay algorithm causes a frame-rate cliff as soon as the time to draw a frame exceeds the vblank period – there is an instant drop from 60fps to 30fps.
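The cliff can be illustrated with a little arithmetic. This sketch assumes a simplified model in which a frame that doesn't fit in one vblank period occupies a whole number of periods; `effective_fps` is a hypothetical helper, not part of any toolkit.

```python
import math


def effective_fps(frame_ms, refresh_hz=60.0):
    """Frame rate when each frame must land on a vblank boundary.

    A frame taking frame_ms occupies ceil(frame_ms / period) refresh
    periods, so the rate drops in steps: 60, 30, 20, 15, ...
    """
    period_ms = 1000.0 / refresh_hz
    periods = max(1, math.ceil(frame_ms / period_ms))
    return refresh_hz / periods


fits = effective_fps(16.0)         # just under one period at 60Hz
just_missed = effective_fps(17.0)  # just over one period
```

Under this model a frame time of 16ms gives 60fps, while 17ms already drops to 30fps.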

Which of these two algorithms is better likely depends upon the application: if an application wants maximum animation smoothness and protection from glitches, drawing frames as early as possible makes sense. On the other hand, if input latency is a critical factor, as for a game or real-time music application, then delaying frame drawing as late as possible would be preferable.

So, what we want to do from the level of application/compositor synchronization is provide enough information to allow applications to implement different algorithms. After drawing a frame, the compositor should send a message to the application containing:

The expected time at which the frame will be displayed onscreen
If possible, a deadline by which the application needs to have finished drawing the next frame to get it to appear onscreen
The time at which the next frame will be displayed onscreen
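As a rough sketch, the message and the two ways an application might use it could look like the following. The class, field, and function names are hypothetical and are not taken from any specification; the point is only that the same three pieces of information support both strategies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FrameTimings:
    """Hypothetical payload of the compositor's end-of-frame message."""
    presentation_time_ms: float          # when the frame just drawn should appear
    next_presentation_time_ms: float     # when the next frame will be displayed
    deadline_ms: Optional[float] = None  # finish the next frame by this time, if known


def next_frame_start(timings, now_ms, est_draw_ms, low_latency):
    """Pick when to start drawing the next frame.

    Smoothness-first applications start right away. Latency-sensitive ones
    wait as long as the deadline allows, if the compositor supplied one.
    """
    if low_latency and timings.deadline_ms is not None:
        return max(now_ms, timings.deadline_ms - est_draw_ms)
    return now_ms


timings = FrameTimings(16.0, 32.0, deadline_ms=30.0)
early = next_frame_start(timings, now_ms=17.0, est_draw_ms=5.0, low_latency=False)
late = next_frame_start(timings, now_ms=17.0, est_draw_ms=5.0, low_latency=True)
```

When no deadline is supplied, the latency-sensitive path falls back to drawing immediately, so the deadline stays optional.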

But even without the deadline information, just having a basic response at the end of the frame already greatly improves on the current situation. I’m working on a proposal to add application/compositor synchronization to the Extended Window Manager Hints specification.

3 Comments

Alexander Chehovsky

Posted November 8, 2011 at 5:52 pm

I’m so glad to hear that someone is actually working on getting this implemented, as without synchronization compositing in X11 is largely useless due to tearing and reduced performance. Thanks for your work.
Pierce Lopez

Posted November 8, 2011 at 7:41 pm

And then there’s triple-buffering, used in some video games, in which you keep rendering frames as fast as possible to two alternating backing buffers even if you’re rendering faster than the swap interval, and the swap takes the most recently rendered of the two back buffers and swaps it with the front buffer, thus three buffers total. This of course spins the cpu and gpu and wastes a lot of power…
Ben Widawsky

Posted November 16, 2011 at 3:50 pm

This was a really excellent post that cleared up basic app/X/compositor interaction questions I had. I really appreciate your effort.

fishsoup