Intel Speed Shift
Late last year I took a detailed look at a new feature added to Skylake called Speed Shift. This technology moved control of the CPU's clock speed and power state out of the operating system's exclusive hands and into the processor itself, allowing Intel’s internal logic to dictate when and how the CPU should enter new speed states based on performance needs, temperatures, etc. The result was clock speeds that got higher, faster, giving users better responsiveness for things like touch screens, application loads and more.
As I wrote in my original piece:
It's pretty clear that Intel is targeting this feature addition for tablets and 2-in-1s where the finger/pen to screen interaction is highly reliant on immediate performance to enable improved user experiences. It has long been known that one of the biggest performance deltas between iOS from Apple and Android from Google centers on the ability for the machine to FEEL faster when doing direct interaction, regardless of how fast the background rendering of an application or web browser actually is. Intel has been on a quest to fix this problem for Android for some time, where it has the ability to influence software development, and now they are bringing that emphasis to Windows 10.
Easily the most interesting new feature in terms of power is called Intel Speed Shift Technology. This feature actually moves much of the control of P-states (performance states) from the operating system to the architecture itself. P-states are what tells the CPU to move between frequencies in order to balance performance and power consumption. In previous designs, Windows and other operating systems would perform the actual state changes. With Speed Shift, Intel is able to directly change the P-states on the processor and this results in a 30x improvement in the speed of that transition.
Here is a video I created last year explaining and demonstrating Speed Shift.
Why is this useful? First, the speed improvement in that transition should result in added “snappiness” in areas where the frequency needs to increase quickly as a result of user interaction or application need, lowering the apparent latency of some actions. Also, this gives the Skylake processors the ability to manage things like low residency workloads better. Take video recording as a good example of this type of workload. Traditionally, a CPU would increase frequency to get through a set of work as quickly as possible and return to idle as fast as possible. For applications that run consistent and repeated, but non-demanding, workloads it might be more efficient to keep the CPU at a slightly higher frequency the entire time rather than spiking up and down repeatedly. Intel Speed Shift gives Skylake that capability.
At the time, we still had the ability to test machines with and without Speed Shift enabled. That is much more difficult to do now, since Speed Shift support has been integrated into Windows 10 for some time. The graph below shows the difference between Speed Shift enabled and disabled on the Yoga 900.
Clearly, this implementation of Speed Shift gets the CPU to its top frequency in just 12ms while the previous OS-only implementation of speed states took more than 50ms.
Kaby Lake does offer some improvements to Speed Shift over Skylake, getting the 7th Generation of processors to their top speed in even less time. This improves responsiveness yet again, though not as dramatically as we saw last year.
This data shows that while both Kaby Lake and Skylake have their initial clock speed spike at about the same time (~5ms), Intel’s 7th gen part gets to its 3.5 GHz peak Turbo clock by 5.3ms or so. Skylake doesn’t reach its top speed of 3.1 GHz until the 15-17ms mark. In practice, this should result in better touch responsiveness on 2-in-1 devices and a general improvement in the “snappiness” of the system when working with bursty workloads.
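For readers who want to pull a similar "time to peak clock" number out of their own frequency logs, here is a minimal Python sketch of the calculation. It is not the tool used for the graphs in this review. The function name `time_to_peak`, the 1% tolerance, and the two sample traces are all hypothetical, with the sample points invented only so that they roughly follow the shape described above (Kaby Lake peaking at 3.5 GHz near 5.3ms, Skylake at 3.1 GHz around 16ms).

```python
from typing import Sequence, Tuple

# One sample is (time in ms since the load started, clock in GHz).
Sample = Tuple[float, float]


def time_to_peak(trace: Sequence[Sample], tolerance: float = 0.01) -> float:
    """Return the first timestamp at which the clock reaches the trace's peak.

    "Reaches" means within `tolerance` (a fraction, 1% by default) of the
    highest frequency seen anywhere in the trace. The tolerance is an
    assumption made here to absorb small sampling jitter.
    """
    if not trace:
        raise ValueError("trace must contain at least one sample")

    peak = max(freq for _, freq in trace)
    threshold = peak * (1.0 - tolerance)

    for timestamp, freq in trace:
        if freq >= threshold:
            return timestamp

    # Unreachable: the sample holding the peak always passes the threshold.
    raise RuntimeError("peak sample not found")


# Hypothetical traces, invented for illustration only (not measured data).
kaby_lake_trace = [(0.0, 0.8), (5.0, 2.4), (5.3, 3.5), (10.0, 3.5)]
skylake_trace = [(0.0, 0.8), (5.0, 2.0), (10.0, 2.6), (16.0, 3.1), (20.0, 3.1)]

if __name__ == "__main__":
    print("Kaby Lake time to peak (ms):", time_to_peak(kaby_lake_trace))
    print("Skylake time to peak (ms):", time_to_peak(skylake_trace))
```

Measuring against each trace's own maximum, rather than a fixed frequency, lets the same function compare parts with different Turbo clocks, which is the comparison being made between the 3.5 GHz and 3.1 GHz processors here.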
It is very likely that many of the benchmark results we saw on the previous pages are a direct result of the improved Speed Shift implementation on Kaby Lake. Any application that includes short bursts of work should be faster and more responsive on Intel’s latest hardware than on any previous generation of mobile product.