Intel Turbo Boost Max Technology 3.0

Getting the most out of your best core

When Intel started talking about Turbo Boost Max 3.0, I honestly assumed it was just a generic renaming of a feature that we have had on Intel processors for years, but I was wrong. TBMT3 (as I am calling it now) asks the question: how much faster could you run applications if you knew which core on your die was able to run at the highest frequency?

Overclockers already know that when you are trying to benchmark single-threaded workloads, not all cores on multi-core processors are created equal. Take a quad-core Skylake processor and try to overclock cores 1, 2, 3 and 4 independently and you’ll likely find that each one is stable at a different combination of frequency and voltage. With Broadwell-E, Intel is determining which cores are “best” during production, and building that into the processor identification that is read by the motherboard. When Intel tests dies for yield and binning, it can easily determine the threshold for each core, and therefore which one would clock the highest with the least required voltage.

So what can you do with this information? After installing a driver from Intel on your system (Windows 10, 8.1, and 7 supported) you can see your cores sorted from best to worst scalability. In my case, core 4 appears to be the top dog. Intel’s driver then assigns single-threaded workloads that exceed a specific utilization threshold to that top core, allowing it to clock higher than other cores on the processor. The Intel software is basic, but powerful, allowing you to reorder the core priority, set different utilization thresholds, and adjust counter timings.
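To make that policy concrete, here is a minimal Python sketch of the idea as described above. It is not Intel’s driver logic: the function name `pick_core`, the 0.9 default threshold, and the example priority list are all hypothetical, chosen only to illustrate “send a demanding thread to the best available core.”

```python
from typing import List, Optional, Set


def pick_core(core_priority: List[int],
              busy: Set[int],
              utilization: float,
              threshold: float = 0.9) -> Optional[int]:
    """Return the favored core for a demanding thread, or None to leave it to the OS.

    core_priority: core IDs ordered best to worst (hypothetical example data).
    busy: cores that already host a demanding thread.
    utilization: fraction of a core the thread is using, 0.0 to 1.0.
    threshold: assumed cutoff above which a thread is worth steering.
    """
    if utilization < threshold:
        return None  # light workload, no special placement
    for core in core_priority:
        if core not in busy:
            return core  # best core that is still free
    return None  # every core is taken, fall back to the OS scheduler


# Core 4 is the "top dog" in this made-up ordering.
priority = [4, 1, 3, 2]
print(pick_core(priority, set(), 0.95))  # 4
print(pick_core(priority, {4}, 0.95))    # 1
print(pick_core(priority, set(), 0.50))  # None
```

Reordering `priority` or changing `threshold` in this sketch corresponds loosely to the knobs the Intel utility exposes.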

This is the first time we have needed a driver for a processor, though. Why is it needed? Windows 10, at least for now, does not have the capability to track which cores are performing better, and doesn’t know about the new information being provided by the Intel identification data. If you have ever watched Task Manager while running a single-threaded application or benchmark, without manually setting affinity, you may have seen something called core hopping.

In this example, I was running the single-threaded version of POV-Ray without the Intel TBMT3 driver installed. You can see that the core/thread in the bottom left corner was running the workload for a while, then it moved to the core/thread 4th from the left on the top row, and now it is running on the rightmost core/thread of the top row. This moving of workloads between threads causes some performance reduction on its own, but Windows also doesn’t know which cores would be able to clock higher and get the work done more quickly.
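Setting affinity by hand is the manual way to stop that hopping. As a rough illustration only, the sketch below pins the current process to a single logical core using Python’s `os.sched_setaffinity`, which is available on Linux but not on Windows; on Windows you would use Task Manager or the Win32 `SetProcessAffinityMask` call instead. The helper name `pin_to_core` is hypothetical, and pinning alone does not tell you which core is the fastest one.

```python
import os


def pin_to_core(core: int) -> bool:
    """Pin the current process to one logical core.

    Returns False on platforms where os.sched_setaffinity is not
    available (for example Windows or macOS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {core})  # pid 0 means "this process"
    return True


if hasattr(os, "sched_getaffinity"):
    # Choose a core the process is actually allowed to run on.
    target = min(os.sched_getaffinity(0))
    if pin_to_core(target):
        print("now restricted to cores:", os.sched_getaffinity(0))
```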

Here you can see the same POV-Ray workload running after driver installation; it stays on the same core/thread the entire time.

In this case, TBMT3 was able to move the clock speed of our “best core” higher than the 3.5 GHz rated maximum Turbo clock rate. What kind of performance delta can this provide?

  • POV-Ray, Single Thread, TBMT3 Disabled: 353.46
  • POV-Ray, Single Thread, TBMT3 Enabled: 398.96
  • Increase: 13%
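For anyone who wants to check the math, the increase is simply the difference between the two scores divided by the baseline. A short Python sketch (the function name `percent_increase` is just for illustration):

```python
def percent_increase(before: float, after: float) -> float:
    """Relative gain of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# POV-Ray single-thread scores from the list above.
gain = percent_increase(353.46, 398.96)
print(f"{gain:.1f}%")  # prints 12.9%, which rounds to the 13% quoted
```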

A 13% improvement in single-threaded performance is fairly substantial and means that many of the tasks that we all do every day that aren’t multi-threaded should see improvement. If you are working with heavily multi-threaded workloads, TBMT3 doesn’t really change performance, as clock speeds are still limited by the slower cores. This new technology is available on all four of the new Broadwell-E CPUs.

It’s expected that Windows 10 will be updated sometime this year to understand the idea of a “best core”, and will enable the same functionality without the need for Intel’s driver.
