Thread to Thread Latency and The X299 Platform
Our testing and evaluation of thread-to-thread communication latency on Ryzen got a lot of attention during the launch of that product (as did the corresponding discussion around 1080p gaming performance). With a simple latency evaluation tool, we are able to measure the round-trip latency from one logical thread to every other logical thread, essentially giving us the core-to-core ping times that help evaluate inter-core latency.
The data here is difficult to read, but we can get some very valuable information from it. The L2 latency measures around 11ns or so, as shown in the paired logical core ping times. (This is how long it takes two threads on the same core to communicate.) However, the L3/LLC ping times are much higher, nearing 100ns in several cases.
Let’s simplify this and add some comparative data. In the graph below, we are only showing the ping times from logical core 0 to the other logical cores on each processor.
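For readers who want to do the same reduction on their own data, here is a minimal sketch of pulling the "from logical core 0" row out of a full thread-to-thread matrix. The matrix layout, the `pings_from` helper and the numbers are all assumptions for illustration: the values are placeholders loosely based on the ~11ns and ~100ns figures mentioned above, not our measured results, and this is not the tool we used.

```python
from typing import Dict, List, Optional

# Illustrative placeholder values in nanoseconds, NOT measured results.
# Row i, column j holds the round-trip time from logical thread i to j.
# None marks the diagonal (a thread does not ping itself).
PLACEHOLDER_NS: List[List[Optional[float]]] = [
    [None, 11.0, 100.0, 100.0],
    [11.0, None, 100.0, 100.0],
    [100.0, 100.0, None, 11.0],
    [100.0, 100.0, 11.0, None],
]


def pings_from(matrix: List[List[Optional[float]]], source: int = 0) -> Dict[int, float]:
    """Return {destination thread: round-trip ns} for one source thread."""
    return {
        dest: ns
        for dest, ns in enumerate(matrix[source])
        if dest != source and ns is not None
    }


# Hypothetical usage: keep only the pings that start at logical thread 0.
row0 = pings_from(PLACEHOLDER_NS, source=0)
print(row0)
```

Plotting one such row per processor gives a single line per CPU, which is much easier to compare than a stack of full matrices.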
The Core i9-7900X demonstrates the highest thread-to-thread communication times of any tested system through the first 8 cores. Both the Core i7-6950X (10-core) and the Core i7-5960X (8-core) are much lower, around 80ns, while still running on the ring bus architecture. The Core i7-7700K has a much smaller ring with only four cores, and, as such, it has the fastest thread-to-thread communication of all the Intel parts.
The AMD Ryzen 7 1800X processor has a significant jump at the half point, where the threads move to another CCX, making them dependent on the L3 cache for communication. This puts the 7900X in an interesting spot; it has a dramatically higher latency to the LLC versus the previous generation Intel processors.
I also did some quick testing to see how DDR4 memory speed affected the latency on the 7900X. With the new mesh architecture, does increasing memory clock have any effect?
Moving from DDR4-2400 up to DDR4-2800 does show us some small gain, as does overclocking the cache frequency in the ASUS UEFI from 2400 MHz to 2800 MHz. These changes don’t bring the LLC latency down to the levels of Broadwell-E or Kaby Lake, but they do indicate that some tweaking can help alleviate any performance issues if they arise.
The X299 Platform and ASUS Prime X299-Deluxe
Though we stuck with the X99 chipset and socket for Broadwell-E, Skylake-X and Kaby Lake-X will be using a new chipset and a new LGA2066 socket. We have already seen motherboards leaking out and we should have numerous announcements through Computex this week (check back to pcper.com for all the news!) but now we have preliminary details on what changes it offers.
Fundamentally it looks pretty much the same. The biggest change is one of connectivity – the X299 chipset will now offer up to 24 PCIe 3.0 lanes, mirroring the capability of the Z270 chipset. Compared to the X99 chipset, which only included 8 lanes of PCIe 2.0, this is a significant increase. The DMI connection between the chipset and the processor is also upgraded to DMI 3.0, giving us a doubling of peak throughput (4GB/s rather than 2GB/s). That helps to alleviate the bottleneck from the chipset to the CPU, though a highly saturated system utilizing chipset-based connectivity could still hit speed limitations.
I would expect that concern to be alleviated for a majority of consumers though as the 28 or 44 lanes of PCIe Gen 3.0 provided by the processors (Skylake-X) allows for multi-GPU configurations at x16 or x8 speeds with room for PCIe NVMe storage to boot. Either way, it’s impressive that an X299 system with a 10-core or higher processor would have access to 68 lanes of PCI Express in total – 44 from the CPU and 24 from the chipset.
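The arithmetic behind the DMI and lane-count figures can be sketched in a few lines. One assumption not stated in this article: DMI 2.0 is treated as a four-lane link at 5 GT/s with 8b/10b encoding and DMI 3.0 as a four-lane link at 8 GT/s with 128b/130b encoding, matching the PCIe 2.0 and 3.0 signaling they are based on. The helper name `link_gb_per_s` is made up for this example.

```python
def link_gb_per_s(lanes: int, gt_per_s: float, payload_bits: int, encoded_bits: int) -> float:
    """Peak one-direction throughput in GB/s for a PCIe-style serial link."""
    usable_gbit = lanes * gt_per_s * payload_bits / encoded_bits
    return usable_gbit / 8  # bits -> bytes


# Assumed link parameters (see the note above), not taken from the article.
dmi2 = link_gb_per_s(lanes=4, gt_per_s=5.0, payload_bits=8, encoded_bits=10)
dmi3 = link_gb_per_s(lanes=4, gt_per_s=8.0, payload_bits=128, encoded_bits=130)

# Lane counts as quoted in the text for a 10-core or higher Skylake-X part.
cpu_lanes = 44
chipset_lanes = 24
total_lanes = cpu_lanes + chipset_lanes

print(f"DMI 2.0: {dmi2:.2f} GB/s, DMI 3.0: {dmi3:.2f} GB/s")
print(f"Total PCIe lanes: {total_lanes}")
```

Under those assumptions DMI 3.0 works out to just under 4GB/s, which is why the chipset link is usually rounded to "double" the 2GB/s of DMI 2.0. All 24 chipset lanes still share that one uplink to the CPU.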
Default memory speed gets a jump from 2133 MHz on the X99 systems to 2666 MHz on the new X299 systems, giving the new platform another potential advantage. We also have 8x SATA 3 channels, 10x USB 3.0 ports and support for 3-way RAID of PCIe and NVMe drives as a part of this system platform.
Morry is hard at work on reviews for several upcoming X299 motherboards, so I won’t spend too much time here going over it, but the Intel HEDT platforms have always been a pricey (yet impressive) spot in the market. ASUS sent us the Prime X299-Deluxe motherboard to help in our testing and it includes features like 802.11ad, Thunderbolt 3, and Intel VROC RAID support. I fully expect options from ASUS, MSI, and Gigabyte to have truly impressive feature sets, if you are willing to pay.
One advantage to having a platform that is an evolution of a previous one is that coolers seem to be working without issue. The Corsair H100i GTX operated on the LGA2066 socket exactly as it did on the LGA2011 socket all these past years. It’s nice when you don’t have to uproot your entire ecosystem every once in a while.
A 15% increase in performance resulting from a 50% increase in power consumption seems to indicate that this processor is firmly out of its comfort zone in terms of efficiency.
Makes me wonder where it would land with similar clock rates as the 6950X.
As for the i9 line-up, I don’t follow the argument that these CPUs are not the direct result of AMD’s renewed competitiveness. Sure, 6- through 10-core CPUs would’ve been planned for long ago, but their final clocks were set post-Ryzen. The idiotic KBL-X were rushed post-Ryzen. The MCC-i9s are clearly a rush job (hence their late launch) trying to compete with Threadripper.
I’d be willing to bet that not a single CPU launched for this platform was planned exactly as-is 9 months ago.
Even if everything you say is true, is that a problem? Is that not what we want? Some competition to push things forward?
Sorry, I might have simply misread/misunderstood your conclusion.
As far as I’m concerned, it was not giving enough credit to AMD for the final specs of these CPUs, as they are / will be shipping.
Anyways, thanks for testing the rejiggered caches and mesh topology and showing how they affect scaling when compared to the predecessor!
I am curious if future BIOS updates will affect mesh speed (ping time?), and what kinds of differences that will make.
I like the performance/$ metrics. There’s so many ways to slice those — CPU Only, including motherboard and RAM (which you have to buy anyway to use the CPU), or full system price. Pros/Cons to each.
Best internet line of the day:
“Until July. Or August. Or October…”
Great review PCPer!
Future BIOS should not have a direct effect on it, unless Intel changes its stance on the clocks of the cache. It runs at a slower clock than memory or the CPU itself, but it is controllable – I show you the change on one of our graphs here looking at thread to thread "ping times".
On the performance / dollar, you are right, we could have included memory and motherboard in that and it might be worth doing in the future. But I think most people reading will understand that the X299 motherboard price average is higher than the X370 motherboard price average, so the differences will widen slightly.
X370 may not be a fair yardstick if you want price/performance. X370 is closer to X299 in features (though still a long way off), but if what you want is maximum price/performance, B350 is the way to go.
Hey Ryan, great review. If possible, for the gaming benchmarks could you post the 1% and 0.1% low frame rates, or just the min FPS if that would be easier? I have found the enthusiast platform tends to excel in minimum FPS and smooth delivery of frames (less stutter), and that is what motivates my purchases more than max or average FPS. I would rather have a CPU with a min of 60 FPS and a max of 85 FPS than one with a max of 105 FPS and a min of 45 FPS, even if that means it has a lower average FPS. Smoothness is everything for me.
At what speed were you running the 1800X Infinity Fabric?
Also, your idle system wattage looks to be half of what other sites show for the 1800X. I wonder what you or they are doing differently.
Cinebench value. Not sure why but I get a score of 1641 on a stock 1800x. / $440 (amazon) = 3.72
I think you are using the launch day price of $500?
Note: I run my RAM at 2400 MHz (the rated XMP profile).
All of the 1800X data was generated at stock settings, DDR4-2400 memory. And yes, I am still using the $499 launch price for that data, as you note.
Great Video Ryan, Actually made me read the review… that was good too.
With….one little exception New parts, higher clocks, more cores
Kinda wanted to see what the “NiceHash” daily BTC amount would be, you know for science.
Consider including it in your benchmarks for all the new CPUs?
Maybe…but CPUs, even 10-core CPUs, are very inefficient in comparison to even moderate GPUs.
You are totally right, and rather handsome,
However after Electrickery a Ryzen 7 1700X nets $600 per annum
Which is peanuts to golden haired tech gods granted, but some peeps may want to put one in a corner and let it pay for itself (with all the assumptions granted) while heating up their greenhouse.
As algos change, prices fluctuate, and releases get more cores, it’ll be nice to keep an eye on hashing value.
Goes without saying that it will be awesome to have it on GPU charts.
You’re obviously way too important and tall to take on such a task, maybe the smaller more condensed you (Ken) could take on such a burden of honour.
Something I haven’t seen much of is the (potential) benefit of this X299 plan to boutique system builders, and even larger mass producers of custom PCs such as HP with their Omen, and Dell / Alienware.
They could standardize on X299 for most of their builds, and then offer customers the choice of i5 and “entry level” i7 now, with the option to upgrade to a true HEDT system later on, while keeping the same chassis and main system components.
That and single-core performance should be best on those parts, especially when overclocked to their max.
In terms of TDP, did you measure that at stock or overclocked? I’d have to assume stock, and if so, could the measurements be off due to the new platform?
I know you know this, but for anyone who wonders how Intel defines TDP…from https://www.intel.com/content/dam/doc/white-paper/resources-xeon-measuring-processor-power-paper.pdf:
“Intel defines TDP as follows: The upper point of the thermal profile consists of the Thermal Design Power (TDP) and the associated Tcase value. Thermal Design Power (TDP) should be used for processor thermal solution design targets. TDP is not the maximum power that the processor can dissipate. TDP is measured at maximum TCASE.”
All measured at stock settings.
Seems to me that there has been a cost-shift Intel has done here from the CPUs to the chipsets. The motherboards are about $100 more expensive than they should be. This way, Intel can make their CPUs out to be a better value than they actually are.
I don't think that's accurate. Intel is probably getting slightly more from the X299 than the Z270, but I would guess not much. If anything, the motherboard vendors know this is a higher end platform and audience, so they put higher end products together to serve it.
Ryan,
Did overclocking the cache + using faster RAM have any effect on benchmarks?
I honestly did not have time to check, only to do the latency evaluation you saw on that page. We'll be following up – my expectation is that it will have an effect on things like 7zip and the 1080p gaming results, if at all.
I’m looking at Guru3D’s X299 motherboard reviews and it seems like the BIOS that run the more conservative power profile have higher memory/L3 latency and run worse in games and synthetics like Cinebench. The Cinebench scores matched your results so I’m assuming these latency tests were done using the lower power profiles.
It will be interesting to see what your latency tester shows on the higher power profiles.
Great Review!
Grammar Nazi:
On the last page, under the last picture
It is worth noting here that our early testing with the X299 motherboards has including troubling amounts of performance instability and questionable compatibility.
Ah, thanks. 🙂
Interesting to see the inter-core latency affect Skylake-X so much. Despite Ryzen’s latency affecting games, it often competes well with Broadwell, usually at lower clocks.
It’s nearly the reverse in gaming with Skylake-X, where it’s clocked higher and still loses.
I hope Ryan does some detailed tests with Skylake-X CPUs and Threadripper to see how the increased core and CCX counts will affect latency and, as a result, some use cases.
“And to combat Threadripper, it seems clear that Intel was willing to bring forward the release of Skylake-X, to ensure that it maintained cognitive leadership in the high-end prosumer market.”
Impressive that Intel knew the release date for Threadripper back in 2015 when they scheduled Basin Falls! https://regmedia.co.uk/2015/05/26/intel-kdm-roadmap-1.jpg
Hey Ryan,
Great job as always! Just wanted to give a little feedback about the graphs – the font is borderline unreadable and that is on a 1080p 27″ ultrasharp Dell.
Otherwise keep on rocking!
Why is your latency test different from the one in SiSoftware Sandra?
http://www.tomshardware.de/performance-benchmarks-ubertaktung-leistungsaufnahme-kuhlung,testberichte-242365-2.html
Intel 7900x
Sisoft: 79ns
PCPer: 100ns
AMD Fabric:
Sisoft: 122
PcPer: 140
Perhaps you guys should factor in the platform cost in these reviews – B350 Mobos can be had for ~$100, while these X299 Mobos cost at least $400. It’s hard to argue the i7-7800X is a suitable competitor for the 1700 when you have to pay another $400 for the motherboard, and are still two cores short (though the higher clocks make up for this)
Intel needs to offer multi-core mainstream parts to truly compete with the 1700 in the future. Right now higher clocks trump twice the threads, but if games like Battlefield and the higher core count of consoles are anything to go by, that won’t last forever.