Memory Performance Testing – 1, 2 and 3 channels
The Core i7 processor is Intel’s first attempt at an on-chip memory controller and as such deserves a bit more scrutiny than we have given it up until now.  This also marks the first time that a triple-channel memory controller has been implemented on a desktop processor, and I was very curious to see how that scaled in memory-intensive applications. 

To run these tests, the setup was really quite simple: run a couple of benchmarks on the Core i7-965 EE CPU with a single memory channel active, with a dual-channel setup and then with all three channels of the Intel X58 “Smackover” motherboard in use.  I was also curious what effect an unbalanced memory configuration would have on performance, so I tested this motherboard with all four memory slots populated, which puts 2 DIMMs on the first channel. 

A single DIMM installed resulting in single channel memory…

Two DIMMs installed bring us to the same dual channel memory controller type that AMD uses.

With three DIMMs populated the board is in full speed, triple channel memory mode!

Here we have all FOUR DIMM slots on the Intel X58 “Smackover” motherboard populated, but still only three channels.
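
For a sense of what each configuration can deliver on paper, here is a minimal sketch of the theoretical peak bandwidth at one, two and three channels.  It assumes the DDR3-1333 speed mentioned for the 8GB configuration applies to every setup, and a standard 64-bit (8-byte) wide channel; the function name is my own and these are ceiling figures, not benchmark results.

```python
# Theoretical peak DDR3 bandwidth per channel count (illustrative, not measured).
TRANSFERS_PER_SEC = 1333e6   # assumption: DDR3-1333, ~1333 million transfers/s
BYTES_PER_TRANSFER = 8       # each memory channel is 64 bits (8 bytes) wide


def peak_bandwidth_gbs(channels, transfers_per_sec=TRANSFERS_PER_SEC):
    """Return theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * transfers_per_sec * BYTES_PER_TRANSFER / 1e9


for channels in (1, 2, 3):
    print(f"{channels} channel(s): {peak_bandwidth_gbs(channels):.1f} GB/s")
```

Real applications will land well below these numbers, but the linear scaling on paper is the reason populating every channel matters.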

Now that we have our testing configuration figured out, I tested the whole lot of them with the SiSoft Sandra SP1 application and the memory-sensitive Euler3D fluid dynamics test.

The Sandra results are pretty telling, though it is obviously just a synthetic worst-case-scenario test.  Triple channel memory mode is 37% faster than dual channel, and dual channel is another 77% faster than single channel.  Going from single channel to triple channel is a gain of more than 144% in this test, so dropping back to a single channel is a VERY significant loss.  This should tell everyone that if you are going to upgrade to Nehalem technology, you really should also make sure you have a memory configuration ready for triple channel functionality.  That could just mean adding a third DIMM to your existing DDR3 memory or buying a completely new kit.
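
Percentages like these are easy to misread, because "X% faster" and "X% slower" are not interchangeable.  Here is a minimal sketch of the arithmetic, assuming the 37% and 77% figures are ratios of the faster score over the slower one (my interpretation of the numbers, and the helper names are my own).

```python
def faster_to_slower(pct_faster):
    """If A is pct_faster% faster than B, return how many % slower B is than A."""
    ratio = 1 + pct_faster / 100
    return (1 - 1 / ratio) * 100


def compound(*pct_faster):
    """Chain several 'X% faster' steps into one overall 'X% faster' figure."""
    ratio = 1.0
    for pct in pct_faster:
        ratio *= 1 + pct / 100
    return (ratio - 1) * 100


print(round(faster_to_slower(37)))                 # dual vs. triple: 27 (% slower)
print(round(faster_to_slower(77)))                 # single vs. dual: 44 (% slower)
print(round(compound(37, 77)))                     # triple vs. single: 142 (% faster)
print(round(faster_to_slower(compound(37, 77))))   # single vs. triple: 59 (% slower)
```

Chaining the two rounded figures gives roughly 142%, close to the 144% quoted above; the small gap is presumably down to rounding in the individual percentages.  Under this reading, a single channel delivers a bit over 40% of the triple channel score.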

Also of note is the fact that adding that 4th DIMM hurts memory subsystem performance dramatically as well.  With 8GB of DDR3-1333 installed, the system in fact runs at nearly the same speed as the dual-channel 4GB configuration.  Memory in sets of three seems to be the requirement for optimal Core i7 systems. 

Our Euler3D test shows some more interesting results across our varying memory configurations.  First, with just a single thread active in our test, the difference between all four configurations is minimal.  As we increase the thread count to 2, 4 and 8, the gaps widen as the workload demands more memory bandwidth to keep the four cores fed.  If we look at the 4-thread results, where the thread count matches the core count, we see a 15% drop in performance going from triple channel memory to dual channel.  The decline continues with the single channel result, which is 14% slower than dual channel. 

The fully populated 8GB result is about 11% slower than the optimal triple channel score, indicating that although the additional memory does help performance compared to the dual channel score, it isn’t enough to offset the drop compared to the triple channel results. 

What does all of this mean then?  It’s quite simple actually: the new three channel DDR3 memory controller built into the Nehalem Core i7 processors requires all three channels to be populated to get the best possible system performance. 
