Testbed and Preliminary Benchmark Results

When testing high-IOPS devices like the SSD 910 Series, it's wise not to clutter up your testbed with other devices that might contend for the system buses, particularly the PCIe bus. For our testbed we went the 'Keep It Simple, Stupid' route:

That's right: the only PCIe slot in use is taken up by the card under test. We're using SandyBridge integrated graphics, which saves all possible PCIe bandwidth for the SSD 910 Series. Here's the rest of our testbed setup for this review:

PC Perspective would like to thank ASUS, Corsair, and Kingston for supplying some of the components of our test rig.

Hard Drive Test System Setup
CPU: Intel Core i5-2500K
Motherboard: ASUS P8Z68-V Pro
Memory: Kingston HyperX 4GB DDR3-2133 CL9
Hard Drive: G.Skill 32GB SLC SSD
Sound Card: N/A
Video Card: Intel HD Graphics 3000
Video Drivers: Intel
Power Supply: Corsair CMPSU-650TX
DirectX Version: DX9.0c
Operating System: Windows 7 x64


Preliminary Benchmark Results:

Here are some rapid-fire tests. The first is ATTO, but with a rather large caveat: combining the four SCSI LUNs within Windows means Dynamic Disk (software) RAID. There is no way to configure the stripe size, so we are stuck with Windows' hard-coded 64k stripes. That's not the best fit for 4k workloads, and it tends to hold the 910 back a bit at the lower transfer sizes:

ATTO run of the full 800GB 910 SSD in "Performance Mode"

While the sequential rates (near the bottom) are just above the 1.5GB/s write rating and just below the 2GB/s read rating, the 4k-to-64k region (middle) appears lower than normal for a PCIe SSD. This is the result of tying the four LUNs together with Windows RAID. To demonstrate the penalty of the larger stripe size, here's an ATTO run of just one of the 200GB LUNs:

ATTO run of a single 200GB SCSI LUN (also in "Performance Mode")

Note how performance quickly 'ramps up' at the lower transfer sizes when Windows RAID is not getting in the way. Also, from both ATTO runs we can tell the Hitachi-based controllers absolutely hate writes smaller than 4k, and by 'hate' I mean 'slower than a Hard Disk Drive'.
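
To make the stripe penalty concrete, here's a minimal Python sketch of how striped offsets map to member LUNs. The constants mirror the setup described above (64k stripes, four LUNs); the mapping logic is a simplified illustration, not Windows' actual Dynamic Disk code:

```python
# Sketch: why a 64k stripe hurts 4k transfers. Each 64 KiB chunk of the
# striped volume lives on one member LUN, so a single 4k I/O only ever
# touches one of the four LUNs.

STRIPE = 64 * 1024   # Windows' hard-coded Dynamic Disk stripe size
LUNS = 4             # the 910's four 200GB SCSI LUNs

def lun_for_offset(offset_bytes: int) -> int:
    """Return which LUN services an I/O starting at this volume offset."""
    return (offset_bytes // STRIPE) % LUNS

# Sixteen consecutive 4k I/Os all land on LUN 0 -- no parallelism across
# the LUNs until the transfer size approaches the stripe size:
for i in range(18):
    off = i * 4096
    print(f"4k I/O at offset {off:>6}: LUN {lun_for_offset(off)}")
```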

Now for some quick IOMeter results. I'll stick with 4k random, since ATTO pretty much confirmed the sequential transfer speeds for us already. Note that the table below represents pre-conditioned but still fresh-out-of-the-box values (i.e. the drive was only written sequentially to capacity so as to allocate all available LBAs):

# of LUNs (QD)         4k 100% Read   4k 67/33 R/W   4k 100% Write
1 (200GB), QD=32              49519          34847           64279
2 (400GB), QD=64              99139          69575          127556
3 (600GB), QD=96             148574         105322          191832
4 (800GB), QD=128            198160         139707          254789

IOMeter ‘fresh’ IOPS values for varying configurations ("Performance Mode").
QD=32 per LUN
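
As for the preconditioning mentioned above, the idea is a single end-to-end sequential fill before any random-write testing so that every LBA is allocated. We did this through IOMeter on Windows; the sketch below is a rough Linux-style equivalent against a hypothetical raw device node, just to show the concept:

```python
# Sketch: one sequential pass over the whole device before random-write tests.
# /dev/sdX is a placeholder -- run something like this against a scratch
# device only, as it destroys all data on it.
import os

DEVICE = "/dev/sdX"   # hypothetical raw device node
CHUNK = 1024 * 1024   # 1 MiB sequential writes

def precondition(device: str) -> None:
    fd = os.open(device, os.O_WRONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)   # device capacity in bytes
        os.lseek(fd, 0, os.SEEK_SET)
        buf = os.urandom(CHUNK)               # incompressible fill data
        written = 0
        while written < size:
            written += os.write(fd, buf[: min(CHUNK, size - written)])
    finally:
        os.close(fd)
```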

First off, the above figures demonstrate that the LSI HBA can absolutely handle SSD-level IOPS and scales properly as additional SAS devices are added. We see a nice linear increase as we tack on additional LUNs across all three workloads. This clean scaling was only possible by having IOMeter access each of the four devices directly and simultaneously, bypassing any sort of Windows RAID.
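
If you want to check that scaling claim yourself, a quick Python pass over the table shows the per-LUN rate staying essentially flat as LUNs are added:

```python
# Per-LUN IOPS should stay roughly constant if scaling is truly linear.
# Figures are copied straight from the IOMeter table above.
results = {  # LUN count -> (100% read, 67/33 mix, 100% write) IOPS
    1: (49519, 34847, 64279),
    2: (99139, 69575, 127556),
    3: (148574, 105322, 191832),
    4: (198160, 139707, 254789),
}

for luns, iops in results.items():
    per_lun = [round(v / luns) for v in iops]
    print(f"{luns} LUN(s): per-LUN IOPS = {per_lun}")
# Output hovers around ~49.5k / ~35k / ~64k per LUN in every row --
# near-perfect linear scaling through the LSI HBA.
```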

Mixed workloads (reading and writing simultaneously) saw lower performance than reading or writing alone, but a slight dip here is expected. We will look into that aspect further in our next piece covering the 910 Series.

Finally, realize that the above write figures were taken in an unfragmented state; this is what you'll see for the first few hours of heavy use. The SSD Review ran a similar test to the lower-right corner of my chart above (*edit – see below for more details*) and reported 228k where I saw 254k, but realize those figures (theirs and mine) don't represent a steady-state condition. I steamed ahead a bit further to get to a long-term figure for 4k random writes. Here it is:

Just over 83k random 4k write IOPS, and it took quite a while to get there, too. That figure is comfortably above Intel's stated spec of 75k, and it's impressive when you consider it's the result of a *continuous* 4k random write. Combined with Intel rating its HET-MLC flash for 10x the drive's capacity written each day for five full years, this is definitely a serious SSD!
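
For the curious, that endurance rating works out to a rather large number. A rough calculation from the stated figures (10 drive writes per day, five years, 800GB model, plain decimal units):

```python
# Back-of-envelope on Intel's quoted endurance rating for the 800GB 910.
capacity_gb = 800
drive_writes_per_day = 10
years = 5

total_pb = capacity_gb * drive_writes_per_day * 365 * years / 1_000_000
print(f"Rated endurance: {total_pb:.1f} PB written")   # -> 14.6 PB
```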

*edit continued*:

As pointed out by a few in the comments (thanks!), I was reading The SSD Review's piece incorrectly. Sincere apologies for my 3AM fumble there. The SSD Review's pic showed 4k random reads at QD=64 per device, which is higher than the 32 I chose for my testing. It did bring up a good question, though: was it the higher QD and/or their choice of an 8GB LBA sweep that made their figure higher than ours, or the fact that we are testing on a SandyBridge testbed vs. their SandyBridge-E setup? It's true that SandyBridge-E has plenty more PCIe lanes available, but in theory this should not matter: even though SandyBridge has only 16 PCIe lanes (vs. the E variant's 40), the 910 SSD uses only 8 of them. To clear this up, I fired up the 910 again, this time dialing in QD=64 to match their testing. Here's the result:

That looks really close. Let's compare them directly:

Spec                   PCPer (SandyBridge)   SSD Review (SandyBridge-E)   Delta
IOPS                             228249.80                    228750.67    0.2%
MB/s                                891.60             936.96 (decimal)    0.2%
Avg IO Response (ms)                1.1214                       1.1178    0.3%
Max IO Response (ms)                1.7414                       4.0560   57.0%
% CPU                               28.69%                       11.60%   59.6%
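
One note on that table: the two MB/s figures use different units (ours is binary MiB/s, theirs decimal MB/s), which is why a ~45MB/s gap still shows as a 0.2% delta. A quick Python check reconciles the units, and Little's Law ties the IOPS and response rows together (assuming QD=64 across all four LUNs, per the retest above):

```python
# Both sites moved essentially the same bytes/sec; they just reported
# different units (binary MiB/s here, decimal MB/s there).
pcper_iops, ssdr_iops = 228249.80, 228750.67

pcper_decimal = pcper_iops * 4096 / 1e6          # 4k transfers -> decimal MB/s
print(f"PCPer in decimal units: {pcper_decimal:.2f} MB/s")         # ~934.91 (= 891.60 MiB/s)
print(f"True delta: {(ssdr_iops - pcper_iops) / pcper_iops:.1%}")  # ~0.2%

# Little's Law: avg response = outstanding I/Os / IOPS.
# QD=64 across 4 LUNs = 256 outstanding I/Os.
print(f"Predicted avg response: {256 / pcper_iops * 1000:.4f} ms")  # ~1.1216 vs 1.1214 measured
```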


…so let's break these down. First off, with less than half of one percent difference in IOPS and average IO response time, it's fair to say the two testbeds are identical in ultimate IOPS, and that the number of extra PCIe lanes available is irrelevant. With that put to rest, let's move on to CPU usage. With more cores enabled (as well as HyperThreading), the LGA2011 CPU spends a smaller percentage of its time to accomplish the same task. This is expected; however, there is a twist. I disabled HyperThreading to prevent any possible added latencies from context switching, which might explain why we didn't see the rather long maximum IO response present in the SandyBridge-E figures. That outlier might also have been caused by the SandyBridge-E system running a pair of SLI GPUs while testing the SSD. There's no way to say for certain – all we know for sure is that we didn't see the same unusually large maximum IO response time.

Conclusion:

So there you have it. I wanted this first look to cover and verify most of Intel's stated specs. We did just that, and the SSD 910 has absolutely impressed us thus far. There's more to follow as we dive into further testing for a more detailed review. Oh, and regarding that endurance spec we left out – I'll get back to you in five years :).
