PCPer Client SSD Test Suite – Introduction
A New Test Suite
I've been working behind the scenes on a radically new test methodology. I had grown tired of making excuses for benchmarks not meshing well with some SSD controllers, and that problem was amplified significantly by recent SLC+TLC hybrid SSDs, which can be very picky about their workloads and how they are applied. The complexity of these caching methods has effectively flipped the SSD testing ecosystem on its head. The vast majority of benchmarking software and test methodologies out there were developed around non-hybrid SLC, MLC, or TLC SSDs. All of those types were very consistent once a given workload had been applied for long enough to reach a steady-state condition. Once an SSD was properly prepared for testing, it would give you the same results all day long. Not so for these new hybrids. The dynamic nature of the various caching mechanisms at play wreaks havoc on modern tests. Even trace playback tests such as PCMark falter, as traces are typically played back with idle gaps truncated to a smaller figure in the interest of accelerating the test, while caching SSDs rely on those same idle gaps to flush their cache to higher-capacity areas of their NAND. This mismatch has resulted in products like the Intel SSD 600p, which bombed nearly all of the 'legacy' benchmarks yet did just fine once tested with a more realistic, spaced-out workload.
To solve this, I needed a way to issue IOs to the SSD the same way real-world scenarios do, and to do so in a way that did not saturate the cache of hybrid SSDs. The answer, as it turned out, was staring me in the face.
Latency Percentile made its debut a year ago (ironically, with the 950 PRO review), and those results have proven to be a gold mine that continues to yield nuggets as we mine the data even further. Weighting the results allowed us to better visualize and demonstrate stutter performance, even when those stutters were small enough to be lost in more common tests that employ 1-second averages. Merged with a steady pacing of the IO stream, it can provide true Quality of Service comparisons between competing enterprise SSDs, as well as high-resolution, industry-standard QoS figures for saturated workloads. Sub-second IO burst throughput rates of simultaneous mixed workloads can be determined with additional number crunching. It is this last part that is the key to the new test methodology.
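To make that concrete, here is a minimal sketch (my own illustration in Python with synthetic latencies, not the suite's actual code) of how a Latency Percentile can be built from per-IO completion times, and how weighting each IO by its latency surfaces stutters that a plain average would bury:

```python
# Minimal Latency Percentile sketch with synthetic per-IO latencies.
import numpy as np

rng = np.random.default_rng(0)

# 100,000 IOs: mostly ~100 us, with a rare 20 ms stutter mixed in.
latencies_us = rng.normal(100, 10, size=100_000)
latencies_us[rng.random(100_000) < 0.001] = 20_000

sorted_lat = np.sort(latencies_us)
n = len(sorted_lat)

# Percentile by IO count: what latency did X% of IOs complete within?
for p in (50, 99, 99.9, 99.99):
    idx = min(n - 1, int(np.ceil(p / 100 * n)) - 1)
    print(f"{p:6.2f}% of IOs completed within {sorted_lat[idx]:8.1f} us")

# A 1-second-style average hides the stutters almost entirely:
print(f"mean latency: {latencies_us.mean():.1f} us")

# Weighting each IO by its latency shows the share of *time* spent on slow IOs,
# which is what makes rare stutters visible (my interpretation of the weighting).
time_share_slow = sorted_lat[sorted_lat > 1_000].sum() / sorted_lat.sum()
print(f"share of total IO time spent on >1 ms IOs: {time_share_slow * 100:.1f}%")
```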
The primary goal of this new test suite is to get the most accurate sampling of real-world SSD performance possible. This meant evaluating across more dimensions than any modern benchmark is capable of. Several thousand sample points are obtained, spanning various read/write mixes, queue depths, and even varying amounts of additional data stored on the SSD. To better quantify real-world performance of SSDs employing an SLC cache, many of the samples are obtained with a new method of intermittently bursting IO requests. Each of those thousands of samples is accompanied by per-IO latency distribution data, and a Latency Percentile is calculated (for those counting, we’re up to millions of data points now). The Latency Percentiles are in turn used to derive the true instantaneous throughput and/or IOPS for each respective data point. The bursts are repeated multiple times per sample, but each completes in less than a second, so even the per-second logging employed by some of the finer review sites out there just won’t cut it.
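A stripped-down illustration of that burst-and-measure idea (my own sketch, not the suite's tooling: it runs at QD=1 against a scratch file, assumes a POSIX system for os.pread, and makes no attempt to bypass the OS cache the way real testing would) is to time a short burst of IOs, compute throughput over the busy window only, then sit idle so a caching SSD has a chance to flush before the next burst:

```python
# Illustrative burst-style sampling against a scratch file (QD=1, synchronous).
# Real tooling would target the raw device and bypass the OS cache.
import os
import tempfile
import time

IO_SIZE   = 4096      # 4 KiB per IO
BURST_IOS = 2048      # IOs per burst
IDLE_SECS = 2.0       # idle gap so a caching SSD can flush between bursts
BURSTS    = 3

scratch = tempfile.NamedTemporaryFile(delete=False)
scratch.write(os.urandom(IO_SIZE * BURST_IOS))
scratch.close()

fd = os.open(scratch.name, os.O_RDONLY)
try:
    for burst in range(BURSTS):
        start = time.perf_counter()
        for i in range(BURST_IOS):
            os.pread(fd, IO_SIZE, i * IO_SIZE)   # one small read per IO
        busy = time.perf_counter() - start       # sub-second busy window
        iops = BURST_IOS / busy
        mb_s = BURST_IOS * IO_SIZE / busy / 1e6
        print(f"burst {burst}: {busy * 1000:.1f} ms busy, "
              f"{iops:,.0f} IOPS, {mb_s:.0f} MB/s")
        time.sleep(IDLE_SECS)                    # leave the drive idle
finally:
    os.close(fd)
    os.unlink(scratch.name)
```

The important part is that throughput is computed over the sub-second busy window rather than averaged over a full second that includes idle time.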
Would you like some data with your data? Believe it or not, this is a portion of an intermediate calculation step; the Latency Percentile data has already been significantly reduced by this stage.
Each of the many additional dimensions of data obtained by the suite is tempered by a weighting system. Analyzing trace captures of live systems revealed *very* low Queue Depth (QD) under even the most demanding power-user scenarios, which means some of these more realistic values are not going to turn in the same high queue depth ‘max’ figures seen in saturation testing. I’ve looked all over, and nothing outside of benchmarks maxes out the queue. Ever. The vast majority of applications never exceed QD=1, and most are not even capable of multi-threaded disk IO. Games typically allocate a single thread for background level loads. For the vast majority of scenarios, the only way to exceed QD=1 is to have multiple applications hitting the disk at the same time, but even then it is less likely that those multiple processes will be completely saturating a read or write thread simultaneously, meaning the SSD is *still* not exceeding QD=1 most of the time. I pushed a slower SATA SSD relatively hard, launching multiple apps simultaneously, trying downloads while launching large games, etc. IO trace captures performed during these operations revealed >98% of all disk IO falling within QD=4, with the majority at QD=1. Results from the new suite will contain a section showing a simple set of results that should very closely match the true real-world performance of the tested devices.
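As a tiny illustration of what such a weighting system could look like (both the per-QD IOPS results and the queue-depth frequencies below are made-up example numbers, not PCPer's actual weights), per-QD results get combined according to how often each queue depth actually shows up in the trace captures:

```python
# Illustration of a QD-weighted composite score. Both the per-QD IOPS results
# and the queue-depth frequencies below are made-up example numbers.
qd_iops   = {1: 12_000, 2: 21_000, 4: 38_000, 8: 65_000, 16: 90_000}  # measured per QD
qd_weight = {1: 0.80,   2: 0.12,   4: 0.06,   8: 0.015, 16: 0.005}    # how often each QD occurs

assert abs(sum(qd_weight.values()) - 1.0) < 1e-9  # weights must cover 100% of IO

composite = sum(qd_iops[qd] * weight for qd, weight in qd_weight.items())
print(f"QD-weighted composite: {composite:,.0f} IOPS "
      f"(vs. a QD=16 'max' of {qd_iops[16]:,} IOPS)")
```

Because QD=1 dominates the weighting, the composite lands far closer to the low-QD results than to any saturation 'max' figure.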
While the above pertains to random accesses, bulk file copies are a different story. To increase throughput, file copy routines typically employ some form of threaded buffering, but it's not the type of buffering you might think. I've observed copy operations running at QD=8, or in some cases QD=16, to a slower destination drive. The catch is that instead of holding a constant 8 or 16 simultaneous IOs as you would see with a saturation benchmark, the operations repeatedly fill and empty the queue: the queue is filled, allowed to empty, and only then filled again. A saturation benchmark, by contrast, constantly adds requests to maintain the specified depth. The resulting speeds are therefore not what you would see at QD=8, but rather a mixture of all of the queue depth steps from one through eight.
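Here's a toy model of that difference (my own simplification with invented per-QD throughput numbers): a saturation benchmark keeps the queue topped up at QD=8, while the copy-style pattern fills the queue to 8, lets it drain to empty, and only then refills, so each completion happens at whatever the current depth supports:

```python
# Toy model of copy-style "fill and drain" queueing vs. a saturation benchmark.
# Per-queue-depth throughput numbers (IOs per millisecond) are invented.
throughput = {1: 1.0, 2: 1.6, 3: 2.0, 4: 2.3, 5: 2.5, 6: 2.7, 7: 2.8, 8: 2.9}

# Saturation benchmark: the queue is constantly topped up, so every IO
# completes at the QD=8 rate.
saturated_rate = throughput[8]

# Copy-style pattern: fill the queue to 8, let it drain to empty, then refill.
# Each completion happens at the rate for the *current* depth, so one cycle
# sweeps through QD=8 down to QD=1.
cycle_time_ms = sum(1.0 / throughput[qd] for qd in range(8, 0, -1))
fill_drain_rate = 8 / cycle_time_ms

print(f"saturated QD=8 : {saturated_rate:.2f} IO/ms")
print(f"fill-and-drain : {fill_drain_rate:.2f} IO/ms "
      f"(a blend of every step from QD=8 down to QD=1)")
```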
Conditioning
Some manufacturers achieve unrealistic 'max IOPS' figures by running tests that place a small file on an otherwise empty drive, essentially testing in what is referred to as fresh out of box (FOB) condition. This is entirely unrealistic, as even the relatively small number of files placed during an OS install is enough to drop performance considerably from the high figures seen in a FOB test.
On the flip side, when it comes to 4KB random tests, I disagree with tests that apply a random workload across the full span of the SSD. This is an enterprise-only workload that will never be seen in any sort of realistic client scenario. Even the heaviest power users are not going to hit every square inch of an SSD with random writes, and if they are, they should be investing in a datacenter SSD that is purpose built for such a workload.
Calculation step showing full sweep of data taken at multiple capacities.
So what's the fairest preconditioning and testing scenario? I've spent the past several months working on that, and the conclusion I came to ended up matching Intel's recommended client SSD conditioning pass: completely fill the SSD sequentially, with the exception of an 8GB portion meant solely for random-access conditioning and tests. I add a bit of realism here by leaving ~16GB of space unallocated (even those with a full SSD will have *some* free space, after all). The randomly conditioned section only ever sees random accesses, and the sequential section only ever sees sequential. This parallels the majority of real-world access: registry hives, file tables, and other such areas typically see small random writes and small random reads, and it's fair to say that a given OS install ends up with ~8GB of such data. There are corner cases where files are randomly written and later sequentially read. BitTorrent is one example, but since those files are only laid down randomly on their first pass, background garbage collection should clean them up so that read performance gradually shifts toward sequential over time. Further, those writes are not as random as the more difficult workloads selected for our testing. I don't just fill the whole thing up right away, though; I pause a few times along the way and resample *everything*, as you can see above.
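In outline form, the conditioning and sampling flow described above looks roughly like this (a sketch of my own; the callables and the specific fill points are hypothetical placeholders, not the suite's actual tooling):

```python
# Rough outline of the conditioning pass described above. The callables here
# (write_sequential, write_random, run_test_sweep) are hypothetical
# placeholders standing in for real disk tooling.
GIB = 1024**3

def condition_and_sample(ssd_capacity_bytes, write_sequential, write_random,
                         run_test_sweep):
    random_region = 8 * GIB        # region that only ever sees random IO
    unallocated   = 16 * GIB       # left free; even a "full" drive has some
    seq_region    = ssd_capacity_bytes - random_region - unallocated

    # Condition the random region first; it stays put for the whole run.
    write_random(offset=0, length=random_region)

    # Fill the sequential region in stages, pausing to resample everything.
    fill_points = (0.25, 0.50, 0.75, 1.00)   # fraction of the seq region filled
    filled = 0
    for point in fill_points:
        target = int(seq_region * point)
        write_sequential(offset=random_region + filled, length=target - filled)
        filled = target
        run_test_sweep(filled_fraction=point)  # full sweep at this fill level

# No-op stubs just to show the flow:
condition_and_sample(
    512 * GIB,
    write_sequential=lambda offset, length: None,
    write_random=lambda offset, length: None,
    run_test_sweep=lambda filled_fraction: print(f"sweep at {filled_fraction:.0%} fill"),
)
```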
Comparison of Saturated vs. Burst workloads applied to the Intel 600p. Note the write speeds match the rated speed of 560 MB/s when employing the Burst workload.
SSDs employing relatively slower TLC flash coupled with a faster SLC cache present problems for testing. Prolonged saturation tests that attempt to push the drive at full speed for more than a few seconds will quickly fill the cache and result in some odd behavior depending on the cache implementation. Some SSDs pass all writes through the SLC even when that cache is full, resulting in a stuttery game of musical chairs as the controller scrambles to flush SLC to TLC while still trying to accept additional writes from the host system. More refined implementations can put the cache on hold once full and simply shift incoming writes directly to the TLC. Some more complicated methods throw all of that away and dynamically switch empty flash blocks or pages to whichever mode the controller deems appropriate. This approach looks good on paper, but we've frequently seen it falter under heavier writes, where SLC areas must be cleared before those blocks can be flipped over to the higher-capacity (yet slower) TLC mode. The new suite and its Burst workloads give these SSDs adequate idle time to empty their cache, just as they would have in a typical system.
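A toy simulation of why that idle time matters (entirely invented numbers and a deliberately crude cache model, not a model of any specific drive): writes land in a fixed-size SLC cache at full speed, the cache flushes to TLC only while the host is idle, and once the cache fills, host writes drop to TLC speed.

```python
# Toy SLC-cache model (invented numbers, deliberately simplified).
SLC_CACHE_GB   = 8
SLC_WRITE_GBPS = 0.55    # host write speed while the cache has room
TLC_WRITE_GBPS = 0.15    # host write speed once the cache is full
FLUSH_GBPS     = 0.20    # background SLC -> TLC flush rate (idle time only here)

def simulate(host_is_writing, seconds=120):
    """host_is_writing(t) -> True if the host writes during second t.
    Returns the average write speed over the seconds the host was writing."""
    cache_used = 0.0
    written = 0.0
    active_seconds = 0
    for t in range(seconds):
        if host_is_writing(t):
            active_seconds += 1
            if cache_used < SLC_CACHE_GB:
                rate = SLC_WRITE_GBPS                       # cache has room
                cache_used = min(SLC_CACHE_GB, cache_used + rate)
            else:
                rate = TLC_WRITE_GBPS                       # cache full: TLC speed
            written += rate
        else:
            cache_used = max(0.0, cache_used - FLUSH_GBPS)  # idle: cache drains
    return written / active_seconds

saturated = simulate(lambda t: True)          # constant, saturating write stream
bursty    = simulate(lambda t: t % 20 < 5)    # 5 s of writing, then 15 s idle
print(f"saturated stream: {saturated:.2f} GB/s while writing")
print(f"burst + idle    : {bursty:.2f} GB/s while writing")
```

With these made-up numbers, the saturated stream settles near TLC speed once the cache fills, while the burst-plus-idle pattern never outruns the cache, which is exactly the behavior the Burst workloads are meant to capture.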
Apologies for the wall of text. Now on to the show!
Comments
What am I missing here? It seems like lower perf than 850 Evo while costing slightly more.
It beats the EVO in the mixed workload test and comes very close to it in the others.
I would hope for more… given that the 850 Evo was released a long time ago, and it's already halfway through 2017.
You hoped wrong. Not much has changed in the SSD space, and the technology has plateaued due to SATA limitations. What could be hoped for here is slightly better performance at a similar price point. That was achieved.
People reaching for Intel hate from a cloud of ignorance and inexperience is getting old fast, let alone the preaching from users who don't know what they're talking about.
Cool story?
No need to bring a totally irrelevant point into the discussion. I dunno what the hell the last paragraph was about, but at least you let your inner struggles out. Or maybe in your world expecting or hoping for more is Intel hate.
SATA limitations cap bandwidth; random workloads don't come close to saturating it.
If you can't win on performance, you win on price.
Hoped wrongly? Lol okay buddy.
Higher performance… in one test, the mixed workload? Not a clear-cut win.
Agreed, Dark wizzle. Would go Samsung 960 Evo PCIe over this every time.
The 545S is meant for those upgrading an HDD-equipped laptop or desktop (which might not have an M.2 slot), and the 960 EVO costs $40 more, which not everyone can afford, despite the better performance.
Fair enough. But in that scenario I’d just grab an ADATA SU800 which is $140 with promo code right now on Newegg.
You aren’t every user.
What is so challenging for people to understand about this product and review?
“But in that scenario I’d just grab an ADATA SU800”
But with that you're gambling on a non-brand-name SSD and a shorter warranty: 3 years vs. 5 years with Intel and other name brands. Basically, a manufacturer's warranty on an SSD pretty much speaks volumes about the quality of the product.
I got an Intel 545s 128GB SSD for only $31.99 in July 2018, and that's hands down the best SSD I have found for around $30. Its normal price of around $50 or a bit more makes it far less appealing, though; at that point you're better off going with something else in a larger capacity. To me, 120-128GB SSDs are not worth more than around $30, since much beyond that you're better off stepping up to a 250GB-class drive. Basically, 120-128GB SSDs are mostly good for internet machines and not much beyond that. As a general rule I suggest most people get at least a 250GB-class SSD, and if you're a gamer the 500GB-class drives are the sweet spot right now in terms of capacity/price.
For the most part I would be cautious buying SSDs that are not from Crucial/Intel/Samsung (and maybe a small number of other brands like Western Digital) if you want more proven quality. Venturing outside of those can save you a bit of money, but you're gambling on quality/longevity and a shorter warranty too. Personally, I would straight up avoid the generic brands if you want reliability; to me it's not worth saving a little money on a drive that might not last nearly as long.
So that's nice and all, but where's consumer Optane? Take my money, Intel!
I think it was supposed to be late this year.
More fast SATA competition is a good thing; we need NAND prices to resume their historical downward trend.
Meh, if you have to put "(for an Intel SSD)" then it's not a good deal. Not feeling the gold award on this; maybe bronze or silver.
more substandard TLC shit for way too much money
should be 10 cents/gig for this shit already
You seem to act like TLC is crap when it's far from it. The bottom line is that TLC is still plenty fast and write endurance is great: just about any modern (or semi-modern) SSD should be able to crack at least '7x TBW', which means if one wrote 20GB per day EVERY SINGLE DAY it would last at least 10+ years (i.e. 73TB of written data in 10 years @ 20GB per day), and the vast majority of people won't consistently write that much data day after day.
Or put it this way: unless you're going crazy writing data to the drive, you should be able to get an EASY 5+ years of life from a modern SSD, and I would expect 10+ years in general. Say one wrote 40GB a day; that should still give at least 5+ years of use, and likely comfortably beyond that, since 40GB of writes per day is 14.6TB per year, or 146TB in 10 years, and it's not unrealistic for a 7x TBW rated drive to survive that amount of writes, while 40GB is a lot of data writing per day for the vast majority of people. Plus, 10 years is a lot of time for technology to advance; how many people who get a SATA SSD now will still be using it in 10 years? Some, but probably not too many, especially if the computer it's in becomes obsolete in that time (if it's not at least a decent internet machine, I don't see people hanging onto it for long). Plus, even a 500GB SSD today, which is about the top end of what most people would buy (in Sep 2018) in terms of SATA SSDs, is not that much storage space, and will be that much less in 10 years.
Also, from the looks of things, the official TBW rating tends to be conservative, which means real-world drives will likely last quite a bit beyond the official rating before failure from writes actually occurs.
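For what it's worth, the arithmetic above checks out; here's the same estimate spelled out as a quick sketch (it ignores write amplification and assumes wear from host writes is the only failure mode):

```python
# Back-of-the-envelope endurance estimate: ignores write amplification and
# any failure mode other than NAND write wear.
def years_of_life(tbw_rating_tb, gb_written_per_day):
    tb_per_year = gb_written_per_day * 365 / 1000
    return tbw_rating_tb / tb_per_year

for gb_per_day in (20, 40):
    print(f"{gb_per_day} GB/day against a 73 TBW rating: "
          f"{years_of_life(73, gb_per_day):.1f} years")
# 20 GB/day -> ~10 years; 40 GB/day -> ~5 years, matching the figures above.
```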
To make sure I’m understanding correctly: do you actually fill the drive with data for each of your % testing? Like, you just throw large files/large amounts of files on the drive such as movies or something?
Just wanting to make sure I’m fully grasping the testing methodology.
(Sorry, logged in now)
And would you recommend this testing methodology for older SSDs that are only MLC? I’ve got a PNY XLR8 Pro 480GB SSD that’s about 1.5 years old that I’d be interested in testing using your method.
Yes, the drive is filled with actual files during the sequence. They are large files meant to replicate bulk media being stored over time. The random portion remains the same size and in the same location during the test. This is all to get as close as we can to what actually happens to an SSD in real world use.
I'm still rocking Intel first-generation client SSDs, and Intel is the only SSD vendor I use. I've never had one fail yet. Awesome and detailed review, Allyn!!
If you mean you're still using X25-Ms, a modern "good" SSD will be very noticeably faster.
I’m still using 320 series. I can’t believe it’s already 6 years old, but it’s still going strong (according to Intel’s SSD Toolbox, still 100% life remaining, which I don’t fully trust).
Why did I ever worry about SSD endurance?
“Why did I ever worry about SSD endurance?”
I heard bad things about SSDs in the earlier days, but anything semi-modern should last a long time.
Also, rather than 'life remaining' I would look at the total data written to the drive versus its TBW rating, as that's the best indicator of roughly how much life is left, assuming the drive only fails from write wear.
Sure, I realize an SSD could fail out of nowhere from something else, but chances are, unless you got a faulty unit, a brand-name SSD like Samsung/Crucial/Intel (and possibly some others) should easily last many years.
In my main PC I have a Samsung 850 EVO 250GB that I've had since May 2015, and here, 3 years and 4 months later, 12.3TB has been written to it (it's rated at 75 TBW but will likely go well beyond that before actual write-induced failure). So I'm in the ballpark of 4TB a year, and at my current rate, assuming the drive only fails from write wear, I would see 20+ years of use out of it. NOTE: the amount of data written to the Samsung SSD would be noticeably higher had I not used my regular hard drives for my larger files, which are mostly video. Then again, just about any of the more affordable SSDs right now (i.e. 500GB and under, which is where I expect most people to shop) are not all that large, so many people will still need a regular hard drive.
As larger-capacity SSDs drop in price, I suspect I'll start using some of them for larger data, which will increase how much I write to them quite a bit. But once TBW ratings get into the hundreds of terabytes, you can simply use the drive, do pretty much whatever you want, and not even worry about how much data you're writing, because it will still easily last 5+ years.
We do LUV you, Allyn.
However, does your new test suite mean
that we won’t be seeing ATTO numbers for
Intel’s upcoming 2.5″ NVMe Optane? 🙂
p.s.
Bumper sticker seen in Oregon:
CONSTANT CHANGE IS HERE TO STAY.
Pleasant surprise, as this year has been the "race to the bottom in SATA SSDs."
Seems to have fixed the latency problems of the first gen (MX300).
It will be interesting to see if Crucial brings this NAND to us with a Marvell controller.
An 850 EVO killer it's not, but it's close enough to consider if it's priced right…