Latency Percentile, IO Percentile, and Saturated (High-Res) QoS
Required reading (some additional context for those unfamiliar with our Percentile testing):
- Introduction of Latency Distribution / Latency Percentile (now called IO Percentile)
- Introduction of Latency Weighted Percentile (now called Latency Percentile)
- Intro to PACED workloads – 'It's not how fast you go, it's how well you go fast!'
Since this is a solo review of the P3520, with no direct comparisons to other products, I've shelved our PACED Quality of Service (QoS) testing in favor of directly evaluating the QoS figures stated in Intel's product specification. I will also be introducing a novel way of presenting the QoS "9's" data, derived from IO Percentile, but before we get into that, this review is a good opportunity to take a given SSD and lay out the three presentation methods we now have at our disposal. I'll start with Latency (total time) Percentile, shift to IO Percentile, and finally rework that data into our new High-Resolution QoS plot, which allows easy comparison to the specification (marked with X's as we did earlier in this review).
Latency Percentile
Latency Percentile is a translation of the latency distribution that takes into account the time spent servicing the IOs, meaning the above plots show the percentage of the total run time spent on those IOs. The results are effectively weighted by latency, where longer IOs have a larger impact on the percentile. These results are *not* used for QoS calculations, since QoS assumes the IOs are all independent, meaning if one IO stalls, the rest keep on going without waiting for it.
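To make the distinction concrete, here is a minimal sketch of how the two percentile flavors diverge on the same data. The latency values are made up purely for illustration; the point is that a single slow IO barely moves the IO Percentile but dominates the time-weighted Latency Percentile.

```python
import numpy as np

# Hypothetical sample: IO completion latencies in microseconds
# (values are invented for illustration, not measured data).
latencies = np.sort(np.array([90.0, 95.0, 100.0, 105.0, 110.0, 5000.0]))

# IO Percentile: fraction of IOs completed at or below each latency.
io_pct = np.arange(1, latencies.size + 1) / latencies.size * 100

# Latency (time-weighted) Percentile: fraction of total service time
# spent on IOs at or below each latency; longer IOs carry more weight.
lat_pct = np.cumsum(latencies) / latencies.sum() * 100

# The lone 5000 us outlier is only 1 of 6 IOs (~17% of the IO count),
# yet it accounts for roughly 91% of the total service time.
```

Run against this toy sample, the five fast IOs reach ~83% on the IO Percentile axis but only ~9% on the Latency Percentile axis, which is exactly why the fast RAM-cache hits nearly vanish from the latency-weighted plots.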
IO Percentile
Looking at the pure IO Percentile spread, we note some OS RAM cache hits accounting for a small percentage of reads. Those were minimized in Latency Percentile because those IOs were so much faster than the bulk and took a very small fraction of the total time to complete.
Quality of Service (QoS)
QoS is specified in percentages (99.9%, 99.99%, 99.999%), spoken in a unique way (‘three nines’, ‘four nines’, ‘five nines’). It corresponds to the latency seen at the 99.x percentile of all recorded IOs in a run. Enterprise IT managers and system builders care about varying levels of 9's because those long latencies lead to potential timeouts for time-sensitive operations, and increasing the 9's is how they quantify more stringent QoS requirements. Note that these comparative results are derived from IO Percentile data and *not* from Latency Percentile data.
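Pulling the 9's figures out of recorded IO data is just a percentile lookup. The sketch below uses a purely synthetic latency sample (a normal bulk plus a small exponential tail, all values assumed) to show how each additional 9 digs further into the tail and reports a progressively worse latency.

```python
import numpy as np

# Hypothetical sample: 100k IOs, mostly ~100 us, with a small
# exponential long tail (synthetic data, for illustration only).
rng = np.random.default_rng(42)
lat = np.concatenate([rng.normal(100, 10, 99_000),
                      rng.exponential(2000, 1_000) + 100])

# Each additional '9' reports the latency that the stated fraction
# of all recorded IOs complete within.
for name, pct in [("three nines", 99.9), ("four nines", 99.99),
                  ("five nines", 99.999)]:
    print(f"{name} ({pct}%): {np.percentile(lat, pct):.0f} us")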
If you have a hard time wrapping your head around the 9's thing, it may be easier to flip things around and think about it from the standpoint of the remaining longest-latency IOs that haven't been accounted for as the plot progresses. As an example, the 99.9% line near the center of the vertical axis represents the top 10% of the top 1% (0.1%) of all recorded IOs, where 'top' means the IOs with the longest latency.
These plots are tricky to make, as they use an effectively inverse log scale. The first major increment up from the zero axis covers the fastest 90% of IOs, and each increment after that covers 90% *of what remains*, meaning it's an asymptotic scale which will never reach 100%. The plots below essentially take the top portion of the IO Percentile results and spread them out, exponentially zooming in on the results as they approach 100%.
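The axis mapping itself reduces to a one-line transform. This sketch (the function name `nines_axis` is my own, not from any plotting library) shows how each extra 9 lands exactly one major increment higher, and why 100% is unreachable:

```python
import math

def nines_axis(pct):
    """Map a percentile (0-100) onto the inverse-log 'nines' scale:
    90% -> 1, 99% -> 2, 99.9% -> 3, and so on. 100% would require
    log10 of zero, so it is an asymptote the axis never reaches."""
    return -math.log10(1.0 - pct / 100.0)

# Each additional '9' is one more major increment up the axis.
for p in (90.0, 99.0, 99.9, 99.99):
    print(f"{p}% -> {nines_axis(p):.2f}")
```

Plotting IO Percentile data against this transformed axis is what produces the "exponential zoom" toward 100% described above.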
First the specs:
…and now the results (X'd as appropriate):
QoS was within spec in nearly all cases. There were some slight oddities noted with reads, but still well within a small margin of error considering my testing is done under Windows while Intel's rating is derived using Linux (and with a completely different test suite).
The high-resolution latency distribution binning employed back when I started this whole percentile thing was actually done as a means to get more accurate QoS figures, and with High-Resolution QoS now a fruit of that labor, the circle is complete. In a further twist, these very results were fed back to Intel and played a part in tweaking their QoS specification for the P3520! High-Resolution QoS for the win!
$0.50/GB is considered good? Was this article written in 2005?
For pci-e ssds, that is considered good.
Yeah, for SATA SSDs anything <$0.25/GB is pretty good, this is about twice that but you're also getting around twice the speeds.
Too expensive for me personally, but not unreasonable IMO.
Intel enterprise SSDs didn't launch until 2008, and did so at >$10/GB (>20x the cost).
That’s good progress, so they should begin to be viable around 2024
SSD market share has doubled for the past two years. It's expected to surpass HDD a lot sooner than 2024.
in 2005 SSDs would be more like $50/GB 🙂
For that terrible 0.7 DWPD/5 years, I would take the 750 over this thing any day; performance-wise it’s not even close to the P3700/750.
Performance is no comparison, obviously. The point of this drive is cost, which is a fraction of all parts you mentioned.
Allyn, thank you, I really like the depth of your reviews, I’m actually learning stuff!
I do not find any mention of capacitors for power-loss write protection. It’s a feature on which I place great importance.
Intel has among the highest, if not *the* highest power loss testing / qualification / reliability in the industry. It wasn't mentioned specifically because at this point it's just a given for their products. Here's a blurb from one of their product briefings:
They also bombard their drives with radiation (from an accelerator) until they hang, restart them, and ensure no data was corrupted. Their testing is pretty crazy, and that's why their products typically run higher in cost compared to others, but you get what you pay for.
Many think of inflight data protection only as a safety issue, but it is also a significant performance issue. Without inflight data protection, use of inflight data must be turned off in the OS (it may be called something like write cache) to avoid data corruption in case of power failure, which in turn significantly lowers write speed.
So the point of inflight data protection or the lack of it should be hammered home in every review until it gets the warranted attention.
There are lots of layers of what would/could be considered 'in-flight'. Even with all caching disabled, the mere fact that writes are queued could be considered so, as they are technically buffered by the kernel. To strip all the way down to zero buffering would reduce the performance of *most* SSDs to painful levels, as you'd have to limit to QD=1 and disable all OS buffers.
This protection, as defined by SSD makers, is a guarantee that the data that has been received by the controller at the point of power loss will be retained and available at next power up. Host / OS-side buffers will naturally not be included here.
Very excited about P3520 especially in U.2 2.5″ format. This kind of pricing should really increase the viability (economically speaking) of big top-of-rack all flash arrays.
Not sure if you mentioned in the review but has Intel made any mention of dual-port U.2 version?
No mention of dual port for this one, but I'd guess once 3D rolls out to other models in their lineup, it will include dual port.
So, let me make sure I understand. This SSD is not tested against any other product, yet receives an editors choice. I smell something.
What you smell is no other products competing at this low of a cost/GB. Other companies are welcome to sample us their competing products (we ask them often).
It was pretty well-explained why…
what about RAID 0 on 4 of these?
We are thinking of using the P3520 or P3500 in a Supermicro 48-bay NVMe server. The P3500 might be quicker, but probably these will already move the bottleneck to the interface… Will have a look to see if you benchmarked the P3500 before…
Going to try out three of the 1.2TB P3520’s for the hot tier in a three node hyperconverged environment. It’d be interesting to know what sort of benchmark would be relevant for comparison purposes on that kind of platform, since the workload mix could look like practically anything.
Yes it would, trying to set up benchmarks simulating that kind of environment is not simple. Let us know how it goes as it could be very interesting.