Latency Percentile, IO Percentile, and Saturated (High-Res) QoS

Required reading (some additional context for those unfamiliar with our Percentile testing):

Intro to PACED workloads – 'It's not how fast you go, it's how well you go fast!'

Since this is a solo review of the P3520, with no direct comparisons to other products, I've shelved our PACED Quality of Service (QoS) testing in favor of directly evaluating the QoS figures stated in Intel's product specification. I will also be introducing a new way of presenting the QoS "9's" data, derived from IO Percentile, but before we get to that, this review is a good opportunity to take a given SSD and lay out the three presentation methods we now have at our disposal. I'll start with Latency (total time) Percentile, shift to IO Percentile, and finally rework that data into our new High Resolution QoS plot, which allows easy comparison against the specification (marked with X's, as we did earlier in this review).

Latency Percentile

Latency Percentile is a translation of the latency distribution that takes into account the time spent servicing the IOs, meaning the above plots show the percentage of the total run time spent servicing IOs up to a given latency. The results are effectively weighted by latency, where longer IOs have a larger impact on the percentile. These results are *not* used for QoS calculations, since QoS assumes the IOs are all independent, meaning if one IO stalls, the rest will just keep on going without waiting for it.
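To make the distinction concrete, here's a minimal sketch (Python with NumPy, purely for illustration and not the actual tooling behind these plots) of how a latency-weighted percentile could be computed from a raw list of per-IO latencies:

```python
import numpy as np

def latency_percentile(latencies_us, pct):
    """Latency (total time) percentile: each IO is weighted by how long it took,
    so the curve reflects the share of total run time, not the share of IO count."""
    lat = np.sort(np.asarray(latencies_us, dtype=float))
    time_share = np.cumsum(lat) / lat.sum() * 100.0   # cumulative % of total run time
    idx = np.searchsorted(time_share, pct)            # first point where that share reaches pct
    return lat[min(idx, len(lat) - 1)]                # latency (µs) at that point
```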

IO Percentile

Looking at the pure IO Percentile spread, we note some OS RAM cache hits accounting for a small percentage of reads. Those were minimized in Latency Percentile because those IOs were so much faster than the bulk and took a very small fraction of the total time to complete.
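To show why those cache hits all but vanish from the Latency Percentile view, here's a toy example with made-up figures (assume 5% of reads are served from RAM at ~10 µs and the rest from the SSD at ~100 µs; the numbers are assumptions for illustration only):

```python
import numpy as np

# Hypothetical mix: 5% of reads hit the OS RAM cache (~10 µs), the rest go to the SSD (~100 µs)
rng = np.random.default_rng(0)
lat = np.where(rng.random(100_000) < 0.05, 10.0, 100.0)

cache_hits = lat == 10.0
print(f"IO Percentile view:      {cache_hits.mean():.1%} of all IOs")                     # ~5% by count
print(f"Latency Percentile view: {lat[cache_hits].sum() / lat.sum():.2%} of total time")  # ~0.5% by time
```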

Quality of Service (QoS)

QoS is specified in percentages (99.9%, 99.99%, 99.999%), typically spoken as 'three nines', 'four nines', and 'five nines'. Each figure corresponds to the latency under which that percentage of all recorded IOs in a run complete. Enterprise IT managers and system builders care about varying levels of 9's because the long latencies out at the tail can lead to timeouts for time-sensitive operations, and adding 9's is how they quantify more stringent QoS requirements. Note that these comparative results are derived from IO Percentile data and *not* from Latency Percentile data.
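As a rough sketch of how those figures fall out of the IO Percentile data (again assuming a plain array of per-IO latencies; the function name is just for illustration):

```python
import numpy as np

def qos_nines(latencies_us):
    """QoS '9's': the latency (µs) below which 99.9% / 99.99% / 99.999% of IOs complete.
    Computed from the plain, count-based IO Percentile, not the latency-weighted one."""
    lat = np.asarray(latencies_us, dtype=float)
    return {f"{p}%": np.percentile(lat, p) for p in (99.9, 99.99, 99.999)}
```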

If you have a hard time wrapping your head around the 9's thing, it may be easier to flip things around and think about it from the standpoint of the remaining longest-latency IOs that haven't yet been accounted for as the plot progresses. As an example, the 99.9% line near the center of the vertical axis marks the latency above which only the top 10% of the top 1% (0.1%) of all recorded IOs remain, where 'top' means those IOs with the longest latency.
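A quick numerical check of that flip (using a made-up latency distribution, purely for illustration): the 99.9th percentile of all IOs lands at the same latency as the 90th percentile of the slowest 1%.

```python
import numpy as np

# Made-up latency distribution (µs), just to demonstrate the relationship
rng = np.random.default_rng(1)
lat = rng.lognormal(mean=4.6, sigma=0.3, size=1_000_000)

slowest_1pct = lat[lat >= np.percentile(lat, 99)]   # the top 1% by latency
print(np.percentile(lat, 99.9))                     # 99.9% of all IOs fall below this latency
print(np.percentile(slowest_1pct, 90))              # ~the same value: 90% of the way into the top 1%
```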

These plots are tricky to make, as they are effectively an inverse log scale. The first major increment up from the zero axis covers 90% of all IOs, and each increment after that covers 90% *of what remains*, meaning it's an asymptotic scale which will never reach 100%. The plots below essentially take the top portion of the IO Percentile results and spread them out, exponentially zooming in on the results as they approach 100%.
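The mapping that makes that scale work is just a negative log of the remaining fraction; a minimal sketch (the function name and NumPy usage are my own shorthand, not the actual plotting code):

```python
import numpy as np

def nines_axis(pct):
    """Map an IO percentile onto the 'number of nines' scale used for the High-Res QoS plot:
    90% -> 1, 99% -> 2, 99.9% -> 3, 99.99% -> 4, and so on. 100% would map to infinity,
    which is why the scale is asymptotic and the curve can never reach the top."""
    return -np.log10(1.0 - np.asarray(pct, dtype=float) / 100.0)
```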

First the specs:

…and now the results (X'd as appropriate):

QoS was within spec in nearly all cases. There were some slight oddities noted with reads, but still well within a small margin of error considering my testing is done under Windows while Intel's rating is derived using Linux (and with a completely different test suite).

The high-resolution latency distribution binning employed back when I started this whole percentile thing was actually done as a means to get more accurate QoS figures, and with High Resolution QoS now a fruit of that labor, the circle is complete. In a further twist, these very results were fed back to Intel and played a part in tweaking their QoS specification for the P3520! High Resolution QoS for the win!
