While the Meltdown announcements and patches were in full swing, I was busily testing a round of storage devices to evaluate the potential negative impact of the Meltdown patch. Much of the testing we've seen has come in the form of Linux benchmarks, and today we saw a few come out on the Windows side of things. Most of the published data to date shows a ~20% performance hit to small random accesses, but I've noted that the majority of reviewers seem to be focusing on the Samsung 950/960 series SSDs. Sure, these are popular devices, but when evaluating changes to a storage subsystem, it's unwise to stick with a single type of product.
Test conditions were as follows:
- ASUS Prime Z270-A + 7700K
- C-States disabled, no overclock.
- ASUS MCE disabled, all other clock settings = AUTO.
- Intel Optane 900P 480GB (Intel NVMe driver)
- Samsung 960 EVO 500GB (Samsung NVMe driver)
- Samsung 850 EVO 500GB (Intel RST driver)
- NTFS partition.
- 16GB test file. Sequential conditioning.
- Remainder of SSD sequentially filled to capacity.
The first results come from a clean Windows 10 Redstone 3 (RS3) install compared to a clean Windows 10 Redstone 4 (RS4, build 17063) install, a Fast ring build that includes the Meltdown patch:
The 960 EVO comes in at that same 20% drop seen elsewhere, but check out the 850 EVO's nearly 10% *increase* in performance. The 900P pushes this further, showing an over 15% *increase*. You would figure that a patch that adds latency to API calls would have a noticeable impact on a storage device offering extremely low latencies, but that did not end up being the case in practice.
Since the 960 EVO looked like an outlier here, I also re-tested it using the Microsoft Inbox NVMe driver, as well as by connecting it via the chipset (which uses the Intel RST driver). A similar drop in performance was seen in all configurations.
The second set of results was obtained later, taking our clean RS3 install and updating it to current, which at the time included the Microsoft roll-up 01-2018 package (KB4056892):
Note that the results are similar, though Optane did not see as much of a boost here. It is likely that RS4 includes some specific optimizations that are more beneficial to lower-latency storage devices.
As a final data point, here's what our tests look like with software polling implemented:
The above test results are using an application method that effectively bypasses the typical interrupt requests associated with file transfers. Note that the differences are significantly reduced once IRQs are removed from the picture. Also note that kernel API calls are still taking place here.
Well there you have it. Some gain and some lose. Given that a far lower latency device (900P) sees zero performance hit (actually gaining speed), I suspect that whatever penalty associated with Meltdown could be easily optimized out via updates to the Windows Inbox and Samsung NVMe drivers.
It would be interesting to see whether these Meltdown patches affect other operating systems such as Linux, macOS, Windows 7, etc.
The impact seems to be much larger in benchmarks such as AS SSD and CrystalDiskMark.
It also seems to cause performance reductions in applications that make use of PostgreSQL (such as some video editors).
The problem here is that those benchmarks are very poor tools for measuring QD=1 workloads, mainly because they do not load down the CPU as it would be while doing actual work with the data at such a workload. I documented this further in a white paper released by Shrout Research. I was able to inflate the AS SSD and CDM results by simply running something in the background to increase CPU load. If an outside app can *increase* the score of a benchmark, well, it's not a very good/accurate benchmark.
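For readers unfamiliar with the term, a QD=1 workload keeps exactly one request in flight at a time, so every read waits on the full round trip before the next one is issued. The following is a minimal Python sketch of that access pattern, not the method used in the white paper or by AS SSD/CDM. The function names, the throwaway test file, and its 16 MiB size are all assumptions made for illustration, and because the reads here are served from the OS page cache, the printed latency says nothing about a real SSD.

```python
import os
import random
import tempfile
import time

BLOCK = 4096  # 4 KiB, the usual "small random" transfer size


def qd1_random_reads(path, count, seed=0):
    """Issue `count` 4 KiB reads at random aligned offsets, one at a time (QD=1).

    Returns the per-read latencies in seconds.
    """
    rng = random.Random(seed)
    blocks = os.path.getsize(path) // BLOCK
    latencies = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for _ in range(count):
            offset = rng.randrange(blocks) * BLOCK
            start = time.perf_counter()
            os.pread(fd, BLOCK, offset)  # blocks until this single read completes
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return latencies


def make_test_file(size_bytes):
    """Create a throwaway file of the given size and return its path (hypothetical helper)."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(size_bytes))
        return f.name


if __name__ == "__main__":
    path = make_test_file(16 * 1024 * 1024)
    try:
        lat = qd1_random_reads(path, 10_000)
        print(f"mean latency: {sum(lat) / len(lat) * 1e6:.1f} us")
    finally:
        os.remove(path)
```

Because each read is issued only after the previous one returns, the CPU sits idle between completions, which is exactly the window in which power management and background load can change the score.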
Same for this? 900p on Anvil
From a research and repeatability standpoint, I agree you need to be as CPU-independent as you can. Does it follow from the AS SSD results that the impact is far more significant with C-states active or on lower-clocked CPUs?
Could you post the BIOS version of the board you are using?
I would bet there will be a drop when you switch to the latest/patched version.
I think what you’re seeing is this issue with the Samsung 960 EVO/PRO. What firmware are you using? The PRO 3B6QCXP7 is known to have problems, and the PRO 2B6QCXP7 is fine. Samsung is aware of the problem and is working on a fix. It is the same with the EVO 3B7QCXE7 vs. the older EVO 2B7QCXE7: the newer 3XXXXXX firmware versions have issues.
Only the PROs have the issue, not the EVOs. Also, not all of them.
If you look in the 960 EVO thread, they looked through the log files that users sent them and they had none of the errors or logs the 960 Pros had.
“I suspect that whatever penalty associated with Meltdown could be easily optimized out via updates to the Windows Inbox and Samsung NVMe drivers.”
Well, one of my (conspiracy) theories, looking at benchmarks all over the internet that in some cases show increases in performance, even slight ones, is that companies could have been holding back optimizations for months, until today. Releasing those optimizations at the same time as the patch could help avoid a number of bad reactions from customers, not to mention avoid making many lawyers happy.
More likely, you’re seeing the effects of cores/RAM being loaded ever so slightly differently from the last run. Anything within ±5% can be considered noise as far as the results go.
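The ±5% rule of thumb can be written down as a small check. This is only a sketch of that rule: the function names and the IOPS figures in the usage lines are hypothetical, not measurements from the article.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def classify(before, after, noise_pct=5.0):
    """Label a before/after result using a +/- noise_pct band around zero."""
    delta = percent_change(before, after)
    if abs(delta) <= noise_pct:
        return "noise"
    return "improvement" if delta > 0 else "regression"


if __name__ == "__main__":
    # Hypothetical IOPS figures, purely for illustration.
    print(classify(10_000, 10_400))  # change of about +4%
    print(classify(10_000, 8_000))   # change of -20%
```

A fixed band is a crude stand-in for repeating each run several times and comparing the spread, but it is enough to separate a 20% drop from run-to-run jitter.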
On my system (Z370 and 900p) I also saw an increase with the patch… until I did a BIOS update, then bam: -20%
Can you provide details?
Using AS SSD, the 4K reads and writes dropped 20% after an update to BIOS 1003 (8700K, Maximus X Hero), on an already-patched Windows system.
See the Guru3D article for a similar result; he had to redo his test.
I seem to remember somewhere that Microsoft doesn’t enable the patch unless the BIOS is updated to support it, so just because we have the patch, doesn’t mean that we are actually RUNNING the patch.
I’d be interested to know if this was taken into account in the above tests…
I’ve seen >20% deltas in QD1 performance due to BIOS updates before this, mainly due to power management tweaks, which QD1 random is most sensitive to. Which motherboard are you using, and can you confirm that the patch was specifically for Spectre?
So, for Spectre, part of the remediation is in software (the patch) and part is in the firmware/microcode (BIOS update). If you only have the patch, you are not fully protected. You should look into this. There is a PowerShell script (google for it) that shows whether you have everything enabled. I have seen other benchmarks where the patch alone doesn’t hurt performance much but the patch + microcode hurts a fair bit, so it would be good to get some more trusted results from yourself.
On Maximus X Hero.
BIOS update 1003.
From my understanding, the OS patch only activates if it detects the proper microcode (?). Running the PowerShell command Get-SpeculationControlSettings now returns all green.
In light of your testing of the microcode update’s impact on storage, should I flash my BIOS with the new update?
I have a Samsung 850 PRO 512GB for boot and two 850 EVOs (512GB each) in RAID 0. I haven’t downloaded any update or patch yet. If I do so, will I see a performance hit?
These are some nice tests and benchmarks. But they are not all that relevant, as both Intel and Google claim you need the OS hotfix/update plus a UEFI firmware/BIOS update containing the new Intel CPU microcode in order to be fully patched.
It’s the combination of the microcode update and the OS update that seems to be really hitting performance.
The impact will also be greater on processors that do not have the INVPCID instruction and therefore can’t make use of the Windows OS support for PCID optimizations, which I believe is anything Ivy Bridge and earlier.
It will also be interesting to see who will actually get the required BIOS/UEFI firmware updates with the microcode update.
Intel will work on microcode updates for their newest CPU line-ups first, and they have only guaranteed updates for CPUs released in the past five years, so it seems they might not patch CPUs prior to Haswell. They have at least said that they will be working on pushing out updates in the coming weeks.
But that doesn’t really help all that much. Your motherboard/system vendor needs to provide the BIOS/UEFI firmware update as well. We have no clue how vendors like Asus, Gigabyte, MSI, etc. are going to handle this. Everyone on Z370 and Coffee Lake will most likely see updates, but whether these vendors will go back and provide updates for older systems, and how far back they will go, nobody knows. I don’t expect that my fiancée’s computer with an Asus Rampage IV Extreme and Intel Core i7-3960X will get a firmware update, and neither will my server running an Asus P8B-E/4L and Intel Xeon E3-1275v2.
Judging by how ineffective most vendors have been at providing the critical fix for the Intel MEI vulnerability, I wouldn’t expect much.
I’m going to oversimplify this a bit for the sake of brevity, but technically, since microcode updates are stored in system reserved memory and not actually patched onto the CPU, Microsoft could do this at boot if they wanted to. I doubt they want the liability for it if any issues arise, though. There’s a VMware driver that does this, but I’m not sure whether it’s applied before or after the Windows kernel checks that the support is there for whatever mitigations they’re doing. You could also hack together your own modified UEFI with updated microcode, which thankfully has been very well documented by the UEFI/SLIC hacking community from back when using OEM Windows licenses was a common way to bypass Windows activation requirements.
It’s going to be an even longer wait for Firmware/Microcode updates from the Laptop OEMs, especially on older laptops.
So I’m not going to bother patching for this until after the regular Jan 2018 patches arrive on the 9th and I have installed those unrelated KB#s. It’s hard enough just managing to keep my 4 laptops updated every month, and that’s for laptops from 4 different laptop OEMs, some of them with crappy firmware/other after-sale service histories. Yes, Toshiba, you hear that!
Then there are the security software folks and the updates to their software/service engines, which need to set that Windows Registry setting or the patch(es) will not be enabled. That’s another level of waiting, with little information provided by the security software makers for their respective products.
If you are running Windows 7 or 8.1, well, most online news sources only speak of Windows 10, as if Windows 10 actually has a larger install base than Windows 7, or even Windows 8.1 and Windows 7 combined. So the total lack of information from all the concerned OS makers, security software makers, and motherboard and laptop OEMs/others is not very helpful.
The microcode updates are not supposed to hurt performance significantly. It is the additional user/kernel separation (that is part of the OS patch) that is reportedly responsible for the performance hit.
I think some part of the patch does not activate if hardware support is missing (in the form of a microcode/BIOS update), which is why MS provides a PowerShell tool to verify.
I know how the patch works for Linux, and there it can never be faster than before; it has to be slower, no way around that. In fact it always is slower; the only reason you mostly do not see it is that the total overhead for the syscall is usually so small you cannot measure it.
Seeing it increase performance on Windows is as unexpected as having more money in your pocket when you return from the grocery store than you had when you entered. That is, something has to be wrong; it just is not possible.
Or Windows did something really, really stupid before and they fixed it.
The increase can be due to increased CPU usage during the QD1 workload, which leads to the CPU remaining in a higher power state during the workload, which leads to faster IRQ response times. It’s explainable if you understand what is actually going on. It also explains why polling (which keeps a core pegged at max power) shows little difference pre/post patch.
Sounds a bit strange that there should be enough idle time during a benchmark for the CPU to lower the clock.
Anyway, I wanted to find out just how much of an impact it really is on Linux, so I wrote a small script, and it’s brutal.
I get a roughly 150% slowdown 🙁 Granted, it is a worst-case type of thing, but still quite bad.
Basically, a "dd if=/dev/zero of=/dev/null bs=1" over 100M of data takes 15.19 seconds without KPTI and 38.78 seconds with KPTI on an i7-6700K.
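The dd command above is a worst case because bs=1 turns every byte into one read() and one write() system call, and each of those crosses the user/kernel boundary that KPTI makes more expensive. Here is a rough Python equivalent, as a sketch under assumptions: the function names are made up for illustration, it requires a Unix-like system with /dev/zero and /dev/null, and interpreter overhead dilutes the effect compared to dd, so absolute times will not match the figures quoted. The last line just recomputes the slowdown from the two dd timings given in the comment.

```python
import os
import time


def time_one_byte_copies(n_bytes):
    """Copy n_bytes from /dev/zero to /dev/null one byte at a time.

    Every iteration issues one read() and one write() system call.
    Returns the elapsed wall-clock time in seconds.
    """
    src = os.open("/dev/zero", os.O_RDONLY)
    dst = os.open("/dev/null", os.O_WRONLY)
    try:
        start = time.perf_counter()
        for _ in range(n_bytes):
            os.write(dst, os.read(src, 1))
        return time.perf_counter() - start
    finally:
        os.close(src)
        os.close(dst)


def slowdown_percent(before_s, after_s):
    """How much longer `after_s` is than `before_s`, in percent."""
    return (after_s - before_s) / before_s * 100.0


if __name__ == "__main__":
    n = 1_000_000
    elapsed = time_one_byte_copies(n)
    print(f"{n} one-byte copies took {elapsed:.2f} s")
    # Timings quoted in the comment: 15.19 s without KPTI, 38.78 s with KPTI.
    print(f"slowdown from quoted dd timings: {slowdown_percent(15.19, 38.78):.1f}%")
```

Running the same script on the same machine with KPTI enabled and disabled gives a per-syscall comparison; real workloads batch far more data per call, which is why they see a much smaller hit.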
Ryzen + 960 Evo = suck it meltdown
Uh, enjoy your Spectre patch, I guess?
Yeah, this is also on the AMD version of the EVO for AM3+ chipsets.
Reality seems different. Compiling code (done by tens of millions of PC users many times daily) shows a 6% to 15% slowdown (on Intel processors).
It also seems that MS optimized a few things to counter the slowdown (6 months of R&D) and timed the release of the optimizations to mitigate the patch.
So the best way to benchmark the true impact of this is on Linux, using the OS switch to turn kernel page-table isolation on/off.
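When doing that kind of A/B run on Linux, it helps to confirm which state the kernel is actually in before each benchmark. Kernels that carry the KPTI patches expose a status file under sysfs, and page-table isolation can be disabled at boot with the nopti (or pti=off) kernel parameter. The sketch below reads that file; the helper names are assumptions for illustration, and older kernels without the file simply yield None.

```python
from pathlib import Path

# Present on kernels that report CPU vulnerability status via sysfs.
MELTDOWN_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/meltdown")


def read_meltdown_status(path=MELTDOWN_SYSFS):
    """Return the kernel's Meltdown status line, or None if it is unavailable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def pti_enabled(status):
    """True if the status line reports page-table isolation as the active mitigation."""
    text = status.strip().lower()
    return text.startswith("mitigation") and "pti" in text


if __name__ == "__main__":
    status = read_meltdown_status()
    if status is None:
        print("No sysfs status file; this kernel may predate the KPTI patches.")
    else:
        print(f"{status} -> PTI enabled: {pti_enabled(status)}")
```

Logging this line next to each benchmark result avoids the situation discussed earlier in the thread, where a patch is installed but not actually active.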
No update on this with the Powershell command output to make sure everything is properly activated?