Introduction
Samsung released a new version of Magician and a new 840 EVO firmware. Are the troubles finally over?
The tale of the Samsung 840 EVO is a long and winding one, with many hitches along the way. Launched at the Samsung 2013 Global SSD Summit, the 840 EVO was a unique entry into the SSD market. Using 19nm planar TLC flash, the EVO would have had only mediocre write performance if not for the addition of a TurboWrite cache, which added 3-12GB (depending on drive capacity) of SLC write-back cache. This gave the EVO great all-around performance in most consumer usage scenarios. It tested very well, was priced aggressively, and remained our top recommended consumer SSD for quite some time. Other editors here at PCPer purchased them for their own systems. I even put one in the very laptop on which I'm writing this article.
An 840 EVO read speed test, showing areas where old data had slowed.
About a year after release, some 840 EVO users started noticing something weird with their systems. The short version is that data that sat unmodified for a period of months could no longer be read at full speed. Within a month of our reporting on this issue, Samsung issued a Performance Restoration Tool, which was a combination of a firmware and a software tool that initiated a 'refresh', where all stale data was rewritten, restoring read performance back to optimal speeds. When the tool came out, many suspected that the drives would just slow down again in the future. We kept an eye on things, and after a few more months of waiting, we noted that our test samples were in fact slowing down again. We did note it was taking longer for the slowdown to manifest this time around, and the EVOs didn't seem to be slowing down to the same degree, but the fact remained that the first attempt at a fix was not a complete solution. Samsung kept up their end of the bargain, promising another fix, but their initial statement was a bit disappointing, as it suggested they would only be able to correct this issue with a new version of their Samsung Magician software that periodically refreshed the old data. This came across as a band-aid solution, but it was better than nothing.
Now we have the fix and I can report on what it is actually doing and accomplishing. I've been working with the new firmware and a beta version of Magician 4.6 (to be released today as I understand). Before getting into the actual results, I'll post the Q+A I had with Samsung as we were testing. This should explain at least what we expect to see:
Q: The new firmware appears to restore read performance without the need for Magician. How was this accomplished?
A: Samsung revised the firmware algorithm to maintain consistency in performance for old data under exceptional circumstances. Therefore, read performance was restored without the need for Magician. This algorithm is based on a periodic refresh feature that can maintain the read performance of this older data. The algorithm does not affect normal user scenarios (i.e. occasional PC performance degradation due to background work of SSD) or the lifespan of an SSD and can actively maintain its performance without the help of Magician. However, this algorithm does not operate when the power is off.
Q: Are there any functions of the new Magician that are required to keep read performance high?
A: The read performance has been improved by the revised firmware algorithm. If performance recovery is slow in instances where the SSD did not have enough run-time for the firmware algorithm to reach normal performance levels, or similarly, had been powered off for an extended amount of time, the performance can be recovered by using the Advanced Performance Optimization feature in Magician 4.6. This is a supplementary feature to maintain normal performance for a few exceptional circumstances.
Q: What is the upgrade process for those who did not previously upgrade using the performance restoration tool (meaning they are still on the original firmware)? Is it possible to skip directly to this new firmware and not use the performance restoration tool?
A: Users can upgrade to the new firmware through Magician 4.6, without using the performance restoration tool.
Q: Will there be a firmware update for the other Samsung TLC-based SSD models that have also demonstrated this read performance issue? If so, which models and how soon will that firmware be made available?
A: This issue had been reported for the 840 EVO SSD only.
**Edit** The new firmware will be available 'later this month'.
…so we have a firmware that can do its own periodic refreshing of data, along with an 'Advanced Performance Optimization' that can be triggered from within Samsung SSD Magician. Those who had not updated their 840 EVO before this update can skip directly to the new firmware without the need to run the old 'Performance Restoration Tool'. We were a bit disappointed to see Samsung still ignoring their other TLC SSDs, on which many have reported seeing the same type of slowdown (us included), but we've pushed that one about as hard as we could. For now, let's focus on the 840 EVO and see if this fix is really what it claims to be.
Just so I’m clear… there has never been a problem like this with the 850 EVO, right?
I’m thinking about getting a couple 850 EVOs… but this whole situation kinda freaked me out!
The end of the reviews says “One final note – this issue was *only* on older TLC Samsung SSDs. Your 850 EVO is not affected (it has a completely different flash architecture), and neither is your 840 Pro or 850 Pro (those use MLC flash, not TLC).”
Yeah I read that… just making sure there wasn’t another kind of problem I should be worried about.
Thanks!
I feel better about buying 850 EVOs now 🙂
I’ve had poor performance with the 850 EVO compared to an 840 EVO of similar capacity. In my own benchmark the 850 had slower reads than writes, and altogether it was slower than the 840 I already had. Note: The 850 EVO was recently purchased from Amazon.
They keep dragging this out; they need to get off their butt now!!!
No one is dragging their butt. This is a very complicated process, and they must make sure the new firmware works in MILLIONS of completely different circumstances and doesn’t brick the drive when you upgrade it. People seem to think that this stuff can be done by simply waving a wand, or telling some I.S. guy to throw together a script real fast. If you are having this problem, and it is bugging you or affecting you in some way, you can fix it by doing an XCopy to another drive, formatting the SSD, then recopying to it (clone it, then re-clone back). This will fix the issue until they get the new firmware released.
The slowdown is due to the way an SSD works. They aren’t holding bits magnetically; they are (as a metaphor) holding an electron prisoner in a cell (or the cell is empty). As time goes by, and the cell isn’t checked for a while, it becomes lethargic, hiding in a dark corner, and it is a little harder to check its status (maybe even requiring some error correction by smacking the cell to see if it moves). This is made harder by the fact that the cells are stacked on top of each other to allow more cells in the same surface area (three floors of cells, observed from the roof). The “fix” I gave earlier basically gives them some time outside to grab some sun and exercise, and a bit more energy, so they are more active and easily visible, making the count easier and faster. Of course this is a metaphor, but it’s a pretty good one without explaining EXACTLY how flash memory works.
New firmware is a set of instructions telling the hardware EXACTLY how to work: timings, cache handling, addressing (again, very different than a conventional HDD, which is why you NEVER defrag an SSD), error correction (on supported drives), and how to communicate with the other PC components.
Hardware fixes (and firmware) are not something you can do with a line or two of code in a few minutes, which means that a company taking the proper time, processes, testing, and steps to release proper firmware is not going to be able to release it “immediately”, it takes TIME.
This also brings up another issue I have with the younger generations that grew up with tech and internet in their hands and heads from the time they were walking: they have no idea what patience is, and expect everything to be delivered instantly. Most cannot even wait a few seconds for a webpage to load or a video to buffer; it is completely ridiculous. God forbid they have to wait days or weeks for something. The American attention span has literally (do you know what that word really means?) decreased to 45 seconds, that is LESS THAN ONE MINUTE! Colleges now have their classes set up in two 1.5 hour sessions per week, rather than a single 3 hour session weekly, because their students can’t pay attention that long (this was a complete annoyance to me when I returned to refresh my education 2 years ago; you should NOT have to take 2 days a week per class. It should have been left at one 3 hour class weekly; if they can’t pay attention, they learn to, or they fail). I returned to university because I wanted to make sure I held my advantage over the fresh graduates in my field, but after being there, seeing how the quality of the system has declined so drastically, I am no longer worried. I also feel very sorry for the professors, as they are forced to attempt to teach students who cannot pay attention to anything longer than a 1 minute video clip, and it’s only that long if they are instantly interested; otherwise it’s only a couple of seconds before they drift off looking for something “entertaining”. There are no longer full articles going into any real detail of ANYTHING in newspapers, magazines (what are left of those), webpages, etc., and I don’t think books are even read anymore. This means the younger generation is “learning” tiny bits of things, but truly understands NOTHING.
I have seen freshly graduated “programmers” who can’t do anything past a basic script (Day ONE, and MAYBE two of a quality programming class USED to cover what seems to be their entire college education). There ARE still a few students who have the capabilities, attention span, and motivation to learn their field properly, and thankfully there are programs for them, but there are only a few in these programs, receiving the education that an average student would have received 15 years ago. For those who are still reading this, you have probably noticed what I am talking about, and agree entirely. Likely, you are older than 35 years old and are as frightened for the future of this country as I am, and also just as tired of the “kids” who think they know things, but in reality barely know a definition of what it is they think they know, because they can’t pay attention to anything past the introduction (and they also gave up on reading this before the first paragraph ended, and some will comment with something like “TL-DR”). Yes, I know this is a bit “rant-ish”, but the comment it is in response to shows exactly what I am talking about, no patience, and no understanding of the underlying tech they are commenting on, otherwise they would KNOW how ridiculous their comment is. I hope more people read this than I predict will, because that means there is still hope, and there are still people out there who have the IQ to truly research, and create new things, helping to advance our technology, and our species; If I am correct though, we may return to swinging from the trees and flinging poo in the near future.
**TL-DR**
GO BACK AND READ THE ENTIRE POST, REFLECT ON WHAT IT MEANS, AND SEE IF YOU CAN FIGURE OUT WHY I PUT THIS HERE!!! (Books are NOT for propping up table legs, and they don’t put the important stuff in the intro)
Samsung should do their homework and know what they are doing with those 19nm TLC cells before they released their product.
Offtopic:
I had enough patience to read this. But I’m afraid that this won’t help your country – I’m not from USA 😀 Our high schools and colleges have sessions every working day from 8:40 to 14:00 or even 15:00 and you have to sit there and listen. If you won’t, then you can’t pass exams. That simple.
so someone like me who runs 2 840’s in raid 0 is SOL? yes i can turn off raid and run the firmware update (worked on the last firmware), but i cant run the advanced optimization.
” In trying to fix their stale data issue, Samsung now has a built-in tool that can trigger a background refresh procedure that accomplishes this same task, so in trying to fix one problem, they have actually added a useful feature to this product line.”
wouldnt this cause excess writes to the drive?
I’m pretty sure Allyn said in the article that as long as you keep the SSDs powered, you won’t need the Advanced optimization, and if you do unpower the drive, when you power it, the algorithm kicks in and goes to work. the advanced tool is only if you want the performance immediately after re-powering the SSD.
Firmware update procedure for RAID users is detailed mid-page here.
The firmware alone puts read performance where it should be. The Optimization is not necessary unless you want a bit more performance (the same can be accomplished on a RAID by running some other tools that rewrite old data). One such tool for this is DiskFresh.
Yes, running these tools eats up write cycles (1-2 cycles for a full SSD), but that's a small percentage of the available cycles (1000 minimum, but TechReport got >3000 cycles on their endurance test).
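For anyone curious what "rewriting old data" looks like in practice, here is a minimal file-level sketch in Python. It is only an illustration under stated assumptions: the function name `rewrite_file` is made up for this example, it works on whole files rather than on raw sectors, and it is not a description of how DiskFresh or Samsung's tools work internally. The idea is simply that copying the contents to a new file and swapping it into place forces the SSD to write the data again.

```python
import os
import shutil
import tempfile


def rewrite_file(path):
    """Rewrite a file's contents so the data is freshly written to the drive.

    Hypothetical helper for illustration only. It copies the file to a
    temporary file in the same directory, then atomically swaps it in.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp, open(path, "rb") as src:
            # Copy in 1 MiB chunks to keep memory use small.
            while chunk := src.read(1024 * 1024):
                tmp.write(chunk)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the new copy reaches the drive
        shutil.copystat(path, tmp_path)  # keep permissions and timestamps
        os.replace(tmp_path, path)       # swap the fresh copy into place
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling `rewrite_file("some_old_file.bin")` leaves the contents unchanged but costs one write of the file's size, which is where the extra write cycles mentioned above come from. Back up first before trying anything like this on data you care about.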
So does that mean if you have a raid-0 pair of 840 evo the new firmware will refresh old data in the background if the system is left on? I assume you’d have to make sure the ssd’s are set to always on in power management but correct me if I’m wrong.
How can Samsung say such nonsense: “the algorithm does not affect the lifespan of an SSD” ?
If the firmware itself rewrites old data periodically, then it will eat up extra P/E cycles, there’s no way around that. And it WILL shorten the lifespan of the SSD. Maybe not by much, but it will. To me this simply looks like another lie from Samsung. I guess they’re praying for the 840 EVO warranties to run out as fast as possible…
And saying that the regular 840s are not affected by this issue: that’s the second lie from Samsung.
Seriously, they’re just a bunch of liars.
The amount of cycles used is negligible. We are talking only an occasional operation that consumes 1-2 cycles. Even if it took place every 6 months on a full drive, it would consume at worst 2% of the lifespan. I agree with Samsung's statement on that point.
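The arithmetic behind that estimate can be sketched in a few lines of Python. The cycle counts come from this thread (1-2 cycles per refresh, a refresh every 6 months, 1000 rated cycles as the conservative figure); the 5-year service window and the function name are assumptions added for illustration.

```python
def refresh_wear_fraction(cycles_per_refresh, refreshes_per_year, years, rated_cycles):
    """Fraction of rated P/E cycles consumed by periodic full-drive refreshes."""
    total_cycles = cycles_per_refresh * refreshes_per_year * years
    return total_cycles / rated_cycles


# Worst case: 2 cycles per refresh, a refresh every 6 months (2 per year),
# over an assumed 5-year window, against a 1000-cycle rating.
worst = refresh_wear_fraction(2, 2, 5, 1000)
print(f"{worst:.1%}")  # prints 2.0%
```

With the >3000 cycles TechReport observed instead of 1000, the same inputs come out well under 1%.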
I still don’t understand why they had to add the “does not affect the lifespan of an SSD” claim.
Additionally, they are very well aware that the regular 840s are also having the same issues as the 840 EVOs. Samsung is very unprofessional with this kind of telegraphic statement: “This issue had been reported for the 840 EVO SSD only”. What are they, blind? The thread on overclock.net has been clearly showing that the regular 840s have the same problem (they’re even worse, because they don’t have the SLC caching).
They are either blind or they simply disrespect their user base by acting as if they were unaware of the issue.
Again, Samsung proved that they consider us a bunch of ignorants, while they still cash in on the 840 EVO sales…
At this point, a sane person would just go with Crucial, Intel or Sandisk, instead of Samsung.
E.g. Crucial’s MX200 is as fast as the EVO (if not faster), has SLC caching, no spooky “background refreshes” that eat up NAND cycles, no risk of losing data on power outage while performing the refresh, much higher endurance (it’s MLC, not TLC) and it’s cheaper. So why would anyone still trust their “beloved” Samsung 840 EVO?
I can't disagree on them not acknowledging other models as having the issue.
The MX200 uses Dynamic Write Acceleration, which is different, but not necessarily better, depending on how you use the SSD. When testing the M600 (same DWA feature), we ran into many cases where there was no SLC cache to use. I'm still evaluating our MX200 samples.
I second your frustrations and thoughts regarding Samsung’s apparent “let’s just ignore it and maybe they’ll go away” approach to regular 840 drives. I own a 500GB 840 and short of trying to RMA the thing, I don’t know that Samsung has any plans to support their customers. Sure, I can occasionally run a program to rewrite stale data. This is a small inconvenience overall, but the principle of showing such indifference to customers is where I take exception.
As far as we know there is no reason to exclude the regular 840 drives from this “fix”. This is purely hypothetical, but it could be that without the SLC cache, old data on regular 840’s is subject to unacceptable risk during the ‘magic fix’ process. An important and reasonable distinction such as this would at least say Samsung has assessed the issue and regrettably does not see a way to safely improve the user experience of their product. At the very least, Samsung should acknowledge that these products have an issue and either communicate that they have plans to address the issue or state a clear reason for not doing anything.
*EDIT*
As we have seen in other situations, it is very unlikely for a company to openly admit that a product is defective or fundamentally flawed. They just don’t want to subject themselves to such liabilities when they can hide behind plausible deniability. If Samsung really can’t do anything about it, then admission would open the floodgates to RMA’s and class action suits. In the end, I just want a product that works pretty much as was advertised…even on day 1000.
*/EDIT*
Perhaps this is just a sign of the times where products are obsolete and discontinued before the warranty expires. However, this is not the first time I have been left with an unsupported/abandoned Samsung product. Therefore, I will choose not to consider more products from Samsung, because as you already point out, there are many other alternatives in what has become essentially a commodity market.
I’m afraid to ask, but is there any talk on fixing the non-evo 840? I was fixing the problem, or at least avoiding it, but since the last format my 840 started to slow within a month. Forgive me if the answer is already in the article, it’s slow reading and I’m not finished, but super curious.
Allyn said “I will continue to push Samsung in recognizing that users of other 19nm planar TLC flash SSDs (i.e. the 840) also see this issue.”
@Allyn
Things went smoothly, and we didn’t even need to drop our Intel SATA controller out of RAID mode (though the SSDs we were updating were not in a RAID configuration and were seen as individual drives).
Why is your system set to RAID and not AHCI?
Individual (not part of a RAID) SSDs are handled as standard AHCI devices. Keeping the BIOS in RAID mode is common practice for those who might have a RAID connected in the future (no need to deal with possible blue screens or other complications from switching the mode after the OS has already been installed).
Thank you for your reply. Big fan btw.
I’ve got my controller in RAID mode…
I have a single 840 EVO as the system drive, but I also have two huge mechanical drives running in RAID1 for massive data storage.
^Oh and thanks for keeping us in the loop on this issue. Pcper is always going above and beyond to get the answers from the right people to correct issues that matter. You guys are the BEST!
Allyn, as you have hands-on experience, do you know (or can you share) whether Magician 4.6 officially supports Windows 10?
Magician 4.5 did not; for example, it can’t obtain the OS check needed to enable RAPID mode. I think I also saw higher daily writes under 10 than under 8.1.
Maybe when the opportunity arises, you could mention the issue to the Samsung people before they release 4.6. I mean, many enthusiasts are already using it, Windows 10 will be RC soon, and the next Magician after 4.6 is probably half a year away.
I haven't asked directly, but I'm sure that Samsung will update Magician to support Windows 10 once released. Fortunately, from our testing it appears the firmware alone is 'good enough' at getting performance back to where it should be without the need for Magician (unless you absolutely want RAPID, but you really shouldn't be toying with that on a beta OS anyway).
Well since Windows 10 isn’t officially released, 4.6 will not. Magician didn’t support 8 until a month after it officially released. Same deal with 8.1.
TLC flash is not ever going to be trusted on 19nm or smaller planar nodes, and hopefully whenever the stacked flash memory gets smaller in the planar (x and y) dimension, it can be made to be thicker in the z dimension to maintain the proper amount of long term state retention. I would rather pay a little more and get the MLC and SLC and the faster speeds inherent with their use, than even having any TLC cells. I am fine with spinning rust for long term storage, and having the SLC/MLC speed advantage, even if it means less SSD space for higher costs. Stacked flash is definitely the way of the future, for higher density with any memory cell types, so hopefully the cost for SLC/MLC will go down also, and most laptop users are not even using above 1 TB in most uses, if even that.
I would like to see more Tiered storage management software make its way into the consumer/PC market, and some reviews of any Tiered storage management software that may be available to the consumer.
“TLC flash is not ever going to be trusted on 19nm or smaller planar nodes”
At this point in time Sandisk don’t appear to be having any problems with their TLC flash…
Any idea when ‘later this month’ is going to be? There are still quite a few days left in this month.
>Finally Fixed
Surely this is misleading and speculative? We don’t know if it’s fixed, only time will tell. It could possibly be a cell leakage problem which any firmware- or Magician patch will simply obscure to the user; the disk itself is still borked.
I’ve got two 250GBs, I’ll be emailing the consumer watchdog people and Samsung to see what’s what.
Neither misleading nor speculative – confirmed with testing. It's not physically possible for firmware to 'obscure' an actual significant increase in read speed when the SSD was not given enough time to read + rewrite the stale data at the slower speed prior to retesting. By *only* applying the patch and *not* refreshing old data, the read speed of very old data saw an instantaneous restoration in speed.
Seems like a band-aid fix for the issue. If cell leakage is so bad that it needs to be refreshed in order to maintain performance after just a few weeks to months, then the drive will likely not meet data retention expectations.
Remember, they tried relaxing the threshold for when the ECC kicks in, and it still did not fix the issue.
TLC flash is getting read issues in weeks that MLC would likely not get for 10+ years of stagnation.
Also, while tech report got a lot of writes out of the flash cells, the true limit of the drive was about 100 TB of writes.
After that, it started to rapidly burn through reserve flash, and even got a few uncorrectable errors, in addition to retention errors (after just a few weeks of being powered off).
When an SSD begins to constantly replace cells, then it is no longer reliable (This applies to TLC, MLC, and SLC).
Uncorrectable errors are worse than disk failures, as they mean data can be corrupted silently. E.g., an old important document has flash cells degrade or drift to the point where the drive is unable to read back the correct info. Your automated backups eventually loop and overwrite old data (unless you have enough storage to never have to overwrite), and you eventually end up with all backups containing the corrupt file, but you will not know until the one day when you need to access that file again.
(How many people here individually open each and every individual file on their system in order to confirm that they can read properly before each and every backup?)
Bit rot sucks and can also happen on mechanical HDDs and optical media as well as flash. The concern here, as you point out, is that the Samsung 19nm TLC NAND drives exhibit an unexpected/unacceptable likelihood to lose data over time. Essentially all storage should be viewed as ultimately unreliable, hence the 3-2-1 backup strategy.
This is why I use SnapRAID on my home server for long term storage. ZFS is nice with real time parity checking, but it is a bit extreme and cumbersome for the average user. SnapRAID strikes a nice balance between convenience, flexibility, and features that help protect your data. One of those features is a scrub process which reads and verifies files against drive parity as well as block-level checksums. This provides a means to periodically verify and restore the integrity of your files.
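To make the checksum half of that scrub idea concrete, here is a rough Python sketch. It is not SnapRAID's implementation and has no parity, so it can only detect a changed file, not restore it; the names `file_digest`, `build_manifest`, and `scrub` are made up for this example.

```python
import hashlib
import os


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Record a checksum for every file under root (run while data is known good)."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            manifest[path] = file_digest(path)
    return manifest


def scrub(manifest):
    """Re-read every recorded file and list those missing or no longer matching."""
    damaged = []
    for path, expected in manifest.items():
        if not os.path.isfile(path) or file_digest(path) != expected:
            damaged.append(path)
    return damaged
```

Build the manifest once, store it somewhere safe, and run `scrub(manifest)` before each backup rotation. Note that a sketch like this cannot tell silent corruption from a legitimate edit, so it is best suited to archives of files that are not supposed to change.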
Statistically speaking RAID 5 should no longer be used for arrays with drives 4TB and larger since the very act of a RAID rebuild (a time when your data is particularly at risk) may introduce silent errors.
http://www.zdnet.com/article/why-raid-5-stops-working-in-2009/
http://www.zdnet.com/article/data-corruption-is-worse-than-you-know/
While we’re at it, corruption of data in RAM is highly unlikely but still possible. Even ZFS and SnapRAID will happily write data that was silently corrupted while in system RAM. So to eliminate that gap, ECC RAM should also be considered for any machine used for long term storage and data management.
“I’m glad Samsung has stuck with it. Not many manufacturers would put so much effort into a two year old product, and the 840 EVO has proven to need a lot of work to get a difficult problem under control. Now to try to get them to enable the advanced optimization feature for all Samsung SSDs. I will continue to push Samsung in recognizing that users of other 19nm planar TLC flash SSDs (i.e. the 840) also see this issue. We will also continue to keep these samples stored with cold / stale data and retest occasionally.”
Allyn, you can’t be serious with this! They need to do a recall, and are probably required to do so by law.
If i unplug my drive and put it in the closet for a few months, what can i expect when i plug it back in? Data atrophy. This is a colossal defect, and the solution is a bad bad joke.
Shame on you.
You should read the article. The SSDs I tested were 'off' since the last update, and had slowed greatly, and simply patching the firmware alone (without running any restore function) brought read speeds back to the expected values.
What about us, I’ve installed an 840EVO on my macbook pro and lately i’m seeing the drive not responding as quick as it used to. Startup time is really turning out to be bad!!
The firmware update alone (which can be done on a macbook) will bring read speeds up considerably.
Thank you, Allyn, for your time.
Sorry guys but I’m a noob…
I just bought a brand new computer in November, with a 840 EVO SSD.
I have win 7 installed on it, then your usual stuff: games, mp3s, videos.
Here is my question: How do I update the firmware ?
Also, WILL IT DELETE ALL MY DATA?
thanks in advance
The FW update is non-destructive. Magician will prompt you to update it. You just have to press OK.
What he said, but you should still back up just in case.
This is not an acceptable fix. I will never buy a Samsung product again. It’s hard to understand how a company of this caliber can stoop to such low levels.
I don't see how this fix could be not acceptable. It brings read speeds back where they should be, and if advanced optimization is performed, it brings them even faster than any typical SSD (that had slowed over time due to fragmentation).
Thanks a lot for the article. Keeping my fingers crossed that this is the permanent fix.
Sorry if I missed it…
What % do you lose with the new software if you shut the computer down multiple times in a laptop?
ie.. Work / Library /Home {Repeat}
Hello
Noob here. I have the Evo 850 Pro on Windows 8.1 and from almost day 1 have had problems with Rapid mode. I installed Magician 4.5, did all the optimizations/overprovisioning, and enabled Rapid mode. However, I kept experiencing little but almost weekly problems with Windows 8.1, so I imaged back to version 8.0, which was the OEM version, but now I cannot install Magician 4.5. I just get an error box which states: ”Reboot is pending after last RAPID mode operation.Reboot the system and try again-ok”. Now Samsung have said to me: “There was a problem with the Magician which could have been solved remotely: we could have access your PC and change something in the Magician and the issue was solved, but at the moment we cannot use this solution anymore”. Will this update sort out my problem, do you think, or should I consider the RMA route?
Thanks for reading
Regards
I think that this periodic rewriting of old files should actually INCREASE life expectancy in real-world situations.
A lot of files are written once and seldom (if ever) change. The blocks used by these files do not participate in wear levelling, so the remaining blocks will have to absorb all writes to the drive, exhausting their P/E cycles in a shorter time. As a consequence, you will end with a drive in which a lot of blocks have been erased only a few times when the rest of them are worn out.
By periodically rewriting old files, those blocks are freed, and there will be more blocks to distribute P/E cycles.
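A toy model in Python illustrates the argument. It assumes, as the comment does, that blocks holding never-changing files are simply skipped by wear levelling; many real controllers already relocate such data on their own, and the model ignores the extra writes the relocation itself costs. The function name and the numbers are hypothetical.

```python
def max_erase_count(total_blocks, static_blocks, writes, relocate_static):
    """Spread `writes` block erases round-robin and return the worst-case count.

    If relocate_static is False, the first `static_blocks` blocks hold cold
    data and never take part, so the remaining blocks absorb every write.
    """
    erase_counts = [0] * total_blocks
    first_usable = 0 if relocate_static else static_blocks
    usable = list(range(first_usable, total_blocks))
    for i in range(writes):
        erase_counts[usable[i % len(usable)]] += 1
    return max(erase_counts)


# 100 blocks, 60 of them pinned by old files, 10,000 block writes in total.
print(max_erase_count(100, 60, 10_000, relocate_static=False))  # prints 250
print(max_erase_count(100, 60, 10_000, relocate_static=True))   # prints 100
```

In this simplified picture the most-worn block sees 250 erases when cold data stays put and 100 when it is periodically moved, which is the effect described above.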