Taking an Accurate Look at SSD Write Endurance
Last year, I posted a rebuttal to a paper describing the future of flash memory as ‘bleak’. The paper went to great (and convoluted) lengths to paint a tragic picture of flash memory endurance moving forward. Yesterday a newer paper hit Slashdot – this one doing just the opposite, going so far as to assume production flash memory handles up to 1 million erase cycles. You’d think that since I’m constantly pushing flash memory as a viable, reliable, and super-fast successor to hard disks (aka 'Spinning Rust'), I’d just sit back on this one and let it fly. After all, it helps make my argument! Well, I can’t, because if there are errors published on a topic so important to me, it’s in the interest of journalistic integrity that I must now post an equal and opposite rebuttal – even if it works against my case.
First I’m going to invite you to read through the paper in question. After doing so, I’m now going to pick it apart. Unfortunately I’m crunched for time today, so I’m going to reduce my dissertation into the form of some simple bulleted points:
- Max data write speed did not take into account 8b/10b encoding, meaning SATA 6Gb/sec tops out at 600MB/sec of payload, not 750MB/sec.
- The flash *page* size (8KB) and block size (2MB) chosen more closely resemble those of MLC parts (not SLC – see below for why this is important).
- The paper makes no reference to Write Amplification.
Perhaps the most glaring and significant omission: all of the formulas, while correct, fail to consider the most important factor when dealing with flash memory writes – Write Amplification.
Before getting into it, I'll reference the excellent graphic that Anand put in his SSD Relapse piece:
SSD controllers combine smaller writes into larger ones in an attempt to speed up the effective write speed. This falls flat once all flash blocks have been written to at least once. From that point forward, the SSD must play musical chairs with the data on each and every small write. In a bad case, a single 4KB write turns into a 2MB block rewrite. For that example, Write Amplification would be a factor of 512 (2MB / 4KB), meaning the flash memory is cycled at over 500x the rate calculated in the paper. Sure, that’s an extreme example, but the point is that by not referencing amplification at all, the paper implicitly assumes a factor of 1 – which would only hold if you wrote nothing but whole 2MB blocks to the SSD. That is almost never the case, regardless of Operating System.
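A quick sketch of the effect in Python (illustrative only – real controllers land somewhere between these two extremes):

```python
def write_amplification(nand_bytes_programmed, host_bytes_written):
    """WA = bytes actually programmed to flash / bytes the host requested."""
    return nand_bytes_programmed / host_bytes_written

# Worst case from the text: a 4KB host write forces a full 2MB block rewrite.
worst_case = write_amplification(2 * 1024 * 1024, 4 * 1024)   # 512.0

# Best case assumed (implicitly) by the paper: pure 2MB block writes.
best_case = write_amplification(2 * 1024 * 1024, 2 * 1024 * 1024)  # 1.0

# Flash wears at (host write rate) x (write amplification), so any
# endurance formula must multiply in this factor.
```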
After posters on Slashdot called out the author on his assumptions of rated P/E cycles, he went back and added two links to justify his figures. The problem is that the first points to a 2005 data sheet for 90nm SLC flash. Samsung’s 90nm flash was 1Gb per die (128MB). The packages were available with up to 4 dies each, and scaling up to a typical 16-chip SSD, that only gives you an 8GB SSD. Not very practical. That’s not to say 100k is an inaccurate figure for SLC endurance – it’s just a really bad reference to use. Here's a better one from the Flash Memory Summit a couple of years back:
The second link was a 2008 PR blast from Micron, based on their proposed pushing of the 34nm process to its limits. “One Million Write Cycles” was nothing more than a tag line for an achievement accomplished in a lab under ideal conditions. That figure was never reached in anything you could actually buy in a SATA SSD. A better reference would be from that same presentation at the Summit:
This shows larger process nodes reaching even beyond 1 million cycles (given sufficient additional error bits for error correction), but remember: to be practical for real-world use, the flash has to be available in a usable capacity, and that’s just not the case for the parts in the above chart.
At the end of the day, manufacturers must balance cost, capacity, and longevity. This forces a push towards smaller processes (for more capacity per cost), with the limit being how much endurance they are willing to give up in the process. In the end they choose based on what the customer needs. Enterprise use leans towards SLC or eMLC, as those customers are willing to spend more for the gain in endurance. Typical PC users get standard MLC and now even TLC, which are *good enough* for that application. It's worth noting that most SSD failures are not due to burning out all of the available flash P/E cycles. The vast majority are due to infant mortality failures of the controller, or to buggy firmware. I've never written enough to any single consumer SSD (in normal operation) to wear out all of the flash. The closest I've come to a flash-related failure was when an ioDrive failed during testing because excessive heat caused a solder pad to lift on one of the flash chips.
All of this said, I’d love to see a revisit to the author’s well-structured paper – only based on the corrected assumptions I’ve outlined above. *That* is the type of paper I would reference when attempting to make *accurate* arguments for SSD endurance.
To add to this with some simple yet generic P/E figures for current gen flash memory:
The concern over endurance is way overblown in the consumer space. A 20nm TLC cell can only withstand ~1k writes, but if you’re buying a 250GB drive (like the 840) you have plenty of bits to spread your wear across for normal workloads.
For every NAND shrink we see an increase in density which all but negates the loss in total DRIVE endurance. Sure the individual NAND cell has lower endurance but you get a lot more cells to work with.
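That back-of-the-envelope can be sketched in Python; the write-amplification factor of 2 below is my assumption, not a figure from the comment:

```python
def drive_tbw(capacity_gb, pe_cycles, write_amp=2.0):
    """Rough host terabytes writable before the rated P/E budget is spent."""
    return capacity_gb * pe_cycles / write_amp / 1000

def years_of_life(tbw, gb_per_day):
    """How long that budget lasts at a steady daily write rate."""
    return tbw * 1000 / gb_per_day / 365

# 250GB of 1k-cycle TLC with WA of 2 -> 125 TB of host writes;
# at a heavy 20GB/day, that is on the order of 17 years.
```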
And that’s not accounting for technological breakthroughs that can help extend the life of NAND. For instance 20nm IMFT has 3k P/E cycles, just like 25nm IMFT MLC because it uses a high-K + metal gate design. Add in adaptive DSP technologies that adjust reference voltages for aging NAND and the eventual shift to 3D NAND and it’s clear there’s still a lot of life left in NAND.
My guess is the HDD companies are the ones trying to spread this FUD about the flash market.
FYI, this guy is Les Toker. He also has that fake account named “JoeComp” on [H], and uses another fake name, “Jim Williams”. He’s the owner of The SSD Review and there isn’t a single place on the web he hasn’t trashed. And he wants ME to REVIEW SSDs for him…
Oh, and don’t buy into TLC. TLC is garbage. Yeah, surely it’ll last, but what’s the point when you can have an MLC SandForce drive for the same price that’s faster, more durable, and will be easier to sell?
If he wants YOU to review drives for him, then he really must be out of his mind…
Come on you ignorant little bunghole, TheSSDReview is one of the most respectable sites when it comes to SSD testing!
You are a nobody, and I feel bad for you that you must verbally assault those of us who have reached places in our careers that you only dream of.
Jealous of other people’s talents, feeling some need to defend your quite-likely inferior computer hardware because heaven forbid you didn’t get the absolute best stuff available, or sitting on your behind chastising other people’s cars (“Why would you waste your money on a Porsche 911 GT3 RS 4.0, everyone knows the Turbo is faster…” or “Why would you buy a 2006 M3? What a stupid thing to do! I’m going to get an Audi RS4 and I’ll kick your butt in a race, you’ll see!”… I’ve heard both).
LET OTHER PEOPLE ENJOY THEIR THINGS! You’re not just raining on people’s parades, you are like a hurricane made entirely of feces that whirls around the internet, trying to make other people as miserable as you feel on the inside.
While there are numbers for media lifespan, and reference materials for any theoretical limit, one should keep in mind that all of those numbers are still just best guesses by people with various credentials.
While it’s highly likely you know someone who has had an SSD die or exhibit some failure, there have to be plenty of happy SSD users whose drives have gone well past their expected expiration date.
Personally, I’ve enjoyed my pair of first-gen Intel 80 GB SSDs since December of 2008. And it’s followed my gaming machine to countless LAN parties and events–plugging away, every day, for at least 4-12 hour stretches–as my boot drive.
Just how many cycles are estimated to happen in an hour?
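As a rough answer to that question – a sketch where every input is an assumption on my part (a steady average write rate, and wear spread evenly across the drive by wear leveling):

```python
def cycles_per_hour(mb_written_per_hour, capacity_gb, write_amp=1.5):
    """Average P/E cycles consumed per hour, spread across the whole drive."""
    nand_mb_per_hour = mb_written_per_hour * write_amp
    return nand_mb_per_hour / (capacity_gb * 1024)

# e.g. 100 MB/hour of host writes to an 80GB drive at WA 1.5:
# ~0.0018 cycles/hour, or roughly 16 cycles per year of 24/7 use.
```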
Many SSDs include SMART reporting that lets you read or approximate the average write/erase count. If you know the number of bytes written to the drive, you can actually compute write amplification backwards from this. Most of the newer Windows SMART tools know how to read these values.
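A sketch of that backwards computation in Python (SMART attribute layouts vary by vendor, so treat the inputs as values you've already pulled from a SMART tool):

```python
def write_amp_from_smart(avg_erase_count, block_bytes, total_blocks,
                         host_bytes_written):
    """Estimate write amplification from SMART-reported wear.

    NAND bytes programmed ~= average erase count x total NAND capacity;
    dividing by host bytes written gives the amplification factor.
    """
    nand_bytes = avg_erase_count * block_bytes * total_blocks
    return nand_bytes / host_bytes_written

# e.g. average erase count of 100 across 128GiB of NAND (2MiB blocks),
# with 1TiB of reported host writes -> WA of 12.5
wa = write_amp_from_smart(100, 2 * 1024**2, 65536, 1024**4)
```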
One issue with SSDs is not when they wear out, but how they wear out. When you go past the limit, the drive drops below its minimum error-rate and data-retention specs. The error specs are likely to be invisible to you, as a bit of extra error correction is transparent. The retention issues are more troubling: if you have a 3,000-cycle-endurance drive at 10,000 cycles, it likely still seems to work OK. Then you power it down for a couple of days and it comes back empty. Not pleasant.
Please excuse me, but what about a Windows VM (Oracle’s VirtualBox) running in Debian? It seems to write about 10GB every eight hours.
I’ve got one on a Samsung 830 120GB and am starting to get concerned.
Just a basic Win 7 install with QuickBooks, Firefox, antivirus, and little else.
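For what it's worth, the commenter's numbers can be plugged into a quick estimate; the 3,000 P/E cycles and write amplification of 2 below are my assumptions for the 830's MLC, not published specs:

```python
def days_to_wearout(capacity_gb, pe_cycles, host_gb_per_day, write_amp=2.0):
    """Days until the rated P/E budget is exhausted at a steady write rate."""
    total_host_gb = capacity_gb * pe_cycles / write_amp
    return total_host_gb / host_gb_per_day

# 120GB drive, 3k cycles, 30GB/day (10GB per 8 hours around the clock):
# 6,000 days, i.e. over 16 years - little cause for concern.
days = days_to_wearout(120, 3000, 30)
```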
I am happy to see the passion for accurate information. This is another reason why I love to come to PC Perspective for my information.
Just a hint… As we know, an operating system such as Windows continuously writes to disk. The biggest culprit is the paging file (this happens on Linux as well, via the swap partition). If you have the luxury of having both SSD and HDD drives in your computer, consider moving the Windows paging file to the HDD. You may think that this will reduce performance, but if you have sufficient physical RAM (like 8GB in a workstation) there shouldn’t be much paging happening, and hence minimal performance impact. (If performance is an absolute must, remove the paging file completely [you can run swapless on Linux too] – just make sure you then never run out of physical RAM, or your PC will crash – lol)
You could add some months or even years to your precious SSD 🙂 Just a thought…
Tried SSD+HDD. It worked fine for the first 4 weeks, then the HDD in the optical drive bay began to fade out – maybe overheating. Now running SSD+RAM: a 4GB laptop with a 400MB ramdisk allocation, so far about 35% full. I use Microsoft's procmon.exe to see which applications do the most writing to disk, especially to %appdata% folders, and assigned those files to the ramdisk using Win7 symbolic links (‘junctions’).

In theory you can take the entire C:\Users\username folder and link it to some other drive, but this is not recommended, because it will freeze badly if there are any hiccups in access to the other drive. I've had a handful of user profiles go bad. Recreating a botched user account is possible but a long, windy chore. Only move the 4 or 5 most offending app folders, and always keep one unmodified, pristine user account so you can log into it, delete the botched account, and then recreate it. Cannot stress this enough.

The claim that a modern SSD should be just fine with Win7 is bogus. Even a laptop spartanized by turning off every possible service and log (450+ logfiles) still has write storms that are out of control. Explorer and system services are among the worst offenders. Applications update the Registry without any self-restraint, which in turn causes the registry hive to be committed to disk. Probably all these .NET frameworks are to blame for this promiscuous registry writing. Procmon shows thousands of small file writes per minute. During 4 weeks of use the disk accumulated 400GB of writes – about 3 times the size of the drive. Procmon's FileWrite(_) and RegSetKey(_) function tracing are the helpers that let you redirect the bad apps to the ramdisk. The SSD rarely goes even 2 or 3 seconds without any writes occurring at all.
120GB SSD split 80/30 into volumes C and D, with the pagefile on volume D. A Procmon filter keeps an eye on any file paths that begin with “C:” for simplicity. The registry hive gets committed to the C:\Users\username\ntuser.dat file. Just pressing the Windows button and opening the Start menu already causes a whirlwind of writes. The villain is Explorer.exe’s use of RegSetKey(_) calls to update the MuiCache registry key. The same value is written 7 times in a row – what a waste.