Surprising Findings on SSDs in Data Centers

February 29, 2016 - In a report presented at the USENIX Conference on File and Storage Technologies, researchers from Google and the University of Toronto (Raghav Lagisetty, Arif Merchant, and Bianca Schroeder) shared startling new findings regarding storage in data center environments.

The accepted position in the solid-state drive (SSD) industry is that high-end single-level cell (SLC) NAND is more reliable than multi-level cell (MLC) NAND. In studying drive reliability, the researchers went beyond analyzing raw bit error rates (RBER). They state, “We find that RBER (raw bit error rate), the standard metric for drive reliability, is not a good predictor of those failure modes that are the major concern in practice. In particular, higher RBER does not translate to a higher incidence of uncorrectable errors.”

SLC drives have always been marketed to enterprise customers as the premium tier of SSD storage, with higher reliability, while MLC-based drives are positioned as a lower-end product. The research showed that MLC drives are just as reliable as SLC drives overall: the two types have more or less equal repair and replacement rates.

Another finding stated in the paper is that the real predictor of higher drive error rates is age rather than amount of usage. In comparing SSDs with HDDs, the report showed that SSDs have a lower replacement rate than HDDs. However, SSDs have a 20-63% chance of experiencing uncorrectable errors over a four-year period, and 30-80% of SSDs develop bad blocks.

