A few weeks after updating the RAID calculator, we have also updated the MTTDL RAID Reliability Calculator. Two years ago we received requests to build a reliability model. We read a lot of the literature, met with some of the storage industry’s reliability gurus, and settled on a Poisson model. For those who are not familiar with the RAID reliability calculator, here is an excerpt from the original page:
There are certainly more elegant models than a simple Poisson distribution, and this one does not take into account failures of other parts such as disk controllers, motherboards, power supplies, etc. It took a lot of back and forth, but the basic idea is this: the calculator is “directionally” correct but is not the most accurate way to model all of that. We did evaluate a much more accurate model, but on an AWS m1.small instance it took over 15 minutes to complete with only one user. Simply put, this calculator will give you a fairly good idea of which RAID level is the most reliable for a given number of drives.
In short, the RAID reliability calculator is a quick way to see how changing parameters such as RAID level, enterprise vs. consumer drives, the number of drives in an array, and other factors affects RAID reliability. Want to know the relative chance of data loss between enterprise drives in RAID 10 and consumer drives in RAID-Z3? This calculator will help you compare.
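For readers curious what a Poisson-style estimate looks like in practice, here is a minimal sketch. It is not the calculator's actual code. It uses the common textbook MTTDL approximation for a single parity group and assumes independent drive failures at a constant rate, a fixed rebuild time, and no unrecoverable read errors. The function names and the example MTTF and rebuild figures are hypothetical, and mirrored layouts such as RAID 10 would need a different formula.

```python
import math


def mttdl_hours(n_drives, fault_tolerance, mttf_hours, mttr_hours):
    """Textbook MTTDL approximation for one parity group.

    fault_tolerance is the number of simultaneous drive failures the
    group survives (0 = striping only, 1 = single parity, 2 = double parity).
    """
    if not 0 <= fault_tolerance < n_drives:
        raise ValueError("fault_tolerance must be in [0, n_drives)")
    numerator = mttf_hours ** (fault_tolerance + 1)
    denominator = mttr_hours ** fault_tolerance
    # Data is lost when fault_tolerance + 1 drives fail in overlapping
    # windows: the first of N drives, then one of the remaining N - 1
    # during the rebuild, and so on.
    for i in range(fault_tolerance + 1):
        denominator *= n_drives - i
    return numerator / denominator


def loss_probability(mttdl, mission_hours):
    """Chance of at least one data-loss event, treating loss events as a
    Poisson process with rate 1 / MTTDL."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-mission_hours / mttdl)


if __name__ == "__main__":
    five_years = 5 * 365 * 24
    # Hypothetical inputs: 8 drives, 1,000,000 h MTTF, 24 h rebuild.
    for label, tolerance in (("single parity", 1), ("double parity", 2)):
        mttdl = mttdl_hours(8, tolerance, 1_000_000, 24)
        print(label, loss_probability(mttdl, five_years))
```

As with the calculator itself, the absolute numbers from a sketch like this matter less than how they move when you change the drive count, the fault tolerance, or the drive MTTF.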
Over the past year we found a few minor calculation errors, along with a few opportunities to speed up the model. Another major area of improvement is clearer labeling throughout the model. We also made the page wider so that RAID reliability calculator results are easier to view. Check out the updated STH RAID Reliability Calculator here!
If you do see any remaining bugs or have other suggestions, please feel free to head to the forums to comment.