Samsung BM1743 Shows How a 128TB NVMe SSD is Made

Samsung BM1743 At FMS 2024 1

At the Future of Memory and Storage 2024 event (previously Flash Memory Summit), George on our team spotted a cool display by Samsung around the company’s new 122.88TB/128TB NVMe offering. We previously covered the drive in "The Samsung BM1743 is a 61.44TB SSD Today with a 122.88TB Drive Possible," and now we see the SSD’s innards on display.

Samsung BM1743 Shows How a 128TB NVMe SSD is Made

At the Samsung FMS 2024 booth, we saw the Samsung BM1743, which looks like a standard U.2 NVMe SSD.

Samsung BM1743 At FMS 2024 4

Next to that, however, the company had an exploded view of the SSD.

Samsung BM1743 At FMS 2024 1

Here is another look where we can see the large number of NAND chips and DRAM chips on this SSD’s two PCBs.

Samsung BM1743 At FMS 2024 2

The Samsung booth reps did not let George photograph other angles, but at least it gives some idea of what the SSD looks like underneath. The BM1743 is being touted as a 128TB SSD, although that usually means it will be a 122.88TB SSD like others we have seen.
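As a quick sanity check on the capacity figures, the short Python sketch below compares the 128TB marketing number with the 122.88TB figure and with the 61.44TB model we covered earlier. It only does the arithmetic; the variable names are ours, and it makes no assumption about why the usable number is lower.

```python
# Capacity figures quoted in this article, in decimal terabytes.
marketed_tb = 128.0
usable_tb = 122.88
current_model_tb = 61.44

# Fraction of the marketed capacity that the 122.88TB figure represents.
usable_fraction = usable_tb / marketed_tb

# How many of today's 61.44TB drives equal one 122.88TB drive.
capacity_multiple = usable_tb / current_model_tb

print(f"122.88TB is {usable_fraction:.0%} of 128TB")
print(f"122.88TB is {capacity_multiple:.0f}x the 61.44TB model")
```

In other words, the 122.88TB figure is 4% below the 128TB headline number and exactly double the 61.44TB drive.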

Samsung BM1743 At FMS 2024 3

Samsung says that it is using QLC NAND, which is expected. It also gave some sequential performance figures. One of the more interesting details is that the 45K random write IOPS figure was quoted at a 16KB block size. Typically, SSD vendors have quoted IOPS at 4K random read/write, but we are starting to see 16KB a bit more.
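To see why the block size matters when comparing IOPS quotes, here is a minimal Python sketch that converts an IOPS figure into throughput. It assumes "45K" means 45,000, that 16KB means 16 x 1024 bytes, and that MB/s uses decimal megabytes; the function names are ours and purely illustrative.

```python
KIB = 1024  # assumption: "16KB" on the placard means 16 x 1024 bytes


def iops_to_mb_per_s(iops: float, block_bytes: int) -> float:
    """Convert an IOPS figure at a given block size to decimal MB/s."""
    return iops * block_bytes / 1_000_000


def same_bandwidth_iops(iops: float, from_block: int, to_block: int) -> float:
    """IOPS at another block size that would move the same bytes per second."""
    return iops * from_block / to_block


write_iops_16k = 45_000  # assumption: "45K" means 45,000

throughput_mb_s = iops_to_mb_per_s(write_iops_16k, 16 * KIB)
equiv_4k_iops = same_bandwidth_iops(write_iops_16k, 16 * KIB, 4 * KIB)

print(f"45K IOPS at 16KB is about {throughput_mb_s:.2f} MB/s")
print(f"Same bandwidth at 4KB would be {equiv_4k_iops:,.0f} IOPS")
```

This is a bandwidth-equivalence calculation only. It does not predict what the drive would actually deliver at 4K, since real 4K random write behavior on a QLC drive can differ a lot from a simple scaling.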

Final Words

The 128TB or 122.88TB Samsung BM1743 is not a shipping product today, according to what the folks in the booth said. Still, it is clearly coming, given how much booth space has been dedicated to it. These high-capacity NVMe SSDs are in great demand in large AI clusters. Fewer and denser drives mean that 60PB in a rack can be achieved easily. When exabytes of performance storage are needed, these giant QLC SSDs have become the go-to format.
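To put the 60PB-per-rack figure in perspective, here is a rough Python sketch of how many drives that takes at each capacity point. It assumes decimal units (1PB = 1000TB) and counts raw capacity only, ignoring redundancy, spares, and file system overhead; the helper name is ours.

```python
import math

TB_PER_PB = 1000  # assumption: decimal units, as drive vendors use


def drives_needed(target_pb: float, drive_tb: float) -> int:
    """Smallest whole number of drives whose raw capacity reaches the target."""
    return math.ceil(target_pb * TB_PER_PB / drive_tb)


rack_target_pb = 60

drives_at_122tb = drives_needed(rack_target_pb, 122.88)
drives_at_61tb = drives_needed(rack_target_pb, 61.44)

print(f"122.88TB drives for 60PB raw: {drives_at_122tb}")
print(f"61.44TB drives for 60PB raw:  {drives_at_61tb}")
```

Under these assumptions, moving from 61.44TB to 122.88TB drives roughly halves the number of drive bays needed to hit the same raw capacity.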

6 COMMENTS

  1. Wow one month retention is pretty bad, guess it’s meant as a scratch/cache drive rather than for actual data storage. I wonder if it is configurable, so you can get a slower write speed in exchange for longer retention? Or maybe they are just taking QLC to the next step and there are so many different possible voltages in the cells it can’t hold a specific charge level for very long. Maybe that means if you want to increase the retention period from one month to 10 months, the cost is dropping the capacity from 120 TB to 12 TB…

  2. Notice the asterisk next to the 1 month retention. I believe Samsung are saying that the advertised read/write speeds are what can be achieved after 1 month of data aging.
    The NAND cell charge degradation/aging almost exclusively affects the read speeds rather than the write speeds. While it looks like the asterisk is only for write speed because of the way the placard is formatted, I believe Samsung intended it to be for both read/write (notice the same formatting issue is affecting the IOPS portion as well with the block size).

  3. @Paul: I’m no expert but I don’t think the age of the data affects read speeds, at least until it has degraded enough that it can’t read the data properly. But when you write data you’re effectively charging up tiny capacitors in the flash cell so presumably like charging up a battery, the less time you spend doing it, the less charge it’s going to hold on to, and the sooner it will lose charge again. So it seems that spending less time writing means the data will degrade sooner, I suppose?

  4. It’s most likely 1 month data retention when unpowered. SSDs refresh their retention while powered up. In a datacenter it is not common to have your SSDs ever power down, other than for moving between DCs. These are not designed for one-time-write, long-term backup like tape storage etc.

  5. @Malvineous
    The reason age affects primarily read speeds is because the NAND cells let their charge decay with time and eventually this becomes too much for the controller to compensate for while reading at full speed, just like what prematurely happened to all the 840 EVOs way back in the day. Controllers have gotten pretty good at even domain specific (because of 3D stacked NAND) voltage offset tables to compensate for charge decay, but differences in wear level or even manufacturing defects often throw a wrench into this algorithm.
    The most successful way to deal with the charge decay problem is to simply have the SSD controller keep track of how old data within a cell is and rewrite/refresh it when it gets too old, but this refresh algorithm is typically only used in high end enterprise SSDs and burns P/E cycles. I’m not sure if the BM1743 uses this specific refresh algorithm. There are other refresh algorithms like read disturb refresh that virtually all SSDs implement, but they aren’t applicable here.

    @Mark Radcliffe
    You are correct that almost all enterprise SSDs (but not consumer) will refresh their NAND automatically; the fact that the BM1743 is an enterprise drive makes me think that is probably the case.
