Kioxia is doing something a bit different at CES this week. Instead of talking about consumer SSDs, it is focusing on Kioxia AiSAQ, a product that looks to replace memory in the AI stack with SSDs to allow larger models at a lower cost.
Editor’s note: Hey STHers. Apologies, my (Patrick) planned review did not make it live today. We had two of our extended team members impacted, with homes that are gone in the Los Angeles fires. Cliff stepped in today with this because my head has not been on STH. My wife and I know probably a dozen or so folks who have been impacted at this point.
Kioxia AiSAQ SSD-backed RAG for Larger Scale AI Models
For folks who are new to RAG, or retrieval augmented generation, the idea is simply that an LLM can access external data to help it hallucinate less and to provide important context. For example, instead of just outputting information based on a training data set that might be months old, the LLM can pull in current information at query time. All of that takes a lot of memory and storage.
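To make the retrieval step concrete, here is a minimal sketch in Python. It is only an illustration of the general RAG idea, not anything Kioxia has shown: the three documents, their hand-made 3-dimensional "embeddings", and the function names retrieve and build_prompt are all hypothetical. A real system would use an embedding model and a vector index instead of a brute-force scan.

```python
import math

# Hypothetical toy corpus: (text, hand-made embedding) pairs.
# A real system would compute these vectors with an embedding model.
DOCS = [
    ("Note about SSD-backed vector search.", [0.9, 0.1, 0.0]),
    ("Note about DRAM capacity planning.", [0.1, 0.9, 0.0]),
    ("Note about trade show logistics.", [0.0, 0.1, 0.9]),
]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, k=1):
    """Return the k document texts most similar to the query vector."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, query_vec, k=1):
    """Prepend the retrieved text to the question before it goes to the LLM."""
    context = "\n".join(retrieve(query_vec, k))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

Calling build_prompt("What is SSD-backed search?", [1.0, 0.0, 0.0]) would place the first note ahead of the question. The brute-force scan in retrieve is the part that an ANN index replaces at scale.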
Kioxia’s idea is that keeping the vector data and index on SSDs, rather than in a giant memory pool, can cost a lot less.
Accessing that information usually happens through an approximate nearest neighbor, or ANN, search that is often done on CPUs. Microsoft has its DiskANN, which moves some of the index and vector data out of DRAM but keeps the product quantization vectors in DRAM. With Kioxia AiSAQ, the idea is to move all of that data onto SSDs.
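For readers who have not run into product quantization before, the sketch below shows the general technique in Python. It is a generic textbook-style illustration, not code from DiskANN or AiSAQ. The hand-written CODEBOOKS, the 4-dimensional vectors, and the function names are all assumptions made for the example. Real systems learn the codebooks with k-means and use far more dimensions and centroids.

```python
# Hypothetical codebooks: 2 sub-spaces of 2 dimensions, 4 centroids each.
# Real systems learn these with k-means; here they are fixed by hand.
CODEBOOKS = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)],
]
SUB_DIM = 2

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(vec):
    """Cut a vector into consecutive sub-vectors of SUB_DIM values."""
    return [tuple(vec[i:i + SUB_DIM]) for i in range(0, len(vec), SUB_DIM)]

def encode(vec):
    """Replace each sub-vector with the index of its nearest centroid."""
    return [
        min(range(len(cb)), key=lambda i: sq_dist(sub, cb[i]))
        for sub, cb in zip(split(vec), CODEBOOKS)
    ]

def approx_dist(query, code):
    """Approximate distance: exact query vs. the centroids named by the code."""
    return sum(
        sq_dist(sub, cb[idx])
        for sub, cb, idx in zip(split(query), CODEBOOKS, code)
    )

def search(query, codes, k=1):
    """Return indices of the k codes closest to the query."""
    ranked = sorted(range(len(codes)), key=lambda i: approx_dist(query, codes[i]))
    return ranked[:k]
```

Each stored vector shrinks to a short list of centroid indices, and search only touches those compact codes. Where the codes live, in DRAM or on an SSD, is the design choice the paragraph above describes.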
Using flash instead of DRAM decreases costs, especially at the TB scale. So Kioxia is saying that it can store enormous vector data sets with a minimal DRAM footprint by moving everything to SSDs.
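A quick back-of-the-envelope calculation shows why the TB scale matters. The workload below is hypothetical and is not a figure from Kioxia: one billion vectors, 768 float32 dimensions each, and 32 bytes per product quantization code are all assumed values picked for illustration. The graph index itself is left out.

```python
def full_vector_bytes(num_vectors, dims, bytes_per_value=4):
    """Bytes needed to hold every full-precision vector (float32 by default)."""
    return num_vectors * dims * bytes_per_value

def pq_code_bytes(num_vectors, bytes_per_code):
    """Bytes needed to hold one compressed PQ code per vector."""
    return num_vectors * bytes_per_code

# Hypothetical workload, chosen only for illustration.
N = 1_000_000_000   # number of vectors (assumed)
DIMS = 768          # dimensions per vector (assumed)
PQ_BYTES = 32       # bytes per PQ code (assumed)

full = full_vector_bytes(N, DIMS)
pq = pq_code_bytes(N, PQ_BYTES)

print(f"full-precision vectors: {full / 1e12:.2f} TB")  # prints 3.07 TB
print(f"PQ codes: {pq / 1e9:.0f} GB")                   # prints 32 GB
```

Under these assumptions, even the compressed codes take tens of GB of DRAM if they stay in memory, and the full vectors run to multiple TB. That is the footprint the all-SSD approach is trying to get out of DRAM.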
A disadvantage is that this can be slower than keeping everything in DRAM, but the benefit is that it can cost less to hit a given scale. Higher scale can mean higher quality or more accurate results.
Final Words
Everything is AI these days, and Kioxia is pushing this as well. Realistically, RAG is going to be an important part of many applications. If an application needs access to lots of data that is not used as frequently, that would be a great opportunity for something like Kioxia AiSAQ. Hopefully we can show you more of this in the future.