
Why HBM memory and AI processors are happy together

High bandwidth memory (HBM) chips have become a game changer in artificial intelligence (AI) applications by efficiently handling complex algorithms with high memory requirements. They became a major building block in AI systems by addressing a critical bottleneck: memory bandwidth.

Figure 1 HBM comprises a stack of DRAM chips linked vertically by interconnects called TSVs. The stack of memory chips sits on top of a logic chip that acts as the interface to the processor. Source: Gen AI Experts

Jinhyun Kim, principal engineer at Samsung Electronics’ memory product planning team, acknowledges that the mainstreaming of AI and machine learning (ML) inference has led to the mainstreaming of HBM. But how did this love affair between AI and HBM begin in the first place?


As Jim Handy, principal analyst with Objective Analysis, put it, GPUs and AI accelerators have an unbelievable hunger for bandwidth, and HBM gets them where they want to go. “If you tried doing it with DDR, you’d end up having to have multiple processors instead of just one to do the same job, and the processor cost would end up more than offsetting what you saved in the DRAM.”

Traditional DRAM chips struggle to keep pace with the ever-increasing demands of complex AI models, which require massive amounts of data to be processed simultaneously. HBM chips, by contrast, offer significantly higher bandwidth by employing a 3D stacking architecture that shortens data paths and speeds up communication between the processor and memory.

That allows AI applications to train on larger and more complex datasets, which, in turn, leads to more accurate and powerful models. Moreover, as a memory interface for 3D-stacked DRAM, HBM uses less power in a form factor significantly smaller than DDR4 or GDDR5 by stacking as many as eight DRAM dies with an optional base die that can include buffer circuitry and test logic.

Each new generation of HBM also incorporates improvements that coincide with launches of the latest GPUs, CPUs, and FPGAs. For instance, with HBM3, bandwidth jumped to 819 GB/s per stack and maximum density per HBM stack increased to 24 GB to manage larger datasets.
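To put those numbers in context, here is a rough, back-of-the-envelope sketch of how peak bandwidth follows from interface width and per-pin data rate. The figures used (a 1024-bit HBM3 interface at about 6.4 Gb/s per pin, and a 64-bit DDR5 channel at 4800 MT/s) are illustrative assumptions for this sketch, not vendor specifications:

# Back-of-the-envelope peak-bandwidth comparison (illustrative only).
# Assumed figures: HBM3 stack with a 1024-bit interface at ~6.4 Gb/s per pin;
# DDR5 channel with a 64-bit interface at 4800 MT/s (~4.8 Gb/s per pin).
def peak_bandwidth_gbps(bus_width_bits: int, data_rate_gbps_per_pin: float) -> float:
    """Peak bandwidth in GB/s = (bus width in bytes) x (per-pin data rate in Gb/s)."""
    return (bus_width_bits / 8) * data_rate_gbps_per_pin

hbm3_stack = peak_bandwidth_gbps(1024, 6.4)   # ~819 GB/s per stack
ddr5_channel = peak_bandwidth_gbps(64, 4.8)   # ~38.4 GB/s per channel

print(f"HBM3 stack:   {hbm3_stack:.1f} GB/s")
print(f"DDR5 channel: {ddr5_channel:.1f} GB/s")
print(f"DDR5 channels to match one HBM3 stack: {hbm3_stack / ddr5_channel:.0f}")

The resulting ratio of roughly 20 DDR5 channels per HBM3 stack illustrates Handy's point: matching HBM bandwidth with conventional DRAM would demand far more memory interfaces, and in practice more processors, than the DRAM savings could offset.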

Figure 2 Host devices like GPUs and FPGAs in AI designs have embraced HBM due to their higher bandwidth needs. Source: Micron

The neural networks in AI applications require a significant amount of data both for processing and training, and training sets alone are growing about 10 times annually. That means the need for HBM is likely to grow further.

It’s important to note that the market for HBM chips is still evolving and that HBM chips are not limited to AI applications. These memory chips are increasingly finding sockets in applications serving high-performance computing (HPC) and data centers.
