The bandwidth of HBM4, over 2.8 TB/s, matters most for applications in AI and HPC that need to move terabytes of data at high speeds. Advanced reasoning models, for example, must evaluate hundreds of intermediate logical steps as they work through a problem. This requires terabytes of data to flow between main memory and processors each second as calculations progress.
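As a rough illustration of that scale, consider weight streaming during inference. The model size and token rate below are hypothetical examples chosen for the arithmetic, not figures from the text above:

```python
# Back-of-envelope: bandwidth needed to stream model weights during
# inference. All figures below are illustrative assumptions.

params = 70e9          # hypothetical 70-billion-parameter model
bytes_per_param = 2    # FP16 weights
weights_bytes = params * bytes_per_param   # 140 GB of weights

tokens_per_second = 50  # assumed generation rate for one request

# Each generated token reads (roughly) every weight once, so the
# required bandwidth scales with model size times token rate.
required_bw = weights_bytes * tokens_per_second  # bytes per second

print(f"Weights: {weights_bytes / 1e9:.0f} GB")
print(f"Required bandwidth: {required_bw / 1e12:.1f} TB/s")
# -> Weights: 140 GB
# -> Required bandwidth: 7.0 TB/s
```

Even under these loose assumptions, a single request can demand terabytes per second, which is why per-stack bandwidth is the headline HBM4 metric.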
HBM4 works alongside other memory types rather than replacing them. In modern systems, for example, CPUs use LPDDR5 or DDR5 for general system coordination while GPUs use HBM4 for bandwidth-intensive computation.
HBM4 takes everything that worked well in HBM3 and HBM3E and makes it more powerful. A wider interface operating at speeds greater than 11.0 Gbps delivers more than double the bandwidth of the previous generation. This matters because it addresses emerging requirements, from AI workloads with long context windows spanning millions of tokens to scientific simulations running on next-generation supercomputers.
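The headline figure follows directly from the interface math. A minimal sketch, assuming the 2048-bit per-stack interface that HBM4 is specified with (the interface width is not stated in the text above):

```python
# HBM4 per-stack bandwidth from interface width and pin speed.
# Assumes the 2048-bit interface defined for HBM4.

interface_bits = 2048      # data pins per HBM4 stack (assumption)
pin_speed_gbps = 11.0      # gigabits per second per pin

bandwidth_gbps = interface_bits * pin_speed_gbps   # gigabits/s
bandwidth_tbs = bandwidth_gbps / 8 / 1000          # terabytes/s

print(f"{bandwidth_tbs:.2f} TB/s per stack")  # -> 2.82 TB/s
```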
Traditional DRAM such as DDR memory handles general computing tasks while HBM supports AI and HPC applications that require continuous terabyte streams of data. The architecture of HBM stacks ultra-thin DRAM dies and connects them with thousands of through-silicon vias (TSVs). This vertical design requires higher precision in manufacturing, making HBM one of the more challenging memory products to produce.
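To put that gap in perspective, here is a rough comparison of a single DDR5 channel against one HBM4 stack, using standard DDR5-6400 figures assumed for illustration:

```python
# Rough bandwidth comparison: one DDR5 channel vs. one HBM4 stack.
# DDR5-6400 values are standard figures, assumed for illustration.

ddr5_width_bits = 64       # one DDR5 channel
ddr5_speed_gbps = 6.4      # DDR5-6400 transfer rate per pin
ddr5_bw = ddr5_width_bits * ddr5_speed_gbps / 8   # -> 51.2 GB/s

hbm4_bw = 2800             # GB/s, per the >2.8 TB/s figure above

print(f"DDR5 channel: {ddr5_bw:.1f} GB/s")
print(f"HBM4 stack:   {hbm4_bw} GB/s ({hbm4_bw / ddr5_bw:.0f}x)")
# HBM4's interface width, not its per-pin speed, accounts for
# most of the gap.
```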
HBM4 12-high provides 36GB of memory capacity per stack (the same as the previous generation) but with more than 2.8 TB/s of bandwidth. That increase (more than twice the bandwidth of HBM3E) means the processor can access the same capacity much faster, supporting more demanding AI workloads and scientific simulations than the previous generation could handle.
Capacity is how much data memory can hold, while bandwidth is how much of that data can flow each second. An HBM4 12-high stack can hold 36GB of data, and 2.8 TB/s means that in one second, 2.8 terabytes (TB) of data can flow between HBM and the processor. Capacity determines what data you can fit in memory; bandwidth determines how quickly you can access it.
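One way to connect the two numbers is to ask how long it would take to read the entire stack once at peak bandwidth. A simple division, ignoring real-world access patterns:

```python
# Time to stream the full capacity of one HBM4 stack at peak bandwidth.
capacity_gb = 36        # GB per 12-high HBM4 stack
bandwidth_gbs = 2800    # GB/s (the >2.8 TB/s figure above)

seconds = capacity_gb / bandwidth_gbs
print(f"{seconds * 1000:.1f} ms to read the whole stack once")  # ~12.9 ms
```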
Manufacturing HBM starts by fabricating three types of silicon wafers. One creates dies with through-silicon vias (TSVs) for electrical connections. Another produces thicker top dies without TSVs. The third fabricates logic dies with TSVs to interface with the system.
Only dies that pass testing continue on to assembly. Specialized equipment then stacks multiple DRAM dies on the logic die. The thicker top DRAM die completes the stack, providing memory capacity as well as structural integrity. Once assembled, the complete HBM cube undergoes final testing to verify that all connections work properly.
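Why the emphasis on testing before stacking? Because a stack is only good if every die in it is good, yield compounds multiplicatively, as the sketch below illustrates (the per-die yield figure is a made-up example, not a Micron number):

```python
# Why known-good-die testing matters: stack yield compounds per die.
# The 99% per-die yield below is a hypothetical illustration.

per_die_yield = 0.99
dies_per_stack = 13        # 12 DRAM dies + 1 logic die (12-high stack)

stack_yield = per_die_yield ** dies_per_stack
print(f"Stack yield: {stack_yield:.1%}")   # -> 87.8%

# A single bad untested die would scrap the whole stack, so each die
# is verified as "known good" before assembly begins.
```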
Yes, HBM4 works with both GPUs and custom ASICs (application-specific integrated circuits). The memory connects to any processor that can handle its high-bandwidth interface and has the appropriate packaging.
High-end computing systems such as supercomputers solve scientific problems and train AI models on exabytes of data. To do this efficiently, memory must move data fast enough to keep thousands of processor cores busy. With a bandwidth of more than 2.8 TB/s, HBM4 speeds up AI training, lowers inference latency through faster KV cache access, and enables more detailed scientific simulations.
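As a concrete example of why KV cache access stresses memory bandwidth, here is a rough size estimate for a transformer decoder's KV cache. The model dimensions below are hypothetical, loosely typical of large open models, and not drawn from the text above:

```python
# Rough KV cache size for a transformer decoder.
# All model dimensions below are hypothetical, for illustration only.

layers = 80
kv_heads = 8            # grouped-query attention
head_dim = 128
bytes_per_value = 2     # FP16

# Each token stores one key and one value per KV head, per layer.
bytes_per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
context_tokens = 1_000_000   # a million-token context window

cache_gb = bytes_per_token * context_tokens / 1e9
print(f"{bytes_per_token / 1024:.0f} KiB per token")   # -> 320 KiB
print(f"{cache_gb:.0f} GB for a 1M-token context")     # -> 328 GB
```

Every generated token must reread this cache, so faster access to hundreds of gigabytes of cached keys and values translates directly into lower inference latency.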