HBM And Emerging Memory Technologies Enable AI Training And Inference

During a congressional hearing of the House of Representatives’ Energy & Commerce Committee’s Subcommittee on Communications and Technology, Ronnie Vasishta, Senior VP of Telecom at Nvidia, said that mobile networks will be called upon to support a new kind of traffic—AI traffic. This AI traffic includes the delivery of AI services to the edge, or inferencing at the edge. Such growth in AI data could reverse the general trend toward lower traffic growth on mobile networks.
Many AI-enabled applications, including autonomous vehicles, smart glasses, generative AI services and others, will require mobile connectivity. He said that the transmission of this massive increase in data needs to be resilient, fit for purpose, and secure. Supporting this creation of data from AI will require a large amount of memory, particularly very high-bandwidth memory such as HBM. This will result in great demand for memory that supports AI applications.
Micron announced that it is now shipping HBM4 memory to key customers for early qualification efforts. The Micron HBM4 provides up to 2.0TB/s of bandwidth and 24GB of capacity per 12-high die stack. The company says that its HBM4 uses its 1-beta DRAM node and advanced through-silicon via (TSV) technology, and has a highly capable built-in self-test. See image below.
Micron HBM4 Memory
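As a rough check on what these per-stack figures imply, the sketch below (plain Python) converts them into a per-pin data rate and package-level totals. The 2,048-bit interface width comes from the JEDEC HBM4 specification, and the eight-stack GPU package is a hypothetical configuration, not something stated in Micron's announcement.

```python
# Back-of-the-envelope HBM4 arithmetic based on Micron's quoted figures.
# Assumptions (not from the article): a 2,048-bit interface per HBM4 stack
# (per the JEDEC HBM4 spec) and a hypothetical GPU package with 8 stacks.

STACK_BANDWIDTH_TBPS = 2.0      # TB/s per 12-high stack (Micron figure)
STACK_CAPACITY_GB = 24          # GB per 12-high stack (Micron figure)
INTERFACE_WIDTH_BITS = 2048     # assumed HBM4 I/O width per stack
STACKS_PER_PACKAGE = 8          # hypothetical GPU configuration

# Per-pin data rate: total stack bandwidth spread across the interface.
stack_bandwidth_bits_per_s = STACK_BANDWIDTH_TBPS * 1e12 * 8
per_pin_gbps = stack_bandwidth_bits_per_s / INTERFACE_WIDTH_BITS / 1e9
print(f"Per-pin data rate: ~{per_pin_gbps:.1f} Gb/s")   # ~7.8 Gb/s

# Package-level totals for the hypothetical 8-stack GPU.
total_bw_tbps = STACK_BANDWIDTH_TBPS * STACKS_PER_PACKAGE
total_capacity_gb = STACK_CAPACITY_GB * STACKS_PER_PACKAGE
print(f"Package bandwidth: {total_bw_tbps:.1f} TB/s, capacity: {total_capacity_gb} GB")
```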
HBM memory, consisting of stacks of DRAM die with massively parallel interconnects that provide high bandwidth, is combined with GPUs such as those from Nvidia. Placing this memory close to the processor enables training and inference of various AI models. Current GPUs use HBM3e memory. At the March 2025 GTC in San Jose, Jensen Huang said that Micron HBM memory was being used in some of Nvidia’s GPU platforms.
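To illustrate why this bandwidth matters for inference, here is a minimal sketch of the standard memory-bandwidth-bound estimate: each generated token requires streaming the model weights from HBM at least once, so aggregate HBM bandwidth caps token throughput. The model size, weight precision and GPU configuration in the example are hypothetical, chosen only to show the calculation.

```python
# Minimal sketch of a memory-bandwidth-bound token-rate estimate for LLM
# inference. All model and hardware numbers here are hypothetical examples.

def max_tokens_per_second(model_params_billion: float,
                          bytes_per_param: float,
                          hbm_bandwidth_tb_per_s: float) -> float:
    """Upper bound on decode throughput when every token must read all
    weights from HBM once (ignores KV cache traffic and compute limits)."""
    weight_bytes = model_params_billion * 1e9 * bytes_per_param
    bandwidth_bytes = hbm_bandwidth_tb_per_s * 1e12
    return bandwidth_bytes / weight_bytes

# Hypothetical example: a 70B-parameter model in 8-bit weights on a GPU
# with 8 HBM4 stacks at 2.0 TB/s each (16 TB/s aggregate).
print(f"{max_tokens_per_second(70, 1.0, 16.0):.0f} tokens/s upper bound")
```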
The manufacturers of HBM memory are SK hynix, Samsung and Micron, with SK hynix and Samsung providing the majority of supply and Micron coming in third. SK hynix was the first to announce HBM memory in 2013, and it was adopted as an industry standard by JEDEC that same year. Samsung followed in 2016, and in 2020 Micron said that it would create its own HBM memory. All of these companies expect to be shipping HBM4 memory in volume sometime in 2026.
Numem, a company involved in magnetic random access memory (MRAM) applications, recently talked about how traditional memories used in AI applications, such as DRAM and SRAM, have limitations in power, bandwidth and storage density. The company said that processing performance has skyrocketed by 60,000X over the past 20 years while DRAM bandwidth has improved only 100X, creating a “memory wall.”
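Numem’s 60,000X versus 100X comparison is easier to appreciate as annual growth rates. The short sketch below, assuming the 20-year window stated above, converts both figures to compound annual growth and shows the resulting gap.

```python
# Convert the claimed 20-year growth figures into compound annual growth
# rates to show how the compute/memory-bandwidth gap ("memory wall") compounds.

YEARS = 20
compute_growth = 60_000   # claimed processing-performance improvement
dram_bw_growth = 100      # claimed DRAM bandwidth improvement

compute_cagr = compute_growth ** (1 / YEARS) - 1   # ~73% per year
dram_cagr = dram_bw_growth ** (1 / YEARS) - 1      # ~26% per year

print(f"Compute CAGR:        {compute_cagr:.0%}")
print(f"DRAM bandwidth CAGR: {dram_cagr:.0%}")
print(f"Gap after 20 years:  {compute_growth / dram_bw_growth:,.0f}x")
```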
The company says that its AI Memory Engine is a highly configurable memory subsystem IP that enables significant improvements in power efficiency, performance, intelligence, and endurance. This applies not only to Numem’s MRAM-based architecture, but also to third-party MRAM, RRAM, PCRAM, and flash memory.
Numem said that it has developed next-generation MRAM supporting die densities up to 1GB, which can deliver SRAM-class performance with up to 2.5X higher memory density in embedded applications and 100X lower standby power consumption. The company says that its solutions are foundry-ready and production-capable today.
Coughlin Associates and Objective Analysis, in their Deep Look at New Memories report, predict that AI and other memory-intensive applications, including AI inference in embedded devices such as smart watches and hearing aids that already use MRAM, RRAM and other emerging memory technologies, will decrease the costs and increase the production of these memories.
These memory technologies are already available from major semiconductor foundries. They scale to smaller lithographic dimensions than DRAM and SRAM, and because they are non-volatile, they need no refreshes and so consume less power. As a result, these memories allow more memory capacity and lower power consumption in space- and power-constrained environments. MRAM and RRAM are also being built into industrial, enterprise and data center applications.
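The standby-power argument can be made concrete with a toy calculation. The power and battery figures below are hypothetical placeholders, not measurements from the article or the report, chosen only to show how eliminating refresh and retention current translates into idle battery life for a mostly-idle wearable.

```python
# Toy standby-energy comparison for a duty-cycled embedded device.
# All power and battery figures are hypothetical placeholders used only to
# illustrate why a non-volatile working memory (no refresh, and power can be
# gated when idle) helps in space- and power-constrained devices.

BATTERY_MWH = 300.0            # hypothetical small wearable battery (mWh)
IDLE_FRACTION = 0.99           # device spends ~99% of its time idle

VOLATILE_STANDBY_MW = 1.0      # hypothetical DRAM/SRAM retention + refresh power
NONVOLATILE_STANDBY_MW = 0.01  # hypothetical MRAM standby (can be power-gated)

def idle_days_on_battery(standby_mw: float) -> float:
    """Days until memory standby draw alone drains the battery."""
    return BATTERY_MWH / (standby_mw * IDLE_FRACTION * 24)

print(f"Volatile memory:     {idle_days_on_battery(VOLATILE_STANDBY_MW):.1f} days")
print(f"Non-volatile memory: {idle_days_on_battery(NONVOLATILE_STANDBY_MW):.0f} days")
```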
The figure below shows our projections for the replacement of traditional memories, including SRAM, DRAM, NOR and NAND flash memory, by these emerging memories. NOR flash and SRAM used as embedded memories, in particular, are projected to be replaced by these new memories within the next decade as part of a future $100B memory market.
Projected replacement of conventional memories with new memories
AI will generate increased demand for memory to support training and inference, and it will also increase the demand for data over mobile networks. This will drive demand not only for HBM memory but also for new emerging memory technologies.