HBM4 isn’t even in mass production yet, but the industry is already sketching out HBM9!
According to The Elec, Yang Jun-mo, a principal researcher at the Korea Nanofab Center (NNFC), predicts that HBM9—expected to debut around 2040—will deliver more than 60 times the performance of sixth-generation HBM4. The next-generation memory is projected to feature over 32 stacked layers and could reach bandwidths of up to 128 terabytes per second (TB/s).
The report, citing Yang, notes that HBM9's data rate could reach 36 Gbps, while the number of I/O channels is projected to rise to 32,768—four times and 16 times those of HBM4, respectively. Memory capacity is also set to grow substantially, increasing from 24GB in HBM4 to 96GB in HBM9 by 2040.
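These figures can be sanity-checked with the usual back-of-envelope formula for a memory stack's peak bandwidth: per-pin data rate times I/O width, divided by 8 bits per byte. The sketch below uses only the numbers reported above; note that this simple formula yields roughly 147 TB/s for HBM9, so the reported 128 TB/s peak presumably reflects different assumptions (for example, a lower effective data rate or binary units).

```python
# Back-of-envelope peak-bandwidth check using the figures reported above.
# Peak bandwidth per stack = per-pin data rate x I/O width / 8 bits per byte.

def peak_bandwidth_tb_s(data_rate_gbps: float, io_width: int) -> float:
    """Return peak bandwidth in TB/s (decimal) for one stack."""
    return data_rate_gbps * io_width / 8 / 1000  # Gb/s -> GB/s -> TB/s

# HBM4, per the report's ratios: 36/4 = 9 Gbps and 32,768/16 = 2,048 I/O.
hbm4 = peak_bandwidth_tb_s(9, 2048)
# HBM9, per the report: 36 Gbps across 32,768 I/O.
hbm9 = peak_bandwidth_tb_s(36, 32768)

print(f"HBM4: {hbm4:.1f} TB/s, HBM9: {hbm9:.1f} TB/s")
# HBM4 comes out to about 2.3 TB/s; HBM9 to about 147 TB/s by this formula.
```

The roughly 64x bandwidth jump this yields (4x data rate times 16x I/O width) is consistent with the "more than 60 times" performance gain Yang projects.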
Advanced packaging becomes critical for future HBM
As cited in the report, Yang emphasizes that overcoming the physical limits of memory stacking will require the adoption of advanced packaging technologies. The report notes that current micro-bump technology can support no more than 16 layers, and Yang cautions that exceeding this threshold would make HBM-GPU packages too tall to meet the height constraints of AI chip products.
As a result, Yang stresses that beginning with HBM7, micro-bump-less copper-to-copper direct bonding, also known as hybrid copper bonding (HCB), will be necessary to keep stack height in check. He notes that this approach could allow more than 30 layers. HBM7, expected around 2034, is projected to reach 24 layers. Yang also points out that in HBM development, back-end bonding technologies have become just as critical as front-end processes.
In addition, Yang highlights that managing heat between HBM and GPUs remains a major challenge in HBM development. According to the report, HBM4 relies on direct-to-chip (D2C) liquid cooling, while HBM5 is expected to transition to immersion cooling. For HBM7, Yang anticipates the adoption of embedded cooling solutions.
Rise of HBF, but not without limits
Apart from outlining the future HBM roadmap, Yang also underscores the potential need for high-bandwidth NAND flash (HBF). He explains that as generative AI advances toward agentic AI, relying solely on DRAM may no longer be sufficient to handle the enormous volumes of data involved. As the report notes, this has led to predictions that HBF—slower than HBM but offering far greater capacity—could become an essential complement in future systems.
However, he cautions that unlike DRAM, flash memory suffers from limited read/write endurance, and this challenge would need to be resolved for HBF to be used alongside HBM. The report adds that recent triple-level-cell (TLC) NAND flash typically provides only about 1,000 to 3,000 program/erase cycles per cell.
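To see why those P/E-cycle figures matter, consider a rough wear-out estimate: total writable data before cells exhaust their endurance budget is roughly capacity times P/E cycles. The sketch below is purely illustrative; it borrows the article's 96GB capacity figure as a stand-in for a hypothetical HBF stack and ignores real-world factors such as write amplification and wear leveling.

```python
# Illustrative NAND wear-out estimate (hypothetical numbers, not from the report).
# Total writable data before wear-out ~= capacity x P/E cycles per cell.

def writes_before_wearout_tb(capacity_gb: float, pe_cycles: int) -> float:
    """Total terabytes writable before the P/E budget is exhausted."""
    return capacity_gb * pe_cycles / 1000  # GB -> TB

# Hypothetical 96 GB stack with TLC-class endurance of 1,000-3,000 cycles.
low = writes_before_wearout_tb(96, 1000)
high = writes_before_wearout_tb(96, 3000)
print(f"{low:.0f}-{high:.0f} TB of total writes before wear-out")
# i.e., 96-288 TB -- a hard ceiling DRAM-based HBM does not face.
```

A budget of this order would be consumed quickly under HBM-like write traffic, which is why endurance is the key obstacle Yang identifies for pairing HBF with HBM.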
Source: TrendForce, Taiwan.