Overview
CPU performance increased exponentially for decades, while the latency of retrieving data from RAM improved much more slowly. This widening gap is known as the memory wall: on memory-bound workloads, the CPU can spend much of its time stalled, waiting for data to arrive.
Solutions
- Cache Hierarchy: Using multiple levels of fast, on-chip memory (L1, L2, L3).
- Prefetching: Predicting which data will be needed next and fetching it before it is requested.
- High Bandwidth Memory (HBM): Stacking DRAM dies in the same package as the processor and connecting them through a much wider interface.
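To make the benefit of a cache hierarchy concrete, the sketch below computes the average memory access time (AMAT) using the standard recurrence: each level costs its hit latency plus its miss rate times the cost of the level below it. The function name `amat` and every latency and miss rate are hypothetical values chosen for illustration, not measurements of any real processor. Miss rates are local, meaning the fraction of accesses reaching a level that miss in that level.

```python
def amat(levels, memory_latency):
    """Average memory access time, in cycles, for a cache hierarchy.

    levels: list of (hit_latency_cycles, local_miss_rate), ordered L1 outward.
    memory_latency: cost in cycles of an access that goes all the way to DRAM.
    """
    # Work from the outermost level inward: the miss penalty of each level
    # is the average access time of everything below it.
    penalty = memory_latency
    for hit_latency, miss_rate in reversed(levels):
        penalty = hit_latency + miss_rate * penalty
    return penalty


# Hypothetical figures, for illustration only.
hierarchy = [(4, 0.10), (12, 0.40), (40, 0.50)]  # L1, L2, L3
dram_cycles = 200

with_caches = amat(hierarchy, dram_cycles)
without_caches = amat([], dram_cycles)
print(f"with caches: {with_caches:.1f} cycles")   # with caches: 10.8 cycles
print(f"without caches: {without_caches} cycles")  # without caches: 200 cycles
```

Under these assumed numbers the hierarchy reduces the average cost of an access from 200 cycles to about 11, even though half of the accesses that reach L3 still go to DRAM.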
Impact
The memory wall is a primary constraint on modern system performance: for memory-bound workloads, data locality and cache efficiency often matter more than raw clock speed.
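The effect of data locality can be illustrated with a toy simulation rather than a timing benchmark. The sketch below counts misses in a small direct-mapped cache while a matrix stored in row-major order is traversed row by row and then column by column. The cache geometry (64 lines of 64 bytes), the matrix size, and the helper names `count_misses`, `row_major`, and `col_major` are all assumptions made for this example; real caches are set-associative and far larger.

```python
def count_misses(addresses, num_lines=64, line_size=64):
    """Count misses in a direct-mapped cache for a stream of byte addresses."""
    tags = [None] * num_lines          # block currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_size      # which memory block the address is in
        index = block % num_lines      # the single cache line it can occupy
        if tags[index] != block:
            tags[index] = block        # evict the old block, load the new one
            misses += 1
    return misses


N = 256    # matrix is N x N, stored in row-major order (hypothetical size)
ELEM = 8   # bytes per element, e.g. a 64-bit float


def row_major(n=N):
    """Addresses touched when the inner loop walks along a row."""
    return ((i * n + j) * ELEM for i in range(n) for j in range(n))


def col_major(n=N):
    """Addresses touched when the inner loop walks down a column."""
    return ((i * n + j) * ELEM for j in range(n) for i in range(n))


print("row-major misses:   ", count_misses(row_major()))  # 8192
print("column-major misses:", count_misses(col_major()))  # 65536
```

Both loops touch the same 65,536 elements. The row-major walk misses once per 64-byte line (one access in eight), while the column-major walk strides 2,048 bytes between accesses and misses every time in this model, an eightfold difference that comes from access order alone.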