Breaking the Memory Wall

News

The latest INTERA advances in Near-Memory Computing (NMC) for AI & Edge

Barcelona, Spain – May 4, 2026

Dear Reader,

The memory wall remains one of the biggest bottlenecks in modern computing. As AI models grow larger and more data-intensive, moving data between processors and memory costs substantial energy and latency, often orders of magnitude more than the computation itself.

Near-Memory Computing (NMC), also called Near-Data Computing or Processing-Near-Memory, offers a powerful solution by placing compute logic physically close to memory. This minimizes data movement, boosts bandwidth, and dramatically improves energy efficiency, especially for AI inference workloads at the edge.

Why Near-Memory Computing Matters for INTERA

Traditional von Neumann architectures separate compute and memory, creating a “commute” problem for data. INTERA's NMC will bring processing units (cores, accelerators, or logic) right next to memory, an arrangement that would simplify 3D integration and the use of HBM and advanced packaging.

NMC areas where INTERA is pushing boundaries:

  • Lower latency & higher bandwidth

Increased internal memory bandwidth, targeting 15X higher than off-chip DDR.

  • Energy savings

Reduced data movement, targeting a 10X energy reduction for memory-bound tasks.

  • Scalability for AI

A solution aimed at large embeddings, tensor operations, transformers, and edge AI inference where cloud offloading isn’t feasible.
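To make the bandwidth and energy targets above concrete, here is a minimal back-of-the-envelope sketch in Python. Only the 15X and 10X factors come from the targets stated above; the baseline DDR bandwidth, the per-byte energy figure, the 500 MiB working set, and all names (MemoryPath, transfer_time_ms, transfer_energy_mj) are hypothetical placeholders chosen for illustration, not INTERA measurements. The sketch also assumes the 10X energy target applies directly to the per-byte cost of moving data, which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class MemoryPath:
    """A simplified model of one route between memory and compute."""
    name: str
    bandwidth_gb_s: float       # sustained bandwidth in GB/s (assumed value)
    energy_pj_per_byte: float   # energy to move one byte in pJ (assumed value)


def transfer_time_ms(path: MemoryPath, n_bytes: int) -> float:
    """Time to stream n_bytes over the path, in milliseconds."""
    return n_bytes / (path.bandwidth_gb_s * 1e9) * 1e3


def transfer_energy_mj(path: MemoryPath, n_bytes: int) -> float:
    """Energy to move n_bytes over the path, in millijoules."""
    return n_bytes * path.energy_pj_per_byte * 1e-12 * 1e3


# Hypothetical off-chip DDR baseline (placeholder numbers, not measured).
ddr = MemoryPath("off-chip DDR", bandwidth_gb_s=25.0, energy_pj_per_byte=100.0)

# Near-memory path derived from the stated targets: 15X bandwidth, 10X less energy.
nmc = MemoryPath(
    "near-memory",
    bandwidth_gb_s=ddr.bandwidth_gb_s * 15,
    energy_pj_per_byte=ddr.energy_pj_per_byte / 10,
)

# Hypothetical memory-bound task: stream 500 MiB of weights once.
working_set_bytes = 500 * 1024**2

for path in (ddr, nmc):
    t = transfer_time_ms(path, working_set_bytes)
    e = transfer_energy_mj(path, working_set_bytes)
    print(f"{path.name:>12}: {t:8.3f} ms, {e:8.3f} mJ")
```

For a purely memory-bound task, this model scales transfer time and energy by exactly the target factors; real workloads with a compute-bound portion would see smaller end-to-end gains.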

INTERA's recent simulations and prototypes show that our NMC can reduce inference latency while lowering Total Cost of Ownership (TCO) by using cheaper remote memory pools with nearby small cores.

Key Recent Advances for INTERA (2025–2026)

  • 3D Integration & Stacked Architectures: Hybrid Memory Cube (HMC) and High-Bandwidth Memory (HBM) enable logic layers stacked with DRAM. INTERA keeps exploring NMC for AI edge computing, with a special focus on scaling in multi-HMC meshes.
  • AI Inference Optimization: Our Near-Memory Compute is proving able to offload specific inference tasks to remote memory pools, cutting overall latency. This approach shines in heterogeneous edge systems where traditional accelerators do the heavy lifting while NMC manages lighter, memory-bound operations.
  • Emerging Memory Synergies: Our NMC technology aims to integrate with emerging non-volatile memories such as ReRAM, PCM, and MRAM, enabling efficient in-memory computing hybrids. We are developing programmable architectures that support bit-scalable operations and element-wise functions, which are intended to significantly enhance computational speed and energy efficiency for advanced AI and data processing applications.
  • Intermittent & Edge Systems: Our design closely resembles NMC accelerators such as those used for binary neural networks, and achieves forward-progress improvements of up to hundreds of times in energy-constrained environments, which particularly benefits intermittently powered systems.

INTERA’s hybrid approach, combining NMC with traditional accelerators, effectively addresses market challenges such as integration complexity, thermal management, programming models, and the balance between general-purpose and specialized compute, demonstrating practical and innovative solutions in the current landscape.

Real-World Impact

For Edge AI Acceleration solutions, INTERA's NMC enables faster, lower-power real-time processing on-device, which is critical for autonomous systems, industrial applications, and smart sensors where every milliwatt and millisecond counts.

Looking Ahead

By 2026–2027, we expect substantial growth in NMC deployments in AI infrastructure, driven by chiplet ecosystems, advanced packaging, and tighter hardware-software co-design.

Stay ahead of the curve,
The INTERA IP Engineering Team