Processing Model Memory

'Observational memory' cuts AI agent costs 10x and outscores RAG on long-context benchmarks

As AI agents move into production, teams are rethinking memory. Mastra’s open-source observational memory shows how stable ...

Nvidia’s new technique cuts LLM reasoning costs by 8x without losing accuracy

Nvidia researchers developed dynamic memory sparsification (DMS), a technique that compresses the KV cache in large language models by up to 8x while maintaining reasoning accuracy — and it can be ...

Semiconductor Engineering

Developing ReRAM As Next Generation On-Chip Memory For Machine Learning, Image Processing And Other Advanced CPU Applications

In modern CPU device operation, 80% to 90% of energy consumption and timing delays are caused by the movement of data between the CPU and off-chip memory. To alleviate this performance concern, ...
