
δ-mem augments large language models with a compact memory mechanism for better context utilization and performance.
δ-mem is a lightweight memory mechanism that improves large-language-model performance by efficiently accumulating and reusing historical information. It addresses the limitations of simply expanding the context window, which is costly and often makes poor use of the added context.
Key finding:
The results indicate that effective memory can be achieved with a compact online state integrated directly into the attention computation, with no full fine-tuning or backbone replacement required.
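To make the idea concrete, here is a minimal sketch of what "a compact online state integrated with attention" could look like. This is an illustrative assumption, not δ-mem's actual algorithm: the class name `DeltaMem`, the fixed number of memory slots, and the moving-average update rule are all hypothetical. The memory only extends the attention key/value set, so the backbone weights are untouched.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class DeltaMem:
    """Hypothetical sketch: a compact memory state prepended to the
    attention keys/values and updated online. The update rule below
    is an illustrative assumption, not the paper's method."""

    def __init__(self, slots, dim, lr=0.1):
        self.mem_k = np.zeros((slots, dim))  # memory keys
        self.mem_v = np.zeros((slots, dim))  # memory values
        self.lr = lr

    def attend(self, q, k, v):
        # Attention over [memory; current context]: the memory only
        # enlarges the key/value set, so no backbone change is needed.
        keys = np.concatenate([self.mem_k, k], axis=0)
        vals = np.concatenate([self.mem_v, v], axis=0)
        w = softmax(q @ keys.T / np.sqrt(q.shape[-1]), axis=-1)
        return w @ vals

    def update(self, k, v):
        # Online update (assumed rule): move every memory slot toward
        # the mean key/value of the chunk just processed, so history
        # accumulates in a fixed-size state.
        self.mem_k += self.lr * (k.mean(axis=0) - self.mem_k)
        self.mem_v += self.lr * (v.mean(axis=0) - self.mem_v)
```

In use, a long input would be processed chunk by chunk: call `attend` for the current chunk, then `update` so the compact state carries information forward without growing the context window.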