LongMem: Microsoft Research Gives AI Language Models Long-Term Memory

Microsoft Research has presented a new framework called LongMem that enhances the capabilities of large language models (LLMs) by enabling them to utilize long-term memory. The research, encapsulated in the paper “Augmenting Language Models with Long-Term Memory”, is the brainchild of a team of researchers including Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, and Furu Wei.

LLM Framework with Comprehensive Input Memory

The team pinpointed a key limitation of current LLMs: they can only handle fixed-size inputs, owing to both a context input limit and the lack of long-term memory. These constraints prevent LLMs such as GPT-4 from fully leveraging rich long-context information derived from past inputs.

Context input limits refer to the maximum number of tokens an LLM can process at a time, which restricts its ability to use rich long-context information from past inputs. Long-term memory limits describe the difficulty of storing and retrieving relevant information from previous inputs over a long period of time. To overcome these challenges, researchers have proposed various methods to augment LLMs with long-term memory, such as sparse attention, cached memory, and decoupled memory architectures. These methods aim to enable LLMs to memorize and utilize long-form contents for language modeling and downstream tasks.
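The effect of a fixed context input limit can be illustrated with a small sketch. The token budget and whitespace "tokenizer" below are simplified assumptions for illustration, not the actual limits or tokenization of any real model:

```python
def fit_context(history_tokens, new_tokens, max_context=8):
    """Keep only the most recent tokens that fit in a fixed context window.

    Anything older is silently dropped -- the model never sees it again,
    which is exactly the limitation that long-term memory aims to remove.
    """
    window = history_tokens + new_tokens
    return window[-max_context:]

history = "the quick brown fox jumped over the lazy".split()
new = "dog today".split()
# The two oldest tokens ("the", "quick") fall out of the window:
print(fit_context(history, new))
```

With a real model the window is tens of thousands of tokens rather than eight, but the mechanism is the same: once the budget is exceeded, the oldest context is simply unavailable.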

To circumvent this issue, the researchers introduced a novel framework, Language Models Augmented with Long-Term Memory (LongMem). This innovative solution empowers LLMs to store a comprehensive history of inputs, thereby utilizing long-term memory for language modeling.

Cache for Long-Term Contexts

In their design, the researchers incorporated a unique decoupled network architecture. This includes the original backbone LLM functioning as a memory encoder, and an adaptive residual side-network acting as a memory retriever and reader. This ingenious design facilitates the caching and updating of long-term past contexts for memory retrieval, all while avoiding the issue of memory staleness.
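The cache-and-retrieve idea behind this design can be sketched in a few lines. In LongMem itself the backbone caches attention key-value pairs and the side-network attends over retrieved ones; the toy class below substitutes plain embedding vectors, dot-product scoring, and hypothetical names purely for illustration:

```python
class MemoryBank:
    """Toy cache of (key_vector, text) pairs with top-k dot-product retrieval."""

    def __init__(self):
        self.entries = []  # each entry is a (key, value) pair

    def write(self, key, value):
        # Encoding step: cache a representation of a past context.
        self.entries.append((key, value))

    def retrieve(self, query, k=2):
        # Retrieval step: rank cached entries by similarity to the query.
        scored = sorted(self.entries,
                        key=lambda kv: -sum(q * x for q, x in zip(query, kv[0])))
        return [value for _, value in scored[:k]]

bank = MemoryBank()
bank.write([1.0, 0.0], "past context A")
bank.write([0.0, 1.0], "past context B")
bank.write([0.9, 0.1], "past context C")
# A query close to A's key returns A and C, the most similar cached contexts:
print(bank.retrieve([1.0, 0.0], k=2))
```

Keeping the backbone frozen as the encoder while a separate, trainable side-network reads from the cache is what lets the stored representations stay valid as training proceeds, avoiding memory staleness.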

One of the standout features of the LongMem framework is its ability to handle an unlimited-length context in its memory bank. This feature can be harnessed to benefit a wide array of downstream tasks. Moreover, LongMem can expand the long-form memory to 65k tokens, thereby caching numerous extra demonstration examples as long-form memory for in-context learning.

Good Results on ChapterBreak Benchmark

The team put their method to the test through a series of experiments. The results were impressive, with LongMem outperforming strong long-context models on ChapterBreak, a challenging long-context modeling benchmark for large language models (LLMs).

ChapterBreak is a dataset that tests the ability of long-range language models (LRLMs) to understand discourse-level transitions in narratives. It provides a long segment from a narrative that ends at a chapter boundary and asks the LRLM to distinguish the beginning of the ground-truth next chapter from a set of negative segments sampled from the same narrative. The benchmark is challenging because it requires processing global context and comprehending complex types of chapter transitions.
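The shape of the ChapterBreak task can be sketched as follows. A real evaluation scores each candidate continuation by language-model likelihood; the crude word-overlap scorer below is a hypothetical stand-in used only to show the suffix-identification setup:

```python
def overlap_score(context, candidate):
    """Crude stand-in for a language model's preference: shared vocabulary."""
    ctx_words = set(context.lower().split())
    cand_words = set(candidate.lower().split())
    return len(ctx_words & cand_words) / max(len(cand_words), 1)

def pick_next_chapter(context, candidates):
    """Return the candidate segment the scorer rates as the best continuation."""
    return max(candidates, key=lambda c: overlap_score(context, c))

context = "The detective closed the case file and left the station at midnight"
candidates = [
    "The detective returned to the station the next morning",  # ground truth
    "Meanwhile the spaceship drifted silently past Jupiter",   # negative sample
]
print(pick_next_chapter(context, candidates))
```

The benchmark is hard precisely because, unlike this toy scorer, distinguishing the true next chapter often depends on events mentioned thousands of tokens earlier, which a model with a short context window has already forgotten.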

Furthermore, it demonstrated remarkable improvements on memory-augmented in-context learning over LLMs. For those interested in delving deeper into the research, the paper is accessible on the arXiv preprint server.