Meta’s New AI Models With Multi-Token Prediction Promise Faster, More Efficient Language Processing

Meta releases new AI models that predict multiple words at once (multi-token prediction) with potential benefits for code generation and language understanding.


Meta has launched pre-trained language models featuring multi-token prediction, a new technique in AI. These models are available on Hugging Face under a research license for non-commercial use and aim to push forward the capabilities of large language models (LLMs). The company made the announcement on the official X account of its artificial intelligence division.

Advancement in AI Methodology

The multi-token prediction method, first published in a research paper in April, signifies a shift from traditional techniques. Unlike prior models that predict a single next word in a sequence, Meta's new approach predicts multiple future words at once. This could improve performance and cut down on training times, potentially altering the future of AI technology.
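The difference between the two training setups can be illustrated with a small sketch. It is not Meta's code: the function names, the toy token list, and the choice of four future tokens are assumptions made only for illustration. It shows what a model is asked to predict at each position under single-token and multi-token prediction.

```python
def next_token_targets(tokens):
    """Classic setup: at each position, the target is the single following token."""
    return [(tokens[: i + 1], tokens[i + 1]) for i in range(len(tokens) - 1)]


def multi_token_targets(tokens, n_future):
    """Multi-token setup: at each position, the targets are the next n_future tokens."""
    pairs = []
    # Stop early so that every position has a full set of n_future targets.
    for i in range(len(tokens) - n_future):
        context = tokens[: i + 1]
        targets = tokens[i + 1 : i + 1 + n_future]
        pairs.append((context, targets))
    return pairs


# Hypothetical toy sequence; real models work on tokenizer IDs, not words.
tokens = ["def", "add", "(", "a", ",", "b", ")", ":"]
single = next_token_targets(tokens)
multi = multi_token_targets(tokens, n_future=4)

print(single[0])  # context and one target token
print(multi[0])   # context and four target tokens
```

In the multi-token case every training position carries several targets instead of one, which is where the extra learning signal per example comes from.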

These changes have broad implications. As AI models grow more complex, their increased computational demands have led to concerns about cost and environmental impact. Meta's multi-token prediction could mitigate these issues, making advanced AI more practical and sustainable.

Better Language Comprehension

This method might also lead to a deeper understanding of language, improving performance on tasks such as creative writing. By narrowing the gap between AI and human language comprehension, these models could significantly influence a range of applications.

However, the accessibility of these models raises concerns. While open availability could democratize access to AI and benefit smaller organizations, it also opens up possibilities for misuse. The AI community must now create ethical frameworks and security measures to keep up with these technological advances.

Emphasis on Code Generation

Meta's release of these models is in line with its commitment to open science. The initial focus is on code completion tasks, reflecting the increasing demand for AI-assisted tools. As software development increasingly incorporates AI, Meta's contributions could speed up the trend towards collaborative human-AI coding.

Meta has open-sourced four language models, each with 7 billion parameters, aimed at code generation tasks. Two of the models were trained on 200 billion tokens of code, while the other two were trained on 1 trillion tokens. There's also a fifth, yet-unreleased model featuring 13 billion parameters.

These models consist of two main components: a shared trunk and several output heads. The shared trunk performs the initial computations on the input once, and each output head then predicts one of the upcoming tokens, so the heads together produce several tokens per step.
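A toy sketch can make the trunk-and-heads layout more concrete. Everything here is an assumption for illustration: the weights are random and untrained, the "trunk" simply averages token embeddings rather than running a transformer, and the names (`shared_trunk`, `predict_next_tokens`, `N_HEADS`) are hypothetical. The point is only the structure: one shared computation, then several independent heads that each propose one future token.

```python
import random

VOCAB = ["def", "add", "(", "a", ",", "b", ")", ":"]
HIDDEN = 6
N_HEADS = 4  # assumption: four future tokens are predicted per step

rng = random.Random(0)  # fixed seed so the toy weights are reproducible


def random_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


EMBED = random_matrix(len(VOCAB), HIDDEN)  # toy token embeddings
HEADS = [random_matrix(len(VOCAB), HIDDEN) for _ in range(N_HEADS)]


def shared_trunk(context_ids):
    """Stand-in for the real trunk: average the context embeddings once."""
    hidden = [0.0] * HIDDEN
    for tok in context_ids:
        for j in range(HIDDEN):
            hidden[j] += EMBED[tok][j]
    return [h / len(context_ids) for h in hidden]


def head_logits(head, hidden):
    """One output head: a linear map from the hidden state to vocabulary scores."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in head]


def predict_next_tokens(context_ids):
    hidden = shared_trunk(context_ids)  # computed once, reused by every head
    predictions = []
    for head in HEADS:
        logits = head_logits(head, hidden)
        predictions.append(max(range(len(VOCAB)), key=logits.__getitem__))
    return predictions  # head k proposes the token k+1 positions ahead


context = [VOCAB.index("def"), VOCAB.index("add")]
predicted = predict_next_tokens(context)
print([VOCAB[i] for i in predicted])
```

Because the weights are untrained, the printed tokens are arbitrary. What matters is that a single pass through the trunk yields `N_HEADS` token proposals, which is the property that allows faster generation.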

Benchmark Testing and Results

Meta's models were evaluated on two benchmarks: MBPP (a set of around 1,000 Python coding tasks) and HumanEval (a set of 164 hand-written Python programming problems). The models showed improvements of 17% and 12% on MBPP and HumanEval, respectively, compared to similar LLMs that generate tokens sequentially. Additionally, output from Meta's models was generated up to three times faster.

This release is part of Meta's broader efforts in AI research, which also include developments in image-to-text generation and AI-generated speech detection. This extensive approach positions Meta as an important player across multiple AI fields, not just language models.

Critics argue that more efficient AI models might increase risks related to AI-generated misinformation and cyber threats. Meta has responded to these concerns by emphasizing that the models are licensed solely for research. However, questions about the effectiveness of these restrictions remain.

Luke Jones
Luke has been writing about all things tech for more than five years. He is following Microsoft closely to bring you the latest news about Windows, Office, Azure, Skype, HoloLens and all the rest of their products.