Microsoft’s New Phi-1.5 1.3B Model Outperforms Llama-2 7B in Benchmarks

Microsoft Research has unveiled phi-1.5, a 1.3-billion-parameter language model designed to excel in a variety of formats, including QA, chat, and code.

The Transformer-based model is trained on a rich blend of data, from Python code sourced from StackOverflow to exercises generated with gpt-3.5-turbo-0301.

Performance Metrics in the Spotlight

When it comes to performance, phi-1.5 holds its own against rival language models, showing impressive results against models of similar size. In benchmark evaluations, it not only matched but in some instances surpassed the capabilities of models like Meta's Llama-2 7B, particularly on the AGIEval score and the GPT4All benchmark suite.

Embracing the Open-Source Ethos

In a move that resonates with the broader tech community's ethos, Microsoft has released phi-1.5 as an open-source model. The goal? To provide researchers worldwide with a versatile tool for tackling pressing challenges such as bias mitigation, and more.

For enthusiasts looking for a deeper understanding, Hugging Face provides an in-depth look at phi-1.5. The model's training strategy is an evolution of its predecessor, phi-1, enriched with a new data source of synthetic NLP texts. While the model is a powerhouse in its own right, it was not fine-tuned with certain common training methods, such as instruction following or reinforcement learning from human feedback (RLHF).
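For readers who want to try the model themselves, a minimal sketch of loading it through the Hugging Face `transformers` library is shown below. The model id `microsoft/phi-1_5` is the one published on the Hugging Face Hub; the prompt is an arbitrary example, and depending on your `transformers` version you may need to pass `trust_remote_code=True`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1_5")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1_5")

# phi-1.5 is trained heavily on code, so a code-style prompt works well.
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding of a short continuation.
outputs = model.generate(**inputs, max_new_tokens=50)
text = tokenizer.decode(outputs[0])
print(text)
```

Because phi-1.5 is a base model with no instruction tuning, it responds best to completion-style prompts like the one above rather than chat-style instructions.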

A technical report published on arXiv offers further insight into the model's development. The emphasis has been on harnessing the power of common-sense reasoning in natural language. While phi-1.5 reflects some characteristics of larger LLMs, it brings unique strengths of its own, especially in the domain of safety, by deliberately omitting web data from its training phase.