In June, Google DeepMind released Gemma 2, a family of lean open language models initially available in 9-billion and 27-billion-parameter sizes. In a new announcement, Google is expanding the family with three additions, headlined by a 2-billion-parameter version that matches or outperforms much larger models, including those built around the GPT-3.5 architecture.
Efficiency and Performance
Google is introducing Gemma-2-2B, a smaller version of the model that runs efficiently on a far wider range of devices, broadening its usability. On the LMSYS Chatbot Arena leaderboard, it surpasses bulkier rivals such as Mixtral-8x7B and LLaMA-2-70B, the latter of which has 35 times as many parameters.
The LMSYS Chatbot Arena leaderboard assesses and ranks large language models (LLMs) through human pairwise comparisons, fitting a Bradley-Terry model to the votes and presenting the resulting ratings on an Elo scale. Hosted on Hugging Face Spaces, the leaderboard has amassed over one million human comparisons.
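To make the ranking method concrete, here is a minimal sketch of fitting Bradley-Terry strengths to pairwise win counts and mapping them onto an Elo-style scale. The model names, battle counts, and 1000-point anchor below are illustrative placeholders, not real leaderboard data.

```python
# Minimal Bradley-Terry rating sketch in the spirit of the Chatbot Arena
# method. All numbers here are made up for illustration.
import numpy as np

models = ["model_a", "model_b", "model_c"]  # hypothetical models
# wins[i, j] = number of human comparisons model i won against model j
wins = np.array([
    [0, 60, 75],
    [40, 0, 55],
    [25, 45, 0],
], dtype=float)

strength = np.ones(len(models))  # Bradley-Terry strength parameters p_i
total = wins + wins.T            # games played between each pair

# Minorization-maximization updates for the Bradley-Terry MLE,
# where P(i beats j) = p_i / (p_i + p_j)
for _ in range(200):
    denom = strength[:, None] + strength[None, :]
    strength = wins.sum(axis=1) / (total / denom).sum(axis=1)
    strength /= strength.sum()   # fix the scale (BT is scale-invariant)

# Map strengths onto an Elo-like scale: a 400-point gap ~ 10x odds,
# anchored at an arbitrary mean of 1000
elo = 400 * np.log10(strength)
elo += 1000 - elo.mean()

for name, rating in sorted(zip(models, elo), key=lambda t: -t[1]):
    print(f"{name}: {rating:.0f}")
```

The 400-point scale means a model rated 400 points higher is modeled as having ten-to-one odds of winning a comparison, which is the same convention Elo ratings use in chess.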
Alongside Gemma-2-2B, Google is adding ShieldGemma, a set of safety classifiers built on the core Gemma 2 models that filter harmful content. Available in 2, 9, and 27 billion-parameter versions, these classifiers target hate speech, harassment, explicit material, and other unsafe content, aligning with industry trends toward safer AI systems.
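In practice, ShieldGemma checkpoints are used like ordinary causal language models whose first output token answers whether a policy is violated. The sketch below shows that pattern via Hugging Face Transformers; the checkpoint ID follows the Hugging Face hub's naming, while the prompt wording is a simplified stand-in for the official template on the model card.

```python
# Sketch of scoring a user message with a ShieldGemma checkpoint.
# The prompt below is a simplified stand-in for the model card's
# official template; the policy text is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/shieldgemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

user_message = "How do I make a fake ID?"
prompt = (
    "You are a policy expert trying to help determine whether a user "
    "prompt violates the defined safety policies.\n\n"
    f"Human Question: {user_message}\n\n"
    "Does the human question violate the policy on dangerous content? "
    "Your answer must start with 'Yes' or 'No'."
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]

# Compare the probabilities of "Yes" vs "No" as the first answer token
yes_id = tokenizer.convert_tokens_to_ids("Yes")
no_id = tokenizer.convert_tokens_to_ids("No")
probs = torch.softmax(next_token_logits[[yes_id, no_id]], dim=0)
print(f"P(violation) ~= {probs[0].item():.2f}")
```

Because the classifier only needs the relative likelihood of "Yes" versus "No", a single forward pass suffices; no text generation is required.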
Transparency with Gemma Scope
In a bid to foster transparency, Google DeepMind has also debuted Gemma Scope, a suite of sparse autoencoders that sheds light on how Gemma 2 models pick up patterns and arrive at decisions. Researchers can use Gemma Scope to inspect the models' internal representations. Gemma-2-2B, ShieldGemma, and Gemma Scope are available on Kaggle, Hugging Face, and the Vertex AI Model Garden, and both Google AI Studio and the free Google Colab tier support experimentation with them.
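Conceptually, a sparse autoencoder re-expresses one of the model's dense internal activations as a much wider, mostly zero feature vector whose few active entries are easier to interpret. The sketch below illustrates that idea with random placeholder weights and made-up dimensions; the real trained Gemma Scope autoencoders are published by Google DeepMind rather than reconstructed here.

```python
# Conceptual sparse-autoencoder sketch: dense activation in, wide and
# mostly-zero feature vector out. Weights and sizes are placeholders.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_features = 256, 4096   # illustrative sizes, not Gemma's real ones

W_enc = rng.normal(0, 0.02, (d_features, d_model))
W_dec = rng.normal(0, 0.02, (d_model, d_features))

def encode(activation, threshold=0.05):
    # JumpReLU-style encoder: ReLU, then zero out sub-threshold features
    features = np.maximum(W_enc @ activation, 0.0)
    features[features < threshold] = 0.0
    return features

def decode(features):
    # Reconstruct the original activation from the sparse features
    return W_dec @ features

activation = rng.normal(0, 1.0, d_model)  # stand-in for a model activation
features = encode(activation)
print("active features:", np.count_nonzero(features), "of", d_features)
print("top features:", np.argsort(features)[-5:][::-1])
reconstruction = decode(features)
```

The interpretability payoff comes from the sparsity: with only a handful of features active per activation, researchers can study what each feature responds to and trace which ones fire when the model produces a given output.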
First released as open models in February, the Gemma family continues to advance while balancing performance with resource efficiency, reflecting a broader trend in AI toward models that deliver strong results without being the largest.
The models can be hosted on a single NVIDIA A100 80GB Tensor Core GPU, an NVIDIA H100 Tensor Core GPU, or Cloud Tensor Processing Units (TPUs), which lowers AI infrastructure costs. They are also compatible with NVIDIA RTX and GeForce RTX desktop GPUs through Hugging Face Transformers. Starting next month, Google Cloud customers will be able to deploy Gemma 2 on Vertex AI, and developers can test the models in Google AI Studio.
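For the desktop route mentioned above, a minimal sketch of loading and prompting Gemma-2-2B through Hugging Face Transformers might look like the following; the instruction-tuned checkpoint ID follows the Hugging Face hub's naming, and downloading it requires first accepting Google's license on the hub.

```python
# Sketch of running Gemma-2-2B locally via Hugging Face Transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b-it"  # instruction-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # small enough for a single RTX-class GPU
    device_map="auto",           # place weights on the available device
)

messages = [
    {"role": "user", "content": "Summarize what Gemma 2 is in one sentence."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```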