HomeWinBuzzer NewsGoogle Unveils Gemma 2 Lightweight AI Models with Enhanced Efficiency

Google Unveils Gemma 2 Lightweight AI Models with Enhanced Efficiency

The models can be hosted on single NVIDIA A100 or NVIDIA H100 TEnsor GPUs or Cloud Tensor Processing Units (TPUs).


has expanded its Gemma series of lightweight open AI models with the launch of Gemma 2. The new models, available in Studio, promise better performance and efficiency through a revamped architecture.

Google also opened developers access to the 2 million context window for Gemini 1.5 Pro, code execution capabilities in the Gemini API.

Configurations and Performance

The models can be hosted on a single NVIDIA A100 80GB Tensor Core GPU, NVIDIA H100 Tensor Core GPU, or Cloud Tensor Processing Units (TPUs), which lowers AI infrastructure expenses. Additionally, they are compatible with NVIDIA RTX or GeForce RTX desktop GPUs through Hugging Face Transformers. Starting next month, Google Cloud customers can deploy Gemma 2 on Vertex AI, and developers can test them on Google AI Studio.

Gemma 2 comes in two versions: one with 9 billion parameters and another with 27 billion. The larger model delivers performance comparable to models more than twice its size, while the 9B model exceeds the capabilities of similar-sized competitors like Llama 3 8B. A smaller 2.6B parameter version tailored for smartphone is expected soon.

Applications and Accessibility

Gemma 2 is accessible for experimentation on Google AI Studio. Meanwhile, Gemini 1.5 Flash, optimized for speed and cost, is being used in various applications.

Envision for example uses it to help visually impaired users comprehend their environment in real-time, Plural, an automated policy analysis and monitoring platform, leverages it to summarize complex legislative texts for NGOs and citizens, automation platform Zapier employs its video reasoning for automation in video editing, and AI provider Dot uses it for information compression in their long-term memory system.

Commitment to Responsible AI

Google says Gemma 2 meets safety standards by filtering pre-training data and evaluating the models against various safety metrics to minimize biases.

The models are free via the machine learning and data science community Kaggle or the Google Colab free tier, and academic researchers can apply for Google Cloud credits through the Gemma 2 Academic Research Program.

Extended Context Window and Code Execution in Gemini 1.5 Pro

Google has made a 2 million token context window for Gemini 1.5 Pro available to all developers, allowing the handling of larger inputs. Although this might increase costs, Google plans to address this with context caching in the Gemini API.

The Gemini API now features code execution for Gemini 1.5 Pro and 1.5 Flash models, enabling them to generate and execute Python code. This enhances their abilities in solving complex math and data reasoning tasks. The execution occurs in a secure, offline environment and includes several numerical libraries.

Google is also preparing to make Gemini 1.5 Flash tuning available for all developers, with text tuning now available for red-teaming and gradual rollout.

Markus Kasanmascheff
Markus Kasanmascheff
Markus is the founder of WinBuzzer and has been playing with Windows and technology for more than 25 years. He is holding a Master´s degree in International Economics and previously worked as Lead Windows Expert for Softonic.com.