Google released preliminary documentation this week for its latest Gemini 2.5 Pro reasoning model, but the move came weeks after the model was made widely available and has attracted sharp criticism from AI governance specialists. The document, known as a “model card,” appeared online around April 16th, yet experts contend it lacks critical safety details, suggesting Google may be falling short of transparency promises made to governments and international bodies.
The controversy stems from the timeline: Gemini 2.5 Pro began its preview rollout to subscribers on March 25th (the specific experimental version, `gemini-2.5-pro-exp-03-25`, appeared on March 28th, per Google Cloud docs), and access was quickly expanded to all free users via the Gemini web app starting March 29th.
The accompanying model card detailing safety evaluations and limitations, however, only surfaced more than two weeks after this broad public access began.
Kevin Bankston, a senior advisor at the Center for Democracy and Technology, described the six-page document on the social platform X as “meager” documentation, adding it tells a “troubling story of a race to the bottom on AI safety and transparency as companies rush their models to market.”
Missing Details and Unmet Pledges
A primary concern voiced by Bankston is the absence of detailed results from crucial safety assessments, such as “red-teaming” exercises meant to discover if the AI can be prompted into generating harmful content like instructions for creating bioweapons.
He suggested the timing and omissions could mean Google “hadn’t finished its safety testing before releasing its most powerful model” and “it still hasn’t completed that testing even now,” or that the company has adopted a new policy of withholding comprehensive results until a model is deemed generally available.
Other experts, including Peter Wildeford and Thomas Woodside, highlighted the model card’s lack of specific results or detailed references tied to evaluations under Google’s own Frontier Safety Framework (FSF), despite the card mentioning the FSF process was used.
This approach appears inconsistent with several public commitments Google undertook regarding AI safety and transparency. These include pledges made at a July 2023 White House meeting to publish detailed reports for powerful new models, adherence to the G7’s AI code of conduct agreed in October 2023, and promises made at the Seoul AI Safety Summit in May 2024.
Thomas Woodside of the Secure AI Project also pointed out that Google’s last dedicated publication on dangerous capability testing dates back to June 2024, questioning the company’s commitment to regular updates. Google also did not confirm whether Gemini 2.5 Pro had been submitted to the US or UK AI Safety Institutes for external evaluation prior to its preview release.
Google’s Position and Model Card Contents
While the full technical report is pending, the released model card does offer some insights. Google outlines its policy therein: “A detailed technical report will be published once per model family’s release, with the next technical report releasing after the 2.5 series is made generally available.”
It adds that separate reports on “dangerous capability evaluations” will follow “at regular cadences.” Google had previously stated that the latest Gemini underwent “pre-release testing, including internal development evaluations and assurance evaluations which had been conducted before the model was released.”
The published card confirms Gemini 2.5 Pro builds on a Mixture-of-Experts (MoE) Transformer architecture, a design that improves efficiency by activating only a subset of the model’s parameters for each input. It details the model’s 1 million-token input context window and 64K-token output limit, along with its training on diverse multimodal data with safety filtering aligned with Google’s AI Principles.
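For readers unfamiliar with the term, the sketch below illustrates the general idea behind MoE routing: a small “router” picks a handful of expert sub-networks for each token, so most of the model’s parameters sit idle on any given pass. This is a minimal, illustrative example only; the class name `TinyMoELayer`, the layer sizes, and the expert count are invented here, and Gemini 2.5 Pro’s actual implementation has not been published.

```python
# Minimal, illustrative top-k Mixture-of-Experts layer in PyTorch.
# Not Google's implementation; all names and sizes are invented for demonstration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoELayer(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Each "expert" is a small feed-forward network.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])
        # The router scores each token against every expert.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                          # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # top-k experts per token
        weights = F.softmax(weights, dim=-1)             # normalize the selected gates
        out = torch.zeros_like(x)
        # Only the selected experts run for each token -- the source of MoE's efficiency.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot+1] * expert(x[mask])
        return out

# Example: route 10 token embeddings through the layer.
layer = TinyMoELayer()
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

The efficiency claim in the model card follows from this selective activation: compute per token scales with the few experts chosen, not with the total parameter count.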
The card includes performance benchmarks (run on the `gemini-2.5-pro-exp-03-25` version) showing competitive results as of March 2025. It acknowledges limitations such as potential “hallucinations” and sets the knowledge cut-off at January 2025. While outlining safety processes involving internal reviews by Google’s Responsibility and Safety Council (RSC) and various mitigations, and showing some improvements over Gemini 1.5 on automated safety metrics, it confirms that “over-refusals” persist as a limitation.
An Industry Racing Ahead?
The situation reflects broader tensions. Sandra Wachter, a professor at the Oxford Internet Institute, previously told Fortune, “If this was a car or a plane, we wouldn’t say: let’s just bring this to market as quickly as possible and we will look into the safety aspects later. Whereas with generative AI there’s an attitude of putting this out there and worrying, investigating, and fixing the issues with it later.”
This comes as OpenAI modified its safety framework, potentially allowing adjustments based on competitors’ actions, and Meta’s Llama 4 report also faced criticism for lacking detail. Bankston warned that if companies cannot fulfill basic, voluntary safety commitments, “it will be incumbent on lawmakers to develop and enforce clear transparency requirements that the companies can’t shirk.” Google, meanwhile, kept up its release pace, launching a preview of Gemini 2.5 Flash on April 17th and again stating that its safety paper was “coming soon.”