Artificial Intelligence – Overview, Benchmarks, Latest News

AI Model Architectures

AI models are not a monolithic technology; they consist of multiple architectures, each designed for specific types of tasks. While some models excel at recognizing patterns, others specialize in generating content or making autonomous decisions.

| Model | Best Use Cases | Advantages | Limitations |
| --- | --- | --- | --- |
| Feedforward Networks | Fraud detection, risk assessment, structured data classification | Simple, fast, efficient for small-scale tasks | Cannot handle sequential or complex unstructured data |
| Recurrent Neural Networks (RNNs) | Speech processing, time-series forecasting | Captures sequential dependencies | Suffers from vanishing gradient problem, inefficient for long sequences |
| Transformers (LLMs) | Text generation, translation, multimodal AI | High scalability, state-of-the-art performance | Requires vast computational power, black-box decision-making |
| GANs | AI-generated images, deepfakes, artistic design | Produces highly realistic outputs | Training instability, prone to mode collapse |
| Diffusion Models | AI art, synthetic image generation | More stable than GANs, superior output quality | Computationally expensive, slow inference speed |
| Reinforcement Learning | Robotics, autonomous vehicles, game AI | Adapts to dynamic environments, learns from experience | High training cost, lack of generalization outside of trained tasks |

AI Model Benchmarks – LLM Leaderboard

The transformer architecture redefined AI by enabling parallel sequence processing, eliminating the bottlenecks of RNNs. Instead of analyzing sequences step-by-step, transformers use self-attention mechanisms to determine relationships between all elements of an input at once.

This breakthrough led to the development of large language models (LLMs), such as GPT-4, Claude, and Google Gemini 1.5, which power today’s most advanced AI applications.
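To make the self-attention idea concrete, here is a minimal sketch in plain Python. It is a deliberate simplification, not how any production model is implemented: the learned query, key, and value projections are omitted, so each token vector serves as its own query, key, and value, and the three 2-dimensional embeddings are hypothetical values chosen for illustration.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with queries = keys = values = tokens.

    Real transformers apply learned projection matrices first; they are
    left out here as an illustrative simplification.
    """
    dim = len(tokens[0])
    outputs, all_weights = [], []
    for query in tokens:
        # Compare this token with every token in the sequence, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        # The output is a weighted average of all token vectors.
        mixed = [
            sum(w * value[i] for w, value in zip(weights, tokens))
            for i in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Three hypothetical 2-dimensional token embeddings.
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(sequence)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how strongly one token attends to every token in the sequence. No row depends on the result of another, which is why the computation can run in parallel rather than step by step as in an RNN.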

Last updated: April 7, 2025

Benchmark stats come from the model providers, if available. For models with optional advanced reasoning, we provide the highest benchmark score achieved.
| Organization | Model | Context | Parameters (B) | Input $/M | Output $/M | License | GPQA | MMLU | MMLU Pro | DROP | HumanEval | AIME'24 | SimpleBench |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| meta | Llama 4 Maverick | 1,000,000 | 288 | $0.19-$0.49 | - | Open | 69.80% | 84.60% | 80.50% | - | - | - | 27.70% |
| meta | Llama 4 Scout | 10,000,000 | 17 | - | - | Open | 57.20% | - | 74.30% | - | - | - | - |
| meta | Llama 4 Behemoth | 10,000,000 | 288 | - | - | Open | 73.70% | 85.80% | 82.20% | - | - | - | - |
| google | Gemini 2.5 Pro (Exp) | 1,000,000 | - | $2.50 | $15.00 | Proprietary | 84.00% | 89.8% | - | - | - | 92.00% | 51.60% |
| openai | o3 | 128,000 | - | - | - | Proprietary | 87.70% | - | - | - | - | - | - |
| anthropic | Claude 3.7 Sonnet | 200,000 | - | $3.00 | $15.00 | Proprietary | 84.80% | 86.10% | - | - | - | 80.00% | 46.4% |
| xai | Grok-3 | 128,000 | - | - | - | Proprietary | 84.60% | - | 79.90% | - | - | 93.30% | - |
| xai | Grok-3 Mini | 128,000 | - | - | - | Proprietary | 84.60% | - | 78.90% | - | - | 90.80% | - |
| openai | o3-mini | 200,000 | - | $1.10 | $4.40 | Proprietary | 79.70% | 86.90% | - | - | - | 86.50% | 22.8% |
| openai | o1-pro | 128,000 | - | - | - | Proprietary | 79.00% | - | - | - | - | 86.00% | - |
| openai | o1 | 200,000 | - | $15.00 | $60.00 | Proprietary | 78.00% | 91.80% | - | - | 88.10% | 83.30% | 40.1% |
| google | Gemini 2.0 Flash Thinking | 1,000,000 | - | - | - | Proprietary | 74.20% | - | - | - | - | 73.30% | 30.7% |
| openai | o1-preview | 128,000 | - | $15.00 | $60.00 | Proprietary | 73.30% | 90.80% | - | - | - | 44.60% | 41.7% |
| deepseek | DeepSeek-R1 | 131,072 | 671 | $0.55 | $2.19 | Open | 71.50% | 90.80% | 84.00% | 92.20% | - | 79.80% | 30.9% |
| openai | GPT-4.5 | 128,000 | - | - | - | Proprietary | 71.4% | 90.0% | - | - | 88.0% | 36.7% | 34.5% |
| anthropic | Claude 3.5 Sonnet | 200,000 | - | $3.00 | $15.00 | Proprietary | 67.20% | 90.40% | 77.60% | 87.10% | 93.70% | 16.00% | 41.4% |
| qwen | QwQ-32B-Preview | 32,768 | 32.5 | $0.15 | $0.20 | Open | 65.20% | - | 70.97% | - | - | 50.00% | - |
| google | Gemini 2.0 Flash | 1,048,576 | - | - | - | Proprietary | 62.10% | - | 76.40% | - | - | 35.5% | 18.9% |
| openai | o1-mini | 128,000 | - | $3.00 | $12.00 | Proprietary | 60.00% | 85.20% | 80.30% | - | 92.40% | 70.00% | 18.1% |
| deepseek | DeepSeek-V3 | 131,072 | 671 | $0.27 | $1.10 | Open | 59.10% | 88.50% | 75.90% | 91.60% | - | 39.2% | 18.9% |
| google | Gemini 1.5 Pro | 2,097,152 | - | $2.50 | $10.00 | Proprietary | 59.10% | 85.90% | 75.80% | 74.90% | 84.10% | 19.3% | 27.1% |
| microsoft | Phi-4 | 16,000 | 14.7 | $0.07 | $0.14 | Open | 56.10% | 84.80% | 70.40% | 75.50% | 82.60% | - | - |
| xai | Grok-2 | 128,000 | - | $2.00 | $10.00 | Proprietary | 56.00% | 87.50% | 75.50% | - | 88.40% | - | 22.7% |
| openai | GPT-4o | 128,000 | - | $2.50 | $10.00 | Proprietary | 53.60% | 88.00% | 74.70% | - | - | - | 17.8% |
| google | Gemini 1.5 Flash | 1,048,576 | - | $0.15 | $0.60 | Proprietary | 51.00% | 78.90% | 67.30% | - | 74.30% | - | - |
| xai | Grok-2 mini | 128,000 | - | - | - | Proprietary | 51.00% | 86.20% | 72.00% | - | 85.70% | - | - |
| meta | Llama 3.1 405B Instruct | 128,000 | 405 | $0.90 | $0.90 | Open | 50.70% | 87.30% | 73.30% | 84.80% | 89.00% | - | 23.0% |
| meta | Llama 3.3 70B Instruct | 128,000 | 70 | $0.20 | $0.20 | Open | 50.50% | 86.00% | 68.90% | - | 88.40% | - | 19.9% |
| anthropic | Claude 3 Opus | 200,000 | - | $15.00 | $75.00 | Proprietary | 50.40% | 86.80% | 68.50% | 83.10% | 84.90% | - | 23.5% |
| qwen | Qwen2.5 32B Instruct | 131,072 | 32.5 | - | - | Open | 49.50% | 83.30% | 69.00% | - | 88.40% | - | - |
| qwen | Qwen2.5 72B Instruct | 131,072 | 72.7 | $0.35 | $0.40 | Open | 49.00% | - | 71.10% | - | 86.60% | 23.30% | - |
| openai | GPT-4 Turbo | 128,000 | - | $10.00 | $30.00 | Proprietary | 48.00% | 86.50% | - | 86.00% | 87.10% | - | - |
| amazon | Nova Pro | 300,000 | - | $0.80 | $3.20 | Proprietary | 46.90% | 85.90% | - | 85.40% | 89.00% | - | - |
| meta | Llama 3.2 90B Instruct | 128,000 | 90 | $0.35 | $0.40 | Open | 46.70% | 86.00% | - | - | - | - | - |
| qwen | Qwen2.5 14B Instruct | 131,072 | 14.7 | - | - | Open | 45.50% | 79.70% | 63.70% | - | 83.50% | - | - |
| mistral | Mistral Small 3 | 32,000 | 24 | $0.07 | $0.14 | Open | 45.30% | - | 66.30% | - | 84.80% | - | - |
| qwen | Qwen2 72B Instruct | 131,072 | 72 | - | - | Open | 42.40% | 82.30% | 64.40% | - | 86.00% | - | - |
| amazon | Nova Lite | 300,000 | - | $0.06 | $0.24 | Proprietary | 42.00% | 80.50% | - | 80.20% | 85.40% | - | - |
| meta | Llama 3.1 70B Instruct | 128,000 | 70 | $0.20 | $0.20 | Open | 41.70% | 83.60% | 66.40% | 79.60% | 80.50% | - | - |
| anthropic | Claude 3.5 Haiku | 200,000 | - | $0.10 | $0.50 | Proprietary | 41.60% | - | 65.00% | 83.10% | 88.10% | - | - |
| anthropic | Claude 3 Sonnet | 200,000 | - | $3.00 | $15.00 | Proprietary | 40.40% | 79.00% | 56.80% | 78.90% | 73.00% | - | - |
| openai | GPT-4o mini | 128,000 | - | $0.15 | $0.60 | Proprietary | 40.20% | 82.00% | - | 79.70% | 87.20% | - | 10.7% |
| amazon | Nova Micro | 128,000 | - | $0.04 | $0.14 | Proprietary | 40.00% | 77.60% | - | 79.30% | 81.10% | - | - |
| google | Gemini 1.5 Flash 8B | 1,048,576 | 8 | $0.07 | $0.30 | Proprietary | 38.40% | - | 58.70% | - | - | - | - |
| ai21 | Jamba 1.5 Large | 256,000 | 398 | $2.00 | $8.00 | Open | 36.90% | 81.20% | 53.50% | - | - | - | - |
| microsoft | Phi-3.5-MoE-instruct | 128,000 | 60 | - | - | Open | 36.80% | 78.90% | 54.30% | - | 70.70% | - | - |
| qwen | Qwen2.5 7B Instruct | 131,072 | 7.6 | $0.30 | $0.30 | Open | 36.40% | - | 56.30% | - | 84.80% | - | - |
| xai | Grok-1.5 | 128,000 | - | - | - | Proprietary | 35.90% | 81.30% | 51.00% | - | 74.10% | - | - |
| openai | GPT-4 | 32,768 | - | $30.00 | $60.00 | Proprietary | 35.70% | 86.40% | - | 80.90% | 67.00% | - | 25.1% |
| anthropic | Claude 3 Haiku | 200,000 | - | $0.25 | $1.25 | Proprietary | 33.30% | 75.20% | - | 78.40% | 75.90% | - | - |
| meta | Llama 3.2 11B Instruct | 128,000 | 10.6 | $0.06 | $0.06 | Open | 32.80% | 73.00% | - | - | - | - | - |
| meta | Llama 3.2 3B Instruct | 128,000 | 3.2 | $0.01 | $0.02 | Open | 32.80% | 63.40% | - | - | - | - | - |
| ai21 | Jamba 1.5 Mini | 256,144 | 52 | $0.20 | $0.40 | Open | 32.30% | 69.70% | 42.50% | - | - | - | - |
| openai | GPT-3.5 Turbo | 16,385 | - | $0.50 | $1.50 | Proprietary | 30.80% | 69.80% | - | 70.20% | 68.00% | - | - |
| meta | Llama 3.1 8B Instruct | 131,072 | 8 | $0.03 | $0.03 | Open | 30.40% | 69.40% | 48.30% | 59.50% | 72.60% | - | - |
| microsoft | Phi-3.5-mini-instruct | 128,000 | 3.8 | $0.10 | $0.10 | Open | 30.40% | 69.00% | 47.40% | - | 62.80% | - | - |
| google | Gemini 1.0 Pro | 32,760 | - | $0.50 | $1.50 | Proprietary | 27.90% | 71.80% | - | - | - | - | - |
| qwen | Qwen2 7B Instruct | 131,072 | 7.6 | - | - | Open | 25.30% | 70.50% | 44.10% | - | - | - | - |
| mistral | Codestral-22B | 32,768 | 22.2 | $0.20 | $0.60 | Open | - | - | - | - | 81.10% | - | - |
| cohere | Command R+ | 128,000 | 104 | $0.25 | $1.00 | Open | - | 75.70% | - | - | - | - | 17.4% |
| deepseek | DeepSeek-V2.5 | 8,192 | 236 | $0.14 | $0.28 | Open | - | 80.40% | - | - | 89.00% | - | - |
| google | Gemma 2 27B | 8,192 | 27.2 | - | - | Open | - | 75.20% | - | - | 51.80% | - | - |
| google | Gemma 2 9B | 8,192 | 9.2 | - | - | Open | - | 71.30% | - | - | 40.20% | - | - |
| xai | Grok-1.5V | 128,000 | - | - | - | Proprietary | - | - | - | - | - | - | - |
| moonshotai | Kimi-k1.5 | 128,000 | - | - | - | Proprietary | - | 87.40% | - | - | - | - | - |
| nvidia | Llama 3.1 Nemotron 70B Instruct | 128,000 | 70 | - | - | Open | - | 80.20% | - | - | - | - | - |
| mistral | Ministral 8B Instruct | 128,000 | 8 | $0.10 | $0.10 | Open | - | 65.00% | - | - | 34.80% | - | - |
| mistral | Mistral Large 2 | 128,000 | 123 | $2.00 | $6.00 | Open | - | 84.00% | - | - | 92.00% | - | 22.5% |
| mistral | Mistral NeMo Instruct | 128,000 | 12 | $0.15 | $0.15 | Open | - | 68.00% | - | - | - | - | - |
| mistral | Mistral Small | 32,768 | 22 | $0.20 | $0.60 | Open | - | - | - | - | - | - | - |
| microsoft | Phi-3.5-vision-instruct | 128,000 | 4.2 | - | - | Open | - | - | - | - | - | - | - |
| mistral | Pixtral-12B | 128,000 | 12.4 | $0.15 | $0.15 | Open | - | 69.20% | - | - | 72.00% | - | - |
| mistral | Pixtral Large | 128,000 | 124 | $2.00 | $6.00 | Open | - | - | - | - | - | - | - |
| qwen | QvQ-72B-Preview | 32,768 | 73.4 | - | - | Open | - | - | - | - | - | - | - |
| qwen | Qwen2.5-Coder 32B Instruct | 128,000 | 32 | $0.09 | $0.09 | Open | - | 75.10% | 50.40% | - | 92.70% | - | - |
| qwen | Qwen2.5-Coder 7B Instruct | 128,000 | 7 | - | - | Open | - | 67.60% | 40.10% | - | 88.40% | - | - |
| qwen | Qwen2-VL-72B-Instruct | 32,768 | 73.4 | - | - | Open | - | - | - | - | - | - | - |
| cohere | Command A | 256,000 | 111 | $2.50 | $10.00 | Open | - | 85.00% | - | - | - | - | - |
| baidu | ERNIE 4.5 | - | - | - | - | - | 75.00% | - | 79.00% | 87.00% | 85.00% | - | - |
| google | Gemma 3 1B | 128,000 | 1 | - | - | Open | 19.20% | 29.90% | 14.70% | - | 32.00% | - | - |
| google | Gemma 3 4B | 128,000 | 4 | - | - | Open | 30.80% | 46.90% | 43.60% | - | - | - | - |
| google | Gemma 3 12B | 128,000 | 12 | - | - | Open | 40.90% | 65.20% | 60.60% | - | - | - | - |
| google | Gemma 3 27B | 128,000 | 27 | - | - | Open | 42.40% | 72.1% | 67.50% | - | 89.00% | - | - |
| qwen | Qwen2.5 Max | 32,768 | - | - | - | - | 59.00% | - | 76.00% | - | 93.00% | 23.00% | - |
| qwen | QwQ 32B | 131,000 | 32.8 | - | - | Open | 59.00% | - | 76.00% | 98.00% | 78.00% | - | - |
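The two price columns can be turned into a per-request cost estimate. The sketch below assumes that "Input $/M" and "Output $/M" mean US dollars per one million tokens, which is how the column labels read but is not stated explicitly on the page. The function name `request_cost` and the workload of 20,000 prompt tokens and 2,000 completion tokens are hypothetical.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the USD cost of one request given per-million-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost

# Prices copied from the leaderboard as (Input $/M, Output $/M).
PRICES = {
    "Claude 3.7 Sonnet": (3.00, 15.00),
    "DeepSeek-R1": (0.55, 2.19),
    "GPT-4o mini": (0.15, 0.60),
}

# Hypothetical workload: 20,000 prompt tokens and 2,000 completion tokens.
for model, (input_price, output_price) in PRICES.items():
    cost = request_cost(20_000, 2_000, input_price, output_price)
    print(f"{model}: ${cost:.4f}")
```

Under these assumptions the Claude 3.7 Sonnet request costs about $0.09. Because output tokens are priced several times higher than input tokens for most listed models, long completions dominate the bill even when prompts are large.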

Amazon & HUMAIN Ignite Saudi AI with $5B+ AI Zone Investment

AWS & Saudi Arabia's HUMAIN reveal a $5B+ AI Zone, boosting the Kingdom's AI prowess with NVIDIA & AMD deals amid new U.S. tech export policies.

Perplexity AI Nears $14B Value with New $500M Accel-Led Funding Round

AI 'answer engine' Perplexity AI is closing a $500M Accel-led round, targeting a $14B valuation to expand its Google-challenging product suite and scale operations.

Microsoft Cuts 3% of its Global Workforce

Microsoft reduces its global workforce by 3% (~7,000 employees) to enhance efficiency and strategic AI focus, even with recent strong financial performance.

HUMAIN AI: Saudi Arabia Taps Chips from NVIDIA & Groq in AI Power Play

Saudi Arabia's HUMAIN AI initiative partners with NVIDIA and Groq for "AI factories of the future," using a dual chip strategy and new U.S. export policies to fuel Vision 2030 leadership.

Apple Explores Brain Control for Devices

Apple is reportedly developing brain-computer interface technology to allow users to control devices with thoughts, enhancing accessibility for those with disabilities and signaling a new era of human-computer interaction.

SoftBank Reports $3.5Bn Q4 2024 Profit

SoftBank Group reports a surprise $3.5 billion Q4 net profit, its first full-year profit in four years, providing a crucial financial boost for its massive AI investments in OpenAI and the ambitious Project Stargate infrastructure initiative.

Trump Eyes Large AI Chip Sales to UAE, Saudi Arabia

The Trump administration is considering large-scale sales of advanced AI chips to the UAE and Saudi Arabia, marking a policy shift and sparking debate over national security and economic ties.

Microsoft Retires Bing Search APIs; Pushes Azure AI Agents

Microsoft sunsets its public Bing Search and Custom Search APIs on August 11, 2025, requiring developers to migrate; the company recommends Azure AI Agent Service with its Grounding feature as the path forward, raising new integration and compliance considerations.

Paul McCartney, Elton John and 400 Other UK Creatives Demand Transparency on AI Model Training of Copyrighted Works

UK creative leaders, including Paul McCartney and Elton John, are demanding that the government mandate transparency about copyrighted works used to train AI models, to protect the industry from 'mass theft'.

Google Gemini Deep Research Gets File Upload Option to Analyze Your Documents

Google Gemini Deep Research will soon allow users to upload and analyze their own files, bringing NotebookLM-like capabilities to the free AI tool for personalized research.

Saudi Arabia Launches “Humain” AI Venture to Advance Saudi Tech Infrastructure

Saudi Arabia launches Humain, a new AI company chaired by Crown Prince Mohammed bin Salman and owned by the PIF, to drive its AI strategy and investments as it seeks to become a global AI hub.

Windows 11 Store & Search Gain Microsoft Copilot Integration for App Discovery

Microsoft is enhancing Windows 11 app discovery with Copilot integration in the Store and direct app downloads from Search, aiming to simplify user experience.

Honor 400 Smartphones to Launch with Google Veo 2 AI Video Generation Model

Honor's new 400 series phones will launch with Google's advanced Veo 2 AI image-to-video generator, offering early access to powerful creative tools before broader Gemini availability.

New Microsoft Dynamics 365 Phishing Campaign Bypasses Multi-Factor Authentication

A new phishing campaign is exploiting Microsoft Dynamics 365 Customer Voice to steal login credentials and bypass MFA, impacting hundreds of organizations.

Tariff Fears Impact Financing of Stargate Project by OpenAI and Softbank

SoftBank and OpenAI's $100 billion Project Stargate, aimed at building US AI infrastructure, faces financing delays due to tariff risks.

Trump Fires Director of U.S. Copyright Office Amid Tech Regulation Storm

President Trump has fired the US Copyright Office director amidst Meta's antitrust trial and AI copyright lawsuits, highlighting the increasing political dimension of tech regulation and its impact on companies like Meta.

NHS Health Data Used for AI Model Training, Causing Privacy Concerns

An AI model trained on 57 million NHS records sparks privacy debate. Developers claim potential benefits, but critics cite re-identification risks and lack of clear opt-out for patients.

OpenAI Eyes IPO, Negotiates with Microsoft After Restructuring U-Turn

OpenAI's decision to maintain nonprofit control sparks crucial negotiations with Microsoft, impacting their multi-billion dollar AI partnership, future funding, and IPO plans.

SoundCloud AI Policy Update Sparks Creator Backlash Over Data

SoundCloud faces artist backlash after a Terms of Service update suggests user music could train AI models; the company clarifies its stance, stating it doesn't use content for generative AI but for platform features, amidst growing creator concerns over data rights.

Zencoder Launches Zen Agents for AI Coding with Custom & Open-Source Tools

Zencoder has launched Zen Agents, a new AI platform empowering developers with customizable coding agents for internal team use and an industry-first open-source marketplace for shared AI tools, aiming to boost productivity and standardize software development.

Google Adds Gemini Live to Google Workspace Accounts

Google has launched Gemini Live for Workspace accounts, introducing interactive voice conversations, screen sharing, and camera integration to its AI assistant, alongside specific data activity policies for enterprise users.

Meta Announces AssetGen 2.0 for Enhanced AI-driven 3D Asset Generation

Meta has unveiled AssetGen 2.0, a significantly advanced AI model designed to generate high-quality 3D assets from text and image prompts, promising to revolutionize content creation for Horizon Worlds and democratize 3D development.

Google Pays $1.375B in Texas Privacy Settlement over Data Tracking and Face Scans

Google has agreed to a historic $1.375 billion settlement with Texas over allegations of illegal location tracking, facial recognition data misuse, and incognito mode deception, marking a significant step in data privacy enforcement.

Google Gemini API Adds Implicit Caching, Promises Usage Cost Reduction up to 75%

Google has launched implicit caching for its Gemini 2.5 API, a new feature that automatically reduces developer costs by up to 75% on repetitive prompt data by reusing common prefixes, making powerful AI models more accessible.

GitHub Copilot Sets OpenAI’s GPT-4.1 Model as Default, Changes Rate Limits

GitHub has upgraded Copilot to use OpenAI's GPT-4.1 as its default model, enhancing AI coding, instruction following, and providing IP indemnification, alongside new VS Code features and updated platform rate limits.

Python Usage Hits Record Heights with a 25.35% Share on the TIOBE Index

Python achieved a historic 25.35% market share in the May 2025 TIOBE Index, its highest ever, establishing an unprecedented lead over other languages, largely driven by its role in AI, data science, and increasing integration into tools like Excel.

Apple is Working on New Chips for AI Servers, Macs, Smart Glasses

Apple is reportedly developing a new generation of custom chips for AI servers (Project Baltra), future Macs (M5, M6, M7, Sotra), and its first smart glasses, signaling a major expansion of its in-house silicon strategy to power upcoming intelligent devices and services.

CoreWeave Seeks $1.5B+ Debt After Downsized IPO, Eyes Refinancing

AI data center provider CoreWeave is pursuing a new debt deal of $1.5 billion or more, including potential high-yield unsecured bonds, to refinance existing liabilities and potentially fund growth, weeks after its March 2025 IPO.

OpenAI CEO Altman Changes Stance on AI Safety, Urges ‘Light-Touch’ AI Rules

OpenAI CEO Sam Altman has signaled a major shift in his approach to AI regulation, now advocating for industry-led standards and warning against stringent government rules he believes could hinder U.S. innovation, a stark contrast to his previous calls for more robust federal oversight.

Alibaba’s New ZeroSearch Framework Slashes Training Costs For Search-Enabled AI by 88%

Alibaba researchers have developed ZeroSearch, a novel AI framework that trains Large Language Models to search via simulation, cutting API costs by 88% and matching or exceeding traditional search engine performance, making advanced AI more accessible.