Alibaba Cloud has announced an 85% price reduction for its visual reasoning AI model, Qwen-VL-Max. The move comes as Chinese tech giants, including ByteDance and Baidu, battle for dominance in the rapidly growing enterprise AI sector.
The reduced pricing, which positions Qwen-VL-Max at 0.003 yuan ($0.00041) per thousand tokens, mirrors ByteDance’s aggressive strategy to lower costs for its visual reasoning AI model launched earlier in December.
Alibaba’s Qwen-VL series encompasses several other advanced models that integrate visual and textual data for tasks such as image captioning, visual question answering, and multimodal content generation. The lineup includes Qwen-VL, Qwen-VL-Chat, Qwen2-VL, and the experimental QVQ-72B-Preview. Qwen2-VL, with its state-of-the-art performance, has excelled in benchmarks like MathVista and DocVQA, often surpassing leading competitors like OpenAI’s GPT-4V and Google’s Gemini Ultra.
With more than 250 generative AI models approved in China this year, the market has grown saturated, prompting companies to adopt innovative pricing and technology strategies to secure market share.
Strategic Pricing as a Consistent Pattern
The December announcement is Alibaba’s third major AI price adjustment in 2024, following a 55% reduction in February for core cloud products and a 97% discount in May for the Qwen AI suite. These moves reflect a consistent focus on affordability, aiming to attract enterprise customers exploring advanced AI tools for business processes and analytics.
By reducing costs, Alibaba seeks to position its AI offerings as indispensable tools for companies navigating the complexities of adopting artificial intelligence. Token-based billing, which charges users for specific AI interactions, has become central to pricing strategies, enabling scalable access to powerful models without prohibitive upfront investments.
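To make the token-based billing arithmetic concrete, the short Python sketch below works from the two figures reported above: 0.003 yuan per thousand tokens and an 85% reduction. It is an illustration only. The function names are hypothetical, a single flat rate is assumed (real price lists often distinguish input and output tokens), and the pre-cut price is derived on the assumption that the 85% applies directly to this per-token rate.

```python
from decimal import Decimal

# Figures reported in the article; everything else here is illustrative.
PRICE_PER_1K_TOKENS_YUAN = Decimal("0.003")  # price after the cut
DISCOUNT = Decimal("0.85")                   # announced reduction


def cost_yuan(tokens: int, price_per_1k: Decimal = PRICE_PER_1K_TOKENS_YUAN) -> Decimal:
    """Cost of processing `tokens` tokens at a flat per-thousand-token rate."""
    return Decimal(tokens) / Decimal(1000) * price_per_1k


def implied_previous_price(current: Decimal, discount: Decimal) -> Decimal:
    """Price before the cut, assuming current = previous * (1 - discount)."""
    return current / (Decimal(1) - discount)


if __name__ == "__main__":
    print("1,000,000 tokens cost (yuan):", cost_yuan(1_000_000))
    print("Implied pre-cut price per 1K tokens (yuan):",
          implied_previous_price(PRICE_PER_1K_TOKENS_YUAN, DISCOUNT))
```

Under these assumptions, a workload of one million tokens costs 3 yuan, and the implied pre-cut rate is 0.02 yuan per thousand tokens.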
Advancing Multimodal AI with QVQ-72B
Earlier this week, Alibaba introduced QVQ-72B, an open-source multimodal AI model that integrates visual and textual reasoning capabilities. This release builds on its predecessor, Qwen2-VL-72B, enhancing functionality for scientific research and advanced analytics.
QVQ-72B scored 70.3 on MMMU, a benchmark designed to evaluate university-level multimodal reasoning, and also performed strongly on MathVista and OlympiadBench. These results place QVQ-72B among the most competitive open-source models in the industry.
QwQ-32B: A Model for Logical Precision
In November, Alibaba introduced QwQ-32B, a model tailored for logical reasoning, coding, and advanced mathematical tasks. Its test-time compute feature allocates additional computational resources during execution, improving accuracy on complex problems. While this slows response times, QwQ-32B’s precision has been praised in benchmark evaluations and enterprise applications.
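The trade-off between extra computation and accuracy can be illustrated with one generic form of test-time compute: sampling several candidate answers and keeping the most common one. The Python sketch below is not a description of how QwQ-32B works internally, which the article does not specify. The function `majority_vote` and the stand-in sampler are hypothetical names used only for illustration.

```python
from collections import Counter
from typing import Callable


def majority_vote(sample_answer: Callable[[], str], num_samples: int) -> str:
    """Call the sampler `num_samples` times and return the most frequent answer.

    More samples mean more compute and slower responses, but a single
    unlucky answer is less likely to decide the result.
    """
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")
    votes = Counter(sample_answer() for _ in range(num_samples))
    answer, _count = votes.most_common(1)[0]
    return answer


if __name__ == "__main__":
    # Stand-in for a model: replays a fixed list of candidate answers.
    candidates = iter(["42", "41", "42", "42", "40"])
    print(majority_vote(lambda: next(candidates), num_samples=5))
```

In this toy run the sampler returns "42" three times out of five, so the vote settles on "42" even though two of the individual samples were wrong.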
The release of QwQ-32B under the Apache 2.0 license reflects Alibaba’s commitment to balancing collaboration and proprietary control. By focusing on reasoning-centric AI, Alibaba competes directly with models like DeepSeek’s R1-Lite-Preview and OpenAI’s o1 model, both of which prioritize logical depth and iterative problem-solving.
China’s generative AI sector has witnessed a rapid proliferation of models, with over 250 offerings approved for public use in 2024 alone. This saturation has fueled intense competition among industry leaders and start-ups, each vying for differentiation through pricing and unique technological features.
DeepSeek, for example, has emphasized transparency with its R1-Lite-Preview model, which uses chain-of-thought reasoning to break problems into incremental steps, enabling users to track its decision-making process. Meanwhile, ByteDance and Alibaba focus on affordability to drive adoption in an increasingly crowded market.