Saturday, May 30, 2026

Xiaomi Slashes AI API Prices by 99% in China Price War

Valyrian News Network

Xiaomi Corp. has cut API prices for its flagship MiMo-V2.5 AI models by up to 99%, permanently matching the rates of local rival DeepSeek and threatening to reignite a price war in China’s hyper-competitive artificial intelligence sector. The cuts took effect on May 27, 2026, one day after Xiaomi reported its weakest quarterly results in six quarters. Caixin Global described the move as defying “a broader industry shift toward monetization and price hikes by global leaders like OpenAI.”

The Price Cut in Numbers

Xiaomi’s new pricing for its MiMo-V2.5 series eliminates the previous tiered structure based on input length, applying the same rates whether developers send 5,000 tokens or 950,000 tokens. The MiMo-V2.5 model now costs ¥0.02 per million tokens for cached input, ¥1 for uncached input, and ¥2 for output. The more powerful MiMo-V2.5-Pro is priced at ¥0.025 for cached input, ¥3 for uncached input, and ¥6 for output tokens.
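To make the flat rates concrete, the following minimal sketch estimates the cost of a single request from the per-million-token prices quoted above. The function name, the dictionary keys, and the token counts in the usage note are illustrative assumptions, and the model names are used only as labels, not as confirmed API identifiers.

```python
from decimal import Decimal

# Prices in yuan per million tokens, as quoted in the article.
# The dictionary keys are illustrative labels, not official API fields.
PRICES = {
    "MiMo-V2.5": {
        "cached_input": Decimal("0.02"),
        "uncached_input": Decimal("1"),
        "output": Decimal("2"),
    },
    "MiMo-V2.5-Pro": {
        "cached_input": Decimal("0.025"),
        "uncached_input": Decimal("3"),
        "output": Decimal("6"),
    },
}

MILLION = Decimal(1_000_000)


def request_cost(model, cached_input, uncached_input, output):
    """Return the estimated cost in yuan for one request.

    Token counts are plain integers. Because the new pricing is flat,
    the same rates apply regardless of total input length.
    """
    rates = PRICES[model]
    total = (
        cached_input * rates["cached_input"]
        + uncached_input * rates["uncached_input"]
        + output * rates["output"]
    )
    return total / MILLION


if __name__ == "__main__":
    # Hypothetical long-context request: 950,000 input tokens,
    # of which 900,000 hit the cache, plus 2,000 output tokens.
    for model in PRICES:
        print(model, request_cost(model, 900_000, 50_000, 2_000))
```

Under these assumed token counts, the request would cost about ¥0.072 on MiMo-V2.5 and about ¥0.18 on MiMo-V2.5-Pro, with most of the bill coming from the uncached input.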

According to the Xiaomi MiMo Open Platform, the new pricing cuts rates by up to 99% from their original levels. The company stated: “Enabling more people to use better models — this is MiMo’s unwavering mission.”

The Competitive Trigger

The price cuts directly target DeepSeek, which days earlier announced that its 75% promotional discount on the V4-Pro model would become permanent starting June 1. As SCMP reported, DeepSeek’s V4-Pro now costs just $0.87 per million output tokens, roughly one thirty-fourth of the price of OpenAI’s GPT-5.5, making it the global leader in cost-efficiency.

Xiaomi’s move essentially matches DeepSeek’s pricing floor. The Sohu report by Zhidx noted: “DeepSeek just announced permanent API price cuts, and Xiaomi followed,” highlighting how the two companies are now locked in a direct pricing confrontation.

A Broader Price War Context

China’s large language model market has seen intense price competition since 2024. According to analysis from Apidog, Chinese labs cut LLM API prices six times in the first half of 2026 alone, with three of those cuts declared permanent. Alibaba offered half-price discounts for its Qwen3.7-Max model on May 26, while ByteDance and Tencent have also adjusted their pricing structures.

Interestingly, not all players are racing to the bottom. Zhipu AI raised its API prices by 83% in Q1 2026, betting on quality over cost leadership. This divergence creates a fragmented market where developers must carefully choose between cost, capability, and context window size.

The Technical Engine Behind the Cuts

Xiaomi’s ability to slash prices stems from significant inference system optimizations. The company’s technical team implemented Sliding Window Attention (SWA) based on SGLang HiCache, cutting KV cache data transfer across GPU memory, CPU memory, and SSD to nearly one-seventh of its pre-optimization level while increasing the number of cacheable tokens nearly fivefold.
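As a rough illustration of why sliding-window attention can shrink the KV cache, the sketch below estimates cache size with and without a window. It is a back-of-the-envelope model only: the layer count, head count, head dimension, and window size are hypothetical values rather than MiMo-V2.5 parameters, it assumes every layer uses the same window, and it does not use or represent the SGLang HiCache API or reproduce Xiaomi’s reported figures.

```python
def kv_cache_bytes(seq_len, num_layers, num_kv_heads, head_dim,
                   bytes_per_value=2, window=None):
    """Estimate KV cache size in bytes for one sequence.

    window=None models full attention, where every past token is kept.
    With a sliding window, each layer keeps at most `window` tokens.
    """
    tokens_kept = seq_len if window is None else min(seq_len, window)
    # The factor of 2 accounts for storing both keys and values.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * tokens_kept


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    shape = dict(num_layers=32, num_kv_heads=8, head_dim=128)

    full = kv_cache_bytes(seq_len=131_072, **shape)
    windowed = kv_cache_bytes(seq_len=131_072, window=4_096, **shape)

    gib = 2 ** 30
    print(f"full attention:  {full / gib:.2f} GiB")
    print(f"sliding window:  {windowed / gib:.2f} GiB")
    print(f"reduction:       {full // windowed}x")
```

A smaller per-sequence cache means less data to move between GPU memory, CPU memory, and SSD, and more sequences that fit in a fixed cache budget, which is the general direction of the gains described above.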

These optimizations matter because they could make the price cuts sustainable, a key question given Xiaomi’s financial position.

Financial Pressures and Strategic Calculus

The price cuts arrived at a precarious moment for Xiaomi. On May 26, the company reported Q1 2026 revenue of ¥99.1 billion (down 10.9% YoY) — the first time below ¥100 billion in six quarters — and adjusted net profit of ¥6.07 billion (down 43.1% YoY). As The Tech Portal reported, smartphone shipments fell 19.2% to 33.8 million units, while memory chip prices surged due to AI infrastructure demand.

Yet Xiaomi is doubling down on AI investment. R&D spending reached ¥9 billion in Q1 (up 33.4% YoY), and CEO Lei Jun has announced a ¥160 billion AI investment plan. Xiaomi President Lu Weibing stated on the earnings call that “Xiaomi’s major direction in the next five years is to reconstruct the entire ecosystem of vehicles, homes, and people with AI.”

This creates a striking contradiction: Xiaomi is cutting AI prices aggressively while its core business struggles, suggesting the company views AI dominance as existential rather than optional.

Token Plan Overhaul and Developer Incentives

Beyond API pricing, Xiaomi revamped its Token Plan subscription system, increasing credits by 5-8x without raising prices. The company’s “100 Trillion Token Creator Incentive Program” distributed all tokens ahead of schedule by May 26, and all existing subscriber quotas were fully reset on May 27.

The market responded immediately: daily token consumption for MiMo-V2.5-Pro surged 111% on OpenRouter following the announcement.

Global Divergence in AI Pricing

Perhaps the most significant implication is the growing divergence between Chinese and Western AI markets. While OpenAI, Anthropic, and Google have been raising prices and pushing toward monetization, Chinese companies are racing in the opposite direction. This creates a two-tier global AI market where developers in China can access frontier-level models at a fraction of the cost faced by their Western counterparts.

What to Watch

The critical question is sustainability. Xiaomi’s AI head Luo Fuli had warned against blind price wars just weeks before the cuts — yet the company has now joined one. With Xiaomi fighting simultaneous price wars in smartphones, EVs, and AI, the strain on its balance sheet is mounting.

Industry analysts expect Alibaba to respond with further Qwen price adjustments, and Moonshot’s Kimi may simplify its tiered pricing structure. The price floor in China’s AI market hasn’t finished falling — and the companies with the deepest pockets and most efficient inference infrastructure will determine where it lands.