Qwen 3.7 Max ties Claude Opus 4.7, Gemini 3.1 Pro, and GPT-5.5 at Intelligence Index 57 — Chinese-lab frontier parity now formally measured
Qwen 3.7 Max scored 57 on Artificial Analysis's Intelligence Index — exactly tied with Anthropic Claude Opus 4.7, Google Gemini 3.1 Pro, and OpenAI GPT-5.5. The independent benchmark consensus is now unambiguous: a Chinese open-weight model occupies the same frontier tier as the three top closed-frontier products from Western labs.
The benchmark consensus is the news because it's no longer contested. Through 2024-2025 the Western frontier-lab framing was that Chinese frontier-model claims were either benchmark-gaming or domain-narrow. Intelligence Index is a composite of multiple capability dimensions (reasoning, coding, math, multimodal, agentic) administered by an independent third party that has no relationship to any of the four labs being measured. Tied at 57 means tied — across the composite, not just on a single sub-benchmark where Chinese labs have historically optimized for press releases.
The structural consequence: every enterprise procurement decision that assumes "closed-frontier is qualitatively ahead of open-weight" now requires explicit justification. The empirical answer is no. The compliance regime around Chinese-lab procurement (export controls, data-residency, vendor due-diligence) becomes the actual constraint, not capability. For Western enterprises with no compliance constraint on Chinese-origin open weights, Qwen 3.7 Max at near-zero per-token cost (self-hostable, Apache 2.0) is now a real procurement option at the same quality tier as paying Anthropic $15-30 per million tokens.
FelloAI — Best AI Models May 2026 ChatGPT Claude Gemini Grok → · Medium / Sanjeev Patel — Best AI Models in 2026 GPT-5.5 vs Claude vs Gemini → · Codersera — Best Open-Source LLM May 2026 Llama Qwen DeepSeek Gemma Mistral →