// news · multimodal · 2026-06-17 · source: lushbinary / llm-stats / lovart

Kling v3 leads text-to-video leaderboard at arena score 2031 — China-lab capability lead on video generation widens against Google Veo 3.1 and ByteDance Seedance

Kling v3 tops the text-to-video leaderboard with an arena score of 2031, ahead of LTX-2 Fast (1920) and Seedance 2.0 Fast (1851). The Chinese-lab lead on cinematic motion and multi-shot storyboarding is widening, while Google Veo 3.1 keeps its lead on prompt fidelity. Video-generation capability is settling into two distinct axes, motion style and prompt fidelity, and the split looks structurally stable.

The substantive development is this split into capability axes. Through 2025, video-generation leaderboards ranked models on a single quality axis (overall preference). The H1 2026 leaderboard separates into two clear axes: cinematic motion (Kling 3 / Runway Gen-4) and prompt fidelity (Veo 3.1 / Sora-class). Different use cases (advertising creative, narrative-shot generation, interactive content) now have different default vendors. Through H2 2026, procurement teams evaluating video generation will increasingly need to select vendors per axis rather than look for a single 'best video model'.
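As a rough illustration of what per-axis selection could look like in practice, the sketch below scores candidates on the two axes and weights them by use case. Everything in it is an assumption made for illustration: the candidate names, the 0 to 1 axis scores, and the use-case weights are placeholders, not leaderboard data, and the mapping of advertising creative to the motion axis is our guess rather than something the sources state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    motion: float    # hypothetical 0-1 score on the cinematic-motion axis
    fidelity: float  # hypothetical 0-1 score on the prompt-fidelity axis


def pick(candidates, motion_weight):
    """Return the candidate with the highest axis-weighted score.

    motion_weight is the share of the decision given to cinematic motion;
    the remainder goes to prompt fidelity.
    """
    if not 0.0 <= motion_weight <= 1.0:
        raise ValueError("motion_weight must be between 0 and 1")
    fidelity_weight = 1.0 - motion_weight
    return max(
        candidates,
        key=lambda c: motion_weight * c.motion + fidelity_weight * c.fidelity,
    )


# Placeholder candidates, not real models or real scores.
CANDIDATES = [
    Candidate("motion_leaning_model", motion=0.9, fidelity=0.6),
    Candidate("fidelity_leaning_model", motion=0.6, fidelity=0.9),
]

# Assumed weights per use case (illustrative only).
USE_CASE_WEIGHTS = {
    "advertising_creative": 0.8,  # assumed to lean on cinematic motion
    "narrative_shot": 0.2,        # assumed to lean on prompt fidelity
}

for use_case, weight in USE_CASE_WEIGHTS.items():
    print(use_case, "->", pick(CANDIDATES, weight).name)
```

With these placeholder numbers the two use cases resolve to different candidates, which is the point: once capability splits across axes, no single ranking settles the choice.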

The competitive read, set against Google Veo 3.1's position as the established default for narrative shots, is that H2 2026 video-generation procurement is splitting between a China-lab cinematic-motion segment and a US-lab prompt-fidelity segment. The split is durable because each segment's capability axis aligns with distinct customer use cases, so neither vendor bloc needs to compete on the other's axis. The result is a stable two-bloc competitive structure through H2 2026, not the convergence that most 2025 video-generation analyses predicted.


Sources: Lushbinary, "AI Video Generation 2026: Sora 2 vs Veo 3.1 vs Kling 3.0 Compared" · LLM Stats, "Best AI for Video Generation in 2026" · Lovart AI, "Sora vs Kling vs Veo vs Runway vs Lovart — 2026 AI Video Deathmatch"