// news · models · frontier-models · research · 2026-05-05 · source: subquadratic / llm-stats

SubQ 1M-Preview — first commercial subquadratic LLM, 12M token native context

Subquadratic's SubQ 1M-Preview, launched May 5, is the first generally available large language model that drops standard transformer attention entirely. Claimed: ~5x lower cost than frontier transformers, up to 52x faster attention at scale, and a native 12 million token context window, not a sliding-window trick.

The architectural pitch is the obvious one: quadratic attention is the cost ceiling, and several research labs have shipped subquadratic variants (Mamba, RetNet, Hyena, etc.) for years. SubQ is the first one positioned as a frontier-tier product, not a research artifact. The 12M token native context is the headline number; the cost-per-call claim is the actual business case.
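
To put the "cost ceiling" in numbers, here is a rough back-of-envelope sketch. Full self-attention computes one score per token pair, so the count grows as n². The `nlogn_ops` function is only an assumed stand-in for "something subquadratic": the announcement does not state SubQ's actual complexity, and both function names are hypothetical. The ratios it prints are raw operation counts, not wall-clock speedups, and are not meant to reproduce the 52x claim.

```python
import math


def quadratic_pairs(n_tokens: int) -> int:
    """Query-key scores in full self-attention: one per token pair."""
    return n_tokens * n_tokens


def nlogn_ops(n_tokens: int) -> float:
    """Illustrative n * log2(n) cost. ASSUMPTION: a generic subquadratic
    stand-in, not SubQ's published complexity."""
    return n_tokens * math.log2(n_tokens)


# Compare growth at a few context lengths, up to the claimed 12M window.
for n in (8_000, 128_000, 1_000_000, 12_000_000):
    q = quadratic_pairs(n)
    s = nlogn_ops(n)
    print(f"{n:>12,} tokens: n^2 = {q:.2e}, n*log2(n) = {s:.2e}, ratio = {q / s:,.0f}x")
```

At 12 million tokens the quadratic term is 1.44 × 10^14 pairwise scores per attention layer and head, which is why long native context and per-call cost are really the same claim.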

What to watch: whether the quality holds up on long-context retrieval benchmarks (the genuine weakness of most subquadratic variants in prior work) and whether the "frontier" framing survives head-to-head on standard reasoning evals. If it does, the architectural diversity narrative finally has its commercial moment.

LLM-Stats: AI Updates May 2026 →