Three humanoid doctrines: Apptronik, Figure, and 1X are running different bets on what comes first
Apptronik picks factories. Figure picks a gradient that starts in controlled environments and ends in the home. 1X picks consumer-first and learns from the field. The doctrines are diverging fast enough that the next 18 months will pick a winner, or two.
The three positions
Three companies, three deployment doctrines:
- Apptronik: Apollo deployed at Mercedes-Benz and GXO Logistics; $935M raised at a $5.5B valuation; deployment is factory-first, with no near-term consumer ambition.
- Figure: Figure 03 at BMW's Spartanburg factory; Helix 02 home pilots targeting unseen-environment generalization by end of 2026; deployment follows a factory-to-home gradient.
- 1X: NEO consumer humanoid shipping to early adopters at $20K or $499/mo; the only humanoid with a real longitudinal home-environment dataset; deployment is consumer-first.
Tesla is the wildcard
Tesla Optimus Gen 3 is supposed to ship from Fremont this summer with both factory and consumer ambitions. That is the wider bet: run the same robot in factory and consumer channels simultaneously, scaling production through both. The other three companies each picked one channel to start; Tesla is the only one trying both at once.
What each doctrine optimizes
- Apptronik's factory-first. Optimizes for unit economics in a bounded environment. The wager is that industrial deployment scales faster than home deployment because the environment is predictable and the customer is sophisticated.
- Figure's gradient. Optimizes for autonomy-stack methodology. The wager is that mastering factory autonomy produces a transferable stack that home deployment can absorb later — and the brand equity of "works in BMW's factories" helps the home launch when it arrives.
- 1X's consumer-first. Optimizes for data. The wager is that home-environment data is the moat — every Apptronik or Figure unit produces only factory-shaped data; every NEO unit produces home-shaped data that compounds faster.
The methodology paper that ties this together
The interleaved vision-language reasoning paper from May 2026 is the under-noticed technical input. It reports that mixed-modality reasoning traces produce ~30% better out-of-distribution generalization on long-horizon manipulation. Whichever humanoid program adopts the methodology fastest stands to get the largest near-term improvement on the "works outside the training environment" benchmark, which is the actual capability that all three doctrines are racing on.