Defense technology company building autonomous systems, drones, and AI-powered military hardware.
AI-first humanoid robotics company developing general-purpose bipedal robots for commercial labor and the home.
Pittsburgh-based robotics AI company building the 'Skild Brain,' a single omni-bodied foundation model meant to control any robot for any task.
Defense AI company building Hivemind, an autonomous pilot for drones and aircraft that operates without GPS or communications.
German cognitive-robotics company building the 4NE-1 humanoid and the MAiRA cognitive cobot, positioning itself as Europe's answer to Figure, Tesla Optimus, and Unitree.
San Francisco lab building general-purpose vision-language-action foundation models that can drive arbitrary robot bodies, best known for the pi-series models.
World's second-largest industrial robot maker, currently a division of ABB Ltd and being divested to SoftBank Group.
Operator of the world's largest autonomous logistics network, delivering medical supplies and consumer goods via long-range drones.
Austin-based humanoid robotics company developing Apollo, a general-purpose commercial humanoid deployed with Mercedes-Benz, GXO, and Jabil.
Chinese autonomous mobile robot (AMR) leader and the world's first publicly listed pure-play AMR warehouse robotics company.
Shenzhen-based service robotics leader shipping commercial delivery, cleaning, and industrial logistics robots to more than 80 countries.
Operator of the world's largest sidewalk delivery robot fleet, serving university campuses, retail, and residential neighborhoods.
Swiss maker of autonomous quadruped inspection robots for oil and gas, chemicals, power, and heavy industry.
Redwood City-based physical AI company building full-stack warehouse robotics for parcel, truck-loading, and container-handling applications, deployed at FedEx, UPS, and GXO.
Santa Clara-based mobile manipulator cobot company building Proxie, an AI-driven collaborative robot for logistics, healthcare, and manufacturing.
Japan's dominant industrial automation incumbent, producing CNC controls, industrial robots, and factory automation systems.
Publicly traded AI-powered warehouse automation company whose end-to-end robotic systems handle a large share of Walmart's US regional distribution.
Oregon-based bipedal humanoid robotics company building Digit, the first humanoid in commercial warehouse production deployment.
Shenzhen-based humanoid robot maker behind the Walker S industrial humanoid and a large AI-education business; first humanoid robotics company to list on HKEX.
Publicly traded operator of autonomous sidewalk delivery robots serving Uber Eats and restaurants across U.S. cities.
Leading Western autonomous mobile robot vendor for e-commerce and retail fulfillment, surpassing 4 billion robot-assisted picks in 2025.
Hangzhou-based robotics company that pioneered low-cost quadrupeds and humanoids, filing for a Shanghai STAR Board IPO at a reported $7B valuation.
Columbus, Ohio-based maker of AI-driven robotic welding cells that scan, plan, and weld unique parts without programming.
Vancouver-based humanoid robotics company building Phoenix, a general-purpose humanoid with a focus on dexterous hands and cognitive AI.
Vision-first autonomous home-cleaning robot built without the cloud.
New York-based cobot maker offering a US-built six-axis robotic arm aimed at small and mid-sized manufacturers.
Maker of Servi, an autonomous food-running and bussing robot for restaurants; majority-acquired by LG Electronics in 2025.
Shanghai-based robotics maker that pivoted from a leading rehab exoskeleton business into the GR-series general-purpose humanoid.
Norwegian-founded humanoid robotics company building NEO, a consumer bipedal home robot backed by OpenAI and EQT Ventures.
Maker of ElliQ, an AI-powered companion robot for older adults.
New York-based wheeled humanoid-class mobile manipulator built for warehouse and factory work, teleop-supervised and priced for rapid deployment.
Maker of Moxi, a socially-intelligent mobile manipulation robot that runs supply, medication, and lab errands in hospitals.
Assistive home robots that help older adults and people with mobility challenges live independently.
AI-powered warehouse robotics company specializing in robotic picking and order fulfillment, taken private by SoftBank in 2023.
Danish cobot pioneer and global category leader, a wholly-owned subsidiary of Teradyne since 2015.
Pioneering robotics company building advanced mobile, dexterous robots including the Atlas humanoid, Spot quadruped, and Stretch warehouse system.
Tesla's in-house general-purpose humanoid robot program, leveraging Tesla's AI, compute, and manufacturing infrastructure to target mass-market pricing.
World's largest consumer and commercial drone manufacturer, based in Shenzhen.
Norwegian pioneer of cube-based automated storage and retrieval systems (AS/RS), with 1,900+ deployments across 65 countries.
Maker of the Roomba robotic vacuum; pioneer of consumer home robotics, now in Chapter 11 restructuring.
Embodied Inc. (Moxie) is routed directly to the Graveyard on the strength of its December 2024 shutdown — a lead investor pulled out of a late-stage round, the company ceased operations, and cloud-dependent Moxie units were bricked. Remains visible as a sector case study rather than an active tracked company.
The inaugural Robotics roster was assembled from editorial research across humanoid, warehouse, drone, cobot, service, defense, industrial, consumer, and healthcare robotics. Automated signal tracking begins this week; the first quarterly review, where promotions and demotions happen based on capital, news volume, hiring velocity, valuation, age, and exec stability, is scheduled for May 15, 2026, anchoring a recurring Feb/May/Aug/Nov 15 cadence aligned with US earnings-season closes.
Major industry dates · soonest first
Robots must perform diverse manipulation tasks across novel objects, materials, and layouts without task-specific retraining. Current systems excel in controlled domains but fail on deformable objects, cluttered scenes, and novel task combinations. Solving this unlocks deployment across logistics, construction, and service industries where adaptability is essential.
Google's RT-2, Physical Intelligence's π0, and NVIDIA's GR00T N1.6 represent a shift toward unified policies that condition on language and visual input to predict continuous actions via diffusion or flow matching. These models train on internet-scale vision data plus hundreds of thousands of teleoperated trajectories from diverse robots (Open X-Embodiment aggregates 22 platforms). The approach shows promise: Physical Intelligence demonstrated π0 folding clothes and sorting items with few demonstrations. However, scaling remains contested—VLAs still struggle with contact-rich manipulation (plugging cables, cutting vegetables) where force feedback matters more than vision. Inference latency (50-73ms per action chunk) limits real-time control. Data efficiency gains are real but modest; models still require thousands of task-specific demonstrations to match hand-engineered baselines on novel objects.
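The latency point can be made concrete with a small budget check. This is a minimal sketch, not any vendor's control stack: the 50 Hz control rate and the chunk lengths are assumptions chosen for illustration, and only the 73 ms figure comes from the range quoted above. It compares how much wall-clock time one action chunk covers with how long the model needs to produce the next chunk.

```python
def chunk_budget(chunk_len: int, control_hz: float, inference_ms: float) -> dict:
    """Compare the time one action chunk covers with the time to compute the next."""
    # Wall-clock time spent executing a chunk of `chunk_len` actions.
    chunk_ms = 1000.0 * chunk_len / control_hz
    return {
        "chunk_ms": chunk_ms,
        "slack_ms": chunk_ms - inference_ms,
        # The policy keeps up only if the next chunk is ready before this one ends.
        "keeps_up": inference_ms <= chunk_ms,
    }


if __name__ == "__main__":
    # Assumed 50 Hz control loop; 73 ms is the upper latency figure from the text.
    for chunk_len in (1, 4, 16):
        print(chunk_len, chunk_budget(chunk_len, control_hz=50.0, inference_ms=73.0))
```

Under these assumptions a single-action policy cannot keep up, while longer chunks leave slack at the price of reacting more slowly to new observations.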
Generative latent models (diffusion-based video prediction, autoregressive dynamics) aim to let robots plan by predicting consequences of actions before executing them. NVIDIA's world foundation model platforms (Alpamayo for AVs, PAN for long-horizon prediction) and research on generative learning claim to reduce reliance on extensive real-world data by simulating plausible futures. The pitch: train once on diverse environments, then query the model to plan novel sequences. In practice, these models excel at short-horizon (1–5 frame) prediction but struggle with long-horizon consistency and error accumulation. Computational cost remains high (diffusion models require many sampling steps). Deployment on robot hardware faces real-time constraints; even with GPU inference, latency can exceed control loop periods for dynamic manipulation.
AutoMate (NVIDIA/USC) trained specialist and generalist policies on 100 assembly task geometries using RL and imitation learning, achieving 84.5% real-world success on unseen assemblies via zero-shot sim-to-real transfer. The framework stages learning: task-specific RL in sim, then curriculum-based fine-tuning on diverse geometries. Benchmarks show this works for rigid assembly (peg-in-hole insertion), but only with careful system identification and domain randomization tuned per robot morphology. The approach does not generalize across robot types (policies trained on one arm fail on another). Reward shaping remains manual and brittle; every task class requires domain expert input on success metrics. Scalability is limited by the combinatorial explosion of task-geometry pairs.
Recent work (the RH20T dataset from Shanghai Jiao Tong University, contact-rich simulation in NVIDIA Isaac) emphasizes that vision alone fails for nuanced force-dependent tasks such as wiping surfaces, inserting without collision, and adjusting grip on deformable objects. Multi-modal sensing (vision + force/torque + audio) improves generalization on contact-heavy tasks; the RH20T benchmark (110,000+ sequences with force sensing) shows RL policies trained with tactile feedback outperform vision-only baselines. However, tactile sensors remain expensive, fragile, and robot-specific. Generalization across sensor modalities is poor; a policy trained with one tactile array struggles on a different sensor. Field deployment of force-feedback systems in warehouses and construction sites remains rare due to maintenance burden and the need for specialized gripper hardware.
Humanoid and mobile robots operate 2–4 hours on current lithium-ion batteries, limiting deployment to structured shifts and preventing true 24/7 autonomous work. Energy density improvements remain incremental (LFP: 150–200 Wh/kg; high-nickel: 250–300 Wh/kg; solid-state prototype: 400–520 Wh/kg target). Thermal management in compact bipedal form factors is equally critical. Cracking this enables continuous operation, reduces downtime costs, and unlocks home-robotics and field-work applications.
Solid-state batteries replace liquid electrolytes with solid ceramic or polymer phases, targeting 400–520 Wh/kg and improved thermal stability. BTR and Tsinghua University developed silicon–carbon anodes that reduce volume expansion to below 15% while boosting density 15–30%. By 2035, demand for solid-state batteries in humanoids may reach 74 GWh (1,000x increase from 2026). However, manufacturing scale remains unproven; no mass production facility exists. Cycle life tests show degradation after 500–1,000 cycles in lab conditions. Cost per kWh is 2–3× higher than liquid Li-ion. Regulatory approval (UN38.3 transport certification) lags. Tesla Optimus Gen2 (2.3 kWh, high-nickel) demonstrates field viability, but thermal runaway risk under heavy load persists. Solid-state batteries will likely enter humanoid production 2027–2028, not 2026.
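To put the pack figures above in runtime terms, a rough estimate divides pack energy by average power draw. The sketch below uses the 2.3 kWh pack size and the 15–30% density gain quoted above; the 800 W average draw is a hypothetical number picked only for illustration, and real draw varies widely with task and gait.

```python
def runtime_hours(pack_kwh: float, avg_power_w: float) -> float:
    """Runtime if the robot draws `avg_power_w` on average until the pack is empty."""
    return pack_kwh * 1000.0 / avg_power_w


def pack_with_density_gain(pack_kwh: float, gain: float) -> float:
    """Energy of a pack of the same mass after a fractional density gain."""
    return pack_kwh * (1.0 + gain)


if __name__ == "__main__":
    base_kwh = 2.3        # pack size quoted in the text
    avg_power_w = 800.0   # assumption: hypothetical average draw
    print("baseline hours:", round(runtime_hours(base_kwh, avg_power_w), 2))
    for gain in (0.15, 0.30):
        boosted = pack_with_density_gain(base_kwh, gain)
        print(f"+{gain:.0%} density:", round(runtime_hours(boosted, avg_power_w), 2))
```

Even the upper density gain moves runtime by well under two hours in this toy setup, which is consistent with the text treating chemistry improvements as incremental.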
Agility Robotics' Digit and Apptronik Apollo employ modular battery packs that swap in <10 minutes without rebooting, enabling sequential multi-shift operation. A fleet of three robots with two battery sets can maintain 16-hour site uptime. TrendForce projects this as the dominant near-term strategy (2026–2028) because it sidesteps battery chemistry limitations. Trade-offs: capital cost (extra batteries), facility infrastructure (docking stations, inventory management), and real-estate footprint. Early deployments (GXO Logistics, Amazon) use this model. Scalability hinges on standardized connectors and charging protocols—currently vendor-specific. Robotics-as-a-service (RaaS) models favor fleet swaps; owned-robot customers find multi-unit maintenance burdensome. This approach buys time (2–3 years) while solid-state R&D matures.
Research from Chinese universities (Tsinghua, Shanghai Jiao Tong) and commercial systems (Apptronik, Boston Dynamics Atlas) employ active liquid cooling loops or phase-change material (PCM) jackets to dissipate heat from actuators, compute, and batteries. Passive cooling (thermal spreader materials, graphene composites) works for low-to-medium loads. Liquid cooling enables sustained high-power operation (lifting, sprinting) without throttling; active temperature control can extend motor lifespan 2–3×. Cost: $2,000–$5,000 per robot for plumbing and pumps. Complexity adds failure modes (clogged lines, pump failure). Humanoid-specific challenge: liquid cooling must route through moving joints without kinetic friction losses. Boston Dynamics' hydraulic actuators inherently dissipate heat; electric servo alternatives require retrofit cooling. AI-based predictive cooling (monitoring motor temps and preemptively ramping coolant flow) shows promise but requires field validation.
Direct-drive frameless torque motors (Mosrac U-series, Maxon EC-90) achieve 85–92% electrical-to-mechanical efficiency, vs. 60–75% for geared servo motors. Research on bioinspired passive-compliance gaits (Berkeley Humanoid, Duke Humanoid) reduces peak torque demand by exploiting leg-spring mechanics, lowering energy per stride 30–50% vs. stiff designs. NVIDIA GR00T and generative motion priors (StyleLoco, natural humanoid gaits) use RL to discover energy-efficient movement patterns for specific morphologies. Field data from Unitree H1 deployments show that gait optimization (stride length, cadence, center-of-mass trajectory) cuts energy consumption more than incremental battery upgrades. Limitations: efficiency gains plateau around 85–90% of theoretical minimum; joint friction is hard to eliminate. Actuator cost trades off against efficiency (high-efficiency motors cost 2× geared alternatives). Industrial adoption favors proven actuators (ABB, FANUC servos) over experimental designs.
Policies trained in simulation often fail in the real world due to unmodeled physics (friction, damping, contact), sensor noise, and actuator delays. Domain randomization masks rather than solves this. Recent breakthroughs (system identification, diffusion-based domain adaptation, latent-space alignment) show promise but remain task-specific. Closing this gap at scale would accelerate training cycles from months to weeks and enable rapid iteration across robot variants.
Tsinghua and Stanford research (2025) on SimpleFlight and zero-shot RL shows that accurate system identification (measuring mass, inertia, motor time constants, thrust coefficients on real hardware) often outperforms domain randomization. The finding: some parameters (measurable properties like mass) should use SysID, not randomization; parameters with high sensitivity (thrust coefficient in quadrotors) need careful randomization ranges, not uniform bounds. NVIDIA's Isaac Sim couples high-fidelity physics with automatic parameter estimation from telemetry logs. This hybrid approach reduces sim-to-real gap from 20–30% (vision-only policies) to 5–10% (vision + proprioception + learned dynamics). Cost: 1–2 hours of on-hardware identification per robot instance. Scalability challenge: every robot morphology variant (arm length, joint compliance, sensor mounting) requires re-identification. Works well for flight and wheeled robots; bipedal systems with contact-rich locomotion remain harder.
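The split described above (measure what can be measured, randomize only what is sensitive or uncertain) can be sketched as a per-parameter sampling config. This is an illustrative sketch, not the SimpleFlight or Isaac Sim API: the `Param` class, the parameter names, and all numeric values are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict


@dataclass
class Param:
    nominal: float          # value measured on hardware, or a best guess
    rel_range: float = 0.0  # 0.0 = trust the measurement; >0 = randomize +/- this fraction


def sample_sim_params(params: Dict[str, Param], rng: random.Random) -> Dict[str, float]:
    """Draw one set of simulator parameters for a training episode."""
    sampled = {}
    for name, p in params.items():
        # Identified parameters (rel_range == 0) stay fixed at their measured value.
        scale = 1.0 + rng.uniform(-p.rel_range, p.rel_range)
        sampled[name] = p.nominal * scale
    return sampled


# Hypothetical quadrotor: mass is weighed directly, thrust coefficient is only
# roughly known, so it alone gets a narrow randomization band.
QUADROTOR = {
    "mass_kg": Param(nominal=0.85),
    "thrust_coeff": Param(nominal=1.2e-6, rel_range=0.10),
}

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_sim_params(QUADROTOR, rng))
```

Keeping the range per parameter makes the design choice explicit: widening every band uniformly is exactly the blanket randomization the paragraph argues against.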
Recent work (Samak et al., June 2025; Gao et al., March 2025) uses diffusion models to transform simulated perception streams into realistic images, or to learn a shared latent space where sim and real distributions align. Conditional generative models (e.g., trained on paired sim/real images) can augment simulated data with domain-specific texture, lighting, and occlusion patterns before feeding to policy networks. Performance: 40%+ improvement in sim-to-real gap on autonomous driving tasks (Waymo simulation to real roads). Limitations: requires paired training data (expensive to collect) or transfer from similar domains (e.g., driving → manipulation doesn't work). Latent-space approaches assume the learned embedding captures task-relevant features; misspecification leads to failure. Computational cost: latent diffusion inference adds 20–50ms per observation. Scalability depends on amortizing the cost across many deployments.
NVIDIA Isaac and MuJoCo contact solvers now handle complex contact dynamics (rolling, sliding, sticking) more accurately, reducing simulation errors in assembly and manipulation. Tactile sensors (simulated via contact geometry and normal forces) provide ground-truth feedback that visual-only policies lack. Curriculum learning (start with simple geometric contacts, progress to deformable surfaces) improves transfer. Published results: AutoMate achieved 84.5% zero-shot success on unseen assemblies; online iterative learning (IPP method, Chen et al. 2025) refines policies after a few real-world rollouts without massive re-simulation. Trade-off: high-fidelity contact simulation is 10–100× slower than simplified models, limiting training speed. Real-world contact remains hard to predict (fabric crumpling, adhesion, micro-slip); purely simulation-trained policies for soft-object manipulation still require extensive real-world adaptation.
Foundation models (Google's world model, OpenAI's internal systems) learn to predict future observations conditioned on language commands and action sequences. The vision: train a unified model on internet-scale video + robot trajectories, then use it as a learned simulator for planning. Recent benchmarks (EWM-Bench, VideoMix22M) show these models excel at short-horizon prediction (1–10 frames) but struggle with long-horizon consistency (error accumulation beyond 30–50 frames). Performance gap widens for contact-dependent tasks where visual prediction alone is insufficient. Inference cost is high (diffusion requires 20–50 sampling steps). Deployment on real robots requires on-device inference; most world models are too large (>1B parameters). This approach is promising for high-level planning and learning from demonstrations but not yet competitive with task-specific simulators for low-level control.
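One way to see why short horizons look fine while long rollouts fall apart is a toy compounding model. It is a deliberate simplification and not how the cited benchmarks score models: it assumes each predicted frame independently stays usable with a fixed probability, and the 0.98 value is hypothetical.

```python
import math


def rollout_quality(per_frame: float, frames: int) -> float:
    """Probability that every frame in an n-frame rollout is still usable."""
    return per_frame ** frames


def max_horizon(per_frame: float, floor: float) -> int:
    """Longest rollout whose quality stays at or above `floor`."""
    return math.floor(math.log(floor) / math.log(per_frame))


if __name__ == "__main__":
    p = 0.98  # assumption: hypothetical per-frame reliability
    for n in (1, 10, 30, 50):
        print(n, round(rollout_quality(p, n), 3))
    print("frames until quality drops below 0.5:", max_horizon(p, 0.5))
```

A per-frame figure that sounds high still decays geometrically, which is the error-accumulation pattern the paragraph describes beyond a few dozen frames.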
Bipedal robots must walk stably on uneven terrain, recover from disturbances, and execute dynamic behaviors (jumping, climbing stairs) while managing a high center of mass and small support region. Recent deployments show progress in structured environments, but unstructured outdoor terrain, mud, sand, and ice remain challenging. Active stability requires constant power; power loss causes collapse. Solving this unlocks field deployment and human-like mobility in unpredictable environments.
Boston Dynamics' Atlas and similar platforms combine learned policies (trained via RL in simulation) with model-predictive control (MPC) for real-time balance adjustment. RL discovers diverse locomotion modes; MPC tracks desired trajectories and handles perturbations. Recent papers (2025) on hierarchical whole-body control (HWC-Loco) and symmetry-aware RL show that jointly optimizing locomotion and manipulation (e.g., walking while pushing) is feasible. Field results: Atlas can recover from heavy pushes, navigate rough terrain, and execute acrobatic moves (backflips, etc.). Unitree H1 achieves world-record walking speeds (1.5+ m/s) via learned gaits. Trade-off: these systems require 90+ hours of simulation training and careful system identification. Terrain adaptation remains hand-tuned (different controllers for asphalt vs. sand). Generalization to never-before-seen surfaces works moderately well due to domain randomization, but edge cases (ice, water, steep slopes) still cause failures. Commercial viability limited to structured sites (warehouses, factories).
The Berkeley and Duke humanoids use variable-stiffness series elastic actuators and passive leg springs to reduce energy consumption and improve stability on uneven terrain. Gaits discovered via RL on these morphologies are inherently more robust to perturbations because the passive mechanics absorb shocks. Energy efficiency: 30–50% lower than stiff designs. Cost: variable-stiffness actuators add $10,000+ per leg. Manufacturing complexity increases; field repair requires specialized tools. Boston Dynamics' hydraulic actuators also benefit from inherent compliance. Electric servo alternatives (used by Figure, Unitree) sacrifice some passive stability for control responsiveness. Recent progress (2025): learned gaits on compliant systems achieve state-of-the-art energy efficiency and robustness in simulation. Real-world transfer less proven due to actuator wear (springs degrade) and maintenance burden.
Cameras and LiDAR feed terrain type (concrete, grass, sand, slope angle) to a controller selector that switches between pre-trained gait policies. Learning systems (LiPS, Distillation-PPO) use visual features to predict stability margins and adjust step length, cadence, and hip height accordingly. Tested on Unitree robots: vision-guided gaits achieve 20–30% better stability on varied terrain vs. fixed gaits. Limitations: classification errors (misidentifying mud as grass) cause failures. Latency matters (100+ ms perception delay can exceed balance-correction window in fast walking). Occlusion and lighting changes degrade performance. Outdoor GPS for slope estimation helps but adds cost. Most deployed systems still rely on operator override or canned controllers for hazardous terrain. Real-world deployment remains restricted to known routes with limited environmental variation.
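The controller-selector pattern above can be sketched as a lookup guarded by two checks, one for classifier confidence and one for perception latency. This is a minimal sketch under assumptions: the gait names, the 0.8 confidence threshold, and the 100 ms staleness limit are hypothetical and not taken from LiPS, Distillation-PPO, or any Unitree software.

```python
from dataclasses import dataclass


@dataclass
class TerrainEstimate:
    label: str         # e.g. "concrete", "grass", "sand"
    confidence: float  # classifier confidence in [0, 1]
    age_ms: float      # time since the underlying sensor frame was captured


# Hypothetical mapping from terrain class to a pre-trained gait policy name.
GAIT_BY_TERRAIN = {
    "concrete": "fast_walk",
    "grass": "cautious_walk",
    "sand": "short_step",
}
FALLBACK_GAIT = "conservative"


def select_gait(est: TerrainEstimate, min_conf: float = 0.8,
                max_age_ms: float = 100.0) -> str:
    """Pick a gait policy, falling back when perception is stale or unsure."""
    if est.age_ms > max_age_ms or est.confidence < min_conf:
        return FALLBACK_GAIT
    # Unknown terrain labels also map to the conservative gait.
    return GAIT_BY_TERRAIN.get(est.label, FALLBACK_GAIT)


if __name__ == "__main__":
    print(select_gait(TerrainEstimate("grass", 0.93, 40.0)))
    print(select_gait(TerrainEstimate("grass", 0.55, 40.0)))
    print(select_gait(TerrainEstimate("sand", 0.95, 140.0)))
```

The fallback does not fix a confident misclassification (mud labeled as grass), which is why the text notes that deployed systems still lean on operator override.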
Recent papers (STATE-NAV, Thinking in 360) address humanoid navigation in human environments: avoiding people, reading social cues, maintaining safe distance. RL with vision-language models learns implicit behavior rules from video demonstrations. Deployed systems (Mercedes with Apptronik Apollo, BMW with Figure) operate in closed factory sections or cordoned-off warehouses. As of 2025, no humanoid has been certified to walk freely among untrained humans (e.g., in a shopping mall). Regulatory approval (functional safety per ISO 13849 or ISO 10218) requires extensive real-world testing and failure documentation. The challenge: even small collision risks are unacceptable for public spaces. Deployment timelines: 3–5 years for human-presence pilot projects in structured settings (airports, malls during off-hours). Mass market open-space humanoids are 5–10+ years away.
Humanoid robot manufacturing costs range $30,000–$150,000 per unit at current volumes (thousands/year). Achieving $20,000–$30,000 target prices requires 10–50× production scale, standardized supply chains, and design-for-manufacturability breakthroughs. Current bottlenecks: custom actuators, hand assembly, low-volume tooling. Solving this unlocks the addressable market (logistics, hospitality, construction) and shifts economics from niche to mainstream.
Tesla's Fremont factory conversion to 1 million Optimus units annually (announced 2026), Boston Dynamics' partnership with Hyundai Mobis on custom actuators, and Unitree's integrated supply chain demonstrate cost leverage through backward integration. Tesla designs its own servos, integrates battery assembly, and leverages automotive manufacturing expertise. Unitree (5,500+ units shipped in 2025) uses commodity brushless motors with custom firmware. Result: Tesla targets $20,000 retail price; Unitree H1 is $90,000. Trade-off: vertical integration requires massive capital ($500M–$1B for a gigafactory). Competitors without in-house fabrication (Figure, Agility) outsource and face higher variable costs. Supply-chain risk: if a single motor supplier has a shortage, production halts. Long-term scaling favors vertically integrated players (Tesla, Unitree, UBTECH). Mid-market entrants (Boston Dynamics, Figure) must partner or invest heavily.
Initiatives to use commodity sensors (smartphone cameras, automotive LiDAR), standard communication protocols (CAN, Ethernet), and interchangeable grippers reduce custom engineering per unit. ANSI and open-source standards (ROS, Isaac ROS) enable plug-and-play assembly. Early adopters report 20–30% cost reductions by sourcing standard actuators and controllers. Standardization enables smaller contract manufacturers (Jabil, Quanta Services) to scale production. Barriers: humanoid robotics is still early-stage; no de-facto standard motor sizes, interfaces, or firmware yet exist. Each company optimizes proprietary architectures. Convergence around standards (e.g., ISO mechanical interfaces for end-effectors, DIN connectors for power) will take 3–5 years. First-mover advantage goes to whoever sets the standard; losers forced to retool.
Jabil (announced partnership with Apptronik, 2025) offers to manufacture Apollo robots while also buying robots for internal logistics—a flywheel model. Large electronics contract manufacturers (Hon Hai, Pegatron, Flex) are evaluating robotics lines. Scaling to 100,000 units/year requires factories with flexible assembly lines, supply-chain visibility, and quality control. Cost structure: labor (currently 20–30% of BOM in Asia, 30–40% in US), materials (motors, batteries, sensors: 40–50% of BOM), overhead, and margin. Outsourcing to Taiwan or Vietnam (lower labor) cuts costs 15–25% vs. US assembly. Regulatory/political headwind: tariffs (proposed 2025) on robotics from China could add 10–20% to costs. Contract manufacturers prioritize high-volume customers; smaller robotics startups struggle to get factory allocation. Consolidation likely: 2–3 dominant contract manufacturers will service 80% of market by 2030.
STIQ (2025) estimated that reducing humanoid BOM from current $80,000–$100,000 to <$10,000 requires $5 billion in supplier R&D to commoditize actuators, sensors, and compute. Early work shows promise: cheaper brushless motors (toy-grade vs. industrial-grade) with better control firmware can replace expensive servo motors at 10% of cost. Simplified hands (3 fingers vs. 5, single joint per finger) reduce mechanical complexity 30–40%. Onboard AI compute consolidation (single edge GPU vs. distributed microcontrollers) saves integration cost. Trade-off: design simplification reduces capability; cheap motors sacrifice precision and speed. Humanoids optimized for $20,000 cost may not perform dexterous manipulation. Market segments emerging: budget humanoids for simple picking/sorting, premium humanoids for complex assembly. By 2028–2030, cost curves should hit the inflection point (20% price drop per production doubling), unlocking $5–10k consumer units.
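The "20% price drop per production doubling" figure is an experience-curve (Wright's law) assumption, and it can be turned into a quick estimate of how much cumulative volume a given cost target implies. The sketch below is a back-of-envelope model only; the $80,000 and $20,000 endpoints are taken from ranges mentioned in this section, and the constant 20% learning rate is the text's own assumption rather than an observed value.

```python
import math


def unit_cost(base_cost: float, cum_units: float, base_units: float,
              drop_per_doubling: float = 0.20) -> float:
    """Cost per unit after cumulative production grows from base_units to cum_units."""
    doublings = math.log2(cum_units / base_units)
    return base_cost * (1.0 - drop_per_doubling) ** doublings


def scale_needed(cost_now: float, cost_target: float,
                 drop_per_doubling: float = 0.20) -> float:
    """Multiple of today's cumulative volume needed to reach cost_target."""
    doublings = math.log(cost_target / cost_now) / math.log(1.0 - drop_per_doubling)
    return 2.0 ** doublings


if __name__ == "__main__":
    print("cost after one doubling from $100:", round(unit_cost(100.0, 2.0, 1.0), 2))
    multiple = scale_needed(80_000.0, 20_000.0)
    print("volume multiple for $80k -> $20k:", round(multiple, 1))
```

Because the relationship is a power law, each further halving of cost needs the same number of additional doublings, so the required volume multiple grows very quickly as the target price falls.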
Foundation models for robotics face a data bottleneck: the gap between robot-relevant training data and LLM-scale datasets is ~120,000×. Existing datasets (Open X-Embodiment, RH20T) aggregate hundreds of thousands of trajectories; LLMs train on trillions of text tokens. This scarcity limits generalization, forces task-specific retraining, and slows embodied AI progress. Closing this gap—through automated data collection, human-in-the-loop systems, and sim-to-real bridging—is essential for scaling foundation models and enabling zero-shot transfer across tasks and robots.
Open X-Embodiment (Google + 22 institutions) aggregated 527 skills and 1 million+ trajectories across 22 robot platforms by cross-licensing datasets. Scale AI launched its Physical AI Data Engine (collecting 100,000+ hours of real-world robotics data in 2025) with focus on semantic enrichment: every trajectory is annotated with task goals, failure modes, and success criteria. RH20T (Shanghai Jiao Tong University, 110,000+ contact-rich sequences with force/torque) was open-sourced to establish public benchmarks. The bottleneck: no single company can generate enough data alone. Pooling requires standardized formats (H5 datasets, RLDS protocol), privacy agreements, and incentive alignment. Cost: teleoperation ($100–$300 per hour of data) limits scale. Recent innovation: autonomous data collection robots deployed in homes and warehouses (Scale AI, Covariant, 1X Technologies) reduce per-hour cost to $30–$50. Challenges: diversity remains limited (most datasets skew toward table-top manipulation, not outdoor or contact-heavy tasks); heterogeneity across sensor modalities (RGB-D, stereo, lidar vary per robot) complicates learning.
Research (2025–2026) on synthetic trajectory generation aims to create diverse robot experiences without teleoperation. World models (diffusion, latent dynamics) can roll forward predictions of novel situations; behavioral cloning or imitation learning then learns from these synthetic trajectories. Results show 30–50% reduction in required real-world data for simple tasks (grasping, pushing) when augmented with realistic synthetic data. Limitations: synthetic data is only as good as the underlying world model; errors accumulate quickly, and policies trained on corrupted trajectories diverge from real robot behavior. Generative models work best for high-level planning (where does the object go?) rather than low-level control (how much torque?). Scaling requires massive compute for diffusion models; inference cost remains high (10–50× more than deterministic simulators). Adoption timeline: 2–3 years before synthetic augmentation becomes standard in production pipelines.
Human teleoperators collect high-quality demonstrations by directly controlling robots (VR headsets, haptic gloves, AR overlays). Covariant, Physical Intelligence, and Sanctuary AI employ distributed teleoperation teams (e.g., operators in Philippines, data labeled in India) to collect diverse trajectories. Cost per hour is dropping (from $500 in 2020 to $50–$100 in 2025 due to tooling improvements). Annotation overhead (per-frame labels for task progress, error detection, context) adds $20–$30/hour. Recent innovations (DOBB-E: record household tasks via smartphone camera; UMI: capture human hand demonstrations with commodity grippers) make data collection more accessible to researchers. Scale AI's semantic enrichment layer (annotating not just what happened, but why) improves downstream model quality by 20–30%. Bottleneck: human expertise is scarce (experienced roboticists earn $100k+/year). Scaling to billions of trajectories requires either cheaper labor (higher error rates) or full automation (sim-only, still low diversity). Current trajectory: 1–10 million human-collected sequences/year at industry labs; need is 10–100 billion to match LLM training scale.
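The gap between current collection rates and the stated need is easy to check with the paragraph's own numbers. This sketch only restates that arithmetic; the 1,000-hour batch in the example is a hypothetical quantity, while the hourly rates and sequence counts are the ranges quoted above.

```python
def collection_cost_usd(hours: float, rate_per_hour: float,
                        annotation_per_hour: float = 0.0) -> float:
    """Total cost of teleoperated data, including per-hour annotation overhead."""
    return hours * (rate_per_hour + annotation_per_hour)


def years_to_target(target_sequences: float, sequences_per_year: float) -> float:
    """Years needed at a constant collection rate."""
    return target_sequences / sequences_per_year


if __name__ == "__main__":
    # Hypothetical batch of 1,000 hours at the low end of the 2025 rates.
    print("cost of 1,000 h:", collection_cost_usd(1_000, 50.0, 20.0))
    # Best case in the text: 10 billion needed, 10 million collected per year.
    print("best case years:", years_to_target(10e9, 10e6))
    # Worst case in the text: 100 billion needed, 1 million collected per year.
    print("worst case years:", years_to_target(100e9, 1e6))
```

At constant rates even the best case spans centuries, which is why the text treats cheaper labor or automation as the only routes to LLM-scale data.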
Foundation models pretrained on diverse embodiments (arms, wheels, legs, grippers) aim to extract task-agnostic features (e.g., 'moving toward' applies across morphologies). GR00T N1.6 and RT-2 show modest cross-embodiment transfer (~10–30% task success when deployed on unseen robots without retraining). Significant gains come from fine-tuning on target robot data (5–50 real-world rollouts). Meta-learning approaches (MAML, Prototypical Networks) are being explored but don't yet match task-specific supervised learning. Fundamental challenge: embodiment differences (leg length, actuator speed, sensor placement) create large distribution shifts that generic pretraining doesn't fully bridge. Near-term approach: pretrain on internet-scale vision + diverse trajectory data, then quickly adapt to specific robots/tasks (few-shot learning). This reduces data scarcity from 120,000× to perhaps 10–100× over the next 2–3 years, still a major blocker for true generalist systems.
Most of the global economy still runs on physical labor that humans increasingly refuse to do at the wage on offer — warehouse picking, last-mile delivery, eldercare, construction grunt work, agricultural harvest, fast-food back-of-house. The thesis: a sufficiently general humanoid or wheeled platform priced under one year of an entry wage replaces this entire pool of jobs faster than retraining or immigration can refill it. The TAM isn't the existing automation market; it's the global wage bill for tasks no one wants to do.
Labor isn't actually as scarce as the thesis assumes once you adjust for legal immigration, gig-work elasticity, and demographic-driven domestic re-entry. Robots also fail at the tail of physical-world variance the human worker absorbs invisibly. Unit economics break before TAM becomes addressable.
The hardware problem for general-purpose robotics has been mostly solved by smartphone-scale supply chains and electric-vehicle teardowns — the bottleneck is software. The bet: a single large model trained on internet-scale video plus diverse robot data generalizes across embodiments and tasks, the way GPT-class models generalized across language tasks. Whoever gets to a credible vision-language-action foundation model first owns the platform layer that every embodiment company has to license — a Microsoft-of-robotics outcome.
Robotics data is fundamentally different from text — it's expensive to collect, embodiment-specific, and doesn't compose. Internet-scale video lacks the action labels needed for transfer. The likeliest outcome is that vertical specialists with proprietary teleop data beat a generalist platform play, the way Tesla beat Mobileye.
The hardware cost of a humanoid robot has dropped from ~$200k in 2020 to ~$30k in 2026 because Chinese drone, EV, and consumer-electronics suppliers have absorbed every component category — actuators, batteries, sensors, compute, structural materials. Continuing that curve takes a humanoid under $10k by 2030. At that price point, payback periods for industrial deployment compress from 5+ years to under 18 months, and the consumer-home market becomes addressable for the first time. The thesis: whoever wins distribution at the deflated price point wins the category, regardless of who has the best brain.
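The payback claim reduces to simple arithmetic: purchase price divided by net annual savings. The sketch below uses the three price points from the paragraph; the $36,000 of labor replaced per year and the $6,000 of annual operating cost are hypothetical inputs chosen for illustration, so the resulting months are not a forecast.

```python
def payback_months(robot_price: float, annual_labor_saved: float,
                   annual_operating_cost: float = 0.0) -> float:
    """Months until cumulative net savings equal the purchase price."""
    net_per_year = annual_labor_saved - annual_operating_cost
    if net_per_year <= 0:
        # The robot never pays for itself under these inputs.
        return float("inf")
    return 12.0 * robot_price / net_per_year


if __name__ == "__main__":
    labor_saved = 36_000.0  # assumption: hypothetical wage bill replaced per year
    opex = 6_000.0          # assumption: hypothetical maintenance and energy per year
    for price in (200_000.0, 30_000.0, 10_000.0):
        print(int(price), round(payback_months(price, labor_saved, opex), 1))
```

The result is very sensitive to the operating-cost term: if upkeep approaches the labor saved, payback stretches out regardless of how far the hardware price falls.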
Below-cost Chinese hardware faces structural Western trade walls (CFIUS, BIS, EU equivalents) and a customer base that can't deploy a Chinese-stack robot on critical infrastructure. The deflation curve is real but the addressable market is bifurcated — Chinese hardware wins China + parts of the Global South; Western buyers pay 3-5× for sovereignty.
By tracked rounds led
A lead investor pulled out of a critical funding round at the last minute, Embodied could not find a replacement, and the company ceased operations in December 2024. Because Moxie relied on Embodied's cloud backend, the shutdown bricked every unit in the field and most refund requests were declined — the defining cautionary tale for cloud-dependent consumer robots.
IEEE Spectrum showcases research videos including MotionDisco (humanoid motion discovery), DEEP Robotics' coordination work, Agility's fitness-class stress-testing, and AIRoA's home-robot development.
Public claims with deadlines
Hearings · rulings · statutory deadlines
Round sizes leapt an order of magnitude — median $60M in 2024, $347M in 2025, $520M in 2026 — and 2025 alone drew $11.4B. ABB's pending $5.4B robotics divestiture to SoftBank, Anduril's $2.5B Series G at $30.5B, and Figure's $1B Series C at a $39B valuation top a cohort where even Apptronik's Series A extension ran $520M.
The sector funds scale, not formation: of 17 rounds since 2025, five were growth-stage and five Series C, with a single Series A on file. Public-market access widened too — Geek+ listed on the HKEX for $347M and FANUC trades near $38.5B — while ABB's robotics unit heads for a SoftBank-led carve-out.
SoftBank is the sector's anchor with four leads, including ABB, Skild, and Agility. Almost every other 2025–26 lead is a first-timer writing one strategic check: Nvidia into Figure, Google and Mercedes-Benz into Apptronik, CapitalG into Physical Intelligence, Valor into Zipline — a roster of corporates and crossover funds, not repeat robotics specialists.
By relevant articles ingested
Where the sector convenes
Talent + spinout pipeline









