AI Summary • Published on Mar 19, 2026
Traditional artificial intelligence for science (AI4S) has shown significant capabilities in predicting scientific properties, but true scientific discovery is an inherently physical, long-term, and iterative process. Current computational approaches often frame discovery as isolated, task-specific predictions, failing to align with the continuous interaction required in the physical world. The existing landscape is bifurcated: reasoning-centric systems (like LLM-based agents) expand cognitive scope but lack grounding in physical experimentation, while execution-centric systems (like automated laboratories) provide robust execution but often operate within predefined objectives. This fundamental decoupling prevents the realization of truly autonomous scientific discovery, where systems can iteratively form hypotheses, design experiments, execute them under real-world constraints, analyze results, and revise models.
The authors propose "embodied science" as a new paradigm to address this disconnect, defining it as a closed-loop process where AI participates directly in experimental workflows. This paradigm is realized through "Agentic Embodied AI," persistent cyber-physical scientific agents that couple scientific cognition, experimental perception, and laboratory action. To operationalize this, they introduce the Perception–Language–Action–Discovery (PLAD) framework. In PLAD, an agent first perceives the experimental environment through instruments (Perception), then reasons and plans using scientific language and knowledge (Language), executes physical interventions in real laboratories (Action), and finally internalizes experimental outcomes as new scientific insights (Discovery), which then drive further exploration. The Language component integrates large language models (LLMs) with specialized knowledge (e.g., scientific knowledge graphs) and tools for precision. The Action component encompasses various robotic embodiments, from spatially constrained to mobile manipulators. The Discovery component ensures that experimental results are not just observations but are abstracted into transferable scientific understanding, enabling continuous refinement of research objectives. Examples in enzyme design and chemical reaction optimization illustrate how PLAD integrates these components to foster long-horizon autonomous discovery.
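The four PLAD stages can be read as one repeating control loop. The sketch below is a minimal illustration of that loop on a toy temperature-optimization task, not the authors' implementation. Every name in it (`Knowledge`, `simulated_yield`, `perceive`, `plan`, `act`, `discover`, `run_plad`) is hypothetical. The "experiment" is a simulated yield curve with an assumed peak at 70 C, and the Language stage is replaced by a fixed rule where a real system would use an LLM with tools.

```python
from dataclasses import dataclass, field


@dataclass
class Knowledge:
    """Discovery state: the agent's current hypothesis (hypothetical structure)."""
    best_temp: float
    best_yield: float = float("-inf")
    step: float = 8.0
    history: list = field(default_factory=list)


def simulated_yield(temp_c: float) -> float:
    """Toy stand-in for a real experiment; the peak at 70 C is an assumption."""
    return max(0.0, 100.0 - 0.05 * (temp_c - 70.0) ** 2)


def perceive(raw_signal: float) -> float:
    """Perception: turn an instrument reading into a usable observation."""
    return round(raw_signal, 2)


def plan(k: Knowledge) -> list:
    """Language: propose candidate conditions around the current hypothesis.
    A fixed rule stands in for LLM-based reasoning here."""
    return [k.best_temp - k.step, k.best_temp + k.step]


def act(temp_c: float) -> float:
    """Action: run the experiment (here, only a simulation)."""
    return simulated_yield(temp_c)


def discover(k: Knowledge, temp_c: float, observed: float) -> bool:
    """Discovery: record the outcome and revise the hypothesis if it improved."""
    k.history.append((temp_c, observed))
    if observed > k.best_yield:
        k.best_temp, k.best_yield = temp_c, observed
        return True
    return False


def run_plad(start_temp: float, cycles: int) -> Knowledge:
    """Run a fixed number of perceive/plan/act/discover cycles."""
    k = Knowledge(best_temp=start_temp)
    discover(k, start_temp, perceive(act(start_temp)))
    for _ in range(cycles):
        improved = False
        for candidate in plan(k):
            if discover(k, candidate, perceive(act(candidate))):
                improved = True
        if not improved:
            # Failed cycles are still informative: narrow the search range.
            k.step /= 2
    return k


if __name__ == "__main__":
    result = run_plad(start_temp=30.0, cycles=12)
    print(result.best_temp, result.best_yield, len(result.history))
```

In this sketch, the outcome of each cycle changes what the next cycle tries, including when nothing improves. That feedback is the closed-loop property the summary attributes to PLAD.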
The Embodied Science paradigm, underpinned by the PLAD framework, enables a critical shift from AI as a tool for augmentation to a system-level reconfiguration of the scientific method. By tightly coupling perception, language-level reasoning, embodied action, and cumulative discovery, PLAD bridges the gap between digital prediction and empirical validation. This framework ensures that scientific progress emerges from continuous engagement with real experimental environments, rather than from isolated computation over static datasets. It facilitates systems that learn not only from successes but also from failures, anomalies, and uncertainties, leading to cumulative scientific understanding and the iterative generation, testing, falsification, and revision of hypotheses over extended horizons. This unification addresses the structural limitation of current AI4S approaches, where scaling reasoning improves cognitive breadth and advancing automation improves throughput, but neither alone achieves autonomous, long-horizon discovery.
Realizing long-horizon PLAD loops poses several challenges that require coordinated design responses. First, reasoning over scientific data calls for science-adapted LLMs (Sci-LLMs) with tool-assisted reasoning, trained with agentic reinforcement learning, to interpret complex, noisy instrument signals. Second, execution reliability demands sim-to-real approaches that use digital twins to train and deploy diverse robotic embodiments and specialized laboratory skills safely, especially given the hazardous nature of many experiments. Third, long-horizon autonomy requires robust knowledge accumulation, with knowledge graphs proposed as structured memory to persistently record, organize, and revise scientific understanding across experimental cycles. Fourth, a protocolized infrastructure, such as the Science Context Protocol (SCP), is needed to provide a unified, agent-interpretable abstraction over distributed experimental components, bridging perception, reasoning, and action. Finally, safety governance is paramount, relying on a combination of explicit knowledge-driven constraints and model-based risk assessment to prevent unsafe procedures or outcomes. Overall, the vision of Embodied Science requires advances across foundational models, instrument-aware perception, protocol design, scientific infrastructure, evaluation standards, and safety mechanisms to ensure trustworthy and scalable autonomous discovery.
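The safety-governance idea above combines explicit knowledge-driven constraints with model-based risk assessment. One way to picture it is a gate that a proposed protocol must pass before execution. The sketch below is an assumption-laden illustration, not a description of the paper's mechanism. The `Step` type, the rule tables, the threshold, and `risk_score` (a crude heuristic standing in for a learned risk model) are all hypothetical, and the numeric limits are illustrative values rather than real laboratory safety limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    reagent: str
    temp_c: float
    volume_ml: float


# Explicit, knowledge-driven constraints (illustrative values only).
MAX_TEMP_C = {"ethanol": 60.0, "water": 95.0}
INCOMPATIBLE = {frozenset({"bleach", "ammonia"})}


def rule_violations(protocol: list) -> list:
    """Return human-readable reasons the protocol breaks a hard constraint."""
    reasons = []
    reagents = {s.reagent for s in protocol}
    for pair in INCOMPATIBLE:
        if pair <= reagents:
            reasons.append("incompatible reagents: " + ", ".join(sorted(pair)))
    for s in protocol:
        limit = MAX_TEMP_C.get(s.reagent)
        if limit is not None and s.temp_c > limit:
            reasons.append(f"{s.reagent} heated to {s.temp_c} C (limit {limit} C)")
    return reasons


def risk_score(protocol: list) -> float:
    """Stand-in for a learned risk model: a crude heuristic in [0, 1]."""
    if not protocol:
        return 0.0
    hottest = max(s.temp_c for s in protocol)
    total_volume = sum(s.volume_ml for s in protocol)
    return min(1.0, 0.5 * hottest / 100.0 + 0.5 * total_volume / 1000.0)


def approve(protocol: list, risk_threshold: float = 0.6):
    """Approve only if no hard rule is violated and the risk score is acceptable."""
    reasons = rule_violations(protocol)
    score = risk_score(protocol)
    if score > risk_threshold:
        reasons.append(f"risk score {score:.2f} exceeds {risk_threshold:.2f}")
    return (not reasons, reasons)


if __name__ == "__main__":
    print(approve([Step("water", 50.0, 100.0)]))
    print(approve([Step("bleach", 20.0, 50.0), Step("ammonia", 20.0, 50.0)]))
```

Keeping the hard rules separate from the scored assessment means a protocol can be rejected for a stated, auditable reason even when the risk model rates it as low risk.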