AI Summary • Published on Jan 28, 2026
The vision of ubiquitous intelligence, powered by the convergence of 6G networks and Large Language Models (LLMs), faces significant hurdles. Existing approaches struggle with the fragmented and heterogeneous computing resources distributed across hierarchical 6G networks, which range from mobile devices to tiered edge servers. Individual LLM agents often lack the necessary resources to efficiently handle complex reasoning tasks in isolation. Current solutions, such as centralized cloud systems, introduce high latency unsuitable for time-critical applications, while isolated edge devices are constrained by limited computational power, leading to performance degradation and increased energy consumption. Challenges also include the substantial computational and memory demands of LLMs on edge infrastructure, potential error propagation in distributed execution, and difficulties with interoperability and synchronization among diverse network components.
To overcome these challenges, the authors propose Collaborative Orchestration Role at Edge (CORE), an innovative framework designed to facilitate collaborative execution of interactive tasks within 6G networks. CORE employs a collaborative learning system in which multiple LLMs are assigned distinct functional roles and distributed across mobile devices and tiered edge servers. The framework integrates three core optimization modules: real-time perception, dynamic role orchestration, and pipeline-parallel execution, all aimed at fostering efficient agent collaboration. CORE's architecture is structured into three layers: a Feedback and Optimization Layer for continuous improvement through task evaluation, reflection, and memory; a Primary Service Layer for multi-modal perception, dynamic role orchestration, and pipeline-parallel task execution; and a 6G Infrastructure Layer providing foundational communication and computation. A key component is a novel role-affinity scheduling algorithm that dynamically assigns LLM roles based on task requirements, network conditions, and device capabilities (a simplified sketch follows below). Inter-agent coordination is managed via the Model Context Protocol (MCP) for consistent context sharing and Directed Acyclic Graphs (DAGs) for defining task dependencies. For critical scenarios, CORE employs a predictive strategy that leverages pre-trained decision evaluation models and scheduling policy lookup tables, derived from historical data and offline digital twin simulations, to achieve ultra-low-latency orchestration without the overhead of real-time simulation.
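The summary names the role-affinity scheduler and the DAG-based coordination machinery but does not give their internals, so the following is only a minimal Python sketch of how such a scheduler might work: subtasks form a DAG, each carries a functional role and resource requirements, and a greedy pass scores every candidate device per subtask on compute headroom, memory fit, and network latency. All class names, the weighting scheme, and the greedy topological traversal are illustrative assumptions, not the paper's algorithm.

```python
# Illustrative sketch of role-affinity scheduling over a task DAG. Every
# name, weight, and heuristic here is a hypothetical stand-in; the paper's
# actual algorithm is not specified in this summary.
from dataclasses import dataclass, field

@dataclass
class Device:
    name: str
    flops: float          # available compute (TFLOPS)
    mem_gb: float         # free GPU memory
    rtt_ms: float         # round-trip latency to the requesting device

@dataclass
class Subtask:
    name: str
    role: str                                  # functional role, e.g. "planner"
    flops_needed: float
    mem_needed_gb: float
    deps: list = field(default_factory=list)   # DAG edges: prerequisite subtasks

def affinity(task: Subtask, dev: Device, w=(0.5, 0.3, 0.2)) -> float:
    """Score how well a device suits a role; higher is better.
    The (compute, memory, latency) weights are illustrative placeholders."""
    if dev.mem_gb < task.mem_needed_gb:
        return float("-inf")                    # hard constraint: model must fit
    compute = dev.flops / task.flops_needed     # headroom over the requirement
    memory = dev.mem_gb / task.mem_needed_gb
    latency = 1.0 / (1.0 + dev.rtt_ms)          # penalize distant devices
    return w[0] * compute + w[1] * memory + w[2] * latency

def schedule(dag: list, devices: list) -> dict:
    """Greedy assignment in topological order: place each ready subtask on
    the device with the highest affinity for its role."""
    done, plan = set(), {}
    while len(done) < len(dag):
        for t in dag:
            if t.name in done or not all(d in done for d in t.deps):
                continue                        # dependencies not yet scheduled
            best = max(devices, key=lambda d: affinity(t, d))
            plan[t.name] = best.name
            done.add(t.name)
    return plan

if __name__ == "__main__":
    devices = [Device("phone", 2, 6, 1), Device("edge-3090", 35, 24, 8),
               Device("edge-a40", 150, 48, 25)]
    dag = [Subtask("perceive", "perception", 5, 8),
           Subtask("plan", "planner", 60, 40, deps=["perceive"]),
           Subtask("act", "executor", 10, 8, deps=["plan"])]
    print(schedule(dag, devices))
```

In CORE's setting the scoring would presumably also fold in live measurements from the real-time perception module and the lookup-table fast path for critical scenarios; here static RTT values stand in for both.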
The CORE framework was implemented and evaluated on a real-world edge computing platform, demonstrating its efficacy in time-sensitive applications such as industrial automation and emergency response. The experimental setup used a Deepseek-R1-distill-llama-70B model as the task scheduler on NVIDIA A40 GPUs and MiniCPM-v2.6 models on NVIDIA RTX 3090 and RTX 4090 GPUs at the edge. Performance was assessed using task completion rates across varying difficulty levels and latency metrics (scheduling and execution). CORE significantly outperformed single-agent methods (ReAct, LLMCompiler) and multi-agent algorithms (Static_Dual_Loop, Crew_Ai) in both task completion and latency. Specifically, CORE exceeded Static_Dual_Loop's task completion rate by 25% on medium tasks and 20% on hard tasks, an advantage attributed to its dynamic task decomposition and specialized multi-agent collaboration. The DynaRole-HEFT approach within CORE achieved a 52% reduction in high-load latency compared to traditional HEFT, demonstrating efficient task allocation. Although CORE is highly effective overall, the orchestrator's inference time (180-320 ms) constituted a notable portion (25-40%) of the total end-to-end delay under high-load conditions, highlighting a specific target for future optimization.
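The summary does not describe how DynaRole-HEFT extends its baseline, but that baseline is the classic HEFT list scheduler (Topcuoglu et al., 2002), which ranks tasks by "upward rank" and then assigns each, in rank order, to the processor giving the earliest finish time. The miniature below sketches that standard baseline against which the reported 52% improvement is measured; the four-task DAG and its cost tables are invented for illustration.

```python
# Miniature of the classic HEFT baseline. DynaRole-HEFT's dynamic-role
# extensions are not detailed in the summary, so only the standard
# algorithm is shown; the DAG and cost tables below are made up.
from functools import lru_cache
from statistics import mean

# comp[task] lists the execution cost of that task on each of two processors;
# succ[task] maps each successor to the cross-processor communication cost.
comp = {"A": [14, 16], "B": [13, 19], "C": [11, 13], "D": [7, 17]}
succ = {"A": {"B": 18, "C": 12}, "B": {"D": 9}, "C": {"D": 11}, "D": {}}

@lru_cache(maxsize=None)
def upward_rank(t):
    """rank_u(t) = mean comp cost + max over successors of (comm + rank_u)."""
    return mean(comp[t]) + max(
        (c + upward_rank(s) for s, c in succ[t].items()), default=0)

def heft():
    order = sorted(comp, key=upward_rank, reverse=True)  # priority list
    proc_free = [0.0, 0.0]                               # earliest idle time per processor
    finish, placed = {}, {}
    for t in order:
        best = None
        for p in range(len(proc_free)):
            # data from predecessor u arrives at finish[u], plus the comm
            # cost if u ran on a different processor than p
            ready = max((finish[u] + (c if placed[u] != p else 0)
                         for u, edges in succ.items()
                         for v, c in edges.items() if v == t), default=0)
            eft = max(proc_free[p], ready) + comp[t][p]  # earliest finish time
            if best is None or eft < best[0]:
                best = (eft, p)
        eft, p = best
        finish[t], placed[t], proc_free[p] = eft, p, eft
    return placed, finish

print(heft())  # placements and finish times for the toy DAG
```

HEFT is static: ranks and placements are computed once up front. The 52% high-load gain reported for DynaRole-HEFT plausibly comes from re-evaluating role-to-device assignments as load and network conditions shift, though the summary does not confirm the mechanism.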
The CORE framework offers a transformative approach to deploying 6G technology by creating a "collective AI brain" through the collaborative orchestration of LLM agents across hierarchical edge networks. This enables the delivery of rapid, scalable AI services in critical domains such as smart cities, healthcare, industrial automation, and emergency response. The empirical evaluations validate CORE's gains in both task completion rate and latency, demonstrating its practical applicability. Future research should focus on lightweight scheduler designs, potentially through model distillation and specialization, to address the remaining orchestrator overhead and reach the sub-10 ms latencies required for highly time-sensitive applications such as remote surgery. Further directions include integrating CORE with advanced 6G network slicing for finer-grained resource control, exploring quantum-inspired algorithms for complex scheduling tasks, and standardizing inter-agent communication protocols to ensure seamless operation across diverse, heterogeneous 6G ecosystems.