All Tags
Browse through all available tags to find articles on topics that interest you.
Showing 2 results for this tag.
CORE: Toward Ubiquitous 6G Intelligence Through Collaborative Orchestration of Large Language Model Agents Over Hierarchical Edge
CORE is a novel framework that orchestrates collaborative Large Language Model (LLM) agents across hierarchical 6G edge networks to enable ubiquitous intelligence. It addresses the challenges of fragmented resources by integrating real-time perception, dynamic role orchestration, and pipeline-parallel execution, significantly enhancing system efficiency and task completion in various 6G applications.
OD-MoE: On-Demand Expert Loading for Cacheless Edge-Distributed MoE Inference
This paper introduces OD-MoE, a distributed Mixture-of-Experts (MoE) inference framework designed for memory-constrained edge devices. It enables fully on-demand expert loading without a cache, achieving high decoding speeds and significantly reducing GPU memory requirements while maintaining full model precision.