All Tags
Browse through all available tags to find articles on topics that interest you.
Showing 2 results for this tag.
OD-MoE: On-Demand Expert Loading for Cacheless Edge-Distributed MoE Inference
This paper introduces OD-MoE, a distributed Mixture-of-Experts (MoE) inference framework designed for memory-constrained edge devices. It enables fully on-demand expert loading without a cache, achieving high decoding speeds and significantly reducing GPU memory requirements while maintaining full model precision.
DeepSeek-V3 Technical Report
DeepSeek-V3 is a powerful 671B-parameter Mixture-of-Experts language model that achieves state-of-the-art performance among open-source models and competes with leading closed-source models, using efficient architectures and novel training strategies while keeping training costs remarkably low.