All Tags
Browse through all available tags to find articles on topics that interest you.
Showing 1 result for this tag.
Arbitrage: Efficient Reasoning via Advantage-Aware Speculation
This paper introduces Arbitrage, a novel step-level speculative generation framework designed to enhance the efficiency of Large Language Models (LLMs) in reasoning tasks. It dynamically routes between a fast draft model and a more capable target model based on the expected quality advantage, significantly reducing computational waste and inference latency while maintaining accuracy.
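The routing idea in the summary can be illustrated with a minimal sketch. It is not the paper's implementation: the function `route_steps`, the callables `draft`, `target`, and `estimate_advantage`, and the `threshold` and `stop` parameters are all hypothetical names chosen for illustration. The sketch assumes the draft model proposes each reasoning step, and the target model regenerates a step only when the estimated quality gain from doing so exceeds the threshold.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces, assumed for illustration only.
StepFn = Callable[[List[str]], str]               # context -> next reasoning step
AdvantageFn = Callable[[List[str], str], float]   # (context, draft step) -> estimated gain


def route_steps(
    draft: StepFn,
    target: StepFn,
    estimate_advantage: AdvantageFn,
    threshold: float,
    max_steps: int,
    stop: str = "<end>",
) -> Tuple[List[str], int]:
    """Build a reasoning trace step by step, calling the target model
    only when its estimated advantage over the draft step is large enough."""
    steps: List[str] = []
    target_calls = 0
    for _ in range(max_steps):
        # The fast draft model always proposes the next step first.
        candidate = draft(steps)
        # Escalate to the slower target model only if the estimated
        # quality gain for this step exceeds the threshold.
        if estimate_advantage(steps, candidate) > threshold:
            candidate = target(steps)
            target_calls += 1
        steps.append(candidate)
        if candidate == stop:
            break
    return steps, target_calls
```

In this sketch, raising `threshold` keeps more draft steps and lowers the number of target-model calls, while lowering it trades latency for more target-model steps.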