All Tags
Browse through all available tags to find articles on topics that interest you.
Showing 10 results for this tag.
Dancing in Chains: Strategic Persuasion in Academic Rebuttal via Theory of Mind
This paper introduces RebuttalAgent, an AI framework that grounds academic rebuttal in Theory of Mind (ToM) to generate strategic and persuasive responses. It proposes a ToM-Strategy-Response (TSR) pipeline, supported by a large-scale synthetic dataset (RebuttalBench) and a specialized evaluation model (Rebuttal-RM), significantly outperforming existing models in automated and human evaluations.
MonoRace: Winning Champion-Level Drone Racing with Robust Monocular AI
MonoRace is an autonomous drone racing system that utilizes a monocular camera and IMU to achieve champion-level performance, notably winning the A2RL 2025 competition. It features robust state estimation combining neural-network-based gate segmentation with a drone model, an offline optimization procedure, and a neural network for guidance and control.
Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction
The paper introduces a comprehensive method and ecosystem (NexAU, NexA4A, NexGAP) to overcome limitations in scaling interactive environments for training agentic Large Language Models (LLMs). This infrastructure enables the systematic generation of diverse, complex, and realistically grounded interaction trajectories for LLMs.
Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
This paper introduces Reward Forcing, a novel framework for efficient streaming video generation that tackles issues like diminished motion dynamics and over-reliance on initial frames. It achieves state-of-the-art performance by combining EMA-Sink for improved long-term context and Rewarded Distribution Matching Distillation (Re-DMD) to enhance motion quality.
STARE-VLA: Progressive Stage-Aware Reinforcement for Fine-Tuning Vision-Language-Action Models
This paper introduces Stage-Aware Reinforcement (StARe), a novel module that decomposes long-horizon robotic manipulation tasks into semantically meaningful stages, providing dense, interpretable reinforcement signals. Integrated into the Imitation → Preference → Interaction (IPI) fine-tuning pipeline, StARe significantly improves the performance and robustness of Vision-Language-Action (VLA) models on complex manipulation tasks.
Tutorial on Large Language Model-Enhanced Reinforcement Learning for Wireless Networks
This paper provides a comprehensive tutorial on enhancing Reinforcement Learning (RL) for wireless networks using Large Language Models (LLMs). It proposes a taxonomy for LLM roles in RL (state perceiver, reward designer, decision-maker, generator) and showcases their application in various wireless scenarios to address classical RL's limitations in generalization, interpretability, and sample efficiency.
AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition
AdaptVision introduces an efficient VLM paradigm that autonomously determines the minimum number of visual tokens required for each sample by employing a coarse-to-fine visual acquisition strategy, leading to superior performance with significantly reduced computational overhead.
TempR1: Improving Temporal Understanding of MLLMs via Temporal-Aware Multi-Task Reinforcement Learning
This paper introduces TempR1, a novel temporal-aware multi-task reinforcement learning framework designed to significantly enhance the temporal understanding capabilities of Multimodal Large Language Models (MLLMs). By integrating diverse temporal tasks and tailored reward functions, TempR1 achieves state-of-the-art performance across various video understanding benchmarks and improves generalization.
SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL
This paper introduces SpaceTools, a vision-language model trained with Double Interactive Reinforcement Learning (DIRL) to achieve precise spatial reasoning and real-world robot manipulation by effectively coordinating multiple external tools. It demonstrates state-of-the-art performance on various spatial understanding benchmarks.
The Impact of Quantization on Large Reasoning Model Reinforcement Learning
This study investigates how quantization affects reinforcement learning in large reasoning models, finding that post-training quantization and QLoRA outperform quantization-aware training.