AI Summary • Published on Jan 20, 2026
Autonomous drone racing is a demanding benchmark for robotics research, requiring high-speed perception, state estimation, planning, and control on lightweight drones under strict resource and time limits. Previous state-of-the-art solutions, such as the "Swift" system, relied on stereo cameras and external motion-tracking systems, which limits real-world applicability and increases cost. MonoRace addresses the need for an autonomous racing drone that operates entirely onboard, using only a single monocular camera and an Inertial Measurement Unit (IMU), without any external infrastructure. Such a self-contained approach is crucial for broader deployment, but it must cope with high optical flow, motion blur, and complex aerodynamic effects at racing speeds, all while pushing the drone to its physical limits in competition.
MonoRace employs a fully onboard perception and control pipeline built around a single rolling-shutter monocular camera and an IMU. Perception and state estimation run on an NVIDIA Jetson Orin NX, while a compact Guidance-and-Control Network (G&CNet) runs at 500 Hz on the flight controller. State estimation begins with 820x616 images captured at 90 Hz, which are adaptively cropped and resized to 384x384 pixels around the predicted gate locations. A U-Net-style network, GateNet, then segments the gates. GateNet is trained on both synthetic and real-world data with multi-scale predictions and extensive augmentations, including affine transformations, photometric changes, and motion blur, to improve generalization.

Sub-pixel-accurate gate corners are extracted from the segmentation masks by QuAdGate, which applies a Line Segment Detector (LSD), computes line intersections, and rejects outliers with RANSAC. A Perspective-n-Point (PnP) algorithm then estimates the drone's pose, combining corners from multiple gates to improve accuracy and reliability, with a fallback strategy for distant or sparse detections.

An Extended Kalman Filter (EKF) fuses the PnP measurements with high-rate IMU data; its state vector comprises position, velocity, orientation, and IMU biases. To cope with IMU saturation during aggressive maneuvers, a dynamic drone model predicts accelerations: when measured accelerations exceed a threshold, the model-based predictions are substituted and the EKF uncertainty is inflated. Camera interference is handled robustly as well: GateNet learns to ignore corrupted image regions, RANSAC rejects outliers in corner detection, and the EKF filters out deviant measurements. Finally, an offline procedure exploits the known gate geometry to refine state-estimation parameters, particularly the extrinsic camera calibration, using Bayesian optimization to maximize the Intersection over Union (IoU) between re-projected and segmented gate pixels.

For guidance and control, a small fully connected network (three hidden layers of 64 neurons) directly outputs motor commands. This G&CNet is trained in simulation with Proximal Policy Optimization (PPO), using a detailed quadcopter model with advanced aerodynamic effects and extensive domain randomization for robust sim-to-real transfer. The reward function balances progress toward gates and successful gate passage against penalties for high angular rates, off-center crossings, out-of-view gates, motor-command changes, and crashes. Sketches of several of these steps follow below.
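To make the corner-extraction step concrete, here is a minimal sketch of the intersect-line-segments idea behind QuAdGate, using OpenCV's LSD on the segmentation mask. It omits the RANSAC outlier rejection and sub-pixel refinement of the real pipeline, and the perpendicularity threshold is an illustrative choice; note that `cv2.createLineSegmentDetector` is only present in some OpenCV builds (it was removed for licensing reasons and restored in 4.5.1).

```python
import numpy as np
import cv2

def seg_line(seg):
    """Homogeneous line through a segment's endpoints (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = seg
    return np.cross([x1, y1, 1.0], [x2, y2, 1.0])

def intersect(l1, l2):
    """Intersection point of two homogeneous lines, or None if near-parallel."""
    p = np.cross(l1, l2)
    return None if abs(p[2]) < 1e-9 else p[:2] / p[2]

def candidate_corners(mask):
    """Corner candidates from an 8-bit binary gate mask: detect line
    segments with LSD, then intersect roughly perpendicular pairs."""
    lsd = cv2.createLineSegmentDetector()
    segs = lsd.detect(mask)[0]
    if segs is None:
        return []
    segs = segs.reshape(-1, 4)
    corners = []
    for i in range(len(segs)):
        for j in range(i + 1, len(segs)):
            di = segs[i][2:] - segs[i][:2]
            dj = segs[j][2:] - segs[j][:2]
            cos = abs(di @ dj) / (np.linalg.norm(di) * np.linalg.norm(dj) + 1e-9)
            if cos > 0.5:      # skip near-parallel pairs; gate sides meet at ~90 deg
                continue
            p = intersect(seg_line(segs[i]), seg_line(segs[j]))
            if p is not None:
                corners.append(p)
    return corners
```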
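The pose-estimation step maps detected corners to a pose via PnP. Below is a sketch for a single gate using OpenCV's planar-square solver; the gate side length is a placeholder, not a value from the paper, and the corner ordering follows the convention `SOLVEPNP_IPPE_SQUARE` expects. Combining corners from several gates, as MonoRace does, would instead stack the corners expressed in the world frame and call a general PnP solver.

```python
import numpy as np
import cv2

GATE_SIDE = 1.5   # hypothetical inner gate edge length [m]
# Corners in the gate frame, in the order SOLVEPNP_IPPE_SQUARE requires:
# top-left, top-right, bottom-right, bottom-left.
OBJ_PTS = 0.5 * GATE_SIDE * np.array(
    [[-1, 1, 0], [1, 1, 0], [1, -1, 0], [-1, -1, 0]], dtype=np.float64)

def drone_pose_from_gate(img_pts, K, dist):
    """Camera pose in the gate frame from four detected image corners."""
    ok, rvec, tvec = cv2.solvePnP(
        OBJ_PTS, img_pts.astype(np.float64), K, dist,
        flags=cv2.SOLVEPNP_IPPE_SQUARE)   # PnP variant for planar squares
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)
    cam_pos = -R.T @ tvec                 # camera position in the gate frame
    return R.T, cam_pos
```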
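The saturation-handling logic in the EKF time update can be sketched as follows for a position-velocity sub-state. The saturation threshold and inflation factor below are hypothetical, accelerations are treated as world-frame specific force, and orientation and bias propagation are omitted for brevity.

```python
import numpy as np

GRAV = np.array([0.0, 0.0, -9.81])   # gravity in the world frame [m/s^2]
ACC_SAT = 150.0                      # hypothetical accelerometer limit [m/s^2]
INFLATE = 10.0                       # hypothetical noise-inflation factor

def ekf_predict(x, P, Q, acc_meas, acc_model, dt):
    """EKF time update for x = [position, velocity]: substitute the
    dynamic-model acceleration when the IMU saturates and inflate Q."""
    saturated = np.any(np.abs(acc_meas) >= ACC_SAT)
    acc = acc_model if saturated else acc_meas
    a = acc + GRAV
    x_new = x.copy()
    x_new[0:3] += x[3:6] * dt + 0.5 * a * dt**2   # position update
    x_new[3:6] += a * dt                          # velocity update
    F = np.eye(6)
    F[0:3, 3:6] = dt * np.eye(3)                  # Jacobian of the motion model
    P_new = F @ P @ F.T + Q * (INFLATE if saturated else 1.0)
    return x_new, P_new
```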
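A sketch of the offline extrinsic refinement, assuming scikit-optimize's `gp_minimize` as the Bayesian optimizer; `frames` and `render_gate_mask` are hypothetical stand-ins for logged flight data and a renderer that projects the known gate geometry under candidate extrinsics. The optimizer minimizes one minus the mean IoU, which is equivalent to maximizing the IoU.

```python
import numpy as np
from skopt import gp_minimize   # scikit-optimize

def iou(pred, seg):
    """Intersection over Union of two boolean masks."""
    union = np.logical_or(pred, seg).sum()
    return np.logical_and(pred, seg).sum() / max(union, 1)

def calib_cost(extrinsics, frames, render_gate_mask):
    """Cost = 1 - mean IoU between re-projected and segmented gate pixels."""
    scores = [iou(render_gate_mask(extrinsics, f), f.seg_mask) for f in frames]
    return 1.0 - float(np.mean(scores))

# Rotation (roll, pitch, yaw) and translation offsets of the camera with
# respect to the IMU; the search ranges below are illustrative.
bounds = [(-0.1, 0.1)] * 3 + [(-0.05, 0.05)] * 3
# result = gp_minimize(lambda p: calib_cost(p, frames, render_gate_mask),
#                      bounds, n_calls=60, random_state=0)
```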
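The G&CNet itself is tiny. Below is a PyTorch sketch of a 3x64 fully connected policy; the state dimension, activations, and output squashing are assumptions, as the summary only specifies the layer shape and that the network outputs motor commands directly.

```python
import torch
import torch.nn as nn

class GCNet(nn.Module):
    """Three hidden layers of 64 neurons mapping the drone state directly
    to four motor commands."""
    def __init__(self, state_dim: int = 15, n_motors: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.Tanh(),
            nn.Linear(64, 64), nn.Tanh(),
            nn.Linear(64, 64), nn.Tanh(),
            nn.Linear(64, n_motors), nn.Sigmoid(),  # commands normalized to [0, 1]
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

policy = GCNet()
cmd = policy(torch.zeros(1, 15))   # one forward pass, e.g. once per 2 ms at 500 Hz
```

A network this size costs on the order of 10^4 multiply-accumulate operations per inference, which is what makes 500 Hz on a flight controller plausible; an actual deployment would use a plain C implementation rather than PyTorch.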
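Finally, the reward terms listed above can be written down directly. The sketch below uses illustrative weights (the exact values are not given here) and a hypothetical `Step` record for the quantities each term needs.

```python
from dataclasses import dataclass
import numpy as np

# Illustrative weights, not the paper's values.
W_PROG, W_PASS, W_CENTER = 1.0, 10.0, 2.0
W_RATE, W_CMD, W_VIEW, W_CRASH = 0.01, 0.001, 0.1, 10.0

@dataclass
class Step:
    progress: float           # decrease in distance to the next gate [m]
    passed_gate: bool         # crossed the gate plane this step
    crossing_offset: float    # distance from gate center at crossing [m]
    gate_in_view: bool        # next gate inside the camera frustum
    crashed: bool
    angular_rate: np.ndarray  # body rates [rad/s]
    d_cmd: np.ndarray         # change in motor commands since last step

def reward(s: Step) -> float:
    """Shaped reward mirroring the terms listed above."""
    r = W_PROG * s.progress
    if s.passed_gate:
        r += W_PASS - W_CENTER * s.crossing_offset
    r -= W_RATE * float(np.linalg.norm(s.angular_rate))
    r -= W_CMD * float(np.linalg.norm(s.d_cmd))
    if not s.gate_in_view:
        r -= W_VIEW
    if s.crashed:
        r -= W_CRASH
    return r

# Example: a smooth, on-course step with the gate in view.
s = Step(progress=0.4, passed_gate=False, crossing_offset=0.0,
         gate_in_view=True, crashed=False,
         angular_rate=np.zeros(3), d_cmd=np.zeros(4))
print(reward(s))   # 0.4
```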
MonoRace won the Grand Challenge, AI vs. Human, and Drag Race events at the 2025 Abu Dhabi Autonomous Drone Racing Competition (A2RL). It also took third place in the Multi-Drone Race, its only limitation being the absence of collision avoidance. Notably, MonoRace made history by sequentially beating three human FPV world champions in a direct knockout tournament, with a fastest lap of 16.56 seconds. The drone reached speeds of up to 28.23 m/s (just over 100 km/h) on the competition track, the fastest fully onboard autonomous flight reported to date with a monocular CMOS camera and a saturating IMU, without external aids. The system also proved remarkably robust to IMU saturation: incorporating model-based acceleration corrections during saturation events raised the track-completion success rate to 100%, versus 50% with IMU-only predictions. Furthermore, MonoRace completed entire tracks even when up to 50% of camera frames were corrupted by electromagnetic interference, thanks to its integrated fail-safe strategies. The domain randomization used during training proved highly effective: simulated completion times closely matched real-world performance, indicating that the system handles unmodeled dynamics and bridges the sim-to-real gap. Running at 500 Hz with low end-to-end latency, the G&CNet controller retained full throttle authority and could change motor commands rapidly.
The MonoRace system marks a significant milestone in autonomous drone racing research, demonstrating champion-level performance with a fully onboard, monocular perception and control pipeline, and opening new avenues for robotics development. Future work includes exploring end-to-end learning to couple vision and control more tightly, handling gate shapes beyond the current reliance on rectangular gates, and developing robust drone detection and avoidance for multi-drone racing. Beyond competitive racing, compact and efficient Guidance-and-Control Networks (G&CNets) could bring optimal control to smaller and more affordable robots, at a fraction of the computational cost of traditional planning pipelines. Agile, high-speed autonomous flight also has broader implications, including potential military applications of both offensive and defensive utility. More broadly, the technology can be adapted to societal applications: by dynamically adjusting speed to environmental and mission requirements, drones could fly longer-range, longer-duration missions more efficiently, moving toward energy-optimal autonomous flight.