All Tags
Browse through all available tags to find articles on topics that interest you.
Showing 16 results for this tag.
eLasmobranc Dataset: An Image Dataset for Elasmobranch Species Recognition and Biodiversity Monitoring
This paper introduces the eLasmobranc Dataset, a new, curated image collection designed to improve fine-grained identification of elasmobranch species (sharks and rays) for conservation and biodiversity monitoring. It addresses limitations of existing datasets by providing high-quality, out-of-water images with expert-validated annotations and detailed metadata to support AI system development.
Fusion-CAM: Integrating Gradient and Region-Based Class Activation Maps for Robust Visual Explanations
Fusion-CAM is a novel framework that unifies gradient-based and region-based Class Activation Map (CAM) methods through a dedicated fusion mechanism. It aims to provide robust and highly discriminative visual explanations by first denoising gradient-based maps and then adaptively combining them with region-based maps to enhance class coverage and precision, outperforming existing CAM variants.
MonoRace: Winning Champion-Level Drone Racing with Robust Monocular AI
MonoRace is an autonomous drone racing system that utilizes a monocular camera and IMU to achieve champion-level performance, notably winning the A2RL 2025 competition. It features robust state estimation combining neural-network-based gate segmentation with a drone model, an offline optimization procedure, and a neural network for guidance and control.
Patch-Discontinuity Mining for Generalized Deepfake Detection
This paper introduces GenDF, a generalized deepfake detection framework that leverages a fine-tuned Vision Transformer (ViT) to identify subtle patch discontinuities in fake images and continuities in real ones. It employs deepfake-specific representation learning, feature space redistribution, and classification-invariant feature augmentation to achieve state-of-the-art generalization across various unseen deepfake patterns with minimal trainable parameters.
Your Reasoning Benchmark May Not Test Reasoning: Revealing Perception Bottleneck in Abstract Reasoning Benchmarks
This paper challenges the common interpretation of AI models' performance on abstract reasoning benchmarks like ARC, hypothesizing that visual perception limitations, not reasoning deficiencies, are the primary bottleneck. It introduces a two-stage pipeline that separates perception from reasoning, revealing that most model failures stem from perception errors and that the decoupled pipeline yields significant performance improvements.
Self-Supervised Learning for Transparent Object Depth Completion Using Depth from Non-Transparent Objects
This paper introduces a novel self-supervised learning method for completing depth maps of transparent objects, a challenging task for conventional sensors. By simulating transparent object depth deficits within non-transparent regions, the approach significantly reduces reliance on costly labeled data while achieving comparable performance to supervised methods.
CNN on 'Top': In Search of Scalable & Lightweight Image-based Jet Taggers
This paper explores the use of a lightweight and scalable EfficientNet architecture, combined with global jet features, for the computationally inexpensive yet competitive classification of top-quark jets. It aims to address the high computational demands of current state-of-the-art jet tagging methods like Transformers and GNNs.
Visual Reasoning Tracer: Object-Level Grounded Reasoning Benchmark
MLLMs often lack transparent reasoning, merely providing final predictions without intermediate steps or visual evidence. This paper introduces the Visual Reasoning Tracer (VRT) task and associated benchmarks (VRT-Bench, VRT-80k) to explicitly require models to localize intermediate objects in their reasoning paths, significantly enhancing model interpretability and reliability.
A Modular Architecture Design for Autonomous Driving Racing in Controlled Environments
This paper introduces a modular architecture for autonomous vehicles designed for racing in closed circuits. It integrates perception, localization, path planning, and control subsystems to achieve real-time, precise autonomous navigation in controlled environments.
Artificial Microsaccade Compensation: Stable Vision for an Ornithopter
This paper introduces "Artificial Microsaccade Compensation," a real-time video stabilization method inspired by biological microsaccades. It enables stable camera-based perception for aggressively shaking tailless ornithopters, a significant challenge for autonomous flapping-wing robots.