Articles tagged with: Benchmarking

Showing 3 results for this tag.

Advanced·Feb 18, 2026

Pareto Optimal Benchmarking of AI Models on ARM Cortex Processors for Sustainable Embedded Systems

This paper introduces a practical framework for benchmarking and optimizing AI models on ARM Cortex processors in embedded systems. It focuses on balancing energy efficiency, accuracy, and resource utilization, demonstrating how optimal processor and model selections depend on an application's inference cycle time.

Energy Efficiency

Benchmarking

Edge AI

Intermediate·Dec 18, 2025

Medical Imaging AI Competitions Lack Fairness

This paper systematically investigates fairness in medical imaging AI benchmarking competitions, revealing significant biases in dataset composition and critical flaws in data accessibility, licensing, and documentation. The findings highlight a disconnect between leaderboard success and clinically meaningful AI and call for improved transparency and reusability standards.

AI Ethics

Benchmarking

Medical Imaging

Advanced·Dec 2, 2025

Evaluating Long-Context Reasoning in LLM-Based WebAgents

This paper introduces a benchmark for evaluating the long-context reasoning capabilities of WebAgents through sequentially dependent subtasks that require retrieving and applying information from extended interaction histories. It observes a dramatic performance degradation as context length increases and proposes an implicit RAG approach that yields modest improvements.

LLM Agents

Long Context

Benchmarking

Research Guy

Understand New Research — Instantly

Daily AI-generated explanations of the latest arXiv papers.
