AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science
This paper introduces AgentDS, a new benchmark and competition designed to evaluate the performance of AI agents and human-AI collaboration in domain-specific data science tasks across six diverse industries. The findings indicate that while current AI agents struggle with domain-specific reasoning, the most effective solutions emerge from human-AI collaboration, highlighting the enduring value of human expertise.
Security, privacy, and agentic AI in a regulatory view: From definitions and distinctions to provisions and reflections
This paper reviews 24 EU AI regulatory documents from 2024-2025 to clarify definitions and provisions related to security, privacy, and agentic AI. It aims to resolve ambiguities and align regulatory obligations with the evolving capabilities of AI, particularly autonomous agents.
From Accuracy to Readiness: Metrics and Benchmarks for Human-AI Decision-Making
This paper introduces a novel measurement framework to evaluate human-AI decision-making, shifting focus from mere model accuracy to the readiness of human-AI teams for safe and effective collaboration. It proposes a taxonomy of metrics and connects them to the Understand–Control–Improve lifecycle to assess calibration, error recovery, and governance in real-world deployments.
Integrated Channel Sounding and Communication: Requirements, Architecture, Challenges, and Key Technologies
This paper proposes an Integrated Channel Sounding and Communication (ICSC) framework that tightly couples the two functions to address the limitations of traditional channel modeling in dynamic, complex wireless networks such as space-air-ground-sea integrated networks (SAGSIN). ICSC enables real-time acquisition of channel characteristics, intelligent scenario identification, and adaptive waveform optimization to enhance communication performance and establish comprehensive channel model libraries.
AI/ML for mobile networks: Current status in Rel. 19 and challenges ahead
This paper provides a comprehensive review of 3GPP standardization efforts for integrating AI/ML into mobile networks, focusing on Release 18 and the upcoming Release 19. It outlines the general AI/ML framework and key use cases, and identifies significant challenges in dataset preparation, generalization evaluation, and model selection.