Fast and Forgettable: A Controlled Study of Novices' Performance, Learning, Workload, and Emotion in AI-Assisted and Human Pair Programming Paradigms
This paper investigates the impact of AI-assisted programming (GitHub Copilot) versus human pair programming on novice programmers' performance, learning, workload, and emotional experiences. Findings indicate that while AI assistance improves performance and reduces workload, human pairing fosters more positive emotional engagement and potentially better long-term learning retention, suggesting a need to re-evaluate the role of traditional pair programming in education.
Continuous benchmarking: Keeping pace with an evolving ecosystem of models and technologies
This paper introduces CI-beNNch, an automated continuous benchmarking framework for high-performance computing (HPC) applications, drawing on continuous integration principles. It addresses critical issues of reproducibility and usability in scientific software development by abstracting configurations and streamlining execution across diverse computing environments.
Dual-Modal Lung Cancer AI: Interpretable Radiology and Microscopy with Clinical Risk Integration
This study introduces a dual-modal AI framework combining CT radiology and H&E microscopy with clinical data for improved lung cancer diagnosis and subtype classification. The system demonstrates high accuracy and interpretability, offering a more robust and transparent approach to overcome the limitations of single-modality diagnostic methods.
Enhancing AI and Dynamical Subseasonal Forecasts with Probabilistic Bias Correction
This paper introduces Probabilistic Bias Correction (PBC), a machine learning framework designed to significantly improve subseasonal weather forecasts (2-6 weeks ahead) by correcting systematic errors in existing dynamical and AI models. PBC has demonstrated superior performance in real-time forecasting competitions, enhancing predictions for temperature, pressure, and precipitation, and improving the accuracy of extreme event warnings.
AI-Assisted Requirements Engineering: An Empirical Evaluation Relative to Expert Judgment
This paper empirically evaluates AI-assisted requirements engineering against expert judgments, showing that AI can reliably handle initial quality checks but that deeper analysis still requires human expertise.