Mano: Restriking Manifold Optimization for LLM Training
This paper introduces Mano, an optimizer for training large language models that revisits manifold optimization. To address limitations of existing optimizers such as AdamW and Muon, Mano projects the momentum onto the tangent space of the model parameters and constrains it to a rotational Oblique manifold. The authors report superior performance and efficiency.
Toward Fully Autonomous Driving: AI, Challenges, Opportunities, and Needs
This paper reviews the current state of autonomous driving and identifies limitations in scalability and adaptability. To reach fully autonomous driving, it proposes a data-driven, two-stage fine-tuning process and a "service-oriented modular end-to-end (SO-M-E2E)" architecture, while integrating technological and socio-political aspects.
From Basins to Safe Sets: A Machine Learning Perspective on Chaotic Dynamics
This perspective article explores how modern machine learning techniques, such as convolutional neural networks and transformer architectures, can accelerate the analysis and control of chaotic dynamics. It highlights their potential to overcome the computational limitations of traditional methods in tasks like basin characterization and partial control, opening doors for real-time applications.
Early and Prediagnostic Detection of Pancreatic Cancer from Computed Tomography
This paper introduces ePAI, an AI-powered system designed for the early and prediagnostic detection of pancreatic ductal adenocarcinoma (PDAC) from routine computed tomography (CT) scans. The system demonstrates high accuracy in detecting small lesions, significantly outperforms radiologists, and offers a median lead time of 347 days before clinical diagnosis.
Dependence of Equilibrium Propagation Training Success on Network Architecture
This paper investigates how network architecture, specifically locally connected lattices, impacts the success of Equilibrium Propagation (EP) training in neuromorphic systems. It demonstrates that sparse networks with local connections can achieve performance comparable to dense networks, offering guidelines for scaling up EP-based architectures in realistic settings.