AI Summary • Published on Mar 25, 2026
The high dimensionality of complex molecular systems makes their transition pathways difficult to characterize. Identifying the "reaction coordinate" (RC), a variable that characterizes progress from reactant to product states, is crucial but often relies on intuition and trial-and-error with traditional collective variables (CVs). While committor analysis offers a reliable measure of progress along a transition path, it remains a manual, hypothesis-driven procedure. Existing machine learning methods address some of these limitations, but deep learning models often operate as "black boxes," making it difficult to determine which input variables drive their predictions of the RC.
The paper presents a deep learning framework that uses committor values as the learning target to identify reaction coordinates. CVs such as dihedral angles or atom-centered symmetry functions (ACSFs) are fed into a neural network, which then predicts the RC. To overcome the black-box nature of deep learning, the framework integrates Explainable Artificial Intelligence (XAI) techniques, specifically LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations). These XAI methods quantify the contribution of each input CV to the network's predictions. By minimizing a cross-entropy loss function, derived from the Kullback–Leibler divergence, the model learns an RC that monotonically follows the committor. The framework was applied to systems such as alanine dipeptide isomerization and NaCl ion pair dissociation, using different sets of CVs.
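The training objective described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the simplest possible model (an RC that is linear in the CVs), assumes the committor is mapped from the RC with a logistic sigmoid, and uses synthetic data. The names `fit_linear_rc`, `cross_entropy`, and `true_w` are hypothetical and introduced only for illustration.

```python
import numpy as np

def sigmoid(q):
    """Map a reaction coordinate q to a committor-like value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-q))

def cross_entropy(p_target, p_model, eps=1e-12):
    """Mean cross-entropy between target committors and model committors."""
    p_model = np.clip(p_model, eps, 1.0 - eps)
    return -np.mean(p_target * np.log(p_model)
                    + (1.0 - p_target) * np.log(1.0 - p_model))

def fit_linear_rc(cvs, committor, lr=0.5, n_steps=2000):
    """Fit q = cvs @ w + b by gradient descent on the cross-entropy loss.

    Assumption: a linear RC stands in for the neural network here.
    """
    n_samples, n_cvs = cvs.shape
    w = np.zeros(n_cvs)
    b = 0.0
    for _ in range(n_steps):
        p = sigmoid(cvs @ w + b)
        # For a sigmoid output, d(loss)/dq = (p_model - p_target) / n_samples.
        grad_q = (p - committor) / n_samples
        w -= lr * (cvs.T @ grad_q)
        b -= lr * grad_q.sum()
    return w, b

# Synthetic data (an assumption): three CVs, only the first two matter.
rng = np.random.default_rng(0)
cvs = rng.uniform(-1.0, 1.0, size=(500, 3))
true_w = np.array([2.0, -1.0, 0.0])
committor = sigmoid(cvs @ true_w)

w, b = fit_linear_rc(cvs, committor)
loss = cross_entropy(committor, sigmoid(cvs @ w + b))
print("fitted weights:", np.round(w, 2), "bias:", round(b, 2))
```

Because the sigmoid is monotonic, any RC fitted this way follows the committor monotonically by construction. In this linear case the fitted weights already indicate which CVs matter; a neural network gives no such direct reading, which is where LIME and SHAP come in.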
The deep learning framework successfully identified reaction coordinates for both alanine dipeptide isomerization and NaCl ion pair dissociation. For alanine dipeptide, the models (both linear regression and deep neural networks) reproduced a sigmoidal dependence of the committor on the predicted RC, and the committor distribution of configurations near the transition state showed a clear peak at 0.5. XAI analysis (LIME and SHAP) consistently showed that specific dihedral angles, particularly φ and θ, were the dominant contributors. The most influential angle shifted from φ to θ near the transition state, a local feature that global linear models do not capture. For NaCl ion pair dissociation, the framework used 1,296 ACSFs as CVs. SHAP analysis identified two specific G5 ACSFs as the largest contributors; combined with the interionic distance, they defined a clear separatrix for the transition state. These ACSFs correlated with previously established water bridging structures and interionic water density, confirming their physical relevance. The study also explored hyperparameter tuning, finding that although different hyperparameter sets could yield similar predictive performance, the identified contributing CVs remained robust.
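The per-CV attributions reported above rest on Shapley values, which can be computed exactly when there are only a few inputs. The sketch below is a brute-force illustration of that idea, not the SHAP library and not the paper's code: it assumes a toy three-input RC model, and it assumes "absent" features are replaced by a fixed baseline value. The names `shapley_values` and `toy_rc` are hypothetical.

```python
import itertools
import math
import numpy as np

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    The cost grows as 2**n, so this is only feasible for a few features.
    """
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, size):
                idx = np.array(subset, dtype=int)
                z_without = baseline.copy()
                z_without[idx] = x[idx]
                z_with = z_without.copy()
                z_with[i] = x[i]
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (model(z_with) - model(z_without))
    return phi

def toy_rc(z):
    """Toy RC (an assumption): two linear terms plus one interaction."""
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[0] * z[2]

x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)
phi = shapley_values(toy_rc, x, baseline)
print(phi)  # values: 2.75, -2.0, 0.75
```

The attributions sum to `toy_rc(x) - toy_rc(baseline)`, and the interaction term is split evenly between the two inputs it couples. With 1,296 ACSFs an exact enumeration like this is out of reach, which is why approximate estimators such as those in SHAP are needed at that scale.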
This explainable deep learning framework offers a systematic, data-driven strategy for identifying physically meaningful reaction coordinates in complex molecular systems, moving beyond the traditional reliance on physical intuition and trial-and-error. By integrating XAI, the method not only predicts RCs but also makes them interpretable, revealing which molecular features are crucial for transition processes. This capability advances the understanding of rare-event mechanisms in theoretical and computational chemistry. Future improvements are anticipated through better CV definitions, integration with advanced sampling techniques, and more sophisticated machine learning architectures, which could reduce dependence on predefined CVs and enable analysis of larger and more intricate molecular environments.