AI Summary • Published on Mar 5, 2026
Explainable Artificial Intelligence (XAI) aims to make machine learning systems more transparent, but many existing methods offer generic explanations that are difficult for non-experts to understand. While Large Language Models (LLMs) can translate technical XAI outputs into natural language, this introduces new challenges: keeping the explanation faithful to the underlying model and preventing LLM hallucinations. Current approaches to XAI narratives often fail to address these issues comprehensively or to account adequately for diverse user expertise, goals, and cognitive needs, resulting in a "one-size-fits-all" paradigm that limits practical utility.
The authors propose PONTE (Personalized Orchestration for Natural language Trustworthy Explanations), a human-in-the-loop framework designed to generate adaptive and faithful XAI narratives. Instead of relying solely on prompt engineering, PONTE frames personalization as a closed-loop interaction involving several coordinated components. A Contextual Preference Model (CPM) captures user stylistic preferences across dimensions like technicality, verbosity, depth, and actionability. A Narrative Generator uses an LLM to create explanations from structured XAI artifacts, conditioned by the CPM. To ensure reliability, the system incorporates Verifiers, including a Faithfulness Verifier for numerical correctness and informational completeness, and a Style Alignment Verifier that ensures consistency with user preferences. Optionally, a Retrieval-Grounded Argumentation module uses RAG over curated knowledge bases to substantiate claims, mitigating hallucinations. User feedback then iteratively updates the CPM through a Feedback Integrator, closing the personalization loop and allowing the system to adapt to individual needs. The framework is designed to be agnostic to both the predictive model and the XAI technique used.
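The closed loop described above can be sketched in simplified form. This is an illustrative reconstruction, not the authors' implementation: the class and function names, the 0–1 preference encoding, the exponential feedback update, and the string-matching faithfulness check are all assumptions standing in for PONTE's actual LLM-backed components.

```python
from dataclasses import dataclass, field

@dataclass
class ContextualPreferenceModel:
    """User stylistic preferences on an assumed 0-1 scale, one value
    per dimension named in the paper."""
    prefs: dict = field(default_factory=lambda: {
        "technicality": 0.5, "verbosity": 0.5,
        "depth": 0.5, "actionability": 0.5,
    })

    def update(self, feedback: dict, lr: float = 0.3) -> None:
        # Feedback Integrator (sketch): nudge each dimension toward
        # the user's signalled target with an exponential update.
        for dim, target in feedback.items():
            self.prefs[dim] += lr * (target - self.prefs[dim])

def generate_narrative(artifact: dict, cpm: ContextualPreferenceModel) -> str:
    # Stand-in for the LLM-based Narrative Generator, conditioned on
    # the CPM and a structured XAI artifact (e.g. a feature attribution).
    return (f"Feature '{artifact['feature']}' contributed "
            f"{artifact['attribution']} to the prediction. "
            f"(style: {cpm.prefs})")

def verify_faithfulness(narrative: str, artifact: dict) -> bool:
    # Faithfulness Verifier (sketch): the quoted number must match
    # the source artifact exactly.
    return str(artifact["attribution"]) in narrative

def explain(artifact: dict, cpm: ContextualPreferenceModel,
            max_refinements: int = 3) -> str:
    # Verification-refinement loop: regenerate until verifiers pass
    # or the iteration budget is exhausted.
    for _ in range(max_refinements):
        narrative = generate_narrative(artifact, cpm)
        if verify_faithfulness(narrative, artifact):
            return narrative
    return narrative
```

In the real system the generator is an LLM call, the verifiers also check informational completeness and style alignment, and an optional RAG module grounds factual claims; the loop structure, however, is the same: generate, verify, refine, and fold user feedback back into the CPM.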
Automatic and human evaluations across healthcare and finance domains demonstrated PONTE's effectiveness. The verification-refinement loop significantly improved informational completeness and stylistic alignment compared to single-pass, validation-free generation, while maintaining perfect faithfulness. For instance, completeness on the Diabetes dataset increased from 0.80 to 0.99 using Kimi, and style alignment rose from 0.39 to 0.94 using GPT-OSS, with only a modest average of 1.1–1.8 refinement iterations. Both automated "agent-as-a-judge" and human interactive simulations showed reliable and rapid convergence of user preferences. Human evaluations indicated strong agreement between the intended stylistic profiles and perceived narrative style (alignment scores ≈0.75–0.78, Spearman’s ρ>0.80), demonstrating robustness to generation stochasticity. User endorsement of narrative quality, utility, and satisfaction was consistently high across various personas, particularly for non-expert roles like patients and loan applicants.
PONTE represents a significant advancement in generating personalized and trustworthy XAI narratives, addressing the critical gap between technical explanations and user comprehension. By moving beyond simple prompt engineering to a robust closed-loop validation and adaptation process, the framework enhances the reliability, faithfulness, and user alignment of AI explanations. This contributes to more human-centered and accountable AI systems, especially in high-stakes decision-making contexts. Future work will focus on strengthening guarantees for retrieval quality and source attribution, and on conducting larger-scale user studies to assess behavioral impact.