AI Summary • Published on Apr 20, 2026
Traditional Explainable Artificial Intelligence (XAI) methods predominantly generate model-centric explanations, often failing to account for the diverse goals, preferences, and cognitive limitations of individual users. While some user-centric and personalized approaches exist, they frequently rely on heuristic adaptations or implicit user modeling and lack a formal framework for representing and learning explicit preferences. Furthermore, existing personalized explanation methods, such as those leveraging Large Language Models, often suffer from low fidelity and faithfulness. Even rule-based XAI, despite its high interpretability, focuses primarily on model properties rather than on tailoring explanations to individual users, producing outputs that can be too complex or too generic.
The authors propose Preference-Based Explainable Artificial Intelligence (PREF-XAI), a novel perspective that redefines explanation as a preference-driven decision problem. Instead of producing fixed outputs, explanations are treated as alternatives to be evaluated and selected according to user-specific criteria. The methodology provides personalized local explanations for black-box machine learning models using "if-then" decision rules, which are inherently human-interpretable. The process begins by training a black-box model (e.g., an MLP) and inducing a comprehensive set of "if-then" rules that accurately reflect the model's decision logic for specific instances. To personalize these rules, user preferences are elicited by having the user rank a small, manageable subset of candidate rules. These preferences are then formally modeled with an additive utility function, the Preference-based Rule Utility Score (PRUS), which evaluates rules on criteria such as the presence of specific features, rule support, rule confirmation (using Nozick's N measure), and rule complexity (with a penalty that favors shorter rules).

Robust Ordinal Regression (ROR) is employed to infer the weights of the PRUS function, identifying the set of compatible weight vectors that reproduce the user's reference ranking. To resolve the ambiguity of multiple compatible weight vectors, two strategies are used: Max ε, which maximizes the minimum utility difference between consecutive rules in the ranking, and the Hit-and-Run (H&R) algorithm, which uniformly samples the compatible weight space to derive either a geometric centroid (H&RC) or an unordered set of "first rules" (H&RFR). The final rules, ranked according to the user's preferences, are presented as explanations, with the option to iteratively refine the preference model.
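To make the scoring and weight-inference steps concrete, here is a minimal Python sketch of an additive PRUS and of deriving a representative weight vector from the set compatible with a user's ranking. All rule names, criterion values, and function names are illustrative, not taken from the paper, and uniform rejection sampling over the weight simplex is used here as a simple stand-in for the Hit-and-Run sampler.

```python
import random

# Hypothetical per-rule criterion scores (feature presence, support,
# Nozick confirmation, complexity penalty), each scaled to [0, 1].
RULES = {
    "r1": [0.9, 0.6, 0.7, 0.8],
    "r2": [0.4, 0.9, 0.5, 0.6],
    "r3": [0.2, 0.3, 0.9, 0.4],
}

def prus(weights, criteria):
    """Additive Preference-based Rule Utility Score: w . phi(rule)."""
    return sum(w * c for w, c in zip(weights, criteria))

def compatible(weights, ranking):
    """A weight vector is compatible if it reproduces the user's ranking."""
    scores = [prus(weights, RULES[r]) for r in ranking]
    return all(a >= b for a, b in zip(scores, scores[1:]))

def sample_simplex(k):
    """Uniform sample from the k-dimensional probability simplex."""
    cuts = sorted(random.random() for _ in range(k - 1))
    points = [0.0] + cuts + [1.0]
    return [b - a for a, b in zip(points, points[1:])]

def centroid_of_compatible(ranking, n_samples=20000, seed=0):
    """Stand-in for the Hit-and-Run centroid (H&RC): average all sampled
    weight vectors that reproduce the reference ranking. The compatible
    region is convex, so the centroid is itself compatible."""
    random.seed(seed)
    kept = [w for w in (sample_simplex(4) for _ in range(n_samples))
            if compatible(w, ranking)]
    return [sum(ws) / len(kept) for ws in zip(*kept)]

# Suppose the user ranked r1 > r2 > r3; infer a representative weight
# vector, which can then score any unseen candidate rule.
w_c = centroid_of_compatible(["r1", "r2", "r3"])
```

In the paper's actual pipeline, Max ε would instead be obtained by solving a linear program that maximizes the smallest utility gap between consecutively ranked rules; the centroid variant above illustrates only the H&RC idea.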
Computational experiments were conducted on three real-world tabular datasets (Churn Banking, Churn Telecom, HELOC) using a trained Multilayer Perceptron (MLP) as the black-box model. The results demonstrated a high alignment between the top-ranked rules identified by the algorithm (Max ε and H&RC) and the true most preferred rules of the simulated users, as measured by the Jaccard index for top-5 and top-10 rule sets. This indicates effective generalization of user preferences rather than mere memorization. The robust ordinal regression models accurately reproduced the underlying Preference-based Rule Utility Score (PRUS), with positive Kendall's τ correlations for both ranking fidelity and parameter recovery. Crucially, the methodology proved capable of discovering novel, highly relevant rules that were not part of the initial user-provided reference set, appearing among the top positions in the final rankings. When comparing the two main optimization strategies, the Hit-and-Run Centroid (H&RC) generally achieved slightly higher median correlations for ranking fidelity and parameter recovery, indicating a more robust and faithful overall approximation of user preferences. Conversely, the Max ε strategy showed a more aggressive exploratory behavior, consistently elevating a higher number of novel rules, making it potentially advantageous when rule discovery is a primary goal. The study also noted that increasing the length of the initial reference ranking improved performance up to a saturation point of approximately 12 to 14 rules, balancing fidelity with user cognitive burden.
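The two evaluation measures reported above are standard and easy to state precisely; the sketch below implements top-k Jaccard overlap and Kendall's τ (the tau-a variant, without tie correction) in plain Python. The example rule names are hypothetical, not values from the study.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard index between two rule sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a between two rankings of the same items:
    (concordant pairs - discordant pairs) / total pairs."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    conc = disc = 0
    for x, y in combinations(rank_a, 2):
        agree = (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y])
        if agree > 0:
            conc += 1
        elif agree < 0:
            disc += 1
    n = len(rank_a)
    return (conc - disc) / (n * (n - 1) / 2)

# Hypothetical top-5 sets: the simulated user's true favorites vs. the
# rules ranked highest by the inferred preference model.
true_top5 = ["r1", "r2", "r3", "r4", "r5"]
model_top5 = ["r1", "r3", "r2", "r4", "r6"]
# jaccard(true_top5, model_top5) is 4/6: four shared rules, six in the union.
```

A positive τ on held-out rules, as reported in the experiments, means the inferred PRUS orders rule pairs the same way as the user's underlying utility more often than not.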
The PREF-XAI framework represents a significant shift in Explainable AI, moving from static, model-centric explanations to adaptive, preference-driven alternatives. This opens new avenues for creating interactive and adaptive explanation systems that are tailored to individual user needs and goals. Future research directions include extending the proposed methodology to personalize other explainability methods, such as counterfactuals and prototypes, and adapting it to handle different data modalities beyond tabular datasets. Additionally, the authors highlight the importance of future empirical studies involving human decision-makers to thoroughly assess how these personalized explanations impact trust, understanding, and practical decision-making in real-world applications.