Articles tagged with: Model Interpretability

Showing 1 result for this tag.

Advanced·Jan 12, 2026

Evaluating the Ability of Explanations to Disambiguate Models in a Rashomon Set

This paper introduces three principles for evaluating feature-importance explanations and proposes AXE, a novel framework designed to accurately differentiate models within a Rashomon set. AXE detects adversarial fairwashing, in which misleading explanations intentionally mask discriminatory model behavior, and it outperforms existing evaluation metrics.

Model Interpretability

Fairness

Explainable AI

Research Guy

Understand New Research — Instantly

Daily AI-generated explanations of the latest arXiv papers.
