AI Summary • Published on Jan 20, 2026
Machine learning and artificial intelligence conferences face unprecedented growth in submissions, making it increasingly difficult to keep the peer review process consistent and high-quality. The problem is particularly acute for best paper awards, whose selection has become controversial enough to threaten the credibility of these prestigious accolades. With review scores that are noisy and often arbitrary, structural reforms are needed to reliably identify truly outstanding research.
The paper introduces an author-assisted approach to best paper selection built on the Isotonic Mechanism. The mechanism elicits each author's ranking of their own submissions and uses it to adjust the raw review scores toward a better estimate of ground-truth quality. A key theoretical contribution shows that authors are incentivized to report truthfully whenever their utility is a nondecreasing, convex, additive function of the adjusted scores; crucially, when an author nominates only one paper, truthfulness holds under the weaker assumption that utility is merely nondecreasing. The mechanism extends to overlapping authorship and covers two evaluation scenarios: a 'Blind Case,' where decision-makers see only the adjusted scores, and an 'Informed Case,' where they also have access to the author rankings. Empirical validation on historical ICLR and NeurIPS data shows that the utility function, interpreted as the conditional probability of winning a best paper award, is indeed convex.
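Concretely, the adjustment step is an isotonic regression: given the raw scores and the author's reported order, it returns the closest score vector, in squared error, that is nonincreasing along that order. The sketch below illustrates this for a single author with one aggregated review score per paper; the function name and setup are illustrative, not taken from the paper's code.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def isotonic_adjust(raw_scores, author_ranking):
    """Adjust raw review scores to respect an author's reported ranking.

    raw_scores     : shape (n,), one aggregated review score per paper.
    author_ranking : paper indices as reported by the author, best first.
    Returns adjusted scores, indexed like raw_scores: the closest vector
    in squared error that is nonincreasing along the reported ranking.
    """
    raw_scores = np.asarray(raw_scores, dtype=float)
    # Reorder scores so position 0 holds the author's claimed best paper.
    y = raw_scores[author_ranking]
    # Isotonic regression with a nonincreasing constraint solves
    #   min ||y - s||^2  subject to  s[0] >= s[1] >= ... >= s[n-1],
    # pooling adjacent scores that violate the reported order.
    s = IsotonicRegression(increasing=False).fit_transform(np.arange(len(y)), y)
    # Map the adjusted scores back to the original paper indices.
    adjusted = np.empty_like(raw_scores)
    adjusted[author_ranking] = s
    return adjusted

# Example: the author reports paper 2 > paper 0 > paper 1, but the noisy
# raw scores disagree; the violating pair (papers 0 and 1) is pooled.
print(isotonic_adjust([6.0, 7.5, 7.0], author_ranking=[2, 0, 1]))
# -> [6.75, 6.75, 7.0]
```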
In simulations on synthetic conference data, including a replication of the ICLR 2021 co-authorship network, the Isotonic Mechanism significantly improves the quality of papers selected for awards. The 'Blind Protocol,' which uses the adjusted scores without explicit ranking information for the final selection, consistently and substantially outperforms the raw-score benchmark, especially in noisy environments and denser collaboration networks. The 'Informed, Max' protocol, which prioritizes papers ranked highest by all of their co-authors, shows context-dependent performance and sometimes underperforms the benchmark in complex network structures. Empirical analysis of ICLR and NeurIPS data from 2019-2023 supports the convexity assumption for utility functions framed as the probability of receiving a best paper award, in contrast to the sigmoidal shape observed for general paper acceptance probabilities. The convexity is attributed to the 'unbounded quality' of top-tier papers relative to the bounded range of review scores.
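The flavor of this comparison can be reproduced with a toy experiment: disjoint authorship, truthful rankings, Gaussian review noise, and award selection by global argmax. Everything below (the flat author structure and all parameter values) is a simplification for illustration, not the paper's actual simulation setup, which uses the replicated ICLR 2021 co-authorship network.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
n_authors, papers_each, noise_sd, n_trials = 20, 5, 2.0, 2_000
positions = np.arange(papers_each)

hits_raw = hits_blind = 0
for _ in range(n_trials):
    quality = rng.normal(size=(n_authors, papers_each))    # latent ground truth
    scores = quality + rng.normal(scale=noise_sd, size=quality.shape)
    adjusted = np.empty_like(scores)
    for a in range(n_authors):
        ranking = np.argsort(-quality[a])  # truthful ranking, best paper first
        adjusted[a, ranking] = IsotonicRegression(increasing=False).fit_transform(
            positions, scores[a, ranking])
    best = np.argmax(quality)                  # flat index of the true best paper
    hits_raw += np.argmax(scores) == best      # benchmark: select by raw scores
    hits_blind += np.argmax(adjusted) == best  # Blind Protocol: adjusted scores

print(f"P(award goes to true best | raw scores)     ~ {hits_raw / n_trials:.3f}")
print(f"P(award goes to true best | Blind Protocol) ~ {hits_blind / n_trials:.3f}")
```

The gap between the two printed probabilities widens as `noise_sd` grows, matching the summary's observation that the Blind Protocol's advantage is largest in noisy environments.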
This work offers a transparent, verifiable, and readily deployable way to improve best paper selection at academic conferences, contributing to the broader effort against degrading review quality. Relaxing the convexity assumption to mere monotonicity for single nominations substantially lowers the barrier to adoption and makes the mechanism more robust. The framework's core insight, that high-fidelity ordinal information from insiders can improve the selection of top-tier items, has implications well beyond academic peer review, extending to grant allocation, corporate performance evaluation, hiring, and crowdsourcing platforms. Future work includes developing more advanced computational techniques for the Informed Case, probing the boundaries of assumptions such as utility function additivity and noise exchangeability, and conducting a live deployment at an actual conference to gather real-world feedback.