AI Summary • Published on Apr 29, 2026
Semi-supervised learning combines limited labeled data with abundant unlabeled data to improve inference efficiency. Prediction-Powered Inference (PPI) leverages a single powerful predictor for this purpose, but its effectiveness hinges on that predictor's quality, which is often unknown in advance, can degrade under distribution shift, and is hard to gauge when many candidate AI models are available. This raises the question of how to robustly integrate multiple predictors into the PPI framework so that efficiency improves without prior knowledge of their individual performance.
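To ground the discussion, the sketch below shows standard single-predictor PPI for estimating a mean: the predictor's average over the unlabeled data is corrected by a "rectifier" estimated on the labeled data. This is a minimal illustration assuming generic NumPy/SciPy inputs; the function and variable names (ppi_mean_ci, Y_lab, preds_lab, preds_unlab) are placeholders, not taken from the paper.

```python
# Minimal sketch of standard prediction-powered inference (PPI) for a mean,
# using a single predictor. Names and signatures are illustrative assumptions.
import numpy as np
from scipy.stats import norm

def ppi_mean_ci(Y_lab, preds_lab, preds_unlab, alpha=0.05):
    """PPI confidence interval for E[Y]: predictor mean on unlabeled data
    plus a rectifier (bias correction) estimated from the labeled data."""
    n, N = len(Y_lab), len(preds_unlab)
    rectifier = Y_lab - preds_lab                      # labeled correction term
    theta_hat = preds_unlab.mean() + rectifier.mean()  # PPI point estimate
    var_hat = preds_unlab.var(ddof=1) / N + rectifier.var(ddof=1) / n
    z = norm.ppf(1 - alpha / 2)
    half = z * np.sqrt(var_hat)
    return theta_hat - half, theta_hat + half
```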
The paper proposes a Mixture of Experts (MOE)-powered Prediction-Powered Inference (PPI) framework. It treats the available predictors as experts and combines them into a composite predictor F_beta. The core idea is to find the mixture weights beta* that minimize the variance of the PPI-based estimator, in contrast to traditional model averaging, which typically targets mean squared error. This "oracle MOE" guarantees collective predictive power, performance at least as good as the best single expert, and safe expansion as new models are added. In practice, the framework estimates the optimal weights from the labeled data and constructs confidence sets from the resulting sample MOE. Estimating the weights introduces a slight bias, but the theoretical analysis shows it is negligible (of order O(n^-1)) and provides non-asymptotic guarantees for coverage probabilities and normal approximation across mean estimation, quantile estimation, linear regression, and logistic regression.
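The following sketch illustrates the MOE idea for the mean-estimation case: mixture weights are chosen to minimize a plug-in estimate of the PPI estimator's variance, and the fitted composite predictor is then used in the interval above. The sum-to-one constraint, the optimizer, and all names are illustrative assumptions rather than details from the paper.

```python
# Illustrative sketch of MOE-powered PPI for a mean: choose mixture weights
# beta to minimize a plug-in estimate of the PPI estimator's variance.
# Constraint and optimizer choices are assumptions for this sketch.
import numpy as np
from scipy.optimize import minimize

def fit_moe_weights(Y_lab, P_lab, P_unlab):
    """P_lab: (n, K) expert predictions on labeled data;
    P_unlab: (N, K) expert predictions on unlabeled data."""
    n, K = P_lab.shape
    N = P_unlab.shape[0]

    def ppi_variance(beta):
        f_lab = P_lab @ beta        # composite predictor on labeled data
        f_unlab = P_unlab @ beta    # composite predictor on unlabeled data
        rect = Y_lab - f_lab        # rectifier under the mixture
        return f_unlab.var(ddof=1) / N + rect.var(ddof=1) / n

    beta0 = np.full(K, 1.0 / K)
    cons = [{"type": "eq", "fun": lambda b: b.sum() - 1.0}]  # assumed constraint
    res = minimize(ppi_variance, beta0, constraints=cons, method="SLSQP")
    return res.x

# The fitted beta defines F_beta; feeding its labeled and unlabeled predictions
# into ppi_mean_ci above yields the MOE-powered confidence interval.
```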
Empirical experiments showed that the proposed MOE-powered inference consistently outperformed or matched PPI built on the single best predictor, and it was notably robust under model misspecification. For mean and quantile estimation, the method achieved near-nominal coverage with significantly narrower confidence intervals (e.g., width ratios of 0.2-0.35 relative to conventional methods). In linear and logistic regression, PPI-MOE delivered substantial efficiency gains in misspecified (nonlinear) settings (e.g., an MOE/Conventional ratio of 0.64) while remaining more stable than aggressive PPI alternatives. A power analysis further showed that PPI-MOE required significantly fewer labeled samples to reach a target statistical power, underscoring its practical efficiency and reduced annotation costs.
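As a back-of-the-envelope illustration of why narrower intervals translate into fewer required labels (standard root-n scaling, assuming the unlabeled-data contribution to the width is negligible; not a figure from the paper), the labeled-sample requirement for a fixed target width scales with the square of the width ratio:

$$\frac{n_{\mathrm{MOE}}}{n_{\mathrm{conv}}} \approx \left(\frac{\mathrm{width}_{\mathrm{MOE}}}{\mathrm{width}_{\mathrm{conv}}}\right)^{2},$$

so a width ratio of 0.5, for example, would correspond to roughly a four-fold reduction in labeled samples at the same target power.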
The MOE-powered framework offers a practical and robust approach to semi-supervised inference, achieving near-oracle performance without having to pre-select the best individual predictor, which is rarely feasible in practice. It strengthens robustness against model misspecification and reduces uncertainty, yielding narrower confidence intervals with reliable coverage. Crucially, the method improves labeling efficiency, requiring fewer labeled samples to reach a target statistical power and thereby lowering data collection costs. The findings suggest that aggregating experts in an inference-oriented manner is superior to relying on a single model or simple averaging, providing a valuable tool for high-quality prediction-powered inference. Future work could extend the approach to covariate-dependent weighting for further gains.