Articles tagged with: Annotation Quality

Showing 1 result for this tag.

Advanced · Jan 12, 2026

Pervasive Annotation Errors Break Text-to-SQL Benchmarks and Leaderboards

This paper exposes widespread annotation errors in two leading text-to-SQL benchmarks, BIRD and Spider 2.0-Snow, and shows how these errors distort model performance evaluations and leaderboard rankings. It also introduces SAR-Agent and SAPAR, an AI-powered toolkit for detecting and correcting such errors, and argues for higher-quality benchmark development.

Text-to-SQL

Annotation Quality

Benchmarks

Research Guy
