Governed, sourced lists of options. Only rankings that exist in the registry appear here.
Programming and agentic coding benchmarks for frontier language models.
Dev-only second ranking to prove DP18 shared catalog and per-ranking votes.