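PReMISE treats reusable rubrics as the measurement specification for an LLM judge: changing the rubric changes what a fixed judge measures as response quality. The framework discovers policy-level rubrics from pairwise human-preference data and audits them along four axes: structural adequacy, reliability, preference fit, and adversarial robustness. The key results are that preference-rank selection raises paired-response judge accuracy from 65.0% to 68.6%, while reliability-constrained refinement lowers the share of exploit responses that receive high scores from 46.4% to 36.0%.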
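To make the two headline metrics concrete, here is a minimal Python sketch of how they could be computed. It is not PReMISE's implementation. It assumes that paired-response judge accuracy means the fraction of human-labelled pairs in which the judge scores the preferred response strictly higher, and that the exploit rate is the fraction of exploit responses scoring at or above a chosen threshold. The names `toy_judge`, `paired_accuracy`, `exploit_high_score_rate`, and the `threshold` parameter are all hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical interface: a judge maps (rubric, response) to a numeric score.
Judge = Callable[[str, str], float]


def toy_judge(rubric: str, response: str) -> float:
    """Stand-in for an LLM judge: counts comma-separated rubric terms found in the response."""
    terms = [t.strip().lower() for t in rubric.split(",") if t.strip()]
    text = response.lower()
    return float(sum(term in text for term in terms))


def paired_accuracy(
    judge: Judge, rubric: str, pairs: Sequence[Tuple[str, str]]
) -> float:
    """Fraction of (preferred, rejected) pairs where the judge scores the
    human-preferred response strictly higher (assumed definition; ties count as misses)."""
    if not pairs:
        raise ValueError("pairs must be non-empty")
    hits = sum(judge(rubric, good) > judge(rubric, bad) for good, bad in pairs)
    return hits / len(pairs)


def exploit_high_score_rate(
    judge: Judge, rubric: str, exploits: Sequence[str], threshold: float
) -> float:
    """Fraction of exploit responses scoring at or above `threshold`
    (assumed definition; the threshold is a free parameter of this sketch)."""
    if not exploits:
        raise ValueError("exploits must be non-empty")
    high = sum(judge(rubric, r) >= threshold for r in exploits)
    return high / len(exploits)
```

Under these assumptions, swapping the `rubric` string while keeping `judge` fixed changes both numbers, which is the sense in which the rubric acts as the judge's measurement specification.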