(Paper summary) CHEATING AUTOMATIC LLM BENCHMARKS: NULL MODELS ACHIEVE HIGH WIN RATES (Paper)

Key points

  • null model: a model that always returns the same meaningless output, whatever the input. With an injection attack, it can still earn high win rates:
    • 86.5% LC win rate on AlpacaEval 2.0
    • 83.0 score on Arena-Hard-Auto
    • 9.55 score on MT-Bench
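
To make the "null model" idea concrete, here is a minimal sketch in Python. It only illustrates the defining property (the output never depends on the instruction). The class name, the `generate` method, and the placeholder string are all assumptions made for illustration; the paper's actual crafted response and the benchmark judging pipelines are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NullModel:
    """Hypothetical null model: ignores the instruction entirely."""

    constant_response: str

    def generate(self, instruction: str) -> str:
        # The instruction is deliberately unused; every call returns
        # the same fixed string.
        return self.constant_response


# Placeholder only (an assumption): stands in for a fixed response that
# carries injected text aimed at the automatic judge.
PLACEHOLDER_RESPONSE = "<fixed response containing injected text>"

model = NullModel(PLACEHOLDER_RESPONSE)
instructions = [
    "Write a haiku about autumn.",
    "Explain quicksort.",
    "Translate 'hello' into French.",
]
outputs = [model.generate(q) for q in instructions]

# All outputs are identical, so there is exactly one distinct response.
print(len(set(outputs)))  # prints 1
```

Because `generate` never reads its argument, any score such a model receives comes from how the automatic judge reacts to the fixed text, not from answering the instruction.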