(Blog summary) Universe: Scale Real-World Verifiable Environments to Millions

Key points

Builder agent
- Fixes the bug in the code base.
- Generates evaluation.sh.
Hacking Detector
- Reviews whether the generated evaluation.sh is a proper evaluation script.
Verifier agent
- Checks that the tests fail on the buggy branch and pass on the fixed branch.
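The Verifier agent's check can be sketched as a fail-to-pass test. This is a minimal illustration, not the paper's implementation: the function names, the branch names `buggy` and `fixed`, and the assumption that evaluation.sh signals success with exit code 0 are all hypothetical.

```python
import subprocess
from typing import Callable


def run_evaluation_script(repo_dir: str, ref: str) -> int:
    """Check out `ref` in `repo_dir` and return the exit code of evaluation.sh.

    Assumption: evaluation.sh lives at the repository root and exits with 0
    when the tests pass, non-zero otherwise.
    """
    subprocess.run(["git", "-C", repo_dir, "checkout", "--quiet", ref], check=True)
    result = subprocess.run(["bash", "evaluation.sh"], cwd=repo_dir)
    return result.returncode


def verify_environment(
    run_evaluation: Callable[[str], int],
    buggy_ref: str = "buggy",   # hypothetical branch name
    fixed_ref: str = "fixed",   # hypothetical branch name
) -> bool:
    """Accept the environment only if tests fail before the fix and pass after it."""
    buggy_exit = run_evaluation(buggy_ref)
    fixed_exit = run_evaluation(fixed_ref)
    return buggy_exit != 0 and fixed_exit == 0
```

In a real repository the check might be invoked as `verify_environment(lambda ref: run_evaluation_script("/path/to/repo", ref))`. Passing the runner as a callable keeps the accept/reject rule separate from git and shell details, so the rule can be exercised without a checkout.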
Dataset
- train: ~1M high-quality PRs (filtered from 33.3M GitHub pull requests from 2021–2025)
  - high quality: PRs linked with issue(s) and without excessive file changes
- benchmark: 320 pull requests randomly sampled from GitHub
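The "high quality" filter can be sketched as a simple predicate over PR metadata. This is only an illustration under stated assumptions: the `PullRequest` record, its field names, and the `MAX_CHANGED_FILES` threshold are hypothetical, since the summary does not say what counts as "excessive".

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical cutoff for "excessive file changes"; not taken from the source.
MAX_CHANGED_FILES = 10


@dataclass
class PullRequest:
    """Minimal, hypothetical PR record holding only the fields the filter needs."""
    number: int
    linked_issues: List[int] = field(default_factory=list)
    changed_files: int = 0


def is_high_quality(pr: PullRequest, max_changed_files: int = MAX_CHANGED_FILES) -> bool:
    """Keep PRs that link at least one issue and touch a bounded number of files."""
    has_issue = len(pr.linked_issues) > 0
    bounded_change = 0 < pr.changed_files <= max_changed_files
    return has_issue and bounded_change


def filter_high_quality(prs: List[PullRequest]) -> List[PullRequest]:
    """Return only the PRs that satisfy the high-quality predicate."""
    return [pr for pr in prs if is_high_quality(pr)]
```

For example, a PR with a linked issue and 3 changed files is kept, while a PR with no linked issue or with 500 changed files is dropped.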
Models
- Qwen3-Next-80B-A3B
- Qwen3-Max-Thinking