(Paper Summary) OpenAI o1 System Card (Paper)
Key Points
- Training
  - Trained with large-scale reinforcement learning to reason using chain of thought
  - The models learn to refine their thinking process, try different strategies, and recognize their mistakes
- Data
  - Public Data: reasoning data, scientific literature
  - Proprietary Data from Data Partnerships: paywalled content, specialized archives, other domain-specific datasets