(Paper Summary) OpenAI o1 System Card

Key Points

Training
- The o1 models are trained with large-scale reinforcement learning to reason using chain of thought
- Through this training, the models learn to refine their thinking process, try different strategies, and recognize their mistakes
Data
- Public Data: reasoning data, scientific literature
- Proprietary Data from Data Partnerships: paywalled content, specialized archives, and other domain-specific datasets