(Paper Summary) SIMA 2: An Agent that Plays, Reasons, and Learns With You in Virtual 3D Worlds

(Paper Summary) SIMA 2: An Agent that Plays, Reasons, and Learns With You in Virtual 3D Worlds (blog)

Key points

Initial learning: human demonstrations
Subsequent training: SIMA 2’s own experience data can then be used to train the next, even more capable version of the agent.
- Gemini provides an initial task and an estimated reward for SIMA 2’s behavior.
- The agent uses the resulting experience for further training in subsequent generations, improving on tasks it previously failed.
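The loop described above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not the method used for SIMA 2: the names `propose_tasks`, `estimate_reward`, and `train_next_generation` are hypothetical stand-ins for the Gemini task setter, the Gemini reward estimate, and the training step. The task names, the numeric "skill" and difficulty values, and the rule that only failed episodes drive improvement are all invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed toy difficulties; a task succeeds once skill >= difficulty.
DIFFICULTY: Dict[str, float] = {"chop tree": 0.5, "build shelter": 0.9}


@dataclass(frozen=True)
class Episode:
    task: str
    reward: float  # estimated reward, not a ground-truth game score


def propose_tasks() -> List[str]:
    """Hypothetical stand-in for the model that provides tasks."""
    return list(DIFFICULTY)


def estimate_reward(skill: float, difficulty: float) -> float:
    """Hypothetical stand-in for the reward estimate: 1.0 on success, else 0.0."""
    return 1.0 if skill >= difficulty else 0.0


def train_next_generation(
    base_skills: Dict[str, float], bank: List[Episode], step: float = 0.4
) -> Dict[str, float]:
    """Toy 'training': every failed episode in the bank raises skill on its task.

    Assumption for illustration only; a real system would update model weights.
    """
    skills = dict(base_skills)
    for episode in bank:
        if episode.reward < 1.0:
            skills[episode.task] = min(1.0, skills[episode.task] + step)
    return skills


def run_self_improvement(generations: int = 3) -> List[Dict[str, float]]:
    """Return the estimated reward per task for each generation."""
    # Starting skill stands in for what was learned from human demonstrations.
    base_skills = {task: 0.2 for task in DIFFICULTY}
    skills = dict(base_skills)
    bank: List[Episode] = []  # self-generated experience
    history: List[Dict[str, float]] = []
    for _ in range(generations):
        rewards: Dict[str, float] = {}
        for task in propose_tasks():
            reward = estimate_reward(skills[task], DIFFICULTY[task])
            bank.append(Episode(task, reward))
            rewards[task] = reward
        history.append(rewards)
        # The next generation is trained on everything collected so far.
        skills = train_next_generation(base_skills, bank)
    return history


if __name__ == "__main__":
    for generation, rewards in enumerate(run_self_improvement()):
        print(generation, rewards)
```

In this sketch, each generation only adds experience to the bank, and the next agent is rebuilt from the demonstration-level starting point plus the full bank. That mirrors the idea that one version's experience trains the next version, while keeping the example deterministic.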