(Paper summary) Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning (Paper)
Key points
- The memory manager is trained with GRPO.
- Reward: exact match
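
To make the two points above concrete, here is a minimal sketch of an exact-match reward combined with the group-relative advantage that GRPO computes over a group of sampled outputs. It is not the paper's implementation: the function names, the answer normalization (lowercasing, stripping punctuation), the use of the population standard deviation, and the `eps` value are all assumptions made for illustration.

```python
import string
from statistics import mean, pstdev


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, collapse whitespace (assumed normalization)."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def exact_match_reward(prediction: str, gold: str) -> float:
    """Return 1.0 if the normalized prediction equals the normalized gold answer, else 0.0."""
    return 1.0 if normalize(prediction) == normalize(gold) else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """GRPO-style advantage: standardize each reward within its sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, implementations differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of answers produced after four sampled rollouts.
gold_answer = "Paris"
predictions = ["Paris.", "London", "Berlin", "paris"]
rewards = [exact_match_reward(p, gold_answer) for p in predictions]
advantages = group_relative_advantages(rewards)
```

In this sketch the rollouts whose final answer matches the gold answer receive a positive advantage and the others a negative one. When every rollout in a group gets the same reward, all advantages are zero, so that group contributes no learning signal.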
Thoughts
- It is refreshing that a component other than the LLM is trained with RL.