(Paper summary) TORA: A TOOL-INTEGRATED REASONING AGENT FOR MATHEMATICAL PROBLEM SOLVING (Paper)
Key points
- Reasons with the help of external symbolic solvers such as sympy.
- Trains on tool-using reasoning trajectories with supervised learning.
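To make the key points above concrete, here is a minimal sketch of one tool-integrated reasoning step: a natural-language rationale, a short program that calls sympy, and the captured tool output. The helper `run_program`, the example equation, and the `(kind, text)` trajectory layout are all assumptions made for illustration and are not taken from the paper.

```python
import contextlib
import io


def run_program(program: str) -> str:
    """Run a Python snippet and return what it printed (the tool output).

    Hypothetical helper: it uses a plain exec() with no sandboxing,
    which is only acceptable for a toy illustration.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(program, {})
    return buffer.getvalue().strip()


# One illustrative step: rationale -> program -> output.
rationale = "Solve x**2 - 5*x + 6 = 0 with a symbolic solver."
program = """
from sympy import symbols, solve
x = symbols('x')
print(sorted(solve(x**2 - 5*x + 6, x)))
"""
output = run_program(program)

# Assumed layout: the trajectory is a list of (kind, text) segments.
trajectory = [
    ("rationale", rationale),
    ("program", program),
    ("output", output),
]
print(output)
```

In this sketch the model would keep reasoning from the returned output, so the symbolic computation is delegated to the solver rather than done in free text.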
- Training
- Imitation learning
- ToRA-CORPUS generation: trajectories produced by GPT-4 on the GSM8K and MATH training sets
- Loss: learn to predict the next reasoning step and the next action (log likelihood)
- Teacher model: CodeLLaMA-34B trained on ToRA-CORPUS
- Fine-tune the LLaMA-2 and CodeLLaMA series
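The imitation-learning loss listed under Training can be sketched as a negative log-likelihood summed over the tokens the model has to produce. The sketch assumes each token carries a segment label and that only reasoning and program (action) tokens contribute, while tool-output tokens are conditioned on but not predicted. The function name `trajectory_nll`, the label strings, and the toy probabilities are hypothetical.

```python
import math


def trajectory_nll(token_logprobs, segment_labels):
    """Negative log-likelihood over the predicted tokens of one trajectory.

    token_logprobs: log p(token | context) for every token, in order.
    segment_labels: "rationale", "program", or "output" for each token.
    Assumption: tokens labeled "output" come from the tool and are skipped.
    """
    if len(token_logprobs) != len(segment_labels):
        raise ValueError("one label is required per token")
    predicted = {"rationale", "program"}
    return -sum(
        logprob
        for logprob, label in zip(token_logprobs, segment_labels)
        if label in predicted
    )


# Toy values, not real model outputs.
logprobs = [math.log(0.5), math.log(0.25), math.log(0.1), math.log(0.5)]
labels = ["rationale", "program", "output", "rationale"]
loss = trajectory_nll(logprobs, labels)
print(round(loss, 4))
```

With a real model the same masking idea is usually applied to per-token cross-entropy inside the training framework; the plain-Python version only shows which tokens are counted.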