(Paper Summary) Training Large Language Models to Reason in a Continuous Latent Space

Key Ideas

  • Instead of decoding a token, the model's last hidden state is fed back directly as the next input embedding, so reasoning unfolds in a continuous space

  • During training, each language reasoning step is replaced with $c$ continuous thought token(s) (see the sketch below)
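
A minimal sketch of this loop, assuming a Hugging Face-style causal LM (the `inputs_embeds` / `output_hidden_states` interface): during the latent phase, the last hidden state is appended back into the input sequence instead of being decoded into a token. The function name, `num_thoughts`, and greedy decoding are illustrative; the paper's special begin/end-of-thought tokens and KV-cache reuse are omitted for brevity.

```python
import torch

@torch.no_grad()
def generate_with_continuous_thoughts(model, tokenizer, prompt,
                                      num_thoughts=6, max_new_tokens=32):
    # Encode the prompt into input embeddings.
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    embeds = model.get_input_embeddings()(input_ids)        # (1, T, d)

    # Latent phase: each continuous thought is the last hidden state,
    # fed back directly as the next input embedding (no vocab projection).
    for _ in range(num_thoughts):
        out = model(inputs_embeds=embeds, output_hidden_states=True)
        thought = out.hidden_states[-1][:, -1:, :]          # (1, 1, d)
        embeds = torch.cat([embeds, thought], dim=1)

    # Language phase: switch back to ordinary greedy token decoding.
    generated = []
    for _ in range(max_new_tokens):
        logits = model(inputs_embeds=embeds).logits[:, -1, :]
        next_id = logits.argmax(dim=-1, keepdim=True)       # (1, 1)
        generated.append(next_id)
        if next_id.item() == tokenizer.eos_token_id:
            break
        embeds = torch.cat([embeds,
                            model.get_input_embeddings()(next_id)], dim=1)

    return tokenizer.decode(torch.cat(generated, dim=1)[0])
```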

Experimental Results

  • iCoT: internalized CoT, where language reasoning steps are gradually removed over the course of training
  • Pause token: special <pause> tokens inserted between the question and the answer
  • w/o curriculum: skips the multi-stage training and trains directly on the final-stage data (questions and answers)
  • w/o thought: keeps the multi-stage curriculum that removes language reasoning steps, but generates no continuous latent thoughts
  • pause as thought: replaces the continuous thoughts with <pause> tokens while keeping the multi-stage curriculum (see the sketch after this list)
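
To make the curriculum-related ablations concrete, here is a sketch of how a stage-$k$ training example might be assembled: at stage $k$, the first $k$ language reasoning steps are replaced by $k \cdot c$ latent-thought slots, which are filled with hidden states at training time. The helper and the token strings (<bot>, <eot>, <thought>) are hypothetical stand-ins for the paper's special tokens; "w/o curriculum" trains only on the final stage, and "pause as thought" keeps literal <pause> tokens in place of the slots.

```python
def build_stage_example(question, steps, answer, stage_k, c=1):
    """Hypothetical stage-k example under the multi-stage curriculum.

    The first `stage_k` language reasoning steps become `stage_k * c`
    latent-thought slots; the remaining steps stay as plain text.
    """
    latent_slots = ["<thought>"] * (stage_k * c)  # replaced by hidden states in training
    remaining = steps[stage_k:]                   # language steps kept at this stage
    return " ".join([question, "<bot>", *latent_slots, "<eot>", *remaining, answer])

# stage_k == 0 is plain CoT; at the final stage no language steps remain,
# which is the only data "w/o curriculum" trains on.
print(build_stage_example("Q: 3 + 4 = ?", ["3 + 4 = 7.", "So the answer is 7."],
                          "A: 7", stage_k=1, c=1))
```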