(Paper summary) Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens (Paper)

Key points

  • Train the VLM to generate the features of a segmentation network and similar vision models.

  • Uses curriculum learning.
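
The idea of training a VLM to generate a vision network's features can be sketched as a simple regression objective. This is only a minimal illustration, assuming the VLM emits a few continuous vectors that are matched to features from a frozen segmentation network with a mean squared error. The function name `feature_alignment_loss`, the choice of MSE, and the toy vectors are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = Sequence[float]


def feature_alignment_loss(predicted: List[Vector], target: List[Vector]) -> float:
    """Mean squared error between predicted continuous tokens and target features.

    `predicted` stands in for the continuous vectors produced by the VLM and
    `target` for the features of a frozen vision model (both are assumptions).
    """
    if len(predicted) != len(target):
        raise ValueError("token count mismatch")
    total = 0.0
    count = 0
    for pred_vec, tgt_vec in zip(predicted, target):
        if len(pred_vec) != len(tgt_vec):
            raise ValueError("feature dimension mismatch")
        for p, t in zip(pred_vec, tgt_vec):
            total += (p - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("empty inputs")
    return total / count


# Toy example: two tokens with three dimensions each (made-up numbers).
target = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
predicted = [[1.0, 0.0, 0.5], [0.0, 0.0, 0.5]]
loss = feature_alignment_loss(predicted, target)
```

In a real training setup this term would presumably be combined with the usual next-token loss on text, but how the two are weighted or scheduled is not stated in this summary.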