(Paper Summary) LEAF: Knowledge Distillation of Text Embedding Models with Teacher-Aligned Representations (Paper)
Key points
- leaf-ir: 23M parameters, distilled from arctic-embed-m-v1.5 (≈109M)
- leaf-mt: 23M parameters, distilled from mxbai-l-v1 (≈335M)
- Standard mode: the same leaf model encodes both queries and documents
- Asymmetric mode: documents are embedded with the larger teacher model, while queries are embedded with the smaller leaf model; this works because the leaf model's representations are aligned with the teacher's embedding space
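The asymmetric mode relies on the student's embeddings living in the teacher's representation space, so teacher-encoded documents can be scored directly against leaf-encoded queries. A minimal numpy sketch of this idea, using toy hash-based encoders as stand-ins for the real models (the `teacher_encode`/`leaf_encode` functions are illustrative assumptions, not the actual LEAF or arctic-embed APIs):

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # toy embedding dimension

def teacher_encode(texts):
    # Stand-in for the large teacher model: a fixed bag-of-characters
    # hash, L2-normalized so dot products are cosine similarities.
    vecs = []
    for t in texts:
        v = np.zeros(DIM)
        for ch in t:
            v[ord(ch) % DIM] += 1.0
        vecs.append(v / (np.linalg.norm(v) + 1e-9))
    return np.stack(vecs)

def leaf_encode(texts):
    # Stand-in for the small distilled model: because it is trained to be
    # teacher-aligned, we model it as the teacher output plus small noise.
    out = teacher_encode(texts) + 0.01 * rng.standard_normal((len(texts), DIM))
    return out / np.linalg.norm(out, axis=1, keepdims=True)

docs = ["neural retrieval", "cooking pasta", "embedding models"]
doc_embs = teacher_encode(docs)                    # documents: teacher (offline, expensive)
query_emb = leaf_encode(["retrieval with embeddings"])  # query: leaf (online, cheap)

scores = doc_embs @ query_emb.T  # cross-model cosine similarity, shape (3, 1)
best = docs[int(np.argmax(scores))]
```

The point of the sketch is that no re-encoding of the corpus is needed when swapping the query encoder: alignment makes the teacher's document index directly reusable by the cheaper leaf model at query time.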