(Paper summary) jina-embeddings-v3: Multilingual Embeddings With Task LoRA (Paper)

Key points

  • Model specification

  • Improvements
    • Task-specific optimization with LoRA
    • Patching retrieval failures with synthetic data
    • Integration of latest techniques
      • Matryoshka Representation Learning
      • Instruction tuning
      • Long-context retrieval
  • Architecture
    • Based on XLM-RoBERTa
      • Uses the XLM-RoBERTa tokenizer unchanged.
      • Replaces absolute positional embeddings with Rotary Position Embeddings (RoPE).
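
To make the RoPE change concrete, here is a minimal sketch of the rotation in plain Python. It is not the model's actual implementation: the helper names `rope` and `dot`, the pairing of adjacent dimensions, and the base of 10000 are assumptions chosen for illustration. The point it shows is that, after rotation, the query-key dot product depends only on the relative offset between positions, not on their absolute values.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate each adjacent pair of `vec` by an angle that grows with `position`.

    Illustrative only: pairing adjacent dimensions and base=10000 are assumptions.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even embedding dimension")
    out = []
    for i in range(0, d, 2):
        # Frequency for this pair: base^(-i/d), with i the even index of the pair.
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the pair (x, y) by theta.
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query and key vectors.
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]

# Same relative offset (2) at two different absolute positions.
score_a = dot(rope(q, 3), rope(k, 1))
score_b = dot(rope(q, 10), rope(k, 8))
```

Here `score_a` and `score_b` agree up to floating-point error, because both pairs of positions are two tokens apart. With absolute positional embeddings the position is added to the input once, whereas this rotation is applied to queries and keys inside attention, which is why it is often associated with better handling of long inputs.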

Performance