(Paper summary) jina-embeddings-v3: Multilingual Embeddings With Task LoRA (Paper)
Key points
Model specification
- Improvements
  - Task-specific optimization with LoRA
  - Patching retrieval failures with synthetic data
  - Integration of latest techniques
    - Matryoshka Representation Learning
    - Instruction tuning
    - Long-context retrieval
- Architecture
  - Based on XLM-RoBERTa
  - Uses the XLM-RoBERTa tokenizer unchanged.
  - Replaces absolute positional embeddings with Rotary Position Embeddings (RoPE).
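
To make the switch from absolute positional embeddings to RoPE more concrete, here is a minimal pure-Python sketch of the rotary idea. It is not the model's actual implementation: the function names `rope` and `dot`, the interleaved pairing of dimensions, and the base frequency of 10000 are assumptions chosen for illustration. The point it shows is that after rotation, the dot product between a query and a key depends only on their relative offset, not on their absolute positions.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles that grow with `pos`.

    Assumptions (illustrative only): interleaved pairing (0,1), (2,3), ...
    and base=10000.0, a common default.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("dimension must be even")
    out = []
    for i in range(0, d, 2):
        # Pair index is i/2, so the frequency is base^(-2*(i/2)/d) = base^(-i/d).
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the pair (x, y) by theta.
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

# Toy query and key vectors (made-up values).
q = [0.5, -1.0, 2.0, 0.25]
k = [1.0, 0.5, -0.5, 1.5]

# Both position pairs have the same offset (2), so the scores should match.
near = dot(rope(q, 3), rope(k, 1))
far = dot(rope(q, 12), rope(k, 10))
print(abs(near - far) < 1e-9)
```

Because position enters only through the rotation angle, no learned position table is needed, which is one reason rotary embeddings are a common choice for models targeting longer inputs.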