Jaemin's Arxiv
(Paper Summary) DiPaCo: Distributed Path Composition
(Paper)
Key points
High-level idea: distribute computation by path, where a path is a sequence of modules that together define an input-output function
In OuterOpt, the gradient is adjusted according to the number of data points
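
To make the two points above concrete, here is a minimal sketch of how per-path updates might be combined per module, weighted by data count. It is an illustration under stated assumptions, not the paper's implementation: the function name `aggregate_module_deltas`, the dictionary layout, and the use of a plain weighted mean are all hypothetical, and the actual outer optimizer update (learning rate, momentum) is left out.

```python
def aggregate_module_deltas(path_modules, path_deltas, shard_sizes):
    """Combine per-path parameter deltas into one outer gradient per module.

    Hypothetical layout (an assumption for this sketch):
      path_modules[i]: list of module names that path i passes through
      path_deltas[i]:  dict mapping module name -> list of floats
                       (parameters before local training minus after)
      shard_sizes[i]:  number of data points used to train path i
    """
    sums = {}     # module name -> running weighted sum of deltas
    weights = {}  # module name -> total data count over paths using it

    for modules, deltas, n in zip(path_modules, path_deltas, shard_sizes):
        for name in modules:
            delta = deltas[name]
            acc = sums.setdefault(name, [0.0] * len(delta))
            for i, d in enumerate(delta):
                acc[i] += n * d  # paths with more data contribute more
            weights[name] = weights.get(name, 0) + n

    # Normalize each module by the data count of the paths that share it.
    return {name: [v / weights[name] for v in acc] for name, acc in sums.items()}


# Two paths share module "a"; each also has its own module.
outer_grads = aggregate_module_deltas(
    path_modules=[["a", "b"], ["a", "c"]],
    path_deltas=[{"a": [1.0], "b": [2.0]}, {"a": [0.0], "c": [4.0]}],
    shard_sizes=[3, 1],
)
```

In this toy case the shared module "a" ends up closer to the delta from the path with three data points, while "b" and "c" each keep the delta from the single path that uses them.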