Table of contents
- (Concept Summary) Tokenization
- (Paper Summary) Chat Vector; A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages
- (Paper Summary) DOES RLHF SCALE? EXPLORING THE IMPACTS FROM DATA, MODEL, AND METHOD
- (Paper Summary) Direct Preference Optimization; Your Language Model is Secretly a Reward Model
- (Paper Summary) Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
- (Paper Summary) Generative Representational Instruction Tuning
- (Paper Summary) Grounded Language-Image Pre-training
- (Paper Summary) Hunyuan-Large; An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
- (Paper Summary) LLAMA-OMNI; SEAMLESS SPEECH INTERACTION WITH LARGE LANGUAGE MODELS
- (Paper Summary) MAGPIE; Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
- (Paper Summary) META-REWARDING LANGUAGE MODELS; Self-Improving Alignment with LLM-as-a-Meta-Judge
- (Paper Summary) Make Your LLM Fully Utilize the Context
- (Paper Summary) One Initialization to Rule them All; Fine-tuning via Explained Variance Adaptation
- (Paper Summary) RouteLLM; Learning to Route LLMs with Preference Data
- (Paper Summary) SFT Memorizes, RL Generalizes; A Comparative Study of Foundation Model Post-training
- (Paper Summary) Scaling Exponents Across Parameterizations and Optimizers
- (Paper Summary) Scaling Laws for Data Filtering—Data Curation cannot be Compute Agnostic
- (Paper Summary) Scaling Laws for Precision
- (Paper Summary) Self-Taught Evaluators
- (Paper Summary) SimPO; Simple Preference Optimization with a Reference-Free Reward
- (Paper Summary) Simple and Scalable Strategies to Continually Pre-train Large Language Models
- (Paper Summary) Smaller, Weaker, Yet Better; Training LLM Reasoners via Compute-Optimal Sampling
- (Paper Summary) THINKING LLMS; GENERAL INSTRUCTION FOLLOWING WITH THOUGHT GENERATION
- (Paper Summary) Textbooks Are All You Need
- (Paper Summary) Training Language Models to Self-Correct via Reinforcement Learning
- Miscellaneous Finetuning Methods in Large Language Models