
(Paper Summary) DIFFUSION MODELS ARE REAL-TIME GAME ENGINES

(Paper Summary) Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

(Paper Summary) Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling

(Paper Summary) To Code, or Not To Code? Exploring Impact of Code in Pre-training

(Paper Summary) Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting

(Paper Summary) Scaling Laws for Data Filtering—Data Curation cannot be Compute Agnostic

(Paper Summary) LLM Pruning and Distillation in Practice: The Minitron Approach

(Paper Summary) Automated Design of Agentic Systems

(Paper Summary) xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

(Paper Summary) MUTUAL REASONING MAKES SMALLER LLMS STRONGER PROBLEM-SOLVERS

(Paper Summary) MEDICAL GRAPH RAG: TOWARDS SAFE MEDICAL LARGE LANGUAGE MODEL VIA GRAPH RETRIEVAL-AUGMENTED GENERATION

(Paper Summary) Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine

(Paper Summary) Distributed Inference and Fine-tuning of Large Language Models Over The Internet

(Paper Summary) Self-Taught Evaluators

(Paper Summary) Scaling Exponents Across Parameterizations and Optimizers

(Paper Summary) Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

(Paper Summary) REACT: SYNERGIZING REASONING AND ACTING IN LANGUAGE MODELS

(Paper Summary) Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget

(Paper Summary) MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

(Paper Summary) Apple Intelligence Foundation Language Models

(Paper Summary) SAM 2: Segment Anything in Images and Videos

(Paper Summary) Segment Anything

(Paper Summary) META-REWARDING LANGUAGE MODELS: Self-Improving Alignment with LLM-as-a-Meta-Judge

(Paper Summary) LazyLLM: DYNAMIC TOKEN PRUNING FOR EFFICIENT LONG CONTEXT LLM INFERENCE

(Paper Summary) MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

(Paper Summary) Solving olympiad geometry without human demonstrations

(Paper Summary) LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover

(Paper Summary) Weak-to-Strong Reasoning

(Paper Summary) PROVER-VERIFIER GAMES IMPROVE LEGIBILITY OF LLM OUTPUTS

(Paper Summary) The Llama 3 Herd of Models

(Paper Summary) From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients

(Paper Summary) Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

(Paper Summary) DataComp-LM: In search of the next generation of training sets for language models

(Paper Summary) Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

(Paper Summary) FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision

(Paper Summary) RouteLLM: Learning to Route LLMs with Preference Data

(Paper Summary) INTERNET OF AGENTS: WEAVING A WEB OF HETEROGENEOUS AGENTS FOR COLLABORATIVE INTELLIGENCE

(Paper Summary) Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages

(Paper Summary) DiPaCo: Distributed Path Composition

(Paper Summary) Data curation via joint example selection further accelerates multimodal learning

(Paper Summary) MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

(Paper Summary) APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Datasets

(Paper Summary) Searching for Best Practices in Retrieval-Augmented Generation

(Paper Summary) AGENTLESS: Demystifying LLM-based Software Engineering Agents

(Paper Summary) InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

(Paper Summary) Gemma 2: Improving Open Language Models at a Practical Size

(Paper Summary) Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data

(Paper Summary) WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting

(Paper Summary) Meta Large Language Model Compiler: Foundation Models of Compiler Optimization

(Paper Summary) TREE SEARCH FOR LANGUAGE MODEL AGENTS

(Paper Summary) Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step by Step

(Paper Summary) DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

(Paper Summary) AGENTGYM: Evolving Large Language Model-based Agents across Diverse Environments

(Paper Summary) MAGPIE: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

(Paper Summary) Mixture-of-Agents Enhances Large Language Model Capabilities

(Paper Summary) OpenVLA: An Open-Source Vision-Language-Action Model

(Paper Summary) Show, Don’t Tell: Aligning Language Models with Demonstrated Feedback

(Paper Summary) Towards Scalable Automated Alignment of LLMs: A Survey

(Paper Summary) Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models

(Paper Summary) SimPO: Simple Preference Optimization with a Reference-Free Reward

(Paper Summary) Faithful Logical Reasoning via Symbolic Chain-of-Thought

(Paper Summaries) Large Language Model Tuning

(Paper Summary) Extreme Compression of Large Language Models via Additive Quantization

(Paper Summary) QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks

(Paper Summary) DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

(Paper Summary) Layer-Condensed KV Cache for Efficient Inference of Large Language Models

(Paper Summary) Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet

(Code Exploration) llama.cpp

(Paper Summary) Granite Code Models: A Family of Open Foundation Models for Code Intelligence

(Paper Summary) Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?

(Paper Summary) Chameleon: Mixed-Modal Early-Fusion Foundation Models

(Paper Summary) What matters when building vision-language models?

(Code Execution) Alphacodium

(Paper Summary) Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering

(Paper Summary) Better & Faster Large Language Models via Multi-token Prediction

(Paper Summary) LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

(Paper Summary) OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents

(Paper Summary) SWE-AGENT: AGENT-COMPUTER INTERFACES ENABLE AUTOMATED SOFTWARE ENGINEERING

(Paper Summary) Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages

(Paper Summary) AgentCoder: Multiagent-Code Generation with Iterative Testing and Optimisation

(Paper Summary) NExT: Teaching Large Language Models to Reason about Code Execution

(Paper Summary) Make Your LLM Fully Utilize the Context

(Paper Summary) Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

(Data Summary) Common Crawl filtered data

(Paper Summary) One Embedder, Any Task: Instruction-Finetuned Text Embeddings

(Model Summary) LLaMa3

(Paper Summary) CodeGemma: Open Code Models Based on Gemma

(Paper Summary) RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

(Paper Summary) Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

(Paper Summary) Towards General Computer Control: A Multimodal Agent for Red Dead Redemption II as a Case Study

(Paper Summary) Integrating Code Generation with Execution and Refinement

(Paper Summary) Gecko: Versatile Text Embeddings Distilled from Large Language Models

(Paper Summary) Gorilla: Large Language Model Connected with Massive APIs

(Paper Summary) The Unreasonable Ineffectiveness of the Deeper Layers

(Paper Summary) SWE-BENCH: CAN LANGUAGE MODELS RESOLVE REAL-WORLD GITHUB ISSUES?

(Paper Summary) GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

(Paper Summary) RAFT: Adapting Language Model to Domain Specific RAG

(Paper Summary) LONG-FORM FACTUALITY IN LARGE LANGUAGE MODELS

(Paper Summary) MTEB: Massive Text Embedding Benchmark

(Paper Summary) Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs

(Paper Summary) Generative Representational Instruction Tuning

(Paper Summary) Text Embeddings by Weakly-Supervised Contrastive Pre-training

(Paper Summary) Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

(Paper Summary) Direct Preference Optimization: Your Language Model is Secretly a Reward Model

(Paper Summary) RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation

(Paper Summary) Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

(Paper Summary) STaR: Self-Taught Reasoner Bootstrapping Reasoning With Reasoning

(Paper Summary) SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model

(Paper Summary) DEMYSTIFYING EMBEDDING SPACES USING LARGE LANGUAGE MODELS

(Paper Summary) Stop Regressing: Training Value Functions via Classification for Scalable Deep RL

(Paper Summary) Scaling Instructable Agents Across Many Simulated Worlds

(Paper Summary) Simple and Scalable Strategies to Continually Pre-train Large Language Models

(Paper Summary) Efficient Tool Use with Chain-of-Abstraction Reasoning

(Paper Summary) Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

(Paper Summary) InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning

(Paper Summary) Grounded Language-Image Pre-training

(Code Execution) AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model

(Paper Summary) Textbooks Are All You Need

(Paper Summary) Generating Diverse High-Fidelity Images with VQ-VAE-2

(Paper Summary) Neural Discrete Representation Learning

(Paper Summary) ROFORMER: ENHANCED TRANSFORMER WITH ROTARY POSITION EMBEDDING

FED Interest Rates and the Yield Curve

Undistort Image with OpenCV-Python

Top-view perspective transform with OpenCV-Python

Rotation matrix with Quaternion

Visualize a point cloud with Open3D