(논문 요약) Quantization-Aware Distillation for NVFP4 Inference Accuracy Recovery

(논문 요약) Quantization-Aware Distillation for NVFP4 Inference Accuracy Recovery (paper)

핵심 내용

high precision teacher 의 softmax output 을 quantized student 가 학습.
- student 는 teacher 를 quantize 하여 initialize.
Loss: KL($p_{teacher}$ || $p_{student}$)