(Paper Summary) QLoRA: Efficient Finetuning of Quantized LLMs

Key Points

  • Base model weights: stored in 4-bit (NF4, NormalFloat4) and kept frozen during finetuning
  • LoRA adapters: 16-bit (BFloat16); these are the only trainable weights
  • Optimizer states: 32-bit (paired with paged optimizers to absorb memory spikes)
  • Double quantization: the per-block quantization constants are themselves grouped and quantized again (32-bit constants per block of 64 weights become 8-bit constants per block of 256), cutting the constant overhead from about 0.5 to 0.127 bits per parameter (see the config sketch below)
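A minimal sketch of this setup using Hugging Face `transformers` + `peft` + `bitsandbytes`, assuming those libraries are installed; the model id and LoRA hyperparameters (`r`, `lora_alpha`, `target_modules`, dropout) are illustrative placeholders, not values from the paper.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 storage for the frozen base weights, with double quantization;
# compute runs in bfloat16, the 16-bit dtype used on the LoRA path.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NormalFloat4 data type
    bnb_4bit_use_double_quant=True,      # quantize the quantization constants again
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "your-base-model",                   # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# 16-bit LoRA adapters on top of the frozen 4-bit base; hyperparameters are examples.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()       # only the LoRA parameters are trainable
```

The 32-bit optimizer states come in at training time: the paper pairs this setup with paged AdamW, which in recent `transformers` versions can be selected via something like `optim="paged_adamw_32bit"` in `TrainingArguments`.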