Overview

LoRA (Low-Rank Adaptation) is the most widely used PEFT method. It freezes the pretrained weights and adds small, trainable low-rank matrices to selected layers; the product of these matrices represents the 'delta', or change, needed for the new task.
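A minimal sketch of the idea, using NumPy rather than any particular fine-tuning library (the dimensions, initialization scale, and function names here are illustrative assumptions, not a real API):

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, rank = 64, 64, 4  # rank is much smaller than d_in / d_out

W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight (never updated)
A = rng.normal(size=(rank, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, rank))               # trainable; starts at zero so the delta is zero

def lora_forward(x, scale=1.0):
    # Output = frozen path + low-rank 'delta' path (B @ A is the task-specific change).
    return W @ x + scale * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B initialized to zero, the adapted layer matches the frozen model exactly.
assert np.allclose(lora_forward(x), W @ x)
```

Only A and B are trained, so the number of trainable parameters is rank * (d_in + d_out) instead of d_in * d_out.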

Benefits

  • Tiny File Size: LoRA 'weights' are often only a few megabytes, compared to gigabytes for the full model.
  • No Added Inference Latency: The LoRA weights can be merged into the base model's weights before serving, so the adapted model runs exactly as fast as the original.
  • Portability: Easy to share and swap different LoRAs for different tasks.
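The zero-latency claim above follows from a simple identity: the low-rank delta can be folded directly into the base weight. A hedged NumPy sketch (shapes and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 32, 2

W = rng.normal(size=(d, d))   # base weight
A = rng.normal(size=(r, d))   # trained LoRA factors
B = rng.normal(size=(d, r))

# Merging: fold the delta into the base weight once, before inference.
W_merged = W + B @ A

x = rng.normal(size=d)
# The merged single matmul gives the same result as base path + LoRA path,
# so serving needs no extra computation per token.
assert np.allclose(W_merged @ x, W @ x + B @ (A @ x))
```

Swapping tasks then just means re-merging with a different (A, B) pair, which is what makes LoRAs easy to share and exchange.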

Related Terms