LoRA Notes

Cheatsheet & scripts

Collection of notes and scripts for working with Low-Rank Adaptation (LoRA) in machine learning projects.

LoRA Documentation

What is LoRA?

Low-Rank Adaptation (LoRA) is a parameter-efficient fine-tuning method for large language models. Instead of updating all parameters of a pre-trained model, LoRA adds trainable low-rank matrices to specific layers, significantly reducing the number of trainable parameters.

Key Benefits

  • Memory Efficient: Trains only ~0.5-1% of the original model's parameters (see the quick parameter count after this list)
  • Fast Training: Significantly faster than full fine-tuning
  • Modular: Easy to combine multiple LoRA adapters
  • Storage Friendly: Small adapter files (~MBs vs GBs)
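
To make the memory and storage numbers concrete, here is a rough parameter count for a single weight matrix. The layer size and rank below are illustrative assumptions, not values taken from any particular model.

# Rough parameter count for one 4096 x 4096 weight matrix (illustrative sizes)
m, n, r = 4096, 4096, 8
full_params = m * n              # 16,777,216 parameters updated by full fine-tuning
lora_params = m * r + r * n      # 65,536 parameters updated by LoRA
print(f"LoRA trains {lora_params / full_params:.2%} of this matrix")   # ~0.39%
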
Basic LoRA Concept
import numpy as np

# Instead of updating W (the full m x n weight matrix),
# LoRA learns a low-rank update: W_eff = W + ΔW, where ΔW = A @ B
m, n, r = 1024, 1024, 8           # r << min(m, n): the low-rank dimension (illustrative sizes)
W = np.random.randn(m, n)         # frozen pre-trained weights
A = np.random.randn(m, r) * 0.01  # trainable m x r matrix
B = np.zeros((r, n))              # trainable r x n matrix (zero init, so ΔW starts at 0)
W_eff = W + A @ B                 # effective weights used in the forward pass
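
In practice the adapter is applied to the layer's activations rather than materialized as a full ΔW, and the low-rank update is scaled by alpha / r, as in the original LoRA formulation. Below is a minimal numpy sketch of that forward pass; the shapes, batch size, and alpha value are illustrative assumptions, not values from this repo.

import numpy as np

def lora_forward(x, W, A, B, alpha=16.0, r=8):
    # Frozen base projection plus the scaled low-rank update;
    # alpha / r is the standard LoRA scaling factor
    return x @ W + (alpha / r) * (x @ A @ B)

m, n, r = 512, 256, 8
x = np.random.randn(4, m)            # a batch of 4 input vectors
W = np.random.randn(m, n) * 0.02     # frozen pre-trained weights
A = np.random.randn(m, r) * 0.01     # trainable down-projection (m x r)
B = np.zeros((r, n))                 # trainable up-projection (r x n), zero-initialized
y = lora_forward(x, W, A, B, r=r)    # initially equals x @ W because B is all zeros
print(y.shape)                       # (4, 256)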