Module 18 · Training · 10 min

Parameter-efficient fine-tuning.

Why LoRA is cheap, what QLoRA adds.

Reading time: 10 min · Audio: narration available · Prerequisites: 13 · Source: Track A · Gemini
§ 1

What this lesson covers.

This module is one of 42 in the curriculum. Below is the canonical interactive lesson (tabs, cards, and diagrams from the source repo), rendered inside the course shell. An audio narration runs alongside it: the sticky player at the top of the page plays the full Module 18 clip.

If you prefer to read first and play with the demos after, the interactive lesson sits below this section. If you'd rather hear it narrated while you scroll, hit play on the sticky audio bar at the top — or just let it autoplay.

§ 2

The lesson itself.

Interactive lesson · ported from Gemini track · Click tabs to navigate · hover cards for details

The Frozen Brain (LoRA)

To teach a massive 70B-parameter Base Model a new skill, we could update all its weights (Full Fine-Tuning), but that requires clusters of extremely expensive GPUs. LoRA (Low-Rank Adaptation) freezes the big brain entirely and trains only a tiny "backpack" attached to it: two small low-rank matrices, A and B. The forward pass becomes h = W₀x + (α/r)BAx, where W₀ is the frozen weight matrix, r is the rank of the adapter, and α is a scaling constant.

Diagram labels: OUT OF MEMORY (OOM) ERROR! · Base Model, Frozen Space, 70,000,000,000 params · LoRA Adapter, 10M params
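
The forward pass h = W₀x + (α/r)BAx can be sketched in a few lines of plain Python. This is a minimal illustration, not the lesson's own code: the helper names (`matvec`, `lora_forward`) and the tiny layer sizes are assumptions chosen for readability, and the zero initialization of B follows the convention of the original LoRA paper rather than anything stated in this lesson.

```python
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_forward(W0, A, B, x, alpha):
    """Compute h = W0 x + (alpha / r) * B (A x)."""
    r = len(A)                          # rank = number of rows of A
    base = matvec(W0, x)                # frozen path, W0 is never updated
    update = matvec(B, matvec(A, x))    # low-rank path, BA is never formed explicitly
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

random.seed(0)
d_in, d_out, r, alpha = 6, 4, 2, 4.0    # illustrative sizes (assumed)

W0 = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen weights
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]    # trainable, r x d_in
B = [[0.0] * r for _ in range(d_out)]                                   # trainable, d_out x r, zero init
x = [random.gauss(0, 1) for _ in range(d_in)]

h = lora_forward(W0, A, B, x, alpha)

# With B initialized to zero the adapter contributes nothing yet,
# so the adapted layer starts out identical to the frozen base layer.
assert h == matvec(W0, x)
```

Only A and B would receive gradient updates during training. Applying A first and then B keeps the extra cost proportional to r rather than to the full size of W₀.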
§ DEMO

Try it: LoRA rank slider.

Drag the rank between 1 and 256. The trainable parameter count scales linearly with the rank, so moving from r = 256 down to r = 1 cuts it 256x, while the matrix decomposition (BA) visualizes live.

LoRA Rank Slider · interactive · Open standalone
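
The arithmetic behind the slider can be reproduced offline. The sketch below assumes a single adapter on one weight matrix, with A of shape r × d_in and B of shape d_out × r; the 4096 × 4096 layer size and the function name `lora_trainable_params` are hypothetical, picked only to make the numbers concrete.

```python
def lora_trainable_params(d_in, d_out, r):
    """Trainable parameters for one adapter: A is r x d_in, B is d_out x r."""
    return r * d_in + d_out * r

d_in = d_out = 4096          # hypothetical layer size
full = d_in * d_out          # parameters touched by full fine-tuning of this layer

for r in (1, 4, 16, 64, 256):
    n = lora_trainable_params(d_in, d_out, r)
    print(f"r={r:>3}  trainable={n:>9,}  share of full layer={n / full:.4%}")
```

Because the count is r · (d_in + d_out), it grows linearly with the rank: the r = 256 setting trains exactly 256 times as many parameters as r = 1 for the same layer.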
§ PAPERS

Further reading.

The canonical references for this module. External links open in a new tab.

§ NEXT

What to read next.

Use the pager below to move sequentially through the curriculum, or jump to any module from the course index. Each track has a "Prereq: ↑ foundation" callout so you can backfill anything that wasn't clear.