Quantization
Shrinking a model by storing its numbers (mainly its weights) at lower precision, for example as 8-bit integers instead of 32-bit floats. This reduces memory use and usually speeds up inference, with a small trade-off in quality.
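As a rough illustration of storing numbers at lower precision, here is a minimal sketch of one common scheme, symmetric 8-bit quantization with a single shared scale. The function names and the sample weights are made up for this example, and real toolkits typically use finer-grained scales and more careful rounding.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale (illustrative)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero input, where any scale would work.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.4, -1.27, 0.02, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)

# Each restored value differs from the original by at most half a step.
errors = [abs(w - r) for w, r in zip(weights, restored)]
print(quantized, max(errors) <= scale / 2 + 1e-12)
```

Each weight now needs one byte plus a share of the scale, instead of four bytes as a 32-bit float. The rounding error is the source of the quality trade-off.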
Why it matters
Quantization is a large part of what lets capable models fit in the memory of a laptop or phone and run there.
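A back-of-the-envelope calculation shows why. The sketch below assumes a hypothetical 7-billion-parameter model and counts only the weights, ignoring activations, caches, and any per-block quantization metadata.

```python
def model_size_gb(num_params, bits_per_weight):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical model size, chosen only for illustration.
params = 7_000_000_000
for bits in (16, 8, 4):
    print(f"{bits}-bit: {model_size_gb(params, bits):.1f} GB")
```

Under these assumptions the weights drop from 14 GB at 16 bits to 3.5 GB at 4 bits, which is the difference between exceeding and fitting within the memory of many consumer devices.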