The Art of Compressing LLMs: Pruning, Distillation, and Quantization Demystified

Master the essential model compression techniques to deploy high-performance, cost-efficient Large Language Models at scale.

Generative AI, LLM, Machine Learning, GPU Computing, MLOps
Provider: NVIDIA DLI
Duration: 8 hrs
Mode: self-paced
Pricing: not stated

Catalog checked Mar 16, 2026. Enrollment happens on the provider website; progress tracking happens here.

What you will cover

Generative AI and LLM systems, deep learning, machine learning, GPU computing

Recommended next

LLM Foundations for Builders
A free, self-paced introduction to modern large language model systems.
Machine Learning Refresher
Refresh the statistics and ML foundations needed for advanced GenAI work.
Fine-Tuning and MLOps
Bridge experimentation and operations for adapted language models.
Related

Keep the path moving

Verified free, basic

A free, self-paced introduction to modern large language model systems.

LLM, Generative AI, Prompt Engineering
5 hrs, self-paced, checked Mar 1, 2026

Verified free, basic

Refresh the statistics and ML foundations needed for advanced GenAI work.

Machine Learning, Python Foundations, Statistics
12 hrs, self-paced, checked Feb 22, 2026

Verified free, professional

Bridge experimentation and operations for adapted language models.

LLM, Fine-Tuning, MLOps
10 hrs, live, checked Mar 10, 2026
The Art of Compressing LLMs: Pruning, Distillation, and Quantization Demystified | OpenCourseMap