Term:

Model Distillation

Category:

Model Optimization

Definition

Training a smaller model to reproduce the behavior of a larger one, preserving as much of its performance as possible.

Explanation

Distillation trains a smaller “student” model to mimic the outputs of a larger “teacher” model. The student costs less per inference and responds faster, which makes on-device and enterprise deployment practical. Distilled models are well suited to routing systems, agent sub-tasks, and latency-sensitive applications.
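The entry does not say how the student is made to mimic the teacher. One widely used option is to match temperature-softened output distributions with a KL-divergence loss. The sketch below assumes that approach; the function names, logits, and temperature value are illustrative and not taken from this glossary.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2

# Illustrative logits for a 3-class output (assumed values).
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]
far_student = [0.5, 1.0, 4.0]

# A student that tracks the teacher closely should have the lower loss.
print(distillation_loss(teacher, close_student) < distillation_loss(teacher, far_student))
```

In practice this term is often combined with an ordinary cross-entropy loss on ground-truth labels, with a weight chosen per task.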

Technical Architecture

Teacher Model → Knowledge Transfer → Student Model → Deployment
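The flow above can be sketched end to end with toy models. Everything here is a stand-in chosen for illustration: the "teacher" is a fixed two-feature logistic function, the "student" is a one-feature logistic model, and all function names and hyperparameters are assumptions rather than part of any real framework.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def teacher_predict(x):
    """Stand-in for an expensive teacher: returns P(label = 1 | x)."""
    return sigmoid(3.0 * x[0] + 0.3 * x[1] - 0.5)

def build_distillation_dataset(n, seed=0):
    """Knowledge transfer, step 1: label unlabeled inputs with teacher outputs."""
    rng = random.Random(seed)
    inputs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    return [(x, teacher_predict(x)) for x in inputs]

def train_student(dataset, epochs=200, lr=0.5):
    """Knowledge transfer, step 2: fit a smaller model to the soft labels.

    The student only sees the first feature, so it has fewer parameters.
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, soft_label in dataset:
            # Gradient of cross-entropy against the teacher's soft label.
            error = sigmoid(w * x[0] + b) - soft_label
            grad_w += error * x[0]
            grad_b += error
        w -= lr * grad_w / len(dataset)
        b -= lr * grad_b / len(dataset)
    return w, b

def agreement(student, dataset):
    """Evaluation: fraction of inputs where student and teacher pick the same class."""
    w, b = student
    same = sum((sigmoid(w * x[0] + b) >= 0.5) == (t >= 0.5) for x, t in dataset)
    return same / len(dataset)

train = build_distillation_dataset(500, seed=0)
held_out = build_distillation_dataset(200, seed=1)
student = train_student(train)  # the artifact that would be deployed
print(f"student/teacher agreement: {agreement(student, held_out):.2f}")
```

Measuring agreement on held-out inputs before deployment is one simple gate; a real evaluation suite would also score the student against ground-truth labels.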

Core Components

Teacher LLM, student LLM, distillation dataset, evaluation suite

Use Cases

Edge AI, offline AI, fast agents, routing workflows

Pitfalls

The student can lose some of the teacher's reasoning ability, and accuracy tends to degrade most on complex tasks.
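One way to catch these regressions is to compare teacher and student accuracy per task slice rather than in aggregate, since an overall average can hide a drop on harder tasks. The sketch below is a minimal version of that check; the slice names, the tolerance, and the records are made up for illustration.

```python
from collections import defaultdict

def accuracy_gap_by_slice(records, tolerance=0.05):
    """Return slices where the student trails the teacher by more than `tolerance`.

    Each record is (slice_name, teacher_correct, student_correct).
    """
    totals = defaultdict(lambda: [0, 0, 0])  # count, teacher hits, student hits
    for slice_name, teacher_ok, student_ok in records:
        stats = totals[slice_name]
        stats[0] += 1
        stats[1] += int(teacher_ok)
        stats[2] += int(student_ok)
    flagged = {}
    for slice_name, (count, teacher_hits, student_hits) in totals.items():
        gap = (teacher_hits - student_hits) / count
        if gap > tolerance:
            flagged[slice_name] = gap
    return flagged

# Made-up evaluation results, for illustration only.
records = (
    [("lookup", True, True)] * 9
    + [("lookup", True, False)] * 1
    + [("multi_step_reasoning", True, True)] * 6
    + [("multi_step_reasoning", True, False)] * 4
)
print(accuracy_gap_by_slice(records, tolerance=0.15))
# → {'multi_step_reasoning': 0.4}
```

Slices that get flagged are candidates for routing back to the teacher or for adding more examples to the distillation dataset.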

LLM Keywords

Model Distillation, Compressed LLM, Student-Teacher Models

Related Concepts

• Model Compression
• Routing Models
• MoE

Related Frameworks

• Distillation Workflow
