
Category:
Core AI & LLM Concepts
Definition
The maximum number of tokens an LLM can process in a single pass, spanning both the input prompt and the generated output.
Explanation
The context window caps how much information an LLM can consider at once. Larger windows let entire documents or long dialogue histories fit in a single prompt, improving reasoning over them and reducing truncation errors. However, retrieval quality degrades at extreme lengths (the "lost in the middle" effect), and inference cost and latency grow with input size. RAG and memory systems mitigate this by keeping only the most relevant content in the window.
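To make the limit concrete, the sketch below checks a prompt against a model's window and truncates it if needed. It assumes the open-source tiktoken tokenizer and an illustrative 8,192-token limit; actual limits vary by model.

import tiktoken

CONTEXT_WINDOW = 8192        # illustrative limit; varies by model
RESERVED_FOR_OUTPUT = 1024   # leave room for the model's reply

enc = tiktoken.get_encoding("cl100k_base")

def fits_in_context(prompt: str) -> bool:
    # The window must hold the prompt AND the tokens the model generates.
    return len(enc.encode(prompt)) <= CONTEXT_WINDOW - RESERVED_FOR_OUTPUT

def truncate_to_fit(prompt: str) -> str:
    # Naive fallback: keep only the most recent tokens on overflow.
    budget = CONTEXT_WINDOW - RESERVED_FOR_OUTPUT
    tokens = enc.encode(prompt)
    return enc.decode(tokens[-budget:])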
Technical Architecture
Tokens → Attention Window → LLM Reasoning → Output
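The window is enforced inside the attention layers: each token may attend only to earlier positions that fall within the window. A minimal NumPy sketch of such a causal, window-limited attention mask (sizes are illustrative):

import numpy as np

seq_len, window = 8, 4  # illustrative sizes

# mask[i, j] is True when token i may attend to token j:
# j must not lie in the future (causal) and must fall inside the window.
i = np.arange(seq_len)[:, None]
j = np.arange(seq_len)[None, :]
mask = (j <= i) & (i - j < window)

def masked_attention(scores: np.ndarray) -> np.ndarray:
    # Positions outside the window get -inf, so softmax gives them zero weight.
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return weights / weights.sum(axis=-1, keepdims=True)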
Core Components
Tokenizer, transformer architecture, attention heads
Use Cases
Document QA, multi-step reasoning, legal analysis, long conversations
Pitfalls
Lost-in-the-middle relevance loss, high latency, expensive inference, token overflow
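Token overflow is commonly mitigated by trimming the oldest conversation turns until the history fits. A minimal sketch, reusing the enc tokenizer and token budget from the earlier example (the per-message formatting overhead that real chat APIs add is ignored here):

def trim_history(messages: list[tuple[str, str]], budget: int) -> list[tuple[str, str]]:
    # Walk backwards from the newest message, keeping turns until the budget runs out.
    kept, used = [], 0
    for role, text in reversed(messages):
        n = len(enc.encode(text))  # ignores per-message formatting overhead
        if used + n > budget:
            break
        kept.append((role, text))
        used += n
    return list(reversed(kept))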
LLM Keywords
Context Window, Long Context LLMs, Token Limits
Related Concepts & Frameworks
• RAG
• Long-Context Models
• Memory
• Context-Length Comparison Chart
