Context Chunking
The Context Chunking tool enables processing of large documents by intelligently splitting them into manageable pieces. When content exceeds a model’s context window limit, this tool automatically chunks the document and orchestrates a multi-step analysis process.
Key Features
- Automatic chunk size optimization based on model context windows
- Multiple chunking strategies (semantic, fixed, hierarchical)
- Two-step processing flow for efficient large document analysis
- Smart overlap between chunks to preserve context
- Temporary storage with automatic cleanup (1-hour TTL)
- Token estimation and safety limits
How It Works
The Context Chunking tool works automatically under the hood to handle large documents:
- Automatic Chunking: When your content exceeds the model’s context window, the tool intelligently splits it into smaller chunks that fit within the model’s limits.
- Smart Processing: Each chunk is processed individually with your provided prompt (or defaults to analyzing and summarizing the content if no specific prompt is given).
- Seamless Integration: The tool automatically selects a suitable chunking strategy based on your content and model, then makes individual API calls with chunks that fit within your selected model’s context window.
- Result Synthesis: After processing all chunks, results are combined into a comprehensive answer.
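The flow can be pictured as a short loop. Below is a minimal sketch; `call_tool` is a stand-in for however your assistant invokes tools, and the `chunk_ids` response field is an assumption, since the exact response shape is not documented here. Only the operation payloads come from this page.

```python
# Minimal sketch of the two-step flow. `call_tool` is a placeholder for
# your platform's tool-invocation mechanism, and the `chunk_ids` response
# field is an assumption; only the payload fields are documented here.

def analyze_large_document(call_tool, content: str, query: str) -> str:
    # Step 1: split the oversized document into model-sized chunks.
    created = call_tool({
        "operation": "create_chunks",
        "content": content,
        "query": query,
    })

    # Step 2: retrieve and process each chunk with the same query.
    findings = [
        call_tool({"operation": "process_chunk", "chunk_id": chunk_id})
        for chunk_id in created["chunk_ids"]  # assumed response field
    ]

    # Step 3: combine per-chunk findings into one comprehensive answer.
    return "\n\n".join(findings)
```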
Chunking Strategies
By default, the tool automatically selects the optimal strategy for your content. However, you can configure specific strategies when needed:
Semantic Chunking (Default)
- Respects natural document boundaries (paragraphs, sentences)
- Maintains context and readability
- Best for: General documents, articles, reports
Fixed Chunking
- Creates equal-sized chunks based on token count
- Predictable chunk sizes
- Best for: Structured data, when uniform processing is needed
Hierarchical Chunking
- Starts with semantic boundaries
- Creates nested structure for complex documents
- Best for: Long technical documents with clear structure
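When the automatic choice is not what you want, the strategy can be pinned via the `chunking_strategy` parameter (documented in the next section). For example, to force hierarchical chunking:

```python
# Request hierarchical chunking explicitly instead of relying on the
# automatic selection; all fields are from the parameter tables below.
request = {
    "operation": "create_chunks",
    "content": "...a long technical document with clear structure...",
    "chunking_strategy": "hierarchical",
}
```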
Parameters
create_chunks Operation
| Parameter | Type | Required | Description |
|---|---|---|---|
| operation | string | Yes | Set to "create_chunks" |
| content | string | Yes | The large text or document to chunk |
| query | string | No | The specific question or analysis task |
| chunking_strategy | string | No | Strategy: "semantic", "fixed", or "hierarchical" (default: "semantic") |
| chunk_size | integer | No | Target tokens per chunk (auto-calculated if not provided) |
| chunk_overlap | integer | No | Tokens to overlap between chunks (auto-calculated if not provided) |
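For illustration, a fully specified `create_chunks` payload might look like this (the values are arbitrary examples, not recommendations):

```python
create_request = {
    "operation": "create_chunks",                 # required
    "content": "...the full document text...",    # required
    "query": "Summarize the key findings",        # optional: keeps chunks focused
    "chunking_strategy": "semantic",              # optional: default is "semantic"
    "chunk_size": 50_000,                         # optional: target tokens per chunk
    "chunk_overlap": 5_000,                       # optional: 10% of chunk_size
}
```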
process_chunk Operation
| Parameter | Type | Required | Description |
|---|---|---|---|
| operation | string | Yes | Set to "process_chunk" |
| chunk_id | string | Yes | The ID of the chunk to retrieve and process |
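And a corresponding `process_chunk` payload (the ID shown is illustrative; use whatever IDs `create_chunks` returns):

```python
process_request = {
    "operation": "process_chunk",
    "chunk_id": "chunk-123",  # illustrative; use an ID returned by create_chunks
}
```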
Configuration Options
The tool can be configured in multiple ways:
During Assistant Creation
When creating or configuring an assistant, you can customize the Context Chunking tool with:
- chunking_strategy: Choose between semantic, fixed, or hierarchical
- chunk_size: Set custom chunk size in tokens
- chunk_overlap: Define overlap between chunks
- max_chunks: Set maximum number of chunks
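As a sketch, such a configuration could be expressed as a simple mapping. The field names are the ones listed above, but how the configuration is attached to an assistant depends on your platform:

```python
# Example Context Chunking configuration for an assistant. Field names
# match the options above; the surrounding configuration API will vary.
context_chunking_config = {
    "chunking_strategy": "fixed",  # "semantic", "fixed", or "hierarchical"
    "chunk_size": 50_000,          # tokens per chunk
    "chunk_overlap": 5_000,        # 10% of chunk_size
    "max_chunks": 200,             # raise the safety limit for very large inputs
}
```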
Organization Defaults
Default values apply when not explicitly configured:
- default_chunk_size: 100,000 tokens (uses 80% of model’s context window)
- chunk_overlap: 1,000 tokens (10% of chunk size)
- max_chunks: 100 chunks (safety limit)
Use Cases
Large Document Analysis
Process lengthy reports, research papers, or documentation that exceed context limits.
Multi-file Processing
Analyze multiple large files within a single workflow, as sketched at the end of this section.
Token-Limited Models
Optimize usage of models with smaller context windows.
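For the multi-file case, one workable pattern is to concatenate the files with labeled separators before chunking, so each chunk retains a hint of which file it came from. A hedged sketch (the file handling is outside this tool):

```python
from pathlib import Path

# Combine several large files into one labeled document, then chunk once.
# The "=== filename ===" markers are a convention, not a tool requirement.
combined = "\n\n".join(
    f"=== {path.name} ===\n{path.read_text()}"
    for path in sorted(Path("reports").glob("*.md"))
)

request = {
    "operation": "create_chunks",
    "content": combined,
    "query": "Compare the conclusions across these reports",
}
```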
Best Practices
1. Let Auto-Optimization Work
   - Don’t specify chunk_size unless you have specific requirements
   - The tool automatically calculates optimal sizes based on your model
2. Provide Clear Queries
   - Include your analysis question in the query parameter
   - This helps maintain focus across all chunks
3. Choose the Right Strategy
   - Use semantic chunking for most documents
   - Use fixed chunking for structured data
   - Use hierarchical chunking for complex technical documents
4. Synthesize Results
   - After processing all chunks, always synthesize the findings
   - Look for patterns and connections across chunks
   - Provide a comprehensive final answer
5. Monitor Chunk Count
   - Very large documents may generate many chunks
   - Consider the max_chunks limit (default: 100)
   - If you hit the limit, increase chunk_size
Example Workflow
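The tool normally orchestrates these steps for you, so the explicit calls below are purely illustrative. As before, `call_tool` and the `chunk_ids` response field are assumptions rather than documented API:

```python
def call_tool(payload: dict):
    """Placeholder for your platform's tool-invocation mechanism."""
    raise NotImplementedError

query = "What risks does this report identify, and how severe are they?"

# 1. Chunk a document that exceeds the model's context window.
created = call_tool({
    "operation": "create_chunks",
    "content": open("annual_report.txt").read(),
    "query": query,
    "chunking_strategy": "semantic",
})

# 2. Process each chunk with the same query.
findings = [
    call_tool({"operation": "process_chunk", "chunk_id": chunk_id})
    for chunk_id in created["chunk_ids"]  # assumed response field
]

# 3. Synthesize: in normal use the tool combines results automatically;
#    here the per-chunk findings are simply joined for illustration.
final_answer = "\n\n".join(findings)
```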
Technical Details
- Token Estimation: Uses approximate token counting (1 token ≈ 4 characters)
- Context Window Safety: Uses 80% of model’s context window to ensure reliability
- Overlap Handling: 10% overlap between chunks preserves context at boundaries
- Automatic Orchestration: Handles all API calls and result synthesis automatically
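These rules make sizing easy to estimate ahead of time. A small sketch of the arithmetic (approximations only; the tool’s internal counting may differ):

```python
import math

def estimate_tokens(text: str) -> int:
    # Approximate token counting: 1 token ≈ 4 characters.
    return len(text) // 4

def chunk_plan(context_window: int, text: str) -> tuple[int, int, int]:
    """Estimate chunk size, overlap, and chunk count from the rules above."""
    chunk_size = int(context_window * 0.8)  # 80% safety margin
    overlap = chunk_size // 10              # 10% overlap at boundaries
    step = chunk_size - overlap             # net new tokens per chunk
    total = estimate_tokens(text)
    num_chunks = max(1, math.ceil((total - overlap) / step))
    return chunk_size, overlap, num_chunks

# e.g. a 128,000-token window and a ~2M-character document:
# chunks of 102,400 tokens, 10,240 overlap, about 6 chunks.
print(chunk_plan(128_000, "x" * 2_000_000))
```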
When to Use Context Chunking
✅ Use when:
- Content exceeds model’s context window
- Processing very large documents (>50,000 tokens)
- You receive context limit errors
- You need systematic analysis of lengthy content
❌ Avoid when:
- Content fits within model limits
- You need real-time processing
- Document structure must remain intact
- Quick responses take priority over thoroughness
Related Tools
- Add Data: For including complete file contents in workflows (use before Context Chunking)
- Model Selector: For choosing appropriate models for chunk processing
- Space Search: For finding and retrieving documents to chunk