Context Chunking

The Context Chunking tool enables processing of large documents by intelligently splitting them into manageable pieces. When content exceeds a model’s context window limit, this tool automatically chunks the document and orchestrates a multi-step analysis process.

Key Features

  • Automatic chunk size optimization based on model context windows
  • Multiple chunking strategies (semantic, fixed, hierarchical)
  • Two-step processing flow for efficient large document analysis
  • Smart overlap between chunks to preserve context
  • Temporary storage with automatic cleanup (1-hour TTL)
  • Token estimation and safety limits

How It Works

The Context Chunking tool works automatically under the hood to handle large documents:
  1. Automatic Chunking: When your content exceeds the model’s context window, the tool intelligently splits it into smaller chunks that fit within the model’s limits.
  2. Smart Processing: Each chunk is processed individually with your provided prompt (or defaults to analyzing and summarizing the content if no specific prompt is given).
  3. Seamless Integration: The tool automatically selects the optimal chunking strategy based on your content and model, making individual API calls with smaller chunks that fit perfectly into your selected model’s context window.
  4. Result Synthesis: After processing all chunks, results are combined into a comprehensive answer.
You don’t need to manage the chunking process manually—simply enable the tool and it handles everything automatically.
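The four steps above can be sketched in plain Python. This is an illustrative simulation only, not the tool's implementation: the function names are placeholders, and the character-based splitting stands in for the real token-based orchestration.

```python
# Illustrative sketch of the chunk -> process -> synthesize flow.
# All names here are hypothetical stand-ins for the tool's internals.

def create_chunks(content: str, chunk_size: int = 100, overlap: int = 10) -> list[str]:
    """Split content into overlapping chunks (characters stand in for tokens)."""
    chunks, start = [], 0
    while start < len(content):
        chunks.append(content[start:start + chunk_size])
        if start + chunk_size >= len(content):
            break
        start += chunk_size - overlap  # step back by `overlap` to preserve context

    return chunks

def process_chunk(chunk: str, prompt: str) -> str:
    """Stand-in for one model call on a single chunk."""
    return f"[{prompt}] -> {len(chunk)} chars analyzed"

def analyze(content: str, prompt: str = "Summarize") -> str:
    """Chunk the content, process each piece, then combine the partial results."""
    partials = [process_chunk(c, prompt) for c in create_chunks(content)]
    return "\n".join(partials)  # result-synthesis step
```

Note the overlap: each chunk repeats the tail of the previous one, so sentences that straddle a boundary are still seen whole by at least one call.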

Chunking Strategies

By default, the tool automatically selects the optimal strategy for your content. However, you can configure specific strategies when needed:

Semantic Chunking (Default)

  • Respects natural document boundaries (paragraphs, sentences)
  • Maintains context and readability
  • Best for: General documents, articles, reports

Fixed Chunking

  • Creates equal-sized chunks based on token count
  • Predictable chunk sizes
  • Best for: Structured data, when uniform processing is needed

Hierarchical Chunking

  • Starts with semantic boundaries
  • Creates nested structure for complex documents
  • Best for: Long technical documents with clear structure
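To make the default strategy concrete, here is a minimal sketch of semantic chunking: greedily pack whole paragraphs into chunks and start a new chunk when the next paragraph would overflow. The real strategy works on tokens and also falls back to sentence boundaries; this character-based version is an assumption for illustration only.

```python
# Minimal semantic-chunking sketch (illustrative only): paragraphs are never
# split, so each chunk remains readable on its own.

def semantic_chunks(text: str, max_chars: int = 200) -> list[str]:
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)  # current chunk is full; start a new one
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```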

Parameters

create_chunks Operation

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| operation | string | Yes | Set to "create_chunks" |
| content | string | Yes | The large text or document to chunk |
| query | string | No | The specific question or analysis task |
| chunking_strategy | string | No | Strategy: "semantic", "fixed", or "hierarchical" (default: "semantic") |
| chunk_size | integer | No | Target tokens per chunk (auto-calculated if not provided) |
| chunk_overlap | integer | No | Tokens to overlap between chunks (auto-calculated if not provided) |
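A create_chunks request might be shaped like the following. The parameter names come from the table above; the transport (how the request object is actually sent) is product-specific and not shown, and the document text is a stand-in.

```python
# Example create_chunks request body. Only the parameter names are from the
# table above; the content is a placeholder for a real large document.
large_text = "Quarterly findings... " * 5_000  # imagine a document too big for one call

create_request = {
    "operation": "create_chunks",           # required
    "content": large_text,                  # required: the document to split
    "query": "What are the key findings?",  # optional: keeps every chunk focused
    "chunking_strategy": "semantic",        # optional: "semantic" | "fixed" | "hierarchical"
    "chunk_size": 50_000,                   # optional: target tokens per chunk
    "chunk_overlap": 5_000,                 # optional: tokens shared between chunks
}
```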

process_chunk Operation

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| operation | string | Yes | Set to "process_chunk" |
| chunk_id | string | Yes | The ID of the chunk to retrieve and process |
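After create_chunks returns chunk IDs, each one is retrieved and analyzed with a process_chunk call. The ID format and the `chunk_ids` list below are hypothetical; only the two parameter names come from the table.

```python
# Hypothetical IDs as returned by a prior create_chunks call.
chunk_ids = ["chunk_001", "chunk_002", "chunk_003"]

# One process_chunk request per chunk; the results are then synthesized.
process_requests = [
    {"operation": "process_chunk", "chunk_id": cid}  # both fields are required
    for cid in chunk_ids
]
```

Chunks are held in temporary storage with a 1-hour TTL, so process_chunk calls should follow create_chunks promptly.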

Configuration Options

The tool can be configured in multiple ways:

During Assistant Creation

When creating or configuring an assistant, you can customize the Context Chunking tool with:
  • chunking_strategy: Choose between semantic, fixed, or hierarchical
  • chunk_size: Set custom chunk size in tokens
  • chunk_overlap: Define overlap between chunks
  • max_chunks: Set maximum number of chunks
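The four options above might be supplied as a settings object like the one below. The key names follow the list above; the surrounding assistant-creation call is product-specific and omitted.

```python
# Hypothetical tool-configuration object for assistant creation.
context_chunking_config = {
    "chunking_strategy": "hierarchical",  # semantic | fixed | hierarchical
    "chunk_size": 80_000,                 # tokens per chunk
    "chunk_overlap": 8_000,               # 10% of chunk_size, matching the default ratio
    "max_chunks": 50,                     # stay below the 100-chunk safety limit
}
```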

Organization Defaults

Default values apply when not explicitly configured:
  • default_chunk_size: 100,000 tokens (uses 80% of model’s context window)
  • chunk_overlap: 10,000 tokens (10% of chunk size)
  • max_chunks: 100 chunks (safety limit)
These values are automatically optimized based on the target model’s capabilities.
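The auto-optimization can be approximated from the documented 80% context-window margin and 10% overlap ratio. This is a sketch of the idea, not the tool's exact formula.

```python
def derive_defaults(context_window_tokens: int) -> dict:
    """Approximate the auto-calculated defaults for a given model.

    Uses the documented 80% safety margin and 10% overlap ratio; the
    tool's internal formula may differ.
    """
    chunk_size = int(context_window_tokens * 0.8)
    return {
        "chunk_size": chunk_size,
        "chunk_overlap": chunk_size // 10,
        "max_chunks": 100,  # fixed safety limit
    }
```

For example, a 128,000-token context window yields a chunk size of 102,400 tokens with a 10,240-token overlap.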

Use Cases

Large Document Analysis

Process lengthy reports, research papers, or documentation that exceed context limits:
1. Upload or provide the large document
2. Specify your analysis question
3. Let the tool chunk and process systematically
4. Receive a comprehensive analysis

Multi-file Processing

When analyzing multiple large files in a workflow:
1. Use Add Data tool to load files
2. Use Context Chunking to process each large file
3. Synthesize insights across all documents

Token-Limited Models

Optimize usage of models with smaller context windows:
1. Content is automatically split to fit model limits
2. Each chunk is processed within safe token bounds
3. Results are combined for complete coverage

Best Practices

  1. Let Auto-Optimization Work
    • Don’t specify chunk_size unless you have specific requirements
    • The tool automatically calculates optimal sizes based on your model
  2. Provide Clear Queries
    • Include your analysis question in the query parameter
    • This helps maintain focus across all chunks
  3. Choose the Right Strategy
    • Use semantic chunking for most documents
    • Use fixed chunking for structured data
    • Use hierarchical for complex technical documents
  4. Synthesize Results
    • After processing all chunks, always synthesize the findings
    • Look for patterns and connections across chunks
    • Provide a comprehensive final answer
  5. Monitor Chunk Count
    • Very large documents may generate many chunks
    • Consider the max_chunks limit (default: 100)
    • If you hit the limit, increase chunk_size
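To anticipate whether a document will hit the max_chunks limit, the chunk count can be estimated from document size, chunk size, and overlap. This is a back-of-the-envelope formula, not the tool's internal calculation.

```python
import math

def estimated_chunks(total_tokens: int, chunk_size: int, overlap: int) -> int:
    """Rough chunk-count estimate: after the first chunk, each subsequent
    chunk advances by (chunk_size - overlap) tokens."""
    step = chunk_size - overlap
    return max(1, math.ceil((total_tokens - overlap) / step))
```

A 500,000-token paper with 100,000-token chunks and 10,000-token overlap needs only about 6 chunks, comfortably under the default limit of 100.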

Example Workflow

Automatic Processing Example:
→ Input: Large research paper (500,000+ tokens)
→ Tool automatically chunks into optimal sizes
→ Each chunk processed with your prompt
→ Results synthesized into comprehensive answer

All of this happens seamlessly behind the scenes!

Technical Details

  • Token Estimation: Uses approximate token counting (1 token ≈ 4 characters)
  • Context Window Safety: Uses 80% of model’s context window to ensure reliability
  • Overlap Handling: 10% overlap between chunks preserves context at boundaries
  • Automatic Orchestration: Handles all API calls and result synthesis automatically
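The 1 token ≈ 4 characters heuristic and the 80% safety margin combine into a simple check for whether chunking will kick in. Real tokenizers vary by language and content, so treat this as an approximation.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic from the docs: ~4 characters per token."""
    return max(1, len(text) // 4)

def needs_chunking(text: str, context_window_tokens: int) -> bool:
    """Chunking applies once the estimate exceeds 80% of the context window."""
    return estimate_tokens(text) > int(context_window_tokens * 0.8)
```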

When to Use Context Chunking

Use when:
  • Content exceeds model’s context window
  • Processing very large documents (>50,000 tokens)
  • You receive context limit errors
  • You need systematic analysis of lengthy content
Don’t use when:
  • Content fits within model limits
  • You need real-time processing
  • Document structure must remain intact
  • Quick responses are priority over thoroughness

Related Tools

  • Add Data: For including complete file contents in workflows (use before Context Chunking)
  • Model Selector: For choosing appropriate models for chunk processing
  • Space Search: For finding and retrieving documents to chunk