Context Chunking
The Context Chunking tool enables processing of large documents by intelligently splitting them into manageable pieces. When content exceeds a model’s context window limit, this tool automatically chunks the document and orchestrates a multi-step analysis process.
Key Features
- Automatic chunk size optimization based on model context windows
- Multiple chunking strategies (semantic, fixed, hierarchical)
- Two-step processing flow for efficient large document analysis
- Smart overlap between chunks to preserve context
- Temporary storage with automatic cleanup (1-hour TTL)
- Token estimation and safety limits
How It Works
The Context Chunking tool works automatically under the hood to handle large documents:
- Automatic Chunking: When your content exceeds the model’s context window, the tool intelligently splits it into smaller chunks that fit within the model’s limits.
- Smart Processing: Each chunk is processed individually with your provided prompt (or defaults to analyzing and summarizing the content if no specific prompt is given).
- Seamless Integration: The tool automatically selects a suitable chunking strategy based on your content and model, then makes individual API calls with chunks that fit within your selected model’s context window.
- Result Synthesis: After processing all chunks, results are combined into a comprehensive answer.
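The flow can be pictured as a short loop. Below is a minimal sketch; `call_tool` is a stand-in for however your assistant invokes tools, and the `chunk_ids` response field is an assumption, since the exact response shape is not documented here. Only the operation payloads come from this page.

```python
# Minimal sketch of the two-step flow. `call_tool` is a placeholder for
# your platform's tool-invocation mechanism, and the `chunk_ids` response
# field is an assumption; only the payload fields are documented here.

def analyze_large_document(call_tool, content: str, query: str) -> str:
    # Step 1: split the oversized document into model-sized chunks.
    created = call_tool({
        "operation": "create_chunks",
        "content": content,
        "query": query,
    })

    # Step 2: retrieve and process each chunk with the same query.
    findings = [
        call_tool({"operation": "process_chunk", "chunk_id": chunk_id})
        for chunk_id in created["chunk_ids"]  # assumed response field
    ]

    # Step 3: combine per-chunk findings into one comprehensive answer.
    return "\n\n".join(findings)
```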
Chunking Strategies
By default, the tool automatically selects the optimal strategy for your content. However, you can configure specific strategies when needed:
Semantic Chunking (Default)
- Respects natural document boundaries (paragraphs, sentences)
- Maintains context and readability
- Best for: General documents, articles, reports
Fixed Chunking
- Creates equal-sized chunks based on token count
- Predictable chunk sizes
- Best for: Structured data, when uniform processing is needed
Hierarchical Chunking
- Starts with semantic boundaries
- Creates nested structure for complex documents
- Best for: Long technical documents with clear structure
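When the automatic choice is not what you want, the strategy can be pinned via the `chunking_strategy` parameter (documented in the next section). For example, to force hierarchical chunking:

```python
# Request hierarchical chunking explicitly instead of relying on the
# automatic selection; all fields are from the parameter tables below.
request = {
    "operation": "create_chunks",
    "content": "...a long technical document with clear structure...",
    "chunking_strategy": "hierarchical",
}
```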
Parameters
create_chunks Operation
| Parameter | Type | Required | Description |
|---|---|---|---|
| operation | string | Yes | Set to "create_chunks" |
| content | string | Yes | The large text or document to chunk |
| query | string | No | The specific question or analysis task |
| chunking_strategy | string | No | Strategy: "semantic", "fixed", or "hierarchical" (default: "semantic") |
| chunk_size | integer | No | Target tokens per chunk (auto-calculated if not provided) |
| chunk_overlap | integer | No | Tokens to overlap between chunks (auto-calculated if not provided) |
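For illustration, a fully specified `create_chunks` payload might look like this (the values are arbitrary examples, not recommendations):

```python
create_request = {
    "operation": "create_chunks",                 # required
    "content": "...the full document text...",    # required
    "query": "Summarize the key findings",        # optional: keeps chunks focused
    "chunking_strategy": "semantic",              # optional: default is "semantic"
    "chunk_size": 50_000,                         # optional: target tokens per chunk
    "chunk_overlap": 5_000,                       # optional: 10% of chunk_size
}
```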
process_chunk Operation
| Parameter | Type | Required | Description |
|---|---|---|---|
| operation | string | Yes | Set to "process_chunk" |
| chunk_id | string | Yes | The ID of the chunk to retrieve and process |
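And a corresponding `process_chunk` payload (the ID shown is illustrative; use whatever IDs `create_chunks` returns):

```python
process_request = {
    "operation": "process_chunk",
    "chunk_id": "chunk-123",  # illustrative; use an ID returned by create_chunks
}
```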
Configuration Options
The tool can be configured in multiple ways:
During Assistant Creation
When creating or configuring an assistant, you can customize the Context Chunking tool with:
- chunking_strategy: Choose between semantic, fixed, or hierarchical
- chunk_size: Set custom chunk size in tokens
- chunk_overlap: Define overlap between chunks
- max_chunks: Set maximum number of chunks
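As a sketch, such a configuration could be expressed as a simple mapping. The field names are the ones listed above, but how the configuration is attached to an assistant depends on your platform:

```python
# Example Context Chunking configuration for an assistant. Field names
# match the options above; the surrounding configuration API will vary.
context_chunking_config = {
    "chunking_strategy": "fixed",  # "semantic", "fixed", or "hierarchical"
    "chunk_size": 50_000,          # tokens per chunk
    "chunk_overlap": 5_000,        # 10% of chunk_size
    "max_chunks": 200,             # raise the safety limit for very large inputs
}
```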
Organization Defaults
Default values apply when not explicitly configured:
- default_chunk_size: 100,000 tokens (uses 80% of model’s context window)
- chunk_overlap: 1,000 tokens (10% of chunk size)
- max_chunks: 100 chunks (safety limit)
Use Cases
Large Document Analysis
Process lengthy reports, research papers, or documentation that exceed context limits.
Multi-file Processing
Analyze multiple large files within a single workflow, as sketched at the end of this section.
Token-Limited Models
Optimize usage of models with smaller context windows.
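For the multi-file case, one workable pattern is to concatenate the files with labeled separators before chunking, so each chunk retains a hint of which file it came from. A hedged sketch (the file handling is outside this tool):

```python
from pathlib import Path

# Combine several large files into one labeled document, then chunk once.
# The "=== filename ===" markers are a convention, not a tool requirement.
combined = "\n\n".join(
    f"=== {path.name} ===\n{path.read_text()}"
    for path in sorted(Path("reports").glob("*.md"))
)

request = {
    "operation": "create_chunks",
    "content": combined,
    "query": "Compare the conclusions across these reports",
}
```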
Best Practices
1. Let Auto-Optimization Work
   - Don’t specify chunk_size unless you have specific requirements
   - The tool automatically calculates optimal sizes based on your model
2. Provide Clear Queries
   - Include your analysis question in the query parameter
   - This helps maintain focus across all chunks
3. Choose the Right Strategy
   - Use semantic chunking for most documents
   - Use fixed chunking for structured data
   - Use hierarchical chunking for complex technical documents
4. Synthesize Results
   - After processing all chunks, always synthesize the findings
   - Look for patterns and connections across chunks
   - Provide a comprehensive final answer
5. Monitor Chunk Count
   - Very large documents may generate many chunks
   - Consider the max_chunks limit (default: 100)
   - If you hit the limit, increase chunk_size
Example Workflow
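The tool normally orchestrates these steps for you, so the explicit calls below are purely illustrative. As before, `call_tool` and the `chunk_ids` response field are assumptions rather than documented API:

```python
def call_tool(payload: dict):
    """Placeholder for your platform's tool-invocation mechanism."""
    raise NotImplementedError

query = "What risks does this report identify, and how severe are they?"

# 1. Chunk a document that exceeds the model's context window.
created = call_tool({
    "operation": "create_chunks",
    "content": open("annual_report.txt").read(),
    "query": query,
    "chunking_strategy": "semantic",
})

# 2. Process each chunk with the same query.
findings = [
    call_tool({"operation": "process_chunk", "chunk_id": chunk_id})
    for chunk_id in created["chunk_ids"]  # assumed response field
]

# 3. Synthesize: in normal use the tool combines results automatically;
#    here the per-chunk findings are simply joined for illustration.
final_answer = "\n\n".join(findings)
```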
Technical Details
- Token Estimation: Uses approximate token counting (1 token ≈ 4 characters)
- Context Window Safety: Uses 80% of model’s context window to ensure reliability
- Overlap Handling: 10% overlap between chunks preserves context at boundaries
- Automatic Orchestration: Handles all API calls and result synthesis automatically
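These rules make sizing easy to estimate ahead of time. A small sketch of the arithmetic (approximations only; the tool’s internal counting may differ):

```python
import math

def estimate_tokens(text: str) -> int:
    # Approximate token counting: 1 token ≈ 4 characters.
    return len(text) // 4

def chunk_plan(context_window: int, text: str) -> tuple[int, int, int]:
    """Estimate chunk size, overlap, and chunk count from the rules above."""
    chunk_size = int(context_window * 0.8)  # 80% safety margin
    overlap = chunk_size // 10              # 10% overlap at boundaries
    step = chunk_size - overlap             # net new tokens per chunk
    total = estimate_tokens(text)
    num_chunks = max(1, math.ceil((total - overlap) / step))
    return chunk_size, overlap, num_chunks

# e.g. a 128,000-token window and a ~2M-character document:
# chunks of 102,400 tokens, 10,240 overlap, about 6 chunks.
print(chunk_plan(128_000, "x" * 2_000_000))
```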
When to Use Context Chunking
✅ Use when:
- Content exceeds model’s context window
- Processing very large documents (>50,000 tokens)
- You receive context limit errors
- You need systematic analysis of lengthy content
❌ Avoid when:
- Content fits within model limits
- You need real-time processing
- Document structure must remain intact
- Quick responses take priority over thoroughness
Related Tools
- Add Data: For including complete file contents in workflows (use before Context Chunking)
- Model Selector: For choosing appropriate models for chunk processing
- Space Search: For finding and retrieving documents to chunk