Python: https://github.com/chonkie-inc/chonkie
TypeScript: https://github.com/chonkie-inc/chonkie-ts
Here's a video showing our code chunker: https://youtu.be/Xclkh6bU1P0.
Bhavnick and I have been building personal projects with LLMs for a few years. For much of this time, we found ourselves writing our own chunking logic to support RAG applications. We often hesitated to use existing libraries because they either had only basic features or felt too bloated (some are 80MB+).
We built Chonkie to be lightweight, fast, extensible, and easy. The space is evolving rapidly, and we wanted Chonkie to be able to quickly support the newest strategies. We currently support: Token Chunking, Sentence Chunking, Recursive Chunking, Semantic Chunking, plus:
- Semantic Double Pass Chunking: Chunks text semantically first, then merges closely related chunks.
- Code Chunking: Chunks code files by creating an AST and finding ideal split points.
- Late Chunking: Based on the paper (https://arxiv.org/abs/2409.04701), where chunk embeddings are derived from embedding a longer document.
- Slumber Chunking: Based on the "LumberChunker" paper (https://arxiv.org/abs/2406.17526). It uses recursive chunking, then an LLM verifies split points, aiming for high-quality chunks with reduced token usage and LLM costs.
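To make the code chunking idea more concrete, here is a minimal sketch of AST-guided splitting. It is not Chonkie's implementation: it only handles Python source, uses the standard library `ast` module as a stand-in parser, and the function name `chunk_python_source` and the `max_lines` parameter are made up for illustration. The idea is that the start of each top-level statement is a candidate split point, and neighbouring statements are merged greedily until a size limit would be exceeded.

```python
import ast


def chunk_python_source(source: str, max_lines: int = 20) -> list[str]:
    """Split Python source at top-level statement boundaries.

    Hypothetical helper for illustration only. Chunks cover the file
    contiguously, so joining them with newlines restores the input lines.
    A single statement longer than max_lines is kept whole, never cut.
    """
    lines = source.splitlines()
    tree = ast.parse(source)

    # Candidate split points: the first line (0-based) of each top-level
    # statement, counting decorators as part of the statement they decorate.
    starts = []
    for node in tree.body:
        decorators = getattr(node, "decorator_list", [])
        first_line = min([node.lineno] + [d.lineno for d in decorators])
        starts.append(first_line - 1)

    if not starts:
        return [source] if source.strip() else []

    # Units are the spans between consecutive split points. The first unit
    # starts at line 0 so leading comments stay attached to the first chunk.
    bounds = [0] + starts[1:] + [len(lines)]

    chunks = []
    chunk_start = 0
    for unit_start, unit_end in zip(bounds, bounds[1:]):
        # Close the current chunk if adding this unit would exceed the limit.
        if unit_end - chunk_start > max_lines and unit_start > chunk_start:
            chunks.append("\n".join(lines[chunk_start:unit_start]))
            chunk_start = unit_start
    chunks.append("\n".join(lines[chunk_start:]))
    return chunks
```

Because splits only happen between top-level statements, every chunk is itself parseable Python. Real multi-language chunkers would need a parser that covers more than one language, and would typically measure size in tokens rather than lines.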
You can see how Chonkie compares to LangChain and LlamaIndex in our benchmarks: https://github.com/chonkie-inc/chonkie/blob/main/BENCHMARKS....
Some technical details about the Chonkie package:
- ~15MB default install vs. ~80-170MB for some alternatives.
- Up to 33x faster token chunking compared to LangChain and LlamaIndex in our tests.
- Works with major tokenizers (transformers, tokenizers, tiktoken).
- Zero external dependencies for basic functionality.
- Implements aggressive caching and precomputation.
- Uses running mean pooling for efficient semantic chunking.
- Modular dependency system (install only what you need).
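For readers unfamiliar with running mean pooling, here is one way the idea can be sketched. This is an illustration under assumptions, not Chonkie's code: the names `RunningMean` and `group_by_similarity`, the cosine threshold, and the toy 2-dimensional vectors are all invented for the example. The point is that the pooled embedding of a growing chunk can be updated in O(d) per sentence instead of re-averaging every sentence in the chunk each time.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class RunningMean:
    """Incrementally maintained mean of the vectors added so far."""

    def __init__(self, dim):
        self.mean = [0.0] * dim
        self.count = 0

    def add(self, vec):
        # Standard incremental update: mean += (x - mean) / n
        self.count += 1
        self.mean = [m + (v - m) / self.count for m, v in zip(self.mean, vec)]


def group_by_similarity(embeddings, threshold=0.5):
    """Group consecutive sentence embeddings into chunks (lists of indices).

    A new chunk starts when a sentence's similarity to the running mean of
    the current chunk falls below the (hypothetical) threshold.
    """
    groups = []
    current = []
    pooled = None
    for i, vec in enumerate(embeddings):
        if current and cosine(pooled.mean, vec) < threshold:
            groups.append(current)
            current = []
        if not current:
            pooled = RunningMean(len(vec))
        current.append(i)
        pooled.add(vec)
    if current:
        groups.append(current)
    return groups
```

In practice the vectors would come from a sentence embedding model, and the threshold would need tuning for that model.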
In addition to chunking, Chonkie also provides an easy way to create embeddings. For supported providers (SentenceTransformer, Model2Vec, OpenAI), you just specify the model name as a string. You can also create custom embedding handlers for other providers.
RAG is still the most common use case. However, Chonkie makes chunks that are optimized for creating high-quality embeddings and vector retrieval, so it is not really tied to the "generation" part of RAG. In fact, we're seeing more and more people use Chonkie for implementing semantic search and/or setting context for agents.
We are currently focused on building integrations to simplify the retrieval process. We've created "handshakes" – thin functions that interact with vector DBs like pgVector, Chroma, TurboPuffer, and Qdrant, allowing you to interact with storage easily. If there's an integration you'd like to see (vector DB or otherwise), please let us know.
We also offer hosted and on-premise versions with OCR, extra metadata, all embedding providers, and managed vector databases for teams that want a fully managed pipeline. If you're interested, reach out at shreyash@chonkie.ai or book a demo: https://cal.com/shreyashn/chonkie-demo.
We're eager to hear your feedback and comments! Thanks!
We’re Shreyash and Bhavnick. We built Chonkie, an open-source library for advanced chunking and embedding of text and code. It was previously Python-only, but we just released a TypeScript version: https://github.com/chonkie-inc/chonkie-ts
Many AI projects in JS/TS (like those using Vercel's AI SDK or Mastra) rely on basic text splitters. But better chunking = better retrieval = better performance. That’s what Chonkie is built for.
Current native chunkers (in TS):
- Code Chunker – handles Python, TypeScript, etc.
- Recursive Chunker – rule-based, hierarchical splitting
- Token Chunker – split by token count (fully customizable)
- Sentence Chunker – split on sentence boundaries. Delimiters are customizable, so it works for multiple languages.
All chunkers support custom tokenizers, chunk overlap, delimiters, and more.
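To show how token count, overlap, and a custom tokenizer fit together, here is a small Python sketch of the general technique. It does not use the chonkie-ts API: `chunk_tokens` and its parameters are hypothetical names, and whitespace splitting stands in for a real tokenizer.

```python
def chunk_tokens(text, chunk_size=8, overlap=2, tokenize=str.split):
    """Split text into token windows of chunk_size with a fixed overlap.

    `tokenize` is any callable mapping a string to a list of tokens;
    whitespace splitting is only a placeholder for a real tokenizer.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    tokens = tokenize(text)
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window reaches the end, to avoid a trailing chunk
        # that contains only already-seen overlap tokens.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

Passing `tokenize=list` splits by character instead, which is a quick way to see that the windowing logic is independent of the tokenizer.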
Coming soon in native TS (already available via the API client):
- Semantic Chunker – splits text wherever it detects a shift in meaning.
- SDPM Chunker – merges semantically similar disjoint chunks
- Late Chunker – generates context-aware embeddings for each chunk
- Slumber Chunker – LLM-refined recursive chunks. Significantly reduces token usage (and thus cost) while maximizing chunk quality.
- Embeddings Refinery – Embed chunks with any embedding model
- Overlap Refinery – Create overlaps between consecutive chunks for better context preservation.
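For anyone wondering what "context-aware embeddings" means for the Late Chunker, here is a rough Python sketch of the pooling step from the late chunking idea. It is an illustration under assumptions, not the Chonkie implementation: `late_chunk_embeddings` is a made-up name, the token vectors are toy values, and in a real setup they would come from one pass of a long-context embedding model over the whole document.

```python
def late_chunk_embeddings(token_embeddings, spans):
    """Mean-pool document-level token embeddings over each chunk span.

    token_embeddings: one vector per token, computed over the full document
        so each token vector already reflects surrounding context.
    spans: (start, end) token index pairs, end exclusive, one per chunk.
    """
    pooled = []
    for start, end in spans:
        window = token_embeddings[start:end]
        if not window:
            raise ValueError(f"empty span: ({start}, {end})")
        dim = len(window[0])
        pooled.append(
            [sum(vec[d] for vec in window) / len(window) for d in range(dim)]
        )
    return pooled


# Toy 2-dimensional token vectors for a 4-token "document".
tokens = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 0.0]]
chunk_vectors = late_chunk_embeddings(tokens, [(0, 2), (2, 4)])
```

The difference from embedding each chunk separately is the order of operations: the model sees the full document first, and chunk boundaries are applied only at pooling time.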
Chonkie is free, open-source, and MIT licensed. GitHub: https://github.com/chonkie-inc/chonkie-ts
We’d love your feedback, ideas, or contributions. Thanks!