chunking-algorithm

SmartChunk is a lightweight, structure-aware semantic chunking toolkit designed to supercharge RAG (Retrieval-Augmented Generation) and LLM pipelines. Unlike naive splitters that break text arbitrarily, SmartChunk respects document structure (headings, lists, tables, code blocks) and semantic flow, ensuring cleaner, more coherent chunks.

nlp cli package semantic pip chunking rag chunking-algorithm llm agentic-workflow

Updated Oct 10, 2025
Python
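
To illustrate what "structure-aware" chunking can mean in practice, here is a minimal sketch. It is not SmartChunk's implementation or API: the function name `chunk_by_structure`, the `max_chars` parameter, and the Markdown-style heading and fence rules are all assumptions chosen for illustration. The idea is to start a new block at each heading, never cut inside a fenced code block, and then pack whole blocks into chunks up to a size limit.

```python
import re

# Assumption: headings look like Markdown ATX headings ("# Title").
HEADING = re.compile(r"^#{1,6}\s")
# Assumption: code blocks are delimited by lines starting with three backticks.
FENCE = "`" * 3


def chunk_by_structure(text, max_chars=500):
    """Hypothetical helper: split text at headings, keep fenced code whole,
    then greedily pack the resulting blocks into chunks of at most max_chars.
    A single block longer than max_chars is emitted as-is rather than cut."""
    blocks, current, in_fence = [], [], False
    for line in text.splitlines():
        if line.startswith(FENCE):
            in_fence = not in_fence  # toggle on opening and closing fence
            current.append(line)
            continue
        if not in_fence and HEADING.match(line) and current:
            # A heading outside a code fence closes the previous block.
            blocks.append("\n".join(current))
            current = [line]
        else:
            current.append(line)
    if current:
        blocks.append("\n".join(current))

    chunks, buf = [], ""
    for block in blocks:
        candidate = block if not buf else buf + "\n" + block
        if buf and len(candidate) > max_chars:
            chunks.append(buf)  # flush before the limit would be exceeded
            buf = block
        else:
            buf = candidate
    if buf:
        chunks.append(buf)
    return chunks
```

A naive fixed-width splitter would happily cut through the middle of a code block or separate a heading from its first paragraph; packing whole blocks avoids both, at the cost of uneven chunk sizes.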

mg98 / ae-chunker-go

Go implementation of the AE chunking algorithm.

go golang chunking chunking-algorithm

Updated Jan 4, 2023
Go

FastPix / android-uploads-sdk

Android resumable uploads SDK from FastPix

android kotlin java retrofit2 resumable-upload chunking-algorithm

Updated Oct 11, 2025
Kotlin

isaka-james / chunks-to-file

A Node.js chunking system

nodejs chunk chunking chunked-uploads chunks chunking-algorithm chunking-files nodejs-chunking node-chunking

Updated Sep 26, 2024
JavaScript

i5heu / ChunkingChampions

Explore and benchmark the world of data chunking algorithms in 'ChunkingChampions' - a competitive arena to determine the most efficient and effective chunking strategies for varied data sizes.

benchmark ranking chunking chunking-algorithm

Updated Apr 6, 2024

sanbaiw / semtxtsplitter

A smol Go package for splitting text into chunks while preserving semantic meaning.

nlp rag chunking-algorithm

Updated Apr 28, 2025
Go

D-X-W-Clerker / clerker-ai

[2024-2] "Clerker", a meeting support platform service built on the Mermaid model

ai deep-learning summarization stt chunking-algorithm llm

Updated Nov 25, 2024
Python

mahnoorsheikh16 / NLP-Framework-for-Literature-Summarization-in-Law-and-Policy

Implementation of an interactive chatbot for summarizing legal and policy documents. Includes data preprocessing (cleaning, tokenization, chunking), extractive summarization baselines, and fine-tuned abstractive models (PEGASUS and LED). Integrates a retrieval layer for document relevance and uses ROUGE, BLEU, and cosine similarity for evaluation.

led text-summarization cosine-similarity pegasus rouge-metric nlp-keywords-extraction policy-analysis tokenization bleu-score encoder-decoder-model retrieval-chatbot chunking-algorithm longformer-models

Updated Oct 18, 2025
Jupyter Notebook

Pavansomisetty21 / Chunking-Strategies

A detailed overview of chunking

chunk chunking chunks chunking-algorithm

Updated Sep 26, 2024

eliashossain001 / Yo_PowerpointAddInTTS

MS PowerPoint extension created using the Yeoman generator and React

react javascript text-to-speech rest rest-api audio-processing chunking-algorithm

Updated Jul 17, 2023
JavaScript

mudssrali / chunkify

A simple utility that splits a given array into chunks of a specified size, with an option to reverse the array

javascript typescript array split chunk chunking-algorithm array-splitter chunking-array

Updated Apr 25, 2021
TypeScript
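
The array-splitting behavior described in this entry can be sketched in a few lines. This is an illustrative Python version, not the package's TypeScript API: the function name, the parameter names, and the choice to reverse the input before splitting (rather than reversing the order of the chunks afterwards) are all assumptions.

```python
def chunkify(items, size, reverse=False):
    """Hypothetical helper: split items into consecutive chunks of at most
    `size` elements. With reverse=True the input is reversed first
    (an assumed interpretation of the 'array reverse option')."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    seq = list(reversed(items)) if reverse else list(items)
    # Step through the sequence in strides of `size`; the last chunk
    # may be shorter when len(seq) is not a multiple of size.
    return [seq[i:i + size] for i in range(0, len(seq), size)]


print(chunkify([1, 2, 3, 4, 5], 2))                # [[1, 2], [3, 4], [5]]
print(chunkify([1, 2, 3, 4, 5], 2, reverse=True))  # [[5, 4], [3, 2], [1]]
```

Copying the input into a new list keeps the caller's data untouched, which matters when the reverse option is used.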

vaibhavdangar09 / CUDAQuest-Semantic-Crawl-to-Answer-Engine

scraping python3 embeddings rag scraping-python milvus chunking-algorithm vector-database sentence-transformers llm generative-ai langchain

Updated Jul 17, 2024
Python

tainmou / SmartChunk

🧩 Enhance RAG processes with SmartChunk, a Python package that creates quality text chunks while preserving structure and meaning for better retrieval.

nlp cli package semantic pip chunking rag chunking-algorithm llm agentic-workflow

Updated Oct 22, 2025

chunking-algorithm

Here are 17 public repositories matching this topic...

chonkie-inc / chonkie

chonkie-inc / chonkiejs

nlfiedler / fastcdc-rs

iscc / fastcdc-py

ayush585 / SmartChunk

mg98 / ae-chunker-go

FastPix / android-uploads-sdk

isaka-james / chunks-to-file

i5heu / ChunkingChampions

sanbaiw / semtxtsplitter

D-X-W-Clerker / clerker-ai

mahnoorsheikh16 / NLP-Framework-for-Literature-Summarization-in-Law-and-Policy

Pavansomisetty21 / Chunking-Strategies

eliashossain001 / Yo_PowerpointAddInTTS

mudssrali / chunkify

vaibhavdangar09 / CUDAQuest-Semantic-Crawl-to-Answer-Engine

tainmou / SmartChunk
