byte-pair-encoding
Here are 59 public repositories matching this topic...
High performance unsupervised text tokenization for Ruby
- Updated Dec 27, 2023 - Ruby
A ridiculously fast Python BPE (Byte Pair Encoder) implementation written in Rust
- Updated Mar 19, 2025 - Python
Byte-Pair Encoding (BPE), a subword-based tokenization algorithm, implemented from scratch in Python
- Updated Jan 30, 2023 - Python
Byte-level byte pair encoding (BPE) in Haskell
- Updated May 27, 2024 - Haskell
R package for Byte Pair Encoding based on YouTokenToMe
- Updated Sep 5, 2025 - C++
Feature extraction from sequential data
- Updated Jul 4, 2019 - C++
A tool that encrypts a sequence of words (or pieces of text) using the AES-256 algorithm and encodes the encrypted result into a PNG image by mapping each byte value to a specific color. It also decodes such an image to recover the original sequence of words.
- Updated Sep 23, 2023 - Go
Code repo for the paper "AutoGO: Automated Computation Graph Optimization for Neural Network Evolution", accepted to NeurIPS 2023.
- Updated Jun 7, 2024 - Python
Byte-Pair Encoding tokenizer for training large language models on huge datasets
- Updated Jun 4, 2024 - Python
- Updated Jan 30, 2025 - Python
News headline generation
- Updated Nov 21, 2022 - Python
Byte Pair Encoding (BPE)
- Updated Feb 25, 2019 - Python
A lightweight, from-scratch implementation of Byte Pair Encoding (BPE) tokenization in Python.
- Updated Jul 8, 2025 - Python
A visualizer showing how the BPE tokenizer in an LLM works
- Updated Feb 6, 2025 - JavaScript
Auto summarization from BPE tokenization
- Updated Aug 20, 2020 - Jupyter Notebook
Code for the publication of WWW'22
- Updated May 31, 2022 - Python
Modern Eager TensorFlow implementation of Attention Is All You Need
- Updated Oct 7, 2024 - Python
Lightweight, header-only Byte Pair Encoding (BPE) trainer in modern C++17. Produces HuggingFace-compatible vocabularies for transformers and integrates with Modern Text Tokenizer.
- Updated Aug 8, 2025 - C++