Top 23 Rust Machine Learning Projects
-
qdrant
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
Qdrant web site: https://qdrant.tech/
-
NautilusTrader Official Website
-
burn
Burn is a next generation tensor library and Deep Learning Framework that doesn't compromise on flexibility, efficiency and portability.
Project mention: Burn a Deep Learning Framework with flexibility, efficiency and portability | news.ycombinator.com | 2025-10-08
-
tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.
Project mention: Supervised Fine Tuning on Curated Data Is Reinforcement Learning | news.ycombinator.com | 2025-07-29
[I'm his coworker.] We ran Unsloth ourselves on a GPU-by-the-hour server. We have a notebook in the repository showing how to query historical data and use it with Unsloth.
It's a WIP PR that we plan to merge soon: https://github.com/tensorzero/tensorzero/pull/2273
-
lance
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch, with more integrations coming.
It looks like Narwhals; "Narwhals and scikit-Lego came together to achieve dataframe-agnosticism" https://news.ycombinator.com/item?id=40950813 :
> Narwhals: https://narwhals-dev.github.io/narwhals/ :
>> Extremely lightweight compatibility layer between [pandas, Polars, cuDF, Modin]
> Lancedb/lance works with [Pandas, DuckDB, Polars, Pyarrow]; https://github.com/lancedb/lance
SymPy has solvers for ODEs and PDEs and convex optimization. SymPy also has lambdify to compile from a relatively slow symbolic expression tree to faster 'vectorized' functions.
From https://news.ycombinator.com/item?id=40683777 re: warp :
> sympy.utilities.lambdify.lambdify() https://github.com/sympy/sympy/blob/main/sympy/utilities/lam... :
>>> """Convert a SymPy expression into a function that allows for fast numeric evaluation""" [with e.g. the CPython math module, mpmath, NumPy, SciPy, CuPy, JAX, TensorFlow, PyTorch (*), SymPy, numexpr, but not yet cmath]
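To make the lambdify point above concrete, here is a minimal sketch, assuming SymPy is installed. The expression and the names `expr` and `f` are illustrative and not taken from the linked threads. It uses `modules="math"` so that nothing beyond SymPy and the standard library is needed; passing `modules="numpy"` instead would, with NumPy installed, give a function that accepts whole arrays.

```python
import math

import sympy as sp

# Build a symbolic expression tree (illustrative example expression).
x = sp.symbols("x")
expr = sp.exp(-x) * sp.sin(x)

# Compile the tree into a plain Python function backed by the math module.
f = sp.lambdify(x, expr, modules="math")

# Numeric evaluation through the compiled function...
fast_value = f(0.5)

# ...and the same value obtained by substituting into the symbolic tree.
symbolic_value = float(expr.subs(x, 0.5))

print(fast_value, symbolic_value)
```

The two values should agree to floating-point precision; the compiled function simply avoids walking the symbolic tree on every call.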
-
Daft
High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale.
Project mention: 650GB of Data (Delta Lake on S3). Polars vs. DuckDB vs. Daft vs. Spark | news.ycombinator.com | 2025-11-13
Hey everyone, I'm a software engineer at Eventual, the team behind Daft! Huge thanks to the OP for the benchmark. We're huge fans of your blog posts, and this gave us some really useful insights. For context, Daft is a high-performance data processing engine for AI workloads that works on both single-node and distributed setups.
We're actively looking into the results of the benchmark and hope to share some of our findings soon. From initial results, we found a lot of potential optimizations we could make to our deltalake reader to improve parallelism and to our groupby operator to improve pipelining for count aggregations. We're hoping to roll out these improvements over the next couple of releases.
If you're interested to learn more about our findings, check out our GitHub (https://github.com/Eventual-Inc/Daft) or follow us on Twitter (https://x.com/daftengine) and LinkedIn (https://www.linkedin.com/showcase/daftengine) for updates. Also if Daft sounds interesting to you, give us a try via pip install daft!
- Project mention: Rust Linfa: The Rising Star of Machine Learning in Systems Programming | dev.to | 2024-12-23
Linfa is a modular approach to machine learning in Rust, offering a collection of statistical learning algorithms and tools. Unlike monolithic frameworks, Linfa follows Rust's philosophy of small, focused crates that can be composed together.
-
rust-bert
Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
Project mention: Building Sentence Transformers in Rust: A Practical Guide with Burn, ONNX Runtime, and Candle | dev.to | 2025-10-30
rust-bert: Complete NLP pipelines (https://github.com/guillaume-be/rust-bert)
-
spiceai
A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
Project mention: Show HN: Spice Cayenne – SQL acceleration built on Vortex | news.ycombinator.com | 2025-12-18
Hi HN, we’re Luke and Phillip, and we’re building Spice.ai OSS - a lightweight, portable data and AI engine powered by Apache DataFusion & Ballista for SQL query, hybrid search, and LLM inference across disaggregated storage, used by enterprises like Barracuda Networks and Twilio.
We first introduced Spice [1] on HN in 2021 and re-launched it on HN [2] in 2024 re-built from the ground up in Rust.
Spice includes the concept of a Data Accelerator [3], which is a way to materialize data from disparate sources, such as other databases, in embedded databases like SQLite and DuckDB.
Today we’re excited to announce a new Ducklake-inspired Data Accelerator built on Vortex [3], a highly performant, extensible columnar data format that claims 100x faster random access, 10-20x faster scans, 5x faster writes with a similar compression ratio vs. Apache Parquet.
In our tests with Spice, Vortex performs faster than DuckDB with a third of the memory usage, and is much more scalable (multi-file). For real-world deployments, we see the DuckDB Data Accelerator often capping out around 1TB, but Spice Cayenne can do Petabyte-scale.
You can read about it at https://spice.ai/blog and in the Spice OSS release notes [4].
This is just the first version, and we’d love to get your feedback!
GitHub: https://github.com/spiceai/spiceai
[1] https://news.ycombinator.com/item?id=28448887
-
hora
🚀 efficient approximate nearest neighbor search algorithm collections library written in Rust 🦀 .
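For readers new to the term, approximate nearest neighbor libraries trade a little accuracy for speed compared with an exact scan over every vector. The sketch below shows only that exact brute-force baseline in plain Python. It is not hora's API, and the function names and toy vectors are hypothetical, chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_top_k(query, vectors, k):
    """Return indices of the k vectors most similar to query (full scan)."""
    scored = sorted(
        ((cosine_similarity(query, v), i) for i, v in enumerate(vectors)),
        reverse=True,
    )
    return [i for _, i in scored[:k]]


# Toy data: four 2-D vectors (hypothetical).
vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
nearest = exact_top_k([1.0, 0.1], vectors, k=2)
print(nearest)
```

The full scan costs time proportional to the number of stored vectors per query, which is the cost that approximate indexes aim to reduce.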
-
- Project mention: Building Sentence Transformers in Rust: A Practical Guide with Burn, ONNX Runtime, and Candle | dev.to | 2025-10-30
ONNX Runtime Rust: https://ort.pyke.io
- Project mention: Ocrs: Rust library and CLI tool for OCR (extracting text from images) | news.ycombinator.com | 2025-09-29
-
extractous
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
I will try it out. Is this the correct library? - https://github.com/yobix-ai/extractous
I have used Gemini for OCR and it was indeed good. I also used GPT 3.5 and liked that too.
-
sail
LakeSail's computation framework with a mission to unify batch processing, stream processing, and compute-intensive AI workloads. (by lakehq)
- Project mention: Signature Detection in Tensorlake: Catch what’s missing, trigger what’s next | dev.to | 2025-05-29
You can try Signature Detection right now in both the Tensorlake Playground and the Python SDK. We also have other use cases in our Tensorlake Docs.
Rust Machine Learning related posts
-
Chapter 1: Introduction to NautilusTrader
-
NautilusTrader Features and Usage Guide (Chinese-language post)
-
Chapter 1: Introduction to NautilusTrader (Chinese-language post)
-
Show HN: Spice Cayenne – SQL acceleration built on Vortex
-
How We Cut LLM Batch Inference Time in Half with Dynamic Prefix Bucketing
-
Building Sentence Transformers in Rust: A Practical Guide with Burn, ONNX Runtime, and Candle
-
Daft vs Ray Data: A Comprehensive Comparison for Multimodal Data Processing
- A note from our sponsor - InfluxDB www.influxdata.com | 22 Dec 2025
Index
What are some of the best open-source Machine Learning projects in Rust? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | qdrant | 27,767 |
| 2 | nautilus_trader | 16,732 |
| 3 | burn | 13,663 |
| 4 | tensorzero | 10,721 |
| 5 | postgresml | 6,651 |
| 6 | lance | 5,838 |
| 7 | leaf | 5,548 |
| 8 | rust | 5,441 |
| 9 | tch-rs | 5,193 |
| 10 | Daft | 5,024 |
| 11 | linfa | 4,466 |
| 12 | rust-bert | 3,010 |
| 13 | spiceai | 2,648 |
| 14 | hora | 2,648 |
| 15 | dfdx | 1,853 |
| 16 | ort | 1,820 |
| 17 | ocrs | 1,706 |
| 18 | extractous | 1,652 |
| 19 | RustQuant | 1,599 |
| 20 | MusicGPT | 1,310 |
| 21 | juice | 1,122 |
| 22 | sail | 1,092 |
| 23 | indexify | 1,081 |