Top 23 Python Arxiv Projects
-
arxiv-latex-cleaner
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
Project mention: LaTeXpOsEd: A Systematic Analysis of Information Leakage in Preprint Archives | news.ycombinator.com | 2025-10-13
I agree with other comments that this research treads a fine, unethical line. Did the authors responsibly disclose this, as is often done in the security research community? I cannot find any mention of it in the paper. The researchers seem to be involved in security-related research (first author is doing a PhD, last author holds a PhD).
At least arxiv could have run the cleaner [1] before the print of this pre-print (lol). If there was no disclosure, then I think this pre-print becomes unethical to put up.
> leading to the identification of nearly 1,200 images containing sensitive metadata. The types of data represented vary significantly. While device information (e.g., the camera used) or software details (such as the exact version of Photoshop) may already raise concerns, in over 600 cases the metadata contained GPS coordinates, potentially revealing the precise location where a photo was taken. In some instances, this could expose a researcher’s home address (when tied to a profile picture) or the location of research facilities (when images capture experimental equipment)
Oof, that's not too great.
[1] https://github.com/google-research/arxiv-latex-cleaner
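One cleanup step relevant to the leakage discussed above is removing `%` comments from the LaTeX source before upload. The following is a minimal sketch of that single step, not arxiv-latex-cleaner's actual code: the function name `strip_latex_comments` and the sample text are hypothetical, and the sketch ignores verbatim environments, the `\\%` edge case, and the fact that a trailing `%` also suppresses the line-end space in real LaTeX.

```python
import re

# Match an unescaped '%' and everything after it on the same line.
_COMMENT = re.compile(r"(?<!\\)%.*")


def strip_latex_comments(source: str) -> str:
    """Remove comments line by line; drop lines that held only a comment."""
    cleaned = []
    for line in source.splitlines():
        stripped = _COMMENT.sub("", line)
        # A non-empty line that becomes empty was a pure comment line: skip it.
        if stripped.strip() == "" and line.strip() != "":
            continue
        cleaned.append(stripped.rstrip())
    return "\n".join(cleaned)


# Hypothetical sample: an escaped percent sign, an inline comment, a comment line.
sample = "Accuracy is 95\\% % TODO: reviewer 2 hates this\n% internal note\nDone."
print(strip_latex_comments(sample))
```

The escaped `\%` survives because the lookbehind rejects a percent sign preceded by a backslash; image metadata such as GPS tags is a separate problem that this sketch does not touch.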
-
arxiv-vanity
Renders papers from arXiv as responsive web pages so you don't have to squint at a PDF.
-
arxiv-sanity-lite
arxiv-sanity lite: tag arXiv papers of interest and get recommendations of similar papers in a nice UI, using SVMs over tf-idf feature vectors based on paper abstracts.
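To make the idea of recommending papers from tf-idf vectors of abstracts concrete, here is a minimal standard-library sketch. It is not the project's code: it ranks by cosine similarity to a single tagged abstract as a simpler stand-in for the SVM scoring named in the description, and the function names and sample abstracts are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed idf so that no weight is zero or undefined.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def recommend(abstracts, tagged_index, top_k=3):
    """Rank the other abstracts by similarity to the tagged one."""
    vecs = tfidf_vectors(abstracts)
    scores = [
        (cosine(vecs[tagged_index], vec), i)
        for i, vec in enumerate(vecs)
        if i != tagged_index
    ]
    # Highest score first; ties fall back to the original order.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scores[:top_k]]


# Hypothetical abstracts; only index 1 shares vocabulary with index 0.
abstracts = [
    "diffusion models generate images",
    "score based diffusion models",
    "graph networks predict molecules",
    "robots learn locomotion policies",
]
print(recommend(abstracts, tagged_index=0, top_k=1))
```

An SVM trained on several tagged papers would learn per-term weights instead of comparing against one vector, which matters once a tag covers more than one paper.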
-
paper2remarkable
Fetch an academic paper or web article and send it to the reMarkable tablet with a single command
-
summarizepaper
An AI-powered arXiv paper summarization website with a virtual assistant for answering questions.
-
bibcure
Bibcure takes care of boring tasks by keeping your bib file up to date and normalized. It also lets you easily download all papers listed in your BibTeX file.
-
searchthearxiv
The code powering searchthearxiv.com, a simple semantic search engine for more than 300,000 ML papers on arXiv.
Project mention: Semantic search engine for ArXiv, biorxiv and medrxiv | news.ycombinator.com | 2025-05-20
-
pdf2doi
A Python library and command-line tool to extract the DOI or other identifiers of a scientific paper from a PDF file.
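As an illustration of identifier extraction, the sketch below searches text that is assumed to have already been extracted from a PDF for DOI-like and arXiv-like strings. It does not use or reproduce pdf2doi's API; the function name `find_identifiers`, the regular expressions, and the sample text (including the made-up DOI) are assumptions for illustration only.

```python
import re

# Common DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")
# New-style arXiv identifier: YYMM.NNNNN with an optional version suffix.
ARXIV_RE = re.compile(r"arXiv:(\d{4}\.\d{4,5})(v\d+)?", re.IGNORECASE)


def find_identifiers(text):
    """Return DOI and arXiv identifier candidates found in plain text."""
    # Trailing punctuation usually belongs to the sentence, not the DOI.
    dois = [m.group(0).rstrip(".,;)") for m in DOI_RE.finditer(text)]
    arxiv_ids = [m.group(1) + (m.group(2) or "") for m in ARXIV_RE.finditer(text)]
    return {"doi": dois, "arxiv": arxiv_ids}


# Hypothetical first-page text.
page = "See https://doi.org/10.1000/xyz123. Preprint: arXiv:2101.00001v2 [cs.LG]"
print(find_identifiers(page))
```

Regular expressions only yield candidates; a real tool would typically validate each one against a resolver before trusting it.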
-
Auto-Research
Generate a custom, detailed survey paper with topic-clustered sections and proper citations from a single query in just under 30 minutes.
-
ocr-benchmark
If you're wondering how they prompt the models:
"Perform OCR on this image. Return only the text found in the image as a single continuous string without any newlines, additional text, or commentary. Separate words with single spaces. For any truncated, partially visible, or occluded text, include only the visible portions without attempting to complete or guess the full text. If no text is present, return empty double quotes."
Found in: https://github.com/video-db/ocr-benchmark/blob/main/prompts....
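The prompt fixes an output contract: one continuous string, single spaces between words, and empty double quotes when no text is present. A scorer could enforce that contract before comparing against ground truth. The helper below is a hypothetical sketch of such a step and is not taken from the ocr-benchmark repository.

```python
def normalize_ocr_output(raw: str) -> str:
    """Collapse whitespace and map the 'no text' marker to an empty string."""
    # split() with no arguments drops newlines, tabs, and repeated spaces.
    text = " ".join(raw.split())
    # The prompt asks for empty double quotes when the image has no text.
    if text == '""':
        return ""
    return text


print(repr(normalize_ocr_output("  EXIT\n ONLY ")))
print(repr(normalize_ocr_output('""')))
```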
-
Muzero-unplugged
PyTorch implementation of MuZero Unplugged for Gym environments. The implementation supports a wide range of action and observation spaces, both discrete and continuous.
-
ailert
An open-source platform that aggregates AI content from 230+ sources including research papers, GitHub trends, and industry news, making AI knowledge accessible to everyone.
Project mention: Building an Open-Source AI Newsletter Engine: The Story of AiLert | dev.to | 2025-01-12
Code: https://github.com/anuj0456/ailert Docs: https://github.com/anuj0456/ailert/blob/main/README.md
-
Paper-Recommendation-System
Web interface to search ArXiv papers using NLP Sentence-Transformers, Faiss and Streamlit
-
Muzero
PyTorch implementation of MuZero for Gym environments. It supports any Discrete, Box, and Box2D configuration for the action and observation spaces. (by DHDev0)
-
arxiv-to-prompt
Transform arXiv papers into a single LaTeX source that can be used as a prompt for asking LLMs questions about the paper.
Project mention: Show HN: iOS app (and CLI) for turning ArXiv papers into LLM-ready LaTeX prompts | news.ycombinator.com | 2025-08-16
https://github.com/takashiishida/arxiv-to-prompt
I’ve been using the CLI tool daily to help me quickly understand arXiv papers. This downloads the arXiv source files, finds the main `\documentclass` file, and flattens everything into one coherent LaTeX source (people usually have multiple .tex files in a single paper by using `\input` and `\include`). It also has options to remove comments and appendices to shorten prompts.
I often used the CLI tool on my laptop, but since I commute by train in Tokyo I wanted something I could use on my phone. That's why I built the iOS app.
With equation-heavy papers, it may be better to provide the precise latex notation instead of providing the PDF. Uploading PDFs is also more difficult and time consuming (especially on the phone).
Thanks for reading! This is my first iOS app, and I’d appreciate your thoughts!
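The flattening step described in the post (find the file containing `\documentclass`, then inline every `\input` and `\include`) can be sketched as follows. This is an illustrative sketch, not arxiv-to-prompt's implementation: the function names are hypothetical, the source files are modeled as an in-memory dict instead of a downloaded archive, and `\include` is treated exactly like `\input` (real LaTeX also starts a new page).

```python
import re

# Matches \input{file} and \include{file}; group 1 is the file name.
# \includegraphics is not matched because "{" must follow "include" directly.
_INCLUDE = re.compile(r"\\(?:input|include)\{([^}]+)\}")


def find_main(files):
    """Return the name of the first file that declares a document class."""
    for name, text in files.items():
        if "\\documentclass" in text:
            return name
    raise ValueError("no file with a document class found")


def flatten(name, files, _stack=()):
    """Recursively inline included files, starting from files[name]."""
    if name in _stack:
        raise ValueError(f"circular include: {name}")

    def expand(match):
        target = match.group(1).strip()
        # LaTeX lets authors omit the .tex extension.
        if not target.endswith(".tex"):
            target += ".tex"
        # Leave the command untouched if the file is missing.
        if target not in files:
            return match.group(0)
        return flatten(target, files, _stack + (name,))

    return _INCLUDE.sub(expand, files[name])


# Hypothetical paper source with two included files.
files = {
    "main.tex": "\\documentclass{article}\n\\begin{document}\n"
    "\\input{intro}\n\\include{sections/method.tex}\n\\end{document}",
    "intro.tex": "Intro text.",
    "sections/method.tex": "Method text.",
}
flat = flatten(find_main(files), files)
print(flat)
```

Keeping the files in a dict makes the recursion easy to follow; reading from disk would only change how `files` is populated.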
-
arxiv-mcp-server
MCP server for arXiv.org - Search, analyze, and export academic papers with AI assistants. Features advanced paper discovery, citation analysis, trend tracking, and multi-format exports.
Python Arxiv related posts
- LaTeXpOsEd: A Systematic Analysis of Information Leakage in Preprint Archives
- ArXiv LaTeX Cleaner: Clean the LaTeX code of your paper to submit to ArXiv
- My Struggle with Doom Scrolling
- Hardware Acceleration of LLMs: A comprehensive survey and comparison
- Show HN: FileKitty – Combine and label text files for LLM prompt contexts
- Show HN: Command Line Data Aggregation Tool for LLM Ingestion
- Show HN: Talk to any ArXiv paper just by changing the URL
Index
What are some of the best open-source Arxiv projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | ChatPaper | 19,052 |
| 2 | arxiv-latex-cleaner | 6,619 |
| 3 | arxiv-vanity | 1,631 |
| 4 | arxiv-sanity-lite | 1,426 |
| 5 | arxiv.py | 1,414 |
| 6 | resp | 459 |
| 7 | ArxivDigest | 384 |
| 8 | paper2remarkable | 371 |
| 9 | summarizepaper | 303 |
| 10 | findpapers | 293 |
| 11 | bibcure | 204 |
| 12 | searchthearxiv | 164 |
| 13 | pdf2doi | 130 |
| 14 | cobib | 65 |
| 15 | Auto-Research | 58 |
| 16 | ocr-benchmark | 45 |
| 17 | Muzero-unplugged | 34 |
| 18 | ailert | 28 |
| 19 | Paper-Recommendation-System | 22 |
| 20 | Muzero | 18 |
| 21 | arxiv-to-prompt | 16 |
| 22 | arxiv-mcp-server | 12 |
| 23 | neozot-py | 8 |