NumPy
Scrapy
| | NumPy | Scrapy |
|---|---|---|
| Mentions | 310 | 193 |
| Stars | 31,038 | 59,259 |
| Growth | 1.0% | 0.8% |
| Activity | 10.0 | 9.5 |
| Last commit | 4 days ago | 7 days ago |
| Language | Python | Python |
| License | GNU General Public License v3.0 or later | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
NumPy
- Python is not a great language for data science. Part 1: The experience
- Choosing Tech Stack in 2025: A Practical Guide
Unmatched integration with ML/AI ecosystems through NumPy, TensorFlow, and PyTorch
- What Dynamic Typing Is For
- Bringing NumPy's type-completeness score to nearly 90% – Pyrefly
> Let’s take a pause here for a second - the ‘CanIndex’ and ‘SupportsIndex’ from the looks are just “int”.
The PR for the change is https://github.com/numpy/numpy/pull/28913. The list of changed files[0] shows the change was made in 'numpy/__init__.pyi'. Looking at the whole file[1] shows SupportsIndex is being imported from the standard library's typing module[2].
Where are you seeing SupportsIndex being defined as an int?
> I have a hard time dealing with these custom types because they are so obscure.
SupportsIndex is obscure, I agree, but it's not a custom type. It's defined in stdlib's typing module[2], and was added in Python 3.8.
[0]: https://github.com/numpy/numpy/pull/28913/files
[1]: https://github.com/charris/numpy/blob/c906f847f8ebfe0adec896...
[2]: https://docs.python.org/3/library/typing.html#typing.Support...
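To make the distinction in the comment above concrete, here is a minimal sketch of what `typing.SupportsIndex` accepts. The `Offset` class and the `pick` function are hypothetical names used only for illustration; they are not part of NumPy or the linked PR.

```python
from typing import SupportsIndex


class Offset:
    """Hypothetical class: usable as an index without being an int."""

    def __init__(self, value: int) -> None:
        self.value = value

    def __index__(self) -> int:
        # Implementing __index__ is what satisfies the SupportsIndex protocol.
        return self.value


def pick(items: list[str], i: SupportsIndex) -> str:
    # Built-in lists accept any object that implements __index__.
    return items[i]


letters = ["a", "b", "c"]
by_int = pick(letters, 2)             # a plain int satisfies the protocol
by_offset = pick(letters, Offset(1))  # so does a non-int with __index__

is_int = isinstance(Offset(1), int)
is_indexable = isinstance(Offset(1), SupportsIndex)
```

The annotation is therefore broader than `int`: it describes anything that can be losslessly converted to an integer index, which plain `int` values also happen to satisfy.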
- Don’t Let Cyber Risk Kill Your GenAI Vibe: A Developer’s Guide
Know (or check) the tells of older versions, such as the OpenAI Python SDK changing from a client with global state in v0.x.x to a declared instance in v1.x.x, or NumPy's change in how random generators are declared.
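As a rough illustration of the NumPy tell, the sketch below contrasts the two styles. It assumes the excerpt refers to the move from the global `np.random.seed` interface to explicit `Generator` objects created with `np.random.default_rng`; the seed value and variable names are arbitrary.

```python
import numpy as np

# Legacy style: seeding mutates hidden global state shared by the whole process.
np.random.seed(42)
legacy_sample = np.random.rand(3)

# Current style: an explicit Generator instance carries its own state.
rng = np.random.default_rng(42)
new_sample = rng.random(3)

# Two generators built from the same seed produce the same stream.
same_stream = np.array_equal(np.random.default_rng(42).random(3), new_sample)
```

Generated code that calls `np.random.seed` or `np.random.rand` directly is a hint that it follows the older convention.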
- Top 5 GitHub Repositories for Data Science in 2026
The book introduces the core libraries essential for working with data in Python: particularly IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and related packages. Familiarity with Python as a language is assumed; if you need a quick introduction to the language itself, see the free companion project, A…
- Your 2025 Roadmap to Becoming an AI Engineer for Free for Vue.js Developers
AI starts with math and coding. You don’t need a PhD—just high school math like algebra and some geometry. Linear algebra (think matrices) and calculus (like slopes) help understand how AI models work. Python is the main language for AI, thanks to tools like TensorFlow and NumPy. If you know JavaScript from Vue.js, Python’s syntax is straightforward.
- Top 17 Tools for Scientific Simulation & Modeling
- Release v2.3.0 (June 7, 2025) · NumPy/NumPy
- How to Get Started with Scikit-Learn: A Beginner-Friendly Guide to Machine Learning in Python
As is the case with most Python libraries, it is open-source and free-to-use, making it easily accessible by anyone willing to learn machine learning, and it is built upon other open-source libraries within Python, like SciPy for advanced scientific operations, NumPy for efficient numerical computations, Matplotlib for data visualization, and Cython for increased efficiency and speed, similar to that of C/C++.
Scrapy
- Scrapy Middlewares: A Practical Guide for Beginners (With Real-World Examples)
User-Agent: Scrapy/2.11.0 (+https://scrapy.org)
- Progress Updates on Contribution to Scrapy
Last week, I worked on code refactoring in Scrapy, which is an essential practice in larger and more complex projects. Refactoring not only improves code maintainability but also makes it easier for other contributors to understand and extend the project. This task was a good starting point for me to verify that I had the Scrapy project correctly set up locally, as refactoring should not break existing functionality.
- Contributing to Larger Open Source Project - Scrapy
In the past three months, I worked on various open source projects, including my own project Repo Context Packager, Math Worksheet Generator and Open Web Calendar. This month, I want to challenge myself to work on a larger and more widely used project - Scrapy, a Python module for web crawling.
- How I Block All 26M of Your Curl Requests
From what I have seen, it is hard to tell what "serious scrapers" use. They use many things: some use this, some do not. This is what I have learned from reading webscraping on Reddit. Nobody says these things out loud.
There are many tools; see the links below.
Personally, I think running Selenium can be a bottleneck. It does not play nice: processes sometimes break, the system sometimes needs a restart because something is blocked, it can be a memory hog, and so on. That is my experience.
To be able to scale, I think you have to have your own implementation. Serious scrapers dismiss people who use Selenium or its derivatives as noobs who will come back asking why page X does not work with their scraping setup.
https://github.com/lexiforest/curl_cffi
https://github.com/encode/httpx
https://github.com/scrapy/scrapy
https://github.com/apify/crawlee
- Scrapy needs to have sane defaults that do no harm
- Top 10 Tools for Efficient Web Scraping in 2025
Scrapy is a robust and scalable open-source web crawling framework. It is highly efficient for large-scale projects and supports asynchronous scraping.
- 11 best open-source web crawlers and scrapers in 2024
Language: Python | GitHub: 52.9k stars
- Current problems and mistakes of web scraping in Python and tricks to solve them!
One might ask, what about Scrapy? I'll be honest: I don't really keep up with their updates. But I haven't heard about Zyte doing anything to bypass TLS fingerprinting. So out of the box Scrapy will also be blocked, but nothing is stopping you from using curl_cffi in your Scrapy Spider.
- Scrapy, a fast high-level web crawling and scraping framework for Python
- Automate Spider Creation in Scrapy with Jinja2 and JSON
Install Scrapy (official website) using either pip or conda (follow the linked instructions for details).
What are some alternatives?
mitmproxy - An interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.
requests-html - Pythonic HTML Parsing for Humans™
SymPy - A computer algebra system written in pure Python
pyspider - A Powerful Spider(Web Crawler) System in Python.
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
portia - Visual scraping for Scrapy