Python is often accused of being “slow.” While it’s true that Python isn’t as fast as C or Rust in raw computation, with the right techniques, you can significantly speed up your Python code—especially if you're dealing with I/O-heavy workloads.
In this post, we’ll dive into:
- When and how to use `threading` in Python.
- How it differs from `multiprocessing`.
- How to identify I/O-bound and CPU-bound workloads.
- Practical examples that can boost your app’s performance.
Let’s thread the needle.
🧠 Understanding I/O-Bound vs CPU-Bound
Before choosing between threading or multiprocessing, you must understand the type of task you're optimizing:
| Type | Description | Example | Best Tool |
|---|---|---|---|
| I/O-bound | Spends most time waiting for external resources | Web scraping, file downloads | `threading`, `asyncio` |
| CPU-bound | Spends most time performing heavy computations | Image processing, ML inference | `multiprocessing` |
💡 Rule of thumb:
If your program is slow because it's waiting, use threads.
If it's slow because it's calculating, use processes.
🧵 Using Threading in Python
Python’s Global Interpreter Lock (GIL) limits true parallelism for CPU-bound threads, but for I/O-bound tasks, `threading` can bring a huge speed boost.
Example: Threading for I/O-bound Tasks
```python
import threading
import requests
import time

urls = [
    'https://example.com',
    'https://httpbin.org/delay/2',
    'https://httpbin.org/get',
]

def fetch(url):
    print(f"Fetching {url}")
    response = requests.get(url)
    print(f"Done: {url} - Status {response.status_code}")

start = time.time()

threads = []
for url in urls:
    t = threading.Thread(target=fetch, args=(url,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(f"Total time: {time.time() - start:.2f} seconds")
```
🕒 Run sequentially, the requests' wait times add up (the `/delay/2` endpoint alone takes ~2 seconds).
With threads, the waits overlap, so the total is roughly the time of the slowest request (~2 seconds): a real speedup.
💡 Threading Caveats
- Threads share memory → race conditions are possible.
- Use `threading.Lock()` to protect shared resources.
- Ideal for I/O, but not effective for CPU-heavy work.
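To make the race-condition point concrete, here is a minimal sketch of protecting a shared counter with `threading.Lock()`. The `increment` function and the counts are illustrative, not from the examples above:

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with lock:  # serialize access to the shared counter
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000; without the lock, updates could be lost
```

The `with lock:` block ensures only one thread mutates `counter` at a time, so the final value is always exactly 400,000.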
🧮 Multiprocessing for CPU-Bound Tasks
For CPU-heavy workloads, the GIL becomes a bottleneck. That’s where the multiprocessing module comes in. It spawns separate processes, each with its own Python interpreter.
Example: CPU-bound Task with Multiprocessing
```python
from multiprocessing import Process, cpu_count
import math
import time

def compute():
    print("Process starting")
    for _ in range(10**6):
        math.sqrt(12345.6789)

if __name__ == "__main__":
    start = time.time()

    processes = []
    for _ in range(cpu_count()):
        p = Process(target=compute)
        processes.append(p)
        p.start()

    for p in processes:
        p.join()

    print(f"Total time: {time.time() - start:.2f} seconds")
```
Here, the work runs in parallel across all available CPU cores, each process with its own interpreter and its own GIL, a massive boost for computationally expensive tasks.
🔍 How to Tell if a Task is CPU-Bound or I/O-Bound
Use profiling tools or observation:
- **Visual inspection:** waiting on API calls or file reads → I/O-bound; math loops or data crunching → CPU-bound.
- **Profiling tools:**

```shell
pip install line_profiler
kernprof -l script.py
python -m line_profiler script.py.lprof
```
Or use `cProfile`:

```shell
python -m cProfile myscript.py
```

Check where the time is spent: in I/O calls or in computation.
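You can also profile programmatically with `cProfile` and `pstats`. The `work` function below is a made-up CPU-bound stand-in; the profiling calls are standard library:

```python
import cProfile
import pstats
import io

def work():
    # a deliberately CPU-bound loop to profile
    total = 0
    for i in range(100_000):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# sort by cumulative time and print the top entries
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
print(stream.getvalue())
```

If most cumulative time sits in your own functions, the task is CPU-bound; if it sits in socket or file operations, it is I/O-bound.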
🧰 Bonus: concurrent.futures for Clean Code
Instead of manually managing threads or processes, use:
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
ThreadPool for I/O:
```python
with ThreadPoolExecutor(max_workers=5) as executor:
    executor.map(fetch, urls)
```
ProcessPool for CPU (note that `executor.map` passes one argument to each call, so the worker must accept it, unlike the no-argument `compute` above):

```python
def compute(_):
    for _ in range(10**6):
        math.sqrt(12345.6789)

with ProcessPoolExecutor() as executor:
    list(executor.map(compute, range(cpu_count())))
```
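When you need results as they finish rather than in submission order, `concurrent.futures` also offers `submit` plus `as_completed`. A minimal sketch, where `fetch_length` is a hypothetical stand-in for a real network call:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_length(url):
    # hypothetical placeholder; real code would use requests.get(url)
    return len(url)

urls = ["https://example.com", "https://httpbin.org/get"]
results = {}

with ThreadPoolExecutor(max_workers=2) as executor:
    # map each future back to its URL so we know which task finished
    futures = {executor.submit(fetch_length, url): url for url in urls}
    for future in as_completed(futures):
        url = futures[future]
        results[url] = future.result()
        print(f"{url}: {results[url]}")
```

This pattern also lets you handle per-task exceptions: `future.result()` re-raises anything the worker raised.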
✅ Final Thoughts
Python isn’t inherently slow—it just needs the right tools.
| Task Type | Use This |
|---|---|
| I/O-bound | `threading`, `asyncio`, `ThreadPoolExecutor` |
| CPU-bound | `multiprocessing`, `ProcessPoolExecutor` |
Start small, profile your code, and choose the right parallelization strategy. Your app—and your users—will thank you.