Before moving into IPC, see the earlier article on multiprocessing — how it differs from multithreading, what pools are, how processes communicate, etc.
Inter-Process Communication (IPC)
Inter-Process Communication (IPC) is the mechanism that allows independent processes to exchange data and coordinate their actions, since each process has its own separate memory space. In Python's multiprocessing, IPC is performed using tools such as `Queue`, `Pipe`, `Manager`, `Value`, `Array`, and `SharedMemory`.
multiprocessing.Queue
In multiprocessing, `Queue` is a safe way for processes to exchange data. Internally, it uses pipes and locks to make sure multiple processes can `put()` and `get()` items without conflicts.
It works almost like `queue.Queue` in threading, but it is designed for processes. Each process has separate memory, so when something is `put()` in a `multiprocessing.Queue`, the data is pickled (serialised), sent through a pipe, and then unpickled on the receiving side.
- One process can `put()` items, another can `get()` them.
- Safe (internally uses locks).
- Best suited for producer-consumer problems.
Example:
```python
from multiprocessing import Process, Queue
import time

def producer(q):
    for i in range(5):
        print(f"Producing {i}")
        q.put(i)
        time.sleep(0.5)
    q.put(None)  # Sentinel: tells consumer "we're done"

def consumer(q):
    while True:
        item = q.get()    # blocks until something is available
        if item is None:  # sentinel received
            break
        print(f"Consumed {item}")

if __name__ == "__main__":
    q = Queue()
    p1 = Process(target=producer, args=(q,))
    p2 = Process(target=consumer, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
```
Output:
```
Producing 0
Consumed 0
Producing 1
Consumed 1
Producing 2
Consumed 2
Producing 3
Consumed 3
Producing 4
Consumed 4
```
Important functions
- `q.put(item)`: Adds an item into the queue.
- `q.get()`: Removes and returns an item (blocks if empty).
- `q.get_nowait()`: Non-blocking version (raises `queue.Empty` if empty; see the sketch after this list).
- `q.qsize()`: Number of items (may be approximate).
- `q.empty()`: Returns `True` if empty (not 100% reliable).
- `q.full()`: Returns `True` if the queue is full.
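A minimal sketch of the non-blocking variant (nothing beyond the standard library is assumed; the `queue` module is imported only for its `Empty` exception, which `multiprocessing.Queue` reuses):

```python
from multiprocessing import Queue
import queue  # multiprocessing reuses queue.Empty for its exceptions

if __name__ == "__main__":
    q = Queue()
    q.put("task")
    print(q.get(timeout=1))  # blocks up to 1 second, then raises queue.Empty
    try:
        q.get_nowait()       # nothing left, raises immediately
    except queue.Empty:
        print("queue is empty")
```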
multiprocessing.Pipe
A Pipe is the most basic form of inter-process communication (IPC). Think of it as a two-way telephone line connecting two processes. Calling `Pipe()` returns two connection objects (`conn1`, `conn2`).
- Whatever one end sends (`send()`), the other can receive (`recv()`).
- Unlike `Queue`, which is many-to-many, a `Pipe` is point-to-point (between exactly two processes).
Example:
```python
from multiprocessing import Process, Pipe

def worker(conn):
    conn.send("Message from worker")
    msg = conn.recv()
    print("Worker got:", msg)
    conn.close()

if __name__ == "__main__":
    parent_conn, child_conn = Pipe()
    p = Process(target=worker, args=(child_conn,))
    p.start()
    print("Parent got:", parent_conn.recv())
    parent_conn.send("Ack from parent")
    p.join()
```
Output:
```
Parent got: Message from worker
Worker got: Ack from parent
```
Important functions
- `conn.send(obj)`: Send an object through the pipe.
- `conn.recv()`: Receive the next object (blocks if none).
- `conn.poll([timeout])`: Returns `True` if data is waiting (optional timeout; illustrated in the sketch after this list).
- `conn.close()`: Close the connection end.
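A quick illustration of `poll()` as a way to avoid blocking on `recv()` (a minimal sketch; the 0.2 s polling interval and the worker's delay are arbitrary choices):

```python
from multiprocessing import Process, Pipe
import time

def worker(conn):
    time.sleep(1)  # simulate slow work
    conn.send("done")
    conn.close()

if __name__ == "__main__":
    parent_conn, child_conn = Pipe()
    p = Process(target=worker, args=(child_conn,))
    p.start()
    while not parent_conn.poll(0.2):  # check every 200 ms instead of blocking
        print("still waiting...")
    print(parent_conn.recv())
    p.join()
```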
multiprocessing.Manager
Manager allows processes to share Python objects (list, dict, Namespace, etc.) safely. It is slower than `Queue`/`Pipe` because it uses proxies and pickling.
Example:
```python
from multiprocessing import Process, Manager

def worker(shared_list, shared_dict):
    shared_list.append("hello")
    shared_dict["count"] = shared_dict.get("count", 0) + 1  # adds 1 to whatever value is fetched

if __name__ == "__main__":
    with Manager() as manager:
        shared_list = manager.list()
        shared_dict = manager.dict()
        processes = [Process(target=worker, args=(shared_list, shared_dict)) for _ in range(3)]
        for p in processes:
            p.start()
        for p in processes:
            p.join()
        print("Final list:", list(shared_list))
        print("Final dict:", dict(shared_dict))
```
Output:
```
Final list: ['hello', 'hello', 'hello']
Final dict: {'count': 2}
```
Notice that `count` ended up as 2, not 3: the `get` and the assignment are two separate proxy calls, so two processes can read the same old value and overwrite each other's update. The read-modify-write is not atomic; guard it with a lock if an exact count is needed.
Important functions
- `Manager()`: Start a manager object.
- `manager.list([iterable])`: Returns a list proxy (shared list across processes).
- `manager.dict([mapping])`: Returns a dict proxy (shared dict across processes).
- `manager.Value(typecode, value)`: Shared single value (like `multiprocessing.Value`, but managed).
- `manager.Array(typecode, sequence)`: Shared array (like `multiprocessing.Array`, but managed).
- `manager.Queue()`: Shared `Queue` across processes (proxy-based).
- `manager.Namespace()`: Creates an object that can be used to set/get arbitrary attributes (like a small shared object; see the sketch after this list).
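A small sketch of `manager.Namespace()` (the attribute names `status` and `result` are just illustrative):

```python
from multiprocessing import Process, Manager

def worker(ns):
    ns.status = "finished"  # set attributes on the shared namespace
    ns.result = 42

if __name__ == "__main__":
    with Manager() as manager:
        ns = manager.Namespace()
        ns.status = "running"
        p = Process(target=worker, args=(ns,))
        p.start(); p.join()
        print(ns.status, ns.result)  # finished 42
```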
Proxy object methods
Since proxies wrap normal Python objects, they support almost the same methods as the underlying type:
- For `manager.list`: `.append(x)`, `.extend(iterable)`, `.pop()`, `.remove(x)`, etc.
- For `manager.dict`: `.get(key)`, `.keys()`, `.values()`, `.update(mapping)`, `.pop(key)`, etc.
- For `manager.Queue`: `.put(item)`, `.get()`, `.empty()`, `.full()`.
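One caveat worth knowing: proxies only see operations performed on the proxy itself, so mutating a plain mutable value stored *inside* a proxy does not propagate. A minimal sketch of the gotcha and the usual fix:

```python
from multiprocessing import Manager

if __name__ == "__main__":
    with Manager() as manager:
        d = manager.dict()
        d["items"] = []        # a plain list stored inside the proxy
        d["items"].append(1)   # mutates a local copy, not the shared one
        print(d["items"])      # [] -- the append was lost

        # Reassign the whole value so the change goes through the proxy
        items = d["items"]
        items.append(1)
        d["items"] = items
        print(d["items"])      # [1]
```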
multiprocessing.Value and multiprocessing.Array
`multiprocessing.Value` creates a single scalar variable (like an `int`, `double`, `char`, etc.) in shared memory that can be safely accessed and modified by multiple processes.
- Supports simple C data types (`'i'` = int, `'d'` = double, etc.).
- Exposes `.value` to get or set the stored value.
- Provides `.get_lock()` for explicit synchronisation.
`multiprocessing.Array` creates a fixed-size array of elements (like a list of `int` or `double`) in shared memory that can be accessed and modified by multiple processes.
- All elements must be of the same C type (specified by a typecode).
- Behaves like a Python list (supports indexing and iteration).
- Changes are immediately visible to all processes.
Note: Both `multiprocessing.Value` and `multiprocessing.Array` are useful when there is a need for true shared state in memory (avoiding pickling and data copying between processes). Both are faster than `Manager`, but limited (only basic types).
Example:
```python
from multiprocessing import Process, Value, Array

def worker(num, arr):
    num.value += 1
    for i in range(len(arr)):
        arr[i] *= -1

if __name__ == "__main__":
    num = Value('i', 0)
    arr = Array('i', [1, 2, 3])
    p = Process(target=worker, args=(num, arr))
    p.start(); p.join()
    print("num:", num.value)  # 1
    print("arr:", arr[:])     # [-1, -2, -3]
```
Output:
```
num: 1
arr: [-1, -2, -3]
```
Important functions
multiprocessing.Value
- `Value(typecode, initial_value, lock=True)`: Create a shared object.
- `val.value`: Get or set the stored value.
- `val.get_lock()`: Get the lock used for synchronisation (see the locking sketch after this list).
- `val.acquire()`: Manually acquire the lock.
- `val.release()`: Manually release the lock.
- `val.get_obj()`: Get the underlying raw `ctypes` object.
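Because `counter.value += 1` is a read-modify-write, it is not atomic across processes. A minimal sketch of the usual `get_lock()` pattern (the counts are arbitrary):

```python
from multiprocessing import Process, Value

def increment(counter):
    for _ in range(1000):
        with counter.get_lock():  # += is read-modify-write, so lock it
            counter.value += 1

if __name__ == "__main__":
    counter = Value('i', 0)
    ps = [Process(target=increment, args=(counter,)) for _ in range(4)]
    for p in ps:
        p.start()
    for p in ps:
        p.join()
    print(counter.value)  # 4000 every time, thanks to the lock
```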
multiprocessing.Array
- `Array(typecode, sequence_or_size, lock=True)`: Create a shared array.
- `arr[i]`: Access or update element by index.
- `arr[:]`: Access or update the whole array (slice).
- `len(arr)`: Get length of array.
- `arr.get_lock()`: Get the lock used for synchronisation.
- `arr.acquire()`: Manually acquire the lock.
- `arr.release()`: Manually release the lock.
- `arr.get_obj()`: Get the underlying raw `ctypes` array (used in the NumPy sketch below).
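If NumPy is available, the shared buffer behind an `Array` can be viewed without copying via `get_obj()` (a sketch, assuming NumPy is installed; `np.frombuffer` defaults to `float64`, which matches typecode `'d'`):

```python
from multiprocessing import Array
import numpy as np

if __name__ == "__main__":
    arr = Array('d', [1.0, 2.0, 3.0])
    # View the same shared buffer as a NumPy array (no copy is made)
    np_view = np.frombuffer(arr.get_obj())
    np_view *= 10                  # in-place edit through the view
    print(arr[:])                  # [10.0, 20.0, 30.0]
```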
multiprocessing.shared_memory (Python 3.8+)
`multiprocessing.shared_memory` is a low-level shared memory block that lives outside any single process. It provides direct access to a block of memory that multiple processes can use without pickling, copying, or proxies. It is much faster than `Manager` because it avoids serialisation (pickling). It also works well with NumPy arrays, so large data can be shared directly across processes. This makes it quite useful for data science, large data set processing, etc.
Its key classes are:
- `SharedMemory`: Represents a shared block of memory.
- `ShareableList`: A Python list-like object backed by shared memory.
Example (Raw `SharedMemory`):
```python
from multiprocessing import shared_memory, Process
import numpy as np

def worker(name, shape):
    # Attach to existing shared memory
    shm = shared_memory.SharedMemory(name=name)
    arr = np.ndarray(shape, dtype=np.int64, buffer=shm.buf)
    arr *= 2  # double all values
    shm.close()

if __name__ == "__main__":
    # Create a numpy array in shared memory
    shm = shared_memory.SharedMemory(create=True, size=5 * 8)  # 5 int64 = 40 bytes
    arr = np.ndarray((5,), dtype=np.int64, buffer=shm.buf)
    arr[:] = [1, 2, 3, 4, 5]
    print("Before:", arr)

    p = Process(target=worker, args=(shm.name, arr.shape))
    p.start(); p.join()

    print("After:", arr)
    shm.close()
    shm.unlink()
```
Output:
```
Before: [1 2 3 4 5]
After: [ 2  4  6  8 10]
```
Example (`ShareableList`):
```python
from multiprocessing import Process
from multiprocessing.shared_memory import ShareableList  # lives in the shared_memory submodule

def worker(name):
    lst = ShareableList(name=name)
    lst[0] += 100
    lst.shm.close()

if __name__ == "__main__":
    sl = ShareableList([10, 20, 30])
    print("Before:", list(sl))

    p = Process(target=worker, args=(sl.shm.name,))
    p.start(); p.join()

    print("After:", list(sl))
    sl.shm.close()
    sl.shm.unlink()
```
Output:
```
Before: [10, 20, 30]
After: [110, 20, 30]
```
Important functions
SharedMemory
- `SharedMemory(create=True, size=N)`: Create a new shared block of size N bytes.
- `SharedMemory(name, create=False)`: Attach to an existing shared block by name.
- `.buf`: Memoryview of the block (like a `bytearray`); can slice, assign, etc. (see the sketch after this list).
- `.name`: The unique name of this shared memory block.
- `.close()`: Detach from the block in this process.
- `.unlink()`: Free the memory (after all processes are done).
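A tiny sketch of working with `.buf` directly as raw bytes, with no NumPy involved (the 16-byte size and the `b"hello"` payload are arbitrary):

```python
from multiprocessing import shared_memory

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=16)
    shm.buf[:5] = b"hello"        # write raw bytes through the memoryview
    print(bytes(shm.buf[:5]))     # b'hello'

    # A second handle attaches to the same block by name
    other = shared_memory.SharedMemory(name=shm.name)
    print(bytes(other.buf[:5]))   # b'hello' -- same underlying memory
    other.close()

    shm.close()
    shm.unlink()                  # free the block once everyone is done
```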
ShareableList
- `ShareableList(iterable)`: Create a new list backed by shared memory.
- `.shm`: Underlying shared memory object.
- `.shm.close()`: Detach from memory.
- `.shm.unlink()`: Free the memory.
- Indexing works like a normal list: `my_list[0] = 10`. The overall length is fixed (no append/insert), and slicing is not supported.
Comparison
| IPC Tool | Best For | Notes |
| --- | --- | --- |
| `Queue` | Producer-consumer | FIFO, safe, simple |
| `Pipe` | Two-way chat | Low-level, pairwise |
| `Manager` | Sharing Python objects | Slower, proxy-based |
| `Value`, `Array` | Simple numeric shared state | Fast, low-level |
| `shared_memory` | Large data (NumPy, ML) | Zero-copy, efficient |
Serialisation Issues
When data is shared or passed between processes, Python cannot simply hand over memory pointers (unlike threads).
Instead, it must:
- Serialise (pickle) the object in the sender process, i.e. convert it into a byte stream.
- Send that byte stream (via pipe, queue, socket, etc.).
- Deserialise (unpickle) it back into a Python object in the receiver process.
So IPC in multiprocessing means serialise → transfer → deserialise.
pickle
It is Python’s default serialisation library. It supports most built-in types (`int`, `list`, `dict`, `set`, `tuple`, classes, and functions, if defined globally). But it has limitations:
- Cannot pickle local functions or lambdas.
- Cannot pickle open file handles, sockets, or thread locks.
- Pickling can be slow for large or complex objects (like huge NumPy arrays).
Example:
```python
import pickle

data = {"x": 42, "y": [1, 2, 3]}
s = pickle.dumps(data)
print(s)
obj = pickle.loads(s)
print(obj)
```
Output:
```
b'\x80\x04\x95\x19\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x01x\x94K*\x8c\x01y\x94]\x94(K\x01K\x02K\x03eu.'
{'x': 42, 'y': [1, 2, 3]}
```
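By contrast, even a module-level lambda trips the first limitation above (a quick demonstration; the exact exception text varies by Python version):

```python
import pickle

f = lambda x: x + 1
try:
    pickle.dumps(f)          # pickle serialises functions by name, and a lambda has none
except Exception as e:
    print("Cannot pickle:", e)
```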
cloudpickle
It is a more powerful serialisation library (`pip install cloudpickle`). It is often used in distributed computing frameworks like Dask, Ray, and PySpark. It can pickle:
- Lambdas
- Nested functions
- Locally defined classes
Example:
```python
import cloudpickle

f = lambda x: x + 1
s = cloudpickle.dumps(f)
g = cloudpickle.loads(s)
print(g(5))
```
Output:
6
Note: `pickle` would fail here, but `cloudpickle` works.
Why does this matter in multiprocessing?
When `multiprocessing.Queue` or `Pool` is used, or argument(s) are sent to a `Process`, Python must pickle those arguments/results.
- Easy if primitives are passed (`int`, `str`, `list`).
- Trouble if complex objects are passed (for example, a lambda, an open file, or a custom C-extension object).
Example (where `pickle` fails):
```python
from multiprocessing import Pool

def demo():
    return lambda x: x + 1

if __name__ == "__main__":
    with Pool(2) as pool:
        try:
            result = pool.apply(demo)  # tries to pickle the returned lambda -> fails
            print(result)
        except Exception as e:
            print("Error:", e)
```
Output:
```
Error: Error sending result: '<function demo.<locals>.<lambda> at 0x1007c4680>'. Reason: 'AttributeError("Can't get local object 'demo.<locals>.<lambda>'")'
```
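The usual workaround is to move the callable to module level so pickle can find it by name (a minimal sketch, using a hypothetical `add_one` helper):

```python
from multiprocessing import Pool

def add_one(x):  # defined at module top level, so pickle finds it by name
    return x + 1

if __name__ == "__main__":
    with Pool(2) as pool:
        print(pool.map(add_one, [1, 2, 3]))  # [2, 3, 4]
```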
Complexity in sharing state
Every time data is sent across processes, there is pickle/unpickle overhead. For large datasets (such as a 1GB NumPy array), this is extremely slow. That’s why Python provides:
- `Value`/`Array`: no pickling, true shared memory.
- `shared_memory`: raw shared block, NumPy arrays, zero-copy.
- `Manager`: easy, but uses proxies (internally pickling/unpickling), slower.