Python is the language of choice for most Machine Learning code, but I prefer to run my back-end services using NodeJS. I'm currently working on a Node-based project where I need to call the HuggingFace Diffusers module, so I have implemented an IPC wrapper around the Python script.
It works by sending JSON messages back and forth over stdin/stdout, using U+0000 (NULL, \0) as a delimiter. NULL is a control character, and control characters must always be escaped inside valid JSON strings, so we can be confident that no serialized message will ever contain a raw NULL byte.
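A quick way to convince yourself the delimiter is safe: JSON.stringify escapes NULL (like every other control character) inside strings, so a raw \0 can only ever mean "end of message". For example (throwaway values, just for illustration):

// JSON.stringify escapes control characters, so a serialized message
// can never contain a raw NULL byte.
const serialized = JSON.stringify({ message: "contains a null: \u0000" });

console.log(serialized);                // {"message":"contains a null: \u0000"}
console.log(serialized.includes('\0')); // false, so '\0' is safe as a frame delimiter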
Since the Python script is blocking, we can send it an arbitrary number of messages and they will sit in the stdin pipe's buffer until the process gets around to reading them.
Responses from the Python process are manually buffered until we receive the null byte and then decoded.
The Node Side
import { spawn } from 'child_process';
import { dirname, join } from 'path';
import { fileURLToPath } from 'url';

// Start the Python process
const __dirname = dirname(fileURLToPath(import.meta.url));
const pythonProcess = spawn('python3', [join(__dirname, 'example.py')]);

// Buffer for incomplete messages
let buffer = '';

// Handle incoming messages from Python
pythonProcess.stdout.on('data', (data) => {
  // Split incoming data on null bytes
  const messages = (buffer + data.toString()).split('\0');

  // Store incomplete message
  buffer = messages.pop();

  // Process complete messages
  for (const message of messages) {
    try {
      const response = JSON.parse(message);
      console.log('Received from Python:', response);
    } catch (error) {
      console.error('Parse error:', error);
    }
  }
});

// Handle Python errors
pythonProcess.stderr.on('data', (data) => {
  console.error('Python stderr:', data.toString());
});

// Send messages to Python
function sendToPython(message) {
  pythonProcess.stdin.write(JSON.stringify(message) + '\n');
}

// Test the communication
sendToPython({ type: "test", message: "Hello Python!" });
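The handler above just logs whatever comes back. If you need to await a specific result instead, the same plumbing can be wrapped in a small FIFO of pending promises; because the Python loop answers requests strictly in order, the oldest pending promise always matches the next response. Here's a rough sketch of that idea. It reuses pythonProcess, buffer, and sendToPython from above, it replaces the logging 'data' handler rather than adding a second one, and the requestFromPython helper is just a name made up for illustration:

// Queue of resolvers, oldest first; Python replies in request order.
const pending = [];

function requestFromPython(message) {
  return new Promise((resolve) => {
    pending.push(resolve);
    sendToPython(message);
  });
}

// Replace the logging 'data' handler with one that resolves
// the oldest pending request for each complete response.
pythonProcess.stdout.on('data', (data) => {
  const messages = (buffer + data.toString()).split('\0');
  buffer = messages.pop();

  for (const message of messages) {
    let response;
    try {
      response = JSON.parse(message);
    } catch (error) {
      console.error('Parse error:', error);
      continue;
    }
    if (response.status === 'ready') continue; // startup notice, not a reply
    const resolve = pending.shift();
    if (resolve) resolve(response);
  }
});

// Usage: queue several requests; they resolve in order as Python replies.
const results = await Promise.all([
  requestFromPython({ type: "test", message: "first" }),
  requestFromPython({ type: "test", message: "second" }),
]);
console.log(results);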
The Python Side
import sys
import json
import time

def log_message(message):
    # Write messages with null terminator as delimiter
    sys.stdout.buffer.write(json.dumps(message).encode() + b'\0')
    sys.stdout.buffer.flush()

if __name__ == "__main__":
    log_message({"status": "ready"})

    # Loop forever to keep the process active
    while True:
        try:
            # Read input from Node
            data = json.loads(input())

            # Echo back with timestamp
            log_message({
                "status": "success",
                "received": data,
                "timestamp": time.time()
            })
        except Exception as e:
            log_message({
                "status": "error",
                "message": str(e)
            })

This is a bare-bones example of how to accomplish this. Stay tuned for more details about this project in the near future.
Top comments (4)
Super clean setup, always love seeing cross-language hacks come together like this. You ever run into issues with buffering or pipes going weird when stuff gets high volume?
So far I have not seen any issues, but I have not pushed it very hard yet. I'm sending messages over a WebSocket from the browser to Node that carry the parameters for diffusion. I have queued multiple requests across several instances of the client and the result always comes through properly.
I haven't come up with a way to trigger exactly simultaneous requests to see if that will work properly, but I think it should be fine. The requests are sent from Node to Python as a complete JSON body and should never be too large for the buffer.
It's the outputs that end up coming through in chunks due to the size of the base64 encoded image data. But the Python script can only process one request at a time so there is no chance of collisions.
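The closest approximation I can think of is just firing a burst of writes back-to-back from Node; the stdin pipe and the single-threaded Python loop serialize them anyway. Something like this, using sendToPython from the post (the count is arbitrary):

// Fire a burst of requests back-to-back; they queue in the stdin pipe
// and Python works through them one at a time, in order.
for (let i = 0; i < 20; i++) {
  sendToPython({ type: "test", message: `burst request ${i}` });
}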
The cover looks great! Looking forward to more details.
I know, right! They look so happy together. Thanks for reading!