Skip to content

Conversation

@immerrr
Copy link
Contributor

@immerrr immerrr commented Sep 19, 2014

It turns out exceptions raised by jsonschema weren't picklable in python 2.7 because Exception.__reduce__ returned empty Exception.args tuple and ValidationError afterwards wasn't happy with being constructed without any arguments:

In [1]: import jsonschema In [2]: import pickle In [3]: pickle.loads(pickle.dumps(jsonschema.exceptions.ValidationError('foobar'))) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-3-1c3ada2ffd2d> in <module>() ----> 1 pickle.loads(pickle.dumps(jsonschema.exceptions.ValidationError('foobar'))) /home/immerrr/.conda/envs/jsonschema/lib/python2.7/pickle.pyc in loads(str) 1380 def loads(str): 1381 file = StringIO(str) -> 1382 return Unpickler(file).load() 1383 1384 # Doctest /home/immerrr/.conda/envs/jsonschema/lib/python2.7/pickle.pyc in load(self) 856 while 1: 857 key = read(1) --> 858 dispatch[key](self) 859 except _Stop, stopinst: 860 return stopinst.value /home/immerrr/.conda/envs/jsonschema/lib/python2.7/pickle.pyc in load_reduce(self) 1131 args = stack.pop() 1132 func = stack[-1] -> 1133 value = func(*args) 1134 stack[-1] = value 1135 dispatch[REDUCE] = load_reduce TypeError: __init__() takes at least 2 arguments (1 given)
@Julian
Copy link
Member

Julian commented Sep 28, 2014

Hey!

I don't particularly want to support pickle as a use case going forward, but I think adding a call to super's __init__ (which will set Exception.args) will likely fix your issue, and I wouldn't mind a patch that did that instead (and an assertion that checked that args was set properly).

If you'd be interested in changing to do that that'd be great! Thanks either way for the report.

@ankostis
Copy link
Contributor

Not invoking parent constructor breaks class inheritance and should be considered bad programming. There is no reason to doubt the necessity for this PR, or require a test-case for the obvious.

@Julian
Copy link
Member

Julian commented Sep 28, 2014

I've just said that I will accept a patch that calls the parent's init? Everything must have tests though. I just don't want a test ensuring that things are picklable when I don't intend to put in work to make that be true going forward.

@immerrr
Copy link
Contributor Author

immerrr commented Sep 28, 2014

re: super: ok

re: don't support pickling: I agree, it's not common to transfer exceptions over the wire in general but offloading validation is probably not that exotic: I myself discovered this when I moved validation into a subprocess to avoid blocking the event loop.

@Julian
Copy link
Member

Julian commented Sep 28, 2014

That sounds potentially reasonable (although things like twisted.internet.task.cooperate are probably easier) -- why'd it involve introducing pickle though?

@immerrr
Copy link
Contributor Author

immerrr commented Sep 28, 2014

In non-twisted world it's a common recommendation to use multiprocessing to side-step GIL in CPU-intensive tasks and that involves pickling and unpickling the results (or errors).

@immerrr
Copy link
Contributor Author

immerrr commented Sep 28, 2014

In twisted world fork-noexec is frowned upon, so I don't advertise a hack I've written that implements deferToSubprocess function that invokes a given callable in background process and returns the result in a deferred. It's not pretty internally, but it is pretty convenient outside and it works as long as the function is synchronous and the result/error is picklable.

@immerrr immerrr force-pushed the make-exceptions-picklable branch from 7e80f7b to 4525543 Compare September 28, 2014 17:45
@Julian
Copy link
Member

Julian commented Oct 5, 2014

Ah I see, yeah I don't care for the multiprocessing module either, but even if you use it, you don't need pickle to send things back and forth, the normal serialization and deserialization procedures should work fine in whatever format you chose for that.

@immerrr
Copy link
Contributor Author

immerrr commented Oct 6, 2014

I don't care for the multiprocessing module either, but even if you use it, you don't need pickle to send things back and forth

What do you mean?

import multiprocessing class AErr(Exception): def __reduce__(self): return raise_valueerror, () def raise_valueerror(): raise ValueError("hello, world") def a(x): raise AErr("hello, world") if __name__ == '__main__': p = multiprocessing.Pool() p.map(a, [1, 2, 3])
$ python ~/test_multiprocessing.py Exception in thread Thread-3: Traceback (most recent call last): File "/home/immerrr/.conda/envs/test/lib/python2.7/threading.py", line 810, in __bootstrap_inner self.run() File "/home/immerrr/.conda/envs/test/lib/python2.7/threading.py", line 763, in run self.__target(*self.__args, **self.__kwargs) File "/home/immerrr/.conda/envs/test/lib/python2.7/multiprocessing/pool.py", line 380, in _handle_results task = get() File "/home/immerrr/test_multiprocessing.py", line 10, in raise_valueerror raise ValueError("hello, world") ValueError: (ValueError('hello, world',), <function raise_valueerror at 0x7fabe1353a28>, ()) 
@Julian
Copy link
Member

Julian commented Oct 6, 2014

What I mean is that your example and your presumable actual code either send exceptions back across processes or otherwise propagate them by allowing them to be raised, when you could be sending other things back and forth, like whatever you're actually interested in. A list of error messages or whatever.

@immerrr immerrr force-pushed the make-exceptions-picklable branch from 4525543 to 0a51b66 Compare October 7, 2014 00:00
@immerrr
Copy link
Contributor Author

immerrr commented Oct 7, 2014

Ok, let me take out the tests so that the fix can be applied. We can talk about long-term support later, although, it seems we're close to agree to disagree here.

I mean, I might be biased because I got bitten by this issue, but I just can't see how it is beneficial to recommend users to jump through hoops to avoid running into an issue with not-really-supported pickle serialization despite its ubiquity throughout the ecosystem.

@Julian
Copy link
Member

Julian commented Oct 7, 2014

OK, sounds good, definitely would take that patch and appreciate the time.

I don't think pickle is ubiquitous by any means, but even if it were, I'd love to do my part to make it not so. It's bad software. So is multiprocessing. IMHO there's never a good reason to use either. But if you are going to use either, it's not jumping through hoops to send data back and forth, that's how you transfer things between processes, not by serializing random objects.

@immerrr
Copy link
Contributor Author

immerrr commented Oct 7, 2014

Please, see the patch.

Jumping through hoops is writing your own serialization, deserialization and communication code when sticking to pickle and multiprocessing would require you to write none. The good thing about pickle and multiprocessing is that they just work, most of the time. I mean, yes, pickle is as dangerous as eval, but bad... Why?

@Julian Julian merged commit 0a51b66 into python-jsonschema:master Oct 12, 2014
@Julian
Copy link
Member

Julian commented Oct 12, 2014

Merged and added the test. Thanks.

@immerrr
Copy link
Contributor Author

immerrr commented Oct 13, 2014

Awesome, thanks.

@immerrr immerrr deleted the make-exceptions-picklable branch October 13, 2014 10:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants