Python 3 support on Windows? Running into a pickling problem #36

@tjrileywisc

I'm running the MNIST example on a standalone Spark cluster on Windows. I had to make a few changes to get it running under Python 3 (I'm using 3.5):

In TFCluster.py (due to changes in relative imports):

from . import TFSparkNode 

In TFSparkNode.py (the Queue module is named queue in Python 3, relative imports again, and Python 3 does not handle UUID objects the same way as Python 2):

from . import TFManager
...
authkey = uuid.uuid4()
authkey = authkey.bytes
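For anyone following along, here is a minimal sketch of the two non-import changes in isolation. It assumes the authkey ends up being passed to a multiprocessing manager, which in Python 3 expects a byte string; `work_queue` is a hypothetical name used only for illustration and is not taken from TFSparkNode.py.

```python
import uuid

# The module was renamed from Queue (Python 2) to queue (Python 3).
try:
    import queue
except ImportError:
    import Queue as queue  # Python 2 fallback

# uuid4() returns a UUID object; .bytes gives its 16-byte representation,
# which is the form a multiprocessing authkey needs in Python 3.
authkey = uuid.uuid4().bytes

# Hypothetical queue, only to show the renamed module in use.
work_queue = queue.Queue()
```

The try/except keeps the same file importable under both Python 2 and 3, which may matter if the cluster still has Python 2 workers.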

... and then I get stuck with a pickling issue:

Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "F:\TensorFlowOnSpark\spark-1.6.0-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\worker.py", line 111, in main
  File "F:\TensorFlowOnSpark\spark-1.6.0-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\worker.py", line 106, in process
  File "F:\TensorFlowOnSpark\spark-1.6.0-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\rdd.py", line 2346, in pipeline_func
  File "F:\TensorFlowOnSpark\spark-1.6.0-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\rdd.py", line 317, in func
  File "F:\TensorFlowOnSpark\tfspark.zip\com\yahoo\ml\tf\TFSparkNode.py", line 97, in _reserve
  File ".\tfspark.zip\com\yahoo\ml\tf\TFManager.py", line 36, in start
    mgr.start()
  File "C:\Python35\lib\multiprocessing\managers.py", line 479, in start
    self._process.start()
  File "C:\Python35\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Python35\lib\multiprocessing\context.py", line 313, in _Popen
    return Popen(process_obj)
  File "C:\Python35\lib\multiprocessing\popen_spawn_win32.py", line 66, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Python35\lib\multiprocessing\reduction.py", line 59, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'start.<locals>.<lambda>'

I'm not really sure how to proceed from here. I tried using another library (dill) to do the pickling, but that isn't working either. Has anybody gotten this to work in Python 3?
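For context on the error itself: the traceback goes through popen_spawn_win32.py, where multiprocessing pickles the process object to hand it to the child process, and the message names a lambda defined inside a function called start. The standard pickle module cannot serialize a lambda or any other function defined inside another function, while a module-level function pickles by name. The sketch below reproduces only that distinction. The function names are hypothetical stand-ins, not the actual contents of TFManager.py, which is not shown here.

```python
import pickle


def start_with_lambda():
    # Hypothetical stand-in for the failing shape: a lambda created
    # inside a function, giving it a '<locals>.<lambda>' qualified name.
    return lambda qname: qname


def get_queue_name(qname):
    # Module-level equivalent: pickle can reference it by name.
    return qname


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


local_ok = can_pickle(start_with_lambda())
module_ok = can_pickle(get_queue_name)
print("local lambda picklable:", local_ok)
print("module-level function picklable:", module_ok)
```

If this is what is happening, one possible direction is to replace the lambdas in start with functions defined at module level, though whether that is enough for the manager to start on Windows is untested here.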
