
Commit 88275da

elistevens authored and apaszke committed
CUDA documentation tweaks (pytorch#858)
1 parent bd7a5ad commit 88275da

File tree

4 files changed: +41 −25 lines changed


docs/source/notes/cuda.rst

Lines changed: 5 additions & 3 deletions
@@ -71,9 +71,11 @@ Use nn.DataParallel instead of multiprocessing
 
 Most use cases involving batched input and multiple GPUs should default to using
 :class:`~torch.nn.DataParallel` to utilize more than one GPU. Even with the GIL,
-a single python process can saturate multiple GPUs, though at very large numbers
-of GPUs (8+) utilization might drop. Test your use case before investing the
-time to develop something more complicated.
+a single python process can saturate multiple GPUs.
+
+As of version 0.1.9, large numbers of GPUs (8+) might not be fully utilized.
+However, this is a known issue that is under active development. As always,
+test your use case.
 
 There are significant caveats to using CUDA models with
 :mod:`~torch.multiprocessing`; unless care is taken to meet the data handling
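The DataParallel guidance above can be sketched in runnable form (a minimal sketch: the `nn.Linear` sizes and batch shape are made up, and the wrapper only engages when more than one GPU is visible):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in network; any nn.Module works the same way.
net = nn.Linear(10, 5)

# A single Python process drives all visible GPUs; DataParallel splits
# the batch along dim 0 and replicates the module on each device.
if torch.cuda.device_count() > 1:
    net = nn.DataParallel(net).cuda()

batch = torch.randn(8, 10)
if next(net.parameters()).is_cuda:
    batch = batch.cuda()

output = net(batch)
print(output.shape)  # torch.Size([8, 5])
```

On a single-GPU or CPU-only machine the module runs unwrapped, so the same script covers both cases.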
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+
+Serialization semantics
+=======================
+
+Best practices
+--------------
+
+.. _recommend-saving-models:
+
+Recommended approach for saving a model
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+There are two main approaches for serializing and restoring a model.
+
+The first (recommended) saves and loads only the model parameters::
+
+    torch.save(the_model.state_dict(), PATH)
+
+Then later::
+
+    the_model = TheModelClass(*args, **kwargs)
+    the_model.load_state_dict(torch.load(PATH))
+
+The second saves and loads the entire model::
+
+    torch.save(the_model, PATH)
+
+Then later::
+
+    the_model = torch.load(PATH)
+
+However in this case, the serialized data is bound to the specific classes
+and the exact directory structure used, so it can break in various ways when
+used in other projects, or after some serious refactors.
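The recommended state_dict round trip from the new note, expanded into a self-contained sketch (`TheModelClass` and the layer sizes are placeholders, and the save path is a temp file):

```python
import os
import tempfile

import torch
import torch.nn as nn

class TheModelClass(nn.Module):
    # Placeholder model; the real class can be any nn.Module.
    def __init__(self):
        super(TheModelClass, self).__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

PATH = os.path.join(tempfile.mkdtemp(), "model.pth")

the_model = TheModelClass()
torch.save(the_model.state_dict(), PATH)   # parameters only; no class pickled

restored = TheModelClass()                 # class definition must exist here
restored.load_state_dict(torch.load(PATH))

x = torch.randn(3, 4)
assert torch.equal(the_model(x), restored(x))
```

Because only tensors are written, loading works anywhere the class definition is importable, regardless of how the source tree has been reorganized.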

torch/nn/parallel/data_parallel.py

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@ class DataParallel(Module):
         >>> output = net(input_var)
     """
 
+    # TODO: update notes/cuda.rst when this class handles 8+ GPUs well
     def __init__(self, module, device_ids=None, output_device=None):
         super(DataParallel, self).__init__()
         if device_ids is None:

torch/serialization.py

Lines changed: 1 addition & 22 deletions
@@ -103,28 +103,7 @@ def storage_to_tensor_type(storage):
 def save(obj, f, pickle_module=pickle, pickle_protocol=DEFAULT_PROTOCOL):
     """Saves an object to a disk file.
 
-    There are two main approaches for serializing and restoring a model.
-
-    The first (recommended) saves and loads only the model parameters::
-
-        torch.save(the_model.state_dict(), PATH)
-
-    Then later::
-
-        the_model = TheModelClass(*args, **kwargs)
-        the_model.load_state_dict(torch.load(PATH))
-
-    The second saves and loads the entire model::
-
-        torch.save(the_model, PATH)
-
-    Then later::
-
-        the_model = torch.load(PATH))
-
-    The second relies on both the shape of the model, as well as the class
-    definition. This results in it being more fragile, since if the source code
-    of the class changes, the model will no longer load.
+    See also: :ref:`recommend-saving-models`
 
     Args:
         obj: saved object
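The `save(obj, f, ...)` signature above takes any picklable object and either a path or a file-like object for `f`; a short round-trip sketch (the in-memory buffer and checkpoint payload are illustrative):

```python
import io

import torch

# torch.save writes to a path or any file-like object; an in-memory
# buffer keeps the example self-contained.
buffer = io.BytesIO()
checkpoint = {"step": 3, "weights": torch.ones(2, 2)}
torch.save(checkpoint, buffer)

buffer.seek(0)
restored = torch.load(buffer)
print(restored["step"])  # 3
```

Plain containers of tensors and primitives, like the dict here, round-trip cleanly; arbitrary custom classes fall back on pickle and inherit its fragility.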
