@@ -7,11 +7,11 @@ codec, which is used to populate a :class:`~bson.codec_options.TypeRegistry`.
77The type registry can then be used to create a custom-type-aware
88:class:`~pymongo.collection.Collection`. Read and write operations
99issued against the resulting collection object transparently manipulate
10- documents as they are saved or retrieved from MongoDB.
10+ documents as they are saved to or retrieved from MongoDB.
1111
1212
13- Setup
14- -----
13+ Setting Up
14+ ----------
1515
1616We'll start by getting a clean database to use for the example:
1717
@@ -26,10 +26,10 @@ We'll start by getting a clean database to use for the example:
2626Since the purpose of the example is to demonstrate working with custom types,
2727we'll need a custom data type to use. For this example, we will be working with
2828the :py:class:`~decimal.Decimal` type from Python's standard library. Since the
29- BSON library has a :class: `~bson.decimal128.Decimal128 ` type (that implements
30- the IEEE 754 decimal128 decimal-based floating-point numbering format) which
31- is distinct from Python's built-in :py:class: `~decimal.Decimal ` type, when we
32- try to save an instance of ``Decimal `` with PyMongo, we get an
29+ BSON library's :class:`~bson.decimal128.Decimal128` type (which implements
30+ the IEEE 754 decimal128 decimal-based floating-point numbering format) is
31+ distinct from Python's built-in :py:class:`~decimal.Decimal` type, attempting
32+ to save an instance of ``Decimal`` with PyMongo results in an
3333:exc:`~bson.errors.InvalidDocument` exception.
3434
3535.. doctest ::
@@ -44,13 +44,13 @@ try to save an instance of ``Decimal`` with PyMongo, we get an
4444
4545.. _custom-type-type-codec :
4646
47- The Type Codec
48- --------------
47+ The :class:`~bson.codec_options.TypeCodec` Class
48+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4949
5050.. versionadded :: 3.8
5151
52- In order to encode custom types , we must first define a **type codec ** for our
53- type. A type codec describes how an instance of a custom type can be
52+ In order to encode a custom type, we must first define a **type codec** for
53+ that type. A type codec describes how an instance of a custom type can be
5454*transformed* to and/or from one of the types :mod:`~bson` already understands.
5555Depending on the desired functionality, users must choose from the following
5656base classes when defining type codecs:
@@ -62,7 +62,7 @@ base classes when defining type codecs:
6262 decodes a specified BSON type into a custom Python type. Users must implement
6363 the ``bson_type`` property/attribute and the ``transform_bson`` method.
6464* :class:`~bson.codec_options.TypeCodec`: subclass this to define a codec that
65- can both encode from and decode to a custom type. Users must implement the
65+ can both encode and decode a custom type. Users must implement the
6666 ``python_type`` and ``bson_type`` properties/attributes, as well as the
6767 ``transform_python`` and ``transform_bson`` methods.
6868
@@ -93,14 +93,14 @@ interested in both encoding and decoding our custom type, we use the
9393
9494.. _custom-type-type-registry :
9595
96- The Type Registry
97- -----------------
96+ The :class:`~bson.codec_options.TypeRegistry` Class
97+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9898
9999.. versionadded :: 3.8
100100
101101Before we can begin encoding and decoding our custom type objects, we must
102- first inform PyMongo about our type codec. This is done by creating a
103- :class: `~bson.codec_options.TypeRegistry ` instance:
102+ first inform PyMongo about the corresponding codec. This is done by creating
103+ a :class:`~bson.codec_options.TypeRegistry` instance:
104104
105105.. doctest ::
106106
@@ -113,7 +113,7 @@ Once instantiated, registries are immutable and the only way to add codecs
113113to a registry is to create a new one.
114114
115115
116- Putting it together
116+ Putting It Together
117117-------------------
118118
119119Finally, we can define a :class:`~bson.codec_options.CodecOptions` instance
@@ -201,35 +201,79 @@ This is trivial to do since the same transformation as the one used for
201201 information, it is impossible to discern which incoming
202202 :class:`~bson.decimal128.Decimal128` value needs to be decoded as ``Decimal``
203203 and which needs to be decoded as ``DecimalInt``. This example only considers
204- the situation where a user wants to *encode* documents containing one or both
204+ the situation where a user wants to *encode* documents containing either
205205 of these types.
206206
207- Now, we can create a new codec options object and use it to get a collection
208- object:
207+ After creating a new codec options object and using it to get a collection
208+ object, we can seamlessly encode instances of ``DecimalInt``:
209209
210210.. doctest::
211211
212212   >>> type_registry = TypeRegistry([decimal_codec, decimalint_codec])
213213   >>> codec_options = CodecOptions(type_registry=type_registry)
214214   >>> collection = db.get_collection('test', codec_options=codec_options)
215215   >>> collection.drop()
216-
217-
218- We can now seamlessly encode instances of ``DecimalInt ``. Note that the
219- ``transform_bson `` method of the base codec class results in these values
220- being decoded as ``Decimal `` (and not ``DecimalInt ``):
221-
222- .. doctest ::
223-
224216   >>> collection.insert_one({'num': DecimalInt("45.321")})
225217   <pymongo.results.InsertOneResult object at ...>
226218   >>> mydoc = collection.find_one()
227219   >>> pprint.pprint(mydoc)
228220   {u'_id': ObjectId('...'), u'num': Decimal('45.321')}
229221
222+ Note that the ``transform_bson`` method of the base codec class results in
223+ these values being decoded as ``Decimal`` (and not ``DecimalInt``).
224+
225+
226+ .. _decoding-binary-types:
227+
228+ Decoding :class:`~bson.binary.Binary` Types
229+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
230+
231+ The decoding treatment of :class:`~bson.binary.Binary` types having
232+ ``subtype = 0`` by the :mod:`bson` module varies slightly depending on the
233+ version of the Python runtime in use. This must be taken into account while
234+ writing a ``TypeDecoder`` that modifies how this datatype is decoded.
235+
236+ On Python 3.x, :class:`~bson.binary.Binary` data (``subtype = 0``) is decoded
237+ as a ``bytes`` instance:
238+
239+ .. code-block:: python
240+
241+    >>> # On Python 3.x.
242+    >>> from bson.binary import Binary
243+    >>> newcoll = db.get_collection('new')
244+    >>> newcoll.insert_one({'_id': 1, 'data': Binary(b"123", subtype=0)})
245+    >>> doc = newcoll.find_one()
246+    >>> type(doc['data'])
247+    bytes
248+
249+
250+ On Python 2.7.x, the same data is decoded as a :class:`~bson.binary.Binary`
251+ instance:
252+
253+ .. code-block:: python
254+
255+    >>> # On Python 2.7.x
256+    >>> newcoll = db.get_collection('new')
257+    >>> doc = newcoll.find_one()
258+    >>> type(doc['data'])
259+    bson.binary.Binary
230260
231- The Fallback Encoder
232- --------------------
261+
262+ As a consequence of this disparity, users must set the ``bson_type`` attribute
263+ on their :class:`~bson.codec_options.TypeDecoder` classes differently,
264+ depending on the Python version in use.
265+
266+
267+ .. note::
268+
269+    For codebases requiring compatibility with both Python 2 and 3, type
270+    decoders will have to be registered for both possible ``bson_type`` values.
272+
273+ .. _fallback-encoder-callable:
274+
275+ The ``fallback_encoder`` Callable
276+ ---------------------------------
233277
234278.. versionadded :: 3.8
235279
@@ -268,27 +312,110 @@ We can now seamlessly encode instances of :py:class:`~decimal.Decimal`:
268312 >>> pprint.pprint(mydoc)
269313 {u'_id': ObjectId('...'), u'num': Decimal128('45.321')}
270314
271- As you can tell, fallback encoders are a compelling alternative to type codecs
272- when we only want to encode custom types due to their much simpler API.
273- Users should note however, that fallback encoders cannot be used to modify the
274- encoding of types that PyMongo already understands, as illustrated by the
275- following example:
276315
277- >>> def fallback_encoder (value ):
278- ... """ Encoder that converts floats to int."""
279- ... if isinstance (value, float ):
280- ... return int (value)
281- ... return value
282- >>> type_registry = TypeRegistry(fallback_encoder = fallback_encoder)
283- >>> codec_options = CodecOptions(type_registry = type_registry)
284- >>> collection = db.get_collection(' test' , codec_options = codec_options)
285- >>> collection.drop()
286- >>> collection.insert_one({' num' : 45.321 })
287- <pymongo.results.InsertOneResult object at ...>
288- >>> mydoc = collection.find_one()
289- >>> pprint.pprint(mydoc)
290- {u'_id': ObjectId('...'), u'num': 45.321}
316+ .. note::
317+
318+    Fallback encoders are invoked *after* attempts to encode the given value
319+    with standard BSON encoders and any configured type encoders have failed.
320+    Therefore, in a type registry configured with a type encoder and fallback
321+    encoder that both target the same custom type, the behavior specified in
322+    the type encoder will prevail.
324+
325+ Because fallback encoders don't need to declare the types that they encode
326+ beforehand, they can be used to support interesting use-cases that cannot be
327+ serviced by ``TypeEncoder``. One such use-case is described in the next
328+ section.
329+
330+
331+ Encoding Unknown Types
332+ ^^^^^^^^^^^^^^^^^^^^^^
333+
334+ In this example, we demonstrate how a fallback encoder can be used to save
335+ arbitrary objects to the database. We will use the standard library's
336+ :py:mod:`pickle` module to serialize the unknown types, so this approach
337+ naturally only works for types that are picklable.
338+
339+ We start by defining some arbitrary custom types:
340+
341+ .. code-block:: python
342+
343+    class MyStringType(object):
344+        def __init__(self, value):
345+            self.__value = value
346+        def __repr__(self):
347+            return "MyStringType('%s')" % (self.__value,)
348+
349+    class MyNumberType(object):
350+        def __init__(self, value):
351+            self.__value = value
352+        def __repr__(self):
353+            return "MyNumberType(%s)" % (self.__value,)
354+
355+ We also define a fallback encoder that pickles whatever objects it receives
356+ and returns them as :class:`~bson.binary.Binary` instances with a custom
357+ subtype. The custom subtype, in turn, allows us to write a ``TypeDecoder`` that
358+ identifies pickled artifacts upon retrieval and transparently decodes them
359+ back into Python objects:
360+
361+ .. code-block:: python
362+
363+    import pickle
364+    from bson.binary import Binary, USER_DEFINED_SUBTYPE
365+    def fallback_pickle_encoder(value):
366+        return Binary(pickle.dumps(value), USER_DEFINED_SUBTYPE)
367+
368+    class PickledBinaryDecoder(TypeDecoder):
369+        bson_type = Binary
370+        def transform_bson(self, value):
371+            if value.subtype == USER_DEFINED_SUBTYPE:
372+                return pickle.loads(value)
373+            return value
374+
375+
376+ .. note::
377+
378+    The above example is written assuming the use of Python 3. If you are using
379+    Python 2, ``bson_type`` must be set to ``Binary``. See the
380+    :ref:`decoding-binary-types` section for a detailed explanation.
381+
382+
383+ Finally, we create a ``CodecOptions`` instance:
384+
385+ .. code-block:: python
386+
387+    codec_options = CodecOptions(type_registry=TypeRegistry(
388+        [PickledBinaryDecoder()], fallback_encoder=fallback_pickle_encoder))
389+
390+ We can now round trip our custom objects to MongoDB:
391+
392+ .. code-block:: python
393+
394+    collection = db.get_collection('test_fe', codec_options=codec_options)
395+    collection.insert_one({'_id': 1, 'str': MyStringType("hello world"),
396+                           'num': MyNumberType(2)})
397+    mydoc = collection.find_one()
398+    assert isinstance(mydoc['str'], MyStringType)
399+    assert isinstance(mydoc['num'], MyNumberType)
400+
401+
402+ Limitations
403+ -----------
404+
405+ PyMongo's type codec and fallback encoder features have the following
406+ limitations:
291407
292- This is due to the fact that fallback encoders are invoked only after
293- an attempt to encode the value with type codecs and standard BSON encoding
294- routines has been unsuccessful.
408+ #. Users cannot customize the encoding behavior of Python types that PyMongo
409+    already understands, like ``int`` and ``str`` (the 'built-in types').
410+    Attempting to instantiate a type registry with one or more codecs that act
411+    upon a built-in type results in a ``TypeError``. This limitation extends
412+    to all subtypes of the standard types.
413+ #. Chaining type encoders is not supported. A custom type value, once
414+    transformed by a codec's ``transform_python`` method, *must* result in a
415+    type that is either BSON-encodable by default, or can be
416+    transformed by the fallback encoder into something BSON-encodable; it
417+    *cannot* be transformed a second time by a different type codec.
418+ #. The :meth:`~pymongo.database.Database.command` method does not apply the
419+    user's TypeDecoders while decoding the command response document.
420+ #. :mod:`gridfs` does not apply custom type encoding or decoding to any
421+    documents received from or returned to the user.