Commit bc26c0d

PYTHON-1577 Allow applications to register a custom server selector (mongodb#371)
1 parent 58851e1 commit bc26c0d

10 files changed: +367 -22 lines changed

doc/examples/index.rst

Lines changed: 1 addition & 0 deletions
@@ -27,5 +27,6 @@ MongoDB, you can start it like so:
    gridfs
    high_availability
    mod_wsgi
+   server_selection
    tailable
    tls

doc/examples/server_selection.rst

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
+Server Selector Example
+=======================
+
+Users can exert fine-grained control over the `server selection algorithm`_
+by setting the `server_selector` option on the :class:`~pymongo.MongoClient`
+to an appropriate callable. This example shows how to use this functionality
+to prefer servers running on ``localhost``.
+
+
+.. warning::
+
+   Use of custom server selector functions is a power user feature. Misusing
+   custom server selectors can have unintended consequences such as degraded
+   read/write performance.
+
+
+.. testsetup::
+
+   from pymongo import MongoClient
+
+
+.. _server selection algorithm: https://docs.mongodb.com/manual/core/read-preference-mechanics/
+
+
+Example: Selecting Servers Running on ``localhost``
+---------------------------------------------------
+
+To start, we need to write the server selector function that will be used.
+The server selector function should accept a list of
+:class:`~pymongo.server_description.ServerDescription` objects and return a
+list of server descriptions that are suitable for the read or write operation.
+A server selector must not create or modify
+:class:`~pymongo.server_description.ServerDescription` objects, and must return
+the selected instances unchanged.
+
+In this example, we write a server selector that prioritizes servers running on
+``localhost``. This can be desirable when using a sharded cluster with multiple
+``mongos``, as locally run queries are likely to see lower latency and higher
+throughput. Please note, however, that it is highly dependent on the
+application if preferring ``localhost`` is beneficial or not.
+
+In addition to comparing the hostname with ``localhost``, our server selector
+function accounts for the edge case when no servers are running on
+``localhost``. In this case, we allow the default server selection logic to
+prevail by passing through the received server description list unchanged.
+Failure to do this would render the client unable to communicate with MongoDB
+in the event that no servers were running on ``localhost``.
+
+
+The described server selection logic is implemented in the following server
+selector function:
+
+
+.. doctest::
+
+   >>> def server_selector(server_descriptions):
+   ...     servers = [
+   ...         server for server in server_descriptions
+   ...         if server.address[0] == 'localhost'
+   ...     ]
+   ...     if not servers:
+   ...         return server_descriptions
+   ...     return servers
+
+
+
+Finally, we can create a :class:`~pymongo.MongoClient` instance with this
+server selector.
+
+
+.. doctest::
+
+   >>> client = MongoClient(server_selector=server_selector)
+
+
+
+Server Selection Process
+------------------------
+
+This section dives deeper into the server selection process for reads and
+writes. In the case of a write, the driver performs the following operations
+(in order) during the selection process:
+
+
+#. Select all writeable servers from the list of known hosts. For a replica set
+   this is the primary, while for a sharded cluster this is all the known mongoses.
+
+#. Apply the user-defined server selector function. Note that the custom server
+   selector is **not** called if there are no servers left from the previous
+   filtering stage.
+
+#. Apply the ``localThresholdMS`` setting to the list of remaining hosts. This
+   whittles the host list down to only contain servers whose latency is at most
+   ``localThresholdMS`` milliseconds higher than the lowest observed latency.
+
+#. Select a server at random from the remaining host list. The desired
+   operation is then performed against the selected server.
+
+
+In the case of **reads** the process is identical except for the first step.
+Here, instead of selecting all writeable servers, we select all servers
+matching the user's :class:`~pymongo.read_preferences.ReadPreference` from the
+list of known hosts. As an example, for a 3-member replica set with a
+:class:`~pymongo.read_preferences.Secondary` read preference, we would select
+all available secondaries.
+
+
+.. _server selection algorithm: https://docs.mongodb.com/manual/core/read-preference-mechanics/
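
Note on the example above: the ``localhost`` selector can be exercised without a running deployment. The sketch below is a minimal check, assuming a hypothetical ``FakeServer`` namedtuple that stands in for ``ServerDescription`` and models only its ``address`` attribute as a ``(host, port)`` tuple. It is not pymongo's class, and the host names are made up for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for pymongo's ServerDescription; only the
# ``address`` attribute (a (host, port) tuple) is modeled here.
FakeServer = namedtuple("FakeServer", ["address"])


def server_selector(server_descriptions):
    """Prefer servers on localhost, else pass the candidates through."""
    servers = [
        server for server in server_descriptions
        if server.address[0] == "localhost"
    ]
    if not servers:
        # Fall back so the client can still reach remote servers.
        return server_descriptions
    return servers


local = FakeServer(("localhost", 27017))
remote = FakeServer(("db.example.com", 27017))

# One local and one remote candidate: only the local one is kept.
preferred = server_selector([local, remote])
# No local candidate: the input list is returned unchanged.
fallback = server_selector([remote])
```

The second call shows the fallback path the documentation warns about: without it, a deployment with no ``localhost`` member would leave the client with an empty candidate list.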

pymongo/client_options.py

Lines changed: 7 additions & 0 deletions
@@ -25,6 +25,7 @@
 from pymongo.read_concern import ReadConcern
 from pymongo.read_preferences import (make_read_preference,
                                       read_pref_mode_from_name)
+from pymongo.server_selectors import any_server_selector
 from pymongo.ssl_support import get_ssl_context
 from pymongo.write_concern import WriteConcern
 
@@ -163,6 +164,8 @@ def __init__(self, username, password, database, options):
         self.__heartbeat_frequency = options.get(
             'heartbeatfrequencyms', common.HEARTBEAT_FREQUENCY)
         self.__retry_writes = options.get('retrywrites', common.RETRY_WRITES)
+        self.__server_selector = options.get(
+            'server_selector', any_server_selector)
 
     @property
     def _options(self):
@@ -194,6 +197,10 @@ def server_selection_timeout(self):
         """The server selection timeout for this instance in seconds."""
         return self.__server_selection_timeout
 
+    @property
+    def server_selector(self):
+        return self.__server_selector
+
     @property
     def heartbeat_frequency(self):
         """The monitoring frequency in seconds."""

pymongo/common.py

Lines changed: 11 additions & 1 deletion
@@ -473,6 +473,15 @@ def validate_driver_or_none(option, value):
     return value
 
 
+def validate_is_callable_or_none(option, value):
+    """Validates that 'value' is a callable."""
+    if value is None:
+        return value
+    if not callable(value):
+        raise ValueError("%s must be a callable" % (option,))
+    return value
+
+
 def validate_ok_for_replace(replacement):
     """Validate a replacement document."""
     validate_is_mapping("replacement", replacement)
@@ -552,7 +561,7 @@ def validate_tzinfo(dummy, value):
     'unicode_decode_error_handler': validate_unicode_decode_error_handler,
     'retrywrites': validate_boolean_or_string,
     'compressors': validate_compressors,
-    'zlibcompressionlevel': validate_zlib_compression_level
+    'zlibcompressionlevel': validate_zlib_compression_level,
 }
 
 TIMEOUT_VALIDATORS = {
@@ -572,6 +581,7 @@ def validate_tzinfo(dummy, value):
     'tzinfo': validate_tzinfo,
     'username': validate_string_or_none,
     'password': validate_string_or_none,
+    'server_selector': validate_is_callable_or_none,
 }
 
 URI_VALIDATORS.update(TIMEOUT_VALIDATORS)
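
Note on the validator above: its behavior for the three interesting inputs (``None``, a callable, and a non-callable) can be checked in isolation. The sketch below copies the function body from this diff into a standalone module. The ``describe`` helper is hypothetical and exists only to collect the outcomes.

```python
def validate_is_callable_or_none(option, value):
    """Standalone copy of the validator added in this commit."""
    if value is None:
        return value
    if not callable(value):
        raise ValueError("%s must be a callable" % (option,))
    return value


def describe(value):
    """Hypothetical helper: report how the validator treats ``value``."""
    try:
        validate_is_callable_or_none("server_selector", value)
    except ValueError as exc:
        return "rejected: %s" % exc
    return "accepted"


results = {
    "none": describe(None),                       # option left unset
    "lambda": describe(lambda servers: servers),  # any callable passes
    "string": describe("localhost"),              # not callable
}
```

The validator only checks that the value is callable. It does not inspect the signature or the return type, so a selector that returns the wrong kind of object would only fail later, during server selection.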

pymongo/mongo_client.py

Lines changed: 15 additions & 3 deletions
@@ -215,6 +215,12 @@ def __init__(
             milliseconds) the driver will wait during server monitoring when
             connecting a new socket to a server before concluding the server
             is unavailable. Defaults to ``20000`` (20 seconds).
+          - `server_selector`: (callable or None) Optional, user-provided
+            function that augments server selection rules. The function should
+            accept as an argument a list of
+            :class:`~pymongo.server_description.ServerDescription` objects and
+            return a list of server descriptions that should be considered
+            suitable for the desired operation.
           - `serverSelectionTimeoutMS`: (integer) Controls how long (in
             milliseconds) the driver will wait to find an available,
             appropriate server to carry out a database operation; while it is
@@ -331,6 +337,8 @@ def __init__(
             is set, it must be a positive integer greater than or equal to
             90 seconds.
 
+          .. seealso:: :doc:`/examples/server_selection`
+
           | **Authentication:**
 
           - `username`: A string.
@@ -411,13 +419,16 @@ def __init__(
 
         .. mongodoc:: connections
 
-        .. versionchanged:: 3.6
-           Added support for mongodb+srv:// URIs.
-           Added the ``retryWrites`` keyword argument and URI option.
+        .. versionchanged:: 3.8
+           Added the ``server_selector`` keyword argument.
 
         .. versionchanged:: 3.7
            Added the ``driver`` keyword argument.
 
+        .. versionchanged:: 3.6
+           Added support for mongodb+srv:// URIs.
+           Added the ``retryWrites`` keyword argument and URI option.
+
         .. versionchanged:: 3.5
            Add ``username`` and ``password`` options. Document the
            ``authSource``, ``authMechanism``, and ``authMechanismProperties ``
@@ -572,6 +583,7 @@ def __init__(
             condition_class=condition_class,
             local_threshold_ms=options.local_threshold_ms,
             server_selection_timeout=options.server_selection_timeout,
+            server_selector=options.server_selector,
             heartbeat_frequency=options.heartbeat_frequency)
 
         self._topology = Topology(self._topology_settings)

pymongo/settings.py

Lines changed: 7 additions & 1 deletion
@@ -35,7 +35,8 @@ def __init__(self,
                  condition_class=None,
                  local_threshold_ms=LOCAL_THRESHOLD_MS,
                  server_selection_timeout=SERVER_SELECTION_TIMEOUT,
-                 heartbeat_frequency=common.HEARTBEAT_FREQUENCY):
+                 heartbeat_frequency=common.HEARTBEAT_FREQUENCY,
+                 server_selector=None):
         """Represent MongoClient's configuration.
 
         Take a list of (host, port) pairs and optional replica set name.
@@ -53,6 +54,7 @@ def __init__(self,
         self._condition_class = condition_class or threading.Condition
         self._local_threshold_ms = local_threshold_ms
         self._server_selection_timeout = server_selection_timeout
+        self._server_selector = server_selector
         self._heartbeat_frequency = heartbeat_frequency
         self._direct = (len(self._seeds) == 1 and not replica_set_name)
         self._topology_id = ObjectId()
@@ -90,6 +92,10 @@ def local_threshold_ms(self):
     def server_selection_timeout(self):
         return self._server_selection_timeout
 
+    @property
+    def server_selector(self):
+        return self._server_selector
+
     @property
     def heartbeat_frequency(self):
         return self._heartbeat_frequency

pymongo/topology.py

Lines changed: 3 additions & 2 deletions
@@ -190,7 +190,7 @@ def _select_servers_loop(self, selector, timeout, address):
         now = _time()
         end_time = now + timeout
         server_descriptions = self._description.apply_selector(
-            selector, address)
+            selector, address, custom_selector=self._settings.server_selector)
 
         while not server_descriptions:
             # No suitable servers.
@@ -209,7 +209,8 @@ def _select_servers_loop(self, selector, timeout, address):
             self._description.check_compatible()
             now = _time()
             server_descriptions = self._description.apply_selector(
-                selector, address)
+                selector, address,
+                custom_selector=self._settings.server_selector)
 
         self._description.check_compatible()
         return server_descriptions

pymongo/topology_description.py

Lines changed: 12 additions & 7 deletions
@@ -214,7 +214,7 @@ def common_wire_version(self):
     def heartbeat_frequency(self):
         return self._topology_settings.heartbeat_frequency
 
-    def apply_selector(self, selector, address):
+    def apply_selector(self, selector, address, custom_selector=None):
 
         def apply_local_threshold(selection):
             if not selection:
@@ -239,18 +239,23 @@ def apply_local_threshold(selection):
                                              common_wv))
 
         if self.topology_type == TOPOLOGY_TYPE.Single:
-            # Ignore the selector.
+            # Ignore selectors for standalone.
             return self.known_servers
         elif address:
+            # Ignore selectors when explicit address is requested.
             description = self.server_descriptions().get(address)
             return [description] if description else []
         elif self.topology_type == TOPOLOGY_TYPE.Sharded:
-            # Ignore the read preference, but apply localThresholdMS.
-            return apply_local_threshold(
-                Selection.from_topology_description(self))
+            # Ignore read preference.
+            selection = Selection.from_topology_description(self)
         else:
-            return apply_local_threshold(
-                selector(Selection.from_topology_description(self)))
+            selection = selector(Selection.from_topology_description(self))
+
+        # Apply custom selector followed by localThresholdMS.
+        if custom_selector is not None and selection:
+            selection = selection.with_server_descriptions(
+                custom_selector(selection.server_descriptions))
+        return apply_local_threshold(selection)
 
     def has_readable_server(self, read_preference=ReadPreference.PRIMARY):
         """Does this topology have any readable servers available matching the
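
Note on the ordering in ``apply_selector``: the custom selector runs before the ``localThresholdMS`` window, so the window is computed relative to the fastest server that the custom selector kept. The sketch below illustrates that ordering outside pymongo. ``Server``, ``apply_selectors`` and ``prefer_localhost`` are hypothetical names, the first filtering stage (read preference or writable servers) is assumed to have already produced ``candidates``, the threshold logic follows the prose description in the new documentation rather than the body of ``apply_local_threshold`` (which this diff does not show), and the default of 15 ms is an assumption for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for ServerDescription: an address plus a
# round trip time in seconds.
Server = namedtuple("Server", ["address", "round_trip_time"])


def apply_selectors(candidates, custom_selector=None, local_threshold_ms=15):
    """Sketch of the ordering: custom selector first, then latency window."""
    selection = list(candidates)
    # The custom selector is skipped when earlier filtering left nothing.
    if custom_selector is not None and selection:
        selection = list(custom_selector(selection))
    if not selection:
        return []
    # Keep servers within the threshold of the fastest remaining server.
    fastest = min(s.round_trip_time for s in selection)
    threshold = local_threshold_ms / 1000.0
    return [s for s in selection
            if (s.round_trip_time - fastest) <= threshold]


def prefer_localhost(servers):
    """Hypothetical custom selector mirroring the documented example."""
    local = [s for s in servers if s.address[0] == "localhost"]
    return local or servers


near = Server(("localhost", 27017), 0.050)
fast = Server(("a.example.com", 27017), 0.002)
slow = Server(("b.example.com", 27017), 0.040)

without_custom = apply_selectors([near, fast, slow])
with_custom = apply_selectors([near, fast, slow], prefer_localhost)

# Record whether the custom selector is invoked for an empty selection.
calls = []
empty = apply_selectors([], lambda servers: calls.append(servers) or servers)
```

In this made-up data set the ``localhost`` server is the slowest one. Without the custom selector only ``fast`` survives the latency window, while with it ``near`` is the sole candidate. That is the kind of trade-off the documentation's warning about degraded performance refers to.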
