
Conversation

@fruch

@fruch fruch commented Mar 11, 2020

No description provided.

@fruch fruch requested review from bentsi and vladzcloudius March 12, 2020 07:07
@fruch fruch self-assigned this Mar 12, 2020
@vladzcloudius

Please don't squash commits - you can always do it later.
Please put a proper description of what each patch is about.

@ultrabug
Collaborator

ultrabug commented Apr 8, 2020

Hi guys, as discussed on Slack I've pushed the changes necessary for a more graceful shard-aware connection pooling ;)

@fruch
Author

fruch commented Apr 13, 2020

@vladzcloudius care to take a look at the test in 5b636fc?

I guess you have some more ideas on how/what should be tested in that regard.

@vladzcloudius

@fruch I'd like to see how the driver behaves when you tear connections down: send a RST from the remote side.
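A minimal sketch of how a test could provoke such a teardown, assuming a plain TCP socket under the test's control (this is not part of the driver's API): enabling SO_LINGER with a zero timeout makes `close()` abort the connection, so the peer receives a RST instead of a normal FIN.

```python
import socket
import struct

def close_with_rst(sock: socket.socket) -> None:
    """Close a socket so the peer sees a TCP RST rather than a graceful FIN.

    l_onoff=1, l_linger=0 tells the kernel to abort the connection on close().
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack('ii', 1, 0))
    sock.close()
```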

@dimaqq dimaqq mentioned this pull request Jun 2, 2020
fruch added 3 commits June 2, 2020 13:33
* unit test for parsing and shard_id calculation
* integration test using trace to verify requests going to the correct shard
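A minimal sketch of the kind of unit test the first item describes. The formula follows Scylla's publicly documented shard selection (bias the signed Murmur3 token into uint64 space, drop the ignored most-significant bits, scale into the shard range); the helper name and the test itself are illustrative assumptions, not the driver's actual code.

```python
import unittest

def shard_id_from_token(token: int, nr_shards: int, ignore_msb_bits: int) -> int:
    """Map a signed Murmur3 token to a shard id (illustrative implementation)."""
    biased = (token + 2**63) & 0xFFFFFFFFFFFFFFFF        # signed -> uint64
    biased = (biased << ignore_msb_bits) & 0xFFFFFFFFFFFFFFFF  # drop ignored MSBs
    return (biased * nr_shards) >> 64                     # scale into [0, nr_shards)

class ShardIdCalculationTest(unittest.TestCase):
    def test_tokens_map_into_shard_range(self):
        nr_shards, ignore_msb_bits = 12, 12
        for token in (-2**63, -1, 0, 1, 2**63 - 1):
            shard = shard_id_from_token(token, nr_shards, ignore_msb_bits)
            self.assertTrue(0 <= shard < nr_shards)

if __name__ == '__main__':
    unittest.main()
```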
Since the callback is async, to be on the safe side, add a lock around the `set.remove()` call.
fruch and others added 15 commits June 2, 2020 13:33
Refactor the logic of re-opening connections to all shards into its own function so we can reuse it in `_replace()`.
Based on @ultrabug's comment, change the debug print to be clearer and more informative.
Based on @ultrabug's comments, this should be named better.
Better explain why we loop at most twice over the shards in this function.
The protocol does not (yet?) allow clients to specify which shard they want to connect to, so they depend on a round-robin allocation of the shard_id made by the host node (see the system.clients table). This means that on a live cluster where client connections come and go, we cannot guarantee that we will get a connection to every shard... even by retrying twice (which also slows down connection startup). This commit switches to an optimistic approach where we initially try to connect as many times as there are shards on the remote cluster. Then, when a routing_key is used and shard-aware connection picking can take place, we try to open missing connections as we detect them. This is more graceful: it allows us to not fail if we miss shard connections, and it reduces connection startup time! A long-running client will hopefully get a connection to all shards after a while. That's the best we can do for now until the protocol evolves.
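A minimal sketch of that optimistic strategy. The class, the injected `open_connection` / `schedule` callables, and the attribute names are hypothetical stand-ins for the driver's pool internals, not its real API.

```python
class ShardAwarePool:
    """Optimistically keep one connection per shard, filling gaps lazily."""

    def __init__(self, nr_shards, open_connection, schedule):
        self.nr_shards = nr_shards
        self._open_connection = open_connection  # hypothetical: returns an object with .shard_id
        self._schedule = schedule                # hypothetical: runs a callable in the background
        self._connections = {}                   # shard_id -> connection

    def connect_initial(self):
        # Optimistic: as many attempts as there are shards, no retry loop;
        # the node assigns shards round-robin, so some shards may stay unconnected.
        for _ in range(self.nr_shards):
            conn = self._open_connection()
            self._connections.setdefault(conn.shard_id, conn)

    def borrow_connection(self, shard_id):
        conn = self._connections.get(shard_id)
        if conn is None:
            # Miss detected via the routing key: lazily schedule one more attempt
            # for this shard and fall back to any existing connection for now.
            self._schedule(self._open_missing_shard, shard_id)
            conn = next(iter(self._connections.values()), None)
        return conn

    def _open_missing_shard(self, shard_id):
        conn = self._open_connection()
        if conn.shard_id not in self._connections:
            self._connections[conn.shard_id] = conn
        else:
            conn.close()  # landed on a shard we already cover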
Initial connection attempts to shards are now scheduled instead of blocking on startup. This allows the shard-aware driver to connect to a Scylla cluster as fast as to a Cassandra one!
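A minimal sketch of the scheduling idea, assuming a generic `ThreadPoolExecutor` and a hypothetical `open_connection_to_shard` callable; the driver's actual scheduling machinery differs.

```python
from concurrent.futures import ThreadPoolExecutor

def schedule_initial_shard_connections(executor, host, nr_shards, open_connection_to_shard):
    """Submit one connection attempt per remote shard without blocking startup.

    open_connection_to_shard(host, shard_id) is a hypothetical callable that
    opens a single connection; startup continues while the attempts run.
    """
    return [executor.submit(open_connection_to_shard, host, shard_id)
            for shard_id in range(nr_shards)]

# Illustrative usage:
# executor = ThreadPoolExecutor(max_workers=4)
# futures = schedule_initial_shard_connections(executor, host, nr_shards, open_connection_to_shard)
```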
* add test for multiple clients at the same time
* add test for closing connections
* add test for blocking (timing out) connections
On busy systems we could overwhelm the thread pool executor queue with repeated submissions of speculative shard connections to the same shard, which results in an unbounded number of connection openings to Scylla nodes. This could be seen as a "connection leak" and also did not respect the signal of cluster connections shutting down.
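A minimal sketch of the guard described above, with hypothetical names: record which shard attempts are already in flight so the same shard is not submitted to the executor repeatedly, and stop scheduling once shutdown has been signalled.

```python
import threading

class SpeculativeShardConnector:
    """Avoid flooding the executor with duplicate shard connection attempts."""

    def __init__(self, executor, open_connection_to_shard):
        self._executor = executor
        self._open = open_connection_to_shard  # hypothetical callable(shard_id)
        self._lock = threading.Lock()
        self._in_flight = set()                # shard_ids currently being opened
        self._shutdown = False

    def maybe_connect(self, shard_id):
        with self._lock:
            if self._shutdown or shard_id in self._in_flight:
                return  # already being opened, or the cluster is shutting down
            self._in_flight.add(shard_id)
        self._executor.submit(self._connect, shard_id)

    def _connect(self, shard_id):
        try:
            self._open(shard_id)
        finally:
            # Runs on an executor thread, so removal is guarded by the lock.
            with self._lock:
                self._in_flight.discard(shard_id)

    def shutdown(self):
        with self._lock:
            self._shutdown = True
```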
@fruch
Author

fruch commented Jun 2, 2020

@ultrabug FYI, I've rebased and force-pushed this branch (not very social of me... but I want Travis to start working on top of this PR too).

@fruch fruch merged commit a49555b into scylladb:master Jun 4, 2020
