-The issue is a logical problem in IP networking called "Diamond Routing"; it is not unique to NLB or specific to CockroachDB. It occurs when a client believes it is talking to two different servers but is actually talking to the same one. In AWS, an NLB has a different IP address in each AZ, and a DNS lookup performed by a client returns the set of all IP addresses that belong to the NLB. When a client chooses different IP addresses for different connections, it may reuse the same source port for both, which is legal because the two destinations differ. However, because of *cross zone load balancing*, both connections may in fact be routed to the same backend server (e.g. a CockroachDB node). Because of client IP preservation, the CockroachDB node then sees packets from both connections arriving with the same source IP address and source port, so they appear to belong to a single TCP socket. The client and the server end up disagreeing about connection state, and one of the connections closes unexpectedly.
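
The collision described above can be illustrated with a minimal sketch. It only models the address tuples and does not open real sockets. All IP addresses, the source port, and the helper names (`FourTuple`, `server_view`, `find_collisions`) are hypothetical and chosen for illustration; the listener is assumed to use CockroachDB's default SQL port, 26257.

```python
from collections import defaultdict
from typing import NamedTuple


class FourTuple(NamedTuple):
    """The (source, destination) address pair that identifies a TCP connection."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Hypothetical addresses, for illustration only.
CLIENT_IP = "10.0.0.5"
NLB_IPS = ["10.0.1.10", "10.0.2.10"]  # one NLB address per AZ
BACKEND = ("10.0.3.20", 26257)        # a single CockroachDB node


def server_view(conn: FourTuple) -> FourTuple:
    """Return the tuple as the backend would see it.

    Assumes client IP preservation (source left untouched) and that
    cross-zone load balancing sent the flow to BACKEND.
    """
    return FourTuple(conn.src_ip, conn.src_port, BACKEND[0], BACKEND[1])


def find_collisions(conns):
    """Group client connections that look identical from the backend."""
    seen = defaultdict(list)
    for conn in conns:
        seen[server_view(conn)].append(conn)
    return {key: group for key, group in seen.items() if len(group) > 1}


# The client reuses source port 40000 for two different NLB addresses.
# From the client's side these are two distinct 4-tuples.
client_conns = [FourTuple(CLIENT_IP, 40000, ip, 26257) for ip in NLB_IPS]

collisions = find_collisions(client_conns)
for key, group in collisions.items():
    print(key, "<-", len(group), "client connections")
```

In this sketch the two client-side tuples differ only in destination IP, so once the load balancer rewrites the destination to the same backend, the node can no longer tell them apart.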