I manage several Oracle cloud compute instances (all Ubuntu servers) located in different regions.
All VMs have iptables set to allow ingress traffic on port 22, and the same ingress rules are applied to the VCN's security list.
I used to be able to establish connections between hosts (ssh, rsync, scp) but it is getting increasingly difficult as more and more connections between individual hosts fail after timing out. I say increasingly because a connection from hostA (Amsterdam) to hostB (Frankfurt) that was working yesterday no longer works today. The behavior appears to be expanding to more and more hosts (or maybe host pairs?), and I cannot establish a pattern.
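To look for a pattern, a small probe along the following lines could be run from each host against all the others. This is only a sketch: `probe` and `probe_all` are hypothetical helper names, the host list would need to be filled in with real addresses, and the 3-second timeout is an arbitrary assumption. It separates a refused connection from a timeout, which is the distinction that matters here.

```python
import socket


def probe(host, port=22, timeout=3.0):
    """Classify one TCP connect attempt as 'open', 'refused', 'timeout' or 'error'."""
    try:
        # create_connection performs the full three-way handshake, then we close.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer (or a firewall rejecting with RST) answered: nothing listening.
        return "refused"
    except (socket.timeout, TimeoutError):
        # No usable answer within the timeout: request or response was lost.
        return "timeout"
    except OSError:
        # Anything else, e.g. name resolution failure or no route to host.
        return "error"


def probe_all(hosts, port=22, timeout=3.0):
    """Probe every host in `hosts` and return a {host: result} mapping."""
    return {host: probe(host, port, timeout) for host in hosts}
```

Calling `probe_all([...])` with the other VMs' addresses on every host and collecting the results into a table would show whether the failures follow particular hosts, particular regions, or particular host pairs.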
Attempts to connect from hostA to hostB stop here and eventually time out:
hostA$ ssh -vvv hostB
debug1: Connecting to hostB [xx.xx.xx.xx] port 22
debug3: set_sock_tos: set socket 3 IP_TOS 0x10
Examining the TCP packets on either end shows the following:
hostA# tcpdump -i any tcp port 22 and host hostB and 'tcp[tcpflags] & (tcp-syn) != 0'
08:04:02.280667 ens3 Out IP hostA.50672 > hostB: Flags [S], seq 1743155524, win 62720, options [mss 8960,sackOK,TS val 2010161184 ecr 0,nop,wscale 7], length 0
08:04:02.286807 ens3 In IP hostB > hostA.50672: Flags [S.], seq 2691905561, ack 1743155525, win 62636, options [mss 8960,sackOK,TS val 2267917840 ecr 2010161184,nop,wscale 7], length 0
08:04:03.284668 ens3 Out IP hostA.50672 > hostB: Flags [S], seq 1743155524, win 62720, options [mss 8960,sackOK,TS val 2010162189 ecr 0,nop,wscale 7], length 0
08:04:03.290862 ens3 In IP hostB > hostA.50672: Flags [S.], seq 2691905561, ack 1743155525, win 62636, options [mss 8960,sackOK,TS val 2267918844 ecr 2010161184,nop,wscale 7], length 0
hostB# tcpdump -i any tcp port 22 and host hostA and 'tcp[tcpflags] & (tcp-syn) != 0'
09:04:02.284606 enp0s3 In IP hostA.50672 > hostB: Flags [S], seq 1743155524, win 62720, options [mss 8960,sackOK,TS val 2010161184 ecr 0,nop,wscale 7], length 0
09:04:02.284651 enp0s3 Out IP hostB > hostA.50672: Flags [S.], seq 2691905561, ack 1743155525, win 62636, options [mss 8960,sackOK,TS val 2267917840 ecr 2010161184,nop,wscale 7], length 0
09:04:03.288627 enp0s3 In IP hostA.50672 > hostB: Flags [S], seq 1743155524, win 62720, options [mss 8960,sackOK,TS val 2010162189 ecr 0,nop,wscale 7], length 0
09:04:03.288668 enp0s3 Out IP hostB > hostA.50672: Flags [S.], seq 2691905561, ack 1743155525, win 62636, options [mss 8960,sackOK,TS val 2267918844 ecr 2010161184,nop,wscale 7], length 0
The above sequence repeats on both sides about once per second until the timeout. My understanding of TCP dumps is limited, but it looks as though hostA's SYN is answered by hostB with a SYN-ACK; hostA receives the SYN-ACK but, instead of completing the handshake with an ACK, sends out another identical SYN.
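To make that reading of the capture less dependent on eyeballing, the text output of tcpdump can be checked mechanically. The following is a minimal sketch, not a general tcpdump parser: `find_stalled_handshakes` is a hypothetical helper, and it assumes the line format shown above, with absolute sequence numbers on the SYN and SYN-ACK packets. It reports every SYN that was sent again after a matching SYN-ACK had already been captured.

```python
import re

# Matches e.g. "IP hostA.50672 > hostB: Flags [S], seq 1743155524"
PACKET_RE = re.compile(r"IP (\S+) > (\S+): Flags \[([^\]]+)\], seq (\d+)")
ACK_RE = re.compile(r"ack (\d+)")


def find_stalled_handshakes(lines):
    """Return SYN sequence numbers retransmitted after their SYN-ACK was captured."""
    syn_seen = set()   # sequence numbers of SYNs seen so far
    answered = set()   # SYN sequence numbers for which a SYN-ACK was captured
    stalled = []       # SYNs that were repeated even though they were answered
    for line in lines:
        match = PACKET_RE.search(line)
        if not match:
            continue
        flags, seq = match.group(3), int(match.group(4))
        if flags == "S":
            if seq in answered and seq not in stalled:
                stalled.append(seq)
            syn_seen.add(seq)
        elif flags == "S.":
            ack = ACK_RE.search(line)
            # A SYN-ACK acknowledges the SYN's sequence number plus one.
            if ack and int(ack.group(1)) - 1 in syn_seen:
                answered.add(int(ack.group(1)) - 1)
    return stalled


capture = [
    "08:04:02.280667 ens3 Out IP hostA.50672 > hostB: Flags [S], seq 1743155524, win 62720, length 0",
    "08:04:02.286807 ens3 In IP hostB > hostA.50672: Flags [S.], seq 2691905561, ack 1743155525, win 62636, length 0",
    "08:04:03.284668 ens3 Out IP hostA.50672 > hostB: Flags [S], seq 1743155524, win 62720, length 0",
]
print(find_stalled_handshakes(capture))
```

A non-empty result on the hostA capture would mean the SYN-ACK reaches hostA's interface but is not accepted by hostA's TCP stack, which points at something on hostA itself (for example local packet filtering) rather than at the path between the hosts.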
Also, the authorization log shows no signs of hostA access attempts:
hostB# tail -f /var/log/auth.log ...
Note that I can still successfully ssh into all VMs from either my personal computer or non-Oracle hosts.
I would much appreciate any pointers.
-n -i eth0
) to check. A timeout indicates one of two things: either the request didn't arrive or the response didn't, and it doesn't tell you which.