Description
Problem
When a peer restarts with a new identity from the same IP:port, the gateway doesn't detect the identity change and continues trying to use the old session encryption keys. The new peer's handshake packets are silently dropped or fail decryption, preventing reconnection.
Impact
- Peers behind NAT cannot reconnect after restart without also restarting the gateway
- In production, this would require gateway restarts whenever peers restart, which is not scalable
- The gateway accumulates stale connection entries
Recommended Approach: Test First
Before implementing a fix, create a test that reproduces this failure.
Suggested test location: crates/core/tests/ or extend freenet-test-network
#[tokio::test]
async fn test_peer_reconnect_after_restart() {
    // 1. Start a gateway
    // 2. Start a peer, connect to gateway, note its identity
    // 3. Disconnect the peer, clear its state
    // 4. Restart peer (new identity, same IP:port)
    // 5. Assert: peer successfully reconnects within reasonable timeout
    // 6. Assert: gateway has updated peer identity in its connection table
}

This test should FAIL with current code, then PASS after the fix.
Steps to Reproduce
- Start a gateway with debug logging:
  RUST_LOG="info,freenet::transport=debug" freenet network --is-gateway ...
- Start a peer that connects through the gateway (e.g., from behind NAT)
- Note the peer's identity (e.g., 5AMPifZWGRfoydocq) and source IP:port (e.g., 136.62.52.28:43227)
- Kill the peer, clean its state (rm -rf ~/.local/share/freenet/), and restart it
  - The new peer will have a different identity (e.g., gHKWtWM62CJgyM1U)
  - But same source IP:port due to NAT
- Observe the new peer fails to connect with "max connection attempts reached"
Evidence from Logs
Gateway thinks it has connection to OLD identity:

connect_peer: transport entry already has pub_key, tx: 01KCAPRFZHCHZ3KSW076W38100, peer_addr: 136.62.52.28:43227, existing_pub_key: Some(5AMPifZWGRfoydocq)

New peer (different identity) fails handshake:

Outbound handshake failed: max connection attempts reached, peer_addr: 5.9.111.215:31337, attempts: 22, elapsed_ms: 3142, direction: "outbound"

tcpdump shows packets ARE flowing both ways:

19:58:02.220516 eno1 In IP 136.62.52.28.43227 > 5.9.111.215.31337: UDP, length 256
19:58:02.222066 eno1 Out IP 5.9.111.215.31337 > 136.62.52.28.43227: UDP, length 74

But the gateway logs show NO inbound messages from the new peer - the packets are being dropped at the transport layer because they can't decrypt with the old session keys.
Expected Behavior
When the gateway receives handshake packets that don't decrypt with the existing session:
- Detect this is a new peer identity attempting to connect
- Invalidate the old session for that IP:port
- Establish a new encrypted session with the new identity
- Complete the connection handshake
Environment
- Gateway: freenet 0.1.44 (git: b75ea56-dirty)
- Peer: freenet 0.1.44 (from crates.io)
- Peer behind NAT (technic.locut.us -> 136.62.52.28:43227)
- Gateway on public IP (nova.locut.us -> 5.9.111.215:31337)
Why CI Didn't Catch This
This bug requires:
- A peer behind NAT (or same source IP:port after restart)
- Peer restart with new identity
- Multi-node test infrastructure
CI likely tests with freenet local or simulated networks that don't exercise NAT traversal or peer restart scenarios.
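One way a local test might approximate the "same source IP:port after restart" condition without real NAT is to drop the peer's UDP socket and bind a fresh one to the same loopback address. The sketch below uses only the Rust standard library; the function names are hypothetical and the payload strings merely stand in for the old and new identities. It does not touch freenet's transport code.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};
use std::time::Duration;

/// Drop `old` and bind a fresh socket to the address it was using,
/// imitating a restarted peer that keeps its source IP:port.
fn restart_on_same_addr(old: UdpSocket) -> io::Result<UdpSocket> {
    let addr = old.local_addr()?;
    drop(old); // releases the port so it can be bound again
    UdpSocket::bind(addr)
}

/// Returns the source address the "gateway" socket observes before and
/// after the simulated peer restart.
fn observed_source_addrs() -> io::Result<(SocketAddr, SocketAddr)> {
    let gateway = UdpSocket::bind("127.0.0.1:0")?;
    // Avoid hanging forever if a datagram is lost.
    gateway.set_read_timeout(Some(Duration::from_secs(2)))?;
    let gateway_addr = gateway.local_addr()?;
    let mut buf = [0u8; 64];

    // First incarnation of the peer (stand-in for the old identity).
    let peer = UdpSocket::bind("127.0.0.1:0")?;
    peer.send_to(b"old-identity", gateway_addr)?;
    let (_, before) = gateway.recv_from(&mut buf)?;

    // "Restart": new socket, same address (stand-in for the new identity).
    let peer = restart_on_same_addr(peer)?;
    peer.send_to(b"new-identity", gateway_addr)?;
    let (_, after) = gateway.recv_from(&mut buf)?;

    Ok((before, after))
}

fn main() -> io::Result<()> {
    let (before, after) = observed_source_addrs()?;
    assert_eq!(before, after);
    println!("gateway saw the same source address both times: {before}");
    Ok(())
}
```

A reconnect test could combine this rebind trick with a fresh key pair for the second incarnation, which should be enough to exercise the stale-session path on a single machine. There is a small race if another process claims the port between the drop and the rebind, so a test using it may need a retry.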
Suggested Fix
In the transport/connection handler, when receiving packets from a known IP:port that fail decryption:
- Check if this looks like a new handshake initiation
- If so, clear the stale session and attempt fresh handshake
- Add logging for "stale session detected, resetting connection"
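The steps above can be sketched as a small, self-contained model. Everything here is an assumption made for illustration: `SessionTable`, `Session`, `Packet`, and `Outcome` are hypothetical types, not freenet's actual transport types, and a one-byte `key_tag` comparison stands in for real decryption.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// Hypothetical stand-in for an established encrypted session.
#[derive(Debug, Clone, PartialEq)]
struct Session {
    peer_id: String,
    key: u8, // toy substitute for a symmetric session key
}

/// Hypothetical packet model; `key_tag` matching the session key
/// plays the role of "decryption succeeded".
struct Packet {
    is_handshake_init: bool,
    key_tag: u8,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Delivered,
    /// Stale session evicted; the caller should start a fresh handshake.
    SessionReset { old_peer: String },
    NewHandshake,
    Dropped,
}

#[derive(Default)]
struct SessionTable {
    sessions: HashMap<SocketAddr, Session>,
}

impl SessionTable {
    fn on_packet(&mut self, from: SocketAddr, pkt: &Packet) -> Outcome {
        // Does the packet "decrypt" under the session we hold for this address?
        let decrypts = self.sessions.get(&from).map(|s| s.key == pkt.key_tag);
        match (decrypts, pkt.is_handshake_init) {
            (Some(true), _) => Outcome::Delivered,
            (Some(false), true) => {
                // Known address, failed decryption, looks like a handshake:
                // treat the existing session as stale and clear it.
                let old_peer = self
                    .sessions
                    .remove(&from)
                    .map(|s| s.peer_id)
                    .unwrap_or_default();
                eprintln!(
                    "stale session detected, resetting connection: addr={from}, old_peer={old_peer}"
                );
                Outcome::SessionReset { old_peer }
            }
            (None, true) => Outcome::NewHandshake,
            // Undecryptable non-handshake traffic is still dropped.
            (Some(false), false) | (None, false) => Outcome::Dropped,
        }
    }
}

fn main() {
    let addr: SocketAddr = "136.62.52.28:43227".parse().unwrap();
    let mut table = SessionTable::default();
    table.sessions.insert(
        addr,
        Session { peer_id: "5AMPifZWGRfoydocq".into(), key: 1 },
    );

    // Restarted peer: same address, different key material, handshake packet.
    let pkt = Packet { is_handshake_init: true, key_tag: 2 };
    println!("{:?}", table.on_packet(addr, &pkt));
}
```

A real fix would probably also want to authenticate the new handshake before evicting the old session, so that a spoofed packet from the same address cannot tear down a healthy connection.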
[AI-assisted - Claude]