When a T-Mobile "femto-cell" tries to establish its IPv4 IPsec tunnel to the T-Mobile provisioning servers, the 4640-byte return packet is silently dropped by the in-kernel NAT, even though it "matches" the outbound packet sent less than 100 ms earlier.
All other operations of the firewall seem to be functioning as expected. This includes iPhones using "WiFi Calling", which uses similar IPsec connections to T-Mobile servers (though fragmentation has not been seen on those connections). The connection for the femto-cell can be handled successfully by a Linux/netfilter NAT. Proper reassembly of the packet fragments within the firewall, at the start of the rule set, has been confirmed with ngtee and wireshark.
Is there a known issue with large packets and in-kernel NAT?
Edit: Yes, there is a 4k packet-size limit for ipfw+nat and ng_nat, at least with 11.1-RELEASE-p9 and -p10.
Until the fix is MFC'd and included in a RELEASE build, see https://svnweb.freebsd.org/base?view=revision&revision=335133 (committed to CURRENT on 2018-06-14) for a patch to apply to STABLE or RELEASE. Big "thanks" to Andrey V. Elsukov for the insight and fix.
The only sysctl that I found that seemed related was the UDP timeout. For good measure I upped it to 30 (seconds), but that did not change the behavior.
Are there known causes and/or resolutions for this behavior?
---
Diagnosis has been a challenge, as there does not seem to be a way to enable logging of dropped packets with the in-kernel NAT, nor have I been able to find a way to examine the NAT table (and I don't see a call to do so described in libalias(3)).
Logical flow and packet progress through the firewall have been instrumented with ngtee and ng_iface nodes, for both the "in" pass and the "out" pass through the firewall. Pre- and post-NAT packets are captured in the rules immediately before and immediately after the nat rules.
The logical flow and NAT appear to be operating as expected, with the exception of the drop of the "IKE_AUTH MID=01 Responder Response" packet (as wireshark describes it).
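
One way to check whether the drop depends on packet size rather than on the IKE content would be a plain UDP echo probe run through the same NAT. The following is only a sketch under assumptions: `serve_echo` and `probe` are hypothetical helper names, the echo side would have to run on a host outside the NAT, and the host name and port in the usage comment are placeholders. Because the echo returns the same payload it received, the large datagram crosses the NAT in both directions.

```python
import socket


def serve_echo(sock, count):
    """Echo `count` UDP datagrams back to their senders (run outside the NAT)."""
    for _ in range(count):
        data, addr = sock.recvfrom(65535)
        sock.sendto(data, addr)


def probe(host, port, size, timeout=2.0):
    """Send `size` bytes of UDP payload; return True if the echo comes back intact."""
    payload = bytes(size)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(payload, (host, port))
        try:
            data, _ = s.recvfrom(65535)
        except socket.timeout:
            # No reply within the timeout: the datagram or its echo was lost.
            return False
        return data == payload


# Hypothetical usage from a host behind the NAT, with sizes on either
# side of 4096 bytes:
#   for size in (2000, 4000, 4200, 4600):
#       print(size, probe("echo.example.net", 4500, size))
```

If the smaller sizes come back and the larger ones time out, that would point at a size-dependent drop in the NAT path rather than anything specific to IKE.
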
The initial IKE_SA_INIT exchange on UDP 500 proceeds as expected:
- 532 bytes IKE_SA_INIT MID=00 Initiator Request
  - Received at inside interface ${device_IP}:500 => some_server.t-mobile.com:500
  - NAT outbound to ${outside_IP}:500 => some_server.t-mobile.com:500
  - Sent to router via outside interface
- 533 bytes IKE_SA_INIT MID=00 Responder Response
  - Received at outside interface some_server.t-mobile.com:500 => ${outside_IP}:500
  - NAT inbound to some_server.t-mobile.com:500 => ${device_IP}:500
  - Sent to device via inside interface
- 2112 bytes IKE_AUTH MID=01 Initiator Request
  - Received as two fragments, 1504 and 632 bytes, at inside interface ${device_IP}:4500 => some_server.t-mobile.com:4500
  - Reassembled by ipfw; wireshark indicates properly reassembled on the ngtee debug interface
  - NAT outbound to ${outside_IP}:4500 => some_server.t-mobile.com:4500
  - Sent to router via outside interface
- 4640 bytes IKE_AUTH MID=01 Responder Response
  - Received as four fragments, 1504, 1504, 1504, and 200 bytes, at outside interface some_server.t-mobile.com:4500 => ${outside_IP}:4500
  - Reassembled by ipfw; wireshark indicates properly reassembled on the ngtee debug interface
  - NAT inbound -- packet not seen after NAT
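
The sizes listed above are consistent with ordinary IPv4 fragmentation at a 1500-byte MTU if each captured length includes 4 bytes of capture framing. That 4-byte figure is an assumption inferred from the numbers, not something confirmed in the captures, and the 20-byte IPv4 header assumes no IP options. Under those assumptions, a minimal sketch of the arithmetic (the function names are hypothetical) shows that the reassembled response is the only packet in the exchange whose IP length exceeds 4096 bytes:

```python
IP_HEADER = 20       # assumes an IPv4 header without options
CAPTURE_HEADER = 4   # assumption: framing added by the capture, inferred from the sizes


def ip_length(captured_len):
    """IP datagram length once the assumed capture framing is removed."""
    return captured_len - CAPTURE_HEADER


def fragment_sizes(captured_len, mtu=1500):
    """Captured sizes of the fragments of one datagram at the given MTU."""
    payload = ip_length(captured_len) - IP_HEADER
    # Every fragment except the last carries a multiple of 8 payload bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    sizes = []
    while payload > 0:
        chunk = min(per_fragment, payload)
        sizes.append(chunk + IP_HEADER + CAPTURE_HEADER)
        payload -= chunk
    return sizes


print(fragment_sizes(2112), ip_length(2112))  # [1504, 632] 2108
print(fragment_sizes(4640), ip_length(4640))  # [1504, 1504, 1504, 200] 4636
```

The 2112-byte request reassembles to 2108 bytes of IP and passes through the NAT, while the 4640-byte response reassembles to 4636 bytes, which is above 4096. How the kernel actually measures the limit is not stated here, so the comparison against 4096 is only indicative.
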
11.1-RELEASE-p9 FreeBSD 11.1-RELEASE-p9 #0: Tue Apr 3 16:59:16 UTC 2018 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
Additional keywords: Jumbo frame, frag, fragmentation