On Fri, 24 Jan 2025 12:40:16 -0500
Jon Maloy <jmaloy(a)redhat.com> wrote:
I can certainly clear tp->pred_flags and post it again, maybe with
an improved and shortened log. Would that be acceptable?
Talking about an improved log, what strikes me the most about the whole
problem is:
$ tshark -r iperf3_jon_zero_window.pcap -td -Y 'frame.number in { 1064 .. 1068 }'
1064 0.004416 192.168.122.1 → 192.168.122.198 TCP 65534 34482 → 5201 [ACK] Seq=1611679466 Ack=1 Win=36864 Len=65480
1065 0.007334 192.168.122.1 → 192.168.122.198 TCP 65534 34482 → 5201 [ACK] Seq=1611744946 Ack=1 Win=36864 Len=65480
1066 0.005104 192.168.122.1 → 192.168.122.198 TCP 56382 [TCP Window Full] 34482 → 5201 [ACK] Seq=1611810426 Ack=1 Win=36864 Len=56328
1067 0.015226 192.168.122.198 → 192.168.122.1 TCP 54 [TCP ZeroWindow] 5201 → 34482 [ACK] Seq=1 Ack=1611090146 Win=0 Len=0
1068 6.298138 fe80::44b3:f5ff:fe86:c529 → ff02::2 ICMPv6 70 Router Solicitation from 46:b3:f5:86:c5:29
...and then silence: 192.168.122.198 never announces that its window
is no longer zero, so the peer gives up 15 seconds later:
$ tshark -r iperf3_jon_zero_window_cut.pcap -td -Y 'frame.number in { 1069 .. 1070 }'
1069 8.709313 192.168.122.1 → 192.168.122.198 TCP 55 34466 → 5201 [ACK] Seq=166 Ack=5 Win=36864 Len=1
1070 0.008943 192.168.122.198 → 192.168.122.1 TCP 54 5201 → 34482 [FIN, ACK] Seq=1 Ack=1611090146 Win=778240 Len=0
Data in frame #1069 is iperf3 ending the test.
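As a quick sanity check of the "15 seconds" figure, here is a minimal
Python sketch that adds up the delta times from the two listings above.
It assumes that each "-td" value is the time since the previous captured
frame, and that frames 1067 to 1070 are numbered the same way in both
capture files. The dictionary and variable names are only illustrative.

```python
# Delta times (seconds) copied from the two tshark listings above.
# Assumption: with "-td", each value is the time elapsed since the
# previous captured frame, and frames 1067..1070 are consecutive.
deltas = {
    1068: 6.298138,  # Router Solicitation, first frame after the zero-window ACK
    1069: 8.709313,  # iperf3 control connection ending the test
    1070: 0.008943,  # FIN, ACK from 192.168.122.198
}

# Accumulate the time elapsed since frame 1067 (the zero-window ACK).
elapsed = 0.0
since_zero_window = {}
for frame in sorted(deltas):
    elapsed += deltas[frame]
    since_zero_window[frame] = elapsed

for frame, seconds in since_zero_window.items():
    print(f"frame {frame}: {seconds:.6f} s after frame 1067")
```

Under those assumptions, frame 1069 comes out at roughly 15 seconds
after the zero-window ACK, with no window update in between.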
This didn't happen before e2142825c120 ("net: tcp: send zero-window
ACK when no memory") so it's a relatively recent (17 months) regression.
It actually looks pretty simple (and rather serious) to me.
I remember that last time it also took me some time to fully follow
it. Packetdrill should be helpful :)
As to the patch itself, I agreed with this fix last time, but now I
have to re-read that long analysis to recall as much of it as possible.
I'm not that sure this bug belongs to the Linux kernel: isn't the issue
caused by the other side not sending a window probe...? The other part
of me says we cannot break the user's behaviour.
One way or another, I will also take a look at it again.
Thanks,
Jason