TCP ephemeral port re-use in milliseconds
I got involved with troubleshooting an IE app that was frequently halting and then resuming. I did some checking and noticed that it halts at the same time the workstation tries to connect to the server and fails: the server never responds to the TCP SYN request or its two retransmits. OK, I thought, this looks pretty straightforward, but things got interesting when I put a sniffer at the server end. What I found was Vista trying to create a new connection using a source port that it had just finished using milliseconds before. So of course the firewall sees this SYN request on a port where the old session is still in place and blocks it. Now, I am pretty sure an ephemeral port should not be reused so quickly; I would expect reuse only after the range wraps, and it has 16K ports to get through.
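To put a rough number on that expectation, here is a small Python sketch. It assumes the Vista default dynamic port range of 49152 to 65535 and a strictly sequential allocator; the connection rate in the example is a made-up figure, not something measured in this case.

```python
# Assumption: Vista's default dynamic (ephemeral) port range, 49152-65535.
EPHEMERAL_FIRST = 49152
EPHEMERAL_LAST = 65535


def ports_in_range(first=EPHEMERAL_FIRST, last=EPHEMERAL_LAST):
    """Number of ports in an inclusive port range."""
    return last - first + 1


def seconds_until_wrap(new_connections_per_second,
                       first=EPHEMERAL_FIRST, last=EPHEMERAL_LAST):
    """Time before a strictly sequential allocator would hand out
    the same port a second time (assumes one port per new connection)."""
    return ports_in_range(first, last) / new_connections_per_second


if __name__ == "__main__":
    print(ports_in_range())        # 16384, the "16K ports" mentioned above
    print(seconds_until_wrap(50))  # hypothetical rate of 50 new sessions/second
```

Even at a fairly high session rate, sequential allocation would take minutes to come back around to the same port, which is why reuse within milliseconds looked wrong.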
The aggravating condition is that the IE cache mode is set to a particularly inefficient mode; this results in a high rate of new TCP sessions, and that is when the halting occurs. Putting the IE cache in auto mode seems to resolve the halting.
I thought I would share this interesting observation.
-Wes
March 9th, 2011 3:41am
Was the new connection to the same IP or a different IP? If it's the
same IP then this wouldn't be allowed, but I'd be interested in getting
a sniffer on the client side to see exactly what is happening.
Normally a localIP:localPORT:remoteIP:remotePORT must be unique, and
terminated sessions will sit in TIME_WAIT to ensure that another
connection cannot be started in a way that might cause confusion (and/or
the original session must be dropped on both ends, in which case
TIME_WAIT can be avoided).
However, if the connection is to a different IP (perhaps a multihomed
server?) then this is technically valid, although I'm not really sure
that Vista does reuse source ports that quickly.
Either way, I wouldn't really expect to see Vista reusing ports
particularly aggressively, but it's possible some NAT box between the
two machines might be doing something strange.
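The uniqueness rule described above can be sketched in a few lines of Python. This is a toy model, not how any real stack is implemented: the `ConnectionTable` class, its method names, and the 120-second TIME_WAIT value are illustrative assumptions, and the IP addresses are documentation placeholders.

```python
class ConnectionTable:
    """Toy model of one endpoint: tracks 4-tuples that are in use
    or that this endpoint is still holding in TIME_WAIT."""

    def __init__(self, time_wait_seconds=120.0):  # assumed value, illustration only
        self.time_wait_seconds = time_wait_seconds
        self.active = set()
        self.time_wait_until = {}  # 4-tuple -> time at which TIME_WAIT expires

    def can_open(self, four_tuple, now):
        """A 4-tuple is usable only if it is neither active nor in TIME_WAIT."""
        if four_tuple in self.active:
            return False
        expires = self.time_wait_until.get(four_tuple)
        return expires is None or now >= expires

    def open(self, four_tuple, now):
        if not self.can_open(four_tuple, now):
            raise ValueError("4-tuple still in use or in TIME_WAIT")
        self.time_wait_until.pop(four_tuple, None)
        self.active.add(four_tuple)

    def close(self, four_tuple, now):
        """Model this endpoint closing first, so it enters TIME_WAIT."""
        self.active.remove(four_tuple)
        self.time_wait_until[four_tuple] = now + self.time_wait_seconds


if __name__ == "__main__":
    table = ConnectionTable()
    same_server = ("192.0.2.10", 60415, "198.51.100.5", 443)
    other_server = ("192.0.2.10", 60415, "198.51.100.6", 443)

    table.open(same_server, now=0.0)
    table.close(same_server, now=0.016)

    print(table.can_open(same_server, now=0.035))   # False: same 4-tuple, still in TIME_WAIT
    print(table.can_open(other_server, now=0.035))  # True: different remote IP, different 4-tuple
    print(table.can_open(same_server, now=200.0))   # True: TIME_WAIT has expired
```

The model only covers the endpoint that holds TIME_WAIT; the other endpoint of the same connection is not bound by it.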
March 9th, 2011 10:34am
Here is the view from the Workstation as logged by Netmon 3.4. The Server and Workstation IPs are the same throughout.
Each frame is preceded by its timestamp in seconds (everything up to the second SYN happens in much less than 1 second).
Session created, workstation uses a source port of 60415
450.5810302
Wks:60415 Srv:443
TCP:Flags=......S., SrcPort=60415, DstPort=HTTPS(443), Seq=2379229223, Ack=0
450.5829386
Srv:443 Wks:60415
TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=60415, PayloadLen=0, Seq=3059853230, Ack=2379229224
clip...
450.5972144
Srv:443 Wks:60415
TCP:Flags=...A...F, SrcPort=HTTPS(443), DstPort=60415, Seq=3059853573, Ack=2379230098
450.5975433
Wks:60415 Srv:443
TCP:Flags=...A...., SrcPort=60415, DstPort=HTTPS(443), Seq=2379230098, Ack=3059853574
450.5976363
Wks:60415 Srv:443
TCP:Flags=...A...F, SrcPort=60415, DstPort=HTTPS(443), Seq=2379230098, Ack=3059853574
session FIN'd, clip..
Try to establish a new session, again with a source port of 60415
450.6163085
Wks:60415 Srv:443
TCP:Flags=......S., SrcPort=60415, DstPort=HTTPS(443), Seq=2379248771, Ack=0,
clip...
453.6156913
Wks:60415 Srv:443
TCP:[SynReTransmit #110239]Flags=......S., SrcPort=60415, DstPort=HTTPS(443)
I wonder how the source port is selected?
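The gaps in the capture above can be worked out with a short Python sketch. The timestamps and flags are copied from the trace; treating the timestamps as seconds is an assumption, and the `FRAMES` list and helper names are just for illustration.

```python
from decimal import Decimal

# (timestamp, direction, flags) copied from the capture; all frames use
# workstation port 60415 and server port 443.
FRAMES = [
    ("450.5810302", "wks->srv", "SYN"),
    ("450.5829386", "srv->wks", "SYN-ACK"),
    ("450.5972144", "srv->wks", "FIN-ACK"),
    ("450.5975433", "wks->srv", "ACK"),
    ("450.5976363", "wks->srv", "FIN-ACK"),
    ("450.6163085", "wks->srv", "SYN"),
    ("453.6156913", "wks->srv", "SYN"),  # flagged SynReTransmit by Netmon
]


def time_of(frames, flags, occurrence):
    """Timestamp of the n-th workstation-to-server frame with the given flags."""
    matches = [Decimal(t) for t, direction, f in frames
               if direction == "wks->srv" and f == flags]
    return matches[occurrence]


def gap_ms(earlier, later):
    """Gap between two timestamps in milliseconds (timestamps assumed to be seconds)."""
    return (later - earlier) * 1000


if __name__ == "__main__":
    client_fin = time_of(FRAMES, "FIN-ACK", 0)
    second_syn = time_of(FRAMES, "SYN", 1)
    retransmit = time_of(FRAMES, "SYN", 2)

    print(gap_ms(client_fin, second_syn))  # about 18.7 ms between the FIN and the reused port
    print(gap_ms(second_syn, retransmit))  # about 2999 ms before the SYN is retransmitted
```

So the same source port goes back on the wire roughly 19 ms after the workstation's FIN, and the unanswered SYN is retried about 3 seconds later.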
-Wes
March 9th, 2011 10:24pm
But wait, there's more.
Today we enabled tracing (WinInet - Analytics). In that log you can see connections being established, including the source port. What's interesting is that the duplicate port we see being put on the wire never appears in the trace log. So it would seem that somewhere between where the WinInet trace is taken and where Netmon captures the frames, the source port is changed. I checked the NIC driver version and it's a few years old; tomorrow we will change it and see if that fixes it.
-Wes
March 11th, 2011 4:28am
Finally figured this out. It's called TIME_WAIT assassination. Read about it here:
http://blogs.technet.com/b/networking/archive/2010/08/11/how-tcp-time-wait-assassination-works.aspx
This is what we were experiencing. We have mitigated the problem by changing the keep-alives on the reverse proxy, which reduces the number of TCP sessions created by the client.
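For anyone following along: in the capture earlier in the thread the server sent its FIN first, so under standard TCP the lingering TIME_WAIT state sits on the server side and the workstation is free to pick port 60415 again. One condition commonly described for letting a new SYN replace a connection in TIME_WAIT is that the SYN's sequence number is later than the last one seen on the old connection. The Python sketch below applies only that comparison to the sequence numbers from the capture. `seq_gt` and `syn_may_reopen` are hypothetical helper names, and whether Windows or the firewall in this case applies exactly this rule is an assumption, not something established in this thread.

```python
def seq_gt(a, b):
    """True if 32-bit TCP sequence number a is 'after' b,
    using modulo-2**32 arithmetic so wrap-around is handled."""
    diff = (a - b) & 0xFFFFFFFF
    return 0 < diff < 0x80000000


def syn_may_reopen(syn_seq, last_seq_seen):
    """Hypothetical check: does a SYN arriving for a 4-tuple in TIME_WAIT
    carry a sequence number newer than anything seen on the old connection?"""
    return seq_gt(syn_seq, last_seq_seen)


if __name__ == "__main__":
    # Values from the capture: the workstation's FIN had Seq=2379230098
    # (the FIN itself consumes one sequence number), and the new SYN
    # on the same port had Seq=2379248771.
    last_seq_seen = 2379230098 + 1
    new_syn_seq = 2379248771

    print(syn_may_reopen(new_syn_seq, last_seq_seen))  # True
    print(syn_may_reopen(last_seq_seen, new_syn_seq))  # False
```

By this comparison the new SYN looks like a genuinely newer connection rather than an old duplicate, which fits the idea that an endpoint could accept it while a stateful device in the path, still tracking the old session, drops it.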
-Wes
March 30th, 2011 8:57pm