NETLOGON 5719 and DHCP 50024, applied KB2459530 and Broadcast flag set to 1, still getting timeouts

Good afternoon,

I've bumped into this issue, and have yet to find a good solution.

The environment

Catalyst 3560G running 15.0(1)SE with the following port config

 switchport access vlan <some number>
 switchport mode access
 spanning-tree portfast
 spanning-tree bpduguard enable

Catalyst 6509 w/ VS-SUP2T-10G running 15.1(2)SY with WS-X6148E-GE-45AT blades and the following port config

 switchport
 switchport access vlan <some number>
 switchport mode access
 logging event link-status
 spanning-tree portfast edge
 spanning-tree bpduguard enable

The vlan interface on the 6509 is configured as...

 ip address 10.xx.xx.252 255.255.255.0
 ip broadcast-address 10.xx.xx.255
 ip helper-address <ip of DHCP server>
 ip helper-address <ip of SCCM server>
 no ip redirects
 ip directed-broadcast
 ip pim sparse-dense-mode

We are using ip helper on the switches. There is no 802.1x configuration that might be fiddling with port settings.

I've been reproducing this problem against both switches to rule out network differences.

Monitor ports have been configured on the switches so we can watch the traffic to/from the workstations that are experiencing the DHCP timeouts.

The endpoints are workstations running Windows 7 with SP1.

The problem

We're seeing lots of NETLOGON 5719 errors on boot up. This is breaking group policy processing and a few other boot time processes. The root cause appears to be DHCP requests timing out, which is visible in the DHCP Operational Log as Event ID 50024. I need to find out why the requests time out and fix it so our endpoints start working as expected.

The tests performed

I've taken a sample of machines that consistently exhibit problems. Some have the Gigabyte GA-890GPA-UD3H and others a Gigabyte F2A88XM-D3H. Both systems use the onboard Realtek NIC. From the PCI IDs, they use the exact same NIC.

F2A88XM-D3H -    PCI\VEN_10EC&DEV_8168&SUBSYS_E0001458
GA-890GPA-UD3H - PCI\VEN_10EC&DEV_8168&SUBSYS_E0001458

I've tested with two driver versions from Realtek: 7.73.618.2013 and 7.92.115.2015 (current as of 2015-05-22).

I've already read up on and deployed the hotfix from KB2459530. Checking the file versions on stuff like dhcpcore.dll and friends confirms the hotfix is installed. KB2459530 also talks about manually tweaking the DhcpGlobalForceBroadcastFlag and DhcpConnForceBroadcastFlag values. I've made the required changes and confirmed via the monitor port that requests are leaving the workstation with the Broadcast flag set to 1, instead of Unicast (0).
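
For anyone who wants to script the same check against a capture: the Broadcast flag is the high bit of the 16-bit flags field in the BOOTP/DHCP header (RFC 2131), which sits at byte offset 10 of the UDP payload. The following is a minimal Python sketch, assuming the DHCP payload has already been extracted from the captured frame. The function name and the sample header are illustrative only and are not taken from the captures described here.

```python
import struct

# RFC 2131: the most significant bit of the 16-bit 'flags' field is the
# Broadcast flag; the remaining 15 bits must be zero.
BROADCAST_BIT = 0x8000
FLAGS_OFFSET = 10  # op(1) + htype(1) + hlen(1) + hops(1) + xid(4) + secs(2)


def dhcp_broadcast_flag(bootp_payload: bytes) -> bool:
    """Return True if the Broadcast flag is set in a BOOTP/DHCP payload.

    Hypothetical helper: expects the UDP payload only, without the
    Ethernet, IP, or UDP headers.
    """
    if len(bootp_payload) < FLAGS_OFFSET + 2:
        raise ValueError("payload too short to contain the flags field")
    (flags,) = struct.unpack_from("!H", bootp_payload, FLAGS_OFFSET)
    return bool(flags & BROADCAST_BIT)


# Made-up 12-byte header prefix: BOOTREQUEST, Ethernet, hlen 6, 0 hops,
# an arbitrary transaction ID, 0 seconds elapsed, then the flags field.
with_broadcast = struct.pack("!BBBBIHH", 1, 1, 6, 0, 0x12345678, 0, 0x8000)
with_unicast = struct.pack("!BBBBIHH", 1, 1, 6, 0, 0x12345678, 0, 0x0000)

print(dhcp_broadcast_flag(with_broadcast))
print(dhcp_broadcast_flag(with_unicast))
```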

All of that said, I am still seeing inconsistencies between the workstation DHCP operational event log and the captured traffic on the monitor port.

Here is an example that is consistent across all the test machines...

5/22/2015 1:00:26 PM         50044 Information      Inform ack is received in the adapter 11.
5/22/2015 1:00:26 PM         50018 Information      Inform is sent in the adapter 11. Status code is 0x0
5/22/2015 1:00:26 PM         50058 Information      Your computer was successfully assigned an address from the network, and it can now connect to other computers.
5/22/2015 1:00:26 PM         50042 Information      Dns registration has happened for the adapter 11. Status Code is 0x0. DNS Flag settings is 64.
5/22/2015 1:00:26 PM         50028 Information      Address 10.40.250.2 is plumbed to the adapter 11. Status code is 0x0
5/22/2015 1:00:23 PM         50063 Information      Dhcp has notified NLA for the configuration changes for the interface 11
5/22/2015 1:00:23 PM         50035 Information      Routes are updated in the adapter 11. Status Code is 0x0
5/22/2015 1:00:23 PM         50059 Information      Route is added with the values Dest = 0.0.0.0, DestMask = 0.0.0.0, NextHop = 10.40.250.254, Address = 10.40.250.2
5/22/2015 1:00:23 PM         60000 Information      PERFTRACK (Request-Ack): Address confirmed for the adapter 11.Confirmed Address is 10.40.250.2.Server address is 10.0.10.21
5/22/2015 1:00:23 PM         60010 Information      PERFTRACK (Request-Ack): Address confirmed for the adapter 11.Confirmed Address is 10.40.250.2.Server address is 10.0.10.21
5/22/2015 1:00:23 PM         50013 Information      Ack is accepted in the adapter 11. Received Address is 10.40.250.2.Server address is 10.0.10.21
5/22/2015 1:00:23 PM         50012 Information      Request is sent from the adapter 11. Status code is 0x0
5/22/2015 1:00:23 PM         50024 Warning          Ack Receive Timeout has happened in the Interface Id 11
5/22/2015 1:00:20 PM         50012 Information      Request is sent from the adapter 11. Status code is 0x0
5/22/2015 1:00:20 PM         50006 Information      Request-Ack is initiated on the adapter with Interface Id 11
5/22/2015 1:00:20 PM         60018 Information      PERFTRACK (DHCPv4): Media Connect on adapter 11
5/22/2015 1:00:20 PM         60019 Information      PERFTRACK (DHCPv4): End of Media Connect on adapter 11
5/22/2015 1:00:20 PM         50025 Information      Cancelling pending renewals on the adapter in the Interface Id 11
5/22/2015 1:00:20 PM         50033 Information      An interface is added whose interface index is 11 and Status Code is 0x0.
5/22/2015 1:00:20 PM         50004 Information      Dhcp is enabled on the adapter with Interface Id 11
5/22/2015 1:00:20 PM         50001 Information      Media Connect notification received with Interface Id 11
5/22/2015 1:00:20 PM         50002 Information      Media Disconnect notification received with Interface Id 11
5/22/2015 1:00:20 PM         50001 Information      Media Connect notification received with Interface Id 1

The initial request (Event 50012) was sent at 1:00:20. The timeout is reached at 1:00:23 (Event 50024) and the request is subsequently resent (50012). The second request gets a response and the DHCP service binds the provided IP to the interface.
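
The request/timeout pattern above can also be pulled out of an exported log automatically. This is a rough Python sketch, assuming the log is exported as text in the same column layout shown above (timestamp, event ID, level, message, newest entry first). The function names and the regular expression are my own and are not part of any Windows tooling.

```python
import re
from datetime import datetime

# Matches lines such as:
# "5/22/2015 1:00:23 PM   50024 Warning   Ack Receive Timeout has happened ..."
LINE_RE = re.compile(r"^(\d+/\d+/\d+ \d+:\d+:\d+ [AP]M)\s+(\d+)\s+(\w+)\s+(.*)$")


def parse_events(text):
    """Parse exported log text into (time, event_id, message) tuples, oldest first."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank or unrecognised lines
        when = datetime.strptime(m.group(1), "%m/%d/%Y %I:%M:%S %p")
        events.append((when, int(m.group(2)), m.group(4)))
    events.reverse()  # the export lists the newest entry first
    return events


def find_ack_timeouts(events):
    """Pair each 50024 (Ack Receive Timeout) with the latest preceding 50012 (Request sent)."""
    pairs = []
    last_request = None
    for when, event_id, _message in events:
        if event_id == 50012:
            last_request = when
        elif event_id == 50024 and last_request is not None:
            pairs.append((last_request, when))
    return pairs


SAMPLE = """\
5/22/2015 1:00:23 PM         50012 Information      Request is sent from the adapter 11. Status code is 0x0
5/22/2015 1:00:23 PM         50024 Warning          Ack Receive Timeout has happened in the Interface Id 11
5/22/2015 1:00:20 PM         50012 Information      Request is sent from the adapter 11. Status code is 0x0
"""

pairs = find_ack_timeouts(parse_events(SAMPLE))
for sent, timed_out in pairs:
    print(sent.time(), "->", timed_out.time(), (timed_out - sent).total_seconds(), "seconds")
```

Run against the three relevant lines from the log above, this reports the single request sent at 1:00:20 PM that timed out at 1:00:23 PM.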

However, on the monitored port, Wireshark doesn't see ANY of the traffic from 1:00:20. The first DHCP Request we see on the wire is at 1:00:23. The rest of the conversation in Wireshark matches what is listed in the Event log.
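
The comparison between the event log and the capture can be expressed as a small matching step. Here is a Python sketch, assuming both sets of timestamps are available as `datetime` values and that a one-second tolerance is reasonable given the one-second resolution of the event log. The tolerance and the function name are assumptions; the timestamps are the ones quoted in this post.

```python
from datetime import datetime, timedelta


def unmatched_sends(logged, captured, tolerance=timedelta(seconds=1)):
    """Return logged send times that have no captured packet within the tolerance.

    Each captured packet is matched to at most one logged send.
    """
    remaining = sorted(captured)
    missing = []
    for sent in sorted(logged):
        match = next((c for c in remaining if abs(c - sent) <= tolerance), None)
        if match is None:
            missing.append(sent)
        else:
            remaining.remove(match)  # do not reuse this packet for another send
    return missing


# Event 50012 entries from the DHCP operational log.
logged_requests = [
    datetime(2015, 5, 22, 13, 0, 20),
    datetime(2015, 5, 22, 13, 0, 23),
]
# DHCP Requests seen by Wireshark on the monitor port.
captured_requests = [
    datetime(2015, 5, 22, 13, 0, 23),
]

missing = unmatched_sends(logged_requests, captured_requests)
for sent in missing:
    print("logged but not on the wire:", sent.time())
```

With these values only the 1:00:20 PM request is reported as missing from the wire.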

I have confirmed that the switches and workstations are pulling NTP from the same source, so the timestamps in Wireshark can be compared directly with the event log entries.

With the Realtek drivers, I have experimented with Energy Efficient Ethernet (EEE, 802.3az) and Green Ethernet with no change in results. They remain disabled while we continue testing.

So, although the symptoms match those described in KB2459530, that hotfix addresses a case where the DHCP request is sent with the Broadcast flag set to 0 and the Windows Firewall drops the DHCP ACK. Since I don't even see the initial request on the wire, I do not think my problem is resolved by KB2459530.

Has anyone else seen problems like this? Any additional information would be helpful.

Thank you,

-nils


May 22nd, 2015 10:01pm

Just for more information, I just tried the Realtek LAN driver from Gigabyte's support site, as it's a different version (7.082.0317.2014) than I had previously tested with. Same settings, with EEE and Green Ethernet disabled and the DHCP Broadcast flag still set.

Same results. The event log indicates a request (50012) was sent at 3:16:43 PM, the timeout (50024) was hit at 3:16:48 PM, and the second request (50012) was sent at 3:16:48 PM, which succeeded. Wireshark doesn't see the first request sent at 43 seconds, only the second request at 48 seconds.

May 22nd, 2015 10:21pm

I would also like to report that when I force the switch port to auto-negotiate only 10/100, this problem completely disappears.

May 22nd, 2015 10:36pm

Hi nf_,

So the issue is resolved by changing the port's negotiation settings?

I am glad the issue has been resolved, and thanks for the update. It will be a useful reference for anyone who comes across a similar issue in the future.

Best regards

May 25th, 2015 3:16am

This topic is archived. No further replies will be accepted.
