ping domain.local failed when 1 DC is down

Hi,

We have an environment with 3 DCs. DC1 and DC2 are in the same site, and DC3 is in a different site.

We did some testing and shut down DC1 (this DC also holds all the FSMO roles). Some clients were able to ping domain.local, since the name resolved to DC2 or DC3 for them. But some clients were not able to ping domain.local, since they were pointing to DC1. Shouldn't the "system" automatically answer from either DC2 or DC3? How do the clients/servers decide which DC to use when they ping domain.local? I shouldn't have to reboot the clients that could not ping domain.local, should I?

Hope you understand my question.

March 21st, 2015 6:32am

Hello

Check the client DNS settings with ipconfig /all
The DNS Servers entry needs to list all the DCs

March 21st, 2015 11:51am

Domain controllers register DNS records in order to be discovered by Active Directory clients (workstations and servers that are members of the domain). This is explained here: How DNS Support for Active Directory Works https://technet.microsoft.com/en-us/library/cc759550%28v=ws.10%29.aspx?f=255&MSPPError=-2147217396

As you can read in this article, the clients locate their DC through DNS requests, using a specific type of record called SRV records. You can see them in your DNS administration console under the node _msdcs.

Now, the DCs also register records for clients which are not able to use SRV records (poor things... SRV records have been around for decades... but well, it happens). So DCs also register an A record for the name of the domain, for example contoso.com. You can see those records in your DNS administration console; they show up with the mention <same as parent folder> just under the node of your domain.

When a client is looking for a DC for authentication, or for LDAP queries for example, it does not use the A record contoso.com. It uses the SRV records. So technically, if there were no A records for contoso.com, your workstations and servers would still work perfectly fine (unless they host applications which explicitly use the A record, but no Windows component does). You can do the test: delete those A records for contoso.com and reboot your workstation. You'll still be able to log on, get your group policies, and even use the Users and Computers console etc.
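To make the SRV lookup more concrete, here is a minimal Python sketch that builds the well-known record names a client queries and the matching nslookup command lines. The helper names dc_srv_names and nslookup_command are made up for illustration, and the site name Default-First-Site-Name is only an example value; substitute your own domain and site.

```python
def dc_srv_names(domain, site=None):
    """Return the DNS names under which DCs register their LDAP SRV records."""
    names = []
    if site:
        # Site-specific record: lists only the DCs covering that AD site.
        names.append(f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}")
    # Domain-wide record: lists every DC of the domain.
    names.append(f"_ldap._tcp.dc._msdcs.{domain}")
    return names


def nslookup_command(name):
    """Build the nslookup command line that queries one SRV record."""
    return ["nslookup", "-type=SRV", name]


if __name__ == "__main__":
    # Example values only; nothing is executed, the commands are just printed.
    for name in dc_srv_names("contoso.com", site="Default-First-Site-Name"):
        print(" ".join(nslookup_command(name)))
```

Running the printed commands on a client shows which DCs are registered for the site and for the whole domain, independently of the A record that ping uses.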

Now, what you see is due to the cache of the DNS client. When you ask for contoso.com the first time, the DNS server returns the IPs of all DCs of your domain (technically, this is also customizable: you can tell a DC not to register this A record). The client tries the first address in the list and then caches the association between the name and that IP. So if the DC goes down, the ping command will still try the cached DC until the cache expires. You can run ipconfig /flushdns and try to ping again (you might have to run ipconfig /flushdns several times in case the DNS server puts the offline DC in first position again).
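If an application of yours really has to rely on the A records of the domain name, it can walk the whole list of returned addresses instead of stopping at the first one like ping does. The following is only a sketch: the function names are hypothetical, the IP addresses in the demo are invented, and using a TCP connection to port 389 (LDAP) as the "is this DC up" test is an assumption, not something Windows does for you.

```python
import socket


def resolve_all(name, port=389):
    """Return every distinct IPv4 address the resolver gives for name."""
    infos = socket.getaddrinfo(name, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    addresses = []
    for info in infos:
        ip = info[4][0]  # sockaddr is (ip, port) for AF_INET
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def tcp_probe(ip, port=389, timeout=2.0):
    """True if a TCP connection to ip:port succeeds within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(addresses, probe=tcp_probe):
    """Return the first address the probe accepts, or None if all fail."""
    for ip in addresses:
        if probe(ip):
            return ip
    return None


if __name__ == "__main__":
    # Offline demo with invented addresses: the first DC is "down".
    fake_addresses = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    down = {"10.0.0.1"}
    print(first_reachable(fake_addresses, probe=lambda ip: ip not in down))
```

On a real client you would call first_reachable(resolve_all("contoso.com")). The result still depends on what the local DNS cache hands back, so this does not replace the DC Locator.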

So what can you do to make sure you can always reach a DC? Well it is simply the wrong command. Don't use PING but NLTEST. For example the following:

nltest /dsgetdc:contoso.com

If this still shows the offline DC, it means that no application has asked for a DC since you turned that one off. In this case, you can wait a bit and try the command again to see which DC you are currently using, or you can force your client to refresh its "DC cache" with the following:

nltest /dsgetdc:contoso.com /force
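If you want to check this from a script on many clients, a small wrapper around nltest can pull out the DC name. This is a rough Python sketch: the function names are hypothetical, the SAMPLE text is an invented illustration of the "label: value" layout nltest prints (not output captured from a real system), and the parser assumes that layout holds on your Windows version.

```python
import subprocess

# Invented sample in the "label: value" layout that nltest /dsgetdc prints.
SAMPLE = r"""
           DC: \\DC2.contoso.com
      Address: \\192.168.1.12
     Dom Name: contoso.com
 Dc Site Name: Default-First-Site-Name
The command completed successfully
"""


def parse_dsgetdc(output):
    """Turn 'label: value' lines into a dict, dropping leading backslashes."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip().lstrip("\\")
    return fields


def current_dc(domain, force=False):
    """Run nltest on a Windows client and return the parsed fields."""
    cmd = ["nltest", f"/dsgetdc:{domain}"]
    if force:
        cmd.append("/force")  # ask the locator to rediscover a DC
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_dsgetdc(result.stdout)


if __name__ == "__main__":
    # Parses the invented sample only; no command is executed here.
    print(parse_dsgetdc(SAMPLE)["DC"])
```

On a domain member you would call current_dc("contoso.com", force=True) and compare the "DC" field against the DC you shut down.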

Hope this clears up the situation :)

March 21st, 2015 5:02pm


We did some testing and shut down DC1 (this DC also holds all the FSMO roles). Some clients were able to ping domain.local, since the name resolved to DC2 or DC3 for them. But some clients were not able to ping domain.local, since they were pointing to DC1. Shouldn't the "system" automatically answer from either DC2 or DC3? How do the clients/servers decide which DC to use when they ping domain.local?

I believe they are pointing to DC1 because they have cached the name in their DNS cache. Try running the command below from an elevated command prompt and see if the issue still persists.

ipconfig /flushdns
March 22nd, 2015 3:14am

1. The same site does not mean the same network. The DCs and DNS servers should be reachable from every client computer.

2. You have not mentioned the placement of DNS, and there is no information on the DNS configuration.

3. Test your system with dcdiag. Make sure that the content of AD/DNS is replicated properly.

4. Make sure that there are no errors that point to your problem.

5. nslookup is your primary tool when resolving problems in this category. It will give you info on all the records that are important for the flawless functioning of all computers.

Regards

Milos

March 22nd, 2015 3:30am

The DC Locator process is what clients use to locate the closest available DC. I have documented how it works here: http://social.technet.microsoft.com/wiki/contents/articles/24457.how-domain-controllers-are-located-in-windows.aspx

I would recommend that the IP settings of your DCs follow the recommendations I shared here, so that they can update their DNS records with no issues: http://www.ahmedmalek.com/web/fr/articles.asp?artid=23

For your clients, make sure that they point to your 3 DCs as DNS servers and that no public DNS server is configured in their IP settings. Also, make sure that you have no blocked ports or filtering between your clients and DCs.
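The DNS part of that check is easy to script once you have collected the DNS server list of a client (for example from ipconfig /all). A minimal Python sketch, where the function name and all addresses are invented for illustration:

```python
def check_dns_servers(configured, dc_addresses):
    """Compare a client's DNS server list with the list of DC addresses."""
    configured_set = set(configured)
    dc_set = set(dc_addresses)
    return {
        # DCs the client does not use as DNS server.
        "missing_dcs": sorted(dc_set - configured_set),
        # Configured servers that are not DCs (e.g. a public resolver).
        "non_dc_servers": sorted(configured_set - dc_set),
    }


if __name__ == "__main__":
    # Invented example: the client lacks DC3 and also uses a public resolver.
    dcs = ["10.0.0.1", "10.0.0.2", "10.0.1.1"]
    client = ["10.0.0.1", "10.0.0.2", "8.8.8.8"]
    print(check_dns_servers(client, dcs))
```

Both lists should come back empty for a client that is configured as described above.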

March 22nd, 2015 5:31pm

Thanks for all the replies, it is now clear to me how it works.

March 23rd, 2015 2:42am

This topic is archived. No further replies will be accepted.
