DC/GC server hangs/reboots causing Exchange 2007 authentication issues
We have 12 in-site domain controllers/global catalog servers for our multiple CCR cluster environment. In the past few months we have repeatedly seen the DC being used by an Exchange cluster freeze and need a reboot, which then causes a large number (400-2,000) of Kerberos authentication errors (event ID 7) for clients connected to the cluster. Clients are prompted to authenticate to SERVER/USERNAME instead of DOMAIN/USERNAME, which obviously causes confusion.

As we have 12 DCs to choose from, is it expected behavior for Exchange to get stuck on a particular DC that is having a problem, and for authentication issues to result when it determines the server is no longer available? Wouldn't it quickly switch to any of the other DCs to maintain uninterrupted service? As this has happened a few times in the past few months, it has become a highly visible problem and something we don't expect with the level of redundancy we have in place. Can this behavior be modified at all, or is it something that's automatically managed by Exchange?

Additionally, we find that both of our Exchange CCR clusters, which are physically located in different sites, use the same DC at the same time. This of course does not help when that DC has a problem. Is this how Exchange is designed to function, or is it again something that can be changed?
April 24th, 2012 10:20am

You didn't share the Exchange version, service pack or hotfix level. I've heard of cases like yours being reported, but I've personally not experienced what you describe in any of my organizations. My customers seem to be able to patch and reboot domain controllers without issue. It could be that the nature of your DC problem doesn't present itself to Exchange as an outright failure, and Exchange goes on thinking things are working. I would think that your focus should be on correcting whatever it is that's causing your DCs to "freeze" because that is definitely not normal behavior.

Ed Crowley MVP
"There are seldom good technological solutions to behavioral problems."
April 24th, 2012 10:27am

Sorry, we are on Exchange 2007 SP3 RU6. The DC freezing is an issue the AD folks are actively working on and should be resolved at some point, but regardless we would like to understand the correct behavior of Exchange in these situations. We can find no documentation that explains it, such as how long Exchange will keep trying to use a DC before it changes to another.
April 24th, 2012 10:43am

It is my experience that Exchange can take several minutes before it switches to a working domain controller from a failed one.

Ed Crowley MVP
"There are seldom good technological solutions to behavioral problems."
April 24th, 2012 10:13pm

I think it is important for you to find and fix the issues with the problem DC/GC that the Exchange servers are using or talking to. Exchange requires a healthy Active Directory, and a healthy DC/GC is a *must* for running Exchange without errors.

An Exchange server will use the domain controllers/GCs within its own site first. How does a server know which site it belongs to? It looks at its own IP address and subnet mask and tries to connect to and authenticate with a DC within that subnet.

How to troubleshoot the DC-related issues? A hanging DC can have several causes. When the DC hangs, does it blue-screen, or does it just stop processing GC lookups? Once again, getting to the bottom of this will fix your Exchange-related issues. Things to look at:

Run DCDiag against the problem DC and fix the warnings. At a command prompt on the Exchange server, run:
    dcdiag /s:<Domain Controller Name>

Use the following command to verify that a domain controller can be located in the local site:
    nltest /dsgetdc: /site:<local site name>

Run the more comprehensive, verbose DCDiag tests:
    dcdiag /v /c /d /e

Run Repadmin to check for replication-related issues:
    repadmin /showrepl

Check the Application log for related events.

If the server hangs, you might have to generate a memory dump to determine the cause of the hang more precisely. In general, I have seen outdated NIC drivers, patches, NIC teaming and the like cause problems.

How to Generate a Memory Dump File When a Server Stops Responding (Hangs)
http://support.microsoft.com/kb/303021

Investigate event logs
http://technet.microsoft.com/en-us/library/bb218748.aspx

Good Luck
ocd
Oz Casey, Dedeal MCITP (EMA), MCITP (EA), MCITP (SA)
Visit smtp25.blogspot.com
Visit Telnet25.wordpress.com
This posting is provided AS-IS with no warranties or guarantees and confers no rights.
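
As a rough complement to the checks above, here is a minimal Python sketch that tests whether each DC still accepts TCP connections on the standard LDAP (389) and global catalog (3268) ports. It is not an Exchange or Active Directory tool, and the function names `port_open` and `probe_dcs` are made up for this illustration. Note the limitation: a DC that is "frozen" may still accept a TCP connection while failing to answer queries, so a passing result here does not prove the DC is healthy. It only catches outright unavailability.

```python
import socket

LDAP_PORT = 389   # standard LDAP port
GC_PORT = 3268    # standard global catalog port


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def probe_dcs(hosts, ports=(LDAP_PORT, GC_PORT), timeout=3.0):
    """Map each host to a dict of {port: reachable} flags."""
    return {
        host: {port: port_open(host, port, timeout) for port in ports}
        for host in hosts
    }
```

Calling `probe_dcs` with the list of in-site DC/GC host names returns one entry per server. Any server showing `False` for either port is worth checking first with the dcdiag and nltest commands above.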
April 25th, 2012 12:12am

This topic is archived. No further replies will be accepted.
