Reset commands being sent to server devices
We are having an issue with our MS Windows servers (2003 & 2008) being sent internally commands from other servers to reset devices, such as a RAID drive. In some situations it's enough to bring the server down (offline). We have been running scans on all servers, and the scans come back without finding any malware. Wondering if anyone else has heard of this happening and might know the cause. Let me know if I can provide more details. Thx, T$
August 11th, 2011 7:09pm

Sounds malicious, but you'd really need to look through the event viewer on both servers to see if there is anything explainable going on. You might consider isolating the affected servers from each other by way of firewall or network switch ACLs for the time being while you investigate the issue, as it will help limit any damage.
August 11th, 2011 7:24pm

Thanks Nigel. We are definitely doing that. Many of our servers that act as app servers for a central instance with database have had to be shut down. The traffic between these servers becomes hung, causing an hourglass state for users. We continue to operate and move forward but only have the warnings and errors from these reset calls in the EM. Both our McAfee and network scans return nothing.
August 11th, 2011 7:29pm

Just on another note, if there are DRAC, ILOM, iLO, etc. connections to these servers, consider disabling them for now if possible, as these are sometimes not secured properly and can easily be exploited. Question: have you been able to identify which servers the reset is coming from? When you say that the commands are "being sent internally commands from other servers to reset devices", how have you been able to identify this?
August 11th, 2011 7:45pm

I'll look into disabling these. We do know the servers specifically. All scans are coming back negative for anything malicious though. We are currently working to get sniffers set up on the network to see the traffic with more granularity. Please keep in mind this is a large corporate landscape.
August 11th, 2011 8:03pm

Let us know what you find; I'd be interested to know.
August 13th, 2011 3:21pm

Hi, Please describe the symptom in detail, such as any device or drive failure error messages. Did you see any errors in the Event Log or in Device Manager? If so, please post the detailed errors here for research. Why did you suspect some server “sent internally commands”? Any clues? Any progress after following Nigel’s suggestions? Please also check whether Network Monitor helps in your scenario. Refer to: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=4865. Thanks. Nina
August 16th, 2011 11:57pm

Thanks Nina. The errors have subsided. The last reboot issue we had was on Sunday morning. The errors are short and say, "IDE Drive Timeout" or "RAID port 5 sent reset command". These errors occur just before the server is rebooted. They don't seem to be enough to reboot the server, but it's all we have at this point. It's like someone physically presses the start button on the server. These errors, like I said, have begun to subside and we are in a monitoring mode at this point. As our confidence grows we are bringing more of our app servers back online. If this state continues I imagine, by the end of the week, we will be back to normal. It would be nice to be able to point to the reason this happens and a solution. I am not specifically part of the network team and they have their own tools for monitoring. This also precludes me from pulling some of the network errors specifically. What I'm looking for is someone who may have seen this specific type of behavior before and has some pointers on where/what to look at. Thanks so much for your help!
August 17th, 2011 9:37am

Solution: Update the drivers on the NICs to the current release. It appears that the NICs would fail over during high traffic and drop network connections. These drops preceded the reboots. While there is no 'smoking gun' for the reboot issue, the updates seem to have resolved it. Servers without the updates are still having issues. Needless to say, all other servers are being scheduled for updates.
August 24th, 2011 1:04pm
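
For anyone chasing a similar pattern, the correlation described in this thread (drive timeout, RAID reset, and NIC drop messages showing up shortly before a reboot) can be checked against an exported event log. The following is only a rough sketch, not what was used here: it assumes the log has been exported to CSV with hypothetical columns named time, source, and message, and the keyword lists are guesses that would need adjusting to the messages your servers actually record.

```python
import csv
import io
from datetime import datetime, timedelta

# Assumed message fragments (hypothetical); adjust to match your own logs.
PRECURSOR_HINTS = ("timeout", "reset command", "link down", "failover")
REBOOT_HINTS = ("was unexpected", "rebooted")


def load_events(csv_text):
    """Parse CSV text with assumed columns: time (ISO 8601), source, message."""
    rows = csv.DictReader(io.StringIO(csv_text))
    events = [
        (datetime.fromisoformat(r["time"]), r["source"], r["message"])
        for r in rows
    ]
    # Sort by timestamp only, so ties never compare the other fields.
    return sorted(events, key=lambda e: e[0])


def precursors_before_reboots(events, window_minutes=10):
    """Map each reboot time to the precursor events in the window before it."""
    window = timedelta(minutes=window_minutes)
    result = {}
    for when, _source, message in events:
        if any(hint in message.lower() for hint in REBOOT_HINTS):
            result[when] = [
                e for e in events
                if when - window <= e[0] < when
                and any(hint in e[2].lower() for hint in PRECURSOR_HINTS)
            ]
    return result
```

Calling precursors_before_reboots(load_events(text)) returns a dictionary keyed by reboot time. A reboot whose list is empty had none of the assumed precursor messages in the preceding window, which would point away from the NIC or drive theory for that particular reboot.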
