Brief random network outages

Hi

We have a networking issue that I hope someone may be able to help me with.

We run a Windows domain - single domain, single subnet. Three domain controllers: Windows 2012 R2 Standard, Windows 2008 Standard (32bit) and Windows Server 2003 Standard. The Win 2008 DC holds all the FSMO roles. DNS is set up on the 2008 and 2012 DCs, with the 2008 DC having a secondary DNS server installation. DHCP is set up on the 2008 DC. The 2003 server only functions as a DC - no other roles are installed and it will be demoted very soon.

Two member servers: Windows Storage Server 2008 (64bit) and Windows 2012 Standard. The Storage server hosts 99% of our data with the other 1% being on the 2008 DC. Two email server software installations (not Exchange) - one on the storage server and one on the 2012 R2 member server. NPS Routing and Remote Access is configured on the 2012 member server to handle VPN connections.

35 client PC's: 34 run Windows 7 and one runs Vista.

We have two Draytek routers on the network - one acts as the gateway to the Internet and the other provides wireless coverage. There are two networked printers - a Ricoh MFD 'workgroup' printer and a small mono Brother.

Network shares are accessed via DFS. The Servers, the Ricoh printer and routers have static IP's, most of the clients and the Brother printer use DHCP.

All cables terminate at one of two patch panels which then feed to one or more switches. Small desktop switches are used to expand the network where needed. The network is divided into two segments, hence two patch panels, but both run under the same 192.168.0.xxx/255.255.255.0 subnet.

Before we upgraded all our computers to Windows 7 the network was fine. The present network was built using CAT5 cabling in 2004 (we'd used BNC before that). We rarely had any network issues and when we did it would affect all clients. When I first introduced Windows 7 it was on three PC's and one or more of them would randomly have problems accessing the network and Internet.

Since I upgraded all our machines to Windows 7 we have been seeing one or more machines experiencing network problems most days.

What happens is the (any) computer will suddenly stop - Applications accessing files across the network e.g. email and Access will hang and report as (not responding) and the 'busy' cursor appears. Try and save an office document and the busy cursor appears and the application hangs. The Start menu is not accessible - for example I always have my Taskbar hidden and when this happens on my machine moving the mouse to the bottom of the screen does nothing, the Taskbar stays hidden. Sometimes, the Taskbar may appear, but nothing happens when the Start button is clicked. When trying to access shared folders via Computer the green bar slowly moves through the Address Bar and after a while it reports the share is not accessible. The affected computers are also unable to open any web pages.

The hang will last for anything from 20 secs to a minute or more, after which the computer will continue operating normally. On very rare occasions the computer will not recover after 10 mins or more and I have to force a shutdown, but I assume this is not directly related to the issues I am seeing.

The short outages are completely random and leave no trace of a problem in the System or Application Logs. When a machine does not recover (which is very rare) the System Log reports that a DNS server could not be reached.

This will happen on a single PC and others will be fine. But, it may affect several PC's over the course of a day.

When the outage happens I can Winkey+R, open cmd.exe and successfully ping the DNS servers by IP and name.
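A manual ping only shows what happens at the moment it is typed. As a minimal sketch, assuming Python is installed on an affected client, the name lookup could also be timed and timestamped so that slow or failed resolutions can be compared against the time of a hang. The function names `timed_resolve` and `log_line` are made up for this illustration and are not part of any Windows tool.

```python
import socket
import time
from datetime import datetime


def timed_resolve(name):
    """Resolve a host name and report how long the lookup took.

    Returns (ok, seconds_taken, detail) where detail is a sorted list of
    addresses on success or the error text on failure.
    """
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(name, None)
        addresses = sorted({info[4][0] for info in infos})
        return True, time.monotonic() - start, addresses
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, time.monotonic() - start, str(exc)


def log_line(name, result):
    """Format one result as a timestamped line suitable for a local log file."""
    ok, elapsed, detail = result
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    status = "OK" if ok else "FAIL"
    return f"{stamp} {name} {status} {elapsed:.3f}s {detail}"
```

Calling `log_line(name, timed_resolve(name))` every few seconds for each DC name, and appending the lines to a file on the local disk rather than a network share, would leave a record that survives the hang. Note that the operating system may answer from its DNS cache, so a fast result does not prove the DNS server itself was reachable.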

All the systems are up to date. I ensure that Windows Updates are installed on the Servers when they are released and the clients are updated the next day via WSUS.

I am pretty sure that this is also affecting Active Directory. I am seeing transient errors on the Domain Controllers where, for example, a Global Catalog cannot be contacted. Both DC's are GC's and when I run nltest to test the connection to a GC within 30 minutes of the error being reported the test passes. I assume the servers are also experiencing these random outages.
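Because a test run up to 30 minutes later can easily miss an outage of under a minute, a lightweight check that can be repeated frequently may help. The following is only a sketch: it tests whether the standard TCP ports for DNS (53), Kerberos (88), LDAP (389) and the Global Catalog (3268) accept a connection. It does not replace nltest, and the helper names are hypothetical.

```python
import socket

# Standard TCP ports for the directory services mentioned above.
AD_PORTS = {"DNS": 53, "Kerberos": 88, "LDAP": 389, "Global Catalog": 3268}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and lookup failures
        return False


def check_services(host, ports=AD_PORTS, timeout=3.0):
    """Return a dict mapping each service name to True (reachable) or False."""
    return {name: port_open(host, port, timeout) for name, port in ports.items()}
```

Running `check_services` against each DC (by IP address, to keep DNS out of the picture) from another server every few seconds would show whether the Global Catalog port really becomes unreachable, and for how long.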

The problem I have is that because they are random and because, on the clients at least, no errors are logged I cannot reproduce the problem and have no idea what may be causing the issue.
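Since nothing is logged by Windows itself, one option is to collect probe results separately and then pick out the outage windows afterwards. As a rough sketch, assuming a list of `(timestamp, ok)` samples taken at a fixed interval (5 seconds here, an arbitrary choice), consecutive failures can be grouped like this. The function names are invented for the example.

```python
from datetime import datetime, timedelta


def find_outages(samples: list[tuple[datetime, bool]], min_failures: int = 2):
    """Group consecutive failed probes into outage windows.

    samples must be in time order. Returns a list of
    (first_failure, last_failure, failure_count) tuples; runs shorter than
    min_failures are ignored as one-off blips.
    """
    outages = []
    run = []
    for stamp, ok in samples:
        if not ok:
            run.append(stamp)
            continue
        if len(run) >= min_failures:
            outages.append((run[0], run[-1], len(run)))
        run = []
    # A run of failures may still be open when the samples end.
    if len(run) >= min_failures:
        outages.append((run[0], run[-1], len(run)))
    return outages


def describe(outage, interval=timedelta(seconds=5)):
    """Summarise one outage; the length is an estimate based on the probe interval."""
    start, end, count = outage
    length = (end - start) + interval
    return (f"{start:%H:%M:%S} to {end:%H:%M:%S}: {count} failed probes, "
            f"roughly {length.total_seconds():.0f}s")
```

Comparing the windows found on several clients would show whether the outages really hit one machine at a time or whether several machines fail in the same seconds, which would point towards shared equipment.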

The problem is not a general network issue as it randomly affects one client at a time - if it was a general network problem I would expect all the clients to lose connectivity.

Has anyone else seen a similar issue and know what the cause was, please?

Thanks

March 30th, 2015 2:08pm

It happened again on my PC this morning. I was not able to Winkey+R and open a command window to ping other resources on the network.
March 31st, 2015 7:27am

Use arp -d to delete the ARP cache and retry.

Use ping with the -t flag (continuous ping on Windows) to test if the network is stable.

The computer not responding is related to the operating system or software, not the network.

March 31st, 2015 12:12pm

How old is the server?

How old are your network switches?

Any computer or server with a failing NIC, or a failing network switch, may cause the intermittent issue.

They may flood the network with packets and thus hang the system.
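One way to look for such a flood without extra tools is to compare two runs of `netstat -e` on a Windows machine and see how fast the "Non-unicast packets" counter grows. The sketch below parses that output; it assumes an English-language Windows install (the row labels are localised), and the function names are made up for this example. What counts as "too high" depends on the network, so no threshold is suggested here.

```python
import re

# One counter row: a text label, two or more spaces, then one or two numbers.
_ROW = re.compile(
    r"^(?P<label>[A-Za-z][A-Za-z -]*?)\s{2,}(?P<rx>\d+)(?:\s+(?P<tx>\d+))?\s*$"
)


def parse_netstat_e(text):
    """Parse `netstat -e` output into {label: (received, sent_or_None)}."""
    counters = {}
    for line in text.splitlines():
        match = _ROW.match(line.strip())
        if match:
            tx = match.group("tx")
            counters[match.group("label")] = (
                int(match.group("rx")),
                int(tx) if tx else None,
            )
    return counters


def non_unicast_rate(before, after, seconds):
    """Received broadcast/multicast packets per second between two snapshots."""
    rx_before = before["Non-unicast packets"][0]
    rx_after = after["Non-unicast packets"][0]
    return (rx_after - rx_before) / seconds
```

Taking a snapshot on a healthy machine every minute and noting the rate gives a baseline; a sudden jump at the moment clients hang would support the idea of a failing NIC or switch flooding the segment.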


April 1st, 2015 9:33pm

This topic is archived. No further replies will be accepted.
