netft.sys is the cause of the bugcheck blue screen on a Windows Server 2008 R2 Datacenter server

Hi

We have a server that keeps getting rebooted by a bugcheck error for netft.sys. Please let me know if there is a fix for this issue. I am not sure what is causing it on the server.

The server runs Windows Server 2008 R2 Datacenter and is part of a Hyper-V cluster.

Thanks in advance

  • Moved by BrianEhMVP Friday, March 08, 2013 8:05 PM
March 8th, 2013 7:59pm

I assume you are referring to a stop 0x9e for netft.sys, which is an intentional bugcheck caused by the cluster service due to a deadlock condition identified. I'd recommend reviewing the following for further tips to troubleshoot this issue:

http://blogs.technet.com/b/askcore/archive/2009/06/12/why-is-my-2008-failover-clustering-node-blue-screening-with-a-stop-0x0000009e.aspx

March 8th, 2013 9:28pm

Thanks John,

This was the same error I was referring to. I went through the Microsoft site and found a hotfix for it, but I am not sure whether that hotfix is for the same issue.

  http://support.microsoft.com/kb/2135160/en-us error FIX: "0x0000009E" Stop error when you host Hyper-V virtual machines in a Windows Server 2008 R2-based failover cluster

Do you have any clue about it?

March 8th, 2013 10:29pm

I think the most important line in that KB article:

Not all "0x0000009E" Stop errors are caused by this problem.

This blue screen just indicates that something caused the cluster to believe that the node was hung. If you are running Hyper-V, this might be a good place to start with this hotfix. Otherwise, you might consider opening a ticket with PSS to troubleshoot this further.

March 8th, 2013 11:00pm

Clustering has health detection between the user mode service and the kernel mode NetFT driver.  If user mode goes unresponsive, then clustering bugchecks the box in an effort to force a failover.  A STOP 0x9e is expected cluster behavior.  You should troubleshoot the condition as a user mode hang...

Thanks!
Elden

March 9th, 2013 3:02am

Techei, I am a developer on the clustering team. If the problem persists, you can share the dump file with me (c:\windows\memory.dmp) and I will take a look at what exactly caused the bugcheck. If you do not want to share it with the world, give me your email; I'll reply with mine and you can share the dump with me. If you prefer to work through customer support, they should also be able to look at the dump and tell you what netft is not happy about.

 
March 10th, 2013 7:16pm

We're having a very similar problem, and I'd love for someone to actually investigate the memory dump and give me a real clue as to the cause. We're in the midst of multiple cases with PSS, but a solution can't come quickly enough. How can I share this dump with you?

March 13th, 2013 3:02pm

Hi Vladimir, I have the same issue with bugcheck 0x0000009e.

How can I send you the memory.dmp? Thanks a lot

October 11th, 2013 8:20pm

If you can put it in some location I can download it from, that would work.

Thanks,

Vladimir.

October 12th, 2013 3:28am

I used my SkyDrive. You can download it from there:

https://skydrive.live.com/redir?resid=95BBBD60D3F2B190!117

thanks!

October 15th, 2013 3:55pm

Look at the properties of the files c:\windows\system32\drivers\EmcpXcr.sys, EmcpBase.sys and EmcpGpx.sys to see which company these drivers came from, contact that company for support, and share your dump with them.

The cluster bugchecked the machine because it tried to terminate the resource host monitor (the host process in which the plug-ins that control applications, disks, etc. run), and this process did not go away within 20 minutes. Termination is stuck because some threads are stuck in the kernel. They all end up waiting for the thread below, which has been stuck for 25 minutes:

THREAD fffffa800793f660  Cid 0004.0050  Teb: 0000000000000000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable
    fffffa8008467db8  SynchronizationEvent
    fffffa8008467dd0  NotificationEvent
IRP List:
    fffffa801452f4b0: (0006,0310) Flags: 00060070  Mdl: 00000000
    fffffa8013d9ba90: (0006,0310) Flags: 00060070  Mdl: 00000000
Not impersonating
DeviceMap                 fffff8a000008500
Owning Process            fffffa80078c49e0       Image:         System
Attached Process          N/A            Image:         N/A
Wait Start TickCount      262955         Ticks: 96163 (0:00:25:00.152)
Context Switch Count      10707          IdealProcessor: 12            
UserTime                  00:00:00.000
KernelTime                00:00:12.105
Win32 Start Address nt!ExpWorkerThread (0xfffff800024e0150)
Stack Init fffff88002b5cdb0 Current fffff88002b5be20
Base fffff88002b5d000 Limit fffff88002b57000 Call 0
Priority 13 BasePriority 12 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
fffff880`02b5be60 fffff800`024cc5f2 nt!KiSwapContext+0x7a
fffff880`02b5bfa0 fffff800`024d90ea nt!KiCommitThreadWait+0x1d2
fffff880`02b5c030 fffff880`0146ee3b nt!KeWaitForMultipleObjects+0x272
fffff880`02b5c2f0 fffff880`01603601 EmcpBase!PowerSleep+0x83
fffff880`02b5c350 fffff880`016038d9 EmcpXcr!XcryptUpdateAssocCallout+0xc45
fffff880`02b5c3b0 fffff880`014f8c10 EmcpXcr!XcryptUpdateAssocCallout+0xf1d
fffff880`02b5c3f0 fffff880`01467adf EmcpGpx!GpxDestroySplitPirp+0x78
fffff880`02b5c420 fffff880`014f8c30 EmcpBase!PowerDispatchX+0x243
fffff880`02b5c470 fffff880`01467adf EmcpGpx!GpxDestroySplitPirp+0x98
fffff880`02b5c4a0 fffff880`01468b41 EmcpBase!PowerDispatchX+0x243
fffff880`02b5c4f0 fffff880`01475681 EmcpBase!PowerSyncIoTopDispatch+0x89
fffff880`02b5c520 fffff880`01478686 EmcpBase!PowerWinIsIrpSync+0x965
fffff880`02b5c570 fffff880`0180d362 EmcpBase!PowerWinIsPseudoBusPDO+0x219a
fffff880`02b5c5d0 fffff880`0180971f disk!DiskGetPortGeometry+0x92
fffff880`02b5c630 fffff880`018015ec disk!DiskUpdateGeometry+0x4ef
fffff880`02b5c670 fffff880`0180164c disk!DiskReadDriveCapacity+0x1c
fffff880`02b5c6a0 fffff880`0159ce7d disk!DiskDeviceControl+0x2e3
fffff880`02b5c700 fffff880`012f1244 CLASSPNP!ClassDeviceControlDispatch+0x2d
fffff880`02b5c730 fffff800`027f745d partmgr!PmFilterDeviceControl+0xd4
fffff880`02b5c790 fffff800`027f72b4 nt!FstubGetDiskGeometry+0x12d
fffff880`02b5c810 fffff800`027f76e2 nt!FstubAllocateDiskInformation+0x44
fffff880`02b5c840 fffff880`012f3512 nt!IoReadPartitionTableEx+0x1a
fffff880`02b5c870 fffff880`012f1398 partmgr!PmGetDriveLayoutEx+0x5d2
fffff880`02b5c970 fffff880`0162d48f partmgr!PmFilterDeviceControl+0x228
fffff880`02b5c9d0 fffff880`0162ae3d ClusDisk!ClusDskpSendIoctl+0x8f
fffff880`02b5ca70 fffff800`02742eb0 ClusDisk!ClusDskDeviceChangeNotification+0xc9
fffff880`02b5cae0 fffff800`02741787 nt!PnpNotifyDriverCallback+0x5c
fffff880`02b5cb70 fffff800`02742ffc nt!PnpNotifyTargetDeviceChange+0x16b
fffff880`02b5cc20 fffff800`027420ca nt!PnpProcessCustomDeviceEvent+0x2c
fffff880`02b5cc50 fffff800`024e0261 nt!PnpDeviceEventWorker+0x142
fffff880`02b5ccb0 fffff800`027732ea nt!ExpWorkerThread+0x111
fffff880`02b5cd40 fffff800`024c78e6 nt!PspSystemThreadStartup+0x5a
fffff880`02b5cd80 00000000`00000000 nt!KxStartSystemThread+0x16

Storage class device fffffa800f2ef060 with extension at fffffa800f2ef1b0

Classpnp Internal Information at fffffa801371c530

    Transfer Packet Engine:

     Packet          Status  DL Irp          Opcode  Sector/ListId   UL Irp
    --------         ------ --------         ------ --------------- --------

    Pending Idle Requests: 0x0


    Failed Requests:

           Srb    Scsi                                  
    Opcode Status Status Sense Code  Sector/ListId   Time Stamp
    ------ ------ ------ ---------- --------------- ------------
      28     04     02    02 04 03      00000001    12:28:45.684  
      28     04     02    02 04 03      003fffff    12:28:45.684  
      28     04     02    02 04 03      00000001    12:28:45.684  
      28     04     02    02 04 03      003fffff    12:28:45.684  
      28     04     02    02 04 03      00000001    12:28:45.684  
      28     04     02    02 04 03      003fffff    12:28:45.684  
      28     04     02    02 04 03      00000001    12:28:45.699  
      28     04     02    02 04 03      003fffff    12:28:45.699  
      28     04     02    02 04 03      00000001    12:28:45.699  
      28     04     02    02 04 03      003fffff    12:28:45.699  
      28     04     02    02 04 03      00000001    12:28:45.699  
      28     04     02    02 04 03      003fffff    12:28:45.715  
      28     04     02    02 04 03      00000001    12:28:45.824  
      28     04     02    02 04 03      003fffff    12:28:45.824  
      28     04     02    02 04 03      00000001    12:28:45.840  
      28     04     02    02 04 03      003fffff    12:28:45.840  

    -- dt classpnp!_CLASS_PRIVATE_FDO_DATA fffffa801371c530 --

Classpnp External Information at fffffa800f2ef1b0

    DGC RAID 5 0429 CKM00100500355

    Minidriver information at fffffa800f2ef670
    Attached device object at fffffa800f2ed060
    Physical device object at fffffa800f2ed060

    Media Geometry:

        Bytes in a Sector = 512
        Sectors per Track = 63
        Tracks / Cylinder = 255
        Media Length      = 2147483648 bytes = ~2 GB

    -- dt classpnp!_FUNCTIONAL_DEVICE_EXTENSION fffffa800f2ef1b0 --
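
A quick sanity check on the numbers above, as a minimal sketch: with the reported media length and sector size, the last addressable sector works out to 0x3fffff, so the failed reads alternate between sector 1 and the very last sector of the disk. Those are the locations where a GPT disk keeps its primary and backup headers, which would be consistent with the IoReadPartitionTableEx call in the stack. That reading is an interpretation, not something the debugger output states. The variable names are illustrative only.

```python
# Values copied from the "Media Geometry" block in the debugger output above.
BYTES_PER_SECTOR = 512
MEDIA_LENGTH_BYTES = 2147483648

# Sector numbers seen in the "Failed Requests" table (hex, as printed).
failed_sectors = {int("00000001", 16), int("003fffff", 16)}

# Sectors are numbered from 0, so the last one is (count - 1).
total_sectors = MEDIA_LENGTH_BYTES // BYTES_PER_SECTOR
last_lba = total_sectors - 1

print(f"total sectors: {total_sectors} (0x{total_sectors:x})")
print(f"last LBA:      0x{last_lba:08x}")
print("failed reads hit only LBA 1 and the last LBA:",
      failed_sectors == {1, last_lba})
```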

October 16th, 2013 5:05am

Thanks a lot! I've already talked with EMC and they recommended that we update the drivers and the multipath software.

I have another question, about using the debugging tool. I downloaded it from http://msdn.microsoft.com/en-us/windows/hardware/gg463009.aspx but it always gives me symbol errors:

Microsoft (R) Windows Debugger Version 6.2.9200.20512 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [C:\Temp\MEMORY.DMP]
Kernel Summary Dump File: Only kernel address space is available

Symbol search path is: srv*C:\websymbols*http://msdl.microsoft.com/download/symbols;srv*
Executable search path is: srv*
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for ntkrnlmp.exe - 
Windows 7 Kernel Version 7601 (Service Pack 1) MP (16 procs) Free x64
Product: Server, suite: Enterprise TerminalServer SingleUserTS
Built by: 7601.18247.amd64fre.win7sp1_gdr.130828-1532
Machine Name:
Kernel base = 0xfffff800`02461000 PsLoadedModuleList = 0xfffff800`026a46d0
Debug session time: Fri Oct 11 16:53:45.992 2013 (UTC - 3:00)
System Uptime: 0 days 1:33:22.285
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for ntkrnlmp.exe - 
Loading Kernel Symbols
...............................................................

Loading User Symbols

Loading unloaded module list
...........
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 9E, {fffffa8008a5eab0, 4b0, 0, 0}

***** Kernel symbols are WRONG. Please fix symbols to do analysis.

*************************************************************************
***                                                                   ***
***                                                                   ***
***    Either you specified an unqualified symbol, or your debugger   ***
***    doesn't have full symbol information.  Unqualified symbol      ***
***    resolution is turned off by default. Please either specify a   ***
***    fully qualified symbol module!symbolname, or enable resolution ***
***    of unqualified symbols by typing ".symopt- 100". Note that   ***
***    enabling unqualified symbol resolution with network symbol     ***
***    server shares in the symbol path may cause the debugger to     ***
***    appear to hang for long periods of time when an incorrect      ***
***    symbol name is typed or the network symbol server is down.     ***
***                                                                   ***
***    For some commands to work properly, your symbol path           ***
***    must point to .pdb files that have full type information.      ***
***                                                                   ***
***    Certain .pdb files (such as the public OS symbols) do not      ***
***    contain the required information.  Contact the group that      ***
***    provided you with these symbols if you need this command to    ***
***    work.                                                          ***
***                                                                   ***
***    Type referenced: nt!_KPRCB                                     ***
***                                                                   ***
*************************************************************************

*************************************************************************
Probably caused by : netft.sys ( netft!NetftWatchdogTimerDpc+b9 )

Followup: MachineOwner
---------

8: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

USER_MODE_HEALTH_MONITOR (9e)
One or more critical user mode components failed to satisfy a health check.
Hardware mechanisms such as watchdog timers can detect that basic kernel
services are not executing. However, resource starvation issues, including
memory leaks, lock contention, and scheduling priority misconfiguration,
may block critical user mode components without blocking DPCs or
draining the nonpaged pool.
Kernel components can extend watchdog timer functionality to user mode
by periodically monitoring critical applications. This bugcheck indicates
that a user mode health check failed in a manner such that graceful
shutdown is unlikely to succeed. It restores critical services by
rebooting and/or allowing application failover to other servers.
Arguments:
Arg1: fffffa8008a5eab0, Process that failed to satisfy a health check within the
configured timeout
Arg2: 00000000000004b0, Health monitoring timeout (seconds)
Arg3: 0000000000000000
Arg4: 0000000000000000

Debugging Details:
------------------

***** Kernel symbols are WRONG. Please fix symbols to do analysis.

*************************************************************************
***                                                                   ***
***                                                                   ***
***    Either you specified an unqualified symbol, or your debugger   ***
***    doesn't have full symbol information.  Unqualified symbol      ***
***    resolution is turned off by default. Please either specify a   ***
***    fully qualified symbol module!symbolname, or enable resolution ***
***    of unqualified symbols by typing ".symopt- 100". Note that   ***
***    enabling unqualified symbol resolution with network symbol     ***
***    server shares in the symbol path may cause the debugger to     ***
***    appear to hang for long periods of time when an incorrect      ***
***    symbol name is typed or the network symbol server is down.     ***
***                                                                   ***
***    For some commands to work properly, your symbol path           ***
***    must point to .pdb files that have full type information.      ***
***                                                                   ***
***    Certain .pdb files (such as the public OS symbols) do not      ***
***    contain the required information.  Contact the group that      ***
***    provided you with these symbols if you need this command to    ***
***    work.                                                          ***
***                                                                   ***
***    Type referenced: nt!_KPRCB                                     ***
***                                                                   ***
*************************************************************************


ADDITIONAL_DEBUG_TEXT:  
You can run '.symfix; .reload' to try to fix the symbol path and load symbols.

MODULE_NAME: netft

FAULTING_MODULE: fffff80002461000 nt

DEBUG_FLR_IMAGE_TIMESTAMP:  4a5bc48a

PROCESS_OBJECT: fffffa8008a5eab0

DEFAULT_BUCKET_ID:  WIN7_DRIVER_FAULT

BUGCHECK_STR:  0x9E

CURRENT_IRQL:  0

LAST_CONTROL_TRANSFER:  from fffff88000e626a5 to fffff800024d6bc0

STACK_TEXT:  
fffff880`0253d518 fffff880`00e626a5 : 00000000`0000009e fffffa80`08a5eab0 00000000`000004b0 00000000`00000000 : nt!KeBugCheckEx
fffff880`0253d520 fffff800`024e185c : fffff880`0253d618 00000000`00000001 00000000`400e0088 00000000`00000001 : netft!NetftWatchdogTimerDpc+0xb9
fffff880`0253d570 fffff800`024e16f6 : fffff880`00e6f100 00000000`00057ace 00000000`00000000 00000000`00000000 : nt!KeReleaseMutant+0xb2c
fffff880`0253d5e0 fffff800`024e15de : 0000000d`0b38db3c fffff880`0253dc58 00000000`00057ace fffff880`02518f48 : nt!KeReleaseMutant+0x9c6
fffff880`0253dc30 fffff800`024e13c7 : 00000003`0ebda1c2 00000003`00057ace 00000003`0ebda10c 00000000`000000ce : nt!KeReleaseMutant+0x8ae
fffff880`0253dcd0 fffff800`024ce8ca : fffff880`02515180 fffff880`025202c0 00000000`00000000 fffff880`01a95a48 : nt!KeReleaseMutant+0x697
fffff880`0253dd80 00000000`00000000 : fffff880`0253e000 fffff880`02538000 fffff880`0253dd40 00000000`00000000 : nt!KiCpuId+0x6fa


STACK_COMMAND:  kb

FOLLOWUP_IP: 
netft!NetftWatchdogTimerDpc+b9
fffff880`00e626a5 cc              int     3

SYMBOL_STACK_INDEX:  1

SYMBOL_NAME:  netft!NetftWatchdogTimerDpc+b9

FOLLOWUP_NAME:  MachineOwner

IMAGE_NAME:  netft.sys

BUCKET_ID:  WRONG_SYMBOLS

Followup: MachineOwner
---------

8: kd> !process fffffa8008a5eab0 3
NT symbols are incorrect, please fix symbols
October 16th, 2013 4:26pm

Once you open the dump, first run ".symfix" and then ".reload". After that, always start with "!analyze -v". If you see USER_MODE_HEALTH_MONITOR (9e), it is complaining about a process not doing something in time (in the case above, not terminating in time). The process address is in Arg1. Run "!process <Arg1> 1f" and go from there.
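
As a rough illustration of reading the bugcheck parameters, the sketch below parses the "BugCheck 9E, {...}" summary line from the debugger output earlier in this thread and converts Arg2 (the health monitoring timeout, in seconds) from hex. The helper name parse_bugcheck is made up for this example; it is not part of WinDbg.

```python
import re

def parse_bugcheck(line):
    """Split a WinDbg 'BugCheck XX, {a, b, c, d}' summary line into integers."""
    match = re.match(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", line.strip())
    if match is None:
        raise ValueError(f"not a bugcheck summary line: {line!r}")
    code = int(match.group(1), 16)
    # All four arguments are printed in hex without a 0x prefix.
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, args

code, args = parse_bugcheck("BugCheck 9E, {fffffa8008a5eab0, 4b0, 0, 0}")
process_address, timeout_seconds = args[0], args[1]

print(f"bugcheck code:  0x{code:X}")
print(f"process object: {process_address:016x}")
print(f"timeout:        {timeout_seconds} s = {timeout_seconds // 60} min")
print(f"next command:   !process {process_address:016x} 1f")
```

For the dump above, Arg2 of 4b0 is 1200 seconds, which matches the 20 minutes mentioned in the earlier analysis.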
October 16th, 2013 5:10pm

Hi all

Thanks for this information.
Three days ago we had a blue screen as well, on one of our Exchange 2010 MBX servers.
All drives except C are connected via iSCSI. These drives are not physical; we are using NetApp with VMware on top.

What I could see in the dump file using WinDbg is:

Probably caused by : netft.sys ( netft+26a5 )

What do you suggest to solve this issue?

Kind regards Matthias

August 5th, 2014 7:35am

Hello Matthias,

If you can share the dump with me I'll be happy to take a look.

August 5th, 2014 4:09pm

Hi Vladimir

Thanks for your reply. I really appreciate this.
Here is the link to the dump: https://drive.google.com/file/d/0BxDBxOjFJ8GLRXpVZE9JN19BXzA/edit?usp=sharing

I am waiting for your feedback.

Kind regards

Matthias

August 6th, 2014 2:37pm

It looks like your VMware disks are timing out I/O at the moment, and I also see that some NetApp LUNs were failing. Please check your storage.

    Opcode Status Status Sense Code  Sector/ListId   Time Stamp

    ------ ------ ------ ---------- --------------- ------------

      2a     04     22    00 00 00      01f73420    16:17:20.276  Retried

      28     04     22    00 00 00      018807e0    16:17:20.276  Retried

      2a     04     22    00 00 00      000599c8    16:17:20.276  Retried

      2a     04     22    00 00 00      01aa0bb0    17:23:49.276  Retried

      2a     04     22    00 00 00      02e790f0    17:23:49.276  Retried

      2a     04     22    00 00 00      013bfc50    17:23:49.276  Retried

      2a     04     22    00 00 00      02692250    17:23:49.276  Retried

      2a     04     22    00 00 00      03553f80    17:23:49.276  Retried

      2a     04     22    00 00 00      0311d0a5    17:23:49.276  Retried

      2a     04     22    00 00 00      00608b90    17:23:49.276  Retried

      2a     04     22    00 00 00      02b2a210    17:23:49.276  Retried

      2a     04     22    00 00 00      006327f8    17:23:49.276  Retried

      2a     04     22    00 00 00      0007a050    17:23:49.276  Retried

      2a     04     22    00 00 00      0333eb68    17:23:49.276  Retried

      2a     04     22    00 00 00      0004e538    17:23:49.276  Retried

      28     04     22    00 00 00      009fa980    17:23:49.276  Retried

    VMware Virtual disk 1.0 6000c292caa09c5f634b3b75741142ea

    Opcode Status Status Sense Code  Sector/ListId   Time Stamp

    ------ ------ ------ ---------- --------------- ------------

      28     04     22    00 00 00      006bbac0    16:11:00.276  Retried

      28     04     22    00 00 00      008e1d30    16:12:26.276  Retried

      28     04     22    00 00 00      0099b8c8    16:12:26.276  Retried

      28     04     22    00 00 00      0084d9c8    16:12:26.276  Retried

      28     04     22    00 00 00      006cdc08    16:12:26.276  Retried

      28     04     22    00 00 00      006cdb88    16:12:26.276  Retried

      28     04     22    00 00 00      008e0020    16:14:50.276  Retried

      28     04     22    00 00 00      0099b828    16:14:50.276  Retried

      28     04     22    00 00 00      006bbbb8    16:14:50.276  Retried

      28     04     22    00 00 00      0079ee58    16:14:50.276  Retried

      28     04     22    00 00 00      006ed580    16:14:50.276  Retried

      28     04     22    00 00 00      008fd188    16:14:50.276  Retried

      28     04     22    00 00 00      00701e60    16:17:22.276  Retried

      28     04     22    00 00 00      00757f68    16:17:22.276  Retried

      28     04     22    00 00 00      00a87f28    16:17:22.276  Retried

      28     04     22    00 00 00      00783c60    16:17:22.276  Retried

    VMware Virtual disk 1.0 6000c29be6a5c7d86a9a12fabe644f5d

    Opcode Status Status Sense Code  Sector/ListId   Time Stamp

    ------ ------ ------ ---------- --------------- ------------

      28     04     02    06 3f 0e      0fbac380    13:33:27.572  Retried

      2a     0e     00    00 00 00      137a3c00    16:09:42.041  Retried

      2a     0e     00    00 00 00      137a3e00    16:09:42.041  Retried

      2a     0e     02    06 29 00      137a3c00    16:09:42.119  Retried

      2a     0e     00    00 00 00      00614818    17:24:09.432  Retried

    NETAPP LUN 811a 7SRjT+BTl8Uf

    Opcode Status Status Sense Code  Sector/ListId   Time Stamp

    ------ ------ ------ ---------- --------------- ------------

      2a     0e     00    00 00 00      0e910e60    16:09:42.729  Retried

      2a     0e     02    06 29 00      006407f8    16:09:42.807  Retried

      2a     0e     00    00 00 00      005fd608    16:11:00.291  Retried

      2a     0e     00    00 00 00      00632330    16:11:00.291  Retried

      2a     0e     00    00 00 00      0063d620    16:11:00.291  Retried

      2a     0e     00    00 00 00      0bfc32e0    16:11:00.291  Retried

      2a     0e     02    06 29 00      005fd608    16:11:00.322  Retried

      2a     0e     00    00 00 00      0c35dfe0    16:12:29.994  Retried

      2a     0e     02    06 29 00      0c35dfe0    16:12:29.994  Retried

      2a     0e     00    00 00 00      0060b1b8    16:12:30.010  Retried

      2a     0e     00    00 00 00      0060b1c8    16:14:50.479  Retried

      2a     0e     00    00 00 00      000d0070    16:14:50.479  Retried

      2a     0e     00    00 00 00      0e9e6b60    16:14:50.479  Retried

      2a     0e     02    06 29 00      0060b1c8    16:14:50.510  Retried

      2a     0e     00    00 00 00      0e80ece0    16:17:21.354  Retried

      2a     0e     02    06 29 00      0e80ece0    16:17:21.510  Retried

    NETAPP LUN 811a 7SRjT+BTl8Uh

    Opcode Status Status Sense Code  Sector/ListId   Time Stamp

    ------ ------ ------ ---------- --------------- ------------

      2a     0e     00    00 00 00      000f09d8    16:12:30.088  Retried

      2a     0e     00    00 00 00      002a6be8    16:12:30.088  Retried

      28     0e     02    06 29 00      0e223ce8    16:12:30.088  Retried

      28     0e     02    06 29 00      0e240ce8    16:13:43.276  Retried

    NETAPP LUN 811a 7SRjT+BTl8Uj

    Opcode Status Status Sense Code  Sector/ListId   Time Stamp

    ------ ------ ------ ---------- --------------- ------------

      2a     0e     00    00 00 00      00635770    16:11:02.369  Retried

      2a     0e     02    06 29 00      005fd5f0    16:11:02.369  Retried

      2a     0e     00    00 00 00      0011c320    16:13:42.979  Retried

      2a     0e     02    06 29 00      0011c320    16:13:43.010  Retried

    NETAPP LUN 811a 7SRjT+BTl8Ul

    Opcode Status Status Sense Code  Sector/ListId   Time Stamp

    ------ ------ ------ ---------- --------------- ------------

      2a     0e     02    06 29 00      005fd608    16:13:43.260  Retried

      2a     0e     02    06 29 00      00640a20    16:17:30.916  Retried

    NETAPP LUN 811a 7SRjT+BTl8Un

    Opcode Status Status Sense Code  Sector/ListId   Time Stamp

    ------ ------ ------ ---------- --------------- ------------

      2a     0e     00    00 00 00      0351aef8    16:11:02.463  Retried

      2a     0e     00    00 00 00      0060fb78    16:11:02.463  Retried

      28     0e     02    06 29 00      0b1ad938    16:11:02.463  Retried

      2a     0e     00    00 00 00      06557578    16:12:30.057  Retried

      28     0e     02    06 29 00      0b218f38    16:12:30.213  Retried

      28     0e     02    06 29 00      0b236338    16:13:43.369  Retried

      28     04     02    06 3f 0e      0cec5338    17:08:29.760  Retried

      28     04     02    06 3f 0e      0d187b38    17:21:28.932  Retried

    NETAPP LUN 811a 7SRjT+BTl8Up

    Opcode Status Status Sense Code  Sector/ListId   Time Stamp

    ------ ------ ------ ---------- --------------- ------------

      2a     0e     00    00 00 00      1379fca0    16:09:43.197  Retried

      2a     0e     02    06 29 00      002c8a60    16:09:43.229  Retried

      2a     0e     02    06 29 00      005fd600    16:11:02.307  Retried

      2a     0e     00    00 00 00      1379fca0    16:13:43.026  Retried

      2a     0e     00    00 00 00      00628278    16:13:43.026  Retried

      2a     0e     02    06 29 00      1379fca0    16:13:43.041  Retried

      2a     0e     00    00 00 00      0060d340    16:17:30.713  Retried

      2a     0e     00    00 00 00      002c8a60    16:17:30.713  Retried

      2a     0e     02    06 29 00      0060d340    16:17:30.729  Retried

    NETAPP LUN 811a 7SRjT+BTl8Ut

    Opcode Status Status Sense Code  Sector/ListId   Time Stamp

    ------ ------ ------ ---------- --------------- ------------

      2a     0e     00    00 00 00      02f88e50    16:17:30.572  Retried

      2a     0e     00    00 00 00      012aa038    16:17:30.588  Retried

      2a     0e     00    00 00 00      0063d768    16:17:30.588  Retried

      2a     0e     00    00 00 00      0286d1a0    16:17:30.588  Retried

      2a     0e     00    00 00 00      006199c0    16:17:30.588  Retried

      2a     0e     00    00 00 00      01175778    16:17:30.588  Retried

      2a     0e     00    00 00 00      0004cae0    16:17:30.588  Retried

      2a     0e     02    06 29 00      0323e8c8    16:17:30.791  Retried

      2a     04     02    06 3f 0e      025d736f    16:54:35.307  Retried

    NETAPP LUN 811a 7SRjT+BTl8Uv
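
For anyone reading these tables, here is a small sketch that decodes the hex columns. The lookup tables are deliberately partial and cover only values that appear in this thread. The names follow the SCSI command and sense code assignments and the Windows SRB_STATUS_* definitions, and should be checked against those references before relying on them. The function name decode_row is invented for the example.

```python
# Partial lookup tables: only values that occur in the tables in this thread.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}
SRB_STATUS = {0x04: "SRB_STATUS_ERROR", 0x0E: "SRB_STATUS_BUS_RESET"}
SCSI_STATUS = {0x00: "GOOD", 0x02: "CHECK CONDITION", 0x22: "COMMAND TERMINATED"}
SENSE = {
    (0x00, 0x00, 0x00): "no sense data",
    (0x02, 0x04, 0x03): "NOT READY: logical unit not ready, manual intervention required",
    (0x06, 0x29, 0x00): "UNIT ATTENTION: power on, reset, or bus device reset occurred",
    (0x06, 0x3F, 0x0E): "UNIT ATTENTION: reported LUNs data has changed",
}

def decode_row(row):
    """Decode one failed-request row: opcode, SRB status, SCSI status,
    three sense bytes (key, ASC, ASCQ), then the sector number."""
    fields = row.split()
    opcode, srb, scsi = (int(f, 16) for f in fields[0:3])
    sense = tuple(int(f, 16) for f in fields[3:6])
    return {
        "opcode": OPCODES.get(opcode, f"opcode 0x{opcode:02x}"),
        "srb_status": SRB_STATUS.get(srb, f"SRB status 0x{srb:02x}"),
        "scsi_status": SCSI_STATUS.get(scsi, f"SCSI status 0x{scsi:02x}"),
        "sense": SENSE.get(sense, "sense %02x %02x %02x" % sense),
        "sector": int(fields[6], 16),
    }

# Three rows copied from the tables above.
for row in (
    "28     04     22    00 00 00      006bbac0    16:11:00.276  Retried",
    "2a     0e     02    06 29 00      137a3c00    16:09:42.119  Retried",
    "28     04     02    06 3f 0e      0fbac380    13:33:27.572  Retried",
):
    print(decode_row(row))
```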

August 6th, 2014 6:38pm

Hi Vladimir,

I've used WinDbg on many occasions, mainly for CLR debugging. However, I'm amazed at the information you keep getting out of the dump files.

What are the most useful commands to help with this? It's a science in itself. Any guidance will be hugely appreciated.

Thanks

Andreas

August 21st, 2014 1:00pm

Hi Vladimir,

Can you please help me with debugging the minidump file?
https://drive.google.com/folderview?id=0B0T9i-iE-IVhWDYwY1Z1Nm1yRFk&usp=sharing


Regards

Techwiz
August 21st, 2014 5:06pm

Hello Techwiz,

Unfortunately I cannot tell much from the kernel minidump. Please change your OS settings to collect a full kernel dump on bugcheck, and share that dump on the next repro.

From the minidump it looks like a thread of the RHS.exe process has been stuck in the kernel for 20 minutes, preventing process termination from making forward progress. After that, a watchdog recycled the machine by bugchecking it. The minidump does not contain the thread's stack, so I cannot tell where it is stuck.

August 22nd, 2014 7:42am

Vladimir,

Can you take a look at my DMP file?

https://drive.google.com/file/d/0BzIR_AB_PyatQmowSkJZX2gyQTg/view?usp=sharing

I've had two crashes in the past two weeks with Netft.sys listed as the culprit.

I would appreciate hearing about anything else you notice. The dump is from a host on a Hyper-V cluster.

Thanks,

Jon

December 15th, 2014 7:36pm

Please open a case with Microsoft support and they will be able to debug the dump and identify root cause.

Also see this blog:
http://blogs.msdn.com/b/clustering/archive/2013/11/13/10467483.aspx

Thanks!
Elden

December 15th, 2014 8:09pm

Hello Jon,

The problem is that you have such a high DPC rate coming from the network on CPU0 and CPU14 that it is stalling threads and thread scheduling on CPU0.

On CPU0, DPCs were running back-to-back for almost 27 seconds. CPU0 is currently handling a DPC, and there are 4 more in the queue. By the time these 4 are processed, new DPCs will probably have been enqueued.

--------------------------------------------------
CPU#0
--------------------------------------------------
Current DPC: NDIS!ndisInterruptDpc (Normal DPC)
Debugger Saved IRQL: 0
Cumulative DPC Time Limit: 120.000 seconds
Current Cumulative DPC Time: 26.906 seconds
Single DPC Time Limit: 20.000 seconds
Current Single DPC Time: 0.000 seconds

Pending DPCs:
----------------------------------------
CPU Type      KDPC       Function
 0: Normal  : 0xffffe00152bafb20 0xfffff800608b6ae0 vmbkmclr!InpProcessingDpcRoutine
 0: Normal  : 0xffffe00149e628e8 0xfffff80060014c60 NDIS!ndisInterruptDpc
 0: Normal  : 0xffffe00149d1f8e8 0xfffff80060014c60 NDIS!ndisInterruptDpc
 0: Normal  : 0xffffe00149ddf8e8 0xfffff80060014c60 NDIS!ndisInterruptDpc

On CPU14 they were running for almost 1.8 seconds

--------------------------------------------------

CPU#14

--------------------------------------------------

Current DPC: netft!NetftWatchdogTimerDpc (Normal DPC)

Debugger Saved IRQL: 2

Cumulative DPC Time Limit: 120.000 seconds

Current Cumulative DPC Time: 1.828 seconds

Single DPC Time Limit: 20.000 seconds

Current Single DPC Time: 0.000 seconds

The thread that is currently RUNNING is not making forward progress because CPU0 is busy processing DPCs from the network:

THREAD ffffe00147e1f040  Cid 0004.01a4  Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 0

Not impersonating

DeviceMap                 ffffc000c360dca0

Owning Process            ffffe001463c3900       Image:         System

Attached Process          N/A            Image:         N/A

Wait Start TickCount      52759634       Ticks: 52 (0:00:00:00.812)

Context Switch Count      11411046       IdealProcessor: 0            

UserTime                  00:00:00.000

KernelTime                00:09:27.515

Win32 Start Address vmbusr!AwWorkerThread (0xfffff80060a1d9f0)

Stack Init ffffd001e478add0 Current ffffd001e478a7b0

Base ffffd001e478b000 Limit ffffd001e4785000 Call 0

Priority 15 BasePriority 8 UnusualBoost 7 ForegroundBoost 0 IoPriority 2 PagePriority 5

# Child-SP          RetAddr           Call Site

00 fffff803`7df681b0 fffff800`608b9007 vmbusr!BusChSendInterrupt+0x7

01 (Inline Function) --------`-------- vmbkmclr!KmclSendSignal+0xf

02 (Inline Function) --------`-------- vmbkmclr!OutpProcessRingResult+0x29

03 (Inline Function) --------`-------- vmbkmclr!OutpTrySendControlPacket+0x227

04 (Inline Function) --------`-------- vmbkmclr!OutSendPacket+0x2ac

05 fffff803`7df681e0 fffff800`6091f89a vmbkmclr!VmbPacketSendWithTransferPageRanges+0x307

06 fffff803`7df68290 fffff800`6091f2ee vmswitch!VmsVmNicPvtRndisHostMessageSend+0xfa

07 fffff803`7df68350 fffff800`6091ec7f vmswitch!RndisDevHostDeviceIndicatePackets+0x61e

08 (Inline Function) --------`-------- vmswitch!RndisDevDeviceIndicatePackets+0x25

09 fffff803`7df68540 fffff800`6092ed39 vmswitch!VmsVmNicPvtPacketForward+0x17f

0a fffff803`7df68750 fffff800`6092bb0a vmswitch!VmsRouterDeliverNetBufferLists+0x5c9

0b fffff803`7df68830 fffff800`60012a53 vmswitch!VmsExtPtReceiveNetBufferLists+0x13a

0c fffff803`7df68890 fffff800`60012f19 NDIS!ndisMIndicateNetBufferListsToOpen+0x123

0d (Inline Function) --------`-------- NDIS!ndisMDispatchReceiveNetBufferListsInternal+0x27e

0e fffff803`7df68950 fffff800`600136b2 NDIS!ndisMTopReceiveNetBufferLists+0x2c9

0f (Inline Function) --------`-------- NDIS!ndisIterativeDPInvokeHandlerOnTracker+0x2d3

10 (Inline Function) --------`-------- NDIS!ndisInvokeNextReceiveHandler+0x64d

11 (Inline Function) --------`-------- NDIS!ndisMIndicateReceiveNetBufferListsInternal+0x6a2

12 fffff803`7df689e0 fffff800`6091963e NDIS!NdisMIndicateReceiveNetBufferLists+0x732

13 fffff803`7df68bd0 fffff800`60918aea vmswitch!VmsExtMpIndicatePackets+0x96

14 fffff803`7df68c10 fffff800`60017f81 vmswitch!VmsExtMpSendNetBufferLists+0x47a

15 (Inline Function) --------`-------- NDIS!ndisMSendNBLToMiniportInternal+0xca

16 (Inline Function) --------`-------- NDIS!ndisMSendNBLToMiniport+0xca

17 (Inline Function) --------`-------- NDIS!ndisCallSendHandler+0x24d

18 (Inline Function) --------`-------- NDIS!ndisIterativeDPInvokeHandlerOnTracker+0x27c

19 (Inline Function) --------`-------- NDIS!ndisInvokeNextSendHandler+0x417

1a (Inline Function) --------`-------- NDIS!ndisSendNBLToFilter+0x497

1b (Inline Function) --------`-------- NDIS!ndisMTopSendNetBufferLists+0x4a5

1c fffff803`7df68da0 fffff800`6091e347 NDIS!NdisSendNetBufferLists+0x551

1d fffff803`7df68f90 fffff800`6091de14 vmswitch!VmsExtPtRouteNetBufferLists+0x377

1e fffff803`7df69060 fffff800`60012a53 vmswitch!VmsPtNicReceiveNetBufferLists+0x3c4

1f fffff803`7df691c0 fffff800`60012f19 NDIS!ndisMIndicateNetBufferListsToOpen+0x123

20 (Inline Function) --------`-------- NDIS!ndisMDispatchReceiveNetBufferListsInternal+0x27e

21 fffff803`7df69280 fffff800`600136b2 NDIS!ndisMTopReceiveNetBufferLists+0x2c9

22 (Inline Function) --------`-------- NDIS!ndisIterativeDPInvokeHandlerOnTracker+0x2d3

23 (Inline Function) --------`-------- NDIS!ndisInvokeNextReceiveHandler+0x64d

24 (Inline Function) --------`-------- NDIS!ndisMIndicateReceiveNetBufferListsInternal+0x6a2

25 fffff803`7df69310 fffff800`61a0f814 NDIS!NdisMIndicateReceiveNetBufferLists+0x732

26 fffff803`7df69500 fffff800`61a0f23e NdisImPlatform!implatTryToIndicateReceiveNBLs+0x1e8

27 fffff803`7df69570 fffff800`60012a53 NdisImPlatform!implatReceiveNetBufferLists+0x1a2

28 fffff803`7df695f0 fffff800`60012f19 NDIS!ndisMIndicateNetBufferListsToOpen+0x123

29 (Inline Function) --------`-------- NDIS!ndisMDispatchReceiveNetBufferListsInternal+0x27e

2a fffff803`7df696b0 fffff800`60013094 NDIS!ndisMTopReceiveNetBufferLists+0x2c9

2b (Inline Function) --------`-------- NDIS!ndisInvokeNextReceiveHandler+0x2f

2c (Inline Function) --------`-------- NDIS!ndisMIndicateReceiveNetBufferListsInternal+0x84

2d fffff803`7df69740 fffff800`606251c4 NDIS!NdisMIndicateReceiveNetBufferLists+0x114

2e fffff803`7df69930 fffff800`60625a9d e1i63x64!RECEIVE::RxIndicateNBLs+0xd4

2f fffff803`7df69970 fffff800`60618150 e1i63x64!RECEIVE::RxProcessInterrupts+0x19d

30 fffff803`7df699f0 fffff800`6061857e e1i63x64!INTERRUPT::MsgIntDpcTxRxProcessing+0x1c0

31 fffff803`7df69a60 fffff800`60617b78 e1i63x64!INTERRUPT::MsgIntMessageInterruptDPC+0x13e

32 fffff803`7df69ac0 fffff800`60014e02 e1i63x64!INTERRUPT::MiniportMessageInterruptDPC+0x28

33 (Inline Function) --------`-------- NDIS!ndisMiniportDpc+0x110

34 fffff803`7df69b00 fffff803`7c342cd0 NDIS!ndisInterruptDpc+0x1a3

35 fffff803`7df69be0 fffff803`7c341f87 nt!KiExecuteAllDpcs+0x1b0

36 fffff803`7df69d30 fffff803`7c3cbad5 nt!KiRetireDpcList+0xd7

37 fffff803`7df69fb0 fffff803`7c3cb8d9 nt!KyRetireDpcList+0x5

38 ffffd001`e478aa10 fffff803`7c3cd9fa nt!KiDispatchInterruptContinue

39 ffffd001`e478aa40 fffff803`7c343cd3 nt!KiDpcInterrupt+0xca

3a (Inline Function) --------`-------- nt!KzLowerIrql+0x9

3b ffffd001`e478abd0 fffff800`608b7d4b nt!KeInsertQueueDpc+0x1e3

3c (Inline Function) --------`-------- vmbkmclr!InpReleaseLockAndPerformWork+0xe8

3d ffffd001`e478ac50 fffff800`608b6a90 vmbkmclr!InpTransitionRunningQueue+0x17b

3e ffffd001`e478ac90 fffff800`60a1daaf vmbkmclr!InpProcessingWorkerRoutine+0xf0

3f (Inline Function) --------`-------- vmbusr!AwRunWorkItem+0x29

40 ffffd001`e478ace0 fffff803`7c379c70 vmbusr!AwWorkerThread+0xbf

41 ffffd001`e478ad40 fffff803`7c3cefc6 nt!PspSystemThreadStartup+0x58

42 ffffd001`e478ada0 00000000`00000000 nt!KxStartSystemThread+0x16

There are several threads on CPU0 that are ready to run:

Processor 0: Ready Threads at priority 15

    THREAD ffffe8006e2d2080  Cid 0c08.2894  Teb: 00007ff7ac976000 Win32Thread: 0000000000000000 READY on processor 0

    THREAD ffffe0014e53f880  Cid 0c08.0d58  Teb: 00007ff7ac978000 Win32Thread: 0000000000000000 READY on processor 0

    THREAD ffffe8006df75880  Cid 0c08.247c  Teb: 00007ff7ac972000 Win32Thread: fffff901407ac010 READY on processor 0

    THREAD ffffe8006bd24880  Cid 0c08.177c  Teb: 00007ff7ac96a000 Win32Thread: 0000000000000000 READY on processor 0

    THREAD ffffe00148889880  Cid 0004.0dc8  Teb: 0000000000000000 Win32Thread: 0000000000000000 READY on processor 0

    THREAD ffffe8006a321880  Cid 0820.1e64  Teb: 00007ff671f88000 Win32Thread: fffff901406c4b60 READY on processor 0

    THREAD ffffe00150278040  Cid 0004.045c  Teb: 0000000000000000 Win32Thread: 0000000000000000 READY on processor 0

    THREAD ffffe0014d98e300  Cid 0004.2160  Teb: 0000000000000000 Win32Thread: 0000000000000000 READY on processor 0

    THREAD ffffe8006e65b880  Cid 0004.1b00  Teb: 0000000000000000 Win32Thread: 0000000000000000 READY on processor 0

I see a clussvc thread that has been sitting in the READY state on CPU0 for almost 59 seconds. It is possible that this thread was eventually supposed to send the heartbeat.

THREAD ffffe0014d98e300  Cid 0004.2160  Teb: 0000000000000000 Win32Thread: 0000000000000000 READY on processor 0

Not impersonating

DeviceMap                 ffffc000c360dca0

Owning Process            ffffe001463c3900       Image:         System

Attached Process          N/A            Image:         N/A

Wait Start TickCount      52755922       Ticks: 3764 (0:00:00:58.812)

Context Switch Count      56741          IdealProcessor: 0            

UserTime                  00:00:00.000

KernelTime                00:00:01.109

Win32 Start Address nt!ExpWorkerThread (0xfffff8037c314100)

Stack Init ffffd001eb77efd0 Current ffffd001eb77ebc0

Base ffffd001eb77f000 Limit ffffd001eb779000 Call 0

Priority 15 BasePriority 7 UnusualBoost 8 ForegroundBoost 0 IoPriority 2 PagePriority 5

Child-SP          RetAddr           Call Site

ffffd001`eb77ec00 fffff803`7c29fe9b nt!KiSwapContext+0x76

(Inline Function) --------`-------- nt!KzCheckForThreadDispatch+0x134 (Inline Function @ fffff803`7c29fe9b)

ffffd001`eb77ed40 fffff803`7c29fb7b nt!KiCheckForThreadDispatch+0x153

ffffd001`eb77ed80 fffff803`7c29f98d nt!KeSetSystemGroupAffinityThread+0xfb

ffffd001`eb77edd0 fffff803`7c2f5a4b nt!KeGenericProcessorCallback+0xdd

ffffd001`eb77ef40 fffff803`7c3cc2f7 nt!KeGenericCallDpc+0x27

ffffd001`eb77ef80 fffff803`7c3cc2bd nt!KySwitchKernelStackCallout+0x27 (TrapFrame @ ffffd001`eb77ee40)

ffffd001`eb8f97f0 fffff803`7c2b6a7d nt!KiSwitchKernelStackContinue

ffffd001`eb8f9810 fffff803`7c3024ab nt!KeExpandKernelStackAndCalloutInternal+0x2fd

ffffd001`eb8f9900 fffff803`7c32634b nt!MiSwapStackPage+0x2d7

ffffd001`eb8f99d0 fffff803`7c304cee nt!MiClaimPhysicalRun+0x44f

ffffd001`eb8f9a50 fffff803`7c3043cd nt!MiFindContiguousPages+0x282

ffffd001`eb8f9bb0 fffff803`7c723a8c nt!MiRebuildLargePage+0x99

ffffd001`eb8f9c40 fffff803`7c31438c nt!MiRebuildLargePages+0x88

ffffd001`eb8f9c90 fffff803`7c379c70 nt!ExpWorkerThread+0x28c

ffffd001`eb8f9d40 fffff803`7c3cefc6 nt!PspSystemThreadStartup+0x58

ffffd001`eb8f9da0 00000000`00000000 nt!KxStartSystemThread+0x16
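
A note on reading the "Ticks" values in the thread output: the numbers in this dump are consistent with a clock tick of 15.625 ms (1/64 of a second). That tick length is an assumption inferred from the values shown, not something the dump states. A minimal Python sketch of the conversion follows. The helper name `ticks_to_elapsed` is made up for illustration, and truncating to whole milliseconds is chosen only because it reproduces the values shown above.

```python
TICKS_PER_SECOND = 64  # assumption: one clock tick = 15.625 ms

def ticks_to_elapsed(ticks):
    """Format a tick count as d:hh:mm:ss.mmm, truncating to whole milliseconds."""
    total_ms = ticks * 1000 // TICKS_PER_SECOND
    days, rem = divmod(total_ms, 86_400_000)
    hours, rem = divmod(rem, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1000)
    return f"{days}:{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"

print(ticks_to_elapsed(52))    # the RUNNING vmbusr worker thread
print(ticks_to_elapsed(3764))  # the READY worker thread
```

With that assumption, 52 ticks gives 0:00:00:00.812 and 3764 ticks gives 0:00:00:58.812, matching the output.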

Because of the DPC storm on CPU0, clussvc did not send a heartbeat to netft for 1 minute, and netft finally bugchecked the machine.

I would be curious to learn what scenario you are running that leads to this.

As a remediation I see several options:

  • Increase ClussvcHangTimeout (a cluster public property) to a value above the Cumulative DPC Time Limit. You can set it to 135 (2 minutes 15 seconds). In that case either the machine survives or it will be bugchecked by the DPC watchdog; either way it might be better than a 9e from netft.sys.
  • Look at your NIC settings. It looks like you are using 4 Intel(R) Gigabit ET Quad Port Server Adapter NICs hooked to the VMswitch and 3 Broadcom BCM5709C NetXtreme II GigE NICs available for the host. Perhaps you can tune how traffic from these NICs is load balanced across CPUs using RSS (Receive Side Scaling) and VMQ. Please also check that you have the latest drivers for these NICs.
  • I see you are using Windows Server 2012 R2. Please make sure you have the latest fixes: https://support.microsoft.com/kb/2920151?wa=wsignin1.0.
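
To illustrate the reasoning behind the first option, here is a rough Python sketch of which watchdog would fire first for a stall of a given length. It is a simplified model, not cluster code: it assumes both timers start counting at the same moment, and the function name `first_to_fire` is hypothetical. The 60 second value is the 1 minute observed in this dump, and 120 seconds is the Cumulative DPC Time Limit shown above.

```python
def first_to_fire(stall_seconds, clussvc_hang_timeout, dpc_limit=120.0):
    """Say which watchdog, if any, fires for a stall of the given length.

    Simplified model: both timers are assumed to start together.
    """
    deadline, name = min(
        (clussvc_hang_timeout, "netft (STOP 0x9E)"),
        (dpc_limit, "DPC watchdog"),
    )
    return name if stall_seconds >= deadline else "none, machine survives"

for timeout in (60, 135):
    for stall in (90, 125):
        print(f"timeout={timeout:>3}s stall={stall:>3}s -> {first_to_fire(stall, timeout)}")
```

With a timeout of 135, a 90 second stall is survived, and a longer one is caught by the DPC watchdog instead of netft.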

I would second Elden's advice to contact Microsoft support for this case. You should share your dump with them, and you are welcome to share the analysis above.

December 16th, 2014 3:58am

Hello Vladimir,

It seems that we are having a very similar problem with a new installation at a customer site. Maybe you can help us locate the problem?

You can find the latest dump file here: https://www.dropbox.com/s/i6m0ic49epbxk6g/MEMORY.zip?dl=0

There is a 2012 R2 cluster running on a DataCore SANsymphonyV virtual SAN. As soon as we start to initialize new virtual disks, the cluster crashes.

It would be great if you could help us, because the system will not go into production until the problem is fixed...

Thanks a lot,

Martin 

August 26th, 2015 8:25am

Hello Martin,

Netft bugchecked the machine because RHS.exe was not able to complete termination within 20 minutes. One of the threads of that process is stuck in the kernel, waiting for the storage to complete an IO.

I would suggest sharing that dump with the support team of the folks who implemented DcsPoll.sys, or checking whether they have any updates.

    Loaded symbol image file: DcsPoll.sys
    Image path: \SystemRoot\System32\drivers\DcsPoll.sys
    Image name: DcsPoll.sys
    Timestamp:        Mon Aug 10 11:26:39 2015 (55C8ECDF)
    CheckSum:         00013D53
    ImageSize:        00016000
    File version:     15.0.300.5312
    Product version:  15.0.300.5312
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        2.0 Dll
    File date:        00000000.00000000
    Translations:     0409.04b0
    CompanyName:      DataCore Software Corporation
    ProductName:      DcsPoll.sys
    InternalName:     DcsPoll.sys
    OriginalFilename: DcsPoll.sys
    ProductVersion:   15.0.300.5312
    FileVersion:      15.0.300.5312
    FileDescription:  DcsPoll.sys
    LegalCopyright:   Copyright 1998-2015 DataCore Software Corporation. All Rights Reserved.
    Comments:         All Rights Reserved.

System uptime is 4 days 2:58:57.605, and this thread has accumulated almost that much kernel time, so it appears to be running a busy loop:

THREAD ffffe000e1174880  Cid 0004.0248  Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 2e
Not impersonating
DeviceMap                 ffffc0015500c0b0
Owning Process            ffffe000e10b05c0       Image:         System
Attached Process          N/A            Image:         N/A
Wait Start TickCount      22805606       Ticks: 0
Context Switch Count      1196804        IdealProcessor: 44            
UserTime                  00:00:00.000
KernelTime                4 Days 02:06:17.359
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for DcsPoll.sys -
Win32 Start Address DcsPoll (0xfffff8013fa85504)
Stack Init ffffd00023569c90 Current ffffd00023569810
Base ffffd0002356a000 Limit ffffd00023564000 Call 0
Priority 8 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
ffffd000`23569b20 fffff801`3fa856a0 DcsPoll+0x5b2d
ffffd000`23569bc0 fffff803`c931536c DcsPoll+0x66a0
ffffd000`23569c00 fffff803`c936c2c6 nt!PspSystemThreadStartup+0x58
ffffd000`23569c60 00000000`00000000 nt!KxStartSystemThread+0x16

I see they are also busy-spinning on 3 other CPUs:

THREAD ffffe000e3dd1380  Cid 0004.0644  Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 4
Not impersonating
DeviceMap                 ffffc0015500c0b0
Owning Process            ffffe000e10b05c0       Image:         System
Attached Process          N/A            Image:         N/A
Wait Start TickCount      22805606       Ticks: 0
Context Switch Count      37419169       IdealProcessor: 4            
UserTime                  00:00:00.000
KernelTime                00:23:29.843
Win32 Start Address DcsPool (0xfffff80140a71d44)
Stack Init ffffd0002118fc90 Current ffffd0002118f890
Base ffffd00021190000 Limit ffffd0002118a000 Call 0
Priority 9 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
ffffd000`2118f920 fffff801`40a90462 DcsPool+0xe40a
ffffd000`2118fb20 fffff801`40a903ed DcsPool+0x23462
ffffd000`2118fb60 fffff801`40a9cd9e DcsPool+0x233ed
ffffd000`2118fb90 fffff801`40a71dde DcsPool+0x2fd9e
ffffd000`2118fbc0 fffff803`c931536c DcsPool+0x4dde
ffffd000`2118fc00 fffff803`c936c2c6 nt!PspSystemThreadStartup+0x58
ffffd000`2118fc60 00000000`00000000 nt!KxStartSystemThread+0x16

THREAD ffffe8008628a040  Cid 0004.26f0  Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 10
Not impersonating
DeviceMap                 ffffc0015500c0b0
Owning Process            ffffe000e10b05c0       Image:         System
Attached Process          N/A            Image:         N/A
Wait Start TickCount      22805606       Ticks: 0
Context Switch Count      2805765        IdealProcessor: 16            
UserTime                  00:00:00.000
KernelTime                00:19:47.625
Win32 Start Address DcsFcEng (0xfffff8013f609cf4)
Stack Init ffffd0002b121c90 Current ffffd0002b121810
Base ffffd0002b122000 Limit ffffd0002b11c000 Call 0
Priority 8 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
ffffd000`2b121968 fffff801`3fa6f6a4 DcsSup+0x13eb
ffffd000`2b121970 fffff801`3f60a3ce DcsSup!DcsSup::memset+0x48
ffffd000`2b1219a0 fffff801`3f60a208 DcsFcEng+0xa3ce
ffffd000`2b121a40 fffff801`3f607a6c DcsFcEng+0xa208
ffffd000`2b121ad0 fffff801`3f607ead DcsFcEng+0x7a6c
ffffd000`2b121b00 fffff801`3f609c11 DcsFcEng+0x7ead
ffffd000`2b121b40 fffff801`3f609cfd DcsFcEng+0x9c11
ffffd000`2b121bd0 fffff803`c931536c DcsFcEng+0x9cfd
ffffd000`2b121c00 fffff803`c936c2c6 nt!PspSystemThreadStartup+0x58
ffffd000`2b121c60 00000000`00000000 nt!KxStartSystemThread+0x16

THREAD ffffe000e1174040  Cid 0004.0244  Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 2c
Not impersonating
DeviceMap                 ffffc0015500c0b0
Owning Process            ffffe000e10b05c0       Image:         System
Attached Process          N/A            Image:         N/A
Wait Start TickCount      22805606       Ticks: 0
Context Switch Count      767536         IdealProcessor: 16            
UserTime                  00:00:00.000
KernelTime                4 Days 02:41:57.718
Win32 Start Address DcsPoll (0xfffff8013fa85504)
Stack Init ffffd000269e2c90 Current ffffd000269e26f0
Base ffffd000269e3000 Limit ffffd000269dd000 Call 0
Priority 8 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
ffffd000`269e2970 fffff801`40db3264 DcsIs+0x1d4ba
ffffd000`269e29a0 fffff801`414d4c33 DcsIs+0x1d264
ffffd000`269e29d0 fffff801`3fb815aa DcsiMgr+0x20c33
ffffd000`269e2a10 fffff801`3f621150 DcsShim+0xa5aa
ffffd000`269e2a80 fffff801`3f61d82a DcsFcEng+0x21150
ffffd000`269e2ac0 fffff801`3f60ed11 DcsFcEng+0x1d82a
ffffd000`269e2af0 fffff801`3fa84b71 DcsFcEng+0xed11
ffffd000`269e2b20 fffff801`3fa856a0 DcsPoll+0x5b71
ffffd000`269e2bc0 fffff803`c931536c DcsPoll+0x66a0
ffffd000`269e2c00 fffff803`c936c2c6 nt!PspSystemThreadStartup+0x58
ffffd000`269e2c60 00000000`00000000 nt!KxStartSystemThread+0x16
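
One quick way to spot busy loops like these is to compare each thread's KernelTime with the system uptime. Here is a small Python sketch using the values from the four threads above. The helper name `to_seconds` and the thread labels are only for illustration.

```python
import re

# Accepts "4 days 2:58:57.605", "4 Days 02:06:17.359" and "00:23:29.843".
_TIME = re.compile(r"(?:(\d+)\s+days?\s+)?(\d+):(\d+):(\d+(?:\.\d+)?)", re.IGNORECASE)

def to_seconds(text):
    """Convert a debugger-style duration string to seconds."""
    m = _TIME.search(text)
    if m is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    days, hours, minutes, seconds = m.groups()
    return int(days or 0) * 86400 + int(hours) * 3600 + int(minutes) * 60 + float(seconds)

uptime = to_seconds("4 days 2:58:57.605")

# KernelTime values copied from the thread output above.
kernel_times = {
    "DcsPoll, processor 2e": "4 Days 02:06:17.359",
    "DcsPool, processor 4": "00:23:29.843",
    "DcsFcEng, processor 10": "00:19:47.625",
    "DcsPoll, processor 2c": "4 Days 02:41:57.718",
}

for label, value in kernel_times.items():
    share = to_seconds(value) / uptime
    print(f"{label}: {share:.1%} of uptime spent in kernel mode")
```

Both DcsPoll threads come out at about 99% of uptime, which is what you would expect from a polling thread that never yields.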

In this dump I see 4 IOs outstanding to disks, one on each of the disks below:

 

  DO ffffe80085c2a4a0   Ext ffffe80085c2a5f0   Adapter ffffe000e48b31a0   Working
   Vendor: DataCore   Product: Mirror Disk        SCSI ID: (0, 3, 1)  
   Claimed Enumerated
   SlowLock Free   RemLock 2   PageCount 0
   QueueTagList: ffffe80085c2a6b0      Outstanding: Head ffffe000e4ff0050  Tail ffffe000e4ff0050  Timeout 50
   DeviceQueue ffffe80085c2a6e0   Depth: 250   Status: Not Frozen   PauseCount: 0   BusyCount: 0  
   IO Gateway: Busy Count 0   Pause Count 0
   Requests: Outstanding 1   Device 0   ByPass 0


[Device-Queued Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------
[Bypass-Queued Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------
[Outstanding Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------
ffffe80084c69350  [SCSI]     ffffe800850bd220  ffffe000e4ff0020  SCSI/UNMAP        ffffe80084222d00  0000000000000000  50
[Completed Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------

   DO ffffe800851d1480   Ext ffffe800851d15d0   Adapter ffffe000e48b31a0   Working
   Vendor: DataCore   Product: Virtual Disk       SCSI ID: (0, 0, 1)  
   Claimed Enumerated
   SlowLock Free   RemLock 2   PageCount 0
   QueueTagList: ffffe800851d1690      Outstanding: Head ffffe000e53f1c10  Tail ffffe000e53f1c10  Timeout 50
   DeviceQueue ffffe800851d16c0   Depth: 250   Status: Not Frozen   PauseCount: 0   BusyCount: 0  
   IO Gateway: Busy Count 0   Pause Count 0
   Requests: Outstanding 1   Device 0   ByPass 0


[Device-Queued Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------
[Bypass-Queued Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------
[Outstanding Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------
ffffe80085c142f0  [SCSI]     ffffe80083f7dec0  ffffe000e53f1be0  SCSI/UNMAP        ffffe80085d7b750  0000000000000000  50
[Completed Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------

   DO ffffe80086beb060   Ext ffffe80086beb1b0   Adapter ffffe000e48b31a0   Working
   Vendor: DataCore   Product: Mirror Disk        SCSI ID: (0, 3, 2)  
   Claimed Enumerated
   SlowLock Free   RemLock 2   PageCount 0
   QueueTagList: ffffe80086beb270      Outstanding: Head ffffe000e4ff7050  Tail ffffe000e4ff7050  Timeout 50
   DeviceQueue ffffe80086beb2a0   Depth: 250   Status: Not Frozen   PauseCount: 0   BusyCount: 0  
   IO Gateway: Busy Count 0   Pause Count 0
   Requests: Outstanding 1   Device 0   ByPass 0
[Device-Queued Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------
[Bypass-Queued Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------
[Outstanding Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------
ffffe000e1392ee0  [SCSI]     ffffe80084579890  ffffe000e4ff7020  SCSI/UNMAP        ffffe000edf20360  0000000000000000  50
[Completed Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------

   DO ffffe80089d6c7f0   Ext ffffe80089d6c940   Adapter ffffe000e48b31a0   Working
   Vendor: DataCore   Product: Virtual Disk       SCSI ID: (0, 1, 0)  
   Claimed Enumerated
   SlowLock Free   RemLock 2   PageCount 0
   QueueTagList: ffffe80089d6ca00      Outstanding: Head ffffe000e53f8c10  Tail ffffe000e53f8c10  Timeout 50
   DeviceQueue ffffe80089d6ca30   Depth: 250   Status: Not Frozen   PauseCount: 0   BusyCount: 0  
   IO Gateway: Busy Count 0   Pause Count 0
   Requests: Outstanding 1   Device 0   ByPass 0


[Device-Queued Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------
[Bypass-Queued Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------
[Outstanding Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------
ffffe800864be010  [SCSI]     ffffe800856d2df0  ffffe000e53f8be0  SCSI/UNMAP        ffffe80084b7fa10  0000000000000000  50
[Completed Requests] IRP               SRB Type   SRB               XRB               Command           MDL               SGList            Timeout
-----------------------------------------------------------------------------------------------------------------------------------

Since this is SCSI UNMAP, my guess is that this is some solution that provides an SSD/NVMe-based cache.

August 26th, 2015 7:20pm

Hello Vladimir,

Could you please help review the memory dump uploaded at the link below?

https://drive.google.com/file/d/0B1Z6Q5Mfd7nid2lDWGJKLVZDWWs/view?usp=sharing

I have a 2-node Windows 2012 R2 active/passive cluster set up on Hyper-V. The cluster nodes generate crash dumps with bugcheck 0x9e very frequently.

Thanks & Regards,

Suchit Patil

September 9th, 2015 11:41am

Hello Suchit,

The cluster bugchecked the machine because the Resource Host Monitor had not completed termination within 20 minutes. One of the RHS threads has been stuck in the kernel for about 20 minutes. It looks like things are getting stuck in TmXPFlt.sys.

 

As a remediation you might want to uninstall this product until the issue is resolved. I would also suggest talking to the support team of the company that provided that solution, to see if they have a fix and to make sure they are aware of the issue.

 

I see lots of threads in the system stuck with a similar call stack.

 

    Loaded symbol image file: TmXPFlt.sys

    Image path: \??\C:\Program Files (x86)\Trend Micro\OfficeScan Client\TmXPFlt.sys

    Image name: TmXPFlt.sys

    Timestamp:        Sat Aug 30 06:11:38 2014 (5401CD8A)

    CheckSum:         0005DDB6

    ImageSize:        0006C000

    Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4

 

        THREAD fffffa806e2d6080  Cid 0d94.0f6c  Teb: 000007f7a0c6e000 Win32Thread: fffff90102e3ab80 WAIT: (Executive) KernelMode Non-Alertable

            fffff88007394440  SynchronizationEvent

        IRP List:

            fffffa8033b24010: (0006,03e8) Flags: 00000884  Mdl: 00000000

        Not impersonating

        DeviceMap                 fffff8a00000c310

        Owning Process            fffffa8032dd9980       Image:         rhs.exe

        Attached Process          N/A            Image:         N/A

        Wait Start TickCount      750939         Ticks: 76757 (0:00:19:59.328)

        Context Switch Count      378            IdealProcessor: 5            

        UserTime                  00:00:00.015

        KernelTime                00:00:00.015

        Win32 Start Address 0x000007f7a13cbc24

        Stack Init fffff88007395c90 Current fffff88007394190

        Base fffff88007396000 Limit fffff88007390000 Call 0

        Priority 14 BasePriority 13 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5

        Child-SP          RetAddr           Call Site

        fffff880`073941d0 fffff800`342aff79 nt!KiSwapContext+0x76

        (Inline Function) --------`-------- nt!KiSwapThread+0xfa (Inline Function @ fffff800`342aff79)

        fffff880`07394310 fffff800`342ac21f nt!KiCommitThreadWait+0x229

        fffff880`07394380 fffff880`05050457 nt!KeWaitForSingleObject+0x1cf

        fffff880`07394410 fffff880`050460df TmXPFlt+0xe457

        fffff880`07394470 fffff880`04384df5 TmXPFlt+0x40df

        fffff880`07394590 fffff880`016ae844 TmPreFlt!TmpQueryFullName+0xd61

        fffff880`07394660 fffff880`016afa6c fltmgr!FltpPerformPreCallbacks+0x324

        fffff880`07394770 fffff880`016da349 fltmgr!FltpPassThroughInternal+0x8c

        fffff880`073947a0 fffff800`34655228 fltmgr!FltpCreate+0x339

        (Inline Function) --------`-------- nt!IoCallDriverWithTracing+0xc3 (Inline Function @ fffff800`34655228)

        fffff880`07394850 fffff800`34668470 nt!IopParseDevice+0x173c

        fffff880`07394a30 fffff800`34656978 nt!ObpLookupObjectName+0x644

        fffff880`07394b40 fffff800`3466930e nt!ObOpenObjectByName+0x258

        fffff880`07394c10 fffff800`3463f96c nt!IopCreateFile+0x37c

        fffff880`07394cb0 fffff800`34284d53 nt!NtOpenFile+0x58

        fffff880`07394d40 fffff800`34289f30 nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`07394db0)

        fffff880`07394f48 fffff800`34626a68 nt!KiServiceLinkage

        fffff880`07394f50 fffff800`34284d53 nt!NtCreateUserProcess+0x400

        fffff880`07395a90 000007fb`572a371b nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`07395b00)

        000000a4`76ced028 00000000`00000000 0x000007fb`572a371b
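
When many threads are stuck like this, it helps to list which modules each stack passes through, top of stack first. Below is a minimal Python sketch, assuming frames formatted like the ones above. The function name `modules_in_order` and the regular expression are my own illustration, not a debugger command.

```python
import re

# A frame's call site looks like "module!symbol+0xNN" or "module+0xNN".
# The leading \s keeps return addresses such as fffff880`050460df from matching.
_FRAME = re.compile(r"\s([A-Za-z_][\w.]*)(?:!\S+|\+0x[0-9a-fA-F]+)")

def modules_in_order(stack_text):
    """Return module names in the order they first appear, top of stack first."""
    seen = []
    for line in stack_text.splitlines():
        m = _FRAME.search(line)
        if m and m.group(1) not in seen:
            seen.append(m.group(1))
    return seen

# A few frames copied from the rhs.exe thread above.
sample = """\
fffff880`073941d0 fffff800`342aff79 nt!KiSwapContext+0x76
fffff880`07394380 fffff880`05050457 nt!KeWaitForSingleObject+0x1cf
fffff880`07394410 fffff880`050460df TmXPFlt+0xe457
fffff880`07394470 fffff880`04384df5 TmXPFlt+0x40df
fffff880`07394590 fffff880`016ae844 TmPreFlt!TmpQueryFullName+0xd61
fffff880`07394660 fffff880`016afa6c fltmgr!FltpPerformPreCallbacks+0x324
"""

modules = modules_in_order(sample)
print(modules)
# The first module below the kernel wait is the one that issued the wait.
print(next(name for name in modules if name != "nt"))
```

For the sample frames the first module after the kernel wait is TmXPFlt, which is the driver called out above.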

 

 

I've also noticed several threads where TmXPFlt is trying to open a file over SMB. Perhaps all other activity is stuck behind these opens, but it is hard to tell without symbols.

 

THREAD fffffa806e078080  Cid 0004.0c9c  Teb: 0000000000000000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable

    fffffa803444e190  SynchronizationEvent

IRP List:

    fffffa806f440010: (0006,01f0) Flags: 00000884  Mdl: 00000000

Impersonation token:  fffff8a00dc72270 (Level Impersonation)

DeviceMap                 fffff8a00e5514b0

Owning Process            fffffa8030bc9980       Image:         System

Attached Process          N/A            Image:         N/A

Wait Start TickCount      728673         Ticks: 99023 (0:00:25:47.234)

Context Switch Count      42000          IdealProcessor: 7            

UserTime                  00:00:00.000

KernelTime                00:00:40.156

Win32 Start Address TmXPFlt (0xfffff8800504dddc)

Stack Init fffff88009395fd0 Current fffff88009395b80

Base fffff88009396000 Limit fffff88009390000 Call 0

Priority 12 BasePriority 8 UnusualBoost 3 ForegroundBoost 0 IoPriority 2 PagePriority 5

Child-SP          RetAddr           Call Site

fffff880`09395bc0 fffff800`342aff79 nt!KiSwapContext+0x76

(Inline Function) --------`-------- nt!KiSwapThread+0xfa (Inline Function @ fffff800`342aff79)

fffff880`09395d00 fffff800`342ac21f nt!KiCommitThreadWait+0x229

fffff880`09395d70 fffff880`056483bb nt!KeWaitForSingleObject+0x1cf

fffff880`09395e00 fffff880`0563ffde mrxsmb10!SmbCeInitiateExchange+0x30f

fffff880`09395e70 fffff880`043a40db mrxsmb10!MRxSmbCreate+0x8d6

fffff880`09395f50 fffff800`342804a7 mrxsmb!SmbpShellCreateWithNewStack+0x1b

fffff880`09395f80 fffff800`3428046d nt!KySwitchKernelStackCallout+0x27 (TrapFrame @ fffff880`09395e40)

fffff880`0664b880 fffff800`342c786e nt!KiSwitchKernelStackContinue

fffff880`0664b8a0 fffff800`34243fc5 nt!KeExpandKernelStackAndCalloutInternal+0x20e

fffff880`0664b9a0 fffff880`043a40aa nt!KeExpandKernelStackAndCallout+0x15

fffff880`0664b9e0 fffff880`01ba8620 mrxsmb!SmbShellCreate+0x4a

fffff880`0664ba10 fffff880`01ba547d rdbss!RxCollapseOrCreateSrvOpen+0x210

fffff880`0664baa0 fffff880`01ba69ab rdbss!RxCreateFromNetRoot+0x63d

fffff880`0664bbd0 fffff880`01b6e652 rdbss!RxCommonCreate+0x15b

fffff880`0664bc70 fffff880`01ba059b rdbss!RxFsdCommonDispatch+0x522

fffff880`0664bdd0 fffff880`043d209c rdbss!RxFsdDispatch+0xcb

fffff880`0664be30 fffff880`01f37161 mrxsmb!MRxSmbFsdDispatch+0x8c

fffff880`0664be70 fffff880`01f34215 mup!MupiCallUncProvider+0x1b1

fffff880`0664bee0 fffff880`01f32475 mup!MupStateMachine+0xb6

fffff880`0664bf10 fffff880`016b04ee mup!MupCreate+0x165

fffff880`0664bf80 fffff880`016da35d fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x25e

fffff880`0664c020 fffff800`34655228 fltmgr!FltpCreate+0x34d

(Inline Function) --------`-------- nt!IoCallDriverWithTracing+0xc3 (Inline Function @ fffff800`34655228)

fffff880`0664c0d0 fffff800`34668470 nt!IopParseDevice+0x173c

fffff880`0664c2b0 fffff800`34656978 nt!ObpLookupObjectName+0x644

fffff880`0664c3c0 fffff800`3466930e nt!ObOpenObjectByName+0x258

fffff880`0664c490 fffff800`34669a59 nt!IopCreateFile+0x37c

fffff880`0664c530 fffff800`34284d53 nt!NtCreateFile+0x79

fffff880`0664c5c0 fffff800`34289f30 nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`0664c630)

fffff880`0664c7c8 fffff880`04fb1651 nt!KiServiceLinkage

fffff880`0664c7d0 fffff880`04fb243a VSApiNt!VSSwapShortTable+0x721

fffff880`0664c840 fffff880`05049d16 VSApiNt!VSKDZwCreateFile+0x5a

fffff880`0664c8b0 fffff880`0504c1b9 TmXPFlt+0x7d16

fffff880`0664c980 fffff880`0504ce6e TmXPFlt+0xa1b9

fffff880`0664c9e0 fffff880`0504da7f TmXPFlt+0xae6e

fffff880`0664cb20 fffff880`0504def1 TmXPFlt+0xba7f

fffff880`0664cbe0 fffff800`3422f2c5 TmXPFlt+0xbef1

fffff880`0664cc10 fffff800`3426c656 nt!PspSystemThreadStartup+0x59 [d:\win8_ldr\minkernel\ntos\ps\psexec.c @ 5691]

fffff880`0664cc60 00000000`00000000 nt!KxStartSystemThread+0x16 [d:\win8_ldr\minkernel\ntos\ke\amd64\threadbg.asm @ 75]

 

    Loaded symbol image file: VSApiNt.sys

    Image path: \??\C:\Program Files (x86)\Trend Micro\OfficeScan Client\VSApiNt.sys

    Image name: VSApiNt.sys

    Timestamp:        Sat Aug 30 06:03:46 2014 (5401CBB2)

    CheckSum:         0024476C

    ImageSize:        00238000

    Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4

 

Here is a list of all opens that are stuck over SMB:

 

RxContext        RDR [  Maj,  Min] Irp              Thread           FCB

 

fffffa8032dd4bb0   0 [ 0x 0, 0x 0] fffffa806e7df010 fffffa806eea9440 0000000000000000

       16:08.471   CREATE          '\HMEL-BTH-DC03.hmel.int\IPC$'

 

fffffa8034749950   0 [ 0x 0, 0x 0] fffffa806f624d90 fffffa80342c4b00 fffff88001b998c0

       13:29.967   CREATE          '<<empty>>'

 

fffffa8033ce44b0   0 [ 0x 0, 0x 0] fffffa806f130d10 fffffa80338bbb00 fffff8a00e5bf010

       25:47.248   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP01\2015\08-18\F\0E4\F0E43290E239950FABB7730FEA0B4421.DVS'

 

fffffa8033ee15e0   0 [ 0x 0, 0x 0] fffffa8033f27400 fffffa806f4a0900 fffff8a00ee432a0

       25:47.248   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP02\2015\05-07\6\125\6125F1071BA45DE8BA67A9D1E7004ED1~90~9F3EAD6D~00~1.DVSSP'

 

fffffa8034459200   0 [ 0x 0, 0x 0] fffffa806ef81be0 fffffa803413cb00 fffff8a00d5d4670

       25:47.247   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP01\2015\08-18\F\0E4\F0E4B8A43ECF2FB81443FFE354A7A931.DVS'

 

fffffa8033ac7950   2 [ 0x e, 0x 0] fffffa8032de64f0 fffffa8033896080 fffff8a00f667610

       76:14.999   IOCTL           '\2'

 

fffffa8031f94010   2 [ 0x e, 0x 0] fffffa80335c2010 fffffa803216b600 fffff8a00f667610

       44:46.553   IOCTL           '\2'

 

fffffa80335e05a0   0 [ 0x 0, 0x 0] fffffa803453ec20 fffffa8033ff5080 fffff8a00daac010

       25:47.250   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP01\2015\08-16\F\051\F0516C6C8D4D24CED66C01341EBC0F71.DVS'

 

fffffa8033b59610   0 [ 0x 0, 0x 0] fffffa80342b9600 fffffa806f8af5c0 fffff8a0118532a0

       25:47.248   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP02\2015\08-18\6\11D\611D856FB14A3D5416A61EF0D7116911~0F~C97B4131~00~1.DVSSP'

 

fffffa8033c93240   0 [ 0x 0, 0x 0] fffffa8033b2c580 fffffa8034381b00 fffff8a00da1d600

       25:47.248   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP01\2014\07-17\6\0A5\60A5BB94AFB6B48F2B15E53045337701~35~1FD28490~00~1.DVSSP'

 

fffffa806ef187f0   0 [ 0x 0, 0x 0] fffffa80340a4630 fffffa806e0c0080 fffff8a010c947e0

       25:46.524   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP02\2015\05-17\D\06B\D06B43ABD3EEB44CA7CB2FE2CAB27721~39~6A552458~00~1.DVSSP'

 

fffffa80336269a0   0 [ 0x 0, 0x 0] fffffa8033528780 fffffa806e0fa080 fffff8a012ccf010

       25:46.499   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP02\2015\03-05\3\03B\303B66B5ACE86D5696529DF90977A8F1~6D~5FD6F55D~00~1.DVSSP'

 

fffffa8034749cb0   0 [ 0x 0, 0x 0] fffffa806f030750 fffffa806f53e080 fffff88001b998c0

       13:29.967   CREATE          '<<empty>>'

 

fffffa806f35fc20   0 [ 0x 0, 0x 0] fffffa80343ced10 fffffa803417fb00 fffff8a00d22e350

       25:47.247   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP02\2015\08-18\6\11D\611D8ED60F35534A9D9860B16528A501~C4~46589CF8~00~1.DVSSP'

 

fffffa803444e010   0 [ 0x 0, 0x 0] fffffa806f440010 fffffa806e078080 fffff8a00dd3e2f0

       25:47.241   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP02\2015\08-18\6\11D\611D8CD7E12525CA7EE50328316C7AF1~85~6174E8BD~00~1.DVSSP'

 

fffffa8034151010   0 [ 0x 0, 0x 0] fffffa8034366690 fffffa806e0fab00 fffff8a00d89d8e0

       25:47.234   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP01\2015\08-18\F\0E4\F0E4B2D443D4F8D070887D83728C6411.DVS'

 

fffffa8033fe7cb0   0 [ 0x 0, 0x 0] fffffa8033bf2d80 fffffa806e031b00 fffff8a00d66b500

       25:47.234   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP02\2015\08-18\6\11D\611D83CE4F080D152AB519841E0F9551.DVS'

 

fffffa806e9acb30   0 [ 0x 0, 0x 0] fffffa8033468940 fffffa806e0bfb00 fffff8a00e6bf010

       25:47.234   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP02\2015\08-18\6\11D\611D8585CAAFD30258F5F9204E8B8F21.DVS'

 

fffffa806f214cb0   0 [ 0x 0, 0x 0] fffffa8033e719a0 fffffa806f56e080 fffff8a012de3010

       25:46.500   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP02\2015\06-15\5\048\50488155ACE08770781A75BBD1F269C1~29~B78492F0~00~1.DVSSP'

 

fffffa8034747cb0   0 [ 0x 0, 0x 0] fffffa806f567380 fffffa806ebda480 fffff88001b998c0

       13:29.967   CREATE          '<<empty>>'

 

fffffa80346e2010   0 [ 0x 0, 0x 0] fffffa8033c48730 fffffa803424ca80 fffff88001b998c0

       13:29.967   CREATE          '<<empty>>'

 

fffffa80349fd310   0 [ 0x 0, 0x 0] fffffa8033d55b00 fffffa8033dbd100 fffff88001b998c0

        3:29.963   CREATE          '<<empty>>'

 

fffffa803493e010   0 [ 0x 0, 0x 0] fffffa8033c3c010 fffffa80341a1b00 fffff88001b998c0

        3:29.962   CREATE          '<<empty>>'

 

fffffa806f6fe010   0 [ 0x 0, 0x 0] fffffa806f32b680 fffffa806e0bdb00 fffff8a010d06730

       25:47.234   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP01\2015\08-18\F\0E9\F0E9001E013394D9C07EF02330BFB911.DVS'

 

fffffa806ec38010   0 [ 0x 0, 0x 0] fffffa806f07e700 fffffa806f366b00 fffff8a012ca2a80

       25:47.225   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP01\2015\08-18\F\0E7\F0E756057B9D6D2ABECB8E13E3415CC1.DVS'

 

fffffa806ee48230   0 [ 0x 0, 0x 0] fffffa806f132010 fffffa806e0be6c0 fffff8a01242a010

       25:47.014   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP01\2014\12-20\D\0C2\D0C292F3FB7A73F70208526C10EAF491.DVS'

 

fffffa806f78e9a0   0 [ 0x 0, 0x 0] fffffa806e8e12c0 fffffa8034337b00 fffff8a004aea010

       25:46.501   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP02\2015\05-17\C\0E1\C0E1AAE496323E417F8B4BEFFAAE1FB1~D5~E770E121~00~1.DVSSP'

 

fffffa806eb5a450   0 [ 0x 0, 0x 0] fffffa806ef3b6b0 fffffa806ee98b00 fffff88001b998c0

       13:29.967   CREATE          '<<empty>>'

 

fffffa806f5f87f0   0 [ 0x 0, 0x 0] fffffa806f677010 fffffa806e0bf080 fffff8a00ffa4a60

       25:47.190   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP02\2015\05-29\5\0BC\50BCEB2E36C8C4FABB47EC3453CB13B1~F3~00A34DF4~00~1.DVSSP'

 

fffffa806f1d9770   0 [ 0x 0, 0x 0] fffffa806f46d390 fffffa806f048080 fffff8a01282a010

       25:47.249   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP01\2015\01-16\1\124\1124F16CCD8694D455954CAABDF10111~97~5E195B14~00~1.DVSSP'

 

fffffa806f71ecb0   0 [ 0x 0, 0x 0] fffffa806f543370 fffffa806f5cb080 fffff8a012705a80

       25:47.248   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP01\2015\08-15\1\00F\100F697A6B99B08748C65EBD6A56BC21.DVS'

 

fffffa806f703900   0 [ 0x 0, 0x 0] fffffa806f742010 fffffa806e0c06c0 fffff8a012b51380

       25:47.221   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP01\2015\08-13\F\011\F0116842118398CEE7462D0AE65D7601.DVS'

 

fffffa806eeee780   0 [ 0x 0, 0x 0] fffffa80345d6a10 fffffa806f38b300 fffff88001b998c0

        3:29.962   CREATE          '<<empty>>'

 

fffffa806f529190   0 [ 0x 0, 0x 0] fffffa806f7f8320 fffffa806e0bab00 fffff8a012b145b0

       25:47.249   CREATE          '\Hmelbthdd\bth-evmjavsp01'

 

fffffa806eb4b910   0 [ 0x 0, 0x 0] fffffa8034263e10 fffffa803438e080 fffff8a010b89a80

       25:47.027   CREATE          '\Hmelbthdd\bth-evmjavsp01\EVMJAVSP01\2015\08-16\F\054\F054CC18A0FC36B2B94C5D416455D4B1.DVS'

 

fffffa803415c010   0 [ 0x 0, 0x 0] fffffa806f800510 fffffa806f267680 fffff88001b998c0

       13:29.967   CREATE          '<<empty>>'

 

fffffa806ebee830   0 [ 0x 0, 0x 0] fffffa806f6fd010 fffffa8033ee4080 fffff88001b998c0

       13:29.966   CREATE          '<<empty>>'
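A listing like the one above is easier to reason about once it is summarized. The following is a minimal Python sketch, not part of the original analysis, that parses a saved copy of such a listing and counts stuck operations per server. It assumes the time column is minutes:seconds.milliseconds and that the listing was saved as plain text; the function names `parse_opens` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Matches the second line of each entry, for example:
#   25:47.248   CREATE          '\server\share\file'
# Assumption: the first column is minutes:seconds.milliseconds pending.
ENTRY_RE = re.compile(r"^\s*(\d+):(\d{2})\.(\d{3})\s+(\w+)\s+'(.*)'\s*$")


def parse_opens(text):
    """Return a list of (seconds_pending, operation, path) tuples."""
    entries = []
    for line in text.splitlines():
        m = ENTRY_RE.match(line)
        if m:
            minutes, seconds, millis, op, path = m.groups()
            pending = int(minutes) * 60 + int(seconds) + int(millis) / 1000.0
            entries.append((pending, op, path))
    return entries


def summarize(entries):
    """Count entries per (operation, first path component) pair."""
    counts = Counter()
    for _pending, op, path in entries:
        # '\server\share\file' -> 'server'; non-path values are kept as-is.
        first = path.lstrip("\\").split("\\", 1)[0] if path.startswith("\\") else path
        counts[(op, first)] += 1
    return counts


# Three entries copied from the listing above, used as sample input.
SAMPLE = r"""
fffffa8032dd4bb0   0 [ 0x 0, 0x 0] fffffa806e7df010 fffffa806eea9440 0000000000000000
       16:08.471   CREATE          '\HMEL-BTH-DC03.hmel.int\IPC$'
fffffa8034749950   0 [ 0x 0, 0x 0] fffffa806f624d90 fffffa80342c4b00 fffff88001b998c0
       13:29.967   CREATE          '<<empty>>'
fffffa8033ac7950   2 [ 0x e, 0x 0] fffffa8032de64f0 fffffa8033896080 fffff8a00f667610
       76:14.999   IOCTL           '\2'
"""

if __name__ == "__main__":
    entries = parse_opens(SAMPLE)
    for key, count in summarize(entries).most_common():
        print(key, count)
    print("longest pending (s):", max(e[0] for e in entries))
```

Run against the full listing, a summary like this would make it obvious which server most of the stuck CREATE requests target and how long the oldest request has been pending.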

 

The SMB client has several IRPs that have been stuck in the networking stack for a long time:

 

Time Pending  IRP

 

 25:47.251    fffffa806f775010

 16:08.472    fffffa8033fb3b90

 

6: kd> !irp fffffa806f775010

Irp is active with 2 stacks 1 is current (= 0xfffffa806f7750e0)

 No Mdl: No System Buffer: Thread 00000000:  Irp stack trace. 

     cmd  flg cl Device   File     Completion-Context

>[IRP_MJ_INTERNAL_DEVICE_CONTROL(f), N/A(10)]

            0 e1 fffffa80315c6c10 00000000 fffff8800439d8b0-fffffa8033a6fd10 Success Error Cancel pending

              \Driver\AFD   mrxsmb!SmbWskGetAddressInfoComplete

                     Args: fffffa803209e410 fffff880096e5ae0 fffffa803392d5c0 00000000

 [N/A(0), N/A(0)]

            0  0 00000000 00000000 00000000-00000000   

 

                     Args: fffff88001b97a00 fffff88001b97a00 fffffa806f775010 7184f95f

 

6: kd> !irp fffffa8033fb3b90

Irp is active with 2 stacks 1 is current (= 0xfffffa8033fb3c60)

 No Mdl: No System Buffer: Thread 00000000:  Irp stack trace. 

     cmd  flg cl Device   File     Completion-Context

>[IRP_MJ_INTERNAL_DEVICE_CONTROL(f), N/A(10)]

            0 e1 fffffa80315c6c10 00000000 fffff8800439d8b0-fffffa806f77a240 Success Error Cancel pending

              \Driver\AFD   mrxsmb!SmbWskGetAddressInfoComplete

                     Args: fffffa803209e410 fffff8800bf71c50 fffffa8033c71ec0 00000000

 [N/A(0), N/A(0)]

            0  0 00000000 00000000 00000000-00000000   

 

                     Args: fffff88001b98b00 fffff88001b98b00 fffffa8033fb3b90 718dce3a

 

It looks like these IRPs are calls from the SMB client to the DNS client to resolve names. NDIS should send an up-call to the DNS client service in user mode to resolve them. The DNS client service is hosted in one of the svchost.exe processes. I see that many threads from svchost processes are stuck in TmXPFlt, so it is possible that this is what is causing the deadlock.
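To check how widespread this is, one option is to save the thread stacks from the debugger to a text file and count the threads that have a frame from a given driver. The Python sketch below is only an illustration under stated assumptions: it assumes each thread block starts with a line beginning with `THREAD `, that frames use the `Child-SP RetAddr Call Site` layout shown earlier, and that inline-function frames can be ignored. The function name `threads_with_module` and the header lines in the sample are made up.

```python
import re

# A normal frame line: Child-SP, RetAddr, then the call site, for example
#   fffff880`0664c8b0 fffff880`0504c1b9 TmXPFlt+0x7d16
# The captured group is the module name before '!' or '+'.
FRAME_RE = re.compile(
    r"^[0-9a-f]{8}`[0-9a-f]{8}\s+[0-9a-f]{8}`[0-9a-f]{8}\s+([A-Za-z_]\w*)[!+]"
)


def threads_with_module(text, module):
    """Return indexes (0-based) of thread blocks that contain a frame from `module`."""
    hits = []
    index = -1      # index of the thread block currently being read
    seen = False    # whether the current block was already recorded
    wanted = module.lower()
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("THREAD "):
            index += 1
            seen = False
            continue
        m = FRAME_RE.match(stripped)
        if m and index >= 0 and not seen and m.group(1).lower() == wanted:
            hits.append(index)
            seen = True
    return hits


# Hypothetical sample: the THREAD header lines are invented for illustration,
# the frame lines are copied from the stack shown earlier in this thread.
SAMPLE = """
THREAD aaaaaaaaaaaaaaaa  (hypothetical header)
fffff880`09395bc0 fffff800`342aff79 nt!KiSwapContext+0x76
fffff880`0664c8b0 fffff880`0504c1b9 TmXPFlt+0x7d16
THREAD bbbbbbbbbbbbbbbb  (hypothetical header)
fffff880`09395d70 fffff880`056483bb nt!KeWaitForSingleObject+0x1cf
"""

if __name__ == "__main__":
    print("threads with TmXPFlt frames:", threads_with_module(SAMPLE, "TmXPFlt"))
```

A high count of blocked threads that pass through the same third-party filter driver would support the suspicion above, although it does not prove that the driver is the root cause.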

September 9th, 2015 2:57pm

Thank you very much, Vladimir, for your prompt review of the memory dump. We have taken the required action on Trend Micro Antivirus, and the cluster nodes are under observation.


September 10th, 2015 3:13am

