Help with decoding a BSOD on Win 2003 Server
In the last 10 days our Windows Server 2003 box has started BSODing several times a day. I have narrowed it down to a handful of tasks that trigger it. These tasks did not cause this behaviour before. The only system change was an upgrade from AVG 8 to AVG 9, and the following day we saw these issues; possibly a coincidence. Early in my investigation I removed AVG 9 and used the tidy tool to clean the machine of it.

The BSOD occurs when our web control panel (Helm) talks to our email server, or when one of my file/log tidy scripts runs during the night. The common theme is that they reference the machine by its network ID rather than using files.

The event log reports this:
Error code 000000d1, parameter1 1200001c, parameter2 d0000002, parameter3 00000000, parameter4 ba3f8315.

I have so far done the following:
- reinstalled SP2
- repair install of Helm
- repair install of the IMail email server
- checked that the firmware on the server and RAID is up to date
- updated the LAN drivers from Dell's website
- applied hotfix 943545 recommended by Dell

I have used WinDbg to do some decoding, which suggests TDI.SYS is at fault (below). So I am now at the point of wanting some advice on what to try next, as I'm running out of ideas.

Thanks,
Glenn

Microsoft (R) Windows Debugger Version 6.11.0001.404 X86
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [C:\WINDOWS\MEMORY.DMP]
Kernel Complete Dump File: Full address space is available

************************************************************
WARNING: Dump file has been truncated. Data may be missing.
************************************************************
Symbol search path is: SRV*c:\localcache*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows Server 2003 Kernel Version 3790 (Service Pack 2) MP (4 procs) Free x86 compatible
Product: Server, suite: TerminalServer SingleUserTS Blade
Built by: 3790.srv03_sp2_gdr.090805-1438
Machine Name:
Kernel base = 0x80800000 PsLoadedModuleList = 0x808a6ea8
Debug session time: Sun Dec 27 05:58:35.001 2009 (GMT+0)
System Uptime: 0 days 0:56:06.072
Loading Kernel Symbols
...............................................................
................................................................
...................................
Loading User Symbols
Loading unloaded module list
..................
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck D1, {1200001c, d0000002, 0, ba403315}

Probably caused by : TDI.SYS ( TDI!CTEpEventHandler+32 )

Followup: MachineOwner
---------

1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 1200001c, memory referenced
Arg2: d0000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: ba403315, address which referenced memory

Debugging Details:
------------------

READ_ADDRESS: 1200001c

CURRENT_IRQL: 2

FAULTING_IP:
tcpip!IndicateData+38e
ba403315 ff700c push dword ptr [eax+0Ch]

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xD1

PROCESS_NAME: System

TRAP_FRAME: f790e9e4 -- (.trap 0xfffffffff790e9e4)
ErrCode = 00000000
eax=12000010 ebx=8a1b6d44 ecx=00000000 edx=00000000 esi=872a5e38 edi=8724b1f8
eip=ba403315 esp=f790ea58 ebp=f790eaa0 iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
tcpip!IndicateData+0x38e:
ba403315 ff700c push dword ptr [eax+0Ch] ds:0023:1200001c=????????
Resetting default scope

LAST_CONTROL_TRANSFER: from ba403315 to 8088c99b

STACK_TEXT:
f790e9e4 ba403315 badb0d00 00000000 8a1b6e54 nt!KiTrap0E+0x2a7
f790eaa0 ba401354 002a5e38 00001050 8a1cfd20 tcpip!IndicateData+0x38e
f790eaec ba400ab0 65182f91 65182f91 8a1cfd20 tcpip!TcpFastReceive+0x301
f790ebc8 ba3fd101 8a1d1ce8 0100007f 0100007f tcpip!TCPRcv+0x72f
f790ec28 ba3fb326 00000024 8a1d1ce8 ba400861 tcpip!DeliverToUser+0x189
f790ecb8 ba40710c 8a1d1ce8 89b9ca10 00000030 tcpip!IPRcvPacket+0x686
f790ed64 baa87064 ba43dea0 8a1d1ce8 8a391660 tcpip!LoopXmitRtn+0x195
f790ed80 80880469 8a1d1ce8 00000000 8a391660 TDI!CTEpEventHandler+0x32
f790edac 80949b80 ba43dea0 00000000 00000000 nt!ExpWorkerThread+0xeb
f790eddc 8088e092 8088037e 00000001 00000000 nt!PspSystemThreadStartup+0x2e
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

STACK_COMMAND: kb

FOLLOWUP_IP:
TDI!CTEpEventHandler+32
baa87064 5f pop edi

SYMBOL_STACK_INDEX: 7

SYMBOL_NAME: TDI!CTEpEventHandler+32

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: TDI

IMAGE_NAME: TDI.SYS

DEBUG_FLR_IMAGE_TIMESTAMP: 45d69a2f

FAILURE_BUCKET_ID: 0xD1_TDI!CTEpEventHandler+32

BUCKET_ID: 0xD1_TDI!CTEpEventHandler+32

Followup: MachineOwner
---------
December 27th, 2009 11:25pm

These articles may help.

You receive a "Stop 0xD1" error message when you try to establish a TCP/IP session
http://support.microsoft.com/kb/829120

Using Driver Verifier to identify issues with Windows drivers for advanced users
http://support.microsoft.com/kb/244617

Regards, Dave Patrick .... Microsoft Certified Professional Microsoft MVP [Windows]
December 27th, 2009 11:37pm

Hi there,

From your trace, the trap occurred at tcpip!IndicateData+0x38e. The closest symbol is IndicateData and we are 0x38e bytes into it, so assuming you have the correct symbols loaded, we can point to the IndicateData function. We might hit a roadblock, as tdi.sys belongs to Microsoft and you would need to contact them to analyze the trace.

If you do not want to contact Microsoft and have to debug it yourself, the other way is to disassemble the code:
a) Start your debugging at ba403315, where the trap occurred, and check the arguments by examining the stack.
b) For example, u tcpip!IndicateData will give you the assembly instructions you need to analyze (this holds good when you do not have the symbol files). From there, check the stack pointer (SP) to see what parameters were pushed onto the stack.

Hope this helps!
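As a rough sketch of the address arithmetic involved, the short Python snippet below recomputes where IndicateData begins and which address the faulting instruction tried to read. All values are copied from the TRAP_FRAME and FAULTING_IP sections of the posted dump; the variable names are only illustrative.

```python
# Values copied from the TRAP_FRAME and FAULTING_IP sections of the dump.
fault_ip = 0xBA403315     # eip at the trap: tcpip!IndicateData+0x38e
symbol_offset = 0x38E     # offset from the nearest symbol, as reported by the debugger
eax = 0x12000010          # register used as the base pointer by the instruction
displacement = 0x0C       # from: push dword ptr [eax+0Ch]

# Start of IndicateData = faulting address minus the symbol offset.
function_start = fault_ip - symbol_offset

# Address the instruction tried to read (should match Arg1 / READ_ADDRESS).
read_address = eax + displacement

print(f"IndicateData starts at {function_start:08x}")
print(f"faulting read address  {read_address:08x}")
```

The computed read address matches Arg1 (1200001c) in the bugcheck. The value in eax does not look like a kernel-mode address on a default 32-bit configuration, which hints at a bad pointer being dereferenced, though that is only an inference from the register dump.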
December 28th, 2009 4:41am

Hi,

Please understand that the forum is not the best place for analyzing a dump. We suggest contacting Microsoft Customer Support Services (CSS) so that a dedicated Support Professional can help you with this issue.

To obtain the phone numbers for a specific technology request, please take a look at the web site listed below.
http://support.microsoft.com/default.aspx?scid=fh;EN-US;PHONENUMBERS

If you are outside the US, please see http://support.microsoft.com for regional support phone numbers.

Hope this helps.
December 29th, 2009 9:49am

Remove the NIC drivers, reinstall them, and check whether the problem comes back.
http://technetfaqs.wordpress.com
December 29th, 2009 9:54am

Hi,

Thanks for your comments and suggestions.

I had reviewed:
You receive a "Stop 0xD1" error message when you try to establish a TCP/IP session
http://support.microsoft.com/kb/829120

Our version of tdi.sys is newer than the one in the hotfix, so I'm not sure it is sensible to apply it. It also looks like it was targeted at SP1, so in theory the fix should already be on the machine, wrapped up in the later fixes. Do you think this is a wrong assumption?

I am on the verge of speaking to Microsoft, but I have been trying to save the £200 ticket cost. I've suggested the NIC driver uninstall to our datacentre and am waiting to see if they will do this. I think I'm already in over my head with the dump analysis.

Thanks for your suggestions.
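One way to cross-check the age of the driver, sketched here under the assumption that DEBUG_FLR_IMAGE_TIMESTAMP in the dump is the link timestamp of TDI.SYS stored as seconds since 1970-01-01 UTC, is to convert the hex value to a date and compare it with the file dates listed in the KB article:

```python
from datetime import datetime, timezone

# DEBUG_FLR_IMAGE_TIMESTAMP from the dump, assumed to be seconds since 1970-01-01 UTC.
image_timestamp = 0x45D69A2F

# Convert to a timezone-aware UTC datetime so local time settings do not affect the result.
built = datetime.fromtimestamp(image_timestamp, tz=timezone.utc)

print(built.isoformat())
```

This gives a build date in February 2007, which would be consistent with an SP2-era binary rather than an SP1-era one.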
December 30th, 2009 1:31pm

Hi guppyuk,

I understand your concern, but as I said earlier, you may need to contact Microsoft for server crash issues. They have source code access for all the drivers and applications, so debugging would be a matter of minutes or hours for them.
December 31st, 2009 4:04am

Hello, guppyuk, what is the current status of this issue?
January 15th, 2010 10:07am

This topic is archived. No further replies will be accepted.
