Need help reading a dump file (dump files attached)
I'm trying to read a dump file on my Windows 7 machine. I'm running WinDbg, following the instructions on this website: http://windows7themes.net/how-to-open-dmp-files-in-windows-7.html

Here are links to the last 2 dump files. Note that the dump files are from a Windows Server 2008 R2 machine that is getting the blue screen of death every few days, for no apparent reason:
http://www.mediafire.com/file/1zcl8i64d65crz8/070311-18174-01.dmp
http://www.mediafire.com/file/4h8hiwvvum89eca/070711-18220-01.dmp

*EDIT* I have been able to read the dump file, but I'm still not able to find the problem (see following posts):

Microsoft (R) Windows Debugger Version 6.12.0002.633 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [C:\Users\mpapania\Desktop\062811-64350-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

WARNING: Whitespace at start of path element
WARNING: Whitespace at start of path element
Symbol search path is: SRV*C:\Symbols*http://msdl.microsoft.com/download/symbols
Executable search path is: c:\windows\System32; c:\windows\system\System32; http://www.alexander.com/SymServe
Windows 7 Kernel Version 7601 (Service Pack 1) MP (8 procs) Free x64
Product: Server, suite: Enterprise TerminalServer SingleUserTS
Built by: 7601.17514.amd64fre.win7sp1_rtm.101119-1850
Machine Name:
Kernel base = 0xfffff800`01a59000 PsLoadedModuleList = 0xfffff800`01c9ee90
Debug session time: Tue Jun 28 02:00:38.586 2011 (UTC - 4:00)
System Uptime: 46 days 18:10:34.956
Loading Kernel Symbols
.................................................Unable to load image Unknown_Module_00000000`00000000, Win32 error 0n2
*** WARNING: Unable to verify timestamp for Unknown_Module_00000000`00000000
Unable to add module at 00000000`00000000
Loading User Symbols
Loading unloaded module list
..............................................
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 124, {0, fffffa80073b1028, fe001d00, 1009f}

*** WARNING: Unable to verify checksum for PSHED.dll
Probably caused by : hardware

Followup: MachineOwner
---------

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
WHEA_ERROR_RECORD structure that describes the error conditon.
Arguments:
Arg1: 0000000000000000, Machine Check Exception
Arg2: fffffa80073b1028, Address of the WHEA_ERROR_RECORD structure.
Arg3: 00000000fe001d00, High order 32-bits of the MCi_STATUS value.
Arg4: 000000000001009f, Low order 32-bits of the MCi_STATUS value.

Debugging Details:
------------------

BUGCHECK_STR: 0x124_GenuineIntel
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP
PROCESS_NAME: System
CURRENT_IRQL: f

STACK_TEXT:
fffff800`01826a98 fffff800`01a22a3b : 00000000`00000124 00000000`00000000 fffffa80`073b1028 00000000`fe001d00 : nt!KeBugCheckEx
fffff800`01826aa0 fffff800`01be67d3 : 00000000`00000001 fffffa80`073d0a60 00000000`00000000 fffffa80`073d0ab0 : hal!HalBugCheckSystem+0x1e3
fffff800`01826ae0 fffff800`01a22700 : 00000000`00000728 fffffa80`073d0a60 fffff800`01826e70 fffff800`01826e00 : nt!WheaReportHwError+0x263
fffff800`01826b40 fffff800`01a22052 : fffffa80`073d0a60 fffff800`01826e70 fffffa80`073d0a60 00000000`00000000 : hal!HalpMcaReportError+0x4c
fffff800`01826c90 fffff800`01a21f0d : 00000000`00000008 00000000`00000001 fffff800`01826ef0 00000000`00000000 : hal!HalpMceHandler+0x9e
fffff800`01826cd0 fffff800`01a15e88 : 00000000`00000001 fffff800`01c4be80 00000000`00000000 00000000`00000000 : hal!HalpMceHandlerWithRendezvous+0x55
fffff800`01826d00 fffff800`01ad7f2c : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : hal!HalHandleMcheck+0x40
fffff800`01826d30 fffff800`01ad7d93 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxMcheckAbort+0x6c
fffff800`01826e70 fffff880`02dd1c61 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiMcheckAbort+0x153
fffff800`0181ac98 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0xfffff880`02dd1c61

STACK_COMMAND: kb
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: hardware
IMAGE_NAME: hardware
DEBUG_FLR_IMAGE_TIMESTAMP: 0
FAILURE_BUCKET_ID: X64_0x124_GenuineIntel__UNKNOWN
BUCKET_ID: X64_0x124_GenuineIntel__UNKNOWN

Followup: MachineOwner
---------
July 7th, 2011 7:16pm

OK, I was able to use !analyze -v to get the detailed debugging information. It looks like it has something to do with Intel (could this be the CPU driver?). The CPU is an Intel Xeon.

BUGCHECK_STR: 0x124_GenuineIntel
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP
CURRENT_IRQL: f

STACK_TEXT:
fffff800`01ae6a98 fffff800`02214a3b : 00000000`00000124 00000000`00000000 fffffa80`073bd028 00000000`fe000140 : nt!KeBugCheckEx
fffff800`01ae6aa0 fffff800`01da57d3 : 00000000`00000001 fffffa80`073dca60 00000000`00000000 fffffa80`073dcab0 : hal!HalBugCheckSystem+0x1e3
fffff800`01ae6ae0 fffff800`02214700 : 00000000`00000728 fffffa80`073dca60 fffff800`01ae6e70 fffff800`01ae6e00 : nt!WheaReportHwError+0x263
fffff800`01ae6b40 fffff800`02214052 : fffffa80`073dca60 fffff800`01ae6e70 fffffa80`073dca60 00000000`00000000 : hal!HalpMcaReportError+0x4c
fffff800`01ae6c90 fffff800`02213f0d : 00000000`00000008 00000000`00000001 fffff800`01ae6ef0 00000000`00000000 : hal!HalpMceHandler+0x9e
fffff800`01ae6cd0 fffff800`02207e88 : 00000000`000000aa 00000000`0380fc1a 00000000`00000000 00000000`00000000 : hal!HalpMceHandlerWithRendezvous+0x55
fffff800`01ae6d00 fffff800`01c96f2c : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : hal!HalHandleMcheck+0x40
fffff800`01ae6d30 fffff800`01c96d93 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxMcheckAbort+0x6c
fffff800`01ae6e70 00000000`775003e0 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiMcheckAbort+0x153
00000000`0380fb14 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x775003e0

STACK_COMMAND: kb
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: hardware
IMAGE_NAME: hardware
DEBUG_FLR_IMAGE_TIMESTAMP: 0
FAILURE_BUCKET_ID: X64_0x124_GenuineIntel__UNKNOWN
BUCKET_ID: X64_0x124_GenuineIntel__UNKNOWN

Followup: MachineOwner
---------
July 7th, 2011 9:17pm

Stop 0x00000124

This error message can occur when Windows has a problem handling a PCI-Express device. Most often, this occurs when adding or removing a hot-pluggable PCI-Express card; however, it can also occur with driver or hardware problems on PCI-Express cards.

Resolving the Problem

To troubleshoot this error, first make sure that you have applied all Windows and driver updates. If you recently updated a driver, roll back the change. If the stop error continues to occur, remove PCI-Express cards one by one to identify the problematic hardware. When you have identified the card causing the problem, contact the hardware manufacturer for further troubleshooting assistance. The driver might need to be updated, or the card itself could be faulty.

For driver updates, please download them directly from the relevant hardware manufacturer's site and not through a third party. Always download them fully before trying to execute them, and run them from the downloaded file.
July 7th, 2011 10:17pm

Thanks for the suggestion. So far, I have applied all Windows updates. If that doesn't work, I'm going to check the drivers, then I can try the PCI cards, etc. The problem is that the BSOD only happens every few days, so I'll have to wait a few days after each step to see whether that was the problem. In the meantime, I'll keep searching for information on this error and compiling a list of things to try. Also on my list are a memory check and a BIOS update. Any other suggestions or insight into the mini dump are greatly appreciated!

After working with WinDbg some more (and reading a very helpful article, http://www.networkworld.com/news/2005/041105-windows-crash.html?page=1), I was able to get the complete mini-dump (I think).

Use !analyze -v to get detailed debugging information.

BugCheck 124, {0, fffffa8007394028, fe000700, 1009f}

*** WARNING: Unable to verify checksum for PSHED.dll
Probably caused by : hardware

Followup: MachineOwner
---------

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
WHEA_ERROR_RECORD structure that describes the error conditon.
Arguments:
Arg1: 0000000000000000, Machine Check Exception
Arg2: fffffa8007394028, Address of the WHEA_ERROR_RECORD structure.
Arg3: 00000000fe000700, High order 32-bits of the MCi_STATUS value.
Arg4: 000000000001009f, Low order 32-bits of the MCi_STATUS value.

Debugging Details:
------------------

BUGCHECK_STR: 0x124_GenuineIntel
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP
PROCESS_NAME: System
CURRENT_IRQL: f

STACK_TEXT:
fffff800`01ae6a98 fffff800`01c20a3b : 00000000`00000124 00000000`00000000 fffffa80`07394028 00000000`fe000700 : nt!KeBugCheckEx
fffff800`01ae6aa0 fffff800`01de47d3 : 00000000`00000001 fffffa80`073b0a60 00000000`00000000 fffffa80`073b0ab0 : hal!HalBugCheckSystem+0x1e3
fffff800`01ae6ae0 fffff800`01c20700 : 00000000`00000728 fffffa80`073b0a60 fffff800`01ae6e70 fffff800`01ae6e00 : nt!WheaReportHwError+0x263
fffff800`01ae6b40 fffff800`01c20052 : fffffa80`073b0a60 fffff800`01ae6e70 fffffa80`073b0a60 00000000`00000000 : hal!HalpMcaReportError+0x4c
fffff800`01ae6c90 fffff800`01c1ff0d : 00000000`00000008 00000000`00000001 fffff800`01ae6ef0 00000000`00000000 : hal!HalpMceHandler+0x9e
fffff800`01ae6cd0 fffff800`01c13e88 : 00000000`00000001 fffff800`01e49e80 00000000`00000000 00000000`00000000 : hal!HalpMceHandlerWithRendezvous+0x55
fffff800`01ae6d00 fffff800`01cd5f2c : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : hal!HalHandleMcheck+0x40
fffff800`01ae6d30 fffff800`01cd5d93 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxMcheckAbort+0x6c
fffff800`01ae6e70 fffff880`02e7dc61 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiMcheckAbort+0x153
fffff800`01adac98 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0xfffff880`02e7dc61

STACK_COMMAND: kb
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: hardware
IMAGE_NAME: hardware
DEBUG_FLR_IMAGE_TIMESTAMP: 0
FAILURE_BUCKET_ID: X64_0x124_GenuineIntel__UNKNOWN
BUCKET_ID: X64_0x124_GenuineIntel__UNKNOWN

Followup: MachineOwner
---------
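Since Arg3 and Arg4 in the !analyze -v output are the two halves of the MCi_STATUS value, they can be combined and split into their architectural fields by hand. The following is only a rough sketch in Python, not something taken from WinDbg: the function name decode_mci_status and the FLAGS table are made up for illustration, and the bit positions are assumed from the architectural MCi_STATUS layout in Intel's machine-check documentation. The remaining bits of the upper half are model-specific and are left undecoded here.

```python
# Architectural MCi_STATUS flag bits (assumed from Intel's machine-check
# architecture description); keys are bit positions in the 64-bit register.
FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting of this error was enabled
    59: "MISCV",  # MCi_MISC holds additional information
    58: "ADDRV",  # MCi_ADDR holds an address
    57: "PCC",    # processor context may be corrupt
}


def decode_mci_status(high32, low32):
    """Combine Arg3 (high) and Arg4 (low) and split out the basic fields."""
    status = (high32 << 32) | low32
    return {
        "status": status,
        "flags": [name for bit, name in sorted(FLAGS.items(), reverse=True)
                  if (status >> bit) & 1],
        "mca_error_code": status & 0xFFFF,               # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


if __name__ == "__main__":
    # (Arg3, Arg4) pairs copied from the two full !analyze -v outputs above.
    for high, low in [(0xFE001D00, 0x0001009F), (0xFE000700, 0x0001009F)]:
        d = decode_mci_status(high, low)
        print(f"{d['status']:#018x}",
              "flags=" + ",".join(d["flags"]),
              f"mca={d['mca_error_code']:#06x}",
              f"model={d['model_specific_code']:#06x}")
```

Both crashes share the same low half (1009f), so the MCA error code is the same in each; what that code means for this particular Xeon would still have to be looked up in the processor documentation. If the dump contains the error record, the WinDbg !errrec extension with the Arg2 address should give a fuller breakdown, though a minidump may not include it.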
July 7th, 2011 10:26pm

Do you have a PCI-Express graphics card? If so, try removing and reseating it in its slot to see if that clears the problem. It could also be caused by a slight misalignment of the motherboard with relation to the chassis - slightly loosen the fixing screws and retighten.
July 7th, 2011 10:30pm

Do you have a PCI-Express graphics card? If so, try removing and reseating it in its slot to see if that clears the problem. It could also be caused by a slight misalignment of the motherboard with relation to the chassis - slightly loosen the fixing screws and retighten.

OK, just took a look inside. There are no PCI-Express cards installed. And the motherboard is aligned correctly. So, guess it's not a PCI express card issue. Still lots more to check though.....
July 7th, 2011 10:37pm

Hi,

I just analyzed your dump file. Here is some information I want to share with you:

The WHEA_UNCORRECTABLE_ERROR bug check has a value of 0x00000124. This bug check indicates that a fatal hardware error has occurred. This bug check uses the error data that is provided by the Windows Hardware Error Architecture.

Parameter 1: 0x0. A machine check exception caused this error. These parameter descriptions apply if the processor is based on the x64 architecture, or the x86 architecture that has the MCA feature available.

The following actions might prevent an error like this from happening again:
1. Download and install updates and device drivers for your computer from Windows Update.
2. Scan your computer for computer viruses.
3. Check your hard disk for errors.

Please understand that, to troubleshoot this kind of kernel crash thoroughly, we need to debug the crashed system dump and analyze the related source code if needed. Debugging is beyond what we can do in the forum. I recommend that you contact Microsoft Customer Service and Support (CSS) via telephone so that a dedicated Support Professional can assist with your request. Please be advised that contacting phone support will be a charged call.

To obtain the phone numbers for a specific technology request, please take a look at the web site listed below:
http://support.microsoft.com/default.aspx?scid=fh;EN-US;OfferProPhone#faq607

Microsoft Customer Service (800) 426-9400 is available Monday through Friday, from 6:30 A.M. to 5:30 P.M. Pacific time.

Note: Microsoft Customer Service mainly handles issues regarding replacement manuals, disks, drivers and service packs, product IDs or lost CD-keys, product orders, policies related to copying software on additional computers, licensing, and product registration.

Hope that helps.

Please remember to click Mark as Answer on the post that helps you, and to click Unmark as Answer if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.
July 11th, 2011 12:33pm

This topic is archived. No further replies will be accepted.
