Need help reading a dump file (dump files attached)
I'm trying to read a dump file on my Windows 7 machine. I'm running WinDbg, following the instructions on this website:
http://windows7themes.net/how-to-open-dmp-files-in-windows-7.html
Here are links to the last two dump files. Note that the dump files are from a Windows Server 2008 R2 machine that is getting the blue screen of death every few days, for no apparent reason:
http://www.mediafire.com/file/1zcl8i64d65crz8/070311-18174-01.dmp
http://www.mediafire.com/file/4h8hiwvvum89eca/070711-18220-01.dmp
*EDIT*
I have been able to read the dump file, but I am still not able to find the problem (see the following posts):
Microsoft (R) Windows Debugger Version 6.12.0002.633 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.
Loading Dump File [C:\Users\mpapania\Desktop\062811-64350-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available
WARNING: Whitespace at start of path element
WARNING: Whitespace at start of path element
Symbol search path is: SRV*C:\Symbols*http://msdl.microsoft.com/download/symbols
Executable search path is: c:\windows\System32; c:\windows\system\System32; http://www.alexander.com/SymServe
Windows 7 Kernel Version 7601 (Service Pack 1) MP (8 procs) Free x64
Product: Server, suite: Enterprise TerminalServer SingleUserTS
Built by: 7601.17514.amd64fre.win7sp1_rtm.101119-1850
Machine Name:
Kernel base = 0xfffff800`01a59000 PsLoadedModuleList = 0xfffff800`01c9ee90
Debug session time: Tue Jun 28 02:00:38.586 2011 (UTC - 4:00)
System Uptime: 46 days 18:10:34.956
Loading Kernel Symbols
.................................................Unable to load image Unknown_Module_00000000`00000000, Win32 error 0n2
*** WARNING: Unable to verify timestamp for Unknown_Module_00000000`00000000
Unable to add module at 00000000`00000000
Loading User Symbols
Loading unloaded module list
..............................................
*******************************************************************************
*
*
* Bugcheck Analysis *
*
*
*******************************************************************************
Use !analyze -v to get detailed debugging information.
BugCheck 124, {0, fffffa80073b1028, fe001d00, 1009f}
*** WARNING: Unable to verify checksum for PSHED.dll
Probably caused by : hardware
Followup: MachineOwner
---------
0: kd> !analyze -v
*******************************************************************************
*
*
* Bugcheck Analysis *
*
*
*******************************************************************************
WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
WHEA_ERROR_RECORD structure that describes the error conditon.
Arguments:
Arg1: 0000000000000000, Machine Check Exception
Arg2: fffffa80073b1028, Address of the WHEA_ERROR_RECORD structure.
Arg3: 00000000fe001d00, High order 32-bits of the MCi_STATUS value.
Arg4: 000000000001009f, Low order 32-bits of the MCi_STATUS value.
Debugging Details:
------------------
BUGCHECK_STR: 0x124_GenuineIntel
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP
PROCESS_NAME: System
CURRENT_IRQL: f
STACK_TEXT:
fffff800`01826a98 fffff800`01a22a3b : 00000000`00000124 00000000`00000000 fffffa80`073b1028 00000000`fe001d00 : nt!KeBugCheckEx
fffff800`01826aa0 fffff800`01be67d3 : 00000000`00000001 fffffa80`073d0a60 00000000`00000000 fffffa80`073d0ab0 : hal!HalBugCheckSystem+0x1e3
fffff800`01826ae0 fffff800`01a22700 : 00000000`00000728 fffffa80`073d0a60 fffff800`01826e70 fffff800`01826e00 : nt!WheaReportHwError+0x263
fffff800`01826b40 fffff800`01a22052 : fffffa80`073d0a60 fffff800`01826e70 fffffa80`073d0a60 00000000`00000000 : hal!HalpMcaReportError+0x4c
fffff800`01826c90 fffff800`01a21f0d : 00000000`00000008 00000000`00000001 fffff800`01826ef0 00000000`00000000 : hal!HalpMceHandler+0x9e
fffff800`01826cd0 fffff800`01a15e88 : 00000000`00000001 fffff800`01c4be80 00000000`00000000 00000000`00000000 : hal!HalpMceHandlerWithRendezvous+0x55
fffff800`01826d00 fffff800`01ad7f2c : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : hal!HalHandleMcheck+0x40
fffff800`01826d30 fffff800`01ad7d93 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxMcheckAbort+0x6c
fffff800`01826e70 fffff880`02dd1c61 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiMcheckAbort+0x153
fffff800`0181ac98 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0xfffff880`02dd1c61
STACK_COMMAND: kb
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: hardware
IMAGE_NAME: hardware
DEBUG_FLR_IMAGE_TIMESTAMP: 0
FAILURE_BUCKET_ID: X64_0x124_GenuineIntel__UNKNOWN
BUCKET_ID: X64_0x124_GenuineIntel__UNKNOWN
Followup: MachineOwner
---------
July 7th, 2011 7:16pm
OK, I was able to use !analyze -v to get the detailed debugging information. It looks like something to do with Intel (could this be the CPU driver?). The CPU is an Intel Xeon.
BUGCHECK_STR: 0x124_GenuineIntel
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP
CURRENT_IRQL: f
STACK_TEXT:
fffff800`01ae6a98 fffff800`02214a3b : 00000000`00000124 00000000`00000000 fffffa80`073bd028 00000000`fe000140 : nt!KeBugCheckEx
fffff800`01ae6aa0 fffff800`01da57d3 : 00000000`00000001 fffffa80`073dca60 00000000`00000000 fffffa80`073dcab0 : hal!HalBugCheckSystem+0x1e3
fffff800`01ae6ae0 fffff800`02214700 : 00000000`00000728 fffffa80`073dca60 fffff800`01ae6e70 fffff800`01ae6e00 : nt!WheaReportHwError+0x263
fffff800`01ae6b40 fffff800`02214052 : fffffa80`073dca60 fffff800`01ae6e70 fffffa80`073dca60 00000000`00000000 : hal!HalpMcaReportError+0x4c
fffff800`01ae6c90 fffff800`02213f0d : 00000000`00000008 00000000`00000001 fffff800`01ae6ef0 00000000`00000000 : hal!HalpMceHandler+0x9e
fffff800`01ae6cd0 fffff800`02207e88 : 00000000`000000aa 00000000`0380fc1a 00000000`00000000 00000000`00000000 : hal!HalpMceHandlerWithRendezvous+0x55
fffff800`01ae6d00 fffff800`01c96f2c : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : hal!HalHandleMcheck+0x40
fffff800`01ae6d30 fffff800`01c96d93 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxMcheckAbort+0x6c
fffff800`01ae6e70 00000000`775003e0 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiMcheckAbort+0x153
00000000`0380fb14 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x775003e0
STACK_COMMAND: kb
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: hardware
IMAGE_NAME: hardware
DEBUG_FLR_IMAGE_TIMESTAMP: 0
FAILURE_BUCKET_ID: X64_0x124_GenuineIntel__UNKNOWN
BUCKET_ID: X64_0x124_GenuineIntel__UNKNOWN
Followup: MachineOwner
---------
July 7th, 2011 9:17pm
Stop 0x00000124
This error message occurs when Windows has a problem handling a PCI-Express device. Most often, this occurs when adding or removing a hot-pluggable PCI-Express card; however, it can occur with driver/hardware related problems for PCI-Express cards.
Resolving the Problem
To troubleshoot this error, first make sure that you have applied all Windows and driver updates. If you recently updated a driver, roll back the change. If the stop error continues to occur, remove PCI-Express cards one by one to identify the problematic hardware. When you have identified the card causing the problem, contact the hardware manufacturer for further troubleshooting assistance. The driver might need to be updated, or the card itself could be faulty.
For driver updates, please download them directly from the relevant hardware manufacturer's site and not through some 3rd party. Always download them fully first before trying to execute them, and do so from the downloaded file.
July 7th, 2011 10:17pm
Thanks for the suggestion. So far, I have applied all Windows updates. If that doesn't work, I'm going to check the drivers, then I can try the PCI cards, etc. The problem is that the BSOD only happens every few days, so I'll have to wait a few days after each step to see whether that was the problem or not.
In the meantime, I'll keep searching for information on this error and compiling a list of things to try. Also on my list are a memory check and a BIOS update. Any other suggestions or insight into the minidump are greatly appreciated!
After working with WinDbg some more, and after reading a very helpful article (http://www.networkworld.com/news/2005/041105-windows-crash.html?page=1), I was able to get the complete minidump analysis (I think).
Use !analyze -v to get detailed debugging information.
BugCheck 124, {0, fffffa8007394028, fe000700, 1009f}
*** WARNING: Unable to verify checksum for PSHED.dll
Probably caused by : hardware
Followup: MachineOwner
---------
0: kd> !analyze -v
*******************************************************************************
*
*
* Bugcheck Analysis *
*
*
*******************************************************************************
WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
WHEA_ERROR_RECORD structure that describes the error conditon.
Arguments:
Arg1: 0000000000000000, Machine Check Exception
Arg2: fffffa8007394028, Address of the WHEA_ERROR_RECORD structure.
Arg3: 00000000fe000700, High order 32-bits of the MCi_STATUS value.
Arg4: 000000000001009f, Low order 32-bits of the MCi_STATUS value.
Debugging Details:
------------------
BUGCHECK_STR: 0x124_GenuineIntel
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP
PROCESS_NAME: System
CURRENT_IRQL: f
STACK_TEXT:
fffff800`01ae6a98 fffff800`01c20a3b : 00000000`00000124 00000000`00000000 fffffa80`07394028 00000000`fe000700 : nt!KeBugCheckEx
fffff800`01ae6aa0 fffff800`01de47d3 : 00000000`00000001 fffffa80`073b0a60 00000000`00000000 fffffa80`073b0ab0 : hal!HalBugCheckSystem+0x1e3
fffff800`01ae6ae0 fffff800`01c20700 : 00000000`00000728 fffffa80`073b0a60 fffff800`01ae6e70 fffff800`01ae6e00 : nt!WheaReportHwError+0x263
fffff800`01ae6b40 fffff800`01c20052 : fffffa80`073b0a60 fffff800`01ae6e70 fffffa80`073b0a60 00000000`00000000 : hal!HalpMcaReportError+0x4c
fffff800`01ae6c90 fffff800`01c1ff0d : 00000000`00000008 00000000`00000001 fffff800`01ae6ef0 00000000`00000000 : hal!HalpMceHandler+0x9e
fffff800`01ae6cd0 fffff800`01c13e88 : 00000000`00000001 fffff800`01e49e80 00000000`00000000 00000000`00000000 : hal!HalpMceHandlerWithRendezvous+0x55
fffff800`01ae6d00 fffff800`01cd5f2c : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : hal!HalHandleMcheck+0x40
fffff800`01ae6d30 fffff800`01cd5d93 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxMcheckAbort+0x6c
fffff800`01ae6e70 fffff880`02e7dc61 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiMcheckAbort+0x153
fffff800`01adac98 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0xfffff880`02e7dc61
STACK_COMMAND: kb
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: hardware
IMAGE_NAME: hardware
DEBUG_FLR_IMAGE_TIMESTAMP: 0
FAILURE_BUCKET_ID: X64_0x124_GenuineIntel__UNKNOWN
BUCKET_ID: X64_0x124_GenuineIntel__UNKNOWN
Followup: MachineOwner
---------
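For anyone comparing several of these minidumps, a small script can pull the bug check code and its four arguments out of saved WinDbg output. The following is a minimal sketch in Python, assuming the summary line keeps the same "BugCheck 124, {...}" shape as in the output above. The function names parse_bugcheck and mci_status are made up for illustration and are not part of WinDbg.

```python
import re

# Matches the one-line summary WinDbg prints, for example:
#   BugCheck 124, {0, fffffa8007394028, fe000700, 1009f}
# Assumption: the code and all arguments are printed in hexadecimal.
BUGCHECK_RE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")


def parse_bugcheck(text):
    """Return (code, [arg1, arg2, arg3, arg4]) as integers, or None if no summary line is found."""
    match = BUGCHECK_RE.search(text)
    if match is None:
        return None
    code = int(match.group(1), 16)
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, args


def mci_status(args):
    """For stop 0x124 with Arg1 = 0: join Arg3 (high 32 bits) and Arg4 (low 32 bits)."""
    return (args[2] << 32) | args[3]


if __name__ == "__main__":
    # The two summary lines quoted in this thread.
    samples = [
        "BugCheck 124, {0, fffffa80073b1028, fe001d00, 1009f}",
        "BugCheck 124, {0, fffffa8007394028, fe000700, 1009f}",
    ]
    for line in samples:
        code, args = parse_bugcheck(line)
        print(f"stop 0x{code:X}  MCi_STATUS=0x{mci_status(args):016X}")
```

Running this over each crash makes it easier to see which parts of the status value stay the same from one blue screen to the next.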
July 7th, 2011 10:26pm
Do you have a PCI-Express graphics card? If so, try removing and reseating it in its slot to see if that clears the problem. It could also be caused by a slight misalignment of the motherboard with relation to the chassis - slightly loosen the fixing screws and retighten.
July 7th, 2011 10:30pm
OK, just took a look inside. There are no PCI-Express cards installed, and the motherboard is aligned correctly. So, I guess it's not a PCI-Express card issue. Still lots more to check though...
July 7th, 2011 10:37pm
Hi,
I just analyzed your dump file. Here is some information I want to share with you:
The WHEA_UNCORRECTABLE_ERROR bug check has a value of 0x00000124. This bug check indicates that a fatal hardware error has occurred. This bug check uses the error data that is provided by the Windows Hardware Error Architecture.
Parameter 1: 0x0
A machine check exception caused this error.
These parameter descriptions apply if the processor is based on the x64 architecture, or the x86 architecture that has the MCA feature available.
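As the !analyze -v output earlier in the thread notes, parameters 3 and 4 hold the high and low 32 bits of the MCi_STATUS value when parameter 1 is 0x0. A rough Python sketch of splitting that value into fields is shown here. The bit positions are taken from my reading of the Intel Software Developer's Manual, the memory controller pattern check is likewise an assumption that should be verified against the manual for the specific Xeon model, and the function names are invented for illustration.

```python
# Architectural flag bits of IA32_MCi_STATUS (positions as I read them in the Intel SDM).
FLAGS = {
    63: "VAL (register holds a valid error)",
    62: "OVER (an earlier error was overwritten)",
    61: "UC (uncorrected error)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MCi_MISC register valid)",
    58: "ADDRV (MCi_ADDR register valid)",
    57: "PCC (processor context corrupt)",
}


def decode_mci_status(high32, low32):
    """Join bug check Arg3 (high) and Arg4 (low) and split out the common fields."""
    status = (high32 << 32) | low32
    return {
        "status": status,
        "flags": [name for bit, name in sorted(FLAGS.items(), reverse=True)
                  if (status >> bit) & 1],
        "mca_error_code": status & 0xFFFF,              # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


def looks_like_memory_controller_error(mca_code):
    """Assumption: compound code form 000F 0000 1MMM CCCC marks a memory controller error."""
    return (mca_code & 0xEF80) == 0x0080


if __name__ == "__main__":
    # Arg3 and Arg4 from the first dump posted in this thread.
    decoded = decode_mci_status(0xFE001D00, 0x0001009F)
    print(f"MCi_STATUS = 0x{decoded['status']:016X}")
    for flag in decoded["flags"]:
        print("  set:", flag)
    print(f"MCA error code = 0x{decoded['mca_error_code']:04X}")
    print("memory controller pattern:",
          looks_like_memory_controller_error(decoded["mca_error_code"]))
```

If the pattern check matches, that would point toward the memory subsystem (DIMMs or the memory controller) rather than a driver, but treat it as a hint to guide hardware testing, not a diagnosis.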
The following actions might prevent an error like this from happening again:
1. Download and install updates and device drivers for your computer from Windows Update.
2. Scan your computer for computer viruses.
3. Check your hard disk for errors.
Please understand that to troubleshoot this kind of kernel crash properly, we need to debug the crashed system's dump and analyze the related source code if needed. Debugging is beyond what we can do in the forum. I recommend that you contact Microsoft Customer Service and Support (CSS) by telephone so that a dedicated Support Professional can assist with your request. Please be advised that contacting phone support will be a charged call.
To obtain the phone numbers for a specific technology request, please take a look at the web site listed below:
http://support.microsoft.com/default.aspx?scid=fh;EN-US;OfferProPhone#faq607
Microsoft Customer Service (800) 426-9400 is available Monday through Friday, from 6:30 A.M. to 5:30 P.M. Pacific time.
Note: Microsoft Customer Service mainly handles issues regarding replacement manuals, disks, drivers and service packs, product IDs or lost CD keys, product orders, policies related to copying software on additional computers, licensing, and product registration.
Hope that helps
July 11th, 2011 12:33pm