windows server 2003 stops responding
hi friends, i am facing very serious problem with my server (windows server 2003) in which sql server 2005 is installed. In every 4 or 5 days when all the transactions (sales and purchase ) is running, suddenly my server goes into hang position and at that time all the client applications stop responding and also i cannt connect to sql server management studio(as it gives time out expired error message and refuse to open ssmt). I have to restart my server abnormally at that time (means by pressing power on/off the server). After restarting the server i have checked the sql server log to check what has been happened at that time to resolve this very serious problem. sql server log shows following message at the time of server hang: SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\tempdb.mdf] in database [tempdb] (2). The OS file handle is 0x00000630. The offset of the latest long I/O is: 0x000000008d6000 Note: the above message is also shown for 3-4 application databases. Also in the event viewer the application log shows the same message: SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\tempdb.mdf] in database [tempdb] (2). The OS file handle is 0x00000630. The offset of the latest long I/O is: 0x000000008d6000 can plz any one tell me how tackle this problem so that and i dont have to restart server abnormally and server hanged state can be avoided. thanks very very much for ur suggestion.
October 26th, 2009 2:34pm
Hi Ravi Before jumping into the findings can you let me know have tried to connect sql server throughDedicated Admin Connection before restarting the server?
Explanation: This message indicates ofCPU Drift,SQL Server has issued a read or write request from disk, and the request has taken longer than 15 seconds to return. It indicates a problem with the IO subsystem. To start withverify both SQL DMV sys.dm_io_pending_io_requests and Windows Performance counters dont indicate any IO delays.
On computers with multiple CPUs, the CPUs are designed to sleep during periods of low workload.When CPUs sleep, SQL may not accurately determine CPUs overall workload and incorrectly report this as IO WARNING shown above, however, this does not represent an actual CPU performance problem.This error can occur in SQL Server 2000 or SQL Server 2005.
Note : In the other way this may error may even cause because of blocking on page Latches Possible Causes This problem can be caused system performance issues,IO overloads, or filter driver intervention in the IO process. One possible cause may belack of memory relative to the size of the application's working set of database pages. Not all pages accessed by an application need to be in memory, but if a significant fraction of the most often used pages do not fit in the buffer pool, they are likely to be constantly read from disk. A related cause is excessive and unnecessary IO activity from SQL Server, such as that caused by table scans on large tables when the query could benefit from an index to avoid the scan. These scans can cause unnecessary thrashing in the buffer pool, which is characteristic of the insufficient memory case User Action: Troubleshoot this error by examining the system event log for hardware-related error messages. Also, examine hardware-specific logs if they are available. You can use following performance counters to identify I|O bottlenecks PhysicalDisk Object: Avg. Disk Queue Lengthrepresents the average number of physical read and write requests that were queued on the selected physical disk during the sampling period. If your I/O system is overloaded, more read/write operations will be waiting. If your disk queue length frequently exceeds a value of 2 during peak usage of SQLServer, then you might have an I/O bottleneck.
Avg. Disk Sec/Readis the average time, in seconds, of a read of data from the disk. Any number
Less than 10 ms - very good
Between 10 - 20 ms -okay
Between 20 - 50 ms -slow, needs attention
Greater than 50 ms Serious I/O bottleneck
Avg. Disk Sec/Writeis the average time, in seconds, of a write of data to the disk. Please refer to the guideline in the previous bullet.
Physical Disk: %Disk Timeis the percentage of elapsed time that the selected disk drive was busy servicing read or write requests. A general guideline is that if this value is greater than 50 percent, it represents an I/O bottleneck.
Avg. Disk Reads/Secis the rate of read operations on the disk. You need to make sure that this number is less than 85 percent of the disk capacity. The disk access time increases exponentially beyond 85 percent capacity.
Avg. Disk Writes/Secis the rate of write operations on the disk. Make sure that this number is less than 85 percent of the disk capacity. The disk access time increases exponentially beyond 85 percent capacity For example, the Average Disk Sec/Transfer time on a computer that is running SQL Server is typically less than 15 milliseconds. If the Average Disk Sec/Transfer value increases, this indicates that the I/O subsystem is not optimally keeping up with the I/O demand.
You can use the following DMV query to find currently pending I/O requests. You can execute this query periodically to check the health of I/O subsystem and to isolate physical disk(s) that are involved in the I/O bottlenecks.
select
database_id,
file_id,
io_stall,
io_pending_ms_ticks,
scheduler_address
fromsys.dm_io_virtual_file_stats(NULL, NULL)t1,
sys.dm_io_pending_io_requests as t2
wheret1.file_handle = t2.io_handle
Note:
Starting with SQL 2005 SP2 two trace flags were included to disable the reporting of CPU Drift errors in the SQL Server error log.
Disable this error using Trace Flag 8033
The time stamp counter of CPU on scheduler id 1 is not synchronized with other CPUs.
Disable this error using Trace Flag 830
SQL Server has encountered 2 occurrence(s) of I/O requests taking longer than 15 seconds to complete
Regards
Suhas V
Free Windows Admin Tool Kit Click here and download it now
October 26th, 2009 9:42pm
hi Ravi Mishra, According to the description, the issue seems to be related to SQL server. As we mainly focus on the general question about Windows Server system, for better and accurate answer to the question, it is recommend you to get further support in one of the SQL forum
For your convenience, I have list the related link as followed.
SQL Server
http://forums.microsoft.com/TechNet/default.aspx?ForumGroupID=93&SiteID=17Solutions Architect
October 27th, 2009 4:12am