MPIO/iSCSI Issues on Server 2008 R2
Hi there,
We're currently running a SQL Server instance as a Hyper-V VM with iSCSI-mounted volumes, using MPIO (two ports). We're getting occasional disconnects of our primary DB volume, which is, understandably, making SQL unhappy. The event chronology looks like this:
Event ID 32 iScsiPrt - Initiator received an asynchronous logout message. The Target name is given in the dump data.
Event ID 27 iScsiPrt - Initiator could not find a match for the initiator task tag in the received PDU. Dump data contains the entire iSCSI header.
Event ID 27 iScsiPrt - Initiator could not find a match for the initiator task tag in the received PDU. Dump data contains the entire iSCSI header.
Event ID 34 iScsiPrt - A connection to the target was lost, but Initiator successfully reconnected to the target. Dump data contains the target name.
Event ID 23 mpio - All paths have failed. \Device\MPIODisk5 will be removed.
Event ID 16 mpio - A fail-over on \Device\MPIODisk5 occurred.
At this point, the logical DB drive disappears from Windows and the iSCSI initiator shows as Inactive. When we reconnect, it connects without a problem, but we usually need a reboot to get SQL happy again.
Back-end storage is an EqualLogic SAN running the 3.4.2 HIT Kit. We recently installed KB978157, which didn't seem to help. Has anyone run into anything like this before?
- Liam
April 28th, 2011 6:19pm

Can you verify that the event "Event ID 34 iScsiPrt - A connection to the target was lost, but Initiator successfully reconnected to the target. Dump data contains the target name." is for one of the paths of \Device\MPIODisk5? You can verify this using the target name and the LUNs exposed through the target. This could be one of the issues for which we are working to release a fix soon. In any case, I would suggest that you contact Microsoft customer support and ask for their help in investigating the exact cause. Thanks, Ashutosh
April 28th, 2011 8:26pm

Yes, it's the same one. Is there a KB or internal reference I can use when calling in? - Liam
April 28th, 2011 8:30pm

I don't think you need any internal reference or KB to report an issue. Thanks, Ashutosh
April 29th, 2011 12:54pm

I was more wondering if you guys were tracking this issue internally - I've opened a case on this so we'll see how things go. - Liam
May 2nd, 2011 6:38pm

There are a bajillion fixes to MPIO in SP1 for R2. We've applied this to both SQL servers and have fingers firmly crossed.
May 5th, 2011 5:49am

Best of luck. :) Thanks, Ashutosh
May 5th, 2011 8:13am

SP1 didn't help the problem, sadly. It's applied to both the guest and VM host. Ashutosh, you mentioned this was something Microsoft was working on a fix for - is there any internal reference ID I can pass along to the Microsoft Support Engineer to point him towards this information? - Liam
May 24th, 2011 8:10pm

Seeing the same issue. EqualLogic as well. It seems to occur when the iSCSI connection is under load. Running HIT Kit 3.5.1 MPIO. Has anyone else heard back from MS? I'm going to run this through Dell, just to see what they say ;)
May 24th, 2011 9:44pm

Dell couldn't seem to find anything wrong with the SAN and I can confirm that it's not DPM causing the issue. We had a SQL server die during a data import before DPM was up and running. I'll post as I get more information from MS. - Liam
May 24th, 2011 9:54pm

Are you running in a Hyper-V environment? If so, how many hosts and how many iSCSI NICs per host? Also, I've unmarked the answer to this question, as "Call Microsoft" isn't a solution. - Liam
May 24th, 2011 9:59pm

Found this thread, which is very similar: http://social.technet.microsoft.com/Forums/en-US/windowsserver2008r2virtualization/thread/38f31795-55ad-41c9-96a9-c2de2951c2a1/ We have Receive Side Scaling enabled on our iSCSI adapters. Have you tried disabling this? - Liam
May 24th, 2011 11:05pm

Microsoft Support came back with two hotfixes:
KB2460971 - MPIO failover fails on a computer that is running Windows Server 2008 R2
KB2511962 - "0x000000D1" Stop error occurs in the Mpio.sys driver in Windows Server 2008 R2
These are both post-SP1 updates and must be installed in the following order:
1. KB2460971
2. Reboot
3. KB2511962
I'm going to install these and wait. If the problem recurs, I will disable Receive Side Scaling as the next step. Hope this is of some use to you.
- Liam
May 25th, 2011 1:29am

Has it solved the problem? I have a similar problem: Win2008 R2 with SP1 and HIT Kit 3.5.1, connected to EQL, running SQL 2008 R2. When load is added to SQL, the SAN disks are lost and the initiators show "reconnecting..."; the machine needs to be rebooted to get it working again. All firmware on the Dell blade has been applied, along with the latest Broadcom drivers. I get event IDs 129, 39, 20, 49 and 71 from iScsiPrt, and 16, 17, 18 and 32 from mpio. Thanks, Magnus
May 27th, 2011 5:11pm

It seems to work better after I applied both hotfixes. I can usually trigger the error with a large database restore, but this time the restore completed successfully. However, I still got an iScsiPrt event ID 20 followed by a 34:
Event ID 20 - Connection to the target was lost. The initiator will attempt to retry the connection.
Event ID 34 - A connection to the target was lost, but Initiator successfully reconnected to the target. Dump data contains the target name.
Before the iScsiPrt event 20, I got SQL event ID 833 in the application log (delayed write).
/Magnus
May 27th, 2011 5:36pm

I thought it might be useful to share my experience with this issue as well. I am seeing exactly the same behavior in our environment. It sounds like we have a similar setup, in that we are also running an EqualLogic SAN (PS6500s) with various versions of the HIT Kit installed. We began noticing these errors on our Exchange hub transport servers some time back. The issue seems to become more prevalent as the load on the storage increases. We notice it mostly on SQL and DPM servers and less frequently on Exchange. The SAN shows no relevant errors that correspond with the Windows events. All servers experiencing the issue are running Win2k8 or Win2k8 R2 (no SP1).
We also see similar disconnects in our VMware environment. An alarm is raised stating that the host lost connection to its storage (unfortunately I do not have a current example of the exact error, but I will post it if and when it occurs again). These particular errors began to occur immediately after our move to a new datacenter. We have not been able to tie them to either the fact that we nearly doubled our VM count when we moved to the new datacenter or the fact that we installed all brand-new Cisco Nexus hardware. This alarm normally results in VMs performing horribly and a series of vMotions occurring.
I'd be interested to know whether the hotfixes resolved the issue or whether disabling Receive Side Scaling helped. Obviously an MS hotfix cannot be installed on the ESXi servers, but Receive Side Scaling can be turned off.
Jon
June 1st, 2011 1:32am

Thanks for the info. It's very interesting that this is also happening in your VMware environment. The hotfixes were applied but sadly did nothing to alleviate the problem. I've contacted Microsoft again about this and will keep everyone posted on the results. - Liam
June 13th, 2011 5:04pm

Hi Liam, Can you try the KB released yesterday (http://support.microsoft.com/kb/2522766) and see if that resolves your issue? Thanks, Ashutosh
June 16th, 2011 8:16am

Hi, Did you get a chance to test with KB 2522766? If my initial suspicion was correct, this should resolve your issue, as this KB contains the fix I mentioned in my first replies in this thread. Thanks, Ashutosh
June 18th, 2011 1:02am

Hi there, I installed this last night on two of our servers. We'll see how she goes! This would be the third or fourth hotfix I've applied with almost identical wording. Here's hoping they got it this time! - Liam
June 18th, 2011 1:24am

We are having the same MPIO issue with our Datastore and Log volumes on Exchange 2010, also connected via the HIT Kit to a PS6000 EQL. I was looking at KB2522766, but it seems to apply only to specific MPIO.sys versions. In my case I am using MPIO.sys 6.1.7600.16385, but MS says the hotfix is for .16818, .20970, .17619, and .21731. I'm curious whether anyone else is experiencing these issues with my MPIO version, or with a version not listed in the KB. Also, Liam, did the hotfix resolve your disconnects? -Eric
June 20th, 2011 10:51pm

From http://support.microsoft.com/kb/2522766:
Windows Server 2008 R2 file information notes
Important: Windows 7 hotfixes and Windows Server 2008 R2 hotfixes are included in the same packages. However, hotfixes on the Hotfix Request page are listed under both operating systems. To request the hotfix package that applies to one or both operating systems, select the hotfix that is listed under "Windows 7/Windows Server 2008 R2" on the page. Always refer to the "Applies To" section in articles to determine the actual operating system that each hotfix applies to.
The files that apply to a specific product, milestone (RTM, SPn), and service branch (LDR, GDR) can be identified by examining the file version numbers as shown in the following table:
Version          Product                 Milestone  Service branch
6.1.7600.16xxx   Windows Server 2008 R2  RTM        GDR
6.1.7600.20xxx   Windows Server 2008 R2  RTM        LDR
6.1.7601.17xxx   Windows Server 2008 R2  SP1        GDR
6.1.7601.21xxx   Windows Server 2008 R2  SP1        LDR
Are you facing any issues during installation? Thanks, Ashutosh
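To illustrate how the version table above is read, here is a minimal Python sketch that maps an Mpio.sys file version to its milestone and service branch. The function name and the lookup structure are made up for illustration and are not part of the KB article; the mapping only restates the four rows of the table.

```python
from typing import Tuple

# (build, first two digits of the revision) -> (milestone, service branch).
# These four entries restate the version table from KB 2522766 quoted above.
_BRANCHES = {
    ("7600", "16"): ("RTM", "GDR"),
    ("7600", "20"): ("RTM", "LDR"),
    ("7601", "17"): ("SP1", "GDR"),
    ("7601", "21"): ("SP1", "LDR"),
}


def classify_mpio_version(version: str) -> Tuple[str, str]:
    """Return (milestone, service branch) for a version such as '6.1.7600.16385'.

    Hypothetical helper: raises ValueError for anything that does not fit
    one of the four rows of the table.
    """
    parts = version.strip().split(".")
    if len(parts) != 4 or parts[0] != "6" or parts[1] != "1":
        raise ValueError(f"not a 6.1.x.y file version: {version!r}")
    build, revision = parts[2], parts[3]
    if len(revision) != 5 or not revision.isdigit():
        raise ValueError(f"unexpected revision field: {revision!r}")
    try:
        return _BRANCHES[(build, revision[:2])]
    except KeyError:
        raise ValueError(f"version not covered by the table: {version!r}") from None


if __name__ == "__main__":
    for v in ("6.1.7600.16385", "6.1.7601.17619"):
        print(v, classify_mpio_version(v))
```

Read this way, the 6.1.7600.16385 version mentioned earlier in this thread falls in the RTM GDR row.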
June 21st, 2011 6:27am

No, I'm just being cautious prior to installing a hotfix on my production Exchange server. I saw that table in the KB, but there is also a table below it showing specific MPIO.sys version numbers. I don't understand why it says 6.1.7600.16xxx but then goes on to list specific versions. My MPIO.sys is in the 6.1.7600.16xxx group but not in the list below:
For all supported x64-based versions of Windows 7 and of Windows Server 2008 R2
File name  File version    File size  Date         Time   Platform
Mpio.sys   6.1.7600.16818  156,544    20-May-2011  12:50  x64
Mpio.sys   6.1.7600.20970  157,568    20-May-2011  12:53  x64
Mpio.sys   6.1.7601.17619  156,544    20-May-2011  12:49  x64
Mpio.sys   6.1.7601.21731  157,568    20-May-2011  15:35  x64
June 21st, 2011 8:12pm

Oh, I understand your concern now. The table you refer to (below) lists the versions of mpio.sys that are included in the KB package. I hope this resolves your concern.
For all supported x64-based versions of Windows 7 and of Windows Server 2008 R2
File name  File version    File size  Date         Time   Platform
Mpio.sys   6.1.7600.16818  156,544    20-May-2011  12:50  x64
Mpio.sys   6.1.7600.20970  157,568    20-May-2011  12:53  x64
Mpio.sys   6.1.7601.17619  156,544    20-May-2011  12:49  x64
Mpio.sys   6.1.7601.21731  157,568    20-May-2011  15:35  x64
Thanks, Ashutosh
June 21st, 2011 8:54pm

Hello Liam and Eric, I was wondering whether the KB resolved the reported issues. Thanks, Ashutosh
June 23rd, 2011 12:06pm

Seems good so far, but it's only been a week or so. Will keep you posted! - Liam
June 27th, 2011 5:14pm

Ashutosh, Yesterday I applied only the KB 2522766 hotfix on my W2K8 R2 SP1 Enterprise server, in the hope of clearing its system log of the Event 27 errors being logged during heavy traffic to my EqualLogic PS6000 arrays. The only difference is that in my configuration the LUN/device does not disconnect from the server. In my test runs after the fix was applied, I no longer see any Event 27 errors in my system log. However, the iScsiPrt information events 32 and 34 remain. Could you elaborate on why these information events are being logged?
Event ID 32 iScsiPrt - Initiator received an asynchronous logout message. The Target name is given in the dump data.
Event ID 34 iScsiPrt - A connection to the target was lost, but Initiator successfully reconnected to the target. Dump data contains the target name.
Thanks - jfnatx
July 1st, 2011 3:03pm

"Event ID 32 iScsiPrt - Initiator received an asynchronous logout message. The Target name is given in the dump data. " informs you that iscsi initiator has received ASYNC LOGOUT message from iSCSI target (i.e iscsi storage array) which means the initiator has to close the session and try to reconnect again. (This is done transparently by ms iscsi driver). After this message targets can reset the TCP connection. This is the reason why you see next event "Event ID 34 iScsiPrt - A connection to the target was lost, but Initiator successfully reconnected to the target. Dump data contains the target name." once iscsi initiator driver completes the process of re-establishing the iscsi session with the target. As you might be aware the targets usually use ASYNC LOGOUT as way to load balancing among available options to use the target. I hope this clarified your queries.Thanks, Ashutosh
July 1st, 2011 7:31pm

Hi Liam, Checking back again. I believe the KB has resolved your issue; if so, please mark the answer so we can close this out. Thanks, Ashutosh
July 7th, 2011 12:10pm

Hi there, Everything seems good thus far, but there are usually a couple of weeks between occurrences. On a brighter note, we're now almost three weeks in, which is one week better than our prior record. If we can make it to mid-July, I'd say we have it licked. Fingers crossed! - Liam
July 7th, 2011 5:36pm

Hi there! It's been a month since the last outage. I think we've got it. Thanks for the help! - Liam
July 18th, 2011 10:09am

This topic is archived. No further replies will be accepted.
