Cluster Shared Volume Offline after a power outage

Hi

We need some help with a two-node Windows Server 2012 Hyper-V cluster. The CSV is on a LeftHand iSCSI SAN, and the hosts are managed by SCVMM SP1.

When the cluster was created, all VMs ran fine. We shut the servers down for a power outage, but after booting them back up the CSV stays OFFLINE and fails with Event IDs 1254, 1205 and 1069.

Event ID 1254

Clustered role 'SCVMM VM-1 Resources' has exceeded its failover threshold.  It has exhausted the configured number of failover attempts within the failover period of time allotted to it and will be left in a failed state.  No additional attempts will be made to bring the role online or fail it over to another node in the cluster.  Please check the events associated with the failure.  After the issues causing the failure are resolved the role can be brought online manually or the cluster may attempt to bring it online again after the restart delay period.

Event ID 1205

The Cluster service failed to bring clustered service or application '5fbe0d16-377c-40f6-8b13-af34fe7790c0' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.

Event ID 1069

Cluster resource 'CSV1' of type 'Physical Disk' in clustered role '5fbe0d16-377c-40f6-8b13-af34fe7790c0' failed. The error code was '0x1' ('Incorrect function.').

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

We tried to bring it online in Failover Cluster Manager, but it shows two errors:

Failed to bring the resource CSV online, Error Code: 0x8007139a

Incorrect function, Error Code: 0x80070001
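
For reference, both codes are HRESULT values, and splitting them into their fields shows which underlying Win32 error each one wraps. The sketch below is only an illustration in Python (the function name decode_hresult is made up for this note); it follows the standard HRESULT layout of a failure bit, a facility field and a 16-bit code. Facility 7 is FACILITY_WIN32, so the low 16 bits are a plain Win32 error number.

```python
FACILITY_WIN32 = 7  # facility value used for wrapped Win32 errors


def decode_hresult(value):
    """Split a 32-bit HRESULT into (is_failure, facility, code)."""
    is_failure = bool((value >> 31) & 1)   # top bit set means failure
    facility = (value >> 16) & 0x1FFF      # same mask as the HRESULT_FACILITY macro
    code = value & 0xFFFF                  # low 16 bits carry the error number
    return is_failure, facility, code


for hresult in (0x8007139A, 0x80070001):
    failed, facility, code = decode_hresult(hresult)
    print(f"{hresult:#010x}: failure={failed} facility={facility} code={code}")
```

Decoded this way, 0x80070001 is Win32 error 1 ("Incorrect function"), the same 0x1 reported in Event ID 1069, and 0x8007139a is Win32 error 5018, which to my knowledge is ERROR_RESMON_ONLINE_FAILED (the resource monitor could not bring the resource online).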

Thank you in advance for your help.

July 31st, 2013 2:50am

Hi,

It seems the first event (Event ID 1254) is the cause of the issue. Please see the explanation in the following thread:

Clustered role 'Cluster Group' has exceeded its failover threshold

http://social.technet.microsoft.com/Forums/windowsserver/en-US/4eb44f05-eb9b-448a-821b-359879141608/clustered-role-cluster-group-has-exceeded-its-failover-threshold

The failover threshold is the number of times the group can fail over within the number of hours specified by the failover period. For example, if a group's failover threshold is set to "5" and its failover period to "3", then once the group has failed over five times within three hours, the clustering software stops attempting to bring the group online and leaves the resources within the group in their current state. So if the IP Address resource has been brought online but the Network Name resource fails, the group is left offline while the IP Address resource stays online.
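
To make the threshold and period arithmetic concrete, here is a small Python model of the rule described above. It is a simplified sketch and not the cluster service's actual implementation: the function name may_fail_over and the sliding-window interpretation are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def may_fail_over(previous_failovers, now, threshold, period_hours):
    """Return True if another failover attempt is still allowed.

    previous_failovers: datetimes of earlier failovers for the group.
    threshold: maximum failovers allowed within the period.
    period_hours: length of the failover period in hours.
    """
    window_start = now - timedelta(hours=period_hours)
    # Only failovers inside the current period count against the threshold.
    recent = [t for t in previous_failovers if t > window_start]
    return len(recent) < threshold


now = datetime(2013, 7, 31, 2, 50)
# Five failovers, ten minutes apart, all within the last hour.
history = [now - timedelta(minutes=10 * i) for i in range(1, 6)]

print(may_fail_over(history, now, threshold=5, period_hours=3))   # False
print(may_fail_over(history, now, threshold=10, period_hours=3))  # True
```

With a threshold of 5 and a period of 3 hours, five recent failovers exhaust the allowance, which is the state Event ID 1254 reports. Raising the threshold only permits more attempts; it does not fix whatever makes the resource fail.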


Please check whether the steps provided in that thread help.

August 1st, 2013 5:37am

Hi

We took all VM resources offline and configured them not to auto-start, then set the failover threshold of the Cluster Group resource and of the VMs to 10, but we still could not bring the volume back online. Reviewing the FailoverCluster operational log, we saw this error:

[RES] Physical Disk <CSV1> Failed to read reservation on the disk, status 1.

We tried the PowerShell cmdlet Clear-ClusterDiskReservation, but it didn't help. Fortunately we renewed our HP Care Pack for the SAN a few months ago, and HP support reset the SAN disk reservation and re-established the iSCSI connections on both of our hypervisors. Now our volume is back online.

Thank you for your directions. They really helped to eliminate a lot of the resource errors, so that the underlying problem could be identified.

August 3rd, 2013 3:40am

This topic is archived. No further replies will be accepted.
