Dedicated Distributed Cache Server and ports issue

Hi All,

We are in the process of setting up a dedicated Distributed Cache server.

We passed the ports required by the Distributed Cache server (22233-22236) to the network team. We have already set up SharePoint on the new server and connected it to the farm. We turned on the Distributed Cache service both via PowerShell and in the Central Administration UI. PowerShell returned the error "A failure occurred in SPDistributedCacheServiceInstance::Provision. cacheHostInfo is null for host 'DCServer'", and the Central Administration UI showed "Error Starting". We have already reported this to the network team, and they said it is an application issue, not a port issue.

1. Is it really a port issue? On the new DC server, netstat does not show the ports (22233-22236) as open. Here are the SharePoint ULS logs from the Distributed Cache server:

05/17/2015 15:43:46.40	OWSTIMER.EXE (0x043C)	0x0B64	SharePoint Foundation	DistributedCache	af6fq	Unexpected	A failure occurred in SPDistributedCacheServiceInstance::Provision. cacheHostInfo is null for host 'DCServer'.	1030079d-4c0e-20ef-7652-61fdc639746b
05/17/2015 15:43:46.40	OWSTIMER.EXE (0x043C)	0x0B64	SharePoint Foundation	DistributedCache	aelvf	Unexpected	A failure occurred SPDistributedCacheServiceInstance::Provision() , Exception 'System.InvalidOperationException: cacheHostInfo is null     at Microsoft.SharePoint.DistributedCaching.Utilities.SPDistributedCacheServiceInstance.Provision()'	1030079d-4c0e-20ef-7652-61fdc639746b
05/17/2015 15:43:46.40	OWSTIMER.EXE (0x043C)	0x0B64	SharePoint Foundation	Topology	7034	Critical	An attempt to start/stop instance  of service Distributed Cache on server DCServer did not succeed.  Re-run the action via UI or command line on the specified server. Additional information is below.  cacheHostInfo is null	1030079d-4c0e-20ef-7652-61fdc639746b
05/17/2015 15:43:46.40	OWSTIMER.EXE (0x043C)	0x0B64	SharePoint Foundation	Timer	3899	Critical	Error executing service instance (un)provisioning job.  Service instance: "Distributed Cache" (id "7f6951dc-7bbc-4cc3-8c28-0131a8db6b0b) "cacheHostInfo is null""	1030079d-4c0e-20ef-7652-61fdc639746b

We referred to this URL: http://www.ericjochens.com/2014/02/sharepoint-2013-distributed-cache-issues.html. The author mentions that the first issue to check is the ports, and that if the ports are not open, the Distributed Cache server may not function correctly. Is that what is happening here?

2. We have 1 application server and 4 web front end servers. We have removed the Distributed Cache instance from those servers, as we now have a dedicated Distributed Cache server.
Can this dedicated cache server handle all WFEs and the application server for all social workloads? How can we verify that the new service is working for all servers?

Any help would be greatly appreciated.

May 17th, 2015 8:50am

1. It does not seem to be a port issue. Rather, it looks like an issue with the DCS instance itself, as it failed to come up properly. The answer is in the logs: cacheHostInfo is null. Check the articles below for fixing this:

http://strangelittletech.blogspot.com/2014/07/cachehostinfo-is-null-repair.html

http://technet.microsoft.com/en-us/library/jj219613%28v=office.15%29.aspx

2. It should be able to handle it. However, that also depends on the social workloads. You need to keep it under observation and then decide.

Hope this helps.

May 17th, 2015 10:55pm

Hi Mohit and all, 


We have a brand new SharePoint 2013 server and have just enabled the Distributed Cache (DC) service on it so that it can act as a dedicated cache server. As per the DC requirements:

  1.        Ping is working on this new server.
  2.        However, ports 22233-22234 do not show as open, as described in the earlier question, even though the network team showed us that these ports are open for this server.

We have performed the following steps:

  1.        Renamed the DC server: http://www.bluesphereinc.com/blog/renaming-a-sharepoint-20102013-server/
  2.        Rebooted the server and ran IISReset.
  3.        Followed the Habanero Consulting blog below, which explains the same error that we are getting. We used PortQry on this new DC server and found that the DC ports are not open. http://habaneroconsulting.com/insights/distributed-cache-needs-ping#.VVsWSPmqqkp
  4.        Removed and re-added the DC instance using the following commands:

Add-SPDistributedCacheServiceInstance

$instanceName ="SPDistributedCacheService Name=AppFabricCachingService"

$serviceInstance = Get-SPServiceInstance | ? {($_.service.tostring()) -eq  $instanceName -and ($_.server.name) -eq $env:<<DCServerName>>}

$serviceInstance.Provision()

We also tried to provision using the GUID of the Distributed Cache instance, which is listed as Disabled (SPServer Name=<<DCServer>>      <<GUID>>), but got the same error again.

PS C:\Users\SP_FARM> $serviceInstance.Provision()
Exception calling "Provision" with "0" argument(s): "cacheHostInfo is null"
At line:1 char:1
+ $serviceInstance.Provision()
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : InvalidOperationException

We tried many blogs:

https://dayansameera.wordpress.com/2013/09/03/the-operation-failed-because-the-server-could-not-access-the-distributed-cache-sharepoint-2013/
http://almondlabs.com/blog/manage-the-distributed-cache/
http://sharepointengineer.com/2014/11/04/sharepoint-2013-distributed-cache-cachehostinfo-is-null/
http://sharepoint.stackexchange.com/questions/108470/appfabric-caching-service-keeps-crashing
http://www.sharepointfire.com/MyBlog/2014/03/set-sharepoint-cache-accounts-with-powershell/
http://blogs.technet.com/b/mspfe/archive/2013/12/11/tips-and-tricks-from-the-field-on-the-new-distributed-cache-service-in-sharepoint-2013.aspx
http://blogs.msdn.com/b/sambetts/archive/2014/03/19/sharepoint-2013-distributed-cache-appfabric-troubleshooting.aspx

We are totally out of ideas, and the new Distributed Cache server has to be provided to users as soon as possible. Please advise which step we are missing and how we can get the Distributed Cache server and its service up and running.

Thanks for your help and advice.

May 19th, 2015 4:35pm

Sandy

Did you read this: https://samlman.wordpress.com/2015/03/02/configuring-multiple-distributed-cache-servers-in-sharepoint-2013/   ?

May 19th, 2015 4:56pm

Hi Sandy

Ports will be shown as open only when the DCS instance is running; otherwise they won't be. Regarding the "cacheHostInfo is null" issue, the only cause I suspect is this: "We have removed the Distributed Cache instance across servers as we have dedicated distributed cache server". So now there is no DCS farm to join. If you have already tried all the blogs above, try removing the server from the SP farm and then adding it back. This should open up a new instance of the DCS farm.
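
To make the port point concrete, here is a minimal sketch of a TCP check for the four ports quoted in this thread. It uses the .NET TcpClient class rather than any SharePoint cmdlet. The host name "DCServer" and the 2-second timeout are assumptions for illustration. A port that reports closed only tells you nothing is listening or something is blocking it; as noted above, that is expected while the DCS instance is not running.

```powershell
# Hypothetical host name; replace with the real cache host.
$cacheHost = "DCServer"
# Cache, cluster, arbitration and replication ports mentioned in this thread.
$ports = 22233..22236

foreach ($port in $ports) {
    $client = New-Object System.Net.Sockets.TcpClient
    $open = $false
    try {
        # BeginConnect lets us apply our own timeout (2 seconds, an arbitrary choice).
        $async = $client.BeginConnect($cacheHost, $port, $null, $null)
        if ($async.AsyncWaitHandle.WaitOne(2000)) {
            try {
                $client.EndConnect($async)
                $open = $true
            } catch {
                # Connection refused or unreachable.
                $open = $false
            }
        }
    } catch {
        # For example, the host name could not be resolved.
        $open = $false
    } finally {
        $client.Close()
    }
    "{0}:{1} open={2}" -f $cacheHost, $port, $open
}
```

Run it from a WFE against the cache host, not only on the cache host itself, so that any firewall between the two is included in the check.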

Hope this helps. 

May 20th, 2015 12:26am

Hi Mohit and others, 

Thanks for your valuable inputs. As we described earlier, we have 1 app server (with the Distributed Cache (DC) service), 4 WFEs, and a newly provisioned dedicated Distributed Cache server. We have removed the DC instances from the 4 WFEs and the 1 application server, which had the DC service earlier.

We ran the following PowerShell query today:

Get-SPServiceInstance | Where-Object {$_.typename -like "*distributed cache*"} | fl server,status,id

and the following servers were listed:
SPServer Name=<<Existing App Server>>  Disabled <<GUID>>
SPServer Name=<<New Distributed Cache>> Disabled <<GUID>>

The DC instances on the WFEs were removed completely, though.

  1. Is it advisable to remove both instances (app server and new DC server), as we need only the dedicated Distributed Cache server in the farm?
$s = Get-SPServiceInstance <<GUID>>
$s.delete()
  2. You mentioned removing the server from the SP farm and joining it to the farm again. Can you advise how to remove this server from the farm and rejoin it? How will the other servers be impacted by this action, and will DC then be provisioned on the new DC server?

Any help would be greatly appreciated. 

Thank you. 

May 20th, 2015 2:17am

So let's clear up some doubts first.

1. "We have removed DC instances from 4 WFEs and 1 Application Server, which had DC service earlier. We ran the PowerShell Query as follows today and found 2 servers (with disabled as service status)." ---> You thought that you had removed the DCS instance from all servers, but that is not the case. The DCS service instance still exists (but is not running) on both servers listed. If so, this is good news. Let's name them server A and server B.

2. "Is it advisable to remove both the instances (APP and New DC server) as we need only Distributed Cache server in the Farm." --> Until the DCS instance is running fine on your dedicated server, don't remove the DCS instance from the other server.

3. "You mentioned that remove server SP farm and join the farm. Can you advice how can I remove the farm and join the farm this server? Using this action how the servers will be impacted and will it DC will be provisioned in new DC server." --> Given the good news in point 1, it should not be required.

Troubleshooting steps:

1. Bring the DCS instance online on your app server first.

2. Make sure you allow ICMP and the firewall rules on both servers: the app server and the dedicated DCS server.

3. Now bring the DCS instance online on your dedicated DCS server. For this, try removing the DCS instance and adding it back using the Add command.

4. Once it's online, remove DCS from the app server.

You need to perform all the steps as described in the TechNet article: https://technet.microsoft.com/en-us/library/jj219613(v=office.15).aspx

Please use the same commands as in the article.
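
As a rough sketch of how the four steps above map onto cmdlets (not a substitute for the TechNet article), assuming the SharePoint 2013 Management Shell and the standard Distributed Cache cmdlets; "DCServer" is a placeholder host name, and each section must be run on the server named in its comment:

```powershell
# Load the SharePoint snap-in if this is a plain PowerShell session.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Step 2 (sanity check): ICMP must get through between the cache hosts.
# "DCServer" is a placeholder for the other server's name.
Test-Connection -ComputerName "DCServer" -Count 2

# Steps 1 and 3: run locally on the server that should become a cache host
# (the app server first, then the dedicated DCS server).
Add-SPDistributedCacheServiceInstance

# Verify what SharePoint and AppFabric each think the state is.
Get-SPServiceInstance |
    Where-Object { $_.TypeName -like "*distributed cache*" } |
    Format-List Server, Status, Id
Use-CacheCluster
Get-CacheHost

# Step 4: run locally on the app server, only after the dedicated
# server shows as UP in Get-CacheHost.
Stop-SPDistributedCacheServiceInstance -Graceful
Remove-SPDistributedCacheServiceInstance
```

The order matters: the dedicated server joins an already working cache cluster, and the app server leaves only after that.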

Hope this helps.

May 20th, 2015 1:03pm

Hi Mohit, 

Thanks a ton for your time and the step-by-step DC troubleshooting.

  1. Unfortunately, we have just deleted the DCS instances on the app server (Server A) and the new Distributed Cache server (Server B). However, we will bring the DCS instance on the app server (Server A) back up tomorrow.
  2. Our next step is to bring up the DC instance on the app server (Server A). Once the Server A DC instance is working, we will bring up the other DC instance on the dedicated cache server (Server B).
  3. Server A is our critical, primary server. We are not sure about disconnecting a server from the farm and adding it back, and we don't know whether disconnecting the DC server would affect the other servers. In case we are not able to bring up the DC instance on Server B, we found some workarounds in the following blogs:
    http://sharepointjournal.com/2014/08/19/sharepoint-2013-distributed-cache-boon-or-bane/
    https://mjddesign.wordpress.com/2014/04/29/sharepoint-lessons-learned/

    Basically, they modify the AppFabric registry entry to provide the provider name and connection string, and then, as a second step, modify the DistributedCacheService.exe.config file in C:\Program Files\AppFabric 1.1 for Windows Server to update the connection string. Has anyone had success implementing the DC service using these workarounds?

Any suggestions will be greatly appreciated.

May 20th, 2015 2:22pm

Hi Mohit and others,

  1.    On Server A (the existing application server, which had a DC instance configured earlier), the DC instance was removed following an organizational decision, because it was degrading server performance. It was therefore decided to go for a dedicated DC server.
  2.    We instead tried to create the DC instance on Server B using this blog post: http://asharepointsolutions.blogspot.sg/2013/07/cachehostinfo-is-null.html

$SPFarm = Get-SPFarm

$cacheClusterName = "SPDistributedCacheCluster_" + $SPFarm.Id.ToString()
$cacheClusterManager = [Microsoft.SharePoint.DistributedCaching.Utilities.SPDistributedCacheClusterInfoManager]::Local
$cacheClusterInfo = $cacheClusterManager.GetSPDistributedCacheClusterInfo($cacheClusterName);
$instanceName ="SPDistributedCacheService Name=AppFabricCachingService"
$serviceInstance = Get-SPServiceInstance | ? {($_.Service.Tostring()) -eq $instanceName -and ($_.Server.Name) -eq $env:computername}
Write-Host $cacheClusterInfo -ForegroundColor DarkCyan

if ([System.String]::IsNullOrEmpty($cacheClusterInfo.CacheHostsInfoCollection))
{
    # Here's the key: we can't provision, unprovision, start, or stop a Cache Service because we still have a Cache Service that has no server attached
    $serviceInstance.Delete()
    Add-SPDistributedCacheServiceInstance
    $cacheClusterInfo.CacheHostsInfoCollection
}
  3.        Queried whether the new instance was created:

    Get-SPServiceInstance | ? {($_.service.tostring()) -eq "SPDistributedCacheService Name=AppFabricCachingService"} | select Server, Status, ID

    A new DC instance had been created on the dedicated Distributed Cache server, with Disabled status, as follows:

    Server                     Status      Id
    ------                     ------      --
    SPServer Name=DCServer     Disabled    <GUID>

  4.        We found that the dedicated DC instance was indeed created in Central Administration, but with STOPPED status.
  5.        Added a new DC instance by running the following cmdlets on the dedicated DC server:

Add-SPDistributedCacheServiceInstance
$instanceName ="SPDistributedCacheService Name=AppFabricCachingService"
$serviceInstance = Get-SPServiceInstance | ? {($_.service.tostring()) -eq $instanceName -and ($_.server.name) -eq $env:computername}
$serviceInstance.Provision()

and got this error: You cannot call a method on a null-valued expression

  6. Ran Get-CacheHost and, strangely, found that Server A has a service status of UNKNOWN, with the following error: Get-CacheHost : ErrorCode<ERRCAdmin032>:SubStatus<ES0001>:Invalid operation encountered on DEV:AppFabricCachingService : Cannot open Service Control Manager on computer 'ServerA'. This operation might require other privileges.

    HostName : CachePort        Service Name               Service Status    Version Info
    --------------------        ------------               --------------    ------------
    <<ServerA.com:22233>>       AppFabricCachingService    UNKNOWN           0 [0,0] [0,0]

  7. We then ran the following PowerShell cmdlets: Use-CacheCluster and Get-AFCacheHostConfiguration -ComputerName "<Server B>DCServer.com" -CachePort "22233", but this resulted in an error:
    Failed to connect hosts in the cluster
  8. We then edited the registry and modified DistributedCacheService.exe.config to update the connection string: http://sharepointjournal.com/2014/08/19/sharepoint-2013-distributed-cache-boon-or-bane/

     We then repeated step 5 but got the same error. We don't know what the missing piece in the DC configuration is. Please advise. Any help will be greatly appreciated.
Thank you.
May 21st, 2015 3:26pm

I repaired DCS recently and here is what I did:

1. First, be on a server where you configured DCS successfully.

1.1 Run this command: Use-CacheCluster

1.2 Then Get-CacheHost

When these commands are executed, the result will show you on which servers the AppFabric service is up and on which its status is unknown. You may skip this if you have already done it.

2. The next step is repairing the instance on the server where it's corrupted. In some cases you may think you removed the corrupted instance, but it was not fully removed, which turns out to be problematic. Here is the script I run to fix the issue:

$instanceName= "SPDistributedCacheService Name=AppFabricCachingService"

$serviceInstance= Get-SpServiceInstance | ? {($_.service.tostring()) -eq $instanceName -and ($_.server.name) -eq "Your target server name"}

$serviceInstance.Delete()

Add-SpDistributedCacheServiceInstance

3. There are some requirements for the AppFabric service state when you run this script. Normally it should be in a stopped state and disabled too. After a successful run you will see the service up and running.

Hope this helps! 

May 21st, 2015 7:10pm

Hi Asfaw, Mohit and others, 

Thanks for your inputs. 
@Asfaw: we followed your cmdlets; unfortunately, they did not work. However, we did get DC to start working, as follows:

  1. As we were getting an error while importing the CacheClusterConfig using this blog (http://mmman.itgroove.net/2013/07/10/fixing-the-appfabric-cache-cluster-in-sharepoint-2013/), we followed another blog (http://spoodoo.com/fixing-the-appfabric-caching-service-after-renaming-the-server/) and decided to run Unregister-CacheHost for the current DC server (Server B):

    Unregister-CacheHost -HostName "<<DCServer>>" -ProviderType SPDistributedCacheClusterProvider -ConnectionString "<<DC Server Connection String>>" 

  2. Renamed the server again:
    Rename-SPServer -Identity "<<FQDN>>" -Name "<<DC Server Name>>"
  3. Rebooted the server and ran IISReset.
  4. Ran the following cmdlets, but they threw an error:
    Use-CacheCluster
    Get-CacheHost
  5. On this DC server, ran Get-AFCacheHostConfiguration; it is pointing to the existing application server (Server A), with service status UNKNOWN.
  6. Registered the cache host for Server B (DC server):
    Register-CacheHost -Provider "SPDistributedCacheClusterProvider" -ConnectionString "<<DC Server (Server B)>>" -Account "NT AUTHORITY\NETWORK SERVICE" -CachePort 22233 -ClusterPort 22234 -ArbitrationPort 22235 -ReplicationPort 22236 -HostName "<<DC Server>>"
  7. Ran Get-CacheHost. The existing server showed UNKNOWN status, and the DC server now appeared with status DOWN.
  8. Ran Start-CacheHost -ComputerName "<<DC Server>>" -CachePort 22233, and it showed the status DOWN.
  9. Followed this blog post: http://asharepointsolutions.blogspot.sg/2013/07/cachehostinfo-is-null.html
  10. Ran the following cmdlet:
    Get-SPServiceInstance | ? {($_.service.tostring()) -eq "SPDistributedCacheService Name=AppFabricCachingService"} | select Server, Status, ID
  11. The DC instance on the dedicated DC server finally shows status ONLINE. The blogs we used:
    http://sharepointjournal.com/2014/08/19/sharepoint-2013-distributed-cache-boon-or-bane/
    http://mmman.itgroove.net/2013/07/10/fixing-the-appfabric-cache-cluster-in-sharepoint-2013/
    http://asharepointsolutions.blogspot.sg/2013/07/cachehostinfo-is-null.html

  12. However, on the new host, which has 16 GB RAM, Get-AFCacheHostConfiguration showed 8,191 MB RAM. The server performance was slow, so we stopped the DC instance:

    Use-CacheCluster
    Get-CacheHost
    Get-AFCacheHostConfiguration -ComputerName "<<DC Server name>>" -CachePort "22233"

    Stop-SPDistributedCacheServiceInstance -Graceful
    Remove-SPDistributedCacheServiceInstance

    We then tried to update the DC cache size:
    Update-SPDistributedCacheSize -CacheSizeInMB 2048

    It threw this error:
    Error while loading the provider "SPDistributedCacheClusterProvider". Check HKEY_LOCAL_MACHINE -> SOFTWARE\Microsoft\AppFabric\V1.0\Providers\AppFabricCaching -> SPDistributedCacheClusterProvider

  13. The DC instance keeps showing DISABLED status.
    We ran Unregister-CacheHost and Register-CacheHost,
    ran the script from http://asharepointsolutions.blogspot.sg/2013/07/cachehostinfo-is-null.html,
    renamed the server, and rebooted the server twice, but the DC instance on the DC server keeps showing DISABLED status.

    How can we fix this issue?

Any help would be greatly appreciated.  

May 24th, 2015 11:42am

Hi All,

We have fixed the issue, and the dedicated Distributed Cache instance now shows STARTED status. We updated the DC cache size to 6144 MB. We then monitored page performance: each page took around 10-12 seconds, which is not acceptable to users. We then changed the cache size to 2 GB and 4 GB, but page performance was the same. This is our farm:

  1.        App Server - Windows 2012 64 bit,    4 V Core, 32GB  (Removed the DC instance) (Server A)
  2.        Web Servers -  Windows 2012 64 bit,  4 V Core, 16GB
  3.        Data servers - Windows 2012 64 bit,  4 V Core, 32GB
  4.        Dedicated Distributed Cache server Windows 2012 64 bit,  4 V Core, 16GB (New DC instance) (Server B)

We have 2 questions:

Q: In the cache host list on our dedicated DC server, we can STILL see an entry for the DC instance of the existing application server (Server A).

Also, for Server A, the Health Analyzer (view of all issues) shows the warning below.

---------------------------------------------------------------------------
Title: The Current server is running low on memory.
Severity: 2 - Warning
Category: Availability
Explanation: The memory usage on the current server is about {0}. This can cause the eviction or throttling of the Distributed Cache Service.
Remedy: Check the memory usage on the machine. And try to free up some memory or add more ram on the machine. For more information about this rule, see "http://go.microsoft.com/fwlink/?LinkID=224261".
Failing Servers: Server A
Failing Services: SPDistributedCacheService (AppFabricCachingService)
---------------------------------------------------------------------------

Is the existing server causing this issue? How can we completely remove this entry for Server A, given that there is no longer any DC instance on that server?

Q: We referred to this blog post about adjusting DC cache memory (http://blogs.msdn.com/b/calvarro/archive/2013/08/29/points-to-consider-with-distributed-cache-on-sharepoint-2013.aspx), which suggests installing AppFabric CU5. Will this CU help resolve the issue? We also found this TechNet forum thread, where the poster has the same issue:

https://social.technet.microsoft.com/Forums/sharepoint/en-US/7b695bf8-bebd-4c23-9025-f8ea02ba902d/distributed-cache-slowing-down-all-my-farm

Any help would be greatly appreciated.    
May 25th, 2015 11:29am

Hi Sandy

You did some great troubleshooting for fixing this. :)

You can definitely think about upgrading to CU5, but before that I would recommend removing app server A from the DCS cluster and then monitoring the performance.

 

May 25th, 2015 2:04pm

Hi Mohit and others,

Yes, we thought that UPS and search configuration/troubleshooting were tricky, but we found that troubleshooting DC is the trickiest of all. We did remove the existing app server (Server A) by using the Unregister-CacheHost cmdlet:

Unregister-CacheHost -HostName "DC Server" -ProviderType SPDistributedCacheClusterProvider -ConnectionString "Data Source=sql;Initial Catalog=SPFarm_Config;Integrated Security=True;Enlist=False"


We then started the DC instance. But the page load time was the same (6.2 - 8 seconds) and we had to stop the service. We found that a Distributed Cache server requires fixed memory, NOT dynamic memory:
https://technet.microsoft.com/en-us/library/jj219572.aspx#plandc

We talked to the infrastructure team and pointed them to the above link. They rebooted the dedicated DC server and we started the DC instance. The cache size of the dedicated DC is 6144 MB (6 GB) out of the 16 GB of overall RAM on this DC server.
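
For reference, the sizing arithmetic as we understand the TechNet guidance for a dedicated cache host: reserve about 2 GB for the OS and other processes, then give half of the remainder to the cache. The sketch below only does that arithmetic; the function name is made up for illustration, and the 2 GB reserve is a parameter in case your guidance differs.

```powershell
# Hypothetical helper: (total RAM - reserved) / 2, expressed in MB.
function Get-SuggestedCacheSizeMB {
    param(
        [Parameter(Mandatory = $true)][int]$TotalMemoryGB,
        [int]$ReservedGB = 2   # assumed reserve for OS and other services
    )
    $cacheGB = ($TotalMemoryGB - $ReservedGB) / 2
    [int]($cacheGB * 1024)
}

$size = Get-SuggestedCacheSizeMB -TotalMemoryGB 16
$size   # 7168 for a 16 GB server: (16 - 2) / 2 = 7 GB

# Applying it (assumption, per our reading of TechNet: the Distributed Cache
# service is stopped, not removed, on all cache hosts; run once on one host,
# then start the service again):
# Update-SPDistributedCacheSize -CacheSizeInMB $size
```

By this rule a 16 GB dedicated host could take about 7 GB, so the 6144 MB used here is within range and the cache size itself is unlikely to explain the slow pages.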

We found the page load is still taking 6.2 - 8 seconds. We also found some links:
http://developers.de/blogs/damir_dobric/archive/2013/08/13/very-slow-startup-of-appfabric-distributed-cache-service.aspx

https://support.microsoft.com/en-us/kb/2787717/en-us?wa=wsignin1.0 (AppFabric CU 3), which requires a server reboot.

A blog also mentions updating the DistributedCacheService.exe.config file: http://jasonwarren.ca/sp2013-distributed-cache-bug/

We don't know what we are missing here, and any help would be greatly appreciated.
Thanks in advance.

May 26th, 2015 3:01pm

Hi All,

We are still struggling with the dedicated Distributed Cache server. We have successfully provisioned the DC instance, but the page load still takes 6.2-7 seconds no matter what the DC cache size is. We have adjusted the cache RAM to 2, 4 and 8 GB, but there is no effect on page performance.

We patched AppFabric with CU 5 using http://www.wictorwilen.se/how-to-patch-the-distributed-cache-in-sharepoint-2013 and the steps in TechNet, as follows:

  1.        Shut down the DC instance.
  2.        Apply the CU 5 patch.
  3.        Add the backgroundGC key to the DistributedCacheService.exe.config file on the DC server. https://blog.imason.com/distributed-cache-errors-in-sharepoint-2013/
  4.        Start the DC instance on the server through UI or PowerShell.
  5.        Restart the Distributed Cache SharePoint service.
  6.        Perform IISReset.

Again, the page load is 6.2-7 seconds. The DC cache is assigned 6 GB of RAM, and the server has 16 GB of RAM in total.
Q: Is there any way we can find the patch level, to confirm that AppFabric CU 5 is applied?

The ULS logs are giving this error:
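
One way to look at the patch level, offered as a sketch only: read the file version of the cache service executable. The path below is an assumption based on the default AppFabric folder mentioned earlier in this thread. We do not list which version number corresponds to which CU here; compare the output against the KB article for the CU you installed.

```powershell
# Assumed default install folder (the same folder as DistributedCacheService.exe.config).
$exe = Join-Path $env:ProgramFiles "AppFabric 1.1 for Windows Server\DistributedCacheService.exe"

if (Test-Path $exe) {
    # VersionInfo comes from the file itself, so it changes when a CU replaces the binary.
    (Get-Item $exe).VersionInfo | Select-Object FileName, FileVersion, ProductVersion
} else {
    Write-Warning "DistributedCacheService.exe not found at $exe"
}
```

Running this on every cache host also shows whether all hosts are on the same build.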

------------------------------------------------

Unexpected Exception in SPDistributedCachePointerWrapper::InitializeDataCacheFactory for usage 'DistributedLogonTokenCache' - Exception 'Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<ERRCA0017>:SubStatus<ES0006>:There is a temporary failure. Please retry later. (One or more specified cache servers are unavailable, which could be caused by busy network or servers. 

-----------------------------------------------------------

We don't know how to proceed.
Any suggestions would be greatly appreciated.
May 28th, 2015 3:24pm

Hi All,

We have set up a dedicated Distributed Cache server, and the only service enabled on it is the Distributed Cache (DC) service. Our farm architecture is as follows:

  1.        App Server - Windows 2012 64 bit,    4 V Core, 32GB  (Removed the DC instance)(Server A)
  2.        Web Servers -  Windows 2012 64 bit,  4 V Core, 16GB
  3.        Data servers - Windows 2012 64 bit,  4 V Core, 32GB
  4.        Dedicated Distributed Cache server Windows 2012 64 bit,  4 V Core, 16GB (New DC instance) (Server B)

The new DC service is set up, with status UP in PowerShell and STARTED in Central Administration. In the AppFabric service window, the startup type is AUTOMATIC and the account assigned is SPAppPool. In the database, SPAppPool has only the public role, whereas SPFarm has the db_owner, db_securityadmin and sysadmin roles. The DC instance is started, and we monitored page performance (using the Developer Dashboard) on one of the web servers, as follows:

15:22:58.450 ah24r w3wp.exe (0x1928) 7060 Verbose DistributedCache SharePoint Foundation cache host Name: 'ServerA' Running status 'DOWN'

15:22:58.450 ah24r w3wp.exe (0x1928) 7060 Verbose DistributedCache SharePoint Foundation cache host Name: 'ServerB' Running status 'UP'

15:22:58.450 achxc w3wp.exe (0x1928) 7060 Verbose DistributedCache SharePoint Foundation DistributedCacheClient is currently using cache host 'ServerB'

15:22:58.450 ah24s w3wp.exe (0x1928) 7060 Verbose DistributedCache SharePoint Foundation DistributedCacheClient ChannelOpenTimeout '00:00:00.0200000'.

15:22:58.450 ah24t w3wp.exe (0x1928) 7060 Verbose DistributedCache SharePoint Foundation DistributedCacheClient RequestTimeout '00:00:03'.

15:22:58.450 ah24u w3wp.exe (0x1928) 7060 Verbose DistributedCache SharePoint Foundation DistributedCacheClient MaxConnectionsToServer '1'.

15:22:58.450 ah24v w3wp.exe (0x1928) 7060 Verbose DistributedCache SharePoint Foundation DistributedCacheClient TransportProperties- ChannelInitializationTimeout '00:01:00', ConnectionBufferSize '131072',                                   MaxBufferPoolSize '268435456', MaxBufferSize '8388608', MaxOutputDelay '00:00:00.0020000',ReceiveTimeout '00:01:00'.

15:23:04.470 ah24w w3wp.exe (0x1928) 7060 Unexpected DistributedCache SharePoint Foundation Unexpected Exception in SPDistributedCachePointerWrapper::InitializeDataCacheFactory for usage 'DistributedViewStateCache' - Exception 'Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<ERRCA0017>:SubStatus<ES0006>:There is a temporary failure. Please retry later. (One or more specified cache servers are unavailable, which could be caused by busy network or servers. For on-premises cache clusters, also verify the following conditions. Ensure that security permission has been granted for this client account, and check that the AppFabric Caching Service is allowed through the firewall on all cache hosts. Also the MaxBufferSize on the server must be greater than or equal to the serialized object size sent from the client.) ---> System.ServiceModel.ProtocolException: The requested upgrade is not supported by 'net.tcp://serverB:22233/'. This could be due to mismatched bindings (for example security enabled on the client and not on the server

15:23:04.470 ajb4s w3wp.exe (0x1928) 7060 Monitorable General SharePoint Foundation ViewStateLog: Failed to write to the velocity cache: https://websiteA/Forms/AllItems.aspx?AjaxDelta=1&isStartPlt1=1433056977559

15:23:04.470 nasq w3wp.exe (0x1928) 7060 Verbose Monitoring SharePoint Foundation Entering monitored scope (PublishingMobile: Resolve the default or non-default channel custom master url token.). Parent Request (GET:https://websiteA:443/Forms/AllItems.aspx?AjaxDelta=1&isStartPlt1=1433056977559)
------------------------------------------------------------------------------------------------------------------------

The Developer Dashboard indicates a delay of around 6 seconds when the distributed cache starts processing the page, and eventually an exception is written to the log. We went through the points in the exception message:

  1.        "There is a temporary failure. Please retry later. (One or more specified cache servers are unavailable, which could be caused by busy network or servers." We followed this blog, which describes an issue similar to ours: http://wiprospguys.blogspot.sg/p/sharepoint-2013-distributed-cache.html. We ran Stop-SPDistributedCacheServiceInstance -Graceful and Remove-SPDistributedCacheServiceInstance, ran the PSCONFIG wizard, and then ran Add-SPDistributedCacheServiceInstance. But again we found the 6-second delay in the ULS logs, so we ruled this out as a cause.
  2.        We followed these blogs to increase the distributed cache timeouts: http://www.habaneroconsulting.com/insights/sharepoint-2013-distributed-cache-bug#.VWk0tM-qqkq and http://sharepoint-community.net/profiles/blogs/distributed-cache-repairing-it-with-powershell. We also applied AppFabric CU5 and added backgroundGC ( <appSettings> <add key="backgroundGC" value="true"/> </appSettings> ) to DistributedCacheService.exe.config.
  3.        "Check that the AppFabric Caching Service is allowed through the firewall on all cache hosts." The DC server has its ports open, and netstat on the DC server shows the DC ports as open. We ruled this out as a cause.
  4.        "Ensure that security permission has been granted for this client account." We think this could be the issue. Q: As the SPAppPool account that runs AppFabric does not have enough privileges, is it advisable to change it to the SPFarm account? https://technet.microsoft.com/en-SG/library/jj219613.aspx#changesvcacct Q: How can we grant a client account access to the Distributed Cache? Does this URL help in granting the permission? http://tafakari.co.ke/2014/07/troubleshooting-the-appfabric-cache-cluster/
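
On the "client account" point, a minimal sketch of checking and granting cache client access with the AppFabric caching cmdlets, run on a cache host. The account name is a made-up example; it would be the identity of the web application pool that talks to the cache. Whether this is the missing piece in this farm is only a guess.

```powershell
# Connect this session to the cache cluster this host belongs to.
Use-CacheCluster

# Accounts currently allowed to act as cache clients.
Get-CacheAllowedClientAccounts

# Hypothetical account name; replace with the real application pool identity.
Grant-CacheAllowedClientAccount -Account "CONTOSO\sp_webapp"

# Confirm the account now appears in the list.
Get-CacheAllowedClientAccounts
```

If the web application pool accounts are already in the list, the permission part of the error message is probably not the cause.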

Please advise as we need this DC service quickly.
Thanks in advance.  

May 31st, 2015 8:25pm

We are working on fixing the DC issue.

With the previous setup of the dedicated Distributed Cache (DC) server, the page load kept taking 6.2 seconds whenever the DC service was turned on.
Environment is:
OS: Windows Server 2012 (64 Bit)
SharePoint Edition: Standard edition. (64 Bit)
Server RAM: 16 GB (Dedicated) for DC

    •        The search service is using continuous crawl for everything. (Social MSDN)
    •        Ran the Restart-CacheCluster cmdlet and iisreset /noforce.
    •        Deleted and recreated the DC instance on the dedicated Distributed Cache server.
    •        We used these commands to adjust the DC cache host size:

Use-CacheCluster
Get-CacheHost

Set-AFCacheHostConfiguration -ComputerName "DistributedCacheServer" -cacheport 22233 -cachesize 7000
  1.  We then turned on the Developer Dashboard to see the detailed page load requests, and found that a page load took 6 seconds, with this error:

     Unexpected Exception in SPDistributedCachePointerWrapper::InitializeDataCacheFactory for usage 'DistributedViewStateCache' - Exception 'Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<ERRCA0017>:SubStatus<ES0006>:There is a temporary failure. Please retry later. (One or more specified cache servers are unavailable, which could be caused by busy network or servers. For on-premises cache clusters, also verify the following conditions. Ensure that security permission has been granted for this client account, and check that the AppFabric Caching Service is allowed through the firewall on all cache hosts. Also the MaxBufferSize on the server must be greater than or equal to the serialized object size sent from the client.) ---> System.ServiceModel.ProtocolException:
  2.        We checked Get-CacheAllowedClientAccounts and added the service accounts to WSS_Admin_WPG and WSS_Admin_WPG on the DC server. We then viewed the page load again. There was no error like the one above in the Developer Dashboard's ULS logs tab, and that request took 1 second. But the overall page load was again 6.2 seconds.
  3.       We stopped the DC service: Stop-SPDistributedCacheServiceInstance -Graceful

  4.       We then changed the DC service account to the user profile service account, following the TechNet article, and started the DC instance:

    $farm = Get-SPFarm
    $cacheService = $farm.Services | where {$_.Name -eq "AppFabricCachingService"}
    $accnt = Get-SPManagedAccount -Identity contoso\sp_usersync
    $cacheService.ProcessIdentity.CurrentIdentityType = "SpecificUser"
    $cacheService.ProcessIdentity.ManagedAccount = $accnt
    $cacheService.ProcessIdentity.Update()
    $cacheService.ProcessIdentity.Deploy()

    But again, the page load took 6.2 seconds.

    Which configuration are we still missing?

    Any help would be greatly appreciated.
August 19th, 2015 12:54pm

Execute the following command:
Get-CacheClusterHealth

Have you defined any entries in the hosts file on your SharePoint servers?
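
A small sketch of both checks together, assuming they are run on a cache host in a shell where the AppFabric caching cmdlets are available. The hosts file path is the standard Windows location; the filter simply hides comment and blank lines.

```powershell
# Cluster health as seen from this cache host.
Use-CacheCluster
Get-CacheClusterHealth

# Active (non-comment, non-blank) entries in the local hosts file.
$hostsFile = Join-Path $env:SystemRoot "System32\drivers\etc\hosts"
Get-Content $hostsFile | Where-Object { $_ -notmatch '^\s*(#|$)' }
```

The second command is worth running on each WFE too, since a stale hosts entry there could point cache clients at the wrong address.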
August 19th, 2015 3:47pm

This topic is archived. No further replies will be accepted.
