The Distributed Cache service gets relatively little attention because it’s installed by default on every server in your SharePoint farm, and you would probably not know if it were having a problem (unless you paid very close attention). Recently, I had some time on my hands and went looking for trouble, so to speak. My SharePoint 2013 farm was serving pages and life was good, but I still decided it was time to take stock of what was running where and whether I could make some improvements. Among other things, I noticed my Distributed Cache service was running on only one of the 11 SharePoint servers in the farm. A closer inspection of the Event Viewer on the servers not running Distributed Cache revealed a relatively common problem: cacheHostInfo is Null.
I say this is a common problem because it seems like a well-covered topic on the Internet, but I never came across a single page that ran this issue to ground. I found many helpful hints, but decided this topic needed an end-to-end solution. Today, I cover how I approached the problem and the solution I found.
Step 1: Check Communications Channels
Make sure all member servers can communicate back to your first installed instance. Most times this will be your Central Admin box, but it doesn’t have to be. Per this TechNet article, you will need ICMP communications plus TCP ports 22233 through 22236 opened.
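One way to check these channels from a member server is a short PowerShell probe. This is a minimal sketch, not an official diagnostic: the host name SP-CACHE-01 is a hypothetical placeholder for your first installed instance, and the 3-second timeout is an arbitrary choice. It uses Test-Connection for ICMP and the .NET TcpClient class for the four TCP ports.

```powershell
# Hypothetical host name: replace with the server running your first
# installed Distributed Cache instance.
$cacheHost = 'SP-CACHE-01'
$ports = 22233..22236

# ICMP check: -Quiet returns a simple True/False.
$pingOk = Test-Connection -ComputerName $cacheHost -Count 2 -Quiet
"ICMP to ${cacheHost}: $pingOk"

# TCP checks: try each port with a bounded timeout (3 seconds, an assumption).
foreach ($port in $ports) {
    $client = New-Object System.Net.Sockets.TcpClient
    $open = $false
    try {
        $async = $client.BeginConnect($cacheHost, $port, $null, $null)
        if ($async.AsyncWaitHandle.WaitOne(3000, $false)) {
            $client.EndConnect($async)   # throws if the connection was refused
            $open = $true
        }
    }
    catch {
        $open = $false
    }
    finally {
        $client.Close()
    }
    "TCP ${cacheHost}:${port} open: $open"
}
```

Keep in mind that a port reported as closed can mean either that a firewall is blocking it or that nothing is listening on the target yet, so treat the output as a hint rather than a verdict.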
Step 2: Fix the Core Problem
This post was a lifesaver and the one I really need to thank for getting me through this task. The error message is exactly as stated; a bad cluster config is causing the problem and it needs to be rebuilt. The fix is to run the following PowerShell on the working machine:
Step 3: Fix the Remaining Machines
You are now ready to remove and recreate the Distributed Cache service on the remaining machines. Start by looking in your event log for something to the effect of:
Error executing service instance (un)provisioning job. Service instance: “Distributed Cache” (id “<Your Service ID>”) “cacheHostInfo is null”
Take the ID that I have marked above as <Your Service ID> and plug it into the script below:
$service = get-spserviceinstance <Your Service ID Here>
Note: You will have a good idea things are going well if it takes some time for that last line to run. If it comes back error-free, you should return to the Services on Server page of Central Admin and start it up for all remaining servers.
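Only the first line of the script appears above. As a hedged sketch of how the remove-and-recreate step is commonly done, and not necessarily the exact script used here, the sequence below fetches the broken service instance by ID, deletes it, and adds a fresh Distributed Cache instance on the local server. It assumes you run it on the affected server from an elevated session with farm administrator rights, and the $serviceId value is a placeholder you must replace.

```powershell
# Load the SharePoint cmdlets if this is a plain PowerShell session
# (the SharePoint 2013 Management Shell already loads them).
if (-not (Get-PSSnapin -Name Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue)) {
    Add-PSSnapin Microsoft.SharePoint.PowerShell
}

# Placeholder: paste the service instance ID from the event log entry.
$serviceId = '<Your Service ID Here>'

# Fetch the broken Distributed Cache service instance by its ID.
$service = Get-SPServiceInstance -Identity $serviceId

# Remove the broken instance from the farm configuration.
$service.Delete()

# Assumption: this recreates the instance on the server where the script runs.
Add-SPDistributedCacheServiceInstance
```

Repeat this on each server that logs the error, using the ID from that server's own event log entry.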
Thanks for reading this and I hope it helped! I don’t want to steal anybody’s thunder, but I wanted to try pulling together good information from various sources along with what I could conjure up myself and put it all in one place. That way, if anybody else runs into a similar issue, this blog post will become a resource going forward.