Redis Cache Refusing Connections

From @gavinbarron via Twitter:


Cache on Azure suddenly refusing connections and consuming all of the CPU. Can't even connect via redis-cli at the moment.


Regards,
@AzureSupport

July 6th, 2015 7:55pm

I have a redis cache which was set at 1GB.

This cache is located at wpcconnect.redis.cache.windows.net

Approximately 6 hours ago (0600 NZT) users started reporting issues. Via application logging I found that Redis was refusing client connections (RedisConnectionException: It was not possible to connect to the redis server(s); to create a disconnected multiplexer, disable AbortOnConnectFail.)

When examining the server, the Server Load metric was at 100, as was CPU usage. Memory Usage is sitting at 496MB.

The number of connections was approximately 8K, which struck me as higher than it should be.

Due to the unresponsive nature of the cache I have created a second cache and pointed my client app at the new cache.

However, since 'failing over' to the new cache, the old cache still has ~8K open connections.


July 6th, 2015 8:18pm

I have checked a few things about the cache wpcconnect.redis.cache.windows.net. It is a 2.5 GB cache. Server load is 100%. The number of connected clients is around 7300, which is very high for a 2.5 GB cache.

Please provide us with the following details to help us investigate further:
1. Cache Name
2. Cache Size
3. Date and time of errors (including timezone)
4. Exception messages with full stack trace
5. Number and type of client instances (e.g. web site, web role, worker role, VM)
6. Public Virtual IP (VIP) Address of client deployments
7. Version of StackExchange.Redis (and Microsoft.Web.RedisSessionStateProvider if applicable)
8. Code snippet showing how you are configuring and using the ConnectionMultiplexer object. Are you sharing a single instance of ConnectionMultiplexer across the whole client process?
9. In what region(s) are your cache service and clients?
10. Did anything change in your client around the time of the error? Were you scaling the number of client instances up or down, or deploying a new version of the client? Does your client have auto-scale enabled?
11. What was the CPU utilization on your client both before and during the incident?
12. Did all requests experience high latency or timeouts at the time of the incident, or only some requests?
13. How were the failures distributed across your clients? Evenly split, or all on a single client?
14. What is the size of the value you are getting from or putting into the cache?
15. What timeout values are set on the client for sync timeout and for connection timeout?


You can contact us directly at: AzureCache@microsoft.com

July 6th, 2015 8:44pm

1. Cache Name:

wpcconnect.redis.cache.windows.net

2. Cache Size:

Currently 2.5 GB, scaled up from 1 GB after seeing server load at 100.

3. Date and time of errors (including timezone):

Last noted error in client at July 7th 2015, 09:27:56 (UTC +12). First occurrence today at July 7th 2015, 06:39:15 (UTC +12).

4. Exception messages with full stack trace:

[RedisConnectionException: It was not possible to connect to the redis server(s); to create a disconnected multiplexer, disable AbortOnConnectFail. SocketFailure on PING]
StackExchange.Redis.ConnectionMultiplexer.ConnectImpl(Func`1 multiplexerFactory, TextWriter log):150

Any other lines in the stack trace are from our own code, above this call.

5. Number and type of client instances (e.g. web site, web role, worker role, VM):

The client application is a SharePoint 2013 farm running in IaaS, using Redis to cache data from a remote service. There are 4 VMs in the farm which can connect to the Redis cache.

6. Public Virtual IP (VIP) Address of client deployments:

wpc14p2spcldsvc.cloudapp.net [191.236.56.250]

7. Version of StackExchange.Redis (and Microsoft.Web.RedisSessionStateProvider if applicable):

StackExchange.Redis.StrongName 1.0.450

8. Code snippet showing how you are configuring and using the ConnectionMultiplexer object. Are you sharing a single instance of ConnectionMultiplexer across the whole client process?

        private void Init()
        {
            if (_connectionMultiplexer == null)
            {
                _connectionMultiplexer = ConnectionMultiplexer.Connect(_configurationOptions);
                _cache = _connectionMultiplexer.GetDatabase();
            }
        }

All calls which connect to Redis use the above Init method to create the connection. We take no additional steps to share the ConnectionMultiplexer instance.

        public RedisCache Build()
        {
            var configurationOptions = new ConfigurationOptions
            {
                KeepAlive = _constants.RedisCacheKeepAliveTimeSeconds,
                ConnectTimeout = 15000,
                SyncTimeout = 15000,
                Ssl = _constants.RedisSslEnabled,
                AllowAdmin = true,
                Password = _constants.RedisCachePassword
            };
            var endpoint = _constants.RedisCacheEndPoint;
            var port = _constants.RedisCachePort;
            configurationOptions.EndPoints.Add(endpoint, port);

            return new RedisCache(configurationOptions, _prefix);
        }

The Build method is always used to construct our wrapper object

9. In what region(s) are your cache service and clients?

EastUS

10. Did anything change in your client around the time of the error? Were you scaling the number of client instances up or down, or deploying a new version of the client? Does your client have auto-scale enabled?

No known changes. No autoscale.

11. What was the CPU utilization on your client both before and during the incident?

Before: unknown, as diagnostics were not enabled. During and now: server load is at 100; CPU was at 100% but has decreased to ~85%.

12. Did all requests experience high latency or timeouts at the time of the incident, or only some requests?

It looks like all requests, although this is based solely on my observations, as we only log and monitor failures, not successes.

13. How were the failures distributed across your clients? Evenly split, or all on a single client?

All client VMs were subject to failures.

14. What is the size of the value you are getting from or putting into the cache?

Unsure of the exact size at present. Based on figures noted during development and testing, roughly 1MB.

15. What timeout values are set on the client for sync timeout and for connection timeout?

15000 ms for both, as set in the Build method above.

It should be noted that the client application has been re-configured to use a different Redis Cache server yet high usage and connection metrics are persisting.

July 6th, 2015 9:14pm

From the code above it is not clear whether '_connectionMultiplexer' is shared among all objects. If it is non-static, then every object is going to create its own connection, which can cause a connection flood on the server.

I would recommend that you create just a single ConnectionMultiplexer. Here is the pattern we typically recommend customers use:

private static Lazy<ConnectionMultiplexer> lazyConnection = new Lazy<ConnectionMultiplexer>(() => {
    return ConnectionMultiplexer.Connect("mycache.redis.cache.windows.net,abortConnect=false,ssl=true,password=...");
});
 
public static ConnectionMultiplexer Connection {
    get {
        return lazyConnection.Value;
    }
}

PS: I see in the logs that most of the connections are from IP 191.236.56.250.
July 7th, 2015 1:54pm

Thanks for that suggestion.

We've implemented a change to ensure that access to the ConnectionMultiplexer always passes through a static method, as per your suggestion. We're still seeing a steady increase in the number of connections to the Redis server over the course of the day.

Our current cache wrapper stores the result of the GetDatabase() call in a field for use within the instance. For reference:

public class RedisCache : IRepositoryCache
{
 private readonly ConfigurationOptions _configurationOptions;
 private readonly CachePrefix _prefix;
 private IDatabase _cache;

 public RedisCache(ConfigurationOptions configurationOptions, CachePrefix prefix)
 {
  if (configurationOptions == null) throw new ArgumentNullException("configurationOptions");
  _configurationOptions = configurationOptions;
  _prefix = prefix;
 }

 private IDatabase Cache
 {
  get
  {
   if (_cache == null)
   {
    Init(_configurationOptions);
   }

   return _cache;
  }
 }

 private void Init(ConfigurationOptions options)
 {
  _cache = LazyConnection(options).Value.GetDatabase();
 }

 private static Lazy<ConnectionMultiplexer> LazyConnection(ConfigurationOptions options)
 {
  return new Lazy<ConnectionMultiplexer>(() => ConnectionMultiplexer.Connect(options));
 }

 public void ClearItem(string key)
 {
  key = _prefix + key;
  if (key == null) throw new ArgumentNullException("key");
  Cache.KeyDelete(key);
 }

 // Other cache access methods omitted for brevity
}

Based on the observed behavior I'm considering altering our implementation to be:

public class RedisCache : IRepositoryCache
{
 private readonly ConfigurationOptions _configurationOptions;
 private readonly CachePrefix _prefix;

 public RedisCache(ConfigurationOptions configurationOptions, CachePrefix prefix)
 {
  if (configurationOptions == null) throw new ArgumentNullException("configurationOptions");
  _configurationOptions = configurationOptions;
  _prefix = prefix;
 }

 private static IDatabase Cache(ConfigurationOptions options)
 {
  IDatabase cache = LazyConnection(options).Value.GetDatabase();
  return cache;
 }

 private static Lazy<ConnectionMultiplexer> LazyConnection(ConfigurationOptions options)
 {
  return new Lazy<ConnectionMultiplexer>(() => ConnectionMultiplexer.Connect(options));
 }

 public void ClearItem(string key)
 {
  key = _prefix + key;
  if (key == null) throw new ArgumentNullException("key");
  Cache(_configurationOptions).KeyDelete(key);
 }

 // Other cache access methods omitted for brevity
}

 

July 8th, 2015 5:27pm

You haven't implemented the suggested solution correctly. You have to share the ConnectionMultiplexer among all RedisCache objects, so there should be only one ConnectionMultiplexer object that is static and shared among all.

You have marked the methods as static, but they are getting called again and again, creating a new ConnectionMultiplexer for each RedisCache object. I would modify your code as below.

public class RedisCache : IRepositoryCache
    {
        private static ConfigurationOptions _configurationOptions;
        private readonly CachePrefix _prefix;
       
        public RedisCache(ConfigurationOptions configurationOptions, CachePrefix prefix)
        {
            if (configurationOptions == null) throw new ArgumentNullException("configurationOptions");
            _configurationOptions = configurationOptions;
            _prefix = prefix;
        }


        private IDatabase Cache
        {
            get
            {
                return Connection.GetDatabase();
            }
        }

        private static Lazy<ConnectionMultiplexer> lazyConnection = new Lazy<ConnectionMultiplexer>(() =>
        {
            return ConnectionMultiplexer.Connect(_configurationOptions);
        });

        public static ConnectionMultiplexer Connection
        {
            get
            {
                return lazyConnection.Value;
            }
        }

        public void ClearItem(string key)
        {
            key = _prefix + key;
            if (key == null) throw new ArgumentNullException("key");
            Cache.KeyDelete(key);
        }

        // Other cache access methods omitted for brevity
    }
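
To make the difference concrete, here is a minimal, self-contained sketch. It does not use StackExchange.Redis at all: FakeConnection is a hypothetical stand-in for ConnectionMultiplexer that only counts how many instances get created, and PerCallLazy and SharedLazy are illustrative names, not code from this thread. It contrasts a static method that returns a new Lazy<T> on every call with a static Lazy<T> field that is created once.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for ConnectionMultiplexer: counts "connections" opened.
public sealed class FakeConnection
{
    private static int _opened;
    public static int Opened { get { return _opened; } }
    public FakeConnection() { Interlocked.Increment(ref _opened); }
}

public static class PerCallLazy
{
    // A static *method* that builds a new Lazy on every call:
    // each caller gets its own Lazy, so the factory runs once per call.
    public static Lazy<FakeConnection> LazyConnection()
    {
        return new Lazy<FakeConnection>(() => new FakeConnection());
    }
}

public static class SharedLazy
{
    // A static *field*: the Lazy is created once, so its factory runs at most once.
    private static readonly Lazy<FakeConnection> lazyConnection =
        new Lazy<FakeConnection>(() => new FakeConnection());

    public static FakeConnection Connection { get { return lazyConnection.Value; } }
}

public static class Program
{
    public static void Main()
    {
        for (int i = 0; i < 3; i++) { var c = PerCallLazy.LazyConnection().Value; }
        Console.WriteLine(FakeConnection.Opened); // 3: one new instance per call

        for (int i = 0; i < 3; i++) { var c = SharedLazy.Connection; }
        Console.WriteLine(FakeConnection.Opened); // 4: only one more, then reused
    }
}
```

The static keyword on a method only means the method needs no instance; what limits the number of connections is that a single Lazy<T> object is stored and reused.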

  • Proposed as answer by GavinB.Net Friday, July 10, 2015 6:15 AM
July 8th, 2015 6:49pm

Sorry, but your implementation of

private static Lazy<ConnectionMultiplexer> lazyConnection = new Lazy<ConnectionMultiplexer>(() =>
        {
            return ConnectionMultiplexer.Connect(_configurationOptions);
        });

is non-static as it uses the _configurationOptions member field. If I refactor your version a bit to account for this I get:

public class RedisCache : IRepositoryCache
{
 private readonly ConfigurationOptions _configurationOptions;
 private readonly CachePrefix _prefix;
 private readonly Lazy<ConnectionMultiplexer> _lazyConnection;

 public RedisCache(ConfigurationOptions configurationOptions, CachePrefix prefix)
 {
  if (configurationOptions == null) throw new ArgumentNullException("configurationOptions");
  _configurationOptions = configurationOptions;
  _prefix = prefix;
  _lazyConnection = new Lazy<ConnectionMultiplexer>(() => ConnectionMultiplexer.Connect(_configurationOptions));
 }

 private IDatabase Cache
 {
  get
  {
   return Connection.GetDatabase();
  }
 }

 public ConnectionMultiplexer Connection
 {
  get
  {
   return _lazyConnection.Value;
  }
 }

 public void ClearItem(string key)
 {
  key = _prefix + key;
  if (key == null) throw new ArgumentNullException("key");
  Cache.KeyDelete(key);
 }

 // Other cache access methods omitted for brevity
}

If you're looking to share a single static instance of the ConnectionMultiplexer across all instances of RedisCache, how do we pass in the configuration for the connection without that class being aware of how to construct the configuration?

July 8th, 2015 7:33pm

>> Is non-static as it uses the _configurationOptions member field.

=> If you look carefully at the code, I have marked _configurationOptions as static.

>> If you're looking to share a single static instance of the ConnectionMultiplexer across all instances of RedisCache, how do we pass in the configuration for the connection without that class being aware of how to construct the configuration?

=> You can have a public static property instead of _configurationOptions and set it before creating any instance of RedisCache.
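
One possible reading of this suggestion, as a sketch only: the configuration is handed over once at start-up, and every RedisCache instance then asks the same static holder for the shared connection. The RedisConnection class and its Configure method are hypothetical names introduced for illustration; ConfigurationOptions, ConnectionMultiplexer.Connect and GetDatabase are the StackExchange.Redis members already used in this thread.

```csharp
using System;
using StackExchange.Redis;

// Hypothetical holder class (not from the thread): owns the one shared multiplexer.
public static class RedisConnection
{
    private static ConfigurationOptions _options;

    // Created once per process; the factory runs on the first access to Connection.
    private static readonly Lazy<ConnectionMultiplexer> LazyConnection =
        new Lazy<ConnectionMultiplexer>(() => ConnectionMultiplexer.Connect(_options));

    // Call once at application start-up, before any RedisCache is used.
    public static void Configure(ConfigurationOptions options)
    {
        if (options == null) throw new ArgumentNullException("options");
        _options = options;
    }

    public static ConnectionMultiplexer Connection
    {
        get
        {
            // Fail fast here rather than inside the Lazy factory,
            // because Lazy<T> caches an exception thrown by its factory.
            if (_options == null)
                throw new InvalidOperationException("Call Configure before using Connection.");
            return LazyConnection.Value;
        }
    }
}

// Usage sketch:
//   RedisConnection.Configure(configurationOptions);   // at start-up
//   IDatabase cache = RedisConnection.Connection.GetDatabase();   // inside RedisCache
```

With this shape, RedisCache no longer needs to know how the options are built; it only depends on the shared Connection property. Calling Configure again after the first connection has been made has no effect on the existing multiplexer.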

July 8th, 2015 8:04pm

Ah! Sorry, totally missed that you'd made the _configurationOptions field static.

I'll get this change into prod later today and hopefully we'll see some improvement on the connection usage.

July 8th, 2015 9:02pm

Looks like this is all resolved now.

Thank you so very much, your help has been greatly appreciated :)

July 9th, 2015 3:42pm

Most welcome. Can you please mark this post as answered?
July 9th, 2015 7:07pm

This topic is archived. No further replies will be accepted.
