前导  

  上次Redis MQ分布式改造完成之后, 编排的容器稳定运行了一个多月,昨天突然收到ETL端同事通知,没有采集到解析日志了。

赶紧进服务器看了一下,用于数据接收的receiver容器挂掉了, 尝试docker container start [containerid],  几分钟后该容器再次崩溃。

Redis连接超限

docker log [containerid]  查看容器日志; 重点:CSRedis.RedisException: ERR max number of clients reached

Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker[]
Executed action EqidManager.Controllers.EqidController.BatchPutEqidAndProfileIds (EqidReceiver) in .1767ms
fail: Microsoft.AspNetCore.Server.Kestrel[]
Connection id "0HLPR3AP8ODKH", Request id "0HLPR3AP8ODKH:00000001": An unhandled exception was thrown by the application.
CSRedis.RedisException: ERR max number of clients reached
at CSRedis.CSRedisClient.GetAndExecute[T](RedisClientPool pool, Func` handler, Int32 jump, Int32 errtimes)
at CSRedis.CSRedisClient.ExecuteScalar[T](String key, Func` hander)
at CSRedis.CSRedisClient.LPush[T](String key, T[] value)
at RedisHelper.LPush[T](String key, T[] value)
at EqidManager.Controllers.EqidController.BatchPutEqidAndProfileIds(List` eqidPairs) in /home/gitlab-runner/builds/haD2h5xC//webdissector/datasource/eqid-manager/src/EqidReceiver/Controllers/EqidController.cs:line
at lambda_method(Closure , Object )
at Microsoft.Extensions.Internal.ObjectMethodExecutorAwaitable.Awaiter.GetResult()
at Microsoft.AspNetCore.Mvc.Internal.ActionMethodExecutor.AwaitableResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
at System.Threading.Tasks.ValueTask`.get_Result()
at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeActionMethodAsync()
at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeNextActionFilterAsync()
at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Rethrow(ActionExecutedContext context)
at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeInnerFilterAsync()
at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeNextResourceFilter()
at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Rethrow(ResourceExecutedContext context)
at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeFilterPipelineAsync()
at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeAsync()
at Microsoft.AspNetCore.Builder.RouterMiddleware.Invoke(HttpContext httpContext)
at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication` application)
info: Microsoft.AspNetCore.Hosting.Internal.WebHost[]
Request finished in .9549ms
【dockerhost:/】仍然不可用,下一次恢复检查时间:// ::,错误:(ERR max number of clients reached)
【dockerhost:/】仍然不可用,下一次恢复检查时间:// ::,错误:(ERR max number of clients reached)
【dockerhost:/】仍然不可用,下一次恢复检查时间:// ::,错误:(ERR max number of clients reached)
【dockerhost:/】仍然不可用,下一次恢复检查时间:// ::,错误:(ERR max number of clients reached)【dockerhost:/】仍然不可用,下一次恢复检查时间:// ::,错误:(ERR max number of clients reached) 【dockerhost:/】仍然不可用,下一次恢复检查时间:// ::,错误:(ERR max number of clients reached)
【dockerhost:/】仍然不可用,下一次恢复检查时间:// ::,错误:(ERR max number of clients reached)

docker logs [containerid]

日志上显示连接Redis服务器的客户端数量超限,头脑快速思考,目前编排的某容器使用CSRedisCore 对于16个Redis DB实例化了16个客户端,但Redis服务器也不至于这么不经折腾吧。

赶紧进redis.io官网搜集相关资料

After the client is initialized, Redis checks if we are already at the limit of the number of clients that it is possible to handle simultaneously (this is configured using the maxclients configuration directive, see the next section of this document for further information).

In case it can't accept the current client because the maximum number of clients was already accepted, Redis tries to send an error to the client in order to make it aware of this condition, and closes the connection immediately. The error message will be able to reach the client even if the connection is closed immediately by Redis because the new socket output buffer is usually big enough to contain the error, so the kernel will handle the transmission of the error.

大致意思是:Redis服务器maxclients配置了最大客户端连接数, 如果当前连接的客户端超限,Redis会回发一个错误消息给客户端,并迅速关闭客户端连接。

立刻登录Redis服务器查看默认配置,确认当前Redis服务器maxclients=10000(这是一个动态值,由maxclients和最大进程文件句柄决定),

# Set the max number of connected clients at the same time. By default
# this limit is set to 10000 clients, however if the Redis server is not
# able to configure the process file limit to allow for the specified limit
# the max number of allowed clients is set to the current file limit
# minus 32 (as Redis reserves a few file descriptors for internal uses).
#
# Once the limit is reached Redis will close all the new connections sending
# an error 'max number of clients reached'.
# maxclients 10000

 左图表明:通过Redis-Cli 登录进服务器立即就被踢下线。

基本可认定redis客户端使用方式有问题。

CSRedisCore使用方式

 继续查看相关资料,可在redis服务器上利用redis-cli命令:info clients、client list仔细分析客户端连接。

info clients 命令显示现场确实有10000的连接数;

id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=lpush
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping

client list命令显示连接如下:

官方对client list命令输出字段的解释:

  • addr: The client address, that is, the client IP and the remote port number it used to connect with the Redis server.

  • fd: The client socket file descriptor number.

  • name: The client name as set by CLIENT SETNAME.

  • age: The number of seconds the connection existed for.

  • idle: The number of seconds the connection is idle.

  • flags: The kind of client (N means normal client, check the full list of flags).

  • omem: The amount of memory used by the client for the output buffer.

  • cmd: The last executed command.

根据以上解释,表明 Redis服务器收到很多ip=172.16.1.3(故障容器在网桥内的Ip 地址)的客户端连接,这些连接最后发出的是ping命令(这是一个测试命令)

故障容器使用的Redis客户端是CSRedisCore,该客户端只是单纯将 Msg 写入Redis list 数据结构,CSRedisCore上相关github issue给了我一些启发。

发现自己将CSRedisClient实例化代码写在 .netcore api Controller构造函数,这样每次请求构造Controller时都实例化一次Redis客户端,最终Redis客户端连接数达到最大允许连接值。

依赖注入三种模式: 单例(系统内单一实例,一次性注入);瞬态(每次请求产生实例并注入);自定义范围。

有关dotnet apicontroller 以瞬态模式 注入,请查阅链接

还有一个疑问? 为什么Redis服务器没有释放空闲的 客户端连接,如果空闲连接被释放了,即使我写了low代码也不至于如此吧?

查询官方:

By default recent versions of Redis don't close the connection with the client if the client is idle for many seconds: the connection will remain open forever.

However if you don't like this behavior, you can configure a timeout, so that if the client is idle for more than the specified number of seconds, the client connection will be closed.

You can configure this limit via redis.conf or simply using CONFIG SET timeout <value>.

大致意思是最近的Redis服务端版本 默认不会释放空闲的客户端连接:

# Close the connection after a client is idle for N seconds (0 to disable)
timeout 0

可通过修改Redis配置释放 空闲客户端连接。

我们最佳实践当然不是修改Redis idle timeout 配置,问题核心还是因为我实例化了多客户端,赶紧将CSRedisCore实例化代码移到startup.cs并注册为单例。

大胆求证

info clients命令显示稳定在53个Redis连接。

client list命令显示:172.16.1.3(故障容器)建立了50个客户端连接,编排的另一个容器webapp建立了2个连接,redis-cli命令登录到服务器建立了1个连接。

127.0.0.1:> client list
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=127.0.0.1: fd= name= age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=client
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=lpush
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.4: fd= name=f176f125a4c5 age= idle= flags=P db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.4: fd= name=f176f125a4c5 age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=rpop
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=ping
id= addr=172.16.1.3: fd= name=receiver age= idle= flags=N db= sub= psub= multi=- qbuf= qbuf-free= obl= oll= omem= events=r cmd=pin

client list命令显示连接

那么问题来了,修改之后,receiver容器为什么还稳定建立了50个redis连接?

进一步与CSRedisCore原作者沟通,确定CSRedisCore有预热机制,默认在连接池中预热了50个连接。

bingo,故障和困惑全部排查清楚。

总结

经此一役,在使用CSRedisCore客户端时, 要深入理解

① Stackexchange.Redis 使用的多路复用连接机制(使用时很容易想到注册到单例),CSRedisCore开源库采用连接池机制,在高并发场景下强烈建议注册为单例, 否则在生产使用中可能会误用在瞬态请求中实例化,导致redis客户端几天之后被占满。

② CSRedisCore会默认建立连接池,预热50个连接, 开发者心里要有数。

额外的方法论: 尽量不要从某度找答案,要学会问问题,并尝试从官方、stackoverflow 、github社区寻求解答,你挖过的坑也许别人早就挖过并踏平过。

------------------------------update  多说两句---------------------------------------------

很多博友说问题在于我没有细看CSRedisCore官方readme(readme推荐使用单例),使用方式上我确实没有做成单例:

③ 一般连接池都会有空闲释放回收机制 (CSRedisCore也是连接池机制),所以当时并没有把 单例放在心上

④ 本次重要知识点:Redis默认并不会释放空闲客户端连接(但是又设置了最大连接数),这也直接促成了本次容器崩溃事故。

嗯,坑是自己挖的。

一次误用CSRedisCore引发的redis故障排除经历的更多相关文章

  1. 记一次linux Docker网络故障排除经历

    背景: 之前做了一个项目,需要在容器内访问宿主机提供的Redis 服务(这是一个比较常见的应用场景哈), 常规方案: ①   主机网络(docker run --network=host): 完全应用 ...

  2. spark 性能优化 数据倾斜 故障排除

    版本:V2.0 第一章       Spark 性能调优 1.1      常规性能调优 1.1.1   常规性能调优一:最优资源配置 Spark性能调优的第一步,就是为任务分配更多的资源,在一定范围 ...

  3. Sentry 监控 - 私有 Docker Compose 部署与故障排除详解

    内容整理自官方开发文档 系列 1 分钟快速使用 Docker 上手最新版 Sentry-CLI - 创建版本 快速使用 Docker 上手 Sentry-CLI - 30 秒上手 Source Map ...

  4. 理解 OpenStack + Ceph (7): Ceph 的基本操作和常见故障排除方法

    本系列文章会深入研究 Ceph 以及 Ceph 和 OpenStack 的集成: (1)安装和部署 (2)Ceph RBD 接口和工具 (3)Ceph 物理和逻辑结构 (4)Ceph 的基础数据结构 ...

  5. 细化如何安装LNMP + Zabbix 监控安装文档以及故障排除

    1.LNMP所需安装包: 上传如下软件包到/soft目录中 mysql- (centos6. 64位自带)也可根据版本自行挑选,前提你了解这个版本 pcre-8.36.tar.gz nginx-.ta ...

  6. 第十篇 Replication:故障排除

    本篇文章是SQL Server Replication系列的第十篇,详细内容请参考原文. 复制故障排除是一项艰巨的任务.在任何复制设置中,都涉及到很多移动部件,而可用的工具并不总是很容易识别问题.Th ...

  7. 《DevOps故障排除:Linux服务器运维最佳实践》读书笔记

    首先,这本书是Linux.CN赠送的,多谢啦~ http://linux.cn/thread-12733-1-1.html http://linux.cn/thread-12754-1-1.html ...

  8. 利用Ring Buffer在SQL Server 2008中进行连接故障排除

    原文:利用Ring Buffer在SQL Server 2008中进行连接故障排除 出自:http://blogs.msdn.com/b/apgcdsd/archive/2011/11/21/ring ...

  9. JVMTI 中间JNI系列功能,线程安全和故障排除技巧

    JVMTI 中间JNI系列功能,线程安全和故障排除技巧 jni functions 在使用 JVMTI 的过程中,有一大系列的函数是在 JVMTI 的文档中 没有提及的,但在实际使用却是很实用的. 这 ...

随机推荐

  1. JSP前端数据本地排序

    在前端中我们经常需要数据的排序,首先写引入我写好的js $(function($) { $('#sclazzId').val($('#voId').val()); document.getElemen ...

  2. WPF中TimeSpan的坑

    记一次在WPF中,在将格式为“DD.HH:mm:ss”字符串转换成TimeSpan时遇到的坑 如果字符串为:DD.HH:mm:ss,转换结果正确.例如: var currentValue = &quo ...

  3. Appium+python自动化(三十)- 实现代码与数据分离 - 数据配置-yaml(超详解)

    简介 本篇文章主要介绍了python中yaml配置文件模块的使用让其完成数据和代码的分离,宏哥觉得挺不错的,于是就义无反顾地分享给大家,也给大家做个参考.一起跟随宏哥过来看看吧. 思考问题 前面我们配 ...

  4. ServerResponse(服务器统一响应数据格式)

    ServerResponse(服务器统一响应数据格式) 前言: 其实严格来说,ServerResponse应该归类到common包中.但是我实在太喜欢这玩意儿了.而且用得也非常频繁,所以忍不住推荐一下 ...

  5. Scrapy爬虫框架学习

    一.Scrapy框架简介 1. 下载页面 2. 解析 3. 并发 4. 深度 二.安装 linux下安装 pip3 install scrapy windows下安装 a.pip3 install w ...

  6. Spring源码剖析6:Spring AOP概述

    原文出处: 五月的仓颉 我们为什么要使用 AOP 前言 一年半前写了一篇文章Spring3:AOP,是当时学习如何使用Spring AOP的时候写的,比较基础.这篇文章最后的推荐以及回复认为我写的对大 ...

  7. vsftpd 530 Login incorrect问题处理

    vsftpd 530 login incorrect 的N中情况 1.密码错误. 2.检查/etc/vsftpd/vsftpd.conf配置 vim /etc/vsftpd/vsftpd.conf 看 ...

  8. switch语句(上)(转载)

    switch语句是C#中常用的跳转语句,可以根据一个参数的不同取值执行不同的代码.switch语句可以具备多个分支,也就是说,根据参数的N种取值,可以跳转到N个代码段去运行.这不同于if语句,一条单独 ...

  9. P2486 [SDOI2011]染色 维护区间块数 树链剖分

    https://www.luogu.org/problemnew/show/P2486   题意 对一个树上维护两种操作,一种是把x到y间的点都染成c色,另一种是求x到y间的点有多少个颜色块,比如11 ...

  10. andriod开发--使用Http的Get和Post方式与网络交互通信

    package com.example.a350773523.myapplication; import android.os.AsyncTask; import android.support.v7 ...