redis-server process stuck at 100% CPU
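Conclusion:
Whether this is a Redis bug is still to be confirmed. The memory actually used by the process is far below the configured maxmemory, so this cannot be a case of running out of memory and needing to evict keys.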
Cluster role of the redis-server process that was at 100% CPU:
slave
Temporary workaround:
Use gdb to set d.ht[0].used to 0.
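Why the workaround unblocks the process can be sketched with a reduced copy of the dict counters. This is a minimal sketch, not Redis code: the structs keep only the fields of `dictht` and `dict` that matter here, the values come from the gdb dump quoted in the code section, and `observed_dict`, `apply_gdb_workaround` and `random_key_would_return_null` are hypothetical helper names. Only the `dictSize` macro is copied from the source.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced copies of the structures; the table pointer is left out on purpose. */
typedef struct dictht {
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

typedef struct dict {
    dictht ht[2];
    long rehashidx;
} dict;

/* Same macro as in the Redis source quoted in this post. */
#define dictSize(d) ((d)->ht[0].used+(d)->ht[1].used)

/* Hypothetical helper: models only the early-exit test in dictGetRandomKey(). */
int random_key_would_return_null(const dict *d) {
    return dictSize(d) == 0;
}

/* Hypothetical helper: the state printed by "p *d" (one entry, 8 slots). */
dict observed_dict(void) {
    dict d = { { {8, 7, 1}, {0, 0, 0} }, -1 };
    return d;
}

/* Hypothetical helper: same effect as "(gdb) set variable d.ht[0].used=0". */
void apply_gdb_workaround(dict *d) {
    assert(d != NULL);
    d->ht[0].used = 0;
}
```

Before the change `dictSize` is 1, so the early exit is skipped; after it, `dictSize` is 0 and the next call returns NULL, which ends the loop in dbRandomKey(). Only the counter changes: the entry is still linked in the table, so this looks like a stopgap rather than a repair.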
Cause:
Inside dictGetRandomKey(), d.ht[0].used stays at 1,
so the branch "if (dictSize(d) == 0) return NULL;" is never taken,
and dbRandomKey() spins in an infinite loop.
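One explanation that fits the log line "1 keys (1 volatile)" and the slave role, offered here as an assumption rather than a confirmed diagnosis: if expireIfNeeded() on a slave only reports that a key is expired and leaves the deletion to the master, then a keyspace whose only key is expired never shrinks, and the "continue" in dbRandomKey() repeats forever. The toy model below sketches that behaviour. Every name in it (`toy_db`, `toy_expire_if_needed`, `toy_db_random_key`) is hypothetical, and the iteration cap exists only so the demonstration terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the keyspace state; not a Redis structure. */
typedef struct toy_db {
    int keys;        /* number of keys, i.e. what dictSize() would report */
    int all_expired; /* 1 if every key is past its TTL */
    int is_replica;  /* 1 models a slave, 0 models a master */
} toy_db;

/* Models expireIfNeeded(): returns 1 when the picked key is logically expired.
 * Assumption: a master deletes the key, a slave only reports it. */
int toy_expire_if_needed(toy_db *db) {
    assert(db != NULL);
    if (!db->all_expired) return 0;
    if (!db->is_replica) db->keys--; /* the master really removes the key */
    return 1;
}

/* Models the loop in dbRandomKey(). Returns the number of iterations used,
 * or -1 if the cap is reached, meaning the real loop would never exit. */
int toy_db_random_key(toy_db *db, int cap) {
    int i;
    for (i = 1; i <= cap; i++) {
        if (db->keys == 0) return i;            /* dictGetRandomKey() gave NULL */
        if (toy_expire_if_needed(db)) continue; /* "search for another key" */
        return i;                               /* a live key was found */
    }
    return -1;
}
```

With `{1, 1, 0}` (master, one expired key) the model exits on the second iteration because the key is removed; with `{1, 1, 1}` (slave, one expired key) it hits the cap with `keys` still at 1. A bounded number of retries like this cap is one possible way to guard such a loop.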
Version:
Redis server v=3.2.0 sha=00000000:0 malloc=jemalloc-4.0.3 bits=64 build=9894db3ef433c070
Symptom 1: 100% CPU
PID   USER  PR NI VIRT  RES  SHR  S %CPU  %MEM TIME+   COMMAND
25636 redis 20 0  38492 4096 1360 R 100.0 0.0  2578:10 redis-server
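Symptom 2: a large number of connections in CLOSE_WAIT state (the event loop is stuck inside one command, so sockets already closed by clients are never read or closed):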
tcp     2417      0 1.49.26.98:11382      1.49.26.98:37268      CLOSE_WAIT  -                   
tcp     2521      0 1.49.26.98:11382      1.49.26.98:35141      CLOSE_WAIT  -                   
tcp     2521      0 1.49.26.98:11382      1.49.26.98:57181      CLOSE_WAIT  -
Process state:
redis 25636 30.0 0.0 38492  4096 ? Rsl 3月23 2579:55 /data/redis/bin/redis-server *:1382 [cluster]
Maximum memory setting (1G):
maxmemory 1073741824
Runtime log:
25636:S 28 Mar 00:21:24.526 - 1 clients connected (0 slaves), 1312384 bytes in use
25636:S 28 Mar 00:21:29.531 - DB 0: 1 keys (1 volatile) in 8 slots HT.
25636:S 28 Mar 00:21:29.531 - 1 clients connected (0 slaves), 1312384 bytes in use
25636:S 28 Mar 00:21:32.585 - Accepted 1.118.14.7:58132
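Call stacks (four samples taken while the process was spinning; every one is inside dbRandomKey called from randomkeyCommand):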
#0  dictGenHashFunction (key=<optimized out>, len=5) at dict.c:123
#1  0x00000000004232e6 in dictFind (d=0x7f71c2a17240, key=key@entry=0x7f71c2a15001) at dict.c:499
#2  0x000000000043a00a in dbRandomKey (db=0x7f71c2a24800) at db.c:176
#3  0x000000000043a0a2 in randomkeyCommand (c=0x7f71c2aae1c0) at db.c:355
#4  0x0000000000426b95 in call (c=c@entry=0x7f71c2aae1c0, flags=flags@entry=15) at server.c:2221
#5  0x0000000000429ba7 in processCommand (c=0x7f71c2aae1c0) at server.c:2500
#6  0x0000000000436515 in processInputBuffer (c=0x7f71c2aae1c0) at networking.c:1296
#7  0x0000000000421338 in aeProcessEvents (eventLoop=eventLoop@entry=0x7f71c2a2e050, flags=flags@entry=3) at ae.c:412
#8  0x00000000004215eb in aeMain (eventLoop=0x7f71c2a2e050) at ae.c:455
#9  0x000000000041e5df in main (argc=2, argv=0x7ffef34b2418) at server.c:4079

#0  0x00007f71c2fbc3a2 in random () from /lib64/libc.so.6
#1  0x0000000000423745 in dictGetRandomKey (d=0x7f71c2a171e0) at dict.c:646
#2  0x0000000000439fc0 in dbRandomKey (db=0x7f71c2a24800) at db.c:171
#3  0x000000000043a0a2 in randomkeyCommand (c=0x7f71c2aae1c0) at db.c:355
#4  0x0000000000426b95 in call (c=c@entry=0x7f71c2aae1c0, flags=flags@entry=15) at server.c:2221
#5  0x0000000000429ba7 in processCommand (c=0x7f71c2aae1c0) at server.c:2500
#6  0x0000000000436515 in processInputBuffer (c=0x7f71c2aae1c0) at networking.c:1296
#7  0x0000000000421338 in aeProcessEvents (eventLoop=eventLoop@entry=0x7f71c2a2e050, flags=flags@entry=3) at ae.c:412
#8  0x00000000004215eb in aeMain (eventLoop=0x7f71c2a2e050) at ae.c:455
#9  0x000000000041e5df in main (argc=2, argv=0x7ffef34b2418) at server.c:4079

#0  0x00007f71c30e17e4 in __memcmp_sse4_1 () from /lib64/libc.so.6
#1  0x0000000000424219 in dictSdsKeyCompare (privdata=<optimized out>, key1=<optimized out>, key2=<optimized out>) at server.c:445
#2  0x000000000042331d in dictFind (d=0x7f71c2a17240, key=0x7f71c2a27e73) at dict.c:504
#3  0x0000000000439494 in getExpire (db=0x7f71c2a24800, key=0x7f71c2a27e60) at db.c:824
#4  0x0000000000439c4f in expireIfNeeded (db=0x7f71c2a24800, key=0x7f71c2a27e60) at db.c:858
#5  0x000000000043a01a in dbRandomKey (db=0x7f71c2a24800) at db.c:177
#6  0x000000000043a0a2 in randomkeyCommand (c=0x7f71c2aae1c0) at db.c:355
#7  0x0000000000426b95 in call (c=c@entry=0x7f71c2aae1c0, flags=flags@entry=15) at server.c:2221
#8  0x0000000000429ba7 in processCommand (c=0x7f71c2aae1c0) at server.c:2500
#9  0x0000000000436515 in processInputBuffer (c=0x7f71c2aae1c0) at networking.c:1296
#10 0x0000000000421338 in aeProcessEvents (eventLoop=eventLoop@entry=0x7f71c2a2e050, flags=flags@entry=3) at ae.c:412
#11 0x00000000004215eb in aeMain (eventLoop=0x7f71c2a2e050) at ae.c:455
#12 0x000000000041e5df in main (argc=2, argv=0x7ffef34b2418) at server.c:4079

#0  dictGetRandomKey (d=<optimized out>) at dict.c:663
#1  0x0000000000439fc0 in dbRandomKey (db=0x7f71c2a24800) at db.c:171
#2  0x000000000043a0a2 in randomkeyCommand (c=0x7f71c2aae1c0) at db.c:355
#3  0x0000000000426b95 in call (c=c@entry=0x7f71c2aae1c0, flags=flags@entry=15) at server.c:2221
#4  0x0000000000429ba7 in processCommand (c=0x7f71c2aae1c0) at server.c:2500
#5  0x0000000000436515 in processInputBuffer (c=0x7f71c2aae1c0) at networking.c:1296
#6  0x0000000000421338 in aeProcessEvents (eventLoop=eventLoop@entry=0x7f71c2a2e050, flags=flags@entry=3) at ae.c:412
#7  0x00000000004215eb in aeMain (eventLoop=0x7f71c2a2e050) at ae.c:455
#8  0x000000000041e5df in main (argc=2, argv=0x7ffef34b2418) at server.c:4079
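Initial guess:
maxmemory was reached and Redis entered the key-eviction logic, but no key qualified for eviction, hence the infinite loop. (As stated in the conclusion, this does not hold: used memory is far below maxmemory, and the call stacks show RANDOMKEY rather than eviction.)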
Relevant code:
/* Return a random key from the currently selected database. */
void randomkeyCommand(client *c) {
    robj *key;
    if ((key = dbRandomKey(c->db)) == NULL) {
        addReply(c,shared.nullbulk);
        return;
    }
    addReplyBulk(c,key);
    decrRefCount(key);
}
/* Return a random key, in form of a Redis object.
 * If there are no keys, NULL is returned.
 *
 * The function makes sure to return keys not already expired. */
robj *dbRandomKey(redisDb *db) {
    dictEntry *de;
    while(1) { // this loop never exits, which is what pins the CPU at 100%
        sds key;
        robj *keyobj;
        de = dictGetRandomKey(db->dict);
        if (de == NULL) return NULL;
        key = dictGetKey(de);
        keyobj = createStringObject(key,sdslen(key));
        if (dictFind(db->expires,key)) {
            if (expireIfNeeded(db,keyobj)) {
                decrRefCount(keyobj);
                continue; /* search for another key. This expired. */
            }
        }
        return keyobj;
    }
}
void call(client *c, int flags) {
    long long dirty, start, duration;
    int client_old_flags = c->flags;
    /* Sent the command to clients in MONITOR mode, only if the commands are
     * not generated from reading an AOF. */
    if (listLength(server.monitors) &&
        !server.loading &&
        !(c->cmd->flags & (CMD_SKIP_MONITOR|CMD_ADMIN)))
    {
        replicationFeedMonitors(c,server.monitors,c->db->id,c->argv,c->argc);
    }
    /* Initialization: clear the flags that must be set by the command on
     * demand, and initialize the array for additional commands propagation. */
    c->flags &= ~(CLIENT_FORCE_AOF|CLIENT_FORCE_REPL|CLIENT_PREVENT_PROP);
    redisOpArrayInit(&server.also_propagate);
    /* Call the command. */
    dirty = server.dirty;
    start = ustime();
    c->cmd->proc(c);
    duration = ustime()-start;
    dirty = server.dirty-dirty;
    if (dirty < 0) dirty = 0;
    /* ... */
}
/* With multiplexing we need to take per-client state.
 * Clients are taken in a linked list. */
typedef struct client {
    /* ... */
    struct redisCommand *cmd, *lastcmd;  /* Last command executed. */
    /* ... */
} client;
typedef void redisCommandProc(client *c);
typedef int *redisGetKeysProc(struct redisCommand *cmd, robj **argv, int argc, int *numkeys);
struct redisCommand {
    char *name;
    redisCommandProc *proc;
    int arity;
    char *sflags; /* Flags as string representation, one char per flag. */
    int flags;    /* The actual flags, obtained from the 'sflags' field. */
    /* Use a function to determine keys arguments in a command line.
     * Used for Redis Cluster redirect. */
    redisGetKeysProc *getkeys_proc;
    /* What keys should be loaded in background when calling this command? */
    int firstkey; /* The first argument that's a key (0 = no keys) */
    int lastkey;  /* The last argument that's a key */
    int keystep;  /* The step between first and last key */
    long long microseconds, calls;
};
/* This is our hash table structure. Every dictionary has two of this as we
 * implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;
typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    int iterators; /* number of iterators currently running */
} dict;
/* Return a random entry from the hash table. Useful to
 * implement randomized algorithms */
dictEntry *dictGetRandomKey(dict *d)
{
    dictEntry *he, *orighe;
    unsigned int h;
    int listlen, listele;
    // (gdb) p *d
    // $1 = {type = 0x71d940 <dbDictType>, privdata = 0x0, ht = {{table = 0x7f71c2a1e480, size = 8, sizemask = 7, used = 1}, {table = 0x0, size = 0, sizemask = 0, used = 0}}, rehashidx = -1, iterators = 0}
    //
    // (gdb) p d.ht[0]
    // $3 = {table = 0x7f71c2a1e480, size = 8, sizemask = 7, used = 1}
    // (gdb) p d.ht[1]
    // $4 = {table = 0x0, size = 0, sizemask = 0, used = 0}
    //
    // (gdb) set variable d.ht[0].used=0
    // (gdb) p d.ht[0].used
    // $7 = 0
    // #define dictSize(d) ((d)->ht[0].used+(d)->ht[1].used)
    if (dictSize(d) == 0) return NULL;
    if (dictIsRehashing(d)) _dictRehashStep(d);
    if (dictIsRehashing(d)) {
        do {
            /* We are sure there are no elements in indexes from 0
             * to rehashidx-1 */
            h = d->rehashidx + (random() % (d->ht[0].size +
                                            d->ht[1].size -
                                            d->rehashidx));
            he = (h >= d->ht[0].size) ? d->ht[1].table[h - d->ht[0].size] :
                                      d->ht[0].table[h];
        } while(he == NULL);
    } else {
        do {
            h = random() & d->ht[0].sizemask;
            he = d->ht[0].table[h];
        } while(he == NULL);
    }
    /* Now we found a non empty bucket, but it is a linked
     * list and we need to get a random element from the list.
     * The only sane way to do so is counting the elements and
     * select a random index. */
    listlen = 0;
    orighe = he;
    while(he) {
        he = he->next;
        listlen++;
    }
    listele = random() % listlen;
    he = orighe;
    while(listele--) he = he->next;
    return he;
}
/* This function performs just a step of rehashing, and only if there are
 * no safe iterators bound to our hash table. When we have iterators in the
 * middle of a rehashing we can't mess with the two hash tables otherwise
 * some element can be missed or duplicated.
 *
 * This function is called by common lookup or update operations in the
 * dictionary so that the hash table automatically migrates from H1 to H2
 * while it is actively used. */
static void _dictRehashStep(dict *d) {
    if (d->iterators == 0) dictRehash(d,1);
}
Process memory (only visible once the workaround was applied and the loop had exited, but the figures agree with what ps showed):
# Memory
used_memory:1375320
used_memory_human:1.31M
used_memory_rss:4321280
used_memory_rss_human:4.12M
used_memory_peak:2468448
used_memory_peak_human:2.35M
total_system_memory:33453797376
total_system_memory_human:31.16G
used_memory_lua:34816
used_memory_lua_human:34.00K
maxmemory:1073741824
maxmemory_human:1.00G
maxmemory_policy:allkeys-lru
mem_fragmentation_ratio:3.14
mem_allocator:jemalloc-4.0.3
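The figures above back the conclusion that eviction is not involved. As a small sketch of the arithmetic, the functions below take the values from this INFO output (used_memory, used_memory_rss, maxmemory); the function names are hypothetical, and the ratio is computed as RSS divided by used memory, which matches the reported 3.14.

```c
#include <assert.h>
#include <stdio.h>

/* Fraction of maxmemory actually in use (0.0 to 1.0). */
double used_fraction(unsigned long long used, unsigned long long maxmemory) {
    assert(maxmemory > 0);
    return (double)used / (double)maxmemory;
}

/* Resident set size relative to the memory Redis has allocated. */
double fragmentation_ratio(unsigned long long rss, unsigned long long used) {
    assert(used > 0);
    return (double)rss / (double)used;
}

/* Prints both figures for the values reported in this post. */
void print_memory_summary(void) {
    const unsigned long long used = 1375320ULL;         /* used_memory */
    const unsigned long long rss = 4321280ULL;          /* used_memory_rss */
    const unsigned long long maxmemory = 1073741824ULL; /* maxmemory */

    printf("used/maxmemory: %.3f%%\n", 100.0 * used_fraction(used, maxmemory));
    printf("rss/used: %.2f\n", fragmentation_ratio(rss, used));
}
```

Used memory comes to roughly 0.13% of the 1G limit, so the maxmemory threshold is nowhere near being reached.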
