Ceph 状态报警告 pool rbd has many more objects per pg than average (too few pgs?)
定位问题
[root@lab8106 ~]# ceph -s
cluster fa7ec1a1-662a-4ba3-b478-7cb570482b62
health HEALTH_WARN
pool rbd has many more objects per pg than average (too few pgs?)
monmap e1: 1 mons at {lab8106=192.168.8.106:6789/0}
election epoch 30, quorum 0 lab8106
osdmap e157: 2 osds: 2 up, 2 in
flags sortbitwise
pgmap v1023: 417 pgs, 13 pools, 18519 MB data, 15920 objects
18668 MB used, 538 GB / 556 GB avail
417 active+clean
集群出现了这个警告,pool rbd has many more objects per pg than average (too few pgs?) 这个警告在hammer版本里面的提示是 pool rbd has too few pgs
这个地方查看集群详细信息:
[root@lab8106 ~]# ceph health detail
HEALTH_WARN pool rbd has many more objects per pg than average (too few pgs?); mon.lab8106 low disk space
pool rbd objects per pg (1912) is more than 50.3158 times cluster average (38)
看下集群的pool的对象状态
[root@lab8106 ~]# ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
556G 538G 18668M 3.28
POOLS:
NAME ID USED %USED MAX AVAIL OBJECTS
rbd 6 16071M 2.82 536G 15296
pool1 7 204M 0.04 536G 52
pool2 8 184M 0.03 536G 47
pool3 9 188M 0.03 536G 48
pool4 10 192M 0.03 536G 49
pool5 11 204M 0.04 536G 52
pool6 12 148M 0.03 536G 38
pool7 13 184M 0.03 536G 47
pool8 14 200M 0.04 536G 51
pool9 15 200M 0.04 536G 51
pool10 16 248M 0.04 536G 63
pool11 17 232M 0.04 536G 59
pool12 18 264M 0.05 536G 67
查看存储池的pg个数
[root@lab8106 ~]# ceph osd dump|grep pool
pool 6 'rbd' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 132 flags hashpspool stripe_width 0
pool 7 'pool1' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1 pgp_num 1 last_change 134 flags hashpspool stripe_width 0
pool 8 'pool2' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1 pgp_num 1 last_change 136 flags hashpspool stripe_width 0
pool 9 'pool3' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1 pgp_num 1 last_change 138 flags hashpspool stripe_width 0
pool 10 'pool4' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1 pgp_num 1 last_change 140 flags hashpspool stripe_width 0
pool 11 'pool5' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1 pgp_num 1 last_change 142 flags hashpspool stripe_width 0
pool 12 'pool6' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1 pgp_num 1 last_change 144 flags hashpspool stripe_width 0
pool 13 'pool7' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1 pgp_num 1 last_change 146 flags hashpspool stripe_width 0
pool 14 'pool8' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1 pgp_num 1 last_change 148 flags hashpspool stripe_width 0
pool 15 'pool9' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1 pgp_num 1 last_change 150 flags hashpspool stripe_width 0
pool 16 'pool10' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 100 pgp_num 100 last_change 152 flags hashpspool stripe_width 0
pool 17 'pool11' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 100 pgp_num 100 last_change 154 flags hashpspool stripe_width 0
pool 18 'pool12' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 200 pgp_num 200 last_change 156 flags hashpspool stripe_width 0
我们看下这个是怎么得到的
pool rbd objects per pg (1912) is more than 50.3158 times cluster average (38)
rbd objects_per_pg = 15296 / 8 = 1912
objects_per_pg = 15920 /417 ≈ 38
50.3158 = rbd objects_per_pg / objects_per_pg = 1912 / 38
也就是出现其他pool的对象太少,而这个pg少,对象多,就会提示这个了,我们看下代码里面的判断
https://github.com/ceph/ceph/blob/master/src/mon/PGMonitor.cc
int average_objects_per_pg = pg_map.pg_sum.stats.sum.num_objects / pg_map.pg_stat.size();
if (average_objects_per_pg > 0 &&
pg_map.pg_sum.stats.sum.num_objects >= g_conf->mon_pg_warn_min_objects &&
p->second.stats.sum.num_objects >= g_conf->mon_pg_warn_min_pool_objects) {
int objects_per_pg = p->second.stats.sum.num_objects / pi->get_pg_num();
float ratio = (float)objects_per_pg / (float)average_objects_per_pg;
if (g_conf->mon_pg_warn_max_object_skew > 0 &&
ratio > g_conf->mon_pg_warn_max_object_skew) {
ostringstream ss;
ss << "pool " << name << " has many more objects per pg than average (too few pgs?)";
summary.push_back(make_pair(HEALTH_WARN, ss.str()));
if (detail) {
ostringstream ss;
ss << "pool " << name << " objects per pg ("
<< objects_per_pg << ") is more than " << ratio << " times cluster average ("
<< average_objects_per_pg << ")";
detail->push_back(make_pair(HEALTH_WARN, ss.str()));
}
主要下面的几个限制条件
mon_pg_warn_min_objects = 10000 //总的对象超过10000
mon_pg_warn_min_pool_objects = 1000 //存储池对象超过1000
mon_pg_warn_max_object_skew = 10 //就是上面的存储池的平均对象与所有pg的平均值的倍数关系
解决问题
有三个方法解决这个警告的提示:
删除无用的存储池
如果集群中有一些不用的存储池,并且相对的pg数目还比较高,那么可以删除一些这样的存储池,从而降低mon_pg_warn_max_object_skew这个值,警告就会没有了增加提示的pool的pg数目
有可能的情况就是,这个存储池的pg数目从一开始就不够,增加pg和pgp数目,同样降低了mon_pg_warn_max_object_skew这个值了增加
mon_pg_warn_max_object_skew的参数值
如果集群里面已经有足够多的pg了,再增加pg会不稳定,如果想去掉这个警告,就可以增加这个参数值,默认为10
总结
这个警告是比较的是存储池中的对象数目与整个集群的pg的平均对象数目的偏差,如果偏差太大就会发出警告
检查的步骤:
ceph health detail
ceph df
ceph osd dump | grep pool
mon_pg_warn_max_object_skew = 10.0
((objects/pg_num) in the affected pool)/(objects/pg_num in the entire system) >= 10.0 警告就会出现
变更记录
| Why | Who | When |
|---|---|---|
| 创建 | 武汉-运维-磨渣 | 2016-07-27 |
Ceph 状态报警告 pool rbd has many more objects per pg than average (too few pgs?)的更多相关文章
- 理解 OpenStack + Ceph (4):Ceph 的基础数据结构 [Pool, Image, Snapshot, Clone]
本系列文章会深入研究 Ceph 以及 Ceph 和 OpenStack 的集成: (1)安装和部署 (2)Ceph RBD 接口和工具 (3)Ceph 物理和逻辑结构 (4)Ceph 的基础数据结构 ...
- Ceph 的基础数据结构 [Pool, Image, Snapshot, Clone]
原文链接:http://www.cnblogs.com/sammyliu/p/4843812.html?utm_source=tuicool&utm_medium=referral 1 Poo ...
- Kafka生产者案例报警告SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
一.SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". 这个报警告的原因简单来说时因为slf4j的版本 ...
- mac go环境报警告
go get -u github.com/beego/bee 报警告: # github.com/beego/beeld: warning: text-based stub file /System/ ...
- IDEA maven 项目报警告解决(自己的maven配置记录)
IDEA maven 项目报警告解决 应该是JDK版本太低 虽然你装的高但是默认使用maven 默认的 这里要配一下JDK版本 理解不深入只为 自己记录使用 1 配置 仓库为阿里云 配置本地储存j ...
- 写webpack插件报警告Tapable.plugin is deprecated. Use new API on .hooks instead解决方案,webpack4插件新写法
最近写了个小插件报了个警告,然后去百度了一下,全都给我说extract-text-webpack-plugin这个插件有问题要更新,我也是无语了,这个插件我用都没用,百度翻了下齐刷刷全是这个答案,搞得 ...
- ceph集群jewel版本 rbd 块map 报错-故障排查
测试信息如下: [root@ceph_1 ~]# ceph osd pool lsrbdchy_123swimmingpool #新建rbd 块: rbd create swimmingpool/ba ...
- ceph 005 赋权补充 rbd块映射
我的ceph版本 [root@serverc ~]# ceph -v ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037 ...
- ios8调用相机报警告: Snapshotting a view that has not been rendered results in an empty snapshot. Ensure you(转)
我这也报了这个警告,但按他的方法并没有起作用,把写到这个地方看是否其他人用的到 错误代码:Snapshotting a view that has not been rendered results ...
随机推荐
- 用 C 语言游戏编程开发!果然最担心的事又发生了!
30了.我要怎么办,老了.人就像一头小毛驴,方向都是牵着的人定的. 这个项目从去年开始的,一个手机游戏,当时接这个项目的时候其实没有太多考虑,我一向都喜欢打肿脸充胖子的,好面子,人家找上门来,不能不给 ...
- kibana-安装-通过docker
拉取镜像 docker pull kibana:7.9.1 创建用户自定义网络 docker network create esnet 运行Kibana docker run --name ...
- kubernetes:用label让pod在指定的node上运行(kubernetes1.18.3)
一,为什么要为node指定label? 通常scheduler会把pod调度到所有可用的Node,有的情况下我们希望能把 Pod 部署到指定的 Node, 例如: 有的Node上配备了速度更快的SSD ...
- PyCharm搭配github错误处理
ssh -T git@github.com 验证时 报错Could not open a connection to your authentication agent. 删除前面生成的.ssh文件 ...
- 老板,来五道misc
开个杂项坑 穿越时空的思念 音频隐写,audacity分离音道,摩斯密码一把锁 金三胖 是个gif,明显能感觉到里面藏有flag stegsolve逐帧分离太low了,直接用脚本一把梭 import ...
- vue-cli3使用jq
第一步安装 npm install jquery --save 第二部配置vue.config.js, 没有这个文件就创建 主要是框框出来的那些: 忽略我配置的另一个uglifyjs-webpack- ...
- PHP代码审计04之strpos函数使用不当
前言 根据红日安全写的文章,学习PHP代码审计的第四节内容,题目均来自PHP SECURITY CALENDAR 2017,讲完题目会用一个实例来加深巩固,这是之前写的,有兴趣可以去看看: PHP代码 ...
- python中使用with操作文件,为什么不需要手动关闭?
python中的with关键字,它是用来启动一个对象的上下文管理器的.它的原理是,当我们使用with去通过open打开文件的时候,它会触发文件对象的上下文管理器, 当with中的代码结束完成之后,去自 ...
- codeforces 1442 A. Extreme Subtraction(贪心,构造)
传送门 样例(x): 8 15 16 17 19 27 36 29 33 结果(t1) 15 15 16 18 26 35 28 32 思路:我们可以把最左端和最右端当做两个水龙头,每个水龙头流量的上 ...
- [Luogu P4173]残缺的字符串 ( 数论 FFT)
题面 传送门:洛咕 Solution 这题我写得脑壳疼,我好菜啊 好吧,我们来说正题. 这题.....emmmmmmm 显然KMP类的字符串神仙算法在这里没法用了. 那咋搞啊(或者说这题和数学有半毛钱 ...