Who wrote the oversized entries in the MySQL error log? (Tracking down a pt-stalk bug)

【Problem】

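While reviewing MySQL error log files recently, I found that on many servers the log contained a large number of entries like the one below. Each entry is very long (about 200 KB), yet nothing in it points to an obvious error.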

A test server showed the same symptom. Working out why these entries were written, and who wrote them, took a few detours.

Status information:
Current dir:
Running threads: 2452 Stack size: 262144
Current locks:
lock: 0x7f783f5233f0:
Key caches:
default
Buffer_size: 8388608
Block_size: 1024
Division_limit: 100
Age_limit: 300
blocks used: 10
not flushed: 0
w_requests: 6619
writes: 1
r_requests: 275574
handler status:
read_key: 32241480828
read_next: 451035381896
read_rnd 149361175
read_first: 1090473
write: 4838429521
delete 12155820
update: 3331297842

【Analysis】

1. The official documentation says that mysqld writes a report like this when it receives the SIGHUP signal:

On Unix, signals can be sent to processes. mysqld responds to signals sent to it as follows:

SIGHUP causes the server to reload the grant tables and to flush tables, logs, the thread cache, and the host cache. These actions are like various forms of the FLUSH statement. The server also writes a status report to the error log that has this format:

https://dev.mysql.com/doc/refman/5.6/en/server-signal-response.html
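To gauge how often the server has written such a report, one can count the report headers in the error log. The following is a minimal sketch, assuming every report begins with a line that reads exactly `Status information:` (as in the excerpt above). The function name `count_status_reports` is hypothetical.

```shell
#!/bin/sh
# Count status reports in a MySQL error log.
# Assumption: each report starts with a line that is exactly "Status information:".
count_status_reports() {
    # $1 = path to the error log.
    # -c prints the number of matching lines; -x matches whole lines only.
    grep -c -x 'Status information:' "$1"
}
```

Calling `count_status_reports` with the path of the error log prints the number of reports found in that file. Running it before and after a suspected trigger shows whether that trigger added a report.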

2. Is some other program sending signals to the mysqld process? I monitored the kill system call with a systemtap script:

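# Per-thread state shared between the entry and return probes.
global target
global signal

# Print the column header once at startup.
probe begin
{
    printf("%-6s %-12s %-5s %-6s %6s\n", "FROM", "COMMAND", "SIG", "TO", "RESULT");
}

# On entry to kill(2), remember the target pid and the signal number.
probe nd_syscall.kill
{
    target[tid()] = uint_arg(1);
    signal[tid()] = uint_arg(2);
}

# On return, print one row per call and clear the saved state.
probe nd_syscall.kill.return
{
    if (target[tid()] != 0) {
        printf("%-6d %-12s %-5d %-6d %6d\n", pid(), execname(),
               signal[tid()], target[tid()], int_arg(1));
        delete target[tid()];
        delete signal[tid()];
    }
}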

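Testing with the following command confirmed that a report is written to the error log:

kill -SIGHUP 12455

In the systemtap output below, 12455 is the mysqld process. It was not terminated: bash sent it signal 1, which is SIGHUP.

FROM   COMMAND      SIG   TO     RESULT
17010  who          0     12153  1340429600
36681  bash         1     12455     642

However, when the problem later recurred on the test server, no SIGHUP was captured.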
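When the script runs for a long time, its output can be filtered for one signal and one target pid. This is a small sketch, assuming the five whitespace-separated columns printed by the script (FROM, COMMAND, SIG, TO, RESULT) and command names without spaces. The function name `signals_to_pid` is hypothetical.

```shell
#!/bin/sh
# Keep only the rows whose SIG and TO columns match the given values.
# Assumption: columns are FROM COMMAND SIG TO RESULT, separated by whitespace.
signals_to_pid() {
    # $1 = signal number, $2 = target pid; rows are read from stdin.
    awk -v sig="$1" -v pid="$2" '$3 == sig && $4 == pid'
}
```

For example, piping the captured output through `signals_to_pid 1 12455` leaves only the rows where SIGHUP was sent to pid 12455. An empty result during a recurrence is the evidence that the report was not triggered by a signal.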

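3. So kill does not appear to be the cause. Next I attached gdb to the mysqld process and set breakpoints on the three entry functions for error log writes: sql_print_error, sql_print_warning, and sql_print_information.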

When the problem recurred, however, execution did not stop at any of these breakpoints.

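4. Is there another code path that writes to the error log? The source code has the answer: the mysql_print_status function prints its report with printf, bypassing the sql_print_* functions.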

void mysql_print_status()
{
  char current_dir[FN_REFLEN];
  STATUS_VAR current_global_status_var;

  printf("\nStatus information:\n\n");
  (void) my_getwd(current_dir, sizeof(current_dir),MYF(0));
  printf("Current dir: %s\n", current_dir);
  printf("Running threads: %u Stack size: %ld\n",
         Global_THD_manager::get_instance()->get_thd_count(),
         (long) my_thread_stack_size);
  …
  puts("");
  fflush(stdout);
}

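5. I attached gdb to mysqld again and set a breakpoint on mysql_print_status. When the problem recurred, a thread stopped at the breakpoint. Comparing ps output across several occurrences showed that mysql_print_status was being called while the pt-stalk tool was running.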

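6. The stack trace shows that dispatch_command calls mysql_print_status. The logic is shown below: when command is COM_DEBUG and the connection passes the SUPER_ACL check, mysql_print_status is executed.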

case COM_DEBUG:
  thd->status_var.com_other++;
  if (check_global_access(thd, SUPER_ACL))
    break; /* purecov: inspected */
  mysql_print_status();
  query_logger.general_log_print(thd, command, NullS);
  my_eof(thd);
  break;

7. The relevant pt-stalk code:

if [ "$mysql_error_log" -a ! "$OPT_MYSQL_ONLY" ]; then
   log "The MySQL error log seems to be $mysql_error_log"
   tail -f "$mysql_error_log" >"$d/$p-log_error" &
   tail_error_log_pid=$!
   # This call makes the server dump its status report into the error log.
   $CMD_MYSQLADMIN $EXT_ARGV debug
else
   log "Could not find the MySQL error log"
fi

pt-stalk calls mysqladmin with the debug command, which the mysqladmin help text describes as follows:

debug Instruct server to write debug information to log

8. A search of the Percona site turned up a matching bug report. The bug is not fixed yet; the fix is planned for the next release, 3.0.13.

https://jira.percona.com/browse/PT-1340

【Solution】

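Once the cause was identified, the fix itself was simple: remove the debug argument from the line $CMD_MYSQLADMIN $EXT_ARGV debug in the pt-stalk script. Testing confirmed that this works.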
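The edit can also be scripted when many servers carry the same pt-stalk copy. The sketch below assumes the call appears exactly as `$CMD_MYSQLADMIN $EXT_ARGV debug` at the end of a line, as in the excerpt in step 7; other pt-stalk versions may differ. The function name `remove_debug_arg` is hypothetical.

```shell
#!/bin/sh
# Print a copy of a pt-stalk script with the trailing "debug" argument removed
# from the mysqladmin call.
# Assumption: the call is written as `$CMD_MYSQLADMIN $EXT_ARGV debug`.
remove_debug_arg() {
    # $1 = path to the pt-stalk script; the patched text goes to stdout
    # and the input file is left untouched.
    sed 's/\(\$CMD_MYSQLADMIN \$EXT_ARGV\) debug[[:space:]]*$/\1/' "$1"
}
```

Writing the result to a new file and comparing it with the original using diff before replacing anything keeps the change easy to review and to revert.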
