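Parsing the MySQL slow query log

Our MySQL slow query log had grown too large, so I needed to work out which slow queries it actually contained.

MySQL can record every slow query on its own; the remaining problem is deduplicating the SQL statements in the log file.

I spent a long time thinking about how to strip the query values out of the SQL so that statements could be grouped and sorted, and kept missing the point. Then I found that MySQL ships with mysqldumpslow, a tool for parsing the slow query log.

Its options are: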

Option Name	Description
-a	Do not abstract all numbers to N and strings to 'S'
-n	Abstract numbers with at least the specified digits
--debug	Write debugging information
-g	Only consider statements that match the pattern
--help	Display help message and exit
-h	Host name of the server in the log file name
-i	Name of the server instance
-l	Do not subtract lock time from total time
-r	Reverse the sort order
-s	How to sort output
-t	Display only first num queries
--verbose	Verbose mode

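If -a is given, the query parameters in the SQL are not replaced, so statements of the same shape that differ only in their literal values are counted as separate statements.

The -a output is therefore useful only as a reference; it still reports many duplicate statements.

Below is the modified script. When -a is not used, it prints the slowest statement of each group as an example.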

#!/usr/bin/perl

# Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU Library General Public
# License as published by the Free Software Foundation; version 2
# of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# Library General Public License for more details.
#
# You should have received a copy of the GNU Library General Public
# License along with this library; if not, write to the Free
# Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
# MA 02110-1301, USA

# mysqldumpslow - parse and summarize the MySQL slow query log

# Original version by Tim Bunce, sometime in 2000.
# Further changes by Tim Bunce, 8th March 2001.
# Handling of strings with \ and double '' by Monty 11 Aug 2001.

use strict;
use Getopt::Long;

# t=time, l=lock time, r=rows
# at, al, and ar are the corresponding averages

my %opt = (
    s => 'at',
    h => '*',
);

GetOptions(\%opt,
    'v|verbose+',   # verbose
    'help+',        # write usage info
    'd|debug+',     # debug
    's=s',          # what to sort by (al, at, ar, c, t, l, r)
    'r!',           # reverse the sort order (largest last instead of first)
    't=i',          # just show the top n queries
    'a!',           # don't abstract all numbers to N and strings to 'S'
    'n=i',          # abstract numbers with at least n digits within names
    'g=s',          # grep: only consider stmts that include this string
    'h=s',          # hostname of db server for *-slow.log filename (can be wildcard)
    'i=s',          # name of server instance (if using mysql.server startup script)
    'l!',           # don't subtract lock time from total time
) or usage("bad option");

$opt{'help'} and usage();

unless (@ARGV) {
    my $defaults   = `my_print_defaults mysqld`;
    my $basedir = ($defaults =~ m/--basedir=(.*)/)[0]
        or die "Can't determine basedir from 'my_print_defaults mysqld' output: $defaults";
    warn "basedir=$basedir\n" if $opt{v};

    my $datadir = ($defaults =~ m/--datadir=(.*)/)[0];
    my $slowlog = ($defaults =~ m/--slow-query-log-file=(.*)/)[0];
    if (!$datadir or $opt{i}) {
        # determine the datadir from the instances section of /etc/my.cnf, if any
        my $instances  = `my_print_defaults instances`;
        die "Can't determine datadir from 'my_print_defaults mysqld' output: $defaults"
            unless $instances;
        my @instances = ($instances =~ m/^--(\w+)-/mg);
        die "No -i 'instance_name' specified to select among known instances: @instances.\n"
            unless $opt{i};
        die "Instance '$opt{i}' is unknown (known instances: @instances)\n"
            unless grep { $_ eq $opt{i} } @instances;
        $datadir = ($instances =~ m/--$opt{i}-datadir=(.*)/)[0]
            or die "Can't determine --$opt{i}-datadir from 'my_print_defaults instances' output: $instances";
        warn "datadir=$datadir\n" if $opt{v};
    }

    if ( -f $slowlog ) {
        @ARGV = ($slowlog);
        die "Can't find '$slowlog'\n" unless @ARGV;
    } else {
        @ARGV = <$datadir/$opt{h}-slow.log>;
        die "Can't find '$datadir/$opt{h}-slow.log'\n" unless @ARGV;
    }
}

warn "\nReading mysql slow query log from @ARGV\n";

my @pending;
my %stmt;
$/ = ";\n#";    # read entire statements using paragraph mode
while ( defined($_ = shift @pending) or defined($_ = <>) ) {
    warn "[[$_]]\n" if $opt{d};    # show raw paragraph being read

    my @chunks = split /^\/.*Version.*started with[\000-\377]*?Time.*Id.*Command.*Argument.*\n/m;
    if (@chunks > 1) {
        unshift @pending, map { length($_) ? $_ : () } @chunks;
        warn "<<".join(">>\n<<",@chunks).">>" if $opt{d};
        next;
    }

    s/^#? Time: \d{6}\s+\d+:\d+:\d+.*\n//;
    my ($user,$host,$dummy,$thread_id) = s/^#? User\@Host:\s+(\S+)\s+\@\s+(\S+)\s+\S+(\s+Id:\s+(\d+))?.*\n// ? ($1,$2,$3,$4) : ('','','','','');

    s/^# Query_time: ([0-9.]+)\s+Lock_time: ([0-9.]+)\s+Rows_sent: ([0-9.]+).*\n//;
    my ($t, $l, $r) = ($1, $2, $3);
    $t -= $l unless $opt{l};

    # remove fluff that mysqld writes to log when it (re)starts:
    s!^/.*Version.*started with:.*\n!!mg;
    s!^Tcp port: \d+  Unix socket: \S+\n!!mg;
    s!^Time.*Id.*Command.*Argument.*\n!!mg;

    s/^use \w+;\n//;            # not consistently added
    s/^SET timestamp=\d+;\n//;

    s/^[ \t]*\n//mg;            # delete blank lines
    s/^[ \t]*/  /mg;            # normalize leading whitespace
    s/\s*;\s*(#\s*)?$//;        # remove trailing semicolon(+newline-hash)

    next if $opt{g} and !m/$opt{g}/io;

    # Added: keep the original SQL in $eg before the substitutions
    # below replace its literal values.
    my $eg = $_;

    unless ($opt{a}) {
        s/\b\d+\b/N/g;
        s/\b0x[0-9A-Fa-f]+\b/N/g;
        s/''/'S'/g;
        s/""/"S"/g;
        s/(\\')//g;
        s/(\\")//g;
        s/'[^']+'/'S'/g;
        s/"[^"]+"/"S"/g;
        # -n=8: turn log_20001231 into log_NNNNNNNN
        s/([a-z_]+)(\d{$opt{n},})/$1.('N' x length($2))/ieg if $opt{n};
        # abbreviate massive "in (...)" statements and similar
        s!(([NS],){100,})!sprintf("$2,{repeated %d times}",length($1)/2)!eg;
    }

    my $s = $stmt{$_} ||= { users=>{}, hosts=>{} };
    $s->{c} += 1;
    $s->{t} += $t;
    $s->{l} += $l;
    $s->{r} += $r;

    # Added: remember the slowest original statement of this group in {eg},
    # together with its query time in {max}. On a tie the later one wins.
    if (!defined $s->{max} or $t >= $s->{max}) {
        $s->{max} = $t;
        $s->{eg}  = $eg;
    }

    $s->{users}->{$user}++ if $user;
    $s->{hosts}->{$host}++ if $host;

    warn "{{$_}}\n\n" if $opt{d};    # show processed statement string
}

foreach (keys %stmt) {
    my $v = $stmt{$_} || die;
    my ($c, $t, $l, $r) = @{ $v }{qw(c t l r)};
    $v->{at} = $t / $c;
    $v->{al} = $l / $c;
    $v->{ar} = $r / $c;
}

my @sorted = sort { $stmt{$b}->{$opt{s}} <=> $stmt{$a}->{$opt{s}} } keys %stmt;
@sorted = @sorted[0 .. $opt{t}-1] if $opt{t};
@sorted = reverse @sorted         if $opt{r};

foreach (@sorted) {
    my $v = $stmt{$_} || die;
    my ($c, $t,$at, $l,$al, $r,$ar,$eg) = @{ $v }{qw(c t at l al r ar eg)};
    my @users = keys %{$v->{users}};
    my $user  = (@users==1) ? $users[0] : sprintf "%dusers",scalar @users;
    my @hosts = keys %{$v->{hosts}};
    my $host  = (@hosts==1) ? $hosts[0] : sprintf "%dhosts",scalar @hosts;
    printf "Count: %d  Time=%.2fs (%ds)  Lock=%.2fs (%ds)  Rows=%.1f (%d), $user\@$host\n%s\n",
        $c, $at,$t, $al,$l, $ar,$r, $_;
    # Added: when -a is not used, print the slowest original statement as an example
    printf "Example:\n%s\n", $eg if not $opt{a};
    printf "\n";
}

sub usage {
    my $str= shift;
    my $text= <<HERE;
Usage: mysqldumpslow [ OPTS... ] [ LOGS... ]

Parse and summarize the MySQL slow query log. Options are

  --verbose    verbose
  --debug      debug
  --help       write this text to standard output

  -v           verbose
  -d           debug
  -s ORDER     what to sort by (al, at, ar, c, l, r, t), 'at' is default
                al: average lock time
                ar: average rows sent
                at: average query time
                 c: count
                 l: lock time
                 r: rows sent
                 t: query time
  -r           reverse the sort order (largest last instead of first)
  -t NUM       just show the top n queries
  -a           don't abstract all numbers to N and strings to 'S'
  -n NUM       abstract numbers with at least n digits within names
  -g PATTERN   grep: only consider stmts that include this string
  -h HOSTNAME  hostname of db server for *-slow.log filename (can be wildcard),
               default is '*', i.e. match all
  -i NAME      name of server instance (if using mysql.server startup script)
  -l           don't subtract lock time from total time

HERE
    if ($str) {
      print STDERR "ERROR: $str\n\n";
      print STDERR $text;
      exit 1;
    } else {
      print $text;
      exit 0;
    }
}

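As you can see, the Perl script above is simple, and adding the example is simple too. I had originally planned to do this in Python, but I was overcomplicating it: replacing numbers with N and the contents of quoted strings with S is all it takes.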
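
For comparison, the core substitutions of the script can be sketched in Python. This is only an illustrative port, not part of mysqldumpslow: the function name abstract_sql is made up here, and the -n handling and the "{repeated %d times}" abbreviation of long IN lists are left out.

```python
import re


def abstract_sql(sql: str) -> str:
    """Replace literal values so statements of the same shape compare equal.

    Hypothetical helper; mirrors the order of the Perl substitutions
    applied when -a is NOT given.
    """
    sql = re.sub(r"\b\d+\b", "N", sql)             # plain integers -> N
    sql = re.sub(r"\b0x[0-9A-Fa-f]+\b", "N", sql)  # hex literals -> N
    sql = sql.replace("''", "'S'")                 # empty single-quoted string
    sql = sql.replace('""', '"S"')                 # empty double-quoted string
    sql = sql.replace("\\'", "")                   # drop backslash-escaped quotes
    sql = sql.replace('\\"', "")
    sql = re.sub(r"'[^']+'", "'S'", sql)           # single-quoted strings -> 'S'
    sql = re.sub(r'"[^"]+"', '"S"', sql)           # double-quoted strings -> "S"
    return sql


print(abstract_sql("select * from mysql.user where 1=1"))
# select * from mysql.user where N=N
```

Two statements that differ only in their literal values then map to the same string, which is what lets the script use that string as the grouping key.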

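One remaining issue is that the order of the conditions after WHERE also affects grouping, since two statements with the same conditions in a different order produce different keys. In practice the impact is small.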
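
If the condition order ever did matter, one possible workaround is to sort the conditions before grouping. The sketch below is not something the script does; sort_where_conditions is a hypothetical helper that assumes the statement has already been abstracted, and it deliberately touches only flat lists of AND-ed conditions, leaving anything with OR, parentheses, BETWEEN or trailing clauses unchanged.

```python
import re

# Anything matching this in the WHERE tail is treated as too complex to reorder.
_COMPLEX = re.compile(
    r"\(|\b(?:or|order|group|having|limit|union|between)\b", re.IGNORECASE
)


def sort_where_conditions(sql: str) -> str:
    """Sort a flat 'a AND b AND c' WHERE clause so condition order is ignored.

    Hypothetical helper for illustration; note that it rewrites the
    WHERE and AND keywords in lower case.
    """
    parts = re.split(r"\s+where\s+", sql, maxsplit=1, flags=re.IGNORECASE)
    if len(parts) != 2:
        return sql  # no WHERE clause
    head, tail = parts
    if _COMPLEX.search(tail):
        return sql  # leave non-trivial clauses untouched
    conditions = sorted(
        c.strip() for c in re.split(r"\s+and\s+", tail, flags=re.IGNORECASE)
    )
    return head + " where " + " and ".join(conditions)


print(sort_where_conditions("select * from t where b=N and a=N"))
# select * from t where a=N and b=N
```

A real solution would need an SQL parser; this only shows why the problem exists and how narrow a safe text-based fix is.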

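Take the following case (for illustration only). Without -a, the stock tool shows only the first line. The modified script shows the first line plus whichever of the statements on lines 2, 3 and 4 took the longest, as an example for the user to analyze.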

select * from mysql.user where N=N;

select * from mysql.user where 1=1;

select * from mysql.user where 2=2;

select * from mysql.user where 3=3;
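
The bookkeeping behind that example line can be sketched as follows. This is a simplified Python illustration rather than the script itself: summarize and abstract are hypothetical names, the abstraction is reduced to numbers and quoted strings, and the query times are invented for the demonstration.

```python
import re


def abstract(sql: str) -> str:
    """Minimal abstraction: integers -> N, quoted strings -> 'S'."""
    sql = re.sub(r"\b\d+\b", "N", sql)
    return re.sub(r"'[^']*'", "'S'", sql)


def summarize(entries):
    """Group (query_time, sql) pairs by abstracted SQL.

    For each group, keep the count, the total time, and the original
    statement with the largest query time as the example.
    """
    groups = {}
    for query_time, sql in entries:
        g = groups.setdefault(
            abstract(sql), {"count": 0, "total": 0.0, "max": None, "example": None}
        )
        g["count"] += 1
        g["total"] += query_time
        if g["max"] is None or query_time >= g["max"]:
            g["max"] = query_time
            g["example"] = sql
    return groups


# Query times below are made up for the demonstration.
entries = [
    (1.2, "select * from mysql.user where 1=1;"),
    (3.5, "select * from mysql.user where 2=2;"),
    (2.0, "select * from mysql.user where 3=3;"),
]
groups = summarize(entries)
for key, g in groups.items():
    print(f"Count: {g['count']}\n{key}\nExample:\n{g['example']}\n")
```

With these invented timings, the three statements collapse into one group keyed by the abstracted form, and the statement with where 2=2 is kept as the example because it has the largest query time.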
