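An Index Merge Optimization

SQL optimization on port 4040 of mobile Weibo

Symptoms

The instance on this port had been lagging persistently. Running pt-query-digest against the slow log showed that a single count(*) statement was mainly responsible: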

# 13.5s user time, 40ms system time, 21.58M rss, 156.84M vsz

# Current date: Fri Apr  1 17:43:05 2016

# Hostname: naga64

# Files: /data1/mysql4040/slow.log

# Overall: 45.87k total, 53 unique, 1.01 QPS, 9.05x concurrency __________

# Time range: 2016-04-01 05:05:02 to 17:43:05

# Attribute          total     min     max     avg     95%  stddev  median

# ============     ======= ======= ======= ======= ======= ======= =======

# Exec time        411622s      1s    238s      9s     29s     13s      6s

# Lock time            70s       0      4s     2ms   138us    57ms    76us

# Rows sent         12.66M       0   1.31M  289.43   19.46  13.90k    0.99

# Rows examine     310.43M       0   5.40M   6.93k  31.59k  65.56k    0.99

# Query size         5.89M      17   4.14k  134.67  563.87  150.53   76.28

# Profile

# Rank Query ID           Response time     Calls R/Call  Apdx V/M   Item

# ==== ================== ================= ===== ======= ==== ===== =====

#    1 0xE74340EE1DEFEC99 317229.0380 77.1% 34627  9.1613 0.11 12.60 SELECT user_rec_?

#    2 0xB9959C570826EFA4  72164.9508 17.5%  3746 19.2645 0.15 36.13 SELECT app

#    3 0xECEF2B7CA2BE445C   7136.5824  1.7%  3581  1.9929 0.53  2.75 SELECT user_rec_?

#    4 0x7B9529D6435F23B3   3465.0381  0.8%   137 25.2922 0.16 33.53 SELECT app

#    5 0x270C8D7D3EC37561   2209.2050  0.5%  1087  2.0324 0.51  2.34 SELECT apk

#    6 0x6AF45A776EDFF7A9   1921.4956  0.5%   905  2.1232 0.50  2.63 SELECT apk

#    7 0x67DC38C9C5F7EEBB   1816.0314  0.4%   108 16.8151 0.08  7.32 SELECT ios_apk

#    8 0x5F7E7D2BFA8FB79B   1388.2303  0.3%   518  2.6800 0.49 10.45 SELECT apk cooper

#    9 0x79F2C2072394C9BB   1005.4780  0.2%   656  1.5327 0.59  1.64 SELECT user_rec_?b

#   10 0x3229403E99601A69    632.3939  0.2%    81  7.8073 0.07  1.07 SELECT ios_app

#   11 0x83D4C6B0BB535E12    506.5923  0.1%    15 33.7728 0.10 11.12 SELECT apk

#   13 0x2F002402DBB98EE9    226.3586  0.1%    73  3.1008 0.42  4.04 SELECT app

#   14 0x992F97D6C4D52DF6    219.2329  0.1%    44  4.9826 0.19  2.00 SHOW STATUS

#   16 0x791C5370A1021F19    140.2855  0.0%    30  4.6762 0.25  1.87 SHOW SLAVE STATUS

#   18 0x2F27EBCFABB23992    110.6802  0.0%    36  3.0744 0.40  2.47 SELECT app_recommend app

#   19 0x980736573219087A    108.8593  0.0%    15  7.2573 0.00  0.45 SELECT ios_app_free ios_app

#   20 0x58492BB2C89253D8     71.5322  0.0%    10  7.1532 0.05  0.57 SELECT ios_app_free ios_app

#   21 0x0EB86D9E4630253A     61.5251  0.0%    27  2.2787 0.52  0.33 SELECT ios_app_recommend ios_app

#   22 0x398799E91C3C2AAD     59.5222  0.0%    12  4.9602 0.33  3.46 SELECT apk cooper

#   24 0x53148D850C2E022E     45.0953  0.0%    11  4.0996 0.23  1.04 SELECT ios_app

#   25 0x07387FA6467B3DB9     34.6657  0.0%    17  2.0392 0.50  0.39 SELECT app_recommend app

#   26 0xBD799CC975081065     31.1719  0.0%    16  1.9482 0.47  0.51 SELECT app

#   27 0xB7F06103A7ADA5C0     30.4686  0.0%    13  2.3437 0.42  0.52 SELECT user_rec_?d

#   30 0x188747BC3CB9728B     19.8929  0.0%    12  1.6577 0.58  0.22 SELECT app_recommend app

# MISC 0xMISC                987.4775  0.2%    92 10.7335   NS   0.0 <29 ITEMS>

# Query 1: 0.76 QPS, 6.97x concurrency, ID 0xE74340EE1DEFEC99 at byte 2753434

# This item is included in the report because it matches --limit.

# Scores: Apdex = 0.11 [1.0], V/M = 12.60

# Query_time sparkline: |      ^_|

# Time range: 2016-04-01 05:05:02 to 17:43:04

# Attribute    pct   total     min     max     avg     95%  stddev  median

# ============ === ======= ======= ======= ======= ======= ======= =======

# Count         75   34627

# Exec time     77 317229s      1s    174s      9s     23s     11s      7s

# Lock time     55     39s    46us      3s     1ms   119us    46ms    73us

# Rows sent      0  31.80k       0       1    0.94    0.99    0.23    0.99

# Rows examine   0  22.97k       0       5    0.68    0.99    0.55    0.99

# Query size    44   2.61M      76      79   79.00   76.28    0.02   76.28

# String:

# Databases    apps

# Hosts

# Users        apps_r

# Query_time distribution

#   1us

#  10us

# 100us

#   1ms

#  10ms

# 100ms

#    1s  ################################################################

#  10s+  #######################

# Tables

#    SHOW TABLE STATUS FROM `apps` LIKE 'user_rec_07'\G

#    SHOW CREATE TABLE `apps`.`user_rec_07`\G

# EXPLAIN /*!50100 PARTITIONS*/

select count(*) as total from user_rec_07 where type=5 and weiboId=''\G

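Let's look at the table structure and the EXPLAIN output for this statement to see whether it can be optimized. (The examples below mix user_rec_07 and user_rec_45; the digest groups them as user_rec_?, so they are presumably numbered tables with the same schema.)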

localhost.apps>show create table user_rec_45;

+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

| Table       | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |

+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

| user_rec_45 | CREATE TABLE `user_rec_45` (

  `id` int(11) NOT NULL AUTO_INCREMENT,

  `softId` int(11) NOT NULL DEFAULT '',

  `weiboId` bigint(20) NOT NULL DEFAULT '',

  `type` tinyint(4) NOT NULL DEFAULT '' COMMENT '0???',

  `content` varchar(512) NOT NULL DEFAULT '' COMMENT '???????url??????????????',

  `ctime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,

  PRIMARY KEY (`id`),

  KEY `idx_softId_weiboId` (`softId`,`weiboId`),

  KEY `idx_weiboId` (`weiboId`),

  KEY `idx_type` (`type`)

) ENGINE=TokuDB AUTO_INCREMENT=3252283 DEFAULT CHARSET=utf8 ROW_FORMAT=TOKUDB_LZMA |

+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

1 row in set (0.00 sec)

localhost.apps>explain select count(*) as total from user_rec_07 where type=5 and weiboId=1934676487\G;

*************************** 1. row ***************************

           id: 1

  select_type: SIMPLE

        table: user_rec_07

         type: index_merge

possible_keys: idx_weiboId,idx_type

          key: idx_weiboId,idx_type

      key_len: 8,1

          ref: NULL

         rows: 1

        Extra: Using intersect(idx_weiboId,idx_type); Using where; Using index

1 row in set (0.01 sec)

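Both the type and Extra columns show that the query does use indexes, so why is it still so slow?

PS: When I first saw the TokuDB engine, my instinct was that TokuDB handles count(*) poorly. Testing showed that the indexes were the real problem.

Reasoning

The WHERE clause of this query is quite simple: just two equality predicates. My habit is to first check how large the result set of each predicate is on its own.

First, the EXPLAIN for weiboId: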

localhost.apps>explain select count(*) as total from user_rec_07 where weiboId=1934676487\G;

*************************** 1. row ***************************

           id: 1

  select_type: SIMPLE

        table: user_rec_07

         type: ref

possible_keys: idx_weiboId

          key: idx_weiboId

      key_len: 8

          ref: const

         rows: 18

        Extra: Using index

1 row in set (0.00 sec)

Next, the EXPLAIN for type:

localhost.apps>explain select count(*) as total from user_rec_07 where type=5\G;

*************************** 1. row ***************************

           id: 1

  select_type: SIMPLE

        table: user_rec_07

         type: ref

possible_keys: idx_type

          key: idx_type

      key_len: 1

          ref: const

         rows: 114834

        Extra: Using index

1 row in set (0.00 sec)

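weiboId is clearly highly selective, whereas type is far worse (about 115,000 rows to scan). In theory, using weiboId alone as the index means scanning only about 18 rows, so the query should finish within 5 ms.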
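As a rough illustration of that gap, the two EXPLAIN row estimates above can be compared directly. This is a minimal sketch in Python; the variable names are made up for the example, and the numbers are optimizer estimates rather than exact counts.

```python
# Row estimates copied from the two EXPLAIN outputs above.
rows_weibo_id = 18       # estimate for the weiboId equality predicate
rows_type = 114_834      # estimate for type = 5

# How many times more index entries the type predicate touches.
ratio = rows_type / rows_weibo_id
print(f"type=5 touches roughly {ratio:,.0f}x as many entries as the weiboId lookup")
```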

Let's look at the execution time of each of the three statements.

Both predicates:

localhost.apps>select count(*) as total from user_rec_45 where type=5 and weiboId='';

+-------+

| total |

+-------+

|     1 |

+-------+

1 row in set (0.57 sec)

weiboId only:

localhost.apps>select count(*) as total from user_rec_45 where weiboId=''\G;

*************************** 1. row ***************************

total: 9

1 row in set (0.00 sec)

type only:

localhost.apps>select count(*) as total from user_rec_45 where type=5\G;

*************************** 1. row ***************************

total: 103838

1 row in set (0.19 sec)

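The results are clear: the two-predicate query is the slowest at 570 ms, the weiboId-only query is reported as 0.00 sec (under 10 ms), and the type-only query takes 190 ms.

With these results in hand, we can optimize the indexes.

Optimization

The idea is straightforward: add a composite index that covers both predicates. The old single-column idx_weiboId can be dropped in the same statement, because the new index has weiboId as its leftmost column and serves the same lookups. The SQL is: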

localhost.apps>alter table user_rec_45 drop index idx_weiboID,add index idx_weiboID_type(weiboID,type);

Compare the query before and after adding the index.

Before:

localhost.apps>select count(*) as total from user_rec_45 where type=5 and weiboId='';

+-------+

| total |

+-------+

|     1 |

+-------+

1 row in set (0.57 sec)

After:

localhost.apps>select count(*) as total from user_rec_45 where type=5 and weiboId='';

+-------+

| total |

+-------+

|     1 |

+-------+

1 row in set (0.00 sec)

The improvement is obvious.
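The effect of a two-column index can be reproduced on a small scale. The sketch below uses Python's built-in sqlite3 module as a stand-in, so it is not MySQL or TokuDB, and SQLite's planner has no index merge. The table name user_rec, the user id 42, and the generated rows are all assumptions made for the example. It only shows that an index on (weiboId, type) lets the count be answered from a single index range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_rec ("
    " id INTEGER PRIMARY KEY,"
    " softId INTEGER NOT NULL,"
    " weiboId INTEGER NOT NULL,"
    " type INTEGER NOT NULL)"
)

# Synthetic data: 20 rows for the hypothetical user 42, one of them with type 5;
# about half of all rows have type 5.
data = [
    (
        i,
        42 if i % 1000 == 0 else 1_000_000 + i,
        5 if (i % 2 == 1 or i == 5000) else 0,
    )
    for i in range(1, 20_001)
]
conn.executemany(
    "INSERT INTO user_rec (softId, weiboId, type) VALUES (?, ?, ?)", data
)

# Composite index with the more selective column first.
conn.execute("CREATE INDEX idx_weiboId_type ON user_rec (weiboId, type)")

query = "SELECT count(*) FROM user_rec WHERE type = 5 AND weiboId = 42"

# Column 3 of EXPLAIN QUERY PLAN holds the human-readable plan detail.
plan = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
total = conn.execute(query).fetchone()[0]

print(plan)
print("total =", total)
```

The plan text should name idx_weiboId_type, which indicates that both equality predicates are resolved inside one index.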

The server load tells the same story.

Before the change:

:: PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle

:: PM  all   96.00    0.00    3.38    0.00    0.00    0.62    0.00    0.00    0.00

:: PM       91.00    0.00    5.00    0.00    0.00    4.00    0.00    0.00    0.00

:: PM       97.98    0.00    2.02    0.00    0.00    0.00    0.00    0.00    0.00

:: PM       98.00    0.00    2.00    0.00    0.00    0.00    0.00    0.00    0.00

:: PM       96.00    0.00    4.00    0.00    0.00    0.00    0.00    0.00    0.00

:: PM       95.96    0.00    3.03    0.00    0.00    1.01    0.00    0.00    0.00

:: PM       96.00    0.00    4.00    0.00    0.00    0.00    0.00    0.00    0.00

:: PM       97.00    0.00    3.00    0.00    0.00    0.00    0.00    0.00    0.00

:: PM       97.00    0.00    3.00    0.00    0.00    0.00    0.00    0.00    0.00

After the change:

:: PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle

:: PM  all   24.25    0.00    1.12    3.50    0.00    0.12    0.00    0.00   71.00

:: PM       16.16    0.00    2.02   18.18    0.00    1.01    0.00    0.00   62.63

:: PM        3.03    0.00    0.00    6.06    0.00    0.00    0.00    0.00   90.91

:: PM       90.00    0.00    0.00    1.00    0.00    0.00    0.00    0.00    9.00

:: PM       84.00    0.00    6.00    2.00    0.00    0.00    0.00    0.00    8.00

:: PM        0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

:: PM        0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

:: PM        0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

:: PM        0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

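But why? Attentive readers will have noticed that MySQL was already using both indexes before the change. It did so through index merge, which combines two separate indexes. So why is the gap so large?

Analysis

First, the definition of index merge, which here takes the form of an intersection, intersect(index1, index2):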

index_merge: This join type indicates that the Index Merge optimization is used. In this case, the key column in the output row contains a list of indexes used, and key_len contains a list of the longest key parts for the indexes used.

The Index Merge method is used to retrieve rows with several range scans and to merge their results into one. The merge can produce unions, intersections, or unions-of-intersections of its underlying scans. This access method merges index scans from a single table; it does not merge scans across multiple tables.

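As the description shows, index merge filters on each of the two indexes independently, combines the filtered results, and then returns the result set.

In our case the type column filters poorly, so its scan still returns a large number of rows. That causes a lot of disk reads and very high CPU load, which directly produced the delay.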
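The cost difference can be sketched with a toy model. The Python code below is not MySQL's or TokuDB's actual implementation; it assumes that each single-column index yields its matching row ids in sorted order and that the intersection walks both lists in step. The table contents, the user id 42, and the function name intersect_count are invented for the illustration.

```python
from bisect import bisect_left

# Synthetic table: 200,000 rows of (id, weiboId, type). All values are made up.
rows = []
for row_id in range(1, 200_001):
    # 18 rows belong to the hypothetical user 42; every other weiboId is unique.
    is_user = row_id % 10_000 == 0 and row_id <= 180_000
    weibo_id = 42 if is_user else 1_000_000 + row_id
    # About half of all rows have type 5, but only one of user 42's rows does.
    row_type = 5 if (row_id % 2 == 1 or row_id == 90_000) else 0
    rows.append((row_id, weibo_id, row_type))

# Single-column indexes: for one key value, the matching row ids in id order.
ids_weibo = [r[0] for r in rows if r[1] == 42]
ids_type = [r[0] for r in rows if r[2] == 5]


def intersect_count(a, b):
    """Merge two sorted id lists; return (matches, index entries consumed)."""
    i = j = matches = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            matches += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches, i + j


merge_total, merge_read = intersect_count(ids_weibo, ids_type)

# Composite index on (weiboId, type): one sorted list, one contiguous range.
composite = sorted((r[1], r[2], r[0]) for r in rows)
lo = bisect_left(composite, (42, 5))
hi = bisect_left(composite, (42, 6))
composite_total = hi - lo
composite_read = hi - lo

print(f"intersect: count={merge_total}, entries consumed={merge_read}")
print(f"composite: count={composite_total}, entries read={composite_read}")
```

In this model the intersection has to walk through tens of thousands of type entries to confirm a single match, while the composite lookup reads only the matching entry. The same asymmetry is what the 18-row and 114,834-row estimates suggest for the real table.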

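PS: In this case a two-column index is not strictly necessary. Dropping the type index and using the weiboId index alone would also work, since that index already narrows the result set to a handful of rows.

Alternatively, turn off index merge intersection: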

SET [GLOBAL|SESSION] optimizer_switch="index_merge_intersection=off";
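Before changing the switch it can help to check its current state. The value of @@optimizer_switch is a comma-separated list of flag=on/off pairs, which is easy to inspect in a script. The following is a minimal sketch: the helper names are hypothetical, and the sample string is an abbreviated list containing only the index merge flags, not output captured from the server in this article.

```python
def parse_optimizer_switch(value):
    """Turn 'a=on,b=off' into {'a': 'on', 'b': 'off'}."""
    flags = {}
    for item in value.split(","):
        name, _, state = item.strip().partition("=")
        if name:
            flags[name] = state
    return flags


def disable_intersection_sql(scope="SESSION"):
    """Build the SET statement for the given scope (SESSION or GLOBAL)."""
    if scope not in ("SESSION", "GLOBAL"):
        raise ValueError("scope must be SESSION or GLOBAL")
    return f"SET {scope} optimizer_switch='index_merge_intersection=off';"


# Abbreviated sample of what SELECT @@optimizer_switch might return.
sample = (
    "index_merge=on,index_merge_union=on,"
    "index_merge_sort_union=on,index_merge_intersection=on"
)

flags = parse_optimizer_switch(sample)
if flags.get("index_merge_intersection") == "on":
    print(disable_intersection_sql("SESSION"))
```

Trying the change at SESSION scope first keeps the experiment local to one connection; GLOBAL affects connections opened afterwards.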

Disk I/O throughput before and after the optimization.

Before:

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--

usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw

               |3842k 3440k|         |         |   

              |  26M 2593k|  69k   47k|         |  31k 

              |  26M 3258k|  79k   47k|         |  27k 

              |  24M   12M|  56k   37k|         |  21k 

               |  27M 2523k|  56k   20k|         |  16k 

              |  25M 2199k| 102k   43k|        

After:

 ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--

usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw

               |2935k 3362k|         |         |   

               |4313k 4330k| 129k  353k|         |  

               |4242k 3424k| 138k  392k|         |  11k 

               |4441k 3840k| 169k  397k|         |  

               |3720k 9161k| 135k  398k|         |  

               |4567k 3569k| 139k  368k|         |  

               |3972k 4199k| 135k  341k|         |
