MySQL 5.6查询优化器新特性的“BUG” eq_range_index_dive

本文转自 http://www.imysql.cn

最近碰到一个慢SQL问题，解决过程有点小曲折，和大家分享下。 SQL本身不复杂，表结构、索引也比较简单，不过个别字段存在于多个索引中。

CREATE TABLE `pre_forum_post` (
`pid` int(10) unsigned NOT NULL,
`fid` mediumint(8) unsigned NOT NULL DEFAULT ‘0’,
`tid` mediumint(8) unsigned NOT NULL DEFAULT ‘0’,
`first` tinyint(1) NOT NULL DEFAULT ‘0’,
`author` varchar(40) NOT NULL DEFAULT ”,
`authorid` int(10) unsigned NOT NULL DEFAULT ‘0’,
`subject` varchar(80) NOT NULL DEFAULT ”,
`dateline` int(10) unsigned NOT NULL DEFAULT ‘0’,
`message` mediumtext NOT NULL,
`useip` varchar(15) NOT NULL DEFAULT ”,
`invisible` tinyint(1) NOT NULL DEFAULT ‘0’,
`anonymous` tinyint(1) NOT NULL DEFAULT ‘0’,
`usesig` tinyint(1) NOT NULL DEFAULT ‘0’,
`htmlon` tinyint(1) NOT NULL DEFAULT ‘0’,
`bbcodeoff` tinyint(1) NOT NULL DEFAULT ‘0’,
`smileyoff` tinyint(1) NOT NULL DEFAULT ‘0’,
`parseurloff` tinyint(1) NOT NULL DEFAULT ‘0’,
`attachment` tinyint(1) NOT NULL DEFAULT ‘0’,
`rate` smallint(6) NOT NULL DEFAULT ‘0’,
`ratetimes` tinyint(3) unsigned NOT NULL DEFAULT ‘0’,
`status` int(10) NOT NULL DEFAULT ‘0’,
`tags` varchar(255) NOT NULL DEFAULT ‘0’,
`comment` tinyint(1) NOT NULL DEFAULT ‘0’,
`replycredit` int(10) NOT NULL DEFAULT ‘0’,
`position` int(8) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`tid`,`position`),
UNIQUE KEY `pid` (`pid`),
KEY `fid` (`fid`),
KEY `displayorder` (`tid`,`invisible`,`dateline`),
KEY `first` (`tid`,`first`),
KEY `new_auth` (`authorid`,`invisible`,`tid`),
KEY `idx_dt` (`dateline`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

“root@localhost Fri Aug 1 11:59:56 2014 11:59:56 [test]>show table status like ‘pre_forum_post’\G
*************************** 1. row ***************************
Name: pre_forum_post
Engine: MyISAM
Version: 10
Row_format: Dynamic
Rows: 23483977
Avg_row_length: 203
Data_length: 4782024708
Max_data_length: 281474976710655
Index_length: 2466093056
Data_free: 0
Auto_increment: 1
Create_time: 2014-08-01 11:00:56
Update_time: 2014-08-01 11:08:49
Check_time: 2014-08-01 11:12:23
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment:

mysql> show index from pre_forum_post;
+—————-+————+————–+————–+————-+———–+————-+———-+——–+——+————+———+—————+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+—————-+————+————–+————–+————-+———–+————-+———-+——–+——+————+———+—————+
| pre_forum_post | 0 | PRIMARY | 1 | tid | A | 838713 | NULL | NULL | | BTREE | | |
| pre_forum_post | 0 | PRIMARY | 2 | position | A | 23483977 | NULL | NULL | | BTREE | | |
| pre_forum_post | 0 | pid | 1 | pid | A | 23483977 | NULL | NULL | | BTREE | | |
| pre_forum_post | 1 | fid | 1 | fid | A | 1470 | NULL | NULL | | BTREE | | |
| pre_forum_post | 1 | displayorder | 1 | tid | A | 838713 | NULL | NULL | | BTREE | | |
| pre_forum_post | 1 | displayorder | 2 | invisible | A | 869776 | NULL | NULL | | BTREE | | |
| pre_forum_post | 1 | displayorder | 3 | dateline | A | 23483977 | NULL | NULL | | BTREE | | |
| pre_forum_post | 1 | first | 1 | tid | A | 838713 | NULL | NULL | | BTREE | | |
| pre_forum_post | 1 | first | 2 | first | A | 1174198 | NULL | NULL | | BTREE | | |
| pre_forum_post | 1 | new_auth | 1 | authorid | A | 1806459 | NULL | NULL | | BTREE | | |
| pre_forum_post | 1 | new_auth | 2 | invisible | A | 1956998 | NULL | NULL | | BTREE | | |
| pre_forum_post | 1 | new_auth | 3 | tid | A | 11741988 | NULL | NULL | | BTREE | | |
| pre_forum_post | 1 | idx_dt | 1 | dateline | A | 23483977 | NULL | NULL | | BTREE | | |
+—————-+————+————–+————–+————-+———–+————-+———-+——–+——+————+———+—————+
我们来看下这个SQL的执行计划：

mysql> select * from pre_forum_post where tid=7932612 and `invisible` in(‘0′,’-2′) order by dateline limit 15;
15 rows in set (26.78 sec)
看下MySQL的会话状态值：Handler_read_next

| Handler_read_next | 17274153 |
从1700多万数据中选取15条记录，结果可想而知，非常慢。我们强制指定比较靠谱的索引再看下：

mysql> explain select * from pre_forum_post force index(displayorder) where tid=7932612 and `invisible` in(‘0′,’-2′) order by dateline limit 15\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: pre_forum_post
type: range
possible_keys: displayorder
key: displayorder
key_len: 4
ref: NULL
rows: 46131
Extra: Using index condition; Using filesort
看下实际执行的耗时：

mysql> select * from pre_forum_post force index(displayorder) where tid=7932612 and `invisible` in(‘0′,’-2′) order by dateline limit 15;
15 rows in set (0.08 sec)
尼玛，怎么可以这么快，查询优化器未免太坑爹了吧。再看下MySQL的会话状态值：Handler_read_next

| Handler_read_next | 31188 |
和不强制索引的情况相比，差了553倍！所幸，5.6以上除了EXPLAIN外，还支持OPTIMIZER_TRACE，我们来观察下两种执行计划的区别，发现不强制指定索引时的执行计划有诈，会在最后判断到 ORDER BY 子句时，修改执行计划：

{\
“reconsidering_access_paths_for_index_ordering”: {\
“clause”: “ORDER BY”,\
“index_order_summary”: {\
“table”: “`pre_forum_post`”,\
“index_provides_order”: true,\
“order_direction”: “asc”,\
“index”: “idx_dt”,\
“plan_changed”: true,\
“access_type”: “index_scan”\
} /* index_order_summary */\
} /* reconsidering_access_paths_for_index_ordering */\
而在前面analyzing_range_alternatives和considered_execution_plans阶段，都认为其他几个索引也是可选择的，直到这里才给强X了，你Y的… 看起来像是MySQL 5.6查询优化器的bug了，GOOGLE了一下，还真发有人已经反馈过类似的问题： MySQL bug 70245: incorrect costing for range scan causes optimizer to choose incorrect index

看完才发现，其实不是神马BUG，而是原来从5.6开始，增加了一个选项叫eq_range_index_dive_limit 的高级货，这货大概的用途是：在较多等值查询（例如多值的IN查询）情景中，预估可能会扫描的记录数，从而选择相对更合适的索引，避免所谓的index dive问题。

当面临下面两种选择时：

1、索引代价较高，但结果较为精确；
2、索引代价较低，但结果可能不够精确；
简单说，选项 eq_range_index_dive_limit 的值设定了 IN列表中的条件个数上线，超过设定值时，会将执行计划分支从 1 变成 2。

该值默认为10，但社区众多人反馈较低了，因此在5.7版本后，将默认值调整为200了。

不过，今天我们这里的案例却是想反的，因为优化器选择了看似代价低但精确的索引，实际却选择了更低效的索引。因此，我们需要将其阈值调低，尝试设置 eq_range_index_dive_limit = 2 后（上面的例子中，IN条件里有2个值），再看下新的查询计划：

mysql> set eq_range_index_dive_limit = 2;

mysql> explain select * from pre_forum_post where tid=7932612 and `invisible` in(‘0′,’-2′) order by dateline limit 15\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: pre_forum_post
type: range
possible_keys: PRIMARY,displayorder,first
key: displayorder
key_len: 4
ref: NULL
rows: 54
Extra: Using index condition; Using filesort
卧槽，预估扫描记录数又降了557倍，相比最开始降了接近32万倍！在这个案例中，虽然通过修改选项 eq_range_index_dive_limit 的阈值可以达到优化效果，但事实上更靠谱的做法是：直接删除 idx_dt 索引。是的，没错，删除这个垃圾重复索引，因为实际上这个索引的用处不大，够坑爹吧

MySQL 5.6查询优化器新特性的“BUG” eq_range_index_dive_limit的更多相关文章

MySQL 8.0.2复制新特性（翻译）
译者:知数堂星耀队 MySQL 8.0.2复制新特性 MySQL 8 正在变得原来越好,而且这也在我们MySQL复制研发团队引起了一阵热潮.我们一直致力于全面提升MySQL复制,通过引入新的和一些有趣 ...
MySQL 8.0的关系数据库新特性详解
前言 MySQL 8.0 当前的最新版本是 8.0.4 rc,估计正式版本出来也快了.本文介绍几个 8.0 在关系数据库方面的主要新特性. 你可能已经知道 MySQL 从版本 5.7 开始提供了 No ...
高性能MySql进化论(九):查询优化器常用的优化方式
1 介绍 1.1 处理流程当MYSQL 收到一条查询请求时,会首先通过关键字对SQL语句进行解析,生成一颗“解析树”,然后预处理器会校验“解析树”是否合法(主要校验数据列和表明 ...
Atitit.mysql 5.0 5.5 5.6 5.7 新特性新功能
Atitit.mysql 5.0 5.5 5.6 5.7 新特性新功能 1. MySQL 5.6 5 大新特性1 1.1. 优化器的改进1 1.2. InnoDB 改进1 1.3. 使用 ...
Atitit.mysql 5.0 5.5 5.6 5.7 新特性新功能
Atitit.mysql 5.0 5.5 5.6 5.7 新特性新功能 1. MySQL 5.6 5 大新特性1 1.1. 优化器的改进1 1.2. InnoDB 改进1 1.3. 使用 ...
MySQL 5.7新特性之Generated Column（函数索引）
MySQL 5.7引入了Generated Column,这篇文章简单地介绍了Generated Column的使用方法和注意事项,为读者了解MySQL 5.7提供一个快速的.完整的教程.这篇文章围绕 ...
MySQL 5.7新特性之generated column
MySQL 5.7引入了generated column,这篇文章简单地介绍了generated column的使用方法和注意事项,为读者了解MySQL 5.7提供一个快速的.完整的教程.这篇文章围绕 ...
跨时代的MySQL8.0新特性解读
目录 MySQL发展历程 MySQL8.0新特性秒级加列性能提升文档数据库 SQL增强共用表表达式(CTEs) 不可见索引(Invisible Indexes) 降序索引(Descending ...
webpack 4.0.0-beta.0 新特性介绍
webpack 可以看做是模块打包机.它做的事情是:分析你的项目结构,找到JavaScript模块以及其它的一些浏览器不能直接运行的拓展语言(Scss,TypeScript等),并将其打包为合适的格式 ...

随机推荐

C#可空类型的速度和GC Alloc测试
在Unity中进行速度和GC Alloc的测试测试脚本: using UnityEngine; using System; using System.Collections; using Syste ...
Go语言执行系统命令行命令（转）
package main import ( "os" "os/exec" "fmt" "flag" "stri ...
XAF视频教程来啦,已出15课
第一到第七课在这里: http://www.cnblogs.com/foreachlife/p/xafvideo_1_6.html 视频地址:http://i.youku.com/i/UMTI5OTE ...
Hive数据仓库
Hive 是一个基于Hadoop分布式文件系统(HDFS)之上的数据仓库架构,同时依赖于MapReduce.适用于大数据集的批处理,而不适用于低延迟快速查询. Hive将用户的HiveQL语句转换为M ...
ajax请求总是进入Error里
ajax请求时找到了正确的地址,执行完返回总是进入error里,并且浏览器错误显示是找不到请求的地址. 解决办法: 查看配置文件的,把maxJsonLength值改大. <system.web. ...
Linux C相关基础
系统求助 man 函数名 man 2 函数名 - 表示函数是系统调用函数 man 3 函数名 - 表示函数是C的库函数 eg:man fread man 2 w ...
java 计算 1到10 的阶层的和（采用递归的方法）
package hibernate; public class t { public static void main(String[] args) { System.out.println(jiec ...
python——操作Redis
在使用django的websocket的时候,发现web请求和其他当前的django进程的内存是不共享的,猜测django的机制可能是每来一个web请求,就开启一个进程去与web进行交互,一次来达到利 ...
php 使用curl模拟登录人人(校内)网
$login_url = 'http://passport.renren.com/PLogin.do'; $post_fields['email'] = 'XXXX';$post_fields['pa ...
How to run a geoprocessing tool
How to run a geoprocessing tool In this topic Running a geoprocessing tool Toolbox names and namespa ...

MySQL 5.6查询优化器新特性的“BUG” eq_range_index_dive_limit

MySQL 5.6查询优化器新特性的“BUG” eq_range_index_dive_limit的更多相关文章

随机推荐

热门专题