现在的CMS系统、博客系统、BBS等都喜欢使用标签tag作交叉链接,因此我也尝鲜用了下。但用了后发现我想查询某个tag的文章列表时速度很慢,达到5秒之久!百思不解(后来终于解决),我的表结构是下面这样的,文章只有690篇。

文章表article(id,title,content)
标签表tag(tid,tag_name)
标签文章中间表article_tag(id,tag_id,article_id)
其中有个标签的tid是135,我帮查询标签tid是135的文章列表
用以下语句时发现速度好慢,我文章才690篇
select id,title from article where id in(
select article_id from article_tag where tag_id=135
)
其中这条速度很快:select article_id from article_tag where tag_id=135
查询结果是五篇文章,id为428,429,430,431,432
我用写死的方式用下面sql来查文章也很快
select id,title from article where id in(
428,429,430,431,432
)
我在SqlServer中好像不会这样慢,不知MySQL怎样写好点,也想不出慢在哪里。

后来我找到了解决方法:

select id,title from article where id in(
select article_id from (select article_id from article_tag where tag_id=135) as tbt
)

其它解决方法:(举例)

mysql> select * from abc_number_prop where number_id in (select number_id from abc_number_phone where phone = '82306839');

为了节省篇幅,省略了输出内容,下同。

67 rows in set (12.00 sec)

只有67行数据返回,却花了12秒,而系统中可能同时会有很多这样的查询,系统肯定扛不住。用desc看一下(注:explain也可)

mysql> desc select * from abc_number_prop where number_id in (select number_id from abc_number_phone where phone = '82306839');
+----+--------------------+------------------+--------+-----------------+-------+---------+------------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------------+--------+-----------------+-------+---------+------------+---------+--------------------------+
| 1 | PRIMARY | abc_number_prop | ALL | NULL | NULL | NULL | NULL | 2679838 | Using where |
| 2 | DEPENDENT SUBQUERY | abc_number_phone | eq_ref | phone,number_id | phone | 70 | const,func | 1 | Using where; Using index |
+----+--------------------+------------------+--------+-----------------+-------+---------+------------+---------+--------------------------+
2 rows in set (0.00 sec)

从上面的信息可以看出,在执行此查询时会扫描两百多万行,难道是没有创建索引吗,看一下

mysql>show index from abc_number_phone;
+------------------+------------+-------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------------+------------+-------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| abc_number_phone | 0 | PRIMARY | 1 | number_phone_id | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 0 | phone | 1 | phone | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 0 | phone | 2 | number_id | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 1 | number_id | 1 | number_id | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 1 | created_by | 1 | created_by | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 1 | modified_by | 1 | modified_by | A | 36879 | NULL | NULL | YES | BTREE | | |
+------------------+------------+-------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
6 rows in set (0.06 sec)

mysql>show index from abc_number_prop;
+-----------------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| abc_number_prop | 0 | PRIMARY | 1 | number_prop_id | A | 311268 | NULL | NULL | | BTREE | | |
| abc_number_prop | 1 | number_id | 1 | number_id | A | 311268 | NULL | NULL | | BTREE | | |
| abc_number_prop | 1 | created_by | 1 | created_by | A | 311268 | NULL | NULL | | BTREE | | |
| abc_number_prop | 1 | modified_by | 1 | modified_by | A | 311268 | NULL | NULL | YES | BTREE | | |
+-----------------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.15 sec)

从上面的输出可以看出,这两张表在number_id字段上创建了索引的。

看看子查询本身有没有问题。

mysql> desc select number_id from abc_number_phone where phone = '82306839';
+----+-------------+------------------+------+---------------+-------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------+------+---------------+-------+---------+-------+------+--------------------------+
| 1 | SIMPLE | abc_number_phone | ref | phone | phone | 66 | const | 6 | Using where; Using index |
+----+-------------+------------------+------+---------------+-------+---------+-------+------+--------------------------+
1 row in set (0.00 sec)

没有问题,只需要扫描几行数据,索引起作用了。查询出来看看

mysql> select number_id from abc_number_phone where phone = '82306839';
+-----------+
| number_id |
+-----------+
| 8585 |
| 10720 |
| 148644 |
| 151307 |
| 170691 |
| 221897 |
+-----------+
6 rows in set (0.00 sec)

直接把子查询得到的数据放到上面的查询中

mysql> select * from abc_number_prop where number_id in (8585, 10720, 148644, 151307, 170691, 221897);

67 rows in set (0.03 sec)

速度也快,看来MySQL在处理子查询的时候是不够好。我在MySQL 5.1.42 和 MySQL 5.5.19 都进行了尝试,都有这个问题。

搜索了一下网络,发现很多人都遇到过这个问题:

参考资料1:使用连接(JOIN)来代替子查询(Sub-Queries) mysql优化系列记录
http://blog.csdn.net/hongsejiaozhu/article/details/1876181
参考资料2:网站开发日记(14)-MYSQL子查询和嵌套查询优化
http://dodomail.iteye.com/blog/250199

根据网上这些资料的建议,改用join来试试。

修改前:select * from abc_number_prop where number_id in (select number_id from abc_number_phone where phone = '82306839');

修改后:select a.* from abc_number_prop a inner join abc_number_phone b on a.number_id = b.number_id where phone = '82306839';

mysql> select a.* from abc_number_prop a inner join abc_number_phone b on a.number_id = b.number_id where phone = '82306839';

67 rows in set (0.00 sec)

效果不错,查询所用时间几乎为0。看一下MySQL是怎么执行这个查询的

mysql>desc select a.* from abc_number_prop a inner join abc_number_phone b on a.number_id = b.number_id where phone = '82306839';
+----+-------------+-------+------+-----------------+-----------+---------+-----------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-----------------+-----------+---------+-----------------+------+--------------------------+
| 1 | SIMPLE | b | ref | phone,number_id | phone | 66 | const | 6 | Using where; Using index |
| 1 | SIMPLE | a | ref | number_id | number_id | 4 | eap.b.number_id | | |
+----+-------------+-------+------+-----------------+-----------+---------+-----------------+------+--------------------------+
2 rows in set (0.00 sec)

小结:当子查询速度慢时,可用JOIN来改写一下该查询来进行优化。

网上也有文章说,使用JOIN语句的查询不一定总比使用子查询的语句快。

参考资料3:改变了对Mysql子查询的看法
http://hi.baidu.com/yzx110/blog/item/e694f536f92075360b55a92b.html

mysql手册也提到过,具体的原文在mysql文档的这个章节:

I.3. Restrictions on Subqueries

13.2.8. Subquery Syntax

摘抄:

1)关于使用IN的子查询:

Subquery optimization for IN is not as effective as for the = operator or for IN(value_list) constructs.

A typical case for poor IN subquery performance is when the subquery returns a small number of rows but the outer query returns a large number of rows to be compared to the subquery result.

The problem is that, for a statement that uses an IN subquery, the optimizer rewrites it as a correlated subquery. Consider the following statement that uses an uncorrelated subquery:

SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2);

The optimizer rewrites the statement to a correlated subquery:

SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a);

If the inner and outer queries return M and N rows, respectively, the execution time becomes on the order of O(M×N), rather than O(M+N) as it would be for an uncorrelated subquery.

An implication is that an IN subquery can be much slower than a query written using an IN(value_list) construct that lists the same values that the subquery would return.

2)关于把子查询转换成join的:

The optimizer is more mature for joins than for subqueries, so in many cases a statement that uses a subquery can be executed more efficiently if you rewrite it as a join.

An exception occurs for the case where an IN subquery can be rewritten as a SELECT DISTINCT join. Example:

SELECT col FROM t1 WHERE id_col IN (SELECT id_col2 FROM t2 WHERE condition);

That statement can be rewritten as follows:

SELECT DISTINCT col FROM t1, t2 WHERE t1.id_col = t2.id_col AND condition;

But in this case, the join requires an extra DISTINCT operation and is not more efficient than the subquery

mysql in 子查询 效率慢 优化(转)的更多相关文章

  1. mysql in 子查询 效率慢,对比

    desc SELECT id,detail,groupId from hs_knowledge_point where groupId in ( UNION all ) UNION ALL SELEC ...

  2. mysql数据库的优化和查询效率的优化

    一.数据库的优化 1.优化索引.SQL 语句.分析慢查询: 2.设计表的时候严格根据数据库的设计范式来设计数据库: 3.使用缓存,把经常访问到的数据而且不需要经常变化的数据放在缓存中,能节约磁盘IO: ...

  3. MySQL的一次优化记录 (IN子查询和索引优化)

    这两天实习项目遇到一个网页加载巨慢的问题(10多秒),然后定位到是一个MySQL查询特别慢的语句引起的: SELECT * FROM ( SELECT DISTINCT t.vc_date, t.c_ ...

  4. MySQL 表子查询

    MySQL 表子查询 表子查询是指子查询返回的结果集是 N 行 N 列的一个表数据. MySQL 表子查询实例 下面是用于例子的两张原始数据表: article 表: aid title conten ...

  5. 深入MySQL(四):MySQL的SQL查询语句性能优化概述

    关于SQL查询语句的优化,有一些一般的优化步骤,本节就介绍一下通用的优化步骤. 一条查询语句是如何执行的 首先,我们如果要明白一条查询语句所运行的过程,这样我们才能针对过程去进行优化. 参考我之前画的 ...

  6. 记一次mysql多表查询(left jion)优化案例

    一次mysql多表查询(left jion)优化案例 在新上线的供需模块中,发现某一个查询按钮点击后,出不来结果,找到该按钮对应sql手动执行,发现需要20-30秒才能出结果,所以服务端程序判断超时, ...

  7. MySQL 行子查询(转)

    MySQL 行子查询 行子查询是指子查询返回的结果集是一行 N 列,该子查询的结果通常是对表的某行数据进行查询而返回的结果集. 一个行子查询的例子如下: SELECT * FROM table1 WH ...

  8. MySQL FROM 子查询

    FROM 子句中的子查询 MySQL FROM 子查询是指 FROM 的子句作为子查询语句,主查询再到子查询结果中获取需要的数据.FROM 子查询语法如下: SELECT ... FROM (subq ...

  9. MySQL 行子查询

    MySQL 行子查询 行子查询是指子查询返回的结果集是一行 N 列,该子查询的结果通常是对表的某行数据进行查询而返回的结果集. 一个行子查询的例子如下: SELECT * FROM table1 WH ...

随机推荐

  1. angular学习-入门基础

    angular .caret,.dropup>.btn>.caret{border-top-color:#000!important}.label{border:1px solid #00 ...

  2. 每天一个linux命令(29):date命令

    在linux环境中,不管是编程还是其他维护,时间是必不可少的,也经常会用到时间的运算,熟练运用date命令来表示自己想要表示的时间,肯定可以给自己的工作带来诸多方便. 1.命令格式: date [参数 ...

  3. 报课系统APP

    031302307黄丰润 031302343张晓燕 #NABCD模型分析 合理分析需求有助于说服客户,所以我们有如下分析 N(need)--客户需要什么 负责人需要将选课信息和选课表格一起发送给所负责 ...

  4. OC基础--分类(category) 和 协议(protocol)

    OC 中的category分类文件相当于 C#中的部分类:OC 中的protocol协议文件(本质是头文件)相当于 C#中的接口.今天就简单说明一下OC中的这两个文件. 由于视频中的Xcode版本低, ...

  5. Netbeans 中的编译器相关配置

    gcc-core:C 编译器 gcc-g++:C++ 编译器 gdb:GNU 调试器 make:"make" 实用程序的 GNU 版本

  6. ThinkPHP多表联合查询的常用方法

    1.原生查询示例: $Model = new Model(); $sql = 'select a.id,a.title,b.content from think_test1 as a, think_t ...

  7. 高斯混合聚类及EM实现

    一.引言 我们谈到了用 k-means 进行聚类的方法,这次我们来说一下另一个很流行的算法:Gaussian Mixture Model (GMM).事实上,GMM 和 k-means 很像,不过 G ...

  8. POJ2135 Farm Tour

      Farm Tour Time Limit: 2MS   Memory Limit: 65536KB   64bit IO Format: %I64d & %I64u Description ...

  9. Code Review Engine Learning

    相关学习资料 https://www.owasp.org/index.php/Code_review https://www.owasp.org/images/8/8e/OWASP_Code_Revi ...

  10. jsp学习(四)

    JavaBean是一个可重复使用的软件组件,是遵循一定标准.用java语言编写的一个类,该类的一个实例称为一个JavaBean,简称bean. 它必须符合如下规范:1.必须有一个零参数的默认构造函数2 ...