1110Nested Loop Join算法

转自 http://blog.csdn.net/tonyxf121/article/details/7796657

join的实现原理

join的实现是采用Nested Loop Join算法，就是通过驱动表的结果集作为循环基础数据，然后一条一条的通过该结果集中的数据作为过滤条件到下一个表中查询数据，然后合并结果。如果有多个join，则将前面的结果集作为循环数据，再一次作为循环条件到后一个表中查询数据。

接下来通过一个三表join查询来说明MySQL的Nested Loop Join的实现方式。

select m.subject msg_subject, c.content msg_content
from user_group g,group_message m,group_message_content c
where g.user_id = 1
and m.group_id = g.group_id
and c.group_msg_id = m.id

使用explain看看执行计划：

explain select m.subject msg_subject, c.content msg_content from user_group g,group_message m,
group_message_content c where g.user_id = 1 and m.group_id = g.group_id and c.group_msg_id = m.id\G;

结果如下：

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: g
type: ref
possible_keys: user_group_gid_ind,user_group_uid_ind,user_group_gid_uid_ind
key: user_group_uid_ind
key_len: 4
ref: const
rows: 2
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: m
type: ref
possible_keys: PRIMARY,idx_group_message_gid_uid
key: idx_group_message_gid_uid
key_len: 4
ref: g.group_id
rows: 3
Extra:
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: c
type: ref
possible_keys: idx_group_message_content_msg_id
key: idx_group_message_content_msg_id
key_len: 4
ref: m.id
rows: 2
Extra:

从结果可以看出，explain选择user_group作为驱动表，首先通过索引user_group_uid_ind来进行const条件的索引ref查找，然后用user_group表中过滤出来的结果集group_id字段作为查询条件，对group_message循环查询，然后再用过滤出来的结果集中的group_message的id作为条件与group_message_content的group_msg_id进行循环比较查询，获得最终的结果。

这个过程可以通过如下代码来表示：

for each record g_rec in table user_group that g_rec.user_id=1{
     for each record m_rec in group_message that m_rec.group_id=g_rec.group_id{
          for each record c_rec in group_message_content that c_rec.group_msg_id=m_rec.id
                pass the (g_rec.user_id, m_rec.subject, c_rec.content) row
          combination to output;
      }
}

如果去掉group_message_content表上面的group_msg_id字段的索引，执行计划会有所不一样。

drop index idx_group_message_content_msg_id on group_message_content;
explain select m.subject msg_subject, c.content msg_content from user_group g,group_message m,
group_message_content c where g.user_id = 1 and m.group_id = g.group_id and c.group_msg_id = m.id\G;

得到的执行计划如下：

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: g
type: ref
possible_keys: user_group_uid_ind
key: user_group_uid_ind
key_len: 4
ref: const
rows: 2
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: m
type: ref
possible_keys: PRIMARY,idx_group_message_gid_uid
key: idx_group_message_gid_uid
key_len: 4
ref: g.group_id
rows: 3
Extra:
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: c
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 96
Extra:Using where;Using join buffer

因为删除了索引，所以group_message_content的访问从ref变成了ALL，keys相关的信息也变成了NULL，Extra信息也变成了Using Where和Using join buffer，也就是说需要获取content内容只能通过对全表的数据进行where过滤才能获取。Using join buffer是指使用到了Cache，只有当join类型为ALL，index，rang或者是index_merge的时候才会使用join buffer，它的使用过程可以用下面代码来表示：

for each record g_rec in table user_group{
      for each record m_rec in group_message that m_rec.group_id=g_rec.group_id{
           put (g_rec, m_rec) into the buffer
           if (buffer is full)
                 flush_buffer();
      }
}
flush_buffer(){
      for each record c_rec in group_message_content that c_rec.group_msg_id = c_rec.id{
            for each record in the buffer
                 pass (g_rec.user_id, m_rec.subject, c_rec.content) row combination to output;
      }
      empty the buffer;
}
在实现过程中可以看到把user_group和group_message的结果集放到join buffer中，而不用每次user_group和group_message关联后马上和group_message_content关联，这也是没有必要的；需要注意的是join buffer中只保留查询结果中出现的列值，它的大小不依赖于表的大小，我们在伪代码中看到当join buffer被填满后，mysql将会flush buffer。

join语句的优化

1. 用小结果集驱动大结果集，尽量减少join语句中的Nested Loop的循环总次数；

2. 优先优化Nested Loop的内层循环，因为内层循环是循环中执行次数最多的，每次循环提升很小的性能都能在整个循环中提升很大的性能；

3. 对被驱动表的join字段上建立索引；

4. 当被驱动表的join字段上无法建立索引的时候，设置足够的Join Buffer Size。

增加一点：

ON是最先执行， WHERE次之，HAVING最后，因为ON是先把不符合条件的记录过滤后才进行统计，它就可以减少中间运算要处理的数据，按理说应该速度是最快的，WHERE也应该比 HAVING快点的，因为它过滤数据后才进行SUM，在两个表联接时才用ON的，所以在一个表的时候，就剩下WHERE跟HAVING比较了

1考虑联接优先顺序：

2INNER JOIN

3LEFT JOIN (注：RIGHT JOIN 用 LEFT JOIN 替代)

4CROSS JOIN

打包小工具

http://www.linuxidc.com/Linux/2014-03/98553.htm

1110Nested Loop Join算法的更多相关文章

1122MySQL性能优化之 Nested Loop Join和Block Nested-Loop Join(BNL)
转自http://blog.itpub.net/22664653/viewspace-1692317/ 一介绍相信许多开发/DBA在使用MySQL的过程中,对于MySQL处理多表关联的方式或者说 ...
关于join算法的四篇文章
MySQL Join算法与调优白皮书(一) MySQL Join算法与调优白皮书(二) MySQL Join算法与调优白皮书(三) MySQL Join算法与调优白皮书(四) MariaDB Join ...
44 答疑（三）--join的写法/Simple nested loop join的性能问题/Distinct和group by的性能/备库自增主键问题
44 答疑(三) Join的写法 35节介绍了join执行顺序,加了straight_join,两个问题: --1 如果用left join,左边的表一定是驱动表吗 --2 如果两个表的join包含多 ...
MySQL Nested-Loop Join算法学习
不知不觉的玩了两年多的MySQL,发现很多人都说MySQL对比Oracle来说,优化器做的比较差,其实某种程度上来说确实是这样,但是毕竟MySQL才到5.7版本,Oracle都已经发展到12c了,今天 ...
SQL Server的三种物理连接之Loop Join（一）
Sql Server有三种物理连接Loop Join,Merge Join,Hash Join, 当表之间连接的时候会选择其中之一,不同的连接产生的性能不同,理解这三种物理连接对性能调优有很大帮助. ...
24.join算法/锁_1
一. JOIN算法1.1. JOIN 语法 mysql> select * from t4; +---+------+ | a | b | +---+------+ | | 11 | | | 5 ...
SQL Server nested loop join 效率试验
从很多网页上都看到,SQL Server有三种Join的算法, nested loop join, merge join, hash join. 其中最常用的就是nested loop join. 在 ...
Merge join、Hash join、Nested loop join对比分析
简介我们所常见的表与表之间的Inner Join,Outer Join都会被执行引擎根据所选的列,数据上是否有索引,所选数据的选择性转化为Loop Join,Merge Join,Hash Join ...
022：SQL优化--JOIN算法
目录一. SQL优化--JOIN算法 1.1. JOIN 写法对比 2. JOIN的成本 3. JOIN算法 3.1. simple nested loop join 3.2. index nest ...

随机推荐

ViewPager+Fragment再探：和TAB滑动条一起三者结合
Fragment前篇: <Android Fragment初探:静态Fragment组成Activity> ViewPager前篇: <Android ViewPager初探:让页面 ...
[引] Security tips for web developers
Source :Security tips for web developers
MS SQLServer 操作XML语句的存储过程
-- ================================================ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO ...
[转]教你一招 - 如何给nopcommerce增加新闻类别模块
本文转自:http://www.nopchina.net/post/nopchina-teach-newscategory.html nopcommerce的新闻模块一直都没有新闻类别,但是很多情况下 ...
.Net程序员之Python基础教程学习----函数和异常处理[Fifth Day]
今天主要记录,Python中函数的使用以及异常处理. 一.函数: 1.函数的创建以及调用. def Add(val1,val2): return val1+val2; print Add( ...
Windows环境下Android Studio v1.0安装教程
Windows环境下Android Studio v1.0安装教程准备工具 JDK安装包. 要求:JDK 7以及以上版本. Android Studio安装文件. Windows: exe(包含SD ...
hdu 5898 odd-even number 数位DP
传送门:hdu 5898 odd-even number 思路:数位DP,套着数位DP的模板搞一发就可以了不过要注意前导0的处理,dp[pos][pre][status][ze] pos:当前处理的位 ...
分页查询的两种方法（双top 双order 和 row_number() over ()）
--跳过10条取2条也叫分页select top 2 * from studentwhere studentno not in (select top 2 studentno from studen ...
[No000003]现代版三十六计,计计教你如何做人
<现代版三十六计,计计教你如何做人> …………………………………………………………………………………… 第1计施恩计在人际交往中,见到给人帮忙的机会,要立马扑上去,像一只饥饿的松鼠扑向地 ...
转：设置Eclipse中的tab键为4个空格的完整方法
from: https://my.oschina.net/xunxun10/blog/110074 设置Eclipse中的tab键为4个空格的完整方法收藏 XunXun10 发表于 4年前阅读 ...

1110Nested Loop Join算法

1110Nested Loop Join算法的更多相关文章

随机推荐

热门专题