How the MySQL optimizer estimates the cost of a full table scan
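In InnoDB, the leaf pages of the clustered index hold the complete rows (the primary key columns together with the remaining columns), so the leaf pages are the data pages.
So how is the cost of a full table scan computed?
The cost is the CPU cost plus the IO cost.
The CPU cost is one unit for every 5 records compared. The records counted here are clustered index records, and their number equals the number of data rows.
How is the total number of records in the table (that is, the total number of clustered index records) obtained?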
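The method: the index statistics give the number of leaf pages, and a page is 16 KB by default, so the total size is leaf_page_num * 16 KB.
Next, the length of one index record is computed. The index may consist of several fields, so the code loops over them; call the result m.
total_records = leaf_page_num * 16 KB / m is then the number of index records (the code additionally multiplies this by a safety factor of 2). One clustered index record corresponds to one data row, so this is the total row count.
There is still a catch: total_records is not an exact row count. m comes from dict_index_calc_min_rec_len(), which counts only the length bytes of a variable-length field, so m is a minimum record length and the quotient is an upper bound; hence the name estimate_rows_upper_bound(). At first glance m looks like the length of the primary key only, but the loop runs over dict_index_get_n_fields(index), and for the clustered index that field list, as far as I can tell, covers the non-key columns of the row as well.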
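The estimate described above can be written as a small stand-alone C function. This is only a sketch under stated assumptions: the name rows_upper_bound and the constant PAGE_SIZE_BYTES are hypothetical (they stand in for estimate_rows_upper_bound() and UNIV_PAGE_SIZE), the page size is assumed to be the 16 KB default, and the factor 2 is the safety factor mentioned in the source comment of estimate_rows_upper_bound().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed default InnoDB page size (stands in for UNIV_PAGE_SIZE): 16 KB. */
#define PAGE_SIZE_BYTES 16384ULL

/* Hypothetical stand-alone version of the estimate; not InnoDB code.
   leaf_pages  - number of leaf pages in the clustered index
   min_rec_len - minimum record length in bytes, must be > 0 */
static uint64_t rows_upper_bound(uint64_t leaf_pages, uint64_t min_rec_len)
{
    assert(min_rec_len > 0);

    /* Total bytes occupied by the leaf level of the clustered index. */
    uint64_t data_bytes = leaf_pages * PAGE_SIZE_BYTES;

    /* Safety factor 2, then divide by the smallest possible record. */
    return 2 * data_bytes / min_rec_len;
}
```

With 100 leaf pages and a minimum record length of 20 bytes, the function yields 2 * 100 * 16384 / 20 = 163840 rows. A smaller m gives a larger bound, which is why a minimum record length produces an upper bound rather than an exact count.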
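So the CPU cost is total_records / 5 + 1.
The IO cost is (double) prebuilt->table->stat_clustered_index_size, the approximate size of the clustered index in pages; this is the value that scan_time() returns.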
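Putting the two parts together, here is a minimal sketch in C of the full scan cost as described above. The function name full_scan_cost and the constant ROWS_PER_COMPARE_UNIT are hypothetical and exist only for illustration; the formula itself (IO cost in pages plus total_records / 5 + 1) is the one stated in the text, not a copy of the optimizer code.

```c
#include <assert.h>

/* One unit of CPU cost per 5 records compared, as stated in the text. */
#define ROWS_PER_COMPARE_UNIT 5.0

/* Hypothetical helper; not MySQL code.
   clustered_index_pages - size of the clustered index in pages (IO cost)
   total_records         - estimated number of rows in the table */
static double full_scan_cost(double clustered_index_pages, double total_records)
{
    assert(clustered_index_pages >= 0.0 && total_records >= 0.0);

    double io_cost  = clustered_index_pages;
    double cpu_cost = total_records / ROWS_PER_COMPARE_UNIT + 1.0;

    return io_cost + cpu_cost;
}
```

For a table whose clustered index occupies 100 pages and holds 1000 rows, this gives 100 + 1000 / 5 + 1 = 301 cost units.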
/******************************************************************//**
Calculate the time it takes to read a set of ranges through an index
This enables us to optimise reads for clustered indexes.
@return estimated time measured in disk seeks */
UNIV_INTERN
double
ha_innobase::read_time(
/*===================*/
uint index, /*!< in: key number */
uint ranges, /*!< in: how many ranges */
ha_rows rows) /*!< in: estimated number of rows in the ranges */
{
    ha_rows total_rows;
    double  time_for_scan;

    if (index != table->s->primary_key) {
        /* Not clustered */
        return(handler::read_time(index, ranges, rows));
    }

    if (rows <= 2) {
        return((double) rows);
    }

    /* Assume that the read time is proportional to the scan time for all
    rows + at most one seek per range. */

    time_for_scan = scan_time();

    /* estimate_rows_upper_bound() is the function that estimates the
    total number of rows in the table */
    if ((total_rows = estimate_rows_upper_bound()) < rows) {
        return(time_for_scan);
    }

    return(ranges + (double) rows / (double) total_rows * time_for_scan);
}

/*********************************************************************//**
Gives an UPPER BOUND to the number of rows in a table. This is used in
filesort.cc.
@return upper bound of rows */
UNIV_INTERN
ha_rows
ha_innobase::estimate_rows_upper_bound(void)
/*======================================*/
{
    dict_index_t*   index;
    ulonglong       estimate;
    ulonglong       local_data_file_length;
    ulint           stat_n_leaf_pages;

    /* The first index of the table is the clustered index */
    index = dict_table_get_first_index(prebuilt->table);

    /* Number of leaf pages in the clustered index */
    stat_n_leaf_pages = index->stat_n_leaf_pages;

    /* Size in bytes: number of leaf pages * page size (16 KB by default) */
    local_data_file_length = ((ulonglong) stat_n_leaf_pages) * UNIV_PAGE_SIZE;

    /* Calculate a minimum length for a clustered index record and from
    that an upper bound for the number of rows. Since we only calculate
    new statistics in row0mysql.c when a table has grown by a threshold
    factor, we must add a safety factor 2 in front of the formula below. */

    /* 2 * leaf pages * 16 KB / minimum clustered index record length
    gives an upper bound on the number of clustered index records */
    estimate = 2 * local_data_file_length /
        dict_index_calc_min_rec_len(index);

    DBUG_RETURN((ha_rows) estimate);
}

/*********************************************************************//**
Calculates the minimum record length in an index. */
UNIV_INTERN
ulint
dict_index_calc_min_rec_len(
/*========================*/
const dict_index_t* index) /*!< in: index */
{
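    ulint sum = 0;
    ulint i;
    /* Non-zero if the table uses the compact row format */
    ulint comp = dict_table_is_comp(index->table);

    if (comp) {
        ulint nullable = 0;
        sum = REC_N_NEW_EXTRA_BYTES;
        /* The index may consist of several fields, so loop over all of
        them and add up the minimum number of bytes each one needs */
        for (i = 0; i < dict_index_get_n_fields(index); i++) {
            const dict_col_t* col
                = dict_index_get_nth_col(index, i);
            ulint size = dict_col_get_fixed_size(col, comp);
            sum += size;
            if (!size) {
                /* Variable-length column: count only its length bytes */
                size = col->len;
                sum += size < 128 ? 1 : 2;
            }
            if (!(col->prtype & DATA_NOT_NULL)) {
                nullable++;
            }
        }

        /* round the NULL flags up to full bytes */
        sum += UT_BITS_IN_BYTES(nullable);

        return(sum);
    }

    /* (the handling of the old, non-compact row format is not shown
    in this excerpt) */
}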
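To make the compact-format branch above easier to follow, here is a simplified, self-contained C sketch of the same calculation. Everything in it is an assumption made for illustration: struct col_desc is a hypothetical stand-in for dict_col_t, min_rec_len_compact is not an InnoDB function, and COMPACT_EXTRA_BYTES assumes that REC_N_NEW_EXTRA_BYTES is 5. A variable-length column is charged 1 length byte if its maximum length is below 128 and 2 bytes otherwise, and the NULL flags are rounded up to whole bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified column descriptor; not the real dict_col_t. */
struct col_desc {
    unsigned fixed_size; /* size in bytes if fixed-length, 0 if variable */
    unsigned max_len;    /* declared maximum length in bytes */
    int      nullable;   /* non-zero if the column may be NULL */
};

/* Assumed size of the compact record header (REC_N_NEW_EXTRA_BYTES). */
#define COMPACT_EXTRA_BYTES 5u

/* Minimum length in bytes of a compact-format record with these columns. */
static unsigned min_rec_len_compact(const struct col_desc *cols, size_t n)
{
    unsigned sum = COMPACT_EXTRA_BYTES;
    unsigned nullable = 0;

    assert(cols != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        sum += cols[i].fixed_size;
        if (cols[i].fixed_size == 0) {
            /* Variable-length column: only its length bytes are counted. */
            sum += cols[i].max_len < 128 ? 1 : 2;
        }
        if (cols[i].nullable) {
            nullable++;
        }
    }

    /* Round the NULL flag bits up to full bytes. */
    sum += (nullable + 7) / 8;

    return sum;
}
```

For a record with a 4-byte NOT NULL integer, a nullable 100-byte variable-length column and a nullable 300-byte variable-length column, the result is 5 + 4 + 1 + 2 + 1 = 13 bytes. Real rows are usually much longer than this minimum, which is what makes the row count derived from it an upper bound.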
The dict_index_t structure
/** InnoDB B-tree index */
typedef struct dict_index_struct dict_index_t;

/** Data structure for an index. Most fields will be
initialized to 0, NULL or FALSE in dict_mem_index_create(). */
struct dict_index_struct{
index_id_t id; /*!< id of the index */
mem_heap_t* heap; /*!< memory heap */
const char* name; /*!< index name */
const char* table_name;/*!< table name */
dict_table_t* table; /*!< back pointer to table */
#ifndef UNIV_HOTBACKUP
unsigned space:32;
/*!< space where the index tree is placed */
unsigned page:32;/*!< index tree root page number */
#endif /* !UNIV_HOTBACKUP */
unsigned type:DICT_IT_BITS;
/*!< index type (DICT_CLUSTERED, DICT_UNIQUE,
DICT_UNIVERSAL, DICT_IBUF, DICT_CORRUPT) */
#define MAX_KEY_LENGTH_BITS 12
unsigned trx_id_offset:MAX_KEY_LENGTH_BITS;
/*!< position of the trx id column
in a clustered index record, if the fields
before it are known to be of a fixed size,
0 otherwise */
#if (1<<MAX_KEY_LENGTH_BITS) < MAX_KEY_LENGTH
# error (1<<MAX_KEY_LENGTH_BITS) < MAX_KEY_LENGTH
#endif
unsigned n_user_defined_cols:10;
/*!< number of columns the user defined to
be in the index: in the internal
representation we add more columns */
unsigned n_uniq:10;/*!< number of fields from the beginning
which are enough to determine an index
entry uniquely */
unsigned n_def:10;/*!< number of fields defined so far */
unsigned n_fields:10;/*!< number of fields in the index */
unsigned n_nullable:10;/*!< number of nullable fields */
unsigned cached:1;/*!< TRUE if the index object is in the
dictionary cache */
unsigned to_be_dropped:1;
/*!< TRUE if this index is marked to be
dropped in ha_innobase::prepare_drop_index(),
otherwise FALSE. Protected by
dict_sys->mutex, dict_operation_lock and
index->lock.*/
dict_field_t* fields; /*!< array of field descriptions */
#ifndef UNIV_HOTBACKUP
UT_LIST_NODE_T(dict_index_t)
indexes;/*!< list of indexes of the table */
btr_search_t* search_info; /*!< info used in optimistic searches */
/*----------------------*/
/** Statistics for query optimization */
/* @{ */
ib_int64_t* stat_n_diff_key_vals;
/*!< approximate number of different
key values for this index, for each
n-column prefix where n <=
dict_get_n_unique(index); we
periodically calculate new
estimates */
ib_int64_t* stat_n_non_null_key_vals;
/* approximate number of non-null key values
for this index, for each column where
n < dict_get_n_unique(index); This
is used when innodb_stats_method is
"nulls_ignored". */
ulint stat_index_size;
/*!< approximate index size in
database pages */
ulint stat_n_leaf_pages;
/*!< approximate number of leaf pages in the
index tree */
/* @} */
rw_lock_t lock; /*!< read-write lock protecting the
upper levels of the index tree */
trx_id_t trx_id; /*!< id of the transaction that created this
index, or 0 if the index existed
when InnoDB was started up */
#endif /* !UNIV_HOTBACKUP */
#ifdef UNIV_BLOB_DEBUG
mutex_t blobs_mutex;
/*!< mutex protecting blobs */
void* blobs; /*!< map of (page_no,heap_no,field_no)
to first_blob_page_no; protected by
blobs_mutex; @see btr_blob_dbg_t */
#endif /* UNIV_BLOB_DEBUG */
#ifdef UNIV_DEBUG
ulint magic_n;/*!< magic number */
/** Value of dict_index_struct::magic_n */
# define DICT_INDEX_MAGIC_N 76789786
#endif
};