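I. Preface

MySQL pagination queries come up frequently in Java interviews, so it is worth testing them hands-on rather than only reading about them.

Many people struggle to run such tests because they have no large data set, but MySQL does provide an official sample database (the Employees sample database).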

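II. Mock Data

There are two ways to load data here. If you only need some data to test against, the official sample data is recommended. If it does not meet your needs, generate your own.

1. Importing the official sample database

The sample database contains 6 tables.

First change into the employees_db directory, then run the import command: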

mysql -uroot -proot -t < employees.sql

Some environments report an error:

ERROR 1193 (HY000) at line 38: Unknown system variable 'storage_engine'

Connecting to MySQL and checking the default engine shows that the local environment is not at fault:

mysql> show variables like '%engine%';

+----------------------------------+--------+

| Variable_name                    | Value  |

+----------------------------------+--------+

| default_storage_engine           | InnoDB |

| default_tmp_storage_engine       | InnoDB |

| disabled_storage_engines         |        |

| internal_tmp_disk_storage_engine | InnoDB |

+----------------------------------+--------+

4 rows in set (0.01 sec)

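The cause is the script itself: it sets the storage_engine system variable, which was removed in MySQL 5.7.5 in favor of default_storage_engine. Edit employees.sql to use the new name:

set default_storage_engine = InnoDB;
-- set storage_engine = MyISAM;
-- set storage_engine = Falcon;
-- set storage_engine = PBXT;
-- set storage_engine = Maria;
select CONCAT('storage engine: ', @@default_storage_engine) as INFO;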

Running the import again succeeds:

➜  employees_db mysql -uroot -proot -t < employees.sql

mysql: [Warning] Using a password on the command line interface can be insecure.

+-----------------------------+

| INFO                        |

+-----------------------------+

| CREATING DATABASE STRUCTURE |

+-----------------------------+

+------------------------+

| INFO                   |

+------------------------+

| storage engine: InnoDB |

+------------------------+

+---------------------+

| INFO                |

+---------------------+

| LOADING departments |

+---------------------+

+-------------------+

| INFO              |

+-------------------+

| LOADING employees |

+-------------------+

+------------------+

| INFO             |

+------------------+

| LOADING dept_emp |

+------------------+

+----------------------+

| INFO                 |

+----------------------+

| LOADING dept_manager |

+----------------------+

+----------------+

| INFO           |

+----------------+

| LOADING titles |

+----------------+

+------------------+

| INFO             |

+------------------+

| LOADING salaries |

+------------------+

Verify the result (the test script needs the same storage_engine change as above):

➜  employees_db mysql -uroot -proot -t < test_employees_sha.sql

mysql: [Warning] Using a password on the command line interface can be insecure.

+----------------------+

| INFO                 |

+----------------------+

| TESTING INSTALLATION |

+----------------------+

+--------------+------------------+------------------------------------------+

| table_name   | expected_records | expected_crc                             |

+--------------+------------------+------------------------------------------+

| departments  |                9 | 4b315afa0e35ca6649df897b958345bcb3d2b764 |

| dept_emp     |           331603 | d95ab9fe07df0865f592574b3b33b9c741d9fd1b |

| dept_manager |               24 | 9687a7d6f93ca8847388a42a6d8d93982a841c6c |

| employees    |           300024 | 4d4aa689914d8fd41db7e45c2168e7dcb9697359 |

| salaries     |          2844047 | b5a1785c27d75e33a4173aaa22ccf41ebd7d4a9f |

| titles       |           443308 | d12d5f746b88f07e69b9e36675b6067abb01b60e |

+--------------+------------------+------------------------------------------+

We can see that dept_emp holds about 330,000 rows (331,603), employees about 300,000, and salaries about 2.84 million.
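The expected counts above can be compared with what was actually loaded. The following is a minimal sketch; it assumes the import created a database named `employees` (adjust the name if your copy of employees.sql uses a different one).

```sql
-- Assumption: employees.sql created and populated a database named `employees`.
USE employees;

-- One row per table, to compare against the expected_records column above.
SELECT 'departments' AS table_name, COUNT(*) AS found_records FROM departments
UNION ALL SELECT 'dept_emp',     COUNT(*) FROM dept_emp
UNION ALL SELECT 'dept_manager', COUNT(*) FROM dept_manager
UNION ALL SELECT 'employees',    COUNT(*) FROM employees
UNION ALL SELECT 'salaries',     COUNT(*) FROM salaries
UNION ALL SELECT 'titles',       COUNT(*) FROM titles;
```

This only checks row counts; the checksum comparison is still left to the bundled test script.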

2. Importing mock data with a stored procedure

Alternatively, we can bulk-insert data with a stored procedure.

First create a table:

drop table if exists `user`;
create table `user`(
  `id` int unsigned auto_increment,
  `username` varchar(64) not null default '',
  `score` int(11) not null default 0,
  primary key(`id`)
)ENGINE = InnoDB;

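Create the stored procedure:

DROP PROCEDURE IF EXISTS batchInsert;
-- switch the statement delimiter so the procedure body can be sent as one unit
delimiter $$
create procedure batchInsert() -- create the stored procedure
begin   -- start of the procedure body
    declare num int; -- loop counter
    set num=1; -- initial value
    while num<=3000000 do -- loop condition
        insert into user(`username`,`score`) values(concat('user-', num),num); -- insert one row
        set num=num+1; -- increment the counter
    end while; -- end of the loop
end$$ -- end of the procedure body
-- restore the default statement delimiter
delimiter ;
CALL batchInsert; -- run the stored procedure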

Inserting 3 million rows took roughly 1046 s. The original plan was to load 10 million rows, but that would have taken too long.
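A likely reason for the long run time is that, with autocommit enabled, every single insert is committed as its own transaction. The variant below is a sketch only, not something measured for this article: it commits once per batch instead. The procedure name batchInsertTx and its two parameters are made up for illustration.

```sql
DROP PROCEDURE IF EXISTS batchInsertTx;
DELIMITER $$
-- Hypothetical variant of batchInsert: commits every batch_size rows
-- instead of once per row.
CREATE PROCEDURE batchInsertTx(IN total INT, IN batch_size INT)
BEGIN
    DECLARE num INT DEFAULT 1;
    START TRANSACTION;
    WHILE num <= total DO
        INSERT INTO `user`(`username`, `score`)
        VALUES (CONCAT('user-', num), num);
        -- close the current batch and open the next one
        IF num % batch_size = 0 THEN
            COMMIT;
            START TRANSACTION;
        END IF;
        SET num = num + 1;
    END WHILE;
    COMMIT; -- commit whatever is left in the last partial batch
END$$
DELIMITER ;

CALL batchInsertTx(3000000, 10000);
```

The batch size of 10000 is an arbitrary starting point; how much time this saves depends on the server's durability settings and hardware.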

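III. Reproducing and Optimizing Common MySQL Pagination Problems

We test against the existing user table, which holds 3 million rows.

1. Preliminary checks

First look at the table structure and the indexes that currently exist: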

mysql> desc user;

+----------+------------------+------+-----+---------+----------------+

| Field    | Type             | Null | Key | Default | Extra          |

+----------+------------------+------+-----+---------+----------------+

| id       | int(10) unsigned | NO   | PRI | NULL    | auto_increment |

| username | varchar(30)      | NO   |     |         |                |

| score    | int(11)          | NO   |     | 0       |                |

+----------+------------------+------+-----+---------+----------------+

3 rows in set (0.00 sec)

mysql> show index from user;

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| user  |          0 | PRIMARY  |            1 | id          | A         |     2991886 |     NULL | NULL   |      | BTREE      |         |               |

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

1 row in set (0.00 sec)

Only the primary key index on id exists.

Next check whether the query cache is enabled (so that it does not distort the timings):

mysql> show variables like '%query_cache%';

+------------------------------+---------+

| Variable_name                | Value   |

+------------------------------+---------+

| have_query_cache             | YES     |

| query_cache_limit            | 1048576 |

| query_cache_min_res_unit     | 4096    |

| query_cache_size             | 1048576 |

| query_cache_type             | OFF     |

| query_cache_wlock_invalidate | OFF     |

+------------------------------+---------+

6 rows in set (0.00 sec)

mysql> show profiles;

Empty set, 1 warning (0.00 sec)

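have_query_cache = YES and query_cache_type = OFF show that the query cache is supported but not enabled.

show profiles returns an empty set, which indicates that profiling is currently off (it is disabled by default).

Enable profiling: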

mysql> SET profiling = 1;

Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show profiles;

+----------+------------+-------------------+

| Query_ID | Duration   | Query             |

+----------+------------+-------------------+

|        1 | 0.00012300 | SET profiling = 1 |

+----------+------------+-------------------+

1 row in set, 1 warning (0.00 sec)

2. Pagination without an index

The most common way to paginate is order by + limit m,n. Let's measure how it performs:

select * from user order by score limit 0,10; -- 10 rows in set (0.65 sec)

select * from user order by score limit 10000,10; -- 10 rows in set (0.83 sec)

select * from user order by score limit 100000,10; -- 10 rows in set (1.03 sec)

select * from user order by score limit 1000000,10; -- 10 rows in set (1.14 sec)

Confirm whether an index is used:

mysql> explain select * from user order by score limit 1000000,10;

+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+

| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra          |

+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+

|  1 | SIMPLE      | user  | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 2991995 |   100.00 | Using filesort |

+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+

1 row in set, 1 warning (0.00 sec)

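No index is used: the query does a full table scan of all roughly 3 million rows followed by a filesort, and paging at an offset of 1,000,000 takes about 1.14 s.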
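One possible way to avoid the full scan and filesort seen above is to index the sort column and fetch the page in two steps (often called a deferred join). This is a sketch that was not measured for this article; the index name idx_score is hypothetical, and it assumes adding a secondary index to the table is acceptable.

```sql
-- Hypothetical index; building it on 3 million rows takes a while.
ALTER TABLE `user` ADD INDEX idx_score (`score`);

-- The inner query only needs score and id, both of which are stored in the
-- secondary index, so it should be answerable from idx_score alone.
EXPLAIN SELECT id FROM `user` ORDER BY score, id LIMIT 1000000, 10;

-- Deferred join: locate the 10 ids through the index first, then fetch the
-- full rows for just those ids by primary key.
SELECT u.*
FROM `user` AS u
INNER JOIN (
    SELECT id FROM `user` ORDER BY score, id LIMIT 1000000, 10
) AS page ON page.id = u.id
ORDER BY u.score, u.id;
```

Adding id as a tiebreaker keeps the page boundaries stable if several rows share a score. The inner query still walks 1,000,010 index entries, so the cost continues to grow with the offset; what it avoids is sorting the whole table and reading full rows that are then discarded.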

3. Pagination with an index

select * from user order by id limit 10000,10; -- 10 rows in set (0.01 sec)

select * from user order by id limit 1000000,10; -- 10 rows in set (0.18 sec)

select * from user order by id limit 2000000,10; -- 10 rows in set (0.35 sec)

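These queries use the primary key index, so they are considerably faster than ordering by score.

Even so, the time grows noticeably as the offset increases.

Confirm that the index is used: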

mysql> explain select * from user order by id limit 2000000,10;

+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------+

| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra |

+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------+

|  1 | SIMPLE      | user  | NULL       | index | NULL          | PRIMARY | 4       | NULL | 2000010 |   100.00 | NULL  |

+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------+

1 row in set, 1 warning (0.00 sec)

The plan shows a full index scan (type = index) on PRIMARY, with an estimated 2,000,010 rows examined.

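4. Optimization

We use SHOW PROFILE, MySQL's built-in query diagnostic tool, to see how long each stage of the statement takes. Almost all of the time is spent in the Sending data stage. Despite its name, this stage covers reading and processing rows as well as sending them: the server walks through 2,000,010 rows, discards the first 2,000,000, and sends only 10 to the client. Since the client needs only 10 rows, can we avoid paying for the rows that are thrown away?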

mysql> show profiles;

+----------+------------+---------------------------------------------------------+

| Query_ID | Duration   | Query                                                   |

+----------+------------+---------------------------------------------------------+

|        1 | 0.00012300 | SET profiling = 1                                       |

|        2 | 0.00009200 | SET profiling = 1                                       |

|        3 | 0.35689500 | select * from user order by id limit 2000000,10         |

|        4 | 0.00023900 | explain select * from user order by id limit 2000000,10 |

+----------+------------+---------------------------------------------------------+

4 rows in set, 1 warning (0.00 sec)

mysql> show profile for query 3;

+----------------------+----------+

| Status               | Duration |

+----------------------+----------+

| starting             | 0.000071 |

| checking permissions | 0.000007 |

| Opening tables       | 0.000012 |

| init                 | 0.000017 |

| System lock          | 0.000008 |

| optimizing           | 0.000005 |

| statistics           | 0.000024 |

| preparing            | 0.000016 |

| Sorting result       | 0.000004 |

| executing            | 0.000003 |

| Sending data         | 0.356653 |

| end                  | 0.000013 |

| query end            | 0.000005 |

| closing tables       | 0.000008 |

| freeing items        | 0.000019 |

| cleaning up          | 0.000030 |

+----------------------+----------+

16 rows in set, 1 warning (0.00 sec)
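To get a rough idea of whether the time in Sending data goes to CPU work or to disk reads, SHOW PROFILE can also report per-stage CPU and block I/O figures. A small sketch, assuming query 3 is still in the current session's profile list:

```sql
-- Confirm that profiling is on for this session (1 = enabled).
SELECT @@profiling;

-- Per-stage CPU time and block I/O counts for the profiled statement.
SHOW PROFILE CPU, BLOCK IO FOR QUERY 3;
```

If the block I/O columns stay near zero for Sending data, the pages were most likely served from the buffer pool and the time is mostly CPU spent stepping through rows.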

An optimization commonly suggested online: subquery + covering index

mysql> select * from user where id > (select id from user order by id limit 2000000, 1) limit 10;

+---------+--------------+---------+

| id      | username     | score   |

+---------+--------------+---------+

| 2000002 | user-2000002 | 2000002 |

| 2000003 | user-2000003 | 2000003 |

| 2000004 | user-2000004 | 2000004 |

| 2000005 | user-2000005 | 2000005 |

| 2000006 | user-2000006 | 2000006 |

| 2000007 | user-2000007 | 2000007 |

| 2000008 | user-2000008 | 2000008 |

| 2000009 | user-2000009 | 2000009 |

| 2000010 | user-2000010 | 2000010 |

| 2000011 | user-2000011 | 2000011 |

+---------+--------------+---------+

10 rows in set (0.29 sec)

mysql> explain select * from user where id > (select id from user order by id limit 2000000, 1) limit 10;

+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+

| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |

+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+

|  1 | PRIMARY     | user  | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 1495997 |   100.00 | Using where |

|  2 | SUBQUERY    | user  | NULL       | index | NULL          | PRIMARY | 4       | NULL | 2000001 |   100.00 | Using index |

+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+

2 rows in set, 1 warning (0.30 sec)

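However, query performance barely improved (0.29 s versus 0.35 s). What is going on? The execution plan uses the indexes we expected, but the rows column shows that a large number of rows are still examined. (Note also that the rewrite uses >, so its result starts at id 2000002, one row later than limit 2000000,10 would; >= is needed for an exact match, although this does not affect the timing.) Let's look at the subquery on its own: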

mysql> select id from user order by id limit 2000000, 1;

+---------+

| id      |

+---------+

| 2000001 |

+---------+

1 row in set (0.29 sec)

mysql> explain select id from user order by id limit 2000000, 1;

+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+

| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |

+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+

|  1 | SIMPLE      | user  | NULL       | index | NULL          | PRIMARY | 4       | NULL | 2000001 |   100.00 | Using index |

+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+

1 row in set, 1 warning (0.00 sec)

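Even though the subquery uses a covering index, it still takes about 0.3 s, which is nearly the entire cost of the rewritten query. I believe this is simply the normal time spent on index I/O. I could not find official benchmark figures, or a figure for the cost of a single MySQL I/O, to compare against.

In theory, one page holds roughly 1,000 int primary keys and the root stays in memory, so the second level of the B+Tree holds about 1 million keys. With the test page at an offset of 2 million, two I/Os should in theory be enough to locate the data. If two I/Os cost 0.3 s, one should cost about 0.15 s. Let's query a page at an offset of about 990,000 and see whether that matches the hypothesis.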

mysql> select id from user order by id limit 990000,1;

+--------+

| id     |

+--------+

| 990001 |

+--------+

1 row in set (0.15 sec)

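So I venture that this is normal overhead.

The two measurements (0.15 s at offset 990,000 and 0.29 s at offset 2,000,000) are also what one would expect if the cost simply grows linearly with the offset: a B+Tree can seek to a key value but not to the Nth entry, so limit m,n has to walk m entries along the index before it returns anything. In addition, InnoDB's primary key is the clustered index, so select * with order by id already reads rows straight from it without any back-to-table lookups, and the subquery rewrite scans that same index with little left to save. The rewrite is more likely to pay off when the sort column has only a secondary index.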
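For reference, here is a sketch of the subquery rewrite in a form that returns the same page as order by id limit 2000000,10, followed by a keyset ("seek") variant that avoids the offset altogether. Neither was timed for this article, and the literal 2000000 in the second query is a placeholder for whatever id the client saw last.

```sql
-- Same page as "order by id limit 2000000, 10": >= keeps the boundary row,
-- and the outer ORDER BY makes the row order explicit.
SELECT *
FROM `user`
WHERE id >= (SELECT id FROM `user` ORDER BY id LIMIT 2000000, 1)
ORDER BY id
LIMIT 10;

-- Keyset pagination: continue from the last id of the previous page, so the
-- server seeks directly to that key instead of skipping rows by offset.
-- 2000000 is a placeholder for the last id shown on the previous page.
SELECT *
FROM `user`
WHERE id > 2000000
ORDER BY id
LIMIT 10;
```

The keyset form only supports moving to the next or previous page; it cannot jump to an arbitrary page number, which is the trade-off for not scanning the skipped rows.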

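IV. Conclusion

I set out to check whether the pagination optimizations found online hold up, but the results differed from what I expected. Readers with a different take are very welcome to share it. The author's WeChat QR code is available on the official account.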
