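Our customer-service platform had performance problems with queries in production. To solve, or at least mitigate, the problem, we added the necessary indexes and also partitioned the table.

This post covers partitioning an existing table with alter table xxx. Creating the table with create table xxx partition by range(abc) works as well. I verified and tested both approaches, and both work. Here I focus on the alter approach,

mainly because I ran into a few small problems during the alter that are worth recording for future reference.

show create table gives the structure of our chat_message_history table: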

Table    Create Table
chat_message_history CREATE TABLE `chat_message_history` (
`id` int() NOT NULL AUTO_INCREMENT,
`visitor_id` varchar() DEFAULT NULL,
`visitor_name` varchar() DEFAULT NULL,
`contentBlob` blob,
`sender` varchar() DEFAULT NULL,
`message_time` datetime DEFAULT NULL COMMENT '消息发送时间',
`jobId` varchar() DEFAULT NULL,
`robot_response` varchar() DEFAULT NULL COMMENT '机器人的回复消息',
`skill_group_id` varchar() DEFAULT NULL,
`type` varchar() DEFAULT NULL,
`new_skill_group_id` varchar() DEFAULT NULL,
`channel` varchar() DEFAULT 'WXAPP' COMMENT '渠道',
`message_id` varchar() DEFAULT NULL,
`sessionId` varchar() DEFAULT NULL,
`message_status` varchar() DEFAULT NULL,
`error_message` varchar() DEFAULT NULL,
`businessType` varchar() DEFAULT NULL COMMENT '1-欢迎语;',
`pFlag` varchar() DEFAULT NULL COMMENT '消息的产品属性 1-微医保 2-微重疾',
PRIMARY KEY (`id`),
KEY `IDX_jobId` (`jobId`),
KEY `idx_his_vis_ctdesc_key` (`visitor_id`,`skill_group_id`,`message_time`)
) DEFAULT CHARSET=utf8

Next, add the partitions with alter table. The table is partitioned by message time, roughly one partition per month:

alter table chat_message_history partition by range(to_days(message_time)) (
partition p201708 values less than (to_days('2017-08-31')),
partition p201709 values less than (to_days('2017-09-30')),
partition p201710 values less than (to_days('2017-10-31')),
partition p201711 values less than (to_days('2017-11-30')),
partition p201712 values less than (to_days('2017-12-31')),
partition p201801 values less than (to_days('2018-01-31')),
partition p201802 values less than (to_days('2018-02-30')),
partition p201803 values less than (to_days('2018-03-31')),
partition p201804 values less than (to_days('2018-04-30')),
partition p201805 values less than (to_days('2018-05-31')),
partition p201806 values less than (to_days('2018-06-30')),
partition p201807 values less than (to_days('2018-07-31')),
partition p201808 values less than (to_days('2018-08-31')),
partition p201809 values less than (to_days('2018-09-30')),
partition p201810 values less than (to_days('2018-10-31')),
partition p201811 values less than (to_days('2018-11-30')),
partition p201812 values less than (to_days('2018-12-31')),
partition p201901 values less than (to_days('2019-01-31')),
partition p201902 values less than (to_days('2019-02-30')),
partition p201903 values less than (to_days('2019-03-31')),
partition p201904 values less than (to_days('2019-04-30')),
partition p201905 values less than (to_days('2019-05-31')),
partition p201906 values less than (to_days('2019-06-30')),
partition p201907 values less than (to_days('2019-07-31')),
partition p201908 values less than (to_days('2019-08-31')),
partition p201909 values less than (to_days('2019-09-30')),
partition p201910 values less than (to_days('2019-10-31')),
partition p201911 values less than (to_days('2019-11-30')),
partition p201912 values less than (to_days('2019-12-31')),
partition p202001 values less than (to_days('2020-01-31')),
partition p202002 values less than (to_days('2020-02-30')),
partition p202003 values less than (to_days('2020-03-31')),
partition p202004 values less than (to_days('2020-04-30')),
partition p202005 values less than (to_days('2020-05-31')),
partition p202006 values less than (to_days('2020-06-30')),
partition p202007 values less than (to_days('2020-07-31')),
partition p202008 values less than (to_days('2020-08-31')),
partition p202009 values less than (to_days('2020-09-30')),
partition p202010 values less than (to_days('2020-10-31')),
partition p202011 values less than (to_days('2020-11-30')),
partition p202012 values less than (to_days('2020-12-31')),
PARTITION p202XYZ VALUES LESS THAN (MAXVALUE));

Running the SQL above fails with:

ERROR  (HY000): Not allowed to use NULL value in VALUES LESS THAN

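Looking closely, none of the LESS THAN clauses contains a NULL. Every boundary is the day number computed from an explicit year-month-day date. So I took a closer look at the to_days(expr) function.

Official documentation: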

TO_DAYS(date)

Given a date date, returns a day number (the number of days since year 0).

I suspected that the February dates I had written for each year were invalid, so I checked:

mysql> select TO_DAYS('2017-02-30');
+-----------------------+
| TO_DAYS('2017-02-30') |
+-----------------------+
|                  NULL |
+-----------------------+
1 row in set, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning |      | Incorrect datetime value: '2017-02-30' |
+---------+------+----------------------------------------+
1 row in set (0.00 sec)
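
The same check can be run on the boundary literals before they ever reach the database. The following is a minimal sketch, assuming the boundaries are plain 'YYYY-MM-DD' strings. The helper name invalid_dates is hypothetical, and Python's date.fromisoformat is only a stand-in for MySQL's own date validation, which has additional modes (zero dates, for example) that are not modelled here.

```python
from datetime import date

def invalid_dates(literals):
    """Return the literals that are not real calendar dates."""
    bad = []
    for text in literals:
        try:
            date.fromisoformat(text)  # raises ValueError for e.g. February 30
        except ValueError:
            bad.append(text)
    return bad

# A few of the boundaries from the failing statement (illustrative subset).
boundaries = ["2018-01-31", "2018-02-30", "2019-02-30", "2020-02-30", "2020-03-31"]
print(invalid_dates(boundaries))  # → ['2018-02-30', '2019-02-30', '2020-02-30']
```

Any literal reported here would make to_days() return NULL, which is what the VALUES LESS THAN clause rejects.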

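Based on the error above, I adjusted the partitioning SQL so that each boundary is the first day of the following month, a date that always exists. Because VALUES LESS THAN is exclusive, this also keeps the last day of each month in the partition named after that month: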

alter table chat_message_history partition by range(to_days(message_time)) (
partition p201708 values less than (to_days('2017-09-01')),
partition p201709 values less than (to_days('2017-10-01')),
partition p201710 values less than (to_days('2017-11-01')),
partition p201711 values less than (to_days('2017-12-01')),
partition p201712 values less than (to_days('2018-01-01')),
partition p201801 values less than (to_days('2018-02-01')),
partition p201802 values less than (to_days('2018-03-01')),
partition p201803 values less than (to_days('2018-04-01')),
partition p201804 values less than (to_days('2018-05-01')),
partition p201805 values less than (to_days('2018-06-01')),
partition p201806 values less than (to_days('2018-07-01')),
partition p201807 values less than (to_days('2018-08-01')),
partition p201808 values less than (to_days('2018-09-01')),
partition p201809 values less than (to_days('2018-10-01')),
partition p201810 values less than (to_days('2018-11-01')),
partition p201811 values less than (to_days('2018-12-01')),
partition p201812 values less than (to_days('2019-01-01')),
partition p201901 values less than (to_days('2019-02-01')),
partition p201902 values less than (to_days('2019-03-01')),
partition p201903 values less than (to_days('2019-04-01')),
partition p201904 values less than (to_days('2019-05-01')),
partition p201905 values less than (to_days('2019-06-01')),
partition p201906 values less than (to_days('2019-07-01')),
partition p201907 values less than (to_days('2019-08-01')),
partition p201908 values less than (to_days('2019-09-01')),
partition p201909 values less than (to_days('2019-10-01')),
partition p201910 values less than (to_days('2019-11-01')),
partition p201911 values less than (to_days('2019-12-01')),
partition p201912 values less than (to_days('2020-01-01')),
partition p202001 values less than (to_days('2020-02-01')),
partition p202002 values less than (to_days('2020-03-01')),
partition p202003 values less than (to_days('2020-04-01')),
partition p202004 values less than (to_days('2020-05-01')),
partition p202005 values less than (to_days('2020-06-01')),
partition p202006 values less than (to_days('2020-07-01')),
partition p202007 values less than (to_days('2020-08-01')),
partition p202008 values less than (to_days('2020-09-01')),
partition p202009 values less than (to_days('2020-10-01')),
partition p202010 values less than (to_days('2020-11-01')),
partition p202011 values less than (to_days('2020-12-01')),
partition p202012 values less than (to_days('2021-01-01')),
PARTITION p202XYZ VALUES LESS THAN (MAXVALUE));
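
Writing forty-odd boundaries by hand is how the invalid February dates slipped in, so the statement can also be generated. This is a sketch only, not the method used in this post: month_partitions and build_alter are hypothetical helper names, and the partition naming (pYYYYMM plus the p202XYZ catch-all) simply mirrors the statement above.

```python
from datetime import date

def month_partitions(start_year, start_month, end_year, end_month):
    """Yield (name, exclusive_upper_bound) for each month, both ends included."""
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        # The upper bound is the first day of the following month.
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        yield f"p{year}{month:02d}", date(next_year, next_month, 1)
        year, month = next_year, next_month

def build_alter(table, column, parts, catch_all="p202XYZ"):
    """Assemble the alter table statement as a string (nothing is executed)."""
    lines = [
        f"partition {name} values less than (to_days('{bound.isoformat()}'))"
        for name, bound in parts
    ]
    lines.append(f"PARTITION {catch_all} VALUES LESS THAN (MAXVALUE)")
    body = ",\n".join(lines)
    return f"alter table {table} partition by range(to_days({column})) (\n{body});"

print(build_alter("chat_message_history", "message_time",
                  month_partitions(2017, 8, 2020, 12)))
```

Since every bound is built with date(...), an impossible date raises an error in Python instead of silently becoming NULL in MySQL.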

This still fails:

ERROR  (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

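The error means that the primary key must include every column used in the partitioning function. I partition by message_time, so I combined message_time with the existing id primary key into a composite primary key. The SQL is as follows (drop the existing id primary key, then add the composite one):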

alter table chat_message_history drop primary key,add primary key (`id`,`message_time`); 
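
The rule behind this error applies to every unique key on the table, the primary key included: each one must contain all columns used by the partitioning expression. A small sketch of that check follows. The function name keys_blocking_partition and the dictionary layout are assumptions for illustration, not anything MySQL provides.

```python
def keys_blocking_partition(unique_keys, partition_columns):
    """Return the names of unique keys that miss a partitioning column."""
    needed = set(partition_columns)
    return [name for name, cols in unique_keys.items() if not needed <= set(cols)]

# Unique keys of chat_message_history before and after the change above.
before = {"PRIMARY": ["id"]}
after = {"PRIMARY": ["id", "message_time"]}

print(keys_blocking_partition(before, ["message_time"]))  # → ['PRIMARY']
print(keys_blocking_partition(after, ["message_time"]))   # → []
```

One side effect worth keeping in mind: a primary key on (id, message_time) enforces uniqueness of the pair rather than of id alone. With AUTO_INCREMENT ids this is unlikely to matter in practice.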

Run the partitioning SQL again:

mysql> alter table chat_message_history partition by range(to_days(message_time)) (
-> partition p201708 values less than (to_days('2017-09-01')),
-> partition p201709 values less than (to_days('2017-10-01')),
-> partition p201710 values less than (to_days('2017-11-01')),
-> partition p201711 values less than (to_days('2017-12-01')),
-> partition p201712 values less than (to_days('2018-01-01')),
-> partition p201801 values less than (to_days('2018-02-01')),
-> partition p201802 values less than (to_days('2018-03-01')),
-> partition p201803 values less than (to_days('2018-04-01')),
-> partition p201804 values less than (to_days('2018-05-01')),
-> partition p201805 values less than (to_days('2018-06-01')),
-> partition p201806 values less than (to_days('2018-07-01')),
-> partition p201807 values less than (to_days('2018-08-01')),
-> partition p201808 values less than (to_days('2018-09-01')),
-> partition p201809 values less than (to_days('2018-10-01')),
-> partition p201810 values less than (to_days('2018-11-01')),
-> partition p201811 values less than (to_days('2018-12-01')),
-> partition p201812 values less than (to_days('2019-01-01')),
-> partition p201901 values less than (to_days('2019-02-01')),
-> partition p201902 values less than (to_days('2019-03-01')),
-> partition p201903 values less than (to_days('2019-04-01')),
-> partition p201904 values less than (to_days('2019-05-01')),
-> partition p201905 values less than (to_days('2019-06-01')),
-> partition p201906 values less than (to_days('2019-07-01')),
-> partition p201907 values less than (to_days('2019-08-01')),
-> partition p201908 values less than (to_days('2019-09-01')),
-> partition p201909 values less than (to_days('2019-10-01')),
-> partition p201910 values less than (to_days('2019-11-01')),
-> partition p201911 values less than (to_days('2019-12-01')),
-> partition p201912 values less than (to_days('2020-01-01')),
-> partition p202001 values less than (to_days('2020-02-01')),
-> partition p202002 values less than (to_days('2020-03-01')),
-> partition p202003 values less than (to_days('2020-04-01')),
-> partition p202004 values less than (to_days('2020-05-01')),
-> partition p202005 values less than (to_days('2020-06-01')),
-> partition p202006 values less than (to_days('2020-07-01')),
-> partition p202007 values less than (to_days('2020-08-01')),
-> partition p202008 values less than (to_days('2020-09-01')),
-> partition p202009 values less than (to_days('2020-10-01')),
-> partition p202010 values less than (to_days('2020-11-01')),
-> partition p202011 values less than (to_days('2020-12-01')),
-> partition p202012 values less than (to_days('2021-01-01')),
-> PARTITION p202XYZ VALUES LESS THAN (MAXVALUE));
Query OK, rows affected (1.28 sec)
Records: Duplicates: Warnings:

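This time it succeeded. What a hassle!

Now to verify that the partitions actually help, by comparing before and after. First, the query against the table without partitions: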

mysql> explain select * from chat_message_history where message_time > '2017-12-01' and message_time < '2018-01-01';
+----+-------------+----------------------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table                | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+----------------------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | chat_message_history | ALL  | NULL          | NULL | NULL    | NULL | 5103176 | Using where |
+----+-------------+----------------------+------+---------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)

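The estimated number of rows scanned is 5,103,176. The table holds about 5.3 million rows in total, so this query scans about 5.1 million of them, which is nearly the whole table.

And after partitioning? Here is the same query: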

mysql> explain select * from chat_message_history where message_time > '2017-12-01' and message_time < '2018-01-01';
+----+-------------+----------------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table                | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+----------------------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | chat_message_history | ALL  | NULL          | NULL | NULL    | NULL | 829848 | Using where |
+----+-------------+----------------------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

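This time the estimated number of rows scanned drops to a little over 800,000, far fewer than before.

Judging from this exercise, partitioning makes a clear difference to the number of rows a query has to scan.

To summarize:

1. Once MySQL tables reach several million rows, multi-table join queries become extremely unstable. This is what actually happened on our production system: twice within a few days, queries exhausted the database connections. This time all 600 connections were occupied and the system became unavailable.

2. With a large data volume, partitioning or adding indexes can ease the immediate problem. But if the amount of data a query can touch is not limited, slow responses will eventually return as the data keeps growing. So I recommend splitting the data, that is, splitting tables, according to the business scenario or requirements, to keep the system highly available.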
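
A rough way to picture why the estimate drops: with range partitions, only the partitions whose range overlaps the message_time condition need to be read. The sketch below is a simplified model of that idea, not MySQL's actual pruning logic, which may read additional partitions. The function name partitions_for_range is hypothetical, and only a few of the partitions are listed.

```python
from bisect import bisect_right
from datetime import date

def partitions_for_range(bounds, names, first_day, last_day):
    """Names of partitions that can hold dates in [first_day, last_day].

    bounds[i] is the exclusive upper bound of names[i]; the last name is the
    MAXVALUE catch-all and has no entry in bounds.
    """
    lo = bisect_right(bounds, first_day)  # partition holding first_day
    hi = bisect_right(bounds, last_day)   # partition holding last_day
    return names[lo:hi + 1]

# The first few boundaries from the partitioning statement.
bounds = [date(2017, 9, 1), date(2017, 10, 1), date(2017, 11, 1),
          date(2017, 12, 1), date(2018, 1, 1)]
names = ["p201708", "p201709", "p201710", "p201711", "p201712", "p202XYZ"]

# message_time > '2017-12-01' and message_time < '2018-01-01'
print(partitions_for_range(bounds, names, date(2017, 12, 1), date(2017, 12, 31)))
# → ['p201712']
```

On MySQL 5.x, explain partitions shows which partitions the server itself decides to read for a given query.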
