MySQL数据库分区操作【RANGE】
客服平台,线上查询存在性能问题,为了解决或者说是缓解这个问题,除了加必要的索引,另外就是将表进行分区。
这里主要是针对既有的表进行分区,采用的是alter table xxx的方式,当然,也可以采用create table xxx partition by range(abc)的方式,都是可以的。两种方式,都验证和测试过,都可行!这里主要介绍alter的方式!
主要是因为alter的过程,遇到一点小小的问题,以备后查。
通过show create table 的方式查看我们的chat_message_history表,结构如下:
Table Create Table
chat_message_history CREATE TABLE `chat_message_history` (
`id` int() NOT NULL AUTO_INCREMENT,
`visitor_id` varchar() DEFAULT NULL,
`visitor_name` varchar() DEFAULT NULL,
`contentBlob` blob,
`sender` varchar() DEFAULT NULL,
`message_time` datetime DEFAULT NULL COMMENT '消息发送时间',
`jobId` varchar() DEFAULT NULL,
`robot_response` varchar() DEFAULT NULL COMMENT '机器人的回复消息',
`skill_group_id` varchar() DEFAULT NULL,
`type` varchar() DEFAULT NULL,
`new_skill_group_id` varchar() DEFAULT NULL,
`channel` varchar() DEFAULT 'WXAPP' COMMENT '渠道',
`message_id` varchar() DEFAULT NULL,
`sessionId` varchar() DEFAULT NULL,
`message_status` varchar() DEFAULT NULL,
`error_message` varchar() DEFAULT NULL,
`businessType` varchar() DEFAULT NULL COMMENT '1-欢迎语;',
`pFlag` varchar() DEFAULT NULL COMMENT '消息的产品属性 1-微医保 2-微重疾',
PRIMARY KEY (`id`),
KEY `IDX_jobId` (`jobId`),
KEY `idx_his_vis_ctdesc_key` (`visitor_id`,`skill_group_id`,`message_time`)
) DEFAULT CHARSET=utf8
然后就是alter table的方式添加分区,分区按照消息时间,大体是每个月一个分区:
alter table chat_message_history partition by range(to_days(message_time)) (
partition p201708 values less than (to_days('2017-08-31')),
partition p201709 values less than (to_days('2017-09-30')),
partition p201710 values less than (to_days('2017-10-31')),
partition p201711 values less than (to_days('2017-11-30')),
partition p201712 values less than (to_days('2017-12-31')),
partition p201801 values less than (to_days('2018-01-31')),
partition p201802 values less than (to_days('2018-02-30')),
partition p201803 values less than (to_days('2018-03-31')),
partition p201804 values less than (to_days('2018-04-30')),
partition p201805 values less than (to_days('2018-05-31')),
partition p201806 values less than (to_days('2018-06-30')),
partition p201807 values less than (to_days('2018-07-31')),
partition p201808 values less than (to_days('2018-08-31')),
partition p201809 values less than (to_days('2018-09-30')),
partition p201810 values less than (to_days('2018-10-31')),
partition p201811 values less than (to_days('2018-11-30')),
partition p201812 values less than (to_days('2018-12-31')),
partition p201901 values less than (to_days('2019-01-31')),
partition p201902 values less than (to_days('2019-02-30')),
partition p201903 values less than (to_days('2019-03-31')),
partition p201904 values less than (to_days('2019-04-30')),
partition p201905 values less than (to_days('2019-05-31')),
partition p201906 values less than (to_days('2019-06-30')),
partition p201907 values less than (to_days('2019-07-31')),
partition p201908 values less than (to_days('2019-08-31')),
partition p201909 values less than (to_days('2019-09-30')),
partition p201910 values less than (to_days('2019-10-31')),
partition p201911 values less than (to_days('2019-11-30')),
partition p201912 values less than (to_days('2019-12-31')),
partition p202001 values less than (to_days('2020-01-31')),
partition p202002 values less than (to_days('2020-02-30')),
partition p202003 values less than (to_days('2020-03-31')),
partition p202004 values less than (to_days('2020-04-30')),
partition p202005 values less than (to_days('2020-05-31')),
partition p202006 values less than (to_days('2020-06-30')),
partition p202007 values less than (to_days('2020-07-31')),
partition p202008 values less than (to_days('2020-08-31')),
partition p202009 values less than (to_days('2020-09-30')),
partition p202010 values less than (to_days('2020-10-31')),
partition p202011 values less than (to_days('2020-11-30')),
partition p202012 values less than (to_days('2020-12-31')),
PARTITION p202XYZ VALUES LESS THAN (MAXVALUE));
上述SQL执行报错:
ERROR (HY000): Not allowed to use NULL value in VALUES LESS THAN
仔细查看,上述的LESS THAN后面没有NULL的值啊,都是写的很明确的年月日进行获取天数来得到分界线的啊。。。 最后研究下to_days(expr)函数,
官方文档:
TO_DAYS(date) Given a date date, returns a day number (the number of days since year 0).
我怀疑是因为我给定的每年的2月份的年月日信息不合法,验证一下:
mysql> select TO_DAYS('2017-02-30');
+-----------------------+
| TO_DAYS('2017-02-30') |
+-----------------------+
| NULL |
+-----------------------+
row in set, warning (0.00 sec)
mysql>
mysql> show warnings;
+---------+------+----------------------------------------+
| Level | Code | Message |
+---------+------+----------------------------------------+
| Warning | | Incorrect datetime value: '2017-02-30' |
+---------+------+----------------------------------------+
row in set (0.00 sec)
结合上述错误提示,将分区SQL语句调整一下如下:
alter table chat_message_history partition by range(to_days(message_time)) (
partition p201708 values less than (to_days('2017-09-01')),
partition p201709 values less than (to_days('2017-10-01')),
partition p201710 values less than (to_days('2017-11-01')),
partition p201711 values less than (to_days('2017-12-01')),
partition p201712 values less than (to_days('2018-01-01')),
partition p201801 values less than (to_days('2018-02-01')),
partition p201802 values less than (to_days('2018-03-01')),
partition p201803 values less than (to_days('2018-04-01')),
partition p201804 values less than (to_days('2018-05-01')),
partition p201805 values less than (to_days('2018-06-01')),
partition p201806 values less than (to_days('2018-07-01')),
partition p201807 values less than (to_days('2018-08-01')),
partition p201808 values less than (to_days('2018-09-01')),
partition p201809 values less than (to_days('2018-10-01')),
partition p201810 values less than (to_days('2018-11-01')),
partition p201811 values less than (to_days('2018-12-01')),
partition p201812 values less than (to_days('2019-01-01')),
partition p201901 values less than (to_days('2019-02-01')),
partition p201902 values less than (to_days('2019-03-01')),
partition p201903 values less than (to_days('2019-04-01')),
partition p201904 values less than (to_days('2019-05-01')),
partition p201905 values less than (to_days('2019-06-01')),
partition p201906 values less than (to_days('2019-07-01')),
partition p201907 values less than (to_days('2019-08-01')),
partition p201908 values less than (to_days('2019-09-01')),
partition p201909 values less than (to_days('2019-10-01')),
partition p201910 values less than (to_days('2019-11-01')),
partition p201911 values less than (to_days('2019-12-01')),
partition p201912 values less than (to_days('2020-01-01')),
partition p202001 values less than (to_days('2020-02-01')),
partition p202002 values less than (to_days('2020-03-01')),
partition p202003 values less than (to_days('2020-04-01')),
partition p202004 values less than (to_days('2020-05-01')),
partition p202005 values less than (to_days('2020-06-01')),
partition p202006 values less than (to_days('2020-07-01')),
partition p202007 values less than (to_days('2020-08-01')),
partition p202008 values less than (to_days('2020-09-01')),
partition p202009 values less than (to_days('2020-10-01')),
partition p202010 values less than (to_days('2020-11-01')),
partition p202011 values less than (to_days('2020-12-01')),
partition p202012 values less than (to_days('2021-01-01')),
PARTITION p202XYZ VALUES LESS THAN (MAXVALUE));
执行后还是报错:
ERROR (HY000): A PRIMARY KEY must include all columns in the table's partitioning function
这个错误是说,分区函数里面,主键必须包含所有的用于建立分区的列。我这里分区,是按照message_time进行分区,所以,这里将message_time和既有的id主键建立联合主键。SQL如下(先删除既有的id主键,再建联合主键):
alter table chat_message_history drop primary key,add primary key (`id`,`message_time`);
再次执行创建分区的SQL:
mysql> alter table chat_message_history partition by range(to_days(message_time)) (
-> partition p201708 values less than (to_days('2017-09-01')),
-> partition p201709 values less than (to_days('2017-10-01')),
-> partition p201710 values less than (to_days('2017-11-01')),
-> partition p201711 values less than (to_days('2017-12-01')),
-> partition p201712 values less than (to_days('2018-01-01')),
-> partition p201801 values less than (to_days('2018-02-01')),
-> partition p201802 values less than (to_days('2018-03-01')),
-> partition p201803 values less than (to_days('2018-04-01')),
-> partition p201804 values less than (to_days('2018-05-01')),
-> partition p201805 values less than (to_days('2018-06-01')),
-> partition p201806 values less than (to_days('2018-07-01')),
-> partition p201807 values less than (to_days('2018-08-01')),
-> partition p201808 values less than (to_days('2018-09-01')),
-> partition p201809 values less than (to_days('2018-10-01')),
-> partition p201810 values less than (to_days('2018-11-01')),
-> partition p201811 values less than (to_days('2018-12-01')),
-> partition p201812 values less than (to_days('2019-01-01')),
-> partition p201901 values less than (to_days('2019-02-01')),
-> partition p201902 values less than (to_days('2019-03-01')),
-> partition p201903 values less than (to_days('2019-04-01')),
-> partition p201904 values less than (to_days('2019-05-01')),
-> partition p201905 values less than (to_days('2019-06-01')),
-> partition p201906 values less than (to_days('2019-07-01')),
-> partition p201907 values less than (to_days('2019-08-01')),
-> partition p201908 values less than (to_days('2019-09-01')),
-> partition p201909 values less than (to_days('2019-10-01')),
-> partition p201910 values less than (to_days('2019-11-01')),
-> partition p201911 values less than (to_days('2019-12-01')),
-> partition p201912 values less than (to_days('2020-01-01')),
-> partition p202001 values less than (to_days('2021-02-01')),
-> partition p202002 values less than (to_days('2020-03-01')),
-> partition p202003 values less than (to_days('2020-04-01')),
-> partition p202004 values less than (to_days('2020-05-01')),
-> partition p202005 values less than (to_days('2020-06-01')),
-> partition p202006 values less than (to_days('2020-07-01')),
-> partition p202007 values less than (to_days('2020-08-01')),
-> partition p202008 values less than (to_days('2020-09-01')),
-> partition p202009 values less than (to_days('2020-10-01')),
-> partition p202010 values less than (to_days('2020-11-01')),
-> partition p202011 values less than (to_days('2020-12-01')),
-> partition p202012 values less than (to_days('2021-01-01')),
-> PARTITION p202XYZ VALUES LESS THAN (MAXVALUE));
Query OK, rows affected (1.28 sec)
Records: Duplicates: Warnings:
这回成功了,真是折腾!!!
现在就要来验证一下,我们的分区是否起到作用了。主要是进行对比呗,先看没有建立分区的SQL查询:
mysql> explain select * from chat_message_history where message_time > '2017-12-01' and message_time < '2018-01-01';
+----+-------------+----------------------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------+------+---------------+------+---------+------+---------+-------------+
| | SIMPLE | chat_message_history | ALL | NULL | NULL | NULL | NULL | 5103176 | Using where |
+----+-------------+----------------------+------+---------------+------+---------+------+---------+-------------+
row in set (0.00 sec)
涉及到表扫描行数是5103176,这个表一共530W行记录,这里就扫描了510W行,够可以的。。。
那么,加了分区后呢?请看下面的SQL查询:
mysql> explain select * from chat_message_history where message_time > '2017-12-01' and message_time < '2018-01-01';
+----+-------------+----------------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------+------+---------------+------+---------+------+--------+-------------+
| | SIMPLE | chat_message_history | ALL | NULL | NULL | NULL | NULL | 829848 | Using where |
+----+-------------+----------------------+------+---------------+------+---------+------+--------+-------------+
row in set (0.00 sec)
这回查询扫描的行数,就变成了80多万行了,少了不少啊!
从这次分区看,分区查询和不分区查询,影响到的扫描行数还是挺明显的。
总结一下:
1,MySQL数据量达到几百万后,多表联合查询时,性能极其不稳定,这个是我们线上系统的真实写照,几天内,两次查询导致数据库连接数耗尽,这次600个连接,全部占用,导致系统不可用!
2,数据量大了,采用分区,或者加索引,可以缓解眼前的问题,但是,随着时间推移,若查询数据量不做限制,最终还是会出现查询响应非常慢的问题。所以,建议采用数据分割或者说是表拆分的方式,基于一定的业务场景或者需要进行,可以保证系统的高可用性。
MySQL数据库分区操作【RANGE】的更多相关文章
- (转)运维角度浅谈MySQL数据库优化
转自:http://lizhenliang.blog.51cto.com/7876557/1657465 一个成熟的数据库架构并不是一开始设计就具备高可用.高伸缩等特性的,它是随着用户量的增加,基础架 ...
- MySQL数据库在linux的安装,编程与操作
一.安装 ubuntu上安装MySQL非常简单只需要几条命令就可以完成. 1. sudo apt-get install mysql-server 2. apt-get isntall mysql ...
- 配合crond服务实现自定义周期备份MySQL数据库(使用innobackupex进行备份)
备份 新建一个脚本/root/backup.py,内容如下: #!/usr/bin/env python# -*- coding: utf-8 -*- ''' 脚本作者:昨夜星辰 脚本作用:配合cro ...
- MySQL数据库之数据类型BOOL/BOOLEAN与TINYINT测试总结
From: http://database.51cto.com/art/201203/323863.htm 网络上很多人咨询mysql是否提供布尔类型?MySQL数据库确实提供布尔类型,此数据类型的关 ...
- python2.7爬取豆瓣电影top250并写入到TXT,Excel,MySQL数据库
python2.7爬取豆瓣电影top250并分别写入到TXT,Excel,MySQL数据库 1.任务 爬取豆瓣电影top250 以txt文件保存 以Excel文档保存 将数据录入数据库 2.分析 电影 ...
- 利用python2.7正则表达式进行豆瓣电影Top250的网络数据采集及MySQL数据库操作
转载请注明出处 利用python2.7正则表达式进行豆瓣电影Top250的网络数据采集 1.任务 采集豆瓣电影名称.链接.评分.导演.演员.年份.国家.评论人数.简评等信息 将以上数据存入MySQL数 ...
- python【第十二篇下】操作MySQL数据库以及ORM之 sqlalchemy
内容一览: 1.Python操作MySQL数据库 2.ORM sqlalchemy学习 1.Python操作MySQL数据库 2. ORM sqlachemy 2.1 ORM简介 对象关系映射(英语: ...
- 用python3.x与mysql数据库构建简单的爬虫系统(转)
这是在博客园的第一篇文章,由于本人还是一个编程菜鸟,也写不出那些高大上的牛逼文章,这篇文章就是对自己这段时间学习python的一个总结吧. 众所周知python是一门对初学编程的人相当友好的编程语言, ...
- 第 8 章 MySQL 数据库 Query 的优化
前言: 在之前“影响 MySQL 应用系统性能的相关因素”一章中我们就已经分析过了Query语句对数据库性能的影响非常大,所以本章将专门针对 MySQL 的 Query 语句的优化进行相应的分析. ...
随机推荐
- Centos7 使用Docker搭建Oracle测试环境
1.更新yum yum update 2.安装Docker yum install docker 安装完成后查看Docker的版本: docker version 查看Docker的信息: docke ...
- log4j下载地址及日志文件输入位置配置
ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the clas ...
- Ubuntu 录制视频并制作成gif图
最近在github中添加项目实现效果时需要用到gif图,在网上搜了一些工具和教程,会发现windows系统比ubuntu多很多,经过实际验证以下这种方法可以实现gif图的录制与制作,具体方法如下: ...
- pandas的聚合操作: groupyby与agg
pandas提供基于行和列的聚合操作,groupby可理解为是基于行的,agg则是基于列的 从实现上看,groupby返回的是一个DataFrameGroupBy结构,这个结构必须调用聚合函数(如su ...
- flask中的模型
1.什么是模型 模型,是根据数据库中表的结构而创建出来的class.每一张表对应到编程语言中,就是一个class表中的每一个列对应到编程语言中就class中的一个属性. 2.ORM的三大特征 1.数据 ...
- hdu--1029 编程之美 在数组a中 (元素个数n n是奇数)找一个数字 它出现的次数大于(n+1)/2
我为什么总是犯这些愚蠢错误啊,还是自己逻辑不够严谨. 努力ing...... #include <iostream> #include <cstdio> #include &l ...
- 10 HashMap,Map.Entry,LinkedHashMap,TreeMap,Hashtable,Collections类
Map集合的功能概述 添加功能 * V put(K key,V value):添加元素. * 如果键是第一次存储,就直接存储元素,返回null * 如果键不 ...
- python 类编程相关内容(更新)
python作为面向对象的编程语言,类和对象相关的编程当然是少不了的! python类: class 类名 : 变量名 [ = 初始值 ] …… def 函数名 ( self [ , 其余参数列表 ] ...
- HTTP与TCP的区别和联系--转载
相信不少初学手机联网开发的朋友都想知道Http与Socket连接究竟有什么区别,希望通过自己的浅显理解能对初学者有所帮助. 一.基本概念 1.TCP连接 手机能够使用联网功能是因为手机底层实现了TCP ...
- doubleclick video notes
1,vast duration it must math this format ,if use “00:00:7 ” it will tip “ ” <Duration>00:00:0 ...