Background
The table holds 1.28 million rows in total.
SELECT cpe_id, COUNT(*) restarts
FROM business_log
WHERE operate_time>='2012-12-05 00:00:00' AND operate_time<'2018-01-05 00:00:00' AND operate_type=3 AND result=0
GROUP BY cpe_id
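Attempts to optimize the original SQL statement did not speed up the aggregation to a satisfactory degree. Running the filter query on its own (without GROUP BY and the COUNT function) showed that it returns only 6,655 rows and takes 0.825 s; once the aggregation is added, the time jumps to 3 s.
Principle
MySQL's explanation:
The section of the MySQL documentation on optimizing GROUP BY states: "The most general way to satisfy a GROUP BY clause is to scan the whole table and create a new temporary table where all rows from each group are consecutive, and then use this temporary table to discover groups and apply aggregate functions (if any)." In other words, GROUP BY scans the whole table, creates a temporary table in which the rows are stored by group, and then applies the aggregate functions to the groups in that temporary table. The question therefore becomes: when GROUP BY and WHERE appear in the same statement, does "the whole table" mean the physical table or the set of rows left after the WHERE filter? The following rewrite tests this by moving the filter into a derived table: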
SELECT cpe_id, COUNT(*) restarts
FROM (
SELECT cpe_id
FROM business_log
WHERE operate_time>='2012-12-05 00:00:00' AND operate_time<'2018-01-05 00:00:00' AND operate_type=3 AND result=0
) t
GROUP BY cpe_id
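As the statement above shows, the filter query is wrapped in an outer aggregation query. Execution time: 0.851 s, a substantial reduction compared with the original 3 s.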
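One way to check where the time goes is to compare the execution plans of the two forms with EXPLAIN. The sketch below reuses the table and column names from the queries above; what the plans actually contain depends on the indexes on business_log and on the MySQL version, neither of which is stated in this post, so no expected output is shown.

```sql
-- Plan for the original single-statement form.
EXPLAIN
SELECT cpe_id, COUNT(*) restarts
FROM business_log
WHERE operate_time >= '2012-12-05 00:00:00'
  AND operate_time <  '2018-01-05 00:00:00'
  AND operate_type = 3
  AND result = 0
GROUP BY cpe_id;

-- Plan for the wrapped form: filter in a derived table, aggregate outside.
EXPLAIN
SELECT cpe_id, COUNT(*) restarts
FROM (
    SELECT cpe_id
    FROM business_log
    WHERE operate_time >= '2012-12-05 00:00:00'
      AND operate_time <  '2018-01-05 00:00:00'
      AND operate_type = 3
      AND result = 0
) t
GROUP BY cpe_id;
```

In the output, the key and rows columns show which index is used and how many rows MySQL expects to examine, and "Using temporary" or "Using filesort" in the Extra column indicates that the grouping is done through a temporary table or a sort. On MySQL 5.7 and later the optimizer may merge a simple derived table like this one into the outer query, in which case the two plans can turn out identical and the timing difference reported here may not reproduce.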
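Conclusion
When aggregating a large amount of data with GROUP BY, separate the filtering from the aggregation, and then optimize the filter query.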
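Optimizing the filter query usually comes down to indexing it. The post does not say which indexes exist on business_log, so the following is only a sketch under that assumption: the index name is hypothetical, and the column order is a common rule of thumb (equality columns first, then the range column), not something taken from the original.

```sql
-- Hypothetical composite index for the filter used above.
-- operate_type and result are compared with "=", so they come first;
-- operate_time is a range condition, so it follows them;
-- cpe_id is appended so the query can be answered from the index alone.
CREATE INDEX idx_business_log_restart
    ON business_log (operate_type, result, operate_time, cpe_id);
```

Because operate_time is a range condition, this index cannot also deliver the rows ordered by cpe_id, so the GROUP BY step is still likely to need a temporary table or a sort; the gain would come from reading only the rows that match the filter. Running EXPLAIN again after creating the index is the way to confirm whether it is actually chosen.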
In a real business case
Original query
SELECT SUM(a.reportcount)reportcount,a.OrganizationId,a.organizationcode,a.organizationname,org.UpperComCode,org.organizationlevel FROM (SELECT COUNT(1) ReportCount, o.OrganizationCode ,o.organizationname, o.id OrganizationId,o.organizationlevel FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode WHERE o.OrganizationCode in ('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000') and r.ReportTime>'2019/4/25 0:00:00' and r.ReportReasonSubmitCode in ('A10006','A10007') and r.ReportTime<'2019/5/20 0:00:00' and r.LossTime>'2019/4/11 0:00:00' and r.LossTime<'2019/5/6 0:00:00' and r.InsuranceType in (5,6,7,8) AND r.CatasCollectionId=361 GROUP BY o.OrganizationCode,o.organizationname, o.id,o.organizationlevel union all SELECT COUNT(1) ReportCount, o.OrganizationCode,o.organizationname, o.id OrganizationId,o.organizationlevel FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode WHERE o.UpperComCode IN('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000')and r.ReportTime>'2019/4/25 0:00:00' and r.ReportReasonSubmitCode in ('A10006','A10007') and r.ReportTime<'2019/5/20 0:00:00' and r.LossTime>'2019/4/11 0:00:00' and r.LossTime<'2019/5/6 0:00:00' and r.InsuranceType in (5,6,7,8) AND r.CatasCollectionId=361 GROUP BY o.OrganizationCode,o.organizationname, o.id ,o.organizationlevel union all SELECT COUNT(1) ReportCount , o.UpperComCode OrganizationCode,o.ParentOrgName OrganizationName, o.ParentOrgId OrganizationId,o.organizationlevel FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode WHERE o.UpperComCode IN(SELECT OrganizationCode FROM 
organization WHERE UpperComCode IN ('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000' )and r.ReportTime>'2019/4/25 0:00:00' and r.ReportReasonSubmitCode in ('A10006','A10007') and r.ReportTime<'2019/5/20 0:00:00' and r.LossTime>'2019/4/11 0:00:00' and r.LossTime<'2019/5/6 0:00:00' and r.InsuranceType in (5,6,7,8) AND r.CatasCollectionId=361) GROUP BY o.UpperComCode ,o.ParentOrgName, o.ParentOrgId,o.organizationlevel union all SELECT SUM(reportcount) reportcount , organization.UpperComCode OrganizationCode, organization.ParentOrgName OrganizationName,organization.ParentOrgId OrganizationId,organization.OrganizationLevel FROM ( SELECT COUNT(1) reportcount , o.UpperComCode newcode, o.organizationlevel FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode WHERE o.UpperComCode IN( SELECT OrganizationCode FROM organization WHERE UpperComCode IN(SELECT OrganizationCode FROM organization WHERE UpperComCode IN ('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000' )and r.ReportTime>'2019/4/25 0:00:00' and r.ReportReasonSubmitCode in ('A10006','A10007') and r.ReportTime<'2019/5/20 0:00:00' and r.LossTime>'2019/4/11 0:00:00' and r.LossTime<'2019/5/6 0:00:00' and r.InsuranceType in (5,6,7,8) AND r.CatasCollectionId=361 ) ) GROUP BY o.UpperComCode, o.organizationlevel ) a JOIN organization ON a.newcode = organization.OrganizationCode GROUP BY organization.UpperComCode , organization.ParentOrgName ,organization.ParentOrgId ,organization.OrganizationLevel ) a JOIN organization org ON a.organizationcode=org.organizationcode GROUP BY 
a.organizationcode,a.organizationname,org.UpperComCode,org.organizationlevel,a.OrganizationId ORDER BY OrganizationLevel
Optimized result
SELECT COUNT(1) AS reportcount,a.OrganizationId,a.organizationcode,a.organizationname,a.UpperComCode,a.organizationlevel FROM (
SELECT r.id AS reportid, o.OrganizationCode ,o.organizationname, o.id OrganizationId,o.organizationlevel ,o.UpperComCode
FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode
WHERE o.OrganizationCode IN ('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000') AND r.ReportTime>'2010/4/25 0:00:00' AND r.ReportReasonSubmitCode IN ('A10006','A10007') AND r.ReportTime<'2019/5/20 0:00:00' AND r.LossTime>'2010/4/11 0:00:00' AND r.LossTime<'2019/5/6 0:00:00' AND r.InsuranceType IN (5,6,7,8) AND r.CatasCollectionId=0
UNION ALL
SELECT r.id AS reportid , o.OrganizationCode,o.organizationname, o.id OrganizationId,o.organizationlevel ,o.UpperComCode
FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode
WHERE o.UpperComCode IN('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000')AND r.ReportTime>'2010/4/25 0:00:00' AND r.ReportReasonSubmitCode IN ('A10006','A10007') AND r.ReportTime<'2019/5/20 0:00:00' AND r.LossTime>'2010/4/11 0:00:00' AND r.LossTime<'2019/5/6 0:00:00' AND r.InsuranceType IN (5,6,7,8) AND r.CatasCollectionId=0
UNION ALL
SELECT r.id AS reportid , o.UpperComCode OrganizationCode,o.ParentOrgName OrganizationName, o.ParentOrgId OrganizationId,ou.organizationlevel ,ou.UpperComCode
FROM report r JOIN organization o ON r.OrganizationCode = o.OrganizationCode
JOIN organization ou ON o.UpperComCode=ou.OrganizationCode
WHERE o.UpperComCode IN
(SELECT OrganizationCode FROM organization WHERE UpperComCode IN
('44010000','44040000','44006600','44008800','44020000','44050000','44060000','44070000','44080000','44090000','44120000','44150000','44160000','44170000','44180000','44190000','44510000','44520000','44530000','44710000','44940000','44950000' )AND r.ReportTime>'2010/4/25 0:00:00' AND r.ReportReasonSubmitCode IN ('A10006','A10007') AND r.ReportTime<'2019/5/20 0:00:00' AND r.LossTime>'2010/4/11 0:00:00' AND r.LossTime<'2019/5/6 0:00:00' AND r.InsuranceType IN (5,6,7,8) AND r.CatasCollectionId=0)
) a
GROUP BY a.OrganizationCode,a.organizationname, a.OrganizationId,a.organizationlevel,a.UpperComCode
ORDER BY a.organizationlevel