sqladminon September 26, 2018

In a DBA’s day to day activities, we are doing Archive operation on our transnational database servers to improve your queries and control the Disk space. The archive is a most expensive operation since its involved a huge number of Read and Write will be performed. So its mandatory to run the archive queries in chunks. The archive is depended on business use. Many of us need a copy of the data on an archive database to refer later. To perform the archive we can just simply run the delete query with the limit. But we need to run the query again and again until the matched rows count is 0. We can create a procedure to do this in a while loop. I have created one such procedure to archive many tables.

Image Source: Brent Ozar Unlimited

Why Archive is an expensive operation?

Generally how we are arching the data is delete from table_name where column_name <= some_value; If you are running on a table which needs to be deleted around 15million records, then you need the undo log file to hold all of these records. There will be a heavy IO happening in the Disk. And lt’ll lock the rows and some other locks will be held until the Archive complete. Replication may delay because of this.

When Archive is going to mess up the production?

  • Running archive commands on a heavy traffic time.
  • Archive without a proper where clause.
  • Delete data without limit.
  • Performing archive contrition on a not indexed column.
  • Continuously run the delete query in chunks on a replication environment. {without sleep(1 or few seconds}.

How to perform the archive properly?

  • To do this, the first condition is use limit in the delete.
  • Create an index on the where clause.
  • At least do sleep 1sec for each chuck which will be good for a replication infra.
  • Set autocommit=1
  • Optional: Set transaction isolation to Read Committed.
  • Do not mention the number of loops without knowing the actual loop counts to process the complete delete.

My approach to this:

Inspired by Rick James’s Blog, I have prepared a single stored procedure to perform archive on multiple tables. We just need to pass the table name, date column and then date to archive. I have tested with datetime and Primary key column.

Archive a single table:

The below procedure will perform delete on table test and remove older than 10 days records.

use sqladmin;

DROP PROCEDURE
IF EXISTS archive;
delimiter //
CREATE PROCEDURE
archive()
begin
DECLARE rows INT;
DECLARE rows_deleted INT;
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET rows = 1;
SET rows_deleted = 10000;
WHILE rows > 0
do
SET autocommit=1;
DELETE
FROM test
WHERE dop < DATE(Date_sub(Now(), INTERVAL 10 day))
LIMIT 10000;
SET rows = row_count();
select sleep(1);
commit;
END WHILE;
END //
delimiter ;

Archive multiple tables:

This procedure will help you to archive multiple tables, you just need to pass the table name, column name and the date for the archive. I love to use this 

use sqladmin;

DROP PROCEDURE
IF EXISTS sqladmin_archive;
delimiter //
CREATE PROCEDURE
sqladmin_archive(IN archive_dbname varchar(100), IN archive_table varchar(100), IN archive_column varchar(100), IN archive_date varchar(100)) begin
DECLARE rows INT;
DECLARE rows_deleted INT;
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET rows = 1;
SET rows_deleted = 10000;
WHILE rows > 0
do
SET autocommit=1;
SET @query =CONCAT('DELETE FROM ',archive_dbname,'.',archive_table,' WHERE ',archive_column,' <= "',archive_date ,'" LIMIT 10000;');
PREPARE arcive_stmt FROM @query;
EXECUTE arcive_stmt;
SET rows = row_count();
SET rows = row_count();
select sleep(1);
commit;
DEALLOCATE PREPARE arcive_stmt;
END WHILE;
END //
delimiter ; -- Execute this procedure
CALL sqladmin_archive ('mydb','test_table','created_at','2018-09-12');

Take dump before archive with where clause:

This script is my favorite one, but this depends on the above stored procedure. This shell script will take the dump of the table with where clause of the date that we want to archive. You can customize this as per your requirement.

#!/bin/bash

# pass variables
archive_dbname=$1
archive_table=$2
archive_column=$3
days_to_archive=$4
archive_date="'"`date +'%Y-%m-%d' --date="-$days_to_archive day"`"'"
where_clause=$archive_column'<='$archive_date
dump_file=$archive_table_`date +'%Y-%m-%d' --date="-$days_to_archive day"`".sql" # Dump the table
echo "DUMP Starting for the table $archive_table ....."
mysqldump -u root $archive_dbname $archive_table --where=$where_clause > $dump_file
echo "DUMP Done......" # Archive the data
echo "Deleting the data on the table $archive_table ....."
mysql -u root sqladmin -e"CALL sqladmin_archive('$archive_dbname','$archive_table','$archive_column',$archive_date);"
echo "Deleting is Done ....."

Example Archive:

This example, Im going to archive a table called test. The column started_at contains the timestamp value. I want to remove older than 15 days data in the table. This table is located in the database name called sqladmin.

./archive_script.sh sqladmin test started_at 15

Archive MySQL Data In Chunks Using Stored Procedure的更多相关文章

  1. JDBC连接执行 MySQL 存储过程报权限错误:User does not have access to metadata required to determine stored procedure parameter types. If rights can not be granted,

    国内私募机构九鼎控股打造APP,来就送 20元现金领取地址:http://jdb.jiudingcapital.com/phone.html 内部邀请码:C8E245J (不写邀请码,没有现金送) 国 ...

  2. Net连接mysql的公共Helper类MySqlHelper.cs带MySql.Data.dll下载

    MySqlHelper.cs代码如下: using System; using System.Collections.Generic; using System.Linq; using System. ...

  3. How To Call Stored Procedure In Hibernate

    How To Call Stored Procedure In Hibernate In this tutorial, you will learn how to call a store proce ...

  4. [转]Mapping Stored Procedure Parameters in SSIS OLE DB Source Editor

    本文转自:http://geekswithblogs.net/stun/archive/2009/03/05/mapping-stored-procedure-parameters-in-ssis-o ...

  5. 关于Linux和Windows下部署mysql.data.dll的注册问题

    mysql ado.net connector下载地址: http://dev.mysql.com/downloads/connector/net/ 选择版本: Generally Available ...

  6. Stored Procedure 里的 WITH RECOMPILE 到底是干麻的?

    在 SQL Server 创建或修改「存储过程(stored procedure)」时,可加上 WITH RECOMPILE 选项,但多数文档或书籍都写得语焉不详,或只解释为「每次执行此存储过程时,都 ...

  7. [转]Dynamic SQL & Stored Procedure Usage in T-SQL

    转自:http://www.sqlusa.com/bestpractices/training/scripts/dynamicsql/ Dynamic SQL & Stored Procedu ...

  8. Retrieving Out Params From a Stored Procedure With Python

    http://www.rodneyoliver.com/blog/2013/08/08/retrieving-out-params-from-a-stored-procedure-with-pytho ...

  9. [转]Easy Stored Procedure Output Oracle Select

    本文转自:http://www.oraclealchemist.com/oracle/easy-stored-procedure-output/ I answered a question on a ...

随机推荐

  1. Maven 入门——认识 Maven

    Maven /ˈmāvən/ ,可以翻译成"专家",是一款来自 Apache 组织的开源项目,用于项目管理.主要服务于基于 Java 平台的项目构建.依赖管理和项目信息管理. 构建 ...

  2. Spring Actuator源码分析(转)

    转自:http://blog.csdn.net/wsscy2004/article/details/50166333 Actuator Endpoint Actuator模块通过Endpoint暴露一 ...

  3. 解压cpio.gz、zip类型文件

    aix上的oracle介质文件是10gr2_aix5l64_database.cpio.gz 解压方法: gunzip 10gr2_aix5l64_database.cpio.gz cpio -idm ...

  4. CRM项目图形交互界面设计

    由于我们组在刚开始的时候 ,进度比较快的!老师本来是打算最后给我们用统一的学校已经封装好的界面给我们的!看着我们的现实都写完了!老师就提前把界面都给我们了!但是觉得界面一般,不怎么好看!我们就全部都是 ...

  5. IIS 共享目录读写报错 Access to the path:“\\192.168.0.1\1.txt”is denied解决方案

    这个是IIS权限的问题,主要修改了以下地方,如果两台电脑有相同的用户名和密码可以跳过第一步 1.找到共享目录的文件夹,属性=>共享,给电脑创建一个新用户,共享文件下添加新用户的读写权限,然后对应 ...

  6. Netty 接受请求过程源码分析 (基于4.1.23)

    前言 在前文中,我们分析了服务器是如何启动的.而服务器启动后肯定是要接受客户端请求并返回客户端想要的信息的,否则要你服务器干啥子呢?所以,我们今天就分析分析 Netty 在启动之后是如何接受客户端请求 ...

  7. maven根据不同的运行环境,打包不同的配置文件

    使用maven管理项目中的依赖,非常的方便.同时利用maven内置的各种插件,在命令行模式下完成打包.部署等操作,可方便后期的持续集成使用. 但是每一个maven工程(比如web项目),开发人员在开发 ...

  8. 让 VS2010 支持 HTML5 和 CSS3.0

    现在的热门话题之一是HTML5 和 CSS3.好的, 它们都很时髦,它们也必然会影响网络开发的未来. 让我们尝尝鲜,花点时间安装设置一下,尽快让Visual Studio2010支持HTML5 和 C ...

  9. RESTful API入门

      RESTful是一种设计风格,并不是一种标准. 简短的去概括的话,就是:1.URL 定位资源 资源,就是数据.比如newsfeed,friends,order等 2.用 HTTP 动词描述操作. ...

  10. [android] 轮播图-无限循环

    实现无限循环 在getCount()方法中,返回一个很大的值,Integer.MAX_VALUE 在instantiateItem()方法中,获取当前View的索引时,进行取于操作,传递进来的int ...