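KingbaseES Examples: Importing Data into a Table with Many Indexes
Overview
How can you quickly insert a very large volume of data, say tens of millions to hundreds of millions of rows, into a table that carries indexes?
Data preparation
Prepare a table with twenty indexes.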
kingbase=# \d+ bigtab
Table "kingbase.bigtab"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+----------+--------------+-------------
id | integer | | | | plain | |
c01 | integer | | | | plain | |
c02 | integer | | | | plain | |
c03 | integer | | | | plain | |
c04 | integer | | | | plain | |
c05 | integer | | | | plain | |
c06 | integer | | | | plain | |
c07 | integer | | | | plain | |
c08 | integer | | | | plain | |
c09 | integer | | | | plain | |
c10 | integer | | | | plain | |
c11 | integer | | | | plain | |
c12 | integer | | | | plain | |
c13 | integer | | | | plain | |
c14 | integer | | | | plain | |
c15 | integer | | | | plain | |
c16 | integer | | | | plain | |
c17 | integer | | | | plain | |
c18 | integer | | | | plain | |
c19 | integer | | | | plain | |
c20 | integer | | | | plain | |
c21 | integer | | | | plain | |
c22 | integer | | | | plain | |
c23 | integer | | | | plain | |
c24 | integer | | | | plain | |
c25 | integer | | | | plain | |
c26 | integer | | | | plain | |
c27 | integer | | | | plain | |
c28 | integer | | | | plain | |
c29 | integer | | | | plain | |
t01 | text | | | | extended | |
t02 | text | | | | extended | |
t03 | text | | | | extended | |
t04 | text | | | | extended | |
t05 | text | | | | extended | |
t06 | text | | | | extended | |
t07 | text | | | | extended | |
t08 | text | | | | extended | |
t09 | text | | | | extended | |
t10 | text | | | | extended | |
t11 | text | | | | extended | |
t12 | text | | | | extended | |
t13 | text | | | | extended | |
t14 | text | | | | extended | |
t15 | text | | | | extended | |
t16 | text | | | | extended | |
t17 | text | | | | extended | |
t18 | text | | | | extended | |
t19 | text | | | | extended | |
t20 | text | | | | extended | |
Indexes:
"bigtab_i01" btree (c01)
"bigtab_i02" btree (c02)
"bigtab_i03" btree (c03)
"bigtab_i04" btree (c04)
"bigtab_i05" btree (c05)
"bigtab_i06" btree (c06)
"bigtab_i07" btree (c07)
"bigtab_i08" btree (c08)
"bigtab_i09" btree (c09)
"bigtab_i10" btree (c10)
"bigtab_i11" btree (c11)
"bigtab_i12" btree (c12)
"bigtab_i13" btree (c13)
"bigtab_i14" btree (c14)
"bigtab_i15" btree (c15)
"bigtab_i16" btree (c16)
"bigtab_i17" btree (c17)
"bigtab_i18" btree (c18)
"bigtab_i19" btree (c19)
"bigtab_i20" btree (c20)
Access method: heap
kingbase=#
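The article shows only the \d+ output, not the DDL behind it. A minimal sketch that would produce a table of the same shape, assuming the PostgreSQL-compatible functions string_agg, lpad and format are available in your KingbaseES mode (the block itself is hypothetical, not the author's script):

```sql
-- Hypothetical DDL reconstructing the shape shown by \d+ bigtab:
-- id, c01..c29 (integer), t01..t20 (text), and btree indexes on c01..c20.
do
$$
declare
    col_list text;
    i        int;
begin
    select 'id integer, '
        || (select string_agg('c' || lpad(n::text, 2, '0') || ' integer', ', ' order by n)
              from generate_series(1, 29) n)
        || ', '
        || (select string_agg('t' || lpad(n::text, 2, '0') || ' text', ', ' order by n)
              from generate_series(1, 20) n)
      into col_list;

    -- The table is created in the current schema (an assumption).
    execute 'create table bigtab (' || col_list || ')';

    -- One single-column btree index per column c01..c20.
    for i in 1 .. 20 loop
        execute format('create index %I on bigtab (%I)',
                       'bigtab_i' || lpad(i::text, 2, '0'),
                       'c' || lpad(i::text, 2, '0'));
    end loop;
end;
$$;
```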
Method 1: Insert the data directly and let the indexes be maintained automatically
kingbase=#
kingbase=# insert into bigtab
kingbase-# select id
kingbase-# , (random() * 100)::int + 1000 c01
kingbase-# , (random() * 200)::int + 1000 c02
kingbase-# , (random() * 300)::int + 10000 c03
kingbase-# , (random() * 400)::int + 10000 c04
kingbase-# , (random() * 500)::int + 10000 c05
kingbase-# , (random() * 600)::int + 10000 c06
kingbase-# , (random() * 700)::int + 10000 c07
kingbase-# , (random() * 800)::int + 10000 c08
kingbase-# , (random() * 900)::int + 10000 c09
kingbase-# , (random() * 1000)::int + 10000 c10
kingbase-# , (random() * 2000)::int + 10000 c11
kingbase-# , (random() * 3000)::int + 10000 c12
kingbase-# , (random() * 4000)::int + 10000 c13
kingbase-# , (random() * 5000)::int + 10000 c14
kingbase-# , (random() * 6000)::int + 10000 c15
kingbase-# , (random() * 7000)::int + 10000 c16
kingbase-# , (random() * 8000)::int + 10000 c17
kingbase-# , (random() * 9000)::int + 10000 c18
kingbase-# , (random() * 10000)::int + 10000 c19
kingbase-# , (random() * 20000)::int + 10000 c20
kingbase-# , (random() * 30000)::int + 10000 c21
kingbase-# , (random() * 40000)::int + 10000 c22
kingbase-# , (random() * 50000)::int + 10000 c23
kingbase-# , (random() * 60000)::int + 10000 c24
kingbase-# , (random() * 70000)::int + 10000 c25
kingbase-# , (random() * 80000)::int + 10000 c26
kingbase-# , (random() * 90000)::int + 10000 c27
kingbase-# , (random() * 10000)::int + 10000 c28
kingbase-# , (random() * 10000)::int + 10000 c29
kingbase-# , md5(random()::text) t01
kingbase-# , md5(random()::text) t02
kingbase-# , md5(random()::text) t03
kingbase-# , md5(random()::text) t04
kingbase-# , md5(random()::text) t05
kingbase-# , md5(random()::text) t06
kingbase-# , md5(random()::text) t07
kingbase-# , md5(random()::text) t08
kingbase-# , md5(random()::text) t09
kingbase-# , md5(random()::text) t10
kingbase-# , md5(random()::text) t11
kingbase-# , md5(random()::text) t12
kingbase-# , md5(random()::text) t13
kingbase-# , md5(random()::text) t14
kingbase-# , md5(random()::text) t15
kingbase-# , md5(random()::text) t16
kingbase-# , md5(random()::text) t17
kingbase-# , md5(random()::text) t18
kingbase-# , md5(random()::text) t19
kingbase-# , md5(random()::text) t20
kingbase-# from generate_series(1, 2000000) id;
INSERT 0 2000000
Time: 299331.143 ms (04:59.331)
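Pros: a single statement; indexes are maintained automatically; indexes added later are handled automatically.
Cons: the indexes are maintained row by row, so the load is slow (about 5 minutes for 2,000,000 rows in this test).
Method 2: Drop the indexes, insert the data, then recreate the indexes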
kingbase=#
kingbase=# do
kingbase-# $$
kingbase$# begin
kingbase$# drop index bigtab_i01;
kingbase$# drop index bigtab_i02;
kingbase$# drop index bigtab_i03;
kingbase$# drop index bigtab_i04;
kingbase$# drop index bigtab_i05;
kingbase$# drop index bigtab_i06;
kingbase$# drop index bigtab_i07;
kingbase$# drop index bigtab_i08;
kingbase$# drop index bigtab_i09;
kingbase$# drop index bigtab_i10;
kingbase$# drop index bigtab_i11;
kingbase$# drop index bigtab_i12;
kingbase$# drop index bigtab_i13;
kingbase$# drop index bigtab_i14;
kingbase$# drop index bigtab_i15;
kingbase$# drop index bigtab_i16;
kingbase$# drop index bigtab_i17;
kingbase$# drop index bigtab_i18;
kingbase$# drop index bigtab_i19;
kingbase$# drop index bigtab_i20;
kingbase$#
kingbase$# insert into bigtab
kingbase$# select id
kingbase$# , (random() * 100)::int + 1000 c01
kingbase$# , (random() * 200)::int + 1000 c02
kingbase$# , (random() * 300)::int + 10000 c03
kingbase$# , (random() * 400)::int + 10000 c04
kingbase$# , (random() * 500)::int + 10000 c05
kingbase$# , (random() * 600)::int + 10000 c06
kingbase$# , (random() * 700)::int + 10000 c07
kingbase$# , (random() * 800)::int + 10000 c08
kingbase$# , (random() * 900)::int + 10000 c09
kingbase$# , (random() * 1000)::int + 10000 c10
kingbase$# , (random() * 2000)::int + 10000 c11
kingbase$# , (random() * 3000)::int + 10000 c12
kingbase$# , (random() * 4000)::int + 10000 c13
kingbase$# , (random() * 5000)::int + 10000 c14
kingbase$# , (random() * 6000)::int + 10000 c15
kingbase$# , (random() * 7000)::int + 10000 c16
kingbase$# , (random() * 8000)::int + 10000 c17
kingbase$# , (random() * 9000)::int + 10000 c18
kingbase$# , (random() * 10000)::int + 10000 c19
kingbase$# , (random() * 20000)::int + 10000 c20
kingbase$# , (random() * 30000)::int + 10000 c21
kingbase$# , (random() * 40000)::int + 10000 c22
kingbase$# , (random() * 50000)::int + 10000 c23
kingbase$# , (random() * 60000)::int + 10000 c24
kingbase$# , (random() * 70000)::int + 10000 c25
kingbase$# , (random() * 80000)::int + 10000 c26
kingbase$# , (random() * 90000)::int + 10000 c27
kingbase$# , (random() * 10000)::int + 10000 c28
kingbase$# , (random() * 10000)::int + 10000 c29
kingbase$# , md5(random()::text) t01
kingbase$# , md5(random()::text) t02
kingbase$# , md5(random()::text) t03
kingbase$# , md5(random()::text) t04
kingbase$# , md5(random()::text) t05
kingbase$# , md5(random()::text) t06
kingbase$# , md5(random()::text) t07
kingbase$# , md5(random()::text) t08
kingbase$# , md5(random()::text) t09
kingbase$# , md5(random()::text) t10
kingbase$# , md5(random()::text) t11
kingbase$# , md5(random()::text) t12
kingbase$# , md5(random()::text) t13
kingbase$# , md5(random()::text) t14
kingbase$# , md5(random()::text) t15
kingbase$# , md5(random()::text) t16
kingbase$# , md5(random()::text) t17
kingbase$# , md5(random()::text) t18
kingbase$# , md5(random()::text) t19
kingbase$# , md5(random()::text) t20
kingbase$# from generate_series(1, 2000000) id;
kingbase$#
kingbase$# create index bigtab_i01 on bigtab (c01);
kingbase$# create index bigtab_i02 on bigtab (c02);
kingbase$# create index bigtab_i03 on bigtab (c03);
kingbase$# create index bigtab_i04 on bigtab (c04);
kingbase$# create index bigtab_i05 on bigtab (c05);
kingbase$# create index bigtab_i06 on bigtab (c06);
kingbase$# create index bigtab_i07 on bigtab (c07);
kingbase$# create index bigtab_i08 on bigtab (c08);
kingbase$# create index bigtab_i09 on bigtab (c09);
kingbase$# create index bigtab_i10 on bigtab (c10);
kingbase$# create index bigtab_i11 on bigtab (c11);
kingbase$# create index bigtab_i12 on bigtab (c12);
kingbase$# create index bigtab_i13 on bigtab (c13);
kingbase$# create index bigtab_i14 on bigtab (c14);
kingbase$# create index bigtab_i15 on bigtab (c15);
kingbase$# create index bigtab_i16 on bigtab (c16);
kingbase$# create index bigtab_i17 on bigtab (c17);
kingbase$# create index bigtab_i18 on bigtab (c18);
kingbase$# create index bigtab_i19 on bigtab (c19);
kingbase$# create index bigtab_i20 on bigtab (c20);
kingbase$#
kingbase$# end;
kingbase$# $$;
ANONYMOUS BLOCK
Time: 83069.170 ms (01:23.069)
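Pros: the indexes are built in bulk, giving the shortest run time (about 83 seconds in this test).
Cons: the statement is complex and hard-coded; the drop and create index statements must be maintained by hand; indexes added later are not covered.
Method 3: Disable index updates, insert the data, then rebuild all of the table's indexes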
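One way to soften the drawbacks of Method 2 is to read the index definitions from the catalog instead of hard-coding them. The following is a rough sketch, not the author's method: it assumes the PostgreSQL-compatible view pg_indexes is available in KingbaseES, that bigtab lives in the current schema, and that none of the indexes backs a primary key or unique constraint (drop index would fail on those). The two-column insert is only a placeholder for the real load.

```sql
do
$$
declare
    names text[];
    defs  text[];
    i     int;
begin
    -- Capture the name and full CREATE INDEX statement of every index on bigtab.
    select coalesce(array_agg(indexname::text), '{}'),
           coalesce(array_agg(indexdef), '{}')
      into names, defs
      from pg_indexes
     where schemaname = current_schema()
       and tablename = 'bigtab';

    -- Drop the indexes that were found.
    for i in 1 .. coalesce(array_length(names, 1), 0) loop
        execute format('drop index %I', names[i]);
    end loop;

    -- Placeholder bulk load; substitute the full insert from the article.
    insert into bigtab (id, c01)
    select id, (random() * 100)::int + 1000
      from generate_series(1, 2000000) id;

    -- Recreate each index from its saved definition.
    for i in 1 .. coalesce(array_length(defs, 1), 0) loop
        execute defs[i];
    end loop;
end;
$$;
```

Because the definitions are collected at run time, an index added to bigtab later is picked up without editing the block.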
kingbase=# do
kingbase-# $$
kingbase$# begin
kingbase$#
kingbase$# update pg_index
kingbase$# set indislive= false
kingbase$# where indrelid = 'bigtab'::regclass;
kingbase$#
kingbase$# insert into bigtab
kingbase$# select id
kingbase$# , (random() * 100)::int + 1000 c01
kingbase$# , (random() * 200)::int + 1000 c02
kingbase$# , (random() * 300)::int + 10000 c03
kingbase$# , (random() * 400)::int + 10000 c04
kingbase$# , (random() * 500)::int + 10000 c05
kingbase$# , (random() * 600)::int + 10000 c06
kingbase$# , (random() * 700)::int + 10000 c07
kingbase$# , (random() * 800)::int + 10000 c08
kingbase$# , (random() * 900)::int + 10000 c09
kingbase$# , (random() * 1000)::int + 10000 c10
kingbase$# , (random() * 2000)::int + 10000 c11
kingbase$# , (random() * 3000)::int + 10000 c12
kingbase$# , (random() * 4000)::int + 10000 c13
kingbase$# , (random() * 5000)::int + 10000 c14
kingbase$# , (random() * 6000)::int + 10000 c15
kingbase$# , (random() * 7000)::int + 10000 c16
kingbase$# , (random() * 8000)::int + 10000 c17
kingbase$# , (random() * 9000)::int + 10000 c18
kingbase$# , (random() * 10000)::int + 10000 c19
kingbase$# , (random() * 20000)::int + 10000 c20
kingbase$# , (random() * 30000)::int + 10000 c21
kingbase$# , (random() * 40000)::int + 10000 c22
kingbase$# , (random() * 50000)::int + 10000 c23
kingbase$# , (random() * 60000)::int + 10000 c24
kingbase$# , (random() * 70000)::int + 10000 c25
kingbase$# , (random() * 80000)::int + 10000 c26
kingbase$# , (random() * 90000)::int + 10000 c27
kingbase$# , (random() * 10000)::int + 10000 c28
kingbase$# , (random() * 10000)::int + 10000 c29
kingbase$# , md5(random()::text) t01
kingbase$# , md5(random()::text) t02
kingbase$# , md5(random()::text) t03
kingbase$# , md5(random()::text) t04
kingbase$# , md5(random()::text) t05
kingbase$# , md5(random()::text) t06
kingbase$# , md5(random()::text) t07
kingbase$# , md5(random()::text) t08
kingbase$# , md5(random()::text) t09
kingbase$# , md5(random()::text) t10
kingbase$# , md5(random()::text) t11
kingbase$# , md5(random()::text) t12
kingbase$# , md5(random()::text) t13
kingbase$# , md5(random()::text) t14
kingbase$# , md5(random()::text) t15
kingbase$# , md5(random()::text) t16
kingbase$# , md5(random()::text) t17
kingbase$# , md5(random()::text) t18
kingbase$# , md5(random()::text) t19
kingbase$# , md5(random()::text) t20
kingbase$# from generate_series(1, 2000000) id;
kingbase$#
kingbase$# update pg_index
kingbase$# set indislive= true
kingbase$# where indrelid = 'bigtab'::regclass;
kingbase$#
kingbase$# analyse bigtab;
kingbase$# reindex table bigtab;
kingbase$#
kingbase$# end;
kingbase$# $$;
ANONYMOUS BLOCK
Time: 87110.126 ms (01:27.110)
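Pros: the indexes are built in bulk, so the run time is short (about 87 seconds in this test); the statements follow a fixed pattern; indexes are maintained automatically; indexes added later are covered.
Cons: it takes several SQL statements, which are awkward to embed in a statement block.
Final notes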
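reindex table relies on the table's statistics, so analyse must be run on the table first; only then does reindex table successfully rebuild all of the table's updatable indexes.
reindex index is not affected by this: it can force a rebuild of an index that is not being updated, and it automatically sets indislive = true.
If an exception occurs during REINDEX, every index that still needed rebuilding is left in the invalid state: it still occupies space and its definition still exists, but it cannot be used.
To avoid exceptions during REINDEX, skip unique indexes, indexes that foreign keys depend on, and similar indexes when disabling index updates.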
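Putting the last two notes together, a variant of Method 3 could disable only the non-unique indexes and then rebuild each one with reindex index. This is a sketch under stated assumptions and has not been verified against KingbaseES: it relies on the behaviour described in the notes above (reindex index sets indislive back to true), it needs a role allowed to update pg_index directly, and the two-column insert is a placeholder for the real load. Filtering on indisunique also leaves out primary-key indexes and the unique indexes that foreign keys reference.

```sql
do
$$
declare
    idx_oids oid[];
    idx_oid  oid;
begin
    -- Select only the non-unique indexes of bigtab.
    select coalesce(array_agg(indexrelid), '{}')
      into idx_oids
      from pg_index
     where indrelid = 'bigtab'::regclass
       and not indisunique;

    -- Stop maintaining those indexes during the load.
    update pg_index
       set indislive = false
     where indexrelid = any (idx_oids);

    -- Placeholder bulk load; substitute the full insert from the article.
    insert into bigtab (id, c01)
    select id, (random() * 100)::int + 1000
      from generate_series(1, 2000000) id;

    -- Rebuild each disabled index individually.
    foreach idx_oid in array idx_oids loop
        execute format('reindex index %s', idx_oid::regclass);
    end loop;
end;
$$;

-- Afterwards, check that no index was left invalid or not live.
select indexrelid::regclass as index_name, indisvalid, indisready, indislive
  from pg_index
 where indrelid = 'bigtab'::regclass
 order by indexrelid;
```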