数仓性能优化:倾斜优化-表达式计算倾斜的hint优化
本文分享自华为云社区《GaussDB(DWS)性能调优:倾斜优化-表达式计算倾斜的hint优化》,作者: 譡里个檔 。
1.原始SQL
SELECT TMP4.TAX_AMT, CATE.L1_PUR_ITEM_CATG_CN_NAME || '-' || CATE.L2_PUR_ITEM_CATG_CN_NAME || '-' || CATE.L3_PUR_ITEM_CATG_CN_NAME AS PRODUCT_CATEGORY, MATE.ITEM_CODE AS PRODUCT_CODE, INVEN.INVENTORY_ORG_NAME, TMP4.INVOICE_WITHHOLDING_TAX_GROUP, TMP4.PAYMENT_WITHHOLDING_TAX_GROUP, TMP4.PO_CHARGE_ACCOUNT_CODE, TMP4.CFS_INVOICE_NUMBER, APR.TAX_INVOICE_DATE FROM DWLTAX.DWL_TAX_TAXDP_ERP_AP_INVOICE_TMP5 TMP4, DWRDIM_DW1.DWR_DIM_PUR_ITEM_CATEGORY_D CATE, DWRDIM_DW1.DWR_DIM_MATERIAL_CODE_D MATE, DWRDIM_DW1.DWR_DIM_INVENTORY_ORG_D INVEN, DWTAXDI.DWI_AP_INVOICE_I AP, DWTAXDI.DWI_AP_INVOICE_REGSTN_I APR WHERE 1 = 1 AND TMP4.ITEM_CATEGORY_KEY = CATE.PUR_ITEM_CATG_KEY(+) AND CATE.DEL_FLAG(+) = 'N' AND TMP4.ITEM_ID = MATE.ITEM_ID(+) AND MATE.DEL_FLAG(+) = 'N' AND TMP4.PO_SHIPMENT_TARGET_INV_ORG_KEY = INVEN.INVENTORY_ORG_KEY(+) AND INVEN.DEL_FLAG(+) = 'N' AND TMP4.AP_INVOICE_ID = AP.AP_INVOICE_ID(+) AND 6600 || AP.ATTRIBUTE1 = TO_CHAR(APR.AP_INVOICE_REGSTN_ID(+))
执行performance,查询具体执行情况和SQL自诊断信息(详细见附件case-step1-原始执行信息.txt)
id | operation | A-time | A-rows | E-rows | E-distinct | Peak Memory | E-memory | A-width | E-width | E-costs
----+------------------------------------------------------------------------------------------------------+------------------------+------------+------------+------------+----------------+----------------+-----------+---------+-------------
1 | -> Row Adapter | 69922.773 | 69237018 | 69237018 | | 87KB | | | 573 | 15160857.61
2 | -> Vector Streaming (type: GATHER) | 65581.989 | 69237018 | 69237018 | | 536KB | | | 573 | 15160857.61
3 | -> Vector Hash Right Join (4, 6) | [61186.201, 73129.055] | 69237018 | 69237018 | | [306MB, 682MB] | 1113MB(9990MB) | | 573 | 15159431.83
4 | -> Vector Streaming(type: BROADCAST ng: LC_DL1->LC_DW1) | [554.217, 21008.078] | 1382000544 | 1381572384 | 282184 | [4MB, 4MB] | 3MB | | 16 | 7056095.88
5 | -> CStore Scan on dwifin.dwi_ap_invoice_regstn s | [5.354, 11.617] | 28791678 | 28782758 | | [1MB, 1MB] | 1MB | | 16 | 28004.18
6 | -> Vector Hash Left Join (7, 19) | [1728.008, 2017.488] | 69237018 | 69237018 | 79721 | [834KB, 834KB] | 16MB | [229,252] | 578 | 1832322.90
7 | -> Vector Hash Left Join (8, 17) | [1428.799, 1925.653] | 69237018 | 69237018 | 179 | [32MB, 32MB] | 28MB(8901MB) | | 576 | 1817105.07
8 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [996.780, 1635.826] | 69237018 | 69237018 | 4167 | [1MB, 1MB] | 2MB | | 570 | 1788113.85
9 | -> Vector Hash Left Join (10, 14) | [1086.903, 1780.641] | 69237018 | 69237018 | | [173MB, 174MB] | 227MB(9067MB) | | 570 | 1304897.12
10 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [153.628, 891.680] | 69237018 | 69237018 | 20271 | [1MB, 1MB] | 2MB | | 567 | 847160.16
11 | -> Vector Hash Left Join (12, 13) | [367.155, 465.821] | 69237018 | 69237018 | | [30MB, 30MB] | 22MB(8896MB) | | 567 | 363943.43
12 | -> CStore Scan on dwltax.dwl_tax_taxdp_erp_ap_invoice_tmp5 tmp4 | [150.676, 178.827] | 69237018 | 69237018 | 526 | [4MB, 4MB] | 1MB | | 553 | 340168.44
13 | -> CStore Scan on dwrdim_dw1.dwr_dim_pur_item_category_d cate | [14.549, 24.399] | 8228448 | 8228448 | 171426 | [2MB, 2MB] | 1MB | [104,104] | 26 | 9056.99
14 | -> Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST ng: LC_DL1->LC_DW1) | [315.926, 339.782] | 117191217 | 117191170 | 2441483 | [1MB, 1MB] | 3MB | [47,47] | 22 | 406136.10
15 | -> Vector Partition Iterator | [118.307, 151.248] | 117191170 | 117191170 | | [41KB, 41KB] | 1MB | | 22 | 300641.93
16 | -> Partitioned CStore Scan on dwifin.dwi_ap_invoice s | [86.557, 111.947] | 117191170 | 117191170 | | [6MB, 6MB] | 1MB | | 22 | 300641.93
17 | -> Vector Streaming(type: PART LOCAL PART BROADCAST) | [60.429, 99.381] | 15442613 | 15442566 | 321720 | [584KB, 584KB] | 2MB | [58,58] | 19 | 49578.19
18 | -> CStore Scan on dwrdim_dw1.dwr_dim_material_code_d mate | [19.779, 33.206] | 15442566 | 15442566 | | [1MB, 2MB] | 1MB | | 19 | 35704.02
19 | -> CStore Scan on dwrdim_dw1.dwr_dim_inventory_org_d inven | [0.383, 0.739] | 135072 | 135072 | 2814 | [1MB, 1MB] | 1MB | [53,53] | 14 | 2823.85
SQL Diagnostic Information
--------------------------------------------------------------------------------------------
Execute diagnostic information
PlanNode[4] Large Table in Broadcast "Vector Streaming(type: BROADCAST ng: LC_DL1->LC_DW1)"
Predicate Information (identified by plan id)
------------------------------------------------------------------------------------------------------------------------------
3 --Vector Hash Right Join (4, 6)
Hash Cond: (((numeric_out(s.ap_invoice_regstn_id))::character varying)::text = ('6600'::text || (s.attribute1)::text))
6 --Vector Hash Left Join (7, 19)
Hash Cond: (tmp4.po_shipment_target_inv_org_key = inven.inventory_org_key)
7 --Vector Hash Left Join (8, 17)
Hash Cond: (tmp4.item_id = mate.item_id)
Skew Join Optimized by Statistic
8 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)
Skew Filter(type: ROUNDROBIN): ((tmp4.item_id = (-999999)::numeric) OR (tmp4.item_id IS NULL))
9 --Vector Hash Left Join (10, 14)
Hash Cond: (tmp4.ap_invoice_id = s.ap_invoice_id)
Skew Join Optimized by Statistic
10 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)
Skew Filter(type: ROUNDROBIN): (tmp4.ap_invoice_id = 1001113812002::numeric)
11 --Vector Hash Left Join (12, 13)
Hash Cond: (tmp4.item_category_key = cate.pur_item_catg_key)
13 --CStore Scan on dwrdim_dw1.dwr_dim_pur_item_category_d cate
Filter: ((cate.del_flag)::text = 'N'::text)
Pushdown Predicate Filter: ((cate.del_flag)::text = 'N'::text)
14 --Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST ng: LC_DL1->LC_DW1)
Skew Filter(type: BROADCAST): (s.ap_invoice_id = 1001113812002::numeric)
15 --Vector Partition Iterator
Iterations: 147
16 --Partitioned CStore Scan on dwifin.dwi_ap_invoice s
Partitions Selected by Static Prune: 1..147
17 --Vector Streaming(type: PART LOCAL PART BROADCAST)
Skew Filter(type: BROADCAST): (mate.item_id = (-999999)::numeric)
18 --CStore Scan on dwrdim_dw1.dwr_dim_material_code_d mate
Filter: ((mate.del_flag)::text = 'N'::text)
Pushdown Predicate Filter: ((mate.del_flag)::text = 'N'::text)
19 --CStore Scan on dwrdim_dw1.dwr_dim_inventory_org_d inven
Filter: ((inven.del_flag)::text = 'N'::text)
Pushdown Predicate Filter: ((inven.del_flag)::text = 'N'::text)
2.禁止大表广播
如上小节显示确实是id=4的这一步是一个大的结果集(2879w条)做了broadcast,并且紧接着的id=5的HashJoin耗时很长。因此通过增加hint方式禁止dwifin.dwi_ap_invoice_regstn走广播。分析发现表dwifin.dwi_ap_invoice_regstn是视图apr展开出现的,因此增加如下hint信息,其中
1. no merge (apr)是防止视图apr中的语句提升,导致的hint信息失效
2. no broadcast(apr)表示禁止apr走broadcast
EXPLAIN performance SELECT /*+ no merge (apr) no broadcast(apr) */ TMP4.TAX_AMT, CATE.L1_PUR_ITEM_CATG_CN_NAME || '-' || CATE.L2_PUR_ITEM_CATG_CN_NAME || '-' || CATE.L3_PUR_ITEM_CATG_CN_NAME AS PRODUCT_CATEGORY, MATE.ITEM_CODE AS PRODUCT_CODE, INVEN.INVENTORY_ORG_NAME, TMP4.INVOICE_WITHHOLDING_TAX_GROUP, TMP4.PAYMENT_WITHHOLDING_TAX_GROUP, TMP4.PO_CHARGE_ACCOUNT_CODE, TMP4.CFS_INVOICE_NUMBER, APR.TAX_INVOICE_DATE FROM DWLTAX.DWL_TAX_TAXDP_ERP_AP_INVOICE_TMP5 TMP4, DWRDIM_DW1.DWR_DIM_PUR_ITEM_CATEGORY_D CATE, DWRDIM_DW1.DWR_DIM_MATERIAL_CODE_D MATE, DWRDIM_DW1.DWR_DIM_INVENTORY_ORG_D INVEN, DWTAXDI.DWI_AP_INVOICE_I AP, DWTAXDI.DWI_AP_INVOICE_REGSTN_I APR WHERE 1 = 1 AND TMP4.ITEM_CATEGORY_KEY = CATE.PUR_ITEM_CATG_KEY(+) AND CATE.DEL_FLAG(+) = 'N' AND TMP4.ITEM_ID = MATE.ITEM_ID(+) AND MATE.DEL_FLAG(+) = 'N' AND TMP4.PO_SHIPMENT_TARGET_INV_ORG_KEY = INVEN.INVENTORY_ORG_KEY(+) AND INVEN.DEL_FLAG(+) = 'N' AND TMP4.AP_INVOICE_ID = AP.AP_INVOICE_ID(+) AND 6600 || AP.ATTRIBUTE1 = TO_CHAR(APR.AP_INVOICE_REGSTN_ID(+))
获取如上语句的performance信息(详细见附件 case-step2-禁止大表广播.txt)
id | operation | A-time | A-rows | E-rows | E-distinct | Peak Memory | E-memory | A-width | E-width | E-costs
----+---------------------------------------------------------------------------------------------------------+------------------------+-----------+-----------+------------+----------------+----------------+-----------+---------+-------------
1 | -> Row Adapter | 15685.781 | 69237018 | 69237018 | | 87KB | | | 573 | 33341721.22
2 | -> Vector Streaming (type: GATHER) | 11361.740 | 69237018 | 69237018 | | 536KB | | | 573 | 33341721.22
3 | -> Vector Hash Left Join (4, 19) | [15269.267, 18985.791] | 69237018 | 69237018 | | [74MB, 74MB] | 101MB(9984MB) | | 573 | 33340295.43
4 | -> Vector Streaming(type: REDISTRIBUTE) | [4743.867, 18632.182] | 69237018 | 69237018 | 79721 | [1MB, 2MB] | 2MB | | 578 | 29821930.76
5 | -> Vector Hash Left Join (6, 18) | [1473.990, 15359.055] | 69237018 | 69237018 | | [866KB, 898KB] | 16MB | | 578 | 1832322.90
6 | -> Vector Hash Left Join (7, 16) | [1130.814, 15223.646] | 69237018 | 69237018 | 179 | [32MB, 32MB] | 28MB(9923MB) | | 576 | 1817105.07
7 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [681.709, 14909.424] | 69237018 | 69237018 | 4167 | [1MB, 1MB] | 2MB | | 570 | 1788113.85
8 | -> Vector Hash Left Join (9, 13) | [1049.201, 12602.796] | 69237018 | 69237018 | | [173MB, 174MB] | 227MB(10089MB) | | 570 | 1304897.12
9 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [128.704, 11737.099] | 69237018 | 69237018 | 20271 | [1MB, 1MB] | 2MB | | 567 | 847160.16
10 | -> Vector Hash Left Join (11, 12) | [368.537, 443.623] | 69237018 | 69237018 | | [30MB, 30MB] | 22MB(9918MB) | | 567 | 363943.43
11 | -> CStore Scan on dwltax.dwl_tax_taxdp_erp_ap_invoice_tmp5 tmp4 | [148.366, 175.347] | 69237018 | 69237018 | 526 | [4MB, 4MB] | 1MB | | 553 | 340168.44
12 | -> CStore Scan on dwrdim_dw1.dwr_dim_pur_item_category_d cate | [13.319, 24.442] | 8228448 | 8228448 | 171426 | [2MB, 2MB] | 1MB | [104,104] | 26 | 9056.99
13 | -> Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST ng: LC_DL1->LC_DW1) | [242.053, 294.233] | 117191217 | 117191170 | 2441483 | [1MB, 1MB] | 3MB | [47,47] | 22 | 406136.10
14 | -> Vector Partition Iterator | [118.124, 154.954] | 117191170 | 117191170 | | [41KB, 41KB] | 1MB | | 22 | 300641.93
15 | -> Partitioned CStore Scan on dwifin.dwi_ap_invoice s | [86.942, 105.441] | 117191170 | 117191170 | | [6MB, 6MB] | 1MB | | 22 | 300641.93
16 | -> Vector Streaming(type: PART LOCAL PART BROADCAST) | [83.793, 117.853] | 15442613 | 15442566 | 321720 | [584KB, 584KB] | 2MB | [58,58] | 19 | 49578.19
17 | -> CStore Scan on dwrdim_dw1.dwr_dim_material_code_d mate | [21.898, 35.895] | 15442566 | 15442566 | | [1MB, 2MB] | 1MB | | 19 | 35704.02
18 | -> CStore Scan on dwrdim_dw1.dwr_dim_inventory_org_d inven | [0.389, 0.661] | 135072 | 135072 | 2814 | [1MB, 1MB] | 1MB | [53,53] | 14 | 2823.85
19 | -> Vector Streaming(type: REDISTRIBUTE ng: LC_DL1->LC_DW1) | [30.667, 49.474] | 28791678 | 28782758 | 599641 | [2MB, 2MB] | 3MB | [75,75] | 16 | 56030.49
20 | -> Vector Subquery Scan on apr | [42.087, 61.734] | 28791678 | 28782758 | | [376KB, 376KB] | 1MB | | 16 | 30826.02
21 | -> CStore Scan on dwifin.dwi_ap_invoice_regstn s | [5.177, 8.049] | 28791678 | 28782758 | | [1MB, 1MB] | 1MB | | 16 | 28004.18
SQL Diagnostic Information
----------------------------------------------------------------------------------------------------------
Execute diagnostic information
PlanNode[4] DataSkew:"Vector Streaming(type: REDISTRIBUTE)", min_dn_tuples:257082, max_dn_tuples:47206637
Predicate Information (identified by plan id)
----------------------------------------------------------------------------------------------------------------------------------
3 --Vector Hash Left Join (4, 19)
Hash Cond: ((('6600'::text || (s.attribute1)::text)) = ((numeric_out(apr.ap_invoice_regstn_id))::character varying)::text)
5 --Vector Hash Left Join (6, 18)
Hash Cond: (tmp4.po_shipment_target_inv_org_key = inven.inventory_org_key)
6 --Vector Hash Left Join (7, 16)
Hash Cond: (tmp4.item_id = mate.item_id)
Skew Join Optimized by Statistic
7 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)
Skew Filter(type: ROUNDROBIN): ((tmp4.item_id = (-999999)::numeric) OR (tmp4.item_id IS NULL))
8 --Vector Hash Left Join (9, 13)
Hash Cond: (tmp4.ap_invoice_id = s.ap_invoice_id)
Skew Join Optimized by Statistic
9 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)
Skew Filter(type: ROUNDROBIN): (tmp4.ap_invoice_id = 1001113812002::numeric)
10 --Vector Hash Left Join (11, 12)
Hash Cond: (tmp4.item_category_key = cate.pur_item_catg_key)
12 --CStore Scan on dwrdim_dw1.dwr_dim_pur_item_category_d cate
Filter: ((cate.del_flag)::text = 'N'::text)
Pushdown Predicate Filter: ((cate.del_flag)::text = 'N'::text)
13 --Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST ng: LC_DL1->LC_DW1)
Skew Filter(type: BROADCAST): (s.ap_invoice_id = 1001113812002::numeric)
14 --Vector Partition Iterator
Iterations: 147
15 --Partitioned CStore Scan on dwifin.dwi_ap_invoice s
Partitions Selected by Static Prune: 1..147
16 --Vector Streaming(type: PART LOCAL PART BROADCAST)
Skew Filter(type: BROADCAST): (mate.item_id = (-999999)::numeric)
17 --CStore Scan on dwrdim_dw1.dwr_dim_material_code_d mate
Filter: ((mate.del_flag)::text = 'N'::text)
Pushdown Predicate Filter: ((mate.del_flag)::text = 'N'::text)
18 --CStore Scan on dwrdim_dw1.dwr_dim_inventory_org_d inven
Filter: ((inven.del_flag)::text = 'N'::text)
Pushdown Predicate Filter: ((inven.del_flag)::text = 'N'::text)
3.表达式倾斜的hint
发现自诊断信息中倾斜告警

而Plan ID为4的算子是

其中是s是视图dwtaxdi.dwi_ap_invoice_i展开后的表dwifin.dwi_ap_invoice,查询此表的列attribute1的统计信息如下,发现在NULL值上存在严重倾斜

因为重分布列是一个表达式6600 || AP.ATTRIBUTE1,当前DWS的倾斜的hint不支持表达式,因为我们做如下变通实现表达式的值倾斜的hint
SELECT /*+ no merge (apr) no broadcast(apr) no merge(ap) skew(ap (attr1) ('6600')) */
TMP4.TAX_AMT,
CATE.L1_PUR_ITEM_CATG_CN_NAME || '-' ||
CATE.L2_PUR_ITEM_CATG_CN_NAME || '-' ||
CATE.L3_PUR_ITEM_CATG_CN_NAME AS PRODUCT_CATEGORY,
MATE.ITEM_CODE AS PRODUCT_CODE,
INVEN.INVENTORY_ORG_NAME,
TMP4.INVOICE_WITHHOLDING_TAX_GROUP,
TMP4.PAYMENT_WITHHOLDING_TAX_GROUP,
TMP4.PO_CHARGE_ACCOUNT_CODE,
TMP4.CFS_INVOICE_NUMBER,
APR.TAX_INVOICE_DATE
FROM DWLTAX.DWL_TAX_TAXDP_ERP_AP_INVOICE_TMP5 TMP4,
DWRDIM_DW1.DWR_DIM_PUR_ITEM_CATEGORY_D CATE,
DWRDIM_DW1.DWR_DIM_MATERIAL_CODE_D MATE,
DWRDIM_DW1.DWR_DIM_INVENTORY_ORG_D INVEN,
(SELECT *, 6600 || AP.ATTRIBUTE1 AS ATTR1 FROM DWTAXDI.DWI_AP_INVOICE_I AP) AP,
DWTAXDI.DWI_AP_INVOICE_REGSTN_I APR
WHERE 1 = 1
AND TMP4.ITEM_CATEGORY_KEY = CATE.PUR_ITEM_CATG_KEY(+)
AND CATE.DEL_FLAG(+) = 'N'
AND TMP4.ITEM_ID = MATE.ITEM_ID(+)
AND MATE.DEL_FLAG(+) = 'N'
AND TMP4.PO_SHIPMENT_TARGET_INV_ORG_KEY = INVEN.INVENTORY_ORG_KEY(+)
AND INVEN.DEL_FLAG(+) = 'N'
AND TMP4.AP_INVOICE_ID = AP.AP_INVOICE_ID(+)
AND ATTR1 = TO_CHAR(APR.AP_INVOICE_REGSTN_ID(+))
其中构建了子查询 AP
SELECT *, 6600 || AP.ATTRIBUTE1 AS ATTR1 FROM DWTAXDI.DWI_AP_INVOICE_I AP
在把原始的关联列表达式放到子查询里面,然后把 6600 || AP.ATTRIBUTE1 命名为attr1。
在父查询中首先禁止AP这个子查询提升。然后在父查询中通过hint 子查询AP这个结果集的列attr1存在倾斜值'6600' 。这个倾斜值是计算出来的(NULL || 6600 = ‘6600’),并且在原始关联计算中关联表达式是如下,即 6600 || AP.ATTRIBUTE1的结果被转换为text类型(字符串类型)

获取新的语句的performance如下(详细见附件 case-step3-倾斜优化.txt)
id | operation | A-time | A-rows | E-rows | E-distinct | Peak Memory | E-memory | A-width | E-width | E-costs
----+------------------------------------------------------------------------------------------------------+-----------------------+-----------+-----------+------------+----------------+----------------+-----------+---------+------------
1 | -> Row Adapter | 9045.793 | 69237018 | 69237018 | | 87KB | | | 573 | 2040755.71
2 | -> Vector Streaming (type: GATHER) | 4842.656 | 69237018 | 69237018 | | 520KB | | | 573 | 2040755.71
3 | -> Vector Hash Left Join (4, 21) | [2673.707, 11389.688] | 69237018 | 69237018 | | [1MB, 1MB] | 16MB | | 573 | 2039329.92
4 | -> Vector Hash Left Join (5, 19) | [1951.482, 10931.220] | 69237018 | 69237018 | 179 | [32MB, 32MB] | 28MB(10018MB) | | 571 | 2009687.71
5 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [1541.777, 10591.702] | 69237018 | 69237018 | 4167 | [1MB, 1MB] | 2MB | | 565 | 1980696.49
6 | -> Vector Hash Left Join (7, 18) | [1703.438, 1980.655] | 69237018 | 69237018 | | [30MB, 30MB] | 22MB(10010MB) | | 565 | 1497479.76
7 | -> Vector Hash Left Join (8, 10) | [1523.277, 1708.622] | 69237018 | 69237018 | 526 | [165MB, 166MB] | 191MB(10151MB) | | 551 | 1473704.77
8 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [94.501, 203.619] | 69237018 | 69237018 | 20271 | [1MB, 1MB] | 2MB | | 553 | 823385.17
9 | -> CStore Scan on dwltax.dwl_tax_taxdp_erp_ap_invoice_tmp5 tmp4 | [142.734, 171.486] | 69237018 | 69237018 | | [4MB, 4MB] | 1MB | | 553 | 340168.44
10 | -> Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST ng: LC_DL1->LC_DW1) | [811.192, 853.583] | 117191217 | 117191170 | 2441483 | [2MB, 2MB] | 3MB | [44,44] | 17 | 598718.74
11 | -> Vector Hash Left Join (12, 15) | [340.998, 790.399] | 117191170 | 117191170 | | [39MB, 39MB] | 27MB(10015MB) | | 17 | 493224.57
12 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [53.170, 79.836] | 117191170 | 117191170 | 79721 | [2MB, 2MB] | 3MB | | 41 | 412662.90
13 | -> Vector Partition Iterator | [145.450, 171.527] | 117191170 | 117191170 | | [41KB, 41KB] | 1MB | | 22 | 303514.27
14 | -> Partitioned CStore Scan on dwifin.dwi_ap_invoice s | [112.099, 134.193] | 117191170 | 117191170 | | [6MB, 6MB] | 1MB | | 22 | 300641.93
15 | -> Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST) | [48.632, 99.230] | 28791678 | 28782758 | 282184 | [2MB, 2MB] | 3MB | [75,75] | 16 | 56928.04
16 | -> Vector Subquery Scan on apr | [41.916, 78.189] | 28791678 | 28782758 | | [376KB, 376KB] | 1MB | | 16 | 30826.02
17 | -> CStore Scan on dwifin.dwi_ap_invoice_regstn s | [5.233, 10.667] | 28791678 | 28782758 | | [1MB, 1MB] | 1MB | | 16 | 28004.18
18 | -> CStore Scan on dwrdim_dw1.dwr_dim_pur_item_category_d cate | [12.065, 20.667] | 8228448 | 8228448 | 171426 | [2MB, 2MB] | 1MB | [104,104] | 26 | 9056.99
19 | -> Vector Streaming(type: PART LOCAL PART BROADCAST) | [67.272, 97.378] | 15442613 | 15442566 | 321720 | [584KB, 584KB] | 2MB | [58,58] | 19 | 49578.19
20 | -> CStore Scan on dwrdim_dw1.dwr_dim_material_code_d mate | [18.605, 31.713] | 15442566 | 15442566 | | [1MB, 2MB] | 1MB | | 19 | 35704.02
21 | -> CStore Scan on dwrdim_dw1.dwr_dim_inventory_org_d inven | [0.378, 0.647] | 135072 | 135072 | 2814 | [1MB, 1MB] | 1MB | [53,53] | 14 | 2823.85
Predicate Information (identified by plan id)
----------------------------------------------------------------------------------------------------------------------------------
3 --Vector Hash Left Join (4, 21)
Hash Cond: (tmp4.po_shipment_target_inv_org_key = inven.inventory_org_key)
4 --Vector Hash Left Join (5, 19)
Hash Cond: (tmp4.item_id = mate.item_id)
Skew Join Optimized by Statistic
5 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)
Skew Filter(type: ROUNDROBIN): ((tmp4.item_id = (-999999)::numeric) OR (tmp4.item_id IS NULL))
6 --Vector Hash Left Join (7, 18)
Hash Cond: (tmp4.item_category_key = cate.pur_item_catg_key)
7 --Vector Hash Left Join (8, 10)
Hash Cond: (tmp4.ap_invoice_id = s.ap_invoice_id)
Skew Join Optimized by Statistic
8 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)
Skew Filter(type: ROUNDROBIN): (tmp4.ap_invoice_id = 1001113812002::numeric)
10 --Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST ng: LC_DL1->LC_DW1)
Skew Filter(type: BROADCAST): (s.ap_invoice_id = 1001113812002::numeric)
11 --Vector Hash Left Join (12, 15)
Hash Cond: ((('6600'::text || (s.attribute1)::text)) = ((numeric_out(apr.ap_invoice_regstn_id))::character varying)::text)
Skew Join Optimized by Hint
12 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)
Skew Filter(type: ROUNDROBIN): ((('6600'::text || (s.attribute1)::text)) = '6600'::text)
13 --Vector Partition Iterator
Iterations: 147
14 --Partitioned CStore Scan on dwifin.dwi_ap_invoice s
Partitions Selected by Static Prune: 1..147
15 --Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST)
Skew Filter(type: BROADCAST): ((((numeric_out(apr.ap_invoice_regstn_id))::character varying)::text) = '6600'::text)
18 --CStore Scan on dwrdim_dw1.dwr_dim_pur_item_category_d cate
Filter: ((cate.del_flag)::text = 'N'::text)
Pushdown Predicate Filter: ((cate.del_flag)::text = 'N'::text)
19 --Vector Streaming(type: PART LOCAL PART BROADCAST)
Skew Filter(type: BROADCAST): (mate.item_id = (-999999)::numeric)
20 --CStore Scan on dwrdim_dw1.dwr_dim_material_code_d mate
Filter: ((mate.del_flag)::text = 'N'::text)
Pushdown Predicate Filter: ((mate.del_flag)::text = 'N'::text)
21 --CStore Scan on dwrdim_dw1.dwr_dim_inventory_org_d inven
Filter: ((inven.del_flag)::text = 'N'::text)
Pushdown Predicate Filter: ((inven.del_flag)::text = 'N'::text)
- 附件:case-step1-原始执行信息.txt0B
- 附件:case-step3-倾斜优化.txt862.61KB
- 附件:case-step2-禁止大表广播.txt0B
数仓性能优化:倾斜优化-表达式计算倾斜的hint优化的更多相关文章
- 【实时数仓】Day01-数据采集层:数仓分层、实时需求、架构分析、日志数据采集(采集到指定topic和落盘)、业务数据采集(MySQL-kafka)、Nginx反向代理、Maxwell、Canel
一.数仓分层介绍 1.实时计算与实时数仓 实时计算实时性高,但无中间结果,导致复用性差 实时数仓基于数据仓库,对数据处理规划.分层,目的是提高数据的复用性 2.电商数仓的分层 ODS:原始日志数据和业 ...
- ByteHouse云数仓版查询性能优化和MySQL生态完善
ByteHouse云数仓版是字节跳动数据平台团队在复用开源 ClickHouse runtime 的基础上,基于云原生架构重构设计,并新增和优化了大量功能.在字节内部,ByteHouse被广泛用于各类 ...
- Android性能优化:手把手带你全面实现内存优化
前言 在 Android开发中,性能优化策略十分重要 本文主要讲解性能优化中的内存优化,希望你们会喜欢 目录 1. 定义 优化处理 应用程序的内存使用.空间占用 2. 作用 避免因不正确使用内 ...
- 【深入理解JAVA虚拟机】第4部分.程序编译与代码优化.2.运行期优化。这章提到的具体的优化技术,应该对以后做性能工作会有帮助。
1.概述 Java程序最初是通过解释器(Interpreter)进行解释执行的,当虚拟机发现某个方法或代码块的运行特别频繁时,就会把这些代码认定为“热点代码”(Hot Spot Code). 为了提高 ...
- QuantLib 金融计算——数学工具之优化器
目录 QuantLib 金融计算--数学工具之优化器 概述 Optimizer Constraint OptimizationMethod EndCriteria 示例 Rosenbrock 问题 校 ...
- 技术专家说 | 如何基于 Spark 和 Z-Order 实现企业级离线数仓降本提效?
[点击了解更多大数据知识] 市场的变幻,政策的完善,技术的革新--种种因素让我们面对太多的挑战,这仍需我们不断探索.克服. 今年,网易数帆将持续推出新栏目「金融专家说」「技术专家说」「产品专家说」等, ...
- gcc -O0 -O1 -O2 -O3 四级优化选项及每级分别做什么优化【转】
转自:http://blog.csdn.net/qinrenzhi/article/details/78334677 相关博客http://blog.chinaunix.net/uid-2495495 ...
- HAWQ取代传统数仓实践(十九)——OLAP
一.OLAP简介 1. 概念 OLAP是英文是On-Line Analytical Processing的缩写,意为联机分析处理.此概念最早由关系数据库之父E.F.Codd于1993年提出.OLAP允 ...
- CarbonData:大数据融合数仓新一代引擎
[摘要] CarbonData将存储和计算逻辑分离,通过索引技术让存储和计算物理上更接近,提升CPU和IO效率,实现超高性能的大数据分析.以CarbonData为融合数仓的大数据解决方案,为金融转型打 ...
- 基于MaxCompute的数仓数据质量管理
声明 本文中介绍的非功能性规范均为建议性规范,产品功能无强制,仅供指导. 参考文献 <大数据之路——阿里巴巴大数据实践>——阿里巴巴数据技术及产品部 著. 背景及目的 数据对一个企业来说已 ...
随机推荐
- 通过Lambda函数的方式获取属性名称
前言: 最近在使用mybatis-plus框架, 常常会使用lambda的方法引用获取实体属性, 避免出现大量的魔法值. public List<User> listBySex() { L ...
- QT线程问题
QT线程问题 (一)QThread (二)QMutex和QMutexLocker (end)后面会更新 (一)QThread 文章 (二)QMutex和QMutexLocker 通俗理解 QMutex ...
- 每天5分钟复习OpenStack(八)存储虚拟化
KVM存储虚拟化是通过存储池(Storage Pool)和卷(Volume)来管理的.Storage Pool 是宿主机上可以看到的一片存储空间,可以是多种类型,Volume 是在 Storage P ...
- 提升运维效率:轻松掌握JumpServer安装和使用技巧
前言 JumpServer 是一个开源的跳板机的解决方案,提供了对远程服务器的安全访问.会话录制和审计.用户身份管理等功能,适用于需要管理机器资源&大量服务器资源的情况. 本文将在分享 doc ...
- 中龙技术 | HSCSEC CRY + MISC WP
CRYPTO[HSC-1th] Easy SignIn 5445705857464579517A4A48546A4A455231645457464243566B5579556C7053546C4A4E ...
- Calendar日历类型常见方法(必看!!)
Hi i,m JinXiang 前言 本篇文章主要介绍Calendar日历类型的几种常见方法以及部分理论知识 欢迎点赞 收藏 留言评论 私信必回哟 博主收将持续更新学习记录获,友友们有任何问题可以在 ...
- 吉特日化MES & 医药行业相关专业术语汇总
- video和表单组件(不常用 类型太少)
h5不能自动播放,只有在静音的前提下才能 button:(种类太少 不能满足需求) <button size="mini" type="primary" ...
- 解决URLEncoder.encode 编码空格变 + 号
jdk自带的URL编码工具类 URLEncoder 在对字符串进行URI编码的时候,会把空格编码为 + 号. 空格的URI编码其实是:%20 解决办法:对编码后的字符串,进行 + 号替换为 %20.总 ...
- Ruby 版本升级
一.升级原因 在开发shopify app的时候,提示我当前的Ruby版本不支持(如下图),所以需要升级Ruby. 由于Ruby 中的一些 Gem 依赖于 OpenSSL 库,所以更改 Ruby 版本 ...