本文分享自华为云社区《你的JoinHint为什么不生效【绽放吧!GaussDB(DWS)云原生数仓】》,作者:你是猴子请来的救兵吗 。
引言
提起数据库的Hint,几乎每个DBA都知道这一强大功能。在GaussDB(DWS)中,Hint可以被用来干预SQL的执行计划,但是在日常工作中,很多开发人员对Hint的缺乏深入了解,经常遇到Hint失效的情况却又束手无策。
本次针对JoinHint从案例着手深入解析JoinHint不生效的原因,以便读者能“知其所以然”。(本文不讨论Hint的基础语法问题)。
问题案例
内核版本 GaussDB 8.1.3
问题描述 两表关联查询,使用hashjoin hint干预join方式,但hint不生效
问题用例
CREATE TABLE workitem ( language character varying(10), userid character varying(240), opiontype character varying(240), processinstid character varying(240), workitemid character varying(240), type_name character varying(240), type_code character varying(240), createtime timestamp without time zone, endtime timestamp without time zone, notrejecttotal numeric, dws_created_time timestamp without time zone ) WITH (orientation=column, compression=low, colversion=2.0, enable_delta=false) DISTRIBUTE BY HASH(workitemid); CREATE TABLE workday ( mm timestamp with time zone, rn numeric ) WITH (orientation=column, compression=low, colversion=2.0, enable_delta=false) DISTRIBUTE BY HASH(mm); explain SELECT /*+ hashjoin(c d) */ c.userid,c.type_name,c.type_code,count(1) num FROM workitem c INNER JOIN workday d ON c.createtime = d.mm WHERE c.createtime >= '2023-09-01' AND c.endtime Row Adapter | 2 | | 1502 | 33.12 2 | -> Vector Sonic Hash Aggregate | 2 | | 1502 | 33.12 3 | -> Vector Streaming (type: GATHER) | 4 | | 1502 | 33.12 4 | -> Vector Sonic Hash Aggregate | 4 | 16MB | 1502 | 27.12 5 | -> Vector Nest Loop (6,8) | 5 | 1MB | 1494 | 27.08 6 | -> Vector Streaming(type: BROADCAST) | 14 | 2MB | 8 | 13.68 7 | -> CStore Scan on workday d | 7 | 1MB | 8 | 13.05 8 | -> Vector Materialize | 5 | 16MB | 1502 | 13.09 9 | -> CStore Scan on workitem c | 5 | 1MB | 1502 | 13.08 RunTime Analyze Information ------------------------------------------- "public.workitem" runtime: 25.794ms "public.workday" runtime: 18.098ms Predicate Information (identified by plan id) -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 5 --Vector Nest Loop (6,8) Join Filter: (c.createtime = d.mm) 7 --CStore Scan on workday d Filter: (mm >= '2023-09-01 00:00:00'::timestamp without time zone) Pushdown Predicate Filter: (mm >= '2023-09-01 00:00:00'::timestamp without time zone) 9 --CStore Scan on workitem c Filter: ((createtime >= '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime = '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime问题定位
尝试关闭nestloop路径,来验证是否可以生成hash计划
set enable_nestloop = off;
set enable_mergejoin = off;
set enable_hashjoin = on;
WARNING: unused hint: HashJoin(c d) QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- id | operation | E-rows | E-memory | E-width | E-costs ----+------------------------------------------------------+--------+----------+---------+--------- 1 | -> Row Adapter | 2 | | 1502 | 33.12 2 | -> Vector Sonic Hash Aggregate | 2 | | 1502 | 33.12 3 | -> Vector Streaming (type: GATHER) | 4 | | 1502 | 33.12 4 | -> Vector Sonic Hash Aggregate | 4 | 16MB | 1502 | 27.12 5 | -> Vector Nest Loop (6,8) | 5 | 1MB | 1494 | 27.08 6 | -> Vector Streaming(type: BROADCAST) | 14 | 2MB | 8 | 13.68 7 | -> CStore Scan on workday d | 7 | 1MB | 8 | 13.05 8 | -> Vector Materialize | 5 | 16MB | 1502 | 13.09 9 | -> CStore Scan on workitem c | 5 | 1MB | 1502 | 13.08 Predicate Information (identified by plan id) -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 5 --Vector Nest Loop (6,8) Join Filter: (c.createtime = d.mm) 7 --CStore Scan on workday d Filter: (mm >= '2023-09-01 00:00:00'::timestamp without time zone) Pushdown Predicate Filter: (mm >= '2023-09-01 00:00:00'::timestamp without time zone) 9 --CStore Scan on workitem c Filter: ((createtime >= '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime = '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime关闭nestloop路径后,仍然生成nestloop计划,且E-costs代价中未添加惩罚代价,说明该场景语句本身不支持hashjoin。
检查关联表达式(c.createtime = d.mm)
,确认是否支持hashjoin。
- 关联表达式为字段关联,不存在函数嵌套
- 关联表达式两边数据类型为timestamp without time zone和timestamp with time zone,通过系统表pg_operator确认是否支持hashjoin。
postgres=# select * from pg_operator where oprname = '=' and oprleft = 'timestamp'::regtype and oprright = 'timestamptz'::regtype; -[ RECORD 1 ]+------------------------- oprname | = oprnamespace | 11 oprowner | 10 oprkind | b oprcanmerge | t oprcanhash | f oprleft | 1114 oprright | 1184 oprresult | 16 oprcom | 2542 oprnegate | 2539 oprcode | timestamp_eq_timestamptz oprrest | eqsel oprjoin | eqjoinsel
- 通过结果确认oprcanhash为false,代表该操作符不支持hash连接;原因是,左边数据不带时区,右边数据带,在比较时要先处理时区问题,不能直接拿存储值进行判断。
改善办法
通过系统表确认timestamp类型的等值关联和timestamptz的等值关联均支持hash连接。
postgres=# select * from pg_operator where oprname = '=' and oprleft = oprright and oprleft in('timestamp'::regtype,'timestamptz'::regtype); -[ RECORD 1 ]+--------------- oprname | = oprnamespace | 11 oprowner | 10 oprkind | b oprcanmerge | t oprcanhash | t oprleft | 1184 oprright | 1184 oprresult | 16 oprcom | 1320 oprnegate | 1321 oprcode | timestamptz_eq oprrest | eqsel oprjoin | eqjoinsel -[ RECORD 2 ]+--------------- oprname | = oprnamespace | 11 oprowner | 10 oprkind | b oprcanmerge | t oprcanhash | t oprleft | 1114 oprright | 1114 oprresult | 16 oprcom | 2060 oprnegate | 2061 oprcode | timestamp_eq oprrest | eqsel oprjoin | eqjoinsel
在关联条件上添加类型转换,保证两边类型一致,即(c.createtime::timestamptz = d.mm)或(c.createtime = d.mm::timestamp)。
postgres=# explain postgres-# SELECT /*+ hashjoin(c d) */ postgres-# c.userid,c.type_name,c.type_code,count(1) num postgres-# FROM workitem c INNER JOIN workday d ON c.createtime::timestamptz = d.mm postgres-# WHERE c.createtime >= '2023-09-01' AND c.endtime Row Adapter | 2 | | 1502 | 34.29 2 | -> Vector Sonic Hash Aggregate | 2 | | 1502 | 34.29 3 | -> Vector Streaming (type: GATHER) | 4 | | 1502 | 34.29 4 | -> Vector Sonic Hash Aggregate | 4 | 16MB | 1502 | 28.29 5 | -> Vector Sonic Hash Join (6,8) | 5 | 16MB | 1494 | 28.25 6 | -> Vector Streaming(type: BROADCAST) | 40 | 2MB | 8 | 15.06 7 | -> CStore Scan on workday d | 20 | 1MB | 8 | 13.01 8 | -> CStore Scan on workitem c | 5 | 1MB | 1502 | 13.08 Predicate Information (identified by plan id) -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 5 --Vector Sonic Hash Join (6,8) Hash Cond: (d.mm = (c.createtime)::timestamp with time zone) 8 --CStore Scan on workitem c Filter: ((createtime >= '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime = '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime = '2023-09-01' AND c.endtime Row Adapter | 2 | | 1502 | 32.91 2 | -> Vector Sonic Hash Aggregate | 2 | | 1502 | 32.91 3 | -> Vector Streaming (type: GATHER) | 4 | | 1502 | 32.91 4 | -> Vector Sonic Hash Aggregate | 4 | 16MB | 1502 | 26.91 5 | -> Vector Sonic Hash Join (6,8) | 5 | 16MB | 1494 | 26.87 6 | -> Vector Streaming(type: BROADCAST) | 14 | 2MB | 8 | 13.71 7 | -> CStore Scan on workday d | 7 | 1MB | 8 | 13.08 8 | -> CStore Scan on workitem c | 5 | 1MB | 1502 | 13.08 Predicate Information (identified by plan id) -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 5 --Vector Sonic Hash Join (6,8) Hash Cond: ((d.mm)::timestamp without time zone = c.createtime) 7 --CStore Scan on workday d Filter: ((mm)::timestamp without time zone >= '2023-09-01 00:00:00'::timestamp without time zone) 8 --CStore Scan on workitem c Filter: ((createtime >= '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime = '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime知识小结
实际使用过程中导致hint生效的原因很多,这里总结排查hashjoin hint步骤以供参考:
- 检查hint中的表名是否正确、是否存在重名、是否在当前层可见,此类场景通常在explain中会给出提示,自行排查即可。
- 判断关联hint中的表名是否被提升导致表名不存在,此类场景通常在explain中会给出提示
postgres=# explain SELECT /*+ hashjoin(c d) */ c.userid,c.type_name,c.type_code,count(1) num FROM workitem c INNER JOIN (select * from workday where mm >= '2023-09-01') d ON c.createtime = d.mm::timestamp WHERE c.createtime >= '2023-09-01' AND c.endtime Row Adapter | 2 | | 1502 | 32.78 2 | -> Vector Sonic Hash Aggregate | 2 | | 1502 | 32.78 3 | -> Vector Streaming (type: GATHER) | 4 | | 1502 | 32.78 4 | -> Vector Sonic Hash Aggregate | 4 | 16MB | 1502 | 26.78 5 | -> Vector Sonic Hash Join (6,8) | 5 | 16MB | 1494 | 26.74 6 | -> Vector Streaming(type: BROADCAST) | 10 | 2MB | 8 | 13.58 7 | -> CStore Scan on workday | 5 | 1MB | 8 | 13.11 8 | -> CStore Scan on workitem c | 5 | 1MB | 1502 | 13.08 Predicate Information (identified by plan id) -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 5 --Vector Sonic Hash Join (6,8) Hash Cond: ((workday.mm)::timestamp without time zone = c.createtime) 7 --CStore Scan on workday Filter: ((mm >= '2023-09-01 00:00:00+08'::timestamp with time zone) AND ((mm)::timestamp without time zone >= '2023-09-01 00:00:00'::timestamp without time zone)) Pushdown Predicate Filter: (mm >= '2023-09-01 00:00:00+08'::timestamp with time zone) 8 --CStore Scan on workitem c Filter: ((createtime >= '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime = '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime针对此种情况,8.2.0及以上版本可以通过添加no merge hint来禁用子查询提升从而规避hint失效问题。
通过join路径参数验证目标路径是否可生效。
--如通过关闭其他路径参数来验证某一路径是否可达 set enable_nestloop = off; set enable_mergejoin = off; set enable_hashjoin = on;检查关联条件中是否存在volatile函数。
postgres=# create or replace function gettimediff(timestamp) returns interval language sql as 'select $1-timeofday()::timestamp' volatile; CREATE FUNCTION postgres=# explain SELECT /*+ hashjoin(c d) */ c.userid,c.type_name,c.type_code,count(1) num FROM workitem c INNER JOIN workday d ON gettimediff(c.createtime) = gettimediff(d.mm::timestamp) WHERE c.createtime >= '2023-09-01' AND c.endtime HashAggregate | 5 | 1502 | 3.10 2 | -> Nested Loop (3,4) | 5 | 1494 | 3.00 3 | -> Data Node Scan on workitem "_REMOTE_TABLE_QUERY_" | 5 | 1502 | 0.00 4 | -> Data Node Scan on workday "_REMOTE_TABLE_QUERY_" | 20 | 8 | 0.00 Predicate Information (identified by plan id) ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- 2 --Nested Loop (3,4) Join Filter: ((c.createtime - (timeofday())::timestamp without time zone) = ((d.mm)::timestamp without time zone - (timeofday())::timestamp without time zone)) (11 rows)检查关联条件中两表字段是否在等号两侧,若不是则进行调整。
postgres=# explain SELECT /*+ hashjoin(c d) */ c.userid,c.type_name,c.type_code,count(1) num FROM workitem c INNER JOIN workday d ON ifnull(c.createtime,d.mm) = now() WHERE c.createtime >= '2023-09-01' AND c.endtime Row Adapter | 1 | | 1502 | 35.37 2 | -> Vector Sonic Hash Aggregate | 1 | | 1502 | 35.37 3 | -> Vector Streaming (type: GATHER) | 2 | | 1502 | 35.37 4 | -> Vector Sonic Hash Aggregate | 2 | 16MB | 1502 | 29.37 5 | -> Vector Nest Loop (6,8) | 2 | 1MB | 1494 | 29.35 6 | -> Vector Streaming(type: BROADCAST) | 40 | 2MB | 8 | 15.06 7 | -> CStore Scan on workday d | 20 | 1MB | 8 | 13.01 8 | -> Vector Materialize | 5 | 16MB | 1502 | 13.09 9 | -> CStore Scan on workitem c | 5 | 1MB | 1502 | 13.08 Predicate Information (identified by plan id) -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 5 --Vector Nest Loop (6,8) Join Filter: (COALESCE((c.createtime)::timestamp with time zone, d.mm) = now()) 9 --CStore Scan on workitem c Filter: ((createtime >= '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime = '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime检查关联条件是否为等值关联,若不是则进行调整。
postgres=# explain SELECT /*+ hashjoin(c d) */ c.userid,c.type_name,c.type_code,count(1) num FROM workitem c INNER JOIN workday d ON c.createtime::timestamptz > d.mm WHERE c.createtime >= '2023-09-01' AND c.endtime Row Adapter | 5 | | 1502 | 35.41 2 | -> Vector Sonic Hash Aggregate | 5 | | 1502 | 35.41 3 | -> Vector Streaming (type: GATHER) | 10 | | 1502 | 35.41 4 | -> Vector Sonic Hash Aggregate | 10 | 16MB | 1502 | 29.41 5 | -> Vector Nest Loop (6,8) | 33 | 1MB | 1494 | 29.20 6 | -> Vector Streaming(type: BROADCAST) | 40 | 2MB | 8 | 15.06 7 | -> CStore Scan on workday d | 20 | 1MB | 8 | 13.01 8 | -> Vector Materialize | 5 | 16MB | 1502 | 13.09 9 | -> CStore Scan on workitem c | 5 | 1MB | 1502 | 13.08 Predicate Information (identified by plan id) -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 5 --Vector Nest Loop (6,8) Join Filter: ((c.createtime)::timestamp with time zone > d.mm) 9 --CStore Scan on workitem c Filter: ((createtime >= '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime = '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime检查关联关系两侧的数据类型,并通过pg_operator.oprcanhash确认是否支持hash连接,若不支持则需改写为支持的操作符。
postgres=# select * from pg_operator where oprname = '=' and oprleft = 'timestamp'::regtype and oprright = 'timestamptz'::regtype; -[ RECORD 1 ]+------------------------- oprname | = oprnamespace | 11 oprowner | 10 oprkind | b oprcanmerge | t oprcanhash | f oprleft | 1114 oprright | 1184 oprresult | 16 oprcom | 2542 oprnegate | 2539 oprcode | timestamp_eq_timestamptz oprrest | eqsel oprjoin | eqjoinsel如果是指定join顺序的hint,如leading(c e),会存在逻辑本身冲突导致hint失败的情况。
postgres=# explain SELECT /*+ leading(c e) */ c.userid,c.type_name,c.type_code,count(1) num FROM workitem c LEFT JOIN workday d ON c.createtime = d.mm LEFT JOIN workday e ON d.mm = e.mm WHERE c.createtime >= '2023-09-01' AND c.endtime Row Adapter | 2 | | 1502 | 47.97 2 | -> Vector Sonic Hash Aggregate | 2 | | 1502 | 47.97 3 | -> Vector Streaming (type: GATHER) | 4 | | 1502 | 47.97 4 | -> Vector Sonic Hash Aggregate | 4 | 16MB | 1502 | 41.97 5 | -> Vector Nest Loop Left Join (6, 7) | 5 | 1MB | 1494 | 41.93 6 | -> CStore Scan on workitem c | 5 | 1MB | 1502 | 13.08 7 | -> Vector Materialize | 40 | 16MB | 8 | 28.00 8 | -> Vector Streaming(type: BROADCAST) | 40 | 2MB | 8 | 27.90 9 | -> Vector Hash Left Join (10, 11) | 20 | 16MB | 8 | 26.32 10 | -> CStore Scan on workday d | 20 | 1MB | 8 | 13.01 11 | -> CStore Scan on workday e | 20 | 1MB | 8 | 13.01 Predicate Information (identified by plan id) -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 5 --Vector Nest Loop Left Join (6, 7) Join Filter: (c.createtime = d.mm) 6 --CStore Scan on workitem c Filter: ((createtime >= '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime = '2023-09-01 00:00:00'::timestamp without time zone) AND (endtime检查查询语句中,from表数量是否超出from_collapse_limit,以及join表数量是否超出join_collapse_limit。超出时存在一定概率使hint无法生效。
比默认值小的数值将降低规划时间,但是可能生成差的执行计划。
postgres=# show from_collapse_limit; from_collapse_limit --------------------- 8 (1 row) postgres=# show join_collapse_limit; join_collapse_limit --------------------- 8 (1 row)检查查询语句中,from表数量是否超出geqo_threshold(geqo开启的前提下),如果超出会使用基因查询优化来生成计划,存在很大概率使hint无法生效。
对于简单的查询,通常用详尽搜索方法,当涉及多个表的查询的时候,用GEQO可以更好的管理查询。
postgres=# show geqo_threshold; geqo_threshold ---------------- 12 (1 row)点击关注,第一时间了解华为云新鲜技术~