PostgreSQL JOIN limit 优化器 成本计算 改进 - mergejoin startup cost 优化
PostgreSQL limit N的成本估算,是通过计算总成本A,以及估算得到的总记录数B得到:
(N/B)*A大概意思就是占比的方法计算
对于单表查询,这种方法通常来说比较适用,但是如果数据分布有倾斜,实际上也并不一定适用,例如以下两种情况:
1、符合条件的数据占总记录数的50%,但是全部分布在表的末尾,那么limit 10000 条到底是走索引快还是走全表扫描快呢?
2、符合条件的数据占总记录数的50%,全部分布在表的头部,那么LIMIT 10000 条,肯定是全表扫描快了。
对于JOIN的情况,同样有类似的问题:
比如JOIN并且带条件时,LIMIT N,是走嵌套循环快,还是走MERGE 或 HASH JOIN快?
1、嵌套循环+where+LIMIT的成本计算方法,可以使用LIMIT占总估算记录数占比的方法得到,还算是比较合理。
2、MERGE JOIN+where+LIMIT的成本计算方法,必须考虑启动成本,例如WHERE条件在A表上(可以走索引直接从条件位置开始扫描),B表则需要从索引的开头开始扫描(到与A表的索引匹配时,也许需要扫描很多的索引ENTRY,这个启动成本可能会很高),启动成本,加上LIMIT条数在剩余的所有成本中的一个占比,得到的成本是一个比较合理的成本。
3、hash join+where+limit的成本计算方法,使用启动成本+LIMIT占总估算记录数占比的方法得到,优化器目前就是这么做的,比较合理。
然而,对于MERGE JOIN,目前在使用LIMIT时,PG没有加上这个启动成本,使得最后得到的执行计划可能不准确。
改进方法建议可以加入merge join启动成本。
PostgreSQL 例子1、建表如下:
postgres=#createtabletest1(aint,btext,primarykey(a));CREATETABLEpostgres=#createtabletest2(aint,btext,primarykey(a));CREATETABLEpostgres=#altertabletest1addconstrainttestcheckforeignkey(a)referencestest2(a);ALTERTABLEpostgres=#insertintotest2selectgenerate_series(1,1000000),'abcdefg';INSERT01000000postgres=#insertintotest1selectgenerate_series(1,1000000,2),'abcdefg';INSERT0500000analyzetest1;analyzetest2;
2、查询SQL如下:
explain(analyze,verbose,timing,costs,buffers)select*fromtest2leftjointest1ontest2.a=test1.awheretest2.a>500000limit10;
该语句中表结构比较特殊,两个关联字段都是主键,并且存在外键约束关系,查询计划如下:
QUERYPLAN----------------------------------------------------------------------------------------------------------------------------------------------------Limit(cost=0.73..0.89rows=10width=24)(actualtime=54.729..54.739rows=10loops=1)Output:test2.a,test2.b,test1.a,test1.bBuffers:sharedhit=2042->MergeLeftJoin(cost=0.73..7929.35rows=498340width=24)(actualtime=54.728..54.735rows=10loops=1)Output:test2.a,test2.b,test1.a,test1.bInnerUnique:trueMergeCond:(test2.a=test1.a)Buffers:sharedhit=2042->IndexScanusingtest2_pkeyonpublic.test2(cost=0.37..3395.42rows=498340width=12)(actualtime=0.017..0.020rows=10loops=1)Output:test2.a,test2.bIndexCond:(test2.a>500000)Buffers:sharedhit=4->IndexScanusingtest1_pkeyonpublic.test1(cost=0.37..2322.99rows=500000width=12)(actualtime=0.006..34.120rows=250006loops=1)Output:test1.a,test1.bBuffers:sharedhit=2038PlanningTime:0.216msExecutionTime:54.765ms(17rows)
从执行计划上可以看出,根据test2表先查询出满足条件的10条记录,然后和test1表采用mergejoin关联,由于在估算的时候没有考虑到limit的影响,估算的行数非常大,是498340行,
实际采用nestloop效果会更好(关闭掉seqscan和megejoin)
postgres=#setenable_seqscan=off;SETpostgres=#setenable_mergejoin=off;SETpostgres=#explain(analyze,verbose,timing,costs,buffers)select*fromtest2leftjointest1ontest2.a=test1.awheretest2.a>500000limit10;QUERYPLAN-----------------------------------------------------------------------------------------------------------------------------------------------Limit(cost=0.73..4.53rows=10width=24)(actualtime=0.040..0.060rows=10loops=1)Output:test2.a,test2.b,test1.a,test1.bBuffers:sharedhit=39->NestedLoopLeftJoin(cost=0.73..189339.64rows=498340width=24)(actualtime=0.039..0.057rows=10loops=1)Output:test2.a,test2.b,test1.a,test1.bInnerUnique:trueBuffers:sharedhit=39->IndexScanusingtest2_pkeyonpublic.test2(cost=0.37..3395.42rows=498340width=12)(actualtime=0.025..0.027rows=10loops=1)Output:test2.a,test2.bIndexCond:(test2.a>500000)Buffers:sharedhit=4->IndexScanusingtest1_pkeyonpublic.test1(cost=0.37..0.37rows=1width=12)(actualtime=0.002..0.002rows=0loops=10)Output:test1.a,test1.bIndexCond:(test2.a=test1.a)Buffers:sharedhit=35PlanningTime:0.112msExecutionTime:0.078ms(17rows)
但是从评估的成本来看,merge join+limit 比 nestloop+limit更低,原因是nestloop的总成本更高(189339 比 7929)。所以优化器根据比例算法(未参照merge join的启动成本),认为在LIMIT的情况下,也是merge join成本更低。
实际情况是,MERGE JOIN的没带查询条件的B表,需要从索引的头部开始扫,而不是从指定位置开始扫。 因此实际情况是merge join是更慢的。
目前优化器使用hash join时,已经算上了startup cost,例子
postgres=#setenable_mergejoin=off;SETpostgres=#setenable_seqscan=off;SETpostgres=#setenable_nestloop=off;SET启动成本=3536.51postgres=#explain(analyze,verbose,timing,costs,buffers)select*fromtest2leftjointest1ontest2.a=test1.awheretest2.a>500000limit10;QUERYPLAN----------------------------------------------------------------------------------------------------------------------------------------------------------Limit(cost=3536.51..3536.61rows=10width=24)(actualtime=158.148..158.219rows=10loops=1)Output:test2.a,test2.b,test1.a,test1.bBuffers:sharedhit=4079,tempwritten=1464->HashLeftJoin(cost=3536.51..8135.83rows=494590width=24)(actualtime=158.147..158.215rows=10loops=1)Output:test2.a,test2.b,test1.a,test1.bInnerUnique:trueHashCond:(test2.a=test1.a)Buffers:sharedhit=4079,tempwritten=1464->IndexScanusingtest2_pkeyonpublic.test2(cost=0.37..3369.86rows=494590width=12)(actualtime=0.023..0.027rows=26loops=1)Output:test2.a,test2.bIndexCond:(test2.a>500000)Buffers:sharedhit=4->Hash(cost=2322.99..2322.99rows=500000width=12)(actualtime=156.848..156.849rows=500000loops=1)Output:test1.a,test1.bBuckets:262144Batches:4MemoryUsage:7418kBBuffers:sharedhit=4072,tempwritten=1464->IndexScanusingtest1_pkeyonpublic.test1(cost=0.37..2322.99rows=500000width=12)(actualtime=0.011..72.506rows=500000loops=1)Output:test1.a,test1.bBuffers:sharedhit=4072PlanningTime:0.141msExecutionTime:162.086ms(21rows)改进建议
针对test1表,需要估算a<500000有多少行,作为索引扫描的startup成本。
postgres=#explainselect*fromtest1wherea<500000;QUERYPLAN---------------------------------------------------------------------------------IndexScanusingtest1_pkeyontest1(cost=0.37..1702.83rows=249893width=12)IndexCond:(a<500000)(2rows)postgres=#explainselect*fromtest1;QUERYPLAN-------------------------------------------------------------SeqScanontest1(cost=0.00..133.15rows=500000width=12)(1row)
所以,索引扫描test1(where a > 500000)的merge join启动成本应该有 1702,加上这个成本后,成本远大于NEST LOOP JOIN的成本,所以不会选择merge join。
Oracle 例子createtabletest1(aint,bvarchar2(4000),primarykey(a));createtabletest2(aint,bvarchar2(4000),primarykey(a));altertabletest1addconstrainttestcheckforeignkey(a)referencestest2(a);insertintotest2selectrownum,'abcdefg'fromdualconnectbylevel<=1000000;insertintotest1select*from(selectrownumasrn,'abcdefg'fromdualconnectbylevel<=1000000)twheremod(rn,2)=1;
execDBMS_STATS.GATHER_TABLE_STATS('DIGOAL','TEST1',method_opt=>'FORCOLUMNS(a,b)');execDBMS_STATS.GATHER_TABLE_STATS('DIGOAL','TEST2',method_opt=>'FORCOLUMNS(a,b)');
查询SQL如下:
setautotraceonsetlinesize120setpagesize200setwrapoffselect*fromtest2leftjointest1ontest2.a=test1.awheretest2.a>500000andrownum<=10;AB-----------------------------------------------------------------------------------------------------------------------500001abcdefg500002abcdefg500003abcdefg500004abcdefg500005abcdefg500006abcdefg500007abcdefg500008abcdefg500009abcdefg500010abcdefg10rowsselected.ExecutionPlan----------------------------------------------------------Planhashvalue:3391785554-----------------------------------------------------------------------------------------------|Id|Operation|Name|Rows|Bytes|Cost(%CPU)|Time|-----------------------------------------------------------------------------------------------|0|SELECTSTATEMENT||10|500|15(0)|00:00:01||*1|COUNTSTOPKEY|||||||2|NESTEDLOOPSOUTER||10|500|15(0)|00:00:01||3|TABLEACCESSBYINDEXROWID|TEST2|10|250|4(0)|00:00:01||*4|INDEXRANGESCAN|SYS_C00151146|9000||3(0)|00:00:01||5|TABLEACCESSBYINDEXROWID|TEST1|1|25|2(0)|00:00:01||*6|INDEXUNIQUESCAN|SYS_C00151145|1||1(0)|00:00:01|-----------------------------------------------------------------------------------------------PredicateInformation(identifiedbyoperationid):---------------------------------------------------1-filter(ROWNUM<=10)4-access("TEST2"."A">500000)6-access("TEST2"."A"="TEST1"."A"(+))filter("TEST1"."A"(+)>500000)Statistics----------------------------------------------------------0recursivecalls0dbblockgets25consistentgets0physicalreads0redosize937bytessentviaSQL*Nettoclient500bytesreceivedviaSQL*Netfromclient2SQL*Netroundtripsto/fromclient0sorts(memory)0sorts(disk)10rowsprocessed
Oracle 选择了nestloop join。
使用HINT,让Oracle使用merge join,看看成本是多少,是否与修正PostgreSQL merge join启动成本接近。
select/*+USE_MERGE(test2,test1)*/*fromtest2leftjointest1ontest2.a=test1.awheretest2.a>500000andrownum<=10;ExecutionPlan----------------------------------------------------------Planhashvalue:492577188------------------------------------------------------------------------------------------------|Id|Operation|Name|Rows|Bytes|Cost(%CPU)|Time|------------------------------------------------------------------------------------------------|0|SELECTSTATEMENT||10|750|29(7)|00:00:01||*1|COUNTSTOPKEY|||||||2|MERGEJOINOUTER||10|750|29(7)|00:00:01||3|TABLEACCESSBYINDEXROWID|TEST2|10|250|4(0)|00:00:01||*4|INDEXRANGESCAN|SYS_C00151146|9000||3(0)|00:00:01||*5|SORTJOIN||25000|610K|25(8)|00:00:01||6|TABLEACCESSBYINDEXROWID|TEST1|25000|610K|23(0)|00:00:01||*7|INDEXRANGESCAN|SYS_C00151145|4500||11(0)|00:00:01|------------------------------------------------------------------------------------------------PredicateInformation(identifiedbyoperationid):---------------------------------------------------1-filter(ROWNUM<=10)4-access("TEST2"."A">500000)5-access("TEST2"."A"="TEST1"."A"(+))filter("TEST2"."A"="TEST1"."A"(+))7-access("TEST1"."A"(+)>500000)Statistics----------------------------------------------------------1recursivecalls0dbblockgets1099consistentgets0physicalreads0redosize937bytessentviaSQL*Nettoclient500bytesreceivedviaSQL*Netfromclient2SQL*Netroundtripsto/fromclient1sorts(memory)0sorts(disk)10rowsprocessed小结
1、PostgreSQL 在计算merge join+limit的成本时,优化器有优化的空间,可以考虑把启动成本算进来,提高优化器选择带limit输出的SQL的JOIN方法的正确性。
2、如果是inner join,通过query rewrite可以对merge join进行优化,跳过不符合条件的头部INDEX SCAN。
postgres=#explain(analyze,verbose,timing,costs,buffers)select*fromtest2jointest1ontest2.a=test1.awheretest2.a>500000limit10;QUERYPLAN----------------------------------------------------------------------------------------------------------------------------------------------------Limit(cost=0.77..1.09rows=10width=24)(actualtime=54.626..54.638rows=10loops=1)Output:test2.a,test2.b,test1.a,test1.bBuffers:sharedhit=2042->MergeJoin(cost=0.77..7895.19rows=247295width=24)(actualtime=54.625..54.635rows=10loops=1)Output:test2.a,test2.b,test1.a,test1.bInnerUnique:trueMergeCond:(test2.a=test1.a)Buffers:sharedhit=2042->IndexScanusingtest2_pkeyonpublic.test2(cost=0.37..3369.86rows=494590width=12)(actualtime=0.017..0.020rows=19loops=1)Output:test2.a,test2.bIndexCond:(test2.a>500000)Buffers:sharedhit=4->IndexScanusingtest1_pkeyonpublic.test1(cost=0.37..2322.99rows=500000width=12)(actualtime=0.008..34.009rows=250010loops=1)Output:test1.a,test1.bBuffers:sharedhit=2038PlanningTime:0.244msExecutionTime:54.669ms(17rows)sqlrewrite:可以做到内核里面,这样就不需要改SQL了。效果如下,超好。postgres=#explain(analyze,verbose,timing,costs,buffers)select*fromtest2jointest1ontest2.a=test1.awheretest2.a>500000andtest1.a>500000limit10;QUERYPLAN-----------------------------------------------------------------------------------------------------------------------------------------------Limit(cost=0.75..1.30rows=10width=24)(actualtime=0.035..0.048rows=10loops=1)Output:test2.a,test2.b,test1.a,test1.bBuffers:sharedhit=8->MergeJoin(cost=0.75..6711.51rows=123700width=24)(actualtime=0.034..0.044rows=10loops=1)Output:test2.a,test2.b,test1.a,test1.bInnerUnique:trueMergeCond:(test2.a=test1.a)Buffers:sharedhit=8->IndexScanusingtest2_pkeyonpublic.test2(cost=0.37..3369.86rows=494590width=12)(actualtime=0.015..0.019rows=19loops=1)Output:test2.a,test2.bIndexCond:(test2.a>500000)Buffers:sharedhit=4->IndexScanusingtest1_pkeyonpublic.test1(cost=0.37..1704.30rows=250106width=12)(actualtime=0.015..0.017rows=10loops=1)Output:test1.a,test1.bIndexCond:(test1.a>500000)Buffers:sharedhit=4PlanningTime:0.276msExecutionTime:0.074ms(18rows)参考
《PostgreSQL 优化器案例之 - order by limit 索引选择问题》
src/backend/optimizer/path/costsize.c
原文地址:https://github.com/digoal/blog/blob/master/201810/20181004_03.md
声明:本站所有文章资源内容,如无特殊说明或标注,均为采集网络资源。如若本站内容侵犯了原著者的合法权益,可联系本站删除。