这篇文章主要讲解了“怎么解决PostgreSQL窗口函数调用的限制”,文中的讲解内容简单清晰,易于学习与理解,下面请大家跟着小编的思路慢慢深入,一起来研究和学习“怎么解决PostgreSQL窗口函数调用的限制”吧!

背景

窗口函数是分析场景常用的,目前(citus 7.5)仅支持两种场景使用window函数,

1、partition by 必须是分布键。

2、where条件里面带分布键的等值过滤条件。

本质上:目前(citus 7.5)window函数不支持跨shard操作,或者说过程中不进行重分布。

而Greenplum这方面做得很好,是一个完整的MPP数据库。

citus window函数的支持

postgres=#\setVERBOSITYverbosepostgres=#selectrow_number()over(partitionbybidorderbyaid)rn,*frompgbench_accounts;ERROR:0A000:couldnotrundistributedquerybecausethewindowfunctionthatisusedcannotbepusheddownHINT:Windowfunctionsaresupportedintwoways.Eitheraddanequalityfilteronthedistributedtables'partitioncolumnorusethewindowfunctionswithaPARTITIONBYclausecontainingthedistributioncolumnLOCATION:DeferErrorIfQueryNotSupported,multi_logical_planner.c:938

满足以下条件即可支持

1、partition by 必须是分布键。

2、where条件里面带分布键的等值过滤条件。

postgres=#selectrow_number()over(partitionbybidorderbyaid)rn,*frompgbench_accountswhereaid=1;rn|aid|bid|abalance|filler----+-----+-----+----------+--------------------------------------------------------------------------------------1|1|1|0|(1row)postgres=#selectrow_number()over(partitionbyaidorderbybid)rn,*frompgbench_accountslimit1;rn|aid|bid|abalance|filler----+-----+-----+----------+--------------------------------------------------------------------------------------1|298|1|0|(1row)

执行计划

postgres=#explainverboseselectrow_number()over(partitionbyaidorderbybid)rn,*frompgbench_accountslimit1;QUERYPLAN---------------------------------------------------------------------------------------------------------------------------------------------------------Limit(cost=0.00..0.00rows=0width=0)Output:remote_scan.rn,remote_scan.aid,remote_scan.bid,remote_scan.abalance,remote_scan.filler->CustomScan(CitusReal-Time)(cost=0.00..0.00rows=0width=0)Output:remote_scan.rn,remote_scan.aid,remote_scan.bid,remote_scan.abalance,remote_scan.fillerTaskCount:128TasksShown:Oneof128->TaskNode:host=172.24.211.224port=1921dbname=postgres->Limit(cost=705.99..706.01rows=1width=105)Output:(row_number()OVER(?)),pgbench_accounts.aid,pgbench_accounts.bid,pgbench_accounts.abalance,pgbench_accounts.filler->WindowAgg(cost=705.99..860.95rows=7748width=105)Output:row_number()OVER(?),pgbench_accounts.aid,pgbench_accounts.bid,pgbench_accounts.abalance,pgbench_accounts.filler->Sort(cost=705.99..725.36rows=7748width=97)Output:pgbench_accounts.aid,pgbench_accounts.bid,pgbench_accounts.abalance,pgbench_accounts.fillerSortKey:pgbench_accounts.aid,pgbench_accounts.bid->SeqScanonpublic.pgbench_accounts_106812pgbench_accounts(cost=0.00..205.48rows=7748width=97)Output:pgbench_accounts.aid,pgbench_accounts.bid,pgbench_accounts.abalance,pgbench_accounts.filler(17rows)postgres=#explainverboseselectrow_number()over(partitionbybidorderbyaid)rn,*frompgbench_accountswhereaid=1;QUERYPLAN-------------------------------------------------------------------------------------------------------------------------------------------------------------CustomScan(CitusRouter)(cost=0.00..0.00rows=0width=0)Output:remote_scan.rn,remote_scan.aid,remote_scan.bid,remote_scan.abalance,remote_scan.fillerTaskCount:1TasksShown:All->TaskNode:host=172.24.211.232port=1921dbname=postgres->WindowAgg(cost=2.51..2.53rows=1width=105)Output:row_number()OVER(?),aid,bid,abalance,filler->Sort(cost=2.51..2.51rows=1width=97)Output:aid,bid,abalance,fillerSortKey:pgbench_accounts.bid->IndexScanusingpgbench_accounts_pkey_106819onpublic.pgbench_accounts_106819pgbench_accounts(cost=0.28..2.50rows=1width=97)Output:aid,bid,abalance,fillerIndexCond:(pgbench_accounts.aid=1)(14rows)

Citus未在window调用中支持重分布的过程。

greenplum window函数的支持

支持任意姿势的window调用

postgres=#createtablet(idint,c1int,c2int);NOTICE:Tabledoesn'thave'DISTRIBUTEDBY'clause--Usingcolumnnamed'id'astheGreenplumDatabasedatadistributionkeyforthistable.HINT:The'DISTRIBUTEDBY'clausedeterminesthedistributionofdata.Makesurecolumn(s)chosenaretheoptimaldatadistributionkeytominimizeskew.CREATETABLEpostgres=#insertintotselectrandom()*100000,random()*10,random()*100fromgenerate_series(1,10000000);INSERT010000000postgres=#explainselectrow_number()over(partitionbyc1orderbyid)rn,*fromt;QUERYPLAN------------------------------------------------------------------------------------------------------------------GatherMotion33:1(slice2;segments:33)(cost=1477974.88..1553064.94rows=10012008width=12)->Window(cost=1477974.88..1553064.94rows=303395width=12)PartitionBy:c1OrderBy:id->Sort(cost=1477974.88..1503004.90rows=303395width=12)SortKey:c1,id//以下在citus中用临时表代替->RedistributeMotion33:33(slice1;segments:33)(cost=0.00..313817.24rows=303395width=12)HashKey:c1->SeqScanont(cost=0.00..113577.08rows=303395width=12)Optimizerstatus:legacyqueryoptimizer(10rows)

甚至一个SQL中支持多个不同维度的partition

postgres=#explainselectrow_number()over(partitionbyc1orderbyid)rn1,row_number()over(partitionbyc2orderbyc1)rn2,*fromt;QUERYPLAN------------------------------------------------------------------------------------------------------------------------------------------------GatherMotion33:1(slice3;segments:33)(cost=3017582.83..3192792.97rows=10012008width=12)->SubqueryScancoplan(cost=3017582.83..3192792.97rows=303395width=12)->Window(cost=3017582.83..3092672.89rows=303395width=12)PartitionBy:coplan.c1OrderBy:coplan.id->Sort(cost=3017582.83..3042612.85rows=303395width=12)SortKey:coplan.c1,coplan.id//以下在citus中用临时表代替->RedistributeMotion33:33(slice2;segments:33)(cost=1477974.88..1853425.18rows=303395width=12)HashKey:coplan.c1->SubqueryScancoplan(cost=1477974.88..1653185.02rows=303395width=12)->Window(cost=1477974.88..1553064.94rows=303395width=12)PartitionBy:t.c2OrderBy:t.c1->Sort(cost=1477974.88..1503004.90rows=303395width=12)SortKey:t.c2,t.c1//以下在citus中用临时表代替->RedistributeMotion33:33(slice1;segments:33)(cost=0.00..313817.24rows=303395width=12)HashKey:t.c2->SeqScanont(cost=0.00..113577.08rows=303395width=12)Optimizerstatus:legacyqueryoptimizer(19rows)

感谢各位的阅读,以上就是“怎么解决PostgreSQL窗口函数调用的限制”的内容了,经过本文的学习后,相信大家对怎么解决PostgreSQL窗口函数调用的限制这一问题有了更深刻的体会,具体使用情况还需要大家实践验证。这里是亿速云,小编将为大家推送更多相关知识点的文章,欢迎关注!