A Review of How the subquery_planner Function Is Implemented in PostgreSQL

2025-03-11 Technical Tutorial

This article reviews the implementation logic of the subquery_planner function in PostgreSQL by walking through its annotated source code.

1. Source Code Walkthrough

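The subquery_planner function is called by standard_planner. It produces the final result relation (the (UPPERREL_FINAL, NULL) upper rel, with its lowest-cost path identified), and its output is the input for generating the actual execution plan. Inside this function, grouping_planner is called to carry out the main planning work.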
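The call chain described above can be sketched with a toy model. This is only an illustration, not PostgreSQL code: ToyQuery, ToyRoot, and the toy_* functions are hypothetical stand-ins, and the single recursion point is a simplification (the real planner reaches sub-SELECTs from several places). The one detail taken directly from the source is the query_level rule: 1 at the top level, parent level plus one otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Query: a chain of nested sub-SELECTs. */
typedef struct ToyQuery
{
    struct ToyQuery *subquery;  /* nested sub-SELECT, or NULL */
} ToyQuery;

/* Hypothetical stand-in for PlannerInfo ("root"). */
typedef struct ToyRoot
{
    int query_level;            /* 1 at the top level */
    int max_depth_seen;         /* deepest query level reached below */
} ToyRoot;

/* Placeholder for grouping_planner: the main planning work is elided. */
void
toy_grouping_planner(ToyRoot *root)
{
    assert(root->query_level >= 1);
}

/*
 * Mirrors the overall shape of subquery_planner: set up the per-query
 * state once, recurse for each sub-SELECT, then hand off to the main
 * planning step.  (Assumption: one sub-SELECT per level, for brevity.)
 */
ToyRoot
toy_subquery_planner(const ToyQuery *parse, const ToyRoot *parent_root)
{
    ToyRoot root;

    /* Same rule as the real function uses for root->query_level. */
    root.query_level = parent_root ? parent_root->query_level + 1 : 1;
    root.max_depth_seen = root.query_level;

    if (parse->subquery != NULL)
    {
        ToyRoot child = toy_subquery_planner(parse->subquery, &root);

        root.max_depth_seen = child.max_depth_seen;
    }

    toy_grouping_planner(&root);
    return root;
}

/* Plays the role of standard_planner: the top-level entry point. */
ToyRoot
toy_standard_planner(const ToyQuery *parse)
{
    return toy_subquery_planner(parse, NULL);
}
```

Calling toy_standard_planner on a query that nests two sub-SELECTs yields a root at level 1 whose deepest descendant was planned at level 3, which matches how parent_root threads the nesting depth through the recursion.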

/*--------------------
 * subquery_planner
 *    Invokes the planner on a subquery.  We recurse to here for each
 *    sub-SELECT found in the query tree.
 *
 * glob is the global state for the current planner run.
 * parse is the querytree produced by the parser & rewriter.
 * parent_root is the immediate parent Query's info (NULL at the top level).
 * hasRecursion is true if this is a recursive WITH query.
 * tuple_fraction is the fraction of tuples we expect will be retrieved.
 * tuple_fraction is interpreted as explained for grouping_planner, below.
 *
 * Basically, this routine does the stuff that should only be done once
 * per Query object.  It then calls grouping_planner.  At one time,
 * grouping_planner could be invoked recursively on the same Query object;
 * that's not currently true, but we keep the separation between the two
 * routines anyway, in case we need it again someday.
 *
 * subquery_planner will be called recursively to handle sub-Query nodes
 * found within the query's expressions and rangetable.
 *
 * Returns the PlannerInfo struct ("root") that contains all data generated
 * while planning the subquery.  In particular, the Path(s) attached to
 * the (UPPERREL_FINAL, NULL) upperrel represent our conclusions about the
 * cheapest way(s) to implement the query.  The top level will select the
 * best Path and pass it through createplan.c to produce a finished Plan.
 *--------------------
 */
/*
 * Input:
 *    glob           - PlannerGlobal
 *    parse          - pointer to the Query struct
 *    parent_root    - the parent's PlannerInfo ("root") node
 *    hasRecursion   - is this a recursive WITH query?
 *    tuple_fraction - fraction of tuples expected to be retrieved
 * Output:
 *    pointer to a PlannerInfo
 */
PlannerInfo *
subquery_planner(PlannerGlobal *glob, Query *parse,
                 PlannerInfo *parent_root,
                 bool hasRecursion, double tuple_fraction)
{
    PlannerInfo *root;          // return value
    List       *newWithCheckOptions;
    List       *newHaving;      // HAVING clauses that stay in HAVING
    bool        hasOuterJoins;  // does any outer join exist?
    RelOptInfo *final_rel;
    ListCell   *l;              // loop variable

    /* Create a PlannerInfo data structure for this subquery */
    root = makeNode(PlannerInfo);   // build the return value
    root->parse = parse;
    root->glob = glob;
    root->query_level = parent_root ? parent_root->query_level + 1 : 1;
    root->parent_root = parent_root;
    root->plan_params = NIL;
    root->outer_params = NULL;
    root->planner_cxt = CurrentMemoryContext;
    root->init_plans = NIL;
    root->cte_plan_ids = NIL;
    root->multiexpr_params = NIL;
    root->eq_classes = NIL;
    root->append_rel_list = NIL;
    root->rowMarks = NIL;
    memset(root->upper_rels, 0, sizeof(root->upper_rels));
    memset(root->upper_targets, 0, sizeof(root->upper_targets));
    root->processed_tlist = NIL;
    root->grouping_map = NULL;
    root->minmax_aggs = NIL;
    root->qual_security_level = 0;
    root->inhTargetKind = INHKIND_NONE;
    root->hasRecursion = hasRecursion;
    if (hasRecursion)
        root->wt_param_id = SS_assign_special_param(root);
    else
        root->wt_param_id = -1;
    root->non_recursive_path = NULL;
    root->partColsUpdated = false;

    /*
     * If there is a WITH list, process each WITH query and build an initplan
     * SubPlan structure for it.
     */
    if (parse->cteList)
        SS_process_ctes(root);      // process the WITH queries

    /*
     * Look for ANY and EXISTS SubLinks in WHERE and JOIN/ON clauses, and try
     * to transform them into joins.  Note that this step does not descend
     * into subqueries; if we pull up any subqueries below, their SubLinks are
     * processed just before pulling them up.
     */
    if (parse->hasSubLinks)
        pull_up_sublinks(root);     // pull up sublinks

    /*
     * Scan the rangetable for set-returning functions, and inline them if
     * possible (producing subqueries that might get pulled up next).
     * Recursion issues here are handled in the same way as for SubLinks.
     */
    inline_set_returning_functions(root);

    /*
     * Check to see if any subqueries in the jointree can be merged into this
     * query.
     */
    pull_up_subqueries(root);       // pull up subqueries

    /*
     * If this is a simple UNION ALL query, flatten it into an appendrel. We
     * do this now because it requires applying pull_up_subqueries to the leaf
     * queries of the UNION ALL, which weren't touched above because they
     * weren't referenced by the jointree (they will be after we do this).
     */
    if (parse->setOperations)
        flatten_simple_union_all(root);     // flatten UNION ALL

    /*
     * Detect whether any rangetable entries are RTE_JOIN kind; if not, we can
     * avoid the expense of doing flatten_join_alias_vars().  Also check for
     * outer joins --- if none, we can skip reduce_outer_joins().  And check
     * for LATERAL RTEs, too.  This must be done after we have done
     * pull_up_subqueries(), of course.
     */
    // check whether any RTE is of kind RTE_JOIN
    root->hasJoinRTEs = false;
    root->hasLateralRTEs = false;
    hasOuterJoins = false;
    foreach(l, parse->rtable)
    {
        RangeTblEntry *rte = lfirst_node(RangeTblEntry, l);

        if (rte->rtekind == RTE_JOIN)
        {
            root->hasJoinRTEs = true;
            if (IS_OUTER_JOIN(rte->jointype))
                hasOuterJoins = true;
        }
        if (rte->lateral)
            root->hasLateralRTEs = true;
    }

    /*
     * Preprocess RowMark information.  We need to do this after subquery
     * pullup (so that all non-inherited RTEs are present) and before
     * inheritance expansion (so that the info is available for
     * expand_inherited_tables to examine and modify).
     */
    preprocess_rowmarks(root);      // preprocess RowMark information

    /*
     * Expand any rangetable entries that are inheritance sets into "append
     * relations".  This can add entries to the rangetable, but they must be
     * plain base relations not joins, so it's OK (and marginally more
     * efficient) to do it after checking for join RTEs.  We must do it after
     * pulling up subqueries, else we'd fail to handle inherited tables in
     * subqueries.
     */
    expand_inherited_tables(root);  // expand inherited tables

    /*
     * Set hasHavingQual to remember if HAVING clause is present.  Needed
     * because preprocess_expression will reduce a constant-true condition to
     * an empty qual list ... but "HAVING TRUE" is not a semantic no-op.
     */
    root->hasHavingQual = (parse->havingQual != NULL);

    /* Clear this flag; might get set in distribute_qual_to_rels */
    root->hasPseudoConstantQuals = false;

    /*
     * Do expression preprocessing on targetlist and quals, as well as other
     * random expressions in the querytree.  Note that we do not need to
     * handle sort/group expressions explicitly, because they are actually
     * part of the targetlist.
     */
    // preprocess expressions: targetList (projection columns)
    parse->targetList = (List *)
        preprocess_expression(root, (Node *) parse->targetList,
                              EXPRKIND_TARGET);

    /* Constant-folding might have removed all set-returning functions */
    if (parse->hasTargetSRFs)
        parse->hasTargetSRFs = expression_returns_set((Node *) parse->targetList);

    newWithCheckOptions = NIL;
    foreach(l, parse->withCheckOptions)     // WithCheckOptions
    {
        WithCheckOption *wco = lfirst_node(WithCheckOption, l);

        wco->qual = preprocess_expression(root, wco->qual,
                                          EXPRKIND_QUAL);
        if (wco->qual != NULL)
            newWithCheckOptions = lappend(newWithCheckOptions, wco);
    }
    parse->withCheckOptions = newWithCheckOptions;

    // RETURNING list
    parse->returningList = (List *)
        preprocess_expression(root, (Node *) parse->returningList,
                              EXPRKIND_TARGET);

    // preprocess qual conditions
    preprocess_qual_conditions(root, (Node *) parse->jointree);

    // preprocess the HAVING expression
    parse->havingQual = preprocess_expression(root, parse->havingQual,
                                              EXPRKIND_QUAL);

    // window clauses
    foreach(l, parse->windowClause)
    {
        WindowClause *wc = lfirst_node(WindowClause, l);

        /* partitionClause/orderClause are sort/group expressions */
        wc->startOffset = preprocess_expression(root, wc->startOffset,
                                                EXPRKIND_LIMIT);
        wc->endOffset = preprocess_expression(root, wc->endOffset,
                                              EXPRKIND_LIMIT);
    }

    // LIMIT/OFFSET clauses
    parse->limitOffset = preprocess_expression(root, parse->limitOffset,
                                               EXPRKIND_LIMIT);
    parse->limitCount = preprocess_expression(root, parse->limitCount,
                                              EXPRKIND_LIMIT);

    // ON CONFLICT clause
    if (parse->onConflict)
    {
        parse->onConflict->arbiterElems = (List *)
            preprocess_expression(root,
                                  (Node *) parse->onConflict->arbiterElems,
                                  EXPRKIND_ARBITER_ELEM);
        parse->onConflict->arbiterWhere =
            preprocess_expression(root,
                                  parse->onConflict->arbiterWhere,
                                  EXPRKIND_QUAL);
        parse->onConflict->onConflictSet = (List *)
            preprocess_expression(root,
                                  (Node *) parse->onConflict->onConflictSet,
                                  EXPRKIND_TARGET);
        parse->onConflict->onConflictWhere =
            preprocess_expression(root,
                                  parse->onConflict->onConflictWhere,
                                  EXPRKIND_QUAL);
        /* exclRelTlist contains only Vars, so no preprocessing needed */
    }

    // append relations (AppendRelInfo)
    root->append_rel_list = (List *)
        preprocess_expression(root, (Node *) root->append_rel_list,
                              EXPRKIND_APPINFO);

    /* Also need to preprocess expressions within RTEs */
    foreach(l, parse->rtable)
    {
        RangeTblEntry *rte = lfirst_node(RangeTblEntry, l);
        int         kind;
        ListCell   *lcsq;

        if (rte->rtekind == RTE_RELATION)
        {
            if (rte->tablesample)
                rte->tablesample = (TableSampleClause *)
                    preprocess_expression(root,
                                          (Node *) rte->tablesample,
                                          EXPRKIND_TABLESAMPLE);   // TABLESAMPLE clause
        }
        else if (rte->rtekind == RTE_SUBQUERY)      // subquery
        {
            /*
             * We don't want to do all preprocessing yet on the subquery's
             * expressions, since that will happen when we plan it.  But if it
             * contains any join aliases of our level, those have to get
             * expanded now, because planning of the subquery won't do it.
             * That's only possible if the subquery is LATERAL.
             */
            if (rte->lateral && root->hasJoinRTEs)
                rte->subquery = (Query *)
                    flatten_join_alias_vars(root, (Node *) rte->subquery);
        }
        else if (rte->rtekind == RTE_FUNCTION)      // function
        {
            /* Preprocess the function expression(s) fully */
            kind = rte->lateral ? EXPRKIND_RTFUNC_LATERAL : EXPRKIND_RTFUNC;
            rte->functions = (List *)
                preprocess_expression(root, (Node *) rte->functions, kind);
        }
        else if (rte->rtekind == RTE_TABLEFUNC)     // table function
        {
            /* Preprocess the function expression(s) fully */
            kind = rte->lateral ? EXPRKIND_TABLEFUNC_LATERAL : EXPRKIND_TABLEFUNC;
            rte->tablefunc = (TableFunc *)
                preprocess_expression(root, (Node *) rte->tablefunc, kind);
        }
        else if (rte->rtekind == RTE_VALUES)        // VALUES clause
        {
            /* Preprocess the values lists fully */
            kind = rte->lateral ? EXPRKIND_VALUES_LATERAL : EXPRKIND_VALUES;
            rte->values_lists = (List *)
                preprocess_expression(root, (Node *) rte->values_lists, kind);
        }

        /*
         * Process each element of the securityQuals list as if it were a
         * separate qual expression (as indeed it is).  We need to do it this
         * way to get proper canonicalization of AND/OR structure.  Note that
         * this converts each element into an implicit-AND sublist.
         */
        foreach(lcsq, rte->securityQuals)
        {
            lfirst(lcsq) = preprocess_expression(root,
                                                 (Node *) lfirst(lcsq),
                                                 EXPRKIND_QUAL);
        }
    }

    /*
     * Now that we are done preprocessing expressions, and in particular done
     * flattening join alias variables, get rid of the joinaliasvars lists.
     * They no longer match what expressions in the rest of the tree look
     * like, because we have not preprocessed expressions in those lists (and
     * do not want to; for example, expanding a SubLink there would result in
     * a useless unreferenced subplan).  Leaving them in place simply creates
     * a hazard for later scans of the tree.  We could try to prevent that by
     * using QTW_IGNORE_JOINALIASES in every tree scan done after this point,
     * but that doesn't sound very reliable.
     */
    if (root->hasJoinRTEs)
    {
        foreach(l, parse->rtable)
        {
            RangeTblEntry *rte = lfirst_node(RangeTblEntry, l);

            rte->joinaliasvars = NIL;
        }
    }

    /*
     * In some cases we may want to transfer a HAVING clause into WHERE. We
     * cannot do so if the HAVING clause contains aggregates (obviously) or
     * volatile functions (since a HAVING clause is supposed to be executed
     * only once per group).  We also can't do this if there are any nonempty
     * grouping sets; moving such a clause into WHERE would potentially change
     * the results, if any referenced column isn't present in all the grouping
     * sets.  (If there are only empty grouping sets, then the HAVING clause
     * must be degenerate as discussed below.)
     *
     * Also, it may be that the clause is so expensive to execute that we're
     * better off doing it only once per group, despite the loss of
     * selectivity.  This is hard to estimate short of doing the entire
     * planning process twice, so we use a heuristic: clauses containing
     * subplans are left in HAVING.  Otherwise, we move or copy the HAVING
     * clause into WHERE, in hopes of eliminating tuples before aggregation
     * instead of after.
     *
     * If the query has explicit grouping then we can simply move such a
     * clause into WHERE; any group that fails the clause will not be in the
     * output because none of its tuples will reach the grouping or
     * aggregation stage.  Otherwise we must have a degenerate (variable-free)
     * HAVING clause, which we put in WHERE so that query_planner() can use it
     * in a gating Result node, but also keep in HAVING to ensure that we
     * don't emit a bogus aggregated row. (This could be done better, but it
     * seems not worth optimizing.)
     *
     * Note that both havingQual and parse->jointree->quals are in
     * implicitly-ANDed-list form at this point, even though they are declared
     * as Node *.
     */
    newHaving = NIL;
    foreach(l, (List *) parse->havingQual)
    {
        Node       *havingclause = (Node *) lfirst(l);

        if ((parse->groupClause && parse->groupingSets) ||
            contain_agg_clause(havingclause) ||
            contain_volatile_functions(havingclause) ||
            contain_subplans(havingclause))
        {
            /* keep it in HAVING */
            newHaving = lappend(newHaving, havingclause);
        }
        else if (parse->groupClause && !parse->groupingSets)
        {
            /* move it to WHERE */
            parse->jointree->quals = (Node *)
                lappend((List *) parse->jointree->quals, havingclause);
        }
        else
        {
            /* put a copy in WHERE, keep it in HAVING */
            parse->jointree->quals = (Node *)
                lappend((List *) parse->jointree->quals,
                        copyObject(havingclause));
            newHaving = lappend(newHaving, havingclause);
        }
    }
    parse->havingQual = (Node *) newHaving;

    /* Remove any redundant GROUP BY columns */
    remove_useless_groupby_columns(root);

    /*
     * If we have any outer joins, try to reduce them to plain inner joins.
     * This step is most easily done after we've done expression
     * preprocessing.
     */
    if (hasOuterJoins)
        reduce_outer_joins(root);

    /*
     * Do the main planning.  If we have an inherited target relation, that
     * needs special processing, else go straight to grouping_planner.
     */
    if (parse->resultRelation &&
        rt_fetch(parse->resultRelation, parse->rtable)->inh)
        inheritance_planner(root);
    else
        grouping_planner(root, false, tuple_fraction);

    /*
     * Capture the set of outer-level param IDs we have access to, for use in
     * extParam/allParam calculations later.
     */
    SS_identify_outer_params(root);

    /*
     * If any initPlans were created in this query level, adjust the surviving
     * Paths' costs and parallel-safety flags to account for them.  The
     * initPlans won't actually get attached to the plan tree till
     * create_plan() runs, but we must include their effects now.
     */
    final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL);
    SS_charge_for_initplans(root, final_rel);

    /*
     * Make sure we've identified the cheapest Path for the final rel.  (By
     * doing this here not in grouping_planner, we include initPlan costs in
     * the decision, though it's unlikely that will change anything.)
     */
    set_cheapest(final_rel);

    return root;
}
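The HAVING-to-WHERE step is the most intricate decision in the function, so a small standalone model may help. The sketch below is not PostgreSQL code: ToyHavingClause, ToyHavingPlacement, and the toy_* functions are hypothetical names, and the three boolean fields merely stand in for the results of contain_agg_clause, contain_volatile_functions, and contain_subplans. Only the branch structure is taken from the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the checks made on one HAVING clause. */
typedef struct ToyHavingClause
{
    bool has_aggregate;     /* stands in for contain_agg_clause() */
    bool has_volatile;      /* stands in for contain_volatile_functions() */
    bool has_subplan;       /* stands in for contain_subplans() */
} ToyHavingClause;

typedef enum ToyHavingPlacement
{
    KEEP_IN_HAVING,         /* clause stays where it is */
    MOVE_TO_WHERE,          /* clause is filtered before aggregation */
    COPY_TO_WHERE_AND_KEEP  /* degenerate clause: used in both places */
} ToyHavingPlacement;

/* Same three-way branch as the HAVING loop in subquery_planner. */
ToyHavingPlacement
toy_place_having_clause(ToyHavingClause clause,
                        bool has_group_clause, bool has_grouping_sets)
{
    if ((has_group_clause && has_grouping_sets) ||
        clause.has_aggregate ||
        clause.has_volatile ||
        clause.has_subplan)
        return KEEP_IN_HAVING;
    else if (has_group_clause && !has_grouping_sets)
        return MOVE_TO_WHERE;
    else
        return COPY_TO_WHERE_AND_KEEP;
}

/* Count how many clauses end up in WHERE, whether moved or copied. */
int
toy_count_where_clauses(const ToyHavingClause *clauses, int n,
                        bool has_group_clause, bool has_grouping_sets)
{
    int count = 0;

    assert(n == 0 || clauses != NULL);
    for (int i = 0; i < n; i++)
    {
        if (toy_place_having_clause(clauses[i], has_group_clause,
                                    has_grouping_sets) != KEEP_IN_HAVING)
            count++;
    }
    return count;
}
```

Under this model, a clause with no aggregates, volatile functions, or subplans moves to WHERE when the query has a plain GROUP BY, stays in HAVING when nonempty grouping sets are present, and is copied when there is no GROUP BY at all. The copy case corresponds to the degenerate (variable-free) HAVING clause that the source comment describes.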
