本篇内容主要讲解“PostgreSQL中哪个函数为连接新生成的joinrel构造访问路径”,感兴趣的朋友不妨来看看。本文介绍的方法操作简单快捷,实用性强。下面就让小编来带大家学习“PostgreSQL中哪个函数为连接新生成的joinrel构造访问路径”吧!

一、数据结构

SpecialJoinInfo

/**"Specialjoin"info.**One-sidedouterjoinsconstraintheorderofjoiningpartiallybutnot*completely.Weflattensuchjoinsintotheplanner'stop-levellistof*relationstojoin,butrecordinformationabouteachouterjoinina*SpecialJoinInfostruct.ThesestructsarekeptinthePlannerInfonode's*join_info_list.**Similarly,semijoinsandantijoinscreatedbyflatteningIN(subselect)*andEXISTS(subselect)clausescreatepartialconstraintsonjoinorder.*ThesearelikewiserecordedinSpecialJoinInfostructs.**WemakeSpecialJoinInfosforFULLJOINseventhoughthereisnoflexibility*ofplanningforthem,becausethissimplifiesmake_join_rel()'sAPI.**min_lefthandandmin_righthandarethesetsofbaserelidsthatmustbe*availableoneachsidewhenperformingthespecialjoin.lhs_strictis*trueifthespecialjoin'sconditioncannotsucceedwhentheLHSvariables*areallNULL(thismeansthatanouterjoincancommutewithupper-level*outerjoinsevenifitappearsintheirRHS).Wedon'tbothertoset*lhs_strictforFULLJOINs,however.**Itisnotvalidforeithermin_lefthandormin_righthandtobeemptysets;*iftheywere,thiswouldbreakthelogicthatenforcesjoinorder.**syn_lefthandandsyn_righthandarethesetsofbaserelidsthatare*syntacticallybelowthisspecialjoin.(Theseareneededtohelpcompute*min_lefthandandmin_righthandforhigherjoins.)**delay_upper_joinsissettrueifwedetectapushed-downclausethathas*tobeevaluatedafterthisjoinisformed(becauseitreferencestheRHS).*AnyouterjoinsthathavesuchaclauseandthisjoinintheirRHScannot*commutewiththisjoin,becausethatwouldleavenoplacetocheckthe*pushed-downclause.(Wedon'ttrackthisforFULLJOINs,either.)**Forasemijoin,wealsoextractthejoinoperatorsandtheirRHSarguments*andsetsemi_operators,semi_rhs_exprs,semi_can_btree,andsemi_can_hash.*Thisisdoneinsupportofpossiblyunique-ifyingtheRHS,sowedon't*botherunlessatleastoneofsemi_can_btreeandsemi_can_hashcanbeset*true.(Youmightexpectthatthisinformationwouldbecomputedduring*joinplanning;butit'shelpfultohaveitavailableduringplanningof*parameterizedtablescans,sowestoreitintheSpecialJoinInfostructs.)**jointypeisneverJOIN_RIGHT;aRIGHTJOINishandledbyswitching*theinputstomakeitaLEFTJOIN.Sotheallowedvaluesofjointype*inajoin_info_listmemberareonlyLEFT,FULL,SEMI,orANTI.**Forpurposesofjoinselectivityestimation,wecreatetransient*SpecialJoinInfostructuresforregularinnerjoins;soitispossible*tohavejointype==JOIN_INNERinsuchastructure,eventhoughthisis*notallowedwithinjoin_info_list.Wealsocreatetransient*SpecialJoinInfoswithjointype==JOIN_INNERforouterjoins,sincefor*costestimationpurposesitissometimesusefultoknowthejoinsizeunder*plaininnerjoinsemantics.Notethatlhs_strict,delay_upper_joins,and*ofcoursethesemi_xxxfieldsarenotsetmeaningfullywithinsuchstructs.*/typedefstructSpecialJoinInfo{NodeTagtype;Relidsmin_lefthand;/*baserelidsinminimumLHSforjoin*/Relidsmin_righthand;/*baserelidsinminimumRHSforjoin*/Relidssyn_lefthand;/*baserelidssyntacticallywithinLHS*/Relidssyn_righthand;/*baserelidssyntacticallywithinRHS*/JoinTypejointype;/*alwaysINNER,LEFT,FULL,SEMI,orANTI*/boollhs_strict;/*joinclauseisstrictforsomeLHSrel*/booldelay_upper_joins;/*can'tcommutewithupperRHS*//*RemainingfieldsaresetonlyforJOIN_SEMIjointype:*/boolsemi_can_btree;/*trueifsemi_operatorsareallbtree*/boolsemi_can_hash;/*trueifsemi_operatorsareallhash*/List*semi_operators;/*OIDsofequalityjoinoperators*/List*semi_rhs_exprs;/*righthand-sideexpressionsoftheseops*/}SpecialJoinInfo;

RelOptInfo

typedefenumRelOptKind{RELOPT_BASEREL,//基本关系(如基表/子查询等)RELOPT_JOINREL,//连接产生的关系,要注意的是通过连接等方式产生的结果亦可以视为关系RELOPT_OTHER_MEMBER_REL,RELOPT_OTHER_JOINREL,RELOPT_UPPER_REL,//上层的关系RELOPT_OTHER_UPPER_REL,RELOPT_DEADREL}RelOptKind;/**Isthegivenrelationasimplerelationi.eabaseor"other"member*relation?*/#defineIS_SIMPLE_REL(rel)\((rel)->reloptkind==RELOPT_BASEREL||\(rel)->reloptkind==RELOPT_OTHER_MEMBER_REL)/*Isthegivenrelationajoinrelation?*/#defineIS_JOIN_REL(rel)\((rel)->reloptkind==RELOPT_JOINREL||\(rel)->reloptkind==RELOPT_OTHER_JOINREL)/*Isthegivenrelationanupperrelation?*/#defineIS_UPPER_REL(rel)\((rel)->reloptkind==RELOPT_UPPER_REL||\(rel)->reloptkind==RELOPT_OTHER_UPPER_REL)/*Isthegivenrelationan"other"relation?*/#defineIS_OTHER_REL(rel)\((rel)->reloptkind==RELOPT_OTHER_MEMBER_REL||\(rel)->reloptkind==RELOPT_OTHER_JOINREL||\(rel)->reloptkind==RELOPT_OTHER_UPPER_REL)typedefstructRelOptInfo{NodeTagtype;//节点标识RelOptKindreloptkind;//RelOpt类型/*allrelationsincludedinthisRelOptInfo*/Relidsrelids;/*Relids(rtindex)集合setofbaserelids(rangetableindexes)*//*sizeestimatesgeneratedbyplanner*/doublerows;/*结果元组的估算数量estimatednumberofresulttuples*//*per-relationplannercontrolflags*/boolconsider_startup;/*是否考虑启动成本?是,需要保留启动成本低的路径keepcheap-startup-costpaths?*/boolconsider_param_startup;/*是否考虑参数化?的路径ditto,forparameterizedpaths?*/boolconsider_parallel;/*是否考虑并行处理路径considerparallelpaths?*//*defaultresulttargetlistforPathsscanningthisrelation*/structPathTarget*reltarget;/*扫描该Relation时默认的结果listofVars/Exprs,cost,width*//*materializationinformation*/List*pathlist;/*访问路径链表Pathstructures*/List*ppilist;/*路径链表中使用参数化路径进行ParamPathInfosusedinpathlist*/List*partial_pathlist;/*partialPaths*/structPath*cheapest_startup_path;//代价最低的启动路径structPath*cheapest_total_path;//代价最低的整体路径structPath*cheapest_unique_path;//代价最低的获取唯一值的路径List*cheapest_parameterized_paths;//代价最低的参数化?路径链表/*parameterizationinformationneededforbothbaserelsandjoinrels*//*(seealsolateral_varsandlateral_referencers)*/Relidsdirect_lateral_relids;/*使用lateral语法,需依赖的Relidsrelsdirectlylaterallyreferenced*/Relidslateral_relids;/*minimumparameterizationofrel*//*informationaboutabaserel(notsetforjoinrels!)*///reloptkind=RELOPT_BASEREL时使用的数据结构Indexrelid;/*RelationID*/Oidreltablespace;/*表空间containingtablespace*/RTEKindrtekind;/*基表?子查询?还是函数等等?RELATION,SUBQUERY,FUNCTION,etc*/AttrNumbermin_attr;/*最小的属性编号smallestattrnoofrel(often<0)*/AttrNumbermax_attr;/*最大的属性编号largestattrnoofrel*/Relids*attr_needed;/*数组arrayindexed[min_attr..max_attr]*/int32*attr_widths;/*属性宽度arrayindexed[min_attr..max_attr]*/List*lateral_vars;/*关系依赖的Vars/PHVsLATERALVarsandPHVsreferencedbyrel*/Relidslateral_referencers;/*依赖该关系的Relidsrelsthatreferencemelaterally*/List*indexlist;/*该关系的IndexOptInfo链表listofIndexOptInfo*/List*statlist;/*统计信息链表listofStatisticExtInfo*/BlockNumberpages;/*块数sizeestimatesderivedfrompg_class*/doubletuples;/*元组数*/doubleallvisfrac;/*?*/PlannerInfo*subroot;/*如为子查询,存储子查询的rootifsubquery*/List*subplan_params;/*如为子查询,存储子查询的参数ifsubquery*/intrel_parallel_workers;/*并行执行,需要多少个workers?wantednumberofparallelworkers*//*Informationaboutforeigntablesandforeignjoins*///FWD相关信息Oidserverid;/*identifiesserverforthetableorjoin*/Oiduserid;/*identifiesusertocheckaccessas*/booluseridiscurrent;/*joinisonlyvalidforcurrentuser*//*use"structFdwRoutine"toavoidincludingfdwapi.hhere*/structFdwRoutine*fdwroutine;void*fdw_private;/*cachespaceforrememberingifwehaveproventhisrelationunique*///已知的,可保证唯一的Relids链表List*unique_for_rels;/*knownuniquefortheseotherrelid*set(s)*/List*non_unique_for_rels;/*已知的,不唯一的Relids链表knownnotuniquefortheseset(s)*//*usedbyvariousscansandjoins:*/List*baserestrictinfo;/*如为基本关系,存储约束条件RestrictInfostructures(ifbaserel)*/QualCostbaserestrictcost;/*解析约束表达式的成本?costofevaluatingtheabove*/Indexbaserestrict_min_security;/*最低安全等级minsecurity_levelfoundin*baserestrictinfo*/List*joininfo;/*连接语句的约束条件信息RestrictInfostructuresforjoinclauses*involvingthisrel*/boolhas_eclass_joins;/*是否存在等价类连接?Tmeansjoininfoisincomplete*//*usedbypartitionwisejoins:*/boolconsider_partitionwise_join;/*分区?considerpartitionwise*joinpaths?(if*partitionedrel)*/Relidstop_parent_relids;/*Relidsoftopmostparents(if"other"*rel)*//*usedforpartitionedrelations*///分区表使用PartitionSchemepart_scheme;/*分区的schemaPartitioningscheme.*/intnparts;/*分区数numberofpartitions*/structPartitionBoundInfoData*boundinfo;/*分区边界信息Partitionbounds*/List*partition_qual;/*分区约束partitionconstraint*/structRelOptInfo**part_rels;/*分区的RelOptInfo数组ArrayofRelOptInfosofpartitions,*storedinthesameorderofbounds*/List**partexprs;/*非空分区键表达式Non-nullablepartitionkeyexpressions.*/List**nullable_partexprs;/*可为空的分区键表达式Nullablepartitionkeyexpressions.*/List*partitioned_child_rels;/*RTIndexes链表ListofRTindexes.*/}RelOptInfo;二、源码解读

join_search_one_level->...(如make_rels_by_clause_joins)->make_join_rel->populate_joinrel_with_paths函数为新生成的连接joinrel(给定参与连接的relations)构造访问路径.
输入参数中的sjinfo(SpecialJoinInfo结构体)提供了有关连接的详细信息,限制条件链表restrictlist(List)包含连接条件子句和适用于给定连接关系对的其他条件子句。

//--------------------------------------------------------------------populate_joinrel_with_paths/**populate_joinrel_with_paths*Addpathstothegivenjoinrelforgivenpairofjoiningrelations.The*SpecialJoinInfoprovidesdetailsaboutthejoinandtherestrictlist*containsthejoinclausesandtheotherclausesapplicableforgivenpair*ofthejoiningrelations.*为新生成的连接joinrel(给定参与连接的relations)构造访问路径.*SpecialJoinInfo提供了有关连接的详细信息,*限制条件链表包含连接条件子句和适用于给定连接关系对的其他条件子句。*/staticvoidpopulate_joinrel_with_paths(PlannerInfo*root,RelOptInfo*rel1,RelOptInfo*rel2,RelOptInfo*joinrel,SpecialJoinInfo*sjinfo,List*restrictlist){/**Considerpathsusingeachrelasbothouterandinner.Dependingon*thejointype,aprovablyemptyouterorinnerrelmightmeanthejoin*isprovablyemptytoo;inwhichcasethrowawayanypreviouslycomputed*pathsandmarkthejoinasdummy.(Wedoitthiswaysinceit's*conceivablethatdummy-nessofamulti-elementjoinmightonlybe*noticeableforcertainconstructionpaths.)*考虑使用每个rel分别作为外表和内表的路径。*根据连接类型的不同,一个可证明为空的外表或内表可能意味着连接结果也是为空;*在这种情况下,丢弃任何以前计算过的路径,并将连接标记为虚(dummy)连接。**Also,aprovablyconstant-falsejoinrestrictiontypicallymeansthat*wecanskipevaluatingoneorbothsidesofthejoin.Wedothisby*markingtheappropriaterelasdummy.Forouterjoins,a*constant-falserestrictionthatispusheddownstillmeansthewhole*joinisdummy,whileanon-pushed-downonemeansthatnoinnerrows*willjoinsowecantreattheinnerrelasdummy.*此外,可以证明的常量-false连接限制通常意味着我们可以跳过对连接的一个或两个方面的表达式解析。*我们通过将适当的rel标记为虚(dummy)来完成这一操作。*对于外连接,被下推的常量-false限制仍然意味着整个连接是虚连接,*而非下推的限制意味着没有内部行会连接,因此我们可以将内部rel视为虚拟的。**Weneedonlyconsiderthejointypesthatappearinjoin_info_list,plus*JOIN_INNER.*只需要考虑出现在join_info_list和JOIN_INNER中的jointype。*/switch(sjinfo->jointype){caseJOIN_INNER:if(is_dummy_rel(rel1)||is_dummy_rel(rel2)||restriction_is_constant_false(restrictlist,joinrel,false)){mark_dummy_rel(joinrel);//设置为虚连接break;}add_paths_to_joinrel(root,joinrel,rel1,rel2,JOIN_INNER,sjinfo,restrictlist);//添加路径,rel1为外表,rel2为内表add_paths_to_joinrel(root,joinrel,rel2,rel1,JOIN_INNER,sjinfo,restrictlist);//添加路径,rel2为外表,rel1为内表break;caseJOIN_LEFT://同上if(is_dummy_rel(rel1)||restriction_is_constant_false(restrictlist,joinrel,true)){mark_dummy_rel(joinrel);break;}if(restriction_is_constant_false(restrictlist,joinrel,false)&&bms_is_subset(rel2->relids,sjinfo->syn_righthand))mark_dummy_rel(rel2);add_paths_to_joinrel(root,joinrel,rel1,rel2,JOIN_LEFT,sjinfo,restrictlist);add_paths_to_joinrel(root,joinrel,rel2,rel1,JOIN_RIGHT,sjinfo,restrictlist);break;caseJOIN_FULL://同上if((is_dummy_rel(rel1)&&is_dummy_rel(rel2))||restriction_is_constant_false(restrictlist,joinrel,true)){mark_dummy_rel(joinrel);break;}add_paths_to_joinrel(root,joinrel,rel1,rel2,JOIN_FULL,sjinfo,restrictlist);add_paths_to_joinrel(root,joinrel,rel2,rel1,JOIN_FULL,sjinfo,restrictlist);/**Iftherearejoinqualsthataren'tmergeableorhashable,we*maynotbeabletobuildanyvalidplan.Complainheresothat*wecangiveasomewhat-usefulerrormessage.(Sincewehaveno*flexibilityofplanningforafulljoin,there'snochanceof*succeedinglaterwithanotherpairofinputrels.)*如果有无法合并或不能合并的joinquals,我们可能无法建立任何有效的计划。*在这里报错,这样就可以给出一些有用的错误消息。*(由于无法灵活地为全连接添加计划,所以以后再加入另一对rels就没有成功的机会了。)*/if(joinrel->pathlist==NIL)ereport(ERROR,(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),errmsg("FULLJOINisonlysupportedwithmerge-joinableorhash-joinablejoinconditions")));break;caseJOIN_SEMI://半连接/**Wemighthaveanormalsemijoin,oracasewherewedon'thave*enoughrelstodothesemijoinbutcanunique-ifytheRHSand*thendoaninnerjoin(seecommentsinjoin_is_legal).Inthe*lattercasewecan'tapplyJOIN_SEMIjoining.*可能有一个普通的半连接,或者我们没有足够的rels来做半连接,*但是可以通过唯一化RHS,然后做一个innerjoin(请参阅join_is_legal中的注释)。*在后一种情况下,我们不能应用JOIN_SEMIjoin。*/if(bms_is_subset(sjinfo->min_lefthand,rel1->relids)&&bms_is_subset(sjinfo->min_righthand,rel2->relids)){if(is_dummy_rel(rel1)||is_dummy_rel(rel2)||restriction_is_constant_false(restrictlist,joinrel,false)){mark_dummy_rel(joinrel);break;}add_paths_to_joinrel(root,joinrel,rel1,rel2,JOIN_SEMI,sjinfo,restrictlist);}/**Ifweknowhowtounique-ifytheRHSandoneinputrelis*exactlytheRHS(notasuperset)wecanconsiderunique-ifying*itandthendoingaregularjoin.(Thecreate_unique_path*checkhereisprobablyredundantwithwhatjoin_is_legaldid,*butifsothecheckischeapbecauseit'scached.Sotest*anywaytobesure.)*如果我们知道如何唯一化RHS,一个输入rel恰好是RHS(不是超集),*我们可以考虑唯一化它,然后进行常规连接。*(这里的create_unique_path检查与join_is_legal的检查可能是冗余的,*但是如果是的话,这样的检查的成本就显得很低,因为它可以缓存。所以不管怎样,还是要检查一下。*/if(bms_equal(sjinfo->syn_righthand,rel2->relids)&&create_unique_path(root,rel2,rel2->cheapest_total_path,sjinfo)!=NULL){if(is_dummy_rel(rel1)||is_dummy_rel(rel2)||restriction_is_constant_false(restrictlist,joinrel,false)){mark_dummy_rel(joinrel);break;}add_paths_to_joinrel(root,joinrel,rel1,rel2,JOIN_UNIQUE_INNER,sjinfo,restrictlist);add_paths_to_joinrel(root,joinrel,rel2,rel1,JOIN_UNIQUE_OUTER,sjinfo,restrictlist);}break;caseJOIN_ANTI://反连接if(is_dummy_rel(rel1)||restriction_is_constant_false(restrictlist,joinrel,true)){mark_dummy_rel(joinrel);break;}if(restriction_is_constant_false(restrictlist,joinrel,false)&&bms_is_subset(rel2->relids,sjinfo->syn_righthand))mark_dummy_rel(rel2);add_paths_to_joinrel(root,joinrel,rel1,rel2,JOIN_ANTI,sjinfo,restrictlist);break;default://非法的连接类型/*othervaluesnotexpectedhere*/elog(ERROR,"unrecognizedjointype:%d",(int)sjinfo->jointype);break;}/*尝试partitionwise技术.Applypartitionwisejointechnique,ifpossible.*/try_partitionwise_join(root,rel1,rel2,joinrel,sjinfo,restrictlist);}//-------------------------------------------------------------------add_paths_to_joinrel/**add_paths_to_joinrel*Givenajoinrelationandtwocomponentrelsfromwhichitcanbemade,*considerallpossiblepathsthatusethetwocomponentrelsasouter*andinnerrelrespectively.Addthesepathstothejoinrel'spathlist*iftheysurvivecomparisonwithotherpaths(andremoveanyexisting*pathsthataredominatedbythesepaths).*给出组成连接的两个组合rels,尝试所有可能的路径进行连接,比如分别设置为outer和inner表等.*如果连接的路径在与其他路径的比较中可以留存下来,*则将这些路径添加到连接rel的路径列表中(并删除现有的由这些路径控制的其他路径)。**Modifiesthepathlistfieldofthejoinrelnodetocontainthebest*pathsfoundsofar.*更新joinrel->pathlist链表已容纳最优的访问路径.**jointypeisnotnecessarilythesameassjinfo->jointype;itmightbe*"flippedaround"ifweareconsideringjoiningtherelsintheopposite*directionfromwhat'sindicatedinsjinfo.*jointype不需要与sjinfo->jointype一致,如果我们考虑加入与sjinfo所示相反方向的rels,它可能是“翻转”的。**Also,thisroutineandothersinthismoduleacceptthespecialJoinTypes*JOIN_UNIQUE_OUTERandJOIN_UNIQUE_INNERtoindicatethatweshould*unique-ifytheouterorinnerrelationandthenapplyaregularinner*join.Thesevaluesarenotallowedtopropagateoutsidethismodule,*however.Pathcostestimationcodemayneedtorecognizethatit's*dealingwithsuchacase---thecombinationofnominaljointypeINNER*withsjinfo->jointype==JOIN_SEMIindicatesthat.*此外,这个处理过程和这个模块中的其他过程接受特殊的JoinTypes(JOIN_UNIQUE_OUTER和JOIN_UNIQUE_INNER),*以表明应该对外部或内部关系进行唯一化,然后应用一个常规的内部连接。*但是,这些值不允许传播到这个模块之外。*访问路径成本估算过程可能需要认识到,*它正在处理这样的情况———名义上的INNERjointype与sjinfo->jointype==JOIN_SEMI的组合。*/voidadd_paths_to_joinrel(PlannerInfo*root,RelOptInfo*joinrel,RelOptInfo*outerrel,RelOptInfo*innerrel,JoinTypejointype,SpecialJoinInfo*sjinfo,List*restrictlist){JoinPathExtraDataextra;boolmergejoin_allowed=true;ListCell*lc;Relidsjoinrelids;/**PlannerInfodoesn'tcontaintheSpecialJoinInfoscreatedforjoins*betweenchildrelations,evenifthereisaSpecialJoinInfonodefor*thejoinbetweenthetopmostparents.So,whilecalculatingRelidsset*representingtherestriction,considerrelidsoftopmostparentof*partitions.*PlannerInfo不包含为子关系之间的连接创建的SpecialJoinInfo,*即使最顶层的父关系之间有一个SpecialJoinInfo节点。*因此,在计算表示限制条件的Relids集合时,需考虑分区的最顶层父类的Relids。*/if(joinrel->reloptkind==RELOPT_OTHER_JOINREL)joinrelids=joinrel->top_parent_relids;elsejoinrelids=joinrel->relids;extra.restrictlist=restrictlist;extra.mergeclause_list=NIL;extra.sjinfo=sjinfo;extra.param_source_rels=NULL;/**Seeiftheinnerrelationisprovablyuniqueforthisouterrel.*判断内表是否已被验证为唯一.**Wehavesomespecialcases:forJOIN_SEMIandJOIN_ANTI,itdoesn't*mattersincetheexecutorcanmaketheequivalentoptimizationanyway;*weneednotexpendplannercyclesonproofs.ForJOIN_UNIQUE_INNER,we*mustbeconsideringasemijoinwhoseinnersideisnotprovablyunique*(elsereduce_unique_semijoinswould'vesimplifiedit),sothere'sno*pointincallinginnerrel_is_unique.However,iftheLHScoversallof*thesemijoin'smin_lefthand,thenit'sappropriatetosetinner_unique*becausethepathproducedbycreate_unique_pathwillbeuniquerelative*totheLHS.(IfwehaveanLHSthat'sonlypartofthemin_lefthand,*thatis*not*true.)ForJOIN_UNIQUE_OUTER,passJOIN_INNERtoavoid*lettingthatvalueescapethismodule.*存在一些特殊的情况:*1.对于JOIN_SEMI和JOIN_ANTI,这无关紧要,因为执行器无论如何都可以进行等价的优化;*这些不需要在证明上花费时间证明。*2.对于JOIN_UNIQUE_INNER,必须考虑一个内部不是唯一的半连接(否则reduce_unique_semijoin会简化它),*所以调用innerrel_is_unique没有任何意义。*但是,如果LHS覆盖了半连接的所有min_left,那么就应该设置inner_unique,*因为create_unique_path生成的路径相对于LHS是唯一的。*(如果LHS只是min_left的一部分,那就不是真的)*对于JOIN_UNIQUE_OUTER,传递JOIN_INNER以避免让该值转义这个模块。*/switch(jointype){caseJOIN_SEMI:caseJOIN_ANTI:extra.inner_unique=false;/*well,unproven*/break;caseJOIN_UNIQUE_INNER:extra.inner_unique=bms_is_subset(sjinfo->min_lefthand,outerrel->relids);break;caseJOIN_UNIQUE_OUTER:extra.inner_unique=innerrel_is_unique(root,joinrel->relids,outerrel->relids,innerrel,JOIN_INNER,restrictlist,false);break;default:extra.inner_unique=innerrel_is_unique(root,joinrel->relids,outerrel->relids,innerrel,jointype,restrictlist,false);break;}/**Findpotentialmergejoinclauses.Wecanskipthisifwearenot*interestedindoingamergejoin.However,mergejoinmaybeouronly*wayofimplementingafullouterjoin,sooverrideenable_mergejoinif*it'safulljoin.*寻找潜在的mergejoin条件。如果不允许MergeJoin,则跳过。*然而,mergejoin可能是实现完整外部连接的唯一方法,*因此,如果它是完全连接,则不理会enable_mergejoin参数。*/if(enable_mergejoin||jointype==JOIN_FULL)extra.mergeclause_list=select_mergejoin_clauses(root,joinrel,outerrel,innerrel,restrictlist,jointype,&mergejoin_allowed);/**Ifit'sSEMI,ANTI,orinner_uniquejoin,computecorrectionfactors*forcostestimation.Thesewillbethesameforallpaths.*如果是半连接、反连接或inner_unique连接,则计算成本估算的相关因子,该值对所有路径都是一样的。*/if(jointype==JOIN_SEMI||jointype==JOIN_ANTI||extra.inner_unique)compute_semi_anti_join_factors(root,joinrel,outerrel,innerrel,jointype,sjinfo,restrictlist,&extra.semifactors);/**Decidewhetherit'ssensibletogenerateparameterizedpathsforthis*joinrel,andifso,whichrelationssuchpathsshouldrequire.There*isusuallynoneedtocreateaparameterizedresultpathunlessthere*isajoinorderrestrictionthatpreventsjoiningoneofourinputrels*directlytotheparametersourcerelinsteadofjoiningtotheother*inputrel.(Butseeallow_star_schema_join().)Thisrestriction*reducesthenumberofparameterizedpathswehavetodealwithat*higherjoinlevels,withoutcompromisingthequalityoftheresulting*plan.WeexpresstherestrictionasaRelidssetthatmustoverlapthe*parameterizationofanyproposedjoinpath.*确定为这个连接生成参数化路径是否合理,如果是,这些路径应该需要哪些关系。*通常不需要创建一个参数化的结果路径,除非存在一个连接顺序限制,*阻止将一个关系直接连接到参数源rel,而不是连接到另一个inputrel(但请参阅allow_star_schema_join())。*这种限制减少了在更高的连接级别上必须处理的参数化路径的数量,而不会影响最终计划的质量。*把这个限制表示为一个Relids集合,它必须与任何建议的连接路径的参数化重叠。*/foreach(lc,root->join_info_list){SpecialJoinInfo*sjinfo2=(SpecialJoinInfo*)lfirst(lc);/**SJisrelevanttothisjoinifwehavesomepartofitsRHS*(possiblynotallofit),andhaven'tyetjoinedtoitsLHS.(This*testisprettysimplistic,butshouldbesufficientconsideringthe*joinhasalreadybeenprovenlegal.)IftheSJisrelevant,it*presentsconstraintsforjoiningtoanythingnotinitsRHS.*与这个连接相关的SJ,如果有它的部分RHS(可能不是全部),并且还没有加入它的LHS。*(这个验证非常简单,但是考虑到连接已经被证明是合法的,这个验证就足够了。)*如果SJ是相关的,那么它就为连接到其RHS之外的任何内容提供了约束。*/if(bms_overlap(joinrelids,sjinfo2->min_righthand)&&!bms_overlap(joinrelids,sjinfo2->min_lefthand))extra.param_source_rels=bms_join(extra.param_source_rels,bms_difference(root->all_baserels,sjinfo2->min_righthand));/*全连接在语法上约束左右两边.fulljoinsconstrainbothsidessymmetrically*/if(sjinfo2->jointype==JOIN_FULL&&bms_overlap(joinrelids,sjinfo2->min_lefthand)&&!bms_overlap(joinrelids,sjinfo2->min_righthand))extra.param_source_rels=bms_join(extra.param_source_rels,bms_difference(root->all_baserels,sjinfo2->min_lefthand));}/**However,whenaLATERALsubqueryisinvolved,therewillsimplynotbe*anypathsforthejoinrelthataren'tparameterizedbywhateverthe*subqueryisparameterizedby,unlessitsparameterizationisresolved*withinthejoinrel.Sowemightaswellallowadditionaldependencies*onwhateverresiduallateraldependenciesthejoinrelwillhave.*然而,当涉及到一个LATERAL子查询时,除非在joinrel中解析其参数化,*否则joinrel的任何路径都不会被子查询参数化。*因此,也可以允许对joinrel将拥有的任何剩余的LATERAL依赖进行额外依赖。*/extra.param_source_rels=bms_add_members(extra.param_source_rels,joinrel->lateral_relids);/**1.Considermergejoinpathswherebothrelationsmustbeexplicitly*sorted.Skipthisifwecan'tmergejoin.*1.尝试mergejoin访问路径,其中两个关系必须执行显式的排序。*如果禁用mergejoin,则跳过。*/if(mergejoin_allowed)sort_inner_and_outer(root,joinrel,outerrel,innerrel,jointype,&extra);/**2.Considerpathswheretheouterrelationneednotbeexplicitly*sorted.Thisincludesbothnestloopsandmergejoinswheretheouter*pathisalreadyordered.Again,skipthisifwecan'tmergejoin.*(That'sokaybecauseweknowthatnestloopcan'thandleright/full*joinsatall,soitwouldn'tworkintheprohibitedcaseseither.)*2.考虑外部关系不需要显式排序的路径。*这包括nestloop和mergejoin,它们的外部路径已经排序。*再一次的,如果禁用mergejoin,则跳过。*(nestloop无法处理正确/完全连接,所以在禁止的情况下它也无法工作)*/if(mergejoin_allowed)match_unsorted_outer(root,joinrel,outerrel,innerrel,jointype,&extra);#ifdefNOT_USED/**3.Considerpathswheretheinnerrelationneednotbeexplicitly*sorted.Thisincludesmergejoinsonly(nestloopswerealreadybuiltin*match_unsorted_outer).*3.尝试内部关系不需要显式排序的路径。这只包括mergejoin(在match_unsorted_outer中已经构建了nestloop)。*(已废弃)**Dikedoutasredundant2/13/2000--tgl.Thereisn'tanyreally*significantdifferencebetweentheinnerandoutersideofamergejoin,*somatch_unsorted_innercreatesnopathsthataren'tequivalentto*thosemadebymatch_unsorted_outerwhenadd_paths_to_joinrel()is*invokedwiththetworelsgivenintheotherorder.*/if(mergejoin_allowed)match_unsorted_inner(root,joinrel,outerrel,innerrel,jointype,&extra);#endif/**4.Considerpathswherebothouterandinnerrelationsmustbehashed*beforebeingjoined.Asabove,disregardenable_hashjoinforfull*joins,becausetheremaybenootheralternative.*4.考虑在连接之前必须对外部/内部关系进行散列处理的路径。*如上所述,对于完全连接,忽略enable_hashjoin,因为可能没有其他选择。*/if(enable_hashjoin||jointype==JOIN_FULL)hash_inner_and_outer(root,joinrel,outerrel,innerrel,jointype,&extra);/**5.Ifinnerandouterrelationsareforeigntables(orjoins)belonging*tothesameserverandassignedtothesameusertocheckaccess*permissionsas,givetheFDWachancetopushdownjoins.*如果内部和外部关系是属于同一服务器的外部表(或连接),*并分配给同一用户以检查访问权限,则FDW有机会下推连接。*/if(joinrel->fdwroutine&&joinrel->fdwroutine->GetForeignJoinPaths)joinrel->fdwroutine->GetForeignJoinPaths(root,joinrel,outerrel,innerrel,jointype,&extra);/**6.Finally,giveextensionsachancetomanipulatethepathlist.*6.最后,调用扩展钩子函数.*/if(set_join_pathlist_hook)set_join_pathlist_hook(root,joinrel,outerrel,innerrel,jointype,&extra);}三、跟踪分析

SQL语句如下:

testdb=#explainverboseselectdw.*,grjf.grbh,grjf.xm,grjf.ny,grjf.jefromt_dwxxdw,lateral(selectgr.grbh,gr.xm,jf.ny,jf.jefromt_grxxgrinnerjoint_jfxxjfongr.dwbh=dw.dwbhandgr.grbh=jf.grbh)grjforderbydw.dwbh;QUERYPLAN-------------------------------------------------------------------------------------------------------MergeJoin(cost=18841.64..21009.94rows=99850width=47)Output:dw.dwmc,dw.dwbh,dw.dwdz,gr.grbh,gr.xm,jf.ny,jf.jeMergeCond:((dw.dwbh)::text=(gr.dwbh)::text)->IndexScanusingt_dwxx_pkeyonpublic.t_dwxxdw(cost=0.29..399.62rows=10000width=20)Output:dw.dwmc,dw.dwbh,dw.dwdz->Materialize(cost=18836.82..19336.82rows=100000width=31)Output:gr.grbh,gr.xm,gr.dwbh,jf.ny,jf.je->Sort(cost=18836.82..19086.82rows=100000width=31)Output:gr.grbh,gr.xm,gr.dwbh,jf.ny,jf.jeSortKey:gr.dwbh->HashJoin(cost=3465.00..8138.00rows=100000width=31)Output:gr.grbh,gr.xm,gr.dwbh,jf.ny,jf.jeHashCond:((jf.grbh)::text=(gr.grbh)::text)->SeqScanonpublic.t_jfxxjf(cost=0.00..1637.00rows=100000width=20)Output:jf.ny,jf.je,jf.grbh->Hash(cost=1726.00..1726.00rows=100000width=16)Output:gr.grbh,gr.xm,gr.dwbh->SeqScanonpublic.t_grxxgr(cost=0.00..1726.00rows=100000width=16)Output:gr.grbh,gr.xm,gr.dwbh(19rows)

参与连接的有3张基表,分别是t_dwxx/t_grxx/t_jfxx,从执行计划可见,由于存在order by dwbh排序子句,优化器"聪明"的选择Merge Join.

启动gdb,设置断点,只考察level=3的情况(最终结果)

(gdb)bjoin_search_one_levelBreakpoint1at0x755667:filejoinrels.c,line67.(gdb)cContinuing.Breakpoint1,join_search_one_level(root=0x1cae678,level=2)atjoinrels.c:6767List**joinrels=root->join_rel_level;(gdb)cContinuing.Breakpoint1,join_search_one_level(root=0x1cae678,level=3)atjoinrels.c:6767List**joinrels=root->join_rel_level;(gdb)

跟踪populate_joinrel_with_paths

(gdb)bpopulate_joinrel_with_pathsBreakpoint2at0x75646d:filejoinrels.c,line780.

进入populate_joinrel_with_paths函数

(gdb)cContinuing.Breakpoint2,populate_joinrel_with_paths(root=0x1cae678,rel1=0x1d10978,rel2=0x1d09610,joinrel=0x1d131b8,sjinfo=0x7ffef59baf20,restrictlist=0x1d135e8)atjoinrels.c:780780switch(sjinfo->jointype)

查看输入参数
1.root:simple_rte_array数组,其中simple_rel_array_size = 6,存在6个Item,1->16734/t_dwxx,3->16742/t_grxx,4->16747/t_jfxx
2.rel1:1号和3号连接生成的Relation,即t_dwxx和t_grxx连接
3.rel2:4号RTE,即t_jfxx
4.joinrel:rel1和rel2通过build_join_rel函数生成的连接Relation
5.sjinfo:连接信息,连接类型为内连接JOIN_INNER
6.restrictlist:约束条件链表,t_grxx.grbh=t_jfxx.grbh

(gdb)p*root$3={type=T_PlannerInfo,parse=0x1cd7830,glob=0x1cb8d38,query_level=1,parent_root=0x0,plan_params=0x0,outer_params=0x0,simple_rel_array=0x1d07af8,simple_rel_array_size=6,simple_rte_array=0x1d07b48,all_baserels=0x1d0ada8,nullable_baserels=0x0,join_rel_list=0x1d10e48,join_rel_hash=0x0,join_rel_level=0x1d10930,join_cur_level=3,init_plans=0x0,cte_plan_ids=0x0,multiexpr_params=0x0,eq_classes=0x1d0a6d8,canon_pathkeys=0x1d0ad28,left_join_clauses=0x0,right_join_clauses=0x0,full_join_clauses=0x0,join_info_list=0x0,append_rel_list=0x0,rowMarks=0x0,placeholder_list=0x0,fkey_list=0x0,query_pathkeys=0x1d0ad78,group_pathkeys=0x0,window_pathkeys=0x0,distinct_pathkeys=0x0,sort_pathkeys=0x1d0ad78,part_schemes=0x0,initial_rels=0x1d108c0,upper_rels={0x0,0x0,0x0,0x0,0x0,0x0,0x0},upper_targets={0x0,0x0,0x0,0x0,0x0,0x0,0x0},processed_tlist=0x1cbb608,grouping_map=0x0,minmax_aggs=0x0,planner_cxt=0x1bfa040,total_table_pages=1427,tuple_fraction=0,limit_tuples=-1,qual_security_level=0,inhTargetKind=INHKIND_NONE,hasJoinRTEs=true,hasLateralRTEs=false,hasDeletedRTEs=false,hasHavingQual=false,hasPseudoConstantQuals=false,hasRecursion=false,wt_param_id=-1,non_recursive_path=0x0,curOuterRels=0x0,curOuterParams=0x0,join_search_private=0x0,partColsUpdated=false}(gdb)p*root->simple_rte_array[1]$4={type=T_RangeTblEntry,rtekind=RTE_RELATION,relid=16734,relkind=114'r',tablesample=0x0,subquery=0x0,......(gdb)p*rel1->relids$10={nwords=1,words=0x1d10b8c}(gdb)p*rel1->relids->words$11=10(gdb)p*rel2->relids->words$13=16(gdb)p*joinrel->relids->words$15=26(gdb)p*sjinfo$16={type=T_SpecialJoinInfo,min_lefthand=0x1d10b88,min_righthand=0x1d09518,syn_lefthand=0x1d10b88,syn_righthand=0x1d09518,jointype=JOIN_INNER,lhs_strict=false,delay_upper_joins=false,semi_can_btree=false,semi_can_hash=false,semi_operators=0x0,semi_rhs_exprs=0x0}...(gdb)p*(Var*)((RelabelType*)$args->head->data.ptr_value)->arg$34={xpr={type=T_Var},varno=3,varattno=2,vartype=1043,vartypmod=14,varcollid=100,varlevelsup=0,varnoold=3,varoattno=2,location=273}-->t_grxx.grbh(gdb)p*(Var*)((RelabelType*)$args->head->next->data.ptr_value)->arg$35={xpr={type=T_Var},varno=4,varattno=1,vartype=1043,vartypmod=14,varcollid=100,varlevelsup=0,varnoold=4,varoattno=1,location=283}-->t_jfxx.grbh

进入JOIN_INNER分支,调用函数add_paths_to_joinrel

(gdb)789add_paths_to_joinrel(root,joinrel,rel1,rel2,

进入add_paths_to_joinrel函数

(gdb)stepadd_paths_to_joinrel(root=0x1cae678,joinrel=0x1d131b8,outerrel=0x1d10978,innerrel=0x1d09610,jointype=JOIN_INNER,sjinfo=0x7ffef59baf20,restrictlist=0x1d135e8)atjoinpath.c:126126boolmergejoin_allowed=true;

判断内表是否已被验证为唯一

162switch(jointype)(gdb)182extra.inner_unique=innerrel_is_unique(root,(gdb)189break;(gdb)pextra.inner_unique$36=false

寻找潜在的mergejoin条件。如果不允许Merge Join,则跳过
merge join的条件是t_grxx.grbh=t_jfxx.grbh

(gdb)n198if(enable_mergejoin||jointype==JOIN_FULL)(gdb)199extra.mergeclause_list=select_mergejoin_clauses(root,(gdb)211if(jointype==JOIN_SEMI||jointype==JOIN_ANTI||extra.inner_unique)(gdb)p*(Var*)((RelabelType*)$args->head->data.ptr_value)->arg$47={xpr={type=T_Var},varno=3,varattno=2,vartype=1043,vartypmod=14,varcollid=100,varlevelsup=0,varnoold=3,varoattno=2,location=273}-->t_grxx.grbh(gdb)p*(Var*)((RelabelType*)$args->head->next->data.ptr_value)->arg$48={xpr={type=T_Var},varno=4,varattno=1,vartype=1043,vartypmod=14,varcollid=100,varlevelsup=0,varnoold=4,varoattno=1,location=283}-->t_jfxx.grbh

确定为这个连接生成参数化路径是否合理,如果是,这些路径应该需要哪些关系(结果为:NULL)

(gdb)261extra.param_source_rels=bms_add_members(extra.param_source_rels,(gdb)268if(mergejoin_allowed)(gdb)p*extra.param_source_relsCannotaccessmemoryataddress0x0

尝试merge join访问路径,其中两个关系必须执行显式的排序.
注:joinrel->pathlist在执行前为NULL,执行后生成了访问路径.

(gdb)p*joinrel->pathlistCannotaccessmemoryataddress0x0(gdb)n269sort_inner_and_outer(root,joinrel,outerrel,innerrel,(gdb)279if(mergejoin_allowed)(gdb)p*joinrel->pathlist$50={type=T_List,length=1,head=0x1d13850,tail=0x1d13850}

其他实现逻辑类似,sort_inner_and_outer等函数的实现逻辑,后续再行详细解读.
最终结果是生成了2条访问路径,存储在pathlist链表中.

324if(set_join_pathlist_hook)(gdb)327}(gdb)p*joinrel->pathlist$51={type=T_List,length=2,head=0x1d13850,tail=0x1d13930}(gdb)p*(Node*)joinrel->pathlist->head->data.ptr_value$52={type=T_HashPath}(gdb)p*(HashPath*)joinrel->pathlist->head->data.ptr_value$53={jpath={path={type=T_HashPath,pathtype=T_HashJoin,parent=0x1d131b8,pathtarget=0x1d133c8,param_info=0x0,parallel_aware=false,parallel_safe=true,parallel_workers=0,rows=99850,startup_cost=3762,total_cost=10075.348750000001,pathkeys=0x0},jointype=JOIN_INNER,inner_unique=false,outerjoinpath=0x1d11f48,innerjoinpath=0x1d0f548,joinrestrictinfo=0x1d135e8},path_hashclauses=0x1d13aa0,num_batches=2,inner_rows_total=100000}(gdb)p*(Node*)joinrel->pathlist->head->next->data.ptr_value$54={type=T_NestPath}(gdb)p*(NestPath*)joinrel->pathlist->head->next->data.ptr_value$55={path={type=T_NestPath,pathtype=T_NestLoop,parent=0x1d131b8,pathtarget=0x1d133c8,param_info=0x0,parallel_aware=false,parallel_safe=true,parallel_workers=0,rows=99850,startup_cost=39.801122856046675,total_cost=41318.966172885761,pathkeys=0x1d0b818},jointype=JOIN_INNER,inner_unique=false,outerjoinpath=0x1d119d8,innerjoinpath=0x1d0f9d8,joinrestrictinfo=0x0}

到此,相信大家对“PostgreSQL中哪个函数为连接新生成的joinrel构造访问路径”有了更深的了解,不妨来实际操作一番吧!这里是亿速云网站,更多相关内容可以进入相关频道进行查询,关注我们,继续学习!