这篇文章主要讲解了“PostgreSQL的set_base_rel_pathlists函数及其子函数分析”,文中的讲解内容简单清晰,易于学习与理解,下面请大家跟着小编的思路慢慢深入,一起来研究和学习“PostgreSQL的set_base_rel_pathlists函数及其子函数分析”吧!

set_base_rel_pathlists函数的目的是为每一个base rel找出所有可用的访问路径(包括顺序扫描和所有可用的索引),每一个可用的路径都会添加到pathlist链表中。这一小节主要介绍常规(区别于并行)顺序扫描部分。

make_one_rel源代码:

RelOptInfo*make_one_rel(PlannerInfo*root,List*joinlist){//.../**Computesizeestimatesandconsider_parallelflagsforeachbaserel,*thengenerateaccesspaths.*/set_base_rel_sizes(root);//估算Relation的Size并且设置consider_parallel标记set_base_rel_pathlists(root);//生成Relation的扫描(访问)路径/**Generateaccesspathsfortheentirejointree.*通过动态规划或遗传算法生成连接路径*/rel=make_rel_from_joinlist(root,joinlist);/**Theresultshouldjoinallandonlythequery'sbaserels.*/Assert(bms_equal(rel->relids,root->all_baserels));//返回最上层的RelOptInforeturnrel;}一、数据结构

RelOptInfo

typedefstructRelOptInfo{NodeTagtype;//节点标识RelOptKindreloptkind;//RelOpt类型/*allrelationsincludedinthisRelOptInfo*/Relidsrelids;/*Relids(rtindex)集合setofbaserelids(rangetableindexes)*//*sizeestimatesgeneratedbyplanner*/doublerows;/*结果元组的估算数量estimatednumberofresulttuples*//*per-relationplannercontrolflags*/boolconsider_startup;/*是否考虑启动成本?是,需要保留启动成本低的路径keepcheap-startup-costpaths?*/boolconsider_param_startup;/*是否考虑参数化?的路径ditto,forparameterizedpaths?*/boolconsider_parallel;/*是否考虑并行处理路径considerparallelpaths?*//*defaultresulttargetlistforPathsscanningthisrelation*/structPathTarget*reltarget;/*扫描该Relation时默认的结果listofVars/Exprs,cost,width*//*materializationinformation*/List*pathlist;/*访问路径链表Pathstructures*/List*ppilist;/*路径链表中使用参数化路径进行ParamPathInfosusedinpathlist*/List*partial_pathlist;/*partialPaths*/structPath*cheapest_startup_path;//代价最低的启动路径structPath*cheapest_total_path;//代价最低的整体路径structPath*cheapest_unique_path;//代价最低的获取唯一值的路径List*cheapest_parameterized_paths;//代价最低的参数化路径链表/*parameterizationinformationneededforbothbaserelsandjoinrels*//*(seealsolateral_varsandlateral_referencers)*/Relidsdirect_lateral_relids;/*使用lateral语法,需依赖的Relidsrelsdirectlylaterallyreferenced*/Relidslateral_relids;/*minimumparameterizationofrel*//*informationaboutabaserel(notsetforjoinrels!)*///reloptkind=RELOPT_BASEREL时使用的数据结构Indexrelid;/*RelationID*/Oidreltablespace;/*表空间containingtablespace*/RTEKindrtekind;/*基表?子查询?还是函数等等?RELATION,SUBQUERY,FUNCTION,etc*/AttrNumbermin_attr;/*最小的属性编号smallestattrnoofrel(often<0)*/AttrNumbermax_attr;/*最大的属性编号largestattrnoofrel*/Relids*attr_needed;/*数组arrayindexed[min_attr..max_attr]*/int32*attr_widths;/*属性宽度arrayindexed[min_attr..max_attr]*/List*lateral_vars;/*关系依赖的Vars/PHVsLATERALVarsandPHVsreferencedbyrel*/Relidslateral_referencers;/*依赖该关系的Relidsrelsthatreferencemelaterally*/List*indexlist;/*该关系的IndexOptInfo链表listofIndexOptInfo*/List*statlist;/*统计信息链表listofStatisticExtInfo*/BlockNumberpages;/*块数sizeestimatesderivedfrompg_class*/doubletuples;/*元组数*/doubleallvisfrac;/*?*/PlannerInfo*subroot;/*如为子查询,存储子查询的rootifsubquery*/List*subplan_params;/*如为子查询,存储子查询的参数ifsubquery*/intrel_parallel_workers;/*并行执行,需要多少个workers?wantednumberofparallelworkers*//*Informationaboutforeigntablesandforeignjoins*///FDW相关信息Oidserverid;/*identifiesserverforthetableorjoin*/Oiduserid;/*identifiesusertocheckaccessas*/booluseridiscurrent;/*joinisonlyvalidforcurrentuser*//*use"structFdwRoutine"toavoidincludingfdwapi.hhere*/structFdwRoutine*fdwroutine;void*fdw_private;/*cachespaceforrememberingifwehaveproventhisrelationunique*///已知的,可保证唯一元组返回的Relids链表List*unique_for_rels;/*knownuniquefortheseotherrelid*set(s)*/List*non_unique_for_rels;/*已知的,返回的数据不唯一的Relids链表knownnotuniquefortheseset(s)*//*usedbyvariousscansandjoins:*/List*baserestrictinfo;/*如为基本关系,则存储约束条件RestrictInfostructures(ifbaserel)*/QualCostbaserestrictcost;/*解析约束表达式的成本?costofevaluatingtheabove*/Indexbaserestrict_min_security;/*最低安全等级minsecurity_levelfoundin*baserestrictinfo*/List*joininfo;/*连接语句的约束条件信息RestrictInfostructuresforjoinclauses*involvingthisrel*/boolhas_eclass_joins;/*是否存在等价类连接?True意味着joininfo并不完整,,Tmeansjoininfoisincomplete*//*usedbypartitionwisejoins:*///是否尝试partitionwise连接,这是PG11的一个新特性.boolconsider_partitionwise_join;/*considerpartitionwise*joinpaths?(if*partitionedrel)*/Relidstop_parent_relids;/*Relidsoftopmostparents(if"other"*rel)*//*usedforpartitionedrelations*///分区表使用PartitionSchemepart_scheme;/*分区的schemaPartitioningscheme.*/intnparts;/*分区数numberofpartitions*/structPartitionBoundInfoData*boundinfo;/*分区边界信息Partitionbounds*/List*partition_qual;/*分区约束partitionconstraint*/structRelOptInfo**part_rels;/*分区的RelOptInfo数组ArrayofRelOptInfosofpartitions,*storedinthesameorderofbounds*/List**partexprs;/*非空分区键表达式Non-nullablepartitionkeyexpressions.*/List**nullable_partexprs;/*可为空的分区键表达式Nullablepartitionkeyexpressions.*/List*partitioned_child_rels;/*RTIndexes链表ListofRTindexes.*/}RelOptInfo;

ParamPathInfo

/**ParamPathInfo**Allparameterizedpathsforagivenrelationwithgivenrequiredouterrels*linktoasingleParamPathInfo,whichstorescommoninformationsuchas*theestimatedrowcountforthisparameterization.Wedothispartlyto*avoidrecalculations,butmostlytoensurethattheestimatedrowcount*isinfactthesameforeverysuchpath.**Note:ppi_clausesisonlyusedinParamPathInfosforbaserelationpaths;*injoincasesit'sNILbecausethesetofrelevantclausesvariesdepending*onhowthejoinisformed.Therelevantclauseswillappearineach*parameterizedjoinpath'sjoinrestrictinfolist,instead.*/typedefstructParamPathInfo{NodeTagtype;//节点类型Relidsppi_req_outer;/*访问路径需要使用的Relids参数,relssupplyingparametersusedbypath*/doubleppi_rows;/*估算的结果元组数,estimatednumberofresulttuples*/List*ppi_clauses;/*外部rels提供的可用连接条件链表,joinclausesavailablefromouterrels*/}ParamPathInfo;

Cost相关
注意:实际使用的参数值通过系统配置文件定义,而不是这里的常量定义!

/**Thecostestimateproducedbycost_qual_eval()includesbothaone-time*(startup)cost,andaper-tuplecost.*/typedefstructQualCost{Coststartup;/*启动成本,one-timecost*/Costper_tuple;/*每个元组的成本,per-evaluationcost*/}QualCost;typedefdoubleCost;/*executioncost(inpage-accessunits)*//*defaultsforcostsize.c'sCostparameters*//*NB:cost-estimationcodeshouldusethevariables,nottheseconstants!*//*注意:实际值通过系统配置文件定义,而不是这里的常量定义!*//*Ifyouchangethese,updatebackend/utils/misc/postgresql.sample.conf*/#defineDEFAULT_SEQ_PAGE_COST1.0//顺序扫描page的成本#defineDEFAULT_RANDOM_PAGE_COST4.0//随机扫描page的成本#defineDEFAULT_CPU_TUPLE_COST0.01//处理一个元组的CPU成本#defineDEFAULT_CPU_INDEX_TUPLE_COST0.005//处理一个索引元组的CPU成本#defineDEFAULT_CPU_OPERATOR_COST0.0025//执行一次操作或函数的CPU成本#defineDEFAULT_PARALLEL_TUPLE_COST0.1//并行执行,从一个worker传输一个元组到另一个worker的成本#defineDEFAULT_PARALLEL_SETUP_COST1000.0//构建并行执行环境的成本#defineDEFAULT_EFFECTIVE_CACHE_SIZE524288/*先前已有介绍,measuredinpages*/doubleseq_page_cost=DEFAULT_SEQ_PAGE_COST;doublerandom_page_cost=DEFAULT_RANDOM_PAGE_COST;doublecpu_tuple_cost=DEFAULT_CPU_TUPLE_COST;doublecpu_index_tuple_cost=DEFAULT_CPU_INDEX_TUPLE_COST;doublecpu_operator_cost=DEFAULT_CPU_OPERATOR_COST;doubleparallel_tuple_cost=DEFAULT_PARALLEL_TUPLE_COST;doubleparallel_setup_cost=DEFAULT_PARALLEL_SETUP_COST;inteffective_cache_size=DEFAULT_EFFECTIVE_CACHE_SIZE;Costdisable_cost=1.0e10;//1后面10个0,通过设置一个巨大的成本,让优化器自动放弃此路径intmax_parallel_workers_per_gather=2;//每次gather使用的worker数二、源码解读

set_base_rel_pathlists函数遍历RelOptInfo数组,为每一个Rel构造访问路径.

//--------------------------------------------------------/**set_base_rel_pathlists*Findsallpathsavailableforscanningeachbase-relationentry.*Sequentialscanandanyavailableindicesareconsidered.*Eachusefulpathisattachedtoitsrelation's'pathlist'field.**为每一个baserels找出所有可用的访问路径(包括顺序扫描和所有可用的索引)*每一个可用的路径都会添加到pathlist链表中**/staticvoidset_base_rel_pathlists(PlannerInfo*root){Indexrti;for(rti=1;rti<root->simple_rel_array_size;rti++)//遍历RelOptInfo数组{RelOptInfo*rel=root->simple_rel_array[rti];/*theremaybeemptyslotscorrespondingtonon-baserelRTEs*/if(rel==NULL)continue;Assert(rel->relid==rti);/*sanitycheckonarray*//*ignoreRTEsthatare"otherrels"*/if(rel->reloptkind!=RELOPT_BASEREL)continue;set_rel_pathlist(root,rel,rti,root->simple_rte_array[rti]);}}//--------------------------------------------------------set_rel_pathlist/**set_rel_pathlist*Buildaccesspathsforabaserelation*/staticvoidset_rel_pathlist(PlannerInfo*root,RelOptInfo*rel,Indexrti,RangeTblEntry*rte){if(IS_DUMMY_REL(rel)){/*Wealreadyprovedtherelationempty,sonothingmoretodo*/}elseif(rte->inh)//inherit{/*It'san"appendrelation",processaccordingly*/set_append_rel_pathlist(root,rel,rti,rte);}else//常规{switch(rel->rtekind){caseRTE_RELATION://数据表if(rte->relkind==RELKIND_FOREIGN_TABLE)//FDW{/*Foreigntable*/set_foreign_pathlist(root,rel,rte);}elseif(rte->tablesample!=NULL)//采样表{/*Sampledrelation*/set_tablesample_rel_pathlist(root,rel,rte);}else//常规数据表{/*Plainrelation*/set_plain_rel_pathlist(root,rel,rte);}break;caseRTE_SUBQUERY://子查询/*Subquery---已在set_rel_size处理,fullyhandledduringset_rel_size*//*函数:set_subquery_pathlist*/break;caseRTE_FUNCTION:/*RangeFunction*/set_function_pathlist(root,rel,rte);break;caseRTE_TABLEFUNC:/*TableFunction*/set_tablefunc_pathlist(root,rel,rte);break;caseRTE_VALUES:/*Valueslist*/set_values_pathlist(root,rel,rte);break;caseRTE_CTE:/*CTEreference---fullyhandledduringset_rel_size*/break;caseRTE_NAMEDTUPLESTORE:/*tuplestorereference---fullyhandledduringset_rel_size*/break;default:elog(ERROR,"unexpectedrtekind:%d",(int)rel->rtekind);break;}}/**Ifthisisabaserel,weshouldnormallyconsidergatheringanypartial*pathswemayhavecreatedforit.**However,ifthisisaninheritancechild,skipit.Otherwise,wecould*endupwithaverylargenumberofgathernodes,eachtryingtograb*itsownpoolofworkers.Instead,we'llconsidergatheringpartial*pathsfortheparentappendrel.**Also,ifthisisthetopmostscan/joinrel(thatis,theonlybaserel),*wepostponethisuntilthefinalscan/jointargelistisavailable(see*grouping_planner).*/if(rel->reloptkind==RELOPT_BASEREL&&bms_membership(root->all_baserels)!=BMS_SINGLETON)generate_gather_paths(root,rel,false);/**AllowaplugintoeditorializeonthesetofPathsforthisbase*relation.Itcouldaddnewpaths(suchasCustomPaths)bycalling*add_path(),ordeleteormodifypathsaddedbythecorecode.*/if(set_rel_pathlist_hook)//钩子函数(*set_rel_pathlist_hook)(root,rel,rti,rte);/*Nowfindthecheapestofthepathsforthisrel*/set_cheapest(rel);//找出代价最低的访问路径#ifdefOPTIMIZER_DEBUGdebug_print_rel(root,rel);#endif}//--------------------------------------------------------set_plain_rel_pathlist/**set_plain_rel_pathlist*Buildaccesspathsforaplainrelation(nosubquery,noinheritance)*/staticvoidset_plain_rel_pathlist(PlannerInfo*root,RelOptInfo*rel,RangeTblEntry*rte){Relidsrequired_outer;/**Wedon'tsupportpushingjoinclausesintothequalsofaseqscan,but*itcouldstillhaverequiredparameterizationduetoLATERALrefsin*itstlist.*/required_outer=rel->lateral_relids;//需依赖的上层Relids/*顺序扫描,Considersequentialscan*/add_path(rel,create_seqscan_path(root,rel,required_outer,0));/*如合适,尝试并行顺序扫描,Ifappropriate,considerparallelsequentialscan*/if(rel->consider_parallel&&required_outer==NULL)create_plain_partial_paths(root,rel);/*索引扫描,Considerindexscans*/create_index_paths(root,rel);/*TID扫描,ConsiderTIDscans*/create_tidscan_paths(root,rel);}//--------------------------------------------------------create_seqscan_path/**create_seqscan_path*Createsapathcorrespondingtoasequentialscan,returningthe*pathnode.*/Path*create_seqscan_path(PlannerInfo*root,RelOptInfo*rel,Relidsrequired_outer,intparallel_workers){Path*pathnode=makeNode(Path);//顺序扫描路径pathnode->pathtype=T_SeqScan;//顺序扫描pathnode->parent=rel;//RelOptInfopathnode->pathtarget=rel->reltarget;//投影列pathnode->param_info=get_baserel_parampathinfo(root,rel,required_outer);//获取参数化路径信息ParamPathInfopathnode->parallel_aware=parallel_workers>0?true:false;//并行pathnode->parallel_safe=rel->consider_parallel;//pathnode->parallel_workers=parallel_workers;//并行数pathnode->pathkeys=NIL;/*顺序扫描,不执行排序,seqscanhasunorderedresult*/cost_seqscan(pathnode,root,rel,pathnode->param_info);returnpathnode;}//--------------------------------------------get_baserel_parampathinfo/**get_baserel_parampathinfo*GettheParamPathInfoforaparameterizedpathforabaserelation,*constructingoneifwedon'thaveonealready.**获取baserel的参数化路径,存储在结构体ParamPathInfo中.如果没有,那么构造一个.**Thiscentralizesestimatingtherowcountsforparameterizedpaths.*Weneedtocachethosetobesureweusethesamerowcountforallpaths*ofthesameparameterizationforagivenrel.Thisisalsoaconvenient*placetodeterminewhichmovablejoinclausestheparameterizedpathwill*beresponsibleforevaluating.**统一/集中估计参数化路径的行数。通过缓存这些数据,对于相同的参数,可以快速给出相应的行数。*/ParamPathInfo*get_baserel_parampathinfo(PlannerInfo*root,RelOptInfo*baserel,Relidsrequired_outer){ParamPathInfo*ppi;//ppi变量Relidsjoinrelids;//参与连接的relidsList*pclauses;//条件链表doublerows;//行数ListCell*lc;//临时变量/*UnparameterizedpathshavenoParamPathInfo*/if(bms_is_empty(required_outer))returnNULL;Assert(!bms_overlap(baserel->relids,required_outer));/*IfwealreadyhaveaPPIforthisparameterization,justreturnit*/if((ppi=find_param_path_info(baserel,required_outer)))//已有缓存?returnppi;//直接返回/**Identifyalljoinclausesthataremovabletothisbaserelgiventhis*parameterization.*在给定参数条件下,识别所有可移动到此基础关系的连接子句。*/joinrelids=bms_union(baserel->relids,required_outer);//合并relidspclauses=NIL;foreach(lc,baserel->joininfo)//遍历连接条件{RestrictInfo*rinfo=(RestrictInfo*)lfirst(lc);//连接条件if(join_clause_is_movable_into(rinfo,baserel->relids,joinrelids))pclauses=lappend(pclauses,rinfo);//如可以移动,添加到链表中}/**AddinjoinclausesgeneratedbyEquivalenceClasses,too.(These*necessarilysatisfyjoin_clause_is_movable_into.)*通过等价类产生的连接条件一并合并到链表中*/pclauses=list_concat(pclauses,generate_join_implied_equalities(root,joinrelids,required_outer,baserel));/*Estimatethenumberofrowsreturnedbytheparameterizedscan*/rows=get_parameterized_baserel_size(root,baserel,pclauses);//获取估算行数/*AndnowwecanbuildtheParamPathInfo*/ppi=makeNode(ParamPathInfo);//构造PPI返回节点ppi->ppi_req_outer=required_outer;ppi->ppi_rows=rows;ppi->ppi_clauses=pclauses;baserel->ppilist=lappend(baserel->ppilist,ppi);returnppi;}//---------------------------------get_parameterized_baserel_size/**get_parameterized_baserel_size*Makeasizeestimateforaparameterizedscanofabaserelation.*估算参数化扫描基础关系的大小**'param_clauses'liststheadditionaljoinclausestobeused.*param_clauses是使用的连接条件链表**set_baserel_size_estimatesmusthavebeenappliedalready.*调用此函数前,要求已调用set_baserel_size_estimates函数*/doubleget_parameterized_baserel_size(PlannerInfo*root,RelOptInfo*rel,List*param_clauses){List*allclauses;doublenrows;/**Estimatethenumberofrowsreturnedbytheparameterizedscan,knowing*thatitwillapplyalltheextrajoinclausesaswellastherel'sown*restrictionclauses.Notethatweforcetheclausestobetreatedas*non-joinclausesduringselectivityestimation.*/allclauses=list_concat(list_copy(param_clauses),rel->baserestrictinfo);//合并条件nrows=rel->tuples*clauselist_selectivity(root,allclauses,rel->relid,/*donotuse0!*/JOIN_INNER,NULL);//获取行数nrows=clamp_row_est(nrows);/*Forsafety,makesureresultisnotmorethanthebaseestimate*/if(nrows>rel->rows)nrows=rel->rows;returnnrows;//返回}//--------------------------------------------cost_seqscan/**cost_seqscan*Determinesandreturnsthecostofscanningarelationsequentially.*计算顺序扫描Rel的成本并返回**'baserel'istherelationtobescanned*baserel:即将被扫描的Rel**'param_info'istheParamPathInfoifthisisaparameterizedpath,elseNULL*param_info:ppi结构体*/voidcost_seqscan(Path*path,PlannerInfo*root,RelOptInfo*baserel,ParamPathInfo*param_info){Coststartup_cost=0;//启动成本Costcpu_run_cost;//CPU成本Costdisk_run_cost;//IO成本doublespc_seq_page_cost;//QualCostqpqual_cost;//表达式成本Costcpu_per_tuple;//每个元组的CPU成本/*Shouldonlybeappliedtobaserelations*/Assert(baserel->relid>0);Assert(baserel->rtekind==RTE_RELATION);/*Markthepathwiththecorrectrowestimate*/if(param_info)//存在PPIpath->rows=param_info->ppi_rows;//行数elsepath->rows=baserel->rows;//直接取基础关系行数if(!enable_seqscan)startup_cost+=disable_cost;//不允许顺序扫描,disable_cost=1.0e10,即1后面10个0,这样的路径无需考虑/*fetchestimatedpagecostfortablespacecontainingtable*/get_tablespace_page_costs(baserel->reltablespace,NULL,&spc_seq_page_cost);//获取顺序访问表空间page的成本/**diskcosts*/disk_run_cost=spc_seq_page_cost*baserel->pages;//IO成本/*CPUcosts*/get_restriction_qual_cost(root,baserel,param_info,&qpqual_cost);//CPU成本startup_cost+=qpqual_cost.startup;//启动成本cpu_per_tuple=cpu_tuple_cost+qpqual_cost.per_tuple;//处理每个元组的成本cpu_run_cost=cpu_per_tuple*baserel->tuples;//CPU执行过程中的成本/*tlistevalcostsarepaidperoutputrow,notpertuplescanned*/startup_cost+=path->pathtarget->cost.startup;//加上获取最终投影列的成本cpu_run_cost+=path->pathtarget->cost.per_tuple*path->rows;//加上获取最终投影列的成本/*Adjustcostingforparallelism,ifused.*/if(path->parallel_workers>0)//并行执行{doubleparallel_divisor=get_parallel_divisor(path);//拆分多少份/*TheCPUcostisdividedamongalltheworkers.*/cpu_run_cost/=parallel_divisor;//每一份的成本/**ItmaybepossibletoamortizesomeoftheI/Ocost,butprobably*notverymuch,becausemostoperatingsystemsalreadydoaggressive*prefetching.Fornow,weassumethatthediskruncostcan'tbe*amortizedatall.*//**Inthecaseofaparallelplan,therowcountneedstorepresent*thenumberoftuplesprocessedperworker.*/path->rows=clamp_row_est(path->rows/parallel_divisor);//每一份的行数}path->startup_cost=startup_cost;//赋值path->total_cost=startup_cost+cpu_run_cost+disk_run_cost;//总成本=启动+执行期的CPU和IO成本}//--------------------------------get_tablespace_page_costs/**get_tablespace_page_costs*Returnrandomand/orsequentialpagecostsforagiventablespace.*返回给定表空间的随机/顺序读取page的成本**Thisvalueisnotlockedbythetransaction,sothisvaluemay*bechangedwhileaSELECTthathasusedthesevaluesforplanning*isstillexecuting.*/voidget_tablespace_page_costs(Oidspcid,//表空间Oiddouble*spc_random_page_cost,double*spc_seq_page_cost){TableSpaceCacheEntry*spc=get_tablespace(spcid);Assert(spc!=NULL);if(spc_random_page_cost)//随机读取{if(!spc->opts||spc->opts->random_page_cost<0)*spc_random_page_cost=random_page_cost;else*spc_random_page_cost=spc->opts->random_page_cost;}if(spc_seq_page_cost)//顺序读取{if(!spc->opts||spc->opts->seq_page_cost<0)*spc_seq_page_cost=seq_page_cost;else*spc_seq_page_cost=spc->opts->seq_page_cost;}}//--------------------------------get_restriction_qual_cost/**get_restriction_qual_cost*Computeevaluationcostsofabaserel'srestrictionquals,plusany*movablejoinqualsthathavebeenpusheddowntothescan.*Resultsarereturnedinto*qpqual_cost.*计算baserel限制条件的估算成本,包括被下推到限制条件的连接条件**Thisisaconveniencesubroutinethatworksforseqscansandothercases*whereallthegivenqualswillbeevaluatedthehardway.It'snotuseful*forcost_index(),forexample,wheretheindexmachinerytakescareof*someofthequals.Weassumebaserestrictcostwaspreviouslysetby*set_baserel_size_estimates().*/staticvoidget_restriction_qual_cost(PlannerInfo*root,RelOptInfo*baserel,ParamPathInfo*param_info,QualCost*qpqual_cost){if(param_info)//参数化信息{/*Includecostsofpushed-downclauses*/cost_qual_eval(qpqual_cost,param_info->ppi_clauses,root);//评估成本qpqual_cost->startup+=baserel->baserestrictcost.startup;qpqual_cost->per_tuple+=baserel->baserestrictcost.per_tuple;}else*qpqual_cost=baserel->baserestrictcost;//没有参数化信息,直接返回}//-------------------cost_qual_eval/**cost_qual_eval*EstimatetheCPUcostsofevaluatingaWHEREclause.*Theinputcanbeeitheranimplicitly-ANDedlistofboolean*expressions,oralistofRestrictInfonodes.(Thelatteris*preferredsinceitallowscachingoftheresults.)*Theresultincludesbothaone-time(startup)component,*andaper-evaluationcomponent.*/voidcost_qual_eval(QualCost*cost,List*quals,PlannerInfo*root){cost_qual_eval_contextcontext;ListCell*l;context.root=root;context.total.startup=0;context.total.per_tuple=0;/*Wedon'tchargeanycostfortheimplicitANDingattoplevel...*/foreach(l,quals)//遍历链表{Node*qual=(Node*)lfirst(l);cost_qual_eval_walker(qual,&context);//遍历表达式}*cost=context.total;//返回总成本}//------------cost_qual_eval_walkerstaticboolcost_qual_eval_walker(Node*node,cost_qual_eval_context*context){if(node==NULL)returnfalse;/**RestrictInfonodescontainaneval_costfieldreservedforthis*routine'suse,sothatit'snotnecessarytoevaluatethequalclause's*costmorethanonce.Iftheclause'scosthasn'tbeencomputedyet,*thefield'sstartupvaluewillcontain-1.*/if(IsA(node,RestrictInfo))//约束条件{RestrictInfo*rinfo=(RestrictInfo*)node;if(rinfo->eval_cost.startup<0)//未计算成本,初始值为-1{cost_qual_eval_contextlocContext;locContext.root=context->root;locContext.total.startup=0;locContext.total.per_tuple=0;/**ForanORclause,recurseintothemarked-uptreesothatwe*settheeval_costforcontainedRestrictInfostoo.*/if(rinfo->orclause)cost_qual_eval_walker((Node*)rinfo->orclause,&locContext);//递归OR条件elsecost_qual_eval_walker((Node*)rinfo->clause,&locContext);//递归/**IftheRestrictInfoismarkedpseudoconstant,itwillbetested*onlyonce,sotreatitscostasallstartupcost.*/if(rinfo->pseudoconstant)//pseudoconstant标志为T{/*countoneexecutionduringstartup*/locContext.total.startup+=locContext.total.per_tuple;locContext.total.per_tuple=0;}rinfo->eval_cost=locContext.total;}context->total.startup+=rinfo->eval_cost.startup;context->total.per_tuple+=rinfo->eval_cost.per_tuple;/*doNOTrecurseintochildren*/returnfalse;}/**Foreachoperatororfunctionnodeinthegiventree,wechargethe*estimatedexecutioncostgivenbypg_proc.procost(remembertomultiply*thisbycpu_operator_cost).**VarsandConstsarechargedzero,andsoarebooleanoperators(AND,*OR,NOT).Simplistic,butalotbetterthannomodelatall.**Shouldwetrytoaccountforthepossibilityofshort-circuit*evaluationofAND/OR?Probably*not*,becausethatwouldmakethe*resultsdependontheclauseordering,andwearenotinanyposition*toexpectthatthecurrentorderingoftheclausesistheonethat's*goingtoendupbeingused.Theaboveper-RestrictInfocachingwould*notmixwellwithtryingtore-orderclausesanyway.**Anotherissuethatisentirelyignoredhereisthatifaset-returning*functionisbelowtoplevelinthetree,thefunctions/operatorsabove*itwillneedtobeevaluatedmultipletimes.Inpracticaluse,such*casesarisesoseldomastonotbeworththeaddedcomplexityneeded;*moreover,sinceourrowcountestimatesforfunctionstendtobepretty*phony,theresultswouldalsobeprettyphony.*/if(IsA(node,FuncExpr))//函数表达式{context->total.per_tuple+=get_func_cost(((FuncExpr*)node)->funcid)*cpu_operator_cost;//调用get_func_cost函数}elseif(IsA(node,OpExpr)||IsA(node,DistinctExpr)||IsA(node,NullIfExpr))//操作符/Distinct/NullIf判断,调用get_func_cost{/*relyonstructequivalencetotreattheseallalike*/set_opfuncid((OpExpr*)node);context->total.per_tuple+=get_func_cost(((OpExpr*)node)->opfuncid)*cpu_operator_cost;}elseif(IsA(node,ScalarArrayOpExpr))//数组{/**Estimatethattheoperatorwillbeappliedtoabouthalfofthe*arrayelementsbeforetheanswerisdetermined.*/ScalarArrayOpExpr*saop=(ScalarArrayOpExpr*)node;Node*arraynode=(Node*)lsecond(saop->args);set_sa_opfuncid(saop);context->total.per_tuple+=get_func_cost(saop->opfuncid)*cpu_operator_cost*estimate_array_length(arraynode)*0.5;}elseif(IsA(node,Aggref)||IsA(node,WindowFunc))//聚合函数或者窗口函数{/**AggrefandWindowFuncnodesare(andshouldbe)treatedlikeVars,*ie,zeroexecutioncostinthecurrentmodel,becausetheybehave*essentiallylikeVarsatexecution.Wedisregardthecostsof*theirinputexpressionsforthesamereason.Theactualexecution*costsoftheaggregate/windowfunctionsandtheirargumentshaveto*befactoredintoplan-node-specificcostingoftheAggorWindowAgg*plannode.*/returnfalse;/*0成本,不再递归到子节点中,don'trecurseintochildren*/}elseif(IsA(node,CoerceViaIO))//CoerceViaIO{CoerceViaIO*iocoerce=(CoerceViaIO*)node;Oidiofunc;Oidtypioparam;booltypisvarlena;/*checktheresulttype'sinputfunction*/getTypeInputInfo(iocoerce->resulttype,&iofunc,&typioparam);context->total.per_tuple+=get_func_cost(iofunc)*cpu_operator_cost;/*checktheinputtype'soutputfunction*/getTypeOutputInfo(exprType((Node*)iocoerce->arg),&iofunc,&typisvarlena);context->total.per_tuple+=get_func_cost(iofunc)*cpu_operator_cost;}elseif(IsA(node,ArrayCoerceExpr))//ArrayCoerceExpr{ArrayCoerceExpr*acoerce=(ArrayCoerceExpr*)node;QualCostperelemcost;cost_qual_eval_node(&perelemcost,(Node*)acoerce->elemexpr,context->root);context->total.startup+=perelemcost.startup;if(perelemcost.per_tuple>0)context->total.per_tuple+=perelemcost.per_tuple*estimate_array_length((Node*)acoerce->arg);}elseif(IsA(node,RowCompareExpr))//RowCompareExpr{/*Conservativelyassumewewillcheckallthecolumns*/RowCompareExpr*rcexpr=(RowCompareExpr*)node;ListCell*lc;foreach(lc,rcexpr->opnos){Oidopid=lfirst_oid(lc);context->total.per_tuple+=get_func_cost(get_opcode(opid))*cpu_operator_cost;}}elseif(IsA(node,MinMaxExpr)||IsA(node,SQLValueFunction)||IsA(node,XmlExpr)||IsA(node,CoerceToDomain)||IsA(node,NextValueExpr))//最小最大值/SQLValueFunction/XML表达式/CoerceToDomain/NextValueExpr{/*Treatalltheseashavingcost1*/context->total.per_tuple+=cpu_operator_cost;}elseif(IsA(node,CurrentOfExpr))//CurrentOfExpr{/*ReporthighcosttopreventselectionofanythingbutTIDscan*/context->total.startup+=disable_cost;//不考虑顺序扫描,使用TID扫描}elseif(IsA(node,SubLink)){/*Thisroutineshouldnotbeappliedtoun-plannedexpressions*/elog(ERROR,"cannothandleunplannedsub-select");//子链接,报错}elseif(IsA(node,SubPlan))//子计划{/**Asubplannodeinanexpressiontypicallyindicatesthatthe*subplanwillbeexecutedoneachevaluation,sochargeaccordingly.*(Sub-selectsthatcanbeexecutedasInitPlanshavealreadybeen*removedfromtheexpression.)*/SubPlan*subplan=(SubPlan*)node;context->total.startup+=subplan->startup_cost;//直接相加context->total.per_tuple+=subplan->per_call_cost;/**Wedon'twanttorecurseintothetestexpr,becauseitwasalready*countedintheSubPlannode'scosts.Sowe'redone.*/returnfalse;}elseif(IsA(node,AlternativeSubPlan))//AlternativeSubPlan{/**Arbitrarilyusethefirstalternativeplanforcosting.(Weshould*certainlyonlyincludeonealternative,andwedon'tyethave*enoughinformationtoknowwhichonetheexecutorismostlikelyto*use.)*/AlternativeSubPlan*asplan=(AlternativeSubPlan*)node;returncost_qual_eval_walker((Node*)linitial(asplan->subplans),context);}elseif(IsA(node,PlaceHolderVar))//PHV,成本为0{/**APlaceHolderVarshouldbegivencostzerowhenconsideringgeneral*expressionevaluationcosts.Theexpenseofdoingthecontained*expressionischargedaspartofthetlistevalcostsofthescan*orjoinwherethePHVisfirstcomputed(seeset_rel_widthand*add_placeholders_to_joinrel).Ifwechargeditagainhere,we'dbe*double-countingthecostforeachlevelofplanthatthePHV*bubblesupthrough.Hence,returnwithoutrecursingintothe*phexpr.*/returnfalse;}/*recurseintochildren*/returnexpression_tree_walker(node,cost_qual_eval_walker,(void*)context);//递归到子节点中}//-------get_func_cost/**get_func_cost*Givenprocedureid,returnthefunction'sprocostfield.*/float4get_func_cost(Oidfuncid){HeapTupletp;float4result;tp=SearchSysCache1(PROCOID,ObjectIdGetDatum(funcid));//获取函数Oidif(!HeapTupleIsValid(tp))elog(ERROR,"cachelookupfailedforfunction%u",funcid);//查询数据字典:selectproname,procostfrompg_procwhereprocost>1;result=((Form_pg_proc)GETSTRUCT(tp))->procost;//直接获取函数的procostReleaseSysCache(tp);returnresult;}三、跟踪分析

启动gdb:

(gdb)bset_base_rel_pathlistsBreakpoint1at0x73bfb5:fileallpaths.c,line296.(gdb)cContinuing.Breakpoint1,set_base_rel_pathlists(root=0x2fd9418)atallpaths.c:296296for(rti=1;rti<root->simple_rel_array_size;rti++)

进入set_plain_rel_pathlist:

(gdb)452set_plain_rel_pathlist(root,rel,rte);(gdb)stepset_plain_rel_pathlist(root=0x2fd9418,rel=0x2f98278,rte=0x2eaa5d8)atallpaths.c:704704required_outer=rel->lateral_relids;

进入create_seqscan_path:

(gdb)stepcreate_seqscan_path(root=0x2fd9418,rel=0x2f98278,required_outer=0x0,parallel_workers=0)atpathnode.c:957957Path*pathnode=makeNode(Path);(gdb)969cost_seqscan(pathnode,root,rel,pathnode->param_info);(gdb)p*pathnode$2={type=T_Path,pathtype=T_SeqScan,parent=0x2f98278,pathtarget=0x2f98488,param_info=0x0,parallel_aware=false,parallel_safe=true,parallel_workers=0,rows=0,startup_cost=0,total_cost=0,pathkeys=0x0}

进入cost_seqscan:

...(gdb)230path->rows=baserel->rows;#rows的获得上一节已作介绍(gdb)pbaserel->rows$4=10926...#表空间顺序扫描的成本(gdb)pspc_seq_page_cost$5=1#IO成本(gdb)pdisk_run_cost$6=726

进入get_restriction_qual_cost:

(gdb)stepget_restriction_qual_cost(root=0x2fd9418,baserel=0x2f98278,param_info=0x0,qpqual_cost=0x7ffe12ca4620)atcostsize.c:39993999if(param_info)#没有参数化信息,直接使用baserel->baserestrictcost(gdb)n4008*qpqual_cost=baserel->baserestrictcost;(gdb)pbaserel->baserestrictcost$7={startup=0,per_tuple=0.0050000000000000001}

回到cost_seqscan

(gdb)cost_seqscan(path=0x2f98990,root=0x2fd9418,baserel=0x2f98278,param_info=0x0)atcostsize.c:248248startup_cost+=qpqual_cost.startup;...

执行cost_seqscan,最终的path:

(gdb)p*path$8={type=T_Path,pathtype=T_SeqScan,parent=0x2f98278,pathtarget=0x2f98488,param_info=0x0,parallel_aware=false,parallel_safe=true,parallel_workers=0,rows=10926,startup_cost=0,total_cost=2226,pathkeys=0x0}(gdb)pcpu_run_cost$9=1500(gdb)pdisk_run_cost$10=726

回到上层函数:

(gdb)ncreate_seqscan_path(root=0x2fd9418,rel=0x2f98278,required_outer=0x0,parallel_workers=0)atpathnode.c:971971returnpathnode;(gdb)972}(gdb)set_plain_rel_pathlist(root=0x2fd9418,rel=0x2f98278,rte=0x2eaa5d8)atallpaths.c:710710if(rel->consider_parallel&&required_outer==NULL)

继续执行构建索引扫描路径/TID扫描路径函数:

714create_index_paths(root,rel);(gdb)717create_tidscan_paths(root,rel);(gdb)n718}

索引扫描路径的结果,rows = 10926, startup_cost = 324.40899999999999,total_cost = 1214.299

(gdb)p*rel->pathlist$14={type=T_List,length=1,head=0x2fe8d10,tail=0x2fe8d10}(gdb)p*(Path*)rel->pathlist->head->data.ptr_value$15={type=T_BitmapHeapPath,pathtype=T_BitmapHeapScan,parent=0x2f98278,pathtarget=0x2f98488,param_info=0x0,parallel_aware=false,parallel_safe=true,parallel_workers=0,rows=10926,startup_cost=324.40899999999999,total_cost=1214.299,pathkeys=0x0}

结束调用

(gdb)set_base_rel_pathlists(root=0x2fd9418)atallpaths.c:296296for(rti=1;rti<root->simple_rel_array_size;rti++)(gdb)312}(gdb)make_one_rel(root=0x2fd9418,joinlist=0x2f985d8)atallpaths.c:185185rel=make_rel_from_joinlist(root,joinlist);#DONE!

相应的SQL执行计划,cost=324.41..1214.30请参照索引扫描路径结果(这部分源码下一节分析):

testdb=#explainanalyzeverboseselectt1.*fromt_dwxxt1wheredwbh>'10000'anddwbh<'20000';QUERYPLAN--------------------------------------------------------------------------------------------------------------------------------BitmapHeapScanonpublic.t_dwxxt1(cost=324.41..1214.30rows=10926width=23)(actualtime=3.196..4.959rows=11111loops=1)Output:dwmc,dwbh,dwdzRecheckCond:(((t1.dwbh)::text>'10000'::text)AND((t1.dwbh)::text<'20000'::text))HeapBlocks:exact=85->BitmapIndexScanont_dwxx_pkey(cost=0.00..321.68rows=10926width=0)(actualtime=3.159..3.159rows=11111loops=1)IndexCond:(((t1.dwbh)::text>'10000'::text)AND((t1.dwbh)::text<'20000'::text))PlanningTime:0.315msExecutionTime:5.673ms(8rows)

感谢各位的阅读,以上就是“PostgreSQL的set_base_rel_pathlists函数及其子函数分析”的内容了,经过本文的学习后,相信大家对PostgreSQL的set_base_rel_pathlists函数及其子函数分析这一问题有了更深刻的体会,具体使用情况还需要大家实践验证。这里是亿速云,小编将为大家推送更多相关知识点的文章,欢迎关注!