数据库中cluster factor对执行计划的影响是什么

2025-03-14 技术教程

这篇文章主要讲解了“数据库中cluster factor对执行计划的影响是什么”，文中的讲解内容简单清晰，易于学习与理解，下面请大家跟着小编的思路慢慢深入，一起来研究和学习“数据库中cluster factor对执行计划的影响是什么”吧！

cluster factor对执行计划的影响

测试环境：Linux 7.6 + ORACLE 19.6.1

1.创建测试环境1.1 创建测试表并插入数据

CZH@czhpdb>createtabletest_ffsasselect*fromhr.employees;Tablecreated.CZH@czhpdb>insertintotest_ffsselect*fromtest_ffs;ExecutionPlan----------------------------------------------------------Planhashvalue:296244252---------------------------------------------------------------------------------------------|Id|Operation|Name|Rows|Bytes|Cost(%CPU)|Time|---------------------------------------------------------------------------------------------|0|INSERTSTATEMENT||107|7383|3(0)|00:00:01||1|LOADTABLECONVENTIONAL|TEST_FFS||||||2|OPTIMIZERSTATISTICSGATHERING||107|7383|3(0)|00:00:01||3|TABLEACCESSFULL|TEST_FFS|107|7383|3(0)|00:00:01|---------------------------------------------------------------------------------------------Note------dynamicstatisticsused:statisticsforconventionalDMLStatistics----------------------------------------------------------72recursivecalls89dbblockgets81consistentgets12physicalreads21576redosize195bytessentviaSQL*Nettoclient394bytesreceivedviaSQL*Netfromclient1SQL*Netroundtripsto/fromclient3sorts(memory)0sorts(disk)107rowsprocessed

上面autotrace执行计划可以看到两个新特性：

1.2 12c R1与19c两个新特性1.2.1 12c R1新特性OPTIMIZER STATISTICS GATHERING：

# OPTIMIZER STATISTICS GATHERING：12cR1以后的新特性，direct path load时，空表第一次加载数据时会自动收集统计信息。

# Oracle Database 12c introduced onlinestatisticsgatheringforCREATE TABLE AS SELECT statements and direct-path inserts.

1.2.2 19c新特性real-timestatistics

Oracle Database 19c introducesreal-timestatistics
, which extend online support toconventionalDMLstatements
. Becausestatisticscan go stale between DBMS_STATS jobs, real-timestatisticshelps the optimizer generate more optimal plans.Whereas bulk load operations gather all necessarystatistics, real-timestatisticsaugment rather than replace traditionalstatistics.

· Oracle introduced new parameters

· "_optimizer_gather_stats_on_conventional_dml" and "_optimizer_use_stats_on_conventional_dml" which are true by default

· "_optimizer_stats_on_conventional_dml_sample_rate" at 100%

· How does real timestatisticsworks?

· By default the "_optimizer_gather_stats_on_conventional_dml" is true so its automatically kicks off

· When aDMLoperation is currently modifying a table (conventional), Oracle Database dynamically computes valuesforthe most essentialstatisticsif the above parameter is on.

· Consider a example of table that is having lot of inserts and rows are increasing. Real-timestatisticskeep track of the increasing row count as rows are being inserted. If the optimizer performs a hard parse of a new query, then the optimizer can use the real-timestatisticsto obtain a more accurate cost estimate.

· DBA_TAB_COL_STATISTICSand DBA_TAB_STATISITICS has columns NOTES tell real timestatisticshave been used. STATS_ON_CONVENTIONAL_DML

SELECTNVL(PARTITION_NAME,'GLOBAL')PARTITION_NAME,NUM_ROWS,BLOCKS,NOTES

FROMUSER_TAB_STATISTICS

WHERETABLE_NAME='SALES'

ORDERBY1,4;

PARTITION_NAMNUM_ROWSBLOCKSNOTES

----------------------------------------------------------

GLOBAL18376863315STATS_ON_CONVENTIONAL_DML1.3 插入大量数据并收集统计信息

CZH@czhpdb>setautotoffCZH@czhpdb>insertintotest_ffsselect*fromtest_ffs;CZH@czhpdb>insertintotest_ffsselect*fromtest_ffs;CZH@czhpdb>insertintotest_ffsselect*fromtest_ffs;CZH@czhpdb>insertintotest_ffsselect*fromtest_ffs;CZH@czhpdb>insertintotest_ffsselect*fromtest_ffs;CZH@czhpdb>insertintotest_ffsselect*fromtest_ffs;CZH@czhpdb>insertintotest_ffsselect*fromtest_ffs;CZH@czhpdb>insertintotest_ffsselect*fromtest_ffs;CZH@czhpdb>insertintotest_ffsselect*fromtest_ffs;CZH@czhpdb>insertintotest_ffsselect*fromtest_ffs;CZH@czhpdb>commit;CZH@czhpdb>CREATEINDEXIDX_TEST_FFSONTEST_FFS(EMPLOYEE_ID);CZH@czhpdb>EXECDBMS_STATS.GATHER_TABLE_STATS(user,’TEST_FFS’,cascade=>true);1.4 使用Hint /*+ gather_plan_statistics */获取sql真实执行计划

# sqlplus中set autotrace与explain plan for都是CBO预估出来的执行计划，可能与真实执行的并不相同，我们使用下面hint获取真实执行计划。

CZH@czhpdb>SELECT/*+gather_plan_statistics*/salaryfromtest_ffswhereemployee_id<100;norowsselected真实执行计划：SYS@orcl2>select*fromtable(dbms_xplan.display_cursor('c9qg9su5khysd',null,'allstatslast'));PLAN_TABLE_OUTPUT------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------SQL_IDc9qg9su5khysd,childnumber0-------------------------------------SELECT/*+gather_plan_statistics*/salaryfromtest_ffswhereemployee_id<100Planhashvalue:296244252----------------------------------------------------------------------------------------|Id|Operation|Name|Starts|E-Rows|A-Rows|A-Time|Buffers|----------------------------------------------------------------------------------------|0|SELECTSTATEMENT||1||0|00:00:00.01|2288||*1|TABLEACCESSFULL|TEST_FFS|1|1|0|00:00:00.01|2288|----------------------------------------------------------------------------------------PredicateInformation(identifiedbyoperationid):---------------------------------------------------

1 - filter("EMPLOYEE_ID"<100)

# 可以看到由于表中数据没有employee_id < 100，我们认为明显走索引的sql并未选择索引，那肯定是由于某些原因，cbo认为走索引并不是最优执行路径，我们就利用10053获取为什么cbo认为全表扫描cost更低。

SYS@orcl2>altersystemflushshared_pool;Systemaltered.

#如果不清空shared_pool或者使游标失效，软解析开启10053事件，不会生成trace文件。

CZH@czhpdb>ALTERSESSIONSETEVENTS'10053tracenamecontextforever,level1';Sessionaltered.CZH@czhpdb>SELECT/*+gather_plan_statistics*/salaryfromtest_ffswhereemployee_id<100;norowsselectedCZH@czhpdb>ALTERSESSIONSETEVENTS'10053tracenamecontextoff';Sessionaltered.

19c 10053：

# 可以从下面10053看到DK(distinct key),CLUF(clustering factor),IX_SEL,下一步将根据几个参数计算为何CBO认为走索引cost会高于全表扫描。

***************************************BASESTATISTICALINFORMATION***********************TableStats::Table:TEST_FFSAlias:TEST_FFSonlinetablestatsforconventionalDML(blockcount:2263rowcount:219029)usedon(TEST_FFS)blockcount:5->2263,rowcount:107->219136#Rows:219136SSZ:0LGR:0#Blks:2263AvgRowLen:69.00NEB:0ChainCnt:0.00ScanRate:0.00SPC:0RFL:0RNF:0CBK:0CHR:0KQDFLG:193#IMCUs:0IMCRowCnt:0IMCJournalRowCnt:0#IMCBlocks:0IMCQuotient:0.000000IndexStats::Index:IDX_TEST_FFSCol#:1LVLS:1#LB:458#DK:107LB/K:4.00DB/K:1524.00CLUF:163174.00NRW:219136.00SSZ:0.00LGR:0.00CBK:0.00GQL:0.00CHR:0.00KQDFLG:8192BSZ:1KKEISFLG:1trytogeneratesingle-tablefilterpredicatesfromORsforqueryblockSEL$1(#0)finally:"TEST_FFS"."EMPLOYEE_ID"<100=======================================SPD:BEGINcontextatqueryblocklevel=======================================QueryBlockSEL$1(#0)ReturncodeinqosdSetupDirCtx4QB:NOCTX=====================================SPD:ENDcontextatqueryblocklevel=====================================AccesspathanalysisforTEST_FFS***************************************SINGLETABLEACCESSPATHSingleTableCardinalityEstimationforTEST_FFS[TEST_FFS]SPD:ReturncodeinqosdDSDirSetup:NOCTX,estType=TABLEkkecdn:SingleTablePredicate:"TEST_FFS"."EMPLOYEE_ID"<100onlinecolumnstatsforconventionalDMLusedon(TEST_FFS.EMPLOYEE_ID)min:100.00->100.00,max:206.00->206.00,nnl:0->0,acl:4->0Column(#1):EMPLOYEE_ID(NUMBER)AvgLen:22NDV:107Nulls:0Density:0.009346Min:100.000000Max:206.000000Usingdensity:0.009346ofcol#1asselectivityofunpopularvaluepredTable:TEST_FFSAlias:TEST_FFSCard:Original:219136.000000Rounded:2048Computed:2048.000000NonAdjusted:2048.000000ScanIOCost(Disk)=615.000000ScanCPUCost(Disk)=49272938.720000Costofpredicates:io=NOCOST,cpu=50.000000,sel=0.009346flag=2048("TEST_FFS"."EMPLOYEE_ID"<100)TotalScanIOCost=615.000000(scan(Disk))+0.000000(iofiltereval)(=0.000000(perrow)*219136.000000(#rows))=615.000000TotalScanCPUCost=49272938.720000(scan(Disk))+10956800.000000(cpufiltereval)(=50.000000(perrow)*219136.000000(#rows))=60229738.720000AccessPath:TableScanCost:621.167026Resp:621.167026Degree:0Cost_io:615.000000Cost_cpu:60229739Resp_io:615.000000Resp_cpu:60229739******CostingIndexIDX_TEST_FFSSPD:ReturncodeinqosdDSDirSetup:NOCTX,estType=INDEX_SCANSPD:ReturncodeinqosdDSDirSetup:NOCTX,estType=INDEX_FILTERUsingdensity:0.009346ofcol#1asselectivityofunpopularvaluepredAccessPath:index(RangeScan)Index:IDX_TEST_FFSresc_io:1531.000000resc_cpu:11906445ix_sel:0.009346ix_sel_with_filters:0.009346Cost:1532.219121Resp:1532.219121Degree:1Best::AccessPath:TableScanCost:621.167026Degree:1Resp:621.167026Card:2048.000000Bytes:0.000000onlinecolumnstatsforconventionalDMLusedon(TEST_FFS.SALARY)min:2100.00->2100.00,max:24000.00->24000.00,nnl:0->0,acl:4->0***************************************2.调整cluster factor2.1 cluster factor聚簇因子说明

cluster factor表示索引顺序与表存储数据一致性，顺序扫描索引时，如果索引键值扫描到键值对应的表数据行对应的数据块发生变化时，则cluster factor加1，所以cluster factor最低为表数据块，最大为表数据行，与表存储顺序高度相关，如果表是按照顺序插入，则cluster factor较低，如果表数据为无序插入，则cluster factor较高，这就是为什么同样表数据情况下，执行计划会有时候有差别的原因。

索引扫描成本公式：

INDEX ACCESS I/O COST=BLEVEL+CEIL(#LEAF_BLOCKS*IX_SEL)

TABLE_ACCESS I/O COST=CEIL(CLUSTERING_FACTOR*IX_SEL_WITH_FILTERS)

IX_SEL与IX_SEL_WITH_FILTERS为索引选择率与索引带谓词选择率，一般为1/(DISTINCT KEY)值，本例中走全表扫描时，IX_SEL=1/107=0.009345，则计算走索引成本为：

ACCESSINDEXCOST=INDEXACCESSI/OCOST+TABLEACCESSI/OCOST=2+CEIL(458*0.009345)+CEIL(163174*0.009345)=1540

近似等于CBO预计出来的1532，是高于全表扫的COST 615的，所以选择走了全表扫描。

2.2 调整cluster factor

重建表，order by排序，降低cluster factor

CZH@czhpdb>createtabletest_ffs_03asselect*fromtest_ffs_02orderbyemployee_id;Tablecreated.CZH@czhpdb>createindexidx_test_ffs_03ontest_ffs_03(employee_id);Indexcreated.CZH@czhpdb>selectclustering_factor,index_namefromuser_indexeswhereindex_name='IDX_TEST_FFS_03';CLUSTERING_FACTORINDEX_NAME------------------------------------------------------------1128IDX_TEST_FFS_03

# 可以看到cluster factor明显降低。

CZH@czhpdb>select/*+gather_plan_statistics*/salaryfromtest_ffs_03whereemployee_id<100;norowsselectedSYS@orcl2>select*fromtable(dbms_xplan.display_cursor('8fpk2b8vzn5y2',null,'allstatslast'));PLAN_TABLE_OUTPUT------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------SQL_ID8fpk2b8vzn5y2,childnumber0-------------------------------------select/*+gather_plan_statistics*/salaryfromtest_ffs_03whereemployee_id<100Planhashvalue:704625359--------------------------------------------------------------------------------------------------------------------------|Id|Operation|Name|Starts|E-Rows|A-Rows|A-Time|Buffers|Reads|--------------------------------------------------------------------------------------------------------------------------|0|SELECTSTATEMENT||1||0|00:00:00.01|2|1||1|TABLEACCESSBYINDEXROWIDBATCHED|TEST_FFS_03|1|1024|0|00:00:00.01|2|1||*2|INDEXRANGESCAN|IDX_TEST_FFS_03|1|1024|0|00:00:00.01|2|1|--------------------------------------------------------------------------------------------------------------------------PredicateInformation(identifiedbyoperationid):---------------------------------------------------2-access("EMPLOYEE_ID"<100)

感谢各位的阅读，以上就是“数据库中cluster factor对执行计划的影响是什么”的内容了，经过本文的学习后，相信大家对数据库中cluster factor对执行计划的影响是什么这一问题有了更深刻的体会，具体使用情况还需要大家实践验证。这里是亿速云，小编将为大家推送更多相关知识点的文章，欢迎关注！