This article looks at which function PostgreSQL uses during a checkpoint to flush a single dirty page. It covers the relevant data structures, reads through the source code, and then follows the call in gdb.

I. Data Structures

Macro definitions
Checkpoint request flag bits.
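
Because each flag in this section occupies its own bit, requests from different callers can be merged with bitwise OR and inspected with bitwise AND. The following is only a minimal sketch of that idea: the three values are copied from the definitions listed in this section, while `merge_requests` and `has_flags` are hypothetical helper names invented for illustration and are not PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the checkpoint flag definitions in this section. */
#define CHECKPOINT_IMMEDIATE 0x0004
#define CHECKPOINT_FORCE     0x0008
#define CHECKPOINT_WAIT      0x0020

/* Hypothetical helper: merge the flags of two independent requests. */
static int
merge_requests(int a, int b)
{
    return a | b;
}

/* Hypothetical helper: true if every bit in 'wanted' is set in 'flags'. */
static bool
has_flags(int flags, int wanted)
{
    return (flags & wanted) == wanted;
}
```

For example, `merge_requests(CHECKPOINT_IMMEDIATE, CHECKPOINT_FORCE | CHECKPOINT_WAIT)` gives 0x002C, and `has_flags` on that result reports CHECKPOINT_WAIT as set.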

/*
 * OR-able request flag bits for checkpoints.  The "cause" bits are used only
 * for logging purposes.  Note: the flags must be defined so that it's
 * sensible to OR together request flags arising from different requestors.
 */

/* These directly affect the behavior of CreateCheckPoint and subsidiaries */
#define CHECKPOINT_IS_SHUTDOWN      0x0001  /* Checkpoint is for shutdown */
#define CHECKPOINT_END_OF_RECOVERY  0x0002  /* Like shutdown checkpoint, but
                                             * issued at end of WAL recovery */
#define CHECKPOINT_IMMEDIATE        0x0004  /* Do it without delays */
#define CHECKPOINT_FORCE            0x0008  /* Force even if no activity */
#define CHECKPOINT_FLUSH_ALL        0x0010  /* Flush all pages, including those
                                             * belonging to unlogged tables */
/* These are important to RequestCheckpoint */
#define CHECKPOINT_WAIT             0x0020  /* Wait for completion */
#define CHECKPOINT_REQUESTED        0x0040  /* Checkpoint request has been made */
/* These indicate the cause of a checkpoint request */
#define CHECKPOINT_CAUSE_XLOG       0x0080  /* XLOG consumption */
#define CHECKPOINT_CAUSE_TIME       0x0100  /* Elapsed time */

II. Source Code Walkthrough

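SyncOneBuffer processes a single buffer during syncing. Its main logic is:
1. Get the buffer descriptor
2. Lock the buffer header
3. Run checks based on the buffer state and the input arguments
4. Pin the dirty page, take a shared lock on it, and call FlushBuffer to write it out
5. Release the lock, unpin the buffer, and finish up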
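
The checks in step 3 can be modeled as a small pure function. This is a simplified sketch and not the PostgreSQL implementation: `sync_decision` and its plain parameters are hypothetical stand-ins for the packed buffer state, the numeric values of BUF_WRITTEN and BUF_REUSABLE are assumptions made for this example, and the pinning, locking, and actual write are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Result bits; the numeric values are assumptions made for this sketch. */
#define BUF_WRITTEN  0x01
#define BUF_REUSABLE 0x02

/*
 * Hypothetical model of the decisions SyncOneBuffer makes for one buffer.
 * The arguments stand in for fields of the real packed buffer state.
 */
static int
sync_decision(unsigned refcount, unsigned usagecount,
              bool valid, bool dirty, bool skip_recently_used)
{
    int result = 0;

    if (refcount == 0 && usagecount == 0)
        result |= BUF_REUSABLE;     /* candidate for replacement */
    else if (skip_recently_used)
        return result;              /* pinned or recently used: skip it */

    if (!valid || !dirty)
        return result;              /* nothing to write */

    /* The real function pins, share-locks, and calls FlushBuffer here. */
    return result | BUF_WRITTEN;
}
```

With skip_recently_used set to false, any valid dirty buffer gets BUF_WRITTEN in the result regardless of its pin or usage count; with it set to true, only an unpinned buffer with usage count 0 gets that far.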

/*
 * SyncOneBuffer -- process a single buffer during syncing.
 *
 * If skip_recently_used is true, we don't write currently-pinned buffers, nor
 * buffers marked recently used, as these are not replacement candidates.
 *
 * Returns a bitmask containing the following flag bits:
 *  BUF_WRITTEN: we wrote the buffer.
 *  BUF_REUSABLE: buffer is available for replacement, ie, it has
 *      pin count 0 and usage count 0.
 *
 * (BUF_WRITTEN could be set in error if FlushBuffers finds the buffer clean
 * after locking it, but we don't care all that much.)
 *
 * Note: caller must have done ResourceOwnerEnlargeBuffers.
 */
static int
SyncOneBuffer(int buf_id, bool skip_recently_used, WritebackContext *wb_context)
{
    BufferDesc *bufHdr = GetBufferDescriptor(buf_id);
    int         result = 0;
    uint32      buf_state;
    BufferTag   tag;

    ReservePrivateRefCountEntry();

    /*
     * Check whether buffer needs writing.
     *
     * We can make this check without taking the buffer content lock so long
     * as we mark pages dirty in access methods *before* logging changes with
     * XLogInsert(): if someone marks the buffer dirty just after our check we
     * don't worry because our checkpoint.redo points before log record for
     * upcoming changes and so we are not required to write such dirty buffer.
     */
    buf_state = LockBufHdr(bufHdr);

    if (BUF_STATE_GET_REFCOUNT(buf_state) == 0 &&
        BUF_STATE_GET_USAGECOUNT(buf_state) == 0)
    {
        result |= BUF_REUSABLE;
    }
    else if (skip_recently_used)
    {
        /* Caller told us not to write recently-used buffers */
        UnlockBufHdr(bufHdr, buf_state);
        return result;
    }

    if (!(buf_state & BM_VALID) || !(buf_state & BM_DIRTY))
    {
        /* It's clean, so nothing to do */
        /* (that is, the buffer is either invalid or not dirty) */
        UnlockBufHdr(bufHdr, buf_state);
        return result;
    }

    /*
     * Pin it, share-lock it, write it.  (FlushBuffer will do nothing if the
     * buffer is clean by the time we've locked it.)
     */
    PinBuffer_Locked(bufHdr);
    LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_SHARED);

    /*
     * Call FlushBuffer.  If the caller has an smgr reference for the buffer's
     * relation, pass it as the second parameter.  If not, pass NULL.
     */
    FlushBuffer(bufHdr, NULL);

    LWLockRelease(BufferDescriptorGetContentLock(bufHdr));

    tag = bufHdr->tag;

    UnpinBuffer(bufHdr, true);

    ScheduleBufferTagForWriteback(wb_context, &tag);

    return result | BUF_WRITTEN;
}

FlushBuffer
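FlushBuffer writes out a shared buffer. The actual write is delegated to smgrwrite (storage manager write), which hands the page to the OS kernel; the data is only guaranteed to be on disk after a later fsync.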
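
The difference between handing data to the kernel and forcing it to disk can be seen with plain POSIX calls. The sketch below is not how PostgreSQL's storage manager is written; `write_then_sync` and the scratch file under /tmp are hypothetical, chosen only to show the two separate steps (write, then fsync) on a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: write 'len' bytes to a scratch file, then force them
 * to stable storage.  Returns 0 on success, -1 on any failure.
 */
static int
write_then_sync(const void *data, size_t len)
{
    char    path[] = "/tmp/flush_sketch_XXXXXX";   /* assumed writable */
    int     fd = mkstemp(path);
    int     rc = -1;
    ssize_t n;

    if (fd < 0)
        return -1;

    /* Step 1: hand the bytes to the kernel; they may stay in the OS cache. */
    n = write(fd, data, len);
    if (n >= 0 && (size_t) n == len)
    {
        /* Step 2: ask the kernel to push the cached bytes to disk. */
        if (fsync(fd) == 0)
            rc = 0;
    }

    close(fd);
    unlink(path);       /* remove the scratch file again */
    return rc;
}
```

In these terms, FlushBuffer corresponds roughly to step 1 only, and the fsync of step 2 has to happen before the checkpoint can be considered complete.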

/*
 * FlushBuffer
 *      Physically write out a shared buffer.
 *
 * NOTE: this actually just passes the buffer contents to the kernel; the
 * real write to disk won't happen until the kernel feels like it.  This
 * is okay from our point of view since we can redo the changes from WAL.
 * However, we will need to force the changes to disk via fsync before
 * we can checkpoint WAL.
 *
 * The caller must hold a pin on the buffer and have share-locked the
 * buffer contents.  (Note: a share-lock does not prevent updates of
 * hint bits in the buffer, so the page could change while the write
 * is in progress, but we assume that that will not invalidate the data
 * written.)
 *
 * If the caller has an smgr reference for the buffer's relation, pass it
 * as the second parameter.  If not, pass NULL.
 */
static void
FlushBuffer(BufferDesc *buf, SMgrRelation reln)
{
    XLogRecPtr  recptr;
    ErrorContextCallback errcallback;
    instr_time  io_start,
                io_time;
    Block       bufBlock;
    char       *bufToWrite;
    uint32      buf_state;

    /*
     * Acquire the buffer's io_in_progress lock.  If StartBufferIO returns
     * false, then someone else flushed the buffer before we could, so we need
     * not do anything.
     */
    if (!StartBufferIO(buf, false))
        return;

    /* Setup error traceback support for ereport() */
    errcallback.callback = shared_buffer_write_error_callback;
    errcallback.arg = (void *) buf;
    errcallback.previous = error_context_stack;
    error_context_stack = &errcallback;

    /* Find smgr relation for buffer */
    if (reln == NULL)
        reln = smgropen(buf->tag.rnode, InvalidBackendId);

    TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
                                        buf->tag.blockNum,
                                        reln->smgr_rnode.node.spcNode,
                                        reln->smgr_rnode.node.dbNode,
                                        reln->smgr_rnode.node.relNode);

    buf_state = LockBufHdr(buf);

    /*
     * Run PageGetLSN while holding header lock, since we don't have the
     * buffer locked exclusively in all cases.
     */
    recptr = BufferGetLSN(buf);

    /* To check if block content changes while flushing. - vadim 01/17/97 */
    buf_state &= ~BM_JUST_DIRTIED;
    UnlockBufHdr(buf, buf_state);

    /*
     * Force XLOG flush up to buffer's LSN.  This implements the basic WAL
     * rule that log updates must hit disk before any of the data-file changes
     * they describe do.
     *
     * However, this rule does not apply to unlogged relations, which will be
     * lost after a crash anyway.  Most unlogged relation pages do not bear
     * LSNs since we never emit WAL records for them, and therefore flushing
     * up through the buffer LSN would be useless, but harmless.  However,
     * GiST indexes use LSNs internally to track page-splits, and therefore
     * unlogged GiST pages bear "fake" LSNs generated by
     * GetFakeLSNForUnloggedRel.  It is unlikely but possible that the fake
     * LSN counter could advance past the WAL insertion point; and if it did
     * happen, attempting to flush WAL through that location would fail, with
     * disastrous system-wide consequences.  To make sure that can't happen,
     * skip the flush if the buffer isn't permanent.
     */
    if (buf_state & BM_PERMANENT)
        XLogFlush(recptr);

    /*
     * Now it's safe to write buffer to disk. Note that no one else should
     * have been able to write it while we were busy with log flushing because
     * we have the io_in_progress lock.
     */
    bufBlock = BufHdrGetBlock(buf);

    /*
     * Update page checksum if desired.  Since we have only shared lock on the
     * buffer, other processes might be updating hint bits in it, so we must
     * copy the page to private storage if we do checksumming.
     */
    bufToWrite = PageSetChecksumCopy((Page) bufBlock, buf->tag.blockNum);

    if (track_io_timing)
        INSTR_TIME_SET_CURRENT(io_start);

    /*
     * bufToWrite is either the shared buffer or a copy, as appropriate.
     */
    smgrwrite(reln,
              buf->tag.forkNum,
              buf->tag.blockNum,
              bufToWrite,
              false);

    if (track_io_timing)
    {
        INSTR_TIME_SET_CURRENT(io_time);
        INSTR_TIME_SUBTRACT(io_time, io_start);
        pgstat_count_buffer_write_time(INSTR_TIME_GET_MICROSEC(io_time));
        INSTR_TIME_ADD(pgBufferUsage.blk_write_time, io_time);
    }

    pgBufferUsage.shared_blks_written++;

    /*
     * Mark the buffer as clean (unless BM_JUST_DIRTIED has become set) and
     * end the io_in_progress state.
     */
    TerminateBufferIO(buf, true, 0);

    TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
                                       buf->tag.blockNum,
                                       reln->smgr_rnode.node.spcNode,
                                       reln->smgr_rnode.node.dbNode,
                                       reln->smgr_rnode.node.relNode);

    /* Pop the error context stack */
    error_context_stack = errcallback.previous;
}

III. Trace Analysis

Test script

testdb=# update t_wal_ckpt set c2 = 'C4#'||substr(c2,4,40);
UPDATE 1
testdb=# checkpoint;

Trace

(gdb) handle SIGINT print nostop pass
SIGINT is used by the debugger.
Are you sure you want to change it? (y or n) y
Signal        Stop      Print   Pass to program Description
SIGINT        No        Yes     Yes             Interrupt
(gdb) b SyncOneBuffer
Breakpoint 1 at 0x8a7167: file bufmgr.c, line 2357.
(gdb) c
Continuing.

Program received signal SIGINT, Interrupt.

Breakpoint 1, SyncOneBuffer (buf_id=0, skip_recently_used=false, wb_context=0x7fff27f5ae00) at bufmgr.c:2357
2357      BufferDesc *bufHdr = GetBufferDescriptor(buf_id);
(gdb) n
2358      int result = 0;
(gdb) p *bufHdr
$1 = {tag = {rnode = {spcNode = 1663, dbNode = 16384, relNode = 221290}, forkNum = MAIN_FORKNUM, blockNum = 0}, buf_id = 0, state = {value = 3548905472}, wait_backend_pid = 0, freeNext = -2, content_lock = {tranche = 53, state = {value = 536870912}, waiters = {head = 2147483647, tail = 2147483647}}}
(gdb) n
2362      ReservePrivateRefCountEntry();
(gdb)
2373      buf_state = LockBufHdr(bufHdr);
(gdb)
2375      if (BUF_STATE_GET_REFCOUNT(buf_state) == 0 &&
(gdb)
2376          BUF_STATE_GET_USAGECOUNT(buf_state) == 0)
(gdb)
2375      if (BUF_STATE_GET_REFCOUNT(buf_state) == 0 &&
(gdb)
2380      else if (skip_recently_used)
(gdb)
2387      if (!(buf_state & BM_VALID) || !(buf_state & BM_DIRTY))
(gdb)
2398      PinBuffer_Locked(bufHdr);
(gdb) p buf_state
$2 = 3553099776
(gdb) n
2399      LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_SHARED);
(gdb)
2401      FlushBuffer(bufHdr, NULL);
(gdb) step
FlushBuffer (buf=0x7fedc4a68300, reln=0x0) at bufmgr.c:2687
2687      if (!StartBufferIO(buf, false))
(gdb) n
2691      errcallback.callback = shared_buffer_write_error_callback;
(gdb)
2692      errcallback.arg = (void *) buf;
(gdb)
2693      errcallback.previous = error_context_stack;
(gdb)
2694      error_context_stack = &errcallback;
(gdb)
2697      if (reln == NULL)
(gdb)
2698        reln = smgropen(buf->tag.rnode, InvalidBackendId);
(gdb)
2700      TRACE_POSTGRESQL_BUFFER_FLUSH_START(buf->tag.forkNum,
(gdb)
2706      buf_state = LockBufHdr(buf);
(gdb)
2712      recptr = BufferGetLSN(buf);
(gdb)
2715      buf_state &= ~BM_JUST_DIRTIED;
(gdb) p recptr
$3 = 16953421760
(gdb) n
2716      UnlockBufHdr(buf, buf_state);
(gdb)
2735      if (buf_state & BM_PERMANENT)
(gdb)
2736        XLogFlush(recptr);
(gdb)
2743      bufBlock = BufHdrGetBlock(buf);
(gdb)
2750      bufToWrite = PageSetChecksumCopy((Page) bufBlock, buf->tag.blockNum);
(gdb) p bufBlock
$4 = (Block) 0x7fedc4e68300
(gdb) n
2752      if (track_io_timing)
(gdb)
2758      smgrwrite(reln,
(gdb)
2764      if (track_io_timing)
(gdb)
2772      pgBufferUsage.shared_blks_written++;
(gdb)
2778      TerminateBufferIO(buf, true, 0);
(gdb)
2780      TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(buf->tag.forkNum,
(gdb)
2787      error_context_stack = errcallback.previous;
(gdb)
2788    }
(gdb)
SyncOneBuffer (buf_id=0, skip_recently_used=false, wb_context=0x7fff27f5ae00) at bufmgr.c:2403
2403      LWLockRelease(BufferDescriptorGetContentLock(bufHdr));
(gdb)
2405      tag = bufHdr->tag;
(gdb)
2407      UnpinBuffer(bufHdr, true);
(gdb)
2409      ScheduleBufferTagForWriteback(wb_context, &tag);
(gdb)
2411      return result | BUF_WRITTEN;
(gdb)
2412    }
(gdb)
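
The raw numbers printed in the trace ($1, $2, $3) are easier to read once they are unpacked. The sketch below assumes a particular layout of the 32-bit buffer state word: 18 bits of refcount, 4 bits of usage count, and flag bits above that. This layout and the bit positions of the BM_* flags are not shown anywhere in this article, so treat them as assumptions to check against buf_internals.h in your own source tree. The helper names (`state_refcount`, `state_usagecount`, `state_has`, `lsn_hi`, `lsn_lo`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bits 0-17 refcount, bits 18-21 usage count, rest flags. */
#define BUF_REFCOUNT_MASK     ((1U << 18) - 1)
#define BUF_USAGECOUNT_MASK   0x003C0000U
#define BUF_USAGECOUNT_SHIFT  18

/* Assumed flag positions; verify against your PostgreSQL version. */
#define BM_LOCKED             (1U << 22)
#define BM_DIRTY              (1U << 23)
#define BM_VALID              (1U << 24)
#define BM_JUST_DIRTIED       (1U << 28)
#define BM_PERMANENT          (1U << 31)

/* Hypothetical helpers for unpacking a state word. */
static uint32_t
state_refcount(uint32_t s)
{
    return s & BUF_REFCOUNT_MASK;
}

static uint32_t
state_usagecount(uint32_t s)
{
    return (s & BUF_USAGECOUNT_MASK) >> BUF_USAGECOUNT_SHIFT;
}

static int
state_has(uint32_t s, uint32_t flag)
{
    return (s & flag) != 0;
}

/* Split a 64-bit WAL position into the two halves of the X/Y notation. */
static uint32_t
lsn_hi(uint64_t lsn)
{
    return (uint32_t) (lsn >> 32);
}

static uint32_t
lsn_lo(uint64_t lsn)
{
    return (uint32_t) (lsn & 0xFFFFFFFFU);
}
```

Under this assumed layout, the state value in $1 (3548905472) decodes to refcount 0 and usage count 2 with the valid, dirty, and permanent bits set, and $2 (3553099776) differs from it only in BM_LOCKED, which fits the fact that LockBufHdr ran in between. The recptr in $3 splits into 3 and 0xF280AFC0, that is, 3/F280AFC0 in PostgreSQL's usual LSN notation.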

That wraps up this look at the function PostgreSQL's checkpoint uses to flush a single dirty page: SyncOneBuffer picks the buffer and FlushBuffer writes it out through smgrwrite. As always, verify the details against your own build.