What does the ReadBuffer_common function do in PostgreSQL

Published: 2021-11-09 16:03:28  Source: Yisu Cloud (亿速云)  Views: 297  Author: iii  Category: Relational databases

This article explains what the ReadBuffer_common function does in PostgreSQL. It first introduces the relevant data structures, then reads through the source code, and finally traces the function with gdb.

I. Data structures

BufferDesc
Shared descriptor (state) data for a single shared buffer.

/*
 * Flags for buffer descriptors
 *
 * Note: TAG_VALID essentially means that there is a buffer hashtable
 * entry associated with the buffer's tag.
 */
#define BM_LOCKED               (1U << 22)  /* buffer header is locked */
#define BM_DIRTY                (1U << 23)  /* data needs writing */
#define BM_VALID                (1U << 24)  /* data is valid */
#define BM_TAG_VALID            (1U << 25)  /* tag is assigned */
#define BM_IO_IN_PROGRESS       (1U << 26)  /* read or write in progress */
#define BM_IO_ERROR             (1U << 27)  /* previous I/O failed */
#define BM_JUST_DIRTIED         (1U << 28)  /* dirtied since write started */
#define BM_PIN_COUNT_WAITER     (1U << 29)  /* have waiter for sole pin */
#define BM_CHECKPOINT_NEEDED    (1U << 30)  /* must write for checkpoint */
#define BM_PERMANENT            (1U << 31)  /* permanent buffer (not unlogged,
                                             * or init fork) */

/*
 *  BufferDesc -- shared descriptor/state data for a single shared buffer.
 *
 * Note: Buffer header lock (BM_LOCKED flag) must be held to examine or change
 * the tag, state or wait_backend_pid fields.  In general, buffer header lock
 * is a spinlock which is combined with flags, refcount and usagecount into
 * single atomic variable.  This layout allow us to do some operations in a
 * single atomic operation, without actually acquiring and releasing spinlock;
 * for instance, increase or decrease refcount.  buf_id field never changes
 * after initialization, so does not need locking.  freeNext is protected by
 * the buffer_strategy_lock not buffer header lock.  The LWLock can take care
 * of itself.  The buffer header lock is *not* used to control access to the
 * data in the buffer!
 *
 * It's assumed that nobody changes the state field while buffer header lock
 * is held.  Thus buffer header lock holder can do complex updates of the
 * state variable in single write, simultaneously with lock release (cleaning
 * BM_LOCKED flag).  On the other hand, updating of state without holding
 * buffer header lock is restricted to CAS, which insure that BM_LOCKED flag
 * is not set.  Atomic increment/decrement, OR/AND etc. are not allowed.
 *
 * An exception is that if we have the buffer pinned, its tag can't change
 * underneath us, so we can examine the tag without locking the buffer header.
 * Also, in places we do one-time reads of the flags without bothering to
 * lock the buffer header; this is generally for situations where we don't
 * expect the flag bit being tested to be changing.
 *
 * We can't physically remove items from a disk page if another backend has
 * the buffer pinned.  Hence, a backend may need to wait for all other pins
 * to go away.  This is signaled by storing its own PID into
 * wait_backend_pid and setting flag bit BM_PIN_COUNT_WAITER.  At present,
 * there can be only one such waiter per buffer.
 *
 * We use this same struct for local buffer headers, but the locks are not
 * used and not all of the flag bits are useful either. To avoid unnecessary
 * overhead, manipulations of the state field should be done without actual
 * atomic operations (i.e. only pg_atomic_read_u32() and
 * pg_atomic_unlocked_write_u32()).
 *
 * Be careful to avoid increasing the size of the struct when adding or
 * reordering members.  Keeping it below 64 bytes (the most common CPU
 * cache line size) is fairly important for performance.
 */
typedef struct BufferDesc
{
    BufferTag   tag;            /* ID of page contained in buffer */
    int         buf_id;         /* buffer's index number (from 0) */

    /* state of the tag, containing flags, refcount and usagecount */
    pg_atomic_uint32 state;

    int         wait_backend_pid;   /* backend PID of pin-count waiter */
    int         freeNext;       /* link in freelist chain */

    LWLock      content_lock;   /* to lock access to buffer contents */
} BufferDesc;
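The header comment says that, without the header lock, the state word may only be updated by a compare-and-swap that also confirms BM_LOCKED is clear. The following is a minimal sketch of that rule using C11 atomics rather than PostgreSQL's pg_atomic API. The `sketch_` names are hypothetical, and the assumption that the reference count lives in the low 18 bits comes from PostgreSQL's buf_internals.h of this era, not from the text quoted above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* BM_LOCKED matches the flag value quoted above; the rest is assumed. */
#define SKETCH_BM_LOCKED     (1U << 22)
#define SKETCH_REFCOUNT_ONE  1U
#define SKETCH_REFCOUNT_MASK ((1U << 18) - 1)   /* assumed: low 18 bits */

/*
 * Try once to bump the reference count with a single CAS.
 * Returns false if the header lock bit is set or if another writer
 * changed the word in between; a real caller would wait and retry.
 */
static bool
sketch_try_pin(_Atomic uint32_t *state)
{
    uint32_t old_state = atomic_load(state);

    if (old_state & SKETCH_BM_LOCKED)
        return false;   /* the lock holder may rewrite the whole word */

    /* The CAS only succeeds if the word still equals old_state,
     * which we know has BM_LOCKED clear. */
    return atomic_compare_exchange_strong(state, &old_state,
                                          old_state + SKETCH_REFCOUNT_ONE);
}

/* Extract the reference count from the packed state word. */
static uint32_t
sketch_refcount(_Atomic uint32_t *state)
{
    return atomic_load(state) & SKETCH_REFCOUNT_MASK;
}
```

A plain atomic fetch-and-add would not be safe here, because it could modify the word while a lock holder is preparing its single-write update; the CAS fails in that case instead.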

BufferTag
The buffer tag identifies which disk block the buffer contains.

/*
 * Buffer tag identifies which disk block the buffer contains.
 *
 * Note: the BufferTag data must be sufficient to determine where to write the
 * block, without reference to pg_class or pg_tablespace entries.  It's
 * possible that the backend flushing the buffer doesn't even believe the
 * relation is visible yet (its xact may have started before the xact that
 * created the rel).  The storage manager must be able to cope anyway.
 *
 * Note: if there's any pad bytes in the struct, INIT_BUFFERTAG will have
 * to be fixed to zero them, since this struct is used as a hash key.
 */
typedef struct buftag
{
    RelFileNode rnode;          /* physical relation identifier */
    ForkNumber  forkNum;
    BlockNumber blockNum;       /* blknum relative to begin of reln */
} BufferTag;
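The note about pad bytes matters because a hash computed over the raw bytes of a struct also covers any padding. The sketch below illustrates the idea with a simplified stand-in for BufferTag. The struct layout, the `sketch_` names, and the FNV-1a hash are assumptions chosen for illustration; they are not PostgreSQL's actual definitions or hash function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for BufferTag; field types are assumed. */
typedef struct SketchTag
{
    uint32_t spcNode;   /* tablespace */
    uint32_t dbNode;    /* database */
    uint32_t relNode;   /* relation */
    int32_t  forkNum;
    uint32_t blockNum;
} SketchTag;

/* Zero the whole struct first so that any padding bytes are
 * deterministic, then fill in the fields. */
static void
sketch_init_tag(SketchTag *tag, uint32_t spc, uint32_t db, uint32_t rel,
                int32_t fork, uint32_t blk)
{
    memset(tag, 0, sizeof(*tag));
    tag->spcNode = spc;
    tag->dbNode = db;
    tag->relNode = rel;
    tag->forkNum = fork;
    tag->blockNum = blk;
}

/* FNV-1a over the raw bytes of the tag, padding included. */
static uint32_t
sketch_hash_tag(const SketchTag *tag)
{
    const unsigned char *p = (const unsigned char *) tag;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < sizeof(*tag); i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}
```

Two tags built from the same field values then compare equal byte for byte and hash to the same bucket, which is what a lookup keyed on the tag relies on.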

II. Source code walkthrough

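ReadBuffer_common holds the logic shared by all ReadBuffer variants. It works as follows:
1. Initialize local variables and evaluate the basic conditions (is the relation being extended, isExtend? is this a temp-table buffer, isLocalBuf?)
2. For a temp table, call LocalBufferAlloc to get the buffer descriptor; otherwise call BufferAlloc.
Both calls also report whether the block was found in the cache (variable found)
3. If the block was found in the cache
3.1 If not extending, update statistics, lock the buffer if the mode requires it, and return
3.2 If extending, get the block
3.2.1 If PageIsNew returns false, raise an error
3.2.2 For a local buffer (temp table), only adjust the flags
3.2.3 For a shared buffer, clear BM_VALID and call StartBufferIO, retrying until it succeeds
4. If the block was not found in the cache (or case 3.2 fell through), get the block
4.1 If extending, zero-fill the buffer and call smgrextend to extend the relation
4.2 For an ordinary read
4.2.1 If the mode is RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK, zero-fill the buffer
4.2.2 Otherwise read the block through smgr (the storage manager), track I/O time if requested, and check the page for garbage data
5. The relation has now been extended or the block has been read
5.1 Lock the buffer if the mode requires it
5.2 For a temp table, only adjust the flags; otherwise set BM_VALID, terminate the I/O, and wake up any waiters
5.3 Update statistics
5.4 Return the buffer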
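The branching in steps 3 and 4 can be summarized as a small decision function. This is only a sketch of the control flow described in the list: the enum, its member names, and `sketch_classify` are hypothetical and do not appear in PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>

/* Which branch of the outline a call ends up in (names are hypothetical). */
typedef enum SketchPath
{
    SKETCH_PATH_HIT_RETURN,      /* step 3.1: found, not extending */
    SKETCH_PATH_HIT_REEXTEND,    /* step 3.2: found while extending */
    SKETCH_PATH_MISS_EXTEND,     /* step 4.1: zero-fill and smgrextend */
    SKETCH_PATH_MISS_ZERO_FILL,  /* step 4.2.1: RBM_ZERO_* modes */
    SKETCH_PATH_MISS_READ        /* step 4.2.2: smgrread and verify */
} SketchPath;

/*
 * found      - block was already in the buffer cache
 * is_extend  - caller asked for P_NEW
 * zero_mode  - mode is RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK
 */
static SketchPath
sketch_classify(bool found, bool is_extend, bool zero_mode)
{
    if (found && !is_extend)
        return SKETCH_PATH_HIT_RETURN;
    if (found)
        return SKETCH_PATH_HIT_REEXTEND;   /* then continues as 4.1 */
    if (is_extend)
        return SKETCH_PATH_MISS_EXTEND;
    if (zero_mode)
        return SKETCH_PATH_MISS_ZERO_FILL;
    return SKETCH_PATH_MISS_READ;
}
```

In these terms, the two traced scenarios in section III are `sketch_classify(false, false, false)` for the first run and `sketch_classify(true, false, false)` for the second.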

/*
 * ReadBuffer_common -- common logic for all ReadBuffer variants
 *
 * *hit is set to true if the request was satisfied from shared buffer cache.
 */
static Buffer
ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
                  BlockNumber blockNum, ReadBufferMode mode,
                  BufferAccessStrategy strategy, bool *hit)
{
    BufferDesc *bufHdr;         // buffer descriptor
    Block       bufBlock;       // the corresponding block
    bool        found;          // found in the cache?
    bool        isExtend;       // extending the relation?
    bool        isLocalBuf = SmgrIsTemp(smgr);  // local buffer (temp table)?

    *hit = false;

    /* Make sure we will have room to remember the buffer pin */
    ResourceOwnerEnlargeBuffers(CurrentResourceOwner);

    // P_NEW means the relation must be extended
    isExtend = (blockNum == P_NEW);

    TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
                                       smgr->smgr_rnode.node.spcNode,
                                       smgr->smgr_rnode.node.dbNode,
                                       smgr->smgr_rnode.node.relNode,
                                       smgr->smgr_rnode.backend,
                                       isExtend);

    /* Substitute proper block number if caller asked for P_NEW */
    if (isExtend)
        blockNum = smgrnblocks(smgr, forkNum);

    if (isLocalBuf)
    {
        // local buffer (temp table)
        bufHdr = LocalBufferAlloc(smgr, forkNum, blockNum, &found);
        if (found)
            pgBufferUsage.local_blks_hit++;
        else if (isExtend)
            pgBufferUsage.local_blks_written++;
        else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
                 mode == RBM_ZERO_ON_ERROR)
            pgBufferUsage.local_blks_read++;
    }
    else
    {
        // shared buffer (not a temp table)
        /*
         * lookup the buffer.  IO_IN_PROGRESS is set if the requested block is
         * not currently in memory.
         */
        bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
                             strategy, &found);
        if (found)
            pgBufferUsage.shared_blks_hit++;        // cache hit
        else if (isExtend)
            pgBufferUsage.shared_blks_written++;    // new block
        else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
                 mode == RBM_ZERO_ON_ERROR)
            pgBufferUsage.shared_blks_read++;       // block will be read
    }

    /* At this point we do NOT hold any locks. */

    /* if it was already in the buffer pool, we're done */
    if (found)
    {
        if (!isExtend)
        {
            /* Just need to update stats before we exit */
            *hit = true;
            VacuumPageHit++;

            if (VacuumCostActive)
                VacuumCostBalance += VacuumCostPageHit;

            TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
                                              smgr->smgr_rnode.node.spcNode,
                                              smgr->smgr_rnode.node.dbNode,
                                              smgr->smgr_rnode.node.relNode,
                                              smgr->smgr_rnode.backend,
                                              isExtend,
                                              found);

            /*
             * In RBM_ZERO_AND_LOCK mode the caller expects the page to be
             * locked on return.
             */
            if (!isLocalBuf)
            {
                if (mode == RBM_ZERO_AND_LOCK)
                    LWLockAcquire(BufferDescriptorGetContentLock(bufHdr),
                                  LW_EXCLUSIVE);
                else if (mode == RBM_ZERO_AND_CLEANUP_LOCK)
                    LockBufferForCleanup(BufferDescriptorGetBuffer(bufHdr));
            }

            // map the descriptor to its Buffer number and return it
            // #define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
            return BufferDescriptorGetBuffer(bufHdr);
        }

        /*
         * We get here only in the corner case where we are trying to extend
         * the relation but we found a pre-existing buffer marked BM_VALID.
         * This can happen because mdread doesn't complain about reads beyond
         * EOF (when zero_damaged_pages is ON) and so a previous attempt to
         * read a block beyond EOF could have left a "valid" zero-filled
         * buffer.  Unfortunately, we have also seen this case occurring
         * because of buggy Linux kernels that sometimes return an
         * lseek(SEEK_END) result that doesn't account for a recent write. In
         * that situation, the pre-existing buffer would contain valid data
         * that we don't want to overwrite.  Since the legitimate case should
         * always have left a zero-filled buffer, complain if not PageIsNew.
         */
        bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
        if (!PageIsNew((Page) bufBlock))
            ereport(ERROR,
                    (errmsg("unexpected data beyond EOF in block %u of relation %s",
                            blockNum, relpath(smgr->smgr_rnode, forkNum)),
                     errhint("This has been seen to occur with buggy kernels; consider updating your system.")));

        /*
         * We *must* do smgrextend before succeeding, else the page will not
         * be reserved by the kernel, and the next P_NEW call will decide to
         * return the same page.  Clear the BM_VALID bit, do the StartBufferIO
         * call that BufferAlloc didn't, and proceed.
         */
        if (isLocalBuf)
        {
            /* Only need to adjust flags */
            uint32      buf_state = pg_atomic_read_u32(&bufHdr->state);

            Assert(buf_state & BM_VALID);
            buf_state &= ~BM_VALID;
            pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
        }
        else
        {
            /*
             * Loop to handle the very small possibility that someone re-sets
             * BM_VALID between our clearing it and StartBufferIO inspecting
             * it.
             */
            do
            {
                uint32      buf_state = LockBufHdr(bufHdr);

                Assert(buf_state & BM_VALID);
                buf_state &= ~BM_VALID;
                UnlockBufHdr(bufHdr, buf_state);
            } while (!StartBufferIO(bufHdr, true));
        }
    }

    /*
     * if we have gotten to this point, we have allocated a buffer for the
     * page but its contents are not yet valid.  IO_IN_PROGRESS is set for it,
     * if it's a shared buffer.
     *
     * Note: if smgrextend fails, we will end up with a buffer that is
     * allocated but not marked BM_VALID.  P_NEW will still select the same
     * block number (because the relation didn't get any longer on disk) and
     * so future attempts to extend the relation will find the same buffer (if
     * it's not been recycled) but come right back here to try smgrextend
     * again.
     */
    Assert(!(pg_atomic_read_u32(&bufHdr->state) & BM_VALID));   /* spinlock not needed */

    bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);

    if (isExtend)
    {
        /* new buffers are zero-filled */
        MemSet((char *) bufBlock, 0, BLCKSZ);
        /* don't set checksum for all-zero page */
        smgrextend(smgr, forkNum, blockNum, (char *) bufBlock, false);

        /*
         * NB: we're *not* doing a ScheduleBufferTagForWriteback here;
         * although we're essentially performing a write. At least on linux
         * doing so defeats the 'delayed allocation' mechanism, leading to
         * increased file fragmentation.
         */
    }
    else
    {
        /*
         * Read in the page, unless the caller intends to overwrite it and
         * just wants us to allocate a buffer.
         */
        if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
            MemSet((char *) bufBlock, 0, BLCKSZ);
        else
        {
            instr_time  io_start,   // start time, then elapsed time, of the I/O
                        io_time;

            if (track_io_timing)
                INSTR_TIME_SET_CURRENT(io_start);

            // read the block through smgr (the storage manager)
            smgrread(smgr, forkNum, blockNum, (char *) bufBlock);

            if (track_io_timing)
            {
                INSTR_TIME_SET_CURRENT(io_time);
                INSTR_TIME_SUBTRACT(io_time, io_start);
                pgstat_count_buffer_read_time(INSTR_TIME_GET_MICROSEC(io_time));
                INSTR_TIME_ADD(pgBufferUsage.blk_read_time, io_time);
            }

            /* check for garbage data */
            if (!PageIsVerified((Page) bufBlock, blockNum))
            {
                if (mode == RBM_ZERO_ON_ERROR || zero_damaged_pages)
                {
                    // warn and zero the page
                    ereport(WARNING,
                            (errcode(ERRCODE_DATA_CORRUPTED),
                             errmsg("invalid page in block %u of relation %s; zeroing out page",
                                    blockNum,
                                    relpath(smgr->smgr_rnode, forkNum))));
                    MemSet((char *) bufBlock, 0, BLCKSZ);
                }
                else
                    ereport(ERROR,
                            (errcode(ERRCODE_DATA_CORRUPTED),
                             errmsg("invalid page in block %u of relation %s",
                                    blockNum,
                                    relpath(smgr->smgr_rnode, forkNum))));
            }
        }
    }

    /*
     * In RBM_ZERO_AND_LOCK mode, grab the buffer content lock before marking
     * the page as valid, to make sure that no other backend sees the zeroed
     * page before the caller has had a chance to initialize it.
     *
     * Since no-one else can be looking at the page contents yet, there is no
     * difference between an exclusive lock and a cleanup-strength lock. (Note
     * that we cannot use LockBuffer() or LockBufferForCleanup() here, because
     * they assert that the buffer is already valid.)
     */
    if ((mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK) &&
        !isLocalBuf)
    {
        LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_EXCLUSIVE);
    }

    if (isLocalBuf)
    {
        /* Only need to adjust flags */
        uint32      buf_state = pg_atomic_read_u32(&bufHdr->state);

        buf_state |= BM_VALID;
        pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
    }
    else
    {
        /* Set BM_VALID, terminate IO, and wake up any waiters */
        TerminateBufferIO(bufHdr, false, BM_VALID);
    }

    // update statistics
    VacuumPageMiss++;
    if (VacuumCostActive)
        VacuumCostBalance += VacuumCostPageMiss;

    TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
                                      smgr->smgr_rnode.node.spcNode,
                                      smgr->smgr_rnode.node.dbNode,
                                      smgr->smgr_rnode.node.relNode,
                                      smgr->smgr_rnode.backend,
                                      isExtend,
                                      found);

    // #define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
    return BufferDescriptorGetBuffer(bufHdr);
}
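The handling of a page that fails verification is easy to lose inside the nested branches of the function. The sketch below isolates just that policy. It is a standalone illustration: the `sketch_` names are hypothetical, the page is a plain byte array, and the verification result is passed in as a flag instead of calling PageIsVerified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Possible outcomes after a block has been read (names are hypothetical). */
typedef enum SketchVerifyOutcome
{
    SKETCH_PAGE_OK,                   /* page passed verification */
    SKETCH_PAGE_ZEROED_WITH_WARNING,  /* corresponds to ereport(WARNING) */
    SKETCH_PAGE_ERROR                 /* corresponds to ereport(ERROR) */
} SketchVerifyOutcome;

/*
 * Apply the policy: a corrupt page is zeroed only if the caller asked
 * for RBM_ZERO_ON_ERROR or zero_damaged_pages is enabled; otherwise
 * the read fails and the page contents are left untouched.
 */
static SketchVerifyOutcome
sketch_handle_verify(bool page_verified, bool mode_zero_on_error,
                     bool zero_damaged_pages,
                     unsigned char *page, size_t page_size)
{
    if (page_verified)
        return SKETCH_PAGE_OK;

    if (mode_zero_on_error || zero_damaged_pages)
    {
        memset(page, 0, page_size);
        return SKETCH_PAGE_ZEROED_WITH_WARNING;
    }

    return SKETCH_PAGE_ERROR;
}
```

Note that in the real function the error branch does not return: ereport(ERROR) aborts the current operation, so the buffer is never marked BM_VALID.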

III. Trace analysis

Test scenario 1: the block is not in the buffer pool
Script:

16:42:48 (xdb@[local]:5432)testdb=# select * from t1 limit 10;

Start gdb and set a breakpoint

(gdb) b ReadBuffer_common
Breakpoint 1 at 0x876e28: file bufmgr.c, line 711.
(gdb) c
Continuing.

Breakpoint 1, ReadBuffer_common (smgr=0x2b7cce0, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=0, mode=RBM_NORMAL,
    strategy=0x0, hit=0x7ffc7761dfab) at bufmgr.c:711
711     bool        isLocalBuf = SmgrIsTemp(smgr);
(gdb)

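1. Initialize local variables and evaluate the basic conditions (is the relation being extended, isExtend? is this a temp-table buffer, isLocalBuf?)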

(gdb) n
713     *hit = false;
(gdb)
716     ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
(gdb)
718     isExtend = (blockNum == P_NEW);
(gdb)
720     TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
(gdb)
728     if (isExtend)
(gdb)
731     if (isLocalBuf)
(gdb)
745         bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
(gdb)

2. Call BufferAlloc to get the buffer descriptor

(gdb)
747         if (found)
(gdb) p *bufHdr
$1 = {tag = {rnode = {spcNode = 1663, dbNode = 16402, relNode = 51439}, forkNum = MAIN_FORKNUM, blockNum = 0},
  buf_id = 108, state = {value = 2248409089}, wait_backend_pid = 0, freeNext = -2, content_lock = {tranche = 54, state = {
      value = 536870912}, waiters = {head = 2147483647, tail = 2147483647}}}
(gdb) p found
$2 = false
(gdb)
(gdb) n
750             pgBufferUsage.shared_blks_read++; --> update statistics
(gdb)
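The state value 2248409089 printed by gdb can be unpacked by hand. The sketch below does this with the BM_* flag values quoted in section I. The masks for the reference count (low 18 bits) and usage count (the next 4 bits) are assumptions taken from PostgreSQL's buf_internals.h of this era rather than from the article, and the `Sketch`/`sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits as quoted in section I. */
#define BM_DIRTY            (1U << 23)
#define BM_VALID            (1U << 24)
#define BM_TAG_VALID        (1U << 25)
#define BM_IO_IN_PROGRESS   (1U << 26)
#define BM_PERMANENT        (1U << 31)

/* Assumed layout of the counters in the low 22 bits. */
#define SKETCH_REFCOUNT_MASK    ((1U << 18) - 1)
#define SKETCH_USAGECOUNT_MASK  0x003C0000U
#define SKETCH_USAGECOUNT_SHIFT 18

typedef struct SketchState
{
    uint32_t refcount;
    uint32_t usagecount;
    bool     dirty;
    bool     valid;
    bool     tag_valid;
    bool     io_in_progress;
    bool     permanent;
} SketchState;

/* Split a packed state word into its counters and a few of its flags. */
static SketchState
sketch_decode_state(uint32_t state)
{
    SketchState s;

    s.refcount = state & SKETCH_REFCOUNT_MASK;
    s.usagecount = (state & SKETCH_USAGECOUNT_MASK) >> SKETCH_USAGECOUNT_SHIFT;
    s.dirty = (state & BM_DIRTY) != 0;
    s.valid = (state & BM_VALID) != 0;
    s.tag_valid = (state & BM_TAG_VALID) != 0;
    s.io_in_progress = (state & BM_IO_IN_PROGRESS) != 0;
    s.permanent = (state & BM_PERMANENT) != 0;
    return s;
}
```

Under these assumptions, 2248409089 is 0x86040001: BM_PERMANENT, BM_IO_IN_PROGRESS and BM_TAG_VALID are set, BM_VALID is clear, and both counters are 1. That matches the scenario: BufferAlloc has assigned the tag and started the I/O, but the page contents are not yet valid.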

4. The block was not found in the cache, so get the block

756     if (found)
(gdb)
856     Assert(!(pg_atomic_read_u32(&bufHdr->state) & BM_VALID));   /* spinlock not needed */
(gdb)
858     bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
(gdb)
860     if (isExtend)
(gdb) p bufBlock
$4 = (Block) 0x7fe8c240e380

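4.2 For an ordinary read
4.2.1 If the mode is RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK, zero-fill the buffer
4.2.2 Otherwise read the block through smgr (the storage manager), track I/O time if requested, and check the page for garbage data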

(gdb) p mode
$5 = RBM_NORMAL
(gdb)
(gdb) n
880         if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
(gdb)
887             if (track_io_timing)
(gdb)
890             smgrread(smgr, forkNum, blockNum, (char *) bufBlock);
(gdb)
892             if (track_io_timing)
(gdb) p *smgr
$6 = {smgr_rnode = {node = {spcNode = 1663, dbNode = 16402, relNode = 51439}, backend = -1}, smgr_owner = 0x7fe8ee2bc7b8,
  smgr_targblock = 4294967295, smgr_fsm_nblocks = 4294967295, smgr_vm_nblocks = 4294967295, smgr_which = 0,
  md_num_open_segs = {1, 0, 0, 0}, md_seg_fds = {0x2b0dd78, 0x0, 0x0, 0x0}, next_unowned_reln = 0x0}
(gdb) p forkNum
$7 = MAIN_FORKNUM
(gdb) p blockNum
$8 = 0
(gdb) p (char *) bufBlock
$9 = 0x7fe8c240e380 "\001"
(gdb)

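5. The relation has now been extended or the block has been read
5.1 Lock the buffer if the mode requires it
5.2 For a temp table, only adjust the flags; otherwise set BM_VALID, terminate the I/O, and wake up any waiters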

(gdb) n
901             if (!PageIsVerified((Page) bufBlock, blockNum))
(gdb)
932     if ((mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK) &&
(gdb) n
938     if (isLocalBuf)
(gdb)
949         TerminateBufferIO(bufHdr, false, BM_VALID);
(gdb)

5.3 Update statistics
5.4 Return the buffer

(gdb)
952     VacuumPageMiss++;
(gdb)
953     if (VacuumCostActive)
(gdb)
956     TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
(gdb)
964     return BufferDescriptorGetBuffer(bufHdr);
(gdb)
965 }
(gdb)

The returned buf is 109

(gdb)
ReadBufferExtended (reln=0x7fe8ee2bc7a8, forkNum=MAIN_FORKNUM, blockNum=0, mode=RBM_NORMAL, strategy=0x0) at bufmgr.c:666
666     if (hit)
(gdb)
668     return buf;
(gdb) p buf
$10 = 109
(gdb)
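The returned value 109 follows directly from the descriptor's buf_id of 108 and the macro quoted in the source, `BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)`. The sketch below restates that mapping for shared buffers as plain functions. The `Sketch`/`sketch_` names are hypothetical, and treating 0 as the invalid buffer number is an assumption based on PostgreSQL's InvalidBuffer, which the article does not show.

```c
#include <assert.h>

typedef int SketchBuffer;           /* stand-in for PostgreSQL's Buffer */

#define SKETCH_INVALID_BUFFER 0     /* assumed: 0 never names a real buffer */

/* Descriptor index (from 0) to Buffer number (from 1). */
static SketchBuffer
sketch_descriptor_get_buffer(int buf_id)
{
    return buf_id + 1;
}

/* Inverse mapping, valid for shared buffers only. */
static int
sketch_buffer_get_buf_id(SketchBuffer buffer)
{
    return buffer - 1;
}
```

Shifting by one keeps descriptor index 0 usable while still leaving a number that can signal "no buffer".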

Test scenario 2: the block is already in the buffer pool
Run the same SQL statement again; this time the block has already been read into a buffer

(gdb) del
Delete all breakpoints? (y or n) y
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00007fe8ec448903 in __epoll_wait_nocancel () at ../sysdeps/unix/syscall-template.S:81
81  T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) b ReadBuffer_common
Breakpoint 2 at 0x876e28: file bufmgr.c, line 711.
(gdb)

The found variable is true

...
(gdb)
745         bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
(gdb)
747         if (found)
(gdb) p found
$11 = true
(gdb)
(gdb) n
748             pgBufferUsage.shared_blks_hit++;
(gdb)
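The two traces increment different counters: shared_blks_read on the miss and shared_blks_hit here. The sketch below mirrors the if/else chain that picks the counter. The struct, enum, and function names are hypothetical stand-ins for pgBufferUsage and ReadBufferMode, reduced to what the example needs.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for ReadBufferMode (names are hypothetical). */
typedef enum SketchMode
{
    SKETCH_RBM_NORMAL,
    SKETCH_RBM_ZERO_AND_LOCK,
    SKETCH_RBM_ZERO_AND_CLEANUP_LOCK,
    SKETCH_RBM_ZERO_ON_ERROR,
    SKETCH_RBM_NORMAL_NO_LOG
} SketchMode;

/* Stand-in for the three shared_blks_* counters in pgBufferUsage. */
typedef struct SketchUsage
{
    long blks_hit;
    long blks_read;
    long blks_written;
} SketchUsage;

/* Same branch order as in ReadBuffer_common: hit, then extend, then read. */
static void
sketch_count(SketchUsage *usage, bool found, bool is_extend, SketchMode mode)
{
    if (found)
        usage->blks_hit++;
    else if (is_extend)
        usage->blks_written++;
    else if (mode == SKETCH_RBM_NORMAL || mode == SKETCH_RBM_NORMAL_NO_LOG ||
             mode == SKETCH_RBM_ZERO_ON_ERROR)
        usage->blks_read++;
    /* A miss in one of the zero-fill modes touches none of these counters. */
}
```

The last case is worth noticing: when the caller intends to overwrite the page, no disk read happens, so nothing is counted as read.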

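Execution enters the corresponding branch
3. If the block was found in the cache
3.1 If not extending, update statistics, lock the buffer if the mode requires it, and return
3.2 If extending, get the block
3.2.1 If PageIsNew returns false, raise an error
3.2.2 For a local buffer (temp table), only adjust the flags
3.2.3 For a shared buffer, clear BM_VALID and call StartBufferIO, retrying until it succeeds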

(gdb)
756     if (found)
(gdb)
758         if (!isExtend)
(gdb)
761             *hit = true;
(gdb)
762             VacuumPageHit++;
(gdb)
764             if (VacuumCostActive)
(gdb)
767             TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
(gdb)
779             if (!isLocalBuf)
(gdb)
781                 if (mode == RBM_ZERO_AND_LOCK)
(gdb)
784                 else if (mode == RBM_ZERO_AND_CLEANUP_LOCK)
(gdb)
788             return BufferDescriptorGetBuffer(bufHdr);
(gdb)
965 }
(gdb)

That concludes this look at what the ReadBuffer_common function does in PostgreSQL. Reading the source alongside a debugger trace like the ones above is a good way to confirm the behavior for yourself.
