前言

去年暑假完成了 CMU15-445 Fall 2019 的四个实验，分别对应下述博客：

今年打算接着完成 Fall 2020 的四个实验，同时解读一下课程组写好的那一部分代码，比如数据存储和页面布局的代码，加深自己对数据库系统的理解。

环境搭建

在 GitHub 上新建一个私有仓库，命名为 CMU15445-Fall2020，然后将官方仓库克隆到本地：

git clone git@github.com:cmu-db/bustub.git ./cmu15445-fall2020

cd cmu15445-fall2020

目前官方的代码应该更新到 Fall2023 了，需要回滚到 Fall2020，并将代码传到自己的远程仓库：

git reset --hard 444765a

git remote rm origin

git remote add origin git@github.com:zhiyiYo/cmu15445-fall2020.git #添加自己仓库作为远程分支

git push -u origin main

实验环境为 Ubuntu20.04 虚拟机，所以执行下述代码安装依赖包：

sudo build_support/packages.sh

和去年一样，因为 googletest 仓库将 master 分支重命名为 main 了，所以需要将 build_support/gtest_CMakeLists.txt.in 的内容改为：

cmake_minimum_required(VERSION 3.8)

project(googletest-download NONE)

include(ExternalProject)

ExternalProject_Add(googletest

        GIT_REPOSITORY git@github.com:google/googletest.git

        GIT_TAG main

        SOURCE_DIR "${CMAKE_BINARY_DIR}/googletest-src"

        BINARY_DIR "${CMAKE_BINARY_DIR}/googletest-build"

        CONFIGURE_COMMAND ""

        BUILD_COMMAND ""

        INSTALL_COMMAND ""

        TEST_COMMAND ""

        )

最后编译一下，如果编译成功就说明环境搭建完成：

mkdir build

cd build

cmake ..

make

缓存池

由于磁盘读写速度远慢于内存，所以数据库会在内存中开辟一块连续空间，用于存储最近访问的页，这块空间称为缓存池。执行引擎不会直接从磁盘读取页，而是向缓存池要。如果缓存池中没有想要的页，就会从磁盘读入到池中，然后返回给执行引擎。页内数据更新后也不会立即写入磁盘，而是打上了一个 Dirty 标志位并暂存在缓存池中，等到时机成熟再写入。

缓冲池的本质是一个数组，只能存一定数量的页。如果执行引擎想要的 Page 不在缓存池中，且缓存池已满，这时候需要从中踢出一个页来腾出空间给新 Page，被踢出的 Dirty 页需要被保存到磁盘中来保证数据一致性。需要指出的是，不是任何 Page 都能被换出，那些正在被使用的页不能换出，而判断一个页是否正被使用的依据是 Page 内部保存的 Pin/Reference 计数器，只要计数器的值大于 0，就说明至少有一个线程在使用它。

缓冲池内部维护着一个 page_id 到 frame_id 的映射表，用来指出页和内部数组索引的映射关系。同时内部还有一个互斥锁来保证并发安全，对缓存池的增删改查都需要上锁。

实验要求

任务 1：LRU Replacement Policy

Fall2019 要求实现的是时钟替换算法，而 Fall2020 则改成了 LRU 替换算法，实现方式一般使用双向链表 + 哈希表，C艹可以直接用标准库中的 std::list 和 std::unordered_map。双向链表中存放允许被换出的 frame_id，哈希表中存 frame_id 及其对应的双向链表迭代器，这样可以实现 \(O(1)\) 复杂度的读写。链表的表头处存放最近访问的 frame_id，而尾处则是距离上次访问时间最远的的 frame_id。

class LRUReplacer : public Replacer {

 public:

  /**

   * Create a new LRUReplacer.

   * @param num_pages the maximum number of pages the LRUReplacer will be required to store

   */

  explicit LRUReplacer(size_t num_pages);

  ~LRUReplacer() override;

  /**

   * Remove the victim frame as defined by the replacement policy.

   * @param[out] frame_id id of frame that was removed, nullptr if no victim was found

   * @return true if a victim frame was found, false otherwise

   */

  bool Victim(frame_id_t *frame_id) override;

  /**

   * Pins a frame, indicating that it should not be victimized until it is unpinned.

   * @param frame_id the id of the frame to pin

   */

  void Pin(frame_id_t frame_id) override;

  /**

   * Unpins a frame, indicating that it can now be victimized.

   * @param frame_id the id of the frame to unpin

   */

  void Unpin(frame_id_t frame_id) override;

  /** @return the number of elements in the replacer that can be victimized */

  size_t Size() override;

 private:

  size_t num_pages_;

  std::list<frame_id_t> list_;

  std::unordered_map<frame_id_t, std::list<frame_id_t>::iterator> map_;

  std::shared_mutex mutex_;

};

具体实现如下所示，可以看到 LRUReplacer 对缓冲池中存了多少页以及存了哪些页是一无所知的，它只关心能被换出的 frame_id，外界通过调用 LURReplacer::Unpin() 添加一个能被换出的 frame_id，调用 LRUReplacer::Pin() 来移除一个 frame_id：

LRUReplacer::LRUReplacer(size_t num_pages) : num_pages_(num_pages) {}

LRUReplacer::~LRUReplacer() = default;

bool LRUReplacer::Victim(frame_id_t *frame_id) {

  lock_guard<shared_mutex> lock(mutex_);

  if (Size() == 0) {

    return false;

  }

  *frame_id = list_.back();

  list_.pop_back();

  map_.erase(*frame_id);

  return true;

}

void LRUReplacer::Pin(frame_id_t frame_id) {

  lock_guard<shared_mutex> lock(mutex_);

  // frame 需要在缓冲池中

  if (!map_.count(frame_id)) {

    return;

  }

  auto it = map_[frame_id];

  map_.erase(frame_id);

  list_.erase(it);

}

void LRUReplacer::Unpin(frame_id_t frame_id) {

  lock_guard<shared_mutex> lock(mutex_);

  // 缓冲池满了不能插入新的 page，不能重复插入 page

  if (Size() == num_pages_ || map_.count(frame_id)) {

    return;

  }

  list_.push_front(frame_id);

  map_[frame_id] = list_.begin();

}

size_t LRUReplacer::Size() {

  return list_.size();

}

在终端输入命令：

mkdir build

cd build

cmake ..

make lru_replacer_test

./test/lru_replacer_test

测试结果如下：

任务2：Buffer Pool Manager

BufferPoolManager 用于管理缓冲池，内部有一个 DiskManager 来读写磁盘数据，LRUReplacer 执行替换算法。这个类要求我们实现五个函数：

FetchPageImpl(page_id)
NewPageImpl(page_id)
UnpinPageImpl(page_id, is_dirty)
FlushPageImpl(page_id)
DeletePageImpl(page_id)
FlushAllPagesImpl()

下面会一个个实现上述函数。

FetchPageImpl(page_id)

该函数实现了缓冲池的主要功能：向上层提供指定的 page。缓冲池管理器首先在 page_table_ 中查找 page_id 键是否存在：

如果存在就根据 page_id 对应的 frame_id 从缓冲池 pages_ 取出 page
如果不存在就通过 GetVictimFrameId() 函数选择被换出的帧，该函数首先从 free_list_ 中查找缓冲池的空位，如果没找到空位就得靠上一节实现的 LRUReplacer 选出被换出的冤大头

具体代码如下：



Page *BufferPoolManager::FetchPageImpl(page_id_t page_id) {

  lock_guard<mutex> lock(latch_);

  // 1.     Search the page table for the requested page (P).

  Page *page;

  auto it = page_table_.find(page_id);

  // 1.1    If P exists, pin it and return it immediately.

  if (it != page_table_.end()) {

    auto frame_id = it->second;

    page = &pages_[frame_id];

    replacer_->Pin(frame_id);

    page->pin_count_++;

    return page;

  }

  // 1.2    If P does not exist, find a replacement page (R) from either the free list or the replacer.

  //        Note that pages are always found from the free list first.

  auto frame_id = GetVictimFrameId();

  if (frame_id == INVALID_PAGE_ID) {

    return nullptr;

  }

  // 2.     If R is dirty, write it back to the disk.

  page = &pages_[frame_id];

  if (page->IsDirty()) {

    disk_manager_->WritePage(page->page_id_, page->data_);

  }

  // 3.     Delete R from the page table and insert P.

  page_table_.erase(page->page_id_);

  page_table_[page_id] = frame_id;

  // 4.     Update P's metadata, read in the page content from disk, and then return a pointer to P.

  disk_manager_->ReadPage(page_id, page->data_);

  page->update(page_id, 1, false);

  replacer_->Pin(frame_id);

  return page;

}

frame_id_t BufferPoolManager::GetVictimFrameId() {

  frame_id_t frame_id = INVALID_PAGE_ID;

  if (!free_list_.empty()) {

    frame_id = free_list_.front();

    free_list_.pop_front();

  } else {

    replacer_->Victim(&frame_id);

  }

  return frame_id;

}

上述代码中还用了一个 Page::update 辅助函数，用于更新 page 的元数据：

/**

* update the meta data of page

* @param page_id the page id

* @param pin_count the pin count

* @param is_dirty is page dirty

* @param reset_memory whether to reset the memory of page

*/

void update(page_id_t page_id, int pin_count, bool is_dirty, bool reset_memory = false) {

  page_id_ = page_id;

  pin_count_ = pin_count;

  is_dirty_ = is_dirty;

  if (reset_memory) {

    ResetMemory();

  }

}

NewPageImpl(page_id)

该函数在缓冲池中插入一个新页，如果缓冲池中的所有页面都正在被线程访问，插入失败，否则靠 GetVictimFrameId() 计算插入位置：



Page *BufferPoolManager::NewPageImpl(page_id_t *page_id) {

  // 0.   Make sure you call DiskManager::AllocatePage!

  lock_guard<mutex> lock(latch_);

  // 1.   If all the pages in the buffer pool are pinned, return nullptr.

  auto frame_id = GetVictimFrameId();

  if (frame_id == INVALID_PAGE_ID) {

    return nullptr;

  }

  // 2.   Pick a victim page P from either the free list or the replacer. Always pick from the free list first.

  auto page = &pages_[frame_id];

  if (page->IsDirty()) {

    disk_manager_->WritePage(page->page_id_, page->data_);

  }

  // 3.   Update P's metadata, zero out memory and add P to the page table.

  *page_id = disk_manager_->AllocatePage();

  page_table_.erase(page->page_id_);

  page_table_[*page_id] = frame_id;

  page->update(*page_id, 1, false, true);

  replacer_->Pin(frame_id);

  // 4.   Set the page ID output parameter. Return a pointer to P.

  return page;

}

DeletePageImpl(page_id)

该函数从缓冲池和数据库文件中删除一个 page，并将其 page_id 设置为 INVALID_PAGE_ID：

bool BufferPoolManager::DeletePageImpl(page_id_t page_id) {

  // 0.   Make sure you call DiskManager::DeallocatePage!

  lock_guard<mutex> lock(latch_);

  // 1.   Search the page table for the requested page (P).

  // 1.   If P does not exist, return true.

  auto it = page_table_.find(page_id);

  if (it == page_table_.end()) {

    return true;

  }

  // 2.   If P exists, but has a non-zero pin-count, return false. Someone is using the page.

  auto frame_id = it->second;

  auto &page = pages_[frame_id];

  if (page.pin_count_ > 0) {

    return false;

  }

  // 3.   Otherwise, P can be deleted. Remove P from the page table, reset its metadata and return it to the free list.

  disk_manager_->DeallocatePage(page_id);

  page_table_.erase(page.page_id_);

  free_list_.push_back(frame_id);

  page.update(INVALID_PAGE_ID, 0, false);

  return true;

}

UnpinPageImpl(page_id, is_dirty)

该函数用以减少对某个页的引用数 pin count，当 pin_count 为 0 时需要将其添加到 LRUReplacer 中：



bool BufferPoolManager::UnpinPageImpl(page_id_t page_id, bool is_dirty) {

  lock_guard<mutex> lock(latch_);

  auto it = page_table_.find(page_id);

  if (it == page_table_.end()) {

    return false;

  }

  auto frame_id = it->second;

  auto &page = pages_[frame_id];

  if (page.pin_count_ <= 0) {

    return false;

  }

  page.is_dirty_ |= is_dirty;

  if (--page.pin_count_ == 0) {

    replacer_->Unpin(frame_id);

  }

  return true;

}

FlushPageImpl(page_id)

该函数将缓冲池中的页写入磁盘以保持同步，这里不管页是否为脏，一律写入磁盘，不然并发的测试用例过不了：

bool BufferPoolManager::FlushPageImpl(page_id_t page_id) {

  // Make sure you call DiskManager::WritePage!

  lock_guard<mutex> lock(latch_);

  auto it = page_table_.find(page_id);

  if (it == page_table_.end()) {

    return false;

  }

  auto &page = pages_[it->second];

  disk_manager_->WritePage(page_id, page.data_);

  page.is_dirty_ = false;

  return true;

}

FlushAllPagesImpl()

该函数将缓冲池中的所有 page 写入磁盘：

void BufferPoolManager::FlushAllPagesImpl() {

  lock_guard<mutex> lock(latch_);

  for (auto &[page_id, frame_id] : page_table_) {

    auto &page = pages_[frame_id];

    if (page.IsDirty()) {

      disk_manager_->WritePage(page_id, page.data_);

      page.is_dirty_ = false;

    }

  }

}

测试

在终端输入指令：

cd build

make buffer_pool_manager_test

./test/buffer_pool_manager_test

# 下面是从 gradescope 扒下来的测试用例

make buffer_pool_manager_concurrency_test

./test/buffer_pool_manager_concurrency_test