Kafka Source Code Analysis: LogSegment

This post analyzes the source code of Kafka's LogSegment.

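After working through the LogManager and Log source code step by step, you find that every log operation is ultimately implemented on LogSegment. Reading, writing, recovering, flushing, and deleting a segment are all handled here. Like the other classes, the LogSegment code lives in the log directory of the source tree.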

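LogSegment is the smallest unit of operation on a log segment. It acts directly on messages and is responsible for reading, writing, and appending the actual message data.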

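LogSegment is essentially a proxy for the FileMessageSet class: everything LogSegment does is ultimately carried out in FileMessageSet. FileMessageSet in turn builds on ByteBufferMessageSet, the class that holds the message data, and reads and writes messages through a FileChannel object.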

Let's look at the main methods.

　　Initialization

class LogSegment(val log: FileMessageSet,     // this is the primary constructor
                 val index: OffsetIndex,
                 val baseOffset: Long,
                 val indexIntervalBytes: Int,
                 val rollJitterMs: Long,
                 time: Time) extends Logging {

  var created = time.milliseconds

  /* the number of bytes since we last added an entry in the offset index */
  private var bytesSinceLastIndexEntry = 0

  // This is the constructor that Log calls. It creates the index and the log file
  // from the topic-partition directory and startOffset.
  def this(dir: File, startOffset: Long, indexIntervalBytes: Int, maxIndexSize: Int, rollJitterMs: Long, time: Time) =
    this(new FileMessageSet(file = Log.logFilename(dir, startOffset)),
         new OffsetIndex(file = Log.indexFilename(dir, startOffset), baseOffset = startOffset, maxIndexSize = maxIndexSize),
         startOffset,
         indexIntervalBytes,
         rollJitterMs,
         time)
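
To make the secondary constructor more concrete, here is a minimal sketch of how a segment's start offset could map to its two file names. It assumes the usual on-disk layout in which the offset is zero-padded to 20 digits and suffixed with .log or .index. SegmentFileNames and its methods are hypothetical stand-ins for Log.logFilename and Log.indexFilename, not the Kafka code itself.

```scala
import java.io.File

// Hypothetical stand-in for Log.logFilename / Log.indexFilename.
// Assumption: file names are the start offset zero-padded to 20 digits.
object SegmentFileNames {
  // Zero-pad the offset so that file names sort in offset order.
  def prefix(offset: Long): String = "%020d".format(offset)

  def logFile(dir: File, offset: Long): File =
    new File(dir, prefix(offset) + ".log")

  def indexFile(dir: File, offset: Long): File =
    new File(dir, prefix(offset) + ".index")
}
```

Because both names share the same prefix, the log file and its index can always be paired up from the directory listing alone.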

　　The append function, which adds messages

def append(offset: Long, messages: ByteBufferMessageSet) {
  if (messages.sizeInBytes > 0) { // only append when the message set is non-empty
    trace("Inserting %d bytes at offset %d at position %d".format(messages.sizeInBytes, offset, log.sizeInBytes()))
    // append an entry to the index (if needed)
    if(bytesSinceLastIndexEntry > indexIntervalBytes) {
      index.append(offset, log.sizeInBytes())
      this.bytesSinceLastIndexEntry = 0
    }
    // append the messages
    log.append(messages) // calls FileMessageSet.append to write the messages; the bytes themselves are ultimately written by ByteBufferMessageSet methods
    this.bytesSinceLastIndexEntry += messages.sizeInBytes
  }
}
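
The effect of the bytesSinceLastIndexEntry check is that the offset index is sparse: an entry is written only after more than indexIntervalBytes of messages have accumulated. The following is a simplified in-memory model of that rule, not the real class. SparseIndexModel is a hypothetical name, and each message set is represented only by its size in bytes.

```scala
import scala.collection.mutable.ArrayBuffer

// Simplified in-memory model of the indexing rule in LogSegment.append.
// Assumption: a message set is reduced to its size in bytes.
class SparseIndexModel(indexIntervalBytes: Int) {
  val indexEntries = ArrayBuffer.empty[(Long, Int)] // (offset, file position)
  private var logSize = 0                  // stands in for log.sizeInBytes
  private var bytesSinceLastIndexEntry = 0

  def append(offset: Long, sizeInBytes: Int): Unit = {
    if (sizeInBytes > 0) {
      // Add an index entry only once enough bytes have accumulated.
      if (bytesSinceLastIndexEntry > indexIntervalBytes) {
        indexEntries += ((offset, logSize))
        bytesSinceLastIndexEntry = 0
      }
      logSize += sizeInBytes
      bytesSinceLastIndexEntry += sizeInBytes
    }
  }
}
```

With indexIntervalBytes = 100 and five 60-byte appends at offsets 0 to 4, only offsets 2 and 4 get index entries, at file positions 120 and 240.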

　　The flush function, which flushes messages to disk

def flush() {
  LogFlushStats.logFlushTimer.time {
    log.flush()   // delegates to FileMessageSet; FileMessageSet.flush ultimately calls channel.force to flush to the storage device
    index.flush() // likewise, delegates to OffsetIndex.flush
  }
}
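
For readers unfamiliar with channel.force, here is a small standalone sketch of what forcing a FileChannel looks like using only java.nio. ForceDemo and writeAndForce are hypothetical names for illustration; the temporary file stands in for a segment's log file.

```scala
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.{Files, StandardOpenOption}

object ForceDemo {
  // Writes bytes to a temp file, forces them to the storage device,
  // and returns the file size observed after the flush.
  def writeAndForce(payload: Array[Byte]): Long = {
    val path = Files.createTempFile("segment-demo", ".log")
    val channel = FileChannel.open(path, StandardOpenOption.WRITE)
    try {
      val buf = ByteBuffer.wrap(payload)
      while (buf.hasRemaining) channel.write(buf)
      // true = flush file metadata as well as file content
      channel.force(true)
      channel.size()
    } finally {
      channel.close()
      Files.deleteIfExists(path)
    }
  }
}
```

Until force is called, written bytes may still sit in the operating system's page cache, which is why a flush step exists separately from append.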

　　The read function, which reads messages

def read(startOffset: Long, maxOffset: Option[Long], maxSize: Int): FetchDataInfo = {
  if(maxSize < 0)
    throw new IllegalArgumentException("Invalid max size for log read (%d)".format(maxSize))

  val logSize = log.sizeInBytes // this may change, need to save a consistent copy
  val startPosition = translateOffset(startOffset) // look up the file position for startOffset

  // if the start position is already off the end of the log, return null
  if(startPosition == null) // no file position for this offset
    return null

  val offsetMetadata = new LogOffsetMetadata(startOffset, this.baseOffset, startPosition.position) // build the offset metadata

  // if the size is zero, still return a log segment but with zero size
  if(maxSize == 0) // a max read size of 0 returns an empty message set
    return FetchDataInfo(offsetMetadata, MessageSet.Empty)

  // calculate the length of the message set to read based on whether or not they gave us a maxOffset
  val length = // the maximum number of bytes to read
    maxOffset match {
      case None => // no maxOffset given, so use maxSize
        // no max offset, just use the max size they gave unmolested
        maxSize
      case Some(offset) => { // maxOffset given, so work out the corresponding byte length
        // there is a max offset, translate it to a file position and use that to calculate the max read size
        if(offset < startOffset) // a maxOffset below startOffset is an error
          throw new IllegalArgumentException("Attempt to read with a maximum offset (%d) less than the start offset (%d).".format(offset, startOffset))
        val mapping = translateOffset(offset, startPosition.position) // file position of maxOffset, searching from the start position
        val endPosition =
          if(mapping == null)
            logSize // the max offset is off the end of the log, use the end of the file
          else
            mapping.position
        min(endPosition - startPosition.position, maxSize) // bytes between the start and end positions, capped at maxSize
      }
    }
  FetchDataInfo(offsetMetadata, log.read(startPosition.position, length)) // read that many bytes via FileMessageSet.read and wrap the result in FetchDataInfo
}
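
The length calculation in read is easier to follow when it is isolated from the file handling. The sketch below restates it as two pure functions under simplifying assumptions: positions are plain Ints, and a missing mapping is modelled with Option instead of null. ReadLength and its method names are hypothetical.

```scala
object ReadLength {
  // Resolve the end position for a given maxOffset: if the offset maps
  // to no position (it is past the end of the log), use the log size.
  def resolveEnd(mappingPosition: Option[Int], logSize: Int): Int =
    mappingPosition.getOrElse(logSize)

  // endPosition = None means the caller gave no maxOffset at all.
  def readLength(startPosition: Int, maxSize: Int, endPosition: Option[Int]): Int =
    endPosition match {
      case None      => maxSize
      case Some(end) => math.min(end - startPosition, maxSize)
    }
}
```

For example, starting at position 100 with maxSize 4096 and a maxOffset that maps to position 600, the read covers 500 bytes; with maxSize 300 it is capped at 300.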

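　　The read function turns offsets into file positions to work out how many bytes to read, so a single call can return the messages for a whole range of offsets.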

private[log] def translateOffset(offset: Long, startingFilePosition: Int = 0): OffsetPosition = {  // maps an offset to a position in the log file
  val mapping = index.lookup(offset) // look up the nearest position in the index
  log.searchFor(offset, max(mapping.position, startingFilePosition)) // search the FileMessageSet from there for the exact position
}
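
translateOffset is a two-step lookup: the sparse index gives a nearby starting position, and a forward scan of the log finds the exact message. Below is a rough in-memory model of that idea. It assumes the index returns the last entry at or below the target and the scan returns the first message at or above it. OffsetLookup is a hypothetical name, Option replaces null, and a linear takeWhile stands in for the index's real search.

```scala
object OffsetLookup {
  final case class OffsetPosition(offset: Long, position: Int)

  // Index lookup: the last entry whose offset is <= target, or the
  // start of the file when the index has no such entry.
  def lookup(index: Vector[OffsetPosition], baseOffset: Long, target: Long): OffsetPosition =
    index.takeWhile(_.offset <= target).lastOption
      .getOrElse(OffsetPosition(baseOffset, 0))

  // Forward scan of the log from startPosition for the first message
  // whose offset is >= target; None if the scan runs off the end.
  def searchFor(log: Vector[OffsetPosition], target: Long, startPosition: Int): Option[OffsetPosition] =
    log.find(m => m.position >= startPosition && m.offset >= target)

  def translateOffset(index: Vector[OffsetPosition], log: Vector[OffsetPosition],
                      baseOffset: Long, target: Long,
                      startingFilePosition: Int = 0): Option[OffsetPosition] = {
    val mapping = lookup(index, baseOffset, target)
    searchFor(log, target, math.max(mapping.position, startingFilePosition))
  }
}
```

The sparse index keeps the scan short: it only has to cover the messages written since the nearest index entry rather than the whole segment.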

　　The recover function. This is the final delegate in the chain of calls that Kafka makes during its startup checks.

def recover(maxMessageSize: Int): Int = {
  // discard the existing index and rebuild it while scanning the log
  index.truncate()
  index.resize(index.maxIndexSize)
  var validBytes = 0
  var lastIndexEntry = 0
  val iter = log.iterator(maxMessageSize)
  try {
    while(iter.hasNext) {
      val entry = iter.next
      entry.message.ensureValid() // throws InvalidMessageException on a corrupt message
      if(validBytes - lastIndexEntry > indexIntervalBytes) {
        // we need to decompress the message, if required, to get the offset of the first uncompressed message
        val startOffset =
          entry.message.compressionCodec match {
            case NoCompressionCodec =>
              entry.offset
            case _ =>
              ByteBufferMessageSet.decompress(entry.message).head.offset
          }
        index.append(startOffset, validBytes)
        lastIndexEntry = validBytes
      }
      validBytes += MessageSet.entrySize(entry.message)
    }
  } catch {
    case e: InvalidMessageException =>
      logger.warn("Found invalid messages in log segment %s at byte offset %d: %s.".format(log.file.getAbsolutePath, validBytes, e.getMessage))
  }
  // everything after the last valid message is cut off
  val truncated = log.sizeInBytes - validBytes
  log.truncateTo(validBytes)
  index.trimToValidSize()
  truncated // number of bytes removed from the log
}
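
The recovery loop can be modelled without any files to show what it returns. In this hedged sketch each log entry is reduced to its offset, its size, and a flag saying whether its checksum would pass; compression is ignored. RecoverModel and Entry are hypothetical names, not Kafka classes.

```scala
object RecoverModel {
  // A log entry reduced to what recovery needs. Assumption: `valid`
  // stands in for the ensureValid() check in the real code.
  final case class Entry(offset: Long, sizeInBytes: Int, valid: Boolean)

  // Returns (validBytes, truncatedBytes, rebuiltIndex).
  def recover(entries: Seq[Entry], indexIntervalBytes: Int): (Int, Int, Vector[(Long, Int)]) = {
    val totalBytes = entries.map(_.sizeInBytes).sum
    var validBytes = 0
    var lastIndexEntry = 0
    var index = Vector.empty[(Long, Int)]
    // Stop at the first corrupt entry; everything after it is discarded.
    for (e <- entries.iterator.takeWhile(_.valid)) {
      if (validBytes - lastIndexEntry > indexIntervalBytes) {
        index = index :+ ((e.offset, validBytes))
        lastIndexEntry = validBytes
      }
      validBytes += e.sizeInBytes
    }
    (validBytes, totalBytes - validBytes, index)
  }
}
```

With five 60-byte entries, the fourth one corrupt, and indexIntervalBytes = 100, the model keeps 180 bytes, truncates 120, and rebuilds a single index entry for offset 2 at position 120.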

　　The segment delete function

def delete() {
  val deletedLog = log.delete()     // ultimately deletes the file and releases the in-memory resources; implemented in FileMessageSet
  val deletedIndex = index.delete() // likewise for the index file
  if(!deletedLog && log.file.exists)
    throw new KafkaStorageException("Delete of log " + log.file.getName + " failed.")
  if(!deletedIndex && index.file.exists)
    throw new KafkaStorageException("Delete of index " + index.file.getName + " failed.")
}
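
The two checks in delete combine the delete result with an exists test, so a file that was already gone is not treated as a failure. A minimal standalone sketch of that pattern follows. SegmentDelete and deleteOrFail are hypothetical names, and a plain IOException is used in place of KafkaStorageException.

```scala
import java.io.{File, IOException}

object SegmentDelete {
  // A false return from File.delete is only an error if the file is
  // still there; it may simply never have existed.
  def deleteOrFail(file: File, what: String): Unit = {
    val deleted = file.delete()
    if (!deleted && file.exists)
      throw new IOException("Delete of " + what + " " + file.getName + " failed.")
  }
}
```

Calling deleteOrFail twice on the same file is therefore harmless: the second call finds nothing to delete and returns quietly.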

That covers the main functions of LogSegment.
