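As a message middleware, RocketMQ is often compared with others such as RabbitMQ and Kafka. Personally, though, I think the one it has always measured itself against is Kafka, because both target scenarios that demand very high concurrency and very high performance.

  So it is worth digging into why it achieves high performance and high concurrency. This is a very large topic, and I don't plan to swallow it in one bite. I'll give a rough answer first, and then we'll dig into part of the implementation, as the title says.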

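1. An overview of the low-level building blocks of high-performance, high-concurrency systems

  I don't plan to talk only about how RocketMQ achieves high performance and high concurrency, because the underlying principles are much the same everywhere. RocketMQ is just one implementation of them.

  So what does it take to achieve high performance and high concurrency? Essentially, a service has only a few resources to draw on: CPU, memory, disk, network, and hardware features. There is one more very important point: we are almost always writing application-layer services, so our capabilities depend on the services the operating system provides, and it is those services that make good use of the underlying hardware. That may sound grand, but in practice it simply means making system API calls.

  Next, let's go through the items one by one and see how each contributes to high performance and high concurrency: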

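  First: CPU. The CPU represents the limit of a single machine. If a service can use the CPU effectively and keep utilization steadily above 80%, it is doing very well (note: not the kind of utilization caused by frantic GC). So how do we use the CPU efficiently? Some applications are naturally CPU-bound; a service that does big-number arithmetic, for example, inherently needs a lot of CPU computation. Many I/O-bound applications, on the other hand, tend to have low CPU usage, and for those we should optimize toward using more of it.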

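  Second: memory. Memory is a precious resource because accessing it is very fast. If an application does all of its business processing in memory, it will usually be extremely fast. In most cases more memory also means better performance; Elasticsearch search, for example, needs enough memory to perform well. Besides being convenient to use, memory comes with an important chore: reclaiming it. That work is usually hidden by the programming language, because it is genuinely hard; we generally just need to handle objects sensibly according to the language's rules. In addition, we can load data that would otherwise be read from external devices into memory and keep it there for long-term use, which is a very important optimization. The key to this kind of work is keeping the data consistent, safe, and accurate.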

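  Third: disk. Memory is great, but there is never enough of it; memory usually means limited size. Its counterpart is the disk, which usually means a lot of space and persistent storage. The disk essentially represents the storage limit of a single machine, but it also limits concurrency.

  Fourth: network. Maybe the word isn't quite right, but it is the network that changes the picture. If everything above is about the performance limit of a single machine, the network brings in the performance limit of a distributed system. A very large system is bound to be distributed, so it inevitably relies on the network. There is usually not much we can save here; about all we can do is compress the data. Beyond that, we may simply request more bandwidth or lay new cabling to meet the system's needs. In networking, how to organize the machines on the network well is the top priority, and that often leads back to one of the three topics above.

Finally, apart from the hardware-related skills above, there is one more very important skill: algorithms. Without a good algorithm, all the other optimizations may be a drop in the bucket. (Of course, most of the time we don't need advanced algorithms, because most of the time we are just porters moving data from one place to another.)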

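2. A list of operating system APIs for high performance and high concurrency

  What I said above is mostly theory about how to build an impressive service. We don't have the ability to rework the operating system itself, so that part stays theory. So what can we actually do? As the saying goes, to do a good job one must first sharpen one's tools. The sharp tools here are the low-level APIs the operating system offers.

  I'll list just a few here (these are about all I know):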

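  epoll series: I/O multiplexing, a must for high-concurrency, high-performance network applications. Roughly, it lets a very small number of threads efficiently manage a large number of I/O events and notify the application about them. The main interfaces are epoll_create(), epoll_ctl(), and epoll_wait().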
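From Java we don't call epoll_create() and friends directly; the usual route is a java.nio Selector, which the JDK backs with epoll on Linux. Below is a minimal sketch, assuming a loopback connection and a hypothetical class name SelectorSketch, of one thread handling both the accept event and the read event through a single Selector. It is only meant to show the shape of the event loop, not how netty or RocketMQ structure theirs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical demo class: a single thread multiplexes accept and read events.
final class SelectorSketch {

    // Starts a loopback server, sends msg from a client, and returns what the
    // selector loop read from the accepted connection.
    static String receiveOnce(String msg) {
        byte[] payload = msg.getBytes(StandardCharsets.UTF_8);
        ByteBuffer received = ByteBuffer.allocate(payload.length);
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0)); // ephemeral port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                ByteBuffer out = ByteBuffer.wrap(payload);
                while (out.hasRemaining()) {
                    client.write(out);
                }
                boolean eof = false;
                while (received.hasRemaining() && !eof) {
                    selector.select(); // blocks until at least one channel is ready
                    Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                    while (keys.hasNext()) {
                        SelectionKey key = keys.next();
                        keys.remove();
                        if (key.isAcceptable()) {
                            SocketChannel conn = server.accept();
                            if (conn != null) {
                                conn.configureBlocking(false);
                                conn.register(selector, SelectionKey.OP_READ);
                            }
                        } else if (key.isReadable()) {
                            eof = ((SocketChannel) key.channel()).read(received) < 0;
                        }
                    }
                }
            } finally {
                // Close every channel still registered with the selector.
                for (SelectionKey key : selector.keys()) {
                    key.channel().close();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        received.flip();
        return StandardCharsets.UTF_8.decode(received).toString();
    }
}
```

Calling SelectorSketch.receiveOnce("hello") should hand back the same text. A real server would keep the loop running and register many connections on the same selector, which is where the "few threads, many connections" benefit comes from.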

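  pagecache series: the operating system's page cache, essential for efficient file writes. Roughly, it keeps part of the disk data in memory so that an application's reads and writes of that data can be served very quickly. Related interfaces include read(), write(), readpage(), writepage(), sync(), and fsync() (readpage() and writepage() are kernel-internal callbacks rather than system calls).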
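To make the write()/fsync() pair concrete, here is a small sketch in Java. The class name PageCacheWriteSketch and the temp file are my own assumptions for illustration. FileChannel.write() normally returns once the bytes are in the page cache, and FileChannel.force() is the call that asks the OS to push them to the device.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: buffered write followed by an explicit sync.
final class PageCacheWriteSketch {

    // Writes text to a temp file, forces it to disk, and returns the file content.
    static String writeAndSync(String text) {
        try {
            Path file = Files.createTempFile("pagecache-sketch", ".log");
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
                ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
                while (buffer.hasRemaining()) {
                    channel.write(buffer); // typically lands in the page cache first
                }
                channel.force(false);      // block until the file content is on the device
            }
            String readBack = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            Files.delete(file);
            return readBack;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The write loop is cheap because it only touches memory; force() is the expensive part, which is why how often to call it is a central design decision for a message store.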

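  mmap series: memory mapping. A file can be mapped into memory; the user then writes data directly into that memory region and the data ends up in the file on disk, which improves write performance. Interfaces: mmap(), munmap().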
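In Java the counterpart of mmap() is FileChannel.map(), which returns a MappedByteBuffer. The sketch below, with a hypothetical class name MmapSketch and a temp file of my choosing, writes through the mapping and then reads the file back through the ordinary file API to show that both views see the same bytes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: write to a file through a memory mapping.
final class MmapSketch {

    // Maps a temp file, writes text into the mapping, and returns the file content.
    static String writeThroughMapping(String text) {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        try {
            Path file = Files.createTempFile("mmap-sketch", ".log");
            file.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // A READ_WRITE mapping extends the file to the mapped size if needed.
                MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_WRITE, 0, data.length);
                mapped.put(data); // a plain memory write, no write() system call
                mapped.force();   // ask the OS to write the mapped pages to the device
            }
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

MmapSketch.writeThroughMapping("hello mmap") should return the same string. The put() call is the part that makes mapped files attractive for append-heavy workloads: the write path is just a memory copy.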

  directio series: direct I/O. It bypasses the page cache, avoiding the extra copy between user-space and kernel-space buffers and saving CPU and memory.

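  CAS series: the basis for efficient, safe lock implementations. Related primitive: cmpxchg (an x86 CPU instruction rather than an operating system call).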
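In Java, CAS is exposed through the java.util.concurrent.atomic classes. Here is a minimal spin lock built on AtomicBoolean.compareAndSet(), under a hypothetical class name SpinLockSketch; it is only an illustration of the idea, not RocketMQ's own lock class.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class: a spin lock built on compare-and-set.
final class SpinLockSketch {
    private final AtomicBoolean free = new AtomicBoolean(true);
    private int counter; // only touched while holding the lock

    void lock() {
        // Only one thread can flip true -> false; the others keep retrying.
        while (!free.compareAndSet(true, false)) {
            Thread.yield();
        }
    }

    void unlock() {
        free.set(true);
    }

    // Runs `threads` workers that each increment the shared counter under the lock.
    static int countWith(int threads, int incrementsPerThread) {
        SpinLockSketch sketch = new SpinLockSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    sketch.lock();
                    try {
                        sketch.counter++;
                    } finally {
                        sketch.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sketch.counter;
    }
}
```

A spin lock like this avoids parking threads, so it tends to pay off only when the critical section is very short; for longer sections a blocking lock is usually the safer choice.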

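  Multithreading series: most network applications are I/O-bound, so handling many requests at the same time is very important. Multithreading provides a convenient foundation for concurrent programming, letting us handle business logic more simply while offering very high throughput. This is something the programming language provides directly.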
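As a tiny illustration of the multithreading point, here is a sketch that hands a batch of "requests" to a fixed-size thread pool and collects the results. The class name WorkerPoolSketch and the doubling "handler" are placeholders of mine; a real service would do I/O or business logic in each task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class: process many requests on a small, fixed pool of threads.
final class WorkerPoolSketch {

    // Submits `requests` tasks to `threads` workers and returns the sum of their results.
    static int handleAll(int requests, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int requestId = i;
                futures.add(pool.submit(() -> requestId * 2)); // stand-in for real request handling
            }
            int sum = 0;
            for (Future<Integer> future : futures) {
                sum += future.get(); // wait for each task to finish
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With 10 requests the result is 2 * (0 + 1 + ... + 9) = 90, regardless of how many threads are in the pool.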

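3. RocketMQ's high-performance toolkit

  To achieve high concurrency and high performance, RocketMQ naturally has to look for methods at the operating system level, so it cannot escape the points listed above.

  First, it builds its high-performance network communication on netty. Netty's event-driven I/O model and zero-copy techniques already provide a very good foundation; RocketMQ only needs to make use of them to become very capable. Of course, this is only one of its strengths, because efficient network handling alone is not enough. At the very least, RocketMQ also has to provide an efficient way to serialize data.

  Second, once netty is in place as the communication framework and requests are accepted efficiently, the way RocketMQ itself processes them matters a great deal. Keeping the data purely in memory would of course give the best performance, but it cannot do that, because memory is too small to hold all the messages. So it has to store them in files. File operations are expensive, however, so some measures are needed to avoid heavyweight file operations. That makes a multi-level organization of files very important; think of how index files are used in databases.

  Third, Java offers a very good environment for multithreaded programming, and it would be a waste not to use it. A good thread model is another contributor to its performance.

  Finally, efficient file reads and writes based on the page cache and mmap are its real winning card. This is what we want to focus on next.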

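4. How RocketMQ uses mmap and the page cache

  Every point mentioned in the previous section is a reason RocketMQ stands out, but today we will cover only one of them: RocketMQ's efficient file storage.

  In fact, based on my previous articles, it is easy to find how RocketMQ reads and writes files. Let's take a producer writing message data as the example and review how RocketMQ stores files efficiently.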

    // Processor entry point
    // org.apache.rocketmq.broker.processor.SendMessageProcessor#processRequest
    @Override
    public RemotingCommand processRequest(ChannelHandlerContext ctx,
                                          RemotingCommand request) throws RemotingCommandException {
        RemotingCommand response = null;
        try {
            response = asyncProcessRequest(ctx, request).get();
        } catch (InterruptedException | ExecutionException e) {
            log.error("process SendMessage error, request : " + request.toString(), e);
        }
        return response;
    }
    // Accept the request and dispatch it for asynchronous processing
    public CompletableFuture<RemotingCommand> asyncProcessRequest(ChannelHandlerContext ctx,
                                                                  RemotingCommand request) throws RemotingCommandException {
        final SendMessageContext mqtraceContext;
        switch (request.getCode()) {
            case RequestCode.CONSUMER_SEND_MSG_BACK:
                return this.asyncConsumerSendMsgBack(ctx, request);
            default:
                // Write the message data
                SendMessageRequestHeader requestHeader = parseRequestHeader(request);
                if (requestHeader == null) {
                    return CompletableFuture.completedFuture(null);
                }
                mqtraceContext = buildMsgContext(ctx, requestHeader);
                this.executeSendMessageHookBefore(ctx, request, mqtraceContext);
                if (requestHeader.isBatch()) {
                    return this.asyncSendBatchMessage(ctx, request, mqtraceContext, requestHeader);
                } else {
                    return this.asyncSendMessage(ctx, request, mqtraceContext, requestHeader);
                }
        }
    }
    // org.apache.rocketmq.store.CommitLog#putMessage
    public PutMessageResult putMessage(final MessageExtBrokerInner msg) {
        // Set the storage time
        msg.setStoreTimestamp(System.currentTimeMillis());
        // Set the message body BODY CRC (consider the most appropriate setting
        // on the client)
        msg.setBodyCRC(UtilAll.crc32(msg.getBody()));
        // Back to Results
        AppendMessageResult result = null;

        StoreStatsService storeStatsService = this.defaultMessageStore.getStoreStatsService();

        String topic = msg.getTopic();
        int queueId = msg.getQueueId();

        final int tranType = MessageSysFlag.getTransactionValue(msg.getSysFlag());
        if (tranType == MessageSysFlag.TRANSACTION_NOT_TYPE
            || tranType == MessageSysFlag.TRANSACTION_COMMIT_TYPE) {
            // Delay Delivery
            if (msg.getDelayTimeLevel() > 0) {
                if (msg.getDelayTimeLevel() > this.defaultMessageStore.getScheduleMessageService().getMaxDelayLevel()) {
                    msg.setDelayTimeLevel(this.defaultMessageStore.getScheduleMessageService().getMaxDelayLevel());
                }

                topic = TopicValidator.RMQ_SYS_SCHEDULE_TOPIC;
                queueId = ScheduleMessageService.delayLevel2QueueId(msg.getDelayTimeLevel());

                // Backup real topic, queueId
                MessageAccessor.putProperty(msg, MessageConst.PROPERTY_REAL_TOPIC, msg.getTopic());
                MessageAccessor.putProperty(msg, MessageConst.PROPERTY_REAL_QUEUE_ID, String.valueOf(msg.getQueueId()));
                msg.setPropertiesString(MessageDecoder.messageProperties2String(msg.getProperties()));

                msg.setTopic(topic);
                msg.setQueueId(queueId);
            }
        }

        InetSocketAddress bornSocketAddress = (InetSocketAddress) msg.getBornHost();
        if (bornSocketAddress.getAddress() instanceof Inet6Address) {
            msg.setBornHostV6Flag();
        }

        InetSocketAddress storeSocketAddress = (InetSocketAddress) msg.getStoreHost();
        if (storeSocketAddress.getAddress() instanceof Inet6Address) {
            msg.setStoreHostAddressV6Flag();
        }

        long elapsedTimeInLock = 0;

        MappedFile unlockMappedFile = null;
        MappedFile mappedFile = this.mappedFileQueue.getLastMappedFile();

        putMessageLock.lock(); //spin or ReentrantLock ,depending on store config
        try {
            long beginLockTimestamp = this.defaultMessageStore.getSystemClock().now();
            this.beginTimeInLock = beginLockTimestamp;

            // Here settings are stored timestamp, in order to ensure an orderly
            // global
            msg.setStoreTimestamp(beginLockTimestamp);

            if (null == mappedFile || mappedFile.isFull()) {
                mappedFile = this.mappedFileQueue.getLastMappedFile(0); // Mark: NewFile may be cause noise
            }
            if (null == mappedFile) {
                log.error("create mapped file1 error, topic: " + msg.getTopic() + " clientAddr: " + msg.getBornHostString());
                beginTimeInLock = 0;
                return new PutMessageResult(PutMessageStatus.CREATE_MAPEDFILE_FAILED, null);
            }

            result = mappedFile.appendMessage(msg, this.appendMessageCallback);
            switch (result.getStatus()) {
                case PUT_OK:
                    break;
                case END_OF_FILE:
                    unlockMappedFile = mappedFile;
                    // Create a new file, re-write the message
                    mappedFile = this.mappedFileQueue.getLastMappedFile(0);
                    if (null == mappedFile) {
                        // XXX: warn and notify me
                        log.error("create mapped file2 error, topic: " + msg.getTopic() + " clientAddr: " + msg.getBornHostString());
                        beginTimeInLock = 0;
                        return new PutMessageResult(PutMessageStatus.CREATE_MAPEDFILE_FAILED, result);
                    }
                    result = mappedFile.appendMessage(msg, this.appendMessageCallback);
                    break;
                case MESSAGE_SIZE_EXCEEDED:
                case PROPERTIES_SIZE_EXCEEDED:
                    beginTimeInLock = 0;
                    return new PutMessageResult(PutMessageStatus.MESSAGE_ILLEGAL, result);
                case UNKNOWN_ERROR:
                    beginTimeInLock = 0;
                    return new PutMessageResult(PutMessageStatus.UNKNOWN_ERROR, result);
                default:
                    beginTimeInLock = 0;
                    return new PutMessageResult(PutMessageStatus.UNKNOWN_ERROR, result);
            }

            elapsedTimeInLock = this.defaultMessageStore.getSystemClock().now() - beginLockTimestamp;
            beginTimeInLock = 0;
        } finally {
            putMessageLock.unlock();
        }

        if (elapsedTimeInLock > 500) {
            log.warn("[NOTIFYME]putMessage in lock cost time(ms)={}, bodyLength={} AppendMessageResult={}", elapsedTimeInLock, msg.getBody().length, result);
        }

        if (null != unlockMappedFile && this.defaultMessageStore.getMessageStoreConfig().isWarmMapedFileEnable()) {
            this.defaultMessageStore.unlockMappedFile(unlockMappedFile);
        }

        PutMessageResult putMessageResult = new PutMessageResult(PutMessageStatus.PUT_OK, result);

        // Statistics
        storeStatsService.getSinglePutMessageTopicTimesTotal(msg.getTopic()).incrementAndGet();
        storeStatsService.getSinglePutMessageTopicSizeTotal(topic).addAndGet(result.getWroteBytes());

        handleDiskFlush(result, putMessageResult, msg);
        handleHA(result, putMessageResult, msg);

        return putMessageResult;
    }

    // org.apache.rocketmq.broker.processor.SendMessageProcessor#asyncSendMessage
    private CompletableFuture<RemotingCommand> asyncSendMessage(ChannelHandlerContext ctx, RemotingCommand request,
                                                                SendMessageContext mqtraceContext,
                                                                SendMessageRequestHeader requestHeader) {
        final RemotingCommand response = preSend(ctx, request, requestHeader);
        final SendMessageResponseHeader responseHeader = (SendMessageResponseHeader)response.readCustomHeader();

        if (response.getCode() != -1) {
            return CompletableFuture.completedFuture(response);
        }

        final byte[] body = request.getBody();

        int queueIdInt = requestHeader.getQueueId();
        TopicConfig topicConfig = this.brokerController.getTopicConfigManager().selectTopicConfig(requestHeader.getTopic());

        if (queueIdInt < 0) {
            queueIdInt = randomQueueId(topicConfig.getWriteQueueNums());
        }

        MessageExtBrokerInner msgInner = new MessageExtBrokerInner();
        msgInner.setTopic(requestHeader.getTopic());
        msgInner.setQueueId(queueIdInt);

        if (!handleRetryAndDLQ(requestHeader, response, request, msgInner, topicConfig)) {
            return CompletableFuture.completedFuture(response);
        }

        msgInner.setBody(body);
        msgInner.setFlag(requestHeader.getFlag());
        MessageAccessor.setProperties(msgInner, MessageDecoder.string2messageProperties(requestHeader.getProperties()));
        msgInner.setPropertiesString(requestHeader.getProperties());
        msgInner.setBornTimestamp(requestHeader.getBornTimestamp());
        msgInner.setBornHost(ctx.channel().remoteAddress());
        msgInner.setStoreHost(this.getStoreHost());
        msgInner.setReconsumeTimes(requestHeader.getReconsumeTimes() == null ? 0 : requestHeader.getReconsumeTimes());
        String clusterName = this.brokerController.getBrokerConfig().getBrokerClusterName();
        MessageAccessor.putProperty(msgInner, MessageConst.PROPERTY_CLUSTER, clusterName);
        msgInner.setPropertiesString(MessageDecoder.messageProperties2String(msgInner.getProperties()));

        CompletableFuture<PutMessageResult> putMessageResult = null;
        Map<String, String> origProps = MessageDecoder.string2messageProperties(requestHeader.getProperties());
        String transFlag = origProps.get(MessageConst.PROPERTY_TRANSACTION_PREPARED);
        if (transFlag != null && Boolean.parseBoolean(transFlag)) {
            if (this.brokerController.getBrokerConfig().isRejectTransactionMessage()) {
                response.setCode(ResponseCode.NO_PERMISSION);
                response.setRemark(
                    "the broker[" + this.brokerController.getBrokerConfig().getBrokerIP1()
                        + "] sending transaction message is forbidden");
                return CompletableFuture.completedFuture(response);
            }
            putMessageResult = this.brokerController.getTransactionalMessageService().asyncPrepareMessage(msgInner);
        } else {
            // For simplicity, we only follow the non-transactional write path
            putMessageResult = this.brokerController.getMessageStore().asyncPutMessage(msgInner);
        }
        return handlePutMessageResultFuture(putMessageResult, response, request, msgInner, responseHeader, mqtraceContext, ctx, queueIdInt);
    }
    // org.apache.rocketmq.store.DefaultMessageStore#asyncPutMessage
    @Override
    public CompletableFuture<PutMessageResult> asyncPutMessage(MessageExtBrokerInner msg) {
        PutMessageStatus checkStoreStatus = this.checkStoreStatus();
        if (checkStoreStatus != PutMessageStatus.PUT_OK) {
            return CompletableFuture.completedFuture(new PutMessageResult(checkStoreStatus, null));
        }

        PutMessageStatus msgCheckStatus = this.checkMessage(msg);
        if (msgCheckStatus == PutMessageStatus.MESSAGE_ILLEGAL) {
            return CompletableFuture.completedFuture(new PutMessageResult(msgCheckStatus, null));
        }
        // Write the message data to the commitLog
        long beginTime = this.getSystemClock().now();
        CompletableFuture<PutMessageResult> putResultFuture = this.commitLog.asyncPutMessage(msg);

        putResultFuture.thenAccept((result) -> {
            long elapsedTime = this.getSystemClock().now() - beginTime;
            if (elapsedTime > 500) {
                log.warn("putMessage not in lock elapsed time(ms)={}, bodyLength={}", elapsedTime, msg.getBody().length);
            }
            this.storeStatsService.setPutMessageEntireTimeMax(elapsedTime);

            if (null == result || !result.isOk()) {
                this.storeStatsService.getPutMessageFailedTimes().incrementAndGet();
            }
        });

        return putResultFuture;
    }
    // Write the message data to the commitLog
    // org.apache.rocketmq.store.CommitLog#asyncPutMessage
    public CompletableFuture<PutMessageResult> asyncPutMessage(final MessageExtBrokerInner msg) {
        // Set the storage time
        msg.setStoreTimestamp(System.currentTimeMillis());
        // Set the message body BODY CRC (consider the most appropriate setting
        // on the client)
        msg.setBodyCRC(UtilAll.crc32(msg.getBody()));
        // Back to Results
        AppendMessageResult result = null;

        StoreStatsService storeStatsService = this.defaultMessageStore.getStoreStatsService();

        String topic = msg.getTopic();
        int queueId = msg.getQueueId();

        final int tranType = MessageSysFlag.getTransactionValue(msg.getSysFlag());
        if (tranType == MessageSysFlag.TRANSACTION_NOT_TYPE
            || tranType == MessageSysFlag.TRANSACTION_COMMIT_TYPE) {
            // Delay Delivery
            if (msg.getDelayTimeLevel() > 0) {
                if (msg.getDelayTimeLevel() > this.defaultMessageStore.getScheduleMessageService().getMaxDelayLevel()) {
                    msg.setDelayTimeLevel(this.defaultMessageStore.getScheduleMessageService().getMaxDelayLevel());
                }

                topic = TopicValidator.RMQ_SYS_SCHEDULE_TOPIC;
                queueId = ScheduleMessageService.delayLevel2QueueId(msg.getDelayTimeLevel());

                // Backup real topic, queueId
                MessageAccessor.putProperty(msg, MessageConst.PROPERTY_REAL_TOPIC, msg.getTopic());
                MessageAccessor.putProperty(msg, MessageConst.PROPERTY_REAL_QUEUE_ID, String.valueOf(msg.getQueueId()));
                msg.setPropertiesString(MessageDecoder.messageProperties2String(msg.getProperties()));

                msg.setTopic(topic);
                msg.setQueueId(queueId);
            }
        }

        long elapsedTimeInLock = 0;
        // Get the mappedFile instance that the data will be written to
        MappedFile unlockMappedFile = null;
        MappedFile mappedFile = this.mappedFileQueue.getLastMappedFile();
        // Write under a lock so the data is written safely and correctly
        putMessageLock.lock(); //spin or ReentrantLock ,depending on store config
        try {
            long beginLockTimestamp = this.defaultMessageStore.getSystemClock().now();
            this.beginTimeInLock = beginLockTimestamp;

            // Here settings are stored timestamp, in order to ensure an orderly
            // global
            msg.setStoreTimestamp(beginLockTimestamp);
            // Make sure the mappedFile is usable
            if (null == mappedFile || mappedFile.isFull()) {
                mappedFile = this.mappedFileQueue.getLastMappedFile(0); // Mark: NewFile may be cause noise
            }
            if (null == mappedFile) {
                log.error("create mapped file1 error, topic: " + msg.getTopic() + " clientAddr: " + msg.getBornHostString());
                beginTimeInLock = 0;
                return CompletableFuture.completedFuture(new PutMessageResult(PutMessageStatus.CREATE_MAPEDFILE_FAILED, null));
            }
            // Append the data to the mappedFile; this completes the write
            result = mappedFile.appendMessage(msg, this.appendMessageCallback);
            switch (result.getStatus()) {
                case PUT_OK:
                    break;
                case END_OF_FILE:
                    unlockMappedFile = mappedFile;
                    // Create a new file, re-write the message
                    mappedFile = this.mappedFileQueue.getLastMappedFile(0);
                    if (null == mappedFile) {
                        // XXX: warn and notify me
                        log.error("create mapped file2 error, topic: " + msg.getTopic() + " clientAddr: " + msg.getBornHostString());
                        beginTimeInLock = 0;
                        return CompletableFuture.completedFuture(new PutMessageResult(PutMessageStatus.CREATE_MAPEDFILE_FAILED, result));
                    }
                    result = mappedFile.appendMessage(msg, this.appendMessageCallback);
                    break;
                case MESSAGE_SIZE_EXCEEDED:
                case PROPERTIES_SIZE_EXCEEDED:
                    beginTimeInLock = 0;
                    return CompletableFuture.completedFuture(new PutMessageResult(PutMessageStatus.MESSAGE_ILLEGAL, result));
                case UNKNOWN_ERROR:
                    beginTimeInLock = 0;
                    return CompletableFuture.completedFuture(new PutMessageResult(PutMessageStatus.UNKNOWN_ERROR, result));
                default:
                    beginTimeInLock = 0;
                    return CompletableFuture.completedFuture(new PutMessageResult(PutMessageStatus.UNKNOWN_ERROR, result));
            }

            elapsedTimeInLock = this.defaultMessageStore.getSystemClock().now() - beginLockTimestamp;
            beginTimeInLock = 0;
        } finally {
            putMessageLock.unlock();
        }

        if (elapsedTimeInLock > 500) {
            log.warn("[NOTIFYME]putMessage in lock cost time(ms)={}, bodyLength={} AppendMessageResult={}", elapsedTimeInLock, msg.getBody().length, result);
        }

        if (null != unlockMappedFile && this.defaultMessageStore.getMessageStoreConfig().isWarmMapedFileEnable()) {
            this.defaultMessageStore.unlockMappedFile(unlockMappedFile);
        }

        PutMessageResult putMessageResult = new PutMessageResult(PutMessageStatus.PUT_OK, result);

        // Statistics
        storeStatsService.getSinglePutMessageTopicTimesTotal(msg.getTopic()).incrementAndGet();
        storeStatsService.getSinglePutMessageTopicSizeTotal(topic).addAndGet(result.getWroteBytes());

        CompletableFuture<PutMessageStatus> flushResultFuture = submitFlushRequest(result, putMessageResult, msg);
        CompletableFuture<PutMessageStatus> replicaResultFuture = submitReplicaRequest(result, putMessageResult, msg);
        return flushResultFuture.thenCombine(replicaResultFuture, (flushStatus, replicaStatus) -> {
            if (flushStatus != PutMessageStatus.PUT_OK) {
                putMessageResult.setPutMessageStatus(PutMessageStatus.FLUSH_DISK_TIMEOUT);
            }
            if (replicaStatus != PutMessageStatus.PUT_OK) {
                putMessageResult.setPutMessageStatus(replicaStatus);
            }
            return putMessageResult;
        });
    }

    // Get a usable mappedFile instance
    // org.apache.rocketmq.store.MappedFileQueue#getLastMappedFile()
    public MappedFile getLastMappedFile() {
        MappedFile mappedFileLast = null;

        while (!this.mappedFiles.isEmpty()) {
            try {
                mappedFileLast = this.mappedFiles.get(this.mappedFiles.size() - 1);
                break;
            } catch (IndexOutOfBoundsException e) {
                //continue;
            } catch (Exception e) {
                log.error("getLastMappedFile has exception.", e);
                break;
            }
        }

        return mappedFileLast;
    }
    // Try again to get the mappedFile; create a new one if there is none
    // org.apache.rocketmq.store.MappedFileQueue#getLastMappedFile(long)
    public MappedFile getLastMappedFile(final long startOffset) {
        return getLastMappedFile(startOffset, true);
    }
    public MappedFile getLastMappedFile(final long startOffset, boolean needCreate) {
        long createOffset = -1;
        MappedFile mappedFileLast = getLastMappedFile();

        if (mappedFileLast == null) {
            createOffset = startOffset - (startOffset % this.mappedFileSize);
        }

        if (mappedFileLast != null && mappedFileLast.isFull()) {
            createOffset = mappedFileLast.getFileFromOffset() + this.mappedFileSize;
        }

        if (createOffset != -1 && needCreate) {
            String nextFilePath = this.storePath + File.separator + UtilAll.offset2FileName(createOffset);
            String nextNextFilePath = this.storePath + File.separator
                + UtilAll.offset2FileName(createOffset + this.mappedFileSize);
            MappedFile mappedFile = null;
            // Allocate and create a new commitLog file
            if (this.allocateMappedFileService != null) {
                mappedFile = this.allocateMappedFileService.putRequestAndReturnMappedFile(nextFilePath,
                    nextNextFilePath, this.mappedFileSize);
            } else {
                try {
                    mappedFile = new MappedFile(nextFilePath, this.mappedFileSize);
                } catch (IOException e) {
                    log.error("create mappedFile exception", e);
                }
            }

            if (mappedFile != null) {
                if (this.mappedFiles.isEmpty()) {
                    mappedFile.setFirstCreateInQueue(true);
                }
                this.mappedFiles.add(mappedFile);
            }

            return mappedFile;
        }

        return mappedFileLast;
    }

    // Append data sequentially to the mappedFile obtained for the commitLog
    public AppendMessageResult appendMessage(final MessageExtBrokerInner msg, final AppendMessageCallback cb) {
        return appendMessagesInner(msg, cb);
    }
    public AppendMessageResult appendMessagesInner(final MessageExt messageExt, final AppendMessageCallback cb) {
        assert messageExt != null;
        assert cb != null;

        int currentPos = this.wrotePosition.get();

        if (currentPos < this.fileSize) {
            ByteBuffer byteBuffer = writeBuffer != null ? writeBuffer.slice() : this.mappedByteBuffer.slice();
            byteBuffer.position(currentPos);
            AppendMessageResult result;
            if (messageExt instanceof MessageExtBrokerInner) {
                // Callback that writes the data into the commitLog.
                // Writing into the mapped byteBuffer puts the data in the page cache;
                // it reaches the disk file once the pages are flushed.
                result = cb.doAppend(this.getFileFromOffset(), byteBuffer, this.fileSize - currentPos, (MessageExtBrokerInner) messageExt);
            } else if (messageExt instanceof MessageExtBatch) {
                result = cb.doAppend(this.getFileFromOffset(), byteBuffer, this.fileSize - currentPos, (MessageExtBatch) messageExt);
            } else {
                return new AppendMessageResult(AppendMessageStatus.UNKNOWN_ERROR);
            }
            this.wrotePosition.addAndGet(result.getWroteBytes());
            this.storeTimestamp = result.getStoreTimestamp();
            return result;
        }
        log.error("MappedFile.appendMessage return null, wrotePosition: {} fileSize: {}", currentPos, this.fileSize);
        return new AppendMessageResult(AppendMessageStatus.UNKNOWN_ERROR);
    }
    // org.apache.rocketmq.store.CommitLog.DefaultAppendMessageCallback#doAppend(long, java.nio.ByteBuffer, int, org.apache.rocketmq.store.MessageExtBrokerInner)
    public AppendMessageResult doAppend(final long fileFromOffset, final ByteBuffer byteBuffer, final int maxBlank,
                                        final MessageExtBrokerInner msgInner) {
        // STORETIMESTAMP + STOREHOSTADDRESS + OFFSET <br>

        // PHY OFFSET
        long wroteOffset = fileFromOffset + byteBuffer.position();

        int sysflag = msgInner.getSysFlag();

        int bornHostLength = (sysflag & MessageSysFlag.BORNHOST_V6_FLAG) == 0 ? 4 + 4 : 16 + 4;
        int storeHostLength = (sysflag & MessageSysFlag.STOREHOSTADDRESS_V6_FLAG) == 0 ? 4 + 4 : 16 + 4;
        ByteBuffer bornHostHolder = ByteBuffer.allocate(bornHostLength);
        ByteBuffer storeHostHolder = ByteBuffer.allocate(storeHostLength);

        this.resetByteBuffer(storeHostHolder, storeHostLength);
        String msgId;
        if ((sysflag & MessageSysFlag.STOREHOSTADDRESS_V6_FLAG) == 0) {
            msgId = MessageDecoder.createMessageId(this.msgIdMemory, msgInner.getStoreHostBytes(storeHostHolder), wroteOffset);
        } else {
            msgId = MessageDecoder.createMessageId(this.msgIdV6Memory, msgInner.getStoreHostBytes(storeHostHolder), wroteOffset);
        }

        // Record ConsumeQueue information
        keyBuilder.setLength(0);
        keyBuilder.append(msgInner.getTopic());
        keyBuilder.append('-');
        keyBuilder.append(msgInner.getQueueId());
        String key = keyBuilder.toString();
        Long queueOffset = CommitLog.this.topicQueueTable.get(key);
        // Initialize the offset for this topic-queueId key
        if (null == queueOffset) {
            queueOffset = 0L;
            CommitLog.this.topicQueueTable.put(key, queueOffset);
        }

        // Transaction messages that require special handling
        final int tranType = MessageSysFlag.getTransactionValue(msgInner.getSysFlag());
        switch (tranType) {
            // Prepared and Rollback message is not consumed, will not enter the
            // consumer queue
            case MessageSysFlag.TRANSACTION_PREPARED_TYPE:
            case MessageSysFlag.TRANSACTION_ROLLBACK_TYPE:
                queueOffset = 0L;
                break;
            case MessageSysFlag.TRANSACTION_NOT_TYPE:
            case MessageSysFlag.TRANSACTION_COMMIT_TYPE:
            default:
                break;
        }

        /**
         * Serialize message
         */
        final byte[] propertiesData =
            msgInner.getPropertiesString() == null ? null : msgInner.getPropertiesString().getBytes(MessageDecoder.CHARSET_UTF8);

        final int propertiesLength = propertiesData == null ? 0 : propertiesData.length;

        if (propertiesLength > Short.MAX_VALUE) {
            log.warn("putMessage message properties length too long. length={}", propertiesData.length);
            return new AppendMessageResult(AppendMessageStatus.PROPERTIES_SIZE_EXCEEDED);
        }

        final byte[] topicData = msgInner.getTopic().getBytes(MessageDecoder.CHARSET_UTF8);
        final int topicLength = topicData.length;

        final int bodyLength = msgInner.getBody() == null ? 0 : msgInner.getBody().length;

        final int msgLen = calMsgLength(msgInner.getSysFlag(), bodyLength, topicLength, propertiesLength);

        // Exceeds the maximum message
        if (msgLen > this.maxMessageSize) {
            CommitLog.log.warn("message size exceeded, msg total size: " + msgLen + ", msg body size: " + bodyLength
                + ", maxMessageSize: " + this.maxMessageSize);
            return new AppendMessageResult(AppendMessageStatus.MESSAGE_SIZE_EXCEEDED);
        }

        // Determines whether there is sufficient free space
        if ((msgLen + END_FILE_MIN_BLANK_LENGTH) > maxBlank) {
            this.resetByteBuffer(this.msgStoreItemMemory, maxBlank);
            // 1 TOTALSIZE
            this.msgStoreItemMemory.putInt(maxBlank);
            // 2 MAGICCODE
            this.msgStoreItemMemory.putInt(CommitLog.BLANK_MAGIC_CODE);
            // 3 The remaining space may be any value
            // Here the length of the specially set maxBlank
            final long beginTimeMills = CommitLog.this.defaultMessageStore.now();
            byteBuffer.put(this.msgStoreItemMemory.array(), 0, maxBlank);
            return new AppendMessageResult(AppendMessageStatus.END_OF_FILE, wroteOffset, maxBlank, msgId, msgInner.getStoreTimestamp(),
                queueOffset, CommitLog.this.defaultMessageStore.now() - beginTimeMills);
        }
        // Serialize the data to be written: header fields first, then body, topic, properties
        // Initialization of storage space
        this.resetByteBuffer(msgStoreItemMemory, msgLen);
        // 1 TOTALSIZE
        this.msgStoreItemMemory.putInt(msgLen);
        // 2 MAGICCODE
        this.msgStoreItemMemory.putInt(CommitLog.MESSAGE_MAGIC_CODE);
        // 3 BODYCRC
        this.msgStoreItemMemory.putInt(msgInner.getBodyCRC());
        // 4 QUEUEID
        this.msgStoreItemMemory.putInt(msgInner.getQueueId());
        // 5 FLAG
        this.msgStoreItemMemory.putInt(msgInner.getFlag());
        // 6 QUEUEOFFSET
        this.msgStoreItemMemory.putLong(queueOffset);
        // 7 PHYSICALOFFSET
        this.msgStoreItemMemory.putLong(fileFromOffset + byteBuffer.position());
        // 8 SYSFLAG
        this.msgStoreItemMemory.putInt(msgInner.getSysFlag());
        // 9 BORNTIMESTAMP
        this.msgStoreItemMemory.putLong(msgInner.getBornTimestamp());
        // 10 BORNHOST
        this.resetByteBuffer(bornHostHolder, bornHostLength);
        this.msgStoreItemMemory.put(msgInner.getBornHostBytes(bornHostHolder));
        // 11 STORETIMESTAMP
        this.msgStoreItemMemory.putLong(msgInner.getStoreTimestamp());
        // 12 STOREHOSTADDRESS
        this.resetByteBuffer(storeHostHolder, storeHostLength);
        this.msgStoreItemMemory.put(msgInner.getStoreHostBytes(storeHostHolder));
        // 13 RECONSUMETIMES
        this.msgStoreItemMemory.putInt(msgInner.getReconsumeTimes());
        // 14 Prepared Transaction Offset
        this.msgStoreItemMemory.putLong(msgInner.getPreparedTransactionOffset());
        // 15 BODY
        this.msgStoreItemMemory.putInt(bodyLength);
        if (bodyLength > 0)
            this.msgStoreItemMemory.put(msgInner.getBody());
        // 16 TOPIC
        this.msgStoreItemMemory.put((byte) topicLength);
        this.msgStoreItemMemory.put(topicData);
        // 17 PROPERTIES
        this.msgStoreItemMemory.putShort((short) propertiesLength);
        if (propertiesLength > 0)
            this.msgStoreItemMemory.put(propertiesData);

        final long beginTimeMills = CommitLog.this.defaultMessageStore.now();
        // Write messages to the queue buffer
        // Copy the serialized message into the ByteBuffer
        byteBuffer.put(this.msgStoreItemMemory.array(), 0, msgLen);

        AppendMessageResult result = new AppendMessageResult(AppendMessageStatus.PUT_OK, wroteOffset, msgLen, msgId,
            msgInner.getStoreTimestamp(), queueOffset, CommitLog.this.defaultMessageStore.now() - beginTimeMills);

        switch (tranType) {
            case MessageSysFlag.TRANSACTION_PREPARED_TYPE:
            case MessageSysFlag.TRANSACTION_ROLLBACK_TYPE:
                break;
            case MessageSysFlag.TRANSACTION_NOT_TYPE:
            case MessageSysFlag.TRANSACTION_COMMIT_TYPE:
                // The next update ConsumeQueue information
                CommitLog.this.topicQueueTable.put(key, ++queueOffset);
                break;
            default:
                break;
        }
        return result;
    }

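The process above looks complicated, but only the final byteBuffer.putXxx() call actually touches the mappedFile. There is also the initialization of the MappedFile, of course: the code first tries to take the last instance from the currently open mappedFiles, and if that mappedFile is full it tries to create a new one, which usually goes along with creating a new commitLog file.

  Flushing a mappedFile comes in two flavors, synchronous and asynchronous. Underneath they are the same: both call flush(), which ends up in MappedByteBuffer.force() or FileChannel.force() and forces the page cache out to disk. In general, once data has been written to the page cache it is basically safe from loss; it survives a crash of the broker process, for example. There are still exceptions: in extreme cases such as a power failure or an operating system bug, data that has not been flushed can be lost.

Let's take a rough look at the synchronous flush path of a mappedFile: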
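The offset arithmetic in getLastMappedFile(long, boolean) is easy to try out in isolation. Below is a small sketch with a hypothetical class name CommitLogNamingSketch. It reuses the two formulas from the code above and assumes that UtilAll.offset2FileName yields a 20-digit, zero-padded decimal string and that each file is 1 GiB; both are assumptions on my part, so check them against your RocketMQ version and configuration.

```java
import java.util.Locale;

// Hypothetical demo class: how commitLog file names follow from offsets.
final class CommitLogNamingSketch {

    // Assumed file size for the example only (1 GiB).
    static final long ASSUMED_FILE_SIZE = 1024L * 1024 * 1024;

    // Same formula as createOffset = startOffset - (startOffset % mappedFileSize).
    static long fileStartOffset(long offset, long mappedFileSize) {
        return offset - (offset % mappedFileSize);
    }

    // Same formula as createOffset = fileFromOffset + mappedFileSize when the last file is full.
    static long nextFileStartOffset(long currentFileFromOffset, long mappedFileSize) {
        return currentFileFromOffset + mappedFileSize;
    }

    // Assumption: mirrors the 20-digit zero-padded naming of UtilAll.offset2FileName.
    static String fileName(long startOffset) {
        return String.format(Locale.ROOT, "%020d", startOffset);
    }
}
```

Under these assumptions the first file is named 00000000000000000000 and the second one 00000000001073741824, so the file that holds any physical offset can be found from the file names alone.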
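The idea behind a synchronous "group" flush can be sketched without any file I/O: writers register a future after appending, and a single flusher completes a whole batch of futures after one force call. The class GroupFlushSketch and all of its members are hypothetical names of mine, and the force call is only simulated by a counter, so treat this as an illustration of the waiting pattern rather than RocketMQ's GroupCommitService.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class: many writers wait on one (simulated) flush.
final class GroupFlushSketch {
    enum Status { OK, TIMEOUT }

    private final List<CompletableFuture<Status>> pending = new ArrayList<>();
    private int flushCount; // how many simulated force() calls were made

    // Called by a writer after its append: ask to be told when the data is flushed.
    synchronized CompletableFuture<Status> request() {
        CompletableFuture<Status> future = new CompletableFuture<>();
        pending.add(future);
        return future;
    }

    // Called by the flusher: one flush serves every writer waiting so far.
    void flushOnce() {
        List<CompletableFuture<Status>> batch;
        synchronized (this) {
            batch = new ArrayList<>(pending);
            pending.clear();
        }
        if (batch.isEmpty()) {
            return;
        }
        flushCount++; // stand-in for MappedByteBuffer.force() or FileChannel.force()
        for (CompletableFuture<Status> future : batch) {
            future.complete(Status.OK);
        }
    }

    // Writer side: wait for the flush, but give up after timeoutMillis.
    static Status await(CompletableFuture<Status> future, long timeoutMillis) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return Status.TIMEOUT;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Status.TIMEOUT;
        }
    }

    // Two writers are served by one flush; a third gives up because nobody flushes.
    static String demo() {
        GroupFlushSketch sketch = new GroupFlushSketch();
        CompletableFuture<Status> first = sketch.request();
        CompletableFuture<Status> second = sketch.request();
        sketch.flushOnce();
        CompletableFuture<Status> third = sketch.request();
        return await(first, 100) + "," + await(second, 100) + ","
            + await(third, 10) + ",flushes=" + sketch.flushCount;
    }
}
```

In this sketch the flusher is invoked by hand; in a real store it would be a background thread that wakes up whenever requests arrive, which is the role the flush service plays in the code that follows.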

    // org.apache.rocketmq.store.CommitLog#handleDiskFlush
    public void handleDiskFlush(AppendMessageResult result, PutMessageResult putMessageResult, MessageExt messageExt) {
        // Synchronization flush
        if (FlushDiskType.SYNC_FLUSH == this.defaultMessageStore.getMessageStoreConfig().getFlushDiskType()) {
            final GroupCommitService service = (GroupCommitService) this.flushCommitLogService;
            if (messageExt.isWaitStoreMsgOK()) {
                // submit a flush request to GroupCommitService and wait synchronously for the result
                GroupCommitRequest request = new GroupCommitRequest(result.getWroteOffset() + result.getWroteBytes());
                service.putRequest(request);
                CompletableFuture<PutMessageStatus> flushOkFuture = request.future();
                PutMessageStatus flushStatus = null;
                try {
                    flushStatus = flushOkFuture.get(this.defaultMessageStore.getMessageStoreConfig().getSyncFlushTimeout(),
                        TimeUnit.MILLISECONDS);
                } catch (InterruptedException | ExecutionException | TimeoutException e) {
                    //flushOK=false;
                }
                if (flushStatus != PutMessageStatus.PUT_OK) {
                    log.error("do groupcommit, wait for flush failed, topic: " + messageExt.getTopic() + " tags: " + messageExt.getTags()
                        + " client address: " + messageExt.getBornHostString());
                    putMessageResult.setPutMessageStatus(PutMessageStatus.FLUSH_DISK_TIMEOUT);
                }
            } else {
                service.wakeup();
            }
        }
        // Asynchronous flush
        else {
            // for an asynchronous flush, simply wake up the flush thread
            if (!this.defaultMessageStore.getMessageStoreConfig().isTransientStorePoolEnable()) {
                flushCommitLogService.wakeup();
            } else {
                commitLogService.wakeup();
            }
        }
    }

    // org.apache.rocketmq.store.CommitLog.GroupCommitService#putRequest
    // enqueue a flush request
    public synchronized void putRequest(final GroupCommitRequest request) {
        synchronized (this.requestsWrite) {
            this.requestsWrite.add(request);
        }
        this.wakeup();
    }

    // the flush thread runs until the service is stopped
    public void run() {
        CommitLog.log.info(this.getServiceName() + " service started");

        while (!this.isStopped()) {
            try {
                this.waitForRunning(10);
                // perform the actual flush
                this.doCommit();
            } catch (Exception e) {
                CommitLog.log.warn(this.getServiceName() + " service has exception. ", e);
            }
        }

        // Under normal circumstances shutdown, wait for the arrival of the
        // request, and then flush
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            CommitLog.log.warn("GroupCommitService Exception, ", e);
        }

        synchronized (this) {
            this.swapRequests();
        }

        this.doCommit();

        CommitLog.log.info(this.getServiceName() + " service end");
    }

    private void doCommit() {
        synchronized (this.requestsRead) {
            if (!this.requestsRead.isEmpty()) {
                for (GroupCommitRequest req : this.requestsRead) {
                    // There may be a message in the next file, so a maximum of
                    // two times the flush
                    boolean flushOK = false;
                    for (int i = 0; i < 2 && !flushOK; i++) {
                        flushOK = CommitLog.this.mappedFileQueue.getFlushedWhere() >= req.getNextOffset();

                        if (!flushOK) {
                            // the flush itself
                            CommitLog.this.mappedFileQueue.flush(0);
                        }
                    }

                    req.wakeupCustomer(flushOK ? PutMessageStatus.PUT_OK : PutMessageStatus.FLUSH_DISK_TIMEOUT);
                }

                long storeTimestamp = CommitLog.this.mappedFileQueue.getStoreTimestamp();
                if (storeTimestamp > 0) {
                    CommitLog.this.defaultMessageStore.getStoreCheckpoint().setPhysicMsgTimestamp(storeTimestamp);
                }

                this.requestsRead.clear();
            } else {
                // Because of individual messages is set to not sync flush, it
                // will come to this process
                CommitLog.this.mappedFileQueue.flush(0);
            }
        }
    }

    // org.apache.rocketmq.store.MappedFileQueue#flush
    public boolean flush(final int flushLeastPages) {
        boolean result = true;
        MappedFile mappedFile = this.findMappedFileByOffset(this.flushedWhere, this.flushedWhere == 0);
        if (mappedFile != null) {
            long tmpTimeStamp = mappedFile.getStoreTimestamp();
            int offset = mappedFile.flush(flushLeastPages);
            long where = mappedFile.getFileFromOffset() + offset;
            result = where == this.flushedWhere;
            this.flushedWhere = where;
            if (0 == flushLeastPages) {
                this.storeTimestamp = tmpTimeStamp;
            }
        }

        return result;
    }

    // org.apache.rocketmq.store.MappedFile#flush
    /**
     * @return The current flushed position
     */
    public int flush(final int flushLeastPages) {
        if (this.isAbleToFlush(flushLeastPages)) {
            if (this.hold()) {
                int value = getReadPosition();

                try {
                    //We only append data to fileChannel or mappedByteBuffer, never both.
                    if (writeBuffer != null || this.fileChannel.position() != 0) {
                        this.fileChannel.force(false);
                    } else {
                        this.mappedByteBuffer.force();
                    }
                } catch (Throwable e) {
                    log.error("Error occurred when force data to disk.", e);
                }

                this.flushedPosition.set(value);
                this.release();
            } else {
                log.warn("in flush, hold failed, flush offset = " + this.flushedPosition.get());
                this.flushedPosition.set(getReadPosition());
            }
        }
        return this.getFlushedPosition();
    }
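
The group-commit idea in the code above (producers queue a request and wait on a future, one flusher swaps the read and write lists and serves the whole batch with as few flushes as possible) can be reduced to a small sketch. Everything here is a simplified stand-in under stated assumptions: `GroupCommitSketch`, `append` and `flushCalls` are hypothetical names, the "flush" only advances a counter instead of calling force(), and a single flusher thread is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

final class GroupCommitSketch {

    /** One waiting producer: it needs everything up to nextOffset to be durable. */
    static final class Request {
        final long nextOffset;
        final CompletableFuture<Boolean> future = new CompletableFuture<>();

        Request(long nextOffset) {
            this.nextOffset = nextOffset;
        }
    }

    private List<Request> requestsWrite = new ArrayList<>();
    private List<Request> requestsRead = new ArrayList<>();
    private long writtenWhere = 0;   // highest offset written (stand-in for the mapped file)
    private long flushedWhere = 0;   // highest offset flushed; touched by the flusher only
    final AtomicInteger flushCalls = new AtomicInteger();

    /** Producer side: record the write and queue a flush request. */
    synchronized Request append(int bytes) {
        writtenWhere += bytes;
        Request request = new Request(writtenWhere);
        requestsWrite.add(request);
        return request;
    }

    /** Flusher side: swap the two lists under the lock, then serve the batch outside it. */
    void doCommit() {
        List<Request> batch;
        long target;
        synchronized (this) {
            batch = requestsWrite;
            requestsWrite = requestsRead;   // producers keep appending to the other list
            requestsRead = batch;
            target = writtenWhere;
        }
        for (Request request : batch) {
            if (flushedWhere < request.nextOffset) {
                // Stand-in for a real force(): one call covers everything written so far.
                flushCalls.incrementAndGet();
                flushedWhere = target;
            }
            request.future.complete(flushedWhere >= request.nextOffset);
        }
        batch.clear();
    }

    public static void main(String[] args) {
        GroupCommitSketch log = new GroupCommitSketch();
        Request a = log.append(100);
        Request b = log.append(50);
        Request c = log.append(25);
        log.doCommit();
        System.out.println(a.future.join() + " " + b.future.join() + " " + c.future.join());
        System.out.println("flush calls: " + log.flushCalls.get());
    }
}
```

The point of the two lists is that producers never wait for a flush that is in progress just to enqueue their request, while the flusher amortizes one expensive flush over every request that arrived in the meantime.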

  Finally, let's look at how a mappedFile is created and warmed up:

    // org.apache.rocketmq.store.AllocateMappedFileService#putRequestAndReturnMappedFile
    public MappedFile putRequestAndReturnMappedFile(String nextFilePath, String nextNextFilePath, int fileSize) {
        int canSubmitRequests = 2;
        if (this.messageStore.getMessageStoreConfig().isTransientStorePoolEnable()) {
            if (this.messageStore.getMessageStoreConfig().isFastFailIfNoBufferInStorePool()
                && BrokerRole.SLAVE != this.messageStore.getMessageStoreConfig().getBrokerRole()) { //if broker is slave, don't fast fail even no buffer in pool
                canSubmitRequests = this.messageStore.getTransientStorePool().availableBufferNums() - this.requestQueue.size();
            }
        }

        AllocateRequest nextReq = new AllocateRequest(nextFilePath, fileSize);
        boolean nextPutOK = this.requestTable.putIfAbsent(nextFilePath, nextReq) == null;

        if (nextPutOK) {
            if (canSubmitRequests <= 0) {
                log.warn("[NOTIFYME]TransientStorePool is not enough, so create mapped file error, " +
                    "RequestQueueSize : {}, StorePoolSize: {}", this.requestQueue.size(), this.messageStore.getTransientStorePool().availableBufferNums());
                this.requestTable.remove(nextFilePath);
                return null;
            }
            boolean offerOK = this.requestQueue.offer(nextReq);
            if (!offerOK) {
                log.warn("never expected here, add a request to preallocate queue failed");
            }
            canSubmitRequests--;
        }

        AllocateRequest nextNextReq = new AllocateRequest(nextNextFilePath, fileSize);
        // register in the request table; the background task will pick it up
        boolean nextNextPutOK = this.requestTable.putIfAbsent(nextNextFilePath, nextNextReq) == null;
        if (nextNextPutOK) {
            if (canSubmitRequests <= 0) {
                log.warn("[NOTIFYME]TransientStorePool is not enough, so skip preallocate mapped file, " +
                    "RequestQueueSize : {}, StorePoolSize: {}", this.requestQueue.size(), this.messageStore.getTransientStorePool().availableBufferNums());
                this.requestTable.remove(nextNextFilePath);
            } else {
                // put it on the mmap queue so the background task starts working on it
                boolean offerOK = this.requestQueue.offer(nextNextReq);
                if (!offerOK) {
                    log.warn("never expected here, add a request to preallocate queue failed");
                }
            }
        }

        if (hasException) {
            log.warn(this.getServiceName() + " service has exception. so return null");
            return null;
        }

        AllocateRequest result = this.requestTable.get(nextFilePath);
        try {
            if (result != null) {
                // wait synchronously until the mmap request has been processed
                boolean waitOK = result.getCountDownLatch().await(waitTimeOut, TimeUnit.MILLISECONDS);
                if (!waitOK) {
                    log.warn("create mmap timeout " + result.getFilePath() + " " + result.getFileSize());
                    return null;
                } else {
                    this.requestTable.remove(nextFilePath);
                    return result.getMappedFile();
                }
            } else {
                log.error("find preallocate mmap failed, this never happen");
            }
        } catch (InterruptedException e) {
            log.warn(this.getServiceName() + " service has exception. ", e);
        }

        return null;
    }

    // the service thread does one thing only: call mmapOperation in a loop
    public void run() {
        log.info(this.getServiceName() + " service started");

        while (!this.isStopped() && this.mmapOperation()) {

        }
        log.info(this.getServiceName() + " service end");
    }

    /**
     * Only interrupted by the external thread, will return false
     */
    private boolean mmapOperation() {
        boolean isSuccess = false;
        AllocateRequest req = null;
        try {
            req = this.requestQueue.take();
            AllocateRequest expectedRequest = this.requestTable.get(req.getFilePath());
            if (null == expectedRequest) {
                log.warn("this mmap request expired, maybe cause timeout " + req.getFilePath() + " "
                    + req.getFileSize());
                return true;
            }
            if (expectedRequest != req) {
                log.warn("never expected here, maybe cause timeout " + req.getFilePath() + " "
                    + req.getFileSize() + ", req:" + req + ", expectedRequest:" + expectedRequest);
                return true;
            }

            if (req.getMappedFile() == null) {
                long beginTime = System.currentTimeMillis();

                MappedFile mappedFile;
                if (messageStore.getMessageStoreConfig().isTransientStorePoolEnable()) {
                    try {
                        mappedFile = ServiceLoader.load(MappedFile.class).iterator().next();
                        // initialize the mappedFile, which in effect creates the commitLog file
                        mappedFile.init(req.getFilePath(), req.getFileSize(), messageStore.getTransientStorePool());
                    } catch (RuntimeException e) {
                        log.warn("Use default implementation.");
                        mappedFile = new MappedFile(req.getFilePath(), req.getFileSize(), messageStore.getTransientStorePool());
                    }
                } else {
                    mappedFile = new MappedFile(req.getFilePath(), req.getFileSize());
                }

                long elapsedTime = UtilAll.computeElapsedTimeMilliseconds(beginTime);
                if (elapsedTime > 10) {
                    int queueSize = this.requestQueue.size();
                    log.warn("create mappedFile spent time(ms) " + elapsedTime + " queue size " + queueSize
                        + " " + req.getFilePath() + " " + req.getFileSize());
                }

                // pre write mappedFile
                if (mappedFile.getFileSize() >= this.messageStore.getMessageStoreConfig()
                    .getMappedFileSizeCommitLog()
                    &&
                    this.messageStore.getMessageStoreConfig().isWarmMapedFileEnable()) {
                    mappedFile.warmMappedFile(this.messageStore.getMessageStoreConfig().getFlushDiskType(),
                        this.messageStore.getMessageStoreConfig().getFlushLeastPagesWhenWarmMapedFile());
                }

                req.setMappedFile(mappedFile);
                this.hasException = false;
                isSuccess = true;
            }
        } catch (InterruptedException e) {
            log.warn(this.getServiceName() + " interrupted, possibly by shutdown.");
            this.hasException = true;
            return false;
        } catch (IOException e) {
            log.warn(this.getServiceName() + " service has exception. ", e);
            this.hasException = true;
            if (null != req) {
                requestQueue.offer(req);
                try {
                    Thread.sleep(1);
                } catch (InterruptedException ignored) {
                }
            }
        } finally {
            if (req != null && isSuccess)
                req.getCountDownLatch().countDown();
        }
        return true;
    }

    // org.apache.rocketmq.store.MappedFile#init
    public void init(final String fileName, final int fileSize,
        final TransientStorePool transientStorePool) throws IOException {
        init(fileName, fileSize);
        this.writeBuffer = transientStorePool.borrowBuffer();
        this.transientStorePool = transientStorePool;
    }

    private void init(final String fileName, final int fileSize) throws IOException {
        this.fileName = fileName;
        this.fileSize = fileSize;
        this.file = new File(fileName);
        this.fileFromOffset = Long.parseLong(this.file.getName());
        boolean ok = false;

        ensureDirOK(this.file.getParent());

        try {
            // this is where the mmap is actually created:
            // open the fileChannel, then map it into mappedByteBuffer for all later writes
            this.fileChannel = new RandomAccessFile(this.file, "rw").getChannel();
            this.mappedByteBuffer = this.fileChannel.map(MapMode.READ_WRITE, 0, fileSize);
            TOTAL_MAPPED_VIRTUAL_MEMORY.addAndGet(fileSize);
            TOTAL_MAPPED_FILES.incrementAndGet();
            ok = true;
        } catch (FileNotFoundException e) {
            log.error("Failed to create file " + this.fileName, e);
            throw e;
        } catch (IOException e) {
            log.error("Failed to map file " + this.fileName, e);
            throw e;
        } finally {
            if (!ok && this.fileChannel != null) {
                this.fileChannel.close();
            }
        }
    }
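
The hand-off in the allocation code above (the caller submits the next file and the one after it, a background thread does the mapping, and the caller waits on a latch only for the file it needs right now) can be sketched with plain JDK classes. `PreallocateSketch`, `Request` and `allocate` are hypothetical names for illustration; error handling, the transient store pool and warm-up are deliberately left out.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class PreallocateSketch {

    static final class Request {
        final Path path;
        final int size;
        final CountDownLatch done = new CountDownLatch(1);
        volatile MappedByteBuffer buffer;   // stays null if mapping failed

        Request(Path path, int size) {
            this.path = path;
            this.size = size;
        }
    }

    private final ConcurrentMap<Path, Request> table = new ConcurrentHashMap<>();
    private final BlockingQueue<Request> queue = new LinkedBlockingQueue<>();
    private final Thread worker = new Thread(this::run, "preallocate-sketch");

    PreallocateSketch() {
        worker.setDaemon(true);
        worker.start();
    }

    /** Background loop: take one request, create the mapping, release the waiter. */
    private void run() {
        try {
            while (true) {
                Request request = queue.take();
                try (RandomAccessFile raf = new RandomAccessFile(request.path.toFile(), "rw");
                     FileChannel channel = raf.getChannel()) {
                    request.buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, request.size);
                } catch (IOException e) {
                    // Simplification: the waiter observes the failure as a null buffer.
                }
                request.done.countDown();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void submit(Path path, int size) {
        Request request = new Request(path, size);
        // Only the first submission for a path is queued; later ones reuse it.
        if (table.putIfAbsent(path, request) == null) {
            queue.offer(request);
        }
    }

    /** Requests `next` and `nextNext`, then waits only for `next`. */
    MappedByteBuffer allocate(Path next, Path nextNext, int size, long timeoutMillis)
            throws InterruptedException {
        submit(next, size);
        submit(nextNext, size);
        Request request = table.get(next);
        if (request == null || !request.done.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return null;
        }
        table.remove(next);
        return request.buffer;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("prealloc-sketch");
        PreallocateSketch service = new PreallocateSketch();
        MappedByteBuffer first = service.allocate(dir.resolve("0"), dir.resolve("4096"), 4096, 5_000);
        System.out.println("first mapped: " + (first != null));
    }
}
```

When the caller later asks for the second file, its request is usually already finished, so the wait on the latch returns immediately; that is the whole benefit of allocating one file ahead.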

  Warm-up:

    // org.apache.rocketmq.store.MappedFile#warmMappedFile
    public void warmMappedFile(FlushDiskType type, int pages) {
        long beginTime = System.currentTimeMillis();
        ByteBuffer byteBuffer = this.mappedByteBuffer.slice();
        int flush = 0;
        long time = System.currentTimeMillis();
        for (int i = 0, j = 0; i < this.fileSize; i += MappedFile.OS_PAGE_SIZE, j++) {
            byteBuffer.put(i, (byte) 0);
            // force flush when flush disk type is sync
            if (type == FlushDiskType.SYNC_FLUSH) {
                if ((i / OS_PAGE_SIZE) - (flush / OS_PAGE_SIZE) >= pages) {
                    flush = i;
                    mappedByteBuffer.force();
                }
            }

            // prevent gc
            if (j % 1000 == 0) {
                log.info("j={}, costTime={}", j, System.currentTimeMillis() - time);
                time = System.currentTimeMillis();
                try {
                    Thread.sleep(0);
                } catch (InterruptedException e) {
                    log.error("Interrupted", e);
                }
            }
        }

        // force flush when prepare load finished
        if (type == FlushDiskType.SYNC_FLUSH) {
            log.info("mapped file warm-up done, force to disk, mappedFile={}, costTime={}",
                this.getFileName(), System.currentTimeMillis() - beginTime);
            mappedByteBuffer.force();
        }
        log.info("mapped file warm-up done. mappedFile={}, costTime={}", this.getFileName(),
            System.currentTimeMillis() - beginTime);

        this.mlock();
    }
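
The core of the warm-up is simply "write one byte into every page so that the OS has to back each page with real memory before the first message arrives". A minimal JDK-only sketch of that loop follows. `WarmUpSketch`, `warm` and the fixed 4096-byte page size are assumptions for illustration, and the final mlock step is omitted because the JDK has no portable call for it.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

final class WarmUpSketch {

    static final int PAGE_SIZE = 4096; // assumed OS page size

    /**
     * Touches one byte per page of the mapping and returns the number of pages touched.
     * When flushEveryPages is positive, the mapping is forced after that many pages.
     */
    static int warm(MappedByteBuffer mapped, int flushEveryPages) {
        ByteBuffer view = mapped.slice();
        int pages = 0;
        int lastFlushedPage = 0;
        for (int offset = 0; offset < view.capacity(); offset += PAGE_SIZE) {
            // An absolute put faults the page in without moving the buffer position.
            view.put(offset, (byte) 0);
            pages++;
            if (flushEveryPages > 0 && pages - lastFlushedPage >= flushEveryPages) {
                mapped.force();
                lastFlushedPage = pages;
            }
        }
        mapped.force();
        return pages;
    }

    /** Maps `size` bytes of the given file in READ_WRITE mode. */
    static MappedByteBuffer map(Path path, int size) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw");
             FileChannel channel = raf.getChannel()) {
            return channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("warm-up-sketch", ".log");
        path.toFile().deleteOnExit();
        MappedByteBuffer mapped = map(path, 3 * PAGE_SIZE);
        System.out.println("pages touched: " + warm(mapped, 2));
    }
}
```

Paying the page-fault cost up front in a background thread keeps it off the latency-sensitive write path.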

  With that, the whole way rocketmq uses mappedFile is clear.

5. How mappedFile performs in a benchmark

  Let's benchmark it with JMH.

    import java.io.File;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.io.RandomAccessFile;
    import java.nio.ByteBuffer;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.util.concurrent.TimeUnit;

    import org.apache.commons.io.FileUtils;
    import org.junit.Before;
    import org.junit.Test;
    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.BenchmarkMode;
    import org.openjdk.jmh.annotations.Mode;
    import org.openjdk.jmh.annotations.OutputTimeUnit;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.Setup;
    import org.openjdk.jmh.annotations.State;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.RunnerException;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    @State(Scope.Benchmark)
    public class MmapFileBenchmarkTest {

        public static void main(String[] args) throws RunnerException {
            Options opt = new OptionsBuilder()
                .include(MmapFileBenchmarkTest.class.getSimpleName())
                // .include(BenchMarkUsage.class.getSimpleName()+".*measureThroughput*")
                // 3 warm-up iterations
                .warmupIterations(3)
                // 5 measurement iterations
                .measurementIterations(5)
                .forks(1)
                .build();
            new Runner(opt).run();
        }

        private FileChannel fileChannel;

        private MappedByteBuffer mappedByteBuffer;

        private OutputStream outputStream;

        // each benchmark invocation writes this many 11-byte lines
        private int maxWriteLines = 1_000_000;

        // only bounds the (disabled) warm-up loop; the mapping below is larger
        private int fileSize = 102400000;

        @Setup
        @Before
        public void setup() throws IOException {
            File file1 = new File("/tmp/t_mappedFileTest.txt");
            this.fileChannel = new RandomAccessFile(file1, "rw").getChannel();
            this.mappedByteBuffer = this.fileChannel.map(FileChannel.MapMode.READ_WRITE,
                0, 1024000000);
            // warm-up is skipped on purpose
            // warmMappedFile();
            // FileUtils.openOutputStream returns a plain, unbuffered FileOutputStream
            outputStream = FileUtils.openOutputStream(new File("/tmp/t_normalFileTest.txt"));
        }

        private void warmMappedFile() {
            long beginTime = System.currentTimeMillis();
            ByteBuffer byteBuffer = this.mappedByteBuffer.slice();
            int flush = 0;
            long time = System.currentTimeMillis();
            for (int i = 0, j = 0; i < this.fileSize; i += 4096, j++) {
                byteBuffer.put(i, (byte) 0);

                // prevent gc
                if (j % 1000 == 0) {
                    logInfo("j=%s, costTime=%d", j, System.currentTimeMillis() - time);
                    time = System.currentTimeMillis();
                    try {
                        Thread.sleep(0);
                    } catch (InterruptedException e) {
                        logInfo("Interrupted, %s", e);
                    }
                }
            }
            // force flush when prepare load finished
            mappedByteBuffer.force();
            // this.mlock();
        }

        private void logInfo(String message, Object... args) {
            System.out.println(String.format(message, args));
        }

        @Benchmark
        @BenchmarkMode(Mode.Throughput)
        @OutputTimeUnit(TimeUnit.SECONDS)
        @Test
        public void testAppendMappedFile() throws IOException {
            for (int i = 0; i < maxWriteLines; i++) {
                mappedByteBuffer.put("abc1234567\n".getBytes());
            }
            // Rewind so the next invocation starts at offset 0 instead of overflowing
            // the mapping. Note that each invocation therefore overwrites the same
            // region, while the stream benchmark below keeps appending.
            mappedByteBuffer.clear();
        }

        @Benchmark
        @BenchmarkMode(Mode.Throughput)
        @OutputTimeUnit(TimeUnit.SECONDS)
        @Test
        public void testAppendNormalFile() throws IOException {
            for (int i = 0; i < maxWriteLines; i++) {
                outputStream.write("abc1234567\n".getBytes());
            }
            outputStream.flush();
        }

    }

  The results are as follows:

    # Run progress: 0.00% complete, ETA 00:00:16
    # Fork: 1 of 1
    # Warmup Iteration   1: 14.808 ops/s
    # Warmup Iteration   2: 16.170 ops/s
    # Warmup Iteration   3: 18.633 ops/s
    Iteration   1: 15.692 ops/s
    Iteration   2: 17.273 ops/s
    Iteration   3: 18.145 ops/s
    Iteration   4: 18.356 ops/s
    Iteration   5: 18.868 ops/s

    Result "MmapFileBenchmarkTest.testAppendMappedFile":
      17.667 ±(99.9%) 4.795 ops/s [Average]
      (min, avg, max) = (15.692, 17.667, 18.868), stdev = 1.245
      CI (99.9%): [12.871, 22.462] (assumes normal distribution)

    # JMH version: 1.19
    # VM version: JDK 1.8.0_121, VM 25.121-b13
    # Warmup: 3 iterations, 1 s each
    # Measurement: 5 iterations, 1 s each
    # Timeout: 10 min per iteration
    # Threads: 1 thread, will synchronize iterations
    # Benchmark mode: Throughput, ops/time
    # Benchmark: MmapFileBenchmarkTest.testAppendNormalFile

    # Run progress: 50.00% complete, ETA 00:00:09
    # Fork: 1 of 1
    # Warmup Iteration   1: 0.443 ops/s
    # Warmup Iteration   2: 0.456 ops/s
    # Warmup Iteration   3: 0.438 ops/s
    Iteration   1: 0.406 ops/s
    Iteration   2: 0.430 ops/s
    Iteration   3: 0.408 ops/s
    Iteration   4: 0.399 ops/s
    Iteration   5: 0.410 ops/s

    Result "MmapFileBenchmarkTest.testAppendNormalFile":
      0.411 ±(99.9%) 0.044 ops/s [Average]
      (min, avg, max) = (0.399, 0.411, 0.430), stdev = 0.011
      CI (99.9%): [0.367, 0.454] (assumes normal distribution)

    # Run complete. Total time: 00:00:29

    Benchmark                                    Mode  Cnt   Score   Error  Units
    MmapFileBenchmarkTest.testAppendMappedFile  thrpt    5  17.667 ± 4.795  ops/s
    MmapFileBenchmarkTest.testAppendNormalFile  thrpt    5   0.411 ± 0.044  ops/s

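  In this run the mapped-file append reached about 17.7 ops/s against about 0.41 ops/s for the plain stream, roughly 43 times faster, where one op is one invocation of the benchmark method, that is, 1,000,000 writes of 11 bytes each. Keep in mind that the plain stream is unbuffered, so every write is a system call. The conclusion is rough and for reference only!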
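
Part of the gap likely comes from the baseline itself: an unbuffered stream issues one system call per 11-byte write. A hedged way to check how much of the difference that accounts for is to add a buffered baseline. The sketch below is plain Java with System.nanoTime rather than JMH, so its timings are only indicative; `BufferedBaselineSketch`, `writeLines` and the 64 KB buffer size are assumptions for illustration, and no results are claimed here.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class BufferedBaselineSketch {

    private static final byte[] LINE = "abc1234567\n".getBytes(StandardCharsets.UTF_8);

    /** Writes `lines` copies of LINE through `out`, flushes, and returns elapsed nanoseconds. */
    static long writeLines(OutputStream out, int lines) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < lines; i++) {
            out.write(LINE);
        }
        out.flush();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        int lines = 1_000_000;
        Path raw = Files.createTempFile("raw-baseline", ".txt");
        Path buffered = Files.createTempFile("buffered-baseline", ".txt");
        // Unbuffered: every write(LINE) goes straight to the OS.
        try (OutputStream out = Files.newOutputStream(raw)) {
            System.out.printf("unbuffered: %d ms%n", writeLines(out, lines) / 1_000_000);
        }
        // Buffered: writes are collected in a 64 KB array and handed to the OS in large chunks.
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(buffered), 64 * 1024)) {
            System.out.printf("buffered:   %d ms%n", writeLines(out, lines) / 1_000_000);
        }
        Files.delete(raw);
        Files.delete(buffered);
    }
}
```

If the buffered variant closes most of the gap, the benchmark is mostly measuring system-call overhead rather than mmap itself, which is another reason to treat the numbers above as a rough reference.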
