IO的详细解释:It's all about buffers: zero-copy, mmap and Java NIO
There are use cases where data need to be read from source to a sink without modification. In code this might look quite simple: for example in Java, you may read data from one InputStream
chunk by chunk into a small buffer (typically 8KB), and feed them into the OutputStream
, or even better, you could create a PipedInputStream
, which is basically just a util that maintains that buffer for you. However, if low latency is crucial to your software, this might be quite expensive from the OS perspective and I shall explain.
What happens under the hood
Well, here’s what happens when the above code is used:
- JVM sends read() syscall.
- OS context switches to kernel mode and reads data into the input socket buffer.
- OS kernel then copies data into user buffer, and context switches back to user mode. read() returns.
- JVM processes code logic and sends write() syscall.
- OS context switches to kernel mode and copies data from user buffer to output socket buffer.
- OS returns to user mode and logic in JVM continues.
This would be fine if latency and throughput aren’t your service’s concern or bottleneck, but it would be annoying if you do care, say for a static asset server. There are 4 context switches and 2 unnecessary copies for the above example.
OS-level zero copy for the rescue
Clearly in this use case, the copy from/to user space memory is totally unnecessary because we didn’t do anything other than dumping data to a different socket. Zero copy can thus be used here to save the 2 extra copies. The actual implementation doesn’t really have a standard and is up to the OS how to achieve that. Typically *nix systems will offer sendfile()
. Its man page can be found here. Some say some operating systems have broken versions of that with one of them being OSX link. Honestly with such low-level feature, I wouldn’t trust Apple’s BSD-like system so never tested there.
With that, the diagram would be like this:
You may say OS still has to make a copy of the data in kernel memory space. Yes but from OS’s perspective this is already zero-copy because there’s no data copied from kernel space to user space. The reason why kernel needs to make a copy is because general hardware DMA access expects consecutive memory space (and hence the buffer). However this is avoidable if the hardware supports scatter-n-gather:
A lot of web servers do support zero-copy such as Tomcat and Apache. For example apache’s related doc can be found here but by default it’s off.
Note: Java’s NIO offers this through transferTo
(doc).
mmap
The problem with the above zero-copy approach is that because there’s no user mode actually involved, code cannot do anything other than piping the stream. However, there’s a more expensive yet more useful approach - mmap, short for memory-map.
Mmap allows code to map file to kernel memory and access that directly as if it were in the application user space, thus avoiding the unnecessary copy. As a tradeoff, that will still involve 4 context switches. But since OS maps certain chunk of file into memory, you get all benefits from OS virtual memory management - hot content can be intelligently cached efficiently, and all data are page-aligned thus no buffer copying is needed to write stuff back.
However, nothing comes for free - while mmap does avoid that extra copy, it doesn’t guarantee the code will always be faster - depending on the OS implementation, there may be quite a bit of setup and teardown overhead (since it needs to find the space and maintain it in the TLB and make sure to flush it after unmapping) and page fault gets much more expensive since kernel now needs to read from hardware (like disk) to update the memory space and TLB. Hence, if performance is this critical, benchmark is always needed as abusing mmap() may yield worse performance than simply doing the copy.
The corresponding class in Java is MappedByteBuffer
from NIO package. It’s actually a variation of DirectByteBuffer
though there’s no direct relationship between classes. The actual usage is out of scope of this post.
NIO DirectByteBuffer
Java NIO introduces ByteBuffer
which represents the buffer area used for channels. There are 3 main implementations of ByteBuffer
:
HeapByteBuffer
This is used when
ByteBuffer.allocate()
is called. It’s called heap because it’s maintained in JVM’s heap space and hence you get all benefits like GC support and caching optimization. However, it’s not page aligned, which means if you need to talk to native code through JNI, JVM would have to make a copy to the aligned buffer space.DirectByteBuffer
Used when
ByteBuffer.allocateDirect()
is called. JVM will allocate memory space outside the heap space usingmalloc()
. Because it’s not managed by JVM, your memory space is page-aligned and not subject to GC, which makes it perfect candidate for working with native code (e.g. when writing OpenGL stuff). However, you are then “deteriorated” to C programmer as you’ll have to allocate and deallocate memory yourself to prevent memory leak.MappedByteBuffer
Used when
FileChannel.map()
is called. Similar toDirectByteBuffer
this is also outside of JVM heap. It essentially functions as a wrapper around OS mmap() system call in order for code to directly manipulate mapped physical memory data.
Conclusion
sendfile()
and mmap()
offer efficient, low-latency low-level solutions to data manipulation across sockets. Again, no code should assume these are silver bullets as real world scenarios may be complex and it might not be worth the effort to switch code to them if this is not the true bottleneck. For software engineering to get the most ROI, in most cases, it’s better to “make it right” and then “make it fast”. Without the guardrails offered by JVM, it’s easy to make software much more vulnerable to crashing (I literally mean crashing, not exceptions) when it comes to complicated logic.
https://xunnanxu.github.io/2016/09/10/It-s-all-about-buffers-zero-copy-mmap-and-Java-NIO/
IO的详细解释:It's all about buffers: zero-copy, mmap and Java NIO的更多相关文章
- Java NIO和IO的区别(转)
原文链接:Java NIO和IO的区别 下表总结了Java NIO和IO之间的主要差别,我会更详细地描述表中每部分的差异. 复制代码代码如下: IO NIO面向流 ...
- .htaccess语法之RewriteCond与RewriteRule指令格式详细解释
htaccess语法之RewriteCond与RewriteRule指令格式详细解释 (2012-11-09 18:09:08) 转载▼ 标签: htaccess it 分类: 网络 上文htacc ...
- cookie的详细解释
突然看到网页上中英文切换的效果,不明白怎么弄得查了查 查到了cookie 并且附有详细解释 就copy留作 以后温习 http://blog.csdn.net/xidor/article/detail ...
- tar命令的详细解释
tar命令的详细解释 标签: linuxfileoutputbashinputshell 2010-05-04 12:11 235881人阅读 评论(12) 收藏 举报 分类: linux/unix ...
- Linux学习笔记15——GDB 命令详细解释【转】
GDB 命令详细解释 Linux中包含有一个很有用的调试工具--gdb(GNU Debuger),它可以用来调试C和C++程序,功能不亚于Windows下的许多图形界面的调试工具. 和所有常用的调试工 ...
- C语言 - 结构体(struct)比特字段(:) 详细解释
结构体(struct)比特字段(:) 详细解释 本文地址: http://blog.csdn.net/caroline_wendy/article/details/26722511 结构体(struc ...
- 姿势体系结构的详细解释 -- C
我基本上总结出以下4部分: 1.问题的足迹大小. 2.字节对齐问题. 3.特别保留位0. 4.这种结构被存储在存储器中的位置. #include <stdio.h> #include &l ...
- Java - 面向对象(object oriented)计划 详细解释
面向对象(object oriented)计划 详细解释 本文地址: http://blog.csdn.net/caroline_wendy/article/details/24058107 程序包括 ...
- 设计模式 - 迭代模式(iterator pattern) Java 迭代器(Iterator) 详细解释
迭代模式(iterator pattern) Java 迭代器(Iterator) 详细解释 本文地址: http://blog.csdn.net/caroline_wendy 參考迭代器模式(ite ...
随机推荐
- Puppeteer之爬虫入门
译者按: 本文通过简单的例子介绍如何使用Puppeteer来爬取网页数据,特别是用谷歌开发者工具获取元素选择器值得学习. 原文: A Guide to Automating & Scrapin ...
- blfs(systemd版本)学习笔记-构建gnome桌面系统后的配置及安装的应用
我的邮箱地址:zytrenren@163.com欢迎大家交流学习纠错! 一.构建安装ibus-libpinyin的笔记地址:https://www.cnblogs.com/renren-study-n ...
- Java 开源博客 Solo 1.9.0 发布 - 新皮肤
这个版本主要是改进了评论模版机制,让大家更方便皮肤制作,并发布了一款新皮肤:9IPHP. Solo 是一款一个命令就能搭建好的 Java 开源博客系统,并内置了 15+ 套精心制作的皮肤.除此之外,S ...
- 23.Odoo产品分析 (三) – 人力资源板块(4) – 招聘流程(1)
查看Odoo产品分析系列--目录 安装招聘流程模块: 可以看到我们在前面的章节中设置的"生产经理"岗位,和其他的看板视图一样,每一个岗位板块提供了各种便捷的操作入口和颜色设置. ...
- pyinstaller打包错误 UnicodeDecodeError: 'gbk' codec can't decode byte 0xab in position 160:
注:我的博客原本在CSDN,现转到博客园,图片采用以前的图片,并没有盗图. 在将.py文件打包时,出现了下列错误 >>C:\Users\小呆\PycharmProjects\pycharm ...
- leetcode-58.最后一个单词的长度
leetcode-58.最后一个单词的长度 题意 给定一个仅包含大小写字母和空格 ' ' 的字符串,返回其最后一个单词的长度. 如果不存在最后一个单词,请返回 0 . 说明:一个单词是指由字母组成,但 ...
- JavascriptDom编程艺术(笔记)
如果想快速学习dom的话,建议去菜鸟教程,比较浅显易懂,实战性较强.我是看纸质的书,主要是花钱,心疼,所以看完,容易记住. 1.重点: .变量 -.var修饰 -.赋值,用=号,例如ver age = ...
- Centos7开启ssh免密码登录
1.输入命令:cd .ssh进入rsa公钥私钥目录(清空旧秘钥) 2.在当前目录下执行ssh-keygen -t rsa,三次回车后生成新的公钥(id_rsa.pub)私钥(id_rsa)文件(每个节 ...
- spring4笔记----web.xml中2.4以上版本Listener的配置
基本上没用过Servlet2.4以下版本,所以2.4版本以下不必学了 <?xml version="1.0" encoding="UTF-8"?> ...
- 自动化测试基础篇--Selenium简单的163邮箱登录实例
摘自https://www.cnblogs.com/sanzangTst/p/7472556.html 前面几篇内容一直讲解Selenium Python的基本使用方法.学习了什么是selenium: ...