Java I/O theory at the system level
References: JAVA NIO之浅谈内存映射文件原理与DirectMemory (a brief discussion of memory-mapped file principles and DirectMemory in Java NIO)
Java NIO 2.0 : Memory-Mapped Files | MappedByteBuffer Tutorial
How Java I/O Works Internally at Lower Level?
1. Java I/O theory at the lower system level
This post assumes you are familiar with basic Java I/O operations.
Here are the contents:
- Buffer Handling and Kernel vs User Space
- Virtual Memory
- Memory Paging
- File I/O
- File Locking
- Stream Oriented I/O
1.1 Buffer Handling and Kernel vs User Space
Buffers, and how they are handled, are the basis of all I/O. "Input/output" means nothing more than moving data into a buffer in user space from somewhere, or out of that buffer to somewhere else.
Typically, a process sends an I/O request to the OS asking either that the data in a user space buffer be drained into a buffer in kernel space (a write operation) or that a user space buffer be filled (a read operation), and the OS performs an incredibly complex transfer. Here is the data flow diagram:
The image above shows a simplified "logical" diagram of how block data moves from an external device, such as a hard disk, to user space memory. First, the process issues the read() system call; the kernel catches the call and issues a command to the disk controller to fetch the data from disk. The disk controller writes the data directly into a kernel memory buffer by DMA. The kernel then copies the data from that temporary buffer in kernel space into the buffer in user space. A write operation follows the same path in the opposite direction.
After the first read operation, the kernel caches and/or prefetches data, so the data you request next may already be in kernel space. If you read a big file three times, the second and third reads are usually far faster than the first. Here is an example:
    import java.io.BufferedInputStream;
    import java.io.BufferedOutputStream;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    // Both methods are assumed to live inside an enclosing class.
    static void CpFileStreamIO() throws IOException {
        String inFileStr = "***kimchi_v2.pdf";
        String outFileStr = "./kimchi_v2.pdf";
        int repeatedTimes = 5;
        System.out.println("Using Buffered Stream");
        // Both streams are opened once and shared by every pass of the loop.
        try (BufferedInputStream in = new BufferedInputStream(new FileInputStream(inFileStr));
             BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(outFileStr))) {
            for (int i = 0; i < repeatedTimes; i++) {
                copyFile(in, out);
            }
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }

    // Copies byte by byte from the current position of 'in' until end of stream,
    // then prints how long the pass took.
    private static void copyFile(BufferedInputStream in, BufferedOutputStream out) throws IOException {
        long startTime = System.nanoTime();
        int nextByte; // the byte just read (0-255), or -1 at end of stream
        while ((nextByte = in.read()) != -1) {
            out.write(nextByte);
        }
        long elapsedTime = System.nanoTime() - startTime;
        System.out.println("Elapsed time is " + (elapsedTime / 1000000.0) + " msec");
    }
First we create a BufferedInputStream and a BufferedOutputStream, then we copy the file's bytes from the BufferedInputStream to the BufferedOutputStream 5 times. Here is the console output:
Using Buffered Stream
Elapsed time is 85.175 msec
Elapsed time is 0.005 msec
Elapsed time is 0.003 msec
Elapsed time is 0.003 msec
Elapsed time is 0.004 msec
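In this example, the first pass takes far longer because the kernel needs to read the data from the hard disk. Note, however, that the streams are opened only once: after the first pass the input stream is already at end of file, so in.read() returns -1 immediately and the later passes copy nothing. Their near-zero times therefore reflect an exhausted stream rather than the kernel cache. The caching effect itself is real: when a file is reopened and read again, user space can usually be served from buffers that are already in kernel space.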
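To observe the kernel cache rather than an exhausted stream, each pass has to open the file again. The following is a minimal sketch of such a measurement, not the original benchmark: the class name RepeatedReadBenchmark and the method readAll are made up for illustration, and the file path is taken from the command line. Timings depend heavily on the OS, the disk, and whether the file was read recently, so no expected output is shown.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class RepeatedReadBenchmark {

    // Opens the file afresh, reads it to the end, and returns the number of bytes read.
    static long readAll(Path file) throws IOException {
        byte[] buffer = new byte[4 * 1024]; // 4 KB user space buffer
        long total = 0;
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: RepeatedReadBenchmark <path-to-large-file>");
            return;
        }
        Path file = Paths.get(args[0]);
        for (int pass = 1; pass <= 5; pass++) {
            long start = System.nanoTime();
            long bytes = readAll(file); // a new stream is opened on every pass
            double millis = (System.nanoTime() - start) / 1_000_000.0;
            System.out.println("Pass " + pass + ": " + bytes + " bytes in " + millis + " msec");
        }
    }
}
```

For the first pass to be a genuinely cold read, pick a file that has not been read or written recently; a file that was just created is probably still in the kernel cache.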
1.2 Virtual Memory
Virtual memory means that processes use virtual addresses in place of physical memory (RAM or other internal storage) addresses. Virtual memory brings 2 advantages:
1. More than one virtual address can be mapped to the same physical memory location, which reduces redundant copies of data in memory.
2. The virtual memory space can be larger than the physical memory actually available. For example, a user process can allocate 4 GB of memory even if the machine has only 1 GB of RAM.
So, to transfer data between user space and kernel space without copying, we can map a virtual address in kernel space to the same physical address as a virtual address in user space. The DMA hardware (which can access only physical memory addresses) can then fill a buffer that is simultaneously visible to both the kernel and a user space process. This eliminates copies between kernel and user space, but the kernel and user buffers must share the same page alignment. Buffers must also be a multiple of the block size used by the disk controller (usually 512 bytes). Both virtual and physical memory are divided into pages, and the virtual and physical memory page sizes are always the same.
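In Java, the closest application-level counterpart of this idea is the memory-mapped file, exposed through FileChannel.map and MappedByteBuffer. The sketch below is only an illustration under assumptions: the class name MappedReadSketch, the method readMapped, and the temporary file are invented for the example, and how much copying the mapping actually avoids depends on the operating system.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedReadSketch {

    // Maps the whole file read-only into the process's virtual address space
    // and decodes its bytes as UTF-8. Only suitable for files under 2 GB,
    // because a single mapping is limited to Integer.MAX_VALUE bytes.
    static String readMapped(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return StandardCharsets.UTF_8.decode(mapped).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical input: a small temporary file created just for this demo.
        Path file = Files.createTempFile("mapped-demo", ".txt");
        file.toFile().deleteOnExit();
        Files.write(file, "hello, mapped file".getBytes(StandardCharsets.UTF_8));
        System.out.println(readMapped(file)); // prints "hello, mapped file"
    }
}
```

The program never calls read() on the file; the bytes are accessed through the buffer, and the operating system pages them in on demand.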
1.3 Memory Paging
Aligning memory page sizes as multiples of the disk block size allows the kernel to command the disk controller directly to write memory pages back to disk and reload them from disk. All disk I/O is done at the page level. Modern CPUs contain a subsystem known as the Memory Management Unit (MMU). This device logically sits between the CPU and physical memory, and it holds the mapping information needed to translate virtual addresses to physical memory addresses.
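The first step of that translation is plain arithmetic: a virtual address is split into a page number, which the MMU looks up, and an offset within the page, which is carried over unchanged. The sketch below shows only this split, assuming a 4096-byte page size (common, but not universal); the class name PageMath and its methods are hypothetical, and the lookup of the physical page is not modelled.

```java
class PageMath {
    static final int PAGE_SIZE = 4096; // assumed page size in bytes
    static final int PAGE_SHIFT = Integer.numberOfTrailingZeros(PAGE_SIZE); // 12 for 4096

    // Index of the virtual page that contains the address.
    static long pageNumber(long virtualAddress) {
        return virtualAddress >>> PAGE_SHIFT;
    }

    // Position of the address inside its page.
    static long pageOffset(long virtualAddress) {
        return virtualAddress & (PAGE_SIZE - 1);
    }

    public static void main(String[] args) {
        long address = 0x12345L;
        // prints "page 18, offset 837" (0x12 and 0x345)
        System.out.println("page " + pageNumber(address) + ", offset " + pageOffset(address));
    }
}
```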
1.4 File I/O
File I/O always occurs within the context of a filesystem. A filesystem is a quite different concept from a disk: it is a higher level of abstraction, a particular method of arranging and interpreting data. Our processes always interact with the filesystem, not with the disk directly. The filesystem defines the concepts of names, paths, files, directories, and other abstract objects.
A filesystem organizes (on the hard disk) a sequence of uniformly sized data blocks. Some blocks store meta information such as inodes, and others store the real data. Filesystem page sizes range from 2KB to *KB and are multiples of the memory page size.
Here is how a file is located and loaded from the filesystem:
1. Determine which filesystem pages are needed, according to the path of the file. For example, a path with the prefix "/root" means the file is looked up on the disk mounted at the "/root" mount point.
2. Allocate enough memory pages in kernel space to hold the identified filesystem pages.
3. Establish mappings between those memory pages and the filesystem pages stored on disk.
4. When instructions running on the CPU touch a virtual address whose page the MMU finds is not yet in memory, the CPU raises a page fault for each such memory page.
5. The operating system (Linux here) handles the page faults by allocating more pages to the process, filling those pages with data from disk, and configuring the MMU; the CPU then continues its work.
Filesystem data is cached like other memory pages. On subsequent I/O requests, some or all of the file data may still be present in physical memory and can be reused without rereading it from disk. This is the effect the file copying example in 1.1 sets out to show.
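Step 1 of the list above includes a small calculation once the file is found: which filesystem pages a request for a given byte range touches. The sketch below works that out under the assumption of a 4096-byte filesystem page; the class name FilePageSpan and its methods are hypothetical and only illustrate the arithmetic, not how any particular kernel implements it.

```java
class FilePageSpan {
    static final long FS_PAGE_SIZE = 4096; // assumed filesystem page size in bytes

    // Index of the filesystem page holding the first requested byte.
    static long firstPage(long offset) {
        return offset / FS_PAGE_SIZE;
    }

    // Index of the filesystem page holding the last requested byte (length must be > 0).
    static long lastPage(long offset, long length) {
        return (offset + length - 1) / FS_PAGE_SIZE;
    }

    // Number of filesystem pages the request spans; an empty request spans none.
    static long pageCount(long offset, long length) {
        if (length <= 0) {
            return 0;
        }
        return lastPage(offset, length) - firstPage(offset) + 1;
    }

    public static void main(String[] args) {
        long offset = 5000;
        long length = 10000;
        // prints "pages 1..3 (3 pages)"
        System.out.println("pages " + firstPage(offset) + ".." + lastPage(offset, length)
                + " (" + pageCount(offset, length) + " pages)");
    }
}
```

A request of 10000 bytes starting at offset 5000 does not line up with page boundaries, so the kernel has to bring in three whole pages even though fewer than three pages of data were asked for.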
1.4.1 File Locking
File locking is a scheme by which one process can prevent other processes from accessing a file, or restrict how they access it.
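In Java this scheme is available through FileChannel.lock and FileChannel.tryLock. The following is a minimal sketch, assuming an exclusive lock on the whole file is wanted; the class name FileLockSketch and the helper withExclusiveLock are invented for the example. Keep in mind that Java file locks are held on behalf of the whole JVM, and that whether a lock actually blocks other programs is system-dependent.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class FileLockSketch {

    // Tries to take an exclusive lock on the whole file without blocking.
    // Runs the action while the lock is held and returns true, or returns
    // false if another process already holds a conflicting lock.
    static boolean withExclusiveLock(Path file, Runnable action) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE);
             FileLock lock = channel.tryLock()) { // null when the lock is unavailable
            if (lock == null) {
                return false;
            }
            action.run();
            return true;
        } // the lock is released, then the channel is closed
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical input: a temporary file created just for this demo.
        Path file = Files.createTempFile("lock-demo", ".txt");
        file.toFile().deleteOnExit();
        boolean locked = withExclusiveLock(file,
                () -> System.out.println("holding the lock"));
        System.out.println("lock acquired: " + locked);
    }
}
```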
1.5 Stream I/O
Not all I/O is block-oriented. There is also stream I/O, which is modeled on a pipeline. The bytes of an I/O stream must be accessed sequentially. TTY (console) devices, printer ports, and network connections are common examples of streams.
Streams are generally, but not necessarily, slower than block devices and are often the source of intermittent input. Most operating systems allow streams to be placed into non-blocking mode, which permits a process to check whether input is available on the stream without getting stuck waiting for it.
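Java NIO exposes non-blocking mode through SelectableChannel.configureBlocking(false). The sketch below uses an in-process java.nio.channels.Pipe as a stand-in for a real stream such as a socket, which is an assumption made to keep the example self-contained; the class name NonBlockingReadSketch and the method readAvailable are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

class NonBlockingReadSketch {

    // Reads whatever is available right now. In non-blocking mode a result
    // of 0 means "no input yet", while -1 would mean end of stream.
    static int readAvailable(Pipe.SourceChannel source, ByteBuffer buffer) throws IOException {
        return source.read(buffer);
    }

    public static void main(String[] args) throws IOException {
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false); // reads now return immediately
        ByteBuffer buffer = ByteBuffer.allocate(64);

        // Nothing has been written yet, so this prints 0 instead of blocking.
        System.out.println("before write: " + readAvailable(pipe.source(), buffer));

        pipe.sink().write(ByteBuffer.wrap("ping".getBytes(StandardCharsets.US_ASCII)));

        // Usually the 4 bytes are visible at once, but a non-blocking read
        // may still report 0 if the data has not arrived yet.
        System.out.println("after write: " + readAvailable(pipe.source(), buffer));

        pipe.sink().close();
        pipe.source().close();
    }
}
```

In blocking mode the first read would simply hang, because no data ever arrives before it; in non-blocking mode the process is free to do other work and check again later.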
Another ability of streams is readiness selection. This is similar to non-blocking mode, but it offloads to the operating system the check for whether a stream is ready. The operating system can be told to watch a collection of streams and return an indication to the process of which streams are ready. This ability permits a process to multiplex many active streams using common code and a single thread by leveraging the readiness information returned by the operating system. This is widely used in network servers to handle large numbers of network connections. Readiness selection is essential for high-volume scaling.
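In Java NIO, readiness selection is provided by java.nio.channels.Selector. The sketch below registers two in-process pipes with one selector and lets a single thread find out which of them has data; the pipes stand in for network connections, and the class name ReadinessSelectionSketch and the method drainReady are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

class ReadinessSelectionSketch {

    // Waits up to timeoutMillis for any registered channel to become readable,
    // reads once from each ready channel, and returns the total bytes read.
    static int drainReady(Selector selector, long timeoutMillis) throws IOException {
        if (selector.select(timeoutMillis) == 0) {
            return 0; // timed out: no channel is ready
        }
        int total = 0;
        Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
        while (keys.hasNext()) {
            SelectionKey key = keys.next();
            keys.remove(); // the selected-key set is not cleared automatically
            if (key.isReadable()) {
                ReadableByteChannel channel = (ReadableByteChannel) key.channel();
                int n = channel.read(ByteBuffer.allocate(256));
                if (n > 0) {
                    total += n;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open()) {
            Pipe first = Pipe.open();
            Pipe second = Pipe.open();
            // Channels must be non-blocking before they can be registered.
            first.source().configureBlocking(false);
            second.source().configureBlocking(false);
            first.source().register(selector, SelectionKey.OP_READ);
            second.source().register(selector, SelectionKey.OP_READ);

            // Only the second pipe receives data; the selector reports just that one.
            second.sink().write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.US_ASCII)));
            System.out.println("bytes read: " + drainReady(selector, 1000));

            first.sink().close();
            first.source().close();
            second.sink().close();
            second.source().close();
        }
    }
}
```

A network server follows the same pattern with SocketChannel instances in place of the pipes, calling the selection loop repeatedly from one thread.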