What is Copy-on-write?

Copy-on-write

     Copy-on-write (sometimes referred to as "COW") is an optimization strategy used in computer programming. The fundamental idea is that if multiple callers ask for resources which are initially indistinguishable, you can give them pointers to the same resource.
This fiction can be maintained until a caller tries to modify its "copy" of the resource, at which point a true private copy is created to prevent the changes from becoming visible to everyone else. All of this happens transparently to the callers. The primary
advantage is that if a caller never makes any modifications, no private copy need ever be created.
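
     The idea can be illustrated with a minimal sketch of a copy-on-write container. The names CowList, _Shared and shares_storage_with are invented for this illustration and do not come from any library; the sketch assumes single-threaded use.

```python
class _Shared:
    """Backing storage plus a count of how many CowList objects point at it."""

    def __init__(self, items):
        self.items = items
        self.refs = 1


class CowList:
    """Hypothetical list wrapper whose copies share storage until one is written."""

    def __init__(self, items=()):
        self._shared = _Shared(list(items))

    def copy(self):
        # Hand out a second pointer to the same storage; nothing is duplicated.
        clone = CowList.__new__(CowList)
        clone._shared = self._shared
        self._shared.refs += 1
        return clone

    def _detach(self):
        # Make a true private copy only if someone else still shares the storage.
        if self._shared.refs > 1:
            self._shared.refs -= 1
            self._shared = _Shared(list(self._shared.items))

    def __getitem__(self, index):
        return self._shared.items[index]

    def __len__(self):
        return len(self._shared.items)

    def __setitem__(self, index, value):
        self._detach()
        self._shared.items[index] = value

    def append(self, value):
        self._detach()
        self._shared.items.append(value)

    def shares_storage_with(self, other):
        return self._shared is other._shared


if __name__ == "__main__":
    a = CowList([1, 2, 3])
    b = a.copy()                      # cheap: both refer to one list
    print(a.shares_storage_with(b))
    b[0] = 99                         # first write triggers the private copy
    print(a[0], b[0], a.shares_storage_with(b))
```

     A caller that only reads through its copy never pays for duplication, which is the advantage described above.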

Copy-on-write in virtual memory

     Copy-on-write finds its main use in virtual memory operating systems. When a process creates a copy of itself (as with fork on Unix-like systems), the pages in memory that might be modified by either the process or its copy are marked copy-on-write. When one process modifies the memory,
the operating system's kernel intercepts the operation and copies the affected page so that changes in one process's memory are not visible to the other.
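
     The visible effect can be observed from user space. The following is a small sketch, assuming a POSIX system where os.fork is available; the function name fork_and_modify is made up for this example. It shows only the isolation that callers see. Whether the kernel actually defers the copy is an implementation detail that this program cannot observe.

```python
import os


def fork_and_modify():
    """Fork, let the child overwrite a buffer, and return both processes' views."""
    data = bytearray(b"original")
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: change its copy of the buffer and report what it now sees.
        try:
            os.close(read_fd)
            data[:] = b"modified"
            os.write(write_fd, bytes(data))
            os.close(write_fd)
        finally:
            os._exit(0)  # leave immediately without running the parent's cleanup code

    # Parent: collect the child's view, then look at its own buffer.
    os.close(write_fd)
    child_view = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return bytes(data), child_view


if __name__ == "__main__":
    parent_view, child_view = fork_and_modify()
    print("parent sees:", parent_view)
    print("child saw:  ", child_view)
```

     The child's write changes only the child's memory; the parent's buffer keeps its original contents.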

     Another use is in the calloc function. This can be implemented by having a page of physical memory filled with zeroes. When the memory is allocated, the pages returned all refer to the page of zeroes and are all marked as copy-on-write. This way, the amount
of physical memory allocated for the process does not increase until data is written. This is typically only done for larger allocations.
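
     A related effect can be seen with an anonymous memory mapping, which the operating system hands back zero-filled. The sketch below uses Python's mmap module rather than calloc itself, and the helper name allocate_zeroed is an assumption of this example. On typical systems physical pages are committed only as they are touched, but that behaviour depends on the platform and is not verified here.

```python
import mmap

REGION_SIZE = 16 * 1024 * 1024  # 16 MiB, large enough to span many pages


def allocate_zeroed(size):
    """Return an anonymous, zero-filled mapping of the requested size."""
    # fileno -1 requests anonymous memory that is not backed by a file.
    return mmap.mmap(-1, size)


if __name__ == "__main__":
    region = allocate_zeroed(REGION_SIZE)
    # Reads see zeroes everywhere, even though nothing has been written yet.
    print(region[0], region[REGION_SIZE - 1])
    # The first write to a page is the point at which the system must back
    # that page with private physical memory.
    region[mmap.PAGESIZE] = 0x7F
    print(region[mmap.PAGESIZE])
    region.close()
```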

     Copy-on-write can be implemented by telling the MMU that certain pages in the process's address space are read-only. When data is written to these pages, the MMU raises an exception which is handled by the kernel, which allocates new space in physical
memory and makes the page being written to correspond to that new location in physical memory.
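
     The mechanism can be sketched as a toy simulation. Nothing here touches real hardware: Memory, AddressSpace and PageFault are invented names, the "MMU" is an ordinary method that raises an exception, and the "kernel" is the handler that catches it.

```python
PAGE_SIZE = 4096


class PageFault(Exception):
    """Raised by the simulated MMU on a write to a read-only page."""


class Memory:
    """Toy physical memory: a list of frames with per-frame reference counts."""

    def __init__(self):
        self.frames = []
        self.refcount = []

    def allocate(self, contents=None):
        initial = contents if contents is not None else bytes(PAGE_SIZE)
        self.frames.append(bytearray(initial))
        self.refcount.append(1)
        return len(self.frames) - 1


class AddressSpace:
    def __init__(self, memory):
        self.memory = memory
        self.table = {}  # virtual page number -> [frame number, writable flag]

    def map_new_page(self, vpn):
        self.table[vpn] = [self.memory.allocate(), True]

    def fork(self):
        # Share every frame and mark it read-only in both address spaces.
        child = AddressSpace(self.memory)
        for vpn, entry in self.table.items():
            entry[1] = False
            child.table[vpn] = [entry[0], False]
            self.memory.refcount[entry[0]] += 1
        return child

    def read(self, vpn, offset):
        return self.memory.frames[self.table[vpn][0]][offset]

    def _mmu_write(self, vpn, offset, value):
        frame, writable = self.table[vpn]
        if not writable:
            raise PageFault(vpn)
        self.memory.frames[frame][offset] = value

    def _handle_cow_fault(self, vpn):
        # "Kernel" side: give the page a private frame if it is still shared.
        entry = self.table[vpn]
        old = entry[0]
        if self.memory.refcount[old] > 1:
            self.memory.refcount[old] -= 1
            entry[0] = self.memory.allocate(self.memory.frames[old])
        entry[1] = True

    def write(self, vpn, offset, value):
        try:
            self._mmu_write(vpn, offset, value)
        except PageFault:
            self._handle_cow_fault(vpn)
            self._mmu_write(vpn, offset, value)  # retry the faulting write
```

     In this sketch the last remaining owner of a frame simply regains write permission without a copy, which is one common refinement; a real kernel tracks considerably more state.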

     One major advantage of COW is the ability to use memory sparsely. Because the usage of physical memory only increases as data is stored in it, very efficient hash tables can be implemented which use only a little more physical memory than is necessary to
store the objects they contain. However, such programs run the risk of running out of virtual address space: virtual pages unused by the hash table cannot be used by other parts of the program. The main problem with COW at the kernel level is the complexity
it adds, but the difficulties are similar to those raised by more basic virtual memory mechanisms such as swapping pages to disk; when the kernel itself writes to pages, it must copy them if they are marked copy-on-write.

Other applications of copy-on-write

     COW is also used outside the kernel, in library, application and system code. The string class provided by the C++ standard library, for example, was originally specified in a way that allowed copy-on-write implementations, although the requirements introduced in C++11 effectively rule them out. One hazard of COW in these contexts arises
in multithreaded code, where the additional locking required for objects in different threads to safely share the same representation can easily outweigh the benefits of the approach.

     The COW concept is also used in virtualization/emulation software such as Bochs, QEMU, and User-mode Linux (UML) for virtual disk storage. This allows a great reduction in required disk space when multiple VMs are based on the same hard disk image, as well as increased
performance, since disk reads can be cached in RAM and subsequent reads served to other VMs out of the cache.
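
     A minimal sketch of the idea follows, using in-memory byte strings in place of real disk images. BaseImage and CowOverlay are hypothetical names, the 512-byte block size is an arbitrary choice, and the code does not reflect the on-disk format of any of the tools named above.

```python
BLOCK_SIZE = 512


class BaseImage:
    """Read-only disk image shared by every virtual machine."""

    def __init__(self, data):
        self._data = bytes(data)

    def read_block(self, number):
        start = number * BLOCK_SIZE
        return self._data[start:start + BLOCK_SIZE]


class CowOverlay:
    """Per-VM layer that stores only the blocks this VM has modified."""

    def __init__(self, base):
        self._base = base
        self._blocks = {}  # block number -> private bytearray

    def read_block(self, number):
        if number in self._blocks:
            return bytes(self._blocks[number])
        return self._base.read_block(number)  # unmodified: served from the base

    def write(self, number, offset, payload):
        if offset < 0 or offset + len(payload) > BLOCK_SIZE:
            raise ValueError("write must stay within one block")
        if number not in self._blocks:
            # First write to this block: copy it from the base, then modify.
            self._blocks[number] = bytearray(self._base.read_block(number))
        self._blocks[number][offset:offset + len(payload)] = payload

    def private_blocks(self):
        return len(self._blocks)


if __name__ == "__main__":
    base = BaseImage(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE)
    vm1, vm2 = CowOverlay(base), CowOverlay(base)
    vm1.write(1, 0, b"xyz")
    print(vm1.read_block(1)[:4], vm2.read_block(1)[:4])
    print(vm1.private_blocks(), vm2.private_blocks())
```

     Each overlay grows only with the blocks its VM has changed, which is where the disk space saving comes from.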

     The COW concept is also used to maintain instant snapshots on database servers such as Microsoft SQL Server 2005. Instant snapshots preserve a static view of a database by storing a pre-modification copy of data when the underlying data are updated. Instant
snapshots are used for testing or for moment-dependent reports and should not be used to replace backups.
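
     The "store a pre-modification copy" behaviour can be sketched as follows. This is a toy model with invented names (Database, Snapshot, saved_count) and a fixed set of pages; it is not how any particular database server implements snapshots.

```python
class Snapshot:
    """Static view of a Database as it was when the snapshot was created."""

    def __init__(self, database):
        self._database = database
        self._saved = {}  # page id -> value the page held at snapshot time

    def preserve(self, page_id):
        # Keep only the first pre-modification copy; later writes are ignored.
        if page_id not in self._saved:
            self._saved[page_id] = self._database.read(page_id)

    def read(self, page_id):
        if page_id in self._saved:
            return self._saved[page_id]
        return self._database.read(page_id)  # page unchanged since the snapshot

    def saved_count(self):
        return len(self._saved)


class Database:
    """Toy page store; pages are assumed to exist before any snapshot is taken."""

    def __init__(self, pages):
        self._pages = dict(pages)
        self._snapshots = []

    def create_snapshot(self):
        snapshot = Snapshot(self)  # instant: no data is copied here
        self._snapshots.append(snapshot)
        return snapshot

    def read(self, page_id):
        return self._pages[page_id]

    def write(self, page_id, value):
        if page_id not in self._pages:
            raise KeyError(page_id)
        # Copy the old value aside before it is overwritten.
        for snapshot in self._snapshots:
            snapshot.preserve(page_id)
        self._pages[page_id] = value


if __name__ == "__main__":
    db = Database({1: "alice", 2: "bob"})
    snap = db.create_snapshot()
    db.write(1, "carol")
    print(db.read(1), snap.read(1), snap.read(2), snap.saved_count())
```

     Because the snapshot still reads unchanged pages from the live database, losing the database loses the snapshot too, which is consistent with the advice that snapshots should not replace backups.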

     COW may also be used as the underlying mechanism for snapshots provided by logical volume management and Microsoft Volume Shadow Copy Service.

     The copy-on-write technique can be used to emulate a read-write storage on media that require wear levelling or are physically Write Once Read Many.

