File Systems on Linux, Part 2

2017-03-13

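The previous article introduced the basics of the VFS and briefly described the data structures it involves. This article uses the Linux source code to analyze how those structures relate to one another.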

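1. Overall architecture diagram

The overall architecture is shown in the figure above. Following how a process actually accesses a file, this section walks through the figure in detail. A process is linked to its open files through the files_struct structure referenced from its own task structure. Here is files_struct: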

struct files_struct {
  /*
   * read mostly part
   */
    atomic_t count;
    struct fdtable __rcu *fdt;
    struct fdtable fdtab;
  /*
   * written part on a separate cache line in SMP
   */
    spinlock_t file_lock ____cacheline_aligned_in_smp;
    int next_fd;
    unsigned long close_on_exec_init[1];
    unsigned long open_fds_init[1];
    struct file __rcu * fd_array[NR_OPEN_DEFAULT];
};

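The first field, count, is an atomic reference count: it records how many tasks share this files_struct (for example, threads created with CLONE_FILES), not how many files are open. The files tracked here are not only regular files but also device files and other file types. next_fd records the next descriptor that is likely to be free, so that the next open can allocate one quickly. close_on_exec_init and open_fds_init are bitmaps. fd_array is the initial array of file pointers, fdtab is the embedded fdtable that manages descriptors at first, and fdt points to the fdtable currently in use. Here is fdtable: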

struct fdtable {
    unsigned int max_fds;
    struct file __rcu **fd;      /* current fd array */
    unsigned long *close_on_exec;
    unsigned long *open_fds;
    struct rcu_head rcu;
};

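max_fds is the number of descriptors the table can currently hold; it can grow. fd points to the array of struct file pointers and initially holds the address of fd_array in files_struct. close_on_exec points to the close-on-exec bitmap in files_struct. open_fds points to a bitmap that tracks every descriptor currently open: a set bit means the corresponding descriptor is in use.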
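To make the roles of open_fds and next_fd concrete, here is a minimal user-space sketch in C. It is not kernel code: toy_fdtable, toy_init, toy_alloc_fd, and toy_close_fd are hypothetical names, the table is fixed at one bitmap word, and locking, RCU, and table expansion are left out. It only illustrates how a bitmap plus a next_fd hint can hand out the lowest free descriptor.

```c
#include <limits.h>
#include <stddef.h>

/* One bitmap word's worth of descriptors (an assumption of this sketch). */
#define TOY_MAX_FDS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical, simplified stand-in for files_struct plus fdtable. */
struct toy_fdtable {
    unsigned int max_fds;     /* capacity of fd[] */
    void *fd[TOY_MAX_FDS];    /* one slot per descriptor (struct file * in the kernel) */
    unsigned long open_fds;   /* bit n set => descriptor n is in use */
    unsigned int next_fd;     /* hint: lowest descriptor that might be free */
};

static void toy_init(struct toy_fdtable *t)
{
    t->max_fds = (unsigned int)TOY_MAX_FDS;
    for (unsigned int i = 0; i < t->max_fds; i++)
        t->fd[i] = NULL;
    t->open_fds = 0;
    t->next_fd = 0;
}

/* Install file in the lowest free slot at or after next_fd.
 * Returns the descriptor, or -1 if the table is full. */
static int toy_alloc_fd(struct toy_fdtable *t, void *file)
{
    for (unsigned int fd = t->next_fd; fd < t->max_fds; fd++) {
        if (!(t->open_fds & (1UL << fd))) {
            t->open_fds |= 1UL << fd;   /* mark the descriptor as open */
            t->fd[fd] = file;
            t->next_fd = fd + 1;        /* everything below is now in use */
            return (int)fd;
        }
    }
    return -1;
}

/* Release a descriptor and pull the hint back if it frees a lower slot. */
static void toy_close_fd(struct toy_fdtable *t, unsigned int fd)
{
    if (fd >= t->max_fds)
        return;
    t->open_fds &= ~(1UL << fd);
    t->fd[fd] = NULL;
    if (fd < t->next_fd)
        t->next_fd = fd;
}
```

Because the hint is lowered on close, a freed descriptor is reused before any higher one, which matches the lowest-available-descriptor behavior that open() shows from user space.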

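How the descriptor table is expanded is explained at the end. Every entry in the descriptor table above is a pointer to a file structure. Each time a process opens a file, a file structure is created for that open. In other words, a file structure records the state of one particular open of a file by a process. Here is the file structure: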
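The one-file-structure-per-open rule can be observed from user space. The sketch below assumes a POSIX system with a writable /tmp; compare_offsets is a hypothetical helper written for this illustration. A descriptor made with dup() refers to the same file structure and therefore shares its offset, while a second open() of the same path gets a new file structure with its own offset.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 on success, -1 on error. */
int compare_offsets(off_t *after_dup, off_t *after_reopen)
{
    char path[] = "/tmp/vfs_open_XXXXXX";
    int fd1 = mkstemp(path);            /* first open: one file structure */
    int fd2 = -1, fd3 = -1, rc = -1;

    if (fd1 < 0)
        return -1;
    if (write(fd1, "0123456789", 10) != 10)
        goto out;
    if (lseek(fd1, 4, SEEK_SET) != 4)   /* move fd1's offset to 4 */
        goto out;

    fd2 = dup(fd1);                     /* same file structure as fd1 */
    fd3 = open(path, O_RDONLY);         /* second open: a new file structure */
    if (fd2 < 0 || fd3 < 0)
        goto out;

    *after_dup = lseek(fd2, 0, SEEK_CUR);     /* shares fd1's offset */
    *after_reopen = lseek(fd3, 0, SEEK_CUR);  /* independent offset */
    rc = 0;
out:
    if (fd3 >= 0)
        close(fd3);
    if (fd2 >= 0)
        close(fd2);
    close(fd1);
    unlink(path);
    return rc;
}
```

On success, *after_dup is 4 and *after_reopen is 0: two descriptor-table slots can point at one file structure, and one file can be reached through several file structures.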

struct file {
    /*
     * fu_list becomes invalid after file_free is called and queued via
     * fu_rcuhead for RCU freeing
     */
    union {
        struct list_head    fu_list;
        struct rcu_head     fu_rcuhead;
    } f_u;
    struct path        f_path;
#define f_dentry    f_path.dentry
    struct inode        *f_inode;    /* cached value */
    const struct file_operations    *f_op;
    /*
     * Protects f_ep_links, f_flags, f_pos vs i_size in lseek SEEK_CUR.
     * Must not be taken from IRQ context.
     */
    spinlock_t        f_lock;
#ifdef CONFIG_SMP
    int            f_sb_list_cpu;
#endif
    atomic_long_t        f_count;
    unsigned int         f_flags;
    fmode_t            f_mode;
    loff_t            f_pos;
    struct fown_struct    f_owner;
    const struct cred    *f_cred;
    struct file_ra_state    f_ra;
    u64            f_version;
#ifdef CONFIG_SECURITY
    void            *f_security;
#endif
    /* needed for tty driver, and maybe others */
    void            *private_data;
#ifdef CONFIG_EPOLL
    /* Used by fs/eventpoll.c to link all the hooks to this file */
    struct list_head    f_ep_links;
    struct list_head    f_tfile_llink;
#endif /* #ifdef CONFIG_EPOLL */
    struct address_space    *f_mapping;
#ifdef CONFIG_DEBUG_WRITECOUNT
    unsigned long f_mnt_write_state;
#endif
};

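All files opened under the same superblock are linked together in a doubly linked list. The file structure also caches the inode of the underlying file in f_inode, so later accesses do not need to go through the dentry to find it. The embedded path structure records the file's dentry and its vfsmount. The remaining fields hold the access mode, the current read/write position, and similar state, along with an important function table, f_op, that stores pointers to the functions that operate on the file. The key point is that a process can get from here to the inode through the dentry. How exactly does that lookup work? Here is the dentry structure: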
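The pointer chain described above (file, then path, then dentry, then inode, with the inode also cached in the file) can be modeled in a few lines of C. This is a toy model, not kernel code: every toy_ name is hypothetical and each structure keeps only the fields needed to show the relationship.

```c
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct toy_inode  { unsigned long i_ino; long long i_size; };
struct toy_dentry {
    const char *d_name;
    struct toy_dentry *d_parent;
    struct toy_inode *d_inode;      /* NULL models a negative dentry */
};
struct toy_mount  { const char *mnt_point; };
struct toy_path   { struct toy_mount *mnt; struct toy_dentry *dentry; };
struct toy_file {
    struct toy_path f_path;         /* which mount and which dentry */
    struct toy_inode *f_inode;      /* cached copy of f_path.dentry->d_inode */
    long long f_pos;                /* per-open read/write position */
};

/* Fill in a toy_file for one open of dentry d on mount mnt.
 * Returns 0 on success, -1 if the dentry has no inode behind it. */
static int toy_open(struct toy_file *f, struct toy_mount *mnt,
                    struct toy_dentry *d)
{
    if (d == NULL || d->d_inode == NULL)
        return -1;
    f->f_path.mnt = mnt;
    f->f_path.dentry = d;
    f->f_inode = d->d_inode;        /* cache it so later use skips the dentry */
    f->f_pos = 0;
    return 0;
}
```

Opening the same toy_dentry twice yields two toy_file objects with the same f_inode but independent f_pos values, which is the split between per-open state and per-file state that the kernel structures express.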

struct dentry {
    /* RCU lookup touched fields */
    unsigned int d_flags;        /* protected by d_lock */
    seqcount_t d_seq;        /* per dentry seqlock */
    struct hlist_bl_node d_hash;    /* lookup hash list */
    struct dentry *d_parent;    /* parent directory */
    struct qstr d_name;
    struct inode *d_inode;        /* Where the name belongs to - NULL is
                     * negative */
    unsigned char d_iname[DNAME_INLINE_LEN];    /* small names */
    /* Ref lookup also touches following */
    unsigned int d_count;        /* protected by d_lock */
    spinlock_t d_lock;        /* per dentry lock */
    const struct dentry_operations *d_op;
    struct super_block *d_sb;    /* The root of the dentry tree */
    unsigned long d_time;        /* used by d_revalidate */
    void *d_fsdata;            /* fs-specific data */
    struct list_head d_lru;        /* LRU list */
    /*
     * d_child and d_rcu can share memory
     */
    union {
        struct list_head d_child;    /* child of parent list */
        struct rcu_head d_rcu;
    } d_u;
    struct list_head d_subdirs;    /* our children */
    struct hlist_node d_alias;    /* inode alias list */
};

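A dentry holds a pointer to the inode of the file it names (d_inode), a pointer to the parent directory's dentry (d_parent), and the file name (d_name). Note that the file name is not stored as an attribute of the inode; it is stored in the dentry. The system uses a file name mainly to find an inode, and the dentry leads directly to the inode, so once the dentry has been found the name is no longer needed. The dentry also locates the owning superblock through d_sb. The structure has its own function table, dentry_operations, holding operations on dentries such as revalidating, comparing names, and deleting a dentry. All children of a directory form a list: d_subdirs is the list head, and d_child is the node that links a dentry into its parent's list of children. As the previous article mentioned, all dentries in the system are kept in a hash table for fast lookup; its head is the global variable dentry_hashtable. Unused dentries were organized on the global dentry_unused list in older kernels; in the structure shown here they are linked onto an LRU list through d_lru. Because every parent directory keeps a list of its children, the dentries also form a tree.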
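That the name lives in the dentry rather than in the inode is visible from user space through hard links: two different names can resolve to one inode. The sketch below assumes a POSIX system whose /tmp supports hard links; same_inode_after_link is a hypothetical helper written for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if both names resolve to the same inode
 * with a link count of 2, 0 if they do not, and -1 on error. */
int same_inode_after_link(void)
{
    char a[] = "/tmp/vfs_link_XXXXXX";
    char b[64];
    struct stat sa, sb;
    int rc = -1;
    int fd = mkstemp(a);                /* creates the inode and its first name */

    if (fd < 0)
        return -1;
    close(fd);
    snprintf(b, sizeof b, "%s.alias", a);
    if (link(a, b) != 0) {              /* second name for the same inode */
        unlink(a);
        return -1;
    }
    if (stat(a, &sa) == 0 && stat(b, &sb) == 0)
        rc = (sa.st_dev == sb.st_dev &&
              sa.st_ino == sb.st_ino &&
              sa.st_nlink == 2);
    unlink(b);
    unlink(a);
    return rc;
}
```

Each name corresponds to its own directory entry, while st_ino and st_nlink come from the single shared inode.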

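At this point the specific inode has been found. The inode records the file's real attributes: its timestamps, whether it is dirty, and how it is mapped in memory. Here is the inode structure: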

struct inode {
    umode_t            i_mode;
    unsigned short        i_opflags;
    kuid_t            i_uid;
    kgid_t            i_gid;
    unsigned int        i_flags;
#ifdef CONFIG_FS_POSIX_ACL
    struct posix_acl    *i_acl;
    struct posix_acl    *i_default_acl;
#endif
    const struct inode_operations    *i_op;
    struct super_block    *i_sb;
    struct address_space    *i_mapping;
#ifdef CONFIG_SECURITY
    void            *i_security;
#endif
    /* Stat data, not accessed from path walking */
    unsigned long        i_ino;
    /*
     * Filesystems may only read i_nlink directly.  They shall use the
     * following functions for modification:
     *
     *    (set|clear|inc|drop)_nlink
     *    inode_(inc|dec)_link_count
     */
    union {
        const unsigned int i_nlink;
        unsigned int __i_nlink;
    };
    dev_t            i_rdev;
    loff_t            i_size;
    struct timespec        i_atime;
    struct timespec        i_mtime;
    struct timespec        i_ctime;
    spinlock_t        i_lock;    /* i_blocks, i_bytes, maybe i_size */
    unsigned short          i_bytes;
    unsigned int        i_blkbits;
    blkcnt_t        i_blocks;
#ifdef __NEED_I_SIZE_ORDERED
    seqcount_t        i_size_seqcount;
#endif
    /* Misc */
    unsigned long        i_state;
    struct mutex        i_mutex;
    unsigned long        dirtied_when;    /* jiffies of first dirtying */
    struct hlist_node    i_hash;
    struct list_head    i_wb_list;    /* backing dev IO list */
    struct list_head    i_lru;        /* inode LRU list */
    struct list_head    i_sb_list;
    union {
        struct hlist_head    i_dentry;
        struct rcu_head        i_rcu;
    };
    u64            i_version;
    atomic_t        i_count;
    atomic_t        i_dio_count;
    atomic_t        i_writecount;
    const struct file_operations    *i_fop;    /* former ->i_op->default_file_ops */
    struct file_lock    *i_flock;
    struct address_space    i_data;
#ifdef CONFIG_QUOTA
    struct dquot        *i_dquot[MAXQUOTAS];
#endif
    struct list_head    i_devices;
    union {
        struct pipe_inode_info    *i_pipe;
        struct block_device    *i_bdev;
        struct cdev        *i_cdev;
    };
    __u32            i_generation;
#ifdef CONFIG_FSNOTIFY
    __u32            i_fsnotify_mask; /* all events this inode cares about */
    struct hlist_head    i_fsnotify_marks;
#endif
#ifdef CONFIG_IMA
    atomic_t        i_readcount; /* struct files open RO */
#endif
    void            *i_private; /* fs or device private pointer */
};

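The inode is a fairly large structure. It begins with the file's permission and ownership information: the mode, the owning user, and the owning group. It has a function table, inode_operations, for operations on the inode itself, and a pointer, i_sb, to the superblock of the file system the file belongs to. One crucial member is i_mapping, a pointer to an address_space structure that records how the file is mapped in memory; this is analyzed shortly. The inode also records the file's timestamps. As the previous article said, an in-memory inode is in one of three states: in memory but unused; in memory and in use; or in memory and modified, so that it must be written back to disk. The first two are kept on global lists, while the third is kept on a list specific to the superblock. In addition, every inode appears in a hash table whose head is inode_hashtable, which allows an inode to be found quickly from its inode number and superblock.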
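Most of the attribute fields at the top of the inode are what stat() and fstat() report to user space. The sketch below assumes a POSIX system with a writable /tmp; inspect_inode is a hypothetical helper, and the field correspondence given in the comments is approximate.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create a 5-byte temporary file and copy its inode
 * attributes into *st. Returns 0 on success, -1 on error.
 *
 * Rough correspondence between struct stat and struct inode:
 *   st_mode  <- i_mode      st_uid   <- i_uid     st_gid <- i_gid
 *   st_ino   <- i_ino       st_nlink <- i_nlink   st_size <- i_size
 *   st_atime, st_mtime, st_ctime <- i_atime, i_mtime, i_ctime
 */
int inspect_inode(struct stat *st)
{
    char path[] = "/tmp/vfs_stat_XXXXXX";
    int fd = mkstemp(path);
    int rc = -1;

    if (fd < 0)
        return -1;
    /* fstat() runs while the name still exists, so the link count is 1. */
    if (write(fd, "hello", 5) == 5 && fstat(fd, st) == 0)
        rc = 0;
    close(fd);
    unlink(path);
    return rc;
}
```

After a successful call, st_size is 5, S_ISREG(st_mode) is true, and st_nlink is 1. No file name appears anywhere in struct stat, consistent with names being kept in dentries.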

With the inode in hand, how is the file then mapped into memory? The core structure is address_space. Here it is:

struct address_space {
    struct inode        *host;        /* owner: inode, block_device */
    struct radix_tree_root    page_tree;    /* radix tree of all pages */
    spinlock_t        tree_lock;    /* and lock protecting it */
    unsigned int        i_mmap_writable;/* count VM_SHARED mappings */
    struct rb_root        i_mmap;        /* tree of private and shared mappings */
    struct list_head    i_mmap_nonlinear;/*list VM_NONLINEAR mappings */
    struct mutex        i_mmap_mutex;    /* protect tree, count, list */
    /* Protected by tree_lock together with the radix tree */
    unsigned long        nrpages;    /* number of total pages */
    pgoff_t            writeback_index;/* writeback starts here */
    const struct address_space_operations *a_ops;    /* methods */
    unsigned long        flags;        /* error bits/gfp mask */
    struct backing_dev_info *backing_dev_info; /* device readahead, etc */
    spinlock_t        private_lock;    /* for use by the address_space */
    struct list_head    private_list;    /* ditto */
    void            *private_data;    /* ditto */
} __attribute__((aligned(sizeof(long))));

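This structure belongs to the inode. Several processes can share the same file, so the process-specific file structure carries a pointer, f_mapping, to the inode's address_space, while the current access position is recorded in the file structure itself. address_space is responsible only for mapping the file: it manages all the memory regions (vm_area_struct instances) that map the file. i_mmap above is the root of a red-black tree that links all of those vm_area_struct instances, and i_mmap_nonlinear is a doubly linked list of the vm_area_struct instances that are nonlinear mappings. The structure also records a pointer to its owning inode (host), the number of pages it currently holds (nrpages), and a set of operations, a_ops, for interacting with the backing device, such as reading a page, writing a page, or marking a page dirty. The management of process virtual memory is covered in a separate article.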
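One visible consequence of keeping the mapping state on the inode rather than on the file structure is that shared mappings made through different opens of the same file see the same pages. The sketch below assumes a POSIX system with a writable /tmp; shared_mapping_visible is a hypothetical helper, and the 4096-byte length is simply a convenient size for the demonstration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map one file twice through two separate opens and
 * check that a store through one mapping is visible through the other.
 * Returns 1 if visible, 0 if not, -1 on error. */
int shared_mapping_visible(void)
{
    char path[] = "/tmp/vfs_mmap_XXXXXX";
    const size_t len = 4096;
    int fd1 = mkstemp(path);            /* first file structure */
    int fd2, rc = -1;

    if (fd1 < 0)
        return -1;
    fd2 = open(path, O_RDWR);           /* second file structure, same inode */
    if (fd2 >= 0 && ftruncate(fd1, (off_t)len) == 0) {
        char *m1 = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd1, 0);
        char *m2 = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd2, 0);

        if (m1 != MAP_FAILED && m2 != MAP_FAILED) {
            strcpy(m1, "hello");                  /* store through mapping 1 */
            rc = (strcmp(m2, "hello") == 0);      /* load through mapping 2 */
        }
        if (m1 != MAP_FAILED)
            munmap(m1, len);
        if (m2 != MAP_FAILED)
            munmap(m2, len);
    }
    if (fd2 >= 0)
        close(fd2);
    close(fd1);
    unlink(path);
    return rc;
}
```

Both mappings are separate memory regions, yet they resolve to the same cached pages because both file structures point at the one address_space of the inode.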
