The cost of thread and process switches in Linux:

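At the operating system level, Linux represents both processes and threads with the same descriptor, task_struct. Among its members, task_struct records the task's kernel-mode stack. On 32-bit x86 with the usual 3G/1G split, these structures live in the kernel portion of the virtual address space, from 3 GB to 4 GB. The kernel stack holds the saved register values (CS, DS, SS and so on). Thread and process switches at the OS level always run in kernel mode. Every task, whether it is called a process or a thread, has its own kernel stack, because every task has its own task_struct and the kernel stack belongs to that struct.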

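Process context switch: the first step is the transition from user mode to kernel mode. The kernel stack saves the user-mode register values so that, on the next return to user mode, the CPU can find the instruction and memory addresses it was using. User mode enters kernel mode through an interrupt: the legacy int $0x80 syscall mechanism locates the interrupt handler.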

This involves:

The int instruction is a complex multi-step instruction. Here is an explanation of what it does:

1.) The processor fetches the descriptor from the IDT (the IDT address is held in a special register, IDTR) and checks that CPL <= DPL. CPL is the current privilege level, which can be read from the CS register; DPL is stored in the IDT descriptor. As a consequence, user space cannot raise certain exceptions (for example, a page fault) directly with the int instruction. If you try, you get a general protection exception instead.

2.) The processor switches to the stack defined in the TSS. The TSS was initialized earlier and already contains the ESP and SS values (the ESP0 and SS0 fields) that hold the kernel stack address. So now ESP points to the kernel stack.

3.) The processor pushes the user-space registers onto the newly switched kernel stack: ss, esp, eflags, cs, eip. They are needed to return to user space after the syscall has been served.

4.) Next, the processor loads CS and EIP from the IDT descriptor. This address is the entry point of the handler for the vector.

5.) At this point execution is in the kernel, at the syscall entry point.
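The steps above can be sketched as a toy model in Python. This is only an illustration of the privilege check and of the order in which registers are pushed, not CPU or kernel code. The function name `software_interrupt`, the exception class and the register values are all made up for the example.

```python
# Toy model of steps 1-4 above; not real CPU or kernel code.

class GeneralProtectionFault(Exception):
    """Stands in for the general protection exception raised by the CPU."""


def software_interrupt(cpl, gate_dpl, user_regs):
    """Return the frame that `int n` would push on the kernel stack."""
    # Step 1: the gate's DPL must admit the current privilege level.
    if cpl > gate_dpl:
        raise GeneralProtectionFault("CPL > DPL of the IDT gate")
    # Steps 2-3: on the kernel stack, push the user registers in this order.
    kernel_stack = []
    for reg in ("ss", "esp", "eflags", "cs", "eip"):
        kernel_stack.append((reg, user_regs[reg]))
    return kernel_stack


# Example register values, chosen only for illustration.
user_regs = {"ss": 0x7B, "esp": 0xBFFFF000, "eflags": 0x202,
             "cs": 0x73, "eip": 0x08048000}

# A gate with DPL 3 accepts a caller running in ring 3.
frame = software_interrupt(cpl=3, gate_dpl=3, user_regs=user_regs)
print([reg for reg, _ in frame])

# A gate with DPL 0 rejects the same caller.
try:
    software_interrupt(cpl=3, gate_dpl=0, user_regs=user_regs)
except GeneralProtectionFault as exc:
    print("fault:", exc)
```

The second call models why user space cannot trigger a page fault vector with int: its gate has DPL 0, so the check in step 1 fails before anything is pushed.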

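The steps above cover the transition from user to kernel. What about a switch between threads or processes? Take the sched_yield system call: once in the kernel, it goes on to pick another thread to run. The state saved on the new thread's kernel stack is popped into the registers, so execution continues in the new thread's kernel mode, and the kernel then returns to user mode. That completes the switch.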

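So what is the difference? A process switch also switches the virtual address space. In essence it is a CR3 switch (the address-space switch, done in the switch_mm function) plus a register switch (EIP, ESP and so on, done in switch_to). Threads of the same process share one set of page tables, so a switch between them needs no CR3 reload. The kernel-mode part of the page tables is identical for every task and is shared; only the user-mode part differs between processes. The main difference is therefore the page tables: switching them invalidates TLB entries, and that is where the performance overhead comes from. The TLB caches recently used page-table entries; the page tables themselves live in physical memory, so a TLB hit avoids walking them.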
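As a rough illustration of that difference, the following Python sketch models a switch between two tasks. It is a toy model, not the kernel's implementation: the `Task` class and its `mm` field (a stand-in for the address space the task uses) are hypothetical, and only the names switch_mm and switch_to come from the text above.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    mm: int  # hypothetical identifier of the task's address space


def context_switch(prev, nxt):
    """Return the list of steps a switch from prev to nxt would need."""
    steps = []
    # Only a change of address space requires loading a new page-table root.
    if prev.mm != nxt.mm:
        steps.append("switch_mm: reload CR3, cached user translations are lost")
    # Registers and kernel stack are switched in every case.
    steps.append("switch_to: switch kernel stack and registers")
    return steps


thread_a = Task("thread A of process 1", mm=1)
thread_b = Task("thread B of process 1", mm=1)
other = Task("process 2", mm=2)

print(context_switch(thread_a, thread_b))  # same address space
print(context_switch(thread_a, other))     # different address space
```

In this model the thread-to-thread switch consists of the register step only, while the switch to another process adds the address-space step.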

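Why are user-level thread stacks limited to 8 MB (the usual default on Linux, which comes from the stack resource limit)? Many languages support multithreading, C++ with pthreads for example, and every thread's stack lives in the address space of the process (the main thread uses the process stack region; the stacks of additional threads are separate mappings). Stacks of different threads must not overlap, otherwise the threads corrupt each other's stacks and the program crashes. If the address and size of each stack were not fixed in advance and stacks could grow without bound, they would eventually overlap. Reserving too much per thread, on the other hand, reduces the number of threads that can be created. Switching between user-mode threads is essentially a register switch, which makes it very lightweight.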
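The trade-off between stack size and thread count can be checked with a little arithmetic. The sketch below assumes a 3 GB user address space (the 32-bit split mentioned earlier) and ignores everything else that occupies it, so the result is only an upper bound. The helper names `max_stacks` and `run_with_stack` are made up for this example; `resource.getrlimit` and `threading.stack_size` are standard-library calls.

```python
import resource
import threading

MiB = 1024 * 1024
GiB = 1024 * MiB


def max_stacks(address_space_bytes, stack_bytes):
    """Upper bound on the number of stacks that fit without overlapping."""
    return address_space_bytes // stack_bytes


def run_with_stack(stack_bytes, fn):
    """Run fn in a new thread whose stack has the requested size."""
    old = threading.stack_size(stack_bytes)  # applies to threads started next
    try:
        worker = threading.Thread(target=fn)
        worker.start()
        worker.join()
    finally:
        threading.stack_size(old)  # restore the previous setting


# Soft and hard stack limits of this process (what `ulimit -s` reports, in bytes).
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print("stack limit (soft, hard):", soft, hard)

print(max_stacks(3 * GiB, 8 * MiB))    # 8 MB stacks in 3 GB
print(max_stacks(3 * GiB, 256 * 1024)) # 256 KB stacks in 3 GB

run_with_stack(256 * 1024, lambda: print("running on a 256 KB stack"))
```

With 8 MB reserved per thread, at most 384 stacks fit into 3 GB; shrinking the reservation to 256 KB raises the bound to 12288.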

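CPU privilege levels: ring 0 to ring 3. A CS segment selector is simply the value of the CS register. It consists of an index, a table-indicator bit and, in the low two bits, the CPL. The index selects a segment descriptor entry in the descriptor table. The segment descriptor contains the segment base address and the DPL, that is, where the segment starts in the linear address space and which privilege level is required to access it. Note that under segmentation CS, DS and SS are treated as separate segments. Modern operating systems no longer rely on segmentation for memory management (they use a flat model); Intel keeps it for compatibility. The kernel-mode CS, SS and DS segments all have DPL set to 0, which means user-mode instructions cannot use them. This is protected mode. So why is RPL needed?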
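Before turning to that question, the selector layout can be made concrete with a short sketch. It assumes the standard x86 layout of a 16-bit selector (13-bit index, 1 table-indicator bit, 2 privilege bits). The function name `decode_selector` and the sample values 0x73 and 0x10 are chosen for illustration only.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into its three fields."""
    return {
        "index": selector >> 3,       # entry number in the descriptor table
        "ti": (selector >> 2) & 0b1,  # 0 = GDT, 1 = LDT
        "rpl": selector & 0b11,       # privilege bits (the CPL when read from CS)
    }


print(decode_selector(0x73))  # a ring-3 style selector
print(decode_selector(0x10))  # a ring-0 style selector
```

For 0x73 the fields are index 14, table indicator 0 and privilege level 3; for 0x10 they are index 2, table indicator 0 and privilege level 0.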

RPL – Requested Privilege Level

These are the lowest two bits of the selectors held in the DS, ES, SS, FS and GS registers. The RPL field is used to tighten the CPL check when higher-privileged code services requests from lower-privileged processes.

Assume a higher-privileged device driver that supports copying data from disk directly into the data segments of lower-privileged processes. A lower-privileged process must pass its data-segment details (selector, address and size of the data to copy) to the device driver so that the driver can copy the data to the appropriate location.

Since the device driver is higher-privileged, a lower-privileged process can trick the driver into copying data into higher-privileged data segments simply by passing a wrong selector value. This kind of exploit is called privilege escalation.

How does RPL help solve the privilege escalation problem?

Continuing the above example, whenever the device driver loads the destination segment, it sets the RPL of the destination selector to match the requestor (lower-privileged) process. Since the protection rules for data segments check both CPL <= DPL and RPL <= DPL, the higher-privileged driver gets a protection fault on the RPL <= DPL check if the selector points at a higher-privileged segment.

The point to note is that higher-privileged code, when providing services to lower-privileged processes, should temporarily reduce its privilege to the requestor's privilege level.
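The rule described in this section can be written down as a small sketch. The two conditions CPL <= DPL and RPL <= DPL are combined as max(CPL, RPL) <= DPL, which is equivalent. Both function names and the selector value 0x18 are hypothetical; `with_requestor_rpl` only imitates the adjustment the driver is said to perform and is not a real API.

```python
def data_access_allowed(cpl, rpl, dpl):
    """Data-segment check: numerically larger levels are less privileged."""
    return max(cpl, rpl) <= dpl


def with_requestor_rpl(selector, requestor_level):
    """Weaken the selector's RPL to the requestor's level if it claims more."""
    if (selector & 0b11) < requestor_level:
        selector = (selector & ~0b11) | requestor_level
    return selector


DRIVER_CPL = 0
KERNEL_SEGMENT_DPL = 0
USER_SEGMENT_DPL = 3

# A ring-3 process passes a selector that names a ring-0 segment with RPL 0.
forged = 0x18

# Used as is, the driver's own privilege lets the access through.
print(data_access_allowed(DRIVER_CPL, forged & 0b11, KERNEL_SEGMENT_DPL))

# After adjusting the RPL to the requestor's level, the same access is refused.
adjusted = with_requestor_rpl(forged, requestor_level=3)
print(data_access_allowed(DRIVER_CPL, adjusted & 0b11, KERNEL_SEGMENT_DPL))

# A legitimate request for a ring-3 segment still succeeds.
print(data_access_allowed(DRIVER_CPL, 3, USER_SEGMENT_DPL))
```

The adjustment never grants privilege: a selector that already carries RPL 3 is returned unchanged.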

The CPU's privilege levels protect memory: if user-mode code accesses a protected memory address, the access triggers a segmentation fault error.

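The fundamental purpose of the two-level page table is to reduce the memory, and in particular the contiguous memory, that page tables require; otherwise every 32-bit process would need a 4 MB page table (assuming 4 KB pages). Physical page frames are 4 KB, so how is a virtual (linear) address mapped to a physical address? With a direct, single-level mapping, one page-table entry corresponds to one page frame, so 4 GB / 4 KB = 1M (2^20) entries are needed. How many bytes does each entry need? Identifying one of 2^20 frames takes 20 bits, which rounds up to 3 bytes; in practice 4 bytes are used. Without a page directory, each process's page table would therefore occupy 1M x 4 bytes = 4 MB of physical memory. A 4 KB page frame spans 2^12 addresses, so in a 32-bit address the low 12 bits are just the offset within the page and need no translation; only the upper 20 bits are looked up.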
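The arithmetic above can be reproduced in a few lines. The sketch assumes the classic 32-bit x86 split of the upper 20 bits into a 10-bit page-directory index and a 10-bit page-table index; the function names and the sample address 0xC0101234 are chosen for illustration.

```python
PAGE_SIZE = 4 * 1024       # 4 KB page frames
ADDRESS_SPACE = 2 ** 32    # 4 GB of virtual addresses
PTE_SIZE = 4               # bytes per page-table entry


def flat_table_bytes():
    """Size of a single-level table that maps the whole address space."""
    entries = ADDRESS_SPACE // PAGE_SIZE  # 2**20 entries
    return entries * PTE_SIZE


def split_address(vaddr):
    """Split a 32-bit address into (directory index, table index, offset)."""
    directory_index = vaddr >> 22       # top 10 bits
    table_index = (vaddr >> 12) & 0x3FF # next 10 bits
    offset = vaddr & 0xFFF              # low 12 bits, not translated
    return directory_index, table_index, offset


print(flat_table_bytes() // (1024 * 1024), "MB for a flat table")
print(split_address(0xC0101234))
```

With two levels, only the 4 KB page directory plus the page tables for regions that are actually mapped have to exist, and each of those tables is a single 4 KB page.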

https://blog.csdn.net/displayMessage/article/details/80905810
