A thread is a Windows concept whose job is to virtualize the CPU.

Thread Overhead

  • Thread kernel object The operating system allocates and initializes one of these data structures for each thread created in the system. The data structure contains a bunch of properties  that describe the thread. This data structure also contains what is called the thread’s context. The context is a block of memory that contains a set of the CPU’s registers. For the x86, x64, and ARM CPU architectures, the thread’s context uses approximately 700, 1,240, or 350 bytes of memory, respectively.
  • Thread environment block (TEB) The TEB is a block of memory allocated and initialized in user mode (address space that application code can quickly access). The TEB consumes 1 page of memory (4 KB on x86, x64 CPUs, and ARM CPUs). The TEB contains the head of the thread’s exception-handling chain. Each try block that the thread enters inserts a node in the head of this chain; the node is removed from the chain when the thread exits the try block. In addition, the TEB contains the thread’s thread-local storage data and some data structures for use by Graphics Device Interface (GDI) and OpenGL graphics.
  • User-mode stack The user-mode stack is used for local variables and arguments passed to methods. It also contains the address indicating what the thread should execute next when the current method returns. By default, Windows allocates 1 MB of memory for each thread’s user-mode stack. More specifically, Windows reserves the 1 MB of address space and sparsely commits physical storage to it as the thread actually requires it when growing the stack.
  • Kernel-mode stack The kernel-mode stack is also used when application code passes arguments to a kernel-mode function in the operating system. For security reasons, Windows copies any arguments passed from user-mode code to the kernel from the thread’s user-mode stack to the thread’s kernel-mode stack. Once copied, the kernel can verify the arguments’ values, and because the application code can’t access the kernel-mode stack, the application can’t modify the arguments’ values after they have been validated and the operating system kernel code begins to operate on them. In addition, the kernel calls methods within itself and uses the kernel-mode stack to pass its own arguments, to store a function’s local variables, and to store return addresses. The kernel-mode stack is 12 KB when running on a 32-bit Windows system and 24 KB when running on a 64-bit Windows system.
  • DLL thread-attach and thread-detach notifications Windows has a policy that whenever a thread is created in a process, all unmanaged DLLs loaded in that process have their DllMain method called, passing a DLL_THREAD_ATTACH flag. Similarly, whenever a thread dies, all DLLs in the process have their DllMain method called, passing it a DLL_THREAD_DETACH flag. Some DLLs need these notifications to perform some special initialization or cleanup for each thread created/destroyed in the process. For example, the C-Runtime library DLL allocates some thread-local storage state that is required should the thread use functions contained within the C-Runtime library.

now we’re going to start talking about context switching. A computer with only one CPU in it can do only one thing at a time. Therefore, Windows has to share the actual CPU hardware among all the threads (logical CPUs) that are sitting around in the system.

At any given moment in time, Windows assigns one thread to a CPU. That thread is allowed to run for a time-slice (sometimes referred to as a quantum). When the time-slice expires, Windows context switches to another thread. Every context switch requires that Windows performs the following actions:

  • Save the values in the CPU’s registers to the currently running thread’s context structure inside the thread’s kernel object.
  • Select one thread from the set of existing threads to schedule next. If this thread is owned by another process, then Windows must also switch the virtual address space seen by the CPU before it starts executing any code or touching any data.
  • Load the values in the selected thread’s context structure into the CPU’s registers.

After the context switch is complete, the CPU executes the selected thread until its time-slice expires, and then another context switch happens again. Windows performs context switches about every 30 ms. Context switches are pure overhead; that is, there is no memory or performance benefit that comes from context switches. Windows performs context switching to provide end users with a robust and responsive operating system.

the performance hit is much worse than you might think. Yes, a performance hit occurs when Windows context switches to another thread. But the CPU was executing another thread, and the previously running thread’s code and data reside in the CPU’s caches so that the CPU doesn’t have to access RAM memory as much, which has significant latency associated with it. When Windows context switches to a new thread, this new thread is most likely executing different code and accessing different data that is not in the CPU’s cache. The CPU must access RAM memory to populate its cache so it can get back to a good execution speed. But then, about 30 ms later, another context switch occurs.

A thread can voluntarily end its time-slice early, which happens quite frequently. Threads typically wait for I/O operations (keyboard, mouse, file, network, etc.) to complete. For example, Notepad’s thread usually sits idle with nothing to do; this thread is waiting for input. If the user presses the J key on the keyboard, Windows wakes Notepad’s thread to have it process the J keystroke. It may take Notepad’s thread just 5 ms to process the key, and then it calls a Win32 function that tells Windows that it is ready to process the next input event. If there are no more input events, then Windows puts Notepad’s thread into a wait state (relinquishing the remainder of its time-slice) so that the thread is not scheduled on any CPU until the next input stimulus occurs. This improves overall system performance because threads that are waiting for I/O operations to complete are not scheduled on a CPU and do not waste CPU time; other threads can be scheduled on the CPU instead.

when Windows was originally designed, single-CPU computers were commonplace, and Windows added threads to improve system responsiveness and reliability. Today, threads are also being used to improve scalability, which can happen only on computers that have multiple cores in them.

CLR Threads and Windows Threads

Today, the CLR uses the threading capabilities of Windows, so Part V of this book is really focusing on how the threading capabilities of Windows are exposed to developers who write code by using the CLR.

.NET:CLR via C# Thread Basics的更多相关文章

  1. Qt 线程基础(Thread Basics的翻译,线程的五种使用情况)

    Qt 线程基础(QThread.QtConcurrent等) 转载自:http://blog.csdn.net/dbzhang800/article/details/6554104 昨晚看Qt的Man ...

  2. 浅析Android中的消息机制-解决:Only the original thread that created a view hierarchy can touch its views.

    在分析Android消息机制之前,我们先来看一段代码: public class MainActivity extends Activity implements View.OnClickListen ...

  3. Java内存泄漏分析系列之二:jstack生成的Thread Dump日志结构解析

    原文地址:http://www.javatang.com 一个典型的thread dump文件主要由一下几个部分组成: 上图将JVM上的线程堆栈信息和线程信息做了详细的拆解. 第一部分:Full th ...

  4. .NET:CLR via C#:CLR Hosting And AppDomains

    AppDomain Unloading To unload an AppDomain, you call AppDomain’s Unload static method.This call caus ...

  5. JVM故障分析系列之四:jstack生成的Thread Dump日志线程状态

    JVM故障分析系列之四:jstack生成的Thread Dump日志线程状态  2017年10月25日  Jet Ma  JavaPlatform JVM故障分析系列系列文章 JVM故障分析系列之一: ...

  6. 错误:Only the original thread that created a view hierarchy can touch its views——Handler的使用

    在跟随教程学习到显示web页面的html源码时报错:Only the original thread that created a view hierarchy can touch its views ...

  7. .NET:CLR via C# Primitive Thread Synchronization Constructs

    User-Mode Constructs The CLR guarantees that reads and writes to variables of the following data typ ...

  8. .NET:CLR via C# Compute-Bound Asynchronous Operations

    线程槽 使用线程池了以后就不要使用线程槽了,当线程池执行完调度任务后,线程槽的数据还在. 测试代码 using System; using System.Collections.Generic; us ...

  9. .NET:CLR via C# The CLR’s Execution Model

    The CLR’s Execution Model The core features of the CLR memory management. assembly loading. security ...

随机推荐

  1. http://localhost/ 或 http://127.0.0.1/ 报错:HTTP 404 的解决办法

    一些初次接触使用 Eclipse 工具来开发 JAVA Web 工程的开发人员,可能会对 Eclipse 和 Tomcat 的绑定产生一个疑惑. 那就是 在修改了 Tomcat 的8080端口为80后 ...

  2. VMware下CenOS7系统的安装及lnmp服务器的搭建

    CentOS7系统的安装 CentOS7下载:http://mirrors.aliyun.com/centos/7/isos/x86_64/CentOS-7-x86_64-DVD-1708.iso 下 ...

  3. MySQL 字段类型占用空间

    MySQL支持多种列类型:数值类型.日期/时间类型和字符串(字符)类型. 首先来看下各类型的存储需求(即占用空间大小): 数值类型存储需求 列类型 存储需求 TINYINT 1个字节 SMALLINT ...

  4. 学习 HMM

    简介 HMM 中的变量可以分为两组. 第一组是状态变量 \(\{y_i,y_2,\cdots, y_n\}\), 其中 \(y_i \in \mathcal{Y}\) 表示第 \(i\) 时刻的系统状 ...

  5. python 创建项目

    项目骨架 nose 测试框架 Windows 10 配置 创建骨架项目目录 Windows 10 的 PowerShell mkdir projects cd projects/ mkdir skel ...

  6. JAVA单向链表实现

    JAVA单向链表实现 单向链表 链表和数组一样是一种最常用的线性数据结构,两者各有优缺点.数组我们知道是在内存上的一块连续的空间构成,所以其元素访问可以通过下标进行,随机访问速度很快,但数组也有其缺点 ...

  7. php操作mongodb or查询这样写!

    $where['$or'] = [ ['id' => ['lt'=>0]], ['id2' => ['lt'=>1]] ]; 这个是查询 id>0 或者id2>1的 ...

  8. ARM Linux 驱动Input子系统之按键驱动测试

    上一篇已经谈过,在现内核的中引入设备树之后对于内核驱动的编写,主要集中在硬件接口的配置上了即xxxx.dts文件的编写. 在自己的开发板上移植按键驱动: 1.根据开发板的原理图 确定按键的硬件接口为: ...

  9. 常用的Jquery工具方法

    一.根据后端动态字段,如何把驻点输出在页面上?1.可以提前写好css,设置li的宽度,在页面中通过模板引擎语法动态加载不同的className.2.可以根据驻点个数和位置,用jquery去动态计算赋值 ...

  10. 安装部署VMware vSphere 5.5文档 (6-5) 安装配置vCenter

    部署VMware vSphere 5.5 实施文档 ########################################################################## ...