DotNet Core Threadpool

Jai Rathore

https://medium.com/@jaiadityarathore/dotnet-core-threadpool-bef2f5a37888

If you are reading this, chances are your .NET Core app is running into performance problems as it scales, and you suspect thread starvation is one of the causes. Let's take a look at how the .NET thread pool works.

The thread pool is the .NET runtime's built-in, software-level thread management system. It is also the queuing mechanism for all incoming requests: .NET Core has no separate request queue besides the thread pool.

At the hardware level we typically have one thread per CPU core (ignoring simultaneous multithreading, which usually exposes two hardware threads per core). For this post, assume a 16-core processor, so 16 threads can run at the same time. Ideally we want the number of threads our application spawns to be at or around that number. This may be a little confusing, but stay with me.

In a fully asynchronous environment, the .NET thread pool works like this: an incoming request is given a thread, which runs the request's code up to the first asynchronous call, saves the current context, and returns itself to the global pool as an available thread. When the asynchronous call completes, or when a new request arrives, the thread pool checks whether all 16 threads are already busy. If they are, it spawns a new thread; if not, it reuses one of the available threads from the pool.
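
The hand-off described above can be observed with a small console sketch that prints the managed thread ID before and after an await. This is an illustration only, not code from the original post: Task.Delay stands in for any asynchronous I/O call, and the two IDs may or may not differ on a given run.

```csharp
using System;
using System.Threading.Tasks;

class AwaitHandOff
{
    static async Task Main()
    {
        // Runs on whichever thread started Main.
        Console.WriteLine($"before await: thread {Environment.CurrentManagedThreadId}");

        // While the delay is pending, no thread is held by this method.
        await Task.Delay(100);

        // In a console app the continuation is scheduled onto a pool thread,
        // which is often (but not always) a different thread.
        Console.WriteLine($"after await: thread {Environment.CurrentManagedThreadId}");
    }
}
```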

Synchronous vs Asynchronous request processing

However, if some of the calls in the application are blocking (not asynchronous), the thread that ran the first part of the operation can neither return itself to the global pool nor be destroyed. It stays blocked until the entire operation is complete. If all 16 threads are stuck at the same time in this way, the thread pool has to spawn a new thread for every incoming request.

If a burst of 1000 requests arrives at this point, the thread pool would have to spawn a thread for every one of them, which means a lot of threads being spun up. But the CPU can still only run 16 threads at any single moment. Creating, destroying, and context-switching threads is expensive, and so is their mere existence: they eat up a lot of memory and can bring a system to a halt. We do not want that many threads running.

To avoid this, the .NET thread pool has a throttling mechanism: once the threshold is reached, it spawns new threads at a rate of roughly one every 0.5 seconds. In the worst case, where the application has many blocking calls and every thread is stuck on one, a burst of 1000 incoming requests means that a new or half-processed request can wait up to 1000 * 0.5 = 500 seconds. Consider an operation that retrieves data from the database. Even though the first part of the request has already retrieved the data, it may have to wait up to 500 seconds for a thread to send that data back to the application. At the application level this shows up as a slow query, when in reality the query executed quickly and thread availability was the issue. This problem is commonly referred to as thread starvation.
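
A rough way to see this throttling is to queue more blocking work items than the pool's default minimum and log when each one starts. The sketch below is an assumption-laden demo, not the author's code: Thread.Sleep stands in for a blocking call, the burst size is arbitrary, and the exact timings depend on the runtime version and the machine. ThreadPool.ThreadCount requires .NET Core 3.0 or later.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class StarvationDemo
{
    static void Main()
    {
        // Queue four times as many blocking items as there are logical processors.
        int burst = Environment.ProcessorCount * 4;
        var clock = Stopwatch.StartNew();
        var tasks = new Task[burst];

        for (int i = 0; i < burst; i++)
        {
            int id = i;
            tasks[i] = Task.Run(() =>
            {
                // Items beyond the minimum thread count tend to start noticeably later.
                Console.WriteLine(
                    $"item {id} started after {clock.ElapsedMilliseconds} ms, " +
                    $"pool threads: {ThreadPool.ThreadCount}");
                Thread.Sleep(2000); // stand-in for a blocking database or HTTP call
            });
        }

        Task.WaitAll(tasks);
        Console.WriteLine($"all items finished after {clock.ElapsedMilliseconds} ms");
    }
}
```

If the start times of the later items climb in steps rather than all sitting near zero, the pool is adding threads gradually, which is the behaviour the paragraph above describes.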

ThreadPool Default behavior

Thread Pool with Custom Minimum Value

The .NET thread pool provides two settings: MinimumValue and MaximumValue.

MaximumValue: the total number of threads that can be spawned, typically 32,767 by default.

Minimum Value: this does not mean a minimum number of threads that are always present. The application does not boot with that many threads; it still starts with as many threads as there are CPU cores. It is the number of threads the pool will spawn on demand before .NET starts throttling thread creation, so it is better thought of as a threshold. By default .NET sets it to the number of hardware threads available, which is 16 in our example. Suppose all 16 threads are in use and 5 new requests come in. For each request the thread pool waits 0.5 seconds and then checks whether a thread has become free. If one has, it is assigned to the request; if not, a new thread is spawned. If all threads are stuck in blocking operations, the fifth request may have to wait 5 * 0.5 = 2.5 seconds before it is processed.
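
The defaults discussed here can be read at runtime. The following console sketch prints them next to the logical processor count; the values it prints depend on the machine and runtime, so treat the figures in the text (16 and 32,767) as examples.

```csharp
using System;
using System.Threading;

class PoolDefaults
{
    static void Main()
    {
        // Each setting has a worker-thread value and an I/O completion port (IOCP) value.
        ThreadPool.GetMinThreads(out int minWorker, out int minIocp);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIocp);

        Console.WriteLine($"logical processors: {Environment.ProcessorCount}");
        Console.WriteLine($"min threads: worker={minWorker}, IOCP={minIocp}");
        Console.WriteLine($"max threads: worker={maxWorker}, IOCP={maxIocp}");
    }
}
```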

Suppose we increase the minimum value to 100. If there is a sudden burst of 100+ requests while all current threads are busy, the pool immediately spawns a thread for each of those requests, up to the 100-thread minimum, and only then starts throttling. That leaves 100 or more threads fighting for memory and CPU on a machine that can only run 16 threads at a time. If the burst is bigger, or continues for longer, the system can soon become unresponsive and need a reboot.

We can use the minimum value as a temporary patch, but Microsoft does not recommend it. A minimum value of 100 should still be relatively safe, provided the bursts of incoming requests are not constant or long lasting. Their recommendation was that, if the blocking calls cannot be avoided, the setting should be combined with some sort of concurrency limiter that stops accepting incoming requests past a certain limit (returning 503), so that the system does not become unresponsive.
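
One possible shape for such a limiter is a small ASP.NET Core middleware built on SemaphoreSlim. This is a minimal sketch under stated assumptions, not Microsoft's recommended implementation: the class name ConcurrencyLimitMiddleware and the maxConcurrentRequests parameter are hypothetical, and it rejects excess requests immediately rather than queuing them.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

// Hypothetical middleware: rejects requests with 503 once the limit is reached.
public class ConcurrencyLimitMiddleware
{
    private readonly RequestDelegate _next;
    private readonly SemaphoreSlim _slots;

    public ConcurrencyLimitMiddleware(RequestDelegate next, int maxConcurrentRequests)
    {
        _next = next;
        _slots = new SemaphoreSlim(maxConcurrentRequests);
    }

    public async Task InvokeAsync(HttpContext context)
    {
        // Wait(0) does not block: it returns false at once if no slot is free.
        if (!_slots.Wait(0))
        {
            context.Response.StatusCode = StatusCodes.Status503ServiceUnavailable;
            return;
        }

        try
        {
            await _next(context);
        }
        finally
        {
            // Free the slot whether the request succeeded or threw.
            _slots.Release();
        }
    }
}
```

It could be registered early in the pipeline with app.UseMiddleware<ConcurrencyLimitMiddleware>(100), where 100 is only an example limit that would need tuning per application.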

Setting a custom minimum value for the thread pool only takes a small change. For example, to set the minimum to 100 for both worker threads and IOCP threads, add this line to the ConfigureServices method of your Startup.cs:

ThreadPool.SetMinThreads(100, 100);
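
SetMinThreads returns a bool, so it is worth checking that the change was accepted. The console sketch below shows the call in isolation with before and after values; 100 is the example value from the text, not a recommendation.

```csharp
using System;
using System.Threading;

class MinThreadsSetup
{
    static void Main()
    {
        ThreadPool.GetMinThreads(out int worker, out int iocp);
        Console.WriteLine($"before: worker={worker}, IOCP={iocp}");

        // First argument: worker threads. Second argument: IOCP threads.
        bool applied = ThreadPool.SetMinThreads(100, 100);
        if (!applied)
        {
            // The call fails if a value is negative or exceeds the current maximum.
            Console.WriteLine("SetMinThreads rejected the requested values");
        }

        ThreadPool.GetMinThreads(out worker, out iocp);
        Console.WriteLine($"after: worker={worker}, IOCP={iocp}");
    }
}
```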

The real problem is all the blocking (non-async) calls across our applications. We should clean up every place where we use blocking calls such as .Wait(), .Result, or .GetAwaiter().GetResult(), and use await instead.
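
The difference between the two styles is small in code. In this illustrative sketch, LoadAsync is a hypothetical stand-in for a database or HTTP call; the blocking version holds its thread for the whole wait, while the awaited version releases it.

```csharp
using System;
using System.Threading.Tasks;

class BlockingVersusAwait
{
    // Hypothetical stand-in for a database or HTTP call.
    static async Task<string> LoadAsync()
    {
        await Task.Delay(200);
        return "data";
    }

    // Blocking: the calling thread is parked until the task finishes.
    static string LoadBlocking() => LoadAsync().GetAwaiter().GetResult();

    // Non-blocking: the calling thread is released while the task is pending.
    static async Task<string> LoadWithAwaitAsync() => await LoadAsync();

    static async Task Main()
    {
        Console.WriteLine(LoadBlocking());
        Console.WriteLine(await LoadWithAwaitAsync());
    }
}
```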

Use ReadFormAsync instead of Request.Form.
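
In an ASP.NET Core controller this could look like the sketch below. The controller, route, and the form field "name" are hypothetical and only serve the illustration.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

// Hypothetical controller used only to illustrate the call.
public class UploadController : ControllerBase
{
    [HttpPost("/upload")]
    public async Task<IActionResult> Upload()
    {
        // Reads and parses the form body asynchronously,
        // instead of blocking the thread as Request.Form can.
        var form = await Request.ReadFormAsync();

        string name = form["name"].ToString();
        return Ok(name);
    }
}
```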

If you are running on .NET Core 3.0 or later, return IAsyncEnumerable instead of IEnumerable from action results, because the latter results in synchronous serialization.
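
A minimal action that returns IAsyncEnumerable might look like this. It assumes ASP.NET Core 3.0 or later and C# 8; the controller and route are hypothetical, and Task.Delay stands in for an asynchronous data source.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

// Hypothetical controller used only to illustrate the return type.
public class NumbersController : ControllerBase
{
    [HttpGet("/numbers")]
    public async IAsyncEnumerable<int> GetNumbers()
    {
        for (int i = 0; i < 5; i++)
        {
            // Stand-in for an asynchronous read from a database or stream.
            await Task.Delay(10);
            yield return i;
        }
    }
}
```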

Before setting custom values for the thread pool, we should measure the app's current metrics, such as the current number of threads in the thread pool. This can be done with a package like AppMetrics DotnetRuntime. (I will try to cover setting that up in another post.) We should experiment with different minimum values until we find the sweet spot. Also look at these recommendations: https://docs.microsoft.com/en-us/aspnet/core/performance/performance-best-practices?view=aspnetcore-5.0
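
For a quick look without an extra package, the runtime itself exposes a few counters on the ThreadPool class (.NET Core 3.0 or later). The sketch below samples them once a second; in a real app the same properties could be read from a background service or a diagnostics endpoint, which is an assumption of this example rather than something the original post covers.

```csharp
using System;
using System.Threading;

class PoolSampler
{
    static void Main()
    {
        // Take five samples, one second apart.
        for (int i = 0; i < 5; i++)
        {
            Console.WriteLine(
                $"threads={ThreadPool.ThreadCount} " +
                $"queued={ThreadPool.PendingWorkItemCount} " +
                $"completed={ThreadPool.CompletedWorkItemCount}");
            Thread.Sleep(1000);
        }
    }
}
```

A thread count that keeps rising together with a growing queue length is a common sign of the starvation pattern described earlier.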

However, it is best to leave the thread pool alone. As soon as the expected burst of requests has passed, the first step should be to remove all the blocking calls from your apps and make them fully asynchronous as soon as possible.
