A brief introduction to per-cpu variables
Original post (link outside the firewall): http://thinkiii.blogspot.com/2014/05/a-brief-introduction-to-per-cpu.html
Per-cpu variables are widely used in the Linux kernel, for example in per-cpu counters and per-cpu caches. Their advantage is straightforward: data that belongs to a single cpu does not need a lock to synchronize with other cpus, and avoiding locks improves performance.
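To make the lock-free idea concrete, here is a minimal user-space sketch of a per-cpu counter. It is only a model, not kernel code: the struct, the function names, NR_CPUS = 4 and the 64-byte cache line are all assumptions made for illustration. Each cpu increments only its own slot, and a reader adds the slots together.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS   4   /* assumed cpu count for this model */
#define CACHELINE 64  /* assumed cache line size in bytes */

/* One slot per cpu, padded so that two cpus never share a cache line. */
struct percpu_counter_model {
	struct {
		long count;
		char pad[CACHELINE - sizeof(long)];
	} slot[NR_CPUS];
};

/* Runs on behalf of `cpu`: it touches only that cpu's slot, so no lock is taken. */
static void counter_inc(struct percpu_counter_model *c, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	c->slot[cpu].count++;
}

/* A reader folds all slots into one value. */
static long counter_sum(const struct percpu_counter_model *c)
{
	long sum = 0;

	for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->slot[cpu].count;
	return sum;
}
```

The update path is cheap because it never contends with other cpus; the cost moves to the reader, which has to visit every slot.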
There are two kinds of per-cpu variables: static and dynamic. Static per-cpu variables are defined at build time. Linux provides the DEFINE_PER_CPU macro to define them.
#define DEFINE_PER_CPU(type, name)
static DEFINE_PER_CPU(struct delayed_work, vmstat_work);
Dynamic per-cpu variables are allocated at run time with the __alloc_percpu API. __alloc_percpu returns the per-cpu address of the variable.
void __percpu *__alloc_percpu(size_t size, size_t align)
s->cpu_slab = __alloc_percpu(sizeof(struct kmem_cache_cpu), 2 * sizeof(void *));
One big difference between per-cpu variables and other variables is that we must use the per-cpu accessor macros to reach the real per-cpu variable for a given cpu. Accessing a per-cpu variable without these macros is a bug in Linux kernel programming. We will see the reason later.
Here are two examples of accessing per-cpu variables:
struct vm_event_state *this = &per_cpu(vm_event_states, cpu);
struct kmem_cache_cpu *c = per_cpu_ptr(s->cpu_slab, cpu);
Let's take a closer look at the behaviour of Linux per-cpu variables. After we define our static per-cpu variables, the compiler places all of them in the per-cpu section. We can see them with the 'nm' and 'readelf' tools:
D __per_cpu_start
...
000000000000f1c0 d lru_add_drain_work
000000000000f1e0 D vm_event_states
000000000000f420 d vmstat_work
000000000000f4a0 d vmap_block_queue
000000000000f4c0 d vfree_deferred
000000000000f4f0 d memory_failure_cpu
...
0000000000013ac0 D __per_cpu_end
[] .vvar PROGBITS ffffffff81698000
00000000000000f0 WA
[] .data..percpu PROGBITS 00a00000
0000000000013ac0 WA
[] .init.text PROGBITS ffffffff816ad000 00aad000
000000000003fa21 AX
You can see that vmstat_work is at 0xf420, which lies between __per_cpu_start and __per_cpu_end. These two special symbols mark the start and end addresses of the per-cpu section.
One simple question: there is only one entry for vmstat_work in the per-cpu section, but we should have NR_CPUS copies of it. Where are all the other vmstat_work entries?
Actually, the per-cpu section is just a roadmap of all per-cpu variables. The real body of every per-cpu variable is allocated in a per-cpu chunk at run time. Linux makes NR_CPUS copies of the static and dynamic variables. To reach those real bodies, we use the per_cpu or per_cpu_ptr macros.
What per_cpu and per_cpu_ptr do is add an offset (taken from __per_cpu_offset) to the given address to reach the real body of the per-cpu variable.
#define per_cpu(var, cpu) \
	(*SHIFT_PERCPU_PTR(&(var), per_cpu_offset(cpu)))
#define per_cpu_offset(x) (__per_cpu_offset[x])
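The offset trick can be tried out in user space. The sketch below is a simplified model under stated assumptions, not the kernel implementation: percpu_section stands in for the per-cpu section, per_cpu_offset_model for __per_cpu_offset, and malloc for the real chunk allocator. All names ending in _model, the PER_CPU_INT macro, and the field values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NR_CPUS 4 /* assumed cpu count for this model */

/* Stand-in for the per-cpu section: a single template copy of each variable. */
struct percpu_template {
	int  vmstat_work;      /* hypothetical fields, named after the nm output */
	long vm_event_states;
};
static struct percpu_template percpu_section = { .vmstat_work = 7 };

/* Model of __per_cpu_offset[]: distance from the template to each cpu's copy. */
static uintptr_t per_cpu_offset_model[NR_CPUS];

/* Allocate one copy per cpu and record its offset (never freed in this model). */
static int setup_per_cpu_areas_model(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		void *copy = malloc(sizeof(percpu_section));

		if (!copy)
			return -1;
		memcpy(copy, &percpu_section, sizeof(percpu_section));
		per_cpu_offset_model[cpu] =
			(uintptr_t)copy - (uintptr_t)&percpu_section;
	}
	return 0;
}

/* Same idea as SHIFT_PERCPU_PTR: template address plus the cpu's offset. */
static void *shift_percpu_ptr_model(void *template_addr, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return (void *)((uintptr_t)template_addr + per_cpu_offset_model[cpu]);
}

/* Accessor for int-typed variables only, to keep the model in plain C. */
#define PER_CPU_INT(var, cpu) (*(int *)shift_percpu_ptr_model(&(var), (cpu)))
```

After setup_per_cpu_areas_model() succeeds, PER_CPU_INT(percpu_section.vmstat_work, 2) = 42 writes to cpu 2's copy only. The template and the other cpus' copies keep the initial value 7, which is why touching the template address directly would be a bug.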
It's easier to understand the idea with a picture:
[Figure: Translating a per-cpu variable to its real body (NR_CPUS = 4)]
Take a closer look:
There are three parts in a unit: static, reserved, and dynamic.
static: the static per-cpu variables. (__per_cpu_end - __per_cpu_start)
reserved: per-cpu slot reserved for kernel modules
dynamic: slots for dynamic allocation (__alloc_percpu)
[Figure: Unit and chunk]
static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
				size_t reserved_size, size_t dyn_size,
				size_t atom_size,
				pcpu_fc_cpu_distance_fn_t cpu_distance_fn)
{
	static int group_map[NR_CPUS] __initdata;
	static int group_cnt[NR_CPUS] __initdata;
	const size_t static_size = __per_cpu_end - __per_cpu_start;
	/* ... folded lines: int nr_groups = 1, nr_units = 0; ... */
	/* calculate size_sum and ensure dyn_size is enough for early alloc */
	size_sum = PFN_ALIGN(static_size + reserved_size +
			    max_t(size_t, dyn_size, PERCPU_DYNAMIC_EARLY_SIZE));
	dyn_size = size_sum - static_size - reserved_size;
	/* ... folded lines: Determine min_unit_size, alloc_size and max_upa such that ... */
}
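The size calculation above can be worked through with concrete numbers. The sketch below mirrors the two size_sum and dyn_size statements in user space. The 4 KiB page size, the 8 KiB reserved size, the 20 KiB requested dynamic size and the 12 KiB early minimum are assumptions chosen for illustration; only the static size 0x13ac0 comes from the section dump earlier. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_MODEL ((size_t)4096) /* assumption: 4 KiB pages */
#define PFN_ALIGN_MODEL(x) \
	(((x) + PAGE_SIZE_MODEL - 1) & ~(PAGE_SIZE_MODEL - 1))

struct unit_layout {
	size_t static_size;   /* __per_cpu_end - __per_cpu_start */
	size_t reserved_size; /* reserved for kernel modules */
	size_t dyn_size;      /* what is left for dynamic allocation */
	size_t size_sum;      /* page-aligned total of the three parts */
};

static struct unit_layout unit_layout_model(size_t static_size,
					    size_t reserved_size,
					    size_t dyn_size,
					    size_t dyn_early_min)
{
	struct unit_layout u;
	size_t dyn = dyn_size > dyn_early_min ? dyn_size : dyn_early_min;

	u.static_size = static_size;
	u.reserved_size = reserved_size;
	/* round the total up to a page boundary ... */
	u.size_sum = PFN_ALIGN_MODEL(static_size + reserved_size + dyn);
	/* ... and hand the rounding slack to the dynamic part */
	u.dyn_size = u.size_sum - static_size - reserved_size;
	assert(u.size_sum % PAGE_SIZE_MODEL == 0);
	return u;
}
```

With unit_layout_model(0x13ac0, 8 << 10, 20 << 10, 12 << 10), the raw total is 80576 + 8192 + 20480 = 109248 bytes, which rounds up to 110592 bytes (27 pages). The dynamic part therefore grows from 20480 to 21824 bytes.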
After determining the size of the unit, the chunk is allocated by the memblock APIs.
int __init pcpu_embed_first_chunk(size_t reserved_size, size_t dyn_size,
				  size_t atom_size,
				  pcpu_fc_cpu_distance_fn_t cpu_distance_fn,
				  pcpu_fc_alloc_fn_t alloc_fn,
				  pcpu_fc_free_fn_t free_fn)
{
	/* ... folded lines: void *base = (void *)ULONG_MAX; ... */
	/* allocate, copy and determine base address */
	for (group = 0; group < ai->nr_groups; group++) {
		struct pcpu_group_info *gi = &ai->groups[group];
		unsigned int cpu = NR_CPUS;
		void *ptr;

		for (i = 0; i < gi->nr_units && cpu == NR_CPUS; i++)
			cpu = gi->cpu_map[i];
		BUG_ON(cpu == NR_CPUS);

		/* allocate space for the whole group */
		ptr = alloc_fn(cpu, gi->nr_units * ai->unit_size, atom_size);
		if (!ptr) {
			rc = -ENOMEM;
			goto out_free_areas;
		}
		/* kmemleak tracks the percpu allocations separately */
		kmemleak_free(ptr);
		areas[group] = ptr;

		base = min(ptr, base);
	}
	/* ... folded lines: Copy data and free unused parts. This should happen after all ... */
}
static void * __init pcpu_dfl_fc_alloc(unsigned int cpu, size_t size,
				       size_t align)
{
	return memblock_virt_alloc_from_nopanic(
			size, align, __pa(MAX_DMA_ADDRESS));
}
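The allocate-and-copy step can also be modelled in user space. The sketch below assumes a single group with one unit per cpu, uses calloc in place of the memblock allocator, and copies a static template to the start of every unit. The chunk_model struct and both functions are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct chunk_model {
	unsigned char *base;   /* start of the whole allocation */
	size_t unit_size;      /* static + reserved + dynamic, page aligned */
	unsigned int nr_units; /* one unit per cpu in this model */
};

/* Allocate nr_units * unit_size bytes and seed each unit with the static data. */
static int chunk_init_model(struct chunk_model *c, unsigned int nr_units,
			    size_t unit_size, const void *static_template,
			    size_t static_size)
{
	assert(static_size <= unit_size);
	c->base = calloc(nr_units, unit_size); /* reserved and dynamic parts start zeroed */
	if (!c->base)
		return -1;
	c->unit_size = unit_size;
	c->nr_units = nr_units;
	for (unsigned int u = 0; u < nr_units; u++)
		memcpy(c->base + (size_t)u * unit_size,
		       static_template, static_size);
	return 0;
}

/* Address of byte `off` inside the unit that belongs to `unit`. */
static void *chunk_unit_addr(const struct chunk_model *c, unsigned int unit,
			     size_t off)
{
	assert(unit < c->nr_units && off < c->unit_size);
	return c->base + (size_t)unit * c->unit_size + off;
}
```

Because every unit has the same layout, a variable sits at the same offset in each unit, and the units are exactly unit_size bytes apart. That regular spacing is what lets a per-cpu offset table translate one template address into the right copy for any cpu.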