关于call_rcu在内核模块退出时可能引起kernel panic的问题
http://paulmck.livejournal.com/7314.html
RCU的作者,paul在他的blog中有提到这个问题,也明确提到需要在module exit的地方使用rcu_barrier来等待保证call_rcu的回调函数callback能够执行完成,然后再正式卸载模块,方式快速卸载之后call_back回调发现空指针的问题,从而导致kernel panic的问题。
RCU and unloadable modules
- Jun. 8th, 2009 at 1:38 PM
The rcu_barrier() function was described some time back in an article on Linux Weekly News. This rcu_barrier() function solves the problem where a given module invokes call_rcu() using a function in that module, but the module is removed before the corresponding grace period elapses, or at least before the callback can be invoked. This results in an attempt to call a function whose code has been removed from the Linux kernel. Oops!!!
Since the above article was written, rcu_barrier_bh() and rcu_barrier_sched() have been accepted into the Linux kernel, for use with call_rcu_bh() and call_rcu_sched(), respectively. These functions have seen relatively little use, which is no surprise, given that they are quite specialized. However, Jesper Dangaard recently discovered that they need to be used a bit more heavily. This lead to the question of exactly when they needed to be used, to which I responded as follows:
Unless there is some other mechanism to ensure that all the RCU callbacks have been invoked before the module exit, there needs to be code in the module-exit function that does the following:
- Prevents any new RCU callbacks from being posted. In other words, make sure that no future
call_rcu()invocations happen from this module unless thosecall_rcu()invocations touch only functions and data that outlive this module.- Invokes
rcu_barrier().- Of course, if the module uses
call_rcu_sched()instead ofcall_rcu(), then it should invokercu_barrier_sched()instead ofrcu_barrier(). Similarly, if it usescall_rcu_bh()instead ofcall_rcu(), then it should invokercu_barrier_bh()instead ofrcu_barrier(). If the module uses more than one ofcall_rcu(),call_rcu_sched(), andcall_rcu_bh(), then it must invoke more than one ofrcu_barrier(),rcu_barrier_sched(), andrcu_barrier_bh().What other mechanism could be used? I cannot think of one that it safe. For example, a module that tried to count the number of RCU callbacks in flight would be vulnerable to races as follows:
- CPU 0: RCU callback decrements the counter.
- CPU 1: module-exit function notices that the counter is zero, so removes the module.
- CPU 0: attempts to execute the code returning from the RCU callback, and dies horribly due to that code no longer being in memory.
If there was an easy solution (or even a hard solution) to this problem, then I do not believe that Nikita Danilov would have asked Dipankar Sarma for
rcu_barrier(). Therefore, I do not expect anyone to be able to come up with an alternative torcu_barrier()and friends. Always happy to learn something by being proven wrong, of course!!!So unless someone can show me some other safe mechanism, every unloadable module that uses
call_rcu(),call_rcu_sched(), orcall_rcu_bh()must usercu_barrier(),rcu_barrier_sched(), and/orrcu_barrier_bh()in its module-exit function.
So if you have a module that uses one of the call_rcu() functions, please use the corresponding rcu_barrier()function in the module-exit code!
Update: Peter Zijlstra rightly points out that the issue is not whether your module invokes call_rcu(), but rather whether the corresponding RCU callback invokes a function that is in a module. So, if there is a call_rcu(), call_rcu_sched(), or call_rcu_bh() anywhere in the kernel whose RCU callback either directly or indirectly invokes a function in your module, then your module's exit function needs to invoke rcu_barrier(), rcu_barrier_sched(), and/or rcu_barrier_bh(). Thanks to Peter for pointing this out!
关于call_rcu在内核模块退出时可能引起kernel panic的问题的更多相关文章
- Android退出时关闭所有Activity的方法
Android退出时,有的Activity可能没有被关闭.为了在Android退出时关闭所有的Activity,设计了以下的类: //关闭Activity的类 public class CloseAc ...
- Qt 程序退出时断言错误——_BLOCK_TYPE_IS_VALID(pHead->nBlockUse),由setAttribute(Qt::WA_DeleteOnClose)引起
最近在学习QT,自己仿写了一个简单的QT绘图程序,但是在退出时总是报错,断言错误: 报错主要问题在_BLOCK_TYPE_IS_VALID(pHead->nBlockUse),是在关闭窗口时报的 ...
- Android设置Activity启动和退出时的动画
业务开发时遇到的一个小特技,要求实现Activity启动时自下向上弹出,退出时自上向下退出. 此处不关注启动和退出时其他Activity的动画效果,实现方法有两种: 1.代码方式,通过Activity ...
- HP平台由于变量声明冲突导致程序退出时的core
最近遇到一个莫名的问题,在HP-UX B.11.31 U ia64平台下,程序PetriService在接收到产品化退出或Ctrl-C时,程序在main函数返回后析构全局的CTQueue<SMs ...
- Android 编程下 Activity 的创建和应用退出时的销毁
为了确保对应用中 Activity 的创建和销毁状态进行控制,所以就需要一个全局的变量来记录和销毁这些 Activity.这里的大概思路是写一个类继承 Application,并使获取该 Applic ...
- 解决log4cxx退出时的异常
解决log4cxx退出时的异常(金庆的专栏)如果使用log4cxx的FileWatchdog线程来监视日志配置文件进行动态配置,就可能碰到程序退出时产生的异常.程序退出时清理工作耗时很长时,该异常很容 ...
- 神奇的bug,退出时自动更新时间
遇到一个神奇的bug,用户退出时,上次登录时间会变成退出时的时间. 于是开始跟踪,发现Laravel在退出时,会做一次脏检查,这时会更新rember_token,这时就会有update操作如下. 而粗 ...
- java实现创建临时文件然后在程序退出时自动删除文件(转)
这篇文章主要介绍了java实现创建临时文件然后在程序退出时自动删除文件,从个人项目中提取出来的,小伙伴们可以直接拿走使用. 通过java的File类创建临时文件,然后在程序退出时自动删除临时文件.下面 ...
- os.waitpid()无法获取sys.exit()退出时的status code
[目的] 父进程使用os.waitpid()等待子进程退出,并检测子进程的exit code,以决定是否重启子进程. (常见的应用场景是:子进程接收外部命令,收到"stop"时退出 ...
随机推荐
- List,Set,Map集合的遍历方法
List的三种实现:ArrayList(数组) LinkedList(链表) Vector(线程安全) List集合遍历方法: List<String> list = new Arra ...
- /src/applicationContext.xml
<?xml version="1.0" encoding="UTF-8"?> <beans xmlns="http://www.sp ...
- rectangle,boundingRect和Rect
rectangle( rook_image, Point( , *w/8.0 ), Point( w, w), Scalar( , , ), , ); 矩形将被画到图像 rook_image 上 矩形 ...
- vue 解决页面闪烁问题
watch: { //监听表格数据的变化[使用 watch+nextTick 可以完成页面数据监听的 不会出现闪烁] tableData: { //深度监听,可监听到对象.数组的变化 handler( ...
- Android无法访问本地服务器(localhost/127.0.0.1)的解决方案
[Android无法访问本地服务器(localhost/127.0.0.1)的解决方案] 在Android开发中通过localhost或127.0.0.1访问本地服务器时,会报Java.NET.Con ...
- Ajax异步提交造成变量undefined
在使用jQuery的get方法或post方法向后台发ajax请求时,在其中定义一个变量htmlcollectionlst,但是在循环结束后却发现是undifined $.get("GetPl ...
- Ajax图片异步上传并回显
1.jsp页面 <td width="20%" class="pn-flabel pn-flabel-h"></td> <td w ...
- linux下的C++项目创建
CMake项目的完整构建 Linux下的CMake项目通常由几个文件夹组成.小伙伴们可以先在自己的电脑上新建一个文件夹,作为你代码的根目录,然后往里面建几个子文件夹,这里并不涉及具体的代码,只是可以作 ...
- Compare Version Numbers(STRING-TYPE CONVERTION)
QUESTION Compare two version numbers version1 and version1.If version1 > version2 return 1, if ve ...
- bootloader新的理解
1.对于bootloader这样的程序,作为板卡刚开始启动的部分,大致的顺序是一致的,大部分都是分为两个部分,一部分是汇编编写的,一部分是用c语言编写的.一般在汇编部分完成各种初始化的操作,比如关闭看 ...