http://paulmck.livejournal.com/7314.html

RCU的作者,paul在他的blog中有提到这个问题,也明确提到需要在module exit的地方使用rcu_barrier来等待保证call_rcu的回调函数callback能够执行完成,然后再正式卸载模块,方式快速卸载之后call_back回调发现空指针的问题,从而导致kernel panic的问题。

RCU and unloadable modules

  • Jun. 8th, 2009 at 1:38 PM

The rcu_barrier() function was described some time back in an article on Linux Weekly News. This rcu_barrier() function solves the problem where a given module invokes call_rcu() using a function in that module, but the module is removed before the corresponding grace period elapses, or at least before the callback can be invoked. This results in an attempt to call a function whose code has been removed from the Linux kernel. Oops!!!

Since the above article was written, rcu_barrier_bh() and rcu_barrier_sched() have been accepted into the Linux kernel, for use with call_rcu_bh() and call_rcu_sched(), respectively. These functions have seen relatively little use, which is no surprise, given that they are quite specialized. However, Jesper Dangaard recently discovered that they need to be used a bit more heavily. This lead to the question of exactly when they needed to be used, to which I responded as follows:

Unless there is some other mechanism to ensure that all the RCU callbacks have been invoked before the module exit, there needs to be code in the module-exit function that does the following:

  1. Prevents any new RCU callbacks from being posted. In other words, make sure that no future call_rcu()invocations happen from this module unless those call_rcu() invocations touch only functions and data that outlive this module.
  2. Invokes rcu_barrier().
  3. Of course, if the module uses call_rcu_sched() instead of call_rcu(), then it should invoke rcu_barrier_sched() instead of rcu_barrier(). Similarly, if it uses call_rcu_bh() instead of call_rcu(), then it should invoke rcu_barrier_bh() instead of rcu_barrier(). If the module uses more than one of call_rcu()call_rcu_sched(), and call_rcu_bh(), then it must invoke more than one of rcu_barrier()rcu_barrier_sched(), and rcu_barrier_bh().

What other mechanism could be used? I cannot think of one that it safe. For example, a module that tried to count the number of RCU callbacks in flight would be vulnerable to races as follows:

  1. CPU 0: RCU callback decrements the counter.
  2. CPU 1: module-exit function notices that the counter is zero, so removes the module.
  3. CPU 0: attempts to execute the code returning from the RCU callback, and dies horribly due to that code no longer being in memory.

If there was an easy solution (or even a hard solution) to this problem, then I do not believe that Nikita Danilov would have asked Dipankar Sarma for rcu_barrier(). Therefore, I do not expect anyone to be able to come up with an alternative to rcu_barrier() and friends. Always happy to learn something by being proven wrong, of course!!!

So unless someone can show me some other safe mechanism, every unloadable module that uses call_rcu()call_rcu_sched(), or call_rcu_bh() must use rcu_barrier()rcu_barrier_sched(), and/or rcu_barrier_bh() in its module-exit function.

So if you have a module that uses one of the call_rcu() functions, please use the corresponding rcu_barrier()function in the module-exit code!

Update: Peter Zijlstra rightly points out that the issue is not whether your module invokes call_rcu(), but rather whether the corresponding RCU callback invokes a function that is in a module. So, if there is a call_rcu()call_rcu_sched(), or call_rcu_bh() anywhere in the kernel whose RCU callback either directly or indirectly invokes a function in your module, then your module's exit function needs to invoke rcu_barrier()rcu_barrier_sched(), and/or rcu_barrier_bh(). Thanks to Peter for pointing this out!

关于call_rcu在内核模块退出时可能引起kernel panic的问题的更多相关文章

  1. Android退出时关闭所有Activity的方法

    Android退出时,有的Activity可能没有被关闭.为了在Android退出时关闭所有的Activity,设计了以下的类: //关闭Activity的类 public class CloseAc ...

  2. Qt 程序退出时断言错误——_BLOCK_TYPE_IS_VALID(pHead->nBlockUse),由setAttribute(Qt::WA_DeleteOnClose)引起

    最近在学习QT,自己仿写了一个简单的QT绘图程序,但是在退出时总是报错,断言错误: 报错主要问题在_BLOCK_TYPE_IS_VALID(pHead->nBlockUse),是在关闭窗口时报的 ...

  3. Android设置Activity启动和退出时的动画

    业务开发时遇到的一个小特技,要求实现Activity启动时自下向上弹出,退出时自上向下退出. 此处不关注启动和退出时其他Activity的动画效果,实现方法有两种: 1.代码方式,通过Activity ...

  4. HP平台由于变量声明冲突导致程序退出时的core

    最近遇到一个莫名的问题,在HP-UX B.11.31 U ia64平台下,程序PetriService在接收到产品化退出或Ctrl-C时,程序在main函数返回后析构全局的CTQueue<SMs ...

  5. Android 编程下 Activity 的创建和应用退出时的销毁

    为了确保对应用中 Activity 的创建和销毁状态进行控制,所以就需要一个全局的变量来记录和销毁这些 Activity.这里的大概思路是写一个类继承 Application,并使获取该 Applic ...

  6. 解决log4cxx退出时的异常

    解决log4cxx退出时的异常(金庆的专栏)如果使用log4cxx的FileWatchdog线程来监视日志配置文件进行动态配置,就可能碰到程序退出时产生的异常.程序退出时清理工作耗时很长时,该异常很容 ...

  7. 神奇的bug,退出时自动更新时间

    遇到一个神奇的bug,用户退出时,上次登录时间会变成退出时的时间. 于是开始跟踪,发现Laravel在退出时,会做一次脏检查,这时会更新rember_token,这时就会有update操作如下. 而粗 ...

  8. java实现创建临时文件然后在程序退出时自动删除文件(转)

    这篇文章主要介绍了java实现创建临时文件然后在程序退出时自动删除文件,从个人项目中提取出来的,小伙伴们可以直接拿走使用. 通过java的File类创建临时文件,然后在程序退出时自动删除临时文件.下面 ...

  9. os.waitpid()无法获取sys.exit()退出时的status code

    [目的] 父进程使用os.waitpid()等待子进程退出,并检测子进程的exit code,以决定是否重启子进程. (常见的应用场景是:子进程接收外部命令,收到"stop"时退出 ...

随机推荐

  1. mp4格式(转帖加修改) 转载

    下面的软件下载地址:http://download.csdn.net/source/2607382 ftyp: 这是一个筐,可以装mdat等其他Box. 例:00 00 00 14 66 74 79 ...

  2. tensorflow serving 中 No module named tensorflow_serving.apis,找不到predict_pb2问题

    最近在学习tensorflow serving,但是运行官网例子,不使用bazel时,发现运行mnist_client.py的时候出错, 在api文件中也没找到predict_pb2,因此,后面在网上 ...

  3. LinQ to sql 各种数据库查询方法

    1.多条件查询: 并且 && 或者 || var list = con.car.Where(r => r.code == "c014" || r.oil == ...

  4. ASP.NET 分页+组合查询 练习

    分页和组合查询都是通过拼接SQL语句到数据库查询进行实现 到汽车表(car)中查询 ,汽车表选取了“编号 code”,“车名 name”,“日期 time”,“油耗 oil ”,“马力 powers” ...

  5. editplus设置自动换行方法 editplus自动换行设置步骤

    原文链接:https://www.jb51.net/softjc/165897.html 发布时间:2014-05-14 17:03:54   作者:佚名    我要评论 editplus自动换行设置 ...

  6. SpringMVC Shiro与filterChainDefinitions

    SpringMVC整合Shiro,Shiro是一个强大易用的Java安全框架,提供了认证.授权.加密和会话管理等功能. 第一步:配置web.xml <!-- 配置Shiro过滤器,先让Shiro ...

  7. Vi命令:如何删除全部内容

    Vi命令:如何删除全部内容? 在命令模式下,输入:.,$d 一回车就全没了. 表示从当前行到末行全部删除掉. 用gg表示移动到首行.

  8. python 函数的定义及传参

    函数是一个通用的程序结构部件,是组织好的,可重复使用的,用来实现单一,或相关联功能的代码段. 定义一个简单的函数: >>> def test(a): #创建一个函数,函数名是test ...

  9. java面试题:基础知识

    类和对象 Q:讲一下面向对象OOP思想. 面向对象主要是抽象,封装,继承,多态. 多态又分为重载和重写.重载主要是方法参数类型不一样,重写则是方法内容不一样. Q:抽象类和接口有什么区别? 抽象类中可 ...

  10. Idea设置类注释模板

    1.选择File–>Settings–>Editor–>File and Code Templates–>Includes–>File Header. 在Descript ...