__asm__ __volatile__("": : :"memory");

Reference: http://stackoverflow.com/questions/14950614/working-of-asm-volatile-memory

asm volatile("" ::: "memory");

creates a compiler-level memory barrier, forcing the optimizer not to reorder memory accesses across the barrier.

For example, if you need to access some addresses in a specific order (probably because that memory area is actually backed by a device rather than by ordinary memory), you need to be able to tell the compiler so; otherwise it may reorder or merge your steps for the sake of efficiency.

Assume that in this scenario you must increment a value at one address, read something, and then increment another value at an adjacent address.

int c(int *d, int *e) {
        int r;
        d[0] += 1;
        r = e[0];
        d[1] += 1;
        return r;
}

The problem is that the compiler (GCC in this case) can rearrange your memory accesses to get better performance if you ask for it (-O), probably leading to a sequence of instructions like the one below:

00000000 <c>:
   0:   4603        mov r3, r0
   2:   c805        ldmia   r0, {r0, r2}
   4:   3001        adds    r0, #1
   6:   3201        adds    r2, #1
   8:   6018        str r0, [r3, #0]
   a:   6808        ldr r0, [r1, #0]
   c:   605a        str r2, [r3, #4]
   e:   4770        bx  lr

Above, the values of d[0] and d[1] are loaded at the same time (by the single ldmia instruction), before e[0] is read. Let's assume this is something you want to avoid; then you need to tell the compiler not to reorder memory accesses, and the way to do that is asm volatile("" ::: "memory").

int c(int *d, int *e) {
        int r;
        d[0] += 1;
        r = e[0];
        asm volatile("" ::: "memory");
        d[1] += 1;
        return r;
}

So you'll get your instruction sequence as you want it to be:

00000000 <c>:
   0:   6802        ldr r2, [r0, #0]
   2:   4603        mov r3, r0
   4:   3201        adds    r2, #1
   6:   6002        str r2, [r0, #0]
   8:   6808        ldr r0, [r1, #0]
   a:   685a        ldr r2, [r3, #4]
   c:   3201        adds    r2, #1
   e:   605a        str r2, [r3, #4]
  10:   4770        bx  lr
  12:   bf00        nop

It should be noted that this is only a compile-time memory barrier that stops the compiler from reordering memory accesses; it emits no extra hardware-level instructions to flush memory or to wait for loads or stores to complete. CPUs can still reorder memory accesses if they have the architectural capability to do so.
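When ordering must also hold at the hardware level (for example between two threads on different cores), a compiler-only barrier is not enough. The following is a minimal sketch, assuming a C11 toolchain with <stdatomic.h>; the names payload, ready, publish and try_consume are hypothetical and chosen only for illustration. It uses atomic_thread_fence, which constrains the compiler and, where the target needs it, also emits a hardware fence. The comparable compiler-only fence in C11 is atomic_signal_fence.

```c
#include <stdatomic.h>

/* Hypothetical shared state, for illustration only. */
static int payload;        /* ordinary data being handed over */
static atomic_int ready;   /* flag that announces the data    */

/* Producer side: write the data, then raise the flag. */
void publish(int value)
{
        payload = value;
        /* Release fence: the payload store may not move below this point,
         * neither in the compiler nor (on weakly ordered CPUs) in hardware. */
        atomic_thread_fence(memory_order_release);
        atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Consumer side: returns 1 and fills *out once the flag has been seen. */
int try_consume(int *out)
{
        if (!atomic_load_explicit(&ready, memory_order_relaxed))
                return 0;
        /* Acquire fence: the payload load may not move above this point. */
        atomic_thread_fence(memory_order_acquire);
        *out = payload;
        return 1;
}
```

Calling try_consume before publish returns 0; after publish(7) it returns 1 and stores 7 through the pointer. Replacing both fences with asm volatile("" ::: "memory") would keep the compiler from reordering the accesses but would leave the CPU free to do so.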

This sequence is a compiler memory-access scheduling barrier, as noted in the article referenced by Udo. It is GCC-specific; other compilers have other ways of describing such barriers, some of them with more explicit (and less esoteric) statements.
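Because the syntax differs between compilers, code that has to build with more than one of them often hides the barrier behind a macro. The sketch below is one possible way to do that, not an established API: the macro name compiler_barrier and the function ordered_update are hypothetical. It assumes GCC or Clang for the inline-asm form, MSVC's _ReadWriteBarrier intrinsic from <intrin.h>, and C11's atomic_signal_fence as a fallback.

```c
/* Hypothetical portable wrapper around a compiler-only barrier. */
#if defined(__GNUC__) || defined(__clang__)
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")
#elif defined(_MSC_VER)
#include <intrin.h>
#define compiler_barrier() _ReadWriteBarrier()
#else
#include <stdatomic.h>
#define compiler_barrier() atomic_signal_fence(memory_order_seq_cst)
#endif

/* Same access pattern as the function c() above, using the wrapper. */
int ordered_update(int *d, int *e)
{
        int r;
        d[0] += 1;
        r = e[0];
        compiler_barrier();   /* d[1] may not be loaded before this point */
        d[1] += 1;
        return r;
}
```

With d = {10, 20} and e = {5}, ordered_update(d, e) returns 5 and leaves d as {11, 21}; the barrier changes only the order of the generated memory accesses, not the result.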

__asm__ is a GCC extension that permits assembly language statements to be nested within your C code. It is used here for its ability to specify side effects that prevent the compiler from performing certain types of optimisation (which in this case might end up generating incorrect code).

__volatile__ is required to ensure that the asm statement itself is not reordered with any other volatile accesses (a guarantee in the C language).

memory is an instruction to GCC that (sort of) says that the inline asm sequence has side effects on global memory, and hence that not just effects on local variables need to be taken into account.
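The effect of the "memory" clobber on global memory can be illustrated with a small sketch. The names barrier, counter and read_twice are assumptions made for this example (barrier() mirrors the macro name commonly used for this statement). Without the barrier, an optimizing compiler is free to load counter once and reuse the value; with it, the compiler must assume the asm statement may have modified counter and load it again.

```c
/* Compiler-only barrier, GCC/Clang syntax. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical global, deliberately not volatile. */
static int counter;

int read_twice(void)
{
        int a = counter;   /* first load                                   */
        barrier();         /* compiler must assume counter may have changed */
        int b = counter;   /* so this has to be a second, real load         */
        return a + b;
}
```

The barrier generates no instructions of its own; it only stops the compiler from keeping counter cached in a register across that point.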
