Atomic operations on the x86 processors
x86 processors, from Intel and AMD alike, ship with more and more CPU cores running in parallel.
In the old days, when there was a single processor, the operation:
++i;
was effectively thread safe, provided the compiler emitted it as one machine instruction, because a single instruction cannot be interrupted halfway through on a single processor. These days even laptops have several CPU cores, so a single-instruction read-modify-write is no longer safe: two cores can read the same old value, and one of the updates is lost. What do you do? Do you need to wrap every operation in a mutex or semaphore? Well, maybe you don't need to.
Fortunately, the x86 has an instruction prefix, lock, that makes certain memory-referencing instructions execute atomically on the memory location they address.
Here are a few basic functions that use it:
For the GNU compiler:
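/* "+m" marks the operand as both read and written; the "memory" clobber
   keeps the compiler from reordering other memory accesses across the
   operation, and "cc" records that the flags are modified. */
void atom_inc(volatile int *num)
{
    __asm__ __volatile__ ("lock incl %0" : "+m" (*num) : : "cc", "memory");
}

void atom_dec(volatile int *num)
{
    __asm__ __volatile__ ("lock decl %0" : "+m" (*num) : : "cc", "memory");
}

/* Stores inval in *m and returns the previous value. xchg with a memory
   operand is always locked, so the prefix is redundant but harmless. */
int atom_xchg(volatile int *m, int inval)
{
    int val = inval;
    __asm__ __volatile__ ("lock xchgl %1, %0" : "+m" (*m), "+r" (val) : : "memory");
    return val;
}

void atom_add(volatile int *m, int inval)
{
    __asm__ __volatile__ ("lock addl %1, %0" : "+m" (*m) : "ir" (inval) : "cc", "memory");
}

void atom_sub(volatile int *m, int inval)
{
    __asm__ __volatile__ ("lock subl %1, %0" : "+m" (*m) : "ir" (inval) : "cc", "memory");
}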
For the Microsoft compiler (32-bit x86 builds only, since its inline assembler is not available when targeting x64):
void atom_inc(volatile int *num)
{
    __asm
    {
        mov esi, num
        lock inc DWORD PTR [esi]
    }
}

void atom_dec(volatile int *num)
{
    __asm
    {
        mov esi, num
        lock dec DWORD PTR [esi]
    }
}

int atom_xchg(volatile int *m, int inval)
{
    __asm
    {
        mov eax, inval
        mov esi, m
        lock xchg eax, DWORD PTR [esi]
        mov inval, eax
    }
    return inval;
}

void atom_add(volatile int *num, int val)
{
    __asm
    {
        mov esi, num
        mov eax, val
        lock add DWORD PTR [esi], eax
    }
}

void atom_sub(volatile int *num, int val)
{
    __asm
    {
        mov esi, num
        mov eax, val
        lock sub DWORD PTR [esi], eax
    }
}
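Both sets of functions above are tied to one compiler's inline assembly syntax. As a hedged alternative that is not part of the original article, the same five operations can be sketched with C11 <stdatomic.h>. The _c11 suffixed names and atom_demo are hypothetical, chosen only to mirror the functions above. On x86, GCC and Clang typically compile these calls down to lock-prefixed instructions, but that is the compiler's choice rather than something the code guarantees.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical C11 counterparts of atom_inc, atom_dec, atom_xchg,
   atom_add and atom_sub. atomic_int replaces "volatile int". */
static inline void atom_inc_c11(atomic_int *num)
{
    atomic_fetch_add(num, 1);
}

static inline void atom_dec_c11(atomic_int *num)
{
    atomic_fetch_sub(num, 1);
}

/* Stores inval in *m and returns the value that was there before. */
static inline int atom_xchg_c11(atomic_int *m, int inval)
{
    return atomic_exchange(m, inval);
}

static inline void atom_add_c11(atomic_int *m, int inval)
{
    atomic_fetch_add(m, inval);
}

static inline void atom_sub_c11(atomic_int *m, int inval)
{
    atomic_fetch_sub(m, inval);
}

/* Single-threaded walk-through of the wrappers; returns the final value. */
int atom_demo(void)
{
    atomic_int v;
    atomic_init(&v, 5);

    atom_inc_c11(&v);      /* 5  -> 6  */
    atom_add_c11(&v, 10);  /* 6  -> 16 */
    atom_sub_c11(&v, 4);   /* 16 -> 12 */
    atom_dec_c11(&v);      /* 12 -> 11 */

    int old = atom_xchg_c11(&v, 42);  /* v becomes 42, old holds 11 */
    assert(old == 11);

    return atomic_load(&v);
}
```

Because the variable is declared atomic_int, every access to it goes through an atomic operation, which makes it harder to mix in an unprotected access by accident.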
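The lock prefix is not applied universally. It protects only the instruction it is attached to, so it works only if every read-modify-write of the location also uses lock. Even though you use "lock" in one section of code, another section that updates the value without it (a plain ++i, for instance) is not locked out and can still lose updates. Think of it as a mutex: it only helps if everyone who touches the data takes it.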
Basic usage:
class poll
{
    int m_pollCount;
    ....
    ....
    void pollAdd()
    {
        atom_inc(&m_pollCount);
    }
};
In the above example, each call to pollAdd() atomically increments the poll object's count by one.
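To see such a counter under contention, here is a minimal sketch in C, assuming POSIX threads and C11 <stdatomic.h> rather than the inline assembly above. The struct poll, poll_add, worker and run_poll_demo names and the MAX_WORKERS limit are hypothetical, introduced only for this illustration. Every thread updates the count through the same atomic operation, so no increment should be lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_WORKERS 16  /* arbitrary limit for this sketch */

/* Hypothetical C counterpart of the poll class. */
struct poll {
    atomic_int poll_count;
};

static void poll_add(struct poll *p)
{
    atomic_fetch_add(&p->poll_count, 1);
}

struct worker_arg {
    struct poll *p;
    int iterations;
};

/* Thread body: bump the shared counter "iterations" times. */
static void *worker(void *raw)
{
    struct worker_arg *arg = raw;
    for (int i = 0; i < arg->iterations; ++i)
        poll_add(arg->p);
    return NULL;
}

/* Starts nthreads workers on one shared poll object and returns the
   final count, or -1 if a thread could not be created. */
int run_poll_demo(int nthreads, int iterations)
{
    pthread_t threads[MAX_WORKERS];
    struct poll p;
    struct worker_arg arg = { &p, iterations };
    int started = 0;

    assert(nthreads >= 1 && nthreads <= MAX_WORKERS);
    atomic_init(&p.poll_count, 0);

    for (; started < nthreads; ++started) {
        if (pthread_create(&threads[started], NULL, worker, &arg) != 0)
            break;
    }
    for (int i = 0; i < started; ++i)
        pthread_join(threads[i], NULL);

    return started == nthreads ? atomic_load(&p.poll_count) : -1;
}
```

Calling run_poll_demo(4, 100000) should return 400000. If poll_count were a plain int incremented with ++, the total could come out lower, because concurrent increments can overwrite each other. Depending on the toolchain, the program may need to be built with -pthread.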
SRC=http://www.mohawksoft.org/?q=node/78