This is an old article written in 2013 and hosted on gegahost.net: http://raison.gegahost.net/?p=97

March 11, 2013

General mistakes in parallel computing

Filed under: concurrency,software — Tags: atomic, cocurrency, data race, dead lock, parellel — Raison @ 2:51 am

(Original Work by Peixu Zhu)

In a parallel computing environment, some common mistakes occur frequently and are difficult to troubleshoot, because instructions from different threads can interleave in a nondeterministic order. Most of them are atomicity violations, order violations, and deadlocks. Studies show that well-known software such as MySQL, Apache, Mozilla, and OpenOffice has contained such mistakes.

1. Atomicity violation

In sequential programming we seldom care about atomicity; in parallel programming, however, we must keep atomic operations in mind from the start. For example:

[Thread 1]


if (_ptr)         // A
    *_ptr = 0;    // B

[Thread 2]

_ptr = NULL;        // C

In the code above, each thread executes a single statement, so it seems
that either the statement in thread 1 or the statement in thread 2 runs
first and that the two cannot interleave. In fact, the statement in
thread 1 is not atomic: it consists of at least two steps, A and B. If
the steps execute in the order A-B-C, everything is fine. However, they
may also be scheduled as A-C-B, in which case step B dereferences a NULL
pointer and causes an unexpected memory access error.

We assume that the statement region in thread 1 is atomic, but that is
not true. This is the root of the atomicity violation. In many cases the
problem is introduced by code modification. In the example above, the
statement in thread 1 may originally have been a simple assignment:
_ptr = &_val;
Later the code was modified, and the implicit atomicity was broken.

On systems with multiple cores the problem is more complicated,
since each core may cache a block of memory separately.
For example, core 1 runs thread 1 and core 2 runs thread 2:
[Thread 1]
_ptr = &_val;

[Thread 2]
_ptr = NULL;

Are they atomic? In fact, no guarantee exists. The optimizer may keep
`_ptr` in a register on one core, or the value may be cached separately
on different cores. Thus we cannot determine the value of `_ptr`.

To avoid atomicity violations, we must make the code region atomic, by
locking or by atomic operations. Using explicit atomic operations on a
shared variable is a good habit, because the statement itself reminds us
that atomicity is required when we later modify the code.
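
The two remedies named above can be sketched in C++ as follows. This is a minimal illustration, not the author's code: `g_val`, `g_ptr`, `g_ptr_mutex`, `g_atomic_ptr`, and the function names are hypothetical stand-ins for `_val` and `_ptr`. A mutex makes the check-then-write region (steps A and B) indivisible with respect to step C, and `std::atomic` covers the case of a lone store.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the post's _val and _ptr.
static int g_val = 42;
static int* g_ptr = &g_val;        // guarded by g_ptr_mutex
static std::mutex g_ptr_mutex;

// Thread 1: steps A and B run under one lock, so C cannot run between them.
void clear_target() {
    std::lock_guard<std::mutex> guard(g_ptr_mutex);
    if (g_ptr)          // A
        *g_ptr = 0;     // B
}

// Thread 2: step C takes the same lock.
void reset_pointer() {
    std::lock_guard<std::mutex> guard(g_ptr_mutex);
    g_ptr = nullptr;    // C
}

// A lone store needs no lock if the variable itself is atomic.
static std::atomic<int*> g_atomic_ptr{nullptr};

// Runs both racing pairs once; returns true when the final state is one
// that some serial execution could have produced.
bool run_atomicity_demo() {
    g_val = 42;
    g_ptr = &g_val;
    std::thread t1(clear_target);
    std::thread t2(reset_pointer);
    t1.join();
    t2.join();
    assert(g_ptr == nullptr);                 // C always runs
    bool locked_ok = (g_val == 0 || g_val == 42);

    std::thread t3([] { g_atomic_ptr.store(&g_val); });
    std::thread t4([] { g_atomic_ptr.store(nullptr); });
    t3.join();
    t4.join();
    int* p = g_atomic_ptr.load();
    bool atomic_ok = (p == &g_val || p == nullptr);
    return locked_ok && atomic_ok;
}
```

Calling `run_atomicity_demo()` from `main` exercises both variants. Which thread wins each race is still unspecified; the point is only that every outcome matches some serial order and no NULL dereference can occur.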

2. Order violation

Consider the example below:
[Thread 1]
_ptr = allocate_memory(); // A

[Thread 2]
_ptr[1] = "right"; // B

If the code is not synchronized, both execution orders, A-B and B-A,
are possible, and B-A writes through a pointer that has not been
assigned yet. In such cases, we must synchronize the code blocks to
enforce the order of execution.
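
One possible way to enforce the A-before-B order is a condition variable, sketched below in C++. The names `g_buf`, `g_ready`, `producer`, `consumer`, and `run_order_demo` are assumptions made for this illustration, and a `std::vector<std::string>` stands in for the memory returned by `allocate_memory()`.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the post's _ptr: a buffer that thread 1 allocates.
static std::vector<std::string> g_buf;
static bool g_ready = false;               // true once step A has finished
static std::mutex g_m;
static std::condition_variable g_cv;

// Thread 1: perform step A, then announce it.
void producer() {
    {
        std::lock_guard<std::mutex> lk(g_m);
        g_buf.resize(2);                   // A: "allocate" two slots
        g_ready = true;
    }
    g_cv.notify_one();
}

// Thread 2: block until step A is done, then perform step B.
void consumer() {
    std::unique_lock<std::mutex> lk(g_m);
    g_cv.wait(lk, [] { return g_ready; }); // predicate guards against spurious wakeups
    g_buf[1] = "right";                    // B
}

// Starts the consumer first on purpose; the wait still forces the order A-B.
std::string run_order_demo() {
    g_buf.clear();
    g_ready = false;
    std::thread t2(consumer);
    std::thread t1(producer);
    t1.join();
    t2.join();
    assert(g_buf.size() == 2);
    return g_buf[1];
}
```

`run_order_demo()` returns `"right"` whichever thread the scheduler runs first, because the predicate in `wait` is checked under the same mutex that protects `g_ready`.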

3. Deadlock

Locking is fundamental in concurrent programming. If more than one
thread works with more than one shared resource, such as a memory
block, it is possible that each thread holds one resource while
waiting for a resource held by another thread.
[Thread 1]

lock_a.lock();
a = 0; // A
lock_b.lock();
b = 0; // B
lock_b.unlock();
lock_a.unlock();

[Thread 2]

lock_b.lock();
b = 1; // C
lock_a.lock();
a = 1; // D
lock_a.unlock();
lock_b.unlock();

If the code runs as A-B-C-D, there is no problem. However, if thread 1
executes A and thread 2 executes C before either acquires its second
lock, each thread holds one lock and waits for the other, so B and D are
never reached: a deadlock. Deadlock requires four conditions:
a. mutual exclusion
b. hold and wait
c. no preemption
d. circular wait

Breaking at least one of the four conditions above prevents the deadlock.
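
As a sketch of breaking the circular-wait condition, the two threads above can acquire both locks in a single step with `std::lock`, which the C++ standard library specifies as deadlock-free for the locks it is given. The function names and the `rounds` parameter are hypothetical; `lock_a`, `lock_b`, `a`, and `b` mirror the example above.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

static std::mutex lock_a, lock_b;
static int a = -1, b = -1;

// Thread 1: take both locks together instead of one after the other.
void thread1_body() {
    std::lock(lock_a, lock_b);                               // no deadlock, any argument order
    std::lock_guard<std::mutex> ga(lock_a, std::adopt_lock); // adopt ownership for RAII unlock
    std::lock_guard<std::mutex> gb(lock_b, std::adopt_lock);
    a = 0;  // A
    b = 0;  // B
}

// Thread 2: names the locks in the opposite order, which is still safe here.
void thread2_body() {
    std::lock(lock_b, lock_a);
    std::lock_guard<std::mutex> gb(lock_b, std::adopt_lock);
    std::lock_guard<std::mutex> ga(lock_a, std::adopt_lock);
    b = 1;  // C
    a = 1;  // D
}

// Repeats the race; returns true if every round finished with a == b,
// i.e. each critical section ran as a whole.
bool run_deadlock_free_demo(int rounds) {
    for (int i = 0; i < rounds; ++i) {
        std::thread t1(thread1_body);
        std::thread t2(thread2_body);
        t1.join();
        t2.join();
        assert(a == 0 || a == 1);
        if (a != b) return false;
    }
    return true;
}
```

A simpler alternative with the same effect is to have every thread acquire `lock_a` before `lock_b`; a fixed global lock order also rules out circular waiting. With C++17, `std::scoped_lock` combines the `std::lock` call and the guards into one declaration.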
