actor concurrency

The hardware we rely on is changing rapidly as ever-faster chips are replaced by ever-increasing numbers of cores. As a result, concurrency and parallelism, niche features today, will soon be a basic requirement for most software. Application developers seeking concurrency solutions have discovered that shared-state concurrency (employed by Java and other imperative languages) comes up short when compared to functional models that promise better reliability and scalability. In this two-part article Alex Miller introduces the actor concurrency model, which has been popularized by Erlang and more recently adopted by Scala. In Part 1 you'll be introduced to the actor model and its benefits; in Part 2 you'll learn how to use it on the JVM. Level: Intermediate

As programmers we know that our software lives on top of hardware, which for decades has relied on chips getting faster at an unrelenting pace. Moore's Law famously states that the number of transistors on a chip will double approximately every 18 months. This exponential law has held true for nearly four decades. and has even exceeded that pace. Intel chips had a few thousand transistors in the early 1970s. The chip in the computer you're reading this article with probably has hundreds of millions of transistors, and newer quad-core chips have several billion.

Until recently, the increase in chip counts (and reduction in size) has made chips faster, increased the number of pipelines, and dramatically increased the number and size of on-chip caches. But hardware crossed a boundary in the early 2000s: chips got big enough and cycle speed got fast enough that a signal can no longer reach the whole chip in a clock cycle -- a physical and architectural limit that will be difficult to breach. Yet the transistors per chip continue their unrelenting exponential march. Instead of increased speed per core, we are now seeing an increase in the number of cores per chip. CPU speeds are likely to stay relatively flat or possibly even decrease in the near future in the service of reduced heat and power consumption (see Figure 1). Themulticore era is well under way, and it will change the way we write programs -- a reality that many software developers have yet to face.

Figure 1. Transistor and MIPS (millions of instructions per second) trends over time. Click to enlarge.

Right now you probably use a dual-core machine for day-to-day work, or maybe a quad-core if you're lucky. You might be deploying your server application to an 8-core machine. The concurrency model that most popular languages use now -- shared-state concurrency -- can make good use of multiple cores and result in fast and efficient software. But if the number of cores doubles every 18 months, in a decade we will be dealing with thousands of cores. (This seems ridiculous, but exponential curves always seem ridiculous to our feeble wetware. Forty years ago, the growth in transistor count to today's levels seemed ridiculous and unsustainable.) The prospect of multicore systems of this magnitude threatens the viability of the shared-state concurrency model. This article introduces you to a robust alternative -- the actor concurrency model -- and explains how it is implemented in a 20+-year-old yet increasingly relevant functional language: Erlang.

Shared-state concurrency

The most popular languages today rely on mutable shared state and locks as a model of concurrency. A typical example of multithreaded code is the Counter class in Listing 1.

Listing 1. A typical synchronized Counter class in Java

public class Counter {

  private int count = 0;

  public synchronized int increment() {

    return ++count;

  }

}

Counter is a thread-safe class that maintains a counter (the shared state) and protects access to that state by allowing only one thread at a time to read or modify that state. In Listing 1 that protection is provided by the synchronized block, but it could also be provided by explicit Lock objects.

If you've written programs with shared-state concurrency, you've likely encountered the legions of dragons that can arise. If mutable state is not modified and read under the scope of a lock, you might not see the latest version of the value from another thread. Or threads can obtain multiple locks in differing orders, leading to deadlock. Or the necessity for locking can lead to lock contention.

Perhaps the most difficult aspect of concurrent programming is that it's quite easy to combine several properly written concurrent components and thereby produce a system that either is broken or performs poorly. The composite system can be broken if the independent components modify state under different independent locks but the program requires atomic operations that modify state in both components. You can address this problem in a variety of ways with careful architecture, by exposing the locking semantics of the components, or by adding new layers of locks over the components. However, most of the easiest ways to deal with these issues result in poor performance or unmaintainable code.

Whether shared-state concurrency can scale is an equally important question. Will it scale to 16 cores? 32? 1000? Some parts of your program will naturally scale up. If you've got a thread pool working on a set of parallel tasks, it's entirely feasible to scale that to a larger number of cores. But in general, much of your program does not consist of discrete tasks that can be easily made parallel.Amdahl's Law is commonly used to calculate the maximum benefit from increasing the parallel parts of your program. But the sad fact remains that as you scale, the non-parallel parts of your program are the ones that matter, and you can't parallelize them away.

Actor concurrency

So what's the alternative? The actor model of concurrency has recently gained prominence, largely thanks to some success achieved by its flagship language,Erlang.

The actor model consists of a few key principles:

No shared state
Lightweight processes
Asynchronous message-passing
Mailboxes to buffer incoming messages
Mailbox processing with pattern matching

Let's look at these principles in more detail. An actor is a process that executes a function. Here a process is a lightweight user-space thread (not to be confused with a typical heavyweight operating-system process). Actors never share state and thus never need to compete for locks for access to shared data. Instead, actors share data by sending messages that are immutable. Immutable data cannot be modified, so reads do not require a lock.

Messages are sent asynchronously and are buffered in an actor's mailbox. A mailbox is essentially a queue with multiple producers (other actors) and a single consumer. A particular actor is driven by receiving messages from the mailbox based on pattern matching.

Some key assumptions are built into this model:

Processes are cheap (in memory) and fast to create
Processes are small and thus large numbers of processes can be created
Processes are not bound to kernel threads and can be efficiently scheduled in user space
The scheduler requires the ability to pause and continue process execution
Messages can be retrieved by efficiently pattern-matching messages in the mailbox

Erlang

The actor model has most famously been associated with Erlang, a functional, dynamically typed language invented in 1986 at Ericsson. Erlang was designed for creating applications (such as telephone switches) that must run nonstop. That requires the code to be hot-swappable, distributed, and robust in the face of errors. Erlang was open-sourced in 1998 and has gained prominence in the last few years as concurrency has become a concern. A complete overview of Erlang is outside this article's scope, but I'll give you a quick introduction to the language's key aspects.

Erlang programs are compiled and run in a virtual machine, but you can also use an interactive shell to explore the basics. Erlang variables, which start with a capital letter or a _ character, can be assigned only once, unlike in most other languages. Listing 2 shows an example trace in the Erlang shell with a Value variable.

Listing 2. Erlang variables

1> Value = 4.

4

2> Value.

4

3> Value = 6.

** exception error: no match of right hand side value 6

In this trace you can see expressions being entered on the lines starting with 1>, 2>, and 3> and the response from the shell on the following lines. Line 1 assigns the value 4 to the Value variable. Line 2 returns the value of Value. Line 3 demonstrates that Value can't be reassigned to a new value because it is already bound to the value 4.

Actually, = is not an assignment operator at all in Erlang. It is really a pattern-matching operator. In the case of an unbound variable on the left-hand side, the pattern is unmatched and will bind a value from the right-hand side. But that's just the beginning of what = can do.

Words starting with lowercase letters are atoms -- fixed symbols that always represent a constant, number, or string of the same name, even across machines. Atuple is a fixed-size set of heterogenous values and is specified with { }. Here we match a variable with a tuple of atoms:

Stooges = {larry, curly, moe}.

You can also put tuples on the left side and use variables and the pattern-matching =to extract values from a tuple on the right side:

{Stooge1, Stooge2, Stooge3} = {larry, curly, moe}.

Here the variables Stooge1, Stooge2, and Stooge3 match the atoms larry, curly, and moe.

Lists contain a variable number of elements and are represented in [ ]. The | (pipe) is used to separate the head element from the rest of the list (similar to car and cdr in Lisp). In Listing 3 we build a list and then extract the list's head and tail with pattern matching.

Listing 3. Lists

1> List=[1,2,3].

[1,2,3]

2> [First | Rest]=List.

[1,2,3]

3> First.

1

4> Rest.

[2,3]

Functions are first-class values, as you would expect in a functional language. They are defined by a function name and the number of parameters, using a simple syntax. Here we match the anonymous function to the Square variable. You can then execute it or even create functions that return functions, like the TimesN function in Listing 4.

Listing 4. Functions

1> Square = fun(X) -> X*X end.

#Fun<erl_eval.6.13229925>

2> Square(5).

25.

3> TimesN = fun(N) -> (fun(X) -> N*X end)

3> end.

#Fun<erl_eval.6.13229925>

4> lists:map(TimesN(10), [1,2,3]).

[10,20,30]

At the end of Listing 4 you can see a call to the function mapin one of the most important Erlang modules: lists.

Normally you define Erlang code in source files with the .erlextension. Each source file can declare a module and which functions are exported or imported to that module. For example, Listing 5 is a definition of the mymath module, which has two functions -- square and fib.

Listing 5. The mymath.erl module

-module(mymath).

-export([square/1,fib/1]).

square(Value) -> Value*Value.

fib(0) -> 0;

fib(1) -> 1;

fib(N) when N>1 -> fib(N-1) + fib(N-2).

In Listing 5, fib is defined by three different patterns, and the runtime will decide which pattern to call as appropriate. The when ... is a guard clause that can constrain when a pattern is applicable.

Listing 6 shows how to compile and load this module in the shell.

Listing 6 Compiling a module and executing its functions

1> c(mymath.erl).

2> mymath:square(5).

25

3> mymath:fib(7).

13

Actors in Erlang

Now that you're at least a little bit comfortable with Erlang, we can begin looking at actors. Three basic elements in Erlang form the foundation for concurrency. First, the built-in spawn function creates a new process executing a function and returns the new process's process identifier. Second is a syntax for sending a message to a process with the ! operator. And finally, actors can use a receive...end function to pattern-match messages from the actor's mailbox.

To see how these pieces fit together, we'll create a process that does conversions between Celsius and Fahrenheit temperatures. First we must define the function that the process will run, as shown in Listing 7.

Listing 7. temperature.erl

temperatureConverter() ->

 receive

 {toF, C} ->

 io:format("~p C is ~p F~n", [C, 32+C*9/5]),

 temperatureConverter();

 {toC, F} ->

 io:format("~p F is ~p C~n", [F, (F-32)*5/9]),

 temperatureConverter();

 {stop} ->

 io:format("Stopping~n");

 Other ->

 io:format("Unknown: ~p~n", [Other]),

 temperatureConverter()

 end.

This function receives messages in the form of three possible tuple formats. The first format is {toF, C}. In this tuple, the first element is the toF atom, and the second is the temperature in Celsius. In the code the variable C is matched to the value. On match, it prints the answer and calls back to this function again. This is a tail-recursive call. Erlang actor functions often follow this idiom of matching an incoming message, then making a tail-recursive call back to the same function. State can be maintained in the actor by passing it in a function parameter and modifying it on the recursive call.

The {stop} message is a special message to stop the process. The last pattern is just an Other variable that maps to anything, prints an error message, and restarts.

You've no doubt noticed that depending on the circumstance, clauses and functions are separated by , (comma), ;(semicolon), and . (period). In Erlang, these punctuation marks serve much the same purpose as their use in English. The comma separates phrases (often subsequent expressions to evaluate). The semicolon indicates a related but separate phrase. The period terminates a complete statement.

We can then spawn a process to run this function and send messages to it, as shown in Listing 8.

Listing 8. Running the temperature converter process

1> c(temperature).

{ok,temperature}

2> Pid = spawn(fun temperature:temperatureConverter/0).

<0.128.0>

3> Pid ! {toC, 32}.

32F is 0.0 C

{convertToC,32}

4> Pid ! {toF, 100}.

100C is 212.0 F

{convertToF,100}

5> Pid ! {stop}.

Stopping

{stop}

In this trace in the shell, we load and compile the temperature module, then spawn a process running the temperatureConverter function. In return we get the process ID of the converter process. We can then send it messages using the ! operator.

RPC in Erlang

In the example in Listing 7 we simply printed the answer in the process, but it's probably far more useful to get a response to our question. A remote procedure call (RPC) is easy to build by simply including the process identifier of the source as part of the message. We must rework our converter function to expect a source identifier and to respond with a message instead of printing, as shown in Listing 9.

Listing 9. Reworking the converter for RPC

temperatureConverter() ->

 receive

 {From, {toF, C}} ->

 From ! {self(), 32+C*9/5},

 temperatureConverter();

 ...etc

We can then add some functions to hide the spawn and the remote call behind a simpler interface, as shown in Listing 10.

Listing 10. Wrapping the RPC

start() ->

 spawn(fun() -> temperatureConverter() end).

convert(Pid, Request) ->

 Pid ! {self(), Request},

 receive

 {Pid, Response} -> Response

 end.

The start() function spawns the converter process and returns its process identifier. The convert function uses the process identifier to call the converter process, sending the current identifier as the source, then blocks in a receivewaiting for a response on the current's process mailbox, which is subsequently returned.

These are just the most basic ways to use actors in Erlang. Erlang also makes it easy to handle errors: instead of having a process manage its own errors, it's more common to create a new process (they're cheap, right?) to watch the original process and deal with an error. Many patterns, such as propagating the error or restarting the process, are possible. In fact, Erlang provides a whole library that deals with patterns of error handling, hot-swapping code, and other useful functionality.

Does it work?

Erlang wouldn't be receiving attention today without real-life success stories. At Ericsson, the AXD 301 switch is the flagship example of Erlang's scalability and reliability. This switch handles millions of calls per week and has a reported 99.9999999 percent uptime. Once Erlang was open-sourced, it was picked up by Ericsson's competitor Nortel and integrated into the Nortel Alteon SSL Accelerator, which does hundreds of transactions per second.

Because of Erlang's focus on messaging, it seems to be a good fit for these kinds of systems. The ejabberd daemon is a highly-regarded Jabber/XMPP implementation that has been used in some large implementations such as jabber.org. Similarly,RabbitMQ is a high-performance AMQP server written in Erlang that can reportedly handle 400,000 messages per second.

actor concurrency的更多相关文章

Erlang/Elixir精选-第4期(20191223)
精选文章 A digital symphony - The architecture of API Fortress. 使用Actor模型来支持基于微服务的大规模分布式软件架构.用实例解释了Actor ...
Concurrency != Parallelism
前段时间在公司给大家分享GO语言的一些特性,然后讲到了并发概念,大家表示很迷茫,然后分享过程中我拿来了Rob Pike大神的Slides <Concurrency is not Parallel ...
Concurrency vs. Parallelism
http://getakka.net/docs/concepts/terminology Terminology and Concepts In this chapter we attempt to ...
C++多线程开发之actor model
最近想把写过的一个多线程程序整理一下,这个程序主要特点是有一系列的互相之间有依赖关系的task.于是在网上找相关类库 1,一类是简单的线程池了,这也是原本俺的做法.之前使用的是手工调度,代码实现的很蛋 ...
Akka的Actor模型及使用实例
本文的绝大部分内容转载自rerun.me这一blog,老外写的东西就是好啊. ACTORS介绍 Anyone who has done multithreading in the past won't ...
actor
_timcharper1月 20 02:18 @timcharper , so what app the actor model can help, and what app the thread m ...
Akka（1）：Actor - 靠消息驱动的运算器
Akka是由各种角色和功能的Actor组成的,工作的主要原理是把一项大的计算任务分割成小环节,再按各环节的要求构建相应功能的Actor,然后把各环节的运算托付给相应的Actor去独立完成.Akka是个 ...
并发之痛 Thread，Goroutine，Actor
转自:http://jolestar.com/parallel-programming-model-thread-goroutine-actor/ 先梳理下两个概念,几乎所有讲并发的文章都要先讲这两个 ...
Actor模型---SwiftActors
actor是一个无线程区别的内存访问对象:actor背后有线程支持:actor的事件处理依赖与这个线程(队列.池). actor是一种面向对象的线程(池)模型,强调对事件的响应:在iOS中相当于一种通 ...

随机推荐

用C++实现的解数独（Sudoku）程序
我是一个C++初学者,控制台实现了一个解数独的小程序. 代码如下: //"数独游戏"V1.0 //李国良于2016年11月11日编写完成 #include <iostream ...
Delphi容器类之---TList、TObjectList、TComponentList、TClassList
转载自:http://blog.csdn.net/iseekcode/article/details/4922001 从Delphi5开始VCL中增加了新的Contnrs单元,单元中定义了8个新的类, ...
python学习笔记(4)--函数
1.函数函数是指将一组语句的集合通过一个名字封装起来.要想执行这个函数,只需调用其函数名即可. 函数的特性: 1.减少重复代码 2.使程序变的课扩展 3.使程序变得易维护语法定义: def pri ...
Breakpad Google的crash捕获、抓取开源库
简介: Breadpad为google chrominum项目下用于处理dump的一套工具:内部采用跨平台方式实现捕获.生成.解析与平台无关的dump,便于统一处理:支持进程内与进程外捕获,当为进程外 ...
网站跨站点脚本，Sql注入等攻击的处理
从360安全论坛里找到的一段代码,经过整理封装,直接在站点Global.asax文件或写一个HttpModule来拦截恶意请求即可: http://bbs.webscan.360.cn/forum.p ...
Linux下Keepalived+LVS-DR模式配置高可用负载均衡集群
一.环境说明: 操作系统:Centos-6.5_x86_64 keepalived软件安装在node2和node3机器上. 实际安装之前,先关闭keepalived节点(node ...
Winform MDI窗体容器、权限、简单通讯
MDI窗体容器: 一般来说,窗体是顶级容器,不允许放在其他任何容器内,但是如果将某个窗体的IsMdiContainer属性设置为True,那此窗体就会成为窗体容器,可以在其中放入其他窗体在内部的窗体 ...
增删改查--windows下mysql客户端--表的使用
>>>>>>>>>>>>>>>>>>>> selet 5种子句之where常用运 ...
1.3 jQuery基础
使用jQuery需要引用jQuery文件(页面引用多个js文件时jQuery在最前面) <script src="jquery-1.11.2.min.js"></ ...
laravel 操作 redis
laravel框架中本身已经存在相应的redis的配置我们在使用的时候只需要更改配置即可,但是在使用的时候一定要注意命名空间的问题,具体可查看config/app.php下面的aliases数组中具体 ...