The hardware we rely on is changing rapidly as ever-faster chips are replaced by ever-increasing numbers of cores. As a result, concurrency and parallelism, niche features today, will soon be a basic requirement for most software. Application developers seeking concurrency solutions have discovered that shared-state concurrency (employed by Java and other imperative languages) comes up short when compared to functional models that promise better reliability and scalability. In this two-part article Alex Miller introduces the actor concurrency model, which has been popularized by Erlang and more recently adopted by Scala. In Part 1 you'll be introduced to the actor model and its benefits; in Part 2 you'll learn how to use it on the JVM. Level: Intermediate

As programmers we know that our software lives on top of hardware, which for decades has relied on chips getting faster at an unrelenting pace. Moore's Law famously states that the number of transistors on a chip will double approximately every 18 months. This exponential law has held true for nearly four decades. and has even exceeded that pace. Intel chips had a few thousand transistors in the early 1970s. The chip in the computer you're reading this article with probably has hundreds of millions of transistors, and newer quad-core chips have several billion.

Until recently, the increase in chip counts (and reduction in size) has made chips faster, increased the number of pipelines, and dramatically increased the number and size of on-chip caches. But hardware crossed a boundary in the early 2000s: chips got big enough and cycle speed got fast enough that a signal can no longer reach the whole chip in a clock cycle -- a physical and architectural limit that will be difficult to breach. Yet the transistors per chip continue their unrelenting exponential march. Instead of increased speed per core, we are now seeing an increase in the number of cores per chip. CPU speeds are likely to stay relatively flat or possibly even decrease in the near future in the service of reduced heat and power consumption (see Figure 1). Themulticore era is well under way, and it will change the way we write programs -- a reality that many software developers have yet to face.

 

Figure 1. Transistor and MIPS (millions of instructions per second) trends over time. Click to enlarge.

Right now you probably use a dual-core machine for day-to-day work, or maybe a quad-core if you're lucky. You might be deploying your server application to an 8-core machine. The concurrency model that most popular languages use now -- shared-state concurrency -- can make good use of multiple cores and result in fast and efficient software. But if the number of cores doubles every 18 months, in a decade we will be dealing with thousands of cores. (This seems ridiculous, but exponential curves always seem ridiculous to our feeble wetware. Forty years ago, the growth in transistor count to today's levels seemed ridiculous and unsustainable.) The prospect of multicore systems of this magnitude threatens the viability of the shared-state concurrency model. This article introduces you to a robust alternative -- the actor concurrency model -- and explains how it is implemented in a 20+-year-old yet increasingly relevant functional language: Erlang.

Shared-state concurrency

The most popular languages today rely on mutable shared state and locks as a model of concurrency. A typical example of multithreaded code is the Counter class in Listing 1.

Listing 1. A typical synchronized Counter class in Java

public class Counter {
private int count = 0; public synchronized int increment() {
return ++count;
}
}
 

Counter is a thread-safe class that maintains a counter (the shared state) and protects access to that state by allowing only one thread at a time to read or modify that state. In Listing 1 that protection is provided by the synchronized block, but it could also be provided by explicit Lock objects.

If you've written programs with shared-state concurrency, you've likely encountered the legions of dragons that can arise. If mutable state is not modified and read under the scope of a lock, you might not see the latest version of the value from another thread. Or threads can obtain multiple locks in differing orders, leading to deadlock. Or the necessity for locking can lead to lock contention.

Perhaps the most difficult aspect of concurrent programming is that it's quite easy to combine several properly written concurrent components and thereby produce a system that either is broken or performs poorly. The composite system can be broken if the independent components modify state under different independent locks but the program requires atomic operations that modify state in both components. You can address this problem in a variety of ways with careful architecture, by exposing the locking semantics of the components, or by adding new layers of locks over the components. However, most of the easiest ways to deal with these issues result in poor performance or unmaintainable code.

 

Whether shared-state concurrency can scale is an equally important question. Will it scale to 16 cores? 32? 1000? Some parts of your program will naturally scale up. If you've got a thread pool working on a set of parallel tasks, it's entirely feasible to scale that to a larger number of cores. But in general, much of your program does not consist of discrete tasks that can be easily made parallel.Amdahl's Law is commonly used to calculate the maximum benefit from increasing the parallel parts of your program. But the sad fact remains that as you scale, the non-parallel parts of your program are the ones that matter, and you can't parallelize them away.

Actor concurrency

So what's the alternative? The actor model of concurrency has recently gained prominence, largely thanks to some success achieved by its flagship language,Erlang.

The actor model consists of a few key principles:

  • No shared state
  • Lightweight processes
  • Asynchronous message-passing
  • Mailboxes to buffer incoming messages
  • Mailbox processing with pattern matching

Let's look at these principles in more detail. An actor is a process that executes a function. Here a process is a lightweight user-space thread (not to be confused with a typical heavyweight operating-system process). Actors never share state and thus never need to compete for locks for access to shared data. Instead, actors share data by sending messages that are immutable. Immutable data cannot be modified, so reads do not require a lock.

 

Messages are sent asynchronously and are buffered in an actor's mailbox. A mailbox is essentially a queue with multiple producers (other actors) and a single consumer. A particular actor is driven by receiving messages from the mailbox based on pattern matching.

Some key assumptions are built into this model:

  • Processes are cheap (in memory) and fast to create
  • Processes are small and thus large numbers of processes can be created
  • Processes are not bound to kernel threads and can be efficiently scheduled in user space
  • The scheduler requires the ability to pause and continue process execution
  • Messages can be retrieved by efficiently pattern-matching messages in the mailbox

Erlang

The actor model has most famously been associated with Erlang, a functional, dynamically typed language invented in 1986 at Ericsson. Erlang was designed for creating applications (such as telephone switches) that must run nonstop. That requires the code to be hot-swappable, distributed, and robust in the face of errors. Erlang was open-sourced in 1998 and has gained prominence in the last few years as concurrency has become a concern. A complete overview of Erlang is outside this article's scope, but I'll give you a quick introduction to the language's key aspects.

Erlang programs are compiled and run in a virtual machine, but you can also use an interactive shell to explore the basics. Erlang variables, which start with a capital letter or a _ character, can be assigned only once, unlike in most other languages. Listing 2 shows an example trace in the Erlang shell with a Value variable.

Listing 2. Erlang variables

1> Value = 4.
4
2> Value.
4
3> Value = 6.
** exception error: no match of right hand side value 6

In this trace you can see expressions being entered on the lines starting with 1>2>, and 3> and the response from the shell on the following lines. Line 1 assigns the value 4 to the Value variable. Line 2 returns the value of Value. Line 3 demonstrates that Value can't be reassigned to a new value because it is already bound to the value 4.

Actually, = is not an assignment operator at all in Erlang. It is really a pattern-matching operator. In the case of an unbound variable on the left-hand side, the pattern is unmatched and will bind a value from the right-hand side. But that's just the beginning of what = can do.

Words starting with lowercase letters are atoms -- fixed symbols that always represent a constant, number, or string of the same name, even across machines. Atuple is a fixed-size set of heterogenous values and is specified with { }. Here we match a variable with a tuple of atoms:

Stooges = {larry, curly, moe}.

You can also put tuples on the left side and use variables and the pattern-matching =to extract values from a tuple on the right side:

{Stooge1, Stooge2, Stooge3} = {larry, curly, moe}.
 

Here the variables Stooge1Stooge2, and Stooge3 match the atoms larrycurly, and moe.

Lists contain a variable number of elements and are represented in [ ]. The | (pipe) is used to separate the head element from the rest of the list (similar to car and cdr in Lisp). In Listing 3 we build a list and then extract the list's head and tail with pattern matching.

Listing 3. Lists

1> List=[1,2,3].
[1,2,3]
2> [First | Rest]=List.
[1,2,3]
3> First.
1
4> Rest.
[2,3]

Functions are first-class values, as you would expect in a functional language. They are defined by a function name and the number of parameters, using a simple syntax. Here we match the anonymous function to the Square variable. You can then execute it or even create functions that return functions, like the TimesN function in Listing 4.

Listing 4. Functions

1> Square = fun(X) -> X*X end.
#Fun<erl_eval.6.13229925>
2> Square(5).
25.
3> TimesN = fun(N) -> (fun(X) -> N*X end)
3> end.
#Fun<erl_eval.6.13229925>
4> lists:map(TimesN(10), [1,2,3]).
[10,20,30]
 

At the end of Listing 4 you can see a call to the function mapin one of the most important Erlang modules: lists.

Normally you define Erlang code in source files with the .erlextension. Each source file can declare a module and which functions are exported or imported to that module. For example, Listing 5 is a definition of the mymath module, which has two functions -- square and fib.

Listing 5. The mymath.erl module

-module(mymath).
-export([square/1,fib/1]). square(Value) -> Value*Value. fib(0) -> 0;
fib(1) -> 1;
fib(N) when N>1 -> fib(N-1) + fib(N-2).

In Listing 5, fib is defined by three different patterns, and the runtime will decide which pattern to call as appropriate. The when ... is a guard clause that can constrain when a pattern is applicable.

Listing 6 shows how to compile and load this module in the shell.

Listing 6 Compiling a module and executing its functions

1> c(mymath.erl).
2> mymath:square(5).
25
3> mymath:fib(7).
13

Actors in Erlang

Now that you're at least a little bit comfortable with Erlang, we can begin looking at actors. Three basic elements in Erlang form the foundation for concurrency. First, the built-in spawn function creates a new process executing a function and returns the new process's process identifier. Second is a syntax for sending a message to a process with the ! operator. And finally, actors can use a receive...end function to pattern-match messages from the actor's mailbox.

To see how these pieces fit together, we'll create a process that does conversions between Celsius and Fahrenheit temperatures. First we must define the function that the process will run, as shown in Listing 7.

Listing 7. temperature.erl

temperatureConverter() ->
receive
{toF, C} ->
io:format("~p C is ~p F~n", [C, 32+C*9/5]),
temperatureConverter();
{toC, F} ->
io:format("~p F is ~p C~n", [F, (F-32)*5/9]),
temperatureConverter();
{stop} ->
io:format("Stopping~n");
Other ->
io:format("Unknown: ~p~n", [Other]),
temperatureConverter()
end.

This function receives messages in the form of three possible tuple formats. The first format is {toF, C}. In this tuple, the first element is the toF atom, and the second is the temperature in Celsius. In the code the variable C is matched to the value. On match, it prints the answer and calls back to this function again. This is a tail-recursive call. Erlang actor functions often follow this idiom of matching an incoming message, then making a tail-recursive call back to the same function. State can be maintained in the actor by passing it in a function parameter and modifying it on the recursive call.

The {stop} message is a special message to stop the process. The last pattern is just an Other variable that maps to anything, prints an error message, and restarts.

You've no doubt noticed that depending on the circumstance, clauses and functions are separated by , (comma), ;(semicolon), and . (period). In Erlang, these punctuation marks serve much the same purpose as their use in English. The comma separates phrases (often subsequent expressions to evaluate). The semicolon indicates a related but separate phrase. The period terminates a complete statement.

We can then spawn a process to run this function and send messages to it, as shown in Listing 8.

Listing 8. Running the temperature converter process

1> c(temperature).
{ok,temperature}
2> Pid = spawn(fun temperature:temperatureConverter/0).
<0.128.0>
3> Pid ! {toC, 32}.
32F is 0.0 C
{convertToC,32}
4> Pid ! {toF, 100}.
100C is 212.0 F
{convertToF,100}
5> Pid ! {stop}.
Stopping
{stop}

In this trace in the shell, we load and compile the temperature module, then spawn a process running the temperatureConverter function. In return we get the process ID of the converter process. We can then send it messages using the ! operator.

RPC in Erlang

In the example in Listing 7 we simply printed the answer in the process, but it's probably far more useful to get a response to our question. A remote procedure call (RPC) is easy to build by simply including the process identifier of the source as part of the message. We must rework our converter function to expect a source identifier and to respond with a message instead of printing, as shown in Listing 9.

Listing 9. Reworking the converter for RPC

temperatureConverter() ->
receive
{From, {toF, C}} ->
From ! {self(), 32+C*9/5},
temperatureConverter();
...etc

We can then add some functions to hide the spawn and the remote call behind a simpler interface, as shown in Listing 10.

Listing 10. Wrapping the RPC

start() ->
spawn(fun() -> temperatureConverter() end). convert(Pid, Request) ->
Pid ! {self(), Request},
receive
{Pid, Response} -> Response
end.
 

The start() function spawns the converter process and returns its process identifier. The convert function uses the process identifier to call the converter process, sending the current identifier as the source, then blocks in a receivewaiting for a response on the current's process mailbox, which is subsequently returned.

These are just the most basic ways to use actors in Erlang. Erlang also makes it easy to handle errors: instead of having a process manage its own errors, it's more common to create a new process (they're cheap, right?) to watch the original process and deal with an error. Many patterns, such as propagating the error or restarting the process, are possible. In fact, Erlang provides a whole library that deals with patterns of error handling, hot-swapping code, and other useful functionality.

Does it work?

Erlang wouldn't be receiving attention today without real-life success stories. At Ericsson, the AXD 301 switch is the flagship example of Erlang's scalability and reliability. This switch handles millions of calls per week and has a reported 99.9999999 percent uptime. Once Erlang was open-sourced, it was picked up by Ericsson's competitor Nortel and integrated into the Nortel Alteon SSL Accelerator, which does hundreds of transactions per second.

Because of Erlang's focus on messaging, it seems to be a good fit for these kinds of systems. The ejabberd daemon is a highly-regarded Jabber/XMPP implementation that has been used in some large implementations such as jabber.org. Similarly,RabbitMQ is a high-performance AMQP server written in Erlang that can reportedly handle 400,000 messages per second.

SIMILAR ARTICLES

Several so-called Web 2.0 companies have also had success building parts of their systems in Erlang. Facebook Chatuses a combination of C++ and an Erlang Web server to handle 70,000 concurrent users. Delicious 2.0 relaunched in 2008 and is primarily based on C++ but uses Erlang in several subsystems such as data migrations, rolling migrations, and algorithmic analysis. CouchDB, which has gotten a lot of press lately as a document-centric Web database, is written entirely in Erlang. Reportedly the Amazon SimpleDB service also uses Erlang internally.

In short, Erlang can boast a number of high-profile successes, and these users consistently tout Erlang's strengths for writing distributed, concurrent, fault-tolerant software. It's clear that the actor model is a valid alternative for addressing concurrency without shared state and locks.

You may be thinking, "This sounds cool, but there's no way my manager will let me start deploying software in Erlang." Don't you wish you could use the actor model on the JVM? Well, stay tuned for Part 2.

Alex Miller is a tech lead at Terracotta Inc, the makers of the open-source Java clustering product Terracotta. Previously, Alex worked at BEA Systems on the AquaLogic product line and was Chief Architect at MetaMatrix. Alex blogs at Pure Danger Tech and speaks at user groups, conferences, and the No Fluff Just Stuff tour. He has an enduring love for his wife and three children, music, and nachos.

actor concurrency的更多相关文章

  1. Erlang/Elixir精选-第4期(20191223)

    精选文章 A digital symphony - The architecture of API Fortress. 使用Actor模型来支持基于微服务的大规模分布式软件架构.用实例解释了Actor ...

  2. Concurrency != Parallelism

    前段时间在公司给大家分享GO语言的一些特性,然后讲到了并发概念,大家表示很迷茫,然后分享过程中我拿来了Rob Pike大神的Slides <Concurrency is not Parallel ...

  3. Concurrency vs. Parallelism

    http://getakka.net/docs/concepts/terminology Terminology and Concepts In this chapter we attempt to ...

  4. C++多线程开发之actor model

    最近想把写过的一个多线程程序整理一下,这个程序主要特点是有一系列的互相之间有依赖关系的task.于是在网上找相关类库 1,一类是简单的线程池了,这也是原本俺的做法.之前使用的是手工调度,代码实现的很蛋 ...

  5. Akka的Actor模型及使用实例

    本文的绝大部分内容转载自rerun.me这一blog,老外写的东西就是好啊. ACTORS介绍 Anyone who has done multithreading in the past won't ...

  6. actor

    _timcharper1月 20 02:18 @timcharper , so what app the actor model can help, and what app the thread m ...

  7. Akka(1):Actor - 靠消息驱动的运算器

    Akka是由各种角色和功能的Actor组成的,工作的主要原理是把一项大的计算任务分割成小环节,再按各环节的要求构建相应功能的Actor,然后把各环节的运算托付给相应的Actor去独立完成.Akka是个 ...

  8. 并发之痛 Thread,Goroutine,Actor

    转自:http://jolestar.com/parallel-programming-model-thread-goroutine-actor/ 先梳理下两个概念,几乎所有讲并发的文章都要先讲这两个 ...

  9. Actor模型---SwiftActors

    actor是一个无线程区别的内存访问对象:actor背后有线程支持:actor的事件处理依赖与这个线程(队列.池). actor是一种面向对象的线程(池)模型,强调对事件的响应:在iOS中相当于一种通 ...

随机推荐

  1. python之路:Day02 --- Python基础2

    本节内容 1.列表操作 2.元组操作 3.字符串操作 4.字典操作 5.集合操作 6.文件操作 7.字符编码与转换 一.列表操作 定义列表 names = ['Ming',"Hua" ...

  2. Java 8中一些常用的全新的函数式接口

    这一篇属于菜鸟级博客,只是介绍了一些在Java 8中新出现的一些很有用的接口,通过一些简单的例子加以说明,没有深入地阐述. 函数式接口 什么是函数式接口? 函数式接口,@FunctionalInter ...

  3. *****linux下redis安装

    我用的系统是:redhat [root@infa ~]# wget http://download.redis.io/releases/redis-2.8.12.tar.gz tar xzf redi ...

  4. Python 修改电脑DNS

    Pc电脑一般连网都是动态DHCP获取局域的IP地址,而且DNS服务器为自动获取地址.要修改DNS就要打开本地网络-本地连接- 属性- TCP/IP 手动修改DNS. 其实Python也可以通过WMI接 ...

  5. gcc -Wall -pedantic -ansi(转载)

    转载自R-G-Y-CQ的新浪博客 -Wall显示所有的警告信息 -Wall选项可以打开所有类型的语法警告,以便于确定程序源代码是否是正确的,并且尽可能实现可移植性. 对Linux开发人员来讲,GCC给 ...

  6. ubuntu14 备份

    备份命令 # tar cvpjf backup.tar.bz2 –exclude=/proc –exclude=/lost+found –exclude=/backup.tar.bz2 –exclud ...

  7. spring 定时任务配置

    1.(易)如何在spring中配置定时任务? spring的定时任务配置分为三个步骤: 1.定义任务 2.任务执行策略配置 3.启动任务 (程序中一般我们都是到过写的,直观些) 1.定义任务 < ...

  8. linux man的使用

    在Linux中无论是管理系统还是在Linux环境下编程,内嵌的手册man都是一个很好用的工具,“Linux下不懂得就找man”(man是manual的意思).本文将介绍我所知道的所有关于man的知识( ...

  9. 单独编译使用WebRTC的音频处理模块

    块,每块个点,(12*64=768采样)即AEC-PC仅能处理48ms的单声道16kHz延迟的数据,而 - 加载编译好的NS模块动态库 接下来只需要按照 此文 的描述在 android 的JAVA代码 ...

  10. Oracle VM VirtualBox 安装CentOS 配置图形界面记录

    su yum groupinstall "X Window System" -y yum groupinstall "Desktop" -y 最后运行#star ...