动机

Motivation

The way we consume services from the internet today includes many instances of streaming data, both down- loading from a service as well as uploading to it or peer-to-peer data transfers. Regarding data as a stream of elements instead of in its entirety is very useful because it matches the way computers send and receive them (for example via TCP), but it is often also a necessity because data sets frequently become too large to be handled as a whole. We spread computations or analyses over large clusters and call it “big data”, where the whole principle of processing them is by feeding those data sequentially—as a stream—through some CPUs.

现如今我们从因特网上获取服务的方式包括了很多流式的数据,比如下载、上传或是p2p(peer to peer)的数据传输。把数据视为元素(译注:即整体的组成部分)的流而不是整体可以更有用,因为它符合计算机实际上处理它的方式(例如,通过TCP),但是经常这也是必需的,因为数据集经常变得太大而不能当作整体处理。我们把计算和分析分布到一个大集群中,称之为“大数据”,它的处理原则就是把数据顺序地(作为流)提供给一些CPU

Actors can be seen as dealing with streams as well: they send and receive series of messages in order to transfer knowledge (or data) from one place to another. We have found it tedious and error-prone to implement all the proper measures in order to achieve stable streaming between actors, since in addition to sending and receiving we also need to take care to not overflow any buffers or mailboxes in the process. Another pitfall is that Actor messages can be lost and must be retransmitted in that case lest the stream have holes on the receiving side. When dealing with streams of elements of a fixed given type, Actors also do not currently offer good static guarantees that no wiring errors are made: type-safety could be improved in this case.

也可以认为actor处理的也是流:它们接收消息、发送消息,来把知识(或者数据)从一个地方传送到另一个地方。我们发现想要使用恰当的实现手段来在actor之间构造一个稳定的流非常繁杂、容易出错,因为在接收和发送之外,我们还得确保不会使得缓冲区和mailbox(actor的mailbox)溢出。另一个陷阱是,Actor的消息可能会丢失,因此需要进行重传,以免在流的接收端出现漏洞。有的流的元素是一个给定的类型,在处理这种情况是,actor并不能提供很好的静态保证(译注:指编译器的类型检查)来确保不发生交织时的错误(译注:应该是指消息流的编织),在这种情况下类型安全可以改进。

(译注:这一段说明了设计Akka Stream的动机:

  1. 确保stream经过的各处的缓冲不会溢出(可以认为mailbox是actor的消息缓冲)

  2. 保证消息传递的可靠性,提供高于at-least-once的消息传递语义。

  3. 在处理元素类型给定的流时,提供类型安全。

  这些问题在构造一个actor系统时,是非常核心的问题。特别是前两个,自己构造actor系统时的确得采用很繁琐的手段才能实现。actor系统存在的问题,可以参考下这篇文章

  Why I Don't Like Akka Actors

 )

For these reasons we decided to bundle up a solution to these problems as an Akka Streams API. The purpose is to offer an intuitive and safe way to formulate stream processing setups such that we can then execute them efficiently and with bounded resource usage—no more OutOfMemoryErrors. In order to achieve this our streams need to be able to limit the buffering that they employ, they need to be able to slow down producers if the consumers cannot keep up. This feature is called back-pressure and is at the core of the Reactive Streams initiative of which Akka is a founding member. For you this means that the hard problem of propagating and reacting to back-pressure has been incorporated in the design of Akka Streams already, so you have one less thing to worry about; it also means that Akka Streams interoperate seamlessly with all other Reactive Streams implementations (where Reactive Streams interfaces define the interoperability SPI while implementations like Akka Streams offer a nice user API).

由于这些原因,我们想要把解决方案打包进Akka Stream API里。目的是提供一个直观和安全的方式来设计流处理过程,使得我们可以高效地执行它们,而且使用有界的资源消耗——不再有OutOfMemoryErrors。为了实现这点,我们的流需要能够限制它们采用的缓冲大小,在消费者跟不上生产者时,我们需要能使用生产者慢下来。这个特性称为后向压力(back-pressure),它是Reactive Streams的核心提议,而Akka是Reactive Streams的创始成员。对你而言这意味着传递和应对back-pressure的问题已经被纳入了Akka Stream的设计,所以你的担心可以少一个了;这也意味着Akka Streams可以无缝地与其它Reactive Streams的实现(Reactive Streams接口定义了互操作的Service Provider Interface,而像Akka这样的实现提供了一个很好地用户API)互操作。

Akka Stream文档翻译:Motivation的更多相关文章

  1. Akka Stream文档翻译:Quick Start Guide: Reactive Tweets

    Quick Start Guide: Reactive Tweets 快速入门指南: Reactive Tweets (reactive tweets 大概可以理解为“响应式推文”,在此可以测试下GF ...

  2. 报错:Flink Could not resolve substitution to a value: ${akka.stream.materializer}

    报错现象: Exception in thread "main" com.typesafe.config.ConfigException$UnresolvedSubstitutio ...

  3. Akka Stream之Graph

    最近在项目中需要实现图的一些操作,因此,初步考虑使用Akka Stream的Graph实现.从而学习了下: 一.介绍 我们知道在Akka Stream中有三种简单的线性数据流操作:Source/Flo ...

  4. Lagom学习 六 Akka Stream

    lagom中的stream 流数据处理是基于akka stream的,异步的处理流数据的.如下看代码: 流式service好处是: A: 并行:  hellos.mapAsync(8, name -& ...

  5. Akka官方文档翻译:Cluster Specification

    参加了CSDN的一个翻译项目,翻译Akka的文档.CSDN提供的翻译系统不好使,故先排版一下放在博客上. 5.1 集群规范 注意:本文档介绍了集群的设计理念.它分成两部分,第一部分描述了当前已经实现的 ...

  6. Akka(17): Stream:数据流基础组件-Source,Flow,Sink简介

    在大数据程序流行的今天,许多程序都面临着共同的难题:程序输入数据趋于无限大,抵达时间又不确定.一般的解决方法是采用回调函数(callback-function)来实现的,但这样的解决方案很容易造成“回 ...

  7. Akka(18): Stream:组合数据流,组件-Graph components

    akka-stream的数据流可以由一些组件组合而成.这些组件统称数据流图Graph,它描述了数据流向和处理环节.Source,Flow,Sink是最基础的Graph.用基础Graph又可以组合更复杂 ...

  8. Akka(19): Stream:组合数据流,组合共用-Graph modular composition

    akka-stream的Graph是一种运算方案,它可能代表某种简单的线性数据流图如:Source/Flow/Sink,也可能是由更基础的流图组合而成相对复杂点的某种复合流图,而这个复合流图本身又可以 ...

  9. Akka(20): Stream:压力缓冲-Batching backpressure and buffering

    akka-stream原则上是一种推式(push-model)的数据流.push-model和pull-model的区别在于它们解决问题倾向性:push模式面向高效的数据流下游(fast-downst ...

随机推荐

  1. Android之ORMLite实现数据持久化的简单使用

    Android中内置了sqlite,但是常用的开发语言java是面向对象的,而数据库是关系型的,二者之间的转化每次都很麻烦.(作为程序员,应该学会偷懒)而Java Web开发中有很多orm框架(其实我 ...

  2. ubuntu 更新软件源

    ubuntu 更新软件源 修改文件sources.list 位于/etc/apt/sources.list,并备份原文件为sources.list.bak deb http://mirrors.163 ...

  3. ios 多线程-GCD-NSOperation

    一.线程间的通讯 1.使用NSObject类的方法performSelectorInBackground:withObject:来创建一个线程. 具体的代码:隐式创建,自动启动 [Object per ...

  4. 【转】 SqlServer性能检测和优化工具使用详细

    http://blog.csdn.net/dcx903170332/article/details/45917387

  5. dorado问题查询&快捷键重命名

    重命名还有一个快捷键  F2 有关dorado的问题可以进 www.bsdn.org提问,而且更好

  6. 结构体的malloc与数组空间

    结构体的malloc 如果结构体中有指针,对结构体的malloc 和其指针成员变量的malloc是没有关系的 结构体malloc的是存储自己地址的 忘记了面试常考试的sizeof的几个主要点 ==== ...

  7. poj 1659 Frogs' Neighborhood Havel-Hakimi定理 可简单图定理

    作者:jostree 转载请注明出处 http://www.cnblogs.com/jostree/p/4098136.html 给定一个非负整数序列$D=\{d_1,d_2,...d_n\}$,若存 ...

  8. manifest save for self

    一.使用html5的缓存机制 1.先上规则代码:m.manifest CACHE MANIFEST # 2015-04-24 14:20 #直接缓存的文件 CACHE: /templates/spec ...

  9. sublime 经验总结 主题有 less2css

    1. 安装“包控制”模块 操作步骤见该网站:https://sublime.wbond.net/installation#Simple sublime2的代码如下: import urllib2,os ...

  10. Linux编程学习笔记 -- Process

    进程是一个程序的运行.   在一个程序中执行另一个执程序的方法有两种: 1)system 在shell中执行程序 2)fork + exec 复制一个进程,在进程中用新的程序替换原有的程序   for ...