Real-Time Stream Processing Framework Storm 0.9.0 Release Announcement (Chinese Edition)

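Chinese translation of the Storm 0.9.0 release announcement (2013/12/10, by Shao Xianjun of Fujitsu; if you spot any mistakes, please let me know at shaoxianjun@hotmail.com ^_^)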

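We're pleased to announce that Storm 0.9.0 has been released and is available from the downloads page. This release is a big step forward for the fast-growing Storm project.

We've added several new features, described in detail below. The other major focus of this release was fixing a large number of stability-related bugs. Many users have already been running pre-release 0.9.x builds successfully in their own environments, but we make no guarantees about the stability of those builds. 0.9.0 is the most stable version to date, and we strongly recommend it to everyone, especially users of 0.8.x.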

Feature 1: Netty as the message transport layer

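The first major feature is the new transport layer. We've introduced Netty, written in pure Java, as a transport; this work was done by our good buddies at Yahoo! Engineering. Storm's core message transport layer is now pluggable. Previously ZeroMQ was the only option; Storm now ships with two transport implementations, the original ZeroMQ one and the new Netty one.

In earlier versions, Storm could only rely on ZeroMQ for message transport, and frankly it was not a great fit; I have no idea what got into Nathan when he picked ZeroMQ. Why was ZeroMQ a poor fit? Because: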

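1. ZeroMQ is a native library, so it depends heavily on the operating system environment;

2. It is fairly troublesome to install;

3. ZeroMQ's stability varies enormously between versions, and at present only version 2.1.7 works well with Storm.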

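Our reasons for introducing Netty are:

1. Platform independence: Netty is a pure-Java networking library, which gives Storm better cross-platform support, and a JVM-based implementation gives us better control over messages;

2. Performance: the Netty transport is roughly twice as fast as ZeroMQ. A blog post (not yet translated) compares the performance of ZeroMQ and Netty in detail.

3. Security: it makes possible the authentication and authorization between worker processes that we plan to add in the future.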

To use Netty as the transport layer in Storm, simply add the following to storm.yaml and tune the values to suit your environment:

    storm.messaging.transport: "backtype.storm.messaging.netty.Context"
    storm.messaging.netty.server_worker_threads: 1
    storm.messaging.netty.client_worker_threads: 1
    storm.messaging.netty.buffer_size: 5242880
    storm.messaging.netty.max_retries: 100
    storm.messaging.netty.max_wait_ms: 1000
    storm.messaging.netty.min_wait_ms: 100
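
The three retry-related settings above (max_retries, min_wait_ms, max_wait_ms) bound how often, and how long, a client waits between reconnection attempts. As a rough illustration only, here is a minimal Java sketch of capped exponential backoff. The class name BackoffSketch, the method delayMs, and the doubling rule itself are assumptions made for this example; this is not Storm's actual Netty client code.

```java
final class BackoffSketch {
    /**
     * Hypothetical helper: delay before retry number `attempt` (0-based).
     * Starts at minWaitMs, doubles on each attempt, and never exceeds maxWaitMs.
     */
    static long delayMs(int attempt, long minWaitMs, long maxWaitMs) {
        if (attempt < 0 || minWaitMs <= 0 || maxWaitMs < minWaitMs) {
            throw new IllegalArgumentException("invalid backoff parameters");
        }
        long delay = minWaitMs;
        for (int i = 0; i < attempt && delay < maxWaitMs; i++) {
            // Compare against maxWaitMs / 2 first so delay * 2 can never overflow.
            delay = (delay > maxWaitMs / 2) ? maxWaitMs : delay * 2;
        }
        return delay;
    }

    public static void main(String[] args) {
        // Values taken from the sample storm.yaml: min_wait_ms 100, max_wait_ms 1000.
        for (int attempt = 0; attempt < 6; attempt++) {
            System.out.println("attempt " + attempt + ": "
                    + delayMs(attempt, 100, 1000) + " ms");
        }
    }
}
```

With the sample values, the delays run 100, 200, 400 and 800 ms, then stay at 1000 ms. The explicit overflow guard matters: a naive `base << attempt` on a Java int wraps around after a few dozen attempts, which is one way a computed sleep time can turn negative.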

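If neither ZeroMQ nor Netty suits you, you can also plug in your own transport by implementing the backtype.storm.messaging.IContext interface, though a few requirements must be met; I won't go into them here.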

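Feature 2: Log viewer UI

The new version of Storm adds a very handy feature for debugging and monitoring topologies: the logviewer daemon. In earlier versions, viewing a worker's logs meant first finding out where the worker was running (host/port), typically through Storm UI, and then connecting to that host over ssh to read the worker's log file. With the new log viewing mechanism, you can easily reach the logs of a specific worker: just click the worker's port in Storm UI in your browser.

The new logviewer daemon runs as a process separate from the supervisor. To start it in the new Storm, run the following command on the supervisor nodes of your cluster: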

    $ storm logviewer

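Feature 3: Cross-platform support

In earlier versions, running Storm on Windows meant installing ZeroMQ, modifying Storm's source code, and adding Windows-specific scripts. In the new version, because Netty can take the place of ZeroMQ and is implemented in pure Java, Storm is more portable, and running it on Windows is much easier than before.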

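Feature 4: Security

Security, authentication, and authorization have always been very important areas, and we will keep adding related features in upcoming releases. Storm 0.9.0 provides an API for pluggable tuple serialization, along with a serializer implementation that encrypts tuples using Blowfish.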
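
To give a feel for what Blowfish encryption of serialized tuple data might look like, here is a minimal sketch that uses only the JDK's javax.crypto API. The class BlowfishSketch, its methods, and the hard-coded demo key are all hypothetical, and the sketch works on plain byte arrays rather than real tuples; it is not Storm's serializer.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

final class BlowfishSketch {
    // Runs the JDK's default Blowfish transformation in the given cipher mode.
    // With the standard provider this is typically ECB with PKCS5 padding,
    // which is acceptable for a demo but a weak choice for real traffic.
    private static byte[] run(int mode, byte[] key, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("Blowfish");
            cipher.init(mode, new SecretKeySpec(key, "Blowfish"));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Blowfish operation failed", e);
        }
    }

    static byte[] encrypt(byte[] key, byte[] plain) {
        return run(Cipher.ENCRYPT_MODE, key, plain);
    }

    static byte[] decrypt(byte[] key, byte[] encrypted) {
        return run(Cipher.DECRYPT_MODE, key, encrypted);
    }

    public static void main(String[] args) {
        // Hypothetical 16-byte key; a real deployment would load it from configuration.
        byte[] key = "demo-key-16bytes".getBytes(StandardCharsets.UTF_8);
        // Stand-in for the bytes produced by serializing a tuple.
        byte[] tuple = "word=storm,count=42".getBytes(StandardCharsets.UTF_8);

        byte[] encrypted = encrypt(key, tuple);
        byte[] decrypted = decrypt(key, encrypted);
        System.out.println("round trip ok: " + Arrays.equals(tuple, decrypted));
    }
}
```

In a pluggable design, every worker would need the same key, since each tuple encrypted on the sending side has to be decrypted on the receiving side before deserialization.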

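Feature 5: API compatibility and upgrading

For most Storm developers, moving to 0.9.0 is simply a matter of updating the dependency; Storm's core API has changed very little since 0.8.2. On the operations side, when upgrading a production cluster to the latest Storm, it is best to clear existing state before the upgrade, namely the data in ZooKeeper and the contents of the directory configured as storm.local.dir.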

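Feature 6: Logging changes

Another very big change concerns logging. Storm itself makes heavy use of the slf4j API, while some of Storm's dependencies, and some Storm users, rely on the log4j API. Storm therefore now depends on log4j-over-slf4j, which bridges calls to the log4j API over to slf4j. These changes affect existing topologies and topology components that use the log4j API. In short, if you can, use the slf4j API for logging wherever possible!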

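Thanks

Finally, special thanks to everyone who has contributed to Storm, whether by contributing code or documentation, reporting bugs, or helping others on the mailing lists. You all deserve credit. Nathan loves you, and the Storm team loves you.

0.9.0 Changelog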

Update build configuration to force compatibility with Java 1.6
Fixed a netty client issue where sleep times for reconnection could be negative (thanks brndnmtthws)
Fixed an issue that would cause storm-netty unit tests to fail
Added configuration to limit ShellBolt internal _pendingWrites queue length (thanks xiaokang)
Fixed a display issue with system stats in Storm UI (thanks d2r)
Nimbus now does worker heartbeat timeout checks as soon as heartbeats are updated (thanks d2r)
The logviewer now determines log file location by examining the logback configuration (thanks strongh)
Allow tick tuples to work with the system bolt (thanks xumingming)
Add default configuration values for the netty transport and the ability to configure the number of worker threads (thanks revans2)
Added timeout to unit tests to prevent a situation where tests would hang indefinitely (thanks d2r)
Fixed an issue in the system bolt where local mode would not be detected accurately (thanks miofthena)
Fixed storm jar command to work properly when STORM_JAR_JVM_OPTS is not specified (thanks roadkill001)
All logging now done with slf4j
Replaced log4j logging system with logback
Logs are now limited to 1GB per worker (configurable via logging configuration file)
Build upgraded to leiningen 2.0
Revamped Trident spout interfaces to support more dynamic spouts, such as a spout that reads from a changing set of brokers
How tuples are serialized is now pluggable (thanks anfeng)
Added blowfish encryption based tuple serialization (thanks anfeng)
Have storm fall back to installed storm.yaml (thanks revans2)
Improve error message when Storm detects bundled storm.yaml to show the URLs for offending resources (thanks revans2)
Nimbus throws NotAliveException instead of FileNotFoundException from various query methods when topology is no longer alive (thanks revans2)
Escape HTML and Javascript appropriately in Storm UI (thanks d2r)
Storm's Zookeeper client now uses bounded exponential backoff strategy on failures
Automatically drain and log error stream of multilang subprocesses
Append component name to thread name of running executors so that logs are easier to read
Messaging system used for passing messages between workers is now pluggable (thanks anfeng)
Netty implementation of messaging (thanks anfeng)
Include topology id, worker port, and worker id in properties for worker processes, useful for logging (thanks d2r)
Tick tuples can now be scheduled using floating point seconds (thanks tscurtu)
Added log viewer daemon and links from UI to logviewers (thanks xiaokang)
DRPC server childopts now configurable (thanks strongh)
Default number of ackers to number of workers, instead of just one (thanks lyogavin)
Validate that Storm configs are of proper types/format/structure (thanks d2r)
FixedBatchSpout will now replay batches appropriately on batch failure (thanks ptgoetz)
Can set JAR_JVM_OPTS env variable to add jvm options when calling 'storm jar' (thanks srmelody)
Throw error if batch id for transaction is behind the batch id in the opaque value (thanks mrflip)
Sort topologies by name in UI (thanks jaked)
Added LoggingMetricsConsumer to log all metrics to a file, by default not enabled (thanks mrflip)
Add prepare(Map conf) method to TopologyValidator (thanks ankitoshniwal)
Bug fix: Supervisor provides full path to workers to logging config rather than relative path (thanks revans2)
Bug fix: Call ReducerAggregator#init properly when used within persistentAggregate (thanks lorcan)
Bug fix: Set component-specific configs correctly for Trident spouts

Original article: http://storm-project.net/2013/12/08/storm090-released.html