Property Name           

Default 

Description

flume.called.from.service

If this property is specified then the Flume agent will continue polling for the config file even if the config file is not found at the expected location. Otherwise, the Flume agent will terminate if the config doesn’t exist at the expected location. No property value is needed when setting this property (eg, just specifying -Dflume.called.from.service is enough)

如果这个属性被指定了,那么Flume agent会轮询配置文档即使在指定路径找不到该文档。此外,FLume agent将会结束如果配置文档不在指定位置上。这个属性不需要设置值(例如,只是指定-Dflume.called.from.service就足够了)

Property: flume.called.from.service

Flume periodically polls, every 30 seconds, for changes to the specified config file. A Flume agent loads a new configuration from the config file if either an existing file is polled for the first time, or if an existing file’s modification date has changed since the last time it was polled. Renaming or moving a file does not change its modification time. When a Flume agent polls a non-existent file then one of two things happens: 1. When the agent polls a non-existent config file for the first time, then the agent behaves according to the flume.called.from.service property. If the property is set, then the agent will continue polling (always at the same period – every 30 seconds). If the property is not set, then the agent immediately terminates. ...OR... 2. When the agent polls a non-existent config file and this is not the first time the file is polled, then the agent makes no config changes for this polling period. The agent continues polling rather than terminating.

Flume每30秒周期轮询配置文档是否改变。如果一个文档是第一次被轮询到或者在上次轮询后修改时间被改变了,那么Flume agent会加载新的配置文档。重命名和移动一个文档不会改变文档的修改时间。当一个Flume agent轮询一个不存在的文档会出现以下两种情况的一种:1. 当在指定目录下轮询不到配置文件时,agent会根据flume.called.from.service property这个属性来决定他的行为。如果这个属性设置了,那么会以30秒为周期地进行轮询;如果没设置,那么找不到就立即停止。2. 如果agent在加载过配置文件后在指定路径轮询不到文件的话,那么将不会做任何改变,然后继续轮询。

Log4J Appender(Log4J 日志存储器)

Appends Log4j events to a flume agent’s avro source. A client using this appender must have the flume-ng-sdk in the classpath (eg, flume-ng-sdk-1.8.0-SNAPSHOT.jar). Required properties are in bold.

将Log4j events 添加到一个flume agent的avro source。一个客户端想要使用这个appender必须要有 flume-ng-sdk在类路径下(例如flume-ng-sdk-1.8.0-SNAPSHOT.jar)。必须要的属性用黑体加粗。

Property Name

Default

Description

Hostname

The hostname on which a remote Flume agent is running with an avro source.

运行avro source的远程Flumeagent的主机名

Port

The port at which the remote Flume agent’s avro source is listening.

远程Flume agent的avro source所监听的端口

UnsafeMode

false

If true, the appender will not throw exceptions on failure to send the events.

如果设为真,appender将不会在发送events失败时抛出异常。

AvroReflectionEnabled

false

Use Avro Reflection to serialize Log4j events. (Do not use when users log strings)

使用 Avro反射来序列化 Log4j events。

AvroSchemaUrl

A URL from which the Avro schema can be retrieved.

一个用来恢复数据的URL,该URL是从 Avro schema来的。

Sample log4j.properties file:

#...

log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender

log4j.appender.flume.Hostname = example.com

log4j.appender.flume.Port = 41414

log4j.appender.flume.UnsafeMode = true

# configure a class's logger to output to the flume appender

log4j.logger.org.example.MyClass = DEBUG,flume

#...

By default each event is converted to a string by calling toString(), or by using the Log4j layout, if specified.

If the event is an instance of org.apache.avro.generic.GenericRecord, org.apache.avro.specific.SpecificRecord, or if the property AvroReflectionEnabled  is set to true then the event will be serialized using Avro serialization.

Serializing every event with its Avro schema is inefficient, so it is good practice to provide a schema URL from which the schema can be retrieved by the downstream sink, typically the HDFS sink. If AvroSchemaUrl is not specified, then the schema will be included as a Flume header.

Sample log4j.properties file configured to use Avro serialization:

每个events默认都可以通过toString()来转换成字符串,或者有指定的话可用Log4j layout。

如果events是一个org.apache.avro.generic.GenericRecord, org.apache.avro.specific.SpecificRecord类的实例,或者它的属性AvroReflectionEnabled的值为true,那么会使用Avro serialization进行序列化。

对每个event和它的Avro schema进行序列化是低效的,所以,一个好的实践是提供一个可以从下流的sink中恢复的schemaURL,通常是HDFS sink。如果没有指定AvroSchemaUrl的话,schema会被纳入到Flume haeder。

一个使用Avro serialization的log4j属性文档的例子如下:

#...

log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender

log4j.appender.flume.Hostname = example.com

log4j.appender.flume.Port = 41414

log4j.appender.flume.AvroReflectionEnabled = true

log4j.appender.flume.AvroSchemaUrl = hdfs://namenode/path/to/schema.avsc

# configure a class's logger to output to the flume appender

log4j.logger.org.example.MyClass = DEBUG,flume

#...

Load Balancing Log4J Appender

Appends Log4j events to a list of flume agent’s avro source. A client using this appender must have the flume-ng-sdk in the classpath (eg, flume-ng-sdk-1.8.0-SNAPSHOT.jar). This appender supports a round-robin and random scheme for performing the load balancing. It also supports a configurable backoff timeout so that down agents are removed temporarily from the set of hosts .Required properties are in bold.

将Log4j events 添加到一个flume agent的avro source。一个客户端想要使用这个appender必须要有 flume-ng-sdk在类路径下(例如flume-ng-sdk-1.8.0-SNAPSHOT.jar)。这个日志存储器支持一个循环和随机计划来执行负载均衡。它也支持一个可配置的后移超时以便将挂掉的agent从主机中移除。黑体字标注的属性是必须要的。

Property Name

Default

Description

Hosts

A space-separated list of host:port at which Flume (through an AvroSource) is listening for events

列出监听events的主机列表,每个host:port用空格隔开。

Selector

ROUND_ROBIN

Selection mechanism. Must be either ROUND_ROBIN, RANDOM or custom FQDN to class that inherits from LoadBalancingSelector.

选择机制。必须从ROUND_ROBIN,RANDOM或者继承LoadBalancingSelector的自定义FQDN类。

MaxBackoff

A long value representing the maximum amount of time in milliseconds the Load balancing client will backoff from a node that has failed to consume an event. Defaults to no backoff

这个值代表以毫秒为单位的退避超时最大值,也就是当一个节点在消费event时失效了,等待超时时间再进行重发event。默认是没有退避的

UnsafeMode

false

If true, the appender will not throw exceptions on failure to send the events.

如果设为真,appender将不会在发送events失败时抛出异常。

AvroReflectionEnabled

false

Use Avro Reflection to serialize Log4j events.

使用 Avro反射来序列化 Log4j events。

AvroSchemaUrl

A URL from which the Avro schema can be retrieved.

一个用来恢复数据的URL,该URL是从 Avro schema来的。

Sample log4j.properties file configured using defaults:

#...

log4j.appender.out2 = org.apache.flume.clients.log4jappender.LoadBalancingLog4jAppender

log4j.appender.out2.Hosts = localhost:25430 localhost:25431

# configure a class's logger to output to the flume appender

log4j.logger.org.example.MyClass = DEBUG,flume

#...

Sample log4j.properties file configured using RANDOM load balancing:

#...

log4j.appender.out2 = org.apache.flume.clients.log4jappender.LoadBalancingLog4jAppender

log4j.appender.out2.Hosts = localhost:25430 localhost:25431

log4j.appender.out2.Selector = RANDOM

# configure a class's logger to output to the flume appender

log4j.logger.org.example.MyClass = DEBUG,flume

#...

Sample log4j.properties file configured using backoff:

#...

log4j.appender.out2 = org.apache.flume.clients.log4jappender.LoadBalancingLog4jAppender

log4j.appender.out2.Hosts = localhost:25430 localhost:25431 localhost:25432

log4j.appender.out2.Selector = ROUND_ROBIN

log4j.appender.out2.MaxBackoff = 30000

# configure a class's logger to output to the flume appender

log4j.logger.org.example.MyClass = DEBUG,flume

#...

Security(安全性)

The HDFS sink, HBase sink, Thrift source, Thrift sink and Kite Dataset sink all support Kerberos authentication. Please refer to the corresponding sections for configuring the Kerberos-related options.

Flume agent will authenticate to the kerberos KDC as a single principal, which will be used by different components that require kerberos authentication. The principal and keytab configured for Thrift source, Thrift sink, HDFS sink, HBase sink and DataSet sink should be the same, otherwise the component will fail to start.

HDFS sink、HBase sink、Thrift source、Thrift sink和Kite Dataset sink支持Kerberos认证。请参考配置Kerberos相关选项的章节。

当agent中的不同组件需要kerberos验证,Flume agent会作为kerberos KDC验证的主体。Thrift source, Thrift sink, HDFS sink, HBase sink and DataSet sink的密钥和主体都应该相同,否则组件无法启动。

Flume官方文档翻译——Flume 1.7.0 User Guide (unreleased version)中一些知识点的更多相关文章

  1. Flume官方文档翻译——Flume 1.7.0 User Guide (unreleased version)(二)

    Flume官方文档翻译--Flume 1.7.0 User Guide (unreleased version)(一) Logging raw data(记录原始数据) Logging the raw ...

  2. Flume官方文档翻译——Flume 1.7.0 User Guide (unreleased version)(一)

    Flume 1.7.0 User Guide Introduction(简介) Overview(综述) System Requirements(系统需求) Architecture(架构) Data ...

  3. Flume性能测试报告(翻译Flume官方wiki报告)

    因使用flume的时候总是会对其性能有所调研,网上找的要么就是自测的这里找到一份官方wiki的测试报告供大家参考 https://cwiki.apache.org/confluence/display ...

  4. 蓝牙4.0——Android BLE开发官方文档翻译

    ble4.0开发整理资料_百度文库 http://wenku.baidu.com/link?url=ZYix8_obOT37JUQyFv-t9Y0Sv7SPCIfmc5QwjW-aifxA8WJ4iW ...

  5. 【翻译】Flume 1.8.0 User Guide(用户指南) Processors

    翻译自官网flume1.8用户指南,原文地址:Flume 1.8.0 User Guide 篇幅限制,分为以下5篇: [翻译]Flume 1.8.0 User Guide(用户指南) [翻译]Flum ...

  6. 【翻译】Flume 1.8.0 User Guide(用户指南) Channel

    翻译自官网flume1.8用户指南,原文地址:Flume 1.8.0 User Guide 篇幅限制,分为以下5篇: [翻译]Flume 1.8.0 User Guide(用户指南) [翻译]Flum ...

  7. 【翻译】Flume 1.8.0 User Guide(用户指南) Sink

    翻译自官网flume1.8用户指南,原文地址:Flume 1.8.0 User Guide 篇幅限制,分为以下5篇: [翻译]Flume 1.8.0 User Guide(用户指南) [翻译]Flum ...

  8. 【翻译】Flume 1.8.0 User Guide(用户指南) source

    翻译自官网flume1.8用户指南,原文地址:Flume 1.8.0 User Guide 篇幅限制,分为以下5篇: [翻译]Flume 1.8.0 User Guide(用户指南) [翻译]Flum ...

  9. 【翻译】Flume 1.8.0 User Guide(用户指南)

    翻译自官网flume1.8用户指南,原文地址:Flume 1.8.0 User Guide 篇幅限制,分为以下5篇: [翻译]Flume 1.8.0 User Guide(用户指南) [翻译]Flum ...

随机推荐

  1. js数组操作总结

    1.数组的检测 ECMAScript3    if(value instanceof Array){ //执行操作 }    假定单一全局环境,如果网页存在多个框架,多个window,Array具有不 ...

  2. POJ2369 Permutations(置换的周期)

    链接:http://poj.org/problem?id=2369 Permutations Time Limit: 1000MS   Memory Limit: 65536K Total Submi ...

  3. 9.30notes

    memcached   缓存机制,减轻数据库负载.它通过在内存中缓存数据和对象来减少读取数据库的次数,从而提高动态.数据库驱动网站的速度. array_slice(data['list'],0,10) ...

  4. 培训SQLServer 嵌套事务PPT分享

    培训SQLServer 嵌套事务PPT分享 下载地址 http://files.cnblogs.com/files/lyhabc/SQLServer%E5%B5%8C%E5%A5%97%E4%BA%8 ...

  5. 你的眼睛背叛你的心:解决 .NET Core 中 GetHostAddressesAsync 引起的 EnyimMemcached 死锁问题

    在我们将站点从 ASP.NET + Windows 迁移至 ASP.NET Core + Linux 的过程中,目前遇到的最大障碍就是 —— 没有可用的支持 .NET Core 的 memcached ...

  6. JavaScript异步编程原理

    众所周知,JavaScript 的执行环境是单线程的,所谓的单线程就是一次只能完成一个任务,其任务的调度方式就是排队,这就和火车站洗手间门口的等待一样,前面的那个人没有搞定,你就只能站在后面排队等着. ...

  7. 细说 Data URI

    Data URL 早在 1995 年就被提出,那个时候有很多个版本的 Data URL Schema 定义陆续出现在 VRML 之中,随后不久,其中的一个版本被提上了议案——将它做个一个嵌入式的资源放 ...

  8. 探索c#之尾递归编译器优化

    阅读目录: 递归运用 尾递归优化 编译器优化 递归运用 一个函数直接或间接的调用自身,这个函数即可叫做递归函数. 递归主要功能是把问题转换成较小规模的子问题,以子问题的解去逐渐逼近最终结果. 递归最重 ...

  9. MySQL 主主复制

    200 ? "200px" : this.width)!important;} --> 介绍 环境 OS:CentOS 6.7,MySQL 5.6 Master:192.16 ...

  10. 使用GDB调试Go语言

    用Go语言已经有一段时间了,总结一下如何用GDB来调试它! ps:网上有很多文章都有描述,但是都不是很全面,这里将那些方法汇总一下 GDB简介  GDB是GNU开源组织发布的⼀一个强⼤大的UNIX下的 ...