以下jie皆来自官网:

1:首先版本是flume 1.8

查看版本:  bin/flume-ng version

2:配置与启动

https://flume.apache.org/FlumeUserGuide.html#configuration

Defining the flow

# list the sources, sinks and channels for the agent
<Agent>.sources = <Source>
<Agent>.sinks = <Sink>
<Agent>.channels = <Channel1> <Channel2> # set channel for source
<Agent>.sources.<Source>.channels = <Channel1> <Channel2> ... # set channel for sink
<Agent>.sinks.<Sink>.channel = <Channel1> eg2:
# list the sources, sinks and channels for the agent
agent_foo.sources = avro-appserver-src-1
agent_foo.sinks = hdfs-sink-1
agent_foo.channels = mem-channel-1 # set channel for source
agent_foo.sources.avro-appserver-src-1.channels = mem-channel-1 # set channel for sink
agent_foo.sinks.hdfs-sink-1.channel = mem-channel-1

Using environment variables in configuration files

Flume has the ability to substitute environment variables in the configuration. For example:

a1.sources = r1
a1.sources.r1.type = netcat
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = ${NC_PORT}
a1.sources.r1.channels = c1

NB: it currently works for values only, not for keys. (Ie. only on the “right side” of the = mark of the config lines.)

This can be enabled via Java system properties on agent invocation by setting propertiesImplementation = org.apache.flume.node.EnvVarResolverProperties.

For example::
$ NC_PORT=44444 bin/flume-ng agent –conf conf –conf-file example.conf –name a1 -Dflume.root.logger=INFO,console -DpropertiesImplementation=org.apache.flume.node.EnvVarResolverProperties

Note the above is just an example, environment variables can be configured in other ways, including being set in conf/flume-env.sh.

Configuring individual components

# properties for sources
<Agent>.sources.<Source>.<someProperty> = <someValue> # properties for channels
<Agent>.channel.<Channel>.<someProperty> = <someValue> # properties for sinks
<Agent>.sources.<Sink>.<someProperty> = <someValue>

Adding multiple flows in an agent


# list the sources, sinks and channels for the agent
<Agent>.sources = <Source1> <Source2>
<Agent>.sinks = <Sink1> <Sink2>
<Agent>.channels = <Channel1> <Channel2>


# list the sources, sinks and channels in the agent
agent_foo.sources = avro-AppSrv-source1 exec-tail-source2
agent_foo.sinks = hdfs-Cluster1-sink1 avro-forward-sink2
agent_foo.channels = mem-channel-1 file-channel-2 # flow #1 configuration
agent_foo.sources.avro-AppSrv-source1.channels = mem-channel-1
agent_foo.sinks.hdfs-Cluster1-sink1.channel = mem-channel-1 # flow #2 configuration
agent_foo.sources.exec-tail-source2.channels = file-channel-2
agent_foo.sinks.avro-forward-sink2.channel = file-channel-2

Configuring a multi agent flow


Weblog agent config:


# list sources, sinks and channels in the agent
agent_foo.sources = avro-AppSrv-source
agent_foo.sinks = avro-forward-sink
agent_foo.channels = file-channel # define the flow
agent_foo.sources.avro-AppSrv-source.channels = file-channel
agent_foo.sinks.avro-forward-sink.channel = file-channel # avro sink properties
agent_foo.sinks.avro-forward-sink.type = avro
agent_foo.sinks.avro-forward-sink.hostname = 10.1.1.100
agent_foo.sinks.avro-forward-sink.port = 10000 # configure other pieces
#...

HDFS agent config:


# list sources, sinks and channels in the agent
agent_foo.sources = avro-collection-source
agent_foo.sinks = hdfs-sink
agent_foo.channels = mem-channel # define the flow
agent_foo.sources.avro-collection-source.channels = mem-channel
agent_foo.sinks.hdfs-sink.channel = mem-channel # avro source properties
agent_foo.sources.avro-collection-source.type = avro
agent_foo.sources.avro-collection-source.bind = 10.1.1.100
agent_foo.sources.avro-collection-source.port = 10000 # configure other pieces
#...

Fan out flow


# List the sources, sinks and channels for the agent
<Agent>.sources = <Source1>
<Agent>.sinks = <Sink1> <Sink2>
<Agent>.channels = <Channel1> <Channel2> # set list of channels for source (separated by space)
<Agent>.sources.<Source1>.channels = <Channel1> <Channel2> # set channel for sinks
<Agent>.sinks.<Sink1>.channel = <Channel1>
<Agent>.sinks.<Sink2>.channel = <Channel2> <Agent>.sources.<Source1>.selector.type = replicating
# Mapping for multiplexing selector
<Agent>.sources.<Source1>.selector.type = multiplexing
<Agent>.sources.<Source1>.selector.header = <someHeader>
<Agent>.sources.<Source1>.selector.mapping.<Value1> = <Channel1>
<Agent>.sources.<Source1>.selector.mapping.<Value2> = <Channel1> <Channel2>
<Agent>.sources.<Source1>.selector.mapping.<Value3> = <Channel2>
#... <Agent>.sources.<Source1>.selector.default = <Channel2>
# list the sources, sinks and channels in the agent
agent_foo.sources = avro-AppSrv-source1
agent_foo.sinks = hdfs-Cluster1-sink1 avro-forward-sink2
agent_foo.channels = mem-channel-1 file-channel-2 # set channels for source
agent_foo.sources.avro-AppSrv-source1.channels = mem-channel-1 file-channel-2 # set channel for sinks
agent_foo.sinks.hdfs-Cluster1-sink1.channel = mem-channel-1
agent_foo.sinks.avro-forward-sink2.channel = file-channel-2 # channel selector configuration
agent_foo.sources.avro-AppSrv-source1.selector.type = multiplexing
agent_foo.sources.avro-AppSrv-source1.selector.header = State
agent_foo.sources.avro-AppSrv-source1.selector.mapping.CA = mem-channel-1
agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ = file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.mapping.NY = mem-channel-1 file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.default = mem-channel-1
# channel selector configuration
agent_foo.sources.avro-AppSrv-source1.selector.type = multiplexing
agent_foo.sources.avro-AppSrv-source1.selector.header = State
agent_foo.sources.avro-AppSrv-source1.selector.mapping.CA = mem-channel-1
agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ = file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.mapping.NY = mem-channel-1 file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.optional.CA = mem-channel-1 file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ = file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.default = mem-channel-1
 

Network streams


Flume supports the following mechanisms to read data from popular log stream types, such as:


  1. Avro
  2. Thrift
  3. Syslog
  4. Netcat
  5. exec:
  6.   a1.sources.r1.type = exec
    a1.sources.r1.command = tail -f /opt/xfs/logs/tomcat/xfs-cs/logs/xfs_cs_41
  7. seq
  8. spool :暂时不清楚
  9. http:
  10. a1.sources.r1.type =org.apache.flume.source.http.HTTPSource

    a1.sources.r1.port= 5142

  11. tcp || syslog
  12. a1.sources.r1.type =syslogtcp

    a1.sources.r1.port= 5140    ####built log ==>  echo "hello idoall.org syslog" | nc localhost 5140


 
bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console  #####--name 就是配置中的agent

flume 使用手册的更多相关文章

  1. 【Flume学习之二】Flume 使用场景

    环境 apache-flume-1.6.0 一.多agent连接 1.node101配置 option2 # Name the components on this agent a1.sources ...

  2. 【Flume学习之一】Flume简介

    环境 apache-flume-1.6.0 Flume是分布式日志收集系统.可以将应用产生的数据存储到任何集中存储器中,比如HDFS,HBase:同类工具:Facebook Scribe,Apache ...

  3. flume+kafka+smart数据接入实施手册

    1.  概述 本手册主要介绍了,一个将传统数据接入到Hadoop集群的数据接入方案和实施方法.供数据接入和集群运维人员参考. 1.1.   整体方案 Flume作为日志收集工具,监控一个文件目录或者一 ...

  4. 分布式日志收集系统Apache Flume的设计详细介绍

    问题导读: 1.Flume传输的数据的基本单位是是什么? 2.Event是什么,流向是怎么样的? 3.Source:完成对日志数据的收集,分成什么打入Channel中? 4.Channel的作用是什么 ...

  5. DevOps之服务手册

    唠叨话 关于德语噢屁事的知识点,仅提供精华汇总,具体知识点细节,参考教程网址,如需帮助,请留言. <DevOps服务手册(Manual)> <IT资源目标化>1.设施和设备(I ...

  6. 日志采集框架Flume以及Flume的安装部署(一个分布式、可靠、和高可用的海量日志采集、聚合和传输的系统)

    Flume支持众多的source和sink类型,详细手册可参考官方文档,更多source和sink组件 http://flume.apache.org/FlumeUserGuide.html Flum ...

  7. Flume+Sqoop+Azkaban笔记

    大纲(辅助系统) 离线辅助系统 数据接入 Flume介绍 Flume组件 Flume实战案例 任务调度 调度器基础 市面上调度工具 Oozie的使用 Oozie的流程定义详解 数据导出 sqoop基础 ...

  8. Flume 概述+环境配置+监听Hive日志信息并写入到hdfs

    Flume介绍Flume是Apache基金会组织的一个提供的高可用的,高可靠的,分布式的海量日志采集.聚合和传输的系统,Flume支持在日志系统中定制各类数据发送方,用于收集数据:同时,Flume提供 ...

  9. 大数据技术之_09_Flume学习_Flume概述+Flume快速入门+Flume企业开发案例+Flume监控之Ganglia+Flume高级之自定义MySQLSource+Flume企业真实面试题(重点)

    第1章 Flume概述1.1 Flume定义1.2 Flume组成架构1.2.1 Agent1.2.2 Source1.2.3 Channel1.2.4 Sink1.2.5 Event1.3 Flum ...

随机推荐

  1. WPF DataGrid多表头/列头,多行头,合并单元格,一列占据多行

    先上效果图: 思路说明:这是两个DataGrid,没有嵌套,位置和高度保持一致,在加上ScrollViewer滚动条,这就像是在一个DataGrid中. 缺点: 因为最外层有透明的Border,所以没 ...

  2. 本地计算机上的OracleDBConsoleorcl服务启动后停止

    emca -repos dropemca -repos createemca -config dbcontrol db 这三步你都运行成功了也没有报错?最后没有提示你dbcontrol已经启动了么?, ...

  3. python二进制转换

    例一.题目描述: 输入一个整数,输出该数二进制表示中1的个数.其中负数用补码表示. 分析: python没有unsigned int类型 >>> print ("%x&qu ...

  4. tkinter获取键盘输入

    tkinter获取键盘输入

  5. 432 4.3.2 STOREDRV.Deliver; recipient thread limit exceeded

    最近几天Hub-Mailbox服务器时不时就CPU超过90%.在任务管理器里面看到edgetransport占用大量CPU.进入EMC的队列查看器,看到邮箱数据库堵塞,队列上万. 堵塞的邮件大多是收件 ...

  6. hping安装过程

    转载: http://www.safecdn.cn/website-announcement/2018/12/hping-install/97.html ‎ Hping的主要功能有: 测试防火墙实用的 ...

  7. Dubbox服务demo

    一.安装虚拟机,安装所需要的jdk.zookeeper并启动zookeeper,虚拟机的ip+zookeeper默认端口号2181 二.编写Service服务方 1.创建Maven项目 2.编写接口 ...

  8. SaltStack 和 Ansible 的简单比较

    https://blog.csdn.net/nqxqxq/article/details/76154847 https://www.cnblogs.com/lgeng/p/6567424.html   ...

  9. unrecognized import path "golang.org/x/net/html"

    go run的时候报:unrecognized import path "golang.org/x/net/html" 应该是被墙掉了,自己去github上下载包即可 git cl ...

  10. 用U盘制作启动盘后空间变小的恢复方法

    先把u盘插好, 运行cmd(按住键盘左下角第二个windows键的同时按R), 输入diskpart,回车, (此时可以再输入list disk,回车,能看到这台电脑的所有磁盘大致情况,u盘一般是磁盘 ...