Flume and Kafka integration example (Kafka as a Flume sink, writing to a Kafka topic)

Preparation:

$ sudo mkdir -p /flume/web_spooldir
$ sudo chmod -R a+w /flume

Edit the Flume configuration file:

$ cat /home/tester/flafka/spooldir_kafka.conf

# Name the components on this agent
agent1.sources = weblogsrc
agent1.sinks = kafka-sink
agent1.channels = memchannel

# Configure the source
agent1.sources.weblogsrc.type = spooldir
agent1.sources.weblogsrc.spoolDir = /flume/web_spooldir
agent1.sources.weblogsrc.channels = memchannel

# Configure the sink
agent1.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
agent1.sinks.kafka-sink.topic = weblogs
agent1.sinks.kafka-sink.brokerList = localhost:9092
agent1.sinks.kafka-sink.batchSize = 20
agent1.sinks.kafka-sink.channel = memchannel

# Use a channel which buffers events in memory
agent1.channels.memchannel.type = memory
agent1.channels.memchannel.capacity = 100000
agent1.channels.memchannel.transactionCapacity = 1000
$
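In a file like this, the channel name has to match in three places: the agent's `channels` line, the source's `channels` property, and the sink's `channel` property. The following is a minimal sketch of a check for that, assuming the simple `key = value` layout shown above. `check_wiring` is a hypothetical helper written for this example and is not part of Flume.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flume): verify that every channel
# referenced by a source or sink is declared on "<agent>.channels".
# Usage: check_wiring <conf-file> <agent-name>
check_wiring() {
    awk -v agent="$2" '
        # Skip comment lines and blank lines.
        /^[ \t]*(#|$)/ { next }
        {
            eq = index($0, "=")
            if (eq == 0) next
            key = substr($0, 1, eq - 1)
            val = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (key == agent ".channels") {
                # Channels declared for this agent.
                n = split(val, names, /[ \t]+/)
                for (i = 1; i <= n; i++) declared[names[i]] = 1
            } else if (index(key, agent ".sources.") == 1 && key ~ /\.channels$/) {
                used[key] = val      # a source may list several channels
            } else if (index(key, agent ".sinks.") == 1 && key ~ /\.channel$/) {
                used[key] = val      # a sink reads from exactly one channel
            }
        }
        END {
            bad = 0
            for (k in used) {
                n = split(used[k], names, /[ \t]+/)
                for (i = 1; i <= n; i++) {
                    if (!(names[i] in declared)) {
                        printf "%s references undeclared channel %s\n", k, names[i]
                        bad = 1
                    }
                }
            }
            exit bad
        }
    ' "$1"
}
```

For the file above, `check_wiring /home/tester/flafka/spooldir_kafka.conf agent1` should print nothing and return 0, since both the source and the sink point at `memchannel`.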

Run flume-ng:

$ flume-ng agent --conf /etc/flume-ng/conf \
> --conf-file spooldir_kafka.conf \
> --name agent1 -Dflume.root.logger=INFO,console

The output looks similar to the following:

Info: Sourcing environment configuration script /etc/flume-ng/conf/flume-env.sh
Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12.jar from classpath
Info: Including HBASE libraries found via (/usr/bin/hbase) for HBASE access
Info: Excluding /usr/lib/hbase/bin/../lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/hbase/bin/../lib/slf4j-log4j12.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12.jar from classpath
Info: Excluding /usr/lib/zookeeper/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar from classpath
Info: Excluding /usr/lib/zookeeper/lib/slf4j-log4j12.jar from classpath
Info: Including Hive libraries found via () for Hive access
+ exec /usr/java/default/bin/java -Xmx500m -Dflume.root.logger=INFO,console -cp '/etc/flume-ng/conf:/usr/lib/flume-ng/lib/*:/etc/hadoop/conf:/usr/lib/hadoop/lib/activation-1.1.jar:/usr/lib/hadoop/lib/apacheds-i18n-2.0.0-M15.jar

...

-Djava.library.path=:/usr/lib/hadoop/lib/native:/usr/lib/hadoop/lib/native:/usr/lib/hbase/bin/../lib/native/Linux-amd64-64 org.apache.flume.node.Application --conf-file spooldir_kafka.conf --name agent1
2017-10-23 01:15:11,209 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:61)] Configuration provider starting
2017-10-23 01:15:11,223 (conf-file-poller-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:133)] Reloading configuration file:spooldir_kafka.conf
2017-10-23 01:15:11,256 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:kafka-sink

...

2017-10-23 01:15:11,933 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: SOURCE, name: weblogsrc started
2017-10-23 01:15:13,003 (lifecycleSupervisor-1-1) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Verifying properties
2017-10-23 01:15:13,271 (lifecycleSupervisor-1-1) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Property key.serializer.class is overridden to kafka.serializer.StringEncoder
2017-10-23 01:15:13,271 (lifecycleSupervisor-1-1) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Property metadata.broker.list is overridden to localhost:9092
2017-10-23 01:15:13,277 (lifecycleSupervisor-1-1) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Property request.required.acks is overridden to 1
2017-10-23 01:15:13,277 (lifecycleSupervisor-1-1) [INFO - kafka.utils.Logging$class.info(Logging.scala:68)] Property serializer.class is overridden to kafka.serializer.DefaultEncoder
2017-10-23 01:15:13,718 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: SINK, name: kafka-sink: Successfully registered new MBean.
2017-10-23 01:15:13,719 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: SINK, name: kafka-sink started
...

2017-10-23 01:15:13,720 (pool-4-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:258)] Last read took us just up to a file boundary. Rolling to the next file, if there is one.
2017-10-23 01:15:13,720 (pool-4-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.rollCurrentFile(ReliableSpoolingFileEventReader.java:348)] Preparing to move file /flume/web_spooldir/2014-01-13.log to /flume/web_spooldir/2014-01-13.log.COMPLETED

...

2017-10-23 01:16:11,441 (pool-4-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:258)] Last read took us just up to a file boundary. Rolling to the next file, if there is one.
2017-10-23 01:16:11,451 (pool-4-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.rollCurrentFile(ReliableSpoolingFileEventReader.java:348)] Preparing to move file /flume/web_spooldir/2014-01-24.log to /flume/web_spooldir/2014-01-24.log.COMPLETED
2017-10-23 01:16:11,818 (pool-4-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:258)] Last read took us just up to a file boundary. Rolling to the next file, if there is one.
2017-10-23 01:16:11,819 (pool-4-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.rollCurrentFile(ReliableSpoolingFileEventReader.java:348)] Preparing to move file /flume/web_spooldir/2014-02-15.log to /flume/web_spooldir/2014-02-15.log.COMPLETED

Run the Kafka console consumer:

$ kafka-console-consumer --zookeeper localhost:2181 --topic weblogs

In another terminal window, copy the web logs into the /flume/web_spooldir directory:

cp -rf /home/tester/weblogs /tmp/tmp_weblogs
mv /tmp/tmp_weblogs/* /flume/web_spooldir
rm -rf /tmp/tmp_weblogs
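The commands above stage the files in a temporary directory and then move them, so that files appear in the spooling directory in one step instead of being written there slowly. Flume's spooling directory source expects files to be complete and unchanged once they appear, and it treats a reused file name as an error. The following is a minimal sketch of the same idea as a reusable function. `stage_into_spooldir` is a hypothetical name, and placing the staging directory next to the spooling directory (so that `mv` is a rename on the same filesystem) is an assumption of this sketch, not something the original commands do.

```shell
#!/bin/sh
# Hypothetical helper: copy files into a staging directory, then move them
# into the spooling directory one by one.
# Usage: stage_into_spooldir <source-dir> <spool-dir>
stage_into_spooldir() {
    src_dir=$1
    spool_dir=$2
    # Assumption: the parent of the spool directory is writable, so staging
    # there keeps the final mv on the same filesystem.
    staging=$(mktemp -d "$(dirname "$spool_dir")/staging.XXXXXX") || return 1
    cp -R "$src_dir"/. "$staging"/ || return 1
    for f in "$staging"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        # Skip names that are already present or already processed
        # (processed files are renamed with a .COMPLETED suffix).
        if [ -e "$spool_dir/$name" ] || [ -e "$spool_dir/$name.COMPLETED" ]; then
            echo "skipping $name: name already used in $spool_dir" >&2
            continue
        fi
        mv "$f" "$spool_dir/$name"
    done
    rm -rf "$staging"
}
```

With the paths used in this example, the call would be `stage_into_spooldir /home/tester/weblogs /flume/web_spooldir`. The source directory is left untouched, as with the `cp`, `mv`, `rm` sequence above.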

Output in the flume-ng window (the log files are being sent to the Kafka topic weblogs):

2017-10-23 01:36:28,436 (pool-4-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:258)] Last read took us just up to a file boundary. Rolling to the next file, if there is one.
2017-10-23 01:36:28,449 (pool-4-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.rollCurrentFile(ReliableSpoolingFileEventReader.java:348)] Preparing to move file /flume/web_spooldir/2013-09-22.log to /flume/web_spooldir/2013-09-22.log.COMPLETED
2017-10-23 01:36:28,971 (pool-4-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:258)] Last read took us just up to a file boundary. Rolling to the next file, if there is one.
...

2017-10-23 01:37:39,011 (pool-4-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.rollCurrentFile(ReliableSpoolingFileEventReader.java:348)] Preparing to move file /flume/web_spooldir/2014-02-19.log to /flume/web_spooldir/2014-02-19.log.COMPLETED
2017-10-23 01:37:39,386 (pool-4-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:258)] Last read took us just up to a file boundary. Rolling to the next file, if there is one.
2017-10-23 01:37:39,386 (pool-4-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.rollCurrentFile(ReliableSpoolingFileEventReader.java:348)] Preparing to move file /flume/web_spooldir/2014-03-09.log to /flume/web_spooldir/2014-03-09.log.COMPLETED

The consumer window prints the contents of all the web log files (everything received from the topic weblogs):

...

213.125.211.10 - 66543 [09/Mar/2014:00:00:14 +0100] "GET /KBDOC-00131.html HTTP/1.0" 200 9807 "http://www.tester.com" "tester test 001"
213.125.211.10 - 66543 [09/Mar/2014:00:00:14 +0100] "GET /theme.css HTTP/1.0" 200 6448 "http://www.tester.com" "tester test 002"

