Flume环境搭建_五种案例
Flume环境搭建_五种案例
http://flume.apache.org/FlumeUserGuide.html
A simple example
Here, we give an example configuration file, describing a single-node Flume deployment. This configuration lets a user generate events and subsequently logs them to the console.
# example.conf: A single-node Flume configuration # Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1 # Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444 # Describe the sink
a1.sinks.k1.type = logger # Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100 # Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
This configuration defines a single agent named a1. a1 has a source that listens for data on port 44444, a channel that buffers event data in memory, and a sink that logs event data to the console. The configuration file names the various components, then describes their types and configuration parameters. A given configuration file might define several named agents; when a given Flume process is launched a flag is passed telling it which named agent to manifest.
Given this configuration file, we can start Flume as follows:
$ bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console
Note that in a full deployment we would typically include one more option: --conf=<conf-dir>. The <conf-dir> directory would include a shell script flume-env.sh and potentially a log4j properties file. In this example, we pass a Java option to force Flume to log to the console and we go without a custom environment script.
From a separate terminal, we can then telnet port 44444 and send Flume an event:
$ telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
Hello world! <ENTER>
OK
The original Flume terminal will output the event in a log message.
12/06/19 15:32:19 INFO source.NetcatSource: Source starting
12/06/19 15:32:19 INFO source.NetcatSource: Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:44444]
12/06/19 15:32:34 INFO sink.LoggerSink: Event: { headers:{} body: 48 65 6C 6C 6F 20 77 6F 72 6C 64 21 0D Hello world!. }
Congratulations - you’ve successfully configured and deployed a Flume agent! Subsequent sections cover agent configuration in much more detail.
以下为具体搭建流程
Flume搭建_案例一:单个Flume
安装node2上
1. 上传到/home/tools,解压,解压后移动到/home下
2. 重命名,并修改flume-env.sh








Flume搭建_案例二:两个Flume做集群

MemoryChanel配置
capacity:默认该通道中最大的可以存储的event数量是100,
trasactionCapacity:每次最大可以从source中拿到或者送到sink中的event数量也是100
keep-alive:event添加到通道中或者移出的允许时间
byte**:即event的字节量的限制,只包括eventbody
在/etc/profile下
node1,node2下创建测试目录test_flume,并分别在node1,node2下创建配置文件——flume21,flume22




先启动node02的Flume
flume-ng agent -n a1 -c conf -f avro.conf -Dflume.root.logger=INFO,console
flume-ng agent -n a1 -c conf -f /home/test_flume/flume22 -Dflume.root.logger=INFO,console
再启动node01的Flume
flume-ng agent -n a1 -c conf -f simple.conf2 -Dflume.root.logger=INFO,console
flume-ng agent -n a1 -c conf -f /home/test_flume/flume21 -Dflume.root.logger=INFO,console






Flume搭建_案例三:如何监控一个文件的变化?
在/etc/profile下


启动Flume
flume-ng agent -n a1 -c conf -f exec.conf -Dflume.root.logger=INFO,console
flume-ng agent -n a1 -c conf -f /home/test_flume/flume3 -Dflume.root.logger=INFO,console


Flume搭建_案例四: 如何监控一个文件:目录的变化?

在/etc/profile下





Flume搭建_案例五: 如何定义一个HDFS类型的Sink?

Flume搭建_案例五_配置项解读


hdfs.rollInterval | 30 | Number of seconds to wait before rolling current file (0 = never roll based on time interval) |
hdfs.rollSize | 1024 | File size to trigger roll, in bytes (0: never roll based on file size) |
hdfs.rollCount | 10 | Number of events written to file before it rolled (0 = never roll based on number of events) |
4. 多长时间没有操作,Flume将一个临时文件生成新文件?
hdfs.idleTimeout | 0 | Timeout after which inactive files get closed (0 = disable automatic closing of idle files) |
5. 多长时间生成一个新的目录?(比如每10s生成一个新的目录)
四舍五入,没有五入,只有四舍
(比如57分划分为55分,5,6,7,8,9在一个目录,10,11,12,13,14在一个目录)
hdfs.round | false | Should the timestamp be rounded down (if true, affects all time based escape sequences except %t) |
hdfs.roundValue | 1 | Rounded down to the highest multiple of this (in the unit configured using hdfs.roundUnit), less than current time. |
hdfs.roundUnit | second | The unit of the round down value - second, minute or hour. |
在/etc/profile下






Flume环境搭建_五种案例的更多相关文章
- Flume环境搭建_五种案例(转)
Flume环境搭建_五种案例 http://flume.apache.org/FlumeUserGuide.html A simple example Here, we give an example ...
- Hadoop Hive概念学习系列之hive三种方式区别和搭建、HiveServer2环境搭建、HWI环境搭建和beeline环境搭建(五)
说在前面的话 以下三种情况,最好是在3台集群里做,比如,master.slave1.slave2的master和slave1都安装了hive,将master作为服务端,将slave1作为服务端. 以 ...
- 从环境搭建到回归神经网络案例,带你掌握Keras
摘要:Keras作为神经网络的高级包,能够快速搭建神经网络,它的兼容性非常广,兼容了TensorFlow和Theano. 本文分享自华为云社区<[Python人工智能] 十六.Keras环境搭建 ...
- 03 Mybatis:01.Mybatis课程介绍及环境搭建&&02.Mybatis入门案例
mybatis框架共四天第一天:mybatis入门 mybatis的概述 mybatis的环境搭建 mybatis入门案例 -------------------------------------- ...
- Kafka开发环境搭建(五)
如果你要利用代码来跑kafka的应用,那你最好先把官网给出的example先在单机环境和分布式环境下跑通,然后再逐步将原有的consumer.producer和broker替换成自己写的代码.所以在阅 ...
- [刘阳Java]_EasyUI环境搭建_第2讲
在EasyUI的第1讲中我们介绍了学习EasyUI能够做什么,这次我们得快速搭建一个EasyUI环境,来测试一下它的运行效果 1.jQuery EasyUI环境搭建 <script type=& ...
- 第3章 文件I/O(5)_五种I/O模型
6. I/O处理方式(5种I/O模型) 6.1 几个概念的辨析 (1)同步和异步 ①是访问数据的方式,主要是针对IO(资源.数据)而言的.关键在于I/O操作完成后,有没有提供通知机制. ②同步的IO, ...
- Linux 下LAMP环境搭建_【all】
LAMP = Linux + Apache + Mysql + PHP 0. Linux环境搭建 Linux 系统安装[Redhat] 1.http服务软件分类及企业实战用途介绍 静态程序: Apac ...
- Linux 下LNMP环境搭建_【all】
LNMP = Linux + Nginx + Mysql + PHP 1.0 Linux环境搭建 Linux 系统安装[Redhat] 1.1. FastCGI介绍 1.什么是CGI(common g ...
随机推荐
- 编译TensorFlow源码
编译TensorFlow源码 参考: https://www.tensorflow.org/install/install_sources https://github.com/tensorflo ...
- CSS3媒体查询(Media Queries)介绍
媒体类型 all 所有设备 screen 电脑显示器 handheld 便携设备 tv 电视类型设备 print 打印用纸打印预览视图 关键字 and not(排除某种设备) only(限定某种设备) ...
- 房上的猫:for循环,跳转语句与循环结构,跳转语句进阶
一.for循环 1.定义: for循环语句的主要作用是反复执行一段代码,直到满足一定条件为止 2.组成部分: (1)初始部分:设置循环的初始状态 (2)循环体:重复执行的代码 (3)迭代部分: ...
- python实现散列表的链表法
在散列中,链接法是一种最简单的碰撞解决技术,这种方法的原理就是把散列到同一槽中的所有元素 都放在一个链表中. 链接法有两个定理,定理一: 在简单一致散列的假设下,一次不成功查找的期望时间为O(1 + ...
- Bootstrap File Input的简单使用
安装引入 使用前需要引入其css和js文件, 注意引入路径的问题 <link rel="stylesheet" href="/__PUB__/fileinput/c ...
- 基础环境之Docker入门
随着Docker技术的不断成熟,越来越多的企业开始考虑使用Docker.Docker有很多的优势,本文主要讲述了Docker的五个最重要优势,即持续集成.版本控制.可移植性.隔离性和安全性. 有了Do ...
- angular4.0命令行汇总
查看ng命令行 ng help 创建项目 ng new projectName ng new projectName --routing[--routing表示创建带路由的项目] 配置依赖 npm i ...
- [笔记]《JavaScript高级程序设计》- 最佳实践
一.可维护性 1 什么是可维护的代码 可理解性--其他人可以接受代码并理解它的意图和一般途径,而无需原开发人员的完整解释. 直观性--代码中的东西一看就能明白,不管其操作过程多么复杂. 可适应性--代 ...
- 【次小生成树】bzoj1977 [BeiJing2010组队]次小生成树 Tree
Description 小 C 最近学了很多最小生成树的算法,Prim 算法.Kurskal 算法.消圈算法等等. 正当小 C 洋洋得意之时,小 P 又来泼小 C 冷水了.小 P 说,让小 C 求出一 ...
- 意外断电数据库无法启动牵扯到异步IO的参数设置
一客户机房新装的UPS不太稳定,好几次意外断电,第3次意外断电之后问题终于来了, 数据库起不来了-- 数据库的硬件环境是一台IBM DS5020存储,2台IBM X3850 X5 软件环境是Linux ...