Flume HA

flume提供fail over和load balance功能

1.添加collector配置（配置两个collector）

# Name the components on this agent
s1.sources = r1
s1.sinks = k1
s1.channels = c1

# Describe/configure the source
s1.sources.r1.type = avro #设置source类型，固定avro
s1.sources.r1.bind = node2 #设置绑定的hostname，agent会上传数据到这个hostname的端口
s1.sources.r1.port = 52020 #设置port
s1.sources.r1.interceptors = avroSerializeInterceptor
s1.sources.r1.interceptors.avroSerializeInterceptor.type = AvroSerializeInterceptor$Builder
#a1.sources.r1.port = 44444

# Describe the sink
s1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
s1.sinks.k1.kafka.topic = tp002
s1.sinks.k1.kafka.bootstrap.servers = 192.168.0.118:9092,192.168.0.118:9093,192.168.0.118:9094
s1.sinks.k1.kafka.flumeBatchSize = 20
s1.sinks.k1.kafka.producer.acks = 1
s1.sinks.k1.kafka.producer.linger.ms = 1
s1.sinks.k1.kafka.producer.compression.type = snappy

# Use a channel which buffers events in memory
s1.channels.c1.type = memory
s1.channels.c1.capacity = 1000
s1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
s1.sources.r1.channels = c1
s1.sinks.k1.channel = c1

2.添加agent配置

# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2 #设置多个sink
a1.channels = c1
a1.sinkgroups = g1 #设置sinkgroup，为配置load balance或者failover做准备

# Describe/configure the source
a1.sources.r1.channels = c1
a1.sources.r1.type = exec
a1.sources.r1.command = tail -f /tmp/test.log

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Describe the sink
a1.sinks.k1.channel = c1
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = node2 #设置要上传数据的hostname和端口，对应collector中的配置
a1.sinks.k1.port = 52020

a1.sinks.k2.channel = c1
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = node2 #同上
a1.sinks.k2.port = 52021

# set sink group
a1.sinkgroups.g1.sinks = k1 k2 #设置group中的sink

# set group
a1.sinkgroups.g1.processor.type = failover #设置sinkgroup处理类型为fail over，取值类型为default,failover,load_balance
a1.sinkgroups.g1.processor.priority.k1 = 1 #设置sink权重
a1.sinkgroups.g1.processor.priority.k2 = 2
a1.sinkgroups.g1.processor.maxpenalty = 10000

Load balance配置

启动collector和agent会有相关日志

Flume HA的更多相关文章

海量日志采集Flume(HA)
海量日志采集Flume(HA) 1.介绍: Flume是Cloudera提供的一个高可用的,高可靠的,分布式的海量日志采集.聚合和传输的系统,Flume支持在日志系统中定制各类数据发送方,用于收集数据 ...
flume的使用
1.flume的安装和配置 1.1 配置java_home,修改/opt/cdh/flume-1.5.0-cdh5.3.6/conf/flume-env.sh文件
Flume - Kafka日志平台整合
1. Flume介绍 Flume是Cloudera提供的一个高可用的,高可靠的,分布式的海量日志采集.聚合和传输的系统,Flume支持在日志系统中定制各类数据发送方,用于收集数据:同时,Flume提供 ...
Flafka: Apache Flume Meets Apache Kafka for Event Processing
The new integration between Flume and Kafka offers sub-second-latency event processing without the n ...
【翻译】Flume 1.8.0 User Guide(用户指南) Processors
翻译自官网flume1.8用户指南,原文地址:Flume 1.8.0 User Guide 篇幅限制,分为以下5篇: [翻译]Flume 1.8.0 User Guide(用户指南) [翻译]Flum ...
【翻译】Flume 1.8.0 User Guide(用户指南) Channel
翻译自官网flume1.8用户指南,原文地址:Flume 1.8.0 User Guide 篇幅限制,分为以下5篇: [翻译]Flume 1.8.0 User Guide(用户指南) [翻译]Flum ...
【翻译】Flume 1.8.0 User Guide(用户指南) Sink
翻译自官网flume1.8用户指南,原文地址:Flume 1.8.0 User Guide 篇幅限制,分为以下5篇: [翻译]Flume 1.8.0 User Guide(用户指南) [翻译]Flum ...
HAProxy + Keepalived + Flume 构建高性能高可用分布式日志系统
一.HAProxy简介 HAProxy提供高可用性.负载均衡以及基于TCP和HTTP应用的代理,支持虚拟主机,它是免费.快速并且可靠的一种解决方案.HAProxy特别适用于那些负载特大的web站点, ...
flume学习笔记
#################################################################################################### ...

随机推荐

javascript模板字符串（反引号）
模板字面量是允许嵌入表达式的字符串字面量. 你可以使用多行字符串和字符串插值功能.它们在ES2015规范的先前版本中被称为“模板字符串”. 语法 `string text``string text ...
Python核心技术与实战——二十|assert的合理利用
我们平时在看代码的时候,或多或少会看到过assert的存在,并且在有些code review也可以通过增加assert来使代码更加健壮.但是即便如此,assert还是很容易被人忽略,可是这个很不起眼的 ...
poj1740 A New Stone Game[博弈]
有若干堆石子,每一次需要从一堆石子中拿走一些,然后如果愿意的话,再从这堆石子中拿一些(揣度题意应该是不能拿出全部)分给其它任意不为空的堆.不能操作的人为负. 一直不会博弈啊..感觉完全就是个智商题,虽 ...
理解 Cookie、Session、Token
发展史 Cookie Session Token Token的起源基于服务器的验证基于服务器验证方式暴露的一些问题基于Token的验证原理 Tokens的优势发展史 1.很久很久以前,Web ...
Python中self的用法详解，或者总是提示：TypeError: add() missing 1 required positional argument: 'self'的问题解决
https://blog.csdn.net/songlh1234/article/details/83587086 下面总结一下self的用法详解,大家可以访问,可以针对平时踩过的坑更深入的了解下. ...
head first 设计模式笔记8-模板方法模式
模板设计模式:就是定义一个算法的骨架,而将具体的算法延迟到子类中来实现. 优点:使用模板方法模式,在定义算法骨架的同时,可以很灵活的实现具体的算法,满足用户灵活多变的需求. 缺点:如果算法骨架有修改的 ...
Error: Node Sass does not yet support your current environment: Windows 64-bit with Unsupported runtime (64)
错误提示: Error: Node Sass does not yet support your current environment: Windows 64-bit with Unsupporte ...
【BZOJ4552】排序（线段树，二分）
题意:给定一个n个数的排列,有m次操作:op,l,r op=0时表示将位置[L,R]升序排序 op=1时表示将位置[L,R]降序排序最后询问第q个位置上的数字 n,m,q<=1e5 思路:Fr ...
KMP 串的模式匹配 (25 分)
给定两个由英文字母组成的字符串 String 和 Pattern,要求找到 Pattern 在 String 中第一次出现的位置,并将此位置后的 String 的子串输出.如果找不到,则输出“Not ...
APIView源码与Request源码分析
一.APIView源码分析 1.安装djangorestframework 2.使用 drf是基于cbv view的封装,所以必须写cbv ①第一步:写视图,必须写cbv 路由配置: from res ...