Notes on some simple Flume HDFS configuration
############################################
# producer config
############################################
#agent section
producer.sources = s
producer.channels = c c1 c2
producer.sinks = r h es
#source section
producer.sources.s.type = exec
producer.sources.s.command = tail -f /usr/local/nginx/logs/test1.log
#producer.sources.s.type = spooldir
#producer.sources.s.spoolDir = /usr/local/nginx/logs/
#producer.sources.s.fileHeader = true
producer.sources.s.channels = c c1 c2
producer.sources.s.interceptors = i
# Matching is case-sensitive (ignoring case is not supported)
producer.sources.s.interceptors.i.regex = .*\.(css|js|jpg|jpeg|png|gif|ico).*
producer.sources.s.interceptors.i.type = org.apache.flume.interceptor.RegexFilteringInterceptor$Builder
# Exclude (drop) events that match the regex
producer.sources.s.interceptors.i.excludeEvents = true
############################################
# avro config
############################################
producer.channels.c.type = memory
#Timeout in seconds for adding or removing an event
producer.channels.c.keep-alive= 30
producer.channels.c.capacity = 10000
producer.channels.c.transactionCapacity = 10000
producer.channels.c.byteCapacityBufferPercentage = 20
producer.channels.c.byteCapacity = 800000
producer.sinks.r.channel = c
producer.sinks.r.type = avro
producer.sinks.r.hostname = 127.0.0.1
producer.sinks.r.port = 10101
############################################
# hdfs config
############################################
producer.channels.c1.type = memory
#Timeout in seconds for adding or removing an event
producer.channels.c1.keep-alive= 30
producer.channels.c1.capacity = 10000
producer.channels.c1.transactionCapacity = 10000
producer.channels.c1.byteCapacityBufferPercentage = 20
producer.channels.c1.byteCapacity = 800000
producer.sinks.h.channel = c1
producer.sinks.h.type = hdfs
# Target directory
producer.sinks.h.hdfs.path = hdfs://127.0.0.1/tmp/flume/%Y/%m/%d
# File name prefix
producer.sinks.h.hdfs.filePrefix=nginx-%Y-%m-%d-%H
producer.sinks.h.hdfs.fileType = DataStream
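# Required when the path or prefix uses time escape sequences and events carry no timestamp header; otherwise the sink reports an error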
producer.sinks.h.hdfs.useLocalTimeStamp = true
producer.sinks.h.hdfs.writeFormat = Text
# Number of seconds to wait before rolling the current file (0 = never roll based on time interval)
producer.sinks.h.hdfs.rollInterval=0
# File size that triggers a roll, in bytes (0 = never roll based on file size)
producer.sinks.h.hdfs.rollSize = 0
# Number of events written to a file before it is rolled (0 = never roll based on number of events)
producer.sinks.h.hdfs.rollCount = 0
# Batch size: number of events written to a file before it is flushed to HDFS
producer.sinks.h.hdfs.batchSize=1000
# Number of threads per HDFS sink for HDFS IO ops (open, write, etc.)
producer.sinks.h.hdfs.threadsPoolSize=15
# Timeout for HDFS operations: number of milliseconds allowed for open, write, flush, close. Increase it if many HDFS operations are timing out.
producer.sinks.h.hdfs.callTimeout=30000
# Related rounding options (name | default | description):
# hdfs.round | false | Should the timestamp be rounded down (if true, affects all time based escape sequences except %t)
# hdfs.roundValue | 1 | Rounded down to the highest multiple of this (in the unit configured using hdfs.roundUnit), less than current time.
# hdfs.roundUnit | second | The unit of the round down value - second, minute or hour.
############################################
# elasticsearch config
############################################
producer.channels.c2.type = memory
#Timeout in seconds for adding or removing an event
producer.channels.c2.keep-alive= 30
producer.channels.c2.capacity = 10000
producer.channels.c2.transactionCapacity = 10000
producer.channels.c2.byteCapacityBufferPercentage = 20
producer.channels.c2.byteCapacity = 800000
producer.sinks.es.channel = c2
producer.sinks.es.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
producer.sinks.es.hostNames = 127.0.0.1:9300
#Name of the ElasticSearch cluster to connect to
producer.sinks.es.clusterName = sunxucool
#Number of events to be written per txn.
producer.sinks.es.batchSize = 1000
#The name of the index which the date will be appended to. Example ‘flume’ -> ‘flume-yyyy-MM-dd’
producer.sinks.es.indexName = flume_es
#The type to index the document to, defaults to ‘log’
producer.sinks.es.indexType = test
producer.sinks.es.serializer = org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer
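
The regex filter on source `s` is easy to get subtly wrong, so here is a small Python sketch for checking which log lines the pattern would drop before deploying the config. It is only an approximation: Flume evaluates the pattern with Java's regex engine, and the helper `is_dropped` and the sample request lines are made up for illustration. Because the pattern is wrapped in `.*` on both sides, search and full-match semantics should give the same answer on single-line events.

```python
import re

# Same pattern as producer.sources.s.interceptors.i.regex in the config above.
STATIC_ASSET = re.compile(r".*\.(css|js|jpg|jpeg|png|gif|ico).*")


def is_dropped(line: str, exclude_events: bool = True) -> bool:
    """Hypothetical helper: True if a filter configured like the one above
    would drop this line.

    exclude_events mirrors the excludeEvents setting:
      True  -> matching lines are dropped
      False -> non-matching lines are dropped
    """
    matched = STATIC_ASSET.search(line) is not None
    return matched if exclude_events else not matched


if __name__ == "__main__":
    # Invented sample request lines, not taken from a real nginx log.
    samples = [
        "GET /static/app.js HTTP/1.1",
        "GET /img/logo.PNG HTTP/1.1",   # upper-case extension
        "GET /api/data.json HTTP/1.1",  # ".json" begins with ".js"
        "GET /api/users HTTP/1.1",
    ]
    for line in samples:
        print("drop" if is_dropped(line) else "keep", "|", line)
```

Two things stand out in this sketch: the upper-case `.PNG` request is kept, which matches the note that the pattern is case-sensitive, and the `.json` request is dropped because `.js` matches as a prefix. If that second behaviour is unwanted, the pattern would need a stricter boundary after the extension.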