Flume wasn't able to parse timestamp header
Source: http://caiguangguang.blog.51cto.com/1652935/1384187
An example of a Flume BucketPath bug
The configuration file used for testing:
```
agent-server1.sources = testtail
agent-server1.sinks = hdfs-sink
agent-server1.channels = hdfs-channel
agent-server1.sources.testtail.type = netcat
agent-server1.sources.testtail.bind = localhost
agent-server1.sources.testtail.port = 9999
agent-server1.sinks.hdfs-sink.hdfs.kerberosPrincipal = hdfs/_HOST@KERBEROS_HADOOP
agent-server1.sinks.hdfs-sink.hdfs.kerberosKeytab = /home/vipshop/conf/hdfs.keytab
agent-server1.channels.hdfs-channel.type = memory
agent-server1.channels.hdfs-channel.capacity = 200000000
agent-server1.channels.hdfs-channel.transactionCapacity = 10000
agent-server1.sinks.hdfs-sink.type = hdfs
agent-server1.sinks.hdfs-sink.hdfs.path = hdfs://bipcluster/tmp/flume/%Y%m%d
agent-server1.sinks.hdfs-sink.hdfs.rollInterval = 60
agent-server1.sinks.hdfs-sink.hdfs.rollSize = 0
agent-server1.sinks.hdfs-sink.hdfs.rollCount = 0
agent-server1.sinks.hdfs-sink.hdfs.threadsPoolSize = 10
agent-server1.sinks.hdfs-sink.hdfs.round = false
agent-server1.sinks.hdfs-sink.hdfs.roundValue = 30
agent-server1.sinks.hdfs-sink.hdfs.roundUnit = minute
agent-server1.sinks.hdfs-sink.hdfs.batchSize = 100
agent-server1.sinks.hdfs-sink.hdfs.fileType = DataStream
agent-server1.sinks.hdfs-sink.hdfs.writeFormat = Text
agent-server1.sinks.hdfs-sink.hdfs.callTimeout = 60000
agent-server1.sinks.hdfs-sink.hdfs.idleTimeout = 100
agent-server1.sinks.hdfs-sink.hdfs.filePrefix = ip
agent-server1.sinks.hdfs-sink.channel = hdfs-channel
agent-server1.sources.testtail.channels = hdfs-channel
```
After starting the agent and sending test input with telnet, the following error appeared:
```
14/03/24 18:03:07 ERROR hdfs.HDFSEventSink: process failed
java.lang.RuntimeException: Flume wasn't able to parse timestamp header in the event to resolve time based bucketing. Please check that you're correctly populating timestamp header (for example using TimestampInterceptor source interceptor).
        at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:160)
        at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:343)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NumberFormatException: null
        at java.lang.Long.parseLong(Long.java:375)
        at java.lang.Long.valueOf(Long.java:525)
        at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:158)
        ... 5 more
14/03/24 18:03:07 ERROR flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.RuntimeException: Flume wasn't able to parse timestamp header in the event to resolve time based bucketing. Please check that you're correctly populating timestamp header (for example using TimestampInterceptor source interceptor).
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:461)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: Flume wasn't able to parse timestamp header in the event to resolve time based bucketing. Please check that you're correctly populating timestamp header (for example using TimestampInterceptor source interceptor).
        at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:160)
        at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:343)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
        ... 3 more
Caused by: java.lang.NumberFormatException: null
        at java.lang.Long.parseLong(Long.java:375)
        at java.lang.Long.valueOf(Long.java:525)
        at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:158)
        ... 5 more
```
The stack trace shows that the error occurs in the replaceShorthand method of the org.apache.flume.formatter.output.BucketPath class.
In org.apache.flume.sink.hdfs.HDFSEventSink, the process method builds the HDFS path; it does so mainly by calling BucketPath.escapeString to expand the escape sequences in the path, which in turn calls replaceShorthand.
The relevant part of replaceShorthand:
```java
public static String replaceShorthand(char c, Map<String, String> headers,
    TimeZone timeZone, boolean needRounding, int unit, int roundDown) {
  String timestampHeader = headers.get("timestamp");
  long ts;
  try {
    ts = Long.valueOf(timestampHeader);
  } catch (NumberFormatException e) {
    throw new RuntimeException("Flume wasn't able to parse timestamp header"
        + " in the event to resolve time based bucketing. Please check that"
        + " you're correctly populating timestamp header (for example using"
        + " TimestampInterceptor source interceptor).", e);
  }
  if (needRounding) {
    ts = roundDown(roundDown, unit, ts);
  }
  // ...
```
As the code shows, if the timestamp header cannot be found in the event, headers.get("timestamp") returns null and the assignment to ts throws a NumberFormatException.
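The root cause is easy to reproduce in isolation: events arriving via netcat carry no "timestamp" header, so the lookup returns null, and passing a null String to Long.valueOf produces exactly the "NumberFormatException: null" seen in the stack trace. A minimal standalone sketch (the class name is made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class NullTimestampDemo {
    public static void main(String[] args) {
        // An event that arrived via netcat has no "timestamp" header.
        Map<String, String> headers = new HashMap<>();
        String timestampHeader = headers.get("timestamp"); // null

        try {
            // The same call BucketPath.replaceShorthand makes.
            long ts = Long.valueOf(timestampHeader);
            System.out.println("parsed ts = " + ts);
        } catch (NumberFormatException e) {
            // This is the "Caused by: java.lang.NumberFormatException: null"
            // that Flume wraps in its RuntimeException.
            System.out.println("caught NumberFormatException");
        }
    }
}
```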
This is actually a known Flume bug:
https://issues.apache.org/jira/browse/FLUME-1419
There are three ways to work around it:
1. Change the configuration so that the HDFS path no longer contains time-based escape sequences:
```
agent-server1.sinks.hdfs-sink.hdfs.path = hdfs://bipcluster/tmp/flume
```
The drawback is that the logs can no longer be bucketed by day.
2. Patch the code
(patch: https://issues.apache.org/jira/secure/attachment/12538891/FLUME-1419.patch)
If no timestamp value can be found in the headers, fall back to the current timestamp.
The relevant code:
```java
String timestampHeader = headers.get("timestamp");
long ts;
try {
  if (timestampHeader == null) {
    ts = System.currentTimeMillis();
  } else {
    ts = Long.valueOf(timestampHeader);
  }
} catch (NumberFormatException e) {
  throw new RuntimeException("Flume wasn't able to parse timestamp header"
      + " in the event to resolve time based bucketing. Please check that"
      + " you're correctly populating timestamp header (for example using"
      + " TimestampInterceptor source interceptor).", e);
}
```
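The effect of this null check can be sketched as a standalone helper; resolveTimestamp is a hypothetical name, not Flume's actual API, but the fallback logic mirrors the patch:

```java
import java.util.HashMap;
import java.util.Map;

public class TimestampFallbackDemo {
    // Hypothetical helper mirroring the FLUME-1419 patch: fall back to
    // the current time when the "timestamp" header is missing.
    static long resolveTimestamp(Map<String, String> headers) {
        String timestampHeader = headers.get("timestamp");
        if (timestampHeader == null) {
            return System.currentTimeMillis();
        }
        return Long.valueOf(timestampHeader);
    }

    public static void main(String[] args) {
        Map<String, String> withHeader = new HashMap<>();
        withHeader.put("timestamp", "1395655387000");
        System.out.println(resolveTimestamp(withHeader)); // prints 1395655387000

        Map<String, String> withoutHeader = new HashMap<>();
        long ts = resolveTimestamp(withoutHeader); // no exception: uses "now"
        System.out.println(ts > 0);
    }
}
```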
3. Define a timestamp interceptor on the source.
Adding two lines to the configuration is enough:
```
agent-server1.sources.testtail.interceptors = i1
agent-server1.sources.testtail.interceptors.i1.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
```
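What the interceptor does for each event can be approximated with a plain map. This is a simplified sketch, not Flume's actual implementation: the real TimestampInterceptor works on Flume Event objects and also has a preserveExisting option.

```java
import java.util.HashMap;
import java.util.Map;

public class InterceptorSketch {
    // Simplified stand-in for TimestampInterceptor: add a "timestamp"
    // header (epoch millis) if the event does not already have one.
    static Map<String, String> addTimestamp(Map<String, String> headers) {
        if (!headers.containsKey("timestamp")) {
            headers.put("timestamp", Long.toString(System.currentTimeMillis()));
        }
        return headers;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        addTimestamp(headers);
        // The sink's %Y%m%d escapes can now be resolved from this header,
        // so replaceShorthand no longer sees a null timestamp.
        System.out.println(headers.containsKey("timestamp"));
    }
}
```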
A tip:
When debugging Flume problems, you can add a startup parameter that sends debug logs to the console:
```
-Dflume.root.logger=DEBUG,console,LOGFILE
```