来自:http://caiguangguang.blog.51cto.com/1652935/1384187

flume bucketpath的bug一例

测试的配置文件:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
agent-server1.sources= testtail
agent-server1.sinks = hdfs-sink
agent-server1.channels= hdfs-channel
agent-server1.sources.testtail.type = netcat
agent-server1.sources.testtail.bind = localhost
agent-server1.sources.testtail.port = 9999
agent-server1.sinks.hdfs-sink.hdfs.kerberosPrincipal = hdfs/_HOST@KERBEROS_HADOOP
agent-server1.sinks.hdfs-sink.hdfs.kerberosKeytab = /home/vipshop/conf/hdfs.keytab
agent-server1.channels.hdfs-channel.type = memory
agent-server1.channels.hdfs-channel.capacity = 200000000
agent-server1.channels.hdfs-channel.transactionCapacity = 10000
agent-server1.sinks.hdfs-sink.type = hdfs
agent-server1.sinks.hdfs-sink.hdfs.path = hdfs://bipcluster/tmp/flume/%Y%m%d
agent-server1.sinks.hdfs-sink.hdfs.rollInterval = 60
agent-server1.sinks.hdfs-sink.hdfs.rollSize = 0
agent-server1.sinks.hdfs-sink.hdfs.rollCount = 0
agent-server1.sinks.hdfs-sink.hdfs.threadsPoolSize = 10
agent-server1.sinks.hdfs-sink.hdfs.round = false
agent-server1.sinks.hdfs-sink.hdfs.roundValue = 30
agent-server1.sinks.hdfs-sink.hdfs.roundUnit = minute
agent-server1.sinks.hdfs-sink.hdfs.batchSize = 100
agent-server1.sinks.hdfs-sink.hdfs.fileType = DataStream
agent-server1.sinks.hdfs-sink.hdfs.writeFormat = Text
agent-server1.sinks.hdfs-sink.hdfs.callTimeout = 60000
agent-server1.sinks.hdfs-sink.hdfs.idleTimeout = 100
agent-server1.sinks.hdfs-sink.hdfs.filePrefix = ip
agent-server1.sinks.hdfs-sink.channel = hdfs-channel
agent-server1.sources.testtail.channels = hdfs-channel

在启动服务后,使用telnet进行测试,发现如下报错:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
14/03/24 18:03:07 ERROR hdfs.HDFSEventSink: process failed
java.lang.RuntimeException: Flume wasn't able to parse timestamp header in the event to resolve time based bucketing.
 Please check that you're correctly populating timestamp header (for example using TimestampInterceptor source interceptor).
        at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:160)
        at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:343)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NumberFormatException: null
        at java.lang.Long.parseLong(Long.java:375)
        at java.lang.Long.valueOf(Long.java:525)
        at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:158)
        ... 5 more
14/03/24 18:03:07 ERROR flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.RuntimeException: Flume wasn't able to parse timestamp header in the event to
resolve time based bucketing. Please check that you're correctly populating timestamp header (for example using TimestampInterceptor source interceptor).
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:461)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: Flume wasn't able to parse timestamp header in the event to resolve time based bucketing. Please check that you're correctly populating timestamp header (for example using TimestampInterceptor source interceptor).
        at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:160)
        at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:343)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
        ... 3 more
Caused by: java.lang.NumberFormatException: null
        at java.lang.Long.parseLong(Long.java:375)
        at java.lang.Long.valueOf(Long.java:525)
        at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:158)
        ... 5 more

从调用栈的信息来看,错误出在org.apache.flume.formatter.output.BucketPath类的replaceShorthand方法。
在org.apache.flume.sink.hdfs.HDFSEventSink类中,使用process方法来生成hdfs的url,其中主要是调用了BucketPath类的escapeString方法来进行字符的转换,并最终调用了replaceShorthand方法。
其中replaceShorthand方法的相关代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
  public static String replaceShorthand(char c, Map<String, String> headers,
      TimeZone timeZone, boolean needRounding, int unit, int roundDown) {
    String timestampHeader = headers.get("timestamp");
    long ts;
    try {
      ts = Long.valueOf(timestampHeader);
    catch (NumberFormatException e) {
      throw new RuntimeException("Flume wasn't able to parse timestamp header"
        " in the event to resolve time based bucketing. Please check that"
        " you're correctly populating timestamp header (for example using"
        " TimestampInterceptor source interceptor).", e);
    }
    if(needRounding){
      ts = roundDown(roundDown, unit, ts);
    }
........

从代码中可以看到,timestampHeader 的值如果取不到,在向ts赋值时就会报错。。
这其实是flume的一个bug,bug id:
https://issues.apache.org/jira/browse/FLUME-1419
解决方法有3个:
1.更改配置,更新hdfs文件的路径格式

1
agent-server1.sinks.hdfs-sink.hdfs.path = hdfs://bipcluster/tmp/flume

但是这样就不能按天来存放日志了
2.通过更改相关的代码
(patch:https://issues.apache.org/jira/secure/attachment/12538891/FLUME-1419.patch)
如果在headers中获取不到timestamp的值,就给它一个当前timestamp的值。
相关代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
     String timestampHeader = headers.get("timestamp");
     long ts;
     try {
      if (timestampHeader == null) {
        ts = System.currentTimeMillis();
      else {
        ts = Long.valueOf(timestampHeader);
      }
     } catch (NumberFormatException e) {
       throw new RuntimeException("Flume wasn't able to parse timestamp header"
         " in the event to resolve time based bucketing. Please check that"
         " you're correctly populating timestamp header (for example using"
                  " TimestampInterceptor source interceptor).", e);
}

3.为source定义基于timestamp的interceptors 
在配置中增加两行即可:

1
2
agent-server1.sources.testtail.interceptors = i1
agent-server1.sources.testtail.interceptors.i1.type = org.apache.flume.interceptor.TimestampInterceptor$Builder

一个技巧:
在debug flume的问题时,可以在flume的启动参数中设置把debug日志打到console中。

1
-Dflume.root.logger=DEBUG,console,LOGFILE

Flume wasn't able to parse timestamp header的更多相关文章

  1. 关于flume中涉及到时间戳的错误解决,Expected timestamp in the Flume even

    在搭建flume集群收集日志写入hdfs时发生了下面的错误: java.lang.NullPointerException: Expected timestamp in the Flume event ...

  2. 当我new class的时候,提示以下错误: Unable to parse template "Class" Error message: This template did not produce a Java class or an interface Error parsing file template: Unable to find resource 'Package Header.j

    你肯定修改过class的template模板,改回去就好了 #if (${PACKAGE_NAME} && ${PACKAGE_NAME} != "")packag ...

  3. Flume架构

    Flume是Cloudera提供的一个高可用的,高可靠的,分布式的海量日志采集.聚合和传输的系统: Flume 介绍 Flume是由cloudera软件公司产出的高可用.高可靠.分布式的海量日志收集系 ...

  4. netty 网关 flume 提交数据 去除透明 批处理 批提交 cat head tail 结合 管道显示行号

    D:\javaNettyAction\NettyA\src\main\java\com\test\HexDumpProxy.java package com.test; import io.netty ...

  5. TimeStamp

    private void Form1_Load(object sender, EventArgs e) { textBox1.Text= GenerateTimeStamp(System.DateTi ...

  6. Flume官方文档翻译——Flume 1.7.0 User Guide (unreleased version)(一)

    Flume 1.7.0 User Guide Introduction(简介) Overview(综述) System Requirements(系统需求) Architecture(架构) Data ...

  7. c# datetime与 timeStamp(unix时间戳) 互相转换

    /// <summary> /// Unix时间戳转为C#格式时间 /// </summary> /// <param name="timeStamp" ...

  8. webMagic解析淘宝cookie 提示Invalid cookie header

    webMagic解析淘宝cookie 提示Invalid cookie header 在使用webMagic框架做爬虫爬取淘宝极又家页面时候一直提醒cookie设置不可用如下图 淘宝的验证特别严重,c ...

  9. c# datetime与 timeStamp时间戳 互相转换

    将时间格式转化为一个int类型 // ::26时间转完后为:1389675686数字 为什么使用时间戳? 关于Unix时间戳,大概是这个意思,从1970年0时0分0秒开始到现在的秒数.使用它来获得的是 ...

随机推荐

  1. 自动化运维之 ~cobbler~

    一 .Cobbler简介 Cobbler是一个快速网络安装linux的服务,而且在经过调整也可以支持网络安装windows.该工具使用python开发,小巧轻便(才15k 行python代码),使用简 ...

  2. Unity消息

    GameObject关于Message带有三种方法, gameObject.SendMessageUpwards ("test1",4);gameObject.SendMessag ...

  3. A* search算法解迷宫

    这是一个使用A* search算法解迷宫的问题,细节请看:http://www.laurentluce.com/posts/solving-mazes-using-python-simple-recu ...

  4. 使用Chrome快速实现数据的抓取(二)——协议

    在前面的文章简单的介绍了一下Chrome调试模式的启动方式,但前面的API只能做到简单的打开,关闭标签操作,当我们需要对某个标签页进行详细的操作时,则需要用到页面管理API.首先我们还是来回顾下获取页 ...

  5. 汉字转拼音的Java类库——JPinyin

    原文:http://blog.csdn.net/stuxuhai/article/details/8932715 [JPinyin主要特性]1.准确.完善的字库:Unicode编码从4E00-9FA5 ...

  6. 转 Objective-C中不同方式实现锁(二)

    在上一文中,我们已经讨论过用Objective-C锁几种实现(跳转地址),也用代码实际的演示了如何通过构建一个互斥锁来实现多线程的资源共享及线程安全,今天我们继续讨论锁的一些高级用法. .NSRecu ...

  7. .NET:何时应该 “包装异常”?

    背景 提到异常,我们会想到:抛出异常.异常恢复.资源清理.吞掉异常.重新抛出异常.替换异常.包装异常.本文想谈谈 “包装异常”,主要针对这个问题:何时应该 “包装异常”? “包装异常” 的技术形式 包 ...

  8. JAVA nio 2 定义 Path 类

    一旦确认了文件系统上的一个文件或目录,那么就可以定义一个 Path 类来指向它.定义 Path 类可以使用绝对路径.相对路径.路径中带有一个点号“.”(表示当前目录).路径中带有两个点“..”(表示上 ...

  9. UILabel混合显示动画效果

    UILabel混合显示动画效果 效果 源码 https://github.com/YouXianMing/Animations // // MixedColorProgressViewControll ...

  10. Using platform encoding (GBK actually) to copy filtered resources, i.e. build is platform dependent!

    执行Maven Install打包的时候,提示以下警告信息: [WARNING] Using platform encoding (GBK actually) to copy filtered res ...