Source: http://caiguangguang.blog.51cto.com/1652935/1384187

An example of a Flume BucketPath bug

The configuration file used for testing:

agent-server1.sources= testtail
agent-server1.sinks = hdfs-sink
agent-server1.channels= hdfs-channel
agent-server1.sources.testtail.type = netcat
agent-server1.sources.testtail.bind = localhost
agent-server1.sources.testtail.port = 9999
agent-server1.sinks.hdfs-sink.hdfs.kerberosPrincipal = hdfs/_HOST@KERBEROS_HADOOP
agent-server1.sinks.hdfs-sink.hdfs.kerberosKeytab = /home/vipshop/conf/hdfs.keytab
agent-server1.channels.hdfs-channel.type = memory
agent-server1.channels.hdfs-channel.capacity = 200000000
agent-server1.channels.hdfs-channel.transactionCapacity = 10000
agent-server1.sinks.hdfs-sink.type = hdfs
agent-server1.sinks.hdfs-sink.hdfs.path = hdfs://bipcluster/tmp/flume/%Y%m%d
agent-server1.sinks.hdfs-sink.hdfs.rollInterval = 60
agent-server1.sinks.hdfs-sink.hdfs.rollSize = 0
agent-server1.sinks.hdfs-sink.hdfs.rollCount = 0
agent-server1.sinks.hdfs-sink.hdfs.threadsPoolSize = 10
agent-server1.sinks.hdfs-sink.hdfs.round = false
agent-server1.sinks.hdfs-sink.hdfs.roundValue = 30
agent-server1.sinks.hdfs-sink.hdfs.roundUnit = minute
agent-server1.sinks.hdfs-sink.hdfs.batchSize = 100
agent-server1.sinks.hdfs-sink.hdfs.fileType = DataStream
agent-server1.sinks.hdfs-sink.hdfs.writeFormat = Text
agent-server1.sinks.hdfs-sink.hdfs.callTimeout = 60000
agent-server1.sinks.hdfs-sink.hdfs.idleTimeout = 100
agent-server1.sinks.hdfs-sink.hdfs.filePrefix = ip
agent-server1.sinks.hdfs-sink.channel = hdfs-channel
agent-server1.sources.testtail.channels = hdfs-channel
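
With this configuration the agent can be started and fed test events end to end. A minimal sketch, assuming the properties above are saved as agent-server1.conf and the standard flume-ng launcher is on the PATH:

flume-ng agent -n agent-server1 -c conf -f agent-server1.conf
telnet localhost 9999

Any line typed into the telnet session is acknowledged with "OK" by the netcat source and emitted as one Flume event.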

After starting the agent and sending test input over telnet, the following error appeared:

14/03/24 18:03:07 ERROR hdfs.HDFSEventSink: process failed
java.lang.RuntimeException: Flume wasn't able to parse timestamp header in the event to resolve time based bucketing.
 Please check that you're correctly populating timestamp header (for example using TimestampInterceptor source interceptor).
        at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:160)
        at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:343)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NumberFormatException: null
        at java.lang.Long.parseLong(Long.java:375)
        at java.lang.Long.valueOf(Long.java:525)
        at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:158)
        ... 5 more
14/03/24 18:03:07 ERROR flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.RuntimeException: Flume wasn't able to parse timestamp header in the event to
resolve time based bucketing. Please check that you're correctly populating timestamp header (for example using TimestampInterceptor source interceptor).
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:461)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: Flume wasn't able to parse timestamp header in the event to resolve time based bucketing. Please check that you're correctly populating timestamp header (for example using TimestampInterceptor source interceptor).
        at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:160)
        at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:343)
        at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
        ... 3 more
Caused by: java.lang.NumberFormatException: null
        at java.lang.Long.parseLong(Long.java:375)
        at java.lang.Long.valueOf(Long.java:525)
        at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:158)
        ... 5 more

The stack trace shows that the error occurs in the replaceShorthand method of org.apache.flume.formatter.output.BucketPath.
In org.apache.flume.sink.hdfs.HDFSEventSink, the process method builds the HDFS path; it does so mainly by calling BucketPath.escapeString to expand the escape sequences, which in turn calls replaceShorthand.
The relevant part of replaceShorthand is:

  public static String replaceShorthand(char c, Map<String, String> headers,
      TimeZone timeZone, boolean needRounding, int unit, int roundDown) {
    String timestampHeader = headers.get("timestamp");
    long ts;
    try {
      ts = Long.valueOf(timestampHeader);
    } catch (NumberFormatException e) {
      throw new RuntimeException("Flume wasn't able to parse timestamp header"
        + " in the event to resolve time based bucketing. Please check that"
        + " you're correctly populating timestamp header (for example using"
        + " TimestampInterceptor source interceptor).", e);
    }
    if (needRounding) {
      ts = roundDown(roundDown, unit, ts);
    }
........

As the code shows, when the timestamp header cannot be retrieved, the assignment to ts throws.
This is actually a known Flume bug:
https://issues.apache.org/jira/browse/FLUME-1419
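
The root cause is easy to reproduce without Flume at all; a minimal sketch in plain Java that mirrors what replaceShorthand does when the event carries no timestamp header:

public class MissingTimestampDemo {
  public static void main(String[] args) {
    // With no TimestampInterceptor, headers.get("timestamp") returns null.
    String timestampHeader = null;
    // Long.valueOf(String) delegates to Long.parseLong, which throws
    // java.lang.NumberFormatException: null -- the same "Caused by" seen above.
    long ts = Long.valueOf(timestampHeader);
    System.out.println(ts);
  }
}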
There are three ways to work around it:
1. Change the configuration so the HDFS path no longer contains time-based escape sequences

agent-server1.sinks.hdfs-sink.hdfs.path = hdfs://bipcluster/tmp/flume

But then the logs can no longer be bucketed by day.
2. Patch the code
(patch: https://issues.apache.org/jira/secure/attachment/12538891/FLUME-1419.patch)
If no timestamp value can be read from the headers, fall back to the current system time.
The relevant code:

     String timestampHeader = headers.get("timestamp");
     long ts;
     try {
       if (timestampHeader == null) {
         ts = System.currentTimeMillis();
       } else {
         ts = Long.valueOf(timestampHeader);
       }
     } catch (NumberFormatException e) {
       throw new RuntimeException("Flume wasn't able to parse timestamp header"
         + " in the event to resolve time based bucketing. Please check that"
         + " you're correctly populating timestamp header (for example using"
         + " TimestampInterceptor source interceptor).", e);
     }

3. Add a timestamp-based interceptor to the source
Just add two lines to the configuration:

agent-server1.sources.testtail.interceptors = i1
agent-server1.sources.testtail.interceptors.i1.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
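
The interceptor stamps every incoming event with a "timestamp" header holding the time the event was processed, so %Y%m%d in the HDFS path can always be resolved. As a side note, Flume 1.x also registers a short alias for this interceptor type, so the second line can normally be shortened to:

agent-server1.sources.testtail.interceptors.i1.type = timestamp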

A tip:
When debugging Flume problems, you can add a startup parameter that sends debug-level logs to the console.

-Dflume.root.logger=DEBUG,console,LOGFILE
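
For example, appended to the launch command used earlier (the config file name here is only an assumption):

flume-ng agent -n agent-server1 -c conf -f agent-server1.conf -Dflume.root.logger=DEBUG,console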
