HBase Error: connection object not serializable

想在spark driver程序中连接HBase数据库,并将数据插入到HBase,但是在spark集群提交运行过程中遇到错误:connection object not serializable

详细的错误:

Exception in thread "main" java.io.NotSerializableException: DStream checkpointing has been enabled but the DStreams with their functions are not serializable
com.sae.model.HbaseHelper
Serialization stack:
- object not serializable (class: com.sae.model.HbaseHelper, value: com.sae.model.HbaseHelper@27a09971)
- field (class: com.sae.demo.KafkaStreamingTest$$anonfun$main$, name: hbHelper$, type: class com.sae.model.HbaseHelper)
- object (class com.sae.demo.KafkaStreamingTest$$anonfun$main$, <function1>)
- field (class: org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$$$anonfun$apply$mcV$sp$, name: cleanedF$, type: interface scala.Function1)
- object (class org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$$$anonfun$apply$mcV$sp$, <function2>)
- writeObject data (class: org.apache.spark.streaming.dstream.DStream)
- object (class org.apache.spark.streaming.dstream.ForEachDStream, org.apache.spark.streaming.dstream.ForEachDStream@4bdf)
- element of array (index: )
- array (class [Ljava.lang.Object;, size )
- field (class: scala.collection.mutable.ArrayBuffer, name: array, type: class [Ljava.lang.Object;)
- object (class scala.collection.mutable.ArrayBuffer, ArrayBuffer(org.apache.spark.streaming.dstream.ForEachDStream@4bdf, org.apache.spark.streaming.dstream.ForEachDStream@2b4d4327))
- writeObject data (class: org.apache.spark.streaming.dstream.DStreamCheckpointData)
- object (class org.apache.spark.streaming.dstream.DStreamCheckpointData, [
checkpoint files ])
- writeObject data (class: org.apache.spark.streaming.dstream.DStream)
- object (class org.apache.spark.streaming.kafka.KafkaInputDStream, org.apache.spark.streaming.kafka.KafkaInputDStream@163042ea)
- element of array (index: )
- array (class [Ljava.lang.Object;, size )
- field (class: scala.collection.mutable.ArrayBuffer, name: array, type: class [Ljava.lang.Object;)
- object (class scala.collection.mutable.ArrayBuffer, ArrayBuffer(org.apache.spark.streaming.kafka.KafkaInputDStream@163042ea))
- writeObject data (class: org.apache.spark.streaming.DStreamGraph)
- object (class org.apache.spark.streaming.DStreamGraph, org.apache.spark.streaming.DStreamGraph@2577a95d)
- field (class: org.apache.spark.streaming.Checkpoint, name: graph, type: class org.apache.spark.streaming.DStreamGraph)
- object (class org.apache.spark.streaming.Checkpoint, org.apache.spark.streaming.Checkpoint@2b4b96a4)
at org.apache.spark.streaming.StreamingContext.validate(StreamingContext.scala:)
at org.apache.spark.streaming.StreamingContext.liftedTree1$(StreamingContext.scala:)
at org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:)
at com.sae.demo.KafkaStreamingTest$.main(StreamingDataFromKafka.scala:)
at com.sae.demo.KafkaStreamingTest.main(StreamingDataFromKafka.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:)
at java.lang.reflect.Method.invoke(Method.java:)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$(SparkSubmit.scala:)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

解决办法:

参考官方文档:传送门

应该把打开数据库连接的代码放到foreachPartition内部,如:

dstream.foreachRDD { rdd =>
rdd.foreachPartition { partitionOfRecords =>
val connection = createNewConnection()
partitionOfRecords.foreach(record => connection.send(record))
connection.close()
}
}

HBase Error: connection object not serializable的更多相关文章

  1. 怎样处理“error C2220: warning treated as error - no object file generated”错误

    最近用VS2010 编译ceflib开源库是出现"怎样处理"error C2220: warning treated as error - no object file gener ...

  2. ab测试出现error: connection reset by peer的解决方案

    我们在使用一些开源程序之前,可能会使用ab工具在服务器或者本地进行一次性能评估,但是很多时候却总是会以失败告终,因为,服务器会拒绝你的ab工具发出的http请求, 出现 error: connecti ...

  3. DJANGO问题--Error: ‘ManyRelatedManager’ object is not iterable

    http://www.itblah.com/django-error-manyrelatedmanager-object-iterable/ Django: Error: ‘ManyRelatedMa ...

  4. unexpected error ConnectionError object has no attribute

    unexpected error ConnectionError object has no attribute

  5. 问题-Error creating object. Please verify that the Microsoft Data Access Components 2.1(or later) have been properly installed.

    问题现象:软件在启动时报如下错误信息:Exception Exception in module zhujiangguanjia.exe at 001da37f. Error creating obj ...

  6. error C2220: warning treated as error - no 'object' file generated解决方法

    error C2220: warning treated as error - no 'object' file generated 警讯视为错误 - 生成的对象文件 / WX告诉编译器将所有警告视为 ...

  7. 项目的ip地址更改,用git从远程提取代码出现错误,提示为 network error connection timed out

    昨天公司的ip进行了修改,在今天从远程提取代码的过程中提示network error connection timed out错误,从网上看了一下解决方法 1:打开项目文件夹,点击查看 2:勾选隐藏的 ...

  8. 亚马逊的PuTTY连接AWS出现network error connection refused,终极解决方案。

    使用PuTTY连接AWS的时候,一直出现network error connection refused.百度了这个问题,大家都说是SSH要设置成22.但是我已经设置过了,为什么还是遇到这个问题呢? ...

  9. wince6.0 编译报错:"error C2220: warning treated as error - no 'object' file generated"的解决办法

    内容提要:wince6.0编译报错:"error C2220: warning treated as error - no 'object' file generated" 原因是 ...

随机推荐

  1. D3D11_USAGE使用

    MSDN文档链接:http://msdn.microsoft.com/en-us/library/windows/desktop/ff476259(v=vs.85).aspx 不得不同吐槽一点的是,你 ...

  2. Python pycurl

    常用方法: pycurl.Curl() #创建一个pycurl对象的方法 pycurl.Curl(pycurl.URL, http://www.google.com.hk) #设置要访问的URL py ...

  3. 使用 CreateInstallMedia 创建 苹果系统安装U盘

    一般来说,从app store上面 下载下来的image位置,都是在 /Applications 下面 使用命令创建安装U盘,(备份一下命令,太长,记不住) sudo /Applications/In ...

  4. Chapter3:字符串、向量和数组

    C++11新特性:范围for(range for) for (declaration: expression) statement vector和数组都是对象的集合,而引用不是对象. vector对象 ...

  5. 查看linux服务器中的apache是否安装以及安装路径

    1.可以通过 apachectl -v 查看apache是否安装,如果安装了的话会显示版本号: 2.如果通过rpm包安装的话可以用  rpm -q  httpd 查看,如果安装的的话会显示包的名称

  6. mybatis系列-04-mybatis开发dao的方法

    4.1     SqlSession使用范围 4.1.1     SqlSessionFactoryBuilder 通过SqlSessionFactoryBuilder创建会话工厂SqlSession ...

  7. linux 命令 之chomd

    chmod用于改变文件或目录的访问权限.用户用它控制文件或目录的访问权限.该命令有两种用法.一种是包含字母和操作符表达式的文字设定法:另一种是包含数字的数字设定法.  1. 文字设定法 语法:chmo ...

  8. gdb 技巧

    现实数组: 比如说要显示a[10]中全部的内容用 p a显示的是地址,用p *a显示的是第一个元素显示全部或某一个:p (int [10])*a或者p *a@10 如果你使用 p *a@3 或 p * ...

  9. The h.264 Sequence Parameter Set

    转债:  http://www.cardinalpeak.com/blog/the-h-264-sequence-parameter-set/ View from the Peak The h.264 ...

  10. ACM OJ Collection

    浙江大学(ZJU):http://acm.zju.edu.cn/ 北京大学(PKU):http://acm.pku.edu.cn/JudgeOnline/ 同济大学(TJU):http://acm.t ...